Restructuring with pivot_wider() and pivot_longer()

One useful data skill is being able to move from untidy data to tidy data and back again. The functions that allow you to do this, pivot_wider() and pivot_longer(), are part of the tidyr package. This package is included in tidyverse, along with many other helpful packages. To access these tools, install tidyverse (this is only needed once) and load it:

install.packages("tidyverse")
library(tidyverse)

With the tools loaded, you can restructure the summary dataset from above. Begin with the first, tidy version:

penguins_sum
## # A tibble: 9 × 3
## # Groups:   island [3]
##   island     year mean_body_mass_g
##   <fct>     <int>            <dbl>
## 1 Biscoe     2007            4741.
## 2 Biscoe     2008            4628.
## 3 Biscoe     2009            4793.
## 4 Dream      2007            3684.
## 5 Dream      2008            3779.
## 6 Dream      2009            3691.
## 7 Torgersen  2007            3763.
## 8 Torgersen  2008            3856.
## 9 Torgersen  2009            3489.

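To make the untidy version, “pivot” this data from long to wide format using the pivot_wider() function from tidyr. The names_from argument names the column whose values become the new column names (year), values_from names the column whose values fill those new columns (mean_body_mass_g), and id_cols names the column that identifies each row of the result (island):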

penguins_wide <- penguins_sum %>%
  pivot_wider(id_cols = island,
              names_from = year,
              values_from = mean_body_mass_g)

penguins_wide
## # A tibble: 3 × 4
## # Groups:   island [3]
##   island    `2007` `2008` `2009`
##   <fct>      <dbl>  <dbl>  <dbl>
## 1 Biscoe     4741.  4628.  4793.
## 2 Dream      3684.  3779.  3691.
## 3 Torgersen  3763.  3856.  3489.
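
The backticks around `2007`, `2008`, and `2009` in this output mark column names that are not syntactic R names, because they start with a digit. If that is inconvenient, one option is the names_prefix argument of pivot_wider(). The sketch below is a minimal illustration, not part of the original analysis: it rebuilds a stand-in for penguins_sum from the rounded values printed above (the real object was computed earlier and holds unrounded means), and both the "year_" prefix and the name penguins_wide_prefixed are arbitrary choices.

```r
library(dplyr)
library(tidyr)

# Stand-in for penguins_sum, typed in from the rounded printout above
# (assumption: the real object holds unrounded means)
penguins_sum <- tibble(
  island = factor(rep(c("Biscoe", "Dream", "Torgersen"), each = 3)),
  year = rep(2007:2009, times = 3),
  mean_body_mass_g = c(4741, 4628, 4793,
                       3684, 3779, 3691,
                       3763, 3856, 3489)
) %>%
  group_by(island)  # the printout shows the data grouped by island

# names_prefix puts a fixed string in front of every new column name
penguins_wide_prefixed <- penguins_sum %>%
  pivot_wider(id_cols = island,
              names_from = year,
              names_prefix = "year_",
              values_from = mean_body_mass_g)

names(penguins_wide_prefixed)
```

With the prefix in place, the new columns can be referred to without backticks, for example penguins_wide_prefixed$year_2007.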

Looking more closely at pivot_longer()

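Given this untidy table, you can tidy the data by pivoting from “wide” to “long” using pivot_longer(). Here cols selects the columns to collapse, names_to names the new column that receives their column names, and values_to names the new column that receives their cell values: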

penguins_wide %>%
  pivot_longer(cols = c("2007", "2008", "2009"), 
               names_to = "year",
               values_to = "mean_body_mass_g")
## # A tibble: 9 × 3
## # Groups:   island [3]
##   island    year  mean_body_mass_g
##   <fct>     <chr>            <dbl>
## 1 Biscoe    2007             4741.
## 2 Biscoe    2008             4628.
## 3 Biscoe    2009             4793.
## 4 Dream     2007             3684.
## 5 Dream     2008             3779.
## 6 Dream     2009             3691.
## 7 Torgersen 2007             3763.
## 8 Torgersen 2008             3856.
## 9 Torgersen 2009             3489.

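As you might have noticed, pivot_longer() and pivot_wider() are inverse operations: pivoting a widened dataset back to a longer format recovers the original rows and values, and vice versa. The one difference in the output above is that year comes back as a character column (<chr>) rather than an integer (<int>), because it was rebuilt from column names, which are always strings.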
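
The round trip can be checked directly. The sketch below is a minimal example under a few assumptions: penguins_sum is rebuilt as a stand-in from the rounded values printed earlier (the real object was computed before this section), penguins_long is a name chosen here for illustration, and the names_transform argument, available in recent versions of tidyr, is used to turn the year labels back into integers during the pivot.

```r
library(dplyr)
library(tidyr)

# Stand-in for penguins_sum, typed in from the rounded printout
# (assumption: the real object holds unrounded means)
penguins_sum <- tibble(
  island = factor(rep(c("Biscoe", "Dream", "Torgersen"), each = 3)),
  year = rep(2007:2009, times = 3),
  mean_body_mass_g = c(4741, 4628, 4793,
                       3684, 3779, 3691,
                       3763, 3856, 3489)
) %>%
  group_by(island)

# Long to wide: one row per island, one column per year
penguins_wide <- penguins_sum %>%
  pivot_wider(id_cols = island,
              names_from = year,
              values_from = mean_body_mass_g)

# Wide to long again; names_transform converts the year labels,
# which are strings because they were column names, back to integers
penguins_long <- penguins_wide %>%
  pivot_longer(cols = c("2007", "2008", "2009"),
               names_to = "year",
               names_transform = list(year = as.integer),
               values_to = "mean_body_mass_g")

# Compare the round-tripped columns with the originals
identical(penguins_long$year, penguins_sum$year)
identical(penguins_long$mean_body_mass_g, penguins_sum$mean_body_mass_g)
```

Without names_transform, the same result could be reached afterwards with mutate(year = as.integer(year)); doing the conversion inside pivot_longer() simply keeps the reshaping in one step.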

If you are interested in learning more about tidy data and pivoting, see the Tidy Data chapter in R for Data Science.