Summary tables

kable and kableExtra

The Summarizing data page shows how to compute summaries of your data. Summary tables can be useful for displaying data, and the kable() function in the R package knitr allows you to present tables with helpful formatting.

First, install and load the package.

install.packages("knitr")
library(knitr)

Using the palmerpenguins data, calculate the mean penguin body mass (g) for each island in each year. (For more information on calculating summary statistics, see the Data Wrangling section.)

The code below calculates the summary statistics and saves them in a new dataset called penguin_sum.

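library(palmerpenguins)  # provides the penguins data
library(dplyr)           # provides %>%, group_by(), and summarize()

# Mean body mass for each island-year combination, ignoring missing values
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE)) %>%
  ungroup()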

You can print the dataset within R/RStudio:

penguin_sum
## # A tibble: 9 × 3
##   island     year mean_body_mass_g
##   <fct>     <int>            <dbl>
## 1 Biscoe     2007            4741.
## 2 Biscoe     2008            4628.
## 3 Biscoe     2009            4793.
## 4 Dream      2007            3684.
## 5 Dream      2008            3779.
## 6 Dream      2009            3691.
## 7 Torgersen  2007            3763.
## 8 Torgersen  2008            3856.
## 9 Torgersen  2009            3489.

To present the same data in a more polished table, use kable():

kable(x = penguin_sum, 
      format = "html",
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time")
Mean body mass of penguins on different islands over time

Island     Year  Mean Body Mass (g)
Biscoe     2007            4740.909
Biscoe     2008            4628.125
Biscoe     2009            4792.797
Dream      2007            3684.239
Dream      2008            3779.412
Dream      2009            3691.477
Torgersen  2007            3763.158
Torgersen  2008            3856.250
Torgersen  2009            3489.062
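
The means above print with three decimal places. As a minimal sketch, assuming the same palmerpenguins summary as before, kable()'s digits and align arguments can round the values and set column alignment. The object name rounded_table is only illustrative.

```r
library(palmerpenguins)
library(dplyr)
library(knitr)

# Recreate the summary table from the steps above
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

# digits = 1 rounds numeric columns to one decimal place;
# align = "lcr" left-aligns, centers, and right-aligns the three columns
rounded_table <- kable(x = penguin_sum,
                       format = "html",
                       digits = 1,
                       align = "lcr",
                       col.names = c("Island", "Year", "Mean Body Mass (g)"),
                       caption = "Mean body mass of penguins on different islands over time")
rounded_table
```

Rounding inside kable() changes only the displayed values; penguin_sum itself keeps full precision for any later calculations.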

For additional customization options, use the package kableExtra.

install.packages("kableExtra")
library(kableExtra)

Start with code very similar to what you used for the basic kable table. Add the function kable_styling() to get the basic kableExtra format:

kable(x = penguin_sum, 
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time") %>%
  kable_styling()
Mean body mass of penguins on different islands over time

Island     Year  Mean Body Mass (g)
Biscoe     2007            4740.909
Biscoe     2008            4628.125
Biscoe     2009            4792.797
Dream      2007            3684.239
Dream      2008            3779.412
Dream      2009            3691.477
Torgersen  2007            3763.158
Torgersen  2008            3856.250
Torgersen  2009            3489.062

Some other options to consider as alternatives to kable_styling() are: kable_classic(), kable_paper(), kable_classic_2(), kable_minimal(), kable_material(), and kable_material_dark().

To keep the table from stretching across the full page width, add full_width = FALSE inside your kable_styling() call (or the call to one of the alternatives above). Additional options with kableExtra include changing the font and font size:

kable(x = penguin_sum, 
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time") %>%
  kable_classic(full_width = FALSE, html_font = "Cambria", font_size = 16)
Mean body mass of penguins on different islands over time

Island     Year  Mean Body Mass (g)
Biscoe     2007            4740.909
Biscoe     2008            4628.125
Biscoe     2009            4792.797
Dream      2007            3684.239
Dream      2008            3779.412
Dream      2009            3691.477
Torgersen  2007            3763.158
Torgersen  2008            3856.250
Torgersen  2009            3489.062

The html_font option also works with the other themes, such as kable_material_dark():

kable(x = penguin_sum, 
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time") %>%
   kable_material_dark(html_font = "Cambria")
Mean body mass of penguins on different islands over time

Island     Year  Mean Body Mass (g)
Biscoe     2007            4740.909
Biscoe     2008            4628.125
Biscoe     2009            4792.797
Dream      2007            3684.239
Dream      2008            3779.412
Dream      2009            3691.477
Torgersen  2007            3763.158
Torgersen  2008            3856.250
Torgersen  2009            3489.062
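
As one more sketch of the customizations available, the code below combines kable_styling()'s bootstrap_options argument (striped rows and hover highlighting) with column_spec() to bold a single column. It assumes the same palmerpenguins summary as before and HTML output; the object name styled_table is only illustrative.

```r
library(palmerpenguins)
library(dplyr)
library(knitr)
library(kableExtra)

# Recreate the summary table from the steps above
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

styled_table <- kable(x = penguin_sum,
                      format = "html",
                      col.names = c("Island", "Year", "Mean Body Mass (g)"),
                      caption = "Mean body mass of penguins on different islands over time") %>%
  # striped rows, highlight on hover, and a table no wider than its content
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  # bold the third column (mean body mass)
  column_spec(3, bold = TRUE)
styled_table
```

The bootstrap_options setting only affects HTML output, which is why format = "html" is set explicitly here.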

To learn more about these and other customizations, see the kableExtra vignette.

gt

Another approach for creating tables is the gt package. gt stands for “grammar of tables,” and feels a lot like working with ggplot2’s grammar of graphics.

install.packages("gt")
library(gt)

You can create a basic table by adding gt() on to the end of your code:

penguin_sum %>% 
  gt()
island     year  mean_body_mass_g
Biscoe     2007          4740.909
Biscoe     2008          4628.125
Biscoe     2009          4792.797
Dream      2007          3684.239
Dream      2008          3779.412
Dream      2009          3691.477
Torgersen  2007          3763.158
Torgersen  2008          3856.250
Torgersen  2009          3489.062

You can also adjust your column names and include a title by adding on to your baseline table. Notice that you can use the md() wrapper, which stands for “markdown,” to apply markdown styling in your tables: wrapping text in ** makes it bold, as with the title below, and wrapping it in a single * makes it italic.

penguin_sum %>% 
  gt() %>% 
  cols_label(
    mean_body_mass_g = "Mean Body Mass (g)"
    ) %>% 
  tab_header(
    title = md("**Mean body mass of penguins on different islands**"),
    subtitle = "2007-2009"
    ) 
Mean body mass of penguins on different islands
2007-2009

island     year  Mean Body Mass (g)
Biscoe     2007            4740.909
Biscoe     2008            4628.125
Biscoe     2009            4792.797
Dream      2007            3684.239
Dream      2008            3779.412
Dream      2009            3691.477
Torgersen  2007            3763.158
Torgersen  2008            3856.250
Torgersen  2009            3489.062
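
The table above relabels only one column and leaves the means at three decimal places. As a minimal sketch, assuming the same palmerpenguins summary as before, cols_label() can rename every column and fmt_number() can control the number of decimals. The object name formatted_table is only illustrative.

```r
library(palmerpenguins)
library(dplyr)
library(gt)

# Recreate the summary table from the steps above
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

formatted_table <- penguin_sum %>%
  gt() %>%
  # relabel all three columns, not just the body mass column
  cols_label(
    island = "Island",
    year = "Year",
    mean_body_mass_g = "Mean Body Mass (g)"
  ) %>%
  # show one decimal place; by default fmt_number() also adds
  # thousands separators
  fmt_number(
    columns = mean_body_mass_g,
    decimals = 1
  )
formatted_table
```

Because only mean_body_mass_g is passed to fmt_number(), the year column is left unformatted and does not pick up a thousands separator.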

Finally, you can color the island column to help the viewer differentiate between the islands (rows can also be grouped by island). Before adding color to your table, you will need to complete a few additional steps.

First, install a few new packages and choose a color palette from the many available:

install.packages("paletteer")
install.packages("scales")

You can run this line of code to look at all the color options:

info_paletteer(color_pkgs = NULL)

The syntax is generally packagename::palette, and you will usually have to install the package before accessing the palette. Notice that you can specify how many colors you want with the n = 3 argument.

install.packages("ggsci")
palette <- paletteer::paletteer_d("ggsci::teal_material", n = 3) %>%
  as.character()

penguin_sum %>% 
  gt() %>% 
  cols_label(
    mean_body_mass_g = "Mean Body Mass (g)"
    ) %>% 
  tab_header(
    title = md("**Mean body mass of penguins on different islands**"),
    subtitle = "2007-2009"
    ) %>% 
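  data_color(
    columns = "island",
    # Note: in gt 0.9.0 and later, `colors` is deprecated in favor of `fn`
    colors = scales::col_factor(
      palette,  # already a character vector after as.character() above
      domain = NULL
    )
  )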
Mean body mass of penguins on different islands
2007-2009

island     year  Mean Body Mass (g)
Biscoe     2007            4740.909
Biscoe     2008            4628.125
Biscoe     2009            4792.797
Dream      2007            3684.239
Dream      2008            3779.412
Dream      2009            3691.477
Torgersen  2007            3763.158
Torgersen  2008            3856.250
Torgersen  2009            3489.062
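
The example above colors the island column but does not group the rows. As a minimal sketch of grouping, assuming the same palmerpenguins summary as before, gt()'s groupname_col argument turns each island into a row-group heading instead of a repeated column value. The object name grouped_table is only illustrative.

```r
library(palmerpenguins)
library(dplyr)
library(gt)

# Recreate the summary table from the steps above
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

grouped_table <- penguin_sum %>%
  # each distinct island becomes a row-group heading
  gt(groupname_col = "island") %>%
  cols_label(
    year = "Year",
    mean_body_mass_g = "Mean Body Mass (g)"
  ) %>%
  tab_header(
    title = md("**Mean body mass of penguins on different islands**"),
    subtitle = "2007-2009"
  )
grouped_table
```

With the island shown as a group heading, each group lists only its years and means, which may make a separate color cue unnecessary.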

As with kable, there is a lot that you can do to customize your gt tables. These examples do not provide a full introduction to the options available within gt; there are many great resources that provide more detail. For more on using gt, consider starting with the gt website, which goes into depth on using the package.