The Summarizing data page shows how to compute summaries of your data. Summary tables can be useful for displaying data, and the kable() function in the R package knitr allows you to present tables with helpful formatting.
First, install and load the package.
install.packages("knitr")
library(knitr)
Using the palmerpenguins data, calculate the mean body mass (g) of penguins on each island in each year. (For more information on calculating summary statistics, see the Data Wrangling section.)
The code below calculates the summary statistics and saves them in a new dataset called penguin_sum.
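library(palmerpenguins)  # provides the penguins data
library(dplyr)           # provides %>%, group_by(), and summarize()

penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE)) %>%
  ungroup()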
You can print the dataset within R/RStudio:
penguin_sum
## # A tibble: 9 × 3
## island year mean_body_mass_g
## <fct> <int> <dbl>
## 1 Biscoe 2007 4741.
## 2 Biscoe 2008 4628.
## 3 Biscoe 2009 4793.
## 4 Dream 2007 3684.
## 5 Dream 2008 3779.
## 6 Dream 2009 3691.
## 7 Torgersen 2007 3763.
## 8 Torgersen 2008 3856.
## 9 Torgersen 2009 3489.
To make a possibly more appealing table with the same data, use kable():
kable(x = penguin_sum,
      format = "html",
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time")
Island | Year | Mean Body Mass (g) |
---|---|---|
Biscoe | 2007 | 4740.909 |
Biscoe | 2008 | 4628.125 |
Biscoe | 2009 | 4792.797 |
Dream | 2007 | 3684.239 |
Dream | 2008 | 3779.412 |
Dream | 2009 | 3691.477 |
Torgersen | 2007 | 3763.158 |
Torgersen | 2008 | 3856.250 |
Torgersen | 2009 | 3489.062 |
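The table above prints the means to three decimal places, which is often more precision than a summary table needs. As a minimal sketch, the digits argument of kable() can round numeric columns. This assumes the palmerpenguins and dplyr packages are installed, and it rebuilds penguin_sum so that it runs on its own; the name penguin_tbl is introduced here only for illustration.

```r
library(knitr)
library(dplyr)
library(palmerpenguins)

# Rebuild the summary so this example is self-contained.
# .groups = "drop" returns an ungrouped result, like calling ungroup().
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

# digits = 0 rounds numeric columns to whole numbers (here, whole grams)
penguin_tbl <- kable(x = penguin_sum,
                     format = "html",
                     digits = 0,
                     col.names = c("Island", "Year", "Mean Body Mass (g)"),
                     caption = "Mean body mass of penguins on different islands over time")
penguin_tbl
```

Rounding at display time leaves the values stored in penguin_sum untouched, so later calculations still use full precision.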
For additional customization options, use the package kableExtra.
install.packages("kableExtra")
library(kableExtra)
Start with code very similar to what you used for the basic kable table. Add the function kable_styling() to get the basic kableExtra format:
kable(x = penguin_sum,
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time") %>%
  kable_styling()
Island | Year | Mean Body Mass (g) |
---|---|---|
Biscoe | 2007 | 4740.909 |
Biscoe | 2008 | 4628.125 |
Biscoe | 2009 | 4792.797 |
Dream | 2007 | 3684.239 |
Dream | 2008 | 3779.412 |
Dream | 2009 | 3691.477 |
Torgersen | 2007 | 3763.158 |
Torgersen | 2008 | 3856.250 |
Torgersen | 2009 | 3489.062 |
Some other options to consider as alternatives to kable_styling() are: kable_classic(), kable_paper(), kable_classic_2(), kable_minimal(), kable_material(), and kable_material_dark().
To keep the table from stretching across the full page width, add full_width = FALSE within your kable_styling() (or similar) function call. Additional options with kableExtra include changing the font and font size:
kable(x = penguin_sum,
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time") %>%
  kable_classic(full_width = FALSE, html_font = "Cambria", font_size = 16)
Island | Year | Mean Body Mass (g) |
---|---|---|
Biscoe | 2007 | 4740.909 |
Biscoe | 2008 | 4628.125 |
Biscoe | 2009 | 4792.797 |
Dream | 2007 | 3684.239 |
Dream | 2008 | 3779.412 |
Dream | 2009 | 3691.477 |
Torgersen | 2007 | 3763.158 |
Torgersen | 2008 | 3856.250 |
Torgersen | 2009 | 3489.062 |
You can also combine a custom font with a different theme, such as kable_material_dark():
kable(x = penguin_sum,
      col.names = c("Island", "Year", "Mean Body Mass (g)"),
      caption = "Mean body mass of penguins on different islands over time") %>%
  kable_material_dark(html_font = "Cambria")
Island | Year | Mean Body Mass (g) |
---|---|---|
Biscoe | 2007 | 4740.909 |
Biscoe | 2008 | 4628.125 |
Biscoe | 2009 | 4792.797 |
Dream | 2007 | 3684.239 |
Dream | 2008 | 3779.412 |
Dream | 2009 | 3691.477 |
Torgersen | 2007 | 3763.158 |
Torgersen | 2008 | 3856.250 |
Torgersen | 2009 | 3489.062 |
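The styling functions also accept options beyond fonts. As a minimal sketch, the bootstrap_options argument of kable_styling() can add striped rows and hover highlighting to an HTML table. This assumes the palmerpenguins and dplyr packages are installed, and it rebuilds penguin_sum so that it runs on its own; the name styled_tbl is introduced here only for illustration.

```r
library(knitr)
library(kableExtra)
library(dplyr)
library(palmerpenguins)

# Rebuild the summary so this example is self-contained
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

# Striped rows plus hover highlighting, in a table that is
# only as wide as its contents and aligned to the left
styled_tbl <- kable(x = penguin_sum,
                    format = "html",
                    col.names = c("Island", "Year", "Mean Body Mass (g)"),
                    caption = "Mean body mass of penguins on different islands over time") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE,
                position = "left")
styled_tbl
```

The hover effect is only visible in interactive HTML output, such as a rendered R Markdown document viewed in a browser.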
To learn more about these and other customizations, see the kableExtra vignette.
Another approach for creating tables is the gt package. gt stands for “grammar of tables,” and feels a lot like working with ggplot2’s grammar of graphics.
install.packages("gt")
library(gt)
You can create a basic table by adding gt() on to the end of your code:
penguin_sum %>%
  gt()
island | year | mean_body_mass_g |
---|---|---|
Biscoe | 2007 | 4740.909 |
Biscoe | 2008 | 4628.125 |
Biscoe | 2009 | 4792.797 |
Dream | 2007 | 3684.239 |
Dream | 2008 | 3779.412 |
Dream | 2009 | 3691.477 |
Torgersen | 2007 | 3763.158 |
Torgersen | 2008 | 3856.250 |
Torgersen | 2009 | 3489.062 |
You can also adjust your column names and include a title by adding on to your baseline table. Notice that you can use the md() wrapper, which stands for “markdown,” to use markdown styling in your tables; the ** on either side bolds the text passed to the title argument. One * on either side would italicize the text instead.
penguin_sum %>%
  gt() %>%
  cols_label(
    mean_body_mass_g = "Mean Body Mass (g)"
  ) %>%
  tab_header(
    title = md("**Mean body mass of penguins on different islands**"),
    subtitle = "2007-2009"
  )
Mean body mass of penguins on different islands
2007-2009
island | year | Mean Body Mass (g) |
---|---|---|
Biscoe | 2007 | 4740.909 |
Biscoe | 2008 | 4628.125 |
Biscoe | 2009 | 4792.797 |
Dream | 2007 | 3684.239 |
Dream | 2008 | 3779.412 |
Dream | 2009 | 3691.477 |
Torgersen | 2007 | 3763.158 |
Torgersen | 2008 | 3856.250 |
Torgersen | 2009 | 3489.062 |
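The gt table above still shows three decimal places and relabels only one column. As a minimal sketch, fmt_number() can round a column at display time, and cols_label() can relabel every column in one call. This assumes the palmerpenguins and dplyr packages are installed, and it rebuilds penguin_sum so that it runs on its own; the name penguin_gt is introduced here only for illustration.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

# Rebuild the summary so this example is self-contained
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

penguin_gt <- penguin_sum %>%
  gt() %>%
  # Relabel all three columns for display
  cols_label(
    island = "Island",
    year = "Year",
    mean_body_mass_g = "Mean Body Mass (g)"
  ) %>%
  # Show the means as whole grams; the underlying data are unchanged
  fmt_number(columns = mean_body_mass_g, decimals = 0)
penguin_gt
```

By default fmt_number() also inserts thousands separators; setting use_seps = FALSE turns them off.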
Finally, you can color a column to help the viewer differentiate between the islands. Before adding color to your table, you will need to complete a few additional steps.
First, install a few new packages and choose a color palette from the many available:
install.packages("paletteer")
install.packages("scales")
You can run this line of code to look at all the color options:
info_paletteer(color_pkgs = NULL)
The syntax is generally packagename::palette, and you will usually have to install the package before accessing the palette. Notice that you can specify how many colors you want with the n = 3 argument.
install.packages("ggsci")
# Pick three colors and convert them to a plain character vector of hex codes
palette <- paletteer::paletteer_d("ggsci::teal_material", n = 3) %>%
  as.character()

penguin_sum %>%
  gt() %>%
  cols_label(
    mean_body_mass_g = "Mean Body Mass (g)"
  ) %>%
  tab_header(
    title = md("**Mean body mass of penguins on different islands**"),
    subtitle = "2007-2009"
  ) %>%
  # colors takes a function that maps values to colors;
  # gt 0.9.0 and later deprecate this argument in favor of fn
  data_color(
    columns = "island",
    colors = scales::col_factor(
      palette,
      domain = NULL
    )
  )
Mean body mass of penguins on different islands
2007-2009
island | year | Mean Body Mass (g) |
---|---|---|
Biscoe | 2007 | 4740.909 |
Biscoe | 2008 | 4628.125 |
Biscoe | 2009 | 4792.797 |
Dream | 2007 | 3684.239 |
Dream | 2008 | 3779.412 |
Dream | 2009 | 3691.477 |
Torgersen | 2007 | 3763.158 |
Torgersen | 2008 | 3856.250 |
Torgersen | 2009 | 3489.062 |
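Besides coloring the island column, rows can also be grouped by island so that each island name appears once as a group heading. A minimal sketch using the groupname_col argument of gt() follows. It assumes the palmerpenguins and dplyr packages are installed, and it rebuilds penguin_sum so that it runs on its own; the name penguin_grouped is introduced here only for illustration.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

# Rebuild the summary so this example is self-contained
penguin_sum <- penguins %>%
  group_by(island, year) %>%
  summarize(mean_body_mass_g = mean(body_mass_g, na.rm = TRUE),
            .groups = "drop")

# groupname_col turns each distinct island into a row-group heading,
# so the island column no longer repeats on every row
penguin_grouped <- penguin_sum %>%
  gt(groupname_col = "island") %>%
  cols_label(
    year = "Year",
    mean_body_mass_g = "Mean Body Mass (g)"
  ) %>%
  tab_header(
    title = md("**Mean body mass of penguins on different islands**"),
    subtitle = "2007-2009"
  )
penguin_grouped
```

Grouping and coloring serve a similar purpose, so one of the two is usually enough for a table of this size.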
As with kable, there is a lot that you can do to customize your gt tables. These examples do not provide a full introduction to the options available within gt; there are many great resources that provide more detail. For more on using gt, consider starting with the gt website, which goes into depth on using the package.