Meet the Palmer Penguins

These tutorials use a dataset named penguins from the palmerpenguins package. The dataset contains body measurements for three species of penguins from three islands in the Palmer Archipelago, Antarctica.

The penguins dataset is useful for learning R because it contains multiple kinds of data (both categorical and numeric variables). More information about the package and its data can be found in its GitHub repository and documentation. You can read about why we are not using iris, another common example dataset, in this blog post by Megan Stodel.

Artwork by @allison_horst

If you want to work along with any of the examples on this site, you will first need to install the palmerpenguins package:

install.packages("palmerpenguins")

Note that you only need to install the package once. After that, you will need to load the package in every new R session in which you want to use the data. To load the package:

library(palmerpenguins)
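If you keep the install and load steps together in one script, a small guard can skip the installation when the package is already present. This is only a sketch of one common pattern, not something the tutorials require; it relies on the base R function requireNamespace():

```r
# Install palmerpenguins only if it is not already installed
if (!requireNamespace("palmerpenguins", quietly = TRUE)) {
  install.packages("palmerpenguins")
}

# Load the package for the current session
library(palmerpenguins)
```

With this guard, rerunning the script does not reinstall the package each time.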

Now that the package is loaded, you can access the data with the data() function:

data("penguins")

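You might notice that the dataset does not seem to be fully loaded. If you are working in RStudio, there is probably a <Promise> symbol next to penguins in the Environment pane. This means that R will only load the data the first time it is actually used. To load the data right away, click on <Promise>, or simply use penguins in a command. The penguins dataset is now ready to use.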
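As an alternative to clicking, any command that uses the dataset should have the same effect. The sketch below assumes the package is installed and uses dim(), which returns the number of rows and columns:

```r
library(palmerpenguins)
data("penguins")

# Using the object in a command makes R load the data
dim(penguins)
## [1] 344   8
```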

You can explore the penguins data by typing its name, which prints the first rows:

penguins
## # A tibble: 344 × 8
##    species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
##    <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
##  1 Adelie  Torgersen           39.1          18.7               181        3750
##  2 Adelie  Torgersen           39.5          17.4               186        3800
##  3 Adelie  Torgersen           40.3          18                 195        3250
##  4 Adelie  Torgersen           NA            NA                  NA          NA
##  5 Adelie  Torgersen           36.7          19.3               193        3450
##  6 Adelie  Torgersen           39.3          20.6               190        3650
##  7 Adelie  Torgersen           38.9          17.8               181        3625
##  8 Adelie  Torgersen           39.2          19.6               195        4675
##  9 Adelie  Torgersen           34.1          18.1               193        3475
## 10 Adelie  Torgersen           42            20.2               190        4250
## # … with 334 more rows, and 2 more variables: sex <fct>, year <int>

In this data, every row is a unique observation (an individual penguin). For each penguin, you have the species of the penguin, which island it lives on, its sex, the year the measurements were taken, and four body size measurements: bill length, bill depth, flipper length, and body mass. Some values are missing and are shown as NA, as in row 4.
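A few base R commands give a quick overview of the structure described above. This is a minimal sketch that assumes the package is installed; the object names column_classes, species_by_island, and missing_per_column are chosen here only for illustration:

```r
library(palmerpenguins)

# Class of each column: factors are categorical, numeric and integer are numbers
column_classes <- sapply(penguins, class)
print(column_classes)

# Number of penguins for each combination of species and island
species_by_island <- table(penguins$species, penguins$island)
print(species_by_island)

# Number of missing values (NA) in each column
missing_per_column <- colSums(is.na(penguins))
print(missing_per_column)
```

The first result shows which variables are categorical and which are numeric, and the last one shows which columns need care because of missing values.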