From a package

Many R packages contain their own data. Most of the data used in tutorials on the Data at Reed site comes from a package called palmerpenguins. The palmerpenguins package includes datasets from a study of three different species of penguins found in Antarctica.

Artwork by @allison_horst

To load data from a package, first install the package with the install.packages() function:

install.packages("palmerpenguins")
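If you keep the install line in a script that you rerun, the package will be downloaded again each time. A minimal sketch of one way to avoid that, assuming a CRAN mirror is already configured (RStudio sets one by default), is to check for the package first with base R's requireNamespace():

```r
# Install palmerpenguins only if it is not already available.
# requireNamespace() returns TRUE when the package can be loaded, FALSE otherwise.
if (!requireNamespace("palmerpenguins", quietly = TRUE)) {
  install.packages("palmerpenguins")
}
```

This is only a convenience for scripts. Running install.packages() once in the console works just as well.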

Now that you have the package installed, you need to load it into your environment with library(). (You only need to install a package once, but you need to load it every time you start a new R session.)

library(palmerpenguins)

Now palmerpenguins is loaded. Where is the data?

When working with a package, there are a couple of options for finding the data contained in that package. (While all of these examples focus on the palmerpenguins data, you could use these instructions for any dataset of interest.)

For the most complete information, run help(package = "palmerpenguins") in your console. This opens the package's help index, which lists the documented datasets and functions in the package along with a short description of each.
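If you only want the datasets rather than the full help index, base R's data() function can list them when it is given a package argument. The sketch below assumes palmerpenguins is installed; the variable name datasets_in_pkg is made up for illustration.

```r
# List the datasets that ship with palmerpenguins.
# data(package = ...) returns an object whose `results` matrix has
# one row per dataset, with columns that include "Item" and "Title".
datasets_in_pkg <- data(package = "palmerpenguins")$results

# Show just the dataset names and their descriptions
datasets_in_pkg[, c("Item", "Title")]
```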

If you are already familiar with the data and functions in a package and you only need a list of names to jog your memory, you can make use of RStudio’s autocomplete. Type the name of your package followed by two colons, like this:

palmerpenguins::

The autocomplete popup should list everything the package makes available, including its datasets.
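If you are working outside RStudio, or the autocomplete popup does not appear, you can get a similar list from the console. This sketch assumes the package is installed and attaches it with library() before calling base R's ls() on the attached package:

```r
library(palmerpenguins)

# List the names of the objects (functions and datasets) that the
# attached package makes available on the search path.
ls("package:palmerpenguins")
```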

Once you have found the name of the dataset you want, in our case penguins, you can use the data() function:

data(penguins)

Depending on how datasets are included in a package, you might run that command and see nothing, or you may see <Promise> in the Environment pane in the top-right corner of your RStudio window. A <Promise> means that R knows about the dataset but will not read it into memory until you first use it, so you may not be sure whether your data is loaded. One way to confirm that it is loaded is to run the View() function in your console:

View(penguins)

Alternatively, you can use the head() function to print the first few rows:

head(penguins)
## # A tibble: 6 × 8
##   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
##   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
## 1 Adelie  Torge…           39.1          18.7              181        3750 male 
## 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
## 3 Adelie  Torge…           40.3          18                195        3250 fema…
## 4 Adelie  Torge…           NA            NA                 NA          NA <NA> 
## 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
## 6 Adelie  Torge…           39.3          20.6              190        3650 male 
## # … with 1 more variable: year <int>
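If you would rather confirm the load with a few quick checks than inspect the rows by eye, a handful of base R functions can help. This is a minimal sketch, assuming the package is installed; it repeats the loading steps so that it runs on its own.

```r
library(palmerpenguins)
data(penguins)

# TRUE if R can find an object called "penguins"
exists("penguins")

# Number of rows and number of columns
dim(penguins)

# Compact summary: each column's name, type, and first few values
str(penguins)
```

The column count reported by dim() should match the 8 columns shown in the tibble header above.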

Your data is loaded! You are ready to go.