Useful R Packages
Packages are bundles of specialized code that you can add in to go beyond the basic R functions. When you install a package you are adding in extra coding options that can help you analyze or visualize your data more easily. Anyone can write packages and they can be general or very specific, so depending on your task, you may find someone has written a package to make it easier for you.
The terms "package" and "library" are often used interchangeably (strictly speaking, a library is the folder where your installed packages live). Run install.packages() the first time you need a package, then run library() to load it. install.packages() is like buying a book: you only need to do it once. library() is like getting the book off the shelf: you need to do it every time you want to use it, that is, in each new R session.
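As a minimal sketch of that install-once, load-every-session pattern, the snippet below checks whether a package is already installed before calling install.packages(). Using stats as the example package is an assumption made so the code runs without an internet connection (it ships with R); swap in any package name from this page.

```r
# Name of the package to use. "stats" ships with R, so this sketch runs
# offline; replace it with e.g. "janitor" for a package from CRAN.
pkg <- "stats"

# "Buy the book": install only if the package is not already available.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# "Take it off the shelf": attach the package for this session.
# character.only = TRUE is needed because pkg holds the name as a string.
library(pkg, character.only = TRUE)
```

In everyday scripts you would usually just write library(janitor) at the top and run install.packages("janitor") once from the console.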
The one package to rule them all
The first step in all of your scripts will likely be this line of code:
library(tidyverse)
{tidyverse} is a meta-package that will load many other packages within a single step. When you run the line above, it will load in the following packages for you automatically:
ggplot2
dplyr
tidyr
readr
purrr
tibble
stringr
forcats
lubridate (in tidyverse 2.0.0 and later)
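To illustrate what a single library(tidyverse) call gives you, here is a short sketch that uses dplyr and ggplot2 together without loading either one separately. The mtcars dataset ships with R; the object names mpg_by_cyl and mpg_plot are only illustrative.

```r
library(tidyverse)  # attaches dplyr, ggplot2, tibble, and the other core packages

# dplyr: average fuel economy for each cylinder count
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())

# ggplot2: bar chart of that summary
mpg_plot <- ggplot(mpg_by_cyl, aes(x = factor(cyl), y = mean_mpg)) +
  geom_col() +
  labs(x = "Cylinders", y = "Mean miles per gallon")

mpg_plot
```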
Below are categories of other useful packages. If you are trying to load data from an online database (for example, the US Census), be sure to check the Direct Data Access packages. There may be a package that loads the data for you, so you do not need to download it from the website first.
Loading Data
Package Name | What It Does | Learn More |
readr | for loading in .csv, .txt, and more file types | readr documentation |
readxl | for loading .xlsx file types and other Excel extensions | readxl documentation |
haven | for loading Stata, SAS, and SPSS files | haven documentation |
jsonlite | for importing JSON objects and converting to R data types | jsonlite documentation |
googlesheets4 | for loading data from Google Sheets in a Google Drive account | googlesheets4 documentation |
rvest | for web-scraping | rvest documentation |
duckdb | for querying datasets that are too large to load comfortably into R's memory; if you have a huge dataset, use this package | duckdb documentation |
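As a quick illustration of the first row of the table, here is a minimal readr sketch. The file contents and the column names (site, date, count) are made up for the example, and the CSV is written to a temporary file so the code is self-contained.

```r
library(readr)

# Write a tiny CSV to a temporary file so the example needs no external data.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("site,date,count", "A,2024-01-05,12", "B,2024-01-06,7"),
  csv_path
)

# read_csv() can guess column types; col_types states them explicitly.
surveys <- read_csv(
  csv_path,
  col_types = cols(
    site  = col_character(),
    date  = col_date(),
    count = col_integer()
  )
)

surveys
```

With your own data, replace csv_path with the path to your file. Leaving out col_types lets readr guess the types, which is often fine for a first look.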
Formatting Data
Package Name | What It Does | Learn More |
dplyr | contains the most commonly used tools for data manipulation | dplyr documentation |
tidyr | tools for pivoting tables from wide to long format and vice versa | tidyr documentation |
janitor | for cleaning up and standardizing column names | janitor documentation |
stringr | helpful functions for manipulating strings | stringr documentation |
scales | for overriding default settings for significant digits, plot axes, and more | scales documentation |
lubridate | a must-have package for formatting any data that is a date or time | lubridate documentation |
data.table | good functions for speeding up analysis when you have large data sets | data.table documentation |
broom | for converting statistical model output into tidyverse-friendly data frames | broom documentation |
purrr | tools for working with functions and vectors, helpful for converting from lists of lists to data frames | purrr documentation |
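The wide-to-long pivoting mentioned for tidyr can be sketched as follows. The table wide and its values are hypothetical, invented only to show the shape change.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per site, one column per year.
wide <- tibble(
  site   = c("A", "B"),
  `2022` = c(10, 4),
  `2023` = c(12, 7)
)

# Wide to long: one row per site-year combination.
long <- wide %>%
  pivot_longer(cols = -site, names_to = "year", values_to = "count") %>%
  mutate(year = as.integer(year))

# Long back to wide: one column per year again.
wide_again <- long %>%
  pivot_wider(names_from = year, values_from = count)
```

Long format is usually what ggplot2 and dplyr summaries expect, while wide format tends to be easier to read in a printed table.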
Creating Nice Plots & Tables
Package Name | What It Does | Learn More |
ggplot2 | the best package for making your graphs look nice | ggplot2 documentation |
gt | stands for "great tables" and follows through on its promise | gt documentation |
gtsummary | works with gt to display publication-ready summaries of regressions and more | gtsummary documentation |
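viridis | has pretty color palettes that are perceptually uniform and colorblind-friendly | viridis documentation |
RColorBrewer | has pretty ColorBrewer palettes for sequential, diverging, and qualitative data | RColorBrewer documentation |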
ggpubr | customization for ggplot2 that helps make publication-ready documents | ggpubr documentation |
patchwork | works well with ggplot2 to help align multiple plots or tables in one figure or page | patchwork documentation |
gridExtra | helps align multiple plots or tables in one figure or page | gridExtra documentation |
wesanderson | has color palettes that correspond to each Wes Anderson movie | wesanderson documentation |
plotly | for making your graphs interactive, works well with the shiny package | plotly documentation |
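Here is a small sketch of a ggplot2 scatter plot that uses a viridis color scale. It relies on scale_colour_viridis_c(), which is built into ggplot2, so the separate viridis package is not required for this example. The axis labels are my reading of the mtcars columns.

```r
library(ggplot2)

# Scatter plot of weight vs. fuel economy, colored by horsepower.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = hp)) +
  geom_point(size = 2) +
  scale_colour_viridis_c() +  # continuous viridis palette
  labs(
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Horsepower"
  ) +
  theme_minimal()

p
```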
Useful Stats Packages
Package Name | What It Does | Learn More |
stats | the main source for statistical functions; it ships with R and is loaded by default | stats documentation |
lme4 | for fitting linear and generalized linear mixed-effects models | lme4 documentation |
lmerTest | statistical tests for analyzing linear mixed-effect models | lmerTest documentation |
MASS | functions and datasets from "Modern Applied Statistics with S", including robust and negative binomial regression | MASS documentation |
Hmisc | a lot of miscellaneous additional functions for statistical analysis | Hmisc documentation |
FactoMineR | for multivariate exploratory data analysis | FactoMineR documentation |
outliers | many specific tests for detecting outliers | outliers documentation |
vegan | for ordination analyses and diversity stats, particularly good for ecology | vegan documentation |
car | extra tools for regression analysis | car documentation |
cluster | tools for performing cluster analysis | cluster documentation |
forcats | tools for working with categorical variables | forcats documentation |
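As one example from this table, a mixed-effects model with lme4 might look like the sketch below. It uses the sleepstudy dataset that ships with lme4; the model formula is the standard example for that dataset, not a recommendation for your own data.

```r
library(lme4)

# sleepstudy: reaction times over successive days of sleep deprivation,
# with repeated measurements on each subject.
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Fixed-effect estimates (overall intercept and slope for Days).
fixef(fit)

# Full model summary, including the random-effect variances.
summary(fit)
```

Loading lmerTest instead of lme4 before fitting would add p-values for the fixed effects to the summary.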
Direct Data Access
Package Name | Database Accessed | Learn More |
tidycensus | US Census | tidycensus documentation |
rnoaa* | National Oceanic and Atmospheric Administration | |
COVID19 | daily updates on COVID-19 data | COVID19 documentation |
wbstats | World Bank data | wbstats documentation |
tidyquant | Stock market data | tidyquant documentation |
crimedata | Crime Open Database | crimedata documentation |
eurostat | Eurostat Open Data | eurostat documentation |
WDI | World Bank and World Development Indicators | WDI documentation |
imf.data | International Monetary Fund | imf.data documentation |
fredr | Federal Reserve Economic Data (FRED) | fredr documentation |
googleanalyticsR | Google Analytics | googleanalyticsR documentation |
* They're working on a replacement, but it is still usable.
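To give a sense of how these packages replace a manual download, here is a hedged sketch using WDI. The helper name get_gdp_per_capita, the default countries, and the default years are all hypothetical; NY.GDP.PCAP.KD is the World Bank indicator code for GDP per capita in constant US dollars. Calling the function requires the WDI package and an internet connection, so the call itself is left commented out.

```r
# Hypothetical helper: fetch GDP per capita for a few countries from the
# World Bank, using WDI::WDI() so the package need not be attached first.
get_gdp_per_capita <- function(countries = c("US", "CA"),
                               start = 2015,
                               end = 2020) {
  WDI::WDI(
    country   = countries,          # ISO-2 country codes
    indicator = "NY.GDP.PCAP.KD",   # GDP per capita, constant US$
    start     = start,
    end       = end
  )
}

# Uncomment to run (needs WDI installed and internet access):
# gdp <- get_gdp_per_capita()
# head(gdp)
```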