Making informative and readable graphs

In addition to x, y, color, and fill covered in the Intro to data visualization section, you can use additional visual variables to represent variables in your data. Below are some additional methods for conveying more information on your graphs, while still keeping your graphs readable

Aesthetic mappings

The aesthetics shape and linetype are best used to display a categorical variable, while size is best used with continuous variables. These are used in the same way as color above. For example, you could make the same graph that was shown in the Linegraphs section, with different islands shown by linetype instead of color:

ggplot(data = penguins_sum, 
       mapping = aes(x = year, y = mean_body_mass_g, linetype = island)) +
  geom_line()

To see more specifics of mapping aesthetics, this blog post provides helpful examples and highlights cases where different aesthetic mappings might be particularly appropriate.

There are many resources available for using ggplot2 in visualization, including the comprehensive ggplot2 documentation and the related chapter in R for Data Science.

Adjusting transparency with alpha

Sometimes, when using geom_point(), points can get a bit crowded. (This is called overplotting.) One way to address this is to change the transparency of the points. In the code below, alpha = 0.5 specifies the transparency of the points as 50%, on a scale from 0 (most transparent) to 1 (least transparent):

ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm, color = sex)) +
  geom_point(alpha = 0.5)

Creating side-by-side graphs (faceting)

To add another variable to an already “busy” graph, facet_wrap() can be useful. Consider the following plot:

ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm, color = sex)) +
  geom_point(alpha = 0.7)

Incorporating facet_wrap() separates the data by species and plots them side-by-side, providing a visual that shows how bill length and bill depth vary across species and across sex:

ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm, color = sex)) +
  geom_point(alpha = 0.7) +
  facet_wrap(~species)

Faceting by species creates three graphs, with identical x and y axis scales and coloring. These side-by-side graphs facilitate comparisons across values of species.