Barplots

Barplots show how observations are distributed across the levels of a categorical variable. In ggplot2 they are drawn with either the geom_bar() or the geom_col() function.

ggplot(data = penguins, mapping = aes(x = island)) +
  geom_bar()

You can display a second variable by mapping it to fill inside your aes() call.

ggplot(data = penguins, mapping = aes(x = island, fill = sex)) +
  geom_bar()
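
By default the filled segments are stacked on top of one another within each bar. As a minimal sketch of one alternative, the position argument of geom_bar() can place them side by side instead. The library() calls assume that penguins comes from the palmerpenguins package, and the name p_dodge is only illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the penguins data

# Same mapping as above, but with one bar per sex within each island
# instead of a single stacked bar.
p_dodge <- ggplot(data = penguins, mapping = aes(x = island, fill = sex)) +
  geom_bar(position = "dodge")

p_dodge
```

Stacking makes island totals easy to read, while dodging makes it easier to compare the groups within one island.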

In the examples above, ggplot2 determines the height of each bar by counting the observations in each category. geom_bar() does this counting automatically and uses the result as the y aesthetic, so you do not need to specify y in the ggplot2 code.
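
To make the automatic counting visible, the sketch below writes the computed count into the y aesthetic explicitly with after_stat(count), which should draw the same plot as the default call. It assumes penguins comes from the palmerpenguins package and a ggplot2 version that provides after_stat() (3.3.0 or later). The names p_implicit and p_explicit are only illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: source of the penguins data

# Default call: geom_bar() tallies the rows for each island behind the scenes.
p_implicit <- ggplot(data = penguins, mapping = aes(x = island)) +
  geom_bar()

# The same plot with the computed count mapped to y explicitly.
p_explicit <- ggplot(data = penguins, mapping = aes(x = island)) +
  geom_bar(mapping = aes(y = after_stat(count)))

# layer_data() returns the values ggplot2 computed for a layer,
# including the per-island count used as the bar height.
layer_data(p_implicit)[, c("x", "count", "y")]
```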

Your data may already contain counts, in which case you would use a different geom to construct your barplot: geom_col(). Continuing with the penguins example, first create a new dataset with the count of penguins by island:

penguin_sum <- penguins %>%
  count(island)

penguin_sum
## # A tibble: 3 × 2
##   island        n
##   <fct>     <int>
## 1 Biscoe      168
## 2 Dream       124
## 3 Torgersen    52

To make the same graph as above with pre-counted data, use the geom_col() function (as in “column”) and map the count variable to y to set the height of the bars.

ggplot(data = penguin_sum, mapping = aes(x = island, y = n)) +
  geom_col()

Notice that you had to specify y = n inside of aes(). A helpful way to remember the difference between geom_bar() and geom_col() is that Col needs Counts. For additional examples of geom_bar() and geom_col(), see the barplots section of ModernDive.
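
As a quick check that the two approaches agree, the sketch below builds both plots and compares the bar heights that ggplot2 computes for each. It assumes the palmerpenguins and dplyr packages are available. The names p_bar, p_col, bar_heights, and col_heights are only illustrative.

```r
library(ggplot2)
library(dplyr)
library(palmerpenguins)  # assumption: source of the penguins data

# Pre-counted data, as created earlier in this section.
penguin_sum <- penguins %>%
  count(island)

# Raw data: geom_bar() counts the rows itself.
p_bar <- ggplot(data = penguins, mapping = aes(x = island)) +
  geom_bar()

# Pre-counted data: geom_col() takes the height from y = n.
p_col <- ggplot(data = penguin_sum, mapping = aes(x = island, y = n)) +
  geom_col()

# Extract the bar heights from each plot and compare them.
bar_heights <- layer_data(p_bar)$y
col_heights <- layer_data(p_col)$y
all(bar_heights == col_heights)
```

If the comparison returns TRUE, the only difference between the two plots is where the counting happened: inside ggplot2 or beforehand in count().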