Barplots show the distribution of data across the levels of a categorical variable. In ggplot2 they are drawn with either the geom_bar() or geom_col() function.
ggplot(data = penguins, mapping = aes(x = island)) +
  geom_bar()
You can display another variable by including fill = inside your aes() call.
ggplot(data = penguins, mapping = aes(x = island, fill = sex)) +
  geom_bar()
In the above examples, ggplot2 determines the height of each bar by counting the number of observations in each category. This counting is done automatically and is not specified in the ggplot() call. The count provides geom_bar() with a y aesthetic, so it is not necessary to specify one in the ggplot2 code.
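To see the counts that geom_bar() computes behind the scenes, you can inspect the data ggplot2 builds for the layer. The following is a minimal sketch, assuming the penguins data comes from the palmerpenguins package (an assumption, since the text does not name its source); layer_data() is a ggplot2 helper that returns the computed data for one layer of a plot.

```r
library(ggplot2)
library(palmerpenguins)  # assumption: the source of the penguins data

p <- ggplot(data = penguins, mapping = aes(x = island)) +
  geom_bar()

# Pull out the data ggplot2 computed for the first (and only) layer.
# The count column is the automatic tally; y is the bar height it supplies.
bar_data <- layer_data(p, 1)
bar_data[, c("x", "count", "y")]
```

The count and y columns should match, which is why no y aesthetic is needed in the plotting code.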
Your data may already contain counts, in which case you would use a different geom_ function, geom_col(), to construct your barplot. Continuing with the penguins example, first create a new dataset with the count of penguins by island:
penguin_sum <- penguins %>%
  count(island)
penguin_sum
## # A tibble: 3 × 2
##   island        n
##   <fct>     <int>
## 1 Biscoe      168
## 2 Dream       124
## 3 Torgersen    52
To make the same graph as above with pre-counted data, use the geom_col() (as in “column”) function and specify the height of the bars.
ggplot(data = penguin_sum, mapping = aes(x = island, y = n)) +
  geom_col()
Notice that you had to specify y = n inside of aes(). A helpful way to remember the difference between geom_bar() and geom_col() is that Col needs Counts. For additional examples of geom_bar() and geom_col(), see the barplots section of ModernDive.
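As a quick check that the two approaches draw the same bars, you can compare the bar heights each plot computes. This is a sketch under the same assumption as before (penguins loaded from the palmerpenguins package); the object names p_bar, p_col, heights_bar, and heights_col are illustrative only.

```r
library(ggplot2)
library(dplyr)
library(palmerpenguins)  # assumption: the source of the penguins data

# Pre-counted version of the data, one row per island
penguin_sum <- penguins %>%
  count(island)

# geom_bar() counts the raw rows; geom_col() uses the supplied counts
p_bar <- ggplot(data = penguins, mapping = aes(x = island)) +
  geom_bar()
p_col <- ggplot(data = penguin_sum, mapping = aes(x = island, y = n)) +
  geom_col()

# Extract the bar heights from each plot and compare them
heights_bar <- as.numeric(layer_data(p_bar)$y)
heights_col <- as.numeric(layer_data(p_col)$y)
all.equal(heights_bar, heights_col)
```

If the comparison holds, the only difference between the two plots is where the counting happens: inside ggplot2 for geom_bar(), or ahead of time in your own code for geom_col().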