Modifying rows and columns
The dplyr package is a good resource for modifying rows and columns of a data frame.
Rows
To add rows, you can use bind_rows to attach data to the bottom of your data set. This is especially useful when you have a list of many data frames, for example after reading in multiple files of the same format that might be separated by county or age group.
library(dplyr)
new_data <- data.frame(Sepal.Length = c(3, 4, 5),
                       Sepal.Width = c(5, 4, 3),
                       Petal.Length = c(1, 2, 3),
                       Petal.Width = c(3, 2, 1),
                       Species = c("setosa", "versicolor", "virginica"))
bind_rows(iris, new_data)
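When the data frames are already collected in a list, bind_rows can take the list directly. The sketch below imitates the "many files" situation by splitting iris by species. The names pieces and combined, and the source column created through .id, are illustrative assumptions and not part of the original example.

```r
library(dplyr)

# Stand-in for several files read separately: one data frame per species
pieces <- split(iris, iris$Species)

# Stack every data frame in the list; .id records which list element each row came from
combined <- bind_rows(pieces, .id = "source")

nrow(combined)  # same number of rows as iris
```

Because the list is named, the source column holds the species name each row came from, which plays the role a file name or county label would in real data.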
To select a subset of rows by index, you can use bracket notation or the slice command.
iris[20:30, ]
slice(iris, 20:30)
To select a subset of rows by a condition, the filter command is useful. For example, we might want to look only at the biggest versicolor flowers, say any versicolor with a petal length greater than 4.5.
filter(iris, Species == "versicolor", Petal.Length > 4.5)
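As a rough illustration of how the conditions combine, the sketch below writes the same filter three ways. The variable names big_versicolor, with_and, and with_brackets are made up for this example.

```r
library(dplyr)

# Conditions separated by commas are combined with "and"
big_versicolor <- filter(iris, Species == "versicolor", Petal.Length > 4.5)

# The same rows, written with an explicit & and with bracket notation
with_and <- filter(iris, Species == "versicolor" & Petal.Length > 4.5)
with_brackets <- iris[iris$Species == "versicolor" & iris$Petal.Length > 4.5, ]

# All three approaches keep the same number of rows
nrow(big_versicolor) == nrow(with_brackets)
```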
Columns
To pick out columns by index or name, you can use bracket notation or the select command. Say we only want the species and petal size of iris, that is, we don’t care about sepal size. Any of these methods will work to obtain those columns.
iris[ ,3:5]
select(iris, 3:5)
select(iris, Petal.Length, Petal.Width, Species)
select(iris, -c(1,2))
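If the columns you want share a naming pattern, select can also match them by name. This is a small sketch using the starts_with helper that dplyr makes available inside select. The name petal_cols is only for illustration.

```r
library(dplyr)

# Keep every column whose name starts with "Petal", plus Species
petal_cols <- select(iris, starts_with("Petal"), Species)

names(petal_cols)
```

This avoids depending on column positions, which can shift if the data set gains or loses columns.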
To add a column from data you already have as a vector, you can use the typical $ notation. If we have a vector of planting dates called x, with one date per flower, that we want to add onto iris as a column, we could just assign this data to the column plant_date. That column didn’t exist before this step, but after this it will show up as part of the data set.
iris$plant_date <- x
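The vector x is not defined in the text, so the sketch below invents one purely for illustration: a run of consecutive dates, one per row. The start date and the copy named iris_dated are assumptions, and the copy is used only so the built-in iris stays unchanged.

```r
# Work on a copy so the built-in iris data stays unchanged
iris_dated <- iris

# Hypothetical planting dates: one per row, on consecutive days
x <- seq(as.Date("2023-04-01"), by = "day", length.out = nrow(iris_dated))

# Assigning to a column that does not exist yet creates it
iris_dated$plant_date <- x

head(iris_dated$plant_date)
```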
Sometimes you’ll want to create new data from your existing data. For example, if we’re interested in the ratio of petal length to width of each flower, we’ll want to divide Petal.Length by Petal.Width for every row of the dataset. We can add this with the mutate command and call this new column Petal.Ratio.
mutate(iris, Petal.Ratio = Petal.Length / Petal.Width)
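Note that mutate returns a new data frame and does not change iris itself, so the result needs to be assigned if you want to keep it. A minimal sketch, where the name iris_ratio is an assumption for this example:

```r
library(dplyr)

# mutate returns a new data frame; assign it to keep the new column
iris_ratio <- mutate(iris, Petal.Ratio = Petal.Length / Petal.Width)

"Petal.Ratio" %in% names(iris_ratio)  # TRUE
"Petal.Ratio" %in% names(iris)        # FALSE, the original is untouched
```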