In today’s Lab you will gain practice with the following concepts from Lecture 5:
- apply and map as loop alternatives
- mutate commands to manipulate data
- summarize commands to produce simple tabular summaries, and interpreting the results
library(tidyverse)
## ── Attaching packages ─────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1 ✔ purrr 0.3.3
## ✔ tibble 2.1.3 ✔ dplyr 0.8.3
## ✔ tidyr 1.0.0 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
Cars93 <- as_tibble(MASS::Cars93) # Pull Cars93 from MASS
Note: This question previously (accidentally) appeared on Lab 4. Feel free to skip it if you already succeeded on this question in the previous week.
(a) The nlevels command tells you the number of levels in a factor variable. Use this function in combination with summarize_if() to produce an integer vector showing the number of levels for each factor variable in the Cars93 data.
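As a hedged illustration of the pattern (not the solution), the sketch below applies nlevels to the factor columns of the built-in iris data. The choice of iris and the name factor_level_counts are assumptions made only for this example.

```r
library(tidyverse)

# iris is a stand-in data set; its only factor column is Species.
# summarize_if() applies nlevels() to each column for which is.factor() is TRUE.
factor_level_counts <- as_tibble(iris) %>%
  summarize_if(is.factor, nlevels)

factor_level_counts
```

The result is a one-row tibble with one column per factor variable; unlist() can turn it into a named integer vector if a plain vector is wanted.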
# Edit me
(b) levels() returns the possible levels of a factor variable. Use this function in combination with select and map to create a list of all the levels of the Manufacturer, AirBags, DriveTrain, and Man.trans.avail variables.
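The select-then-map pattern can be sketched on a different data set. The example below assumes the built-in warpbreaks data, whose wool and tension columns are factors; level_list is a hypothetical name.

```r
library(tidyverse)

# warpbreaks is a stand-in data set with two factor columns: wool and tension.
level_list <- warpbreaks %>%
  select(wool, tension) %>%  # keep only the factor columns of interest
  map(levels)                # call levels() on each column; returns a named list

level_list
```

Because a data frame is a list of columns, map() iterates over the selected columns and names each list element after its column.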
# Edit me
mutate() variants with Cars93
(a) Use the toupper() command in combination with mutate_if() to produce a new version of Cars93 where every factor variable has been converted to upper case.
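A minimal sketch of mutate_if() with toupper(), again using iris as an assumed stand-in data set (iris_upper is a hypothetical name):

```r
library(tidyverse)

# mutate_if() rewrites only the columns for which is.factor() is TRUE.
# toupper() returns a character vector, so Species is no longer a factor afterwards.
iris_upper <- as_tibble(iris) %>%
  mutate_if(is.factor, toupper)

unique(iris_upper$Species)
```

If the columns need to remain factors, the result of toupper() would have to be wrapped in factor() again.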
# Edit me
(b) Currently the price columns of the Cars93 data reflect prices in thousands of dollars. Use mutate_at to create a version of Cars93 where all prices are in dollars (e.g., what used to be a price of 12.9 should become 12900).
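For comparison, here is a sketch of a unit conversion with mutate_at() on a different data set. It assumes the built-in iris data, whose measurements are in centimetres, and converts only the two length columns to millimetres; iris_mm is a hypothetical name.

```r
library(tidyverse)

# vars() accepts selection helpers such as ends_with(), so several columns
# can be targeted at once; the formula ~ .x * 10 is applied to each of them.
iris_mm <- as_tibble(iris) %>%
  mutate_at(vars(ends_with("Length")), ~ .x * 10)

head(iris_mm, 3)
```

Columns that are not matched by the selection (here the two Width columns) are left unchanged.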
# Edit me
(c) Use mutate_if to normalize all of the numeric variables in the Cars93 data to have variance 1. Save the resulting mutated data in a variable called Cars93.norm. (Hint: this is equivalent to dividing each of the columns by the standard deviation of the given column.)
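The hint can be checked on a small example. The sketch below assumes the built-in iris data, which has no missing values, and uses the hypothetical name iris_norm.

```r
library(tidyverse)

# Divide every numeric column by its own standard deviation.
# Since var(x / c) = var(x) / c^2, dividing by sd(x) gives variance 1.
iris_norm <- as_tibble(iris) %>%
  mutate_if(is.numeric, ~ .x / sd(.x))

var(iris_norm$Sepal.Length)
var(iris_norm$Petal.Width)
```

Note that sd() returns NA for a column containing missing values unless na.rm = TRUE is supplied, so columns with NAs need that extra argument.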
# Edit me
To check that you’ve succeeded, you can confirm that the following lines of code all return the answer 1.
var(Cars93.norm$Min.Price)
var(Cars93.norm$Horsepower)
summarize() variants
Use summarize_if to calculate the standard deviation of every numeric column in the original Cars93 data. You’ll want to further specify na.rm = TRUE to ensure that you get a non-NA output value even for variables that have some missing (NA) observations.
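To see how the extra argument is passed along, here is a sketch on the built-in airquality data, an assumed stand-in whose Ozone and Solar.R columns contain NAs; airquality_sd is a hypothetical name.

```r
library(tidyverse)

# Arguments after the function name are forwarded to it, so this computes
# sd(column, na.rm = TRUE) for every numeric column.
airquality_sd <- as_tibble(airquality) %>%
  summarize_if(is.numeric, sd, na.rm = TRUE)

airquality_sd
```

Without na.rm = TRUE, the Ozone and Solar.R entries of the summary would be NA.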
# Edit me