library(tidyverse)
## ── Attaching packages ────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1 ✔ purrr 0.3.3
## ✔ tibble 2.1.3 ✔ dplyr 0.8.3
## ✔ tidyr 1.0.0 ✔ stringr 1.4.0
## ✔ readr 1.3.1 ✔ forcats 0.4.0
## ── Conflicts ───────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
We’ll begin by repeating the data processing steps from lecture.
# Load data from MASS into a tibble
birthwt <- as_tibble(MASS::birthwt)
# Rename variables
birthwt <- birthwt %>%
  rename(birthwt.below.2500 = low,
         mother.age = age,
         mother.weight = lwt,
         mother.smokes = smoke,
         previous.prem.labor = ptl,
         hypertension = ht,
         uterine.irr = ui,
         physician.visits = ftv,
         birthwt.grams = bwt)
# Change factor level names
birthwt <- birthwt %>%
  mutate(race = recode_factor(race, `1` = "white", `2` = "black", `3` = "other")) %>%
  mutate_at(c("mother.smokes", "hypertension", "uterine.irr", "birthwt.below.2500"),
            ~ recode_factor(.x, `0` = "no", `1` = "yes"))
(a) Create a summary table showing the average birthweight (rounded to the nearest gram) grouped by race, mother’s smoking status, and hypertension.
# Edit me
(b) How many rows are there in the summary table? Are all possible combinations of the three grouping variables shown? Explain.
Your answer here
(c) Repeat parts (a) and (b), this time adding the argument .drop = FALSE to your group_by() call. What happens?
# Edit me
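As a hint for part (c), here is a minimal sketch of how .drop = FALSE changes grouped summaries. It uses a small made-up tibble (toy, with hypothetical columns group, flag, and value) rather than the birthwt data, so the exercise above still needs to be worked out on its own.

```r
library(dplyr)

# Hypothetical toy data: the combination (group = "b", flag = "yes") never occurs
toy <- tibble(
  group = factor(c("a", "a", "b"), levels = c("a", "b")),
  flag  = factor(c("no", "yes", "no"), levels = c("no", "yes")),
  value = c(1, 2, 3)
)

# Default behavior: only combinations observed in the data get a row
observed_only <- toy %>%
  group_by(group, flag) %>%
  summarize(mean.value = mean(value))

# .drop = FALSE keeps every combination of factor levels, including empty ones
all_combos <- toy %>%
  group_by(group, flag, .drop = FALSE) %>%
  summarize(mean.value = mean(value))

nrow(observed_only)
nrow(all_combos)
```

In this sketch the second table has one more row than the first, and the summary for the empty group is computed from zero observations.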
(a) Construct a violin plot showing how the distribution of diamond prices varies by diamond cut.
# Edit me
(b) Use facet_grid with geom_histogram to construct 7 histograms showing the distribution of price within every category of diamond color.
# Edit me
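As a hint for part (b), the general pattern of combining geom_histogram with facet_grid can be sketched on a different dataset. The example below uses the mpg data that ships with ggplot2 and its hwy and drv columns. The choice of bins = 20 and a row-wise facet layout are assumptions made for illustration, not requirements of the exercise.

```r
library(ggplot2)

# mpg ships with ggplot2: hwy is highway miles per gallon, drv is the drive type
p <- ggplot(mpg, aes(x = hwy)) +
  geom_histogram(bins = 20) +   # bins = 20 is an arbitrary illustrative choice
  facet_grid(drv ~ .)           # one row of panels per level of drv

p
```

Writing the formula as . ~ drv instead would lay the panels out in columns rather than rows.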