(a) Write a function called calculateRowMeans that uses a for loop to calculate the row means of a matrix x.
# calculateRowMeans computes the row means of a matrix x
# input: matrix x
# output: vector of length nrow(x) giving row means of x
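calculateRowMeans <- function(x) {
  # Pre-allocate the result vector, one entry per row
  row.means <- numeric(nrow(x))
  # seq_len() handles a matrix with zero rows safely, unlike 1:nrow(x)
  for (i in seq_len(nrow(x))) {
    row.means[i] <- mean(x[i, ])
  }
  row.means
}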
(b) Try out your function on the random matrix fake.data defined below.
set.seed(12345) # Set seed of random number generator
fake.data <- matrix(runif(800), nrow=25)
calculateRowMeans(fake.data)
## [1] 0.5339087 0.6259388 0.4966049 0.5399315 0.5049318 0.5633372 0.4686503
## [8] 0.4196579 0.5273801 0.4639143 0.5472661 0.5043049 0.6169601 0.4690874
## [15] 0.4920191 0.5841288 0.6108891 0.4879246 0.5401770 0.5223512 0.5086669
## [22] 0.4643891 0.5250635 0.4791480 0.5795024
(c) Use the apply() function to calculate the row means of the matrix fake.data.
apply(fake.data, MARGIN=1, FUN=mean)
## [1] 0.5339087 0.6259388 0.4966049 0.5399315 0.5049318 0.5633372 0.4686503
## [8] 0.4196579 0.5273801 0.4639143 0.5472661 0.5043049 0.6169601 0.4690874
## [15] 0.4920191 0.5841288 0.6108891 0.4879246 0.5401770 0.5223512 0.5086669
## [22] 0.4643891 0.5250635 0.4791480 0.5795024
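The role of the MARGIN argument can be seen on a tiny matrix. The sketch below uses a hypothetical 2 x 3 matrix m (not part of the exercise data) to show that MARGIN = 1 applies the function to each row, while MARGIN = 2 applies it to each column.

```r
# Hypothetical 2 x 3 example matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# MARGIN = 1: one mean per row (a vector of length nrow(m))
row.result <- apply(m, MARGIN = 1, FUN = mean)
row.result

# MARGIN = 2: one mean per column (a vector of length ncol(m))
col.result <- apply(m, MARGIN = 2, FUN = mean)
col.result
```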
(d) Compare this to the output of the rowMeans() function to check that your calculation is correct.
# all.equal() allows for tiny floating-point differences between the two methods
all.equal(calculateRowMeans(fake.data), rowMeans(fake.data))
## [1] TRUE
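When comparing numeric results computed in different ways, a tolerance-based check is usually safer than an exact one, because two mathematically equal results can differ in their last few bits. The sketch below illustrates this with a hypothetical small matrix check.matrix; the names here are illustrative and not part of the exercise.

```r
# Exact comparison can fail on values that are equal only up to rounding
x <- 0.1 + 0.2
exact.match <- identical(x, 0.3)
near.match <- isTRUE(all.equal(x, 0.3))
exact.match
near.match

# The same tolerance-based check applied to row means
# (check.matrix is a hypothetical 3 x 4 example)
set.seed(1)
check.matrix <- matrix(runif(12), nrow = 3)
means.agree <- isTRUE(all.equal(apply(check.matrix, MARGIN = 1, FUN = mean),
                                rowMeans(check.matrix)))
means.agree
```

Wrapping all.equal() in isTRUE() matters because all.equal() returns a character description of the difference, rather than FALSE, when the values do not match.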
(a) Use the group_by() and summarize() commands on the Cars93 data set to create a table showing the average Turn.circle of cars, broken down by vehicle Type and DriveTrain.
Cars93 %>%
  group_by(Type, DriveTrain) %>%
  summarize(mean(Turn.circle))
## # A tibble: 14 x 3
## # Groups:   Type [6]
##    Type    DriveTrain `mean(Turn.circle)`
##    <fct>   <fct>                    <dbl>
##  1 Compact 4WD                       37
##  2 Compact Front                     38.8
##  3 Compact Rear                      35.5
##  4 Large   Front                     42
##  5 Large   Rear                      43.8
##  6 Midsize Front                     40.5
##  7 Midsize Rear                      39
##  8 Small   4WD                       33.5
##  9 Small   Front                     35.3
## 10 Sporty  4WD                       39.5
## 11 Sporty  Front                     37
## 12 Sporty  Rear                      41.2
## 13 Van     4WD                       41.8
## 14 Van     Front                     41.8
(b) Are all combinations of Type and DriveTrain shown in the table? If not, which ones are missing? Why are they missing?
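No. With 6 vehicle types and 3 drive trains there are 18 possible combinations, but only 14 appear. The missing ones are Large 4WD, Midsize 4WD, Small Rear, and Van Rear. They are missing because the data set contains no vehicles in those categories. For example: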
sum(Cars93$Type == "Large" & Cars93$DriveTrain == "4WD")
## [1] 0
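Rather than checking one combination at a time, all of the counts can be tabulated at once. The sketch below assumes Cars93 is the data set shipped with the MASS package; the names combo.counts and missing.combos are introduced here for illustration.

```r
# Load only the data set, without attaching the rest of MASS
data(Cars93, package = "MASS")

# Cross-tabulate the two factors; cells equal to 0 are the missing combinations
combo.counts <- table(Cars93$Type, Cars93$DriveTrain)
combo.counts

# List the empty cells as Type/DriveTrain pairs
missing.combos <- as.data.frame(combo.counts)
names(missing.combos) <- c("Type", "DriveTrain", "Count")
missing.combos <- missing.combos[missing.combos$Count == 0, ]
missing.combos
```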
(c) Add the argument .drop = FALSE to your group_by command, and then re-run your code. What happens now?
Cars93 %>%
  group_by(Type, DriveTrain, .drop = FALSE) %>%
  summarize(mean(Turn.circle))
## # A tibble: 18 x 3
## # Groups:   Type [6]
##    Type    DriveTrain `mean(Turn.circle)`
##    <fct>   <fct>                    <dbl>
##  1 Compact 4WD                       37
##  2 Compact Front                     38.8
##  3 Compact Rear                      35.5
##  4 Large   4WD                      NaN
##  5 Large   Front                     42
##  6 Large   Rear                      43.8
##  7 Midsize 4WD                      NaN
##  8 Midsize Front                     40.5
##  9 Midsize Rear                      39
## 10 Small   4WD                       33.5
## 11 Small   Front                     35.3
## 12 Small   Rear                     NaN
## 13 Sporty  4WD                       39.5
## 14 Sporty  Front                     37
## 15 Sporty  Rear                      41.2
## 16 Van     4WD                       41.8
## 17 Van     Front                     41.8
## 18 Van     Rear                     NaN
The .drop argument, which is set to TRUE by default, controls whether variable combinations that never appear together are dropped. When we set .drop = FALSE, the combinations with 0 counts still appear in the table, but the summary shows NaN (not a number) in those cells, because the mean of zero observations is undefined.
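The NaN entries follow directly from how mean() treats an empty input. A minimal base R sketch, using a hypothetical zero-length vector empty.group to stand in for the Turn.circle values of a group with no cars:

```r
# A group with no rows contributes a zero-length vector of values
empty.group <- numeric(0)

# The mean of no values is undefined, so R returns NaN
empty.mean <- mean(empty.group)
empty.mean
is.nan(empty.mean)

# A count, unlike a mean, is well defined for an empty group
empty.count <- length(empty.group)
empty.count
```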
(d) Having a car with a small turn radius makes city driving much easier. What Type of car should city drivers opt for?
Small cars have the lowest average turn circles (33.5 for 4WD and 35.3 for front-wheel drive), so city drivers should opt for a Small car.
(e) Does the vehicle’s DriveTrain appear to have an impact on turn radius?
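There is no consistent association. For example, rear-wheel drive has the smallest average turn circle among Compact cars (35.5) but the largest among Sporty cars (41.2).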
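One way to look at each variable on its own is to average Turn.circle over a single factor at a time. The sketch below is one possible check, assuming Cars93 comes from the MASS package; by.type and by.drive are names introduced here. Because these marginal averages pool groups of different sizes, they can tell a different story from the two-way table, so they complement rather than replace it.

```r
# Load only the data set, without attaching the rest of MASS
data(Cars93, package = "MASS")

# Average turn circle for each vehicle Type, ignoring DriveTrain
by.type <- tapply(Cars93$Turn.circle, Cars93$Type, mean)
sort(by.type)

# Average turn circle for each DriveTrain, ignoring Type
by.drive <- tapply(Cars93$Turn.circle, Cars93$DriveTrain, mean)
sort(by.drive)
```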