    weight Time Chick Diet
159     79    6    14    1
126    168   12    11    1
269     40    0    25    2
215     77   12    20    1
252    135   14    23    2
10     171   18     1    1
Consider the ChickWeight dataset.
Mean of chicks' weight by diet.
Mean and standard deviation of chicks' weight by diet.
# A tibble: 4 × 5
  Diet     Mn StDev    q1    q3
  <fct> <dbl> <dbl> <dbl> <dbl>
1 1      103.  56.7  57.8  136.
2 2      123.  71.6  65.5  163
3 3      143.  86.5  67.5  199.
4 4      135.  68.8  71.2  185.
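A summary table of this shape can be produced along the following lines. This is a sketch, not necessarily the code behind the output above: it assumes the dplyr package is installed, and it assumes that q1 and q3 are the first and third quartiles as computed by R's default quantile method.

```r
library(dplyr)

# ChickWeight ships with R; as_tibble() only changes how it prints.
chicks <- as_tibble(ChickWeight)

diet_summary <- chicks %>%
  group_by(Diet) %>%                              # one group per diet
  summarise(
    Mn    = mean(weight),                         # mean weight
    StDev = sd(weight),                           # standard deviation
    q1    = quantile(weight, 0.25, names = FALSE), # assumed: first quartile
    q3    = quantile(weight, 0.75, names = FALSE)  # assumed: third quartile
  )

diet_summary
```

Each call to summarise here returns one row per diet, with one column per statistic requested.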
So much of statistics relies on three ideas/functions and one sentence enhancer.
%>% makes code read (almost) like English.
The function group_by is incredibly helpful, but not that exciting.
If we group the ChickWeight dataset by Diet, things change only slightly. But what group_by returns is now ready to be passed into summarise.
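A minimal sketch of both points, assuming dplyr is installed: the pipe is just another way to write a function call, and grouping leaves the rows and columns untouched while recording which variable defines the groups. The object names nested and piped are illustrative only.

```r
library(dplyr)

chicks <- as_tibble(ChickWeight)

# Two spellings of the same operation: nested call versus pipe.
nested <- group_by(chicks, Diet)
piped  <- chicks %>% group_by(Diet)

# Grouping changes nothing about the rows and columns ...
dim(chicks)
dim(piped)

# ... but the result now remembers how it is grouped.
group_vars(piped)
n_groups(piped)
```

Printing piped also shows a "Groups:" line in the header, which is the only visible difference from the ungrouped tibble.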
The function summarise collapses multiple observations down into one number, for instance into a summary statistic. As we saw before, we can summarize multiple variables at once.
What does the following code do?
We can also summarize multiple variables at once, by group or not.
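To illustrate "by group or not", here is a small sketch assuming dplyr is installed. The output column names mean_weight and max_time are made up for this example.

```r
library(dplyr)

chicks <- as_tibble(ChickWeight)

# Not grouped: the whole dataset collapses to a single row.
overall <- chicks %>%
  summarise(mean_weight = mean(weight), max_time = max(Time))

# Grouped: the same summaries, but one row per diet.
by_diet <- chicks %>%
  group_by(Diet) %>%
  summarise(mean_weight = mean(weight), max_time = max(Time))

overall
by_diet
```

The summarise call is identical in both cases; only the presence of group_by decides how many rows come back.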
The function mutate allows us to create new variables and add them to the data frame. Recall our summarized data named
We can create a new variable (column)
In case our new variable isn’t automatically printed, remember we can make R print things for us.
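As a sketch of both steps, assuming dplyr is installed: the name diet_summary and the new column CV (standard deviation divided by mean) are illustrative choices, not necessarily the ones used originally.

```r
library(dplyr)

# Rebuild a small summary table so this example stands alone.
diet_summary <- as_tibble(ChickWeight) %>%
  group_by(Diet) %>%
  summarise(Mn = mean(weight), StDev = sd(weight))

# mutate() adds a column; existing columns are kept.
diet_summary <- diet_summary %>%
  mutate(CV = StDev / Mn)   # hypothetical new variable

# An assignment prints nothing, so ask R to print explicitly.
print(diet_summary)
```

Typing the object's name on its own line, or wrapping the whole assignment in parentheses, prints the result as well.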
mutate works on any data frame. For instance, you might have two variables that are obviously better off as a ratio.
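For example, using the built-in mtcars data frame purely as a stand-in (it is not part of the material above), horsepower and weight can be combined into a single ratio. The column name hp_per_wt is made up for this sketch, which assumes dplyr is installed.

```r
library(dplyr)

# mtcars is a plain data.frame that ships with R.
cars_ratio <- mtcars %>%
  mutate(hp_per_wt = hp / wt)   # horsepower per unit of weight

# Look at the new column next to the two it was built from.
cars_ratio %>%
  select(hp, wt, hp_per_wt) %>%
  head()
```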
Summary statistics and a lesson about missing values in R.
# A tibble: 8 × 4
  Family         Mn Total    Sm
  <fct>       <dbl> <int> <dbl>
1 Ailuridae    1.5      1   1.5
2 Canidae      4.43    18  79.8
3 Felidae      2.69    19  51.2
4 Hyaenidae    2.4      4   9.6
5 Mustelidae   3.65    30 110.
6 Procyonidae  3.08     4  12.3
7 Ursidae      2.1      4   8.4
8 Viverridae  NA       32  83.1
We removed NAs from the mean calculation and made the code read (almost) like English.
# A tibble: 8 × 4
  Family         Mn Total    Sm
  <fct>       <dbl> <int> <dbl>
1 Ailuridae    1.5      1   1.5
2 Canidae      4.43    18  79.8
3 Felidae      2.69    19  51.2
4 Hyaenidae    2.4      4   9.6
5 Mustelidae   3.65    30 110.
6 Procyonidae  3.08     4  12.3
7 Ursidae      2.1      4   8.4
8 Viverridae   2.77    32  83.1
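The difference between the two tables comes down to how mean treats missing values: a single NA in a group makes that group's mean NA unless na.rm = TRUE is supplied. The sketch below uses a tiny invented data frame (the family labels, the size column, and its values are all made up) rather than the data behind the tables, and assumes dplyr is installed.

```r
library(dplyr)

# Toy data invented for illustration; family "B" has one missing value.
litters <- tibble(
  Family = c("A", "A", "B", "B", "B"),
  size   = c(2, 4, 3, NA, 5)
)

# Default behaviour: any NA in a group makes that group's mean NA.
with_na <- litters %>%
  group_by(Family) %>%
  summarise(Mn = mean(size), Total = n())

# na.rm = TRUE drops the missing values before averaging.
without_na <- litters %>%
  group_by(Family) %>%
  summarise(Mn = mean(size, na.rm = TRUE), Total = n())

with_na
without_na
```

Note that n() counts rows, so Total stays the same in both versions; only the mean changes.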
Maybe relative brain size matters to you, i.e. the heaviest brain relative to body weight.
Then throw in some summarise and find which family has the greatest mean brain-to-body-weight ratio.
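One way this could look, combining mutate, group_by, and summarise. The data frame below is invented for illustration, and the column names brain and body are assumptions, so substitute the real variable names from your data. The sketch assumes dplyr is installed.

```r
library(dplyr)

# Toy data invented for illustration; column names are assumptions.
animals <- tibble(
  Family = c("A", "A", "B", "B"),
  brain  = c(10, 30, 50, 70),        # brain weight
  body   = c(100, 100, 1000, 1000)   # body weight, same units as brain
)

ratio_by_family <- animals %>%
  mutate(ratio = brain / body) %>%                      # new variable per animal
  group_by(Family) %>%                                  # one group per family
  summarise(mean_ratio = mean(ratio, na.rm = TRUE)) %>% # collapse to one row each
  arrange(desc(mean_ratio))                             # largest ratio first

# The first row is the family with the greatest mean ratio.
top_family <- ratio_by_family %>% slice(1)
top_family
```

Sorting with arrange and taking the first row is one design choice among several; filtering for the maximum would work as well and would keep ties.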