Lab 07: ANOVA

Use the dataset penguins. Download the data and save the CSV file into your Lab07 directory (so that it is submitted when you upload your Lab07 folder). Load the dataset into R.

Read the help file about penguins, to better understand the data set.
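As a minimal sketch of the loading step, assuming the downloaded file is called penguins.csv (a hypothetical name; substitute the name of your own file) and sits in the same directory as your qmd document:

```r
# Hypothetical file name; replace it with the name of the CSV you downloaded.
csv_path <- "penguins.csv"

if (file.exists(csv_path)) {
  # stringsAsFactors = TRUE stores text columns as factors,
  # which is convenient for categorical variables.
  penguins <- read.csv(csv_path, stringsAsFactors = TRUE)
  str(penguins)  # column names and types
} else {
  message("File not found: ", csv_path)
}

# If your copy of the data comes from the palmerpenguins package (an assumption),
# the help page can be opened with:
#   library(palmerpenguins)
#   ?penguins
```

A relative path like this keeps the document reproducible for anyone who opens your Lab07 folder.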

Please submit a folder containing your Quarto input (qmd) and output (html) documents for this lab, named Lab07, to our shared Google Drive folder when you are finished.

Choose a numerical variable and an appropriate categorical variable on which to perform ANOVA.

  1. Make a box plot of your numerical variable across the levels of your categorical variable.

  2. State one interesting thing about your plot in the context of the data.

  3. Make a scatter plot of your numerical variable across the levels of your categorical variable; use geom_point() instead of geom_boxplot(). State one thing about this plot that you don’t like.

  4. Make another scatter plot of your numerical variable across the levels of your categorical variable, this time replacing geom_point() with

geom_jitter(alpha = 0.25) +
    stat_summary(fun.data = "mean_se",
                 fun.args = list(mult = 2),
                 aes(color = YOUR_CATEGORICAL_VAR))

where you replace YOUR_CATEGORICAL_VAR with the name of your chosen categorical variable.
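To show how the layers in parts 1, 3, and 4 fit together, here is a minimal sketch on made-up data. The data frame toy and its columns group and value are hypothetical stand-ins, not part of the penguins data; swap in your own data frame and variable names.

```r
library(ggplot2)

# Made-up data, used only so the sketch runs on its own.
set.seed(1)
toy <- data.frame(
  group = rep(c("a", "b", "c"), each = 20),
  value = c(rnorm(20, mean = 10), rnorm(20, mean = 12), rnorm(20, mean = 11))
)

# Shared base: categorical variable on x, numerical variable on y.
base <- ggplot(toy, aes(x = group, y = value))

box_plot   <- base + geom_boxplot()   # part 1
point_plot <- base + geom_point()     # part 3
jitter_plot <- base +                 # part 4
  geom_jitter(alpha = 0.25) +
  stat_summary(fun.data = "mean_se",
               fun.args = list(mult = 2),
               aes(color = group))

print(jitter_plot)
```

Building each plot from the same base object makes it clear that only the geom layer changes between parts.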

  5. Do your best to explain what this plot represents.

  6. State one interesting thing about this plot that stands out, but is less easy to see with your box plot from above.

  7. Perform a hypothesis test of these data, using a level of significance of your choice. Please state your hypotheses.

  8. Based on your test, which hypothesis is more likely given the data? State your conclusion as either “reject H0” or “fail to reject H0”, and explain why this conclusion is appropriate.

  9. Interpret the conclusion of your hypothesis test in the context of the data.

  10. State more clearly what your hypothesis test suggests about the data. Try to relate it back to your plot in part 4.
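One possible way to carry out the test is a one-way ANOVA with aov(), where H0 says all group means are equal and HA says at least one group mean differs. The sketch below uses made-up data; toy, group, value, and the significance level 0.05 are all hypothetical choices for illustration, so substitute your own.

```r
# Made-up data, used only so the sketch runs on its own.
set.seed(1)
toy <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 20)),
  value = c(rnorm(20, mean = 10), rnorm(20, mean = 12), rnorm(20, mean = 11))
)

# One-way ANOVA: numerical response modeled by the categorical variable.
fit <- aov(value ~ group, data = toy)
print(summary(fit))

# Extract the p-value for the group term from the ANOVA table.
p_value <- summary(fit)[[1]][["Pr(>F)"]][1]

alpha <- 0.05  # illustrative level of significance
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
print(decision)
```

The decision string only compares the p-value with the chosen level; the interpretation in context is still yours to write.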