MATH 456 Homework 05

Due 2026-03-10 by 11:59pm

  1. Download the following dataset into your Homework 04 repository: penguins. Here's the metadata. Please push this dataset along with your qmd file for Homework 05. I don't need the output file nor any of the output file's dependencies.

    In an attempt to predict a penguin's body mass, body_mass_g, use the explanatory variables bill_length_mm, flipper_length_mm, island, sex, and species.

  2. Use dplyr's functions select and mutate to select only the variables of interest and throw away any rows containing NAs.

  3. Use ggplot2 to make an appropriate scatter plot. Color the points using a categorical/qualitative variable.

  4. Fit a model with unique intercepts by sex and a shared slope across bill_length_mm.

  5. Interpret the slope of the model above in context of the data.

  6. Interpret adjusted $R^2$ in context of the data.

  7. Fit a model with unique intercepts by species and unique slopes across bill_length_mm by species.

  8. Set up and conclude a hypothesis test for the term speciesChinstrap:bill_length_mm using a level of significance of .

  9. Interpret the conclusion of your hypothesis test in context of the data.

  10. Interpret the coefficient for the term speciesChinstrap:bill_length_mm in context of the data.

  11. Interpret the coefficient for the term speciesGentoo in context of the data.

  12. Interpret adjusted $R^2$ in context of the data.

  13. Looking at the (many) p-values for all the coefficients and the adjusted $R^2$ values for both models above, what can we say about the relationship between adjusted $R^2$ and p-values in general?

  14. Using only the variables mentioned above, fit a model with the highest adjusted $R^2$ you can.
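
For reference, one common definition of adjusted $R^2$ for a linear model is sketched below. The symbols are notation assumed here, not taken from this handout: $n$ is the number of rows used in the fit and $p$ is the number of estimated coefficients, not counting the intercept.

$$
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}
$$

Unlike $R^2$, this quantity can decrease when a term is added to the model.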