MATH 456 Homework 04

Due 2026-03-05 by 11:59pm

  1. Download the following dataset into your Homework 04 repository: carnivora. Here's the metadata. Please push this dataset along with your qmd file for Homework 04. I don't need the output file or any of its dependencies.

We'll focus on the variables SW and SB.

  1. Use dplyr's functions select and mutate to select only the variables of interest, throw away any rows containing NAs, and rename the variables to something more meaningful.

  2. Use ggplot2 to make a scatter plot with SW on the x-axis and SB on the y-axis. Describe this plot as it is, and speak about the reasonableness of fitting a line through these data.

  3. Make a new plot with both axes on the log10 scale. Describe this plot and speak about the reasonableness of fitting a line through these data.

  4. Fit a linear model that appropriately matches the plot above.

  5. Make a new plot with both axes on their original scale and put on the plot the fitted curve from the model above.

  6. Make a histogram of the standardized residuals.

  7. Make a scatter plot of the standardized residuals on the y-axis and the predicted values, what we've called $\hat{y}$, on the x-axis.

  8. How well does this model fit the assumptions of linear models? Explain.

  9. Fit a linear model that uses a quadratic function of SW.

  10. Make a new plot with both axes on their original scale and put on the plot the fitted curve from the model above.

  11. Make a histogram of the standardized residuals.

  12. Make a scatter plot of the standardized residuals on the y-axis and the predicted values, what we've called $\hat{y}$, on the x-axis.

  13. How well does this model fit the assumptions of linear models? Explain.

  14. Compare the mean squared errors for both models above. Be sure to make your comparison meaningful; pay careful attention to the units.

  15. Which model seems to predict the brain weight of animals from the order Carnivora better? Explain.
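
As a reminder of the algebra behind items 5 and 14, here is a minimal sketch under assumed notation. The symbols $b_0$ and $b_1$ (fitted intercept and slope of the log10-log10 model) and $n$ (number of rows kept after dropping NAs) are placeholders chosen for illustration and may differ from the notation used in class. Dividing by $n$ is also an assumption; a degrees-of-freedom divisor would work the same way as long as both models are treated consistently.

```latex
% Back-transforming the log10-log10 fit to the original scale
\log_{10}\bigl(\widehat{SB}\bigr) = b_0 + b_1 \log_{10}(SW)
\quad\Longrightarrow\quad
\widehat{SB} = 10^{b_0}\, SW^{\,b_1}

% Mean squared error measured in the original units of SB
\mathrm{MSE}_{\text{original scale}}
  = \frac{1}{n} \sum_{i=1}^{n} \bigl( SB_i - \widehat{SB}_i \bigr)^2
```

Residuals taken directly from the log10 model are in log10 units, so its mean squared error is only comparable with the quadratic model's once both are expressed on the same scale.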