MATH 315 Homework 10
Due 2025-11-19 by 11:59pm
-
Load the penguins dataset. Our goal is to predict body_mass_g (grams) using both species and bill_length_mm (mm). Assume a level of significance of 0.05.
a. Perform step 1 of any regression analysis, using ggplot2. Please color the points by species and draw the linear regression lines over the plot.
b. Produce the code to fit the linear regression model to predict body_mass_g using both species and bill_length_mm that gives unique intercepts and slopes to each level of species.
c. Make a data frame that stores the standardized residuals and fitted values from this model.
d. Make a ggplot2 scatter plot of the standardized residuals (y-axis) against the fitted values (x-axis). What assumptions of linear regression does this plot help us check? Do the assumptions seem reasonably met? Why or why not?
e. Make a ggplot2 histogram of the standardized residuals. What assumption of linear regression does this help us check? Does the assumption seem reasonably met? Why or why not?
f. Are there any potential outliers that we need to be concerned with? Explain.
Theoretically, if the assumptions of linear regression aren't satisfactorily met, you'd adjust your model and try again.
g. Use adjusted R-squared to determine if including species as an explanatory variable improves the overall model fit, as compared to not including species. Report two adjusted R-squared numbers to justify your conclusion.
h. Calculate the unique intercepts for each level of species.
i. Interpret an intercept for a level of species in context of the data. Does this intercept value make logical sense?
j. Calculate the unique slopes for each level of species.
k. Interpret a slope for a level of species in context of the data. Please be specific about which level of species this slope is referring to.
l. Using the p-values for the slope and offset terms, what can you say about the differences in body_mass_g between the species?