MATH 315 Homework 11
Due 2025-12-08 by 11:59pm
Load the dataset about penguins.
-
Goal: use multiple linear regression to predict body weight using any combination of the other variables you want (except body_mass_g).
a. Perform step 1 of any regression analysis, using ggplot2. Your plot does not have to match your final model exactly, but it should account for one of the (potentially many) numerical explanatory variables and one of the (potentially many) categorical explanatory variables.
b. Fit a multiple linear regression using whichever explanatory variables you want (other than body_mass_g). You should include at least two numeric variables. If you are feeling competitive, your goal is to find an adjusted R^2 greater than 0.85.
c. Interpret the adjusted R^2 of your model in context of the data.
d. Interpret an intercept in context of the data. Be specific.
e. Does the interpreted intercept make sense? Why or why not?
f. Interpret a slope in context of the data. Be specific.
g. Calculate a prediction of body weight from your model.
h. Interpret the prediction in context of the data. Be specific.
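The mechanics of fitting the model, reading off adjusted R^2, and computing a prediction can be sketched in base R. This is only an illustration under stated assumptions: the data frame toy and its columns x1, x2, grp, and y are simulated placeholders (hypothetical names, not the penguins data), so the numbers it produces say nothing about the assignment.

```r
# Simulated placeholder data (hypothetical; NOT the penguins dataset).
set.seed(315)
n <- 120
toy <- data.frame(
  x1  = rnorm(n, mean = 50, sd = 5),
  x2  = rnorm(n, mean = 200, sd = 10),
  grp = factor(sample(c("A", "B", "C"), n, replace = TRUE))
)
toy$y <- 100 + 20 * toy$x1 + 15 * toy$x2 + 300 * (toy$grp == "B") +
  rnorm(n, sd = 150)

# Multiple linear regression with two numeric and one categorical variable.
fit <- lm(y ~ x1 + x2 + grp, data = toy)

# Adjusted R^2 is stored in the model summary.
adj_r2 <- summary(fit)$adj.r.squared

# Prediction for one new observation; the factor level must be one
# that appeared in the data used to fit the model.
new_obs <- data.frame(
  x1  = 52,
  x2  = 195,
  grp = factor("B", levels = levels(toy$grp))
)
pred <- predict(fit, newdata = new_obs)

print(coef(fit))  # intercept, two slopes, and two group offsets
print(adj_r2)
print(pred)
```

With a categorical variable in the model, the intercept describes the baseline level of that variable (here level "A") when every numeric variable equals 0, which is worth keeping in mind when judging whether an interpreted intercept makes sense.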
-
Goal: use logistic regression to predict the penguin sex using any combination of the other variables you want (other than sex).
a. Create a new column in the dataset named sexb that stores 1 for females and 0 otherwise.
b. Perform step 1 of any regression analysis, using ggplot2. Try to use color and geom_jitter(width = w, height = h) where w and h are some values you choose to make the plot look better.
c. Fit logistic regression using whichever explanatory variables you want (other than sex and sexb) to predict the variable sexb. You should include at least two numeric variables.
d. Insert the following code into a new code chunk. Then call this function on the variable you created from glm, e.g. if you defined fit <- glm(...), then in a new code chunk call aR2(fit).

    aR2 <- function(fitl) {
      llf <- logLik(fitl)                     # log-likelihood of the fitted model
      llnull <- logLik(update(fitl, . ~ 1))   # log-likelihood of the intercept-only model
      lr <- as.numeric(llf - llnull)
      y <- fitl$y
      n <- length(y)
      p <- mean(y)                            # proportion of 1s in the response
      m <- 3 * n * p * (1 - p)
      k <- length(fitl$coefficients)          # number of estimated coefficients
      return(1 - exp(-(lr - k) / m))
    }

e. Interpret the number you got from part d. as you would adjusted R^2 in context of this model. There's some disagreement on how to best calculate something like adjusted R^2 for logistic regression. After some searching around I found a recommended calculation from someone I trust. So I wrote up the recommendation into the function above.
f. Calculate and interpret, in context of the data, a slope for one of your numeric variables.
g. Calculate and interpret, in context of the data, two predicted probabilities from your model. You should have one probability greater than 0.5, thus (probably) predicting a female, and one probability less than 0.5, thus (probably) predicting a male.
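The steps of creating a 0/1 column, fitting with glm, reading a slope, and computing predicted probabilities can be sketched in base R. As an illustration only, the data frame toy and its columns x1 and x2 are simulated placeholders (hypothetical names, not the penguins data); only the column names sex and sexb follow the assignment.

```r
# Simulated placeholder data (hypothetical; NOT the penguins dataset).
set.seed(315)
n <- 200
sex <- sample(c("female", "male"), n, replace = TRUE)
x1 <- rnorm(n, mean = ifelse(sex == "female", 45, 50), sd = 4)
x2 <- rnorm(n, mean = ifelse(sex == "female", 190, 200), sd = 10)
toy <- data.frame(sex = factor(sex), x1 = x1, x2 = x2)

# 1 for females, 0 otherwise. If sex were missing (NA) in a row,
# this comparison would give NA for that row rather than 0.
toy$sexb <- as.integer(toy$sex == "female")

# Logistic regression: family = binomial models the log-odds of sexb == 1.
fit <- glm(sexb ~ x1 + x2, data = toy, family = binomial)

# A slope is on the log-odds scale; exp() turns it into an odds ratio
# for a one-unit increase in x1, holding x2 fixed.
slope_x1 <- coef(fit)["x1"]
odds_ratio_x1 <- exp(slope_x1)

# type = "response" returns probabilities instead of log-odds.
new_obs <- data.frame(x1 = c(42, 53), x2 = c(185, 205))
probs <- predict(fit, newdata = new_obs, type = "response")

print(slope_x1)
print(odds_ratio_x1)
print(probs)
```

Without type = "response", predict returns values on the log-odds scale, which can be converted to probabilities with plogis().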