Due: 2020-05-15 by 11:59pm

Use the dataset possum, which records various measurements on 104 opossums. For all models below, use totalL as the numeric response variable, headL as the numeric explanatory variable, and pop as the categorical explanatory variable.

Please read each question carefully.

  1. For half points on this problem, make an appropriate plot of your numeric response variable across all levels of the categorical explanatory variable. Your plot should include all the data and some graphical summary of the data that doesn’t hide any of your data points. For full points on this problem, use the function labs() to better label the axes, including correct units, and add one extra thing to this plot – it could be making the points transparent, adding more summary statistics to the plot, an effective use of color, or some other meaningful addition.

  2. Write two complete English sentences about the plot from part 1, in the context of the data, using statistical keywords from our class. Points for this question are based on the validity, informativeness, and accuracy of your statistical statements.

  3. Fit a multiple linear regression model with unique intercepts and unique slopes across head length by levels of the variable pop. Present the R code to fit this model.

  4. Make an appropriately matching plot for this model. Add labels to the plot’s axes, including the correct units.

  5. Interpret, in the context of the data, the intercept for level vic.

  6. Does a proper interpretation of the intercept qualify as extrapolation? Explain.

  7. Does the estimated intercept make sense? Explain.

  8. Interpret, in the context of the data, the slope for level other.

  9. For half points on this problem, calculate a 90% bootstrap confidence interval for a slope, not a slope offset. For full points on this problem, the slope must be for the level vic. Hint for full points: use the element t of your returned boot object to add the appropriate bootstrap resampled coefficients together to get bootstrap resampled slopes for level vic. Then calculate the appropriate percentiles for your confidence interval.

  10. Make a density plot of your bootstrap resampled slopes from part 9.

  11. What is the name of the distribution you plotted in part 10?

  12. Describe, using statistical keywords, the shape of the distribution in part 10.

  13. If your sample size were to decrease, what would happen to the shape of the distribution in part 10? If your sample size were to increase, what would happen to the shape of the distribution in part 10? Explain.

  14. Explain the theoretical meaning behind the 90% confidence from your 90% confidence interval in part 9.

  15. Calculate two bootstrap confidence intervals for a prediction at some value of your choice along the x-axis: one confidence interval for each of the two levels of your categorical explanatory variable.

  16. Interpret one of your confidence intervals in the context of the data.

  17. For half points on this problem, compare your confidence intervals and make an informed statement about the predicted total length of an opossum depending on its head length and the population it comes from. For full points on this problem, make one ggplot plot that displays density plots for both predictions, where each density plot is colored and appropriately labeled. Hint: it’s best to consider the structure of the dataframe needed to make this plot. The ggplot() syntax should follow easily once the dataframe is set up correctly. With this in mind, note that c(c(1,3,5),c(2,4,6)) == c(1,3,5,2,4,6).
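
The hint in part 17 can be illustrated with a minimal sketch in base R. It uses made-up numbers rather than possum results, and the names draws_a, draws_b, values, and stacked are hypothetical, chosen only for this illustration. It shows one possible way to stack two vectors into a single dataframe with a label column, not the required approach.

```r
# Toy vectors standing in for two sets of bootstrap resampled predictions.
# The names and numbers are made up for illustration; they are not possum results.
draws_a <- c(1, 3, 5)
draws_b <- c(2, 4, 6)

# c() flattens its arguments into one vector, as noted in the hint.
values <- c(draws_a, draws_b)

# One row per resampled value, plus a column recording which group it came from.
stacked <- data.frame(
  value = values,
  group = rep(c("a", "b"), times = c(length(draws_a), length(draws_b)))
)

print(stacked)
```

A dataframe in this long shape has one column that could be mapped to the x-axis and one column that could be mapped to color, which is the structure the hint points toward.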