MATH 314 Homework 13
Due 2026-05-05 by 11:59pm
Download the following dataset into your Homework 13 repository: breast cancer. Here's the associated metadata.
Please push this dataset along with your ipynb file for Homework 13.
-
Fit a logistic regression model to predict whether or not a tumor is malignant (diagnosis == "M"). If you are up for a challenge, I was able to find a model with an accuracy/true positive rate of 93.5%, but the metadata says they found a model (which I could not recreate) that achieved 97.5% accuracy. If you are not up for a challenge, pick one numerical explanatory variable and use it.
-
Plot the model along one numerical explanatory variable.
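
As a rough illustration of the fit-then-plot workflow, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the data are synthetic (not the breast cancer dataset), the names `fit_logistic` and `accuracy_and_tpr` are hypothetical helpers, and the from-scratch Newton's method fit simply stands in for whichever modeling library you use in your notebook. The `grid` and `curve` lists hold the points a plot of the model along one predictor would draw.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # entries of the 2x2 information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:               # nearly singular: stop updating
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

def accuracy_and_tpr(y, probs, threshold=0.5):
    """In-sample accuracy and true positive rate at a probability threshold."""
    pred = [1 if p >= threshold else 0 for p in probs]
    correct = sum(1 for yi, pi in zip(y, pred) if yi == pi)
    true_pos = sum(1 for yi, pi in zip(y, pred) if yi == 1 and pi == 1)
    return correct / len(y), true_pos / sum(y)

# Synthetic stand-in data: y = 1 plays the role of diagnosis == "M".
rng = random.Random(314)
x = [rng.gauss(14.0, 3.5) for _ in range(400)]
y = [1 if rng.random() < sigmoid(-7.0 + 0.5 * xi) else 0 for xi in x]

b0, b1 = fit_logistic(x, y)
probs = [sigmoid(b0 + b1 * xi) for xi in x]
accuracy, tpr = accuracy_and_tpr(y, probs)

# Points on the fitted curve across the observed range of the predictor.
x_min, x_max = min(x), max(x)
grid = [x_min + (x_max - x_min) * k / 50 for k in range(51)]
curve = [sigmoid(b0 + b1 * g) for g in grid]

print("b0 =", b0, "b1 =", b1)
print("accuracy =", accuracy, "true positive rate =", tpr)
```

Passing `grid` and `curve` to a line plot, with the observed 0/1 outcomes as a scatter underneath, gives the usual S-shaped picture of the model along that predictor.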
-
Make two predictions, one for which the predicted probability of a malignant tumor is high and one for which the predicted probability is low. Did you have to extrapolate to make either prediction? In either case, what does that tell you about the quality of the numerical predictor?
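
One way to check for extrapolation is to compare the input you predict at against the observed range of the predictor. The sketch below assumes a one-predictor model with log-odds `b0 + b1 * x`; the coefficients and observed values are made up for illustration, and the helper names (`predict_probability`, `x_for_probability`, `is_extrapolation`) are hypothetical.

```python
import math

def predict_probability(b0, b1, x):
    """P(malignant | x) under a model whose log-odds are b0 + b1 * x."""
    z = b0 + b1 * x
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def x_for_probability(b0, b1, p):
    """Invert the model: the x at which the predicted probability equals p."""
    return (math.log(p / (1.0 - p)) - b0) / b1

def is_extrapolation(x, observed):
    """True when x lies outside the range of the observed predictor values."""
    return x < min(observed) or x > max(observed)

# Hypothetical coefficients and observed predictor values (illustration only).
b0, b1 = -14.0, 1.0
observed_x = [9.5, 11.2, 12.8, 13.9, 15.1, 17.4, 20.3]

x_low = x_for_probability(b0, b1, 0.05)     # input with a low predicted probability
x_high = x_for_probability(b0, b1, 0.95)    # input with a high predicted probability
x_extreme = x_for_probability(b0, b1, 0.999)

for label, value in [("low", x_low), ("high", x_high), ("extreme", x_extreme)]:
    print(label, value, predict_probability(b0, b1, value),
          "extrapolated:", is_extrapolation(value, observed_x))
```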
-
Calculate a "slope" relative to the mean of your numerical predictor.
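
A logistic curve has no single slope, so the value depends on where you evaluate it. One common reading, which is an assumption here and may differ from the definition used in class, is the derivative of the predicted probability with respect to the predictor, dp/dx = b1 * p * (1 - p), evaluated at the predictor's mean. The sketch below uses made-up coefficients and data, and checks the formula against a finite difference.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def slope_at(b0, b1, x):
    """Derivative of sigmoid(b0 + b1 * x) with respect to x."""
    p = sigmoid(b0 + b1 * x)
    return b1 * p * (1.0 - p)

# Hypothetical coefficients and predictor values (illustration only).
b0, b1 = -14.0, 1.0
x_values = [9.5, 11.2, 12.8, 13.9, 15.1, 17.4, 20.3]
x_mean = sum(x_values) / len(x_values)

slope = slope_at(b0, b1, x_mean)

# Central finite difference as an independent check of the formula.
h = 1e-5
finite_difference = (
    sigmoid(b0 + b1 * (x_mean + h)) - sigmoid(b0 + b1 * (x_mean - h))
) / (2.0 * h)

print("mean of predictor:", x_mean)
print("slope at the mean:", slope)
print("finite-difference check:", finite_difference)
```

Read this way, the slope is the approximate change in predicted probability per one-unit increase in the predictor, for a tumor whose measurement sits at the mean.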
-
Use our bootstrap function to calculate a 95% confidence interval for the slope you calculated in problem 4.
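
For reference, the idea behind a percentile bootstrap interval for the slope can be sketched as follows. This is not the course's bootstrap function, whose signature is not shown here. `bootstrap_ci`, `fit_logistic`, and `slope_at_mean` are hypothetical stand-ins, the data are synthetic, and the slope is taken to be b1 * p * (1 - p) at the predictor's mean, which is itself an assumption.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by Newton's method."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:               # nearly singular: stop updating
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

def slope_at_mean(x, y):
    """Refit the model, then evaluate dp/dx at the mean of x."""
    b0, b1 = fit_logistic(x, y)
    p = sigmoid(b0 + b1 * sum(x) / len(x))
    return b1 * p * (1.0 - p)

def bootstrap_ci(x, y, statistic, n_boot=300, level=0.95, seed=0):
    """Percentile interval: resample rows with replacement, recompute, take quantiles."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(statistic([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    alpha = (1.0 - level) / 2.0
    lower = estimates[int(round(alpha * (n_boot - 1)))]
    upper = estimates[int(round((1.0 - alpha) * (n_boot - 1)))]
    return lower, upper, estimates

# Synthetic stand-in data: y = 1 plays the role of diagnosis == "M".
rng = random.Random(314)
x = [rng.gauss(14.0, 3.5) for _ in range(150)]
y = [1 if rng.random() < sigmoid(-7.0 + 0.5 * xi) else 0 for xi in x]

point_estimate = slope_at_mean(x, y)
lower, upper, estimates = bootstrap_ci(x, y, slope_at_mean)
print("slope at the mean:", point_estimate)
print("95% percentile interval:", lower, upper)
```

Note that rows are resampled as (x, y) pairs and the model is refit on every resample, so the interval reflects uncertainty in both the coefficients and the mean of the predictor.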