MATH 385 Homework 15

Use the default dataset. This dataset contains information about people's credit loans and whether or not they defaulted on their loans. The column default records "Yes" whenever a person defaulted on their loan. The column balance records the account balance at the time the data were collected, and the column income records the person's annual income; both are in U.S. dollars ($).

  1. Logistic Regression
    1. Fit a logistic regression model with separate intercepts by student status and a second-order model in the variables balance and income.
    2. Write a sentence interpreting a prediction for a person who is a student, has a balance of $100, and an income of $500.
    3. Using your predicted probability above, would you classify this person as likely to default on their loan or not? Explain.
    4. Write a sentence interpreting a prediction for a person who is not a student, has a balance of $100, and an income of $500.
    5. Using your predicted probabilities above, does it appear that student status has a large impact on whether or not a person is likely to default on their loan? Explain.
    6. Calculate the confusion matrix for your model and write a sentence explaining the accuracy.
    7. Write a sentence or two criticizing your model. What's wrong with it? Explain, using code and complete sentences. Hint: can you find the characteristics of a person for which this model predicts that they will default?
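
The mechanics behind the steps above (turning fitted coefficients into a predicted probability, classifying with a cutoff, and summarizing results in a confusion matrix) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a solution: the coefficient values are made up rather than fitted to the data, the function names are hypothetical, "second order" is assumed to mean squared terms plus the balance-by-income interaction, and the 0.5 cutoff is one common choice, not a requirement of the assignment.

```python
import math

# Hypothetical coefficients for illustration only; NOT fitted to the dataset.
EXAMPLE_COEFS = {
    "intercept": -10.0,
    "student": -0.7,        # shift in the intercept for students
    "balance": 0.005,
    "income": 0.00001,
    "balance_sq": 0.0,
    "income_sq": 0.0,
    "balance_income": 0.0,  # interaction term
}

def predict_probability(coefs, student, balance, income):
    """Predicted probability of default from a second-order logistic model.

    student is 1 for a student and 0 otherwise, so coefs["student"]
    acts as a separate intercept for students.
    """
    eta = (coefs["intercept"]
           + coefs["student"] * student
           + coefs["balance"] * balance
           + coefs["income"] * income
           + coefs["balance_sq"] * balance ** 2
           + coefs["income_sq"] * income ** 2
           + coefs["balance_income"] * balance * income)
    return 1.0 / (1.0 + math.exp(-eta))

def classify(probability, cutoff=0.5):
    """Label a prediction "Yes" (default) when it reaches the cutoff."""
    return "Yes" if probability >= cutoff else "No"

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs."""
    counts = {(a, p): 0 for a in ("No", "Yes") for p in ("No", "Yes")}
    for a, p in zip(actual, predicted):
        counts[(a, p)] += 1
    return counts

def accuracy(counts):
    """Share of observations whose predicted label matches the actual one."""
    correct = counts[("No", "No")] + counts[("Yes", "Yes")]
    return correct / sum(counts.values())

# Example: a student and a non-student with the same balance and income.
p_student = predict_probability(EXAMPLE_COEFS, 1, 100, 500)
p_nonstudent = predict_probability(EXAMPLE_COEFS, 0, 100, 500)
print(classify(p_student), classify(p_nonstudent))
```

With a fitted model, the same functions would be applied to every row of the dataset to obtain predicted labels, which are then compared against the default column to build the confusion matrix.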