MATH 350 Homework 09

You will fit simple and multiple linear regression to a dataset of your choice. You can use the McDonalds dataset we've been using, or you can find your own. I found the McDonalds dataset by searching the collection of datasets on Kaggle as hosted by Joakim Arvidsson.

If you re-use the McDonalds dataset, you can't use the variables I used. You must work with your own.

If you want to use a new dataset, here are some tips.

  • don't overthink this and waste time looking,
  • you will need at least three numeric variables,
  • if in doubt, ask me to look over your dataset before you start coding.
  1. Use our code from the regression Colab notebook to fit linear regression using only one explanatory variable. When there is only one explanatory variable, this is called simple linear regression.
  2. Confirm your fitted coefficients using the following solutions to the by-hand likelihood calculations. Let $\bar{y} = N^{-1}\sum_{n=1}^N y_n$ and $\bar{x} = N^{-1}\sum_{n=1}^N x_n$. Then $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ and $\hat{\beta}_1 = \frac{\sum_{n=1}^N (y_n - \bar{y})(x_n - \bar{x})}{\sum_{n=1}^N (x_n - \bar{x})^2}$.
  3. Use our code from the regression Colab notebook to fit linear regression using more than one explanatory variable. When there is more than one explanatory variable, this is called multiple linear regression.
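
One way to carry out the check in step 2 is sketched below. This is not the code from the regression Colab notebook; it is a minimal NumPy version of the by-hand formulas, and the function name `simple_ols` and the small `x` and `y` arrays are made up for illustration. Swap in two numeric columns from your own dataset.

```python
import numpy as np


def simple_ols(x, y):
    """Closed-form estimates for the model y = beta0 + beta1 * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # beta1_hat = sum((y_n - y_bar)(x_n - x_bar)) / sum((x_n - x_bar)^2)
    beta1 = np.sum((y - y_bar) * (x - x_bar)) / np.sum((x - x_bar) ** 2)
    # beta0_hat = y_bar - beta1_hat * x_bar
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


# Made-up numbers for illustration only; replace with your own columns.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

beta0_hat, beta1_hat = simple_ols(x, y)

# Independent check: least squares on the design matrix [1, x].
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print("by-hand formulas:", beta0_hat, beta1_hat)
print("least squares:   ", coef[0], coef[1])
```

If the two printed pairs agree up to rounding, the by-hand formulas and the fitted model match. The same `np.linalg.lstsq` call should also handle step 3 if you stack more than one explanatory variable into `X` alongside the column of ones.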