MATH 314 Practice Exam 02/Final
Exam 02/Final on Wednesday 2026-05-13 from 2:00 to 3:50 pm in Holt 291
I tried really hard to make exactly 5 things wrong in each code chunk of each of the problems below, but counting is hard.
Please cross out the wrong character(s) and write an ordered,
comma-separated list of replacement characters after the comment #. Place
a line through nothing to denote that you want to add characters at
that point. To delete characters without replacement, use underscores
(e.g. ____) as empty replacements of the characters you cross out.
If you don't know the correct Python syntax for the replacement characters you want, make up something reasonable.
You will not receive credit if you cross out an entire line, even if your fix is correct.
You should assume the following import statements, data loading, and helper function precede each code chunk in each question.
import plotnine as pn
import numpy as np
import pandas as pd
import patsy as pt
import statsmodels.api as sm
import scipy.stats as spicy
from scipy.optimize import minimize
url = "https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2020/2020-07-07/coffee_ratings.csv"
df = pd.read_csv(url)
def bootstrap(arr, T, R = 1_000):
    N = np.shape(arr)[0]
    Ts = np.zeros(R)
    rng = np.random.default_rng()
    for r in range(R):
        idx = rng.integers(N, size = N)
        if type(arr) is np.ndarray:
            Ts[r] = T(arr[idx])
        else:
            Ts[r] = T(arr.iloc[idx])
    return Ts
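A minimal usage sketch for bootstrap, assuming a synthetic NumPy array in place of the coffee data. The array x, its seed, and the choice of np.mean as the statistic are illustrative assumptions, not part of the exam. The helper is repeated so the sketch runs on its own. The interval shown is a percentile interval, one common way to turn bootstrap replicates into a confidence interval; an 87% interval leaves 6.5% in each tail.

```python
import numpy as np


def bootstrap(arr, T, R=1_000):
    # Same resampling helper as above, repeated so this sketch is self-contained.
    N = np.shape(arr)[0]
    Ts = np.zeros(R)
    rng = np.random.default_rng()
    for r in range(R):
        idx = rng.integers(N, size=N)  # N row indices drawn with replacement
        if type(arr) is np.ndarray:
            Ts[r] = T(arr[idx])
        else:
            Ts[r] = T(arr.iloc[idx])
    return Ts


# Synthetic stand-in data (an assumption; this is not the coffee dataset).
x = np.random.default_rng(314).normal(loc=10.0, scale=2.0, size=200)

# T can be any function that maps a resampled array to a single number.
Ts = bootstrap(x, np.mean)

# 87% percentile interval: cut 6.5% from each tail of the replicates.
lo, hi = np.percentile(Ts, [6.5, 93.5])
print(lo, hi)
```

When arr is a pandas DataFrame, the helper takes the .iloc branch instead, so T receives a resampled DataFrame and can refit a model on it and return one coefficient.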
1. Using the dataset df, filter the dataset such that the variable total_cup_points has only values greater than 40 and the variable altitude_mean_meters has only values less than 2000. Remove any rows containing NaNs in either of these two columns, ignoring NaNs in other columns. Plot these variables using plotnine, making sure that total_cup_points is the response variable, and color the points by species.
2. Fit a linear regression to predict total_cup_points using a shared slope on aftertaste and unique intercepts and slopes on altitude_mean_meters by species. The variable species takes the values Arabica and Robusta. Predict the total cup points of an Arabica coffee whose aftertaste and altitude_mean_meters equal their means among Arabica coffees only.
3. Use the function bootstrap above to calculate an 87% confidence interval for the slope on altitude_mean_meters, for the species Robusta and a median aftertaste, of the linear regression model from problem 2.
4. Fit a logistic regression to predict whether or not a coffee is the species Arabica. You first have to create an appropriate response variable. Use an interaction term between aftertaste and altitude_mean_meters. Predict the probability that a coffee is a Robusta given mean values for both aftertaste and altitude_mean_meters.
5. Write a function that can be used with SciPy's minimize() to fit a linear regression model with an arbitrary number of predictors. Use this function, together with minimize(), to fit a linear regression model predicting total_cup_points with unique intercepts by species and a shared slope on aftertaste. Use the Python library patsy to obtain the design matrices appropriate for this model and then use minimize() to estimate the coefficients.
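A hedged sketch of the idea behind the last problem: an objective function that takes a coefficient vector and a design matrix, so the same function works for any number of predictors. Everything here is an illustrative assumption. The data are synthetic, the predictor is a centered stand-in for aftertaste, and the design matrix is assembled by hand with NumPy as a substitute for what patsy would return, which keeps the sketch free of extra dependencies. The objective is the mean squared error, which has the same minimizer as the sum of squared errors.

```python
import numpy as np
from scipy.optimize import minimize


def mse(beta, X, y):
    # Mean squared error of the linear model y ~ X @ beta.
    # Works for any number of columns in X.
    resid = y - X @ beta
    return np.mean(resid ** 2)


# Synthetic stand-in data (assumptions; this is not the coffee dataset).
rng = np.random.default_rng(314)
n = 300
is_robusta = rng.integers(0, 2, size=n)  # hypothetical 0/1 species indicator
x_centered = rng.normal(0.0, 1.0, size=n)  # hypothetical centered predictor

# Columns: baseline intercept, Robusta intercept offset, shared slope.
X = np.column_stack([np.ones(n), is_robusta, x_centered])
beta_true = np.array([82.0, -1.5, 2.0])
y = X @ beta_true + rng.normal(0.0, 0.5, size=n)

# Start from zeros; extra arguments reach mse through args.
beta0 = np.zeros(X.shape[1])
fit = minimize(mse, beta0, args=(X, y))

# Closed-form least squares solution for comparison.
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(fit.x)
print(beta_ls)
```

The two printed vectors should agree closely, since both target the same least squares solution. With a patsy-built design matrix, only the construction of X and y would change; the objective and the minimize() call would stay the same.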