## The dataset mtcars consists of 32 observations on cars from 1974.
## A few variables related to each car were recorded. Build a
## multiple regression model to predict mpg using the variables disp,
## cyl, and wt.
## The dataset is built into R:
## mtcars
## More information about the dataset is available here:
## ?mtcars
## Identify the response variable(s) and its(their) statistical type(s).
## Identify the explanatory variable(s) and its(their) statistical type(s).
## Provide R code to make an informative plot of your data/model.
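## A minimal sketch of one possible plot, assuming cyl is treated as a
## categorical variable. Putting wt on the x-axis and colouring by cyl
## is an illustrative choice, and cyl_f is a hypothetical helper name.
cyl_f <- factor(mtcars$cyl)
plot(mpg ~ wt, data = mtcars, col = as.integer(cyl_f), pch = 19,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
legend("topright", legend = levels(cyl_f),
       col = seq_along(levels(cyl_f)), pch = 19, title = "Cylinders")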
## Write 1 complete English sentence describing the estimated
## intercept for cars with 6 cylinders.
## Does the estimated intercept for cars with 6 cylinders make sense
## in the context of these data? Explain why or why not.
## Write 1 complete English sentence describing the estimated
## intercept for cars with 4 cylinders.
## Does the estimated intercept for cars with 4 cylinders make sense
## in the context of these data? Explain why or why not.
## Write 1 complete English sentence describing the estimated slope
## across wt. State clearly to which levels this applies.
## Provide R code to calculate the mean of wt and disp by levels of
## the categorical variable cyl. If you use any library, be sure to
## load it.
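## One possible approach using base R only, so no library needs to be
## loaded. The name means_by_cyl is a hypothetical choice.
means_by_cyl <- aggregate(cbind(wt, disp) ~ cyl, data = mtcars, FUN = mean)
means_by_cyl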
## Write down both R code and mathematical symbols that would make a
## prediction for cars with 8 cylinders when wt and disp are
## equal to their mean, call it wt_bar and disp_bar.
## Interpret your prediction in context of these data.
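## A hedged sketch of the 8-cylinder prediction. Assumptions: cyl enters
## the model as a factor (so 4 cylinders is the reference level), and
## wt_bar and disp_bar are the means among 8-cylinder cars rather than
## the overall means. fit_sketch and pred_8 are hypothetical names.
## In symbols:
##   \hat{mpg} = \hat\beta_0 + \hat\beta_{cyl8}
##               + \hat\beta_{wt} wt_bar + \hat\beta_{disp} disp_bar
fit_sketch <- lm(mpg ~ factor(cyl) + wt + disp, data = mtcars)
wt_bar <- mean(mtcars$wt[mtcars$cyl == 8])
disp_bar <- mean(mtcars$disp[mtcars$cyl == 8])
pred_8 <- predict(fit_sketch,
                  newdata = data.frame(cyl = 8, wt = wt_bar, disp = disp_bar))
pred_8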
## Write down both R code and mathematical symbols that would make a
## prediction for cars with 8 cylinders when disp is equal to its
## mean and wt is equal to 8.
## Using words from our class, which prediction is more reasonable:
## the one at wt equal to wt_bar or the one at wt equal to 8? Why?
## Calculate and interpret a 90% confidence interval for an offset
## in context of these data.
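## A minimal sketch of 90% confidence intervals for the cylinder
## offsets, assuming cyl is modeled as a factor with 4 cylinders as the
## reference level. fit_ci and ci_90 are hypothetical names.
fit_ci <- lm(mpg ~ factor(cyl) + wt + disp, data = mtcars)
ci_90 <- confint(fit_ci, parm = c("factor(cyl)6", "factor(cyl)8"),
                 level = 0.90)
ci_90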
## Do these data suggest a significant difference in the cars' mpg
## based on how many cylinders they have? Explain.
## Using the dataset iris, which is built into R, create a categorical
## response variable with two levels where the variable takes on the
## value 1 for the species versicolor and 0 otherwise.
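## One way to build the 0/1 response; is_versicolor is a hypothetical
## name chosen for illustration.
is_versicolor <- as.numeric(iris$Species == "versicolor")
table(is_versicolor)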
## Use the code below to predict the species based on a numerical
## explanatory variable of your choice.
X <- model.matrix()
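ll <- function(beta, y, mX) {
  ## linear predictor for each row of the design matrix
  lin <- apply(mX, 1, function(row) {sum(beta * row)})
  ## negative log-likelihood of the logistic regression model
  sum( log1p(exp(lin)) - y*lin )
}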
beta_hat <- optim()$par
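## A hedged sketch of how the template above might be filled in. The
## choice of Sepal.Width, the starting values c(0, 0), and the BFGS
## method are assumptions for illustration. Names ending in _sketch are
## hypothetical and kept separate from the template's own objects.
y_sketch <- as.numeric(iris$Species == "versicolor")
X_sketch <- model.matrix(~ Sepal.Width, data = iris)
ll_sketch <- function(beta, y, mX) {
  lin <- drop(mX %*% beta)           # same linear predictor, vectorised
  sum(log1p(exp(lin)) - y * lin)     # negative log-likelihood
}
beta_sketch <- optim(c(0, 0), ll_sketch, y = y_sketch, mX = X_sketch,
                     method = "BFGS")$par
beta_sketch
## Cross-check against glm(), which should give similar estimates.
coef(glm(y_sketch ~ Sepal.Width, data = iris, family = binomial))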
## Use the code below to estimate the probability that an iris is a
## member of the species versicolor based on the mean of your numerical
## explanatory variable. Interpret, in context of the data, the
## predicted probability.
pred_logistic <- function(mX, betahat) {
  ## linear predictor for each row, then the inverse logit
  lin <- apply(mX, 1, function(row) {sum(betahat * row)})
  1 / (1 + exp(-lin))
}
pred_logistic(matrix(c(1, ?), ncol=?), beta_hat)
## Estimate the change in probability given some change in the
## numerical explanatory variable. Interpret, in context of the data,
## the change in predicted probability.
blogistic <- function(data, idx) {
  ?
  diff(pred_logistic(matrix(c(1, ?,
                              1, ?),
                            ncol=?, byrow=TRUE), beta_hat))
}
b <- boot::boot() #, ncpus=3, parallel="multicore")
boot::boot.ci()
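## A hedged sketch of one way the bootstrap template could be completed.
## Assumptions for illustration only: Sepal.Width is the explanatory
## variable, the change is from 3.0 to 3.5, the model is refit on each
## resample, R = 200 replicates, and a 90% percentile interval is used.
## Names ending in _sketch are hypothetical.
nll_sketch <- function(beta, y, mX) {
  lin <- drop(mX %*% beta)
  sum(log1p(exp(lin)) - y * lin)
}
boot_diff_sketch <- function(data, idx) {
  d <- data[idx, ]                                   # resampled rows
  y <- as.numeric(d$Species == "versicolor")
  mX <- model.matrix(~ Sepal.Width, data = d)
  bh <- optim(c(0, 0), nll_sketch, y = y, mX = mX, method = "BFGS")$par
  newX <- matrix(c(1, 3.0,
                   1, 3.5), ncol = 2, byrow = TRUE)
  diff(drop(1 / (1 + exp(-(newX %*% bh)))))          # change in probability
}
set.seed(1)
b_sketch <- boot::boot(iris, boot_diff_sketch, R = 200)
boot::boot.ci(b_sketch, conf = 0.90, type = "perc")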
## Estimate the change in probability given some other change in the
## numerical explanatory variable. Interpret, in context of the data,
## the change in predicted probability.
## Are your confidence intervals the same? Should they be in logistic
## regression? Explain.