## The dataset wages consists of 3294 observations on hourly wages. A
## few variables related to earnings were recorded. Build a multiple
## regression model to predict wage using the variables sex,
## exper, and school.
## The dataset is available at
## https://raw.githubusercontent.com/roualdes/data/master/wages.csv
## More information about the dataset is available here:
## https://github.com/roualdes/data/blob/master/wages.txt
## Identify the response variable(s) and its(their) statistical type(s).
## Identify the explanatory variable(s) and its(their) statistical type(s).
## Provide R code to make an informative plot of your data/model.
## Write 1 complete English sentence describing the estimated
## intercept for females.
## Does the estimated intercept for females make sense in the context of
## these data? Explain why or why not.
## Write 1 complete English sentence describing the estimated
## intercept for males.
## Does the estimated intercept for males make sense in the context
## of these data? Explain why or why not.
## Write 1 complete English sentence describing the estimated slope
## across exper. State clearly to which sexes this applies.
## Provide R code to calculate the mean of exper and school by levels
## of the categorical variable sex. If you use any library, be sure
## to load it.
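## A minimal sketch of one way to compute group means, using base R's
## aggregate() so no library needs loading. The data frame toy is
## hypothetical (made-up values); only its column names mirror the
## wages data.
toy <- data.frame(sex = c("female", "female", "male", "male"),
                  exper = c(6, 8, 7, 11),
                  school = c(12, 14, 10, 12))
## One row per level of sex, one column of means per numerical variable.
toy_means <- aggregate(cbind(exper, school) ~ sex, data = toy, FUN = mean)
toy_means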
## Write down both R code and mathematical symbols that would make a
## prediction for the hourly wage of males when exper and school are
## equal to their mean, call it exper_bar and school_bar.
## Interpret your prediction in context of these data.
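## A hedged sketch of such a prediction on simulated data (sim, fit_sim,
## and every coefficient used to generate sim are made up for
## illustration; they are not estimates from the wages data). With female
## as the reference level, the prediction for males is
##   wage_hat = b0_hat + b1_hat * 1 + b2_hat * exper_bar + b3_hat * school_bar
## where b1_hat is the coefficient on sexmale. This sketch assumes the
## male-specific means are intended for exper_bar and school_bar.
set.seed(1)
n <- 200
sim <- data.frame(sex = rep(c("female", "male"), each = n / 2),
                  exper = runif(n, 1, 15),
                  school = runif(n, 8, 16))
sim$wage <- 1 + 1.5 * (sim$sex == "male") + 0.1 * sim$exper +
    0.4 * sim$school + rnorm(n)
fit_sim <- lm(wage ~ sex + exper + school, data = sim)
males <- sim[sim$sex == "male", ]
new_male <- data.frame(sex = "male",
                       exper = mean(males$exper),
                       school = mean(males$school))
## predict() and the hand-written linear combination should agree.
pred_male <- predict(fit_sim, newdata = new_male)
by_hand <- sum(coef(fit_sim) * c(1, 1, new_male$exper, new_male$school))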
## Write down both R code and mathematical symbols that would make a
## prediction for the hourly wage of males when exper is equal to its
## mean and school is equal to 20.
## Using words from our class, which prediction is more reasonable: the
## one at school equal to school_bar or the one at school equal to 20? Why?
## Calculate and interpret a 90% confidence interval for an offset
## in context of these data.
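## A sketch under two assumptions: that "offset" refers to the
## coefficient on sexmale (the shift in intercept between sexes), and
## that a t-based interval from confint() is acceptable; a bootstrap
## interval would be an alternative. The data ci_sim are simulated and
## hypothetical, not the wages data.
set.seed(3)
ci_sim <- data.frame(sex = rep(c("female", "male"), each = 50),
                     exper = runif(100, 1, 15),
                     school = runif(100, 8, 16))
ci_sim$wage <- 1 + 1.5 * (ci_sim$sex == "male") + 0.1 * ci_sim$exper +
    0.4 * ci_sim$school + rnorm(100)
ci_fit <- lm(wage ~ sex + exper + school, data = ci_sim)
## 90% interval for the male-versus-female shift in intercept.
ci_offset <- confint(ci_fit, parm = "sexmale", level = 0.90)
ci_offset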
## Do these data suggest a significant difference in the hourly wage
## between males and females? Explain.
## Using the dataset
## [carnivora](https://raw.githubusercontent.com/roualdes/data/master/carnivora.csv)
## create a categorical response variable with two levels where the
## variable takes on the value 1 for the Super Family Caniformia and 0 otherwise.
## Use the code below to predict the Super Family of the Order
## Carnivora based on a numerical explanatory variable of your choice.
X <- model.matrix()
## Negative log-likelihood of a logistic regression, to be minimized.
ll <- function(beta, y, mX) {
    lin <- apply(mX, 1, function(row) {sum(beta * row)})
    sum( log1p(exp(lin)) - y*lin )
}
beta_hat <- optim()$par
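## A minimal sketch of how such a fit can be run end to end, on simulated
## data rather than carnivora. All names ending in _sim are hypothetical,
## and the true coefficients -0.5 and 1.2 are made up. The result from
## optim() is compared against glm() as a sanity check.
set.seed(2)
x_sim <- rnorm(300)
y_sim <- rbinom(300, 1, plogis(-0.5 + 1.2 * x_sim))
X_sim <- model.matrix(~ x_sim)   # column of 1s plus the explanatory variable
nll_sim <- function(beta, y, mX) {
    lin <- drop(mX %*% beta)     # linear predictor for every row
    sum(log1p(exp(lin)) - y * lin)
}
## Extra named arguments (y, mX) are passed through optim() to nll_sim.
beta_sim <- optim(c(0, 0), nll_sim, y = y_sim, mX = X_sim,
                  method = "BFGS")$par
glm_sim <- coef(glm(y_sim ~ x_sim, family = binomial))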
## Use the code below to estimate the probability that an animal is a
## member of the Super Family Caniformia based on the mean of your numerical
## explanatory variable. Interpret, in context of the data, the
## predicted probability.
## Predicted probability: inverse logit of the linear predictor.
pred_logistic <- function(mX, betahat) {
    lin <- apply(mX, 1, function(row) {sum(betahat * row)})
    1 / (1 + exp(-lin))
}
pred_logistic(matrix(c(1, ?), ncol=?), beta_hat)
## Estimate the change in probability given some change in the
## numerical explanatory variable. Interpret, in context of the data,
## the change in predicted probability.
blogistic <- function(data, idx) {
    ?
    diff(pred_logistic(matrix(c(1, ?,
                                1, ?),
                              ncol=?, byrow=TRUE), beta_hat))
}
b <- boot::boot() #, ncpus=3, parallel="multicore")
boot::boot.ci()
## Estimate the change in probability given some other change in the
## numerical explanatory variable. Interpret, in context of the data,
## the change in predicted probability.
## Are your confidence intervals the same? Should they be in logistic
## regression? Explain.
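## A small illustration of the point behind the last question, using
## hypothetical coefficients (b0_demo and b1_demo are made up, not
## estimates from carnivora). Because the inverse logit is nonlinear,
## the same one-unit step in the explanatory variable changes the
## predicted probability by different amounts at different starting
## values.
b0_demo <- -0.5
b1_demo <- 1.2
change_low <- diff(plogis(b0_demo + b1_demo * c(0, 1)))    # step from 0 to 1
change_high <- diff(plogis(b0_demo + b1_demo * c(2, 3)))   # step from 2 to 3
c(change_low, change_high)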