```
ll <- function(theta, data) {
  return(-sum(...))
}
```

# Worksheet 04

Please create a folder named `worksheet04` and create a QMD file named `main.qmd` within it. Put your solutions to Worksheet 04 into `main.qmd`. When finished, render the file, and submit by dragging and dropping `worksheet04`, the entire folder along with all its contents, into our shared Google Drive folder. Thus, you’ve successfully submitted Worksheet 04 when our shared Google Drive folder has just one folder in it, named `worksheet04`, with your Worksheet 04 solutions contained within it.

## Steps to find maximum likelihood estimators by pen/paper math

The likelihood function is defined as

\[L(\theta | \mathbf{X}) = \prod_{n=1}^N f(X_n | \theta)\]

It’s generally easier to work with the log-likelihood, instead of the likelihood itself. The log-likelihood is defined as a function of the unknown parameter(s) \(\theta\).

\[ l(\theta) = \log{L(\theta | \mathbf{X})} = \sum_{n=1}^N \log{f(X_n | \theta)} \]

To find the best guess of \(\theta\) in terms of the data \(X_1, \ldots, X_N\), perform the following steps.

1. Plug the appropriate density function \(f(x|\theta)\) for your particular problem into the log-likelihood function. Simplify.

2. Differentiate the simplified log-likelihood function \(l(\theta)\) with respect to \(\theta\), simplify, and set the derivative equal to \(0\). In symbols,

   \[ \frac{d}{d \theta} l(\theta) = 0 \]

3. Solve for the parameter \(\theta\) in terms of the data \(\mathbf{X}\).

   \[ \hat{\theta} = \hat{\theta}(\mathbf{X}) \]
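As an illustration of these steps, here is a short worked sketch. It assumes, purely for the example, an exponential density \(f(x | \theta) = \theta e^{-\theta x}\) for \(x > 0\) and \(\theta > 0\); this density is not one of the worksheet problems. Plugging in and simplifying gives

\[ l(\theta) = \sum_{n=1}^N \log{\left(\theta e^{-\theta X_n}\right)} = N \log{\theta} - \theta \sum_{n=1}^N X_n \]

Differentiating and setting the derivative equal to \(0\) gives

\[ \frac{d}{d \theta} l(\theta) = \frac{N}{\theta} - \sum_{n=1}^N X_n = 0 \]

Solving for \(\theta\) in terms of the data gives

\[ \hat{\theta} = \frac{N}{\sum_{n=1}^N X_n} = \frac{1}{\bar{X}} \]

where \(\bar{X}\) denotes the sample mean.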

## Steps to find the maximum likelihood estimators by R

Write the negative log-likelihood as an R function `ll(theta, data)` that returns `-sum(...)` of the log-density values, where the negative sign accounts for the fact that the R function `optim` performs minimization by default. Pass this function to `optim(...)` as

```
optim(initial_point, ll, data = data, method = "L-BFGS-B",
      lower = c(...), upper = c(...))
```

You need to create a vector of initial values for `theta` and pass it in as the variable `initial_point`. The data for your log-likelihood function should be stored in `data`. You also need to tell `optim()` what the bounds on your parameter values are, through `lower` and `upper`.
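The recipe above can be sketched end to end as follows. This is only an illustration under stated assumptions: the data are simulated from an exponential distribution (not one of the worksheet problems), and the starting value and bounds are arbitrary choices made for this example.

```r
# Hypothetical data: 200 draws from an exponential distribution with rate 2
set.seed(1)
data <- rexp(200, rate = 2)

# Negative log-likelihood; theta[1] is the rate parameter
ll <- function(theta, data) {
  return(-sum(dexp(data, rate = theta[1], log = TRUE)))
}

# Starting value and bounds are assumptions for this example;
# the rate must be positive, so the lower bound sits just above 0
initial_point <- c(1)
fit <- optim(initial_point, ll, data = data, method = "L-BFGS-B",
             lower = c(1e-6), upper = c(Inf))

fit$par         # numerical maximum likelihood estimate of the rate
1 / mean(data)  # closed-form estimate for the exponential rate, for comparison
```

The two printed values should agree closely, which is a useful sanity check whenever a pen-and-paper answer is available.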

## Bias, Variance, and Mean Squared Error

In class, we claimed that the estimate \(\hat{v}\) of the variance \(v\) with the denominator \(N - 1\) is unbiased, while the estimate with the denominator \(N\) is biased. Specifically, for the estimate with the denominator \(N\), we showed that

\[\text{Bias}[\hat{v}] = \mathbb{E}[\hat{v} - v] = -v / N\]

where \(N\) is the number of observations you have.

- Pick a distribution such that you know (or can find) the variance \(v\), and generate \(R\) **unbiased** estimates \(\hat{v}_r\) for \(r\) in \(1:R\) of the variance and \(R\) estimates of the **biased** variance. You should choose your distribution’s parameters, \(R\), and \(N\).

- For your specific setup, what do you expect the bias to be for the **biased** estimator of \(v\)?

- Estimate the bias of the **biased** version of \(\hat{v}\) using your \(R\) estimates of \(v\). Is this estimate close to what you expect?

- For your specific setup, what do you expect the bias to be for the **unbiased** estimator of \(v\)?

- Estimate the bias of the **unbiased** version of \(\hat{v}\) using your \(R\) estimates of \(v\). Is this estimate close to what you expect?

- Estimate the variance for the biased and unbiased versions of \(\hat{v}\).

- Estimate the mean squared error for the biased and unbiased versions of \(\hat{v}\). Which is smaller? Which estimator do you prefer? Why?
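As a starting point, the two variance estimates for a single sample can be computed as in the sketch below. The normal distribution, its parameters, and the value of `N` are placeholders chosen for illustration; repeating this \(R\) times and summarizing the results is left to you.

```r
# Hypothetical single sample; distribution, parameters, and N are placeholders
set.seed(1)
N <- 10
x <- rnorm(N, mean = 0, sd = 2)

v_unbiased <- var(x)                    # var() uses the denominator N - 1
v_biased   <- sum((x - mean(x))^2) / N  # same sum of squares, denominator N

# The two estimates differ only by the factor (N - 1) / N
all.equal(v_biased, v_unbiased * (N - 1) / N)
```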

# Likelihood

Let \(X_1, \ldots, X_{50} \sim_{iid} \text{Poisson}(\lambda)\) where all we know is that \(\lambda > 0\). Suppose the observations \(X_n\) record the number of surface imperfections for a random sample of \(50\) metal plates and are summarized in the following table:

| number of scratches per plate | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| frequency | 4 | 12 | 11 | 14 | 9 |

- Find the maximum likelihood estimator of \(\lambda\) using a computer. Use the function `dpois(x, lambda, log = TRUE)` to help you write your log-likelihood function.

- Find the maximum likelihood estimator of \(\lambda\) using pen and paper. The density function for the Poisson distribution is

\[f(x | \lambda) = \frac{\lambda^x e^{-\lambda}}{x!}\]
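For the computer part, the frequency table in this problem first needs to be turned into a vector with one entry per plate. One possible way to do that, sketched here with variable names that are only illustrative, is `rep()`:

```r
# Counts taken from the frequency table in this problem
scratches <- 0:4
frequency <- c(4, 12, 11, 14, 9)

# One entry per plate: four 0s, twelve 1s, and so on
x <- rep(scratches, times = frequency)

length(x)  # 50, matching the sample size
table(x)   # reproduces the frequency row of the table
```

The resulting vector `x` can then serve as the `data` argument of your log-likelihood function.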

# Likelihood or Method of Moments?

A plumbing supplier typically ships packages of supplies containing many different combinations of items such as pipes, sealants, and drains. Almost invariably a shipment contains one or more incorrectly filled items: a part may be defective, missing, or not the type ordered. In this context, the random variable of interest is the proportion \(P\) of incorrectly filled items per shipment. A family of distributions for modeling the distribution of proportions has density function

\[f(p | \theta) = \theta p ^{\theta - 1}\]

for \(0 < p < 1\) and \(\theta > 0\). Suppose we have a sample of \(N = 5\) shipments with proportions \(P_1, \ldots, P_5\) of incorrectly filled items \(0.05, 0.31, 0.17, 0.23, 0.08\). Find an estimate of \(\theta\) from these data.