| Distribution of $$X$$ | density function | $$\mathbb{E}[X]$$ | $$\mathbb{V}[X]$$ | parameter bounds |
| --- | --- | --- | --- | --- |
| Binomial$$(K, p)$$ | $${K \choose x}p^x(1-p)^{K-x}$$ | $$Kp$$ | $$Kp(1 - p)$$ | $$0 \leq p \leq 1$$ |
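
The Binomial entries can be checked numerically. The following is a minimal sketch, assuming the example values K = 10 and p = 0.3 (they are not part of the exercise) and the support x = 0, ..., K. It evaluates the density with dbinom() and compares the resulting mean and variance with $$Kp$$ and $$Kp(1 - p)$$.

```r
# Assumed example parameters; any integer K >= 1 and 0 <= p <= 1 would do.
K <- 10
p <- 0.3

# Support of the Binomial(K, p) distribution.
x <- 0:K

# Density evaluated with R's built-in function.
pmf <- dbinom(x, size = K, prob = p)

# The same density written out from the formula choose(K, x) p^x (1 - p)^(K - x).
pmf_formula <- choose(K, x) * p^x * (1 - p)^(K - x)

# The probabilities sum to one.
total <- sum(pmf)

# Mean and variance computed directly from the density.
EX <- sum(x * pmf)
VX <- sum((x - EX)^2 * pmf)

# Side-by-side comparison with the closed forms Kp and Kp(1 - p).
print(c(EX = EX, Kp = K * p))
print(c(VX = VX, Kp1mp = K * p * (1 - p)))
```
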
1. Assume the random variable $$X$$ follows the Normal distribution, $$X \sim \text{N}(\mu, \sigma^2)$$, with $$\mathbb{E}[X] = \mu$$ and $$\mathbb{V}[X] = \sigma^2$$.

    1. Choose a value of the mean $$\mu$$ and standard deviation $$\sigma$$, and store them as variables. If you want to use the English language equivalents, use the variable names mu and sigma.

    2. Use seq() to generate a sequence of 501 values from mu - 4 * sigma to mu + 4 * sigma, and name it x. The argument length.out allows you to specify how long you want the vector x to be.

    3. Put x into a dataframe, along with a column of the evaluation of the Normal distribution’s density function, dnorm(x, mean = mu, sd = sigma).

    4. Make a plot of your Normal distribution’s density function using geom_line().

    5. Randomly generate N = 1001 observations from your Normal distribution, using rnorm(...).

    6. Estimate the following probabilities using your vector of observations:
        1. $$\mathbb{P}[$$ mu - sigma $$< X <$$ mu + sigma $$]$$.
        2. $$\mathbb{P}[$$ mu - 2*sigma $$< X <$$ mu + 2*sigma $$]$$.
        3. $$\mathbb{P}[$$ mu - 3*sigma $$< X <$$ mu + 3*sigma $$]$$.
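
One possible way through the steps of exercise 1 is sketched below. The values mu = 2 and sigma = 1.5, the seed, and the names df, obs, p1, p2, p3 and exact are assumptions made for illustration; the exercise leaves these choices to you. The plot is only drawn when the ggplot2 package is installed.

```r
set.seed(1)  # assumed seed, used only so the sketch is reproducible

# Assumed example parameters; pick your own mean and standard deviation.
mu <- 2
sigma <- 1.5

# 501 evenly spaced values covering four standard deviations on each side of mu.
x <- seq(mu - 4 * sigma, mu + 4 * sigma, length.out = 501)

# Dataframe with the grid and the Normal density evaluated on it.
df <- data.frame(x = x, density = dnorm(x, mean = mu, sd = sigma))

# Line plot of the density, skipped if ggplot2 is not available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  plt <- ggplot2::ggplot(df, ggplot2::aes(x = x, y = density)) +
    ggplot2::geom_line()
  print(plt)
}

# N random observations from the same Normal distribution.
N <- 1001
obs <- rnorm(N, mean = mu, sd = sigma)

# The mean of a logical vector is the fraction of TRUE entries,
# which estimates the probability of the event.
p1 <- mean(obs > mu - sigma & obs < mu + sigma)
p2 <- mean(obs > mu - 2 * sigma & obs < mu + 2 * sigma)
p3 <- mean(obs > mu - 3 * sigma & obs < mu + 3 * sigma)

# Exact probabilities from the Normal distribution function, for comparison.
exact <- pnorm(1:3) - pnorm(-(1:3))

print(rbind(estimate = c(p1, p2, p3), exact = exact))
```

The estimates vary from run to run without a fixed seed, and they should move closer to the exact values as N grows.
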
2. Choose parameters $$K, p$$ for a Binomial distribution to generate random data from. Let’s call this distribution $$F$$.

    1. Generate N = 100 observations from $$F$$ and store them in a variable named x.

    2. Calculate and store the sample mean of x, and call it Ehat.

    3. Wrap steps 1 and 2 in a for loop of length R = 1000. Don’t forget to pre-allocate as necessary.

    4. Put your collection of R sample means into a dataframe.

    5. Make a density plot of your R sample means.

    6. Use var() to calculate the sample variance of your R sample means, and call it Vhat_Ehat.

    7. Estimate the following probabilities using your vector of R sample means. I’m using the symbol $$K\hat{p}$$ to represent your vector of sample means as a random variable (recall that estimates are now random variables).

        1. $$\mathbb{P}[$$ K*p - sqrt(Vhat_Ehat) $$< K\hat{p} <$$ K*p + sqrt(Vhat_Ehat) $$]$$.
        2. $$\mathbb{P}[$$ K*p - 2*sqrt(Vhat_Ehat) $$< K\hat{p} <$$ K*p + 2*sqrt(Vhat_Ehat) $$]$$.
        3. $$\mathbb{P}[$$ K*p - 3*sqrt(Vhat_Ehat) $$< K\hat{p} <$$ K*p + 3*sqrt(Vhat_Ehat) $$]$$.

    8. Your answers from step 7 should be close to the calculations in step 6 of exercise 1. Compare the numbers and decide how far off they are. Try increasing or decreasing R to see how close these probabilities are to the ones above.
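
The simulation in exercise 2 could look roughly like the sketch below. The parameters K = 20 and p = 0.3, the seed, and the names means_df, s, q1, q2, q3 and V_theory are assumptions made for illustration. Here Ehat is pre-allocated as a vector of length R and filled one entry per iteration. The density plot is only drawn when the ggplot2 package is installed.

```r
set.seed(2)  # assumed seed, used only so the sketch is reproducible

# Assumed example parameters for the Binomial distribution F.
K <- 20
p <- 0.3

N <- 100   # observations per sample
R <- 1000  # number of repeated samples

# Pre-allocate the vector that will hold the R sample means.
Ehat <- numeric(R)

for (r in seq_len(R)) {
  x <- rbinom(N, size = K, prob = p)  # N observations from F
  Ehat[r] <- mean(x)                  # sample mean of this replicate
}

# Dataframe of sample means, ready for plotting.
means_df <- data.frame(Ehat = Ehat)

# Density plot of the sample means, skipped if ggplot2 is not available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  plt <- ggplot2::ggplot(means_df, ggplot2::aes(x = Ehat)) +
    ggplot2::geom_density()
  print(plt)
}

# Sample variance of the R sample means and the matching standard deviation.
Vhat_Ehat <- var(Ehat)
s <- sqrt(Vhat_Ehat)

# Fraction of sample means within 1, 2 and 3 standard deviations of K * p.
q1 <- mean(Ehat > K * p - s & Ehat < K * p + s)
q2 <- mean(Ehat > K * p - 2 * s & Ehat < K * p + 2 * s)
q3 <- mean(Ehat > K * p - 3 * s & Ehat < K * p + 3 * s)

# Variance of the mean of N independent draws: Kp(1 - p) / N.
V_theory <- K * p * (1 - p) / N

print(c(Vhat_Ehat = Vhat_Ehat, V_theory = V_theory))
print(c(q1 = q1, q2 = q2, q3 = q3))
```

Because the sample means are approximately Normal for N this large, q1, q2 and q3 should land near the three Normal probabilities estimated in exercise 1.
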