Sampling Distribution

Edward A. Roualdes

Recap

Recap: Goal of Statistics

Statistics estimates population parameters:

  • population mean \(\mu\) using sample mean, \(\bar{X}\)
  • population proportion \(p\) using sample proportion, \(\hat{p}\)
  • population standard deviation \(\sigma\) using sample standard deviation, \(s\)

Point Estimate

A sample statistic calculated from data is known as a point estimate. Statistics, the field, thinks of these as random variables, since (abstractly) sample statistics are just functions of random variables.
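
As a minimal sketch of what "calculating a point estimate" looks like, the snippet below computes \(\bar{X}\), \(s\), and \(\hat{p}\) from one small sample. The numbers in `sample` and the trait "value greater than 4" are made up for illustration; they are assumptions, not part of the possum data.

```python
import statistics

# Hypothetical sample of 8 measurements (made-up numbers for illustration).
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Sample mean: the point estimate of the population mean mu.
x_bar = statistics.mean(sample)

# Sample standard deviation (n - 1 denominator): the point estimate of sigma.
s = statistics.stdev(sample)

# Sample proportion: the fraction of observations with some trait
# (here, an assumed trait "value greater than 4"); the point estimate of p.
p_hat = sum(x > 4 for x in sample) / len(sample)

print("x_bar =", x_bar)
print("s     =", round(s, 3))
print("p_hat =", p_hat)
```

A different random sample would give different values of all three, which is why each is treated as a random variable.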

Sampling Distribution

Point Estimates Vary

Consider a dataset about possums from Australia and New Guinea. Suppose we collected 1000 random samples of 50 possums each and calculated the mean from each sample. So now we have 1000 sample means. What’s the shape of the distribution of these sample means?
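
The repeated-sampling thought experiment can be sketched in a few lines. The possum data are not reproduced here, so the snippet uses a stand-in population: 10,000 values drawn from a normal distribution with mean 92 and standard deviation 3.5. Those numbers, the seed, and the variable names are all assumptions made for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical stand-in population (assumed values, not the real possum data).
population = [random.gauss(92, 3.5) for _ in range(10_000)]

n_samples, n = 1000, 50

# Draw 1000 random samples of size 50 (without replacement within a sample)
# and record the mean of each one.
sample_means = [
    statistics.mean(random.sample(population, n)) for _ in range(n_samples)
]

# Compare the center and spread of the 1000 sample means with the population.
print("population mean:      ", round(statistics.mean(population), 2))
print("mean of sample means: ", round(statistics.mean(sample_means), 2))
print("sd of sample means:   ", round(statistics.stdev(sample_means), 2))
print("population sd/sqrt(n):", round(statistics.pstdev(population) / n ** 0.5, 2))
```

Plotting a histogram of `sample_means` is one way to look at the shape directly; the printed summaries compare its center with the population mean and its spread with \(\sigma/\sqrt{n}\).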

Sampling Distribution, example

Sampling Distribution, definition

It is useful to think of a particular point estimate as being drawn from this continuous distribution, the sampling distribution.

  • The sampling distribution represents the distribution (shape of histogram) of the point estimates based on samples of a fixed size from a certain population.

Sampling Distribution, parameters

The sampling distribution has its own mean and its own standard deviation.

  • The standard deviation of a point estimator's sampling distribution is called the standard error.

Standard Error

Example

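Let’s use the formula for the standard error of the sample mean, \(\sigma_{\bar{X}} = \sigma/\sqrt{n}\), where \(\sigma\) is the population standard deviation and \(n\) is the sample size.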

  • Would you be more trusting of a sample that has \(100\) observations or \(400\) observations?
  • If \(\sigma = 10\), what is \(\sigma_{\bar{X}}\) when \(n=100\)?
  • What is \(\sigma_{\bar{X}}\) when \(n=400\)?
  • What do the respective sampling distributions look like?
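
The two calculations above can be checked with a short sketch. The helper name `standard_error` is hypothetical, introduced only for this example; it simply evaluates \(\sigma/\sqrt{n}\).

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 10
for n in (100, 400):
    print(f"n = {n}: standard error = {standard_error(sigma, n)}")
# n = 100: standard error = 1.0
# n = 400: standard error = 0.5
```

Quadrupling the sample size halves the standard error, so the sampling distribution for \(n=400\) is centered at the same place but is half as wide as the one for \(n=100\).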

Take Away

  • Statistics imagines repeated sampling, even if it never happens
  • Mathematically, repeated sampling can and does happen
  • Standard Error (error in estimate) decreases as sample size goes up
    • intuition comes from standard error of the sample mean, \(\frac{\sigma}{\sqrt{n}}\)
  • We use these lessons to build new tools to learn about populations of interest