Use the dataset from my GitHub repository named radon. Perform a short analysis on the numeric variable log_radon grouped by floor.

According to the EPA, “radon is a naturally occurring radioactive gas that can cause lung cancer.” Measurements of log_radon were taken on a specific floor of a random sample of homes in Minnesota. By taking \(\log_{10}\) of radon, log_radon essentially measures the order of magnitude of the radon found. The categorical variable floor records the lowest level of the house, which is also the level on which the radon measurement was taken. floor takes on the value 0 if the radon measurement was recorded in the basement or 1 if the measurement was recorded on the first floor of the house.
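
As a quick illustration of why log_radon can be read as a magnitude, note that multiplying a radon level \(r\) by ten raises its base-10 logarithm by exactly one unit:

\[
\log_{10}(10\,r) = \log_{10}(10) + \log_{10}(r) = 1 + \log_{10}(r).
\]

So a difference of one unit in log_radon between two homes corresponds to a tenfold difference in measured radon.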

Determine if the mean magnitude of radon in Minnesota homes is different between homes with basements and homes without basements.

Your analysis should have

  1. A well-labeled plot of your variable. Put axis labels on your plot by using bp.labels(...).

  2. A point estimate of the population mean magnitude of radon on each floor of the house. Use SciPy’s function minimize(...) along with the simplified log-likelihood of the Normal distribution to estimate the population means.

  3. One complete English sentence explaining the two point estimates you just found, in the context of the data.

  4. A bootstrap confidence interval for each floor’s mean, at a confidence level of your choice.

  5. One complete English sentence describing each of the confidence intervals you just found, in the context of the data.

  6. An addition to your first plot, or a separate well-labeled plot, that visualizes the results of your analysis.
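
The estimation steps in items 2 and 4 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a model answer: the data are synthetic stand-ins generated with made-up parameters (not the radon dataset), the helper names neg_loglik_normal, mle_mean, and bootstrap_ci are hypothetical, and the 90% level, the percentile bootstrap, and the 500 resamples are arbitrary choices. The "simplified" log-likelihood is taken here to mean the Normal log-likelihood with the terms that do not involve the mean dropped.

```python
import numpy as np
from scipy.optimize import minimize


def neg_loglik_normal(mu, x):
    """Simplified negative Normal log-likelihood as a function of mu.

    Terms that do not depend on mu (constants and the scale) are dropped,
    so only the sum of squared deviations remains.
    """
    return 0.5 * np.sum((x - mu) ** 2)


def mle_mean(x):
    """Point estimate of the mean, found by numerical minimization."""
    x = np.asarray(x, dtype=float)
    result = minimize(neg_loglik_normal, x0=np.array([0.0]), args=(x,), method="BFGS")
    return float(result.x[0])


def bootstrap_ci(x, level=0.90, n_resamples=500, rng=None):
    """Percentile bootstrap confidence interval for the mean."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    estimates = np.empty(n_resamples)
    for r in range(n_resamples):
        # Resample the observations with replacement and re-estimate the mean.
        resample = rng.choice(x, size=x.size, replace=True)
        estimates[r] = mle_mean(resample)
    alpha = 1.0 - level
    lower, upper = np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)


# Synthetic stand-in data (ASSUMPTION: made-up values, not the real measurements).
rng = np.random.default_rng(0)
log_radon = np.concatenate([rng.normal(1.4, 0.8, 200), rng.normal(0.9, 0.8, 60)])
floor = np.concatenate([np.zeros(200, dtype=int), np.ones(60, dtype=int)])

# Estimate the mean and a 90% interval separately for each value of floor.
results = {}
for level_of_house in (0, 1):
    x = log_radon[floor == level_of_house]
    results[level_of_house] = (mle_mean(x), bootstrap_ci(x, level=0.90, rng=rng))

for level_of_house, (estimate, (lower, upper)) in results.items():
    print(f"floor={level_of_house}: mean ~ {estimate:.3f}, 90% CI ~ ({lower:.3f}, {upper:.3f})")
```

Because the simplified log-likelihood is quadratic in the mean, the minimizer should agree with the sample mean of each group up to numerical tolerance, which makes a convenient sanity check on your own code.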