1. Pick a dataset (CSV file) from my GitHub repository named data. Most files have READMEs as .txt files.¹ Perform a short analysis on a single numerical variable of your choice. Your analysis should include:

    1. A sentence or two, in your own words (i.e., not copied directly from the README), explaining what the dataset is about and which variable you will investigate in your analysis.

    2. A well-labeled plot of your variable, units and all. Put axis labels on your plot by using bp.labels(...). You can put \(\LaTeX\) in your plot labels if you call bp.LaTeX() first.

    3. A point estimate of the population mean. Use SciPy’s function minimize(...) along with the simplified log-likelihood for the Normal random variable to estimate the population mean.

    4. Write one complete English sentence explaining the value you just found, in context of the data.

    5. Use the bootstrap method to produce a confidence interval at a confidence level of your choice.

    6. Write one complete English sentence describing the confidence interval you just found, in context of the data.

    7. Add to your existing plot, or make a separate well-labeled plot, so that it includes a visualization of your analysis.

  2. Use the dataset climate to perform a short analysis using the paired data model. Provide an answer to the question: Has the average North American temperature increased over time?

    Your analysis should include all the same components as above, plus a justification for your choice of a paired data model, i.e., what makes these data paired, or what are they paired on? Please note that the code is not terribly different from the code above, so my main concern is your ability to interpret your data and your analysis in the context of the dataset.
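
As a rough illustration of the mechanics behind the point estimate and the bootstrap interval, here is a minimal sketch in Python. It is not the required solution. It runs on synthetic data rather than any file from the data repository. The helper names (simplified_normal_loglik, estimate_mean, bootstrap_ci) and the choices of 90% confidence and 1000 resamples are assumptions made for illustration. "Simplified" is taken here to mean the Normal negative log-likelihood with everything that does not depend on the mean dropped, which leaves a sum of squared deviations. The last few lines apply the same two helpers to the differences of a made-up pair of measurements, since a paired analysis is commonly run on per-pair differences.

```python
import numpy as np
from scipy.optimize import minimize


def simplified_normal_loglik(mu, x):
    # Negative Normal log-likelihood as a function of mu only, with the
    # terms and positive factors that do not involve mu removed.
    # (Assumption: this is what "simplified" refers to.)
    return np.sum((x - mu) ** 2)


def estimate_mean(x):
    # Point estimate of the population mean by numerical minimization.
    x = np.asarray(x, dtype=float)
    result = minimize(simplified_normal_loglik, x0=[np.median(x)], args=(x,))
    return result.x[0]


def bootstrap_ci(x, level=0.90, n_resamples=1000, seed=0):
    # Percentile bootstrap: resample with replacement, re-estimate the mean
    # on each resample, then take the middle `level` of the estimates.
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_resamples)
    for r in range(n_resamples):
        resample = rng.choice(x, size=x.size, replace=True)
        estimates[r] = estimate_mean(resample)
    alpha = 1.0 - level
    return np.percentile(estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)])


# Synthetic stand-in for a single numerical variable (hypothetical data).
rng = np.random.default_rng(42)
x = rng.normal(loc=10.0, scale=2.0, size=200)

mu_hat = estimate_mean(x)
lower, upper = bootstrap_ci(x)
print("estimated mean:", mu_hat)
print("90% bootstrap interval:", lower, upper)

# Synthetic stand-in for paired data: two measurements on the same units.
before = rng.normal(loc=5.0, scale=1.0, size=150)
after = before + rng.normal(loc=0.5, scale=1.0, size=150)
diff = after - before  # one difference per pair

diff_hat = estimate_mean(diff)
diff_lower, diff_upper = bootstrap_ci(diff)
print("estimated mean difference:", diff_hat)
print("90% bootstrap interval:", diff_lower, diff_upper)
```

Because the simplified log-likelihood is a sum of squares, the minimizer should land on the sample mean, which makes a handy sanity check. In a paired analysis, an interval for the mean difference that sits entirely above zero would be evidence of an increase.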


  1. If there isn’t an associated README, consider helping me out by writing one and filing a PR.