MATH 351 Homework 04
Welford's online algorithm for computing the mean and variance work with just one data point at a time, as if the data were streaming in and you didn't have the memory or ability to store the past data.
I prefer to write the math down a bit differently, but to the same effect. Initialize . Update these variables with a new observation
Write a Python class called
OnlineMoments
, which implements the following API:om = OnlineMoments() om.update(1) om.update(2) om.update(3) om.mean() om.var()
Ensure your class passess the following tests:
om = OnlineMoments() om.update(1) om.update(1) om.update(1) assert om.mean() == 1 assert om.var() == 0
import numpy as np om = OnlineMoments() om.update(1) om.update(2) om.update(3) assert om.mean() == 2 assert np.isclose(om.var(), np.var([1, 2, 3]))
Why did we need to use the Numpy function isclose? Read the Python webpage Floating Point Arithmetic: Issues and Limitations most of which is not specific to Python.
Read the subsections how does floating point work? and a variance calculation gone wrong on the incredibly informative webpage Examples of floating point problems by Julia Evans.
- Write one more test of your class
OnlineMoments
using the more challenging example provided on Julia's webpage in the subsection Examples of floating point problems. Hint: you'll have to write a for-loop to make your class methods fit Julia's example.