MATH 314 Homework 04

Due 2025-09-24 by 11:59pm

Uniform sampling from a stream of numbers, where you don't have the availability (memory or otherwise) to store more than one of the numbers at a time, is a clever trick.

  1. Write a Python class called OnlineUniformSampler which implements the following API:

    ou = OnlineUniformSampler(95928)
    ou.update(1)
    ou.update(2)
    ou.update(3)
    ou.sample()
    ou.count()
    

    The class method sample() should return one, uniformly chosen, of the values input to the possibly many calls of update().

    The class method update() should implement an algorithm based on the following description. For the first value passed to update(), store it.

    For -th call to update(x_i), replace the stored value with the passed in value x_i with probability 1/i. Not all calls to update() will successfully replace the stored value.

    To replace a stored value with the -th value of with a probability , use code like

    rng = np.random.default_rng(seed)
    if rng.uniform() <= p:
        u = x
    

    where the variable rng should be re-used within this class.

  2. Provide code to reasonable test your class.

As was requested last class, the theory for this algorithm is explained (in my opinion) well on this blog post.