MATH 314 Homework 04

Due 2026-02-10 by 11:59pm

Uniform sampling from a stream of numbers, where you don't have the availability (memory or otherwise) to store more than one of the numbers at a time, is a clever trick.

  1. Write a Python class called OnlineUniformSampler which implements the following API:

    ou = OnlineUniformSampler(95928)
    ou.update(1)
    ou.update(2)
    ou.update(3)
    ou.sample()
    ou.count()
    

    The class method sample() should return one, uniformly chosen, of the values input to the possibly many calls of update().

    The class method update() should implement an algorithm based on the following description. For the first value passed to update(), store it.

    For -th call to update(x_i), replace the stored value with the passed in value x_i with probability 1/i. Not all calls to update() will successfully replace the stored value.

    To replace a stored value with the -th value of with a probability , use code like

    rng = np.random.default_rng(seed)
    if rng.uniform() <= p:
        u = x
    

    where the variable rng should be re-used within this class.

    The method count() should return the number of times the method update() was called.

  2. Provide code to reasonably test your class.

If you are interested, the theory for this algorithm is explained (in my opinion) well on this blog post.