In [65]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as sp

Probability is an expectation

$$\mathbb{E}[1_A(X)] = \sum_{x \in S} 1_A(x) f(x) = \sum_{x \in A} f(x) = \mathbb{P}[X \in A]$$

This has two consequences:

* probability, like all expectations, is defined in the limit
* probability, like most expectations, can be approximated with a mean of random data

What's the probability of rolling an even number from a fair die, $\mathbb{P}[X \in \{2, 4, 6\}]$ where $X \sim \text{Uniform}(1, 6)$?

In [66]:
U = sp.randint(1, 7) # U(1, 6)
U.pmf(3)

0.16666666666666666

In [67]:
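a = np.arange(1, 7)
# np.sum(U.pmf(a % 2 == 0)) # right, for the wrong reason: the boolean mask is
# treated as the values 0 and 1, so this sums pmf(0) = 0 and pmf(1) = 1/6 three times each
np.sum(U.pmf(a[a % 2 == 0])) # right, for the right reason: the mask indexes a, giving pmf at 2, 4, 6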

0.5

In [68]:
np.sum(U.pmf(np.arange(2, 7, 2)))

0.5
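The identity at the top of this section can also be evaluated term by term: multiply the indicator $1_A(x)$ by the pmf $f(x)$ over the whole support $S$ and sum. A minimal sketch, where the names `support`, `indicator`, and `expectation` are introduced here only for illustration:

```python
import numpy as np
import scipy.stats as sp

U = sp.randint(1, 7)             # discrete uniform on {1, ..., 6}
support = np.arange(1, 7)        # S = {1, ..., 6}
indicator = support % 2 == 0     # 1_A(x) for A = {2, 4, 6}

# E[1_A(X)] = sum over x in S of 1_A(x) * f(x)
expectation = np.sum(indicator * U.pmf(support))
print(expectation)
```

The terms with $1_A(x) = 0$ drop out, which is why this agrees with summing the pmf over $A$ alone.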

We can approximate this probability by generating random data from the Uniform(1, 6) distribution and calculating the appropriate mean. Since probability is an expectation of an indicator function, we need a mean of an indicator function.

In [69]:
rng = np.random.default_rng()
x = rng.integers(1, 7, size = 10000) # only integers in {1, 2, 3, 4, 5, 6}
np.mean(x % 2 == 0) # so this checks for data in {2, 4, 6}

0.4985
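How close the mean gets to 0.5 depends on how much data we generate. The sketch below repeats the approximation for a few sample sizes; the seed, the list `sizes`, and the dictionary `estimates` are assumptions made here so the example is repeatable, not part of the cells above.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only so the sketch is repeatable
sizes = [100, 10_000, 1_000_000]
estimates = {}
for n in sizes:
    x = rng.integers(1, 7, size=n)       # n rolls of a fair die
    estimates[n] = np.mean(x % 2 == 0)   # mean of the indicator of {2, 4, 6}
    print(n, estimates[n])
```

With more data, the estimates should tend to settle closer to the exact value 0.5, although any single run can wander.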

If we change the distribution, then the probability might change, but the way we approximate the probability stays largely the same: take a mean of an indicator function.

In [70]:
B = sp.binom(10, 0.1)
np.sum(B.pmf(np.arange(2, 7, 2)))

0.20500828650000025

In [71]:
y = rng.binomial(10, 0.1, size = 10000) # integers in {0, ..., 10}

N.B. Because I (unintentionally) changed the values that the random variable can take, from $\{1, \ldots, 6\}$ to $\{0, 1, \ldots, 10\}$, mod 2 no longer picks out the same event: it would also count 0, 8, and 10. To keep the event $\{2, 4, 6\}$ (the even numbers from a fair die), we must check membership in that set directly.

In [72]:
b = (y == 2) | (y == 4) | (y == 6)
b

array([False, False, False, ..., False,  True, False])

In [73]:
y

array([1, 3, 1, ..., 0, 2, 0])

In [74]:
np.mean(b)

0.214

This mean, 0.214, approximates the exact value $\mathbb{P}[X \in \{2, 4, 6\}] \approx 0.205$ when $X \sim \text{Binomial}(10, 0.1)$.
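Since the recipe is the same for any distribution and any set $A$, it can be wrapped in a small helper. This is a sketch under stated assumptions: `approx_prob` is a hypothetical name introduced here, and it uses `np.isin` for the membership check instead of chaining `==` comparisons with `|`.

```python
import numpy as np

def approx_prob(samples, A):
    """Approximate P[X in A] by the mean of the indicator 1_A over samples."""
    return np.mean(np.isin(samples, list(A)))

rng = np.random.default_rng(0)           # seeded only so the sketch is repeatable
y = rng.binomial(10, 0.1, size=10_000)   # integers in {0, ..., 10}
p_hat = approx_prob(y, {2, 4, 6})
print(p_hat)
```

Passing the set explicitly avoids the mod 2 pitfall noted above, because the event no longer depends on which values the random variable happens to take.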