View on GitHub

memo

Hypergeometric Distribution

Hypergeometric Distribution

p.d.f. of hypergeometric function is defined by

\[\begin{eqnarray} k \in \{0, 1, \ldots, N\}, \ \mathrm{Hyper}(k; N, K, n) & := & \frac{ \left( \begin{array}{c} K \\ k \end{array} \right) \left( \begin{array}{c} N - K \\ n - k \end{array} \right) }{ \left( \begin{array}{c} N \\ n \end{array} \right) } \end{eqnarray}\]

c.d.f.

\(\begin{eqnarray} F_{\mathrm{Hyper}}(k; N, K, n) & := & \sum_{i=0}^{k} \mathrm{Hyper}(i; N, K, n) \end{eqnarray}\)

\[\begin{eqnarray} \sum_{i=k + 1}^{N} \mathrm{Hyper}(i; N, K, n) & = & 1 - F_{\mathrm{Hyper}}(k; N, K, n) \end{eqnarray}\]

Example

Condition treatment group control group total
YES $k$ $K − k$ $K$
NO $n − k$ $N + k − n − K$ $N − K$
total $n$ $N − n$ $N$

Suppose that $k$ is given by a value of some r.v. $X$. Then p.d.f of $X$ follows $\mathrm{Hyper}(k; N, K, n)$. Indeed, there are ${K \choose k}$ patterns to become YES state, ${N - K \choose n - k }$ patterns to become NO state and ${ N \choose n}$ total patterns to be in treatment group from $N$.

Proposition

\[\forall k \in \{0, 1, \ldots, N\}, \ \max(0, n + K - N) \le k \le \min(K, n) \Rightarrow \mathrm{Hyper}(k; N, K, n) \ge 0 .\]

proof

Reference