memo

Chapter3-02. Sufficinecy and completeness

3.2 Sufficinecy and completeness

3.2 Sufficient statistics

$(\mathcal{X}, \mathcal{A})$,
- measurable sp.
$(\mathcal{T}, \mathcal{B})$,
- measurable sp.
$\Theta \subseteq \mathbb{R}$,
$\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}$,
- family of measure over $(\mathcal{X}, \mathcal{A})$,
$T: \mathcal{X} \rightarrow \mathcal{T}$
- statistics

$T$ is said to be sufficient if

\[\forall A \in \mathcal{A}, \ \exists q_{A}:\mathcal{T} \rightarrow \mathbb{R}: \mathcal{B} \text{-measurable } \ \text{ s.t. } \ \forall B \in \mathcal{B}, \ \theta \in \Theta, \ \int_{B} q_{A}(t) \ d P_{\theta}^{T}(dt) = P_{\theta}(A \cap T^{-1}(B))\]

■

Remark

The definition of sufficiency is interpreted as the family of distributions can be expressed by the measurable funciton which does not depend on $\theta$. In other words, a statistics $T$ has sufficient information to determine a $\theta$. The definition of sufficient statistics is equivalent to

\[\forall A \in \mathcal{A}, \ \exists q_{A}: \mathcal{T} \rightarrow \mathbb{R}: \mathcal{B} \text{-measurable } \ \text{ s.t. } \ q_{A}(t) = P_{\theta}(A \mid T = t) \ t \text{-a.e.}\]

■

Example 3.1 Bernoulli trial

$\mathcal{X} := \{0, 1\}^{n}$,
$\mathcal{A} := 2^{\mathcal{X}}$,
$\Theta := [0, 1]$,
$\mathcal{T} := \{0, 1, \ldots, n\}$,
$\mathcal{B} := 2^{\mathcal{T}}$,

We define p.d.f. of this trials as follows.

\[\theta \in \Theta, \ x \in \mathcal{X}, \ p_{\theta}(x) := \theta^{\sum_{i=1}^{n} x_{i}} (1 - \theta)^{n - \sum_{i=1}^{n} x_{i}},\]

Corresponding probability measure is

\[A \in \mathcal{A} \ P_{\theta}(A) := \sum_{x \in A} p_{\theta}(x).\]

We show that statistic

\[T(x) := \sum_{i=1}^{n} x_{i}\]

is sufficient. To show that, we need to confirm the equation of definition of sufficiency. It is easy to see that

\[\begin{eqnarray} P_{\theta}(\{x\} \cap T^{-1}(\{t\}) = \begin{cases} \theta^{t}(1 - \theta)^{n - t} & x \in T^{-1}(\{t\}) \\ 0 & x \notin T^{-1}(\{t\}) \end{cases}. \label{example_sufficiency_lhs} \end{eqnarray}\]

Hence

\[\begin{eqnarray} \forall A \in \mathcal{A}, \ P_{\theta}(A \cap T^{-1}(\{t\}) & = & P_{\theta}(A \cap T^{-1}(\{t\})) \nonumber \\ & = & \sum_{x \in A} P_{\theta}(\{x\} \cap T^{-1}(\{t\})) \nonumber \\ & = & |A \cap T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]

In particular, if we take $A$ as $\mathcal{X}$, we have

\[\begin{eqnarray} P_{\theta}(T^{-1}(\{t\}) & = & |T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & |\{x \in \mathcal{X} \mid \sum_{i=1}^{n} x_{i} = t \} | \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & \left( \begin{array}{c} n \\ t \end{array} \right) \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]

For evey $A$ we define $\mathcal{B}$ measurable function $q_{A}$ by

\[\begin{eqnarray} r(x, t) := \begin{cases} \left( \begin{array}{c} n \\ t \end{array} \right)^{-1} & (T(x) = t) \\ 0 & \text{otherwise} \end{cases} \nonumber \end{eqnarray}\] \[q_{A}(t) := \sum_{x \in A} r(x, t) = |A \cap T^{-1}(\{t\})| \left( \begin{array}{c} n \\ t \end{array} \right)^{-1}\]

Now we show that $q_{A}$ satisfies the equality of the definition of sufficiency. LHS of the equality is

\[\begin{eqnarray} \int_{B} q_{A}(t) P_{\theta}^{T}(dt) & = & \sum_{t \in B} q_{A}(t) P_{\theta}^{T}(\{t\}) \nonumber \\ & = & \sum_{t \in B} q_{A}(t) \left( \begin{array}{c} n \\ t \end{array} \right) \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & \sum_{t \in B} |A \cap T^{-1}(\{t\})| \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]

RHS of the equality is

\[\begin{eqnarray} P_{\theta}(A \cap T^{-1}(B)) & = & \sum_{t \in B} P_{\theta}(A \cap T^{-1}(\{t\})) \nonumber \\ & = & \sum_{t \in B} |A \cap T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \end{eqnarray}\]

■

Proposition 3.3

$\{P_{\theta}\}_{\theta \in \Theta}$,
$f: \mathcal{X} \rightarrow \mathbb{R}$,
- measurable
- for all $\theta \in \Theta$, $P_{\theta}$-integrable
$T: \mathcal{X} \rightarrow \mathcal{T}$,
- sufficient statistics

Then there exists measurable function $g:\mathcal{T} \rightarrow \mathbb{R}$l such that

\[\forall \theta \in \Theta, \ g(t) = \mathrm{E}_{\theta} \left[ f \mid T = t \right] \quad P_{\theta}^{T}\text{-a.s.}\]

proof

$\Box$

Definition 3.11 Complete

$(\mathcal{X}, \mathcal{A})$,
- measurable space
$\Theta$
- parameter sp.
${P_{\theta}}_{\theta \in \Theta}$
- a family of probability distribution over $(\mathcal{X}, \mathcal{A})$.

$\mathcal{P}$ is said to be (boundedly) complete if for all (bounded) measurable function $f: \mathcal{X} \rightarrow \mathbb{R}$,

\[\begin{eqnarray} \mathrm{E}_{\theta} \left[ f \right] = 0 \ (\forall \theta \in \Theta) \ \Rightarrow \ \forall \theta \in \Theta f = 0 \ P_{\theta} \text{-a.s.} , \end{eqnarray}\]

■

Remark

If $f \ge 0$, the above statemet always holds.

■

Definition 3.12 complete map

$(\mathcal{T}, \mathcal{B})$,
- measurable sp.
$f: \mathcal{X} \rightarrow \mathcal{T}$,
- measurable map

$T$ is said to be (boundedly) complete if $\{P_{\theta}^{T}\}_{\theta \in \Theta}$ is (boundedly) complete.

■

Proposition 3.3

$\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}$,
$f: \mathcal{X} \rightarrow \mathbb{R}$,
- $\forall \theta \in \Theta$, $P_{\theta}$-integrable
$T: \mathcal{X} \rightarrow \mathbb{R}$,
- sufficient with respect to $\mathcal{P}$,

Then there exists measurable function $g: \mathcal{T} \rightarrow \mathbb{R}$ such that

\[\forall \theta \in \Theta, \ g(t) = \mathrm{E}_{\theta} \left[ f \mid T = t \right] \quad P_{\theta}^{T} \text{-a.s.}\]

proof

$\Box$

Proposition 3.4

$\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}$,
$f: \mathcal{X} \rightarrow \mathbb{R}$,
- $\forall \theta \in \Theta$, $P_{\theta}$-integrable
$T: \mathcal{X} \rightarrow \mathbb{R}$,
- sufficient with respect to $\mathcal{P}$,

Then there exists $f^{\prime}: \mathcal{X} \rightarrow \mathbb{R}$ such that

\[\forall \theta \in \Theta, \ f^{\prime}(x) = \mathrm{E}_{\theta} \left[ f \mid T \right] \quad P_{\theta} \text{-a.s.}\]

proof

$\Box$