memo

Chapter2-06. Characterization of Sufficiency

2.6 Characterization of Sufficiency

$\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}$,
- sample sp. $(\mathcal{X}, \mathcal{A})$上の確率測度の族
$T: \mathcal{X} \rightarrow \mathcal{T}$
- 統計量

Definition. sufficient

$T$がsufficientであるとは、

\[\forall A \in \mathcal{A}, \ \exists q(A \mid t):\mathcal{T} \rightarrow \mathbb{R}: \mathcal{B} \text{-measurable } \ \text{ s.t. } \ \forall B \in \mathcal{B}, \ \theta \in \Theta, \ \int_{B} q(A \mid t) \ d P_{\theta}^{T}(dt) = P_{\theta}(A \cap T^{-1}(B))\]

がなりたつことを言う。

■

統計量が十分とは、分布族を$\theta$に依存しない可測関数で表現できるということである。つまり、統計量$T$が$\theta$を特定するのに十分な情報を持っているということ。上の定義は以下と同値である。

\[\forall A \in \mathcal{A}, \ \exists q(A \mid \cdot): \mathcal{T} \rightarrow \mathbb{R}: \mathcal{B} \text{-measurable }, \ \text{ s.t. } \ q(A \mid t) = P_{\theta}(A \mid T = t) \ t \text{-a.e.}\]

Example 2.4.1からorder statisticsはsufficient statisticである。

Example. Bernoulli trial

$\mathcal{X} := \{0, 1\}^{n}$,
$\mathcal{A} := 2^{\mathcal{X}}$,
$\Theta := [0, 1]$,
$\mathcal{T} := \{0, 1, \ldots, n\}$,
$\mathcal{B} := 2^{\mathcal{T}}$,

We define p.d.f. of this trials as follows.

\[\theta \in \Theta, \ x \in \mathcal{X}, \ p_{\theta}(x) := \theta^{\sum_{i=1}^{n} x_{i}} (1 - \theta)^{n - \sum_{i=1}^{n} x_{i}},\]

Corresponding probability measure is

\[A \in \mathcal{A} \ P_{\theta}(A) := \sum_{x \in A} p_{\theta}(x).\]

We show that statistic

\[T(x) := \sum_{i=1}^{n} x_{i}\]

is sufficient. To show that, we need to confirm the equation of definition of sufficiency. It is easy to see that

\[\begin{eqnarray} P_{\theta}(\{x\} \cap T^{-1}(\{t\}) = \begin{cases} \theta^{t}(1 - \theta)^{n - t} & x \in T^{-1}(\{t\}) \\ 0 & x \notin T^{-1}(\{t\}) \end{cases}. \label{example_sufficiency_lhs} \end{eqnarray}\]

Hence

\[\begin{eqnarray} \forall A \in \mathcal{A}, \ P_{\theta}(A \cap T^{-1}(\{t\}) & = & P_{\theta}(A \cap T^{-1}(\{t\})) \nonumber \\ & = & \sum_{x \in A} P_{\theta}(\{x\} \cap T^{-1}(\{t\})) \nonumber \\ & = & |A \cap T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]

In particular, if we take $A$ as $\mathcal{X}$, we have

\[\begin{eqnarray} P_{\theta}(T^{-1}(\{t\}) & = & |T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & |\{x \in \mathcal{X} \mid \sum_{i=1}^{n} x_{i} = t \} | \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & \left( \begin{array}{c} n \\ t \end{array} \right) \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]

For evey $A$ we define $\mathcal{B}$ measurable function $q_{A}$ by

\[\begin{eqnarray} r(x, t) := \begin{cases} \left( \begin{array}{c} n \\ t \end{array} \right)^{-1} & (T(x) = t) \\ 0 & \text{otherwise} \end{cases} \nonumber \end{eqnarray}\] \[q_{A}(t) := \sum_{x \in A} r(x, t) = |A \cap T^{-1}(\{t\})| \left( \begin{array}{c} n \\ t \end{array} \right)^{-1}\]

Now we show that $q_{A}$ satisfies the equality of the definition of sufficiency. LHS of the equality is

\[\begin{eqnarray} \int_{B} q_{A}(t) P_{\theta}^{T}(dt) & = & \sum_{t \in B} q_{A}(t) P_{\theta}^{T}(\{t\}) \nonumber \\ & = & \sum_{t \in B} q_{A}(t) \left( \begin{array}{c} n \\ t \end{array} \right) \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & \sum_{t \in B} |A \cap T^{-1}(\{t\})| \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]

RHS of the equality is

\[\begin{eqnarray} P_{\theta}(A \cap T^{-1}(B)) & = & \sum_{t \in B} P_{\theta}(A \cap T^{-1}(\{t\})) \nonumber \\ & = & \sum_{t \in B} |A \cap T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \end{eqnarray}\]

■

Theorem 2.6.1

$\mathcal{X}$
- Euclidian sp.
$T$が$\mathcal{P}$について、十分とする

このとき、$\theta$に依存しない可測関数$g:\mathcal{T} \rightarrow [0, 1]$が存在して、

\[g(t) = P_{\theta}(A \mid T = t)\]

となる。

proof.

Theorem 2.5.1と同様の議論をすれば良い。

$\Box$

この定理はもう少し一般的な形で成り立つ。詳細は数理統計学の命題3.3.をみる。

さて、Bernoulli trialの例で見たように、定義に基づく十分性の証明は煩雑である。十分性の判定が容易となる同値な条件を与える。まず必要な用語を定義する。

Definition. domination, absolutely continuous

$(\Omega, \mathcal{F}, P)$,
$(\mathcal{X}, \mathcal{A})$,
- measurable sp.
$\mathcal{P} := \{P_{\theta} \mid \theta \in \Theta\}$,
$\mu: \mathcal{A} \rightarrow [0, \infty)$,
- $\sigma$-finite measure over $(\mathcal{X}, \mathcal{A})$,

$P_{\theta}$ is absolutely continuous with respect to $\mu$ or dominated by $\mu$ if

\[\forall A \in \mathcal{A}, \ \mu(A) = 0 \Rightarrow P_{\theta}(A) = 0 .\]

Then we denote this as $P_{\theta} \ll \mu$. We extend this definition to set of measures. $\mathcal{P}$ is dominated by $\mu$ if $\mu$ satisfies

\[\begin{eqnarray} & & \forall \theta \in \Theta, \ \forall A \in \mathcal{A}, \ \mu(A) \Rightarrow P_{\theta}(A) = 0 \nonumber \\ & \Leftrightarrow & \forall \theta \in \Theta, \ P_{\theta} \ll \mu . \nonumber \end{eqnarray}\]

■

Lemma

$\mu$
- $\sigma$-finite measure
$\mathcal{P} \ll \mu$,

Then

\[\exists \{P_{n}\}_{n \in \mathbb{N}} \subseteq \mathcal{P} \text{ s.t. } \mathcal{P} \ll P_{*} := \sum_{i = 1}^{\infty} 2^{-n}P_{n},\]

proof.

With out loss of generality, we assume that $\mu$ is finite measure. Indeed, if we assume that the statement holds for finite measure and $\mu$ is $\sigma$-finite measure, then there exists $\{B_{n}\}_{n \in \mathbb{N}}$ such that

\[\mathcal{X} = \sum_{i \in \mathbb{N}} B_{n}, \ \mu(B_{n}) < \infty .\]

Here we define finite measure $\mu^{*}$ by

\[\mu^{*} := \sum_{n \in \mathbb{N}} \frac{ \mu(A \cap B_{n}) }{ 2^{n}\mu(B_{n}) } 1_{\{\mu(A_{n}) \neq 0\}} .\]

Then $\mathcal{P} \ll \mu^{*}$. Indeed,

\[\mu^{*}(A) = 0 \Leftrightarrow \forall n \in \mathbb{N}, \ \mu(A \cap B_{n}) = 0\]

and

\[\begin{eqnarray} \forall P \in \mathcal{P}, \ P(A) & = & P(A \cap \sum_{n \in \mathbb{N}} B_{n}) \nonumber \\ & = & \sum_{n \in \mathbb{N}} P(A \cap B_{n}) . \nonumber \end{eqnarray}\]

Hence by $\mathcal{P} \ll \mu$, we obtain

\[\begin{eqnarray} \mu^{*}(A) = 0 & \Rightarrow & \forall n \in \mathbb{N}, \ \mu(A \cap B_{n}) = 0 \nonumber \end{eqnarray}\]

$\mu^{*}$ is finite measure so that we obtain $P_{*}$ by applying the statement for finite measure to $\mu^{*}$.

We assume $\mu$ is finite measure. Now we define

\[\begin{eqnarray} & & P \in \mathcal{P}, \ f_{P} := \frac{ d P }{ d \mu } \nonumber \\ & & S_{P} := \{ x \in \mathcal{X} \mid f_{P}(x) > 0 \} . \nonumber \end{eqnarray}\]

Radon-Nikodym derivative is $\mathcal{A}$ measurable so that $S_{P} \in \mathcal{A}$. And $S_{P} \neq \emptyset$ since $S_{P} = \emptyset$ implies

\[\begin{eqnarray} P(\mathcal{X}) & = & \int_{\mathcal{X}} f_{P}(x) \ dx \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]

Claim(a):

\[\begin{equation} P(A) = 0 \Leftrightarrow \mu(A \cap S_{P}) = 0 \label{lemma_03_26_step_b} \end{equation}\]

To prove the claim, for all $P \in \mathcal{P}$, we have

\[\begin{eqnarray} P(A) & = & \int_{A} f_{P}(x) \ dx \nonumber \\ & = & \int_{A} f_{P}(x) \ \mu(dx) \\ & = & \int_{A \cap S_{P}} f_{P}(x) \ \mu(dx) \nonumber \end{eqnarray}\]

This implies that

\[\begin{eqnarray} P(A) = 0 & \Leftrightarrow & 1_{A \cap S_{P}}(x) = 0 \ \mu \text{-a.e.} \nonumber \\ & \Leftrightarrow & \mu(A \cap S_{P}) = 0 . \nonumber \end{eqnarray}\]

Hence the claim is proved.

\[\mathcal{C} := \{ S \in \mathcal{A} \mid \exists \{P_{n}\}_{n \in \mathbb{N}} \in \mathcal{P}, \ S = \cup_{n \in \mathbb{N}} S_{P_{n}} \}\]

$\mathcal{C}$ is not empy set. For all $P \in \mathcal{P}$, $S_{P} \in \mathcal{A}$ and we take a sequence by $P_{n} := P$,

\[S_{P} = \bigcup_{n \in \mathbb{N}} S_{P_{n}} .\]

Moreover it is easy to check that $\mathcal{C}$ is closed with respect to countable sum since finite cartesian product of countable sets is stil countlable sets. We will show that there is the maximum value in $\{\mu(S) \mid S \in \mathcal{C}\}$. Since $\mu$ is finite measure by our assumption,

\[D := \sup \{ \mu(S) \mid S \in \mathcal{C} \} < \infty .\]

We can take a sequence $\{S^{i}\}_{i \in \mathbb{N}} \subseteq \mathcal{C}$ such that

\[\mu(S^{i}) \nearrow D \ (i \rightarrow \infty) .\]

Let $S^{*} := \cup_{i \in \mathbb{N}} S^{i}$. Obviously, $\mu(S^{*}) \le D$ by definiton of $D$. Additionally, by monotonicity of measure

\[\forall i \in \mathbb{N}, \ \mu(S^{i}) \le \mu(S_{*})\]

Hence by taking the limit both side,

\[D \le \mu(S_{*}) .\]

Therefore $D = \mu(S^{*})$ and $S^{*} \in \mathcal{C}$.

Now we construct $P_{*}$ which satisfies the equation in the statement. $S^{*} \in \mathcal{C}$ so that we can take a sequence $\{P_{n}^{*}\}_{n \in \mathbb{N}}$ such that $S = \cup_{n \in \mathbb{N}}S_{P_{n}^{*}}$. We define $P_{*} := \sum_{n \in \mathbb{N}} 2^{-n}P_{n}^{*}$.

Claim: $\mathcal{P} \ll P_{*}$.

Let $A \in \mathcal{A}$ be fixed. Suppose $P_{*}(A) = 0$. We have

\[\begin{eqnarray} \forall P \in \mathcal{P}, \ P(A) & = & P(A \cap S^{*}) + P(A \cap (S^{*})^{c}) \nonumber \\ & \le & P(A \cap S^{*}) + P((S^{*})^{c}) . \label{lemma_03_26_inequality_for_claim} \end{eqnarray}\]

We only need to show

(1) $P(A \cap S_{*}) = 0$,
(2) $P((S^{*})^{c}) = 0$.

For (1),

\[\begin{eqnarray} P_{*}(A) = 0 & \Rightarrow & P_{n}(A) = 0 \ (\forall n \in \mathbb{N}) \quad (\because \text{definition of } P_{*}) \nonumber \\ & \Rightarrow & \mu(A \cap S_{P_{n}}) = 0 \ (\forall n \in \mathbb{N}) \quad (\because \eqref{lemma_03_26_step_b}) \nonumber \\ & \Rightarrow & \mu(A \cap S_{*}) = 0 \nonumber \\ & \Rightarrow & P(A \cap S_{*}) = 0 \ (\forall P \in \mathcal{P}) \quad (\because \mathcal{P} \ll \mu) \nonumber \end{eqnarray}\]

For (2), we show that $\mu((S^{*})^{c} \cap S_{P}) = 0 \ (\forall P \in \mathcal{P})$. If we prove this, by $\eqref{lemma_03_26_step_b}$ we obtain $P((S^{*})^{c}) = 0$. Suppose that there exists $P \in \mathcal{P}$ such that $\mu((S^{*})^{c} \cap S_{P}) > 0$, then

\[\begin{eqnarray} \mu(S^{*} \cup S_{P}) & = & \mu((S^{*} \cup S_{P}) \cap S^{*}) + \mu((S^{*} \cup S_{P}) \cap (S^{*})^{c}) \nonumber \\ & = & \mu(S^{*}) + \mu((S^{*})^{c} \cap S_{P}) \nonumber \\ & > & \mu(S^{*}) \end{eqnarray}\]

This is contradictition since $\mu(S^{*})$ is the maximum value in $\mathcal{C}$.

$\Box$

We prove equivalent condition of sufficiency. This is known as factorization theorem.

Thereom 2.6.2.

$\mathcal{P} := \{P_{\theta} \mid \theta \in \Theta\}$,
$\mu:\mathcal{A} \rightarrow [0, \infty)$,
- $\sigma$-finite measure

\[\mathcal{P} \ll \mu\]

Then the following statemets are equivalent:

(a) $T:\mathcal{X} \rightarrow \mathcal{T}$ is sufficient for $\mathcal{P}$,
(b) there exist $g_{\theta}$ nonnegative $\mathcal{B}$ measurable function and $h_{\theta}$ nonnegative $\mathcal{B}$ measurable function such that

\[\begin{equation} \forall \theta \in \Theta, \ \frac{ d P_{\theta} }{ d \mu } (x) = g_{\theta}(T(x)) h_{\theta}(x) \ \mu \text{-a.e.} \label{factorization_theorem_factorization} \end{equation}\]

proof.

(a) $\Rightarrow$ (b) By definition of sufficiency,

\[\begin{eqnarray} \forall A \in \mathcal{A}, \ \exists q_{A}(t) \ \text{ s.t. } \ \forall B \in \mathcal{B}, \ \int_{B} q_{A}(t) \ dP_{\theta}^{T}(dt) & = & P_{\theta}(A \cap T^{-1}(B)) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A \cap T^{-1}(B)}(x) P_{\theta}(dx) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A}(x)1_{B}(T(x)) P_{\theta}(dx) \nonumber \end{eqnarray}\]

Since $\mathcal{P} \ll \mu$, by lemma, there exists a sequence $\{\theta_{n}\}_{n \in \mathbb{N}}$ such that

\[\mathcal{P} \ll P_{*} := \sum_{n \in \mathbb{N}} 2^{-n}P_{\theta_{n}} .\]

We have

\[\begin{eqnarray} \forall n \in \mathbb{N}, \ \int_{B} q_{A}(t) \ \left( \sum_{i=1}^{n} 2^{-i} P_{\theta_{i}}^{T}(dt) \right) & = & \int_{\mathcal{X}} 1_{A}(x)1_{B}(T(x)) \left( \sum_{i=1}^{n} 2^{-i} P_{\theta_{i}}(dx) \right) . \end{eqnarray}\]

By taking the limit of the above equation as $n \rightarrow \infty$, we obtain

\[\begin{eqnarray} \int_{B} q_{A}(t) \ dP_{*}^{T}(dt) & = & \int_{\mathcal{X}} 1_{A}(x)1_{B}(T(x)) P_{*}(dx) . \label{theorem_02_06_02_equation_02} \end{eqnarray}\]

Hence for all $P_{*}^{T}$ integrable function $g$,

\[\begin{eqnarray} \int_{\mathcal{T}} q_{A}(t) g(t) \ dP_{\theta}^{T}(dt) & = & \int_{\mathcal{X}} 1_{A}(x)g(T(x)) P_{\theta}(dx) . \label{theorem_02_06_02_equation} \end{eqnarray}\]

This can be proved by standard procedure for Lebesgue integrable function. We only give short sketch of this proof. For all $P_{*}^{T}$ integrable function $g$ can be approximated by linear sum of indicator function, say $\sum_{i=1}^{N} a_{i} 1_{B_{i}}(t)$ where $a_{i} \in \mathcal{T}$ and $B_{i} \in \mathcal{B}$. By linearity of integral,

\[\begin{eqnarray} \int_{\mathcal{T}} q_{A}(t) \sum_{i=1}^{N} a_{i} 1_{B_{i}}(t) \ P_{\theta}^{T}(dt) & = & \int_{\mathcal{X}} 1_{A}(x) \sum_{i=1}^{N} a_{i} 1_{B_{i}}(T(x)) P_{\theta}(dx) . \nonumber \end{eqnarray}\]

Then taking the limit of both side.

Since $P_{\theta} \ll \mathcal{P}_{*}$, $P_{\theta}^{T} \ll P_{\theta}$. Its Radon-Nikodym derivative is

\[g_{\theta}(t) := \frac{ d P_{\theta}^{T} }{ d P_{*}^{T} } (t)\]

We substitute the above into $\eqref{theorem_02_06_02_equation}$, we obtain

\[\begin{eqnarray} P_{\theta}(A) & = & \int_{\mathcal{T}} q_{A}(t) \ P_{\theta}^{T}(dt) \nonumber \\ & = & \int_{\mathcal{X}} q_{A}(t) g_{\theta}(t) \ P_{*}^{T}(dt) \ (\because \text{ property of Radon-Nikodym}) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A}(t) g_{\theta}(T(x)) \ P_{*}(dx) \quad (\because \eqref{theorem_02_06_02_equation_02}) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A}(t) g_{\theta}(T(x)) \frac{ d P_{*} }{ d \mu } (x) \ \mu(dx) \nonumber \end{eqnarray}\]

Since $A$ is arbitrary, the above equation implies

\[\frac{ d P_{\theta} }{ d \mu } (x) = g_{\theta}(T(x)) \frac{ d P_{*} }{ d \mu } (x) \ \mu \text{-a.e.} .\]

We take $g(x) := (dP_{*} / d\mu)(x)$. The proof is complete.

(b) $\Rightarrow$ (a) Let $P_{*}$ be defined above. We show that

\[\forall P_{\theta} \in \mathcal{P}, \ \exists \tilde{H}_{\theta}: \mathcal{T} \rightarrow \mathbb{R}_{\ge 0}, \ \text{ s.t. } \ \frac{ d P_{\theta} }{ d P_{*} } (x) = \tilde{H}(T(x)) \quad P_{*} \text{-a.s.}\]

where $\tilde{H}_{\theta}$ is $\mathcal{B}$ measurable. Let

\[g_{*}(t) := \sum_{n=1}^{\infty} 2^{-n}g_{\theta_{n}}(t) .\]

By $\eqref{factorization_theorem_factorization}$,

\[\forall n \in \mathbb{N}, \ \forall A \in \mathcal{A}, \ \sum_{i=1}^{n} 2^{-i} P_{\theta_{i}}(A) = \sum_{i=1}^{n} 2^{-i} \int_{A} g_{\theta_{i}}(T(x)) h(x) \ \mu(dx) \quad \mu \text{-a.e.} .\]

Hence by taking the limit we obtain

\[\begin{equation} \forall A \in \mathcal{A}, \ P_{*}(A) = \int_{A} g_{*}(T(x)) h(x) \ \mu(dx) \quad \mu \text{-a.e.} . \label{factorization_theorem_03_08} \end{equation}\]

Let $B_{*} := \{t \in \mathcal{T} \mid g_{*}(t) > 0\}$ and

\[\begin{equation} \tilde{g}_{\theta}(t) := \begin{cases} \frac{g_{\theta}(t)}{g_{*}}(t) & t \in B_{*} \\ 0 & t \notin B_{*} \end{cases} . \label{factorization_theorem_03_09} \end{equation}\]

Then

\[\tilde{g}_{\theta}(T(x)) = \frac{ d P_{\theta} }{ d P_{*} } (x) .\]

Indeed, for all $A \in \mathcal{A}$,

\[\begin{eqnarray} \int_{A} \tilde{g}_{\theta}(T(x)) \ P_{*}(dx) & = & \int_{A} \tilde{g}_{\theta}(T(x)) h(x) g_{*}(T(x)) \ \mu(dx) \quad (\because\ \eqref{factorization_theorem_03_08}) \nonumber \\ & = & \int_{A} g_{\theta}(T(x)) h(x) 1_{B_{*}}(T(x)) \ \mu(dx) \quad (\because\ \eqref{factorization_theorem_03_09}) \nonumber \\ & = & \int_{A} 1_{B_{*}}(T(x)) \ P_{\theta}(dx) \quad (\because\ \eqref{factorization_theorem_factorization}) \nonumber \\ & = & P_{\theta}(A \cap T^{-1}(B_{*})) \nonumber \\ & = & P_{\theta}(A) \nonumber \end{eqnarray}\]

The last equality can be verified

\[\begin{eqnarray} P_{*}(T^{-1}(B_{*}^{c})) & = & \int_{\mathcal{X}} 1_{B_{*}^{c}}(T(x)) h(x) g_{*}(T(x)) \ \mu(dx) \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]

and $\mathcal{P} \ll P_{*}$ implies

\[\forall \theta \in \Theta, \ P_{\theta}(T^{-1}(B_{*}^{c})) = 0 .\]

We define $\mathcal{B}$ measurable function by conditional probability with respect to $P_{*}$,

\[\begin{eqnarray} A \in \mathcal{A}, \ q_{A}(t) := \mathrm{E}^{P_{*}} \left[ \left. 1_{A} \right| T = t \right] \nonumber \end{eqnarray}\]

By definition of conditional probability, for all $P_{*}^{T}$ integrable function $f: \mathcal{T} \rightarrow \mathbb{R}$,

\[\int_{\mathcal{T}} q_{A}(t) f(t) \ P_{*}^{T}(dt) = \int_{\mathcal{X}} 1_{A}(x) f(T(x)) \ P_{*}(dx) .\]

In particular, we take $f = 1_{B}\tilde{g}_{\theta}$ and

\[\begin{eqnarray} \forall B \in \mathcal{B}, \ \theta \in \Theta, \ \int_{B} q_{A}(t) \ P_{\theta}^{T}(dt) & = & \int_{T^{-1}(B)} 1_{A}(x) \ P_{\theta}(dx) \nonumber \\ & = & P_{\theta}(A \cap T^{-1}(B)) . \end{eqnarray}\]

Therefore $T$ is sufficient for $\mathcal{P}$.

$\Box$

Example of sufficient statistics.

$n \in \mathbb{N}$,
$T(x) := \sum_{i=1}^{n} x_{j}$,
- statistics

\[\mathcal{P} := \left\{ \int_{\cdot} \cdots \int_{\cdot} \left( \frac{ 1 }{ \sqrt{2\pi} } \right)^{n} \prod_{i=1}^{n} \exp \left( - \frac{ 1 }{ 2 } (x_{i} - \theta)^{2} \right) \ d x_{1} \cdots \ d x_{n} \mid \theta \in \mathbb{R} \right\}\]

Then $T$ is sufficient for $\mathcal{P}$ since

\[\frac{ d P_{\theta} }{ d x } (x) = \left( \frac{ 1 }{ \sqrt{2\pi} } \right)^{2} \exp \left( -\frac{1}{2} \sum_{j=1}^{n} x_{j}^{2} \right) \exp \left( \theta T(x) - \frac{ n \theta^{2} }{ 2 } \right) .\]

■

Application of sufficiency

Proposition

$\mathcal{P} := \{P_{\theta} \}_{\theta \in \Theta}$,
$f: \mathcal{X} \rightarrow \mathbb{R}$
- integrable for all $P_{\theta}$,

If $T$ is sufficinet for $\mathcal{P}$, then there exists $g: \mathcal{T} \rightarrow \mathbb{R}$ such that

\[\forall \theta \in \Theta, \ g(t) = \mathrm{E} \left[ f \mid T = t \right] \quad P_{\theta}^{T} \text{-a.e.}\]

proof.

Without loss of generality, $f$ is nonnegative. We can take a sequence of nonnegative simple functions $\{f_{n}\}_{n}$ such that $f_{n} \nearrow f$. Let $f_{n} = \sum_{j=1}^{k_{n}} c_{n, j} 1_{A_{n,j}}$ and

\[g_{n}(t) := \sum_{j=1}^{k_{n}} c_{n, j}q_{A_{n, j}}(t) .\]

Then

\[\begin{equation} g_{n}(t) = \mathrm{E} \left[ \left. f_{n} \right| T = t \right] \quad P_{\theta}^{T} \text{-a.s.} . \label{proposition_03_03_approximation} \end{equation}\]

We define

\[g(t) := \begin{cases} \lim \sup_{n \rightarrow \infty} g_{n}(t) & \lim \sup_{n \rightarrow \infty} g_{n}(t) < \infty \\ 0 & \lim \sup_{n \rightarrow \infty} g_{n}(t) = \infty \end{cases} .\]

$g$ is $\mathcal{B}$ measurable. By taking the limit of the both side of $\eqref{proposition_03_03_approximation}$. we obtain the equation in the statement.

$\Box$

Corollary

$\mathcal{P} := \{P_{\theta} \}_{\theta \in \Theta}$,
$f: \mathcal{X} \rightarrow \mathbb{R}$
- integrable for all $P_{\theta}$,

If $T$ is sufficinet for $\mathcal{P}$, then there exists $f^{\prime}: \mathcal{X} \rightarrow \mathbb{R}$ such that

\[\forall \theta \in \Theta, \ f^{\prime}(x) = \mathrm{E}_{\theta} \left[ f \mid T \right] \quad P_{\theta} \text{-a.e.}\]

proof.

By the above proposition, there exists $g$ such that

\[g(t) = \mathrm{E} \left[ \left. f \right| T = t \right] .\]

We take $f^{\prime} := g(T(x))$. Then

\[\begin{eqnarray} T^{-1}(B) \in \sigma(T), \ \int_{T^{-1}(B)} f^{\prime}(x) \ P(dx) & = & \int_{T^{-1}(B)} g(T(x)) \ P(dx) \nonumber \\ & = & \int_{T^{-1}(B)} \mathrm{E} \left[ \left. f \right| T = T(x) \right] \ P(dx) \nonumber \\ & = & \int_{B} f(t) \ P^{T}(dt) \nonumber \\ & = & \int_{A} \mathrm{E}_{\theta} \left[ \left. f \right| T = T(x) \right] \ P(dx) \end{eqnarray}\]

$\Box$

We consider statistical decision problem $(\mathcal{X}, \mathcal{A}, \mathcal{P}, \Theta, D, \mathcal{B}, L, \Delta)$. Additionally, we assume

(i) Decision space $D$ is convex borel set over $\mathbb{R}^{p}$ and $\mathcal{B} := \mathcal{B}(D)$
(ii) loss function $L(\theta, a)$ is convex function with respect to $a$.

Rao Blackwell’s theorem

$\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}$,
$T$
- sufficinet statistic for $\mathcal{P}$
$\delta:\mathcal{X} \rightarrow D$
- nonrandomized decision function

Let

\[\theta \in \Theta, \ \mathrm{E} \left[ \|\delta| \right] < \infty\]

and

\[\delta_{0} := \mathrm{E} \left[ \left. \delta \right| T \right] .\]

Then

\[\forall \theta \in \Theta, \ R(\theta, \delta_{0}) \le R(\theta, \delta) .\]

proof.

\[\begin{eqnarray} L(\theta, \delta_{0}) & = & L \left( \theta, \mathrm{E}_{\theta} \left[ \left. \delta \right| T \right] \right) \nonumber \\ & \le & \mathrm{E}_{\theta} \left[ \left. L(\theta, \delta) \right| T \right] \nonumber \end{eqnarray}\] \[\begin{eqnarray} & & \mathrm{E} \left[ L \left( \theta, \mathrm{E}_{\theta} \left[ \left. \delta \right| T \right] \right) \right] \le \mathrm{E} \left[ \mathrm{E}_{\theta} \left[ \left. L(\theta, \delta) \right| T \right] \right] \nonumber \\ & \Leftrightarrow & \mathrm{E} \left[ L(\theta, \delta_{0}) \right] \le \mathrm{E} \left[ L(\theta, \delta) \right] \nonumber \\ & \Leftrightarrow & R(\theta, \delta_{0}) \le R(\theta, \delta) \end{eqnarray}\]

$\Box$

Rao-Blackwell’s theorem indicates that we need to only consider sufficient statistics.