2.6 Characterization of Sufficiency
- \(\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}\),
- sample sp. \((\mathcal{X}, \mathcal{A})\)上の確率測度の族
- $T: \mathcal{X} \rightarrow \mathcal{T}$
- 統計量
Definition. sufficient
$T$がsufficientであるとは、
\[\forall A \in \mathcal{A}, \ \exists q(A \mid t):\mathcal{T} \rightarrow \mathbb{R}: \mathcal{B} \text{-measurable } \ \text{ s.t. } \ \forall B \in \mathcal{B}, \ \theta \in \Theta, \ \int_{B} q(A \mid t) \ d P_{\theta}^{T}(dt) = P_{\theta}(A \cap T^{-1}(B))\]がなりたつことを言う。
統計量が十分とは、分布族を$\theta$に依存しない可測関数で表現できるということである。 つまり、統計量$T$が$\theta$を特定するのに十分な情報を持っているということ。 上の定義は以下と同値である。
\[\forall A \in \mathcal{A}, \ \exists q(A \mid \cdot): \mathcal{T} \rightarrow \mathbb{R}: \mathcal{B} \text{-measurable }, \ \text{ s.t. } \ q(A \mid t) = P_{\theta}(A \mid T = t) \ t \text{-a.e.}\]Example 2.4.1からorder statisticsはsufficient statisticである。
Example. Bernoulli trial
- \(\mathcal{X} := \{0, 1\}^{n}\),
- \(\mathcal{A} := 2^{\mathcal{X}}\),
- \(\Theta := [0, 1]\),
- \(\mathcal{T} := \{0, 1, \ldots, n\}\),
- \(\mathcal{B} := 2^{\mathcal{T}}\),
We define p.d.f. of this trials as follows.
\[\theta \in \Theta, \ x \in \mathcal{X}, \ p_{\theta}(x) := \theta^{\sum_{i=1}^{n} x_{i}} (1 - \theta)^{n - \sum_{i=1}^{n} x_{i}},\]Corresponding probability measure is
\[A \in \mathcal{A} \ P_{\theta}(A) := \sum_{x \in A} p_{\theta}(x).\]We show that statistic
\[T(x) := \sum_{i=1}^{n} x_{i}\]is sufficient. To show that, we need to confirm the equation of definition of sufficiency. It is easy to see that
\[\begin{eqnarray} P_{\theta}(\{x\} \cap T^{-1}(\{t\}) = \begin{cases} \theta^{t}(1 - \theta)^{n - t} & x \in T^{-1}(\{t\}) \\ 0 & x \notin T^{-1}(\{t\}) \end{cases}. \label{example_sufficiency_lhs} \end{eqnarray}\]Hence
\[\begin{eqnarray} \forall A \in \mathcal{A}, \ P_{\theta}(A \cap T^{-1}(\{t\}) & = & P_{\theta}(A \cap T^{-1}(\{t\})) \nonumber \\ & = & \sum_{x \in A} P_{\theta}(\{x\} \cap T^{-1}(\{t\})) \nonumber \\ & = & |A \cap T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]In particular, if we take $A$ as $\mathcal{X}$, we have
\[\begin{eqnarray} P_{\theta}(T^{-1}(\{t\}) & = & |T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & |\{x \in \mathcal{X} \mid \sum_{i=1}^{n} x_{i} = t \} | \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & \left( \begin{array}{c} n \\ t \end{array} \right) \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]For evey $A$ we define $\mathcal{B}$ measurable function \(q_{A}\) by
\[\begin{eqnarray} r(x, t) := \begin{cases} \left( \begin{array}{c} n \\ t \end{array} \right)^{-1} & (T(x) = t) \\ 0 & \text{otherwise} \end{cases} \nonumber \end{eqnarray}\] \[q_{A}(t) := \sum_{x \in A} r(x, t) = |A \cap T^{-1}(\{t\})| \left( \begin{array}{c} n \\ t \end{array} \right)^{-1}\]Now we show that \(q_{A}\) satisfies the equality of the definition of sufficiency. LHS of the equality is
\[\begin{eqnarray} \int_{B} q_{A}(t) P_{\theta}^{T}(dt) & = & \sum_{t \in B} q_{A}(t) P_{\theta}^{T}(\{t\}) \nonumber \\ & = & \sum_{t \in B} q_{A}(t) \left( \begin{array}{c} n \\ t \end{array} \right) \theta^{t}(1 - \theta)^{n - t}. \nonumber \\ & = & \sum_{t \in B} |A \cap T^{-1}(\{t\})| \theta^{t}(1 - \theta)^{n - t}. \nonumber \end{eqnarray}\]RHS of the equality is
\[\begin{eqnarray} P_{\theta}(A \cap T^{-1}(B)) & = & \sum_{t \in B} P_{\theta}(A \cap T^{-1}(\{t\})) \nonumber \\ & = & \sum_{t \in B} |A \cap T^{-1}(\{t\}) | \theta^{t}(1 - \theta)^{n - t}. \end{eqnarray}\]Theorem 2.6.1
- $\mathcal{X}$
- Euclidian sp.
- $T$が$\mathcal{P}$について、十分とする
このとき、$\theta$に依存しない可測関数\(g:\mathcal{T} \rightarrow [0, 1]\)が存在して、
\[g(t) = P_{\theta}(A \mid T = t)\]となる。
proof.
Theorem 2.5.1と同様の議論をすれば良い。
この定理はもう少し一般的な形で成り立つ。 詳細は数理統計学の命題3.3.をみる。
さて、Bernoulli trialの例で見たように、定義に基づく十分性の証明は煩雑である。 十分性の判定が容易となる同値な条件を与える。 まず必要な用語を定義する。
Definition. domination, absolutely continuous
- \((\Omega, \mathcal{F}, P)\),
- \((\mathcal{X}, \mathcal{A})\),
- measurable sp.
- \(\mathcal{P} := \{P_{\theta} \mid \theta \in \Theta\}\),
- $\mu: \mathcal{A} \rightarrow [0, \infty)$,
- $\sigma$-finite measure over \((\mathcal{X}, \mathcal{A})\),
\(P_{\theta}\) is absolutely continuous with respect to $\mu$ or dominated by $\mu$ if
\[\forall A \in \mathcal{A}, \ \mu(A) = 0 \Rightarrow P_{\theta}(A) = 0 .\]Then we denote this as \(P_{\theta} \ll \mu\). We extend this definition to set of measures. \(\mathcal{P}\) is dominated by $\mu$ if $\mu$ satisfies
\[\begin{eqnarray} & & \forall \theta \in \Theta, \ \forall A \in \mathcal{A}, \ \mu(A) \Rightarrow P_{\theta}(A) = 0 \nonumber \\ & \Leftrightarrow & \forall \theta \in \Theta, \ P_{\theta} \ll \mu . \nonumber \end{eqnarray}\]Lemma
- $\mu$
- $\sigma$-finite measure
- \(\mathcal{P} \ll \mu\),
Then
\[\exists \{P_{n}\}_{n \in \mathbb{N}} \subseteq \mathcal{P} \text{ s.t. } \mathcal{P} \ll P_{*} := \sum_{i = 1}^{\infty} 2^{-n}P_{n},\]proof.
With out loss of generality, we assume that $\mu$ is finite measure. Indeed, if we assume that the statement holds for finite measure and $\mu$ is $\sigma$-finite measure, then there exists \(\{B_{n}\}_{n \in \mathbb{N}}\) such that
\[\mathcal{X} = \sum_{i \in \mathbb{N}} B_{n}, \ \mu(B_{n}) < \infty .\]Here we define finite measure \(\mu^{*}\) by
\[\mu^{*} := \sum_{n \in \mathbb{N}} \frac{ \mu(A \cap B_{n}) }{ 2^{n}\mu(B_{n}) } 1_{\{\mu(A_{n}) \neq 0\}} .\]Then \(\mathcal{P} \ll \mu^{*}\). Indeed,
\[\mu^{*}(A) = 0 \Leftrightarrow \forall n \in \mathbb{N}, \ \mu(A \cap B_{n}) = 0\]and
\[\begin{eqnarray} \forall P \in \mathcal{P}, \ P(A) & = & P(A \cap \sum_{n \in \mathbb{N}} B_{n}) \nonumber \\ & = & \sum_{n \in \mathbb{N}} P(A \cap B_{n}) . \nonumber \end{eqnarray}\]Hence by \(\mathcal{P} \ll \mu\), we obtain
\[\begin{eqnarray} \mu^{*}(A) = 0 & \Rightarrow & \forall n \in \mathbb{N}, \ \mu(A \cap B_{n}) = 0 \nonumber \end{eqnarray}\]\(\mu^{*}\) is finite measure so that we obtain \(P_{*}\) by applying the statement for finite measure to \(\mu^{*}\).
We assume \(\mu\) is finite measure. Now we define
\[\begin{eqnarray} & & P \in \mathcal{P}, \ f_{P} := \frac{ d P }{ d \mu } \nonumber \\ & & S_{P} := \{ x \in \mathcal{X} \mid f_{P}(x) > 0 \} . \nonumber \end{eqnarray}\]Radon-Nikodym derivative is \(\mathcal{A}\) measurable so that \(S_{P} \in \mathcal{A}\). And \(S_{P} \neq \emptyset\) since \(S_{P} = \emptyset\) implies
\[\begin{eqnarray} P(\mathcal{X}) & = & \int_{\mathcal{X}} f_{P}(x) \ dx \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]Claim(a):
\[\begin{equation} P(A) = 0 \Leftrightarrow \mu(A \cap S_{P}) = 0 \label{lemma_03_26_step_b} \end{equation}\]To prove the claim, for all $P \in \mathcal{P}$, we have
\[\begin{eqnarray} P(A) & = & \int_{A} f_{P}(x) \ dx \nonumber \\ & = & \int_{A} f_{P}(x) \ \mu(dx) \\ & = & \int_{A \cap S_{P}} f_{P}(x) \ \mu(dx) \nonumber \end{eqnarray}\]This implies that
\[\begin{eqnarray} P(A) = 0 & \Leftrightarrow & 1_{A \cap S_{P}}(x) = 0 \ \mu \text{-a.e.} \nonumber \\ & \Leftrightarrow & \mu(A \cap S_{P}) = 0 . \nonumber \end{eqnarray}\]Hence the claim is proved.
\[\mathcal{C} := \{ S \in \mathcal{A} \mid \exists \{P_{n}\}_{n \in \mathbb{N}} \in \mathcal{P}, \ S = \cup_{n \in \mathbb{N}} S_{P_{n}} \}\]$\mathcal{C}$ is not empy set. For all $P \in \mathcal{P}$, \(S_{P} \in \mathcal{A}\) and we take a sequence by \(P_{n} := P\),
\[S_{P} = \bigcup_{n \in \mathbb{N}} S_{P_{n}} .\]Moreover it is easy to check that $\mathcal{C}$ is closed with respect to countable sum since finite cartesian product of countable sets is stil countlable sets. We will show that there is the maximum value in \(\{\mu(S) \mid S \in \mathcal{C}\}\). Since $\mu$ is finite measure by our assumption,
\[D := \sup \{ \mu(S) \mid S \in \mathcal{C} \} < \infty .\]We can take a sequence \(\{S^{i}\}_{i \in \mathbb{N}} \subseteq \mathcal{C}\) such that
\[\mu(S^{i}) \nearrow D \ (i \rightarrow \infty) .\]Let \(S^{*} := \cup_{i \in \mathbb{N}} S^{i}\). Obviously, \(\mu(S^{*}) \le D\) by definiton of $D$. Additionally, by monotonicity of measure
\[\forall i \in \mathbb{N}, \ \mu(S^{i}) \le \mu(S_{*})\]Hence by taking the limit both side,
\[D \le \mu(S_{*}) .\]Therefore \(D = \mu(S^{*})\) and \(S^{*} \in \mathcal{C}\).
Now we construct \(P_{*}\) which satisfies the equation in the statement. \(S^{*} \in \mathcal{C}\) so that we can take a sequence \(\{P_{n}^{*}\}_{n \in \mathbb{N}}\) such that \(S = \cup_{n \in \mathbb{N}}S_{P_{n}^{*}}\). We define \(P_{*} := \sum_{n \in \mathbb{N}} 2^{-n}P_{n}^{*}\).
Claim: \(\mathcal{P} \ll P_{*}\).
Let $A \in \mathcal{A}$ be fixed. Suppose \(P_{*}(A) = 0\). We have
\[\begin{eqnarray} \forall P \in \mathcal{P}, \ P(A) & = & P(A \cap S^{*}) + P(A \cap (S^{*})^{c}) \nonumber \\ & \le & P(A \cap S^{*}) + P((S^{*})^{c}) . \label{lemma_03_26_inequality_for_claim} \end{eqnarray}\]We only need to show
- (1) \(P(A \cap S_{*}) = 0\),
- (2) \(P((S^{*})^{c}) = 0\).
For (1),
\[\begin{eqnarray} P_{*}(A) = 0 & \Rightarrow & P_{n}(A) = 0 \ (\forall n \in \mathbb{N}) \quad (\because \text{definition of } P_{*}) \nonumber \\ & \Rightarrow & \mu(A \cap S_{P_{n}}) = 0 \ (\forall n \in \mathbb{N}) \quad (\because \eqref{lemma_03_26_step_b}) \nonumber \\ & \Rightarrow & \mu(A \cap S_{*}) = 0 \nonumber \\ & \Rightarrow & P(A \cap S_{*}) = 0 \ (\forall P \in \mathcal{P}) \quad (\because \mathcal{P} \ll \mu) \nonumber \end{eqnarray}\]For (2), we show that \(\mu((S^{*})^{c} \cap S_{P}) = 0 \ (\forall P \in \mathcal{P})\). If we prove this, by \(\eqref{lemma_03_26_step_b}\) we obtain \(P((S^{*})^{c}) = 0\). Suppose that there exists $P \in \mathcal{P}$ such that \(\mu((S^{*})^{c} \cap S_{P}) > 0\), then
\[\begin{eqnarray} \mu(S^{*} \cup S_{P}) & = & \mu((S^{*} \cup S_{P}) \cap S^{*}) + \mu((S^{*} \cup S_{P}) \cap (S^{*})^{c}) \nonumber \\ & = & \mu(S^{*}) + \mu((S^{*})^{c} \cap S_{P}) \nonumber \\ & > & \mu(S^{*}) \end{eqnarray}\]This is contradictition since \(\mu(S^{*})\) is the maximum value in \(\mathcal{C}\).
We prove equivalent condition of sufficiency. This is known as factorization theorem.
Thereom 2.6.2.
- \(\mathcal{P} := \{P_{\theta} \mid \theta \in \Theta\}\),
- \(\mu:\mathcal{A} \rightarrow [0, \infty)\),
- $\sigma$-finite measure
Then the following statemets are equivalent:
- (a) $T:\mathcal{X} \rightarrow \mathcal{T}$ is sufficient for $\mathcal{P}$,
- (b) there exist \(g_{\theta}\) nonnegative $\mathcal{B}$ measurable function and \(h_{\theta}\) nonnegative $\mathcal{B}$ measurable function such that
proof.
(a) $\Rightarrow$ (b) By definition of sufficiency,
\[\begin{eqnarray} \forall A \in \mathcal{A}, \ \exists q_{A}(t) \ \text{ s.t. } \ \forall B \in \mathcal{B}, \ \int_{B} q_{A}(t) \ dP_{\theta}^{T}(dt) & = & P_{\theta}(A \cap T^{-1}(B)) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A \cap T^{-1}(B)}(x) P_{\theta}(dx) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A}(x)1_{B}(T(x)) P_{\theta}(dx) \nonumber \end{eqnarray}\]Since $\mathcal{P} \ll \mu$, by lemma, there exists a sequence \(\{\theta_{n}\}_{n \in \mathbb{N}}\) such that
\[\mathcal{P} \ll P_{*} := \sum_{n \in \mathbb{N}} 2^{-n}P_{\theta_{n}} .\]We have
\[\begin{eqnarray} \forall n \in \mathbb{N}, \ \int_{B} q_{A}(t) \ \left( \sum_{i=1}^{n} 2^{-i} P_{\theta_{i}}^{T}(dt) \right) & = & \int_{\mathcal{X}} 1_{A}(x)1_{B}(T(x)) \left( \sum_{i=1}^{n} 2^{-i} P_{\theta_{i}}(dx) \right) . \end{eqnarray}\]By taking the limit of the above equation as $n \rightarrow \infty$, we obtain
\[\begin{eqnarray} \int_{B} q_{A}(t) \ dP_{*}^{T}(dt) & = & \int_{\mathcal{X}} 1_{A}(x)1_{B}(T(x)) P_{*}(dx) . \label{theorem_02_06_02_equation_02} \end{eqnarray}\]Hence for all \(P_{*}^{T}\) integrable function $g$,
\[\begin{eqnarray} \int_{\mathcal{T}} q_{A}(t) g(t) \ dP_{\theta}^{T}(dt) & = & \int_{\mathcal{X}} 1_{A}(x)g(T(x)) P_{\theta}(dx) . \label{theorem_02_06_02_equation} \end{eqnarray}\]This can be proved by standard procedure for Lebesgue integrable function. We only give short sketch of this proof. For all \(P_{*}^{T}\) integrable function $g$ can be approximated by linear sum of indicator function, say \(\sum_{i=1}^{N} a_{i} 1_{B_{i}}(t)\) where \(a_{i} \in \mathcal{T}\) and \(B_{i} \in \mathcal{B}\). By linearity of integral,
\[\begin{eqnarray} \int_{\mathcal{T}} q_{A}(t) \sum_{i=1}^{N} a_{i} 1_{B_{i}}(t) \ P_{\theta}^{T}(dt) & = & \int_{\mathcal{X}} 1_{A}(x) \sum_{i=1}^{N} a_{i} 1_{B_{i}}(T(x)) P_{\theta}(dx) . \nonumber \end{eqnarray}\]Then taking the limit of both side.
Since \(P_{\theta} \ll \mathcal{P}_{*}\), \(P_{\theta}^{T} \ll P_{\theta}\). Its Radon-Nikodym derivative is
\[g_{\theta}(t) := \frac{ d P_{\theta}^{T} }{ d P_{*}^{T} } (t)\]We substitute the above into \(\eqref{theorem_02_06_02_equation}\), we obtain
\[\begin{eqnarray} P_{\theta}(A) & = & \int_{\mathcal{T}} q_{A}(t) \ P_{\theta}^{T}(dt) \nonumber \\ & = & \int_{\mathcal{X}} q_{A}(t) g_{\theta}(t) \ P_{*}^{T}(dt) \ (\because \text{ property of Radon-Nikodym}) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A}(t) g_{\theta}(T(x)) \ P_{*}(dx) \quad (\because \eqref{theorem_02_06_02_equation_02}) \nonumber \\ & = & \int_{\mathcal{X}} 1_{A}(t) g_{\theta}(T(x)) \frac{ d P_{*} }{ d \mu } (x) \ \mu(dx) \nonumber \end{eqnarray}\]Since $A$ is arbitrary, the above equation implies
\[\frac{ d P_{\theta} }{ d \mu } (x) = g_{\theta}(T(x)) \frac{ d P_{*} }{ d \mu } (x) \ \mu \text{-a.e.} .\]We take \(g(x) := (dP_{*} / d\mu)(x)\). The proof is complete.
(b) $\Rightarrow$ (a) Let \(P_{*}\) be defined above. We show that
\[\forall P_{\theta} \in \mathcal{P}, \ \exists \tilde{H}_{\theta}: \mathcal{T} \rightarrow \mathbb{R}_{\ge 0}, \ \text{ s.t. } \ \frac{ d P_{\theta} }{ d P_{*} } (x) = \tilde{H}(T(x)) \quad P_{*} \text{-a.s.}\]where \(\tilde{H}_{\theta}\) is $\mathcal{B}$ measurable. Let
\[g_{*}(t) := \sum_{n=1}^{\infty} 2^{-n}g_{\theta_{n}}(t) .\]By \(\eqref{factorization_theorem_factorization}\),
\[\forall n \in \mathbb{N}, \ \forall A \in \mathcal{A}, \ \sum_{i=1}^{n} 2^{-i} P_{\theta_{i}}(A) = \sum_{i=1}^{n} 2^{-i} \int_{A} g_{\theta_{i}}(T(x)) h(x) \ \mu(dx) \quad \mu \text{-a.e.} .\]Hence by taking the limit we obtain
\[\begin{equation} \forall A \in \mathcal{A}, \ P_{*}(A) = \int_{A} g_{*}(T(x)) h(x) \ \mu(dx) \quad \mu \text{-a.e.} . \label{factorization_theorem_03_08} \end{equation}\]Let \(B_{*} := \{t \in \mathcal{T} \mid g_{*}(t) > 0\}\) and
\[\begin{equation} \tilde{g}_{\theta}(t) := \begin{cases} \frac{g_{\theta}(t)}{g_{*}}(t) & t \in B_{*} \\ 0 & t \notin B_{*} \end{cases} . \label{factorization_theorem_03_09} \end{equation}\]Then
\[\tilde{g}_{\theta}(T(x)) = \frac{ d P_{\theta} }{ d P_{*} } (x) .\]Indeed, for all $A \in \mathcal{A}$,
\[\begin{eqnarray} \int_{A} \tilde{g}_{\theta}(T(x)) \ P_{*}(dx) & = & \int_{A} \tilde{g}_{\theta}(T(x)) h(x) g_{*}(T(x)) \ \mu(dx) \quad (\because\ \eqref{factorization_theorem_03_08}) \nonumber \\ & = & \int_{A} g_{\theta}(T(x)) h(x) 1_{B_{*}}(T(x)) \ \mu(dx) \quad (\because\ \eqref{factorization_theorem_03_09}) \nonumber \\ & = & \int_{A} 1_{B_{*}}(T(x)) \ P_{\theta}(dx) \quad (\because\ \eqref{factorization_theorem_factorization}) \nonumber \\ & = & P_{\theta}(A \cap T^{-1}(B_{*})) \nonumber \\ & = & P_{\theta}(A) \nonumber \end{eqnarray}\]The last equality can be verified
\[\begin{eqnarray} P_{*}(T^{-1}(B_{*}^{c})) & = & \int_{\mathcal{X}} 1_{B_{*}^{c}}(T(x)) h(x) g_{*}(T(x)) \ \mu(dx) \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]and \(\mathcal{P} \ll P_{*}\) implies
\[\forall \theta \in \Theta, \ P_{\theta}(T^{-1}(B_{*}^{c})) = 0 .\]We define $\mathcal{B}$ measurable function by conditional probability with respect to \(P_{*}\),
\[\begin{eqnarray} A \in \mathcal{A}, \ q_{A}(t) := \mathrm{E}^{P_{*}} \left[ \left. 1_{A} \right| T = t \right] \nonumber \end{eqnarray}\]By definition of conditional probability, for all \(P_{*}^{T}\) integrable function $f: \mathcal{T} \rightarrow \mathbb{R}$,
\[\int_{\mathcal{T}} q_{A}(t) f(t) \ P_{*}^{T}(dt) = \int_{\mathcal{X}} 1_{A}(x) f(T(x)) \ P_{*}(dx) .\]In particular, we take \(f = 1_{B}\tilde{g}_{\theta}\) and
\[\begin{eqnarray} \forall B \in \mathcal{B}, \ \theta \in \Theta, \ \int_{B} q_{A}(t) \ P_{\theta}^{T}(dt) & = & \int_{T^{-1}(B)} 1_{A}(x) \ P_{\theta}(dx) \nonumber \\ & = & P_{\theta}(A \cap T^{-1}(B)) . \end{eqnarray}\]Therefore $T$ is sufficient for $\mathcal{P}$.
Example of sufficient statistics.
- $n \in \mathbb{N}$,
- \(T(x) := \sum_{i=1}^{n} x_{j}\),
- statistics
Then $T$ is sufficient for $\mathcal{P}$ since
\[\frac{ d P_{\theta} }{ d x } (x) = \left( \frac{ 1 }{ \sqrt{2\pi} } \right)^{2} \exp \left( -\frac{1}{2} \sum_{j=1}^{n} x_{j}^{2} \right) \exp \left( \theta T(x) - \frac{ n \theta^{2} }{ 2 } \right) .\]Application of sufficiency
Proposition
- \(\mathcal{P} := \{P_{\theta} \}_{\theta \in \Theta}\),
- $f: \mathcal{X} \rightarrow \mathbb{R}$
- integrable for all \(P_{\theta}\),
If $T$ is sufficinet for $\mathcal{P}$, then there exists $g: \mathcal{T} \rightarrow \mathbb{R}$ such that
\[\forall \theta \in \Theta, \ g(t) = \mathrm{E} \left[ f \mid T = t \right] \quad P_{\theta}^{T} \text{-a.e.}\]proof.
Without loss of generality, $f$ is nonnegative. We can take a sequence of nonnegative simple functions \(\{f_{n}\}_{n}\) such that \(f_{n} \nearrow f\). Let \(f_{n} = \sum_{j=1}^{k_{n}} c_{n, j} 1_{A_{n,j}}\) and
\[g_{n}(t) := \sum_{j=1}^{k_{n}} c_{n, j}q_{A_{n, j}}(t) .\]Then
\[\begin{equation} g_{n}(t) = \mathrm{E} \left[ \left. f_{n} \right| T = t \right] \quad P_{\theta}^{T} \text{-a.s.} . \label{proposition_03_03_approximation} \end{equation}\]We define
\[g(t) := \begin{cases} \lim \sup_{n \rightarrow \infty} g_{n}(t) & \lim \sup_{n \rightarrow \infty} g_{n}(t) < \infty \\ 0 & \lim \sup_{n \rightarrow \infty} g_{n}(t) = \infty \end{cases} .\]$g$ is $\mathcal{B}$ measurable. By taking the limit of the both side of \(\eqref{proposition_03_03_approximation}\). we obtain the equation in the statement.
Corollary
- \(\mathcal{P} := \{P_{\theta} \}_{\theta \in \Theta}\),
- $f: \mathcal{X} \rightarrow \mathbb{R}$
- integrable for all \(P_{\theta}\),
If $T$ is sufficinet for $\mathcal{P}$, then there exists $f^{\prime}: \mathcal{X} \rightarrow \mathbb{R}$ such that
\[\forall \theta \in \Theta, \ f^{\prime}(x) = \mathrm{E}_{\theta} \left[ f \mid T \right] \quad P_{\theta} \text{-a.e.}\]proof.
By the above proposition, there exists $g$ such that
\[g(t) = \mathrm{E} \left[ \left. f \right| T = t \right] .\]We take \(f^{\prime} := g(T(x))\). Then
\[\begin{eqnarray} T^{-1}(B) \in \sigma(T), \ \int_{T^{-1}(B)} f^{\prime}(x) \ P(dx) & = & \int_{T^{-1}(B)} g(T(x)) \ P(dx) \nonumber \\ & = & \int_{T^{-1}(B)} \mathrm{E} \left[ \left. f \right| T = T(x) \right] \ P(dx) \nonumber \\ & = & \int_{B} f(t) \ P^{T}(dt) \nonumber \\ & = & \int_{A} \mathrm{E}_{\theta} \left[ \left. f \right| T = T(x) \right] \ P(dx) \end{eqnarray}\]We consider statistical decision problem \((\mathcal{X}, \mathcal{A}, \mathcal{P}, \Theta, D, \mathcal{B}, L, \Delta)\). Additionally, we assume
- (i) Decision space $D$ is convex borel set over $\mathbb{R}^{p}$ and $\mathcal{B} := \mathcal{B}(D)$
- (ii) loss function $L(\theta, a)$ is convex function with respect to $a$.
Rao Blackwell’s theorem
- \(\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}\),
- $T$
- sufficinet statistic for $\mathcal{P}$
- $\delta:\mathcal{X} \rightarrow D$
- nonrandomized decision function
Let
\[\theta \in \Theta, \ \mathrm{E} \left[ \|\delta| \right] < \infty\]and
\[\delta_{0} := \mathrm{E} \left[ \left. \delta \right| T \right] .\]Then
\[\forall \theta \in \Theta, \ R(\theta, \delta_{0}) \le R(\theta, \delta) .\]proof.
\[\begin{eqnarray} L(\theta, \delta_{0}) & = & L \left( \theta, \mathrm{E}_{\theta} \left[ \left. \delta \right| T \right] \right) \nonumber \\ & \le & \mathrm{E}_{\theta} \left[ \left. L(\theta, \delta) \right| T \right] \nonumber \end{eqnarray}\] \[\begin{eqnarray} & & \mathrm{E} \left[ L \left( \theta, \mathrm{E}_{\theta} \left[ \left. \delta \right| T \right] \right) \right] \le \mathrm{E} \left[ \mathrm{E}_{\theta} \left[ \left. L(\theta, \delta) \right| T \right] \right] \nonumber \\ & \Leftrightarrow & \mathrm{E} \left[ L(\theta, \delta_{0}) \right] \le \mathrm{E} \left[ L(\theta, \delta) \right] \nonumber \\ & \Leftrightarrow & R(\theta, \delta_{0}) \le R(\theta, \delta) \end{eqnarray}\]Rao-Blackwell’s theorem indicates that we need to only consider sufficient statistics.