2.1 Formal Relation between True and Model
Definition 1
- $W \subseteq \mathbb{R}^{d}$,
$q$ is said to be realizable by a statistical model $p(x \mid w)$ if
\[\exists w \in W \text{ s.t. } q(x) = p(x \mid w) .\]Otherwise, $q$ is unrealizable.
Lemma 1
- (1-a) $q$ is realizable by the statisical model $p(x \mid w)$
- (1-b) $W_{00}$ is not emptyset.
(2) Suppose $W_{00}$ is not emptyset.
\[\forall w_{1}, w_{2} \in W, \ p(x \mid w_{1}) = p(x \mid w_{2}) .\]proof
Example 6
\[p(x, y \mid a, b) := \frac{1}{\sqrt{2\pi}} \exp \left( - \frac{1}{2} \left( y - a \sin(bx) \right)^{2} \right) .\]If $q(x, y) := p(x, y \mid 1, 1)$,
\[W_{00} = \{ (-1, -1), (1, 1) \} .\]If $q(x, y) := p(x, y \mid 0, 0)$,
\[W_{00} = \{ (-1, -1), (1, 1) \} .\]If
\[q(x, y) := \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{1}{2} \right),\]then
\[W_{00} = \{ (-1, -1), (1, 1) \} .\]Definition 3
\[W_{0} := \arg\inf_{w \in W} L(w) .\]$q$ is said to be regular for $p(x \mid w)$ if
\[(\nabla^{2} L(w_{0}))_{i, j} := \left( \frac{\partial L}{\partial w_{i} \partial w_{j}} \right)(w_{0}) .\]is positive definite.
Lemma 2
- (1) $\mathrm{card}(W_{0}) \ge 1$,
- (2) $\mathrm{card}(W_{0}) \ge 1$,
proof
Definiiton 4
- $W_{0}$,
- not emptyset
the optimal probability density function is essentially unique if
\[\exists p_{0}(x) \text{ s.t. } \forall w_{0} \in W_{0}, \ p(x \mid w_{0}) = p_{0}(x) .\]iff
\[\forall w_{1}, w_{2} \in W_{0}, \ \forall x \in \mathcal{X}, \ p(x \mid w_{0}) = p(x \mid w_{1})\]Lemma 3
(1)
proof
Example 8
(1) $\theta \in [0, 2\pi)$,
\[\begin{eqnarray} p(x, y \mid \theta) & := & \frac{1}{2\pi} \exp \left( -\frac{1}{2} \left( (x - \cos \theta)^{2} + (y - \sin \theta)^{2} \right) \right) \nonumber \\ q(x, y) & := & \frac{1}{2\pi} \exp \left( -\frac{1}{2} \left( x^{2} + y^{2} \right) \right) \end{eqnarray}\](2)
Lemma 3
- (1) If $q(x)$ is realizable for $p(x \mid w)$, then $p_{0} = q$ and $W_{00} = W_{0}$,
- (2)
proof
Definition 5
- $W$,
$f$ is called the log density ratio function. The log density ration function is said to be relatively finite variance If there exists $c_{0} > 0$ such that for all $w_{0} \in W_{0}$ and $w \in W$,
\[\begin{equation} \int_{\Omega} f(X(\omega), w_{0}, w) \ P(d \omega) = c_{0} \int_{\Omega} f(X(\omega), w_{0}, w)^{2} \ P(d \omega) . \nonumber \end{equation}\]Lemma 4
-
(1) If the log density ratio function is relatively finite variance,
-
(2) If $q$ is realizable, thel og density ration function is relatively finite variance
-
(3) $q$ is
proof
proof of (1)
For all $w_{1}, w_{2} \in W_{0}$,
\[\begin{eqnarray} 0 & = & L(w_{2}) - L(w_{1}) \nonumber \\ & = & \int q(x) \log \frac{ q(x) }{ p(x \mid w_{2}) } \ P_{X}(d x) - \int q(x) \log \frac{ q(x) }{ p(x \mid w_{1}) } \ P_{X}(d x) \nonumber \\ & = & \int q(x) f(x, w_{1}, w_{2}) \ P_{X}(d x)) \nonumber \\ & \ge & c_{0} \int q(x) f(x, w_{1}, w_{2})^{2} \ P_{X}(dx) . \nonumber \end{eqnarray}\]Hence for all $x$, $q(x) = 0$ or $f(x, w_{1}, w_{2}) = 0$. However, since $q$ is p.d.f., $q(x) = 0$ is with probability zero. Thus, $f(x, w_{1}, w_{2}) = 0$ is with probabiilty 1. By definition,
\[p(x \mid w_{1}) = p(x \mid w_{2}) .\]proof of (2)
\[\begin{eqnarray} & & F(t) := t + e^{-t} - 1 \ge 0 \nonumber \\ & & t = 0 \Leftrightarrow F(t) = 0 \nonumber \end{eqnarray}\]By mean value theorem,, there exists $t^{}$ such that $\abs{t^{}} \ge \abs{t}$,
\[F(t) = F(0) + F^{\prime}(0) (t - 0) + \frac{1}{2} F^{\prime\prime}(t^{*})(t - 0)^{2} = \frac{t^{2}}{2}\exp(-t^{2}) .\]proof of (3)
\[\]Remark
By lemma 4, if the log density ratio function is relatively finite variance, $f$ is independent on $w_{0} \in W_{0}$. In that case, we write
\[f(x, w) := f(x, w_{0}, w) .\]