
memo

Chapter 2.

2.1 Formal Relation between True and Model

Definition 1

$q$ is said to be realizable by a statistical model $p(x \mid w)$ if

\[\exists w \in W \text{ s.t. } q(x) = p(x \mid w) .\]

Otherwise, $q$ is unrealizable.

\[W_{00} := \{ w \in W \mid \forall x, \ q(x) = p(x \mid w) \} .\]

Lemma 1

(2) Suppose $W_{00}$ is not empty. Then

\[\forall w_{1}, w_{2} \in W_{00}, \ p(x \mid w_{1}) = p(x \mid w_{2}) .\]

proof

$\Box$

Example 6

\[p(x, y \mid a, b) := \frac{1}{\sqrt{2\pi}} \exp \left( - \frac{1}{2} \left( y - a \sin(bx) \right)^{2} \right) .\]

If $q(x, y) := p(x, y \mid 1, 1)$,

\[W_{00} = \{ (-1, -1), (1, 1) \} .\]
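
As a quick check of this case (a sketch, assuming the equality of densities is required for all real $x$ and $y$): $(a, b) \in W_{00}$ exactly when the two regression functions coincide,

\[a \sin(bx) = \sin(x) \quad \text{for all } x .\]

The pair $(-1, -1)$ satisfies this because sine is odd:

\[(-1) \sin(-x) = \sin(x) .\]

Conversely, differentiating the identity once at $x = 0$ gives $ab = 1$, and differentiating three times at $x = 0$ gives $ab^{3} = 1$. Hence $b^{2} = 1$ and $a = b = \pm 1$.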

If $q(x, y) := p(x, y \mid 0, 0)$,

\[W_{00} = \{ (a, b) \in W \mid ab = 0 \} .\]

If

\[q(x, y) := \frac{1}{\sqrt{2\pi}} \exp \left( -\frac{1}{2} \right),\]

then

\[W_{00} = \{ (-1, -1), (1, 1) \} .\]

\[\begin{eqnarray} L(w) & := & - \mathrm{E} \left[ \log p(X \mid w) \right] \nonumber \\ & = & - \int q(x) \log p(x \mid w) \ dx \nonumber \\ & = & - \int q(x) \log q(x) \ dx + \int q(x) \log \frac{q(x)}{p(x \mid w)} \ dx \nonumber \end{eqnarray}\]

\[W_{00} = \left\{ w \in W \mid \int q(x) \log \frac{q(x)}{p(x \mid w)} \ dx = 0 \right\} .\]
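
One way to see why the last characterization is plausible is the following sketch. It assumes that $p(x \mid w)$ integrates to 1 over the set where $q(x) > 0$; the symbol $K(w)$ is notation introduced here only for illustration. Write

\[K(w) := \int q(x) \log \frac{q(x)}{p(x \mid w)} \ dx , \quad F(t) := t + e^{-t} - 1 \ge 0 ,\]

where $F(t) = 0$ only at $t = 0$. With $t = \log ( q(x) / p(x \mid w) )$ we have $q(x) e^{-t} = p(x \mid w)$, so

\[\int q(x) F \left( \log \frac{q(x)}{p(x \mid w)} \right) \ dx = K(w) + \int p(x \mid w) \ dx - \int q(x) \ dx = K(w) .\]

Hence $K(w) \ge 0$, and $K(w) = 0$ forces $q(x) = p(x \mid w)$ for almost every $x$ with $q(x) > 0$.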

Definition 3

\[W_{0} := \arg\inf_{w \in W} L(w) .\]

$q$ is said to be regular for $p(x \mid w)$ if $W_{0}$ consists of a single element $w_{0}$ and the Hessian matrix

\[(\nabla^{2} L(w_{0}))_{i, j} := \left( \frac{\partial^{2} L}{\partial w_{i} \partial w_{j}} \right)(w_{0})\]

is positive definite.

Lemma 2

proof

$\Box$

Definition 4

The optimal probability density function is said to be essentially unique if

\[\exists p_{0}(x) \text{ s.t. } \forall w_{0} \in W_{0}, \ p(x \mid w_{0}) = p_{0}(x) .\]

This holds if and only if

\[\forall w_{1}, w_{2} \in W_{0}, \ \forall x \in \mathcal{X}, \ p(x \mid w_{1}) = p(x \mid w_{2}) .\]

Lemma 3

(1)

proof

$\Box$

Example 8

(1) $\theta \in [0, 2\pi)$,

\[\begin{eqnarray} p(x, y \mid \theta) & := & \frac{1}{2\pi} \exp \left( -\frac{1}{2} \left( (x - \cos \theta)^{2} + (y - \sin \theta)^{2} \right) \right) \nonumber \\ q(x, y) & := & \frac{1}{2\pi} \exp \left( -\frac{1}{2} \left( x^{2} + y^{2} \right) \right) \end{eqnarray}\]
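
For this example the average log loss can be computed directly. The following is a sketch that uses only the fact that $X$ and $Y$ are independent standard normal variables under $q$, so that $\mathrm{E}[(X - c)^{2}] = 1 + c^{2}$ for any constant $c$:

\[\begin{eqnarray} L(\theta) & = & \log(2\pi) + \frac{1}{2} \mathrm{E} \left[ (X - \cos \theta)^{2} + (Y - \sin \theta)^{2} \right] \nonumber \\ & = & \log(2\pi) + \frac{1}{2} \left( 2 + \cos^{2} \theta + \sin^{2} \theta \right) \nonumber \\ & = & \log(2\pi) + \frac{3}{2} . \nonumber \end{eqnarray}\]

So $L$ is constant in $\theta$ and $W_{0} = [0, 2\pi)$, while the densities $p(x, y \mid \theta)$ differ for different $\theta$. Under these definitions the optimal probability density function is therefore not essentially unique.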

(2)

Lemma 3

proof

$\Box$

Definition 5

\[f(x, w_{0}, w) := \log \frac{ p(x \mid w_{0}) }{ p(x \mid w) } .\]

$f$ is called the log density ratio function. The log density ratio function is said to have relatively finite variance if there exists $c_{0} > 0$ such that for all $w_{0} \in W_{0}$ and $w \in W$,

\[\begin{equation} \int_{\Omega} f(X(\omega), w_{0}, w) \ P(d \omega) \ge c_{0} \int_{\Omega} f(X(\omega), w_{0}, w)^{2} \ P(d \omega) . \nonumber \end{equation}\]

Lemma 4

proof

proof of (1)

For all $w_{1}, w_{2} \in W_{0}$,

\[\begin{eqnarray} 0 & = & L(w_{2}) - L(w_{1}) \nonumber \\ & = & \int q(x) \log \frac{ q(x) }{ p(x \mid w_{2}) } \ dx - \int q(x) \log \frac{ q(x) }{ p(x \mid w_{1}) } \ dx \nonumber \\ & = & \int q(x) f(x, w_{1}, w_{2}) \ dx \nonumber \\ & \ge & c_{0} \int q(x) f(x, w_{1}, w_{2})^{2} \ dx . \nonumber \end{eqnarray}\]

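The integrand $q(x) f(x, w_{1}, w_{2})^{2}$ is nonnegative and its integral is at most zero, so for almost every $x$, $q(x) = 0$ or $f(x, w_{1}, w_{2}) = 0$. Since $q$ is a probability density function, the set where $q(x) = 0$ has probability zero. Thus $f(X, w_{1}, w_{2}) = 0$ with probability 1. By the definition of $f$,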

\[p(x \mid w_{1}) = p(x \mid w_{2}) .\]

proof of (2)

\[\begin{eqnarray} & & F(t) := t + e^{-t} - 1 \ge 0 \nonumber \\ & & t = 0 \Leftrightarrow F(t) = 0 \nonumber \end{eqnarray}\]

Since $F(0) = 0$, $F^{\prime}(t) = 1 - e^{-t}$, and $F^{\prime\prime}(t) = e^{-t}$, Taylor's theorem gives a $t^{*}$ with $|t^{*}| \le |t|$ such that

\[F(t) = F(0) + F^{\prime}(0) (t - 0) + \frac{1}{2} F^{\prime\prime}(t^{*})(t - 0)^{2} = \frac{t^{2}}{2}\exp(-t^{*}) .\]

proof of (3)

\[\]

Remark

By Lemma 4, if the log density ratio function has relatively finite variance, then $f$ does not depend on the choice of $w_{0} \in W_{0}$. In that case, we write

\[f(x, w) := f(x, w_{0}, w) .\]