3.5. Maximum a posteriori estimation
Defintition
Let
\[\begin{eqnarray} \hat{w}(\beta) & := & \argmin_{w} \mathcal{L}(w) \nonumber \\ & = & \argmin_{w} \frac{1}{n} \sum_{i=1}^{n} \log p(X_{i} \mid w) - \frac{1}{n\beta} \log \phi(w) \nonumber \\ \hat{w}(\infty) & := & \argmin_{w} \frac{1}{n} \sum_{i=1}^{n} \log p(X_{i} \mid w) . \nonumber \end{eqnarray}\]- $\hat{w}(1)$ is said to be Maximum A Posteriori estimator.
-
$\hat{w}(\infty)$ is said to be Maximum Likelihood estimator.
- The algorithm to estimate true distribution by the $p(x \mid \hat{w}(1))$ is MAP estimation
- The algorithm to estimate true distribution by the $p(x \mid \hat{w}(\infty))$ is MAP estimation
■
Lemma 18
\[\begin{eqnarray} & & K(\hat{w}(1)) \overset{p}{\rightarrow} 0 \quad n \rightarrow \infty \nonumber \\ & & K(\hat{w}(\infty)) \overset{p}{\rightarrow} 0 \quad n \rightarrow \infty \nonumber \end{eqnarray} .\]Moreover, if \(\{w_{0}\} = W_{0}\),
\[\begin{eqnarray} \hat{w}(1) \rightarrow w_{0} \nonumber \\ \hat{w}(\infty) \rightarrow w_{0} \nonumber \end{eqnarray}\]proof
\[\begin{eqnarray} \mathcal{L}(w) & = & L_{n}(w) - K_{n}(w) + K(w) + K_{n}(w) - K(w) - \frac{1}{n \beta} \log \phi(w) \nonumber \\ & = & L_{n}(w) + L_{n}(w_{0}) - L_{n}(w) + K(w) - \frac{1}{\sqrt{n}} \eta_{n}(w) - \frac{1}{n \beta} \log \phi(w) \nonumber \\ & = & L_{n}(w_{0}) + K(w) - \frac{1}{\sqrt{n}} \eta_{n}(w) - \frac{1}{n \beta} \log \phi(w) \nonumber \end{eqnarray}\]Let us assume $\eta_{n}(w)$ is stochastic process over a comact set. There exists a random variable $\eta$ such that
\[\begin{eqnarray} \abs{ \sup_{w \in W} \abs{ \eta_{n}(w) } - \eta } \overset{w}{\rightarrow} 0 . \nonumber \end{eqnarray}\]And
\[\begin{eqnarray} \abs{ \sup_{w \in W} \abs{ \eta_{n}(w) } - \eta } = O_{p}(\frac{1}{\sqrt{n}}) . \nonumber \end{eqnarray}\]Let us assume that $\phi$ is continuous. Since $W$ is a compact,
\[\forall w \in W, \ - \frac{1}{n\beta} \log \phi(w) \rightarrow 0 \quad (n \rightarrow \infty) .\]Therefore,
\[\mathcal{L}(w) := L(w_{0}) + K(w) + O_{p}(\frac{1}{\sqrt{n}}) .\]On the other hand, Let $\epsilon(n) >0$ be
\[\begin{eqnarray} \epsilon(n) & \rightarrow & 0, \nonumber \\ \sqrt{n} \epsilon(n) & \rightarrow & \infty, . \nonumber \end{eqnarray}\]For instance, $\epsilon(n) := 1 / n^{1/4}$ satisfies the conditions. For all $n$ satisfying $K(w) \ge \epsilon(n)$,
\[\begin{eqnarray} \mathcal{L}(w) & \ge & L(w_{0}) + \epsilon(n) + O_{p}(\frac{1}{\sqrt{n}}) \nonumber & \rightarrow & L(w_{0}) \quad (n \rightarrow \infty) . \nonumber \end{eqnarray}\]$\Box$
Remark 27
- (1) MAP estimator and ML estimator are random variables depending on samples.
- (2)
- (3) If true distribution is not regular with respect to statistical model, \(\mathrm{E}_{w}[w]\) does not approach to $w_{0}$ in general.
■