
Chapter2-07. Exponential family

2.7 Exponential Families

Definition. Exponential families

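A family \(\mathcal{P} := \{P_{\theta}\}_{\theta \in \Theta}\) of distributions on \(\mathcal{X}\) is said to be an exponential family if there exist

a \(\sigma\)-finite measure \(\mu\) on \(\mathcal{X}\), real-valued functions \(C, Q_{1}, \ldots, Q_{k}\) on \(\Theta\), and real-valued measurable functions \(T_{1}, \ldots, T_{k}, h\) on \(\mathcal{X}\)

such that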

\[\begin{eqnarray} \frac{ d P_{\theta} }{ d \mu }(x) & = & C(\theta) \exp \left( \sum_{j=1}^{k} Q_{j}(\theta) T_{j}(x) \right) h(x) \label{chap02_02_32_definition_of_exponential_family} \\ & = & \exp \left( \sum_{j=1}^{k} Q_{j}(\theta) T_{j}(x) - C^{\prime}(\theta) \right) h(x) \end{eqnarray}\]

where \(C^{\prime}(\theta) := -\log C(\theta)\).

Example. Exponential families

If a family of distributions is

it is an exponential family. For instance, the binomial distribution \(B(p)^{n}\) can be written as

\[\left( \begin{array}{c} n \\ x \end{array} \right) p^{x} (1 - p)^{n-x} = (1 - p)^{n} \exp \left( x \log \left( \frac{ p }{ 1 - p } \right) \right) \left( \begin{array}{c} n \\ x \end{array} \right)\]

Hence the binomial distributions form an exponential family.
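
As a quick check against the definition, the factors of \(\eqref{chap02_02_32_definition_of_exponential_family}\) can be read off from this identity. The sketch below assumes \(k = 1\), \(\theta = p \in (0, 1)\), and \(\mu\) the counting measure on \(\{0, 1, \ldots, n\}\); the choice of \(\mu\) is an assumption, since the text does not name it.

\[C(p) = (1 - p)^{n}, \quad Q_{1}(p) = \log \left( \frac{ p }{ 1 - p } \right), \quad T_{1}(x) = x, \quad h(x) = \left( \begin{array}{c} n \\ x \end{array} \right) .\]

Moving \(h\) into the measure instead (so that \(h \equiv 1\)) gives an equally valid representation, so this factorization is one choice among several.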

Example 2.7.1

\[\begin{equation} \frac{ d P_{\sigma} }{d \mu}(y) := \frac{ 1 }{ (2\sigma^{2})^{f/2} \Gamma(f/2) } y^{\frac{f}{2} - 1} \exp \left( \frac{ -y }{ 2 \sigma^{2} } \right), \ y > 0, \label{chap02_02_33_exponential_family} \end{equation}\]

When \(\sigma = 1\), this is the \(\chi^{2}\) distribution with \(f\) degrees of freedom.
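
The density \(\eqref{chap02_02_33_exponential_family}\) fits the definition of an exponential family with \(k = 1\) and \(\theta = \sigma\). One way to match the factors, assuming \(\mu\) is Lebesgue measure on \((0, \infty)\) and \(f > 0\) is held fixed (both are assumptions not stated above), is

\[C(\sigma) = \frac{ 1 }{ (2\sigma^{2})^{f/2} \Gamma(f/2) }, \quad Q_{1}(\sigma) = - \frac{ 1 }{ 2 \sigma^{2} }, \quad T_{1}(y) = y, \quad h(y) = y^{\frac{f}{2} - 1} .\]

Only \(C\) and \(Q_{1}\) depend on \(\sigma\), which is what the definition requires.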

Example 2.7.2

With \(T_{j}(x) := \sum_{i=1}^{n} x_{ij}\), the joint distribution of \(X := (X_{11}, \ldots, X_{ns})\) is

\[\begin{eqnarray} x := (x_{11}, \ldots, x_{ns}), \quad P(X_{11} = x_{11}, \ldots, X_{ns} = x_{ns}) & = & p_{1}^{ \sum_{i=1}^{n} x_{i1} } p_{2}^{ \sum_{i=1}^{n} x_{i2} } \cdots p_{s}^{ \sum_{i=1}^{n} x_{is} } \nonumber \\ & = & \exp \left( \sum_{j=1}^{s} \log \left( p_{j}^{ T_{j}(x) } \right) \right) \nonumber \\ & = & \exp \left( \sum_{j=1}^{s} T_{j}(x) \log \left( p_{j} \right) \right) . \end{eqnarray}\]

The joint distribution of \(T_{1}, \ldots, T_{s-1}\) is the multinomial distribution \(M(n; p_{1}, \ldots, p_{s})\) given by

\[\begin{eqnarray} P(T_{1} = t_{1}, \ldots, T_{s-1} = t_{s-1}) & = & \frac{ n! }{ t_{1}! \cdots t_{s-1}!(n - t_{1} - \cdots - t_{s-1})! } p_{1}^{t_{1}} \cdots p_{s-1}^{t_{s-1}}(1 - p_{1} - \cdots - p_{s-1})^{n-t_{1}- \cdots - t_{s-1}} \end{eqnarray}\]

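If \(X_{1}, \ldots, X_{n}\) is a sample from a distribution with density \(\eqref{chap02_02_32_definition_of_exponential_family}\), then the joint density of the sample is

\[\prod_{i=1}^{n} \frac{ d P_{\theta} }{ d \mu }(x_{i}) = C(\theta)^{n} \exp \left( \sum_{j=1}^{k} Q_{j}(\theta) \sum_{i=1}^{n} T_{j}(x_{i}) \right) \prod_{i=1}^{n} h(x_{i}) ,\]

so the joint distributions again form an exponential family, with statistics \(\sum_{i=1}^{n} T_{j}(x_{i})\).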

Definition. Natural Parameter Space

\[A := \left\{ a := (a_{1}, \ldots, a_{k}) \in \mathbb{R}^{k} \mid 0 < \int_{\mathcal{X}} \exp \left( \sum_{i=1}^{k} a_{i}T_{i}(x) \right) h(x) \ \mu (dx) < \infty \right\} ,\]

where \(\{T_{i}\}\) and \(h\) are the measurable functions in \(\eqref{chap02_02_32_definition_of_exponential_family}\). \(A\) is called the natural parameter space.
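
For a concrete instance, take the scaled \(\chi^{2}\) family of Example 2.7.1 with \(T(y) = y\) and \(h(y) = y^{\frac{f}{2} - 1}\), assuming \(\mu\) is Lebesgue measure on \((0, \infty)\) and \(f > 0\) (assumptions made for this sketch). For \(a < 0\), the substitution \(u = -a y\) gives

\[\int_{0}^{\infty} e^{a y} y^{\frac{f}{2} - 1} \ dy = \frac{ \Gamma(f/2) }{ (-a)^{f/2} } \in (0, \infty) ,\]

while for \(a \ge 0\) the integrand is at least \(y^{\frac{f}{2} - 1}\), whose integral over \((0, \infty)\) diverges. Under these assumptions \(A = (-\infty, 0)\), and the choice \(a = -1/(2\sigma^{2})\) recovers the normalizing constant \((2\sigma^{2})^{f/2} \Gamma(f/2)\) in \(\eqref{chap02_02_33_exponential_family}\).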

Let \(\mathcal{P}^{\prime} := \{P_{\theta}^{\prime}\}_{\theta \in \Theta}\) be an exponential family that satisfies

\[\begin{eqnarray} \frac{ d P_{\theta}^{\prime} }{ d \mu }(x) & = & \exp \left( \sum_{j=1}^{k} Q_{j}(\theta) T_{j}(x) \right) h(x) \nonumber . \end{eqnarray}\]

Let \(A \neq \emptyset\) be the natural parameter space of \(\mathcal{P}^{\prime}\), and define

\[\Psi (a) := \log \left( \int_{\mathcal{X}} \exp \left( \sum_{i=1}^{k} a_{i}T_{i}(x) \right) h(x) \ \mu(dx) \right) \quad (a \in A)\]

Now we can define the exponential family in natural form \(\mathcal{P}:=\{P_{a}\}_{a \in A}\) by

\[\frac{ d P_{a} }{ d\mu } (x) = \exp \left( \sum_{i=1}^{k} a_{i}T_{i}(x) - \Psi(a) \right) h(x) .\]
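
To see what \(\Psi\) looks like in a familiar case, take the binomial example with \(T(x) = x\), \(h(x) = \left( \begin{array}{c} n \\ x \end{array} \right)\), and \(\mu\) the counting measure on \(\{0, 1, \ldots, n\}\) (the choice of \(\mu\) is an assumption of this sketch). By the binomial theorem,

\[\Psi(a) = \log \left( \sum_{x=0}^{n} \left( \begin{array}{c} n \\ x \end{array} \right) e^{a x} \right) = \log \left( (1 + e^{a})^{n} \right) = n \log \left( 1 + e^{a} \right) ,\]

which is finite for every \(a \in \mathbb{R}\), so the natural parameter space is all of \(\mathbb{R}\) in this case. Substituting \(a = \log \left( p / (1 - p) \right)\) gives \(e^{-\Psi(a)} = (1 - p)^{n}\), the constant that appears in front of the exponential in the binomial identity.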

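If \(\mathcal{Y} \subseteq \mathbb{R}_{\ge 0}^{m}\), the exponential family \(\mathcal{P}^{\prime}\) can be embedded in \(\mathcal{P}\). Indeed, by the definition of an exponential family, the Radon-Nikodym derivative \(d P_{\theta}^{\prime} / d \mu\) exists and integrates to \(1\) for all \(\theta \in \Theta\), so the integral defining \(A\) is finite and positive at \((Q_{1}(\theta), \ldots, Q_{k}(\theta))\), that is,

\[\forall \theta \in \Theta, \ (Q_{1}(\theta), \ldots, Q_{k}(\theta)) \in A .\]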

Hence \(\mathcal{P}^{\prime} \subseteq \mathcal{P}\).

Lemma 2.7.1

The natural parameter space of an exponential family is convex.

proof.

Let \(\theta := (\theta_{1}, \ldots, \theta_{k}) \in A\) and \(\theta^{\prime} := (\theta_{1}^{\prime}, \ldots, \theta_{k}^{\prime}) \in A\), and absorb \(h\) into \(\mu\) (that is, replace \(\mu(dx)\) by \(h(x) \ \mu(dx)\)). For all \(\alpha \in (0, 1)\), Hölder's inequality with exponents \(1/\alpha\) and \(1/(1 - \alpha)\) gives

\[\begin{eqnarray} \int_{\mathcal{X}} \exp \left( \sum_{i=1}^{k} \left( \alpha\theta_{i} + (1 - \alpha)\theta_{i}^{\prime} \right) T_{i}(x) \right) \ \mu(dx) & = & \int_{\mathcal{X}} \exp \left( \alpha \sum_{i=1}^{k} \theta_{i} T_{i}(x) \right) \exp \left( (1 - \alpha) \sum_{i=1}^{k} \theta_{i}^{\prime} T_{i}(x) \right) \ \mu(dx) \nonumber \\ & \le & \left( \int_{\mathcal{X}} \exp \left( \alpha \sum_{i=1}^{k} \theta_{i} T_{i}(x) \right)^{\frac{1}{\alpha}} \ \mu(dx) \right)^{\alpha} \left( \int_{\mathcal{X}} \exp \left( (1 - \alpha) \sum_{i=1}^{k} \theta_{i}^{\prime} T_{i}(x) \right)^{\frac{1}{1- \alpha}} \ \mu(dx) \right)^{1 - \alpha} \nonumber \\ & = & \left( \int_{\mathcal{X}} \exp \left( \sum_{i=1}^{k} \theta_{i} T_{i}(x) \right) \ \mu(dx) \right)^{\alpha} \left( \int_{\mathcal{X}} \exp \left( \sum_{i=1}^{k} \theta_{i}^{\prime} T_{i}(x) \right) \ \mu(dx) \right)^{1 - \alpha} < \infty \nonumber \end{eqnarray}\]

The left-hand side is also positive, because the integrand is strictly positive and \(\mu\) is not the zero measure (otherwise \(\theta \notin A\)). Hence \(\alpha\theta + (1 - \alpha)\theta^{\prime} \in A\).
$\Box$

Lemma 2.7.2

Let \(X\) be distributed according to the exponential family

\[\frac{ d P_{\theta, \kappa}^{X} }{ d \mu }(x) = C(\theta, \kappa) \exp \left( \sum_{i=1}^{r} \theta_{i} U_{i}(x) + \sum_{j=1}^{s} \kappa_{j}T_{j}(x) \right)\]

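Then there exist

measures \(\lambda_{\theta}\) on \(\mathbb{R}^{s}\) and probability measures \(\nu_{t}\) on \(\mathbb{R}^{r}\)

such that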

\[\begin{equation} \frac{ d P_{\theta, \kappa}^{T} }{ d \lambda_{\theta} } (t) = C(\theta, \kappa) \exp \left( \sum_{j=1}^{s} \kappa_{j}t_{j} \right) \label{chap02_02_36} \end{equation}\] \[\begin{equation} \frac{ d P_{\theta, \kappa}^{U \mid T= t} }{ d \nu_{t} } (u) = C_{t}(\theta) \exp \left( \sum_{j=1}^{r} \theta_{j} u_{j} \right) \label{chap02_02_37} \end{equation}\]

proof.

$\Box$

Theorem 2.7.1

\[\begin{equation} \int \phi(x) \exp \left( \sum_{j=1}^{k} \theta_{j} T_{j}(x) \right) d\mu(x) \label{chap02_02_38} \end{equation}\]

Then

proof.

$\Box$