View on GitHub

memo

Chapter5-01. The Linear Conjugate Grdient Method

5.1 The Linear Conjugate Grdient Method

The problem we consider

Problem 1

Find $x \in \mathbb{R}^{n}$ which satisfy

\[\begin{equation} \min_{x} Ax = b . \label{euqation_05_01} \end{equation}\]

Remark

The solution of problem1 is euivalent to the following problem.

\[\begin{eqnarray} \min_{x} \phi(x) := \frac{1}{2} x^{\mathrm{T}} A x - b^{\mathrm{T}}x \label{euqation_05_02} . \end{eqnarray}\]

Indeed, since $A$ is positive symmetric definite, the solution of the quadratic problem uniquely exists.

\[\begin{eqnarray} \nabla \phi(x) & = & Ax - b & =: & r(x) . \label{equation_05_03} \end{eqnarray}\] \[\begin{eqnarray} r_{k} & := & A x_{k} - b . \label{equation_05_04} \end{eqnarray}\]

Proposition

Then \(\{p_{0}, \ldots, p_{l}\}\) is linearly independent.

proof

$\Box$

Definition conjugate

\(\{p_{0}, \ldots, p_{l}\}\) is said to be conjugate with respect to $A$ if

\[\begin{eqnarray} p_{i}^{\mathrm{T}} A p_{j} = 0 \quad (i \neq j) . \label{equation_05_05} \end{eqnarray}\]

Definition conjugate direction method

Step0. $k := 0$,

Step1.

\[\begin{eqnarray} \alpha_{k} & := & - \frac{ r_{k}^{\mathrm{T}}p_{k} }{ p_{k}^{\mathrm{T}} A p_{k} } \nonumber \\ & = & - \frac{ (A x_{k} - b)^{\mathrm{T}}p_{k} }{ p_{k}^{\mathrm{T}} A p_{k} } \label{equation_05_07} \end{eqnarray}\]

Step2.

\[\begin{equation} x_{k + 1} := x_{k} + \alpha_{k} p_{k} \label{equation_05_06} \end{equation}\]

Step3. $k \leftarrow k + 1$,

Step4. If $k = n$, it finishes the algorithm. Otherwise go to Step1.

Theorem 5.1

The sequence covergexs to the solution of the linear system \(\eqref{equation_05_01}\) in at most $n$ steps.

proof

Since \(\{p_{0}, \ldots, p_{l}\}\) is linearly indepdent, there exist \(\sigma_{0}, \ldots, \sigma_{n-1}\) such that

\[\begin{eqnarray} x^{*} - x_{0} := \sigma_{0} p_{0} + \cdots + \sigma_{n-1} p_{n-1} . \end{eqnarray}\]

By multiplying the equation by $p_{k}^{\mathrm{T}} A$ and using the conjugacy

\[\begin{eqnarray} & & x^{*} - x_{0} = \sigma_{0} p_{0} + \cdots + \sigma_{n-1} p_{n-1} \nonumber \\ & \Leftrightarrow & p_{k}^{\mathrm{T}}A (x^{*} - x_{0)} = p_{k}^{\mathrm{T}}A \left( \sigma_{0} p_{0} + \cdots + \sigma_{n-1} p_{n-1} \right) \nonumber \\ & \Leftrightarrow & p_{k}^{\mathrm{T}}A (x^{*} - x_{0)} = p_{k}^{\mathrm{T}} A \sigma_{k} p_{k} \nonumber \\ & \Leftrightarrow & \sigma_{k} = \frac{ p_{k}^{\mathrm{T}} A (x^{*} - x_{0}) }{ p_{k}^{\mathrm{T}}A p_{k} } \label{equation_05_08} \end{eqnarray}\] \[\begin{eqnarray} & & x_{k} = x_{k-1} + \alpha_{k-1} p_{k-1} \nonumber \\ & \Leftrightarrow & x_{k} = x_{0} + p_{1} x_{1} + \cdots + \alpha_{k-1} p_{k-1} \nonumber \\ & \Leftrightarrow & x_{k} - x_{0} = p_{1} x_{1} + \cdots + \alpha_{k-1} p_{k-1} \nonumber \\ & \Leftrightarrow & p_{k}^{\mathrm{T}}A (x_{k} - x_{0}) = 0 . \end{eqnarray}\]

Hence

\[\begin{eqnarray} p_{k}^{\mathrm{T}}A (x^{*} - x_{0}) & = & p_{k}^{\mathrm{T}}A (x^{*} - x_{k} + x_{k} - x_{0}) \nonumber \\ & = & p_{k}^{\mathrm{T}}A (x^{*} - x_{k}) \nonumber \\ & = & p_{k}^{\mathrm{T}} (b - Ax_{k}) \nonumber \\ & = & - p_{k}^{\mathrm{T}} r_{k} . \nonumber \end{eqnarray}\]

Therefore, $\sigma_{k} = \alpha_{k}$.

$\Box$

Theorem 5.2

Then

\[\begin{equation} \forall i = 0, \ldots, k-1, \ r_{k}^{\mathrm{T}}p_{i} = 0 \label{equation_05_11} \end{equation} .\]

$x_{k}$ is the minimizer of $\phi(x)$ over set

\[\begin{eqnarray} D_{k} := \{ x \mid x = x_{0} + \mathrm{span}\{p_{0}, \ldots, p_{k-1}\} \} . \label{equation_05_12} \end{eqnarray}\]

proof

We first show that a point $\tilde{x}$ minimizes $\phi$ over the set if and only if

\[\forall, i = 0, \ldots, k -1, r(\tilde{x})^{\mathrm{T}} p_{i} = 0 .\]

Indeed, let us define

\[h(\sigma) := \phi(x_{0} + \sigma_{0}p_{0} + \cdots + \sigma_{k-1}p_{k-1}) .\]

Since composition of convex functions is also convex, $h$ is a strictly convex function. It has a unique minimizer $\sigma^{*}$ that satisfies

\[\begin{eqnarray} \forall i = 0, 1, \ldots, k-1, \ & & \frac{ \partial h(\sigma^{*}) }{ \partial \sigma_{i} } = 0 \nonumber \\ & \Leftrightarrow & \nabla \phi(x_{0} + \sigma_{0}^{*}p_{0} + \cdots + \sigma_{k-1}^{*}p_{k-1}) p_{i} = 0 \nonumber \\ & \Leftrightarrow & r(\tilde{x}) p_{i} = 0 \nonumber \end{eqnarray} .\]

We use the induction to prove the statement. For the case $k = 1$,

\[\begin{eqnarray} r_{1}^{\mathrm{T}}p_{0} & = & (Ax_{1} - b)^{\mathrm{T}} p_{0} \nonumber \\ & = & (Ax_{0} + A\alpha_{0}p_{0} - b)^{\mathrm{T}} p_{0} \\ & = & (x_{0}^{\mathrm{T}}A + \alpha_{0}p_{0}^{\mathrm{T}}A - b^{\mathrm{T}}) p_{0} \\ & = & x_{0}^{\mathrm{T}}A p_{0} + \alpha_{0}p_{0}^{\mathrm{T}}A p_{0} - b^{\mathrm{T}} p_{0} \nonumber \\ & = & x_{0}^{\mathrm{T}}A p_{0} + r_{0}^{\mathrm{T}}p_{0} - b^{\mathrm{T}} p_{0} \nonumber \\ & = & r_{0}^{\mathrm{T}} p_{0} + r_{0}^{\mathrm{T}}p_{0} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]

Let us assume that the statement holds up to $k = l$. For the case $k = l + 1$,

\[\begin{eqnarray} r_{k+1} & = & Ax_{k+1} - b \nonumber \\ & = & Ax_{k} + A\alpha_{k}p_{k} - b \nonumber \\ & = & r_{k} + A\alpha_{k}p_{k} . \nonumber \end{eqnarray}\]

Hence

\[\begin{eqnarray} p_{k}^{\mathrm{T}}r_{k+1} & = & p_{k}^{\mathrm{T}} \left( r_{k} + A\alpha_{k}p_{k} \right) \nonumber \\ & = & p_{k}^{\mathrm{T}} r_{k} - r_{k}^{\mathrm{T}} p_{k} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]

By conjugacy and the assumption of the induction, for all $i = 0, \ldots, k-1$,

\[\begin{eqnarray} p_{i}^{\mathrm{T}}r_{k+1} & = & p_{i}^{\mathrm{T}} \left( r_{k} + A\alpha_{k}p_{k} \right) \nonumber \\ & = & p_{i}^{\mathrm{T}} A\alpha_{k}p_{k} \nonumber \\ & = & 0 \label{equation_05_conjugacy} . \end{eqnarray}\]
$\Box$

Algorithm 5.1 CG-Preliminary Version

\[\begin{eqnarray} \alpha_{k} & \leftarrow & - \frac{ r_{k}^{\mathrm{T}}p_{k} }{ p_{k}^{\mathrm{T}}Ap_{k} } \label{equation_05_14_a} \\ x_{k+1} & \leftarrow & x_{k} + \alpha_{k} p_{k} \label{equation_05_14_b} \\ r_{k+1} & \leftarrow & A x_{k+1} - b \label{equation_05_14_c} \\ \beta_{k+1} & \leftarrow & \frac{ r_{k+1}^{\mathrm{T}} A p_{k} }{ p_{k}^{\mathrm{T}}A p_{k} } \label{equation_05_14_d} \\ p_{k+1} & \leftarrow & - r_{k+1} + \beta_{k+1} p_{k} \label{equation_05_14_e} \\ k & \leftarrow & k + 1 \label{equation_05_14_f} \end{eqnarray}\]

Definition Krylov subspace

Krylov subspace of degree $k$ for $r_{0}$ is defined as

\[\mathcal{K}(r_{0}; k) := \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k}r_{0} \} .\]

Theorem 5.3

Suppose that the $k$-th iterate generated by the conjugate gradient method is not hte solution point $x^{*}$.

\[\begin{eqnarray} r_{k}^{\mathrm{T}}r_{i} & = & 0 \label{equation_05_16} \\ \mathrm{span}\{r_{0}, \ldots, r_{k}\} & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}\} \label{equation_05_17} \\ \mathrm{span}\{p_{0}, \ldots, p_{k}\} & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}\} \label{equation_05_18} \\ p_{k}^{\mathrm{T}}Ap_{i} & = & 0 \label{equation_05_19} \end{eqnarray}\]

Therefore, the sequence \(\{x_{k}\}\) converges to $x^{*}$ in at most $n$ steps.

proof

We prove by induction. For $k = 0$, \(\eqref{equation_05_17}\) and \(\eqref{equation_05_18}\) hold. For $k = 1$,

\[\begin{eqnarray} p_{1}^{\mathrm{T}}Ap_{0} & = & (-r_{1} + \beta_{1} p_{0})^{\mathrm{T}} A p_{0} \nonumber \\ & = & \left( -r_{1} + p_{0} \frac{ r_{1}^{\mathrm{T}}Ap_{0} }{ p_{0}^{\mathrm{T}}Ap_{0} } \right)^{\mathrm{T}} A p_{0} \nonumber \\ & = & -r_{1}^{\mathrm{T}} Ap_{0} + r_{1}^{\mathrm{T}}Ap_{0} . \nonumber \end{eqnarray}\]

Let us assume that the \(\eqref{equation_05_17}\) and \(\eqref{equation_05_18}\) hold up to $k$. We first show \(r_{k+1} \in \mathrm{span}\{r_{0}, \ldots, A^{k+1}r_{0}\}\). By the assumption of the induction,

\[\begin{eqnarray} r_{k} & \in & \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k}r_{0} \} \nonumber \\ p_{k} & \in & \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k}r_{0} \} \nonumber \end{eqnarray}\]

Hence

\[\begin{eqnarray} Ap_{k} & \in & \mathrm{span}\{ Ar_{0}, A^{2} r_{0}, \ldots, A^{k+1}r_{0} \} \nonumber \end{eqnarray}\]

Combining these results,

\[\begin{eqnarray} r_{k+1} & = & r_{k} + \alpha_{k}Ap_{k} \nonumber \\ & \in & \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k+1}r_{0} \} . \nonumber \end{eqnarray}\]

That is,

\[\begin{eqnarray} \mathrm{span}\{r_{0}, \ldots, A^{k+1}r_{0}\} \supseteq \mathrm{span}\{r_{0}, \ldots, r_{k+1}\} \end{eqnarray}\]

We next show \(A^{k+1}r_{0} \in \mathrm{span}\{r_{0}, \ldots, r_{k+1}\}\). By our assumption, \(A^{k} r_{0} \in \mathrm{span}\{p_{0}, \ltots, p_{k}\}\). Thus,

\[\begin{eqnarray} A^{k+1} r_{0} & = & A (A^{k} r_{0}) \nonumber \\ & \in & \mathrm{span}\{Ap_{0}, \ldots, Ap_{k}\} \nonumber \\ & = & \mathrm{span} \left\{ A(r_{1} - r_{0})/\alpha_{0}, \ldots, A(r_{k+1} - r_{k-1})/\alpha_{k}, \right\} \nonumber \\ & = & \mathrm{span} \left\{ r_{0}, \ldots, r_{k+1} \right\} . \nonumber \end{eqnarray}\]

Therefore,

\[\mathrm{span}\{r_{0}, \ldots, A^{k+1}r_{0}\} \subseteq \mathrm{span}\{r_{0}, \ldots, r_{k+1}\} .\]

For \(\eqref{equation_05_18}\),

\[\begin{eqnarray} \mathrm{span}\{p_{0}, \ldots, p_{k}, p_{k+1}\} & = & \mathrm{span}\{p_{0}, \ldots, p_{k}, -r_{k+1} + \beta_{k+1}p_{k}\} \nonumber \\ & = & \mathrm{span}\{p_{0}, \ldots, p_{k}, -r_{k+1} + \beta_{k+1}p_{k}\} \nonumber \\ & = & \mathrm{span}\{p_{0}, \ldots, p_{k}, r_{k+1}\} \nonumber \\ & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}, r_{k+1}\} \nonumber \\ & = & \mathrm{span}\{r_{0}, \ldots, r_{k}, r_{k+1}\} \nonumber \\ & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}, A^{k+1}r_{0}\} . \nonumber \end{eqnarray}\]

Next, we prove \(\eqref{equation_05_19}\).

\[\begin{eqnarray} p_{k+1}^{\mathrm{T}}Ap_{k} & = & -r_{k+1}^{\mathrm{T}}Ap_{k} + \beta_{k+1}p_{k}^{\mathrm{T}}Ap_{k} \nonumber \\ & = & -r_{k+1}^{\mathrm{T}}Ap_{k} + \frac{ r_{k+1}Ap_{k} }{ p_{k}^{\mathrm{T}}Ap_{k} } p_{k}^{\mathrm{T}}A p_{k} \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]

Let $i = 0, \ldots, k-1$ be fixed. By \(\eqref{equation_05_18}\),

\[\begin{eqnarray} A p_{i} & \in & \mathrm{span}\{Ar_{0}, \ldots, A A^{i}r_{0}\} \nonumber \\ & \subseteq & \mathrm{span}\{p_{0}, \ldots, p^{i+1}\} \quad (\because \eqref{equation_05_18}) \nonumber \end{eqnarray}\]

By Theorem 5.2,

\[\begin{eqnarray} r_{k+1}^{\mathrm{T}} A p_{i} & = & r_{k+1}^{\mathrm{T}} \sum_{l=0}^{i+1} \sigma_{i} p_{i} \nonumber \\ & = & \sum_{l=0}^{i+1} \sigma_{i} r_{k+1}^{\mathrm{T}} p_{i} \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]

Thus,

\[\begin{eqnarray} p_{k+1}^{\mathrm{T}}Ap_{i} & = & - r_{k+1}^{\mathrm{T}} A p_{i} + \beta_{k+1} p_{k}^{\mathrm{T}} A p_{i} \nonumber \\ & = & - r_{k+1}^{\mathrm{T}} A p_{i} \quad (\because \text{induction hypothesis}) \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]

Therefore, \(\eqref{equation_05_19}\) is proved. Since \(\{p_{i}\}\) is conjugate, Theorem 5.1 ensure that the algorithm terminates in at most $n$ iterations.

For \(\eqref{equation_05_16}\), let $i = 1, \ldots, k - 1$ be fixed.

\[\begin{eqnarray} r_{k}^{\mathrm{T}}r_{i} & = & r_{k}^{\mathrm{T}} (\beta_{i}p_{i-1} - p_{i}) \nonumber \\ & = & r_{k}^{\mathrm{T}}\beta_{i}p_{i-1} - r_{k}^{\mathrm{T}} p_{i} \nonumber \\ & = & 0 . \quad (\because \eqref{equation_05_conjugacy}) \end{eqnarray}\]

In case of $i = 0$,

\[\begin{eqnarray} r_{k}^{\mathrm{T}}r_{0} & = & r_{k}^{\mathrm{T}} p_{0} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]
$\Box$

Algorithm 5.2 CG

\[\begin{eqnarray} r_{0} & := & A x_{0} - b \nonumber \\ p_{0} & := & - r_{0} \nonumber \end{eqnarray}\]

while $r_{k} \neq 0$,

\[\begin{eqnarray} \alpha_{k} & \leftarrow & \frac{ r_{k}^{\mathrm{T}}r_{k} }{ p_{k}^{\mathrm{T}}Ap_{k} } \label{equation_05_24_a} \\ x_{k+1} & \leftarrow & x_{k} + \alpha_{k} p_{k} \label{equation_05_24_b} \\ r_{k+1} & \leftarrow & r_{k} + \alpha_{k} A p_{k} \label{equation_05_24_c} \\ \beta_{k+1} & \leftarrow & \frac{ r_{k+1}^{\mathrm{T}} r_{k+1} }{ r_{k}^{\mathrm{T}}r_{k} } \label{equation_05_24_d} \\ p_{k+1} & \leftarrow & - r_{k+1} + \beta_{k+1} p_{k} \label{equation_05_24_e} \\ k & \leftarrow & k + 1 \label{equation_05_24_f} \end{eqnarray}\]

end (while)

\[\begin{eqnarray} x_{k+1} & = & x_{k} + \alpha_{k} p_{k} \nonumber \\ & = & x_{0} + \alpha_{0} p_{0} + \cdots + \alpha_{k} p_{k} \nonumber \\ & = & x_{0} + \alpha_{0} (\sum_{i=1}^{n}c_{0, i}r_{0}A^{i}) + \alpha_{1} (\sum_{i=1}^{n}c_{1, i}r_{0}A^{i}) + \cdots + \alpha_{k} (\sum_{i=1}^{n}c_{k, i}r_{0}A^{i}) \quad (\because \eqref{equation_05_18}) \nonumber \\ & = & x_{0} + \gamma_{0} A^{0} r_{0} + \gamma_{1} A^{1}r_{0} + \cdots + \gamma_{k}A^{k}r_{0} \label{equation_05_25} \end{eqnarray}\]

where \(\gamma_{1}, \ldots, \gamma_{k}\) are some constants. Let

\[P_{k}^{*}(\lambda) := \gamma_{0} \lambda^{0} + \gamma_{1} \lambda^{1} + \cdots + \gamma_{k}\lambda^{k} .\]

Then \(\eqref{equation_05_25}\) can be expressed as

\[\begin{eqnarray} x_{k+1} & = & x_{0} + P_{k}^{*}(A) r_{0} . \label{equation_05_26} \end{eqnarray}\]

Let

\[\begin{eqnarray} \norm{z}_{A} & := & z^{\mathrm{T}}Az . \label{equation_05_27} \end{eqnarray}\]

With this norm,

\[\begin{eqnarray} \frac{1}{2} \norm{x - x^{*}}_{A}^{2} & = & \phi(x) - \phi(x^{*}) . \label{equation_05_28} \end{eqnarray}\] \[\begin{eqnarray} \min_{P_{k}} \norm{ x_{0} + P_{k}(A) r_{0} - x^{*} }_{A} \end{eqnarray}\] \[\begin{eqnarray} r_{0} & = & A x_{0} - b \nonumber \\ & = & A x_{0} - A x^{*} \nonumber \\ & = & A( x_{0} - x^{*}) \nonumber \end{eqnarray}\] \[\begin{eqnarray} x_{k+1} - x^{*} & = & x_{0} + P_{k}^{*}(A) r_{0} \nonumber \\ & = & (I + P_{k}^{*}(A)A)(x_{0} - x^{*}) \label{equation_05_30} \end{eqnarray}\]

Let \(0 < \lambda_{1} \le \lambda_{2} \le \cdots \le \lambda_{n}\) be the eigenvalues of $A$. Let \(v_{1}, \ldots, v_{n}\) be the corresponding orthogonal eigenvectors.

\[\begin{eqnarray} A & = & \sum_{i=1}^{n} \lambda_{i}v_{i}v_{i}^{\mathrm{T}} \nonumber \end{eqnarray}\]

Since $v_{i}$ spans the whole space $\mathbb{R}^{n}$, there exist \(\{xi_{i}\}\) such that

\[\begin{eqnarray} x_{0} - x^{*} & = & \sum_{i=1}^{n} \xi_{i}v_{i} . \label{equation_05_31} \end{eqnarray}\] \[\begin{eqnarray} P_{k}(A) v_{i} & = & \left( \gamma_{0}A^{0} + \cdots + \gamma_{k}A^{k} \right) v_{i} \nonumber \\ & = & \left( \gamma_{0}\lambda^{0} + \cdots + \gamma_{k}\lambda^{k} \right) v_{i} \nonumber \\ & = & P_{k}^{*}(\lambda_{i}) v_{i} . \end{eqnarray}\] \[\begin{eqnarray} x_{k+1} - x^{*} & = & x_{0} + \gamma_{0} r_{0} + \sum_{i=1}^{k} \gamma_{i}A^{i}r_{0} - x^{*} \nonumber \\ & = & x_{0} + P_{k}^{*}(A) r_{0} - x^{*} \nonumber \\ & = & x_{0} + P_{k}^{*}(A) A(x_{0} - x^{*}) - x^{*} \nonumber \\ & = & (I + P_{k}^{*}(A)A) (x_{0} - x^{*}) \nonumber \\ & = & (I + P_{k}^{*}(A)A) \left( \sum_{i=1}^{n} \xi_{i} v_{i} \right) \nonumber \\ & = & \sum_{i=1}^{n} \left( \xi_{i} v_{i} + \xi_{i} \lambda_{i} P_{k}^{*}(\lambda_{i}) v_{i} \right) \nonumber \\ & = & \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} \nonumber \end{eqnarray}\]

Hence

\[\begin{eqnarray} \norm{ x_{k+1} - x^{*} }_{A}^{2} & = & \norm{ \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} }_{A}^{2} \nonumber \\ & = & \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} \right)^{\mathrm{T}} A \left( \left( 1 + \lambda_{j} P_{k}^{*}(\lambda_{j}) \right) \xi_{j} v_{j} \right) \nonumber \\ & = & \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} \right)^{\mathrm{T}} \left( \left( 1 + \lambda_{j} P_{k}^{*}(\lambda_{j}) \right) \xi_{j} \lambda_{j} v_{j} \right) \nonumber \\ & = & \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i}^{\mathrm{T}} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} \lambda_{i} v_{i} \nonumber \\ & = & \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right)^{2} \xi_{i}^{2} \lambda_{i} \nonumber \end{eqnarray}\]

By thorem 5.2, the sequence \(\{x_{k+1}\}\) generated by CG method minimizes over \(x_{0} + \mathrm{span}\{r_{0}, Ar_{0}, \ldots, A^{k}r_{0}\}\).

\[\begin{eqnarray} \phi(x^{*}) & = & \min_{x} \phi(x) \nonumber \\ & = & \min_{a_{i} \in \mathbb{R}^{k}} \phi(x_{0} + a_{1}r_{0} + \cdots + A^{k}r_{0} a_{k}) \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree }k} \phi(x_{0} + P_{k}(A)r_{0}) \nonumber . \end{eqnarray}\]

where

\[\begin{eqnarray} P_{k}(x) := x + \cdots + x^{k} . \end{eqnarray}\]

For any $k$, \(x_{k+1} \in (x_{0} + \mathrm{span}\{r_{0}, Ar_{0}, \ldots, A^{k}r_{0}\}\), Hence

\[\begin{eqnarray} \min_{x \in \{x_{0}, \ldots, x_{n}\}} \norm{ x - x^{*} }_{A}^{2} & = & \min_{P_{k}: \text{polynomial with degree }k} \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \xi_{i}^{2} \lambda_{i} \nonumber \\ & \le & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \left( \sum_{j=1}^{n} \xi_{j}^{2} \lambda_{j} \right) \nonumber \\ & \le & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \left( \sum_{j=1}^{n} \xi_{j}^{2} \lambda_{j} \right) \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \left( \sum_{j=1}^{n} \sum_{l=1}^{n} \xi_{j} v_{j}^{\mathrm{T}} A \xi_{l} v_{l} \right) \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \norm{ \xi_{j} v_{j}^{\mathrm{T}} }_{A}^{2} \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree}k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \norm{ x - x^{*} }_{A}^{2} \quad (\because \text{by definition}) \label{equation_05_34} . \end{eqnarray}\]

Theorem 5.4

If $A$ has only $r$ distinct eigenvalues, then the CC iteration will terminate at the solution in at most $r$ iteration.

proof

\[Q_{r}(\lambda) := \frac{ (-1)^{r} }{ \tau_{1} \cdots \tau_{r} } (\lambda - \tau_{1}) \cdots (\lambda - \tau_{r}) .\]

It’s easy to confirm

\[\forall i = 1, \ldots, n, \ Q_{r}(\lambda_{i}) = 0 .\]

and

\[Q_{r}(0) = 1 .\]

Hence \(Q_{r}(\lambda) - 1\) is polynonimal of degree $r$ with a root at $\lambda = 0$. Let

\[\bar{P}_{r-1}(\lambda) := \frac{ Q_{r}(\lambda) - 1 }{ \lambda } .\]

\(\bar{P}_{r-1}\) is polynomial of degree $r-1$.

\[\begin{eqnarray} 0 & \le & \min_{P_{r-1}} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{r-1}(\lambda_{i}) \right)^{2} \nonumber \\ & \le & \max_{1 \le i \le n} \left( 1 + \lambda_{i} \bar{P}_{r-1}(\lambda_{i}) \right)^{2} \nonumber \\ & \le & \max_{1 \le i \le n} Q_{r}^{2}(\lambda_{i}) \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]

By setting $k=r-1$, \(\eqref{equation_05_34}\) is

\[\begin{eqnarray} \norm{x_{r} - x^{*}}_{A} & = & \min_{P_{r-1}: \text{polynomial with degree}k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{r-1}(\lambda_{i}) \right)^{2} \norm{ x - x^{*} }_{A}^{2} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]
$\Box$

Theorem 5.5

Let $k < r$.

\[\begin{eqnarray} \norm{x_{k+1} - x^{*}}_{A}^{2} & \le & \left( \frac{ \lambda_{n-k} - \lambda_{1} }{ \lambda_{n-k} + \lambda_{1} } \right)^{2} \norm{x_{0} - x^{*}}_{A}^{2} . \label{equation_05_35} \end{eqnarray}\]

proof

Let

\[\begin{eqnarray} Q_{k}(\lambda) & := & \frac{ (-1)^{k} }{ \tau_{1} \cdots \tau_{k} } (\lambda - \tau_{1}) \cdots (\lambda - \tau_{r}) \nonumber \end{eqnarray}\] \[\begin{eqnarray} \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} & \le & \max_{1 \le i \le n} Q_{r}^{2}(\lambda_{i}) \nonumber \\ & = & \max_{1 \le i \le n} \left( \frac{ (-1)^{k} }{ \tau_{1} \cdots \tau_{k} } (\lambda_{i} - \tau_{1}) \cdots (\lambda_{i} - \tau_{k}) \right)^{2} \nonumber \\ & = & (\lambda_{n} - \tau_{1}) r \max_{1 \le i \le n} \left( \frac{ }{ \tau_{1} \cdots \tau_{k} } \right) \end{eqnarray}\]
$\Box$