5.1 The Linear Conjugate Grdient Method
The problem we consider
Problem 1
- $A$,
- $n \times n$ symmetrics positive definite matirx
- $b$,
- $n$ vector
Find $x \in \mathbb{R}^{n}$ which satisfy
\[\begin{equation} \min_{x} Ax = b . \label{euqation_05_01} \end{equation}\]Remark
The solution of problem1 is euivalent to the following problem.
\[\begin{eqnarray} \min_{x} \phi(x) := \frac{1}{2} x^{\mathrm{T}} A x - b^{\mathrm{T}}x \label{euqation_05_02} . \end{eqnarray}\]Indeed, since $A$ is positive symmetric definite, the solution of the quadratic problem uniquely exists.
\[\begin{eqnarray} \nabla \phi(x) & = & Ax - b & =: & r(x) . \label{equation_05_03} \end{eqnarray}\] \[\begin{eqnarray} r_{k} & := & A x_{k} - b . \label{equation_05_04} \end{eqnarray}\]Proposition
- \(\{p_{0}, \ldots, p_{l}\}\),
- conjugate with respocet to $A$,
Then \(\{p_{0}, \ldots, p_{l}\}\) is linearly independent.
proof
Definition conjugate
- $l \in \mathbb{N}$,
- \(\{p_{0}, \ldots, p_{l}\}\),
- nonzero vectors
- $p_{k} \in \mathbb{R}^{n}$,
- $A$,
- $n \times n$ symmetric positive definite matrix
\(\{p_{0}, \ldots, p_{l}\}\) is said to be conjugate with respect to $A$ if
\[\begin{eqnarray} p_{i}^{\mathrm{T}} A p_{j} = 0 \quad (i \neq j) . \label{equation_05_05} \end{eqnarray}\]Definition conjugate direction method
- \(\{p_{0}, \ldots, p_{n-1}\}\),
- $x_{0} \in \mathbb{R}^{n}$,
Step0. $k := 0$,
Step1.
\[\begin{eqnarray} \alpha_{k} & := & - \frac{ r_{k}^{\mathrm{T}}p_{k} }{ p_{k}^{\mathrm{T}} A p_{k} } \nonumber \\ & = & - \frac{ (A x_{k} - b)^{\mathrm{T}}p_{k} }{ p_{k}^{\mathrm{T}} A p_{k} } \label{equation_05_07} \end{eqnarray}\]Step2.
\[\begin{equation} x_{k + 1} := x_{k} + \alpha_{k} p_{k} \label{equation_05_06} \end{equation}\]Step3. $k \leftarrow k + 1$,
Step4. If $k = n$, it finishes the algorithm. Otherwise go to Step1.
Theorem 5.1
- $x_{0} \in \mathbb{R}^{n}$,
- \(\{x_{0}, \ldots, x_{n-1}\}\),
- The sequence generated by \(\{x_{0}, \ldots, x_{n-1}\}\)
The sequence covergexs to the solution of the linear system \(\eqref{equation_05_01}\) in at most $n$ steps.
proof
Since \(\{p_{0}, \ldots, p_{l}\}\) is linearly indepdent, there exist \(\sigma_{0}, \ldots, \sigma_{n-1}\) such that
\[\begin{eqnarray} x^{*} - x_{0} := \sigma_{0} p_{0} + \cdots + \sigma_{n-1} p_{n-1} . \end{eqnarray}\]By multiplying the equation by $p_{k}^{\mathrm{T}} A$ and using the conjugacy
\[\begin{eqnarray} & & x^{*} - x_{0} = \sigma_{0} p_{0} + \cdots + \sigma_{n-1} p_{n-1} \nonumber \\ & \Leftrightarrow & p_{k}^{\mathrm{T}}A (x^{*} - x_{0)} = p_{k}^{\mathrm{T}}A \left( \sigma_{0} p_{0} + \cdots + \sigma_{n-1} p_{n-1} \right) \nonumber \\ & \Leftrightarrow & p_{k}^{\mathrm{T}}A (x^{*} - x_{0)} = p_{k}^{\mathrm{T}} A \sigma_{k} p_{k} \nonumber \\ & \Leftrightarrow & \sigma_{k} = \frac{ p_{k}^{\mathrm{T}} A (x^{*} - x_{0}) }{ p_{k}^{\mathrm{T}}A p_{k} } \label{equation_05_08} \end{eqnarray}\] \[\begin{eqnarray} & & x_{k} = x_{k-1} + \alpha_{k-1} p_{k-1} \nonumber \\ & \Leftrightarrow & x_{k} = x_{0} + p_{1} x_{1} + \cdots + \alpha_{k-1} p_{k-1} \nonumber \\ & \Leftrightarrow & x_{k} - x_{0} = p_{1} x_{1} + \cdots + \alpha_{k-1} p_{k-1} \nonumber \\ & \Leftrightarrow & p_{k}^{\mathrm{T}}A (x_{k} - x_{0}) = 0 . \end{eqnarray}\]Hence
\[\begin{eqnarray} p_{k}^{\mathrm{T}}A (x^{*} - x_{0}) & = & p_{k}^{\mathrm{T}}A (x^{*} - x_{k} + x_{k} - x_{0}) \nonumber \\ & = & p_{k}^{\mathrm{T}}A (x^{*} - x_{k}) \nonumber \\ & = & p_{k}^{\mathrm{T}} (b - Ax_{k}) \nonumber \\ & = & - p_{k}^{\mathrm{T}} r_{k} . \nonumber \end{eqnarray}\]Therefore, $\sigma_{k} = \alpha_{k}$.
Theorem 5.2
- $x_{0} \in \mathbb{R}^{n}$,
- \(\{x_{0}, \ldots, x_{n}\}\),
- The sequence generated by the conjugate direction algorithm
Then
\[\begin{equation} \forall i = 0, \ldots, k-1, \ r_{k}^{\mathrm{T}}p_{i} = 0 \label{equation_05_11} \end{equation} .\]$x_{k}$ is the minimizer of $\phi(x)$ over set
\[\begin{eqnarray} D_{k} := \{ x \mid x = x_{0} + \mathrm{span}\{p_{0}, \ldots, p_{k-1}\} \} . \label{equation_05_12} \end{eqnarray}\]proof
We first show that a point $\tilde{x}$ minimizes $\phi$ over the set if and only if
\[\forall, i = 0, \ldots, k -1, r(\tilde{x})^{\mathrm{T}} p_{i} = 0 .\]Indeed, let us define
\[h(\sigma) := \phi(x_{0} + \sigma_{0}p_{0} + \cdots + \sigma_{k-1}p_{k-1}) .\]Since composition of convex functions is also convex, $h$ is a strictly convex function. It has a unique minimizer $\sigma^{*}$ that satisfies
\[\begin{eqnarray} \forall i = 0, 1, \ldots, k-1, \ & & \frac{ \partial h(\sigma^{*}) }{ \partial \sigma_{i} } = 0 \nonumber \\ & \Leftrightarrow & \nabla \phi(x_{0} + \sigma_{0}^{*}p_{0} + \cdots + \sigma_{k-1}^{*}p_{k-1}) p_{i} = 0 \nonumber \\ & \Leftrightarrow & r(\tilde{x}) p_{i} = 0 \nonumber \end{eqnarray} .\]We use the induction to prove the statement. For the case $k = 1$,
\[\begin{eqnarray} r_{1}^{\mathrm{T}}p_{0} & = & (Ax_{1} - b)^{\mathrm{T}} p_{0} \nonumber \\ & = & (Ax_{0} + A\alpha_{0}p_{0} - b)^{\mathrm{T}} p_{0} \\ & = & (x_{0}^{\mathrm{T}}A + \alpha_{0}p_{0}^{\mathrm{T}}A - b^{\mathrm{T}}) p_{0} \\ & = & x_{0}^{\mathrm{T}}A p_{0} + \alpha_{0}p_{0}^{\mathrm{T}}A p_{0} - b^{\mathrm{T}} p_{0} \nonumber \\ & = & x_{0}^{\mathrm{T}}A p_{0} + r_{0}^{\mathrm{T}}p_{0} - b^{\mathrm{T}} p_{0} \nonumber \\ & = & r_{0}^{\mathrm{T}} p_{0} + r_{0}^{\mathrm{T}}p_{0} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]Let us assume that the statement holds up to $k = l$. For the case $k = l + 1$,
\[\begin{eqnarray} r_{k+1} & = & Ax_{k+1} - b \nonumber \\ & = & Ax_{k} + A\alpha_{k}p_{k} - b \nonumber \\ & = & r_{k} + A\alpha_{k}p_{k} . \nonumber \end{eqnarray}\]Hence
\[\begin{eqnarray} p_{k}^{\mathrm{T}}r_{k+1} & = & p_{k}^{\mathrm{T}} \left( r_{k} + A\alpha_{k}p_{k} \right) \nonumber \\ & = & p_{k}^{\mathrm{T}} r_{k} - r_{k}^{\mathrm{T}} p_{k} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]By conjugacy and the assumption of the induction, for all $i = 0, \ldots, k-1$,
\[\begin{eqnarray} p_{i}^{\mathrm{T}}r_{k+1} & = & p_{i}^{\mathrm{T}} \left( r_{k} + A\alpha_{k}p_{k} \right) \nonumber \\ & = & p_{i}^{\mathrm{T}} A\alpha_{k}p_{k} \nonumber \\ & = & 0 \label{equation_05_conjugacy} . \end{eqnarray}\]Algorithm 5.1 CG-Preliminary Version
- $x_{0} \in \mathbb{R}^{n}$,
Definition Krylov subspace
Krylov subspace of degree $k$ for $r_{0}$ is defined as
\[\mathcal{K}(r_{0}; k) := \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k}r_{0} \} .\]Theorem 5.3
- \(\{x_{k}\}\),
- the seqeunce generated by conjugate gradient method
Suppose that the $k$-th iterate generated by the conjugate gradient method is not hte solution point $x^{*}$.
\[\begin{eqnarray} r_{k}^{\mathrm{T}}r_{i} & = & 0 \label{equation_05_16} \\ \mathrm{span}\{r_{0}, \ldots, r_{k}\} & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}\} \label{equation_05_17} \\ \mathrm{span}\{p_{0}, \ldots, p_{k}\} & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}\} \label{equation_05_18} \\ p_{k}^{\mathrm{T}}Ap_{i} & = & 0 \label{equation_05_19} \end{eqnarray}\]Therefore, the sequence \(\{x_{k}\}\) converges to $x^{*}$ in at most $n$ steps.
proof
We prove by induction. For $k = 0$, \(\eqref{equation_05_17}\) and \(\eqref{equation_05_18}\) hold. For $k = 1$,
\[\begin{eqnarray} p_{1}^{\mathrm{T}}Ap_{0} & = & (-r_{1} + \beta_{1} p_{0})^{\mathrm{T}} A p_{0} \nonumber \\ & = & \left( -r_{1} + p_{0} \frac{ r_{1}^{\mathrm{T}}Ap_{0} }{ p_{0}^{\mathrm{T}}Ap_{0} } \right)^{\mathrm{T}} A p_{0} \nonumber \\ & = & -r_{1}^{\mathrm{T}} Ap_{0} + r_{1}^{\mathrm{T}}Ap_{0} . \nonumber \end{eqnarray}\]Let us assume that the \(\eqref{equation_05_17}\) and \(\eqref{equation_05_18}\) hold up to $k$. We first show \(r_{k+1} \in \mathrm{span}\{r_{0}, \ldots, A^{k+1}r_{0}\}\). By the assumption of the induction,
\[\begin{eqnarray} r_{k} & \in & \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k}r_{0} \} \nonumber \\ p_{k} & \in & \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k}r_{0} \} \nonumber \end{eqnarray}\]Hence
\[\begin{eqnarray} Ap_{k} & \in & \mathrm{span}\{ Ar_{0}, A^{2} r_{0}, \ldots, A^{k+1}r_{0} \} \nonumber \end{eqnarray}\]Combining these results,
\[\begin{eqnarray} r_{k+1} & = & r_{k} + \alpha_{k}Ap_{k} \nonumber \\ & \in & \mathrm{span}\{ r_{0}, A r_{0}, \ldots, A^{k+1}r_{0} \} . \nonumber \end{eqnarray}\]That is,
\[\begin{eqnarray} \mathrm{span}\{r_{0}, \ldots, A^{k+1}r_{0}\} \supseteq \mathrm{span}\{r_{0}, \ldots, r_{k+1}\} \end{eqnarray}\]We next show \(A^{k+1}r_{0} \in \mathrm{span}\{r_{0}, \ldots, r_{k+1}\}\). By our assumption, \(A^{k} r_{0} \in \mathrm{span}\{p_{0}, \ltots, p_{k}\}\). Thus,
\[\begin{eqnarray} A^{k+1} r_{0} & = & A (A^{k} r_{0}) \nonumber \\ & \in & \mathrm{span}\{Ap_{0}, \ldots, Ap_{k}\} \nonumber \\ & = & \mathrm{span} \left\{ A(r_{1} - r_{0})/\alpha_{0}, \ldots, A(r_{k+1} - r_{k-1})/\alpha_{k}, \right\} \nonumber \\ & = & \mathrm{span} \left\{ r_{0}, \ldots, r_{k+1} \right\} . \nonumber \end{eqnarray}\]Therefore,
\[\mathrm{span}\{r_{0}, \ldots, A^{k+1}r_{0}\} \subseteq \mathrm{span}\{r_{0}, \ldots, r_{k+1}\} .\]For \(\eqref{equation_05_18}\),
\[\begin{eqnarray} \mathrm{span}\{p_{0}, \ldots, p_{k}, p_{k+1}\} & = & \mathrm{span}\{p_{0}, \ldots, p_{k}, -r_{k+1} + \beta_{k+1}p_{k}\} \nonumber \\ & = & \mathrm{span}\{p_{0}, \ldots, p_{k}, -r_{k+1} + \beta_{k+1}p_{k}\} \nonumber \\ & = & \mathrm{span}\{p_{0}, \ldots, p_{k}, r_{k+1}\} \nonumber \\ & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}, r_{k+1}\} \nonumber \\ & = & \mathrm{span}\{r_{0}, \ldots, r_{k}, r_{k+1}\} \nonumber \\ & = & \mathrm{span}\{r_{0}, \ldots, A^{k}r_{0}, A^{k+1}r_{0}\} . \nonumber \end{eqnarray}\]Next, we prove \(\eqref{equation_05_19}\).
\[\begin{eqnarray} p_{k+1}^{\mathrm{T}}Ap_{k} & = & -r_{k+1}^{\mathrm{T}}Ap_{k} + \beta_{k+1}p_{k}^{\mathrm{T}}Ap_{k} \nonumber \\ & = & -r_{k+1}^{\mathrm{T}}Ap_{k} + \frac{ r_{k+1}Ap_{k} }{ p_{k}^{\mathrm{T}}Ap_{k} } p_{k}^{\mathrm{T}}A p_{k} \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]Let $i = 0, \ldots, k-1$ be fixed. By \(\eqref{equation_05_18}\),
\[\begin{eqnarray} A p_{i} & \in & \mathrm{span}\{Ar_{0}, \ldots, A A^{i}r_{0}\} \nonumber \\ & \subseteq & \mathrm{span}\{p_{0}, \ldots, p^{i+1}\} \quad (\because \eqref{equation_05_18}) \nonumber \end{eqnarray}\]By Theorem 5.2,
\[\begin{eqnarray} r_{k+1}^{\mathrm{T}} A p_{i} & = & r_{k+1}^{\mathrm{T}} \sum_{l=0}^{i+1} \sigma_{i} p_{i} \nonumber \\ & = & \sum_{l=0}^{i+1} \sigma_{i} r_{k+1}^{\mathrm{T}} p_{i} \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]Thus,
\[\begin{eqnarray} p_{k+1}^{\mathrm{T}}Ap_{i} & = & - r_{k+1}^{\mathrm{T}} A p_{i} + \beta_{k+1} p_{k}^{\mathrm{T}} A p_{i} \nonumber \\ & = & - r_{k+1}^{\mathrm{T}} A p_{i} \quad (\because \text{induction hypothesis}) \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]Therefore, \(\eqref{equation_05_19}\) is proved. Since \(\{p_{i}\}\) is conjugate, Theorem 5.1 ensure that the algorithm terminates in at most $n$ iterations.
For \(\eqref{equation_05_16}\), let $i = 1, \ldots, k - 1$ be fixed.
\[\begin{eqnarray} r_{k}^{\mathrm{T}}r_{i} & = & r_{k}^{\mathrm{T}} (\beta_{i}p_{i-1} - p_{i}) \nonumber \\ & = & r_{k}^{\mathrm{T}}\beta_{i}p_{i-1} - r_{k}^{\mathrm{T}} p_{i} \nonumber \\ & = & 0 . \quad (\because \eqref{equation_05_conjugacy}) \end{eqnarray}\]In case of $i = 0$,
\[\begin{eqnarray} r_{k}^{\mathrm{T}}r_{0} & = & r_{k}^{\mathrm{T}} p_{0} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]Algorithm 5.2 CG
- $x_{0} \in \mathbb{R}^{n}$,
- $k \leftarrow 0$,
while $r_{k} \neq 0$,
\[\begin{eqnarray} \alpha_{k} & \leftarrow & \frac{ r_{k}^{\mathrm{T}}r_{k} }{ p_{k}^{\mathrm{T}}Ap_{k} } \label{equation_05_24_a} \\ x_{k+1} & \leftarrow & x_{k} + \alpha_{k} p_{k} \label{equation_05_24_b} \\ r_{k+1} & \leftarrow & r_{k} + \alpha_{k} A p_{k} \label{equation_05_24_c} \\ \beta_{k+1} & \leftarrow & \frac{ r_{k+1}^{\mathrm{T}} r_{k+1} }{ r_{k}^{\mathrm{T}}r_{k} } \label{equation_05_24_d} \\ p_{k+1} & \leftarrow & - r_{k+1} + \beta_{k+1} p_{k} \label{equation_05_24_e} \\ k & \leftarrow & k + 1 \label{equation_05_24_f} \end{eqnarray}\]end (while)
where \(\gamma_{1}, \ldots, \gamma_{k}\) are some constants. Let
\[P_{k}^{*}(\lambda) := \gamma_{0} \lambda^{0} + \gamma_{1} \lambda^{1} + \cdots + \gamma_{k}\lambda^{k} .\]Then \(\eqref{equation_05_25}\) can be expressed as
\[\begin{eqnarray} x_{k+1} & = & x_{0} + P_{k}^{*}(A) r_{0} . \label{equation_05_26} \end{eqnarray}\]Let
\[\begin{eqnarray} \norm{z}_{A} & := & z^{\mathrm{T}}Az . \label{equation_05_27} \end{eqnarray}\]With this norm,
\[\begin{eqnarray} \frac{1}{2} \norm{x - x^{*}}_{A}^{2} & = & \phi(x) - \phi(x^{*}) . \label{equation_05_28} \end{eqnarray}\] \[\begin{eqnarray} \min_{P_{k}} \norm{ x_{0} + P_{k}(A) r_{0} - x^{*} }_{A} \end{eqnarray}\] \[\begin{eqnarray} r_{0} & = & A x_{0} - b \nonumber \\ & = & A x_{0} - A x^{*} \nonumber \\ & = & A( x_{0} - x^{*}) \nonumber \end{eqnarray}\] \[\begin{eqnarray} x_{k+1} - x^{*} & = & x_{0} + P_{k}^{*}(A) r_{0} \nonumber \\ & = & (I + P_{k}^{*}(A)A)(x_{0} - x^{*}) \label{equation_05_30} \end{eqnarray}\]Let \(0 < \lambda_{1} \le \lambda_{2} \le \cdots \le \lambda_{n}\) be the eigenvalues of $A$. Let \(v_{1}, \ldots, v_{n}\) be the corresponding orthogonal eigenvectors.
\[\begin{eqnarray} A & = & \sum_{i=1}^{n} \lambda_{i}v_{i}v_{i}^{\mathrm{T}} \nonumber \end{eqnarray}\]Since $v_{i}$ spans the whole space $\mathbb{R}^{n}$, there exist \(\{xi_{i}\}\) such that
\[\begin{eqnarray} x_{0} - x^{*} & = & \sum_{i=1}^{n} \xi_{i}v_{i} . \label{equation_05_31} \end{eqnarray}\] \[\begin{eqnarray} P_{k}(A) v_{i} & = & \left( \gamma_{0}A^{0} + \cdots + \gamma_{k}A^{k} \right) v_{i} \nonumber \\ & = & \left( \gamma_{0}\lambda^{0} + \cdots + \gamma_{k}\lambda^{k} \right) v_{i} \nonumber \\ & = & P_{k}^{*}(\lambda_{i}) v_{i} . \end{eqnarray}\] \[\begin{eqnarray} x_{k+1} - x^{*} & = & x_{0} + \gamma_{0} r_{0} + \sum_{i=1}^{k} \gamma_{i}A^{i}r_{0} - x^{*} \nonumber \\ & = & x_{0} + P_{k}^{*}(A) r_{0} - x^{*} \nonumber \\ & = & x_{0} + P_{k}^{*}(A) A(x_{0} - x^{*}) - x^{*} \nonumber \\ & = & (I + P_{k}^{*}(A)A) (x_{0} - x^{*}) \nonumber \\ & = & (I + P_{k}^{*}(A)A) \left( \sum_{i=1}^{n} \xi_{i} v_{i} \right) \nonumber \\ & = & \sum_{i=1}^{n} \left( \xi_{i} v_{i} + \xi_{i} \lambda_{i} P_{k}^{*}(\lambda_{i}) v_{i} \right) \nonumber \\ & = & \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} \nonumber \end{eqnarray}\]Hence
\[\begin{eqnarray} \norm{ x_{k+1} - x^{*} }_{A}^{2} & = & \norm{ \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} }_{A}^{2} \nonumber \\ & = & \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} \right)^{\mathrm{T}} A \left( \left( 1 + \lambda_{j} P_{k}^{*}(\lambda_{j}) \right) \xi_{j} v_{j} \right) \nonumber \\ & = & \sum_{i=1}^{n} \sum_{j=1}^{n} \left( \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i} \right)^{\mathrm{T}} \left( \left( 1 + \lambda_{j} P_{k}^{*}(\lambda_{j}) \right) \xi_{j} \lambda_{j} v_{j} \right) \nonumber \\ & = & \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} v_{i}^{\mathrm{T}} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right) \xi_{i} \lambda_{i} v_{i} \nonumber \\ & = & \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}^{*}(\lambda_{i}) \right)^{2} \xi_{i}^{2} \lambda_{i} \nonumber \end{eqnarray}\]By thorem 5.2, the sequence \(\{x_{k+1}\}\) generated by CG method minimizes over \(x_{0} + \mathrm{span}\{r_{0}, Ar_{0}, \ldots, A^{k}r_{0}\}\).
\[\begin{eqnarray} \phi(x^{*}) & = & \min_{x} \phi(x) \nonumber \\ & = & \min_{a_{i} \in \mathbb{R}^{k}} \phi(x_{0} + a_{1}r_{0} + \cdots + A^{k}r_{0} a_{k}) \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree }k} \phi(x_{0} + P_{k}(A)r_{0}) \nonumber . \end{eqnarray}\]where
\[\begin{eqnarray} P_{k}(x) := x + \cdots + x^{k} . \end{eqnarray}\]For any $k$, \(x_{k+1} \in (x_{0} + \mathrm{span}\{r_{0}, Ar_{0}, \ldots, A^{k}r_{0}\}\), Hence
\[\begin{eqnarray} \min_{x \in \{x_{0}, \ldots, x_{n}\}} \norm{ x - x^{*} }_{A}^{2} & = & \min_{P_{k}: \text{polynomial with degree }k} \sum_{i=1}^{n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \xi_{i}^{2} \lambda_{i} \nonumber \\ & \le & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \left( \sum_{j=1}^{n} \xi_{j}^{2} \lambda_{j} \right) \nonumber \\ & \le & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \left( \sum_{j=1}^{n} \xi_{j}^{2} \lambda_{j} \right) \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \left( \sum_{j=1}^{n} \sum_{l=1}^{n} \xi_{j} v_{j}^{\mathrm{T}} A \xi_{l} v_{l} \right) \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \norm{ \xi_{j} v_{j}^{\mathrm{T}} }_{A}^{2} \nonumber \\ & = & \min_{P_{k}: \text{polynomial with degree}k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} \norm{ x - x^{*} }_{A}^{2} \quad (\because \text{by definition}) \label{equation_05_34} . \end{eqnarray}\]Theorem 5.4
- $\lambda_{1}, \ldots, \lambda_{n}$,
- eigenvalues of $A$,
If $A$ has only $r$ distinct eigenvalues, then the CC iteration will terminate at the solution in at most $r$ iteration.
proof
\[Q_{r}(\lambda) := \frac{ (-1)^{r} }{ \tau_{1} \cdots \tau_{r} } (\lambda - \tau_{1}) \cdots (\lambda - \tau_{r}) .\]It’s easy to confirm
\[\forall i = 1, \ldots, n, \ Q_{r}(\lambda_{i}) = 0 .\]and
\[Q_{r}(0) = 1 .\]Hence \(Q_{r}(\lambda) - 1\) is polynonimal of degree $r$ with a root at $\lambda = 0$. Let
\[\bar{P}_{r-1}(\lambda) := \frac{ Q_{r}(\lambda) - 1 }{ \lambda } .\]\(\bar{P}_{r-1}\) is polynomial of degree $r-1$.
\[\begin{eqnarray} 0 & \le & \min_{P_{r-1}} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{r-1}(\lambda_{i}) \right)^{2} \nonumber \\ & \le & \max_{1 \le i \le n} \left( 1 + \lambda_{i} \bar{P}_{r-1}(\lambda_{i}) \right)^{2} \nonumber \\ & \le & \max_{1 \le i \le n} Q_{r}^{2}(\lambda_{i}) \nonumber \\ & = & 0 \nonumber \end{eqnarray}\]By setting $k=r-1$, \(\eqref{equation_05_34}\) is
\[\begin{eqnarray} \norm{x_{r} - x^{*}}_{A} & = & \min_{P_{r-1}: \text{polynomial with degree}k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{r-1}(\lambda_{i}) \right)^{2} \norm{ x - x^{*} }_{A}^{2} \nonumber \\ & = & 0 . \nonumber \end{eqnarray}\]Theorem 5.5
- $r$
- the number of distinct eigenvalues
Let $k < r$.
\[\begin{eqnarray} \norm{x_{k+1} - x^{*}}_{A}^{2} & \le & \left( \frac{ \lambda_{n-k} - \lambda_{1} }{ \lambda_{n-k} + \lambda_{1} } \right)^{2} \norm{x_{0} - x^{*}}_{A}^{2} . \label{equation_05_35} \end{eqnarray}\]proof
Let
\[\begin{eqnarray} Q_{k}(\lambda) & := & \frac{ (-1)^{k} }{ \tau_{1} \cdots \tau_{k} } (\lambda - \tau_{1}) \cdots (\lambda - \tau_{r}) \nonumber \end{eqnarray}\] \[\begin{eqnarray} \min_{P_{k}: \text{polynomial with degree }k} \max_{1 \le i \le n} \left( 1 + \lambda_{i} P_{k}(\lambda_{i}) \right)^{2} & \le & \max_{1 \le i \le n} Q_{r}^{2}(\lambda_{i}) \nonumber \\ & = & \max_{1 \le i \le n} \left( \frac{ (-1)^{k} }{ \tau_{1} \cdots \tau_{k} } (\lambda_{i} - \tau_{1}) \cdots (\lambda_{i} - \tau_{k}) \right)^{2} \nonumber \\ & = & (\lambda_{n} - \tau_{1}) r \max_{1 \le i \le n} \left( \frac{ }{ \tau_{1} \cdots \tau_{k} } \right) \end{eqnarray}\]