cell_dependencies!$3ec0e31c-3d09-43d1-9d9a-506320f7d964precedence_heuristic cell_id$3ec0e31c-3d09-43d1-9d9a-506320f7d964downstream_cells_mapupstream_cells_map@md_strgetindex$3c3fabfc-f7e1-4e93-acfa-48c7448f968aprecedence_heuristic cell_id$3c3fabfc-f7e1-4e93-acfa-48c7448f968adownstream_cells_mapupstream_cells_map@md_strgetindex$08508856-57bd-4d17-b8c6-3b0150e5b9afprecedence_heuristic cell_id$08508856-57bd-4d17-b8c6-3b0150e5b9afdownstream_cells_mapmake_circlepupstream_cells_maperror_Bauer_Fike$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832exact_eigenvalues$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832show_geschgorin$a7e14252-02ee-49c1-9f46-5699cf590923computed_eigenvalues$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832mapM$50024c47-f094-4fed-8ec1-1861a982a49a/plot!πerror_Kato_Temple$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832==geschgorin_centres$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832scatter!geschgorin_radii$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832:zerossizeshow_kato_temple$a7e14252-02ee-49c1-9f46-5699cf590923plot+*show_bauer_fike$a7e14252-02ee-49c1-9f46-5699cf590923cossin$bc4e51aa-674e-4a46-a20d-d9bfd82ff108precedence_heuristic cell_id$bc4e51aa-674e-4a46-a20d-d9bfd82ff108downstream_cells_mapupstream_cells_map@md_strgetindex$54fea52c-6adb-4c83-b380-416bfdaeacecprecedence_heuristic cell_id$54fea52c-6adb-4c83-b380-416bfdaeacecdownstream_cells_mapupstream_cells_map@md_strgetindex$ef9e68d5-b5d5-456e-a13d-8c0149c6b2d6precedence_heuristic cell_id$ef9e68d5-b5d5-456e-a13d-8c0149c6b2d6downstream_cells_mapupstream_cells_map@md_strM$50024c47-f094-4fed-8ec1-1861a982a49alatexify_mdgetindex$e5c087db-1df5-4654-a5f3-d4c27a5a9782precedence_heuristic cell_id$e5c087db-1df5-4654-a5f3-d4c27a5a9782downstream_cells_mapupstream_cells_mapString@htl HypertextLiteral.attribute_valueHypertextLiteral.ResultRobustLocalResourceHypertextLiteral.BypassHypertextLiteral.contentMarkdown.parseHypertextLiteral$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4aMarkdownread$f172a676-599b-4be7-9b5d-ceba6b1794dcprecedence_heuristic cell_id$f172a676-599b-4be7-9b5d-ceba6b1794dcdownstream_cells_mapupstream_cells_map@md_strgetindex$5466b137-915b-4118-9dbf-f97750303784precedence_heuristic cell_id$5466b137-915b-4118-9dbf-f97750303784downstream_cells_mapupstream_cells_map@md_strgetindex$c7c14c64-133f-44e2-83ea-abde368de440precedence_heuristic cell_id$c7c14c64-133f-44e2-83ea-abde368de440downstream_cells_mapupstream_cells_map@md_strgetindex$371399ec-a571-4aed-b910-7307d90edad1precedence_heuristic cell_id$371399ec-a571-4aed-b910-7307d90edad1downstream_cells_mapupstream_cells_map@md_strgetindex$118355a0-2104-4216-9d11-53079fa2024dprecedence_heuristic cell_id$118355a0-2104-4216-9d11-53079fa2024ddownstream_cells_mapupstream_cells_map@md_strgetindex$a70cff6b-b3d6-4f49-aed0-4d01813d6a95precedence_heuristic cell_id$a70cff6b-b3d6-4f49-aed0-4d01813d6a95downstream_cells_mapupstream_cells_map@md_strgetindex$5e85fbbb-b9b6-49f2-b33f-539ddd7f4cc0precedence_heuristic cell_id$5e85fbbb-b9b6-49f2-b33f-539ddd7f4cc0downstream_cells_mapupstream_cells_map@md_strgetindex$440fa8bb-c502-4c51-bd80-b0a86d3b8507precedence_heuristic cell_id$440fa8bb-c502-4c51-bd80-b0a86d3b8507downstream_cells_mapupstream_cells_map@md_strgetindex$1bae37c2-946f-467e-9b1d-c3ca5a129b98precedence_heuristic cell_id$1bae37c2-946f-467e-9b1d-c3ca5a129b98downstream_cells_mapupstream_cells_map@md_strgetindex$7eba1e23-9f38-4478-84f4-e81ef653ba2cprecedence_heuristic cell_id$7eba1e23-9f38-4478-84f4-e81ef653ba2cdownstream_cells_mapupstream_cells_map@md_strgetindex$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4aprecedence_heuristiccell_id$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4adownstream_cells_mapLinearAlgebraPlutoUI$45f6befb-4518-4467-81f3-ecd6da88ccbb$a7e14252-02ee-49c1-9f46-5699cf590923PlotsHypertextLiteral$e5c087db-1df5-4654-a5f3-d4c27a5a9782LaTeXStrings$f6e33422-d41e-4444-b08b-b281f5eb0db3PlutoTeachingToolsTikzPicture$f6e33422-d41e-4444-b08b-b281f5eb0db3upstream_cells_mapRobustLocalResourceMarkdown.parseStringMarkdownread$760385db-cc30-40c1-8f66-5692028a96e3precedence_heuristic cell_id$760385db-cc30-40c1-8f66-5692028a96e3downstream_cells_mapupstream_cells_mapTableOfContents$4eb5e944-000f-4755-b039-63591d1dab8aprecedence_heuristic cell_id$4eb5e944-000f-4755-b039-63591d1dab8adownstream_cells_mapupstream_cells_map@md_strgetindex$a3e89cc9-55b6-423f-874a-b923026847eaprecedence_heuristic cell_id$a3e89cc9-55b6-423f-874a-b923026847eadownstream_cells_mapupstream_cells_map@md_strgetindex$c00bddac-a453-4831-81be-d5814a7e88b9precedence_heuristic cell_id$c00bddac-a453-4831-81be-d5814a7e88b9downstream_cells_mapupstream_cells_map@md_strgetindex$50024c47-f094-4fed-8ec1-1861a982a49aprecedence_heuristic cell_id$50024c47-f094-4fed-8ec1-1861a982a49adownstream_cells_mapM$ef9e68d5-b5d5-456e-a13d-8c0149c6b2d6$08508856-57bd-4d17-b8c6-3b0150e5b9af$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832upstream_cells_mapM12$45f6befb-4518-4467-81f3-ecd6da88ccbbM14$45f6befb-4518-4467-81f3-ecd6da88ccbbM13$45f6befb-4518-4467-81f3-ecd6da88ccbbM23$45f6befb-4518-4467-81f3-ecd6da88ccbb$a7e14252-02ee-49c1-9f46-5699cf590923precedence_heuristic cell_id$a7e14252-02ee-49c1-9f46-5699cf590923downstream_cells_mapshow_geschgorin$08508856-57bd-4d17-b8c6-3b0150e5b9afshow_bauer_fike$08508856-57bd-4d17-b8c6-3b0150e5b9afshow_kato_temple$08508856-57bd-4d17-b8c6-3b0150e5b9afupstream_cells_map@md_strCorePlutoUI$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4aPlutoUI.CheckBoxPlutoRunnerPlutoRunner.create_bondCore.applicablegetindexBase.get@bindBase$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832precedence_heuristic cell_id$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832downstream_cells_maperror_Bauer_Fike$08508856-57bd-4d17-b8c6-3b0150e5b9afexact_eigenvalues$08508856-57bd-4d17-b8c6-3b0150e5b9afgeschgorin_radii$08508856-57bd-4d17-b8c6-3b0150e5b9afcomputed_eigenvalues$08508856-57bd-4d17-b8c6-3b0150e5b9aferror_Kato_Temple$08508856-57bd-4d17-b8c6-3b0150e5b9afresidualsgeschgorin_centres$08508856-57bd-4d17-b8c6-3b0150e5b9afδcomputed_eigenvectorsupstream_cells_mapsum>islessFloat64eachcol *Proof.* > We set $\theta \equiv \theta(v, \tilde{v})$ and write $\tilde{v}=v \cos \theta+w \sin \theta$, where $w \perp v$ is chosen appropriately. > We have > ```math > \begin{align} > (A-\tilde{\lambda} I) \tilde{v} & =\cos \theta(A-\tilde{\lambda} I) v +\sin \theta(A-\tilde{\lambda} I) w \\ > & =\cos \theta({ \lambda} -\tilde{\lambda}) v+\sin \theta(A-\tilde{\lambda} I) w > \end{align} > ``` > Note that >```math > \begin{align} > \langle v, (A-\tilde{\lambda} I) w \rangle & =\langle(A-\tilde{\lambda} I) v, w\rangle \\ > & =(\lambda-\tilde{\lambda})\langle v, w\rangle=0, > \end{align} > ``` > i.e. the two vectors in the previous sum are orthogonal. > Therefore > ```math > \begin{align} > \|r \|_{2}^{2} & = \| (A-\tilde{\lambda} I ) \tilde{v} \|_{2}^{2} \\ > & =\cos ^{2} \theta \ |\lambda-\tilde{\lambda}|^{2}+\sin ^{2} \theta \ \|(A-\tilde{\lambda} I) w\|_{2}^{2} > \end{align} > ``` > Hence, > ```math > \sin ^{2} \theta\|(A-\tilde{\lambda} I) w\|_{2}^{2} \leq\|r\|_{2}^{2} > ``` > Since $w \perp v$, >```math > \begin{aligned} > \|(A-\tilde{\lambda} I) w\|_{2} & =\left\|\sum_{\substack{i=1 \\ > \lambda_{i} \neq \lambda}}^{n}\left(\lambda_{i}-\tilde{\lambda}\right) v_{i} v_{i}^{H} w\right\|_{2} \\ > & \geq \min _{\substack{i=1, \dots, n \\ > \lambda_i \neq \lambda}}\left|\lambda_{i}-\tilde{\lambda}\right|=\delta > \end{aligned} >``` > This concludes the proof. $\hspace{11cm} \square$ """metadatashow_logsèdisabled®skip_as_script«code_folded$08508856-57bd-4d17-b8c6-3b0150e5b9afcell_id$08508856-57bd-4d17-b8c6-3b0150e5b9afcodebegin function make_circle(xmidpoint, radius; n=100) θs = (0:n) .* 2π ./ n map(radius .* cos.(θs), radius .* sin.(θs)) do x, y x + xmidpoint, y end end p = plot(; aspect_ratio=1.0, xlims=[0, 6], ylims=[-1.5, 1.5], legend=:bottomright) scatter!(p, exact_eigenvalues, zeros(size(M, 1)); label="eigenvalues", mark=:x, markersize=8) scatter!(p, computed_eigenvalues, zeros(size(M, 1)); label="computed", mark=:+, markersize=8) for i in 1:size(M, 1) if show_geschgorin label = i==1 ? "Gerschgorin" : "" circle = make_circle(geschgorin_centres[i], geschgorin_radii[i]) plot!(p, circle; color=3, label, lw=2) end if show_bauer_fike label = i==1 ? "Bauer-Fike" : "" circle = make_circle(computed_eigenvalues[i], error_Bauer_Fike[i]) plot!(p, circle; color=4, label, lw=2) end if show_kato_temple label = i==1 ? "Kato-Temple" : "" circle = make_circle(computed_eigenvalues[i], error_Kato_Temple[i]) plot!(p, circle ; color=5, label, lw=2) end end p endmetadatashow_logsèdisabled®skip_as_script«code_folded$bc4e51aa-674e-4a46-a20d-d9bfd82ff108cell_id$bc4e51aa-674e-4a46-a20d-d9bfd82ff108code"md""" ## Error bounds showcase """metadatashow_logsèdisabled®skip_as_script«code_folded$54fea52c-6adb-4c83-b380-416bfdaeaceccell_id$54fea52c-6adb-4c83-b380-416bfdaeaceccodemd""" !!! note "Lemma 3" Let $\tilde{v}$ be an approximate eigenvector of a Hermitian matrix $A$ with (for simplicity) $\|\tilde{v}\| = 1$. Let $\tilde \lambda$ be the associated approximated eigenvalue, calculated from $\tilde{\lambda}=\langle\tilde{v}, A \tilde{v}\rangle=R_{A}(\tilde{v})$. Further let $(\alpha, \beta)$ be an interval that contains no eigenvalue of $A$, but let $\tilde{\lambda} \in(\alpha, \beta)$. Then ```math (\beta-\tilde{\lambda})(\tilde{\lambda}-\alpha) \leq\|r\|_{2}^{2} . ``` """metadatashow_logsèdisabled®skip_as_script«code_folded$ef9e68d5-b5d5-456e-a13d-8c0149c6b2d6cell_id$ef9e68d5-b5d5-456e-a13d-8c0149c6b2d6codemd""" Consider the near-diagonal matrix M = $(latexify_md(M)) with some Sliders to tune the off-diagonal elements, shown below. As approximate eigenvectors we assume the unit vectors, that is $\tilde{v}_i = e_i$. As a result the corresponding approximate eigenvalues are just ```math R_M(\tilde{v}_i) = \tilde{v}_i^T M \tilde{v}_i = M_{ii}, ``` i.e. the diagonal entries of $M$. """metadatashow_logsèdisabled®skip_as_script«code_folded$e5c087db-1df5-4654-a5f3-d4c27a5a9782cell_id$e5c087db-1df5-4654-a5f3-d4c27a5a9782code;let RobustLocalResource("https://teaching.matmat.org/error-control/sidebar.md", "sidebar.md") Sidebar(toc, ypos) = @htl("""""") Sidebar(Markdown.parse(read("sidebar.md", String)), 265) endmetadatashow_logsèdisabled®skip_as_script«code_folded$f172a676-599b-4be7-9b5d-ceba6b1794dccell_id$f172a676-599b-4be7-9b5d-ceba6b1794dccodemd""" > *Proof.* > - First notice $r \perp \tilde{v}$ : > ```math > \begin{align*} > \langle\tilde{v}, r\rangle & =\left\langle\tilde{v}, A \tilde{v}-\tilde{\lambda}\, \tilde{v}\right\rangle \\ > & =\Big\langle\tilde{v}, A \tilde{v}-\left\langle\tilde{v}, A \tilde{v}\right\rangle \tilde{v}\Big\rangle \\ > & =\langle\tilde{v}, A \tilde{v}\rangle-\langle\tilde{v}, A \tilde{v}\rangle=0 \tag{4} > \end{align*} > ``` > - Using this result as well as the trick of adding and subtracting $\tilde{\lambda} \tilde{v}$: > ```math > \begin{align} > \Big\langle(A-\alpha I) \, \tilde{v}, (A-\beta I) \, \tilde{v}\Big\rangle > &=\Big\langle(A-\tilde{\lambda} I)\, \tilde{v} +(\tilde{\lambda}-\alpha)\, \tilde v,\\ > &\hspace{50pt} > (A-\tilde{\lambda} I) \, \tilde{v}+(\tilde{\lambda}-\beta)\, \tilde{v} \Big\rangle > \\ > & =\Big\langle r+(\tilde{\lambda}-\alpha) \tilde{v},\ r+(\tilde{\lambda}-\beta) \tilde{v}\Big\rangle > \\ > & \stackrel{(4)}{=}\|r\|_{2}^{2}+(\tilde{\lambda}-\alpha)(\tilde{\lambda}-\beta). \tag{5} > \end{align} > ``` > - Now expand $\tilde{v}$ in an eigenbasis of $A$, i.e. > ```math > \tilde{v}=\sum_{i=1}^{n} c_{i} v_{i} > ``` > where $v_i$ is an eigenvector of $A$ with associated eigenvalue $\lambda_i$. > - This yields for the left-hand side of (5) > ```math > \begin{align} > \big\langle(A-\alpha I) \, \tilde v,\ (A-\beta I) \, \tilde{v}\big\rangle & =\sum_{i=1}^{n}\left|c_{i}\right|^{2}\,(\lambda_i -\alpha)(\lambda_i -\beta) \\ > & \geq 0 > \end{align} > ``` > since the interval $(\alpha, \beta)$ contains no eigenpairs. > - Considering the right-hand side of (5), we obtain > ```math > \|r\|_{2}^{2}+(\tilde{\lambda}-\alpha)(\tilde{\lambda}-\beta) \geq 0 > ``` > which is the desired result. $\hspace{10cm} \square$ """metadatashow_logsèdisabled®skip_as_script«code_folded$5466b137-915b-4118-9dbf-f97750303784cell_id$5466b137-915b-4118-9dbf-f97750303784codemd""" !!! note "Theorem 2 (Bauer-Fike)" Let $A \in \mathbb{C}^{n \times n}$ Hermitian and let $(\tilde{\lambda}, \tilde{v})$ be an approximate eigenpair with $\|\tilde{v}\|_{2}=1$ and residual $r=A \tilde{v}-\tilde{\lambda} \tilde{v}$. Then, there exists an eigenvalue $\lambda$ of $A$ such that ```math \left|\lambda- \tilde \lambda \right| \leq\|r\|_{2} . ``` > *Proof.* > - If $\tilde{\lambda} \in \sigma(A)$ the result is trivial. > - Suppose $\tilde{\lambda}$ is not on eigenvalue of $A$. > Then $A-\tilde{\lambda} I$ is invertible, thus we write > ```math > \begin{align} > \tilde{v} & =(A-\tilde{\lambda} I)^{-1} r \\ > & = U \, (D-\tilde{\lambda} I)^{-1} \, U^{-1} \, r \tag{3} > \end{align} > ``` > where in the last step we used that $A$ is Hermitian, thus it can be diagonalised as $A=U D U^{-1}$ with $U$ unitary. > - Taking the 2-norm on both sids of (3) yields > ```math > \begin{aligned} > 1 =\|\tilde{v}\|_2 & =\left\|U\,(D-\tilde{\lambda} I)^{-1}\, U^{-1}\, r\right\|_2 \\ > & \leq \underbrace{\|U\|_{2}}_{=1} \ \| (D -\tilde{\lambda} I)^{-1}\|_{2} \ > {\underbrace{\left\|U^{-1}\right\|_{2}}_{=1}} \, \| r \|_{2} \\ > & =\max _{i=1,\dots,n} \left|\lambda_{i}-\tilde{\lambda}\right|^{-1} \|r\|_{2} > \end{aligned} > ``` > - Since $\text{argmin}_{i}\, x_{i}=\text{argmax}_i \, x_i^{-1}$, we obtain > ```math > \min _{i=1, \ldots, n}\left|\lambda_{i}- \tilde \lambda \right| \leq\|r\|_{2} > ``` > as desired. > $\hspace{12cm} \square$ """metadatashow_logsèdisabled®skip_as_script«code_folded$c7c14c64-133f-44e2-83ea-abde368de440cell_id$c7c14c64-133f-44e2-83ea-abde368de440code;md""" - To avoid this problem, we can use Theorem 2 (Bauer-Fike) to obtain a **guaranteed lower bound on the gap**. For example: ```math \begin{align} \left|\tilde{\lambda}_{i}-\lambda_{i+1}\right| & =\left|\tilde{\lambda}_{i}-\tilde{\lambda}_{i+1}+\tilde{\lambda}_{i+1}-\lambda_{i+1}\right| \\ & \stackrel{\text{rev.}~\Delta}{\geq}\left|\tilde{\lambda}_{i}-\tilde{\lambda}_{i+1}\right|-\left|\tilde{\lambda}_{i+1}-\lambda_{i+1}\right| \\ & \hspace{-0.5em} \stackrel{\text{Thm 2}}{\geq}\left|\tilde{\lambda}_{i}-\tilde{\lambda}_{i+1}\right|-\left\|r_{i+1}\right\|_{2} \end{align} ``` and similarly for $\left|\tilde{\lambda}_{i}-\lambda_{i-1}\right|$. - This results in computable lower bounds to $\left|\tilde{\lambda}_{i}-\lambda_{i+1}\right|$ and $\left|\tilde{\lambda}_{i}-\lambda_{i-1}\right|$, thus $\delta$. Using this gap estimate thus yields a guaranteed upper bound to $\left\|r_{i}\right\|^{2} / \delta$. - Note that when $\lambda_i$ and $\lambda_{i+1}$ are too close, this estimate can be negative, thus not a valid lower bound for $\delta$. """metadatashow_logsèdisabled®skip_as_script«code_folded$371399ec-a571-4aed-b910-7307d90edad1cell_id$371399ec-a571-4aed-b910-7307d90edad1codemd""" ## The residual & Bauer-Fike bound Suppose now we have obtained an approximate eigenpair $(\tilde{\lambda}, \tilde{v})$ to $A$. We can define the **residual** ```math r=A \tilde{v}-\tilde{\lambda} \tilde{v} ``` which can be computed and employed as a stopping criterion in iterations. Our goal now is to relate the residual $r$ to the error on the eigenvalue, $|\lambda-\tilde{\lambda}|$. """metadatashow_logsèdisabled®skip_as_script«code_folded$118355a0-2104-4216-9d11-53079fa2024dcell_id$118355a0-2104-4216-9d11-53079fa2024dcodemd""" > *Proof* by contradiction. > - Assume (1) does not hold. Then there is an eigenvalue $\lambda$ such that, for all $i=1, \dots, n$, > ```math > \left|\lambda-A_{i i}\right| > \sum_{\substack{j=1 \\ j \neq i}}^{j=n} \left|A_{i j}\right| \text {. } \tag{2} > ``` > - We then write > ```math > A-\lambda I=D-\lambda I+H, > ``` > with $D=\operatorname{diag}\left(A_{11}, \ldots, A_{i i}, \ldots, A_{n n}\right)$ and $H=A-D$ (i.e. zero on the diagonal). > - Due to (2), $D - \lambda I$ is invertible, thus > ```math > A-\lambda I=(D-\lambda I)\Big(I+\underbrace{(D-\lambda I)^{-1} H}_{M}\Big) > ``` > - The elements of $M$ are > ```math > M_{i j}= \begin{cases}0 & \text{if } i=j \\ \frac{A_{i j}}{A_{i i}-\lambda} & \text {otherwise }\end{cases} > ``` > - Due to (2), we thus have $\sum_{j=1}^{n}\left|M_{i j}\right|<1$ $\forall i$, therefore $\|M\|_{\infty}< 1$, and $\varrho(M) < \|M\|_{\infty}<1$. > - Since $\varrho(M)$ bounds the modulus of the eigenvalues of $M$, we have that $I+M$ is non-singular, which implies $A-\lambda I$ is non-singular. Therefore $\lambda$ cannot be an eigenvalue of $A$. > - This contradicts our initial statement. > $\hspace{7cm} \square$ """metadatashow_logsèdisabled®skip_as_script«code_folded$a70cff6b-b3d6-4f49-aed0-4d01813d6a95cell_id$a70cff6b-b3d6-4f49-aed0-4d01813d6a95code^md""" In fact in many practical examples such as [Herbst, Levitt, Cances 2020, *Figure 7*](https://michael-herbst.com/publications/2020.04.28_error_nonscf_kohn_sham.pdf) one sees that Bauer-Fike does not follow the convergence behaviour of the true error. In fact for our case a better bound is the Kato-Temple bound, which we will derive next. """metadatashow_logsèdisabled®skip_as_script«code_folded$5e85fbbb-b9b6-49f2-b33f-539ddd7f4cc0cell_id$5e85fbbb-b9b6-49f2-b33f-539ddd7f4cc0codemd""" ## Kato-Temple bound """metadatashow_logsèdisabled®skip_as_script«code_folded$440fa8bb-c502-4c51-bd80-b0a86d3b8507cell_id$440fa8bb-c502-4c51-bd80-b0a86d3b8507codefmd""" Suppose we have computed an approximate eigenpair $(\tilde \lambda, \tilde u)$ of a Hermitian matrix $A \in \mathbb C^{n \times n}$. - How do we know our computation is correct, in particular if finite-precision arithmetic is employed ? - How do we know how far away we are from the exact solution ? In one of the exercises we already saw that ```math \begin{align} \lambda_i \leq \| A \|_F && \forall i = 1, \dots, n \end{align} ``` but we also noted it to be a rather crude bound. This does, however, point to the fact that the matrix entries have something to say about the matrix eigenvalues. """metadatashow_logsèdisabled®skip_as_script«code_folded$1bae37c2-946f-467e-9b1d-c3ca5a129b98cell_id$1bae37c2-946f-467e-9b1d-c3ca5a129b98code@md""" !!! tip "Remark" Since the same result holds for $A^T$, we can formulate a version in terms of column sums instead of row sums. ```math \forall \lambda \in \sigma(A) \quad \exists \,j\, \text { s.t. }\, \left|\lambda-A_{j j}\right| \leq \sum_{\substack{i=1 \\ i \neq j}}^{i = n} \left|A_{i j}\right|. ``` """metadatashow_logsèdisabled®skip_as_script«code_folded$7eba1e23-9f38-4478-84f4-e81ef653ba2ccell_id$7eba1e23-9f38-4478-84f4-e81ef653ba2ccodeDmd""" ## Gerschgorin circles A first tighter bound is given by """metadatashow_logsèdisabled®skip_as_script«code_folded$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4acell_id$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4acode8begin using PlutoUI import TikzPictures.TikzPicture using LaTeXStrings using LinearAlgebra using PlutoTeachingTools using Plots using HypertextLiteral RobustLocalResource("https://teaching.matmat.org/error-control/latex_macros.md", "latex_macros.md") Markdown.parse(read("latex_macros.md", String)) endmetadatashow_logsèdisabled®skip_as_script«code_folded$760385db-cc30-40c1-8f66-5692028a96e3cell_id$760385db-cc30-40c1-8f66-5692028a96e3codeTableOfContents()metadatashow_logsèdisabled®skip_as_script«code_folded$4eb5e944-000f-4755-b039-63591d1dab8acell_id$4eb5e944-000f-4755-b039-63591d1dab8acode%md""" Note that Corollary 5 does not give a useful bound if $\delta = 0$, which can happen for degenerate eigenvalues. Given a sufficiently good eigenvector $\tilde{v}$ **Kato-Temple-type bounds** are considerably **sharper** than Bauer-Fike bounds. Their main complication is that the gap $\delta$ requires access to the **exact** eigenvalue and is thus usually not directly computable. However, one can usually determine an approximate gap as we will detail below. - Let us assume a diagonalization routine yields approximate eigenvectors $\tilde{v}_{1}, \ldots, \tilde{v}_{n}$ with eigenvalue approximations $\tilde{\lambda}_{1}, \ldots, \tilde{\lambda}_{n}$ and residuals $\tilde{r}_{1}, \ldots, \tilde{r}_{n}$. We further assume that we have not lost any eigenpair, i.e. $\tilde{λ}_i$ approximates $λ_i$ for all $i = 1, \ldots, n$. - **A naive idea** to approximate the gap $\delta$ for eigenvalue $\lambda_{i}$ is to take ```math \delta_{\text {est}}=\min \left(\left|\tilde{\lambda}_{i-1}- \tilde{\lambda}_{i}\right|,\left|\tilde{\lambda}_{i}-\tilde{\lambda}_{i+1}\right|\right) ``` While this *can be* a good approximation, a disadvantage of this approach is that this expression **is not a guaranteed lower bound** to $\delta$. In particular it may happen that $\delta_{\text {est}} > \delta$, which implies that $\left\|r_{i}\right\|^{2} / \delta_{\text{est}}$ may be smaller than the actual error ! The Kato-Temple estimate is **no longer a guaranteed bound**. - To see this pictorially, let us assume for simplicity that all exact eigenvalues are simple. We want to bound $|\lambda-\tilde{\lambda}_{i} |$ by Kato-Temple, and where the exact gap $\delta=\min \left(\left|\tilde{\lambda}_{i}-\lambda_{i+1}\right|,\left|\tilde{\lambda}_{i}-\lambda_{i-1}\right|\right)$. Then it can happen that: """metadatashow_logsèdisabled®skip_as_script«code_folded$a3e89cc9-55b6-423f-874a-b923026847eacell_id$a3e89cc9-55b6-423f-874a-b923026847eacodejmd""" > *Proof.* > Let $\lambda$ be the closest eigenvalue to $\tilde \lambda$. > If $\lambda<\tilde{\lambda}$, take $\alpha=\lambda$ and $\beta=b$ in Lemma 3.3 to yield > ```math > \begin{align} > 0 &\leq(b-\tilde{\lambda})(\tilde{\lambda}-\lambda) \leq\|r\|^{2}_2 \\ > \Rightarrow \quad 0 &\leq \tilde{\lambda}-\lambda \leq \frac{\|r\|^{2}_2}{b- \tilde{\lambda}} > \end{align} > ``` > Otherwise, set $\alpha=a$ and $\beta=\lambda$ to obtain > ```math > 0 \leq \lambda-\tilde{\lambda} \leq \frac{\| r \|^{2}_2}{\tilde{\lambda}-a} > ``` > Combining both results completes the proof. > $\hspace{7cm} \square$ """metadatashow_logsèdisabled®skip_as_script«code_folded$c00bddac-a453-4831-81be-d5814a7e88b9cell_id$c00bddac-a453-4831-81be-d5814a7e88b9code!md""" # Bounds on Eigenvalues """metadatashow_logsèdisabled®skip_as_script«code_folded$50024c47-f094-4fed-8ec1-1861a982a49acell_id$50024c47-f094-4fed-8ec1-1861a982a49acodeَM = [ 1.0 M12 M13 M14 0.0; M12 2.0 M23 0.0 -0.10; M13 M23 3.0 0.1 0.05; M14 0.0 0.1 4.0 0.0; 0.0 -0.1 0.05 0.0 5.0 ];metadatashow_logsèdisabled®skip_as_script«code_folded$a7e14252-02ee-49c1-9f46-5699cf590923cell_id$a7e14252-02ee-49c1-9f46-5699cf590923codemd""" - Plot Geschgorin disks: $(@bind show_geschgorin PlutoUI.CheckBox(default=true)) - Plot Bauer-Fike estimate: $(@bind show_bauer_fike PlutoUI.CheckBox(default=true)) - Plot Kato-Temple estimate: $(@bind show_kato_temple PlutoUI.CheckBox(default=true)) """metadatashow_logsèdisabled®skip_as_script«code_folded$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832cell_id$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832codebegin geschgorin_centres = diag(M) geschgorin_radii = [sum(abs, row) - abs(row[i]) for (i, row) in enumerate(eachrow(M))] computed_eigenvalues = diag(M) exact_eigenvalues = eigvals(M) computed_eigenvectors = collect(eachcol(Matrix{Float64}(I, 5, 5))) residuals = map(computed_eigenvalues, computed_eigenvectors) do λ, v M * v - λ * v end error_Bauer_Fike = norm.(residuals) δ = map(1:size(M, 2)) do i δ_left = δ_right = Inf λtilde = computed_eigenvalues if i > 1 δ_left = abs(λtilde[i] - λtilde[i-1]) - norm(residuals[i-1]) end if i < size(M, 2) δ_right = abs(λtilde[i] - λtilde[i+1]) - norm(residuals[i+1]) end max(0.0, min(δ_left, δ_right)) end error_Kato_Temple = norm.(residuals).^2 ./ δ end;metadatashow_logsèdisabled®skip_as_script«code_folded$45f6befb-4518-4467-81f3-ecd6da88ccbbcell_id$45f6befb-4518-4467-81f3-ecd6da88ccbbcodeUmd""" - M12 = $(@bind M12 PlutoUI.Slider(-1.0:0.001:1.0; default=0.001, show_value=true)) - M13 = $(@bind M13 PlutoUI.Slider(-1.0:0.001:1.0; default=0.1, show_value=true)) - M14 = $(@bind M14 PlutoUI.Slider(-1.0:0.001:1.0; default=0.1, show_value=true)) - M23 = $(@bind M23 PlutoUI.Slider(-1.0:0.001:1.0; default=-0.05, show_value=true)) """metadatashow_logsèdisabled®skip_as_script«code_folded$1b5e1cf6-e041-4311-9d9a-1db6cb628a3ecell_id$1b5e1cf6-e041-4311-9d9a-1db6cb628a3ecodemd""" !!! note "Theorem 6" Let $\tilde{v}$ be an approximate eigenvector to $A$, $\|\tilde{v}\|=1, \tilde{\lambda}=\langle\tilde{v}, A \tilde{v}\rangle$, and $r=A \tilde{v}-\tilde{\lambda} \tilde{v}$. Let $\lambda$ be the eigenvalue closest to $\tilde{\lambda}$ and $\delta=\min _{i} \{|\lambda_{i}-\tilde{\lambda}|, \lambda_{i} \neq \lambda \}$ be the distance from the spectrum (gap). Let $v$ be an eigenvector of $A$ associated with $\lambda$. Recall that $\theta(x, y)$, the angle between two vectors, is defined as ```math \cos \theta(x, y)=\frac{|\langle x, y\rangle|}{\|x\|\|y\|}. ``` Then, ```math \sin \theta(\tilde{v}, v) \leq \frac{\|r\|_{2}}{\delta} ``` """metadatashow_logsèdisabled®skip_as_script«code_folded$9107b5a1-5768-436d-b705-6b9215121aa0cell_id$9107b5a1-5768-436d-b705-6b9215121aa0codeTODO("Summary on other bounds")metadatashow_logsèdisabled®skip_as_script«code_folded$1309b18d-e2f0-46fa-8e0c-3a2730d69d20cell_id$1309b18d-e2f0-46fa-8e0c-3a2730d69d20codemd""" !!! note "Theorem 1 (Gerschgorin circles)" Any eigenvalue $\lambda$ of a matrix $A$ is located in one of the closed disks of the complex plane centred at $A_{i i}$ and having the radius ```math \sum_{\substack{j=1 \\ j \neq i}}^{j=n} \left|A_{i j}\right| \text {. } ``` In other words, ```math \forall \lambda \in \sigma(A) \quad \exists\, i\, \text{ s.t. }\, \left|\lambda-A_{i i}\right| \leq \sum_{\substack{j=1 \\ j \neq i}}^{j=n}\left|A_{i j}\right| \tag{1} ``` """metadatashow_logsèdisabled®skip_as_script«code_folded$f6e33422-d41e-4444-b08b-b281f5eb0db3cell_id$f6e33422-d41e-4444-b08b-b281f5eb0db3codeTikzPicture(L""" \draw[>=latex, ->] (0,0) -- (10,0) node[right]{$\sigma(A)$}; \draw[red, dotted] (3,0) -- (3,-1) (5.1,0) -- (5.1,-2) (6.3,-2) -- (6.3,0) ; \draw[purple, dotted] (3.4,0) -- (3.4,1) (5.1,0) -- (5.1,1) ; \draw (0.5,0) node{$\times$} node[above]{$\lambda_{i-3}$}; \draw[blue] (0.5 + 0.2,0) node{$\times$} node[below]{$\tilde \lambda_{i-3}$}; \draw (1.5,0) node{$\times$} node[above]{$\lambda_{i-2}$}; \draw[blue] (1.5 + 0.2,0) node{$\times$} node[below]{$\tilde \lambda_{i-2}$}; \draw (3,0) node{$\times$} node[above]{$\lambda_{i-1}$}; \draw[blue] (3 + 0.4,0) node{$\times$} node[below]{$\tilde \lambda_{i-1}$}; \draw (4.5,0) node{$\times$} node[above]{$\lambda_{i}$}; \draw[blue] (4.5 + 0.6,0) node{$\times$} node[below]{$\tilde \lambda_{i}$}; \draw (6.3,0) node{$\times$} node[above]{$\lambda_{i+1}$}; \draw[blue] (6.3 + 1.1,0) node{$\times$} node[below]{$\tilde \lambda_{i+1}$}; \draw (8.5,0) node{$\times$} node[above]{$\lambda_{i+2}$}; \draw[blue] (8.5 + 0.4,0) node{$\times$} node[below]{$\tilde \lambda_{i+2}$}; \draw[red] (3,-1) node{|} -- node[below]{$\tilde \lambda_i - \lambda_{i-1}$} (5.1,-1) node{|} ; \draw[red] (5.1,-2) node{|} -- node[below]{$\tilde \lambda_i - \lambda_{i+1} = \delta$} (6.3,-2) node{|} ; \draw[purple] (5.1,1) node{|} -- node[above]{$\tilde \lambda_i - \tilde \lambda_{i-1} = \delta_{\rm{est}} > \delta $} (3.4,1) node{|}; """,width="23cm",options="scale=1",preamble=raw"\usepackage{amsfonts}")metadatashow_logsèdisabled®skip_as_script«code_folded$9c189c48-497e-43c1-a50e-fc1ebe7f0714cell_id$9c189c48-497e-43c1-a50e-fc1ebe7f0714codemd""" The disks defined in this theorem are called **Gerschgorin disks**. There are $n$ disks and their union contains the spectrum of $A$. Gerschgorin disks are particularly useful when the matrix is almost diagonal, i.e. when a diagonalization algorithm is close to convergence. """metadatashow_logsèdisabled®skip_as_script«code_folded$f9cdf1eb-e803-48be-ab67-cd81fc33be3ecell_id$f9cdf1eb-e803-48be-ab67-cd81fc33be3ecodemd""" !!! note "Corollary 5 (Symmetric version of Kato-Temple)" Let $\tilde{v}$ be an approximate eigenvector to $A$, $\|\tilde{v}\|=1, \tilde{\lambda}=\langle\tilde{v}, A \tilde{v}\rangle$ and $r=A \tilde{v}-\tilde{\lambda} \tilde{v}$. Let $\lambda_i$ be the eigenvalue closest to $\tilde \lambda$. We define the distance of $\tilde{\lambda}$ to the rest of the spectrum as ```math \delta=\min _{j,\, j\neq i} |\lambda_{j}-\tilde{\lambda} | . ``` $\delta$ is also sometimes called the **gap**. Then, ```math |\tilde{\lambda}-\lambda| \leq \frac{\|r\|_{2}^{2}}{\delta}. ``` > *Proof.* > Theorem 3.4 with $a=\tilde{\lambda}-\delta$ and $b=\tilde{\lambda}+\delta$. """metadatashow_logsèdisabled®skip_as_script«code_folded$bf7bc022-c5d2-40a1-ae48-731ee97d9a32cell_id$bf7bc022-c5d2-40a1-ae48-731ee97d9a32codemd""" This is a simple way to get general error bound by establishing a **residual-error relationship**, i.e. a relation between the residual as a computable check for convergence and the error of our quantity of interest against the exact result. We will note that there is no need to know the exact result ! !!! tip "Remark" In general in *a posteriori* error analysis we want to establish relationship ```math \|e\|_{p} \leq C \|r\|_{q} ``` where $e$ is the **error against the exact answer**, $r$ the **residual**, and $C$ a known and computable constant. **Which norms** $p$ and $q$ are the best choice *depends on context*. For example, for measuring the error in the eigenvector we might choose the $\infty$-norm if we are interested in **entry-wise error**, or the 2-norm if we are interested in the error **natural to the vector space** $\mathbb{C}^{n}$. Note that there is no reason for $q$ or $C$ to be identical in both cases. This suggests the following important point: error-residual relationships are not unique. """metadatashow_logsèdisabled®skip_as_script«code_foldedënotebook_id$2608c9f0-c61f-11f0-37ab-01d21e1ec409bondscell_results!$3ec0e31c-3d09-43d1-9d9a-506320f7d964queued¤logsrunning¦outputbody

Theorem 4 (Kato-Temple)

Let $\tilde{v}$ be an approximate eigenvector to $A$, $\|\tilde{v}\|=1, \tilde{\lambda}=\langle\tilde{v}, A \tilde{v}\rangle$, and $r=A \tilde{v}-\tilde{\lambda} \tilde{v}.$ Assume we know an interval $(a, b)$ with $\tilde{\lambda} \in(a, b)$ and where $\lambda$ is the only eigenvalue of $A$ in $(a, b)$. Then,

$$ \frac{\|r\|_{2}^{2}}{a-\tilde{\lambda}} \leq \tilde{\lambda}-\lambda \leq \frac{\|r\|_{2}^{2}}{b-\tilde{\lambda}}$$

persist_js_state¤mimetext/htmllast_run_timestampAG6XR]has_pluto_hook_features¬rootassigneecell_id$3ec0e31c-3d09-43d1-9d9a-506320f7d964depends_on_disabled_cells§runtime.#published_object_keysdepends_on_skipped_cells§errored$3c3fabfc-f7e1-4e93-acfa-48c7448f968aqueued¤logsrunning¦outputbody i

Proof. We set $\theta \equiv \theta(v, \tilde{v})$ and write $\tilde{v}=v \cos \theta+w \sin \theta$, where $w \perp v$ is chosen appropriately. We have

$$ \begin{align} (A-\tilde{\lambda} I) \tilde{v} & =\cos \theta(A-\tilde{\lambda} I) v +\sin \theta(A-\tilde{\lambda} I) w \\ & =\cos \theta({ \lambda} -\tilde{\lambda}) v+\sin \theta(A-\tilde{\lambda} I) w \end{align}$$

Note that

$$ \begin{align} \langle v, (A-\tilde{\lambda} I) w \rangle & =\langle(A-\tilde{\lambda} I) v, w\rangle \\ & =(\lambda-\tilde{\lambda})\langle v, w\rangle=0, \end{align}$$

i.e. the two vectors in the previous sum are orthogonal. Therefore

$$ \begin{align} \|r \|_{2}^{2} & = \| (A-\tilde{\lambda} I ) \tilde{v} \|_{2}^{2} \\ & =\cos ^{2} \theta \ |\lambda-\tilde{\lambda}|^{2}+\sin ^{2} \theta \ \|(A-\tilde{\lambda} I) w\|_{2}^{2} \end{align}$$

Hence,

$$ \sin ^{2} \theta\|(A-\tilde{\lambda} I) w\|_{2}^{2} \leq\|r\|_{2}^{2}$$

Since $w \perp v$,

$$ \begin{aligned} \|(A-\tilde{\lambda} I) w\|_{2} & =\left\|\sum_{\substack{i=1 \\ \lambda_{i} \neq \lambda}}^{n}\left(\lambda_{i}-\tilde{\lambda}\right) v_{i} v_{i}^{H} w\right\|_{2} \\ & \geq \min _{\substack{i=1, \dots, n \\ \lambda_i \neq \lambda}}\left|\lambda_{i}-\tilde{\lambda}\right|=\delta \end{aligned}$$

This concludes the proof. $\hspace{11cm} \square$

persist_js_state¤mimetext/htmllast_run_timestampAG6Y.has_pluto_hook_features¬rootassigneecell_id$3c3fabfc-f7e1-4e93-acfa-48c7448f968adepends_on_disabled_cells§runtime5apublished_object_keysdepends_on_skipped_cells§errored$08508856-57bd-4d17-b8c6-3b0150e5b9afqueued¤logsrunning¦outputbodyL persist_js_state¤mimeimage/svg+xmllast_run_timestampAGRhhas_pluto_hook_features¬rootassigneecell_id$08508856-57bd-4d17-b8c6-3b0150e5b9afdepends_on_disabled_cells§runtime\published_object_keysdepends_on_skipped_cells§errored$bc4e51aa-674e-4a46-a20d-d9bfd82ff108queued¤logsrunning¦outputbodyV

Error bounds showcase

persist_js_state¤mimetext/htmllast_run_timestampAG6YIIhas_pluto_hook_features¬rootassigneecell_id$bc4e51aa-674e-4a46-a20d-d9bfd82ff108depends_on_disabled_cells§runtime!published_object_keysdepends_on_skipped_cells§errored$54fea52c-6adb-4c83-b380-416bfdaeacecqueued¤logsrunning¦outputbody

Lemma 3

Let $\tilde{v}$ be an approximate eigenvector of a Hermitian matrix $A$ with (for simplicity) $\|\tilde{v}\| = 1$. Let $\tilde \lambda$ be the associated approximated eigenvalue, calculated from $\tilde{\lambda}=\langle\tilde{v}, A \tilde{v}\rangle=R_{A}(\tilde{v})$. Further let $(\alpha, \beta)$ be an interval that contains no eigenvalue of $A$, but let $\tilde{\lambda} \in(\alpha, \beta)$. Then

$$(\beta-\tilde{\lambda})(\tilde{\lambda}-\alpha) \leq\|r\|_{2}^{2} .$$

persist_js_state¤mimetext/htmllast_run_timestampAG6X:has_pluto_hook_features¬rootassigneecell_id$54fea52c-6adb-4c83-b380-416bfdaeacecdepends_on_disabled_cells§runtime븵published_object_keysdepends_on_skipped_cells§errored$ef9e68d5-b5d5-456e-a13d-8c0149c6b2d6queued¤logsrunning¦outputbody

Consider the near-diagonal matrix

M = $\begin{equation} \left[ \begin{array}{ccccc} 1.0 & 0.001 & 0.099 & 0.099 & 0.0 \\ 0.001 & 2.0 & -0.051 & 0.0 & -0.1 \\ 0.099 & -0.051 & 3.0 & 0.1 & 0.05 \\ 0.099 & 0.0 & 0.1 & 4.0 & 0.0 \\ 0.0 & -0.1 & 0.05 & 0.0 & 5.0 \\ \end{array} \right] \end{equation} $

with some Sliders to tune the off-diagonal elements, shown below. As approximate eigenvectors we assume the unit vectors, that is $\tilde{v}_i = e_i$. As a result the corresponding approximate eigenvalues are just

$$ R_M(\tilde{v}_i) = \tilde{v}_i^T M \tilde{v}_i = M_{ii},$$

i.e. the diagonal entries of $M$.

persist_js_state¤mimetext/htmllast_run_timestampAGQ8&has_pluto_hook_features¬rootassigneecell_id$ef9e68d5-b5d5-456e-a13d-8c0149c6b2d6depends_on_disabled_cells§runtime ypublished_object_keysdepends_on_skipped_cells§errored$e5c087db-1df5-4654-a5f3-d4c27a5a9782queued¤logsrunning¦outputbodypersist_js_state¤mimetext/htmllast_run_timestampAGR(has_pluto_hook_features¬rootassigneecell_id$e5c087db-1df5-4654-a5f3-d4c27a5a9782depends_on_disabled_cells§runtime.Rݵpublished_object_keysdepends_on_skipped_cells§errored$f172a676-599b-4be7-9b5d-ceba6b1794dcqueued¤logsrunning¦outputbody

Proof.

  • First notice $r \perp \tilde{v}$ :

    $$\begin{align*} \langle\tilde{v}, r\rangle & =\left\langle\tilde{v}, A \tilde{v}-\tilde{\lambda}\, \tilde{v}\right\rangle \\ & =\Big\langle\tilde{v}, A \tilde{v}-\left\langle\tilde{v}, A \tilde{v}\right\rangle \tilde{v}\Big\rangle \\ & =\langle\tilde{v}, A \tilde{v}\rangle-\langle\tilde{v}, A \tilde{v}\rangle=0 \tag{4} \end{align*}$$

  • Using this result as well as the trick of adding and subtracting $\tilde{\lambda} \tilde{v}$:

    $$\begin{align} \Big\langle(A-\alpha I) \, \tilde{v}, (A-\beta I) \, \tilde{v}\Big\rangle &=\Big\langle(A-\tilde{\lambda} I)\, \tilde{v} +(\tilde{\lambda}-\alpha)\, \tilde v,\\ &\hspace{50pt} (A-\tilde{\lambda} I) \, \tilde{v}+(\tilde{\lambda}-\beta)\, \tilde{v} \Big\rangle \\ & =\Big\langle r+(\tilde{\lambda}-\alpha) \tilde{v},\ r+(\tilde{\lambda}-\beta) \tilde{v}\Big\rangle \\ & \stackrel{(4)}{=}\|r\|_{2}^{2}+(\tilde{\lambda}-\alpha)(\tilde{\lambda}-\beta). \tag{5} \end{align}$$

  • Now expand $\tilde{v}$ in an eigenbasis of $A$, i.e.

    $$\tilde{v}=\sum_{i=1}^{n} c_{i} v_{i}$$

    where $v_i$ is an eigenvector of $A$ with associated eigenvalue $\lambda_i$.

  • This yields for the left-hand side of (5)

    $$\begin{align} \big\langle(A-\alpha I) \, \tilde v,\ (A-\beta I) \, \tilde{v}\big\rangle & =\sum_{i=1}^{n}\left|c_{i}\right|^{2}\,(\lambda_i -\alpha)(\lambda_i -\beta) \\ & \geq 0 \end{align}$$

    since the interval $(\alpha, \beta)$ contains no eigenpairs.

  • Considering the right-hand side of (5), we obtain

    $$\|r\|_{2}^{2}+(\tilde{\lambda}-\alpha)(\tilde{\lambda}-\beta) \geq 0$$

    which is the desired result. $\hspace{10cm} \square$

persist_js_state¤mimetext/htmllast_run_timestampAG6X9Chas_pluto_hook_features¬rootassigneecell_id$f172a676-599b-4be7-9b5d-ceba6b1794dcdepends_on_disabled_cells§runtimeõpublished_object_keysdepends_on_skipped_cells§errored$5466b137-915b-4118-9dbf-f97750303784queued¤logsrunning¦outputbody

Theorem 2 (Bauer-Fike)

Let $A \in \mathbb{C}^{n \times n}$ Hermitian and let $(\tilde{\lambda}, \tilde{v})$ be an approximate eigenpair with $\|\tilde{v}\|_{2}=1$ and residual $r=A \tilde{v}-\tilde{\lambda} \tilde{v}$. Then, there exists an eigenvalue $\lambda$ of $A$ such that

$$ \left|\lambda- \tilde \lambda \right| \leq\|r\|_{2} .$$

Proof.

  • If $\tilde{\lambda} \in \sigma(A)$ the result is trivial.

  • Suppose $\tilde{\lambda}$ is not on eigenvalue of $A$. Then $A-\tilde{\lambda} I$ is invertible, thus we write

    $$\begin{align} \tilde{v} & =(A-\tilde{\lambda} I)^{-1} r \\ & = U \, (D-\tilde{\lambda} I)^{-1} \, U^{-1} \, r \tag{3} \end{align}$$

    where in the last step we used that $A$ is Hermitian, thus it can be diagonalised as $A=U D U^{-1}$ with $U$ unitary.

  • Taking the 2-norm on both sids of (3) yields

    $$\begin{aligned} 1 =\|\tilde{v}\|_2 & =\left\|U\,(D-\tilde{\lambda} I)^{-1}\, U^{-1}\, r\right\|_2 \\ & \leq \underbrace{\|U\|_{2}}_{=1} \ \| (D -\tilde{\lambda} I)^{-1}\|_{2} \ {\underbrace{\left\|U^{-1}\right\|_{2}}_{=1}} \, \| r \|_{2} \\ & =\max _{i=1,\dots,n} \left|\lambda_{i}-\tilde{\lambda}\right|^{-1} \|r\|_{2} \end{aligned}$$

  • Since $\text{argmin}_{i}\, x_{i}=\text{argmax}_i \, x_i^{-1}$, we obtain

    $$\min _{i=1, \ldots, n}\left|\lambda_{i}- \tilde \lambda \right| \leq\|r\|_{2}$$

    as desired. $\hspace{12cm} \square$

persist_js_state¤mimetext/htmllast_run_timestampAG6Wbhas_pluto_hook_features¬rootassigneecell_id$5466b137-915b-4118-9dbf-f97750303784depends_on_disabled_cells§runtimeN^published_object_keysdepends_on_skipped_cells§errored$c7c14c64-133f-44e2-83ea-abde368de440queued¤logsrunning¦outputbodyY
persist_js_state¤mimetext/htmllast_run_timestampAG6Xehas_pluto_hook_features¬rootassigneecell_id$c7c14c64-133f-44e2-83ea-abde368de440depends_on_disabled_cells§runtime published_object_keysdepends_on_skipped_cells§errored$371399ec-a571-4aed-b910-7307d90edad1queued¤logsrunning¦outputbody

The residual & Bauer-Fike bound

Suppose now we have obtained an approximate eigenpair $(\tilde{\lambda}, \tilde{v})$ to $A$. We can define the residual

$$r=A \tilde{v}-\tilde{\lambda} \tilde{v}$$

which can be computed and employed as a stopping criterion in iterations.

Our goal now is to relate the residual $r$ to the error on the eigenvalue, $|\lambda-\tilde{\lambda}|$.

persist_js_state¤mimetext/htmllast_run_timestampAG6W^has_pluto_hook_features¬rootassigneecell_id$371399ec-a571-4aed-b910-7307d90edad1depends_on_disabled_cells§runtime$vpublished_object_keysdepends_on_skipped_cells§errored$118355a0-2104-4216-9d11-53079fa2024dqueued¤logsrunning¦outputbody

Proof by contradiction.

  • Assume (1) does not hold. Then there is an eigenvalue $\lambda$ such that, for all $i=1, \dots, n$,

$$ \left|\lambda-A_{i i}\right| > \sum_{\substack{j=1 \\ j \neq i}}^{j=n} \left|A_{i j}\right| \text {. } \tag{2}$$

  • We then write

    $$A-\lambda I=D-\lambda I+H,$$

    with $D=\operatorname{diag}\left(A_{11}, \ldots, A_{i i}, \ldots, A_{n n}\right)$ and $H=A-D$ (i.e. zero on the diagonal).

  • Due to (2), $D - \lambda I$ is invertible, thus

$$ A-\lambda I=(D-\lambda I)\Big(I+\underbrace{(D-\lambda I)^{-1} H}_{M}\Big)$$

  • The elements of $M$ are

$$ M_{i j}= \begin{cases}0 & \text{if } i=j \\ \frac{A_{i j}}{A_{i i}-\lambda} & \text {otherwise }\end{cases}$$

  • Due to (2), we thus have $\sum_{j=1}^{n}\left|M_{i j}\right|<1$ $\forall i$, therefore $\|M\|_{\infty}< 1$, and $\varrho(M) < \|M\|_{\infty}<1$.

  • Since $\varrho(M)$ bounds the modulus of the eigenvalues of $M$, we have that $I+M$ is non-singular, which implies $A-\lambda I$ is non-singular. Therefore $\lambda$ cannot be an eigenvalue of $A$.

  • This contradicts our initial statement. $\hspace{7cm} \square$

persist_js_state¤mimetext/htmllast_run_timestampAG6WOhas_pluto_hook_features¬rootassigneecell_id$118355a0-2104-4216-9d11-53079fa2024ddepends_on_disabled_cells§runtimeLpublished_object_keysdepends_on_skipped_cells§errored$a70cff6b-b3d6-4f49-aed0-4d01813d6a95queued¤logsrunning¦outputbody

In fact in many practical examples such as Herbst, Levitt, Cances 2020, Figure 7 one sees that Bauer-Fike does not follow the convergence behaviour of the true error.

In fact for our case a better bound is the Kato-Temple bound, which we will derive next.

persist_js_state¤mimetext/htmllast_run_timestampAG6Whas_pluto_hook_features¬rootassigneecell_id$a70cff6b-b3d6-4f49-aed0-4d01813d6a95depends_on_disabled_cells§runtime $published_object_keysdepends_on_skipped_cells§errored$5e85fbbb-b9b6-49f2-b33f-539ddd7f4cc0queued¤logsrunning¦outputbodyN

Kato-Temple bound

persist_js_state¤mimetext/htmllast_run_timestampAG6XKhas_pluto_hook_features¬rootassigneecell_id$5e85fbbb-b9b6-49f2-b33f-539ddd7f4cc0depends_on_disabled_cells§runtime׵published_object_keysdepends_on_skipped_cells§errored$440fa8bb-c502-4c51-bd80-b0a86d3b8507queued¤logsrunning¦outputbody!

Suppose we have computed an approximate eigenpair $(\tilde \lambda, \tilde u)$ of a Hermitian matrix $A \in \mathbb C^{n \times n}$.

In one of the exercises we already saw that

$$\begin{align} \lambda_i \leq \| A \|_F && \forall i = 1, \dots, n \end{align}$$

but we also noted it to be a rather crude bound. This does, however, point to the fact that the matrix entries have something to say about the matrix eigenvalues.

persist_js_state¤mimetext/htmllast_run_timestampAG6V has_pluto_hook_features¬rootassigneecell_id$440fa8bb-c502-4c51-bd80-b0a86d3b8507depends_on_disabled_cells§runtime)published_object_keysdepends_on_skipped_cells§errored$1bae37c2-946f-467e-9b1d-c3ca5a129b98queued¤logsrunning¦outputbody

Remark

Since the same result holds for $A^T$, we can formulate a version in terms of column sums instead of row sums.

$$\forall \lambda \in \sigma(A) \quad \exists \,j\, \text { s.t. }\, \left|\lambda-A_{j j}\right| \leq \sum_{\substack{i=1 \\ i \neq j}}^{i = n} \left|A_{i j}\right|.$$

persist_js_state¤mimetext/htmllast_run_timestampAG6WXhas_pluto_hook_features¬rootassigneecell_id$1bae37c2-946f-467e-9b1d-c3ca5a129b98depends_on_disabled_cells§runtimeƵpublished_object_keysdepends_on_skipped_cells§errored$7eba1e23-9f38-4478-84f4-e81ef653ba2cqueued¤logsrunning¦outputbody{

Gerschgorin circles

A first tighter bound is given by

persist_js_state¤mimetext/htmllast_run_timestampAG6Vҿhas_pluto_hook_features¬rootassigneecell_id$7eba1e23-9f38-4478-84f4-e81ef653ba2cdepends_on_disabled_cells§runtimepublished_object_keysdepends_on_skipped_cells§errored$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4aqueued¤logsrunning¦outputbody

$$\def\resolvent{{\rho}} \def\spectralradius{{\varrho}} \def\laplacian{{\Delta}} \def\contour{C} \def\eigenspace{{\mathcal E}} \def\op{\mathcal} \def\opA{{\mathcal A}} \def\opH{{\mathcal H}} \def\hilbert{{\mathscr H}} \def\graph{G} \def\boundedoperators{\mathscr B} \def\bloch{\mathcal B} \def\indicator{{\mathbf 1}} \def\im{\operatorname{Im}} \def\ker{\operatorname{Ker}} \definecolor{noteblue}{RGB}{123, 145, 178} \definecolor{warnyellow}{RGB}{165, 159, 116} \definecolor{prooftext}{RGB}{85, 85, 85}$$

persist_js_state¤mimetext/htmllast_run_timestampAGO𴞷has_pluto_hook_features¬rootassigneecell_id$8dab1cdf-f700-4dd9-a839-58fe8cfd6c4adepends_on_disabled_cells§runtime\ypublished_object_keysdepends_on_skipped_cells§errored$760385db-cc30-40c1-8f66-5692028a96e3queued¤logsrunning¦outputbodyP persist_js_state¤mimetext/htmllast_run_timestampAGR"Khas_pluto_hook_features¬rootassigneecell_id$760385db-cc30-40c1-8f66-5692028a96e3depends_on_disabled_cells§runtime?published_object_keysdepends_on_skipped_cells§errored$4eb5e944-000f-4755-b039-63591d1dab8aqueued¤logsrunning¦outputbody 

Note that Corollary 5 does not give a useful bound if $\delta = 0$, which can happen for degenerate eigenvalues.

Given a sufficiently good eigenvector $\tilde{v}$ Kato-Temple-type bounds are considerably sharper than Bauer-Fike bounds. Their main complication is that the gap $\delta$ requires access to the exact eigenvalue and is thus usually not directly computable. However, one can usually determine an approximate gap as we will detail below.

persist_js_state¤mimetext/htmllast_run_timestampAG6X޷has_pluto_hook_features¬rootassigneecell_id$4eb5e944-000f-4755-b039-63591d1dab8adepends_on_disabled_cells§runtimeߪpublished_object_keysdepends_on_skipped_cells§errored$a3e89cc9-55b6-423f-874a-b923026847eaqueued¤logsrunning¦outputbodyH

Proof. Let $\lambda$ be the closest eigenvalue to $\tilde \lambda$. If $\lambda<\tilde{\lambda}$, take $\alpha=\lambda$ and $\beta=b$ in Lemma 3.3 to yield

$$ \begin{align} 0 &\leq(b-\tilde{\lambda})(\tilde{\lambda}-\lambda) \leq\|r\|^{2}_2 \\ \Rightarrow \quad 0 &\leq \tilde{\lambda}-\lambda \leq \frac{\|r\|^{2}_2}{b- \tilde{\lambda}} \end{align}$$

Otherwise, set $\alpha=a$ and $\beta=\lambda$ to obtain

$$ 0 \leq \lambda-\tilde{\lambda} \leq \frac{\| r \|^{2}_2}{\tilde{\lambda}-a}$$

Combining both results completes the proof. $\hspace{7cm} \square$

persist_js_state¤mimetext/htmllast_run_timestampAG6Xk+has_pluto_hook_features¬rootassigneecell_id$a3e89cc9-55b6-423f-874a-b923026847eadepends_on_disabled_cells§runtime published_object_keysdepends_on_skipped_cells§errored$c00bddac-a453-4831-81be-d5814a7e88b9queued¤logsrunning¦outputbodyV

Bounds on Eigenvalues

persist_js_state¤mimetext/htmllast_run_timestampAG6V'has_pluto_hook_features¬rootassigneecell_id$c00bddac-a453-4831-81be-d5814a7e88b9depends_on_disabled_cells§runtimeRpublished_object_keysdepends_on_skipped_cells§errored$50024c47-f094-4fed-8ec1-1861a982a49aqueued¤logsrunning¦outputbodypersist_js_state¤mimetext/plainlast_run_timestampAGQ('has_pluto_hook_features¬rootassigneecell_id$50024c47-f094-4fed-8ec1-1861a982a49adepends_on_disabled_cells§runtimeZ⼵published_object_keysdepends_on_skipped_cells§errored$a7e14252-02ee-49c1-9f46-5699cf590923queued¤logsrunning¦outputbody
persist_js_state¤mimetext/htmllast_run_timestampAGQ: has_pluto_hook_features¬rootassigneecell_id$a7e14252-02ee-49c1-9f46-5699cf590923depends_on_disabled_cells§runtimeGEݵpublished_object_keysdepends_on_skipped_cells§errored$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832queued¤logsrunning¦outputbodypersist_js_state¤mimetext/plainlast_run_timestampAGQyhas_pluto_hook_features¬rootassigneecell_id$2d7cbb67-5c7a-455c-8c19-e9b9e8f4b832depends_on_disabled_cells§runtime-ߊƵpublished_object_keysdepends_on_skipped_cells§errored$45f6befb-4518-4467-81f3-ecd6da88ccbbqueued¤logsrunning¦outputbodyڡ
persist_js_state¤mimetext/htmllast_run_timestampAGQhas_pluto_hook_features¬rootassigneecell_id$45f6befb-4518-4467-81f3-ecd6da88ccbbdepends_on_disabled_cells§runtime ݜOpublished_object_keysdepends_on_skipped_cells§errored$1b5e1cf6-e041-4311-9d9a-1db6cb628a3equeued¤logsrunning¦outputbody

Theorem 6

Let $\tilde{v}$ be an approximate eigenvector to $A$, $\|\tilde{v}\|=1, \tilde{\lambda}=\langle\tilde{v}, A \tilde{v}\rangle$, and $r=A \tilde{v}-\tilde{\lambda} \tilde{v}$. Let $\lambda$ be the eigenvalue closest to $\tilde{\lambda}$ and $\delta=\min _{i} \{|\lambda_{i}-\tilde{\lambda}|, \lambda_{i} \neq \lambda \}$ be the distance from the spectrum (gap). Let $v$ be an eigenvector of $A$ associated with $\lambda$. Recall that $\theta(x, y)$, the angle between two vectors, is defined as

$$ \cos \theta(x, y)=\frac{|\langle x, y\rangle|}{\|x\|\|y\|}.$$

Then,

$$ \sin \theta(\tilde{v}, v) \leq \frac{\|r\|_{2}}{\delta}$$

persist_js_state¤mimetext/htmllast_run_timestampAG6Yjhas_pluto_hook_features¬rootassigneecell_id$1b5e1cf6-e041-4311-9d9a-1db6cb628a3edepends_on_disabled_cells§runtime¨published_object_keysdepends_on_skipped_cells§errored$9107b5a1-5768-436d-b705-6b9215121aa0queued¤logsrunning¦outputbody

⚠ TODO ⚠

Summary on other bounds

persist_js_state¤mimetext/htmllast_run_timestampAGPShas_pluto_hook_features¬rootassigneecell_id$9107b5a1-5768-436d-b705-6b9215121aa0depends_on_disabled_cells§runtime^W`published_object_keysdepends_on_skipped_cells§errored$1309b18d-e2f0-46fa-8e0c-3a2730d69d20queued¤logsrunning¦outputbody2

Theorem 1 (Gerschgorin circles)

Any eigenvalue $\lambda$ of a matrix $A$ is located in one of the closed disks of the complex plane centred at $A_{i i}$ and having the radius

$$ \sum_{\substack{j=1 \\ j \neq i}}^{j=n} \left|A_{i j}\right| \text {. }$$

In other words,

$$ \forall \lambda \in \sigma(A) \quad \exists\, i\, \text{ s.t. }\, \left|\lambda-A_{i i}\right| \leq \sum_{\substack{j=1 \\ j \neq i}}^{j=n}\left|A_{i j}\right| \tag{1}$$

persist_js_state¤mimetext/htmllast_run_timestampAG6V has_pluto_hook_features¬rootassigneecell_id$1309b18d-e2f0-46fa-8e0c-3a2730d69d20depends_on_disabled_cells§runtime%published_object_keysdepends_on_skipped_cells§errored$f6e33422-d41e-4444-b08b-b281f5eb0db3queued¤logsrunning¦outputbody{ persist_js_state¤mimeimage/svg+xmllast_run_timestampAGPVhas_pluto_hook_features¬rootassigneecell_id$f6e33422-d41e-4444-b08b-b281f5eb0db3depends_on_disabled_cells§runtimeFpublished_object_keysdepends_on_skipped_cells§errored$9c189c48-497e-43c1-a50e-fc1ebe7f0714queued¤logsrunning¦outputbodyu

The disks defined in this theorem are called Gerschgorin disks. There are $n$ disks and their union contains the spectrum of $A$. Gerschgorin disks are particularly useful when the matrix is almost diagonal, i.e. when a diagonalization algorithm is close to convergence.

persist_js_state¤mimetext/htmllast_run_timestampAG6W

Corollary 5 (Symmetric version of Kato-Temple)

Let $\tilde{v}$ be an approximate eigenvector to $A$, $\|\tilde{v}\|=1, \tilde{\lambda}=\langle\tilde{v}, A \tilde{v}\rangle$ and $r=A \tilde{v}-\tilde{\lambda} \tilde{v}$. Let $\lambda_i$ be the eigenvalue closest to $\tilde \lambda$. We define the distance of $\tilde{\lambda}$ to the rest of the spectrum as

$$ \delta=\min _{j,\, j\neq i} |\lambda_{j}-\tilde{\lambda} | .$$

$\delta$ is also sometimes called the gap. Then,

$$ |\tilde{\lambda}-\lambda| \leq \frac{\|r\|_{2}^{2}}{\delta}.$$

Proof. Theorem 3.4 with $a=\tilde{\lambda}-\delta$ and $b=\tilde{\lambda}+\delta$.

persist_js_state¤mimetext/htmllast_run_timestampAG6XUhas_pluto_hook_features¬rootassigneecell_id$f9cdf1eb-e803-48be-ab67-cd81fc33be3edepends_on_disabled_cells§runtimepublished_object_keysdepends_on_skipped_cells§errored$bf7bc022-c5d2-40a1-ae48-731ee97d9a32queued¤logsrunning¦outputbody

This is a simple way to get general error bound by establishing a residual-error relationship, i.e. a relation between the residual as a computable check for convergence and the error of our quantity of interest against the exact result. We will note that there is no need to know the exact result !

Remark

In general in a posteriori error analysis we want to establish relationship

$$\|e\|_{p} \leq C \|r\|_{q}$$

where $e$ is the error against the exact answer, $r$ the residual, and $C$ a known and computable constant. Which norms $p$ and $q$ are the best choice depends on context.

For example, for measuring the error in the eigenvector we might choose the $\infty$-norm if we are interested in entry-wise error, or the 2-norm if we are interested in the error natural to the vector space $\mathbb{C}^{n}$. Note that there is no reason for $q$ or $C$ to be identical in both cases.

This suggests the following important point: error-residual relationships are not unique.

persist_js_state¤mimetext/htmllast_run_timestampAG6Wbhas_pluto_hook_features¬rootassigneecell_id$bf7bc022-c5d2-40a1-ae48-731ee97d9a32depends_on_disabled_cells§runtimeBpublished_object_keysdepends_on_skipped_cells§errored©shortpath04_Matrix_error_bounds.jllast_save_timeAG6SZin_temp_dir¨metadata