09 June 2026 · 5 min read · updated 13 June 2026

Eigenvalues and the Spectral Theorem

A real symmetric matrix is diagonalised by an orthonormal basis of eigenvectors, and the eigenvalues are the values its quadratic form takes along those axes. We define eigenvalues and eigenvectors, prove the eigenvalues of a symmetric matrix are real and its eigenvectors orthogonal, prove the spectral theorem by extracting the eigenvector that maximises the quadratic form and deflating, and prove the variational characterisation of the eigenvalues that identifies the top one with the maximum of the Rayleigh quotient. This is the finite-dimensional spectral theorem behind principal component analysis and the Gaussian covariance.

  • linear-algebra
  • spectral-theory
On this page
  • Eigenvalues and symmetry
  • The spectral theorem
  • The variational characterisation

A symmetric matrix is the finite-dimensional model of a self-adjoint operator, and it decomposes into orthogonal axes along which it acts by pure scaling. The scalings are the eigenvalues and the axes the eigenvectors, and the spectral theorem says they always exist and span the space. This is the fact behind principal component analysis, where the axes are the directions of greatest variance, and behind the Gaussian covariance, where the same decomposition factors the covariance matrix. This post proves it from the compactness of the sphere, the finite-dimensional shadow of the spectral theorem for compact operators [1], [2]. Vectors live in $\mathbb{R}^n$ with inner product $\langle x,y\rangle=x^\top y$, and a matrix $A$ is symmetric when $A^\top=A$.

#Eigenvalues and symmetry

Definition 1

A scalar $\lambda$ is an eigenvalue of $A$ with eigenvector $x\neq 0$ when $Ax=\lambda x$.

For a symmetric matrix the eigenvalues are constrained to the real line and the eigenvectors to orthogonal directions.

Proposition 2

A symmetric matrix has real eigenvalues, and eigenvectors belonging to distinct eigenvalues are orthogonal.

Proof

For reality, allow complex vectors momentarily and let $x^\ast=\bar x^\top$ be the conjugate transpose. If $Ax=\lambda x$ with $x\neq 0$, then $x^\ast Ax=\lambda\,x^\ast x=\lambda\lVert x\rVert^2$. The scalar $x^\ast Ax$ equals its own conjugate: reality of $A$ ($\bar A=A$) gives $\overline{x^\ast Ax}=\overline{\bar x^\top Ax}=x^\top\bar A\,\bar x=x^\top A\bar x$, and since $x^\top A\bar x$ is a $1\times 1$ scalar it equals its own transpose, $x^\top A\bar x=(x^\top A\bar x)^\top=\bar x^\top A^\top x=\bar x^\top Ax=x^\ast Ax$ by symmetry $A^\top=A$, so it is real. As $\lVert x\rVert^2=\sum_i\lvert x_i\rvert^2>0$ is real, $\lambda=x^\ast Ax/\lVert x\rVert^2$ is real, and a real eigenvalue yields a real eigenvector from the null space of the singular real matrix $A-\lambda I$.

For orthogonality, let $Au=\lambda u$ and $Av=\mu v$ with $\lambda\neq\mu$. Then $\lambda\langle u,v\rangle=\langle Au,v\rangle=\langle u,Av\rangle=\mu\langle u,v\rangle$ by symmetry, so $(\lambda-\mu)\langle u,v\rangle=0$ forces $\langle u,v\rangle=0$.
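As a concrete check of Proposition 2, here is a small worked example. The matrix is chosen purely for illustration and does not come from the post.

```latex
\begin{align*}
A &= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},
&
\det(A-\lambda I) &= (2-\lambda)^2 - 1 = (\lambda-1)(\lambda-3), \\
u &= \begin{pmatrix} 1 \\ 1 \end{pmatrix},
&
Au &= \begin{pmatrix} 3 \\ 3 \end{pmatrix} = 3u, \\
v &= \begin{pmatrix} 1 \\ -1 \end{pmatrix},
&
Av &= \begin{pmatrix} 1 \\ -1 \end{pmatrix} = 1\cdot v, \\
\langle u, v\rangle &= 1\cdot 1 + 1\cdot(-1) = 0 .
\end{align*}
```

Both eigenvalues, $3$ and $1$, are real, and the eigenvectors belonging to them are orthogonal, as the proposition requires.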

#The spectral theorem

The eigenvector carrying the largest eigenvalue is the one that maximises the quadratic form, and compactness guarantees the maximum is attained.

Theorem 3

A real symmetric matrix $A$ has an orthonormal basis $q_1,\dots,q_n$ of eigenvectors, with real eigenvalues $\lambda_1,\dots,\lambda_n$. Equivalently $A=Q\Lambda Q^\top$ with $Q$ orthogonal and $\Lambda=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$.

Proof

Argue by induction on $n$, the case $n=1$ being trivial. The quadratic form $x\mapsto x^\top Ax$, which agrees with the Rayleigh quotient $x^\top Ax/x^\top x$ on the unit sphere, is continuous on the unit sphere $\{\lVert x\rVert=1\}$, which is closed and bounded, so by the extreme value theorem it attains a maximum at some unit vector $q_1$. The constraint $g(x)=\lVert x\rVert^2-1=0$ has $\nabla g(x)=2x$, nonzero at $q_1$ since $\lVert q_1\rVert=1$, so the constraint qualification holds and the Lagrange multiplier condition $\nabla(x^\top Ax)=\lambda\nabla g$ applies at $q_1$ for some multiplier, which we call $\lambda_1$. Since $\nabla(x^\top Ax)=(A+A^\top)x=2Ax$ when $A=A^\top$, this reads $2Aq_1=2\lambda_1 q_1$, so $Aq_1=\lambda_1 q_1$ and $q_1$ is a unit eigenvector.

The orthogonal complement $q_1^\perp$ is invariant under $A$, since for $x\perp q_1$, $\langle Ax,q_1\rangle=\langle x,Aq_1\rangle=\lambda_1\langle x,q_1\rangle=0$, so $Ax\perp q_1$. Fix an orthonormal basis $e_2,\dots,e_n$ of $q_1^\perp$; by invariance the restriction of $A$ is an operator on $q_1^\perp$ with matrix $B_{ij}=\langle Ae_j,e_i\rangle$, and symmetry gives $B_{ij}=\langle Ae_j,e_i\rangle=\langle e_j,Ae_i\rangle=B_{ji}$, so $B$ is a genuine $(n-1)\times(n-1)$ symmetric matrix. By induction it has an orthonormal eigenbasis, which pulls back to an orthonormal eigenbasis $q_2,\dots,q_n$ of $q_1^\perp$, and together with $q_1$ these form an orthonormal eigenbasis of $\mathbb{R}^n$. Placing the $q_i$ as the columns of $Q$ makes $Q$ orthogonal, and $AQ=Q\Lambda$ reads off the eigen-equations, giving $A=Q\Lambda Q^\top$.

The decomposition writes $A=\sum_i\lambda_i q_iq_i^\top$ as a weighted sum of the rank-one projections onto its eigenaxes, and in the eigenbasis $A$ is the diagonal matrix $\Lambda$, scaling the $i$-th coordinate by $\lambda_i$. Every symmetric matrix is therefore a stretch along orthogonal axes.
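The extract-and-deflate structure of the proof can be imitated numerically. The sketch below is only an illustration under stated assumptions, not a production eigensolver: it swaps the exact maximiser for power iteration, deflates by subtracting the rank-one term rather than restricting to the orthogonal complement, and assumes distinct eigenvalues and a start vector that is not orthogonal to any eigenvector. The function names and the example matrix are hypothetical, chosen for this example.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def normalise(x):
    length = math.sqrt(dot(x, x))
    return [xi / length for xi in x]

def top_eigenpair(B, iters=2000):
    """Power iteration on a symmetric positive semidefinite matrix B.

    Returns the value of the quadratic form at the final iterate and the
    iterate itself, which approximates the maximiser on the unit sphere.
    """
    n = len(B)
    x = normalise([1.0 + 0.1 * i for i in range(n)])  # arbitrary start vector
    for _ in range(iters):
        x = normalise(matvec(B, x))
    return dot(x, matvec(B, x)), x

def spectral_decomposition(A, iters=2000):
    """Approximate eigenvalues (descending) and orthonormal eigenvectors of A."""
    n = len(A)
    # Shift by c so every eigenvalue of B = A + cI is at least 1
    # (|lambda| <= max absolute row sum). The shift keeps the eigenvectors
    # and lets power iteration find the algebraically largest eigenvalue.
    c = 1.0 + max(sum(abs(a) for a in row) for row in A)
    B = [[A[i][j] + (c if i == j else 0.0) for j in range(n)] for i in range(n)]
    eigenvalues, eigenvectors = [], []
    for _ in range(n):
        mu, q = top_eigenpair(B, iters)
        eigenvalues.append(mu - c)  # undo the shift
        eigenvectors.append(q)
        # Deflate: remove the rank-one piece mu * q q^T just extracted.
        B = [[B[i][j] - mu * q[i] * q[j] for j in range(n)] for i in range(n)]
    return eigenvalues, eigenvectors

# Illustrative matrix with eigenvalues 3 + sqrt(3), 3, 3 - sqrt(3).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
lams, qs = spectral_decomposition(A)

# Rebuild A as the sum of lambda_i * q_i q_i^T.
rebuilt = [[sum(l * q[i] * q[j] for l, q in zip(lams, qs)) for j in range(3)]
           for i in range(3)]
```

Comparing `rebuilt` with `A` entry by entry checks the rank-one expansion directly. In practice a library routine for symmetric matrices would be the natural choice; the sketch only mirrors the logic of the proof.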

#The variational characterisation

The eigenvalues are not only the diagonal entries but the extreme values of the quadratic form, which is how they drive principal component analysis.

Theorem 4

Order the eigenvalues $\lambda_1\ge\cdots\ge\lambda_n$. Then $\lambda_1=\max_{\lVert x\rVert=1}x^\top Ax$ and $\lambda_n=\min_{\lVert x\rVert=1}x^\top Ax$, each attained at the corresponding eigenvector. More generally $\lambda_k=\max_{\dim V=k}\ \min_{x\in V,\,\lVert x\rVert=1}x^\top Ax$ over $k$-dimensional subspaces $V$.

Proof

Expand a unit vector in the eigenbasis, $x=\sum_i c_i q_i$ with $\sum_i c_i^2=1$. Then $x^\top Ax=\sum_i \lambda_i c_i^2$, a weighted average of the eigenvalues with weights $c_i^2$ summing to $1$, which lies between $\lambda_n$ and $\lambda_1$ and reaches $\lambda_1$ at $x=q_1$ and $\lambda_n$ at $x=q_n$. For the intermediate eigenvalue, given any $k$-dimensional $V$, the span of $q_k,\dots,q_n$ has dimension $n-k+1$, and since $k+(n-k+1)=n+1>n$ it meets $V$ in a nonzero vector, which we may scale to a unit vector $x=\sum_{i\ge k}c_i q_i$. On it $x^\top Ax=\sum_{i\ge k}\lambda_i c_i^2\le\lambda_k$, so the inner minimum over $V$ is at most $\lambda_k$. Taking $V=\operatorname{span}(q_1,\dots,q_k)$, every unit $x\in V$ has $x^\top Ax=\sum_{i\le k}\lambda_i c_i^2\ge\lambda_k$, with equality at $x=q_k$, so this $V$ attains inner minimum $\lambda_k$ and the maximum over $V$ of the inner minimum is exactly $\lambda_k$.
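The extreme cases of the theorem are easy to check numerically in two dimensions, where the unit sphere is a circle that can be scanned by angle. The sketch below uses an illustrative matrix, not one from the post, with eigenvalues $3$ and $1$; on the unit circle its quadratic form is $2+\sin 2\theta$. The grid size `steps` is an arbitrary choice.

```python
import math

# Illustrative symmetric matrix with eigenvalues 3 and 1.
A = [[2.0, 1.0],
     [1.0, 2.0]]

def quadratic_form(A, x):
    """Return x^T A x for a vector x given as a sequence of floats."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

# Evaluate the quadratic form at evenly spaced points on the unit circle.
steps = 3600
samples = []
for k in range(steps):
    theta = 2.0 * math.pi * k / steps
    x = (math.cos(theta), math.sin(theta))
    samples.append((quadratic_form(A, x), theta))

max_value, theta_max = max(samples)
min_value, theta_min = min(samples)

# The maximiser and minimiser should be unit eigenvectors (up to sign).
x_max = (math.cos(theta_max), math.sin(theta_max))
x_min = (math.cos(theta_min), math.sin(theta_min))
```

Here `max_value` and `min_value` land on the two eigenvalues, and `x_max`, `x_min` point along $(1,1)/\sqrt2$ and $(1,-1)/\sqrt2$ up to sign, because the grid happens to contain those directions. A coarser or offset grid would only approximate them.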

The variational characterisation is what makes the spectral theorem a tool for data. The direction of greatest spread of a centred data cloud is the top eigenvector of its covariance matrix, the first principal component, because the variance of the projection onto a unit direction $x$ is $x^\top\Sigma x$ and the variational characterisation shows this is maximised at $q_1$ with value $\lambda_1$. The successive principal components are the remaining eigenvectors, each capturing the most variance among directions orthogonal to those already taken, which is the finite-dimensional Karhunen-Loeve expansion and the reason a covariance is summarised by its leading eigenaxes. The spectral theorem turns a symmetric matrix into a list of axes and scales, which reduces maximising or minimising its quadratic form over unit vectors to reading off eigenvalues.
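A minimal sketch of the first principal component, under assumptions that are not from the post: the four data points are made up, the covariance is normalised by $1/n$ rather than $1/(n-1)$, and the top eigenvector is found by plain power iteration, which suffices here because a covariance matrix is positive semidefinite and this one has a clear gap between its eigenvalues.

```python
import math

# Made-up 2D data, spread mostly along the diagonal direction (1, 1).
points = [(2.0, 2.0), (-2.0, -2.0), (1.0, -1.0), (-1.0, 1.0)]
n = len(points)

# Centre the cloud.
mean = [sum(p[k] for p in points) / n for k in range(2)]
centred = [(p[0] - mean[0], p[1] - mean[1]) for p in points]

# Covariance matrix with 1/n normalisation (an assumption of this sketch).
cov = [[sum(p[i] * p[j] for p in centred) / n for j in range(2)]
       for i in range(2)]

def power_iteration(S, iters=200):
    """Approximate the top eigenpair of a 2x2 symmetric PSD matrix S."""
    x = [1.0, 0.0]  # arbitrary start, not orthogonal to the top eigenvector here
    for _ in range(iters):
        y = [sum(S[i][j] * x[j] for j in range(2)) for i in range(2)]
        length = math.sqrt(sum(v * v for v in y))
        x = [v / length for v in y]
    lam = sum(x[i] * S[i][j] * x[j] for i in range(2) for j in range(2))
    return lam, x

lam1, q1 = power_iteration(cov)

# Variance of the data projected onto q1; it should match lam1.
projected_variance = sum((p[0] * q1[0] + p[1] * q1[1]) ** 2 for p in centred) / n
```

For this data the covariance is $\begin{pmatrix}2.5&1.5\\1.5&2.5\end{pmatrix}$ with eigenvalues $4$ and $1$, so `lam1` and `projected_variance` both come out at $4$ and `q1` points along $(1,1)/\sqrt2$.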

[1] S. Axler, Linear Algebra Done Right, 3rd ed. Springer, 2015.
[2] R. A. Horn and C. R. Johnson, Matrix Analysis, 2nd ed. Cambridge University Press, 2013.

Part 1 of 2 in Linear Algebra

next → Positive Definite Matrices

cite
@misc{eigenvalues-and-the-spectral-theorem,
  author = {Zac Kienzle},
  title  = {Eigenvalues and the Spectral Theorem},
  year   = {2026},
  month  = {06},
  url    = {https://zackienzle.com/blog/eigenvalues-and-the-spectral-theorem}
}