Eigenvalues and the Spectral Theorem

A symmetric matrix is the finite-dimensional model of a self-adjoint operator, and it decomposes the space into orthogonal axes along which it acts by pure scaling. The scalings are the eigenvalues and the axes are the eigenvectors, and the spectral theorem says that such axes always exist and span the space. This is the fact behind principal component analysis, where the axes are the directions of greatest variance, and behind the multivariate Gaussian, whose covariance matrix is factored by the same decomposition. This post proves it from the compactness of the unit sphere, an argument that is the finite-dimensional shadow of the spectral theorem for compact operators [1], [2]. Vectors live in $\R^n$ with inner product $\ip xy=x^\top y$, and a matrix $A$ is symmetric when $A^\top=A$.

#Eigenvalues and symmetry

Definition 1

A scalar $\lambda$ is an eigenvalue of $A$ with eigenvector $x\neq 0$ when $Ax=\lambda x$.

For a symmetric matrix the eigenvalues are constrained to the real line and the eigenvectors to orthogonal directions.

Proposition 2

A symmetric matrix has real eigenvalues, and eigenvectors belonging to distinct eigenvalues are orthogonal.

Proof

For reality, allow complex vectors momentarily and let $x^\ast=\bar x^\top$ be the conjugate transpose. If $Ax=\lambda x$ with $x\neq 0$, then $x^\ast Ax=\lambda\,x^\ast x=\lambda\norm x^2$. A real symmetric $A$ is Hermitian, $A^\ast=\bar A^\top=A^\top=A$, so $(x^\ast Ax)^\ast=x^\ast A^\ast x=x^\ast Ax$: the scalar $x^\ast Ax$ equals its own conjugate and is therefore real. As $\norm x^2=\sum_i\abs{x_i}^2>0$ is real, $\lambda=x^\ast Ax/\norm x^2$ is real, and a real eigenvalue yields a real eigenvector from the null space of the singular real matrix $A-\lambda I$.

For orthogonality, let $Au=\lambda u$ and $Av=\mu v$ with $\lambda\neq\mu$. Then $\lambda\ip uv=\ip{Au}v=\ip u{Av}=\mu\ip uv$ by symmetry, so $(\lambda-\mu)\ip uv=0$ forces $\ip uv=0$.
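As a worked check of Proposition 2 in the smallest nontrivial case (the entries $a,b,d$ and the sample matrix are illustrative choices, not part of the argument above), a $2\times 2$ symmetric matrix has characteristic polynomial and discriminant

$$\det\begin{pmatrix}a-\lambda & b\\ b & d-\lambda\end{pmatrix}=\lambda^2-(a+d)\lambda+(ad-b^2),\qquad \Delta=(a+d)^2-4(ad-b^2)=(a-d)^2+4b^2\ge 0,$$

so both roots are real. For $a=d=2$ and $b=1$ the roots are $\lambda=3$ and $\lambda=1$, with eigenvectors $u=(1,1)^\top$ and $v=(1,-1)^\top$, and $\ip uv=1-1=0$, as the proposition predicts.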

#The spectral theorem

The eigenvector carrying the largest eigenvalue is the one that maximises the quadratic form, and compactness guarantees the maximum is attained.

Theorem 3

A real symmetric matrix $A$ has an orthonormal basis $q_1,\dots,q_n$ of eigenvectors, with real eigenvalues $\lambda_1,\dots,\lambda_n$. Equivalently $A=Q\Lambda Q^\top$ with $Q$ orthogonal and $\Lambda=\operatorname{diag}(\lambda_1,\dots,\lambda_n)$.

Proof

Argue by induction on $n$, the case $n=1$ being trivial. The quadratic form $x\mapsto x^\top Ax$, which on the unit sphere $\{\norm x=1\}$ coincides with the Rayleigh quotient $x^\top Ax/x^\top x$, is continuous, and the sphere is closed and bounded, so by the extreme value theorem the form attains a maximum $\lambda_1=q_1^\top Aq_1$ at some unit vector $q_1$.

For any unit vector $v\perp q_1$ the curve $t\mapsto q_1\cos t+v\sin t$ stays on the sphere, so its Rayleigh value $\lambda_1\cos^2 t+2(q_1^\top Av)\cos t\sin t+(v^\top Av)\sin^2 t$, where the cross term uses $A=A^\top$, is maximised at $t=0$. The derivative at $t=0$ is $2q_1^\top Av$, and it must vanish, so $q_1^\top Av=0$. By symmetry $q_1^\top Av=(Aq_1)^\top v$, hence $Aq_1\perp v$ for every $v\perp q_1$ (scaling removes the unit-length restriction), forcing $Aq_1\in\operatorname{span}(q_1)$. Writing $Aq_1=cq_1$ and taking the inner product with $q_1$ gives $c=q_1^\top Aq_1=\lambda_1$, so $Aq_1=\lambda_1 q_1$ with $\lambda_1$ the attained maximum, the largest Rayleigh value.

The orthogonal complement $q_1^\perp$ is invariant under $A$, since for $x\perp q_1$, $\ip{Ax}{q_1}=\ip x{Aq_1}=\lambda_1\ip x{q_1}=0$, so $Ax\perp q_1$. Fix an orthonormal basis $e_2,\dots,e_n$ of $q_1^\perp$; by invariance the restriction of $A$ is an operator on $q_1^\perp$ with matrix $B_{ij}=\ip{Ae_j}{e_i}$, and symmetry gives $B_{ij}=\ip{Ae_j}{e_i}=\ip{e_j}{Ae_i}=B_{ji}$, so $B$ is a genuine $(n-1)\times(n-1)$ symmetric matrix. By induction it has an orthonormal eigenbasis, which pulls back to an orthonormal eigenbasis $q_2,\dots,q_n$ of $q_1^\perp$, and together with $q_1$ these form an orthonormal eigenbasis of $\R^n$.

Placing the $q_i$ as the columns of $Q$ makes $Q$ orthogonal, and $AQ=Q\Lambda$ reads off the eigen-equations. Since $Q^{-1}=Q^\top$, this gives $A=Q\Lambda Q^\top$.
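The induction can be mirrored numerically: find a maximiser of the quadratic form on the unit sphere, then repeat on the orthogonal complement of the vectors already found. The sketch below is only an illustration under stated assumptions. It swaps the compactness argument for shifted power iteration, and the function name `spectral_decomposition`, the helper names, the iteration count and the sample matrix are all hypothetical choices, not anything from the proof.

```python
import math
import random


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def matvec(A, x):
    return [dot(row, x) for row in A]


def normalize(x):
    n = math.sqrt(dot(x, x))
    return [xi / n for xi in x]


def project_out(x, basis):
    # Remove the components of x along each orthonormal vector in basis,
    # i.e. restrict to the orthogonal complement used in the induction step.
    for q in basis:
        c = dot(x, q)
        x = [xi - c * qi for xi, qi in zip(x, q)]
    return x


def spectral_decomposition(A, iters=2000, seed=0):
    """Return (eigenvalues, eigenvectors) of a symmetric matrix A.

    Assumption: power iteration on A + cI stands in for "the maximum is
    attained"; c exceeds the largest absolute row sum, so A + cI is positive
    definite and its dominant eigenvector maximises x^T A x.
    """
    n = len(A)
    rng = random.Random(seed)
    c = 1.0 + max(sum(abs(a) for a in row) for row in A)
    qs, lams = [], []
    for _ in range(n):
        # Random start inside the complement of the vectors found so far.
        x = normalize(project_out([rng.gauss(0, 1) for _ in range(n)], qs))
        for _ in range(iters):
            y = [yi + c * xi for yi, xi in zip(matvec(A, x), x)]
            x = normalize(project_out(y, qs))
        qs.append(x)
        lams.append(dot(x, matvec(A, x)))  # Rayleigh value at the maximiser
    return lams, qs


A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
lams, qs = spectral_decomposition(A)
print(lams)  # eigenvalues, found in decreasing order
```

For this tridiagonal matrix the exact eigenvalues are $2+\sqrt2$, $2$ and $2-\sqrt2$, so the printed list should agree with them up to rounding. Convergence of power iteration slows when eigenvalues are close together, which is one reason production code uses a dedicated symmetric eigensolver instead.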

The decomposition writes $A=\sum_i\lambda_i q_iq_i^\top$ as a weighted sum of the rank-one projections onto its eigenaxes, and in the eigenbasis $A$ is the diagonal matrix $\Lambda$, scaling the $i$-th coordinate by $\lambda_i$. Every symmetric matrix is therefore a scaling along orthogonal axes, with a reflection along any axis whose eigenvalue is negative.
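The rank-one sum can be checked by hand on a small case. The sketch below assumes the sample matrix with rows $(2,1)$ and $(1,2)$, whose eigenpairs are $\lambda=3$ with $q=(1,1)/\sqrt2$ and $\lambda=1$ with $q=(1,-1)/\sqrt2$; the matrix and the helper name `outer` are illustrative choices.

```python
import math

s = 1 / math.sqrt(2)
# Known eigenpairs (lambda_i, q_i) of the sample matrix [[2, 1], [1, 2]].
eigenpairs = [(3.0, [s, s]), (1.0, [s, -s])]


def outer(q):
    # Rank-one projection q q^T onto the axis spanned by the unit vector q.
    return [[qi * qj for qj in q] for qi in q]


# Accumulate A = sum_i lambda_i * q_i q_i^T.
A = [[0.0, 0.0], [0.0, 0.0]]
for lam, q in eigenpairs:
    P = outer(q)
    for i in range(2):
        for j in range(2):
            A[i][j] += lam * P[i][j]

print(A)  # close to [[2, 1], [1, 2]], up to floating-point rounding
```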

#The variational characterisation

The eigenvalues are not only the diagonal entries of $\Lambda$ but also the extreme values of the quadratic form $x^\top Ax$ on the unit sphere, which is how they drive principal component analysis.

Theorem 4

Order the eigenvalues $\lambda_1\ge\cdots\ge\lambda_n$. Then $\lambda_1=\max_{\norm x=1}x^\top Ax$ and $\lambda_n=\min_{\norm x=1}x^\top Ax$, each attained at the corresponding eigenvector. More generally $\lambda_k=\max_{\dim V=k}\ \min_{x\in V,\,\norm x=1}x^\top Ax$ over $k$-dimensional subspaces $V$.

Proof

Expand a unit vector in the eigenbasis, $x=\sum_i c_i q_i$ with $\sum_i c_i^2=1$. Since $Aq_i=\lambda_i q_i$ and $\ip{q_i}{q_j}=\delta_{ij}$, $x^\top Ax=\sum_{i,j}c_ic_j\lambda_j\delta_{ij}=\sum_i \lambda_i c_i^2$, a weighted average of the eigenvalues with weights $c_i^2$ summing to $1$. It therefore lies between $\lambda_n$ and $\lambda_1$, reaching $\lambda_1$ at $x=q_1$ and $\lambda_n$ at $x=q_n$.

For the intermediate eigenvalues, let $V$ be any $k$-dimensional subspace. The span of $q_k,\dots,q_n$ has dimension $n-k+1$, and $k+(n-k+1)>n$, so it meets $V$ in a nonzero vector, which may be scaled to a unit vector $x=\sum_{i\ge k}c_iq_i$. On it $x^\top Ax=\sum_{i\ge k}\lambda_i c_i^2\le\lambda_k$, so the inner minimum over $V$ is at most $\lambda_k$. Conversely, for $V=\operatorname{span}(q_1,\dots,q_k)$ every unit $x\in V$ has $x^\top Ax=\sum_{i\le k}\lambda_i c_i^2\ge\lambda_k$, with equality at $x=q_k$, so the inner minimum over this $V$ equals $\lambda_k$. Hence the maximum over $V$ of the inner minimum is exactly $\lambda_k$.
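The extreme cases of the theorem can be seen by sampling the unit circle. The sketch below assumes the sample matrix with rows $(2,1)$ and $(1,2)$, whose eigenvalues are $3$ and $1$, and a one-degree grid of angles in $[0,\pi)$; both are illustrative choices, and a grid search is only a check, not a way to compute eigenvalues in general.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]


def quad_form(A, x):
    # x^T A x written out as a double sum.
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


# Unit vectors (cos t, sin t); half the circle suffices because x and -x
# give the same value of the quadratic form.
thetas = [k * math.pi / 180 for k in range(180)]
values = [quad_form(A, [math.cos(t), math.sin(t)]) for t in thetas]

k_max = max(range(180), key=values.__getitem__)
k_min = min(range(180), key=values.__getitem__)
print(k_max, values[k_max])  # angle in degrees of the maximiser, and the maximum
print(k_min, values[k_min])  # angle in degrees of the minimiser, and the minimum
```

Here $x^\top Ax=2+\sin 2t$, so the maximum sits at $45^\circ$, the direction of $(1,1)$, and the minimum at $135^\circ$, the direction of $(-1,1)$, with every sampled value between the two eigenvalues.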

The variational characterisation is what makes the spectral theorem a tool for data. The direction of greatest spread of a centred data cloud is the top eigenvector of its covariance matrix $\Sigma$, the first principal component: the variance of the projection onto a unit direction $x$ is $x^\top\Sigma x$, and by Theorem 4 this is maximised at $q_1$ with value $\lambda_1$. The successive principal components are the remaining eigenvectors in order of decreasing eigenvalue, each capturing the most variance among directions orthogonal to those already taken. This is the finite-dimensional Karhunen-Loève expansion and the reason a covariance is summarised by its leading eigenaxes. The spectral theorem thus turns a symmetric matrix into a list of axes and scales, which settles any optimisation of its quadratic form over the unit sphere.
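A minimal sketch of the first principal component for two-dimensional data follows. The four sample points are made up for illustration, the covariance uses the $1/n$ normalisation (the $1/(n-1)$ convention is equally common), and the closed-form eigenvector formula assumes the off-diagonal entry `b` is nonzero.

```python
import math

# Hypothetical sample data, chosen so the numbers come out cleanly.
points = [(3.0, 2.0), (-1.0, 0.0), (2.0, 3.0), (0.0, -1.0)]
n = len(points)

# Centre the cloud.
mean = [sum(p[k] for p in points) / n for k in range(2)]
centred = [(p[0] - mean[0], p[1] - mean[1]) for p in points]

# Covariance matrix Sigma = [[a, b], [b, d]] with 1/n normalisation.
a = sum(x * x for x, _ in centred) / n
b = sum(x * y for x, y in centred) / n
d = sum(y * y for _, y in centred) / n

# Closed-form eigenvalues of a 2x2 symmetric matrix.
half_gap = math.hypot((a - d) / 2, b)
lam1 = (a + d) / 2 + half_gap
lam2 = (a + d) / 2 - half_gap

# Eigenvector for lam1 (valid when b != 0), normalised to unit length.
v = (b, lam1 - a)
norm = math.hypot(v[0], v[1])
q1 = (v[0] / norm, v[1] / norm)

# Variance of the data projected onto q1; this should equal lam1.
projected_variance = sum((x * q1[0] + y * q1[1]) ** 2 for x, y in centred) / n
print(lam1, lam2, q1, projected_variance)
```

For these points the centred covariance has entries $a=d=2.5$ and $b=2$, so the eigenvalues are $4.5$ and $0.5$ and the first principal component points along $(1,1)$.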

[1] S. Axler, Linear Algebra Done Right, 3rd ed. Springer, 2015.

[2] R. A. Horn and C. R. Johnson, Matrix Analysis, 2nd ed. Cambridge University Press, 2013.

cite

@misc{eigenvalues-and-the-spectral-theorem,
  author = {Zac Kienzle},
  title  = {Eigenvalues and the Spectral Theorem},
  year   = {2026},
  month  = {06},
  url    = {https://zackienzle.com/blog/eigenvalues-and-the-spectral-theorem}
}
