29 May 2026 · 13 min read · updated 13 June 2026

The Karhunen-Loeve Expansion

The Karhunen-Loeve expansion represents a second-order stochastic process as a series in the eigenfunctions of its covariance operator, with uncorrelated random coefficients whose variances are the eigenvalues. We build the covariance operator and establish its spectral decomposition through Mercer's theorem, prove the expansion converges in mean square uniformly in time, prove that its truncations are the mean-square optimal finite-rank representations among all orthonormal bases, specialise to the Gaussian case, and derive the expansion explicitly for Brownian motion and the Brownian bridge.

  • stochastic-processes
  • functional-analysis
  • probability
  • dimensionality-reduction
On this page
  • The covariance operator
  • Mercer's theorem
  • The expansion
  • Optimality of the eigenbasis
  • The Gaussian case
  • Brownian motion
  • Brownian bridge
  • Numerical illustration

A stochastic process on an interval is a random function, and a random function is a point in an infinite-dimensional space. The Karhunen-Loeve expansion gives that space its natural coordinates. It writes the process as a series in a fixed orthonormal basis with random coefficients, and the basis it chooses is the eigenbasis of the process's own covariance, the unique basis that makes the coefficients uncorrelated and concentrates the variance into as few terms as possible. It is principal component analysis for functions, and it is the bridge between the spectral theory of operators and the second-order theory of processes [1], [2].

#The covariance operator

Let $(X_t)_{t\in[0,T]}$ be a centered second-order process, $\mathbb{E}[X_t]=0$ and $\mathbb{E}[X_t^2]<\infty$, with covariance $K(s,t)=\mathbb{E}[X_s X_t]$ continuous on $[0,T]^2$, which holds when $X$ is mean-square continuous. The covariance defines an integral operator on $L^2[0,T]$,

$$(\mathcal K f)(t)=\int_0^T K(t,s)\,f(s)\,ds. \tag{1}$$

Proposition 1

The operator $\mathcal K$ is self-adjoint, positive, and compact on $L^2[0,T]$. Its eigenvalues $\lambda_1\ge\lambda_2\ge\cdots\ge 0$ are nonnegative and accumulate only at zero, and its eigenfunctions $(\phi_n)$ form an orthonormal basis of the closure of the range.

Proof

Self-adjointness follows from the symmetry $K(s,t)=K(t,s)$, since $\langle\mathcal K f,g\rangle=\int\!\!\int K(t,s)f(s)g(t)\,ds\,dt=\langle f,\mathcal K g\rangle$. Positivity follows from

$$\langle\mathcal K f,f\rangle=\int_0^T\!\!\int_0^T K(t,s)f(s)f(t)\,ds\,dt=\mathbb{E}\Big[\Big(\int_0^T X_t f(t)\,dt\Big)^2\Big]\ge 0, \tag{2}$$

the quadratic form being the variance of the linear functional $\int X_t f(t)\,dt$. The interchange of the expectation and the double integral is licensed by $\int\!\!\int\mathbb{E}\lvert X_sX_t\rvert\,\lvert f(s)f(t)\rvert\,ds\,dt\le\big(\int\sqrt{K(t,t)}\,\lvert f(t)\rvert\,dt\big)^2<\infty$, using Cauchy-Schwarz and the continuity of $K$ on the compact square. Compactness follows because that same continuity makes $K$ square-integrable and $\mathcal K$ a Hilbert-Schmidt operator, hence compact. The spectral theorem for compact self-adjoint operators then supplies a real eigenvalue sequence accumulating only at zero with orthonormal eigenfunctions spanning the closed range, and positivity makes every eigenvalue nonnegative.

The eigenvalue equation is the Fredholm problem $\int_0^T K(t,s)\phi_n(s)\,ds=\lambda_n\phi_n(t)$, and the pairs $(\lambda_n,\phi_n)$ are the spectral data of the process.
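When the Fredholm problem has no closed form, the spectral data can be approximated by replacing the integral with a quadrature rule, which turns the operator into a symmetric matrix. The following is a minimal sketch of that idea (often called the Nyström method), assuming midpoint quadrature. The function name `nystrom_eigenpairs`, its arguments, and the exponential kernel used to exercise it are illustrative assumptions, not part of the article.

```python
import numpy as np


def nystrom_eigenpairs(kernel, T=1.0, m=200):
    """Approximate eigenpairs of the covariance operator by midpoint quadrature.

    kernel: vectorised function K(s, t) (hypothetical argument name).
    Returns the nodes, the eigenvalues sorted downward, and the eigenfunction
    values at the nodes, each column scaled to unit L^2[0, T] norm.
    """
    h = T / m
    t = (np.arange(m) + 0.5) * h              # midpoint nodes
    gram = kernel(t[:, None], t[None, :])      # matrix of K(t_i, t_j)
    # The symmetric matrix h * K(t_i, t_j) is the discretised integral operator.
    values, vectors = np.linalg.eigh(h * gram)
    order = np.argsort(values)[::-1]           # largest eigenvalue first
    values = values[order]
    functions = vectors[:, order] / np.sqrt(h)  # so that sum_i phi(t_i)^2 * h = 1
    return t, values, functions


# Illustrative kernel (an assumption, not from the text): exponential covariance.
t, lam, phi = nystrom_eigenpairs(lambda s, u: np.exp(-np.abs(s - u)))
```

The matrix is symmetric and positive semidefinite, so the computed eigenvalues are real and nonnegative up to rounding, mirroring Proposition 1 in finite dimensions.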

#Mercer's theorem

The spectral data reconstruct the covariance itself, not merely its action on functions.

Theorem 2

For a continuous, positive semidefinite covariance $K$ (positivity from Equation (2), where operator positivity $\langle\mathcal K f,f\rangle\ge 0$ upgrades to pointwise PSD $\sum_{i,j}c_ic_jK(x_i,x_j)\ge 0$ by taking $f$ as approximate identities concentrated at the $x_i$ with weights $c_i$ and passing to the limit through the continuity of $K$), the eigen-expansion

$$K(s,t)=\sum_{n=1}^\infty\lambda_n\,\phi_n(s)\,\phi_n(t) \tag{3}$$

converges absolutely and uniformly on $[0,T]^2$ [3].

The proof runs in two steps. On the diagonal, $r_N(t)=K(t,t)-\sum_{n\le N}\lambda_n\phi_n(t)^2$ is continuous, decreasing in $N$, and tends to $0$ pointwise, since $\sum_n\lambda_n\phi_n(t)^2=K(t,t)$ by the spectral expansion of the positive operator, so Dini's theorem on the compact $[0,T]$ gives $r_N\to 0$ uniformly. Off the diagonal, Cauchy-Schwarz bounds the tail of Equation (3) by

$$\Big|\sum_{n=M}^N\lambda_n\phi_n(s)\phi_n(t)\Big|\le\Big(\sum_{n\ge M}\lambda_n\phi_n(s)^2\Big)^{1/2}\Big(\sum_{n\ge M}\lambda_n\phi_n(t)^2\Big)^{1/2}\le\sup_u\Big(\sum_{n\ge M}\lambda_n\phi_n(u)^2\Big)^{1/2}\sqrt{K(t,t)},$$

whose first factor tends to $0$ uniformly by the diagonal step while $K(t,t)$ is bounded on the compact diagonal, so the series converges absolutely and uniformly on $[0,T]^2$. Setting $s=t$ and integrating over $[0,T]$, where each $\phi_n$ has unit norm, gives the trace identity

$$\int_0^T K(t,t)\,dt=\sum_{n=1}^\infty\lambda_n=\mathbb{E}\!\int_0^T X_t^2\,dt, \tag{4}$$

where the exchange $\mathbb{E}\int_0^T X_t^2\,dt=\int_0^T\mathbb{E}[X_t^2]\,dt=\int_0^T K(t,t)\,dt$ is Tonelli, since $(t,\omega)\mapsto X_t(\omega)^2$ is nonnegative and jointly measurable (measurability from mean-square continuity) and $\int_0^T K(t,t)\,dt$ is finite by continuity of $K$ on the compact diagonal. So the eigenvalues sum to the total variance of the process, the energy that the expansion below distributes across its coordinates.
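The uniform convergence in Equation (3) and the trace identity in Equation (4) can be watched numerically. The sketch below assumes the Brownian kernel $K(s,t)=\min(s,t)$ on $[0,1]$ and takes its eigenpairs from Equation (11), derived later in this article. The helper name `mercer_partial_sum`, the grid size, and the truncation orders are illustrative choices.

```python
import numpy as np


def mercer_partial_sum(s, t, n_terms):
    """Partial sum of Equation (3) for K(s, t) = min(s, t) on [0, 1]."""
    n = np.arange(1, n_terms + 1)
    freq = (n - 0.5) * np.pi                       # frequencies from Equation (11)
    lam = 1.0 / freq**2                            # eigenvalues from Equation (11)
    phi_s = np.sqrt(2.0) * np.sin(np.outer(s, freq))
    phi_t = np.sqrt(2.0) * np.sin(np.outer(t, freq))
    return (phi_s * lam) @ phi_t.T                 # sum_n lam_n phi_n(s) phi_n(t)


grid = np.linspace(0.0, 1.0, 101)
exact = np.minimum.outer(grid, grid)

# Largest error over the grid for increasing truncation orders.
sup_errors = {
    n_terms: float(np.abs(exact - mercer_partial_sum(grid, grid, n_terms)).max())
    for n_terms in (10, 100, 1000)
}

# Trace identity: the eigenvalues should sum to the integral of K(t, t) = t, one half.
eigen_sum = float(np.sum(1.0 / ((np.arange(1, 1001) - 0.5) * np.pi) ** 2))
```

The largest error shrinks as terms are added, and the partial eigenvalue sum approaches one half from below.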

#The expansion

Theorem 3

Define the coefficients $\xi_n=\lambda_n^{-1/2}\int_0^T X_t\,\phi_n(t)\,dt$ for $\lambda_n>0$. Then the $\xi_n$ are centered, uncorrelated, and of unit variance, and

$$X_t=\sum_{n=1}^\infty\sqrt{\lambda_n}\,\xi_n\,\phi_n(t), \tag{5}$$

the series converging in mean square uniformly in $t\in[0,T]$.

Proof

Centering is immediate from $\mathbb{E}[X_t]=0$. For the second moments,

$$\mathbb{E}[\xi_m\xi_n]=\frac{1}{\sqrt{\lambda_m\lambda_n}}\int_0^T\!\!\int_0^T K(s,t)\phi_m(s)\phi_n(t)\,ds\,dt=\frac{1}{\sqrt{\lambda_m\lambda_n}}\int_0^T\lambda_n\phi_n(s)\phi_m(s)\,ds=\delta_{mn}, \tag{6}$$

using the eigen-equation $\int K(s,t)\phi_n(t)\,dt=\lambda_n\phi_n(s)$ and the orthonormality of the eigenfunctions, so the coefficients are uncorrelated with unit variance. Pulling the expectation inside the product of time integrals is the same Fubini bound as Equation (2), since $\int\!\!\int\mathbb{E}\lvert X_sX_t\rvert\,\lvert\phi_m(s)\phi_n(t)\rvert\,ds\,dt\le\big(\int\sqrt{K(t,t)}\,\lvert\phi_m\rvert\big)\big(\int\sqrt{K(t,t)}\,\lvert\phi_n\rvert\big)<\infty$ by Cauchy-Schwarz and continuity of $K$ on the compact square. Let $S_N(t)=\sum_{n\le N}\sqrt{\lambda_n}\,\xi_n\phi_n(t)$ be the partial sum. Then

$$\mathbb{E}[(X_t-S_N(t))^2]=K(t,t)-2\,\mathbb{E}[X_t S_N(t)]+\mathbb{E}[S_N(t)^2]. \tag{7}$$

The middle term is $\mathbb{E}[X_t S_N(t)]=\sum_{n\le N}\sqrt{\lambda_n}\,\phi_n(t)\,\mathbb{E}[X_t\xi_n]=\sum_{n\le N}\lambda_n\phi_n(t)^2$, since $\mathbb{E}[X_t\xi_n]=\lambda_n^{-1/2}\int K(t,s)\phi_n(s)\,ds=\sqrt{\lambda_n}\,\phi_n(t)$ (the interchange of $\mathbb{E}$ with the time integral defining $\xi_n$ is the same Fubini/Cauchy-Schwarz bound as Equation (2), as $\lvert X_t\rvert\in L^2(\Omega)$ and $\phi_n\in L^2[0,T]$ with $K$ continuous on the compact square), and the last term is $\mathbb{E}[S_N(t)^2]=\sum_{n\le N}\lambda_n\phi_n(t)^2$ because the coefficients are uncorrelated of unit variance. Substituting into Equation (7),

$$\mathbb{E}[(X_t-S_N(t))^2]=K(t,t)-\sum_{n\le N}\lambda_n\phi_n(t)^2, \tag{8}$$

which is exactly the diagonal residual $r_N(t)$ of the Mercer proof, so the diagonal Dini step of Theorem 2 makes it tend to zero uniformly in $t$ as $N\to\infty$.

The expansion Equation (5) separates randomness from time. The eigenfunctions $\phi_n$ are deterministic shapes, the modes of the process, and all the randomness lives in the scalar coefficients $\xi_n$, which are uncorrelated and weighted by $\sqrt{\lambda_n}$. A process that was a point in an infinite-dimensional space is now a sequence of uncorrelated dials, each turning one mode. The dials are independent only under further assumptions, as in the Gaussian case below.
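The claim in Equation (6) can be checked by simulation. The sketch below takes Brownian motion on $[0,1]$ as an illustrative process, with eigenpairs from Equation (11) later in this article. It draws paths from cumulative sums of independent increments, computes each $\xi_n$ by a Riemann sum, and forms the sample second moments. The seed, path count, grid size, and variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, m, n_modes = 2000, 500, 4
h = 1.0 / m
t = np.arange(1, m + 1) * h                       # grid t_k = k / m

# Brownian paths as cumulative sums of independent N(0, h) increments.
paths = np.cumsum(rng.standard_normal((n_paths, m)) * np.sqrt(h), axis=1)

n = np.arange(1, n_modes + 1)
freq = (n - 0.5) * np.pi
phi = np.sqrt(2.0) * np.sin(np.outer(t, freq))    # eigenfunctions, shape (m, n_modes)
lam = 1.0 / freq**2                               # eigenvalues

# xi_n = lambda_n^{-1/2} * integral of X_t phi_n(t) dt, approximated by a Riemann sum.
xi = (paths @ phi) * h / np.sqrt(lam)

sample_means = xi.mean(axis=0)                    # should be near zero
sample_second_moments = xi.T @ xi / n_paths       # should be near the identity
```

Up to sampling noise and discretisation error, the second-moment matrix is close to the identity: unit variances on the diagonal and vanishing correlations off it.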

#Optimality of the eigenbasis

The eigenbasis is not one good basis among many. It is the best basis for representing the process in finitely many terms.

Theorem 4

Among all orthonormal bases $(e_n)$ of $L^2[0,T]$, the eigenbasis $(\phi_n)$ minimises the mean-square error of every finite truncation. For each $N$ the minimum truncation error is

$$\min_{(e_n)}\,\mathbb{E}\!\int_0^T\!\Big(X_t-\sum_{n\le N}\langle X,e_n\rangle\,e_n(t)\Big)^{2}dt=\sum_{n>N}\lambda_n, \tag{9}$$

attained by $e_n=\phi_n$.

Proof

For any orthonormal basis $(e_n)$ write $\zeta_n=\int_0^T X_t e_n(t)\,dt$. Since $\mathbb{E}\int_0^T X_t^2\,dt=\int_0^T K(t,t)\,dt<\infty$ by mean-square continuity, $X\in L^2[0,T]$ almost surely, so Parseval holds pathwise, $\int_0^T\big(X_t-\sum_{n\le N}\zeta_n e_n(t)\big)^2\,dt=\sum_{n>N}\zeta_n^2$ a.s. Taking expectations and using Tonelli on the nonnegative terms gives the truncation error $\mathbb{E}\sum_{n>N}\zeta_n^2=\sum_{n>N}\mathbb{E}[\zeta_n^2]=\sum_{n>N}\langle\mathcal K e_n,e_n\rangle$, since $\mathbb{E}[\zeta_n^2]=\int\!\!\int K(s,t)e_n(s)e_n(t)\,ds\,dt=\langle\mathcal K e_n,e_n\rangle$, the swap of expectation and the infinite sum legitimate by monotone convergence. The total $\sum_n\langle\mathcal K e_n,e_n\rangle=\sum_n\lambda_n=\int_0^T K(t,t)\,dt$ is finite by Equation (4) and independent of the basis, since $\mathcal K$ is trace class, so minimising the tail is the same as maximising the head $\sum_{n\le N}\langle\mathcal K e_n,e_n\rangle$ over orthonormal sets of size $N$. By the Ky Fan maximum principle for a positive compact self-adjoint operator, the sum of $N$ Rayleigh quotients over an orthonormal set is largest when the set spans the top $N$ eigenvectors, with maximum $\sum_{n\le N}\lambda_n$. The complementary minimum tail is $\sum_{n>N}\lambda_n$, attained at $e_n=\phi_n$, which is Equation (9).

This is the function-space statement of principal component analysis. The eigenvalues, sorted downward, are the variance captured by each mode, and truncating after $N$ terms keeps the most variance any $N$-dimensional linear representation can keep. A process whose eigenvalues decay fast is nearly finite-dimensional, and the decay rate is the precise sense in which a random function is simple or complex.
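The Ky Fan step can be seen in a discretised setting. The sketch below assumes the Brownian kernel $\min(s,t)$ on $[0,1]$, discretised by midpoint quadrature, and compares the variance kept by the leading eigenvectors with the variance kept by a competing orthonormal set, the integer-frequency sines $\sqrt 2\sin(n\pi t)$. The kernel, the competitor, and all variable names are illustrative choices, not part of the proof.

```python
import numpy as np

m = 400
h = 1.0 / m
t = (np.arange(m) + 0.5) * h                      # midpoint nodes on [0, 1]
operator = h * np.minimum.outer(t, t)             # discretised covariance operator

# Eigenvalues of the discretised operator, sorted downward.
eigenvalues = np.linalg.eigvalsh(operator)[::-1]

# Competing orthonormal set: sqrt(2) sin(n pi t) at the nodes, scaled by
# sqrt(h) so that the columns are orthonormal as vectors.
n = np.arange(1, 9)
competitor = np.sqrt(2.0 * h) * np.sin(np.outer(t, n * np.pi))

# Rayleigh quotients <K e_n, e_n> for the competing set.
rayleigh = np.einsum("in,ij,jn->n", competitor, operator, competitor)

head_eigen = np.cumsum(eigenvalues[: len(n)])     # variance kept by the eigenbasis
head_other = np.cumsum(rayleigh)                  # variance kept by the sine set
```

For every truncation order the eigenbasis head is at least as large as the competitor's, which is the finite-dimensional form of the inequality behind Equation (9).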

#The Gaussian case

When the process is Gaussian the coefficients gain their strongest property.

Proposition 5

If $X$ is a centered Gaussian process then the coefficients $\xi_n$ of Equation (5) are independent standard normal random variables.

Proof

Each $\xi_n$ is the mean-square limit of Riemann-sum linear functionals $R(P)=\sum_k X_{t_k}\phi_n(t_k)\Delta t_k$. These form a Cauchy net in $L^2(\Omega)$, because $\mathbb{E}[R(P)R(P')]\to\int\!\!\int K(s,t)\phi_n(s)\phi_n(t)\,ds\,dt$ as the meshes shrink (Riemann sums of the continuous deterministic double integral converge), so $\mathbb{E}[(R(P)-R(P'))^2]\to 0$; completeness of $L^2(\Omega)$ then yields the limit $\int_0^T X_t\phi_n(t)\,dt$, the standard mean-square Riemann integral. Finite linear combinations of the values of a Gaussian process are Gaussian, and a mean-square limit of Gaussian vectors is Gaussian, so any finite collection of the $\xi_n$ is jointly Gaussian and centered. For jointly Gaussian variables zero correlation is independence, and Equation (6) gives $\mathbb{E}[\xi_m\xi_n]=\delta_{mn}$, so the $\xi_n$ are independent standard normals.

For a Gaussian process the expansion is therefore a genuine series of independent standard normals scaled by $\sqrt{\lambda_n}$ against fixed shapes, which is why simulating such a process reduces to drawing independent normals and summing modes, the basis of spectral simulation and of the polynomial chaos representations of random fields [4].

#Brownian motion

Take standard Brownian motion $W$ on $[0,1]$, whose covariance is $K(s,t)=\min(s,t)$. The eigenvalue problem $\int_0^1\min(t,s)\phi(s)\,ds=\lambda\phi(t)$ becomes a differential equation when differentiated. Splitting the integral at $t$ and differentiating once gives $\int_t^1\phi(s)\,ds=\lambda\phi'(t)$, and differentiating again gives

$$\lambda\,\phi''(t)=-\phi(t), \tag{10}$$

with boundary conditions $\phi(0)=0$ from the original equation at $t=0$ and $\phi'(1)=0$ from the once-differentiated equation at $t=1$. The solutions are $\phi(t)=\sqrt 2\,\sin(t/\sqrt\lambda)$ with $\cos(1/\sqrt\lambda)=0$, giving the spectral data

$$\phi_n(t)=\sqrt 2\,\sin\!\Big(\big(n-\tfrac12\big)\pi t\Big),\qquad \lambda_n=\frac{1}{\big(n-\tfrac12\big)^2\pi^2},\qquad n=1,2,\dots, \tag{11}$$

and the Karhunen-Loeve expansion of Brownian motion,

$$W_t=\sqrt 2\,\sum_{n=1}^\infty\frac{\sin\!\big((n-\tfrac12)\pi t\big)}{(n-\tfrac12)\pi}\,\xi_n,\qquad \xi_n\ \text{i.i.d.}\ \mathcal N(0,1). \tag{12}$$

The eigenvalues decay as $n^{-2}$, so the low modes dominate, and the first few carry most of the variance.

| Mode $n$ | $\lambda_n=1/((n-\tfrac12)^2\pi^2)$ | Share of total variance |
|---|---|---|
| 1 | 0.4053 | 81.1% |
| 2 | 0.0450 | 9.0% |
| 3 | 0.0162 | 3.2% |
| 4 | 0.0083 | 1.7% |

The total variance is $\sum_n\lambda_n=\int_0^1 K(t,t)\,dt=\int_0^1 t\,dt=\tfrac12$ by the trace identity Equation (4), and the first mode alone holds $\lambda_1/\tfrac12=0.811$ of it, so four modes already capture about $95\%$ of the variance of a Brownian path.
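The closed-form eigenvalues of Equation (11) can be cross-checked against a direct discretisation of the Fredholm problem. The sketch below assumes midpoint quadrature on 500 nodes, an arbitrary choice, and compares the leading eigenvalues of the resulting matrix with the analytic values and variance shares tabulated above. The variable names are illustrative.

```python
import numpy as np

m = 500
h = 1.0 / m
t = (np.arange(m) + 0.5) * h                      # midpoint nodes on [0, 1]

# Leading eigenvalues of the discretised operator with kernel min(s, t).
numeric = np.linalg.eigvalsh(h * np.minimum.outer(t, t))[::-1][:4]

# Analytic eigenvalues from Equation (11) and their shares of the total 1/2.
n = np.arange(1, 5)
analytic = 1.0 / ((n - 0.5) * np.pi) ** 2
shares = analytic / 0.5

for k in range(4):
    print(f"mode {k + 1}: numeric {numeric[k]:.4f}, "
          f"analytic {analytic[k]:.4f}, share {shares[k]:.1%}")
```

The two columns should agree to roughly the accuracy of the quadrature, and the four shares add up to about 95 percent.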

#Brownian bridge

The Brownian bridge $B_t=W_t-tW_1$ on $[0,1]$ has covariance $K(s,t)=\min(s,t)-st$, which vanishes at both ends. The same differentiation gives $\lambda\phi''=-\phi$ now with $\phi(0)=\phi(1)=0$, the Dirichlet conditions the pinned endpoints impose, so

$$\phi_n(t)=\sqrt 2\,\sin(n\pi t),\qquad \lambda_n=\frac{1}{n^2\pi^2},\qquad B_t=\sqrt 2\,\sum_{n=1}^\infty\frac{\sin(n\pi t)}{n\pi}\,\xi_n. \tag{13}$$

The bridge is the pure Fourier sine series of Brownian motion with both ends clamped, and its total variance is $\sum_n 1/(n^2\pi^2)=1/6$, matching $\int_0^1(t-t^2)\,dt=1/6$. The two processes differ in their boundary conditions alone, half-integer frequencies and a free right end for the motion, integer frequencies and two clamped ends for the bridge.
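For readers who want the bridge computation spelled out, the steps behind Equation (13) can be written as follows. This is a worked sketch of the differentiation described above, assuming $\lambda>0$ and a continuous eigenfunction $\phi$; the constant $A$ is introduced here only as a normalisation.

$$
\begin{aligned}
\lambda\,\phi(t) &= \int_0^t s\,\phi(s)\,ds + t\int_t^1 \phi(s)\,ds - t\int_0^1 s\,\phi(s)\,ds, \\
\lambda\,\phi'(t) &= \int_t^1 \phi(s)\,ds - \int_0^1 s\,\phi(s)\,ds, \\
\lambda\,\phi''(t) &= -\phi(t).
\end{aligned}
$$

Evaluating the first line at $t=0$ gives $\phi(0)=0$, and at $t=1$ the first and third integrals cancel while the second vanishes, so $\phi(1)=0$. The solutions of $\lambda\phi''=-\phi$ with $\phi(0)=0$ are $\phi(t)=A\sin(t/\sqrt\lambda)$, and $\phi(1)=0$ forces $1/\sqrt\lambda=n\pi$, that is $\lambda_n=1/(n^2\pi^2)$. Unit norm, $\int_0^1 A^2\sin^2(n\pi t)\,dt=A^2/2=1$, fixes $A=\sqrt 2$.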

The three representations sit side by side.

| Basis | Coordinates | Optimal truncation | Coefficients |
|---|---|---|---|
| Fourier | Fixed sines and cosines | No, in general | Correlated, in general |
| Discrete PCA | Eigenvectors of a covariance matrix | Yes, for vectors | Uncorrelated |
| Karhunen-Loeve | Eigenfunctions of $\mathcal K$ | Yes, for processes | Uncorrelated, unit variance |

For Brownian motion and the bridge the Karhunen-Loeve basis is a sine basis because the covariance operator is the inverse of $-d^2/dt^2$ under the boundary conditions above, so the two operators share eigenfunctions, which is special. For a general process the eigenfunctions are not sinusoids, and the Karhunen-Loeve basis is the only one that is both uncorrelated and optimal.

#Numerical illustration

Truncating Equation (12) at a finite order draws an approximate Brownian path from a handful of independent normals, and the optimality result says no other orthonormal basis of the same order leaves a smaller mean-square error.

```python
import numpy as np
from numpy.random import Generator


def brownian_via_kl(
    n_modes: int, grid: np.ndarray, rng: Generator
) -> np.ndarray:
    """Sample a Brownian path on [0, 1] from a truncated Karhunen-Loeve series.

    Args:
        n_modes: Number of eigenmodes retained in the truncation.
        grid: Increasing times in [0, 1] at which to evaluate the path.
        rng: Seeded generator for reproducibility.

    Returns:
        The sampled path values at the grid times.
    """
    n = np.arange(1, n_modes + 1)
    eigen_root = np.sqrt(2.0) / ((n - 0.5) * np.pi)
    modes = np.sin(np.outer(grid, (n - 0.5) * np.pi))
    coefficients = rng.standard_normal(n_modes)
    return modes @ (eigen_root * coefficients)


def captured_variance(n_modes: int) -> float:
    """Fraction of total Brownian variance held by the first modes.

    Args:
        n_modes: Number of leading eigenmodes.

    Returns:
        The ratio of retained eigenvalue mass to the total one half.
    """
    n = np.arange(1, n_modes + 1)
    eigenvalues = 1.0 / ((n - 0.5) * np.pi) ** 2
    return float(eigenvalues.sum() / 0.5)


rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 500)
path = brownian_via_kl(n_modes=8, grid=grid, rng=rng)
retained = captured_variance(n_modes=8)
```

A stochastic process is a random function, and the Karhunen-Loeve expansion is its coordinate system. The covariance operator supplies the axes, Mercer's theorem reconstructs the covariance from them, the expansion turns the random function into uncorrelated dials, and the optimality theorem proves those axes are the ones that compress it best. Brownian motion is a weighted sine series with random amplitudes, and the same construction handles any mean-square continuous second-order process, which is why the expansion is the spectral foundation of simulation, of dimensionality reduction for random fields, and of the principal component analysis of functional data.

[1] K. Karhunen, “Über lineare Methoden in der Wahrscheinlichkeitsrechnung,” Annales Academiae Scientiarum Fennicae, Series A I, vol. 37, pp. 1–79, 1947.
[2] M. Loève, Probability Theory II, 4th ed. Springer, 1978.
[3] J. Mercer, “Functions of positive and negative type, and their connection with the theory of integral equations,” Philosophical Transactions of the Royal Society A, vol. 209, pp. 415–446, 1909.
[4] R. G. Ghanem and P. D. Spanos, Stochastic Finite Elements: A Spectral Approach. Springer, 1991.

Part 8 of 9 in Probability

cite
@misc{the-karhunen-loeve-expansion,
  author = {Zac Kienzle},
  title  = {The Karhunen-Loeve Expansion},
  year   = {2026},
  month  = {05},
  url    = {https://zackienzle.com/blog/the-karhunen-loeve-expansion}
}