19 May 2026 · 4 min read · updated 13 June 2026

The Radon-Nikodym Theorem

When one measure assigns zero to every null set of another, it has a density with respect to it. We define absolute continuity, equivalence, and mutual singularity, then prove the Radon-Nikodym theorem for sigma-finite measures using von Neumann's representation of a bounded functional on L^2. The density is unique almost everywhere and obeys a chain rule.

  • measure-theory
  • real-analysis
  • probability

One measure is absolutely continuous with respect to another when it cannot charge what the other ignores. That single condition is exactly what forces the existence of a density, and the density is the object behind conditional expectation, likelihood ratios, and every equivalent change of measure.

#Absolute continuity

Definition 1

Let $\mu,\nu$ be measures on $(\Omega,\mathcal F)$. Then $\nu$ is absolutely continuous with respect to $\mu$, written $\nu\ll\mu$, when $\mu(A)=0$ implies $\nu(A)=0$. The measures are equivalent, $\nu\sim\mu$, when $\nu\ll\mu$ and $\mu\ll\nu$, and mutually singular, $\nu\perp\mu$, when some $A\in\mathcal F$ carries all of $\nu$ and none of $\mu$, that is $\nu(A^c)=\mu(A)=0$.

Equivalence means the two measures share exactly their null sets, hence agree on which properties hold almost everywhere. This is the precise sense in which a change to an equivalent measure preserves almost-sure statements while it may alter expectations.
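On a finite sample space where every subset is measurable, the three relations of Definition 1 reduce to pointwise checks on the masses of single points. The following is a minimal sketch under that assumption; the representation of a measure as a dictionary from points to masses, and the function names, are illustrative choices and not part of the text above.

```python
from fractions import Fraction as F

def is_abs_continuous(nu, mu):
    """nu << mu on a finite space: every mu-null point is also nu-null."""
    return all(nu[w] == 0 for w in mu if mu[w] == 0)

def is_equivalent(nu, mu):
    """nu ~ mu: the two measures have exactly the same null points."""
    return is_abs_continuous(nu, mu) and is_abs_continuous(mu, nu)

def is_singular(nu, mu):
    """nu ⊥ mu: take A = {mu = 0}; then mu(A) = 0 and nu must vanish off A."""
    return all(nu[w] == 0 for w in mu if mu[w] > 0)

# Hypothetical example measures on the three-point space {a, b, c}.
mu = {"a": F(1, 2), "b": F(1, 2), "c": F(0)}
nu = {"a": F(1, 4), "b": F(3, 4), "c": F(0)}
rho = {"a": F(1), "b": F(0), "c": F(0)}
delta_c = {"a": F(0), "b": F(0), "c": F(1)}

print(is_equivalent(nu, mu))        # True: same null set {c}
print(is_abs_continuous(rho, mu))   # True
print(is_abs_continuous(mu, rho))   # False: rho ignores b, mu charges it
print(is_singular(delta_c, mu))     # True: delta_c lives on the mu-null point c
```

Here `rho` is absolutely continuous with respect to `mu` but not equivalent to it, which shows that the relation is not symmetric.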

#The theorem

Theorem 2

(Radon-Nikodym.) Let $\mu,\nu$ be $\sigma$-finite measures on $(\Omega,\mathcal F)$ with $\nu\ll\mu$. There exists a measurable $f\ge 0$, unique up to $\mu$-almost-everywhere equality, with

$$\nu(A)=\int_A f\,d\mu\qquad\text{for all } A\in\mathcal F. \tag{1}$$

The function $f$ is the Radon-Nikodym derivative $d\nu/d\mu$.
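On a finite space the density in Equation (1) can be written down directly as a ratio of point masses, which makes the statement easy to test by brute force. The sketch below assumes measures are stored as dictionaries of exact fractions; `radon_nikodym`, `integrate`, and `satisfies_equation_1` are hypothetical helper names introduced only for this illustration.

```python
from fractions import Fraction as F
from itertools import combinations

def radon_nikodym(nu, mu):
    """Density f = d(nu)/d(mu) on a finite space, assuming nu << mu."""
    f = {}
    for w in mu:
        if mu[w] == 0:
            if nu[w] != 0:
                raise ValueError("nu is not absolutely continuous w.r.t. mu")
            f[w] = F(0)  # any value works on a mu-null point
        else:
            f[w] = F(nu[w]) / F(mu[w])
    return f

def integrate(f, mu, A):
    """Integral of f over the set A with respect to mu."""
    return sum((f[w] * mu[w] for w in A), F(0))

def satisfies_equation_1(f, nu, mu):
    """Check nu(A) = integral of f over A for every subset A."""
    points = list(mu)
    for r in range(len(points) + 1):
        for A in combinations(points, r):
            if integrate(f, mu, A) != sum((nu[w] for w in A), F(0)):
                return False
    return True

# Hypothetical example on a four-point space; point 4 is null for both.
mu = {1: F(1, 2), 2: F(1, 3), 3: F(1, 6), 4: F(0)}
nu = {1: F(1, 4), 2: F(1, 4), 3: F(1, 2), 4: F(0)}

f = radon_nikodym(nu, mu)
print(f[1], f[2], f[3])                 # 1/2 3/4 3
print(satisfies_equation_1(f, nu, mu))  # True
```

The value assigned at point 4 is arbitrary, which is the finite-space picture of uniqueness only up to $\mu$-almost-everywhere equality.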

Proof

Assume first that $\mu$ and $\nu$ are finite and set $\phi=\mu+\nu$, a finite measure. The map $L(g)=\int g\,d\nu$ is linear on $L^2(\phi)$ and bounded, since Cauchy-Schwarz gives

$$|L(g)|\le\int|g|\,d\nu\le\int|g|\,d\phi\le\|g\|_{L^2(\phi)}\,\phi(\Omega)^{1/2}. \tag{2}$$

By the Riesz representation theorem for bounded linear functionals on a Hilbert space there is $h\in L^2(\phi)$ with $\int g\,d\nu=\int gh\,d\phi$ for every $g\in L^2(\phi)$. Taking $g=\mathbf 1_A$ gives $\nu(A)=\int_A h\,d\phi$, and since $0\le\nu(A)\le\phi(A)$ for all $A$, the density $h$ satisfies $0\le h\le 1$ $\phi$-almost everywhere. Rewriting $\int g\,d\nu=\int gh\,d\phi$ as

$$\int g(1-h)\,d\nu=\int gh\,d\mu \tag{3}$$

and taking $g=\mathbf 1_E$ for $E=\{h=1\}$ yields $0=\mu(E)$, whence $\nu(E)=0$ by $\nu\ll\mu$ and so $\phi(E)=0$; thus $h<1$ $\phi$-almost everywhere. Define $f=h/(1-h)\ge 0$. For fixed $A\in\mathcal F$ substitute $g=\mathbf 1_A\,(1+h+\cdots+h^n)$ into Equation (3), which telescopes to

$$\int_A\big(1-h^{\,n+1}\big)\,d\nu=\int_A h\,(1+h+\cdots+h^n)\,d\mu. \tag{4}$$

Because $0\le h<1$ almost everywhere, $1-h^{n+1}\uparrow 1$ and $h(1+\cdots+h^n)\uparrow h/(1-h)=f$, so monotone convergence on each side gives $\nu(A)=\int_A f\,d\mu$, which is Equation (1).

For the $\sigma$-finite case write $\Omega=\bigcup_j M_j=\bigcup_k N_k$ with $\mu(M_j)<\infty$ and $\nu(N_k)<\infty$; disjointifying the countable common refinement $\{M_j\cap N_k\}$ into $\{\Omega_n\}$ gives $\phi(\Omega_n)=\mu(\Omega_n)+\nu(\Omega_n)<\infty$. Apply the finite case to $\mu_n(\cdot)=\mu(\cdot\cap\Omega_n)$ and $\nu_n(\cdot)=\nu(\cdot\cap\Omega_n)$, still with $\nu_n\ll\mu_n$, to obtain $f_n\ge 0$ supported on $\Omega_n$ with $\nu(A\cap\Omega_n)=\int_{A\cap\Omega_n}f_n\,d\mu$. Set $f=\sum_n f_n\mathbf 1_{\Omega_n}$; countable additivity and monotone convergence for series of nonnegative functions give $\nu(A)=\sum_n\nu(A\cap\Omega_n)=\sum_n\int_{A\cap\Omega_n}f_n\,d\mu=\int_A f\,d\mu$.

Uniqueness follows by localizing to finite, bounded pieces. Suppose $f_1$ and $f_2$ both satisfy Equation (1); these denote two candidate densities, not the pieces $f_n$ above. With the partition above set $A_{m,k}=\{f_1>f_2\}\cap\Omega_m\cap\{f_2\le k\}$; there $f_2$ is bounded and $\mu(A_{m,k})<\infty$, so $\int_{A_{m,k}}f_2\,d\mu<\infty$. Both integrals equal $\nu(A_{m,k})$, so we may subtract, $0=\int_{A_{m,k}}f_1\,d\mu-\int_{A_{m,k}}f_2\,d\mu=\int_{A_{m,k}}(f_1-f_2)\,d\mu$ with $f_1-f_2>0$ on $A_{m,k}$, forcing $\mu(A_{m,k})=0$. Since $f_2$ is finite wherever $f_1>f_2$, we have $\{f_1>f_2\}=\bigcup_{m,k}A_{m,k}$, and this gives $\mu(\{f_1>f_2\})=0$; symmetrically $\mu(\{f_2>f_1\})=0$, so $f_1=f_2$ $\mu$-almost everywhere.
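The finite-measure step of the proof can be traced numerically on a finite space, where the Riesz representative is simply $h=\nu/(\mu+\nu)$ pointwise. The sketch below, written under that assumption and with hypothetical helper names, builds $h$, forms $f=h/(1-h)$, and checks Equation (4) point by point for one value of $n$.

```python
from fractions import Fraction as F

def von_neumann_density(nu, mu):
    """Return (h, f) with h = d(nu)/d(mu + nu) and f = h / (1 - h)."""
    h = {}
    for w in mu:
        phi_w = mu[w] + nu[w]
        # On phi-null points h is arbitrary; 0 is a convenient choice.
        h[w] = F(nu[w]) / phi_w if phi_w != 0 else F(0)
        if h[w] == 1:
            raise ValueError("nu charges a mu-null point, so nu << mu fails")
    f = {w: h[w] / (1 - h[w]) for w in mu}
    return h, f

def partial_sum(h_w, n):
    """h (1 + h + ... + h^n), the integrand on the right of Equation (4)."""
    return h_w * sum((h_w ** k for k in range(n + 1)), F(0))

# Hypothetical finite measures with nu << mu.
mu = {1: F(1, 2), 2: F(1, 3), 3: F(1, 6)}
nu = {1: F(1, 4), 2: F(1, 4), 3: F(1, 2)}

h, f = von_neumann_density(nu, mu)
print(h[1], h[2], h[3])  # 1/3 3/7 3/4
print(f[1], f[2], f[3])  # 1/2 3/4 3

# Equation (4) with A = {w} and n = 5, checked exactly at each point.
n = 5
eq4_holds = all(
    nu[w] * (1 - h[w] ** (n + 1)) == partial_sum(h[w], n) * mu[w] for w in mu
)
print(eq4_holds)  # True
```

The partial sums increase with $n$ and stay below $f$, which is the monotone convergence used to pass from Equation (4) to Equation (1).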

#Chain rule and decomposition

Proposition 3

If $\lambda\ll\mu\ll\nu$ are $\sigma$-finite, then $\lambda\ll\nu$ and

$$\frac{d\lambda}{d\nu}=\frac{d\lambda}{d\mu}\,\frac{d\mu}{d\nu}\qquad\nu\text{-almost everywhere.} \tag{5}$$

Proof

Write $g=d\lambda/d\mu$ and $w=d\mu/d\nu$. For $A\in\mathcal F$, two applications of Theorem 2 and the identity $\int \mathbf 1_A g\,d\mu=\int\mathbf 1_A g w\,d\nu$, valid first for simple $g$ and then for general $g\ge 0$ by monotone convergence, give $\lambda(A)=\int_A g\,d\mu=\int_A gw\,d\nu$. If $\nu(A)=0$ then this integral of the nonnegative $gw$ over a $\nu$-null set vanishes, so $\lambda(A)=0$, giving $\lambda\ll\nu$. Uniqueness of the derivative identifies $gw=d\lambda/d\nu$.
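The chain rule in Equation (5) can be checked exactly on a small example. This is a sketch on a finite space with hypothetical measures `lam`, `mu`, `nu` stored as dictionaries; the helper `density` uses the convention that a derivative is set to 0 on null points of the reference measure, which is one admissible choice among many.

```python
from fractions import Fraction as F

def density(top, bottom):
    """d(top)/d(bottom) on a finite space, set to 0 on bottom-null points."""
    out = {}
    for w in bottom:
        if bottom[w] == 0:
            if top[w] != 0:
                raise ValueError("absolute continuity fails")
            out[w] = F(0)
        else:
            out[w] = F(top[w]) / F(bottom[w])
    return out

# Hypothetical measures with lam << mu << nu on {a, b, c}.
lam = {"a": F(1, 2), "b": F(1, 2), "c": F(0)}
mu = {"a": F(1, 4), "b": F(3, 4), "c": F(0)}
nu = {"a": F(1, 3), "b": F(1, 3), "c": F(1, 3)}

g = density(lam, mu)       # d(lam)/d(mu)
w = density(mu, nu)        # d(mu)/d(nu)
direct = density(lam, nu)  # d(lam)/d(nu)
product = {x: g[x] * w[x] for x in nu}

print(product == direct)   # True
print(direct["a"])         # 3/2
```

At the point `c`, which $\mu$ ignores but $\nu$ charges, the factor $d\mu/d\nu$ vanishes, so the arbitrary value of $d\lambda/d\mu$ there does not affect the product.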

Theorem 4

(Lebesgue decomposition.) For $\sigma$-finite $\mu,\nu$ there is a unique splitting $\nu=\nu_{\mathrm{ac}}+\nu_{\mathrm s}$ with $\nu_{\mathrm{ac}}\ll\mu$ and $\nu_{\mathrm s}\perp\mu$ [1], [2].
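On a finite space the decomposition of Theorem 4 is explicit: the absolutely continuous part is $\nu$ restricted to the points that $\mu$ charges, and the singular part is $\nu$ restricted to the $\mu$-null points. The sketch below illustrates this under the assumption of dictionary-valued measures; the function name and the example masses are hypothetical.

```python
from fractions import Fraction as F

def lebesgue_decomposition(nu, mu):
    """Split nu into (nu_ac, nu_s) with nu_ac << mu and nu_s ⊥ mu."""
    nu_ac = {w: (nu[w] if mu[w] > 0 else F(0)) for w in mu}
    nu_s = {w: (nu[w] if mu[w] == 0 else F(0)) for w in mu}
    return nu_ac, nu_s

# Hypothetical measures on {a, b, c, d}; mu ignores c and d.
mu = {"a": F(1, 2), "b": F(1, 2), "c": F(0), "d": F(0)}
nu = {"a": F(1, 4), "b": F(0), "c": F(1, 2), "d": F(1, 4)}

nu_ac, nu_s = lebesgue_decomposition(nu, mu)

# The parts add back up to nu.
print(all(nu_ac[w] + nu_s[w] == nu[w] for w in mu))      # True
# nu_ac vanishes on every mu-null point, so nu_ac << mu.
print(all(nu_ac[w] == 0 for w in mu if mu[w] == 0))      # True
# nu_s lives on A = {mu = 0}, so nu_s(A^c) = mu(A) = 0.
print(all(nu_s[w] == 0 for w in mu if mu[w] > 0))        # True
```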

The density supplied by Theorem 2 implements the change to an equivalent probability measure. The factor $d\nu/d\mu$ is the reweighting applied pointwise to mass, and because the density between equivalent measures is strictly positive almost everywhere, its reciprocal is the inverse density $d\mu/d\nu$, so the change of measure is reversible.
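A two-point example makes the reweighting concrete. The sketch below assumes two equivalent probability measures on a coin-flip space and a hypothetical payoff `X`; it computes an expectation under $\nu$ by reweighting under $\mu$, then reverses the change with the reciprocal density.

```python
from fractions import Fraction as F

def expectation(X, p):
    """Expected value of X under the probability measure p."""
    return sum((X[w] * p[w] for w in p), F(0))

# Hypothetical equivalent probability measures and payoff on {H, T}.
mu = {"H": F(1, 2), "T": F(1, 2)}
nu = {"H": F(1, 4), "T": F(3, 4)}
X = {"H": F(10), "T": F(-2)}

# Density d(nu)/d(mu); strictly positive because nu ~ mu.
f = {w: nu[w] / mu[w] for w in mu}

# Forward change: E_nu[X] = E_mu[X f].
reweighted = expectation({w: X[w] * f[w] for w in mu}, mu)
print(expectation(X, nu), reweighted)  # 1 1

# Reverse change: E_mu[X] = E_nu[X / f], using d(mu)/d(nu) = 1 / f.
reversed_back = expectation({w: X[w] / f[w] for w in nu}, nu)
print(expectation(X, mu), reversed_back)  # 4 4
```

The almost-sure statements are unchanged (both measures charge both outcomes), while the expectation of `X` moves from 4 to 1.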

[1] P. Billingsley, Probability and Measure, 3rd ed. Wiley, 1995.
[2] E. Çınlar, Probability and Stochastics. Springer, 2011.

Part 6 of 6 in Measure and Integration

cite
@misc{radon-nikodym,
  author = {Zac Kienzle},
  title  = {The Radon-Nikodym Theorem},
  year   = {2026},
  month  = {05},
  url    = {https://zackienzle.com/blog/radon-nikodym}
}