The Radon-Nikodym Theorem

When a sigma-finite measure assigns zero to every null set of another sigma-finite measure, it has a density with respect to it. We define absolute continuity, equivalence, and mutual singularity, then prove the Radon-Nikodym theorem for sigma-finite measures using von Neumann's representation of a bounded functional on L^2. The density is unique almost everywhere and obeys a chain rule.

One measure is absolutely continuous with respect to another when it cannot charge what the other ignores. For $\sigma$-finite measures, that single condition is exactly what forces the existence of a density, and the density is the object behind conditional expectation, likelihood ratios, and every equivalent change of measure.

#Absolute continuity

Definition 1

Let $\mu,\nu$ be measures on $(\Omega,\mathcal F)$. Then $\nu$ is absolutely continuous with respect to $\mu$, written $\nu\ll\mu$, when $\mu(A)=0$ implies $\nu(A)=0$ for every $A\in\mathcal F$. The measures are equivalent, $\nu\sim\mu$, when $\nu\ll\mu$ and $\mu\ll\nu$, and mutually singular, $\nu\perp\mu$, when some $A\in\mathcal F$ carries all of $\nu$ and none of $\mu$, that is, $\nu(A^c)=\mu(A)=0$.

Equivalence means the two measures share exactly their null sets, hence agree on which properties hold almost everywhere. This is the precise sense in which a change to an equivalent measure preserves almost-sure statements while it may alter expectations.
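On a finite sample space where every subset is measurable, the three relations of Definition 1 reduce to statements about individual outcomes, which makes them easy to check by machine. The following is a minimal sketch under that assumption; the function names and the example measures `mu`, `nu`, `rho` are illustrative and not part of the text above.

```python
# Measures on a finite set, stored as {outcome: mass}. With every subset
# measurable, mu(A) = 0 exactly when each outcome in A has mass zero.

def is_abs_continuous(nu, mu):
    """nu << mu: nu puts no mass on any outcome that mu ignores."""
    return all(nu[x] == 0 for x in mu if mu[x] == 0)

def are_equivalent(nu, mu):
    """nu ~ mu: the two measures have exactly the same null outcomes."""
    return is_abs_continuous(nu, mu) and is_abs_continuous(mu, nu)

def are_singular(nu, mu):
    """nu perp mu: A = {mu == 0} carries all of nu, so no outcome has both masses positive."""
    return all(nu[x] == 0 or mu[x] == 0 for x in mu)

# Hypothetical example measures on {a, b, c}.
mu = {"a": 0.5, "b": 0.5, "c": 0.0}
nu = {"a": 0.25, "b": 0.0, "c": 0.0}
rho = {"a": 0.0, "b": 0.0, "c": 1.0}

print(is_abs_continuous(nu, mu))  # nu vanishes on {c}, the only mu-null outcome
print(is_abs_continuous(mu, nu))  # mu charges "b", which nu ignores
print(are_singular(rho, mu))      # rho lives on {c}, mu lives on {a, b}
```

The check for singularity uses the particular set $A=\{\mu=0\}$; on a finite space this choice works whenever any choice does.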

#The theorem

Theorem 2

(Radon-Nikodym.) Let $\mu,\nu$ be $\sigma$-finite measures on $(\Omega,\mathcal F)$ with $\nu\ll\mu$. There exists a measurable $f\ge 0$, unique up to $\mu$-almost-everywhere equality, with

\nu(A)=\int_A f\dd\mu\qquad\text{for all } A\in\mathcal F. \tag{1}

The function $f$ is the Radon-Nikodym derivative $\dd\nu/\dd\mu$.
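A concrete instance may help fix ideas before the proof. The choices here are assumptions made only for illustration: take $\Omega=[0,1]$ with $\mu$ Lebesgue measure, and let $\nu$ be the measure with density $f(x)=2x$.

```latex
\nu(A)=\int_A 2x\dd\mu(x),\qquad
\nu\big([0,\tfrac12]\big)=\int_0^{1/2}2x\dd x=\tfrac14,\qquad
\nu\big([0,1]\big)=\int_0^1 2x\dd x=1.
```

Here $\nu\ll\mu$ because an integral over a $\mu$-null set vanishes, and $\dd\nu/\dd\mu=2x$. Since $2x>0$ except at the single point $x=0$, a $\mu$-null set, in this example $\nu\sim\mu$ as well.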

Proof

Assume first that $\mu$ and $\nu$ are finite and set $\phi=\mu+\nu$, a finite measure. The map $L(g)=\int g\dd\nu$ is linear on $L^2(\phi)$ and bounded, since $\nu\le\phi$ and Cauchy-Schwarz give

\abs{L(g)}\le\int\abs g\dd\nu\le\int\abs g\dd\phi\le\norm{g}_{L^2(\phi)}\,\phi(\Omega)^{1/2}. \tag{2}

By the Riesz representation theorem for bounded linear functionals on a Hilbert space there is $h\in L^2(\phi)$ with $\int g\dd\nu=\int gh\dd\phi$ for every $g\in L^2(\phi)$. Taking $g=\ind_A$ gives $\nu(A)=\int_A h\dd\phi$, and since $0\le\nu(A)\le\phi(A)$ for all $A$, the density $h$ satisfies $0\le h\le 1$ $\phi$-almost everywhere. Since $\phi=\mu+\nu$, the identity $\int g\dd\nu=\int gh\dd\phi$ rearranges to

\int g(1-h)\dd\nu=\int gh\dd\mu \tag{3}

and taking $g=\ind_E$ for $E=\{h=1\}$ yields $0=\mu(E)$, whence $\nu(E)=0$ by $\nu\ll\mu$ and so $\phi(E)=0$; thus $h<1$ $\phi$-almost everywhere. Define $f=h/(1-h)\ge 0$. For fixed $A\in\mathcal F$ substitute $g=\ind_A\,(1+h+\cdots+h^n)$ into Equation (3), which telescopes to

\int_A\big(1-h^{\,n+1}\big)\dd\nu=\int_A h\,(1+h+\cdots+h^n)\dd\mu. \tag{4}

Because $0\le h<1$ $\phi$-almost everywhere, $1-h^{n+1}\uparrow 1$ and $h(1+\cdots+h^n)\uparrow h/(1-h)=f$, so monotone convergence on each side gives $\nu(A)=\int_A f\dd\mu$, which is Equation (1).

For the $\sigma$-finite case write $\Omega=\bigcup_j M_j=\bigcup_k N_k$ with $\mu(M_j)<\infty$ and $\nu(N_k)<\infty$; disjointifying the countable common refinement $\{M_j\cap N_k\}$ into $\{\Omega_n\}$ gives $\phi(\Omega_n)=\mu(\Omega_n)+\nu(\Omega_n)<\infty$. Apply the finite case to $\mu_n(\cdot)=\mu(\cdot\cap\Omega_n)$ and $\nu_n(\cdot)=\nu(\cdot\cap\Omega_n)$, still with $\nu_n\ll\mu_n$, to obtain $f_n\ge 0$ supported on $\Omega_n$ with $\nu(A\cap\Omega_n)=\int_{A\cap\Omega_n}f_n\dd\mu$. Set $f=\sum_n f_n\ind_{\Omega_n}$; countable additivity and monotone convergence for series of nonnegative functions give $\nu(A)=\sum_n\nu(A\cap\Omega_n)=\sum_n\int_{A\cap\Omega_n}f_n\dd\mu=\int_A f\dd\mu$.

Uniqueness follows by localizing to finite, bounded pieces. Suppose $f$ and $\tilde f$ both satisfy Equation (1). With the partition $\{\Omega_m\}$ above set $A_{m,k}=\{f>\tilde f\}\cap\Omega_m\cap\{\tilde f\le k\}$; there $\tilde f$ is bounded and $\mu(A_{m,k})<\infty$, so $\int_{A_{m,k}}\tilde f\dd\mu<\infty$ and, both integrals being equal to $\nu(A_{m,k})$, we may subtract: $0=\int_{A_{m,k}}f\dd\mu-\int_{A_{m,k}}\tilde f\dd\mu=\int_{A_{m,k}}(f-\tilde f)\dd\mu$ with $f-\tilde f>0$ on $A_{m,k}$, forcing $\mu(A_{m,k})=0$. Since $\{f>\tilde f\}=\bigcup_{m,k}A_{m,k}$, this gives $\mu(\{f>\tilde f\})=0$; symmetrically $\mu(\{\tilde f>f\})=0$, so $f=\tilde f$ $\mu$-almost everywhere.
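The finite-measure step of the proof can be traced by hand on a finite sample space, where the Riesz representer is simply the outcome-wise ratio $\nu/\phi$. The sketch below is illustrative only: the measures `mu` and `nu` are made-up examples (with `mu` charging every outcome, so that $\nu\ll\mu$ holds trivially), and exact rational arithmetic is used so that the identities can be compared with `==`.

```python
from fractions import Fraction as F

# Hypothetical finite measures on {a, b, c}; mu charges every outcome, so nu << mu.
mu = {"a": F(1, 2), "b": F(1, 3), "c": F(1, 6)}
nu = {"a": F(1, 4), "b": F(3, 4), "c": F(0)}

# phi = mu + nu. On a finite space the representer h with
# sum(g * nu) == sum(g * h * phi) for all g is the outcome-wise ratio nu / phi.
phi = {x: mu[x] + nu[x] for x in mu}
h = {x: nu[x] / phi[x] for x in mu}

# Candidate density f = h / (1 - h); well defined because h < 1 wherever mu > 0.
f = {x: h[x] / (1 - h[x]) for x in mu}

def partial_sum(x, n):
    """h * (1 + h + ... + h^n) at outcome x, the right-hand integrand of Equation (4)."""
    return h[x] * sum(h[x] ** k for k in range(n + 1))

def nu_via_density(A):
    """Right-hand side of Equation (1): the integral of f over A against mu."""
    return sum(f[x] * mu[x] for x in A)

for x in mu:
    # The partial sums increase towards f, as in the monotone convergence step.
    print(x, h[x], f[x], float(partial_sum(x, 50)))
print(nu_via_density({"a", "b"}) == nu["a"] + nu["b"])
```

In this setting `f` coincides with the outcome-wise ratio `nu[x] / mu[x]`, which is what Equation (1) demands when every singleton is measurable.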

#Chain rule and decomposition

Proposition 3

If $\lambda\ll\mu\ll\nu$ are $\sigma$-finite, then $\lambda\ll\nu$ and

\frac{\dd\lambda}{\dd\nu}=\frac{\dd\lambda}{\dd\mu}\,\frac{\dd\mu}{\dd\nu}\qquad\nu\text{-almost everywhere.} \tag{5}

Proof

Write $g=\dd\lambda/\dd\mu$ and $w=\dd\mu/\dd\nu$. For $A\in\mathcal F$, two applications of Theorem 2 and the identity $\int \ind_A g\dd\mu=\int\ind_A g w\dd\nu$, valid first for simple $g$ and then for general $g\ge 0$ by monotone convergence, give $\lambda(A)=\int_A g\dd\mu=\int_A gw\dd\nu$. If $\nu(A)=0$ then this integral of the nonnegative $gw$ over a $\nu$-null set vanishes, so $\lambda(A)=0$, giving $\lambda\ll\nu$. Uniqueness of the derivative identifies $gw=\dd\lambda/\dd\nu$.
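Equation (5) can be checked outcome by outcome on a finite sample space. The following sketch assumes three made-up measures with $\lambda\ll\mu\ll\nu$; the helper `density` is hypothetical and returns one version of the derivative, using the value 0 on the null set where the denominator has no mass.

```python
from fractions import Fraction as F

def density(top, bottom):
    """Outcome-wise version of d(top)/d(bottom) on a finite space, assuming top << bottom.

    Where bottom has no mass the value is arbitrary (a bottom-null set); 0 is used.
    """
    return {x: (top[x] / bottom[x] if bottom[x] != 0 else F(0)) for x in bottom}

# Hypothetical finite measures with lam << mu << nu on {a, b, c}.
nu = {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)}
mu = {"a": F(1), "b": F(2), "c": F(0)}
lam = {"a": F(3), "b": F(0), "c": F(0)}

g = density(lam, mu)       # d lam / d mu
w = density(mu, nu)        # d mu / d nu
product = {x: g[x] * w[x] for x in nu}
direct = density(lam, nu)  # d lam / d nu

print(product == direct)
```

Because `nu` charges every outcome in this example, the two sides agree everywhere rather than merely $\nu$-almost everywhere.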

Theorem 4

(Lebesgue decomposition.) For $\sigma$-finite $\mu,\nu$ there is a unique splitting $\nu=\nu_{\mathrm{ac}}+\nu_{\mathrm s}$ with $\nu_{\mathrm{ac}}\ll\mu$ and $\nu_{\mathrm s}\perp\mu$ [1], [2].
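On a finite sample space the splitting in Theorem 4 can be written down directly: the absolutely continuous part keeps the mass of $\nu$ on outcomes that $\mu$ charges, and the singular part keeps the rest. The sketch below illustrates only this special case, with a hypothetical helper `lebesgue_decompose` and made-up measures; it is not the general construction.

```python
def lebesgue_decompose(nu, mu):
    """Split nu on a finite space into the part where mu > 0 and the part where mu == 0."""
    nu_ac = {x: (nu[x] if mu[x] > 0 else 0) for x in nu}
    nu_s = {x: (nu[x] if mu[x] == 0 else 0) for x in nu}
    return nu_ac, nu_s

# Hypothetical measures on {a, b, c}: mu ignores "b", where nu has mass.
mu = {"a": 1, "b": 0, "c": 2}
nu = {"a": 3, "b": 5, "c": 0}

nu_ac, nu_s = lebesgue_decompose(nu, mu)
print(nu_ac, nu_s)
```

Here the set $\{\mu=0\}$ carries all of `nu_s` and none of `mu`, which is the singularity condition of Definition 1.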

The density supplied by Theorem 2 implements the change to an equivalent probability measure. The factor $\dd\nu/\dd\mu$ is the reweighting applied pointwise to mass. When $\nu\sim\mu$ this density is strictly positive $\mu$-almost everywhere, and Proposition 3 applied to $\mu\ll\nu\ll\mu$ gives $\dd\mu/\dd\nu=(\dd\nu/\dd\mu)^{-1}$, so the change of measure is reversible.
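The reweighting and its reversal can be seen in a small computation. The sketch below assumes two made-up equivalent probability measures `P` and `Q` on a three-point space and a random variable `X`; it checks that the expectation under `Q` equals the `P`-expectation of `X` times the density, and that the reciprocal density undoes the change.

```python
from fractions import Fraction as F

# Hypothetical equivalent probability measures P ~ Q on {a, b, c}.
P = {"a": F(1, 2), "b": F(1, 4), "c": F(1, 4)}
Q = {"a": F(1, 4), "b": F(1, 4), "c": F(1, 2)}
X = {"a": 1, "b": 2, "c": 3}  # a random variable on the same space

Z = {x: Q[x] / P[x] for x in P}   # dQ/dP, strictly positive since P ~ Q
Z_inv = {x: 1 / Z[x] for x in P}  # dP/dQ, the reciprocal density

def expect(measure, values):
    """Expectation of `values` under `measure` on a finite space."""
    return sum(measure[x] * values[x] for x in measure)

E_Q_direct = expect(Q, X)
E_Q_reweighted = expect(P, {x: X[x] * Z[x] for x in P})
E_P_direct = expect(P, X)
E_P_back = expect(Q, {x: X[x] * Z_inv[x] for x in P})

print(E_Q_direct, E_Q_reweighted, E_P_direct, E_P_back)
```

The two measures assign different expectations to `X`, yet they agree on which outcomes are possible, which is the sense in which an equivalent change of measure preserves almost-sure statements.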

[1]

P. Billingsley, Probability and Measure, 3rd ed. Wiley, 1995.

[2]

E. Çınlar, Probability and Stochastics. Springer, 2011.

cite

@misc{radon-nikodym,
  author = {Zac Kienzle},
  title  = {The Radon-Nikodym Theorem},
  year   = {2026},
  month  = {05},
  url    = {https://zackienzle.com/blog/radon-nikodym}
}
