Conditional Expectation

Information is modelled by a sub-$\sigma$-algebra, and conditioning identifies the best estimate of a random variable given only the events one can resolve. The conditional expectation is pinned down by an averaging identity, exists by the Radon-Nikodym theorem, and for square-integrable variables coincides with the $L^2$ projection.

#The defining property

Fix a probability space $(\Omega,\F,\P)$ and a sub-$\sigma$-algebra $\G\subseteq\F$. Let $X\in L^1(\P)$.

Definition 1

A conditional expectation of $X$ given $\G$ is a random variable $Y$ that is $\G$-measurable, lies in $L^1$, and satisfies

$$\int_G Y\dd\P=\int_G X\dd\P\qquad\text{for every } G\in\G. \tag{1}$$

Any such $Y$ is written $\E[X\mid\G]$.

The two demands pull in opposite directions. The requirement of $\G$-measurability coarsens $Y$ to the resolution of $\G$, while Equation (1) forces it to reproduce the averages of $X$ over every $\G$-event. We show below that exactly one random variable, up to modification on a null set, meets both.
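On a finite probability space the tension between the two demands can be checked by hand. The sketch below is only an illustration under assumed data: the six-point sample space, the variable `X`, the partition `blocks` generating $\G$, and the helper names `integral` and `cond_exp` are all hypothetical. It takes $Y$ to be the average of $X$ over each block and verifies Equation (1) on every $\G$-event.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite model: six equally likely outcomes.
omega = range(6)
prob = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w * w) for w in omega}      # X(w) = w^2
blocks = [{0, 1}, {2, 3, 4}, {5}]            # atoms generating G


def integral(f, event):
    """Integral of f over an event with respect to prob."""
    return sum(f[w] * prob[w] for w in event)


def cond_exp(f, atoms):
    """Average f over each atom; the result is constant on atoms,
    hence measurable with respect to the sigma-algebra they generate."""
    Y = {}
    for B in atoms:
        avg = integral(f, B) / sum(prob[w] for w in B)
        for w in B:
            Y[w] = avg
    return Y


Y = cond_exp(X, blocks)

# Every G-event is a union of atoms; check Equation (1) on all of them.
g_events = [set().union(*combo)
            for r in range(len(blocks) + 1)
            for combo in combinations(blocks, r)]
assert all(integral(Y, G) == integral(X, G) for G in g_events)
```

Exact rational arithmetic is used so that the averaging identity is tested as an equality rather than up to floating-point error.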

#Existence and uniqueness

Theorem 2

For every $X\in L^1(\P)$ a conditional expectation $\E[X\mid\G]$ exists and is unique up to $\P$-almost-everywhere equality.

Proof

Take first $X\ge 0$. Define a measure on $(\Omega,\G)$ by $\nu(G)=\int_G X\dd\P$. It is finite, with $\nu(\Omega)=\E[X]<\infty$, and absolutely continuous with respect to the restriction $\P\!\restriction_\G$, since $G\in\G$ with $\P(G)=0$ forces $\int_G X\dd\P=0$. The Radon-Nikodym theorem applied on $(\Omega,\G,\P\!\restriction_\G)$ supplies a $\G$-measurable $Y\ge 0$ with $\nu(G)=\int_G Y\dd\P$ for all $G\in\G$, which is exactly Equation (1); taking $G=\Omega$ shows $\E[Y]=\E[X]<\infty$, so $Y\in L^1$. For general $X$ set $\E[X\mid\G]=\E[X^+\mid\G]-\E[X^-\mid\G]$, where both terms are integrable since $X^\pm\le\abs X\in L^1$, and Equation (1) for the difference follows by linearity of the integral. For uniqueness, if $Y_1,Y_2$ both satisfy Equation (1), the $\G$-set $G=\{Y_1>Y_2\}$ gives $\int_G(Y_1-Y_2)\dd\P=0$ with an integrand that is strictly positive on $G$, so $\P(G)=0$. By symmetry $\P(Y_2>Y_1)=0$ as well, hence $Y_1=Y_2$ almost surely.
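When $\G$ is generated by finitely many atoms, the Radon-Nikodym density in the proof is simply the ratio $\nu(A)/\P(A)$ on each atom $A$ of positive probability, and it is unconstrained on null atoms. The following sketch makes both halves of the theorem concrete under assumed data: the four-point space, the values of `X`, and the names `nu`, `P`, `density` and `satisfies_eq1` are hypothetical choices made for illustration.

```python
from fractions import Fraction

# Hypothetical model in which the outcome 3 has probability zero.
prob = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(0)}
X = {0: Fraction(2), 1: Fraction(4), 2: Fraction(8), 3: Fraction(100)}
atoms = [{0, 1}, {2}, {3}]   # atoms of G; the last one is P-null


def nu(event):
    """nu(G) = integral of X over G, the measure used in the proof."""
    return sum(X[w] * prob[w] for w in event)


def P(event):
    return sum(prob[w] for w in event)


def density(value_on_null):
    """nu(A)/P(A) on each atom A with P(A) > 0; any value on null atoms."""
    Y = {}
    for A in atoms:
        ratio = nu(A) / P(A) if P(A) > 0 else value_on_null
        for w in A:
            Y[w] = ratio
    return Y


def satisfies_eq1(Y):
    # Checking atoms suffices: both sides are additive over disjoint unions.
    return all(sum(Y[w] * prob[w] for w in A) == nu(A) for A in atoms)


Y1 = density(Fraction(0))
Y2 = density(Fraction(-7))

assert satisfies_eq1(Y1) and satisfies_eq1(Y2)   # existence, twice over
differ = {w for w in prob if Y1[w] != Y2[w]}
assert P(differ) == 0                            # versions differ on a null set
```

The two versions `Y1` and `Y2` are genuinely different functions, yet both satisfy Equation (1), which is why uniqueness can only be claimed almost surely.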

#The projection viewpoint

Proposition 3

For $X\in L^2(\P)$ the conditional expectation $\E[X\mid\G]$ is the orthogonal projection of $X$ onto the closed subspace $L^2(\Omega,\G,\P)$; equivalently it is the $\G$-measurable minimizer of $\E[(X-Y)^2]$.

Proof

The space $L^2(\Omega,\G,\P)$ is complete, being $L^2$ of the measure space $(\Omega,\G,\P\!\restriction_\G)$, hence a closed subspace of $L^2(\P)$. (When $\G$ is not $\P$-complete one works with the completion $\bar\G$, and $L^2(\bar\G)=L^2(\G)$ inside $L^2(\P)$ since each $\bar\G$-measurable function agrees a.s. with a $\G$-measurable one.) So the projection $Y$ exists and is characterized by $X-Y\perp L^2(\G)$, that is $\E[(X-Y)Z]=0$ for every $Z\in L^2(\G)$. In particular, taking $Z=\ind_G$ with $G\in\G$, which is bounded and hence in $L^2$ on a finite measure space, recovers Equation (1). Since $Y$ is $\G$-measurable and $Y\in L^2\subseteq L^1$ on a probability space, $Y=\E[X\mid\G]$ by Theorem 2. The minimization statement follows because an orthogonal projection minimizes the distance from $X$ to the subspace.
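The orthogonality condition and the minimization statement can both be tested on a small example. The sketch below assumes a hypothetical four-point uniform space with two blocks; the names `expect`, `cond_exp` and `mse` are illustrative helpers. It checks that the residual $X-Y$ is orthogonal to each block indicator, and that no competitor on a grid of $\G$-measurable candidates beats the block averages in mean-square error. The grid search is a spot check, not a proof.

```python
from fractions import Fraction

# Hypothetical model: four equally likely outcomes, two blocks.
omega = range(4)
prob = {w: Fraction(1, 4) for w in omega}
X = {0: Fraction(1), 1: Fraction(3), 2: Fraction(2), 3: Fraction(10)}
blocks = [{0, 1}, {2, 3}]


def expect(f):
    return sum(f[w] * prob[w] for w in omega)


def cond_exp(f):
    """Block averages of f over the atoms in `blocks`."""
    Y = {}
    for B in blocks:
        avg = sum(f[w] * prob[w] for w in B) / sum(prob[w] for w in B)
        for w in B:
            Y[w] = avg
    return Y


def mse(c0, c1):
    """E[(X - Z)^2] for the G-measurable Z equal to c0, c1 on the blocks."""
    Z = {0: c0, 1: c0, 2: c1, 3: c1}
    return expect({w: (X[w] - Z[w]) ** 2 for w in omega})


Y = cond_exp(X)

# Orthogonality: the residual X - Y is orthogonal to each block indicator.
for B in blocks:
    ind = {w: Fraction(int(w in B)) for w in omega}
    assert expect({w: (X[w] - Y[w]) * ind[w] for w in omega}) == 0

# Minimization: no candidate on the grid has smaller mean-square error.
best = mse(Y[0], Y[2])
grid = [Fraction(k, 2) for k in range(25)]       # 0, 1/2, ..., 12
assert all(mse(c0, c1) >= best for c0 in grid for c1 in grid)
```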

#The calculus of conditioning

Proposition 4

Let $X,X'\in L^1(\P)$, let $a,b\in\R$, and let $\mathcal H\subseteq\G$ be a further sub-$\sigma$-algebra. Then almost surely

$$
\begin{aligned}
&\text{(i)} &&\E[aX+bX'\mid\G]=a\,\E[X\mid\G]+b\,\E[X'\mid\G],\\[2pt]
&\text{(ii)} &&\E\big[\E[X\mid\G]\,\big|\,\mathcal H\big]=\E[X\mid\mathcal H],\\[2pt]
&\text{(iii)} &&\E[ZX\mid\G]=Z\,\E[X\mid\G]\quad\text{for bounded }\G\text{-measurable }Z,\\[2pt]
&\text{(iv)} &&X\ge 0\ \text{a.s.}\ \Rightarrow\ \E[X\mid\G]\ge 0\ \text{a.s.}
\end{aligned} \tag{2}
$$

Proof

Statement (i) holds because the right side is $\G$-measurable, lies in $L^1$, and integrates correctly over each $G\in\G$ by linearity of the integral, so the uniqueness in Theorem 2 identifies it with $\E[aX+bX'\mid\G]$. For (ii), the inner variable $W=\E[X\mid\G]$ lies in $L^1$, so $\E[W\mid\mathcal H]$ is defined. The right side $\E[X\mid\mathcal H]$ is $\mathcal H$-measurable, and every $H\in\mathcal H$ also belongs to $\G$, so $\int_H\E[X\mid\mathcal H]\dd\P=\int_H X\dd\P=\int_H W\dd\P$ by Equation (1) applied at each level. The uniqueness in Theorem 2 then identifies $\E[X\mid\mathcal H]$ with $\E[W\mid\mathcal H]$. For (iii), the claim holds for $Z=\ind_{G_0}$ with $G_0\in\G$ because for $G\in\G$,

$$\int_G \ind_{G_0}\E[X\mid\G]\dd\P=\int_{G\cap G_0}\E[X\mid\G]\dd\P=\int_{G\cap G_0}X\dd\P=\int_G\ind_{G_0}X\dd\P, \tag{3}$$

so $\ind_{G_0}\E[X\mid\G]$, which is $\G$-measurable and integrable, satisfies Equation (1) for $\ind_{G_0}X$. By (i) the claim passes to simple $\G$-measurable $Z$, and it extends to general bounded $\G$-measurable $Z$ by dominated convergence. In detail, with $\abs Z\le M$ pick simple $\G$-measurable $Z_n$ with $\abs{Z_n}\le M$ and $Z_n\to Z$ pointwise; then $\abs{Z_nX}\le M\abs X\in L^1$ and $Z_nX\to ZX$ almost surely, so the ordinary dominated convergence theorem gives $\|Z_nX-ZX\|_1\to0$. Conditional expectation is an $L^1$-contraction: for $U\in L^1$, (i) and (iv) applied to $\abs U\pm U\ge0$ give $\abs{\E[U\mid\G]}\le\E[\abs U\mid\G]$, and integrating over $\Omega$ yields $\|\E[U\mid\G]\|_1\le\|U\|_1$. Hence $\|\E[Z_nX\mid\G]-\E[ZX\mid\G]\|_1\le\|Z_nX-ZX\|_1\to0$, so $\E[Z_nX\mid\G]\to\E[ZX\mid\G]$ in $L^1$ and along a subsequence almost surely, while $Z_n\E[X\mid\G]\to Z\E[X\mid\G]$ almost surely; equating limits gives the claim. For (iv), whose proof does not rely on (iii), if $X\ge0$ a.s. set $G=\{\E[X\mid\G]<0\}\in\G$; then $\int_G\E[X\mid\G]\dd\P=\int_G X\dd\P\ge0$ while the integrand is strictly negative on $G$, forcing $\P(G)=0$.
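The tower property (ii) and the rule for taking out what is known (iii) become exact identities between block averages when the $\sigma$-algebras are generated by nested finite partitions. The sketch below assumes a hypothetical eight-point uniform space; `fine` and `coarse` are illustrative partitions generating $\G$ and $\mathcal H$, and `cond_exp` is a helper written for this example.

```python
from fractions import Fraction

# Hypothetical model: eight equally likely outcomes.
omega = range(8)
prob = {w: Fraction(1, 8) for w in omega}
X = {w: Fraction(w ** 2 - 3 * w) for w in omega}
fine = [{0, 1}, {2, 3}, {4, 5}, {6, 7}]      # atoms of G
coarse = [{0, 1, 2, 3}, {4, 5, 6, 7}]        # atoms of H, each a union of G-atoms


def cond_exp(f, atoms):
    """Block averages of f over the given atoms."""
    Y = {}
    for B in atoms:
        avg = sum(f[w] * prob[w] for w in B) / sum(prob[w] for w in B)
        for w in B:
            Y[w] = avg
    return Y


# (ii) Tower property: conditioning on G and then on H equals conditioning on H.
tower_lhs = cond_exp(cond_exp(X, fine), coarse)
tower_rhs = cond_exp(X, coarse)
assert tower_lhs == tower_rhs

# (iii) Z is constant on the atoms of G, hence bounded and G-measurable.
Z = {w: Fraction(w // 2 + 1) for w in omega}
ZX = {w: Z[w] * X[w] for w in omega}
EX = cond_exp(X, fine)
assert cond_exp(ZX, fine) == {w: Z[w] * EX[w] for w in omega}
```

Reversing the order of `fine` and `coarse` in the nested call also returns `cond_exp(X, coarse)`, since an $\mathcal H$-measurable variable is already $\G$-measurable.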

Theorem 5

(Conditional Jensen.) If $\varphi:\R\to\R$ is convex and $X,\varphi(X)\in L^1(\P)$, then almost surely $\varphi\!\big(\E[X\mid\G]\big)\le\E[\varphi(X)\mid\G]$.

Proof

At each rational $q$ a convex $\varphi$ has a subgradient $s_q$, giving the affine minorant $\ell_q(x)=\varphi(q)+s_q(x-q)\le\varphi(x)$; continuity of $\varphi$ and density of $\Q$ then yield $\sup_q\ell_q(x)=\varphi(x)$ for every $x\in\R$, a supremum over a countable family. For each $q$, monotonicity (iv) and linearity (i) from Proposition 4, together with $\E[b\mid\G]=b$ for the constant $b$ (immediate from Theorem 2, since a constant is $\G$-measurable and reproduces its own averages), give $\E[\varphi(X)\mid\G]\ge\E[\ell_q(X)\mid\G]=s_q\E[X\mid\G]+(\varphi(q)-s_q q)=\ell_q\!\big(\E[X\mid\G]\big)$ almost surely. Each of these countably many inequalities holds off a null set, and the union of those null sets is again null, so taking the supremum over $q$ yields $\E[\varphi(X)\mid\G]\ge\sup_q\ell_q\!\big(\E[X\mid\G]\big)=\varphi\!\big(\E[X\mid\G]\big)$ almost surely [1].
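A numerical check of the inequality is easy on a finite space. The sketch below assumes a hypothetical six-point space with non-uniform weights and three convex functions chosen for illustration (`square`, `abs`, `hinge`); it computes the pointwise gap $\E[\varphi(X)\mid\G]-\varphi(\E[X\mid\G])$ and confirms that it is nonnegative on every outcome.

```python
from fractions import Fraction

# Hypothetical model with non-uniform weights summing to one.
omega = range(6)
prob = {0: Fraction(1, 12), 1: Fraction(3, 12), 2: Fraction(2, 12),
        3: Fraction(2, 12), 4: Fraction(1, 12), 5: Fraction(3, 12)}
X = {0: Fraction(-3), 1: Fraction(1), 2: Fraction(4),
     3: Fraction(-2), 4: Fraction(0), 5: Fraction(5)}
blocks = [{0, 1, 2}, {3, 4}, {5}]            # atoms generating G


def cond_exp(f):
    """Probability-weighted average of f over each atom."""
    Y = {}
    for B in blocks:
        avg = sum(f[w] * prob[w] for w in B) / sum(prob[w] for w in B)
        for w in B:
            Y[w] = avg
    return Y


# Three convex functions, chosen only as examples.
convex_maps = {
    "square": lambda x: x * x,
    "abs": abs,
    "hinge": lambda x: max(x, Fraction(0)),
}

EX = cond_exp(X)
gaps = {}
for name, phi in convex_maps.items():
    E_phi = cond_exp({w: phi(X[w]) for w in omega})
    # Pointwise Jensen gap E[phi(X) | G] - phi(E[X | G]).
    gaps[name] = {w: E_phi[w] - phi(EX[w]) for w in omega}
    assert all(g >= 0 for g in gaps[name].values())
```

For `square` the gap is the conditional variance of $X$ given $\G$, and on the singleton block, where $X$ is already $\G$-measurable, every gap vanishes.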

Conditional expectation is therefore both a Radon-Nikodym density, by Theorem 2, and, for square-integrable $X$, an orthogonal projection, by Proposition 3; the averaging identity Equation (1) is the common root of every rule above.

[1] D. Williams, Probability with Martingales. Cambridge University Press, 1991.

cite

@misc{conditional-expectation,
  author = {Zac Kienzle},
  title  = {Conditional Expectation},
  year   = {2026},
  month  = {05},
  url    = {https://zackienzle.com/blog/conditional-expectation}
}
