Independence

Independence is what lets randomness compound. Two experiments that do not inform each other multiply their probabilities, and the entire apparatus of sums of random variables, limit theorems, and product constructions rests on that one factorisation. Reading it through measure theory connects independence directly to the Fubini theorem and makes the factorisation of expectations a calculation rather than an axiom. This post builds independence from events up to the Kolmogorov zero-one law, on the probability space of the previous post [1], [2].

#Independence of events and sigma-algebras

Definition 1

Events $A_1,\dots,A_n$ are independent when $\P(A_{i_1}\cap\cdots\cap A_{i_k})=\P(A_{i_1})\cdots \P(A_{i_k})$ for every subcollection of distinct indices $i_1,\dots,i_k$; factorisation over pairs alone is not enough. Sub-sigma-algebras $\mathcal G_1,\dots,\mathcal G_n$ of $\mathcal F$ are independent when $\P(A_1\cap\cdots\cap A_n)=\P(A_1)\cdots\P(A_n)$ for all choices $A_i\in\mathcal G_i$; taking some $A_i=\Omega$ recovers the factorisation over every subcollection. Random variables $X_1,\dots,X_n$ are independent when the generated sigma-algebras $\sigma(X_1),\dots,\sigma(X_n)$ are. An arbitrary family (of events, sigma-algebras, or random variables) is independent when every finite subfamily is, so that $\P(A_{i_1}\cap\cdots\cap A_{i_k})=\P(A_{i_1}) \cdots\P(A_{i_k})$ for all finite sets of distinct indices $\{i_1,\dots,i_k\}$ and all $A_{i_j}\in\mathcal G_{i_j}$.
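The requirement that the product formula hold for every subcollection, not just for pairs, can be seen on a four-point sample space. The sketch below is an illustration only: the two-coin setup and the names `omega`, `prob`, and `factorises` are assumptions introduced here, not part of the post. It enumerates two fair coin flips exactly and checks three events that factor in pairs but not as a triple.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical sample space: two fair coin flips, each outcome has probability 1/4.
omega = list(product("HT", repeat=2))

def prob(event):
    """Exact probability of a set of outcomes under the uniform law."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] == "H"}   # first flip is heads
B = {w for w in omega if w[1] == "H"}   # second flip is heads
C = {w for w in omega if w[0] == w[1]}  # the two flips agree

def factorises(events):
    """Check P(intersection) == product of the P(event) for one subcollection."""
    inter = set(omega).intersection(*events)
    rhs = Fraction(1)
    for e in events:
        rhs *= prob(e)
    return prob(inter) == rhs

pairwise = all(factorises(pair) for pair in combinations([A, B, C], 2))
triple = factorises([A, B, C])
print(pairwise, triple)  # True False
```

Every pair intersects in the single outcome `("H", "H")`, of probability 1/4 = 1/2 · 1/2, but so does the triple, where independence would demand 1/8.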

Checking independence against every event of each sigma-algebra is unwieldy. It is enough to check a generating system closed under intersection.

Lemma 2

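If a collection of events $\mathcal G\subseteq\mathcal F$ is independent of a collection $\mathcal P\subseteq\mathcal F$ that is closed under finite intersection, in the sense that $\P(G\cap B)=\P(G)\P(B)$ for all $G\in\mathcal G$ and $B\in\mathcal P$, then $\mathcal G$ is independent of $\sigma(\mathcal P)$ in the same sense. Hence, applying this once on each side, independence of two sigma-algebras need only be checked on intersection-closed generating systems.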

Proof

Fix $G\in\mathcal G$. The collection $\mathcal D=\{B\in \mathcal F:\P(G\cap B)=\P(G)\P(B)\}$ contains $\mathcal P$ by hypothesis and contains $\Omega$. It is closed under proper differences, since for $B,B'\in\mathcal D$ with $B'\subseteq B$ we have $\P(G\cap(B\setminus B'))=\P(G\cap B)-\P(G\cap B')=\P(G)(\P(B)- \P(B'))=\P(G)\P(B\setminus B')$, and under increasing limits by continuity of measures from below. So $\mathcal D$ is a Dynkin system containing the pi-system $\mathcal P$, hence contains $\sigma(\mathcal P)$ by the Dynkin theorem. Since $G$ was arbitrary, every $G\in\mathcal G$ is independent of every set in $\sigma(\mathcal P)$.
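Closure under intersection is not a technicality in Lemma 2. The following sketch, on a hypothetical uniform four-point space with names (`G`, `generators`, `independent`) chosen here for illustration, shows an event that is independent of two generating sets whose intersection is missing from the generator, yet fails to be independent of that intersection, which lies in the generated sigma-algebra.

```python
from fractions import Fraction

# Hypothetical sample space: uniform probability on four points.
omega = frozenset({1, 2, 3, 4})

def prob(event):
    """Exact probability of an event under the uniform law."""
    return Fraction(len(event), len(omega))

def independent(g, b):
    """Check the product formula P(g & b) == P(g) * P(b)."""
    return prob(g & b) == prob(g) * prob(b)

G = frozenset({1, 4})
# This generating system is NOT closed under intersection: {1} is missing.
generators = [frozenset({1, 2}), frozenset({1, 3})]

on_generators = all(independent(G, b) for b in generators)
# {1} = {1, 2} & {1, 3} belongs to the sigma-algebra the generators produce.
on_intersection = independent(G, generators[0] & generators[1])
print(on_generators, on_intersection)  # True False
```

Here $\P(G\cap\{1\})=1/4$ while $\P(G)\P(\{1\})=1/8$, so the conclusion of the lemma fails once the pi-system hypothesis is dropped.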

#Independence and the product law

For random variables, independence is exactly the factorisation of the joint law.

Theorem 3

Random variables $X$ and $Y$ are independent if and only if their joint law is the product of the marginals, $\P_{(X,Y)}=\P_X\otimes\P_Y$, equivalently $\P(X\le x,Y\le y)=F_X(x)F_Y(y)$ for all $x,y$.

Proof

The sigma-algebra $\sigma(X)$ is generated by the intersection-closed system $\mathcal P_X=\{\{X\le x\}:x\in\R\}$, and likewise $\sigma(Y)$ by $\mathcal P_Y=\{\{Y\le y\}:y\in\R\}$. If $X$ and $Y$ are independent, the rectangle identity $\P(X\le x,Y\le y)= \P(X\le x)\P(Y\le y)=F_X(x)F_Y(y)$ holds for all $x,y$, because $\{X\le x\}\in\sigma(X)$ and $\{Y\le y\}\in\sigma(Y)$. In the other direction, the identity says $\mathcal P_X$ is independent of $\mathcal P_Y$; Lemma 2 with $\mathcal G=\mathcal P_X$, $\mathcal P=\mathcal P_Y$ gives $\mathcal P_X$ independent of $\sigma(Y)$, and a second application with $\mathcal G=\sigma(Y)$, $\mathcal P=\mathcal P_X$ gives $\sigma(X)$ independent of $\sigma(Y)$. So the rectangle identity is equivalent to the independence of $X$ and $Y$.

The identity also says the joint law $\P_{(X,Y)}$ and the product measure $\P_X\otimes\P_Y$ agree on the rectangles $(-\infty,x]\times (-\infty,y]$, an intersection-closed generating system of the Borel sets of the plane. Both are probability measures, hence finite with the same total mass $1$, and the rectangles $R_n=(-\infty,n]\times(-\infty,n]$ lie in the system with $R_n\uparrow\R^2$, so the two measures agree on the generated sigma-algebra by Dynkin uniqueness: $\P_{(X,Y)}=\P_X\otimes\P_Y$. Conversely, if $\P_{(X,Y)}=\P_X\otimes\P_Y$, evaluating both sides on the rectangle $(-\infty,x]\times(-\infty,y]$ gives $\P(X\le x,Y\le y)=F_X(x)F_Y(y)$, which is the rectangle identity and hence independence.
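For discrete variables the product law can be checked by exact enumeration. The sketch below is an illustration under assumptions made here, not an example from the post: a single fair die with faces relabelled $0,\dots,5$, with $X$ the face modulo $2$ and $Y$ the face modulo $3$. The helper `law` and all other names are hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical sample space: one fair die with faces relabelled 0..5.
omega = range(6)

def X(w):
    return w % 2  # parity of the face

def Y(w):
    return w % 3  # residue of the face modulo 3

def law(f):
    """Exact law of f under the uniform measure, as a dict value -> probability."""
    counts = Counter(f(w) for w in omega)
    return {v: Fraction(c, len(omega)) for v, c in counts.items()}

joint = law(lambda w: (X(w), Y(w)))
p_x, p_y = law(X), law(Y)
# Product of the marginals, evaluated on every pair of values.
product_law = {(a, b): p_x[a] * p_y[b] for a in p_x for b in p_y}
print(joint == product_law)  # True
```

Each of the six pairs of residues is hit by exactly one face, so the joint law puts mass $1/6=1/2\cdot1/3$ on every pair, and the two variables are independent even though they are functions of the same die.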

The product law turns the factorisation of expectations into an application of Fubini.

Theorem 4

If $X$ and $Y$ are independent and integrable, then $XY$ is integrable and $\E[XY]=\E[X]\,\E[Y]$.

Proof

By the change-of-variables formula for the law of the pair $(X,Y)$ and the product law of Theorem 3, $\E[\abs{XY}]=\int_{\R^2}\abs{xy}\,d\P_{(X,Y)}=\int_{\R^2}\abs{xy}\,d(\P_X\otimes\P_Y)$, which the Tonelli theorem factors as $\big(\int\abs x\,d\P_X\big) \big(\int\abs y\,d\P_Y\big)=\E\abs X\,\E\abs Y<\infty$, so $XY$ is integrable. The Fubini theorem then factors the signed integral the same way, $\E[XY]=\int_{\R^2}xy\,d(\P_X\otimes\P_Y)=\E[X]\,\E[Y]$.
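A small exact computation shows the factorisation at work and what goes wrong without independence. As an assumption for illustration only, take a uniform sample space $\{0,\dots,5\}$ with $X$ the outcome modulo $2$ and $Y$ the outcome modulo $3$, an independent pair, and contrast it with the dependent pair $(X,X)$. The helper `expect` is a name introduced here.

```python
from fractions import Fraction

# Hypothetical sample space: uniform on {0, ..., 5}.
omega = range(6)

def expect(f):
    """Exact expectation of f under the uniform law."""
    return sum(Fraction(f(w), len(omega)) for w in omega)

def X(w):
    return w % 2

def Y(w):
    return w % 3

# Independent pair: the expectation of the product factors.
lhs = expect(lambda w: X(w) * Y(w))
rhs = expect(X) * expect(Y)
print(lhs, rhs)  # 1/2 1/2

# Dependent pair (X, X): the factorisation fails.
dependent_lhs = expect(lambda w: X(w) * X(w))
dependent_rhs = expect(X) * expect(X)
print(dependent_lhs, dependent_rhs)  # 1/2 1/4
```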

Factoring expectations makes the variance of a sum of independent square-integrable variables additive, since the cross terms $\E[(X_i-\E X_i)(X_j-\E X_j)]$ with $i\ne j$ factor into $\E[X_i-\E X_i]\,\E[X_j-\E X_j]=0$. This additivity drives the law of large numbers.
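Written out, the additivity is a short expansion. The display below is a sketch assuming $X_1,\dots,X_n$ are independent and square-integrable, and it uses the notation $\operatorname{Var}$, which the post itself does not introduce. The cross terms vanish by Theorem 4 applied to the centred variables $X_i-\E X_i$, which are independent, integrable, and have mean zero.

$$
\begin{aligned}
\operatorname{Var}\Big(\sum_{i=1}^n X_i\Big)
&=\E\Big[\Big(\sum_{i=1}^n (X_i-\E X_i)\Big)^2\Big]\\
&=\sum_{i=1}^n\E\big[(X_i-\E X_i)^2\big]+\sum_{i\ne j}\E\big[(X_i-\E X_i)(X_j-\E X_j)\big]\\
&=\sum_{i=1}^n\operatorname{Var}(X_i).
\end{aligned}
$$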

#The Borel-Cantelli lemmas

For a sequence of events, the set on which infinitely many occur is the limit superior $\limsup_n A_n= \bigcap_N\bigcup_{n\ge N}A_n$. Two lemmas decide its probability from the sum $\sum_n\P(A_n)$.

Theorem 5

If $\sum_n\P(A_n)<\infty$, then $\P(\limsup_n A_n)=0$.

Proof

For every $N$, monotonicity and countable subadditivity give $\P(\limsup_n A_n)\le\P(\bigcup_{n\ge N}A_n) \le\sum_{n\ge N}\P(A_n)$. The right side is the tail of a convergent series, so it tends to $0$ as $N\to\infty$, forcing $\P(\limsup_n A_n)=0$.
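As a worked instance of this tail bound, with probabilities chosen here purely for illustration, suppose $\P(A_n)=n^{-2}$. Comparing the tail sum with an integral gives, for every $N\ge2$,

$$
\P\Big(\limsup_n A_n\Big)\le\sum_{n\ge N}\frac1{n^2}\le\int_{N-1}^\infty\frac{dx}{x^2}=\frac1{N-1}\xrightarrow[N\to\infty]{}0,
$$

so almost surely only finitely many of the $A_n$ occur. No independence is used anywhere in this bound.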

Theorem 6

If the events $A_n$ are independent and $\sum_n\P(A_n)=\infty$, then $\P(\limsup_n A_n)=1$.

Proof

Since $\limsup_n A_n$ is the decreasing intersection of the unions $\bigcup_{n\ge N}A_n$, it suffices to show $\P(\bigcup_{n\ge N}A_n)=1$ for every $N$, equivalently $\P(\bigcap_{n\ge N}A_n^c)=0$. Each singleton $\{A_n\}$ is an intersection-closed system generating $\sigma(A_n)=\{\emptyset,A_n,A_n^c, \Omega\}$, so Lemma 2 applied coordinatewise lifts the independence of the $A_n$ to independence of the sigma-algebras $\sigma(A_n)$, and since $A_n^c\in\sigma(A_n)$ the complements $A_n^c$ are independent. With the bound $1-p\le e^{-p}$, valid for all real $p$, we get for every $M\ge N$

$$
\P\Big(\bigcap_{n=N}^M A_n^c\Big)=\prod_{n=N}^M(1-\P(A_n))\le\exp\Big(-\sum_{n=N}^M\P(A_n)\Big). \tag{1}
$$

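As $M\to\infty$ the exponent diverges to $-\infty$ because $\sum_n\P(A_n)=\infty$, so the right side tends to $0$ and so does the left. By continuity from above the left side also decreases to $\P(\bigcap_{n\ge N}A_n^c)$, which is therefore $0$. Hence $\P(\bigcup_{n\ge N}A_n)=1$ for all $N$, and since a countable intersection of events of probability $1$ has probability $1$, intersecting over $N$ gives $\P(\limsup_n A_n)=1$.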
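The inequality between the product and the exponential can be checked numerically. The sketch below assumes independent events with $\P(A_n)=1/n$, an illustrative choice not made in the post, and uses hypothetical helper names `prob_none` and `bound`. For this choice the product telescopes to $(N-1)/M$, so with $N=2$ it equals $1/M$ exactly.

```python
from fractions import Fraction
from math import exp

def prob_none(p, N, M):
    """Exact product of (1 - p(n)) for n = N..M.

    Under independence this is the probability that no A_n occurs
    for N <= n <= M.
    """
    out = Fraction(1)
    for n in range(N, M + 1):
        out *= 1 - p(n)
    return out

def bound(p, N, M):
    """The exponential upper bound exp(-sum of p(n) for n = N..M)."""
    return exp(-sum(float(p(n)) for n in range(N, M + 1)))

def harmonic(n):
    """Assumed probabilities P(A_n) = 1/n, whose sum diverges."""
    return Fraction(1, n)

N = 2
for M in (10, 100, 1000):
    # The exact product equals 1/M and stays below the exponential bound.
    print(M, float(prob_none(harmonic, N, M)), bound(harmonic, N, M))
```

Both columns shrink to zero as $M$ grows, which is the mechanism that forces $\P(\bigcap_{n\ge N}A_n^c)=0$ when the series diverges.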

The two lemmas are a sharp dichotomy for independent events. The probability that infinitely many occur is $0$ or $1$ according as the sum of probabilities converges or diverges, with nothing in between.

#The Kolmogorov zero-one law

That dichotomy is an instance of a structural fact. Any event determined by the entire tail of an independent sequence, free of any finite initial segment, has probability $0$ or $1$.

Theorem 7

Let $X_1,X_2,\dots$ be independent random variables and $\mathcal T=\bigcap_n\sigma(X_n,X_{n+1},\dots)$ the tail sigma-algebra. Every $A\in\mathcal T$ has $\P(A)\in\{0,1\}$.

Proof

Fix $m$; we show the head block $\sigma(X_1,\dots,X_m)$ and the tail block $\sigma(X_{m+1},X_{m+2},\dots)$ are independent. The finite intersections $\mathcal P_1=\{B_1\cap \cdots\cap B_m:B_i\in\sigma(X_i)\}$ are closed under intersection and contain each $\sigma(X_i)$, $i\le m$ (set the other factors to $\Omega$), so they generate $\sigma(X_1,\dots, X_m)$. Likewise $\mathcal P_2=\{B_{m+1}\cap\cdots\cap B_{m+k}:k\ge1,\,B_j\in\sigma(X_j)\}$ is intersection-closed (pad the shorter intersection with factors $\Omega$ and intersect factor by factor) and contains each $\sigma(X_j)$, $j>m$, so it generates $\sigma(X_{m+1},X_{m+2},\dots)$. For $C_1\in\mathcal P_1$ and $C_2\in\mathcal P_2$, the independence of the finite family $X_1,\dots,X_{m+k}$ gives $\P(C_1\cap C_2)=\P(C_1)\P(C_2)$, both sides being the product of the probabilities of all the factors $B_i$, so $\mathcal P_1$ and $\mathcal P_2$ are independent. Applying Lemma 2 with $\mathcal G=\mathcal P_1$, $\mathcal P=\mathcal P_2$ gives $\mathcal P_1$ independent of $\sigma(\mathcal P_2)$, and a second application with $\mathcal G=\sigma(\mathcal P_2)$, $\mathcal P= \mathcal P_1$ lifts this to independence of $\sigma(X_1,\dots,X_m)$ and $\sigma(X_{m+1},X_{m+2},\dots)$.

Since $\mathcal T \subseteq\sigma(X_{m+1},X_{m+2},\dots)$, it is independent of $\sigma(X_1,\dots,X_m)$ for every $m$, hence of their union over $m$. This union is closed under intersection because the sigma-algebras $\sigma(X_1,\dots,X_m)$ increase with $m$, and it generates $\sigma(X_1,X_2,\dots)$. By Lemma 2 again, $\mathcal T$ is independent of $\sigma(X_1,X_2,\dots)$. But $\mathcal T\subseteq\sigma(X_1,X_2,\dots)$, so $\mathcal T$ is independent of itself. For $A\in\mathcal T$ this means $\P(A)=\P(A\cap A)=\P(A)\P(A)$, so $\P(A)=\P(A)^2$, whose only solutions are $0$ and $1$.

The zero-one law says the asymptotic behaviour of an independent sequence is never genuinely random. Events like the convergence of $\sum_n X_n$, or $\limsup_n X_n$ exceeding a threshold, depend on no finite block of the sequence and so are tail events, each of probability $0$ or $1$. Independence is therefore both the hypothesis that lets variables combine and the source of the rigidity that makes their limits deterministic; the law of large numbers and the convergence of random series are built on this structure.
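To see concretely why such events lie in $\mathcal T$, note that discarding finitely many terms never changes whether a series converges. In the notation of Theorem 7, as a sketch of the standard verification,

$$
\Big\{\sum_{n\ge1}X_n\text{ converges}\Big\}=\Big\{\sum_{n\ge m}X_n\text{ converges}\Big\}\in\sigma(X_m,X_{m+1},\dots)\quad\text{for every }m,
$$

so the event belongs to the intersection $\mathcal T$ of these sigma-algebras. The same reasoning applies to $\{\limsup_n X_n>c\}$ for a fixed threshold $c$, since $\limsup_n X_n=\limsup_{n\ge m}X_n$ for every $m$. The value of the sum, by contrast, generally does depend on the first terms; only the fact of convergence is a tail event.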

[1] R. Durrett, Probability: Theory and Examples, 5th ed. in Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019.

[2] O. Kallenberg, Foundations of Modern Probability, 3rd ed. Springer, 2021.

cite

@misc{independence,
  author = {Zac Kienzle},
  title  = {Independence},
  year   = {2026},
  month  = {05},
  url    = {https://zackienzle.com/blog/independence}
}
