
20 May 2026 · 7 min read · updated 13 June 2026

Characteristic Functions

The characteristic function is the Fourier transform of a probability law, and it linearises the study of sums and limits. We define it, establish its elementary analytic properties, show that it factors over independent sums, and read moments from its derivatives. We then prove the inversion formula that recovers the law, and hence the uniqueness theorem, and the Levy continuity theorem: pointwise convergence of characteristic functions to a function continuous at the origin implies convergence in distribution. These are the tools with which the central limit theorem is proved.


The characteristic function is the Fourier transform of a probability law, and it converts the two hardest operations on random variables into easy ones. Adding independent variables becomes multiplying their transforms, and convergence in distribution becomes pointwise convergence of the transforms, so the central limit theorem reduces to a Taylor expansion. This post develops the characteristic function from its definition through the inversion formula and the Levy continuity theorem, working on the probability spaces of the previous posts and using the independence developed there [1], [2].

#Definition and elementary properties

Definition 1

The characteristic function of a random variable $X$ is $\varphi_X(t)=\mathbb{E}[e^{itX}]=\mathbb{E}[\cos tX]+i\,\mathbb{E}[\sin tX]$, defined for every real $t$.

The expectation exists because $e^{itX}$ is bounded, $\lvert e^{itX}\rvert=1$, so the characteristic function is defined for every law without integrability assumptions. This is its structural advantage over the moment generating function, which requires $\mathbb{E}[e^{tX}]$ to be finite near $0$.
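As a concrete instance of the definition, the following sketch evaluates $\varphi_X(t)$ for a law with finitely many atoms by summing $e^{itx}$ against the probabilities. The helper name `char_fn` and the dictionary representation of a law are illustrative assumptions, not notation from this post. For the symmetric coin on $\{-1,+1\}$ the sum is $(e^{it}+e^{-it})/2=\cos t$.

```python
import cmath
import math


def char_fn(law, t):
    """Characteristic function E[exp(itX)] of a finite discrete law.

    `law` maps each atom x to P(X = x) (a hypothetical representation).
    """
    return sum(p * cmath.exp(1j * t * x) for x, p in law.items())


# Symmetric coin on {-1, +1}: phi(t) = (e^{it} + e^{-it}) / 2 = cos t.
coin = {-1: 0.5, 1: 0.5}

for t in (0.0, 0.7, 2.0):
    phi = char_fn(coin, t)
    # Print the transform next to cos t and check the bound |phi(t)| <= 1.
    print(t, phi, math.cos(t), abs(phi) <= 1.0 + 1e-12)
```

No integrability condition enters: the sum is finite for every real `t` because each term has modulus at most its probability.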

Proposition 2

The characteristic function satisfies $\varphi_X(0)=1$ and $\lvert\varphi_X(t)\rvert\le 1$, is uniformly continuous on $\mathbb{R}$, and for independent $X$ and $Y$ obeys $\varphi_{X+Y}=\varphi_X\varphi_Y$.

Proof

At $t=0$, $\varphi_X(0)=\mathbb{E}[1]=1$, and $\lvert\varphi_X(t)\rvert=\lvert\mathbb{E}[e^{itX}]\rvert\le\mathbb{E}\lvert e^{itX}\rvert=1$. For uniform continuity, $\lvert\varphi_X(t+h)-\varphi_X(t)\rvert=\lvert\mathbb{E}[e^{itX}(e^{ihX}-1)]\rvert\le\mathbb{E}\lvert e^{ihX}-1\rvert$, a bound independent of $t$. Since $\lvert e^{ihX}-1\rvert\le 2$ and $e^{ihX}-1\to 0$ pointwise as $h\to 0$, the dominated convergence theorem gives $\mathbb{E}\lvert e^{ihX}-1\rvert\to 0$, which is uniform continuity. For the product rule, independence makes $e^{itX}$ and $e^{itY}$ independent, so the factorisation of expectations applied to real and imaginary parts gives $\varphi_{X+Y}(t)=\mathbb{E}[e^{itX}e^{itY}]=\mathbb{E}[e^{itX}]\,\mathbb{E}[e^{itY}]=\varphi_X(t)\varphi_Y(t)$.

The product rule is the reason characteristic functions suit sums. The distribution of a sum of independent variables, a convolution that is awkward to compute directly, becomes a pointwise product of transforms.
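The product rule can be checked numerically on finite discrete laws, where the law of the sum is an explicit convolution. This is a minimal sketch: the helpers `char_fn` and `convolve`, and the two example laws, are hypothetical and chosen only for illustration.

```python
import cmath
from collections import defaultdict


def char_fn(law, t):
    """E[exp(itX)] for a finite discrete law given as {atom: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in law.items())


def convolve(law_x, law_y):
    """Law of X + Y when X and Y are independent with the given finite laws."""
    out = defaultdict(float)
    for x, p in law_x.items():
        for y, q in law_y.items():
            out[x + y] += p * q  # independence: P(X = x, Y = y) = p * q
    return dict(out)


die = {k: 1 / 6 for k in range(1, 7)}   # fair six-sided die
coin = {0: 0.3, 1: 0.7}                 # biased coin on {0, 1}
total = convolve(die, coin)             # law of the independent sum

t = 1.3
lhs = char_fn(total, t)                     # transform of the convolution
rhs = char_fn(die, t) * char_fn(coin, t)    # product of the transforms
print(abs(lhs - rhs))                       # agrees up to rounding error
```

The convolution costs a double loop over atoms, while the right-hand side needs only two independent sums and one multiplication.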

#Moments

Differentiating under the expectation reads the moments of $X$ off the derivatives of $\varphi_X$ at the origin.

Proposition 3

If $\mathbb{E}\lvert X\rvert^n<\infty$, then $\varphi_X$ is $n$ times continuously differentiable with $\varphi_X^{(k)}(t)=\mathbb{E}[(iX)^k e^{itX}]$, so $\varphi_X^{(k)}(0)=i^k\,\mathbb{E}[X^k]$, and

$$\varphi_X(t)=\sum_{k=0}^n\frac{(it)^k}{k!}\,\mathbb{E}[X^k]+o(t^n)\quad\text{as }t\to 0. \tag{1}$$
Proof

The difference quotient $(e^{i(t+h)X}-e^{itX})/h$ converges to $iXe^{itX}$ as $h\to 0$ and is bounded in modulus by $\lvert X\rvert$, since $\lvert e^{ihX}-1\rvert\le\lvert hX\rvert$. When $\mathbb{E}\lvert X\rvert<\infty$ the dominated convergence theorem lets the derivative pass inside, $\varphi_X'(t)=\mathbb{E}[iXe^{itX}]$. Iterating $k\le n$ times, with dominator $\lvert X\rvert^k$ integrable by hypothesis (because $\lvert X\rvert^k\le 1+\lvert X\rvert^n$), gives $\varphi_X^{(k)}(t)=\mathbb{E}[(iX)^k e^{itX}]$, continuous in $t$ by dominated convergence, and at $t=0$ equal to $i^k\,\mathbb{E}[X^k]$. The expansion in Equation (1) is Taylor's theorem applied to the $n$ times differentiable $\varphi_X$ at the origin with these derivatives.
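To see the moment formula in action, the sketch below approximates $\varphi_X'(0)$ and $\varphi_X''(0)$ by central finite differences for a fair die and converts them into $\mathbb{E}[X]$ and $\mathbb{E}[X^2]$. The finite-difference scheme, the step size `h`, and the helper `char_fn` are illustrative assumptions. The proposition itself is about exact derivatives, so the recovered values are only approximate.

```python
import cmath


def char_fn(law, t):
    """E[exp(itX)] for a finite discrete law given as {atom: probability}."""
    return sum(p * cmath.exp(1j * t * x) for x, p in law.items())


die = {k: 1 / 6 for k in range(1, 7)}
h = 1e-3  # finite-difference step (an arbitrary illustrative choice)

# Central differences for the first and second derivatives at t = 0.
d1 = (char_fn(die, h) - char_fn(die, -h)) / (2 * h)
d2 = (char_fn(die, h) - 2 * char_fn(die, 0.0) + char_fn(die, -h)) / h**2

mean = (d1 / 1j).real           # phi'(0)  = i   * E[X]
second_moment = (-d2).real      # phi''(0) = i^2 * E[X^2] = -E[X^2]

# Exact moments computed directly from the law, for comparison.
exact_mean = sum(p * x for x, p in die.items())
exact_second = sum(p * x * x for x, p in die.items())
print(mean, exact_mean)
print(second_moment, exact_second)
```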

#The inversion formula and uniqueness

The characteristic function determines the law. The proof is an explicit formula recovering the probability of an interval from the transform.

Theorem 4

For $a<b$ with $\mathbb{P}(X=a)=\mathbb{P}(X=b)=0$,

$$\mathbb{P}(a<X\le b)=\lim_{T\to\infty}\frac{1}{2\pi}\int_{-T}^{T}\frac{e^{-ita}-e^{-itb}}{it}\,\varphi_X(t)\,dt. \tag{2}$$
Proof

Write $\varphi_X(t)=\int_{\mathbb{R}} e^{itx}\,d\mathbb{P}_X(x)$ and insert it into the integral. The integrand is bounded by $b-a$ on $[-T,T]\times\mathbb{R}$, since $\left\lvert\frac{e^{-ita}-e^{-itb}}{it}\right\rvert=\left\lvert\int_a^b e^{-its}\,ds\right\rvert\le b-a$, so the finite-measure Fubini theorem permits exchanging the integrals,

$$\frac{1}{2\pi}\int_{-T}^{T}\frac{e^{-ita}-e^{-itb}}{it}\,\varphi_X(t)\,dt=\int_{\mathbb{R}}\Big(\frac{1}{\pi}\int_0^T\frac{\sin t(x-a)-\sin t(x-b)}{t}\,dt\Big)\,d\mathbb{P}_X(x). \tag{3}$$

The inner expression collapses to a sine integral because, after division by $it$, the cosine parts of $e^{it(x-a)}-e^{it(x-b)}$ carry a factor $1/t$ that is odd in $t$ and integrate to zero over the symmetric range, while the sine parts are even and survive. The Dirichlet integral $\int_0^T\frac{\sin ct}{t}\,dt\to\frac{\pi}{2}\operatorname{sgn}(c)$ as $T\to\infty$, with the partial integrals bounded uniformly in $c$ and $T$. So the inner expression is bounded and converges to $\tfrac12\big(\operatorname{sgn}(x-a)-\operatorname{sgn}(x-b)\big)$, which equals $1$ on $(a,b)$, $\tfrac12$ at $x\in\{a,b\}$, and $0$ outside $[a,b]$. The bounded convergence theorem sends the right side of Equation (3) to $\mathbb{P}(a<X<b)+\tfrac12\big(\mathbb{P}(X=a)+\mathbb{P}(X=b)\big)$, which under the continuity hypothesis is $\mathbb{P}(a<X\le b)$.
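The inversion formula can be tried numerically on a law whose transform is known in closed form. The sketch below assumes the standard normal, with $\varphi(t)=e^{-t^2/2}$, and approximates the truncated integral in Equation (2) by a midpoint rule. The function names, the cutoff `T`, and the step count are illustrative choices rather than part of the theorem.

```python
import cmath
import math


def phi_normal(t):
    """Characteristic function of the standard normal law."""
    return math.exp(-t * t / 2)


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2)))


def invert(phi, a, b, T=12.0, steps=4000):
    """Midpoint approximation of (1/2pi) int_{-T}^{T} (e^{-ita}-e^{-itb})/(it) phi(t) dt."""
    h = 2 * T / steps
    total = 0.0
    for k in range(steps):
        t = -T + (k + 0.5) * h  # with an even step count no midpoint lands on t = 0
        kernel = (cmath.exp(-1j * t * a) - cmath.exp(-1j * t * b)) / (1j * t)
        total += (kernel * phi(t)).real  # the imaginary part is odd in t and cancels
    return total * h / (2 * math.pi)


a, b = -1.0, 2.0
approx = invert(phi_normal, a, b)
exact = std_normal_cdf(b) - std_normal_cdf(a)
print(approx, exact)
```

The Gaussian transform decays so fast that a moderate cutoff already captures the limit. For laws with slowly decaying transforms the convergence in $T$ is only conditional and a naive truncation would behave much worse.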

Corollary 5

Two random variables with the same characteristic function have the same law.

Proof

Equation (2) determines $\mathbb{P}(a<X\le b)$ from $\varphi_X$ for every pair $a<b$ of non-atoms. The non-atoms are all but countably many points, hence dense, so letting $a\to-\infty$ along non-atoms determines $F_X(b)=\mathbb{P}(X\le b)$ at every non-atom $b$. The distribution function is therefore determined on a dense set of points and, being right-continuous, everywhere. The law is determined by its distribution function.

#Convergence in distribution and tightness

A sequence of laws converges in distribution, written $X_n\Rightarrow X$, when $F_{X_n}(x)\to F_X(x)$ at every continuity point $x$ of $F_X$. The characteristic functions control this convergence, but only once tightness prevents mass from escaping to infinity. The next lemma supplies the tool for tightness: it bounds the tail mass by the behaviour of the characteristic function near the origin.
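The restriction to continuity points matters even in the simplest case. As a small illustration, not taken from this post, let $X_n$ be the constant $1/n$ and $X$ the constant $0$. The sketch evaluates the distribution functions at a few points; the helper `cdf_point_mass` is a hypothetical name.

```python
def cdf_point_mass(c):
    """Distribution function x -> P(X <= x) of the constant random variable c."""
    return lambda x: 1.0 if x >= c else 0.0


F = cdf_point_mass(0.0)  # limit law: point mass at 0

for x in (-0.5, 0.0, 0.5):
    values = [cdf_point_mass(1 / n)(x) for n in (1, 10, 100, 1000)]
    # Compare F_n(x) along the sequence with the candidate limit F(x).
    print(x, values, F(x))
```

At $x=0$ every $F_{X_n}(0)$ is $0$ while $F_X(0)=1$, yet $X_n\Rightarrow X$ holds because $0$ is the only discontinuity of $F_X$ and convergence holds at every other point.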

Lemma 6

For every $u>0$, $\mathbb{P}\big(\lvert X\rvert\ge 2/u\big)\le\dfrac{1}{u}\displaystyle\int_{-u}^{u}\big(1-\operatorname{Re}\varphi_X(t)\big)\,dt$.

Proof

By change of variables and Fubini,

$$\frac1u\int_{-u}^{u}\big(1-\operatorname{Re}\varphi_X(t)\big)\,dt=\int_{\mathbb{R}}\frac1u\int_{-u}^{u}\big(1-\cos tx\big)\,dt\,d\mathbb{P}_X(x)=2\int_{\mathbb{R}}\Big(1-\frac{\sin ux}{ux}\Big)\,d\mathbb{P}_X(x), \tag{4}$$

the inner integral being $\frac1u\big(2u-2\sin(ux)/x\big)=2\big(1-\sin(ux)/(ux)\big)$. The integrand is nonnegative, and where $\lvert ux\rvert\ge 2$ it satisfies $1-\frac{\sin ux}{ux}\ge 1-\frac{1}{\lvert ux\rvert}\ge\tfrac12$. Restricting the integral to $\{\lvert x\rvert\ge 2/u\}$ and using this bound gives the right side at least $2\cdot\tfrac12\,\mathbb{P}(\lvert X\rvert\ge 2/u)=\mathbb{P}(\lvert X\rvert\ge 2/u)$.

The estimate says a characteristic function close to $1$ near the origin forces the law to concentrate, because $1-\operatorname{Re}\varphi_X(t)$ small on $[-u,u]$ bounds the tail mass beyond $2/u$.
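The tail estimate can be compared with an exact tail in a case where both sides are computable. The sketch below assumes a standard normal $Z$, for which $\varphi(t)=e^{-t^2/2}$ is real and $\mathbb{P}(\lvert Z\rvert\ge c)=\operatorname{erfc}(c/\sqrt2)$. The function names and the midpoint quadrature are illustrative assumptions.

```python
import math


def tail_bound(phi_real, u, steps=2000):
    """Midpoint approximation of (1/u) * int_{-u}^{u} (1 - Re phi(t)) dt."""
    h = 2 * u / steps
    total = sum(1.0 - phi_real(-u + (k + 0.5) * h) for k in range(steps))
    return total * h / u


def normal_tail(c):
    """P(|Z| >= c) for a standard normal Z."""
    return math.erfc(c / math.sqrt(2))


def phi(t):
    """Real part of the standard normal characteristic function (it is already real)."""
    return math.exp(-t * t / 2)


for u in (0.1, 1.0, 4.0):
    # Left column: exact tail mass beyond 2/u. Right column: the bound from the lemma.
    print(u, normal_tail(2 / u), tail_bound(phi, u))
```

The bound is far from sharp here, which is fine for its purpose: tightness only needs the right-hand side to be small for small $u$.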

#The Levy continuity theorem

Theorem 7

Let $X_n$ have characteristic functions $\varphi_n$. If $\varphi_n(t)\to\varphi(t)$ for every $t$ and $\varphi$ is continuous at $0$, then $\varphi$ is the characteristic function of a law $\mu$ and $X_n\Rightarrow\mu$.

Proof

First, tightness. Fix $\varepsilon>0$. Since $\varphi(0)=\lim\varphi_n(0)=1$ and $\varphi$ is continuous at $0$, choose $u>0$ with $\frac1u\int_{-u}^{u}\big(1-\operatorname{Re}\varphi(t)\big)\,dt<\varepsilon/2$. The integrands $1-\operatorname{Re}\varphi_n$ are bounded by $2$ and converge pointwise to $1-\operatorname{Re}\varphi$, so by bounded convergence the integrals converge and $\frac1u\int_{-u}^{u}\big(1-\operatorname{Re}\varphi_n(t)\big)\,dt<\varepsilon$ for all large $n$. By Lemma 6, $\mathbb{P}(\lvert X_n\rvert\ge 2/u)<\varepsilon$ for those $n$, and enlarging $2/u$ to cover the finitely many remaining $n$ makes the family tight.

Now take any subsequence. By Helly's selection theorem, every sequence of distribution functions has a further subsequence converging at continuity points to a nondecreasing right-continuous limit $G$, obtained by a diagonal extraction of convergent values on the rationals followed by right-continuous interpolation. Tightness forces $G$ to be a genuine distribution function, with no mass lost to $\pm\infty$. Along that subsequence $X_{n_k}\Rightarrow\nu$ for the law $\nu$ of $G$, and convergence in distribution with the uniformly bounded continuous integrands $e^{itx}$ gives $\varphi_{n_k}(t)\to\varphi_\nu(t)$. But $\varphi_{n_k}(t)\to\varphi(t)$ by hypothesis, so $\varphi_\nu=\varphi$, and by Corollary 5 every subsequential limit is the same law $\mu$ with characteristic function $\varphi$. A sequence all of whose subsequences have a further subsequence converging to the same limit converges to that limit, so $X_n\Rightarrow\mu$.
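The continuity hypothesis at the origin is what rules out escaping mass. As an illustration that is not part of the proof, take $X_n\sim N(0,n)$, whose characteristic function is $e^{-nt^2/2}$. The transforms converge pointwise to the indicator of $\{t=0\}$, which is discontinuous at $0$, and the mass in any fixed window tends to zero. The function names below are hypothetical.

```python
import math


def phi_n(n, t):
    """Characteristic function of N(0, n), a centred normal with variance n."""
    return math.exp(-n * t * t / 2)


def mass_within(n, K):
    """P(|X_n| <= K) for X_n ~ N(0, n)."""
    return math.erf(K / math.sqrt(2 * n))


for n in (1, 100, 10**4, 10**6):
    # phi_n(0) stays at 1, phi_n(0.5) collapses to 0, and the mass in [-10, 10] drains away.
    print(n, phi_n(n, 0.0), phi_n(n, 0.5), mass_within(n, 10.0))
```

The pointwise limit exists for every $t$, but it is not the characteristic function of any law, and the sequence is not tight.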

The Levy continuity theorem is the analytic engine of the central limit theorem. To prove a sum of independent variables converges in distribution to a Gaussian, one shows its characteristic function, a product of the individual transforms by Proposition 2, converges pointwise to $e^{-t^2/2}$, the transform of the standard normal, and the continuity theorem converts that pointwise convergence into the distributional limit. The expansion in Equation (1) supplies the pointwise limit through a second-order Taylor approximation of each factor; that computation is carried out in the central limit theorem post. The characteristic function thereby reduces a statement about the shape of a distribution to a calculation with an ordinary function of a real variable.
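A simple case makes the pointwise limit visible. For independent symmetric coins $X_j\in\{-1,+1\}$, each with transform $\cos t$, Proposition 2 together with the scaling $\varphi_{cX}(t)=\varphi_X(ct)$ gives the normalised sum $(X_1+\dots+X_n)/\sqrt n$ the transform $\cos(t/\sqrt n)^n$. The sketch below, with a hypothetical function name and an arbitrary test point, compares it with $e^{-t^2/2}$.

```python
import math


def phi_normalised_sum(n, t):
    """Transform of (X_1 + ... + X_n) / sqrt(n) for i.i.d. symmetric +-1 coins."""
    return math.cos(t / math.sqrt(n)) ** n


t = 1.5
target = math.exp(-t * t / 2)  # transform of the standard normal at t

errors = []
for n in (1, 10, 100, 10**4):
    value = phi_normalised_sum(n, t)
    errors.append(abs(value - target))
    print(n, value, target)
```

The gap shrinks as $n$ grows, which is the pointwise convergence the continuity theorem turns into convergence in distribution.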

[1] R. Durrett, Probability: Theory and Examples, 5th ed. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019.
[2] D. Williams, Probability with Martingales. Cambridge University Press, 1991.

Part 3 of 9 in Probability

cite
@misc{characteristic-functions,
  author = {Zac Kienzle},
  title  = {Characteristic Functions},
  year   = {2026},
  month  = {05},
  url    = {https://zackienzle.com/blog/characteristic-functions}
}