Skip to content
homeaboutworkprojectsthesiswritingresume
Loading
~/blog/convex-sets-and-functions0%dark
  1. home/
  2. writing/
  3. Convex Sets and Functions

07 June 2026 · 6 min read · updated 13 June 2026

Convex Sets and Functions

Convexity is the property that makes optimisation well behaved, because a convex function has no local minima other than the global one and a convex set can be separated from any point outside it by a hyperplane. We define convex sets and functions, characterise convexity through the supporting tangent and the positive semidefinite Hessian, prove the separating and supporting hyperplane theorems from the projection onto a closed convex set, and prove that a local minimum of a convex function is global. These are the structural facts the duality theory of optimisation is built on.

  • 1 equation
  • 9 results
  • 6 connections
  • optimization
  • convex-analysis
  • real-analysis
On this page▾
  • Convex sets and functions
  • Separating and supporting hyperplanes
  • Local minima are global

6 min left

  • Convex sets and functions2m
  • Separating and supporting hyperplanes2m
  • Local minima are global1m

Convexity is the dividing line between optimisation problems that can be solved and those that cannot. A convex function curves upward everywhere, so it has a single basin and no false minima to trap a descent. A convex set can be cleanly separated from any outside point by a flat boundary, the geometric root of every duality theorem. This post builds the two ideas and their characterisations, deriving the separating hyperplane from the projection onto a closed convex set proved in the Hilbert space track [1], [2]. The setting is Euclidean space Rn\R^nRn with inner product ⟨x,y⟩\ip xy⟨x,y⟩.

#Convex sets and functions

Definition1

A set C⊆RnC\subseteq\R^nC⊆Rn is convex when it contains the segment between any two of its points, λx+(1−λ)y∈C\lambda x+(1-\lambda)y\in Cλx+(1−λ)y∈C for all x,y∈Cx,y\in Cx,y∈C and λ∈[0,1]\lambda\in[0,1]λ∈[0,1]. A function f:Rn→Rf:\R^n\to\Rf:Rn→R is convex when

f(λx+(1−λ)y)≤λf(x)+(1−λ)f(y)(1)f\big(\lambda x+(1-\lambda)y\big)\le\lambda f(x)+(1-\lambda)f(y) \tag{1}f(λx+(1−λ)y)≤λf(x)+(1−λ)f(y)(1)

for all x,yx,yx,y and λ∈[0,1]\lambda\in[0,1]λ∈[0,1], and strictly convex when the inequality is strict for x≠yx\neq yx=y and λ∈(0,1)\lambda\in(0,1)λ∈(0,1).

A function is convex exactly when the region above its graph is a convex set, since the epigraph {(x,t):t≥f(x)}\{(x,t):t\ge f(x)\}{(x,t):t≥f(x)} contains the segment between two of its points precisely when Equation (1) holds. Convexity of a function is therefore a special case of convexity of a set, and the finite form of the defining inequality is Jensen's inequality, f(∑iλixi)≤∑iλif(xi)f(\sum_i\lambda_i x_i)\le\sum_i \lambda_i f(x_i)f(∑i​λi​xi​)≤∑i​λi​f(xi​) for weights λi≥0\lambda_i\ge 0λi​≥0 summing to 111, which follows by induction. For smooth functions convexity is read from derivatives.

Proposition2

A differentiable fff is convex if and only if f(y)≥f(x)+⟨∇f(x),y−x⟩f(y)\ge f(x)+\ip{\nabla f(x)}{y-x}f(y)≥f(x)+⟨∇f(x),y−x⟩ for all x,yx,yx,y, the graph lying above each tangent. A twice continuously differentiable fff is convex if and only if its Hessian ∇2f(x)\nabla^2 f(x)∇2f(x) is positive semidefinite for every xxx.

Proof

Suppose fff is convex. For λ∈(0,1)\lambda\in(0,1)λ∈(0,1), Equation (1) rearranges to f(x+λ(y−x))−f(x)λ≤f(y)−f(x)\frac{f(x+\lambda(y-x)) -f(x)}{\lambda}\le f(y)-f(x)λf(x+λ(y−x))−f(x)​≤f(y)−f(x), and letting λ→0+\lambda\to 0^+λ→0+ the left side tends to the directional derivative ⟨∇f(x),y−x⟩\ip{\nabla f(x)}{y-x}⟨∇f(x),y−x⟩, giving the tangent inequality. Conversely, if the tangent inequality holds, apply it at the point z=λx+(1−λ)yz=\lambda x+(1-\lambda)yz=λx+(1−λ)y to both xxx and yyy, f(x)≥f(z)+⟨∇f(z),x−z⟩f(x)\ge f(z)+\ip{\nabla f(z)}{x-z}f(x)≥f(z)+⟨∇f(z),x−z⟩ and f(y)≥f(z)+⟨∇f(z),y−z⟩f(y)\ge f(z)+\ip{\nabla f(z)}{y-z}f(y)≥f(z)+⟨∇f(z),y−z⟩, and take the λ,(1−λ)\lambda,(1-\lambda)λ,(1−λ) weighted sum, whose gradient terms cancel because λ(x−z)+(1−λ)(y−z)=0\lambda(x-z)+(1-\lambda)(y-z)=0λ(x−z)+(1−λ)(y−z)=0, leaving λf(x)+(1−λ)f(y)≥f(z)\lambda f(x)+(1-\lambda)f(y)\ge f(z)λf(x)+(1−λ)f(y)≥f(z). For the Hessian, the tangent inequality along the line x+svx+svx+sv is equivalent to the one-variable function g(s)=f(x+sv)g(s)=f(x+sv)g(s)=f(x+sv) being convex for every direction vvv, which for g∈C2g\in C^2g∈C2 holds if and only if g′′(s)=⟨v,∇2f(x+sv)v⟩≥0g''(s)=\ip{v}{\nabla^2 f(x+sv)v}\ge 0g′′(s)=⟨v,∇2f(x+sv)v⟩≥0, that is the Hessian is positive semidefinite.

#Separating and supporting hyperplanes

The geometric heart of convexity is that a convex set can be separated from any point it does not contain. The closest point of the set provides the separating direction.

Theorem3

Let C⊆RnC\subseteq\R^nC⊆Rn be nonempty, closed, and convex, and let z∉Cz\notin Cz∈/C. Then there is a vector a≠0a\neq 0a=0 and a scalar ccc with ⟨a,x⟩<c<⟨a,z⟩\ip ax<c<\ip az⟨a,x⟩<c<⟨a,z⟩ for all x∈Cx\in Cx∈C.

Proof

Let ppp be the projection of zzz onto CCC, the unique nearest point, which exists because CCC is closed and convex and Rn\R^nRn is complete. Set a=z−p≠0a=z-p\neq 0a=z−p=0. By the variational inequality ⟨z−p,x−p⟩≤0\ip{z-p}{x-p}\le 0⟨z−p,x−p⟩≤0 for all x∈Cx\in Cx∈C, we have ⟨a,x⟩≤⟨a,p⟩\ip ax \le\ip ap⟨a,x⟩≤⟨a,p⟩ for all x∈Cx\in Cx∈C. Moreover ⟨a,z⟩−⟨a,p⟩=⟨a,z−p⟩=∥a∥2>0\ip az-\ip ap=\ip a{z-p}=\norm a^2>0⟨a,z⟩−⟨a,p⟩=⟨a,z−p⟩=∥a∥2>0, so ⟨a,p⟩<⟨a,z⟩\ip ap<\ip az⟨a,p⟩<⟨a,z⟩. Choosing ccc strictly between ⟨a,p⟩\ip ap⟨a,p⟩ and ⟨a,z⟩\ip az⟨a,z⟩ separates, since ⟨a,x⟩≤⟨a,p⟩<c<⟨a,z⟩\ip ax\le\ip ap<c<\ip az⟨a,x⟩≤⟨a,p⟩<c<⟨a,z⟩ for all x∈Cx\in Cx∈C.

At a boundary point the separating hyperplane becomes a supporting one, touching the set.

Theorem4

Let C⊆RnC\subseteq\R^nC⊆Rn be convex with nonempty interior and let x0x_0x0​ be a boundary point. Then there is a vector a≠0a\neq 0a=0 with ⟨a,x⟩≤⟨a,x0⟩\ip a{x}\le\ip a{x_0}⟨a,x⟩≤⟨a,x0​⟩ for all x∈Cx\in Cx∈C, a supporting hyperplane at x0x_0x0​.

Proof

Pick an interior point w∈int⁡Cw\in\operatorname{int}Cw∈intC and set zk=x0+1k(x0−w)z_k=x_0+\tfrac1k(x_0-w)zk​=x0​+k1​(x0​−w), which converge to x0x_0x0​ and lie outside the closure C‾\overline CC. Indeed if zk∈C‾z_k\in\overline Czk​∈C, then x0x_0x0​ is a proper convex combination of the interior point www and the point zk∈C‾z_k\in\overline Czk​∈C, namely x0=1k+1w+kk+1zkx_0=\tfrac1{k+1}w +\tfrac k{k+1}z_kx0​=k+11​w+k+1k​zk​, which places x0x_0x0​ in int⁡C\operatorname{int}CintC because the open segment from an interior point to any point of the closure lies in the interior, contradicting x0x_0x0​ being on the boundary. That principle holds because, picking r>0r>0r>0 with B(w,r)⊆CB(w,r)\subseteq CB(w,r)⊆C and y′∈Cy'\in Cy′∈C near zkz_kzk​, convexity gives λB(w,r)+(1−λ)y′=B(λw+(1−λ)y′,λr)⊆C\lambda B(w,r)+(1-\lambda)y'=B(\lambda w+(1-\lambda)y',\lambda r)\subseteq CλB(w,r)+(1−λ)y′=B(λw+(1−λ)y′,λr)⊆C for λ∈(0,1]\lambda\in(0,1]λ∈(0,1], and ∥y′−zk∥\norm{y'-z_k}∥y′−zk​∥ small enough keeps λw+(1−λ)zk\lambda w+(1-\lambda)z_kλw+(1−λ)zk​ inside this ball. Each zkz_kzk​ gives by Theorem 3, applied to the closed convex C‾\overline CC (the closure of a convex set is convex, since for xn→xx_n\to xxn​→x, yn→yy_n\to yyn​→y in CCC the combination λxn+(1−λ)yn∈C\lambda x_n+(1-\lambda)y_n\in Cλxn​+(1−λ)yn​∈C converges to λx+(1−λ)y∈C‾\lambda x+(1-\lambda)y\in\overline Cλx+(1−λ)y∈C), a unit vector aka_kak​ with ⟨ak,x⟩≤⟨ak,zk⟩\ip{a_k}{x}\le\ip{a_k}{z_k}⟨ak​,x⟩≤⟨ak​,zk​⟩ for all x∈C‾x\in\overline Cx∈C. The unit vectors lie in the compact unit sphere, so by the Bolzano-Weierstrass theorem a subsequence converges to a unit vector aaa. Passing to the limit in ⟨ak,x⟩≤⟨ak,zk⟩\ip{a_k}{x}\le\ip{a_k}{z_k}⟨ak​,x⟩≤⟨ak​,zk​⟩, with zk→x0z_k\to x_0zk​→x0​, gives ⟨a,x⟩≤⟨a,x0⟩\ip a{x}\le\ip a{x_0}⟨a,x⟩≤⟨a,x0​⟩ for all x∈Cx\in Cx∈C, the supporting hyperplane.

#Local minima are global

The payoff of convexity for optimisation is that it abolishes the distinction between local and global minima.

Theorem5

If fff is convex, then every local minimum is a global minimum, and the set of minimisers is convex. If fff is strictly convex the minimiser is unique.

Proof

Let x∗x^\astx∗ be a local minimum, so f(x∗)≤f(x)f(x^\ast)\le f(x)f(x∗)≤f(x) for xxx near x∗x^\astx∗, and suppose some yyy had f(y)<f(x∗)f(y)<f(x^\ast)f(y)<f(x∗). Along the segment, convexity gives f(λy+(1−λ)x∗)≤λf(y)+(1−λ)f(x∗)<f(x∗)f(\lambda y+(1-\lambda)x^\ast)\le\lambda f(y)+(1- \lambda)f(x^\ast)<f(x^\ast)f(λy+(1−λ)x∗)≤λf(y)+(1−λ)f(x∗)<f(x∗) for every λ∈(0,1]\lambda\in(0,1]λ∈(0,1], and taking λ\lambdaλ small enough that the point lies in the neighbourhood contradicts local minimality. So f(x∗)≤f(y)f(x^\ast)\le f(y)f(x∗)≤f(y) for all yyy, a global minimum. The minimisers form the sublevel set {f≤f(x∗)}\{f\le f(x^\ast)\}{f≤f(x∗)} at the minimum value, which is convex because fff is. If fff is strictly convex and x1≠x2x_1\neq x_2x1​=x2​ both minimised, the midpoint would have f(12x1+12x2)<12f(x1)+12f(x2)=f(x∗)f(\tfrac12 x_1+\tfrac12 x_2)<\tfrac12 f(x_1)+\tfrac12 f(x_2)=f(x^\ast)f(21​x1​+21​x2​)<21​f(x1​)+21​f(x2​)=f(x∗), below the minimum, which is impossible, so the minimiser is unique.

Convexity is the structural assumption that turns optimisation from a search through a landscape of traps into a problem with one answer reachable by descent. The supporting hyperplane theorem is the geometric fact that every convex set is the intersection of the half-spaces that support it. This duality between a set and the linear functionals that bound it is the lever that the Lagrangian duality of the next post pulls to certify optimality. The quadratic programs of mean-variance portfolio choice and the convex cost of optimal execution are convex problems whose solutions this theory locates and guarantees.

[1]
S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge University Press, 2004.
[2]
R. T. Rockafellar, Convex Analysis. Princeton University Press, 1970.

Part 1 of 2 in Optimization

next →Convex Duality and the KKT Conditions

Explore connections

see in the atlas →

related

  • Projection and Riesz Representation
  • Positive Definite Matrices
  • Differentiation and Taylor's Theorem

referenced by (1)

  • Convex Duality and the KKT Conditions
cite
@misc{convex-sets-and-functions,
  author = {Zac Kienzle},
  title  = {Convex Sets and Functions},
  year   = {2026},
  month  = {06},
  url    = {https://zackienzle.com/blog/convex-sets-and-functions}
}