Convexity is the dividing line between optimisation problems that can be solved and those that cannot. A convex function curves upward everywhere, so it has a single basin and no false minima to trap a descent. A convex set can be cleanly separated from any outside point by a flat boundary, the geometric root of every duality theorem. This post builds the two ideas and their characterisations, deriving the separating hyperplane from the projection onto a closed convex set proved in the Hilbert space track [1], [2]. The setting is Euclidean space with inner product .
#Convex sets and functions
A set is convex when it contains the segment between any two of its points, for all and . A function is convex when
for all and , and strictly convex when the inequality is strict for and .
A function is convex exactly when the region above its graph is a convex set, since the epigraph contains the segment between two of its points precisely when Equation (1) holds. Convexity of a function is therefore a special case of convexity of a set, and the finite form of the defining inequality is Jensen's inequality, for weights summing to , which follows by induction. For smooth functions convexity is read from derivatives.
A differentiable is convex if and only if for all , the graph lying above each tangent. A twice continuously differentiable is convex if and only if its Hessian is positive semidefinite for every .
Suppose is convex. For , Equation (1) rearranges to , and letting the left side tends to the directional derivative , giving the tangent inequality. Conversely, if the tangent inequality holds, apply it at the point to both and , and , and take the weighted sum, whose gradient terms cancel because , leaving . For the Hessian, the tangent inequality along the line is equivalent to the one-variable function being convex for every direction , which for holds if and only if , that is the Hessian is positive semidefinite.
#Separating and supporting hyperplanes
The geometric heart of convexity is that a convex set can be separated from any point it does not contain. The closest point of the set provides the separating direction.
Let be nonempty, closed, and convex, and let . Then there is a vector and a scalar with for all .
Let be the projection of onto , the unique nearest point, which exists because is closed and convex and is complete. Set . By the variational inequality for all , we have for all . Moreover , so . Choosing strictly between and separates, since for all .
At a boundary point the separating hyperplane becomes a supporting one, touching the set.
Let be convex with nonempty interior and let be a boundary point. Then there is a vector with for all , a supporting hyperplane at .
Pick an interior point and set , which converge to and lie outside the closure . Indeed if , then is a proper convex combination of the interior point and the point , namely , which places in because the open segment from an interior point to any point of the closure lies in the interior, contradicting being on the boundary. That principle holds because, picking with and near , convexity gives for , and small enough keeps inside this ball. Each gives by Theorem 3, applied to the closed convex (the closure of a convex set is convex, since for , in the combination converges to ), a unit vector with for all . The unit vectors lie in the compact unit sphere, so by the Bolzano-Weierstrass theorem a subsequence converges to a unit vector . Passing to the limit in , with , gives for all , the supporting hyperplane.
#Local minima are global
The payoff of convexity for optimisation is that it abolishes the distinction between local and global minima.
If is convex, then every local minimum is a global minimum, and the set of minimisers is convex. If is strictly convex the minimiser is unique.
Let be a local minimum, so for near , and suppose some had . Along the segment, convexity gives for every , and taking small enough that the point lies in the neighbourhood contradicts local minimality. So for all , a global minimum. The minimisers form the sublevel set at the minimum value, which is convex because is. If is strictly convex and both minimised, the midpoint would have , below the minimum, which is impossible, so the minimiser is unique.
Convexity is the structural assumption that turns optimisation from a search through a landscape of traps into a problem with one answer reachable by descent. The supporting hyperplane theorem is the geometric fact that every convex set is the intersection of the half-spaces that support it. This duality between a set and the linear functionals that bound it is the lever that the Lagrangian duality of the next post pulls to certify optimality. The quadratic programs of mean-variance portfolio choice and the convex cost of optimal execution are convex problems whose solutions this theory locates and guarantees.