Carathéodory's theorem (convex hull)


[Image: example illustrating Carathéodory's theorem]

See also Carathéodory's theorem for other meanings

In mathematics, Carathéodory's theorem on convex sets states that if a point x of <math>\mathbb{R}^d</math> lies in the convex hull of a set P, then there is a subset P′ of P consisting of at most d+1 points such that x lies in the convex hull of P′. In other words, x lies in an r-simplex with vertices in P, where <math>r \leq d</math>. The result is named after Constantin Carathéodory.

For example, consider the set P = {(0,0), (0,1), (1,0), (1,1)}, a subset of <math>\mathbb{R}^2</math>. The convex hull of this set is a square. Consider now the point x = (1/4, 1/4), which lies in the convex hull of P. We can construct the subset P′ = {(0,0), (0,1), (1,0)}, whose convex hull is a triangle that encloses x, so the theorem holds in this instance, since |P′| = 3 ≤ d + 1. It may help to visualise Carathéodory's theorem in two dimensions as saying that, for any point in the convex hull of P, we can construct a triangle with vertices in P that encloses it.
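As a quick numerical check (a minimal sketch, not part of the original article, and assuming NumPy is available), one can solve for the barycentric coordinates of x with respect to the three points of P′ and verify that they are nonnegative and sum to one:

import numpy as np

# Vertices of the smaller set P' and the point x from the example above.
P_prime = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
x = np.array([0.25, 0.25])

# Barycentric coordinates lam solve: sum_j lam_j * p_j = x and sum_j lam_j = 1,
# i.e. a 3x3 linear system (two coordinate equations plus the normalisation).
A = np.vstack([P_prime.T, np.ones(3)])
b = np.append(x, 1.0)
lam = np.linalg.solve(A, b)

print(lam)                 # [0.5  0.25 0.25]
print(np.all(lam >= 0))    # True, so x lies in the triangle spanned by P'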

Proof

Let x be a point in the convex hull of P. Then x can be written as a convex combination of finitely many points in P:

<math>\mathbf{x}=\sum_{j=1}^k \lambda_j \mathbf{x}_j</math>

where every xj is in P, every λj is nonnegative, and <math>\sum_{j=1}^k\lambda_j=1</math>.
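For instance, in the example above with <math>\mathbf{x}=(1/4,1/4)</math> and the points of P ordered as <math>\mathbf{x}_1=(0,0)</math>, <math>\mathbf{x}_2=(0,1)</math>, <math>\mathbf{x}_3=(1,0)</math>, <math>\mathbf{x}_4=(1,1)</math>, one possible (non-unique) choice of coefficients is

<math>\mathbf{x}=\tfrac{5}{8}\mathbf{x}_1+\tfrac{1}{8}\mathbf{x}_2+\tfrac{1}{8}\mathbf{x}_3+\tfrac{1}{8}\mathbf{x}_4, \qquad \tfrac{5}{8}+\tfrac{1}{8}+\tfrac{1}{8}+\tfrac{1}{8}=1.</math>

This combination uses k = 4 > d + 1 = 3 points, and it will serve as a running example in the reduction that follows.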

Suppose k > d+1 (otherwise, there is nothing to prove). Then the points <math>\mathbf{x}_2-\mathbf{x}_1, \ldots, \mathbf{x}_k-\mathbf{x}_1</math> are linearly dependent, since they are k − 1 > d vectors in <math>\mathbb{R}^d</math>. So there are real scalars μ2, ..., μk, not all zero, such that

<math>\sum_{j=2}^k \mu_j (\mathbf{x}_j-\mathbf{x}_1)=\mathbf{0}.</math>
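In the running example one may take <math>\mu_2=\mu_3=-1</math> and <math>\mu_4=1</math>, since <math>-(0,1)-(1,0)+(1,1)=(0,0)</math>.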

If μ1 is defined as

<math>\mu_1:=-\sum_{j=2}^k \mu_j</math>

then

<math>\sum_{j=1}^k \mu_j \mathbf{x}_j=\mathbf{0}</math>
<math>\sum_{j=1}^k \mu_j=0</math>

and not all of the μj are equal to zero. Since the μj sum to zero but are not all zero, at least one μj>0. Then,

<math>\mathbf{x} = \sum_{j=1}^k \lambda_j \mathbf{x}_j-\alpha\sum_{j=1}^k \mu_j \mathbf{x}_j = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j</math>

for any real α, since <math>\sum_{j=1}^k \mu_j \mathbf{x}_j=\mathbf{0}</math>. In particular, the equality holds if α is defined as

<math>\alpha:=\min_{1\leq j \leq k} \left\{ \frac{\lambda_j}{\mu_j}:\mu_j>0\right\}=\frac{\lambda_i}{\mu_i}.</math>
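In the running example, <math>\mu_1=-(-1-1+1)=1</math>, so <math>(\mu_1,\mu_2,\mu_3,\mu_4)=(1,-1,-1,1)</math>; the positive entries are <math>\mu_1</math> and <math>\mu_4</math>, so <math>\alpha=\min\left\{\tfrac{5/8}{1},\tfrac{1/8}{1}\right\}=\tfrac{1}{8}</math>, attained at i = 4.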

Note that <math>\alpha\geq 0</math>, and that for every j between 1 and k,

<math>\lambda_j-\alpha\mu_j \geq 0.</math>

Indeed, if <math>\mu_j\leq 0</math> this holds because <math>\alpha\geq 0</math> and <math>\lambda_j\geq 0</math>, and if <math>\mu_j>0</math> it follows from the definition of α as a minimum. In particular, <math>\lambda_i-\alpha\mu_i=0</math> by the choice of i. Therefore,

<math>\mathbf{x} = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j</math>

where every <math>\lambda_j-\alpha\mu_j</math> is nonnegative, their sum is one (since <math>\sum_{j=1}^k\mu_j=0</math>), and furthermore <math>\lambda_i-\alpha\mu_i=0</math>. In other words, x is represented as a convex combination of at most k − 1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.
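In the running example, the new coefficients are <math>\left(\tfrac{5}{8}-\tfrac{1}{8},\;\tfrac{1}{8}+\tfrac{1}{8},\;\tfrac{1}{8}+\tfrac{1}{8},\;\tfrac{1}{8}-\tfrac{1}{8}\right)=\left(\tfrac{1}{2},\tfrac{1}{4},\tfrac{1}{4},0\right)</math>, so the point (1,1) drops out and <math>\mathbf{x}=\tfrac{1}{2}(0,0)+\tfrac{1}{4}(0,1)+\tfrac{1}{4}(1,0)</math>, which is exactly the representation by the set P′ given in the example above.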

Q.E.D.
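The proof is constructive, and the elimination step translates directly into an algorithm. Below is a minimal NumPy sketch (not part of the original article; the function name caratheodory_reduce and the use of an SVD null-space vector to find the scalars μj are illustrative choices, not a standard API):

import numpy as np

def caratheodory_reduce(points, weights, tol=1e-12):
    """Reduce a convex combination of k points in R^d to one that uses at
    most d+1 of them, following the elimination step from the proof above."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    d = points.shape[1]
    while len(weights) > d + 1:
        # The k-1 > d vectors x_j - x_1 are linearly dependent; take a
        # null-space direction of the d x (k-1) matrix of differences.
        diffs = (points[1:] - points[0]).T
        mu_rest = np.linalg.svd(diffs)[2][-1]
        mu = np.concatenate(([-mu_rest.sum()], mu_rest))   # now sum(mu) == 0
        if not np.any(mu > tol):                           # make sure some mu_j > 0
            mu = -mu
        pos = np.where(mu > tol)[0]
        i = pos[np.argmin(weights[pos] / mu[pos])]         # index attaining alpha
        alpha = weights[i] / mu[i]
        weights = weights - alpha * mu                     # still sums to 1, stays >= 0
        weights[i] = 0.0                                   # exactly zero by choice of alpha
        keep = weights > tol
        points, weights = points[keep], weights[keep] / weights[keep].sum()
    return points, weights

# The example from the article: x = (1/4, 1/4) written with all four corners.
P = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
lam = np.array([5/8, 1/8, 1/8, 1/8])
pts, w = caratheodory_reduce(P, lam)
print(pts)          # at most d+1 = 3 of the original corners
print(w @ pts)      # [0.25 0.25], i.e. the original point x

Each pass through the loop removes at least the point with index i, so at most k − (d+1) passes are needed; on the example from the article it returns a representation of (1/4, 1/4) using at most three of the four corners.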
