Carathéodory's theorem (convex hull)


[Image: example illustrating Carathéodory's theorem]

See also Carathéodory's theorem for other meanings

In mathematics, Carathéodory's theorem on convex sets states that if a point x of <math>\mathbb{R}^d</math> lies in the convex hull of a set P, then there is a subset P′ of P consisting of at most d+1 points such that x lies in the convex hull of P′. In other words, x lies in an r-simplex with vertices in P, where <math>r \leq d</math>. The result is named after Constantin Carathéodory.

For example, consider the set P = {(0,0), (0,1), (1,0), (1,1)}, a subset of <math>\mathbb{R}^2</math>. The convex hull of this set is a square. Consider now the point x = (1/4, 1/4), which lies in the convex hull of P. We can construct the subset P′ = {(0,0), (0,1), (1,0)}, whose convex hull is a triangle that encloses x, so the theorem holds in this instance, since |P′| = 3 ≤ d + 1. It may help to visualise Carathéodory's theorem in two dimensions as saying that, for any point in the convex hull of P, we can construct a triangle with vertices in P that encloses it.
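As a quick numerical check (a minimal sketch, not part of the original article, and assuming NumPy is available), one can solve for the barycentric coordinates of x with respect to the three points of P′ and verify that they are nonnegative and sum to one:

import numpy as np

# Vertices of the smaller set P' and the point x from the example above.
P_prime = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
x = np.array([0.25, 0.25])

# Barycentric coordinates lam solve: sum_j lam_j * p_j = x and sum_j lam_j = 1,
# i.e. a 3x3 linear system (two coordinate equations plus the normalisation).
A = np.vstack([P_prime.T, np.ones(3)])
b = np.append(x, 1.0)
lam = np.linalg.solve(A, b)

print(lam)                 # [0.5  0.25 0.25]
print(np.all(lam >= 0))    # True, so x lies in the triangle spanned by P'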

Proof

Let x be a point in the convex hull of P. Then x can be written as a convex combination of finitely many points in P:

<math>\mathbf{x}=\sum_{j=1}^k \lambda_j \mathbf{x}_j</math>

where every xj is in P, every λj is nonnegative, and <math>\sum_{j=1}^k\lambda_j=1</math>.
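For instance, in the example above with <math>\mathbf{x}=(1/4,1/4)</math> and the points of P ordered as <math>\mathbf{x}_1=(0,0)</math>, <math>\mathbf{x}_2=(0,1)</math>, <math>\mathbf{x}_3=(1,0)</math>, <math>\mathbf{x}_4=(1,1)</math>, one possible (non-unique) choice of coefficients is

<math>\mathbf{x}=\tfrac{5}{8}\mathbf{x}_1+\tfrac{1}{8}\mathbf{x}_2+\tfrac{1}{8}\mathbf{x}_3+\tfrac{1}{8}\mathbf{x}_4, \qquad \tfrac{5}{8}+\tfrac{1}{8}+\tfrac{1}{8}+\tfrac{1}{8}=1.</math>

This combination uses k = 4 > d + 1 = 3 points, and it will serve as a running example in the reduction that follows.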

Suppose k > d+1 (otherwise, there is nothing to prove). Then the points <math>\mathbf{x}_2-\mathbf{x}_1, \ldots, \mathbf{x}_k-\mathbf{x}_1</math> are linearly dependent, since they are k − 1 > d vectors in <math>\mathbb{R}^d</math>. So there are real scalars μ2, ..., μk, not all zero, such that

<math>\sum_{j=2}^k \mu_j (\mathbf{x}_j-\mathbf{x}_1)=\mathbf{0}.</math>
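In the running example one may take <math>\mu_2=\mu_3=-1</math> and <math>\mu_4=1</math>, since <math>-(0,1)-(1,0)+(1,1)=(0,0)</math>.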

If μ1 is defined as

<math>\mu_1:=-\sum_{j=2}^k \mu_j</math>

then

<math>\sum_{j=1}^k \mu_j \mathbf{x}_j=\mathbf{0}</math>
<math>\sum_{j=1}^k \mu_j=0</math>

and not all of the μj are equal to zero. Since the μj sum to zero but are not all zero, at least one μj>0. Then,

<math>\mathbf{x} = \sum_{j=1}^k \lambda_j \mathbf{x}_j-\alpha\sum_{j=1}^k \mu_j \mathbf{x}_j = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j</math>

for any real α, since <math>\sum_{j=1}^k \mu_j \mathbf{x}_j=\mathbf{0}</math>. In particular, the equality holds if α is defined as

<math>\alpha:=\min_{1\leq j \leq k} \left\{ \frac{\lambda_j}{\mu_j}:\mu_j>0\right\}=\frac{\lambda_i}{\mu_i}.</math>
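In the running example, <math>\mu_1=-(-1-1+1)=1</math>, so <math>(\mu_1,\mu_2,\mu_3,\mu_4)=(1,-1,-1,1)</math>; the positive entries are <math>\mu_1</math> and <math>\mu_4</math>, so <math>\alpha=\min\left\{\tfrac{5/8}{1},\tfrac{1/8}{1}\right\}=\tfrac{1}{8}</math>, attained at i = 4.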

Note that <math>\alpha\geq 0</math>, and that for every j between 1 and k,

<math>\lambda_j-\alpha\mu_j \geq 0.</math>

Indeed, if <math>\mu_j\leq 0</math> this holds because <math>\alpha\geq 0</math> and <math>\lambda_j\geq 0</math>, and if <math>\mu_j>0</math> it follows from the definition of α as a minimum. In particular, <math>\lambda_i-\alpha\mu_i=0</math> by the choice of i. Therefore,

<math>\mathbf{x} = \sum_{j=1}^k (\lambda_j-\alpha\mu_j) \mathbf{x}_j</math>

where every <math>\lambda_j-\alpha\mu_j</math> is nonnegative, their sum is one (since <math>\sum_{j=1}^k\mu_j=0</math>), and furthermore <math>\lambda_i-\alpha\mu_i=0</math>. In other words, x is represented as a convex combination of at most k − 1 points of P. This process can be repeated until x is represented as a convex combination of at most d + 1 points in P.
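In the running example, the new coefficients are <math>\left(\tfrac{5}{8}-\tfrac{1}{8},\;\tfrac{1}{8}+\tfrac{1}{8},\;\tfrac{1}{8}+\tfrac{1}{8},\;\tfrac{1}{8}-\tfrac{1}{8}\right)=\left(\tfrac{1}{2},\tfrac{1}{4},\tfrac{1}{4},0\right)</math>, so the point (1,1) drops out and <math>\mathbf{x}=\tfrac{1}{2}(0,0)+\tfrac{1}{4}(0,1)+\tfrac{1}{4}(1,0)</math>, which is exactly the representation by the set P′ given in the example above.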

Q.E.D.
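The proof is constructive, and the elimination step translates directly into an algorithm. Below is a minimal NumPy sketch (not part of the original article; the function name caratheodory_reduce and the use of an SVD null-space vector to find the scalars μj are illustrative choices, not a standard API):

import numpy as np

def caratheodory_reduce(points, weights, tol=1e-12):
    """Reduce a convex combination of k points in R^d to one that uses at
    most d+1 of them, following the elimination step from the proof above."""
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    d = points.shape[1]
    while len(weights) > d + 1:
        # The k-1 > d vectors x_j - x_1 are linearly dependent; take a
        # null-space direction of the d x (k-1) matrix of differences.
        diffs = (points[1:] - points[0]).T
        mu_rest = np.linalg.svd(diffs)[2][-1]
        mu = np.concatenate(([-mu_rest.sum()], mu_rest))   # now sum(mu) == 0
        if not np.any(mu > tol):                           # make sure some mu_j > 0
            mu = -mu
        pos = np.where(mu > tol)[0]
        i = pos[np.argmin(weights[pos] / mu[pos])]         # index attaining alpha
        alpha = weights[i] / mu[i]
        weights = weights - alpha * mu                     # still sums to 1, stays >= 0
        weights[i] = 0.0                                   # exactly zero by choice of alpha
        keep = weights > tol
        points, weights = points[keep], weights[keep] / weights[keep].sum()
    return points, weights

# The example from the article: x = (1/4, 1/4) written with all four corners.
P = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
lam = np.array([5/8, 1/8, 1/8, 1/8])
pts, w = caratheodory_reduce(P, lam)
print(pts)          # at most d+1 = 3 of the original corners
print(w @ pts)      # [0.25 0.25], i.e. the original point x

Each pass through the loop removes at least the point with index i, so at most k − (d+1) passes are needed; on the example from the article it returns a representation of (1/4, 1/4) using at most three of the four corners.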
