Matroid

In combinatorial mathematics, a matroid is a structure that captures the essence of a notion of "independence" (hence independence structure) that generalizes linear independence in vector spaces. There are many equivalent ways to define a matroid; the most significant include those in terms of independent sets, bases, closed sets (flats), the closure operator, circuits (minimal dependent sets), and the rank function.

Definition by independent sets

For convenience and simplicity we begin with the definition by independent sets. In this definition a matroid M on a ground set E is a pair (E, I), where I is a collection of subsets of E (called the independent sets) with the following properties:

  1. The empty set is independent.
  2. Every subset of an independent set is independent; this is sometimes called the hereditary property.
  3. If A and B are two independent sets and A has more elements than B, then there exists an element in A which is not in B and when added to B still gives an independent set; this property is called the exchange property.
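For a small finite ground set, the three axioms can be checked directly by brute force. The following is a minimal sketch (the function name and the representation of I as a collection of sets are our own choices, not standard notation):

```python
from itertools import combinations

def is_matroid(E, I):
    """Brute-force check of the three independent-set axioms.
    I is a collection of subsets of the finite ground set E."""
    I = {frozenset(s) for s in I}
    # 1. The empty set is independent.
    if frozenset() not in I:
        return False
    for A in I:
        # 2. Hereditary: every subset of an independent set is independent.
        for r in range(len(A)):
            if any(frozenset(s) not in I for s in combinations(A, r)):
                return False
        # 3. Exchange: if |A| > |B|, some x in A \ B keeps B ∪ {x} independent.
        for B in I:
            if len(A) > len(B) and not any(B | {x} in I for x in A - B):
                return False
    return True
```

For example, the subsets of {1, 2, 3} with at most two elements pass all three checks, while the collection {∅, {1}, {2}, {3}, {2, 3}} is hereditary but fails the exchange check (nothing in {2, 3} can be added to {1}).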

If the ground set is finite, these axioms suffice; if it is infinite, more is needed. Rather than go into the complexities of infinite matroids, we mention only the property that defines a finitary matroid:

  • An infinite subset of E is independent if and only if every finite subset of it is independent; this property is called finite character.

A matroid is finite-dimensional or of finite rank if there is a natural number n such that no independent set has more than n elements.

The first two properties are simple, but the motivation behind the third property is not obvious. The examples in the example section below will make its motivation clearer.

A set that is not independent is called dependent. A minimal dependent set is called a circuit.

Definition by closure operator

The closure cl(A) of a subset A of E in a finitary matroid M is defined to be A together with every point x in E \ A such that there is a circuit C containing x and contained in A ∪ {x}. This defines the closure operator cl from P(E) to P(E), where P denotes the power set.

We can give axioms for the closure operator, thereby defining a matroid in terms of its closure. First, let E be a finite set. A function cl from P(E) to P(E) is a matroid closure if it satisfies the following conditions, for all elements a, b of E and all subsets Y, Z of E:

  1. cl is an abstract closure operator.
  2. MacLane–Steinitz exchange property: If a is in cl(Y ∪ {b}) \ cl(Y), then b is in cl(Y ∪ {a}).
  3. If E is infinite, we add the property of finite character: If a is in cl(Y), then there is a finite subset X of Y such that a is in cl(X).

The theorem is that a matroid closure is exactly the same thing as the closure operator of a finitary matroid as defined by independent sets. Thus, one can choose to define a matroid by its closure instead of its independent sets; and this is often desirable, especially in applications to geometry.
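On a finite matroid given by its independent sets, the closure can also be computed from the rank: cl(A) consists of A together with every x whose addition does not raise the rank of A. A brute-force sketch (the function names are ours, and I is assumed to be a set of frozensets containing the empty set):

```python
from itertools import combinations

def rank(A, I):
    """Size of a largest independent subset of A (brute force).
    I is a set of frozensets: the independent sets of the matroid."""
    return max(len(s) for r in range(len(A) + 1)
                      for s in combinations(A, r) if frozenset(s) in I)

def closure(A, E, I):
    """cl(A): A together with every x in E \\ A whose addition
    does not raise the rank."""
    A = frozenset(A)
    return A | {x for x in E - A if rank(A | {x}, I) == rank(A, I)}
```

For the matroid on E = {1, 2, 3} whose independent sets are the sets of at most one element, cl({1}) is all of E, while cl(∅) is empty.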

By contrast, the closure operator of a topological space tends to fail the exchange property (2); instead it has a different characteristic property.

Examples

  • Let E be a set and k a natural number. The subsets of E with at most k elements are the independent sets of a matroid on E. The properties hold because:
    • The empty set has at most k elements for all natural numbers k;
    • Every subset of a set is of smaller or equal size;
    • If A is larger than B, then A must contain some element B does not. Since B has size at most k − 1, adding this to B preserves independence.
  • If E is any finite subset of a vector space V, then we can define a matroid M on E by taking the independent sets of M to be the linearly independent elements in E. Matroids of this kind are called vector matroids. (Note that any set of vectors of the vector space constitutes a finitary matroid, which is finite-dimensional if the vector space has finite dimension.) The properties hold because:
    • The empty set is vacuously linearly independent;
    • Linear dependency cannot be introduced by removing vectors;
    • Exchange: Let A and B be sets of linearly independent vectors with |A| > |B|. If every vector in A were linearly dependent on the vectors in B, then all of A would lie in the |B|-dimensional space spanned by B, so the vectors of A would span a space of dimension at most |B| < |A|, contradicting their independence. Hence some vector in A lies outside the span of B, and adding it to B preserves linear independence.
  • A matrix A with entries in a field gives rise to a matroid M on its set of columns. The dependent sets of columns in the matroid are those that are linearly dependent as vectors. This matroid is called the column matroid of A, and A is said to represent M. Column matroids are just vector matroids under another name, but there are many reasons to favor the matrix representation.
  • In a projective geometry choose any set E of points and let the closure be projective closure, restricted to E. This defines a matroid. As a special case, E can be a subset of an affine geometry, in particular of a Euclidean space.
  • In a vector space or a projective geometry, choose any arrangement of hyperplanes, i.e., a finite set of subspaces of codimension 1. Call a subset A independent if its intersection has codimension equal to the number of hyperplanes in A. The closure of A is the set of hyperplanes in E that contain the intersection of all hyperplanes in A. This defines a matroid on E; such matroids are linearly or projectively dual to those of the preceding examples (vectors, and projective points).
  • Every finite graph or multigraph G gives rise to a matroid as follows: take as E the set of all edges in G and consider a set of edges independent if and only if it does not contain a simple cycle. Such an edge set is called a forest in graph theory. This is called the cycle matroid, sometimes graphic matroid. The properties hold because:
    • There are no cycles if there are no edges;
    • Removing edges cannot create cycles;
    • Here we assume that each forest contains all the vertices of the graph. If we consider each isolated vertex to be a tree, then a forest with k edges is made up of |V| − k trees. Thus, a forest with more edges has fewer trees than a forest with fewer edges. But then some tree in the larger forest must contain two vertices from two different trees in the smaller forest. As the vertices are in a tree in the larger forest, within the larger forest there is a unique path between the two vertices. This path must contain an edge that is not in any of the trees of the smaller forest; this is the edge we can add.
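The independence test for the cycle matroid, namely whether an edge set contains a cycle, can be sketched with a union-find structure (the code below is illustrative; edges are given as pairs of vertices):

```python
def is_forest(edges):
    """Cycle-matroid independence test: an edge set is independent
    iff it contains no cycle, i.e. it is a forest."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # endpoints already connected, so (u, v) closes a cycle
            return False
        parent[ru] = rv
    return True
```

A self-loop is a one-edge cycle, so it is rejected as well.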

Note that an infinite graph determines a cycle matroid in the same way, which is a finitary matroid because all cycles are finite, and is finite-dimensional if the number of vertices is finite (even if the number of edges is infinite).

On the other hand, consider this non-example: let E be the vertices of a graph, and let the independent sets be those that can be colored with three colors. The empty set can be three-colored, and any subset of a three-colorable set is three-colorable, but the exchange property fails: a graph can have two maximal three-colorable vertex sets of different sizes, and then, by maximality of the smaller set, no vertex of the larger set can be added to it.

Further definitions and properties

A subset of E is called dependent if it is not independent. A dependent set all of whose proper subsets are independent is called a circuit (with the terminology coming from the graph example above).

We say A, a subset of E, spans M if its closure is E.

A matroid is simple if every 2-element set is independent.

An independent set that is not properly contained in another independent set is called a basis (with the terminology coming from the vector space example above). Hence bases are maximal independent sets, and circuits are minimal dependent sets. An important fact is that all bases of a matroid have the same number of elements, called the rank of M. In general, the circuits of M can have different sizes.

In the linear algebra example matroid above, a basis is a basis in the sense of linear algebra of the subspace spanned by E, and a circuit is a minimal set of dependent vectors of E. In the cycle matroid, a basis is the same as a spanning forest of the graph G, and circuits are simple cycles in the graph. In the example where sets of at most k elements are independent, a basis is any subset of E with k elements, and a circuit is any subset of k + 1 elements.

If A is a subset of E, then a matroid on A can be defined by considering a subset of A independent if and only if it is independent in M. This allows us to talk about submatroids and about the rank of any subset of E.

The rank function r assigns a natural number to every subset of E and has the following properties:

  1. r(A) ≤ |A| for all subsets A of E.
  2. If A and B are subsets of E with A ⊆ B, then r(A) ≤ r(B).
  3. For any two subsets A and B of E, we have r(A ∪ B) + r(A ∩ B) ≤ r(A) + r(B).

These three properties can be used as one of the alternative definitions of a finite matroid: the independent sets are then defined as those subsets A of E with r(A) = |A|.
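These axioms can be verified exhaustively on a small ground set; a brute-force sketch (feasible only for tiny E, and the names are ours):

```python
from itertools import combinations

def subsets(E):
    """All subsets of E, as frozensets."""
    for r in range(len(E) + 1):
        for s in combinations(sorted(E), r):
            yield frozenset(s)

def is_rank_function(E, r):
    """Exhaustively check the three rank axioms on a finite ground set E."""
    for A in subsets(E):
        if not 0 <= r(A) <= len(A):                # 1. bounded by cardinality
            return False
        for B in subsets(E):
            if A <= B and r(A) > r(B):             # 2. monotone
                return False
            if r(A | B) + r(A & B) > r(A) + r(B):  # 3. submodular
                return False
    return True
```

For instance, r(A) = min(|A|, k) is the rank function of the matroid whose independent sets have at most k elements, and it passes all three checks.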

If M is a finite matroid, we can define the dual matroid M* by taking the same underlying set and calling a set a basis in M* if and only if its complement is a basis in M. Thus, a set is independent in M* if and only if it is contained in the complement of some basis of M, if and only if its complement spans M. One checks easily that M* is indeed a matroid. One of the difficulties with infinite matroids is to define duality; this has never been resolved satisfactorily.
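For a small finite matroid given by its bases, the dual construction is immediate; a sketch (function names are ours):

```python
def dual_bases(E, bases):
    """Bases of the dual matroid M*: complements of the bases of M."""
    E = frozenset(E)
    return {E - frozenset(B) for B in bases}

def dual_independent(A, E, bases):
    """A is independent in M* iff it is contained in the complement
    of some basis of M, i.e. iff it is disjoint from some basis."""
    A = frozenset(A)
    return any(A.isdisjoint(B) for B in bases)
```

For the matroid on {1, 2, 3} whose bases are the two-element subsets, the dual bases are the one-element subsets.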

Further examples

Very early it was recognized that algebraic independence is a matroid independence. Let L be an extension field of a field K. A finite subset x1, ..., xk of L is algebraically independent if there is no non-zero polynomial f(t1, ..., tk), with coefficients in K, such that f(x1, ..., xk) = 0. Algebraic dependence determines a finitary matroid on the ground set L, called the full algebraic matroid of L over K. The rank of this matroid equals the transcendence degree of L over K. An algebraic matroid is any submatroid of a full algebraic matroid.

Not much later, it was found that transversal theory provides another kind of matroid, now called a transversal matroid. (The original name, due to the founder R. Rado, was independence structure.)

Weighted matroids and greedy algorithms

There is a simple algorithm for finding a basis:

  • Let A be the empty set.
  • For each x in E
    • if A ∪ {x} is independent, then set A to A ∪ {x}.

The result is clearly an independent set. It is a maximal independent set because if B ∪ {x} is not independent for some subset B of A, then A ∪ {x} is not independent either (the contrapositive follows from the hereditary property). Thus if we pass up an element, we will never have an opportunity to use it later. We will generalize this algorithm to solve a harder problem.
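The algorithm above, sketched in Python (the predicate `independent` plays the role of an independence oracle, a name we choose here):

```python
def find_basis(E, independent):
    """Grow a maximal independent set: scan the ground set once,
    keeping each element that preserves independence."""
    A = set()
    for x in E:
        if independent(A | {x}):
            A.add(x)
    return A
```

For the matroid whose independent sets are the subsets of at most two elements, scanning 0, 1, 2, 3, 4 in order keeps exactly the first two elements.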

A weight function w : E → R+ for a matroid M = (E, I) assigns a strictly positive weight to each element of E. We extend the function to subsets of E by summation; w(A) is the sum of w(x) over x in A. A matroid with an associated weight function is called a weighted matroid.

As a simple example, say we wish to find the maximum spanning forest of a graph. That is, given a graph and a weight for each edge, find a forest containing every vertex and maximizing the total weight of its edges. This problem arises in some clustering applications. If we look at the definition of the cycle matroid above, we see that the maximum spanning forest is simply the independent set with largest total weight — such a set must span the graph, for otherwise we can add edges without creating cycles. But how do we find it?

An independent set of largest total weight is called an optimal set. Optimal sets are always bases, because if an edge can be added, it should be; this only increases the total weight. As it turns out, there is a trivial greedy algorithm for computing an optimal set of a weighted matroid. It works as follows:

  • Let A be the empty set.
  • For each x in E, taken in (monotonically) decreasing order by weight
    • if A ∪ {x} is independent, then set A to A ∪ {x}.

This algorithm finds a basis, since it is a special case of the above algorithm. It always chooses the element of largest weight that it can while preserving independence (thus the term "greedy"). This always produces an optimal set: suppose that it produces <math>A=\{e_1,e_2,\ldots,e_r\}</math>, listed in the order chosen (hence in decreasing order of weight), and let <math>B=\{f_1,f_2,\ldots,f_r\}</math> be any other basis, also listed in decreasing order of weight. Now for any <math>k</math> with <math>1\le k\le r</math>, consider the sets <math>O_1=\{e_1,\ldots,e_{k-1}\}</math> and <math>O_2=\{f_1,\ldots,f_k\}</math>. Since <math>O_1</math> is smaller than <math>O_2</math>, there is some element of <math>O_2</math> which can be added to <math>O_1</math> with the result still being independent. But <math>e_k</math> is an element of maximal weight that can be added to <math>O_1</math> while maintaining independence, so <math>e_k</math> weighs at least as much as some element of <math>O_2</math>, and every element of <math>O_2</math> weighs at least as much as <math>f_k</math>. Hence <math>e_k</math> weighs at least as much as <math>f_k</math>. As this is true for all <math>k</math>, we conclude <math>w(A)\ge w(B)</math>.

The easiest way to traverse the members of E in the desired order is to sort them. This requires O(|E| log |E|) time using a comparison sorting algorithm. We also need to test for each x whether A ∪ {x} is independent; assuming independence tests require O(f(|E|)) time, the total time for the algorithm is O(|E| log |E| + |E| f(|E|)).
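The greedy algorithm itself is a one-line change to the basis-finding procedure: visit the elements in decreasing order of weight. A sketch, again assuming an independence oracle (names are ours):

```python
def greedy_optimum(E, weight, independent):
    """Greedy algorithm for a weighted matroid: scan elements in
    decreasing order of weight, keeping each one that preserves
    independence. Returns a maximum-weight basis."""
    A = set()
    for x in sorted(E, key=weight, reverse=True):
        if independent(A | {x}):
            A.add(x)
    return A
```

For the matroid whose independent sets are the subsets of at most two elements, with each element weighted by its own value, the algorithm keeps the two heaviest elements.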

If we want to find a minimum spanning tree instead, we simply "invert" the weight function by subtracting it from a large constant. More specifically, let wmin(x) = W - w(x), where W exceeds the total weight over all graph edges. Many more optimization problems about all sorts of matroids and weight functions can be solved in this trivial way, although in many cases more efficient algorithms can be found that exploit more specialized properties.

Note also that if we take a collection <math>I</math> of "independent" sets which is a down-set but not a matroid, then the greedy algorithm will not always work. For then there are independent sets <math>I_1</math> and <math>I_2</math> with <math>|I_1|<|I_2|</math>, but such that for no <math>e\in I_2\setminus I_1</math> is <math>I_1\cup\{e\}</math> independent.

Pick an <math>\epsilon>0</math> and <math>\tau>0</math> such that <math>(1+2\epsilon)|I_1|+\tau|E|<|I_2|</math>. Weight the elements of <math>I_1\cap I_2</math> in the range <math>2</math> to <math>2+2\epsilon</math>, the elements of <math>I_1\setminus I_2</math> in the range <math>1+\epsilon</math> to <math>1+2\epsilon</math>, the elements of <math>I_2\setminus I_1</math> in the range <math>1</math> to <math>1+\epsilon</math>, and the rest in the range <math>0</math> to <math>\tau</math>. The greedy algorithm will select the elements of <math>I_1</math>, and then cannot pick any elements of <math>I_2\setminus I_1</math>. Therefore the independent set it constructs will be of weight at most <math>(1+2\epsilon)|I_1|+\tau|E|+|I_1\cap I_2|</math>, which is smaller than the weight of <math>I_2</math>, since <math>I_2</math> has weight at least <math>|I_2|+|I_1\cap I_2|</math>.

History

The concept of a (finite) matroid was introduced by Hassler Whitney in 1935 in his paper "On the abstract properties of linear dependence". The names independence structure (R. Rado; this is a finitary matroid) and combinatorial pregeometry (G.-C. Rota) have been used, although the latter especially is now rare. (Rota called a simple matroid a combinatorial geometry.)

The original interest was in vector, graphic, and algebraic matroids, and later in transversal matroids.

It was not until 1971 that Jack Edmonds connected weighted matroids to greedy algorithms in his paper "Matroids and the greedy algorithm". Korte and Lovász would generalize these ideas to objects called greedoids, which allow even larger classes of problems to be solved by greedy algorithms.

References

  • H. Whitney. On the Abstract Properties of Linear Dependence. American Journal of Mathematics, volume 57, p. 509–533. 1935.

  • Jack Edmonds. Matroids and the Greedy Algorithm. Mathematical Programming, volume 1, p.125–136. 1971.
  • James G. Oxley. Matroid Theory. Oxford University Press, New York, 1992. ISBN 0198535635.
  • T. Cormen, C. Leiserson, R. Rivest. Introduction to Algorithms. MIT Press, 1990. Section 16.4, Theoretical foundations for greedy methods. ISBN 0262032937.
