Multiset

From Free net encyclopedia

In mathematics, a multiset (sometimes also called a bag) differs from a set in that each member has a multiplicity, which is a natural number indicating (loosely speaking) how many times it is a member, or perhaps how many memberships it has in the multiset. For example, in the multiset { a, a, b, b, b, c }, the multiplicities of the members a, b, and c are respectively 2, 3, and 1.

1 Formal definition
2 Examples
3 Operations
4 Multiset coefficients
5 Polynomial notation
6 Cumulant generating function
7 Free commutative monoids

[edit]

Formal definition

Within set theory, a multiset can be formally defined as a pair (A, m) where A is some set and m : A → N is a function from A to the set N of (positive) natural numbers. The set A is called the underlying set of elements. For each a in A the multiplicity (that is, number of occurrences) of a is the number m(a).

It is common to write the function m as a set of ordered pairs { (a, m(a)) : a ∈ A } — indeed, this is the set-theoretic definition of the function m. For example,

the multiset written as { a, b, b } is defined as { (a, 1), (b, 2) },
likewise { a, a, b } is defined as { (a, 2), (b, 1) }, and
the multiset { a, b } is defined as { (a, 1), (b, 1) }.
an indexed family (mathematics), ( a_i ), where i is in some index-set, defines the multiset { a_i } , (provided no element occurs more than a finite number of times in the family. Even in an infinite multiset the multiplicities must be finite numbers).

If the set A is finite, the size or length of the multiset (A, m) is the sum of all multiplicities for each element of A:

This is the size of A counted with multiplicity, while the size of A counted without multiplicity is

A submultiset (B, n) of a multiset (A, m) is a subset B ⊆ A and a function n : B → N such that n(a) ≤ m(a).

[edit]

Examples

One of the most natural and simple examples is the multiset of prime factors of a number n. Here the underlying set of elements is the set of prime divisors of n. For example the number 120 has the prime factorization

which gives the multiset {2, 2, 2, 3, 5}.

Another is the multiset of solutions of an algebraic equation. Everyone learns in secondary school that a quadratic equation has two solutions, but in some cases they are both the same number. Thus the multiset of solutions of the equation could be { 3, 5 }, or it could be { 4, 4 }. In the latter case it has a solution of multiplicity 2.

[edit]

Operations

The usual set operations such as union, intersection and Cartesian product can be easily generalized for multisets.

Suppose (A, m) and (B, n) are multisets

The union can be defined as (A ∪ B, f) where f(x) = max{m(x), n(x)}.
The intersection can be defined as (A ∩ B, f) where f(x) = min{m(x), n(x)}.
The cartesian product can be defined as (A × B, f) where f((x,y)) = m(x)n(y).
The join (sometimes called the sum) can be defined as <math>(A,m) \biguplus (B,n) = (A \cup B, f) </math> where f(x) = m(x)+n(x).

[edit]

Multiset coefficients

The number of submultisets of size k in a set of size n is the multiset coefficient

<math>\left\langle \begin{matrix}n \\ k \end{matrix}\right\rangle = {n + k - 1 \choose n-1}={n+k-1 \choose k},</math>

where the expressions to the right of "=" are binomial coefficients, i.e., the number of such multisets is the same as the number of subsets of size k in a set of size n + k − 1. Unlike the situation with sets, this cardinality will not be 0 when k > n. One simple way to prove this involves representing multisets in the following way. First, consider the notation for multisets that would represent { a, a, a, a, a, a, b, b, c, c, c, d, d, d, d, d, d, d } (6 as, 2 bs, 3 cs, 7 ds) in this form:

<math>\bullet \bullet \bullet \bullet \bullet \bullet \mid \bullet \bullet \mid \bullet \bullet \bullet \mid \bullet \bullet \bullet \bullet \bullet \bullet \bullet </math>

This is a multiset of size 18 made of elements of a set of size 4. The number of characters including both dots and vertical lines used in this notation is 18 + 4 − 1. The number of vertical lines is 4 − 1. The number of multisets of size 18 is then the number of ways to arrange the 4 − 1 vertical lines among the 18 + 4 − 1 characters, and is thus the number of subsets of size 4 − 1 in a set of size 18 + 4 − 1. Equivalently, it is the number of ways to arrange the 18 dots among the 18 + 4 − 1 characters, which is the number of subsets of size 18 of a set of size 18 + 4 − 1. This is

<math>{18+4-1 \choose 4-1}={18+4-1 \choose 18},</math>

so that is the value of the multiset coefficient

<math>\left\langle\begin{matrix} 4 \\ 18 \end{matrix}\right\rangle.</math>

One may define a generalized binomial coefficient

<math>{n \choose k}={n(n-1)(n-2)\cdots(n-k+1) \over k!}</math>

in which n is not required to be a nonnegative integer, but may be negative or a non-integer, or a non-real complex number. (If k = 0, then the value of this coefficient is 1 because it is the product of no numbers.) Then the number of multisets of size k in a set of size n is

<math>\left\langle\begin{matrix} n \\ k \end{matrix}\right\rangle=(-1)^k{-n \choose k}.</math>

This fact led Gian-Carlo Rota to ask "Why are negative sets multisets?". He considered that question worthy of the attention of philosophers of mathematics.

[edit]

Polynomial notation

The set {x} may be represented by the monomial x. The set of subsets, { {}, {x} } , is represented by the binomial 1 + x.

The set {x,y} may be represented by the monomial xy. The set of subsets, { {}, {x}, {y}, {x,y} } , is represented by the polynomial

(1 + x)(1 + y) = 1 + (x + y) + xy.

The multiset {x,x} may be represented by the monomial xx = x². The multiset of submultisets, { {}, {x}, {x}, {x,x} } , is represented by the polynomial

(1 + x)² = 1 + 2x + x².

The multiset of submultisets of the multiset xⁿ is

<math>(1+x)^n=\sum_{k=0}^n{n \choose k} x^k.</math>

That is why the binomial coefficient count the number of k-combinations of an n-set.

The multiset x^Ky^N−K , containing N elements, K of which are hits, is called a statistical population, and a submultiset is called a statistical sample. The set of samples is

(1 + x)^K(1 + y)^N−K

which by the binomial theorem equals

<math>\sum_{n=0}^N\sum_{k=0}^K{K \choose k}{N-K \choose n-k}x^ky^{n-k}.</math>

So the number of n-samples with k hits is

<math>{K \choose k}{N-K \choose n-k}</math>

See hypergeometric distribution and inferential statistics for further on the distribution of hits.

The infinite set of finite multisets of elements from the set {x}:

{ {}, {x}, {x,x}, {x,x,x}, ... }

may be represented by the formal power series

S = 1 + x + x² + x³ + ... = 1 + xS .

The formal solution,

S = (1 − x)⁻¹,

makes sense as a set of multisets, but the intermediate result, 1−x, does not make sense as a set of multisets.

The infinite set of finite multisets of elements from the set xy is

(1 − x)⁻¹(1 − y)⁻¹ = 1 + (x + y) + (x² + xy + y²) + ...

The special case y=x : The infinite multiset of finite multisets of elements from the multiset x² is

(1 − x)⁻² = 1 + 2x + 3x² + ...

The general case: The infinite multiset of finite multisets of elements from the multiset xⁿ is

<math>(1-x)^{-n}=\sum_{k=0}^\infty{-n \choose k} (-x)^k</math> .

This explains why "multisets are negative sets". The negative binomial coefficients count the number of k-multisets of elements from an n-set.

[edit]

Cumulant generating function

A natural number, n, can be represented by the monomial xⁿ . So a finite multiset of natural numbers, say {2, 2, 2, 3, 5}, can be represented by a polynomial f(x), say 3x²+x³+x⁵ . It is convenient to consider the function g(s) = log(f(e^s)), say g(s) = log(3e^2s+e^3s+e^5s) . The number of elements in the multiset is e^g(0) . The mean value of the elements in the multiset is the derivative μ = g'(0) . The variance, i.e. the square of the standard deviation, of the elements in the multiset is σ² = g''(0). The numbers ( μ, σ², ... ) = ( g'(0), g''(0), ... ) are called cumulants, and so g is called the cumulant-generating function.

The set of nonnegative integers is infinite, and the mean value and standard deviation are undefined. Nevertheless the set has a cumulant-generating function, −log(1−e^s) , for s < 0.

A multiset of real numbers can also be represented by a cumulant-generating function, even if it cannot be represented by a polynomial. See partition function (statistical mechanics) for the case where the numbers in question are the energy levels of a physical system.

The cumulant-generating function of the multiset of sums of elements of two multisets of numbers is the sum of the two cumulant-generating functions.

That is the reason why the cumulant generating function is a convenient representation for multisets of numbers, and for probability distributions.

[edit]

Free commutative monoids

There is a connection with the free object concept: the free commutative monoid on a set X can be taken to be the set of finite multisets with elements drawn from X, with the obvious addition operation.de:Multimenge es:Multiconjunto nl:Multiset pl:Multizbiór sl:Večkratna množica uk:Мультимножина

Retrieved from "http://www.netipedia.com/index.php/Multiset"

Categories: Set theory | Factorial and binomial topics

Multiset

From Free net encyclopedia

Contents

Formal definition

Examples

Operations

Multiset coefficients

Polynomial notation

Cumulant generating function

Free commutative monoids

Views

Personal tools

Search

Partner sites