Pumping lemma
From Free net encyclopedia
In the theory of formal languages, a pumping lemma states that any language of a given class can be "pumped" and still belong to that class. A language can be pumped if any sufficiently long string in the language can be broken into pieces that can be repeated to produce an even longer string in the language. Thus, if there is a pumping lemma for a given language class, any language in the class will contain an infinite set of finite strings all produced by a simple rule given by the lemma. In general, such results depend on the pigeonhole principle.
The two most important examples are the pumping lemma for regular languages and the pumping lemma for context-free languages. Unlike theorems, lemmas are specifically intended to facilitate streamlined proofs. These two lemmas are used to determine if a particular language is not in a given language class. However, they cannot be used to determine if a language is in a given class, since satisfying the pumping lemma is a necessary, but not sufficient, condition for class membership.
As an example of a practical application of the pumping lemmas, someone familiar with the pumping lemma for regular languages can see immediately that a language that allows parenthesized expressions, but requires the parentheses to balance, cannot be a regular language, and so the language cannot be generated by a regular grammar nor recognized by a finite state machine. Trying to compose such a proof directly would take a significant length of time.
Contents |
Pumping lemma for regular languages
Main article: Pumping lemma for regular languages
If a language L is regular, then there exists a number p, the pumping length, where p > 0 such that every string w in L with |w| ≥ p can be written in the form:
- w = xyz
with strings x, y and z such that |xy| ≤ p, |y| > 0 and
- xy iz is in L for every i ≥ 0.
(Note that this is trivially true if the language is not infinite, since then p just needs to be longer than longest string in the language)
Using this lemma, one can for instance prove that the language L = {a nb n : n ≥ 0} over the alphabet Σ = {a, b} is not regular. For if it were, we could pick p as in the pumping length. The string w = a pb p is in L, and the pumping lemma therefore yields a decomposition w = xyz with |xy| ≤ p, |y| ≥ 1 and xy iz in L for every i ≥ 0. But then y has to consist of a non-zero number of as, and xy 2z has more as than bs and is therefore not in L, a contradiction.
The proof that the language of balanced (i.e. properly nested) parentheses is not regular follows the same idea. Given p, there is a string of balanced parentheses that begins with more than p left parentheses, so that y will consist entirely of left parentheses. By repeating y, we can produce a string that does not even contain the same number of left and right parentheses, and so they cannot be balanced.
The proof idea for the pumping lemma itself is as follows: the regular language is accepted by a certain deterministic finite acceptor; every string that's longer than the number p of states of that acceptor will revisit a certain state, thereby causing a loop which can be repeated; the loop corresponds to the string y.
Note that the pumping lemma does not give a sufficient condition for a language to be regular: for example, the language {uu Rv : u,v <math>\in</math> {0,1}+} (strings over the alphabet {0,1} consisting of a nonempty even palindrome followed by another nonempty string) is not regular but can still be "pumped" with p = 4: Suppose w = uu Rv has length at least 4. If u has length 1, then |v| ≥ 2 and we can take y to be the first character in v. Otherwise, take y to be the first character of u and note that y k for k ≥ 2 starts with the nonempty palindrome yy. For a practical test that exactly characterizes regular languages, see the Myhill-Nerode theorem.
Pumping lemma for context-free languages
If a language L is context-free, then there exists some number p > 0 such that any string w in L with |w| ≥ p (where p is a pumping length) can be written as
- w = uvxyz
with strings u, v, x, y and z, such that |vxy| ≤ p, |vy| ≥ 1, and
- uv ixy iz is in L for every i ≥ 0.
The pumping lemma can be used to show that certain languages are NOT context-free.
We can show that the language L = {aibici: i>0} is NOT context-free.
- Suppose L is context-free
- Conditions:
- | vxy | ≤ p. That is, the middle portion is not too long.
- vy ≠ ε (empty). Since v and y are the pieces to be “pumped”, this condition says that at least one of the strings we pump must not be empty.
- For all i ≥ 0, uvixyiz is in L. That is, the two strings v and y may be “pumped” any number of times, including 0, and the resulting string will still be a member of L.
- Conditions:
- If string w ∈ L where | w | > p, it follows that w = uvxyz, where | vxy | ≤ p
- Now, choose a value for some integer variable i that is greater than p.
- Then, wherever uvxyz occurs in the string aibici, vxy cannot contain more than two distinct letters, since the rightmost a is i+1 positions away from the leftmost c.
- Ex: It can be all a’s, all b’s, or all c’s, or it can be a’s and b’s, or it can be b’s and c’s.
- Thus, the string vxy cannot contain more than two distinct letters, but by the pumping lemma it cannot be empty either, so it must contain at least one letter.
- Ex: It can be all a’s, all b’s, or all c’s, or it can be a’s and b’s, or it can be b’s and c’s.
- Now we can start "pumping"
- Since uvxyz is in L, uv2xy2z must also be in L. Since v and y can’t both be empty, | uv2xy2z | >
| uvxyz |, so we have added letters.
- But since vy does not contain all three distinct letters, we cannot have added the same number of each letter. We have pumped more letters and contradicted the original definition of L. Thus,
uv2xy2z cannot be in L.
- Since uvxyz is in L, uv2xy2z must also be in L. Since v and y can’t both be empty, | uv2xy2z | >
- We have arrived at a contradiction. Therefore, our original assumption, that L is context free, is false. KJ
(For a generalization of this lemma, see Ogden's lemma.)
See also
References
- Template:Cite book Section 1.4: Nonregular Languages, pp.77–83. Section 2.3: Non-context-free Languages, pp.115–119.cs:Lemma o vkládání
de:Pumping-Lemma he:למת הניפוח לשפות חופשיות הקשר ko:펌핑 렘마 zh:泵引理 fr:Lemme de l'étoile