Lookahead
From Free net encyclopedia
Lookahead is a tool in algorithms for looking ahead a few more input items before making a cost effective decision at one stage of the algorithm.
Contents |
Lookahead vs. Lazy evaluation
This is in contrast to another technique called lazy evaluation that delays the computation until it is really needed. Both techniques are used for economical usage of space or time. Lookahead makes the right decision and so avoids backtracking from undesirable stages at later stages of algorithm and so saves space. Lookahead adds slight time overhead of extra lookups. Lazy evaluation normally skips the unexplored algorithmic paths and thus saves both the time and space in general. Some applications of lazy evaluations are demand paging in operating systems and lazy parse tables in compilers.
In search space exploration, both the techniques are used. When there are already promising paths in the algorithm to evaluate, lazy evaluation is used and to be explored paths will be saved in the queue or stack. When there are no promising paths to evaluate and to check whether the new path can be a more promising path in leading to solution, lookahead is used.
Compilers also use both the techniques. They will be lazy in generating parse tables from given rules, but they lookahead in parsing given input.
Lookahead in search problems
In artificial intelligence, lookahead is an important component of combinatorial search which specifies, roughly, how deeply the graph representing the problem is explored.
The need for a specific limit on lookahead comes from the large problem graphs in many applications, such as computer chess and computer Go. A naive breadth-first search of these graphs would quickly consume all the memory of any modern computer. But by setting a specific lookahead limit, the algorithm's time can be carefully controlled; its time increases exponentially as the lookahead limit increases.
More sophisticated search techniques such as alpha-beta pruning are able to eliminate entire subtrees of the search tree from consideration at an earlier time. When these techniques are used, lookahead is not a precisely defined quantity, but instead either the maximum depth searched or some type of average.
Lookahead in parsing
Lookahead is also an important concept in parsers in compilers which establishes the maximum number of incoming input tokens the parser can look at to decide which rule it should use.
Lookahead is especially relevant to LL, LR, and LALR parsers, where it is often explicitly indicated by affixing the lookahead to the algorithm name in parentheses, such as LALR(1).
Most programming languages, the primary target of parsers, are carefully defined in such a way that a parser with limited lookahead, typically one, can parse them, because parsers with limited lookahead are often more efficient. One important change to this trend came in 1990 when Terence Parr created ANTLR for his Ph.D. thesis, a parser generator for efficient LL(k) parsers, where k is any fixed value.
Parsers typically have few actions after seeing each token on the way of parsing. They are shift (add this token to stack for later reduction), reduce (pop tokens from stack and form a syntatic construct), done, error (no known rule applies now) or conflict (does not know whether to shift or reduce at this stage).
Lookahead has two advantages.
- It helps parser to take which action in case of conflicts in many cases. Parsing the if statement in the case of else clause is an example of this.
- It eliminates many duplicate states and eases the burden of extra stack. C Program nonlookahead parser will have around 10,000 states without lookahead. Lookahead parser will have around 300 states. Expression parsing given below is an example of this.
e.g. Parsing Expression 1 + 2 * 3
Set of expression parsing rules (called grammar) is as follows,
Rule1: E → E + E Expression can be sum of two expressions. Rule2: E → E * E Expression can be product of two expressions. Rule3: E → number Expression can be simple number Rule4: + has less precedence than *
Many programming languages (except a few like APL) and algebraic formulas give more precedence to multiplication than addition. The correct interpretation is ( 1 + (2*3)) Note that in the above example, Rule4 is a semantical rule. It is possible to rewrite the grammar to incorporate this into syntax. However, not all semantic rules can be translated into syntax.
Simple nonlookahead parser actions:
1. Reduces 1 to expression E on input 1 based on rule3. 2. Shift + onto stack on input 1 in anticipation of rule1. 3. Reduce stack element 2 to Expression E based on rule3. 4. Reduce stack items E+ and new input E to E based on rule1. 5. Shift * onto stack on input * in anticipation of rule2. 6. Shift 3 onto stack on input 3 in anticipation of rule3. 7. Reduce 3 to Expression E on input 3 based on rule3. 8. Reduce stack items E* and new input E to E based on rule2.
The parse tree and resulting code from it is not correct according to language semantics.
To correctly parse with no lookahead, there are three solutions.
- The user has to enclose expressions within brackets as a workaround. This is really not a solution.
- The parser needs to have more logic to backtrack and retry whenever a rule is violated or not complete. The similar method is followed in LL parsers.
- Alternatively, the parser or grammar needs to have extra logic not to reduce immediately and reduce only when it is absolutely sure which rule to reduce first. This method is used in LR parsers. This correctly parses the expression but with many more states and with a lot of stack depth.
Lookahead parser actions:
1. shift 1 onto stack on input 1 in anticipation of rule3. It does not reduce immediately. 2. reduce stack item 1 to simple Expression on input + based on rule3. The lookahead is +, so we are on path to E +, so we can reduce the stack to E. 3. shift + onto stack on input + in anticipation of rule1. 4. shift 2 onto stack on input 2 in anticipation of rule3. 5. reduce stack item 2 to Expression on input * based on rule3. The lookahead * expects only E before it. 6. Now stack has E + E and still the input is *. It has two choices now, either to shift based on rule2 or reduction based on rule1. Since * has more precedence than + based on rule4, so shift * onto stack in anticipation of rule2. 7. shift 3 onto stack on input 3 in anticipation of rule3. 8. reduce stack item 3 to Expression after seeing end of input based on rule3. 9. reduce stack items E * E to E based on rule2. 10. reduce stack items E + E to E based on rule1.
The parse tree generated is correct and simply more efficient than nonlookahead parsers. This is the strategy followed in LALR parsers.de:Lookahead