Alpha-beta pruning

From Free net encyclopedia

Alpha-beta pruning is a technique to reduce the number of nodes evaluated in the search tree by the minimax algorithm. It is a search with adversary algorithm used commonly for two-players games evaluation (tic-tac-toe, chess, go ...). It stops completely evaluating a move that a player can make when at least one reply has been found that proves the move to be worse than a previously examined move. Since it clearly doesn't benefit the player to play that move, it need not be evaluated any further. This simple sounding technique nearly always saves a great deal of processing time, but without affecting the search result in any way.

1 Improvements over minimax
2 Heuristic improvements
3 Other algorithms
4 Pseudocode
5 See also
6 External references

[edit]

Improvements over minimax

The benefit of alpha-beta pruning lies in the fact that branches of the search tree can be eliminated. The search time can in this way be limited to the 'more promising' subtree, and a deeper search can be performed in the same time. Like its predecessor, it belongs to the branch and bound class of algorithms. The optimisation typically reduces the effective branching factor by two compared to simple Minimax, or equivalently doubles the number of nodes that can be searched in a given time. The algorithm does even better if the nodes are evaluated in an optimal or near optimal order.

With a branching factor of b, and a search depth of p ply, the maximum number of leaf node positions evaluated (when the move ordering is pessimal) is b*b*b*b*...*b = b^p - the same as a simple minimax search. If the move ordering for the search is optimal (with best moves searched first), the number of positions searched is about b*1*b*1*...*b (odd depth) and b*1*b*1*...*1 (even depth). This latter case will only require about the square root of the number of evaluations of a simple minimax search (or alternatively can nearly double the search depth). The explanation of b*1*b*1*... is that all the first player's moves must be studied to find the best one, but for each, only the best second player's move is needed to refute all but the first (and best) first player move - alpha beta ensures no other second player moves need be considered. If b=40 (as in chess), and the seach depth is 12 ply, the ratio between optimal and pessimal sorting is a factor of nearly 40^6 or about 4 billion times.

Normally during alpha beta the subtrees are temporarily dominated by either a first player advantage (when many first player moves are good, and at each search depth the first first player move checked is adequate, but all second player responses are required to try and find a (probably lacking) refutation), or vice versa. This advantage can switch sides many times during the search if the move ordering is incorrect, each time leading to inefficiency. As the number of positions searched decreases exponentially each move nearer the current position, it is worth spending considerable effort on sorting early moves. An improved sort at any depth will exponentially reduce the total number of positions searched, but sorting all positions at depths near the root node is relatively cheap as there are so few of them.

The algorithm maintains two values, alpha and beta, which represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of respectively. Initially alpha is negative infinity and beta is positive infinity. As the recursion progresses the "window" becomes smaller. When beta becomes less than alpha, it means that the current position cannot be the result of best play by both players and hence need not be explored further.

[edit]

Heuristic improvements

Further improvement can be achieved without sacrificing accuracy, by using ordering heuristics to search parts of the tree that are likely to force alpha-beta cutoffs early. For example, in chess, moves that take pieces may be examined before moves that do not, or moves that have scored highly in earlier passes through the game-tree analysis may be evaluated before others. Another common, and very cheap, heuristic is the killer heuristic, where the last move that caused a beta-cutoff at the same level in the tree search is always examined first. This idea can be generalised into a set of refutation tables.

Alpha-beta search can be made even faster by considering only a narrow search window (generally determined by guesswork based on experience). This is known as aspiration search. In the extreme case, the search is performed with alpha and beta equal; a technique known as zero-window search or scout search. This is particularly useful for win/loss searches near the end of a game where the extra depth gained from the narrow window and a simple win/loss evaluation function may lead to a conclusive result. If an aspiration search fails, it is straightforward to detect whether it failed high (high edge of window was too low) or low (lower edge of window was too high). This gives information about what window values might be useful in a re-search of the position.

[edit]

Other algorithms

More advanced algorithms that are even faster while still being able to compute the exact minimax value are known, such as Negascout and MTD-f.

Since the minimax algorithm and its variants are inherently depth-first, a strategy such as iterative deepening is usually used in conjunction with alpha-beta so that a reasonably good move can be returned even if the algorithm is interrupted before it has finished execution. Another advantage of using iterative deepening is that searches at shallower depths give move-ordering hints that can help produce cutoffs for higher depth searches much earlier than would otherwise be possible.

Algorithms like SSS-star, on the other hand, use the best-first strategy. This can potentially make them more time-efficient, but typically at a heavy cost in space-efficiency.

[edit]

Pseudocode

Pseudocode for the alpha-beta algorithm is given below

evaluate (node, alpha, beta)
    if node is a leaf
        return the heuristic value of node
    if node is a minimizing node
        for each child of node
            beta = min (beta, evaluate (child, alpha, beta))
            if beta <= alpha
                return beta
        return beta
    if node is a maximizing node
        for each child of node
            alpha = max (alpha, evaluate (child, alpha, beta))
            if beta <= alpha
                return alpha
        return alpha

[edit]