A* search algorithm

From Free net encyclopedia

Template:Tree search algorithm In computer science, A* (pronounced "A star") is a graph search algorithm that finds a path from a given initial node to a given goal node (or one passing a given goal test). It employs a "heuristic estimate" that ranks each node by an estimate of the best route that goes through that node. It visits the nodes in order of this heuristic estimate. The A* algorithm is therefore an example of best-first search.

The algorithm was first described in 1968 by Peter Hart, Nils Nilsson, and Bertram Raphael.

The name A* came about because it was originally presented as the optimal version (denoted by the *) of another algorithm, which was dubbed algorithm A.

Contents

Intuition

Consider the problem of route finding, for which A* is commonly used. A* incrementally builds all routes leading from the starting point until it finds one that reaches the goal. But, like all informed search algorithms, builds only routes that appear to lead towards the goal.

To know which routes will likely lead to the goal, A* employs a heuristic estimation of the distance from any given point to the goal. In the case of route finding, this may be the straight-line distance, which is usually an approximation of road distance.

What sets A* apart from best-first search is that it also takes the distance already travelled into account. This makes A* complete and optimal, i.e., A* will always find the shortest route if any exists.

However, A* is not guaranteed to perform better than simpler search algorithms. In a maze-like environment, the only way to reach the goal might be to first travel one way and eventually turn around. In this case trying nodes closer to your destination first may cost you time.

Algorithm description

A* maintains a set of partial solutions, i.e. paths through the graph starting at the start node, stored in a priority queue. The priority assigned to a path <math>x</math> is determined by the function <math>f(x) = g(x) + h(x)</math>

Here, <math>g(x)</math> is the cost of the path so far, i.e. the weight of the edges followed so far. <math>h(x)</math> is the heuristic estimate of the minimal cost to reach the goal from <math>x</math>. For example, if "cost" is taken to mean distance travelled, the straight line distance between two points on a map is a heuristic estimate of the distance to be travelled.

The lower <math>f(x)</math>, the higher the priority (so a min-heap could be used to implement the queue).

function A*(start,goal)
    var closed := the empty set
    var q := make_queue(path(start))
    while q is not empty
        var p := remove_first(q)
        var x := the last node of p
        if x in closed
            continue
        if x = goal
            return p
        add x to closed
        foreach y in successors(p)
            if y not in closed
                enqueue(q, extend_path(p,y))
    return failure

Here, successors(x) returns the set of paths created by extending p with one neighbor node. It is assumed that the queue maintains an ordering by <math>f</math>-value automatically.

In the closed set (closed), all paths generated so far are recorded, so as to avoid repetition and cycles (making this a graph search). The queue is sometimes analogously called the open set. The closed set can be omitted (yielding a tree search algorithm) if either a solution is guaranteed to exist, or if the successors function is adapted to reject cycles.

Properties

Like breadth-first search, A* is complete in the sense that it will always find a solution if there exists one.

If the heuristic function <math>h</math> is admissible, meaning that it never overestimates the actual minimal cost of reaching the goal, then A* is itself admissible (or optimal) if no closed set is used. If a closed set is used, then <math>h</math> must also be monotonic (or consistent) for A* to be optimal. This means that it never overestimates the cost of getting from a node to its neighbor. Formally, for all nodes <math>x,y</math> where <math>y</math> is a successor of <math>x</math>:

<math>h(x) \le g(y) - g(x) + h(y)</math>

A* is also optimally efficient for any heuristic <math>h</math>, meaning that no algorithm employing the same heuristic will expand fewer nodes than A*, except when there are several partial solutions where <math>h</math> exactly predicts the cost of the optimal path.

Relation to uniform-cost search

Uniform-cost search, aka Dijkstra's algorithm, is a special case of A* where <math>h(x) = 0</math> for all <math>x</math>.

Intuition about why A* is admissible and computationally optimal

There is an intuitive explanation of why A* is both admissible and considers fewer nodes than any other admissible search algorithm. The intuition rests on the recognition that A* has an "optimistic" estimate of the cost of a path through every node that it considers -- optimistic in that the true cost of a path through that node to the goal will be at least as great as the estimate. But, critically, as far as A* "knows", that optimistic estimate might be achievable.

When A* terminates its search, it by definition has found a path whose actual cost is lower than the estimated cost of any path through any open node. But since those estimates are optimistic, A* can safely ignore those nodes. In other words, A* will never overlook the possibility of a lower-cost path and so is admissible.

Suppose now that some other search algorithm A terminates its search with a path whose actual cost is not less than the estimated cost of a path through some open node. Algorithm A cannot rule out the possibility, based on the heuristic information it has, that a path through that node might have a lower cost. So while A might consider fewer nodes than A*, it cannot be admissible. Accordingly, A* considers the fewest nodes of any admissible search algorithm that uses a no more accurate heuristic estimate.

Complexity

The time complexity of A* depends on the heuristic. It is exponential in the worst case, but is polynomial when the heuristic function h meets the following condition:

<math>|h(x) - h^*(x)| \le O(\log h^*(x))</math>

where <math>h^*</math> is the optimal heuristic, i.e. the exact cost to get from <math>x</math> to the goal.

Memory use

More problematic than its time complexity is A*'s memory usage. Several variants of A* have been developed to cope with this, including iterative deepening A*, memory-bounded A* (MA*) and simplified memory bounded A*.

Another informed search algorithm that is optimal and complete if its heuristic is admissible is recursive best-first search (RBFS).

References

  • {{cite journal
| first = P. E.
| last = Hart
| coauthors = Nilsson, N. J.; Raphael, B.
| title = A Formal Basis for the Heuristic Determination of Minimum Cost Paths
| journal = IEEE Transactions on Systems Science and Cybernetics SSC4
| issue = 2
| pages = pp. 100–107
| year = 1968
}}
  • {{cite journal
| first = P. E.
| last = Hart
| coauthors = Nilsson, N. J.; Raphael, B.
| title = Correction to "A Formal Basis for the Heuristic Determination of Minimum Cost Paths"
| journal = SIGART Newsletter
| volume = 37
| pages = pp. 28–29
| year = 1972
}}
  • {{cite book
| first = N. J.
| last = Nilsson
| title = Principles of Artificial Intelligence
| publisher = Tioga Publishing Company
| location = Palo Alto, California
| year = 1980
}}
  • {{cite book
| first = S. J.
| last = Russell
| coauthors = Norvig, P.
| title = Artificial Intelligence: A Modern Approach
| year = 2003
| pages = pp. 97-104
}}

External links

fr:Algorithme A* it:A* nl:A*-algoritme ja:A* pl:Algorytm A* fi:A*-algoritmi