Splay tree
From Free net encyclopedia
A splay tree is a self-balancing binary search tree with the additional unusual property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For many non-uniform sequences of operations, splay trees perform better than other search trees, even when the specific pattern of the sequence is unknown. The splay tree was invented by Daniel Sleator and Robert Tarjan.
All normal operations on a binary search tree are combined with one basic operation, called splaying. Splaying the tree for a certain element rearranges the tree so that the element is placed at the root of the tree. One way to do this is to first perform a standard binary tree search for the element in question, and then use tree rotations in a specific fashion to bring the element to the top. Alternatively, a top-down algorithm can combine the search and the tree reorganization into a single pass.
Advantages and disadvantages
Good performance for a splay tree depends on the fact that it is self-balancing, and indeed self-optimising, in that frequently accessed nodes move nearer to the root, where they can be accessed more quickly. This is an advantage for the many practical applications whose accesses are non-uniform, and is particularly useful for implementing caches. For uniform access, however, a splay tree's performance will be considerably (although not asymptotically) worse than that of a reasonably balanced simple binary search tree.
Splay trees also have the advantage of being considerably simpler to implement than other self-balancing binary search trees, such as red-black trees or AVL trees, while their amortized performance is asymptotically just as good. Splay trees also do not need to store any bookkeeping data in the nodes, which minimizes memory requirements. However, these other data structures provide worst-case time guarantees for each operation, and can be more efficient in practice for uniform access.
One worst-case issue with the basic splay tree algorithm arises when all the elements of the tree are accessed sequentially in sorted order. This leaves the tree completely unbalanced (it takes n accesses, each costing amortized O(1) time). Re-accessing the first item then triggers an operation that takes O(n) time to restructure the tree before returning the item. This is a significant delay for that one operation, although the amortized cost per access over the entire sequence is still O(1). However, recent research shows that randomly rebalancing the tree can avoid this unbalancing effect and give performance similar to that of the other self-balancing algorithms.
It is possible to create a persistent version of splay trees which allows access to both the previous and new versions after an update. This requires amortized O(log n) space per update.
The splay operation
When a node x is accessed, a splay operation is performed on x to move it to the root. To perform a splay operation we carry out a sequence of splay steps, each of which moves x closer to the root. As long as x has a grandparent, each particular step depends on two factors:
- Whether x is the left or right child of its parent node, p,
- Whether p is the left or right child of its parent, g (the grandparent of x).
Thus, there are four cases when x has a grandparent. They fall into two types of splay steps.
Zig-zag Step: One zig-zag case is when x is the right child of p and p is the left child of g. After the step, p is the new left child of x, g is the new right child of x, and the subtrees A, B, C, and D of x, p, and g are rearranged as necessary. The other zig-zag case is the mirror image of this, i.e. when x is the left child of p and p is the right child of g. Note that a zig-zag step is equivalent to doing a rotation on the edge between x and p, then doing a rotation on the resulting edge between x and g.
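The left-right zig-zag case can be sketched in Python as two single rotations. This is a minimal illustration rather than a reference implementation: the `Node` class, the helpers `rotate_left`, `rotate_right` and `shape`, and the function `zig_zag` are names assumed here, and the subtrees A to D are modelled as single leaf nodes.

```python
class Node:
    """Binary tree node; the class and its field names are illustrative."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_left(node):
    """Lift node.right above node and return the new subtree root."""
    child = node.right
    node.right = child.left
    child.left = node
    return child

def rotate_right(node):
    """Lift node.left above node and return the new subtree root."""
    child = node.left
    node.left = child.right
    child.right = node
    return child

def zig_zag(g):
    """Zig-zag step for x = g.left.right; returns x as the new subtree root."""
    g.left = rotate_left(g.left)  # rotate the edge between x and p
    return rotate_right(g)        # rotate the edge between x and g

def shape(node):
    """Nested-tuple view of a tree, used only to display the result."""
    if node is None:
        return None
    if node.left is None and node.right is None:
        return node.key
    return (node.key, shape(node.left), shape(node.right))

# p is the left child of g, and x is the right child of p.
x = Node("x", Node("B"), Node("C"))
p = Node("p", Node("A"), x)
g = Node("g", p, Node("D"))
result = shape(zig_zag(g))
print(result)  # ('x', ('p', 'A', 'B'), ('g', 'C', 'D'))
```

In the printed shape, p (with subtrees A and B) is the left child of x and g (with subtrees C and D) is the right child, matching the description of the step.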
Zig-zig Step: One zig-zig case is when x is the left child of p and p is the left child of g. After the step, p is the new right child of x, g is the new right child of p, and the subtrees A, B, C, and D of x, p, and g are rearranged as necessary. The other zig-zig case is the mirror image of this, i.e. when x is the right child of p and p is the right child of g. A zig-zig step is equivalent to doing a rotation on the edge between p and g, then doing a rotation on the edge between x and p. Note that zig-zig steps are the only thing that differentiates splay trees from the rotate-to-root method introduced by Allen and Munro prior to the introduction of splay trees.
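The difference from rotate-to-root can be made concrete with a small Python sketch of the left-left case. The names `Node`, `rotate_right`, `zig_zig`, `rotate_to_root`, `build` and `shape` are assumptions made for this illustration, and the subtrees A to D are modelled as leaves. The two functions use the same two rotations and differ only in their order.

```python
class Node:
    """Binary tree node; the class and its field names are illustrative."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_right(node):
    """Lift node.left above node and return the new subtree root."""
    child = node.left
    node.left = child.right
    child.right = node
    return child

def zig_zig(g):
    """Splay-tree zig-zig: rotate the edge p-g first, then the edge x-p."""
    p = rotate_right(g)
    return rotate_right(p)

def rotate_to_root(g):
    """Rotate-to-root on the same input: edge x-p first, then edge x-g."""
    g.left = rotate_right(g.left)
    return rotate_right(g)

def build():
    """Fresh tree in which x is the left child of p and p the left child of g."""
    x = Node("x", Node("A"), Node("B"))
    p = Node("p", x, Node("C"))
    return Node("g", p, Node("D"))

def shape(node):
    """Nested-tuple view of a tree, used only to display the result."""
    if node is None:
        return None
    if node.left is None and node.right is None:
        return node.key
    return (node.key, shape(node.left), shape(node.right))

splayed = shape(zig_zig(build()))
naive = shape(rotate_to_root(build()))
print(splayed)  # ('x', 'A', ('p', 'B', ('g', 'C', 'D')))
print(naive)    # ('x', 'A', ('g', ('p', 'B', 'C'), 'D'))
```

Both variants bring x to the top, but only the zig-zig order makes p the right child of x and g the right child of p, as described above.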
Zig Step: There is also a third kind of splay step that is done when x has a parent p but no grandparent. This is called a zig step and is simply a rotation on the edge between x and p. Zig steps exist to deal with the parity issue and will be done only as the last step in a splay operation and only when x has odd depth at the beginning of the operation.
By performing a splay operation on the node of interest after every access, we keep recently accessed nodes near the root and keep the tree roughly balanced, so that we achieve the desired amortized time bounds.
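Putting the three kinds of step together, a bottom-up splay operation might look like the following Python sketch. It assumes nodes that store parent pointers; the names `Node`, `rotate_up`, `splay`, `insert`, `find` and `inorder` are chosen for this example and do not come from any particular library. It is one possible arrangement, not the definitive implementation.

```python
class Node:
    """Tree node with a parent pointer (an assumption of this sketch)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.parent = None

def rotate_up(x):
    """Rotate the edge between x and its parent, lifting x one level."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left = x.right
        if x.right is not None:
            x.right.parent = p
        x.right = p
    else:
        p.right = x.left
        if x.left is not None:
            x.left.parent = p
        x.left = p
    p.parent = x
    x.parent = g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root using zig, zig-zig and zig-zag steps; return x."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                       # zig
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                       # zig-zig: edge p-g first,
            rotate_up(x)                       # then edge x-p
        else:
            rotate_up(x)                       # zig-zag: edge x-p,
            rotate_up(x)                       # then edge x-g
    return x

def insert(root, key):
    """Insert key (if absent) and splay its node; return the new root."""
    if root is None:
        return Node(key)
    node = root
    while key != node.key:
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            child = Node(key)
            child.parent = node
            setattr(node, side, child)
        node = child
    return splay(node)

def find(root, key):
    """Search for key and splay the last node visited; return (root, found)."""
    node, last = root, None
    while node is not None:
        last = node
        if key == node.key:
            break
        node = node.left if key < node.key else node.right
    if last is None:
        return None, False
    root = splay(last)
    return root, root.key == key

def inorder(node):
    """Keys of the subtree in sorted order (recursive, for small examples)."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)

root = None
for k in [5, 3, 8, 1, 4]:
    root = insert(root, k)
root, found = find(root, 4)
print(found, root.key, inorder(root))
```

Splaying the last node visited on an unsuccessful search is a design choice made here so that every access, successful or not, restructures the path it walked.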
Performance theorems
There are several theorems and conjectures regarding the worst-case runtime for performing a sequence S of m accesses in a splay tree containing n elements.
Balance Theorem: The cost of performing the sequence S is <math>O(m(\log n + 1)+n\log n)</math>. In other words, splay trees perform as well as static balanced binary search trees on sequences of at least n accesses.
Static Optimality Theorem: Let <math>q_i</math> be the number of times element i is accessed in S. The cost of performing S is <math>O\left(m+\sum_{i=1}^n q_i\log\frac{m}{q_i}\right)</math>. In other words, splay trees perform as well as optimum static binary search trees on sequences of at least n accesses.
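To get a feel for the expression inside the O(·) of the Static Optimality Theorem, it can be evaluated for a concrete access sequence. The Python sketch below does this under stated assumptions: the function name `static_optimality_bound` is hypothetical, logarithms are taken base 2 (the base only affects the constant factor), and only elements that actually occur in the sequence contribute to the sum.

```python
import math
from collections import Counter

def static_optimality_bound(accesses):
    """Evaluate m + sum_i q_i * log2(m / q_i) for a sequence of accessed keys."""
    m = len(accesses)
    counts = Counter(accesses)  # q_i for every element that is accessed
    return m + sum(q * math.log2(m / q) for q in counts.values())

uniform = list(range(8))   # eight different elements, each accessed once
skewed = ["a"] * 8         # one element accessed eight times
print(static_optimality_bound(uniform))  # 32.0
print(static_optimality_bound(skewed))   # 8.0
```

The skewed sequence yields a much smaller value than the uniform one of the same length, which reflects the claim that splay trees benefit from non-uniform access frequencies.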
Static Finger Theorem: Assume the elements are numbered 1 through n in symmetric order. Let <math>i_j</math> be the element accessed in the <math>j^{th}</math> access of S and let f be any fixed element (the finger). The cost of performing S is <math>O\left(m+n\log n+\sum_{j=1}^m \log(|i_j-f|+1) \right)</math>.
Working Set Theorem: Let <math>t(j)</math> be the number of distinct elements accessed between access j and the previous time element <math>i_j</math> was accessed. The cost of performing S is <math>O\left(m+n\log n+\sum_{j=1}^m \log(t(j)+1) \right)</math>.
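The quantity t(j) in the Working Set Theorem can be computed directly for a given sequence. The sketch below is a straightforward quadratic-time illustration; the function names are hypothetical, logarithms are base 2, and for the first access to an element (where no previous access exists) it counts the distinct elements accessed so far, which is a convention assumed here rather than one stated in the text.

```python
import math

def working_set_numbers(accesses):
    """Return the list of t(j) values for a sequence of accessed keys."""
    last_seen = {}   # key -> index of its most recent access
    result = []
    for j, key in enumerate(accesses):
        start = last_seen.get(key, -1) + 1   # just after the previous access
        result.append(len(set(accesses[start:j])))
        last_seen[key] = j
    return result

def working_set_sum(accesses):
    """Evaluate sum_j log2(t(j) + 1), the sequence-dependent term of the bound."""
    return sum(math.log2(t + 1) for t in working_set_numbers(accesses))

print(working_set_numbers([1, 2, 3, 1, 1, 2]))  # [0, 1, 2, 2, 0, 2]
```

Repeated accesses to a small set of elements keep every t(j) small, so the sum stays well below the m log n that a balanced tree would spend.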
Dynamic Finger Theorem: With <math>i_j</math> denoting the element accessed in the <math>j^{th}</math> access of S, the cost of performing S is <math>O\left(m+n\log n+\sum_{j=1}^{m-1} \log(|i_{j+1}-i_j|+1) \right)</math>.
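The sequence-dependent terms of the two finger theorems are easy to evaluate when elements are identified with their integer ranks. The following Python sketch assumes exactly that; the function names are hypothetical, logarithms are base 2, and the dynamic sum runs over consecutive pairs of accesses.

```python
import math

def static_finger_sum(accesses, finger):
    """Evaluate sum_j log2(|i_j - f| + 1) for integer ranks and a fixed finger f."""
    return sum(math.log2(abs(i - finger) + 1) for i in accesses)

def dynamic_finger_sum(accesses):
    """Evaluate sum over consecutive accesses of log2(|i_{j+1} - i_j| + 1)."""
    return sum(math.log2(abs(b - a) + 1) for a, b in zip(accesses, accesses[1:]))

scan = [1, 2, 3, 4]     # each access is next to the previous one
jumps = [1, 4, 1, 4]    # each access is three ranks away from the previous one
print(dynamic_finger_sum(scan))   # 3.0
print(dynamic_finger_sum(jumps))  # 6.0
```

Sequences whose consecutive accesses are close together in key order give a small dynamic finger sum, while the static finger sum is small when accesses cluster around one fixed element.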
Scanning Theorem: Also known as the Sequential Access Theorem. Accessing the n elements of a splay tree in symmetric order takes <math>\Theta(n)</math> time, regardless of the initial structure of the splay tree. The tightest upper bound proven so far is <math>4.5n</math>.
References
- Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Page 478 of section 6.2.3.
- D.D. Sleator and R.E. Tarjan. Self-Adjusting Binary Search Trees. Journal of the ACM 32:3, pages 652-686, 1985.
External links
- NIST's Dictionary of Algorithms and Data Structures: Splay Tree
- The ACM Digital Library: the original publication describing splay trees NB full access requires an ACM Web Account
- Splay Tree Applet
- AVL, Splay and Red/Black Applet
- New York University: Dept of Computer Science: Algorithm Visualization: Splay Trees