Steven S. Skiena
What about non-uniform access?
AVL/red-black trees give us worst-case O(log n) query and update operations by keeping the search tree balanced. But when keys are accessed with non-uniform probability, a skewed tree might be better:
Expected cost of left tree:
Expected cost of right tree:
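As a rough sketch of how such expected costs can be computed, the snippet below weights each key's depth by its access probability. The two trees and the probabilities are hypothetical values chosen for illustration, not the ones from the original figure.

```python
# Trees are nested tuples (key, left, right); None is an empty subtree.
def expected_cost(tree, prob, depth=1):
    """Expected number of nodes visited by a successful search."""
    if tree is None:
        return 0.0
    key, left, right = tree
    return (prob[key] * depth
            + expected_cost(left, prob, depth + 1)
            + expected_cost(right, prob, depth + 1))

# Hypothetical access probabilities, heavily skewed toward key 1.
prob = {1: 0.7, 2: 0.2, 3: 0.1}

balanced = (2, (1, None, None), (3, None, None))   # key 2 at the root
skewed = (1, None, (2, None, (3, None, None)))     # right-leaning chain

print("balanced:", expected_cost(balanced, prob))
print("skewed:  ", expected_cost(skewed, prob))
```

With these made-up numbers the balanced tree costs about 1.8 comparisons per search and the skewed chain about 1.4, because the hot key sits at the root.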
In real life, it is difficult to obtain the actual probabilities, and they keep changing. What can we do?
Self-organizing Search Trees
We can apply our self-organizing heuristics to search trees, as we did with linked lists. Whenever we access a node, we can either move it one step toward the root (transpose) or move it all the way to the root (move-to-front).
Once again, move-to-front proves better at adjusting to changing distributions.
Moving a node to the front of a search tree means making it the root!
To get a particular node to the root we can do a sequence of rotations!
Splay trees use the move-to-front heuristic on each search / query.
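To make the rotation idea concrete, here is a minimal sketch that brings a key to the root one single rotation at a time. This is the naive move-to-root scheme, not the splay operation itself; the names Node, rotate_left, rotate_right and move_to_root are illustrative choices.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(y):
    """Lift y's left child above y; returns the new subtree root."""
    x = y.left
    y.left, x.right = x.right, y
    return x

def rotate_left(x):
    """Lift x's right child above x; returns the new subtree root."""
    y = x.right
    x.right, y.left = y.left, x
    return y

def move_to_root(root, key):
    """Bring key to the root by rotating it up one level at a time."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root                      # key is absent
        root.left = move_to_root(root.left, key)
        return rotate_right(root) if root.left.key == key else root
    if root.right is None:
        return root                          # key is absent
    root.right = move_to_root(root.right, key)
    return rotate_left(root) if root.right.key == key else root

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

tree = Node(4, Node(2, Node(1), Node(3)), Node(6))
tree = move_to_root(tree, 3)
print(tree.key, inorder(tree))
```

Each rotation preserves the in-order sequence of keys, so the result is still a valid search tree with the accessed key on top.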
Splay Trees
To search or insert into a splay tree, we first perform the operation as if it were an ordinary, unbalanced binary search tree. After the key is found or inserted, we perform a splay operation to move it to the root.
A splay operation consists of a sequence of double rotations until the node is within one level of the root, where at most one single rotation suffices to finish the job.
The choice of which double rotation to do depends upon our relationship to our grandparent - a single rotation is performed only when we have no grandparent!
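The cases: zig-zig (the node and its parent are both left children or both right children) and zig-zag (one is a left child and the other is a right child).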
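The splay operation described above can be sketched bottom-up with parent pointers. This is one possible rendering under that assumption, not the lecture's own code; Node, rotate_up, splay and bst_insert are illustrative names.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Single rotation lifting x above its parent, preserving key order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:                      # x is a left child: rotate right
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:                                # x is a right child: rotate left
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:                    # reattach x under the old grandparent
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root with double rotations, plus at most one single one."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)                 # no grandparent: single rotation
        elif (g.left is p) == (p.left is x):
            rotate_up(p)                 # zig-zig: rotate the parent first,
            rotate_up(x)                 # then the node itself
        else:
            rotate_up(x)                 # zig-zag: rotate the node twice
            rotate_up(x)
    return x

def bst_insert(root, key):
    """Plain BST insertion with no rebalancing; returns (root, new_node)."""
    node = Node(key)
    if root is None:
        return node, node
    cur = root
    while True:
        side = "left" if key < cur.key else "right"
        child = getattr(cur, side)
        if child is None:
            setattr(cur, side, node)
            node.parent = cur
            return root, node
        cur = child

def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)

root, nodes = None, {}
for k in [5, 3, 8, 1, 4]:
    root, nodes[k] = bst_insert(root, k)
root = splay(nodes[4])                   # 4 is a right child of a left child: zig-zag
print(root.key, inorder(root))
```

The only difference between the two double-rotation cases is which node is rotated first: the parent in zig-zig, the node itself in zig-zag.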
Splay Tree Example
Example: Splay(a)
At the conclusion, a is the root and the tree is more balanced.
Note that the tree would not have become more balanced had we just used single rotations to promote a to the root, instead of double rotations.
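This contrast can be checked on a degenerate chain. The sketch below promotes the deepest node once with single rotations only and once with a splay, then compares tree heights. The chain length and helper names are illustrative assumptions, not taken from the slide's example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Single rotation lifting x above its parent."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Double rotations (zig-zig / zig-zag), then at most one single rotation."""
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)
        elif (g.left is p) == (p.left is x):
            rotate_up(p)
            rotate_up(x)
        else:
            rotate_up(x)
            rotate_up(x)
    return x

def move_to_root_single(x):
    """Promote x using single rotations only."""
    while x.parent is not None:
        rotate_up(x)
    return x

def left_chain(n):
    """Degenerate tree with key n at the root; returns the deepest node (key 1)."""
    nodes = [Node(k) for k in range(1, n + 1)]
    for child, parent in zip(nodes, nodes[1:]):
        parent.left, child.parent = child, parent
    return nodes[0]

def height(t):
    return 0 if t is None else 1 + max(height(t.left), height(t.right))

n = 15
h_single = height(move_to_root_single(left_chain(n)))
h_splay = height(splay(left_chain(n)))
print("single rotations:", h_single, "splay:", h_splay)
```

On this chain the single-rotation version leaves a tree exactly as tall as before, while the splayed tree comes out noticeably shorter.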
How good are Splay Trees?
Sleator and Tarjan showed that if the keys are accessed with a uniform distribution, the cost for any sequence of n splay operations is O(n log n), so the amortized cost is O(log n) per operation!
This is better than an expected-case bound, since there is no probability involved! If we get an expensive splay step (i.e. moving up an unbalanced tree), it means we did enough cheap operations before it that we can pay for the difference out of our savings!
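The savings argument can be illustrated with a small experiment that counts rotations. Inserting keys in sorted order makes every insertion cheap but leaves a long chain, so the next access to the smallest key is expensive. The value of n and the function names are arbitrary choices for this sketch.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate_up(x):
    """Single rotation lifting x above its parent."""
    p, g = x.parent, x.parent.parent
    if p.left is x:
        p.left, x.right = x.right, p
        if p.left is not None:
            p.left.parent = p
    else:
        p.right, x.left = x.left, p
        if p.right is not None:
            p.right.parent = p
    p.parent, x.parent = x, g
    if g is not None:
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Splay x to the root; returns the number of single rotations used."""
    count = 0
    while x.parent is not None:
        p, g = x.parent, x.parent.parent
        if g is None:
            rotate_up(x)
            count += 1
        else:
            first = p if (g.left is p) == (p.left is x) else x
            rotate_up(first)             # zig-zig rotates p first, zig-zag rotates x
            rotate_up(x)
            count += 2
    return count

def insert(root, key):
    """BST insert followed by a splay; returns (new_root, rotations)."""
    node = Node(key)
    if root is None:
        return node, 0
    cur = root
    while True:
        nxt = cur.left if key < cur.key else cur.right
        if nxt is None:
            break
        cur = nxt
    if key < cur.key:
        cur.left = node
    else:
        cur.right = node
    node.parent = cur
    return node, splay(node)

def access(root, key):
    """Find key (assumed present) and splay it; returns (new_root, rotations)."""
    cur = root
    while cur.key != key:
        cur = cur.left if key < cur.key else cur.right
    return cur, splay(cur)

n = 64
root, costs = None, []
for k in range(1, n + 1):                # cheap operations: one rotation each
    root, c = insert(root, k)
    costs.append(c)
root, c = access(root, 1)                # expensive operation on the chain
costs.append(c)
print("worst single operation:", max(costs))
print("average per operation: ", sum(costs) / len(costs))
```

In this run the one expensive access uses n - 1 rotations, yet the average over all operations stays below two rotations each.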
Further, if the distribution is non-uniform, we get amortized costs within a constant factor of the best possible tree!
All of this is done without keeping any balance or color information - amazing!