Priority Queues and Heapsort
Lecture 18

Steven S. Skiena

Who's Number 2?

In most sports playoffs, a single elimination tournament is used to decide the championship.

The Marlins were clearly the best team in the 1997 baseball playoffs, since they were the only team that never lost a series. But who is number 2? The Giants, Braves, and Indians all have equal claims, since only the champion beat them!

Each game can be thought of as a comparison. Given n keys, we would like to determine the k largest values. Can we do better than just sorting all of them?

In the tournament example, each team represents a leaf of the tree and each game is an internal node of the tree. Thus there are n-1 games/comparisons for n teams/leaves.

Note that the champion is identified even though no team plays more than $\lceil \lg n \rceil$ games!

Lewis Carroll, author of "Alice in Wonderland", studied this problem in the 19th century in order to design better tennis tournaments!

We will seek a data structure which will enable us to repeatedly identify the largest key, and then delete it to retrieve the largest remaining key.

This data structure is called a heap, as in "top of the heap".
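As a rough illustration of why such a structure helps with the k largest values, the sketch below uses Python's standard `heapq` module. The function name `k_largest` is made up for this example, and negating the keys (which assumes they are numbers) is just a way to get largest-first order out of `heapq`, which keeps the smallest key on top.

```python
import heapq


def k_largest(keys, k):
    """Return the k largest keys in decreasing order.

    Assumes numeric keys: heapq keeps the smallest item on top,
    so each key is negated to get largest-first behaviour.
    """
    heap = [-x for x in keys]
    heapq.heapify(heap)                    # builds the heap in O(n) time
    count = min(k, len(heap))              # never pop more than we have
    return [-heapq.heappop(heap) for _ in range(count)]


print(k_largest([5, 1, 9, 3, 7], 2))
```

Building the heap costs O(n) and each removal costs O(log n), so for small k this does less work than sorting all n keys.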

Binary Heaps

A binary heap is defined to be a binary tree with a key in each node such that:

All leaves are on, at most, two adjacent levels.
All leaves on the lowest level occur to the left, and all levels except the lowest are completely filled.
The key in the root is $\geq$ the keys of all its children, and the left and right subtrees are again binary heaps. (This is a recursive definition)

Conditions 1 and 2 specify the shape of the tree, while condition 3 describes the labeling of the nodes of the tree.

Unlike the tournament example, each label only appears on one node.

Note that heaps are not binary search trees, but they are binary trees.

Heap Test

Where is the largest element in a heap?

Answer - the root.

Where is the second largest element?

Answer - it is either the root's left or right child.

Where is the smallest element?

Answer - it is one of the leaves.

Can we do a binary search to find a particular key in a heap?

Answer - No! A heap is not a binary search tree, and cannot be effectively used for searching.

Why Do Heaps Lean Left?

As a consequence of the structural definition of a heap, each of the n items can be assigned a number from 1 to n with the property that the left child of node number k has a number 2k and the right child number 2k+1.

Thus we can store the heap in an n element array without pointers!
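The numbering scheme can be sketched in a few lines of Python. The helper names below are illustrative, and leaving slot 0 of the list unused is an assumption made here so that positions run from 1 to n as in the text.

```python
def parent(k: int) -> int:
    """Position of the parent of node k (for k >= 2)."""
    return k // 2


def left(k: int) -> int:
    """Position of the left child of node k."""
    return 2 * k


def right(k: int) -> int:
    """Position of the right child of node k."""
    return 2 * k + 1


def is_max_heap(a: list) -> bool:
    """Check heap order on a[1..n]; a[0] is an unused placeholder."""
    n = len(a) - 1
    return all(a[parent(k)] >= a[k] for k in range(2, n + 1))


print(is_max_heap([None, 9, 7, 8, 3, 5]))
```

No pointers are stored anywhere: the tree shape is implied entirely by the arithmetic on positions.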

If we did not enforce the left constraint, we might have holes, and need room for as many as $2^n - 1$ elements to store n things.

This implicit representation of trees saves memory but is less flexible than using pointers. For this reason, we will not be able to use it when we discuss binary search trees.

Constructing Heaps

Heaps can be constructed incrementally, by inserting new elements into the left-most open spot in the array.

If the new element is greater than its parent, swap their positions and recur.

Since at each step, we replace the root of a subtree by a larger one, we preserve the heap order.
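One way to write this insertion in Python is sketched below. It assumes a max-heap stored in a list whose slot 0 is an unused placeholder, so that the parent of position k is k // 2; the name `heap_insert` is chosen for this example.

```python
def heap_insert(heap: list, key) -> None:
    """Insert key into a max-heap stored in heap[1..n] (heap[0] unused)."""
    heap.append(key)                 # left-most open spot in the array
    k = len(heap) - 1
    # Swap with the parent while the new key is larger, moving up one level.
    while k > 1 and heap[k] > heap[k // 2]:
        heap[k], heap[k // 2] = heap[k // 2], heap[k]
        k //= 2


heap = [None]                        # empty heap with the placeholder slot
for key in [3, 1, 4, 1, 5, 9, 2, 6]:
    heap_insert(heap, key)
print(heap[1])                       # the largest key sits at the root
```

The loop runs at most once per level of the tree, which is where the logarithmic cost per insertion comes from.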

Since all levels but the last are always filled, the height h of an n element heap is bounded because:

$$2^h \leq n \leq \sum_{i=0}^{h} 2^i = 2^{h+1} - 1$$

so $h = \lfloor \lg n \rfloor$.

Doing n such insertions takes O(n log n) time, since each insertion takes at most O(log n) time.

Deleting the Root

The smallest (or largest) element in the heap sits at the root.

Deleting the root can be done by replacing the root by the nth key (which must be a leaf) and letting it percolate down to its proper position!

The smallest element of (1) the root, (2) its left child, and (3) its right child is moved to the root. This leaves at most one of the two subtrees which is not in heap order, so we continue one level down.

After at most $\lfloor \lg n \rfloor$ steps of O(1) time each, we reach a leaf, so the deletion is completed in O(log n) time.

This percolate-down operation is often called Heapify, for it merges two heaps with a new root.
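The deletion can be sketched in Python as follows. This version assumes a max-heap (largest key at the root) stored in positions 1 to n of a list with slot 0 unused; flipping the comparisons gives the smallest-on-top variant described above. The names `heapify` and `delete_root` are choices made for this example.

```python
def heapify(heap: list, k: int, n: int) -> None:
    """Percolate heap[k] down within heap[1..n] until heap order holds."""
    while True:
        largest = k
        for child in (2 * k, 2 * k + 1):
            if child <= n and heap[child] > heap[largest]:
                largest = child
        if largest == k:             # both children are no larger: done
            return
        heap[k], heap[largest] = heap[largest], heap[k]
        k = largest                  # continue one level down


def delete_root(heap: list):
    """Remove and return the root of a max-heap in heap[1..n]."""
    n = len(heap) - 1
    if n < 1:
        raise IndexError("heap is empty")
    top = heap[1]
    last = heap.pop()                # the nth key, which must be a leaf
    if n > 1:
        heap[1] = last               # move it to the root ...
        heapify(heap, 1, n - 1)      # ... and let it percolate down
    return top


heap = [None, 9, 6, 5, 3, 1, 4, 2]
print(delete_root(heap))
```

Calling `delete_root` repeatedly hands back the keys in decreasing order, which is the idea Heapsort builds on.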

Heapsort

An initial heap can be constructed out of n elements by incremental insertion in O(n log n) time:



Build-heap(A)
		 for i = 2 to n do
				 HeapInsert(A[i], A)

Exchanging the maximum element with the last element and calling heapify repeatedly gives an O(n log n) sorting algorithm, named Heapsort.



Heapsort(A)
		 Build-heap(A)
		 for i = n downto 2 do
		 		 swap(A[1],A[i])
		 		 n = n - 1
		 		 Heapify(A,1)
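A runnable Python sketch of the same algorithm follows. It assumes a max-heap in positions 1 to n with slot 0 unused, and it copies the input into a new list for convenience, so unlike the pseudocode it is not strictly in place. The helper names `sift_up` and `sift_down` are chosen here and play the roles of HeapInsert and Heapify.

```python
def sift_up(a: list, k: int) -> None:
    """Trickle a[k] up while it is larger than its parent (HeapInsert)."""
    while k > 1 and a[k] > a[k // 2]:
        a[k], a[k // 2] = a[k // 2], a[k]
        k //= 2


def sift_down(a: list, k: int, n: int) -> None:
    """Percolate a[k] down within a[1..n] (Heapify)."""
    while 2 * k <= n:
        child = 2 * k
        if child < n and a[child + 1] > a[child]:
            child += 1               # pick the larger of the two children
        if a[k] >= a[child]:
            break
        a[k], a[child] = a[child], a[k]
        k = child


def heapsort(items) -> list:
    """Return the items in increasing order."""
    a = [None] + list(items)         # slot 0 is an unused placeholder
    n = len(a) - 1
    for i in range(2, n + 1):        # Build-heap by incremental insertion
        sift_up(a, i)
    for i in range(n, 1, -1):        # move the maximum to the end, shrink
        a[1], a[i] = a[i], a[1]
        sift_down(a, 1, i - 1)
    return a[1:]


print(heapsort([5, 2, 9, 1, 5, 6]))
```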

Advantages of heapsort include:

No extra space (Quicksort needs a stack)
No worst case trouble.
Simpler to get fast and correct than Quicksort.

The Lesson of Heapsort

Always ask yourself, "Can we use a different data structure?"

Selection sort scans through the entire array, repeatedly finding the smallest remaining element.

For i = 1 to n
		 A: Find the smallest of the n-i+1 items not yet placed.
		 B: Pull it out of the array and put it in position i.

Using arrays or unsorted linked lists as the data structure, operation A takes O(n) time and operation B takes O(1).

Using heaps, both of these operations can be done within O(log n) time, balancing the work and achieving a better tradeoff.

Priority Queues

A priority queue is a data structure on sets of keys supporting the operations:

Insert(S, x) - insert x into set S.
Maximum(S) - return the largest key in S.
ExtractMax(S) - return and remove the largest key in S.

These operations can be easily supported using a heap.

Insert - use the trickle up insertion in O(log n).
Maximum - read the first element in the array in O(1).
Extract-Max - delete first element, replace it with the last, decrement the element counter, then heapify in O(log n).
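The three operations might be packaged as follows, here leaning on Python's standard `heapq` module rather than a hand-written heap. The class name `MaxPriorityQueue` is an invention for this sketch, and storing negated keys assumes the keys are numbers, since `heapq` keeps the smallest item on top.

```python
import heapq


class MaxPriorityQueue:
    """Priority queue over numeric keys, backed by heapq."""

    def __init__(self):
        self._heap = []              # holds negated keys

    def insert(self, x) -> None:
        """Insert x in O(log n) time."""
        heapq.heappush(self._heap, -x)

    def maximum(self):
        """Return the largest key in O(1) time without removing it."""
        return -self._heap[0]

    def extract_max(self):
        """Return and remove the largest key in O(log n) time."""
        return -heapq.heappop(self._heap)

    def __len__(self) -> int:
        return len(self._heap)


pq = MaxPriorityQueue()
for key in [3, 10, 7]:
    pq.insert(key)
print(pq.maximum(), pq.extract_max(), pq.extract_max())
```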

Application: Heaps as stacks or queues

In a stack, push inserts a new item and pop removes the most recently pushed item.
In a queue, enqueue inserts a new item and dequeue removes the least recently enqueued item.

Both stacks and queues can be simulated by using a heap, when we add a new time field to each item and order the heap according to this time field.

To simulate the stack, increment the time with each insertion and put the maximum on top of the heap.
To simulate the queue, decrement the time with each insertion and put the maximum on top of the heap (or increment times and keep the minimum on top)

This simulation is not as efficient as a normal stack/queue implementation, but it is a cute demonstration of the flexibility of a priority queue.
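A small Python sketch of both simulations is given below. The class names `HeapStack` and `HeapQueue` are made up for this example. Because the standard `heapq` module keeps the minimum on top, the stack stores negated times, which has the same effect as keeping the maximum time on top.

```python
import heapq
import itertools


class HeapStack:
    """Stack simulated by a heap ordered on (negated) insertion time."""

    def __init__(self):
        self._heap = []
        self._clock = itertools.count()     # 0, 1, 2, ...

    def push(self, item) -> None:
        # Newest item gets the smallest (most negative) key.
        heapq.heappush(self._heap, (-next(self._clock), item))

    def pop(self):
        return heapq.heappop(self._heap)[1]


class HeapQueue:
    """Queue simulated by a heap ordered on insertion time."""

    def __init__(self):
        self._heap = []
        self._clock = itertools.count()

    def enqueue(self, item) -> None:
        # Oldest item keeps the smallest key.
        heapq.heappush(self._heap, (next(self._clock), item))

    def dequeue(self):
        return heapq.heappop(self._heap)[1]


s, q = HeapStack(), HeapQueue()
for x in "abc":
    s.push(x)
    q.enqueue(x)
print(s.pop(), q.dequeue())
```

The time stamps are all distinct, so the tuple comparison never needs to look at the items themselves.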

Discrete Event Simulations

In simulations of airports, parking lots, and jai-alai, priority queues can be used to maintain who goes next.

In a simulation, we often need to schedule events according to a clock. When someone is born, we may then immediately decide when they will die, and we will have to be reminded when to bury them!

The stack and queue orders are just special cases of orderings. In real life, certain people cut in line.
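The birth and death example can be sketched as a tiny event loop in Python. Everything here is illustrative: the function name `simulate`, the event tuples, and the `lifespan` table are assumptions of this sketch, with the standard `heapq` module acting as the priority queue keyed on event time.

```python
import heapq


def simulate(births, lifespan):
    """Process events in clock order and return them as a history.

    births   -- list of (time, name) pairs
    lifespan -- dict mapping each name to how long that person lives
    """
    events = [(t, "born", name) for t, name in births]
    heapq.heapify(events)
    history = []
    while events:
        t, kind, name = heapq.heappop(events)     # earliest event next
        history.append((t, kind, name))
        if kind == "born":
            # Decide the time of death immediately and schedule it.
            heapq.heappush(events, (t + lifespan[name], "dies", name))
    return history


for event in simulate([(0, "ann"), (5, "bob")], {"ann": 70, "bob": 3}):
    print(event)
```

New events can be scheduled at any time, and the heap still hands back whichever one is due next.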

Sweepline Algorithms in Computational Geometry

In the priority queue, we will store the points we have not yet encountered, ordered by x coordinate, and push the line forward one stop at a time.

Greedy Algorithms

In greedy algorithms, we always pick the next thing which locally maximizes our score. By placing all the things in a priority queue and pulling them off in order, we can improve performance over linear search or sorting, particularly if the weights change.

Example: Sequential strips in triangulations.


Steve Skiena
Thu Oct 30 09:57:21 EST 1997
