String/Character I/O

There are several approaches to reading in the text input required by many of these problems. You can:

- Repeatedly get single characters (perhaps using a library function like `getchar`);
- Repeatedly get strings (perhaps using a library function like `scanf`) and break them down into single characters;
- Read the entire line as a string (perhaps using a library function like `fgets`, the bounded replacement for the unsafe `gets`), and then parse it by accessing characters in the string;
- Use more modern stream-based input, which may or may not be easier.
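As a minimal sketch of the read-a-whole-line approach, the following C fragment reads lines with `fgets` and then walks each line character by character. The names `count_letters`, `process_input`, and the `MAXLINE` bound are assumptions made for illustration, not part of the original notes.

```c
#include <ctype.h>
#include <stdio.h>

#define MAXLINE 1024   /* assumed upper bound on the length of an input line */

/* Count the alphabetic characters in a string by indexing into it. */
int count_letters(const char line[])
{
    int i, n = 0;

    for (i = 0; line[i] != '\0'; i++)
        if (isalpha((unsigned char) line[i]))
            n++;
    return n;
}

/* Read the stream line by line and print the letter count of each line. */
void process_input(FILE *in)
{
    char line[MAXLINE];

    /* fgets stops at a newline or after MAXLINE-1 characters, so it cannot overflow */
    while (fgets(line, sizeof line, in) != NULL)
        printf("%d\n", count_letters(line));
}
```

Calling `process_input(stdin)` from `main` would apply this to standard input. Note that `fgets` keeps the trailing newline in the buffer, which the parsing loop here simply skips as a non-letter.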

Basic Data Types

Selecting the right data structure makes a tremendous difference in the organization and complexity of a given program.

Be aware of your basic structured data types (arrays, records, multidimensional arrays, enumerated types) and what they are used for.

Linked structures provide great *flexibility* in how memory is used, but are often unnecessary when the largest possible size of the structure is known in advance.

This is true for all or almost all of the problems in this book. Except for argument passing, pointers were not used in any of the example programs in the book.

Pointer structures are often more complex to work with and debug than arrays (KISS).

Abstract Data Types

Thinking of data structures in terms of abstract data types provides a higher-order way to think about program organization.

Abstract data types are defined by the *operations* you want to perform on the data. The correct implementation (i.e., arrays or linked structures) is determined *after* you have defined the abstract data type.

Modern object-oriented languages like C++ and Java come with standard libraries of fundamental data structures.

These eliminate the need to reinvent the wheel, once you know the wheel has been invented.

Queues and Stacks

Stacks and queues are containers where items are retrieved according to the order of insertion, independent of content.

*Stacks* maintain *last-in, first-out* order.

*Push(x,s)* - Insert item x on top of stack s.

*Pop(s)* - Return (and remove) the top item of stack s.

*Initialize(s)* - Create an empty stack.

*Full(s), Empty(s)* - Test whether the stack can accept more pushes or pops, respectively.

Note that there is no element search operation defined on standard stacks and queues.

Applications include (1) processing parenthesized formulas (push on a "(", pop on a ")"), (2) recursive program calls (push on a procedure entry, pop on a procedure exit), and (3) depth-first traversals of graphs (push on discovering a vertex, pop on leaving it for the last time).

Stacks are also a good choice when the insertion order does not matter at all, since they are a very simple container.
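The stack operations above can be sketched with a fixed-size array, in keeping with the advice to avoid linked structures when the maximum size is known. This is one possible implementation under stated assumptions: the `STACKSIZE` bound, the `int` element type, the argument order (stack first), and the `balanced` helper are all choices made for this example.

```c
#include <stdbool.h>

#define STACKSIZE 1000   /* assumed maximum number of items */

typedef struct {
    int items[STACKSIZE];
    int top;             /* number of items currently stored */
} stack;

void init_stack(stack *s)        { s->top = 0; }
bool stack_empty(const stack *s) { return s->top == 0; }
bool stack_full(const stack *s)  { return s->top == STACKSIZE; }

/* The caller is expected to check stack_full() before pushing. */
void push(stack *s, int x) { s->items[s->top++] = x; }

/* The caller is expected to check stack_empty() before popping. */
int pop(stack *s) { return s->items[--s->top]; }

/* Application (1): push on "(", pop on ")", and require an empty stack at the end. */
bool balanced(const char formula[])
{
    stack s;
    int i;

    init_stack(&s);
    for (i = 0; formula[i] != '\0'; i++) {
        if (formula[i] == '(') {
            if (stack_full(&s)) return false;    /* nesting too deep for this sketch */
            push(&s, i);                         /* remember where the "(" occurred */
        } else if (formula[i] == ')') {
            if (stack_empty(&s)) return false;   /* ")" with no matching "(" */
            pop(&s);
        }
    }
    return stack_empty(&s);                      /* leftovers are unmatched "(" */
}
```

With these definitions, `balanced("(a+(b*c))")` holds while `balanced("(()")` and `balanced(")(")` do not.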

Queues

*Queues* maintain *first-in, first-out* order.

*Enqueue(x,q)* - Insert item x at the back of queue q.

*Dequeue(q)* - Return (and remove) the front item from queue q.

*Initialize(q), Full(q), Empty(q)* - Analogous to these operations on stacks.

Applications include (1) implementing buffers, (2) simulating waiting lines, and (3) representing card decks for shuffling.

Implementations include circular queues and linked lists.

Dictionaries

Dictionaries permit content-based retrieval, unlike the position-based retrieval of stacks and queues.

*Insert(x,d)* - Insert item x into dictionary d.

*Delete(x,d)* - Remove item x (or the item pointed to by x) from dictionary d.

*Search(k,d)* - Return an item with key k if one exists in dictionary d.

Classical dictionary implementations include (1) sorted arrays, (2) binary search trees, and (3) hash tables.

The correct implementation largely depends upon whether insertions and deletions will be performed.

Hash tables are often the right answer in practice, for reasons of simplicity and performance.
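To illustrate why hash tables are simple in practice, here is a minimal sketch of a fixed-size table with linear probing. It is deliberately restricted: keys are assumed to be non-negative integers, deletion is omitted (it needs extra care under linear probing), and `TABLESIZE`, `EMPTY`, and all function names are hypothetical choices for this example.

```c
#include <stdbool.h>

#define TABLESIZE 101   /* assumed table size, chosen larger than the item count */
#define EMPTY (-1)      /* marks an unused slot; keys are assumed non-negative */

typedef struct {
    int slots[TABLESIZE];
    int count;          /* number of keys stored */
} dictionary;

void init_dictionary(dictionary *d)
{
    int i;

    for (i = 0; i < TABLESIZE; i++)
        d->slots[i] = EMPTY;
    d->count = 0;
}

/* Return the slot holding key k, or the empty slot where k would be placed.
   Terminates because insert() always leaves at least one slot empty. */
static int find_slot(const dictionary *d, int k)
{
    int i = k % TABLESIZE;

    while (d->slots[i] != EMPTY && d->slots[i] != k)
        i = (i + 1) % TABLESIZE;   /* linear probing: try the next slot */
    return i;
}

/* Search: is key k present in the dictionary? */
bool search(const dictionary *d, int k)
{
    return d->slots[find_slot(d, k)] == k;
}

/* Insert key k; returns false only if the table has no room left. */
bool insert(dictionary *d, int k)
{
    int i = find_slot(d, k);

    if (d->slots[i] == k) return true;             /* already present */
    if (d->count == TABLESIZE - 1) return false;   /* keep one slot empty */
    d->slots[i] = k;
    d->count++;
    return true;
}
```

If the set of keys never changes after construction, a sorted array with binary search is an equally reasonable choice, which is the trade-off the notes describe.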

Priority Queues

*Priority queues* are data structures on sets of items supporting three operations:

*Insert(x,p)* - Insert item x into priority queue p.

*Maximum(p)* - Return the item with the largest key in priority queue p.

*ExtractMax(p)* - Return and remove the item with the largest key in p.

Priority queues are used (1) to maintain schedules and calendars and (2) in sweepline geometric algorithms where operations go from left to right.

The most famous implementation of priority queues is the binary heap, but it is often simpler to maintain a sorted array, particularly if you will not be performing too many insertions.
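The sorted-array alternative can be sketched as follows: keys are kept in increasing order, so the maximum is always the last element and only insertion costs linear time. The `PQSIZE` bound, the `int` keys, and the `pq_` names are assumptions of this example.

```c
#define PQSIZE 1000   /* assumed maximum number of items */

typedef struct {
    int keys[PQSIZE];   /* kept in increasing order */
    int n;              /* number of keys stored */
} priority_queue;

void pq_init(priority_queue *p) { p->n = 0; }

/* Insert x by shifting larger keys one slot to the right: O(n) per insertion.
   The caller is expected to ensure p->n < PQSIZE. */
void pq_insert(priority_queue *p, int x)
{
    int i = p->n;

    while (i > 0 && p->keys[i - 1] > x) {
        p->keys[i] = p->keys[i - 1];
        i--;
    }
    p->keys[i] = x;
    p->n++;
}

/* The largest key is the last one; the queue must be non-empty. */
int pq_maximum(const priority_queue *p) { return p->keys[p->n - 1]; }

/* Return and remove the largest key in constant time; the queue must be non-empty. */
int pq_extract_max(priority_queue *p) { return p->keys[--p->n]; }
```

A binary heap would bring insertion down to logarithmic time at the cost of a little more code, which matters only when insertions are frequent.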

Sets

Sets (or more strictly speaking *subsets*) are unordered collections of elements drawn from a given universal set U.

*Member(x,S)* - Is an item x an element of subset S?

*Union(A,B)* - Construct the subset of all elements in subset A or in subset B.

*Intersection(A,B)* - Construct the subset of all elements in subset A and in subset B.

*Insert(x,S), Delete(x,S)* - Insert/delete element x into/from subset S.

Set data structures are distinguished from dictionaries because there is at least an implicit need to encode which elements from U are *not* in the given subset. For sets drawn from a large or unbounded universe, the obvious solution is to represent a subset using a dictionary.

For sets drawn from small, unchanging universes, bit vectors provide a convenient representation. An n-bit vector or array can represent any subset S from an n-element universe. Bit i will be 1 iff element i is in S. Element insertion and deletion operations simply flip the appropriate bit. Intersection and union are done by "and-ing" or "or-ing" the corresponding bits together.

Since only one bit is used per element, bit vectors can be space efficient for surprisingly large universes. For example, an array of 1,000 standard four-byte integers can represent any subset on 32,000 elements.
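The 32,000-element example can be sketched directly: 1,000 words of 32 bits each, with element x stored in bit `x % 32` of word `x / 32`. The type name `bitset`, the `set_` function names, and the use of `uint32_t` (to guarantee a four-byte word) are assumptions of this example; elements are taken to be integers in the range 0 to 31,999.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define UNIVERSE 32000                   /* number of elements in the universe */
#define WORDBITS 32                      /* bits per four-byte word */
#define NWORDS (UNIVERSE / WORDBITS)     /* 1,000 words */

typedef struct {
    uint32_t word[NWORDS];
} bitset;

/* Make s the empty subset. */
void set_clear(bitset *s) { memset(s->word, 0, sizeof s->word); }

/* Member: is bit x set? */
bool set_member(const bitset *s, int x)
{
    return (s->word[x / WORDBITS] >> (x % WORDBITS)) & 1u;
}

/* Insert turns the bit for x on; delete turns it off. */
void set_insert(bitset *s, int x)
{
    s->word[x / WORDBITS] |= (uint32_t) 1 << (x % WORDBITS);
}

void set_delete(bitset *s, int x)
{
    s->word[x / WORDBITS] &= ~((uint32_t) 1 << (x % WORDBITS));
}

/* Union: "or" the corresponding words together. */
void set_union(const bitset *a, const bitset *b, bitset *out)
{
    int i;

    for (i = 0; i < NWORDS; i++)
        out->word[i] = a->word[i] | b->word[i];
}

/* Intersection: "and" the corresponding words together. */
void set_intersection(const bitset *a, const bitset *b, bitset *out)
{
    int i;

    for (i = 0; i < NWORDS; i++)
        out->word[i] = a->word[i] & b->word[i];
}
```

Union and intersection process 32 elements per machine operation, which is where much of the practical speed of bit vectors comes from.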

Object Libraries

A general library of abstract data types cannot really exist in the C language because functions in C can't tell the type of their arguments. Thus we would have to define separate routines such as `push_int()` and `push_char()` for every possible data type.

However, C++ has been designed to support object libraries. In particular, the *Standard Template Library* provides implementations of all the data structures defined above and much more. Each data object must have the type of its elements fixed (i.e., templated) at compilation time, so

    #include <stack>
    using namespace std;

    stack<int> S;
    stack<char> T;

declares two stacks with different element types.

Useful standard Java objects appear in the `java.util` package. Almost all of `java.util` is available on the judge.

Appropriate implementations of the basic data structures include:

| Data Structure | Abstract class | Concrete class | Methods |
|---|---|---|---|
| Stack | No interface | Stack | pop, push, empty, peek |
| Queue | List | ArrayList, LinkedList | add, remove, clear |
| Dictionaries | Map | HashMap, Hashtable | put, get, containsKey |
| Priority Queue | SortedMap | TreeMap | firstKey, lastKey, headMap |
| Sets | Set | HashSet | add, remove, contains |

Ranking and Unranking Functions

Whenever we can create a numerical *ranking* function and a dual *unranking* function which hold over a particular set of items S, we can represent any item by its integer rank.

The key property is that unrank(rank(s)) = s for every item s in S. Thus the ranking function can be thought of as a hash function without collisions.

One can define ranking/unranking functions for permutations (1 to n!), subsets (1 to 2^n), and playing cards (1 to 52).

We can use ranking/unranking functions to (1) generate all of the objects, (2) pick one at random, and (3) sort and compare them.
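As an illustration for permutations, the sketch below ranks and unranks permutations of 0 to n-1 in lexicographic order using the factorial number system. This is one standard scheme, not necessarily the one the notes have in mind. Ranks here run from 0 to n!-1 rather than 1 to n!, and the names `rank_perm`, `unrank_perm`, and the `MAXN` bound are assumptions of this example.

```c
#include <stdbool.h>

#define MAXN 12   /* assumed bound: 12! still fits in a 32-bit long */

/* Rank a permutation p of 0..n-1 in lexicographic order, from 0 to n!-1. */
long rank_perm(const int p[], int n)
{
    long r = 0;
    int i, j, smaller;

    for (i = 0; i < n; i++) {
        smaller = 0;                    /* later elements smaller than p[i] */
        for (j = i + 1; j < n; j++)
            if (p[j] < p[i])
                smaller++;
        r = r * (n - i) + smaller;      /* Horner's rule over factorial digits */
    }
    return r;
}

/* Unrank: fill p with the permutation of 0..n-1 whose rank is r. */
void unrank_perm(long r, int n, int p[])
{
    int digit[MAXN];
    bool used[MAXN] = { false };
    int i, v, skip;

    for (i = n - 1; i >= 0; i--) {      /* peel off factorial digits, last first */
        digit[i] = (int) (r % (n - i));
        r /= (n - i);
    }
    for (i = 0; i < n; i++) {           /* digit[i] selects among the unused values */
        skip = digit[i];
        for (v = 0; v < n; v++) {
            if (used[v]) continue;
            if (skip == 0) { p[i] = v; used[v] = true; break; }
            skip--;
        }
    }
}
```

Looping `r` from 0 to n!-1 and calling `unrank_perm` generates every permutation exactly once, and unranking a random `r` picks one uniformly, matching uses (1) and (2) above.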

To rank and unrank playing cards, we order the card values (low to high) and note that there are four distinct cards of each value. Multiplication and division are the key to mapping them from 0 to 51:

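    #include <stdio.h>

    #define NCARDS 52   /* number of cards */
    #define NSUITS 4    /* number of suits */

    char values[] = "23456789TJQKA";
    char suits[] = "cdhs";

    /* Rank: map a (value, suit) pair to an integer from 0 to 51, or -1 on bad input. */
    int rank_card(char value, char suit)
    {
        int i, j;   /* counters */

        for (i = 0; i < (NCARDS / NSUITS); i++)
            if (values[i] == value)
                for (j = 0; j < NSUITS; j++)
                    if (suits[j] == suit)
                        return (i * NSUITS + j);

        printf("Warning: bad input value=%d, suit=%d\n", value, suit);
        return -1;   /* no such card */
    }

    /* Unrank: recover the suit of a card from its rank. */
    char suit(int card)
    {
        return (suits[card % NSUITS]);
    }

    /* Unrank: recover the value of a card from its rank. */
    char value(int card)
    {
        return (values[card / NSUITS]);
    }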

Assigned Problems

110201 (Jolly Jumpers) - Do the distances between neighbors in a sequence of n numbers realize all values from 1 to n-1? What data structure should you use? What graphs (other than a path) allow such structures?

110204 (Crypt Kicker) - Decode an alphabet permutation-encrypted message using a dictionary of words. What constraints does the dictionary imply?

110205 (Stack 'em Up) - Rearrange a deck of cards according to a set of allowable shuffle operations. How do shuffles operate as rearrangement operations (permutations)? How good are the traditional perfect shuffles at mixing up a deck?

110208 (Yahtzee) - How do we assign dice rolls to categories so as to maximize our score? Do we need to try all possibilities, or can we be more clever?

2003-02-04