CSCE 221 Chapter 8
« previous  Tuesday, February 22, 2011  next »
Dictionaries and Hashing
 A searchable collection of keyelement items.
 Multiple items allowed same key
Functions
Function  Description 

int size()  Number of items in dictionary 
bool isEmpty()  Whether dictionary is empty 
ObjectIterator elements()  Return elements stored in dictionary 
ObjectIterator keys()  Return keys stored in dictionary 
Position find(k)  Return position of an item with key equal to k 
PositionIterator findAll(k)  Returns an iterator of all keys that match k 
void insertItem(k, o)  Insert element o into the dictionary with key k 
void removeElement(k)  Remove an element with key equal to k 
void removeAllElements(k)  Removes all elements with key equal to k 
Log File
Dictionary implementation with an unordered sequence
 insertItem() is
 find() and removeElement() are
Good for insertion but not for retrieval (good for logging access or queries)
Direct Address Table
Functions similar to Vector in which the index can be used
Keys are and are not allowed to be repeated
insertItem(), find(), and removeElement() are
Hash Tables
Keys associated as "addresses"
Bucket Array
Main array of size where cells act like containers for elements. Keys from hash function h() ranges from
Transform key to index that will have nice properties
Avoid collisions where some keys map to same index of H
Chaining
have a singlylinked list come off of every position in the table. As keys fill up, collisions result in the item being added to the tail of the linked list. (Requires space)
Open addressing
Store in another cell of table. For deletion, mark item as inactive to show that an item used to be there.
 Linear probing
 Put item in next slot over until it runs into an open slot
 possible probe sequences
 Quadratic probing
 Look in i^2 slots
 possible probe sequences
 Double hashing
 use a second hash function that places item sin the next available cell using
 , otherwise secondary hash might not find an empty cell. In other words, has to be prime.
 secondary compression function where is prime.
 load factor (how full the table is
 uniform hashing assumption: assuming hash values are like random numbers, expected number of probes for insertion is
 possible probe sequences
Hash Function
Composition of 2 functions such that :
 Hash code map:
 Compression map:
Goal is to disperse keys randomly and evenly
Modular Arithmetic is used heavily
 Division:
 Multiply, Add, Divide:
Ordered Dictionary
Tuesday, March 1, 2011
Additional functions:
 closestBefore(k) return position of item with largest key ≤
 closestAfter(k) return position of item with smallest key ≥
Lookup Table
Dictionary implemented using a sorted sequence Performance:
find()  
insertItem()  
removeElement() 
Binary Search
Must be given random access to sorted sequence in order for this to work.
 Look at middle item
 if middle is what we're looking for, return it
 if middle <, recurse on right
 if middle >, recurse on left
Skip List
Bunch of lists with each "level" list containing a random subset of the level before it Special keys and
Searching
Find :
 Start at
 compare with next position
 : return element(after(current))
 : scan forward (current = after(current))
 : drop down (current = below(current))
 If we drop down past bottom list, key does not exist
Randomized Algorithms
Digital "coin tosses"
Insertion
 "flip coin" until we get "tails" (let # flips = )
 if (height of skip list), add new lists
 search for in the skip list and find positions of items with largest key less than in each list
 Insert entries into each list before positions found earlier
Analysis
Everything depends on random bits used by each insert
Known "facts":
 Probability of getting consecutive heads is
 If each of items is present in a set with probability , expected size of set is
 If each of events has a probability , the probability that at least one event occurs is at most
Expected space usage is
Expected height is
Summary
Dictionary  insertItem()  find() and deleteItem()  space 

Log File  
Direct Address Table  (unrelated to )  
Hash Table (unsorted chaining)  w.c. a.c. b.c. 

Hash Table (sorted chaining)  w.c. a.c. b.c. 
w.c. a.c. b.c. 

Hashing (open addressing) 
Hash Codes
Questions
 I was never able to fully grasp the concept of bitwise operators. Can we review?
 What is the difference between highorder and loworder bits?