CSCE 221 Chapter 8
« previous | Tuesday, February 22, 2011 | next »
Dictionaries and Hashing
- A searchable collection of key-element items.
- Multiple items allowed same key
Functions
Function | Description |
---|---|
int size() | Number of items in dictionary |
bool isEmpty() | Whether dictionary is empty |
ObjectIterator elements() | Return elements stored in dictionary |
ObjectIterator keys() | Return keys stored in dictionary |
Position find(k) | Return position of an item with key equal to k |
PositionIterator findAll(k) | Returns an iterator of all keys that match k |
void insertItem(k, o) | Insert element o into the dictionary with key k |
void removeElement(k) | Remove an element with key equal to k |
void removeAllElements(k) | Removes all elements with key equal to k |
Log File
Dictionary implementation with an unordered sequence
- insertItem() is
- find() and removeElement() are
Good for insertion but not for retrieval (good for logging access or queries)
Direct Address Table
Functions similar to Vector in which the index can be used
Keys are and are not allowed to be repeated
insertItem(), find(), and removeElement() are
Hash Tables
Keys associated as "addresses"
Bucket Array
Main array of size where cells act like containers for elements. Keys from hash function h() ranges from
Transform key to index that will have nice properties
Avoid collisions where some keys map to same index of H
Chaining
have a singly-linked list come off of every position in the table. As keys fill up, collisions result in the item being added to the tail of the linked list. (Requires space)
Open addressing
Store in another cell of table. For deletion, mark item as inactive to show that an item used to be there.
- Linear probing
- Put item in next slot over until it runs into an open slot
- possible probe sequences
- Quadratic probing
- Look in i^2 slots
- possible probe sequences
- Double hashing
- use a second hash function that places item sin the next available cell using
- , otherwise secondary hash might not find an empty cell. In other words, has to be prime.
- secondary compression function where is prime.
- load factor (how full the table is
- uniform hashing assumption: assuming hash values are like random numbers, expected number of probes for insertion is
- possible probe sequences
Hash Function
Composition of 2 functions such that :
- Hash code map:
- Compression map:
Goal is to disperse keys randomly and evenly
Modular Arithmetic is used heavily
- Division:
- Multiply, Add, Divide:
Ordered Dictionary
Tuesday, March 1, 2011
Additional functions:
- closestBefore(k) return position of item with largest key ≤
- closestAfter(k) return position of item with smallest key ≥
Lookup Table
Dictionary implemented using a sorted sequence Performance:
find() | |
insertItem() | |
removeElement() |
Binary Search
Must be given random access to sorted sequence in order for this to work.
- Look at middle item
- if middle is what we're looking for, return it
- if middle <, recurse on right
- if middle >, recurse on left
Skip List
Bunch of lists with each "level" list containing a random subset of the level before it Special keys and
Searching
Find :
- Start at
- compare with next position
- : return element(after(current))
- : scan forward (current = after(current))
- : drop down (current = below(current))
- If we drop down past bottom list, key does not exist
Randomized Algorithms
Digital "coin tosses"
Insertion
- "flip coin" until we get "tails" (let # flips = )
- if (height of skip list), add new lists
- search for in the skip list and find positions of items with largest key less than in each list
- Insert entries into each list before positions found earlier
Analysis
Everything depends on random bits used by each insert
Known "facts":
- Probability of getting consecutive heads is
- If each of items is present in a set with probability , expected size of set is
- If each of events has a probability , the probability that at least one event occurs is at most
Expected space usage is
Expected height is
Summary
Dictionary | insertItem() | find() and deleteItem() | space |
---|---|---|---|
Log File | |||
Direct Address Table | (unrelated to ) | ||
Hash Table (unsorted chaining) | w.c. a.c. b.c. |
||
Hash Table (sorted chaining) | w.c. a.c. b.c. |
w.c. a.c. b.c. |
|
Hashing (open addressing) |
Hash Codes
Questions
- I was never able to fully grasp the concept of bitwise operators. Can we review?
- What is the difference between high-order and low-order bits?