public class PatriciaST<Value> extends Object
PatriciaST
class provides an implementation of an unordered
symbol table of key-value pairs, with the restriction that the key is of
class String
. It supports the usual put,
get, contains, delete, size, and
is-empty methods. It also provides a keys method for
iterating over all of the keys. A symbol table implements the
associative array abstraction: when associating a value with a key
that is already in the symbol table, the convention is to replace the old
value with the new value. Unlike Map
, this class uses the
convention that values cannot be null
—setting the value
associated with a key to null
is equivalent to deleting the key
from the symbol table.
This unordered symbol table class implements PATRICIA (Practical Algorithm to Retrieve Information Coded In Alphanumeric). In spite of the acronym, string keys are not limited to alphanumeric content. A key may possess any string value, except for the string of zero length (the empty string).
Unlike other generic symbol table implementations that can accept a
parameterized key type, this symbol table class can only accommodate keys
of class String
. This unfortunate restriction stems from a
limitation in Java. Although Java provides excellent support for generic
programming, the current infrastructure somewhat limits generic collection
implementations to those that employ comparison-based or hash-based methods.
PATRICIA does not employ comparisons or hashing; instead, it relies on
bit-test operations. Because Java does not furnish any generic abstractions
(or implementations) for bit-testing the contents of an object, providing
support for generic keys using PATRICIA does not seem practical.
PATRICIA is a variation of a trie, and it is often classified as a space-optimized trie. In a classical trie, each level represents a subsequent digit in a key. In PATRICIA, nodes only exist to identify the digits (bits) that distinguish the individual keys within the trie. Because PATRICIA uses a radix of two, each node has only two children, like a binary tree. Also like a binary tree, the number of nodes, within the trie, equals the number of keys. Consequently, some classify PATRICIA as a tree.
The analysis of PATRICIA is complicated. The theoretical wost-case performance for a get, put, or delete operation is O(N), when N is less than W (where W is the length in bits of the longest key), and O(W), when N is greater than W. However, the worst case is unlikely to occur with typical use. The average (and usual) performance of PATRICIA is approximately ~lg N for each get, put, or delete operation. Although this appears to put PATRICIA on the same footing as binary trees, this time complexity represents the number of single-bit test operations (under PATRICIA), and not full-key comparisons (as required by binary trees). After the single-bit tests conclude, PATRICIA requires just one full-key comparison to confirm the existence (or absence) of the key (per get, put, or delete operation).
In practice, decent implementations of PATRICIA can often outperform balanced binary trees, and even hash tables. Although this particular implementation performs well, the source code was written with an emphasis on clarity, and not performance. PATRICIA performs admirably when its bit-testing loops are well tuned. Consider using the source code as a guide, should you need to produce an optimized implementation, for another key type, or in another programming language.
Other resources for PATRICIA:
Sedgewick, R. (1990) Algorithms in C, Addison-Wesley
Knuth, D. (1973) The Art of Computer Programming, Addison-Wesley
Constructor and Description |
---|
PatriciaST()
Initializes an empty PATRICIA-based symbol table.
|
Modifier and Type | Method and Description |
---|---|
boolean |
contains(String key)
Returns
true if the key-value pair, specified by the given
key, exists within the symbol table. |
void |
delete(String key)
Removes a key and its associated value from the symbol table, if it
exists.
|
Value |
get(String key)
Retrieves the value associated with the given key.
|
Iterable<String> |
keys()
Returns all keys in the symbol table as an
Iterable . |
static void |
main(String[] args)
Unit tests the
PatriciaST data type. |
void |
put(String key,
Value val)
Places a key-value pair into the symbol table.
|
public void put(String key, Value val)
null
, then the key becomes removed
from the symbol table.key
- the keyval
- the valueIllegalArgumentException
- if key
is null
IllegalArgumentException
- if key
is the empty string.public Value get(String key)
key
- the keynull
if the key is not in the symbol tableIllegalArgumentException
- if key
is null
IllegalArgumentException
- if key
is the empty string.public void delete(String key)
key
- the keyIllegalArgumentException
- if key
is null
IllegalArgumentException
- if key
is the empty string.public boolean contains(String key)
true
if the key-value pair, specified by the given
key, exists within the symbol table.key
- the keytrue
if this symbol table contains the given
key
and false
otherwiseIllegalArgumentException
- if key
is null
IllegalArgumentException
- if key
is the empty string.public Iterable<String> keys()
Iterable
.
To iterate over all of the keys in the symbol table named
st
, use the foreach notation:
for (Key key : st.keys())
.Iterable
public static void main(String[] args)
PatriciaST
data type.
This test fixture runs a series of tests on a randomly generated dataset.
You may specify up to two integer parameters on the command line. The
first parameter indicates the size of the dataset. The second parameter
controls the number of passes (a new random dataset becomes generated at
the start of each pass).args
- the command-line arguments