public class SuffixArrayX extends Object
The SuffixArrayX class represents a suffix array of a string of length n. It supports selecting the ith smallest suffix, getting the index of the ith smallest suffix, computing the length of the longest common prefix between the ith smallest suffix and the (i-1)st smallest suffix, and determining the rank of a query string (the number of suffixes strictly less than the query string).
This implementation uses 3-way radix quicksort to sort the array of suffixes.
For a simpler (but less efficient) implementation of the same API, see
The index and length operations take constant time
in the worst case. The lcp operation takes time proportional to the
length of the longest common prefix.
The select operation takes time proportional
to the length of the suffix and should be used primarily for debugging.
This implementation uses '\0' as a sentinel and assumes that the character '\0' does not appear in the text.
In practice, this algorithm runs very fast. However, in the worst case its performance can be very poor (e.g., for a string consisting of N copies of the same character). We do not shuffle the array of suffixes before sorting because shuffling is relatively expensive, and a pathological input for which the suffixes start out in a bad order (e.g., sorted) is likely to be a bad input for this algorithm with or without the shuffle.
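The sorting step described above can be illustrated with a minimal sketch of 3-way radix quicksort applied to suffix start positions. The class name SuffixSortSketch and its helper methods are hypothetical names chosen for this example; this is not the library's source, only an illustration under the stated assumption that '\0' does not occur in the text.

```java
// Hypothetical sketch class; not the SuffixArrayX source code.
class SuffixSortSketch {

    // Character d of the suffix starting at position i,
    // or the '\0' sentinel once the suffix is exhausted.
    private static char charAt(char[] text, int i, int d) {
        return (i + d < text.length) ? text[i + d] : '\0';
    }

    // Returns the start positions of the suffixes of s in sorted order.
    static int[] sortSuffixes(String s) {
        char[] text = s.toCharArray();
        int[] index = new int[text.length];
        for (int i = 0; i < index.length; i++) index[i] = i;
        sort(text, index, 0, index.length - 1, 0);
        return index;
    }

    // 3-way radix quicksort of index[lo..hi]; all suffixes in this
    // range are assumed to agree on their first d characters.
    private static void sort(char[] text, int[] index, int lo, int hi, int d) {
        if (hi <= lo) return;
        int lt = lo, gt = hi, i = lo + 1;
        char v = charAt(text, index[lo], d);   // partitioning character
        while (i <= gt) {
            char t = charAt(text, index[i], d);
            if      (t < v) swap(index, lt++, i++);
            else if (t > v) swap(index, i, gt--);
            else            i++;
        }
        sort(text, index, lo, lt - 1, d);                 // smaller dth character
        if (v != '\0') sort(text, index, lt, gt, d + 1);  // equal: move to next character
        sort(text, index, gt + 1, hi, d);                 // larger dth character
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i];
        a[i] = a[j];
        a[j] = tmp;
    }
}
```

For example, sortSuffixes("banana") returns [5, 3, 1, 0, 4, 2], corresponding to the suffixes a, ana, anana, banana, na, nana. On a string of repeated characters the recursion on the equal partition advances one character at a time over nearly the whole range, which is the poor worst case mentioned above.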
For additional documentation, see Section 6.3 of Algorithms, 4th Edition by Robert Sedgewick and Kevin Wayne.
|Constructor and Description|
|SuffixArrayX(String text): Initializes a suffix array for the given input string.|
|Modifier and Type|Method and Description|
|int|index(int i): Returns the index into the original string of the ith smallest suffix.|
|int|lcp(int i): Returns the length of the longest common prefix of the ith smallest suffix and the (i-1)st smallest suffix.|
|int|length(): Returns the length of the input string.|
|static void|main(String[] args): Unit tests the SuffixArrayX data type.|
|int|rank(String query): Returns the number of suffixes strictly less than the query string.|
|String|select(int i): Returns the ith smallest suffix as a string.|
public SuffixArrayX(String text)
text - the input string
public int length()
public int index(int i)
text.substring(sa.index(i)) is the ith smallest suffix.
i - an integer between 0 and n-1
0 <= i < n
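The relationship between index and select can be sketched as follows. This example builds the sorted order naively by comparing suffix substrings, which is a simpler technique swapped in for illustration and not the 3-way radix quicksort this class uses. The class IndexSelectSketch and its method names are hypothetical, and throwing IllegalArgumentException for an out-of-range argument is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch class; not the SuffixArrayX source code.
class IndexSelectSketch {

    // Naive construction: sort the start positions by comparing the
    // corresponding suffix substrings (simple, but slow for long texts).
    static int[] buildIndex(String text) {
        Integer[] order = new Integer[text.length()];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparing((Integer p) -> text.substring(p)));
        int[] index = new int[order.length];
        for (int i = 0; i < index.length; i++) index[i] = order[i];
        return index;
    }

    // The ith smallest suffix is the substring starting at index[i].
    static String select(String text, int[] index, int i) {
        // Assumption of this sketch: reject arguments outside 0 <= i < n.
        if (i < 0 || i >= index.length) throw new IllegalArgumentException("index out of range: " + i);
        return text.substring(index[i]);
    }
}
```

With text "banana", buildIndex yields [5, 3, 1, 0, 4, 2], so select with i = 0 gives "a" and select with i = 5 gives "nana".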
public int lcp(int i)
i - an integer between 1 and n-1
1 <= i < n
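One way to compute this value is to scan the two adjacent suffixes character by character until they differ, which takes time proportional to the length of the common prefix. The sketch below assumes a precomputed array of sorted suffix start positions; the class LcpSketch, its parameters, and the IllegalArgumentException check are hypothetical choices for this example.

```java
// Hypothetical sketch class; not the SuffixArrayX source code.
class LcpSketch {

    // Length of the longest common prefix of the suffixes of text
    // that start at positions p and q.
    static int lcp(String text, int p, int q) {
        int length = 0;
        while (p < text.length() && q < text.length()
                && text.charAt(p) == text.charAt(q)) {
            p++;
            q++;
            length++;
        }
        return length;
    }

    // Longest common prefix of the ith and (i-1)st smallest suffixes,
    // where index holds the suffix start positions in sorted order.
    static int lcp(String text, int[] index, int i) {
        // Assumption of this sketch: reject arguments outside 1 <= i < n.
        if (i < 1 || i >= index.length) throw new IllegalArgumentException("index out of range: " + i);
        return lcp(text, index[i], index[i - 1]);
    }
}
```

For text "banana" with sorted start positions [5, 3, 1, 0, 4, 2], the values for i = 1 through 5 are 1, 3, 0, 0, 2.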
public String select(int i)
i - the index
0 <= i < n
public int rank(String query)
Returns the number of suffixes strictly less than the query string. We note that rank(select(i)) equals i for each i between 0 and n-1.
query - the query string
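A rank query of this kind can be answered by binary search over the sorted suffixes, comparing the query against each probed suffix in place. The sketch below is an illustration under the assumption that a sorted array of suffix start positions is already available; the class RankSketch and its method signatures are hypothetical and are not taken from this class.

```java
// Hypothetical sketch class; not the SuffixArrayX source code.
class RankSketch {

    // Compares query with the suffix of text starting at position start.
    // Returns a negative, zero, or positive value if query is less than,
    // equal to, or greater than that suffix.
    static int compare(String query, String text, int start) {
        int i = 0, j = start;
        while (i < query.length() && j < text.length()) {
            if (query.charAt(i) != text.charAt(j)) {
                return query.charAt(i) - text.charAt(j);
            }
            i++;
            j++;
        }
        if (i < query.length()) return +1;  // suffix is a proper prefix of query
        if (j < text.length())  return -1;  // query is a proper prefix of suffix
        return 0;
    }

    // Number of suffixes strictly less than query, where index holds
    // the suffix start positions in sorted order.
    static int rank(String query, String text, int[] index) {
        int lo = 0, hi = index.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            int cmp = compare(query, text, index[mid]);
            if      (cmp < 0) hi = mid - 1;
            else if (cmp > 0) lo = mid + 1;
            else              return mid;
        }
        return lo;
    }
}
```

For text "banana" with sorted start positions [5, 3, 1, 0, 4, 2], the rank of "an" is 1 (only "a" is smaller) and the rank of "z" is 6. Passing the ith smallest suffix as the query returns i, matching the identity noted above.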
public static void main(String[] args)
args - the command-line arguments