Introduction
What is a Suffix Array?
Suffix Array: It is a sorted array of all suffixes of a given string. It is a type of data structure used in full-text indices and data compression algorithms, and generally, it is used in any area that deals with strings and string matching problems.

The suffix array contains integers representing the starting indexes of a given string's suffixes after the suffixes are sorted.
Different approaches
Two widely used approaches for constructing a Suffix Array are the Naive Approach and the Optimized Approach. We will be discussing the Optimized Approach in depth.
Naive Approach
It is also known as the O(n2logn) approach. In this approach, we get all the suffixes, sort them using quicksort or mergesort, and simultaneously retain their original indices.
Code for obtaining a suffix array in Java:
import java.util.Arrays;
import java.util.ArrayList;
class suffixArray {
public static void main(String[] args) throws Exception
{
ArrayList<Integer> suffixIndex = new ArrayList<Integer>();
String w = "CodingNinjas";
String ar1[] = new String[w.length()];
String ar2[] = new String[w.length()];
int suffixArray[] = new int[w.length()];
int i;
for (i = 0; i < w.length(); i++) {
ar1[i] = w.substring(i);
}
ar2 = ar1.clone();
Arrays.sort(ar1);
for (String s : ar1) {
String p = s;
int index = new suffixArray().index(ar2, p);
suffixIndex.add(index);
}
for (i = 0; i < suffixArray.length; i++) {
suffixArray[i] = suffixIndex.get(i);
}
System.out.println("The suffix array for CodingNinjas");
for (var c : suffixArray) {
System.out.print(c + " ");
}
}
//A function that returns the index of item from array a[]
int index(String a[], String item)
{
for (int i = 0; i < a.length; i++) {
if (item == a[i])
return i;
}
return -1;
}
}
Output
The suffix array for CodingNinjas
0 6 10 2 5 3 7 9 4 8 1 11
Optimized Approach
This approach is also known as O(nlogn) because our aim in the approach is to reduce the time from O(n2logn) to O(nlogn).
- In this approach, first, we sort all the suffixes according to the starting (first) character; after that, we sort them according to the first two characters, followed by the first four characters, and till the number of characters is less than 2n.
-
Once we have sorted all the suffixes according to the first 2i characters, we can then sort suffixes according to the first 2i+1 characters in O(n log n) time using a sorting algorithm of the type (n Log n) like Merge Sort.
First, by giving each suffix a rank using the first character's ASCII value, we will initially sort by the first two characters.
Index |
Suffix |
Rank |
0 |
banana |
1 |
1 |
anana |
0 |
2 |
nana |
13 |
3 |
ana |
0 |
4 |
na |
13 |
5 |
a |
0 |
For each character, we will also store the rank of the next adjacent character (It is required to sort the suffixes according to the first two characters). If the character is the last, we store the next rank as -1.
Index |
Suffix |
Rank |
Next Rank |
0 |
banana |
1 |
0 |
1 |
anana |
0 |
13 |
2 |
nana |
13 |
0 |
3 |
ana |
0 |
13 |
4 |
na |
13 |
0 |
5 |
a |
0 |
-1 |
Then, all Suffixes will be sorted by rank and adjacent rank. The first digit, or MSD, is regarded as rank, and the second digit is considered as the adjacent rank.
Index |
Suffix |
Rank |
Adjacent Rank |
5 |
a |
0 |
-1 |
1 |
anana |
0 |
13 |
3 |
ana |
0 |
13 |
0 |
banana |
1 |
0 |
2 |
nana |
13 |
0 |
4 |
na |
13 |
0 |
Now sorting according to the first four characters.
All of the suffixes will be assigned new ranks. We will consider the sorted suffixes while allocating new rankings. We'll start by giving the first Suffix the new rank of zero. When allocating ranks to the remaining suffixes, we will consider the rank pair of suffixes immediately before the current Suffix. Give a suffix the same rank if its previous rank pair matches the previous rank of the Suffix that came before it. If not, give the previous Suffix's rank plus one.
Index |
Suffix |
Rank |
|
5 |
a |
0 |
Assigning 0 to first |
1 |
anana |
1 |
(0,13) is different from previous |
3 |
ana |
1 |
(0, 13) is the same as previous |
0 |
banana |
2 |
(1, 0) is different from previous |
2 |
nana |
3 |
(13, 0) is different from previous |
4 |
na |
3 |
(13, 0) is the same as previous |
For every suffix str[i], we also store the rank of the next suffix at str[i + 2]. If no next suffix exists at i + 2, we store the next rank as -1.
Index |
Suffix |
Rank |
Next Rank |
5 |
a |
0 |
-1 |
1 |
anana |
1 |
1 |
3 |
ana |
1 |
0 |
0 |
banana |
2 |
3 |
2 |
nana |
3 |
3 |
4 |
na |
3 |
-1 |
We sort all the suffixes according to the rank and the next rank.
Index |
Suffix |
Rank |
Next Rank |
5 |
a |
0 |
-1 |
1 |
anana |
1 |
0 |
3 |
ana |
1 |
1 |
0 |
banana |
2 |
3 |
2 |
nana |
3 |
-1 |
4 |
na |
3 |
3 |
Code for Optimized Approach:
import java.util.*;
class suffixArray
{
// Class to store information of a suffix
public static class Suffix implements Comparable<Suffix>
{
int index;
int rank;
int next;
public Suffix(int ind, int r, int nr)
{
index = ind;
rank = r;
next = nr;
}
// A comparison function used by sort()
// to compare two suffixes.
// Compares two pairs, returns 1
// if first pair is smaller
public int compareTo(Suffix s)
{
if (rank != s.rank) return Integer.compare(rank, s.rank);
return Integer.compare(next, s.next);
}
}
// This function takes a string 'txt' of size n as an argument
//builds and returns the suffix array for the given string
public static int[] suffixArray(String s)
{
int n = s.length();
Suffix[] su = new Suffix[n];
// Store suffixes and their indexes in
// an array of classes. The class is needed
// to sort the suffixes alphabetically and
// maintain their old indexes while sorting
for (int i = 0; i < n; i++)
{
su[i] = new Suffix(i, s.charAt(i) - '$', 0);
}
for (int i = 0; i < n; i++)
su[i].next = (i + 1 < n ? su[i + 1].rank : -1);
// Sort the suffixes using the comparison function
// defined above.
Arrays.sort(su);
// Here, all suffixes are sorted
//according to their first 2 characters.
// Let us now sort suffixes according to first 4
// characters, then first 8 and so on..
// characters, then first 8 and so on
int[] ind = new int[n];
for (int length = 4; length < 2 * n; length <<= 1)
{
// Assigning rank and index values to first suffix
int rank = 0, prev = su[0].rank;
su[0].rank = rank;
ind[su[0].index] = 0;
for (int i = 1; i < n; i++)
{
// If first rank and next ranks are same as
// that of previous suffix in array,
// assign the same new rank to this suffix
if (su[i].rank == prev &&
su[i].next == su[i - 1].next)
{
prev = su[i].rank;
su[i].rank = rank;
}
else
{
// Otherwise increment rank and assign
prev = su[i].rank;
su[i].rank = ++rank;
}
ind[su[i].index] = i;
}
// Assign next rank to every suffix
for (int i = 0; i < n; i++)
{
int nextP = su[i].index + length / 2;
su[i].next = nextP < n ?
su[ind[nextP]].rank : -1;
}
// Sort the suffixes according
// to first k characters
Arrays.sort(su);
}
// Store indexes of all sorted
// suffixes in the suffix array
int[] suf = new int[n];
for (int i = 0; i < n; i++)
suf[i] = su[i].index;
// Return the suffix array
return suf;
}
static void printArr(int arr[], int n)
{
for (int i = 0; i < n; i++)
System.out.print(arr[i] + " ");
System.out.println();
}
public static void main(String[] args)
{
String txt = "banana";
int n = txt.length();
int[] suff_arr = suffixArray(txt);
System.out.println("The suffix array for banana: ");
printArr(suff_arr, n);
}
}
Output
The suffix array for banana: 5 3 1 0 4 2







