Why Suffix array?
When learning any new data structure, one should be curious why another data structure? You must have thought about this while deciding to learn suffix arrays, right?
Suffix arrays are a very useful Data Structure and extremely easy to implement. Moreover, it allows for efficient string matching. Some of the applications of suffix arrays are in data compression algorithms, bioinformatics and full-text indexes. From the point of view of problem-solving, they can be used in any area dealing with string and pattern finding problems.
It is quite related to the suffix tree data structure.
Now that you know what a suffix array is and why is it important, it's time to dive into constructing a suffix array.
Construction of Suffix Array
We will use the most naive approach, and hence the implementation is quite intuitive and simple.
Steps:
- Store all the suffixes of the given string in an array along with their starting indexes.
- Sort the array in ascending order.
- Finally, create an array of integers and simply store the indexes of the sorted suffixes as obtained above, which is the required suffix array.
Note: After reading the Algorithm part, it is recommended to try writing the code on your own before reading the solution code.
Pseudo-Code to construct the suffix array
This program constructs the suffix array of a given string.
In the main function
Input the string s
Create a vector of pairs and sore the suffixes of string s along with its index
sort the suffixes array
Print the suffix array
Let’s see the code implementation along with the analysis of time and space complexity in the next section.
C++ Implementation
/*
C++ Program to construct the suffix array of a given string
*/
#include <bits/stdc++.h>
using namespace std;
int main()
{
string s = "Coding Ninjas Studio";
int len = s.length();
vector<pair<string, int>> suffixes;
for (int i = 0; i < len; i++)
{
suffixes.push_back({s.substr(i), i});
}
sort(suffixes.begin(), suffixes.end());
cout << "The suffix array of the string " << s << " is:" << endl;
for (int i = 0; i < len; i++)
{
cout << suffixes[i].second << " ";
}
cout << endl;
cout << "The sorted suffixes of the string are:\n";
for (int i = 0; i < len; i++)
{
cout << suffixes[i].second << "-->" << suffixes[i].first << endl;
}
}

You can also try this code with Online C++ Compiler
Run Code
Output
The suffix array of the string Coding Ninjas Studio is:
0 2 7 3 8 9 1 4 5 6
The sorted suffixes of the string are:
0-->Coding Ninjas Studio
2-->destudio
7-->dio
3-->estudio
8-->io
9-->o
1-->odestudio
4-->studio
5-->tudio
6-->udio
Time Complexity
O(N2logN), where N = length of the given string.
There are N suffixes in total, and sorting them takes O(NlogN) time, but as we use comparison-based sorting(quicksort or mergesort) algorithms, comparing two strings takes O(N) time in the worst case. Hence, the total time complexity turns out to be O(N*NlogN) = O(N2logN).
Space Complexity
O(N), where N = length of the given string. Since, we use an additional space of size N to store all the suffixes of the given string.
Also check out - Substr C++ and Euclid GCD Algorithm
Applications of Suffix Array
-
Searching a string x in another string s can be done easily using a suffix array. Since the suffix array contains the order of the suffixes of the string s, we can do a binary search over the suffix array. Thus, the worst-case time complexity of the search will be O(|x|log(|s|)).
-
It can be used for comparing two substrings of a string.
- It can be used to find the longest common prefix of two substrings.
Frequently Asked Questions
-
What is the disadvantage of suffix trees over suffix array search?
The disadvantage of suffix trees over suffix arrays is that they generally require more space to store all the internal pointers in the tree.
-
What are the pros and cons of suffix array over suffix trees?
The suffix tree is lighter and faster than the trie and is used to index DNA or optimize large web search engines. The suffix array is slower in some pattern searches than the suffix tree but uses less space and is more widely used than the Suffix tree.
-
How to construct a suffix array from a suffix tree?
The suffix array can be constructed by doing a depth-first traversal of the suffix tree.
Key Takeaways
In this article, we learnt about suffix array, its introduction and saw its applications in problem-solving. We also saw the naive algorithm to construct the suffix array for any given string and the pseudocode and C++ implementation.
Recommended Problem - K Closest Points To Origin
There are many other efficient algorithms to construct suffix arrays. We will cover all those in our future blogs. So, stay tuned.
Are you planning to ace the interviews of reputed product-based companies like Amazon, Google, Microsoft, and more? Attempt our Online Mock Test Series on Coding Ninjas Studio now!