Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
Suffix Tree
2.1.
Example of Suffix tree
3.
Generalized suffix tree
3.1.
Example of Generalized suffix tree
4.
Application of Generalized suffix tree
4.1.
Longest Common Substring
5.
FAQs
6.
Key Takeaways
Last Updated: Mar 27, 2024

Generalized Suffix Tree

Author Ujjawal Gupta
0 upvote
Master Python: Predicting weather forecasts
Speaker
Ashwin Goyal
Product Manager @

Introduction

This blog will discuss the topic of Generalized suffix trees. A generalized suffix tree is a crucial topic and is widely asked in various coding contests and multiple interviews. This blog will discuss what a generalized suffix tree is and its applications. We will also discuss how we can implement the generalized suffix tree.

Suffix Tree

A suffix tree is a Data Structure that stores all the possible suffixes of any string. It is implemented using a tree data structure. You can learn more about suffix trees here.

Example of Suffix tree

Supposes we have to store all the suffixes of string “nonsense.” All the suffix possible is shown in the below diagram.

Source

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Generalized suffix tree

The suffix tree stores the suffix of only a single string. But in real life, we have to deal with the data of more than one string. So we need a data structure that stores all the suffix of more than one string, and the data structure is known as a generalized suffix tree. As name contains ‘Generalized’, which means it is generalized from the suffix tree. 

A generalized suffix tree for strings S1, S2, S3….Sk is the suffix tree of S1, S2, S3….Sk but the leaf node label has the position in the string and an index of which string it refers to. The leaves are labeled with i:j, which means jth suffix of string Ti.

Example of Generalized suffix tree

Let's draw the suffix tree for two strings say, T1 = “nonsense” and T2 = “offense”. As shown in the diagram below, each leaf node stores the index starting index of the suffix of both the strings. This tree contains the suffix tree of both the strings; that's why this is the generalized suffix tree.

Source

Application of Generalized suffix tree

Longest Common Substring

We can use a generalized suffix tree to find the longest common substring of two strings. The main idea behind this algorithm is that every substring in one string is the prefix of some suffix of that string. The algorithm to find the longest common substring between two strings is discussed below:

  • Build the generalized suffix tree of S1 and S2.
  • Annotate each internal node in the tree with whether that node has at least one leaf node from each of S1 and S2.
  • Run a DFS Algorithm over the tree to find the marked node with the largest string depth.

FAQs

  1. What are the advantages of Suffix Tree over Tries?
    Suffix Trees consume lesser space, and when you are storing a vast text, the difference between linear and quadratic space complexity can be huge. Suffix tree use can also reduce lookup time.
     
  2. What are the similarities between the suffix tree and the generalized suffix tree?
    Both suffix trees and generalized suffix trees are implemented using a tree data structure, and both are used to perform operations in the suffix of the string.
     
  3. What are the differences between the suffix tree and the generalized one? 
    The suffix tree is made only for a single string, whereas the generalized suffix tree is used for more than one string.
     
  4. What are the applications of the Suffix Tree?
    While there are many applications of this data structure, the fundamental ones are searching a string in a Text in O(len(string)) time for multiple checks, searching for longest repeated string, longest common substring, etc. It's also used for searching patterns in DNA or protein sequences.
     
  5. Which data structure is used in implementing the suffix tree and generalized suffix tree?
    The tree data structure is used in implementing the suffix tree and generalized suffix tree. We can also use the Trie data structure.

Key Takeaways

In this blog, we learned about an exciting topic, the Generalized suffix tree. We have discussed the generalized suffix tree and its applications. You can learn more similar topics: here.

Check out this problem - Longest Common Prefix

Hence learning never stops, and there is a lot more to learn.

So head over to our practice platform Coding Ninjas Studio to practice top problems, attempt mock tests, read interview experiences, and much more. Till then, Happy Coding!

Previous article
Suffix String Rank Problem
Next article
Substrings and Repetitions
Live masterclass