Do you think IIT Guwahati certified course can help you in your career?
No
Introduction
Trie is a special type of search tree Data Structure that is used as an efficient retrieval or searching data structure as searching time can be reduced to the optimal limit of pattern length. Here we will see one of the important implementations of trie for pattern searching.
But we will use standard trie of all suffixes to implement the pattern searching algorithm.
Problem statement
You are given a text or string. You need to find all occurrences of given queries of patterns in the text.
Input txt="minimize" pattr="mi"
Output Pattern found at position 0 Pattern found at position 4
Note: Please try to solve the problem first and then see the solution below.
Approach
Step 1: building trie with all suffixes of the given text
First, generate all suffixes of the given string
Then, build the trie considering every suffix as individual words.
Step 2: pattern searching in trie
Start from the root of the trie and the first character of the given pattern.
If there is an edge from the root for the current character, follow the edge.
Thus following the edges one by one for the current character if all characters of the given pattern are processed, the pattern is present in the text. We will store indexes of the suffixes starting at the current node.
If there is not any edge for the current character of the pattern, then the pattern is not present in the given text.
//c++ implementation of pattern Searching using a Trie of all Suffixes
#include<iostream>
#include<list>
#define MAX_CHAR 256
using namespace std;
//node of trie with all suffixes
class TrieNode
{
private:
TrieNode *children[MAX_CHAR];
list<int> *indexes;
public:
// constructor
TrieNode()
{
// Creating an empty linked list for storing indexes of
// suffixes starting from the current node
indexes = new list<int>;
for (int i = 0; i < MAX_CHAR; i++)
{
children[i] = NULL; // initializing all child pointers as NULL
}
}
// function to insert a suffix of the given txt
// in subtree rooted with this node
void insertSuffix(string suffix, int index)
{
// Store index in linked list
indexes->push_back(index);
// If string has more characters
if (suffix.length() > 0)
{
char cIndex = suffix.at(0);// first character
//add a new edge if there is no edge for this character
if (children[cIndex] == NULL)
children[cIndex] = new TrieNode();
// Recur for next suffix
children[cIndex]->insertSuffix(suffix.substr(1), index+1);
}
}
// A function for the pattern searching in subtree rooted
// with this node which returns pointer to a linked
// list containing all indexes where pattern is present.
list<int>* search(string pat)
{
// return the list of indexes if all characters of pattern is processed,
if (pat.length() == 0)
return indexes;
// if there is an edge for the current character, then follow the edge.
if (children[pat.at(0)] != NULL)
return (children[pat.at(0)])->search(pat.substr(1));
// pattern doesn’t exist in the text If there is no edge,
else return NULL;
}
};
// trie of all suffixes
class TrieOfSuffixes
{
private:
TrieNode root;
public:
// constructing a trie of suffixes of the given text(constructor)
TrieOfSuffixes(string txt)
{
// inserting all suffixes into the Suffix Trie
//using the recursive function
for (int i = 0; i < txt.length(); i++)
root.insertSuffix(txt.substr(i), i);
}
// Function for searching a pattern in this suffix trie.
void search(string pat)
{
list<int> *result = root.search(pat);// calling search function for root of Trie.
// paattern not matched if the list of indexes is empty
if (result == NULL)
cout << "Pattern not found" << endl;
else
{
list<int>::iterator i;
int patLength = pat.length();
for (i = result->begin(); i != result->end(); i++)
{
cout << "Pattern found at position " << *i - patLength<< endl;
}
}
}
};
// driver code
int main()
{
// suffix trie for text "minimize"
string txt = "minimize";
TrieOfSuffixes S(txt);
cout << "Search for 'mi'" << endl;
S.search("mi");
return 0;
}
You can also try this code with Online C++ Compiler
Time complexity is O(n+k), where n is the length of the pattern and k is the number of occurrences of the pattern.
FAQs
What is the Trie? Trie is a special type of search tree data structure that is used as an efficient retrieval or searching data structure as searching time can be reduced to the optimal limit of pattern length.
What are some applications of the Trie data structure? Trie is used in many string search-based applications such as autocomplete, prefix matching, text search.
Key Takeaways
This article covered how to implement the pattern searching algorithm using the trie of suffixes.
Check out the Coding Ninjas Studio library for getting a better hold of the data structures and algorithms. Side by side, you can also practice a wide variety of coding questions commonly asked in interviews in Coding Ninjas Studio. Along with coding questions, you can also find the interview experience of scholars working in renowned product-based companies here.