Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
Approach
3.
Trie insertion and search operation code in C++
4.
Time and space complexity for trie insertion and search
5.
Advantages of trie
6.
FAQs
7.
Key Takeaways
Last Updated: Mar 27, 2024

Trie - Insertion and Search

Introduction

As we are already familiar with tree data structure, today’s discussion is also on a similar topic as trees. To squeeze the gap of time complexity more, the trie came into existence. More precisely, it was created to compromise time and space complexities in O(N) terms. It is generally pronounced as ‘try’‘ to differentiate its existence from a tree.

trie is a tree-like data structure wherein the nodes of the tree store the entire alphabets, and strings/ words are retrieved by traversing down a branch path of the tree. But let us first see how exactly this trie data structure looks like; let’s dig it out.

Approach

In a tree data Data Structure, every node has either 0, 1, or 2 children. But in a trie, the number of child nodes possibly depends upon the total number of values. A trie is generally used to store a bunch of words or strings as in a dictionary. In the English alphabet, there are 26 alphabets, so the total number of child nodes is 26. Each trie has an empty root node, which contains references to other nodes, each signifying separate alphabetic values. 

A simple trie structure looks like this:

struct TrieNode
{
  char data;
  Int word_end;
  TrieNode *child[26]
};


So, a trie can be small or big depending on whatever strings it contains. Till now, we have just talked about the root node, but that too is empty, so where do the letters of remaining words take shelter if the root node doesn’t provide them all? Let’s figure it out.

This is a trie structure, containing words: ‘coder,’codes,’ ‘codeless,’ ‘coding’ and ‘doing.’ ‘r,’ ’s,’ ’s’ and ‘g’ are the end nodes(highlighted by yellow color). ‘c’ and ‘d’ are the root nodes. Seen from close, the root nodes look like:



A root node contains a value which is NULL, a reference array to child nodes of size 26. Generally, the value of the root node is an empty string. Typically, this structure is followed for all other nodes, although all are not shown here.

For inserting, let’s take the word ‘code’; the steps followed are:



The key ‘code’ is already inserted into the trie. What if we wanted to add the word ‘coding’ to this list?

We’d check that this word didn’t already exist. We’d follow the links from one node to another until we reach a NULL node. Then we’d insert the value into the trie.

Word_count shows that many words ending with a particular letter here in this trie. Now, let’s take a look at what searching through our newly-built trie looks like:

Trie insertion and search operation code in C++

// program for trie insertion and search in C++
#include<bits/stdc++.h>
using namespace std;

struct TrieNode
{
    char data;
    
    // word count
    int wc;  
    TrieNode *child[26];
};

// Pool of 100K Trienodes is created
TrieNode nodepool[100000]; 

// Root of Trie is created globally
TrieNode *root;    

// Always points to next free Trienode of the trie
int poolcount;  

// Initializer function
void init() 
{
    poolcount = 0;
    root = &nodepool[poolcount++];
    root->data = '/';
    
    /*
      loop from i-0 till 26
      as there are 26 alphabets in total
    */
    for(int i=0;i<26;++i)
        {
            root->child[i] = NULL;
        }
}

TrieNode *getNode(char c)
{
    TrieNode *newnode = &nodepool[poolcount++];
    newnode->data = c;
    for(int i=0;i<26;++i)
        {
            newnode->child[i] = NULL;
        }
    newnode->wc=0;
    return newnode;
}

/**
    Insert function inserts the key values
    letter by letter in the trie
**/
void insert(char *str)
{
    TrieNode *curr = root;
    int index;
    for(int i=0; str[i]!='\0'; ++i)
    {
        index = str[i]-'a';
        if(curr->child[index]==NULL)
            {
                curr->child[index] = getNode(str[i]);
            }
        curr->child[index]->wc += 1;
        curr = curr->child[index];
    }
}

/*
    The boolean function returns true. 
    if the key is found otherwise
    false for the search key
*/
bool Search_key(char *s)
{
    TrieNode *curr = root;
    int index;
    for(int i=0; s[i]!='\0'; ++i)
    {
        index = s[i]-'a';
        if(curr->child[index]==NULL || curr->child[index]->wc == 0)
            {
                return false;
            }
        curr = curr->child[index];
    }
    return true;
}

int CountPrefix(string str)
{
    TrieNode *curr = root;
    int index;
    for(int i=0; str[i]!='\0'; ++i)
    {
        index = str[i]-'a';
        
        // No string with given prefix is present
        if(curr->child[index]==NULL || curr->child[index]->wc == 0)
            {
                return 0;
            }   
        curr = curr->child[index];
    }
    return curr->wc;
}

int main()
{
    init();
    char arr[10] = {'c','o','d','e','l','e','s','s'};
    char brr[10] = {'c','o','d','e','r'};
    char crr[10] = {'c','o','d','e','s'};
    char drr[10] = {'c','o','d','i','n','g'};
    char err[10] = {'d','o','i','n','g'};
    char frr[10] = {'c','o','d','e'};

    /**
        Above key values are inserted.
        in trie
    **/
    insert(arr);
    insert(brr);
    insert(crr);
    insert(drr);
    insert(err);
    insert(frr);
    
    /**
        Search operation is done.
        printing the prefixes count with 
        Their count in the trie.
    **/
    cout<<"Program for trie insertion and search :"<<"\n";

    printf("No of strings with prefix'c' = %d\n",
        CountPrefix("c"));
    printf("No of strings with prefix 'er' = %d\n",
        CountPrefix("er"));
    printf("No of strings with prefix 'code' = %d\n",
        CountPrefix("code"));
    printf("No of strings with prefix 'coding' = %d\n",
        CountPrefix("coding"));
    printf("No of strings with prefix 'codes' = %d\n",
        CountPrefix("codes"));
    printf("No of strings with prefix 'coder' = %d\n",
        CountPrefix("coder"));

    return 0;
}

Output:

Search
No of strings with prefix'c' = 5
No of strings with prefix 'er' = 0
No of strings with prefix 'code' = 4
No of strings with prefix 'coding' = 1
No of strings with prefix 'codes' = 1
No of strings with prefix 'coder' = 1

Time and space complexity for trie insertion and search

Trie insertion Time Complexity: O(M * N)
Trie search Time Complexity: O(M * N)

Where ‘M’ is the longest word length and ‘N’ is the total number of words. The worst-case time complexity for searching the record associated with the key is O(M * N), and insertion of key-value pair(key, record) also takes O(M * N). 

The main points to ponder why the time complexity squeezes with trie is: 

  • As a trie grows in size, less work needs to be done to add a value since the “intermediate nodes” or the tree branches have already been built up.
  • Each time we add a word’s letter, we need to look at 26 references since there are only 26 letters in the alphabet. This number never changes in the context of our trie, so it is a constant value. 
     

Space complexity: O(M * N)

The space complexity is O(M * N). As at max the size required can be of the maximum word length multiplied with the total possible words. 

Advantages of trie

  1. Tries are like tree data structures. It can perform insert, search, and delete operations in O(N) time, where ‘N’ is the word/ key-value length.
  2. Key values can be easily printed in alphabetical order, which is not possible in hashmaps.
  3. Prefix-search can be easily performed in tries.

FAQs

  1. What is the fundamental difference between hash tables and trie?
    One significant difference between hash tables and tries is that a trie doesn’t need a hash function; every path to an element is unique. However, a trie takes up a lot of space with empty or NULL pointers.
     
  2. What is the time complexity of trie insertion and search?
    The complexity of the search, insert and delete operation in tries is O(M * N), containing, let’s where M is the length of the longest string and N is the total number of words.
     
  3. What is trie used for?
    Trie data structure is used to store the strings or key values which can be represented in the form of string. It contains nodes and edges. At max, 26 children connect each parent node to its children.
     
  4. Does Google use trie for some purpose?
    Yes, Google also uses trie to store each word or string value.
     
  5. How is trie pronounced?
    A trie is generally pronounced as “try”, although it is derived from the word “retrieval”.

Key Takeaways

In this article, we have discussed trie insertion and search techniques, a discussion around its time and space complexity, and finally, understand why every time trie wins over trees. If you want to practice more such problems then you can visit our Code Studio.

But this isn’t the end, right? 

Keeping the theoretical knowledge at our fingertips helps us get about half the work done. To gain complete understanding, practice is a must. A variety of coding questions from interviews are available. If you think that this blog helped you, then share it with your friends!.

Happy learning

Live masterclass