Introduction to tries
Trie is a tree data structure used for storing collections of words/trie nodes in the form of strings. If two strings have a common prefix, then they will have the same ancestor in the tree. We can use trie to store hundreds of thousands of strings, making it easy to search a string if it exists.
For example - Dictionary is a trie data structure.
GIF Credits: giphy
Searching words is a laborious process but using tries makes it efficient and simpler.
Trie is an ideal data structure and is very much similar to the construction of trees. Tries are also used to do a prefix-based search, and we can sort the strings lexicographically in the trie. An alternative method is to use a hash table, but it has some drawbacks over trie since it cannot do a prefix-based search and a regular hash table takes more space than a trie. While deleting keys or words in a trie, we use recursion in a bottom-up manner.
Working of the Code for deleting nodes from trie
Let us consider an example. Suppose we have strings “ abc ”, “ abcf ” and “ kmn ” in our trie data structure, and we will delete the string “ abcf ”.
As we have to delete the string “ abcf ”, the first element to search in the trie data structure is the character ‘a’ in the root trie node.
Since ‘a’ is present in the root node, we will move the pointer to the next character in the string. We will then search for the next character in the trie data structure, i.e.,- character ‘b’.
Since ‘b’ exists in the node, we will move to the next pointer and search for the next character in the trie data structure, i.e.,- character ‘c’.
Notice that ‘c’ is also present in the next node, we will move the pointer to the next character in the string and search for the next character, ‘f’.
‘f’ is also present in the next node, we will move the pointer to the next node.
This trie node has no children because its data value is empty, so we can easily delete this node and move up to the higher level.
Since the end is false, we can easily delete this data value and the node and move to the next higher level.
Now here, we cannot delete this node as if we delete this node; then string “ abc ” will not be available when searching. Hence we stop here and we have deleted the string “ abcf ”.
Algorithm for deleting nodes from trie
When building a trie, the first element to be created or initialized is the root element before inserting any element.
The root value will have a variable ‘wordEnd’ = 0; the wordEnd variable will indicate if the word has ended or not. If the word has finished, then we will increment the wordEnd variable by 1.
The root will create 26 pointers which will correspond to the child nodes.
struct Trienode{ |
While performing the delete operation, we will be deleting the key using recursion in a bottom-up way.
The first case will be to check if the key exists or not in the trie data structure. We have to make sure that deleting nodes from trie should not modify the trie.
For example:
WORDS: [“apple”, “ball”, “car”, “dog”, “dogecoin”, “carter”, “balling”]
We want to delete the string “best”.
We will search for the word “best” in the trie, but since it does not exist, hence no deletion will occur.
The second case is that the key should be unique, i.e., no part of the key should contain any other key, nor the key itself will have another key. Then we can delete all the nodes.
For example, we can delete “apple” as it is a unique key.
The third case is when the key itself is a prefix of another key in the trie data structure. (Prefix case)
For example, we want to delete “car” but “car” is also present in “carter”. Hence, we will update the leaf node boolean value “true” to ”false” for the string “car”.
The fourth case is when the key we are deleting has another key as its prefix.
For example, we want to delete “balling”. So we can delete all the nodes from the bottom till we reach the leaf node for the word “ball”. Hence, we will delete 3 characters “g”, “n” and then “i”.
The fifth case is when a node is later on shared with other words.
For example, we want to delete the word “dogecoin” which is also a part of “dog”. We will check if any other children are there in “d”; there is one child; hence, we move on to “o”, we notice that it is common for “dog” too. Here we will delete the remaining nodes “e”, “c”, ”o”, ”i” and ”n”; hence, the word “dogecoin” is deleted.