Last Updated: 12 Jan, 2021

Check for Duplicate Subtree

Hard
Asked in companies
Flipkart limitedGoogle inc

Problem statement

Given an arbitrary binary tree consisting of N nodes, each node is associated with a character value. Your task is to check whether there exist a pair of nodes such that their subtrees are equal.

Note :
A subtree of a node X is all the nodes that are below X and X itself.

A binary tree is a tree consisting of at most two children.

Two subtrees are said to be equal if they both have the same structure and the corresponding nodes in both the subtrees are associated with the same character value.
Input format :
The first line of each input has a single integer T, denoting the number of test cases.

The first line of each test case contains a single integer N, denoting the number of nodes in a tree.

The second line contains the keys of the nodes of the tree in the level order form ( # for NULL node) Refer to the example for further clarification.
Example:
Consider the binary tree

altImage

The input of the tree depicted in the image above will be like:
a
b c
d # e f
# g # # # #
# #

Explanation :
Level 1 :
The root node of the tree is a

Level 2 :
The left child of a = b
Right child of a = c

Level 3 :
Left child of b= d
Right child of b = null (#)
Left child of c = e
Right child of c = f

Level 4 :
Left child of d = null (#)
Right child of d = g
Left child of e = null (#)
Right child of e = null (#)
Left child of f = null (#)
Right child of f = null (#)

Level 5 :
Left child of g = null (#)
Right child of g = null (#)

The first not-null node (of the previous level) is treated as the parent of the first two nodes of the current level. The second not-null node (of the previous level) is treated as the parent node for the next two nodes of the current level and so on.

The input ends when all nodes at the last level are null (#).
Output format :
For each test case print on a new line “True”, if there are two similar subtrees with size greater than or equal to 2, otherwise print “False”.
Note :
You do not need to print anything, it has already been taken care of. Just implement the given function. 
Constraints :
1 <= T <= 100
1 <= N <= 500
a <= node value <= z

Time Limit: 1 sec.

Approaches

01 Approach

  • Store the pointer to all the nodes of a tree in an array let say Nodes. To do this you can do a preorder traversal in the tree.
  • Now iterate through all the indices in the Nodes array. Let’s say we are currently at ith index.
    • For each index i iterate through all indices of nodes other than itself. Let's say we iterate through all indices j such that j not equal to i.
      • For each i and j, we will try to find out whether subtrees of Nodes[i] and Nodes[j] are the same or not.
      • For this, we will do a simultaneous preorder on both the nodes and if it is the same then we will return True.
  • If we don’t find any two nodes with the same subtree then we will return False.

02 Approach

  • In this approach, we will try to serialise each tree and find the hash of each subtree and if there are two subtrees with the same hash we will return true else return false.
  • Define a hashmap Hash that we will use to store the hash of each subtree.
  • Do a preorder traversal of the tree. Let’s say we are current.
    • Call preorder traversal of its left and right subtree.
    • Now calculate the serialised string for this node.
    • This string would be equal to the current->key + serialised string of its left child + serialised string of its right child.
    • Also, maintain the size of the subtree of current, it would be equal to 1 + size of the subtree of its left child + size of a subtree of its right child.
    • If its size of the subtree is greater than or equal to 2 then add it to the Hash.
  • If there are any strings with a count greater than 1 in Hash then return true else return false.
  • Let's say we have the following tree.

3

a

b b

# # # #

  • then if we do preorder traversal then we will get the following serialized strings.
  • SerialzedString[1] = ab##b##
  • SerialzedString[2] = b##  but it has size only 1 so we do not store it in the hash map
  • SerialzedString[3] = b## but it has size only 1 so we do not store it in the hash map
  • So there are no serialized strings with size greater than or equal to 2 with count more than 1 so we return false.