Last Updated: 15 Jan, 2021

Largest Distance Between Two Nodes In A Tree

Moderate
Asked in companies
Samsung, Innovaccer, Microsoft

Problem statement

You are given an arbitrary unweighted rooted tree which consists of N nodes, 0 to N - 1. Your task is to find the largest distance between two nodes in the tree.

The distance between two nodes is the number of edges in a path between the nodes (there will always be a unique path between any pair of nodes since it is a tree).

Note :
Use zero-based indexing for the nodes.

The tree is always rooted at 0.
Input format :
The very first line of input contains an integer ‘T’, denoting the number of test cases.

The first line of each test case contains an integer ‘N’, denoting the number of nodes in the tree. 

The next N-1 lines of each test case contain two space-separated integers u and v, denoting an edge between node u and node v.
Output format :
For each test case, the largest distance between two nodes in the tree is printed.

Note :

You do not need to print anything; it has already been taken care of. Just implement the given function.
Follow Up :
Can you solve this problem in just one traversal?
Constraints :
1 <= T <= 100 
2 <= N <= 3000
0 <= u , v < N

Time Limit: 1 sec

Approaches

01 Approach

  • A brute force approach could be to find the distance of each node to every other node in the tree.
  • While doing so, we keep track of the longest distance that can be travelled from each node.
  • The node from which we can travel the maximum distance gives us the answer.

Algorithm:

  • Iterate over the nodes, 0 to N-1.
  • For each node, apply BFS, starting from the current node.
    • Store the distance of the longest path that can be covered from the current node.
  • The maximum of the distances from all the nodes gives us the required answer.

Note:

  • We can also use DFS to calculate the distances between two nodes. But the idea will remain the same.
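
The brute-force steps above can be sketched as follows. This is a minimal illustration, not the reference solution: the function name largest_distance_brute_force and the (n, edges) signature are assumptions made for this example, with edges given as a list of (u, v) pairs.

```python
from collections import deque


def largest_distance_brute_force(n, edges):
    # Hypothetical signature: n nodes labelled 0..n-1, edges as (u, v) pairs.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def farthest_distance(start):
        # BFS from `start`; dist[x] is the number of edges from start to x.
        dist = [-1] * n
        dist[start] = 0
        queue = deque([start])
        best = 0
        while queue:
            node = queue.popleft()
            best = max(best, dist[node])
            for nxt in adj[node]:
                if dist[nxt] == -1:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        return best

    # The answer is the largest "farthest distance" over all start nodes.
    return max(farthest_distance(start) for start in range(n))
```

Each BFS visits every node once, so this sketch does O(N) work per start node and O(N^2) work overall, which should be acceptable for N up to 3000.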

02 Approach

  • The idea behind this approach is that the longest path in a tree always runs between two end (leaf) nodes. The length of this path is the diameter of the tree.
  • To find the diameter of the tree, we recursively traverse the tree in post-order (a form of DFS).
  • Suppose, we are currently at the root node of the tree.
  • Now, there are two possible cases:
    • Case 1: The diameter of the tree is present in one of the subtrees of the root node.
    • Case 2: The diameter of the tree passes through the root node.

         

  • For the first case, we can recursively find the diameter of the subtrees.
  • For the second case, we need to find the length of the longest path that passes through the root.
    • To do so, we sum the two largest heights among the subtrees rooted at the root's children, where the height of a subtree is the number of nodes on the longest path from its root down to a leaf. The number of nodes on that path equals the number of edges from the current root to that leaf, so the sum is already a length in edges.
  • The maximum of the diameter of the subtrees (case 1) and the length of the longest path passing through the root node (case 2) gives us the length of the longest path in the tree.

Algorithm:

  • Let our recursive function be getDiameter(tree, currentNode, parent), which returns the diameter of the subtree rooted at currentNode, where parent is the parent of the current node.
  • In order to get the diameter of the given root (i.e. 0) we call the function with currentNode = 0 and parent = -1.
  • For every child of the current node:
    • We calculate the height of the subtree rooted at the child node.
  • From all the heights calculated in the previous step, we find the largest and second-largest heights and store them in the variables maxHeight1 and maxHeight2, respectively. If the current node has fewer than two children, the missing heights are taken as 0.
  • Now, for every child of the current node:
    • Calculate its diameter by recursively calling the function getDiameter(tree,child,currentNode).
  • Store the maximum diameter in a variable, say maxChildDiameter.
  • The diameter of the subtree rooted at the current node will be the maximum of maxChildDiameter and maxHeight1 + maxHeight2.
  • So, return MAX(maxChildDiameter, maxHeight1 + maxHeight2)
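
A minimal sketch of these steps is shown below. It assumes the tree is stored as an adjacency list; get_height and build_tree are hypothetical helpers added for this example, and the recursion limit is raised on the assumption that a path-like tree may be up to N = 3000 nodes deep.

```python
import sys

# Assumption: trees can be path-like with N up to 3000, so the default
# recursion limit may be too low for the nested recursive calls.
sys.setrecursionlimit(10000)


def build_tree(n, edges):
    # Hypothetical helper: adjacency list for an undirected tree.
    tree = [[] for _ in range(n)]
    for u, v in edges:
        tree[u].append(v)
        tree[v].append(u)
    return tree


def get_height(tree, node, parent):
    # Height in nodes of the longest downward path starting at `node`.
    best = 0
    for child in tree[node]:
        if child != parent:
            best = max(best, get_height(tree, child, node))
    return best + 1


def get_diameter(tree, current, parent):
    max_height1 = max_height2 = 0  # two largest child-subtree heights
    max_child_diameter = 0
    for child in tree[current]:
        if child == parent:
            continue
        h = get_height(tree, child, current)
        if h > max_height1:
            max_height1, max_height2 = h, max_height1
        elif h > max_height2:
            max_height2 = h
        # Case 1: the diameter lies entirely inside a child's subtree.
        max_child_diameter = max(
            max_child_diameter, get_diameter(tree, child, current)
        )
    # Case 2: the longest path through `current` uses its two tallest children.
    return max(max_child_diameter, max_height1 + max_height2)
```

Under these assumptions, get_diameter(build_tree(n, edges), 0, -1) returns the answer. Because the height is recomputed for every node, the sketch can take O(N^2) time on a path-like tree.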

03 Approach

  • Instead of traversing the complete tree for every node, the problem can also be solved using just two traversals.
  • We just need to apply the BFS traversal more cleverly.
  • The idea behind this approach is that if we start a BFS from any node X and find the node farthest from it, say Y, then Y must be an endpoint of a longest path in the tree.
  • Now, we apply a second BFS, taking Y as the starting node. Suppose the farthest node from Y is Z. Then the path from Y to Z is a longest path in the tree, so we return its length.
  • We can also apply the same idea using DFS.
  • Note: A mathematical proof of the above algorithm can be found here.
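
The two-traversal idea can be sketched as below. The names bfs_farthest and largest_distance_two_bfs are assumptions made for this example, and the first BFS starts from node 0 only for convenience, since any start node works.

```python
from collections import deque


def bfs_farthest(adj, start):
    # Returns (farthest node from `start`, its distance in edges).
    dist = [-1] * len(adj)
    dist[start] = 0
    queue = deque([start])
    far = start
    while queue:
        node = queue.popleft()
        if dist[node] > dist[far]:
            far = node
        for nxt in adj[node]:
            if dist[nxt] == -1:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return far, dist[far]


def largest_distance_two_bfs(n, edges):
    # Hypothetical signature: n nodes labelled 0..n-1, edges as (u, v) pairs.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    y, _ = bfs_farthest(adj, 0)       # first BFS: find one endpoint Y
    _, length = bfs_farthest(adj, y)  # second BFS: distance from Y to Z
    return length
```

Each BFS is linear in the number of nodes, so the whole sketch runs in O(N) time and avoids recursion entirely.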

04 Approach

  • This approach is an optimization of Approach 2.
  • Instead of running a separate height traversal alongside the diameter recursion as in Approach 2, we can solve the problem using just one traversal. To do so, we calculate the height of each subtree in the same recursion as the diameter and store the heights for each node in an array.
  • This way we avoid recomputing the heights for every node.
  • Now, the longest path of the tree (i.e. diameter) can lie:
    • Either completely in one of its subtrees (i.e. nothing but the diameter of the subtree).
    • Or can contain the root node.
  • Hence, the longest path for a tree rooted at node ‘R’ will be the maximum of the diameters of the subtrees of R and the longest path passing through R, which is the sum of the two largest heights among the subtrees rooted at the children of R.

Algorithm:

  • Create an array, height, of size N and initialize it with 1.
  • Let our recursive function be largestDistanceHelper(tree, height, currentNode, parent, ans), where the ‘ans’ variable stores the length of the longest path found so far; once the call returns, it covers the whole tree rooted at ‘currentNode’.
  • In order to get the diameter of the given root (i.e. 0) we call the function with currentNode = 0 and parent = -1.
  • For every child of the current node:
    • We calculate its diameter by recursively calling the function largestDistanceHelper(tree, height, child, currentNode, ans). The recursive call also calculates the heights of the subtree rooted at child.
    • So, if height[child] + 1 is greater than height[currentNode], we update the height as height[currentNode] = height[child] + 1.
  • From the heights of the child subtrees calculated in the previous step, we find the largest and second-largest heights and store them in the variables maxHeight1 and maxHeight2, respectively (0 if there is no such child).
  • Now, we update the ans as MAX(ans, maxHeight1 + maxHeight2, height[currentNode] - 1).
    • We subtract 1 from the height as we consider the number of edges for the length of the longest path and not the number of vertices.
  • ans stores the final answer.
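
A minimal sketch of the single-traversal algorithm follows. Since Python integers cannot be passed by reference, ans is modelled here as a one-element list; that choice, the wrapper largest_distance, and the raised recursion limit (assuming path-like trees of up to 3000 nodes) are assumptions of this example.

```python
import sys

# Assumption: the tree may be a path of up to 3000 nodes, which exceeds
# Python's default recursion limit.
sys.setrecursionlimit(10000)


def largest_distance_helper(tree, height, current, parent, ans):
    max_height1 = max_height2 = 0  # two largest child-subtree heights
    for child in tree[current]:
        if child == parent:
            continue
        # The recursive call fills in height[child] and updates ans[0].
        largest_distance_helper(tree, height, child, current, ans)
        h = height[child]
        if h > max_height1:
            max_height1, max_height2 = h, max_height1
        elif h > max_height2:
            max_height2 = h
        # Heights count nodes, so the current node adds one to the child's.
        height[current] = max(height[current], h + 1)
    # height[current] - 1 converts a node count into an edge count.
    ans[0] = max(ans[0], max_height1 + max_height2, height[current] - 1)


def largest_distance(n, edges):
    # Hypothetical wrapper: n nodes labelled 0..n-1, edges as (u, v) pairs.
    tree = [[] for _ in range(n)]
    for u, v in edges:
        tree[u].append(v)
        tree[v].append(u)
    height = [1] * n  # every node is a subtree of height 1 on its own
    ans = [0]
    largest_distance_helper(tree, height, 0, -1, ans)
    return ans[0]
```

Every node is visited exactly once, so this sketch runs in O(N) time with O(N) extra space for the height array.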