Before we jump into understanding the problem statement and methods to solve the problem, let’s quickly revise the prerequisites. A Binary Search Tree or a BST is a data structure to store the hierarchical arrangement of data. Properties of the BST are:

The right subtree of a BST node always has nodes with values greater than that BST node.

The left subtree of a BST node always has nodes with values smaller than that BST node.

Each subtree should also be a BST

Problem Statement

Given a number ‘N’ representing the number of nodes present in a BST, find the number of unique BSTs possible using ‘N’ nodes.

Please note that all the nodes are unique (No node is a duplicate of another). You can consider the ‘N’ nodes having values ranging from 1 to ‘N’.

Input Format

Number of Nodes: 3

Output Format

Number of BSTs: 5

Explanation

Get the tech career you deserve, faster!

Connect with our expert counsellors to understand how to hack your way to success

User rating 4.7/5

1:1 doubt support

95% placement record

Akash Pal

Senior Software Engineer

326% Hike After Job Bootcamp

Himanshu Gusain

Programmer Analyst

32 LPA After Job Bootcamp

After Job Bootcamp

Method 1: Recursion

Recursion is all about using a top-down approach. Every recursion solution recipe has the following ingredients:

Step 1: A problem with bigger input is broken down into the same problem with smaller inputs.

Step 2: The process keeps repeating until the breakdown is not possible anymore. In other words, recursion stops when we hit the base case.

Step 3: The outputs of smaller problems are combined to compute problems with bigger inputs.

Can we use recursion for the given problem? Let’s see if our unique BSTs problem contains all the ingredients for our recursion recipe.

We know that every subtree of a BST is also a BST. A bigger BST is split into two BST subtrees with a size smaller than the bigger one BST. For a number ‘i’ as the current root node. The left child nodes can be the nodes with values less than ‘i’. Similarly. the right child nodes will be the nodes with values greater than ‘i’ and less than ‘N’. So, the size of the left subtrees will be ‘i - 1’. And the right subtree will be ‘N - i’.

Can you guess the base case? If there are no more subtrees possible for the current subtree, we have reached the base case. You can understand it this way: If a tree has no nodes or only one node, then the number of unique BSTs possible is 1.

Do you remember the Fundamental Principle of Counting? If one job can be completed in N ways and another can be completed in M ways, then the total number of ways both the jobs can be completed together is calculated as

Total number of ways= N x M.

For a BST, if ‘N’ right subtrees are possible and ‘M’ left subtrees are possible, then the number of unique BSTs possible is N x M.

The problem has all the ingredients for our recursion recipe. Refer to the algorithm given below for a better understanding.

Algorithm

If N= 0 or N = 1 (N = number of nodes in the BST)

Return 1.

Set ‘ANSWER’, ‘LEFT_SUBTREES’and ‘RIGHT_SUBTREES’variables = 0.

Loop with variable ‘i’from 1 to ‘N’ (‘i’ is the root of current BST).

Set ‘LEFT_SUBTREES’equal to a recursive call with ‘i- 1’ input.

Set ‘RIGHT_SUBTREES’equal to a recursive call with ‘N - i’ input.

Add (LEFT_SUBTREES x RIGHT_SUBTREES) to ‘ANSWER’.

Return ‘ANSWER’.

Logic Under the Hood

You don't need to think about the start and end value of a BST or a subtree. Because two different subtrees with values 1 to 3 and with values 3 to 6 look the same. If you still have doubts, draw the subtrees on a piece of paper. Don’t take bigger values for ‘N’. It's going to cost you pages for a dry run for significant input.

#include <iostream>
using namespace std;
int numberOfBSTs(int n)
{
// Base case when a tree is either empty or has only one node.
if (n == 0 || n == 1)
return 1;
int answer = 0, leftSubtree = 0, rightSubtree = 0;
// Different root values possible for current BST.
for (int i = 1; i <= n; i++)
{
// Recursive call to the left subtree.
leftSubtree = numberOfBSTs(i - 1);
// Recursive call to the right subtree.
rightSubtree = numberOfBSTs(n - i);
/* Using outputs of problems with smaller input in
problems with bigger inputs. */
answer += leftSubtree * rightSubtree;
}
return answer;
}
int main()
{
int n;
cout << "Number of nodes: ";
cin >> n;
cout << "Number of BSTs: " << numberOfBSTs(n);
}

Input

Number of Nodes: 5

Output

Number of BSTs: 42

Complexity Analysis

Time Complexity: The complexity estimation for the above recursion can be done as follows:

Let's call the estimation function T(‘N’) for the ‘N’ number of nodes, here it represents the number of recursive calls for ‘N’.

A call for ‘N’ takes exactly 2 * (N - 1) recursive calls, each of them adding their own costs , 2 * (T(1) + T(2) + … + T(N - 1)).

A call for N + 1 takes exactly 2 * N recursive calls. Each of them adding their own cost, 2 * ( T(1) + T(2) + … + T(N - 1) + T(N)).

T(N + 1) - T(N) = 2 + 2 * T(N)

T(N) = 3 * T (N - 1) + 2

T(N) = 3^{N}

And hence time complexity will be O(3^{N}), where ‘N’ represents the number of nodes.

Space Complexity: For every recursive call ‘LEFT_SUBTREES’ , ‘RIGHT_SUBTREES’, and ‘ANSWER’ variables will be created. The total space required to store variables in all recursive calls is 3 * 3^{N}. So, the space complexity is O(3^{N}), where ‘N’ represents the number of nodes.

Take a look at the recursion tree shown in the figure below:

With the increase in the size of the recursive tree, the recursive function is called multiple times with the same input argument. Memoization can prune the recursive tree by storing the results of recursive calls. All you need to do is to make some tiny changes in our recursion recipe.

Add the steps given below in our recursion procedure, and we’re good to go!

The output of the recursive calls can be memoized using an array of size ‘N’. Before calling the numberOfBSTs() function, initialize a memo array of size ‘N’ and values equal to zero.

Before making a recursive call, check if its output is already present in the lookup. Thus, memoization calculates output for every subtree only once.

Before returning the output of a recursive call, store the result in the lookup.

Refer to the image given below for a better understanding.

Algorithm

If N = 0 or N = 1 (N = number of nodes in the BST).

Return 1.

If MEMO[N] ! = 0.

Return MEMO[N].

Set ‘ANSWER’, ‘LEFT_SUBTREES’and ‘RIGHT_SUBTREES’variables = 0.

Loop with variable ‘i’from 1 to ‘N’ (‘i’ is the root of current BST).

Set ‘LEFT_SUBTREES’equal to a recursive call with i- 1 input.

Set ‘RIGHT_SUBTREES’equal to a recursive call with N - i input.

Add (‘LEFT_SUBTREES’ x ‘RIGHT_SUBTREES’) to ‘ANSWER’.

Set MEMO[N] = ANSWER.

Return ‘ANSWER’.

Let’s make changes to our recursive function and see how it looks.

Program

#include <iostream>
#include <vector>
using namespace std;
int numberOfBSTs(int n, vector<int> &memo)
{
// Base case when a tree is either empty or has one node only.
if (n == 0 || n == 1)
return 1;
// Checking lookup before recursive calls.
if (memo[n] != 0)
return memo[n];
int answer = 0, leftSubtree = 0, rightSubtree = 0;
// Different root values possible for currentBST.
for (int i = 1; i <= n; i++)
{
// Adding memo argument in recursive calls.
leftSubtree = numberOfBSTs(i - 1, memo);
rightSubtree = numberOfBSTs(n - i, memo);
/* Using outputs of problems with smaller input in
problems with bigger inputs. */
answer += leftSubtree * rightSubtree;
}
// Storing answer in memo before return.
memo[n] = answer;
return answer;
}
int main()
{
int n;
cout << "Number of nodes: ";
cin >> n;
// Initialization of memo for lookup.
vector<int> memo(n, 0);
cout << "Number of BSTs: " << numberOfBSTs(n, memo);
}

Input

Number of Nodes: 5

Output

Number of BSTs: 42

Complexity Analysis

Time Complexity: For every value from 1 to ‘N’ where ‘N’ represents the number of nodes in a BST, left and right subtrees are calculated once using memoization. To loop through 1 to ‘N’ requires O(N) time. To calculate the number of left and right subtrees, N - 1 recursive call are made. Every memoized recursive call runs at least once. So, the time complexity of code is O(N x (N - 1 )) = O(N^{2}), where ‘N’ represents the number of nodes in a BST.

Space Complexity: Extra space is used to memoize the code. The size of the ‘MEMO’ vector is equal to the size of the input. So, the space complexity is O(N), where ‘N’ represents the number of nodes in a BST.

Method 3: DP with Tabulation

Rather than going from top-down (going from bigger input to smaller input), DP with tabulation follows a bottom-up approach (from smaller input to bigger input) to find the number of unique BSTs possible. Let’s see how we can do it for our problem statement.

To build the solution for problems with bigger input using smaller size problems we need to store results of the latter. Let’s store them in an integer vector, ‘DP’.

Try to find the smallest input possible. The smallest possible input is one where the recursive call for a smaller input is not possible. Recursion stops at this point. If you remember, this point is the base case of a recursive function. For this problem, the base case is when the BST is empty or has one node at most. For the base case, the number of unique BSTs is 1. Let’s store the result of the base case in the ‘DP’ vector. So, DP[0] = 1 , DP[1] = 1.

Once we decide on the base case, we need to work on recursive calls. The recursive calls in this problem are to find the left and right subtrees. For every number ‘N’ as input, a loop is run using the ‘i’ variable. The loop made the recursive calls with arguments as ‘i’ and ‘N’ - ‘I’.

Dynamic programming stores the results of recursive calls in the ‘DP’ vector beforehand. How? The solution is built from bottom to up. The outputs for smaller-sized problems are already computed. Refer to the algorithm given below for a better understanding.

Algorithm

Initialize a vector ‘DP’ of size N + 1 with values 0.

Set DP[1] = 1 and DP[0] = 1.

Loop from 2 to ‘N’ with variable‘i’ .

Loop from 0 to ‘i’ with variable ‘j’(to find all the possible subtrees with ‘i’as root).

Set LEFT_SUBTREES=DP[j] (replacement of numberOfBSTs(i - 1)).

Set ‘RIGHT_SUBTREES’= DP[‘I’ -‘J’ - 1] (replacement of numberOfBSTs(N - i)).

Add ‘LEFT_SUBTREES’and ‘RIGHT_SUBTREES’to DP[i].

Return DP[N].

Program

#include <iostream>
#include <vector>
using namespace std;
int numberOfBSTs(int n)
{
// Dp vector to store results of different inputs.
vector<int> dp(n + 1, 0);
// Base case.
dp[1] = 1, dp[0] = 1;
// Calculating number of BSTs possible for different values of n.
for (int i = 2; i <= n; i++)
{
// Finding all the possible subtrees with i as root.
for (int j = 0; j < i; j++)
{
int leftSubtree = dp[j];
int rightSubtree = dp[i - j - 1];
dp[i] += leftSubtree * rightSubtree;
}
}
return dp[n];
}
int main()
{
int n;
cout << "Number of nodes: ";
cin >> n;
cout << "Number of BSTs: " << numberOfBSTs(n);
}

Input

Number of Nodes: 3

Output

Number of BSTs: 5

Complexity Analysis

Time Complexity: The code runs on two nested linear loops. So, the time complexity is O(N^{2}), where ‘N’ represents the number of nodes in a BST.

Space Complexity: Extra space is used to create the ‘DP’vector. The size of the ‘DP’vector is equal to the size of the input. So, the space complexity is O(N), where ‘N’ represents the number of nodes in a BST.

Relation between DP and Catalan numbers

The method we have seen above is calculating the Catalan number for a given input n. A Catalan number C_{n} represents the number of combinations possible for an arrangement of N elements. It is primarily used in problems involving recursion. The value of C_{n }can be calculated as:

Don’t be afraid of this complex equation. It just represents two nested for loops we have used in our code. Refer to the snippet given below for the same:

for (int i = 2; i <= n; i++)
{
// Finding all the possible subtrees with i as root.
for (int j = 0; j < i; j++)
{
int leftSubtree = dp[j];
int rightSubtree = dp[i - j - 1];
dp[i] += leftSubtree * rightSubtree;
}

There’s usually more than one approach to a solution. Yes, it feels sluggish to solve the same problem again and again. But, we should always try to look for more ways to solve a problem. After all, it’s an excellent way to practice more than one algorithm or technique from a single problem.

Try to memoize the code if some value is computed again and again. Once you memoize the code successfully, try to tabulate it. Why so? Because it is an excellent way to avoid recursive calls and fill the call stack. You can practice memoization and many more cool techniques using our free practice platform Coding Ninjas Studio. So, keep practicing. That’s what good coders do.