Last Updated: 24 Jul, 2021

Longest Common Prefix

Moderate
Asked in companies
CiscoAmazonJP Morgan

Problem statement

You are given an array ‘ARR’ consisting of ‘N’ strings. Your task is to find the longest common prefix among all these strings. If there is no common prefix, you have to return an empty string.

A prefix of a string can be defined as a substring obtained after removing some or all characters from the end of the string.

For Example:
Consider ARR = [“coding”, ”codezen”, ”codingninja”, ”coders”]
The longest common prefix among all the given strings is “cod” as it is present as a prefix in all strings. Hence, the answer is “cod”.
Input Format:
The first line of the input contains a single integer, 'T,’ denoting the number of test cases.

The first line of each test case contains a single integer ‘N’ denoting the number of strings in the array.

The next line contains ‘N’ space-separated strings denoting the elements of the array ‘ARR’.
Output Format:
For each test case, print a single string corresponding to the longest common prefix.

Print the output of each test case in a separate line.
Note:
You do not need to print anything. It has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10
1 <= N <= 3000
1 <= |ARR[i]| <=1000

Each string consists of only lowercase letters.

Time limit: 1 sec

Approaches

01 Approach

In this approach, we will iterate through each character of the first string and check if the same character exists in each string or not. We will maintain a variable longestPrefix to store the longest common prefix.

 

We will traverse idx from 0 to the length of ARR[0] -1.

  1. We will iterate index from 1 to N-1 and check if ARR[index][idx] is equal to ARR[0][idx].
  2. If the condition is true for all strings, we can add ARR[0][idx] in the common prefix string, and we will insert ARR[0][idx] in longestPrefix. Otherwise, the search is completed for the longest common prefix.

In the end, we will return the variable longestPrefix.

 

Algorithm:

  • We will declare a string longestPrefix to store the longest prefix string of all strings.
  • We will iterate through the first string of ARR and check each of its characters is present in all other strings or not.
  • Iterate idx from 0 to length of ARR[0] - 1 ,do the following:
    • Set ch as ARR[0][idx]. The variable ch stores the character to be searched.
    • Set matched as true. The variable stores if it is possible to insert ch into the longestPrefix.
    • We will iterate index from 1 to N-1, do the following:
      • Check if the length of ARR[index] is less than or equal to the variable idx or ARR[index][idx] is not equal to ch.
        • Set matched as false.
        • Break this loop.
    • If matched is true, we can insert character ch into longestPrefix. Otherwise, break this loop as the longest common prefix is already found.
  • Return the string longestPrefix corresponding to the longest prefix string of all the strings in the given array.

02 Approach

In this approach, we will divide the array of strings into halves, and we will find the common prefix for both parts separately. We will find the common prefix of these obtained strings.

 

We will define a function commonPrefix(STR1, STR2) to find the common prefix of STR1 and STR2. We will define a function commonPrefixUtil(ARR, start, end), returning the longest common prefix for the subarray from index start to end. This function will recursively divide the range into half and find the longest common prefix for that subarray of ARR.

 

Algorithm: 

  • Defining function commonPrefix(STR1, STR2):
    • Set minLength as the minimum of the length of STR1 string and length of STR2 string.
    • Declare a string ans to store the longest common prefix of these two strings.
    • For each index idx from 0 to minLength - 1 ,do the following:
      • If STR1[idx] is equal to STR2[idx]:
        • Insert STR1[idx] into ans.
      • Otherwise, break the loop.
    • Return ans string.

 

  • Defining function commonPrefixUtil(ARR, start, end):
    • If start is equal to end:
      • The subarray contains only one string, so the longest common prefix will be ARR[start].
      • Return ARR[start].
    • Declare variables left and right to store the longest common prefix for the left and right halves.
    • Set mid as (start + end)/2.
    • Set left as commonPrefixUtil(ARR, start, mid).
    • Set right as commonPrefixUtil(ARR, mid+1 , end).
    • Declare the variable ans to store the longest common prefix for the subarray from start to end.
    • Set ans as commonPrefix(left,right).
    • Return ans.
  • Declare a string answer to store the longest common prefix for all strings of the array.
  • Set answer as commonPrefixUtil(ARR, 0, N-1).
  • Return answer corresponding to longest common prefix.

03 Approach

In this approach, we will first find the string with the minimum length among all the strings in the array and store it as prefix because the shortest string is the longest possible common prefix. We will also define a function isCommon(ARR, prefix, length, N) to whether the string prefix is a common prefix for all the strings in the array ARR or not. We will start the search by setting start as 0 and end as the length of prefix. At each iteration, we will set mid as (start+end)/2. Each time search space is divided into two equal parts, one of them is discarded because it is sure that it doesn't contain the solution. The element prefix[0: mid] is the substring of the prefix from index 0 to mid-1. There are two possible cases:

  • The prefix[0: mid] is not a common string. We discard the second half of the search space. Hence, we will set end as mid-1.
  • prefix[0: mid] is a common string. We discard the first half of the search space because we try to find a longer common prefix. Hence, we will set start as mid-1.

At last, we will return the substring of prefix from index 0 to mid.

 

Algorithm: 

  • Defining function isCommon(ARR, prefix, length, N) :
    • We will iterate index idx from 0 to length -1, do the following steps:
      • We will iterate the index from 0 to N-1.
        • If ARR[index][idx] is not equal to prefix[idx] then we found a mismatched character, and we will return false.
    • If we don’t find any mismatched character, this prefix is common in all strings. Hence, return true.

 

  • Declare a string prefix to store the shortest string among the array.
  • Declare minLength to store the length of the shortest string. Initialize minLength as large integer value.
  • We will iterate the variable idx from 0 to N-1:
    • If the length of ARR[idx] is less than  minLength, do the following:
      • Set minLength as the length of ARR[idx].
      • Set prefix as ARR[idx].
  • Set start as 0.
  • Set end as minLength -1.
  • While start is less than or equal to end, do the following:
    • Set mid as (start + end) /2. 
    • If isCommon(arr, prefix, mid, N) is true means the first half is common in all strings, we will try to search for a longer common prefix.
      • Set start as mid + 1.
    • Otherwise, the substring of prefix from index 0 to mid is not common in all strings. We will reduce the length.
      • Set end as mid-1.
  • Set mid as (start + end) / 2 .
  • Now mid is the length of the longest common prefix.
  • Declare a string answer to store the longest common prefix for all strings of the array.
  • Set answer as the substring of prefix starting from index 0 to index mid-1.
  • Return answer corresponding to longest common prefix.

04 Approach

In this approach, we will use Trie to store all the strings, and then we will search for the longest common prefix using the Trie. Every node of Trie consists of multiple branches. Each branch represents a possible character of keys. We need to mark the last node of every key by setting endOfWord as true. 

  1. To search for the longest common prefix, we will start from the root node of the trie and check if it has only one child. If the node satisfies this condition, we will add it into our longest common prefix.
  2. If we encounter a node where endOfWord is true, we cannot iterate further

Hence our search is completed. We will also define a function insertWord(root, word) to insert a string word into the Trie. 

 

Algorithm: 

  • Defining TrieNode class:
    • A variable val to store the character.
    • An array child of size 26 to store the child pointers corresponding to each character. Initialize all the child pointers with an empty node.
    • A variable childCount to count the number of child nodes. Set childCount with 0.
    • A variable endOfWord to store whether the character is the last character of the word or not.Set endOfWord as False.
       
  • Defining insert(root, word) function:
    • Set cur as root.
    • We iterate idx from 0 to length of word -1:
      • If the character word[idx] is not present in cur as a child, add that node and increment the childCount of cur by 1.
      • Set diff as a difference between word[idx] and ‘a’.
      • Set cur as child[diff] of cur.
    • Set endOfWord of cur as true.

 

  • Declare a TrieNode root.
  • Insert all strings of array ARR into the trie.
  • Declare a string answer to store the longest common prefix.
  • Declare a string prefix. Set prefix as ARR[0].
  • We will try to check if prefix is a common prefix or not.
  • Iterate  idx from 0 to length of prefix -1:
    • If childCount of the root is equal to 1.
      • Insert prefix[idx] into answer.
      • Set diff as a difference between prefix[idx] and ‘a’.
      • Set root as the child[diff] of the root.
    • Otherwise, if childCount of the root is not equal to 1.
      • We will break the loop.
    • If endOfWord of the root is true,
      • The search is completed, we will break the loop.
  • Return answer corresponding to longest common prefix.