Last Updated: 12 Apr, 2021

Double Strings

Hard
Asked in company
alibaba

Problem statement

You are given a string ‘STR’, containing only lowercase English letters.

Your task is to determine the total number of distinct substrings present in ‘STR’ which are double strings.

Note :
1. A double string is a string that is made by concatenation of some string with itself.

2. The length of a double string is at least 2.

For example :

In the string ‘ababab’, there are just two substrings, that are double strings. Both such substrings are ‘abab’, and thus there exists only one distinct substring which is a double string. ‘abab’ can be written as ‘ab’ + ‘ab’. But ‘ababab’ can not be considered as a double string because ‘ab’ has been concatenated twice with itself. 
Input Format :
The first line contains an integer ‘T’, which denotes the number of test cases to be run. Then, the T test cases follow. 

The first line of each test case contains the string ‘STR’.
Output Format :
For each test case, print a single integer, denoting the total number of distinct substrings which are also double strings.

Output for each test case will be printed in a separate line.
Note :
You do not need to print anything. It has already been taken care of. Just implement the given function.
Constraints :
1 <= T <= 5
1 <= |STR| < 2000
Where ‘T’ denotes the number of test cases and |STR| represents the length of the string ‘STR’

Time Limit: 1sec

Approaches

01 Approach

Approach:

The approach of this problem is to find out all substrings of even length because for any string to be a double string, it must be of even length. For checking whether a substring is a double string or not, simply check whether all the characters in the first half of the string are equal to the corresponding characters in the second half or not. 

 

Steps:

  1. Declare an unordered set, say allDistinctStrings, to store the distinct substrings.
  2. Store the size of the string ‘STR’ as strLen.
  3. Run a loop from i=0 to i<strLen-1, and do:
    1. Run another loop for every even value of len from len=2 to len<=strLen-i, and do:
      1. Define substring as the substring starting at ‘ith’ index having a length ‘len’. Formally, make substring = s.substr(i,len).
      2. Now, if isDoubleString(substring == true), this means that the substring starting at index ‘i’ having a length of ‘len’ is a double string. Hence, we insert it into the set allDistinctStrings.
  4. Finally, the total number of distinct double substrings will be equal to the size of this set as this set does not contain any duplicate string. So, we return the size of this set.

void isDoubleString(string str) {

  1. Define mid as str.size /2 .
  2. Run a loop from i=0 to i<mid, and do:
    1. If the character at ith index in the first half of the string is not equal to the ith index in the second half, then it is not a double string. Formally, if str[i] != str[mid+i], return false.
  3. If all the corresponding characters match, return true as this is a double string.

02 Approach

Approach:

  1. The approach of this problem is to use dynamic programming to find out the length of the longest common subsequence among the substrings starting at any two indices.
  2. So, if the length of the longest common subsequence among the substring starting at index ‘i’ and the substring starting at index ‘j’ is greater or more than j - i, then it means that the substring starting at index ‘i’, having a length of j-i is equal to the substring starting at index ‘j’, having a length of j-i.
  3. This means that the string starting at index ‘i’, having a length of 2*(j-i) is a double string.
  4. Thus, we can insert this string into the set to find the count of all distinct double strings.

 

Steps:

  1. Declare an unordered set, say allDistinctStrings, to store the distinct substrings.
  2. Store the size of the string ‘STR’ as strLen.
  3. Define a 2D vector, say DP of size (N * N), to store the length of the longest common subsequence among the substrings starting at any two indices.
  4. Run a loop from j=strLen-2 to j>=0, and another loop from i=j-1 to i>=0, and do:
    1. If the character at ‘ith’ index is the same as the character at ‘jth’ index, do:
      1. If j is the last index, then the length of the longest common subsequence will be 1.
      2. Else, make DP[i][j] = DP[i+1][j+1] + 1.
    2. Now, if DP[i][j] >= j-i, it means that the string starting at index ‘i’, having a length of 2*(j-i) is a double string. So, we insert this string into allDistinctStrings.
  5. Finally, we return the size of the set, allDistinctStrings.

 

03 Approach

The idea is to use Rolling Hash for this solution.  We will select substrings from the given string and find out the hash value using rolling hash algorithm. If hash value of two substrings are equal, it means the strings are equal and we can make the desired string using the two substrings.

 

 The steps for this are as follows:

  • Select 2 Prime numbers, let say, 31 and other big prime numbers be 1000000009 and store them in variables, say ‘p’ and ‘m’.
  • Pre calculate the power of ‘p’ upto length of the of the given input string ‘str’.
  • Declare a set of integers ‘ans’, which stores all the strings that can be written as the concatenation of some string with itself. The reason for using a set is that we won’t have the same occurrence of the same substring.
  • Iterate from i = 2 to i = n(This indexing ensures that we have a non-empty string every single time):
    • Now make all the first half of the final string i.e., from j = 0 to j = i / 2 (here a non-empty substring is ensured).
    • In order to store it to find if this string repeats or not, calculate the hash value for the substring and store it in a variable  ‘hash1’.
    • The other string will be taken from index j = i / 2 to j < i.
    • For this substring, calculate the hash value and store it in a variable ‘hash2’.
    • If the value of hash1 is equal to hash2, it means both the substring are equal and the final string that would be made by concatenating the first substring and second substring can be written by concatenating both the substrings.
    • Insert the hash value of the first substring in the set ‘ans’ so that we won’t have the same occurrence of the same substring in case of repetition in the given string.
  • Return the size of the set ‘ans’ as the final answer as it contains all the substrings that can be concatenated with itself and the final string is present in the given string.