Virus Detection

Moderate
0/80
5 upvotes
Asked in companies
Goldman SachsShareChatUber

Problem statement

In a database, each stored file is named by a string of size ‘N’. But due to some error in the security, some of the files got infected by a virus. Given two strings ‘A’ and ‘B’ and a virus string ‘C’, Your task is to tell the number of files that are not affected by the virus. An unaffected file follows the given conditions:

The name of the file should be greater than or equal to the string ‘A’.
The name of the file should be less than or equal to string ‘B’.
The name of the file should not contain the virus string as a substring ‘C’.
Since the answer can be very large, print answer % (10^9 + 7).
For Example
If N=2 and string ‘A’ = “bc” and string ‘B’ = “bm” and virus string ‘C’ = ”g”.So the files that are not affected are “bc”, “bd”, “be”, “bf”, “bh”, “bi”, “bj”, “bk”, “bl”.
Detailed explanation ( Input/output format, Notes, Images )
Input Format:
The first line of the input contains an integer, 'T,’ denoting the number of test cases.

The first line of each test case contains a single integer, 'N’, denoting the length of the file’s name.

The second line contains three strings corresponding to ‘A’, ’B’ and ‘C’.
Output Format:
For each test case, print an integer corresponding to the number of unaffected files  % (10^9 +7).  

Print the output of each test case in a separate line.
Note:
You do not need to print anything. It has already been taken care of. Just implement the given function.
Constraints:
1 <= T <= 10
1 <= N <= 500.
length of ‘A’ and ‘B’ == N.
1 <= length of string ‘C’ <= N .  

Time limit: 1 sec
Sample Input 1:
2
2
bc bl g
1
b f c
Sample Output 1:
9
4
Explanation of sample input 1:
For the first test case,

The name of the unaffected files that satisfies the given conditions are:
“bc”, “bd”, “be”, “bf”, “bh”, “bi”, “bj”, “bk”, “bl”.

Hence, the answer is 9. 

For the second test case:

The name of the unaffected files that satisfies the given conditions are:
“b”, “d”, “e”, “f”.

Hence, the answer is 4.
Sample Input 2:
2
2
ac cd x
2
cd da d
Sample Output 2:
52
22
Hint

Try to find each possible file name and check if it is affected or not.

Approaches (2)
Brute Force

In this approach, we will declare a recursive function HELPER(‘CUR’,’ TARGET’,’ VIRUS’) that returns the number of unaffected files having a name lexicographically less than or equal to ‘TARGET’.

HELPER() function will create all possible strings and check if it’s affected or not.

We will first call HELPER() function as ‘TARGET’ as ‘B’ and store it in ‘ANS’.

Then we will update ‘ANS’ as ‘ANS’ - HELPER(“”,’ A’, ’C’).

At last, we will check if ‘A’ is affected or not and increment ‘ANS’ by 1 accordingly.

 

Algorithm:

  • Defining function HELPER(‘CUR’, ‘TARGET’,’ VIRUS’)
    • If the length of ‘CUR’ is equal to  ‘TARGET, do the following:
      • If ‘CUR’ is greater than ‘TARGET’:
        • Return 0.
      • Check if the ‘CUR’ string is virus affected or not.
      • For i in range 0 to (length of ‘CUR’ - length of ‘VIRUS’ -1), do the following:
        • Set j as 0.
        • While j<length of ‘VIRUS’ and ‘CUR’[ i+j ]==’VIRUS’[j]:
          • Set j as j+1
        • If j is equal to the length of ‘VIRUS’ it implies that virus is found:
          • Return 0.
      • Return 1. (Found an unaffected file).
    • Else:
      • Set ‘ANS’ as 0.
      • For ‘i’ in range 0 to 25:
        • Set ‘ANS’ as ( ANS + HELPER(‘CUR’+ (‘a’ + i ),’ TARGET’,’VIRUS’) ) % (10^9 + 7).
      • Return ‘ANS’.
  • Set ‘ANS’ as HELPER(““,’B’,’C’).
  • Set ‘ANS’ as ( ‘ANS’ - HELPER(“”,’A’,’C’) )% (10^9 + 7).
  • Check if string ‘A’ is virus affected or not.
  • Set ‘TEMP’ as 1.
  • For i in range 0 to (length of ‘A’ - length of ‘C’ -1), do the following:
    • Set j as 0.
    • While j<length of ‘C’ and ‘A’[ i+j ]==’C’[j]:
      • Set j as j+1
    • If j is equal to the length of ‘C’ it implies that virus is found:
      • Set temp as 0.
      • Break this for loop.
  • Set ‘ANS’ as ( ‘ANS’ + ‘TEMP’) % (10^9 + 7).
  • Return ‘ANS’.
Time Complexity

O((26^N)*(N*M)), where ‘N’ is the length of string ‘A’ and ‘M’ is the length of virus string.

 

In this approach, we are checking the file is virus affected or not for every possible file name.

There can be a total of 26^N file names and we are also checking whether that is affected or not by matching its every character with string ‘C’ that takes (N*M) time. Hence the overall time complexity is O((26^N)*(N*M)).

Space Complexity

O(26^N), where ‘N’ is the length of the string ‘A’.

 

In this approach, in the worst case, there will be (26^N) recursive calls of function HELPER(). So the stack memory used is of the order (26^N). Hence the overall space complexity is O(26^N).

Code Solution
(100% EXP penalty)
Virus Detection
Full screen
Console