Last Updated: 1 Aug, 2020

Word Search - ll

Moderate
Asked in companies
SamsungUberOYO

Problem statement

You are given a two-dimensional grid having 'N' rows and 'M' columns, consisting of upper case characters. You are also given a word 'WORD'. You have to find the number of occurrences of that word in the grid.

Starting from a given cell, a word can be formed by moving in all eight directions: horizontally left, horizontally right, vertically up, vertically down, and four diagonal directions.

Example :
There are three occurrences of the word 'NINJA' in the following grid:

word_grid

Note :
1) Given word will not be a single character.

2) If occurrences of the given word start from two different cells, then these two occurrences are considered to be different.

Consider the word 'XYX' to be searched in the string 'AXYXB'. Here, we find an occurrence of 'XYX' at index 2 going in the right direction, and another occurrence at index 4 going in the left direction.

Although these two words correspond to the same word, they will be considered as different occurrences.
Input Format :
The first input line contains two integers, 'N' and 'M', the number of rows and columns respectively.

The next 'N' input lines representing the rows of the grid, contain a string each of length 'M'.

The next input line contains a string 'WORD'.
Output Format :
In the only output line prints the number of occurrences of the given string in the grid.
Note :
You don't have to print anything. It has already been taken care of, just implement the given function.
Constraints :
1 <= N <= 10^3
1 <= M <= 10^3
1 <= N*M <= 10^6
2 <= WORD.size <= 10^3

Time Limit : 1 sec

Approaches

01 Approach

In thie approach we try to traverse in all directions when we found a matching character.

 

Here is the algorithm :

 

  1. Let’s do this : for each cell, we first check if the character in this cell matches the first letter of the given word. If not, then there lies no point in checking all eight directions, so we go to the next cell.
  2. To move in all eight directions, we create two arrays dx[] and dy[], which corresponds to the representation of these 8 directions in the X and Y axes respectively. Our implementation becomes much easier with this, we simply have to iterate over these arrays to move in all eight directions.
  3. The rest of the implementation is simple. Keep going in a particular direction as long as the string matches. If you find a mismatch, simply break and start looking in other directions. If you reach the end of the given word, you have found an occurrence of the word in the grid, so add 1 to the answer.

02 Approach

 

  1. There exists a famous algorithm for pattern searching in a text, known as the Knuth-Morris-Pratt (in short, KMP) algorithm. Given a text of length N and a pattern of length M, it can search if the pattern exists in the given text in O(N+M) time.
  2. Before applying this algorithm, we first have to compute the prefix function of the given word. The prefix function is described as follows:

 

PI[i] : The maximum length of suffix of string[1, i], which is also a proper prefix of the string.

Here, proper prefix means that we exclude the empty string and the string itself.

 

  1. How does this help us? Let’s iterate over the text and the given word, just like we do in the naive approach to finding this pattern in the word.
  2. At a given instance, suppose we are at position ‘i’ in the text and position ‘j’ in the word. We compare these two characters. If these two match, we increment i and j by 1. Else, in case of the naive algorithm, we reset j to 0 and set i back to the next character from which the word can begin.
  3. However, now that we have the prefix function, instead of moving ‘i’ back, we can check “Where is the last index from which we can resume checking, without moving ‘i’ back?” We are, in essence, asking for the longest prefix for which the suffix is same. We move j to this index and check for the characters again. If they mismatch again, we repeat this. This very optimization allows searching for the pattern in O(N+M) time.
  4. Now that we have the algorithm figured out, all that is left is to apply it for all the rows, columns, and diagonals of the grid.
  5. Creating the ‘text’ strings for the rows and columns is simple. For the diagonals:
    • In case of upper diagonals, all these diagonals will start from the left and top boundaries of the grid.
    • In case of upper diagonals, all these diagonals will start from the left and top boundaries of the grid.
for(i=0; i<n; i++)
	for(j=0; i+j<n && j<m; j++)
		// All elements diagonally, starting from top to bottom

for(int j=1; j<m; j++)
	for(i=0; i<n && j+i<m; j++)
		// All elements diagonally, starting from left to bottom

 

for(int i=0; i<n; i++)
	for(int j=0; i-j >= 0 && j < m; j++)
		// All elements diagonally, starting from left to top


for(int j=1; j<m; j++)
	for(int i=n-1; i >= 0 && j+(n-1)-i < m; i--)
		// All elements diagonally, starting from bottom to top

 

Remember we have to check for the reverse directions as well! In order to do that, we simply apply the same algorithm to the reverse of the string.