Tag Validator

Moderate
0/80
Average time to solve is 20m
profile
Contributed by
3 upvotes
Asked in companies
UberAmdocs

Problem statement

You are given a string 'S' representing a code file. You have to check whether it is a valid code file or not.

The File Is Valid If-
1) The file can be represented as \<TAG> CONTENT \</TAG>

2) TAG is an uppercase English letter word of length 1 to 9.

3) CONTENT may contain other valid closed tags, CDATA, letters, digits, ‘>’, ‘/’, ‘]’, ‘[’, ‘!’  and ‘ ’.

4) CONTENT does not contain any unmatched ‘<’.

5) Each start tag must have a matching end-tag. And they must be balanced.

6) CDATA can be represented as \<![CDATA[DATA\_CONTENT]]> where, DATA\_CONTENT can be any string. 

7) DATA\_CONTENT is ignored and not parsed even if it contains a valid tag it is considered as a string and not as a tag. 

For Example :

S = ”<SPAN>Hello <B>Ninja </B> </SPAN>”
S = start_tag | CONTENT | end_tag
start_tag=<SPAN>
CONTENT=Hello <B> Ninja</B>
end_tag=</SPAN>

CONTENT also have a closed <B> tag. 

Answer is True
Detailed explanation ( Input/output format, Notes, Images )
Input Format :
The first line of input contains an integer ‘T’ denoting the number of test cases to run. Then the test case follows.

The first line of each test case contains a non-empty string 'S'.
Output Format :
For each test case print “True” if it is a valid code file, else print “False” without quotes.

Output for each query is printed in a separate line.
Note :
You don’t need to print anything. It has already been taken care of. Just implement the given function.
Constraints :
1 <= T <= 5
1 <= N <= 10^5 

Where ‘N’, is the size of String 'S' and the given string 'S' contains only letters, digits, ‘<’, ‘>’, ‘/’, ‘]’, ‘[’, ‘!’  and ‘ ’.

Time Limit: 1 sec
Sample Input 1 :
2
<A><![CDATA[<<</B></C>]]></A>
<A><B></A></B>
Sample Output 1 :
True
False
Explanation Of Sample Input 1 :
For the first test case,  “<< </B> </C> ” is ignored as it is present in CDATA block. And <A> tag has a matching end-tag.

For the second case both the tags have ending tag but they are mismatched.
Sample Input 2 :
2
<X></X><Z></Z>
<![CDATA[<<</B></C>]]> 
Sample Output 2 :
False
False
Hint

Can we use the stack to store all the start tags? And iterate through String and ignore all the CDATA blocks ??

Approaches (1)
Stack

We will maintain a stack of all the start tags. And iterate through ‘S’. If S[i]== ‘<’ than it must be a start of a tag, and S[i+1]==’/’ then it must be an end of the tag. If S[i]= ‘<’ and S[i+1] = ‘!’ it is a CDATA block and we will ignore it. All the content must be wrapped in some tag. 

If for any ‘<’ we are not able to find ‘>’ it is an invalid file. or tag name is not valid or the end tag is missing for the start tag then it is an invalid file.

The algorithm will be-

  1. We initially initialize i to 0 and N to the length of S.
  2. We also make use of the data structure ‘stack’. Initially, stack will be empty.
  3. Now While i < N
    1. If S[i]= ‘<’
      1. If it is a end tag
        1. If not valid tag stack.back()!=tag()
          1. Return False
        2. Else, we pop the element of the stack
      2. If it is a CDATA block
        1. If not a valid block()
          1. Return False
        2. Ignore till the end of block
      3. If it is a start tag
        1. If not valid tag
          1. Return False
        2. stack.Push(tag)
    2. ELSE
      1. If stack.empty()
        1. Return False
    3. If i<N and stack.empty()
      1. Return False
  4. We finally return ‘True’.
Time Complexity

O(N)  where ‘N’ is the size of given string S.

 

As every position will be visited at most once the time complexity will be O(N).

Space Complexity

O(N)  where ‘N’ is the size of given string S.

 

Since space is required to store the stack of the start tags.

Code Solution
(100% EXP penalty)
Tag Validator
Full screen
Console