You are given a string 'S' representing a code file. You have to check whether it is a valid code file or not.
The File Is Valid If-1) The file can be represented as \<TAG> CONTENT \</TAG>
2) TAG is an uppercase English letter word of length 1 to 9.
3) CONTENT may contain other valid closed tags, CDATA, letters, digits, ‘>’, ‘/’, ‘]’, ‘[’, ‘!’ and ‘ ’.
4) CONTENT does not contain any unmatched ‘<’.
5) Each start tag must have a matching end-tag. And they must be balanced.
6) CDATA can be represented as \<![CDATA[DATA\_CONTENT]]> where, DATA\_CONTENT can be any string.
7) DATA\_CONTENT is ignored and not parsed even if it contains a valid tag it is considered as a string and not as a tag.
For Example :
S = ”<SPAN>Hello <B>Ninja </B> </SPAN>”
S = start_tag | CONTENT | end_tag
start_tag=<SPAN>
CONTENT=Hello <B> Ninja</B>
end_tag=</SPAN>
CONTENT also have a closed <B> tag.
Answer is True
The first line of input contains an integer ‘T’ denoting the number of test cases to run. Then the test case follows.
The first line of each test case contains a non-empty string 'S'.
Output Format :
For each test case print “True” if it is a valid code file, else print “False” without quotes.
Output for each query is printed in a separate line.
Note :
You don’t need to print anything. It has already been taken care of. Just implement the given function.
1 <= T <= 5
1 <= N <= 10^5
Where ‘N’, is the size of String 'S' and the given string 'S' contains only letters, digits, ‘<’, ‘>’, ‘/’, ‘]’, ‘[’, ‘!’ and ‘ ’.
Time Limit: 1 sec
2
<A><![CDATA[<<</B></C>]]></A>
<A><B></A></B>
True
False
For the first test case, “<< </B> </C> ” is ignored as it is present in CDATA block. And <A> tag has a matching end-tag.
For the second case both the tags have ending tag but they are mismatched.
2
<X></X><Z></Z>
<![CDATA[<<</B></C>]]>
False
False
Can we use the stack to store all the start tags? And iterate through String and ignore all the CDATA blocks ??
We will maintain a stack of all the start tags. And iterate through ‘S’. If S[i]== ‘<’ than it must be a start of a tag, and S[i+1]==’/’ then it must be an end of the tag. If S[i]= ‘<’ and S[i+1] = ‘!’ it is a CDATA block and we will ignore it. All the content must be wrapped in some tag.
If for any ‘<’ we are not able to find ‘>’ it is an invalid file. or tag name is not valid or the end tag is missing for the start tag then it is an invalid file.
The algorithm will be-
O(N) where ‘N’ is the size of given string S.
As every position will be visited at most once the time complexity will be O(N).
O(N) where ‘N’ is the size of given string S.
Since space is required to store the stack of the start tags.