What is the stack-based state management pattern in Tag Validator?

In the Tag Validator problem, stack-based state management is used to track the opening and closing of tags. Each time a start tag is encountered, it is pushed onto the stack, and each time an end tag is encountered, it is popped from the stack. This ensures tags are balanced and nested correctly.

How do CDATA sections affect the Tag Validator solution?

CDATA sections are treated as plain text in the Tag Validator problem. They are skipped in terms of tag validation, but the surrounding text must be properly handled by the stack. This prevents false positives or negatives in validation.

What are some common mistakes in solving Tag Validator?

Common mistakes include failing to handle unbalanced tags, incorrectly parsing CDATA sections as tags, and not properly managing the stack for nested structures.

How does stack-based parsing help with validating tags?

Stack-based parsing helps track the nested structure of tags by pushing start tags onto the stack and popping them when their corresponding end tag is found. This ensures tags are correctly opened and closed in the right order.

What is the space complexity of the Tag Validator solution?

The space complexity of the Tag Validator solution is O(n), as the stack may grow in size proportional to the number of nested tags in the input string.

#591

Hard

auto_awesome栈·状态

LeetCode 题解工作台

标签验证器

给定一个表示代码片段的字符串，你需要实现一个验证器来解析这段代码，并返回它是否合法。合法的代码片段需要遵守以下的所有规则：代码必须被合法的闭合标签包围。否则，代码是无效的。闭合标签（不一定合法）要严格符合格式： TAG_CONTENT 。其中，是起始标签，是结束标签。起始和结束标签中的…

字符串栈

题目描述

给定一个表示代码片段的字符串，你需要实现一个验证器来解析这段代码，并返回它是否合法。合法的代码片段需要遵守以下的所有规则：

代码必须被合法的闭合标签包围。否则，代码是无效的。
闭合标签（不一定合法）要严格符合格式：<TAG_NAME>TAG_CONTENT</TAG_NAME>。其中，<TAG_NAME>是起始标签，</TAG_NAME>是结束标签。起始和结束标签中的 TAG_NAME 应当相同。当且仅当 TAG_NAME 和 TAG_CONTENT 都是合法的，闭合标签才是合法的。
合法的 TAG_NAME 仅含有大写字母，长度在范围 [1,9] 之间。否则，该 TAG_NAME 是不合法的。
合法的 TAG_CONTENT 可以包含其他合法的闭合标签，cdata （请参考规则7）和任意字符（注意参考规则1）除了不匹配的<、不匹配的起始和结束标签、不匹配的或带有不合法 TAG_NAME 的闭合标签。否则，TAG_CONTENT 是不合法的。
一个起始标签，如果没有具有相同 TAG_NAME 的结束标签与之匹配，是不合法的。反之亦然。不过，你也需要考虑标签嵌套的问题。
一个<，如果你找不到一个后续的>与之匹配，是不合法的。并且当你找到一个<或</时，所有直到下一个>的前的字符，都应当被解析为 TAG_NAME（不一定合法）。
cdata 有如下格式：<![CDATA[CDATA_CONTENT]]>。CDATA_CONTENT 的范围被定义成 <![CDATA[ 和后续的第一个 ]]>之间的字符。
CDATA_CONTENT 可以包含任意字符。cdata 的功能是阻止验证器解析CDATA_CONTENT，所以即使其中有一些字符可以被解析为标签（无论合法还是不合法），也应该将它们视为常规字符。

示例 1：

输入：code = "<DIV>This is the first line <![CDATA[<div>]]></DIV>"
输出：true
解释：
代码被闭合的标签包围：<DIV> 和 </DIV>。
TAG_NAME 是合法的，TAG_CONTENT 只包含一些字母和 cdata。
尽管 CDATA_CONTENT 有一个非法 TAG_NAME 的未匹配开始标签，它可以被认为是普通文本，不被解析为一个标签。
所以 TAG_CONTENT 是合法的，并且代码是合法的。因此返回 true。

示例 2：

输入：code = "<DIV>>>  ![cdata[]] <![CDATA[<div>]>]]>]]>>]</DIV>"
输出：true
解释：
我们首先将代码分割为：start_tag|tag_content|end_tag。
start_tag -> "<DIV>"
end_tag -> "</DIV>"
tag_content 也可以被分割为：text1|cdata|text2。
text1 -> ">>  ![cdata[]] "
cdata -> "<![CDATA[<div>]>]]>"，其中 CDATA_CONTENT 是 "<div>]>"
text2 -> "]]>>]"
start_tag 不是 "<DIV>>>" 的原因是规则 6。
cdata 不是 "<![CDATA[<div>]>]]>]]>" 的原因是规则 7。

示例 3：

输入：code = "<A>  <B> </A>   </B>"
输出：false
解释：不平衡。如果 "<A>" 是闭合的，那么 "<B>" 一定没有匹配，反之亦然。

提示：

1 <= code.length <= 500
code 只包含英文字母，数字，'<'，'>'，'/'，'!'，'['，']'，'.' 和 ' '。

lightbulb

解题思路

方法一：栈模拟

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

class Solution:
    def isValid(self, code: str) -> bool:
        def check(tag):
            return 1 <= len(tag) <= 9 and all(c.isupper() for c in tag)

        stk = []
        i, n = 0, len(code)
        while i < n:
            if i and not stk:
                return False
            if code[i : i + 9] == '<![CDATA[':
                i = code.find(']]>', i + 9)
                if i < 0:
                    return False
                i += 2
            elif code[i : i + 2] == '</':
                j = i + 2
                i = code.find('>', j)
                if i < 0:
                    return False
                t = code[j:i]
                if not check(t) or not stk or stk.pop() != t:
                    return False
            elif code[i] == '<':
                j = i + 1
                i = code.find('>', j)
                if i < 0:
                    return False
                t = code[j:i]
                if not check(t):
                    return False
                stk.append(t)
            i += 1
        return not stk

speed

复杂度分析

指标	值
时间	Regular Expressions are/can be implemented in the form of finite-state machines
空间	O(n)

psychology

面试官常问的追问

外企场景

question_mark
Look for an understanding of stack-based state management and how it applies to nested tag parsing.
question_mark
Expect the candidate to handle edge cases like unbalanced tags and CDATA sections.
question_mark
The candidate should efficiently manage space and time complexity while maintaining correct validation logic.

warning

常见陷阱

外企场景

error
Mismanaging CDATA sections by treating them as regular tags, leading to incorrect validation.
error
Failing to properly close tags or mismatching start and end tags, causing the solution to return an incorrect result.
error
Not efficiently handling nested tags or not using the stack effectively to track tag states.

swap_horiz

进阶变体

外企场景

arrow_right_alt
Instead of using a stack, implement the solution using a recursive approach to track the opening and closing of tags.
arrow_right_alt
Handle different tag formats or allow for more complex tag content within the code.
arrow_right_alt
Extend the problem to handle other types of embedded content, such as script tags or special HTML entities.

help

常见问题

外企场景

继续练习

#678 有效的括号字符串 #488 祖玛游戏 #726 原子的数量