LeetCode 题解工作台

单词的压缩编码

单词数组 words 的 有效编码 由任意助记字符串 s 和下标数组 indices 组成,且满足: words.length == indices.length 助记字符串 s 以 '#' 字符结尾 对于每个下标 indices[i] , s 的一个从 indices[i] 开始、到下一个 '#'…

category

4

题型

code_blocks

4

代码语言

hub

3

相关题

当前训练重点

中等 · 数组·哈希·扫描

bolt

答案摘要

题目大意:充分利用重叠的后缀,使有效编码尽可能短。 判断当前单词是否是其他单词的后缀,若是,就不用写入助记字符串中,否则需要写入并且加上一个 # 后缀。

Interview AiBox logo

Interview AiBox 实时 AI 助手,陪你讲清 数组·哈希·扫描 题型思路

试试 AiBox 面试助手arrow_forward
description

题目描述

单词数组 words有效编码 由任意助记字符串 s 和下标数组 indices 组成,且满足:

  • words.length == indices.length
  • 助记字符串 s'#' 字符结尾
  • 对于每个下标 indices[i]s 的一个从 indices[i] 开始、到下一个 '#' 字符结束(但不包括 '#')的 子字符串 恰好与 words[i] 相等

给你一个单词数组 words ,返回成功对 words 进行编码的最小助记字符串 s 的长度 。

 

示例 1:

输入:words = ["time", "me", "bell"]
输出:10
解释:一组有效编码为 s = "time#bell#" 和 indices = [0, 2, 5] 。
words[0] = "time" ,s 开始于 indices[0] = 0 到下一个 '#' 结束的子字符串,如加粗部分所示 "time#bell#"
words[1] = "me" ,s 开始于 indices[1] = 2 到下一个 '#' 结束的子字符串,如加粗部分所示 "time#bell#"
words[2] = "bell" ,s 开始于 indices[2] = 5 到下一个 '#' 结束的子字符串,如加粗部分所示 "time#bell#"

示例 2:

输入:words = ["t"]
输出:2
解释:一组有效编码为 s = "t#" 和 indices = [0] 。

 

提示:

  • 1 <= words.length <= 2000
  • 1 <= words[i].length <= 7
  • words[i] 仅由小写字母组成
lightbulb

解题思路

方法一:前缀树

题目大意:充分利用重叠的后缀,使有效编码尽可能短。

判断当前单词是否是其他单词的后缀,若是,就不用写入助记字符串中,否则需要写入并且加上一个 # 后缀。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
class Trie:
    def __init__(self) -> None:
        self.children = [None] * 26


class Solution:
    def minimumLengthEncoding(self, words: List[str]) -> int:
        root = Trie()
        for w in words:
            cur = root
            for c in w[::-1]:
                idx = ord(c) - ord("a")
                if cur.children[idx] == None:
                    cur.children[idx] = Trie()
                cur = cur.children[idx]
        return self.dfs(root, 1)

    def dfs(self, cur: Trie, l: int) -> int:
        isLeaf, ans = True, 0
        for i in range(26):
            if cur.children[i] != None:
                isLeaf = False
                ans += self.dfs(cur.children[i], l + 1)
        if isLeaf:
            ans += l
        return ans
speed

复杂度分析

指标
时间complexity is O(sum of all word lengths) because each character is processed once during reversal and set operations. Space complexity is also O(sum of all word lengths) for storing reversed words in the set.
空间O(\sum w_i)
psychology

面试官常问的追问

外企场景
  • question_mark

    Notice if the candidate correctly identifies redundant suffixes among words.

  • question_mark

    Check if they efficiently remove duplicates using hashing rather than nested loops.

  • question_mark

    Watch whether they compute the length correctly by accounting for '#' delimiters.

warning

常见陷阱

外企场景
  • error

    Failing to remove words that are suffixes of others, leading to longer encoding.

  • error

    Ignoring the '#' characters when summing the total length.

  • error

    Using nested loops to compare all word pairs, causing timeouts on large inputs.

swap_horiz

进阶变体

外企场景
  • arrow_right_alt

    Compute minimal encoding when words can include uppercase letters or digits.

  • arrow_right_alt

    Return the actual reference string s instead of just its length.

  • arrow_right_alt

    Extend to support dynamic insertion of words and updating the minimal encoding efficiently.

help

常见问题

外企场景

单词的压缩编码题解:数组·哈希·扫描 | LeetCode #820 中等