What is the main trick in Word Subsets?

The main trick is to combine all words2 requirements into one 26-letter maximum frequency array. Once that profile is built, every word in words1 is checked only once against it.

Why do we take the maximum frequency per letter instead of adding counts?

A universal word must satisfy every single word in words2, not concatenate them all. Taking the maximum per letter captures the strongest requirement any one word imposes, while adding counts would incorrectly demand letters from different words at the same time.

Is array scanning plus hash lookup really the right pattern for this problem?

Yes, but for lowercase English letters the hash lookup part is usually implemented as a fixed 26-slot frequency array. The pattern is: preprocess the requirement once, then scan candidates and compare counts directly.

How does the example with ["e","o"] produce ["facebook","google","leetcode"]?

The merged requirement is simply at least one e and one o. Facebook, google, and leetcode meet both counts, while amazon and apple fail because each misses one of the required letters.

Can I solve Word Subsets with sorting or set intersection?

Those approaches usually break on multiplicity. This problem is about exact letter counts, so frequency arrays are the safest way to preserve requirements like two c characters or two o characters.

#916

Medium

auto_awesome数组·哈希·扫描

LeetCode 题解工作台

单词子集

给你两个字符串数组 words1 和 words2 。现在，如果 b 中的每个字母都出现在 a 中，包括重复出现的字母，那么称字符串 b 是字符串 a 的子集。例如， "wrr" 是 "warrior" 的子集，但不是 "world" 的子集。如果对 words2 中的每一个单词 b …

数组哈希表字符串

题目描述

给你两个字符串数组 words1 和 words2。

现在，如果 b 中的每个字母都出现在 a 中，包括重复出现的字母，那么称字符串 b 是字符串 a 的子集。

例如，"wrr" 是 "warrior" 的子集，但不是 "world" 的子集。

如果对 words2 中的每一个单词 b，b 都是 a 的子集，那么我们称 words1 中的单词 a 是 通用单词 。

以数组形式返回 words1 中所有的通用单词。你可以按 任意顺序 返回答案。

示例 1：

输入：words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["e","o"]

输出：["facebook","google","leetcode"]

示例 2：

输入：words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["lc","eo"]

输出：["leetcode"]

示例 3：

输入：words1 = ["acaac","cccbb","aacbb","caacc","bcbbb"], words2 = ["c","cc","b"]

输出：["cccbb"]

提示：

1 <= words1.length, words2.length <= 10⁴
1 <= words1[i].length, words2[i].length <= 10
words1[i] 和 words2[i] 仅由小写英文字母组成
words1 中的所有字符串 互不相同

lightbulb

解题思路

方法一：计数

遍历 words2 中的每个单词 b，统计每个字母出现的最大次数，记为 cnt。

然后遍历 words1 中的每个单词 a，统计每个字母出现的次数，记为 t。如果 cnt 中的每个字母的出现次数都不大于 t 中的出现次数，则 a 是通用单词，将其加入答案。

时间复杂度 $O(L)$ ，其中 $L$ 为 words1 和 words2 中所有单词的长度之和。

1

2

3

4

5

6

7

8

9

10

11

12

13

14

class Solution:
    def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
        cnt = Counter()
        for b in words2:
            t = Counter(b)
            for c, v in t.items():
                cnt[c] = max(cnt[c], v)
        ans = []
        for a in words1:
            t = Counter(a)
            if all(v <= t[c] for c, v in cnt.items()):
                ans.append(a)
        return ans

speed

复杂度分析

指标	值
时间	O(\mathcal{A} + \mathcal{B})
空间	O(1)

psychology

面试官常问的追问

外企场景

question_mark
They want you to notice that multiple words in words2 can be merged by per-letter maximum frequency, not intersected or summed blindly.
question_mark
They are testing whether you handle multiplicity correctly, such as requiring two copies of a letter when one word in words2 demands it.
question_mark
They expect a fixed-size character counting solution, since lowercase English letters make 26-slot arrays simpler and faster than generic hash maps.

warning

常见陷阱

外企场景

error
Summing frequencies across all words2 instead of taking the maximum per letter, which overstates the requirement and rejects valid universal words.
error
Treating subset as plain character presence and forgetting multiplicity, so words that need repeated letters pass when they should fail.
error
Rechecking each words2 entry against every words1 word, which keeps the logic correct but misses the intended compression step for this problem.

swap_horiz

进阶变体

外企场景

arrow_right_alt
Return the count of universal words instead of the list, while keeping the same merged requirement logic.
arrow_right_alt
Support an online stream of words1 candidates, where the precomputed words2 requirement stays fixed and each new word is checked once.
arrow_right_alt
Generalize beyond lowercase English letters, which keeps the same idea but replaces the 26-slot array with a dynamic frequency map.

help

常见问题

外企场景

继续练习

#929 独特的电子邮件地址 #893 特殊等价字符串组 #890 查找和替换模式