LeetCode Problem Workspace

Word Subsets

Word Subsets is solved by merging words2 into one max-frequency requirement, then scanning words1 against that fixed letter profile.

category

3

Topics

code_blocks

6

Code langs

hub

3

Related

Practice Focus

Medium · Array scanning plus hash lookup

bolt

Answer-first summary

Word Subsets is solved by merging words2 into one max-frequency requirement, then scanning words1 against that fixed letter profile.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Array scanning plus hash lookup

Try AiBox Copilotarrow_forward

For Word Subsets, the key shortcut is to avoid checking every word in words2 against every candidate in words1. Instead, compress words2 into one 26-letter maximum frequency requirement, then count each word in words1 once and keep it only if it satisfies that merged profile. This turns repeated subset checks into a single array comparison per candidate and cleanly handles multiplicity like needing two e characters instead of one.

Problem Statement

You get two lowercase string arrays, words1 and words2. A word from words1 qualifies only when it contains every letter requirement imposed by every word in words2, including repeated letters. That means a check is not just about presence; if a word in words2 needs two c characters, the candidate in words1 must also provide at least two c characters.

The useful observation is that words2 can be merged into one global requirement by taking the maximum count needed for each letter across all its words. After that, each word in words1 is universal exactly when its own letter counts meet or exceed that combined requirement. On the sample with amazon, apple, facebook, google, and leetcode against ["e","o"], only facebook, google, and leetcode contain both required letters.

Examples

Example 1

Input: words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["e","o"]

Output: ["facebook","google","leetcode"]

Example details omitted.

Example 2

Input: words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["lc","eo"]

Output: ["leetcode"]

Example details omitted.

Example 3

Input: words1 = ["acaac","cccbb","aacbb","caacc","bcbbb"], words2 = ["c","cc","b"]

Output: ["cccbb"]

Example details omitted.

Constraints

  • 1 <= words1.length, words2.length <= 104
  • 1 <= words1[i].length, words2[i].length <= 10
  • words1[i] and words2[i] consist only of lowercase English letters.
  • All the strings of words1 are unique.

Solution Approach

Collapse words2 into one letter demand

Count letter frequencies for every word in words2, but do not keep them separately. For each of the 26 lowercase letters, store the maximum frequency seen across all words2 entries. This is the core reduction for Word Subsets: if one word needs two o characters and another needs one e, every universal word only has to satisfy that merged max-count profile, not each source word independently.

Scan each word in words1 once

Build a 26-slot frequency array for each candidate in words1. Compare it against the merged requirement from words2. If every required letter count is met or exceeded, append that word to the answer. Because each word length is at most 10, the per-word counting step is small and predictable, which makes array scanning cleaner than heavier map-based logic here.

Why this avoids the brute-force trap

The slow failure mode is checking every word in words2 against every word in words1 and recounting letters for each pair. That repeats the same requirement work many times. By turning words2 into one fixed profile first, the algorithm replaces many subset tests with one direct frequency comparison per candidate, which is exactly the right trade-off for short lowercase strings and large array counts.

Complexity Analysis

Metric Value
Time O(\mathcal{A} + \mathcal{B})
Space O(1)

Let A be the total number of characters across words1 and B be the total number of characters across words2. Building the merged requirement costs O(B), and scanning all candidates in words1 costs O(A), so the total time is O(A + B). The extra space is O(1) because the algorithm uses fixed-size 26-letter counting arrays rather than structures that grow with input size.

What Interviewers Usually Probe

  • They want you to notice that multiple words in words2 can be merged by per-letter maximum frequency, not intersected or summed blindly.
  • They are testing whether you handle multiplicity correctly, such as requiring two copies of a letter when one word in words2 demands it.
  • They expect a fixed-size character counting solution, since lowercase English letters make 26-slot arrays simpler and faster than generic hash maps.

Common Pitfalls or Variants

Common pitfalls

  • Summing frequencies across all words2 instead of taking the maximum per letter, which overstates the requirement and rejects valid universal words.
  • Treating subset as plain character presence and forgetting multiplicity, so words that need repeated letters pass when they should fail.
  • Rechecking each words2 entry against every words1 word, which keeps the logic correct but misses the intended compression step for this problem.

Follow-up variants

  • Return the count of universal words instead of the list, while keeping the same merged requirement logic.
  • Support an online stream of words1 candidates, where the precomputed words2 requirement stays fixed and each new word is checked once.
  • Generalize beyond lowercase English letters, which keeps the same idea but replaces the 26-slot array with a dynamic frequency map.

FAQ

What is the main trick in Word Subsets?

The main trick is to combine all words2 requirements into one 26-letter maximum frequency array. Once that profile is built, every word in words1 is checked only once against it.

Why do we take the maximum frequency per letter instead of adding counts?

A universal word must satisfy every single word in words2, not concatenate them all. Taking the maximum per letter captures the strongest requirement any one word imposes, while adding counts would incorrectly demand letters from different words at the same time.

Is array scanning plus hash lookup really the right pattern for this problem?

Yes, but for lowercase English letters the hash lookup part is usually implemented as a fixed 26-slot frequency array. The pattern is: preprocess the requirement once, then scan candidates and compare counts directly.

How does the example with ["e","o"] produce ["facebook","google","leetcode"]?

The merged requirement is simply at least one e and one o. Facebook, google, and leetcode meet both counts, while amazon and apple fail because each misses one of the required letters.

Can I solve Word Subsets with sorting or set intersection?

Those approaches usually break on multiplicity. This problem is about exact letter counts, so frequency arrays are the safest way to preserve requirements like two c characters or two o characters.

terminal

Solution

Solution 1: Counting

Traverse each word `b` in `words2`, count the maximum occurrence of each letter, and record it as `cnt`.

1
2
3
4
5
6
7
8
9
10
11
12
13
class Solution:
    def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
        cnt = Counter()
        for b in words2:
            t = Counter(b)
            for c, v in t.items():
                cnt[c] = max(cnt[c], v)
        ans = []
        for a in words1:
            t = Counter(a)
            if all(v <= t[c] for c, v in cnt.items()):
                ans.append(a)
        return ans
Word Subsets Solution: Array scanning plus hash lookup | LeetCode #916 Medium