#792

Medium

Array scanning plus hash lookup

LeetCode Problem Workspace

Number of Matching Subsequences

Given a string s and a list of words, count how many words are subsequences of s using efficient array scanning and hash lookup.

array hash table string binary search dynamic programming

Problem Statement

You are given a string s and a list of strings words. Your task is to return the number of strings from words that are subsequences of the string s.

A subsequence of a string is a sequence that can be derived by deleting some or no characters of the string without changing the relative order of the remaining characters.
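The definition above can be checked directly for a single word. The following is a minimal sketch, not the page's listed solution; the helper name `is_subsequence` is an assumption introduced for illustration.

```python
def is_subsequence(word: str, s: str) -> bool:
    """Return True if word can be obtained from s by deleting characters."""
    it = iter(s)
    # `ch in it` consumes the iterator until it finds ch (or runs out),
    # so each later character is searched only in the remainder of s.
    return all(ch in it for ch in word)


if __name__ == "__main__":
    print(is_subsequence("ace", "abcde"))
    print(is_subsequence("bb", "abcde"))
```

Running this check once per word is the brute-force baseline that the approaches below try to improve on.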

Examples

Example 1

Input: s = "abcde", words = ["a","bb","acd","ace"]

Output: 3

There are three strings in words that are a subsequence of s: "a", "acd", "ace".

Example 2

Input: s = "dsahjpjauf", words = ["ahjpjau","ja","ahbwzgqnuk","tnmlanowax"]

Output: 2

There are two strings in words that are a subsequence of s: "ahjpjau", "ja".

Constraints

1 <= s.length <= 5 * 10^4
1 <= words.length <= 5000
1 <= words[i].length <= 50
s and words[i] consist of only lowercase English letters.

Solution Approach

Efficient Array Scanning with Hash Lookup

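To solve this problem efficiently, build a hash table that maps each character of s to the list of indices where it occurs. Then scan each word in the words list: for every character, look up its index list and pick an occurrence that comes after the previously matched position. The word is a subsequence of s only if every character finds such an occurrence.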

Using Binary Search for Optimized Search

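A more optimized approach stores, for each character, the sorted list of indices where it occurs in s, and uses binary search on that list to find the first occurrence after the previously matched position. Each character of a word then costs one hash lookup plus one binary search, instead of a rescan of s.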
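One way the binary search idea could look in Python is sketched below. The function name `count_matching_bisect` and the `positions` map are assumptions made for this example; they do not come from the listed solution.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List


def count_matching_bisect(s: str, words: List[str]) -> int:
    # positions[c] lists the indices where c occurs in s, in increasing order.
    positions: Dict[str, List[int]] = defaultdict(list)
    for i, c in enumerate(s):
        positions[c].append(i)

    def matches(word: str) -> bool:
        prev = -1  # index in s of the previously matched character
        for c in word:
            idx_list = positions.get(c)
            if not idx_list:
                return False  # c never occurs in s
            k = bisect_right(idx_list, prev)  # first occurrence strictly after prev
            if k == len(idx_list):
                return False  # no occurrence of c is left
            prev = idx_list[k]
        return True

    return sum(matches(w) for w in words)
```

Taking the earliest usable occurrence each time is safe because it leaves the largest possible remainder of s for the rest of the word.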

Trie-based Approach for Faster Matching

Alternatively, a trie can be built from all words so that words sharing a prefix are matched against s only once for that prefix. Walk the trie while tracking the position reached in s, and count every word whose end node can be reached.
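A rough sketch of the trie idea follows. The `TrieNode` class and the function name `count_matching_trie` are hypothetical, and `str.find` is used for the next-occurrence step only to keep the example short; a position table or binary search could replace it.

```python
from typing import Dict, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.count = 0  # number of words that end at this node


def count_matching_trie(s: str, words: List[str]) -> int:
    # Build the trie; duplicate words raise the count on the same node.
    root = TrieNode()
    for w in words:
        node = root
        for c in w:
            node = node.children.setdefault(c, TrieNode())
        node.count += 1

    total = 0
    stack = [(root, 0)]  # (trie node, index in s where the search resumes)
    while stack:
        node, start = stack.pop()
        total += node.count  # every word ending here has been matched
        for c, child in node.children.items():
            i = s.find(c, start)  # earliest occurrence of c at or after start
            if i != -1:
                stack.append((child, i + 1))
    return total
```

Each trie node is visited at most once, so a prefix shared by many words is matched against s a single time.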

Complexity Analysis

Metric	Value
Time	Depends on the final approach
Space	Depends on the final approach

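The time complexity depends on the approach. Let N be the length of s, W the number of words, and M the total length of all words. Bucketing words by the character they are waiting for and scanning s once takes O(N + M) when each word is tracked by an index; re-slicing the word at every step, as the listed solution does, adds at most a factor of the word length (at most 50). Precomputing the positions of each character in s and binary searching them takes O(N + M log N). Brute-force scanning of s for every word takes O(N * W). The space complexity is O(M) for the buckets or the trie, and O(N) for the position lists.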

What Interviewers Usually Probe

The candidate demonstrates an understanding of subsequence matching and optimizes using hash maps or binary search.
The candidate is able to apply different data structures, such as hash maps, binary search, or tries, to optimize subsequence search.
The candidate is able to reduce time complexity efficiently and handle large input sizes within constraints.

Common Pitfalls or Variants

Common pitfalls

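Brute-force checking rescans s for every word, which costs O(N * W) for W words and can be too slow for the largest inputs.
Not using an efficient data structure like a hash table or trie might result in slow performance for larger inputs.
Forgetting that order and multiplicity matter: a word whose characters all appear in s is not a subsequence unless they appear in the same relative order, and a repeated character needs a separate position in s (for example, "bb" is not a subsequence of "abcde").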

Follow-up variants

Use dynamic programming to store partial subsequences for more complex scenarios.
Instead of checking each word individually, preprocess the input string s to speed up subsequence checks.
Modify the approach to handle multiple strings where each has different character sets or sequences.
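The preprocessing variant above can be sketched with a next-occurrence table. The names `build_next_table` and `count_matching_table` are assumptions for illustration, and the table assumes lowercase English letters, as the constraints state.

```python
from typing import List


def build_next_table(s: str) -> List[List[int]]:
    n = len(s)
    # nxt[i][k] = smallest j >= i with s[j] == chr(k + ord('a')), or n if none.
    nxt = [[n] * 26 for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        nxt[i] = nxt[i + 1][:]  # start from the row for position i + 1
        nxt[i][ord(s[i]) - ord("a")] = i
    return nxt


def count_matching_table(s: str, words: List[str]) -> int:
    n = len(s)
    nxt = build_next_table(s)
    count = 0
    for w in words:
        pos = 0  # next index of s that may still be used
        for ch in w:
            pos = nxt[pos][ord(ch) - ord("a")]
            if pos == n:
                break  # ch does not occur in the remainder of s
            pos += 1
        else:
            count += 1  # loop finished without break: w is a subsequence
    return count
```

The table costs O(26 * N) time and memory to build, after which each word is checked in time proportional to its own length.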

FAQ

What is the primary approach for solving the Number of Matching Subsequences problem?

The primary approach is to combine array scanning with hash lookup, allowing for efficient subsequence matching in string s.

How do you avoid brute-force solutions in this problem?

You avoid brute force by using a hash table or binary search, significantly reducing the time complexity when checking if a word is a subsequence.

Can this problem be solved with a trie?

Yes, a trie-based approach can be used to store the words and scan string s, offering faster subsequence matching.

What are the time and space complexities for this problem?

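With words bucketed by the next character they need and a single scan of s, the time complexity is O(N + M), where N is the length of s and M is the total length of all words. With per-character position lists and binary search it is O(N + M log N). The space complexity is O(M) for the buckets.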

What is the key pattern in solving this problem?

The key pattern is using efficient array scanning combined with hash lookup to determine subsequences in a time-efficient manner.

Solution

Solution 1

#### Python3

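from collections import defaultdict, deque
from typing import List


class Solution:
    def numMatchingSubseq(self, s: str, words: List[str]) -> int:
        # d[c] holds the remaining suffix of every word whose next needed character is c.
        d = defaultdict(deque)
        for w in words:
            d[w[0]].append(w)
        ans = 0
        for c in s:
            # Process only the entries present before this step; a suffix
            # re-queued under the same character must wait for a later c.
            for _ in range(len(d[c])):
                t = d[c].popleft()
                if len(t) == 1:
                    ans += 1  # last character matched: the word is a subsequence
                else:
                    d[t[1]].append(t[1:])  # now wait for the next character
        return ans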
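The solution above copies a suffix with `t[1:]` at every step. A possible variant, sketched here under the assumption that storing `(word index, character index)` pairs is acceptable, avoids those copies; the name `count_matching_buckets` is hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple


def count_matching_buckets(s: str, words: List[str]) -> int:
    # waiting[c] holds (word index, char index) pairs whose next needed character is c.
    waiting: Dict[str, Deque[Tuple[int, int]]] = defaultdict(deque)
    for wi, w in enumerate(words):
        waiting[w[0]].append((wi, 0))

    count = 0
    for c in s:
        bucket = waiting[c]
        # Fix the number of iterations up front so that entries re-queued
        # under the same character wait for a later occurrence of c.
        for _ in range(len(bucket)):
            wi, j = bucket.popleft()
            j += 1  # character j of words[wi] has just been matched
            if j == len(words[wi]):
                count += 1
            else:
                waiting[words[wi][j]].append((wi, j))
    return count
```

Each character of each word is handled once, which gives the O(N + M) bound without the extra slicing cost.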

Continue Practicing

#720 longest word in dictionary #692 top k frequent words #1048 longest string chain