LeetCode Problem Workspace

Substring with Concatenation of All Words

Find all starting indices of substrings in a string that are concatenations of a given list of words.

category

3

Topics

code_blocks

7

Code langs

hub

3

Related

Practice Focus

Hard · Sliding window with running state updates

bolt

Answer-first summary

Find all starting indices of substrings in a string that are concatenations of a given list of words.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Sliding window with running state updates

Try AiBox Copilotarrow_forward

To solve this, use a sliding window to traverse the string, updating the state with a hash table. You need to ensure that every word in the list is accounted for in each window. Adjust window size dynamically as you move through the string to match all words concatenated.

Problem Statement

Given a string s and an array of words, find all starting indices of substrings in s that are the concatenation of all words in any permutation. All words are of the same length. Each substring in s should match a concatenated string of all the words exactly, without any extra characters or omissions.

Return an array containing the starting indices of these substrings. You may return the result in any order. The goal is to efficiently check for each possible substring in s that could match a concatenation of all the words.

Examples

Example 1

Input: s = "barfoothefoobarman", words = ["foo","bar"]

Output: [0,9]

The substring starting at 0 is "barfoo" . It is the concatenation of ["bar","foo"] which is a permutation of words . The substring starting at 9 is "foobar" . It is the concatenation of ["foo","bar"] which is a permutation of words .

Example 2

Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]

Output: []

There is no concatenated substring.

Example 3

Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]

Output: [6,9,12]

The substring starting at 6 is "foobarthe" . It is the concatenation of ["foo","bar","the"] . The substring starting at 9 is "barthefoo" . It is the concatenation of ["bar","the","foo"] . The substring starting at 12 is "thefoobar" . It is the concatenation of ["the","foo","bar"] .

Constraints

  • 1 <= s.length <= 104
  • 1 <= words.length <= 5000
  • 1 <= words[i].length <= 30
  • s and words[i] consist of lowercase English letters.

Solution Approach

Sliding Window with Hash Table

The core of the solution is using a sliding window approach. Start by calculating the size of the window based on the total length of all words concatenated. Within the window, check if the substring contains all words from the list using a hash table to count occurrences. As you slide the window one word at a time, update the hash table to maintain the current word count.

Efficient Hash Table Updates

Maintain a hash table to track the frequency of each word in the current window. As the window slides, incrementally update the table by adding the new word and removing the old word. This ensures that the check for a valid concatenation happens efficiently. The hash table will help you verify if all words are present in the window in any order.

Optimizing Window Movement

Rather than checking each possible substring separately, use the sliding window to skip over portions of the string that have already been verified. By adjusting the window size dynamically and leveraging the hash table to maintain the state of word counts, you can skip unnecessary comparisons, improving the overall efficiency of the algorithm.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

The time complexity is O(n * m), where n is the length of the string and m is the length of each word. This comes from sliding the window over the string and updating the hash table efficiently. The space complexity is O(m * k), where m is the length of each word and k is the number of words in the list, as this is the space required to store the word frequencies in the hash table.

What Interviewers Usually Probe

  • Do you understand how to handle sliding window operations efficiently in this problem?
  • Can you implement a hash table that dynamically updates word frequencies during the sliding window process?
  • Will you be able to adjust the window size based on word count frequency in real-time?

Common Pitfalls or Variants

Common pitfalls

  • Failing to update the hash table correctly when sliding the window, leading to incorrect word counts.
  • Overlooking edge cases where words may not fully match the substring even if their individual counts are correct.
  • Not optimizing the window movement, leading to unnecessary comparisons that degrade performance.

Follow-up variants

  • Substring with Concatenation of All Words: Handle edge cases with non-overlapping words.
  • Substring with Concatenation of All Words: Modify the problem to allow overlapping words.
  • Substring with Concatenation of All Words: Extend to support words of different lengths.

FAQ

What is the main algorithmic technique for this problem?

The problem is best solved using a sliding window with dynamic hash table updates to track word frequencies as the window slides through the string.

How do you handle the frequency of words in the sliding window?

Use a hash table to track the count of each word in the current window. As the window slides, update the table by adding the new word and removing the old one.

Why is the sliding window approach efficient here?

The sliding window approach allows you to check potential substrings in a dynamic, incremental manner, avoiding the need to check every substring from scratch.

How do you handle cases where there is no valid concatenation?

If no valid concatenation is found during the sliding window process, simply return an empty array to indicate no matching substrings exist.

What is the space complexity of the algorithm?

The space complexity is O(m * k), where m is the length of each word, and k is the number of words, due to the hash table storing word frequencies.

terminal

Solution

Solution 1: Hash Table + Sliding Window

We use a hash table $cnt$ to count the number of times each word appears in $words$, and use a hash table $cnt1$ to count the number of times each word appears in the current sliding window. We denote the length of the string $s$ as $m$, the number of words in the string array $words$ as $n$, and the length of each word as $k$.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        cnt = Counter(words)
        m, n = len(s), len(words)
        k = len(words[0])
        ans = []
        for i in range(k):
            l = r = i
            cnt1 = Counter()
            while r + k <= m:
                t = s[r : r + k]
                r += k
                if cnt[t] == 0:
                    l = r
                    cnt1.clear()
                    continue
                cnt1[t] += 1
                while cnt1[t] > cnt[t]:
                    rem = s[l : l + k]
                    l += k
                    cnt1[rem] -= 1
                if r - l == n * k:
                    ans.append(l)
        return ans
Substring with Concatenation of All Words Solution: Sliding window with running state upd… | LeetCode #30 Hard