LeetCode Problem Workspace

Find Substring With Given Hash Value

Locate the first substring of length k whose rolling hash matches the given hashValue using a sliding window approach.

category

4

Topics

code_blocks

6

Code langs

hub

3

Related

Practice Focus

Hard · Sliding window with running state updates

bolt

Answer-first summary

Locate the first substring of length k whose rolling hash matches the given hashValue using a sliding window approach.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Sliding window with running state updates

Try AiBox Copilotarrow_forward

To solve this problem, iterate over the string with a sliding window of length k while maintaining the rolling hash efficiently. Instead of recomputing the hash from scratch for each substring, update it by removing the oldest character and adding the new character at each step. Return the first substring where the computed hash equals hashValue, ensuring constant time updates to meet performance constraints.

Problem Statement

Given a string s of lowercase English letters and integers power, modulo, k, and hashValue, compute a rolling hash function defined as hash(sub) = (val(sub[0]) * power^0 + val(sub[1]) * power^1 + ... + val(sub[k-1]) * power^(k-1)) % modulo, where val('a') = 1 and val('z') = 26.

Return the first substring of length k in s such that its hash equals hashValue. The input guarantees that at least one such substring exists, and efficient iteration using a sliding window with running hash updates is required to handle large strings.

Examples

Example 1

Input: s = "leetcode", power = 7, modulo = 20, k = 2, hashValue = 0

Output: "ee"

The hash of "ee" can be computed to be hash("ee", 7, 20) = (5 * 1 + 5 * 7) mod 20 = 40 mod 20 = 0. "ee" is the first substring of length 2 with hashValue 0. Hence, we return "ee".

Example 2

Input: s = "fbxzaad", power = 31, modulo = 100, k = 3, hashValue = 32

Output: "fbx"

The hash of "fbx" can be computed to be hash("fbx", 31, 100) = (6 * 1 + 2 * 31 + 24 * 312) mod 100 = 23132 mod 100 = 32. The hash of "bxz" can be computed to be hash("bxz", 31, 100) = (2 * 1 + 24 * 31 + 26 * 312) mod 100 = 25732 mod 100 = 32. "fbx" is the first substring of length 3 with hashValue 32. Hence, we return "fbx". Note that "bxz" also has a hash of 32 but it appears later than "fbx".

Constraints

  • 1 <= k <= s.length <= 2 * 104
  • 1 <= power, modulo <= 109
  • 0 <= hashValue < modulo
  • s consists of lowercase English letters only.
  • The test cases are generated such that an answer always exists.

Solution Approach

Initialize Rolling Hash

Start by computing the hash for the first k-length substring. Convert each character to its alphabet index and apply the rolling hash formula. This initial value sets up the sliding window and allows subsequent updates without full recomputation.

Slide Window with Efficient Updates

Move the window one character at a time. Subtract the contribution of the outgoing character, multiply or adjust by power, and add the new character's weighted value modulo the given number. This approach avoids recalculating the entire hash for each substring.

Check Hash and Return Substring

At each step, compare the updated hash to hashValue. Once a match is found, return the corresponding substring immediately. This guarantees the first substring with the required hash is returned, and processing stops early for efficiency.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

Time complexity is O(n) because each character is processed once during the sliding window pass. Space complexity is O(1) beyond input storage since only the rolling hash and a few variables are maintained.

What Interviewers Usually Probe

  • Notice the need for updating the hash efficiently rather than recalculating each substring.
  • Consider the influence of power and modulo on preventing integer overflow.
  • Be ready to discuss why processing the string from right to left can simplify modular arithmetic in rolling hashes.

Common Pitfalls or Variants

Common pitfalls

  • Recomputing the hash from scratch for every substring instead of updating it incrementally.
  • Incorrectly handling modulo operations which can yield negative values.
  • Misindexing characters in the rolling window leading to off-by-one errors.

Follow-up variants

  • Find all substrings of length k with a given hash value instead of only the first occurrence.
  • Compute substrings with multiple different hashValues efficiently using the same rolling hash framework.
  • Apply a similar rolling hash method to detect repeated substrings or hash collisions in large texts.

FAQ

What is the main pattern to solve Find Substring With Given Hash Value?

The main pattern is using a sliding window with rolling hash updates to efficiently track substring hash values.

Why is recalculating the hash for each substring inefficient?

Recomputing the hash for each substring is O(k) per window, which can lead to O(n*k) time. Rolling hash reduces it to O(n).

How do modulo operations affect rolling hash updates?

Modulo ensures hash values remain within bounds. Forgetting modulo or using it incorrectly can produce wrong or negative results.

Can this approach handle large strings efficiently?

Yes, the sliding window with rolling hash updates only touches each character once, making it suitable for strings up to 2 * 10^4 characters.

What should I watch out for with the first substring requirement?

Ensure that the code returns the substring as soon as a matching hash is found to satisfy the requirement for the first occurrence.

terminal

Solution

Solution 1: Sliding Window + Reverse Traversal

We can maintain a sliding window of length $k$ to calculate the hash value of the substring. Considering that if we traverse the string in the forward order, the calculation of the hash value involves division and modulo operations, which are relatively complicated to handle. Therefore, we can traverse the string in reverse order, so that when calculating the hash value, only multiplication and modulo operations are needed.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
class Solution:
    def subStrHash(
        self, s: str, power: int, modulo: int, k: int, hashValue: int
    ) -> str:
        h, n = 0, len(s)
        p = 1
        for i in range(n - 1, n - 1 - k, -1):
            val = ord(s[i]) - ord("a") + 1
            h = ((h * power) + val) % modulo
            if i != n - k:
                p = p * power % modulo
        j = n - k
        for i in range(n - 1 - k, -1, -1):
            pre = ord(s[i + k]) - ord("a") + 1
            cur = ord(s[i]) - ord("a") + 1
            h = ((h - pre * p) * power + cur) % modulo
            if h == hashValue:
                j = i
        return s[j : j + k]
Find Substring With Given Hash Value Solution: Sliding window with running state upd… | LeetCode #2156 Hard