LeetCode Problem Workspace
Find Substring With Given Hash Value
Locate the first substring of length k whose rolling hash matches the given hashValue using a sliding window approach.
4
Topics
6
Code langs
3
Related
Practice Focus
Hard · Sliding window with running state updates
Answer-first summary
Locate the first substring of length k whose rolling hash matches the given hashValue using a sliding window approach.
Ace coding interviews with Interview AiBoxInterview AiBox guidance for Sliding window with running state updates
To solve this problem, iterate over the string with a sliding window of length k while maintaining the rolling hash efficiently. Instead of recomputing the hash from scratch for each substring, update it by removing the oldest character and adding the new character at each step. Return the first substring where the computed hash equals hashValue, ensuring constant time updates to meet performance constraints.
Problem Statement
Given a string s of lowercase English letters and integers power, modulo, k, and hashValue, compute a rolling hash function defined as hash(sub) = (val(sub[0]) * power^0 + val(sub[1]) * power^1 + ... + val(sub[k-1]) * power^(k-1)) % modulo, where val('a') = 1 and val('z') = 26.
Return the first substring of length k in s such that its hash equals hashValue. The input guarantees that at least one such substring exists, and efficient iteration using a sliding window with running hash updates is required to handle large strings.
Examples
Example 1
Input: s = "leetcode", power = 7, modulo = 20, k = 2, hashValue = 0
Output: "ee"
The hash of "ee" can be computed to be hash("ee", 7, 20) = (5 * 1 + 5 * 7) mod 20 = 40 mod 20 = 0. "ee" is the first substring of length 2 with hashValue 0. Hence, we return "ee".
Example 2
Input: s = "fbxzaad", power = 31, modulo = 100, k = 3, hashValue = 32
Output: "fbx"
The hash of "fbx" can be computed to be hash("fbx", 31, 100) = (6 * 1 + 2 * 31 + 24 * 312) mod 100 = 23132 mod 100 = 32. The hash of "bxz" can be computed to be hash("bxz", 31, 100) = (2 * 1 + 24 * 31 + 26 * 312) mod 100 = 25732 mod 100 = 32. "fbx" is the first substring of length 3 with hashValue 32. Hence, we return "fbx". Note that "bxz" also has a hash of 32 but it appears later than "fbx".
Constraints
- 1 <= k <= s.length <= 2 * 104
- 1 <= power, modulo <= 109
- 0 <= hashValue < modulo
- s consists of lowercase English letters only.
- The test cases are generated such that an answer always exists.
Solution Approach
Initialize Rolling Hash
Start by computing the hash for the first k-length substring. Convert each character to its alphabet index and apply the rolling hash formula. This initial value sets up the sliding window and allows subsequent updates without full recomputation.
Slide Window with Efficient Updates
Move the window one character at a time. Subtract the contribution of the outgoing character, multiply or adjust by power, and add the new character's weighted value modulo the given number. This approach avoids recalculating the entire hash for each substring.
Check Hash and Return Substring
At each step, compare the updated hash to hashValue. Once a match is found, return the corresponding substring immediately. This guarantees the first substring with the required hash is returned, and processing stops early for efficiency.
Complexity Analysis
| Metric | Value |
|---|---|
| Time | Depends on the final approach |
| Space | Depends on the final approach |
Time complexity is O(n) because each character is processed once during the sliding window pass. Space complexity is O(1) beyond input storage since only the rolling hash and a few variables are maintained.
What Interviewers Usually Probe
- Notice the need for updating the hash efficiently rather than recalculating each substring.
- Consider the influence of power and modulo on preventing integer overflow.
- Be ready to discuss why processing the string from right to left can simplify modular arithmetic in rolling hashes.
Common Pitfalls or Variants
Common pitfalls
- Recomputing the hash from scratch for every substring instead of updating it incrementally.
- Incorrectly handling modulo operations which can yield negative values.
- Misindexing characters in the rolling window leading to off-by-one errors.
Follow-up variants
- Find all substrings of length k with a given hash value instead of only the first occurrence.
- Compute substrings with multiple different hashValues efficiently using the same rolling hash framework.
- Apply a similar rolling hash method to detect repeated substrings or hash collisions in large texts.
FAQ
What is the main pattern to solve Find Substring With Given Hash Value?
The main pattern is using a sliding window with rolling hash updates to efficiently track substring hash values.
Why is recalculating the hash for each substring inefficient?
Recomputing the hash for each substring is O(k) per window, which can lead to O(n*k) time. Rolling hash reduces it to O(n).
How do modulo operations affect rolling hash updates?
Modulo ensures hash values remain within bounds. Forgetting modulo or using it incorrectly can produce wrong or negative results.
Can this approach handle large strings efficiently?
Yes, the sliding window with rolling hash updates only touches each character once, making it suitable for strings up to 2 * 10^4 characters.
What should I watch out for with the first substring requirement?
Ensure that the code returns the substring as soon as a matching hash is found to satisfy the requirement for the first occurrence.
Solution
Solution 1: Sliding Window + Reverse Traversal
We can maintain a sliding window of length $k$ to calculate the hash value of the substring. Considering that if we traverse the string in the forward order, the calculation of the hash value involves division and modulo operations, which are relatively complicated to handle. Therefore, we can traverse the string in reverse order, so that when calculating the hash value, only multiplication and modulo operations are needed.
class Solution:
def subStrHash(
self, s: str, power: int, modulo: int, k: int, hashValue: int
) -> str:
h, n = 0, len(s)
p = 1
for i in range(n - 1, n - 1 - k, -1):
val = ord(s[i]) - ord("a") + 1
h = ((h * power) + val) % modulo
if i != n - k:
p = p * power % modulo
j = n - k
for i in range(n - 1 - k, -1, -1):
pre = ord(s[i + k]) - ord("a") + 1
cur = ord(s[i]) - ord("a") + 1
h = ((h - pre * p) * power + cur) % modulo
if h == hashValue:
j = i
return s[j : j + k]Continue Topic
string
Practice more edge cases under the same topic.
arrow_forwardauto_awesomeContinue Pattern
Sliding window with running state updates
Expand the same solving frame across more problems.
arrow_forwardsignal_cellular_altSame Difficulty Track
Hard
Stay on this level to stabilize interview delivery.
arrow_forward