What is the main pattern to solve Find Substring With Given Hash Value?

The main pattern is using a sliding window with rolling hash updates to efficiently track substring hash values.

Why is recalculating the hash for each substring inefficient?

Recomputing the hash for each substring is O(k) per window, which can lead to O(n*k) time. Rolling hash reduces it to O(n).

How do modulo operations affect rolling hash updates?

Modulo ensures hash values remain within bounds. Forgetting modulo or using it incorrectly can produce wrong or negative results.

Can this approach handle large strings efficiently?

Yes, the sliding window with rolling hash updates only touches each character once, making it suitable for strings up to 2 * 10^4 characters.

What should I watch out for with the first substring requirement?

Ensure that the code returns the substring as soon as a matching hash is found to satisfy the requirement for the first occurrence.

#2156

Hard

auto_awesome滑动窗口（状态滚动更新）

LeetCode 题解工作台

查找给定哈希值的子串

给定整数 p 和 m ，一个长度为 k 且下标从 0 开始的字符串 s 的哈希值按照如下函数计算： hash(s, p, m) = (val(s[0]) * p 0 + val(s[1]) * p 1 + ... + val(s[k-1]) * p k-1 ) mod m . 其中 val(s[i]…

字符串滑动窗口滚动哈希哈希函数

题目描述

给定整数 p 和 m ，一个长度为 k 且下标从 0 开始的字符串 s 的哈希值按照如下函数计算：

hash(s, p, m) = (val(s[0]) * p⁰ + val(s[1]) * p¹ + ... + val(s[k-1]) * p^k-1) mod m.

其中 val(s[i]) 表示 s[i] 在字母表中的下标，从 val('a') = 1 到 val('z') = 26 。

给你一个字符串 s 和整数 power，modulo，k 和 hashValue 。请你返回 s 中 第一个 长度为 k 的子串 sub ，满足 hash(sub, power, modulo) == hashValue 。

测试数据保证一定存在至少一个这样的子串。

子串定义为一个字符串中连续非空字符组成的序列。

示例 1：

输入：s = "leetcode", power = 7, modulo = 20, k = 2, hashValue = 0
输出："ee"
解释："ee" 的哈希值为 hash("ee", 7, 20) = (5 * 1 + 5 * 7) mod 20 = 40 mod 20 = 0 。
"ee" 是长度为 2 的第一个哈希值为 0 的子串，所以我们返回 "ee" 。

示例 2：

输入：s = "fbxzaad", power = 31, modulo = 100, k = 3, hashValue = 32
输出："fbx"
解释："fbx" 的哈希值为 hash("fbx", 31, 100) = (6 * 1 + 2 * 31 + 24 * 31²) mod 100 = 23132 mod 100 = 32 。
"bxz" 的哈希值为 hash("bxz", 31, 100) = (2 * 1 + 24 * 31 + 26 * 31²) mod 100 = 25732 mod 100 = 32 。
"fbx" 是长度为 3 的第一个哈希值为 32 的子串，所以我们返回 "fbx" 。
注意，"bxz" 的哈希值也为 32 ，但是它在字符串中比 "fbx" 更晚出现。

提示：

1 <= k <= s.length <= 2 * 10⁴
1 <= power, modulo <= 10⁹
0 <= hashValue < modulo
s 只包含小写英文字母。
测试数据保证一定存在满足条件的子串。

lightbulb

解题思路

方法一：滑动窗口 + 倒序遍历

我们可以维护一个长度为 $k$ 的滑动窗口，用来计算子串的哈希值。考虑到如果正序遍历字符串，在哈希值的计算中，涉及到除法取模的操作，处理起来比较麻烦。因此我们可以倒序遍历字符串，这样在计算哈希值的时候，只需要乘法和取模操作。

我们首先计算字符串末尾的 $k$ 个字符的哈希值，然后从字符串末尾开始倒序遍历，每次计算当前窗口的哈希值，如果等于给定的哈希值，我们就找到了一个满足条件的子串，更新答案字符串的起始位置。

最后返回答案字符串即可。

时间复杂度 $O(n)$ ，其中 $n$ 是字符串的长度。空间复杂度 $O(1)$ 。

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

class Solution:
    def subStrHash(
        self, s: str, power: int, modulo: int, k: int, hashValue: int
    ) -> str:
        h, n = 0, len(s)
        p = 1
        for i in range(n - 1, n - 1 - k, -1):
            val = ord(s[i]) - ord("a") + 1
            h = ((h * power) + val) % modulo
            if i != n - k:
                p = p * power % modulo
        j = n - k
        for i in range(n - 1 - k, -1, -1):
            pre = ord(s[i + k]) - ord("a") + 1
            cur = ord(s[i]) - ord("a") + 1
            h = ((h - pre * p) * power + cur) % modulo
            if h == hashValue:
                j = i
        return s[j : j + k]

speed

复杂度分析

指标	值
时间	complexity is O(n) because each character is processed once during the sliding window pass. Space complexity is O(1) beyond input storage since only the rolling hash and a few variables are maintained.
空间	Depends on the final approach

psychology

面试官常问的追问

外企场景

question_mark
Notice the need for updating the hash efficiently rather than recalculating each substring.
question_mark
Consider the influence of power and modulo on preventing integer overflow.
question_mark
Be ready to discuss why processing the string from right to left can simplify modular arithmetic in rolling hashes.

warning

常见陷阱

外企场景

error
Recomputing the hash from scratch for every substring instead of updating it incrementally.
error
Incorrectly handling modulo operations which can yield negative values.
error
Misindexing characters in the rolling window leading to off-by-one errors.

swap_horiz

进阶变体

外企场景

arrow_right_alt
Find all substrings of length k with a given hash value instead of only the first occurrence.
arrow_right_alt
Compute substrings with multiple different hashValues efficiently using the same rolling hash framework.
arrow_right_alt
Apply a similar rolling hash method to detect repeated substrings or hash collisions in large texts.

help

常见问题

外企场景

继续练习

#2223 构造字符串的总得分和 #1044 最长重复子串 #1960 两个回文子字符串长度的最大乘积