What is the pattern to solve the Longest Common Subpath problem?

The problem can be efficiently solved using binary search over subpath lengths combined with rolling hashes to verify the presence of common subpaths.

How do I handle large inputs in the Longest Common Subpath problem?

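The total length of all paths is at most 10⁵, so each candidate length has to be verified in roughly linear time. Hash every window of that length with a rolling hash instead of comparing subpaths element by element, and let binary search limit the number of candidate lengths to about log₂ of the shortest path length.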

What if there is no common subpath in the Longest Common Subpath problem?

If no common subpath exists, the solution will return 0, indicating that the friends' paths do not share any common subpath.

How do rolling hashes work in the Longest Common Subpath problem?

A rolling hash updates the hash of a fixed-length window in constant time as the window slides along a path. All length-k subpaths of a path are therefore hashed in linear time, and subpaths from different paths can be compared by hash value.
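The sliding update described above can be sketched as follows. This is a minimal illustration, not the reference solution: the function name `window_hashes`, the base 133331 and the modulus 2^61 - 1 are assumptions chosen for the example, and each city is shifted by 1 so that city 0 contributes a non-zero digit.

```python
from typing import Iterator, List

BASE = 133331        # assumed base, larger than any city id + 1 (ids are < 10**5)
MOD = (1 << 61) - 1  # assumed modulus (a Mersenne prime)


def window_hashes(path: List[int], k: int) -> Iterator[int]:
    """Yield the hash of every length-k window of path, left to right."""
    if k <= 0 or k > len(path):
        return
    top = pow(BASE, k - 1, MOD)  # weight of the element that leaves the window
    h = 0
    for x in path[:k]:           # hash of the first window, computed directly
        h = (h * BASE + x + 1) % MOD
    yield h
    for i in range(k, len(path)):
        out = path[i - k] + 1
        h = (h - out * top) % MOD            # drop the leftmost element
        h = (h * BASE + path[i] + 1) % MOD   # shift and append the new element
        yield h
```

Each step after the first window costs O(1), so hashing all windows of one length takes time linear in the path length.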

How does binary search help in the Longest Common Subpath problem?

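If every path shares a subpath of length k, it also shares one of every shorter length, so feasibility is monotonic in k. Binary search exploits this to find the largest feasible length with about log₂ of the shortest path length checks instead of testing every length.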
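To make the monotonic search concrete, here is a small sketch that binary-searches the length and verifies each candidate exactly with tuple sets and set intersection. It is deliberately slow (each window is copied into a tuple) and is meant only for small inputs; the function names are hypothetical.

```python
from typing import List, Set, Tuple


def has_common_subpath(paths: List[List[int]], k: int) -> bool:
    """Exact check: do all paths share at least one subpath of length k?"""
    common: Set[Tuple[int, ...]] = set()
    for idx, path in enumerate(paths):
        windows = {tuple(path[i:i + k]) for i in range(len(path) - k + 1)}
        common = windows if idx == 0 else common & windows
        if not common:
            return False
    return True


def longest_common_subpath_slow(paths: List[List[int]]) -> int:
    lo, hi = 0, min(len(p) for p in paths)  # the answer lies in [lo, hi]
    while lo < hi:
        mid = (lo + hi + 1) // 2            # round up so that lo = mid makes progress
        if has_common_subpath(paths, mid):
            lo = mid                        # length mid works, try longer
        else:
            hi = mid - 1                    # length mid fails, so do all longer ones
    return lo
```

Replacing the tuple sets with sets of rolling-hash values keeps the same search loop while making each check linear in the total path length.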

#1923

Hard

Binary search over the answer space

LeetCode Solution Workbench

Longest Common Subpath

Array · Binary Search · Rolling Hash · Suffix Array · Hash Function

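Problem description

There is a country of n cities numbered from 0 to n - 1. In this country, there is a road connecting every pair of cities.

There are m friends numbered from 0 to m - 1 who are traveling through the country. Each of them takes a path consisting of some cities. Each path is represented by an integer array that contains the visited cities in order. A path may contain the same city more than once, but the same city never appears twice in a row.

Given an integer n and a 2D array paths, where paths[i] is an integer array representing the path of the i-th friend, return the length of the longest common subpath that is shared by every friend's path, or 0 if there is no common subpath at all.

A subpath of a path is a contiguous sequence of cities within that path.

Example 1:

Input: n = 5, paths = [[0,1,2,3,4],
                       [2,3,4],
                       [4,0,1,2,3]]
Output: 2
Explanation: The longest common subpath is [2,3].

Example 2:

Input: n = 3, paths = [[0],[1],[2]]
Output: 0
Explanation: The three paths have no common subpath.

Example 3:

Input: n = 5, paths = [[0,1,2,3,4],
                       [4,3,2,1,0]]
Output: 1
Explanation: The longest common subpaths are [0], [1], [2], [3] and [4]. All of them have length 1.

Constraints:

1 <= n <= 10⁵
m == paths.length
2 <= m <= 10⁵
sum(paths[i].length) <= 10⁵
0 <= paths[i][j] < n
The same city does not appear consecutively in paths[i].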

Approach

Method 1: String hashing

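String hashing maps a string of arbitrary length to a non-negative integer, with a collision probability close to 0. The hash values let us decide quickly whether two strings are equal.

Pick a fixed value BASE, treat the string as a number written in base BASE, and assign each character a value greater than 0. The assigned values are normally much smaller than BASE. For example, for strings of lowercase letters we can set a=1, b=2, ..., z=26. Pick a fixed value MOD and take the remainder of that base-BASE number modulo MOD as the hash value of the string.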
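As a worked illustration of this definition, the sketch below hashes a lowercase string and then recovers the hash of any substring from prefix hashes. The helper names are hypothetical, and BASE = 131 with MOD = 2^64 is just one common choice assumed for the example. With a=1, b=2, c=3, the string "abc" is the base-131 number 1·131² + 2·131 + 3 = 17426.

```python
from typing import List, Tuple

BASE = 131     # assumed base
MOD = 1 << 64  # assumed modulus


def poly_hash(s: str) -> int:
    """Hash of s read as a base-BASE number with digits a=1, ..., z=26."""
    h = 0
    for ch in s:
        h = (h * BASE + (ord(ch) - ord("a") + 1)) % MOD
    return h


def prefix_hashes(s: str) -> Tuple[List[int], List[int]]:
    """Return (H, P) with H[i] = hash of s[:i] and P[i] = BASE**i % MOD."""
    H = [0] * (len(s) + 1)
    P = [1] * (len(s) + 1)
    for i, ch in enumerate(s, 1):
        H[i] = (H[i - 1] * BASE + (ord(ch) - ord("a") + 1)) % MOD
        P[i] = P[i - 1] * BASE % MOD
    return H, P


def substring_hash(H: List[int], P: List[int], l: int, r: int) -> int:
    """Hash of s[l:r] in O(1): remove the prefix s[:l], shifted by r - l digits."""
    return (H[r] - H[l] * P[r - l]) % MOD
```

For "abcabc", `substring_hash(H, P, 3, 6)` equals `poly_hash("abc")`, which is how two occurrences of the same substring are recognised without comparing characters.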

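BASE=131 or BASE=13331 is a typical choice, for which the collision probability is very low. Whenever two strings have the same hash value, we treat them as equal. MOD is usually 2^64: in C++ the hash can be stored directly in an unsigned long long and arithmetic overflow is simply ignored, because overflow is equivalent to reducing modulo 2^64 automatically. This avoids slow modulo operations.

Except on specially constructed data, this hash rarely collides, and it is generally acceptable in a reference solution. We can also pick several suitable BASE and MOD values (for example large primes), compute several hashes, and treat two strings as equal only when all the results match. This makes it even harder to construct data that breaks the hash.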
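The idea of combining several hashes can be sketched like this. The two (BASE, MOD) pairs below are assumptions for illustration (131 and 13331 as bases, the primes 1,000,000,007 and 998,244,353 as moduli), and `double_hash` is a hypothetical helper. Two sequences are treated as equal only if both components match.

```python
from typing import Sequence, Tuple

# Assumed parameters: two independent (BASE, MOD) pairs with prime moduli.
BASE1, MOD1 = 131, 1_000_000_007
BASE2, MOD2 = 13331, 998_244_353


def _hash(seq: Sequence[int], base: int, mod: int) -> int:
    h = 0
    for x in seq:
        h = (h * base + x + 1) % mod  # +1 so that the symbol 0 is a non-zero digit
    return h


def double_hash(seq: Sequence[int]) -> Tuple[int, int]:
    """Pair of hashes; a false match now needs a collision under both parameter sets."""
    return _hash(seq, BASE1, MOD1), _hash(seq, BASE2, MOD2)
```

The pair can be used directly as a dictionary or set key, at the cost of roughly doubling the hashing work.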

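from collections import Counter
from typing import List


class Solution:
    def longestCommonSubpath(self, n: int, paths: List[List[int]]) -> int:
        def check(k: int) -> bool:
            # cnt[x] = number of paths that contain a length-k window with hash x
            cnt = Counter()
            for h in hh:
                vis = set()  # hashes already counted for the current path
                for i in range(1, len(h) - k + 1):
                    j = i + k - 1
                    # hash of the window path[i-1 .. j-1], from the prefix hashes
                    x = (h[j] - h[i - 1] * p[j - i + 1]) % mod
                    if x not in vis:
                        vis.add(x)
                        cnt[x] += 1
            # length k is feasible if some hash occurs in all m paths
            # (equal hashes are treated as equal subpaths)
            return max(cnt.values(), default=0) == m

        m = len(paths)
        mx = max(len(path) for path in paths)
        base = 133331
        # Python integers do not overflow, so the modulus is applied explicitly
        mod = 2**64 + 1
        # p[i] = base**i % mod
        p = [0] * (mx + 1)
        p[0] = 1
        for i in range(1, len(p)):
            p[i] = p[i - 1] * base % mod
        # hh[t][i] = hash of the first i cities of paths[t]
        hh = []
        for path in paths:
            k = len(path)
            h = [0] * (k + 1)
            for i, x in enumerate(path, 1):
                h[i] = (h[i - 1] * base + x) % mod
            hh.append(h)
        # binary search for the largest feasible length
        l, r = 0, min(len(path) for path in paths)
        while l < r:
            mid = (l + r + 1) >> 1
            if check(mid):
                l = mid
            else:
                r = mid - 1
        return l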

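Complexity analysis

Metric	Value
Time	O(S log L) expected for the hashing solution above, where S is the total length of all paths and L is the length of the shortest path
Space	O(S)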

Follow-ups interviewers often ask

Ability to implement efficient hashing techniques such as rolling hashes.
Understanding of binary search over a valid answer space to optimize problem-solving.
Skill in managing multiple paths and confirming common subpaths through set intersection.

Common pitfalls

Failing to optimize the subpath verification process, leading to excessive brute-force comparisons.
Misunderstanding the relationship between binary search and subpath verification, resulting in inefficient solution design.
Overlooking edge cases, such as paths that are entirely distinct or paths with minimal overlap.

Advanced variants

What if paths can contain repeated cities in non-consecutive order? (The problem statement already allows this, and the hashing approach handles it unchanged.)
Can this solution be optimized further for large n and m values?
How would the solution change if we needed to find the longest common subpath across only a subset of the paths?

Keep practicing

#2223 Sum of Scores of Built Strings · #1044 Longest Duplicate Substring · #718 Maximum Length of Repeated Subarray