LeetCode Problem Workspace

Longest Duplicate Substring

Find the longest duplicated substring in a string using binary search, sliding window, and rolling hash techniques.

category

6

Topics

code_blocks

4

Code langs

hub

3

Related

Practice Focus

Hard · Binary search over the valid answer space

bolt

Answer-first summary

Find the longest duplicated substring in a string using binary search, sliding window, and rolling hash techniques.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Binary search over the valid answer space

Try AiBox Copilotarrow_forward

The longest duplicate substring problem involves finding the longest substring that appears at least twice in a given string. Binary search is typically used to search for the length of the duplicated substring, followed by applying rolling hash to efficiently find duplicates within the string. Key patterns include binary search over the possible substring lengths and optimizing with a sliding window technique.

Problem Statement

You are given a string s and need to find the longest duplicated substring. A duplicated substring is a contiguous substring that appears more than once in s. The occurrences of the substring may overlap, and your task is to return the longest such substring. If no such duplicated substring exists, return an empty string.

For example, in the string "banana", the longest duplicated substring is "ana", which appears twice in the string. If the string has no repeated substrings, the answer should be an empty string.

Examples

Example 1

Input: s = "banana"

Output: "ana"

Example details omitted.

Example 2

Input: s = "abcd"

Output: ""

Example details omitted.

Constraints

  • 2 <= s.length <= 3 * 104
  • s consists of lowercase English letters.

Solution Approach

Binary Search for Substring Length

We begin by performing a binary search on the possible lengths of substrings. We start with the full range of substring lengths from 0 to the length of the string, and use binary search to find the maximum length for which a duplicated substring exists.

Sliding Window with Rolling Hash

For each candidate substring length, we use a sliding window technique to extract all possible substrings of that length. To efficiently check for duplicates, we apply a rolling hash to each substring to ensure constant-time comparison as the window slides.

Binary Search and Hashing Integration

The binary search narrows down the maximum length of a valid duplicated substring, and for each length, the sliding window combined with the rolling hash allows us to detect duplicates in an efficient manner. This approach minimizes unnecessary comparisons and speeds up the solution.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

The time complexity depends on the approach used for substring checking and binary search. Binary search reduces the possible lengths of substrings, which significantly lowers the number of checks. Hashing the substrings with a sliding window further optimizes the process, making the solution efficient. Overall, the complexity is approximately O(n log n), where n is the length of the string.

What Interviewers Usually Probe

  • Can the candidate explain the trade-offs of binary search in terms of string length?
  • Is the candidate able to effectively implement a rolling hash function?
  • Can the candidate describe the sliding window technique for substring comparison?

Common Pitfalls or Variants

Common pitfalls

  • Not efficiently using the sliding window technique, resulting in excessive comparisons.
  • Incorrectly handling edge cases where no substring is repeated.
  • Misapplying the binary search method and not considering the valid substring space properly.

Follow-up variants

  • Implement the solution without binary search, using brute force for substring comparison.
  • Consider using different hashing techniques such as polynomial rolling hashes.
  • Optimize for extremely large input sizes, exploring parallelization strategies.

FAQ

How does binary search help in solving the Longest Duplicate Substring problem?

Binary search helps by narrowing down the possible substring lengths, enabling an efficient search for the longest valid duplicate substring.

What is the role of rolling hash in this problem?

Rolling hash helps efficiently compute hash values for substrings, allowing for constant-time comparison as the sliding window moves across the string.

How does the sliding window technique optimize substring search?

The sliding window technique ensures that substrings are checked only once as it moves across the string, reducing redundant computations.

Can this problem be solved without binary search?

Yes, but it would require a brute force approach, which is less efficient and not scalable for larger strings.

What is the time complexity of the optimal solution for this problem?

The time complexity is O(n log n), where n is the length of the string, due to binary search combined with efficient substring checks using rolling hash.

terminal

Solution

Solution 1

#### Python3

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Solution:
    def longestDupSubstring(self, s: str) -> str:
        def check(l):
            vis = set()
            for i in range(n - l + 1):
                t = s[i : i + l]
                if t in vis:
                    return t
                vis.add(t)
            return ''

        n = len(s)
        left, right = 0, n
        ans = ''
        while left < right:
            mid = (left + right + 1) >> 1
            t = check(mid)
            ans = t or ans
            if t:
                left = mid
            else:
                right = mid - 1
        return ans
Longest Duplicate Substring Solution: Binary search over the valid answer s… | LeetCode #1044 Hard