LeetCode Problem Workspace

Shortest Matching Substring

Find the shortest substring in a string that matches a pattern with exactly two wildcards efficiently using binary search.

category

4

Topics

code_blocks

0

Code langs

hub

3

Related

Practice Focus

Hard · Binary search over the valid answer space

bolt

Answer-first summary

Find the shortest substring in a string that matches a pattern with exactly two wildcards efficiently using binary search.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Binary search over the valid answer space

Try AiBox Copilotarrow_forward

This problem requires finding the minimal-length substring of s that satisfies the pattern p containing exactly two '*'. We use a binary search over possible substring lengths and validate each candidate with a two-pointer match approach. This efficiently narrows down the shortest valid substring or returns -1 if no match exists.

Problem Statement

Given a string s and a pattern string p containing exactly two '' characters, determine the length of the shortest contiguous substring of s that matches p. Each ' ' in p can represent any sequence of zero or more characters, including an empty sequence. If no substring matches, return -1.

Constraints: s consists of only lowercase letters with length 1 to 10^5. Pattern p contains lowercase letters and exactly two '' with length 2 to 10^5. Your task is to find the shortest matching substring efficiently, considering the division of p into three segments separated by the ' ' characters.

Examples

Example 1

Input: s = "abaacbaecebce", p = "ba*c*ce"

Output: 8

The shortest matching substring of p in s is " ba e c eb ce " .

Example 2

Input: s = "baccbaadbc", p = "cc*baa*adb"

Output: -1

There is no matching substring in s .

Example 3

Input: s = "a", p = ""

Output: 0

The empty substring is the shortest matching substring.

Constraints

  • 1 <= s.length <= 105
  • 2 <= p.length <= 105
  • s contains only lowercase English letters.
  • p contains only lowercase English letters and exactly two '*'.

Solution Approach

Binary Search over Substring Length

Use binary search on the possible lengths of substrings from 1 to s.length. For each candidate length, check if any substring of that length matches the pattern using the two-pointer check. This reduces the search space efficiently instead of checking all substrings.

Two-Pointer Pattern Match

Divide pattern p into three segments around the '*' characters. Use two pointers to attempt to match each segment sequentially against a candidate substring of s. If all segments match respecting the wildcards, the substring is valid. Otherwise, continue searching with binary search adjustments.

Optimization with Early Termination

While scanning substrings, terminate early if a mismatch is detected between a segment and the substring. Combine this with sliding window scanning of candidate substrings during the binary search to minimize unnecessary comparisons and achieve optimal performance.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

Time complexity is O(n log n) where n is s.length, due to binary search over possible lengths and linear validation for each candidate substring. Space complexity is O(1) extra or O(n) if substring slices are stored during validation.

What Interviewers Usually Probe

  • Focus on binary search over the valid answer space.
  • Notice the pattern has exactly two wildcards, splitting it into three segments.
  • Consider early termination in two-pointer matching to avoid TLE.

Common Pitfalls or Variants

Common pitfalls

  • Trying to match all substrings without binary search results in TLE.
  • Misaligning segments when handling the '*' wildcards leads to incorrect matches.
  • Forgetting to handle empty matches for '*' causing wrong output for edge cases.

Follow-up variants

  • Patterns with more than two '*' requiring multi-segment matching.
  • Finding the longest matching substring instead of the shortest.
  • Matching substrings under case-insensitive rules or extended alphabets.

FAQ

What is the main strategy for Shortest Matching Substring?

The main strategy is binary search over substring lengths combined with two-pointer validation of the three segments split by the two '*' in the pattern.

How do the '*' characters in pattern p work?

Each '*' can match any sequence of characters, including an empty sequence, which allows the substring to flexibly accommodate varying characters between segments.

What should I do if no substring matches?

Return -1 immediately after binary search confirms no valid substring length exists that matches all segments.

Can the solution handle very long strings efficiently?

Yes, binary search over lengths combined with two-pointer matching ensures linear scanning only for potential candidates, keeping it efficient even for large s.

Is it necessary to divide the pattern for matching?

Yes, dividing pattern p into three segments around the '*' is essential to correctly implement two-pointer matching and handle wildcards.

terminal

Solution

Solution 1

#### Python3

1
Shortest Matching Substring Solution: Binary search over the valid answer s… | LeetCode #3455 Hard