LeetCode Problem Workspace

Find Maximum Removals From Source String

Determine the maximum number of characters you can remove from source while keeping pattern as a subsequence using array scanning.

category

5

Topics

code_blocks

5

Code langs

hub

3

Related

Practice Focus

Medium · Array scanning plus hash lookup

bolt

Answer-first summary

Determine the maximum number of characters you can remove from source while keeping pattern as a subsequence using array scanning.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Array scanning plus hash lookup

Try AiBox Copilotarrow_forward

This problem asks for the maximum number of removable characters from a source string such that a given pattern remains a subsequence. The approach relies on scanning the array while tracking removed indices and using hash lookup for fast pattern checks. Dynamic programming or two-pointer strategies ensure each removal operation is validated correctly, balancing efficiency with correctness.

Problem Statement

Given a string source of length n, a string pattern that is a subsequence of source, and a sorted array targetIndices containing distinct positions in the range [0, n - 1], determine the maximum number of characters that can be removed from source while keeping pattern as a subsequence.

Each removal operation deletes the character at a specified index without affecting the positions of remaining characters. For example, removing 'c' from "acb" leaves 'b' at index 2. The goal is to maximize the number of valid removals defined by targetIndices while ensuring pattern remains in order.

Examples

Example 1

Input: source = "abbaa", pattern = "aba", targetIndices = [0,1,2]

Output: 1

We can't remove source[0] but we can do either of these two operations:

Example 2

Input: source = "bcda", pattern = "d", targetIndices = [0,3]

Output: 2

We can remove source[0] and source[3] in two operations.

Example 3

Input: source = "dda", pattern = "dda", targetIndices = [0,1,2]

Output: 0

We can't remove any character from source .

Constraints

  • 1 <= n == source.length <= 3 * 103
  • 1 <= pattern.length <= n
  • 1 <= targetIndices.length <= n
  • targetIndices is sorted in ascending order.
  • The input is generated such that targetIndices contains distinct elements in the range [0, n - 1].
  • source and pattern consist only of lowercase English letters.
  • The input is generated such that pattern appears as a subsequence in source.

Solution Approach

Two-Pointer Subsequence Check

Use two pointers to iterate over source and pattern simultaneously. Skip characters marked as removed by targetIndices. If all pattern characters are matched before reaching the end of source, the subsequence remains valid. This forms the basis for verifying each removal candidate efficiently.

Binary Search Maximum Removals

Perform a binary search on the number of removable indices from targetIndices. For each midpoint, apply the removal mask and use the two-pointer check. If the pattern still matches, move the search range higher; otherwise, lower. This approach finds the maximum safe removals in logarithmic search steps.

Hash Lookup for Removed Positions

Maintain a hash set of indices marked for removal. During scanning, check in O(1) whether the current character is removed. This avoids repeated linear scans and ensures each subsequence check remains efficient, especially for large source strings.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

Time complexity is O(n log m) where n is the source length and m is targetIndices length due to binary search combined with two-pointer scans. Space complexity is O(m) for storing removal indices in a hash set.

What Interviewers Usually Probe

  • Look for solutions using subsequence validation with minimal repeated scanning.
  • Check if candidate handles edge cases where pattern characters overlap with removal indices.
  • Evaluate the efficiency of two-pointer combined with hash lookup versus naive iteration.

Common Pitfalls or Variants

Common pitfalls

  • Assuming removal shifts subsequent indices, which is incorrect for this problem's definition.
  • Not handling cases where multiple pattern characters exist at removable indices.
  • Inefficient repeated scans without using hash sets or dynamic programming, leading to timeouts.

Follow-up variants

  • Compute the minimum number of removals to make pattern no longer a subsequence.
  • Given multiple patterns, find the maximum removals that preserve all as subsequences.
  • Allow repeated characters in pattern and adjust removal strategy accordingly.

FAQ

What is the key pattern to recognize in Find Maximum Removals From Source String?

The problem follows an array scanning plus hash lookup pattern, where two pointers track pattern matching while removals are validated efficiently.

Can I remove characters in any order from source?

No, you can only remove characters at indices specified in targetIndices, and removal does not shift other characters' positions.

Why use binary search in this problem?

Binary search efficiently finds the maximum number of removable indices while repeatedly validating subsequence integrity using two pointers.

What data structures improve performance here?

Using a hash set for removed indices ensures O(1) lookup during scans, avoiding repeated linear searches over targetIndices.

What happens if all targetIndices are part of the pattern?

Any attempt to remove pattern characters invalidates the subsequence, so maximum removals may be less than the total indices available.

terminal

Solution

Solution 1: Dynamic Programming

We define $f[i][j]$ to represent the maximum number of deletions in the first $i$ characters of $\textit{source}$ that match the first $j$ characters of $\textit{pattern}$. Initially, $f[0][0] = 0$, and the rest $f[i][j] = -\infty$.

1
2
3
4
5
6
7
8
9
10
11
12
class Solution:
    def maxRemovals(self, source: str, pattern: str, targetIndices: List[int]) -> int:
        m, n = len(source), len(pattern)
        f = [[-inf] * (n + 1) for _ in range(m + 1)]
        f[0][0] = 0
        s = set(targetIndices)
        for i, c in enumerate(source, 1):
            for j in range(n + 1):
                f[i][j] = f[i - 1][j] + int((i - 1) in s)
                if j and c == pattern[j - 1]:
                    f[i][j] = max(f[i][j], f[i - 1][j - 1])
        return f[m][n]
Find Maximum Removals From Source String Solution: Array scanning plus hash lookup | LeetCode #3316 Medium