LeetCode Problem Workspace

Minimize Hamming Distance After Swap Operations

Solve Minimize Hamming Distance After Swap Operations by grouping swappable indices and matching value counts inside each connected component.

category

3

Topics

code_blocks

5

Code langs

hub

3

Related

Practice Focus

Medium · Array plus Depth-First Search

bolt

Answer-first summary

Solve Minimize Hamming Distance After Swap Operations by grouping swappable indices and matching value counts inside each connected component.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Array plus Depth-First Search

Try AiBox Copilotarrow_forward

Treat allowedSwaps as edges in a graph over source indices. Once indices are connected, any value inside that component can be rearranged anywhere within it, so the task becomes counting how many target values can be matched by the source multiset in each component. DFS or Union Find both work, but the key interview step is realizing you should solve per connected component, not per swap.

Problem Statement

You get two arrays, source and target, with the same length, plus a list of allowed index swaps that can be repeated in any order. Your goal is to reorder only source through those allowed swaps so that it differs from target in as few positions as possible.

The critical restriction is that values can move only inside groups of indices connected by allowedSwaps. That turns Minimize Hamming Distance After Swap Operations into a component problem: for each connected group, compare the frequency of source values against the frequency of target values and count how many positions still cannot be matched.

Examples

Example 1

Input: source = [1,2,3,4], target = [2,1,4,5], allowedSwaps = [[0,1],[2,3]]

Output: 1

source can be transformed the following way:

  • Swap indices 0 and 1: source = [2,1,3,4]
  • Swap indices 2 and 3: source = [2,1,4,3] The Hamming distance of source and target is 1 as they differ in 1 position: index 3.

Example 2

Input: source = [1,2,3,4], target = [1,3,2,4], allowedSwaps = []

Output: 2

There are no allowed swaps. The Hamming distance of source and target is 2 as they differ in 2 positions: index 1 and index 2.

Example 3

Input: source = [5,1,2,4,3], target = [1,5,4,2,3], allowedSwaps = [[0,4],[4,2],[1,3],[1,4]]

Output: 0

Example details omitted.

Constraints

  • n == source.length == target.length
  • 1 <= n <= 105
  • 1 <= source[i], target[i] <= 105
  • 0 <= allowedSwaps.length <= 105
  • allowedSwaps[i].length == 2
  • 0 <= ai, bi <= n - 1
  • ai != bi

Solution Approach

Model allowed swaps as connected components

Build a graph where each index is a node and each pair in allowedSwaps is an undirected edge. Run DFS to collect every connected component. Inside one component, repeated swaps let you permute values freely, so only the multiset of values matters, not the original order of indices.

Count source values and cancel target matches

For each component, count frequencies of source values in a hash map. Then scan the same component's target values and greedily consume matching counts from that map. Every target value that cannot be consumed contributes one unavoidable mismatch to the final Hamming distance.

Use Union Find when component discovery is cleaner

Union Find gives the same grouping without explicit adjacency traversal and is often simpler when allowedSwaps is large. After building parents, bucket indices by root and perform the same frequency matching per root. The trade-off is that DFS makes the graph idea more visible, while Union Find makes merging fast and compact.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

Let n be the array length and m be allowedSwaps.length. Building components takes O(n + m) with DFS over an adjacency list, or near O(n + m) with Union Find. The frequency reconciliation across all components is O(n) because each index is processed a constant number of times. Extra space is O(n + m) for the graph and component storage in the DFS version, or O(n) beyond input for Union Find plus counting maps.

What Interviewers Usually Probe

  • They hint that allowedSwaps forms a graph over indices, which is the push toward connected components.
  • They ask why repeated swaps matter, which is testing whether you realize each component allows arbitrary reordering.
  • They ask for a faster method than simulating swaps, which means you should switch to component-level frequency matching.

Common Pitfalls or Variants

Common pitfalls

  • Trying to greedily swap mismatched positions one by one misses the real invariant that only component membership matters.
  • Comparing indices directly inside a component instead of comparing value counts overcounts mismatches that can actually be fixed by reordering.
  • Forgetting isolated indices as single-node components causes missing mismatches when allowedSwaps is sparse or empty.

Follow-up variants

  • Replace DFS with Union Find and keep the same per-component counting logic.
  • Return the indices that must remain mismatched instead of only the minimum Hamming distance.
  • Handle online swap additions by rebuilding or maintaining components before re-running the matching step.

FAQ

What is the main pattern in Minimize Hamming Distance After Swap Operations?

The core pattern is connected components over array indices, followed by frequency matching inside each component. The graph can be traversed with DFS or built with Union Find.

Why can values be freely rearranged inside one component?

Because swaps can be repeated in any order, any index reachable through allowedSwaps belongs to the same connected group. Within that group, values can be permuted arbitrarily through a sequence of valid swaps.

Why does counting frequencies work better than simulating swaps?

The exact swap sequence does not matter once you know the component. What matters is how many copies of each value source has in that component and how many copies target needs there.

When would Union Find be better than DFS here?

Union Find is often cleaner when you want to merge many swap pairs without storing a full adjacency list. DFS is equally valid and sometimes easier to explain when the interviewer emphasizes the graph view.

What happens when allowedSwaps is empty?

Each index becomes its own component, so no value can move. The answer is then just the ordinary Hamming distance between source and target.

terminal

Solution

Solution 1: Union-Find + Hash Table

We can consider each index as a node, and the element corresponding to each index as the value of the node. Then each element `[a_i, b_i]` in the given `allowedSwaps` represents an edge between index `a_i` and `b_i`. Therefore, we can use a union-find set to maintain these connected components.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
class Solution:
    def minimumHammingDistance(
        self, source: List[int], target: List[int], allowedSwaps: List[List[int]]
    ) -> int:
        def find(x: int) -> int:
            if p[x] != x:
                p[x] = find(p[x])
            return p[x]

        n = len(source)
        p = list(range(n))
        for a, b in allowedSwaps:
            p[find(a)] = find(b)
        cnt = defaultdict(Counter)
        for i, x in enumerate(source):
            j = find(i)
            cnt[j][x] += 1
        ans = 0
        for i, x in enumerate(target):
            j = find(i)
            cnt[j][x] -= 1
            ans += cnt[j][x] < 0
        return ans
Minimize Hamming Distance After Swap Operations Solution: Array plus Depth-First Search | LeetCode #1722 Medium