LeetCode Problem Workspace

Minimum Genetic Mutation

Determine the minimum number of single-character gene mutations to reach the target gene using BFS and a hash table lookup.

category

3

Topics

code_blocks

6

Code langs

hub

3

Related

Practice Focus

Medium · Hash Table plus String

bolt

Answer-first summary

Determine the minimum number of single-character gene mutations to reach the target gene using BFS and a hash table lookup.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Hash Table plus String

Try AiBox Copilotarrow_forward

Use a breadth-first search (BFS) to explore all valid one-character mutations from the start gene. Maintain a hash table of valid genes from the bank for constant-time checks. Track mutation steps layer by layer until reaching the end gene or exhausting possibilities.

Problem Statement

You are given two 8-character gene strings, startGene and endGene, containing only 'A', 'C', 'G', and 'T'. Each mutation changes exactly one character. Only genes present in the provided bank are valid mutations.

Determine the minimum number of mutations needed to transform startGene into endGene. Return -1 if the transformation is impossible. Each step must produce a gene from the bank, and repeated states are invalid.

Examples

Example 1

Input: startGene = "AACCGGTT", endGene = "AACCGGTA", bank = ["AACCGGTA"]

Output: 1

Example details omitted.

Example 2

Input: startGene = "AACCGGTT", endGene = "AAACGGTA", bank = ["AACCGGTA","AACCGCTA","AAACGGTA"]

Output: 2

Example details omitted.

Constraints

  • 0 <= bank.length <= 10
  • startGene.length == endGene.length == bank[i].length == 8
  • startGene, endGene, and bank[i] consist of only the characters ['A', 'C', 'G', 'T'].

Solution Approach

Breadth-First Search Traversal

Perform BFS starting from startGene. At each step, generate all possible single-character mutations and enqueue only those present in the hash table of valid genes. Track levels to count mutations.

Hash Table Validation

Store all genes in the bank in a hash set for O(1) lookup. Before enqueuing a mutated gene, check its presence in the hash table to ensure only valid genes are processed.

Early Termination and State Management

Immediately return the current step count when endGene is reached. Maintain a visited set to prevent revisiting genes, avoiding cycles that cause infinite loops or incorrect mutation counts.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

Time complexity is O(8 * 4^8) in the worst case due to generating all one-character mutations, but limited by bank size and BFS pruning. Space complexity is O(bank.length + queue size) for hash table and BFS queue.

What Interviewers Usually Probe

  • Expect BFS usage due to minimal step requirement.
  • Hash table for fast gene validation is crucial.
  • Consider edge cases where endGene is not reachable from startGene.

Common Pitfalls or Variants

Common pitfalls

  • Mutating genes not present in the bank leading to invalid paths.
  • Failing to mark visited genes, causing cycles.
  • Counting steps incorrectly by not tracking BFS levels.

Follow-up variants

  • Allow multiple simultaneous mutations per step, requiring a modified BFS.
  • Genes of variable length, requiring dynamic mutation generation.
  • Large bank sizes, emphasizing memory-efficient hash table or pruning strategies.

FAQ

What is the main pattern used in Minimum Genetic Mutation?

The problem combines a hash table for fast gene validation and BFS to explore minimal mutation paths efficiently.

How do I handle genes not in the bank?

Ignore any mutation not present in the hash table to ensure only valid gene sequences are considered.

Can mutations revert a previously changed character?

No, previously visited genes should be tracked to prevent cycles; each mutation sequence must be unique.

What determines the BFS step count?

Each BFS level corresponds to one mutation step; reaching endGene at level k means k mutations.

What if startGene equals endGene?

Return 0 mutations immediately since no changes are needed.

terminal

Solution

Solution 1: BFS

We define a queue `q` to store the current gene sequence and the number of changes, and a set `vis` to store the visited gene sequences. Initially, we add the starting gene sequence `start` to the queue `q` and the set `vis`.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
class Solution:
    def minMutation(self, startGene: str, endGene: str, bank: List[str]) -> int:
        q = deque([(startGene, 0)])
        vis = {startGene}
        while q:
            gene, depth = q.popleft()
            if gene == endGene:
                return depth
            for nxt in bank:
                diff = sum(a != b for a, b in zip(gene, nxt))
                if diff == 1 and nxt not in vis:
                    q.append((nxt, depth + 1))
                    vis.add(nxt)
        return -1
Minimum Genetic Mutation Solution: Hash Table plus String | LeetCode #433 Medium