LeetCode Problem Workspace
Occurrences After Bigram
Extract the third word in every adjacent triple of words matching a given bigram pattern from a text.
1
Topics
5
Code langs
3
Related
Practice Focus
Easy · String-driven solution strategy
Answer-first summary
Extract the third word in every adjacent triple of words matching a given bigram pattern from a text.
Ace coding interviews with Interview AiBoxInterview AiBox guidance for String-driven solution strategy
To solve the problem, split the string into words and identify all adjacent triples. Then, for each triple that matches the given bigram, extract the third word. A direct approach with string manipulation helps solve this efficiently, ensuring all valid triples are captured.
Problem Statement
Given a string text and two words, first and second, your task is to find all the third words that immediately follow the bigram formed by first and second in the string. For each occurrence of the pattern "first second third", the word 'third' is considered the third word that appears after the bigram.
Return an array of all such words. If no valid triplet is found, return an empty array. The words in the string are space-separated and there are no leading or trailing spaces. Your solution should efficiently process the string to extract the required words.
Examples
Example 1
Input: text = "alice is a good girl she is a good student", first = "a", second = "good"
Output: ["girl","student"]
Example details omitted.
Example 2
Input: text = "we will we will rock you", first = "we", second = "will"
Output: ["we","rock"]
Example details omitted.
Constraints
- 1 <= text.length <= 1000
- text consists of lowercase English letters and spaces.
- All the words in text are separated by a single space.
- 1 <= first.length, second.length <= 10
- first and second consist of lowercase English letters.
- text will not have any leading or trailing spaces.
Solution Approach
Split the string into words
Start by splitting the input string into individual words. This helps in handling the text as a list, making it easier to compare consecutive words.
Identify adjacent triples
Iterate over the list of words, checking each set of consecutive three words. If the first two match the given first and second words, collect the third word as a valid result.
Return the third words
For each valid occurrence of the bigram, store the third word in an output array and return it at the end. Ensure that all valid occurrences are captured.
Complexity Analysis
| Metric | Value |
|---|---|
| Time | Depends on the final approach |
| Space | Depends on the final approach |
The time complexity is O(n), where n is the number of words in the input string, as we only iterate through the string once. The space complexity is O(k), where k is the number of valid triples found, as we store these words in the result array.
What Interviewers Usually Probe
- Ability to recognize string manipulation patterns.
- Efficient handling of adjacent word comparisons.
- Understanding of time and space complexity for string operations.
Common Pitfalls or Variants
Common pitfalls
- Overlooking cases where the bigram is not followed by a valid third word.
- Incorrectly handling spaces or punctuation, which could affect word separation.
- Misunderstanding the need to iterate through adjacent triples, rather than simply searching for individual words.
Follow-up variants
- Handle different forms of text input (e.g., multiple spaces or punctuation marks).
- Extend the problem to support variable-sized n-grams, not just bigrams.
- Optimize for cases with very large text inputs, considering time and space trade-offs.
FAQ
What is the core pattern behind the Occurrences After Bigram problem?
The problem requires identifying adjacent triples of words that match a given bigram pattern. It's a string manipulation problem that focuses on comparing consecutive words.
How do I handle cases where there are no valid triples?
Simply return an empty array when no valid triples match the bigram pattern in the string.
What is the time complexity of solving the Occurrences After Bigram problem?
The time complexity is O(n), where n is the number of words in the input text, as we only need to iterate through the string once.
How can I optimize this solution for large inputs?
The solution is already quite efficient with O(n) time complexity. For very large inputs, ensure you're using efficient string splitting and comparison operations to avoid unnecessary overhead.
Can this problem be solved with a regular expression?
While regular expressions could be used, they are not the most straightforward solution. Splitting the string into words and iterating through them is more intuitive and efficient for this problem.
Solution
Solution 1
#### Python3
class Solution:
def findOcurrences(self, text: str, first: str, second: str) -> List[str]:
words = text.split()
ans = []
for i in range(len(words) - 2):
a, b, c = words[i : i + 3]
if a == first and b == second:
ans.append(c)
return ansContinue Topic
string
Practice more edge cases under the same topic.
arrow_forwardauto_awesomeContinue Pattern
String-driven solution strategy
Expand the same solving frame across more problems.
arrow_forwardsignal_cellular_altSame Difficulty Track
Easy
Stay on this level to stabilize interview delivery.
arrow_forward