LeetCode Problem Workspace

Occurrences After Bigram

Extract the third word in every adjacent triple of words matching a given bigram pattern from a text.

category

1

Topics

code_blocks

5

Code langs

hub

3

Related

Practice Focus

Easy · String-driven solution strategy

bolt

Answer-first summary

Extract the third word in every adjacent triple of words matching a given bigram pattern from a text.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for String-driven solution strategy

Try AiBox Copilotarrow_forward

To solve the problem, split the string into words and identify all adjacent triples. Then, for each triple that matches the given bigram, extract the third word. A direct approach with string manipulation helps solve this efficiently, ensuring all valid triples are captured.

Problem Statement

Given a string text and two words, first and second, your task is to find all the third words that immediately follow the bigram formed by first and second in the string. For each occurrence of the pattern "first second third", the word 'third' is considered the third word that appears after the bigram.

Return an array of all such words. If no valid triplet is found, return an empty array. The words in the string are space-separated and there are no leading or trailing spaces. Your solution should efficiently process the string to extract the required words.

Examples

Example 1

Input: text = "alice is a good girl she is a good student", first = "a", second = "good"

Output: ["girl","student"]

Example details omitted.

Example 2

Input: text = "we will we will rock you", first = "we", second = "will"

Output: ["we","rock"]

Example details omitted.

Constraints

  • 1 <= text.length <= 1000
  • text consists of lowercase English letters and spaces.
  • All the words in text are separated by a single space.
  • 1 <= first.length, second.length <= 10
  • first and second consist of lowercase English letters.
  • text will not have any leading or trailing spaces.

Solution Approach

Split the string into words

Start by splitting the input string into individual words. This helps in handling the text as a list, making it easier to compare consecutive words.

Identify adjacent triples

Iterate over the list of words, checking each set of consecutive three words. If the first two match the given first and second words, collect the third word as a valid result.

Return the third words

For each valid occurrence of the bigram, store the third word in an output array and return it at the end. Ensure that all valid occurrences are captured.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

The time complexity is O(n), where n is the number of words in the input string, as we only iterate through the string once. The space complexity is O(k), where k is the number of valid triples found, as we store these words in the result array.

What Interviewers Usually Probe

  • Ability to recognize string manipulation patterns.
  • Efficient handling of adjacent word comparisons.
  • Understanding of time and space complexity for string operations.

Common Pitfalls or Variants

Common pitfalls

  • Overlooking cases where the bigram is not followed by a valid third word.
  • Incorrectly handling spaces or punctuation, which could affect word separation.
  • Misunderstanding the need to iterate through adjacent triples, rather than simply searching for individual words.

Follow-up variants

  • Handle different forms of text input (e.g., multiple spaces or punctuation marks).
  • Extend the problem to support variable-sized n-grams, not just bigrams.
  • Optimize for cases with very large text inputs, considering time and space trade-offs.

FAQ

What is the core pattern behind the Occurrences After Bigram problem?

The problem requires identifying adjacent triples of words that match a given bigram pattern. It's a string manipulation problem that focuses on comparing consecutive words.

How do I handle cases where there are no valid triples?

Simply return an empty array when no valid triples match the bigram pattern in the string.

What is the time complexity of solving the Occurrences After Bigram problem?

The time complexity is O(n), where n is the number of words in the input text, as we only need to iterate through the string once.

How can I optimize this solution for large inputs?

The solution is already quite efficient with O(n) time complexity. For very large inputs, ensure you're using efficient string splitting and comparison operations to avoid unnecessary overhead.

Can this problem be solved with a regular expression?

While regular expressions could be used, they are not the most straightforward solution. Splitting the string into words and iterating through them is more intuitive and efficient for this problem.

terminal

Solution

Solution 1

#### Python3

1
2
3
4
5
6
7
8
9
class Solution:
    def findOcurrences(self, text: str, first: str, second: str) -> List[str]:
        words = text.split()
        ans = []
        for i in range(len(words) - 2):
            a, b, c = words[i : i + 3]
            if a == first and b == second:
                ans.append(c)
        return ans
Occurrences After Bigram Solution: String-driven solution strategy | LeetCode #1078 Easy