LeetCode Problem Workspace
Report Spam Message
Determine if a message contains at least two banned words using array scanning and hash lookup.
3
Topics
5
Code langs
3
Related
Practice Focus
Medium · Array scanning plus hash lookup
Answer-first summary
Determine if a message contains at least two banned words using array scanning and hash lookup.
Ace coding interviews with Interview AiBoxInterview AiBox guidance for Array scanning plus hash lookup
To solve this problem, scan the message array and check if any two words are present in the bannedWords array. The best approach utilizes a hash set for efficient lookup. The challenge focuses on handling large inputs efficiently, given the constraints.
Problem Statement
You are given two arrays of strings: message and bannedWords. The task is to determine if there are at least two words in the message array that appear in the bannedWords array. If so, the message is considered spam, and you should return true. Otherwise, return false.
The problem requires an efficient approach due to the potential size of the input arrays. Given that the length of message and bannedWords can each go up to 100,000, the solution must utilize an efficient searching mechanism like hash sets to quickly check for matches between words.
Examples
Example 1
Input: message = ["hello","world","leetcode"], bannedWords = ["world","hello"]
Output: true
The words "hello" and "world" from the message array both appear in the bannedWords array.
Example 2
Input: message = ["hello","programming","fun"], bannedWords = ["world","programming","leetcode"]
Output: false
Only one word from the message array ( "programming" ) appears in the bannedWords array.
Constraints
- 1 <= message.length, bannedWords.length <= 105
- 1 <= message[i].length, bannedWords[i].length <= 15
- message[i] and bannedWords[i] consist only of lowercase English letters.
Solution Approach
Array Scanning with Hash Set Lookup
The solution involves scanning the message array and using a hash set to store banned words. For each word in message, check if it exists in the bannedWords hash set. This approach ensures that lookup is done in constant time, making the solution scalable for large inputs.
Hash Set for Efficiency
Using a hash set to store the banned words ensures fast lookups. Hash set operations like add and check for membership are performed in average O(1) time, which is crucial for handling the problem's constraints.
Early Exit Optimization
As soon as two words from message are found in the bannedWords set, return true immediately to avoid unnecessary computations. This early exit optimization helps reduce processing time when the condition is met early in the array.
Complexity Analysis
| Metric | Value |
|---|---|
| Time | Depends on the final approach |
| Space | Depends on the final approach |
The time complexity of this solution is O(n + m), where n is the length of the message array and m is the length of the bannedWords array. The space complexity is O(m), as the hash set stores all the banned words.
What Interviewers Usually Probe
- Look for the candidate's ability to implement efficient searching using a hash set.
- Evaluate the candidate's approach to optimizing for early exits.
- Test the candidate's understanding of time complexity, particularly with large arrays.
Common Pitfalls or Variants
Common pitfalls
- Incorrect handling of case sensitivity: The problem specifies lowercase letters only, but candidates might mistakenly use case-insensitive checks.
- Not using a hash set: A naive approach without hash sets could lead to slow performance for large inputs.
- Failing to optimize for early exit: Checking the entire array even after finding two banned words is inefficient.
Follow-up variants
- Instead of returning true or false, return the list of banned words found in the message.
- Handle edge cases where the
messageorbannedWordsarrays are empty. - Optimize for constant space complexity by scanning the array and checking banned words inline without using a hash set.
FAQ
What is the optimal solution for the 'Report Spam Message' problem?
The optimal solution uses array scanning with a hash set for fast lookups, ensuring the solution runs in O(n + m) time.
Why should I use a hash set in this problem?
A hash set allows for constant time lookups, which is crucial for handling large input sizes efficiently in this problem.
What happens if I don't optimize for early exit?
Failing to optimize for early exit will result in unnecessary checks, slowing down the algorithm when spam conditions are met early in the message array.
How do I handle empty arrays in the 'Report Spam Message' problem?
You should return false if either the message or bannedWords array is empty, as there can't be any matches.
What are the edge cases to consider for this problem?
Edge cases include empty arrays, arrays where no banned words are present, and arrays where all words are banned.
Solution
Solution 1: Hash Table
We use a hash table $s$ to store all the words in $\textit{bannedWords}$. Then, we traverse each word in $\textit{message}$. If the word appears in the hash table $s$, we increment the counter $cnt$ by one. If $cnt$ is greater than or equal to $2$, we return $\text{true}$; otherwise, we return $\text{false}$.
class Solution:
def reportSpam(self, message: List[str], bannedWords: List[str]) -> bool:
s = set(bannedWords)
return sum(w in s for w in message) >= 2Continue Topic
array
Practice more edge cases under the same topic.
arrow_forwardauto_awesomeContinue Pattern
Array scanning plus hash lookup
Expand the same solving frame across more problems.
arrow_forwardsignal_cellular_altSame Difficulty Track
Medium
Stay on this level to stabilize interview delivery.
arrow_forward