LeetCode Problem Workspace
Unique Email Addresses
Identify the count of unique email addresses by normalizing local names and using hash-based lookups efficiently.
Practice Focus
Easy · Array scanning plus hash lookup
Answer-first summary
This problem requires scanning each email, normalizing its local name by removing dots and discarding the first plus and everything after it, then storing the resulting addresses in a hash set to count the distinct ones. Each character is processed once and hash set operations take O(1) time on average, so the total cost is O(N * L) for N emails of average length L. The solution pattern is array scanning plus hash lookup: straightforward, but sensitive to string parsing mistakes.
Problem Statement
Given a list of email addresses, compute the number of distinct addresses that actually receive emails. Each email consists of a local name and a domain name separated by '@'. In the local name, periods '.' are ignored, and the first plus '+' together with everything after it is discarded. Domain names remain unchanged.
Return the count of unique email addresses after applying these normalization rules. For example, 'test.email+alex@leetcode.com' and 'test.e.mail@leetcode.com' are considered the same. The input array can contain up to 100 emails, each up to 100 characters long, consisting of lowercase letters, '.', '+', and '@'.
Examples
Example 1
Input: emails = ["test.email+alex@leetcode.com","test.e.mail+bob.cathy@leetcode.com","testemail+david@lee.tcode.com"]
Output: 2
"testemail@leetcode.com" and "testemail@lee.tcode.com" actually receive mails.
Example 2
Input: emails = ["a@leetcode.com","b@leetcode.com","c@leetcode.com"]
Output: 3
None of the local names contains '.' or '+', so the three addresses are already normalized and all distinct.
Constraints
- 1 <= emails.length <= 100
- 1 <= emails[i].length <= 100
- emails[i] consist of lowercase English letters, '+', '.' and '@'.
- Each emails[i] contains exactly one '@' character.
- All local and domain names are non-empty.
- Local names do not start with a '+' character.
- Domain names end with the ".com" suffix.
- Domain names must contain at least one character before ".com" suffix.
Solution Approach
Normalize each email
Iterate through each email string. Split it into local and domain parts. Remove all '.' from the local part and truncate at the first '+'. Recombine with the domain to produce the normalized email.
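The normalization step described above can be sketched as follows. This is a minimal illustration, not the only way to write it; the helper name `normalize` is hypothetical, and the code relies on the stated constraint that every address contains exactly one '@'.

```python
def normalize(email: str) -> str:
    """Return the canonical form of one address (hypothetical helper)."""
    # Assumption from the constraints: exactly one '@' per address.
    local, _, domain = email.partition("@")
    # Truncate at the first '+' (if any), then drop every '.' that remains.
    local = local.split("+", 1)[0].replace(".", "")
    # The domain is appended untouched, dots included.
    return local + "@" + domain


print(normalize("test.e.mail+bob.cathy@leetcode.com"))
```

Truncating before removing dots means the characters after '+' are never scanned for dots at all.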
Use a hash set for uniqueness
Insert each normalized email into a hash set. This ensures duplicates are automatically ignored. At the end, the size of the hash set represents the number of unique emails.
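To illustrate how the set absorbs duplicates, here is a small sketch that counts distinct strings from a list that is assumed to be normalized already. The function name `count_unique` is hypothetical.

```python
from typing import Iterable


def count_unique(normalized: Iterable[str]) -> int:
    """Count distinct addresses; input is assumed to be already normalized."""
    seen = set()
    for address in normalized:
        # Adding an element that is already present leaves the set unchanged.
        seen.add(address)
    return len(seen)


sample = [
    "testemail@leetcode.com",
    "testemail@leetcode.com",   # duplicate of the first entry
    "testemail@lee.tcode.com",  # different domain, so a different address
]
print(count_unique(sample))
```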
Optimize scanning and string operations
Avoid unnecessary string concatenations by using string builders or slicing carefully. Process each character only once to keep the algorithm efficient for up to 100 emails of 100 characters each.
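One possible way to touch each character only once is to scan the address left to right, collect kept characters in a list, and copy the domain in a single slice when '@' is reached. The sketch below assumes the problem's constraints hold; `normalize_single_pass` is a hypothetical name.

```python
def normalize_single_pass(email: str) -> str:
    """Normalize one address in a single left-to-right scan (hypothetical helper)."""
    out = []          # kept characters; joined once at the end
    skipping = False  # becomes True once '+' is seen in the local name
    for i, ch in enumerate(email):
        if ch == "@":
            # Copy '@' and the whole domain unchanged, then stop scanning.
            out.append(email[i:])
            break
        if ch == "+":
            skipping = True
        elif ch != "." and not skipping:
            out.append(ch)
    return "".join(out)


print(normalize_single_pass("test.email+alex@leetcode.com"))
```

Collecting pieces in a list and joining once avoids building a new string on every appended character.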
Complexity Analysis
| Metric | Value |
|---|---|
| Time | O(N * L) |
| Space | O(N * L) |
Time complexity is O(N * L) where N is the number of emails and L is the average length, due to scanning and normalizing each email. Space complexity is O(N * L) to store normalized emails in a hash set.
What Interviewers Usually Probe
- Look for O(N) array scanning with string manipulation.
- Expect correct handling of '+' and '.' in local names.
- Check if duplicates are correctly ignored using a hash set.
Common Pitfalls or Variants
Common pitfalls
- Forgetting to ignore characters after '+' in the local name.
- Removing dots from domain names instead of only the local part.
- Recomputing strings inefficiently leading to unnecessary time or memory use.
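The second pitfall is easy to reproduce. The sketch below contrasts a deliberately buggy helper, which strips dots from the whole address, with a version that only touches the local name. Both function names are hypothetical and exist only for this comparison.

```python
def normalize_buggy(email: str) -> str:
    # BUG (intentional): dots are removed before splitting, so the domain loses them too.
    local, domain = email.replace(".", "").split("@")
    return local.split("+", 1)[0] + "@" + domain


def normalize_local_only(email: str) -> str:
    # Dots are removed only from the local name; the domain is left as-is.
    local, domain = email.split("@")
    return local.split("+", 1)[0].replace(".", "") + "@" + domain


emails = ["a@leetcode.com", "a@lee.tcode.com"]
print(len({normalize_buggy(e) for e in emails}))       # distinct domains collapse
print(len({normalize_local_only(e) for e in emails}))  # distinct domains preserved
```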
Follow-up variants
- Return the list of normalized emails instead of just counting.
- Handle uppercase letters by converting all to lowercase before normalization.
- Allow arbitrary domain suffixes, not just '.com', while applying the same local name rules.
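A rough sketch covering the three variants together might look like this: it lowercases each address first, places no requirement on the domain suffix, and returns the normalized addresses in first-seen order rather than a count. The name `unique_normalized` and the choice to preserve input order are assumptions made for illustration.

```python
from typing import List


def unique_normalized(emails: List[str]) -> List[str]:
    """Return distinct normalized addresses in first-seen order (hypothetical helper)."""
    seen = set()
    result = []
    for email in emails:
        # Lowercasing first makes the comparison case-insensitive.
        local, _, domain = email.lower().partition("@")
        key = local.split("+", 1)[0].replace(".", "") + "@" + domain
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


print(unique_normalized(["Test.Email+alex@LeetCode.com", "test.e.mail@leetcode.com", "a@example.org"]))
```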
FAQ
How does the plus '+' affect the Unique Email Addresses problem?
In the local name, the first '+' and every character after it are ignored. Addresses that differ only in this filtered suffix therefore normalize to the same string and count as one.
Should dots '.' in domain names be removed?
No, dots are removed only in the local part. Domain names remain intact to distinguish unique addresses correctly.
What is the recommended data structure for counting unique emails?
A hash set is ideal, as it automatically ignores duplicates and provides O(1) insertion and lookup.
Can emails have multiple '@' characters?
No, each email must contain exactly one '@' separating local and domain names according to the problem constraints.
What pattern does this problem follow in GhostInterview?
This problem follows the array scanning plus hash lookup pattern, emphasizing string normalization and duplicate elimination.
Solution
Solution 1: Hash Table
We can use a hash table $s$ to store all unique email addresses. Then, we traverse the array $\textit{emails}$. For each email address, we split it into the local part and the domain part. We process the local part by removing all dots and ignoring characters after a plus sign. Finally, we concatenate the processed local part with the domain part and add it to the hash table $s$.
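```python
from typing import List


class Solution:
    def numUniqueEmails(self, emails: List[str]) -> int:
        s = set()  # normalized addresses seen so far
        for email in emails:
            # Exactly one '@' is guaranteed, so split yields two parts.
            local, domain = email.split("@")
            t = []
            for c in local:
                if c == ".":
                    continue  # dots in the local name are ignored
                if c == "+":
                    break  # drop the '+' and everything after it
                t.append(c)
            s.add("".join(t) + "@" + domain)
        return len(s)
```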