Insert Delete GetRandom O(1) - Duplicates allowed

This problem challenges you to design a data structure that supports insertion, removal, and random access with O(1) time complexity while allowing duplicates.

array hash table math design randomized

Problem Statement

The task is to implement the RandomizedCollection class that supports inserting, removing, and retrieving random elements, allowing duplicates. The collection should be able to handle operations where each function works on average O(1) time complexity. It needs to allow both duplicate values and fast random access.

The class should implement three functions: insert(val), remove(val), and getRandom(). Insert should return true when the value is added for the first time, and false for subsequent insertions. Remove should return true if the value is present and removes one occurrence. The getRandom() function should return a random element from the collection, with uniform probability for each element present.

Examples

Example 1

Input: See original problem statement.

Output: See original problem statement.

Input ["RandomizedCollection", "insert", "insert", "insert", "getRandom", "remove", "getRandom"] [[], [1], [1], [2], [], [1], []] Output [null, true, false, true, 2, true, 1]

Explanation RandomizedCollection randomizedCollection = new RandomizedCollection(); randomizedCollection.insert(1); // return true since the collection does not contain 1. // Inserts 1 into the collection. randomizedCollection.insert(1); // return false since the collection contains 1. // Inserts another 1 into the collection. Collection now contains [1,1]. randomizedCollection.insert(2); // return true since the collection does not contain 2. // Inserts 2 into the collection. Collection now contains [1,1,2]. randomizedCollection.getRandom(); // getRandom should: // - return 1 with probability 2/3, or // - return 2 with probability 1/3. randomizedCollection.remove(1); // return true since the collection contains 1. // Removes 1 from the collection. Collection now contains [1,2]. randomizedCollection.getRandom(); // getRandom should return 1 or 2, both equally likely.

Constraints

-231 <= val <= 231 - 1
At most 2 * 105 calls in total will be made to insert, remove, and getRandom.
There will be at least one element in the data structure when getRandom is called.

Solution Approach

Array and Hash Map for Constant Time Operations

The solution uses an array to store the values and a hash map to track the indices of each element in the collection. This allows both insertions and removals to be done in O(1) time on average. The hash map stores each value along with a list of its indices in the array, which helps efficiently remove any specific element and maintain a random access structure.

Efficient Removal and getRandom Implementation

To remove an element, the solution replaces the element with the last one in the array and removes the last index. This ensures the array stays compact while still allowing random access. The getRandom() function is implemented by selecting a random index from the array, ensuring each element is equally likely to be chosen.

Handling Duplicates in O(1) Time

Duplicates are handled using the hash map, which stores all the indices of a given value in the array. When inserting or removing, we only update the hash map and the array accordingly, ensuring that duplicates do not affect the performance of the insert and remove operations.

Complexity Analysis

Metric	Value
Time	Depends on the final approach
Space	Depends on the final approach

The time complexity of all operations—insert, remove, and getRandom—is O(1) on average. This is achieved by using a hash map to track the indices of elements and swapping the last element with the one being removed to maintain the array's compactness. Space complexity is O(n) due to the storage requirements for the array and hash map.

What Interviewers Usually Probe

Look for candidates who demonstrate a good understanding of hash table operations and array manipulation.
Candidates should highlight the O(1) time complexity for all operations and explain how duplicates are efficiently handled.
Expect candidates to discuss the trade-offs of using an array with a hash map for O(1) random access and insertion/removal.

Common Pitfalls or Variants

Common pitfalls

Failing to update the hash map correctly when removing elements, which can lead to incorrect results in subsequent operations.
Overlooking the need to maintain the array’s compactness when removing elements, which could lead to slower operations.
Not fully understanding the implications of storing multiple indices for duplicate values in the hash map, leading to inefficient solutions.

Follow-up variants

Handling large data sets efficiently while maintaining O(1) operations.
Supporting additional operations like 'getRandomWithFrequency' to retrieve random elements based on their frequency of insertion.
Optimizing space usage for cases with very sparse data where many indices are empty.

FAQ

How can we ensure O(1) time complexity for all operations?

By using a hash map to track the indices of each element in the array and swapping the last element when removing an item, we achieve O(1) time complexity for all operations.

What is the role of the hash map in this solution?

The hash map is used to track the indices of each element in the array, making insertions and deletions more efficient while handling duplicates.

How do we handle duplicates in this problem?

Duplicates are handled by storing multiple indices for the same value in the hash map, ensuring that all occurrences are correctly managed during insertions and removals.

Can we use a different data structure for this problem?

While other data structures like linked lists could be used, the combination of an array and hash map is optimal for ensuring O(1) time complexity for insert, remove, and getRandom operations.

What is the best way to explain this problem in an interview?

In an interview, focus on how the array and hash map work together to achieve constant time complexity for all operations, and discuss how duplicates are efficiently handled.

terminal

Solution

Solution 1

#### Python3

class RandomizedCollection:
    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.m = {}
        self.l = []

    def insert(self, val: int) -> bool:
        """
        Inserts a value to the collection. Returns true if the collection did not already contain the specified element.
        """
        idx_set = self.m.get(val, set())
        idx_set.add(len(self.l))
        self.m[val] = idx_set
        self.l.append(val)
        return len(idx_set) == 1

    def remove(self, val: int) -> bool:
        """
        Removes a value from the collection. Returns true if the collection contained the specified element.
        """
        if val not in self.m:
            return False
        idx_set = self.m[val]
        idx = list(idx_set)[0]
        last_idx = len(self.l) - 1
        self.l[idx] = self.l[last_idx]
        idx_set.remove(idx)

        last_idx_set = self.m[self.l[last_idx]]
        if last_idx in last_idx_set:
            last_idx_set.remove(last_idx)
        if idx < last_idx:
            last_idx_set.add(idx)
        if not idx_set:
            self.m.pop(val)
        self.l.pop()
        return True

    def getRandom(self) -> int:
        """
        Get a random element from the collection.
        """
        return -1 if len(self.l) == 0 else random.choice(self.l)


# Your RandomizedCollection object will be instantiated and called as such:
# obj = RandomizedCollection()
# param_1 = obj.insert(val)
# param_2 = obj.remove(val)
# param_3 = obj.getRandom()

Continue Practicing

#380 insert delete getrandom o1 #710 random pick with blacklist #268 missing number