#1157

Hard

auto_awesomeBinary search over the valid answer space

LeetCode Problem Workspace

Online Majority Element In Subarray

Efficiently find the majority element in any subarray using a data structure optimized for multiple range queries.

array binary search design binary indexed tree segment tree

Problem Statement

You need to implement a class MajorityChecker that supports efficiently finding the majority element in a subarray. A majority element in a subarray is an element that occurs at least a given threshold number of times.

The class should support multiple queries of the form query(left, right, threshold) and return the majority element for that subarray, or -1 if no such element exists. Optimize your design to handle frequent online queries efficiently for arrays up to length 2 * 10^4.

Examples

Example 1

Input: See original problem statement.

Output: See original problem statement.

Input ["MajorityChecker", "query", "query", "query"] [[[1, 1, 2, 2, 1, 1]], [0, 5, 4], [0, 3, 3], [2, 3, 2]] Output [null, 1, -1, 2]

Explanation MajorityChecker majorityChecker = new MajorityChecker([1, 1, 2, 2, 1, 1]); majorityChecker.query(0, 5, 4); // return 1 majorityChecker.query(0, 3, 3); // return -1 majorityChecker.query(2, 3, 2); // return 2

Constraints

1 <= arr.length <= 2 * 104
1 <= arr[i] <= 2 * 104
0 <= left <= right < arr.length
threshold <= right - left + 1
2 * threshold > right - left + 1
At most 104 calls will be made to query.

Solution Approach

Use Frequency Maps with Binary Search

Preprocess the array by storing the positions of each distinct element. For each query, iterate over potential majority candidates and use binary search on their positions to count occurrences in the subarray. Return the candidate that meets the threshold.

Segment Tree for Candidate Queries

Build a segment tree where each node stores a potential majority element and its frequency. For a query, merge segment nodes and check if the candidate exceeds the threshold. This reduces redundant scanning of subarrays and supports fast range queries.

Random Sampling Optimization

For large subarrays, randomly sample elements and check their frequency within the range using precomputed positions. With multiple samples, the probability of missing the true majority element is low, providing a probabilistic but fast approach.

Complexity Analysis

Metric	Value
Time	Depends on the final approach
Space	Depends on the final approach

Time and space complexity depend on the approach: frequency maps with binary search give O(k log n) per query, segment trees give O(log n) per query, and random sampling achieves expected O(log n) with extra space for index maps.

What Interviewers Usually Probe

Can you handle multiple online queries efficiently without scanning each subarray?
What data structures allow you to count element frequencies quickly within a range?
How can binary search be applied over the candidate element positions to validate thresholds?

Common Pitfalls or Variants

Common pitfalls

Assuming a linear scan per query is fast enough for large arrays.
Forgetting that the majority element must meet the threshold, not just be most frequent.
Not handling the case when no element satisfies the threshold and returning -1 incorrectly.

Follow-up variants

Find the majority element for dynamic subarrays where elements can be updated between queries.
Return all elements that appear more than a fraction of the subarray size rather than a fixed threshold.
Optimize for streaming input where queries arrive in real time and the array is too large to store completely.

FAQ

What is the main strategy for Online Majority Element In Subarray?

The main strategy is to preprocess element positions and use binary search or segment trees to efficiently verify potential majority elements per query.

Can we solve this problem without extra space?

Not efficiently; tracking element occurrences with maps or trees is essential to support multiple fast queries.

Why is binary search over candidate positions used?

Binary search allows counting occurrences of a candidate in a subarray in logarithmic time, avoiding linear scans.

How does the threshold affect candidate selection?

Only elements that appear at least threshold times in the subarray are valid; all others are discarded during the query check.

Is random sampling a reliable approach here?

Yes, for large arrays it can efficiently detect the majority element with high probability, especially when combined with position maps.

terminal

Solution

Solution 1: Segment Tree + Boyer-Moore Voting Algorithm + Binary Search

We notice that the problem requires us to find the possible majority element in a specific interval, so we consider using a segment tree to maintain the candidate majority element and its occurrence in each interval.

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

class Node:
    __slots__ = ("l", "r", "x", "cnt")

    def __init__(self):
        self.l = self.r = 0
        self.x = self.cnt = 0


class SegmentTree:
    def __init__(self, nums):
        self.nums = nums
        n = len(nums)
        self.tr = [Node() for _ in range(n << 2)]
        self.build(1, 1, n)

    def build(self, u, l, r):
        self.tr[u].l, self.tr[u].r = l, r
        if l == r:
            self.tr[u].x = self.nums[l - 1]
            self.tr[u].cnt = 1
            return
        mid = (l + r) >> 1
        self.build(u << 1, l, mid)
        self.build(u << 1 | 1, mid + 1, r)
        self.pushup(u)

    def query(self, u, l, r):
        if self.tr[u].l >= l and self.tr[u].r <= r:
            return self.tr[u].x, self.tr[u].cnt
        mid = (self.tr[u].l + self.tr[u].r) >> 1
        if r <= mid:
            return self.query(u << 1, l, r)
        if l > mid:
            return self.query(u << 1 | 1, l, r)
        x1, cnt1 = self.query(u << 1, l, r)
        x2, cnt2 = self.query(u << 1 | 1, l, r)
        if x1 == x2:
            return x1, cnt1 + cnt2
        if cnt1 >= cnt2:
            return x1, cnt1 - cnt2
        else:
            return x2, cnt2 - cnt1

    def pushup(self, u):
        if self.tr[u << 1].x == self.tr[u << 1 | 1].x:
            self.tr[u].x = self.tr[u << 1].x
            self.tr[u].cnt = self.tr[u << 1].cnt + self.tr[u << 1 | 1].cnt
        elif self.tr[u << 1].cnt >= self.tr[u << 1 | 1].cnt:
            self.tr[u].x = self.tr[u << 1].x
            self.tr[u].cnt = self.tr[u << 1].cnt - self.tr[u << 1 | 1].cnt
        else:
            self.tr[u].x = self.tr[u << 1 | 1].x
            self.tr[u].cnt = self.tr[u << 1 | 1].cnt - self.tr[u << 1].cnt


class MajorityChecker:
    def __init__(self, arr: List[int]):
        self.tree = SegmentTree(arr)
        self.d = defaultdict(list)
        for i, x in enumerate(arr):
            self.d[x].append(i)

    def query(self, left: int, right: int, threshold: int) -> int:
        x, _ = self.tree.query(1, left + 1, right + 1)
        l = bisect_left(self.d[x], left)
        r = bisect_left(self.d[x], right + 1)
        return x if r - l >= threshold else -1


# Your MajorityChecker object will be instantiated and called as such:
# obj = MajorityChecker(arr)
# param_1 = obj.query(left,right,threshold)

Continue Practicing

#731 my calendar ii #729 my calendar i #1649 create sorted array through instructions