#275

Medium

auto_awesomeBinary search over the valid answer space

LeetCode Problem Workspace

H-Index II

Compute a researcher's H-Index from a sorted citation array using binary search over the valid answer space efficiently.

array binary search

Problem Statement

Given a sorted array of integers representing a researcher's citations per paper, determine the H-Index: the largest number h such that the researcher has at least h papers with h or more citations each. The array is non-decreasing and can contain zero citations. Your algorithm must achieve logarithmic runtime.

For example, with citations = [0,1,3,5,6], the researcher has 5 papers; 3 of them have at least 3 citations, so the H-Index is 3. Your task is to implement a solution that efficiently identifies this H-Index without scanning each paper linearly.

Examples

Example 1

Input: citations = [0,1,3,5,6]

Output: 3

[0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2

Input: citations = [1,2,100]

Output: 2

Example details omitted.

Constraints

n == citations.length
1 <= n <= 105
0 <= citations[i] <= 1000
citations is sorted in ascending order.

Solution Approach

Binary Search over the Valid Answer Space

Treat possible H-Index values from 0 to n as a search space. At each step, check if there are enough papers with citations >= mid. If yes, search higher; otherwise, search lower. This leverages the sorted order to skip unnecessary checks.

Check Condition at Midpoint Efficiently

For each midpoint h, compute n - index to see if at least h papers meet or exceed h citations. This avoids scanning all elements and ensures correctness while following the binary search pattern specific to H-Index II.

Return Maximum Valid H-Index

Continue the binary search until the search space collapses. The highest h that satisfies the condition is returned. This approach prevents common errors where naive scanning overcounts or undercounts eligible papers.

Complexity Analysis

Metric	Value
Time	Depends on the final approach
Space	Depends on the final approach

Time complexity is O(log n) due to binary search over possible H-Index values. Space complexity is O(1) as only a few variables are maintained during search.

What Interviewers Usually Probe

Expecting logarithmic time using the sorted property of citations.
Looking for correct identification of the H-Index without linear scanning.
Check for off-by-one errors in determining n - index >= h condition.

Common Pitfalls or Variants

Common pitfalls

Confusing index positions with citation counts during binary search.
Failing to handle cases where citations are all lower than potential h.
Using linear scan, which violates the logarithmic time requirement.

Follow-up variants

H-Index calculation where the citations array is unsorted, requiring sorting first.
Dynamic H-Index tracking as new citations are added to the array.
Compute H-Index with additional constraints, e.g., only counting papers from certain journals.

FAQ

What is the H-Index II problem on LeetCode?

H-Index II asks for the maximum h where a sorted array of citations has at least h papers with h or more citations, requiring logarithmic runtime.

Why use binary search for H-Index II?

Binary search efficiently narrows down the valid H-Index in a sorted array, avoiding the need for linear scans and achieving O(log n) time.

How do I check if a candidate h is valid?

Compute n - index at the midpoint of the search; if this is >= h, the candidate h is valid and you search higher.

What are common mistakes when implementing H-Index II?

Mistakes include miscounting papers, off-by-one errors in binary search, and using linear iteration instead of O(log n) search.

Can H-Index II be solved if citations are unsorted?

Yes, but you must first sort the array; otherwise, binary search over the valid answer space cannot guarantee correctness.

terminal

Solution

Solution 1: Binary Search

We notice that if there are at least $x$ papers with citation counts greater than or equal to $x$, then for any $y \lt x$, its citation count must also be greater than or equal to $y$. This exhibits monotonicity.

1

2

3

4

5

6

7

8

9

10

11

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        n = len(citations)
        left, right = 0, n
        while left < right:
            mid = (left + right + 1) >> 1
            if citations[n - mid] >= mid:
                left = mid
            else:
                right = mid - 1
        return left

Continue Practicing

#268 missing number #287 find the duplicate number #300 longest increasing subsequence