LeetCode Problem Workspace

H-Index II

Compute a researcher's H-Index from a sorted citation array using binary search over the valid answer space efficiently.

category

2

Topics

code_blocks

7

Code langs

hub

3

Related

Practice Focus

Medium · Binary search over the valid answer space

bolt

Answer-first summary

Compute a researcher's H-Index from a sorted citation array using binary search over the valid answer space efficiently.

Interview AiBox logo

Ace coding interviews with Interview AiBox guidance for Binary search over the valid answer space

Try AiBox Copilotarrow_forward

To solve H-Index II, immediately apply binary search on the sorted citations array to locate the maximum h satisfying the H-Index definition. Check the number of papers with citations >= h at each midpoint. This guarantees logarithmic time while handling edge cases where multiple papers have fewer citations than h, avoiding linear scans.

Problem Statement

Given a sorted array of integers representing a researcher's citations per paper, determine the H-Index: the largest number h such that the researcher has at least h papers with h or more citations each. The array is non-decreasing and can contain zero citations. Your algorithm must achieve logarithmic runtime.

For example, with citations = [0,1,3,5,6], the researcher has 5 papers; 3 of them have at least 3 citations, so the H-Index is 3. Your task is to implement a solution that efficiently identifies this H-Index without scanning each paper linearly.

Examples

Example 1

Input: citations = [0,1,3,5,6]

Output: 3

[0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2

Input: citations = [1,2,100]

Output: 2

Example details omitted.

Constraints

  • n == citations.length
  • 1 <= n <= 105
  • 0 <= citations[i] <= 1000
  • citations is sorted in ascending order.

Solution Approach

Binary Search over the Valid Answer Space

Treat possible H-Index values from 0 to n as a search space. At each step, check if there are enough papers with citations >= mid. If yes, search higher; otherwise, search lower. This leverages the sorted order to skip unnecessary checks.

Check Condition at Midpoint Efficiently

For each midpoint h, compute n - index to see if at least h papers meet or exceed h citations. This avoids scanning all elements and ensures correctness while following the binary search pattern specific to H-Index II.

Return Maximum Valid H-Index

Continue the binary search until the search space collapses. The highest h that satisfies the condition is returned. This approach prevents common errors where naive scanning overcounts or undercounts eligible papers.

Complexity Analysis

Metric Value
Time Depends on the final approach
Space Depends on the final approach

Time complexity is O(log n) due to binary search over possible H-Index values. Space complexity is O(1) as only a few variables are maintained during search.

What Interviewers Usually Probe

  • Expecting logarithmic time using the sorted property of citations.
  • Looking for correct identification of the H-Index without linear scanning.
  • Check for off-by-one errors in determining n - index >= h condition.

Common Pitfalls or Variants

Common pitfalls

  • Confusing index positions with citation counts during binary search.
  • Failing to handle cases where citations are all lower than potential h.
  • Using linear scan, which violates the logarithmic time requirement.

Follow-up variants

  • H-Index calculation where the citations array is unsorted, requiring sorting first.
  • Dynamic H-Index tracking as new citations are added to the array.
  • Compute H-Index with additional constraints, e.g., only counting papers from certain journals.

FAQ

What is the H-Index II problem on LeetCode?

H-Index II asks for the maximum h where a sorted array of citations has at least h papers with h or more citations, requiring logarithmic runtime.

Why use binary search for H-Index II?

Binary search efficiently narrows down the valid H-Index in a sorted array, avoiding the need for linear scans and achieving O(log n) time.

How do I check if a candidate h is valid?

Compute n - index at the midpoint of the search; if this is >= h, the candidate h is valid and you search higher.

What are common mistakes when implementing H-Index II?

Mistakes include miscounting papers, off-by-one errors in binary search, and using linear iteration instead of O(log n) search.

Can H-Index II be solved if citations are unsorted?

Yes, but you must first sort the array; otherwise, binary search over the valid answer space cannot guarantee correctness.

terminal

Solution

Solution 1: Binary Search

We notice that if there are at least $x$ papers with citation counts greater than or equal to $x$, then for any $y \lt x$, its citation count must also be greater than or equal to $y$. This exhibits monotonicity.

1
2
3
4
5
6
7
8
9
10
11
class Solution:
    def hIndex(self, citations: List[int]) -> int:
        n = len(citations)
        left, right = 0, n
        while left < right:
            mid = (left + right + 1) >> 1
            if citations[n - mid] >= mid:
                left = mid
            else:
                right = mid - 1
        return left
H-Index II Solution: Binary search over the valid answer s… | LeetCode #275 Medium