LeetCode Problem Workspace
H-Index II
Compute a researcher's H-Index from a sorted citation array using binary search over the valid answer space efficiently.
2
Topics
7
Code langs
3
Related
Practice Focus
Medium · Binary search over the valid answer space
Answer-first summary
Compute a researcher's H-Index from a sorted citation array using binary search over the valid answer space efficiently.
Ace coding interviews with Interview AiBoxInterview AiBox guidance for Binary search over the valid answer space
To solve H-Index II, immediately apply binary search on the sorted citations array to locate the maximum h satisfying the H-Index definition. Check the number of papers with citations >= h at each midpoint. This guarantees logarithmic time while handling edge cases where multiple papers have fewer citations than h, avoiding linear scans.
Problem Statement
Given a sorted array of integers representing a researcher's citations per paper, determine the H-Index: the largest number h such that the researcher has at least h papers with h or more citations each. The array is non-decreasing and can contain zero citations. Your algorithm must achieve logarithmic runtime.
For example, with citations = [0,1,3,5,6], the researcher has 5 papers; 3 of them have at least 3 citations, so the H-Index is 3. Your task is to implement a solution that efficiently identifies this H-Index without scanning each paper linearly.
Examples
Example 1
Input: citations = [0,1,3,5,6]
Output: 3
[0,1,3,5,6] means the researcher has 5 papers in total and each of them had received 0, 1, 3, 5, 6 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.
Example 2
Input: citations = [1,2,100]
Output: 2
Example details omitted.
Constraints
- n == citations.length
- 1 <= n <= 105
- 0 <= citations[i] <= 1000
- citations is sorted in ascending order.
Solution Approach
Binary Search over the Valid Answer Space
Treat possible H-Index values from 0 to n as a search space. At each step, check if there are enough papers with citations >= mid. If yes, search higher; otherwise, search lower. This leverages the sorted order to skip unnecessary checks.
Check Condition at Midpoint Efficiently
For each midpoint h, compute n - index to see if at least h papers meet or exceed h citations. This avoids scanning all elements and ensures correctness while following the binary search pattern specific to H-Index II.
Return Maximum Valid H-Index
Continue the binary search until the search space collapses. The highest h that satisfies the condition is returned. This approach prevents common errors where naive scanning overcounts or undercounts eligible papers.
Complexity Analysis
| Metric | Value |
|---|---|
| Time | Depends on the final approach |
| Space | Depends on the final approach |
Time complexity is O(log n) due to binary search over possible H-Index values. Space complexity is O(1) as only a few variables are maintained during search.
What Interviewers Usually Probe
- Expecting logarithmic time using the sorted property of citations.
- Looking for correct identification of the H-Index without linear scanning.
- Check for off-by-one errors in determining n - index >= h condition.
Common Pitfalls or Variants
Common pitfalls
- Confusing index positions with citation counts during binary search.
- Failing to handle cases where citations are all lower than potential h.
- Using linear scan, which violates the logarithmic time requirement.
Follow-up variants
- H-Index calculation where the citations array is unsorted, requiring sorting first.
- Dynamic H-Index tracking as new citations are added to the array.
- Compute H-Index with additional constraints, e.g., only counting papers from certain journals.
FAQ
What is the H-Index II problem on LeetCode?
H-Index II asks for the maximum h where a sorted array of citations has at least h papers with h or more citations, requiring logarithmic runtime.
Why use binary search for H-Index II?
Binary search efficiently narrows down the valid H-Index in a sorted array, avoiding the need for linear scans and achieving O(log n) time.
How do I check if a candidate h is valid?
Compute n - index at the midpoint of the search; if this is >= h, the candidate h is valid and you search higher.
What are common mistakes when implementing H-Index II?
Mistakes include miscounting papers, off-by-one errors in binary search, and using linear iteration instead of O(log n) search.
Can H-Index II be solved if citations are unsorted?
Yes, but you must first sort the array; otherwise, binary search over the valid answer space cannot guarantee correctness.
Solution
Solution 1: Binary Search
We notice that if there are at least $x$ papers with citation counts greater than or equal to $x$, then for any $y \lt x$, its citation count must also be greater than or equal to $y$. This exhibits monotonicity.
class Solution:
def hIndex(self, citations: List[int]) -> int:
n = len(citations)
left, right = 0, n
while left < right:
mid = (left + right + 1) >> 1
if citations[n - mid] >= mid:
left = mid
else:
right = mid - 1
return leftContinue Topic
array
Practice more edge cases under the same topic.
arrow_forwardauto_awesomeContinue Pattern
Binary search over the valid answer space
Expand the same solving frame across more problems.
arrow_forwardsignal_cellular_altSame Difficulty Track
Medium
Stay on this level to stabilize interview delivery.
arrow_forward