#274

Medium

LeetCode Problem Workspace

H-Index

Determine a researcher's h-index by analyzing citations using array sorting and counting techniques efficiently and accurately.

array sorting counting sort

Problem Statement

You are given an integer array citations where citations[i] represents the number of citations for the ith paper of a researcher. Calculate the researcher's h-index, defined as the largest number h such that the researcher has at least h papers with at least h citations each.

For example, given citations = [3,0,6,1,5], the researcher has 5 papers, with citation counts of 3, 0, 6, 1, 5. The h-index is 3 because there are 3 papers with at least 3 citations each, and the remaining two papers have no more than 3 citations each.

Examples

Example 1

Input: citations = [3,0,6,1,5]

Output: 3

[3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.

Example 2

Input: citations = [1,3,1]

Output: 1

Example details omitted.

Constraints

n == citations.length
1 <= n <= 5000
0 <= citations[i] <= 1000

Solution Approach

Sort and Compare

Sort the citations array in ascending order and iterate from highest to lowest to find the maximum h where citations[i] >= h. This direct approach uses the array plus sorting pattern to simplify counting high-citation papers.

Counting Sort Optimization

Use a counting array to record the frequency of each citation count. Then traverse from the maximum possible citation downward, summing frequencies until the cumulative count meets or exceeds the index. This reduces the problem to linear time and avoids unnecessary sorting overhead.

Binary Search on Sorted Array

After sorting citations ascendingly, apply binary search to find the smallest index where citations[index] >= n - index, with n as the array length. This approach efficiently leverages the sorted array structure and is ideal when handling large datasets.

Complexity Analysis

Metric	Value
Time	Depends on the final approach
Space	Depends on the final approach

Sorting takes O(n log n) time with O(1) extra space if in-place, while counting sort provides O(n + m) time with O(m) space where m is the max citation. Binary search adds O(log n) to the sorting step. Space complexity depends on chosen approach, either O(1) or O(n) for auxiliary arrays.

What Interviewers Usually Probe

Expect an initial sort-based solution and probe understanding of h-index properties.
Clarify whether counting sort optimization is feasible given citation constraints.
Check if candidate can handle edge cases like all zeros or all identical citation counts.

Common Pitfalls or Variants

Common pitfalls

Miscounting papers with exactly h citations or assuming strict greater-than.
Failing to consider empty or single-element arrays.
Ignoring the trade-off between sorting and linear counting when max citation is small.

Follow-up variants

Compute h-index when citations are provided in descending order without sorting.
Handle streaming citation updates and dynamically maintain the h-index.
Calculate h-index for multiple researchers simultaneously using counting arrays.

FAQ

What is the H-Index problem pattern on LeetCode?

It follows the Array plus Sorting pattern, where you sort citation counts or use counting arrays to compute the maximum h-index.

Can I solve H-Index without sorting?

Yes, using a counting sort or bucket array approach lets you determine the h-index in linear time without a full sort.

What edge cases should I consider?

Empty arrays, all zeros, all papers with the same citation count, or citations exceeding array length are key cases to handle.

What is the time complexity using counting sort?

Counting sort approach runs in O(n + maxCitation) time, which is linear if maxCitation is reasonable compared to n.

How do I verify my h-index result is correct?

Check that at least h papers have at least h citations and the remaining papers have no more than h citations to confirm correctness.

terminal

Solution

Solution 1: Sorting

We can sort the array `citations` in descending order. Then we enumerate the value $h$ from large to small, if there is an $h$ value satisfying $citations[h-1] \geq h$, it means that there are at least $h$ papers that have been cited at least $h$ times, just return $h$ directly. If we cannot find such an $h$ value, it means that all the papers have not been cited, return $0$.

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        citations.sort(reverse=True)
        for h in range(len(citations), 0, -1):
            if citations[h - 1] >= h:
                return h
        return 0

Solution 2: Counting + Sum

We can use an array $cnt$ of length $n+1$, where $cnt[i]$ represents the number of papers with the reference count of $i$. We traverse the array `citations` and treat the papers with the reference count greater than $n$ as papers with a reference count of $n$. Then we use the reference count as the index and add $1$ to the corresponding element of $cnt$ for each paper. In this way, we have counted the number of papers for each reference count.

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        citations.sort(reverse=True)
        for h in range(len(citations), 0, -1):
            if citations[h - 1] >= h:
                return h
        return 0

Solution 3: Binary Search

We notice that if there is a $h$ value that satisfies at least $h$ papers are cited at least $h$ times, then for any $h'<h$, at least $h'$ papers are cited at least $h'$ times. Therefore, we can use the binary search method to find the largest $h$ such that at least $h$ papers are cited at least $h$ times.

class Solution:
    def hIndex(self, citations: List[int]) -> int:
        citations.sort(reverse=True)
        for h in range(len(citations), 0, -1):
            if citations[h - 1] >= h:
                return h
        return 0

Continue Practicing

#561 array partition #912 sort an array #1051 height checker