18. Find H-Index
β GFG solution to the Find H-Index problem: calculate researcher's h-index using optimal bucket counting technique with linear time complexity. π
The problem can be found at the following link: π Question Link
π§© Problem Description
You are given an array citations[]
, where each element citations[i]
represents the number of citations received by the ith paper of a researcher. Your task is to calculate the researcher's H-index.
The H-index is defined as the maximum value H, such that the researcher has published at least H papers, and all those papers have citation value greater than or equal to H.
π Examples
Example 1
Input: citations[] = [3, 0, 5, 3, 0]
Output: 3
Explanation: There are at least 3 papers with citation counts of 3, 5, and 3,
all having citations greater than or equal to 3.
Example 2
Input: citations[] = [5, 1, 2, 4, 1]
Output: 2
Explanation: There are 3 papers (with citation counts of 5, 2, and 4) that have 2 or more citations.
However, the H-Index cannot be 3 because there aren't 3 papers with 3 or more citations.
Example 3
Input: citations[] = [0, 0]
Output: 0
Explanation: The H-index is 0 because there are no papers with at least 1 citation.
π Constraints
$1 \le \text{citations.size()} \le 10^6$
$0 \le \text{citations}[i] \le 10^6$
β
My Approach
The optimal approach uses Bucket Counting technique to achieve linear time complexity:
Bucket Counting Approach
Create Bucket Array:
Create a bucket array of size
n+1
wheren
is the number of papers.Index
i
represents papers with exactlyi
citations (orn+
citations for indexn
).
Count Citations:
For each citation count, increment the corresponding bucket.
Citations greater than
n
are placed in bucketn
since h-index cannot exceedn
.
Calculate H-Index:
Traverse buckets from right to left (highest to lowest citations).
Maintain cumulative count of papers with at least
i
citations.Return the largest
i
where cumulative count β₯i
.
Key Insight:
H-index cannot exceed the total number of papers.
We only need to track counts, not exact citation values above
n
.
π Time and Auxiliary Space Complexity
Expected Time Complexity: O(n), where n is the size of the citations array. We make two single passes: one to populate buckets and one to calculate h-index.
Expected Auxiliary Space Complexity: O(n), where n is the size of the citations array. We use a bucket array of size n+1 to store citation counts.
π§ Code (C)
int hIndex(int* citations, int citationsSize) {
int n = citationsSize;
int* bucket = (int*)calloc(n + 1, sizeof(int));
for (int i = 0; i < n; i++)
bucket[citations[i] > n ? n : citations[i]]++;
int count = 0;
for (int i = n; i >= 0; i--) {
count += bucket[i];
if (count >= i) {
free(bucket);
return i;
}
}
free(bucket);
return 0;
}
π§βπ» Code (C++)
class Solution {
public:
int hIndex(vector<int>& citations) {
int n = citations.size();
vector<int> bucket(n + 1, 0);
for (int c : citations) bucket[min(c, n)]++;
int count = 0;
for (int i = n; i >= 0; i--) {
count += bucket[i];
if (count >= i) return i;
}
return 0;
}
};
β Code (Java)
class Solution {
public int hIndex(int[] citations) {
int n = citations.length;
int[] bucket = new int[n + 1];
for (int c : citations) bucket[Math.min(c, n)]++;
int count = 0;
for (int i = n; i >= 0; i--) {
count += bucket[i];
if (count >= i) return i;
}
return 0;
}
}
π Code (Python)
class Solution:
def hIndex(self, citations):
n = len(citations)
bucket = [0] * (n + 1)
for c in citations:
bucket[min(c, n)] += 1
count = 0
for i in range(n, -1, -1):
count += bucket[i]
if count >= i:
return i
return 0
π§ Contribution and Support
For discussions, questions, or doubts related to this solution, feel free to connect on LinkedIn: π¬ Any Questions?. Let's make this learning journey more collaborative!
β If you find this helpful, please give this repository a star! β
πVisitor Count
Last updated