20. Find Median in a Stream
The problem can be found at the following link: Question Link
Problem Description
Given a data stream arr[], where integers are read sequentially, determine the median of the elements encountered so far after each new integer is read.
Rules for Median Calculation:
If the number of elements is odd, the median is the middle element.
If the number of elements is even, the median is the average of the two middle elements.
Examples
Example 1:
Input:
arr = [5, 15, 1, 3, 2, 8]Output:
[5.0, 10.0, 5.0, 4.0, 3.0, 4.0]Explanation:
Read
5β median =5.0Read
15β median =(5 + 15) / 2 = 10.0Read
1β median =5.0Read
3β median =(3 + 5) / 2 = 4.0Read
2β median =3.0Read
8β median =(3 + 5) / 2 = 4.0
Constraints:
$(1 \leq \text{Number of elements} \leq 10^5)$
$(1 \leq arr[i] \leq 10^6)$
Min-Heap & Max-Heap
Key Idea:
Use two heaps to maintain the lower half and upper half of elements efficiently.
Max-Heap (Left Half) β Stores the smaller half of the elements.
Min-Heap (Right Half) β Stores the larger half of the elements.
The median is either:
The max of the left half (if odd elements)
The average of max(left half) and min(right half) (if even elements)
Algorithm Steps:
Insert each number into either max-heap or min-heap:
If
maxHeapis empty ORnum <= maxHeap.top(), push intomaxHeap.Else, push into
minHeap.
Balance the heaps:
If
maxHeapis larger thanminHeapby more than 1, move top ofmaxHeaptominHeap.If
minHeapis larger, move top ofminHeaptomaxHeap.
Calculate the median:
If
maxHeaphas more elements, returnmaxHeap.top().Else, return
(maxHeap.top() + minHeap.top()) / 2.0.
Time and Auxiliary Space Complexity
Time Complexity: (O(N \log N)) (heap insertions & balancing)
Auxiliary Space Complexity: (O(N)) (for storing elements in heaps)
Code (C++)
β‘ Alternative Approaches
2οΈβ£ Balanced BST (O(N log N) Time, O(N) Space)
Use Balanced BST (TreeSet in Java, SortedList in Python).
Keep two halves of elements.
Median = Middle Element (odd) / Average of Two (even).
πΉ Pros: Balanced approach, works well for dynamic insertions. πΉ Cons: Slightly slower than heaps due to extra balancing.
3οΈβ£ Brute Force (O(NΒ²) Time, O(N) Space)
Sort list every time a new element arrives.
Find median from sorted list.
πΉ Pros: Simple and easy to understand.
πΉ Cons: Very slow (O(NΒ²)), impractical for large data.
π Comparison of Approaches
Approach
β±οΈ Time Complexity
ποΈ Space Complexity
β Pros
β οΈ Cons
Heap (Priority Queue)
π’ O(N log N)
π‘ O(N)
Best runtime & simple to implement
Uses extra space
Balanced BST (TreeSet)
π‘ O(N log N)
π‘ O(N)
Balanced and good for dynamic data
Slightly slower
Brute Force (Sorting)
π΄ O(NΒ²)
π‘ O(N)
Simple & easy to understand
Very slow for large N
π‘ Best Choice?
β For best efficiency: Min-Heap (
O(N log N)).β For handling dynamic updates: Balanced BST (
O(N log N)).β For small input sizes: Brute Force (
O(NΒ²)).
Code (Java)
Code (Python)
Contribution and Support:
For discussions, questions, or doubts related to this solution, feel free to connect on LinkedIn: Any Questions. Letβs make this learning journey more collaborative!
β If you find this helpful, please give this repository a star! β
πVisitor Count
Last updated