23. Union of Arrays with Duplicates
β GFG solution to Union of Arrays with Duplicates: find all distinct elements from two arrays using efficient set-based or sorting techniques. π
The problem can be found at the following link: π Question Link
π§© Problem Description
You are given two arrays a[] and b[], return the Union of both the arrays in any order.
The Union of two arrays is a collection of all distinct elements present in either of the arrays. If an element appears more than once in one or both arrays, it should be included only once in the result.
Note: Elements of a[] and b[] are not necessarily distinct. You can return the Union in any order but the driver code will print the result in sorted order only.
π Examples
Example 1
Input: a[] = [1, 2, 3, 2, 1], b[] = [3, 2, 2, 3, 3, 2]
Output: [1, 2, 3]
Explanation: Union set of both the arrays will be 1, 2 and 3.Example 2
Input: a[] = [1, 2, 3], b[] = [4, 5, 6]
Output: [1, 2, 3, 4, 5, 6]
Explanation: Union set of both the arrays will be 1, 2, 3, 4, 5 and 6.Example 3
π Constraints
$1 \le \text{a.size()}, \text{b.size()} \le 10^6$
$0 \le \text{a}[i], \text{b}[i] \le 10^5$
β
My Approach
The optimal solution uses Merge and Sort with Deduplication:
Merge + Sort + Unique
Merge Arrays:
Combine both arrays into a single array.
This preserves all elements from both sources.
Sort Combined Array:
Sort the merged array to group duplicate elements together.
Sorting brings identical elements adjacent to each other.
Remove Duplicates:
Use two-pointer or set to eliminate consecutive duplicates.
In C++, use
unique()algorithm which removes consecutive duplicates.
Return Result:
The resulting array contains all distinct elements from both arrays.
Key Advantage: This approach is straightforward and leverages standard library functions for efficiency. While it requires sorting (O(n log n)), it's cache-friendly and has good practical performance.
π Time and Auxiliary Space Complexity
Expected Time Complexity: O((n + m) log(n + m)), where n and m are the sizes of the two arrays. Merging takes O(n + m), sorting the combined array takes O((n + m) log(n + m)), and removing duplicates takes O(n + m). The sorting operation dominates the overall complexity.
Expected Auxiliary Space Complexity: O(n + m), as we create a new array to store the merged elements from both input arrays. The space required is proportional to the total number of elements.
π§βπ» Code (C)
π§βπ» Code (C++)
β‘ View Alternative Approaches with Code and Analysis
π 2οΈβ£ Set-Based Approach
π‘ Algorithm Steps:
Create an unordered set to store unique elements.
Insert all elements from first array into the set.
Insert all elements from second array into the set.
Convert set back to vector and return.
π Complexity Analysis:
Time: β±οΈ O(n + m) - Linear insertion for both arrays
Auxiliary Space: πΎ O(n + m) - Set storage in worst case
β
Why This Approach?
Optimal time complexity with linear performance
Automatic deduplication via set properties
Simple and clean implementation
π 3οΈβ£ TreeSet (Ordered Set) Approach
π‘ Algorithm Steps:
Use ordered set (TreeSet in Java, set in C++) for automatic sorting.
Insert all elements from both arrays.
The set maintains elements in sorted order and removes duplicates.
Convert to array/list and return.
π Complexity Analysis:
Time: β±οΈ O((n + m) log(n + m)) - Set insertion with log factor
Auxiliary Space: πΎ O(n + m) - Set storage
β
Why This Approach?
Result is automatically sorted
Clean single data structure usage
Predictable performance characteristics
π 4οΈβ£ Frequency Map Approach
π‘ Algorithm Steps:
Use hash map to track which elements have been seen.
Iterate through first array, mark elements as seen.
Iterate through second array, mark new elements.
Collect all marked elements into result vector.
π Complexity Analysis:
Time: β±οΈ O(n + m) - Linear traversal with hash operations
Auxiliary Space: πΎ O(n + m) - Map storage
β
Why This Approach?
Similar to set but shows map usage
Can be extended to track frequencies if needed
Explicit marking logic
π 5οΈβ£ Two-Pointer Merge (Sorted Arrays)
π‘ Algorithm Steps:
First sort both arrays individually.
Use two pointers to merge arrays while avoiding duplicates.
Compare elements at both pointers and add smaller unique element.
Handle remaining elements from either array.
π Complexity Analysis:
Time: β±οΈ O(n log n + m log m) - Sorting both arrays separately
Auxiliary Space: πΎ O(1) - Excluding output array
β
Why This Approach?
Classic merge technique from merge sort
Space efficient if sorting in-place
Good for teaching two-pointer patterns
π π Comparison of Approaches
π Approach
β±οΈ Time Complexity
πΎ Space Complexity
β Pros
β οΈ Cons
π― Merge + Sort + Unique
π‘ O((n+m) log(n+m))
π‘ O(n + m)
π STL optimized, cache-friendly
π§ Sorting overhead
π Unordered Set
π’ O(n + m)
π‘ O(n + m)
β‘ Fastest average case
π§ Unordered output
π TreeSet (Ordered)
π‘ O((n+m) log(n+m))
π‘ O(n + m)
π― Sorted output automatically
π Log factor per insertion
ποΈ Frequency Map
π’ O(n + m)
π‘ O(n + m)
π§ Extensible for frequencies
π§ Similar to set
π Two-Pointer Merge
π‘ O(n log n + m log m)
π’ O(1)*
πΎ Space efficient
π§ Complex duplicate handling
*Excluding output array
π Best Choice Recommendation
π― Scenario
ποΈ Recommended Approach
π₯ Performance Rating
π Fastest average performance
π₯ Unordered Set
β β β β β
π Need sorted output
π₯ TreeSet (Ordered)
β β β β β
πΎ Memory constrained
π₯ Two-Pointer Merge
β β β β β
π― Competitive programming
π Merge + Sort + Unique
β β β β β
β Code (Java)
π Code (Python)
π§ Contribution and Support
For discussions, questions, or doubts related to this solution, feel free to connect on LinkedIn: π¬ Any Questions?. Let's make this learning journey more collaborative!
β If you find this helpful, please give this repository a star! β
πVisitor Count
Last updated