30. Count of Distinct Substrings
β GFG solution to the Count of Distinct Substrings problem: find total number of unique non-empty substrings using efficient trie-based approach. π
The problem can be found at the following link: π Question Link
π§© Problem Description
You are given a string s consisting of lowercase English characters. Your task is to determine the total number of distinct non-empty substrings present in the string.
A substring is defined as a contiguous block of characters within the string. Two substrings are considered distinct if their contents differ, even if they originate from different positions in the string.
Note: The empty substring is not counted.
π Examples
Example 1
Input: s = "ababa"
Output: 9
Explanation: All distinct substrings of "ababa" are: "a", "b", "ab", "ba", "aba", "bab", "abab", "baba", "ababa".Example 2
Input: s = "aaa"
Output: 3
Explanation: The distinct substrings of "aaa" are: "a", "aa", "aaa".π Constraints
$1 \le \text{s.size()} \le 3000$
β
My Approach
The optimal approach uses a Trie (Prefix Tree) data structure to efficiently count distinct substrings:
Trie-Based Substring Counting
Initialize Trie:
Create a trie root node with 26 children (for each lowercase letter).
Maintain a counter to track newly created nodes.
Generate All Suffixes:
For each starting position
iin the string, consider it as the beginning of a suffix.Start from position
iand traverse to the end of the string.
Insert Characters into Trie:
For each character in the current suffix, navigate through the trie.
If a child node doesn't exist for the current character, create it and increment the counter.
Each new node represents a new unique substring.
Count Unique Substrings:
The total count of new nodes created equals the number of distinct substrings.
Each path from root to any node represents a unique substring.
Return Result:
Return the final count of distinct substrings.
π Time and Auxiliary Space Complexity
Expected Time Complexity: O(nΒ²), where n is the length of the string. We generate all possible substrings by iterating through each starting position and extending to the end, and each character insertion in the trie takes O(1) time.
Expected Auxiliary Space Complexity: O(nΒ²), as in the worst case (all characters are distinct), the trie will store all possible substrings, requiring space proportional to the total number of characters across all unique substrings.
π§βπ» Code (C++)
β‘ View Alternative Approaches with Code and Analysis
π 2οΈβ£ Set-Based Approach
π‘ Algorithm Steps:
Use an unordered set to store all unique substrings.
Generate all possible substrings using nested loops.
Insert each substring into the set for automatic deduplication.
Return the size of the set as the count.
π Complexity Analysis:
Time: β±οΈ O(nΒ³) - Substring generation and insertion
Auxiliary Space: πΎ O(nΒ²) - Storage for all substrings
β
Why This Approach?
Simple and intuitive implementation
Automatic handling of duplicates via set
Easy to debug and verify
Note: This approach results in Time Limit Exceeded (TLE) for large inputs (fails ~1110/1121 test cases due to time constraints).
π 3οΈβ£ Suffix Array Approach
π‘ Algorithm Steps:
Build a suffix array of all suffixes.
Sort the suffix array lexicographically.
Compute longest common prefix between consecutive suffixes.
Calculate unique substrings using formula: total substrings minus duplicate LCP counts.
π Complexity Analysis:
Time: β±οΈ O(nΒ² log n) - Suffix sorting dominates
Auxiliary Space: πΎ O(nΒ²) - Storage for suffixes
β
Why This Approach?
Mathematical optimization reduces redundant checking
Scales better for very large strings
Classic string algorithm pattern
π 4οΈβ£ Optimized Trie with Inline Node
π‘ Algorithm Steps:
Create a compact trie structure with inline node definition.
Insert all suffixes starting from each position.
Count new nodes created during insertion.
Each new node represents a unique substring.
π Complexity Analysis:
Time: β±οΈ O(nΒ²) - Optimal for trie-based approach
Auxiliary Space: πΎ O(nΒ²) - Worst case trie storage
β
Why This Approach?
Most space-efficient trie implementation
Compact code with inline struct
Best balance of readability and performance
π π Comparison of Approaches
π Approach
β±οΈ Time Complexity
πΎ Space Complexity
β Pros
β οΈ Cons
π² Trie Standard
π’ O(nΒ²)
π‘ O(nΒ²)
π Optimal time complexity
πΎ Moderate space usage
π Set-Based
π΄ O(nΒ³)
π‘ O(nΒ²)
π Simplest implementation
π Slowest time complexity
π Suffix Array
π‘ O(nΒ² log n)
π‘ O(nΒ²)
π― Mathematical elegance
π§ Complex to implement
β‘ Optimized Trie
π’ O(nΒ²)
π’ O(nΒ²)
β Most compact code
π§ Requires understanding tries
π Best Choice Recommendation
π― Scenario
ποΈ Recommended Approach
π₯ Performance Rating
π Optimal performance needed
π₯ Optimized Trie
β β β β β
π Learning/Prototyping
π₯ Set-Based
β β β ββ
π§ Large strings (n > 10β΄)
π₯ Suffix Array
β β β β β
π― Interview/Competitive
π Trie Standard
β β β β β
β Code (Java)
π Code (Python)
π§ Contribution and Support
For discussions, questions, or doubts related to this solution, feel free to connect on LinkedIn: π¬ Any Questions?. Let's make this learning journey more collaborative!
β If you find this helpful, please give this repository a star! β
πVisitor Count
Last updated