07. Number of Distinct Subsequences
β GFG solution to the Number of Distinct Subsequences problem: find the count of distinct subsequences of a string using dynamic programming with modulo arithmetic. π
The problem can be found at the following link: π Question Link
π§© Problem Description
Given a string str consisting of lowercase English alphabets, the task is to find the number of distinct subsequences of the string.
Note: Answer can be very large, so output will be answer modulo 10^9+7.
A subsequence is a sequence that can be derived from another sequence by deleting zero or more elements without changing the order of the remaining elements. For example, "abc" is a subsequence of "aebdc" but "aec" is not.
π Examples
Example 1
Input: str = "gfg"
Output: 7
Explanation:
The seven distinct subsequences are "", "g", "f", "gf", "fg", "gg" and "gfg".Example 2
Input: str = "ggg"
Output: 4
Explanation:
The four distinct subsequences are "", "g", "gg", "ggg".π Constraints
$1 \le |str| \le 10^5$
strcontains lowercase English alphabets
β
My Approach
The optimal approach uses Dynamic Programming to efficiently count distinct subsequences:
Dynamic Programming with Last Occurrence Tracking
Understanding Subsequences:
For any string, the number of subsequences is 2^n (including empty string).
However, when characters repeat, some subsequences become identical.
We need to count only distinct subsequences.
DP State Definition:
Let
dp[i]represent the number of distinct subsequences of the prefix ending at index i.Initialize
dp[0] = 1for the empty subsequence.
Recurrence Relation:
When adding a new character, we can either include it in existing subsequences or not.
This doubles the count of subsequences:
dp[i] = 2 * dp[i-1].However, if the character has appeared before, we need to subtract the count of subsequences ending with the previous occurrence to avoid duplicates.
If
last[c]represents the last index where charactercappeared, then:dp[i] = 2 * dp[i-1] - dp[last[c]-1](if characterchas appeared before).
Optimization:
Instead of storing the entire DP array, we can optimize space by just keeping track of the current result.
We maintain an array
lastto store the last occurrence of each character.
Modulo Operation:
Since the result can be very large, we take modulo 10^9+7 at each step to prevent overflow.
π Time and Auxiliary Space Complexity
Expected Time Complexity: O(n), where n is the length of the string. We process each character exactly once.
Expected Auxiliary Space Complexity: O(1), as we only use a fixed-size array of 26 elements to track last occurrences, regardless of the input size.
π§βπ» Code (C++)
β Code (Java)
π Code (Python)
π§ Contribution and Support
For discussions, questions, or doubts related to this solution, feel free to connect on LinkedIn: π¬ Any Questions?. Let's make this learning journey more collaborative!
β If you find this helpful, please give this repository a star! β
πVisitor Count
Last updated