03. Substrings with K Distinct
โ GFG solution to the Substrings with K Distinct Characters problem using sliding window and optimized frequency count. ๐
The problem can be found at the following link: ๐ Question Link
๐งฉ Problem Description
Given a string consisting of lowercase characters and an integer k
, the task is to count all possible substrings (not necessarily distinct) that have exactly k
distinct characters.
๐ Examples
Example 1
Input: s = "abc", k = 2
Output: 2
Explanation: Possible substrings are ["ab", "bc"]
Example 2
Input: s = "aba", k = 2
Output: 3
Explanation: Possible substrings are ["ab", "ba", "aba"]
Example 3
Input: s = "aa", k = 1
Output: 3
Explanation: Possible substrings are ["a", "a", "aa"]
๐ Constraints
$1 \le |s| \le 10^6$
s
contains only lowercase English letters$1 \le k \le 26$
โ
My Approach
We use a two-pass sliding window technique to count the number of substrings with exactly k
distinct characters. The idea is to leverage a helper function that counts substrings with at most k
distinct characters, and then use the inclusion-exclusion principle to find the exact count.
๐ก Idea:
Define a helper function
count(s, k)
that returns the number of substrings with at mostk
distinct characters.Use a sliding window with two pointers and a frequency array of size 26 (since the string contains only lowercase letters).
Expand the window by moving the right pointer, and keep track of the frequency of characters.
If the number of distinct characters exceeds
k
, shrink the window from the left until the window contains at mostk
distinct characters.For each position
j
, add(j - i + 1)
โ the number of valid substrings ending atj
โ to the result.Finally, compute:
substrings with exactly k distinct = count(s, k) - count(s, k - 1)
This subtraction removes substrings that have fewer than k
distinct characters, leaving only those with exactly k
.
โ๏ธ Algorithm Steps:
Implement a helper function
count(s, k)
:Initialize a frequency array
freq[26]
and two pointersi = 0
,j = 0
.Iterate through the string with pointer
j
:Increment frequency of
s[j]
.If distinct characters exceed
k
, movei
forward while updating frequencies until the window has at mostk
distinct characters.Add
(j - i + 1)
to the result count for valid substrings ending atj
.
Call
count(s, k)
andcount(s, k - 1)
and return their difference.
๐ Time and Auxiliary Space Complexity
Time Complexity: O(n) โ Each character is processed at most twice (once when added to the window, once when removed).
Auxiliary Space Complexity: O(1) โ Fixed-size frequency array of length 26 for lowercase English letters.
๐งโ๐ป Code (C++)
class Solution {
public:
int count(string &s, int k) {
int n = s.length(), ans = 0;
vector<int> freq(26, 0);
int i = 0, distinct = 0;
for (int j = 0; j < n; ++j) {
if (++freq[s[j] - 'a'] == 1) ++distinct;
while (distinct > k)
if (--freq[s[i++] - 'a'] == 0) --distinct;
ans += j - i + 1;
}
return ans;
}
int countSubstr(string &s, int k) {
return count(s, k) - count(s, k - 1);
}
};
๐งโ๐ป Code (Java)
class Solution {
int count(String s, int k) {
int[] freq = new int[26];
int i = 0, ans = 0, distinct = 0;
for (int j = 0; j < s.length(); ++j) {
if (++freq[s.charAt(j) - 'a'] == 1) ++distinct;
while (distinct > k)
if (--freq[s.charAt(i++) - 'a'] == 0) --distinct;
ans += j - i + 1;
}
return ans;
}
int countSubstr(String s, int k) {
return count(s, k) - count(s, k - 1);
}
}
๐ Code (Python)
class Solution:
def count(self, s, k):
freq = [0] * 26
i = ans = distinct = 0
for j in range(len(s)):
if freq[ord(s[j]) - ord('a')] == 0:
distinct += 1
freq[ord(s[j]) - ord('a')] += 1
while distinct > k:
freq[ord(s[i]) - ord('a')] -= 1
if freq[ord(s[i]) - ord('a')] == 0:
distinct -= 1
i += 1
ans += j - i + 1
return ans
def countSubstr(self, s, k):
return self.count(s, k) - self.count(s, k - 1)
๐ง Contribution and Support
For discussions, questions, or doubts related to this solution, feel free to connect on LinkedIn: ๐ฌ Any Questions?. Letโs make this learning journey more collaborative!
โญ If you find this helpful, please give this repository a star! โญ
๐Visitor Count
Last updated