Problem Definition
Given two strings s and p, identify every starting index within s where a substring contains the exact same characters as p with identical frequencies. Character order is irrelevant for matching purposes.
Example: With s = "cbaebabacd" and p = "abc", the qualifying substrings appear at indices 0 ("cba") and 6 ("bac"). The expected output array is [0, 6].
Algorithmic Strategy
The optimal solution applies a fixed-length sliding window paired with character frequency mapping. Because the problem domain restricts inputs to lowercase Latin letters, we can efficiently track distributions using static integer buffers of length 26.
- Frequency Initialization: Allocate two count arrays. The first records the character requirements derived from
p, while the second monitors the dynamic state of the traversed segment ins. - Rightward Expansion: Iterate through
susing a right boundary pointer. Increment the frequency slot corresponding to each newly encountered character. - Leftward Compression: If the current window span surpasses the length of
p, decrement the frequency count for the character being pushed out from the left edge. - Equality Validation: Once the window dimensions align with
p, perform a direct comparison between the two count arrays. An exact match confirms an anagram. Store the left boundary index in the results container.
Code Implementation
#include <iostream>
#include <vector>
#include <string>
#include <array>
std::vector<int> locateAnagrams(const std::string& text, const std::string& pattern) {
std::vector<int> validIndices;
int textLen = text.length();
int patLen = pattern.length();
if (patLen > textLen) return validIndices;
std::array<int, 26> targetFreq = {};
std::array<int, 26> currentFreq = {};
// Establish baseline character distribution
for (char charVal : pattern) {
targetFreq[charVal - 'a']++;
}
// Scan through the source string
for (int rightPos = 0; rightPos < textLen; ++rightPos) {
// Incorporate new character into the active window
currentFreq[text[rightPos] - 'a']++;
// Expel leftmost character once capacity is exceeded
if (rightPos >= patLen) {
currentFreq[text[rightPos - patLen] - 'a']--;
}
// Verify configuration when window reaches target dimensions
if (rightPos >= patLen - 1) {
if (currentFreq == targetFreq) {
validIndices.push_back(rightPos - patLen + 1);
}
}
}
return validIndices;
}
int main() {
std::string s_input = "cbaebabacd";
std::string p_input = "abc";
std::vector<int> matches = locateAnagrams(s_input, p_input);
for (int idx : matches) {
std::cout << idx << " ";
}
return 0;
}
Execution Mechanics
- ASCII Offset Mapping: Subtracting the base character
'a'converts alphabetic input into contiguous zero-indexed integers, enabling direct array access without hash overhead. - Boundary Arithmetic: The expression
rightPos - patLenprecisely targets the element exiting the window frame. Unlike variable-window approaches, this subtraction does not require additional offset adjustments because the window size constraint is mathematically hard-coded. - Starting Index Derivation: When equality is detected at
rightPos, the original substring begins atrightPos - patLen + 1. This calculation shifts the cursor backward from the right edge to capture the exact anchor position of the valid segment.