Efficient String Partitioning via KMP Periodicity Detection

This analysis addresses the problem of decomposing a string into the minimum number of substrings such that none of the substrings are "cyclic" (periodic). A string is considered cyclic if it can be constructed by repeating a smaller substring multiple times. Given a string S of length N, we must deetrmine the minimum number of partitions and the number of distinct ways to achieve this minimum partitioning.

Understanding Periodicity

A string X of length L has a period P if X[i] == X[i + P] for all valid endices i, and P divides L. If such a P exists where P < L, the string is cyclic. The fundamental property for checking periodicity using the Knuth-Morris-Pratt (KMP) prefix function is:

  • Let π[i] be the length of the longest proper prefix of the substring S[0...i] that is also a suffix.
  • The minimal period of a string of length L is L - π[L-1].
  • The string is cyclic if (L - π[L-1]) divides L. Otherwise, it is non-cyclic.

Deriving the Minimum Partitions

We can categorize the solution into three distinct cases based on the properties of the input string S:

  1. Non-cyclic String: If S is not cyclic, the minimum partitions required is 1 (the string itself). There is only 1 way to do this.

  2. Uniform String (Period 1): If S consists of identical characters (e.g., "aaaa"), its minimal period is 1. Any substring of length greater than 1 will be cyclic. Thus, we must split it into N substrings of length 1. The minimum partitions is N, with 1 way.

  3. Cyclic with Period > 1: If S is cyclic but the minimal period is greater then 1, the minimum number of partitions is always 2.

    Proof: Consider splitting S into a prefix of length N-1 and a suffix of length 1. The suffix is trivially non-cyclic. The prefix is formed by taking a cyclic string and removing the last few characters (fewer than the minimal period). This results in an "incomplete cycle." It can be proven mathematically that an incomplete cycle string cannot be cyclic. Therefore, a partition of size 2 always exists. Since the whole string is cyclic (size 1 is invalid) and we have a valid partition of size 2, the minimum is exactly 2.

Counting Valid Splits

For the general case where the answer is 2, we need to count the number of indices i (1 ≤ i < N) such that both the prefix S[0...i-1] and the suffix S[i...N-1] are non-cyclic.

To check the cyclic nature of every prefix and suffix efficiently:

  • Compute the KMP prefix function π for S. This allows O(1) periodicity checks for all prefixes.
  • Compute the KMP prefix function for the reversed string S'. The prefixes of S' correspond to the suffixes of S. This allows O(1) periodicity checks for all suffixes.

The total time complexity is O(N) due to the linear nature of the KMP algorithm.

Implementation

Below is the C++ implementation utilizing the KMP algorithm to solve the problem in linear time.


#include <iostream>
#include <vector>
#include <string>
#include <algorithm>

using namespace std;

// Computes the prefix function (pi array) for the given string
vector<int> computePrefixFunction(const string& s) {
    int n = s.length();
    vector<int> pi(n, 0);
    for (int i = 1; i < n; ++i) {
        int j = pi[i - 1];
        while (j > 0 && s[i] != s[j]) {
            j = pi[j - 1];
        }
        if (s[i] == s[j]) {
            j++;
        }
        pi[i] = j;
    }
    return pi;
}

// Checks if a substring of length 'len' with prefix function value 'piVal' is cyclic
// cyclic = (len % (len - piVal) == 0)
bool isPeriodic(int len, int piVal) {
    int period = len - piVal;
    return period != len && len % period == 0;
}

int main() {
    ios_base::sync_with_stdio(false);
    cin.tie(NULL);

    string s;
    if (!(cin >> s)) return 0;

    int n = s.length();

    // Case 1: Check if all characters are identical (Minimal Period = 1)
    bool uniform = true;
    for (int i = 1; i < n; ++i) {
        if (s[i] != s[0]) {
            uniform = false;
            break;
        }
    }

    if (uniform) {
        cout << n << "\n1" << endl;
        return 0;
    }

    // Compute Prefix Function for original string
    vector<int> pi = computePrefixFunction(s);

    // Case 2: Check if the whole string is NOT cyclic
    if (!isPeriodic(n, pi[n - 1])) {
        cout << "1\n1" << endl;
        return 0;
    }

    // Case 3: The string is cyclic with period > 1. Answer is 2.
    // Now count valid splits.
    
    // Compute Prefix Function for reversed string to handle suffixes
    string rev_s = s;
    reverse(rev_s.begin(), rev_s.end());
    vector<int> rev_pi = computePrefixFunction(rev_s);

    int valid_splits = 0;

    // Iterate through all possible split points
    // Left part length = i, Right part length = n - i
    for (int i = 1; i < n; ++i) {
        int left_len = i;
        int right_len = n - i;

        // Check if Left part [0...i-1] is non-cyclic
        // pi[i-1] is the value for prefix of length i
        bool left_ok = !isPeriodic(left_len, pi[i - 1]);

        // Check if Right part [i...n-1] is non-cyclic
        // This corresponds to prefix of reversed string of length (n-i)
        bool right_ok = !isPeriodic(right_len, rev_pi[right_len - 1]);

        if (left_ok && right_ok) {
            valid_splits++;
        }
    }

    cout << "2\n" << valid_splits << endl;

    return 0;
}

Tags: KMP string-algorithms periodicity competitive-programming

Posted on Wed, 20 May 2026 06:01:06 +0000 by Spogliani