String Manipulation Algorithms: From Basics to KMP Pattern Matching

String Reversal

String reversal serves as a fundamental operation in string manipulation. While most programming languages provide built-in reverse functions, understanding the underlying mechanism is crucial for technical interviews.

The approach uses two pointers starting from opposite ends of the string. These pointers move toward the center, swapping elements as they progress. For instance, reversing "hello" involves swapping positions 0 and 4, then 1 and 3, continuing until the pointers meet.

void reverseString(vector<char>& s) {
    for (int left = 0, right = s.size() - 1; left < s.size() / 2; left++, right--) {
        char temp = s[left];
        s[left] = s[right];
        s[right] = temp;
    }
}

Using swap is acceptable when the function serves as a minor implementation detail rather than the core algorithmic challenge. Swapping can be achieved through a temporary variable or bitwise XOR operations for in-place exchange without extra memory.

Reverse String II

This problem requires reversing every 2k characters where only the first k characters of each block are reversed. The key insight is to increment the index by 2k in each iteration rather than handling logic manually.

char* reverseStr(char* s, int k) {
    int len = strlen(s);
    for (int i = 0; i < len; i += (2 * k)) {
        k = (i + k > len) ? (len - i) : k;
        int left = i;
        int right = i + k - 1;
        while (left < right) {
            char temp = s[left];
            s[left++] = s[right];
            s[right--] = temp;
        }
    }
    return s;
}

Replacing Digits with "number"

This problem demonstrates the two-pointer technique for in-place modification without auxiliary arrays. The algorithm first calculates the expanded size, then fills from the end to avoid repeated shifting.

#include <stdio.h>
#include <string.h>

int main() {
    char s[100];
    while (scanf("%s", s) != EOF) {
        int digitCount = 0;
        int oldLen = strlen(s);
        for (int i = 0; i < oldLen; i++) {
            if (s[i] >= '0' && s[i] <= '9') {
                digitCount++;
            }
        }
        int newLen = oldLen + digitCount * 5;
        s[newLen] = '\0';
        for (int i = newLen - 1, j = oldLen - 1; j < i; i--, j--) {
            if (s[j] > '9' || s[j] < '0') {
                s[i] = s[j];
            } else {
                s[i] = 'r';
                s[i - 1] = 'e';
                s[i - 2] = 'b';
                s[i - 3] = 'm';
                s[i - 4] = 'u';
                s[i - 5] = 'n';
                i -= 5;
            }
        }
        printf("%s\n", s);
    }
    return 0;
}

Filling from the back prevents O(n²) complexity that would result from front-to-back operations requiring element shifting.

Reversing Words in a String

This problem requires reversing words while handling multiple spaces. A three-step approach proves effective: remove extra spaces, reverse the entire string, then reverse each word individually.

#include <stdio.h>
#include <string.h>

void reverseRange(char* s, int start, int end) {
    while (start < end) {
        char temp = s[start];
        s[start] = s[end];
        s[end] = temp;
        start++;
        end--;
    }
}

void processString(char* s) {
    int slow = 0;
    int fast = 0;
    int len = strlen(s);

    while (s[fast] == ' ') fast++;

    while (fast < len) {
        if (s[fast] != ' ') {
            s[slow++] = s[fast++];
        } else if (s[fast - 1] != ' ') {
            s[slow++] = ' ';
            fast++;
        } else {
            fast++;
        }
    }

    if (s[slow - 1] == ' ') slow--;
    s[slow] = '\0';

    reverseRange(s, 0, slow - 1);

    int start = 0;
    for (int end = 0; end <= slow; end++) {
        if (s[end] == ' ' || s[end] == '\0') {
            reverseRange(s, start, end - 1);
            start = end + 1;
        }
    }
}

int main() {
    char str[] = "  Hello world!  ";
    printf("Original: %s\n", str);
    processString(str);
    printf("Modified: %s\n", str);
    return 0;
}

The fast-slow pointer technique removes redundant spaces in O(n) time, unlike naive approaches using erase operations which create O(n²) complexity.

Right Rotating a String

Right rotation moves the last k characters to the front. The solution uses three reversals: reverse the entire string, reverse the first k characters, then reverse the remaining portion.

#include <stdio.h>
#include <string.h>

void swap(char* a, char* b) {
    char temp = *a;
    *a = *b;
    *b = temp;
}

void reverseSection(char* s, int start, int end) {
    while (start < end) {
        swap(&s[start], &s[end]);
        start++;
        end--;
    }
}

int main() {
    int n;
    char s[100];
    scanf("%d", &n);
    scanf("%s", s);
    int len = strlen(s);
    reverseSection(s, 0, len - 1);
    reverseSection(s, 0, n - 1);
    reverseSection(s, n, len - 1);
    printf("%s\n", s);
    return 0;
}

This technique, known as triple reversal, elegantly achieves rotation without additional memory allocation.

Implementing strStr() with KMP Algorithm

The KMP algorithm, named after Knuth, Morris, and Pratt, dramatically improves string matching efficiency. Its core innovation lies in leveraging previously matched information to avoid redundant comparisons.

Prefix Table

The prefix table (or next array) records the longest proper prefix that is also a suffix for each position. This enables jumping directly to the next potential match position when a mismatch occurs.

For example, with pattern "aabaaf" and text "aabaabaafa":

  • Position 0: longest prefix-suffix length is 0
  • Position 1: "aa" has length 1
  • Position 2: "aab" has length 0
  • Position 3: "aaba" has length 1
  • Position 4: "aabaa" has length 2
  • Position 5: "aabaaf" has length 0

Building the Next Array

The construction uses two pointers: j tracks the prefix length, while i scans the pattern.

void buildNext(int* next, const char* pattern) {
    int j = -1;
    next[0] = j;
    int len = strlen(pattern);

    for (int i = 1; i < len; i++) {
        while (j >= 0 && pattern[i] != pattern[j + 1]) {
            j = next[j];
        }
        if (pattern[i] == pattern[j + 1]) {
            j++;
        }
        next[i] = j;
    }
}

Matching Process

int strStr(const char* haystack, const char* needle) {
    int needleLen = strlen(needle);
    if (needleLen == 0) return 0;
    int next[needleLen];
    buildNext(next, needle);
    int j = -1;
    int haystackLen = strlen(haystack);

    for (int i = 0; i < haystackLen; i++) {
        while (j >= 0 && haystack[i] != needle[j + 1]) {
            j = next[j];
        }
        if (haystack[i] == needle[j + 1]) {
            j++;
        }
        if (j == needleLen - 1) {
            return i - needleLen + 1;
        }
    }
    return -1;
}

int main() {
    const char* haystack = "hello";
    const char* needle = "ll";
    printf("Match found at index: %d\n", strStr(haystack, needle));
    return 0;
}

KMP achieves O(n + m) time complexity compared to O(n × m) for brute force, making it essential for efficient pattern matching.

Repeated Substring Pattern

Method 1: Concatenation Check

If a string consists of repeated substrings, concatenating the string with itself should contain the original string as a substring (excluding the first and last characters).

#include <stdbool.h>
#include <string.h>

bool repeatedSubstringPattern(const char* s) {
    int len = strlen(s);
    char* combined = (char*)malloc(sizeof(char) * (2 * len + 1));
    strcpy(combined, s);
    strcat(combined, s);
    combined++;
    combined[2 * len - 1] = '\0';
    return strstr(combined, s) != NULL;
}

Method 2: KMP-Based Solution

Using the prefix table, a string consists of repeated substrings if:

  • next[len - 1] ≠ -1 (has common prefix-suffix)
  • len % (len - (next[len - 1] + 1)) == 0
#include <stdbool.h>
#include <string.h>

void buildNext(int* next, const char* s) {
    next[0] = -1;
    int j = -1;
    int len = strlen(s);
    for (int i = 1; i < len; i++) {
        while (j >= 0 && s[i] != s[j + 1]) {
            j = next[j];
        }
        if (s[i] == s[j + 1]) {
            j++;
        }
        next[i] = j;
    }
}

bool repeatedSubstringPattern(const char* s) {
    int len = strlen(s);
    int next[len];
    buildNext(next, s);
    int cycleLen = len - (next[len - 1] + 1);
    return (next[len - 1] != -1 && len % cycleLen == 0);
}

int main() {
    const char* s = "abcabc";
    printf("%s\n", repeatedSubstringPattern(s) ? "true" : "false");
    return 0;
}

Summary

String manipulation problems often appear simple but require careful implementation. Key techniques include:

  • Two-pointer approach: Essential for reversal, replacement, and space removal
  • Triple reversal: Elegant solution for rtoation problems
  • KMP algorithm: Enables efficient pattern matching and repeated substring detection

Understanding library function implementations and their time complexities prevents common pitfalls during technical interviews.

Tags: algorithms strings two-pointer KMP pattern-matching

Posted on Mon, 01 Jun 2026 16:25:58 +0000 by tecdesign