String Interview Patterns

Strings appear in roughly 25-30% of coding interviews. Unlike arrays or trees, string problems have a concentrated set of patterns – master these and most string problems become recognizable variants.

Sliding Window for Substring Problems

The sliding window pattern solves most “find a substring satisfying condition X” problems in O(n).

from collections import defaultdict

def min_window_substring(s: str, t: str) -> str:
    """LC 76 - Minimum Window Substring"""
    need = defaultdict(int)
    for c in t:
        need[c] += 1

    have, required = 0, len(need)
    window = defaultdict(int)
    left = 0
    result = ""
    min_len = float('inf')

    for right, c in enumerate(s):
        window[c] += 1
        if c in need and window[c] == need[c]:
            have += 1

        while have == required:
            # Update result
            if right - left + 1 < min_len:
                min_len = right - left + 1
                result = s[left:right+1]
            # Shrink window
            left_c = s[left]
            window[left_c] -= 1
            if left_c in need and window[left_c]  int:
    """LC 3 - Longest Substring Without Repeating Characters"""
    seen = {}
    left = 0
    best = 0
    for right, c in enumerate(s):
        if c in seen and seen[c] >= left:
            left = seen[c] + 1
        seen[c] = right
        best = max(best, right - left + 1)
    return best


def find_anagrams(s: str, p: str) -> list:
    """LC 438 - Find All Anagrams in a String"""
    if len(p) > len(s):
        return []

    p_count = [0] * 26
    w_count = [0] * 26
    for c in p:
        p_count[ord(c) - ord('a')] += 1

    result = []
    for i, c in enumerate(s):
        w_count[ord(c) - ord('a')] += 1
        if i >= len(p):
            out = s[i - len(p)]
            w_count[ord(out) - ord('a')] -= 1
        if w_count == p_count:
            result.append(i - len(p) + 1)
    return result

Template recognition: If the problem asks for a substring (contiguous) satisfying some count/frequency constraint, reach for sliding window first.

KMP (Knuth-Morris-Pratt) Pattern Matching

Brute-force string search is O(n*m). KMP achieves O(n+m) by precomputing a failure function that tells us how much to advance the pattern pointer on mismatch.

def build_failure_function(pattern: str) -> list:
    """
    failure[i] = length of longest proper prefix of pattern[0..i]
    that is also a suffix.
    """
    m = len(pattern)
    failure = [0] * m
    j = 0  # length of previous longest prefix-suffix

    for i in range(1, m):
        while j > 0 and pattern[i] != pattern[j]:
            j = failure[j - 1]  # fall back
        if pattern[i] == pattern[j]:
            j += 1
        failure[i] = j

    return failure


def kmp_search(text: str, pattern: str) -> list:
    """Returns all start indices where pattern occurs in text."""
    if not pattern:
        return list(range(len(text) + 1))

    failure = build_failure_function(pattern)
    matches = []
    j = 0  # number of chars matched in pattern

    for i, c in enumerate(text):
        while j > 0 and c != pattern[j]:
            j = failure[j - 1]
        if c == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = failure[j - 1]

    return matches


# Example
print(kmp_search("ababcababcabc", "ababc"))  # [0, 5, 7]

When KMP matters in interviews: Most interviewers accept Python’s str.find() or in operator. Knowing KMP signals depth. Bring it up proactively: “I’d use KMP here for O(n+m) – want me to implement the failure function?”

Rabin-Karp Rolling Hash

Rolling hash enables O(1) hash recomputation as you slide a window, making O(n) substring search possible.

def rabin_karp(text: str, pattern: str) -> list:
    """Find all occurrences of pattern in text using rolling hash."""
    n, m = len(text), len(pattern)
    if m > n:
        return []

    BASE = 31
    MOD = 10**9 + 7
    matches = []

    def char_val(c):
        return ord(c) - ord('a') + 1

    # Precompute BASE^m mod MOD
    base_m = pow(BASE, m, MOD)

    # Initial hash for pattern and first window
    p_hash = 0
    t_hash = 0
    for i in range(m):
        p_hash = (p_hash * BASE + char_val(pattern[i])) % MOD
        t_hash = (t_hash * BASE + char_val(text[i])) % MOD

    for i in range(n - m + 1):
        if t_hash == p_hash:
            # Verify to handle hash collisions
            if text[i:i+m] == pattern:
                matches.append(i)
        if i  str:
    """LC 1044 - binary search on length + Rabin-Karp"""
    def has_duplicate(length: int) -> str:
        BASE, MOD = 31, 10**9 + 7
        base_l = pow(BASE, length, MOD)
        h = 0
        for c in s[:length]:
            h = (h * BASE + ord(c)) % MOD
        seen = {h: [0]}
        for i in range(1, len(s) - length + 1):
            h = (h * BASE - ord(s[i-1]) * base_l + ord(s[i+length-1])) % MOD
            for start in seen.get(h, []):
                if s[start:start+length] == s[i:i+length]:
                    return s[i:i+length]
            seen.setdefault(h, []).append(i)
        return ""

    lo, hi, result = 1, len(s) - 1, ""
    while lo <= hi:
        mid = (lo + hi) // 2
        found = has_duplicate(mid)
        if found:
            result = found
            lo = mid + 1
        else:
            hi = mid - 1
    return result

Anagram and Character Frequency Patterns

from collections import Counter

def group_anagrams(strs: list) -> list:
    """LC 49 - Group Anagrams"""
    groups = defaultdict(list)
    for s in strs:
        key = tuple(sorted(s))  # or Counter(s) as frozenset
        groups[key].append(s)
    return list(groups.values())


def is_anagram(s: str, t: str) -> bool:
    """LC 242 - Valid Anagram - O(n) with Counter"""
    return Counter(s) == Counter(t)


def find_anagrams_freq(s: str, p: str) -> list:
    """LC 438 using Counter diff - cleaner but slightly slower"""
    result = []
    p_count = Counter(p)
    w_count = Counter(s[:len(p)])
    if w_count == p_count:
        result.append(0)

    for i in range(len(p), len(s)):
        # Add new character
        w_count[s[i]] += 1
        # Remove old character
        old = s[i - len(p)]
        w_count[old] -= 1
        if w_count[old] == 0:
            del w_count[old]
        if w_count == p_count:
            result.append(i - len(p) + 1)

    return result

Interview tip: Counter comparison is O(k) where k = unique characters. For fixed alphabets (26 letters) this is O(1) effectively. Mention this.

Palindrome Patterns

def longest_palindrome_expand(s: str) -> str:
    """LC 5 - expand around center, O(n^2) time O(1) space"""
    def expand(left, right):
        while left >= 0 and right  int:
    """LC 647 - count all palindromic substrings"""
    count = 0
    for i in range(len(s)):
        for left, right in [(i, i), (i, i+1)]:
            while left >= 0 and right < len(s) and s[left] == s[right]:
                count += 1
                left -= 1
                right += 1
    return count

Manacher’s algorithm solves LC 5 in O(n) time by reusing previously computed palindrome lengths. The key insight: if we are inside a larger palindrome, the mirror position gives us a lower bound on the current position’s palindrome length. In practice, expand-around-center O(n^2) is accepted in most interviews – bring up Manacher’s as an optimization if asked.

String Encoding and Decoding

def encode(strs: list) -> str:
    """LC 271 - encode list of strings for transmission"""
    # Format: length#string for each entry
    # "4#lint6#coding4#is3#fun" -> ["lint","coding","is","fun"]
    return ''.join(f"{len(s)}#{s}" for s in strs)

def decode(s: str) -> list:
    result = []
    i = 0
    while i  str:
    """Classic RLE: 'aaabbc' -> '3a2b1c'"""
    if not s:
        return ""
    result = []
    count = 1
    for i in range(1, len(s)):
        if s[i] == s[i-1]:
            count += 1
        else:
            result.append(f"{count}{s[i-1]}")
            count = 1
    result.append(f"{count}{s[-1]}")
    return ''.join(result)

Two-Pointer on Strings

def reverse_vowels(s: str) -> str:
    """LC 345"""
    vowels = set('aeiouAEIOU')
    chars = list(s)
    left, right = 0, len(chars) - 1
    while left < right:
        while left < right and chars[left] not in vowels:
            left += 1
        while left  str:
    """LC 151 - Reverse Words in a String"""
    return ' '.join(s.split()[::-1])


def is_palindrome(s: str) -> bool:
    """LC 125 - Valid Palindrome (alphanumeric only)"""
    cleaned = [c.lower() for c in s if c.isalnum()]
    return cleaned == cleaned[::-1]
    # Two-pointer version for O(1) space:
    # left, right = 0, len(s)-1
    # while left < right:
    #     while left<right and not s[left].isalnum(): left+=1
    #     while left<right and not s[right].isalnum(): right-=1
    #     if s[left].lower() != s[right].lower(): return False
    #     left+=1; right-=1
    # return True

Pattern Recognition Table

Problem Type Technique Key Examples
Find substring with constraint Sliding window LC 3, 76, 438
Find pattern in text (one-off) str.find() or KMP LC 28
Find repeated/duplicate substrings Rabin-Karp + binary search LC 1044, 718
Are two strings anagrams Counter / frequency array LC 49, 242, 438
Find longest palindromic substring Expand around center LC 5, 647
Serialize/deserialize strings Length-prefix encoding LC 271
In-place string reversal Two pointers LC 125, 151, 345
Edit distance / alignment DP (2D table) LC 72, 1143
Multiple pattern search Aho-Corasick (rare in interviews) LC 212 (use Trie instead)

Interview approach: State the pattern aloud before coding. “This looks like a sliding window with frequency tracking – similar to minimum window substring.” This signals pattern recognition and gives the interviewer confidence before you write a single line of code.

{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”What is the sliding window template for substring problems?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Use two pointers (left, right) and a window state (typically a dict/counter). Expand right pointer one step at a time, adding the new character to the window state. When the window violates a constraint, shrink from the left until valid again. Record the answer when the window is valid. Template: left=0, for right in range(len(s)): update_window(s[right]); while invalid: undo_window(s[left]); left+=1; update_answer(). For minimum window substring (LC 76): invalid = any required character count not satisfied. For longest without repeating (LC 3): invalid = any character appears more than once. For find all anagrams (LC 438): valid = window counter equals target counter.”}},{“@type”:”Question”,”name”:”How does KMP (Knuth-Morris-Pratt) pattern matching work?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”KMP avoids re-examining characters after a mismatch by precomputing a failure function. The failure function fail[i] = length of the longest proper prefix of pattern[0..i] that is also a suffix. When a mismatch occurs at pattern position j, instead of restarting at j=0, jump to j=fail[j-1]. This amortizes the work: each character in the text is processed at most twice. Build failure function: O(m) where m is pattern length. Search: O(n) where n is text length. Total: O(n+m) vs O(n*m) naive. Key use case: substring search in a large document, or finding a pattern in a circular string (concatenate string with itself).”}},{“@type”:”Question”,”name”:”What is the rolling hash technique and when do you use it?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Rolling hash computes a hash of a sliding window in O(1) by adding the new character and removing the old: hash = hash * base + ord(new_char) – ord(old_char) * base^window_len. Use modular arithmetic (mod a large prime) to prevent overflow. When the rolling hash matches the target, do a full string comparison to confirm (avoid false positives). Use cases: find repeated DNA sequences (LC 187), longest duplicate substring (LC 1044, binary search on length + rolling hash), Rabin-Karp string matching when KMP is overkill for average-case. Rolling hash is O(n) average vs O(n^2) naive substring comparison.”}},{“@type”:”Question”,”name”:”How do you check if two strings are anagrams and group anagrams efficiently?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Two strings are anagrams if they have the same character frequency. Check: Counter(s) == Counter(t) or sorted(s) == sorted(t). O(n) vs O(n log n). Group anagrams (LC 49): use sorted characters as the hash key – all anagrams of a word sort to the same key. group = defaultdict(list); for word in words: group[tuple(sorted(word))].append(word). Alternative key: tuple of 26 character counts – avoids sorting, always O(n) per word. For large alphabets or Unicode strings, use the Counter/tuple of character counts approach.”}},{“@type”:”Question”,”name”:”How do you find the longest palindromic substring efficiently?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Expand around center: for each index i, expand outward as long as characters match. Handle both odd-length (center at i) and even-length (center between i and i+1) palindromes. Track the longest palindrome found. O(n^2) time, O(1) space. Better: Manacher's algorithm achieves O(n) by reusing palindrome information from previous centers. Transform the string to "#a#b#a#" to unify odd/even cases. Maintain a rightmost palindrome boundary and its center; use the mirror palindrome to initialize the expansion. For interviews, expand-around-center is expected; mention Manacher's for bonus points. LC 5 (Longest Palindromic Substring), LC 647 (Count Palindromic Substrings).”}}]}

Meta coding interviews heavily test string patterns. See common questions for Meta interview: string and substring problems.

Apple coding rounds test string algorithms including KMP and sliding window. See patterns for Apple interview: string manipulation and matching.

Atlassian coding rounds include string manipulation and pattern matching. See patterns for Atlassian interview: string and text processing problems.

Scroll to Top