What are the main string problem patterns in coding interviews?

Five patterns cover 90% of string problems: (1) Character frequency -- anagrams, group anagrams, minimum window substring. Core technique: frequency hash map or int[26] array. Sliding window for substring variants. (2) Palindrome -- longest palindromic substring (expand around center O(N^2)), palindrome partitioning (backtracking + DP pre-computation), longest palindromic subsequence (LCS of s and reverse(s)). (3) Substring search -- KMP/Rabin-Karp for pattern matching, sliding window for substring with properties. (4) Parsing -- string to integer (state machine), decode string (stack for nesting), simplify path (stack). (5) String DP -- edit distance, regex matching, word break, distinct subsequences. All use dp[i][j] with transitions based on character matching. Recognition: composition/order -> frequency map. Reads same forward/backward -> palindrome technique. Find pattern in text -> KMP. Nested brackets -> stack. Minimum operations -> DP.

How do you solve Edit Distance with dynamic programming?

Edit Distance: minimum insert/delete/replace operations to transform s1 into s2. State: dp[i][j] = edit distance between s1[0..i-1] and s2[0..j-1]. Base: dp[0][j] = j (insert j chars), dp[i][0] = i (delete i chars). Transition: if s1[i-1] == s2[j-1]: dp[i][j] = dp[i-1][j-1] (characters match, no operation). Else: dp[i][j] = 1 + min(dp[i-1][j] (delete from s1), dp[i][j-1] (insert into s1), dp[i-1][j-1] (replace)). Time: O(N*M). Space: O(N*M), optimizable to O(M) with rolling arrays. This pattern extends to: wildcard matching (? matches any char, * matches any sequence), regex matching (. matches any, * matches zero or more of preceding), and longest common subsequence (similar DP without the replace operation).

Coding Interview: String Problems Comprehensive Guide — Anagram, Palindrome, Substring, Encoding, Parsing

⏱ 6 min read

String problems appear in 25-30% of coding interviews, making them the second most common category after arrays. String manipulation requires careful index management, understanding of character encodings, and recognizing which technique (hash map, two pointers, DP, trie) applies. This guide provides a comprehensive overview of string problem patterns with the key insight for each.

Anagram and Character Frequency Problems

An anagram uses the same characters in a different order. The core technique: character frequency counting with a hash map or fixed-size array (int[26] for lowercase English). Valid Anagram: compare frequency arrays of both strings. O(N). Group Anagrams: sort each string (or compute a frequency key) as the hash map key. Group strings with the same key. O(N * K log K) with sorting, O(N * K) with frequency keys. Find All Anagrams in a String: sliding window of size len(p) over s. Maintain a frequency count for the window. When the window frequency matches p frequency, record the position. O(N). Minimum Window Substring: variable-size sliding window. Expand right to include characters, shrink left when all characters of t are covered. Track minimum window. O(N). Key insight: whenever the problem involves character composition (same characters, different order), build a frequency map. The sliding window variant handles substring/subarray versions efficiently.

Palindrome Problems

A palindrome reads the same forward and backward. Core techniques: (1) Two-pointer check: left and right pointers move inward, comparing characters. O(N). (2) Expand around center: for each character (and each pair of adjacent characters), expand outward while characters match. Finds all palindromic substrings. O(N^2). Longest Palindromic Substring: expand around each of the 2N-1 centers (N single characters + N-1 pairs). Track the longest. O(N^2). Manacher algorithm achieves O(N) but is rarely asked in interviews. Palindrome Partitioning: find all ways to partition s into palindromic substrings. Backtracking: at each position, try all substrings starting here. If the substring is a palindrome, recurse on the rest. Pre-compute isPalindrome[i][j] with DP for O(1) palindrome checks. Longest Palindromic Subsequence: DP on the string and its reverse (this is LCS of s and reverse(s)). O(N^2). Valid Palindrome II: can you make a palindrome by removing at most one character? Two-pointer. On mismatch: try skipping left or skipping right. O(N). Key insight: “substring” palindromes use expand-around-center or DP. “subsequence” palindromes use DP (LCS variant).

Substring Search and Matching

Finding patterns within strings: (1) Brute force: check every position. O(N*M). Sufficient for short patterns in interviews unless asked to optimize. (2) KMP: O(N+M) with failure function preprocessing. Handles repeated pattern prefixes efficiently. (3) Rabin-Karp: rolling hash for O(N+M) expected. Better for multi-pattern matching. (4) Built-in: most languages have efficient substring search (Python “in” operator, Java indexOf). Use in interviews unless the problem specifically asks for implementation. Repeated Substring Pattern: check if s is composed of a repeated unit. Trick: if s + s (with first and last character removed) contains s, then s is periodic. Or use KMP: the shortest period = N – failure[N-1]. If N is divisible by the period, the string is periodic. Longest Repeating Character Replacement: sliding window. Expand right, track the count of the most frequent character in the window. If window_size – max_freq > K (more than K characters need replacement), shrink left. Track maximum window. O(N). Key insight: “find a pattern in text” = KMP/Rabin-Karp. “find a substring with a property” = sliding window.

String Conversion and Parsing

String to Integer (atoi): handle: leading whitespace, optional +/- sign, digits, and overflow (clamp to INT_MAX/INT_MIN). Process character by character with state tracking. Integer to English Words: break into groups of three digits (billions, millions, thousands, ones). Each group follows the same pattern: hundreds + tens/ones. Map digits to words with lookup tables. Roman to Integer / Integer to Roman: mapping table of symbols to values. Roman to integer: scan right to left, add value unless the current value is less than the previous (then subtract — handles IV, IX, XL, etc.). Simplify Path: Unix path simplification. Split by “/”. Stack: push directory names, pop on “..”, ignore “.” and empty strings. Join with “/”. Decode String: “3[a2[c]]” -> “accaccacc”. Stack of (string, count) pairs. Push on “[“, pop and repeat on “]”. (Covered in detail in our Stack Problems guide.) Key insight: parsing problems usually involve a state machine (process character by character with current state tracking) or a stack (for nested structures like brackets, paths, or expressions).

String DP Problems

Many string problems have optimal substructure for DP: Edit Distance: minimum operations (insert, delete, replace) to transform s1 into s2. dp[i][j] = edit distance between s1[0..i-1] and s2[0..j-1]. If characters match: dp[i][j] = dp[i-1][j-1]. Else: min(dp[i-1][j]+1, dp[i][j-1]+1, dp[i-1][j-1]+1). O(N*M). Wildcard Matching: does pattern with ? and * match the string? dp[i][j] = does s[0..i-1] match p[0..j-1]. ? matches any single character. * matches any sequence. Regular Expression Matching: dp with . (any char) and * (zero or more of preceding). More complex transitions but same DP structure. Word Break: can the string be segmented into dictionary words? dp[i] = can s[0..i-1] be segmented? For each j < i: if dp[j] and s[j..i-1] is in the dictionary, then dp[i] = true. O(N^2 * M) with hash set dictionary lookup. Distinct Subsequences: count distinct subsequences of s that equal t. dp[i][j] = count for s[0..i-1] and t[0..j-1]. If s[i-1] == t[j-1]: dp[i][j] = dp[i-1][j-1] + dp[i-1][j]. Else: dp[i][j] = dp[i-1][j]. Key insight: string DP usually has two string dimensions dp[i][j] with transitions based on character matching.