Naive String Search and Why to Avoid It
Naive string search: for each position i in text (0..n-m), compare pattern[0..m-1] with text[i..i+m-1]. Worst case O(n*m) – catastrophic for inputs like text = “aaaa…a” and pattern = “aaaa…ab”. All optimized algorithms improve on this by avoiding re-scanning characters we already know match. The key insight: when a mismatch occurs after k matched characters, we know the last k characters of text. We can use this knowledge to skip ahead instead of restarting from scratch.
KMP (Knuth-Morris-Pratt) – O(n + m)
def kmp_search(text: str, pattern: str) -> list[int]:
def build_lps(pat: str) -> list[int]:
# Longest Proper Prefix which is also Suffix
lps = [0] * len(pat)
length = 0
i = 1
while i < len(pat):
if pat[i] == pat[length]:
length += 1
lps[i] = length
i += 1
elif length != 0:
length = lps[length - 1] # fall back, don't increment i
else:
lps[i] = 0
i += 1
return lps
lps = build_lps(pattern)
matches = []
i = j = 0 # i = text index, j = pattern index
while i < len(text):
if text[i] == pattern[j]:
i += 1; j += 1
if j == len(pattern):
matches.append(i - j)
j = lps[j - 1] # continue searching
elif i < len(text) and text[i] != pattern[j]:
if j != 0:
j = lps[j - 1] # skip using LPS
else:
i += 1
return matches
# O(n + m): building LPS is O(m), search is O(n)
# Used for: LC 28 (Find Index of First Occurrence), repeated substring pattern
Rabin-Karp – Rolling Hash for Multiple Pattern Search
def rabin_karp(text: str, pattern: str) -> list[int]:
MOD = 10**9 + 7
BASE = 31 # prime base for hashing
n, m = len(text), len(pattern)
def char_val(c): return ord(c) - ord('a') + 1
# Precompute pattern hash and first window hash
pat_hash = 0
win_hash = 0
power = 1 # BASE^(m-1) mod MOD
for i in range(m):
pat_hash = (pat_hash * BASE + char_val(pattern[i])) % MOD
win_hash = (win_hash * BASE + char_val(text[i])) % MOD
if i < m - 1:
power = power * BASE % MOD
matches = []
for i in range(n - m + 1):
if win_hash == pat_hash and text[i:i+m] == pattern: # verify to avoid hash collision
matches.append(i)
if i < n - m:
# Roll the hash: remove leftmost char, add new right char
win_hash = (win_hash - char_val(text[i]) * power) % MOD
win_hash = (win_hash * BASE + char_val(text[i + m])) % MOD
win_hash = (win_hash + MOD) % MOD # ensure non-negative
return matches
# O(n + m) average. Advantage: find K patterns simultaneously in O(n + sum(m_i))
# LC 187: Repeated DNA Sequences - rolling hash on k-length windows
Z-Algorithm – Pattern Matching via Z-Array
def z_algorithm(s: str) -> list[int]:
n = len(s)
z = [0] * n
z[0] = n
l = r = 0
for i in range(1, n):
if i < r:
z[i] = min(r - i, z[i - l])
while i + z[i] r:
l, r = i, i + z[i]
return z
def z_search(text: str, pattern: str) -> list[int]:
combined = pattern + '$' + text # sentinel separates pattern from text
z = z_algorithm(combined)
m = len(pattern)
return [i - m - 1 for i in range(m + 1, len(combined)) if z[i] == m]
# O(n + m). Z[i] = length of longest substring starting at i that matches prefix of s
# Advantage: simpler to implement than KMP for pattern matching
When to Use Each Algorithm
Naive O(n*m): only for very short strings or guaranteed no repeated prefixes. KMP O(n+m): single pattern search, standard choice for most interview problems. Strong when pattern has repeated prefixes (e.g., “abcabc”). LPS table is the key insight – practice building it. Rabin-Karp O(n+m) average: multiple patterns simultaneously, or when the “rolling window hash” insight is the point (LC 187 Repeated DNA Sequences). Z-algorithm O(n+m): cleaner implementation than KMP for some, same complexity. Also useful for string periodicity problems. All three are O(n+m) but KMP is most commonly tested in interviews – practice the LPS construction until you can do it from memory.
Meta coding interviews include string search problems like KMP and rolling hash. Review patterns for Meta interview: string search and pattern matching problems.
LinkedIn coding rounds test string algorithms including KMP. See common patterns for LinkedIn interview: string algorithm problems.
Apple coding interviews test string search algorithms. Review KMP and rolling hash patterns for Apple interview: string manipulation and search problems.