String Fundamentals for Interviews
Strings are immutable in Python and Java — concatenating in a loop is O(n^2). Use a list and join() in Python (O(n) total), or StringBuilder in Java. Key operations: hashing (Counter or array of 26 for lowercase English), two pointers (palindrome check, reversal), sliding window (longest substring problems), and sorting (anagram detection). Common mistake: not handling Unicode vs. ASCII — clarify with interviewer. Most interview problems assume lowercase English letters (26 characters) — use an int[26] instead of a hashmap for O(1) access.
Valid Anagram and Group Anagrams
from collections import Counter
def is_anagram(s: str, t: str) -> bool:
return Counter(s) == Counter(t)
# O(n) time, O(1) space (bounded alphabet)
def group_anagrams(strs: list[str]) -> list[list[str]]:
groups = {}
for s in strs:
key = tuple(sorted(s)) # or Counter(s) as frozenset
groups.setdefault(key, []).append(s)
return list(groups.values())
# O(n * L log L) where L = average string length
Longest Palindromic Substring (LC 5) — Expand Around Center
def longest_palindrome(s: str) -> str:
def expand(left: int, right: int) -> str:
while left >= 0 and right len(result): result = odd
if len(even) > len(result): result = even
return result
# O(n^2) time, O(1) space. Manacher's algorithm is O(n) but overkill for interviews.
Palindrome Partitioning (LC 131)
def partition(s: str) -> list[list[str]]:
result = []
def backtrack(start: int, path: list[str]):
if start == len(s):
result.append(path[:])
return
for end in range(start + 1, len(s) + 1):
sub = s[start:end]
if sub == sub[::-1]: # is palindrome
path.append(sub)
backtrack(end, path)
path.pop()
backtrack(0, [])
return result
Encode and Decode Strings (LC 271)
def encode(strs: list[str]) -> str:
# Format: length#string for each string
return "".join(f"{len(s)}#{s}" for s in strs)
def decode(s: str) -> list[str]:
result = []
i = 0
while i < len(s):
j = s.index('#', i)
length = int(s[i:j])
result.append(s[j+1:j+1+length])
i = j + 1 + length
return result
String Compression and Roman Numerals
# String compression: "aabcccdddd" -> "a2bc3d4"
def compress(chars: list[str]) -> int:
write = 0
i = 0
while i < len(chars):
char = chars[i]
count = 0
while i 1:
for c in str(count):
chars[write] = c; write += 1
return write
# Integer to Roman (LC 12)
def int_to_roman(num: int) -> str:
vals = [1000,900,500,400,100,90,50,40,10,9,5,4,1]
syms = ["M","CM","D","CD","C","XC","L","XL","X","IX","V","IV","I"]
result = ""
for v, s in zip(vals, syms):
while num >= v:
result += s; num -= v
return result
{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”Why is string concatenation in a loop O(n^2) in Python and Java?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”In Python and Java, strings are immutable. Concatenating string A (length a) with string B (length b) creates a new string of length a+b, copying all characters from both strings. In a loop of n iterations concatenating strings of average length L: total copies = L + 2L + 3L + … + nL = O(n^2 * L). For n = 10,000 and L = 10: ~500 million character copies. Solution in Python: build a list of strings and join at the end — join is O(total characters). Solution in Java: use StringBuilder.append() — amortized O(1) per append, O(n) total. In C++: strings are mutable, += with a string literal may still copy, but std::ostringstream or reserve() + += is efficient.”}},{“@type”:”Question”,”name”:”What is the expand-around-center technique for palindromes and when does it fail?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Expand-around-center: for each position i in the string, try to expand a palindrome centered at i (odd length: i, i) and (i, i+1) (even length). Expand while characters match on both sides. O(n^2) time, O(1) space. Better than O(n^2) DP (which uses O(n^2) space). It does NOT find all palindromic substrings — it finds the longest one. Limitation: it works only for contiguous substrings. For palindromic subsequences (characters not necessarily adjacent), use DP. Manacher's algorithm finds all palindromic substrings in O(n) by reusing previously computed expansion results, but is complex to implement correctly in an interview — expand-around-center is the preferred interview solution with a note about Manacher's for O(n).”}},{“@type”:”Question”,”name”:”How do you detect if two strings are anagrams efficiently?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Method 1 (sorting): sort both strings and compare. O(n log n) time, O(n) space. Simple but not optimal. Method 2 (character frequency array): create an int[26] array (for lowercase English). Increment for each char in s, decrement for each char in t. If all zeros: anagrams. O(n) time, O(1) space (fixed alphabet size). Method 3 (prime product hash): assign each letter a prime number; multiply all primes for s and t. If products equal: anagrams. O(n) time, O(1) space, no array needed, but risk of integer overflow for long strings. For Group Anagrams: use the sorted string as the hashmap key — all anagrams of the same word sort to the same key. Alternative: tuple(sorted(Counter(s).items())) as key for Unicode support.”}},{“@type”:”Question”,”name”:”How do you encode a list of strings to a single string and decode it reliably?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”The challenge: a naive delimiter (like comma) fails if strings contain the delimiter. Robust encoding: length-prefixed format. For each string s: encode as len(s)#s. The # is the separator between the length and the string content. Decoding: scan for #, read the integer before it (the length), then read exactly that many characters. The # in the content is safe because we read a fixed number of characters determined by the length, not by scanning for another #. This handles any string content including #, commas, newlines, null bytes, and Unicode. Edge cases: empty string encodes as 0#. Empty list encodes as empty string. This is the same technique used in Redis RESP (Redis Serialization Protocol) and many binary protocols.”}},{“@type”:”Question”,”name”:”How does the KMP (Knuth-Morris-Pratt) algorithm achieve O(n+m) substring search?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Naive substring search: for each position in text (length n), compare pattern (length m) — O(nm). KMP precomputes a "failure function" (or prefix table) for the pattern: lps[i] = length of the longest proper prefix of pattern[0..i] that is also a suffix. Build in O(m). During search: when a mismatch occurs at pattern position j after matching j characters, instead of restarting from 0, shift to lps[j-1] (reuse the longest matched prefix-suffix). The text pointer never goes backward. Total comparisons: O(n). Use cases in interviews: check if s2 contains s1 as a substring (is s1 a rotation of s2: check if s1 is a substring of s2+s2). For most interview problems, Python's in operator (which uses optimized Boyer-Moore-Horspool) or str.find() is sufficient — mention KMP when asked about complexity.”}}]}
See also: Meta Interview Prep
See also: Atlassian Interview Prep
See also: LinkedIn Interview Prep