Regression and Hypothesis Testing for Quant Interviews: Diagnostics, Pitfalls, and What Gets Asked
Regression is the most-used statistical tool in quant finance and one of the most-tested topics in quant-research interviews. From factor models at Two Sigma to alpha forecasting at D. E. Shaw to risk modeling at Goldman Sachs Strats, regression is everywhere. The same is true for hypothesis testing: every signal candidate is evaluated against null hypotheses, every backtest result is questioned through the lens of statistical significance.
This guide covers what gets asked: not just “what’s OLS” but the diagnostics, pitfalls, and applied judgment that separate strong candidates from those who can recite formulas. Most candidates have taken a regression course; few can defend regression decisions under pressure. This guide focuses on the latter.
OLS Refresher
Ordinary least squares regression: given dependent variable y and predictors X, find β that minimizes ||y – Xβ||². Closed-form solution: β = (X^T X)^(-1) X^T y.
Geometric interpretation: β projects y onto the column space of X. Predicted values ŷ = Xβ are the projection; residuals e = y – ŷ are orthogonal to every column of X.
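Both facts are easy to verify numerically. A minimal numpy sketch on synthetic data (the coefficients and noise scale are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])  # intercept + 3 predictors
beta_true = np.array([0.5, 1.0, -2.0, 0.0])
y = X @ beta_true + rng.normal(scale=0.5, size=n)

# beta = (X^T X)^(-1) X^T y -- solve the normal equations rather than
# forming an explicit inverse (cheaper and more numerically stable)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals are orthogonal to every column of X (up to floating-point error)
resid = y - X @ beta_hat
print(beta_hat)        # close to beta_true
print(X.T @ resid)     # effectively zeros
```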
Standard assumptions for inference (the Gauss-Markov conditions, plus normality):
- Linearity: E[y | X] = Xβ.
- Independence of errors.
- Homoskedasticity: Var(ε) = σ²I (constant variance).
- Normality of errors (for finite-sample inference).
- No perfect multicollinearity in X.
Most assumptions fail in financial data. Recognizing this and addressing it is the practical skill.
Regression Diagnostics
Residual analysis
Plot residuals vs fitted values. Patterns indicate problems (a test-based sketch follows this list):
- Funnel shape (residuals widen with fitted): heteroskedasticity. Use heteroskedasticity-robust (White / Huber-White) standard errors or transform y.
- Curvature: misspecified functional form. Add polynomial terms, transform variables, or use a non-linear model.
- Time-pattern in residuals: serial correlation. Use Newey-West standard errors or model the autocorrelation explicitly.
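Eyeballing the plot is the first pass; formal tests back it up. A minimal statsmodels sketch on synthetic heteroskedastic data (test choices are illustrative):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
# Synthetic errors whose variance grows with |x| -- a funnel shape
y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# Breusch-Pagan: a small p-value suggests heteroskedasticity
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print("Breusch-Pagan p-value:", lm_pvalue)

# Durbin-Watson: values near 2 indicate no first-order serial correlation
print("Durbin-Watson:", durbin_watson(res.resid))
```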
Multicollinearity
Highly correlated predictors inflate standard errors and make β unstable. Diagnostic: the variance inflation factor, VIF_j = 1 / (1 – R_j²), where R_j² comes from regressing predictor j on the remaining predictors. VIF > 5 is a warning; VIF > 10 is a problem. Solutions: drop redundant features, ridge regression, or PCA-based dimensionality reduction.
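A quick VIF check, sketched with statsmodels on synthetic near-collinear data:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(2)
n = 1000
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)  # nearly collinear with x1
x3 = rng.normal(size=n)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# VIF_j = 1 / (1 - R_j^2); huge values for x1 and x2, ~1 for x3
for j in range(1, X.shape[1]):  # skip the intercept column
    print(f"VIF for x{j}: {variance_inflation_factor(X, j):.1f}")
```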
Outliers and influence
A few extreme observations can dominate OLS estimates. Cook’s distance and leverage statistics identify influential points. In finance, this matters: a single crisis day can pin down β estimates that look stable in calm periods.
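A sketch of influence diagnostics on synthetic data with one planted extreme observation (statsmodels assumed available):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 250
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(scale=0.5, size=n)
x[-1], y[-1] = 8.0, -10.0  # one "crisis day": extreme in both x and y

res = sm.OLS(y, sm.add_constant(x)).fit()
infl = res.get_influence()
cooks_d = infl.cooks_distance[0]   # Cook's distance per observation
leverage = infl.hat_matrix_diag    # leverage: diagonal of the hat matrix

print("max Cook's D at obs:", int(np.argmax(cooks_d)), "value:", cooks_d.max())
print("max leverage at obs:", int(np.argmax(leverage)))
```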
R² and adjusted R²
R² = 1 – SS_residual / SS_total. Adjusted R² = 1 – (1 – R²)(n – 1)/(n – p – 1) penalizes R² for adding more predictors. In finance, R² is often very low (returns are mostly noise) — an R² of 0.05 can be a strong signal in some contexts. Don’t equate low R² with a bad model in financial regression; understand the context.
Common Regression Pitfalls in Finance
Look-ahead bias
Using future information to predict the past. Easy to introduce: features computed using full-sample statistics, restated financials, end-of-day data treated as available intraday. Always think about what information would have been available at each point in time.
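As an illustration on hypothetical price data, here is a leaky z-score feature next to a point-in-time version (pandas assumed; the expanding window is one of several valid choices):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 1000))))

# LEAKY: z-score uses full-sample mean/std -- at time t this peeks at the future
z_leaky = (prices - prices.mean()) / prices.std()

# POINT-IN-TIME: expanding-window stats, shifted so time t uses data through t-1
mu = prices.expanding().mean().shift(1)
sd = prices.expanding().std().shift(1)
z_pit = (prices - mu) / sd

print(z_leaky.tail(3))
print(z_pit.tail(3))
```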
Survivorship bias
Regressions on currently-existing stocks ignore stocks that delisted. Returns regressions can look better than reality because losers are dropped from the dataset.
Stationarity violations
Regressing one non-stationary series on another can produce “spurious regressions” with high R² and significant t-statistics that disappear when the series are differenced. Granger and Newbold documented the problem; Engle and Granger’s seminal work on cointegration emerged from it.
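A simulation makes the point: regress one random walk on another, independent, random walk, then repeat on the differenced series:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 1000
x = np.cumsum(rng.normal(size=n))  # random walk
y = np.cumsum(rng.normal(size=n))  # independent random walk

levels = sm.OLS(y, sm.add_constant(x)).fit()
diffs = sm.OLS(np.diff(y), sm.add_constant(np.diff(x))).fit()

# The levels regression tends to show high R^2 and large |t| despite
# no relationship; the effect vanishes after differencing.
print("levels: R2 =", round(levels.rsquared, 3), " t =", round(levels.tvalues[1], 2))
print("diffs:  R2 =", round(diffs.rsquared, 4), " t =", round(diffs.tvalues[1], 2))
```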
Heteroskedasticity
Financial returns have time-varying volatility (volatility clustering). OLS standard errors that assume homoskedasticity are wrong, and t-statistics are typically inflated. Use Newey-West or White standard errors.
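A sketch of the statsmodels robust-covariance options on synthetic data (the maxlags choice here is illustrative, not a recommendation):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
n = 1000
x = rng.normal(size=n)
y = 0.1 * x + rng.normal(size=n)

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                                         # classical SEs
white = sm.OLS(y, X).fit(cov_type="HC1")                         # White / robust
nw = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 5})   # Newey-West

for name, res in [("classical", ols), ("White", white), ("Newey-West", nw)]:
    print(name, "SE on x:", round(res.bse[1], 4))
```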
Sample selection
If your sample is selected based on outcomes (e.g., only successful funds), regression on this subset gives biased estimates. Heckman selection models address this in some contexts.
Overfitting
Adding more predictors always increases R² in-sample. Out-of-sample performance is what matters. Use cross-validation, holdout samples, or regularization (ridge, lasso).
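A walk-forward validation sketch using scikit-learn’s TimeSeriesSplit (synthetic data; the ridge penalty is arbitrary):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(7)
n, p = 2000, 20
X = rng.normal(size=(n, p))
y = 0.1 * X[:, 0] + rng.normal(size=n)  # one weak true signal, 19 noise features

# Each fold trains on the past and tests on the next block of observations
oos_r2 = []
for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
    model = Ridge(alpha=10.0).fit(X[train_idx], y[train_idx])
    oos_r2.append(model.score(X[test_idx], y[test_idx]))

# In-sample R^2 typically flatters the model relative to the OOS folds
print("in-sample R2:", Ridge(alpha=10.0).fit(X, y).score(X, y))
print("out-of-sample R2 by fold:", [round(r, 4) for r in oos_r2])
```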
Beyond OLS
Ridge regression
β = (X^T X + λI)^(-1) X^T y. Penalizes large coefficients; handles multicollinearity. Useful when X has many correlated features.
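A minimal numpy sketch of the closed form, assuming standardized features and no intercept (in practice the intercept is left unpenalized):

```python
import numpy as np

def ridge_closed_form(X, y, lam):
    """beta = (X^T X + lambda*I)^(-1) X^T y, via a linear solve."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 2))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)  # nearly collinear pair
y = X[:, 0] + rng.normal(size=200)

# With near-singular X^T X, the lambda=0 (OLS) coefficients are erratic;
# even a modest penalty stabilizes them.
print("lambda=0 (OLS):  ", ridge_closed_form(X, y, 0.0))
print("lambda=1 (ridge):", ridge_closed_form(X, y, 1.0))
```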
Lasso
L1 penalty: minimizes ||y – Xβ||² + λ ||β||_1. Performs feature selection (drives some coefficients to zero). Useful when you suspect many features are irrelevant.
Elastic net
Combines ridge and lasso penalties. Often used when you want some feature selection plus stability.
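A sketch of lasso’s feature selection next to elastic net, using scikit-learn on synthetic data (the penalty strengths are illustrative):

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(9)
n, p = 500, 10
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)  # only 2 of 10 features matter

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

# Both drive most noise coefficients exactly to zero
print("lasso nonzero coefs:", np.flatnonzero(lasso.coef_))
print("enet  nonzero coefs:", np.flatnonzero(enet.coef_))
```

In practice the penalty strength is chosen by cross-validation (LassoCV / ElasticNetCV) rather than set by hand.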
Generalized linear models (GLMs)
For non-Gaussian responses: logistic regression for binary outcomes, Poisson regression for counts. Less common in finance than in other fields but used for default modeling, event prediction.
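A minimal logistic-regression sketch for a default-style binary outcome (synthetic data; statsmodels assumed available):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
n = 2000
x = rng.normal(size=n)                         # e.g., a leverage-like risk feature
p_default = 1 / (1 + np.exp(-(-2.0 + 1.5 * x)))
defaulted = rng.binomial(1, p_default)         # binary outcome

res = sm.Logit(defaulted, sm.add_constant(x)).fit(disp=0)
print(res.params)  # should recover roughly (-2.0, 1.5)
```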
Quantile regression
Models conditional quantiles instead of conditional mean. Useful for tail risk modeling, where the median or upper quantiles matter more than the mean.
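A sketch with statsmodels’ QuantReg on synthetic data whose dispersion grows with x:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
n = 2000
x = rng.uniform(0, 2, size=n)
y = 1.0 * x + rng.normal(size=n) * (0.5 + x)  # spread widens with x

X = sm.add_constant(x)
median = sm.QuantReg(y, X).fit(q=0.50)
tail = sm.QuantReg(y, X).fit(q=0.95)

# The 95th-percentile slope exceeds the median slope because the upper
# tail widens with x -- structure that mean regression averages away.
print("median slope:", round(median.params[1], 2))
print("q95 slope:   ", round(tail.params[1], 2))
```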
Hypothesis Testing Topics
p-values and significance
The probability, computed under the null hypothesis, of observing a test statistic at least as extreme as the one observed. Common misunderstandings: the p-value is NOT the probability the null is true, and “p < 0.05” doesn’t mean “definitely real effect.”
Multiple testing
Testing 1000 strategies at p < 0.05 produces ~50 “significant” results by chance. Bonferroni correction (multiply p-values by number of tests) is conservative; Benjamini-Hochberg controls false discovery rate. In quant research, multiple testing is endemic; honest researchers correct or hold out test data.
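A simulation of the problem and the corrections, using scipy and statsmodels (all 1000 “strategies” here are pure noise, so every null is true):

```python
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(12)
n_tests, n_obs = 1000, 252

# 1000 strategies with zero true mean daily return
pvals = np.array([
    stats.ttest_1samp(rng.normal(size=n_obs), 0.0).pvalue
    for _ in range(n_tests)
])

print("raw 'discoveries' at p<0.05:", (pvals < 0.05).sum())              # ~50
print("after Bonferroni:", multipletests(pvals, method="bonferroni")[0].sum())
print("after Benjamini-Hochberg:", multipletests(pvals, method="fdr_bh")[0].sum())
```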
t-tests and F-tests
t-tests for individual coefficient significance; F-tests for joint significance of multiple coefficients. Standard regression output includes both.
Type I vs Type II error
Type I: rejecting a true null (false positive). Type II: failing to reject a false null (false negative). At a fixed significance level, the Type II rate is driven by sample size and effect size; understand which error matters more in your context.
Power analysis
Probability of detecting a true effect of given size. Important in study design: with too small a sample, you can’t reliably detect even substantial effects.
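A sketch using statsmodels’ power calculator for a two-sample t-test (the effect size is an illustrative Cohen’s d):

```python
from statsmodels.stats.power import TTestIndPower

# Observations per group needed to detect a small effect (d = 0.1)
# at alpha = 0.05 with 80% power; solve_power solves for the
# parameter left unspecified (here, the sample size).
n_required = TTestIndPower().solve_power(effect_size=0.1, alpha=0.05, power=0.8)
print(f"required n per group: {n_required:.0f}")  # roughly 1,570
```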
Common Interview Questions
Walk through OLS
“Derive the OLS estimator.” Set up the loss function ||y – Xβ||², differentiate with respect to β to get –2X^T(y – Xβ), set it to zero to obtain the normal equations X^T Xβ = X^T y, and solve: β = (X^T X)^(-1) X^T y. Bonus: discuss the geometric interpretation.
Diagnose a regression
“Here’s a regression output. What concerns do you raise?” Check for: heteroskedasticity (residuals vs fitted), multicollinearity (VIF), serial correlation (Durbin-Watson), influential points (Cook’s distance), out-of-sample validation.
Discuss multicollinearity
“What happens if two predictors are perfectly correlated?” X^T X is singular; β is not unique. Strong candidates discuss the fix: remove one, regularize, or use PCA.
Address heteroskedasticity
“Returns regressions show heteroskedastic residuals. How do you handle inference?” Use heteroskedasticity-robust standard errors: White when errors are merely heteroskedastic, Newey-West when they are also autocorrelated, as returns often are.
Multiple testing in backtests
“You backtest 1000 strategies and find 50 with p < 0.05. How do you interpret this?” Roughly what you’d expect by chance. Bonferroni or FDR corrections. Hold-out validation. Strong candidates probe how the strategies were generated and whether discovery and evaluation use the same dataset.
Distinguish causation from correlation
“Your regression shows significant coefficients. Does this prove causation?” No. Discuss confounders, reverse causality, selection bias, randomized experiments vs observational data. Strong candidates know that even significant regression coefficients don’t establish causation without further argument.
Frequently Asked Questions
How important are regression diagnostics vs basic regression knowledge?
Diagnostics matter more in interviews than basic OLS derivation. Most candidates can compute β = (X^T X)^(-1) X^T y; few can articulate what to check after running the regression. Strong candidates routinely talk about residual plots, heteroskedasticity, multicollinearity, and out-of-sample validation. The interviewer is looking for applied judgment, not formula recall.
What’s the difference between regression and ML in quant interviews?
Often a difference of framing. Linear regression with regularization (ridge, lasso) is ML; logistic regression is ML; GLMs are ML. The distinction is more cultural: “ML” implies tree-based methods, neural networks, and validation-heavy workflows; “regression” implies linear models and inference-focused workflows. Quant interviews increasingly mix both. Be prepared to discuss random forests, gradient boosting, and neural networks alongside classical regression.
How do quant interviewers feel about classical statistics vs Bayesian methods?
Most quant interviews use frequentist framing (p-values, confidence intervals, OLS). Some firms (especially those with academic-research-heavy cultures) use Bayesian framing more. Knowing both helps. If asked about Bayesian methods in finance, common applications: Bayesian shrinkage for covariance estimation, hierarchical models for cross-sectional pooling, Bayesian model averaging. Don’t push Bayesian unless the interviewer’s framing invites it.
What’s the most common regression mistake in quant work?
Treating in-sample R² as evidence the model works. Overfitting is endemic in quant research because we have many candidate features and noisy targets. The fix is rigorous out-of-sample validation: train on one period, test on a held-out period, repeat across rolling windows. Candidates who default to in-sample R² and ignore validation are signaling lack of practical experience.
Should I memorize regression formulas or focus on intuition?
Both. Memorize the OLS formula (β = (X^T X)^(-1) X^T y), the geometric interpretation (projection onto column space of X), and the basic standard error formulas. But the interview value comes from applied intuition: when does OLS fail, what diagnostics matter, how do you address violations. Practice answering “the residuals look like X — what’s wrong and what would you do?” Practical fluency beats memorization.
See also: Linear Algebra for Quant Interviews • Time-Series Analysis for Quant Interviews • Expected Value and Fair-Game Reasoning