AI ethics and fairness questions appear in interviews at every major tech company — and not just for policy roles. Engineers are expected to identify and mitigate bias, understand fairness trade-offs, and discuss the limits of technical solutions to social problems. This is especially common at Google, Meta, Microsoft, and AI-focused companies.
What Interviewers Are Testing
- Can you identify bias sources in data and models?
- Do you understand that different fairness definitions are mathematically incompatible in general?
- Can you describe technical interventions and their trade-offs?
- Do you understand where technical solutions end and policy decisions begin?
Types of Bias in ML
Historical Bias
The world has historical inequities that appear in data. A resume screening model trained on historical hiring decisions will encode historical discrimination — e.g., lower hiring rates for women in technical roles. Removing protected attributes doesn’t fix this because proxies remain (graduation year, name, university).
Measurement Bias
The feature being measured is a worse proxy for the true construct for some groups. Criminal recidivism prediction: arrest records are a proxy for crime, but arrest rates differ by race due to policing patterns. The model predicts arrests, not actual crime.
Aggregation Bias
A model trained on aggregate data performs worse on subgroups. A diabetes prediction model trained on predominantly white patients may have higher error rates for Black patients due to physiological differences in relevant biomarkers.
Representation Bias
Underrepresentation of groups in training data. Facial recognition systems trained predominantly on lighter-skinned faces have significantly higher error rates on darker-skinned faces (documented in the Gender Shades study: gender classification error rates up to 34.7% for darker-skinned women versus 0.8% for lighter-skinned men).
Fairness Definitions and the Impossibility Theorem
Q: Explain the different definitions of algorithmic fairness.
Demographic Parity (Statistical Parity):
Positive prediction rates are equal across groups.
P(Ŷ=1 | A=0) = P(Ŷ=1 | A=1)
Equalized Odds:
True positive rates AND false positive rates are equal across groups.
P(Ŷ=1 | Y=1, A=0) = P(Ŷ=1 | Y=1, A=1) [TPR]
P(Ŷ=1 | Y=0, A=0) = P(Ŷ=1 | Y=0, A=1) [FPR]
Equal Opportunity:
Only true positive rates are equal (relaxation of equalized odds).
P(Ŷ=1 | Y=1, A=0) = P(Ŷ=1 | Y=1, A=1)
Calibration:
Among all predicted at 70% probability, 70% actually belong to the positive class — for all groups.
P(Y=1 | Ŷ=p, A=0) = P(Y=1 | Ŷ=p, A=1) = p
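Each definition above compares a small set of group-conditional rates, so the first three can be audited with a few lines of NumPy. A minimal sketch (the helper name and toy arrays are illustrative):

```python
import numpy as np

def fairness_report(y_true, y_pred, group):
    """Per-group rates behind the common fairness definitions."""
    out = {}
    for g in np.unique(group):
        m = group == g
        out[g] = {
            "pos_rate": y_pred[m].mean(),                 # demographic parity
            "tpr": y_pred[m][y_true[m] == 1].mean(),      # equal opportunity
            "fpr": y_pred[m][y_true[m] == 0].mean(),      # + TPR = equalized odds
        }
    return out

# Toy example: identical labels per group, different predictions
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
report = fairness_report(y_true, y_pred, group)
# Group 1 has both a higher positive rate (0.75 vs 0.25) and higher FPR
```

Demographic parity compares `pos_rate` across groups; equal opportunity compares `tpr`; equalized odds compares both `tpr` and `fpr`.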
Q: Why can’t you satisfy all fairness definitions simultaneously?
The Chouldechova (2017) impossibility theorem proves that calibration, equal false positive rates, and equal false negative rates cannot all hold simultaneously when base rates differ across groups (except in the degenerate case of a perfect predictor) — and base rates almost always differ in practice.
The COMPAS recidivism tool case: ProPublica found COMPAS violated equalized odds (Black defendants had higher false positive rates). Northpointe responded that COMPAS was calibrated. Both were correct — because base rates differ, you cannot have both.
The policy implication: Which fairness criterion to optimize is not a technical decision — it reflects a value judgment about which type of error is worse. Engineers must surface this trade-off, not resolve it unilaterally.
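The impossibility is easy to reproduce numerically: construct two groups whose scores are calibrated by construction but whose base rates differ, and a single threshold yields unequal error rates. A sketch (the score buckets and counts are made up for illustration):

```python
def error_rates(counts, threshold=0.5):
    """counts: {score: n_people}. Scores are calibrated by construction,
    so a fraction `score` of each bucket is truly positive."""
    fp = fn = pos = neg = 0.0
    for s, n in counts.items():
        n_pos = n * s          # expected positives in this score bucket
        n_neg = n - n_pos
        pos += n_pos
        neg += n_neg
        if s >= threshold:
            fp += n_neg        # predicted positive, actually negative
        else:
            fn += n_pos        # predicted negative, actually positive
    return fp / neg, fn / pos  # (FPR, FNR)

group_a = {0.2: 50, 0.8: 50}   # base rate 0.50
group_b = {0.2: 80, 0.8: 20}   # base rate 0.32
fpr_a, fnr_a = error_rates(group_a)   # (0.2, 0.2)
fpr_b, fnr_b = error_rates(group_b)   # (~0.059, 0.5)
```

Both groups are perfectly calibrated and share one threshold, yet the lower-base-rate group has a much lower FPR and a much higher FNR — exactly the COMPAS disagreement in miniature.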
Technical Interventions
```python
import numpy as np
from sklearn.metrics import roc_curve

# ── Pre-processing: Reweighting ───────────────────────────────────────────

def compute_reweighting_factors(y_true, sensitive_attr):
    """
    Reweight samples so that each (group, label) combination has equal
    representation in the effective training distribution.
    """
    weights = np.ones(len(y_true), dtype=float)
    groups = np.unique(sensitive_attr)
    labels = np.unique(y_true)
    for group in groups:
        for label in labels:
            mask = (sensitive_attr == group) & (y_true == label)
            if mask.sum() == 0:
                continue
            # Expected proportion if perfectly balanced
            expected = 1.0 / (len(groups) * len(labels))
            # Actual proportion
            actual = mask.sum() / len(y_true)
            weights[mask] = expected / actual
    return weights

# ── Post-processing: Threshold Adjustment ─────────────────────────────────

def equalize_opportunity(y_scores, y_true, sensitive_attr, target_tpr=0.8):
    """
    Find group-specific thresholds that achieve the same target TPR in
    every group (Equal Opportunity).
    """
    thresholds = {}
    for group in np.unique(sensitive_attr):
        mask = sensitive_attr == group
        # ROC curve for this group alone
        fpr, tpr, thresh = roc_curve(y_true[mask], y_scores[mask])
        # Threshold whose TPR is closest to the target
        idx = np.argmin(np.abs(tpr - target_tpr))
        thresholds[group] = thresh[idx]
        print(f"Group {group}: threshold={thresh[idx]:.3f}, TPR={tpr[idx]:.3f}")
    return thresholds

def apply_group_thresholds(y_scores, sensitive_attr, thresholds):
    """Apply group-specific thresholds to get binary predictions."""
    predictions = np.zeros(len(y_scores), dtype=int)
    for group, threshold in thresholds.items():
        mask = sensitive_attr == group
        predictions[mask] = (y_scores[mask] >= threshold).astype(int)
    return predictions
```
Intervention comparison:
| Stage | Method | Pros | Cons |
|---|---|---|---|
| Pre-processing | Reweighting, resampling, disparate impact removal | Model-agnostic; fixes representation | May not fully address proxy variables |
| In-processing | Fairness constraints in objective (adversarial debiasing, fairness regularization) | Directly optimizes fairness during training | Complex; may reduce accuracy |
| Post-processing | Group-specific thresholds, equalized odds post-processing | Simple; doesn’t require retraining | Requires access to sensitive attribute at serving time; may violate anti-discrimination law in some jurisdictions |
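The in-processing row can be made concrete with fairness regularization: add a penalty on the squared demographic-parity gap to the logistic log-loss and descend the combined gradient. A minimal NumPy sketch (the penalty form and hyperparameters are illustrative, not a production method):

```python
import numpy as np

def fair_logreg(x, y, a, lam=1.0, lr=0.1, steps=500):
    """Logistic regression with a demographic-parity penalty:
    loss = log-loss + lam * (mean score | a=0  -  mean score | a=1)^2
    Minimal gradient-descent sketch; assumes both groups are non-empty."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(x @ w + b)))
        # Per-sample gradient of the mean log-loss w.r.t. the logit
        g = (p - y) / len(y)
        # Demographic-parity gap and its gradient through the sigmoid
        gap = p[a == 0].mean() - p[a == 1].mean()
        d = np.where(a == 0, 1 / (a == 0).sum(), -1 / (a == 1).sum())
        g_fair = 2 * lam * gap * d * p * (1 - p)
        grad = g + g_fair
        w -= lr * (x.T @ grad)
        b -= lr * grad.sum()
    return w, b
```

Raising `lam` trades accuracy for a smaller gap in mean predicted scores between groups — the accuracy cost noted in the table above.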
High-Stakes Applications
Q: How would you approach fairness for a hiring screening model?
- Don’t use features that are proxies for protected attributes: zip code correlates with race; name can reveal gender and ethnicity
- Audit output distributions: track selection rates by demographic group; flag adverse impact (4/5ths rule: selection rate for protected group should be at least 80% of highest-selecting group)
- Consider: should you build this at all? If the historical training data reflects discriminatory hiring, the model will replicate it regardless of interventions
- Human in the loop: for high-stakes decisions, require human review; model is a filter, not a decision-maker
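The 4/5ths-rule audit above reduces to a ratio of selection rates; a minimal sketch (the function name and counts are illustrative):

```python
import numpy as np

def adverse_impact_ratio(selected, group):
    """Each group's selection rate divided by the highest-selecting
    group's rate; values below 0.8 flag potential adverse impact
    under the 4/5ths rule of thumb."""
    selected = np.asarray(selected)
    group = np.asarray(group)
    rates = {g: selected[group == g].mean() for g in np.unique(group)}
    top = max(rates.values())
    return {g: r / top for g, r in rates.items()}

selected = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0,   # group A: 5/10 selected
            1, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # group B: 3/10 selected
group = ["A"] * 10 + ["B"] * 10
ratios = adverse_impact_ratio(selected, group)
# Group B's ratio is 0.3 / 0.5 = 0.6 < 0.8: flag for review
```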
Q: What are the limitations of purely technical fairness interventions?
- Proxy variables: even after removing race/gender, models can learn equivalent proxies from correlated features
- Feedback loops: biased decisions generate biased future training data (predictive policing example)
- Definition conflicts: satisfying one fairness criterion may worsen another
- The ground truth problem: if the label itself is biased (arrest = crime), no technical fix solves the underlying problem
- Generalization: a fair model on test data may be unfair on out-of-distribution deployment data
Regulatory Context (2026)
- EU AI Act: High-risk AI systems (hiring, credit, law enforcement) require conformity assessments, transparency reports, and human oversight
- US: EEOC guidance extends disparate impact doctrine to algorithmic hiring tools; CFPB applies ECOA to credit scoring models
- Model cards: Standardized documentation of model performance across demographic groups; expected at Google, Meta, Microsoft for all production models
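The disaggregated-performance section of a model card can be generated mechanically from per-group metrics; a sketch (the table schema is illustrative, not any company's required format):

```python
import numpy as np

def model_card_table(y_true, y_pred, group):
    """Render per-group performance as a markdown table, the core
    disaggregated evaluation in a model card (schema illustrative).
    Assumes every group has both positive and negative examples."""
    rows = ["| Group | N | Accuracy | TPR | FPR |",
            "|---|---|---|---|---|"]
    for g in np.unique(group):
        m = group == g
        yt, yp = y_true[m], y_pred[m]
        acc = (yt == yp).mean()
        tpr = yp[yt == 1].mean()
        fpr = yp[yt == 0].mean()
        rows.append(f"| {g} | {m.sum()} | {acc:.2f} | {tpr:.2f} | {fpr:.2f} |")
    return "\n".join(rows)

y_true = np.array([1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 0])
group  = np.array([0, 0, 1, 1])
table = model_card_table(y_true, y_pred, group)
```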
Depth Levels
Junior: Name types of bias, explain the fairness trade-off, describe one technical intervention.
Senior: Implement group-specific threshold adjustment, explain impossibility theorem, audit a model for disparate impact.
Staff: Design a fairness monitoring system for production, navigate stakeholder alignment on which fairness criterion to optimize, assess regulatory compliance requirements, advise on whether to build a high-risk application at all.
Related ML Topics
- Classification Metrics — fairness metrics (equal opportunity, equalized odds) are extensions of precision/recall applied across demographic groups
- Handling Imbalanced Datasets — class imbalance and group imbalance interact: a model can be accurate overall while severely underperforming for underrepresented groups
- How to Evaluate an LLM — LLM bias evaluation and red-teaming are part of the broader LLM evaluation framework; TruthfulQA and demographic bias benchmarks
- ML System Design: Build a Fraud Detection System — fraud models face regulatory fairness requirements (ECOA, CFPB); false positive rates must be audited across demographic groups
- How Does RLHF Work? — RLHF and Constitutional AI are technical approaches to aligning LLMs with human values; related to but distinct from bias mitigation