Content Classifier Low-Level Design: Multi-Label Classification, Ensemble Models, and Human Review

Content Classifier System Design Overview

A content classifier assigns one or more category labels to user-generated content: text posts, images, videos, and documents. Production content classifiers must handle multi-label outputs (a post can be both spam and adult content), combine signals from multiple models via ensemble voting, route uncertain predictions to human reviewers, and feed reviewer decisions back into model retraining. Getting this pipeline right is critical for platform safety and content quality.

Requirements

Functional Requirements

  • Classify content across a configurable label taxonomy (spam, adult, violence, hate speech, misinformation, safe).
  • Support multi-label classification: a single item may receive multiple labels simultaneously.
  • Combine predictions from text, image, and metadata models using ensemble voting.
  • Route items with prediction confidence below a configurable threshold to a human review queue.
  • Accept reviewer verdicts and store them for model retraining and audit.

Non-Functional Requirements

  • Classification latency under 500ms for synchronous calls on text content.
  • Throughput of 10,000 classification requests per second at peak.
  • Human review queue SLA: items reviewed within 4 hours of escalation.
  • Ensemble model updates deployable without service restart.

Data Model

  • classification_requests: request_id, content_id, content_type, content_hash, submitted_at, status
  • model_predictions: request_id, model_id, label, score, model_version, predicted_at
  • ensemble_decisions: request_id, label, ensemble_score, confidence, action (allow, block, escalate), decided_at
  • review_queue_items: item_id, request_id, labels_under_review, assigned_reviewer_id, escalated_at, reviewed_at, verdict, reviewer_notes
  • training_examples: content_id, label, source (model or human), confidence, created_at
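
As a rough illustration of the data model above, two of the tables could be mirrored in application code as follows. This is only a sketch: the field types and the Action enum are assumptions, since the source lists column names but not types.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Action(str, Enum):
    """The three actions listed for ensemble_decisions.action."""
    ALLOW = "allow"
    BLOCK = "block"
    ESCALATE = "escalate"


@dataclass(frozen=True)
class ModelPrediction:
    """One row of model_predictions: a single model's score for a single label."""
    request_id: str
    model_id: str
    label: str
    score: float          # expected to lie in [0, 1]
    model_version: str
    predicted_at: datetime


@dataclass(frozen=True)
class EnsembleDecision:
    """One row of ensemble_decisions: the combined result for a single label."""
    request_id: str
    label: str
    ensemble_score: float
    confidence: float
    action: Action
    decided_at: datetime
```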

Core Algorithms

Multi-Label Classification

Each model in the ensemble acts as an independent binary classifier for every label, outputting a score in [0, 1] per label. The multi-label decision applies a per-label threshold to each score independently, so a piece of content can receive any combination of labels. Per-label thresholds are tuned on a held-out validation set to hit a target precision of 0.95 for blocking actions, accepting lower recall to minimize false positives that would incorrectly penalize legitimate content.
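
The thresholding and tuning steps can be sketched as below. The function names are hypothetical, and the tuning rule (pick the lowest threshold that reaches the target precision, which keeps recall as high as possible) is one reasonable reading of the text rather than a confirmed procedure.

```python
from typing import Dict, List, Optional, Sequence


def apply_thresholds(scores: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return every label whose score meets its own threshold (labels are independent)."""
    return sorted(label for label, s in scores.items() if s >= thresholds[label])


def tune_threshold(
    scores: Sequence[float],
    truths: Sequence[bool],
    target_precision: float = 0.95,
) -> Optional[float]:
    """Lowest threshold whose validation precision reaches the target, or None.

    Candidates are the observed scores; recall can only fall as the threshold
    rises, so the first candidate that qualifies also has the best recall.
    """
    for t in sorted(set(scores)):
        predicted_positive = [y for s, y in zip(scores, truths) if s >= t]
        if predicted_positive:
            precision = sum(predicted_positive) / len(predicted_positive)
            if precision >= target_precision:
                return t
    return None
```

For example, a post scoring 0.97 for spam and 0.92 for adult against thresholds of 0.90 receives both labels, which is the multi-label case described in the overview.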

Ensemble Voting

Three model types contribute predictions: a fine-tuned BERT-based text classifier, a ResNet-based image classifier (for posts with media), and a structured metadata classifier (account age, post frequency, link domains). Ensemble voting uses a weighted average of per-model scores for each label: ensemble_score = sum(w_i * score_i) / sum(w_i), where w_i is the weight of model i and score_i is its score for that label. Weights are maintained in a configuration store and updated quarterly based on each model's offline F1 score on the validation set. If the image model does not apply (text-only content), its weight is redistributed proportionally among the remaining models.
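
A minimal sketch of the weighted average follows. The model names and weights are made up for illustration. Dividing by the sum of the weights of the models that actually produced a score has the same effect as redistributing a missing model's weight proportionally among the rest.

```python
from typing import Dict, Optional


def ensemble_score(scores: Dict[str, Optional[float]], weights: Dict[str, float]) -> float:
    """Weighted average over the models that returned a score for this label.

    A score of None marks a model that did not contribute (for example the
    image model on text-only content).
    """
    available = {m: s for m, s in scores.items() if s is not None}
    total_weight = sum(weights[m] for m in available)
    if total_weight == 0:
        raise ValueError("no model produced a score")
    return sum(weights[m] * s for m, s in available.items()) / total_weight


# Hypothetical weights; the source keeps the real ones in a configuration store.
WEIGHTS = {"text": 0.5, "image": 0.3, "metadata": 0.2}
full = ensemble_score({"text": 0.8, "image": 0.6, "metadata": 0.4}, WEIGHTS)
text_only = ensemble_score({"text": 0.8, "image": None, "metadata": 0.4}, WEIGHTS)
```

With these numbers the full ensemble gives 0.66, while the text-only case gives 0.48 / 0.7, roughly 0.686, because the text and metadata weights are rescaled to 5/7 and 2/7.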

Confidence-Gated Human Review

After ensemble scoring, each label is assigned to one of three disposition zones by comparing ensemble_score to two thresholds: auto_block_threshold (default 0.90) and review_threshold (default 0.60). Scores above auto_block_threshold trigger immediate blocking. Scores between review_threshold and auto_block_threshold escalate to the human review queue. Scores below review_threshold are auto-approved. The dual-threshold design keeps the human review queue manageable: only genuinely ambiguous predictions require human judgment, while clear violations and clear approvals are handled automatically.
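
The three zones map naturally onto a small function. The text does not say what happens when a score is exactly equal to a threshold, so the sketch below assumes both lower bounds are inclusive; the function name is also hypothetical.

```python
def disposition(
    ensemble_score: float,
    auto_block_threshold: float = 0.90,
    review_threshold: float = 0.60,
) -> str:
    """Map one label's ensemble score to an action: block, escalate, or allow.

    Assumption: a score exactly on a threshold is treated as belonging to the
    stricter zone (>= comparisons).
    """
    if review_threshold > auto_block_threshold:
        raise ValueError("review_threshold must not exceed auto_block_threshold")
    if ensemble_score >= auto_block_threshold:
        return "block"
    if ensemble_score >= review_threshold:
        return "escalate"
    return "allow"
```

Raising review_threshold shrinks the escalation band and therefore the review queue, at the cost of auto-approving more borderline content.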

Feedback Loop

Human reviewer verdicts are written to the training_examples table as high-confidence labeled examples (confidence = 1.0). A nightly retraining pipeline (Apache Airflow DAG) selects the last 30 days of human-reviewed examples, samples an equal volume of auto-approved examples, and fine-tunes each model on the combined dataset. Models are evaluated offline against a fixed benchmark dataset; models that improve F1 by more than 0.5% are promoted to the canary slot and receive 10% of traffic for 24 hours before full promotion.
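
The dataset assembly and promotion gate might look like the sketch below. Several details are assumptions: rows with source "model" stand in for auto-approved examples, the sample is capped when fewer model rows exist than human rows, and "0.5%" is read as an absolute gain of 0.005 in F1 (the text does not say whether the gain is absolute or relative).

```python
import random
from datetime import datetime, timedelta
from typing import Dict, List


def build_training_set(
    examples: List[Dict],
    now: datetime,
    window_days: int = 30,
    seed: int = 0,
) -> List[Dict]:
    """Recent human-reviewed rows plus an equally sized sample of model-labeled rows."""
    cutoff = now - timedelta(days=window_days)
    human = [e for e in examples if e["source"] == "human" and e["created_at"] >= cutoff]
    auto = [e for e in examples if e["source"] == "model"]
    rng = random.Random(seed)  # fixed seed keeps the nightly sample reproducible
    sampled = rng.sample(auto, min(len(human), len(auto)))
    return human + sampled


def should_promote(candidate_f1: float, baseline_f1: float, min_gain: float = 0.005) -> bool:
    """Promote to canary only if the benchmark F1 gain exceeds min_gain (assumed absolute)."""
    return candidate_f1 - baseline_f1 > min_gain
```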

Scalability

Classification requests are served synchronously for latency-sensitive callers (post submission flow) and asynchronously via Kafka for batch backfill jobs. The ensemble orchestrator calls each model microservice in parallel using async HTTP with a 300ms timeout per model. Models that time out are excluded from the ensemble for that request, and the remaining models vote with redistributed weights. This prevents a slow model from blocking the entire classification decision.
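
The parallel fan-out with a per-model timeout can be sketched with asyncio alone. Each model call is represented here by a plain coroutine function; in a real orchestrator it would wrap an async HTTP request, which is left out to keep the example self-contained. All names are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

ModelCall = Callable[[], Awaitable[float]]


async def call_with_timeout(call: ModelCall, timeout_s: float) -> Optional[float]:
    """Run one model call; return None instead of raising if it times out."""
    try:
        return await asyncio.wait_for(call(), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


async def score_label(
    calls: Dict[str, ModelCall],
    weights: Dict[str, float],
    timeout_s: float = 0.3,
) -> Optional[float]:
    """Call all models concurrently and average the ones that answered in time."""
    names = list(calls)
    results = await asyncio.gather(*(call_with_timeout(calls[n], timeout_s) for n in names))
    answered = {n: r for n, r in zip(names, results) if r is not None}
    total_weight = sum(weights[n] for n in answered)
    if total_weight == 0:
        return None  # every model timed out; the caller decides what to do
    return sum(weights[n] * s for n, s in answered.items()) / total_weight
```

Because the calls run concurrently, the worst-case wait is one timeout (300ms by default) rather than the sum of all model latencies, which leaves headroom inside the 500ms budget for synchronous text classification.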

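The human review queue is backed by Postgres with a partial index on escalated_at (ascending), restricted to rows where status = pending, to efficiently serve the oldest-first assignment query used by the reviewer dashboard. Because reviewed rows fall outside the index predicate, the index stays small as the table grows.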
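
The index and the assignment query can be tried out locally with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a server; the CREATE INDEX ... WHERE form shown is also accepted by Postgres. The status column is an assumption, since the data model lists no such column on review_queue_items, and the table is trimmed to the columns the query needs.

```python
import sqlite3
from typing import List


def make_queue() -> sqlite3.Connection:
    """Create an in-memory review queue with a partial index on pending rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE review_queue_items (
            item_id      INTEGER PRIMARY KEY,
            request_id   TEXT NOT NULL,
            status       TEXT NOT NULL,   -- assumed column: 'pending' or 'reviewed'
            escalated_at TEXT NOT NULL    -- ISO 8601 timestamps sort chronologically
        );
        -- Only pending rows are indexed, ordered oldest first.
        CREATE INDEX idx_review_pending
            ON review_queue_items (escalated_at)
            WHERE status = 'pending';
        """
    )
    return conn


def next_pending(conn: sqlite3.Connection, limit: int) -> List[str]:
    """Return request_ids of the oldest pending items, oldest first."""
    rows = conn.execute(
        "SELECT request_id FROM review_queue_items "
        "WHERE status = 'pending' ORDER BY escalated_at ASC LIMIT ?",
        (limit,),
    ).fetchall()
    return [r[0] for r in rows]
```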

API Design

POST /v1/classify

  • Body: content_id, content_type (text, image, video, mixed), text (optional), media_url (optional), metadata (JSON)
  • Response: request_id, decisions array (label, ensemble_score, action), reviewed_by_human: false

POST /v1/review/{item_id}/verdict

  • Body: verdict (allow or block), labels (array of confirmed labels), reviewer_notes
  • Response: item_id, updated_action, training_example_id
  • Auth: reviewer role required
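
On the server side, a verdict has to become rows in training_examples. The sketch below shows one way that conversion might work. Two behaviors are assumptions not stated in the source: an allow verdict with no confirmed labels is recorded under the taxonomy's safe label, and a block verdict without labels is rejected.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Sequence


def verdict_to_training_examples(
    content_id: str,
    verdict: str,
    labels: Sequence[str],
    now: Optional[datetime] = None,
) -> List[Dict]:
    """Turn one reviewer verdict into training_examples rows (source=human, confidence=1.0)."""
    if verdict not in ("allow", "block"):
        raise ValueError("verdict must be 'allow' or 'block'")
    confirmed = list(labels)
    if not confirmed:
        if verdict == "block":
            # Assumption: a block must name at least one confirmed label.
            raise ValueError("a block verdict needs at least one confirmed label")
        confirmed = ["safe"]  # Assumption: unlabeled allow means the content is safe.
    created_at = now or datetime.now(timezone.utc)
    return [
        {
            "content_id": content_id,
            "label": label,
            "source": "human",
            "confidence": 1.0,
            "created_at": created_at,
        }
        for label in confirmed
    ]
```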
