What Is a Scoring Service?
A scoring service is the write-side component responsible for receiving raw score events, applying business rules (multipliers, caps, bonuses), and producing a final authoritative score for each entity. It is the upstream producer that feeds leaderboards and ranking systems. Correctness, idempotency, and throughput are its primary concerns.
Data Model
Each scoring event is persisted before processing to enable replay and audit:
CREATE TABLE score_events (
event_id UUID PRIMARY KEY,
entity_id BIGINT NOT NULL,
namespace VARCHAR(64) NOT NULL,
event_type VARCHAR(64) NOT NULL, -- e.g. 'kill', 'purchase', 'click'
raw_value BIGINT NOT NULL,
multiplier DECIMAL(5,2) NOT NULL DEFAULT 1.00,
final_value BIGINT NOT NULL,
created_at TIMESTAMP NOT NULL DEFAULT NOW(),
processed_at TIMESTAMP,
status VARCHAR(16) NOT NULL DEFAULT 'pending', -- 'pending', 'applied', 'rejected'
INDEX idx_entity_namespace (entity_id, namespace),
INDEX idx_status_created (status, created_at)
);
CREATE TABLE entity_totals (
entity_id BIGINT NOT NULL,
namespace VARCHAR(64) NOT NULL,
total_score BIGINT NOT NULL DEFAULT 0,
last_event UUID,
updated_at TIMESTAMP NOT NULL DEFAULT NOW(),
PRIMARY KEY (entity_id, namespace)
);
Core Algorithm and Workflow
Event Ingestion
- Client calls POST /score-events with {entity_id, event_type, raw_value, idempotency_key}.
- API validates the payload and checks the idempotency key against a Redis cache (SET idempotency:{key} 1 NX EX 86400). Duplicate requests return the original result immediately.
- Event is written to score_events with status pending and published to the Kafka topic raw-score-events.
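The idempotency check above can be sketched as follows. This is a minimal in-memory stand-in for the Redis SET ... NX EX call; the IdempotencyStore class and ingest function are illustrative names, not part of the service's actual API.

```python
import time

class IdempotencyStore:
    """In-memory stand-in for Redis SET key 1 NX EX ttl semantics."""
    def __init__(self):
        self._keys = {}  # key -> expiry timestamp

    def set_nx_ex(self, key, ttl_seconds):
        """Return True if the key was newly set (first request),
        False if it already exists and has not expired (duplicate)."""
        now = time.time()
        expiry = self._keys.get(key)
        if expiry is not None and expiry > now:
            return False  # duplicate request
        self._keys[key] = now + ttl_seconds
        return True

def ingest(store, event):
    """Accept an event only once per idempotency key."""
    if not store.set_nx_ex(f"idempotency:{event['idempotency_key']}", 86400):
        return {"status": "duplicate"}
    # ...write to score_events with status 'pending', publish to Kafka...
    return {"status": "accepted"}
```

In production the store is Redis itself, so the deduplication window is shared across all API instances.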
Score Processing
- Score processor consumer reads from Kafka.
- Looks up the rule set for (namespace, event_type): base points, multiplier, cap, bonus conditions.
- Applies rules: final_value = MIN(raw_value * multiplier + bonus, cap).
- Atomically increments entity_totals.total_score using an optimistic lock (compare-and-swap on updated_at) or a database row lock.
- Updates score_events status to applied and publishes a score-applied event to a downstream Kafka topic consumed by leaderboard and ranking services.
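The rule-application step can be sketched as a pure function. This assumes bonus defaults to 0 and an absent cap means uncapped; the truncation to an integer matches the BIGINT final_value column.

```python
def apply_rules(raw_value, multiplier=1.0, bonus=0, cap=None):
    """Compute final_value = min(raw_value * multiplier + bonus, cap).
    cap=None means uncapped; the result is truncated to an integer
    to match the BIGINT final_value column."""
    value = int(raw_value * multiplier) + bonus
    if cap is not None:
        value = min(value, cap)
    return value
```

For example, apply_rules(100, 2.0, 50) yields 250, and the same event under a cap of 200 yields 200.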
Rule Engine
Rules are stored in a configuration table and cached in memory with a short TTL:
CREATE TABLE scoring_rules (
namespace VARCHAR(64) NOT NULL,
event_type VARCHAR(64) NOT NULL,
base_points BIGINT NOT NULL,
multiplier DECIMAL(5,2) NOT NULL DEFAULT 1.00,
score_cap BIGINT,
bonus_json TEXT, -- JSON blob for conditional bonuses
PRIMARY KEY (namespace, event_type)
);
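The in-memory TTL cache in front of scoring_rules can be sketched like this. The loader callback (which would query the scoring_rules table) and the RuleCache name are assumptions for illustration.

```python
import time

class RuleCache:
    """Caches (namespace, event_type) -> rule dict with a short TTL,
    so rule changes propagate within ttl_seconds without a deploy."""
    def __init__(self, loader, ttl_seconds=30):
        self._loader = loader   # e.g. a function querying scoring_rules
        self._ttl = ttl_seconds
        self._cache = {}        # key -> (rule, fetched_at)

    def get(self, namespace, event_type):
        key = (namespace, event_type)
        entry = self._cache.get(key)
        if entry and time.time() - entry[1] < self._ttl:
            return entry[0]
        rule = self._loader(namespace, event_type)  # DB round-trip
        self._cache[key] = (rule, time.time())
        return rule
```

A short TTL trades a small window of stale rules for a large reduction in database reads on the hot path.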
Failure Handling and Consistency
At-least-once delivery: Kafka guarantees at-least-once delivery. Idempotency keys at the API layer prevent client-side duplicates. Processor-side deduplication checks score_events.status before applying: if applied, skip and acknowledge.
Processor crash mid-flight: The event remains pending in the database. A sweeper job periodically retries events stuck in pending beyond a timeout, re-publishing them to Kafka. This guarantees eventual processing.
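The sweeper's selection logic can be sketched as follows. In production this is a SQL query over the (status, created_at) index; here events is a list of dicts mirroring score_events rows, and the function name is illustrative.

```python
import time

def find_stuck_events(events, timeout_seconds, now=None):
    """Return events still 'pending' past the timeout, for re-publishing
    to Kafka. Mirrors a query on score_events(status, created_at)."""
    now = now if now is not None else time.time()
    return [e for e in events
            if e["status"] == "pending"
            and now - e["created_at"] > timeout_seconds]
```

The timeout must exceed the worst-case normal processing latency, or the sweeper will re-publish events that are merely slow, relying on processor-side deduplication to absorb the duplicates.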
Rule misconfiguration: Events that fail rule validation are marked rejected with an error code. A dead-letter queue holds them for manual review and potential reprocessing after rule correction.
Optimistic concurrency: If two processors race to update entity_totals, the loser retries. Under high contention, a Redis-based lock (SET lock:entity:{id} 1 NX EX 5) serializes updates per entity while keeping throughput high across entities.
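The compare-and-swap retry loop can be sketched like this. The fetch and cas_update callbacks are hypothetical database helpers: fetch reads the entity_totals row, and cas_update issues an UPDATE guarded by WHERE updated_at = <seen value>, returning True only if a row was changed.

```python
def increment_total(fetch, cas_update, entity_id, namespace, delta,
                    max_retries=5):
    """Optimistic-lock increment: read the row, then write only if
    updated_at is unchanged since the read (compare-and-swap)."""
    for _ in range(max_retries):
        row = fetch(entity_id, namespace)  # total_score, updated_at
        new_total = row["total_score"] + delta
        if cas_update(entity_id, namespace, new_total, row["updated_at"]):
            return new_total
        # another processor won the race; re-read and retry
    raise RuntimeError("contention too high; fall back to a row lock")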
Scalability Considerations
Kafka partitioning by entity_id: Partitioning the raw-score-events topic by entity_id ensures all events for a given entity are processed in order by the same consumer, eliminating per-entity race conditions without explicit locking.
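The principle behind key-based partitioning can be sketched as a stable hash. Kafka's default partitioner uses murmur2 rather than MD5; this illustrative function only shows the property that matters: the same entity_id always maps to the same partition.

```python
import hashlib

def partition_for(entity_id, num_partitions):
    """Stable partition assignment so every event for an entity
    lands on the same partition (and thus the same consumer)."""
    digest = hashlib.md5(str(entity_id).encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions
```

Because assignment is deterministic, no coordination is needed: any producer instance computes the same partition for a given entity.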
Horizontal consumer scaling: Add consumer instances up to the partition count to scale throughput linearly. Partition count should be sized for peak write volume with headroom.
Score aggregation offload: For entities with extremely high event rates (e.g., a popular streamer receiving thousands of tip events per second), a mini-aggregator buffers events in memory for 100ms and emits a single batched increment, reducing database write pressure.
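The mini-aggregator can be sketched as a buffer keyed by entity. The MiniAggregator name and flush callback are illustrative; in production, flush() runs on a ~100 ms timer and the callback performs the batched database increment.

```python
class MiniAggregator:
    """Buffers per-entity increments and emits one batched increment
    per flush window, cutting database write volume for hot entities."""
    def __init__(self, emit):
        self._emit = emit     # callback: (entity_id, total_delta)
        self._pending = {}    # entity_id -> accumulated delta

    def add(self, entity_id, delta):
        self._pending[entity_id] = self._pending.get(entity_id, 0) + delta

    def flush(self):
        """In production, called on a ~100 ms timer."""
        for entity_id, delta in self._pending.items():
            self._emit(entity_id, delta)
        self._pending.clear()
```

A thousand tip events in a window collapse into a single increment per entity, at the cost of up to one flush interval of staleness.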
Read-your-writes: After submitting a score event, clients may immediately query their score. Route these reads to the relational entity_totals table (with replication lag awareness) rather than Redis, which may not yet reflect the latest applied event.
Database sharding: Shard entity_totals and score_events by entity_id % N across database nodes. The scoring service routes writes to the correct shard. A scatter-gather is only needed for cross-entity analytics, which belongs in a data warehouse, not the operational store.
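The shard routing described above reduces to a modulo on the entity key; a minimal sketch, assuming shard handles are indexed by shard id:

```python
def shard_for(entity_id, num_shards):
    """Route all reads and writes for an entity to a fixed shard,
    matching the entity_id % N scheme described above."""
    return entity_id % num_shards
```

Note that a plain modulo requires resharding (moving rows) when N changes; consistent hashing is a common alternative when shard counts are expected to grow.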
Summary
A scoring service is the authoritative write path for score data. It enforces idempotency at ingestion, applies configurable rule-based transformations, and persists results durably before propagating downstream. Kafka partitioning by entity_id provides natural ordering and horizontal scale. Optimistic concurrency and sweeper jobs guarantee correctness under failures. The service is designed to be the single source of truth, decoupling raw event volume from the leaderboard and ranking read paths.