Overview
A returns and refunds service handles the reverse logistics of e-commerce: accepting return requests, authorizing them against policy, routing physical goods back to the warehouse, restocking eligible inventory, calculating refund amounts, issuing credits or reversals through the payment gateway, and reconciling the financial ledger. The service must be correct (no double refunds, no lost credits), auditable (every dollar traceable), and fraud-resistant (not exploitable by bad actors filing false claims).
Data Model
-- One return request per return event (a customer may return multiple times per order)
CREATE TABLE return_requests (
return_id BIGINT PRIMARY KEY AUTO_INCREMENT,
order_id BIGINT NOT NULL,
customer_id BIGINT NOT NULL,
status ENUM('requested','authorized','in_transit','received','inspected','refund_pending','refunded','rejected','closed') NOT NULL DEFAULT 'requested',
reason_code VARCHAR(64) NOT NULL,
reason_note TEXT,
rma_number VARCHAR(64) UNIQUE, -- Return Merchandise Authorization
label_url TEXT, -- pre-paid shipping label
fraud_score TINYINT UNSIGNED, -- 0-100, higher = more suspicious
requested_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
authorized_at DATETIME,
received_at DATETIME,
closed_at DATETIME,
updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
INDEX idx_order (order_id),
INDEX idx_customer (customer_id),
INDEX idx_status (status),
INDEX idx_rma (rma_number)
);
-- Line items being returned (subset of the original order items)
CREATE TABLE return_items (
ri_id BIGINT PRIMARY KEY AUTO_INCREMENT,
return_id BIGINT NOT NULL,
order_item_id BIGINT NOT NULL,
qty_returned INT NOT NULL,
condition_code ENUM('new','like_new','damaged','unsellable') DEFAULT NULL, -- set after inspection
restock_qty INT NOT NULL DEFAULT 0,
FOREIGN KEY (return_id) REFERENCES return_requests(return_id),
INDEX idx_return (return_id)
);
-- Financial outcome of each return
CREATE TABLE refunds (
refund_id BIGINT PRIMARY KEY AUTO_INCREMENT,
return_id BIGINT NOT NULL UNIQUE, -- one refund record per return
gross_item_amount NUMERIC(12,2) NOT NULL, -- sum of returned line totals
restocking_fee NUMERIC(12,2) NOT NULL DEFAULT 0,
shipping_refund NUMERIC(12,2) NOT NULL DEFAULT 0,
net_refund_amount NUMERIC(12,2) NOT NULL, -- gross - fee + shipping_refund
refund_method ENUM('original_payment','store_credit','check') NOT NULL,
payment_gateway_ref VARCHAR(128), -- gateway transaction ID
status ENUM('pending','processing','completed','failed') NOT NULL DEFAULT 'pending',
issued_at DATETIME,
FOREIGN KEY (return_id) REFERENCES return_requests(return_id)
);
-- Append-only ledger: every financial movement
CREATE TABLE refund_ledger (
ledger_id BIGINT PRIMARY KEY AUTO_INCREMENT,
refund_id BIGINT NOT NULL,
entry_type ENUM('debit','credit') NOT NULL,
amount NUMERIC(12,2) NOT NULL,
description VARCHAR(255) NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (refund_id) REFERENCES refunds(refund_id)
);
-- Fraud signal history per customer
CREATE TABLE return_fraud_signals (
signal_id BIGINT PRIMARY KEY AUTO_INCREMENT,
customer_id BIGINT NOT NULL,
return_id BIGINT NOT NULL,
signal_type VARCHAR(64) NOT NULL, -- e.g., high_return_rate, wardrobing, fake_damage
score_delta TINYINT NOT NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
INDEX idx_customer (customer_id)
);
Return Authorization Workflow
- Eligibility check: verify the order is in a returnable state (
fulfilledorpartially_fulfilled), the return window has not expired (configurable per product category), and the requested quantity does not exceed the shipped quantity minus any prior returns for the same item. - Fraud scoring: compute a
fraud_score(0-100) based on signals: customer lifetime return rate, days since delivery, number of open returns, item category risk, and whether a physical label is actually used. Score >= 80 triggers manual review queue; the return is held atrequestedstatus. - Authorization: if score < 80 and policy checks pass, set status to
authorized, generate an RMA number, and issue a pre-paid return shipping label via the carrier API. Write all of this in one DB transaction.
Restocking Logic
When the warehouse receives the physical return it performs an inspection and records condition_code on each return_item. The restocking decision table:
condition_code = new --> restock_qty = qty_returned (full restock)
condition_code = like_new --> restock_qty = qty_returned (restock, may discount)
condition_code = damaged --> restock_qty = 0 (quarantine, file vendor claim)
condition_code = unsellable --> restock_qty = 0 (dispose or liquidate)
For each row where restock_qty > 0, the service calls the IMS receiveStock endpoint to increment on_hand. This is done item by item so partial failures do not block the entire return.
Refund Calculation
gross_item_amount = SUM(order_item.unit_price * return_item.qty_returned)
restocking_fee = gross_item_amount * policy.restocking_fee_pct -- e.g. 0.15 if item was opened
shipping_refund = original_order.shipping_cost IF all items returned ELSE 0
net_refund_amount = gross_item_amount - restocking_fee + shipping_refund
Partial refunds arise when only some items in an order are returned, when a restocking fee applies, or when the return is a goodwill credit (e.g., item arrived damaged: full refund including original shipping).
Partial Refund Flow
- Compute
net_refund_amount. - Write a
refundsrow with statuspendingand the computed amounts (single transaction also inserting ledger credit entry). - Call payment gateway refund API with the original
payment_gateway_reffrom the order and the partial amount. Most gateways support partial refunds natively; the OMS stored the charge ID at payment capture time. - On gateway success: update
refunds.statustocompleted, store the gateway transaction reference, insert a ledger debit entry to balance the credit, and transition the return torefunded. - On gateway failure: mark
refunds.statusasfailed, enqueue a retry job with exponential backoff, and alert the ops team after N consecutive failures.
Fraud Detection
Fraud signals are evaluated at authorization time and again at inspection time:
- High return rate: customer's return count / order count over 90 days > threshold (e.g., 40%).
- Wardrobing: high-value item returned just before the window closes, condition is "like new", pattern matches previous returns of the same SKU by this customer.
- Label not scanned: RMA issued but carrier never records a scan within N days — likely a claim without actual shipment.
- Damage inconsistency: customer claims damage but inspection finds item in new condition — potential false claim.
Each detected signal inserts a row into return_fraud_signals. Cumulative score across recent returns feeds into the next authorization decision. Customers above a lifetime threshold are flagged for manual review on all future returns.
Key Design Decisions and Trade-offs
- Separate refund row from return request row: a return can exist without a refund (e.g., exchange-only returns, rejected returns). Keeping the financial record in a dedicated table with its own ledger makes accounting queries straightforward and avoids NULLs in the return table.
- Append-only ledger:
refund_ledgeris never updated. Every financial event (credit for refund amount, debit after gateway confirmation, debit for restocking fee) is a new row. Running SUM over the ledger always reconciles to the current state — no need to trust a mutable balance column. - Fraud score at authorization, not only at receipt: early scoring allows the service to deny fraudulent requests before issuing a label and before the customer has any expectation of a refund. Later re-scoring at inspection catches fraud that can only be detected physically.
- Store credit vs. original payment method: store credit bypasses the payment gateway entirely — it is a ledger entry in the customer wallet service. This is faster and has no gateway failure risk. Offering store credit at a slight premium (e.g., 105% of net refund) reduces gateway fees and retains revenue.
Failure Handling and Edge Cases
- Gateway partial refund rejection: some gateways reject partial refunds if the charge is older than a threshold (e.g., 180 days). The service falls back to store credit automatically and notifies the customer.
- Return received but no matching RMA: the WMS may receive an unlabeled or misrouted package. A manual review queue handles these; if resolved, the return is retroactively linked and the normal flow resumes.
- Double refund prevention: the
UNIQUEconstraint onrefunds.return_idprevents inserting two refund records for the same return. The gateway call is wrapped in an idempotency key so retrying after a timeout does not issue a second charge reversal. - Inspection updates after refund issued: if the inspection finds the item unsellable after the refund was already processed, no financial reversal is issued (the customer already received credit); instead, a vendor claim or write-off entry is appended to the ledger as a debit against cost of goods.
- Return window expiry race: the authorization check and the window expiry check must be atomic. A row-level lock on the return request (SELECT FOR UPDATE) prevents a concurrent request from slipping through after another request already consumed the last valid authorization slot.
Scalability Considerations
- Async inspection processing: warehouse inspection events arrive as webhooks; they are written to a queue and processed asynchronously so the WMS HTTP call returns immediately. The return status transitions happen in a consumer process.
- Ledger partitioning: partition
refund_ledgerby month. Accounting reconciliation queries always target a bounded date range so partition pruning keeps them fast even as the table grows to hundreds of millions of rows. - Fraud model refresh: fraud scoring uses pre-computed customer-level aggregates (return rate, lifetime score) stored in a materialized table refreshed hourly by a batch job. Real-time scoring reads the materialized table rather than aggregating raw event data on every authorization request.
- Return volume spikes: post-holiday return spikes can be 10x normal volume. The authorization path is stateless and horizontally scalable; the bottleneck is the payment gateway. Implement a token bucket rate limiter per gateway credential and queue excess refund processing jobs for off-peak execution.
Summary
A returns and refunds service is a correctness-critical financial system layered on top of reverse logistics. The data model separates commercial returns (return_requests, return_items) from financial outcomes (refunds, refund_ledger) to keep each concern independently queryable and auditable. The authorization workflow gates on eligibility, fraud score, and policy rules before any physical or financial commitment is made. Restocking feeds directly into inventory via the IMS. Refund calculation handles partial returns, restocking fees, and shipping credits via a deterministic formula. An append-only ledger ensures that every dollar in and out is traceable, and idempotency keys at the gateway boundary prevent double refunds under retry conditions.
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Shopify Interview Guide
See also: Airbnb Interview Guide 2026: Search Systems, Trust and Safety, and Full-Stack Engineering