Low Level Design: Returns and Refunds Service

Overview

A returns and refunds service handles the reverse logistics of e-commerce: accepting return requests, authorizing them against policy, routing physical goods back to the warehouse, restocking eligible inventory, calculating refund amounts, issuing credits or reversals through the payment gateway, and reconciling the financial ledger. The service must be correct (no double refunds, no lost credits), auditable (every dollar traceable), and fraud-resistant (not exploitable by bad actors filing false claims).

Data Model

-- One return request per return event (a customer may return multiple times per order)
CREATE TABLE return_requests (
    return_id       BIGINT         PRIMARY KEY AUTO_INCREMENT,
    order_id        BIGINT         NOT NULL,
    customer_id     BIGINT         NOT NULL,
    status          ENUM('requested','authorized','in_transit','received','inspected','refund_pending','refunded','rejected','closed') NOT NULL DEFAULT 'requested',
    reason_code     VARCHAR(64)    NOT NULL,
    reason_note     TEXT,
    rma_number      VARCHAR(64)    UNIQUE,          -- Return Merchandise Authorization
    label_url       TEXT,                           -- pre-paid shipping label
    fraud_score     TINYINT UNSIGNED,               -- 0-100, higher = more suspicious
    requested_at    DATETIME       NOT NULL DEFAULT CURRENT_TIMESTAMP,
    authorized_at   DATETIME,
    received_at     DATETIME,
    closed_at       DATETIME,
    updated_at      DATETIME       NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    INDEX idx_order      (order_id),
    INDEX idx_customer   (customer_id),
    INDEX idx_status     (status),
    INDEX idx_rma        (rma_number)
);

-- Line items being returned (subset of the original order items)
CREATE TABLE return_items (
    ri_id           BIGINT        PRIMARY KEY AUTO_INCREMENT,
    return_id       BIGINT        NOT NULL,
    order_item_id   BIGINT        NOT NULL,
    qty_returned    INT           NOT NULL,
    condition_code  ENUM('new','like_new','damaged','unsellable') DEFAULT NULL,  -- set after inspection
    restock_qty     INT           NOT NULL DEFAULT 0,
    FOREIGN KEY (return_id) REFERENCES return_requests(return_id),
    INDEX idx_return (return_id)
);

-- Financial outcome of each return
CREATE TABLE refunds (
    refund_id           BIGINT         PRIMARY KEY AUTO_INCREMENT,
    return_id           BIGINT         NOT NULL UNIQUE,   -- one refund record per return
    gross_item_amount   NUMERIC(12,2)  NOT NULL,          -- sum of returned line totals
    restocking_fee      NUMERIC(12,2)  NOT NULL DEFAULT 0,
    shipping_refund     NUMERIC(12,2)  NOT NULL DEFAULT 0,
    net_refund_amount   NUMERIC(12,2)  NOT NULL,          -- gross - fee + shipping_refund
    refund_method       ENUM('original_payment','store_credit','check') NOT NULL,
    payment_gateway_ref VARCHAR(128),                     -- gateway transaction ID
    status              ENUM('pending','processing','completed','failed') NOT NULL DEFAULT 'pending',
    issued_at           DATETIME,
    FOREIGN KEY (return_id) REFERENCES return_requests(return_id)
);

-- Append-only ledger: every financial movement
CREATE TABLE refund_ledger (
    ledger_id     BIGINT        PRIMARY KEY AUTO_INCREMENT,
    refund_id     BIGINT        NOT NULL,
    entry_type    ENUM('debit','credit') NOT NULL,
    amount        NUMERIC(12,2) NOT NULL,
    description   VARCHAR(255)  NOT NULL,
    created_at    DATETIME      NOT NULL DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (refund_id) REFERENCES refunds(refund_id)
);

-- Fraud signal history per customer
CREATE TABLE return_fraud_signals (
    signal_id    BIGINT       PRIMARY KEY AUTO_INCREMENT,
    customer_id  BIGINT       NOT NULL,
    return_id    BIGINT       NOT NULL,
    signal_type  VARCHAR(64)  NOT NULL,   -- e.g., high_return_rate, wardrobing, fake_damage
    score_delta  TINYINT      NOT NULL,
    created_at   DATETIME     NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_customer (customer_id)
);

Return Authorization Workflow

  1. Eligibility check: verify the order is in a returnable state (fulfilled or partially_fulfilled), the return window has not expired (configurable per product category), and the requested quantity does not exceed the shipped quantity minus any prior returns for the same item.
  2. Fraud scoring: compute a fraud_score (0-100) based on signals: customer lifetime return rate, days since delivery, number of open returns, item category risk, and whether a physical label is actually used. Score >= 80 triggers manual review queue; the return is held at requested status.
  3. Authorization: if score < 80 and policy checks pass, set status to authorized, generate an RMA number, and issue a pre-paid return shipping label via the carrier API. Write all of this in one DB transaction.

Restocking Logic

When the warehouse receives the physical return it performs an inspection and records condition_code on each return_item. The restocking decision table:

condition_code = new        --> restock_qty = qty_returned  (full restock)
condition_code = like_new   --> restock_qty = qty_returned  (restock, may discount)
condition_code = damaged     --> restock_qty = 0            (quarantine, file vendor claim)
condition_code = unsellable  --> restock_qty = 0            (dispose or liquidate)

For each row where restock_qty > 0, the service calls the IMS receiveStock endpoint to increment on_hand. This is done item by item so partial failures do not block the entire return.

Refund Calculation

gross_item_amount  = SUM(order_item.unit_price * return_item.qty_returned)
restocking_fee     = gross_item_amount * policy.restocking_fee_pct   -- e.g. 0.15 if item was opened
shipping_refund    = original_order.shipping_cost  IF all items returned ELSE 0
net_refund_amount  = gross_item_amount - restocking_fee + shipping_refund

Partial refunds arise when only some items in an order are returned, when a restocking fee applies, or when the return is a goodwill credit (e.g., item arrived damaged: full refund including original shipping).

Partial Refund Flow

  1. Compute net_refund_amount.
  2. Write a refunds row with status pending and the computed amounts (single transaction also inserting ledger credit entry).
  3. Call payment gateway refund API with the original payment_gateway_ref from the order and the partial amount. Most gateways support partial refunds natively; the OMS stored the charge ID at payment capture time.
  4. On gateway success: update refunds.status to completed, store the gateway transaction reference, insert a ledger debit entry to balance the credit, and transition the return to refunded.
  5. On gateway failure: mark refunds.status as failed, enqueue a retry job with exponential backoff, and alert the ops team after N consecutive failures.

Fraud Detection

Fraud signals are evaluated at authorization time and again at inspection time:

  • High return rate: customer's return count / order count over 90 days > threshold (e.g., 40%).
  • Wardrobing: high-value item returned just before the window closes, condition is "like new", pattern matches previous returns of the same SKU by this customer.
  • Label not scanned: RMA issued but carrier never records a scan within N days — likely a claim without actual shipment.
  • Damage inconsistency: customer claims damage but inspection finds item in new condition — potential false claim.

Each detected signal inserts a row into return_fraud_signals. Cumulative score across recent returns feeds into the next authorization decision. Customers above a lifetime threshold are flagged for manual review on all future returns.

Key Design Decisions and Trade-offs

  • Separate refund row from return request row: a return can exist without a refund (e.g., exchange-only returns, rejected returns). Keeping the financial record in a dedicated table with its own ledger makes accounting queries straightforward and avoids NULLs in the return table.
  • Append-only ledger: refund_ledger is never updated. Every financial event (credit for refund amount, debit after gateway confirmation, debit for restocking fee) is a new row. Running SUM over the ledger always reconciles to the current state — no need to trust a mutable balance column.
  • Fraud score at authorization, not only at receipt: early scoring allows the service to deny fraudulent requests before issuing a label and before the customer has any expectation of a refund. Later re-scoring at inspection catches fraud that can only be detected physically.
  • Store credit vs. original payment method: store credit bypasses the payment gateway entirely — it is a ledger entry in the customer wallet service. This is faster and has no gateway failure risk. Offering store credit at a slight premium (e.g., 105% of net refund) reduces gateway fees and retains revenue.

Failure Handling and Edge Cases

  • Gateway partial refund rejection: some gateways reject partial refunds if the charge is older than a threshold (e.g., 180 days). The service falls back to store credit automatically and notifies the customer.
  • Return received but no matching RMA: the WMS may receive an unlabeled or misrouted package. A manual review queue handles these; if resolved, the return is retroactively linked and the normal flow resumes.
  • Double refund prevention: the UNIQUE constraint on refunds.return_id prevents inserting two refund records for the same return. The gateway call is wrapped in an idempotency key so retrying after a timeout does not issue a second charge reversal.
  • Inspection updates after refund issued: if the inspection finds the item unsellable after the refund was already processed, no financial reversal is issued (the customer already received credit); instead, a vendor claim or write-off entry is appended to the ledger as a debit against cost of goods.
  • Return window expiry race: the authorization check and the window expiry check must be atomic. A row-level lock on the return request (SELECT FOR UPDATE) prevents a concurrent request from slipping through after another request already consumed the last valid authorization slot.

Scalability Considerations

  • Async inspection processing: warehouse inspection events arrive as webhooks; they are written to a queue and processed asynchronously so the WMS HTTP call returns immediately. The return status transitions happen in a consumer process.
  • Ledger partitioning: partition refund_ledger by month. Accounting reconciliation queries always target a bounded date range so partition pruning keeps them fast even as the table grows to hundreds of millions of rows.
  • Fraud model refresh: fraud scoring uses pre-computed customer-level aggregates (return rate, lifetime score) stored in a materialized table refreshed hourly by a batch job. Real-time scoring reads the materialized table rather than aggregating raw event data on every authorization request.
  • Return volume spikes: post-holiday return spikes can be 10x normal volume. The authorization path is stateless and horizontally scalable; the bottleneck is the payment gateway. Implement a token bucket rate limiter per gateway credential and queue excess refund processing jobs for off-peak execution.

Summary

A returns and refunds service is a correctness-critical financial system layered on top of reverse logistics. The data model separates commercial returns (return_requests, return_items) from financial outcomes (refunds, refund_ledger) to keep each concern independently queryable and auditable. The authorization workflow gates on eligibility, fraud score, and policy rules before any physical or financial commitment is made. Restocking feeds directly into inventory via the IMS. Refund calculation handles partial returns, restocking fees, and shipping credits via a deterministic formula. An append-only ledger ensures that every dollar in and out is traceable, and idempotency keys at the gateway boundary prevent double refunds under retry conditions.

{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “What is a returns and refunds service and what are its main components?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “A returns and refunds service manages the reverse logistics flow when a customer wants to return a purchased item. Its main components are: a return eligibility engine that evaluates whether an item qualifies for return based on policy rules (return window, item condition, category restrictions); a return merchandise authorization (RMA) generator that issues a unique return label and authorization code; a receiving and inspection module used by warehouse staff to grade returned items; a refund calculation engine that determines the refund amount based on condition, restocking fees, and original payment method; a fraud detection layer that flags suspicious return patterns; and an inventory disposition engine that routes returned goods to restock, refurbishment, liquidation, or disposal. All state changes are persisted as events to support auditability and customer-facing status tracking.”
}
},
{
“@type”: “Question”,
“name”: “How is a return authorization generated and validated?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Return authorization (RMA) generation begins with the customer submitting a return request referencing an order line item and a return reason. The eligibility engine validates: (1) the item is within the return window (e.g. 30 days from delivery); (2) the order is in a returnable state (DELIVERED, not already returned); (3) the item category is not excluded from returns (e.g. perishables, digital goods); and (4) the customer’s account is not flagged for return abuse. If eligible, the service generates an RMA record with a unique authorization code, a pre-paid return shipping label (via carrier API), and an expiry timestamp. The RMA is stored in PENDING state. When the carrier scans the return parcel, a webhook updates the RMA to IN_TRANSIT. On warehouse receipt the RMA advances to RECEIVED and inspection begins. The authorization code is validated at each transition to prevent unauthorized returns.”
}
},
{
“@type”: “Question”,
“name”: “How are refund amounts calculated including partial refunds and restocking fees?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Refund calculation starts from the original item price paid (after promotions and coupons) and applies a series of deductions. If the item is returned in original condition within the standard window, the customer receives a full refund of the item price; original shipping charges are typically non-refundable unless the return is due to a seller error. Partial refunds apply when: the item is returned in a damaged or used condition (a damage deduction, e.g. 20-50% of item price, is applied based on inspection grade); a restocking fee is defined in policy for certain categories (e.g. electronics); or only a subset of a bundle is returned. The refund engine stores a structured deduction ledger so the customer can see exactly how the refund amount was derived. Refunds are issued back to the original payment instrument where possible; if the original method is unavailable (e.g. expired card) the service falls back to store credit.”
}
},
{
“@type”: “Question”,
“name”: “How does a returns service detect and prevent return fraud?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “Return fraud detection is applied at multiple stages. At RMA request time, a rule engine and ML model score the request using signals such as: return rate for the customer account over a rolling window, ratio of returns with reason ‘item not received’ or ‘item not as described’, velocity of high-value returns, device and IP fingerprint, and whether the account is newly created. High-risk requests are routed to manual review or require photo evidence before an RMA is issued. At inspection time, the warehouse flags discrepancies such as wrong item returned, empty box, or item serial number mismatch against the original order. Repeated fraud patterns trigger account-level policies: requiring return-before-refund (instead of refund-on-shipment), limiting return eligibility, or account suspension. All fraud signals are fed back into the model as labeled training data to improve future detection accuracy.”
}
}
]
}

See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering

See also: Shopify Interview Guide

See also: Airbnb Interview Guide 2026: Search Systems, Trust and Safety, and Full-Stack Engineering

Scroll to Top