Refund Service Low-Level Design: Partial Refunds, Ledger Entries, and Gateway Callbacks

The refund service is responsible for calculating how much to return to a buyer, executing the reversal against the original payment gateway, maintaining a double-entry ledger, and confirming completion via gateway callbacks. Partial refunds, multi-payment orders, and idempotent retry semantics are the key design challenges interviewers focus on.

Requirements

Functional

  • Calculate the refundable amount for full and partial returns, accounting for discounts, shipping credits, and previous partial refunds.
  • Submit refund requests to the payment gateway in an idempotent manner.
  • Post balanced debit and credit entries to an internal financial ledger for each refund.
  • Confirm refund completion via gateway webhooks and update order and return records.
  • Support multiple refunds against a single order up to the original captured amount.

Non-Functional

  • Zero double-refunds: idempotency enforced at both the service and gateway layers.
  • Ledger entries must balance: every debit has a corresponding credit.
  • Refund initiation completes in under 1 second; asynchronous confirmation is expected within 3 business days, varying by payment method.

Data Model

  • refund: id (UUID), order_id, rma_id (nullable), idempotency_key (UNIQUE), amount, currency, reason_code, status (PENDING | SUBMITTED | CONFIRMED | FAILED), gateway_refund_id, initiated_at, confirmed_at.
  • refund_line: id, refund_id, order_line_id, quantity, line_refund_amount, shipping_refund_amount, tax_refund_amount.
  • ledger_entry: id (sequence), refund_id, entry_type (DEBIT | CREDIT), account_code, amount, currency, posted_at, idempotency_key (UNIQUE).
  • gateway_callback: id, refund_id, gateway_event_id (UNIQUE), event_type, payload (JSONB), received_at, processed_at.
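The refund record and its status lifecycle can be sketched in Python as below. This is a minimal illustration, not a schema definition: the transition table is an assumption (the list above names the statuses but not the allowed edges), and the `advance` method is a hypothetical helper.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from decimal import Decimal
from enum import Enum
from typing import Optional
from uuid import UUID, uuid4


class RefundStatus(Enum):
    PENDING = "PENDING"
    SUBMITTED = "SUBMITTED"
    CONFIRMED = "CONFIRMED"
    FAILED = "FAILED"


# Assumed edges: the data model lists the statuses but not the transitions.
ALLOWED_TRANSITIONS = {
    RefundStatus.PENDING: {RefundStatus.SUBMITTED, RefundStatus.FAILED},
    RefundStatus.SUBMITTED: {RefundStatus.CONFIRMED, RefundStatus.FAILED},
    RefundStatus.CONFIRMED: set(),  # terminal
    RefundStatus.FAILED: set(),     # terminal
}


@dataclass
class Refund:
    order_id: str
    idempotency_key: str  # UNIQUE in the real table
    amount: Decimal
    currency: str
    reason_code: str
    rma_id: Optional[str] = None
    status: RefundStatus = RefundStatus.PENDING
    gateway_refund_id: Optional[str] = None
    id: UUID = field(default_factory=uuid4)
    initiated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    confirmed_at: Optional[datetime] = None

    def advance(self, new_status: RefundStatus) -> None:
        """Move to new_status, rejecting edges not in the assumed table."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"illegal transition {self.status.value} -> {new_status.value}"
            )
        self.status = new_status
        if new_status is RefundStatus.CONFIRMED:
            self.confirmed_at = datetime.now(timezone.utc)
```

Making the terminal states explicit keeps a late or replayed event from moving a CONFIRMED refund backwards.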

Core Algorithms and Flows

Partial Refund Calculation

The service computes the maximum refundable amount as the original captured amount minus the sum of all previous refunds for the same order in CONFIRMED or SUBMITTED status. (If concurrent requests are possible, PENDING refunds awaiting retry should be counted too, since they may still reach the gateway.) Line-level refunds are calculated proportionally: line_refund = (quantity_returned / quantity_ordered) * line_total. Shipping is refunded in full on the first return and not at all on subsequent returns. Tax is calculated by applying the original effective tax rate to the refunded subtotal. Any request for more than the remaining refundable amount is rejected with an HTTP 422 error before the gateway is contacted.
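The calculation can be sketched as follows. The function names, the tuple shapes, the half-up rounding to cents, and the rule that "first return" means "no prior counted refund" are all assumptions made for illustration; the set of counted statuses is a parameter so that PENDING can be included if desired.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
# Statuses that consume refundable balance, as stated in the text.
COUNTED = frozenset({"SUBMITTED", "CONFIRMED"})


class RefundExceedsRemaining(Exception):
    """Would map to the HTTP 422 response (hypothetical exception name)."""


def remaining_refundable(captured, prior_refunds, counted=COUNTED):
    """captured minus the sum of prior refunds whose status is counted.

    prior_refunds is an iterable of (status, amount) pairs.
    """
    used = sum((amt for status, amt in prior_refunds if status in counted),
               Decimal("0"))
    return captured - used


def line_refund(quantity_returned, quantity_ordered, line_total):
    """Proportional line refund, rounded half-up to cents (assumed rounding)."""
    if not 0 < quantity_returned <= quantity_ordered:
        raise ValueError("quantity_returned out of range")
    share = Decimal(quantity_returned) / Decimal(quantity_ordered)
    return (share * line_total).quantize(CENT, rounding=ROUND_HALF_UP)


def quote_refund(captured, prior_refunds, lines, shipping_paid, tax_rate):
    """lines is a list of (quantity_returned, quantity_ordered, line_total)."""
    subtotal = sum((line_refund(*line) for line in lines), Decimal("0"))
    # Assumption: "first return" means no counted refund exists yet.
    is_first = not any(status in COUNTED for status, _ in prior_refunds)
    shipping = shipping_paid if is_first else Decimal("0")
    tax = (subtotal * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    total = subtotal + shipping + tax
    if total > remaining_refundable(captured, prior_refunds):
        raise RefundExceedsRemaining(f"requested {total}")
    return {"subtotal": subtotal, "shipping": shipping,
            "tax": tax, "total": total}
```

Using `Decimal` rather than floats avoids binary rounding drift; a production service might instead store integer minor units.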

Idempotent Gateway Submission

The refund row is inserted with a PENDING status before the gateway is called, and refund.idempotency_key is sent as the gateway-level idempotency header. If the gateway call succeeds, the status advances to SUBMITTED and the gateway_refund_id is stored. If the call times out, the outcome is unknown, so the status remains PENDING; a background retry job picks up PENDING refunds older than 30 seconds and resubmits them with the same idempotency key, which the gateway deduplicates on its side.
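A minimal in-memory sketch of this flow is shown below. `FakeGateway`, `RefundRow`, `submit`, and `retry_stale` are hypothetical names, and the fake gateway only imitates the deduplication behaviour a real gateway is assumed to provide; it is not any vendor's client API.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional

RETRY_AFTER_SECONDS = 30  # threshold from the text


class GatewayTimeout(Exception):
    """Raised when the gateway response is lost (hypothetical)."""


class FakeGateway:
    """Stand-in gateway that deduplicates on the idempotency key."""

    def __init__(self) -> None:
        self._by_key: Dict[str, str] = {}
        self.fail_next = False

    def create_refund(self, amount_cents: int, idempotency_key: str) -> str:
        # The refund is recorded even if the response never arrives.
        if idempotency_key not in self._by_key:
            self._by_key[idempotency_key] = f"gw_re_{len(self._by_key) + 1}"
        if self.fail_next:
            self.fail_next = False
            raise GatewayTimeout()
        return self._by_key[idempotency_key]


@dataclass
class RefundRow:
    idempotency_key: str
    amount_cents: int
    status: str = "PENDING"  # inserted as PENDING before the gateway call
    gateway_refund_id: Optional[str] = None
    initiated_at: float = field(default_factory=time.time)


def submit(refund: RefundRow, gateway: FakeGateway) -> None:
    """Call the gateway; on timeout leave the row PENDING for the retry job."""
    try:
        gw_id = gateway.create_refund(refund.amount_cents,
                                      refund.idempotency_key)
    except GatewayTimeout:
        return
    refund.gateway_refund_id = gw_id
    refund.status = "SUBMITTED"


def retry_stale(refunds: Iterable[RefundRow], gateway: FakeGateway,
                now: Optional[float] = None) -> None:
    """Resubmit PENDING refunds older than the threshold, same key."""
    now = time.time() if now is None else now
    for refund in refunds:
        age = now - refund.initiated_at
        if refund.status == "PENDING" and age > RETRY_AFTER_SECONDS:
            submit(refund, gateway)
```

Because the retry reuses the stored key, a first attempt that reached the gateway but lost its response resolves to the same gateway refund rather than a second one.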

Ledger Entry Posting

On successful gateway submission the service posts two ledger entries inside a single transaction: a DEBIT to the merchant liability account and a CREDIT to the buyer receivable account. Each entry carries a unique idempotency_key derived from refund_id and entry_type, making the posting safe to replay if the transaction is rolled back after the gateway call. A ledger reconciliation job verifies that total debits equal total credits per currency per day.
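The posting and the reconciliation check can be sketched with SQLite from the Python standard library. The table layout follows the ledger_entry model above, but the account code strings, the `refund_id:entry_type` key format, integer cents, and the use of `INSERT OR IGNORE` for replay safety are assumptions chosen for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE ledger_entry (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    refund_id TEXT NOT NULL,
    entry_type TEXT NOT NULL CHECK (entry_type IN ('DEBIT', 'CREDIT')),
    account_code TEXT NOT NULL,
    amount_cents INTEGER NOT NULL,
    currency TEXT NOT NULL,
    posted_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    idempotency_key TEXT NOT NULL UNIQUE
);
"""


def post_refund_entries(conn, refund_id, amount_cents, currency):
    """Post the DEBIT/CREDIT pair in one transaction; replays are no-ops."""
    rows = [
        (refund_id, "DEBIT", "merchant_liability", amount_cents, currency,
         f"{refund_id}:DEBIT"),
        (refund_id, "CREDIT", "buyer_receivable", amount_cents, currency,
         f"{refund_id}:CREDIT"),
    ]
    with conn:  # commits on success, rolls back on exception
        conn.executemany(
            "INSERT OR IGNORE INTO ledger_entry "
            "(refund_id, entry_type, account_code, amount_cents, currency, "
            " idempotency_key) VALUES (?, ?, ?, ?, ?, ?)",
            rows,
        )


def unbalanced_days(conn):
    """Return (currency, day, net_cents) rows where debits != credits."""
    query = """
        SELECT currency, date(posted_at) AS day,
               SUM(CASE WHEN entry_type = 'DEBIT'
                        THEN amount_cents ELSE -amount_cents END) AS net
        FROM ledger_entry
        GROUP BY currency, day
        HAVING net != 0
    """
    return conn.execute(query).fetchall()
```

Usage: create the table once with `conn.executescript(SCHEMA)`, then call `post_refund_entries` after each successful submission. Calling it twice for the same refund leaves exactly two rows, and `unbalanced_days` returns an empty list when the ledger balances.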

Webhook Confirmation

The payment gateway sends a refund.updated webhook when the bank confirms settlement. The handler validates the webhook signature, then checks gateway_callback.gateway_event_id to detect duplicate deliveries. If the event has not been seen before, the handler advances the refund status to CONFIRMED and publishes a REFUND_CONFIRMED event. Downstream services (order service, buyer notification) consume this event to update their own records.
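A sketch of the handler logic follows. The payload fields (`id`, `type`, `refund_id`), the hex-encoded HMAC-SHA256 signature scheme, and the in-memory set standing in for the UNIQUE constraint on gateway_event_id are assumptions; a real gateway documents its own payload and signing format.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Constant-time check of an assumed hex HMAC-SHA256 over the raw body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_hex)


class WebhookHandler:
    def __init__(self, secret: bytes, refunds: dict) -> None:
        self.secret = secret
        self.refunds = refunds        # refund_id -> status (table stand-in)
        self.seen_event_ids = set()   # stand-in for UNIQUE gateway_event_id
        self.published = []           # stand-in for the event bus

    def handle(self, body: bytes, signature_hex: str) -> int:
        """Return the HTTP status code the endpoint would respond with."""
        if not verify_signature(self.secret, body, signature_hex):
            return 401
        event = json.loads(body)
        if event["id"] in self.seen_event_ids:
            return 200  # duplicate delivery: acknowledge, do nothing
        # In a real service the callback insert and the refund update
        # would share one database transaction.
        self.seen_event_ids.add(event["id"])
        refund_id = event["refund_id"]
        if (event["type"] == "refund.updated"
                and self.refunds.get(refund_id) == "SUBMITTED"):
            self.refunds[refund_id] = "CONFIRMED"
            self.published.append(("REFUND_CONFIRMED", refund_id))
        return 200
```

Returning 200 for duplicates matters because gateways typically retry deliveries that are not acknowledged.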

API Design

  • POST /v1/refunds — initiates a refund; body includes order_id, line quantities, and reason; idempotency_key required in header.
  • GET /v1/refunds/{id} — returns refund status, line breakdown, and gateway reference.
  • GET /v1/orders/{order_id}/refunds — lists all refunds for an order with amounts and statuses.
  • POST /v1/webhooks/gateway-refund — receives gateway confirmation events; validates HMAC signature.

Scalability Considerations

  • Refund volume is typically 5-15% of order volume, so the gateway's API rate limits, rather than service capacity, are usually the throughput ceiling; horizontal scaling beyond that point adds little.
  • Retry jobs take a distributed lock (Redis) keyed on refund_id so that concurrent workers on different pods do not submit the same refund in parallel.
  • Ledger entries are written to a dedicated append-only table; reads are served by a read replica to isolate financial reporting queries from the write path.
  • Webhook handlers are stateless and idempotent; they can be scaled and redeployed without risk of duplicate processing because gateway_event_id uniqueness is enforced at the database level.
  • Currency conversion for multi-currency refunds is applied at the time of refund initiation using the exchange rate recorded at the time of the original charge, stored in the payment_intent record.
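The per-refund retry lock mentioned above can be illustrated without a Redis server. `InMemoryLockStore` is a hypothetical stand-in that imitates the semantics of Redis `SET key value NX EX ttl` (acquire only if absent, expire after a TTL) with token-checked release; the key format and the 60-second TTL are assumptions.

```python
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class InMemoryLockStore:
    """Single-process stand-in for a Redis-style lock with TTL."""

    def __init__(self) -> None:
        self._locks: Dict[str, Tuple[str, float]] = {}  # key -> (token, expiry)

    def acquire(self, key: str, ttl_seconds: float,
                now: Optional[float] = None) -> Optional[str]:
        """Return a token if the lock was free or expired, else None."""
        now = time.monotonic() if now is None else now
        held = self._locks.get(key)
        if held is not None and held[1] > now:
            return None
        token = uuid.uuid4().hex
        self._locks[key] = (token, now + ttl_seconds)
        return token

    def release(self, key: str, token: str) -> bool:
        """Release only if the caller still owns the lock."""
        held = self._locks.get(key)
        if held is not None and held[0] == token:
            del self._locks[key]
            return True
        return False


def retry_with_lock(store: InMemoryLockStore, refund_id: str,
                    resubmit: Callable[[str], None],
                    ttl_seconds: float = 60) -> bool:
    """Run resubmit(refund_id) only if this worker wins the lock."""
    key = f"refund-retry:{refund_id}"
    token = store.acquire(key, ttl_seconds)
    if token is None:
        return False  # another worker is already retrying this refund
    try:
        resubmit(refund_id)
        return True
    finally:
        store.release(key, token)
```

The lock only reduces redundant gateway calls; correctness still rests on the gateway deduplicating by idempotency key, since a lock can expire while its holder is still working.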
