Refund Service Low-Level Design
The refund service is responsible for calculating how much to return to a buyer, executing the reversal against the original payment gateway, maintaining a double-entry ledger, and confirming completion via gateway callbacks. Partial refunds, multi-payment orders, and idempotent retry semantics are the key design challenges interviewers focus on.
Requirements
Functional
- Calculate the refundable amount for full and partial returns, accounting for discounts, shipping credits, and previous partial refunds.
- Submit refund requests to the payment gateway in an idempotent manner.
- Post balanced debit and credit entries to an internal financial ledger for each refund.
- Confirm refund completion via gateway webhooks and update order and return records.
- Support multiple refunds against a single order up to the original captured amount.
Non-Functional
- Zero double-refunds: idempotency enforced at both the service and gateway layers.
- Ledger entries must balance: every debit has a corresponding credit.
- Refund initiation completes in under 1 second; asynchronous confirmation typically arrives within 3 business days, depending on payment method.
Data Model
- refund: id (UUID), order_id, rma_id (nullable), idempotency_key (UNIQUE), amount, currency, reason_code, status (PENDING | SUBMITTED | CONFIRMED | FAILED), gateway_refund_id, initiated_at, confirmed_at.
- refund_line: id, refund_id, order_line_id, quantity, line_refund_amount, shipping_refund_amount, tax_refund_amount.
- ledger_entry: id (sequence), refund_id, entry_type (DEBIT | CREDIT), account_code, amount, currency, posted_at, idempotency_key (UNIQUE).
- gateway_callback: id, refund_id, gateway_event_id (UNIQUE), event_type, payload (JSONB), received_at, processed_at.
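The two UNIQUE columns above carry most of the safety guarantees, so it can help to see them as constraints rather than as field names. The following is a minimal sketch using Python's built-in sqlite3 module, not a production schema. Storing the amount as integer minor units (amount_minor), keeping the JSONB payload as TEXT, and the helper names open_db and insert_refund are all assumptions made for illustration.

```python
import sqlite3

# Trimmed versions of the refund and gateway_callback tables.
# Assumption: money is stored as integer minor units (e.g. cents).
SCHEMA = """
CREATE TABLE refund (
    id                TEXT PRIMARY KEY,              -- UUID
    order_id          TEXT NOT NULL,
    rma_id            TEXT,                          -- nullable
    idempotency_key   TEXT NOT NULL UNIQUE,
    amount_minor      INTEGER NOT NULL CHECK (amount_minor > 0),
    currency          TEXT NOT NULL,
    reason_code       TEXT,
    status            TEXT NOT NULL DEFAULT 'PENDING'
        CHECK (status IN ('PENDING', 'SUBMITTED', 'CONFIRMED', 'FAILED')),
    gateway_refund_id TEXT,
    initiated_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    confirmed_at      TEXT
);
CREATE TABLE gateway_callback (
    id               INTEGER PRIMARY KEY,
    refund_id        TEXT NOT NULL REFERENCES refund(id),
    gateway_event_id TEXT NOT NULL UNIQUE,
    event_type       TEXT NOT NULL,
    payload          TEXT NOT NULL,                  -- JSONB in the original model
    received_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    processed_at     TEXT
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def insert_refund(conn, refund_id, order_id, key, amount_minor, currency) -> bool:
    """Insert a PENDING refund; return False if the idempotency key already exists."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO refund (id, order_id, idempotency_key, amount_minor, currency) "
            "VALUES (?, ?, ?, ?, ?) "
            "ON CONFLICT(idempotency_key) DO NOTHING",
            (refund_id, order_id, key, amount_minor, currency),
        )
    return cur.rowcount == 1
```

A second insert that reuses the same idempotency key returns False and leaves the table unchanged, which is the service-level half of the zero double-refund requirement.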
Core Algorithms and Flows
Partial Refund Calculation
The service computes the maximum refundable amount as: original captured amount minus the sum of all previously CONFIRMED or SUBMITTED refunds for the same order. Line-level refunds are calculated proportionally: line_refund = (quantity_returned / quantity_ordered) * line_total. Shipping is refunded in full on the first return and zero on subsequent returns. Tax is calculated by applying the original effective tax rate to the refunded subtotal. Any attempt to request more than the remaining refundable amount is rejected with a 422 error before touching the gateway.
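The calculation described above can be sketched in a few lines. This is an illustrative sketch only: the names OrderLine, quote_refund, and RefundExceedsRemaining are hypothetical, rounding to two decimal places with half-up is an assumption, and "refunded subtotal" is taken to mean the line refund excluding shipping.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


class RefundExceedsRemaining(Exception):
    """Would map to the 422 response; raised before any gateway call."""


@dataclass(frozen=True)
class OrderLine:
    quantity_ordered: int
    line_total: Decimal  # total charged for the line, before tax
    tax_rate: Decimal    # original effective tax rate, e.g. Decimal("0.08")


def _round(value: Decimal) -> Decimal:
    return value.quantize(CENT, rounding=ROUND_HALF_UP)


def remaining_refundable(captured: Decimal, prior_refunds) -> Decimal:
    """captured minus all SUBMITTED or CONFIRMED refunds; other statuses are ignored."""
    used = sum(
        (amount for status, amount in prior_refunds
         if status in ("SUBMITTED", "CONFIRMED")),
        Decimal("0"),
    )
    return captured - used


def quote_refund(line, quantity_returned, shipping_paid, is_first_return,
                 captured, prior_refunds) -> dict:
    if not 0 < quantity_returned <= line.quantity_ordered:
        raise ValueError("quantity_returned out of range")
    # line_refund = (quantity_returned / quantity_ordered) * line_total
    line_refund = _round(
        Decimal(quantity_returned) / line.quantity_ordered * line.line_total
    )
    # Tax: original effective rate applied to the refunded subtotal.
    tax_refund = _round(line_refund * line.tax_rate)
    # Shipping: refunded in full on the first return, zero afterwards.
    shipping_refund = shipping_paid if is_first_return else Decimal("0.00")
    total = line_refund + tax_refund + shipping_refund
    if total > remaining_refundable(captured, prior_refunds):
        raise RefundExceedsRemaining(f"requested {total} exceeds remaining amount")
    return {"line": line_refund, "tax": tax_refund,
            "shipping": shipping_refund, "total": total}
```

For example, with a 4-unit line totaling 100.00 at an 8% rate and 5.00 shipping, returning one unit on the first return quotes 25.00 + 2.00 + 5.00 = 32.00. A later return of one more unit quotes 25.00 + 2.00 + 0.00 = 27.00.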
Idempotent Gateway Submission
The refund row is inserted with a PENDING status before calling the gateway, using the refund.idempotency_key as the gateway-level idempotency header. If the gateway call succeeds the status advances to SUBMITTED and the gateway_refund_id is stored. If the call times out the status remains PENDING; a background retry job picks up PENDING refunds older than 30 seconds and resubmits using the same idempotency key, which the gateway deduplicates on its side.
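The timeout case is the one worth tracing, because the gateway may have recorded the refund even though the service never saw a response. The sketch below simulates that with an in-memory stand-in. FakeGateway, submit, and retry_pending are hypothetical names, and the gateway behavior shown (returning the original refund ID for a repeated key) is the deduplication the text assumes rather than any specific vendor's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Refund:
    idempotency_key: str
    amount_minor: int
    initiated_at: float  # seconds; a simple clock value for the sketch
    status: str = "PENDING"
    gateway_refund_id: Optional[str] = None


class FakeGateway:
    """Stand-in gateway that deduplicates on the idempotency key."""

    def __init__(self, timeouts: int = 0):
        self.by_key = {}          # idempotency_key -> gateway refund id
        self.timeouts = timeouts  # how many upcoming calls lose their response

    def create_refund(self, idempotency_key: str, amount_minor: int) -> str:
        # The refund is recorded first, so a "timeout" models a lost response.
        if idempotency_key not in self.by_key:
            self.by_key[idempotency_key] = f"re_{len(self.by_key) + 1}"
        if self.timeouts > 0:
            self.timeouts -= 1
            raise TimeoutError("no response from gateway")
        return self.by_key[idempotency_key]


def submit(refund: Refund, gateway: FakeGateway) -> Refund:
    """Call the gateway; advance to SUBMITTED only on a definite success."""
    try:
        gateway_id = gateway.create_refund(refund.idempotency_key, refund.amount_minor)
    except TimeoutError:
        return refund  # outcome unknown: stay PENDING for the retry job
    refund.status = "SUBMITTED"
    refund.gateway_refund_id = gateway_id
    return refund


def retry_pending(refunds, gateway, now: float, min_age_seconds: float = 30) -> None:
    """Resubmit PENDING refunds older than min_age_seconds with the same key."""
    for refund in refunds:
        if refund.status == "PENDING" and now - refund.initiated_at > min_age_seconds:
            submit(refund, gateway)
```

Calling submit against FakeGateway(timeouts=1) leaves the refund PENDING; a later retry_pending pass past the 30-second threshold moves it to SUBMITTED, and the gateway still holds exactly one refund for that key.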
Ledger Entry Posting
On successful gateway submission the service posts two ledger entries inside a single transaction: a DEBIT to the merchant liability account and a CREDIT to the buyer receivable account. Each entry carries a unique idempotency_key derived from refund_id and entry_type, making the posting safe to replay if the transaction is rolled back after the gateway call. A ledger reconciliation job verifies that total debits equal total credits per currency per day.
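A replay-safe posting and the reconciliation check might look like the following sketch, again using sqlite3 as a stand-in for the real ledger store. The key format "refund_id:entry_type", the account codes merchant_liability and buyer_receivable, integer minor units, and the function names are assumptions for illustration.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE ledger_entry (
            id              INTEGER PRIMARY KEY AUTOINCREMENT,
            refund_id       TEXT NOT NULL,
            entry_type      TEXT NOT NULL CHECK (entry_type IN ('DEBIT', 'CREDIT')),
            account_code    TEXT NOT NULL,
            amount_minor    INTEGER NOT NULL,
            currency        TEXT NOT NULL,
            posted_at       TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
            idempotency_key TEXT NOT NULL UNIQUE
        )""")
    return conn


def post_refund_entries(conn, refund_id: str, amount_minor: int, currency: str) -> None:
    """Post the DEBIT/CREDIT pair in one transaction; replays are no-ops."""
    rows = [
        (refund_id, "DEBIT", "merchant_liability", amount_minor, currency,
         f"{refund_id}:DEBIT"),
        (refund_id, "CREDIT", "buyer_receivable", amount_minor, currency,
         f"{refund_id}:CREDIT"),
    ]
    with conn:  # both rows commit together or neither does
        conn.executemany(
            "INSERT INTO ledger_entry "
            "(refund_id, entry_type, account_code, amount_minor, currency, idempotency_key) "
            "VALUES (?, ?, ?, ?, ?, ?) "
            "ON CONFLICT(idempotency_key) DO NOTHING",
            rows,
        )


def unbalanced_days(conn):
    """Return (currency, day, debits, credits) rows where the totals differ."""
    return conn.execute("""
        SELECT currency, day, debits, credits FROM (
            SELECT currency,
                   date(posted_at) AS day,
                   SUM(CASE WHEN entry_type = 'DEBIT'  THEN amount_minor ELSE 0 END) AS debits,
                   SUM(CASE WHEN entry_type = 'CREDIT' THEN amount_minor ELSE 0 END) AS credits
            FROM ledger_entry
            GROUP BY currency, date(posted_at)
        )
        WHERE debits <> credits
    """).fetchall()
```

Calling post_refund_entries twice for the same refund leaves exactly two rows, and unbalanced_days returns an empty list as long as every posting went through this path.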
Webhook Confirmation
The payment gateway sends a refund.updated webhook when the bank confirms settlement. The handler validates the webhook signature, checks gateway_callback.gateway_event_id for duplicates (idempotency), and if novel advances the refund status to CONFIRMED and publishes a REFUND_CONFIRMED event. Downstream services (order service, buyer notification) consume this event to update their own records.
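The handler's three steps (verify, deduplicate, advance) can be sketched as below. The payload fields id and refund_id, the hex-encoded HMAC-SHA256 signature scheme, and the publish callback are assumptions; a real gateway documents its own payload shape and signing format.

```python
import hashlib
import hmac
import json
import sqlite3


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed signing scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE refund (id TEXT PRIMARY KEY, status TEXT NOT NULL);
        CREATE TABLE gateway_callback (
            gateway_event_id TEXT PRIMARY KEY,   -- uniqueness enforces dedup
            refund_id        TEXT NOT NULL,
            payload          TEXT NOT NULL
        );
    """)
    return conn


def handle_webhook(conn, secret: bytes, body: bytes, signature: str, publish) -> int:
    """Return an HTTP-style status code for a refund.updated delivery."""
    expected = sign(secret, body)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return 401
    event = json.loads(body)
    try:
        with conn:  # record the event and advance the refund atomically
            conn.execute(
                "INSERT INTO gateway_callback (gateway_event_id, refund_id, payload) "
                "VALUES (?, ?, ?)",
                (event["id"], event["refund_id"], body.decode()),
            )
            conn.execute(
                "UPDATE refund SET status = 'CONFIRMED' "
                "WHERE id = ? AND status IN ('PENDING', 'SUBMITTED')",
                (event["refund_id"],),
            )
    except sqlite3.IntegrityError:
        return 200  # duplicate delivery: acknowledge without reprocessing
    publish({"type": "REFUND_CONFIRMED", "refund_id": event["refund_id"]})
    return 200
```

Delivering the same signed body twice returns 200 both times but publishes only one REFUND_CONFIRMED event. Note that this sketch publishes after the commit, so a crash between the two steps would drop the event; how to close that gap is outside what the design above specifies.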
API Design
- POST /v1/refunds: initiates a refund; the body includes order_id, line quantities, and reason; idempotency_key is required in a header.
- GET /v1/refunds/{id}: returns refund status, line breakdown, and gateway reference.
- GET /v1/orders/{order_id}/refunds: lists all refunds for an order with amounts and statuses.
- POST /v1/webhooks/gateway-refund: receives gateway confirmation events; validates the HMAC signature.
Scalability Considerations
- Refund volume is typically 5-15% of order volume, so throughput is usually bounded by the gateway's API rate limits rather than by the service itself; horizontal scaling beyond that limit is not required.
- Retry jobs take a distributed lock (Redis) on refund_id so that pods in the same cluster cannot retry the same refund in parallel and submit duplicate gateway calls.
- Ledger entries are written to a dedicated append-only table; reads are served by a read replica to isolate financial reporting queries from the write path.
- Webhook handlers are stateless and idempotent; they can be scaled and redeployed without risk of duplicate processing because gateway_event_id uniqueness is enforced at the database level.
- Currency conversion for multi-currency refunds is applied at the time of refund initiation using the exchange rate recorded at the time of the original charge, stored in the payment_intent record.
Frequently Asked Questions
How is partial refund calculation handled in a refund service?
Partial refund calculation starts with the approved return line items and their original unit prices. The service applies proportional discount allocation (distributing order-level discounts across line items by price weight) to compute the refundable amount per item. Shipping is refunded only if the entire order is returned. Tax is recalculated on the refundable subtotal using the original tax rate. The final amount is capped at the original order total minus any previous refunds issued against the same order.
How do you make payment gateway calls idempotent in a refund service?
Each refund request is assigned an idempotency key derived from the RMA ID and line-item fingerprint (e.g., SHA-256 of rmaId + skuId + amount). This key is passed in the gateway API call header (e.g., Stripe's Idempotency-Key). If the network times out and the request is retried, the gateway returns the original response instead of issuing a duplicate refund. The service also stores the gateway response keyed by idempotency key locally so retries can short-circuit before hitting the gateway.
What is double-entry ledger design in a refund service?
Every refund posts two ledger entries: a debit to the refunds-payable account and a credit to the customer-balance or external-payment account. This ensures the ledger always balances and provides a full audit trail. Entries are written in the same DB transaction as the refund record. Reconciliation jobs periodically sum debits and credits per account and alert on any discrepancy. This model makes it straightforward to produce accurate financial reports and detect double-refund bugs.
How does webhook deduplication work in a refund service?
Payment gateways may deliver the same refund webhook multiple times. The service stores each received webhook event ID in a deduplication table with a unique constraint. On receipt, it attempts to insert the event ID; if the insert fails (duplicate), the handler returns HTTP 200 immediately without reprocessing. If the insert succeeds, the handler updates refund state and commits both changes in one transaction. The deduplication table is pruned after a retention window (e.g., 30 days) to control storage.