Payment Refund System: Low-Level Design
A payment refund system reverses a completed charge — partially or fully — and ensures the customer’s account is credited while the merchant’s revenue ledger is accurately adjusted. The critical design challenges are preventing double refunds, handling partial refunds correctly (multiple partial refunds on one payment), coordinating with payment processors (Stripe, Braintree), and providing a complete audit trail for disputes and chargebacks.
Core Data Model
CREATE TABLE Payment (
payment_id BIGSERIAL PRIMARY KEY,
order_id BIGINT NOT NULL,
customer_id BIGINT NOT NULL,
amount_cents INT NOT NULL,
currency CHAR(3) NOT NULL DEFAULT 'USD',
status VARCHAR(20) NOT NULL, -- succeeded, refunded, partially_refunded
processor VARCHAR(30) NOT NULL, -- stripe, braintree, paypal
processor_charge_id VARCHAR(200) NOT NULL UNIQUE, -- e.g. ch_3NxvXY2eZvKYlo2C
captured_at TIMESTAMPTZ NOT NULL,
refundable_cents INT NOT NULL, -- amount - sum(refunds); decremented on each refund
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE TABLE Refund (
refund_id BIGSERIAL PRIMARY KEY,
payment_id BIGINT NOT NULL REFERENCES Payment(payment_id),
customer_id BIGINT NOT NULL,
requested_by BIGINT NOT NULL, -- user_id of agent or customer
amount_cents INT NOT NULL,
currency CHAR(3) NOT NULL DEFAULT 'USD',
reason VARCHAR(50) NOT NULL, -- customer_request, fraud, defective, duplicate
status VARCHAR(20) NOT NULL DEFAULT 'pending',
-- pending, processing, succeeded, failed
processor_refund_id VARCHAR(200) UNIQUE, -- stripe: re_3Nxv...; set on success
idempotency_key VARCHAR(200) UNIQUE NOT NULL,
notes TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
processed_at TIMESTAMPTZ,
failed_reason TEXT
);
CREATE TABLE RefundAuditEvent (
event_id BIGSERIAL PRIMARY KEY,
refund_id BIGINT NOT NULL REFERENCES Refund(refund_id),
event_type VARCHAR(50) NOT NULL, -- created, processor_submitted, succeeded, failed
actor_id BIGINT,
metadata JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX ON Refund(payment_id, status);
CREATE INDEX ON Payment(processor_charge_id);
Refund Processing
import uuid, stripe
stripe.api_key = STRIPE_SECRET_KEY
def request_refund(payment_id: int, amount_cents: int, reason: str,
requested_by: int, notes: str = None) -> int:
"""
Initiate a refund. Validates amount against remaining refundable balance.
Uses idempotency key to prevent duplicate refund submissions.
Returns refund_id.
"""
# Lock the payment row to prevent concurrent over-refund
payment = db.fetchone("""
SELECT * FROM Payment WHERE payment_id=%s FOR UPDATE
""", (payment_id,))
if not payment:
raise PaymentNotFoundError(f"Payment {payment_id} not found")
if payment['status'] not in ('succeeded', 'partially_refunded'):
raise RefundNotAllowedError(f"Cannot refund payment in status: {payment['status']}")
if amount_cents payment['refundable_cents']:
raise InvalidRefundAmountError(
f"Refund amount {amount_cents} exceeds refundable balance {payment['refundable_cents']}"
)
idem_key = f"refund:{payment_id}:{amount_cents}:{requested_by}:{str(uuid.uuid4())[:8]}"
refund_id = db.fetchone("""
INSERT INTO Refund
(payment_id, customer_id, requested_by, amount_cents, currency,
reason, idempotency_key, notes)
VALUES (%s,%s,%s,%s,%s,%s,%s,%s)
RETURNING refund_id
""", (payment_id, payment['customer_id'], requested_by, amount_cents,
payment['currency'], reason, idem_key, notes))['refund_id']
# Reserve the refund amount immediately (prevent concurrent over-refund)
db.execute("""
UPDATE Payment
SET refundable_cents = refundable_cents - %s,
status = CASE WHEN refundable_cents - %s = 0 THEN 'refunded'
ELSE 'partially_refunded' END
WHERE payment_id=%s
""", (amount_cents, amount_cents, payment_id))
_audit(refund_id, 'created', requested_by, {'amount_cents': amount_cents})
# Process asynchronously via job queue for resilience
enqueue('process_refund', {'refund_id': refund_id}, queue_name='payments',
idempotency_key=f"process:{refund_id}")
return refund_id
def process_refund(refund_id: int):
"""
Job handler: submit refund to payment processor.
Idempotent: if processor_refund_id already set, the refund was already submitted.
"""
refund = db.fetchone("""
SELECT r.*, p.processor, p.processor_charge_id
FROM Refund r JOIN Payment p USING(payment_id)
WHERE r.refund_id=%s
""", (refund_id,))
if refund['status'] == 'succeeded':
return # Already processed (idempotency)
db.execute(
"UPDATE Refund SET status='processing' WHERE refund_id=%s", (refund_id,)
)
_audit(refund_id, 'processor_submitted', None, {})
try:
if refund['processor'] == 'stripe':
result = stripe.Refund.create(
charge=refund['processor_charge_id'],
amount=refund['amount_cents'],
reason=_map_reason_to_stripe(refund['reason']),
idempotency_key=refund['idempotency_key'],
)
processor_refund_id = result['id']
else:
raise NotImplementedError(f"Processor {refund['processor']} not supported")
db.execute("""
UPDATE Refund
SET status='succeeded', processor_refund_id=%s, processed_at=NOW()
WHERE refund_id=%s
""", (processor_refund_id, refund_id))
_audit(refund_id, 'succeeded', None, {'processor_refund_id': processor_refund_id})
except stripe.error.StripeError as e:
_handle_processor_failure(refund_id, refund['payment_id'], refund['amount_cents'], str(e))
raise # Re-raise so job queue retries
def _handle_processor_failure(refund_id: int, payment_id: int,
amount_cents: int, error: str):
"""On permanent failure, restore the refundable balance."""
db.execute("""
UPDATE Refund SET status='failed', failed_reason=%s WHERE refund_id=%s
""", (error[:500], refund_id))
# Restore refundable amount so the agent can retry
db.execute("""
UPDATE Payment
SET refundable_cents = refundable_cents + %s,
status = CASE WHEN status='refunded' THEN 'succeeded' ELSE status END
WHERE payment_id=%s
""", (amount_cents, payment_id))
_audit(refund_id, 'failed', None, {'error': error})
def _map_reason_to_stripe(reason: str) -> str:
return {'fraud': 'fraudulent', 'duplicate': 'duplicate', 'customer_request': 'requested_by_customer'}.get(reason, 'requested_by_customer')
def _audit(refund_id, event_type, actor_id, metadata):
import json
db.execute("""
INSERT INTO RefundAuditEvent (refund_id, event_type, actor_id, metadata)
VALUES (%s,%s,%s,%s)
""", (refund_id, event_type, actor_id, json.dumps(metadata)))
Key Design Decisions
- FOR UPDATE + immediate refundable_cents decrement prevents double refunds: the payment row is locked before the refund row is created. Two concurrent refund requests for the same payment: the first decrements refundable_cents; the second reads the updated value and fails the amount validation. No distributed locks or application-level checks needed — the database enforces it atomically.
- Reserve refundable amount before calling the processor: decrement refundable_cents at request time, not after the processor confirms. If the processor call is made first and the server crashes before recording the response, a retry would submit a second refund. By reserving first and restoring on permanent failure, there is no window for accidental double submission.
- Processor idempotency key: Stripe accepts an Idempotency-Key header — if the same key is resubmitted (due to retry), Stripe returns the same response without processing a second refund. The refund’s idempotency_key column maps directly to Stripe’s header.
- Async processing via job queue: processor calls are slow (200–800ms) and can fail. Running them in a job queue with retry logic ensures refunds are eventually processed without blocking the user-facing API and without losing the request if the server crashes.
Payment refund and transaction reversal system design is discussed in Stripe system design interview questions.
Payment refund and merchant order management design is covered in Shopify system design interview preparation.
Payment refund and booking cancellation system design is discussed in Airbnb system design interview guide.