How does a two-pass ledger matching algorithm work for payment reconciliation?

Pass one performs exact matching: internal transactions are joined to external settlement records on (amount, currency, external_reference_id). Matched pairs are marked RECONCILED immediately. Pass two runs fuzzy matching on the remaining unmatched rows using a configurable tolerance window (e.g., ±$0.01 for rounding, ±1 day for settlement lag) combined with a similarity score on merchant name and transaction date. Fuzzy candidates above a confidence threshold are auto-matched; those below are queued for manual review.

How are discrepancy types classified in a reconciliation system?

Discrepancies are classified into: MISSING_IN_BANK (internal record has no matching settlement), MISSING_IN_LEDGER (bank reports a transaction not in the internal system), AMOUNT_MISMATCH (same reference, different amount — often FX rounding or fee deductions), DATE_MISMATCH (settlement date differs by more than the tolerance window), and DUPLICATE (same transaction matched to two bank entries). Classification drives routing: amount mismatches go to the finance team; missing-in-ledger above a threshold trigger fraud alerts.

How does an auto-resolution rule engine reduce manual reconciliation work?

Rules are stored as rows in a resolution_rules table with fields: discrepancy_type, amount_threshold, age_threshold, and action (AUTO_WRITE_OFF, AUTO_ADJUST, ESCALATE). The engine evaluates rules in priority order against each open discrepancy. For example, amount mismatches under $1.00 older than 30 days are auto-written-off against a rounding suspense account. Rules are versioned and audited; any auto-resolution generates an audit_log entry with the matched rule ID, preventing silent data changes.

Why use an append-only audit log in payment reconciliation, and how is it structured?

An append-only log ensures every state change is traceable for regulatory compliance and dispute resolution. Each row captures: event_id (UUID), transaction_id, event_type (MATCHED, FLAGGED, RESOLVED, WRITTEN_OFF), actor (user ID or 'system'), payload (JSON diff), and created_at. Rows are never updated or deleted. A separate materialized view or CQRS read model projects current reconciliation status from the log, keeping the hot query path fast without sacrificing immutability of the historical record.

Payment Reconciliation Service Low-Level Design: Ledger Matching, Discrepancy Detection, and Dispute Resolution

⏱ 6 min read

Payment Reconciliation Service: Overview and Requirements

A payment reconciliation service matches transaction records from internal ledgers against statements from external processors, banks, and payment networks. It surfaces discrepancies automatically, attempts auto-resolution for known patterns, escalates unresolved items for human review, and maintains an immutable audit trail of every match and resolution decision.

Functional Requirements

Ingest transaction records from multiple internal and external sources on configurable schedules.
Match records across sources using configurable match keys such as transaction ID, amount, currency, and date.
Detect discrepancies: missing records, amount mismatches, duplicate entries, and timing differences.
Auto-resolve discrepancies that match known patterns such as processor fees, FX rounding, and settlement timing windows.
Escalate unresolved discrepancies to a dispute resolution queue with full context.
Produce reconciliation reports per period, source, and currency.

Non-Functional Requirements

Reconciliation of one million transactions must complete within 30 minutes of source data availability.
All match decisions and resolutions must be stored immutably for regulatory audit for a minimum of 7 years.
The system must process each source independently so a delayed file from one bank does not block reconciliation of others.
Auto-resolution rules must be versioned and auditable: any rule change must record who changed it and when.

Data Model

LedgerEntry: entry_id, source_id, external_ref, amount, currency, direction (debit | credit), transaction_date, settlement_date, status, raw_payload, ingested_at.
ReconciliationRun: run_id, period_start, period_end, source_ids, triggered_at, completed_at, status (running | completed | failed), summary_json.
MatchResult: match_id, run_id, entry_ids (array), match_type (exact | fuzzy | rule), confidence_score, matched_at, matched_by (system | user_id).
Discrepancy: discrepancy_id, run_id, entry_ids (array), type (missing | amount_mismatch | duplicate | timing), amount_delta, currency, status (open | auto_resolved | escalated | resolved), resolution_rule_id, resolved_at, resolved_by.
ResolutionRule: rule_id, name, condition_json, action (auto_resolve | escalate), version, created_by, created_at, active.
AuditLog: log_id, entity_type, entity_id, action, actor, payload_before, payload_after, occurred_at — append-only, no updates or deletes permitted.

Multi-Source Ledger Matching

Matching operates in two passes. The first pass performs exact matching on the canonical match key, which is typically a combination of the external transaction reference, amount, and currency. Records that share the same canonical key across all expected sources are marked as matched with high confidence.

The second pass applies fuzzy matching to unmatched residuals. Fuzzy matching relaxes constraints:

Amount tolerance: match records where the absolute difference is within a configurable threshold, such as 0.01 USD to handle FX rounding.
Date tolerance: match records whose transaction dates differ by up to N settlement days to account for bank processing delays.
Reference normalization: strip prefixes and suffixes applied by different processors to the same underlying transaction reference before comparing.

Fuzzy matches are assigned a confidence score based on how many match criteria were relaxed. Scores below a configurable floor are not auto-accepted and are instead routed to the discrepancy pipeline.

Discrepancy Detection

After matching, unmatched records are classified into discrepancy types:

Missing: a record exists in the internal ledger but has no counterpart in the external source, or vice versa.
Amount mismatch: a matched pair has a non-zero amount delta outside the tolerance window.
Duplicate: two or more records in the same source share the same external reference and amount.
Timing: a record is present in both sources but the settlement dates differ by more than the expected settlement window.

Auto-Resolution Engine

Resolution rules are evaluated in priority order against each open discrepancy. A rule condition is a JSON expression evaluated against the discrepancy attributes including type, amount_delta, currency, and source_ids. When a rule matches, its action is applied:

Auto-resolve: the discrepancy is closed, a ResolutionRule reference is recorded, and the entry is posted to the adjustment ledger if an accounting entry is required.
Escalate: the discrepancy is moved to the dispute resolution queue with enriched context and assigned to the owning team based on source routing rules.

Common auto-resolution patterns include processor service fee entries, rounding differences under $0.01, and records that reconcile when the date window is extended by one additional business day.

Dispute Resolution Workflow

Escalated discrepancies enter a queue with SLA timers. A resolver reviews the original records, may request additional documentation from the processor or bank via the external dispute API, and records a resolution decision with a reason code. All decisions are written to the AuditLog. Resolved discrepancies trigger any necessary adjustment journal entries in the accounting system via an outbound event.

API Design

POST /runs — trigger a reconciliation run for a specified period and source set.
GET /runs/{run_id} — get run status, progress, and summary statistics.
GET /discrepancies — query discrepancies with filters on type, status, currency, and date range.
PATCH /discrepancies/{id}/resolve — manually resolve a discrepancy with a reason code and optional adjustment amount.
GET /audit-log?entity_type=&entity_id= — retrieve the full audit history for a record.
POST /resolution-rules — create a new auto-resolution rule, creating a new version if one exists for the same name.

Scalability and Observability

Ingestion and matching are parallelized by source. Each source runs an independent pipeline allowing partial completion when one source file is delayed. Matching within a source pair is further parallelized by currency and date partition. Key metrics include: ingestion record count and latency per source, match rate by pass (exact vs fuzzy), discrepancy rate by type, auto-resolution rate by rule, dispute queue depth and age, and audit log write throughput.