Document Signing Service Low-Level Design: Signature Workflow, PDF Embedding, and Audit Trail

What Is a Document Signing Service?

A document signing service allows multiple parties to review and digitally sign documents in a defined order, with signatures embedded into the PDF and every action recorded in a tamper-evident audit trail. DocuSign, Adobe Sign, and HelloSign are production examples. The core design challenges are orchestrating multi-party workflows, embedding cryptographic signatures into PDF files, and ensuring the audit trail cannot be altered after the fact.

Requirements

Functional Requirements

  • Upload a document and define an ordered list of signers
  • Notify each signer in sequence (or in parallel for non-ordered flows)
  • Allow signers to annotate and apply a digital signature at designated fields
  • Embed the signature into the PDF and produce a finalized document
  • Record every action (viewed, signed, declined, delegated) with timestamp and IP
  • Provide a downloadable audit trail certificate alongside the signed document

Non-Functional Requirements

  • Tamper evidence: any modification to the PDF after signing must be detectable
  • Long-term validity of signatures (PAdES-LTV or equivalent)
  • Support documents up to 100 MB
  • 99.9% uptime; signing must not be blocked by downstream notification failures

Data Model

  • documents: document_id, owner_id, title, original_s3_key, signed_s3_key, status (draft, in_progress, completed, voided), created_at, completed_at
  • signing_requests: request_id, document_id, signer_email, signer_name, order_index, status (pending, notified, viewed, signed, declined), signing_token (UUID), notified_at, signed_at
  • signature_fields: field_id, document_id, request_id, page_number, x, y, width, height, field_type (signature, initials, date, text)
  • audit_events: event_id, document_id, request_id, actor_email, event_type, ip_address, user_agent, metadata (JSON), created_at, prev_hash, hash

The prev_hash and hash columns in audit_events form a hash chain: each row's hash is the SHA-256 digest of the previous row's hash combined with the current row's payload, and prev_hash stores that previous hash. Retroactively inserting, modifying, or deleting a row breaks the chain and is therefore detectable.

Core Algorithms

Signature Workflow Orchestration

On document creation, insert signing_requests rows in order_index sequence. A state machine drives transitions: when the current signer signs, mark their request as signed, find the next pending signer (ORDER BY order_index), send notification, and update that request to notified. The document status moves to completed when all signers have signed. Use a background worker to drive notifications so that the signing action itself completes synchronously while email dispatch is async.
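The transitions described above can be sketched in a few lines. This is a minimal in-memory illustration for the sequential flow, not a definitive implementation: the class names, the `outbox` list standing in for the notification queue, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SigningRequest:
    signer_email: str
    order_index: int
    status: str = "pending"  # pending, notified, viewed, signed, declined


@dataclass
class Document:
    requests: List[SigningRequest]
    status: str = "in_progress"  # draft, in_progress, completed, voided
    # Hypothetical stand-in for the queue a background worker would drain.
    outbox: List[str] = field(default_factory=list)


def next_pending(doc: Document) -> Optional[SigningRequest]:
    """Return the pending request with the lowest order_index, if any."""
    pending = [r for r in doc.requests if r.status == "pending"]
    return min(pending, key=lambda r: r.order_index) if pending else None


def notify_next(doc: Document) -> None:
    """Queue a notification for the next signer, or complete the document."""
    nxt = next_pending(doc)
    if nxt is not None:
        doc.outbox.append(nxt.signer_email)  # email dispatch happens asynchronously
        nxt.status = "notified"
    elif all(r.status == "signed" for r in doc.requests):
        doc.status = "completed"


def record_signature(doc: Document, signer_email: str) -> None:
    """Mark the active signer's request as signed and advance the workflow."""
    req = next(r for r in doc.requests if r.signer_email == signer_email)
    if req.status not in ("notified", "viewed"):
        raise ValueError(f"{signer_email} is not the active signer")
    req.status = "signed"
    notify_next(doc)
```

Calling `notify_next` once after creating the document starts the flow; each later `record_signature` call only appends to the outbox, so a slow or failing email provider never blocks the signing write.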

PDF Digital Signature Embedding

Use a PDF library that supports digital signatures (e.g., iText or PDFBox; pdf-lib typically needs a companion signing library) to embed signatures. The process is: load the current PDF from S3, locate the signature field rectangle for the signer, render the signature image or text into the field, then apply a PAdES signature (a CMS/PKCS#7 container) using a signing key held in a Hardware Security Module (HSM) or a managed KMS. Write the signed PDF back to S3 under a new key. For multi-party signing, each subsequent signer signs the already-signed PDF as an incremental update appended to the file, which preserves prior signatures; rewriting the file instead would invalidate them.
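A PDF signature covers every byte of the file except the reserved placeholder that later holds the signature itself. The sketch below illustrates only that digest step with the standard library. The function names and the toy byte string are assumptions, and the actual CMS signing through an HSM or KMS is left to the PDF library.

```python
import hashlib
from typing import List


def byte_range(pdf_len: int, hole_start: int, hole_end: int) -> List[int]:
    """Build a ByteRange-style array [off1, len1, off2, len2] around the placeholder."""
    if not 0 <= hole_start <= hole_end <= pdf_len:
        raise ValueError("placeholder must lie inside the file")
    return [0, hole_start, hole_end, pdf_len - hole_end]


def signed_digest(pdf: bytes, rng: List[int]) -> bytes:
    """SHA-256 over the two covered ranges, skipping the signature placeholder."""
    off1, len1, off2, len2 = rng
    h = hashlib.sha256()
    h.update(pdf[off1:off1 + len1])
    h.update(pdf[off2:off2 + len2])
    return h.digest()


# Toy stand-in for a PDF: 4 bytes, a 10-byte placeholder, 4 more bytes.
unsigned = b"AAAA<00000000>BBBB"
rng = byte_range(len(unsigned), 4, 14)

# Filling the placeholder leaves the digest unchanged...
filled = b"AAAA<deadbeef>BBBB"
same = signed_digest(unsigned, rng) == signed_digest(filled, rng)

# ...while changing any covered byte does not.
tampered = b"AAAX<deadbeef>BBBB"
differs = signed_digest(unsigned, rng) != signed_digest(tampered, rng)
```

Because the digest ignores only the placeholder, a verifier that recomputes it detects any later change to the covered bytes, which is what the tamper-evidence requirement relies on.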

Tamper-Evident Audit Trail

On every audit event insertion, compute hash = SHA-256(prev_hash || event_type || actor_email || created_at || metadata_json), serializing the fields canonically (fixed field order, unambiguous delimiters or length prefixes) so that two different events cannot produce the same input. Store hash in the current row and use it as prev_hash for the next row. To verify integrity, replay all rows in insertion order and recompute the chain; any gap or mismatch indicates tampering. Optionally anchor the latest hash periodically in a blockchain or with an RFC 3161 timestamp authority for external verifiability.
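The chain can be sketched with the standard library alone. This is an illustration under stated assumptions rather than the definitive scheme: the all-zero `GENESIS` value for the first row, the in-memory list standing in for the table, and the use of a JSON array as an unambiguous substitute for raw `||` concatenation are all choices made for the example.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed prev_hash for the first event in a chain


def event_hash(prev_hash: str, event: Dict) -> str:
    """SHA-256 over prev_hash plus a canonical JSON encoding of the event fields."""
    payload = json.dumps(
        [prev_hash, event["event_type"], event["actor_email"],
         event["created_at"], event["metadata"]],
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_event(chain: List[Dict], event: Dict) -> Dict:
    """Append an event row that links to the hash of the previous row."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    row = dict(event, prev_hash=prev_hash, hash=event_hash(prev_hash, event))
    chain.append(row)
    return row


def verify_chain(chain: List[Dict]) -> bool:
    """Replay rows in insertion order and recompute every link."""
    prev_hash = GENESIS
    for row in chain:
        if row["prev_hash"] != prev_hash or row["hash"] != event_hash(prev_hash, row):
            return False
        prev_hash = row["hash"]
    return True
```

In a real table the append must be serialized (for example, per document) so that two concurrent inserts cannot both read the same prev_hash.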

API Design

  • POST /documents — upload document and define signers; returns document_id
  • GET /documents/{id} — fetch document status and signer list
  • GET /sign/{token} — signer opens their signing session (public, token-gated)
  • POST /sign/{token}/fields — submit field values and signature image
  • POST /sign/{token}/decline — decline to sign with reason
  • GET /documents/{id}/audit — download audit trail (admin or owner)
  • GET /documents/{id}/download — download finalized signed PDF
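As an illustration of the token-gated endpoint, the sketch below resolves a signing token to its request. Everything here is hypothetical: the in-memory `REQUESTS` dictionary stands in for the signing_requests table, and the 404 and 410 status codes are assumed choices, not part of the API above.

```python
import uuid
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for signing_requests, keyed by signing_token.
REQUESTS: Dict[str, Dict] = {}


def issue_token(request_id: str) -> str:
    """Create a random UUID token for a signer who has just been notified."""
    token = str(uuid.uuid4())
    REQUESTS[token] = {"request_id": request_id, "status": "notified"}
    return token


def open_session(token: str) -> Tuple[int, Dict]:
    """Handle GET /sign/{token}: return (http_status, body)."""
    req = REQUESTS.get(token)
    if req is None:
        return 404, {"error": "unknown token"}
    if req["status"] in ("signed", "declined"):
        return 410, {"error": "signing session closed"}
    if req["status"] == "notified":
        req["status"] = "viewed"  # the first open is also an audit-worthy event
    return 200, {"request_id": req["request_id"], "status": req["status"]}
```

Because the token is the only credential on this public route, it should be long, random, and single-purpose, and it should stop working once the request reaches a terminal status.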

Scalability Considerations

PDF processing is CPU-intensive. Offload signing jobs to a dedicated worker pool (Celery, Sidekiq, or AWS Lambda) triggered via a message queue. Store original and signed PDFs in S3 with versioning enabled. Use presigned URLs for direct client downloads to avoid proxying large files through the API tier.

The audit_events table grows without bound. Partition it by created_at (monthly) and archive the audit rows of completed documents to cold storage after a retention period (e.g., 7 years for legal compliance). The hash chain remains verifiable from archived exports as long as each export keeps the rows in insertion order together with their prev_hash and hash values.

For high-volume bulk sending (e.g., HR onboarding thousands of employees), batch-create signing_requests and publish notification jobs to a queue. Rate-limit email sends to respect provider limits per domain.
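Per-domain rate limiting can be sketched as one token bucket per recipient domain. The class name, the rate and burst parameters, and the explicit `now` argument (passed in so the logic is easy to test) are assumptions for illustration; real provider limits vary.

```python
from typing import Dict, Tuple


class DomainRateLimiter:
    """Token bucket per recipient domain; rate and burst values are illustrative."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        # domain -> (tokens remaining, timestamp of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, email: str, now: float) -> bool:
        """Return True if a send to this address may proceed at time `now`."""
        domain = email.rsplit("@", 1)[-1].lower()
        tokens, last = self._buckets.get(domain, (float(self.burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[domain] = (tokens - 1.0, now)
            return True
        self._buckets[domain] = (tokens, now)
        return False
```

A notification worker would call `allow` before each send and requeue the job with a delay when it returns False, so a burst to one domain does not hold up sends to others.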

Summary

A document signing service orchestrates a state-machine-driven multi-party workflow, embeds cryptographic PAdES signatures into PDFs using incremental revisions, and records every action in a SHA-256 hash-chained audit table. Decoupling notification delivery from the signing write path and offloading PDF processing to background workers keeps the API responsive and the system reliable.

Frequently Asked Questions

How do you design a multi-party signature workflow?

Model the signing envelope as a state machine: Draft → Sent → In-Progress → Completed / Voided. Each signer is a row in a signers table with an order (for sequential flows) or no order (for parallel flows), a status, and signed_at. A workflow engine (or a simple cron job plus event listener) advances the envelope state: in sequential mode, notify the next signer only after the previous one signs; in parallel mode, notify all signers simultaneously and complete when all have signed.

How is a PDF digital signature embedded using PKCS#7?

Compute a hash of the PDF byte range excluding the signature placeholder. Wrap the hash in a PKCS#7 CMS SignedData structure signed with the signer's private key and include their X.509 certificate chain. Embed the DER-encoded PKCS#7 blob in the /Contents entry of the PDF's signature dictionary, whose /ByteRange entry records which bytes were hashed. Readers verify by re-hashing the same byte range and validating the certificate chain against a trusted CA root.

What makes an audit trail tamper-evident?

Append each audit event (viewed, signed, declined, voided) to an immutable log table with a chained hash: each row stores SHA-256(previous_hash || event_data). To verify integrity, replay the chain and recompute the hashes. Store a chain anchor (such as the latest hash) in a trusted external system (blockchain timestamp, RFC 3161 TSA token) to prove the log existed in that state at a point in time. Never allow UPDATE or DELETE on audit rows.

When should you use parallel vs sequential signing order?

Use parallel signing when all parties are peers and no signer's decision depends on another's (e.g., co-buyers on a contract). Use sequential signing when there is an approval hierarchy (e.g., the employee signs first, then the manager countersigns) or when earlier signers' identity must be established before later parties commit. Sequential signing reduces ambiguity about whose turn it is; parallel signing minimizes total time to completion.
