GDPR Data Deletion System Low-Level Design

What is GDPR Right to Erasure?

GDPR Article 17 gives individuals in the EU the “right to be forgotten”: they can request deletion of the personal data your company holds about them. A valid request must be handled without undue delay and at the latest within one month of receipt (Article 12(3)); this design uses 30 days as its working deadline. This sounds simple but is technically complex: data is spread across dozens of services, databases, data warehouses, backups, caches, search indexes, and third-party integrations. Building a systematic deletion pipeline is essential for any product serving EU users.

Requirements

  • Accept user deletion requests via API or in-product UI
  • Delete or anonymize personal data across all systems within 30 days
  • Systems: primary DB, read replicas, analytics DB, search index, CDN/object storage, email lists, third-party integrations
  • Maintain audit log of deletion requests and completion status (ironically, required for compliance)
  • Handle partial failures: retry failed deletions, alert on stale requests
  • Business data (financial transactions, fraud logs) may need to be retained — anonymize rather than delete

Data Inventory: Know Where PII Lives

Before building deletion, catalog where personal data exists:

Primary DB tables with PII:
  users: email, name, phone, address, date_of_birth
  orders: shipping_address, billing_address (linked to user_id)
  messages: content (may contain PII), sender_id
  sessions: ip_address, user_agent

External systems:
  Analytics (Mixpanel, Amplitude): user events by user_id
  Email (SendGrid/Mailchimp): subscriber list, send history
  CDN/S3: profile photos, uploaded documents
  Elasticsearch: indexed user profiles, message content
  Data warehouse (Snowflake/BigQuery): user events, analytics tables
  Third-party: Intercom, Salesforce, Stripe customer objects
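
One way to keep this catalog actionable is to store it as data that the deletion pipeline reads, so a new table or vendor cannot be added without declaring how it is erased. The sketch below is illustrative only: the `PiiLocation` type, the `Action` values, the resource names, and the `systems_to_process` helper are assumptions made for this example, not part of any library.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Action(Enum):
    DELETE = "delete"        # remove the record entirely
    ANONYMIZE = "anonymize"  # keep the record, overwrite the PII fields


@dataclass(frozen=True)
class PiiLocation:
    system: str               # matches the `system` column of DeletionTask
    resource: str             # table, index, bucket prefix, or vendor object
    fields: Tuple[str, ...]   # PII fields held there
    action: Action


# Hypothetical inventory mirroring the catalog above.
INVENTORY = [
    PiiLocation("primary_db", "users",
                ("email", "name", "phone", "address", "date_of_birth"), Action.ANONYMIZE),
    PiiLocation("primary_db", "orders",
                ("shipping_address", "billing_address"), Action.ANONYMIZE),
    PiiLocation("primary_db", "messages", ("content",), Action.DELETE),
    PiiLocation("primary_db", "sessions", ("ip_address", "user_agent"), Action.DELETE),
    PiiLocation("elasticsearch", "user_profiles", ("name", "email"), Action.DELETE),
    PiiLocation("s3_uploads", "profile_photos", ("image",), Action.DELETE),
    PiiLocation("sendgrid", "subscribers", ("email",), Action.DELETE),
    PiiLocation("data_warehouse", "user_events", ("user_id",), Action.ANONYMIZE),
]


def systems_to_process(inventory):
    """Return the distinct systems, in first-seen order, that need a deletion task."""
    return list(dict.fromkeys(loc.system for loc in inventory))
```

Deriving the task list from the inventory, rather than hard-coding it in the orchestrator, means a system missing from the catalog shows up as a review problem instead of a silent gap in deletion coverage.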

Deletion Request Data Model

DeletionRequest(request_id UUID, user_id UUID, email VARCHAR,
                requested_at TIMESTAMP, deadline TIMESTAMP,   -- requested_at + 30 days
                status ENUM(PENDING, IN_PROGRESS, COMPLETED, FAILED),
                completed_at TIMESTAMP,
                requester_ip VARCHAR, verification_method VARCHAR)

DeletionTask(task_id UUID, request_id UUID,
             system VARCHAR,     -- 'primary_db', 'elasticsearch', 'sendgrid', 's3', ...
             status ENUM(PENDING, COMPLETED, FAILED, SKIPPED),
             error_message TEXT,
             attempted_at TIMESTAMP, completed_at TIMESTAMP,
             retry_count INT DEFAULT 0)
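
As a rough illustration, the model above could be written as DDL and exercised with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite has no ENUM or UUID types, so TEXT columns with CHECK constraints stand in for them; several columns are omitted for brevity; and the UNIQUE (request_id, system) constraint is an addition, not part of the model above, included so that creating a task twice is harmless.

```python
import sqlite3
import uuid
from datetime import datetime, timedelta, timezone

SCHEMA = """
CREATE TABLE deletion_request (
    request_id   TEXT PRIMARY KEY,
    user_id      TEXT NOT NULL,
    requested_at TEXT NOT NULL,
    deadline     TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'PENDING'
                 CHECK (status IN ('PENDING','IN_PROGRESS','COMPLETED','FAILED')),
    completed_at TEXT
);
CREATE TABLE deletion_task (
    task_id     TEXT PRIMARY KEY,
    request_id  TEXT NOT NULL REFERENCES deletion_request(request_id),
    system      TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'PENDING'
                CHECK (status IN ('PENDING','COMPLETED','FAILED','SKIPPED')),
    retry_count INTEGER NOT NULL DEFAULT 0,
    UNIQUE (request_id, system)   -- assumption: one task per system per request
);
"""


def create_request(conn, user_id, now=None):
    """Insert a PENDING request whose deadline is requested_at + 30 days."""
    now = now or datetime.now(timezone.utc)
    request_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO deletion_request (request_id, user_id, requested_at, deadline) "
        "VALUES (?, ?, ?, ?)",
        (request_id, user_id, now.isoformat(), (now + timedelta(days=30)).isoformat()),
    )
    return request_id


def ensure_task(conn, request_id, system):
    """Create the task if missing; a second call for the same system is a no-op."""
    conn.execute(
        "INSERT OR IGNORE INTO deletion_task (task_id, request_id, system) VALUES (?, ?, ?)",
        (str(uuid.uuid4()), request_id, system),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
rid = create_request(conn, "user-123")
ensure_task(conn, rid, "primary_db")
ensure_task(conn, rid, "primary_db")  # ignored by the UNIQUE constraint
```

The uniqueness constraint matters when the orchestrator is re-run for the same request: without it, each run would leave another task row per system.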

Deletion Orchestration

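def process_deletion(request_id):
    # db, now(), the handlers, and alert_compliance_team are application-level
    # helpers assumed to exist elsewhere in the codebase.
    request = db.get(DeletionRequest, request_id)
    user_id = request.user_id
    db.update(request, status='IN_PROGRESS')

    # One handler per system; each should be idempotent so it can be retried.
    # Third-party systems from the inventory (Intercom, Salesforce, Stripe)
    # need their own entries here as well.
    tasks = [
        ('primary_db', delete_from_primary_db),
        ('elasticsearch', delete_from_elasticsearch),
        ('s3_uploads', delete_s3_objects),
        ('sendgrid', remove_from_sendgrid),
        ('analytics', anonymize_analytics_events),
        ('data_warehouse', anonymize_warehouse_data),
    ]

    for system_name, handler in tasks:
        task = db.create(DeletionTask(request_id=request_id, system=system_name))
        db.update(task, attempted_at=now())
        try:
            handler(user_id)
            db.update(task, status='COMPLETED', completed_at=now())
        except Exception as e:
            # Record the failure and continue, so one system does not block the rest
            db.update(task, status='FAILED', error_message=str(e))

    # The request is complete only when every system succeeded
    failed = db.count(DeletionTask, request_id=request_id, status='FAILED')
    if failed == 0:
        db.update(request, status='COMPLETED', completed_at=now())
    else:
        db.update(request, status='FAILED')
        alert_compliance_team(request_id, failed_tasks=failed)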
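
The orchestrator above records failures but does not retry them. A minimal sketch of retrying a single failed task with exponential backoff might look like the following; `MAX_RETRIES`, `BASE_DELAY_SECONDS`, the cap, and the plain dict standing in for a DeletionTask row are all assumptions for illustration.

```python
import random
import time

MAX_RETRIES = 5           # assumption: tune per system
BASE_DELAY_SECONDS = 60   # assumption: first retry within about a minute


def backoff_delay(retry_count, base=BASE_DELAY_SECONDS, cap=6 * 3600):
    """Exponential backoff capped at `cap` seconds: base, 2*base, 4*base, ..."""
    return min(cap, base * (2 ** retry_count))


def retry_task(task, handler, user_id, sleep=time.sleep):
    """Re-run one FAILED task until it succeeds or MAX_RETRIES is reached.

    `task` is a plain dict standing in for a DeletionTask row.
    """
    while task["status"] == "FAILED" and task["retry_count"] < MAX_RETRIES:
        # Random jitter keeps many tasks from hitting a recovering system at once.
        sleep(random.uniform(0, backoff_delay(task["retry_count"])))
        task["retry_count"] += 1
        try:
            handler(user_id)
            task["status"] = "COMPLETED"
            task["error_message"] = None
        except Exception as exc:
            # Keep the task FAILED and remember why the latest attempt failed.
            task["error_message"] = str(exc)
    return task
```

In a real deployment the waiting would more likely be done by a job scheduler or a delayed queue than by sleeping in-process, and tasks that exhaust their retries would be escalated to the compliance team.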

Delete vs. Anonymize

Some data must be retained for legal/business reasons but personal identifiers must be removed:

def delete_from_primary_db(user_id):
    # Run all statements in one transaction so a failure leaves no half-deleted user.

    # Hard delete: truly remove the record
    db.execute('DELETE FROM sessions WHERE user_id=?', user_id)
    # messages references the user through sender_id (see the data inventory)
    db.execute('DELETE FROM messages WHERE sender_id=?', user_id)

    # Anonymize: retain business data, remove PII
    db.execute('''
        UPDATE users SET
            email = 'deleted-' || user_id || '@deleted.invalid',
            name = 'Deleted User',
            phone = NULL,
            address = NULL,
            date_of_birth = NULL,
            status = 'DELETED'
        WHERE user_id=?
    ''', user_id)

    # Orders must be retained for financial records; anonymize the name and addresses
    db.execute('''
        UPDATE orders SET
            shipping_name = 'Deleted User',
            shipping_address = '[deleted]',
            billing_address = '[deleted]'
        WHERE user_id=?
    ''', user_id)
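
Because a failed task may be retried, a handler like this should give the same result when run twice. The following is a minimal, self-contained sketch using Python's built-in sqlite3 module with a cut-down, hypothetical schema (only a few of the columns listed earlier); it is meant to illustrate the transaction and idempotency properties, not to mirror a production database.

```python
import sqlite3


def anonymize_user(conn, user_id):
    """Delete sessions and overwrite PII on the users row in one transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("DELETE FROM sessions WHERE user_id = ?", (user_id,))
        conn.execute(
            """
            UPDATE users SET
                email = 'deleted-' || user_id || '@deleted.invalid',
                name = 'Deleted User',
                phone = NULL,
                status = 'DELETED'
            WHERE user_id = ?
            """,
            (user_id,),
        )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id TEXT PRIMARY KEY, email TEXT, name TEXT, phone TEXT, status TEXT);
    CREATE TABLE sessions (user_id TEXT, ip_address TEXT);
    INSERT INTO users VALUES ('u1', 'ada@example.com', 'Ada', '555-0100', 'ACTIVE');
    INSERT INTO users VALUES ('u2', 'bob@example.com', 'Bob', '555-0101', 'ACTIVE');
    INSERT INTO sessions VALUES ('u1', '203.0.113.7');
""")

anonymize_user(conn, "u1")
anonymize_user(conn, "u1")  # safe to repeat, which matters when tasks are retried
print(conn.execute("SELECT email, name, phone, status FROM users WHERE user_id = 'u1'").fetchone())
```

The placeholder email is derived from the user ID, so it stays unique per user and the second run writes exactly the same values as the first.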

Backup Handling

Backups contain snapshots of PII, and rewriting them for every request is rarely practical. Options:

  • Exclude deleted users from future backups (feasible for incremental backups).
  • Accept that current backups contain PII and establish a backup retention policy (e.g., a 90-day backup TTL); after the TTL, the backups containing the user’s data expire naturally.
  • For compliance, document the backup retention policy. The GDPR text does not set an explicit grace period for backups, but expiry on a documented schedule, with the backed-up data not used in the meantime, is the commonly accepted approach.
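
One consequence worth planning for: restoring a backup brings back the rows of any user erased after that snapshot was taken. A minimal sketch of replaying deletions after a restore follows. It assumes the deletion request log is kept somewhere the restore does not overwrite, and that the handler is idempotent; the function names and the dict shape are hypothetical.

```python
from datetime import datetime, timezone


def requests_to_replay(requests, backup_taken_at):
    """Return completed requests whose data a restore would bring back.

    A request completed after the snapshot was taken is not reflected in that
    snapshot, so the user's data reappears when the snapshot is restored.
    """
    return [
        r for r in requests
        if r["status"] == "COMPLETED" and r["completed_at"] > backup_taken_at
    ]


def replay_after_restore(requests, backup_taken_at, handler):
    """Re-run an idempotent deletion handler for each affected user."""
    replayed = []
    for r in requests_to_replay(requests, backup_taken_at):
        handler(r["user_id"])
        replayed.append(r["user_id"])
    return replayed
```

This works at request granularity for simplicity. Replaying at task granularity, using each DeletionTask's completed_at, would also cover requests that were only partly finished when the snapshot was taken.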

Key Design Decisions

  • Task-per-system model: each system gets its own task, so it can be retried independently and one failure doesn’t block the others
  • Anonymize financial records rather than delete: where the law requires retention, that obligation takes precedence over the erasure right for transactional data
  • Audit log of deletion tasks: ironic but required, since you need to prove you deleted data
  • 30-day deadline alert: a cron job selects DeletionRequests WHERE deadline < NOW() + 3 days AND status != ‘COMPLETED’ and alerts the compliance team before the deadline passes
  • Verification before deletion: require re-authentication or email confirmation before processing a deletion request
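
The deadline check can be sketched in a few lines. The version below is an illustration under assumptions: requests are plain dicts with `request_id`, `status`, and a timezone-aware `deadline`, and the three-day warning window is an arbitrary choice. It separates requests that are approaching the deadline from those already past it, since the two usually warrant different alerts.

```python
from datetime import datetime, timedelta, timezone

ALERT_WINDOW = timedelta(days=3)  # assumption: warn three days before the deadline


def requests_needing_attention(requests, now=None, window=ALERT_WINDOW):
    """Split open requests into those approaching the deadline and those past it."""
    now = now or datetime.now(timezone.utc)
    approaching, overdue = [], []
    for r in requests:
        if r["status"] == "COMPLETED":
            continue
        if r["deadline"] <= now:
            overdue.append(r["request_id"])
        elif r["deadline"] <= now + window:
            approaching.append(r["request_id"])
    return approaching, overdue
```

A scheduled job could call this once a day and page on anything in the overdue list while sending a lower-priority notice for the approaching list.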

Frequently Asked Questions

How long does a company have to fulfill a GDPR deletion request?

Erasure under Article 17 must happen without undue delay, and Article 12(3) sets the outer limit at one month from receipt of a valid request. If requests are complex or numerous, the period can be extended by a further two months, but the user must be notified within the first month. Build your system to alert the compliance team when requests approach the deadline; a cron job checking DeletionRequest WHERE deadline < NOW() + 3 DAYS AND status != COMPLETED is sufficient.

Do you have to delete data from backups to comply with GDPR right to erasure?

Not necessarily right away. The commonly accepted interpretation is that you delete from live systems promptly, make sure the data held in backups is not used, and document that backups containing it will expire within your stated retention period (e.g., 90 days). After backup expiry, no copy of the personal data remains. Document your backup retention policy in your privacy policy and data processing agreements.

What data can you retain despite a GDPR deletion request?

GDPR includes exemptions for data that must be retained by law. Financial records and transaction data typically must be kept for several years under tax and accounting rules; the exact period varies by country, with 7 years a common figure. Fraud detection logs may need to be retained. Safety-critical data (e.g., medical records) may have separate regulations. For this data, anonymize instead of delete: replace name/email/address with placeholder values while preserving the record structure and business-relevant fields like amounts and dates.

How do you handle deletion across third-party integrations?

Create a DeletionTask for each third-party system: Intercom, Salesforce, SendGrid, Stripe, analytics platforms. Call each provider’s deletion or anonymization API; most major providers offer one for GDPR requests. Store the API response and mark the task COMPLETED or FAILED. For failed tasks, implement retry with exponential backoff. Keep a record of what was sent to each provider, since this is your proof of compliance in case of audit.

How do you verify a deletion request is legitimate?

Require the user to authenticate (re-login or verify email ownership) before processing the request. A malicious third party should not be able to delete another user’s account by submitting their email address. For users who have lost account access, require email verification to the address on file. For API-based requests, verify with the user’s existing session token or require re-authentication within the last 10 minutes.
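
For the email-confirmation path, one possible approach is a signed, short-lived token embedded in the confirmation link. The sketch below uses only Python's standard hmac and hashlib modules; the token layout, the 15-minute lifetime, and the hard-coded key are assumptions for illustration (a real key would come from a secret store).

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-real-secret"  # assumption: loaded from a secret store
TOKEN_TTL_SECONDS = 15 * 60                  # assumption: confirmation link lifetime


def _sign(payload):
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_token(user_id, issued_at=None):
    """Build 'user_id:issued_at:signature' for the confirmation link."""
    issued_at = int(time.time() if issued_at is None else issued_at)
    payload = f"{user_id}:{issued_at}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token, now=None):
    """Return the user_id if the signature is valid and unexpired, else None."""
    try:
        user_id, issued_at, sig = token.rsplit(":", 2)
        issued = int(issued_at)
    except ValueError:
        return None  # malformed token
    expected = _sign(f"{user_id}:{issued_at}")
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    now = time.time() if now is None else now
    if now - issued > TOKEN_TTL_SECONDS:
        return None
    return user_id
```

A token like this proves only that the link was issued by your service for that user and has not expired; making it single-use would additionally require recording redeemed tokens server-side.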
