Soft Delete Pattern — Low-Level Design
The soft delete pattern marks records as deleted without removing them from the database, preserving data for audits, recovery, and referential integrity. This pattern is used widely in production systems and is asked in design interviews at companies where data recovery and compliance matter.
Core Pattern
-- Add deleted_at column to any table
ALTER TABLE Post ADD COLUMN deleted_at TIMESTAMPTZ;
-- null = active, non-null = soft-deleted at that timestamp
-- All queries must filter deleted records
-- Bad: SELECT * FROM Post WHERE id=123
-- Good: SELECT * FROM Post WHERE id=123 AND deleted_at IS NULL
-- Partial index: only index active records (smaller, faster)
CREATE INDEX idx_post_active ON Post(author_id, created_at DESC)
WHERE deleted_at IS NULL;
Soft Delete Operations
def soft_delete(model_class, record_id, deleted_by=None):
rows = db.execute(f"""
UPDATE {model_class.__tablename__}
SET deleted_at=NOW()
WHERE id=%(id)s AND deleted_at IS NULL
RETURNING id
""", {'id': record_id}).rowcount
if rows == 0:
raise NotFound(f'{model_class.__name__} {record_id} not found or already deleted')
# Audit log
if deleted_by:
log_audit_event(
action='soft_delete',
entity_type=model_class.__tablename__,
entity_id=record_id,
actor_id=deleted_by,
)
def restore(model_class, record_id, restored_by=None):
db.execute(f"""
UPDATE {model_class.__tablename__}
SET deleted_at=NULL
WHERE id=%(id)s AND deleted_at IS NOT NULL
""", {'id': record_id})
Cascading Soft Deletes
def soft_delete_user(user_id):
"""Soft-deleting a user cascades to their owned content."""
with db.transaction():
db.execute("UPDATE User SET deleted_at=NOW() WHERE id=%(uid)s", {'uid': user_id})
# Cascade to owned content
db.execute("UPDATE Post SET deleted_at=NOW() WHERE author_id=%(uid)s AND deleted_at IS NULL", {'uid': user_id})
db.execute("UPDATE Comment SET deleted_at=NOW() WHERE author_id=%(uid)s AND deleted_at IS NULL", {'uid': user_id})
# Revoke sessions
db.execute("UPDATE Session SET revoked_at=NOW() WHERE user_id=%(uid)s AND revoked_at IS NULL", {'uid': user_id})
# Note: cascades must be explicit — database FK ON DELETE CASCADE removes rows,
# which defeats the purpose of soft delete.
Preventing Duplicate Unique Constraints
-- Problem: User table has UNIQUE(email)
-- After soft-deleting user@example.com, re-registering fails with unique violation
-- Solution 1: Partial unique index (only active records must be unique)
CREATE UNIQUE INDEX idx_user_email_active
ON User(email)
WHERE deleted_at IS NULL;
-- Now: multiple deleted users can have the same email
-- Only one active user per email is enforced
-- Solution 2: Nullify unique fields on delete (for hard uniqueness)
def soft_delete_user_with_email_release(user_id):
db.execute("""
UPDATE User
SET deleted_at=NOW(), deleted_email=email, email=NULL
WHERE id=%(id)s
""", {'id': user_id})
-- Store original in deleted_email for recovery; clear email for reuse
Hard Delete with Retention Policy
def purge_old_soft_deletes(table, retention_days=90):
"""Permanently delete records soft-deleted more than N days ago."""
deleted_count = db.execute(f"""
DELETE FROM {table}
WHERE deleted_at IS NOT NULL
AND deleted_at < NOW() - INTERVAL '%(days)s days'
""", {'days': retention_days}).rowcount
log.info(f'Purged {deleted_count} records from {table}')
# Run weekly via cron for all soft-delete tables
# 90-day retention satisfies most support recovery windows
# GDPR right-to-erasure: run immediately on explicit deletion request
ORM Integration (Active Record / SQLAlchemy)
# SQLAlchemy: add a query filter mixin for automatic deleted_at IS NULL
from sqlalchemy.orm import Query
class SoftDeleteMixin:
deleted_at = Column(DateTime, nullable=True)
def soft_delete(self, session):
self.deleted_at = datetime.utcnow()
session.add(self)
@classmethod
def active_query(cls, session):
return session.query(cls).filter(cls.deleted_at.is_(None))
# Usage: User.active_query(session).filter_by(email=email).first()
# Always use active_query() — never session.query(User) directly
Key Interview Points
- Partial indexes are essential: A standard index on (author_id, created_at) that includes deleted records is larger and slower. WHERE deleted_at IS NULL on the index excludes deleted rows and halves index size for tables with ~50% deletion.
- Unique constraints need partial indexes too: A UNIQUE constraint on email will prevent re-registration after soft delete. Use a partial unique index with WHERE deleted_at IS NULL to allow the email to be reused.
- Soft delete is not a privacy guarantee: Under GDPR, soft-deleted records still contain personal data. For right-to-erasure requests, either run immediate hard delete or clear PII fields at soft-delete time.
- Cascades must be explicit: Database-level ON DELETE CASCADE will hard-delete child records when the parent is deleted. Soft-delete cascades must be implemented in application code or triggers.
{“@context”:”https://schema.org”,”@type”:”FAQPage”,”mainEntity”:[{“@type”:”Question”,”name”:”What is a partial index and why does soft delete require one?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”A partial index includes only rows that match a WHERE clause. For soft delete: CREATE INDEX ON Post(author_id, created_at DESC) WHERE deleted_at IS NULL. This index contains only active (non-deleted) records. Benefits: (1) Smaller index — if 30% of records are deleted, the partial index is 30% smaller and faster to traverse. (2) Correct WHERE clause — the index condition is automatically applied to queries that include AND deleted_at IS NULL, making the query planner choose the partial index. Without a partial index on large tables with heavy deletion, all queries slow down as the index fills with deleted records.”}},{“@type”:”Question”,”name”:”How does soft delete interact with unique constraints?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”A UNIQUE constraint on email prevents re-registering after a soft-deleted account. Solution: use a partial unique index that only enforces uniqueness on active records: CREATE UNIQUE INDEX ON User(email) WHERE deleted_at IS NULL. Now multiple rows can have the same email as long as at most one has deleted_at IS NULL. Alternative: nullify the unique field on delete (SET email=NULL, store original in deleted_email). The partial index approach is simpler; the nullification approach is better when you need to track the original value and enforce uniqueness across both active and deleted records.”}},{“@type”:”Question”,”name”:”How do you implement a GDPR right-to-erasure request with soft delete?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Soft delete preserves personal data — it does not satisfy GDPR right-to-erasure. For erasure requests: either (1) immediately hard-delete the record, or (2) clear all PII fields at soft-delete time (SET email=NULL, name='[deleted]’, phone=NULL) while retaining the row for referential integrity. Option 2 is better when the record is referenced by FK constraints from other tables (orders, transactions) that must be preserved for accounting. Track erasure requests in a separate GDPRErasureRequest table with status and completed_at for compliance audit.”}},{“@type”:”Question”,”name”:”How do you cascade soft deletes to child records?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Database FK ON DELETE CASCADE hard-deletes children when a parent is deleted — this defeats the purpose of soft delete. Implement cascades in application code or database triggers. In application code: when soft-deleting a User, also soft-delete their Posts, Comments, Sessions (in a transaction). In triggers: CREATE TRIGGER soft_delete_cascade AFTER UPDATE ON User FOR EACH ROW WHEN (OLD.deleted_at IS NULL AND NEW.deleted_at IS NOT NULL) EXECUTE FUNCTION soft_delete_user_children(). Document which tables cascade and which don’t — cascades to all child tables is not always desirable.”}},{“@type”:”Question”,”name”:”How do you purge old soft-deleted records to reclaim disk space?”,”acceptedAnswer”:{“@type”:”Answer”,”text”:”Schedule a background job that runs nightly: DELETE FROM tablename WHERE deleted_at IS NOT NULL AND deleted_at < NOW() – INTERVAL ’90 days’. Run with a LIMIT clause (DELETE WHERE id IN (SELECT id FROM tablename WHERE deleted_at < … LIMIT 1000)) to avoid long-running transactions that cause table bloat and lock contention. Log the count of purged rows. For very large tables: use table partitioning by deleted_at date and DROP PARTITION instead of DELETE — instant and generates no WAL. Set 90-day retention as the default; shorten to immediate for GDPR erasure requests.”}}]}
Soft delete pattern and data management design is discussed in Shopify system design interview questions.
Soft delete and data retention system design is covered in Atlassian system design interview preparation.
Soft delete and audit trail design is discussed in Stripe system design interview guide.