A user profile service stores and serves structured identity data — display name, avatar, bio, contact fields — while enforcing field-level privacy and supporting efficient partial updates. Getting the design right matters because profiles are read on nearly every page load and written on every settings change.
Requirements
Functional
- Store structured and semi-structured profile attributes per user.
- Support field-level visibility controls (public, connections, private).
- Allow partial updates via PATCH without overwriting unchanged fields.
- Expose profile read APIs with viewer-aware field filtering.
- Maintain an audit log of profile changes.
Non-Functional
- Read latency under 20 ms at p99 for cached profiles.
- Writes become visible in all regions within 500 ms.
- Support 500 million user profiles with horizontal scaling.
Data Model
The core table holds the fixed, frequently read fields as typed columns alongside a JSONB column for extensible attributes.
-- PostgreSQL has no inline ENUM(...) column type, so the enum is declared first.
CREATE TYPE visibility_level AS ENUM ('public', 'connections', 'private');

CREATE TABLE profiles (
    user_id      BIGINT PRIMARY KEY,
    display_name VARCHAR(100),
    avatar_url   TEXT,
    bio          TEXT,
    created_at   TIMESTAMPTZ,
    updated_at   TIMESTAMPTZ,
    attributes   JSONB  -- arbitrary key/value pairs
);

CREATE TABLE profile_privacy (
    user_id    BIGINT,
    field_name VARCHAR(100),
    visibility visibility_level,
    PRIMARY KEY (user_id, field_name)
);

CREATE TABLE profile_audit (
    id         BIGSERIAL PRIMARY KEY,
    user_id    BIGINT,
    changed_by BIGINT,
    field_name VARCHAR(100),
    old_value  TEXT,
    new_value  TEXT,
    changed_at TIMESTAMPTZ
);
Sharding by user_id distributes load evenly. A secondary index on display_name supports search. JSONB GIN indexes accelerate attribute queries.
Core Algorithms
Viewer-Aware Field Filtering
On every profile read, the service fetches the full profile and then applies a privacy mask. The algorithm resolves the viewer's relationship to the owner (self, connection, or public) and removes every field whose visibility is more restrictive than that relationship allows. The resolved relationship type is cached in a small in-process LRU keyed by (owner_id, viewer_id), which avoids repeated DB lookups on hot profiles.
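The masking step can be sketched in a few lines of Python. This is a minimal illustration, not the service's actual code: the `Visibility` ordering, the `CONNECTIONS` set standing in for a social-graph lookup, and the fail-closed default for fields without a rule are all assumptions made for the example.

```python
from enum import IntEnum
from functools import lru_cache
from typing import Optional


class Visibility(IntEnum):
    """Ordered so that a higher value means a more restricted field."""
    PUBLIC = 0
    CONNECTIONS = 1
    PRIVATE = 2


# Hypothetical stand-in for a social-graph lookup; a real service would
# query a connections store here.
CONNECTIONS = {(1, 2), (2, 1)}


@lru_cache(maxsize=10_000)
def viewer_level(owner_id: int, viewer_id: Optional[int]) -> Visibility:
    """Return the most restricted tier this viewer is allowed to see."""
    if viewer_id == owner_id:
        return Visibility.PRIVATE
    if (owner_id, viewer_id) in CONNECTIONS:
        return Visibility.CONNECTIONS
    return Visibility.PUBLIC


def filter_profile(profile, rules, owner_id, viewer_id,
                   default=Visibility.PRIVATE):
    """Keep only the fields whose visibility is within the viewer's tier.

    Fields with no explicit rule fall back to `default` (assumed private,
    so an unconfigured field is never leaked).
    """
    level = viewer_level(owner_id, viewer_id)
    return {k: v for k, v in profile.items() if rules.get(k, default) <= level}


profile = {"display_name": "Ada", "bio": "hi", "email": "a@x.io"}
rules = {"display_name": Visibility.PUBLIC,
         "bio": Visibility.CONNECTIONS,
         "email": Visibility.PRIVATE}
as_connection = filter_profile(profile, rules, owner_id=1, viewer_id=2)
```

Because `lru_cache` never expires entries on its own, a real deployment would also need to evict the cached relationship when a connection is added or removed.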
Partial Update (PATCH) Processing
The service accepts a sparse JSON payload. For each key present in the payload, it performs a targeted update rather than a full row replacement. For top-level columns this is a standard SQL UPDATE; for attributes inside the JSONB column it uses PostgreSQL jsonb_set to merge individual keys. A compare-and-swap version check (optimistic locking via updated_at) prevents lost updates under concurrent writes.
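As a rough sketch of that flow, the helper below turns a sparse payload into a single parameterized UPDATE guarded by `updated_at`. The function name `build_patch_sql` and the column whitelist are hypothetical, and the `%s` placeholders assume a driver such as psycopg that adapts Python lists to PostgreSQL arrays for the `jsonb_set` path argument.

```python
import json

# Whitelisted top-level columns from the profiles table; every other key
# is treated as an entry in the attributes JSONB column.
TOP_LEVEL_COLUMNS = ("display_name", "avatar_url", "bio")


def build_patch_sql(user_id, patch, expected_updated_at):
    """Build (sql, params) for a partial update with optimistic locking."""
    if not patch:
        raise ValueError("empty patch")

    clauses, params = [], []
    for column in TOP_LEVEL_COLUMNS:
        if column in patch:
            clauses.append(column + " = %s")  # column name is whitelisted
            params.append(patch[column])

    # Nest one jsonb_set call per attribute key so untouched keys survive.
    # COALESCE guards against a NULL attributes column.
    attrs_expr = "COALESCE(attributes, '{}'::jsonb)"
    attr_params = []
    for key, value in patch.items():
        if key in TOP_LEVEL_COLUMNS:
            continue
        attrs_expr = "jsonb_set(" + attrs_expr + ", %s, %s::jsonb, true)"
        attr_params += [[key], json.dumps(value)]
    if attr_params:
        clauses.append("attributes = " + attrs_expr)
        params += attr_params

    clauses.append("updated_at = now()")
    sql = ("UPDATE profiles SET " + ", ".join(clauses)
           + " WHERE user_id = %s AND updated_at = %s")
    return sql, params + [user_id, expected_updated_at]


sql, params = build_patch_sql(
    7, {"bio": "hello", "pronouns": "she/her"}, "2024-01-01T00:00:00Z")
```

If the statement affects zero rows, another writer changed the profile first; the caller would typically re-read the row and retry, or return a conflict response to the client.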
Cache Invalidation
Profiles are cached in Redis with a TTL of 60 seconds. On any successful write, the service publishes a profile.updated event to a Kafka topic. Cache consumers subscribe and issue targeted deletes. This fan-out pattern keeps read replicas and CDN edge caches coherent without synchronous cross-region writes.
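The interplay between the TTL and event-driven deletes can be illustrated without real infrastructure. In the sketch below, `TTLCache` is an in-memory stand-in for Redis, and `handle_profile_updated` is a hypothetical consumer callback; the event shape and the `profile:{id}` key format are assumptions, not something the design above specifies.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a cache with per-entry expiry."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (self._clock() + self._ttl, value)

    def delete(self, key):
        self._entries.pop(key, None)


def cache_key(user_id):
    return f"profile:{user_id}"


def handle_profile_updated(cache, event):
    """Consumer callback: drop the cached copy named by the event."""
    cache.delete(cache_key(event["user_id"]))


now = [0.0]  # controllable clock so the example is deterministic
cache = TTLCache(60.0, clock=lambda: now[0])
cache.set(cache_key(42), {"bio": "old"})
handle_profile_updated(cache, {"type": "profile.updated", "user_id": 42})
after_event = cache.get(cache_key(42))
```

The TTL acts as a safety net: if an invalidation event is lost or delayed, a stale entry still disappears within 60 seconds.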
API Design
- GET /v1/profiles/{userId}: returns the viewer-filtered profile; 304 on ETag match.
- PATCH /v1/profiles/{userId}: partial update; body contains only the fields to change.
- PUT /v1/profiles/{userId}/privacy: bulk set field visibility rules.
- GET /v1/profiles/{userId}/privacy: owner-only; returns the full privacy map.
- GET /v1/profiles/{userId}/audit: paginated audit trail; admin or owner only.
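One way the 304 behavior on the read endpoint might work is sketched below. The helper names are hypothetical, and the sketch assumes the ETag is derived from the viewer-filtered body, so two viewers with different access never share a validator. It also simplifies `If-None-Match` to a single exact value; the real header may carry a list or `*`.

```python
import hashlib
import json


def compute_etag(filtered_profile):
    """Hash a canonical JSON rendering of the already-filtered profile."""
    body = json.dumps(filtered_profile, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()[:16]
    return '"' + digest + '"'  # ETags are quoted strings


def conditional_get(filtered_profile, if_none_match=None):
    """Return (status, headers, body) for a GET with an optional validator."""
    etag = compute_etag(filtered_profile)
    if if_none_match is not None and if_none_match == etag:
        return 304, {"ETag": etag}, None  # client copy is still current
    return 200, {"ETag": etag}, filtered_profile


status, headers, body = conditional_get({"display_name": "Ada"})
revalidated = conditional_get({"display_name": "Ada"}, headers["ETag"])
```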
Authentication is enforced via a JWT bearer token. The service validates the token, extracts sub as the viewer ID, and passes it into the privacy filter. Rate limiting (100 reads/s per caller) is applied at the API gateway.
Scalability and Fault Tolerance
Reads are served from a read replica pool behind a load balancer. The primary handles all writes. A Redis cluster (3 shards, 3 replicas each) absorbs the read amplification from social feeds that fetch dozens of profiles per request. For cold profiles (not in cache), the service coalesces DB reads using an in-flight deduplication map: concurrent requests for the same profile share a single query.
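The in-flight deduplication map can be sketched with asyncio. `SingleFlight` and `fake_load` are illustrative names, and the loader stands in for whatever async DB query the service actually issues.

```python
import asyncio


class SingleFlight:
    """Coalesce concurrent loads of the same key into one underlying call."""

    def __init__(self, loader):
        self._loader = loader  # async callable: user_id -> profile
        self._in_flight = {}   # user_id -> asyncio.Task

    async def get(self, user_id):
        task = self._in_flight.get(user_id)
        if task is None:
            # First caller for this key starts the load and registers it.
            task = asyncio.create_task(self._loader(user_id))
            self._in_flight[user_id] = task
            task.add_done_callback(
                lambda _t: self._in_flight.pop(user_id, None))
        # shield() keeps one cancelled caller from cancelling the shared load.
        return await asyncio.shield(task)


calls = []


async def fake_load(user_id):
    """Hypothetical slow DB read that records how often it runs."""
    calls.append(user_id)
    await asyncio.sleep(0.01)
    return {"user_id": user_id}


async def demo():
    flight = SingleFlight(fake_load)
    results = await asyncio.gather(*(flight.get(7) for _ in range(5)))
    return flight, results


flight, results = asyncio.run(demo())
```

Five concurrent callers share one load here. The entry is removed as soon as the load finishes, so the map deduplicates only overlapping requests and does not act as a cache.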
The audit log is written asynchronously via an outbox pattern to avoid adding write latency to the hot path. A background worker drains the outbox into the audit table and the Kafka event stream.
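A compact way to see the outbox pattern is with an in-memory SQLite database. The sketch below simplifies the schema and uses hypothetical helpers (`update_bio`, `drain_outbox`, and a `publish` callback standing in for the Kafka producer); it illustrates the transaction boundaries rather than the production implementation.

```python
import json
import sqlite3


def open_db():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE profiles (user_id INTEGER PRIMARY KEY, bio TEXT);
        CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                             payload TEXT NOT NULL);
        CREATE TABLE profile_audit (id INTEGER PRIMARY KEY AUTOINCREMENT,
                                    user_id INTEGER, changed_by INTEGER,
                                    field_name TEXT, old_value TEXT,
                                    new_value TEXT);
    """)
    return db


def update_bio(db, user_id, changed_by, new_bio):
    """Write the profile change and its outbox row in one transaction."""
    with db:  # commits on success, rolls back on exception
        row = db.execute("SELECT bio FROM profiles WHERE user_id = ?",
                         (user_id,)).fetchone()
        old_bio = row[0] if row else None
        db.execute("INSERT OR REPLACE INTO profiles (user_id, bio) "
                   "VALUES (?, ?)", (user_id, new_bio))
        event = {"user_id": user_id, "changed_by": changed_by,
                 "field_name": "bio", "old_value": old_bio,
                 "new_value": new_bio}
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps(event),))


def drain_outbox(db, publish):
    """Worker step: publish each pending event, audit it, then delete it."""
    rows = db.execute("SELECT id, payload FROM outbox ORDER BY id").fetchall()
    for outbox_id, payload in rows:
        event = json.loads(payload)
        publish(event)  # may repeat after a crash, so consumers must dedupe
        with db:
            db.execute(
                "INSERT INTO profile_audit (user_id, changed_by, field_name,"
                " old_value, new_value) VALUES (?, ?, ?, ?, ?)",
                (event["user_id"], event["changed_by"], event["field_name"],
                 event["old_value"], event["new_value"]))
            db.execute("DELETE FROM outbox WHERE id = ?", (outbox_id,))
    return len(rows)


db = open_db()
published = []
update_bio(db, 1, 1, "hello")
update_bio(db, 1, 1, "hello again")
drained = drain_outbox(db, published.append)
```

The request path pays only for one extra insert in the same transaction; the audit row and the event are produced later by the worker, which gives at-least-once delivery.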
Interview Tips
- Clarify whether profile attributes are schema-on-write or schema-on-read before choosing JSONB vs. normalized columns.
- Discuss how privacy rules interact with search indexes — you typically cannot index private fields in a shared search cluster.
- Mention GDPR right-to-erasure: soft-delete plus a background scrubber that zeroes PII fields after a retention window.
- Optimistic locking with updated_at is usually sufficient, but mention CRDTs if asked about offline/mobile edits.