How does a user profile service handle flexible attribute storage?

Flexible attribute storage is typically implemented with an entity-attribute-value (EAV) table or a schemaless JSON column alongside a fixed-schema core table. The core table holds well-known, frequently queried fields (name, email, locale), while a JSON or EAV store absorbs arbitrary user-defined or product-specific attributes without requiring schema migrations for every new field.

How are field-level privacy controls enforced in a profile service?

Field-level privacy is enforced by associating a visibility policy (public, connections-only, private) with each attribute at write time and filtering the response payload at read time based on the requesting principal's relationship to the profile owner. An ACL or policy table maps (owner_id, field_name) to an access level, and the read path applies a projection based on that policy before serializing the response.

What are the semantics of partial PATCH updates for a user profile?

A PATCH request carries only the fields the client wants to change. The service reads the current record, merges the patch deltas, validates the merged result, and persists only the changed columns. Null vs. absent must be distinguished: an explicit null in the payload means clear the field, while an absent key means leave it unchanged. Optimistic locking (e-tag or version counter) prevents lost updates when two clients PATCH concurrently.

What cache invalidation strategy works best for a user profile service?

Write-through or write-around with key-based invalidation is the standard approach. On every successful write the service deletes (or overwrites) the cache entry keyed by user_id so the next read repopulates from the database. For high-fan-out scenarios — where many derived caches depend on a single profile — an event bus (Kafka, SNS) publishes a profile-changed event and downstream consumers invalidate their own caches asynchronously.

User Profile Service Low-Level Design: Storage, Privacy Controls, and Partial Updates

A user profile service stores and serves structured identity data — display name, avatar, bio, contact fields — while enforcing field-level privacy and supporting efficient partial updates. Getting the design right matters because profiles are read on nearly every page load and written on every settings change.

Requirements

Functional

Store structured and semi-structured profile attributes per user.
Support field-level visibility controls (public, connections, private).
Allow partial updates via PATCH without overwriting unchanged fields.
Expose profile read APIs with viewer-aware field filtering.
Maintain an audit log of profile changes.

Non-Functional

Read latency under 20 ms at p99 for cached profiles.
Writes become visible in all regions within 500 ms.
Support 500 million user profiles with horizontal scaling.

Data Model

The core table holds the fixed, frequently queried fields alongside a JSONB column for extensible attributes.

-- PostgreSQL requires a named enum type; the name field_visibility is illustrative.
CREATE TYPE field_visibility AS ENUM ('public', 'connections', 'private');

CREATE TABLE profiles (
  user_id        BIGINT PRIMARY KEY,
  display_name   VARCHAR(100),
  avatar_url     TEXT,
  bio            TEXT,
  created_at     TIMESTAMPTZ,
  updated_at     TIMESTAMPTZ,
  attributes     JSONB          -- arbitrary key/value pairs
);

CREATE TABLE profile_privacy (
  user_id        BIGINT,
  field_name     VARCHAR(100),
  visibility     field_visibility,
  PRIMARY KEY (user_id, field_name)
);

CREATE TABLE profile_audit (
  id             BIGSERIAL PRIMARY KEY,
  user_id        BIGINT,
  changed_by     BIGINT,
  field_name     VARCHAR(100),
  old_value      TEXT,
  new_value      TEXT,
  changed_at     TIMESTAMPTZ
);

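Sharding by user_id distributes load evenly and keeps all of a user's rows on one shard. A secondary index on display_name supports name lookups, but that index is local to each shard, so a search by name must either fan out to every shard or go through a separate search index. A GIN index on the attributes column accelerates containment queries such as attributes @> '{"locale": "en"}'.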

Core Algorithms

Viewer-Aware Field Filtering

On every profile read, the service fetches the full profile and then applies a privacy mask. The algorithm computes the viewer's relationship to the owner (self, connection, public) and removes every field whose visibility is more restrictive than that relationship allows. The resolved relationship type is cached in a small in-process LRU keyed by (owner_id, viewer_id), together with the owner's privacy rules, to avoid repeated DB lookups on hot profiles.
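The filtering step can be sketched as follows. This is a minimal illustration, not the service's actual code: the names Visibility, viewer_level, and filter_profile are hypothetical, the connection set is passed in directly instead of being looked up, and treating a field with no privacy rule as private is an assumption (a fail-closed default).

```python
from __future__ import annotations

from enum import IntEnum


class Visibility(IntEnum):
    """Ordered so that a higher value means a more restricted field."""
    PUBLIC = 0
    CONNECTIONS = 1
    PRIVATE = 2


def viewer_level(owner_id: int, viewer_id: int | None, connections: set[int]) -> Visibility:
    """Return the most restricted visibility level this viewer may see."""
    if viewer_id == owner_id:
        return Visibility.PRIVATE          # the owner sees everything
    if viewer_id is not None and viewer_id in connections:
        return Visibility.CONNECTIONS
    return Visibility.PUBLIC               # anonymous or unrelated viewer


def filter_profile(
    profile: dict,
    privacy: dict[str, Visibility],
    level: Visibility,
    default: Visibility = Visibility.PRIVATE,  # assumption: no rule means private
) -> dict:
    """Keep only the fields whose visibility does not exceed the viewer's level."""
    return {k: v for k, v in profile.items() if privacy.get(k, default) <= level}
```

For example, with privacy = {"display_name": PUBLIC, "bio": CONNECTIONS}, an anonymous viewer receives only display_name, a connection receives display_name and bio, and the owner receives every field.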

Partial Update (PATCH) Processing

The service accepts a sparse JSON payload. For each key present in the payload, it performs a targeted update rather than a full row replacement. For top-level columns this is a standard SQL UPDATE; for attributes inside the JSONB column it uses PostgreSQL's jsonb_set to set individual keys while leaving the other keys intact. A compare-and-swap check prevents lost updates under concurrent writes: the UPDATE includes the updated_at value the client last read in its WHERE clause, and zero affected rows signals a conflict that the client must resolve by re-reading and retrying.
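The merge rules can be sketched in memory as follows. This is an illustration under stated assumptions: Profile, apply_patch, VersionConflict, and CORE_FIELDS are hypothetical names, an integer version counter stands in for the updated_at comparison, and validation and persistence are omitted.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Assumption: these keys map to top-level columns; everything else goes to JSONB.
CORE_FIELDS = {"display_name", "avatar_url", "bio"}


class VersionConflict(Exception):
    """Raised when the client's expected version is stale."""


@dataclass
class Profile:
    user_id: int
    version: int = 1
    core: dict = field(default_factory=dict)
    attributes: dict = field(default_factory=dict)


def apply_patch(profile: Profile, patch: dict, expected_version: int) -> Profile:
    """Apply a sparse patch; absent keys are untouched, explicit None clears."""
    if profile.version != expected_version:
        # Compare-and-swap failed: another writer got there first.
        raise VersionConflict(
            f"expected version {expected_version}, found {profile.version}"
        )
    for key, value in patch.items():
        if key in CORE_FIELDS:
            profile.core[key] = value             # None becomes a NULL column
        elif value is None:
            profile.attributes.pop(key, None)     # explicit null removes the key
        else:
            profile.attributes[key] = value
    profile.version += 1
    return profile
```

A client that receives VersionConflict would re-read the profile, reapply its change, and retry with the new version.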

Cache Invalidation

Profiles are cached in Redis with a TTL of 60 seconds. On any successful write, the service publishes a profile.updated event to a Kafka topic. Cache consumers subscribe and issue targeted deletes. This fan-out pattern keeps regional caches and CDN edge caches coherent without synchronous cross-region writes, and the 60-second TTL bounds staleness if an event is lost or delayed.
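The read and invalidation paths can be sketched with in-memory stand-ins. Everything here is an assumption made for illustration: ProfileCache stands in for Redis, a plain list stands in for the Kafka topic, and db is a dict in place of the database.

```python
import time


class ProfileCache:
    """In-memory stand-in for a TTL cache such as Redis."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # user_id -> (expires_at, profile)

    def get(self, user_id):
        entry = self._entries.get(user_id)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(user_id, None)  # drop expired entries lazily
            return None
        return entry[1]

    def put(self, user_id, profile):
        self._entries[user_id] = (self._clock() + self._ttl, profile)

    def invalidate(self, user_id):
        self._entries.pop(user_id, None)


def read_profile(user_id, db, cache):
    """Serve from cache, falling back to the database on a miss."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    profile = db[user_id]
    cache.put(user_id, profile)
    return profile


def write_profile(user_id, changes, db, events):
    """Persist the change, then publish an invalidation event."""
    db[user_id] = {**db[user_id], **changes}
    events.append({"type": "profile.updated", "user_id": user_id})


def consume_events(events, cache):
    """Cache consumer: issue a targeted delete for each profile.updated event."""
    while events:
        event = events.pop(0)
        if event["type"] == "profile.updated":
            cache.invalidate(event["user_id"])
```

Between the write and the moment the consumer processes the event, a read can still return the old value. That window is the price of asynchronous invalidation.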

API Design

GET /v1/profiles/{userId} — returns viewer-filtered profile; 304 on ETag match.
PATCH /v1/profiles/{userId} — partial update; body contains only fields to change.
PUT /v1/profiles/{userId}/privacy — bulk set field visibility rules.
GET /v1/profiles/{userId}/privacy — owner-only; returns full privacy map.
GET /v1/profiles/{userId}/audit — paginated audit trail; admin or owner only.

Authentication is enforced via a JWT bearer token. The service validates the token, extracts sub as the viewer ID, and passes it into the privacy filter. Rate limiting (100 reads/s per caller) is applied at the API gateway.
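The 304 behaviour on GET can be sketched as follows. This is one possible approach, not necessarily what the service does: the ETag is derived from the viewer-filtered body, so two viewers who see different fields get different validators. The function names are hypothetical, and real If-None-Match handling (lists of tags, weak validators) is simplified to a single exact comparison.

```python
from __future__ import annotations

import hashlib
import json


def compute_etag(filtered_profile: dict) -> str:
    """Hash a canonical JSON encoding of the already-filtered response body."""
    canonical = json.dumps(filtered_profile, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f'"{digest}"'  # ETag values are quoted strings


def get_profile_response(filtered_profile: dict, if_none_match: str | None):
    """Return (status, headers, body); the body is None on a 304."""
    etag = compute_etag(filtered_profile)
    if if_none_match == etag:
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag}, filtered_profile
```

A client stores the ETag from the first 200 response and sends it back in If-None-Match; as long as the filtered body is unchanged, the service answers 304 with no body.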

Scalability and Fault Tolerance

Reads are served from a read replica pool behind a load balancer. The primary handles all writes. A Redis cluster (3 shards, 3 replicas each) absorbs the read amplification from social feeds that fetch dozens of profiles per request. For cold profiles (not in cache), the service coalesces DB reads using an in-flight deduplication map: concurrent requests for the same profile share a single query, and every caller receives its result.
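The in-flight deduplication map can be sketched with asyncio. This is a minimal illustration: the SingleFlight class and its loader callback are hypothetical, and error handling and cancellation are left out.

```python
import asyncio


class SingleFlight:
    """Coalesce concurrent loads of the same key into one underlying call."""

    def __init__(self, loader):
        self._loader = loader   # async callable: user_id -> profile
        self._in_flight = {}    # user_id -> asyncio.Task for the pending load

    async def load(self, user_id):
        existing = self._in_flight.get(user_id)
        if existing is not None:
            # Another request is already fetching this profile: wait for it.
            return await existing
        task = asyncio.create_task(self._loader(user_id))
        self._in_flight[user_id] = task
        try:
            return await task
        finally:
            # Remove the entry so later requests trigger a fresh load.
            self._in_flight.pop(user_id, None)
```

If five requests for the same cold profile arrive while the first query is still running, the loader is called once and all five callers receive the same result.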

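The audit log is written asynchronously via an outbox pattern to avoid adding write latency to the hot path. The write transaction inserts a small outbox row alongside the profile update, so the change and its event commit or roll back together. A background worker then drains the outbox into the audit table and the Kafka event stream.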

Interview Tips

Clarify whether profile attributes are schema-on-write or schema-on-read before choosing JSONB vs. normalized columns.
Discuss how privacy rules interact with search indexes — you typically cannot index private fields in a shared search cluster.
Mention GDPR right-to-erasure: soft-delete plus a background scrubber that zeroes PII fields after a retention window.
Optimistic locking with updated_at is usually sufficient, but mention CRDTs if asked about offline/mobile edits.