Question 1

How do services get notified of config changes in real time?

Accepted Answer

Two approaches are common. (1) Long polling: the service sends a GET request that the server holds open until a config change occurs or a timeout (for example, 30 seconds) is reached. On a change, the server responds immediately; the client applies the update and immediately issues the next long poll. On a timeout, the server returns a "no change" response and the client simply polls again. (2) Pub/Sub: the config service publishes changes to a Kafka topic or Redis channel; services subscribe and typically receive updates in under 1 second. Long polling is simpler to implement; Pub/Sub has lower latency and scales better to many subscribers.
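The long-polling loop can be sketched in a few lines. This is a minimal in-memory illustration, not a production server: ConfigStore, long_poll, and poll_loop are hypothetical names, and a threading.Condition stands in for the blocking HTTP request.

```python
import threading


class ConfigStore:
    """In-memory stand-in for the config server's long-poll endpoint (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0
        self._values = {}

    def publish(self, key, value):
        """Apply a change and wake every blocked poller."""
        with self._cond:
            self._values[key] = value
            self._version += 1
            self._cond.notify_all()

    def long_poll(self, known_version, timeout=30.0):
        """Block until the version moves past known_version or the timeout expires.

        Returns (version, snapshot). An unchanged version means the poll
        timed out with nothing new, and the caller should poll again.
        """
        with self._cond:
            self._cond.wait_for(lambda: self._version > known_version, timeout=timeout)
            return self._version, dict(self._values)


def poll_loop(store, on_update, stop, timeout=30.0):
    """Client side: re-issue the long poll immediately after each response."""
    version = 0
    while not stop.is_set():
        new_version, snapshot = store.long_poll(version, timeout)
        if new_version != version:
            version = new_version
            on_update(snapshot)
```

Tracking a version number on the client is what makes the scheme safe: a change published between two polls is still delivered, because the next poll returns at once when the server's version is already ahead.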

Question 2

How do you store secrets (API keys, DB passwords) securely in a config system?

Accepted Answer

Use envelope encryption: generate a data encryption key (DEK), encrypt the secret with the DEK using AES-256 (ideally in an authenticated mode such as GCM), then encrypt the DEK with a master key held in a KMS (AWS KMS, Google Cloud KMS, or HashiCorp Vault). Store the encrypted secret alongside the encrypted DEK. To decrypt: call the KMS to decrypt the DEK, then decrypt the secret locally. The master key never leaves the KMS. In production, prefer AWS Secrets Manager or HashiCorp Vault: they provide encryption, rotation, and access audit logs, so you do not have to build them yourself.
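The encrypt and decrypt flow might look like the sketch below. It assumes the third-party cryptography package for AES-256-GCM. LocalKms is a hypothetical in-process stand-in for a real KMS, and binding the secret's name as associated data is an extra assumption, not something the answer above requires.

```python
import os
from dataclasses import dataclass

from cryptography.hazmat.primitives.ciphers.aead import AESGCM


class LocalKms:
    """Hypothetical stand-in for a KMS: the master key never leaves this object."""

    def __init__(self):
        self._master = AESGCM(AESGCM.generate_key(bit_length=256))

    def encrypt_dek(self, dek: bytes) -> bytes:
        nonce = os.urandom(12)
        # Prepend the nonce so decrypt_dek can recover it.
        return nonce + self._master.encrypt(nonce, dek, None)

    def decrypt_dek(self, wrapped: bytes) -> bytes:
        return self._master.decrypt(wrapped[:12], wrapped[12:], None)


@dataclass(frozen=True)
class StoredSecret:
    """What the config system persists: no plaintext and no plaintext DEK."""
    wrapped_dek: bytes
    nonce: bytes
    ciphertext: bytes


def encrypt_secret(kms: LocalKms, name: str, plaintext: bytes) -> StoredSecret:
    dek = AESGCM.generate_key(bit_length=256)  # fresh data encryption key
    nonce = os.urandom(12)
    # Assumption: the secret's name is bound as associated data, so a
    # ciphertext cannot be silently moved to a different key path.
    ciphertext = AESGCM(dek).encrypt(nonce, plaintext, name.encode())
    return StoredSecret(kms.encrypt_dek(dek), nonce, ciphertext)


def decrypt_secret(kms: LocalKms, name: str, stored: StoredSecret) -> bytes:
    dek = kms.decrypt_dek(stored.wrapped_dek)  # one KMS call
    return AESGCM(dek).decrypt(stored.nonce, stored.ciphertext, name.encode())
```

With a real KMS, encrypt_dek and decrypt_dek would be network calls, which is why only the small DEK is sent to the KMS while the secret itself is encrypted locally.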

Question 3

How do you roll back a bad configuration change?

Accepted Answer

Store every config change as a new row in an immutable, append-only ConfigVersion table that records the key, the value, and the changed_by, changed_at, and change_note fields. To roll back: find the ConfigVersion entry for the desired previous state, write its value back as a new ConfigVersion entry (never overwrite, so history is preserved), and push the update to all subscribers. This approach maintains a complete audit trail and allows rolling back to any point in history.
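A rollback written as "append the old value again" can be sketched with an in-memory SQLite table. The column names changed_by, changed_at, and change_note come from the answer; the version and key columns, the function names, and the use of SQLite are assumptions for illustration. Pushing the update to subscribers is left out.

```python
import sqlite3
from datetime import datetime, timezone

SCHEMA = """
CREATE TABLE config_version (
    version     INTEGER PRIMARY KEY AUTOINCREMENT,
    key         TEXT NOT NULL,
    value       TEXT NOT NULL,
    changed_by  TEXT NOT NULL,
    changed_at  TEXT NOT NULL,
    change_note TEXT NOT NULL
)
"""


def open_store():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def write_version(conn, key, value, changed_by, note):
    """Append a new row; existing rows are never updated or deleted."""
    cur = conn.execute(
        "INSERT INTO config_version (key, value, changed_by, changed_at, change_note) "
        "VALUES (?, ?, ?, ?, ?)",
        (key, value, changed_by, datetime.now(timezone.utc).isoformat(), note),
    )
    conn.commit()
    return cur.lastrowid


def current_value(conn, key):
    """The live value is simply the newest row for the key."""
    row = conn.execute(
        "SELECT value FROM config_version WHERE key = ? ORDER BY version DESC LIMIT 1",
        (key,),
    ).fetchone()
    return row[0] if row else None


def rollback(conn, key, target_version, changed_by):
    """Re-publish an old value as a brand-new version."""
    row = conn.execute(
        "SELECT value FROM config_version WHERE key = ? AND version = ?",
        (key, target_version),
    ).fetchone()
    if row is None:
        raise KeyError(f"no version {target_version} for {key}")
    return write_version(conn, key, row[0], changed_by, f"rollback to version {target_version}")
```

Because the rollback is itself a recorded change, the audit trail shows who rolled back, when, and to which version, and the rollback can be undone the same way.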

Question 4

How do you handle configuration for multiple environments (dev/staging/prod)?

Accepted Answer

Use a namespace hierarchy: {service}/{environment}/{key}. Example: payment-service/prod/db_url. Each service selects its namespace at startup from an environment variable (APP_ENV=prod). Access control is applied at the namespace level: developers can write to */dev and */staging; only ops or CI/CD can write to */prod. For default values, services define defaults in code and any value set in the config service takes precedence. This prevents missing-config crashes in new environments.
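The lookup order and the namespace-level write check can be sketched as follows. The store is a plain dict standing in for the config service, and WRITE_RULES, can_write, and resolve are hypothetical names; only the {service}/{environment}/{key} layout, APP_ENV, and the role split come from the answer.

```python
import os
from fnmatch import fnmatchcase

# Hypothetical role table: which namespace patterns each role may write to.
WRITE_RULES = {
    "developer": ["*/dev", "*/staging"],
    "ops": ["*/prod"],
    "ci": ["*/prod"],
}


def namespace(service, env):
    return f"{service}/{env}"


def can_write(role, ns):
    """Access control is decided per namespace, not per key."""
    return any(fnmatchcase(ns, pattern) for pattern in WRITE_RULES.get(role, []))


def resolve(store, service, key, defaults, env=None):
    """Return the config-service value if present, else the in-code default."""
    env = env or os.environ.get("APP_ENV", "dev")
    path = f"{namespace(service, env)}/{key}"
    if path in store:
        return store[path]
    # Raises KeyError only when the key has neither an override nor a default.
    return defaults[key]
```

A service deployed to a new environment with an empty namespace still starts, because every key it reads has a default in code.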

Question 5

How does Netflix Archaius or LaunchDarkly handle config at scale?

Accepted Answer

Netflix Archaius uses a polling model: services reload config from a property source (ZooKeeper, DynamoDB, or a custom URL) on a fixed interval, 60 seconds by default, so a change can take up to 60 seconds to propagate. Services cache the config locally and apply updates atomically (CAS) so that concurrent readers never see a partially applied change. LaunchDarkly uses a streaming architecture (SSE/WebSocket) for real-time flag updates with sub-second propagation, plus a local SDK cache: if the connection is lost, the SDK keeps serving the last known values, so flag evaluation keeps working even when LaunchDarkly's service is unreachable.
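Both systems share one idea: read from a local cache and keep the last known values when the source is unreachable. The sketch below illustrates that idea only. It is not the Archaius or LaunchDarkly API, and CachedConfigClient and its fetch callable are hypothetical. It swaps a whole snapshot by reference instead of using compare-and-swap.

```python
import threading


class CachedConfigClient:
    """Polls a source on an interval and serves reads from a local snapshot."""

    def __init__(self, fetch, interval=60.0):
        self._fetch = fetch        # hypothetical callable returning a dict
        self._interval = interval  # seconds between polls
        self._snapshot = {}
        self._stop = threading.Event()

    def refresh(self):
        """Fetch once. On failure, keep the last snapshot and return False."""
        try:
            fresh = dict(self._fetch())
        except Exception:
            return False
        # Rebinding one reference means a reader sees either the old dict
        # or the new one, never a half-updated mix.
        self._snapshot = fresh
        return True

    def get(self, key, default=None):
        return self._snapshot.get(key, default)

    def start(self):
        """Load once, then poll in a background thread until stop() is called."""
        def loop():
            # Event.wait returns False on timeout and True once stop() is called.
            while not self._stop.wait(self._interval):
                self.refresh()

        self.refresh()
        thread = threading.Thread(target=loop, daemon=True)
        thread.start()
        return thread

    def stop(self):
        self._stop.set()
```

Reads never touch the network, so an outage in the config source delays new changes but does not break services that are already running.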

Configuration Management System Low-Level Design

What is a Configuration Management System?

Requirements

Data Model

Read Path: Local Cache + Long Polling

Push Architecture (Alternative)

Secret Management

Key Design Decisions