Rollout Schema
A feature flag with gradual rollout has the following core fields:
{
  flag_id: "checkout-v2",
  rollout_plan: [
    {percentage: 1, duration: "1h"},
    {percentage: 10, duration: "4h"},
    {percentage: 50, duration: "24h"},
    {percentage: 100}
  ],
  metrics_gates: [
    {metric: "error_rate", threshold: 0.01, comparison: "lt"},
    {metric: "p99_latency_ms", threshold: 500, comparison: "lt"}
  ],
  targeting_rules: [],
  current_percentage: 0,
  status: "INACTIVE" | "ROLLING" | "PAUSED" | "COMPLETE" | "ROLLED_BACK"
}
The rollout_plan array defines the staged progression. metrics_gates define the conditions that must hold at each stage before advancing.
Percentage Rollout with Consistent Hashing
Determining which users are in a rollout must be consistent: the same user must always see the same experience for a given percentage. Using a random coin flip per request would cause users to see a feature flicker on and off.
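The standard approach is to compute hash(user_id + flag_id) mod 100 with a stable hash function, which yields a bucket from 0 to 99. If the bucket is less than current_percentage, the flag is enabled for that user. Because each user's bucket is fixed, raising the percentage only adds users; nobody who already has the feature loses it. The flag ID is included in the hash input so that different flags produce independent bucket assignments: a user in the first 10% for one flag is not automatically in the first 10% for every flag.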
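The bucketing rule can be sketched in a few lines. This is a minimal illustration rather than a reference implementation: the function names, the choice of SHA-256, and the ":" separator are assumptions. A cryptographic digest is used because Python's built-in hash() is randomized per process for strings and would not give stable buckets.

```python
import hashlib


def bucket(user_id: str, flag_id: str) -> int:
    """Map a (user, flag) pair to a stable bucket in the range 0..99."""
    # The separator (an assumption) makes ("ab", "c") and ("a", "bc")
    # produce different hash inputs.
    key = f"{flag_id}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce mod 100.
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(user_id: str, flag_id: str, current_percentage: int) -> bool:
    """A user is in the treatment cohort when their bucket is below the percentage."""
    return bucket(user_id, flag_id) < current_percentage
```

Since the bucket depends only on the two IDs, a user enabled at 10% stays enabled at 50% and at 100%.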
Automated Progression
A scheduler runs every few minutes and evaluates each ROLLING flag:
- Check whether the flag has been at the current percentage for at least the current step's duration
- Evaluate all metrics_gates: compare current metric values for the treatment cohort against thresholds
- If both conditions pass: advance current_percentage to the next step in rollout_plan
- If the final step is reached: set status = COMPLETE
- If any metrics gate fails: set status = PAUSED, send alert to on-call
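One scheduler pass over a single flag might look like the sketch below. The class names, the step_index and step_started_at fields, the read_metric and alert callbacks, and the "gt" comparison are all assumptions added for illustration; durations are taken as already parsed into seconds. Gates are checked before the soak time so that a failing gate pauses the flag even mid-step.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional

# "lt" appears in the schema; "gt" is an assumed extension.
COMPARISONS = {"lt": operator.lt, "gt": operator.gt}


@dataclass
class Step:
    percentage: int
    duration_s: Optional[float] = None  # None for the final step


@dataclass
class Gate:
    metric: str
    threshold: float
    comparison: str = "lt"


@dataclass
class Flag:
    flag_id: str
    rollout_plan: list
    metrics_gates: list
    step_index: int = 0           # hypothetical bookkeeping field
    step_started_at: float = 0.0  # hypothetical bookkeeping field
    current_percentage: int = 0
    status: str = "ROLLING"


def gates_pass(flag: Flag, read_metric: Callable[[str, str], float]) -> bool:
    return all(
        COMPARISONS[g.comparison](read_metric(flag.flag_id, g.metric), g.threshold)
        for g in flag.metrics_gates
    )


def evaluate(flag: Flag, now: float, read_metric, alert) -> None:
    """Run one scheduler pass for a single flag."""
    if flag.status != "ROLLING":
        return
    if not gates_pass(flag, read_metric):
        flag.status = "PAUSED"
        alert(flag.flag_id)
        return
    last = len(flag.rollout_plan) - 1
    if flag.step_index >= last:
        flag.status = "COMPLETE"
        return
    step = flag.rollout_plan[flag.step_index]
    if now - flag.step_started_at < step.duration_s:
        return  # the current step has not soaked long enough yet
    flag.step_index += 1
    flag.step_started_at = now
    flag.current_percentage = flag.rollout_plan[flag.step_index].percentage
    if flag.step_index == last:
        flag.status = "COMPLETE"
```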
Metrics Monitoring During Rollout
For each ROLLING flag, the metrics service continuously computes metrics for two cohorts:
- Treatment: users where hash(user_id + flag_id) mod 100 < current_percentage
- Control: all other users
Metrics include: error rate, p99 latency, conversion rate, and any business-specific KPIs defined on the flag. The comparison is treatment vs. control, not treatment vs. a historical baseline — this accounts for time-of-day and seasonal effects. Metrics are pulled from the observability platform (Datadog, Prometheus) via API.
Automated Rollback
If a metric breaches its gate threshold by more than a configurable severity margin (e.g., the error rate exceeds 3x the gate threshold rather than merely crossing it), automatic rollback triggers immediately, without waiting for the next scheduler cycle. The flag's current_percentage is set to 0 and status is set to ROLLED_BACK. Automated rollback is limited to flags where it is explicitly enabled, because some features have side effects (database migrations, email sends) where rollback requires human judgment.
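A rough sketch of the severity check follows. The auto_rollback_enabled field, the severity_multiplier parameter, and the function name are hypothetical, and the check only covers "lt" gates, where a higher value is worse.

```python
def maybe_roll_back(flag: dict, metric_values: dict,
                    severity_multiplier: float = 3.0) -> bool:
    """Roll the flag back if any 'lt' gate is exceeded by the severity multiplier.

    Returns True when a rollback was performed.
    """
    # Opt-in only: flags with risky side effects keep a human in the loop.
    if not flag.get("auto_rollback_enabled", False):
        return False
    for gate in flag["metrics_gates"]:
        if gate.get("comparison", "lt") != "lt":
            continue  # other comparison types are out of scope for this sketch
        value = metric_values.get(gate["metric"])
        if value is not None and value > gate["threshold"] * severity_multiplier:
            flag["current_percentage"] = 0
            flag["status"] = "ROLLED_BACK"
            return True
    return False
```

With a gate threshold of 0.01 and the default multiplier, an error rate of 0.02 fails the gate but does not trigger rollback, while 0.05 does.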
Manual Controls
Operators can intervene at any time via API or dashboard:
- Pause: stop automatic progression; current percentage holds; metrics monitoring continues
- Resume: restart progression from current percentage
- Override percentage: manually set current_percentage to any value
- Rollback: set current_percentage = 0, status = ROLLED_BACK
- Force complete: set current_percentage = 100, skip remaining gates
Targeting Rules
Before percentage-based evaluation, targeting rules filter which users are eligible for the rollout at all:
- Internal employees (by email domain) — always first
- Beta user segment (opt-in list)
- Geographic region (by IP geolocation or user profile country)
- All users (the final stage of a typical rollout sequence)
Targeting rules are evaluated as a priority-ordered list; the first matching rule determines eligibility. Users who are not in the eligible segment are always in the control group.
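As an illustration, eligibility can be modeled as an ordered list of named predicates checked before the percentage bucket. The user fields (email, segments, country), the rule names, and the in_rollout_bucket callback are assumptions made for this sketch.

```python
def email_domain(domain):
    """Rule: user's email belongs to the given domain (e.g. internal employees)."""
    return lambda user: user.get("email", "").endswith("@" + domain)


def in_segment(segment):
    """Rule: user is on an opt-in list such as a beta segment."""
    return lambda user: segment in user.get("segments", ())


def in_country(*countries):
    """Rule: user's profile country is one of the given codes."""
    return lambda user: user.get("country") in countries


def first_matching_rule(user, rules):
    """Return the name of the first rule that matches, or None if none do."""
    for name, predicate in rules:
        if predicate(user):
            return name
    return None


def evaluate_flag(user, rules, in_rollout_bucket):
    """Targeting first, then percentage bucketing for eligible users only."""
    if first_matching_rule(user, rules) is None:
        return False  # ineligible users always stay in the control group
    return in_rollout_bucket(user)
```

Here in_rollout_bucket stands in for the percentage check, so the two concerns stay separate.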
Flag Evaluation SDK
Application code evaluates flags via a client SDK. The SDK downloads the full flag configuration from the flag service on startup and caches it locally. Flag evaluation is local: there is no network call per flag check. The config is refreshed in the background, either by polling (e.g., every 30 seconds) or by updates pushed over server-sent events. Local evaluation means flag checks add negligible latency (an in-memory lookup plus a hash) and keep working if the flag service is temporarily unavailable, using the last cached config.
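The caching behavior could be sketched as follows, using polling only. FlagClient, fetch_config, and the method names are hypothetical and do not correspond to any particular SDK; fetch_config is assumed to return a dict keyed by flag ID.

```python
import threading


class FlagClient:
    """Keeps a local copy of the flag config and refreshes it in the background."""

    def __init__(self, fetch_config, refresh_interval_s: float = 30.0):
        self._fetch_config = fetch_config
        self._refresh_interval_s = refresh_interval_s
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._config = fetch_config()  # initial download at startup

    def refresh_once(self) -> bool:
        """Try to replace the cached config; keep the old one on failure."""
        try:
            new_config = self._fetch_config()
        except Exception:
            return False  # flag service unavailable: serve the last cached config
        with self._lock:
            self._config = new_config
        return True

    def start(self) -> None:
        def loop():
            # Event.wait returns False on timeout, True once stop() is called.
            while not self._stop.wait(self._refresh_interval_s):
                self.refresh_once()

        threading.Thread(target=loop, daemon=True).start()

    def stop(self) -> None:
        self._stop.set()

    def current_percentage(self, flag_id: str) -> int:
        """Local lookup only; unknown flags are treated as 0%."""
        with self._lock:
            flag = self._config.get(flag_id)
        return flag["current_percentage"] if flag else 0
```

Swapping the whole config under a lock keeps each lookup consistent with a single snapshot.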
Flag Cleanup
A flag at 100% completion with no rollback plan should be removed from the codebase. The system tracks which flags have been at 100% for more than N days and surfaces them in a cleanup dashboard. Long-lived flags accrete as dead code, inflate SDK config size, and create confusion about what is the current behavior. Teams should treat flag cleanup as part of the feature delivery process, not an afterthought.
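The query behind a cleanup dashboard might be as simple as the sketch below. The completed_at field and the function name are assumptions; the schema above does not define when a flag reached 100%.

```python
from datetime import datetime, timedelta, timezone


def stale_flags(flags, now: datetime, max_age_days: int):
    """Return IDs of COMPLETE flags that have been at 100% for more than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        f["flag_id"]
        for f in flags
        # completed_at is a hypothetical timestamp recorded on reaching 100%.
        if f["status"] == "COMPLETE" and f["completed_at"] < cutoff
    ]
```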