Low-Level Design: IT Ticketing System — Ticket Lifecycle, SLA Tracking, and Assignment (2025)

Core Entities

Ticket: ticket_id, title, description, status (OPEN, IN_PROGRESS, PENDING, RESOLVED, CLOSED), priority (P1/P2/P3/P4), category (BUG, FEATURE, INCIDENT, QUESTION, CHANGE), reporter_id, assignee_id, team_id, created_at, updated_at, first_response_at, resolved_at, due_at (resolution SLA deadline).

TicketComment: comment_id, ticket_id, author_id, body, is_internal (bool, hidden from reporter), created_at.

TicketHistory: history_id, ticket_id, field_changed, old_value, new_value, changed_by, changed_at. This is an immutable audit log.

SLAPolicy: policy_id, name, priority, response_time_minutes (time to first response), resolution_time_minutes, business_hours_only (bool).

SLABreach: breach_id, ticket_id, breach_type (RESPONSE/RESOLUTION), breached_at, notified_at.

Team: team_id, name, queue_strategy (ROUND_ROBIN/LOAD_BASED/SKILL_BASED).

Agent: agent_id, team_id, name, skills (text array), current_load (open ticket count), is_available.
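
The is_internal flag on TicketComment implies a visibility rule: reporters see only public comments, while agents see everything. A minimal sketch of that rule, assuming an in-memory dataclass rather than a database table; the helper name visible_comments and its viewer_is_agent parameter are hypothetical and not part of the original design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TicketComment:
    comment_id: int
    ticket_id: int
    author_id: int
    body: str
    is_internal: bool  # True = agent-only note, hidden from the reporter
    created_at: datetime


def visible_comments(comments, viewer_is_agent):
    """Return the comments a viewer may see (hypothetical helper).

    Agents see every comment; reporters see only non-internal ones.
    """
    if viewer_is_agent:
        return list(comments)
    return [c for c in comments if not c.is_internal]


# Sample thread: one public reply and one internal note.
now = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
thread = [
    TicketComment(1, 42, 7, "We are looking into it.", False, now),
    TicketComment(2, 42, 8, "Suspect the VPN gateway; do not share yet.", True, now),
]

reporter_view = visible_comments(thread, viewer_is_agent=False)
agent_view = visible_comments(thread, viewer_is_agent=True)
```

In a real system the same filter would more likely live in the query (WHERE is_internal = false for reporter views) so that internal notes never leave the database for a reporter request.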

Ticket State Machine

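from datetime import datetime
from enum import Enum


class InvalidTransition(Exception):
    """Raised when a requested status change is not in VALID_TRANSITIONS."""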

class TicketStatus(Enum):
    OPEN = "OPEN"
    IN_PROGRESS = "IN_PROGRESS"
    PENDING = "PENDING"       # waiting on reporter/external
    RESOLVED = "RESOLVED"     # agent believes issue is fixed
    CLOSED = "CLOSED"         # confirmed resolved + feedback collected

VALID_TRANSITIONS = {
    TicketStatus.OPEN:        {TicketStatus.IN_PROGRESS, TicketStatus.CLOSED},
    TicketStatus.IN_PROGRESS: {TicketStatus.PENDING, TicketStatus.RESOLVED},
    TicketStatus.PENDING:     {TicketStatus.IN_PROGRESS, TicketStatus.RESOLVED},
    TicketStatus.RESOLVED:    {TicketStatus.CLOSED, TicketStatus.IN_PROGRESS},  # reopen
    TicketStatus.CLOSED:      {TicketStatus.OPEN},  # reopen
}

class TicketService:
    def transition(self, ticket_id: int, new_status: TicketStatus,
                   actor_id: int, comment: str = None):
        ticket = self.db.get_ticket_for_update(ticket_id)
        current = TicketStatus(ticket.status)

        if new_status not in VALID_TRANSITIONS[current]:
            raise InvalidTransition(f"{current} -> {new_status} not allowed")

        now = datetime.utcnow()
        updates = {"status": new_status.value, "updated_at": now}

        if new_status == TicketStatus.RESOLVED:
            updates["resolved_at"] = now
        if new_status == TicketStatus.IN_PROGRESS and not ticket.first_response_at:
            updates["first_response_at"] = now
            self._check_response_sla(ticket, now)

        self.db.update_ticket(ticket_id, updates)
        self._log_history(ticket_id, "status", ticket.status,
                          new_status.value, actor_id)
        if comment:
            self.add_comment(ticket_id, actor_id, comment)
        self._send_notification(ticket, new_status)
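
A transition table like VALID_TRANSITIONS is easy to get subtly wrong, for example by leaving a status that can never be reached or never be left. The sketch below repeats the table so that it runs standalone and adds two helpers, can_transition and reachable_from, which are illustrative names and not part of the original TicketService.

```python
from collections import deque
from enum import Enum


class TicketStatus(Enum):
    OPEN = "OPEN"
    IN_PROGRESS = "IN_PROGRESS"
    PENDING = "PENDING"
    RESOLVED = "RESOLVED"
    CLOSED = "CLOSED"


# Same table as VALID_TRANSITIONS above, repeated so the sketch is self-contained.
VALID_TRANSITIONS = {
    TicketStatus.OPEN:        {TicketStatus.IN_PROGRESS, TicketStatus.CLOSED},
    TicketStatus.IN_PROGRESS: {TicketStatus.PENDING, TicketStatus.RESOLVED},
    TicketStatus.PENDING:     {TicketStatus.IN_PROGRESS, TicketStatus.RESOLVED},
    TicketStatus.RESOLVED:    {TicketStatus.CLOSED, TicketStatus.IN_PROGRESS},
    TicketStatus.CLOSED:      {TicketStatus.OPEN},
}


def can_transition(current, new_status):
    """True if the table allows moving from current to new_status."""
    return new_status in VALID_TRANSITIONS.get(current, set())


def reachable_from(start):
    """Breadth-first search over the table: every status reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        status = queue.popleft()
        for nxt in VALID_TRANSITIONS.get(status, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Sanity checks one might run in a unit test.
all_reachable = reachable_from(TicketStatus.OPEN) == set(TicketStatus)
no_dead_ends = all(VALID_TRANSITIONS.get(s) for s in TicketStatus)
```

With the table as given, every status is reachable from OPEN and none is a dead end, since CLOSED can be reopened. Note that OPEN cannot go straight to RESOLVED; an agent has to pick the ticket up first.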

SLA Tracking and Breach Detection

from datetime import datetime, timedelta

class SLAService:
    def compute_due_at(self, ticket: Ticket, policy: SLAPolicy) -> datetime:
        minutes = policy.resolution_time_minutes
        if not policy.business_hours_only:
            return ticket.created_at + timedelta(minutes=minutes)

        # Business hours: 9am-6pm Mon-Fri, skip weekends/holidays
        remaining = minutes
        current = ticket.created_at
        while remaining > 0:
            if self._is_business_hour(current):
                remaining -= 1
            current += timedelta(minutes=1)
        return current

    def check_breaches(self):
        # Called by a background job every minute
        now = datetime.utcnow()

        # Response SLA breaches: no first_response_at and past response deadline
        breached_response = self.db.query(
            "SELECT t.* FROM tickets t "
            "JOIN sla_policies p ON t.priority = p.priority "
            "WHERE t.first_response_at IS NULL "
            "AND t.status NOT IN ('RESOLVED', 'CLOSED') "
            "AND t.created_at + INTERVAL '1 minute' * p.response_time_minutes < %s "
            "AND NOT EXISTS (SELECT 1 FROM sla_breaches "
            "  WHERE ticket_id = t.ticket_id AND breach_type = 'RESPONSE')",
            now
        )
        for ticket in breached_response:
            self._record_breach(ticket.ticket_id, "RESPONSE", now)
            self._notify_breach(ticket, "RESPONSE")

        # Resolution SLA breaches: past due_at and not resolved
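        breached_resolution = self.db.query(
            "SELECT t.* FROM tickets t WHERE t.due_at < %s "
            "AND t.status NOT IN ('RESOLVED', 'CLOSED') "
            # Qualify both sides: an unqualified "ticket_id = ticket_id" compares
            # sla_breaches.ticket_id with itself and is always true.
            "AND NOT EXISTS (SELECT 1 FROM sla_breaches b "
            "  WHERE b.ticket_id = t.ticket_id AND b.breach_type = 'RESOLUTION')",
            now
        )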
        for ticket in breached_resolution:
            self._record_breach(ticket.ticket_id, "RESOLUTION", now)
            self._notify_breach(ticket, "RESOLUTION")
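
compute_due_at relies on an _is_business_hour helper that the listing does not define. A minimal sketch of one possible version follows, assuming naive datetimes already expressed in the team's working time zone and a hard-coded holiday set. The names is_business_hour, add_business_minutes, and HOLIDAYS are illustrative, and the single holiday entry is sample data only.

```python
from datetime import date, datetime, time, timedelta

BUSINESS_START = time(9, 0)    # 9am, inclusive
BUSINESS_END = time(18, 0)     # 6pm, exclusive
HOLIDAYS = {date(2025, 1, 1)}  # hypothetical holiday calendar


def is_business_hour(moment):
    """True if moment falls inside 9am-6pm on a non-holiday weekday."""
    if moment.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return False
    if moment.date() in HOLIDAYS:
        return False
    return BUSINESS_START <= moment.time() < BUSINESS_END


def add_business_minutes(start, minutes):
    """Same minute-stepping loop as compute_due_at, as a free function."""
    remaining = minutes
    current = start
    while remaining > 0:
        if is_business_hour(current):
            remaining -= 1
        current += timedelta(minutes=1)
    return current


# A ticket opened Friday 17:30 with a 60-minute SLA: 30 minutes are consumed
# on Friday, the weekend is skipped, and the rest runs from Monday 09:00.
due = add_business_minutes(datetime(2025, 1, 3, 17, 30), 60)
```

The loop costs one iteration per elapsed minute, which is acceptable for SLAs measured in hours or days. For very long SLAs, stepping a whole business day at a time would be cheaper.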

Ticket Assignment Strategies

Assignment strategies by queue_strategy:

Round Robin: maintain a per-team cursor pointing to the last assigned agent. On a new ticket, assign to the next available agent in rotation. Simple, fair distribution.

Load-Based: assign to the available agent with the lowest current_load (open ticket count). Query: SELECT agent_id FROM agents WHERE team_id=? AND is_available=true ORDER BY current_load ASC LIMIT 1. Increment current_load on assign, decrement on resolve/close.

Skill-Based: match ticket tags/category to the agent.skills array. Query: SELECT agent_id FROM agents WHERE team_id=? AND is_available=true AND skills @> ARRAY['networking', 'linux'] ORDER BY current_load ASC LIMIT 1. If there is no skill match, fall back to load-based.

Auto-escalation: a background job promotes P2 tickets to P1 if they remain unassigned for 30 minutes. P1 tickets unassigned for 15 minutes trigger an on-call page via a PagerDuty webhook.
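
The three strategies can be sketched in memory as follows. This is an illustration under assumptions, not the production path: a real system would run the SQL queries shown above inside a transaction. The Agent dataclass mirrors only the fields needed here, and the function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    agent_id: int
    skills: set = field(default_factory=set)
    current_load: int = 0
    is_available: bool = True


def assign_load_based(agents):
    """Pick the available agent with the fewest open tickets, or None."""
    available = [a for a in agents if a.is_available]
    if not available:
        return None
    # Tie-break on agent_id so the choice is deterministic.
    return min(available, key=lambda a: (a.current_load, a.agent_id))


def assign_skill_based(agents, required_skills):
    """Prefer agents holding every required skill; else fall back to load-based."""
    skilled = [a for a in agents if required_skills <= a.skills]
    return assign_load_based(skilled) or assign_load_based(agents)


def assign_round_robin(agents, cursor):
    """Return (agent, new_cursor); cursor is the index of the last assigned agent."""
    n = len(agents)
    for step in range(1, n + 1):
        idx = (cursor + step) % n
        if agents[idx].is_available:
            return agents[idx], idx
    return None, cursor  # nobody available: leave the cursor where it was


team = [
    Agent(1, {"networking", "linux"}, current_load=3),
    Agent(2, {"windows"}, current_load=1),
    Agent(3, {"linux", "networking", "db"}, current_load=2),
    Agent(4, {"linux"}, current_load=0, is_available=False),
]
```

The sketch does not update current_load. As described above, the caller would increment it on assignment and decrement it on resolve or close, ideally in the same transaction that sets assignee_id so that two concurrent assignments cannot both pick the same least-loaded agent.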


Frequently Asked Questions

How do you implement a ticket status state machine in a ticketing system?

Define allowed transitions as a dictionary mapping each status to its valid next statuses. On every status change, validate the transition before applying it: throw InvalidTransition if the target status is not in the allowed set. Log every transition to an immutable TicketHistory table with old_value, new_value, changed_by, and changed_at. This provides a full audit trail. Side effects (setting resolved_at, recording first_response_at, sending notifications) are triggered based on the new status within the same transaction.

How do you track SLA deadlines with business hours?

Store an SLAPolicy per priority with response_time_minutes and resolution_time_minutes, and a business_hours_only flag. When a ticket is created, compute due_at by adding the SLA minutes to created_at, counting only business hours (9am-6pm Mon-Fri) if the flag is set. Store due_at on the ticket for fast breach queries. A background job runs every minute, querying for tickets where due_at < NOW() and status is not RESOLVED/CLOSED and no breach record exists yet, then records the breach and sends notifications.

What assignment strategies should a ticketing system support?

Three common strategies: Round-robin maintains a cursor per team and cycles through available agents; it is fair but ignores load. Load-based assigns to the agent with the fewest open tickets (lowest current_load), which gives better utilization. Skill-based matches ticket category/tags to agent skill arrays, falling back to load-based if no skill match exists. Teams choose their strategy. Auto-escalation is a cross-cutting concern: a background job promotes unassigned high-priority tickets after a timeout and triggers on-call pages for critical unassigned tickets.

How do you handle ticket comments with internal notes?

Add an is_internal boolean to the TicketComment table. Internal comments (is_internal=true) are visible only to agents, not to the ticket reporter. When fetching comments for display to a reporter, filter WHERE is_internal = false. Agent views include all comments. Internal notes are used for team coordination, escalation notes, and debugging information that should not be shared with the customer. Audit logging: all comments (internal and external) are preserved and visible to admins regardless of the is_internal flag.

How do you implement ticket history and audit logging?

Use an append-only TicketHistory table: ticket_id, field_changed (string name of the field), old_value, new_value, changed_by (user_id), changed_at. On every update to a ticket field, insert a history record; never update it. This creates a complete audit trail of who changed what and when. For the status field, history shows the full state machine path. For reassignment, history shows the chain of assignees. Never modify history records; if you need to annotate, add a note_id FK to a separate notes table.
