Low-Level Design: Survey Builder — Dynamic Forms, Response Collection, and Analytics

Core Entities

Survey: survey_id, creator_id, title, description, status (DRAFT, ACTIVE, PAUSED, CLOSED), settings (JSONB: allow_anonymous, one_response_per_user, show_progress_bar, randomize_questions, response_limit), start_date, end_date, created_at, updated_at.

Question: question_id, survey_id, order_index, type (TEXT, PARAGRAPH, SINGLE_CHOICE, MULTIPLE_CHOICE, RATING, SCALE, DATE, EMAIL, FILE_UPLOAD, MATRIX), is_required, question_text, description, validation_rules (JSONB: min_length, max_length, min_value, max_value, regex_pattern), display_logic (JSONB: show only if previous answer = X), created_at.

QuestionOption: option_id, question_id, option_text, order_index, value (for scoring).

Response: response_id, survey_id, respondent_id (NULL if anonymous), session_token, status (IN_PROGRESS, SUBMITTED, DISQUALIFIED), started_at, submitted_at, ip_address, user_agent, completion_time_seconds.

Answer: answer_id, response_id, question_id, text_value (for text questions), selected_options (int array), rating_value (int), date_value (date), file_key (S3 key for file uploads), answered_at.
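The entities above can be sketched as in-memory models. This is a minimal illustration, assuming Python dataclasses; field types are assumptions, and only the fields used by the engine and validator below are shown:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class Question:
    question_id: int
    survey_id: int
    order_index: int
    type: str                       # TEXT, SINGLE_CHOICE, RATING, ...
    is_required: bool = False
    question_text: str = ""
    validation_rules: dict = field(default_factory=dict)
    display_logic: Optional[dict] = None

@dataclass
class Answer:
    answer_id: int
    response_id: int
    question_id: int
    # Polymorphic value columns: only the column matching the
    # question type is populated for any given answer.
    text_value: Optional[str] = None
    selected_options: Optional[list[int]] = None
    rating_value: Optional[int] = None
    date_value: Optional[date] = None
    file_key: Optional[str] = None

a = Answer(answer_id=1, response_id=10, question_id=5, rating_value=4)
```

Keeping all value columns on one Answer type mirrors the single-table design: one row per answered question, with the unused columns left NULL.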

Dynamic Question Logic and Branching

from typing import Optional

class SurveyEngine:
    def get_next_question(self, survey_id: int, response_id: int,
                          current_question_id: Optional[int]) -> Optional[Question]:
        response = self.db.get_response(response_id)
        answers_so_far = self.db.get_answers(response_id)
        answered_ids = {a.question_id for a in answers_so_far}

        # Get all questions ordered by order_index
        questions = self.db.get_questions(survey_id)

        for q in questions:
            if q.question_id in answered_ids:
                continue
            # Check display logic: should this question be shown?
            if q.display_logic and not self._evaluate_logic(
                q.display_logic, answers_so_far
            ):
                continue  # skip this question (condition not met)
            return q  # next unanswered visible question
        return None  # survey complete

    def _evaluate_logic(self, logic: dict,
                        answers: list[Answer]) -> bool:
        # logic = {"question_id": 5, "operator": "equals", "value": "3"}
        dep_answer = next(
            (a for a in answers if a.question_id == logic["question_id"]),
            None
        )
        if not dep_answer:
            return False
        if logic["operator"] == "equals":
            return str(dep_answer.rating_value or
                       dep_answer.text_value) == str(logic["value"])
        if logic["operator"] == "in":
            return logic["value"] in (dep_answer.selected_options or [])
        return True
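The branching evaluation above can be exercised standalone. A minimal sketch, assuming a stub Answer type and sample rules for illustration; `evaluate_logic` mirrors the `_evaluate_logic` method:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Answer:
    question_id: int
    text_value: Optional[str] = None
    rating_value: Optional[int] = None
    selected_options: Optional[list] = None

def evaluate_logic(logic: dict, answers: list) -> bool:
    # Find the answer this question's visibility depends on.
    dep = next((a for a in answers if a.question_id == logic["question_id"]), None)
    if dep is None:
        return False  # dependency unanswered -> hide the question
    if logic["operator"] == "equals":
        return str(dep.rating_value or dep.text_value) == str(logic["value"])
    if logic["operator"] == "in":
        return logic["value"] in (dep.selected_options or [])
    return True  # unknown operator: show by default

answers = [Answer(question_id=5, rating_value=3)]
# Show a follow-up only if question 5 was rated 3:
evaluate_logic({"question_id": 5, "operator": "equals", "value": "3"}, answers)  # -> True
evaluate_logic({"question_id": 5, "operator": "equals", "value": "4"}, answers)  # -> False
```

Note that an unmet condition skips the question entirely rather than failing the response, which is what makes skip logic compose with the ordered iteration in `get_next_question`.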

Response Validation

class ResponseValidator:
    def validate_answer(self, question: Question,
                        answer: dict) -> list[str]:
        errors = []
        rules = question.validation_rules or {}

        if question.is_required and not self._has_value(answer):
            errors.append("This question is required")
            return errors

        if question.type == "TEXT":
            val = answer.get("text_value", "")
            if "min_length" in rules and len(val) < rules["min_length"]:
                errors.append(f"Minimum {rules['min_length']} characters required")
            if "max_length" in rules and len(val) > rules["max_length"]:
                errors.append(f"Maximum {rules['max_length']} characters allowed")
            if "regex" in rules:
                import re
                if not re.match(rules["regex"], val):
                    errors.append(rules.get("regex_message", "Invalid format"))

        elif question.type == "RATING":
            val = answer.get("rating_value", 0)
            min_v = rules.get("min_value", 1)
            max_v = rules.get("max_value", 5)
            if not (min_v <= val <= max_v):
                errors.append(f"Rating must be between {min_v} and {max_v}")

        elif question.type == "MULTIPLE_CHOICE":
            selected = answer.get("selected_options", [])
            max_sel = rules.get("max_selections")
            if max_sel is not None and len(selected) > max_sel:
                errors.append(f"Select at most {max_sel} options")

        return errors

    def _has_value(self, answer: dict) -> bool:
        # A value exists if any polymorphic value column is populated.
        return any(
            answer.get(k) not in (None, "", [])
            for k in ("text_value", "selected_options",
                      "rating_value", "date_value", "file_key")
        )
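The TEXT branch of the validator can be demonstrated in isolation. A self-contained sketch, assuming a stub Question type and sample rule values chosen for illustration:

```python
import re
from dataclasses import dataclass, field

@dataclass
class Question:
    type: str
    is_required: bool = False
    validation_rules: dict = field(default_factory=dict)

def validate_text(question: Question, answer: dict) -> list:
    # Mirrors the TEXT branch of ResponseValidator.validate_answer.
    errors = []
    rules = question.validation_rules or {}
    val = answer.get("text_value", "")
    if question.is_required and not val:
        return ["This question is required"]
    if "min_length" in rules and len(val) < rules["min_length"]:
        errors.append(f"Minimum {rules['min_length']} characters required")
    if "max_length" in rules and len(val) > rules["max_length"]:
        errors.append(f"Maximum {rules['max_length']} characters allowed")
    if "regex" in rules and not re.match(rules["regex"], val):
        errors.append(rules.get("regex_message", "Invalid format"))
    return errors

q = Question(type="TEXT", is_required=True,
             validation_rules={"min_length": 3, "regex": r"^[A-Za-z ]+$"})
validate_text(q, {"text_value": "hi"})     # -> ["Minimum 3 characters required"]
validate_text(q, {"text_value": "hello"})  # -> []
```

Returning a list of error strings (rather than raising on the first failure) lets the client surface all problems with an answer in one round trip.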

Analytics and Aggregation

Survey analytics: for each question, compute the response distribution.

- Single/multiple choice: count responses per option, compute percentages.
- Rating/scale: compute mean, median, standard deviation, and a distribution histogram.
- Text: store raw responses; optionally run sentiment analysis or word frequency as a batch job.

Aggregation approach:

- Real-time (small surveys): compute on-the-fly from the answers table on each analytics request. Simple SQL: SELECT selected_options, COUNT(*) FROM answers WHERE question_id = ? GROUP BY selected_options.
- Pre-aggregated (large surveys): maintain a QuestionAggregate table (question_id, option_id, count) updated on each submitted response. Analytics reads the aggregate table (O(options) per question) instead of scanning all answers.
- Invalidation: on response edit (if allowed), decrement old counts and increment new counts in the same transaction as the answer update.
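The pre-aggregation and invalidation steps can be sketched with SQLite standing in for the production database; table and column names follow the QuestionAggregate description above, and the upsert syntax is an assumption (SQLite/PostgreSQL-style ON CONFLICT):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE question_aggregate (
    question_id INTEGER,
    option_id   INTEGER,
    count       INTEGER,
    PRIMARY KEY (question_id, option_id))""")

def record_choice(question_id: int, option_id: int) -> None:
    # Upsert: insert a row with count=1, or bump the existing count.
    conn.execute("""INSERT INTO question_aggregate VALUES (?, ?, 1)
        ON CONFLICT(question_id, option_id)
        DO UPDATE SET count = count + 1""", (question_id, option_id))

def edit_choice(question_id: int, old_option: int, new_option: int) -> None:
    # Invalidation on edit: decrement the old count and increment the
    # new one inside a single transaction, so readers never see a
    # partially updated distribution.
    with conn:
        conn.execute("""UPDATE question_aggregate SET count = count - 1
            WHERE question_id = ? AND option_id = ?""",
            (question_id, old_option))
        record_choice(question_id, new_option)

record_choice(7, 1)
record_choice(7, 1)
record_choice(7, 2)
edit_choice(7, old_option=1, new_option=2)
rows = dict(conn.execute(
    "SELECT option_id, count FROM question_aggregate WHERE question_id = 7"))
# rows -> {1: 1, 2: 2}
```

Analytics then reads one row per option instead of scanning every answer, which is what keeps the read path O(options) regardless of response volume.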


