Low-Level Design: Survey Builder — Dynamic Forms, Response Collection, and Analytics

Core Entities

Survey: survey_id, creator_id, title, description, status (DRAFT, ACTIVE, PAUSED, CLOSED), settings (JSONB: allow_anonymous, one_response_per_user, show_progress_bar, randomize_questions, response_limit), start_date, end_date, created_at, updated_at.

Question: question_id, survey_id, order_index, type (TEXT, PARAGRAPH, SINGLE_CHOICE, MULTIPLE_CHOICE, RATING, SCALE, DATE, EMAIL, FILE_UPLOAD, MATRIX), is_required, question_text, description, validation_rules (JSONB: min_length, max_length, min_value, max_value, regex_pattern), display_logic (JSONB: show only if previous answer = X), created_at.

QuestionOption: option_id, question_id, option_text, order_index, value (for scoring).

Response: response_id, survey_id, respondent_id (NULL if anonymous), session_token, status (IN_PROGRESS, SUBMITTED, DISQUALIFIED), started_at, submitted_at, ip_address, user_agent, completion_time_seconds.

Answer: answer_id, response_id, question_id, text_value (for text questions), selected_options (int array), rating_value (int), date_value (date), file_key (S3 key for file uploads), answered_at.
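As a rough illustration of how two of these entities might look in application code, here is a minimal sketch using Python dataclasses. The class layout, defaults, and the QuestionType enum are assumptions made for this example, not a prescribed schema; description and timestamp fields are left out to keep it short.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional


class QuestionType(str, Enum):
    TEXT = "TEXT"
    PARAGRAPH = "PARAGRAPH"
    SINGLE_CHOICE = "SINGLE_CHOICE"
    MULTIPLE_CHOICE = "MULTIPLE_CHOICE"
    RATING = "RATING"
    SCALE = "SCALE"
    DATE = "DATE"
    EMAIL = "EMAIL"
    FILE_UPLOAD = "FILE_UPLOAD"
    MATRIX = "MATRIX"


@dataclass
class Question:
    question_id: int
    survey_id: int
    order_index: int
    type: QuestionType
    question_text: str
    is_required: bool = False
    validation_rules: dict = field(default_factory=dict)  # JSONB column
    display_logic: Optional[dict] = None                  # JSONB column


@dataclass
class Answer:
    answer_id: int
    response_id: int
    question_id: int
    # Only the field matching the question type is populated
    text_value: Optional[str] = None
    selected_options: list[int] = field(default_factory=list)
    rating_value: Optional[int] = None
    date_value: Optional[date] = None
    file_key: Optional[str] = None


q = Question(1, 10, 0, QuestionType.RATING, "How satisfied are you?",
             is_required=True,
             validation_rules={"min_value": 1, "max_value": 5})
a = Answer(answer_id=1, response_id=100, question_id=q.question_id,
           rating_value=4)
```

Keeping one nullable column per value kind, as the Answer entity does, trades some sparsity for simple typed queries on each column.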

Dynamic Question Logic and Branching

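from typing import Optional

class SurveyEngine:
    def __init__(self, db):
        # db is an assumed data-access object exposing get_response,
        # get_answers, and get_questions
        self.db = db
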
    def get_next_question(self, survey_id: int, response_id: int,
                          current_question_id: Optional[int]) -> Optional[Question]:
        response = self.db.get_response(response_id)
        answers_so_far = self.db.get_answers(response_id)
        answered_ids = {a.question_id for a in answers_so_far}

        # Get all questions ordered by order_index
        questions = self.db.get_questions(survey_id)

        for q in questions:
            if q.question_id in answered_ids:
                continue
            # Check display logic: should this question be shown?
            if q.display_logic and not self._evaluate_logic(
                q.display_logic, answers_so_far
            ):
                continue  # skip this question (condition not met)
            return q  # next unanswered visible question
        return None  # survey complete

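    def _evaluate_logic(self, logic: dict,
                        answers: list[Answer]) -> bool:
        # Example: logic = {"question_id": 5, "operator": "equals", "value": "3"}
        dep_answer = next(
            (a for a in answers if a.question_id == logic["question_id"]),
            None
        )
        if dep_answer is None:
            return False  # dependency not answered yet: keep the question hidden
        if logic["operator"] == "equals":
            # Test against None explicitly so a rating of 0 is not mistaken for "no rating"
            actual = (dep_answer.rating_value
                      if dep_answer.rating_value is not None
                      else dep_answer.text_value)
            return str(actual) == str(logic["value"])
        if logic["operator"] == "in":
            # selected_options holds integer option IDs, so logic["value"] must be an int
            return logic["value"] in (dep_answer.selected_options or [])
        return True  # unrecognized operators default to showing the question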
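
To make the branching behavior concrete, the sketch below walks a three-question survey twice with different scripted answers. It is a simplified, standalone restatement of the loop above rather than the SurveyEngine class itself: the Q dataclass, is_visible, and walk are hypothetical names, answers are reduced to plain strings, and only the "equals" operator is handled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Q:
    question_id: int
    display_logic: Optional[dict] = None


def is_visible(q: Q, answers: dict[int, str]) -> bool:
    """answers maps question_id -> answer value as a string (simplified)."""
    logic = q.display_logic
    if logic is None:
        return True
    dep = answers.get(logic["question_id"])
    # Hidden if the dependency is unanswered or does not equal the target value
    return dep is not None and dep == str(logic["value"])


def walk(questions: list[Q], scripted: dict[int, str]) -> list[int]:
    """Return the IDs a respondent actually sees, answering from `scripted`."""
    answers: dict[int, str] = {}
    seen: list[int] = []
    for q in questions:  # assumed already ordered by order_index
        if is_visible(q, answers):
            seen.append(q.question_id)
            answers[q.question_id] = scripted[q.question_id]
    return seen


survey = [
    Q(1),
    Q(2, {"question_id": 1, "operator": "equals", "value": "yes"}),
    Q(3),
]
path_yes = walk(survey, {1: "yes", 2: "weekly", 3: "none"})
path_no = walk(survey, {1: "no", 2: "weekly", 3: "none"})
print(path_yes)  # [1, 2, 3]
print(path_no)   # [1, 3]
```

Question 2 appears only when question 1 was answered "yes"; otherwise the walk moves straight to question 3.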

Response Validation

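import re

class ResponseValidator:
    def validate_answer(self, question: Question,
                        answer: dict) -> list[str]:
        errors = []
        rules = question.validation_rules or {}

        if not self._has_value(answer):
            # An empty answer is an error only when the question is required
            if question.is_required:
                errors.append("This question is required")
            return errors

        if question.type == "TEXT":
            val = answer.get("text_value") or ""
            if "min_length" in rules and len(val) < rules["min_length"]:
                errors.append(f"Minimum {rules['min_length']} characters required")
            if "max_length" in rules and len(val) > rules["max_length"]:
                errors.append(f"Maximum {rules['max_length']} characters allowed")
            # Key name follows validation_rules.regex_pattern in the entity definition
            if "regex_pattern" in rules:
                # fullmatch requires the whole value to match, not just a prefix
                if not re.fullmatch(rules["regex_pattern"], val):
                    errors.append(rules.get("regex_message", "Invalid format"))

        elif question.type == "RATING":
            val = answer.get("rating_value", 0)
            min_v = rules.get("min_value", 1)
            max_v = rules.get("max_value", 5)
            if not (min_v <= val <= max_v):
                errors.append(f"Rating must be between {min_v} and {max_v}")

        elif question.type == "MULTIPLE_CHOICE":
            selected = answer.get("selected_options") or []
            # "max_selections" is an assumed rule key; the source does not name it
            max_sel = rules.get("max_selections")
            if max_sel is not None and len(selected) > max_sel:
                errors.append(f"Select at most {max_sel} options")

        return errors

    def _has_value(self, answer: dict) -> bool:
        # Assumed helper (not defined in the source): any non-empty field counts
        return any(answer.get(k) not in (None, "", [])
                   for k in ("text_value", "selected_options",
                             "rating_value", "date_value", "file_key"))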
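
One detail worth checking when a validation rule carries a regex: in Python, re.match only anchors at the start of the string, so a value with trailing junk can still pass. The short sketch below shows the difference from re.fullmatch; the username pattern is a hypothetical rule chosen for illustration.

```python
import re

pattern = r"[A-Za-z0-9_]{3,12}"  # hypothetical username rule
candidate = "alice!!"

# re.match succeeds because "alice" matches at the start of the string
prefix_ok = re.match(pattern, candidate) is not None

# re.fullmatch fails because "!!" is left over after the match
whole_ok = re.fullmatch(pattern, candidate) is not None

print(prefix_ok, whole_ok)  # True False
```

If rule authors are expected to write their own anchors (for example a trailing $), re.match can be acceptable; otherwise re.fullmatch is the safer default for form validation.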

Analytics and Aggregation

Survey analytics compute a response distribution for each question:

Single/multiple choice: count responses per option and compute each option's percentage.

Rating/scale: compute mean, median, standard deviation, and a distribution histogram.

Text: store raw responses; optionally run sentiment analysis or word frequency as a batch job.

Aggregation approach:

Real-time (small surveys): compute on the fly from the answers table on each analytics request. Simple SQL: SELECT selected_options, COUNT(*) FROM answers WHERE question_id = ? GROUP BY selected_options. Note that this groups by the whole array, so it yields per-option counts only for single-choice answers; for multiple choice, expand the array into one row per option first (in PostgreSQL, for example, with unnest(selected_options)).

Pre-aggregated (large surveys): maintain a QuestionAggregate table (question_id, option_id, count) updated on each submitted response. Analytics reads the aggregate table (O(options) per question) instead of scanning all answers.

Invalidation: on response edit (if allowed), decrement old counts and increment new counts in the same transaction as the answer update.
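
The two aggregation ideas above can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not a storage design: rating_summary and the QuestionAggregate class are hypothetical names, the Counter stands in for the (question_id, option_id, count) table, and population standard deviation (statistics.pstdev) is an arbitrary choice over the sample version.

```python
import statistics
from collections import Counter


def rating_summary(values: list[int]) -> dict:
    """Summary statistics for a rating/scale question."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.pstdev(values),  # population standard deviation
        "histogram": dict(sorted(Counter(values).items())),
    }


class QuestionAggregate:
    """In-memory stand-in for the (question_id, option_id, count) table."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, question_id: int, option_ids: list[int]) -> None:
        # Called once per submitted answer
        for opt in option_ids:
            self.counts[(question_id, opt)] += 1

    def edit(self, question_id: int, old: list[int], new: list[int]) -> None:
        # In a real database both loops would run in one transaction
        # together with the answer update
        for opt in old:
            self.counts[(question_id, opt)] -= 1
        for opt in new:
            self.counts[(question_id, opt)] += 1


summary = rating_summary([5, 4, 4, 3, 5])

agg = QuestionAggregate()
agg.record(7, [1, 2])        # multiple choice: options 1 and 2
agg.record(7, [2])
agg.record(7, [3])
agg.edit(7, old=[3], new=[1])  # a respondent changes option 3 to option 1
```

After the edit, option 3 drops to zero and option 1 rises to two, so the distribution stays consistent without rescanning the answers.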
