Core Entities
Contact: contact_id, first_name, last_name, email (unique), phone, company_id, job_title, source (INBOUND, OUTBOUND, REFERRAL), owner_id (assigned sales rep), created_at, updated_at, tags (array). Company: company_id, name, domain, industry, size, website, owner_id. Deal: deal_id, name, contact_id, company_id, owner_id, pipeline_id, stage_id, amount, currency, close_date (expected), status (OPEN, WON, LOST), created_at, closed_at. Pipeline: pipeline_id, name, stages (ordered list). Stage: stage_id, pipeline_id, name, order_index, win_probability (0-100). Activity: activity_id, type (EMAIL, CALL, MEETING, NOTE, TASK), contact_id, deal_id (nullable), user_id, subject, body, scheduled_at, completed_at, created_at. CustomField: field_id, entity_type (CONTACT, DEAL, COMPANY), field_name, field_type (TEXT, NUMBER, DATE, SELECT, MULTI_SELECT), options (JSON for SELECT types).
Deal Pipeline and Stage Transitions
class DealService:
def move_stage(self, deal_id: int, new_stage_id: int,
user_id: int) -> Deal:
deal = self.repo.get_deal(deal_id)
old_stage = deal.stage_id
new_stage = self.repo.get_stage(new_stage_id)
# Validate stage belongs to the same pipeline
if new_stage.pipeline_id != deal.pipeline_id:
raise InvalidStageError("Stage not in deal's pipeline")
deal.stage_id = new_stage_id
deal.updated_at = datetime.utcnow()
# Auto-close on WON/LOST stages
if new_stage.name in ["Closed Won", "Closed Lost"]:
deal.status = DealStatus.WON if "Won" in new_stage.name else DealStatus.LOST
deal.closed_at = datetime.utcnow()
self.repo.save(deal)
# Log stage transition as activity
self.activity_service.log(ActivityType.NOTE,
contact_id=deal.contact_id, deal_id=deal.deal_id,
user_id=user_id,
body=f"Deal moved from {self.repo.get_stage(old_stage).name} "
f"to {new_stage.name}")
return deal
Custom Fields
CRMs need custom fields because each business tracks different attributes. Two storage approaches: (1) EAV (Entity-Attribute-Value): store all custom field values in a generic table: (entity_id, field_id, value_text/value_number/value_date). Pros: flexible, no schema migrations for new fields. Cons: terrible query performance (JOINs for every field), type casting issues, hard to enforce constraints. (2) JSONB column: add a custom_data JSONB column to each entity table. Store all custom field values as a JSON object: {“lead_score”: 85, “preferred_contact”: “email”}. Index specific fields: CREATE INDEX ON contacts ((custom_data->>”lead_score”)). Pros: fast for most queries, single row per entity, no JOIN explosion. Cons: schema validation must be in the application layer (not the database). Recommendation: use JSONB for new CRM systems. EAV was the standard before JSONB support in PostgreSQL; it is now generally avoided.
Activity Timeline and Full-Text Search
Activity timeline: ordered log of all interactions with a contact or deal. Query: SELECT * FROM activities WHERE contact_id=:id ORDER BY created_at DESC LIMIT 50. Index: (contact_id, created_at DESC). For deal activities: (deal_id, created_at DESC). Inbox view: show each user’s upcoming tasks and overdue activities: SELECT * FROM activities WHERE user_id=:u AND type=’TASK’ AND completed_at IS NULL ORDER BY scheduled_at ASC. Full-text search across contacts, companies, and activities: Elasticsearch index with a unified search endpoint. Documents: contact (first_name, last_name, email, company, tags), deal (name, description), activity (subject, body). Query time: search across all entity types in one query, merge and rank results by relevance + recency. Sync from PostgreSQL via Debezium CDC. Autocomplete: use a completion suggester for contact names and company names in the search bar.
Reporting and Sales Analytics
Pipeline report: deals grouped by stage with total value. Query: SELECT stage_id, COUNT(*) as deal_count, SUM(amount) as total_value FROM deals WHERE status=’OPEN’ AND owner_id=:id GROUP BY stage_id. Forecast report: expected revenue by month = SUM(amount * win_probability / 100) WHERE close_date BETWEEN :start AND :end. Conversion rates: deals that moved from stage A to stage B / total deals that entered stage A. Requires tracking stage history (deal_stage_history table: deal_id, from_stage, to_stage, changed_at). Activity metrics: calls per rep per day, email response rate, avg deal age by stage. Run reports against a read replica to avoid impacting the transactional primary. For complex analytics: sync to a data warehouse (BigQuery, Snowflake) and use BI tools (Tableau, Metabase).
{
“@context”: “https://schema.org”,
“@type”: “FAQPage”,
“mainEntity”: [
{
“@type”: “Question”,
“name”: “What are the key differences between CRM and ERP systems?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “CRM (Customer Relationship Management) manages external relationships: contacts, leads, deals, sales pipeline, and customer interactions. It focuses on the sales cycle, marketing campaigns, and customer support. Data: who are our customers, what deals are in progress, what emails have we sent, what support tickets are open. Primary users: sales reps, marketing teams, customer success. ERP (Enterprise Resource Planning) manages internal operations: inventory, accounting, HR, procurement, manufacturing. It focuses on the operational and financial side: how many units do we have in stock, what are our payroll costs, what are we buying from suppliers. Data: purchase orders, invoices, inventory counts, employee records. Primary users: finance, operations, supply chain. Where they intersect: ERP often includes basic CRM-like features (customer records for billing), and CRM systems often sync with ERP for order and financial data (when a CRM deal closes, it creates an ERP order). In practice: most companies use both — CRM for front-office (sales, marketing) and ERP for back-office (finance, operations), integrated via APIs.”
}
},
{
“@type”: “Question”,
“name”: “How do you design a scalable activity feed for a CRM contact?”,
“acceptedAnswer”: {
“@type”: “Answer”,
“text”: “An activity feed shows all interactions with a contact: emails, calls, meetings, notes, deal stage changes, tasks. Requirements: paginated (up to thousands of activities per contact), fast recent-first display, filterable by type. Schema: activities table with (contact_id, created_at, type, …). Composite index: (contact_id, created_at DESC) — supports the primary query pattern (get recent activities for a contact). Keyset pagination: SELECT * FROM activities WHERE contact_id=:id AND (created_at, activity_id) 0.85: high confidence duplicate; auto-merge. If 0.6-0.85: suggest merge to user; if MANAGER > REP. Visibility rules: ADMIN: sees all records in the tenant. MANAGER: sees own records + all records owned by their direct reports (user.manager_id = current_user_id). REP: sees only own records (deal.owner_id = current_user_id OR contact.owner_id = current_user_id). Implementation: add a visibility filter to every query. At query time: determine the user’s role and direct reports list. Add a WHERE owner_id IN :visible_owner_ids clause. Cache the “direct reports” list per manager (invalidate on org changes). Team-level sharing: some CRMs allow explicit sharing (share a contact with another rep). Store in a contact_shares table: (contact_id, shared_with_user_id, permission). Include shared contacts in visibility queries. API enforcement: never trust the client to filter — always apply visibility rules server-side.”
}
}
]
}
Asked at: Shopify Interview Guide
Asked at: Stripe Interview Guide
Asked at: LinkedIn Interview Guide
Asked at: Airbnb Interview Guide