What is an Address Book System?
An address book stores user contacts — names, emails, phones, addresses, and relationships. At consumer scale (Google Contacts, Apple Contacts, LinkedIn connections), the address book must handle fast search (autocomplete while typing), contact deduplication (same person from multiple sources), group management, and cross-device sync. The design touches full-text search, conflict resolution for concurrent edits, and efficient delta sync for mobile clients.
Requirements
- CRUD contacts: name, email(s), phone(s), address(es), notes, custom fields
- Autocomplete search: “Jo” → [John Smith, Joanna Lee, …] in <50ms
- Groups/labels: organize contacts into groups; filter by group
- Import: bulk import from CSV, vCard (iOS/Android contact export)
- Cross-device sync: changes on mobile appear on desktop within 5 seconds
- Scale: up to 10M contacts per user (for example, an enterprise address book for a sales team)
Data Model
Contact(
    contact_id   UUID PRIMARY KEY,
    user_id      UUID NOT NULL,
    display_name VARCHAR NOT NULL,
    notes        TEXT,
    photo_url    VARCHAR,
    source       VARCHAR,        -- 'manual', 'gmail_import', 'linkedin'
    external_id  VARCHAR,        -- ID in the source system (for deduplication)
    version      INT DEFAULT 1,  -- optimistic locking for sync
    created_at   TIMESTAMPTZ,
    updated_at   TIMESTAMPTZ,
    deleted_at   TIMESTAMPTZ     -- soft delete for sync
)

ContactField(
    field_id   UUID PRIMARY KEY,
    contact_id UUID NOT NULL,
    field_type ENUM(EMAIL, PHONE, ADDRESS, URL, BIRTHDAY, CUSTOM),
    label      VARCHAR,          -- 'work', 'personal', 'mobile'
    value      VARCHAR NOT NULL,
    is_primary BOOL DEFAULT false
)

ContactGroup(
    group_id UUID PRIMARY KEY,
    user_id  UUID NOT NULL,
    name     VARCHAR NOT NULL
)

ContactGroupMember(
    contact_id UUID,
    group_id   UUID,
    PRIMARY KEY (contact_id, group_id)
)
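To make the parent/child layout concrete, here is a minimal sketch that loads a cut-down version of the schema into an in-memory SQLite database and runs two typical reads. SQLite stands in for PostgreSQL only so the example runs anywhere; the column subset, the sample rows, the `idx_member_group` index, and the helper names `open_demo_db`, `contacts_in_group`, and `fields_of` are all assumptions made for illustration.

```python
import sqlite3

# Cut-down schema (assumption: TEXT ids instead of UUIDs, fewer columns).
SCHEMA = """
CREATE TABLE Contact (
    contact_id   TEXT PRIMARY KEY,
    user_id      TEXT NOT NULL,
    display_name TEXT NOT NULL
);
CREATE TABLE ContactField (
    field_id   INTEGER PRIMARY KEY,
    contact_id TEXT NOT NULL REFERENCES Contact(contact_id),
    field_type TEXT NOT NULL,
    label      TEXT,
    value      TEXT NOT NULL,
    is_primary INTEGER DEFAULT 0
);
CREATE TABLE ContactGroup (
    group_id TEXT PRIMARY KEY,
    user_id  TEXT NOT NULL,
    name     TEXT NOT NULL
);
CREATE TABLE ContactGroupMember (
    contact_id TEXT,
    group_id   TEXT,
    PRIMARY KEY (contact_id, group_id)
);
-- Reverse index so listing a group's members does not scan the primary key.
CREATE INDEX idx_member_group ON ContactGroupMember(group_id, contact_id);
"""


def open_demo_db():
    """Create the schema in memory and insert a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO Contact VALUES (?, ?, ?)",
        [("c1", "u1", "John Smith"), ("c2", "u1", "Joanna Lee")],
    )
    conn.executemany(
        "INSERT INTO ContactField (contact_id, field_type, label, value, is_primary) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            ("c1", "EMAIL", "work", "john@example.com", 1),
            ("c1", "EMAIL", "personal", "jsmith@example.org", 0),
            ("c1", "PHONE", "mobile", "+14155551234", 1),
        ],
    )
    conn.execute("INSERT INTO ContactGroup VALUES ('g1', 'u1', 'Customers')")
    conn.execute("INSERT INTO ContactGroupMember VALUES ('c1', 'g1')")
    return conn


def contacts_in_group(conn, user_id, group_id):
    """Names of one user's contacts that belong to the given group."""
    rows = conn.execute(
        "SELECT c.display_name FROM ContactGroupMember m "
        "JOIN Contact c ON c.contact_id = m.contact_id "
        "WHERE m.group_id = ? AND c.user_id = ? "
        "ORDER BY c.display_name",
        (group_id, user_id),
    )
    return [row[0] for row in rows]


def fields_of(conn, contact_id, field_type):
    """All values of one field type for a contact, primary value first."""
    rows = conn.execute(
        "SELECT value FROM ContactField "
        "WHERE contact_id = ? AND field_type = ? "
        "ORDER BY is_primary DESC, value",
        (contact_id, field_type),
    )
    return [row[0] for row in rows]
```

A contact with two emails and a contact with none go through the same query path, which is the main reason for keeping fields in a child table.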
Search: Full-Text + Prefix Index
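-- PostgreSQL: full-text search with tsvector
ALTER TABLE Contact ADD COLUMN search_vector TSVECTOR;

-- Trigger function: regenerate search_vector when a Contact row is inserted or updated.
-- Edits to ContactField alone do not fire this trigger; touch the parent Contact row
-- (for example, by bumping updated_at) whenever its fields change.
-- string_agg returns NULL for a contact with no fields, so it is wrapped in COALESCE
-- to keep the whole concatenation from becoming NULL.
CREATE FUNCTION update_contact_search() RETURNS TRIGGER AS $$
BEGIN
    NEW.search_vector := to_tsvector('english',
        COALESCE(NEW.display_name, '') || ' ' ||
        COALESCE((SELECT string_agg(value, ' ') FROM ContactField
                  WHERE contact_id = NEW.contact_id), '')
    );
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_contact_search
    BEFORE INSERT OR UPDATE ON Contact
    FOR EACH ROW EXECUTE FUNCTION update_contact_search();

CREATE INDEX idx_contact_search ON Contact USING GIN(search_vector);

-- Prefix index for autocomplete. A B-tree index cannot serve ILIKE, so index and
-- query the lowercased name with LIKE instead.
CREATE INDEX idx_contact_name ON Contact(user_id, lower(display_name) text_pattern_ops);

-- Autocomplete: case-insensitive prefix match on display_name, hiding soft-deleted rows
SELECT * FROM Contact
WHERE user_id = :uid
  AND deleted_at IS NULL
  AND lower(display_name) LIKE lower(:prefix) || '%'
ORDER BY display_name LIMIT 10;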
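The prefix index works because all names sharing a prefix sit next to each other in sorted order, so the database can seek to the first match and read forward. The following in-memory sketch illustrates that idea with a sorted list and binary search. It is not how PostgreSQL is implemented, and `build_prefix_index` and `autocomplete` are hypothetical names chosen for this example.

```python
from bisect import bisect_left


def build_prefix_index(names):
    """Sort names by their case-folded form, keeping the original for display."""
    return sorted((name.casefold(), name) for name in names)


def autocomplete(index, prefix, limit=10):
    """Return up to `limit` names whose case-folded form starts with `prefix`."""
    key = prefix.casefold()
    # (key,) sorts before every (key, original) pair, so this finds the first
    # entry whose folded name is >= the prefix.
    i = bisect_left(index, (key,))
    results = []
    # Matching entries are contiguous; stop at the first non-match.
    while i < len(index) and len(results) < limit and index[i][0].startswith(key):
        results.append(index[i][1])
        i += 1
    return results


index = build_prefix_index(["John Smith", "Joanna Lee", "Alice Wong", "joe Black"])
matches = autocomplete(index, "Jo")
```

The lookup costs one binary search plus a scan over the matches returned, which is why a prefix query with a small LIMIT stays fast even on a large address book.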
Cross-Device Sync Protocol
Delta sync: instead of sending the full address book on every sync, send only changes since the last sync. Clients track their last sync timestamp; the server returns all contacts modified after that timestamp.
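def sync_contacts(user_id, last_sync_at, device_id):
    # last_sync_at: timezone-aware datetime parsed from the client's sync token.
    # device_id is not used in the query; it identifies the caller for logging
    # and conflict tie-breaking.
    page_size = 1000
    # Changes since last sync (including soft-deletes)
    changed = list(db.query('''
        SELECT * FROM Contact
        WHERE user_id=:uid AND updated_at > :since
        ORDER BY updated_at ASC
        LIMIT :limit
    ''', uid=user_id, since=last_sync_at, limit=page_size))
    # Rows are ordered by updated_at, so the last row carries the new sync token.
    # Caveat: rows that share the last row's updated_at but fall outside this page
    # are skipped by the strict '>' on the next call; a composite
    # (updated_at, contact_id) cursor avoids that.
    new_sync_token = changed[-1].updated_at if changed else last_sync_at
    return {
        'contacts': [serialize(c) for c in changed],  # includes rows where deleted_at IS NOT NULL
        'sync_token': new_sync_token.isoformat(),
        'has_more': len(changed) == page_size
    }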
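On the client side, a device keeps requesting pages until `has_more` is false, applying each change to its local store and treating rows with `deleted_at` set as deletions. The loop below is a minimal sketch of that behaviour. `pull_changes` and `make_fake_server` are hypothetical helpers, the local store is a plain dict, and the fake server uses a page size of 2 and ISO-8601 string tokens only to keep the example short.

```python
def pull_changes(fetch_page, local_store, sync_token):
    """Apply all pending server changes to local_store; return the new sync token."""
    while True:
        page = fetch_page(sync_token)
        for contact in page["contacts"]:
            if contact.get("deleted_at"):
                # Soft-deleted on the server: remove the local copy if present.
                local_store.pop(contact["contact_id"], None)
            else:
                local_store[contact["contact_id"]] = contact
        sync_token = page["sync_token"]
        if not page["has_more"]:
            return sync_token


def make_fake_server(rows, page_size=2):
    """Stand-in for the sync endpoint, mirroring its response shape."""
    ordered = sorted(rows, key=lambda row: row["updated_at"])

    def fetch_page(since):
        # ISO-8601 strings in one fixed format compare correctly as text.
        newer = [row for row in ordered if row["updated_at"] > since]
        page = newer[:page_size]
        return {
            "contacts": page,
            "sync_token": page[-1]["updated_at"] if page else since,
            "has_more": len(page) == page_size,
        }

    return fetch_page


server_rows = [
    {"contact_id": "c1", "display_name": "John Smith",
     "updated_at": "2024-01-01T00:00:01Z", "deleted_at": None},
    {"contact_id": "c2", "display_name": "Old Entry",
     "updated_at": "2024-01-01T00:00:02Z", "deleted_at": "2024-01-01T00:00:02Z"},
    {"contact_id": "c3", "display_name": "Joanna Lee",
     "updated_at": "2024-01-01T00:00:03Z", "deleted_at": None},
]
device_store = {"c2": {"contact_id": "c2", "display_name": "Old Entry"}}
new_token = pull_changes(make_fake_server(server_rows), device_store,
                         "2024-01-01T00:00:00Z")
```

The device should persist the returned token only after the changes have been written locally, so an interrupted sync is simply retried from the previous token.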
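Conflict resolution: last-write-wins, comparing updated_at and using device_id as a deterministic tiebreaker when two writes carry the same timestamp. For enterprise use, where several people may edit the same contact at once, consider operational transformation or CRDTs instead.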
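A last-write-wins rule with a device tiebreaker can be sketched in a few lines. The `ContactVersion` record and the `resolve_lww` function are hypothetical names for this illustration, and the rule assumes device clocks are close enough for updated_at to be comparable across devices.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ContactVersion:
    contact_id: str
    display_name: str
    updated_at: datetime
    device_id: str


def resolve_lww(a, b):
    """Pick the later write; on equal timestamps the higher device_id wins."""
    # Comparing (updated_at, device_id) tuples makes the outcome independent of
    # argument order, so every device converges on the same winner.
    return max(a, b, key=lambda v: (v.updated_at, v.device_id))


noon = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
from_phone = ContactVersion("c1", "John Smith", noon, "device-a")
from_laptop = ContactVersion("c1", "Jon Smith", noon, "device-b")
winner = resolve_lww(from_phone, from_laptop)
```

Last-write-wins operates on the whole contact, so the losing device's edit is discarded even if it touched a different field; that trade-off is what the OT and CRDT alternatives address.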
Import Pipeline (vCard / CSV)
def import_contacts(user_id, file_content, format):
    contacts = parse_vcards(file_content) if format == 'vcf' else parse_csv(file_content)
    created = updated = skipped = 0
    for parsed in contacts:
        # Skip empty entries: nothing to display and nothing to match on.
        # (Assumes the parser exposes display_name alongside emails and phones.)
        if not (parsed.display_name or parsed.emails or parsed.phones):
            skipped += 1
            continue
        # Deduplication: match by email or phone
        existing = find_duplicate(user_id, parsed.emails, parsed.phones)
        if existing:
            merge_contact(existing, parsed)
            updated += 1
        else:
            create_contact(user_id, parsed)
            created += 1
    return {'created': created, 'updated': updated, 'skipped': skipped}
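
def find_duplicate(user_id, emails, phones):
    # Match on any shared email or phone. Values must be normalized the same way
    # at import time and at write time, or equal contacts will not compare equal.
    # Soft-deleted contacts are excluded so an import does not merge into them.
    return db.query('''
        SELECT DISTINCT c.contact_id FROM Contact c
        JOIN ContactField f ON c.contact_id = f.contact_id
        WHERE c.user_id=:uid
          AND c.deleted_at IS NULL
          AND ((f.field_type='EMAIL' AND f.value = ANY(:emails))
            OR (f.field_type='PHONE' AND f.value = ANY(:phones)))
        LIMIT 1
    ''', uid=user_id, emails=emails, phones=phones).first()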
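Exact-match deduplication only works if emails and phones are stored in a canonical form, and a merge needs a policy for conflicting values. The sketch below shows one possible shape for both. The phone normalizer is deliberately naive (it assumes 10-digit national numbers belong to a default country code) and is no substitute for a real library such as libphonenumber. The merge policy (append new list items, fill empty scalars, keep existing values on conflict) and the names `normalize_email`, `normalize_phone`, and `merge_contact_fields` are assumptions for illustration.

```python
import re


def normalize_email(email):
    """Trim surrounding whitespace and lowercase the address."""
    return email.strip().lower()


def normalize_phone(phone, default_country_code="1"):
    """Naive normalization: strip punctuation, then add a country code if missing."""
    digits = re.sub(r"\D", "", phone)
    if phone.strip().startswith("+"):
        return "+" + digits
    if len(digits) == 10:
        # Assumption: a bare 10-digit number is national to the default country.
        return "+" + default_country_code + digits
    if len(digits) == 11 and digits.startswith(default_country_code):
        return "+" + digits
    return digits  # Unknown shape: leave as bare digits rather than guess.


def merge_contact_fields(existing, imported):
    """Merge imported data into a copy of existing without overwriting values."""
    merged = dict(existing)
    for key, value in imported.items():
        if isinstance(value, list):
            # Multi-valued fields: append items not already present.
            combined = list(merged.get(key) or [])
            for item in value:
                if item not in combined:
                    combined.append(item)
            merged[key] = combined
        elif not merged.get(key):
            # Scalar fields: fill only when the existing value is empty.
            merged[key] = value
    return merged


existing = {"display_name": "John Smith", "notes": "",
            "emails": ["john@example.com"]}
imported = {"display_name": "J. Smith", "notes": "Met at conference",
            "emails": [normalize_email(" John@Example.com "), "js@example.org"]}
merged = merge_contact_fields(existing, imported)
```

With this policy a re-import of the same file is idempotent: the second pass finds nothing new to append and no empty fields to fill.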
Key Design Decisions
- Soft delete (deleted_at) — sync protocol needs to propagate deletions to other devices
- version column for optimistic locking — prevents lost updates when two devices edit the same contact concurrently
- ContactField as separate table — contacts have variable numbers of emails/phones; avoids wide nullable columns
- Delta sync with timestamp — O(changes) not O(total contacts); mobile bandwidth and battery efficient
- Deduplication on import by email/phone — prevents the same person appearing twice after importing from multiple sources
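The optimistic-locking decision can be sketched as a conditional UPDATE: a write succeeds only if the row still has the version the device last read, and a successful write increments the version. The example uses an in-memory SQLite table with a reduced column set so it runs anywhere; `make_contact_db` and `update_display_name` are hypothetical helper names.

```python
import sqlite3


def make_contact_db():
    """In-memory table with just the columns needed for the version check."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Contact ("
        "contact_id TEXT PRIMARY KEY, display_name TEXT NOT NULL, "
        "version INTEGER NOT NULL DEFAULT 1)"
    )
    conn.execute("INSERT INTO Contact (contact_id, display_name) VALUES ('c1', 'John Smith')")
    conn.commit()
    return conn


def update_display_name(conn, contact_id, new_name, expected_version):
    """Return True if the write applied, False if another write got there first."""
    cur = conn.execute(
        "UPDATE Contact SET display_name = ?, version = version + 1 "
        "WHERE contact_id = ? AND version = ?",
        (new_name, contact_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when the stored version no longer matches expected_version.
    return cur.rowcount == 1


conn = make_contact_db()
# Both devices read the contact at version 1, then try to write.
phone_ok = update_display_name(conn, "c1", "Jonathan Smith", expected_version=1)
laptop_ok = update_display_name(conn, "c1", "Jon Smith", expected_version=1)
```

The rejected device would typically re-fetch the contact, reapply or discard its edit, and retry with the new version.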
Frequently Asked Questions
How do you implement fast contact autocomplete search?
For case-insensitive or substring matching with ILIKE, use a trigram index: CREATE INDEX ON Contact USING GIN(display_name gin_trgm_ops) (requires the pg_trgm extension). For prefix ("starts with") matching, use a B-tree index with text_pattern_ops: CREATE INDEX ON Contact(user_id, lower(display_name) text_pattern_ops), queried with WHERE user_id=:uid AND lower(display_name) LIKE lower(:prefix) || '%'. A B-tree index does not serve ILIKE, which is why both sides are lowercased. For full-text search (match any word), use tsvector with a GIN index. Cache recent search results in Redis for the most common queries. Target under 50 ms for autocomplete, since users expect instant feedback.
How does cross-device contact sync work?
Use delta sync: each device tracks a sync_token (the timestamp of its last successful sync). On sync, the server returns all contacts with updated_at > sync_token, including soft-deleted contacts (deleted_at IS NOT NULL) so that deletions propagate to other devices. The response includes a new sync_token (the maximum updated_at in the result). The device applies the changes locally and handles conflicts with last-write-wins (the higher updated_at wins). For the initial sync on a new device, send the full contact list, then switch to delta sync.
How do you deduplicate contacts during import?
Match imported contacts against existing ones using shared identifiers: email address or phone number, both normalized (strip spaces and dashes, reconcile country code variations). If a match is found, merge the imported fields into the existing contact: add new fields, fill empty fields, and preserve existing values on conflict. If there is no match, create a new contact. Normalizing phone numbers before comparison is critical: "+1 (415) 555-1234" and "4155551234" are the same number. Use a library (libphonenumber) for phone normalization; lowercase and trim emails.
How do you handle the ContactField table design for variable contact data?
Rather than pre-defining email1, email2, phone1, phone2 columns on the Contact table (which caps the number of values and wastes space for contacts with fewer), use a ContactField child table in which each row has (contact_id, field_type, label, value, is_primary). This handles contacts with 5 emails and contacts with 0 emails uniformly. To get all fields for a contact, JOIN ContactField on contact_id. To find contacts by email, query ContactField WHERE field_type=EMAIL AND value=:email, and index the value column.
How do you implement contact groups/labels efficiently?
Use a many-to-many ContactGroupMember table: (contact_id, group_id). To add a contact to a group, INSERT a row; to remove it, DELETE the row. To list contacts in a group: SELECT contact_id FROM ContactGroupMember WHERE group_id=:gid. To list groups for a contact: SELECT group_id FROM ContactGroupMember WHERE contact_id=:cid. Index both directions: (group_id, contact_id) for group listing and (contact_id, group_id) for contact detail. Contacts can belong to multiple groups, and groups can contain multiple contacts.