Google Drive serves more than a billion users, storing documents, spreadsheets, photos, and files with real-time collaboration, sharing, and cross-device sync. While it shares similarities with Dropbox (covered in our Dropbox guide), Google Drive adds native document editing, granular sharing with Google Workspace, and deep search integration. This guide focuses on the unique aspects of a cloud drive platform for system design interviews.
File Versioning and History
Every file modification creates a new version. For uploaded (non-Google-format) files, Google Drive keeps older versions for 30 days or up to 100 versions, whichever limit is reached first. Google Docs/Sheets/Slides have unlimited version history.

Data model: file_version: version_id, file_id, storage_key (pointer to the file content in object storage), size, checksum, created_by, created_at, is_current. The version flagged is_current is the one displayed to users. Previous versions are accessible via the version history UI.

Storage optimization: for binary files (images, PDFs), each version stores the full content, because delta compression between arbitrary binary files is complex and its savings are unpredictable. For Google Docs, the document is stored as a sequence of operations (similar to event sourcing). Version snapshots are created periodically, but the full edit history is preserved at the operation level.

Restore: a user can restore any previous version, making it the current version. The restored content is appended as a new version; the versions in between are not deleted.

Delete version: users can permanently delete specific versions to free storage. Internally the deletion is soft: the version is marked for cleanup, and a background garbage collection process reclaims the storage.
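The restore and delete-version rules above can be sketched in a few lines. This is a minimal in-memory illustration, not Google's implementation: the VersionStore class, the deleted flag, and the rule that the current version cannot be deleted are assumptions made for the example, and checksum and created_at are omitted for brevity.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class FileVersion:
    version_id: int
    file_id: str
    storage_key: str       # pointer to the content in object storage
    size: int
    created_by: str
    is_current: bool = False
    deleted: bool = False  # hypothetical soft-delete marker read by the GC job


class VersionStore:
    """In-memory stand-in for the file_version table (illustrative only)."""

    def __init__(self):
        self._ids = count(1)
        self.history = {}  # file_id -> list of FileVersion, oldest first

    def add_version(self, file_id, storage_key, size, user):
        versions = self.history.setdefault(file_id, [])
        for v in versions:
            v.is_current = False
        new = FileVersion(next(self._ids), file_id, storage_key, size, user,
                          is_current=True)
        versions.append(new)
        return new

    def restore(self, file_id, version_id, user):
        # Restoring appends a new version that points at the old content;
        # the versions in between are left untouched.
        old = self._find(file_id, version_id)
        return self.add_version(file_id, old.storage_key, old.size, user)

    def delete_version(self, file_id, version_id):
        target = self._find(file_id, version_id)
        if target.is_current:
            # Assumption for this sketch: the current version is protected.
            raise ValueError("cannot delete the current version")
        target.deleted = True  # soft delete; storage is reclaimed later

    def collect_garbage(self, file_id):
        """Drop soft-deleted rows; return storage keys nothing references."""
        versions = self.history[file_id]
        live_keys = {v.storage_key for v in versions if not v.deleted}
        dead_keys = {v.storage_key for v in versions if v.deleted}
        self.history[file_id] = [v for v in versions if not v.deleted]
        return dead_keys - live_keys

    def _find(self, file_id, version_id):
        for v in self.history.get(file_id, []):
            if v.version_id == version_id and not v.deleted:
                return v
        raise KeyError(version_id)
```

Because a restored version reuses the storage_key of the version it was restored from, the garbage collector only reports a key as reclaimable once no remaining version points at it.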
Sharing Model and Permissions
Google Drive sharing is more granular than Dropbox's. Permission levels: Viewer (read-only), Commenter (view + comment), Editor (full read-write), and Owner (full control including sharing, moving, and deleting). Permissions can be set on individual files, on folders (inherited by all contents), or on organizational units (Google Workspace domain).

Sharing types: (1) Specific people: share with email addresses; each gets an entry in the ACL. (2) Anyone with the link: a public link with a specified permission level; no authentication required. (3) Domain-wide: share with everyone in the Google Workspace organization.

Permission inheritance: a file in a shared folder inherits the folder permissions. Explicit permissions on the file override inherited ones. Moving a file to a different folder may change its inherited permissions, so the system warns the user.

Data model: permission: permission_id, file_id, grantee (email, link, domain), role (viewer/commenter/editor/owner), created_by, created_at, expires_at (optional). The ACL is checked on every file access. For performance, cache the resolved permission per user per file in Redis with a short TTL (5 minutes) and invalidate the cached entries on permission changes. A change on a folder affects everything beneath it, so invalidation has to cover the descendants as well; the short TTL bounds how long a missed invalidation can serve a stale result.
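The inheritance and caching rules above can be sketched as a resolver that walks from a file up through its parent folders and stops at the nearest node with a matching ACL entry. This is a simplified illustration: the PermissionResolver class, the "domain:" and "anyone" grantee keys, and the in-process dict standing in for Redis are all assumptions made for the example, and expires_at is ignored.

```python
import time

ROLE_RANK = {"viewer": 1, "commenter": 2, "editor": 3, "owner": 4}


class PermissionResolver:
    def __init__(self, parents, acl, ttl_seconds=300, clock=time.monotonic):
        self.parents = parents  # node_id -> parent folder id (None at the root)
        self.acl = acl          # node_id -> {grantee: role}
        self.ttl = ttl_seconds  # 5 minutes, as in the text
        self.clock = clock
        self._cache = {}        # (user, node_id) -> (role, expires_at)

    def _grantee_keys(self, user):
        # A user can match a direct grant, a domain-wide grant, or a link.
        domain = user.partition("@")[2]
        return (user, f"domain:{domain}", "anyone")

    def _resolve_uncached(self, user, node_id):
        keys = self._grantee_keys(user)
        node = node_id
        while node is not None:
            entries = self.acl.get(node, {})
            roles = [entries[k] for k in keys if k in entries]
            if roles:
                # The nearest node with a matching entry wins, so an explicit
                # file permission overrides what the folder would grant.
                return max(roles, key=ROLE_RANK.__getitem__)
            node = self.parents.get(node)
        return None  # no access

    def resolve(self, user, node_id):
        key = (user, node_id)
        now = self.clock()
        hit = self._cache.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]
        role = self._resolve_uncached(user, node_id)
        self._cache[key] = (role, now + self.ttl)
        return role

    def grant(self, node_id, grantee, role):
        self.acl.setdefault(node_id, {})[grantee] = role
        # Coarse invalidation for the sketch: a folder change can affect any
        # descendant, so the whole cache is dropped.
        self._cache.clear()
```

Clearing the entire cache on every change is the simplest correct choice for a sketch; a production system would more likely invalidate only the affected subtree and rely on the TTL as a backstop.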
Storage Quotas and Trash
Each Google account has a storage quota (15 GB free, with larger paid tiers such as 2 TB). The quota is shared across Drive, Gmail, and Photos.

Quota tracking: maintain a per-user storage counter. On file upload: increment by the file size. On file version creation: increment by the size of the new version, since each stored version consumes space. On delete (move to trash): do NOT decrement, because trashed files still count against the quota. On permanent delete (empty trash): decrement. On shared file: the file counts against the owner's quota, not the viewer's.

Counter implementation: a Redis counter per user for real-time quota checks (fast reject on upload if the quota would be exceeded), with periodic reconciliation against the authoritative database to fix any drift.

Trash: deleted files go to the trash folder. They remain there for 30 days, then are permanently deleted by a background job. Users can restore from trash (move back to the original location or root). Users can also empty the trash manually, which permanently deletes everything in it.

Permanent deletion: the file metadata is deleted and the storage content (object storage) is marked for garbage collection. If the content is shared through deduplication, it is only deleted when no other file references it. GDPR: when a user requests erasure, the deletion must be irreversible and completed without undue delay, generally within one month of the request.
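The counter rules above can be sketched as follows. This is a minimal single-user illustration under stated assumptions: the QuotaTracker class is hypothetical, the used attribute stands in for the Redis counter, and the files dict stands in for the authoritative database.

```python
class QuotaExceeded(Exception):
    pass


class QuotaTracker:
    def __init__(self, limit_bytes):
        self.limit = limit_bytes
        self.used = 0    # fast counter (Redis in the text)
        self.files = {}  # file_id -> {"size": int, "trashed": bool}

    def _reserve(self, size):
        # Fast reject before any bytes are accepted.
        if self.used + size > self.limit:
            raise QuotaExceeded(f"need {size} bytes, {self.limit - self.used} free")
        self.used += size

    def upload(self, file_id, size):
        self._reserve(size)
        self.files[file_id] = {"size": size, "trashed": False}

    def add_version(self, file_id, version_size):
        # Each stored version consumes space, so the full size is charged.
        self._reserve(version_size)
        self.files[file_id]["size"] += version_size

    def trash(self, file_id):
        # Trashed files still count against the quota: no decrement here.
        self.files[file_id]["trashed"] = True

    def restore(self, file_id):
        self.files[file_id]["trashed"] = False

    def empty_trash(self):
        trashed = [fid for fid, meta in self.files.items() if meta["trashed"]]
        freed = sum(self.files.pop(fid)["size"] for fid in trashed)
        self.used -= freed  # only permanent deletion frees quota
        return freed

    def reconcile(self):
        """Reset the fast counter from the authoritative data; return the drift."""
        actual = sum(meta["size"] for meta in self.files.values())
        drift = self.used - actual
        self.used = actual
        return drift
```

In a real deployment the reserve step would need to be atomic across concurrent uploads (for example an atomic increment followed by a rollback on overshoot); the sketch ignores concurrency.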
Search
Google Drive search indexes: file name, file content (for Google Docs, PDFs with OCR, and text files), file type, owner, sharing status, modification date, and labels/stars. Search is scoped per user: a user can only find files they have access to.

Architecture: when a file is created or modified, the content is asynchronously indexed in a search backend (Elasticsearch or Google internal search infrastructure). The index entry includes: file_id, extracted text content, metadata fields, and the ACL (which users/groups have access), so results can be filtered by access at query time.

Search query: full-text search on content + metadata filters (type:pdf, owner:me, modified after:2026-01-01). Results are ranked by: text relevance, recency (recently modified files rank higher), usage frequency (files the user opens often rank higher), and sharing context (files shared directly with the user rank higher than files in shared drives).

Optical Character Recognition (OCR): images and scanned PDFs are processed by OCR to extract text, and this text is indexed for search. A photo of a receipt becomes searchable by the text on the receipt.

ML-based search tries to understand query intent. “quarterly report” should find the file named “Q1 2026 Financial Summary” even though the exact words do not match. Semantic search using embeddings bridges this gap.
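The query path above (ACL filter, metadata filters, then ranking) can be sketched with a linear scan over a small in-memory index. Everything here is an illustrative assumption: the IndexedFile fields, the search function, and the scoring formula (term count plus a recency bonus) are invented for the example, and a real backend would use an inverted index and a far richer ranking model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexedFile:
    file_id: str
    name: str
    text: str        # extracted content, including OCR output
    file_type: str
    owner: str
    modified: date
    acl: frozenset   # users allowed to see this file


def search(index, user, query, file_type=None, modified_after=None, today=None):
    today = today or date.today()
    terms = query.lower().split()
    scored = []
    for doc in index:
        if user not in doc.acl:
            continue  # never return files the user cannot access
        if file_type is not None and doc.file_type != file_type:
            continue
        if modified_after is not None and doc.modified <= modified_after:
            continue
        words = f"{doc.name} {doc.text}".lower().split()
        relevance = sum(words.count(term) for term in terms)
        if relevance == 0:
            continue
        age_days = (today - doc.modified).days
        recency = 1.0 / (1.0 + age_days / 30.0)  # assumed weighting
        scored.append((relevance + recency, doc.file_id))
    scored.sort(reverse=True)
    return [file_id for _, file_id in scored]
```

Applying the ACL check before scoring keeps inaccessible files out of both the results and any relevance statistics, which is the property the per-user scoping in the text is meant to guarantee.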
Real-Time Collaboration Integration
Google Drive integrates with Google Docs for real-time collaborative editing (covered in depth in our Collaborative Editing guide). Drive-specific aspects:

(1) Presence: when a file is open by multiple users, Drive shows who is viewing/editing. This uses the presence service (WebSocket-based, ephemeral state).

(2) Conflict prevention for non-Docs files: for binary files (images, spreadsheets uploaded as .xlsx), simultaneous editing is not supported. Drive uses a lock mechanism: the first user to open the file for editing gets a soft lock. Other users see “User X is editing. Open in view-only mode?” The lock expires after 10 minutes of inactivity.

(3) Comments: users can comment on any file type. Comments are threaded (replies) and can be resolved. Mentions (@user) trigger notifications. Comments are stored as metadata (not part of the file content) and indexed for search.

(4) Activity feed: Drive tracks all activity on files: edits, shares, comments, moves, renames. The activity feed is displayed per file and per user. Events are published to Kafka and consumed by the activity service for persistence and by the notification service for real-time alerts.
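The soft lock described in item (2) can be sketched as a small lock table with an inactivity timeout. This is a single-process illustration under stated assumptions: the SoftLockManager class and its method names are hypothetical, and a real service would keep this state in a shared store rather than a local dict.

```python
import time


class SoftLockManager:
    def __init__(self, timeout_seconds=600, clock=time.monotonic):
        self.timeout = timeout_seconds  # 10 minutes of inactivity, as in the text
        self.clock = clock
        self._locks = {}                # file_id -> (holder, last_activity)

    def acquire(self, file_id, user):
        """Return (granted, holder). A live lock held by someone else is refused."""
        now = self.clock()
        held = self._locks.get(file_id)
        if held is not None:
            holder, last_activity = held
            if holder != user and now - last_activity < self.timeout:
                return False, holder  # caller can offer view-only mode
        self._locks[file_id] = (user, now)
        return True, user

    def touch(self, file_id, user):
        """Record editing activity so the holder's lock does not expire."""
        held = self._locks.get(file_id)
        if held is not None and held[0] == user:
            self._locks[file_id] = (user, self.clock())
            return True
        return False

    def release(self, file_id, user):
        held = self._locks.get(file_id)
        if held is not None and held[0] == user:
            del self._locks[file_id]
```

The lock is advisory: it only decides who is offered edit mode, and expiry is evaluated lazily on the next acquire, so no background timer is needed.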