What Is a File Metadata Service?
A file metadata service manages the namespace, directory hierarchy, permissions, and quota accounting for a distributed file system or cloud storage product. It does not store file data — that is delegated to a blob or chunk store — but it answers questions like: does this path exist, who can access it, what are its attributes, and how much quota has been consumed? Getting this service right is critical because every file operation (read, write, list, stat) touches metadata, making it a central latency and scalability bottleneck.
Requirements
Functional Requirements
- Hierarchical namespace: directories contain files and subdirectories; paths are resolved by traversing the tree.
- Unix-style permission model: owner, group, and other with read/write/execute bits; optional ACLs for fine-grained sharing.
- Quota enforcement: track storage used per user and per namespace; reject writes that would exceed quota.
- Bulk listing: list directory contents efficiently, supporting pagination for directories with millions of entries.
- Atomic rename: move a file or directory atomically, including cross-directory moves.
Non-Functional Requirements
- Stat and permission check latency under 5ms at the 99th percentile.
- List operations must paginate to avoid unbounded response sizes.
- Quota updates must be consistent: no double-counting or missed accounting on concurrent writes.
- Scalable to billions of inodes across millions of namespaces.
Data Model
- Inode — inode_id (PK, UUID), namespace_id, type (FILE/DIR), name, parent_inode_id, owner_uid, owner_gid, mode (Unix permission bits), size_bytes, block_count, created_at, modified_at, accessed_at, blob_key (for FILE type).
- DirEntry — parent_inode_id + name (composite PK), child_inode_id — supports efficient directory listing by parent.
- ACLEntry — inode_id, principal_type (USER/GROUP), principal_id, permissions (bitmask).
- QuotaLedger — namespace_id (PK), owner_uid (PK), bytes_used, inode_count, quota_bytes, quota_inodes, updated_at.
Inode and DirEntry records live in a distributed database partitioned by namespace_id. QuotaLedger uses serializable transactions to prevent quota overrun under concurrent writes.
Core Algorithms
Path Resolution
Path resolution walks the directory tree component by component starting from the root inode of the namespace. For /a/b/c.txt the resolver fetches the root inode, then looks up DirEntry(root, “a”) to get the inode for directory a, then DirEntry(a, “b”), then DirEntry(b, “c.txt”). Each lookup is a point read by composite primary key. Resolution depth is bounded by the maximum path depth (configurable, e.g., 255 components). Path components are cached in a local LRU cache to amortize repeated resolutions of common prefixes.
Unix Permission Check
For each operation the service evaluates permissions in order: if the requester is the inode owner, apply the owner bits; if the requester belongs to the inode group, apply the group bits; otherwise apply the other bits. If ACLs are present they are checked after the standard Unix bits and may grant or restrict access beyond the mode bits. Permission checks for a full path require checking execute (x) permission on every directory in the path, not just the target inode.
Quota Accounting
On every write operation (file create, file grow, directory create) the service debits the QuotaLedger within the same database transaction as the Inode creation or update. If the debit would push bytes_used above quota_bytes the transaction is rolled back and the write is rejected with a quota exceeded error. On delete, the quota is credited. Using the same transaction for metadata change and quota update ensures no bytes are ever counted without a corresponding inode or charged without storage being consumed.
API Design
POST /namespaces/{ns}/inodes— Create a file or directory inode at a given parent path.GET /namespaces/{ns}/inodes?path=/a/b— Resolve a path and return the inode attributes.GET /namespaces/{ns}/inodes/{id}/children?page_token=X&limit=1000— Paginated directory listing.PATCH /namespaces/{ns}/inodes/{id}/permissions— Update mode bits or ACL entries.POST /namespaces/{ns}/inodes/{id}/rename— Atomic rename; body specifies new parent and new name.GET /namespaces/{ns}/quota/{uid}— Query current quota usage for a user.
Scalability and Reliability
Horizontal Partitioning
Inodes and DirEntries are partitioned by namespace_id. Each namespace maps to a shard group in the distributed database. Large namespaces with billions of inodes may be further sub-sharded by a hash of the parent_inode_id. Cross-shard atomic renames (moving a file between two shards) use a two-phase commit coordinated by the metadata service, which holds a distributed lock on the source and destination during the rename to prevent concurrent modifications.
Bulk Listing Scalability
DirEntry records are stored in sorted order by (parent_inode_id, name) so listing is a range scan rather than a full-table scan. Pagination uses a cursor encoding the last seen name. For directories with millions of entries the scanner returns batches of 1000 entries per page. The caller repeats the request with the cursor until an empty page is returned. Listing is eventually consistent in the presence of concurrent creates and deletes — the page cursor may miss or duplicate entries added during iteration, which is acceptable for most use cases.
Quota Ledger Consistency
QuotaLedger updates use serializable isolation to prevent two concurrent writes from both reading the same bytes_used, both concluding they are under quota, and both succeeding when their combined write would exceed quota. For high-concurrency namespaces, quota is tracked with a distributed counter using CRDTs or a reservations system: writers reserve a quota allocation in advance and release unused reservations periodically, reducing contention on the central ledger row.
Interview Tips
Interviewers probe: how do you handle deep directory trees efficiently? (cache resolved path prefixes; limit max depth). How do you implement shared directories without duplicating inodes? (hard links: multiple DirEntry records point to the same inode_id; link_count on the inode tracks how many DirEntries reference it). How do you prevent quota evasion through hard links? (count bytes against the owner of the inode, not the link, so creating a hard link does not create additional quota charge).
See also: Scale AI Interview Guide 2026: Data Infrastructure, RLHF Pipelines, and ML Engineering
See also: Atlassian Interview Guide