What Is Object Storage?
Object storage stores unstructured data (images, videos, backups, logs) as discrete objects with a flat namespace (bucket/key), unlike hierarchical file systems. Amazon S3 stores trillions of objects and handles millions of requests per second. Object storage is optimized for large, immutable objects (write-once-read-many) and provides high durability (S3 guarantees 99.999999999% — 11 nines) via replication.
Core API
- PUT object: upload an object to bucket/key. S3 returns 200 only after the object is durably persisted to multiple AZs. Object size limit: 5GB for single PUT; use multipart upload for larger objects.
- GET object: retrieve object by bucket/key. Range GET (HTTP Range header) retrieves a byte range — used by video streaming to seek within large video files.
- DELETE object: removes the object (with versioning enabled, creates a delete marker — the object is recoverable).
- Multipart upload: split large objects into 5MB-5GB parts (the last part may be smaller), upload the parts in parallel, and finish with a final CompleteMultipartUpload call. Each part is a separate PUT request. Enables parallel upload across multiple connections, resumable uploads (retry only the failed parts), and objects larger than 5GB (up to 5TB, in at most 10,000 parts).
- Pre-signed URL: a time-limited URL that grants temporary access to a specific object. Generated server-side and sent to clients for direct browser-to-S3 upload/download, bypassing your application server. Avoids proxying large files through application servers.
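The multipart flow above can be sketched as a part planner. `plan_multipart` is a hypothetical helper, not an SDK call; only the 5MB/5GB part-size limits and the 10,000-part cap come from the S3 API.

```python
def plan_multipart(total_size: int,
                   part_size: int = 64 * 1024 * 1024) -> list[tuple[int, int]]:
    """Return (offset, length) byte ranges for each part of a multipart upload.

    S3 constraints: parts are 5MB-5GB (the last part may be smaller),
    and an upload may have at most 10,000 parts.
    """
    MIN_PART, MAX_PART, MAX_PARTS = 5 * 1024**2, 5 * 1024**3, 10_000
    if not MIN_PART <= part_size <= MAX_PART:
        raise ValueError("part size must be between 5MB and 5GB")
    parts = [(offset, min(part_size, total_size - offset))
             for offset in range(0, total_size, part_size)]
    if len(parts) > MAX_PARTS:
        raise ValueError("increase part_size: more than 10,000 parts")
    return parts

# A 200MB object split into 64MB parts -> 4 parts, the last one 8MB
parts = plan_multipart(200 * 1024**2)
```

Each `(offset, length)` range can then be uploaded on its own connection and retried independently on failure.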
Data Durability via Replication
S3’s 11-nine durability comes from storing each object across at least 3 Availability Zones (AZs). When you PUT an object: (1) S3 receives the object at the primary AZ. (2) Simultaneously replicates to 2+ additional AZs. (3) Returns success only after all replicas confirm. If an AZ loses power/hardware, the other AZs serve all GET requests for objects whose replicas they hold. The probability of all 3+ independent AZs failing simultaneously and causing data loss is ~10^-11 per year per object.
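The independence argument can be made concrete with a toy calculation. The per-replica loss probability below is an illustrative assumption, not an AWS figure, and the model ignores repair (real durability depends heavily on how fast lost replicas are re-created).

```python
# Toy durability model: an object is lost only if all 3 independent
# replicas fail within the same year, before any repair happens.
p_replica_loss = 1e-4            # assumed annual loss probability of one replica
p_object_loss = p_replica_loss ** 3   # all three AZ replicas fail

print(f"annual object loss probability ~ {p_object_loss:.0e}")
```

With these assumed numbers the loss probability lands near the 10^-11-per-object-per-year scale the 11-nines figure implies.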
Erasure coding: for very large objects, S3 may use erasure coding (similar to RAID-6) instead of full replication. The object is split into k data shards plus m parity shards; any k of the k+m shards can reconstruct the original. With k=10, m=4 (14 shards total), you can lose any 4 shards and still recover the data, at a storage factor of 14/10 = 1.4× (versus 3× for full replication).
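A minimal sketch of the recovery idea, using a single XOR parity shard (m=1, RAID-5 style). Production systems use Reed-Solomon codes to support m > 1, but the principle is the same: any k of the k+m shards rebuild the object.

```python
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal shards and append one XOR parity shard."""
    size = -(-len(data) // k)                      # ceil(len / k)
    padded = data.ljust(k * size, b"\0")
    shards = [padded[i * size:(i + 1) * size] for i in range(k)]
    shards.append(reduce(xor, shards))             # parity = XOR of data shards
    return shards                                  # k data shards + 1 parity

def recover(shards: list[bytes], lost: int) -> bytes:
    # XOR of the surviving k shards reconstructs the missing one.
    return reduce(xor, (s for i, s in enumerate(shards) if i != lost))

shards = encode(b"object payload bytes", k=4)
assert recover(shards, lost=2) == shards[2]        # any single loss is repairable
```

With real Reed-Solomon coding the same recovery property holds for any m simultaneous shard losses, which is what makes the k=10, m=4 layout tolerate 4 failures.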
Consistency Model
S3 provides strong read-after-write consistency (since December 2020): after a successful PUT, any subsequent GET for the same key is guaranteed to return the new data. Before 2020, S3 was eventually consistent — a PUT might not be visible to subsequent GETs for seconds. Strong consistency simplifies application development but required significant engineering changes to S3’s internal metadata layer.
Bucket Internals: Data Placement
A bucket is a logical namespace. Internally: object data is chunked into 64MB or 128MB blocks stored on physical disks across many storage servers. The object metadata (bucket, key, size, ETag, created_at, version_id, block locations) is stored in a distributed metadata service. When a GET arrives: look up the metadata to find block locations, fetch the blocks (possibly from multiple storage nodes), reassemble, and return. The key space is partitioned across storage nodes by key range, so sequential keys (e.g., log files timestamped 2024-01-01, 2024-01-02) land on the same partition, creating hotspots. The classic workaround was to randomize key prefixes yourself; modern S3 detects hot keys and splits partitions automatically.
Storage Classes and Lifecycle Policies
S3 offers tiered storage with different cost/access trade-offs:
- S3 Standard: frequent access, 3-AZ replication, ~$0.023/GB-month, millisecond retrieval
- S3 Standard-IA (Infrequent Access): lower storage cost (~$0.0125/GB-month) plus a per-GB retrieval fee; same durability, slightly lower availability SLA. For objects accessed less than once a month.
- S3 Glacier Instant Retrieval: archive with millisecond retrieval, ~$0.004/GB-month. For objects accessed quarterly.
- S3 Glacier Flexible Retrieval: 3-5 hour retrieval, ~$0.0036/GB-month. For compliance archives.
- S3 Glacier Deep Archive: 12-hour retrieval, ~$0.00099/GB-month. For 7-year compliance retention at minimum cost.
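The Standard vs Standard-IA break-even falls out of a one-line calculation using the storage prices above; the ~$0.01/GB retrieval fee is an assumption for illustration (check current pricing).

```python
# Monthly cost per GB: Standard = storage only; Standard-IA = cheaper
# storage plus a per-GB retrieval fee for r full reads per month.
standard = 0.023          # $/GB-month (from the list above)
ia_storage = 0.0125       # $/GB-month
ia_retrieval = 0.01       # $/GB retrieved (assumed for illustration)

# Standard-IA is cheaper while: ia_storage + r * ia_retrieval < standard
break_even_reads = (standard - ia_storage) / ia_retrieval
print(f"break-even: {break_even_reads:.2f} full reads per GB per month")
```

So with these prices, Standard-IA wins as long as you read the object roughly once a month or less.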
Lifecycle policies automate transitions: “move to Standard-IA after 30 days, Glacier after 90 days, delete after 365 days.” This automatically reduces storage costs for log files and backups.
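The quoted policy maps directly onto S3's lifecycle configuration format. The rule ID and the `logs/` prefix below are placeholders; a sketch of the JSON you would pass to PutBucketLifecycleConfiguration:

```json
{
  "Rules": [
    {
      "ID": "archive-logs",
      "Filter": { "Prefix": "logs/" },
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```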
Access Control
Three layers: (1) Bucket policies (resource-based IAM policies) — define who can access which operations. (2) IAM policies (identity-based) — define what AWS identities can do. (3) Access Control Lists (ACLs) — legacy per-object permissions. Modern recommendation: use bucket policies + IAM, disable ACLs. Block Public Access settings (account-wide or per-bucket) prevent accidental public exposure — a common cause of data breaches.
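A minimal bucket policy in the standard IAM policy grammar, granting one role read-only access; the bucket name, account ID, and role name are placeholders:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAppReadOnly",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::123456789012:role/app-server" },
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```

Because this is a resource-based policy on the bucket, it composes with the role's identity-based IAM policy: access requires that neither layer denies the request.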
CDN Integration
S3 serves as the origin for CloudFront (CDN). Workflow: user requests an image → CloudFront edge node checks its cache → on a miss, CloudFront fetches from S3 (the origin) and caches the response → subsequent requests for the same object are served directly from the edge node near the user. This reduces S3 GET costs by 80-90% for popular objects and cuts latency from 100-500ms (a long-haul request to S3) to 5-20ms (edge cache). S3 + CloudFront is the standard architecture for static asset serving, video streaming (HLS segments), and software distribution.
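The cost claim can be sanity-checked from the hit ratio alone; the request volume below is made up for illustration.

```python
# With cache hit ratio h, only (1 - h) of GETs reach the S3 origin, so
# origin request cost drops by the same factor as the hit ratio.
requests_per_day = 10_000_000     # illustrative traffic volume
hit_ratio = 0.90                  # within the 80-90% range cited above
origin_gets = requests_per_day * (1 - hit_ratio)
print(f"{origin_gets:,.0f} GETs/day reach S3")  # 1,000,000
```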
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How does Amazon S3 achieve 11-nine durability (99.999999999%)?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "S3 achieves 99.999999999% durability by storing each object across at least 3 geographically separated Availability Zones (AZs) within a region. When you PUT an object: (1) S3 receives the object data, (2) synchronously replicates it to storage nodes in 2+ additional AZs before acknowledging success, (3) returns 200 OK only after all replicas confirm persistence to durable storage. For very large objects, S3 uses erasure coding: split the object into k=10 data shards and m=4 parity shards (14 total). Any k=10 shards can reconstruct the original \u2014 you can lose any 4 shards before losing data. This uses 1.4\u00d7 storage overhead versus 3\u00d7 for full replication, while tolerating 4 simultaneous shard failures. Additional durability measures: checksums on all data at rest (MD5/SHA-256) and during transfer (Content-MD5 header) to detect bit rot. Background integrity scanning checks all stored objects periodically and automatically repairs corruption by recreating the corrupted shard from parity. The 11-nine figure means: with 10,000,000 objects stored, you can expect to lose a single object about once every 10,000 years on average. Practical comparison: storing data on a single consumer SSD gives ~4-nines (99.99%) annual durability."
      }
    },
    {
      "@type": "Question",
      "name": "What is a pre-signed URL and when should you use it?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A pre-signed URL is a time-limited URL that grants temporary access to a specific S3 object without requiring AWS credentials. Generated by your application server using the AWS SDK with: bucket name, object key, HTTP method (GET or PUT), and expiration time (seconds, up to 7 days). The URL includes an AWS signature cryptographically tied to the specific object and expiry. Use cases: (1) Browser-to-S3 direct upload \u2014 the user selects a file in your web app; your server generates a pre-signed PUT URL (expiring in 5 minutes) and returns it to the browser; the browser uploads directly to S3 using the URL. This avoids routing large files through your application server (reduces server load, cost, and latency). (2) Temporary download access \u2014 generate a pre-signed GET URL for a protected object. The URL is only valid for 1 hour \u2014 users who share the URL cannot grant permanent access. (3) Sharing private content \u2014 send a pre-signed URL in an email or webhook for secure one-time downloads. Security considerations: pre-signed URLs bypass IAM policies for the specific object \u2014 anyone with the URL can access the object until expiry, so use short expiry times (minutes for uploads, hours for downloads) and never embed pre-signed URLs in client-side code that could be extracted."
      }
    },
    {
      "@type": "Question",
      "name": "How do S3 storage classes reduce costs for infrequently accessed data?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "S3 offers tiered storage classes with different cost/access trade-offs, letting you minimize cost by matching the storage class to your access pattern. S3 Standard: ~$0.023/GB-month, millisecond retrieval. For data accessed frequently. S3 Standard-IA (Infrequent Access): ~$0.0125/GB-month plus a retrieval fee of ~$0.01/GB. Cost-effective when data is accessed less than about once per month (at these prices, the break-even vs Standard is roughly one full read of the object per month). S3 Glacier Instant Retrieval: ~$0.004/GB-month, millisecond retrieval. For archives accessed quarterly. S3 Glacier Flexible Retrieval: ~$0.0036/GB-month, 3-5 hour retrieval. For compliance archives where retrieval is rare. S3 Glacier Deep Archive: ~$0.00099/GB-month, 12-hour retrieval. For 7-year compliance retention at minimum cost \u2014 95% cheaper than Standard. S3 Lifecycle policies automate transitions: CREATE \u2192 Standard \u2192 Standard-IA (after 30 days) \u2192 Glacier (after 90 days) \u2192 expire (after 365 days). For application logs: you need instant access to the last 7 days (Standard), occasional access to the last 90 days (IA), rare access to the last year (Glacier), then deletion. A lifecycle policy handles all transitions automatically, reducing storage cost by 70-90% versus keeping everything in Standard."
      }
    }
  ]
}