Caching
Push content to edge locations close to users. Covers cache-control headers, invalidation strategies, and when to use a CDN vs application-level cache.
A user in Tokyo loads your website. Your servers are in Virginia. The request travels across the Pacific Ocean -- roughly 15,000 km of fiber optic cable, through multiple routers and undersea cables. The speed of light imposes a minimum round-trip time of about 100ms, and in practice, network latency, congestion, and TCP handshakes push this to 200-400ms. Multiply that by the dozens of resources a modern web page requires (HTML, CSS, JavaScript, images, fonts, API calls), and the page takes seconds to load. A Content Delivery Network (CDN) solves this by caching content on servers physically close to the user, turning that 200ms into 5-20ms.
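The physical lower bound quoted above can be checked with a few lines of arithmetic. This is a minimal sketch: the 15,000 km distance comes from the text, while the 2/3 velocity factor for light in glass fiber is an assumed rule of thumb, and the function name is illustrative.

```python
# Lower bound on round-trip time imposed by signal propagation alone.
SPEED_OF_LIGHT_KM_S = 299_792   # speed of light in vacuum
FIBER_FACTOR = 2 / 3            # assumption: light in fiber travels at roughly 2/3 c


def min_rtt_ms(distance_km: float, velocity_factor: float = 1.0) -> float:
    """Return the minimum round-trip time in milliseconds for a one-way distance."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * velocity_factor)
    return 2 * one_way_seconds * 1000


vacuum_rtt = min_rtt_ms(15_000)               # theoretical floor
fiber_rtt = min_rtt_ms(15_000, FIBER_FACTOR)  # more realistic floor in fiber
print(f"vacuum: {vacuum_rtt:.0f} ms, fiber: {fiber_rtt:.0f} ms")
```

Routing detours, queuing, and handshakes all add to this floor, which is why observed latency is higher than either figure.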
A CDN operates hundreds or thousands of edge servers distributed globally in Points of Presence (PoPs). Each PoP is a data center strategically located near population centers. When a user requests content, DNS directs them to the nearest PoP.
Without CDN:
User (Tokyo) ──── 200ms ────> Origin Server (Virginia)
With CDN:
User (Tokyo) ──── 5ms ────> CDN PoP (Tokyo)
│ Cache HIT → return immediately
│ Cache MISS → fetch from origin, cache, return
Global PoP distribution (example):
North America: 50 PoPs
Europe: 40 PoPs
Asia: 30 PoPs
South America: 15 PoPs
Africa: 10 PoPs
Oceania: 5 PoPs

1. User types www.example.com.
2. DNS resolution: CNAME points to cdn.example.com → CDN's DNS.
3. CDN's DNS uses GeoDNS or Anycast to resolve to the nearest PoP.
4. User's request goes to the nearest PoP.
5. PoP checks its cache:
a. Cache HIT → return cached response immediately.
b. Cache MISS → PoP fetches from the origin server, caches the response,
and returns it to the user.
6. Subsequent requests from nearby users hit the cache.

Modern CDNs use a tiered architecture to reduce origin load:
User
│
┌─────▼──────┐
│ Edge PoP │ (closest to user, first cache check)
│ (L1 Cache) │
└─────┬───────┘
│ Cache MISS
┌─────▼──────┐
│ Regional │ (mid-tier, aggregates misses from multiple edge PoPs)
│ Shield (L2) │
└─────┬───────┘
│ Cache MISS
┌─────▼──────┐
│ Origin │ (your server)
│ Server │
└─────────────┘

The regional shield (also called a mid-tier cache or origin shield) prevents multiple edge PoPs from simultaneously requesting the same resource from the origin (the thundering herd problem).
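The hit/miss flow and the tiering described above can be sketched as a chain of caches, each falling back to the next tier on a miss. This is a simplified, single-threaded illustration: the class and PoP names are hypothetical, there is no TTL handling, and a real shield would also coalesce requests that arrive at the same moment.

```python
from typing import Callable, Dict, List


class CacheTier:
    """One cache layer; on a miss it asks its upstream and stores the result."""

    def __init__(self, name: str, upstream: Callable[[str], bytes]) -> None:
        self.name = name
        self.upstream = upstream  # next tier's get(), or the origin fetch
        self.store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self.store:  # cache HIT: return immediately
            self.hits += 1
            return self.store[url]
        self.misses += 1  # cache MISS: fetch upstream, cache, return
        body = self.upstream(url)
        self.store[url] = body
        return body


origin_log: List[str] = []  # every URL the origin had to serve


def origin_fetch(url: str) -> bytes:
    origin_log.append(url)
    return f"content of {url}".encode()


# Two edge PoPs share one regional shield in front of the origin.
shield = CacheTier("regional-shield", origin_fetch)
edge_tokyo = CacheTier("edge-tokyo", shield.get)
edge_osaka = CacheTier("edge-osaka", shield.get)

edge_tokyo.get("/static/app.js")  # miss at edge and shield, origin is contacted
edge_osaka.get("/static/app.js")  # miss at edge, hit at shield
edge_tokyo.get("/static/app.js")  # hit at edge

print(len(origin_log))  # number of requests that reached the origin
```

Even though two different edge PoPs missed, the origin served the object only once because the second miss was absorbed by the shield.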
HTTP cache-control headers govern how CDNs (and browsers) cache content. Getting these right is critical.
Cache-Control: public, max-age=86400
→ CDN and browsers can cache for 24 hours
Cache-Control: private, max-age=3600
→ Only the user's browser can cache (not CDN). For personalized content.
Cache-Control: no-cache
→ Can be cached, but must revalidate with origin before serving.
Cache-Control: no-store
→ Never cache. For sensitive data (banking, medical).
Cache-Control: s-maxage=3600
→ CDN-specific max age (overrides max-age for shared caches).
Cache-Control: stale-while-revalidate=60
→ Serve stale content while fetching fresh content in background.
→ Great for perceived performance.
ETag: "abc123"
→ Content fingerprint. CDN sends If-None-Match request to origin.
→ Origin responds 304 Not Modified if unchanged (no body transfer).
Vary: Accept-Encoding
→ Cache different versions based on request headers.
→ Essential for serving gzip vs brotli compressed content.

Static assets (JS, CSS, images):
Cache-Control: public, max-age=31536000, immutable
Use content-hash in filename: app.a1b2c3.js
Cache forever. Deploy new filename when content changes.
HTML pages:
Cache-Control: public, max-age=300, stale-while-revalidate=60
Short TTL. Users get fresh content within 5 minutes.
API responses:
Cache-Control: public, s-maxage=60
or
Cache-Control: private, no-cache (for personalized data)
User-specific content:
Cache-Control: private, no-store
Never cache on CDN. Use application-level caching instead.

Phil Karlton famously said, "There are only two hard things in Computer Science: cache invalidation and naming things." CDN cache invalidation is the practical embodiment of this.
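Before turning to invalidation, it may help to see how a shared cache could interpret the Cache-Control directives listed earlier. The sketch below is a simplification rather than a full implementation of HTTP caching rules: the function names are illustrative, directives such as `stale-while-revalidate` and `must-revalidate` are ignored, and a response with no explicit freshness is conservatively treated as "always revalidate", whereas real caches may apply heuristics.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header into {directive: value or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_ttl(header: str) -> Optional[int]:
    """Seconds a CDN may serve the response without revalidating.

    Returns None when a shared cache must not store the response at all,
    and 0 when it may store the response but must revalidate every time.
    """
    d = parse_cache_control(header)
    if "no-store" in d or "private" in d:
        return None
    if "no-cache" in d:
        return 0
    # s-maxage takes precedence over max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        value = d.get(name)
        if value is not None:
            return int(value)
    return 0  # simplification: no explicit freshness means always revalidate


for example in (
    "public, max-age=31536000, immutable",
    "public, max-age=300, s-maxage=60",
    "private, no-store",
):
    print(f"{example!r} -> {shared_cache_ttl(example)}")
```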
TTL-based expiration: Set a max-age and let caches expire naturally. Simple but imprecise. Content might be stale for up to the TTL duration.
Purge by URL: Tell the CDN to invalidate a specific URL immediately.
CDN API: PURGE /images/product-42.jpg
→ All PoPs remove this URL from cache
→ Next request fetches fresh content from origin

Purge by cache tag: Tag cached objects with metadata. Purge all objects with a given tag.
Response header: Cache-Tag: product-42, category-electronics
CDN API: Purge all objects tagged "product-42"
→ Invalidates product page, product image, product API response

Versioned URLs (cache busting): The most reliable approach. Change the URL when the content changes. Since the URL is different, the CDN treats it as a new resource.
Before: /static/app.js
After: /static/app.v2.js or /static/app.a1b2c3.js (content hash)
The old URL stays cached (harmless). The new URL is fetched fresh.
No purging needed.

In practice, use versioned URLs for static assets and TTL + purge for dynamic content.
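The four invalidation strategies can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not any CDN's actual API: `TaggedCache`, `hashed_filename`, and their parameters are hypothetical names, and a real CDN performs purges across many PoPs rather than on a single dictionary.

```python
import hashlib
import time
from typing import Callable, Dict, Iterable, Optional, Set, Tuple


def hashed_filename(name: str, content: bytes, digest_len: int = 8) -> str:
    """Versioned URL helper: app.js -> app.<content hash>.js"""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    stem, dot, ext = name.rpartition(".")
    return f"{stem}.{digest}.{ext}" if dot else f"{name}.{digest}"


class TaggedCache:
    """Toy cache supporting TTL expiry, purge by URL, and purge by tag."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[bytes, float]] = {}  # url -> (body, expires_at)
        self._tags: Dict[str, Set[str]] = {}                # tag -> urls

    def put(self, url: str, body: bytes, ttl: float, tags: Iterable[str] = ()) -> None:
        self._entries[url] = (body, self._clock() + ttl)
        for tag in tags:
            self._tags.setdefault(tag, set()).add(url)

    def get(self, url: str) -> Optional[bytes]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        body, expires_at = entry
        if self._clock() >= expires_at:  # TTL-based expiration
            del self._entries[url]
            return None
        return body

    def purge_url(self, url: str) -> None:
        self._entries.pop(url, None)

    def purge_tag(self, tag: str) -> int:
        """Remove every URL that carried the tag; return how many there were."""
        urls = self._tags.pop(tag, set())
        for url in urls:
            self._entries.pop(url, None)
        return len(urls)


cache = TaggedCache()
cache.put("/products/42", b"page", ttl=300, tags=["product-42", "category-electronics"])
cache.put("/images/product-42.jpg", b"img", ttl=300, tags=["product-42"])
cache.put("/products/7", b"other", ttl=300, tags=["category-electronics"])

purged = cache.purge_tag("product-42")
print(purged, cache.get("/products/42"), cache.get("/products/7"))
print(hashed_filename("app.js", b"console.log('v1')"))
```

The hashed filename changes whenever the file content changes, which is what lets static assets be cached with a very long max-age and never purged.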
CDNs are not just for static files. Modern CDNs accelerate dynamic content too:
CDNs maintain persistent, optimized TCP connections between PoPs and the origin. Instead of the user establishing a new TCP connection across the ocean, the user connects to the nearby PoP (fast), and the PoP uses its pre-established connection to the origin (optimized).
Without CDN:
User → TCP handshake (200ms) → TLS handshake (200ms) → Request (200ms)
Total: ~600ms before first byte
With CDN:
User → TCP+TLS to PoP (10ms) → PoP reuses connection to origin (100ms)
Total: ~110ms before first byte

Services like Cloudflare Workers, AWS Lambda@Edge, and Fastly Compute@Edge allow you to run code at CDN edge locations. Use cases:
Some CDNs analyze HTML responses and prefetch linked resources (CSS, JS, images) before the client requests them. The resources are already cached at the PoP when the client's browser requests them.
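As a rough illustration of the prefetching idea, the sketch below scans an HTML response for linked stylesheets, scripts, and images, which is the list a PoP could then warm in its cache. It uses Python's standard `html.parser`; the class name, function name, and sample markup are hypothetical, and how a given CDN actually selects resources to prefetch is not specified in the text.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class LinkedResourceParser(HTMLParser):
    """Collect URLs of stylesheets, scripts, and images referenced by a page."""

    def __init__(self) -> None:
        super().__init__()
        self.resources: List[str] = []

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        attr = dict(attrs)
        if tag == "link":
            # Only stylesheet links are treated as prefetch candidates here.
            if (attr.get("rel") or "").lower() == "stylesheet" and attr.get("href"):
                self.resources.append(attr["href"])
        elif tag in ("script", "img") and attr.get("src"):
            self.resources.append(attr["src"])


def linked_resources(html: str) -> List[str]:
    parser = LinkedResourceParser()
    parser.feed(html)
    parser.close()
    return parser.resources


page = (
    '<html><head><link rel="stylesheet" href="/static/app.css">'
    '<script src="/static/app.js"></script></head>'
    '<body><img src="/images/hero.jpg"></body></html>'
)
to_prefetch = linked_resources(page)
print(to_prefetch)
```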
| Aspect | CDN | Application Cache (Redis/Memcached) |
|---|---|---|
| Location | Edge, close to users | Data center, close to database |
| Content type | Static assets, public content | Dynamic data, session state |
| Cache key | URL + headers | Custom key |
| Latency reduction | Network latency (geography) | Database query latency |
| Invalidation | Purge API, TTL | Direct control (delete key) |
| Cost model | Per-GB transferred + per-request | Infrastructure cost |
| Personalization | Limited (Vary header, edge compute) | Full (per-user caching) |
Use CDN for: Static assets, public pages, APIs with cacheable responses, media files.
Use app cache for: User sessions, database query results, computed values, rate limiting counters.
Use both together. A typical architecture caches static assets on the CDN, caches database queries in Redis, and uses the CDN for API responses where possible.
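On the application-cache side of the comparison, a common way to cache database query results is the cache-aside pattern: check the cache under a custom key, load from the database on a miss, and delete the key directly when the data changes. The sketch below uses an in-memory class as a stand-in for Redis or Memcached; `AppCache`, `get_or_load`, and the key format are assumptions for illustration only.

```python
import time
from typing import Any, Callable, Dict, List, Optional, Tuple


class AppCache:
    """In-memory stand-in for Redis/Memcached with a per-key TTL."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, float]] = {}  # key -> (value, expires_at)

    def get(self, key: str) -> Optional[Any]:
        item = self._data.get(key)
        if item is None or time.monotonic() >= item[1]:
            self._data.pop(key, None)  # drop expired entries lazily
            return None
        return item[0]

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def get_or_load(cache: AppCache, key: str, loader: Callable[[], Any],
                ttl_seconds: float = 60.0) -> Any:
    """Cache-aside read: return the cached value, or load and cache it."""
    value = cache.get(key)
    if value is None:  # note: a loader that returns None is never cached
        value = loader()
        cache.set(key, value, ttl_seconds)
    return value


db_calls: List[str] = []  # records each simulated database query


def load_user_42() -> Dict[str, Any]:
    db_calls.append("SELECT * FROM users WHERE id = 42")
    return {"id": 42, "name": "Ada"}


cache = AppCache()
first = get_or_load(cache, "user:42:profile", load_user_42)   # miss, hits the database
second = get_or_load(cache, "user:42:profile", load_user_42)  # served from cache
cache.delete("user:42:profile")                               # direct invalidation
third = get_or_load(cache, "user:42:profile", load_user_42)   # miss again
print(len(db_calls))
```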
Cloudflare: Operates in 300+ cities. Known for security features (DDoS protection, WAF) alongside CDN. Offers Cloudflare Workers for edge compute. Free tier available.
Amazon CloudFront: Integrates tightly with AWS services (S3, ALB, Lambda@Edge). 400+ PoPs. Pay-per-use pricing. Good choice for AWS-native architectures.
Akamai: The original CDN (founded 1998). Largest network with 4,000+ PoPs. Used by many enterprise and media companies. Known for reliability but more expensive and complex.
Fastly: Known for real-time purging (< 150ms global purge) and programmability (VCL, Compute@Edge). Popular with developer-focused companies.
Scenario: Video streaming platform
- 10 million daily active users
- Average session: 30 minutes of video
- Video bitrate: 5 Mbps (1080p)
Daily bandwidth:
10M users × 30 min × 60 sec × 5 Mbps = 9 × 10^10 Mb = 90 Petabits
= 11.25 Petabytes per day
CDN cost estimate (at $0.02/GB):
11,250,000 GB × $0.02 = $225,000/day = $6,750,000/month
Cache hit ratio target: 95%+
→ 95% served from CDN cache, 5% from origin
→ Origin bandwidth: ~563 TB/day (much more manageable)

Knowing the difference between `public`, `private`, `no-cache`, `no-store`, and `s-maxage` demonstrates practical experience.
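As a cross-check on the video streaming estimate, the arithmetic can be reproduced in a few lines. The inputs (10 million users, 30 minutes, 5 Mbps, $0.02/GB, 95% hit ratio) come from the scenario; decimal units (1 GB = 1,000 MB) and a 30-day month are assumptions, and the function name is illustrative.

```python
def daily_traffic_gb(users: int, minutes_per_user: float, bitrate_mbps: float) -> float:
    """Total daily video traffic in gigabytes (decimal units)."""
    megabits = users * minutes_per_user * 60 * bitrate_mbps
    megabytes = megabits / 8  # 8 bits per byte
    return megabytes / 1000   # 1,000 MB per GB


PRICE_PER_GB = 0.02      # USD, from the scenario
CACHE_HIT_RATIO = 0.95   # target from the scenario
DAYS_PER_MONTH = 30      # assumption

daily_gb = daily_traffic_gb(10_000_000, 30, 5)
daily_cost = daily_gb * PRICE_PER_GB
monthly_cost = daily_cost * DAYS_PER_MONTH
origin_tb_per_day = daily_gb * (1 - CACHE_HIT_RATIO) / 1000

print(f"daily traffic: {daily_gb / 1_000_000:.2f} PB")
print(f"daily cost:    ${daily_cost:,.0f}")
print(f"monthly cost:  ${monthly_cost:,.0f}")
print(f"origin load:   {origin_tb_per_day:.1f} TB/day")
```

Each user streams 1.125 GB per session, so the hit ratio matters a great deal: every percentage point of misses sends over 100 TB per day back to the origin.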