CDN and Edge Caching

Push content to edge locations close to users. Covers cache-control headers, invalidation strategies, and when to use a CDN vs application-level cache.

A user in Tokyo loads your website. Your servers are in Virginia. The request crosses the Pacific Ocean and the continental United States: roughly 15,000 km of fiber, through undersea cables and many routers. Even at the speed of light in a vacuum, that distance imposes a minimum round-trip time of about 100ms. Light in fiber travels about a third slower, and in practice routing, congestion, and TCP and TLS handshakes push the real figure to 200-400ms. A modern web page needs dozens of resources (HTML, CSS, JavaScript, images, fonts, API calls), and many of them can only be requested after an earlier response arrives, so the round trips stack up and the page takes seconds to load. A Content Delivery Network (CDN) solves this by caching content on servers physically close to the user, turning that 200ms into 5-20ms.
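The physics can be checked with a few lines of arithmetic. This is a minimal sketch: the function name `min_rtt_ms` and the fiber refractive index of 1.47 are assumptions for illustration, and the result covers propagation delay only, not routing or handshakes.

```python
SPEED_OF_LIGHT_KM_S = 299_792   # speed of light in a vacuum
FIBER_REFRACTIVE_INDEX = 1.47   # assumed typical value for optical fiber

def min_rtt_ms(distance_km: float, in_fiber: bool = True) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    speed = SPEED_OF_LIGHT_KM_S
    if in_fiber:
        speed /= FIBER_REFRACTIVE_INDEX  # light is slower inside glass
    return 2 * distance_km / speed * 1000

print(f"vacuum: {min_rtt_ms(15_000, in_fiber=False):.0f} ms")
print(f"fiber:  {min_rtt_ms(15_000):.0f} ms")
```

For the 15,000 km path this gives about 100 ms in a vacuum and about 147 ms in fiber, before any real-world overhead is added.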

How a CDN Works

Points of Presence (PoPs)

A CDN operates hundreds or thousands of edge servers distributed globally in Points of Presence (PoPs). Each PoP is a data center strategically located near population centers. When a user requests content, DNS or anycast routing directs them to a nearby PoP, usually the one with the lowest network latency.

Without CDN:
  User (Tokyo) ──── 200ms ────> Origin Server (Virginia)

With CDN:
  User (Tokyo) ──── 5ms ────> CDN PoP (Tokyo)
                                │ Cache HIT → return immediately
                                │ Cache MISS → fetch from origin, cache, return

Global PoP distribution (example):
  North America: 50 PoPs
  Europe:        40 PoPs
  Asia:          30 PoPs
  South America: 15 PoPs
  Africa:        10 PoPs
  Oceania:        5 PoPs

Request Flow

1. User types www.example.com.
2. DNS resolution: CNAME points to cdn.example.com → CDN's DNS.
3. CDN's DNS uses GeoDNS or Anycast to resolve to the nearest PoP.
4. User's request goes to the nearest PoP.
5. PoP checks its cache:
   a. Cache HIT  → return cached response immediately.
   b. Cache MISS → PoP fetches from the origin server, caches the response,
                    and returns it to the user.
6. Subsequent requests from nearby users hit the cache.
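Step 5 of the flow above can be sketched as a small class. This is an illustration only: `EdgePoP`, `handle`, and the fake `origin` function are hypothetical names, and a real PoP would also honor TTLs and cache-control headers.

```python
from typing import Callable, Dict, List, Tuple

class EdgePoP:
    """Hypothetical edge cache: serves hits locally, fills misses from origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._cache: Dict[str, bytes] = {}

    def handle(self, url: str) -> Tuple[str, bytes]:
        if url in self._cache:
            return "HIT", self._cache[url]   # step 5a
        body = self._fetch(url)              # step 5b: go to the origin
        self._cache[url] = body              # keep it for later requests
        return "MISS", body

origin_calls: List[str] = []

def origin(url: str) -> bytes:
    """Stand-in for the origin server; records each request it receives."""
    origin_calls.append(url)
    return f"content of {url}".encode()

pop = EdgePoP(origin)
first = pop.handle("/index.html")    # MISS, fetched from origin
second = pop.handle("/index.html")   # HIT, origin is not contacted
print(first[0], second[0], len(origin_calls))
```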

CDN Architecture Tiers

Modern CDNs use a tiered architecture to reduce origin load:

          User
           │
     ┌─────▼──────┐
     │  Edge PoP   │  (closest to user, first cache check)
     │ (L1 Cache)  │
     └─────┬───────┘
           │ Cache MISS
     ┌─────▼──────┐
     │ Regional    │  (mid-tier, aggregates misses from multiple edge PoPs)
     │ Shield (L2) │
     └─────┬───────┘
           │ Cache MISS
     ┌─────▼──────┐
     │   Origin    │  (your server)
     │   Server    │
     └─────────────┘

The regional shield (also called a mid-tier cache or origin shield) prevents multiple edge PoPs from simultaneously requesting the same resource from the origin (thundering herd problem).
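The effect of the shield on origin load can be sketched by stacking two cache layers. The helper `make_cache` and the counter `stats` are hypothetical, and the sketch is sequential: it shows misses from several edges being absorbed by one shield, but it does not model the coalescing of simultaneous in-flight requests that real shields also perform.

```python
from typing import Callable, Dict

Fetch = Callable[[str], str]

def make_cache(upstream: Fetch) -> Fetch:
    """Wrap an upstream fetch function with a simple in-memory cache layer."""
    store: Dict[str, str] = {}

    def fetch(url: str) -> str:
        if url not in store:
            store[url] = upstream(url)   # miss: ask the next tier
        return store[url]

    return fetch

stats = {"origin_requests": 0}

def origin(url: str) -> str:
    stats["origin_requests"] += 1
    return f"body:{url}"

shield = make_cache(origin)                      # one regional shield (L2)
edges = [make_cache(shield) for _ in range(5)]   # five edge PoPs (L1)

for edge in edges:
    edge("/video/intro.mp4")   # every edge misses once and asks the shield

print(stats["origin_requests"])   # the origin saw a single request
```

Without the shield tier (each edge wrapping `origin` directly), the same loop would send five requests to the origin.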

Cache-Control Headers

HTTP cache-control headers govern how CDNs (and browsers) cache content. Getting these right is critical.

Key Headers

Cache-Control: public, max-age=86400
  → CDN and browsers can cache for 24 hours

Cache-Control: private, max-age=3600
  → Only the user's browser can cache (not CDN). For personalized content.

Cache-Control: no-cache
  → Can be cached, but must revalidate with origin before serving.

Cache-Control: no-store
  → Never cache. For sensitive data (banking, medical).

Cache-Control: s-maxage=3600
  → CDN-specific max age (overrides max-age for shared caches).

Cache-Control: stale-while-revalidate=60
  → For up to 60 seconds after the response expires, serve the stale copy
    while fetching a fresh one in the background.
  → Great for perceived performance.

ETag: "abc123"
  → Content fingerprint. CDN sends If-None-Match request to origin.
  → Origin responds 304 Not Modified if unchanged (no body transfer).

Vary: Accept-Encoding
  → Cache different versions based on request headers.
  → Essential for serving gzip vs brotli compressed content.
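How a shared cache such as a CDN might read these directives can be sketched as follows. The function names are hypothetical, and treating a response with no explicit lifetime as "always revalidate" is an assumption made here for simplicity; real caches may apply heuristic freshness instead.

```python
from typing import Dict, Optional

def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives

def shared_cache_ttl(header: str) -> Optional[int]:
    """Seconds a CDN may serve the response without revalidating.

    Returns None when the CDN must not store the response at all.
    """
    d = parse_cache_control(header)
    if "no-store" in d or "private" in d:
        return None
    if "no-cache" in d:
        return 0   # storable, but must revalidate before every use
    for name in ("s-maxage", "max-age"):   # s-maxage wins for shared caches
        value = d.get(name)
        if value is not None:
            return int(value)
    return 0   # assumption: no explicit lifetime means always revalidate

print(shared_cache_ttl("public, max-age=300, s-maxage=3600"))
```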

Caching Strategy by Content Type

Static assets (JS, CSS, images):
  Cache-Control: public, max-age=31536000, immutable
  Use content-hash in filename: app.a1b2c3.js
  Cache forever. Deploy new filename when content changes.

HTML pages:
  Cache-Control: public, max-age=300, stale-while-revalidate=60
  Short TTL. Users get content that is at most about 5 minutes old
  (plus up to 60 seconds while a background revalidation runs).

API responses:
  Cache-Control: public, s-maxage=60
  or
  Cache-Control: private, no-cache (for personalized data)

User-specific content:
  Cache-Control: private, no-store
  Never cache on CDN. Use application-level caching instead.
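An origin server could apply the strategy above with a small lookup function. This is a sketch under assumptions: the name `cache_control_for`, the `/api/` path prefix, and the rule that a hashed asset carries a hex hash of six or more characters before its extension are all illustrative choices, not part of any standard.

```python
import re

# Assumed convention: a hex content hash (6+ chars) sits before the extension.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{6,}\.(js|css|png|jpg|svg|woff2)$")

def cache_control_for(path: str, personalized: bool = False) -> str:
    """Pick a Cache-Control value for a response based on its path."""
    if personalized:
        return "private, no-store"                    # user-specific content
    if HASHED_ASSET.search(path):
        return "public, max-age=31536000, immutable"  # hashed static asset
    if path.startswith("/api/"):
        return "public, s-maxage=60"                  # cacheable API response
    return "public, max-age=300, stale-while-revalidate=60"  # HTML pages

for p in ("/static/app.a1b2c3.js", "/api/products", "/index.html"):
    print(p, "->", cache_control_for(p))
```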

Cache Invalidation

Phil Karlton famously said, "There are only two hard things in Computer Science: cache invalidation and naming things." CDN cache invalidation is the practical embodiment of this.

Approaches

TTL-based expiration: Set a max-age and let caches expire naturally. Simple but imprecise. Content might be stale for up to the TTL duration.

Purge by URL: Tell the CDN to invalidate a specific URL immediately.

CDN API: PURGE /images/product-42.jpg
  → All PoPs remove this URL from cache
  → Next request fetches fresh content from origin

Purge by cache tag: Tag cached objects with metadata. Purge all objects with a given tag.

Response header: Cache-Tag: product-42, category-electronics
CDN API: Purge all objects tagged "product-42"
  → Invalidates product page, product image, product API response
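Tag-based purging comes down to keeping an index from tag to cached URLs. The class below is a hypothetical in-memory sketch of that idea, not any provider's API; a real CDN also has to propagate the purge to every PoP.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set

class TaggedCache:
    """Hypothetical cache that indexes stored URLs by tag."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        self._urls_by_tag: Dict[str, Set[str]] = defaultdict(set)

    def put(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        self._objects[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> Optional[bytes]:
        return self._objects.get(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every object carrying the tag; return how many were removed."""
        removed = 0
        for url in self._urls_by_tag.pop(tag, set()):
            if self._objects.pop(url, None) is not None:
                removed += 1
        return removed

cache = TaggedCache()
cache.put("/products/42", b"page", ["product-42", "category-electronics"])
cache.put("/images/product-42.jpg", b"image", ["product-42"])
cache.put("/products/7", b"other page", ["category-electronics"])
purged = cache.purge_tag("product-42")
print(purged, cache.get("/products/42"), cache.get("/products/7"))
```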

Versioned URLs (cache busting): The most reliable approach. Change the URL when the content changes. Since the URL is different, the CDN treats it as a new resource.

Before: /static/app.js
After:  /static/app.v2.js  or  /static/app.a1b2c3.js (content hash)

The old URL stays cached (harmless). The new URL is fetched fresh.
No purging needed.
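Build tools usually generate content-hashed filenames automatically. As a rough sketch of what they do, the hypothetical helper below inserts a truncated SHA-256 digest before the extension; the digest length of 8 is an arbitrary choice.

```python
import hashlib
from pathlib import PurePosixPath

def hashed_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Return the path with a short content hash inserted before the extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))

old_url = hashed_name("/static/app.js", b"console.log('v1');")
new_url = hashed_name("/static/app.js", b"console.log('v2');")
print(old_url)
print(new_url)   # different content, so a different URL
```

Because the name depends only on the bytes, an unchanged file keeps its URL across deploys and stays cached, while any change produces a URL the CDN has never seen.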

In practice, use versioned URLs for static assets and TTL + purge for dynamic content.

Dynamic Content Acceleration

CDNs are not just for static files. Modern CDNs accelerate dynamic content too:

TCP Optimization

CDNs maintain persistent, optimized TCP connections between PoPs and the origin. Instead of the user establishing a new TCP connection across the ocean, the user connects to the nearby PoP (fast), and the PoP uses its pre-established connection to the origin (optimized).

Without CDN:
  User → TCP handshake (200ms) → TLS handshake (200ms) → Request (200ms)
  Total: ~600ms before first byte

With CDN:
  User → TCP+TLS to PoP (10ms) → PoP reuses its warm connection to origin: one round trip, no handshakes (100ms)
  Total: ~110ms before first byte
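The totals in the comparison above follow from counting round trips. The sketch below reproduces them; the function names are hypothetical, and the model assumes one round trip each for the TCP handshake, the TLS handshake, and the request itself.

```python
def ttfb_direct_ms(rtt_ms: float, handshake_round_trips: int = 2) -> float:
    """Handshakes (TCP + TLS) plus one request round trip, all to the origin."""
    return (handshake_round_trips + 1) * rtt_ms

def ttfb_via_pop_ms(edge_setup_ms: float, origin_rtt_ms: float) -> float:
    """Connection setup to the nearby PoP, then one request round trip
    over a connection the PoP already holds open to the origin."""
    return edge_setup_ms + origin_rtt_ms

print(ttfb_direct_ms(200))        # 600 ms without a CDN
print(ttfb_via_pop_ms(10, 100))   # 110 ms with a CDN
```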

Edge Compute

Services like Cloudflare Workers, AWS Lambda@Edge, and Fastly Compute@Edge allow you to run code at CDN edge locations. Use cases:

  • A/B testing (route users to different versions without hitting origin).
  • Authentication and authorization at the edge.
  • Image resizing and optimization.
  • Personalized content assembly.
  • Bot detection and security filtering.
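As an example of the first use case, an edge function can assign users to A/B variants deterministically by hashing a stable identifier, so no origin lookup or shared state is needed. The sketch below is written in Python for consistency with the other examples; actual edge platforms each have their own runtimes and languages, and the function name and parameters are hypothetical.

```python
import hashlib

def ab_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically assign a user to 'control' or 'treatment'."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    # Map the first 8 bytes of the hash to a float in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2**64
    return "treatment" if fraction < treatment_share else "control"

# The same user always lands in the same bucket for a given experiment.
print(ab_bucket("user-123", "new-checkout"))
print(ab_bucket("user-123", "new-checkout"))
```

Including the experiment name in the hash input keeps bucket assignments independent across experiments.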

Prefetching

Some CDNs analyze HTML responses and prefetch linked resources (CSS, JS, images) before the client requests them. The resources are already cached at the PoP when the client's browser requests them.
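The first half of prefetching, finding the linked resources in an HTML response, can be sketched with the standard library's `html.parser`. The class and function names are hypothetical, and only stylesheets, scripts, and images are collected; how a given CDN actually selects resources is not specified here.

```python
from html.parser import HTMLParser
from typing import List

class LinkedResources(HTMLParser):
    """Collect URLs of stylesheets, scripts, and images referenced by a page."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "stylesheet" and a.get("href"):
            self.urls.append(a["href"])
        elif tag in ("script", "img") and a.get("src"):
            self.urls.append(a["src"])

def resources_to_prefetch(html: str) -> List[str]:
    parser = LinkedResources()
    parser.feed(html)
    return parser.urls

page = (
    '<html><head><link rel="stylesheet" href="/app.css">'
    '<script src="/app.js"></script></head>'
    '<body><img src="/logo.png"><a href="/about">About</a></body></html>'
)
print(resources_to_prefetch(page))
```

The PoP would then request each returned URL from its own cache or the origin so the objects are warm by the time the browser asks for them.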

CDN vs Application Cache

Aspect            | CDN                                 | Application Cache (Redis/Memcached)
Location          | Edge, close to users                | Data center, close to database
Content type      | Static assets, public content       | Dynamic data, session state
Cache key         | URL + headers                       | Custom key
Latency reduction | Network latency (geography)         | Database query latency
Invalidation      | Purge API, TTL                      | Direct control (delete key)
Cost model        | Per-GB transferred + per-request    | Infrastructure cost
Personalization   | Limited (Vary header, edge compute) | Full (per-user caching)

Use CDN for: Static assets, public pages, APIs with cacheable responses, media files.

Use app cache for: User sessions, database query results, computed values, rate limiting counters.

Use both together. A typical architecture caches static assets on the CDN, caches database queries in Redis, and uses the CDN for API responses where possible.

Real-World CDN Providers

Cloudflare

Operates in 300+ cities. Known for security features (DDoS protection, WAF) alongside CDN. Offers Cloudflare Workers for edge compute. Free tier available.

AWS CloudFront

Integrates tightly with AWS services (S3, ALB, Lambda@Edge). 400+ PoPs. Pay-per-use pricing. Good choice for AWS-native architectures.

Akamai

The original CDN (founded 1998). Largest network with 4,000+ PoPs. Used by many enterprise and media companies. Known for reliability but more expensive and complex.

Fastly

Known for real-time purging (< 150ms global purge) and programmability (VCL, Compute@Edge). Popular with developer-focused companies.

Capacity Estimation for CDN

Scenario: Video streaming platform
  - 10 million daily active users
  - Average session: 30 minutes of video
  - Video bitrate: 5 Mbps (1080p)

Daily bandwidth:
  10M users × 30 min × 60 sec × 5 Mbps = 9 × 10^10 Mb = 90 Petabits
  = 11.25 Petabytes per day

CDN cost estimate (at $0.02/GB):
  11,250,000 GB × $0.02 = $225,000/day = $6,750,000/month

Cache hit ratio target: 95%+
  → 95% served from CDN cache, 5% from origin
  → Origin bandwidth: ~563 TB/day (much more manageable)
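Unit conversions are the easiest place to slip in this kind of estimate, so it helps to run the arithmetic from the scenario's inputs. The sketch below uses decimal units (1 GB = 1,000 MB) and assumes a 30-day month; the variable names are illustrative.

```python
users = 10_000_000
session_seconds = 30 * 60
bitrate_mbps = 5        # megabits per second
price_per_gb = 0.02     # USD per GB transferred
hit_ratio = 0.95

megabits_per_day = users * session_seconds * bitrate_mbps
gigabytes_per_day = megabits_per_day / 8 / 1000    # Mb -> MB -> GB
petabytes_per_day = gigabytes_per_day / 1_000_000

daily_cost = gigabytes_per_day * price_per_gb
monthly_cost = daily_cost * 30
origin_tb_per_day = gigabytes_per_day * (1 - hit_ratio) / 1000

print(f"{petabytes_per_day:.2f} PB/day")
print(f"${daily_cost:,.0f}/day, ${monthly_cost:,.0f}/month")
print(f"{origin_tb_per_day:.1f} TB/day from origin")
```

A quick sanity check per user: 30 minutes at 5 Mbps is 9,000 megabits, or about 1.125 GB per session.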

Interview Tips

  • Explain the physics. Speed of light imposes minimum latency based on distance. CDNs solve this by reducing distance. This grounds your explanation in reality.
  • Discuss cache-control headers. Knowing the difference between `public`, `private`, `no-cache`, `no-store`, and `s-maxage` demonstrates practical experience.
  • Address cache invalidation. Mention versioned URLs for static assets (most reliable) and purge APIs for dynamic content.
  • Mention the tiered architecture. Edge → Shield → Origin. The shield layer prevents thundering herd on the origin.
  • Distinguish CDN from app cache. CDN reduces network latency (geographic). App cache reduces compute/database latency. Use both.
  • Know when NOT to use a CDN. Personalized content, real-time data, and content with strict consistency requirements are poor CDN candidates. However, edge compute is blurring this line.
  • Mention cache hit ratio. This is the metric that determines CDN effectiveness. A 95%+ hit ratio means at most 5% of requests reach the origin. Discuss how TTLs, content types, and long-tail distribution affect hit ratio.