engineering

How AuraImage's Edge CDN Works Under the Hood

Narek Hakobyan

This post is for the engineers who read ARCHITECTURE.md before the README. Here is how AuraImage moves images from upload to edge-serve.

Architecture at a glance

Client → CDN (Cloudflare) → Origin (Worker) → R2 (storage)
              ↓
         End user (serve)

The CDN layer is Cloudflare's global network. The origin is a Cloudflare Worker that handles upload verification, image processing, and storage. R2 is the persistent object store.

Upload flow

  1. Client requests an upload token from their backend. The backend calls createUploadToken() from @auraimage/sdk/server, which HMAC-signs the token with the project's Secret Key.

  2. Client uploads directly to the CDN. The file and token are sent via HTTP POST to the upload endpoint. The origin worker verifies the token's HMAC against all active Secret Keys for the project (up to 10, tried in turn, O(1) lookup via SHA-256 index).

  3. Image pipeline runs at origin. On verification success, the worker runs the uploaded file through the image pipeline:

    • Validate format and dimensions
    • Generate derivatives (responsive breakpoints, WebP/AVIF variants)
    • Compute BlurHash for placeholder rendering
    • Extract metadata (EXIF, color profile)
  4. Stream to R2 via multipart. Originals and derivatives are written to R2 using multipart streaming. This keeps memory pressure low — large files never fully buffer in the worker.
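
Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the SDK's implementation: the token layout (base64url payload, a dot, then a hex HMAC-SHA256), the claim fields, and the names signToken and verifyToken are all assumptions made for the example. The SHA-256 key index is not modeled; the sketch simply tries each active key in turn.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claims carried by an upload token.
interface UploadClaims {
  projectId: string;
  expiresAt: number; // Unix epoch milliseconds
}

function hmacHex(payload: string, secretKey: string): string {
  return createHmac("sha256", secretKey).update(payload).digest("hex");
}

// Backend side: sign the claims with one of the project's Secret Keys.
// Assumed layout: base64url(JSON claims) + "." + hex HMAC-SHA256.
function signToken(claims: UploadClaims, secretKey: string): string {
  const payload = Buffer.from(JSON.stringify(claims)).toString("base64url");
  return `${payload}.${hmacHex(payload, secretKey)}`;
}

// Worker side: accept the token if any active key reproduces the signature
// and the token has not expired. Returns null on any failure.
function verifyToken(
  token: string,
  activeKeys: string[],
  now: number,
): UploadClaims | null {
  const dot = token.indexOf(".");
  if (dot < 0) return null;
  const payload = token.slice(0, dot);
  const given = Buffer.from(token.slice(dot + 1), "hex");

  for (const key of activeKeys) {
    const expected = Buffer.from(hmacHex(payload, key), "hex");
    // timingSafeEqual throws on length mismatch, so check lengths first.
    if (given.length === expected.length && timingSafeEqual(given, expected)) {
      const claims = JSON.parse(
        Buffer.from(payload, "base64url").toString("utf8"),
      ) as UploadClaims;
      return claims.expiresAt > now ? claims : null;
    }
  }
  return null;
}
```

Checking against every active key is what lets a project rotate Secret Keys without invalidating tokens signed moments earlier with the previous key.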

Serve flow

  1. End user requests an image via a signed serve URL:

    https://imagedelivery.net/<project>/hero.jpg?w=800&h=600&fit=cover
  2. CDN checks cache. If the derivative exists at the edge, serve it. Median TTFB: 12ms from cache.

  3. Cache miss → origin. The origin worker fetches the original from R2, applies the requested transforms, caches the result, and serves it. Cache-miss TTFB: ~80ms.
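
One detail the serve flow depends on is that equivalent requests map to the same cached derivative. The sketch below shows one way to derive a normalized cache key from the w, h, and fit query parameters in the example URL. The function names, the key format, and the "contain" default for fit are assumptions for illustration, not AuraImage's actual scheme.

```typescript
// Hypothetical transform parameters parsed from a serve URL.
interface TransformParams {
  width?: number;
  height?: number;
  fit: string;
}

// Accept only positive integers; anything else is treated as "not set".
function parseDimension(raw: string | null): number | undefined {
  if (raw === null) return undefined;
  const n = Number.parseInt(raw, 10);
  return Number.isFinite(n) && n > 0 ? n : undefined;
}

function parseTransforms(rawUrl: string): { path: string; params: TransformParams } {
  const url = new URL(rawUrl);
  return {
    path: url.pathname,
    params: {
      width: parseDimension(url.searchParams.get("w")),
      height: parseDimension(url.searchParams.get("h")),
      fit: url.searchParams.get("fit") ?? "contain", // assumed default
    },
  };
}

// Build the key in a fixed parameter order so that ?w=800&h=600 and
// ?h=600&w=800 resolve to the same cached derivative.
function derivativeCacheKey(rawUrl: string): string {
  const { path, params } = parseTransforms(rawUrl);
  return `${path}|w=${params.width ?? "auto"}|h=${params.height ?? "auto"}|fit=${params.fit}`;
}
```

With this normalization, reordering query parameters or appending unrelated ones does not fragment the cache into duplicate entries for the same image.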

Bounded queue and retry policy

Image processing is CPU-intensive and can spike under load. The origin uses a bounded queue with exponential backoff retry:

Queue depth: 100 concurrent transforms
Retry policy: 3 attempts, 200ms → 400ms → 800ms backoff
Overflow: HTTP 503, client should retry

This prevents a thundering herd of cache misses from saturating the worker. The queue is in-memory per worker instance — no external message broker needed.
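
A minimal sketch of this policy, under stated assumptions: the class and function names are hypothetical, retries are assumed to apply to a failed transform (not to the overflow case), the three listed delays are read as three retries after the first try, and a slot is assumed to stay held while a transform backs off.

```typescript
const MAX_CONCURRENT = 100; // queue depth from the policy above
const BASE_DELAY_MS = 200;
const MAX_RETRIES = 3;

// Delay before retry number `retry` (0-based): 200, 400, 800 ms.
function backoffDelayMs(retry: number): number {
  return BASE_DELAY_MS * 2 ** retry;
}

// In-memory admission gate, one per worker instance.
class TransformGate {
  private active = 0;
  private readonly limit: number;

  constructor(limit: number = MAX_CONCURRENT) {
    this.limit = limit;
  }

  tryAcquire(): boolean {
    if (this.active >= this.limit) return false;
    this.active += 1;
    return true;
  }

  release(): void {
    if (this.active > 0) this.active -= 1;
  }
}

type TransformResult<T> = { status: 200; body: T } | { status: 503 };

async function handleTransform<T>(
  gate: TransformGate,
  transform: () => Promise<T>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<TransformResult<T>> {
  // Overflow: reject immediately so the client can retry later.
  if (!gate.tryAcquire()) return { status: 503 };
  try {
    for (let retry = 0; ; retry++) {
      try {
        return { status: 200, body: await transform() };
      } catch (err) {
        if (retry >= MAX_RETRIES) throw err; // retries exhausted
        await sleep(backoffDelayMs(retry));
      }
    }
  } finally {
    gate.release();
  }
}
```

Rejecting at the gate rather than queueing without bound keeps the worst-case memory and CPU per instance predictable, at the cost of pushing some retries back to the client.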

Why R2 over S3

  • Zero egress fees. Cloudflare does not charge for bandwidth between Workers and R2. S3 egress would add $0.09/GB on every cache miss.
  • Multipart streaming. R2's S3-compatible API supports multipart uploads, letting the worker stream chunks as they arrive.
  • Global replication. R2 replicates objects across Cloudflare's network, keeping origin fetches fast regardless of which POP the request hits.
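
To make the multipart streaming point concrete, here is a rough sketch of how incoming chunks can be regrouped into fixed-size parts while holding at most one part in memory. The PartSink interface and PartBuffer class are hypothetical stand-ins, not the R2 API; a real upload call would be asynchronous, and the 5 MiB part size reflects the usual minimum for S3-style multipart uploads (all parts except the last).

```typescript
const PART_SIZE = 5 * 1024 * 1024; // assumed part size: 5 MiB

// Hypothetical destination for completed parts (synchronous for brevity).
interface PartSink {
  uploadPart(partNumber: number, data: Uint8Array): void;
}

class PartBuffer {
  private readonly sink: PartSink;
  private readonly partSize: number;
  private readonly buffer: Uint8Array;
  private filled = 0;
  private nextPart = 1; // multipart part numbers start at 1

  constructor(sink: PartSink, partSize: number = PART_SIZE) {
    this.sink = sink;
    this.partSize = partSize;
    this.buffer = new Uint8Array(partSize);
  }

  // Copy an incoming chunk into the buffer, emitting a part each time
  // the buffer fills. Memory use stays bounded by one part.
  write(chunk: Uint8Array): void {
    let offset = 0;
    while (offset < chunk.length) {
      const n = Math.min(chunk.length - offset, this.partSize - this.filled);
      this.buffer.set(chunk.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.partSize) this.flush();
    }
  }

  // Emit whatever remains as the final (possibly shorter) part.
  finish(): void {
    if (this.filled > 0) this.flush();
  }

  private flush(): void {
    // slice() copies, so the sink never sees later overwrites of the buffer.
    this.sink.uploadPart(this.nextPart, this.buffer.slice(0, this.filled));
    this.nextPart += 1;
    this.filled = 0;
  }
}
```

Because the buffer is reused for every part, a multi-gigabyte upload needs no more worker memory than a small one.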

What is next

  • AVIF encoding at origin — currently AVIF derivatives are generated on first request; moving encoding to upload time removes the cache-miss penalty for AVIF.
  • Per-project custom domains — serve images from cdn.myapp.com instead of imagedelivery.net.
  • Image analytics — bandwidth and request counts broken down by image, not just by project.

Set up your project and trace the flow yourself in DevTools — every image on this page is served through AuraImage.