Frontend Caching: 3 Layers Explained Simply

Have you ever noticed how some pages load instantly on your second visit? Or wondered why your API call doesn't always hit the server? Or been amazed when an app works perfectly offline?

That's caching at work — and it's happening at multiple layers simultaneously.

As a frontend developer, understanding these caching layers is crucial for building fast, resilient applications. Let's demystify the three types of caching you'll encounter and when to use each one.

Last week, our team spent hours debugging a frustrating issue: users weren't seeing the latest data even though our API was returning updated values. After digging through network logs, we discovered the culprit—a Cache-Control: max-age=360000 header was causing responses to be served directly from the browser's disk cache. The API wasn't even being called!

This incident revealed a knowledge gap: several team members weren't familiar with how the different caching layers work in the frontend. They understood React Query and state management, but browser HTTP caching and Service Workers felt like mysterious black boxes.

Layer 1: Application-Level Caching (In-Memory)

What it is: Data stored temporarily in your application's memory while it's running.

Where it lives: JavaScript runtime (browser memory, React state, library cache)

When to Use It

Application-level caching is perfect when:

You're fetching the same data multiple times in a short period
Users navigate between pages that share common data
The data is relatively stable (doesn't change every second)
You want to eliminate loading spinners for repeat requests

Real-World Example

Imagine a dashboard with multiple tabs showing user analytics. Without caching, switching between tabs would trigger a new API call each time — annoying loading spinners everywhere!

With application-level caching using React Query (TanStack Query):

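import { useQuery } from '@tanstack/react-query';

const { data, isLoading } = useQuery({
  queryKey: ['analytics', userId],
  queryFn: () => fetchAnalytics(userId),
  staleTime: 5 * 60 * 1000, // Treat data as fresh for 5 minutes (no refetch while fresh)
  cacheTime: 10 * 60 * 1000, // Keep unused data in memory for 10 minutes (renamed to gcTime in TanStack Query v5)
});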

What happens:

First visit: API call is made, data is cached
Switch tabs: No API call — instant data from cache
Come back after 3 minutes: Still using cached data (it's "fresh")
Come back after 6 minutes: Shows cached data immediately, refetches in background
Refresh the page: Cache is cleared, starts fresh
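The timeline above can be modeled with a small in-memory cache. This is only a sketch of the fresh/stale idea, not how TanStack Query is implemented: `createQueryCache` and its injectable `now` clock are hypothetical names, and entries expire a fixed time after they are written, whereas the real library starts that timer when a query becomes unused.

```javascript
// Minimal model of staleTime / cacheTime semantics (hypothetical helper, not a library API).
function createQueryCache({ staleTime, cacheTime, now = () => Date.now() }) {
  const entries = new Map(); // key -> { data, updatedAt }

  return {
    // Returns { status, data } where status is 'miss', 'fresh', or 'stale'.
    read(key) {
      const entry = entries.get(key);
      if (!entry) return { status: 'miss', data: undefined };

      const age = now() - entry.updatedAt;
      if (age >= cacheTime) {
        entries.delete(key); // too old to keep at all
        return { status: 'miss', data: undefined };
      }
      // Stale data is still returned; the caller would refetch in the background.
      return { status: age < staleTime ? 'fresh' : 'stale', data: entry.data };
    },
    write(key, data) {
      entries.set(key, { data, updatedAt: now() });
    },
  };
}

// Usage with a fake clock (milliseconds) so the timeline is easy to follow.
let clock = 0;
const analyticsCache = createQueryCache({
  staleTime: 5 * 60 * 1000,
  cacheTime: 10 * 60 * 1000,
  now: () => clock,
});
analyticsCache.write('analytics:42', { visits: 10 });

clock = 3 * 60 * 1000;
console.log(analyticsCache.read('analytics:42').status); // fresh
clock = 6 * 60 * 1000;
console.log(analyticsCache.read('analytics:42').status); // stale
clock = 11 * 60 * 1000;
console.log(analyticsCache.read('analytics:42').status); // miss
```

A page refresh corresponds to throwing the whole `entries` map away, which is why this layer never survives a reload.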

The Tradeoff

Pros:

Blazing fast — no network latency
Reduces server load
Smoother user experience

Cons:

Data only lives during the session
Gone on page refresh
Not shared across browser tabs
Can show stale data if not configured properly

Layer 2: Service Worker Caching (Browser-Level)

What it is: A programmable proxy that sits between your app and the network, intercepting requests and deciding how to respond.

Where it lives: Browser (separate from your main app thread, survives page refreshes)

When to Use It

Service Workers shine when you need:

Offline functionality: the app works without internet
Faster repeat visits: cache entire pages, images, and API responses
Custom caching strategies: fine-grained control over what, when, and how to cache
Background sync: queue requests when offline and send them when back online (the Background Sync API is not available in every browser, so check support first)

Real-World Example

A recipe app that lets users browse saved recipes even without internet:

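// In service-worker.js
self.addEventListener('fetch', (event) => {
  const { request } = event;

  // The Cache API can only store GET requests; let everything else pass through
  if (request.method !== 'GET') return;

  // Cache-first strategy for images
  if (request.url.includes('/images/')) {
    event.respondWith(
      caches.match(request).then((cached) => {
        if (cached) return cached;
        return fetch(request).then((response) => {
          // Only store successful responses; clone because a body can be read once
          if (response.ok) {
            const clone = response.clone();
            event.waitUntil(
              caches.open('images-v1').then((cache) => cache.put(request, clone))
            );
          }
          return response;
        });
      })
    );
    return; // respondWith() may only be called once per fetch event
  }

  // Network-first for API calls, fallback to cache
  if (request.url.includes('/api/')) {
    event.respondWith(
      fetch(request)
        .then((response) => {
          if (response.ok) {
            const clone = response.clone();
            event.waitUntil(
              caches.open('api-v1').then((cache) => cache.put(request, clone))
            );
          }
          return response;
        })
        // Offline fallback: the cached copy, or an explicit network error if none exists
        .catch(() => caches.match(request).then((cached) => cached || Response.error()))
    );
  }
});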

What happens:

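User visits a recipe page online → Service Worker caches the API responses and images it fetches
User goes offline → Service Worker serves the cached responses (the page shell itself must also be cached, for example during the install event, for a full reload to work offline)
User browses other recipes they have already viewed → All work offline
User comes back online → The next API request goes to the network again and refreshes the cache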

Common Caching Strategies

Cache First: Check cache, fallback to network (good for static assets)
Network First: Try network, fallback to cache if offline (good for dynamic content)
Stale-While-Revalidate: Serve cache immediately, update in background (best UX)
Cache Only: Never hit network (for truly static content)
Network Only: Never cache (for sensitive data)
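Stale-While-Revalidate is the least obvious strategy in the list above, so here is a minimal sketch of its control flow. It is written against a generic store with `get` and `set` (a plain `Map` works) rather than the Service Worker Cache API, so that it runs anywhere; `staleWhileRevalidate`, `store`, and `fetcher` are illustrative names, not part of any library.

```javascript
// Stale-while-revalidate over any key/value store with get/set (sync or async).
// `fetcher(key)` stands in for the network call and must return a Promise.
async function staleWhileRevalidate(key, store, fetcher) {
  const cached = await store.get(key);

  // Always start a refresh so the next caller sees newer data.
  const refresh = Promise.resolve()
    .then(() => fetcher(key))
    .then(async (fresh) => {
      await store.set(key, fresh);
      return fresh;
    });

  if (cached === undefined) {
    // Nothing cached yet: the caller has to wait for the network.
    return { value: await refresh, source: 'network', refresh };
  }

  // Serve the cached value immediately. A failed background refresh
  // must not break a response that was already served from cache.
  return { value: cached, source: 'cache', refresh: refresh.catch(() => undefined) };
}
```

The returned `refresh` promise is only there so callers (or tests) can wait for the background update. In a Service Worker the same promise would typically be passed to `event.waitUntil()` so the worker stays alive until the cache write finishes.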

The Tradeoff

Pros:

Survives page refreshes and browser restarts
Enables offline experiences
Full control over caching logic
Can cache anything (HTML, CSS, JS, API responses, images)

Cons:

More complex to implement and debug
Can serve stale content if not managed properly
Requires careful cache invalidation strategy
Users might not see updates immediately

Layer 3: HTTP Caching (Browser, CDN, Proxy)

What it is: The foundational caching layer built into HTTP itself, controlled by response headers from the server.

Where it lives: Browser cache, CDN servers (Cloudflare, Fastly), reverse proxies (NGINX, Varnish)

When to Use It

HTTP caching is ideal for:

Static assets (JS bundles, CSS files, images, fonts)
Shared resources: cached by CDNs and served to millions of users
Bandwidth savings: reduces data transfer globally
Scalability: requests answered from a cache never reach the origin, so server load drops sharply

How It Works: The Headers

The server controls caching through HTTP headers:

HTTP/1.1 200 OK
Cache-Control: public, max-age=31536000, immutable
ETag: "abc123"
Last-Modified: Mon, 01 Jan 2024 00:00:00 GMT

Key headers explained:

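Cache-Control: max-age=31536000 (the response stays fresh for 1 year; the value is in seconds)
Cache-Control: public (may be stored by shared caches such as CDNs)
Cache-Control: private (only the user's browser may store it, not CDNs or proxies)
Cache-Control: no-store (never store the response; use for sensitive data)
Cache-Control: immutable (the body will not change while the response is fresh, so the browser skips revalidation even on reload; perfect for hashed filenames)
ETag (a validator that identifies this exact version of the resource)
Last-Modified (when the resource was last changed; a fallback validator when no ETag is present)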
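To make the directives above concrete, here is a rough sketch of how a cache might read a Cache-Control value and decide whether a stored response can be reused without contacting the server. `parseCacheControl` and `isFresh` are hypothetical helpers, and the parsing is deliberately simplified (it ignores quoted values containing commas and directives such as `s-maxage`).

```javascript
// Parse "public, max-age=31536000, immutable" into { public: true, 'max-age': '31536000', immutable: true }.
function parseCacheControl(headerValue) {
  const directives = {};
  for (const part of headerValue.split(',')) {
    const [rawName, rawValue] = part.trim().split('=');
    if (!rawName) continue;
    // Directives without a value (public, immutable, no-store) become `true`.
    directives[rawName.toLowerCase()] =
      rawValue === undefined ? true : rawValue.replace(/^"|"$/g, '');
  }
  return directives;
}

// Can a stored response of the given age be reused without revalidation?
function isFresh(directives, ageSeconds) {
  // no-store responses are never stored; no-cache ones must always be revalidated.
  if (directives['no-store'] || directives['no-cache']) return false;
  const maxAge = Number(directives['max-age']);
  return Number.isFinite(maxAge) && ageSeconds < maxAge;
}

const parsed = parseCacheControl('public, max-age=31536000, immutable');
console.log(isFresh(parsed, 3600)); // true
console.log(isFresh(parseCacheControl('max-age=300'), 300)); // false
```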

Real-World Example: Conditional Requests

Let's see how browsers avoid downloading unchanged files:

First request:

GET /app.js HTTP/1.1
Host: example.com

→ 200 OK
Cache-Control: max-age=3600
ETag: "v1.0.0"
Content-Length: 50000
[file content]

Second request (within 1 hour): Browser uses cached copy — no request sent at all!

Third request (after 1 hour):

GET /app.js HTTP/1.1
Host: example.com
If-None-Match: "v1.0.0"

→ 304 Not Modified
[no content — browser reuses cached copy]

The server only sends 304 Not Modified (tiny response), saving 50KB of bandwidth!
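The server side of that exchange can be sketched as a pure function. This is an illustration under simplifying assumptions, not a production handler: `respondToGet` and the `resource` shape are made up for the example, header names are expected in lowercase, and only strong ETags are compared (weak `W/` validators are not handled).

```javascript
// Decide between 200 (full body) and 304 (no body) for a GET request.
// `resource` is a hypothetical object: { etag: '"v1.0.0"', body: '...' }.
function respondToGet(requestHeaders, resource) {
  // If-None-Match may carry several comma-separated ETags, or "*".
  const clientTags = (requestHeaders['if-none-match'] || '')
    .split(',')
    .map((tag) => tag.trim())
    .filter(Boolean);

  const headers = { ETag: resource.etag, 'Cache-Control': 'max-age=3600' };

  if (clientTags.includes('*') || clientTags.includes(resource.etag)) {
    // The browser's copy is still current: send headers only.
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body: resource.body };
}

const appJs = { etag: '"v1.0.0"', body: 'console.log("app");' };
console.log(respondToGet({}, appJs).status); // 200
console.log(respondToGet({ 'if-none-match': '"v1.0.0"' }, appJs).status); // 304
```

Once the file changes and the server's ETag becomes, say, `"v1.0.1"`, the same request falls through to the 200 branch and the browser downloads the new body.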

Modern Best Practices

For immutable assets (with hash in filename):

Cache-Control: public, max-age=31536000, immutable

Example: app.a3f2c1d.js — cache forever, filename changes when content changes

For HTML (entry point):

Cache-Control: no-cache

Always revalidate — ensures users get latest version

For API responses:

Cache-Control: private, max-age=300

Cache for 5 minutes, only in browser (not CDN)
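These three policies could be applied from one place on the server. The sketch below assumes a particular URL layout, which is not something this article prescribes: hashed assets look like `name.<hex hash>.ext`, API routes live under `/api/`, and everything else is treated like HTML. `cacheControlFor` is a hypothetical helper name.

```javascript
// Pick a Cache-Control value from the request path (assumed URL conventions, see above).
function cacheControlFor(pathname) {
  // e.g. /static/app.a3f2c1d.js: a hex content hash sits before the extension
  const hashedAsset = /\.[0-9a-f]{6,}\.(js|css|woff2|png|jpg|svg)$/i;

  if (hashedAsset.test(pathname)) {
    return 'public, max-age=31536000, immutable';
  }
  if (pathname.startsWith('/api/')) {
    return 'private, max-age=300';
  }
  // HTML entry points and anything unrecognized: store, but revalidate on every use.
  return 'no-cache';
}

console.log(cacheControlFor('/static/app.a3f2c1d.js')); // public, max-age=31536000, immutable
console.log(cacheControlFor('/index.html')); // no-cache
console.log(cacheControlFor('/api/user')); // private, max-age=300
```

Defaulting to `no-cache` is a deliberately conservative choice: an unrecognized file costs a revalidation request but can never be served stale.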

The Tradeoff

Pros:

Works everywhere — browsers, CDNs, proxies
Zero code required (just configure headers)
Scales globally (CDNs distribute cached content)
Saves massive amounts of bandwidth

Cons:

Less flexible — can't implement complex logic
Hard to invalidate caches once set
Shared caches (CDN) can serve stale content
Users might not see updates until cache expires

How These Layers Work Together: A Complete Journey

Let's trace what happens when a user visits your news app:

First Visit

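Service Worker: Intercepts the request first, finds nothing cached yet, and forwards it with fetch() (this trace assumes the Service Worker is already active; on the very first page load it may still be installing and not yet controlling the page)
HTTP Cache (Browser): Checks cache, miss (first visit)
Network: Fetches from server
Server responds with: Article data + Cache-Control: max-age=300
Browser: Stores copy in HTTP cache for 5 minutes
Service Worker: Stores copy in its own cache
Application (React Query): Caches in memory with staleTime: 5 minutes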

User Switches Tabs (10 seconds later)

Application Cache: Returns data instantly from memory — no network request!

User Refreshes Page (1 minute later)

Application Cache: Cleared (page refresh)
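Service Worker: Intercepts the request and, with a cache-first or stale-while-revalidate strategy, serves its cached copy instantly
HTTP Cache: Also has a fresh copy, but the Service Worker answered before the request reached it (with a network-first strategy, the Service Worker's fetch() would be answered by the still-fresh HTTP cache instead)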
Application: Re-caches in memory

User Returns (10 minutes later, offline)

Application Cache: Cleared (new session)
Service Worker: Serves cached copy — app works offline!
HTTP Cache: Would have expired anyway

User on Different Device (same city)

CDN (HTTP Cache): Serves from nearby edge server — no origin server hit!
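The journey above boils down to an ordered lookup: each layer is asked in turn, and the first one holding the data answers. The following toy simulation shows that ordering only. It ignores expiry, headers, and real browser behavior, and `createLayeredLookup` and the layer names are invented for the illustration.

```javascript
// Ask each cache layer in order; fall through to the origin on a full miss.
// `layers` is an ordered array of { name, store } where store is a Map.
function createLayeredLookup(layers, fetchFromOrigin) {
  return function lookup(key) {
    for (let i = 0; i < layers.length; i++) {
      if (layers[i].store.has(key)) {
        const value = layers[i].store.get(key);
        // Backfill the layers closer to the app so the next lookup is faster.
        for (let j = 0; j < i; j++) layers[j].store.set(key, value);
        return { servedBy: layers[i].name, value };
      }
    }
    const value = fetchFromOrigin(key);
    for (const layer of layers) layer.store.set(key, value);
    return { servedBy: 'network', value };
  };
}

const memory = new Map();
const serviceWorker = new Map();
const httpCache = new Map();
const lookup = createLayeredLookup(
  [
    { name: 'application memory', store: memory },
    { name: 'service worker', store: serviceWorker },
    { name: 'HTTP cache', store: httpCache },
  ],
  (key) => `article body for ${key}`
);

console.log(lookup('/api/articles/1').servedBy); // network
console.log(lookup('/api/articles/1').servedBy); // application memory
memory.clear(); // simulates a page refresh
console.log(lookup('/api/articles/1').servedBy); // service worker
```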

💡 Choosing the Right Layer

Scenario	Best Choice	Why
Smooth tab switching	Application Cache	Instant, no network
Offline support	Service Worker	Serves cached responses without a network
Static assets (JS/CSS)	HTTP Cache	CDN distribution
Real-time data	No caching	Data changes constantly
User-specific data	Application + HTTP private	Fast + secure
Public images	HTTP Cache (CDN)	Global distribution

Quick Decision Framework

Start with this thought process:

Does the data change frequently?
- Yes (every few seconds) → Minimal or no caching
- No → Continue...
Should it work offline?
- Yes → Service Worker caching
- No → Continue...
Is it user-specific?
- Yes → Application cache + HTTP private cache
- No → HTTP public cache (CDN-friendly)
Is it a static asset?
- Yes → Aggressive HTTP caching with immutable flag
- No → Shorter cache durations with revalidation
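One possible reading of this framework, written as a function. It treats the questions as cumulative (offline support can be combined with an HTTP policy) rather than as a strict one-way flowchart, which is an interpretation on my part. The function name, option names, and returned strings are all hypothetical.

```javascript
// Turn the four questions into a list of recommended caching layers.
function recommendCaching({ changesFrequently, needsOffline, userSpecific, staticAsset }) {
  // Data that changes every few seconds is not worth caching.
  if (changesFrequently) return ['minimal or no caching'];

  const layers = [];
  if (needsOffline) layers.push('service worker caching');

  if (staticAsset) {
    layers.push('HTTP caching: public, long max-age, immutable');
  } else if (userSpecific) {
    layers.push('application cache', 'HTTP caching: private, short max-age');
  } else {
    layers.push('HTTP caching: public, shorter max-age with revalidation');
  }
  return layers;
}

console.log(recommendCaching({ changesFrequently: true }));
console.log(recommendCaching({ needsOffline: true, userSpecific: true }));
```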

Key Takeaways

Application caching = Speed during the session (in-memory, React Query, SWR)
Service Worker caching = Offline resilience + custom strategies
HTTP caching = Global scalability + bandwidth savings

Each layer complements the others. Master all three, and you'll build apps that are:

⚡️ Fast — users see instant responses
🔌 Resilient — work offline when needed
📈 Scalable — handle millions of users efficiently
💰 Cost-effective — reduce server and bandwidth costs

Remember: Caching is not just about speed — it's about creating better user experiences while reducing costs and improving reliability.
