Your REST API Has a Cache Built In. You’ve Never Used It.

May 13, 2026

Every HTTP response you send is a missed caching opportunity. Here’s how to stop throwing bandwidth away.

Most backend developers reach for Redis the moment someone mentions caching. Fair enough. Redis is fast, predictable and well-documented. But there is a caching layer that costs nothing, requires no infrastructure, works with every HTTP client on earth and has been part of the HTTP spec since 1996. Almost nobody uses it correctly.

HTTP caching is not a mystery. It is a set of response headers that tell browsers, CDNs and reverse proxies exactly what to do with a response. When you do not send them, each client makes a fresh request to your server for data that has not changed. When you do send them, a significant chunk of those requests never reach your server at all.

The reason most APIs ignore this is laziness dressed up as framework defaults. JSON frameworks default to Cache-Control: no-cache or nothing at all. Developers accept the default and move on. The result is APIs that re-compute and re-deliver identical responses thousands of times per hour.

What HTTP Caching Actually Is

There are three places where HTTP caching happens.

Browser cache: The client stores the response locally. Future requests for the same URL check the cache first. No network round-trip at all if the response is still fresh.

Shared cache (proxy/CDN): A CDN or reverse proxy (Nginx, Varnish, Cloudflare) sits between clients and your origin. When a response is cacheable and marked public, one request to your server can serve thousands of clients through the CDN without another origin hit.

Intermediate proxies: Less common in modern infrastructure but still relevant in enterprise environments. Same principle as CDN caching.

The rules governing all three are defined in RFC 9111 (which superseded RFC 7234 in June 2022). The mechanism is straightforward: your server sends response headers, and caches use those headers to decide whether to store the response and for how long.

Most APIs send none of these headers. Or they send Cache-Control: no-cache on everything because a developer once heard that was "safe." Either way, the caching layer that HTTP gives you for free is completely bypassed.

The Headers That Actually Matter

Cache-Control

This is the one. If you learn nothing else from this article, learn Cache-Control.

The public directive tells shared caches (CDNs, proxies) that the response can be stored and served to any client. The private directive tells them it is intended for a single user and should only be cached by the browser, not a shared cache.

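max-age=N specifies that the response is fresh for N seconds, counted from when the origin generated it. Time already spent in upstream caches, reported in the Age header, counts against N. After that the response is stale, and the cache is expected to revalidate with the origin before serving it again unless another directive explicitly allows serving it stale.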

s-maxage=N is the same thing but applies only to shared caches (CDNs). This lets you set a different freshness window for CDN caches versus browser caches. A common pattern:

Cache-Control: public, max-age=60, s-maxage=300

This tells the browser to cache for 60 seconds and tells the CDN to cache for 5 minutes. The CDN absorbs most traffic; the browser stays reasonably current.
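To make the split between max-age and s-maxage concrete, here is a minimal Python sketch of how a cache might pick its freshness lifetime from a Cache-Control value. The function names parse_cache_control and freshness_lifetime are illustrative and not part of any library, and the parser is deliberately simplified: it does not handle commas inside quoted values.

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value or None}.

    Simplified sketch: commas inside quoted strings are not handled.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        name, sep, value = part.strip().partition("=")
        if name:
            directives[name.lower()] = value.strip('"') if sep else None
    return directives


def freshness_lifetime(header: str, shared: bool) -> Optional[int]:
    """Return the freshness lifetime in seconds, or None if none is declared.

    A shared cache (CDN, proxy) prefers s-maxage when present;
    a private cache (browser) only looks at max-age.
    """
    directives = parse_cache_control(header)
    s_maxage = directives.get("s-maxage")
    if shared and s_maxage is not None:
        return int(s_maxage)
    max_age = directives.get("max-age")
    return int(max_age) if max_age is not None else None


header = "public, max-age=60, s-maxage=300"
print(freshness_lifetime(header, shared=False))  # browser: 60
print(freshness_lifetime(header, shared=True))   # CDN: 300
```

Real caches apply more rules than this (Expires, heuristics, Age), so treat it as a way to read the header rather than a cache implementation.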

no-store is the nuclear option. It means: do not cache this response anywhere, ever. Use it for genuinely sensitive data.

no-cache is widely misunderstood. It does not mean "do not cache." It means "cache this but revalidate with the origin before every use." A cached response with no-cache can still be stored and served, but only after the origin confirms it is still valid. This is more useful than no-store for authenticated resources that you still want ETags to work on.
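The difference between no-store, no-cache and private is easier to see as a decision function. The following is a rough Python sketch, assuming a hypothetical helper named storage_decision; it covers only the directives discussed here and ignores everything else a real cache would consider.

```python
def storage_decision(cache_control: str, shared_cache: bool) -> str:
    """Describe what a cache may do with a response, based on Cache-Control.

    Only no-store, private and no-cache are considered in this sketch.
    """
    names = {
        part.strip().split("=")[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    if "no-store" in names:
        return "do not store"
    if shared_cache and "private" in names:
        # private responses may live in the browser, never in a CDN or proxy
        return "do not store"
    if "no-cache" in names:
        # stored, but every reuse needs a successful revalidation first
        return "store, revalidate before every use"
    return "store, serve while fresh"


print(storage_decision("no-store", shared_cache=False))
print(storage_decision("no-cache", shared_cache=True))
print(storage_decision("private, max-age=60", shared_cache=True))
print(storage_decision("private, max-age=60", shared_cache=False))
```

Note that no-cache still ends in "store": the response is kept, and an ETag lets the revalidation come back as a cheap 304.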

stale-while-revalidate

This directive is the one most developers have never encountered, and it is probably the most useful for JSON APIs serving data that changes slowly.

Cache-Control: public, max-age=60, stale-while-revalidate=300

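This tells the cache: serve this response for up to 60 seconds without checking the origin. If a request arrives between 60 and 360 seconds after the response was generated, serve the stale response immediately while revalidating in the background. After 360 seconds, the stale copy can no longer be served as-is, and the cache has to revalidate or refetch from the origin before responding. The directive is defined in RFC 5861 rather than RFC 9111, so check that your CDN and target browsers honor it.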

The result: zero added latency on cache hits, even when the cache is technically stale. Revalidation happens asynchronously. Users never wait for a revalidation round-trip. For endpoints where data changes slowly (product listings, config payloads, public dashboards), this is a meaningful win with no application-level complexity.
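The three time windows can be sketched as a small Python function. The name cache_action and the returned strings are illustrative only; the boundaries follow the 60 and 300 second values from the header above, and exact behavior at the boundary seconds is left aside.

```python
def cache_action(age: int, max_age: int, stale_while_revalidate: int = 0) -> str:
    """Decide how a cache answers a request for a stored response.

    age is the number of seconds since the origin generated the response.
    """
    if age < max_age:
        return "serve from cache"
    if age < max_age + stale_while_revalidate:
        # The client gets the stale copy right away; the refresh happens after.
        return "serve stale, revalidate in background"
    return "revalidate before serving"


# Cache-Control: public, max-age=60, stale-while-revalidate=300
for age in (30, 120, 400):
    print(age, cache_action(age, max_age=60, stale_while_revalidate=300))
```

Only the last branch makes a client wait on the origin, which is why the directive suits endpoints that are requested often and change slowly.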

ETag and Conditional Requests

An ETag is a version token for a resource. You generate it and send it with the response. Clients include it in subsequent requests via the If-None-Match header.

# First response
HTTP/1.1 200 OK
ETag: "d8e8fca2dc0f896fd7cb4cb0031ba249"
Cache-Control: private, max-age=0, must-revalidate

# Subsequent request from client
GET /api/products/42 HTTP/1.1
If-None-Match: "d8e8fca2dc0f896fd7cb4cb0031ba249"

# Response when nothing changed
HTTP/1.1 304 Not Modified
ETag: "d8e8fca2dc0f896fd7cb4cb0031ba249"

A 304 has no body. No JSON payload and no serialization, and if you compute the ETag from cheap model metadata (a version column or updated_at), the only database work is reading that value. If your endpoint returns a large document that changes infrequently, ETags let you skip transmitting that payload entirely on cache hits while still guaranteeing clients always have current data.
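On the server, the comparison behind a 304 deserves slightly more care than a plain string equality check. If-None-Match may carry several comma-separated ETags, the wildcard *, or a weak validator prefixed with W/ (an intermediary that recompresses a response may, for instance, downgrade a strong ETag to a weak one). A minimal Python sketch, assuming a hypothetical helper named etag_matches and ETags that contain no commas:

```python
from typing import Optional


def _opaque(tag: str) -> str:
    """Drop the weak-validator prefix so W/"abc" compares equal to "abc"."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match: Optional[str], current_etag: str) -> bool:
    """Return True when the request's If-None-Match covers the current ETag.

    Uses weak comparison, which is what If-None-Match calls for.
    Simplified: splits on commas, so ETags containing commas are not supported.
    """
    if not if_none_match:
        return False
    if if_none_match.strip() == "*":
        return True
    current = _opaque(current_etag)
    return any(_opaque(candidate) == current for candidate in if_none_match.split(","))


etag = '"d8e8fca2dc0f896fd7cb4cb0031ba249"'
print(etag_matches(etag, etag))                    # exact match
print(etag_matches("W/" + etag, etag))             # weak form of the same tag
print(etag_matches('"older", ' + etag, etag))      # list containing the tag
print(etag_matches('"something-else"', etag))      # no match
```

When the function returns True, the handler can reply with 304 and the same ETag instead of building the body.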

The Vary Header (The One Everyone Forgets)

The Vary header tells caches which request headers to include in the cache key. If you serve different responses based on Accept-Language or Accept-Encoding, you must tell the cache.

Vary: Accept-Encoding, Accept-Language

Forgetting Vary: Authorization on authenticated endpoints that also have public caching configured is how you serve user A's data to user B through a shared cache. This is not a theoretical concern. It has happened in production, repeatedly, at companies that should have known better.

If you are not sure whether your endpoint’s response varies by authorization state, add Vary: Authorization or, better yet, mark it private and avoid the question entirely.
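The leak described above comes down to how the cache key is built. Here is a small Python sketch of the idea, with a hypothetical cache_key function, a made-up /api/orders URL and made-up tokens; real caches build their keys in provider-specific ways.

```python
from typing import Dict, Tuple


def cache_key(method: str, url: str, request_headers: Dict[str, str], vary: str) -> Tuple:
    """Build a cache key from method, URL and the request headers named in Vary."""
    headers = {name.lower(): value for name, value in request_headers.items()}
    vary_names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    varying = tuple((name, headers.get(name, "")) for name in vary_names)
    return (method.upper(), url, varying)


alice = {"Authorization": "Bearer alice-token"}
bob = {"Authorization": "Bearer bob-token"}
url = "/api/orders"  # hypothetical endpoint

# Without Vary, both users map to the same cache entry.
print(cache_key("GET", url, alice, "") == cache_key("GET", url, bob, ""))

# With Vary: Authorization, each token gets its own entry.
print(cache_key("GET", url, alice, "Authorization") == cache_key("GET", url, bob, "Authorization"))
```

The first comparison prints True, which is exactly the situation where user B can receive user A's stored response; the second prints False.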

Why Your Framework Sends Nothing Useful

The typical framework default is conservative. When you return a response from a Laravel controller or a FastAPI endpoint without setting cache headers, the framework either sets Cache-Control: no-cache or says nothing at all. Without explicit freshness headers, a cache can only fall back on heuristics, typically based on Last-Modified, and most JSON responses give it nothing to base one on, so in practice the response is not reused.

There is a reason for this conservatism. Incorrect caching is significantly worse than no caching. Cache a personalized response as public and you have potentially served sensitive data to the wrong person. Cache a response that changes every minute with a 10-minute TTL and users see stale data and file support tickets. The safe default is "don't cache," and frameworks take the safe path.

The problem is that developers accept the safe default for everything, including endpoints where caching is obviously correct. A product catalog does not change per user. A list of countries does not change at all. A public dashboard that refreshes every few minutes should not hammer your database on every page load.

Implementing It in Laravel

Laravel includes a cache.headers middleware that the majority of Laravel developers have never opened.

Route::middleware(
    'cache.headers:public;max_age=60;s_maxage=300;stale_while_revalidate=600;etag'
)->group(function () {
    Route::get('/products', [ProductController::class, 'index']);
    Route::get('/products/{id}', [ProductController::class, 'show']);
});

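When you add etag to the directive list, Laravel computes an MD5 hash of the full response body and sets it as the ETag automatically. The middleware does this after the controller has produced the response, so a request with a matching If-None-Match header still runs the controller. What it saves is the transfer: the client receives a bodiless 304 instead of the full payload.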

For ETags based on model state rather than full response content, generate one manually. This is cheaper because you avoid serializing the entire response just to hash it:

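// At the top of the controller file, import the Symfony base class so that
// both the plain 304 response and the JsonResponse satisfy the return type:
// use Symfony\Component\HttpFoundation\Response;

public function show(Product $product): Response
{
    // Derive the ETag from the model's last-modified time instead of hashing the body.
    $etag = '"' . md5((string) $product->updated_at->timestamp) . '"';

    // Exact-match check only; weak (W/) validators and comma-separated lists are not handled.
    if (request()->header('If-None-Match') === $etag) {
        return response('', 304)->withHeaders(['ETag' => $etag]);
    }

    return response()->json($product)
        ->withHeaders([
            'Cache-Control' => 'private, max-age=0, must-revalidate',
            'ETag' => $etag,
        ]);
}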

If updated_at has not changed, you send a 304 with a handful of bytes instead of a full JSON payload. Route model binding still loads the product row, but you skip any further relation loading, the serialization overhead and the JSON encoding.

Implementing It in FastAPI

FastAPI sets no cache headers by default. Add them explicitly on the Response object.

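import hashlib
import json

from fastapi import FastAPI, Request, Response

app = FastAPI()


@app.get("/products/{product_id}")
async def get_product(product_id: int, request: Request, response: Response):
    # fetch_product is the application's own data-access coroutine (not shown here);
    # it is assumed to return a JSON-serializable dict.
    product = await fetch_product(product_id)
    # Deterministic serialization for a stable hash
    payload = json.dumps(product, sort_keys=True, default=str)
    etag = f'"{hashlib.md5(payload.encode()).hexdigest()}"'
    cache_control = "public, max-age=60, stale-while-revalidate=300"
    # Exact-match check only; weak (W/) validators and ETag lists are not handled.
    if request.headers.get("if-none-match") == etag:
        # A 304 should carry the same caching headers the 200 would have sent.
        return Response(status_code=304, headers={"ETag": etag, "Cache-Control": cache_control})
    response.headers["Cache-Control"] = cache_control
    response.headers["ETag"] = etag
    return product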

For high-throughput endpoints, consider computing the ETag from a version field or a hash of a database timestamp rather than serializing the full response. Same outcome, lower CPU cost.
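A version-based ETag can be as small as the following Python sketch. The helper name etag_from_version and its parameters are assumptions for illustration; any value that changes whenever the resource changes (a version counter, an updated_at timestamp) would work as input.

```python
import hashlib
from datetime import datetime, timezone


def etag_from_version(resource_id: int, updated_at: datetime) -> str:
    """Build a quoted ETag from an identifier and a last-modified timestamp.

    No response serialization is needed; the token changes whenever updated_at does.
    """
    raw = f"{resource_id}:{updated_at.isoformat()}"
    digest = hashlib.sha256(raw.encode()).hexdigest()[:32]
    return f'"{digest}"'


first = etag_from_version(42, datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc))
same = etag_from_version(42, datetime(2026, 5, 1, 12, 0, tzinfo=timezone.utc))
later = etag_from_version(42, datetime(2026, 5, 1, 12, 5, tzinfo=timezone.utc))

print(first == same)   # unchanged resource, same ETag
print(first == later)  # modified resource, different ETag
```

The trade-off is that the ETag is only as reliable as the timestamp: if some write path forgets to bump updated_at, clients keep receiving 304s for data that has changed.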

If you prefer not to wire this manually on every endpoint, the fastapi-cache2 package handles ETag generation and If-None-Match checks with a decorator. The manual approach gives you more control over the ETag source, which matters when response content includes fields that change without the underlying data changing (like formatted timestamps or computed display values).

When Not to Cache

This matters as much as knowing when to use it.

Do not use public caching for:

  • Endpoints that return personalized data tied to a specific user session
  • Anything behind authentication where the response varies per token or session
  • Responses that depend on request body content (most POST endpoints)
  • Endpoints where serving stale data for even 30 seconds causes incorrect behavior

Do not set Cache-Control: public and then rely on Authorization headers to differentiate users without Vary: Authorization. The CDN will not check the Authorization header in its cache key unless you tell it to.

Keep public and authenticated endpoint paths clearly separated if you can. It is easier to reason about caching rules when you are not asking the same URL to serve two different audiences.
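One way to keep these rules from being decided ad hoc per endpoint is to centralize them in a single helper. The Python sketch below is only a starting point under stated assumptions: the function name choose_cache_control, its three parameters and the exact header values are illustrative choices, not a standard policy.

```python
def choose_cache_control(sensitive: bool, personalized: bool, tolerable_staleness: int) -> str:
    """Pick a Cache-Control value from a few coarse endpoint properties.

    tolerable_staleness is how many seconds of stale data the endpoint can accept.
    """
    if sensitive:
        return "no-store"
    if personalized:
        # Browser may keep it, shared caches may not, and every reuse is revalidated.
        return "private, max-age=0, must-revalidate"
    if tolerable_staleness <= 0:
        # Shared caches may store it but must check with the origin each time.
        return "public, no-cache"
    return f"public, max-age={tolerable_staleness}"


print(choose_cache_control(sensitive=True, personalized=True, tolerable_staleness=0))
print(choose_cache_control(sensitive=False, personalized=True, tolerable_staleness=0))
print(choose_cache_control(sensitive=False, personalized=False, tolerable_staleness=0))
print(choose_cache_control(sensitive=False, personalized=False, tolerable_staleness=60))
```

The ordering matters: the most restrictive condition is checked first, so an endpoint that is both sensitive and public-looking still ends up with no-store.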

The Actual Impact

HTTP caching reduces origin load. Take a product listing endpoint that serves 10,000 requests per hour, roughly 167 per minute, with a 60-second max-age and a CDN in front. The first request in each 60-second window populates the CDN edge cache, and the remaining 166 or so come from cache, not your origin. Over the hour your database sees about 60 queries per edge location, not 10,000.
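A back-of-the-envelope version of that calculation, as a Python sketch. It assumes evenly spread traffic and that each edge location refetches at most once per freshness window; the function name origin_hits_per_hour and the edge_locations parameter are illustrative.

```python
import math


def origin_hits_per_hour(requests_per_hour: int, ttl_seconds: int, edge_locations: int = 1) -> int:
    """Upper bound on origin requests per hour behind a cache with the given TTL."""
    windows_per_hour = math.ceil(3600 / ttl_seconds)
    return min(requests_per_hour, windows_per_hour * edge_locations)


requests = 10_000
hits = origin_hits_per_hour(requests, ttl_seconds=60)
hit_ratio = 1 - hits / requests

print(hits)                # origin requests per hour for one edge location
print(f"{hit_ratio:.1%}")  # share of requests answered from cache
```

With one edge location this gives 60 origin requests per hour and a cache hit ratio of 99.4%. More edge locations raise the origin count proportionally, and a longer TTL lowers it.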

At the browser level, a user navigating between pages and returning to one they already loaded gets an instant response from their local cache. No round-trip, no database query, no server render. Just bytes from disk.

The side effects compound: lower database load means more headroom for writes and complex queries. Lower compute usage means your instances stay responsive under traffic spikes. Lower egress bandwidth means smaller cloud bills.

Redis caching is still useful. Server-side caching of expensive computations, session storage, shared state across processes. These are valid use cases that HTTP caching cannot replace. But adding Redis to cache API responses that you never told HTTP to cache is solving a problem you created by ignoring the spec.

Set the HTTP headers first. Measure what changes. Then decide what actually needs Redis.

Stop Ignoring the Protocol

RFC 9111 is not a long document. The caching-relevant sections take under an hour to read. The core behaviors it describes are implemented broadly consistently across browsers, CDNs and reverse proxies, although individual CDNs differ in the details, so check your provider's documentation. The HTTP caching model has barely changed in decades because it works.

Your API is already producing responses. Most of them are probably cacheable. You are just not saying so.

Ten lines of middleware configuration in Laravel or a few response headers in FastAPI is the entire implementation cost. The payoff is a layer of caching that scales automatically with your CDN, costs nothing in infrastructure and requires zero ongoing maintenance.

The engineers who designed HTTP got caching right in the 1990s. You do not need to reinvent it. You just need to stop pretending it does not exist.

Originally published on Medium.

Your REST API Has a Cache Built In. You’ve Never Used It. — Hafiq Iqmal — Hafiq Iqmal