
Verdict: Either works. Memcached has the throughput edge; Redis adds hash-partial invalidation, pub/sub-based cache busting, and tag-based purge. Pick Redis if you anticipate complex invalidation, Memcached if it is pure read-through.

Redis vs Memcached for API Response Caching

API caches sit between the full-page cache and the database cache. The payloads are mid-sized JSON, the invalidation events are frequent, and the per-user variants multiply the working set. The choice of store affects how easily you can purge selectively and how cleanly you can broadcast invalidation across a fleet of API servers.

Typical API payload: 2-50 KB (JSON, gzipped on transit)
Redis cache-bust pattern: fan-out invalidation across fleet
Memcached cache-bust: none (each server polls, no broadcast)
Partial invalidation: HASH (HDEL one field, keep the rest)

The shape of an API cache workload

API caches differ from page caches in three ways that matter for storage choice. First, the payloads are smaller and more numerous: a single user dashboard might call ten distinct endpoints to render, each cached separately under a different key. Second, the keys multiply by user identity: caching /api/notifications means you cache one entry per active user, which can be tens of thousands of keys for a busy product. Third, the invalidation events are frequent and specific: a user comments on a post and seventeen related caches need to drop within a few hundred milliseconds.

These properties favour Redis in subtle ways. The per-user key multiplication plays to Redis's compact encodings for small aggregate values (intset for small integer sets, and listpack, which replaced ziplist in Redis 7.0, for small hashes and sorted sets), provided you group a user's entries into a hash rather than storing each one as a separate top-level string. The complex invalidation pattern lends itself to Redis pub/sub: when data changes, publish a cache-bust message on a channel, and every API server subscribed to that channel drops the affected keys locally. Memcached has no pub/sub, so you either accept slower invalidation (each server polls a version counter) or push out-of-band invalidation events through a separate system.

For pure read-through caching with simple TTL-based expiry and no fan-out invalidation, Memcached is fine and slightly faster. The decision pivots on whether your invalidation story is "set a 60-second TTL and accept the staleness" or "every write must purge the affected reads within hundreds of milliseconds." The first lives well on either store. The second is much easier on Redis.

The cache-aside read-through pattern

The canonical pattern: on a GET request, the API server checks the cache by key, returns the cached response on hit, and on miss queries the underlying data source, serialises the result, stores it in the cache with a TTL, and returns it. Most web framework middleware implements this in 20 lines. The hard parts are the cache key (must include all request inputs that affect the response), the TTL (must match freshness tolerance), and the invalidation (must fire on every relevant write).
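A minimal sketch of the flow described above, not a production middleware. The `InMemoryCache` class is a hypothetical stand-in so the example runs without a server; its `get` and `set(..., ex=...)` methods mirror the shape of the redis-py calls of the same name. The key string and the loader are illustrative assumptions.

```python
import json
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Stand-in for a Redis or Memcached client (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[bytes, float]] = {}

    def get(self, key: str) -> Optional[bytes]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Lazily evict expired entries, as a real store would expire them.
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: bytes, ex: int) -> None:
        self._data[key] = (value, time.monotonic() + ex)


def cached_response(cache: InMemoryCache, key: str, ttl_seconds: int,
                    load: Callable[[], dict]) -> bytes:
    """Return the serialised response for `key`, loading it on a miss."""
    hit = cache.get(key)
    if hit is not None:
        return hit  # cache hit: return the stored bytes as-is
    # Cache miss: query the source, serialise, store with a TTL, return.
    body = json.dumps(load(), sort_keys=True).encode("utf-8")
    cache.set(key, body, ex=ttl_seconds)
    return body


calls = []


def load_user() -> dict:
    calls.append(1)  # count trips to the underlying data source
    return {"id": 42, "name": "Ada"}


cache = InMemoryCache()
first = cached_response(cache, "v1:/api/users/42", 60, load_user)
second = cached_response(cache, "v1:/api/users/42", 60, load_user)
```

The second call is served from the cache, so the loader runs only once. Note that the sketch stores the serialised bytes, which matches the "cache the bytes you send" approach.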

The cache key construction is where most production bugs live. The key must include the route, the query parameters in canonical order, the user ID (for authenticated responses), any feature flag values that affect the output, the API version header, and the Accept-Language header if responses vary by locale. Missing any of these means cache collisions where one user sees another user's data or one feature variant leaks into the other. The fix is a key-builder function that explicitly enumerates everything that affects the response, hashed if it would otherwise produce keys that are too long.
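One possible shape for such a key-builder, as a sketch. The function name, the parameter list, the `api:` prefix and the choice of SHA-256 are all assumptions for illustration; the point is only that every input is enumerated explicitly and put in a canonical order before hashing.

```python
import hashlib
from typing import Mapping, Optional
from urllib.parse import urlencode


def build_cache_key(route: str,
                    query: Mapping[str, str],
                    user_id: Optional[str],
                    flags: Mapping[str, bool],
                    api_version: str,
                    locale: str) -> str:
    """Build a deterministic key from every input that affects the response."""
    parts = [
        f"route={route}",
        # Sort query parameters so ?a=1&b=2 and ?b=2&a=1 share one entry.
        f"query={urlencode(sorted(query.items()))}",
        # Anonymous responses get an explicit marker, never an empty slot.
        f"user={user_id if user_id is not None else 'anon'}",
        "flags=" + ",".join(f"{k}:{int(v)}" for k, v in sorted(flags.items())),
        f"version={api_version}",
        f"locale={locale}",
    ]
    canonical = "|".join(parts)
    # Hash the canonical form so long query strings cannot exceed key limits.
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"api:{api_version}:{digest}"


key_a = build_cache_key("/api/feed", {"page": "2", "limit": "20"},
                        "42", {"new_ui": True}, "v1", "en")
key_b = build_cache_key("/api/feed", {"limit": "20", "page": "2"},
                        "42", {"new_ui": True}, "v1", "en")
key_c = build_cache_key("/api/feed", {"page": "2", "limit": "20"},
                        "43", {"new_ui": True}, "v1", "en")
```

Here `key_a` and `key_b` are identical because parameter order is normalised, while `key_c` differs because the user differs. Hashing makes keys opaque, so logging the canonical string alongside the key helps when debugging.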

The thundering herd problem also bites here. When a popular key expires, many concurrent requests miss the cache and all query the database. The mitigation is to use a single-flight pattern: requests that miss try to acquire a short SET NX PX lock; the winner queries and writes the cache, the losers wait briefly and retry the read. This is the same pattern as the cache-stampede defence covered in the distributed locking guide.
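The single-flight idea can be sketched as follows. `FakeStore` is a hypothetical, thread-safe stand-in whose `set(key, value, px=..., nx=True)` imitates the semantics of SET NX PX; the `lock:` key prefix, the timeouts and the fallback behaviour are assumptions chosen for the example, not recommendations.

```python
import threading
import time
import uuid
from typing import Callable, Dict, Optional, Tuple


class FakeStore:
    """Thread-safe stand-in for a Redis client (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}
        self._mutex = threading.Lock()

    def _live(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def get(self, key: str) -> Optional[str]:
        with self._mutex:
            return self._live(key)

    def set(self, key: str, value: str, px: Optional[int] = None,
            nx: bool = False) -> bool:
        with self._mutex:
            if nx and self._live(key) is not None:
                return False  # NX: refuse if the key already exists
            expires = None if px is None else time.monotonic() + px / 1000
            self._data[key] = (value, expires)
            return True

    def delete(self, key: str) -> None:
        with self._mutex:
            self._data.pop(key, None)


def single_flight_get(store: FakeStore, key: str, load: Callable[[], str],
                      ttl_ms: int = 60_000, lock_ms: int = 2_000,
                      retry_s: float = 0.01, max_wait_s: float = 1.0) -> str:
    lock_key = f"lock:{key}"
    deadline = time.monotonic() + max_wait_s
    while True:
        cached = store.get(key)
        if cached is not None:
            return cached
        token = uuid.uuid4().hex
        if store.set(lock_key, token, px=lock_ms, nx=True):
            try:
                # Re-check: a previous winner may have filled the cache
                # between our miss and our lock acquisition.
                cached = store.get(key)
                if cached is not None:
                    return cached
                value = load()  # only the lock winner hits the backend
                store.set(key, value, px=ttl_ms)
                return value
            finally:
                # Simplified release. On a real server the compare and the
                # delete should happen atomically (for example in a script).
                if store.get(lock_key) == token:
                    store.delete(lock_key)
        if time.monotonic() >= deadline:
            return load()  # stop waiting and fall back to the source
        time.sleep(retry_s)  # loser: wait briefly, then retry the read


load_calls = []


def slow_load() -> str:
    load_calls.append(1)
    time.sleep(0.05)  # simulate a slow database query
    return '{"items": []}'


store = FakeStore()
results = []
threads = [threading.Thread(
    target=lambda: results.append(single_flight_get(store, "feed:42", slow_load)))
    for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight concurrent requests miss the cache, but only the lock winner runs the loader; the rest pick up the cached value on retry. The lock expiry (`lock_ms`) has to exceed the expected load time, otherwise a second request can acquire the lock while the first is still querying.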

Pub/sub for fan-out invalidation

The pattern that distinguishes Redis from Memcached for API caching is fan-out invalidation via pub/sub. Imagine a content API: a user edits their profile, and the cached responses for /api/users/42, /api/users/42/posts, and every /api/feed containing user 42 in 200 followers' feeds need to drop. Doing this in Memcached means either deleting one key at a time (you have to know all of them) or letting TTL expire (you accept stale data until the TTL elapses).

In Redis you publish a single message on a channel: PUBLISH invalidations '{"userId": 42}'. Every API server is subscribed to that channel. On message receipt, each server runs its own local logic to drop the affected keys. The publish is one operation; the fan-out happens on the subscriber side without any further central coordination. For a fleet of fifty API servers, the bandwidth and latency cost of this pattern is negligible. Redis pub/sub is fire-and-forget, so a server that is disconnected when the message is sent never sees it; keep a TTL on cached entries as a backstop.
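The subscriber-side logic might look like the sketch below. `FakeBus` is a hypothetical in-process stand-in for the channel so the example runs without a server; with redis-py the equivalent calls would be `publish` on the client and `subscribe` on a `pubsub()` object. The `user:<id>:<route>` key naming is an assumption made so that affected keys can be found by prefix.

```python
import json
from typing import Callable, Dict, List


class FakeBus:
    """In-process stand-in for a Redis pub/sub channel (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str], None]]] = {}

    def subscribe(self, channel: str, handler: Callable[[str], None]) -> None:
        self._subscribers.setdefault(channel, []).append(handler)

    def publish(self, channel: str, message: str) -> int:
        handlers = self._subscribers.get(channel, [])
        for handler in handlers:
            handler(message)
        return len(handlers)  # like PUBLISH: number of receivers


class ApiServer:
    """One API server holding a local, in-process response cache."""

    def __init__(self, bus: FakeBus) -> None:
        self.local_cache: Dict[str, bytes] = {}
        bus.subscribe("invalidations", self.on_invalidation)

    def on_invalidation(self, message: str) -> None:
        user_id = json.loads(message)["userId"]
        prefix = f"user:{user_id}:"
        # Each server decides locally which of its own keys are affected.
        for key in [k for k in self.local_cache if k.startswith(prefix)]:
            del self.local_cache[key]


bus = FakeBus()
servers = [ApiServer(bus) for _ in range(3)]
for server in servers:
    server.local_cache["user:42:/api/users/42"] = b"{}"
    server.local_cache["user:42:/api/users/42/posts"] = b"{}"
    server.local_cache["user:7:/api/users/7"] = b"{}"

# One publish; every subscribed server drops its user-42 entries.
receivers = bus.publish("invalidations", json.dumps({"userId": 42}))
```

After the publish, each server still holds the entry for user 7 but nothing for user 42. The publisher never needs to know how many servers exist or which keys each one holds.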

Redis 6.0 added a more efficient mechanism, server-assisted client-side caching (CLIENT TRACKING): in the default mode the server remembers which keys each client has read and pushes an invalidation notification when one of those keys changes, and a broadcasting mode lets clients subscribe to key prefixes instead. This lets you cache in the application process (avoiding the network round-trip entirely on hit) while still getting correctness on writes. Valkey, which forked from Redis 7.2, supports the same mechanism.

Tag-based invalidation patterns

For complex APIs the keys-to-purge problem becomes unmanageable. Solution: tag each cached response with one or more semantic tags (user:42, post:1234, feed) and store a reverse index mapping tag to the set of keys it tags. When data changes, look up the affected tags, fetch all keys under those tags, and DEL them. Redis sets (SADD per tag, SMEMBERS on invalidation, SUNIONSTORE for compound tags) make this straightforward. Memcached has no sets.

The pattern is used by Drupal's cache tags and Symfony's tag-aware caching, among others. Each cached response is stored with a list of cache tags. A separate Redis set per tag tracks "which cache keys carry this tag." Invalidating tag user:42 becomes SMEMBERS user:42 to get the affected keys, then DEL each. The whole operation can be wrapped in a Lua script for atomicity. Memcached's only path here is to track the reverse index in your application database, which negates the benefit of having a cache in the first place.
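A sketch of the reverse index, under stated assumptions. `FakeRedis` is a hypothetical stand-in exposing only `set`, `get`, `sadd`, `smembers` and `delete`, named after the redis-py methods they imitate. The `tag:` and `resp:` prefixes are illustrative, and the sketch skips both TTLs and the atomic wrapper mentioned above.

```python
from typing import Dict, Iterable, Optional, Set


class FakeRedis:
    """Minimal stand-in with the few commands the sketch needs (hypothetical)."""

    def __init__(self) -> None:
        self.strings: Dict[str, bytes] = {}
        self.sets: Dict[str, Set[str]] = {}

    def set(self, key: str, value: bytes) -> None:
        self.strings[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self.strings.get(key)

    def sadd(self, key: str, *members: str) -> None:
        self.sets.setdefault(key, set()).update(members)

    def smembers(self, key: str) -> Set[str]:
        return set(self.sets.get(key, set()))

    def delete(self, *keys: str) -> int:
        removed = 0
        for key in keys:
            removed += int(self.strings.pop(key, None) is not None)
            removed += int(self.sets.pop(key, None) is not None)
        return removed  # like DEL: number of keys actually removed


def cache_with_tags(r: FakeRedis, key: str, body: bytes,
                    tags: Iterable[str]) -> None:
    r.set(key, body)
    for tag in tags:
        r.sadd(f"tag:{tag}", key)  # reverse index: tag -> keys carrying it


def invalidate_tag(r: FakeRedis, tag: str) -> int:
    tag_key = f"tag:{tag}"
    keys = r.smembers(tag_key)
    # Drop the cached responses and the tag set itself in one call.
    return r.delete(*keys, tag_key)


r = FakeRedis()
cache_with_tags(r, "resp:/api/users/42", b"{}", ["user:42"])
cache_with_tags(r, "resp:/api/feed?u=7", b"{}", ["user:42", "user:7", "feed"])
cache_with_tags(r, "resp:/api/users/7", b"{}", ["user:7"])
removed = invalidate_tag(r, "user:42")
```

Invalidating `user:42` removes both responses tagged with it plus the tag set, and leaves the untagged response in place. The other tag sets still list the deleted feed key, which is harmless for correctness but is the kind of stale index entry that periodic cleanup has to handle.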

For very high-tag-count workloads (millions of distinct tags, each with thousands of cached keys) the SMEMBERS scan becomes slow. The mitigation is to expire the tag set itself with a longer TTL than the cached entries; periodic background re-indexing keeps it clean. This is operationally similar to a search index and benefits from the same instincts: occasional rebuilds, accept some staleness, optimise for the read path.

FAQ

Should I cache JSON responses or the underlying data?

Both work. JSON response caching is simpler (cache the bytes you send, return them on hit). Underlying data caching is more flexible (cache the query result, render JSON per request) and lets you mix cached and fresh data in one response. Pick by how often the response shape changes versus how often the data changes.

How do I handle ETag with cached responses?

Generate the ETag from a hash of the cached response body and store both. On a request with If-None-Match, compare the client's ETag to the cached one and return 304 if equal. Both Redis and Memcached can hold the body and the ETag together (Redis as a hash, Memcached as a JSON-wrapped value).
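As a rough sketch of that flow: the entry below is a two-field mapping, which could be stored as a Redis hash or wrapped into a single Memcached value. The function names, the truncated SHA-256 tag and the `(status, headers, body)` return shape are assumptions for illustration, not any framework's API.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_entry(body: bytes) -> Dict[str, bytes]:
    """Bundle a response body with an ETag derived from its bytes."""
    etag = '"' + hashlib.sha256(body).hexdigest()[:32] + '"'
    return {"etag": etag.encode("ascii"), "body": body}


def respond(entry: Dict[str, bytes],
            if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a cached entry."""
    etag = entry["etag"].decode("ascii")
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so ignore any W/ prefix.
        candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # client copy is current: no body
    return 200, headers, entry["body"]


entry = make_entry(b'{"id": 42}')
status_fresh, headers, body = respond(entry, None)
status_cached, _, empty = respond(entry, headers["ETag"])
status_stale, _, _ = respond(entry, '"some-older-tag"')
```

Computing the ETag once at write time and storing it next to the body means a 304 can be answered without rehashing the payload on every request.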

Per-user caches: how do I avoid leaking?

Key cache entries by user ID. Never cache an authenticated response with a non-user-specific key. The classic leak is caching /api/me by URL: every user gets the first user's data. Always concatenate the user identifier into the key.

What TTL should I use?

Depends on freshness tolerance. For dashboards and slow-changing data, 5-60 seconds. For news and user-generated content, sub-second to a few seconds. For configuration and rarely-changing reference data, hours. The right TTL is the longest staleness the product tolerates. If backend capacity is tight, the TTL also has a floor, because shorter TTLs mean more misses and every miss is a backend query.
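One rough way to sanity-check a TTL against backend capacity, as back-of-envelope arithmetic rather than a sizing rule. It assumes a single-flight refresh, so each hot key is reloaded at most once per TTL, and that load is spread evenly; the function names and the example numbers are hypothetical.

```python
def refresh_qps(hot_keys: int, ttl_seconds: float) -> float:
    """Approximate backend queries per second caused by TTL expiry alone."""
    return hot_keys / ttl_seconds


def min_ttl_seconds(hot_keys: int, backend_qps_budget: float) -> float:
    """Shortest TTL that keeps refresh traffic within the given budget."""
    return hot_keys / backend_qps_budget


# 20,000 hot per-user keys with a 5-second TTL.
load_at_5s = refresh_qps(20_000, 5)

# The same keys with only 500 queries per second to spare on the backend.
ttl_floor = min_ttl_seconds(20_000, 500)
```

With these numbers a 5-second TTL implies roughly 4,000 refresh queries per second, and a 500 QPS budget pushes the TTL floor to about 40 seconds. If the product cannot tolerate that much staleness, that is a signal to move from TTL expiry to explicit invalidation.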

Redis hash vs string for cached response?

Use a hash if you might want to invalidate parts of the cached payload (HDEL one field rather than DEL the whole response). Use a string if the response is opaque. For most JSON responses the string is simpler and faster; the hash is worth it only when you have a clear partial-invalidation pattern.

Related decisions

Full-page caching: where Memcached genuinely wins
Pub/Sub: fan-out invalidation pattern
Single-flight pattern: cache stampede defence
All use cases: back to the matrix