Memory Efficiency: Redis vs Memcached vs Valkey
Memory is the resource that costs money at scale. A 10% difference in memory efficiency on a 256GB cache frees roughly 25GB of cluster capacity. The engines differ in how they manage memory, and the right choice depends on the shape of your data.
The Memcached slab allocator
Memcached allocates memory in fixed-size slab classes whose chunk sizes grow geometrically: for illustration 80 bytes, 100 bytes, 125 bytes, 156 bytes, and so on (a growth factor of 1.25) up to 1MB (the default per-item ceiling). Each item occupies one chunk, which holds the key and value together with roughly 55 bytes of item metadata (item header, expiry timestamp, flags, CAS token). An item that needs 90 bytes in total is placed in the 100-byte class and occupies all 100 bytes of that chunk. The 10 unused bytes are internal fragmentation.
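The rounding described above can be sketched in a few lines. This is a simplified model, not Memcached's actual code: the function names are hypothetical, the 80-byte starting size and 1.25 growth factor are taken from the illustrative figures in the text, and the real allocator also aligns chunk sizes, which is ignored here.

```python
def slab_classes(smallest=80, factor=1.25, largest=1024 * 1024):
    """Generate chunk sizes by repeatedly multiplying by the growth factor."""
    sizes = []
    size = float(smallest)
    while size < largest:
        sizes.append(int(size))
        size *= factor
    sizes.append(largest)  # the per-item ceiling is the final class
    return sizes


def chunk_for(item_size, classes):
    """Return the smallest chunk size that can hold item_size bytes."""
    for chunk in classes:
        if chunk >= item_size:
            return chunk
    raise ValueError("item larger than the largest slab class")


classes = slab_classes()
chunk = chunk_for(90, classes)
print(classes[:4])        # the first four class sizes
print(chunk, chunk - 90)  # chosen chunk size and bytes left unused
```

Here item_size stands for the total number of bytes that must fit in the chunk. The difference between the chosen chunk and the item is the internal fragmentation for that item.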
The slab approach has trade-offs. For workloads where most items are of similar size (a page cache with uniformly sized HTML pages, an image thumbnail cache), fragmentation is minimal and the allocator is very efficient. For workloads with wildly varying item sizes (sometimes 10 bytes, sometimes 100KB), fragmentation can waste 30-50% of capacity. Memcached can be tuned to a known workload shape by adjusting the growth factor (-f) and the minimum chunk size (-n), or by supplying an explicit list of class sizes with -o slab_sizes. (The -S flag enables SASL authentication and is unrelated to slab sizing.)
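To see why matching slab classes to a known item size matters, the following sketch estimates how much allocated chunk space goes unused for a given list of item sizes. The class lists and the 130-byte workload are hypothetical, and the model covers only rounding waste inside chunks, not the imbalance between slab classes that can also strand memory.

```python
import bisect


def wasted_fraction(item_sizes, classes):
    """Fraction of allocated chunk bytes not occupied by item data."""
    classes = sorted(classes)
    allocated = 0
    used = 0
    for size in item_sizes:
        # Index of the smallest class that is >= size.
        idx = bisect.bisect_left(classes, size)
        if idx == len(classes):
            raise ValueError(f"no slab class fits {size} bytes")
        allocated += classes[idx]
        used += size
    return 1 - used / allocated


default_like = [80, 100, 125, 156, 195, 244]  # geometric classes, factor 1.25
tuned = [80, 130, 244]                         # hypothetical custom classes
uniform = [130] * 1000                         # hypothetical uniform workload

print(round(wasted_fraction(uniform, default_like), 3))  # 130 lands in 156
print(round(wasted_fraction(uniform, tuned), 3))         # 130 fits exactly
```

With the geometric classes every 130-byte item lands in a 156-byte chunk, so about a sixth of the allocated space is unused. With a class sized for the workload the waste disappears.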
The other property of the slab allocator is fast and predictable allocation: storing an item never triggers a malloc/free cycle, just a slab-class lookup. This contributes to Memcached's low-latency profile, especially under memory pressure, where a general-purpose allocator such as the one Redis relies on can add latency.
Redis special-case encodings
Redis uses jemalloc for general allocation and applies special-case encodings for small data structures: ziplist (listpack in newer versions) for small hashes and lists, intset for small sets containing only integers, and embstr for short strings. These encodings can be dramatically more efficient than the general-case structures: a hash with 10 small fields might use 200 bytes in total under the compact encoding versus 800 bytes under the standard hash structure.
The encoding is automatic and transparent: Redis stores the data in the compact encoding while it fits the thresholds (configurable via hash-max-listpack-entries, hash-max-listpack-value, etc.), then converts to the standard encoding when the structure grows beyond a threshold. The application sees no behavioural difference, but memory usage looks very different depending on whether you have many small structures or fewer large ones.
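The threshold rule for hashes can be modelled in a few lines. This is a sketch of the decision only, not Redis source code: the function name is hypothetical, and the defaults of 512 entries and 64 bytes are assumed to match the stock values of hash-max-listpack-entries and hash-max-listpack-value, so check them against your own configuration.

```python
def hash_encoding(fields, max_entries=512, max_value=64):
    """Predict the encoding of a hash given as a dict of bytes -> bytes.

    max_entries and max_value mirror hash-max-listpack-entries and
    hash-max-listpack-value; the defaults here are assumptions.
    """
    if len(fields) > max_entries:
        return "hashtable"
    for name, value in fields.items():
        # The size limit applies to field names as well as values.
        if len(name) > max_value or len(value) > max_value:
            return "hashtable"
    return "listpack"


session = {b"user_id": b"42", b"theme": b"dark"}
print(hash_encoding(session))                       # small hash
print(hash_encoding({b"blob": b"x" * 65}))          # one oversized value
print(hash_encoding({str(i).encode(): b"v" for i in range(513)}))  # too many
```

On a live server, OBJECT ENCODING key reports the encoding actually in use, which is the way to confirm what a model like this predicts.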
For workloads dominated by small string keys (the typical session-store, cache-key pattern), Redis is comparable to or slightly better than Memcached in memory efficiency. For workloads dominated by small hashes (typical hash-based session storage, hash-based config), Redis is significantly more efficient. For workloads with very large values (cached HTML pages, large JSON responses), the per-item overhead is amortised over the value size and both engines are similar.
The Valkey 8.1 result
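The Momento benchmark cited on the homepage reported Valkey 8.1 using 28% less memory than Redis 8.0 for a workload of 50 million sorted-set entries on AWS c8g.2xlarge (Graviton4). The specific numbers were Redis 8.0 at 4.83GB and Valkey 8.1 at 3.77GB for the same data set. Taken at face value, those two figures work out to Valkey using about 22% less memory than Redis, or equivalently Redis using about 28% more than Valkey, so the headline percentage depends on which engine is treated as the baseline. The benchmark was published by Momento (a managed-cache provider) and uses Momento's standard benchmark methodology.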
The Valkey advantage traces to optimisations in the listpack encoding (Valkey 8.x rewrote parts of the listpack implementation for tighter packing) and improvements to the sorted-set internal representation. The 28% number is specific to this workload shape; for plain string keys the Valkey-Redis gap is much smaller (typically 2-5%). The implication is that Valkey is genuinely improving memory efficiency in ways the Redis OSS engine has not yet picked up.
At scale, a memory saving of this kind translates to a proportionally smaller cluster for the same workload, provided the cluster is sized by memory rather than by CPU or network, and therefore to a proportionally lower compute cost. For a $10k/month Redis cluster, a 28% saving is $2,800/month, on top of the ~33% Valkey-versus-Redis license-driven cloud-pricing difference. The cumulative cost differential at scale is large enough that even teams without license concerns are considering Valkey for new workloads.
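The arithmetic behind these figures is worth checking, because a percentage saving depends on which number is the baseline. The sketch below uses only the sizes and prices quoted above. The function names are hypothetical, and multiplying the two savings together is an assumption (the memory saving shrinks the cluster, and the pricing difference then applies to what remains).

```python
def reduction(baseline, candidate):
    """Fractional saving of candidate relative to baseline."""
    return (baseline - candidate) / baseline


def monthly_cost(base_cost, memory_saving, price_discount):
    """Cost after shrinking the cluster, then applying a cheaper unit price."""
    return base_cost * (1 - memory_saving) * (1 - price_discount)


redis_gb, valkey_gb = 4.83, 3.77

less = reduction(redis_gb, valkey_gb)  # Valkey measured against Redis
more = redis_gb / valkey_gb - 1        # Redis measured against Valkey

print(f"Valkey uses {less:.1%} less memory than Redis")
print(f"Redis uses {more:.1%} more memory than Valkey")

# Combined effect on a $10k/month cluster with a 33% price difference.
print(round(monthly_cost(10_000, 0.28, 0.33)))
print(round(monthly_cost(10_000, less, 0.33)))
```

The first two lines print 21.9% and 28.1% respectively, which is why the same benchmark can be summarised with either number.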
The bigger lever: application-level compression
The choice of engine has a 5-30% impact on memory usage for typical workloads. The choice of whether to compress cached values has a 50-80% impact for text-heavy data. For applications caching JSON API responses, HTML fragments, or any text-dominant payload, application-level compression (gzip, zstd) usually saves more memory than picking the more efficient engine.
The trade-off is CPU cost: compress on write, decompress on read. For modern CPUs and modern compressors (zstd at level 3 or so), the CPU overhead is single-digit microseconds for typical cache item sizes, which is small next to the network round-trip itself. The bandwidth savings on the network (a compressed payload is smaller to transmit) often more than pay for the CPU cost.
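A minimal sketch of compress-on-write and decompress-on-read follows. It uses zlib from the Python standard library (the DEFLATE algorithm behind gzip) as a stand-in for zstd, which would need a third-party package on most Python versions. The pack and unpack helpers and the sample payload are hypothetical, and the actual ratio depends entirely on your data.

```python
import json
import zlib


def pack(obj, level=6):
    """Serialise to compact JSON, then compress before writing to the cache."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level)


def unpack(blob):
    """Decompress a cached value and parse it back into Python objects."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Hypothetical API response with the repetitive structure typical of JSON.
payload = {
    "items": [
        {"id": i, "name": f"product-{i}", "in_stock": True, "tags": ["a", "b"]}
        for i in range(200)
    ]
}

raw_len = len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
blob = pack(payload)

print(raw_len, len(blob), f"{1 - len(blob) / raw_len:.0%} smaller")
assert unpack(blob) == payload  # the round trip is lossless
```

The bytes returned by pack are what would be passed to the cache SET call, and unpack is applied to whatever GET returns. Very small values can grow when compressed, so a size threshold below which values are stored raw is a common refinement.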
Most production caches at scale use some combination of: (1) engine choice based on workload features; (2) application-level compression for text-heavy values; (3) careful data modelling to keep structures within the special-case encoding thresholds (in Redis); (4) custom slab classes for known item-size distributions (in Memcached). Each of these typically moves the dial by 10-30%, and compression can do more for text-heavy data; combined they can halve the required cluster size.
FAQ
What is the per-item overhead for Memcached?
Roughly 50-60 bytes per item beyond the key and value themselves, plus internal fragmentation from the slab allocator (which rounds item sizes up to fixed slab classes). For typical 100-byte items the effective overhead is 50-100% of the item size. For 10KB items the overhead is closer to 1%. The slab allocator favours uniform-size items.
What is the per-item overhead for Redis?
Roughly 80-120 bytes for a typical Redis string key-value pair, with special-case encodings (ziplist or listpack for small hashes and lists, intset for small integer sets) that significantly reduce the overhead of small structures. For small values Redis can be more memory-efficient than Memcached; for large values they are similar; for very large values Memcached has marginally lower overhead.
Why is Valkey 8.1 28% more memory efficient than Redis 8.0?
The Momento benchmark (cited on the homepage) reported 28% lower memory usage on Valkey 8.1 vs Redis 8.0 for a workload of 50 million sorted-set entries. The win comes from Valkey 8.x optimisations in the listpack encoding (the replacement for ziplist) and improvements to internal structures, including the sorted-set representation and the hashtable resize logic. The gap is workload-dependent; for plain string keys it is smaller.
Does compression help?
All of these engines store values as opaque bytes; the application is responsible for compression. For text-heavy workloads (JSON responses, HTML fragments) gzip or zstd compression at the application layer can reduce memory by 50-80%, which often saves more than choosing the more memory-efficient engine. The CPU cost of compression is usually negligible against the network savings.
How do I measure actual memory usage?
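Redis: INFO MEMORY gives used_memory_human (bytes allocated by Redis), used_memory_rss_human (resident memory as seen by the operating system, including allocator overhead and fragmentation), and mem_fragmentation_ratio; MEMORY STATS and MEMORY USAGE <key> give finer-grained breakdowns. Memcached: the stats command shows bytes (memory used by stored items) and limit_maxbytes, while stats slabs shows per-slab statistics and total_malloced. An RSS / used ratio (mem_fragmentation_ratio in Redis) above 1.5 indicates significant allocator fragmentation.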
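As a rough illustration of the RSS / used check, the sketch below parses the text that INFO MEMORY returns and computes the ratio. The parse_info and fragmentation_ratio helpers are hypothetical, and the sample values are invented for the example rather than taken from a real server.

```python
def parse_info(text):
    """Parse 'key:value' lines from INFO output, skipping '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def fragmentation_ratio(fields):
    """Resident set size divided by bytes allocated by the server."""
    return int(fields["used_memory_rss"]) / int(fields["used_memory"])


# Invented sample in the shape of INFO MEMORY output.
sample = """# Memory
used_memory:1000000
used_memory_human:976.56K
used_memory_rss:1600000
used_memory_rss_human:1.53M
"""

ratio = fragmentation_ratio(parse_info(sample))
print(f"ratio={ratio:.2f}", "fragmented" if ratio > 1.5 else "ok")
```

In practice the text would come from redis-cli INFO MEMORY or a client library. Note that on a nearly empty instance the ratio can be high simply because fixed process overhead dominates, so it is most meaningful once the data set is a realistic size.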