name: caching-strategy
description: Design a cache layer: cache-aside read/write/invalidate, TTL + jitter, stampede prevention (single-flight / probabilistic refresh), and explicit invalidation. Use when read latency is high, the DB is read-bound, or a hot key causes thundering-herd load. Not for fixing the slow query at its source (use query-optimization first) or HTTP/browser caching (a frontend concern).
license: MIT
Caching Strategy
Purpose
Add caching deliberately — with a clear read/write/invalidate flow, stampede protection, and an invalidation plan — so it reduces load without serving stale or inconsistent data or collapsing under a hot-key herd.
Universal — cache-aside flow, TTL+jitter, stampede prevention, and invalidation strategy are caching principles independent of the cache store; Redis is the default implementation.
Procedure
Optimize the query FIRST, cache second
- Caching a slow query hides the problem and adds staleness risk
- Run `query-optimization` before adding a cache layer
Use cache-aside (lazy loading) as the default pattern
- Read: check cache → miss → read DB → populate cache → return
- Write: write DB → invalidate (delete) the cache key (don't write-through unless justified)
- Delete-on-write > update-on-write: avoids cache/DB races
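
The read and write paths above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `CacheStore`, `cacheAsideRead`, `writeThenInvalidate`, and `InMemoryStore` are hypothetical names, values are assumed to be JSON-serializable, and the in-memory store ignores TTL expiry. A Redis-backed `CacheStore` would map onto `GET` / `SETEX` / `DEL`.

```typescript
// Minimal store interface (hypothetical). A thin wrapper around a Redis
// client's GET / SET-with-expiry / DEL would satisfy it.
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

// Read path: check cache -> miss -> read DB -> populate cache -> return.
async function cacheAsideRead<T>(
  cache: CacheStore,
  key: string,
  ttlSeconds: number,
  loadFromDb: () => Promise<T>,
): Promise<T> {
  const cached = await cache.get(key);
  if (cached !== null) {
    return JSON.parse(cached) as T;
  }
  const fresh = await loadFromDb();
  await cache.set(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}

// Write path: write the DB first, then delete the key so the next read
// repopulates it from the source of truth.
async function writeThenInvalidate(
  cache: CacheStore,
  key: string,
  writeToDb: () => Promise<void>,
): Promise<void> {
  await writeToDb();
  await cache.del(key);
}

// In-memory stand-in for illustration only (it ignores TTL expiry).
class InMemoryStore implements CacheStore {
  private data = new Map<string, string>();

  async get(key: string): Promise<string | null> {
    return this.data.get(key) ?? null;
  }

  async set(key: string, value: string, _ttlSeconds: number): Promise<void> {
    this.data.set(key, value);
  }

  async del(key: string): Promise<void> {
    this.data.delete(key);
  }
}
```

Deleting rather than updating on write means the cache never holds a value the DB did not return, at the cost of one extra miss after each write.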
Set TTL with jitter
- Every cached key gets a TTL (no infinite caches without an invalidation plan)
- Add random jitter to TTLs so keys don't all expire simultaneously (a synchronized expiry = mass stampede)
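
One way to apply jitter is to spread each TTL uniformly around a base value. The sketch below assumes a symmetric spread of plus or minus 10% by default; `ttlWithJitter` and its parameters are hypothetical names, and the right spread depends on the workload.

```typescript
// Returns a TTL in seconds spread uniformly over
// [baseSeconds * (1 - jitterFraction), baseSeconds * (1 + jitterFraction)].
// `rand` is injectable so the spread can be checked deterministically.
function ttlWithJitter(
  baseSeconds: number,
  jitterFraction: number = 0.1,
  rand: () => number = Math.random,
): number {
  if (baseSeconds <= 0) {
    throw new RangeError("baseSeconds must be positive");
  }
  if (jitterFraction < 0 || jitterFraction >= 1) {
    throw new RangeError("jitterFraction must be in [0, 1)");
  }
  // offset lies in [-jitterFraction, +jitterFraction)
  const offset = (rand() * 2 - 1) * jitterFraction;
  // Never return less than one second, even for tiny base TTLs.
  return Math.max(1, Math.round(baseSeconds * (1 + offset)));
}
```

With a base of 300 seconds and the default fraction, keys written in the same instant expire anywhere between 270 and 330 seconds later instead of all at once.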
3b. Bound the cache: memory budget + eviction policy
- Set `maxmemory` and choose an eviction policy deliberately (`allkeys-lru` for general read-through caches; `volatile-lru` if you mix persistent state into the same instance, though ideally you don't)
- Without a bound, a runaway key generator (per-user, per-query-fingerprint) eats memory until OOM
- Cache-key cardinality: unbounded distinct keys = unbounded memory; cap or hash high-cardinality identifiers
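
As a rough sketch of hashing a high-cardinality identifier, the helper below canonicalizes query parameters (so equivalent queries share one key) and then hashes them to a fixed-length fingerprint. `queryCacheKey` is a hypothetical name, and truncating SHA-256 to 32 hex characters is an assumption made here for brevity. Hashing bounds key length, not the number of distinct keys, so the memory budget and eviction policy still apply.

```typescript
import { createHash } from "node:crypto";

// Builds a fixed-length cache key from arbitrary query parameters.
// Sorting the parameter names first means { a, b } and { b, a } map to the
// same key, which removes one common source of duplicate entries.
function queryCacheKey(
  prefix: string,
  params: Record<string, string | number | boolean>,
): string {
  const canonical = Object.keys(params)
    .sort()
    .map((name) => `${name}=${String(params[name])}`)
    .join("&");
  const digest = createHash("sha256")
    .update(canonical)
    .digest("hex")
    .slice(0, 32); // 128 bits of the digest, an assumed trade-off
  return `${prefix}:${digest}`;
}
```

For example, `queryCacheKey("search", { q: "shoes", page: 2 })` and `queryCacheKey("search", { page: 2, q: "shoes" })` yield the same key.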
Prevent cache stampede on hot keys
- Single-flight lock: first request acquires an atomic compare-and-set lock, recomputes, others wait/serve-stale
- Probabilistic early refresh: recompute slightly before expiry with rising probability
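- Both rely on atomic store operations: lock acquire/release for single-flight, and a single write of the value plus its metadata for probabilistic refresh (store-specific mechanism in Implementation)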
- Without this, a popular key expiring under load → thundering herd hammers the DB
- Negative caching: cache "this key doesn't exist" (a sentinel value, short TTL) for queries that miss — otherwise the same non-existent key hits the DB on every request (silent thundering herd from 404s)
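
The single-flight idea can be illustrated in-process by letting concurrent callers share one in-flight promise per key. This is a sketch under stated assumptions: `SingleFlight` is a hypothetical class, and it only coalesces requests within one process. Coordinating several app instances still needs the store-level lock described under Implementation.

```typescript
// In-process single-flight: concurrent callers for the same key share one
// in-flight load instead of each hitting the database.
class SingleFlight<T> {
  private inFlight = new Map<string, Promise<T>>();

  run(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing !== undefined) {
      return existing; // join the load that is already running
    }
    // Remove the entry once the load settles (success or failure) so the
    // next caller after that starts a fresh load.
    const pending = load().finally(() => {
      this.inFlight.delete(key);
    });
    this.inFlight.set(key, pending);
    return pending;
  }
}
```

If the shared load rejects, every waiting caller sees the same rejection, so callers should still handle errors individually.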
Plan invalidation explicitly — the hard part
- Know exactly which writes invalidate which keys
- Use key naming conventions (`user:{id}:profile`) so invalidation is targeted
- Tag-based / versioned keys for "invalidate everything related to X"
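
A small sketch of how a key convention and a write-to-keys map might fit together is shown below. The `user:{id}:profile` shape comes from the convention above; everything else (`userKey`, `versionedUserKey`, the `settings` field, and the `updateProfile` / `changeEmail` operations) is hypothetical and only illustrates the idea.

```typescript
// Central key builder so every reader and writer agrees on the key shape.
function userKey(userId: string, field: string): string {
  return `user:${userId}:${field}`;
}

// Versioned variant: readers include the current version in the key, so
// bumping the version makes every older key unreachable without deleting
// each one. Orphaned keys are left to expire via TTL or eviction.
function versionedUserKey(userId: string, version: number, field: string): string {
  return `user:${userId}:v${version}:${field}`;
}

// Hypothetical write operations for this example.
type WriteOp = "updateProfile" | "changeEmail";

// Explicit map: which write invalidates which keys.
const invalidationMap: Record<WriteOp, (userId: string) => string[]> = {
  updateProfile: (userId) => [userKey(userId, "profile")],
  changeEmail: (userId) => [userKey(userId, "profile"), userKey(userId, "settings")],
};

function keysToInvalidate(op: WriteOp, userId: string): string[] {
  return invalidationMap[op](userId);
}
```

Keeping the map in one place makes a missing entry visible in review, which is where most invalidation gaps start.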
5b. Cache is an optimization, not a source of truth
- The app MUST keep working with an empty / unavailable cache — degraded latency, not broken behavior
- Never store authoritative state (auth tokens, balances) only in the cache; a Redis restart, failover, or eviction can silently drop it
- Use a `resilience-patterns` circuit breaker around the cache client so a cache outage doesn't cascade
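
The "degraded latency, not broken behavior" rule can be sketched as a fail-open read: any cache error or timeout is treated as a miss. This is a simpler stand-in for illustration, not a full circuit breaker. `getWithFallback`, `withTimeout`, and the 50 ms default are assumptions made for the example.

```typescript
// Rejects if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("cache timeout")), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Fail-open read: a cache hit is returned directly; a miss, an error, or a
// slow cache all fall through to the source of truth.
async function getWithFallback<T>(
  readCache: () => Promise<T | null>,
  readDb: () => Promise<T>,
  timeoutMs: number = 50,
): Promise<T> {
  try {
    const cached = await withTimeout(readCache(), timeoutMs);
    if (cached !== null) {
      return cached;
    }
  } catch {
    // Cache is down or slow: treat it as a miss rather than failing the request.
  }
  return readDb();
}
```

A real circuit breaker would additionally stop calling the cache for a cool-down period after repeated failures, so each request does not pay the timeout.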
Validate (validation loop)
- Load-test with a hot key expiring under concurrency → verify no DB spike (stampede prevented)
- After a write, verify the cache returns fresh data (invalidation works)
- If stale data served → invalidation gap; fix the write→invalidate wiring
Anti-patterns
| ❌ Anti-pattern | ✅ Correct |
|---|---|
| Caching before optimizing the query | Optimize first, cache second |
| No TTL (infinite cache, no invalidation plan) | TTL + explicit invalidation strategy |
| All keys same TTL | TTL + jitter to desynchronize expiry |
| Update-cache-on-write (race-prone) | Delete-cache-on-write (cache-aside) |
| No stampede protection on hot keys | Single-flight lock or probabilistic refresh |
| No `maxmemory` / eviction policy (OOM under load) | `maxmemory` + `allkeys-lru` (or chosen policy) |
| Unbounded distinct cache keys (memory leak) | Cap or hash high-cardinality keys |
| 404s hitting the DB on every retry (negative-cache miss) | Cache "not found" sentinel with a short TTL |
| Auth/session state stored only in cache | Cache is optimization, not source of truth |
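
The "not found" sentinel row in the table above could look roughly like this. It is a sketch under assumptions: `NOT_FOUND`, `KV`, and `getOrNull` are hypothetical names, the TTL defaults are placeholders, and the sentinel string is assumed never to collide with a real value.

```typescript
// Sentinel stored in place of a value to mean "the source of truth has no
// such row". Assumed never to be a legitimate cached value.
const NOT_FOUND = "__not_found__";

// Minimal store interface (hypothetical) with a per-key TTL in seconds.
interface KV {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

async function getOrNull(
  cache: KV,
  key: string,
  loadFromDb: () => Promise<string | null>,
  ttlSeconds: number = 300,
  negativeTtlSeconds: number = 30,
): Promise<string | null> {
  const cached = await cache.get(key);
  if (cached === NOT_FOUND) {
    return null; // known miss: skip the DB entirely
  }
  if (cached !== null) {
    return cached;
  }
  const row = await loadFromDb();
  if (row === null) {
    // Short TTL so a row created later becomes visible quickly.
    await cache.set(key, NOT_FOUND, negativeTtlSeconds);
    return null;
  }
  await cache.set(key, row, ttlSeconds);
  return row;
}
```

Writes that create the row should delete the key as usual, so the sentinel does not outlive the miss it records.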
Severity tiers
| Tier | Examples | Action SLA |
|---|---|---|
| Critical | Hot key with no stampede protection causing DB overload; cache serving stale auth/permission data | Fix immediately |
| Major | Infinite-TTL cache with no invalidation plan; update-on-write races | Fix this sprint |
| Minor | Uniform TTLs (no jitter); cache key naming inconsistency | Schedule within 2 sprints |
Completion Criteria
- Underlying query optimized before caching
- Cache-aside with delete-on-write applied
- Every key has a TTL + jitter
- Hot keys have stampede protection (verified under load)
- Invalidation mapping documented (which write → which key)
Output
- Cache layer code: cache-aside helpers + invalidation hooks
- Invalidation map: `docs/cache-invalidation.md` (write → invalidated keys)
- Commit format: `perf(cache): add cache-aside for <query>` / `fix(cache): single-flight lock on <hot key>`
Implementation
TypeScript + Redis (default)
- Cache-aside helper around Redis `GET`/`SETEX`/`DEL`
- Stampede: single-flight via `SET key val NX PX ttl` lock, or `ioredis` + Lua `EVAL` for atomicity
- Probabilistic refresh: store `(value, computed_at, ttl)` and recompute when `now - computed_at > ttl * random_threshold`
- Supabase: pair with Postgres; Redis via Upstash/managed
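
The probabilistic refresh rule `now - computed_at > ttl * random_threshold` can be sketched as a pure decision function. One assumption is made here: `random_threshold` is drawn uniformly from `[minFraction, 1)`, so no refresh happens before `minFraction` of the TTL has elapsed and the refresh probability rises linearly after that. `CachedEntry`, `shouldRefreshEarly`, and the 0.8 default are hypothetical.

```typescript
// Cached value together with the metadata needed for the refresh decision.
interface CachedEntry<T> {
  value: T;
  computedAtMs: number; // when the value was computed (epoch milliseconds)
  ttlMs: number; // intended lifetime in milliseconds
}

// Returns true when this caller should recompute the value ahead of expiry.
// `rand` is injectable so the decision can be tested deterministically.
function shouldRefreshEarly<T>(
  entry: CachedEntry<T>,
  nowMs: number,
  minFraction: number = 0.8,
  rand: () => number = Math.random,
): boolean {
  const ageMs = nowMs - entry.computedAtMs;
  // Threshold lies in [minFraction, 1): older entries exceed it more often.
  const threshold = minFraction + rand() * (1 - minFraction);
  return ageMs > entry.ttlMs * threshold;
}
```

A caller that gets `true` would recompute and rewrite the entry while other callers keep serving the still-valid cached value, which spreads refreshes out instead of concentrating them at expiry.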
Other stacks
- Python: `redis-py` + `aiocache`; same cache-aside + Lua patterns
- Go: `go-redis` + `singleflight` package (stdlib-adjacent) for stampede
- Universal: cache-aside, TTL+jitter, and stampede prevention are store-agnostic; Memcached works for simple cases (no Lua → use `add`-based locks)
Related skills
- `query-optimization`: cache only after the query itself is optimized
- `resilience-patterns`: cache as a fallback when a dependency is down
- `transaction-management`: invalidate cache after the write commits, not before
Reference
- Key insight encoded: A popular key expiring under concurrency triggers a stampede (thundering herd); prevent it with a single-flight mutex lock or probabilistic early refresh, both executed atomically via Lua. Delete-on-write (not update) avoids cache/DB races.