caching-strategy - SKILL.md Agent Skill

name: caching-strategy
description: Design a cache layer with cache-aside read/write/invalidate, TTL + jitter, stampede prevention (single-flight / probabilistic refresh), and explicit invalidation. Use when read latency is high, the DB is read-bound, or a hot key causes thundering-herd load. Not for fixing the slow query at its source (use query-optimization first) or HTTP/browser caching (a frontend concern).
license: MIT

Caching Strategy

Purpose

Add caching deliberately — with a clear read/write/invalidate flow, stampede protection, and an invalidation plan — so it reduces load without serving stale or inconsistent data or collapsing under a hot-key herd.

Universal — cache-aside flow, TTL+jitter, stampede prevention, and invalidation strategy are caching principles independent of the cache store; Redis is the default implementation.

Procedure

1. Optimize the query FIRST, cache second
- Caching a slow query hides the problem and adds staleness risk
- Run query-optimization before adding a cache layer
2. Use cache-aside (lazy loading) as the default pattern
- Read: check cache → miss → read DB → populate cache → return
- Write: write DB → invalidate (delete) the cache key (don't write-through unless justified)
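- Prefer delete-on-write over update-on-write: if two writers update the cache out of order, it can keep the older value, whereas a deleted key is simply reloaded from the DB on the next read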
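
The read and write paths above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `CacheStore`, `MemoryStore`, `cacheAsideRead`, and `writeThenInvalidate` are hypothetical names, and the in-memory store only stands in for Redis so the sketch runs on its own.

```typescript
// Hypothetical minimal store interface; a thin wrapper around a Redis client
// could satisfy it. None of these names come from a real library.
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
  del(key: string): Promise<void>;
}

// In-memory stand-in (assumption) so the sketch runs without a Redis server.
class MemoryStore implements CacheStore {
  private data = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const hit = this.data.get(key);
    if (hit === undefined || hit.expiresAt <= Date.now()) {
      this.data.delete(key); // treat expired entries as misses
      return null;
    }
    return hit.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.data.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }

  async del(key: string): Promise<void> {
    this.data.delete(key);
  }
}

// Read path: check cache -> miss -> read DB -> populate cache -> return.
async function cacheAsideRead<T>(
  cache: CacheStore,
  key: string,
  ttlSeconds: number,
  loadFromDb: () => Promise<T>,
): Promise<T> {
  const cached = await cache.get(key);
  if (cached !== null) return JSON.parse(cached) as T;
  const fresh = await loadFromDb();
  await cache.set(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}

// Write path: write the DB first, then delete (not update) the cache key.
async function writeThenInvalidate(
  cache: CacheStore,
  key: string,
  writeToDb: () => Promise<void>,
): Promise<void> {
  await writeToDb();
  await cache.del(key);
}
```

The next read after `writeThenInvalidate` misses the cache and reloads the committed value from the DB, so the cache is only ever populated from the read path.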
3. Set TTL with jitter
- Every cached key gets a TTL (no infinite caches without an invalidation plan)
- Add random jitter to TTLs so keys don't all expire simultaneously (a synchronized expiry = mass stampede)
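
One way to add jitter is to spread each TTL by a random fraction of its base value. The sketch below assumes a symmetric spread of plus or minus 10% by default; `ttlWithJitter` and its parameters are hypothetical, and the right ratio depends on the workload.

```typescript
// Returns baseSeconds shifted by a random offset within
// +/- (jitterRatio * baseSeconds). The `random` parameter is injectable only
// to make the function testable; it defaults to Math.random.
function ttlWithJitter(
  baseSeconds: number,
  jitterRatio: number = 0.1,
  random: () => number = Math.random,
): number {
  const spread = baseSeconds * jitterRatio;
  const offset = (random() * 2 - 1) * spread; // in [-spread, +spread)
  return Math.max(1, Math.round(baseSeconds + offset)); // never below 1 second
}
```

With a base of 300 seconds and the default ratio, keys written at the same moment expire anywhere between 270 and 330 seconds later instead of all at once.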

3b. Bound the cache: memory budget + eviction policy

- Set `maxmemory` and choose an eviction policy deliberately: `allkeys-lru` for a general-purpose cache; `volatile-lru` if you mix persistent state into the same instance (ideally don't)
- Without a bound, a runaway key generator (per-user, per-query-fingerprint) eats memory until OOM
- Cache-key cardinality: unbounded distinct keys = unbounded memory; cap or hash high-cardinality identifiers
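
As one possible reading of "cap", the sketch below builds keys only from an allow-list of parameters, so tracking or session parameters cannot multiply the number of distinct keys. `boundedCacheKey` and the parameter names in the usage note are hypothetical, and this is an assumption about how capping might be done rather than a prescribed method.

```typescript
// Builds a cache key from an allow-listed, sorted subset of request
// parameters. Parameters outside `allowed` are ignored, so they cannot
// create new keys; sorting makes the key independent of parameter order.
function boundedCacheKey(
  prefix: string,
  params: Record<string, string | undefined>,
  allowed: readonly string[],
): string {
  const parts: string[] = [];
  for (const name of allowed.slice().sort()) {
    const value = params[name];
    if (value !== undefined) {
      parts.push(`${name}=${encodeURIComponent(value)}`);
    }
  }
  return parts.length > 0 ? `${prefix}:${parts.join("&")}` : prefix;
}
```

For example, `boundedCacheKey("search", { q: "redis", page: "2", utm_source: "mail" }, ["q", "page"])` yields `search:page=2&q=redis` regardless of the `utm_source` value.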

4. Prevent cache stampede on hot keys
- Single-flight lock: first request acquires an atomic compare-and-set lock, recomputes, others wait/serve-stale
- Probabilistic early refresh: recompute slightly before expiry with rising probability
- Both must be executed atomically (store-specific mechanism in Implementation)
- Without this, a popular key expiring under load → thundering herd hammers the DB
- Negative caching: cache "this key doesn't exist" (a sentinel value, short TTL) for queries that miss — otherwise the same non-existent key hits the DB on every request (silent thundering herd from 404s)
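
The single-flight idea can be illustrated in-process, similar in spirit to Go's singleflight package mentioned under Implementation. This sketch only coalesces callers inside one application instance; coordinating across instances still needs the store-level lock described above. `SingleFlight` is a hypothetical class, not a library API.

```typescript
// In-process single-flight: concurrent callers for the same key share one
// in-flight load instead of each hitting the database.
class SingleFlight<T> {
  private inFlight = new Map<string, Promise<T>>();

  run(key: string, load: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing !== undefined) return existing; // join the load already running

    // Start the load on a microtask so a synchronous throw becomes a rejection.
    const pending = Promise.resolve().then(load);
    this.inFlight.set(key, pending);

    // Clear the slot once the load settles, whether it succeeded or failed,
    // so the next miss triggers a fresh load.
    const clear = () => {
      this.inFlight.delete(key);
    };
    pending.then(clear, clear);
    return pending;
  }
}
```

A caller would wrap only the cache-miss path, for example `flight.run(key, () => loadFromDbAndPopulateCache(key))`, where `loadFromDbAndPopulateCache` is a placeholder for the application's own loader.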
5. Plan invalidation explicitly (the hard part)
- Know exactly which writes invalidate which keys
- Use key naming conventions (user:{id}:profile) so invalidation is targeted
- Tag-based / versioned keys for "invalidate everything related to X"
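
Versioned keys can be sketched as a per-namespace counter embedded in every key: bumping the counter makes all older keys unreachable, and they then age out through their TTLs. The `Map` below stands in for the cache store, and `currentVersion`, `versionedKey`, and `invalidateNamespace` are hypothetical helpers; with Redis the bump would typically be an atomic increment, which is an assumption about the deployment.

```typescript
// A Map stands in for the cache store so the sketch runs without Redis.
const store = new Map<string, string>();

// Current version of a namespace such as "user:42"; defaults to 1.
function currentVersion(namespace: string): number {
  return Number(store.get(`${namespace}:version`) ?? "1");
}

// Every key in the namespace embeds the current version,
// e.g. "user:42:v1:profile".
function versionedKey(namespace: string, suffix: string): string {
  return `${namespace}:v${currentVersion(namespace)}:${suffix}`;
}

// "Invalidate everything related to X": bump the version. Old keys are no
// longer addressed and are left to expire via their TTLs.
function invalidateNamespace(namespace: string): void {
  store.set(`${namespace}:version`, String(currentVersion(namespace) + 1));
}
```

The trade-off is one extra lookup per read to resolve the version, in exchange for invalidating a whole group of keys with a single write.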

5b. Cache is an optimization, not a source of truth

- The app MUST keep working with an empty or unavailable cache: degraded latency, not broken behavior
- Never store authoritative state only in the cache (auth tokens, balances); a Redis restart or eviction can drop it
- Wrap the cache client in a circuit breaker (see resilience-patterns) so a cache outage doesn't cascade
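
A fail-open read is one way to keep the app working when the cache is down. The sketch below treats any cache error as a miss and never lets a failed cache write fail the request; it does not implement the circuit breaker itself. `CacheClient` and `readFailOpen` are hypothetical names.

```typescript
// Hypothetical minimal client interface; not a real library type.
interface CacheClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Reads through the cache but falls back to the source of truth whenever the
// cache errors, so an outage costs latency rather than correctness.
async function readFailOpen(
  cache: CacheClient,
  key: string,
  ttlSeconds: number,
  loadFromSource: () => Promise<string>,
): Promise<string> {
  try {
    const cached = await cache.get(key);
    if (cached !== null) return cached;
  } catch {
    // Cache unavailable: treat as a miss and fall through to the source.
  }

  const fresh = await loadFromSource();

  try {
    await cache.set(key, fresh, ttlSeconds);
  } catch {
    // Best-effort populate; a failed cache write must not fail the request.
  }
  return fresh;
}
```

In production the two `catch` branches would usually also record a metric or feed the circuit breaker, so repeated failures stop the app from calling the cache at all for a while.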

6. Validate (validation loop)
- Load-test with a hot key expiring under concurrency → verify no DB spike (stampede prevented)
- After a write, verify the cache returns fresh data (invalidation works)
- If stale data served → invalidation gap; fix the write→invalidate wiring

Anti-patterns

| ❌ Anti-pattern | ✅ Correct |
| --- | --- |
| Caching before optimizing the query | Optimize first, cache second |
| No TTL (infinite cache, no invalidation plan) | TTL + explicit invalidation strategy |
| All keys same TTL | TTL + jitter to desynchronize expiry |
| Update-cache-on-write (race-prone) | Delete-cache-on-write (cache-aside) |
| No stampede protection on hot keys | Single-flight lock or probabilistic refresh |
| No `maxmemory` / eviction policy (OOM under load) | `maxmemory` + `allkeys-lru` (or chosen policy) |
| Unbounded distinct cache keys (memory leak) | Cap or hash high-cardinality keys |
| 404s hitting the DB on every retry (no negative caching) | Cache "not found" sentinel with a short TTL |
| Auth/session state stored only in cache | Cache is optimization, not source of truth |

Severity tiers

| Tier | Examples | Action SLA |
| --- | --- | --- |
| Critical | Hot key with no stampede protection causing DB overload; cache serving stale auth/permission data | Fix immediately |
| Major | Infinite-TTL cache with no invalidation plan; update-on-write races | Fix this sprint |
| Minor | Uniform TTLs (no jitter); cache key naming inconsistency | Schedule within 2 sprints |

Completion Criteria

- Underlying query optimized before caching
- Cache-aside with delete-on-write applied
- Every key has a TTL + jitter
- Hot keys have stampede protection (verified under load)
- Invalidation mapping documented (which write → which key)

Output

- Cache layer code: cache-aside helpers + invalidation hooks
- Invalidation map: docs/cache-invalidation.md, mapping each write to the keys it invalidates
- Commit format: `perf(cache): add cache-aside for <query>` / `fix(cache): single-flight lock on <hot key>`

Implementation

TypeScript + Redis (default)

- Cache-aside helper around Redis `GET`/`SETEX`/`DEL`
- Stampede: single-flight lock via `SET key val NX PX ttl`, or a Lua script run through ioredis `EVAL` where several commands must execute atomically
- Probabilistic refresh: store `(value, computed_at, ttl)` and recompute when `now - computed_at > ttl * random_threshold`
- Supabase: pair with Postgres; Redis via Upstash or another managed provider
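
The probabilistic refresh rule above can be sketched as a pure decision function. The way `random_threshold` is drawn here (uniformly between a floor of 0.8 and 1) is an assumption, chosen so that the refresh probability rises from zero at 80% of the TTL to certainty at expiry; `CachedEntry` and `shouldRefreshEarly` are hypothetical names.

```typescript
// Shape of the stored tuple (value, computed_at, ttl); times in milliseconds.
interface CachedEntry<T> {
  value: T;
  computedAt: number;
  ttlMs: number;
}

// Returns true when the caller should recompute before expiry.
// Implements: now - computed_at > ttl * random_threshold, with the threshold
// drawn uniformly from [minThreshold, 1). The older the entry, the more
// likely a given request triggers the refresh.
function shouldRefreshEarly<T>(
  entry: CachedEntry<T>,
  now: number,
  minThreshold: number = 0.8,
  random: () => number = Math.random,
): boolean {
  const threshold = minThreshold + (1 - minThreshold) * random();
  return now - entry.computedAt > entry.ttlMs * threshold;
}
```

A read path would serve `entry.value` either way and, when the function returns true, recompute and rewrite the entry, ideally behind a single-flight guard so only one request does the work.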

Other stacks

- Python: redis-py + aiocache; same cache-aside + Lua patterns
- Go: go-redis + the `golang.org/x/sync/singleflight` package for stampede prevention
- Universal: cache-aside, TTL+jitter, and stampede prevention are store-agnostic; Memcached works for simple cases (no Lua, so use `add`-based locks)

Related skills

- query-optimization: cache only after the query itself is optimized
- resilience-patterns: cache as a fallback when a dependency is down
- transaction-management: invalidate cache after the write commits, not before

Reference

Key insight encoded: A popular key expiring under concurrency triggers a stampede (thundering herd); prevent it with a single-flight mutex lock or probabilistic early refresh, both executed atomically via Lua. Delete-on-write (not update) avoids cache/DB races.