memory-system

name: memory-system
displayName: Memory System
description: "Operate and extend the joelclaw agent memory system — observation pipeline, write gates, vector store, retrieval, reflection, and nightly maintenance. Use when working on memory functions, debugging recall, tuning observation quality, or evolving the memory architecture."
version: 0.1.0
author: joel
tags:
  - memory
  - system-bus
  - typesense
  - recall
  - reflection

Memory System

Operational reference for the joelclaw memory pipeline (session signal → durable recall → curated MEMORY.md).

When to use

Use this skill when working on memory capture, write gates, Typesense-backed recall, reflection/promote flows, or nightly maintenance.

Canonical flow

sessions → observe → write-gate → store → decay/rank → retrieve → inject
    ↑                                                              ↓
    └──── nightly maintenance (dedup + stale pruning + stats) ─────┘
                       ↓
               observe → reflect → propose → triage → MEMORY.md

Guiding filter: “will this fact still be true and useful next month?”

1) Write gate states (allow / hold / discard)

State	Persist	Default retrieval	Rules
`allow`	yes	yes	Durable, reusable facts (constraints, architecture truths, operational fixes, explicit user rules).
`hold`	yes	no (unless explicitly requested)	Ambiguous/contextual signal worth keeping but not auto-injecting.
`discard`	no	no	Noise, tool traces, instruction artifacts, ephemeral chatter.

Rules to enforce:

Very short/low-information observations (e.g. <12 chars) → discard.
Instruction-edit artifacts/raw tool traces (<toolCall>, command dumps, “replace X with Y”) → discard.
Facts with ADR IDs, concrete file paths, or explicit runnable commands bias toward allow.
If gate annotation parsing fails, mark write_gate_fallback=true and track drift (high fallback rate is a health smell).
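
The rules above can be sketched as a small deterministic classifier. This is an illustration only, not the code in write-gate.ts: the regex patterns, the `writeGate` name, and the default of `hold` for anything unmatched are assumptions, and the real gate may weigh signals ("bias toward allow") rather than decide on the first match.

```typescript
// Hypothetical sketch of the write-gate rules; names and patterns are assumptions.
type WriteVerdict = "allow" | "hold" | "discard";

interface GateResult {
  verdict: WriteVerdict;
  reason: string;
}

const MIN_CHARS = 12; // from the "<12 chars" rule

// Signals that mark noise: raw tool traces and instruction-edit artifacts.
const ARTIFACT_PATTERNS: RegExp[] = [
  /<toolCall>/i,
  /\breplace\s+.+\s+with\s+.+/i,
];

// Signals that bias toward allow: ADR IDs, file paths, inline commands.
const DURABLE_PATTERNS: RegExp[] = [
  /\bADR-\d{4}\b/,
  /[\w.-]+\/[\w./-]+\.\w+/,
  /`[^`]+`/,
];

function writeGate(observation: string): GateResult {
  const text = observation.trim();
  if (text.length < MIN_CHARS) {
    return { verdict: "discard", reason: "too-short" };
  }
  if (ARTIFACT_PATTERNS.some((p) => p.test(text))) {
    return { verdict: "discard", reason: "artifact" };
  }
  if (DURABLE_PATTERNS.some((p) => p.test(text))) {
    return { verdict: "allow", reason: "durable-signal" };
  }
  // Assumption: everything else is kept but not auto-injected.
  return { verdict: "hold", reason: "ambiguous" };
}
```

Discard checks run before allow checks so that a tool trace containing a file path is still treated as noise.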

2) Vector store schema (Typesense only)

Backend: memory_observations in Typesense. Qdrant is retired (per slog, 2026-02-28).

Minimum operational fields:

Identity/source: id, session_id, source, timestamp, updated_at
Content: observation, observation_type, embedding
Write gate: write_verdict, write_confidence, write_reason, write_gate_version, write_gate_fallback
Taxonomy: category_id, category_confidence, category_source, taxonomy_version
Ranking/lifecycle: merged_count, recall_count, retrieval_priority, stale, stale_tagged_at, last_used_at, superseded_by, supersedes
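
For orientation, the field groups above can be written as a TypeScript shape. The field names come from the list; the value types (numeric timestamps, optional lifecycle fields) and the `isRetrievedByDefault` helper are assumptions, so check the actual Typesense collection schema before relying on them.

```typescript
// Sketch of a memory_observations document. Field names are from this skill;
// the types are assumptions, not the real collection schema.
interface MemoryObservation {
  // Identity/source
  id: string;
  session_id: string;
  source: string;
  timestamp: number; // assumed epoch time
  updated_at: number;
  // Content
  observation: string;
  observation_type: string;
  embedding: number[];
  // Write gate
  write_verdict: "allow" | "hold" | "discard";
  write_confidence: number;
  write_reason: string;
  write_gate_version: string;
  write_gate_fallback: boolean;
  // Taxonomy
  category_id: string;
  category_confidence: number;
  category_source: string;
  taxonomy_version: string;
  // Ranking/lifecycle
  merged_count: number;
  recall_count: number;
  retrieval_priority: number;
  stale: boolean;
  stale_tagged_at?: number;
  last_used_at?: number;
  superseded_by?: string;
  supersedes?: string;
}

// Mirrors the write gate table: allow is retrieved by default,
// hold only on explicit request, discard never (it is not persisted).
function isRetrievedByDefault(
  doc: Pick<MemoryObservation, "write_verdict">,
  includeHold: boolean = false,
): boolean {
  if (doc.write_verdict === "allow") return true;
  return includeHold && doc.write_verdict === "hold";
}
```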

3) Category taxonomy (7 categories)

Use taxonomy v1 IDs:

jc:preferences
jc:rules-conventions
jc:system-architecture
jc:operations
jc:memory-system
jc:projects
jc:people-relationships
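
A minimal sketch of how callers might validate category IDs against this list. The constant and function names are hypothetical; the canonical definitions live in taxonomy-v1.ts.

```typescript
// The seven taxonomy v1 IDs listed above. Names here are illustrative.
const TAXONOMY_V1 = [
  "jc:preferences",
  "jc:rules-conventions",
  "jc:system-architecture",
  "jc:operations",
  "jc:memory-system",
  "jc:projects",
  "jc:people-relationships",
] as const;

type CategoryId = (typeof TAXONOMY_V1)[number];

// Type guard: narrows an arbitrary string to a known category ID.
function isCategoryId(value: string): value is CategoryId {
  return (TAXONOMY_V1 as readonly string[]).includes(value);
}
```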

4) Retrieval pipeline

Query rewrite (fast model, hard timeout, fail-open to original query).
Hybrid search (keyword + vector over memory_observations).
Time decay ranking: final_score = raw_score × exp(-0.01 × days_since_created)
- stale memories get extra downweight.
Cap results before injection (protect context budget).
Budget profile:
- lean: 2–3 hits, no rewrite, low-latency checks
- balanced: 5–7 hits, default interactive mode
- deep: 10–15 hits, complex debugging/research
- auto: choose profile from query complexity/context
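
The decay and cap steps can be sketched as follows. The decay rate comes from the formula above. The stale penalty of 0.5 is an assumption (this skill only says "extra downweight"), the caps use the upper bound of each budget range, and the `Hit` shape and function names are hypothetical.

```typescript
type BudgetProfile = "lean" | "balanced" | "deep";

interface Hit {
  id: string;
  rawScore: number; // hybrid search score before decay
  createdAt: Date;
  stale: boolean;
}

const DECAY_RATE = 0.01; // per day, from the ranking formula
const STALE_PENALTY = 0.5; // assumption: the exact downweight is not specified
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Upper bound of each budget range above.
const MAX_HITS: Record<BudgetProfile, number> = { lean: 3, balanced: 7, deep: 15 };

// final_score = raw_score * exp(-0.01 * days_since_created), with a stale penalty.
function finalScore(hit: Hit, now: Date): number {
  const days = Math.max(0, (now.getTime() - hit.createdAt.getTime()) / MS_PER_DAY);
  const decayed = hit.rawScore * Math.exp(-DECAY_RATE * days);
  return hit.stale ? decayed * STALE_PENALTY : decayed;
}

// Sort by decayed score (descending) and cap to protect the context budget.
function rankAndCap(hits: Hit[], profile: BudgetProfile, now: Date = new Date()): Hit[] {
  return [...hits]
    .sort((a, b) => finalScore(b, now) - finalScore(a, now))
    .slice(0, MAX_HITS[profile]);
}
```

At a rate of 0.01 per day, a score halves roughly every 69 days (ln 2 / 0.01), so a three-month-old memory needs a noticeably higher raw score to outrank a fresh one.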

5) Reflection cycle

observe → reflect → propose → triage (3 tiers) → promote to MEMORY.md

Triage tiers:

Tier 1 auto-action: auto-promote / auto-reject / auto-merge using deterministic rules.
Tier 2 LLM batch review: batch adjudication for undecided proposals.
Tier 3 human review: only ambiguous/risky proposals; then promote/edit/reject.

Goal: keep MEMORY.md small, durable, and high-signal.
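
The tier escalation can be sketched as a router that stops at the first tier able to decide. This is a simplification under stated assumptions: the reviewers are passed in as plain functions, Tier 2 is shown per proposal rather than as a batch, and all names (`triage`, `Reviewer`, `TriageResult`) are hypothetical.

```typescript
type Decision = "promote" | "reject" | "merge";

// A reviewer returns a decision, or undefined to escalate to the next tier.
type Reviewer<P> = (proposal: P) => Decision | undefined;

interface TriageResult {
  tier: 1 | 2 | 3;
  decision: Decision | "needs-human";
}

function triage<P>(
  proposal: P,
  tier1Rules: Reviewer<P>, // deterministic auto-action rules
  tier2Llm: Reviewer<P>, // stand-in for the LLM batch review
): TriageResult {
  const auto = tier1Rules(proposal);
  if (auto !== undefined) return { tier: 1, decision: auto };

  const reviewed = tier2Llm(proposal);
  if (reviewed !== undefined) return { tier: 2, decision: reviewed };

  // Only ambiguous or risky proposals reach a human.
  return { tier: 3, decision: "needs-human" };
}
```

Keeping the deterministic tier first means the cheaper checks absorb most of the volume before any model or human time is spent.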

6) Nightly maintenance

Run idempotent maintenance to keep recall quality high:

Dedup sweep (semantic similarity merge; maintain supersession chain).
Stale pruning (mark old never-recalled observations stale; prune very old stale records conservatively).
Stats emission (observation count, merges, stale volume, category distribution).
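
A minimal sketch of the stale-pruning decision, written as a pure function so repeated runs give the same plan. The two thresholds are hypothetical parameters (this skill does not state the real values), timestamps are assumed to be epoch milliseconds, and the function name is illustrative.

```typescript
interface LifecycleFields {
  id: string;
  timestamp: number; // assumed epoch ms
  recall_count: number;
  stale: boolean;
  stale_tagged_at?: number; // assumed epoch ms
}

interface SweepOptions {
  staleAfterDays: number; // hypothetical: age before a never-recalled item is tagged
  pruneAfterStaleDays: number; // hypothetical: time spent stale before pruning
}

type SweepAction = "keep" | "tag-stale" | "prune";

const DAY_MS = 24 * 60 * 60 * 1000;

function planStaleSweep(doc: LifecycleFields, now: number, opts: SweepOptions): SweepAction {
  if (doc.stale) {
    // Conservative: without a tag time we cannot prove the record is old enough.
    if (doc.stale_tagged_at === undefined) return "keep";
    const staleDays = (now - doc.stale_tagged_at) / DAY_MS;
    return staleDays >= opts.pruneAfterStaleDays ? "prune" : "keep";
  }
  const ageDays = (now - doc.timestamp) / DAY_MS;
  return doc.recall_count === 0 && ageDays >= opts.staleAfterDays ? "tag-stale" : "keep";
}
```

Because tagging sets `stale` and `stale_tagged_at`, a second run on the same night returns `keep` for a record that was just tagged, which keeps the sweep idempotent.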

7) ADR map (read before changing architecture)

ADR-0021 — memory system foundation.
ADR-0068 — auto-triage pipeline.
ADR-0077 — next phase (reflection + maintenance).
ADR-0082 — Typesense as memory backend (Qdrant replaced).
ADR-0094 → ADR-0100 — proposed evolution (write gate governance, taxonomy/budgets, forward triggers, graph/dual-search roadmap).

8) Writing observations (mandatory at session end)

Every session that produces a durable pattern, operational fix, or architectural insight MUST write observations before closing.

joelclaw send "memory/observation.submitted" -d '{
  "observation": "<what was learned — concrete, reusable, future-tense useful>",
  "category": "jc:operations",
  "source": "pi-session",
  "tags": ["stripe", "payout"]
}'

Use one send call per distinct observation. To record several observations, fire the sends in a loop.
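
One way to prepare a loop of sends from a script is sketched below. The event name, the `-d` flag, and the payload keys are taken from the command above; the `ObservationPayload` type, the `buildSendArgs` helper, and the second example observation are illustrative.

```typescript
interface ObservationPayload {
  observation: string;
  category: string;
  source: string;
  tags: string[];
}

// Build the argument list for one `joelclaw send` call.
function buildSendArgs(payload: ObservationPayload): string[] {
  return ["send", "memory/observation.submitted", "-d", JSON.stringify(payload)];
}

const observations: ObservationPayload[] = [
  {
    observation:
      "Stripe Report Run requires explicit payment_metadata[product] column in columns param or it returns blank",
    category: "jc:operations",
    source: "pi-session",
    tags: ["stripe", "payout"],
  },
  {
    observation: "memory_observations lives in Typesense; Qdrant is retired (ADR-0082)",
    category: "jc:memory-system",
    source: "pi-session",
    tags: ["typesense"],
  },
];

// One argument list per distinct observation.
const commands: string[][] = observations.map(buildSendArgs);
```

Each argument list can then be handed to a process spawner, for example Node's `execFileSync("joelclaw", args)`. Passing arguments as an array avoids shell quoting problems with the JSON payload.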

Category cheatsheet

Category	Use for
`jc:operations`	How things work, API quirks, CLI patterns, operational fixes
`jc:rules-conventions`	Conventions, SOPs, team/project rules
`jc:system-architecture`	Topology, wiring, how components connect
`jc:projects`	Per-project facts, payout rates, product catalogs
`jc:preferences`	Joel's explicit preferences
`jc:people-relationships`	People, contacts, roles
`jc:memory-system`	Memory system itself

What makes a good observation

Concrete: "Stripe Report Run requires explicit payment_metadata[product] column in columns param or it returns blank" — not "Stripe has metadata"
Reusable: will this still be true next month?
Actionable: an agent reading this cold should know what to do differently
Not a transcript: no "the user asked me to", no raw tool output, no "I discovered that"

What to skip

Instruction artifacts, tool traces, ephemeral command outputs
Facts already in skills or ADRs (skills are the durable home; observations are for recall/search)
Anything under 12 chars — the write gate discards it

9) Operations commands (slog + recall)

# Record memory-system changes
slog write --action configure --tool memory-system --detail "<what changed>" --reason "<why>"

# Inspect recent operational history
slog tail --count 50

# Default recall
joelclaw recall "<query>" --budget balanced --limit 7

# Deep recall for hard debugging
joelclaw recall "<query>" --budget deep --limit 10

# Inspect memory-system category specifically
joelclaw recall "<query>" --category jc:memory-system --limit 10

# Include held memories when needed
joelclaw recall "<query>" --include-hold --raw

9a) `joelclaw inngest memory-health` interpretation

memory-health evaluates recent (current-window) observations whenever any were written during the lookback period. Do not fail the system on a tiny current-window category-confidence sample: fewer than 25 known-confidence observations are reported as evidence but treated as too small to gate on. RUNNING memory runs that started before the current worker did are stale SDK ghosts, not live backlog.
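
The two interpretation rules can be sketched as small checks. The minimum sample of 25 comes from the text; the `lowThreshold` and `maxLowShare` parameters, the status names, and the run shape are assumptions, not the actual memory-health implementation.

```typescript
const MIN_CONFIDENCE_SAMPLE = 25; // below this, report but do not gate

type GateStatus = "pass" | "fail" | "insufficient-sample";

interface ConfidenceCheck {
  status: GateStatus;
  sampleSize: number;
  lowConfidenceShare: number; // still reported as evidence for small samples
}

// knownConfidences: category_confidence values from the current window,
// with unknown-confidence observations already excluded.
function checkCategoryConfidence(
  knownConfidences: number[],
  lowThreshold: number, // hypothetical: what counts as low confidence
  maxLowShare: number, // hypothetical: tolerated share of low-confidence items
): ConfidenceCheck {
  const sampleSize = knownConfidences.length;
  const low = knownConfidences.filter((c) => c < lowThreshold).length;
  const lowConfidenceShare = sampleSize === 0 ? 0 : low / sampleSize;
  if (sampleSize < MIN_CONFIDENCE_SAMPLE) {
    return { status: "insufficient-sample", sampleSize, lowConfidenceShare };
  }
  const status: GateStatus = lowConfidenceShare > maxLowShare ? "fail" : "pass";
  return { status, sampleSize, lowConfidenceShare };
}

// A RUNNING run that predates the current worker cannot be live work.
function isGhostRun(run: { status: string; startedAt: number }, workerStartedAt: number): boolean {
  return run.status === "RUNNING" && run.startedAt < workerStartedAt;
}
```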

9b) Runs archive derived-index recovery (ADR-0243)

Raw Pi/Claude/Codex Run capture and searchable Typesense indexes are separate layers:

source of truth: ~/.joelclaw/runs-dev/<user>/<yyyy-mm>/<run-id>.jsonl plus .metadata.json
derived indexes: runs_dev, run_chunks_dev
Inngest function: memory/run.captured

When Typesense is stale but raw blobs still arrive:

Verify raw capture and latest Typesense timestamp separately.
Check Inngest queue health. joelclaw runs --count 5 --hours 1 --compact should return quickly.
If Inngest GQL hangs and logs show stale queue leases, log first, restart inngest-0, wait for it to become Ready, then force a worker sync:

kubectl -n joelclaw delete pod inngest-0 --grace-period=30
kubectl -n joelclaw wait --for=condition=Ready pod/inngest-0 --timeout=180s
curl -fsS -X PUT http://127.0.0.1:3111/api/inngest

Backfill missing raw blobs into Typesense with the controlled script, not by flooding Inngest with thousands of replay events:

TYPESENSE_API_KEY=$(secrets lease typesense_api_key) \
  bun scripts/backfill-run-typesense.ts \
  --since <iso-or-ms> \
  --machine dark-wizard \
  --runtime pi \
  --limit 0 \
  --sleep-ms 250

10) Canonical code paths

Observe: packages/system-bus/src/inngest/functions/observe.ts
Write gate: packages/system-bus/src/memory/write-gate.ts
Taxonomy: packages/system-bus/src/memory/taxonomy-v1.ts
Recall adapter: packages/cli/src/capabilities/adapters/typesense-recall.ts
Reflect/propose: packages/system-bus/src/inngest/functions/reflect.ts, promote.ts
Nightly maintenance: packages/system-bus/src/inngest/functions/memory/nightly-maintenance.ts

11) Non-negotiables

Memory stores patterns, not transcript noise.
No silent failure paths: emit telemetry on every transition.
Keep retrieval bounded; never flood context windows.
Keep MEMORY.md curated; do not auto-append raw observations.