zo-memory-system - SKILL.md Agent Skill

name: zo-memory-system description: Hybrid SQLite + Vector persona memory system for Zo Computer. Episodic memory with temporal queries, graph-boosted search, BFS path finding, knowledge gap analysis, auto-capture pipeline. Gives personas persistent memory with semantic search (text-embedding-3-small), HyDE query expansion (gpt-4o-mini), LLM-powered memory gate, 5-tier decay, swarm integration, context budget tracking, cross-persona sharing, conflict resolution, and multi-hop retrieval. Requires OpenAI API key. compatibility: Created for Zo Computer. Requires Bun and OPENAI_API_KEY. metadata: author: marlandoj.zo.computer updated: 2026-03-29 version: 4.0.0

Zo Memory System Skill v4.0.0

Give your Zo personas persistent memory with semantic understanding, graph intelligence, and automatic fact capture.

v4.0 Updates (Q2-Q4 2026): Context budget awareness (token tracking + proactive checkpointing), recursive episode summarization (LLM-powered FIFO compression), metrics dashboard (latency/recall/capture/gate stats), iterative multi-hop retrieval (confidence-based BFS with query refinement), cross-persona memory sharing (pools + inheritance hierarchy), conflict resolution (semantic/temporal detection + provenance tracking), enhanced knowledge graph (typed relations, cycle detection, DOT export, co-occurrence inference), embedding model benchmarking (3-model comparison + recall measurement).

v3.2 Updates: Procedural memory (versioned workflow patterns with LLM-powered evolution), cognitive profiles (executor failure patterns + entity affinities), orchestrator integration (6-signal composite routing with procedure + temporal scoring, auto-episode creation after swarm runs)

v3.1 Updates: Episodic memory (event-based "what happened" with outcomes), temporal queries (since/until filtering), velocity trends, DB migration system, auto-capture episode hook

v3.0 Updates: Graph-boosted hybrid search (graph-boost.ts), BFS path finding & knowledge gap analysis (graph.ts), auto-capture pipeline for conversation-to-fact extraction (auto-capture.ts), co-capture linking, contradiction detection with supersession

What You Get

Automatic continuation recall — Detects continuation-like messages and silently blends relevant context from the last 14 days
First-class open loops — Stores unfinished tasks, unresolved bugs/incidents, pending approvals, and next-step commitments as queryable records
Searchable session summaries — Episodes are indexed for continuation retrieval, not just listed by time
Procedural memory — Versioned workflow patterns with step sequences, success/failure tracking, and LLM-powered evolution
Cognitive profiles — Extended executor history with episode linkage, failure pattern classification, and entity affinity scores
6-signal routing — Orchestrator composite router enhanced with procedure + temporal scoring for memory-enriched task routing
Episodic memory — Event-based "what happened" records with outcomes, entity tagging, and duration tracking
Temporal queries — Filter episodes by time ("7 days ago", "last week", ISO dates) and outcome
Velocity trends — Track success/failure rates per entity over time (day/week/month granularity)
DB migration system — Safe, idempotent schema migrations with rollback support
Hybrid search — BM25 (FTS5) + vector similarity with RRF fusion
Graph-boosted scoring — Linked facts boost each other in search results
Semantic understanding — Finds facts even with paraphrased queries
HyDE expansion — gpt-4o-mini query rewriting for vague searches (parallelized)
BFS path finding — Find shortest connection between any two entities
Knowledge gap analysis — Identify orphan facts, dead ends, weak links, and clusters
Auto-capture — Extract facts from conversation transcripts automatically
Conversation capture — Scan all workspace artifacts and extract facts from every conversation (not just swarm)
Scheduled capture agent — Daily agent that runs conversation-capture and emails a maintenance report
Contradiction detection — New facts that conflict with existing ones create supersession links
5-tier adaptive decay — Automatic promotion/demotion based on access patterns
Vector embeddings — text-embedding-3-small (1536d) via OpenAI
Per-persona memory files — Critical facts always loaded with the persona
Shared memory database — Cross-persona facts with vector index
Associative routing — Graph links between related facts (link/unlink/show commands)
Memory consolidation — Automatic deduplication and merging of related facts
Swarm integration — Token-optimized memory for multi-agent workflows
Memory gate — LLM-powered relevance filter (memory-gate.ts) that classifies messages before searching
Always-on injection — Zo rule integration for automatic context injection on every message
Gate-filtered savings — 40-60% of messages filtered as not needing memory (zero extra tokens)
Health checks — OpenAI API key validation and model-client connectivity at startup
Fetch timeouts — Timeouts on all model-client calls (prevents indefinite hangs)
Scheduled maintenance — Hourly prune/decay automation
Checkpoint system — Save/restore task state
Graceful fallback — Works without OPENAI_API_KEY (FTS5 only, no LLM workloads)

Prerequisites

# Set your OpenAI API key
export OPENAI_API_KEY=sk-...
# Or add to /root/.zo_secrets for persistence

Model Selection

All LLM workloads use OpenAI via model-client.ts. No local model installation required.

Model	Purpose	Notes
`gpt-4o-mini`	Gate, extraction, summarization, HyDE, briefing	Default for all LLM workloads
`text-embedding-3-small`	Vector embeddings (1536d)	Default embedding model

Override any workload model via env vars (see DEPLOYMENT.md).

Performance

Mode	Latency	Use Case
FTS + Vectors only	~0.5s	Specific queries with exact keywords
With HyDE	~1-2s	Vague/conceptual queries (e.g., "that thing about data safety")
Graph-boosted	+~5ms	Adds graph scoring to any search (negligible overhead)
Auto-capture	~3-5s	Per-conversation extraction (post-conversation, not per-turn)

HyDE Trade-off: Adds ~3.5s but dramatically improves recall for vague queries (1 result → 6 results in testing).

Quick Start

# Initialize the database
cd /home/workspace/Skills/zo-memory-system
bun scripts/memory.ts init

# Add memory to a persona
bun scripts/add-persona.sh "my-persona" "Role description"

# Store a fact
bun scripts/memory.ts store \
  --entity "user" \
  --key "preference" \
  --value "value" \
  --decay permanent

# Search with semantic understanding + graph boost
bun scripts/memory.ts hybrid "why did we choose the database"

# Manage knowledge graph
bun scripts/graph.ts link --source <id1> --target <id2> --relation "depends_on"
bun scripts/graph.ts knowledge-gaps

# Auto-capture facts from a conversation
bun scripts/auto-capture.ts --input conversation.md --dry-run

Commands

Store Facts

bun scripts/memory.ts store \
  --persona shared \
  --entity "user" \
  --key "name" \
  --value "Alice" \
  --decay permanent \
  --category preference

Search Memory

Continuation Recall (facts + episodes + open loops):

bun scripts/memory.ts continuation "where did we leave off on the health dashboard?"
bun scripts/memory.ts open-loops --status open
bun scripts/memory.ts resolve-loop "The dashboard issue is fixed now"

Hybrid Search (semantic + exact + graph boost):

bun scripts/memory.ts hybrid "database decision rationale"
bun scripts/memory.ts hybrid "why did we pick SQLite" --no-hyde  # Skip HyDE for speed

Hybrid search now includes graph-boosted scoring. When facts are linked via fact_links, linked results boost each other. Scoring weights:

Signal	Weight	Notes
RRF (FTS + Vector)	0.60	Primary signal
Graph Boost	0.15	Reward associative proximity
Freshness	0.15	Recency bonus
Confidence	0.10	Trust signal

When no graph links exist for any result, original weights (0.70/0.20/0.10) are preserved.

Fast Exact Search (FTS5 only):

bun scripts/memory.ts search "router password"

Lookup by entity:

bun scripts/memory.ts lookup --entity "user"
bun scripts/memory.ts lookup --entity "user" --key "name"

Maintenance

# View statistics (shows embeddings count, model config)
bun scripts/memory.ts stats

# Check memory system health (OpenAI API key, DB connectivity)
bun scripts/memory.ts health

# Backfill embeddings for all facts
bun scripts/memory.ts index

# Prune expired facts and orphaned embeddings
bun scripts/memory.ts prune

# Apply adaptive decay (demotes stale, promotes frequently-accessed)
bun scripts/memory.ts decay

# Consolidate duplicate facts (merge same entity+key entries)
bun scripts/memory.ts consolidate

Knowledge Graph (graph.ts)

The graph CLI manages associative links between facts and provides graph analysis tools.

Link Management

# Create a link between two facts
bun scripts/graph.ts link --source <id1> --target <id2> --relation "depends_on" --weight 0.8

# Remove a link
bun scripts/graph.ts unlink --source <id1> --target <id2>
bun scripts/graph.ts unlink --source <id1> --target <id2> --relation "depends_on"

# Show all links for an entity or fact
bun scripts/graph.ts show --entity "project.ffb-site"
bun scripts/graph.ts show --id <fact-id>

Find Connections (BFS Path Finding)

Find the shortest path between two entities through the knowledge graph:

bun scripts/graph.ts find-connections --from "project.ffb-site" --to "system.zo"
bun scripts/graph.ts find-connections --from "user" --to "decision.hosting" --max-depth 3

Output:

Path found (2 hops):

  [project.ffb-site.name] "FFB website redesign project"
    -->depends_on
  [decision.hosting] "Decided to use Zo hosting"
    -->related
  [system.zo.infrastructure] "Zo Computer infrastructure setup"

Knowledge Gaps Analysis

Scan all facts for maintenance opportunities:

bun scripts/graph.ts knowledge-gaps

Output:

Knowledge Gap Analysis
======================
Total facts: 73
Linked facts: 14 (19.2%)
Orphan facts: 59 (80.8%)

Dead ends (targets only, never source):
  - [decision.hosting] "Decided to use Zo hosting" (2 inbound, 0 outbound)

Weakly linked (only 1 connection):
  - [project.ffb-api] "FFB API integration" (1 link)

Connected components: 3
  Cluster 1 (8 facts): hub = [project.ffb-site] (5 links)
  Cluster 2 (4 facts): hub = [system.zo-infrastructure] (3 links)
  Cluster 3 (2 facts): hub = [decision.memory-cli] (1 link)

Suggested: 12 orphan facts share entities with linked facts and could be connected.

Auto-Capture (auto-capture.ts)

Extract structured facts from conversation transcripts automatically.

How it works

Transcript is sent to gpt-4o-mini for structured extraction
Each candidate fact is quality-filtered (confidence >= 0.6, value >= 10 chars)
Dedup check against existing facts (hash-based exact match)
Contradiction detection: same entity+key with different value creates a supersedes link
Passing facts are stored with embeddings and auto-linked as co-captured

Usage

# Preview extraction (dry-run, no storage)
bun scripts/auto-capture.ts --input conversation.md --dry-run

# Extract and store
bun scripts/auto-capture.ts --input conversation.md --source "chat:2026-03-04"

# Pipe from swarm output
cat swarm-output.md | bun scripts/auto-capture.ts --source "swarm:ffb"

# View capture history
bun scripts/auto-capture.ts stats

Options

Flag	Description	Default
`--input <file>`	Transcript file path	stdin
`--source <label>`	Source label for audit trail	"cli"
`--persona <name>`	Persona to store facts under	"shared"
`--dry-run`	Show extraction without storing	false
`--model <name>`	Override extraction model	openai:gpt-4o-mini

Quality Safeguards

Confidence threshold: Facts below 0.6 confidence are discarded
Value length minimum: Facts with values shorter than 10 chars are discarded
Max 20 facts per capture (prevents runaway extraction from long transcripts)
Transcript hash prevents re-processing the same content
Dry-run mode for testing extraction quality before committing
Source tagging for audit trail (source: "auto-capture:{label}")

Contradiction Handling

When auto-capture finds an existing fact with the same entity+key but a different value:

New fact is stored normally
A supersedes link is created: new_fact → old_fact
Old fact's confidence is halved (soft deprecation, not deletion)

Co-Capture Linking

Facts extracted from the same conversation are automatically linked with relation: "co-captured" and weight: 0.5. This seeds the knowledge graph with organic connections.

Conversation Capture (conversation-capture.ts)

Extends memory capture beyond swarm tasks to all conversations. Scans workspace artifact directories (/home/.z/workspaces/con_*/) for reports, analysis outputs, and tool results, then runs them through the auto-capture extraction pipeline.

# List all capturable artifacts
bun scripts/conversation-capture.ts --list

# Process last 24h (safe default)
bun scripts/conversation-capture.ts

# Only last 24 hours
bun scripts/conversation-capture.ts --since 24h

# Explicit full backlog sweep
bun scripts/conversation-capture.ts --all

# Preview without storing
bun scripts/conversation-capture.ts --dry-run

# Show capture statistics
bun scripts/conversation-capture.ts --stats

Options

Flag	Description	Default
`--since <duration>`	Filter by recency: 1h, 24h, 7d, 30d, 1w, 1m	24h
`--all`	Process all uncaptured artifacts across all conversations	false
`--dry-run`	Show extraction without storing	false
`--list`	List capturable files with status	—
`--stats`	Show conversation capture statistics	—

Filters

Skips files < 200 bytes or > 500 KB
Skips read_webpage/, browser_agent/, node_modules/, venv/ directories
Only processes .md, .txt, .json files
Hash-based dedup via capture_log table (never re-processes same content)
Creates episodes for each capture session
Bare invocation is intentionally bounded to the last 24 hours; use --all only for an explicit backlog sweep

Scheduled Agent

A daily agent runs at 04:00 AM Phoenix time, executing conversation-capture for the last 24 hours and emailing a maintenance report with fact counts, DB stats, and any errors.

Episodic Memory (v3.1)

Episodic memory captures "what happened" as events with outcomes — complementing facts (what is true) with episodes (what occurred).

Database Migration

Before using episodic memory, run the migration to add the new tables:

bun scripts/memory.ts migrate

This is idempotent (safe to run multiple times) and preserves all existing data.

Querying Episodes

# List all episodes
bun scripts/memory.ts episodes

# Filter by time
bun scripts/memory.ts episodes --since "7 days ago"
bun scripts/memory.ts episodes --since "2026-03-01" --until "2026-03-07"
bun scripts/memory.ts episodes --since "last week"

# Filter by outcome
bun scripts/memory.ts episodes --outcome failure
bun scripts/memory.ts episodes --outcome success --since "30 days ago"

# Filter by entity
bun scripts/memory.ts episodes --entity "executor.claude-code"
bun scripts/memory.ts episodes --entity "project.ffb-site" --since "7 days ago"

Velocity Trends

Track success/failure rates over time for any entity:

# Weekly trends (default)
bun scripts/memory.ts trends --entity "executor.claude-code"

# Daily granularity
bun scripts/memory.ts trends --entity "project.ffb-site" --granularity day

# Monthly, last 6 months
bun scripts/memory.ts trends --entity "swarm.ffb" --granularity month --since "180 days ago"

Output:

Velocity trends for "executor.claude-code" (by week):

  Period        Total  OK  Fail  Rate
  ─────────────────────────────────────
  2026-W06          1   1     0  100%
  2026-W09          1   0     0  0%

Time Formats

The temporal query system accepts:

Relative: "7 days ago", "30 minutes ago", "2 weeks ago", "3 months ago"
Named: "today", "yesterday", "last week", "last month", "last year"
ISO dates: "2026-03-01"
Unix timestamps: raw seconds

Auto-Capture Episode Hook

When auto-capture stores facts from a conversation, it automatically creates an episode recording the capture event. This means every auto-capture run becomes a queryable event in your episodic timeline.

Procedural Memory (v3.2)

Procedural memory captures reusable workflow patterns as versioned step sequences. Procedures track success/failure rates and can self-evolve via LLM (gpt-4o-mini) when failures accumulate.

Managing Procedures

# List all procedures (shows latest version per name)
bun scripts/memory.ts procedures --list

# Show steps for a specific procedure
bun scripts/memory.ts procedures --show "site-review"

# Record feedback after running a procedure
bun scripts/memory.ts procedures --feedback <procedure-id> --success
bun scripts/memory.ts procedures --feedback <procedure-id> --failure

# Force LLM-powered evolution based on failure episodes
bun scripts/memory.ts procedures --evolve "site-review"

Procedure Evolution

When a procedure accumulates failures, --evolve uses OpenAI (gpt-4o-mini) to analyze linked failure episodes and suggest improved steps. The evolved procedure:

Gets a new version number (v1 -> v2)
Links to its parent via evolved_from
Starts with fresh success/failure counts
May adjust timeouts, add fallback executors, or reorder steps

Cognitive Profiles

Executor history (~/.swarm/executor-history.json) is extended with cognitive fields:

recent_episode_ids — Last 10 episode IDs for "why did this happen?" queries
failure_patterns — Auto-classified error types (timeout, mutation_failed, file_not_found, permission_denied)
entity_affinities — Per-entity success rates as exponential moving averages

These fields are backward compatible — old history files parse correctly without them.

Orchestrator Integration (v4.5)

The swarm orchestrator (orchestrate-v4.ts) now:

Creates episodes — Every swarm run auto-creates an episode with outcome, duration, executor list, and task tags
6-signal composite routing — Adds procedureScore (from procedure success rates) and temporalScore (from recent episodic performance) to the existing 4-signal composite router
Cognitive data on outcomes — Success/failure calls pass error types and entity affinities to executor history

MCP Server (v3.3.1)

Expose the memory system as MCP tools for Claude Desktop, Cursor, Gemini, Codex, and other MCP-compatible clients.

Transport Modes

Mode	Script	Use Case
HTTP (Streamable HTTP)	`mcp-server-http.ts`	Network access — hosted service, multi-agent, all personas
Stdio	`mcp-server.ts`	Direct process spawn — Claude Desktop, single-client

HTTP Server (Recommended)

The HTTP MCP server runs as a Zo hosted service with auto-restart and HTTPS.

# Start locally (for testing)
PORT=48400 bun scripts/mcp-server-http.ts

# Health check
curl http://localhost:48400/health

# MCP endpoint
POST http://localhost:48400/mcp

Hosted service: https://zo-memory-mcp-marlandoj.zocomputer.io

Service ID: svc_PXkgBzRdH8M
Local port: 48400
Health: /health
MCP: /mcp (Streamable HTTP transport)

Optional auth: Set ZO_MEMORY_MCP_TOKEN env var to require Bearer token auth.

Stdio Server

bun scripts/mcp-server.ts          # Direct
bun scripts/memory.ts mcp          # Via CLI

Available Tools

Tool	Description
`memory_search`	Hybrid FTS + vector search with RRF fusion
`memory_store`	Store a new fact with auto-generated embedding
`memory_episodes`	Query episodic memory with temporal/entity filters
`memory_procedures`	List or show workflow procedures
`cognitive_profile`	Show executor cognitive profile (history, patterns, affinities)

Client Configuration

Claude Code (~/.claude.json or project .mcp.json):

{
  "mcpServers": {
    "zo-memory": {
      "type": "http",
      "url": "http://localhost:48400/mcp"
    }
  }
}

Claude Desktop (stdio mode, claude_desktop_config.json):

{
  "mcpServers": {
    "zo-memory": {
      "command": "bun",
      "args": ["/home/workspace/Skills/zo-memory-system/scripts/mcp-server.ts"]
    }
  }
}

Gemini (~/.gemini/settings.json):

{
  "mcpServers": {
    "zo-memory": {
      "url": "http://localhost:48400/mcp",
      "type": "http",
      "trust": true
    }
  }
}

Codex (~/.codex/config.toml):

[mcp_servers.zo-memory]
url = "http://localhost:48400/mcp"
startup_timeout_sec = 10

Swarm bridge: The Claude Code bridge script (claude-code-bridge.sh) pre-approves all mcp__zo-memory__* tools via --allowedTools.

Import Pipeline (v3.2)

Import facts from external sources into the memory system.

Supported Sources

Source	Format	Extracts
`chatgpt`	JSON export	Conversation summaries, key decisions
`obsidian`	Markdown vault	Notes by section, respects frontmatter
`markdown`	Any `.md` file	Section-based facts from headings

Usage

# Preview (no changes)
bun scripts/memory.ts import --source markdown --path ~/notes/decisions.md --dry-run

# Import ChatGPT export
bun scripts/memory.ts import --source chatgpt --path ~/chatgpt-export.json

# Import Obsidian vault
bun scripts/memory.ts import --source obsidian --path ~/Vault

# Import single markdown
bun scripts/memory.ts import --source markdown --path ~/notes/architecture.md

Features:

Duplicate detection (skips already-imported facts)
Auto-generated embeddings for semantic search
Section-based splitting (one fact per heading)
Obsidian frontmatter support (entity, category fields)

Memory Gate (Always-On Context Injection)

The memory gate (scripts/memory-gate.ts) uses OpenAI (gpt-4o-mini) to classify incoming messages and decide whether memory context should be injected. It now prioritizes likely continuation-style turns first, using blended continuation recall before falling back to the standard memory search path.

How it works

User sends a message
Continuation detector checks for likely follow-on work
If continuation-like: blended recall searches facts + episodes + open loops from the last 14 days
Otherwise the LLM classifier (gpt-4o-mini) classifies whether stored memory is needed
If no: message passes through with zero overhead
If yes: gate extracts keywords, runs hybrid search, injects results as context

Usage

# Direct invocation
bun scripts/memory-gate.ts "what did we decide about FFB pricing?"
# Exit 0 + results printed

bun scripts/memory-gate.ts "hello"
# Exit 2 (no memory needed, no output)

bun scripts/memory-gate.ts "update the supplier scorecard"
# Exit 0 + results printed (or exit 3 if no results found)

Exit codes

Code	Meaning
0	Memory results found and printed
1	Error (API failure, parse failure)
2	No memory needed (greeting, general knowledge, self-contained request)
3	Memory needed but no results found

Zo rule integration

Create an always-applied Zo rule that runs the gate on every message:

Run: bun /home/workspace/Skills/zo-memory-system/scripts/memory-gate.ts "<message>"
- Exit 0: inject stdout as background context
- Exit 2: skip (no memory needed)
- Exit 3: skip (no results)

Configuration

export ZO_MODEL_GATE="openai:gpt-4o-mini"  # Default gate model
export OPENAI_API_KEY=sk-...

No warm-up or keep-alive is needed — OpenAI API calls have no cold-start penalty.

Performance in multi-agent swarms

The gate is what makes this memory system viable for swarm workflows. Without gating, always-on injection consumes 34-120% of an 8K context budget across 11 tasks. With gating, only ~4 of 11 tasks get memory injection, keeping consumption at 2.5-6.9%.

Metric	With gate	Without gate (always-on)
Tasks injected	~4 of 11	All 11
Memory tokens per swarm run	800-2,200	12,100-38,500
% of 8K context budget	2.5-6.9%	34-120%

Architecture

.zo/memory/
├── shared-facts.db          # SQLite database
│   ├── facts                # Core facts table
│   ├── facts_fts            # FTS5 virtual table
│   ├── fact_embeddings      # Vector embeddings (1536d, text-embedding-3-small)
│   ├── fact_links           # Associative routing graph
│   ├── capture_log          # Auto-capture history
│   ├── episodes             # Episodic memory (v3.1)
│   ├── episode_entities     # Episode-entity junction (v3.1)
│   ├── procedures           # Procedural memory (v3.1, Phase 2)
│   ├── procedure_episodes   # Procedure-episode junction (v3.1, Phase 2)
│   └── embedding_cache      # Content hash cache
├── personas/
│   ├── [persona-1].md       # Critical facts per persona
│   └── [persona-2].md
├── checkpoints/
│   └── [timestamp].json     # Saved states
└── scripts/
    ├── memory.ts            # Main CLI with episodic memory + graph-boosted search (v3.1)
    ├── memory-gate.ts       # LLM-powered relevance gate (v2.3)
    ├── graph.ts             # Knowledge graph CLI (v3.0)
    ├── graph-boost.ts       # Graph scoring module (v3.0)
    ├── auto-capture.ts      # Conversation-to-fact extraction + episode hook (v3.1)
    ├── conversation-capture.ts  # Workspace artifact scanner for all conversations (v3.3)
    ├── migrate-v2.sql       # Schema migration for episodic/procedural memory (v3.1)
    ├── rollback-v2.sql      # Migration rollback script (v3.1)
    ├── add-persona.sh       # Persona setup helper
    ├── schema.sql           # Database schema
    ├── test-graph.ts        # Graph integration tests
    └── test-capture.ts      # Auto-capture integration tests

Search Flow (v3.0 — Graph-Boosted)

Query → ┌─────────────────────────────────────┐
        │  Parallel Execution                 │
        │  ├── HyDE Expansion (gpt-4o-mini)   │
        │  ├── Query Embedding (text-emb-3s) │
        │  └── FTS5 Search (BM25)             │
        └─────────────────────────────────────┘
                        ↓
              RRF Fusion (base scores)
                        ↓
              Graph Boost (fact_links)
                        ↓
              Neighbor Injection (linked facts)
                        ↓
              Ranked Results

Configuration

Environment variables (optional):

export OPENAI_API_KEY="sk-..."                  # Required
export ZO_MODEL_GATE="openai:gpt-4o-mini"       # Default gate model
export ZO_MODEL_EXTRACTION="openai:gpt-4o-mini" # Default extraction model
export ZO_MODEL_HYDE="openai:gpt-4o-mini"       # Default HyDE model
export ZO_MODEL_EMBEDDING="openai:text-embedding-3-small"  # Default embedding
export ZO_HYDE_DEFAULT="true"                   # Default: use HyDE
export ZO_MEMORY_DB="/path/to/shared-facts.db"  # Default: .zo/memory/

Testing

# Run graph integration tests
bun scripts/test-graph.ts

# Run auto-capture integration tests
bun scripts/test-capture.ts

Version History

Version	Date	Changes
4.0.0	2026-03-29	Context budget awareness (token tracking + proactive checkpointing), recursive episode summarization (LLM-powered FIFO compression), metrics dashboard (latency/recall/capture/gate stats), iterative multi-hop retrieval (confidence-based BFS with query refinement), cross-persona memory sharing (pools + inheritance hierarchy), conflict resolution (semantic/temporal detection + provenance tracking), enhanced knowledge graph (typed relations, cycle detection, DOT export, co-occurrence inference), embedding model benchmarking (3-model comparison + recall measurement).
3.3.1	2026-03-07	HTTP MCP server (`mcp-server-http.ts`) — Streamable HTTP transport via Bun.serve(), registered as Zo hosted service (`svc_PXkgBzRdH8M`), HTTPS at `zo-memory-mcp-marlandoj.zocomputer.io`, configured in Claude Code / Gemini / Codex / workspace .mcp.json, bridge script updated with zo-memory tool permissions
3.3.0	2026-03-07	Conversation capture (conversation-capture.ts) — scan all workspace artifacts for fact extraction, not just swarm. Scheduled daily agent for automatic capture + maintenance reports. Memory Manager persona with full CLI access.
3.2.0	2026-03-07	Procedural memory (versioned workflow storage, CRUD, feedback, LLM evolution), cognitive profiles (episode IDs, failure patterns, entity affinities in executor-history.json), orchestrator v4.5 integration (auto-episode on swarm completion, 6-signal composite routing with procedure + temporal scores), MCP server (5 tools: search, store, episodes, procedures, cognitive_profile), import pipeline (ChatGPT, Obsidian, markdown), enhanced CLI (profile, import, mcp commands)
3.1.0	2026-03-07	Episodic memory (episodes table, entity tagging, temporal queries), velocity trends, DB migration system (migrate/rollback), auto-capture episode hook, stats v3 with episode/procedure counts
3.0.0	2026-03-04	Graph-boosted hybrid search (graph-boost.ts), BFS path finding & knowledge gap analysis (graph.ts), auto-capture pipeline (auto-capture.ts), co-capture linking, contradiction detection, scoring redistribution (RRF 0.60 + Graph 0.15 + Freshness 0.15 + Confidence 0.10)
2.3.0	2026-03-03	Memory gate (memory-gate.ts), always-on context injection via Zo rules, gate-filtered token savings for swarm workflows
2.2.0	2026-02-27	Health checks, fetch timeouts, prune/decay/consolidate/link/graph commands, vector pre-filtering, adaptive decay, associative routing, PRAGMA busy_timeout
2.1.0	2026-02-22	Parallelized HyDE/FTS/embedding execution, performance docs
2.0.0	2026-02-19	Hybrid SQLite + Vector search, HyDE query expansion, semantic retrieval, nomic-embed-text via Ollama (pre-migration baseline)
1.1.0	2026-02-18	Added swarm v4 integration documentation
1.0.0	2026-02-08	Initial release - SQLite persona memory, 5-tier decay, FTS5 search

Related Skills

zo-swarm-orchestrator — Multi-agent orchestration with token optimization

Design References

references/graphthulhu-concepts.md — Design notes on GraphThulhu adaptations
references/supermemory-concepts.md — Design notes on auto-capture inspiration