agentmemory-0-9-4 - SKILL.md Agent Skill

name: agentmemory-0-9-4
description: Persistent memory engine for AI coding agents with cross-session context capture, hybrid search (BM25 + vector + knowledge graph), and multi-agent coordination via MCP server. Works with Claude Code, Cursor, Gemini CLI, and any MCP client without external databases. Use when building AI coding agent workflows requiring persistent memory or semantic recall.

agentmemory v0.9.4

Overview

agentmemory is a persistent memory system for AI coding agents that eliminates the need to re-explain architecture, rediscover bugs, or re-teach preferences across sessions. It runs as a background server powered by iii-engine and provides 51 MCP tools, 107 REST endpoints, and a real-time viewer on port 3113.

Built on iii-engine's three primitives (Worker/Function/Trigger), agentmemory replaces the traditional Express + Postgres + Redis stack with a single lightweight runtime. State is stored in file-based SQLite via iii-engine's StateModule, with an in-memory vector index for semantic search and a knowledge graph for entity relationships.

Key statistics:

95.2% retrieval recall at R@5 on LongMemEval-S (500 questions, ~115K tokens each)
~170K tokens/year vs. 19.5M+ for full-history pasting (92% fewer tokens; ~$10/yr, whereas full-history pasting is not feasible at all)
51 MCP tools across core memory, governance, multi-agent coordination, and diagnostics
12 lifecycle hooks for automatic capture with zero manual effort
Zero external database dependencies — SQLite + in-memory vector index + local embeddings

When to Use

Setting up persistent memory for any AI coding agent (Claude Code, Cursor, Gemini CLI, OpenCode, Codex, Cline, Goose, Windsurf, Roo Code, Claude Desktop, Aider, Hermes, OpenClaw, Kilo Code)
Implementing cross-session recall of coding decisions, architecture patterns, and bug fixes
Building multi-agent workflows that share memory through leases, signals, and routines
Reducing token costs by replacing full-history context injection with targeted memory retrieval
Enabling semantic search over coding history when keyword matching is insufficient
Setting up team memory with namespaced shared and private memories across team members

Core Concepts

Observations are raw captures of agent activity (tool uses, file edits, command outputs) recorded via lifecycle hooks. They flow through SHA-256 deduplication (5-minute window), privacy filtering (strips API keys, secrets, <private> tags), and synthetic or LLM-powered compression into structured facts, concepts, and narratives.
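
The deduplication and privacy steps described above can be sketched as follows. This is a minimal illustration, not agentmemory's implementation: the `Deduplicator` class, the `scrub` function, and both regular expressions are hypothetical, and the real filter's rules are not shown in this document.

```python
import hashlib
import re
import time
from typing import Optional

DEDUP_WINDOW_SECONDS = 5 * 60  # the 5-minute window mentioned in the text

# Hypothetical patterns, chosen only for illustration.
PRIVATE_TAG = re.compile(r"<private>.*?</private>", re.DOTALL)
API_KEY = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}\b")


def scrub(text: str) -> str:
    """Remove <private> blocks and mask strings that look like API keys."""
    text = PRIVATE_TAG.sub("", text)
    return API_KEY.sub("[REDACTED]", text)


class Deduplicator:
    """Flags observations whose SHA-256 digest was seen within the window."""

    def __init__(self, window: float = DEDUP_WINDOW_SECONDS):
        self.window = window
        self._last_seen = {}  # digest -> timestamp of the most recent sighting

    def is_duplicate(self, text: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        previous = self._last_seen.get(digest)
        # Assumption: every sighting refreshes the timestamp (sliding window).
        self._last_seen[digest] = now
        return previous is not None and now - previous < self.window
```

Hashing the full observation text means only byte-identical captures collapse; near-duplicates would pass through to the compression step.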

Memories are the compressed, long-term form of observations — typed as pattern, preference, architecture, bug, workflow, or fact. They carry strength scores (1-10), version chains with Jaccard-based supersession, relationship graphs, and TTL-based auto-forgetting.
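
To make "Jaccard-based supersession" concrete, here is a small sketch of how a newer memory might be judged to replace an older one. The tokenizer, the `supersedes` helper, and the 0.7 threshold are assumptions for illustration; the document does not state the values agentmemory uses.

```python
import re

SUPERSEDE_THRESHOLD = 0.7  # hypothetical cutoff, not taken from agentmemory


def tokens(text: str) -> set:
    """Lowercase alphanumeric word set; a deliberately simple tokenizer."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity: |A intersect B| / |A union B| over word sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty texts are treated as identical
    return len(ta & tb) / len(ta | tb)


def supersedes(new: str, old: str, threshold: float = SUPERSEDE_THRESHOLD) -> bool:
    """A new memory supersedes an old one when their word sets mostly overlap."""
    return jaccard(new, old) >= threshold
```

For example, "use pnpm for all installs" and "use pnpm for installs" share four of five distinct words, giving a similarity of 0.8, so under the assumed threshold the newer text would extend the older memory's version chain.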

Hybrid Search combines three retrieval streams: BM25 keyword matching with Porter stemming, vector cosine similarity over dense embeddings (6 providers including local all-MiniLM-L6-v2), and knowledge graph traversal via entity extraction. Results are fused with Reciprocal Rank Fusion (RRF, k=60) and session-diversified (max 3 results per session).
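
The fusion step can be illustrated with a short sketch of Reciprocal Rank Fusion followed by the per-session cap. The function names and the `session_of` mapping are hypothetical; only k=60 and the limit of 3 results per session come from the text above, and how agentmemory applies its BM25 and vector weights is not covered here.

```python
from collections import defaultdict

RRF_K = 60           # RRF constant stated in the text
MAX_PER_SESSION = 3  # session-diversification cap stated in the text


def rrf_fuse(rankings, k=RRF_K):
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per item."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def diversify(ranked_ids, session_of, max_per_session=MAX_PER_SESSION):
    """Keep results in order, skipping any beyond the cap for their session."""
    counts = defaultdict(int)
    kept = []
    for doc_id in ranked_ids:
        session = session_of[doc_id]
        if counts[session] < max_per_session:
            counts[session] += 1
            kept.append(doc_id)
    return kept
```

Because RRF uses only ranks, the three streams do not need comparable score scales, which is convenient when mixing BM25 scores, cosine similarities, and graph hops.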

4-Tier Memory Consolidation mirrors human memory processing: working (raw observations), episodic (compressed session summaries), semantic (extracted facts and patterns), and procedural (workflows and decision patterns). Memories decay over time following the Ebbinghaus curve, frequently accessed memories strengthen, and stale memories auto-evict.
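
The decay behaviour can be sketched with the classic exponential form of the Ebbinghaus forgetting curve, R = exp(-t / S). Everything below is an assumption for illustration: the function names, the reinforcement factor, and the eviction floor are hypothetical, and the document does not specify the exact formula or constants agentmemory uses.

```python
import math


def retention(elapsed_days: float, stability_days: float) -> float:
    """Ebbinghaus-style retention: 1.0 when fresh, decaying toward 0."""
    return math.exp(-elapsed_days / stability_days)


def reinforce(stability_days: float, boost: float = 1.5) -> float:
    """Assumed rule: each access multiplies stability, slowing later decay."""
    return stability_days * boost


def should_evict(elapsed_days: float, stability_days: float,
                 floor: float = 0.05) -> bool:
    """Assumed rule: evict once retention falls below a small floor."""
    return retention(elapsed_days, stability_days) < floor
```

With a stability of 7 days, a memory untouched for 30 days would fall below the assumed 5% floor, while one accessed yesterday would remain well above it.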

Installation / Setup

Quick Start

# Terminal 1: start the server
npx @agentmemory/agentmemory

# Terminal 2: seed sample data and see recall in action
npx @agentmemory/agentmemory demo

Open http://localhost:3113 for the real-time viewer.

Prerequisites

Node.js >= 20
iii-engine runtime (separate native binary) or Docker

Install iii-engine:

macOS / Linux: curl -fsSL https://install.iii.dev/iii/main/install.sh | sh
Windows: Download iii-x86_64-pc-windows-msvc.zip from the iii-hq/iii releases page and extract iii.exe to a directory on your PATH
Docker fallback: agentmemory auto-starts the bundled docker-compose.yml if Docker is available

Standalone MCP (no engine required)

For agents that only need MCP tools without the full server, viewer, or cron jobs:

npx -y @agentmemory/agentmemory mcp
# or via the shim package:
npx -y @agentmemory/mcp

Agent Configuration

Add to your agent's MCP config:

{
  "mcpServers": {
    "agentmemory": {
      "command": "npx",
      "args": ["-y", "@agentmemory/mcp"]
    }
  }
}
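
If the agent already has other MCP servers configured, the entry should be merged rather than pasted over the file. The sketch below shows one way to do that on an in-memory config; `add_agentmemory` is a hypothetical helper, and the location of the config file is agent-specific and left to the caller.

```python
import json


def add_agentmemory(config: dict) -> dict:
    """Return a copy of an MCP config with the agentmemory server added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers
    servers["agentmemory"] = {
        "command": "npx",
        "args": ["-y", "@agentmemory/mcp"],
    }
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "other-server"}}}
    print(json.dumps(add_agentmemory(existing), indent=2))
```

The helper copies the top-level and `mcpServers` dictionaries, so the caller's original config is left unchanged until the result is written back.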

Environment Configuration

Create ~/.agentmemory/.env:

# LLM provider (pick one — default is no-op: no LLM calls)
# ANTHROPIC_API_KEY=sk-ant-...
# GEMINI_API_KEY=...
# OPENROUTER_API_KEY=...
# MINIMAX_API_KEY=...
# AGENTMEMORY_ALLOW_AGENT_SDK=true   # Opt-in Claude subscription fallback

# Embedding provider (auto-detected, or force local)
# EMBEDDING_PROVIDER=local
# OPENAI_BASE_URL=https://api.openai.com
# OPENAI_EMBEDDING_MODEL=text-embedding-3-small
# OPENAI_EMBEDDING_DIMENSIONS=1536

# Search tuning (default: BM25 0.4, Vector 0.6)
# BM25_WEIGHT=0.4
# VECTOR_WEIGHT=0.6
# TOKEN_BUDGET=2000

# Auth
# AGENTMEMORY_SECRET=your-secret

# Feature flags (all OFF by default)
# AGENTMEMORY_AUTO_COMPRESS=false   # LLM compression per observation
# AGENTMEMORY_INJECT_CONTEXT=false  # Context injection into conversation
# AGENTMEMORY_SLOTS=false           # Editable pinned memory slots
# AGENTMEMORY_REFLECT=false         # Auto-reflect into slots at session end
# GRAPH_EXTRACTION_ENABLED=false    # Knowledge graph extraction
# CONSOLIDATION_ENABLED=true        # 4-tier consolidation pipeline
# LESSON_DECAY_ENABLED=true         # Lesson auto-decay

Advanced Topics

Memory Pipeline and Hooks: The observation lifecycle from capture to retrieval → Memory Pipeline

Hybrid Search Architecture: Triple-stream retrieval with BM25, vector, and knowledge graph → Hybrid Search

MCP Tools Reference: All 51 tools organized by category → MCP Tools

REST API Endpoints: 107 endpoints for programmatic access → REST API

Configuration and Providers: LLM providers, embedding providers, environment variables → Configuration

Multi-Agent Coordination: Leases, signals, routines, checkpoints, mesh sync → Multi-Agent

Benchmarks and Comparison: LongMemEval results vs competitors → Benchmarks

New in v0.9.x: Doctor command, feature flags, session replay, import improvements → v0.9 Changes