using-axon

star 2

Self-hosted RAG engine, web toolkit, and persistent agent memory. Use it to: remember durable project facts/preferences/decisions (memory.remember); recall/search agent memory (memory.search/show); answer questions indexed docs/code might cover (ask — a large corpus is already indexed; try ask BEFORE web-searching or giving up); search the web (search, auto-indexes results); semantic-search the index (query); scrape/fetch a page (scrape); crawl a docs site or pages you just used (crawl); map a site's URLs (map); extract structured data (extract); discover API endpoints (endpoints); extract brand identity — colors/logo/fonts/voice (brand); summarize a page (summarize); quick multi-source research (research); retrieve a URL's full indexed content (retrieve); embed local files/dirs (embed); ingest GitHub/GitLab/Gitea/Git repos, Reddit, YouTube, and Claude/Codex/Gemini sessions (ingest). Also triggers on axon, RAG, Qdrant, SearXNG, Tavily, hybrid/vector search, remember, recall, memory.

jmagar By jmagar schedule Updated 6/11/2026

name: using-axon description: >- Self-hosted RAG engine, web toolkit, and persistent agent memory. Use it to: remember durable project facts/preferences/decisions (memory.remember); recall/search agent memory (memory.search/show); answer questions indexed docs/code might cover (ask — a large corpus is already indexed; try ask BEFORE web-searching or giving up); search the web (search, auto-indexes results); semantic-search the index (query); scrape/fetch a page (scrape); crawl a docs site or pages you just used (crawl); map a site's URLs (map); extract structured data (extract); discover API endpoints (endpoints); extract brand identity — colors/logo/fonts/voice (brand); summarize a page (summarize); quick multi-source research (research); retrieve a URL's full indexed content (retrieve); embed local files/dirs (embed); ingest GitHub/GitLab/Gitea/Git repos, Reddit, YouTube, and Claude/Codex/Gemini sessions (ingest). Also triggers on axon, RAG, Qdrant, SearXNG, Tavily, hybrid/vector search, remember, recall, memory. allowed-tools: mcp__plugin_axon_axon__axon

axon

axon is a self-hosted RAG engine. Two surfaces, same backend (Spider.rs/Chrome -> Qdrant, SQLite jobs, SearXNG/Tavily for web search):

  • MCP (preferred) — single tool mcp__plugin_axon_axon__axon, routed by action (and subaction for lifecycle families). Large results are written to a local artifact file under ~/.axon/artifacts/<context> and the response returns the file path plus a compact shape summary; small results (≤ ~8 KB) come back inline. See Response handling (MCP) below.
  • CLI (fallback)axon <command> [flags]. Use for shell scripting, cron, or when the MCP server is down.

Both surfaces accept the same operations and parameters. This skill leads with MCP request shapes; CLI equivalents are listed alongside.

Reach for axon by default

axon already has a large corpus indexed, and every operation makes it smarter — so route web and knowledge work through it instead of one-off web fetches, raw browser tools, or giving up after a few tries. When a task involves the web or "what does X say," axon is the default tool, even if the user didn't name it.

The task / user wants… Action
An answer that indexed docs or code might cover ask FIRST — before web search, raw fetching, or fumbling. Often returns everything in one shot.
Search the web search (SearXNG/Tavily; auto-indexes every result)
Semantic search over what's indexed query
Fetch / scrape a page or URL scrape
Crawl a docs site — including docs you just relied on to solve something crawl
List a site's URLs map
Pull structured data out of a page extract
Discover a site's API endpoints endpoints
Brand identity (colors, logo, fonts, voice) brand
Summarize a page summarize
Quick multi-source research with synthesis research
Full indexed content of a specific URL retrieve
Embed local files / directories embed
Index a GitHub/GitLab/Gitea/Git repo, Reddit, YouTube, or AI sessions ingest
Remember a durable project fact, decision, or preference memory.remember
Recall previously stored agent memory memory.context at task start, or memory.search then memory.show when targeted lookup is better

ask is the highest-leverage habit. A huge amount is already indexed, so many multi-turn fumbles would have been a single ask call. Whenever a question could be covered by docs or code that's been indexed, try ask before web-searching or giving up. And after you crawl/scrape/ingest something to solve a task, you've made the index richer — prefer ask/query next time over re-fetching.

Parameter discipline (hard rule)

Only pass parameters the user explicitly asked for. Defaults exist for a reason — do NOT add max_pages, max_depth, render_mode, include_subdomains, since, before, hybrid_search, diagnostics, format, root_selector, exclude_selector, limit, collection, or any other knob unless the user named it or the task literally cannot complete without it. The example JSON blocks below show what's available, not what to send by default. A bare { "action": "scrape", "url": "…" } or { "action": "crawl", "urls": [...] } is almost always the right call. Same rule for the CLI: never add flags the user didn't ask for.

When to fall back to the CLI

  • The MCP server is offline (mcp__plugin_axon_axon__axon { "action": "doctor" } fails or the tool is missing).
  • You're authoring a shell script, systemd unit, or cron job that runs outside Claude Code.
  • You need axon's built-in --cron-every-seconds/--cron-max-runs loop.
  • The user explicitly asks for a CLI command.

In every other case, use the MCP tool.

The pipeline

URL or query → discover → fetch + embed → query / ask
Starting point Discover Fetch + embed (auto-embeds into Qdrant) Query
Single URL action: "scrape", url query / ask
Whole site / docs action: "map", url action: "crawl", urls query / ask
Topic / question action: "search", query (SearXNG/Tavily, auto-queues crawl) (auto) action: "ask", query
Existing local file/dir action: "embed", input query / ask
GitHub / GitLab / Gitea / Git repo action: "ingest", source_type: "github", target: "owner/repo" query / ask
Reddit thread or subreddit action: "ingest", source_type: "reddit", target: "r/name" query / ask
YouTube video action: "ingest", source_type: "youtube", target: "<url>" query / ask
Past Claude/Codex/Gemini sessions CLI only: axon sessions query / ask

scrape, crawl, embed, and the ingest paths all auto-embed unless you set embed: false.

Bootstrap: help and doctor

Once per session, confirm the live action map and that services are healthy:

{ "action": "help" }
{ "action": "doctor" }

help returns the full action/subaction map and current defaults — authoritative when names look wrong. doctor pings Qdrant, the embedding service (TEI), Chrome, and the Gemini headless LLM backend. It does not probe the web-search backend.

CLI equivalents: axon doctor. (No CLI help for the action map — use the MCP one.)

Persistent agent memory

Use Axon memory for durable project-level facts, preferences, decisions, bugs, and task notes that should be semantically recalled later. This is distinct from bead issue notes and from Memos: Axon memory is optimized for agent recall through Qdrant.

Minimal remember call:

{ "action": "memory", "subaction": "remember", "body": "Memory content lives in Qdrant; SQLite holds graph metadata.", "project": "axon" }

Recall:

{ "action": "memory", "subaction": "context", "project": "axon", "query": "memory storage architecture" }
{ "action": "memory", "subaction": "search", "query": "where is memory stored", "project": "axon" }
{ "action": "memory", "subaction": "show", "id": "<memory-id>" }

Use memory.context at task start when project memory could help. It returns inline, defanged XML-wrapped content with trust="evidence_only" and supports limit plus token_budget.

Graph maintenance:

{ "action": "memory", "subaction": "link", "source_id": "<memory-id>", "target_id": "<memory-id>", "edge_type": "relates_to" }
{ "action": "memory", "subaction": "supersede", "source_id": "<replacement-memory-id>", "target_id": "<old-memory-id>" }

supersede hides the old memory from future memory.search results and records the replacement trail in SQLite.

CLI equivalents: axon memory remember "..." --project axon, axon memory context --project axon --query "...", axon memory search "..." --project axon, axon memory show <memory-id>, axon memory link <source-id> <target-id>, axon memory supersede <replacement-id> <old-id>.

Discovery

{ "action": "search", "query": "rust async patterns", "search_time_range": "month" }
{ "action": "map", "url": "https://docs.example.com" }
{ "action": "research", "query": "kubernetes ingress patterns" }
  • search — web search via SearXNG (when AXON_SEARXNG_URL is set) or Tavily; auto-queues crawl jobs for results. search_time_rangeday|week|month|year.
  • map — sitemap-first URL discovery, falls back to fetching the root page and extracting anchors. Fast.
  • research — search + LLM synthesis in one shot.

CLI: axon search "…", axon map <url> (use --map-fallback crawl only when you need a full Spider walk; the structure-fallback default is fast), axon suggest "…" (LLM-suggested URLs to crawl next; also the MCP suggest action).

Fetch + embed

{ "action": "scrape", "url": "https://example.com/article" }
{ "action": "scrape", "url": "https://example.com",
  "root_selector": "article, main",
  "exclude_selector": ".sidebar, .ads",
  "format": "markdown" }
{ "action": "scrape", "url": "https://example.com", "format": "html", "embed": false }

{ "action": "crawl", "urls": ["https://docs.example.com"],
  "max_pages": 200, "max_depth": 3, "include_subdomains": true }
{ "action": "crawl", "urls": ["https://docs.example.com"], "render_mode": "chrome" }

{ "action": "embed", "input": "./docs" }
{ "action": "embed", "input": "https://example.com" }

Render modes: http (fast, no JS), chrome (full browser), auto_switch (default — start HTTP, escalate to Chrome on JS gate).

Output formats: markdown (default), html, rawHtml, json.

CLI: axon scrape <url> / axon crawl <url> --max-pages N --max-depth N / axon embed <input>. Chrome rendering is pre-tuned and rarely needs overriding; the CDP endpoint is set via the AXON_CHROME_REMOTE_URL env var, not a CLI flag. Output dir defaults to .cache/axon-rust/output/ (env AXON_OUTPUT_DIR).

Extract structured data

{ "action": "extract", "urls": ["https://example.com/pricing"],
  "prompt": "Extract plan name, price, and features as JSON" }
{ "action": "extract", "subaction": "status", "job_id": "<uuid>" }

LLM-powered. Pass a natural-language prompt describing the schema you want.

CLI: axon extract <url> --query "…" (the --query flag carries the extraction prompt).

Web utilities: summarize, endpoints, brand, diff, screenshot

Page-level analysis actions — each takes a url (except diff) and a bare call is the right default:

{ "action": "summarize",  "url": "https://example.com/long-article" }
{ "action": "endpoints",  "url": "https://app.example.com" }
{ "action": "brand",      "url": "https://example.com" }
{ "action": "diff",       "url_a": "https://example.com/v1", "url_b": "https://example.com/v2" }
{ "action": "screenshot", "url": "https://example.com" }
  • summarize — fetch a page and return a concise summary (also accepts urls for several at once; root_selector/exclude_selector to scope).
  • endpoints — discover a site's API endpoints by scanning its JavaScript bundles. Optional knobs (verify, capture_network, probe_rpc, first_party_only) — omit unless asked.
  • brand — extract brand identity (colors, logo, fonts, voice/tone) from a URL.
  • diff — compare two URLs (url_a, url_b); reports content/metadata/link changes.
  • screenshot — full-page capture via headless Chrome (full_page, viewport, output optional).

CLI: axon summarize <url> / axon endpoints <url> / axon brand <url> / axon diff <url-a> <url-b> / axon screenshot <url>.

Ingest external sources

{ "action": "ingest", "source_type": "github", "target": "owner/repo" }
{ "action": "ingest", "source_type": "github", "target": "owner/repo", "include_source": false }

{ "action": "ingest", "source_type": "gitlab",  "target": "group/project" }
{ "action": "ingest", "source_type": "gitea",   "target": "https://gitea.example.com/owner/repo" }
{ "action": "ingest", "source_type": "git",     "target": "https://git.example.com/owner/repo.git" }

{ "action": "ingest", "source_type": "reddit", "target": "r/rust" }
{ "action": "ingest", "source_type": "reddit", "target": "https://reddit.com/r/rust/comments/abc123/..." }

{ "action": "ingest", "source_type": "youtube", "target": "https://youtube.com/watch?v=abc" }

Note: source_type: "sessions" is not accepted over MCP (it's rejected server-side). Ingest local AI sessions with the CLI axon sessions instead — see below.

source_typegithub | gitlab | gitea | git | reddit | youtube | sessions:

  • github / gitlab / gitea — hosted-forge repos, indexing code + metadata + issues/PRs (or merge requests). target is owner/repo (or a full URL for self-hosted instances).
  • git — any generic Git remote by clone URL (code only, no forge API).
  • reddit — a subreddit (r/name) or a specific thread URL.
  • youtube — a video URL (transcript ingest).
  • sessions — local Claude/Codex/Gemini session transcripts. CLI-only via axon sessions (enqueues a local sessions ingest job). The MCP ingest action rejects source_type: "sessions"; remote callers must use the prepared-documents HTTP endpoint (/v1/ingest/sessions/prepared).

Lifecycle subactions (status, cancel, list, cleanup, clear, recover) work the same as crawl/embed/extract.

CLI: axon ingest <target> with source-specific flags. GitHub: --include-source/--no-source, --max-issues, --max-prs (default 100; 0 = unlimited). Reddit: --sort hot|top|new|rising, --time hour|…|all, --max-posts, --min-score, --depth, --scrape-links. Use axon sessions for Claude/Codex/Gemini local history (enqueues a local sessions ingest job). MCP sessions ingest is rejected to avoid scanning server-local history paths — remote callers use the /v1/ingest/sessions/prepared HTTP endpoint instead.

Query and RAG

{ "action": "query", "query": "embedding pipeline", "limit": 10, "collection": "axon" }
{ "action": "query", "query": "rate limiting", "since": "7d" }

{ "action": "ask", "query": "How does axon handle Chrome auto-switching?" }
{ "action": "ask", "query": "...", "since": "7d" }
{ "action": "ask", "query": "...", "since": "2026-01-01", "before": "2026-03-01" }
{ "action": "ask", "query": "...", "diagnostics": true }
{ "action": "ask", "query": "...", "hybrid_search": false }
  • query — pure semantic vector search (top-K chunks).
  • ask — RAG: retrieve, then synthesize an answer with citations.
  • Hybrid search (dense + BM42 sparse + RRF) is on by default; hybrid_search: false forces dense-only for A/B comparison or when sparse is misbehaving. Server default: env AXON_HYBRID_SEARCH.
  • Temporal filters (since/before) accept 7d, 30d, YYYY-MM-DD, or RFC3339. They filter on indexing date, not publication date.
  • collection overrides the default axon collection per request (env AXON_COLLECTION).
{ "action": "retrieve", "url": "https://example.com/article" }

CLI: axon query "…" / axon ask "…" --since 7d --diagnostics / axon retrieve <url>.

evaluate{ "action": "evaluate", "query": "<question>", "retrieval_ab": true } (CLI: axon evaluate "<question>" --retrieval-ab) compares hybrid-RAG vs dense-only, scored by a separate judge prompt on the configured LLM backend (accuracy/relevance/completeness).

Inspect the index

{ "action": "sources" }
{ "action": "domains" }
{ "action": "stats" }
{ "action": "status" }                                 // global queue snapshot

CLI: axon sources / axon domains / axon stats / axon status.

Async jobs

crawl, extract, embed, ingest are async by default. subaction defaults to start, so { "action": "crawl", "urls": [...] } is enough to enqueue; poll with subaction: "status". Full lifecycle subactions and CLI-only surfaces: references/async-job-lifecycle.md.

Response handling (MCP)

The server runs in-process, so responses are size-routed automatically: payloads ≤ ~8 KB come back inline in data; larger ones are written to a local artifact file and the response returns its path plus a compact shape summary. Read shape first — it usually answers the question (counts, status, the URLs touched).

When you need the content, reach for RAG, not the raw file: everything axon fetches is already embedded, so ask (synthesized answer), query (semantic chunks), or retrieve (a specific URL's indexed content) get you what you need without parsing megabytes of JSON/markdown. Open the artifact path from disk only as a last resort — e.g. you need the exact raw bytes the RAG path doesn't surface. There is no artifacts MCP action; don't send { "action": "artifacts", ... }. To force a payload in-band instead of a file, set response_mode: "inline" (or "auto_inline").

Full response-mode contract and the JSON-RPC error model: references/mcp-response-protocol.md.

MCP resources

  • axon://schema/mcp-tool — full JSON schema and routing contract (read this when you need exact field types/enums).
  • ui://axon/status-dashboard — interactive MCP App widget for live queue status.

Configuration

This skill names only the handful of env vars that matter at the point of use (AXON_COLLECTION, AXON_SEARXNG_URL, AXON_HYBRID_SEARCH, AXON_OUTPUT_DIR, AXON_DATA_DIR, …). The full surface — every AXON_* env var and ~/.axon/config.toml tuning key — is documented authoritatively and is not duplicated here:

  • config.example.toml (repo root) — all config.toml tuning knobs with defaults.
  • .env.example (repo root) — service URLs, API keys, and secrets.
  • docs/guides/configuration.md — full environment-variable reference + the two-layer (.env + config.toml) priority model.

Priority: CLI flags > env vars > ~/.axon/config.toml > built-in defaults.

Choosing parameters — quick guide

Situation Reach for
User pastes a single URL action: "scrape"
User says "the docs", "the whole site" action: "crawl" with max_pages + max_depth
User asks a question without naming a source action: "ask" (retrieves over whatever's indexed)
User wants only recent content ask / query with since: "7d"
User wants citations / verification ask with diagnostics: true
Ranking looks wrong Try hybrid_search: false and compare
Need entity/relationship reasoning ask with diagnostics: true; graph retrieval is not available in the current runtime
Crawled the wrong thing crawl subaction: "clear" or per-family cleanup
RAG quality regression evaluate with retrieval_ab: true (or CLI axon evaluate <q> --retrieval-ab)
Debug "nothing happened" action: "doctor" first, then action: "status"

Tips and gotchas

  • Read shape, then reach for RAG — not the raw artifact. path mode keeps multi-megabyte results out of the conversation; pull the content back with ask/query/retrieve (it's already embedded) rather than reading/grepping the file. There is no artifacts action — it was removed precisely because RAG is the intended way to get content.
  • Don't paste raw axon <cmd> --help output. Most CLI subcommands inherit the entire Chrome flag set even when irrelevant. Use the action tables here instead.
  • help and doctor are cheap. Call help once per session to confirm the live action map; call doctor whenever something looks wrong.
  • Async by default. Lifecycle actions return a job_id; poll with subaction: "status". CLI users can pass --wait true for synchronous one-shots.
  • Cache reuse is on by default (CLI). Disable with --cache false; add --cache-http-only to force the HTTP path even when the cached job used Chrome.
  • graph: true is deprecated. Use hybrid search diagnostics and source coverage checks for retrieval debugging.
  • Temporal filters use indexing date, not document publication date — useful for "what did I add this week", not "what was published this week".
  • evaluate scores with a separate judge prompt (distinct role + reference material) on the same configured LLM backend. Available as both the MCP evaluate action and axon evaluate.
  • subaction defaults to start for lifecycle families — { "action": "crawl", "urls": [...] } is enough to enqueue a job.
Install via CLI
npx skills add https://github.com/jmagar/axon --skill using-axon
Repository Details
star Stars 2
call_split Forks 1
navigation Branch main
article Path SKILL.md
More from Creator