opengraphdb

star 1

Use when the user wants to query, traverse, build, or evolve a graph database; embedded or HTTP-served; Cypher syntax, vector similarity, full-text, RDF round-trip, time-travel, GraphRAG, or an MCP tool catalog. Trigger on phrases like "graph database", "knowledge graph", "Cypher query", "MCP graph", "GraphRAG", "vector + graph", "property graph", "RDF", "time-travel queries", "Neo4j alternative", "embedded graph", "single-file graph", "Memgraph alternative", "Kuzu alternative", "graph + vector + text", or any task framed as "agent owns the graph end-to-end". Use even when the user does not name OpenGraphDB, if the workload pattern matches load entities + relationships, then query by traversal, similarity, or time. Skip when the workload is a Neo4j cluster (causal cluster, Fabric), a time-series DB, plain key-value, or a managed vector DB where graph traversal is not part of the access pattern.

asheshgoplani By asheshgoplani schedule Updated 6/14/2026

name: opengraphdb description: >- Use when the user wants to query, traverse, build, or evolve a graph database; embedded or HTTP-served; Cypher syntax, vector similarity, full-text, RDF round-trip, time-travel, GraphRAG, or an MCP tool catalog. Trigger on phrases like "graph database", "knowledge graph", "Cypher query", "MCP graph", "GraphRAG", "vector + graph", "property graph", "RDF", "time-travel queries", "Neo4j alternative", "embedded graph", "single-file graph", "Memgraph alternative", "Kuzu alternative", "graph + vector + text", or any task framed as "agent owns the graph end-to-end". Use even when the user does not name OpenGraphDB, if the workload pattern matches load entities + relationships, then query by traversal, similarity, or time. Skip when the workload is a Neo4j cluster (causal cluster, Fabric), a time-series DB, plain key-value, or a managed vector DB where graph traversal is not part of the access pattern. license: Apache-2.0 compatibility: "Requires OpenGraphDB >= 0.4.0. Tested with Claude Code, Cursor, Continue.dev, Aider, Goose, Codex via MCP (stdio + HTTP). Engine binary ships ogdb mcp --stdio for local clients and ogdb serve --http for shared agents." allowed-tools: [mcp__opengraphdb__*, Bash, Read] metadata: version: "0.5.1" author: OpenGraphDB homepage: https://github.com/asheshgoplani/opengraphdb category: database tags: [graph, cypher, vector, rdf, mcp, embedded, knowledge-graph, graphrag] ogdb_min: "0.4.0" ogdb_max: null agents: [claude, cursor, continue, aider, goose, codex]

OpenGraphDB

Single-file embedded graph database. First-class openCypher, vector kNN, full-text, RDF round-trip, MCP server in the core CLI. Apache 2.0, no JVM, no sidecar.

This skill covers the cross-cutting workflow an agent uses to drive the database end-to-end. Four narrow sub-skills (data-import, graph-explore, ogdb-cypher, schema-advisor) handle deeper task-specific work.

When to use

  • Building or evolving a property graph the agent owns end-to-end.
  • Cypher generation, exploration, or schema design against OpenGraphDB.
  • Hybrid retrieval in one round-trip: vector kNN + 1-hop graph + full-text.
  • GraphRAG: ingest documents, query back with Cypher.
  • Bitemporal / time-travel queries (AT TIME, temporal_diff).
  • Wiring an MCP-aware client (Claude Code, Cursor, Goose, Codex) to a single-file graph.
  • Migrating from Neo4j / Memgraph / Kuzu where AGPL licensing, JVM weight, or sidecar overhead is the friction.

When NOT to use

  • The user wants Neo4j Enterprise / Aura cluster features (causal cluster, Fabric, Browser).
  • Workload is time-series-first (TimescaleDB / Prometheus).
  • Workload is plain KV with no traversal in the read path (sled / RocksDB).
  • Strict cross-region replication or > N=4 concurrent writers needed today (multi-writer kernel is a known gap; see "Limits" below).

Quickstart in 30 seconds

Same database file works in all three transports. Pick one.

CLI (zero-config embedded)

cargo build --release -p ogdb-cli                                  # ~2 min, one-time
./target/release/ogdb init /tmp/demo.ogdb
./target/release/ogdb create-node /tmp/demo.ogdb \
  --labels Person --props 'name=string:Alice;age=i64:30'
./target/release/ogdb create-node /tmp/demo.ogdb \
  --labels Person --props 'name=string:Bob;age=i64:25'
./target/release/ogdb add-edge /tmp/demo.ogdb 0 1 --type KNOWS
./target/release/ogdb query /tmp/demo.ogdb "MATCH (n:Person) RETURN n.name"
# columns=name | string:Alice | string:Bob

HTTP (single-process server, MCP-ready)

./target/release/ogdb serve --http --port 8080 /tmp/demo.ogdb &
export BASE=http://127.0.0.1:8080
curl -s $BASE/health                                               # {"status":"ok"}
curl -s -X POST $BASE/query -H 'Content-Type: application/json' \
  -d '{"query": "MATCH (n) RETURN count(n) AS c"}'
curl -s -X POST $BASE/mcp/tools -H 'Content-Type: application/json' -d '{}'

Embedded Rust

use ogdb_core::Database;
let db = Database::open("./mydb.ogdb")?;
db.write(|tx| {
    let alice = tx.create_node(&["Person"], &[("name", "Alice".into())])?;
    let bob   = tx.create_node(&["Person"], &[("name", "Bob".into())])?;
    tx.create_edge(alice, bob, "KNOWS", &[])?;
    Ok(())
})?;
let rows = db.query("MATCH (n:Person) RETURN n.name")?;

A runnable end-to-end demo lives at scripts/quickstart.sh.

Cheatsheet

Most-used CLI commands

Command Use
ogdb init <db> Create a new database file
ogdb info <db> File metadata (format version, page size)
ogdb schema <db> --json Dump labels, edge types, property keys
ogdb stats <db> Degree distribution, node/edge counts
ogdb query <db> "<cypher>" One-shot Cypher (--json for parseable output)
ogdb shell <db> Interactive Cypher shell
ogdb import <db> <file> CSV / JSON / JSONL bulk import
ogdb import-rdf <db> <file> RDF (Turtle / N-Triples / RDF-XML); preserves _uri
ogdb export <db> <file> / export-rdf Round-trip out
ogdb migrate <db> <script> Apply a schema-evolution migration
ogdb backup <db> <out> / checkpoint <db> Operations
ogdb serve --http | --bolt | --grpc <db> Network server (one transport per invocation; HTTP also exposes /mcp/tools + /mcp/invoke)
ogdb mcp --stdio <db> MCP over stdio for Claude Code / Cursor / Goose (separate subcommand — there is no serve --mcp flag)

MCP tool catalog (HTTP POST /mcp/tools returns this list)

Tool Purpose
browse_schema Always call first. Discover labels, edge types, property keys.
execute_cypher Run any Cypher (read or write).
query Read-only Cypher convenience wrapper.
schema Equivalent to ogdb schema --json over MCP.
upsert_node / upsert_edge Direct mutation without writing Cypher.
get_node_neighborhood Node + immediate edges; faster than hand-rolled.
search_nodes Keyword search across all string properties.
subgraph Extract subgraph around a starting node (depth configurable).
shortest_path Pre-built shortest-path; faster than Cypher pattern.
vector_search Semantic similarity via the HNSW index.
text_search Full-text via the tantivy index.
temporal_diff Compare graph state across two timestamps.
import_rdf / export_rdf Round-trip RDF preserving _uri per node.
agent_store_episode / agent_recall Agent-memory primitives.
rag_build_summaries / rag_retrieve GraphRAG community summaries + retrieval.
list_datasets Show available datasets in the database.

POST /mcp/invoke accepts the unwrapped shape { "name": "<tool>", "arguments": { ... } }. ogdb mcp --stdio accepts the full JSON-RPC envelope for tools/list + tools/call.

Cypher essentials

OpenGraphDB implements an openCypher subset. The TCK harness in crates/ogdb-tck enforces a 50% Tier-1 floor across MATCH / RETURN / WHERE / CREATE / DELETE / SET as a regression gate. Beyond Tier-1: OPTIONAL MATCH, MERGE (with ON CREATE SET / ON MATCH SET), WITH, UNWIND, pattern comprehension, CASE, aggregations, ordering, CREATE INDEX, and the OpenGraphDB-specific AT TIME extension.

Minimal cheatsheet

// Read
MATCH (n:Person)-[:KNOWS]->(m:Person) RETURN n.name, m.name

// Write (idempotent; prefer MERGE for any data import)
MERGE (a:Person {name: 'Alice'}) ON CREATE SET a.created_at = timestamp()
MERGE (b:Person {name: 'Bob'})
MERGE (a)-[:KNOWS]->(b)

// Bulk write with parameter list
UNWIND $rows AS row
MERGE (p:Person {id: row.id})
SET   p.name = row.name, p.age = row.age

// Aggregation + ordering — RETURN aliases are not visible to ORDER BY
// in this engine; project through WITH first.
MATCH (p:Person)-[:WROTE]->(b:Book)
WITH p.name AS author, count(b) AS books
RETURN author, books ORDER BY books DESC LIMIT 10

// Vector kNN (function form, not a custom operator)
MATCH (r:Review)
WHERE vector_distance(r.embedding, $q) < 0.3
RETURN r.text ORDER BY vector_distance(r.embedding, $q) ASC LIMIT 10

// Full-text — same alias rule: project through WITH before ORDER BY
MATCH (a:Article) WHERE text_search(a.body, 'graph database')
WITH a.title AS title, text_score(a.body, 'graph database') AS rel
RETURN title, rel ORDER BY rel DESC

// Time-travel (timestamps in milliseconds)
MATCH (a)-[:KNOWS]->(b) AT TIME 1750000000000 RETURN b

What is not supported (today)

  • LOAD CSV (use ogdb import or the /import API).
  • shortestPath() Cypher function (use the shortest_path MCP tool).
  • Variable-length patterns (-[:REL*1..N]->) and named paths (MATCH p = (...)...) — use a fixed-depth chain or the shortest_path MCP tool.
  • UNION between query parts — split into two queries and merge client-side.
  • EXISTS { ... } subquery and the exists((a)-[:R]->(b)) predicate function — rewrite with OPTIONAL MATCH + WHERE x IS NOT NULL.
  • Arbitrary stored-procedure CALL ... YIELD ... (engine ships only built-ins).
  • Most APOC procedures (rewrite as plain Cypher or a small MCP tool).
  • RETURN aliases are not visible to a trailing ORDER BY in the same clause — project the alias through WITH first (see the cheatsheet above).

For the full feature × status grid, see references/cypher-coverage.md.

AI-agent recipes

Condensed from documentation/COOKBOOK.md.

1. AI-agent over MCP

ogdb serve --http --port 8080 mydb.ogdb &
curl -s -X POST localhost:8080/mcp/invoke -H 'Content-Type: application/json' \
  -d '{"name":"browse_schema","arguments":{}}'
# Or stdio for a local MCP client (Claude Code, Cursor, Goose):
ogdb mcp --stdio mydb.ogdb

2. Hybrid retrieval (vector + 1-hop graph + BM25, RRF-fused)

curl -s -X POST $BASE/rag/search -H 'Content-Type: application/json' -d '{
  "query": "alice and bob writing the cookbook",
  "embedding": [0.12,-0.04,0.31,0.05,0.22,-0.18,0.09,0.41,
                0.15,-0.02,0.27,0.36,-0.11,0.08,0.19,0.04],
  "k": 10
}'

If embedding is omitted, server falls back to text-only. dim must match the index. Optional community_id scopes to a single GraphRAG community.

3. Doc ingest + Cypher query (canonical GraphRAG)

curl -s -X POST $BASE/rag/ingest -H 'Content-Type: application/json' -d '{
  "title": "alice-bob-collab", "format": "PlainText",
  "content": "Alice works with Bob on the OpenGraphDB cookbook."
}'
curl -s -X POST $BASE/query -H 'Content-Type: application/json' \
  -d '{"query":"MATCH (p:Person)-[:WORKS_WITH]->(q:Person) RETURN p.name, q.name"}'

4. Time-travel diff

NOW=$(date +%s)
curl -s -X POST $BASE/mcp/invoke -H 'Content-Type: application/json' -d "{
  \"name\":\"temporal_diff\",
  \"arguments\":{\"timestamp_a\":0,\"timestamp_b\":$NOW}
}"
# Returns snapshot_a, snapshot_b, and the diff (node_count / edge_count delta).

5. Agent memory (agent_store_episode + agent_recall)

agent_store_episode requires agent_id, session_id, content, embedding, and timestamp (ms-since-epoch); metadata is optional. The schema is enforced in crates/ogdb-cli/src/lib.rs::execute_mcp_agent_store_episode_tool.

curl -s -X POST $BASE/mcp/invoke -d '{"name":"agent_store_episode","arguments":{
  "agent_id":"planner-1",
  "session_id":"sess-2026-05-06",
  "content":"learned user prefers terse responses",
  "embedding":[/* dim must match the index */],
  "timestamp":1746489600000
}}' -H 'Content-Type: application/json'
curl -s -X POST $BASE/mcp/invoke -d '{"name":"agent_recall","arguments":{
  "agent_id":"planner-1","query_embedding":[/* ... */],"k":5}}' \
  -H 'Content-Type: application/json'

6. RDF round-trip (preserves _uri)

ogdb import-rdf mydb.ogdb ontology.ttl       # any of: ttl, nt, rdf, owl
ogdb export-rdf mydb.ogdb out.ttl --format turtle
# _uri is restored on export so URIs survive the round-trip.

7. Multi-agent shared KG (over MCP, single-process today)

ogdb serve --http --port 8080 shared.ogdb &
# Each agent calls /mcp/invoke against the same endpoint.
# NOTE: kernel is single-writer in 0.5.1; concurrent writers serialize at
# the storage layer. Plan for write-batching or shard per-agent today and
# revisit when the multi-writer kernel ships (see "Limits").

Performance, with honest framing

OpenGraphDB 0.5.1 baseline (i9-10920X, Linux, N=5 release-build median, cold cache, 1 warmup discarded). Live source of truth: documentation/BENCHMARKS.md.

The strict scorecard (post cycle-17 verdict tone-down): 1 verified WIN, 2 caveated WIN, 2 losses, 6 novel-or-directional. The verified row clears a published spec threshold apples-to-apples; caveated rows clear a competitive bar but carry a documented asterisk. Use the verified row as a trust anchor, the caveated rows as directional, and the novel rows as "feasibility, not benchmark".

Metric OpenGraphDB 0.5.1 Spec target Verdict
Scaling Tier 7.1 @ 10k nodes (read p95 / load / RSS) 0.38 μs / 0.32 s / 28.0 MB p95 < 1 ms, load < 1 s, RSS < 100 MB ✅ verified WIN, all three gates
Enrichment round-trip t_persist p95 (100 docs × 10ent + 15edge) 46.7 ms p95 < 40 ms best-in-class, < 150 ms competitive ✅ caveated; clears 150 ms competitive by 3.2×, misses 40 ms best-in-class by 7 ms
Graph-feature rerank batch p95 (100 candidates × 1-hop) 1.34 μs (153 μs/batch) p95 < 50 ms ✅ caveated; clears competitive bar by orders of magnitude — boost is a synthetic Σ neighbour_id, not a learned dot-product

Read the full 14-row scorecard with apples-to-apples notes, losses, and deferred apples-to-apples runs at references/benchmarks-snapshot.md.

For the trust-anchor view (verified-only and caveated-only rows with the exact published-comparison framing), see references/benchmarks-verified.md.

Limits

What 0.5.1 does not do well, and how to escalate:

  • Bulk ingest path is naïve. 254 nodes/s @ 10k+10k single write-tx; 670× behind Kuzu, 1 150× behind Memgraph at the same scale. Workaround: batch via UNWIND inside one write-tx, or use POST /import for >10k rows. Tracked in BENCHMARKS §4.1.
  • Single-writer kernel. Concurrent writers serialize. The published concurrent_rate is per-DB-per-thread, mechanical. Workaround: shard per-agent or queue writes through a coordinator. Multi-writer MVCC tracked in BENCHMARKS §4.6.
  • Mutation p99.9 = 720 ms tail. 56× ratio between p99 (16 ms) and p99.9 hints at a flush / page-cache pause. Don't put per-token writes on a user-facing latency-SLA flow until profiled.
  • Cypher coverage is a subset. No LOAD CSV, no shortestPath(), limited CALL/YIELD. Most APOC code does not port. See references/cypher-coverage.md for the authoritative grid.
  • No external openCypher TCK pass-rate published. Only the in-tree 50% Tier-1 floor is enforced today. Run cargo run --release -p ogdb-tck -- /path/to/openCypher/tck yourself if you need a number.
  • Bolt is v1 only. Modern Neo4j drivers may negotiate v4/v5 first. Use HTTP /query for clean compatibility.
  • No causal cluster / cross-region replication. By design. If that's a hard requirement, this engine is not the right fit.

Common pitfalls

  • RETURN of a bare node returns its id, not its properties. Use RETURN n.name or RETURN properties(n) when you need the body.
  • Don't rename labels behind a live agent. Use ogdb migrate with an explicit migration script; don't hot-edit the catalog.
  • Vector dimensionality is fixed at index creation. Mixing dim=16 query vectors against a dim=1536 index silently returns nothing. Always check the index first.
  • MCP HTTP and stdio accept different envelopes. HTTP /mcp/invoke takes {name, arguments}; stdio takes the full JSON-RPC tools/call envelope.
  • Bearer auth is single-tier (Authorization: Bearer <token>). Don't assume Neo4j role-based access ports across.
  • AT TIME uses milliseconds in the Cypher parser, but the temporal_diff MCP tool takes seconds. Mismatch silently returns an empty snapshot.
  • CREATE is not idempotent. Always prefer MERGE for any import or re-runnable workflow.

Sub-skills, when to descend

Sub-skill Descend when
skills/data-import User has a CSV / JSON / RDF source to load. Covers format detection, two-pass ingest, batch sizing, MERGE-based idempotency, validation.
skills/graph-explore User points at an unknown graph and asks "what's in here?" Covers exploration strategies, schema navigation, entry-point selection.
skills/ogdb-cypher Pure Cypher generation against a known schema. Covers all supported clauses, OpenGraphDB extensions, optimization rules, common error patterns.
skills/schema-advisor User describes a domain and wants a graph schema. Covers eight modeling best practices, six anti-patterns, index selection, RDF mapping with _uri preservation.

This master skill covers the cross-cutting workflow. Descend into a sub-skill only when the task is dominated by one of the four narrow concerns.

Reference index

See also (live docs in repo root)

Install via CLI
npx skills add https://github.com/asheshgoplani/opengraphdb --skill opengraphdb
Repository Details
star Stars 1
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator
asheshgoplani
asheshgoplani Explore all skills →