neumann-vector - SKILL.md Agent Skill

name: neumann-vector
description: Build vector search and embedding pipelines with Neumann. Use when implementing similarity search, RAG retrieval, semantic caching, or working with HNSW indexes.

Neumann Vector Search Guide

Vector Storage Model

Neumann stores embeddings as key-vector pairs in a dedicated vector engine backed by an HNSW (Hierarchical Navigable Small World) index for fast approximate nearest neighbor search.

Key concepts:

Embedding: A string key mapped to a float vector of any dimensionality.
Collection: An optional namespace for grouping related embeddings (e.g., per-tenant, per-domain).
HNSW Index: Must be explicitly built after inserts for fast similarity queries.
Distance Metrics: COSINE (default), EUCLIDEAN, DOT_PRODUCT.

The query vector and every stored vector it is compared against must have the same dimensionality. Mixed dimensions pass parsing and fail at runtime.

Embeddings are stored in tensor_store's embedding slab, which uses sharded locking -- concurrent writes to different keys do not block each other.

Embedding Lifecycle

The standard workflow: store embeddings, build the index, then query.

Store a single embedding

EMBED STORE 'doc-1' [0.12, -0.34, 0.56, 0.78]

Keys must be quoted strings. Unquoted keys with hyphens or colons will fail to parse.

Store into a collection

EMBED STORE 'doc-1' [0.12, -0.34, 0.56] IN my_collection

Batch store multiple embeddings

EMBED BATCH [('doc-1', [0.1, 0.2, 0.3]), ('doc-2', [0.4, 0.5, 0.6]), ('doc-3', [0.7, 0.8, 0.9])]

Always prefer EMBED BATCH over multiple EMBED STORE calls for bulk inserts.
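
When embeddings come from application code, the batch statement can be assembled programmatically. The following is a minimal Python sketch, not part of Neumann: the helper names (quote_key, format_vector, embed_batch_statement) are hypothetical, and it only produces the EMBED BATCH text shown above. It also checks dimensionality on the client so that mixed-dimension vectors are caught before they reach the engine. How Neumann escapes a single quote inside a key is not covered in this guide, so the sketch rejects such keys rather than guessing.

```python
from typing import Sequence


def quote_key(key: str) -> str:
    """Wrap a key in single quotes, as the EMBED commands require."""
    # Assumption: quote escaping inside keys is undocumented here,
    # so this sketch refuses such keys instead of inventing a rule.
    if "'" in key:
        raise ValueError(f"key contains a single quote: {key!r}")
    return f"'{key}'"


def format_vector(vector: Sequence[float]) -> str:
    """Render a vector as a bracketed, comma-separated literal."""
    return "[" + ", ".join(repr(float(x)) for x in vector) + "]"


def embed_batch_statement(pairs: Sequence[tuple[str, Sequence[float]]]) -> str:
    """Build one EMBED BATCH statement from (key, vector) pairs."""
    if not pairs:
        raise ValueError("EMBED BATCH needs at least one pair")
    dim = len(pairs[0][1])
    for key, vector in pairs:
        # Catch mixed dimensions client-side rather than at query time.
        if len(vector) != dim:
            raise ValueError(f"{key!r} has {len(vector)} dims, expected {dim}")
    items = ", ".join(f"({quote_key(k)}, {format_vector(v)})" for k, v in pairs)
    return f"EMBED BATCH [{items}]"


statement = embed_batch_statement([
    ("doc-1", [0.1, 0.2, 0.3]),
    ("doc-2", [0.4, 0.5, 0.6]),
])
print(statement)
```

The printed statement can be sent through whatever client or shell you already use to run Neumann commands.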

Build the HNSW index

EMBED BUILD INDEX

Call this after bulk inserts and before similarity queries; without an index, SIMILAR falls back to a slow brute-force scan. Building the index is an O(n log n) operation. Do it once after loading, not after each individual insert.

Retrieve an embedding by key

EMBED GET 'doc-1'

Returns the vector associated with the key, or an error if not found.

Delete an embedding

EMBED DELETE 'doc-1'

List stored embeddings

SHOW EMBEDDINGS
SHOW EMBEDDINGS LIMIT 50

Count embeddings

COUNT EMBEDDINGS

Check index status

SHOW VECTOR INDEX

Similarity Search

The SIMILAR command performs approximate nearest neighbor search. It returns key-score pairs ranked by similarity.

Search by vector

SIMILAR [0.1, 0.2, 0.3] LIMIT 10

Search by existing key

SIMILAR 'doc-1' LIMIT 5

Uses the vector already stored under that key as the query vector.

Specify distance metric

SIMILAR [0.1, 0.2, 0.3] LIMIT 10 METRIC COSINE
SIMILAR [0.1, 0.2, 0.3] LIMIT 10 METRIC EUCLIDEAN
SIMILAR [0.1, 0.2, 0.3] LIMIT 10 METRIC DOT_PRODUCT

Default is COSINE if omitted.
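
To make the difference between the three metrics concrete, here is a small Python sketch using the standard textbook definitions. It is an illustration only: it does not call Neumann, and how Neumann scales or reports scores for each metric is not specified in this guide. The example vectors are made up.

```python
import math


def dot(a, b):
    """Dot product: grows with both alignment and magnitude."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores magnitude."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    """Straight-line distance; smaller means closer."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
same_direction_longer = [3.0, 0.0]   # same angle as the query, larger norm
nearby_but_rotated = [1.0, 0.5]      # close in space, different angle

for name, vec in [("same_direction_longer", same_direction_longer),
                  ("nearby_but_rotated", nearby_but_rotated)]:
    print(name,
          "cosine=%.3f" % cosine_similarity(query, vec),
          "euclidean=%.3f" % euclidean_distance(query, vec),
          "dot=%.3f" % dot(query, vec))
```

Cosine prefers the vector pointing in the same direction regardless of its length, Euclidean prefers the vector that is physically closer, and dot product rewards the longer vector even more strongly than cosine does.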

Search within a collection

SIMILAR [0.1, 0.2, 0.3] LIMIT 10 IN my_collection

Filtered search

SIMILAR [0.1, 0.2, 0.3] LIMIT 10 WHERE category = 'science'

The WHERE clause filters results after the ANN search, so a query can return fewer than LIMIT results when some of the nearest neighbors fail the filter.
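
The search-then-filter order can be modeled in a few lines of Python. This is a sketch under stated assumptions, not Neumann's implementation: it uses an exact scan in place of HNSW, the metadata dict and the similar_then_filter function are hypothetical, and whether Neumann fetches extra candidates before filtering is not stated in this guide.

```python
import math


def cosine_similarity(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def similar_then_filter(query, embeddings, metadata, limit, predicate):
    """Take the top `limit` keys by similarity, then apply the filter."""
    # Step 1: nearest-neighbor search (exact here, approximate with HNSW).
    ranked = sorted(embeddings,
                    key=lambda k: cosine_similarity(query, embeddings[k]),
                    reverse=True)
    top = ranked[:limit]
    # Step 2: the filter only sees what survived step 1.
    return [k for k in top if predicate(metadata[k])]


embeddings = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.8, 0.2], "d": [0.0, 1.0]}
# Hypothetical per-key fields that a WHERE clause would test.
metadata = {
    "a": {"category": "science"},
    "b": {"category": "history"},
    "c": {"category": "science"},
    "d": {"category": "science"},
}

hits = similar_then_filter([1.0, 0.0], embeddings, metadata, limit=2,
                           predicate=lambda m: m["category"] == "science")
print(hits)
```

With limit=2 the search keeps a and b, the filter drops b, and c never gets a chance even though it matches the predicate. If a filter is selective, raising LIMIT or partitioning by collection avoids this kind of starvation.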

Cross-engine: vector + graph

Find similar embeddings that are also connected to a specific graph node:

SIMILAR [0.1, 0.2, 0.3] LIMIT 10 CONNECTED TO 'user-42'

This intersects ANN results with the graph neighborhood of user-42.

Full syntax summary

SIMILAR <key-or-vector> [LIMIT n] [METRIC COSINE|EUCLIDEAN|DOT_PRODUCT] [IN collection] [CONNECTED TO 'node-id'] [WHERE expr]

Result type: Similar (list of key + score pairs).

Cross-Engine Patterns

Vector + Graph: similarity-aware traversal

Find neighbors ranked by vector similarity:

NEIGHBORS 'node-1' BOTH BY SIMILARITY [0.1, 0.2, 0.3] LIMIT 5

This returns graph neighbors of node-1, re-ranked by cosine similarity to the given vector.
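
Conceptually, this query collects the node's neighbors and then sorts them by similarity to the supplied vector. A minimal Python sketch of that idea follows. The edge list, the embeddings dict, and the neighbors_by_similarity function are hypothetical stand-ins, and skipping neighbors that have no embedding is an assumption of this sketch rather than documented Neumann behavior.

```python
import math


def cosine_similarity(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den


def neighbors_by_similarity(node, edges, embeddings, query, limit):
    """Neighbors of `node` in either direction, ranked by cosine similarity."""
    # BOTH: follow edges regardless of direction.
    outgoing = {dst for src, dst in edges if src == node}
    incoming = {src for src, dst in edges if dst == node}
    # Assumption: neighbors without a stored embedding are skipped.
    scored = [(n, cosine_similarity(query, embeddings[n]))
              for n in outgoing | incoming if n in embeddings]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


edges = [("node-1", "node-2"), ("node-3", "node-1"), ("node-2", "node-4")]
embeddings = {
    "node-2": [0.0, 1.0],
    "node-3": [1.0, 0.0],
    "node-4": [1.0, 0.0],  # similar to the query, but not a neighbor of node-1
}

ranked = neighbors_by_similarity("node-1", edges, embeddings, [1.0, 0.0], limit=5)
print(ranked)
```

Note that node-4 is as similar to the query as node-3 but is excluded because it is two hops away: the graph decides the candidate set, and similarity only decides the order.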

Vector + Graph: connected similarity

SIMILAR [0.1, 0.2, 0.3] LIMIT 10 CONNECTED TO 'hub-node'

Restricts ANN results to embeddings whose keys correspond to nodes connected to hub-node in the graph.

Vector + Cache: semantic caching

CACHE SEMANTIC PUT 'query-text' 'response-text'
CACHE SEMANTIC GET 'similar-query-text'

Uses embedding similarity for cache lookups -- returns cached responses for semantically similar (not just exact-match) queries.
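
The idea behind a semantic cache can be sketched in Python as a lookup that compares embeddings instead of strings. Everything here is illustrative: SemanticCache, toy_embed, the fixed vocabulary, and the 0.8 threshold are hypothetical, and this guide does not state which embedding model or similarity threshold Neumann uses. A real deployment would use model-generated embeddings rather than word counts.

```python
import math

# Toy vocabulary for a bag-of-words embedding; a stand-in for a real model.
VOCABULARY = ["reset", "password", "refund", "order", "how", "my"]


def toy_embed(text):
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCABULARY]


def cosine_similarity(a, b):
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return num / den if den else 0.0  # zero vectors match nothing


class SemanticCache:
    def __init__(self, embed, threshold):
        self._embed = embed
        self._threshold = threshold  # minimum similarity that counts as a hit
        self._entries = []           # list of (vector, response)

    def put(self, query, response):
        self._entries.append((self._embed(query), response))

    def get(self, query):
        vector = self._embed(query)
        best_score, best_response = -1.0, None
        for stored, response in self._entries:
            score = cosine_similarity(vector, stored)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self._threshold else None


cache = SemanticCache(toy_embed, threshold=0.8)
cache.put("how do i reset my password", "Use the account settings page.")

print(cache.get("reset my password"))  # close enough to the stored query
print(cache.get("refund my order"))    # unrelated, so a miss
```

The threshold is the main tuning knob in this sketch: set it too low and unrelated queries receive stale answers, too high and the cache degrades to exact matching.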

Unified entities with embeddings

ENTITY CREATE 'product-1' { name: 'Widget', price: 9.99 } EMBEDDING [0.1, 0.2, 0.3]
ENTITY UPDATE 'product-1' { price: 12.99 } EMBEDDING [0.15, 0.25, 0.35]

Stores properties, graph node, and embedding in one atomic operation.

Performance Tips

Build index after bulk inserts. Call EMBED BUILD INDEX once after loading all embeddings, not after each insert. Index building is expensive and redundant if more inserts follow immediately.
Use EMBED BATCH for bulk loading. A single EMBED BATCH with 1000 pairs is faster than 1000 individual EMBED STORE calls due to reduced parsing and locking overhead.
Choose the right metric for your data:
- COSINE -- best for normalized embeddings (the output of most embedding models). Measures the angle between vectors, so it is invariant to magnitude.
- EUCLIDEAN -- best for spatial data where absolute distance matters.
- DOT_PRODUCT -- best when magnitude carries meaning (e.g., popularity-weighted embeddings). Fastest to compute but sensitive to vector norms.
Use collections for multi-tenant isolation. Rather than filtering after search, partition embeddings into collections. Each collection maintains its own index, so queries only search relevant vectors.
Keep vector dimensions consistent. All vectors in a collection should have the same dimensionality. Mixing dimensions causes runtime errors.
Always set LIMIT on large datasets. The parser accepts SIMILAR without LIMIT, but the query then returns all embeddings sorted by distance. Specify LIMIT in every production query.

Common Mistakes

Forgetting to build the index:

EMBED STORE 'k1' [0.1, 0.2]
SIMILAR [0.1, 0.2] LIMIT 5       -- slow: falls back to brute-force scan
EMBED BUILD INDEX                  -- should come before SIMILAR
SIMILAR [0.1, 0.2] LIMIT 5       -- fast: uses HNSW index

Using SIMILAR TO (wrong keyword):

SIMILAR [0.1, 0.2] LIMIT 5       -- correct
SIMILAR TO [0.1, 0.2] LIMIT 5    -- WRONG: no TO keyword in SIMILAR

Using BY instead of METRIC:

SIMILAR [0.1, 0.2] LIMIT 5 METRIC COSINE    -- correct
SIMILAR [0.1, 0.2] LIMIT 5 BY COSINE        -- WRONG: use METRIC

Unquoted keys with special characters:

EMBED STORE 'doc:1' [0.1, 0.2]    -- correct
EMBED STORE doc:1 [0.1, 0.2]      -- WRONG: colon splits into 3 tokens
EMBED STORE 'my-key' [0.1, 0.2]   -- correct
EMBED STORE my-key [0.1, 0.2]     -- WRONG: hyphen causes parse error

Searching without LIMIT on large datasets:

SIMILAR [0.1, 0.2] LIMIT 10       -- correct: bounded result set
SIMILAR [0.1, 0.2]                 -- risky: returns all embeddings