---
name: rust
version: 1.0.0
description: |
  Rust systems programming: storage engines, binary formats, SIMD, wire protocols, DataFusion/Arrow, tantivy search, HNSW vectors, arena allocation, graph engines, R-tree geo, async tokio, mmap, MVCC, zero-copy, io_uring. Use when working on any Rust crate involving database internals, storage, query execution, network protocols, search, vectors, or high-performance data structures.
allowed-tools:
  - Bash
  - Read
  - Write
  - Edit
---

- Safe Rust by default. `unsafe` only in storage hot paths: mmap, SIMD intrinsics, arena internals. Every `unsafe` block gets a `// SAFETY:` comment explaining the invariant.
- Zero-copy where possible. mmap'd segments, `&str`/`&[u8]` references into pages, SIMD on contiguous arrays. Never copy data between storage and execution without reason.
- Narrow on disk, wide in execution. Store the tightest encoding. Widen to uniform register-sized types (I64, F64) for SIMD and branch-free processing.
- Trait-based abstraction. Pluggable backends, storage providers, and distance metrics all sit behind traits. Concrete types for hot paths, `dyn Trait` only at configuration boundaries.
- Error types per crate. `thiserror` enums. Never `anyhow` in library crates, only in binaries and tests. Never `unwrap()` in production code paths.
- Async with tokio. All I/O is async. CPU-bound work (SIMD, compression, hashing) runs on `spawn_blocking` or dedicated threads. Never block the tokio runtime.
- Property tests for invariants. `proptest` for serialization round-trips, type coercion, binary format parsing. Unit tests for logic. Integration tests for cross-crate behavior.
- Workspace crate structure. One crate per subsystem. Depend downward, never circular. Storage at the bottom, server at the top.

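
As a rough illustration of several principles above (zero-copy borrows, narrow on disk and wide in execution, per-crate error enums, and `// SAFETY:` comments), here is a minimal sketch that uses only the standard library. The names `PageError`, `column_bytes`, `widen_u16_le`, and `sum_prefix` are hypothetical, and so is the page layout (a run of little-endian `u16` values). In a real library crate the error enum would presumably derive `thiserror::Error`; the impls are written by hand here only so the sketch compiles without external crates.

```rust
use std::fmt;

/// Hypothetical per-crate error enum (hand-written stand-in for a
/// `thiserror` derive, which is an external crate).
#[derive(Debug, PartialEq, Eq)]
pub enum PageError {
    /// The page holds fewer bytes than the requested column needs.
    Truncated { needed: usize, available: usize },
}

impl fmt::Display for PageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PageError::Truncated { needed, available } => {
                write!(f, "page truncated: need {needed} bytes, have {available}")
            }
        }
    }
}

impl std::error::Error for PageError {}

/// Zero-copy: returns a borrowed view of the first `count` u16 slots of the
/// page. No bytes are copied and no `unwrap()` is needed.
pub fn column_bytes(page: &[u8], count: usize) -> Result<&[u8], PageError> {
    let needed = count.saturating_mul(2);
    page.get(..needed).ok_or(PageError::Truncated {
        needed,
        available: page.len(),
    })
}

/// Narrow on disk, wide in execution: decode little-endian u16 values and
/// widen each one to i64 for uniform processing.
pub fn widen_u16_le(raw: &[u8]) -> Vec<i64> {
    raw.chunks_exact(2)
        .map(|c| i64::from(u16::from_le_bytes([c[0], c[1]])))
        .collect()
}

/// Sums the first `n` values. The unchecked read exists only to show the
/// `// SAFETY:` comment convention; a safe slice sum would normally do.
pub fn sum_prefix(values: &[i64], n: usize) -> i64 {
    assert!(n <= values.len(), "prefix length exceeds slice length");
    let mut total = 0i64;
    for i in 0..n {
        // SAFETY: `i < n` and `n <= values.len()` was asserted above,
        // so `i` is always in bounds.
        total += unsafe { *values.get_unchecked(i) };
    }
    total
}
```

A caller would typically chain these as `column_bytes(page, n).map(widen_u16_le)`, so the only allocation happens at the point where data is widened for execution.
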
Rust Systems Programming
MANDATORY FIRST RESPONSE PROTOCOL
Before writing any Rust code:
- Identify which subsystem the task touches
- Read the matching reference file(s) from the routing table
- Check crate dependencies — is this a library crate (thiserror) or binary (anyhow)?
- State the approach before implementing
Routing Table
| Task / Area | Read |
|---|---|
| Toolchain, workspace layout, key crates, Cargo conventions | references/stack.md |
| WAL, mmap, segments, MVCC, compaction, io_uring, backends | references/storage-engine.md |
| Custom on-disk formats, zero-copy parsing, packed structs | references/binary-formats.md |
| Database type system, disk/exec types, Arrow interop | references/type-system.md |
| DataFusion table providers, UDFs, Arrow RecordBatch, planning | references/datafusion-arrow.md |
| pgwire server, MySQL protocol, dialect mapping, ORM compat | references/wire-protocols.md |
| tantivy FTS, HNSW vectors, SIMD distance, hybrid search | references/search-vector.md |
| bumpalo arenas, graph adjacency, traversal algorithms | references/arena-graph.md |
| R-tree spatial index, geo predicates, WGS84 distance | references/geo-rtree.md |
| tokio async, io_uring, crossbeam, lock-free structures, MVCC | references/async-concurrency.md |
| proptest, deterministic testing, integration test patterns | references/testing.md |
| thiserror hierarchies, unsafe patterns, safety invariants | references/error-unsafe.md |
Multiple tasks? Read multiple files.
Quick Rules
- `cargo fmt` before commit. `cargo clippy -- -D warnings` must pass.
- Explicit `use` imports; no glob imports except in test modules.
- `#[must_use]` on functions returning `Result` or computed values.
- `#[inline]` only on small functions called in hot loops, never on public API.
- Feature flags for optional crate dependencies: `#[cfg(feature = "tantivy")]`.
- `Send + Sync` bounds on all trait objects that cross async boundaries.
- Prefer `&[u8]` over `Vec<u8>` in function signatures. Own data at boundaries, borrow inside.
- `#[derive(Debug)]` on all public types. `#[derive(Clone)]` only when cheap.
- Document all public items. `//!` module docs. `///` item docs with examples.
- Integration tests in `tests/` directory. Unit tests in `#[cfg(test)] mod tests`.
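
A few of the rules above can be sketched together in one small, std-only example: `Send + Sync` on a trait that is used as a trait object, borrowed slices in signatures, `#[must_use]` on a computed value, and `#[derive(Debug)]` on a public type. The names `Metric`, `SquaredL2`, `checksum`, and `distance_on_worker` are hypothetical. A plain `std::thread` stands in for a tokio task here, on the assumption that the same `Send` requirement applies when a value moves into a spawned task.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical pluggable distance metric. The `Send + Sync` supertraits
/// let `Arc<dyn Metric>` move across thread (or task) boundaries.
pub trait Metric: Send + Sync {
    /// Distance between two equal-length vectors, borrowed rather than owned.
    fn distance(&self, a: &[f32], b: &[f32]) -> f32;
}

/// Squared Euclidean distance. `Clone` is derived because the type is a
/// zero-sized marker, so cloning is free.
#[derive(Debug, Clone, Copy)]
pub struct SquaredL2;

impl Metric for SquaredL2 {
    fn distance(&self, a: &[f32], b: &[f32]) -> f32 {
        a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
    }
}

/// Toy rolling checksum over a borrowed byte slice. `#[must_use]` makes the
/// compiler warn if a caller discards the computed value.
#[must_use]
pub fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Owns its inputs at the boundary (they move into the worker) and borrows
/// them inside. Returns `None` if the worker thread panicked.
pub fn distance_on_worker(metric: Arc<dyn Metric>, a: Vec<f32>, b: Vec<f32>) -> Option<f32> {
    thread::spawn(move || metric.distance(&a, &b)).join().ok()
}
```

Removing `Send + Sync` from the trait definition makes `distance_on_worker` fail to compile, because `Arc<dyn Metric>` could then no longer be moved into the spawned closure.
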