neo4j-cypher-patterns


Expert guide to Neo4j Cypher queries and SKUEL's graph patterns. Use when writing Cypher queries, optimizing graph traversals, understanding relationship types, analyzing query performance, or when the user mentions Cypher, Neo4j, graph queries, or asks about relationships between entities.

By linguistic76 · Updated 6/13/2026

name: neo4j-cypher-patterns
allowed-tools: Read, Grep, Glob

Neo4j Cypher Patterns for SKUEL

Quick Start

SKUEL uses Neo4j as its graph database with an Entity Type Architecture. All domains flow toward LifePath (the destination).

Entity Labels (Neo4j Node Labels)

All domain entities use multi-label architecture: every entity gets :Entity (universal base) plus a domain-specific label. Match on the domain label for fast indexed queries, or :Entity for cross-domain queries.

| Domain | Label | UID Format | Example |
|---|---|---|---|
| Activity (6): user-owned | | | |
| Tasks | Task | task_{slug}_{random} | task_fix-bug_abc123 |
| Goals | Goal | goal_{slug}_{random} | goal_launch-product_def456 |
| Habits | Habit | habit_{slug}_{random} | habit_daily-run_xyz789 |
| Events | Event | event_{slug}_{random} | event_team-standup_ghi012 |
| Choices | Choice | choice_{slug}_{random} | choice_accept-offer_jkl345 |
| Principles | Principle | principle_{slug}_{random} | principle_small-steps_mno678 |
| Curriculum (4): shared content | | | |
| Knowledge Units | Ku | ku_{slug}_{random} | ku_python-basics_abc123 |
| Path Steps | PathStep | ps:{random} | ps:intro-to-python |
| Learning Paths | LearningPath | lp:{random} | lp:become-python-developer |
| Exercises | Exercise | varies | |
| Ontology: shared taxonomy | | | |
| Knowledge Domains | KnowledgeDomain | kd.{domain_name} | kd.self_awareness |
| Curated Content: shared content | | | |
| Resources | Resource | (no fixed format) | |
| User-authored content + Reports (3): ADR-054 | | | |
| User Entries | UserEntry | ue_{slug}_{random} | ue_my-essay_abc123 |
| Activity Reports | ActivityReport | ar_{random} | |
| Entry Reports | EntryReport | sr_{random} | |
| Destination | | | |
| Life Path | LifePath | lp_{random} | lp_abc123 |
| Other | | | |
| Users | User | user_{name} | user_mike |
| Finance | Expense | expense_{random} | expense_abc123 |
| Groups | Group | group_{slug}_{random} | |
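The `{prefix}_{slug}_{random}` shape used by most rows above can be sketched in Python. This is only an illustration of the format: `slugify`, `make_uid`, and the 6-character hex suffix are assumptions, not SKUEL's actual UID generator.

```python
import re
import secrets


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def make_uid(prefix: str, title: str, suffix_len: int = 6) -> str:
    """Build a UID shaped like {prefix}_{slug}_{random} (hypothetical helper)."""
    # token_hex(n) yields 2n hex characters, so trim to the requested length.
    suffix = secrets.token_hex(suffix_len)[:suffix_len]
    return f"{prefix}_{slugify(title)}_{suffix}"


uid = make_uid("task", "Fix bug")
print(uid)  # shape: task_fix-bug_<6 hex chars>
```

Because the slug uses hyphens and the separators are underscores, the prefix and random suffix can be recovered with a plain split on `_`.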

Core Relationships (Most Common)

// Ownership - Universal OWNS relationship (all Activity Domains)
(user:User)-[:OWNS]->(task:Task)
(user:User)-[:OWNS]->(goal:Goal)
(user:User)-[:OWNS]->(habit:Habit)

// Knowledge application
(task:Task)-[:APPLIES_KNOWLEDGE]->(ku:Ku)
(goal:Goal)-[:REQUIRES_KNOWLEDGE]->(ku:Ku)
(habit:Habit)-[:REINFORCES_KNOWLEDGE]->(ku:Ku)

// Goal hierarchy
(task:Task)-[:FULFILLS_GOAL]->(goal:Goal)
(habit:Habit)-[:SUPPORTS_GOAL]->(goal:Goal)
(goal:Goal)-[:SUBGOAL_OF]->(parent:Goal)

// Knowledge structure
(ku:Ku)-[:REQUIRES_KNOWLEDGE]->(prereq:Ku)
(ku:Ku)-[:ENABLES_KNOWLEDGE]->(enabled:Ku)
(ku:Ku)-[:RELATED_TO]->(related:Ku)

// MOC organization — any Ku can organize other Kus (emergent identity)
(moc:Ku)-[:ORGANIZES {order: 1}]->(child:Ku)

// Resource citations — curriculum cites reference material
(ps:PathStep)-[:CITES_RESOURCE]->(r:Resource)
(ku:Ku)-[:CITES_RESOURCE]->(r:Resource)

// Domain taxonomy (World Layer)
(ku:Ku)-[:IN_DOMAIN]->(d:KnowledgeDomain)

// Principles guidance
(goal:Goal)-[:GUIDED_BY_PRINCIPLE]->(principle:Principle)
(choice:Choice)-[:ALIGNED_WITH_PRINCIPLE]->(principle:Principle)

// Life path (everything flows toward the life path)
// Designation flips entity_type on the LP node — it does NOT add a :LifePath
// label, so match by property, never by label.
(user:User)-[:ULTIMATE_PATH]->(lp:Entity {entity_type: 'life_path'})
(entity:Entity)-[:SERVES_LIFE_PATH]->(lp:Entity {entity_type: 'life_path'})

Query Patterns

Pattern 1: Get User's Entities

// Get all active tasks for a user via universal OWNS relationship
MATCH (u:User {uid: $user_uid})-[:OWNS]->(t:Task)
WHERE t.status IN ['pending', 'in_progress']
RETURN t
ORDER BY t.priority DESC, t.due_date ASC

Pattern 2: Entity with Graph Context

// Get task with its full neighborhood
MATCH (t:Task {uid: $uid})
OPTIONAL MATCH (t)-[:APPLIES_KNOWLEDGE]->(ku:Ku)
OPTIONAL MATCH (t)-[:FULFILLS_GOAL]->(g:Goal)
OPTIONAL MATCH (t)-[:DEPENDS_ON]->(dep:Task)
RETURN t,
       collect(DISTINCT ku) as applied_knowledge,
       collect(DISTINCT g) as goals,
       collect(DISTINCT dep) as dependencies

Pattern 3: Relationship Traversal

// Find all knowledge required for a goal (including transitive)
MATCH (g:Goal {uid: $goal_uid})
MATCH path = (g)-[:REQUIRES_KNOWLEDGE*1..3]->(ku:Ku)
// Aggregate first: after RETURN DISTINCT, ORDER BY can no longer see `path`
WITH ku, min(length(path)) AS depth
RETURN ku
ORDER BY depth

Pattern 4: Graph-Aware Search

// Search tasks with relationship filter
MATCH (t:Task)
WHERE t.title CONTAINS $query OR t.description CONTAINS $query
OPTIONAL MATCH (t)-[:APPLIES_KNOWLEDGE]->(ku:Ku)
WITH t, collect(ku) as knowledge
WHERE size(knowledge) > 0  // Only tasks that apply knowledge
RETURN t, knowledge

Pattern 5: User Learning Progress

// Get user's mastery state for knowledge units
MATCH (u:User {uid: $user_uid})-[r:MASTERED|IN_PROGRESS|VIEWED]->(ku:Curriculum)
RETURN ku.uid,
       type(r) as status,
       r.mastery_score as score,
       r.mastered_at as mastered_at

Query Builders (SKUEL Infrastructure)

SKUEL has two query builders for domain services (SKUEL001: no APOC in domain services):

| Builder | Location | Use Case |
|---|---|---|
| UnifiedQueryBuilder | adapters/persistence/neo4j/query/ | Generic CRUD (used by backends) |
| CypherGenerator | adapters/persistence/neo4j/query/cypher/ | Pure Cypher, semantic traversal |

SKUEL001 linter rule: APOC is scoped to apoc.meta.* (schema introspection only). Domain services use pure Cypher — never APOC in core/services/.

Three-Layer Architecture

Layer 1: UniversalNeo4jBackend (Generic CRUD)
├── Uses UnifiedQueryBuilder for generic operations
└── Powers ALL 20 entity types with CRUD, search, relationships

Layer 2: Domain Backends (Domain-Specific Cypher)
├── 27 typed subclasses in backends/ (9 cluster files — import directly from the cluster file)
├── 13 standalone backends (CrossDomainBackend, UserBackend, UserProgressBackend, SessionBackend, InsightBackend, LifePathBackend, ZpdBackend, ZpdSnapshotBackend, VectorSearchBackend, IngestionBackend, JupyterSyncBackend, EmbeddingsBackend, KnowledgeDomainBackend)
├── Domain-specific relationship Cypher (ORGANIZES, SHARES_WITH, FULFILLS_EXERCISE, etc.)
└── Rule: If a Cypher query uses domain-specific relationships, it belongs here

Layer 3: Services (Business Logic + Cross-Domain Aggregation)
├── Domain services delegate to backend methods, NOT execute_query()
├── Two service-layer Cypher exceptions (both use QueryExecutor directly):
│   ├── user_context_queries.py — MEGA-QUERY (full user state snapshot)
│   └── CrossDomainQueryService — 9 targeted cross-domain reads (returns frozen typed dataclasses)
└── Orchestration, events, validation — no other inline Cypher

Filter Operators

All query builders support these operators:

| Operator | Usage | Cypher Output |
|---|---|---|
| eq (default) | priority='high' | n.priority = 'high' |
| gt | due_date__gt=date | n.due_date > $date |
| lt | hours__lt=5.0 | n.hours < 5.0 |
| gte | due_date__gte=date | n.due_date >= $date |
| lte | score__lte=8 | n.score <= 8 |
| contains | title__contains='urgent' | n.title CONTAINS 'urgent' |
| in | priority__in=['high', 'urgent'] | n.priority IN ['high', 'urgent'] |
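The `field__operator=value` convention in the table can be illustrated with a small translator. This is a sketch under stated assumptions: `build_where` and its parameter-naming scheme are hypothetical, and unlike the table's "Cypher Output" column it always emits `$parameters` rather than inline literals.

```python
import re
from typing import Any

# Operator suffix -> Cypher comparison, mirroring the table above.
_OPERATORS = {
    "eq": "=", "gt": ">", "lt": "<", "gte": ">=", "lte": "<=",
    "contains": "CONTAINS", "in": "IN",
}
_SAFE_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def build_where(alias: str = "n", **filters: Any) -> tuple[str, dict[str, Any]]:
    """Translate field__op=value kwargs into a WHERE fragment plus parameters."""
    clauses: list[str] = []
    params: dict[str, Any] = {}
    for key, value in filters.items():
        field, _, op = key.partition("__")
        op = op or "eq"  # no suffix means equality
        if op not in _OPERATORS:
            raise ValueError(f"unknown operator: {op!r}")
        if not _SAFE_NAME.fullmatch(field):
            raise ValueError(f"unsafe field name: {field!r}")
        param = f"{field}_{op}"
        clauses.append(f"{alias}.{field} {_OPERATORS[op]} ${param}")
        params[param] = value
    return " AND ".join(clauses), params


where, params = build_where(priority="high", due_date__gte="2026-01-01")
print(where)
print(params)
```

Only the field name is interpolated into the query text, and only after a regex check; every value travels as a parameter.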

Intent-Based Traversal

All 9 domains (6 Activity + Ku/Ps/Lp) read graph context through mechanism B: the shared _CoreIntelligenceMixin.get_with_context → UnifiedRelationshipService.get_with_context. The edge vocabulary is registry-sourced from DomainConfig.cross_domain_relationship_types (the single source of truth); there is no per-domain get_suggested_query_intent() (deleted) and no per-domain {Domain}RelationshipService subclass.

Both graph readers (query_with_intent and get_cross_domain_context) now run ONE incident-edge-attributed producer (build_domain_context_with_paths); the old flat build_context_query_for_intent is deleted (PR #243). For a non-registry caller, QueryIntent / a domain's default_context_intent selects the edge slice from _INTENT_EDGE_SETS (in cross_domain_backend). Those slices:

| Intent | Focus Relationships |
|---|---|
| HIERARCHICAL | HAS_CHILD, PARENT_OF, CHILD_OF |
| PREREQUISITE | REQUIRES_KNOWLEDGE, PREREQUISITE_FOR, ENABLES |
| PRACTICE | PRACTICES, REINFORCES, APPLIES_KNOWLEDGE |
| GOAL_ACHIEVEMENT | FULFILLS_GOAL, SUPPORTS_GOAL, SUBGOAL_OF, GUIDED_BY_PRINCIPLE, CONTRIBUTES_TO_GOAL |
| else (EXPLORATORY/SPECIFIC/AGGREGATION/RELATIONSHIP) | generic traversal, no edge filter |

See: docs/roadmap/intent-traversal-registry-convergence.md (authoritative).
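The intent-to-edge-slice lookup described above might look roughly like the following. It is a simplified stand-in: the `QueryIntent` members, the `INTENT_EDGE_SETS` dict, and `relationship_pattern` are assumptions written for illustration, not the real `_INTENT_EDGE_SETS` in cross_domain_backend.

```python
from enum import Enum


class QueryIntent(Enum):
    """Hypothetical subset of intents; the real enum may differ."""
    HIERARCHICAL = "hierarchical"
    PREREQUISITE = "prerequisite"
    PRACTICE = "practice"
    GOAL_ACHIEVEMENT = "goal_achievement"
    EXPLORATORY = "exploratory"


# Edge slices copied from the table above.
INTENT_EDGE_SETS = {
    QueryIntent.HIERARCHICAL: ("HAS_CHILD", "PARENT_OF", "CHILD_OF"),
    QueryIntent.PREREQUISITE: ("REQUIRES_KNOWLEDGE", "PREREQUISITE_FOR", "ENABLES"),
    QueryIntent.PRACTICE: ("PRACTICES", "REINFORCES", "APPLIES_KNOWLEDGE"),
    QueryIntent.GOAL_ACHIEVEMENT: (
        "FULFILLS_GOAL", "SUPPORTS_GOAL", "SUBGOAL_OF",
        "GUIDED_BY_PRINCIPLE", "CONTRIBUTES_TO_GOAL",
    ),
}


def relationship_pattern(intent: QueryIntent) -> str:
    """Return a Cypher relationship pattern filtered to the intent's edges."""
    edges = INTENT_EDGE_SETS.get(intent)
    if not edges:
        return "-[r]-"  # generic traversal, no edge filter
    return f"-[r:{'|'.join(edges)}]-"


print(relationship_pattern(QueryIntent.PREREQUISITE))
print(relationship_pattern(QueryIntent.EXPLORATORY))
```

Intents without an entry fall through to the unfiltered pattern, matching the "else" row of the table.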

Index Architecture (Bootstrap)

Neo4j indexes are created automatically at startup via Neo4jSchemaManager in services_bootstrap/compose.py:

| Index Type | Method | When Created | Purpose |
|---|---|---|---|
| Domain indexes | sync_domain_indexes() | Always | UID, user_uid, status, date, composite (48 indexes) |
| Full-text indexes | sync_fulltext_indexes() | Always | Lucene keyword search across 15 domains; Cypher-first foundation |
| Auth indexes | sync_auth_indexes() | Always | Rate limiting, session lookup, email uniqueness |
| Vector indexes | sync_vector_indexes() | FULL tier only | 1024-dim cosine similarity on Entity + ContentChunk |

Full-text indexes are the Cypher-first search foundation — always available, no embeddings needed:

// Full-text search (Lucene-based, relevance-ranked)
CALL db.index.fulltext.queryNodes('task_fulltext_idx', 'urgent deadline')
YIELD node, score
RETURN node.uid, node.title, score

// Vector search (FULL tier only, 1024-dim BAAI/bge-large-en-v1.5)
CALL db.index.vector.queryNodes('entity_embedding_idx', 10, $embedding)
YIELD node, score
RETURN node.uid, node.title, score

All DDL is idempotent (IF NOT EXISTS) — safe on every startup.

Best Practices

1. Always Use Parameters

// GOOD - parameterized
MATCH (t:Task {uid: $uid})

// BAD - string interpolation (Cypher injection risk)
MATCH (t:Task {uid: '${uid}'})

Exception: labels, property names, and relationship types cannot be parameterized in Neo4j. SKUEL validates all interpolated values at the infrastructure boundary:

# Shared guards in _helpers.py (used by all 5 query builder modules)
from adapters.persistence.neo4j.query.cypher._helpers import validate_label, validate_identifier
validate_label(label)             # raises ValueError if not a known NeoLabel value
validate_identifier(field)        # raises ValueError if not a safe identifier (^[a-zA-Z_][a-zA-Z0-9_]*$)

# Relationship types — also validated in _build_direction_pattern() (single choke point for mixin Cypher)
# Uses validate_relationship_type() from core/utils/validation_helpers.py
# Accepts RelationshipName enum values OR safe identifiers

# Field names — validated in _search_mixin.py, _user_entity_mixin.py, and all query builders
from core.utils.validation_helpers import validate_field_name
validate_field_name(name)    # regex check, max 64 chars

Coverage: All 5 query builder modules (crud_queries.py, domain_queries.py, relationship_queries.py, semantic_queries.py, intelligence_queries.py) validate labels, field names, relationship types, and property keys before f-string interpolation. _build_direction_pattern() is the single choke point for mixin-level relationship Cypher (get_related_entities, get_related_uids, count_related). traverse() and find_path() validate pipe-separated patterns.

The same pattern applies to DDL (vector indexes, schema creation) — validate label, field_name, and similarity before building the query string. See adapters/persistence/neo4j/neo4j_schema_manager.py for the pattern.
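The validate-then-interpolate pattern described above can be sketched as follows. The regex is the one quoted earlier; everything else is an assumption for illustration: these `validate_identifier` and `validate_label` functions are local stand-ins (not the real helpers in `_helpers.py`), `KNOWN_LABELS` is a made-up subset, and `find_by_field` is hypothetical.

```python
import re

# Safe-identifier rule quoted above; fullmatch avoids the trailing-newline
# loophole that `$` has with re.match.
_IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")

# Hypothetical subset standing in for the NeoLabel values.
KNOWN_LABELS = frozenset({"Task", "Goal", "Habit", "Ku", "Entity"})


def validate_identifier(name: str, max_len: int = 64) -> str:
    """Return the name unchanged, or raise ValueError if it is not a safe identifier."""
    if len(name) > max_len or not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def validate_label(label: str) -> str:
    """Return the label unchanged, or raise ValueError if it is not a known label."""
    if label not in KNOWN_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    return label


def find_by_field(label: str, field: str) -> str:
    """Interpolate only validated names; the value stays a $parameter."""
    return (
        f"MATCH (n:{validate_label(label)}) "
        f"WHERE n.{validate_identifier(field)} = $value RETURN n"
    )


print(find_by_field("Task", "status"))
```

An input such as `status} RETURN 1 //` fails the identifier check before any query string is built.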

2. Use OPTIONAL MATCH for Nullable Relationships

// GOOD - returns task even without knowledge
MATCH (t:Task {uid: $uid})
OPTIONAL MATCH (t)-[:APPLIES_KNOWLEDGE]->(ku:Curriculum)

// RISKY - returns nothing if no knowledge relationship
MATCH (t:Task {uid: $uid})-[:APPLIES_KNOWLEDGE]->(ku:Curriculum)

3. Use COLLECT to Prevent Cartesian Products

// GOOD - one row per task
MATCH (t:Task {uid: $uid})
OPTIONAL MATCH (t)-[:APPLIES_KNOWLEDGE]->(ku:Curriculum)
OPTIONAL MATCH (t)-[:FULFILLS_GOAL]->(g:Goal)
RETURN t, collect(DISTINCT ku) as knowledge, collect(DISTINCT g) as goals

// BAD - cartesian product of knowledge × goals
MATCH (t:Task {uid: $uid})
OPTIONAL MATCH (t)-[:APPLIES_KNOWLEDGE]->(ku:Curriculum)
OPTIONAL MATCH (t)-[:FULFILLS_GOAL]->(g:Goal)
RETURN t, ku, g
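To make the row counts concrete, here is a plain-Python simulation of the two RETURN shapes, using made-up UIDs. It models Cypher's behavior rather than calling Neo4j: two OPTIONAL MATCH clauses produce one intermediate row per (ku, g) pair, and `collect(DISTINCT ...)` folds those rows back into one row per task.

```python
from itertools import product

# Illustrative data: a task linked to 3 knowledge units and 2 goals.
knowledge = ["ku_python-basics", "ku_testing", "ku_git"]
goals = ["goal_launch-product", "goal_learn-python"]

# RETURN t, ku, g -> one row per (ku, g) combination: 3 x 2 = 6 rows.
flat_rows = [("task", ku, g) for ku, g in product(knowledge, goals)]

# RETURN t, collect(DISTINCT ku), collect(DISTINCT g) -> a single row;
# DISTINCT removes the duplicates the intermediate product introduced.
collected_row = (
    "task",
    sorted({ku for _, ku, _ in flat_rows}),
    sorted({g for _, _, g in flat_rows}),
)

print(len(flat_rows))  # 6
print(collected_row)
```

The DISTINCT matters: without it each knowledge unit would appear once per goal in the collected list.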

4. Use RelationshipName Enum (SKUEL013)

from core.models.relationship_names import RelationshipName

# GOOD - type-safe, IDE autocomplete
query = f"MATCH (a)-[:{RelationshipName.REQUIRES_KNOWLEDGE.value}]->(b)"

# GOOD - multi-line with Neo4j property maps (escape braces!)
query = f"""
MATCH (parent:Entity {{uid: $uid}})-[:{RelationshipName.HAS_SUBTASK.value}]->(child)
RETURN child
"""

# BAD - typo-prone, no compile-time check
query = "MATCH (a)-[:REQURES_KNOWLEDGE]->(b)"  # typo!
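A self-contained sketch of why the enum helps: a misspelled member fails at attribute lookup (and under a type checker) instead of producing a query that silently matches nothing. The two-member `RelationshipName` below is a hypothetical stand-in for the real enum, and `children_query` is an illustrative helper.

```python
from enum import Enum


class RelationshipName(str, Enum):
    """Hypothetical two-member stand-in for the real RelationshipName enum."""
    REQUIRES_KNOWLEDGE = "REQUIRES_KNOWLEDGE"
    HAS_SUBTASK = "HAS_SUBTASK"


def children_query(rel: RelationshipName) -> str:
    """Build a traversal query; doubled braces emit a literal Cypher property map."""
    return f"MATCH (parent:Entity {{uid: $uid}})-[:{rel.value}]->(child) RETURN child"


query = children_query(RelationshipName.HAS_SUBTASK)

# A typo'd member name raises immediately instead of reaching the database.
try:
    RelationshipName.REQURES_KNOWLEDGE
    typo_caught = False
except AttributeError:
    typo_caught = True

print(query)
print(typo_caught)  # True
```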

5. Check Ownership for Multi-Tenant Security

// GOOD - ownership verified via universal OWNS relationship
MATCH (u:User {uid: $user_uid})-[:OWNS]->(t:Task {uid: $task_uid})
RETURN t

// BAD - no ownership check (security risk)
MATCH (t:Task {uid: $task_uid})
RETURN t

Note: The OWNS relationship is the universal ownership pattern. Domain-specific variants (HAS_TASK, HAS_GOAL, etc.) exist in RelationshipName but OWNS is what the backends use.

6. Per-Query Server-Side Timeout (TimedDriver)

Every query through the shared driver carries a server-side per-tx ceiling. Default 120s (env NEO4J_TRANSACTION_TIMEOUT); a runaway is aborted by the Neo4j server, not by the client hanging. Bulk ingestion is already wrapped to 600s; MEGA-QUERY and analytics inherit the default. Startup DDL (Neo4jSchemaManager) is intentionally untimed (raw driver).

If a specific query legitimately needs longer, wrap the call site:

from adapters.persistence.neo4j.timed_driver import (
    neo4j_query_timeout,
    unbounded_neo4j_query_timeout,
)

# Bound the enclosed block to 300s instead of the default 120s:
with neo4j_query_timeout(300.0):
    async with self._driver.session() as session:
        result = await session.run(long_running_aggregation, params)

# Escape hatch for one-off admin maintenance through the wrapped driver:
with unbounded_neo4j_query_timeout():
    ...

Rule: Don't wrap by default — the 120s ceiling exists to catch unintended runaways (a Cartesian explosion, a typo'd MATCH with no anchor). Only wrap when you know the query is legitimately long-running. The with block MUST enclose the full await chain — the override is a ContextVar read at call time, so awaited work outside the block is unbounded by it.

See: docs/patterns/NEO4J_QUERY_TIMEOUT.md for the override mechanism, when-to-wrap table, and the ContextVar + asyncio.create_task caveat.

7. Schema-Change Monitoring (opt-in)

SchemaChangeDetector (core/services/schema_change_detector.py) fingerprints the live Neo4j schema (labels, indexes, constraints, relationship types) and, on drift, invalidates the adapter's lazily-built query-optimization caches (_index_aware_builder, _enhanced_templates) via the auto-registered AdaptiveOptimizationHandler.

It is exposed as an on-demand capability on the adapter — Neo4jAdapter.check_schema_changes(), initialize_schema_monitoring(), stop_schema_monitoring() — and is wired into the composition root as an opt-in background poll:

# In .env — both default off / 900s
NEO4J_SCHEMA_MONITORING=true            # start the background poll at startup
NEO4J_SCHEMA_MONITORING_INTERVAL=900    # poll interval (seconds); must be ≥ 1
  • Off by default. Gated by config.database.schema_monitoring_enabled, not by INTELLIGENCE_TIER — it's plain graph infrastructure (no API calls), so it can run in either tier. Keeping it off by default preserves the CORE-tier "no background workers" guarantee.
  • Where it's wired. services_bootstrap/compose.py calls initialize_schema_monitoring() right after the startup DDL sync (so it baselines against the freshly-synced schema); shutdown_skuel calls stop_schema_monitoring(). The detector owns its own asyncio poll task, which lives on the single loop shared by bootstrap and server.serve().
  • Non-fatal. A failed start warns and continues — monitoring is an optimization, never a correctness gate.
  • Interval is validated at the env boundary (DatabaseConfig.from_env rejects values < 1): a non-positive interval is truthy and would make asyncio.sleep(<=0) busy-spin Neo4j introspection.

Rule: Don't enable it where the schema is static after startup DDL (the common case) — it catches nothing at runtime and adds periodic introspection load. Enable it only where schema genuinely drifts mid-session.
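The env-boundary check on the poll interval can be sketched as below. `parse_monitoring_interval` is a hypothetical function written for illustration; the real validation lives in `DatabaseConfig.from_env`, whose exact signature is not shown here.

```python
def parse_monitoring_interval(env: dict[str, str], default: int = 900) -> int:
    """Read NEO4J_SCHEMA_MONITORING_INTERVAL, rejecting values below 1 second."""
    raw = env.get("NEO4J_SCHEMA_MONITORING_INTERVAL")
    if raw is None:
        return default
    interval = int(raw)  # raises ValueError on non-numeric input
    # A negative interval is truthy, so an `if interval:` guard would let it
    # through; compare explicitly instead.
    if interval < 1:
        raise ValueError(f"interval must be >= 1 second, got {interval}")
    return interval


print(parse_monitoring_interval({}))                                          # 900
print(parse_monitoring_interval({"NEO4J_SCHEMA_MONITORING_INTERVAL": "60"}))  # 60
```

Passing the environment in as a dict keeps the check easy to test; a caller would typically hand it `dict(os.environ)`.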

8. Coerce string-stored temporals in comparisons

Date/datetime fields are stored as ISO strings (DTO .isoformat()), so comparing them directly to date()/datetime() yields null and silently drops rows. Wrap the stored side: datetime(n.created_at) >= datetime($w). datetime() is universally safe (parses date and datetime strings, no-op on natives); date() errors on a datetime string → use date(datetime(field)). The writer decides the type: DTO .isoformat() → string (coerce); Cypher = datetime() → native (leave). See PATTERNS.md Pattern 10 + Key Rules #17–18.
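A short Python illustration of the writer side and the coerced comparison. The timestamp is made up, and `coerced_compare` is a hypothetical helper that only builds the query fragment quoted above; it does not execute anything against Neo4j.

```python
from datetime import datetime, timezone

# What a DTO writes: .isoformat() produces a plain string, not a temporal.
created_at = datetime(2026, 6, 13, 9, 30, tzinfo=timezone.utc)
stored = created_at.isoformat()

_COMPARISONS = {"<", "<=", ">", ">=", "="}


def coerced_compare(field: str, op: str, param: str, alias: str = "n") -> str:
    """Wrap both sides in datetime() so a string-stored field compares as a temporal."""
    if op not in _COMPARISONS:
        raise ValueError(f"unsupported comparison: {op!r}")
    return f"datetime({alias}.{field}) {op} datetime(${param})"


print(type(stored).__name__, stored)
print(coerced_compare("created_at", ">=", "w"))
```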

9. Relationship reads/writes go through real, config-keyed methods

UnifiedRelationshipService has no __getattr__ — calling a method it doesn't define is an AttributeError, and get_related_uids(method_key, uid) takes an exact DomainRelationshipConfig method-key that fails closed on a typo. Don't invent get_<x>_<y> methods or guess keys; never trust a mocked relationship service (it resolves any attribute). See /docs/patterns/UNIFIED_RELATIONSHIP_SERVICE.md § Phantom methods & keys.
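The warning about mocked relationship services is easy to demonstrate with the standard library. `RelationshipService` below is a hypothetical one-method stand-in, and `get_task_goals` is a deliberately invented phantom method; the `spec=` behavior of `MagicMock` is standard `unittest.mock`.

```python
from unittest.mock import MagicMock


class RelationshipService:
    """Hypothetical stand-in: one real method and no __getattr__ fallback."""

    def get_related_uids(self, method_key: str, uid: str) -> list[str]:
        return []


def has_phantom(service: object) -> bool:
    """Report whether the service resolves a method nobody ever defined."""
    try:
        service.get_task_goals  # invented name; not on the real class
    except AttributeError:
        return False
    return True


print(has_phantom(RelationshipService()))                 # False: fails closed
print(has_phantom(MagicMock()))                           # True: bare mock resolves anything
print(has_phantom(MagicMock(spec=RelationshipService)))   # False: spec restores the check
```

Building test doubles with `spec=` (or `create_autospec`) keeps a phantom method from passing in tests and failing in production.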

Additional Resources

Related Skills

Deep Dive Resources

Architecture:

Patterns:

Code:

  • /core/models/relationship_names.py - RelationshipName enum (source of truth for all 80+ relationship types)

Foundation

This skill has no prerequisites. It is a foundational pattern.

See Also

  • /docs/patterns/query_architecture.md - Query architecture documentation
  • /docs/patterns/query_architecture.md - Database architecture
  • /core/models/relationship_names.py - RelationshipName enum (source of truth)
Install via CLI
npx skills add https://github.com/linguistic76/skuel --skill neo4j-cypher-patterns