epistemic-transaction

name: epistemic-transaction
description: "Use when starting complex work, planning implementation, breaking down tasks, creating specs, or when the user says 'plan this as transactions', 'plan transactions', 'break this down', 'create a spec', 'how should I approach this', 'transaction plan', or mentions needing a structured approach to multi-step work. This skill guides the full epistemic workflow from task decomposition through measured execution. Prefer this over EnterPlanMode for non-trivial tasks."
version: 1.1.0

Epistemic Transaction Planning

Turn tasks into measured work. This skill guides you through decomposing work into epistemic transactions — measured chunks where investigation and implementation happen together, artifacts are recorded, and learning compounds across boundaries.

Plan Transactions Mode (Interactive)

When a user asks to plan work, or when you face a non-trivial task, use this interactive mode instead of EnterPlanMode. It produces structured, measurable plans with executable commands rather than generic step lists.

How to Run

1. Interview: clarify the task using AskUserQuestion
2. Explore: read the codebase areas involved (Glob, Grep, Read)
3. Decompose: break into goals with empirica goals-create
4. Plan: generate a transaction plan with estimated vectors
5. Output: present as a structured plan with executable commands

Step P1: Interview the Task

Use AskUserQuestion to clarify before decomposing. Key questions:

What to Ask	Why
What is the end state?	Defines completion criteria
What constraints exist?	Bounds the solution space
Are there dependencies on other work?	Orders transactions
What areas of the codebase are involved?	Scopes investigation
What's the risk tolerance?	Determines noetic depth

Don't over-interview: ask two or three focused questions at most. If the task is already clear, skip to P2.

Step P2: Explore and Log

Use read-only tools to explore. Log everything you find:

# What you discover
empirica finding-log --finding "Auth module uses middleware pattern at routes/auth.py" --impact 0.5

# What you don't know
empirica unknown-log --unknown "How does the session store handle concurrent access?"

# What you're assuming
empirica assumption-log --assumption "Database migrations run automatically" --confidence 0.6 --domain infrastructure

Step P3: Decompose into Goals (and tasks)

A goal is one coherent deliverable. Tasks are the AI-tracked units of work inside a goal — distinct steps that each end in a commit / test result / verifiable evidence.

The structural shape (Linear / GitHub / Jira convention):

objective is title-shaped (≤256 chars). Short, actionable.
description is the rich body (≤8000 chars, optional). Context, motivation, success criteria, links.

# Title-only goal — fine for small scope
empirica goals-create --objective "Implement auth middleware"

# Goal with rich description — when the why matters or success
# criteria need to be explicit
empirica goals-create \
  --objective "Implement auth middleware" \
  --description "Routes need JWT-based auth. Out-of-scope: session
storage (separate goal). Success: all routes except /health
require valid JWT, role-based guards work, unit tests pass.
References: RFC 7519, prior decision deead8f2 on bcrypt."

# Decompose into tasks — one per distinct unit of AI work
empirica goals-add-task --goal-id <ID> --description "Read existing middleware chain"
empirica goals-add-task --goal-id <ID> --description "Implement JWT validation middleware"
empirica goals-add-task --goal-id <ID> --description "Add role-based guards"
empirica goals-add-task --goal-id <ID> --description "Write unit tests + commit"

When to decompose into tasks (vs single-shot goal):

Multi-file work → one task per file or logical unit
Investigation followed by implementation → one task per phase
Anything that will produce ≥2 commits → tasks make per-commit evidence linkage explicit
Anything you'd otherwise track with TodoWrite → log as tasks instead, so the work is grounded in evidence that calibration can verify

As you complete each task, close it with evidence:

empirica goals-complete-task \
  --task-id <ID> \
  --evidence "Commit abc1234: JWT validation middleware + unit tests passing"

The --evidence field is what makes tasks grounded AI work rather than self-reported progress. Tie it to a commit SHA, test result, or file path — something deterministic that grounded calibration can verify.
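
As a rough illustration of "something deterministic", the sketch below checks whether an evidence string names a commit SHA, a file path, or a test count before a task is closed. It is a hypothetical local pre-check, not part of the empirica CLI; the helper name `looks_grounded` and the three regular expressions are assumptions, and the patterns are heuristics that can produce false positives.

```python
import re

# Hypothetical heuristics for "deterministic" evidence; tune for your project.
COMMIT_SHA = re.compile(r"\b[0-9a-f]{7,40}\b")         # abbreviated or full SHA
FILE_PATH = re.compile(r"\b[\w./-]+\.\w{1,5}\b")       # e.g. routes/auth.py
TEST_RESULT = re.compile(r"\b\d+\s+passed\b", re.IGNORECASE)  # e.g. "12 passed"


def looks_grounded(evidence: str) -> bool:
    """Return True if the evidence names a commit SHA, a file path, or a test count."""
    return any(p.search(evidence) for p in (COMMIT_SHA, FILE_PATH, TEST_RESULT))
```

A string such as "Commit abc1234: JWT validation middleware + unit tests passing" passes this check, while "Done, looks good" does not.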

Planned vs in-progress goals:

# Logged but not yet started (collaborative planning, queue work)
empirica goals-create --objective "Future: refactor X" --status planned

# Active immediately (default — start work now)
empirica goals-create --objective "Implement X"

Cross-project goals: add --project-id <name-or-uuid> to log against a different project's epistemic state without switching session context.

Step P4: Generate Transaction Plan

For each goal, estimate the noetic-praxic loop:

# Transaction Plan: [Task Name]
# Generated: [timestamp]
# Goals: [count]

transactions:
  - id: 1
    goal: "Goal A description"
    goal_id: "<from goals-create>"
    noetic:
      investigate:
        - "Read module X to understand pattern"
        - "Check if Y exists"
    check_gate: "Understand X pattern and know where to make changes"
    praxic:
      implement:
        - "Write implementation"
        - "Add unit tests"
        - "Commit"
    depends_on: []

  - id: 2
    goal: "Goal B description"
    goal_id: "<from goals-create>"
    noetic:
      investigate:
        - "Review output from T1"
    check_gate: "Know integration points from T1 findings"
    praxic:
      implement:
        - "Build on T1's work"
        - "Integration test"
        - "Commit"
    depends_on: [1]
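
The `depends_on` lists above define an execution order. A minimal sketch of deriving that order, assuming the plan has already been loaded into Python dictionaries with the same `id` and `depends_on` keys (the function name `execution_order` is hypothetical):

```python
from graphlib import TopologicalSorter

# Mirrors the `id` / `depends_on` fields of the plan above.
plan = [
    {"id": 1, "goal": "Goal A description", "depends_on": []},
    {"id": 2, "goal": "Goal B description", "depends_on": [1]},
]


def execution_order(transactions: list[dict]) -> list[int]:
    """Return transaction ids so every dependency precedes its dependents."""
    graph = {t["id"]: set(t["depends_on"]) for t in transactions}
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


print(execution_order(plan))  # [1, 2]
```

A cycle in `depends_on` surfaces as an exception rather than a silently wrong order, which is usually the signal to re-split the goals.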

Step P5: Present and Execute

Present the plan to the user for approval. Once approved:

1. Start Transaction 1 with PREFLIGHT using the estimated vectors
2. Follow the noetic-praxic loop per transaction
3. POSTFLIGHT at the end of each transaction
4. Adjust subsequent transactions based on learnings

Key principle: The plan is a starting estimate, not a contract. Vectors will shift as you learn. That's the point — measuring the delta between estimated and actual is what builds calibration.
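
To make "the delta between estimated and actual" concrete, here is a small sketch that subtracts estimated vectors from the ones reported at the end of a transaction. The function name `vector_delta` and the sample values are illustrative assumptions; how Empirica computes calibration internally is not specified here.

```python
def vector_delta(estimated: dict[str, float], actual: dict[str, float]) -> dict[str, float]:
    """Per-vector difference (actual - estimated) for vectors present in both."""
    shared = estimated.keys() & actual.keys()
    return {name: round(actual[name] - estimated[name], 2) for name in sorted(shared)}


# Illustrative values only.
estimated = {"know": 0.5, "uncertainty": 0.4, "completion": 0.0}
actual = {"know": 0.9, "uncertainty": 0.1, "completion": 1.0}
print(vector_delta(estimated, actual))
```

A positive `know` delta and a negative `uncertainty` delta are what a transaction that actually produced learning would be expected to show.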

Reference Guide

The sections below are the full reference for epistemic transactions. Use them during execution, not just planning.

When to Use This Skill

Starting a complex task (3+ files, multiple concerns)
User provides a spec, ticket, or feature description
You need to plan before acting
Work will span multiple transactions or sessions
You want to ensure nothing falls through the cracks

Step 1: Understand the Task

Before creating any goals or transactions, assess what you're working with.

Read the spec/task/request. Then ask yourself:

Question	If Yes	If No
Do I understand what's being asked?	Move to Step 2	Log unknowns, investigate
Do I know the codebase areas involved?	Move to Step 2	Read code, log findings
Are there architectural decisions needed?	Log assumptions, investigate options	Move to Step 2
Is this a single coherent change?	Single transaction, skip to Step 3	Decompose into goals

# Log what you don't know yet
empirica unknown-log --unknown "How does the auth middleware chain work?"
empirica unknown-log --unknown "What's the expected behavior when X?"

# Log assumptions you're making
empirica assumption-log --assumption "The API is RESTful" --confidence 0.7 --domain architecture

Step 2: Decompose into Goals

Each goal = one coherent piece of work. Goals are structural (what needs doing); transactions are measurement windows (how you track doing it).

Decomposition heuristics:

Signal	Goal Boundary
Different files/modules	Separate goals
Different concerns (UI vs API vs DB)	Separate goals
Dependency chain (B needs A)	Separate goals, ordered
Single atomic change	One goal
Tests for implementation	Same goal as implementation

# Create goals from decomposition
empirica goals-create --objective "Implement authentication middleware"
empirica goals-create --objective "Add user session management"
empirica goals-create --objective "Write integration tests for auth flow"

Goal sizing guidance:

Size	Description	Transactions
Small	Bug fix, config change, single function	1
Medium	Feature with 2-3 files, schema + UI	1-2
Large	Cross-cutting concern, multiple modules	2-3
Too large	"Redesign the whole system"	Split further

Step 3: Plan Transaction Sequence

Each transaction picks up one goal (or a coherent subset) and runs the full noetic-praxic loop. Plan the sequence based on dependencies and information flow.

Transaction Template

Transaction N: [Goal Name]
  PREFLIGHT: Declare scope, assess baseline
    Noetic: [what to investigate]
    - Read relevant code
    - Check for existing patterns
    - Log findings, unknowns, dead-ends
  CHECK: Gate readiness
    - know >= threshold (holistic)
    - Key unknowns resolved
  Praxic: [what to implement]
    - Write code
    - Run tests
    - Commit
  POSTFLIGHT: Measure learning
    Artifacts to resolve:
    - Close goal if complete
    - Resolve unknowns answered during work
    - Convert verified assumptions to decisions/findings

Example: 3-Transaction Plan

Session Start
  Create goals: A (auth middleware), B (session mgmt), C (integration tests)

Transaction 1: Goal A — Auth Middleware
  PREFLIGHT: scope = auth middleware, know ~0.5, uncertainty ~0.4
  Noetic:
    - Read existing middleware chain
    - Check how routes are protected
    - Log finding: "Express middleware uses next() pattern"
    - Log unknown: "How are roles differentiated?"
    - Resolve unknown → finding: "Roles in JWT claims"
  CHECK: know ~0.8, uncertainty ~0.15 → proceed
  Praxic:
    - Implement auth middleware
    - Add role-based guards
    - Write unit tests
    - Commit: "feat(auth): add JWT middleware with role guards"
  POSTFLIGHT: know 0.9, completion 1.0
    Close Goal A, resolve unknowns

Transaction 2: Goal B — Session Management (informed by T1's findings)
  PREFLIGHT: know ~0.7 (JWT patterns from T1), uncertainty ~0.25
  Noetic:
    - Read session store options
    - Check token refresh patterns
    - Log assumption: "Redis available for session store" --confidence 0.6
  CHECK: → proceed
  Praxic:
    - Implement session creation/refresh/revoke
    - Decision: "Use httpOnly cookies for refresh tokens"
    - Commit: "feat(auth): add session management with token refresh"
  POSTFLIGHT: Close Goal B

Transaction 3: Goal C — Integration Tests
  PREFLIGHT: know ~0.85 (deep understanding from T1+T2)
  Noetic: Quick review of test patterns
  CHECK: → proceed
  Praxic:
    - Write integration tests covering auth + sessions
    - Commit: "test(auth): add integration tests for full auth flow"
  POSTFLIGHT: Close Goal C, session complete

Step 4: Execute Each Transaction

Within each transaction, follow the noetic-praxic loop:

4a. PREFLIGHT — Open the Measurement Window

empirica preflight-submit - << 'EOF'
{
  "session_id": "<ID>",
  "task_context": "Transaction 1: Implement auth middleware. Scope: middleware chain, role guards, unit tests.",
  "work_type": "code",
  "work_context": "iteration",
  "domain": "default",
  "criticality": "medium",
  "vectors": {
    "know": 0.5, "uncertainty": 0.4,
    "context": 0.6, "clarity": 0.7,
    "coherence": 0.6, "signal": 0.5,
    "density": 0.4, "state": 0.5,
    "change": 0.1, "completion": 0.0,
    "impact": 0.7, "do": 0.7,
    "engagement": 0.9
  },
  "reasoning": "Starting auth middleware. Read the route definitions but haven't explored the middleware chain yet. High engagement, moderate knowledge."
}
EOF

Context fields (optional, improve grounded calibration):

work_type: code|infra|research|release|debug|config|docs|data|comms|design|audit|remote-ops — scales evidence weights by source relevance. Use remote-ops for work the local Sentinel doesn't observe (SSH, customer machines, remote config); the POSTFLIGHT will return calibration_status=ungrounded_remote_ops and self-assessment will stand unchallenged.
work_context: greenfield|iteration|investigation|refactor — adjusts normalization baselines for project maturity

PREFLIGHT declares scope. If scope creeps during work, that's a signal to POSTFLIGHT and start a new transaction.
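
If you assemble the PREFLIGHT payload programmatically, a small validation step can catch typos before submission. The sketch below builds the JSON string that would be piped to `empirica preflight-submit -`. The helper name `build_preflight` is hypothetical, and the [0, 1] range is an assumption inferred from the examples in this guide rather than a documented constraint.

```python
import json

# The 13 vector names used in the PREFLIGHT example above.
VECTOR_NAMES = {
    "know", "uncertainty", "context", "clarity", "coherence", "signal",
    "density", "state", "change", "completion", "impact", "do", "engagement",
}


def build_preflight(session_id: str, task_context: str, vectors: dict[str, float]) -> str:
    """Validate vectors and return a JSON payload string suitable for stdin."""
    unknown = set(vectors) - VECTOR_NAMES
    if unknown:
        raise ValueError(f"unknown vector names: {sorted(unknown)}")
    # Assumption: vectors live in [0, 1], as every example in this guide suggests.
    out_of_range = {k: v for k, v in vectors.items() if not 0.0 <= v <= 1.0}
    if out_of_range:
        raise ValueError(f"vectors outside [0, 1]: {out_of_range}")
    return json.dumps(
        {"session_id": session_id, "task_context": task_context, "vectors": vectors},
        indent=2,
    )
```

The optional context fields (`work_type`, `work_context`, `domain`, `criticality`) are left out to keep the sketch short; they could be added as extra keys in the same dictionary.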

4b. Noetic Phase — Investigate

Use noetic_batch ONLY when batching ≥3 investigation operations. When a transaction's investigation needs reads, greps, globs, and investigate queries together, bundle them in one call: the value is one merged result for your conversation and fewer round-trips, not a gating shortcut. Individual Read/Grep/Glob/investigate calls are noetic in any phase and don't need batching. noetic_batch is NOT a Sentinel bypass: calling it for a single read is misuse (the executor will surface a warning field in the response).

empirica noetic-batch - << 'EOF'
{
  "intent": "understand auth middleware chain",
  "reads": [{"path": "src/auth.py"}, {"path": "src/middleware.py"}],
  "greps": [
    {"pattern": "decorator", "glob": "src/**/*.py", "context": 2},
    {"pattern": "Bearer", "glob": "src/**/*.py"}
  ],
  "globs": ["src/**/*auth*", "tests/**/*auth*"],
  "investigate": [{"query": "auth middleware patterns", "scope": "project"}]
}
EOF

(Or via MCP: mcp__empirica__noetic_batch with the same JSON payload.)

Fall back to individual Read/Grep/Glob for one-shot lookups after a batch surfaces something you need to drill into.
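
The ≥3 rule can be checked before a batch is sent. This sketch counts the operations in a payload shaped like the example above. It assumes each list entry counts as one operation, which is an interpretation and not something the executor is confirmed to do; the helper names are hypothetical.

```python
def count_batch_ops(payload: dict) -> int:
    """Count investigation operations across the four list fields."""
    return sum(len(payload.get(key, [])) for key in ("reads", "greps", "globs", "investigate"))


def should_batch(payload: dict, minimum: int = 3) -> bool:
    """Batching is warranted only at `minimum` or more operations."""
    return count_batch_ops(payload) >= minimum


example = {
    "intent": "understand auth middleware chain",
    "reads": [{"path": "src/auth.py"}, {"path": "src/middleware.py"}],
    "greps": [
        {"pattern": "decorator", "glob": "src/**/*.py", "context": 2},
        {"pattern": "Bearer", "glob": "src/**/*.py"},
    ],
    "globs": ["src/**/*auth*", "tests/**/*auth*"],
    "investigate": [{"query": "auth middleware patterns", "scope": "project"}],
}
print(should_batch(example))
```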

Read code. Search patterns. Build understanding. Log as you go:

# Every discovery → finding
empirica finding-log --finding "Middleware chain uses app.use() with path prefix" --impact 0.5

# Every question → unknown
empirica unknown-log --unknown "Where are role definitions stored?"

# Every failed approach → dead-end
empirica deadend-log --approach "Tried passport.js" --why-failed "Too heavy for JWT-only auth"

# Every unverified belief → assumption
empirica assumption-log --assumption "All routes need auth except /health" --confidence 0.8 --domain routing

Rich markdown bodies — `--description` for nuance

Every *-log command (finding, unknown, deadend, assumption, decision, mistake) accepts an optional --description flag carrying a markdown body. The extension and skill surfaces render this as prettified markdown — use sections, lists, code blocks, tables, links for nuance that doesn't fit the short title field.

Three shape examples (no rigid template — pick what the artifact warrants):

1. Prose body — short context behind a finding:

empirica finding-log \
  --finding "Express 5 changed middleware signature to async" \
  --description "Caught during the auth middleware port: synchronous \`next()\` callbacks now resolve as awaited promises, so error handlers must \`return next(err)\` instead of fire-and-forget. The legacy middleware in routes/auth.js silently swallowed errors because the old signature didn't propagate them. Documented in [Express 5 migration guide](https://expressjs.com/en/guide/migrating-5.html)." \
  --impact 0.7

2. Sectioned body — decision with trade-offs:

empirica decision-log \
  --choice "Use Redis for session store" \
  --rationale "Available via docker-compose, supports TTL primitives, matches our existing infra" \
  --description "## Why Redis over alternatives

| Option | Verdict |
|---|---|
| In-memory (Map) | ❌ scales to 1 process only |
| Postgres | ❌ heavy for ephemeral key/value |
| Redis | ✅ matches existing infra |

## What would reverse this
- Single-region deployment requirement (Redis Sentinel adds ops)
- Sub-millisecond write needs (consider DragonflyDB)" \
  --reversibility committal

3. Code-block body — dead-end with reproducible signal:

empirica deadend-log \
  --approach "Tried passport.js for auth middleware" \
  --why-failed "Too heavy for JWT-only auth" \
  --description "## Signal: bundle bloat

\`\`\`
passport@0.7.0 + passport-jwt@4.0.1 + dependencies = +180KB
in-house JWT verifier = ~30 lines, +2KB
\`\`\`

Passport's value is its strategy ecosystem (OAuth, SAML, etc.) — we're
JWT-only so the abstraction was pure overhead. Reverted to a minimal
\`verify-jwt.js\` middleware."

Skip the body entirely when the title alone tells the full story. Over-describing trivial artifacts is its own anti-pattern — let the nuance threshold be "would someone reading this in 3 months understand without the body?"

Sources — log when an artifact's origin matters

An epistemic source is the external thing your finding/decision came from: a doc, a URL, a paper, a transcript, a customer call, a GitHub issue. Sources are first-class artifacts (source-add) that other artifacts link to via the sourced_from relation in batch operations.

When to add a source:

A finding came from reading a non-code reference (RFC, paper, blog, spec, design doc) — log the source so future searches surface the origin, not just the conclusion
A decision rests on an external authority (compliance doc, vendor contract, security advisory) — the audit trail needs the link
A dead-end was learned the hard way from a community thread or postmortem, so log the source and others can trace the warning back to its origin
You're working in Claude Desktop or any non-CLI surface where most artifacts originate from web pages, conversations, attachments, or manually-pasted text rather than code reads. In CLI mode, git blame + finding_refs auto-extraction often covers source provenance for free; in Desktop mode, explicit source-add is the only way to preserve where ideas came from.

How:

# Standalone: log a source first
empirica source-add --title "RFC 7519 — JSON Web Tokens" \
  --url "https://datatracker.ietf.org/doc/html/rfc7519" \
  --noetic --confidence 0.95
# Returns: source_id (UUID)

# Then link findings/decisions to it via batch graph:
empirica log-artifacts - << 'EOF'
{
  "nodes": [
    {"ref": "f1", "type": "finding",
     "data": {"finding": "JWTs are signed but not encrypted by default",
              "impact": 0.7}},
    {"ref": "d1", "type": "decision",
     "data": {"choice": "Use JWE for sensitive payloads",
              "rationale": "Default JWS leaks contents at rest"}}
  ],
  "edges": [
    {"from": "f1", "to": "<source_id_uuid>", "relation": "sourced_from"},
    {"from": "d1", "to": "f1", "relation": "evidence"}
  ]
}
EOF

Skip when: the source is the project's own code at the current HEAD — that provenance is already in git. Sources earn their keep when the origin is outside what git blame can reach.
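
Because `log-artifacts` edges mix local refs (`f1`, `d1`) with ids of existing artifacts, it is easy to submit a graph that still contains a placeholder. The sketch below is a hypothetical local check, not an empirica feature: it reports edges whose endpoints are neither a ref declared in `nodes` nor a well-formed UUID.

```python
import uuid


def dangling_edges(graph: dict) -> list[dict]:
    """Return edges whose endpoints are neither a declared ref nor a UUID."""
    refs = {node["ref"] for node in graph.get("nodes", [])}

    def resolvable(endpoint: str) -> bool:
        if endpoint in refs:
            return True
        try:
            uuid.UUID(endpoint)  # assumed to be an existing artifact or source id
            return True
        except ValueError:
            return False

    return [
        edge for edge in graph.get("edges", [])
        if not (resolvable(edge["from"]) and resolvable(edge["to"]))
    ]
```

Run against the example payload above, the `f1` edge would be reported until `<source_id_uuid>` is replaced with the id returned by `source-add`.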

4c. CHECK — Gate the Transition

empirica check-submit - << 'EOF'
{
  "session_id": "<ID>",
  "vectors": {
    "know": 0.82, "uncertainty": 0.15,
    "context": 0.85, "clarity": 0.88
  },
  "reasoning": "Investigated middleware chain, understand JWT flow, know where roles live. Ready to implement."
}
EOF

proceed → Start writing code (praxic phase, same transaction)
investigate → Keep exploring (noetic phase, same transaction)

CHECK does NOT end the transaction. It gates the transition.
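
The gate's two outcomes can be pictured as a simple decision function. This is only a mental model: the real decision is made by Empirica when `check-submit` runs, and the thresholds below (`know_threshold`, `max_uncertainty`) are hypothetical defaults chosen for illustration, not the system's values.

```python
def check_gate(
    vectors: dict[str, float],
    open_key_unknowns: int,
    know_threshold: float = 0.7,    # hypothetical value
    max_uncertainty: float = 0.3,   # hypothetical value
) -> str:
    """Return "proceed" or "investigate" from self-assessed vectors."""
    ready = (
        vectors.get("know", 0.0) >= know_threshold
        and vectors.get("uncertainty", 1.0) <= max_uncertainty
        and open_key_unknowns == 0
    )
    return "proceed" if ready else "investigate"
```

Either result keeps you inside the same transaction; only the phase changes.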

4d. Praxic Phase — Implement

Write code. Run tests. Commit. Still log artifacts:

# Discoveries during implementation
empirica finding-log --finding "Express 5 changed middleware signature to async" --impact 0.6

# Decisions made while coding
empirica decision-log --choice "Use middleware factory pattern" \
  --rationale "Enables per-route config without duplication" \
  --reversibility exploratory

4e. POSTFLIGHT — Close the Measurement Window

BEFORE running POSTFLIGHT, always:

Log all remaining epistemic artifacts (findings, unknowns, decisions, dead-ends, mistakes)
Resolve any unknowns that were answered during the transaction
Complete any goals that were finished
Ask the user: "Any artifacts to log before I close the transaction?"

POSTFLIGHT without artifact sweep = lost data. The measurement window closes and unlogged work becomes invisible to calibration. Always log first, then close.

empirica postflight-submit - << 'EOF'
{
  "session_id": "<ID>",
  "vectors": {
    "know": 0.92, "uncertainty": 0.08,
    "context": 0.90, "clarity": 0.95,
    "completion": 1.0, "do": 0.90
  },
  "reasoning": "Auth middleware implemented with role guards. Unit tests passing."
}
EOF

4f. Compliance Loop — Domain Checklist (automatic)

After POSTFLIGHT, the compliance loop runs automatically when domain and criticality were set in PREFLIGHT. It checks the domain's required services:

POSTFLIGHT response includes:
  "compliance": {
    "status": "complete" | "iteration_needed" | "max_iterations_exceeded",
    "checks_run": 3,
    "checks_passed": 2,
    "checks_failed": 1,
    "check_results": [
      {"check_id": "lint", "passed": true, "summary": "lint clean (scoped to 4 files)"},
      {"check_id": "complexity", "passed": true, "summary": "complexity A (avg 2.1)"},
      {"check_id": "tests", "deferred": true, "tier": "goal_completion"}
    ],
    "next_transaction": {  // only if iteration_needed
      "intent": "address failures: tests",
      "inherited_domain": "default",
      "inherited_criticality": "medium"
    }
  }

Tiered execution: Checks run at different points to manage resource cost:

always (every POSTFLIGHT): lint, complexity, git_metrics — ~5s, ~80MB
goal_completion (at goal close): tests — runs full pytest
release (pre-release only): dep_audit — pip-audit for CVEs

Cached results: Same changed files = same content hash = cached result. The AI sees "cached": true and knows it wasn't a fresh run.
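
The idea of keying a cache on file content can be sketched as follows. This is an assumed illustration of the general technique using SHA-256; the hash function and key layout Empirica actually uses are not stated in this guide, and `content_key` is a hypothetical name.

```python
import hashlib


def content_key(files: dict[str, bytes]) -> str:
    """Hash file paths and contents into one order-independent cache key."""
    digest = hashlib.sha256()
    for path in sorted(files):  # sorting makes the key independent of dict order
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")    # separator so path and content cannot run together
        digest.update(hashlib.sha256(files[path]).digest())
    return digest.hexdigest()
```

Two runs over identical files yield the same key, so a stored result can be reused; any edit to any file changes the key and forces a fresh run.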

Brier scoring: If you stated check outcome beliefs in PREFLIGHT (predicted_check_outcomes), the compliance response includes a check_brier block measuring belief calibration. Only freshly-run checks count — deferred and cached are excluded.
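
A Brier score is the mean squared difference between a predicted probability and the outcome (1 for pass, 0 for fail), so lower is better. The sketch below applies that definition to results shaped like `check_results`, skipping deferred and cached entries as described. It assumes `predicted_check_outcomes` maps each check id to a pass probability; the actual field format and the contents of the `check_brier` block are not specified here.

```python
from typing import Optional


def check_brier(predicted: dict[str, float], results: list[dict]) -> Optional[float]:
    """Mean squared error between predicted pass probability and outcome."""
    scored = [
        (predicted[r["check_id"]] - (1.0 if r["passed"] else 0.0)) ** 2
        for r in results
        if r["check_id"] in predicted
        and not r.get("deferred")   # deferred checks did not run
        and not r.get("cached")     # cached checks were not freshly run
    ]
    return sum(scored) / len(scored) if scored else None


# Hypothetical predictions stated at PREFLIGHT.
predicted = {"lint": 0.9, "complexity": 0.8, "tests": 0.7}
results = [
    {"check_id": "lint", "passed": True},
    {"check_id": "complexity", "passed": True},
    {"check_id": "tests", "deferred": True, "tier": "goal_completion"},
]
print(check_brier(predicted, results))
```

With these sample numbers only lint and complexity are scored, giving a value close to 0.025.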

Three-vector model: After seeing compliance results, you can submit grounded_vectors + grounded_rationale in POSTFLIGHT to record your reasoned synthesis. Services inform; you synthesize.

Step 5: Between Transactions — Artifact Review

At the start of each new transaction, review open artifacts. Resolve those that are completed or no longer pertinent. Where uncertainty is high about whether an artifact is still relevant, surface it collaboratively:

# 1. Review what's open
empirica goals-list
empirica unknown-list

# 2. Goals no longer needed → close with reason
empirica goals-complete --goal-id <ID> --reason "Superseded by new approach"

# 3. Verify/falsify assumptions
# Confirmed assumption → finding
empirica finding-log --finding "Confirmed: all routes except /health need auth" --impact 0.3
# Confirmed assumption that settles a choice → decision (if falsified, log the decision about what to do instead)
empirica decision-log --choice "Use Redis for sessions" --rationale "Confirmed Redis available via docker-compose"

Why this matters: Unresolved artifacts accumulate as noise. Each transaction's PREFLIGHT retrieves your prior artifacts via pattern matching — clean signal means better context for the next transaction.

Anti-Patterns

The Split-Brain (most common mistake)

WRONG:
  PREFLIGHT → [noetic: investigate] → POSTFLIGHT    ← closes before acting!
  PREFLIGHT → [praxic: implement] → POSTFLIGHT      ← acts without baseline!

Investigation and implementation belong in the same transaction. The PREFLIGHT-to-POSTFLIGHT delta should capture the full journey from "I don't know" to "I investigated, understood, and implemented."

The Mega-Transaction

WRONG:
  PREFLIGHT → [5 goals, 15 files, 3 domains] → POSTFLIGHT

Too much in one measurement window. The delta becomes meaningless noise. Scope to what you can hold coherently — 1-2 goals per transaction.

The Rush-Through

WRONG:
  PREFLIGHT → CHECK → POSTFLIGHT (no actual work between them)

Transactions need real noetic/praxic work. The system detects rushed transactions via minimum duration checks (30s noetic with evidence).

The Artifact Hoarder

WRONG:
  Transaction 1: Log 5 unknowns
  Transaction 2: Log 5 more unknowns (never resolve the first 5)
  Transaction 3: Log 5 more unknowns (pile grows...)

Resolve artifacts between transactions. Unknowns become findings. Assumptions become decisions. Unresolved artifacts accumulate as noise — resolve what's answered, close what's no longer pertinent.

Transaction Discipline Rules

These rules encode the working discipline that makes transactions meaningful. They are behavioral commitments, not code enforcement — internalize them.

Rule 1: Goal-per-Transaction

Every transaction should reference an empirica goal. If the goal has distinct steps, create tasks to track them:

# At PREFLIGHT, link to a goal
empirica goals-create --objective "Implement X"  # if not already created
empirica goals-add-task --goal-id <ID> --description "Read and understand module Y"
empirica goals-add-task --goal-id <ID> --description "Write implementation"
empirica goals-add-task --goal-id <ID> --description "Add tests"

# For goals you want to log but not start yet:
empirica goals-create --objective "Future: refactor Y" --status planned

Why: Goalless transactions produce ungrounded completion vectors: grounded calibration has nothing to measure your completion claims against. Goals with status planned are visible in goals-list but excluded from measurement until moved to in_progress.

Rule 2: Commit-per-Task

Commit after each completed task or coherent work unit. Don't batch commits to the end of the transaction. Each commit should be meaningful and atomic.

WRONG: noetic → praxic → [edit 5 files] → one big commit → POSTFLIGHT
RIGHT: noetic → praxic → [edit files A,B] → commit → [edit C] → commit → POSTFLIGHT

Why: Uncommitted work is invisible to grounded calibration. The change, state, and do vectors ground against git evidence. Late commits mean the POSTFLIGHT snapshot misses the learning trajectory.

Rule 3: Artifact Breadth

Log the full breadth of epistemic artifacts — not just findings. Every transaction should capture what was relevant:

Happened	Log It
Made a choice between options	`decision-log`
Assumed something unverified	`assumption-log`
Tried something that didn't work	`deadend-log`
Made an error	`mistake-log`
Discovered something	`finding-log`
Hit an open question	`unknown-log`
Noticed something to revisit, but not worth a full artifact mid-flow	`note "..."` (scratchpad)

Why: Single-type artifact logging (only findings) leaves calibration gaps ungrounded. The retrospective breadth_note will flag this, but by then the measurement window is closing.
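
A quick self-check before POSTFLIGHT can catch single-type logging early. The sketch below is a hypothetical local stand-in, not the retrospective's own breadth_note; the function name and the message wording are assumptions.

```python
from collections import Counter
from typing import Optional

# Artifact types named in the table above.
ARTIFACT_TYPES = ("decision", "assumption", "deadend", "mistake", "finding", "unknown")


def single_type_warning(logged: list[str]) -> Optional[str]:
    """Return a warning when every logged artifact is of the same type."""
    counts = Counter(t for t in logged if t in ARTIFACT_TYPES)
    if len(counts) == 1:
        only, n = next(iter(counts.items()))
        return f"only '{only}' artifacts logged ({n}); consider the other types"
    return None
```

Feeding it the list of artifact types logged so far in the transaction gives a prompt to ask whether any decisions, assumptions, or dead-ends went unrecorded.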

Scratchpad notes vs artifacts. empirica note "..." is the low-friction middle ground between logging a full artifact (structured, embedded) and holding a thought in context (lost at compaction). Use it to jot a follow-up, doubt, or "check this later" without breaking stride to classify it. Notes are transaction-scoped, pure-metadata (not shared, not embedded), and surface at POSTFLIGHT under untriaged_notes. At the retrospective, triage them: note --list → promote the keepers to a finding/decision/goal → note --clear. Capture now, classify later. (The retrospective soft-gate is note-aware — real praxic work with zero artifacts and zero notes is the strongest "logged nothing" signal.)

Rule 4: Close Artifacts Before POSTFLIGHT

Complete goals and resolve unknowns BEFORE submitting POSTFLIGHT:

# Close what's done
empirica goals-complete --goal-id <ID> --reason "Implemented and tested"
empirica unknown-resolve --unknown-id <ID> --resolved-by "Found in codebase"

# THEN close the measurement window
empirica postflight-submit -

Why: The measurement window closes at POSTFLIGHT. Goal completion and unknown resolution feed grounded calibration's completion and know vectors. If you POSTFLIGHT first, the evidence is invisible to calibration.

Rule 5: Mesh Discipline (when peer practices are involved)

When the transaction crosses a practice boundary — you received a peer's collab/proposal, or your work will land in someone else's domain — the mesh-discipline rules apply in addition to the within-practice rules above:

Trigger	Action	Why
Mid-transaction peer collab arrives	Log `goals-create --objective "Process inbox/<status>: <proposal_id>"`, finish current chunk, reply substantively at next break	Silent accept-and-forget is the drop-thread anti-pattern. The goal stub is the cue you saw it.
You finished work a peer asked of you	`empirica mailbox reply --parent-id <proposal_id> --commit-sha <sha> ...` (atomic propose+complete) before POSTFLIGHT	Without the handshake, source AI's outbox stays visibly stalled even though the work landed. Ack IS part of the work.
You're uncertain and a peer practice's domain genuinely owns the answer	Send a collab brief (noetic — auto-accepted, ungated) instead of guessing	Asking is cheap; shipping on a bad assumption and being corrected at review is expensive.
You reached a grounded, actionable conclusion that crosses practice boundaries	Emit a typed propose (code_change_request / architecture_decision / etc. — ECO-gated)	Sitting on convergent insight because "they'll figure it out" is the inverse free-ride.
You registered a canonical reference (RFC, spec, design doc, customer call)	`source-add --visibility shared` (or `public`) — not the `local` default	`local` sources are invisible to `empirica sources-map --global`; peers can't reference what they can't see.
Your finding/decision rests on a peer's source or another practice's prior work	`--source <uuid>` or `sourced_from` edge in `log-artifacts`	The citation network is what makes the mesh self-correcting — useful peers earn weight; abandoned ideas fade.

Why this is structural, not moral: Same logic as the artifact-breadth rule. Gaming the mesh (silent free-ride, no acks, hoarded sources) hurts only your own practice's discoverability and trust trajectory. Your peers route attention based on which practices return calls. There is no opponent to deceive.

Full framing + examples: /empirica-constitution §V. Send-side mechanism: /cortex-mailbox-send (collab vs ECO-gated flavors, completion handshake, recovery on mis-routing).

Rule 5b: Sustained multi-practice work

When a transaction is one practice's leg of work spanning ≥2 practices and outliving this session, the shared coordination state lives in the proprietary mesh layer (Empirica Cortex), not in any one practice's goals. Your local transaction discipline (PREFLIGHT → CHECK → POSTFLIGHT) is unchanged; what's added is keeping the shared record in sync — and those shared-state mutations are human-authorized at the ECO boundary, so treat them as proposed-then-confirmed, never synchronous steps you busy-wait on. If your install has the mesh layer, the operational guidance loads from it. See getempirica.com.

Rule 6: Mirror Empirica Tasks → Claude Code Tasks (Visibility)

Empirica tasks and Claude Code Tasks share the same name now (this is deliberate — they're the same shape of work, tracked in two surfaces). For larger transactions, mirror empirica tasks to Claude Code Tasks so the user sees progress. Create at PREFLIGHT, update as you complete:

empirica goals-add-task → Claude Code TaskCreate (mirror)
empirica goals-complete-task → Claude Code TaskUpdate (mirror)

This is advisory — use your judgment on when the user benefits from visible task tracking vs when it's overhead.

Quick Reference: Commands by Phase

Phase	Commands
Planning	`goals-create`, `goals-add-task`, `unknown-log`, `assumption-log`
PREFLIGHT	`preflight-submit` (opens transaction)
Noetic	`noetic-batch` (3+ ops in one call — preferred), `source-add`, `finding-log`, `unknown-log`, `deadend-log`, `assumption-log`
CHECK	`check-submit` (gates noetic → praxic)
Praxic	`finding-log`, `decision-log`, `goals-complete-task`
Before POSTFLIGHT	`goals-complete`, `unknown-resolve`, or batch: `resolve-artifacts`
POSTFLIGHT	`postflight-submit` (closes transaction + triggers grounded verification)
Between	`goals-list`, `resolve-artifacts` (batch), `delete-artifacts` (cleanup)
Batch	`log-artifacts` (connected graph), `resolve-artifacts`, `delete-artifacts`

Spec-to-Transactions Cheatsheet

Given a spec or feature description:

Read it fully — don't start decomposing mid-read
Identify nouns — these are your domains/modules (potential goal boundaries)
Identify verbs — these are your actions (potential tasks)
Identify dependencies — A before B? Separate transactions, ordered
Identify unknowns — what the spec doesn't say (log immediately)
Identify assumptions — what you're inferring (log with confidence)
Group into goals — by domain coherence
Order into transactions — by dependency chain + information flow
Execute — one transaction at a time, full noetic-praxic loop each

Earned Autonomy

Vectors are beliefs about your epistemic state. Deterministic services provide observations that inform those beliefs. The divergence between your beliefs and observations tells you where work discipline needs attention — not where to adjust numbers.

Each transaction with good discipline (artifact breadth, commit cadence, goal closure before POSTFLIGHT) builds a behavioral track record that the Sentinel uses to adapt thresholds → better discipline earns more autonomy.

Believe what you observe. Log what you learn. Let discipline drive improvement.