agent-deck

star 2.7k

Terminal session manager for AI coding agents. Use when user mentions "agent-deck", "session", "sub-agent", "MCP attach", "git worktree", or needs to (1) create/start/stop/restart/fork sessions, (2) attach/detach MCPs, (3) manage groups/profiles, (4) get session output, (5) configure agent-deck, (6) troubleshoot issues, (7) launch sub-agents, or (8) create/manage worktree sessions. Covers CLI commands, TUI shortcuts, config.toml options, and automation.

asheshgoplani By asheshgoplani schedule Updated 6/9/2026

name: agent-deck description: Terminal session manager for AI coding agents. Use when user mentions "agent-deck", "session", "sub-agent", "MCP attach", "git worktree", or needs to (1) create/start/stop/restart/fork sessions, (2) attach/detach MCPs, (3) manage groups/profiles, (4) get session output, (5) configure agent-deck, (6) troubleshoot issues, (7) launch sub-agents, or (8) create/manage worktree sessions. Covers CLI commands, TUI shortcuts, config.toml options, and automation. metadata: compatibility: "claude, opencode"

Agent Deck

Terminal session manager for AI coding agents. Built with Go + Bubble Tea.

Repo: github.com/asheshgoplani/agent-deck | Discord: discord.gg/e4xSs6NBN8

Run agent-deck --version for your installed version. This skill targets v1.9+ but most patterns work back to v1.7.

Script Path Resolution (IMPORTANT)

This skill includes helper scripts in its scripts/ subdirectory. When Claude Code loads this skill, it shows a line like:

Base directory for this skill: /path/to/.../skills/agent-deck

You MUST use that base directory path to resolve all script references. Store it as SKILL_DIR:

# Set SKILL_DIR to the base directory shown when this skill was loaded
SKILL_DIR="/path/shown/in/base-directory-line"

# Then run scripts as:
$SKILL_DIR/scripts/launch-subagent.sh "Title" "Prompt" --wait

Common mistake: Do NOT use <project-root>/scripts/launch-subagent.sh. The scripts live inside the skill's own directory (plugin cache or project skills folder), NOT in the user's project root.

For plugin users, the path looks like: ~/.claude/plugins/cache/agent-deck/agent-deck/<hash>/skills/agent-deck/scripts/ For local development, the path looks like: <repo>/skills/agent-deck/scripts/

Quick Start

# Launch TUI
agent-deck

# Create and start a session
agent-deck add -t "Project" -c claude /path/to/project
agent-deck session start "Project"

# Send message and get output
agent-deck session send "Project" "Analyze this codebase"
agent-deck session output "Project"

Capabilities

What agent-deck does, at the noun level (independent of which surface — CLI / TUI / Web UI — you reach it through). Each row maps to one or more commands, keystrokes, or screens. For full surface coverage and per-row known issues, see your conductor's CAPABILITIES.md (generated by the Self-Improvement workflow below).

Capability What it does Surfaces
Manage sessions Create, start, stop, restart, fork, send, output, remove a session CLI ✅ · TUI ✅ · Web UI 🟡
Sub-agent / worker spawning agent-deck launch a child Claude session with parent linkage and inherited --add-dir CLI ✅ · TUI ⚪
Manage conductors Set up long-lived orchestrators with their own profile + channel + heartbeat CLI ✅ · TUI 🟡
Manage groups Move / delete groups; organize sessions hierarchically CLI ✅ · TUI ✅
Manage watchers Install / configure event-driven adapters (Gmail, GitHub, ntfy) — doorbell-not-messenger CLI ✅ · TUI ✅
Heartbeat orchestration Cron / ScheduleWakeup feeding the conductor periodic system-state nudges CLI ✅
Worktree workflows --worktree to create isolated git-worktree-backed sessions for parallel branch work CLI ✅
Channel routing Telegram / Slack inbound delivery to the right conductor, with --channels per-session binding CLI ✅
Attach MCPs Per-session or global MCP plugin attach / detach / status, with optional pooling CLI ✅ · TUI ✅ · Web UI ⚪
Attach skills Pool-based on-demand skill loading per Claude session CLI ✅ · TUI ✅
Session-metadata mutation session set for claude-session-id, path, wrapper, channels, parent; session unset-parent CLI ✅
State persistence state.json + task-log.md + LEARNINGS.md + HANDOFF.md survive Claude Code compaction/restart CLI ✅
GitHub pipeline oversight Conductor-driven release flow: PR merge → tag → goreleaser → release CLI ✅
Remote sessions SSH-based remote register / list / attach across hosts CLI ✅
Session sharing Export / import a Claude conversation for handoff between developers CLI ✅
Consult another agent Launch a Codex / Gemini sub-agent for a second opinion CLI ✅
Profile-scoped operation -p <profile> separation between personal and work auth CLI ✅ · TUI ✅
Self-improvement Analyze your conductor's own transcripts → surface bugs / patterns / capabilities, file GH issues with privacy guards CLI ✅

Status legend: ✅ verified, 🟡 partial, 🔴 known broken, ⚪ unknown. To update for your own machine, see Self-Improvement below.

Per-CLI Capabilities (what a launched session can do)

The table above is what agent-deck does. This one is what the CLI inside a session can do — agent-deck launches one per session (-c claude|codex|gemini). A conductor uses this to pick the right tool for a child; a launched child uses it to know its own powers without being told. This is a capability map, not a manual — run <cli> --help for exact flags. Verified 2026-06 against claude 2.1.x, codex-cli 0.137, gemini 0.45.

In-session capability claude (Claude Code) codex gemini
Multi-agent fan-out inside one session Agent tool (parallel subagents, each its own context window; run_in_background) and Workflow tool (deterministic JS: agent()/pipeline()/parallel() over item lists, structured-output schemas, phases) ❌ single-agent — fan out by launching codex peers via agent-deck ❌ not exposed — fan out via agent-deck peers
Skills ✅ Skill tool + agent-deck pool skills (~/.agent-deck/skills/pool/) gemini skills
MCP servers claude mcp / --mcp-config; agent-deck mcp attach codex mcp; also runs as a server (codex mcp-server) gemini mcp, --allowed-mcp-server-names
Built-in code review ultrareview (cloud multi-agent) + /code-review skill codex review / codex exec review --uncommitted via prompt only
Plan / read-only mode --permission-mode plan -s read-only --approval-mode plan
Autonomy / sandbox --permission-mode acceptEdits|bypassPermissions; config dangerous_mode -s read-only|workspace-write|danger-full-access; --dangerously-bypass-approvals-and-sandbox --approval-mode auto_edit|yolo, -s/--sandbox
Structured output --json-schema, --output-format json codex exec json event stream -o json|stream-json
Image input (multimodal) paste / Read an image -i/--image (attach to prompt; not image generation) multimodal prompt
Apply a diff to working tree Edit/Write tools codex apply (git-apply the last agent diff) Edit tools
Git worktree -w/--worktree via agent-deck --worktree -w/--worktree
Plugins / extensions / hooks / channels plugin, --plugin-dir/-url, hooks, --channels codex plugin extensions, hooks
Resume / fork conversation -r/--resume, --fork-session codex resume / codex fork --resume, --session-file
Local / OSS models 3P providers (Bedrock/Vertex) --oss, --local-provider lmstudio|ollama gemini gemma routing

Choosing the -c tool for a child: default to claude — only it has in-session multi-agent fan-out (Agent + Workflow tools) and pool skills, so it can own a whole task end-to-end and orchestrate its own sub-work. Reach for codex for a fast non-interactive second opinion or code review (codex review) and sandboxed exec; gemini for a third opinion or large-context reads. For codex/gemini, parallelism comes from launching multiple agent-deck peers, not from inside the session.

agent-deck powers every child also has (independent of CLI): agent-deck mcp attach/detach then session restart; launch further child or peer sessions (-no-parent for peers); load pool skills on demand; session send to talk to sibling sessions. See Sub-Agent Launch, Peer (Root) Sessions vs Sub-Agents, MCP Management.

When to go inline vs Agent tool vs Workflow (for claude children) is owned by the shared conductor template's Delegation section (~/.agent-deck/conductor/conductor-claude.md) and is not duplicated here: 1 task = inline; a few independent subtasks = Agent tool; a sweep / audit / matrix = Workflow — and always adversarially verify findings with a second agent told to refute.

Essential Commands

Command Purpose
agent-deck Launch interactive TUI
agent-deck add -t "Name" -c claude /path Create session
agent-deck session start/stop/restart <name> Control session
agent-deck session send <name> "message" Send message
agent-deck session output <name> Get last response
agent-deck session current [-q|--json] Auto-detect current session
agent-deck session fork <name> Fork Claude/Pi conversation
agent-deck session switch-account <name> <account> Switch Claude account, conversation follows
agent-deck mcp list List available MCPs
agent-deck mcp attach <name> <mcp> Attach MCP (then restart)
agent-deck status Quick status summary
agent-deck add --worktree <branch> Create session in git worktree
agent-deck worktree list List worktrees with sessions
agent-deck worktree cleanup Find orphaned worktrees/sessions
agent-deck feedback Submit feedback (opens rating prompt + optional comment)

Status: running | waiting | idle | error

Sub-Agent Launch

Use when: User says "launch sub-agent", "create sub-agent", "spawn agent"

$SKILL_DIR/scripts/launch-subagent.sh "Title" "Prompt" [--mcp name] [--wait]

The script auto-detects current session/profile and creates a child session.

Retrieval Modes

Mode Command Use When
Fire & forget (no --wait) Default. Tell user: "Ask me to check when ready"
On-demand agent-deck session output "Title" User asks to check
Blocking --wait flag Need immediate result

Recommended MCPs

Task Type MCPs
Web research exa, firecrawl
Code documentation context7
Complex reasoning sequential-thinking

Worker Prompt Conventions

When the prompt asks the worker to Edit or Write any existing file, include an explicit prelude-read as the first step. Claude Code's tool guard rejects Edit/Write before Read of the same path, and conductor-spawned workers hit this mid-task (#968) when their prompt jumps straight into the change. The interruption forces the worker to backfill reads inside its main loop, breaking flow and burning cycles.

Template skeleton — bake this into every worker prompt that mutates code:

## Step 0 — Prelude reads
Read every file you intend to Edit/Write below. Read calls are cheap
and do NOT count against scope discipline; they prevent tool-guard
interruptions mid-task. Skip only for paths that will be created fresh.

Files to read first:
  - <path/to/file/you/will/edit>
  - <path/to/other/file/you/will/edit>

## Step 1 — Investigation
…

## Step 2 — Implementation
…

Rules:

  • Any file the worker will modify: prelude Read it first, even if the worker "knows" the contents.
  • Brand-new files (Write to a path that does not yet exist): no prelude needed — the guard only applies to modifications.
  • For conductor nudges that re-fire the same prompt across cycles, keep Step 0 in the prompt. The second-cycle process may not have prior reads in context.

Completion sentinel (trustworthy "worker finished" signal, #1186)

A worker's Stop hook fires at the end of every turn, so "waiting" never means "done" — the conductor would otherwise have to poll RESULTS files / gh pr / session output to know a task actually finished. Instead, instruct every worker to assert completion by ending its final turn with a single machine-greppable line:

===AGENTDECK_DONE=== status=<ok|fail> summary=<one line to end of line>

agent-deck detects this on the Stop edge (scanning the transcript tail) and emits a distinct event to the parent:

[DONE] Child '<name>' (<id>) finished: status=ok summary=<...>

instead of the generic [EVENT] … is waiting. This is by-construction: completion is asserted by the only party that knows (the worker), not inferred from terminal cosmetics.

Bake this line into every worker prompt's final instruction:

## Final step — assert completion
When the task is fully done, print exactly this as the last line of your final message:
  ===AGENTDECK_DONE=== status=ok summary=<what you accomplished, one line>
Use status=fail if you could not complete it; put the blocker in the summary.

Notes:

  • Fires once per distinct completion (idempotent per task): re-reading the same sentinel across polls — or repeating an identical sentinel on a later Stop — does not re-emit. A genuinely new completion (different summary) emits again.
  • Absent on ordinary mid-task Stop edges, so existing waiting behavior is unchanged.
  • Malformed sentinel lines (missing/invalid status) are ignored, not guessed at.

Consult Another Agent (Codex, Gemini)

Use when: User says "consult with codex", "ask gemini", "get codex's opinion", "what does codex think", "consult another agent", "brainstorm with codex/gemini", "get a second opinion"

IMPORTANT: You MUST use the --tool flag to specify which agent. Without it, the script defaults to Claude.

Quick Reference

# Consult Codex (MUST include --tool codex)
$SKILL_DIR/scripts/launch-subagent.sh "Consult Codex" "Your question here" --tool codex --wait --timeout 120

# Consult Gemini (MUST include --tool gemini)
$SKILL_DIR/scripts/launch-subagent.sh "Consult Gemini" "Your question here" --tool gemini --wait --timeout 120

DO NOT try to create Codex/Gemini sessions manually with agent-deck add. Always use the script above. It handles tool-specific initialization, readiness detection, and output retrieval automatically.

Full Options

$SKILL_DIR/scripts/launch-subagent.sh "Title" "Prompt" \
  --tool codex|gemini \     # REQUIRED for non-Claude agents
  --path /project/dir \     # Working directory (auto-inherits parent path if omitted)
  --wait \                  # Block until response is ready
  --timeout 180 \           # Seconds to wait (default: 300)
  --mcp exa                 # Attach MCP servers (can repeat)

Supported Tools

Tool Flag Notes
Claude --tool claude Default, no flag needed
Codex --tool codex Requires codex CLI installed
Gemini --tool gemini Requires gemini CLI installed

How It Works

  1. Script auto-detects current session and profile
  2. Creates a child session with the specified tool in the parent's project directory
  3. Waits for the tool to initialize (handles Codex approval prompts automatically)
  4. Sends the question/prompt
  5. With --wait: polls until the agent responds, then returns the full output
  6. Without --wait: returns immediately, check output later with agent-deck session output "Title"

Examples

# Code review from Codex
$SKILL_DIR/scripts/launch-subagent.sh "Codex Review" "Read cmd/main.go and suggest improvements" --tool codex --wait --timeout 180

# Architecture feedback from Gemini
$SKILL_DIR/scripts/launch-subagent.sh "Gemini Arch" "Review the project structure and suggest better patterns" --tool gemini --wait --timeout 180

# Both in parallel (consult both, compare answers)
$SKILL_DIR/scripts/launch-subagent.sh "Ask Codex" "Best way to handle errors in Go?" --tool codex --wait --timeout 120 &
$SKILL_DIR/scripts/launch-subagent.sh "Ask Gemini" "Best way to handle errors in Go?" --tool gemini --wait --timeout 120 &
wait

Cleanup

After getting the response, remove the consultation session:

agent-deck remove "Consult Codex"
# Or remove multiple at once:
agent-deck remove "Codex Review" && agent-deck remove "Gemini Arch"

Peer (Root) Sessions vs Sub-Agents

The default — sub-agent linkage: agent-deck launch and agent-deck add, when invoked from inside an existing agent-deck session, automatically link the new session as a child of the calling session (sets parent_session_id, inherits the parent's group when -g is omitted, and grants --add-dir to the parent's project path). This is usually what you want for short-lived work sessions (plan / verify / release / consult).

When the default is wrong — root-level peer sessions: if you are creating a session that should stand independently at the root — a peer conductor, a standalone project session, a session that should outlive the current one, or anything that semantically is NOT a child of the calling session — pass the -no-parent flag.

Use case Parent linkage Flag
Plan / impl / verify worker for the current task ✅ child (default)
Consultation (codex / gemini / research) ✅ child (default)
Another conductor (root-level peer) ❌ child -no-parent
Project session unrelated to current work ❌ child -no-parent
Session intended to outlive the caller ❌ child -no-parent
# Root-level peer conductor, no parent linkage:
agent-deck launch ~/projects/foo -t "conductor-foo" -g "conductor" -c claude -no-parent -m "..."

# Verify after spawn:
agent-deck list --json | jq '.[] | select(.title=="conductor-foo") | .parent_session_id'
# Must print: null

Symptoms you created a sub-agent when you wanted a peer:

  • parent_session_id is non-null in list --json output
  • The new session's baked pane_start_command contains --add-dir <caller's path> even though you gave it a different project path
  • Transition events for the new session's children flow to the caller instead of the new peer
  • Event routing and heartbeat parent-linkage puts it under the caller's tree in the TUI

Fix for an already-created sub-agent: stop + remove the session, re-launch with -no-parent. There is no in-place un-parent flag.

Note on the launch-subagent.sh script: that script is specifically designed to create sub-agents (the name says so). It does NOT support -no-parent. For peer sessions, skip the script and invoke agent-deck launch -no-parent directly.

Conductors

A conductor is a persistent agent-deck session that orchestrates other sessions. It watches the rest of your sessions, auto-responds when confident, escalates to you when ambiguous, and optionally pairs with a remote channel (Telegram or Slack) so you can talk to it from your phone.

Use this section when the user says "conductor", "set up a conductor", "monitor sessions", "telegram bot for agent-deck", "slack bot for agent-deck", or "remote control my sessions".

# Create a conductor in the default profile
agent-deck conductor setup ops --description "Ops monitor"

# Create on a specific profile (work/personal/etc.)
agent-deck -p work conductor setup infra --description "Infra watcher"

# Use a non-Claude agent for the conductor itself
agent-deck conductor setup review --agent codex --description "Codex reviewer"

# Provide custom env (e.g., third-party Anthropic-compatible endpoint)
agent-deck conductor setup glm-bot \
  -env ANTHROPIC_BASE_URL=https://api.z.ai/api/anthropic \
  -env ANTHROPIC_AUTH_TOKEN=<token>

# Status across all conductors
agent-deck conductor status

# List configured conductors
agent-deck conductor list

Each conductor lives at ~/.agent-deck/conductor/<name>/ with its own CLAUDE.md (or AGENTS.md for Codex), meta.json, state.json, and task-log.md. Multiple conductors per profile are supported and each can pair with its own bot.

Channels (Telegram / Slack)

Channels are how a conductor talks to you remotely. Each conductor pairs one-to-one with its own bot — bots are not shared. agent-deck conductor setup interactively walks you through Telegram or Slack pairing during creation.

Key constraints:

  • The Telegram plugin must be installed under the conductor's Claude profile, but never globally enabled in settings.json. Per-session activation happens via the channels field on the conductor's session record.
  • Bot tokens live at <channel-state-dir>/.env (chmod 600). Never committed to git.
  • Exactly one bot, one conductor, one chat — the routing is deterministic.
  • Watch for the "many competing telegram pollers" gotcha (see Known Gotchas section below) — child sessions inherit TELEGRAM_STATE_DIR and can leak duplicate pollers on the same bot token, causing 409 conflicts.

See: documentation/CONDUCTOR.md for the full ten-minute quickstart, telegram bot creation flow, multi-conductor setups, and lifecycle commands.

TUI Keyboard Shortcuts

Navigation

Key Action
j/k or ↑/↓ Move up/down
h/l or ←/→ Collapse/expand groups
Enter Attach to session

Session Actions

Key Action
n New session
r/R Restart (reloads MCPs)
m MCP Manager
s Skills Manager
f/F Fork Claude/Pi session
d Delete
M Move to group

Search & Filter

Key Action
/ Local search
G Global search (all Claude conversations)
!@#$ Filter by status (running/waiting/idle/error)

Global

Key Action
? Help overlay
Ctrl+Q Detach (keep tmux running)
Ctrl+E Open feedback dialog
q Quit

MCP Management

Default: Do NOT attach MCPs unless user explicitly requests.

# List available
agent-deck mcp list

# Attach and restart
agent-deck mcp attach <session> <mcp-name>
agent-deck session restart <session>

# Or attach on create
agent-deck add -t "Task" -c claude --mcp exa /path

Scopes:

  • LOCAL (default) - .mcp.json in project, affects only that session
  • GLOBAL (--global) - Claude config, affects all projects

Worktree Workflows

Create Session in Git Worktree

When working on a feature that needs isolation from main branch:

# Create session with new worktree and branch
agent-deck add /path/to/repo -t "Feature Work" -c claude --worktree feature/my-feature --new-branch

# Create session in existing branch's worktree
agent-deck add . --worktree develop -c claude

List and Manage Worktrees

# List all worktrees and their associated sessions
agent-deck worktree list

# Show detailed info for a session's worktree
agent-deck worktree info "My Session"

# Find orphaned worktrees/sessions (dry-run)
agent-deck worktree cleanup

# Actually clean up orphans
agent-deck worktree cleanup --force

When to Use Worktrees

Use Case Benefit
Parallel agent work Multiple agents on same repo, different branches
Feature isolation Keep main branch clean while agent experiments
Code review Agent reviews PR in worktree while main work continues
Hotfix work Quick branch off main without disrupting feature work

Watchers

Watchers listen for inbound events (webhooks, push notifications, GitHub events, Slack messages) and route them into conductor sessions. Use them when the user says "set up a watcher", "listen for webhooks", "route GitHub events to my conductor", "forward ntfy notifications", or similar.

Four adapter types are supported:

Type Required flag Typical use
webhook --port Generic HTTP listener
github --secret GitHub repo webhooks with HMAC verification
ntfy --topic ntfy.sh push notifications
slack --topic Slack (via Cloudflare Worker bridge)
agent-deck watcher create <type> --name <name> <adapter-flags...>
agent-deck watcher start <name>
agent-deck watcher list                # health + events/hour
agent-deck watcher test <name>         # synthetic event (verify routing)

Full conversational setup flow is available as a separate skill:

agent-deck watcher install-skill watcher-creator

After running the install command, read the installed watcher-creator/SKILL.md to walk the user through adapter selection, required settings, and configuring the effective watcher data dir's clients.json routing (${XDG_DATA_HOME:-$HOME/.local/share}/agent-deck/watcher/clients.json for new users; legacy ~/.agent-deck/watcher/clients.json when existing watcher state is present).

See agent-deck watcher --help for the full command surface and per-adapter examples.

Self-Improvement

Use when: user says "self-improve", "analyze my conductor", "what bugs are we hitting", "file issues from my usage", or asks the conductor to learn from past conversations.

A pipeline that analyzes a conductor's own Claude Code conversation transcripts and surfaces actionable signal:

  • Bugs — tool errors, user-reported issues, recurring friction (with citations back to the source transcript)
  • Workflow patterns — repeated multi-step sequences worth promoting to a skill or script
  • Capability discoveries — undocumented commands / flags / recipes the conductor used in real work
  • User corrections — meta-rules captured from "no, do it this way" exchanges, suitable to encode in the conductor's CLAUDE.md

Output lives at the conductor root and is regenerated on each run:

~/.agent-deck/conductor/<name>/
├── FINDINGS.md              # raw synthesis of the latest run
├── CAPABILITIES.md          # curated inventory (you edit; survives runs)
├── analysis-manifest.json   # tracking — sha + line counts + analyzer session IDs
└── analysis/                # scripts, prompts, distilled transcripts, per-transcript reports

Quick start

SKILL_DIR="<base directory shown when this skill was loaded>"
SELFIMP="$SKILL_DIR/scripts/self-improvement"

# Phase 1 — distill all transcripts for one conductor (Python, no LLM, ~1 min)
mkdir -p ~/.agent-deck/conductor/<name>/analysis/distilled
for f in ~/.claude*/projects/-home-*-agent-deck-conductor-<name>/*.jsonl; do
  sid=$(basename "$f" .jsonl | cut -c1-8)
  python3 "$SELFIMP/distill.py" "$f" ~/.agent-deck/conductor/<name>/analysis/distilled/$sid.md
done

# Phase 2 — analyze + synthesize (spawns agent-deck sub-sessions, ~30 min, ~$5)
cp -r "$SELFIMP/prompts" ~/.agent-deck/conductor/<name>/analysis/
bash "$SELFIMP/run-analyzers.sh"   # paced, resumable via manifest

# Phase 3 — file issues from FINDINGS.md (interactive; never auto-files)
bash "$SELFIMP/file-issues.sh"

Privacy

Three layers run before anything leaves the box: regex sanitize → AI sanitizer session → independent AI auditor session. The auditor must verdict SAFE_TO_SHARE before the filer will submit. Each layer covers the others' blind spots (regex catches tokens / IPs / paths; AI catches contextual names; auditor catches what the first two missed with fresh eyes). Human review is non-negotiable — the filer prints the exact gh command and waits for [f] before running it.

Constraints

  • All LLM work happens in spawned agent-deck sub-sessions — never via the Anthropic SDK directly. This dogfoods agent-deck and uses your Claude Max plan.
  • Sequential with 30-60s pacing between launches — rate-limit safe.
  • Manifest-based resume — re-runs only process new or grown transcripts.

Deep dive

For the full architecture, output schemas, lessons learned from real runs, and per-script reference, see references/self-improvement.md.

Goal (goal-driven worker autonomy)

Use when: user says "pursue this", "set a goal", "make it work until done", "nudge the agent", "stop me having to message it again", or describes wanting an agent to keep working autonomously toward a specific goal without manual re-prompting.

A complementary layer on top of Self-Improvement. Self-improvement is post-hoc analysis. Goal is the live mechanism that prevents the kinds of stalls self-improvement keeps surfacing — specifically the FINDINGS pattern where a conductor's hourly cron fires 18 times with identical [STATUS] replies and no actual progress.

The core idea

Three entities, never collapsed:

Entity Job Restriction
Worker Take one bounded step per cycle, write a progress receipt May NOT decide it's done. May NOT escalate.
Verifier An external shell command — runs the done-condition independently NOT an LLM. NOT the worker's self-assessment.
Manager Small Python daemon (cron'd) — runs the verifier, reads receipts, nudges the worker, escalates to user when stuck NOT involved in doing the work

Separating these three concerns is what prevents the "agent keeps reporting status but never finishes" failure mode the FINDINGS captured.

Done-conditions must be shell commands

Examples that work:

  • gh release view v1.6.0 -R asheshgoplani/agent-deck --json publishedAt | jq -e '.publishedAt != null'
  • gh pr view 890 -R asheshgoplani/agent-deck --json mergedAt | jq -e '.mergedAt != null'
  • test -s /tmp/report.csv && [ "$(wc -l < /tmp/report.csv)" -gt 100 ]

Examples that DON'T work:

  • "Get this working" → not testable
  • "Make the code better" → not measurable
  • "Worker says it's done" → self-judgment (the bug we're avoiding)

Quick start (Phase 1 — hand-wired proof)

The full spec is in references/goal.md. For early use, follow Phase 1:

  1. Write a goal JSON at ~/.agent-deck/goals/<id>.json (schema in the deep-dive doc).
  2. Spawn the worker with the contract prompt (template in the deep-dive doc).
  3. Run the manager script manually every few minutes to check + nudge.
  4. After one real goal completes successfully, graduate to Phase 2 (CLI wrapper) and Phase 3 (cron'd daemon).

Future CLI surface (Phase 2)

agent-deck goal \
    --goal "Ship agent-deck v1.6.0" \
    --done 'gh release view v1.6.0 --json publishedAt | jq -e ".publishedAt != null"' \
    --check-every 5m \
    --max-idle 1h \
    --escalate-after 3 \
    --max-cycles 24

agent-deck goal list           # active goals + state
agent-deck goal show <id>      # full JSON dump
agent-deck goal tail <id>      # tail the worker's task-log.md
agent-deck goal cancel <id>    # stop the worker
agent-deck goal resume <id> "<hint>"  # send context-rich hint, reset nudge counter

Deep dive

For the full design — three-entity model, registry schema, worker contract prompt, manager loop pseudocode, nudge generator, escalation bundle, done-condition guidelines, failure modes, implementation phases, and the verification this closes the FINDINGS 18-hour stall — see references/goal.md.

Trust-but-Verify

Use when: A conductor or worker reports any non-trivial completion claim — "PR is merge-ready", "release shipped", "tap updated", "comment posted", "goal done", "bulk drain complete". Treat all such claims as unverified until an independent verifier session has hit ground truth.

The rule

For ANY non-trivial done-claim, spawn a separate Claude session (not the same conductor, not the same worker) whose job is to re-derive the claim from primary sources. The verifier MUST hit:

  • Live state (test runs, GitHub API, release artifacts, file contents on disk)
  • Not transcripts. Not the original worker's self-report. Not a same-session review.

Same-session reviewers carry the same blind spots that produced the claim. A separate session re-reads the world from scratch.

Claim → verifier mapping

Done-claim Independent verifier MUST run
"PR is merge-ready" gh pr checkout <N> && go test -race ./... AND gh pr view <N> --json mergeable,statusCheckRollup
"release shipped" gh release view <tag> --json assets AND probe each download URL (HTTP 200, non-empty body, correct content-type)
"brew tap updated" gh api repos/<owner>/homebrew-tap/contents/Formula/agent-deck.rb — confirm the SHA + URL match the new release
"comment posted" gh issue view <N> --comments (or gh pr view <N> --comments) and string-match the expected body
"goal worker done" The manager's done_cmd must return rc=0 AS PART OF the verify cycle — the worker's self-report alone never closes a goal
"bulk drain complete" Spawn a read-only AUDIT worker that enumerates the residual open items (open issues, open PRs, pending releases) and diffs them against the claimed-closed set

How to spawn the verifier

Use launch-subagent.sh from a different session than the one making the claim (peer session, not child of the claimant). Block with --wait so the conductor gets the verifier's answer before reporting up:

$SKILL_DIR/scripts/launch-subagent.sh "Verify PR #885" \
  "Independently verify PR #885 is merge-ready. Run \`gh pr checkout 885 && go test -race ./...\` and \`gh pr view 885 --json mergeable,statusCheckRollup\`. Report PASS only if BOTH succeed; otherwise report the exact failing tests + the mergeable state." \
  --wait --timeout 600

Key points:

  • Verifier prompt asks for primary-source commands, not opinions. The verifier reports what the commands returned, not what it thinks.
  • --wait is mandatory. Fire-and-forget defeats the purpose — the claim ships before the verifier answers.
  • One verifier per claim. Don't fold three claims into one verifier; that re-introduces the blind-spot pattern at a smaller scale.
  • Verifier MUST be a peer (or root) session, not a child of the claimant. A child of the claimant inherits the claimant's working state and is structurally closer to "same blind spots".

Concrete examples from 2026-05-18 (the day this section was baked in)

  1. PR #885 over-claim. A worker review reported "merge-ready; clean diff". Independent local re-test (gh pr checkout 885 && go test -race ./...) caught 2 failing tests. The reviewer had grepped the diff for obvious issues but never ran the suite. Lesson: code-review confidence ≠ test pass.

  2. ux-rethink-path verdict false-positive. The ux-rethink-path worker reported "merge as-is; clean diff" for #885. User asked for a second look; a fresh peer verifier caught the same 2 failing tests the original worker missed. Lesson: reviewer + author in the same session-tree share priors. A peer session re-derives.

  3. Goal framework "metronome wakes". The conductor's hourly wake cycles were firing [STATUS] replies with no actual work — the wake-loop had degenerated into a heartbeat instead of a do-work loop. User had to flag it manually. Lesson: a worker that hasn't run a ground-truth check since the last wake is not a working worker, it's a metronome. Bake the priority-0 check (below) into the wake template so this can't recur.

Wake-loop priority 0 (mandatory)

The goal worker contract prompt at scripts/goal/prompts/worker.md now begins each wake with PRIORITY 0: before reporting status or taking the next bounded step, run one ground-truth verifier check against any non-trivial claim made in the previous wake's receipt. If no claim was made last wake, skip priority 0 and go to step 1.

This converts a metronome wake into a do-work wake. See the worker template for the exact wording the cycle uses.

Anti-patterns

  • ❌ "I reviewed my own diff and it looks clean" — same-session review.
  • ❌ "Codex agreed with me" — consult is brainstorming, not verification. Codex did not run your tests.
  • ❌ "The PR's CI is green" alone — CI lag, flaky tests, or test-skip can hide real failures. Pair CI with a local re-run.
  • ❌ "Bulk close all 200 — done." — without an audit pass, you don't know how many silently failed.
  • ❌ Fire-and-forget verifier — by the time it answers, the conductor has already shipped the claim upstream.

When you don't need a verifier

For trivial mechanical actions where the action IS its own verification (and the claim is "I did X"):

  • File edit + git diff visible → diff IS verification
  • agent-deck list reporting current sessions → list IS verification
  • Reading a file and quoting its contents → quote IS verification

The verifier requirement attaches to claims about external mutable state: PRs, releases, comments, deployments, bulk operations.

Configuration

File: ~/.agent-deck/config.toml

[claude]
config_dir = "~/.claude-work"    # Custom Claude profile
dangerous_mode = true            # --dangerously-skip-permissions
use_chrome = false               # --chrome
use_teammate_mode = false        # --teammate-mode tmux
extra_args = ["--agent", "reviewer"]

[logs]
max_size_mb = 10                 # Max before truncation
max_lines = 10000                # Lines to keep

[mcps.exa]
command = "npx"
args = ["-y", "exa-mcp-server"]
env = { EXA_API_KEY = "key" }
description = "Web search"

See config-reference.md for all options.

Troubleshooting

Issue Solution
Session shows error agent-deck session start <name>
MCPs not loading agent-deck session restart <name>
Flag not working Put flags BEFORE arguments: -m "msg" name not name -m "msg"

Get Help

  • Discord: discord.gg/e4xSs6NBN8 for quick questions and community support
  • GitHub Issues: For bug reports and feature requests

Report a Bug

If something isn't working, create a GitHub issue with context:

# Gather debug info
agent-deck version
agent-deck status --json
cat ~/.agent-deck/config.toml | grep -v "KEY\|TOKEN\|SECRET"  # Sanitized config

# Create issue at:
# https://github.com/asheshgoplani/agent-deck/issues/new

Include:

  1. What you tried (command/action)
  2. What happened vs expected
  3. Output of commands above
  4. Relevant log: tail -100 ~/.agent-deck/logs/agentdeck_<session>_*.log

See troubleshooting.md for detailed diagnostics.

Session Sharing

Share Claude sessions between developers for collaboration or handoff.

Use when: User says "share session", "export session", "send to colleague", "import session"

# Export current session to file (session-share is a sibling skill)
$SKILL_DIR/../session-share/scripts/export.sh
# Output: ~/session-shares/session-<date>-<title>.json

# Import received session
$SKILL_DIR/../session-share/scripts/import.sh ~/Downloads/session-file.json

See: session-share skill for full documentation.

Switch a Session to Another Claude Account

Move a session — conversation included — to a different Claude account (work/personal/client) and continue exactly where it left off.

Use when: User says "switch account", "move this conversation to my other account", "continue this session on account X", "this session should use the account".

One-time setup — name each account in ~/.agent-deck/config.toml (the target profile must already be logged in: CLAUDE_CONFIG_DIR=<dir> claude/login):

[profiles.personal.claude]
  config_dir = "~/.claude"
[profiles.work.claude]
  config_dir = "~/.claude-work"

Commands:

# Full flow: stop → copy conversation into the target account → set account → restart with --resume
agent-deck session switch-account <session> <account>

# Skip the restart (e.g. switch several sessions, restart later)
agent-deck session switch-account <session> <account> --no-restart

# Equivalent low-level form — also migrates the conversation; restart required
agent-deck session set <session> account <account>

How it works / guarantees:

  • The conversation .jsonl is copied into <target-config-dir>/projects/<encoded-path>/ — the old account keeps its copy; a conflicting file in the target is backed up as .bak-<timestamp> first. The session id does not change.
  • claude --resume is a pure file lookup, so the restarted session continues with full history under the new account's auth.
  • Tools/MCPs/plugins and usage limits follow the new account; enable any needed plugins in the target profile (e.g. CLAUDE_CONFIG_DIR=<dir> claude plugin enable telegram@claude-plugins-official for channel owners).
  • A fresh session with no conversation yet switches cleanly (nothing to migrate).
  • Unknown account names error and list the configured ones.

Make a whole group/conductor use the account going forward: set [groups.<name>.claude].config_dir / [conductors.<name>.claude].config_dir to the same dir in config.toml — new sessions there spawn on that account; switch-account is what carries existing conversations over.

Critical Rules

  1. Flags before arguments: session start -m "Hello" name (not name -m "Hello")
  2. Restart after MCP attach: Always run session restart after mcp attach
  3. Never poll from other agents - can interfere with target session

Known Gotchas (v1.7.0+)

Friction points discovered during real usage. Work around them per the patterns below.

session send --no-wait can leave prompts typed-but-not-submitted

On a freshly-launched Claude session, agent-deck session send --no-wait <id> "..." may paste the message into the input buffer before Claude is fully ready, leaving it TYPED but not SUBMITTED. Classic race.

Workaround (always safe):

agent-deck -p <profile> session send <id> "..." --no-wait -q
sleep 3
# Get the tmux session name and send Enter to submit
TMUX=$(agent-deck -p <profile> session show --json <id> | jq -r .tmux_session)
tmux send-keys -t "$TMUX" Enter

The Enter is idempotent — if already submitted, it's just a no-op newline. Use this pattern every time you session send --no-wait to a freshly-launched session.

Alternative: omit --no-wait so the built-in 60s readiness wait kicks in before submitting.

Replacing the binary while agent-deck is running (text file busy)

If /usr/local/bin/agent-deck is a symlink to a build artifact and the binary is currently running (any tmux session, any daemon), a direct cp over it fails with Text file busy.

Workaround — move-then-copy (keeps running processes on the old inode):

INSTALL=$(which agent-deck)
TARGET=$(readlink -f "$INSTALL")
go build -ldflags "-X main.Version=X.Y.Z" -o /tmp/agent-deck-new ./cmd/agent-deck
mv "$TARGET" "$TARGET.old"
cp /tmp/agent-deck-new "$TARGET" && chmod +x "$TARGET"
agent-deck --version    # verify
rm "$TARGET.old"

Kernel tracks inodes, not names. Running processes keep a reference to the renamed inode; new invocations resolve through the original name to the new inode.

Cross-machine config drift (macOS ↔ Linux)

If ~/.agent-deck/skills/sources.toml (or other config files) were copied verbatim from a macOS machine, paths like /Users/<name>/ won't exist on Linux (should be /home/<user>/). The symptom: agent-deck skill list returns "No skills found" while the pool directory is clearly populated.

Check & fix:

grep -n "/Users/" ~/.agent-deck/skills/sources.toml
# If any matches, substitute the Linux home path:
sed -i "s|/Users/<mac-user>|$HOME|g" ~/.agent-deck/skills/sources.toml

Channel subscription for conductor/bot sessions (v1.7.0+)

For a session to receive Telegram/Discord/Slack messages as conversation turns (not just as MCP tool calls), it MUST be started with --channels <plugin-id>. Use the first-class field:

# At creation (preferred):
agent-deck -p personal add --channel plugin:telegram@claude-plugins-official -c claude -t my-bot /path

# Or after creation, then restart:
agent-deck -p personal session set my-bot channels plugin:telegram@claude-plugins-official
agent-deck -p personal session restart my-bot

The channels field persists and every session start / session restart rebuilds the claude invocation with --channels. Do NOT rely on .mcp.json telegram entries — those load the plugin as a regular MCP (tools only), not a channel (inbound delivery).

Note — v1.7.0 display bug: agent-deck session show --json <id> currently omits the channels field (fix pending). agent-deck list --json | jq '.[] | select(.id==<id>)' shows it correctly. Data is persisted fine regardless.

Many competing telegram pollers after multiple session starts

Telegram's Bot API getUpdates is single-consumer per bot token. If N Claude sessions all load the telegram plugin, N bun pollers race for messages — deliveries land in whichever wins, not where you want them.

Correct topology: exactly ONE session loads the telegram channel plugin (normally the conductor, via --channels at start-time). All other sessions should NOT have telegram in their enabled plugins.

Disable globally: in ~/.claude/settings.json:

"enabledPlugins": {
  "telegram@claude-plugins-official": false
}

Enable per-session: via --channel on the specific session that should receive messages. See "Channel subscription" above.

Debug: pgrep -af "bun.*telegram" | wc -l should return 1. Anything higher means a race. Kill extras: pkill -f "bun.*telegram" then restart only the intended session.

Telegram conductor topology (v1.7.22+)

Supported topology — enforce this on every conductor host:

  • Telegram is activated per-session via --channels plugin:telegram@claude-plugins-official. This is the only supported activation path for a conductor bot.
  • TELEGRAM_STATE_DIR is injected exclusively via [conductors.<name>.claude].env_file in ~/.agent-deck/config.toml. The env file sources deterministically on both fresh-start and --resume spawns.
  • One bot token = one channel-owning session. Never share tokens between sessions.
  • enabledPlugins."telegram@claude-plugins-official" in the profile settings.json must be absent or false. Global enablement makes every claude subprocess (including child agents) load the plugin.

Codified anti-patterns — agent-deck v1.7.22 emits warnings for these:

Anti-pattern Code Why it breaks
enabledPlugins."telegram@claude-plugins-official" = true in profile settings GLOBAL_ANTIPATTERN Every claude process loads the plugin, including every child agent the conductor spawns. Each one starts a bun telegram poller.
Global enablement AND --channels plugin:telegram@... on the same session DOUBLE_LOAD The plugin loads twice in one claude process. Two bun pollers race on one bot token and Telegram rejects with 409 Conflict.
session set wrapper "TELEGRAM_STATE_DIR=... {command}" WRAPPER_DEPRECATED Works on the resume path; silently fails on fresh-start due to bash -c argv splitting. The env var never reaches claude, so the plugin falls back to the default state dir and two conductors collide. Use env_file instead.
Relying on .mcp.json telegram entries for inbound delivery .mcp.json loads the plugin as an MCP server (tool-use only). Inbound message → conversation-turn delivery requires --channels.
Using the same bot token for multiple concurrent sessions getUpdates is single-consumer per token.
Assuming an empty TELEGRAM_STATE_DIR is fine The plugin falls back to ~/.claude/channels/telegram/; any DM approval there leaks across unrelated conductors.

Verifying steady state (conductor host):

pgrep -af 'bun.*telegram' | grep -v grep | wc -l   # expect: exactly one per conductor bot
for PID in $(pgrep -f 'bun.*telegram.*start'); do
  echo "PID=$PID TSD=$(tr '\0' '\n' < /proc/$PID/environ | grep ^TELEGRAM_STATE_DIR= | cut -d= -f2-)"
done
# Each PID must show a distinct TELEGRAM_STATE_DIR; collisions indicate env_file is not being sourced.

When agent-deck emits a ⚠ GLOBAL_ANTIPATTERN / DOUBLE_LOAD / WRAPPER_DEPRECATED warning, the problem is in your topology, not in agent-deck. Fix the profile settings or the conductor env_file; the warning is a leading indicator of the 409-Conflict symptom that follows minutes-to-hours later.

v1.9.x findings (filed via the self-improvement pipeline)

These were surfaced by mining real conductor transcripts (see Self-Improvement). Each links to its filed GH issue.

Symptom Workaround Issue
agent-deck rm driven by xargs -P N reports ✓ Removed but rows persist Sequential while read id; do agent-deck rm "$id"; done — never xargs -P for agent-deck mutations #961
session send --wait --timeout 300s still times out at 80s if recipient is busy Poll until ready: until [ "$(agent-deck session show --json $id | jq -r .status)" = "waiting" ]; do sleep 10; done #957
Conductor restart wipes Claude history when claude_session_id is empty / "none" Use tool: claude not tool: shell for conductors so JSONL-resume applies #956
Transition-notifier replays events for removed sessions out of ~/.agent-deck/inboxes/<id>.jsonl Truncate the inbox file by hand after rm #962
Feedback dialog never shows for users with feedback_enabled:false (single boolean gates everything) Manual flip in feedback-state.json to reopt-in #967
All sessions die on SSH logout (tmux server in login-session cgroup) loginctl enable-linger on the host + launch_in_user_scope=true #958
Parallel agent-deck launch cascade → swap thrash → workers + conductor die Sequential launches; cap parallelism; don't reach for vm.overcommit_memory=2 (worsens it) #964
Orphaned context7 MCP procs (PPID=1) accumulating; pkill -f context7-mcp from inside the conductor self-immolates Guard with $$ check: grep -q $$ <(pgrep -f "<pat>") || pkill -f "<pat>" #965
launch-subagent.sh puts children in parent's conductor group instead of project group Always pass -g <project-group> explicitly #972
Bare slash commands sent via session send ignored on a freshly restarted child Wrap conversationally: "Please run /cmd …" #966
.mcp.json plugin version pins go stale after plugin upgrade After /mcp reload, rewrite .mcp.json from current plugin spec #960
Cron heartbeat NEED: lines repeat unchanged for 12-21h with no auto-retire After 3 repeats, change tactic — escalate explicitly or spawn a different worker #971
agent-deck launch -m "<rich text>" short-flag parser misroutes — text after -m becomes positional [path] Use long-form flags: --message, --title, --group, --parent (filed in batch)
CLI verb inconsistency: session update --no-parent, group remove, launch -parent all rejected Correct verbs: session unset-parent, group delete, launch does not accept -parent (it's automatic) #974

See the Self-Improvement section for how these were discovered and how to surface more from your own conductor's transcripts.

References

User guides (full how-to, in the repo):

Reference (shipped with this skill):

Install via CLI
npx skills add https://github.com/asheshgoplani/agent-deck --skill agent-deck
Repository Details
star Stars 2,672
call_split Forks 316
navigation Branch main
article Path SKILL.md
More from Creator
asheshgoplani
asheshgoplani Explore all skills →