name: ccb-self-recover
description: Recover CCB agents, panes, mounts, provider contexts, API/provider failures, config reload aftermath, clear operations, and guarded single-agent restarts. Use when the user asks to fix, recover, restart if safe, clear context, reload, remount, or keep work going after provider/API failure.
CCB Self Recover
Use this skill for runtime recovery after diagnosis. Mutations must go through CCB control-plane commands. Raw tmux mutation and direct runtime-file writes are forbidden.
Recovery Gates
Before any mutation:
- Confirm maintenance intent from the user, such as "fix", "recover", "restart if safe", "apply this config", or "make CCB healthy".
- Read the current mounted daemon graph. The target must be a current daemon-graph agent for clear or restart-like actions.
- Check busy/pending state:
  - `ccb ps`
  - `ccb queue --detail <agent|all>`
  - `ccb pend --inbox --detail <agent>`
  - `ccb trace <id>` when the issue involves active or pending lineage
- Check `ccb fault list`. If active fault-injection rules affect the target, treat them as diagnostic evidence. Clear them with `ccb fault clear <rule_id|all>` only when the user intended maintenance and the rules are known test residue. If a rule is recent, or its task/reason fields could represent an active drill, ask before clearing unless the user explicitly asked to clear fault-injection rules.
- Choose the least disruptive supported action.
- Report exact commands, gates, affected agents, blockers, and what remains unchanged.
If the target is unknown, busy, has queued work, has pending reply delivery, or has a pending callback continuation, stop and report blockers.
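The stop conditions above can be sketched as a small pure function. This is only an illustration: `AgentState`, its field names, `recovery_blockers`, and `may_mutate` are hypothetical and not part of CCB; the values are assumed to be filled in by hand from the output of the checks listed in this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentState:
    """Hypothetical snapshot, filled in by hand from the checks above."""
    in_daemon_graph: bool    # target is a current daemon-graph agent
    busy: bool               # agent is currently working
    queued_jobs: int         # queued work for the agent
    pending_replies: int     # replies not yet delivered
    pending_callbacks: int   # callback continuations not yet run


def recovery_blockers(state: AgentState) -> List[str]:
    """Return every reason a mutation must stop; an empty list means no blocker."""
    blockers = []
    if not state.in_daemon_graph:
        blockers.append("target is not a current daemon-graph agent")
    if state.busy:
        blockers.append("agent is busy")
    if state.queued_jobs > 0:
        blockers.append("agent has queued work")
    if state.pending_replies > 0:
        blockers.append("agent has pending reply delivery")
    if state.pending_callbacks > 0:
        blockers.append("agent has a pending callback continuation")
    return blockers


def may_mutate(user_intends_maintenance: bool, state: AgentState) -> bool:
    """A mutation needs both maintenance intent and an empty blocker list."""
    return user_intends_maintenance and not recovery_blockers(state)
```

Returning the full blocker list, rather than stopping at the first hit, matches the requirement to report blockers and not just refuse.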
Provider/API Or Startup-Input Recovery
For provider/API failures or changes that affect provider process, model, base URL, environment, provider profile, command template, role assets, or startup context, use this exact flow:
- Gather evidence without reading secrets.
- Use built-in `ccb-config` to edit `.ccb/ccb.config` only when the fallback provider/model/base URL/profile/env-var reference is already configured or explicitly supplied by the user as a safe reference.
- Run `ccb config validate`.
- Run `ccb reload --dry-run`.
- If validation and dry-run pass, and the user intended materialization, run `ccb reload`.
- Re-check the current daemon graph and affected agent status.
- Decide whether affected running agents still use stale provider process, environment, model, base URL, role asset, or context state.
- If runtime refresh is still needed, restart only one affected current-graph agent at a time with `ccb restart <agent>`, and only when busy checks pass.
- If `ccb restart <agent>` returns `blocked` or `failed`, report the blockers. Do not emulate restart with tmux commands. The remaining user-level options are to continue with unaffected agents or explicitly stop and restart the project with `ccb kill` then `ccb`; do not run project shutdown autonomously as a substitute for single-agent restart.
ccb reload is not the recovery finish line. It materializes config into the
daemon graph; running provider processes may still hold old startup inputs.
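The ordering of this flow can be sketched as a planner that stops at the first gate that fails. This is a minimal illustration, not CCB code: `recovery_plan` and its parameters are hypothetical, and the booleans stand in for results a human or agent has already observed. The sketch only produces command strings and never executes them.

```python
from typing import List, Set


def recovery_plan(
    validate_ok: bool,
    dry_run_ok: bool,
    user_wants_materialization: bool,
    stale_agents: List[str],
    agents_passing_busy_checks: Set[str],
) -> List[str]:
    """Return the commands permitted so far, in order, stopping at the first failed gate."""
    plan = ["ccb config validate"]
    if not validate_ok:
        return plan  # validation failed: nothing further is allowed
    plan.append("ccb reload --dry-run")
    if not dry_run_ok or not user_wants_materialization:
        return plan  # no materialization without a passing dry-run and user intent
    plan.append("ccb reload")
    # Reload is not the finish line: agents still holding stale startup
    # inputs are restarted one at a time, and only if busy checks pass.
    for agent in stale_agents:
        if agent in agents_passing_busy_checks:
            plan.append(f"ccb restart {agent}")
    return plan
```

In practice the busy checks would be repeated immediately before each restart, because an agent that was idle when the plan was built may have picked up work since. Stale agents that fail the busy checks are left out of the plan and reported as blocked.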
Supported Actions
- `ccb clear <agent>`: provider-native conversation/context clear. Run the pre-mutation checks in the Recovery Gates section first; use only when context clearing is the right fix and pending-work checks pass.
- `ccb reload --dry-run`: no-mutation config reload plan. Always safe in maintenance workflows.
- `ccb reload`: config materialization after `ccb config validate`, `ccb reload --dry-run`, supported plan, and explicit user materialization intent.
- `ccb restart <agent>`: one configured pane-backed current-graph agent after busy checks pass. The command itself must report `restart_status`, blockers, restartable agents, busy gate evidence, and old/new runtime evidence.
- `ccb roles update agentroles.ccb_self` or `ccb roles sync <path>`: role asset repair when the user is repairing `ccb_self` itself, there is no active maintenance operation that depends on the current role assets, and the target role/source version is clear.
Handoffs
- Use `ccb-self-chain` first when trace evidence shows message/reply lineage is the primary problem. Restart is not the first repair for a broken job chain.
- Use built-in `ccb-config` for disk config edits and affected-agent reporting. After reload, this skill owns guarded runtime refresh decisions.
- Return the original business work to the original target agent unless the user explicitly retargets it.
Red Lines
- Never restart all agents or unrelated agents.
- Never use `tmux kill-pane`, `kill-window`, `kill-server`, `respawn-pane`, `send-keys`, manual pane creation, or other raw tmux mutation.
- Never write lifecycle, lease, runtime, mailbox, provider session, or tmux authority files directly.
- Never read, print, store, search for, scrape, borrow, or use API keys or credentials.
- Never treat `.ccb/agents/*`, disk config, pid files, or tmux panes as live restart target authority.
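The raw tmux red line could be backed by a pre-flight check on any command string before it is run. The following is a rough sketch under stated assumptions: `violates_tmux_red_line` is a hypothetical helper, and treating `split-window` and `new-window` as "manual pane creation" is an interpretation of the rule, not something this skill spells out.

```python
import shlex

# Subcommands named explicitly in the red lines above.
FORBIDDEN_TMUX_SUBCOMMANDS = {
    "kill-pane", "kill-window", "kill-server", "respawn-pane", "send-keys",
}
# Assumption: these tmux subcommands are what "manual pane creation" refers to.
ASSUMED_PANE_CREATION = {"split-window", "new-window"}


def violates_tmux_red_line(command: str) -> bool:
    """Return True if the command is a tmux call containing a forbidden subcommand."""
    argv = shlex.split(command)
    if not argv or argv[0] != "tmux":
        return False
    banned = FORBIDDEN_TMUX_SUBCOMMANDS | ASSUMED_PANE_CREATION
    # Scan every argument so global options before the subcommand
    # (for example a socket name) do not hide it.
    return any(arg in banned for arg in argv[1:])
```

The check is deliberately conservative and incomplete. It flags a forbidden word anywhere in the arguments, and it does not recognize tmux command aliases or abbreviated subcommand names, so a `False` result is not proof that a command is safe. "Other raw tmux mutation" still has to be judged case by case.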