effort: medium name: relay description: | The ONLY way to call Codex, DeepSeek, MiMo, or Grok. Use whenever the user wants to ask, delegate to, or get a second opinion from Codex, DeepSeek, MiMo, or Grok. Do NOT run the codex, deepseek (ds), mimo (mm), or grok CLI directly — from the main agent or a subagent; always use this skill's relay call command. Triggers on "ask/have/send to/get/delegate to codex" or the same with "deepseek"/"mimo"/"grok", "second opinion", "relay". allowed-tools: Read, Write, Bash(relay:), Bash(find:), Bash(printf:*) user-invocable: true
Relay
Claude-only. If ANTHROPIC_BASE_URL contains deepseek or xiaomimimo, this skill is unavailable — stop and tell the user: "relay is Claude-only; a non-Claude session cannot orchestrate other models." The relay script also refuses at the shell layer.
Call Codex, DeepSeek, MiMo, or Grok like a function: one command generates the request, invokes the peer, and prints the response.
The peer is a full agent in the Claude Code harness — not a stateless API call. Relay invokes each peer through its registered transport (DeepSeek/MiMo via
claude -pwith the model weights swapped — V4-Pro, MiMo-V2.5-Pro; Codex viacodex exec; Grok via its owngrokCLI), so the peer has your core tools — Bash, file read/write, Grep/Glob, subagents, multi-step agentic loops. It can see this repo, run commands, and verify its own work; delegate file I/O and shell work directly. Do not treat it as a one-shot completion that "can't see the codebase." Web tools (WebFetch/WebSearch) are registered on every peer and broadly work — verified 2026-06-06: Codex and both Grok tiers do both; the only gaps are DeepSeek's WebFetch (400thinking options ... reasoning_effortunder its forcedmaxeffort) and MiMo's WebSearch (no live search), and each of those two still has the other web tool. So every peer can reach the web — treat browsing as available, not a per-call unknown to re-verify. The only constant difference from you is the model behind the harness.
relay call --name <slug> [--to <peer>] [--effort <level>] [--body-only] <<'BODY'
task
BODY
relay is in PATH. The caller is always Claude (this is a Claude-only skill); the peer defaults to Codex. Pass --to deepseek, --to mimo, --to grok-build, or --to grok-composer to route elsewhere.
If a bare relay ever returns "command not found" (a sandboxed/non-zsh/reset-env shell that didn't inherit the PATH entry), re-run the identical command with the absolute install path — ~/.claude/skills/relay/scripts/relay call …. That is the whole recovery; do not reconstruct the call by hand.
All Codex, DeepSeek, MiMo, and Grok interactions go through relay call. Do not invoke codex exec, the grok CLI, or the ds/mm aliases directly, do not spawn agents to run the codex, grok, or claude CLI for these purposes, and do not pass model flags (-m, --model) — the model and invocation method come from the peer registry (peers.json), not the call.
Peer selection
| Peer | When to pick | How to invoke |
|---|---|---|
| Codex (default) | Code review, security review, refactoring, agentic coding. GPT-5.5 lineage. Two effort tiers (medium/xhigh). |
relay call --name ... (no --to needed) |
| DeepSeek | Independent model family for true cross-vendor diversity, frontier reasoning, multi-step analysis. Open-weight V4-Pro (1.6T MoE). Always runs at max (DeepThink). Text-only (no image input via relay). |
relay call --to deepseek --name ... |
| MiMo | Another independent open-weight lineage (Xiaomi MiMo-V2.5-Pro, 1.02T MoE / 42B active, 1M context). Use for a third cross-vendor perspective. No effort knob. Text-only (no image input via relay). | relay call --to mimo --name ... |
| Grok Build | An independent xAI lineage (grok-build), xAI's agentic coding model. Runs via grok's own CLI in headless mode (not Anthropic-compatible). Effort medium/high (default medium). |
relay call --to grok-build --name ... |
| Grok Composer | xAI's fast model (grok-composer-2.5-fast) — same lineage as Grok Build, lighter/cheaper. Use as a faster xAI option (not a distinct cross-vendor perspective). No effort knob. |
relay call --to grok-composer --name ... |
Pick Codex by default — it's the strongest general-purpose coding agent and integrates cleanly with the relay protocol. Pick DeepSeek, MiMo, or Grok Build for a perspective from a model trained outside both the Anthropic and OpenAI lineages, or when running /prism Parallax (Grok Composer is a faster xAI variant, not a distinct lineage from Grok Build). Of these, only Grok Build has an effort knob (medium/high); DeepSeek, MiMo, and Grok Composer ignore --effort, so omit it for them. DeepSeek requires DEEPSEEK_API_KEY and MiMo requires MIMO_API_KEY in the environment; Grok uses its own cached login (no key var).
Peer registry
Every model-family fact — transport (codex CLI vs a generic claude-env Anthropic-compatible envelope), endpoint, key variable, model id, effort knob, per-peer extras, and launcher template style — lives once in peers.json next to the script. relay and prism-launch both read it, so adding a peer that reuses an existing transport is one stanza there, not edits across the script, the prism launcher, and the docs. The claude-env peers share one code path that differs only by registry data; Codex and Grok each have their own transport. Two deliberate exceptions stay in code, not data: a brand-new transport needs its own script branch (this is how grok was added — its own headless-CLI invocation), and a new claude-env peer should also get its endpoint added to the inbound Claude-only refusal at the top of the script (a one-line safety guard that must run before the registry is loaded). Grok needs no refusal entry — it sets no ANTHROPIC_BASE_URL, and the transport-agnostic RELAY_PEER guard already blocks a dispatched grok peer from recursing. The interactive ds/mm launchers in shell/.functions are a separate consumer and still carry their own copy — keep them in sync.
Common Mistakes
- Premature failure diagnosis: If a relay call was launched with
run_in_background: true, do not inspect.relayfiles or enter the failure flow until the background task's completion notification arrives. No notification means the peer is still running. - Wrapping relay in a subagent: Do not spawn an Agent that then calls
relayinside. When the subagent completes, the platform kills its child processes — including the still-running peer CLI. Callrelaydirectly from the main conversation withrun_in_background: trueinstead. - Empty heredoc body: The
<<'BODY'...BODYblock must contain text — an empty body causes an immediate error. - Missing
--name: Every call requires--name— omitting it is a script error, not a peer failure.
Example
relay call --name auth-review --effort medium <<'BODY'
Review src/auth.py for security issues. Run pytest to verify.
BODY
Effort Levels
--effort applies to Codex and Grok Build. Codex accepts medium/xhigh; Grok Build accepts medium/high (default medium). DeepSeek always runs at max (DeepThink), and MiMo and Grok Composer have no effort knob — the flag is silently ignored or omitted on those calls.
| Level | When to use |
|---|---|
medium |
Default for Codex and Grok Build. Balanced starting point for code review, tests, bug fixes, and most refactoring. |
xhigh |
Codex only. Hard architecture work, deep security review, or eval-bound tasks worth the extra latency. |
high |
Grok Build only — its deeper reasoning tier. Use for hard analysis where the extra latency is worth it. |
Before raising effort, improve the prompt first — add outcome-first success criteria, stop rules, verification steps, and completeness criteria.
Prompting Codex
Before composing the prompt body, read the prompt-engineer references — ~/.claude/skills/prompt-engineer/references/gpt.md for cross-cutting GPT-5.5 prompt patterns and ~/.claude/skills/prompt-engineer/references/codex.md for Codex coding agent patterns. If those symlinks are unavailable, use the repo copies at agents/extensions/skills/prompt-engineer/references/. This is not optional — the guides contain model-specific patterns that materially affect output quality.
Lead with the outcome, not the procedure. GPT-5.5 responds best to outcome-first prompts — state the goal, success criteria, and stop rules, then let Codex pick the path. Use XML scaffolding only when a specific failure mode needs it:
<output_contract>— when format precision matters<completeness_contract>— when the task has discrete items that must all be covered<verification_loop>— when post-change validation is required
Example:
relay call --name pool-refactor --effort medium <<'BODY'
Add connection timeouts and stale-connection recovery to src/db/pool.py.
Success criteria:
- ConnectionPool accepts a timeout_seconds parameter at construction
- stale connections are auto-reconnected on use
- a reclaim_stale() method exists for explicit cleanup
- existing callers keep working without changes
<verification_loop>
Run pytest tests/test_pool.py — all tests must pass. No new lint errors.
</verification_loop>
<output_contract>
Summary of changes, one per line, with file path and description.
</output_contract>
BODY
Prompting DeepSeek, MiMo, and Grok
These are all independent (non-Anthropic/OpenAI) models that respond well to XML-scaffolded, structured prompts. Before composing a DeepSeek prompt body, read ~/.claude/skills/relay/references/deepseek.md (symlinked to the prompt-engineer reference) — it covers the CO-STAR framework, XML scaffolding conventions, thinking-mode quirks, and DeepThink failure modes. This is not optional — the guide contains model-specific patterns that materially affect output quality. MiMo-V2.5-Pro and Grok (both models) have no dedicated reference; treat them like DeepSeek.
Default to XML scaffolding (DeepSeek V4 was trained heavily on XML-tagged data; MiMo and Grok behave similarly). The CO-STAR sections — <context>, <objective>, <style>, <tone>, <audience>, <response_format> — give the cleanest results for non-trivial tasks. Use positive framing ("include X") over negative constraints ("don't omit X"). Aside from Grok Build's --effort flag, their thinking is always on — so keep system-style meta-instructions out of the prompt body; they degrade under long system prompts. Lead with the outcome and success criteria, then let the model pick the path.
Example (swap --to deepseek for --to mimo, --to grok-build, or --to grok-composer to route elsewhere — the prompt shape is identical):
relay call --to deepseek --name pool-design <<'BODY'
<context>
We're hardening src/db/pool.py before a SOC 2 audit. Codebase is Python 3.12 +
FastAPI. Tests live in tests/test_pool.py and run under pytest.
</context>
<objective>
Add connection timeouts and stale-connection recovery to src/db/pool.py.
</objective>
<success_criteria>
- ConnectionPool accepts a timeout_seconds parameter at construction
- stale connections are auto-reconnected on use
- a reclaim_stale() method exists for explicit cleanup
- existing callers keep working without changes
- pytest tests/test_pool.py passes; no new lint errors
</success_criteria>
<response_format>
Summary of changes first (one line per change: file path + description),
then the diffs grouped by file. Cap at 400 words excluding diffs.
</response_format>
BODY
Calibration handoff
For judgment tasks — analysis, review, design, research, second opinions — ask the peer to end its answer with a reasons-based calibration block, then act on it when the response returns. Skip it for mechanical or code-changing calls (run-a-command, apply-a-defined-change): there the trust signal is tests, diffs, and the verify: frontmatter, not a self-report. Don't ask for a number — verbalized confidence from these models is poorly calibrated (clusters at round numbers, skews overconfident), so a % or High/Med/Low manufactures false precision the orchestrator can't discount.
Add to the prompt body — inside <output_contract> for Codex, <response_format> for DeepSeek/MiMo/Grok:
End with a ## Calibration block:
- Key assumptions: 1-3 the answer rests on ("none material" only if true)
- Most likely wrong because: the strongest failure mode, missing info, or counterargument
- Would change my conclusion: the specific fact, test, or counterexample that would flip it
- Verify before acting: specific current/high-stakes claims to check ("none" for pure reasoning)
No numeric %, probability, or High/Medium/Low label — this block is for routing and verification, not a calibrated probability.
On return, use it — otherwise it is decoration. Verify the listed claims before passing the answer up; re-query with corrected context if a Key assumption conflicts with what you know; seek a tie-breaker (another peer or /prism) if "Most likely wrong because" attacks the core conclusion. Ignore any self-confidence score a peer volunteers anyway — the assumptions and failure mode are the signal, not a self-graded number.
Output
The script prints the response file content to stdout. The response has YAML frontmatter followed by free-form markdown:
- Frontmatter:
relay,re,from,to,status(done|error),verify(pass|fail|skip) - Body: findings, changes, reasoning — free-form markdown below the frontmatter fence
Use --body-only to strip the frontmatter and get just the markdown body.
Request and response files are saved in .relay/ (auto-gitignored). Peer stderr is logged to a .log sidecar file alongside the request. Never read the .log file — it contains the peer's full stderr output, which is extremely long and token-heavy. Only inspect the .res.md response file.
When a relay call fails
You must diagnose and retry — do not report failure to the user without attempting a fix first.
Background-task guard: If the relay call was launched with run_in_background: true, this diagnosis flow applies only after the background task's completion notification has arrived. Relay calls take significantly longer than subagents — this is normal, not a failure. Until the completion notification arrives, the call is in progress and healthy. Do not read logs or check for the response file.
When a completed relay call reports a missing response file, the peer failed before producing output. Each call generates a new request ID, so retrying does not re-execute previous attempts.
- Check the Bash output. The relay script prints diagnostic information (exit code, error summary) to stdout/stderr. Use this — visible in the Bash tool result — to identify the cause. Do not read the
.logsidecar file — full stderr, token-heavy. - Diagnose. Common causes and fixes:
- Peer binary not found → verify the peer CLI is installed and in PATH
- Empty body / malformed heredoc → verify the heredoc has content and a matching terminator
- Peer exited non-zero but response file exists → not a failure; read the response file
- Fix and retry once. Correct the invocation based on the diagnosis and re-run the relay call.
- If the retry also fails, report the failure to the user with the diagnosed cause from the Bash output.
The first failure is information, not a stop signal.
Async / Parallel
When you have independent subagent work alongside a relay call, never block on relay while subagents wait (or vice versa). Run everything concurrently:
Background the Bash call: Use run_in_background: true on the Bash tool so the relay call runs concurrently with your subagents. The platform sends a completion notification when the background task finishes — do not poll, do not inspect .relay files, and do not enter the failure diagnosis flow before that notification arrives.
Give it a generous timeout. Relay peers are full agents and can run long (Codex xhigh, DeepSeek/MiMo DeepThink, Grok at high routinely take many minutes). A Bash-tool timeout that fires mid-run kills the peer and wastes every token it already spent — favor completion over a tight bound. Set timeout: 3600000 (60 min) on the backgrounded Bash call; relay has no internal per-call cap, so this outer timeout is the only bound.
Rule: Launch relay calls and subagents concurrently. Never serialize independent work.
Never wrap relay in a subagent. If an Agent task calls relay with run_in_background: true, the subagent will complete before the peer (Codex, DeepSeek, MiMo, or Grok) finishes, and the platform will kill the orphaned peer process. Always call relay from the main conversation. If a subagent must call relay (e.g., the skill was invoked before you could prevent it), the Bash call must run in foreground — omit run_in_background so the subagent blocks until the peer replies.
Prism / Parallax
When Relay is used as the Parallax transport inside Prism, the relay call receives the same full question and same context as every local reviewer — only the lens (weighing posture) differs. Do not narrow the prompt for the Parallax agent. Prism dispatches each parallax tier independently — the relay calls run concurrently as separate Bash invocations.
Launch each relay Bash call with run_in_background: true in the same parallel dispatch step as the local reviewer subagents. Do not wrap Relay itself in another subagent layer.
If a Parallax relay call fails (after its background completion notification has arrived), treat it as a recoverable transport problem. Check the relay script's Bash output for the diagnosed cause, fix the invocation, and retry once before declaring that peer unavailable — never read the .log sidecar (full stderr, token-heavy). A failure of one peer (e.g., Codex) does not affect the others.
Utility Commands
relay --help and relay --version print usage and version info.
--to accepts codex (default), deepseek, mimo, grok-build, or grok-composer. There is no relay-to-Claude direction — Claude is the sole caller in this protocol.