choosing-swarm-patterns


Use when you are coordinating multiple AI agents with Agent Relay's workflow engine and need to pick the right orchestration pattern. Covers the 10 core patterns (fan-out, pipeline, hub-spoke, consensus, mesh, handoff, cascade, dag, debate, hierarchical) plus 14 specialized ones, with a decision framework and accurate workflow/YAML examples.

By AgentWorkforce · Updated 6/10/2026


Overview

The Agent Relay workflow engine (@relayflows/core) supports 24 swarm patterns via a single swarm.pattern field. Patterns are configured declaratively in YAML or programmatically via the workflow() fluent builder; there are no standalone fanOut(...) / hubAndSpoke(...) helpers. Pick the simplest pattern that solves the problem, and move to a more complex one only when the simpler pattern proves insufficient.

Two ways to run a pattern

1. YAML (portable):

import { runWorkflow } from '@relayflows/core';

const run = await runWorkflow('workflows/feature-dev.yaml', {
  vars: { task: 'Add OAuth login' },
});

2. Fluent builder (programmatic):

import { workflow } from '@relayflows/core';

const run = await workflow('feature-dev')
  .pattern('hub-spoke')
  .channel('swarm-feature-dev')
  .agent('lead', { cli: 'claude', role: 'lead' })
  .agent('developer', { cli: 'codex', role: 'worker', interactive: false })
  .step('plan', { agent: 'lead', task: 'Plan {{task}}' })
  .step('implement', { agent: 'developer', task: 'Implement: {{steps.plan.output}}', dependsOn: ['plan'] })
  .run();

Both paths hit the same WorkflowRunner.
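
The {{task}} and {{steps.plan.output}} placeholders in the step definitions above can be pictured with a small stand-alone sketch. This is not the engine's implementation: `renderTask`, the `Lookup` type, and the choice to leave unknown placeholders untouched are all assumptions made for illustration.

```ts
// Illustrative sketch only; the real engine's interpolation rules may differ.
type Lookup = { readonly [key: string]: string | undefined };

// Replaces {{name}} with a run variable and {{steps.<step>.output}} with the
// output of an earlier step. Unknown placeholders are left as-is (assumption).
function renderTask(template: string, vars: Lookup, stepOutputs: Lookup): string {
  return template.replace(/\{\{\s*([^{}]+?)\s*\}\}/g, (placeholder: string, key: string) => {
    const stepName = /^steps\.([\w-]+)\.output$/.exec(key)?.[1];
    if (stepName !== undefined) {
      return stepOutputs[stepName] ?? placeholder;
    }
    return vars[key] ?? placeholder;
  });
}

const planPrompt = renderTask('Plan {{task}}', { task: 'Add OAuth login' }, {});
const implementPrompt = renderTask(
  'Implement: {{steps.plan.output}}',
  {},
  { plan: '1. Add /login route' },
);
console.log(planPrompt); // Plan Add OAuth login
console.log(implementPrompt); // Implement: 1. Add /login route
```

The point of the sketch is the data flow: a downstream step's prompt is only as good as the upstream output it embeds, which is why the later examples pair dependsOn with a steps.<name>.output reference.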

Quick Decision Framework

Is the task independent per agent?
  YES → fan-out (parallel workers, hub collects)

Does each step need the previous step's output?
  YES → Is it strictly linear?
    YES → pipeline
    NO  → dag (parallel where possible, `dependsOn` edges)

Does a coordinator need to stay alive and adapt?
  YES → hub-spoke (single-level hub + workers)
        hierarchical (structurally identical in current impl; use for naming/intent)

Is the task about making a decision?
  YES → Do agents need to argue opposing sides?
    YES → debate (adversarial, full mesh)
    NO  → consensus (cooperative, full mesh + coordination.consensusStrategy)

Does the right specialist emerge during processing?
  YES → handoff (sequential chain, one active at a time)

Do all agents need to freely collaborate?
  YES → mesh (full peer-to-peer edges)

Is cost the primary concern?
  YES → cascade (chain of increasingly capable agents; each step's prompt
        decides whether to pass through or redo the prior output)
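
The questions above can be encoded as a small function, which is handy when a script has to choose a pattern before building a workflow. This is a sketch, not part of @relayflows/core: the `TaskTraits` field names and `suggestPattern` are hypothetical, and checking the questions top to bottom is an assumption about how the framework is meant to be read.

```ts
// Hypothetical trait flags, one per question in the decision framework.
interface TaskTraits {
  independentPerAgent?: boolean; // each agent's work stands alone
  needsPreviousOutput?: boolean; // steps consume earlier outputs
  strictlyLinear?: boolean; // only meaningful with needsPreviousOutput
  coordinatorStaysAlive?: boolean; // a lead must adapt while work runs
  isDecision?: boolean; // the goal is a verdict
  opposingSides?: boolean; // only meaningful with isDecision
  specialistEmerges?: boolean; // the right agent is discovered mid-task
  freeCollaboration?: boolean; // every agent talks to every other
  costIsPrimary?: boolean; // cheapest adequate answer wins
}

type SuggestedPattern =
  | 'fan-out' | 'pipeline' | 'dag' | 'hub-spoke' | 'debate'
  | 'consensus' | 'handoff' | 'mesh' | 'cascade';

// Walks the questions in document order and returns the first match.
// Returns undefined when no question applies (assumption: caller decides).
function suggestPattern(t: TaskTraits): SuggestedPattern | undefined {
  if (t.independentPerAgent) return 'fan-out';
  if (t.needsPreviousOutput) return t.strictlyLinear ? 'pipeline' : 'dag';
  if (t.coordinatorStaysAlive) return 'hub-spoke';
  if (t.isDecision) return t.opposingSides ? 'debate' : 'consensus';
  if (t.specialistEmerges) return 'handoff';
  if (t.freeCollaboration) return 'mesh';
  if (t.costIsPrimary) return 'cascade';
  return undefined;
}

console.log(suggestPattern({ needsPreviousOutput: true, strictlyLinear: false })); // dag
```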

Pattern Reference (Core 10)

| # | Pattern | Topology (actual edges) | Best For |
|---|---------|-------------------------|----------|
| 1 | fan-out | Hub broadcasts to N workers; workers reply to hub only | Independent subtasks (reviews, research, tests) |
| 2 | pipeline | Linear chain (agent_i → agent_{i+1}) | Ordered stages (design → implement → test) |
| 3 | hub-spoke | Hub ↔ spokes (bidirectional); no spoke-to-spoke | Dynamic coordination, lead reviews/adjusts |
| 4 | consensus | Full mesh; coordination.consensusStrategy is a declarative marker (the runner does not tally votes) | Architecture decisions, approval gates |
| 5 | mesh | Full mesh (every agent ↔ every other) | Brainstorming, collaborative debugging |
| 6 | handoff | Chain; passes control forward | Triage, specialist routing |
| 7 | cascade | Chain of dependsOn steps; all run on success, downstream skipped on upstream failure (no built-in "fall through") | Cost optimization: cheap first, each step's prompt passes through or redoes |
| 8 | dag | Edges from step dependsOn | Mixed dependencies, parallel where possible |
| 9 | debate | Full mesh (same topology as mesh; roles drive behavior) | Rigorous adversarial examination |
| 10 | hierarchical | Hub + subordinates (single-level in current impl) | Large teams; semantic distinction from hub-spoke |

Heads up: hierarchical resolves to the same edge structure as hub-spoke in coordinator.ts:313-319. Multi-level tree topology is not currently implemented, so use the pattern name to signal intent but expect the same runtime graph.
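
To make the topology column concrete, the sketch below counts directed edges for the three edge shapes that the core patterns reduce to: a star (fan-out, hub-spoke, hierarchical), a chain (pipeline, handoff, cascade), and a full mesh (mesh, debate, consensus). The names `TopologyFamily` and `edgesFor` are hypothetical, and treating the first agent as the hub is an assumption; the real coordinator resolves edges from roles.

```ts
type TopologyFamily = 'star' | 'chain' | 'mesh';
type DirectedEdge = [string, string];

// Builds the directed edge list for a topology family.
// Assumption: in a star, the first agent in the list is the hub.
function edgesFor(family: TopologyFamily, agents: readonly string[]): DirectedEdge[] {
  const edges: DirectedEdge[] = [];
  agents.forEach((from, i) => {
    agents.forEach((to, j) => {
      const linked =
        family === 'mesh'
          ? i !== j // every agent to every other agent
          : family === 'chain'
            ? j === i + 1 // each agent to its successor only
            : i !== j && (i === 0 || j === 0); // hub to spoke and spoke to hub
      if (linked) edges.push([from, to]);
    });
  });
  return edges;
}

const team = ['lead', 'a', 'b', 'c', 'd', 'e'];
console.log(edgesFor('star', team).length); // 10
console.log(edgesFor('chain', team).length); // 5
console.log(edgesFor('mesh', team).length); // 30
```

With six agents the mesh already has three times as many edges as the star, and the gap grows quadratically: n(n-1) edges against 2(n-1).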

Additional Patterns (role-driven)

These 14 additional patterns exist in SwarmPattern (types.ts:114-139). The coordinator has role-based auto-selection heuristics (coordinator.ts:51-165), but they only fire when swarm.pattern is omitted — YAML validation requires it (runner.ts:2105-2117), so auto-selection is effectively a programmatic-API feature. In YAML, set swarm.pattern explicitly.

Topology is still resolved per pattern once selected; the "Roles the topology keys off" column lists the roles the coordinator looks for when shaping edges (per coordinator.ts:250-450):

| Pattern | Roles the topology keys off | Topology |
|---------|-----------------------------|----------|
| map-reduce | mapper + reducer | coordinator → mappers → reducers → coordinator |
| scatter-gather | | hub → workers → hub |
| supervisor | supervisor | supervisor ↔ workers |
| reflection | critic or reviewer (auto-select uses critic only) | producers → critic → producers (loop) |
| red-team | attacker/red-team + defender/blue-team | adversarial mesh with optional judges |
| verifier | verifier | producers → verifiers → back to producers |
| auction | auctioneer | auctioneer → bidders → auctioneer |
| escalation | `tier-*` | tiered chain, escalate up / report down |
| saga | saga-orchestrator, compensate-handler | orchestrator ↔ participants |
| circuit-breaker | primary + fallback/backup | try primary, fallback on failure |
| blackboard | blackboard / shared-workspace | shared state hub |
| swarm | hive-mind / swarm-agent | stigmergy-style |
| competitive | (none; declared explicitly) | independent parallel implementations + judge |
| review-loop | `implement*` + 2+ `reviewer*` | implementer ↔ reviewers |
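
As a rough picture of what role-based selection looks like, the sketch below maps a list of agent roles to a pattern using a subset of the table above. It is an illustration only: `guessPatternFromRoles` is a hypothetical helper, the order of the checks is an assumption, and the real heuristics in the coordinator may weigh roles differently.

```ts
// Hypothetical helper: picks a pattern from agent roles, using a subset of the
// role table. The check order is an assumption, not the coordinator's.
function guessPatternFromRoles(roles: readonly string[]): string | undefined {
  const hasAny = (...names: string[]): boolean =>
    names.some((name) => roles.indexOf(name) !== -1);
  const countPrefixed = (prefix: string): number =>
    roles.filter((role) => role.indexOf(prefix) === 0).length;

  if (hasAny('mapper') && hasAny('reducer')) return 'map-reduce';
  if (hasAny('attacker', 'red-team') && hasAny('defender', 'blue-team')) return 'red-team';
  if (countPrefixed('implement') >= 1 && countPrefixed('reviewer') >= 2) return 'review-loop';
  if (hasAny('supervisor')) return 'supervisor';
  if (hasAny('critic')) return 'reflection';
  if (hasAny('verifier')) return 'verifier';
  if (hasAny('auctioneer')) return 'auction';
  if (countPrefixed('tier-') >= 1) return 'escalation';
  return undefined; // no role-driven match: set swarm.pattern explicitly
}

console.log(guessPatternFromRoles(['implementer', 'reviewer-a', 'reviewer-b'])); // review-loop
```

Because YAML validation requires swarm.pattern, a helper like this is only useful on the programmatic path, or as a lint that warns when declared roles and the declared pattern disagree.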

Structured Squad Review Loop

  1. Split the work into bounded implementation squads. Each squad owns a non-overlapping file or subsystem scope.
  2. Give each squad an implementer plus a shadow/review partner. The shadow follows the implementer in real time, checks alignment with the spec, and posts concise feedback before the work drifts.
  3. Require the implementer to self-reflect before external review: compare the final diff against the spec, AGENTS.md / CLAUDE.md, recent local conventions, tests, and declared non-goals.
  4. Run an independent self-review/fresh-eyes agent that reads the actual files and recent repo context, not just the chat transcript.
  5. Send that review back to the implementer for one repair round.
  6. After squads converge, run a final two-agent review team, usually one Claude reviewer and one Codex reviewer, independently. They compare notes, merge findings, and produce one final verdict.
  7. Spawn fresh fix agents for final-review findings. Those fix agents self-reflect, then the final reviewers re-check the post-fix state until the spec is fully satisfied or a blocker is documented.

Which pattern to use for the loop:

  • Use supervisor or hub-spoke when a lead needs to coordinate live squads.
  • Use review-loop when the main risk is code quality and feedback iteration.
  • Use reflection when critic feedback should loop directly back to producers.
  • Use verifier when completion evidence matters more than design debate.
  • Use competitive only when independent alternative implementations are useful; otherwise split by ownership scope.

Pattern Details

1. fan-out — Parallel Workers

await workflow('review')
  .pattern('fan-out')
  .agent('lead', { cli: 'claude', role: 'lead' })
  .agent('auth-rev', { cli: 'claude', role: 'worker', interactive: false })
  .agent('db-rev', { cli: 'claude', role: 'worker', interactive: false })
  .step('review-auth', { agent: 'auth-rev', task: 'Review auth.ts' })
  .step('review-db', { agent: 'db-rev', task: 'Review db.ts' })
  .run();

2. pipeline — Sequential Stages

swarm: { pattern: pipeline }
agents:
  - { name: designer, cli: claude }
  - { name: implementer, cli: codex, interactive: false }
  - { name: tester, cli: codex, interactive: false }
workflows:
  - name: build
    steps:
      - name: design
        agent: designer
        task: 'Design the API schema'
        verification: { type: output_contains, value: DONE }
      - name: implement
        agent: implementer
        dependsOn: [design]
        task: 'Implement: {{steps.design.output}}'
      - name: test
        agent: tester
        dependsOn: [implement]
        task: 'Write integration tests'

3. hub-spoke — Persistent Coordinator

await workflow('api-build')
  .pattern('hub-spoke')
  .channel('swarm-api')
  .agent('lead', { cli: 'claude', role: 'lead' })
  .agent('db-worker', { cli: 'claude', role: 'worker' }) // interactive by default — hub DMs it
  .agent('api-worker', { cli: 'claude', role: 'worker' }) // interactive by default — hub DMs it
  .step('models', { agent: 'db-worker', task: 'Build database models' })
  .step('routes', { agent: 'api-worker', task: 'Build route handlers', dependsOn: ['models'] })
  .step('review', { agent: 'lead', task: 'Review everything', dependsOn: ['routes'] })
  .run();

4. consensus — Cooperative Voting

swarm: { pattern: consensus }
agents:
  - { name: perf, cli: claude, role: reviewer }
  - { name: dx, cli: claude, role: reviewer }
  - { name: sec, cli: claude, role: reviewer }
coordination:
  consensusStrategy: majority # declarative marker: majority | unanimous | quorum
  votingThreshold: 0.66
workflows:
  - name: decide
    steps:
      - { name: evaluate-perf, agent: perf, task: 'Evaluate perf of Fastify migration' }
      - { name: evaluate-dx, agent: dx, task: 'Evaluate DX of Fastify migration' }
      - { name: evaluate-sec, agent: sec, task: 'Evaluate security of Fastify migration' }
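
Since consensusStrategy is only a declarative marker, the vote count has to happen somewhere you control, for example in a lead step or in code that reads the three step outputs. The sketch below shows one way to do that. The "VOTE: APPROVE" / "VOTE: REJECT" token convention, `parseVote`, and `tallyVotes` are assumptions for illustration; nothing in the engine defines them.

```ts
type Vote = 'APPROVE' | 'REJECT';

// Hypothetical convention: each reviewer's output contains "VOTE: APPROVE"
// or "VOTE: REJECT". Outputs without a vote count as non-approvals.
function parseVote(output: string): Vote | undefined {
  const token = /VOTE:\s*(APPROVE|REJECT)\b/i.exec(output)?.[1]?.toUpperCase();
  if (token === 'APPROVE') return 'APPROVE';
  if (token === 'REJECT') return 'REJECT';
  return undefined;
}

// Approves when the share of APPROVE votes meets the threshold,
// mirroring votingThreshold: 0.66 in the YAML above.
function tallyVotes(outputs: readonly string[], votingThreshold: number) {
  const approvals = outputs.filter((o) => parseVote(o) === 'APPROVE').length;
  const share = outputs.length === 0 ? 0 : approvals / outputs.length;
  return { approvals, total: outputs.length, approved: share >= votingThreshold };
}

const verdict = tallyVotes(
  [
    'Latency improves under load. VOTE: APPROVE',
    'Plugin API is cleaner. VOTE: APPROVE',
    'Security audit still pending. VOTE: REJECT',
  ],
  0.66,
);
console.log(verdict); // { approvals: 2, total: 3, approved: true }
```

Counting silent reviewers as non-approvals is a deliberately conservative choice: a reviewer that crashed or forgot the token cannot push a decision through.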

5. mesh — Peer Collaboration

await workflow('debug-auth')
  .pattern('mesh')
  .channel('swarm-debug')
  .agent('logs', { cli: 'claude' })
  .agent('code', { cli: 'claude' })
  .agent('repro', { cli: 'claude' })
  .step('logs', { agent: 'logs', task: 'Check server logs' })
  .step('code', { agent: 'code', task: 'Review auth code' })
  .step('repro', { agent: 'repro', task: 'Write repro test' })
  .run();

6. handoff — Dynamic Routing

swarm: { pattern: handoff }
agents:
  - { name: triage, cli: claude }
  - { name: billing, cli: claude }
  - { name: tech, cli: claude }
workflows:
  - name: support
    steps:
      - { name: triage, agent: triage, task: 'Triage: {{request}}' }
      - { name: billing, agent: billing, dependsOn: [triage], task: 'Handle billing' }
      - { name: tech, agent: tech, dependsOn: [triage], task: 'Handle tech issues' }
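
In the YAML above both billing and tech run after triage, because the runner skips steps only on upstream failure. One workable convention is for triage to emit a routing token and for each branch to act only when the token names it. The sketch below shows the parsing side; the "ROUTE: <branch>" line format and the `routedTo` and `shouldHandle` helpers are hypothetical.

```ts
// Hypothetical convention: triage prints one line of the form "ROUTE: billing".
function routedTo(triageOutput: string): string | undefined {
  const match = /^ROUTE:\s*([\w-]+)\s*$/m.exec(triageOutput);
  return match?.[1]?.toLowerCase();
}

// A branch acts only when the routing token names it; otherwise it should
// no-op (for example by replying "SKIPPED").
function shouldHandle(branch: string, triageOutput: string): boolean {
  return routedTo(triageOutput) === branch.toLowerCase();
}

const triageOutput = 'Customer was charged twice for one order.\nROUTE: billing';
console.log(shouldHandle('billing', triageOutput)); // true
console.log(shouldHandle('tech', triageOutput)); // false
```

The same rule can live in the downstream task text instead of code: pass {{steps.triage.output}} into each branch prompt and tell the agent to reply with a fixed no-op string when the token names another branch.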

7. cascade — Cost-Aware Fallthrough

await workflow('answer')
  .pattern('cascade')
  .agent('haiku', { cli: 'claude', model: 'claude-haiku-4-5-20251001' })
  .agent('sonnet', { cli: 'claude', model: 'claude-sonnet-4-6' })
  .agent('opus', { cli: 'claude', model: 'claude-opus-4-7' })
  .step('try-haiku', { agent: 'haiku', task: '{{question}}' })
  .step('try-sonnet', {
    agent: 'sonnet',
    task: 'If this is a complete answer, echo it verbatim. Otherwise answer anew:\n{{steps.try-haiku.output}}',
    dependsOn: ['try-haiku'],
  })
  .step('try-opus', {
    agent: 'opus',
    task: 'Final-tier answer, using prior attempts for context:\n{{steps.try-sonnet.output}}',
    dependsOn: ['try-sonnet'],
  })
  .run();
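
The runtime behavior of this chain is easy to misread, so here is a small simulation of the rule stated in the pattern table: every step runs when its upstream succeeds, and a failure skips everything downstream. `simulateChain` and its types are hypothetical, and the sketch assumes steps are listed in dependency order.

```ts
interface ChainStep {
  id: string;
  dependsOn: string[];
}
type StepStatus = 'completed' | 'failed' | 'skipped';

// Simulates "downstream skipped on upstream failure".
// Assumes steps are listed in dependency order; a dependency that has not
// completed (failed, skipped, or unknown) causes the step to be skipped.
function simulateChain(
  steps: readonly ChainStep[],
  failing: readonly string[],
): Record<string, StepStatus> {
  const statuses: Record<string, StepStatus> = {};
  for (const step of steps) {
    const blocked = step.dependsOn.some((dep) => statuses[dep] !== 'completed');
    if (blocked) {
      statuses[step.id] = 'skipped';
    } else {
      statuses[step.id] = failing.indexOf(step.id) !== -1 ? 'failed' : 'completed';
    }
  }
  return statuses;
}

const cascadeSteps: ChainStep[] = [
  { id: 'try-haiku', dependsOn: [] },
  { id: 'try-sonnet', dependsOn: ['try-haiku'] },
  { id: 'try-opus', dependsOn: ['try-sonnet'] },
];

// A good haiku answer does NOT stop the chain: all three tiers still run.
console.log(simulateChain(cascadeSteps, []));
// A sonnet failure skips opus.
console.log(simulateChain(cascadeSteps, ['try-sonnet']));
```

This is why the prompts above ask each tier to echo a complete answer verbatim: the savings come from the higher tier doing less work, not from the step being skipped.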

8. dag — Directed Acyclic Graph

await workflow('fullstack')
  .pattern('dag')
  .maxConcurrency(3)
  .agent('dev', { cli: 'codex', role: 'worker' })
  .step('scaffold', { agent: 'dev', task: 'Create project scaffold' })
  .step('frontend', { agent: 'dev', task: 'Build React UI', dependsOn: ['scaffold'] })
  .step('backend', { agent: 'dev', task: 'Build API', dependsOn: ['scaffold'] })
  .step('integrate', { agent: 'dev', task: 'Wire together', dependsOn: ['frontend', 'backend'] })
  .run();
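
To see which steps of a dag can run side by side, it helps to group them into waves: a step joins a wave once all of its dependsOn entries are done. The sketch below does that for the fullstack example. `executionWaves` is a hypothetical helper, not the runner's scheduler, and it ignores maxConcurrency, which would further cap how many steps in a wave run at once.

```ts
interface DagStep {
  id: string;
  dependsOn?: string[];
}

// Groups steps into waves; every step in a wave has all dependencies
// satisfied by earlier waves. Throws on cycles or unknown dependencies.
function executionWaves(steps: readonly DagStep[]): string[][] {
  const done: Record<string, boolean> = {};
  const waves: string[][] = [];
  let pending = steps.slice();
  while (pending.length > 0) {
    const ready = pending.filter((step) =>
      (step.dependsOn ?? []).every((dep) => done[dep] === true),
    );
    if (ready.length === 0) {
      const stuck = pending.map((step) => step.id).join(', ');
      throw new Error(`Cycle or unknown dependency among: ${stuck}`);
    }
    for (const step of ready) done[step.id] = true;
    waves.push(ready.map((step) => step.id));
    pending = pending.filter((step) => done[step.id] !== true);
  }
  return waves;
}

const fullstack: DagStep[] = [
  { id: 'scaffold' },
  { id: 'frontend', dependsOn: ['scaffold'] },
  { id: 'backend', dependsOn: ['scaffold'] },
  { id: 'integrate', dependsOn: ['frontend', 'backend'] },
];
console.log(executionWaves(fullstack));
// [ [ 'scaffold' ], [ 'frontend', 'backend' ], [ 'integrate' ] ]
```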

9. debate — Adversarial Refinement

swarm: { pattern: debate }
agents:
  - { name: pro, cli: claude, role: debater, task: 'Argue FOR monorepo' }
  - { name: con, cli: claude, role: debater, task: 'Argue FOR polyrepo' }
  - { name: judge, cli: claude, role: judge, task: 'Decide after 3 rounds' }
coordination:
  barriers:
    - { name: debate-done, waitFor: [pro-round-3, con-round-3] }

10. hierarchical — Multi-Level (structurally hub-spoke today)

await workflow('large-team')
  .pattern('hierarchical')
  .agent('lead', { cli: 'claude', role: 'lead' })
  .agent('fe-coord', { cli: 'claude', role: 'coordinator' })
  .agent('be-coord', { cli: 'claude', role: 'coordinator' })
  .agent('fe-dev', { cli: 'codex', role: 'worker', interactive: false })
  .agent('be-dev', { cli: 'codex', role: 'worker', interactive: false })
  .step('plan', { agent: 'lead', task: 'Coordinate full-stack app' })
  .step('fe-plan', { agent: 'fe-coord', task: 'Manage frontend', dependsOn: ['plan'] })
  .step('be-plan', { agent: 'be-coord', task: 'Manage backend', dependsOn: ['plan'] })
  .step('fe-impl', { agent: 'fe-dev', task: 'Build components', dependsOn: ['fe-plan'] })
  .step('be-impl', { agent: 'be-dev', task: 'Build API', dependsOn: ['be-plan'] })
  .run();

Verification & Completion Signals

An agent step can complete in several ways in the @relayflows/core runner: verification, evidence, owner, or process-exit. Verification is configured per step:

verification:
  type: output_contains # or: exit_code | file_exists | custom
  value: DONE # or: PLAN_COMPLETE, IMPLEMENTATION_COMPLETE, REVIEW_COMPLETE

Agent Relay MCP - Correct Tool Names

The old category-expanded names are wrong: current Agent Relay MCP tools use flat names. In a client that decorates MCP tools, the prefix comes from the configured server key; workflow prompts commonly show mcp__relaycast__send_dm, while an agent-relay server key may expose mcp__agent_relay__send_dm.

| Purpose | Canonical tool | Common workflow-prefixed form |
|---------|----------------|-------------------------------|
| Send DM to another agent | send_dm | mcp__relaycast__send_dm |
| Check inbox | check_inbox | mcp__relaycast__check_inbox |
| List agents | list_agents | mcp__relaycast__list_agents |
| Post to a channel | post_message | mcp__relaycast__post_message |
| Reply in a thread | reply_to_thread | mcp__relaycast__reply_to_thread |
| Spawn sub-agent | add_agent | mcp__relaycast__add_agent |
| Remove sub-agent | remove_agent | mcp__relaycast__remove_agent |

interactive: false agents run as non-interactive subprocesses with no relay connection. They must not call Relay MCP tools.
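
The relationship between the canonical names and the prefixed forms can be sketched as a simple string rule. `decoratedToolName` and `isKnownRelayTool` are hypothetical helpers, and replacing non-alphanumeric characters in the server key with underscores is an assumption inferred from the agent-relay → mcp__agent_relay__ example; clients may normalize keys differently.

```ts
// Canonical flat tool names, taken from the table above.
const RELAY_TOOLS = [
  'send_dm',
  'check_inbox',
  'list_agents',
  'post_message',
  'reply_to_thread',
  'add_agent',
  'remove_agent',
] as const;
type RelayTool = (typeof RELAY_TOOLS)[number];

// Builds the decorated name a client might expose for a given server key.
// Assumption: characters outside [A-Za-z0-9_] in the key become underscores.
function decoratedToolName(serverKey: string, tool: RelayTool): string {
  const normalizedKey = serverKey.replace(/[^A-Za-z0-9_]/g, '_');
  return `mcp__${normalizedKey}__${tool}`;
}

// Guards against stale or invented names such as "send".
function isKnownRelayTool(name: string): name is RelayTool {
  return (RELAY_TOOLS as readonly string[]).indexOf(name) !== -1;
}

console.log(decoratedToolName('relaycast', 'send_dm')); // mcp__relaycast__send_dm
console.log(decoratedToolName('agent-relay', 'send_dm')); // mcp__agent_relay__send_dm
console.log(isKnownRelayTool('send')); // false
```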

Reflection (Trajectories)

Reflection is not a reflectionThreshold callback. It's configured via the trajectories: block:

trajectories:
  enabled: true
  reflectOnBarriers: true # config flag exists but runner does NOT currently invoke this path
  reflectOnConverge: true # fires at parallel convergence points (runner.ts:2762-2779)
  autoDecisions: true # record retry/skip/fail decisions

Common Mistakes

| Mistake | Why It Fails | Fix |
|---------|--------------|-----|
| Using mesh/debate for everything | Full-mesh blows up message volume past ~5 agents | Use hub-spoke or dag for most tasks |
| Pipeline for independent work | Sequential bottleneck | Use fan-out or dag |
| Hub-spoke for 2 agents | Hub is unnecessary overhead | Use pipeline or fan-out |
| Expecting consensusStrategy to tally votes | Runner has no vote-tally logic; field only affects coordinator auto-selection | Aggregate votes in a judge/lead step that reads `{{steps.*.output}}` |
| Handoff with "routing = skip other branches" | Skipping only fires on upstream failure, not routing decisions | Emit a routing token in triage output; downstream prompts self-no-op if token doesn't match |
| Cascade expecting skip-on-success | Runner has no cascade skip logic; failed upstream skips downstream | Chain downstream prompts to pass-through or redo based on `{{steps.previous.output}}` |
| Relying on reflectOnBarriers | Config flag exists but runner never calls it | Use reflectOnConverge for convergence reflection; use reflection pattern for critic loops |
| interactive: false agent calling MCP | Non-interactive subprocess has no relay | Use interactive: true (default) or emit output on stdout |
| Relying on multi-level hierarchical | Topology is single-level hub in current impl | Use pattern for naming; model levels via dependsOn graph |
| Writing mcp__relaycast__send(...) | Wrong tool name | Use post_message / mcp__relaycast__post_message or send_dm / mcp__relaycast__send_dm |

Resume & Re-run

```ts
import { runWorkflow } from '@relayflows/core';

// Resume a failed run:
await runWorkflow('feature-dev.yaml', { resume: '<runId>' });

// Skip ahead, re-using cached outputs from an earlier run:
await runWorkflow('feature-dev.yaml', {
  startFrom: 'review',
  previousRunId: '<runId>',
});
```

Complete YAML Example

```yaml

version: '1.0'
name: feature-dev
description: 'Blueprint-style feature development with quality gates.'
swarm:
  pattern: hub-spoke
  maxConcurrency: 2
  timeoutMs: 3600000
  channel: swarm-feature-dev
  idleNudge: { nudgeAfterMs: 120000, escalateAfterMs: 120000, maxNudges: 1 }
agents:
  - { name: lead, cli: claude, role: lead, permissions: { access: full } }
  - { name: planner, cli: codex, role: planner, interactive: false, permissions: { access: readonly } }
  - { name: developer, cli: codex, role: worker, interactive: false, permissions: { access: readwrite } }
  - { name: reviewer, cli: claude, role: reviewer, permissions: { access: readonly } }
workflows:
  - name: feature-delivery
    onError: retry
    preflight:
      - { command: 'git status --porcelain', failIf: non-empty, description: 'Clean worktree' }
    steps:
      - name: plan
        agent: planner
        task: 'Plan: {{task}}'
        verification: { type: output_contains, value: PLAN_COMPLETE }
      - name: implement
        agent: developer
        dependsOn: [plan]
        task: 'Implement: {{steps.plan.output}}'
        verification: { type: output_contains, value: IMPLEMENTATION_COMPLETE }
      - name: test
        type: deterministic
        dependsOn: [implement]
        command: npm test
      - name: review
        agent: reviewer
        dependsOn: [test]
        task: 'Review implementation'
        verification: { type: output_contains, value: REVIEW_COMPLETE }
coordination:
  barriers:
    - { name: delivery-ready, waitFor: [plan, implement, review], timeoutMs: 900000 }
trajectories:
  enabled: true
  reflectOnBarriers: true # accepted by the config, but the runner does not currently invoke this path
  reflectOnConverge: true
errorHandling:
  strategy: retry
  maxRetries: 2
  retryDelayMs: 5000
```

Source of Truth

| Claim | File |
|-------|------|
| Pattern enum (24 patterns) | @relayflows/core/dist/schema.d.ts (SwarmPattern) |
| Topology resolution per pattern | @relayflows/core/dist/coordinator.js |
| Interactive-only topology edges | @relayflows/core/dist/coordinator.js (filters interactive: false agents) |
| Pattern auto-selection heuristics | @relayflows/core/dist/coordinator.js |
| WorkflowBuilder fluent API | @relayflows/core/dist/builder.d.ts |
| runWorkflow(yamlPath, options) | @relayflows/core/dist/run.d.ts |
| YAML validation requires version + name + swarm.pattern | @relayflows/core/dist/runner.js |
| MCP tool names | packages/cli/src/cli/agent-relay-mcp.ts, @relayflows/core/dist/channel-messenger.js |
| Completion modes (verification / evidence / owner / process-exit) | @relayflows/core/dist/runner.js, @relayflows/core/dist/step-executor.js |
| Trajectory reflection | @relayflows/core/dist/trajectory.js, @relayflows/core/dist/runner.js |

Install via CLI
npx skills add https://github.com/AgentWorkforce/relay --skill choosing-swarm-patterns
Repository Details
Stars: 726
Forks: 58
Branch: main
Path: SKILL.md