agent-workflow-graphs

Guide the design of graph-based workflows for AI agents -- branching, chaining, merging, conditions, suspend/resume for human-in-the-loop, streaming updates to users, and observability with tracing. Use when the user needs to build structured multi-step agent processes, add deterministic control flow to LLM-powered systems, implement durable workflows that survive crashes, or add tracing and observability. Also use when user mentions "workflow", "graph", "branching", "chaining", "suspend", "resume", "tracing", "observability", "OpenTelemetry", "durable execution", or asks how to make their agent follow a specific step sequence.

By Lemirq. Updated 4/2/2026

Agent Workflow Graphs

When agents have too much freedom, they produce unpredictable results. Graph-based workflows constrain the agent to a structured process while still leveraging LLM intelligence at each step.

Core Concept

A workflow graph breaks a complex task into discrete steps connected by edges. Each step can:

  • Call an LLM for a focused decision
  • Execute deterministic code
  • Call external APIs
  • Wait for human input

The key insight: the LLM makes several small, focused decisions (often binary) instead of one big decision.

Workflow Primitives

Branching (Fan-Out)

Trigger multiple LLM calls on the same input in parallel:

Input --> Step 1 (check symptom A)
     --> Step 2 (check symptom B)
     --> Step 3 (check symptom C)

Use when: You need to check several independent things on the same input. Twelve parallel calls that each check one symptom usually beat one call that checks all twelve, because each prompt stays small and focused.
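A minimal sketch of fan-out, assuming each check is an async function. The names `Check`, `makeCheck`, and `fanOut` are hypothetical, and the keyword match only stands in for a real LLM call:

```typescript
// Hypothetical step type: takes the shared input, returns one named finding.
type Check = (input: string) => Promise<{ name: string; present: boolean }>;

// Stand-in for an LLM call; a real step would prompt a model about one symptom.
const makeCheck = (name: string, keyword: string): Check =>
  async (input) => ({ name, present: input.toLowerCase().includes(keyword) });

const checks: Check[] = [
  makeCheck("fever", "fever"),
  makeCheck("cough", "cough"),
  makeCheck("rash", "rash"),
];

// Fan-out: start every check on the same input, then wait for all of them.
async function fanOut(input: string) {
  return Promise.all(checks.map((check) => check(input)));
}
```

`Promise.all` rejects as soon as any branch fails. If one failed check should not sink the others, `Promise.allSettled` is the usual alternative.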

Chaining (Sequential)

Feed the output of one step into the next:

Step 1 (fetch data) --> Step 2 (analyze) --> Step 3 (summarize)

Each step waits for the previous step and has access to prior results via a shared context object.

Use when: Steps have dependencies -- each needs the previous step's output.
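One way to sketch the shared context object, assuming every step takes the accumulated context and returns an extended copy. `Context`, `Step`, and `runChain` are illustrative names, not a specific framework's API:

```typescript
// Shared context: each step reads prior results and adds its own.
interface Context {
  raw?: number[];
  mean?: number;
  summary?: string;
}

type Step = (ctx: Context) => Promise<Context>;

// Hypothetical steps; real ones would call an API or an LLM.
const fetchData: Step = async (ctx) => ({ ...ctx, raw: [2, 4, 6] });

const analyze: Step = async (ctx) => {
  const raw = ctx.raw ?? [];
  const mean = raw.reduce((sum, n) => sum + n, 0) / Math.max(raw.length, 1);
  return { ...ctx, mean };
};

const summarize: Step = async (ctx) => ({
  ...ctx,
  summary: `mean of ${ctx.raw?.length ?? 0} values is ${ctx.mean}`,
});

// Chaining: run steps in order, passing the accumulated context forward.
async function runChain(steps: Step[], initial: Context = {}): Promise<Context> {
  let ctx = initial;
  for (const step of steps) {
    ctx = await step(ctx);
  }
  return ctx;
}
```

Calling `runChain([fetchData, analyze, summarize])` runs the three steps in order. Returning a new context from each step, rather than mutating a shared one, keeps every step's input and output easy to inspect.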

Merging (Fan-In)

After parallel branches finish, combine their results in a single step:

Step 1 --\
           --> Merge step (combine results) --> Output
Step 2 --/

Use when: You branched earlier and need to combine independent results into a single output.
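A small sketch of a merge step, assuming each branch returns a result with a numeric score. `BranchResult`, `merge`, and `fanIn` are hypothetical names chosen for illustration:

```typescript
interface BranchResult {
  source: string;
  score: number; // hypothetical 0..1 confidence reported by one branch
}

// Merge step: deterministic code that reduces branch outputs to one value.
function merge(results: BranchResult[]): { best: string; average: number } {
  if (results.length === 0) throw new Error("nothing to merge");
  const average = results.reduce((sum, r) => sum + r.score, 0) / results.length;
  const best = results.reduce((a, b) => (b.score > a.score ? b : a)).source;
  return { best, average };
}

// Fan-in: wait for every branch, then hand the results to the merge step.
async function fanIn(branches: Array<() => Promise<BranchResult>>) {
  const results = await Promise.all(branches.map((run) => run()));
  return merge(results);
}
```

The merge here is plain code. It could equally be a single LLM call that summarizes the branch outputs, as long as the step still does one focused thing.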

Conditions

Execute steps conditionally based on intermediate results:

Step 1 (fetch data)
  |
  v
  [condition: fetchData.status === "success"]
  |
  v
Step 2 (process data)

Use when: Workflow paths depend on runtime results (success/failure, data type, user choice).
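The condition in the diagram can be sketched as an ordinary branch on the previous step's result. The step bodies below are placeholders, and the `FetchResult` shape is an assumption made for this example:

```typescript
type FetchResult =
  | { status: "success"; data: string }
  | { status: "error"; reason: string };

// Hypothetical step 1: stands in for a real network call.
async function fetchData(url: string): Promise<FetchResult> {
  return url.startsWith("https://")
    ? { status: "success", data: `payload from ${url}` }
    : { status: "error", reason: "insecure URL" };
}

// Hypothetical step 2: only meaningful when the fetch succeeded.
async function processData(data: string): Promise<string> {
  return data.toUpperCase();
}

// The condition is plain code on the edge: only a successful fetch reaches step 2.
async function run(url: string): Promise<string> {
  const fetched = await fetchData(url);
  if (fetched.status === "success") {
    return processData(fetched.data);
  }
  return `skipped processing: ${fetched.reason}`;
}
```

Keeping the condition in deterministic code, rather than asking the model which path to take, is what makes the control flow predictable.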

Best Practices for Workflow Steps

  1. Meaningful I/O at each step: Design inputs and outputs so they make sense in your tracing UI
  2. One LLM call per step maximum: Each step should do one focused thing
  3. Combine primitives: Loops, retries, and complex patterns are all compositions of these four primitives
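As an illustration of the third point, a retry loop can be read as chaining (each attempt follows the last) plus a condition (was the result acceptable?). `withRetry` and its parameters are hypothetical names for this sketch:

```typescript
// Retry = chaining (attempt after attempt) + a condition (is the result good?).
async function withRetry<T>(
  step: (attempt: number) => Promise<T>,
  isGood: (result: T) => boolean,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await step(attempt);
    if (isGood(result)) return result; // condition met: leave the loop
  }
  throw new Error(`no acceptable result after ${maxAttempts} attempts`);
}
```

The `step` callback receives the attempt number so that a real implementation could, for example, adjust its prompt on later attempts.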

Suspend and Resume

Problem

Workflows sometimes need to pause for external input (human approval, webhook callback, long-running external process).

Solution

Persist the workflow state, then resume from exactly where it left off.

Step 1 --> Step 2 --> [SUSPEND] --> waiting for human approval
                                        |
                          human approves |
                                        v
                                    [RESUME] --> Step 3 --> Step 4

Implementation Pattern

  1. Define suspendSchema on the step that needs to pause
  2. Call suspend() with a payload (what you're waiting for)
  3. The workflow persists its state to a database
  4. When the external event arrives, call resume() with the response data
  5. The workflow continues from the suspended step
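The sequence above can be sketched without any framework. This is a simplified model, not a real library's API: `Snapshot`, `execute`, `start`, and `resume` are names invented here, and the in-memory `Map` only stands in for the database a real system needs.

```typescript
// Serialized snapshot of a paused run. In production this would be a database row.
interface Snapshot {
  runId: string;
  stepIndex: number; // the step to re-enter on resume
  state: Record<string, unknown>;
  waitingFor: string; // suspend payload: what the run is waiting on
}

// Stand-in for persistent storage (an in-memory Map will NOT survive a crash).
const store = new Map<string, string>();

type StepResult =
  | { kind: "next"; state: Record<string, unknown> }
  | { kind: "suspend"; waitingFor: string };

type Step = (state: Record<string, unknown>, resumeData?: unknown) => StepResult;

const steps: Step[] = [
  (state) => ({ kind: "next", state: { ...state, draft: "refund request" } }),
  // Suspends until resume data arrives, then records the decision.
  (state, resumeData) =>
    resumeData === undefined
      ? { kind: "suspend", waitingFor: "manager approval" }
      : { kind: "next", state: { ...state, approved: resumeData === "yes" } },
  (state) => ({ kind: "next", state: { ...state, done: true } }),
];

function execute(runId: string, from: number, state: Record<string, unknown>, resumeData?: unknown) {
  for (let i = from; i < steps.length; i++) {
    // Only the step being resumed sees the resume data.
    const result = steps[i](state, i === from ? resumeData : undefined);
    if (result.kind === "suspend") {
      const snapshot: Snapshot = { runId, stepIndex: i, state, waitingFor: result.waitingFor };
      store.set(runId, JSON.stringify(snapshot)); // persist, then stop running
      return { status: "suspended" as const, waitingFor: result.waitingFor };
    }
    state = result.state;
  }
  store.delete(runId);
  return { status: "completed" as const, state };
}

const start = (runId: string) => execute(runId, 0, {});

function resume(runId: string, resumeData: unknown) {
  const saved = store.get(runId);
  if (saved === undefined) throw new Error(`no suspended run ${runId}`);
  const snapshot = JSON.parse(saved) as Snapshot;
  return execute(runId, snapshot.stepIndex, snapshot.state, resumeData);
}
```

Calling `start("run-1")` returns a suspended status and leaves a snapshot in the store. A later `resume("run-1", "yes")` re-enters the approval step with the response and runs to completion. Because the state round-trips through JSON, it has to stay serializable.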

Key Insight

This is the workflow equivalent of HITL (human-in-the-loop). The workflow doesn't keep a running process alive -- it serializes state and picks back up later.

Streaming Updates

Why Streaming Matters

A 10-second blank screen feels broken. The same 10 seconds with live progress updates feels fast and responsive.

What to Stream

  • LLM tokens: Show text as it's generated
  • Workflow step updates: "Searching... Analyzing... Writing..."
  • Partial results: Push intermediate outputs before the workflow completes

How to Build Streaming

  • Stream as much as you can: Tokens, workflow steps, custom data
  • Use reactive tools: ElectricSQL, Turbo Streams, or server-sent events (SSE) for real-time updates
  • Escape hatches: If a function is stuck waiting, push partial results to the frontend

Pattern: Streaming from Workflow Steps

Each step can emit progress updates to the client while executing. The client renders updates as they arrive, creating a responsive experience even for multi-minute agent runs.
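One possible shape for this pattern is an async generator that yields an update whenever a step has something to report. The `Update` type and the step messages are assumptions for illustration, and the search step is a placeholder:

```typescript
type Update =
  | { type: "step"; message: string }
  | { type: "partial"; text: string }
  | { type: "done"; result: string };

// Each yield is available to the consumer as soon as it is produced.
async function* runWorkflow(query: string): AsyncGenerator<Update> {
  yield { type: "step", message: "Searching..." };
  const hits = [`note about ${query}`]; // stand-in for a real search step

  yield { type: "step", message: "Analyzing..." };
  yield { type: "partial", text: `found ${hits.length} source(s)` };

  yield { type: "step", message: "Writing..." };
  yield { type: "done", result: `Summary of ${hits.join(", ")}` };
}

// Consumer: a server would forward each update over SSE or a WebSocket instead.
async function collect(query: string): Promise<string[]> {
  const lines: string[] = [];
  for await (const update of runWorkflow(query)) {
    lines.push(update.type === "step" ? update.message : JSON.stringify(update));
  }
  return lines;
}
```

The tagged `type` field lets the client decide how to render each update: a status line for steps, provisional content for partial results, and the final answer for `done`.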

Observability and Tracing

Why Observability is Critical

LLMs are non-deterministic. The question isn't whether your agent will go off the rails. It's when and how much.

Tracing

A trace is a tree of spans showing the input/output of every function called during an agent run. Think of it like a flame chart or nested HTML document.

Standard format: OpenTelemetry (OTel) -- use it for portability across vendors.
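To make the tree-of-spans idea concrete, here is a toy recorder. It is not the OpenTelemetry API: `Span`, `Tracer`, and `demo` are hypothetical names, and a production system would emit spans through an OpenTelemetry SDK instead.

```typescript
interface Span {
  name: string;
  input: unknown;
  output?: unknown;
  durationMs?: number;
  children: Span[];
}

class Tracer {
  readonly roots: Span[] = [];
  // Stack of open spans. This simple approach assumes steps run sequentially;
  // parallel branches would need explicit parent passing.
  private stack: Span[] = [];

  // Runs fn inside a new span nested under whichever span is currently open.
  async trace<T>(name: string, input: unknown, fn: () => Promise<T>): Promise<T> {
    const span: Span = { name, input, children: [] };
    const parent = this.stack[this.stack.length - 1];
    (parent ? parent.children : this.roots).push(span);
    this.stack.push(span);
    const started = Date.now();
    try {
      const output = await fn();
      span.output = output;
      return output;
    } finally {
      span.durationMs = Date.now() - started;
      this.stack.pop();
    }
  }
}

// Example run: one root span with two child spans.
async function demo(): Promise<Span[]> {
  const tracer = new Tracer();
  await tracer.trace("agent_run", { question: "status?" }, async () => {
    const parsed = await tracer.trace("parse_input", "status?", async () => "STATUS");
    return await tracer.trace("api_call", parsed, async () => ({ ok: true }));
  });
  return tracer.roots;
}
```

Each span records the input and output of one call, which is the data a tracing UI renders when you drill into a step.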

What a Tracing UI Shows

  1. Trace view: How long each step took (parse_input, process_request, api_call, etc.)
  2. Input/output inspection: Exact JSON data flowing in and out of each LLM call
  3. Call metadata: Status, start/end times, latency, operation type

Eval Integration

Tracing UIs often show eval results as well:

  • Side-by-side comparison of agent response vs. expected
  • Overall score per PR (to catch regressions)
  • Score over time, filterable by tags and run date

Best Practices

  • Emit traces in OpenTelemetry format for vendor portability
  • Use a hosted tracing tool in production and a local one (like Mastra's dev UI) during development
  • Look at production traces regularly -- they reveal failure patterns that tests miss

Decision Framework

| Situation | Pattern |
|---|---|
| Multiple independent checks on same input | Branching (fan-out) |
| Sequential dependent steps | Chaining |
| Combining parallel results | Merging (fan-in) |
| Runtime-dependent paths | Conditions |
| Need human approval mid-workflow | Suspend/resume |
| Users waiting for multi-step results | Streaming updates |
| Debugging production agent failures | Tracing + observability |

Gotchas

  • Workflows add complexity. Only use them when agents are too unpredictable without structure.
  • Design each step's I/O to be meaningful -- you'll be reading it in traces.
  • One LLM call per step. Multi-call steps are harder to debug and trace.
  • Suspend/resume requires persistent state storage (database, not memory).
  • Streaming isn't optional for production agent UX. Users need to see progress.
  • Use OpenTelemetry for tracing. Proprietary formats lock you into a vendor.

For implementation examples, see references/workflow-examples.md.

Install via CLI
npx skills add https://github.com/Lemirq/agent-best-practices --skill agent-workflow-graphs