---
name: workflows
description: Delegate a big or high-stakes job to a fleet of parallel subagents, orchestrated deterministically; runs unattended and reports back
compatibility: "Designed for Vellum personal assistants"
metadata:
  emoji: "⚙️"
  vellum:
    display-name: "Workflows"
    category: "system"
    always-candidate: true
    activation-hints:
      - "Batch — apply one operation to each of MANY items (score / rate / rank / classify / extract / summarize each of a large set)"
      - "Comprehensive coverage — exhaustively sweep, audit, or find EVERY instance across a large surface"
      - "Research & synthesize — gather across many sources or pages and combine into one answer"
      - "Confidence — generate several independent attempts and judge them, or adversarially verify findings before trusting the result"
      - "Scale — work too large to finish well in one inline pass"
    avoid-when:
      - "A single inline answer, a quick lookup, or a small one-off"
      - "Interactive, conversational back-and-forth rather than unattended fan-out"
---
A workflow is a short JS/TS script you author that runs in a sandbox and fans work
out across many short-lived leaf agents, orchestrated deterministically. Launch
one with run_workflow (inline script OR saved name, exactly one). It returns a
runId immediately; the run is asynchronous and you are notified in this
conversation when it completes — do NOT poll.
Reach for one when a job is too big, too parallel, or too important for one inline pass. That is more than batch/map-reduce over many items — it also covers exhaustively sweeping or auditing a large surface, researching across many sources and synthesizing, and generating several independent attempts to judge or adversarially verify before trusting the result. For a single task or a quick lookup, do it inline.
The script model
These are the load-bearing invariants. Get them wrong and the run misbehaves silently.
Scripts are SYNCHRONOUS — never await
Host functions block and return their result directly. Write straight-line code.
const r = agent("Summarize this thread."); // r is the result, right here
Do not write await agent(...), and do not make the script async. An
async script deadlocks on its second host call — the sandbox can suspend the main
evaluation stack but not a promise continuation.
Every script begins with a literal meta
The first statement must be a pure-literal export — no computed values, template strings, or concatenation:
export const meta = {
  name: "triage-inbox",
  description: "Triage and label inbox messages",
};
meta is extracted statically, without executing the script, so it must be a
plain object literal with string name and description. The name is how a saved
workflow is referenced by workflow(name) and the scheduler.
You must return the result
The script body runs as a function. Its result is whatever it returns at the top
level — a bare trailing expression (e.g. result;) is discarded and the run
finishes with no result. Always return the value you want surfaced.
const result = agent(`Write the final summary: ${JSON.stringify(parts)}`);
return result;
Determinism (this is what makes runs resumable)
Every leaf call is journaled by sequence number and input hash, so a resumed run can
replay the unchanged prefix instead of re-spawning agents. That only holds if the
script is deterministic, so Date.now(), Math.random(), and argless new Date()
throw. Pass any timestamps or random seeds in through args.
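One way to follow this rule is to take both the clock and the seed from args. The sketch below is illustrative only: the field names args.now and args.seed are hypothetical, and mulberry32 is a small, widely used seeded generator swapped in for Math.random(). Neither is part of the host API.

```javascript
// Hypothetical run input: the caller supplies the timestamp and the seed.
const args = { now: "2024-05-01T09:00:00Z", seed: 42 };

// mulberry32: a tiny seeded PRNG, used here in place of Math.random().
function makeRng(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Deterministic sample: the same seed and items always give the same picks.
function sample(items, count, seed) {
  const rng = makeRng(seed);
  const pool = items.slice();
  const picked = [];
  while (picked.length < count && pool.length > 0) {
    const index = Math.floor(rng() * pool.length);
    picked.push(pool.splice(index, 1)[0]);
  }
  return picked;
}

// The timestamp is used as an opaque string; nothing reads the system clock.
const spotCheck = sample(["a", "b", "c", "d", "e"], 2, args.seed);
const prompt = `As of ${args.now}, review items: ${spotCheck.join(", ")}`;
```

Because the picks depend only on args, a resumed run should issue the same leaf prompts as the original run, which is what lets the journal match them.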
Host API
All functions are synchronous from the script's perspective.
| Function | Returns | Notes |
|---|---|---|
| agent(prompt, opts?) | the leaf's result | Runs ONE leaf. Throws on leaf failure (fails the whole run). |
| leaf(prompt, opts?) | a leaf descriptor | Runs nothing on its own; used inside parallel/map/pipeline. |
| parallel(specs) | results[] | Runs an array of leaf(...) descriptors concurrently, results in input order. A failed leaf becomes null (never throws). |
| map(items, build) | results[] | build(item, i) returns a leaf(...) descriptor per item; runs them like parallel. |
| pipeline(items, ...stages) | results[] | Each stage(prev, i) returns a leaf(...) descriptor (run an agent) OR a plain value (pass through unchanged: filter/transform locally, no agent spent). Per-stage barrier: stage N+1 starts only after all of stage N finishes. |
| phase(title) | — | Marks a named phase for progress reporting. |
| log(msg) | — | Emits a progress log line. |
| usage() | { agentsSpawned, inputTokens, outputTokens } | Live snapshot so a script can self-moderate. |
| workflow(name, args?) | the child's result | Runs a SAVED workflow inline, depth 1 only (a child may not call workflow()). |
| args | the run input | The args object passed to run_workflow. |
Use agent for a single sequential leaf (throws on failure). Use parallel/map/
pipeline for fan-out (a failed leaf is null, so a batch survives a few bad items).
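As a rough illustration of the fan-out behavior described above, the sketch below runs three different questions with parallel and reads the results by position. The leaf and parallel definitions at the top are stand-ins written for this example so it runs outside the sandbox; the fake parallel simply simulates one failed leaf. The function name reviewProposal is hypothetical.

```javascript
// Stand-ins so the sketch runs outside the sandbox. In a real workflow,
// leaf and parallel are provided by the host and these lines are dropped.
const leaf = (prompt, opts = {}) => ({ prompt, opts });
const parallel = (specs) =>
  // Simulate a failure of the "risks" leaf: a failed leaf comes back as null.
  specs.map((spec) => (spec.opts.label === "risks" ? null : `result for ${spec.opts.label}`));

function reviewProposal(text) {
  // Three different questions about the same input, run concurrently.
  const [summary, risks, costs] = parallel([
    leaf(`Summarize this proposal:\n${text}`, { label: "summary" }),
    leaf(`List the main risks in this proposal:\n${text}`, { label: "risks" }),
    leaf(`Estimate the costs in this proposal:\n${text}`, { label: "costs" }),
  ]);
  // Results arrive in input order; replace a null with a visible marker.
  return {
    summary: summary ?? "(summary unavailable)",
    risks: risks ?? "(risks unavailable)",
    costs: costs ?? "(costs unavailable)",
  };
}

const report = reviewProposal("Move the billing service to a new region.");
```

In an actual workflow script the body would sit at the top level after meta and end with return report; the wrapper function is only there so the example is runnable on its own.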
Leaf options (opts for agent / leaf)
| Option | Type | Effect |
|---|---|---|
| schema | JSON Schema object literal | Forces structured output via a tool. A schema leaf runs with no tools (no file_read/file_list/recall/web_search), so it cannot read files or recall memory (pure judge/extractor). Pass anything it must judge inline in the prompt; a schema leaf told to "read these files" answers from the model's prior, not real data. Use a plain JSON Schema literal, not Zod. |
| label | string | Short display/diagnostic label for the leaf. |
| profile | string | Overrides the model profile. Must exist in llm.profiles or the leaf throws. See Listing profiles. |
| persona | boolean | true makes the leaf speak AS the assistant (identity + memory); use for output meant to be in the assistant's voice. Default is anonymous; use for impartial judging/extraction. |
persona: true is the costly path (it runs the full memory-injection pipeline). Use
it only for the few leaves whose output must be in the assistant's voice; keep bulk
judging/extraction anonymous.
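Since a schema leaf cannot read anything itself, the material it judges has to travel inside the prompt. The following is a minimal sketch of that pattern; judgeClaim and verdictSchema are hypothetical names, and the leaf definition is a stand-in so the example runs outside the sandbox.

```javascript
// Stand-in for the host's leaf(); the real one is supplied by the sandbox.
const leaf = (prompt, opts = {}) => ({ prompt, opts });

// Plain JSON Schema literal describing the structured verdict.
const verdictSchema = {
  type: "object",
  properties: {
    supported: { type: "boolean" },
    evidence: { type: "string" },
  },
  required: ["supported", "evidence"],
};

// A schema leaf has no tools, so the text to judge is pasted into the prompt.
// The source text must already have been loaded (for example, passed via args).
function judgeClaim(claim, sourceText, id) {
  return leaf(
    `Claim: ${claim}\n\nSource text:\n${sourceText}\n\nDoes the source support the claim?`,
    { label: `judge:${id}`, schema: verdictSchema },
  );
}

const spec = judgeClaim(
  "The API is rate limited.",
  "Requests are capped at 100 per minute.",
  "c1",
);
```

The descriptor returned here would be handed to parallel, map, or pipeline; persona is left unset so the judge stays anonymous.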
Capabilities — the single consent point
The capabilities argument to run_workflow declares once, up front, what the
run's leaves may do. There are no per-call permission prompts inside a running
workflow.
{
  "tools": ["file_write", "gmail_send"], // side-effecting tools granted to leaves
  "hostFunctions": [],                   // host-function names the run may invoke
  "persona": true                        // grant leaves persona (identity + memory)
}
- Read-only baseline (available to tool leaves, no declaration, no launch prompt): file_read, file_list, recall, web_search. A schema leaf gets none of these (it runs as a single forced-tool-choice call), so pass it inline content and never tell it to read. web_fetch is NOT in the baseline: an outbound fetch is side-effecting (its URL can exfiltrate read data), so a leaf that must fetch a URL has to declare "web_fetch" in capabilities.tools.
- Declaring ANY side-effecting tool (writes, sends, shell, web_fetch, …) or host function makes the LAUNCH prompt the user for approval once; that single approval covers the whole run. A read-only run (no declared tools) launches with no prompt. Declare the minimum you need.
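The two rules above can be sketched as a small pre-launch helper. This is an illustration, not part of the API: planCapabilities is a hypothetical name, and it only models tools and host functions because the text does not say whether granting persona alone triggers the approval prompt.

```javascript
// Tools the text above lists as the read-only baseline.
const READ_ONLY_BASELINE = ["file_read", "file_list", "recall", "web_search"];

// Hypothetical helper: drops baseline tools (they need no declaration) and
// reports whether what remains would trigger the one-time launch approval.
function planCapabilities(neededTools, hostFunctions = []) {
  const tools = [...new Set(neededTools)].filter(
    (tool) => !READ_ONLY_BASELINE.includes(tool),
  );
  return {
    capabilities: { tools, hostFunctions },
    promptsUser: tools.length > 0 || hostFunctions.length > 0,
  };
}

// A research run that only reads and searches: nothing to declare.
const readOnly = planCapabilities(["file_read", "web_search"]);

// The same run, but leaves must also fetch URLs: web_fetch has to be declared.
const withFetch = planCapabilities(["file_read", "web_search", "web_fetch"]);
```

Running the helper before launch makes it easy to see when a small change in what leaves need turns a silent launch into one that asks the user for approval.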
Runs are autonomous but BOUNDED by a per-run agent cap — spend is structurally capped and you cannot exceed it.
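A script can also stay well under the cap on its own by consulting usage() before a fan-out. The sketch below assumes one leaf per item and a self-imposed budget passed in by the caller; affordableCount and the maxAgents parameter are hypothetical, since the text does not say how the real cap is exposed. The usage definition is a stand-in so the example runs outside the sandbox.

```javascript
// Stand-in for the host's usage(); the real one reports live counters.
let spawned = 0;
const usage = () => ({ agentsSpawned: spawned, inputTokens: 0, outputTokens: 0 });

// Hypothetical helper: how many more items fit under a self-imposed budget,
// keeping `reserve` agents back for a final summary leaf.
function affordableCount(itemCount, maxAgents, reserve = 1) {
  const remaining = maxAgents - usage().agentsSpawned - reserve;
  return Math.max(0, Math.min(itemCount, remaining));
}

// With nothing spawned yet and a budget of 20, 19 of 50 items can be processed.
const firstBatchSize = affordableCount(50, 20);
```

A script would then slice its input to that size before calling map, and log how many items were skipped so the shortfall is visible in the result.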
Listing available profiles
Before choosing a profile for a leaf, look up the valid values rather than guessing
— an unknown profile throws.
- Preferred (model-accessible): call manage_workflows with action list_profiles. It returns the profile names defined in llm.profiles plus the workspace-wide active profile.
- The same data is served by the daemon route GET config/llm/profiles (operationId llm_profiles_list), which clients use to populate profile dropdowns.
Omit profile to use the default: a persona leaf mirrors the main agent (the active
profile floats above the call-site default); an anonymous leaf uses the cost-optimized
workflowLeaf default.
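Because an unknown profile throws, one defensive option is to set profile only when the name is known to exist and otherwise fall back to the default by omitting it. The sketch below assumes the list of profile names was looked up beforehand and passed to the script as a plain array; leafOptions and the profile names shown are hypothetical.

```javascript
// Hypothetical helper. `available` is the list of profile names, assumed to
// have been looked up beforehand and passed in through args.
function leafOptions(wanted, available, base = {}) {
  if (wanted && available.includes(wanted)) {
    return { ...base, profile: wanted };
  }
  // Unknown or unset: leave profile out so the leaf uses its default.
  return { ...base };
}

const known = leafOptions("fast", ["fast", "deep"], { label: "extract" });
const unknown = leafOptions("missing", ["fast", "deep"], { label: "extract" });
```

The resulting object can be passed directly as the opts argument of agent or leaf.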
Run management
Use manage_workflows to inspect and control runs:
| Action | Requires | Purpose |
|---|---|---|
| status | run_id | Status + agent/token counts for one run (NOT the result). |
| get_result | run_id | The full result of a finished run. |
| list_runs | — | Recent runs, newest first. |
| abort | run_id | Signal an in-flight run to abort. |
| resume | run_id | Resume an interrupted run (see below). |
| list_profiles | — | List defined profiles + the active profile (for leaf profile). |
The completion notification injected when a run finishes carries a truncated
preview of the result (large results are cut off). To read the complete result,
call manage_workflows with action get_result and the run_id — status
deliberately omits the result to stay lightweight.
Crash recovery / resume
Resume is not automatic. If the assistant restarts mid-run, the run is reconciled
to status interrupted (the agent/token accounting is preserved so the agent cap
still carries across the restart). It sits there until you explicitly resume it by
run_id via manage_workflows action resume. Resuming re-invokes the engine with
the same runId: the journal replays the completed prefix without re-spawning (or
re-paying for) finished leaves, then continues from the first unfinished leaf under the
run's originally-declared capabilities. Only interrupted runs are resumable; a
completed / failed / aborted run is terminal.
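The status rules above can be summarized as a small decision helper. This is a sketch under assumptions: nextAction is a hypothetical name, and the shape of the run object (a status string plus a run_id) is assumed rather than taken from the manage_workflows response format.

```javascript
// Hypothetical helper: which manage_workflows action fits a run's status.
function nextAction(run) {
  if (run.status === "interrupted") {
    return { action: "resume", run_id: run.run_id };
  }
  if (run.status === "completed") {
    return { action: "get_result", run_id: run.run_id };
  }
  // failed / aborted are terminal; anything else is assumed to be in flight,
  // where the right move is to wait for the completion notification.
  return null;
}

const afterRestart = nextAction({ status: "interrupted", run_id: "run-1" });
```

Only the interrupted branch leads to resume, which mirrors the rule that completed, failed, and aborted runs are terminal.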
Worked example
Score each inbox item in parallel (anonymous schema leaves), then write one summary in
the assistant's voice (a single persona leaf). The item list comes in via args —
never fetched inside the script.
export const meta = {
  name: "triage-inbox",
  description: "Score and summarize inbox items",
};
phase("score");
const scored = map(args.items, (item) =>
  leaf(`Rate this message's urgency 0-10 with a one-line reason:\n${item.subject}\n${item.body}`, {
    label: `score:${item.id}`,
    schema: {
      type: "object",
      properties: { urgency: { type: "number" }, reason: { type: "string" } },
      required: ["urgency", "reason"],
    },
  }),
);
phase("summarize");
const summary = agent(
  `Here are scored inbox items. Write a short triage summary, highlighting anything urgent:\n${JSON.stringify(scored)}`,
  { persona: true },
);
return summary;
A failed scoring leaf shows up as null in scored; the run continues.
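Since map returns results in input order, a null entry can be traced back to the item that failed. The helper below is a hypothetical addition to the example, not part of the original script: it pairs each result with its item id, sorts the successful ones by urgency, and collects the ids of failures so the summary prompt can mention them.

```javascript
// Hypothetical helper: `scored` is in the same order as `items`, so index i
// of one corresponds to index i of the other.
function splitScored(items, scored) {
  const ok = [];
  const failedIds = [];
  scored.forEach((result, i) => {
    if (result === null) {
      failedIds.push(items[i].id);
    } else {
      ok.push({ id: items[i].id, ...result });
    }
  });
  ok.sort((a, b) => b.urgency - a.urgency); // most urgent first
  return { ok, failedIds };
}

// Sample data standing in for args.items and the output of map.
const sampleItems = [{ id: "m1" }, { id: "m2" }, { id: "m3" }];
const sampleScored = [
  { urgency: 3, reason: "newsletter" },
  null,
  { urgency: 9, reason: "payment overdue" },
];
const triage = splitScored(sampleItems, sampleScored);
```

Passing triage.ok and triage.failedIds to the summary leaf, instead of the raw scored array, keeps nulls out of the prompt and lets the summary say which messages could not be scored.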