---
name: pydantic-ai-harness
description: Extend Pydantic AI agents with batteries-included capabilities from pydantic-ai-harness -- currently Code Mode, which collapses many tool calls into one sandboxed Python execution. Use when the user mentions pydantic-ai-harness, CodeMode, Monty, code mode, or tool sandboxing, when they want an agent to run agent-written Python, or when a Pydantic AI agent would benefit from orchestrating multiple tool calls in a single sandboxed script.
license: MIT
compatibility: Requires Python 3.10+ and pydantic-ai-slim>=1.95.1
metadata:
  version: "0.1.0"
  author: pydantic
---
Building with Pydantic AI Harness
Pydantic AI Harness is the official capability library for Pydantic AI. Capabilities that need model or
framework support -- and those fundamental to every agent -- live in core pydantic-ai; optional,
batteries-included capabilities live here. Both are composed onto an agent through the same
capabilities=[...] API.
This skill covers the capabilities shipped by pydantic-ai-harness. For the core framework -- agents,
tools, structured output, hooks, and testing -- use the building-pydantic-ai-agents skill instead.
When to Use This Skill
Invoke this skill when:
- The user mentions pydantic-ai-harness, CodeMode, code mode, or the Monty sandbox
- An agent makes many sequential tool calls that could collapse into one sandboxed Python execution
- The user wants the model to write Python that loops, branches, aggregates, or parallelizes tool calls with asyncio.gather
- The user asks to sandbox or constrain the code an agent runs
Do not use this skill for:
- Core Pydantic AI usage -- building agents, adding tools, structured output, streaming, or testing (use building-pydantic-ai-agents)
- Capabilities that ship in core pydantic-ai, such as web search, tool search, and thinking
- The Pydantic validation library on its own (pydantic/BaseModel without agents)
Supported Capabilities
| Capability | Description | Reference |
|---|---|---|
| CodeMode | Wraps eligible tools into a single sandboxed run_code tool so the model orchestrates them in Python | Code Mode |
More capability areas are tracked in the capability matrix; as they stabilize, this skill grows to cover them.
Install
uv add pydantic-ai-harness
Each capability declares its own extra. Code Mode needs the Monty sandbox:
uv add "pydantic-ai-harness[codemode]" # `code-mode` is also accepted as an alias
Requires Python 3.10+ and pydantic-ai-slim>=1.95.1.
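The version floor can be checked before wiring anything up. The following is a minimal sketch using only the standard library; the helper names parse_version, meets_minimum, and check_environment are illustrative, and the simple numeric parser is an assumption that treats pre-release suffixes conservatively rather than implementing full version semantics.

```python
import sys
from importlib import metadata


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.95.1' into (1, 95, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split('.'):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted version strings numerically."""
    return parse_version(installed) >= parse_version(minimum)


def check_environment() -> list[str]:
    """Return human-readable problems; an empty list means the floor is met."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append('Python 3.10+ is required')
    try:
        installed = metadata.version('pydantic-ai-slim')
    except metadata.PackageNotFoundError:
        problems.append('pydantic-ai-slim is not installed')
    else:
        if not meets_minimum(installed, '1.95.1'):
            problems.append(f'pydantic-ai-slim {installed} is older than 1.95.1')
    return problems


if __name__ == '__main__':
    print(check_environment() or 'environment looks fine')
```

This only checks the documented minimums; it does not verify that the codemode extra itself is installed.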
Quick Start
A harness capability is added to the agent like any other. Here CodeMode wraps an MCP server's tools into
a single run_code tool that the model drives with Python.
from pydantic_ai import Agent
from pydantic_ai.capabilities import MCP # MCP ships in core pydantic-ai
from pydantic_ai_harness import CodeMode
agent = Agent(
    'anthropic:claude-sonnet-4-6',
    capabilities=[
        # native=False routes the MCP tools through a local toolset so CodeMode can wrap them.
        # Without it, providers with native MCP run the tools server-side and bypass the sandbox.
        MCP('https://hn.caseyjhand.com/mcp', native=False),
        CodeMode(),
    ],
)

result = agent.run_sync(
    'Across the top and best Hacker News feeds, find the most-discussed story with at '
    'least 100 points and summarize its comment thread in one paragraph.'
)
print(result.output)
#> The most-discussed story clearing 100 points is ...
Instead of one model round-trip per tool call, the model writes a single Python script that fetches both
feeds with asyncio.gather, dedupes and ranks them in plain Python, and pulls the winning thread --
collapsing many calls into one run_code.
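To make the shape of such a script concrete, here is a rough sketch in plain Python. It is not output from a real run: get_feed and get_thread are hypothetical stand-ins for the MCP server's tools (the real tool names and return shapes come from the server), and the stub data is invented for illustration.

```python
import asyncio

# Hypothetical stub data standing in for what the MCP tools might return.
_FEEDS = {
    'top': [
        {'id': 1, 'title': 'A', 'points': 250, 'comments': 80},
        {'id': 2, 'title': 'B', 'points': 90, 'comments': 300},
    ],
    'best': [
        {'id': 1, 'title': 'A', 'points': 250, 'comments': 80},
        {'id': 3, 'title': 'C', 'points': 120, 'comments': 150},
    ],
}


async def get_feed(name: str) -> list[dict]:
    """Hypothetical tool: return the stories in one feed."""
    await asyncio.sleep(0)  # stands in for network latency
    return _FEEDS[name]


async def get_thread(story_id: int) -> str:
    """Hypothetical tool: return the comment thread for one story."""
    await asyncio.sleep(0)
    return f'comments for story {story_id}'


async def main() -> tuple[dict, str]:
    # Fetch both feeds concurrently instead of one round-trip each.
    top, best = await asyncio.gather(get_feed('top'), get_feed('best'))

    # Dedupe by story id, then keep only stories with at least 100 points.
    unique = {story['id']: story for story in top + best}
    eligible = [s for s in unique.values() if s['points'] >= 100]

    # Rank by comment count and pull the winning thread.
    winner = max(eligible, key=lambda s: s['comments'])
    thread = await get_thread(winner['id'])
    return winner, thread


if __name__ == '__main__':
    print(asyncio.run(main()))
```

With the stub data, story 2 has the most comments but is filtered out for having fewer than 100 points, so story 3 wins. The filtering and ranking happen in ordinary Python between tool calls, with no extra model round-trips.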
Key Practices
- Confirm a harness capability is actually needed. If core Pydantic AI tools and capabilities are enough, use the building-pydantic-ai-agents skill instead -- don't reach for the harness by default.
- Read the reference before writing code. Each capability has its own configuration, constraints, and gotchas -- load the linked reference (e.g. Code Mode) first.
- Install the capability's extra. Importing CodeMode without pydantic-ai-harness[codemode] raises an ImportError; the Monty sandbox is an optional dependency.
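One way to surface the missing-extra case early is to guard the import and re-raise with the install command. This is a small sketch, not part of the harness API: load_code_mode is a hypothetical helper name, and it relies only on the ImportError behaviour described above.

```python
def load_code_mode():
    """Import CodeMode, turning a missing extra into an actionable error."""
    try:
        from pydantic_ai_harness import CodeMode
    except ImportError as exc:
        # Covers both a missing package and a missing codemode extra.
        raise RuntimeError(
            'CodeMode is unavailable; install it with: '
            'uv add "pydantic-ai-harness[codemode]"'
        ) from exc
    return CodeMode
```

Calling load_code_mode() at application start-up makes the failure show up before any agent is constructed.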
Common Gotchas
- native=True tools bypass CodeMode. Provider-native MCP servers and web search execute server-side, so run_code never sees them. Construct them with native=False to keep them local and wrappable.
- The Monty sandbox is a Python subset. No class definitions, no third-party imports, and only a small stdlib allowlist -- read Code Mode before debugging generated code that fails to run.
- CodeMode needs its extra. Install pydantic-ai-harness[codemode], not the bare package.
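When generated code fails in the sandbox, a quick static pass can rule out the two restrictions named above. The sketch below uses the standard ast module and is not Monty's own validation: find_subset_violations is a hypothetical helper, and ASSUMED_ALLOWED_IMPORTS is a placeholder allowlist chosen for illustration, since the real one is documented in the Code Mode reference.

```python
import ast

# Assumption: placeholder allowlist for illustration only. The real list is
# defined by Monty and documented in the Code Mode reference.
ASSUMED_ALLOWED_IMPORTS = {'asyncio', 'json', 'math'}


def find_subset_violations(source: str) -> list[str]:
    """Flag class definitions and imports outside the assumed allowlist."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            violations.append(f'line {node.lineno}: class definition {node.name!r}')
        elif isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split('.')[0]
                if root not in ASSUMED_ALLOWED_IMPORTS:
                    violations.append(f'line {node.lineno}: import of {root!r}')
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or '').split('.')[0]
            if root not in ASSUMED_ALLOWED_IMPORTS:
                violations.append(f'line {node.lineno}: import from {root!r}')
    return violations


if __name__ == '__main__':
    sample = 'import requests\nclass Foo:\n    pass\n'
    for problem in find_subset_violations(sample):
        print(problem)
```

A clean result here does not guarantee the code runs in Monty; it only rules out class definitions and imports outside the assumed allowlist.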