---
name: tool-orchestration
description: >
  Use this skill to implement reliable agent-to-tool communication using
  Model Context Protocol (MCP), define strict tool contracts with JSON Schema,
  build idempotent and retry-safe tool invocations, manage tool discovery and
  routing, enforce permission models, and orchestrate multi-tool chaining
  pipelines.
  This skill enforces: typed tool schemas, idempotency keys, permission
  scoping, graceful error propagation, and tool version compatibility checks.
  Do NOT use for: general API design, REST endpoint scaffolding, or model
  fine-tuning workflows.
version: "2.0.0"
author: "j4flmao"
license: "MIT"
type: skill
compatibility:
  claude-code: true
  cursor: true
  codex: true
  windsurf: true
tags: [harness-engineering, tool-orchestration, mcp, idempotency, agent-tools]
---
Tool Orchestration Skill
Purpose
Establishes a production-grade framework for agents to discover, invoke, chain, and recover from failures when interacting with external tools. Covers the full lifecycle from tool schema definition through MCP transport negotiation, permission evaluation, idempotent execution, error handling, and version evolution. This system ensures that every tool call an agent makes is typed, authorized, retry-safe, auditable, and compatible with the target tool's contract version.
Core Principles
- Schema-First Contracts: Every tool must be defined by a strict JSON Schema that specifies input parameters, output types, required fields, and error envelopes before any implementation is written.
- Idempotency by Default: All tool invocations must carry an idempotency key. Repeated calls with the same key and parameters must apply their side effects exactly once and return the result of the first execution.
- Least-Privilege Permissions: Agents receive the minimum set of tool permissions required for their current task scope. Permission grants are scoped, time-bounded, and auditable.
- Graceful Degradation: Tool failures must never crash the agent loop. Every tool call must have a timeout, a retry policy, and a fallback path defined before execution.
- Version-Aware Routing: Agents must negotiate tool versions at discovery time and refuse to invoke tools whose contract version is incompatible with the agent's expected interface.
Agent Protocol
Triggers
Use this skill when processing:
- Agent loops that invoke external tools (file operations, API calls, database queries, shell commands).
- MCP server/client handshake and capability negotiation sequences.
- Tool pipeline chains where the output of one tool feeds into the input of another.
- Permission grant evaluation, idempotency key management, or tool version migration tasks.
- Error recovery flows involving tool timeout, rate limiting, or partial failure scenarios.
Input Context Required
- Tool Registry Manifest: A JSON document listing available tools, their schemas, versions, and permission scopes.
- Agent Identity & Scope: The agent's ID, role, and current task context for permission evaluation.
- Idempotency Key Source: A deterministic key generator or client-provided idempotency token.
- Target MCP Server Endpoint: The transport URI (stdio, SSE, or HTTP) for the MCP server.
Output Artifact
- Tool Invocation Record: A structured log of the tool call including request, response, timing, and idempotency key.
- Pipeline Execution Trace: An ordered sequence of tool calls with dependency edges and intermediate results.
- Permission Audit Log: A record of which permissions were evaluated, granted, or denied for each tool call.
Response Formats
For programmatic integration, the tool invocation result must follow this structure:
{
"tool_call_id": "tc_a1b2c3d4",
"idempotency_key": "idem_usr42_task99_step3_v1",
"tool_name": "file_read",
"tool_version": "1.2.0",
"status": "success",
"input_hash": "sha256:9f86d08...",
"result": {
"content": "File contents here...",
"metadata": { "size_bytes": 4096, "encoding": "utf-8" }
},
"timing": {
"queued_at": "2026-06-04T09:00:00Z",
"started_at": "2026-06-04T09:00:01Z",
"completed_at": "2026-06-04T09:00:02Z",
"duration_ms": 1042
},
"permission_grant": {
"scope": "file:read:/workspace/**",
"granted_by": "system",
"expires_at": "2026-06-04T10:00:00Z"
}
}
Decision Matrix for Tool Invocation
Agent wants to call a tool?
├── Tool discovered in registry?
│ ├── NO → Query MCP server for capability list → Cache result → Retry lookup.
│ └── YES
│ ├── Version compatible?
│ │ ├── NO → Check compatibility matrix for migration adapter.
│ │ │ ├── Adapter available → Transform request → Proceed.
│ │ │ └── No adapter → ABORT with ToolVersionError.
│ │ └── YES
│ │ ├── Permission granted?
│ │ │ ├── NO → Request elevation or ABORT with PermissionDenied.
│ │ │ └── YES
│ │ │ ├── Idempotency key exists in dedup store?
│ │ │ │ ├── YES → Return cached result (no re-execution).
│ │ │ │ └── NO
│ │ │ │ ├── Execute tool call with timeout.
│ │ │ │ │ ├── SUCCESS → Store result keyed by idem key → Return.
│ │ │ │ │ ├── TRANSIENT FAILURE → Retry with exponential backoff.
│ │ │ │ │ └── PERMANENT FAILURE → Log error → Trigger fallback.
│ │ │ │ └── Pipeline step? → Route output to next tool in chain.
└── No registry configured → Fall back to static tool definitions.
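The gate order in the matrix above can be sketched as one pure function. This is a minimal illustration, not the skill's actual router: `InvocationContext`, `decide`, and the returned action strings are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InvocationContext:
    """Hypothetical snapshot of the facts the decision matrix inspects."""
    tool_in_registry: bool
    version_compatible: bool
    adapter_available: bool
    permission_granted: bool
    cached_result: Optional[dict]  # result already stored under the idempotency key


def decide(ctx: InvocationContext) -> str:
    """Return the next action, checking gates in the same order as the matrix."""
    if not ctx.tool_in_registry:
        return "refresh_registry"        # query MCP server, cache, retry lookup
    if not ctx.version_compatible and not ctx.adapter_available:
        return "abort:ToolVersionError"
    if not ctx.permission_granted:
        return "abort:PermissionDenied"  # or request elevation first
    if ctx.cached_result is not None:
        return "return_cached"           # no re-execution
    if not ctx.version_compatible:
        return "execute_with_adapter"    # transform request, then execute
    return "execute"
```

Keeping the decision free of I/O makes each branch of the matrix easy to unit test before wiring it to a real registry or permission store.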
Detailed Architectural Overview
Tool orchestration forms the bridge between agent reasoning and real-world side effects. Below is the system architecture mapping how tool calls flow from agent intent through validation, execution, and result propagation.
+---------------+     +------------+     +------------------+     +------------------+
| Agent Planner | ──► | MCP Client | ──► | Schema Validator | ──► | Permission Gate  |
+---------------+     +------------+     +------------------+     +------------------+
                                                                           │
                                                                           ▼
+---------------+                                               +---------------------+
| Result Cache  | ◄──────────────────────────────────────────── | Idempotent Executor |
+---------------+                                               +---------------------+
        │                                                                  │
        ▼                                                                  ▼
+---------------+     +------------------+     +------------------+     +-----------+
| Agent Memory  | ◄── | Pipeline Router  | ◄── | Error Handler    | ◄── | Tool Impl |
+---------------+     +------------------+     +------------------+     +-----------+
Tool Invocation Lifecycle
Below is the execution pipeline for a single tool call:
[Agent Intent]
│
├──► (A) Tool Discovery ──► MCP listTools() → Schema + version negotiation
│
├──► (B) Permission Check ──► Evaluate scope grants against tool requirements
│
├──► (C) Input Validation ──► JSON Schema validation of parameters
│
├──► (D) Idempotency Gate ──► Check dedup store for existing result
│
├──► (E) Execution ──► Call tool with timeout $T_{max}$ and circuit breaker
│
└──► (F) Result Processing ──► Validate output schema → Route to pipeline or agent
Workflow Steps
Phase 1: Tool Discovery & Registration
- Initialize MCP Connection: Establish transport (stdio pipe, HTTP+SSE, or streamable HTTP) to the MCP server.
- Negotiate Capabilities: Exchange `initialize` messages to determine supported protocol version and server capabilities.
- List Available Tools: Call `tools/list` to retrieve the tool manifest with schemas and version metadata.
- Cache Tool Registry: Store tool definitions locally with TTL-based invalidation for repeated lookups.
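The TTL-based registry cache from the last step might look like the sketch below. `ToolRegistryCache` and its `fetch` callable are hypothetical; `fetch` stands in for whatever wraps the `tools/list` call, and the 300-second default is only an assumed starting point.

```python
import time
from typing import Callable, Dict, Optional


class ToolRegistryCache:
    """Caches a tool manifest and refetches it once the TTL has elapsed."""

    def __init__(self, fetch: Callable[[], Dict[str, dict]], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch      # hypothetical wrapper around a tools/list call
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so tests can control time
        self._tools: Dict[str, dict] = {}
        self._fetched_at: Optional[float] = None

    def _is_stale(self) -> bool:
        return self._fetched_at is None or self._clock() - self._fetched_at >= self._ttl

    def get(self, name: str) -> Optional[dict]:
        if self._is_stale():
            self._tools = self._fetch()
            self._fetched_at = self._clock()
        return self._tools.get(name)

    def invalidate(self) -> None:
        """Force the next lookup to refetch, e.g. after the server reports a change."""
        self._fetched_at = None
```

A monotonic clock is used so that wall-clock adjustments cannot make a stale registry look fresh.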
Phase 2: Schema Validation & Contract Enforcement
- Parse Input Schema: Extract the JSON Schema `inputSchema` from the tool definition and compile it into a validator.
- Validate Agent Parameters: Run the agent's proposed parameters through the compiled schema validator.
- Enforce Required Fields: Reject calls missing required parameters before they reach the network layer.
- Type Coercion Guards: Ensure numeric strings are not silently coerced; fail loudly on type mismatches.
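To make the required-field and no-coercion rules concrete, here is a deliberately small, standard-library-only checker. It covers only `required`, top-level `properties` types, and `additionalProperties: false`, so it is an illustration rather than a JSON Schema implementation; `check_params` is a hypothetical helper, and a real deployment would more likely use a full validator library.

```python
from typing import Any, Dict, List

# Maps JSON Schema type names to Python checks. bool is excluded from the
# numeric types because isinstance(True, int) is True in Python.
_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
}


def check_params(schema: Dict[str, Any], params: Dict[str, Any]) -> List[str]:
    """Return a list of violations; an empty list means the call may proceed."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: {name}")
    for name, value in params.items():
        if name not in properties:
            if schema.get("additionalProperties") is False:
                errors.append(f"unexpected parameter: {name}")
            continue
        expected = properties[name].get("type")
        check = _TYPE_CHECKS.get(expected)
        # No coercion: the string "4096" is rejected where an integer is required.
        if check is not None and not check(value):
            errors.append(f"{name}: expected {expected}, got {type(value).__name__}")
    return errors
```

For example, `check_params(schema, {"max_bytes": "4096"})` reports the numeric string as a type mismatch instead of converting it.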
Phase 3: Permission Evaluation
- Extract Required Scopes: Parse the tool's permission requirements from the manifest metadata.
- Evaluate Agent Grants: Check the agent's current permission set against required scopes using glob matching.
- Time-Bound Validation: Verify that permission grants have not expired and refresh tokens if needed.
- Audit Log Entry: Record the permission decision (grant/deny) with timestamp, agent ID, and tool name.
Phase 4: Idempotent Execution
- Generate Idempotency Key: Derive key from `agent_id + task_id + step_index + param_hash`.
- Deduplication Lookup: Check the idempotency store for an existing result matching the key.
- Execute with Timeout: Invoke the tool with a hard timeout ($T_{max}$) and circuit breaker pattern.
- Store Result Atomically: Write the result to the dedup store keyed by the idempotency key with a TTL.
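One way to combine the lookup, execution, and store steps is to claim the key with a pending marker before the tool runs. The sketch below assumes an in-process dict guarded by a lock; `WriteAheadExecutor` and `PendingExecutionError` are hypothetical names, TTL handling is omitted, and releasing the key on failure assumes the failed call left no partial side effects.

```python
import threading
from typing import Any, Callable, Dict


class PendingExecutionError(Exception):
    """Raised when a call with the same key is already in flight."""


class WriteAheadExecutor:
    def __init__(self):
        self._store: Dict[str, Dict[str, Any]] = {}
        self._lock = threading.Lock()

    def execute(self, key: str, tool: Callable[[], Any]) -> Any:
        with self._lock:
            entry = self._store.get(key)
            if entry is not None:
                if entry["status"] == "done":
                    return entry["result"]  # replay: no re-execution
                raise PendingExecutionError(key)
            # Write-ahead: claim the key before the side effect happens.
            self._store[key] = {"status": "pending"}
        try:
            result = tool()
        except Exception:
            with self._lock:
                self._store.pop(key, None)  # release so a retry can run
            raise
        with self._lock:
            self._store[key] = {"status": "done", "result": result}
        return result
```

A shared store such as a database or Redis would need the same claim-then-execute order, implemented with an atomic set-if-absent operation instead of a local lock.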
Phase 5: Error Handling & Recovery
- Classify Error Type: Distinguish transient errors (timeout, rate limit, 503) from permanent errors (400, 404, schema violation).
- Apply Retry Policy: For transient errors, retry with exponential backoff: $T_{wait} = T_{base} \cdot 2^{n} + \text{jitter}$.
- Circuit Breaker Evaluation: If failure rate exceeds threshold within time window, open the circuit and skip retries.
- Fallback Execution: Invoke the configured fallback tool or return a structured error to the agent planner.
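The classification and backoff steps can be sketched as follows. With $T_{base} = 0.5$ s, the waits for $n = 0, 1, 2$ are 0.5 s, 1 s, and 2 s, each plus jitter. The set of transient status codes beyond 503 and rate limiting (429), and the helper names, are assumptions made for this example.

```python
import random
import time
from typing import Callable, Optional

TRANSIENT_STATUS = {408, 429, 502, 503, 504}  # assumed classification


def is_transient(status: int) -> bool:
    return status in TRANSIENT_STATUS


def backoff_delay(attempt: int, base: float = 0.5, max_jitter: float = 0.1,
                  rng: Optional[random.Random] = None) -> float:
    """T_wait = T_base * 2**n + jitter, with jitter drawn from [0, max_jitter]."""
    source = rng if rng is not None else random
    return base * (2 ** attempt) + source.uniform(0.0, max_jitter)


def call_with_retry(call: Callable[[], int], max_attempts: int = 3,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """`call` returns an HTTP-like status; only transient failures are retried."""
    status = call()
    for attempt in range(max_attempts - 1):
        if status < 400 or not is_transient(status):
            break  # success or permanent error: retrying cannot help
        sleep(backoff_delay(attempt))
        status = call()
    return status
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays.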
Phase 6: Pipeline Chaining & Result Propagation
- Evaluate Pipeline Graph: Check if the completed tool is part of a multi-step pipeline DAG.
- Transform Intermediate Results: Apply output-to-input adapters between pipeline stages using JSONPath mappings.
- Propagate Errors Downstream: If a pipeline stage fails, evaluate whether downstream stages can proceed or must abort.
- Compile Final Result: Aggregate results from all pipeline stages into a single structured response for the agent.
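A linear pipeline with output-to-input mappings could be sketched like this. The `resolve` helper handles only dotted paths such as `$.read.content`, a small subset of JSONPath chosen for brevity; `run_pipeline` and the step tuple layout are hypothetical, and a DAG executor would add dependency ordering on top.

```python
from typing import Any, Callable, Dict, List, Tuple

# (step name, tool callable, mapping of keyword argument -> path into prior results)
Step = Tuple[str, Callable[..., Dict[str, Any]], Dict[str, str]]


def resolve(path: str, data: Dict[str, Any]) -> Any:
    """Resolve a dotted path such as '$.read.content' against prior results."""
    trimmed = path[2:] if path.startswith("$.") else path
    current: Any = data
    for part in trimmed.split("."):
        current = current[part]  # KeyError surfaces a missing upstream field
    return current


def run_pipeline(steps: List[Step]) -> Dict[str, Dict[str, Any]]:
    """Run steps in order; each mapping wires prior outputs into keyword arguments."""
    results: Dict[str, Dict[str, Any]] = {}
    for name, tool, mapping in steps:
        try:
            kwargs = {arg: resolve(path, results) for arg, path in mapping.items()}
        except KeyError as missing:
            raise ValueError(f"step {name!r}: upstream output lacks {missing}") from None
        results[name] = tool(**kwargs)
    return results
```

Failing fast on a missing upstream field, instead of passing `None` along, keeps a downstream stage from waiting on input that will never arrive.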
Extended Troubleshooting Guide
When implementing tool orchestration, you may encounter the following failure modes:
| Symptom | Primary Cause | Mitigation Action |
|---|---|---|
| MCP handshake timeout | Server process not spawned or stdio pipe broken. | Verify server binary path and add a 5-second connection timeout with retry. |
| Schema validation rejects valid input | Tool schema uses additionalProperties: false but agent sends extra fields. | Strip unknown fields before validation or update schema to allow extensions. |
| Duplicate side effects (double-write) | Idempotency key not stored before tool execution completes. | Use write-ahead logging: store the key with pending status before execution. |
| Permission denied on previously allowed tool | Time-bounded permission grant expired mid-pipeline. | Implement grant refresh with 60-second pre-expiry buffer and re-evaluate. |
| Tool returns 200 but empty result | Tool version mismatch: response schema changed between versions. | Pin tool version in agent config and add response schema validation. |
| Pipeline hangs on intermediate step | Downstream tool waiting for input that upstream tool omitted from output. | Add mandatory output field checks between pipeline stages with clear errors. |
| Rate limit errors cascade across tools | Shared API key used by multiple tool instances without coordination. | Implement a global rate limiter with token bucket algorithm per API key. |
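For the shared rate limit row above, a token bucket per API key is one common mitigation. The sketch below is single-threaded and in-process; `TokenBucket` is a hypothetical class, and coordinating several processes would require a shared store, which is not shown.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained rate of `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._clock = clock
        self._tokens = capacity
        self._updated = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + (now - self._updated) * self.rate)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Every tool instance that uses the same API key would share one bucket, for example through a dictionary keyed by API key, so that their combined request rate stays under the provider's limit.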
Complete Execution Scenario
Below is a multi-tool pipeline scenario where an agent reads a file, transforms its contents, and writes the result:
[Agent Planner] ──► "Read config.yaml, extract DB settings, write connection string"
│
[Step 1] ──► tools/call: file_read(path="config.yaml")
│ ├── Permission check: file:read:/workspace/** ──► GRANTED
│ ├── Idempotency: idem_ag1_t5_s1_abc123 ──► NOT FOUND ──► EXECUTE
│ └── Result: { content: "db:\n host: pg.local\n port: 5432" }
│
[Step 2] ──► tools/call: yaml_parse(input=step1.result.content, path="db")
│ ├── Input validation: schema check ──► PASS
│ ├── Idempotency: idem_ag1_t5_s2_def456 ──► NOT FOUND ──► EXECUTE
│ └── Result: { host: "pg.local", port: 5432 }
│
[Step 3] ──► tools/call: file_write(path="conn.txt", content="postgresql://pg.local:5432")
│ ├── Permission check: file:write:/workspace/** ──► GRANTED
│ ├── Idempotency: idem_ag1_t5_s3_ghi789 ──► NOT FOUND ──► EXECUTE
│ └── Result: { bytes_written: 26, status: "created" }
│
[Pipeline Complete] ──► Return aggregated result to agent
Rules and Guidelines
- Rule 1: Every tool call must include an idempotency key. Calls without keys must be rejected at the orchestration layer.
- Rule 2: Tool schemas are the single source of truth. Never infer parameter types from example values; always validate against the published JSON Schema.
- Rule 3: Permission checks must occur before input validation. An agent should not learn about a tool's parameter structure if it lacks permission to invoke it.
- Rule 4: Pipeline failures must be atomic at the declared transaction boundary. If a pipeline step fails and rollback is configured, all prior steps must be compensated.
- Rule 5: Tool version negotiation happens once per session during MCP initialization. Mid-session version changes require a full re-initialization handshake.
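Rule 4's compensation requirement can be illustrated with a small saga-style runner. `run_with_compensation` is a hypothetical helper; it assumes every step supplies its own undo action and that undo actions themselves do not fail.

```python
from typing import Callable, List, Tuple

Action = Callable[[], None]


def run_with_compensation(steps: List[Tuple[Action, Action]]) -> None:
    """Each step is (do, undo). On failure, undo completed steps in reverse order."""
    completed: List[Action] = []
    for do, undo in steps:
        try:
            do()
        except Exception:
            for compensate in reversed(completed):
                compensate()
            raise  # surface the original failure after rolling back
        completed.append(undo)
```

The failing step's own undo is not invoked, since that step never completed; only the steps before it are compensated.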
Reference Guides
Below are links to the reference guides detailing the protocols, schemas, algorithms, and implementations used in this tool orchestration framework:
- mcp-protocol-patterns.md: Covers Model Context Protocol lifecycle, transport negotiation (stdio, SSE, HTTP), capability exchange, and server/client implementation patterns.
- tool-schema-definitions.md: Defines strict tool contract schemas using JSON Schema, input/output validation, type systems, and schema composition patterns.
- idempotency-patterns.md: Implements retry-safe tool invocations using idempotency keys, deduplication stores, write-ahead logging, and at-most-once execution guarantees.
- tool-discovery-routing.md: Details how agents discover tools dynamically via MCP, maintain tool registries, and route calls based on capability matching.
- tool-permission-models.md: Specifies permission and authorization models for tool access including scope-based grants, time-bounded tokens, and audit logging.
- tool-chaining-pipelines.md: Describes multi-tool pipeline orchestration with DAG execution, intermediate result transformation, and transactional boundaries.
- tool-error-handling.md: Covers error classification, retry policies, circuit breaker patterns, fallback strategies, and structured error propagation.
- tool-versioning-compatibility.md: Manages tool version evolution using semantic versioning, compatibility matrices, migration adapters, and deprecation policies.
Handoff
For projects requiring context window optimization when passing tool results, hand off to context-engineering. For enforcing architectural constraints on tool implementations, hand off to architectural-constraints. For prompt design that structures tool call instructions, hand off to prompt-engineering.
Implementation Patterns
MCP Client Implementation
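import asyncio
import itertools
import json
import shlex
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class MCPTool:
    name: str
    version: str
    description: str
    input_schema: Dict[str, Any]
    output_schema: Dict[str, Any]


class MCPClient:
    def __init__(self, transport: str = "stdio", endpoint: Optional[str] = None):
        self.transport = transport
        self.endpoint = endpoint
        self.tools: Dict[str, MCPTool] = {}
        self.capabilities: Dict[str, Any] = {}
        self.process: Optional[asyncio.subprocess.Process] = None
        # JSON-RPC needs a distinct id per request; id(self) never changes.
        self._request_ids = itertools.count(1)

    async def initialize(self) -> Dict[str, Any]:
        if self.transport == "stdio":
            # shlex.split keeps quoted arguments in the server command intact.
            command = shlex.split(self.endpoint) if self.endpoint else ["python", "-m", "mcp_server"]
            self.process = await asyncio.create_subprocess_exec(
                *command,
                stdin=asyncio.subprocess.PIPE,
                stdout=asyncio.subprocess.PIPE,
                stderr=asyncio.subprocess.PIPE,
            )
        result = await self._send(self._build_request("initialize", {
            # MCP protocol revisions are date strings; use one the server supports.
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "agent-tool-orchestrator", "version": "2.0.0"},
        }))
        self.capabilities = result.get("capabilities", {})
        # The MCP lifecycle expects this notification before normal requests.
        await self._notify("notifications/initialized")
        return result

    async def list_tools(self) -> Dict[str, MCPTool]:
        result = await self._send(self._build_request("tools/list", {}))
        self.tools = {
            t["name"]: MCPTool(
                name=t["name"],
                # "version" is this skill's manifest extension; default when absent.
                version=t.get("version", "1.0.0"),
                description=t.get("description", ""),
                input_schema=t.get("inputSchema", {}),
                output_schema=t.get("outputSchema", {}),
            )
            for t in result.get("tools", [])
        }
        return self.tools

    async def call_tool(self, name: str, params: Dict[str, Any], idempotency_key: str) -> Dict[str, Any]:
        self._validate_params(name, params)
        return await self._send(self._build_request("tools/call", {
            "name": name,
            "arguments": params,
            # _meta is MCP's reserved metadata field; idempotencyKey is this skill's convention.
            "_meta": {"idempotencyKey": idempotency_key},
        }))

    def _validate_params(self, tool_name: str, params: Dict[str, Any]) -> None:
        # Cheap pre-flight check only; full JSON Schema validation is a separate step.
        if tool_name not in self.tools:
            raise ValueError(f"Unknown tool: {tool_name}")
        required = self.tools[tool_name].input_schema.get("required", [])
        for field in required:
            if field not in params:
                raise ValueError(f"Missing required parameter: {field}")

    def _build_request(self, method: str, params: Dict[str, Any]) -> str:
        return json.dumps({
            "jsonrpc": "2.0",
            "id": next(self._request_ids),
            "method": method,
            "params": params,
        })

    async def _notify(self, method: str) -> None:
        # Notifications carry no id and receive no response.
        if self.transport == "stdio" and self.process is not None:
            payload = json.dumps({"jsonrpc": "2.0", "method": method})
            self.process.stdin.write((payload + "\n").encode())
            await self.process.stdin.drain()

    async def _send(self, msg: str) -> Dict[str, Any]:
        if self.transport != "stdio" or self.process is None:
            # Fail loudly instead of returning an empty dict that looks like success.
            raise NotImplementedError(f"Transport unavailable: {self.transport}")
        self.process.stdin.write((msg + "\n").encode())
        await self.process.stdin.drain()
        line = await asyncio.wait_for(self.process.stdout.readline(), timeout=30)
        response = json.loads(line)
        if "error" in response:
            raise RuntimeError(f"MCP error: {response['error']}")
        # JSON-RPC wraps the payload in "result".
        return response.get("result", {})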
Idempotency Manager
import hashlib
import json
import time
from typing import Any, Callable, Dict, Optional


class IdempotencyManager:
    def __init__(self, store: Optional[Dict[str, Dict[str, Any]]] = None, ttl: int = 86400):
        # "store or {}" would silently replace an empty dict supplied by the caller.
        self.store = store if store is not None else {}
        self.ttl = ttl  # seconds

    def generate_key(self, agent_id: str, task_id: str, step_index: int, params: Dict[str, Any]) -> str:
        # sort_keys makes the hash independent of parameter ordering.
        param_hash = hashlib.sha256(
            json.dumps(params, sort_keys=True).encode()
        ).hexdigest()[:12]
        return f"idem_{agent_id}_{task_id}_s{step_index}_{param_hash}"

    def get_result(self, key: str) -> Optional[Dict[str, Any]]:
        entry = self.store.get(key)
        if entry is None:
            return None
        if time.time() - entry["timestamp"] > self.ttl:
            del self.store[key]  # lazily evict expired entries
            return None
        return entry["result"]

    def store_result(self, key: str, result: Dict[str, Any]) -> None:
        self.store[key] = {
            "result": result,
            "timestamp": time.time(),
        }

    def cleanup_expired(self) -> None:
        now = time.time()
        expired = [k for k, v in self.store.items() if now - v["timestamp"] > self.ttl]
        for k in expired:
            del self.store[k]


class SchemaValidator:
    def __init__(self):
        # Compiled validators keyed by canonical schema JSON.
        self.validators: Dict[str, Any] = {}

    def compile_schema(self, schema: Dict[str, Any]) -> Callable[[Any], None]:
        import jsonschema  # third-party dependency

        cache_key = json.dumps(schema, sort_keys=True)
        if cache_key not in self.validators:
            validator_cls = jsonschema.validators.validator_for(schema)
            validator_cls.check_schema(schema)  # reject malformed schemas early
            self.validators[cache_key] = validator_cls(schema)
        return self.validators[cache_key].validate

    def validate(self, data: Any, schema: Dict[str, Any]) -> Dict[str, Any]:
        import jsonschema

        try:
            self.compile_schema(schema)(data)
            return {"valid": True, "errors": []}
        except jsonschema.ValidationError as e:
            return {"valid": False, "errors": [str(e)]}
Permission Evaluator
import fnmatch
import time
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class PermissionGrant:
    agent_id: str
    scope: str
    granted_at: float
    expires_at: float


class PermissionEvaluator:
    def __init__(self):
        self.grants: Dict[str, List[PermissionGrant]] = {}

    def grant_permission(self, agent_id: str, scope: str, duration_sec: int = 3600) -> None:
        now = time.time()
        grant = PermissionGrant(
            agent_id=agent_id,
            scope=scope,
            granted_at=now,
            expires_at=now + duration_sec,
        )
        self.grants.setdefault(agent_id, []).append(grant)

    def check_permission(self, agent_id: str, required_scope: str) -> Dict[str, Any]:
        now = time.time()
        agent_grants = self.grants.get(agent_id, [])
        active_grants = [g for g in agent_grants if g.expires_at > now]
        for grant in active_grants:
            # fnmatchcase is case-sensitive on every platform. Its "*" also
            # matches "/", so "*" and "**" behave identically with this matcher.
            if fnmatch.fnmatchcase(required_scope, grant.scope):
                return {
                    "granted": True,
                    "matching_grant": grant.scope,
                    "expires_at": grant.expires_at,
                }
        return {
            "granted": False,
            "required_scope": required_scope,
            "active_grants": [g.scope for g in active_grants],
        }

    def revoke_expired(self) -> None:
        now = time.time()
        for agent_id in self.grants:
            self.grants[agent_id] = [g for g in self.grants[agent_id] if g.expires_at > now]
Circuit Breaker
import time
from enum import Enum
from typing import Any, Dict


class CircuitState(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, recovery_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self.state = CircuitState.CLOSED
        self.failure_count = 0
        self.last_failure_time = 0.0

    def record_success(self) -> None:
        self.failure_count = 0
        if self.state == CircuitState.HALF_OPEN:
            self.state = CircuitState.CLOSED

    def record_failure(self) -> None:
        self.failure_count += 1
        self.last_failure_time = time.time()
        # A failed probe in HALF_OPEN reopens the circuit immediately.
        if self.state == CircuitState.HALF_OPEN or self.failure_count >= self.failure_threshold:
            self.state = CircuitState.OPEN

    def allow_request(self) -> bool:
        if self.state == CircuitState.CLOSED:
            return True
        if self.state == CircuitState.OPEN:
            # After the recovery timeout, let a probe request through.
            if time.time() - self.last_failure_time >= self.recovery_timeout:
                self.state = CircuitState.HALF_OPEN
                return True
            return False
        return True  # HALF_OPEN: probe requests are allowed

    def get_state(self) -> Dict[str, Any]:
        remaining = 0.0
        if self.state == CircuitState.OPEN:
            # Only an open circuit has a recovery countdown.
            remaining = max(0.0, self.recovery_timeout - (time.time() - self.last_failure_time))
        return {
            "state": self.state.value,
            "failures": self.failure_count,
            "threshold": self.failure_threshold,
            "remaining_recovery": remaining,
        }
Architecture Decision Trees
Tool Communication Transport Selection
What's the deployment context?
├── Local agent (same machine)
│ ├── Low latency needed → stdio transport (pipe)
│ ├── Multiple concurrent tools → SSE transport
│ └── Simple tool set → stdio (easiest to debug)
│
├── Remote agent (different machine)
│ ├── Within same network → SSE over HTTP
│ ├── Across networks → Streamable HTTP with auth
│ └── High throughput needed → gRPC streaming
│
└── Hybrid (some local, some remote)
├── Local tools via stdio, remote via SSE
└── Unified routing layer with transport abstraction
Error Handling Strategy Selection
What type of tool failure?
├── Transient (timeout, rate limit, service unavailable)
│ ├── Retry with backoff → Up to N=3 attempts
│ ├── Circuit breaker opens after N failures
│ └── Fallback to cached result if available
│
├── Client error (400, 404, validation error)
│ ├── Is input fixable? → Auto-correct + retry once
│ └── Not fixable → Return structured error to agent
│
├── Auth error (401, 403)
│ ├── Token expired? → Refresh + retry once
│ └── Insufficient scope → Return PermissionDenied
│
└── Server error (500+)
├── Retry with backoff (up to 3)
└── All retries exhausted → Return 503 equivalent to agent
Production Considerations
- Tool discovery caching: Cache tool registry responses with a 5-minute TTL. Full re-discovery on every tool call adds 50-500ms latency per call.
- Idempotency store sizing: Monitor idempotency store growth. Set TTL based on maximum expected retry window (typically 24-48 hours). Use Redis with eviction policy for automatic pruning.
- Permission audit frequency: Batch permission audit log writes (every 100 calls or 5 seconds) to reduce storage I/O. Store in append-only format for compliance.
- Tool health probes: Implement tool health check endpoints (/health, /ready) separate from tool invocation. Poll every 30 seconds for availability monitoring.
Security Considerations
- Tool parameter injection: Never pass agent-generated parameters directly to shell commands or eval() functions without strict schema validation and sanitization.
- Idempotency key reuse attack: An attacker with access to idempotency keys could replay operations. Include agent authentication in key derivation.
- Permission scope glob expansion: Use `**` in scope patterns cautiously. A scope of `file:read:/workspace/**` allows reading all files, while `file:read:/workspace/*` only top-level.
- MCP transport authentication: stdio transport inherits the parent process permissions. SSE/HTTP transports must authenticate every request with bearer tokens.
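The `*` versus `**` distinction only holds if the matcher enforces it. Python's `fnmatch` lets `*` match `/` as well, so a matcher along the following lines is one way to get segment-aware behavior. `scope_to_regex` and `scope_allows` are hypothetical helpers, and the sketch assumes paths inside the scope have already been normalized (no `..` segments).

```python
import re


def scope_to_regex(pattern: str) -> re.Pattern:
    """Translate a scope glob: '**' crosses '/' boundaries, '*' does not."""
    regex = ""
    # The capture group keeps the wildcards in the split output; '**' is tried first.
    for part in re.split(r"(\*\*|\*)", pattern):
        if part == "**":
            regex += ".*"
        elif part == "*":
            regex += "[^/]*"
        else:
            regex += re.escape(part)
    return re.compile(regex)


def scope_allows(grant: str, required: str) -> bool:
    """True when the granted scope pattern covers the whole required scope."""
    return scope_to_regex(grant).fullmatch(required) is not None
```

With this matcher, `file:read:/workspace/*` covers `file:read:/workspace/config.yaml` but not `file:read:/workspace/src/config.yaml`, while `file:read:/workspace/**` covers both.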
Anti-Patterns
| Anti-Pattern | Why It Fails | Correct Approach |
|---|---|---|
| Storing idempotency key after execution completes | Race condition allows duplicate execution on retry | Use write-ahead log: mark as pending before execution |
| Schema validation before permission check | Agent learns parameter structure of unauthorized tools | Check permissions first, deny before revealing parameter schemas |
| Hardcoding timeout values per tool | Different operations need different timeouts | Configure per-tool timeout in tool manifest |
| Ignoring tool output schema validation | Malformed tool outputs crash downstream processing | Validate output against schema before returning to agent |
| Single circuit breaker for all tools | One failing tool blocks access to every other tool | Per-tool or per-category circuit breakers |
| Not version-pinning MCP protocol | Protocol changes between updates break communication | Negotiate protocol version at init, fail on mismatch |
| Exposing agent ID in MCP responses | Leaks internal agent identity for profiling | Use opaque session IDs for external communication |
Performance Optimization
- Idempotency store batching: Batch idempotency store writes (flush every 50ms or 100 keys) to reduce write overhead on high-throughput systems.
- Parallel tool discovery: If multiple MCP servers are configured, discover tools from all servers in parallel using asyncio.gather.
- Schema compilation caching: Pre-compile JSON Schemas into validator functions on tool discovery. Avoid re-compiling on every invocation.
- Transport connection pooling: For HTTP-based transports, reuse connection pools and keep-alive to avoid TCP handshake overhead on each tool call.
- Result caching for read-only tools: Cache results of deterministic read-only tools (file_read, config_get) with TTL. Avoid re-execution when identical parameters are provided.
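Parallel discovery with `asyncio.gather` might be sketched as below. The `servers` mapping of server name to an async discovery callable, and the `server/tool` key format, are assumptions made for this example.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Discover = Callable[[], Awaitable[Dict[str, dict]]]


async def discover_all(servers: Dict[str, Discover]) -> Dict[str, dict]:
    """Query every server concurrently and merge tools under 'server/tool' keys."""
    names = list(servers)
    results = await asyncio.gather(
        *(servers[name]() for name in names), return_exceptions=True
    )
    merged: Dict[str, dict] = {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            continue  # one unreachable server must not block the rest
        for tool_name, tool in result.items():
            merged[f"{name}/{tool_name}"] = tool
    return merged
```

Using `return_exceptions=True` means a single failing server yields a partial registry instead of aborting discovery; a production version would likely log the skipped server.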