name: mcp-security-governance description: security protocols for MCP agents. Use this to prevent Dynamic Capability Injection, Tool Shadowing, and Confused Deputy attacks when connecting to external servers.
MCP Security & Governance
Goal
Secure the agent-tool interface by implementing defensive patterns against malicious servers, prompt injection, and unauthorized data access.
Core Risks
1. Dynamic Capability Injection
- Threat: A server dynamically changes its tool list at runtime, adding dangerous capabilities (e.g., "buy_book") to a previously benign server.
- Defense:
- Pinning: Pin tool definitions to a specific version or hash.
- Allowlisting: Maintain an explicit client-side allowlist of permitted tools and servers. Disconnect if a server broadcasts unapproved changes.
2. Tool Shadowing
- Threat: A malicious server defines a tool with the same name or trigger as a legitimate tool (e.g.,
save_secure_note), tricking the agent into sending sensitive data to the attacker. - Defense:
- Naming Collision Checks: Reject connection if a server advertises a tool name that already exists in the trusted set.
- Policy Enforcement: Use an API Gateway to filter tool lists before they reach the agent context.
3. The Confused Deputy
- Threat: The user asks the agent to perform an action (e.g., "Summarize this file"), but the agent uses its high-privilege credentials to access a restricted file that the user shouldn't see.
- Defense:
- Scoped Credentials: Do not give the agent "root" access. Use credentials scoped to the specific user.
- Human-in-the-Loop: Require explicit user confirmation for high-risk actions (e.g., file deletion, payments).
Defensive Implementation
Input/Output Sanitization
- Inbound: Sanitize tool descriptions before they enter the context window to prevent prompt injection attacks embedded in documentation.
- Outbound: Scrub tool outputs for PII (API keys, PII) before passing them back to the model or user history.
Taint Tracking
- Concept: Tag data from external/untrusted tools as "tainted."
- Rule: Tainted data cannot be used as input for sensitive tools (e.g.,
send_email) without explicit sanitization or user approval.