name: 3s-filter
description: Real-time safety monitor for AI agents using SENTINEL, SENTIMENT, and SEMANTIC analysis. Use this skill when you need to evaluate whether an agent's proposed action or a user's input is safe, anomalous, or violates policies. It provides a decision (ALLOW/FLAG/BLOCK) and a detailed risk report.
3S Filter
The 3S Filter provides a triple-layer safety evaluation for agent interactions.
Core Capabilities
- SENTINEL: Detects behavioral anomalies and out-of-distribution outputs.
- SENTIMENT: Analyzes emotional valence and arousal to catch frustration or malice.
- SEMANTIC: Verifies intent coherence and checks against safety policies and dangerous patterns.
When to Use
- Before executing any shell command or system-modifying action generated by an agent.
- To validate user inputs for prompt injection or social engineering.
- To monitor long-running agent sessions for behavioral drift.
How to Use
1. Direct Evaluation
Use the evaluate.py script to check a string.
python3 scripts/evaluate.py "Your agent output here"
2. Programmatic Integration
You can use the ThreeSFilter class directly in your scripts:
from three_s_filter import ThreeSFilter
engine = ThreeSFilter()
decision, report = engine.evaluate("some text")
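One way to use the returned pair is to gate an agent-generated command on it. The sketch below is illustrative only: `run_if_allowed` is a hypothetical helper, not part of the package, and it assumes the decision is the string "ALLOW" or stringifies to it. The evaluator is passed in as a callable so that `ThreeSFilter().evaluate` (or a stub) can be supplied.

```python
import shlex
import subprocess
from typing import Callable, Tuple

# Assumption: an evaluator takes text and returns (decision, report),
# mirroring the shape of engine.evaluate("some text") shown above.
Evaluator = Callable[[str], Tuple[object, object]]


def run_if_allowed(command: str, evaluate: Evaluator) -> Tuple[object, object, bool]:
    """Evaluate a command string and run it only when the decision is ALLOW.

    Returns (decision, report, executed).
    """
    decision, report = evaluate(command)
    # Assumption: the decision is, or stringifies to, "ALLOW".
    if str(decision) != "ALLOW":
        return decision, report, False
    # shlex.split plus a list argument avoids shell=True, so the text is
    # not re-interpreted by a shell after it has been evaluated.
    subprocess.run(shlex.split(command), check=True)
    return decision, report, True
```

With the real engine this might be called as `run_if_allowed(cmd, engine.evaluate)`. Anything other than ALLOW, including FLAG, leaves the command unexecuted in this sketch; a deployment may prefer to let flagged commands through after logging.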
3. Middleware Hooks
Install the middleware to automatically monitor calls made through the requests or openai libraries.
from three_s_filter.middleware import APIFilterMiddleware
middleware = APIFilterMiddleware()
middleware.wrap_requests()
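The internals of APIFilterMiddleware are not described here. As a generic illustration of what a wrapping hook does, the sketch below checks a text argument before the wrapped function runs. The `guarded` decorator is hypothetical and is not the package's implementation; it assumes an evaluator that returns a (decision, report) pair, with the decision equal to or stringifying to "BLOCK" when the call should be refused.

```python
import functools
from typing import Callable, Tuple

# Assumption: same (decision, report) shape as ThreeSFilter.evaluate.
Evaluator = Callable[[str], Tuple[object, object]]


def guarded(evaluate: Evaluator):
    """Decorator factory: evaluate the first argument before the call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(text, *args, **kwargs):
            decision, report = evaluate(text)
            # Refuse the call outright on BLOCK; other decisions pass through.
            if str(decision) == "BLOCK":
                raise PermissionError(f"Blocked before call: {report}")
            return func(text, *args, **kwargs)
        return wrapper
    return decorator
```

A function that sends a prompt could then be decorated with `@guarded(engine.evaluate)`, so every call is checked without changing the call sites.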
Reference Patterns
- Blocking: If the decision is BLOCK, stop execution immediately and alert the user.
- Flagging: If the decision is FLAG, proceed with caution and log the interaction for human review.
- Allowing: If the decision is ALLOW, the action is considered safe within the current policy limits.
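The three patterns above can be sketched as a single dispatch function. This is a minimal example under assumptions: `apply_decision`, `BlockedActionError`, and the logger name are hypothetical, and the decision is assumed to be, or stringify to, one of "ALLOW", "FLAG", or "BLOCK".

```python
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("three_s_filter.example")


class BlockedActionError(RuntimeError):
    """Raised on BLOCK (hypothetical helper, not part of the package)."""


def apply_decision(decision, report) -> bool:
    """Return True if the caller may proceed; raise on BLOCK."""
    name = str(decision)
    if name == "BLOCK":
        # Stop immediately; the caller is expected to surface this to the user.
        raise BlockedActionError(f"Blocked by 3S Filter: {report}")
    if name == "FLAG":
        # Proceed, but leave a record for human review.
        logger.warning("Flagged for human review: %s", report)
        return True
    if name == "ALLOW":
        return True
    # Unknown values are refused so that failures are closed, not open.
    raise BlockedActionError(f"Unrecognised decision {decision!r}")
```

Treating an unrecognised decision as a block is a design choice of this sketch, not documented behaviour of the filter.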
Files and Resources
- scripts/evaluate.py: CLI wrapper for quick safety checks.
- scripts/install_hooks.sh: Script to set up git/shell hooks for automatic monitoring.
- references/policy.md: Detailed breakdown of the safety policies being enforced.