3s-filter - SKILL.md Agent Skill

name: 3s-filter description: Real-time safety monitor for AI agents using SENTINEL, SENTIMENT, and SEMANTIC analysis. Use this skill when you need to evaluate whether an agent's proposed action or a user's input is safe, anomalous, or violates policies. It provides a decision (ALLOW/FLAG/BLOCK) and a detailed risk report.

The 3S Filter provides a triple-layer safety evaluation for agent interactions.

SENTINEL: Detects behavioral anomalies and out-of-distribution outputs.
SENTIMENT: Analyzes emotional valence and arousal to catch frustration or malice.
SEMANTIC: Verifies intent coherence and checks against safety policies and dangerous patterns.

Before executing any shell command or system-modifying action generated by an agent.
To validate user inputs for prompt injection or social engineering.
To monitor long-running agent sessions for behavioral drift.

Use the evaluate.py script to check a string.

python3 scripts/evaluate.py "Your agent output here"

You can use the ThreeSFilter class directly in your scripts:

from three_s_filter import ThreeSFilter
engine = ThreeSFilter()
decision, report = engine.evaluate("some text")

Install the middleware to automatically monitor requests or openai calls.

from three_s_filter.middleware import APIFilterMiddleware
middleware = APIFilterMiddleware()
middleware.wrap_requests()

Blocking: If the decision is BLOCK, stop execution immediately and alert the user.
Flagging: If the decision is FLAG, proceed with caution and log the interaction for human review.
Allowing: If the decision is ALLOW, the action is considered safe within the current policy limits.

scripts/evaluate.py: CLI wrapper for quick safety checks.
scripts/install_hooks.sh: Script to set up git/shell hooks for automatic monitoring.
references/policy.md: Detailed breakdown of the safety policies being enforced.