name: security
description: Run repository security scans for vulnerabilities, dependency risk, secrets, and release gates.
practices:
- supply-chain-integrity
- design-by-contract
- sre
hexagonal_role: driven-adapter
consumes:
- repo-context
produces:
- security-report.json
context_rel:
- kind: supplier-to
  with: validate
skill_api_version: 1
context:
  window: fork
  intent:
    mode: task
  sections:
    exclude:
    - HISTORY
  intel_scope: topic
metadata:
  tier: product
  dependencies: []
output_contract: 'stdout: security scan report'
Security Skill
Purpose: Run repeatable security checks across code, scripts, hooks, and release gates, plus composable binary/internal-testing primitives and offline repo-surface redteam for authorized targets.
Use this skill when you need deterministic security validation before merge/release, recurring scheduled checks, binary black-box assurance, or offline prompt-surface redteam.
This skill has two complementary surfaces:
- Repository security gate (scripts/security-gate.sh): fast/full/nightly scanner gates for code, scripts, hooks, and release readiness.
- Composable security suite (scripts/security_suite.py, scripts/prompt_redteam.py): testable, reusable primitives for authorized binaries and repo-managed prompt surfaces, with policy gating and machine-consumable outputs.
Quick Start
/security # quick security gate
/security --full # full gate with test-inclusive toolchain checks
/security --release # full gate for release readiness
/security --json # machine-readable report output
Guardrails (suite primitives)
- Use the binary/redteam primitives only on binaries you own or are explicitly authorized to assess.
- Do not use this workflow to bypass legal restrictions or extract third-party proprietary content without authorization.
- Prefer behavioral assurance and policy gating over ad-hoc one-off reverse-engineering.
Execution Contract (repository gate)
1) Pre-PR (fast)
Run quick gate:
scripts/security-gate.sh --mode quick
Expected behavior:
- Fails on high/critical findings from available scanners.
- Writes artifacts under $TMPDIR/agentops-security/<run-id>/.
2) Pre-Release (strict)
Run full gate:
scripts/security-gate.sh --mode full
Expected behavior:
- Full scanner pass before release workflow can continue.
- Artifacts retained for audit and incident response.
3) Nightly (continuous)
Nightly workflow should run:
scripts/security-gate.sh --mode full
Expected behavior:
- Detects drift/regressions outside active PR windows.
- Failing run creates actionable signal in workflow summary/issues.
Composable Security Suite
This surface separates concerns into primitives so security workflows stay testable and reusable.
Primitive Model
- collect-static: file metadata, runtime heuristics, linked libraries, embedded archive signatures.
- collect-dynamic: sandboxed execution trace (processes, file changes, network endpoints).
- collect-contract: machine-readable behavior contract from help-surface probing.
- compare-baseline: current vs baseline contract drift (added/removed commands, runtime change).
- enforce-policy: allowlist/denylist gates and severity-based verdict.
- collect-redteam: offline repo-surface attack-pack scan for prompt-injection, tool-misuse, secret-exfiltration, and unsafe-shell regressions.
- run: thin binary orchestrator that composes primitives and writes suite summary.
Suite Quick Start
Single run (default dynamic command is --help):
python3 skills/security/scripts/security_suite.py run \
--binary "$(command -v ao)" \
--out-dir .tmp/security-suite/ao-current
Baseline regression gate:
python3 skills/security/scripts/security_suite.py run \
--binary "$(command -v ao)" \
--out-dir .tmp/security-suite/ao-current \
--baseline-dir .tmp/security-suite/ao-baseline \
--fail-on-removed
Policy gate:
python3 skills/security/scripts/security_suite.py run \
--binary "$(command -v ao)" \
--out-dir .tmp/security-suite/ao-current \
--policy-file skills/security/references/policy-example.json \
--fail-on-policy-fail
Repo-surface redteam:
python3 skills/security/scripts/prompt_redteam.py scan \
--repo-root . \
--pack-file skills/security/references/agentops-redteam-pack.json \
--out-dir .tmp/security-suite-redteam
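The three `run` invocations above differ only in their optional flags, so a CI wrapper can assemble them from one function. The following is a minimal sketch, not part of the skill: `suite_run_argv` is a hypothetical helper name, and it uses only the script path and flags shown in the commands above.

```python
def suite_run_argv(binary, out_dir, baseline_dir=None, policy_file=None,
                   fail_on_removed=False, fail_on_policy_fail=False):
    """Build the argument list for a `security_suite.py run` invocation.

    Hypothetical helper: only flags that appear in the Suite Quick Start
    commands are emitted.
    """
    argv = [
        "python3", "skills/security/scripts/security_suite.py", "run",
        "--binary", binary,
        "--out-dir", out_dir,
    ]
    if baseline_dir:
        argv += ["--baseline-dir", baseline_dir]
    if fail_on_removed:
        argv.append("--fail-on-removed")
    if policy_file:
        argv += ["--policy-file", policy_file]
    if fail_on_policy_fail:
        argv.append("--fail-on-policy-fail")
    return argv
```

The returned list can be passed to `subprocess.run(argv)`, and the caller can treat a non-zero `returncode` as a blocked gate.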
For OWASP Top 10 code-level review, see references/owasp-checklist.md.
Recommended Suite Workflow
- Capture baseline on known-good release.
- Run suite on candidate binary in CI.
- Compare against baseline and enforce policy.
- Block promotion on failing verdict.
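The compare step in this workflow comes down to a set difference between two command lists. The sketch below illustrates the idea only: it assumes each contract can be reduced to a flat list of command names, which the source does not confirm, and `diff_commands` and `should_fail` are hypothetical names rather than functions of `security_suite.py`.

```python
def diff_commands(baseline, current):
    """Return commands added to and removed from the baseline."""
    base, cur = set(baseline), set(current)
    return {"added": sorted(cur - base), "removed": sorted(base - cur)}


def should_fail(diff, fail_on_removed=False):
    """Mirror the intent of --fail-on-removed: fail only on removals."""
    return fail_on_removed and bool(diff["removed"])
```

Added commands are reported but do not fail the gate in this sketch; only removals do, and only when the caller opts in.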
Suite Output Contract
All outputs are written under --out-dir:
- static/static-analysis.json
- dynamic/dynamic-analysis.json
- contract/contract.json
- compare/baseline-diff.json (when baseline supplied)
- policy/policy-verdict.json (when policy supplied)
- suite-summary.json
- redteam/redteam-results.json (when repo-surface redteam is run)
This output structure is intentionally machine-consumable for CI gates.
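A CI step can check that a run produced the files it should have before reading any of them. The sketch below is illustrative: `missing_outputs` is a hypothetical helper, the relative paths come from the output contract above, and it assumes the four unconditional files are always expected, which fits a `run` invocation rather than a standalone redteam scan.

```python
from pathlib import Path

# Relative paths taken from the Suite Output Contract.
ALWAYS = [
    "static/static-analysis.json",
    "dynamic/dynamic-analysis.json",
    "contract/contract.json",
    "suite-summary.json",
]
CONDITIONAL = {
    "baseline": "compare/baseline-diff.json",
    "policy": "policy/policy-verdict.json",
    "redteam": "redteam/redteam-results.json",
}


def missing_outputs(out_dir, baseline=False, policy=False, redteam=False):
    """Return the expected files that are absent under out_dir, sorted."""
    flags = {"baseline": baseline, "policy": policy, "redteam": redteam}
    expected = list(ALWAYS)
    expected += [path for key, path in CONDITIONAL.items() if flags[key]]
    root = Path(out_dir)
    return sorted(p for p in expected if not (root / p).is_file())
```

An empty result means every expected file exists; anything else names the missing outputs so the gate can fail with a specific message.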
Policy Model
Use skills/security/references/policy-example.json as a starting point. Policy gating produces a machine-readable policy-verdict.json.
Supported checks:
- required_top_level_commands
- deny_command_patterns
- max_created_files
- forbid_file_path_patterns
- allow_network_endpoint_patterns
- deny_network_endpoint_patterns
- block_if_removed_commands
- min_command_count
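To make the check names concrete, here is a minimal sketch of how three of them might be evaluated. It is not the suite's implementation: `evaluate_policy` is a hypothetical function, the policy is assumed to be a JSON object keyed by check name, the contract is assumed to reduce to a list of command names, and `deny_command_patterns` entries are treated as regular expressions.

```python
import re


def evaluate_policy(policy, commands):
    """Return a list of failure messages; an empty list means pass.

    Assumed shapes (not confirmed by the source): `policy` is a dict keyed
    by check name, `commands` is a list of top-level command names.
    """
    failures = []
    present = set(commands)
    for name in policy.get("required_top_level_commands", []):
        if name not in present:
            failures.append(f"missing required command: {name}")
    for pattern in policy.get("deny_command_patterns", []):
        for cmd in commands:
            if re.search(pattern, cmd):
                failures.append(f"denied command: {cmd} (pattern {pattern})")
    minimum = policy.get("min_command_count")
    if minimum is not None and len(commands) < minimum:
        failures.append(f"command count {len(commands)} below minimum {minimum}")
    return failures
```

Returning every failure instead of stopping at the first one keeps the verdict useful for triage, since one run then shows all violated checks.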
Redteam Pack Model
Use agentops-redteam-pack.json as the starting point for offline repo-surface redteam checks.
Supported target fields:
- globs
- require_groups
- forbidden_any
- applies_if_any
Each case expresses a concrete adversarial prompt or operator-bypass attempt and binds it to one or more repo-owned files. The first shipped pack covers instruction precedence, context overexposure, destructive git misuse, security gate bypass, and unsafe shell or secret-handling regressions.
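The source names the target fields but does not define their semantics, so the sketch below is one plausible reading and should be treated as an assumption throughout. It assumes `applies_if_any` is a list of regexes that gates whether the target applies, `forbidden_any` is a list of regexes that must not match, and `require_groups` is a list of regex groups where each group needs at least one match. `evaluate_target` is a hypothetical name, and file selection through `globs` is left out.

```python
import re


def evaluate_target(text, target):
    """Return 'skipped', 'resisted', or 'failed' for one file's text.

    All field semantics here are assumed, not taken from the real pack.
    """
    def any_match(patterns):
        return any(re.search(p, text) for p in patterns)

    gate = target.get("applies_if_any", [])
    if gate and not any_match(gate):
        return "skipped"  # target does not apply to this file
    if any_match(target.get("forbidden_any", [])):
        return "failed"  # forbidden language is present
    if all(any_match(group) for group in target.get("require_groups", [])):
        return "resisted"  # every required guardrail group is present
    return "failed"
```

Under this reading, a wording cleanup that rewrites guardrail language fails the case even though no control was removed, so patterns are best kept as loose as the guardrail allows.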
Technique Coverage
This suite is designed for broad binary classes, not just CLI metadata:
- static runtime/library fingerprinting
- sandboxed behavior observation
- command/contract capture
- drift classification
- policy enforcement and CI verdicting
- repo-surface redteam checks for prompt and operator-contract regressions
It is intentionally modular so you can add deeper primitives later (syscall tracing, SBOM attestation verification, fuzz harnesses) without rewriting the workflow.
Triage Guidance
When the repository gate fails:
- Open latest artifact in $TMPDIR/agentops-security/ and identify scanner + file.
- Classify severity (critical/high/medium).
- Fix immediately for critical/high or create tracked follow-up issue with owner.
- Re-run scripts/security-gate.sh until gate passes.
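The first triage step can be scripted by picking the newest run directory. This is a minimal sketch: `latest_run_dir` is a hypothetical helper, and it assumes each run writes to its own subdirectory of `$TMPDIR/agentops-security/` whose modification time reflects run order.

```python
import os
import tempfile
from pathlib import Path


def latest_run_dir(base=None):
    """Return the most recently modified run directory, or None if absent."""
    if base is None:
        # Fall back to the platform temp dir when TMPDIR is unset.
        tmp = os.environ.get("TMPDIR", tempfile.gettempdir())
        base = Path(tmp) / "agentops-security"
    base = Path(base)
    if not base.is_dir():
        return None
    runs = [p for p in base.iterdir() if p.is_dir()]
    return max(runs, key=lambda p: p.stat().st_mtime, default=None)
```

If run IDs turn out to sort chronologically by name, sorting on `p.name` would be more robust than modification time.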
Reporting Template
Security gate run: <run-id>
Mode: <quick|full>
Result: <pass|blocked>
Top findings:
- <scanner> <severity> <file> <summary>
Actions:
- <fix or issue id>
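The template is regular enough to fill in from structured data. The sketch below shows one way to do that: `render_report` is a hypothetical helper, and the finding keys (`scanner`, `severity`, `file`, `summary`) are assumed names that mirror the placeholders in the template, not a documented report schema.

```python
def render_report(run_id, mode, result, findings, actions):
    """Fill in the reporting template from a run's findings and actions."""
    lines = [
        f"Security gate run: {run_id}",
        f"Mode: {mode}",
        f"Result: {result}",
        "Top findings:",
    ]
    for item in findings:
        lines.append(
            f"- {item['scanner']} {item['severity']} {item['file']} {item['summary']}"
        )
    lines.append("Actions:")
    lines += [f"- {action}" for action in actions]
    return "\n".join(lines)
```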
Validation
Run the merged skill validator (asserts the suite scripts/references, gate, and redteam pack stay healthy):
bash skills/security/scripts/validate.sh
bash tests/scripts/test-security-suite-redteam.sh
Suite smoke test (recommended):
python3 skills/security/scripts/security_suite.py run \
--binary "$(command -v ao)" \
--out-dir .tmp/security-suite-smoke \
--policy-file skills/security/references/policy-example.json
Repo-surface smoke test:
python3 skills/security/scripts/prompt_redteam.py scan \
--repo-root . \
--pack-file skills/security/references/agentops-redteam-pack.json \
--out-dir .tmp/security-suite-redteam-smoke
Notes
- Use this as the canonical security runbook instead of ad-hoc scanner commands.
- Keep workflow wiring aligned with this contract in:
  - .github/workflows/validate.yml
  - .github/workflows/nightly.yml
  - .github/workflows/release.yml
- For binary/internal black-box assurance plus offline repo-surface redteam, use the composable suite above (security_suite.py and prompt_redteam.py).
- For dependency vulnerability and license scanning, use:
  - deps: audit dependency risks and updates, covering vulnerability scanning and license compliance (absorbed into this skill)
Examples
Scenario: Quick Security Gate Before Opening a PR
User says: /security
What happens:
- The skill runs scripts/security-gate.sh --mode quick, which executes available scanners (semgrep, gosec, gitleaks) against the current working tree and flags high/critical findings.
- Run deps vuln to scan for vulnerable dependencies (OWASP A06: Vulnerable and Outdated Components).
- Scan artifacts are written to $TMPDIR/agentops-security/<run-id>/ for review, and the gate reports a pass/blocked verdict.
Result: The gate passes with no high/critical findings, confirming the branch is safe to open a PR.
Scenario: Full Security Gate for a Release
User says: /security --release
What happens:
- The skill runs scripts/security-gate.sh --mode full, which performs a comprehensive scan including all scanner passes, test-inclusive toolchain checks, and stricter severity thresholds.
- Artifacts are retained under $TMPDIR/agentops-security/<run-id>/ for audit trail and incident response, and a structured report is generated.
Result: The full gate blocks the release on two medium-severity findings in cli/internal/config.go; the operator triages and fixes them before re-running the gate to get a clean pass.
Scenario: Capture a Baseline and Gate a New Release (suite)
User says: /security run --binary $(command -v ao) --out-dir .tmp/security-suite/ao-v2.4
What happens:
- The suite runs static analysis (file metadata, linked libraries, embedded archive signatures), dynamic tracing (sandboxed --help execution observing processes, file changes, network endpoints), and contract capture against the ao binary.
- It writes static/static-analysis.json, dynamic/dynamic-analysis.json, contract/contract.json, and suite-summary.json under the output directory.
Result: A complete baseline snapshot is captured for ao v2.4, ready to be used as --baseline-dir for future release comparisons.
Scenario: CI Regression Gate With Baseline and Policy (suite)
User says: /security run --binary ./bin/ao-candidate --out-dir .tmp/ao-candidate --baseline-dir .tmp/security-suite/ao-v2.4 --policy-file skills/security/references/policy-example.json --fail-on-removed --fail-on-policy-fail
What happens:
- The suite runs all three collection primitives on the candidate binary, then compares the resulting contract against the v2.4 baseline to produce compare/baseline-diff.json with any added, removed, or changed commands.
- It evaluates the policy file checks (required commands, denied patterns, network allowlists, file limits) and writes policy/policy-verdict.json with a pass/fail verdict.
Result: The suite exits non-zero if any commands were removed or a policy check failed, blocking the candidate from promotion in the CI pipeline.
Scenario: Offline Redteam the Repo's Prompt and Skill Surfaces (suite)
User says: /security collect-redteam --repo-root .
What happens:
- The redteam scanner loads the attack pack from agentops-redteam-pack.json and evaluates repo-owned control surfaces against concrete attack cases.
- It writes redteam/redteam-results.json and redteam/redteam-results.md under the chosen output directory, then exits non-zero if a fail-severity case is not resisted.
Result: The repo gets a deterministic redteam verdict for prompt-injection, tool misuse, context overexposure, secret-handling, and unsafe-shell regressions without needing hosted model scanning.
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
| Gate reports "scanner not found" and skips checks | Required scanner (semgrep, gosec, or gitleaks) is not installed | Install the missing scanner: brew install semgrep, go install github.com/securego/gosec/v2/cmd/gosec@latest, or brew install gitleaks. |
| Gate passes locally but fails in CI | CI environment has additional scanners or stricter config | Compare $TMPDIR/agentops-security/ artifacts from both environments; align scanner versions and config files across local and CI. |
| False positive blocking the gate | Scanner flags a non-issue as high/critical severity | Add a scanner-specific inline suppression comment (e.g., # nosemgrep: rule-id) or update the scanner config to exclude the pattern, then document the suppression reason. |
| Artifacts directory $TMPDIR/agentops-security/ not created | Script lacks write permissions or $TMPDIR is not writable | Verify $TMPDIR is set and writable; the script auto-creates subdirectories on each run. |
| Nightly scan not detecting regressions | Nightly workflow is not configured or is pointing at stale branch | Verify .github/workflows/nightly.yml runs scripts/security-gate.sh --mode full against the correct branch (typically main). |
| Suite exits non-zero with no clear finding | --fail-on-removed or --fail-on-policy-fail triggered on a legitimate change | Review compare/baseline-diff.json and policy/policy-verdict.json to identify the specific delta, then update the baseline or policy file accordingly. |
| dynamic/dynamic-analysis.json is empty or minimal | Binary requires arguments beyond --help, or sandbox blocked execution | Supply a custom dynamic command if supported, or verify the binary runs in the sandboxed environment (check permissions, missing shared libraries). |
| contract/contract.json shows zero commands | The binary does not expose a --help surface or uses a non-standard help flag | Verify the binary supports --help; for binaries with unusual help interfaces, run collect-contract separately with the correct invocation. |
| Policy verdict fails on deny_command_patterns | A new subcommand matches a deny regex in the policy file | Either rename the subcommand or update deny_command_patterns in your policy JSON to exclude the legitimate pattern. |
| baseline-diff.json not generated | --baseline-dir was not provided or points to a missing directory | Ensure the baseline directory exists and contains a valid contract/contract.json from a prior run. |
| Redteam scan fails after a wording cleanup | The attack pack no longer matches the intended guardrail language in target files | Review redteam/redteam-results.json, confirm whether the control regressed or the regex is too brittle, then update the target file or the pack intentionally. |
Reference Documents
- references/security.feature — Executable spec: run scanners, fail on high/critical, gate release, retain audit artifacts (soc-qk4b)
- references/security-suite.feature — Executable spec: composable primitives (static/dynamic/contract) → security-report.json, authorization-bounded, supplier-to vibe (soc-qk4b)
- references/owasp-checklist.md — OWASP Top 10 code-level review checklist
- references/agentops-redteam-pack.json — Offline repo-surface attack pack
- references/policy-example.json — Starter policy for the suite policy gate