name: warden-security-review
description: Run Warden security scans in this repo using Sentry's warden-skills. Use when asked to audit security, scan with Warden, investigate authz/data-exfil/code-execution/GitHub Actions risks, or triage Warden findings.
Warden security review runbook
Use Warden as a first-pass scanner, then manually verify every finding against the code. A clean Warden run means "no findings from that skill/pass", not "the codebase is secure."
Setup
Warden uses Claude Code auth locally. For Claude Max usage:
claude login
Run Warden through npm so the package version does not need to be committed:
npm exec --yes --package=@sentry/warden -- warden --help
The repo has a warden.toml that uses remote skills from getsentry/warden-skills.
Reference skills are mirrored under .reference/warden-skills when needed. .reference/ is gitignored.
Local Outputs
Write run artifacts under .warden-runs/. Do not commit .warden/ or .warden-runs/.
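One way to keep those directories out of commits is to list them in an ignore file. The helper below is a minimal sketch: ensure_ignored is a hypothetical name, and it assumes a plain line-based .gitignore at the repo root that may or may not already contain the entries.

```shell
# Hypothetical helper: append an entry to an ignore file only if that exact
# line is not already present.
ensure_ignored() {
  file=$1
  entry=$2
  touch "$file"
  # Already listed as an exact line: nothing to do.
  if grep -qxF -- "$entry" "$file"; then
    return 0
  fi
  # Add a newline first if the file does not end with one.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$entry" >> "$file"
}

# Usage (run from the repo root):
#   ensure_ignored .gitignore .warden/
#   ensure_ignored .gitignore .warden-runs/
```

The helper is idempotent, so it is safe to call before every scan.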
Use JSONL output for later triage:
mkdir -p .warden-runs
npm exec --yes --package=@sentry/warden -- \
warden <targets...> --skill <skill> --fail-on off --report-on low --min-confidence low \
--parallel 2 --log -o .warden-runs/<name>.jsonl
Warden may not treat bare directories as recursive targets. Prefer explicit quoted globs or a target file list.
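The shared flags in the template above can be wrapped in a small shell function so that each scan differs only by output name, skill, and targets. This is a sketch, not part of Warden: warden_scan and the WARDEN_RUNNER override are hypothetical names introduced here, and the flags are copied verbatim from the template.

```shell
# Hypothetical wrapper around the runbook's template command.
# Usage: warden_scan <name> <skill> <target>...
warden_scan() {
  if [ "$#" -lt 3 ]; then
    echo "usage: warden_scan <name> <skill> <target>..." >&2
    return 2
  fi
  name=$1
  skill=$2
  shift 2
  mkdir -p .warden-runs
  # WARDEN_RUNNER is an assumption added for dry runs (e.g. WARDEN_RUNNER=echo).
  # It is left unquoted on purpose so the default splits into separate words;
  # this relies on sh/bash word splitting and will not work unchanged in zsh.
  ${WARDEN_RUNNER:-npm exec --yes --package=@sentry/warden --} \
    warden "$@" --skill "$skill" --fail-on off --report-on low --min-confidence low \
    --parallel 2 --log -o ".warden-runs/$name.jsonl"
}

# Usage, with quoted globs as targets:
#   warden_scan authz wrdn-authz "apps/cloud/src/auth/**/*.ts" "apps/cloud/src/api/**/*.ts"
```

Setting WARDEN_RUNNER=echo prints the command that would run, which is a cheap way to check quoting before spending a scan.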
Recommended Scans
Authz on cloud/API surfaces:
npm exec --yes --package=@sentry/warden -- \
warden "apps/cloud/src/auth/**/*.ts" "apps/cloud/src/api/**/*.ts" \
"apps/cloud/src/routes/**/*.tsx" "packages/core/api/src/**/*.ts" \
--skill wrdn-authz --fail-on off --report-on low --min-confidence low \
--parallel 2 --log -o .warden-runs/authz.jsonl
Code execution on sink-bearing runtime/plugin files:
rg -l "\b(exec|spawn|execFile|fork|subprocess|Deno\.Command|new Function|QuickJS|quickjs|compile|instantiate|shell|command|child_process)\b|\b(eval|Worker|import)\(|\bvm\.|\brunIn" \
apps/local/src/server apps/cli/src packages/core/execution/src packages/core/sdk/src packages/kernel packages/plugins \
-g "*.ts" -g "*.tsx" -g "!*.test.ts" -g "!*.spec.ts" -g "!*.e2e.ts" -g "!**/dist/**" -g "!**/node_modules/**" \
> .warden-runs/code-execution-targets.txt
npm exec --yes --package=@sentry/warden -- \
warden $(tr '\n' ' ' < .warden-runs/code-execution-targets.txt) \
--skill wrdn-code-execution --fail-on off --report-on low --min-confidence low \
--parallel 2 --log -o .warden-runs/code-execution.jsonl
Data exfiltration on backend/API/storage/plugin SDK surfaces:
find apps/cloud/src/api apps/cloud/src/auth apps/local/src/server \
packages/core/api/src packages/core/storage-core/src packages/core/storage-file/src \
packages/core/storage-postgres/src packages/core/storage-drizzle/src \
packages/plugins/mcp/src packages/plugins/openapi/src packages/plugins/graphql/src \
packages/plugins/google-discovery/src packages/plugins/oauth2/src \
packages/plugins/onepassword/src packages/plugins/workos-vault/src \
packages/plugins/file-secrets/src packages/plugins/keychain/src \
-type f \( -name "*.ts" -o -name "*.tsx" \) |
rg -v '(\.test\.|\.spec\.|\.e2e\.|dist/|node_modules/|embedded-migrations\.gen\.ts|/react/)' \
> .warden-runs/exfil-targets-focused.txt
npm exec --yes --package=@sentry/warden -- \
warden $(tr '\n' ' ' < .warden-runs/exfil-targets-focused.txt) \
--skill wrdn-data-exfil --fail-on off --report-on low --min-confidence low \
--parallel 2 --log -o .warden-runs/data-exfil.jsonl
GitHub Actions workflow risks:
find .github -type f \( -name "*.yml" -o -name "*.yaml" \) > .warden-runs/gha-targets.txt
npm exec --yes --package=@sentry/warden -- \
warden $(tr '\n' ' ' < .warden-runs/gha-targets.txt) \
--skill wrdn-gha-workflows --fail-on off --report-on low --min-confidence low \
--parallel 2 --log -o .warden-runs/gha-workflows.jsonl
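The file-list scans above pass targets through an unquoted command substitution. The shell splits that on whitespace, so a path containing a space becomes two targets, and an empty list leaves warden with no targets at all. A minimal pre-flight check could look like the sketch below; check_targets is a hypothetical helper, not a Warden feature.

```shell
# Hypothetical pre-flight check for a newline-separated target list.
# Succeeds only if the list is non-empty, every path exists, and no path
# contains whitespace (which would break unquoted word splitting).
check_targets() {
  list=$1
  if [ ! -s "$list" ]; then
    echo "no targets in $list" >&2
    return 1
  fi
  rc=0
  while IFS= read -r target || [ -n "$target" ]; do
    [ -n "$target" ] || continue
    case "$target" in
      *[[:space:]]*) echo "whitespace in path: $target" >&2; rc=1 ;;
    esac
    [ -e "$target" ] || { echo "missing: $target" >&2; rc=1; }
  done < "$list"
  return "$rc"
}

# Usage: run the check first and only scan if it passes, e.g.
#   check_targets .warden-runs/gha-targets.txt && <warden command from above>
```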
How to Triage
Deduplicate findings by root cause. Warden often reports the same bug at the low-level sink, wrapper, API handler, and plugin-tool entrypoint.
For each candidate:
- Trace whether input is user-controlled.
- Identify the exact sink.
- Check whether auth and scope checks, host allowlists, private-IP blocks, redirect handling, and DNS rebinding defenses exist.
- Determine what data returns to the caller: raw body, parsed fields, typed error message, timing/status oracle, or no observable data.
- State confidence and deployment caveats.
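To make the "private-IP blocks" item concrete, the sketch below lists the IPv4 ranges a fetch path would normally be expected to refuse. is_internal_ipv4 is a hypothetical helper for manual probing, not code from this repo. It assumes canonical dotted-quad input only, so it says nothing about IPv6, decimal or hex encodings, or hostnames that resolve to internal addresses; a real defense also has to check the resolved address at connect time.

```shell
# Hypothetical helper: succeed if a canonical dotted-quad IPv4 address falls in
# a range that a server-side fetch should normally refuse.
is_internal_ipv4() {
  case "$1" in
    # 0.0.0.0/8, RFC 1918 10/8, loopback, link-local (cloud metadata), RFC 1918 192.168/16
    0.*|10.*|127.*|169.254.*|192.168.*) return 0 ;;
    # RFC 1918 172.16.0.0/12
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   is_internal_ipv4 169.254.169.254 && echo "should be blocked"
```

If a finding's fetch path accepts any address this helper flags, treat the SSRF as reachable until the code shows a block elsewhere.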
Current Known Findings
As of the Warden pass on 2026-04-29:
- Real: authenticated SSRF in plugin/source setup URL fetching for OpenAPI, Google Discovery, GraphQL, and MCP remote endpoints.
- Real: mutable third-party GitHub Actions refs in publish/release workflows, especially oven-sh/setup-bun@v2 and changesets/action@v1.
- Clean in that pass: authz scan on cloud auth/API/core API surfaces; code-execution scan on narrowed CLI/runtime/kernel/plugin sink files.
Do not claim the whole codebase is secure from those clean runs. They are scoped scanner results.
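When re-verifying the mutable GitHub Actions refs finding by hand, it can help to list every uses: reference that is not pinned to a full 40-character commit SHA. The function below is a rough sketch with a hypothetical name (list_unpinned_actions). It assumes unquoted uses: values and a grep that supports --include, and it does not handle docker:// references.

```shell
# Hypothetical helper: print `uses:` lines whose ref after "@" is not a full
# 40-character lowercase hex commit SHA. Local actions (./path) carry no "@"
# and are skipped. Exits non-zero when nothing unpinned is found.
list_unpinned_actions() {
  dir=${1:-.github}
  grep -rnE --include='*.yml' --include='*.yaml' \
    '^[[:space:]]*(-[[:space:]]*)?uses:[[:space:]]*[^[:space:]]+@[^[:space:]]+' "$dir" |
    grep -vE '@[0-9a-f]{40}([[:space:]]|$)'
}

# Usage (from the repo root):
#   list_unpinned_actions .github
```

Tag refs such as @v2 show up in the output; each one still needs a manual check of which workflows run it and with what permissions.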