name: tn-security-eval
description: |
  Comprehensive security evaluation for telcoin-network PRs and branches. Orchestrates 10 parallel agents covering consensus safety, state transitions, crypto, DoS, determinism, contract safety, dependency audit, nemesis logic audit, DREAD scoring, and STRIDE classification, with independent verification to eliminate false positives. Trigger on: "security eval", "security audit PR", "is this PR safe", "pre-merge security".
Security Evaluation Orchestrator
Comprehensive security evaluation for telcoin-network code changes. Spawns 10 specialized security agents in parallel, each focused on a specific attack surface, then independently verifies findings to eliminate false positives and provides root-cause remediation for confirmed vulnerabilities.
Severity Scale (Blockchain-Calibrated)
| Level | Definition | Examples |
|---|---|---|
| CRITICAL | Consensus break, fund loss, chain halt, or forked state | Quorum miscalculation, determinism violation in state transition, unprotected fund transfer |
| HIGH | Security degradation or data corruption | Missing signature verification, unbounded allocation, unsafe key handling |
| MEDIUM | Defense-in-depth gap or reliability issue | Missing input validation on network boundary, inadequate error handling in consensus path |
| LOW | Code quality issue with security implications | Inconsistent error types, missing logging in security-relevant paths |
| INFO | Observation or hardening suggestion | Style inconsistency, documentation gap in security-sensitive code |
Process
Phase 1: Scope Identification
Determine what code to evaluate:
- If given a PR number: run `git diff main...HEAD` or `gh pr diff <number>`
- If given a branch: run `git diff main...<branch>`
- If given specific files: use those directly
- Read all changed files in full plus their direct dependents
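The scope rules above can be sketched as a small helper that picks the diff command. This is a minimal illustration: the function name `scope_command` and its parameters are hypothetical, and only the `gh pr diff` and `git diff main...<branch>` commands come from the list above.

```python
from typing import List, Optional


def scope_command(pr: Optional[int] = None, branch: Optional[str] = None) -> List[str]:
    """Return the diff command for a PR number or a branch (hypothetical helper)."""
    if pr is not None:
        # `gh pr diff <number>` prints the diff of that pull request.
        return ["gh", "pr", "diff", str(pr)]
    if branch is not None:
        # Three-dot diff: changes on the branch since it diverged from main.
        return ["git", "diff", f"main...{branch}"]
    # An explicit file list needs no diff command; the files are read directly.
    raise ValueError("provide a PR number or a branch")
```

Returning the command as an argument list keeps it safe to pass to a process runner without shell quoting.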
Phase 2: Spawn Security Agents
Spawn ALL 10 agents in parallel using the Agent tool. Each agent receives:
- The list of changed files and their diffs
- The full content of changed files
The 10 agents and their focus areas:
| Agent | Focus | Skills to Invoke |
|---|---|---|
| tn-consensus-safety | BFT assumptions, quorum logic, vote counting, leader election | tn-harden + tn-threat-model |
| tn-state-transitions | Invariant violations, partial state updates, rollback safety | tn-nemesis + tn-review |
| tn-crypto-correctness | Signatures, hashing, key management, nonce handling | tn-review (crypto paths) |
| tn-dos-vectors | Resource exhaustion, unbounded allocations, amplification | tn-harden (blocking audit) |
| tn-determinism-verifier | HashMap iteration, SystemTime, thread-dependent ordering, randomness | tn-harden (determinism) |
| tn-contract-safety | Access control, reentrancy, accounting, upgrade safety | tn-review-contracts |
| tn-dependency-auditor | New crates, CVE exposure, supply chain, feature flags | Cargo.toml diff analysis |
| tn-nemesis-auditor | Deep iterative business logic + state inconsistency cross-analysis | tn-nemesis |
| tn-dread-evaluator | Attacker-perspective risk assessment, DREAD scoring, attack surface prioritization | tn-threat-model |
| tn-stride-threat-model | STRIDE threat classification: spoofing, tampering, repudiation, info disclosure, DoS, privilege escalation | tn-threat-model |
Phase 3: Extract Structured Findings
After all 10 agents complete, extract each discrete finding into a canonical structure:
Finding ID: [agent-name]-[N]
Source Agent: [which of the 10]
Severity: CRITICAL / HIGH / MEDIUM / LOW / INFO
Title: [one-line summary]
Location: file_path:line_number
Claim: [standalone factual assertion — what is wrong]
Key Question: [the specific thing a verifier must answer]
Relevant Files: [files needed to verify]
Source: [agent name, e.g., "tn-consensus-safety", "tn-state-transitions"]
Critical: The Claim field contains ONLY the factual assertion (e.g. "function X does not validate input Y"), never the reasoning chain that led to it. This preserves independence for Phase 4 verifiers.
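One way to hold the canonical structure in code is a small record type. The sketch below is an assumption about representation, not part of the skill itself: the class name `Finding`, its field names, and the example values are all hypothetical. Because the Source Agent and Source fields carry the same agent name, the sketch stores it once.

```python
from dataclasses import dataclass, field
from typing import List

SEVERITIES = ("CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO")


@dataclass
class Finding:
    """Hypothetical record mirroring the canonical finding structure."""
    source_agent: str
    index: int
    severity: str
    title: str
    location: str          # file_path:line_number
    claim: str             # factual assertion only, no reasoning chain
    key_question: str
    relevant_files: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject severities outside the five-level scale.
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")

    @property
    def finding_id(self) -> str:
        # Finding ID format from the schema: [agent-name]-[N]
        return f"{self.source_agent}-{self.index}"


# Placeholder values for illustration only.
example = Finding(
    source_agent="tn-dos-vectors",
    index=1,
    severity="HIGH",
    title="Unvalidated input",
    location="path/to/file.rs:123",
    claim="function X does not validate input Y",
    key_question="Is input Y checked before function X uses it?",
)
```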
Assign each finding a verification tier:
| Tier | Severities | Verification Strategy |
|---|---|---|
| Tier 1 | CRITICAL, HIGH | Verified individually — one agent per finding |
| Tier 2 | MEDIUM | Batched 2-3 per agent, grouped by subsystem |
| Tier 3 | LOW | Batched 3-5 per agent |
| Skip | INFO | No verification — observations, not vulnerability claims |
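
The tier table can be read as a batching plan. The following is a rough sketch under stated assumptions: findings are plain dicts with hypothetical `id`, `severity`, and `subsystem` keys, and each batch is filled to the upper bound from the table (3 for Tier 2, 5 for Tier 3), so the last batch in a group may be smaller than the stated minimum.

```python
from typing import Dict, List, Tuple

TIER_BY_SEVERITY = {"CRITICAL": 1, "HIGH": 1, "MEDIUM": 2, "LOW": 3}
MAX_BATCH = {1: 1, 2: 3, 3: 5}  # findings per verification agent


def plan_batches(findings: List[Dict[str, str]]) -> List[List[str]]:
    """Group finding IDs into verification batches; INFO findings are skipped."""
    groups: Dict[Tuple[int, str], List[str]] = {}
    for f in findings:
        if f["severity"] == "INFO":
            continue  # observations, not vulnerability claims
        tier = TIER_BY_SEVERITY[f["severity"]]  # KeyError on unknown severity
        # Only Tier 2 is grouped by subsystem; other tiers use one group each.
        subsystem = f.get("subsystem", "") if tier == 2 else ""
        groups.setdefault((tier, subsystem), []).append(f["id"])

    batches: List[List[str]] = []
    for (tier, _), ids in sorted(groups.items()):
        size = MAX_BATCH[tier]
        batches.extend(ids[i:i + size] for i in range(0, len(ids), size))
    return batches
```

Each returned batch corresponds to one verification agent, with Tier 1 findings always alone in their batch.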
Phase 4: Verify and Present
After extracting all structured findings in Phase 3, invoke the findings-verifier agent via the Agent tool to independently verify each finding, produce remediation, and present the final report.
Pass to the agent:
- All extracted findings in canonical schema (from Phase 3)
- The evaluation scope context (PR number, branch, or file list)
- The full content of all changed files
The findings-verifier agent handles:
- Independent subagent verification (anti-confirmation bias — verifiers receive only the claim and key question, never the original agent's reasoning)
- Tiered verification (CRITICAL/HIGH individually, MEDIUM batched 2-3, LOW batched 3-5, INFO skipped)
- Root-cause remediation with decision tree (clear fix → code, multiple approaches → options, architectural → flag for human)
- Checking for similar patterns elsewhere in the codebase
- Updating the report with verification results and proposed fixes
- Presenting confirmed findings with verification stats
Do not present findings to the user before findings-verifier completes. Unverified findings may be false positives, and reviewing them wastes time.
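The independence requirement can be enforced with an allowlist when building verifier input. This is a minimal sketch: the function name `verifier_payload` and the dict keys are hypothetical, and passing the finding ID and relevant files alongside the claim and key question is an assumption, since the text above names only the claim and key question.

```python
from typing import Any, Dict

# Fields a verifier is allowed to see. The source names only the claim and key
# question; including the ID and relevant files is an assumption of this sketch.
VERIFIER_FIELDS = ("id", "claim", "key_question", "relevant_files")


def verifier_payload(finding: Dict[str, Any]) -> Dict[str, Any]:
    """Build verifier input from an allowlist so agent reasoning never leaks."""
    return {name: finding[name] for name in VERIFIER_FIELDS}


# Placeholder finding for illustration only.
raw = {
    "id": "tn-dos-vectors-1",
    "severity": "HIGH",
    "claim": "function X does not validate input Y",
    "key_question": "Is input Y checked before function X uses it?",
    "relevant_files": ["path/to/file.rs"],
    "reasoning": "original agent's reasoning chain",
}
payload = verifier_payload(raw)
```

An allowlist is used instead of deleting known fields, so any field added to the finding later stays hidden from verifiers by default.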
Expected Agent Counts
| Phase | Agents | Notes |
|---|---|---|
| 2 | 10 | Fixed — one per security domain |
| 4 (via findings-verifier) | 6-12 | Verification agents, depends on finding count and tiers |
| 4 (re-verify) | 0-3 | Low-confidence CRITICAL/HIGH verdicts only |
| 4 (remediation) | 3-8 | Confirmed findings only |
| Total | 19-33 | Typical ~24 |
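
As a quick check on the Total row, the per-phase ranges from the table can be summed. The dictionary keys below are shorthand labels for this sketch only.

```python
# (min, max) agent counts per phase, copied from the table above.
PHASE_RANGES = {
    "phase 2": (10, 10),
    "phase 4 verification": (6, 12),
    "phase 4 re-verify": (0, 3),
    "phase 4 remediation": (3, 8),
}

total_min = sum(low for low, _ in PHASE_RANGES.values())
total_max = sum(high for _, high in PHASE_RANGES.values())
print(f"{total_min}-{total_max}")  # prints "19-33"
```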