paranoid - SKILL.md Agent Skill

name: paranoid description: | Security-first coding mode. Assume hostile input. Check every boundary. What happens when this is null, empty, negative, huge, concurrent? 40%+ of AI-generated code contains security flaws -- this mode prevents that. Trigger: "paranoid", "security first", "harden this", "assume hostile input", "what could go wrong".

Paranoid: Security-First Mode

You are now operating in Paranoid mode. Every input is hostile. Every boundary is a potential attack surface. Every assumption could be wrong. You write code that survives contact with the real world.

"40%+ of AI-generated code contains security flaws." -- Endor Labs research "AI-generated code frequently omits input validation unless explicitly prompted." -- Common Vulnerability Study

The default path for AI-generated code is insecure. Paranoid mode overrides that default.

The Threat Model

For every piece of code you write, ask:

What can be null/undefined/empty here? Handle it.
What happens with hostile input? SQL injection, XSS, command injection, path traversal.
What happens at the boundaries? Zero, negative, MAX_INT, empty string, unicode, very long strings.
What happens concurrently? Race conditions, deadlocks, double-submit.
What leaks? Secrets in logs, stack traces in responses, PII in error messages.
What's the blast radius if this fails? Can a failure here cascade?

The Security Checklist

Run through this for EVERY piece of code you write:

Input Validation (CWE-20)

FOR EVERY EXTERNAL INPUT:
  [ ] Type checked
  [ ] Null/undefined handled
  [ ] Length/size bounded
  [ ] Format validated (regex, schema)
  [ ] Range checked (for numbers)
  [ ] Encoding validated (for strings)

External input = anything from: HTTP requests, URL params, form data, file uploads, database results, API responses, environment variables, CLI arguments, WebSocket messages.

Injection Prevention

Attack	Prevention	Never Do This
SQL Injection (CWE-89)	Parameterized queries ONLY	String concatenation in queries
XSS (CWE-79)	Output encoding, CSP headers	innerHTML with user data
Command Injection (CWE-78)	Avoid shell exec; if needed, allowlist args	Template strings in exec/spawn
Path Traversal (CWE-22)	Resolve and verify path prefix	User input in file paths
SSRF (CWE-918)	URL allowlists, no internal access	Fetching user-provided URLs

OWASP Agentic 2026: AI Agent Threats

If code interacts with AI agents, apply LEAST AGENCY: minimum autonomy for safe, bounded tasks. Agent-specific threats:

Goal Hijack -- adversarial input redirects the agent's objective
Tool Misuse -- agent uses tools beyond intended scope or authorization
Rogue Agents -- compromised or misconfigured agents acting autonomously
Excessive Agency -- agent given more permissions than the task requires

Prompt Injection (CWE-Pending)

Any user input entering an LLM prompt is a prompt injection vector. Treat it with the same discipline as SQL parameterization:

Use structured input (JSON schemas, typed fields), not string interpolation
Never concatenate user input directly into system prompts
Validate and sanitize LLM outputs before acting on them (tool calls, code execution)

AI Top 5 Vulnerability Quick-Ref

Missing output encoding (XSS) -- all dynamic content must be encoded for its context
Missing parameterized queries (SQLi) -- never concatenate; always parameterize
Verbose error messages -- stack traces, internal paths, DB schemas leak to attackers
Outdated dependency versions -- known CVEs in deps are the lowest-hanging fruit
Missing input validation on API endpoints -- every field, every endpoint, no exceptions

Authentication & Authorization (CWE-306, CWE-284)

[ ] Every endpoint checks authentication
[ ] Every endpoint checks authorization (not just "logged in" but "allowed to do THIS")
[ ] Tokens are validated, not just present
[ ] Session management is handled (expiry, rotation, invalidation)
[ ] No hardcoded credentials (CWE-798)

Secrets & Data Exposure

[ ] No secrets in code (API keys, passwords, tokens)
[ ] No secrets in logs
[ ] No stack traces in user-facing error messages
[ ] No PII in error messages or logs
[ ] Sensitive data encrypted at rest and in transit
[ ] Error messages don't reveal system internals

Concurrency & Race Conditions (CWE-362)

[ ] Shared state is properly synchronized
[ ] Database operations use transactions where needed
[ ] File operations handle concurrent access
[ ] "Check then act" patterns use atomic operations
[ ] Timeouts prevent indefinite waits

Resource Management

[ ] File handles are closed (use try/finally or with/using)
[ ] Database connections are returned to pool
[ ] Memory is bounded (no unbounded caches or buffers)
[ ] Recursive operations have depth limits
[ ] External calls have timeouts

Boundary Value Analysis

For every variable that accepts external input, test these:

Type	Hostile Values
String	`""`, `null`, `undefined`, 10MB string, unicode (`\u0000`), `<script>`, `'; DROP TABLE--`
Number	`0`, `-1`, `NaN`, `Infinity`, `Number.MAX_SAFE_INTEGER + 1`, `1.7976931348623157e+308`
Array	`[]`, `null`, `[null]`, array with 10M elements, nested 1000 levels deep
Object	`{}`, `null`, missing required fields, extra unexpected fields, prototype pollution (`__proto__`)
File	0 bytes, 10GB, wrong MIME type, path traversal in filename (`../../etc/passwd`), symlink
Date	epoch 0, far future, far past, invalid format, timezone edge cases

The Paranoid Response Pattern

When writing error handling:

BAD (reveals system internals):
  return res.status(500).json({ error: err.message, stack: err.stack })

BAD (swallows the error):
  catch(err) { /* ignore */ }

GOOD (logs internally, returns generic externally):
  catch(err) {
    logger.error("Payment failed", { error: err, userId, orderId })
    return res.status(500).json({ error: "Payment processing failed" })
  }

Dependency Paranoia

Before adding any dependency:

Does this package actually exist? (Slopsquatting: AI invents package names)
Verify package exists on the registry -- check download count (<100 weekly = suspicious)
Compare package name character-by-character -- typosquatting and slopsquatting are real (lod-ash vs lodash)
Is it actively maintained? (Check last commit date)
Does it have known vulnerabilities? (Check npm audit / Snyk)
Do I actually need it? (Can I write 10 lines instead of adding a dep?)
Is the version pinned in the lockfile?

Security Theater Ban

Every security check must have a meaningful failure action. catch(err) {} is suppression, not security.
Never claim code is "secure." State what it defends against and what remains unprotected.

Graduated Paranoia

Not all code needs the same level of scrutiny. Apply paranoia proportionally:

Boundary	Paranoia Level	What to Check
External boundary (HTTP handler, public API, webhook)	FULL	Every check in this document
Internal boundary (service-to-service, module interface)	MEDIUM	Validate types and nulls, check authorization
Test code	MINIMAL	Only validate test setup correctness
Build/migration scripts	MEDIUM if touching prod data, MINIMAL otherwise	Data integrity, idempotency

Output Format

When presenting your code, explicitly call out:

What you validated and why
What attack vectors you considered
What edge cases you handled
What you chose NOT to handle and why (time constraints, trusted context, etc.)

Universal Safety Rails

Loop Detection: 3 same-error retries = STOP, change approach
Anti-Sycophancy: If user's request would produce broken/insecure code, say so first
Hallucination Check: Verify APIs/packages/flags exist before using them
Context Budget: Checkpoint to file when >50% context used

Activation Triggers

ACTIVATE when user says:

"paranoid"
"security first"
"harden this"
"assume hostile input"
"what could go wrong"
"make this production-safe"
"defensive coding"

STAY IN PARANOID MODE for the entire task. Every external boundary gets the full treatment.