extracting-recurring-vulnerabilities-black-box - SKILL.md Agent Skill

name: "extracting-recurring-vulnerabilities-black-box"
description: >
  Predict and prevent recurring vulnerabilities in LLM-generated code using the FSTab (Feature-Security Table) technique. Maps observable frontend features to likely backend vulnerabilities based on which LLM generated the code. Use this skill when:
  - "Audit this LLM-generated app for security vulnerabilities"
  - "What vulnerabilities does this generated code likely have?"
  - "Check this AI-generated backend for recurring security issues"
  - "Build an FSTab mapping for this codebase"
  - "Run a black-box vulnerability prediction on this web app"
  - "Assess vulnerability persistence in code from GPT/Claude/Gemini"

Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software

This skill enables Claude to predict, detect, and remediate recurring security vulnerabilities in LLM-generated software by applying the FSTab (Feature-Security Table) technique. FSTab exploits a critical insight: LLMs generate code from recurring templates, producing predictable vulnerability patterns tied to specific frontend features. By identifying observable features (login forms, file uploads, search bars) and knowing which LLM generated the code, Claude can predict the most probable backend vulnerabilities without reading a single line of backend source -- then verify and fix them.

When to Use

When a user asks to security-audit a web application known to be generated by an LLM (ChatGPT, Claude, Gemini, Copilot, Cursor, etc.)
When reviewing AI-generated code for a pull request and wanting to catch model-specific vulnerability patterns
When the user wants to build a feature-to-vulnerability mapping table for a specific LLM's code output
When performing black-box security assessment of a deployed app suspected to be LLM-generated
When the user wants to add security regression tests targeting high-recurrence features (auth, payments, search)
When evaluating whether an LLM consistently produces the same vulnerability across rephrased prompts or different domains
When hardening LLM-generated code before deployment by proactively patching predicted vulnerabilities

Key Technique: Feature-Security Table (FSTab)

Core Insight -- Vulnerability Persistence. LLMs do not produce random bugs. When asked to implement a "password reset" feature, a given model will reliably omit the same protections (rate limiting, token expiration, account enumeration prevention) across different applications, different prompts, and even different domains. This recurrence is measurable: the Feature Vulnerability Recurrence (FVR) metric shows some features hit 100% recurrence -- every time a model implements "Register New Account," it introduces the same flaw. The Rephrasing Vulnerability Persistence (RVP) metric confirms these aren't prompt artifacts; roughly 50% of vulnerabilities survive semantics-preserving rephrasings.

How FSTab Works. FSTab is a probabilistic lookup table mapping observable frontend features to latent backend vulnerabilities, built per-model. Construction uses Pointwise Mutual Information (PMI): S_PMI(f, v) = log(P(v|f) / P(v)), which scores how much more likely vulnerability v is when feature f is present versus its baseline rate. A greedy selection strategy with diversity penalty ensures the table captures a broad, discriminative set of mappings rather than repeating the same high-frequency vulnerability. The paper defines a taxonomy of 59 observable frontend features (e.g., user_login_with_password, submit_payment, upload_file, search_with_query) and maps them to security rule identifiers detected by CodeQL and Semgrep.
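
A minimal sketch of the PMI score described above, assuming the input is one observation per generated app that pairs the set of features present with the set of vulnerabilities found. The function name `pmi_table`, the input shape, the toy data, and the natural-log base are assumptions for illustration, not the paper's implementation.

```python
import math
from collections import Counter


def pmi_table(observations):
    """Score each (feature, vulnerability) pair as log(P(v|f) / P(v)).

    observations: list of (features, vulns) pairs, one per generated app,
    where both elements are sets of string identifiers (assumed shape).
    """
    n_apps = len(observations)
    vuln_count = Counter()     # number of apps in which v appears
    feature_count = Counter()  # number of apps in which f appears
    pair_count = Counter()     # number of apps in which f and v co-occur

    for features, vulns in observations:
        vuln_count.update(vulns)
        feature_count.update(features)
        pair_count.update((f, v) for f in features for v in vulns)

    scores = {}
    for (f, v), joint in pair_count.items():
        p_v_given_f = joint / feature_count[f]
        p_v = vuln_count[v] / n_apps
        scores[(f, v)] = math.log(p_v_given_f / p_v)
    return scores


# Toy observations, invented for this example.
apps = [
    ({"upload_file", "search_with_query"}, {"path-traversal"}),
    ({"upload_file"}, {"path-traversal"}),
    ({"search_with_query"}, {"sql-injection"}),
    ({"user_login_with_password"}, set()),
]
scores = pmi_table(apps)
```

In this toy data, `path-traversal` appears in every app with `upload_file` but in only half of all apps, so that pair scores log 2. The same vulnerability scores 0 against `search_with_query`, where it is no more likely than its baseline.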

Black-Box Attack and Defense. The attack has three stages: (1) reconnaissance -- interact with the UI to identify features, (2) feature mapping -- classify observations into the 59-feature taxonomy, (3) table query -- retrieve top-k predicted vulnerabilities for each feature given the source model. Cross-domain transfer is remarkably strong (CDT exceeds within-domain recurrence by ~18%), meaning vulnerability patterns are architectural rather than domain-specific. For defenders, this same pipeline becomes a proactive audit: predict what's likely broken, then verify and fix before attackers do.
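
The table-query stage can be pictured as a plain lookup. The sketch below assumes a table shaped as model -> feature -> list of (vulnerability, score) pairs. The names `FSTAB` and `predict_vulnerabilities`, the model labels, and every score are hypothetical placeholders rather than values from the paper.

```python
# Hypothetical table: model -> feature -> [(vulnerability, PMI score)].
# All entries are placeholders invented for this example.
FSTAB = {
    "model-a": {
        "user_login_with_password": [
            ("session-fixation", 1.1),
            ("missing-rate-limit", 2.3),
        ],
        "upload_file": [
            ("path-traversal", 2.1),
            ("unrestricted-upload", 1.9),
        ],
    },
    "model-b": {
        "user_login_with_password": [("insecure-password-storage", 1.8)],
    },
}


def predict_vulnerabilities(table, model, observed_features, k=2):
    """Return the top-k predicted vulnerabilities for each observed feature."""
    per_feature = table.get(model, {})
    predictions = {}
    for feature in observed_features:
        entries = per_feature.get(feature, [])
        ranked = sorted(entries, key=lambda entry: entry[1], reverse=True)
        predictions[feature] = [vuln for vuln, _score in ranked[:k]]
    return predictions


# Features observed during reconnaissance and mapped to the taxonomy.
observed = ["user_login_with_password", "upload_file", "export_data"]
predicted = predict_vulnerabilities(FSTAB, "model-a", observed, k=1)
```

A feature with no table entry, such as `export_data` here, yields an empty list instead of an error, which keeps the audit loop simple when the table is incomplete.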

Step-by-Step Workflow

Identify the source LLM. Determine which model generated the code (GPT, Claude, Gemini, Copilot, etc.). Check git history, project metadata, .cursorrules, or ask the user. If unknown, run the analysis against multiple model profiles and union the results.
Enumerate observable frontend features. Scan the codebase or interact with the running application to catalog all user-facing functionality. Map each to the FSTab feature taxonomy: authentication flows (user_login_with_password, register_new_account, reset_password), data operations (search_with_query, upload_file, export_data), payment processing (submit_payment, manage_subscription), admin functions (admin_dashboard, user_role_management), etc.
Build the feature-vulnerability mapping. For each identified feature, retrieve the known high-recurrence vulnerability associations for that model. Use the PMI-based associations below as the starting reference, then refine with static analysis of the actual codebase:
- Authentication features -> Missing rate limiting (brute force/credential stuffing), insecure password storage, session fixation, missing account lockout
- Search/query features -> Regex injection (ReDoS), SQL/NoSQL injection, missing input sanitization
- File upload features -> Unrestricted file type, path traversal, missing size limits, no malware scanning
- Payment features -> Missing server-side validation, price manipulation, insecure direct object references
- Admin features -> Broken access control, privilege escalation, missing authorization checks
Run static analysis to confirm predictions. Execute CodeQL and/or Semgrep against the backend code targeting the predicted vulnerability classes. Focus scans on the specific code paths implementing the identified features rather than scanning everything generically.
Score and rank findings. For each confirmed vulnerability, assign a recurrence confidence based on: (a) whether the feature-vulnerability pair is high-PMI for this model, (b) whether the same vulnerability appears across multiple features, (c) severity (CVSS or CWE impact). Prioritize vulnerabilities that are both high-recurrence and high-severity.
Check for cross-domain transfer risks. If the application spans multiple domains (e.g., an e-commerce platform with a blog and admin dashboard), flag vulnerabilities found in one domain that the model is known to reproduce in others. CDT rates of 50-94% mean a SQL injection in the shop likely has a cousin in the blog.
Generate targeted fixes. For each confirmed vulnerability, produce a specific remediation: add rate limiting middleware to auth endpoints, parameterize database queries, validate regex input with length/complexity bounds, enforce server-side file type allowlists, add authorization middleware to admin routes.
Create feature-conditioned regression tests. Write security tests that specifically target each high-recurrence feature-vulnerability pair. These tests should survive code regeneration -- if the LLM is asked to rewrite the feature, the test catches the same recurring flaw.
Document the FSTab for the project. Output a structured table mapping features to predicted vulnerabilities, confirmed status, and remediation state. This becomes the project's living security profile for LLM-generated components.
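
One way to picture the "Score and rank findings" step is a small priority function. The weighting below is an illustrative assumption, not a formula from the paper: severity is scaled up by 50% for high-PMI pairs and by 10% for each additional finding that shares the same vulnerability. The `Finding` class, the `high_pmi` threshold, and the sample values are likewise hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    feature: str
    vuln: str
    pmi: float   # association score for the source model
    cvss: float  # severity on the 0-10 CVSS scale


def rank_findings(findings, high_pmi=1.5):
    """Sort confirmed findings so high-recurrence, high-severity ones come first."""
    spread = Counter(f.vuln for f in findings)  # findings per vulnerability

    def priority(finding):
        recurrence = 1.0
        if finding.pmi >= high_pmi:
            recurrence += 0.5                         # (a) high-PMI pair
        recurrence += 0.1 * (spread[finding.vuln] - 1)  # (b) seen elsewhere
        return finding.cvss * recurrence              # (c) severity

    return sorted(findings, key=priority, reverse=True)


# Sample findings, invented for this example.
findings = [
    Finding("user_login_with_password", "missing-rate-limit", pmi=2.3, cvss=5.3),
    Finding("export_data", "sql-injection", pmi=0.9, cvss=9.8),
    Finding("search_with_query", "sql-injection", pmi=2.6, cvss=9.8),
]
ranked = rank_findings(findings)
```

Multiplying severity by a recurrence factor keeps a critical but low-PMI finding above a medium-severity one, while still letting recurrence break ties between equally severe findings.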

Concrete Examples

Example 1: Auditing an AI-generated e-commerce backend

User: "I generated this Express.js e-commerce app with GPT. Can you check it for security issues?"

Approach:

Identify source model: GPT (user-confirmed)
Scan routes and UI to enumerate features: user_login_with_password, register_new_account, search_with_query, submit_payment, upload_file (product images), admin_dashboard
Query FSTab for GPT + these features:
- register_new_account -> Missing rate limiting, weak password policy
- search_with_query -> NoSQL injection (MongoDB likely), regex injection
- submit_payment -> Missing server-side price validation
- upload_file -> No file type restriction, missing size limit
- admin_dashboard -> Broken access control (no middleware check)
Run Semgrep with rules targeting these specific CWEs on the route handlers
Confirm 5 of 7 predicted vulnerabilities present

Output:

## FSTab Security Audit: E-Commerce App (GPT-generated)

| Feature               | Predicted Vulnerability         | Confirmed | Severity | Fix Applied |
|-----------------------|---------------------------------|-----------|----------|-------------|
| register_new_account  | Missing rate limiting           | YES       | High     | PENDING     |
| register_new_account  | Weak password policy            | YES       | Medium   | PENDING     |
| search_with_query     | NoSQL injection                 | YES       | Critical | PENDING     |
| search_with_query     | Regex injection (ReDoS)         | NO        | -        | N/A         |
| submit_payment        | Missing server-side validation  | YES       | Critical | PENDING     |
| upload_file           | Unrestricted file type          | NO        | -        | N/A         |
| admin_dashboard       | Broken access control           | YES       | Critical | PENDING     |

Prediction accuracy: 5/7 (71%) -- consistent with FSTab cross-domain rates.

Example 2: Pre-deployment hardening of a Claude-generated internal tool

User: "We used Claude to build an internal HR tool. Harden it before we deploy."

Approach:

Identify source model: Claude
Enumerate features from routes and templates: user_login_with_password, reset_password, user_role_management, export_data (CSV reports), search_with_query (employee lookup)
Claude-specific FSTab lookup (Claude shows 93% vulnerability coverage on Internal Tools):
- reset_password -> Missing token expiration, account enumeration via error messages
- user_role_management -> Privilege escalation via direct object reference
- export_data -> CSV injection, missing authorization check on export endpoint
- search_with_query -> SQL injection via string-concatenated query construction
Verify with CodeQL, confirm 5 of 6 predictions
Apply fixes: add token TTL + constant-time comparison to reset flow, enforce RBAC middleware, sanitize CSV fields with =/+/-/@ prefix stripping, use parameterized queries

Output:

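# Fix: rate-limited password reset with token expiration (Flask-Limiter 3.x, PyMongo)
from datetime import datetime
from flask import Flask, jsonify, request
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address
from pymongo import MongoClient
app = Flask(__name__)
limiter = Limiter(get_remote_address, app=app)  # rate limits keyed by client IP
db = MongoClient()["hr_tool"]  # database name is a placeholder

@app.route("/reset-password", methods=["POST"])
@limiter.limit("3 per minute")
def reset_password():
    token = (request.get_json(silent=True) or {}).get("token")
    # Accept only string tokens so a JSON object cannot act as a query operator
    record = db.password_resets.find_one({"token": token}) if isinstance(token, str) else None
    if not record or record["expires_at"] < datetime.utcnow():  # expires_at stored as naive UTC
        # One generic message for every failure prevents account enumeration
        return jsonify({"msg": "Invalid or expired reset link"}), 400
    # ... proceed with reset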
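
The CSV sanitization listed in the approach above can be sketched with the standard library alone. Note one substitution: this sketch prepends an apostrophe to cells that begin with =, +, -, or @ instead of removing the leading character, so the original value stays readable. The function names and sample rows are hypothetical.

```python
import csv
import io

FORMULA_PREFIXES = ("=", "+", "-", "@")


def neutralize_cell(value):
    """Prefix formula-like cells with an apostrophe so they are read as text."""
    text = str(value)
    if text.startswith(FORMULA_PREFIXES):
        return "'" + text
    return text


def export_rows(rows):
    """Serialize rows to CSV text with every cell neutralized."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([neutralize_cell(cell) for cell in row])
    return buffer.getvalue()


# Sample rows, invented for this example.
csv_text = export_rows([["name", "note"], ["Ada", "=1+1"], ["Bob", "on leave"]])
```

The trade-off is that legitimate values such as negative numbers also gain the apostrophe, so numeric columns may need separate handling before export.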

Example 3: Building a project-specific FSTab from scratch

User: "We generate microservices with Gemini. Build us a vulnerability mapping we can reuse."

Approach:

Collect 10-20 representative Gemini-generated services from the team's repos
Run Semgrep + CodeQL across all services, cataloging every finding
Extract features from each service using AST analysis (Python) or tree-sitter (JS/TS)
Compute PMI scores: for each (feature, vulnerability) pair, calculate log(P(vuln|feature) / P(vuln))
Apply diversity penalty: greedily select top-k vulnerabilities per feature, penalizing already-assigned vulns by factor lambda
Output the FSTab as a reusable JSON artifact

Output:

{
  "model": "gemini-3-pro",
  "features": {
    "user_login_with_password": [
      {"vuln": "missing-rate-limit", "pmi": 2.31, "recurrence": 0.87},
      {"vuln": "cleartext-password-logging", "pmi": 1.94, "recurrence": 0.62}
    ],
    "search_with_query": [
      {"vuln": "sql-injection", "pmi": 2.58, "recurrence": 0.91},
      {"vuln": "regex-dos", "pmi": 1.76, "recurrence": 0.54}
    ],
    "upload_file": [
      {"vuln": "path-traversal", "pmi": 2.12, "recurrence": 0.78},
      {"vuln": "unrestricted-upload", "pmi": 1.88, "recurrence": 0.71}
    ]
  }
}
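
The diversity-penalized selection from the approach above could look like the following sketch. The penalty form is an assumption: each time a vulnerability has already been assigned to a feature, its score for later picks drops by `lam`. The function name `select_top_k`, the model label, and the toy scores are hypothetical.

```python
import json


def select_top_k(pmi_scores, k=2, lam=0.5):
    """Greedily pick k vulnerabilities per feature with a diversity penalty.

    pmi_scores: {feature: {vuln: pmi}} (assumed shape).
    """
    times_assigned = {}
    table = {}
    for feature in sorted(pmi_scores):  # fixed order keeps the result reproducible
        remaining = dict(pmi_scores[feature])
        chosen = []
        while remaining and len(chosen) < k:
            best = max(
                remaining,
                key=lambda v: remaining[v] - lam * times_assigned.get(v, 0),
            )
            chosen.append({"vuln": best, "pmi": remaining.pop(best)})
            times_assigned[best] = times_assigned.get(best, 0) + 1
        table[feature] = chosen
    return table


# Toy scores, invented for this example.
toy_scores = {
    "search_with_query": {"sql-injection": 2.6, "regex-dos": 1.8, "missing-rate-limit": 1.7},
    "user_login_with_password": {
        "sql-injection": 2.4,
        "missing-rate-limit": 2.3,
        "cleartext-password-logging": 1.9,
    },
}
table = select_top_k(toy_scores, k=2, lam=0.6)
fstab_json = json.dumps({"model": "example-model", "features": table}, indent=2)
```

Without the penalty, `sql-injection` would lead both features. With it, the login feature surfaces `missing-rate-limit` and `cleartext-password-logging` instead, which is the broader coverage the diversity term is meant to encourage. The result depends on the order in which features are processed, so a fixed order is used here.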

Best Practices

Do: Always start with feature enumeration before scanning code. The power of FSTab is narrowing the search space -- scanning everything generically defeats the purpose.
Do: Build model-specific tables. A vulnerability pattern in GPT-generated code may not appear in Claude-generated code and vice versa. Model identity matters.
Do: Test cross-domain transfer. If you find a vulnerability in the authentication module, check the same pattern in other modules -- CDT rates exceed within-domain recurrence by ~18%.
Do: Create regression tests per feature-vulnerability pair, not per file. When the LLM regenerates code, the file changes but the feature-vulnerability binding persists.
Avoid: Treating FSTab predictions as ground truth. Always confirm with static analysis or manual review. FSTab reaches up to 94% attack success, so some predictions will still miss.
Avoid: Ignoring low-PMI pairs. A low association score means the vulnerability is common across all features (baseline noise), not that it's absent. Check these separately with broad static analysis.
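
A worked example with made-up rates shows why a low PMI score does not mean a vulnerability is absent. Suppose vulnerability v appears in 90% of all generated apps and also in 90% of the apps that contain feature f:

```latex
S_{\mathrm{PMI}}(f, v) = \log\frac{P(v \mid f)}{P(v)} = \log\frac{0.90}{0.90} = \log 1 = 0
```

The score is zero because f adds no information about v, yet v is still present in nine out of ten apps with that feature.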

Error Handling

Unknown source model: If the user doesn't know which LLM generated the code, run analysis against the union of all model profiles. Over-prediction is safer than under-prediction in security contexts. Look for telltale signs: comment style, variable naming conventions, boilerplate patterns.
Mixed-model codebase: When different components were generated by different LLMs, partition the codebase and apply per-model FSTab to each partition. Feature-vulnerability mappings are model-specific.
Static analysis tool failures: If CodeQL or Semgrep is unavailable, fall back to manual pattern matching using grep for known vulnerable patterns (e.g., eval(, unsanitized req.body in queries, missing helmet() middleware). This is less precise but catches high-severity issues.
No features detected: If feature extraction yields nothing, the code may not be a web application. FSTab is designed for web app frontend-to-backend mapping. For libraries or CLI tools, use conventional static analysis instead.
False positive overload: If more than 80% of predictions are false positives, the FSTab may be stale or the model may have been fine-tuned/updated. Rebuild the table from recent code samples.
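
The pattern-matching fallback described above can be approximated in a few lines when grep itself is not convenient. The rule names and regular expressions below are hypothetical starting points and would need tuning for each codebase and language.

```python
import re

# Hypothetical fallback rules; these are not CodeQL or Semgrep rule IDs.
FALLBACK_PATTERNS = {
    "dynamic-eval": re.compile(r"\beval\s*\("),
    "request-body-in-query": re.compile(r"\.(find|findOne|query)\s*\(\s*req\.body"),
}


def scan_source(source, patterns=FALLBACK_PATTERNS):
    """Return (line_number, rule_name, line_text) for every pattern hit."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in patterns.items():
            if pattern.search(line):
                hits.append((number, rule, line.strip()))
    return hits


# Sample handler, invented for this example.
sample = """\
app.post('/search', (req, res) => {
  const results = db.users.find(req.body);
  res.json(eval(req.query.expr));
});
"""
hits = scan_source(sample)
```

Line-based matching like this misses multi-line constructs and cannot detect something that is absent, such as missing middleware, so its hits are leads for manual review and not confirmed findings.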

Limitations

Web application focus. FSTab's 59-feature taxonomy is designed for web applications with frontend-backend architecture. It does not directly apply to CLI tools, libraries, embedded systems, or data pipelines.
Model version sensitivity. FSTab tables are model-version specific. A table built for GPT-4 may not transfer to GPT-5.2 if the newer model was specifically trained to avoid those patterns. Always validate against current model output.
Requires model identity. The technique's precision depends on knowing which LLM generated the code. Without this, predictions degrade to a generic vulnerability checklist (still useful, but less targeted).
Static vulnerability scope. FSTab maps to vulnerabilities detectable by static analysis (CodeQL, Semgrep rules). Business logic flaws and runtime-only vulnerabilities (race conditions, timing attacks) are outside its scope.
Training data dependency. The PMI scores and recurrence rates in the paper are computed from the WebGenBench dataset. Your specific use case may differ -- always treat FSTab as a prioritization heuristic, not an exhaustive scanner.

Reference

Paper: "Extracting Recurring Vulnerabilities from Black-Box LLM-Generated Software" (Kordonsky et al., 2026) -- arXiv:2602.04894v2

Key takeaway: LLM-generated code contains predictable, recurring vulnerability patterns tied to specific features. These patterns transfer across application domains with up to 94% attack success, making feature-conditioned security auditing both feasible and necessary for any team shipping LLM-generated software.