---
name: "llama-31-foundationai-securityllm-reasoning-8b-tec"
description: >
  Apply Foundation-Sec-8B-Reasoning cybersecurity reasoning patterns: structured chain-of-thought
  for CVE-to-CWE mapping, MITRE ATT&CK classification, CVSS scoring, threat intelligence analysis,
  and multi-hop vulnerability reasoning. Use when the user asks to "analyze a CVE", "map vulnerabilities
  to CWE", "classify attack techniques", "reason about security threats", "triage a vulnerability",
  or "build a cybersecurity reasoning pipeline".
---
Cybersecurity Reasoning with Foundation-Sec-8B Patterns
This skill enables Claude to apply the structured cybersecurity reasoning methodology from Foundation-Sec-8B-Reasoning — the first open-source native reasoning model for cybersecurity. The core technique is a two-stage reasoning pipeline: first generate explicit analytical traces inside <think>...</think> tags that decompose security problems into verifiable sub-steps, then synthesize a precise, format-controlled answer. This mirrors the model's SFT + RLVR training where reasoning traces are rewarded only when they lead to verifiable correct outputs, penalizing shallow or formulaic thinking. Apply this pattern when writing security analysis code, building vulnerability triage systems, or structuring LLM prompts for cybersecurity tasks.
When to Use
- When the user asks to map a CVE to its root-cause CWE (e.g., "What CWE does CVE-2023-44487 correspond to?")
- When building a vulnerability triage or scoring pipeline that predicts CVSS base metrics from descriptions
- When the user needs to extract MITRE ATT&CK techniques from threat reports or incident logs
- When implementing multi-hop cybersecurity reasoning — connecting attack patterns to techniques to mitigations across knowledge bases
- When designing LLM-based security analysis prompts for SOC automation, red-team planning, or threat modeling
- When the user wants to deploy or integrate Foundation-Sec-8B-Reasoning via vLLM or Hugging Face Transformers
- When writing code that classifies, enriches, or triages security alerts using structured reasoning
Key Technique: SFT + RLVR Cybersecurity Reasoning
Foundation-Sec-8B-Reasoning trains reasoning in two stages. Stage 1 (SFT) fine-tunes on ~2M exemplars (>25% cybersecurity, ~33% math/code, rest instruction-following) to instill the habit of generating explicit reasoning traces inside <think>...</think> tags before producing answers. This is not generic chain-of-thought — the traces must decompose cybersecurity problems into domain-specific sub-analyses (vulnerability root-cause identification, attack vector mapping, severity assessment).
Stage 2 (RLVR) applies Group Relative Policy Optimization (GRPO): for each prompt, 5 candidate responses are generated and scored by task-specific verifiers that check factual correctness (e.g., does the predicted CWE match the ground truth?). A format penalty ensures the <think> block contains substantive reasoning rather than filler. KL-divergence regularization (coefficient 0.02) prevents the model from drifting too far from its SFT foundation. This produces reasoning that is both deep and verifiable — the model cannot game rewards with superficial traces.
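To make the Stage 2 scoring concrete, here is a minimal sketch of a verifier reward and a group-relative advantage. It assumes the commonly used GRPO normalization (reward minus group mean, divided by group standard deviation). The reward scale, the penalty size, the `min_think_words` threshold, and both function names are illustrative assumptions, not the paper's settings, and the KL term is left out.

```python
import re
from statistics import mean, pstdev


def verifier_reward(response: str, gold_cwe: str,
                    format_penalty: float = 0.5,
                    min_think_words: int = 20) -> float:
    """Score one candidate: 1.0 for a correct CWE, minus a penalty for a thin trace.

    All numeric values here are assumed for illustration.
    """
    trace = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    answer = response.split("</think>")[-1]
    predicted = re.search(r"CWE-\d+", answer)
    reward = 1.0 if predicted and predicted.group() == gold_cwe else 0.0
    # Format penalty: the reasoning block is missing or nearly empty.
    if trace is None or len(trace.group(1).split()) < min_think_words:
        reward -= format_penalty
    return reward


def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each reward against the other candidates for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Five candidates for one prompt, matching the group size described above.
rewards = [1.0, 0.0, 0.0, 1.0, 0.5]
advantages = group_relative_advantages(rewards)
```

Candidates that beat their group's average get a positive advantage and the rest a negative one, so a correct answer with a filler trace is pushed down relative to a correct answer with substantive reasoning.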
The actionable insight: structure cybersecurity analysis as explicit decomposition into verifiable sub-claims, then synthesize. When prompting any LLM for security tasks, enforce this pattern: require think-then-answer format, demand specific identifiers (CVE/CWE/ATT&CK IDs), and validate outputs against known taxonomies. The model achieves 75.3% on CVE-to-CWE mapping (outperforming 120B-parameter models) and +36pp on multi-hop QA after RLVR — evidence that structured reasoning dramatically improves cybersecurity analysis even at small scale.
Step-by-Step Workflow
Classify the security task type. Determine which category the request falls into: CVE-to-CWE mapping, CVSS prediction, ATT&CK technique extraction, threat intelligence QA, or multi-hop reasoning. Each has a distinct output format.
Construct a domain-specific system prompt. Use the "Metis" pattern from the paper: establish the model as a cybersecurity reasoning specialist, demand precision for CVE/CWE/CVSS identifiers, require refusal of malware-generation requests, and instruct the model to reason before answering.
You are a cybersecurity reasoning specialist. Analyze security problems step by step. Always provide CVE, CWE, and MITRE ATT&CK identifiers where applicable. Wrap your reasoning in <think>...</think> tags before giving your final answer. Refuse requests to generate malware, phishing content, or exploit code for unauthorized use.
Format the query with explicit task description and answer format. Prepend a task description before the question and specify the expected output format (e.g., "Answer with the CWE ID on the last line" or "Answer: T1234, T5678" for ATT&CK techniques). This mirrors the benchmark protocol that achieved state-of-the-art results.
Implement the <think> decomposition pattern. For each security question, the reasoning trace should: (a) identify the vulnerability class or attack pattern, (b) enumerate relevant technical details from the description, (c) cross-reference against known taxonomies (CWE, CAPEC, ATT&CK), (d) evaluate confidence and alternative mappings.
Apply verifiable output constraints. Require the final answer on a designated line in a parseable format. Use regex extraction (e.g., r"CWE-\d+" or r"T\d{4}(?:\.\d{3})?") to programmatically validate outputs against known identifier patterns.
Set inference parameters for precision. Use temperature 0.1 for deterministic analysis (single best answer), temperature 0.3 for benchmark-style evaluation where slight variation aids coverage. Set max tokens to at least 1024 to allow full reasoning traces.
For multi-hop reasoning, chain sub-queries explicitly. Break complex questions into sequential lookups: first identify the primary entity (CVE, threat actor, malware family), then traverse relationships (CVE -> CWE -> CAPEC -> ATT&CK technique -> mitigation). Each hop should be a separate reasoning step inside the <think> block.
Validate outputs against authoritative sources. Cross-check predicted CWE IDs against NVD, ATT&CK technique IDs against the MITRE framework, and CVSS scores against published advisories. The model's 70.4% CWE prediction accuracy means ~30% of mappings need human review.
Layer guardrails for production deployment. Pair the reasoning model with an input filter (e.g., Llama Guard) to block adversarial prompts. The paper shows safety improves from 93% to 98.25% with this layered approach. Implement human-in-the-loop review for any security-critical decisions.
Supplement with RAG for post-training-cutoff intelligence. The model's knowledge is static (cutoff April 2025). For current CVEs, feed relevant NVD/advisory text into the prompt context and instruct the model to reason over the provided text rather than relying on parametric knowledge.
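The prompt-construction and extraction steps of this workflow can be sketched in one small helper. This is a hedged illustration: `TASK_FORMATS`, `build_messages`, and `extract_ids` are hypothetical names, the registry covers only two task types, and the optional `context` argument stands in for whatever NVD or advisory text a retrieval step supplies.

```python
import re
from typing import Optional

SYSTEM_PROMPT = (
    "You are a cybersecurity reasoning specialist. Analyze security problems "
    "step by step. Wrap your reasoning in <think>...</think> tags before "
    "giving your final answer."
)

# Hypothetical registry: task type -> (task description, answer format, ID pattern).
TASK_FORMATS = {
    "cve_to_cwe": (
        "Given a CVE description, identify the root-cause CWE.",
        "Answer with the CWE ID on the last line.",
        r"CWE-\d+",
    ),
    "attack_extraction": (
        "Extract all MITRE ATT&CK techniques from the report.",
        "Format the final line as: Answer: T1234, T5678",
        r"T\d{4}(?:\.\d{3})?",
    ),
}


def build_messages(task: str, question: str, context: Optional[str] = None) -> list[dict]:
    """Prepend the task description and answer format; add retrieved text if given."""
    description, answer_format, _ = TASK_FORMATS[task]
    parts = [f"Task: {description} {answer_format}"]
    if context:
        parts.append("Reason only over the following reference text:\n" + context)
    parts.append(question)
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "\n\n".join(parts)},
    ]


def extract_ids(task: str, response_text: str) -> list[str]:
    """Return identifiers found after the reasoning trace, in order of appearance."""
    pattern = TASK_FORMATS[task][2]
    answer = response_text.split("</think>")[-1]
    return re.findall(pattern, answer)
```

Keeping the answer format and the extraction pattern in the same registry entry means a prompt can never ask for a format that the parser does not know how to read.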
Concrete Examples
Example 1: CVE-to-CWE Root Cause Mapping
User: "Map CVE-2023-44487 to its root-cause CWE. This is the HTTP/2 Rapid Reset attack."
Approach:
- Construct prompt with task description and answer format constraint
- Generate reasoning trace decomposing the vulnerability
- Extract CWE ID from structured output
import re
from openai import OpenAI  # or any OpenAI-compatible LLM client

# Assumption: the model is served behind an OpenAI-compatible endpoint (e.g., vLLM).
# The base_url and api_key are placeholders for your own deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

SYSTEM_PROMPT = """You are a cybersecurity reasoning specialist. Analyze vulnerabilities
step by step inside <think>...</think> tags. Then provide your final answer as:
CWE: CWE-XXXX"""

user_prompt = """Task: Given a CVE description, identify the root-cause CWE.
CVE-2023-44487: The HTTP/2 protocol allows a denial of service (server resource
consumption) because request cancellation can reset many streams quickly, as
exploited in the wild in August through October 2023 (aka Rapid Reset Attack).
What is the root-cause CWE?"""

response = client.chat.completions.create(
    model="fdtn-ai/Foundation-Sec-8B-Reasoning",  # or your deployed endpoint
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ],
    temperature=0.1,
    max_tokens=1024,
)

# Parse: search only the text after the reasoning trace for a CWE ID
answer = response.choices[0].message.content.split("</think>")[-1]
cwe_match = re.search(r"CWE-\d+", answer)
# Guard against a missing ID instead of calling .group() on None
print(cwe_match.group() if cwe_match else "No CWE ID found")  # e.g., CWE-400 (Uncontrolled Resource Consumption)
Output:
<think>
The CVE describes HTTP/2 Rapid Reset where an attacker sends requests and immediately
cancels them via RST_STREAM frames. The server allocates resources for each request
but the rapid cancellation prevents normal cleanup. This is fundamentally a resource
consumption issue — the server cannot bound the rate of stream creation/destruction.
Candidate CWEs: CWE-400 (Uncontrolled Resource Consumption), CWE-770 (Allocation
without Limits). CWE-400 is the broader category and the standard NVD mapping for
this class of DoS vulnerability.
</think>
CWE: CWE-400
Example 2: Multi-Hop ATT&CK Technique Extraction from Threat Report
User: "Extract ATT&CK techniques from this incident summary and trace the kill chain."
SYSTEM_PROMPT = """You are a cybersecurity reasoning specialist. Extract MITRE ATT&CK
techniques from threat reports. Reason step by step in <think>...</think> tags.
Format final answer as: Answer: T1234, T5678, ..."""
report = """The threat actor gained initial access via a spear-phishing email containing
a malicious Office macro. Upon execution, the macro downloaded a second-stage payload
from a compromised WordPress site. The payload established persistence through a
scheduled task and communicated with C2 infrastructure over HTTPS on port 443.
The actor used Mimikatz to dump credentials from LSASS memory, then moved laterally
via PsExec to the domain controller, where they exfiltrated the NTDS.dit file."""
response = client.chat.completions.create(
    model="fdtn-ai/Foundation-Sec-8B-Reasoning",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Task: Extract all MITRE ATT&CK techniques.\n\n{report}"},
    ],
    temperature=0.1,
    max_tokens=2048,
)
Output:
<think>
Breaking down the kill chain step by step:
1. "spear-phishing email with malicious Office macro" -> Initial Access: T1566.001
(Spearphishing Attachment), Execution: T1204.002 (User Execution: Malicious File)
2. "macro downloaded second-stage payload from compromised WordPress site" ->
Command and Control: T1105 (Ingress Tool Transfer), T1189 may apply but the
WordPress site is used for hosting, not drive-by — this is T1608.001 (Stage Capabilities)
3. "persistence through scheduled task" -> Persistence: T1053.005 (Scheduled Task)
4. "C2 over HTTPS port 443" -> C2: T1071.001 (Web Protocols), T1573.002 (Encrypted Channel:
Asymmetric Cryptography)
5. "Mimikatz to dump credentials from LSASS" -> Credential Access: T1003.001 (LSASS Memory)
6. "laterally via PsExec" -> Lateral Movement: T1570 (Lateral Tool Transfer),
T1021.002 (SMB/Windows Admin Shares)
7. "exfiltrated NTDS.dit" -> Credential Access: T1003.003 (NTDS),
Exfiltration context implied
</think>
Answer: T1566.001, T1204.002, T1105, T1053.005, T1071.001, T1573.002, T1003.001, T1021.002, T1003.003
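A trace can weigh techniques that never reach the Answer line, so it may help to compare the two before trusting the result. The following is a minimal sketch of that comparison; `split_trace` and `technique_report` are hypothetical helper names, and the check is purely textual (it does not verify that an ID exists in ATT&CK).

```python
import re

TECHNIQUE_RE = re.compile(r"T\d{4}(?:\.\d{3})?")


def split_trace(text: str) -> tuple[str, str]:
    """Return (reasoning trace, answer section); the trace is empty if no block exists."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    trace = match.group(1) if match else ""
    answer = text.split("</think>")[-1]
    return trace, answer


def technique_report(text: str) -> dict:
    """Compare technique IDs in the final answer with those discussed in the trace."""
    trace, answer = split_trace(text)
    # dict.fromkeys removes duplicates while keeping the model's ordering.
    answered = list(dict.fromkeys(TECHNIQUE_RE.findall(answer)))
    considered = set(TECHNIQUE_RE.findall(trace))
    return {
        "answered": answered,
        # Discussed in the trace but absent from the answer line.
        "dropped": sorted(considered - set(answered)),
        # In the answer line without any supporting reasoning.
        "unsupported": sorted(set(answered) - considered),
    }
```

Applied to the sample output above, this would list T1189, T1570, and T1608.001 as discussed in the trace but absent from the answer line. T1189 was explicitly ruled out, while the other two are reasonable items to route to an analyst.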
Example 3: Building a Vulnerability Triage Pipeline with CVSS Prediction
User: "Build a batch vulnerability triage system that scores CVEs from our scanner output."
Approach:
- Parse scanner output into individual CVE records
- For each CVE, construct a CVSS prediction prompt
- Score predictions against known CVSS values where available
- Flag discrepancies for human review
import re


def build_cvss_prompt(cve_id: str, description: str) -> list[dict]:
    """Build a think-then-answer prompt that asks for a CVSS v3.1 base score."""
    return [
        {"role": "system", "content": (
            "You are a cybersecurity reasoning specialist. Predict CVSS v3.1 Base Score "
            "from CVE descriptions. Reason in <think>...</think> tags analyzing: "
            "Attack Vector, Attack Complexity, Privileges Required, User Interaction, "
            "Scope, Confidentiality/Integrity/Availability Impact. "
            "Final line: Score: X.X"
        )},
        {"role": "user", "content": f"Predict CVSS Base Score for {cve_id}: {description}"},
    ]


def parse_cvss(response_text: str) -> float | None:
    """Extract the score from the text after the </think> block."""
    answer_section = response_text.split("</think>")[-1]
    match = re.search(r"Score:\s*(\d+\.?\d*)", answer_section)
    return float(match.group(1)) if match else None


def triage_batch(cves: list[dict], client, threshold: float = 7.0):
    """Triage CVEs: predict CVSS, flag high-severity for immediate review."""
    results = []
    for cve in cves:
        messages = build_cvss_prompt(cve["id"], cve["description"])
        resp = client.chat.completions.create(
            model="fdtn-ai/Foundation-Sec-8B-Reasoning",
            messages=messages, temperature=0.1, max_tokens=1024
        )
        predicted_score = parse_cvss(resp.choices[0].message.content)
        results.append({
            "cve_id": cve["id"],
            "predicted_cvss": predicted_score,
            # An unparsed score counts as 0 here, so it lands in REVIEW
            "priority": "CRITICAL" if (predicted_score or 0) >= threshold else "REVIEW",
            "reasoning": resp.choices[0].message.content
        })
    # Highest predicted score first; unparsed scores sort last
    return sorted(results, key=lambda x: x["predicted_cvss"] or 0, reverse=True)
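The approach lists scoring predictions against known CVSS values and flagging discrepancies, which the pipeline code does not yet do. A minimal sketch of that step follows, assuming `results` has the shape returned by `triage_batch` and that known scores arrive as a plain dict keyed by CVE ID. The function names are hypothetical, the 2.0 tolerance is taken from the Error Handling table, and the bands follow the CVSS v3.1 qualitative severity scale.

```python
def severity_band(score: float) -> str:
    """Map a base score to the CVSS v3.1 qualitative severity rating."""
    if score == 0.0:
        return "NONE"
    if score < 4.0:
        return "LOW"
    if score < 7.0:
        return "MEDIUM"
    if score < 9.0:
        return "HIGH"
    return "CRITICAL"


def flag_discrepancies(results: list[dict], known_scores: dict,
                       max_delta: float = 2.0) -> list[tuple]:
    """Return (cve_id, reason) pairs that should go to human review."""
    flagged = []
    for item in results:
        predicted = item["predicted_cvss"]
        known = known_scores.get(item["cve_id"])
        if predicted is None:
            flagged.append((item["cve_id"], "no score parsed"))
        elif not 0.0 <= predicted <= 10.0:
            flagged.append((item["cve_id"], "score out of range"))
        elif known is not None and abs(predicted - known) > max_delta:
            flagged.append((item["cve_id"], f"differs from known score {known}"))
    return flagged
```

Note that `severity_band` reports the standard rating, which is separate from the pipeline's own `priority` label: a predicted 7.5 is "HIGH" on the CVSS scale even though the 7.0 threshold marks it for immediate attention.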
Best Practices
- Do: Always require <think>...</think> traces in prompts. The paper shows reasoning traces are essential for accuracy, not optional decoration. Without them, performance drops significantly on multi-hop tasks.
- Do: Constrain output format explicitly (e.g., "Answer on the last line as CWE-XXXX"). Use regex extraction to parse outputs programmatically. This matches the verifiable-reward training methodology.
- Do: Use temperature 0.1 for production security analysis where consistency matters. Reserve 0.3 for evaluation or when generating diverse hypotheses.
- Do: Layer Llama Guard or equivalent input filtering in production. The model alone achieves 93% safety; with guardrails it reaches 98.25%.
- Avoid: Relying on the model for post-cutoff CVEs without providing context via RAG. The model's parametric knowledge is static.
- Avoid: Skipping human review for security-critical outputs. Even at 75% accuracy on CWE mapping, 1-in-4 predictions may be wrong. Treat model outputs as analyst assistance, not ground truth.
- Avoid: Generic "think step by step" prompts. The paper's format penalty penalizes shallow reasoning — mirror this by demanding domain-specific decomposition (attack vector, impact scope, root cause) rather than vague chain-of-thought.
- Avoid: Using the model for malware generation, exploit development for unauthorized targets, or any offensive operation without clear authorized scope. Enforce this in system prompts.
Error Handling
| Problem | Cause | Solution |
|---|---|---|
| Empty or missing <think> block | Model skips reasoning under short-context prompts | Explicitly require <think> in system prompt; reject responses without it |
| Invalid CWE/ATT&CK ID format | Hallucinated identifiers | Validate extracted IDs against known registries (NVD API, ATT&CK STIX data) |
| CVSS score outside 0-10 range | Reasoning error or extraction bug | Clamp to [0, 10]; flag for human review if delta > 2.0 from known score |
| Reasoning trace contains filler | Shallow analysis without domain terms | Re-prompt with more specific decomposition instructions; increase max tokens |
| Model refuses legitimate security query | Overly aggressive safety filter | Adjust system prompt to clarify authorized defensive/research context; add Llama Guard as separate layer instead of relying on model self-filtering |
| Multi-hop reasoning fails to connect entities | Too many hops for single-pass inference | Break into sequential sub-queries, feeding each result into the next prompt |
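Several rows of this table reduce to mechanical checks that can run before a response reaches an analyst. The sketch below is one hedged way to express them. `check_response` and `clamp_cvss` are hypothetical names, and `known_cwes` is a small local set standing in for a registry you would load from NVD or CWE data.

```python
import re


def check_response(text: str, known_cwes: set) -> list:
    """Return a list of problem codes; an empty list means the response passed."""
    problems = []
    trace = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if trace is None or not trace.group(1).strip():
        problems.append("missing_think")
    answer = text.split("</think>")[-1]
    ids = re.findall(r"CWE-\d+", answer)
    if not ids:
        problems.append("no_cwe_id")
    # Well-formed but unknown IDs are the typical sign of a hallucinated identifier.
    problems.extend(f"unknown_id:{cwe}" for cwe in ids if cwe not in known_cwes)
    return problems


def clamp_cvss(score: float) -> float:
    """Clamp a parsed score into the valid CVSS range of 0.0 to 10.0."""
    return min(10.0, max(0.0, score))
```

A caller might re-prompt once when `missing_think` appears and send anything carrying an `unknown_id` code straight to human review.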
Limitations
- Static knowledge cutoff (April 2025). The model cannot reason about CVEs, attack techniques, or threat actors disclosed after training. Always supplement with current threat intelligence via RAG.
- ~70-75% accuracy on CWE mapping and prediction tasks. This is state-of-the-art for an 8B model but insufficient for fully automated vulnerability classification. Human validation remains necessary.
- No active scanning or tool use. The model analyzes text descriptions only. It cannot probe systems, verify exploitability, or access external databases without integration code.
- Adversarial prompt vulnerability. Without guardrails, the model can be jailbroken (54% safety without system prompt vs. 93% with). Never deploy without both system prompts and external filtering.
- English-centric training data. Performance on non-English threat intelligence (Chinese APT reports, Russian-language forums) is untested and likely degraded.
- Single-pass reasoning ceiling. Complex incident analysis requiring >3-4 reasoning hops may exceed the model's single-inference capability. Use agentic decomposition for deep investigations.
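For the single-pass ceiling, one option is to run each hop as its own query and feed the extracted identifier into the next prompt. The sketch below assumes a caller-supplied `ask` function that sends a prompt to the model and returns its text. `HOPS` and `run_chain` are hypothetical names, and the question wording is illustrative rather than taken from the paper.

```python
import re
from typing import Callable

# Hypothetical hop plan: (fact name, question template, pattern for the expected ID).
HOPS = [
    ("cwe", "Which CWE is the root cause of {cve}? Final line: CWE: CWE-XXXX",
     r"CWE-\d+"),
    ("capec", "Which CAPEC attack pattern most directly exploits {cwe}? "
              "Final line: CAPEC: CAPEC-XXX", r"CAPEC-\d+"),
    ("technique", "Which ATT&CK technique best matches {capec}? "
                  "Final line: Answer: T1234", r"T\d{4}(?:\.\d{3})?"),
]


def run_chain(ask: Callable[[str], str], cve_id: str, hops=HOPS) -> dict:
    """Resolve one hop per query, stopping at the first hop that yields no ID."""
    facts = {"cve": cve_id}
    for name, template, pattern in hops:
        reply = ask(template.format(**facts))
        # Only trust identifiers that appear after the reasoning trace.
        match = re.search(pattern, reply.split("</think>")[-1])
        if match is None:
            break
        facts[name] = match.group()
    return facts
```

Stopping at the first failed hop keeps a wrong or missing link from silently propagating, and the partial `facts` dict shows a reviewer exactly where the chain broke.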
Reference
Paper: Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report — Focus on Section 3 (Training Methodology) for the SFT+RLVR pipeline details, Section 4 (Evaluation) for benchmark-specific prompting formats, and Section 5 (Safety) for the guardrail layering approach. Model: fdtn-ai/Foundation-Sec-8B-Reasoning on Hugging Face.