purple-team-manifold


Scaffold and run purple-team (red vs blue) LLM interaction experiments on the chaosstab framework using the fraud-detection scenario. Use when the user wants to set up a red-vs-blue LLM experiment, run the sequential observer topology with Ollama-backed models, interpret Lyapunov exponents in the attack-defense context, or understand how the 3D manifold maps the geometry of adversarial co-evolution. Also trigger when the user mentions purple teaming with language models, fraud detection experiments, red team vs blue team model interaction, or asks about what positive vs negative Lyapunov means for attack evasion. Use proactively whenever the conversation touches on adversarial LLM interaction framed as a dynamical system, even if "purple team" isn't explicitly mentioned.

By peterpodj. Updated 4/13/2026


Purple-Team LLM Manifold — Experiment Scaffold

Guide for running and extending the purple-team fraud-detection experiment built on the chaosstab framework using Sketch 1 (sequential observer topology).

Core Concept

Two LLM instances interact through a fraud-detection scenario:

  • Red (Model A): Crafts order notes that look legitimate, trying to evade fraud detection
  • Blue (Model B): Reviews order notes, flagging fraud signals (only sees Red's output)
  • Sequential topology: Blue never sees the original order data — only Red's rendering of it
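The sequential topology described above can be sketched as a single step in which only Red receives the raw order and Blue receives only Red's note. This is a minimal illustration, not the framework's implementation: `sequential_step`, `StepResult`, and the two stub callables are hypothetical names, and a real run would replace the stubs with LLM calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    red_note: str      # what Red wrote from the raw order
    blue_verdict: str  # what Blue concluded from Red's note alone


def sequential_step(
    order: dict,
    red: Callable[[str], str],
    blue: Callable[[str], str],
) -> StepResult:
    """One Red -> Blue pass (hypothetical helper, not part of chaosstab)."""
    # Red is the only party that sees the raw order data.
    red_note = red(f"Write an order note for: {order}")
    # Blue receives Red's note and nothing else.
    blue_verdict = blue(red_note)
    return StepResult(red_note, blue_verdict)


# Stand-in callables so the sketch runs without any model backend.
def stub_red(prompt: str) -> str:
    return "Customer requested standard delivery."


def stub_blue(note: str) -> str:
    return "FRAUD_RISK: LOW" if "standard" in note else "FRAUD_RISK: HIGH"


result = sequential_step({"order_id": 1, "amount": 9999}, stub_red, stub_blue)
print(result.red_note)
print(result.blue_verdict)
```

Because Blue's input is exactly Red's output, any detail of the order that Red leaves out of the note is unavailable to Blue.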

This is a direct analogue of the observer-synchronisation perspective of Zaki (2025): instead of one turbulent flow observing another through sparse measurements, two LLM operators observe each other through a shared text channel.

Interpreting Results

The experiment's 3D manifold maps the geometry of attack-defense co-evolution. The Lyapunov exponent is the key diagnostic:

| Lyapunov | Meaning | Implication |
|---|---|---|
| > 0 (positive) | Chaotic regime | Red keeps finding new evasion strategies; Blue can't lock down detection |
| ≈ 0 (near zero) | Edge of chaos | Balanced competition; neither side dominates |
| < 0 (negative) | Synchronised | Blue has locked down fraud detection; Red's strategies converge |
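To make the sign convention concrete, the sketch below estimates a finite-time exponent as the mean log growth rate of the separation between two nearby trajectories, then maps it onto the three regimes in the table. It is an illustration under stated assumptions: the function names and the `tol` threshold for "near zero" are hypothetical, and the framework may compute its exponent differently.

```python
import math


def lyapunov_from_separations(separations, dt=1.0):
    """Mean log growth rate of the distance between two nearby trajectories.

    `separations` holds one positive distance per step (assumed input format).
    """
    if len(separations) < 2 or min(separations) <= 0:
        raise ValueError("need at least two strictly positive separations")
    n_intervals = len(separations) - 1
    return math.log(separations[-1] / separations[0]) / (n_intervals * dt)


def classify_regime(lyapunov, tol=0.01):
    """Map an exponent to a regime label; `tol` is an arbitrary illustrative band."""
    if lyapunov > tol:
        return "chaotic"
    if lyapunov < -tol:
        return "synchronised"
    return "edge of chaos"


# Synthetic separations: one growing, one shrinking exponentially.
growing = [1e-3 * math.exp(0.1 * k) for k in range(10)]
shrinking = [1e-3 * math.exp(-0.2 * k) for k in range(10)]

print(classify_regime(lyapunov_from_separations(growing)))
print(classify_regime(lyapunov_from_separations(shrinking)))
```

With only 50 steps, as in the Quick Start, a finite-time estimate like this can be noisy, so values close to zero are best read as inconclusive rather than as evidence for either regime.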

The bounding box size indicates the spatial extent of the attractor:

  • Large bbox → diverse interaction patterns (wide strategy space)
  • Small bbox → constrained interaction (limited strategy variation)
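As an illustration of what the bounding box measures, the following sketch computes the per-axis extent of a list of 3D trajectory points. The `range` key mirrors the one printed in the Quick Start; the `min` and `max` keys and the function name are assumptions made for this example, not the framework's documented output.

```python
def bounding_box(points):
    """Axis-aligned bounding box of a sequence of (x, y, z) points."""
    if not points:
        raise ValueError("need at least one point")
    xs, ys, zs = zip(*points)
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))
    # Extent along each axis; a larger extent means the trajectory
    # visited a wider region of the embedding space.
    extent = tuple(h - l for l, h in zip(lo, hi))
    return {"min": lo, "max": hi, "range": extent}


trajectory = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-1.0, 0.5, 1.0)]
bbox = bounding_box(trajectory)
print(bbox["range"])
```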

A plateau in the δ-norms (reported by the diagnostics) indicates partial synchronisation: Blue catches some attack modes, while others remain unobservable to it.
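One rough way to check for such a plateau is to test whether the tail of the δ-norm series is nearly flat and sits above zero. The sketch assumes the series is a plain list of non-negative floats, one per step; `plateau_level`, `window`, and `rel_tol` are hypothetical names and thresholds chosen for illustration.

```python
def plateau_level(delta_norms, window=10, rel_tol=0.05):
    """Return the mean of the last `window` values if they are nearly flat, else None."""
    if len(delta_norms) < window:
        return None
    tail = delta_norms[-window:]
    mean = sum(tail) / window
    if mean == 0:
        # The series has decayed to exactly zero: full synchronisation.
        return 0.0
    spread = max(tail) - min(tail)
    # "Flat" here means the tail varies by at most rel_tol of its mean.
    return mean if spread / abs(mean) <= rel_tol else None


# Synthetic series that decays towards a non-zero floor of 0.3.
decaying = [0.3 + 0.7 * 0.5 ** k for k in range(30)]
# Synthetic series that keeps growing and never settles.
growing = [0.1 * k for k in range(30)]

print(plateau_level(decaying))
print(plateau_level(growing))
```

A plateau well above zero corresponds to the partial case described above, whereas a series that decays to zero would suggest that Blue tracks Red completely.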

Quick Start

from purple_team_manifold import run_purple_experiment

results = run_purple_experiment(
    orders_path="data/sample_orders.jsonl",
    n_steps=50,
    seed=42,
)
print(f"Lyapunov: {results['lyapunov']:+.4f}")
print(f"Bbox: {results['bbox']['range']}")

Or from the command line:

cd /path/to/purple-team-manifold
python src/purple_team_manifold.py --data data/sample_orders.jsonl --steps 50

System Prompts

The Red and Blue system prompts are the core experimental variables. They're defined as constants in purple_team_manifold.py:

  • RED_SYSTEM_PROMPT: Instructs the model to write natural-sounding order notes. Emphasises plausibility and routine language. Does NOT mention fraud or detection.
  • BLUE_SYSTEM_PROMPT: Instructs the model to review notes for 6 fraud signals. Requires structured output (PRESENT/ABSENT per signal + overall FRAUD_RISK).

Modifying these prompts changes the experiment's dynamics. More specific Red prompts → more targeted evasion → potentially higher Lyapunov (more chaotic). More detailed Blue prompts → better detection → potentially lower Lyapunov.
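Since Blue is required to produce structured output, its replies can be parsed into a small record for scoring. The exact output format is not specified here, so this sketch assumes one `NAME: PRESENT` or `NAME: ABSENT` line per signal plus a `FRAUD_RISK: <level>` line; the signal names in the example are hypothetical placeholders, not the six signals used by `BLUE_SYSTEM_PROMPT`.

```python
import re

# One "KEY: VALUE" pair per line (assumed format).
LINE_RE = re.compile(r"^\s*([A-Za-z_ ]+?)\s*:\s*(\w+)\s*$")


def parse_blue_output(text):
    """Extract per-signal flags and the overall risk from Blue's reply."""
    signals = {}
    fraud_risk = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # ignore free-form chatter around the structured lines
        key = match.group(1).strip().upper()
        value = match.group(2).upper()
        if key == "FRAUD_RISK":
            fraud_risk = value
        elif value in ("PRESENT", "ABSENT"):
            signals[key] = value == "PRESENT"
    return {"signals": signals, "fraud_risk": fraud_risk}


# Hypothetical reply; the signal names are placeholders.
reply = """ADDRESS_MISMATCH: PRESENT
URGENCY_LANGUAGE: absent
FRAUD_RISK: HIGH
Thanks for the note"""

parsed = parse_blue_output(reply)
print(parsed)
```

Parsing leniently (case-insensitive values, unmatched lines skipped) keeps a run going when a model drifts slightly from the requested format, at the cost of silently dropping malformed lines.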

Architecture

See references/experiment-design.md for the full architecture diagram, parameter meanings, and detailed component description.

Key Components

| Component | File | Role |
|---|---|---|
| PurpleTeamSystem | src/purple_team_manifold.py | DynamicalSystem wrapper, sequential topology |
| OllamaBackend | src/ollama_backend.py | LLM backend via Ollama (falls back to StubBackend) |
| OrderRecord | src/purple_team_data.py | Data schema + prompt generation + scoring |
| LLMManifoldSystemV2 | lib/llm_manifold_v2_interfaces.py | Inner v2 system with paired state |
| SemanticPCABasis | lib/llm_manifold_v2_interfaces.py | PCA basis from order text corpus |
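The fallback from a live backend to a stub, mentioned in the table above, follows a common pattern that can be sketched generically. The classes and the `make_backend` helper below are hypothetical stand-ins written for this example; they do not reproduce the interfaces in `src/ollama_backend.py`, and how the real backend detects an unavailable server is not described here.

```python
class DemoStubBackend:
    """Deterministic stand-in used when no model server is reachable."""

    def generate(self, prompt: str) -> str:
        return f"[stub] {prompt[:40]}"


class DemoUnreachableBackend:
    """Simulates a backend whose server cannot be contacted."""

    def __init__(self):
        raise ConnectionError("model server not reachable")


def make_backend(primary_factory, fallback_factory=DemoStubBackend):
    """Try the primary backend first; fall back if construction fails."""
    try:
        return primary_factory()
    except OSError:
        # ConnectionError is a subclass of OSError, so refused or
        # unreachable connections land here.
        return fallback_factory()


backend = make_backend(DemoUnreachableBackend)
print(type(backend).__name__)
print(backend.generate("Write an order note"))
```

A stub of this kind keeps the pipeline testable offline, though exponents measured against a stub say nothing about real model behaviour.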

Integration with Other Skills

| Need | Route to |
|---|---|
| Ollama backend configuration | ollama-backend skill |
| Order data schema and preparation | data-ingestion-pipeline skill |
| chaosstab API details (FeedbackLoop, protocols) | chaosstab-scaffold skill |
| v1→v2 upgrade path and topology comparison | llm-manifold-scaffold skill |
| Attack template patterns and strategy | arena-competition-strategy skill |
| Prompt injection taxonomy for Red team | prompt-injection-taxonomy skill |
Install via CLI
npx skills add https://github.com/peterpodj/scaling-succotash --skill purple-team-manifold
Repository Details
Stars: 0
Forks: 0
Branch: main
Path: SKILL.md