purple-team-manifold - SKILL.md Agent Skill

name: purple-team-manifold
description: >-
  Scaffold and run purple-team (red vs blue) LLM interaction experiments on the
  chaosstab framework using the fraud-detection scenario. Use when the user wants
  to set up a red-vs-blue LLM experiment, run the sequential observer topology
  with Ollama-backed models, interpret Lyapunov exponents in the attack-defense
  context, or understand how the 3D manifold maps the geometry of adversarial
  co-evolution. Also trigger when the user mentions purple teaming with language
  models, fraud detection experiments, red team vs blue team model interaction,
  or asks about what positive vs negative Lyapunov means for attack evasion. Use
  proactively whenever the conversation touches on adversarial LLM interaction
  framed as a dynamical system, even if "purple team" isn't explicitly mentioned.

Purple-Team LLM Manifold — Experiment Scaffold

Guide for running and extending the purple-team fraud-detection experiment built on the chaosstab framework using Sketch 1 (sequential observer topology).

Core Concept

Two LLM instances interact through a fraud-detection scenario:

Red (Model A): Crafts order notes that look legitimate, trying to evade fraud detection.
Blue (Model B): Reviews the order notes and flags fraud signals.
Sequential topology: Blue never sees the original order data, only Red's rendering of it.

This is a direct analogue of the observer-synchronisation perspective of Zaki (2025): instead of one turbulent flow observing another through sparse measurements, two LLM operators observe each other through a shared text channel.
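
The information flow of the sequential topology can be illustrated without any model at all. The sketch below is not the chaosstab implementation: `red_stub`, `blue_stub`, and `run_sequential` are hypothetical names, the stubs are plain string functions standing in for LLM calls, and feeding Blue's previous verdict back to Red is an assumption about how the shared text channel is wired. What it shows is the constraint itself: Blue's only input is Red's note.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the two LLM calls; not part of chaosstab.
def red_stub(order: Dict[str, str], blue_feedback: str) -> str:
    """Red renders the raw order as a note, reacting to Blue's last verdict."""
    tone = "routine" if "HIGH" in blue_feedback else "urgent"
    return f"{tone} order for {order['item']}, ship to {order['city']}"


def blue_stub(note: str) -> str:
    """Blue receives only Red's note, never the order record."""
    return "FRAUD_RISK: HIGH" if "urgent" in note else "FRAUD_RISK: LOW"


def run_sequential(
    orders: List[Dict[str, str]],
    red: Callable[[Dict[str, str], str], str],
    blue: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Run Red then Blue for each order and record the exchanged text."""
    transcript = []
    feedback = ""  # Blue has said nothing before the first step
    for order in orders:
        note = red(order, feedback)   # Red sees the order and Blue's last verdict
        verdict = blue(note)          # Blue sees the note only
        transcript.append({"note": note, "verdict": verdict})
        feedback = verdict
    return transcript


demo_orders = [
    {"item": "laptop", "city": "Leeds"},
    {"item": "phone", "city": "York"},
    {"item": "desk", "city": "Bath"},
]
for step in run_sequential(demo_orders, red_stub, blue_stub):
    print(step["note"], "->", step["verdict"])
```

With these toy stubs Red switches tone whenever Blue flags it, so the verdicts alternate; a real run replaces both stubs with model calls.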

Interpreting Results

The experiment's 3D manifold maps the geometry of attack-defense co-evolution. The Lyapunov exponent is the key diagnostic:

| Lyapunov exponent | Meaning | Implication |
|---|---|---|
| > 0 (positive) | Chaotic regime | Red keeps finding new evasion strategies; Blue can't lock down detection |
| ≈ 0 (near zero) | Edge of chaos | Balanced competition; neither side dominates |
| < 0 (negative) | Synchronised | Blue has locked down fraud detection; Red's strategies converge |
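
The table above can be turned into a small helper for labelling runs. This is a sketch only: `classify_regime` is a hypothetical name, and the tolerance `tol` that decides what counts as "near zero" is an assumed value that should be tuned to the noise level of the estimate.

```python
def classify_regime(lyapunov: float, tol: float = 0.01) -> str:
    """Map a Lyapunov exponent to the regime names used in the table.

    `tol` is an assumed half-width for the "near zero" band.
    """
    if lyapunov > tol:
        return "chaotic"
    if lyapunov < -tol:
        return "synchronised"
    return "edge of chaos"


for value in (0.25, 0.003, -0.4):
    print(f"{value:+.3f} -> {classify_regime(value)}")
```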

The bounding box size indicates the spatial extent of the attractor:

Large bbox → diverse interaction patterns (wide strategy space)
Small bbox → constrained interaction (limited strategy variation)
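
As a rough illustration of what the bounding box measures, the per-axis extent of a set of 3D trajectory points can be computed as below. The function name `bounding_box` and the `min`/`max` keys are assumptions; only the `range` key is taken from the Quick Start output, and the framework's own computation may differ.

```python
from typing import Dict, List, Sequence


def bounding_box(points: Sequence[Sequence[float]]) -> Dict[str, List[float]]:
    """Axis-aligned bounding box of a trajectory given as (x, y, z) points."""
    if not points:
        raise ValueError("need at least one point")
    axes = list(zip(*points))             # regroup as one tuple per axis
    lo = [min(axis) for axis in axes]
    hi = [max(axis) for axis in axes]
    return {"min": lo, "max": hi, "range": [h - l for h, l in zip(hi, lo)]}


trajectory = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-1.0, 0.5, 1.0)]
print(bounding_box(trajectory)["range"])  # per-axis extent of the points
```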

A plateau in the δ-norms (reported in the diagnostics) indicates partial synchronisation: the separation stops shrinking at a non-zero level because some attack modes are caught while others remain unobservable to Blue.
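
How chaosstab derives these diagnostics is not described here, so the following is a textbook-style sketch rather than the framework's method. It assumes the δ-norms are a sequence of positive separations sampled at a fixed step `dt`: if δ(t) ≈ δ₀·exp(λt), then λ is the slope of ln δ against t. The names `lyapunov_from_deltas` and `has_plateau`, and the `window` and `rel_tol` defaults, are hypothetical.

```python
import math
from typing import Sequence


def lyapunov_from_deltas(deltas: Sequence[float], dt: float = 1.0) -> float:
    """Least-squares slope of ln(delta) against time (all deltas must be > 0)."""
    n = len(deltas)
    if n < 2:
        raise ValueError("need at least two separations")
    logs = [math.log(d) for d in deltas]
    times = [i * dt for i in range(n)]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def has_plateau(deltas: Sequence[float], window: int = 10, rel_tol: float = 0.05) -> bool:
    """True when the last `window` values stay within rel_tol of their mean."""
    tail = list(deltas[-window:])
    if len(tail) < window:
        return False
    mean = sum(tail) / len(tail)
    return all(abs(d - mean) <= rel_tol * mean for d in tail)


decaying = [math.exp(-0.3 * i) for i in range(20)]   # pure exponential decay
levelled = [1.0, 0.5, 0.25, 0.12] + [0.1] * 10       # decay that levels off
print(round(lyapunov_from_deltas(decaying), 3), has_plateau(decaying))
print(has_plateau(levelled))
```

When a plateau is present, fitting the slope over the whole series mixes the decay phase with the flat phase, so it is usually more informative to fit only the portion before the plateau.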

Quick Start

from purple_team_manifold import run_purple_experiment

results = run_purple_experiment(
    orders_path="data/sample_orders.jsonl",
    n_steps=50,
    seed=42,
)
print(f"Lyapunov: {results['lyapunov']:+.4f}")
print(f"Bbox: {results['bbox']['range']}")

Or from the command line:

cd /path/to/purple-team-manifold
python src/purple_team_manifold.py --data data/sample_orders.jsonl --steps 50

System Prompts

The Red and Blue system prompts are the core experimental variables. They're defined as constants in purple_team_manifold.py:

RED_SYSTEM_PROMPT: Instructs the model to write natural-sounding order notes. Emphasises plausibility and routine language. Does NOT mention fraud or detection.
BLUE_SYSTEM_PROMPT: Instructs the model to review notes for 6 fraud signals. Requires structured output (PRESENT/ABSENT per signal + overall FRAUD_RISK).

Modifying these prompts changes the experiment's dynamics. More specific Red prompts → more targeted evasion → potentially higher Lyapunov exponent (more chaotic). More detailed Blue prompts → better detection → potentially lower Lyapunov exponent.
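
Because Blue is asked for structured output, its replies can be parsed mechanically when scoring a run. The exact line format and the names of the six signals are not given here, so the sketch below assumes one `NAME: PRESENT` or `NAME: ABSENT` line per signal plus a `FRAUD_RISK: <level>` line, and uses placeholder signal names. `parse_blue_output` is a hypothetical helper, not part of the repository.

```python
import re
from typing import Dict, Optional, Tuple

# Assumed line shape: "<key>: <value>"; anything else is ignored as chatter.
LINE_RE = re.compile(r"^\s*([A-Za-z0-9_ ]+?)\s*:\s*(\S+)\s*$")


def parse_blue_output(text: str) -> Tuple[Dict[str, bool], Optional[str]]:
    """Return ({signal: is_present}, fraud_risk) from Blue's reply."""
    signals: Dict[str, bool] = {}
    risk: Optional[str] = None
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        key = match.group(1).strip().upper()
        value = match.group(2).upper()
        if key == "FRAUD_RISK":
            risk = value
        elif value in ("PRESENT", "ABSENT"):
            signals[key] = value == "PRESENT"
    return signals, risk


reply = """SIGNAL_1: PRESENT
SIGNAL_2: absent
Some free-text commentary from the model
FRAUD_RISK: HIGH"""
print(parse_blue_output(reply))
```

Lines that do not match are skipped rather than raising, since model output often includes stray commentary; a stricter parser could instead count them as format violations.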

Architecture

See references/experiment-design.md for the full architecture diagram, parameter meanings, and detailed component description.

Key Components

| Component | File | Role |
|---|---|---|
| `PurpleTeamSystem` | `src/purple_team_manifold.py` | DynamicalSystem wrapper, sequential topology |
| `OllamaBackend` | `src/ollama_backend.py` | LLM backend via Ollama (falls back to StubBackend) |
| `OrderRecord` | `src/purple_team_data.py` | Data schema + prompt generation + scoring |
| `LLMManifoldSystemV2` | `lib/llm_manifold_v2_interfaces.py` | Inner v2 system with paired state |
| `SemanticPCABasis` | `lib/llm_manifold_v2_interfaces.py` | PCA basis from order text corpus |
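
The fallback from Ollama to a stub can be pictured as a reachability probe followed by a choice. The sketch below is an assumption about the pattern, not the code in `src/ollama_backend.py`: `ollama_reachable` and `choose_backend` are hypothetical names, and the probe simply requests Ollama's model-listing endpoint at its default local address.

```python
import http.client
import urllib.request
from typing import Callable

DEFAULT_URL = "http://localhost:11434"  # Ollama's default local address


def ollama_reachable(base_url: str = DEFAULT_URL, timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers the model-listing endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, malformed URL, or a non-HTTP reply.
        return False


def choose_backend(probe: Callable[[], bool] = ollama_reachable) -> str:
    """Pick "ollama" when the probe succeeds, otherwise fall back to "stub"."""
    return "ollama" if probe() else "stub"


print(choose_backend())  # depends on whether a local Ollama server is running
```

Passing the probe in as a callable keeps the choice testable without a network, for example `choose_backend(lambda: False)`.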

Integration with Other Skills

| Need | Route to |
|---|---|
| Ollama backend configuration | `ollama-backend` skill |
| Order data schema and preparation | `data-ingestion-pipeline` skill |
| chaosstab API details (FeedbackLoop, protocols) | `chaosstab-scaffold` skill |
| v1→v2 upgrade path and topology comparison | `llm-manifold-scaffold` skill |
| Attack template patterns and strategy | `arena-competition-strategy` skill |
| Prompt injection taxonomy for Red team | `prompt-injection-taxonomy` skill |