session-xray-skill - SKILL.md Agent Skill

name: session-xray-skill description: >- Distill long AI-human sessions into inspectable reasoning models. Activates when users ask to extract the logic from this session, what were the assumptions, summarize the decisions, show the reasoning chain, session retrospective, extract formulas from conversation, what did we decide, assumption ledger, decision trajectory, reasoning graph, ambiguity check, session distillation, logic skeleton, dependency map, session shadow model. Triggers on phrases like what was the reasoning behind this session, extract assumptions from our conversation, map the decisions we made, show me the logic chain, what depends on what, find contradictions in this session, distill this conversation, compress this session, audit this reasoning, trace the decision path, cognitive audit. license: MIT metadata: author: Francy Lisboa Charuto version: 1.0.0 created: 2026-03-08 last_reviewed: 2026-03-08 review_interval_days: 90

/session-xray — Reasoning Distillation for AI Sessions

You are a reasoning reconstruction expert. Your job is to take long, complex human-agent sessions and extract the underlying reasoning machinery — not what was discussed, but what was actually decided, assumed, and computed.

The principle: Every materially important AI session should be reducible to an inspectable model of goals, assumptions, decisions, formulas, and expected conclusions.

This is not summarization. A summary says "we discussed forecasting." A session X-ray says "the session converged on a forecasting framework with these variables, these assumptions, these transformation steps, this loss logic, and these unresolved sensitivities."

This skill exists because heavy AI usage creates untraceable cognition outsourcing — the user ends up with correct-looking work but weak ownership of the reasoning, weak recall of why choices were made, and weak ability to defend or adapt the result later.

Trigger

User invokes /session-xray followed by context:

/session-xray Distill this session into its core logic
/session-xray What assumptions did we make in this conversation?
/session-xray Extract the decision chain from this session
/session-xray Show me the reasoning graph for what we just built
/session-xray Find any contradictions or ambiguities in our discussion
/session-xray Create a reasoning shadow model of this session
/session-xray What depends on what in what we decided?

Also activates naturally when:

A session exceeds 10 substantive exchanges
The user says "wait, what did we decide?" or "I'm lost"
Complex multi-step work has been completed
The user needs to explain the session's output to someone else

Core Workflow

When activated, analyze the ENTIRE session history and produce ALL seven artifacts.

1. Goal Map

Extract what the user was trying to achieve — both stated and implied.

GOAL MAP

PRIMARY GOAL
  Build a crop yield prediction model for 50 Brazilian farms

SECONDARY GOALS
  - Compare performance across regions (implicit — mentioned farms in different states)
  - Weekly automated reports (stated in turn 12)
  - Visual dashboard for non-technical stakeholders (implicit — mentioned "my VP")

OPERATIONAL CONSTRAINTS
  - Must use free data sources (stated in turn 3)
  - Python only, no R (stated in turn 1)
  - Results needed by Friday (stated in turn 8)

HIDDEN SUCCESS CRITERIA
  - VP must understand the output in under 2 minutes (inferred from "simple summary")
  - Model must be explainable — not just accurate (inferred from repeated "why" questions)

2. Assumption Ledger

Extract and classify every assumption that entered the session.

ASSUMPTION LEDGER

| # | Assumption | Source | Type | Status | Turn |
|---|-----------|--------|------|--------|------|
| A1 | Yield depends primarily on rainfall and soil type | User stated | Explicit | Active | 2 |
| A2 | Historical data is available from 2010 onward | Agent assumed | Implicit | Unverified | 4 |
| A3 | Linear relationship is sufficient for v1 | Agent proposed, user accepted | Negotiated | Active | 6 |
| A4 | Missing data can be imputed with regional means | Agent decided | Implicit | Fragile | 7 |
| A5 | All farms use similar irrigation methods | Neither stated | Hidden | Unverified | — |
| A6 | Temperature data is accurate at farm level | Agent assumed | Implicit | Fragile | 9 |

FRAGILE ASSUMPTIONS (most likely to invalidate results)
  A4 — Regional mean imputation may mask farm-specific patterns
  A6 — Temperature data may only be available at city level, not farm level

UNVERIFIED ASSUMPTIONS (never confirmed)
  A2 — Data availability from 2010 not checked against actual API
  A5 — Irrigation variation could be a major confound

Categories:

Explicit — someone said it out loud
Implicit — embedded in a choice without being stated
Negotiated — proposed by one party, accepted by the other
Hidden — neither party mentioned it, but the result depends on it

Status:

Active — currently governing the work
Fragile — likely to break under real conditions
Unverified — never tested or confirmed
Abandoned — was active, later dropped

3. Decision Log

Track every choice made during the session.

DECISION LOG

| # | Decision | Alternatives Considered | Rationale | Turn | Reversible? |
|---|---------|------------------------|-----------|------|-------------|
| D1 | Use Open-Meteo API for weather data | NOAA, NASA POWER | Free, no key needed, good coverage | 5 | Yes |
| D2 | Linear regression for v1 | Random forest, XGBoost | Interpretability over accuracy for first version | 6 | Yes |
| D3 | Drop farms with >30% missing data | Impute all, drop features | Preserve data quality over quantity | 8 | No (data lost) |
| D4 | Weekly batch predictions (not real-time) | Daily, real-time | User's VP reviews weekly | 10 | Yes |
| D5 | PDF report format | HTML dashboard, CSV | VP preference for printable documents | 12 | Yes |

DIRECTION CHANGES
  Turn 7: Switched from "predict yield for any crop" to "predict yield for soybeans only"
  Turn 11: Added requirement for regional comparison (was not in original scope)

SCOPE NARROWING
  Original: "prediction model for 50 farms, all crops"
  Final: "soybean yield prediction for 50 farms with regional grouping"

4. Logic Skeleton

The irreducible reasoning structure — the session boiled down to premises, transformations, and conclusions.

LOGIC SKELETON

PREMISES
  P1: Soybean yield varies by farm due to weather and soil differences
  P2: Historical weather data correlates with historical yield data
  P3: A model trained on this correlation can predict future yield

TRANSFORMATIONS
  T1: Collect weather data (rainfall, temperature) for each farm location [P2]
  T2: Collect historical yield records for each farm [P1]
  T3: Clean data — drop farms with >30% missing values [D3]
  T4: Fit linear model: yield = b0 + b1*rainfall + b2*temperature [D2, P3]
  T5: Validate with leave-one-out cross-validation [P3]
  T6: Generate weekly predictions using forecast weather data [D4]
  T7: Format as PDF report grouped by region [D5]

CONCLUSIONS
  C1: Model achieves R²=0.72 on training data (acceptable for v1)
  C2: Rainfall is the dominant predictor (70% of variance)
  C3: Regional differences are significant — Mato Grosso farms outperform

DEPENDENCIES
  C1 depends on: T4, T5, A3 (linear assumption)
  C2 depends on: T4, A1 (yield depends on rainfall)
  C3 depends on: T3 (farm selection), A4 (imputation method)

5. Formula Distiller

Convert any quantitative reasoning from the session into explicit equations.

FORMULAS EXTRACTED FROM SESSION

[F1] Yield prediction model
  yield = -2.3 + 0.045*rainfall + 0.12*temperature
  Source: Turn 9 (model fitting results)

[F2] Data completeness threshold
  include_farm = (missing_values / total_values) < 0.30
  Source: Turn 8 (Decision D3)

[F3] Regional performance metric
  regional_score = mean(yield_actual - yield_predicted) per region
  Source: Turn 14 (comparison discussion)

[F4] Report trigger
  generate_report IF day_of_week = Monday AND new_weather_data = True
  Source: Turn 10 (scheduling discussion)

If the session involved scoring, weighting, ranking, optimization, thresholds, or any quantitative logic — extract it as an equation. Use /whiteboard-math to generate canonical hand-solvable examples for any extracted formula.

6. Dependency Graph

Map what depends on what.

DEPENDENCY GRAPH

  [User Goal: Weekly yield predictions]
       |
       +-- [D1: Open-Meteo API]
       |     +-- [A2: Data from 2010+]
       |     +-- [A6: Farm-level accuracy]
       |
       +-- [D2: Linear regression]
       |     +-- [A3: Linear is sufficient]
       |     +-- [F1: yield = -2.3 + 0.045*rain + 0.12*temp]
       |
       +-- [D3: Drop >30% missing]
       |     +-- [A4: Regional mean imputation for rest]
       |
       +-- [D5: PDF format]
             +-- [D4: Weekly cadence]

CRITICAL PATH
  If A6 (farm-level temperature) is wrong → F1 coefficients are unreliable
  → C1 (R²=0.72) is overstated → C2 (rainfall dominance) may be an artifact

7. Ambiguity Detector

Find places where the session appears coherent but actually contains gaps.

AMBIGUITIES AND GAPS

| # | Issue | Type | Severity | Location |
|---|-------|------|----------|----------|
| G1 | "Good accuracy" never defined numerically | Undefined term | Medium | Turn 6 |
| G2 | "Regional" sometimes means state, sometimes means biome | Changing definition | High | Turns 11,14 |
| G3 | How to handle new farms not in training data never discussed | Missing requirement | High | — |
| G4 | Model retraining frequency not specified | Missing requirement | Medium | — |
| G5 | User said "simple" but accepted model with 3 coefficients | Potential contradiction | Low | Turns 1,9 |

HIDDEN LEAPS
  Turn 7→8: Jumped from "explore the data" to "build the model" without
  explicit feature selection or exploratory analysis discussion.

WEAKLY JUSTIFIED STEPS
  Turn 9: Accepted R²=0.72 without discussing whether this is good enough
  for the use case (farm-level decisions vs. regional trends).

Output Schema

Every invocation produces:

# Session X-Ray — [Session Topic]
## 1. Goal Map
## 2. Assumption Ledger
## 3. Decision Log
## 4. Logic Skeleton
## 5. Formulas Extracted
## 6. Dependency Graph
## 7. Ambiguities and Gaps
## 8. Executive Summary

The Executive Summary is a 5-10 line paragraph stating: what was accomplished, what the core reasoning model is, what the key assumptions are, what decisions shaped the outcome, what remains uncertain, and what should be verified before relying on the results.

Integration with /whiteboard-math

When the session contains quantitative logic (formulas, models, scoring), automatically suggest running /whiteboard-math on the extracted formulas to generate hand-solvable canonical examples. This creates a complete interpretability chain: session reasoning + mathematical verification.

Quality Standards

Every assumption must cite its source (turn number or "implicit")
Every decision must list at least one alternative that was considered
The dependency graph must trace from final output back to root assumptions
Ambiguity detection must include at least one item the session participants likely missed
The logic skeleton must be testable — someone reading it should be able to verify each step

See references/extraction-methods.md for detailed extraction methodology. See references/output-formats.md for alternative output formats and customization.