canonicalize-architecture

name: canonicalize-architecture
description: Analyze architecture/spec documents for contradictions, ambiguities, and inconsistencies. Produces an encyclopedia-style canonical specification series. Use when user asks to canonicalize docs, find contradictions in specs, create a canonical spec, or consolidate design documents. Input is a directory or file list (e.g., "design-docs/" or "spec.md arch.md api.md").

Architecture Document Canonicalization

You are analyzing architecture and technical specification documents to surface contradictions, inconsistencies, and consequential ambiguities. The ultimate goal is to compile a comprehensive canonical specification encyclopedia—a three-tiered collection of documents organized by change cost.

Purpose of the Canonical Specification

What it IS:

Alignment of high-level ideas
Reduction of contradictions and ambiguities
A way to verify plans and implementations against goals and strategy
A record of ideas rather than examples (examples help explain, but the ideas are what matter)

What it is NOT:

A complete encoding of every implementation detail
A replacement for actual code or comprehensive documentation
An attempt to specify everything exhaustively

Three-Tier Organization

The canonical specification organizes content by answering: "How expensive would this be to change?"

Tier	File Prefix	Meaning	Contents
T1	`t1_*.md`	Cannot change. Would make this a different application.	Core invariants, fundamental principles
T2	`t2_*.md`	Can change, but it's work. Affects many other things.	Architecture, type system, core topology
T3	`t3_*.md`	Use it or don't. Change freely if something works better.	Examples, implementation notes, details

Conflict Resolution

Lower number wins.

If content in t3_*.md conflicts with t1_*.md, the foundational tier wins. No exceptions.
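
The rule can be sketched in a few lines of Python. This is only an illustration: it assumes the tier is read from the file-name prefix alone, and `tier_of` and `resolve_conflict` are hypothetical helper names, not part of the skill.

```python
import re
from pathlib import PurePosixPath

# Assumption: the tier is encoded only in the file-name prefix (t1_, t2_, t3_).
_TIER_RE = re.compile(r"^t([123])_.+\.md$")


def tier_of(path: str) -> int:
    """Return 1, 2, or 3 for a tiered topic file such as 'topics/01-x/t2_x.md'."""
    match = _TIER_RE.match(PurePosixPath(path).name)
    if match is None:
        raise ValueError(f"not a tiered topic file: {path}")
    return int(match.group(1))


def resolve_conflict(path_a: str, path_b: str) -> str:
    """Return the file whose content wins a conflict: the lower tier number.

    The rule does not order two files in the same tier; in that case this
    sketch simply returns the first argument.
    """
    return min(path_a, path_b, key=tier_of)
```

For example, `resolve_conflict("topics/a/t3_a.md", "topics/b/t1_b.md")` returns the `t1_` path.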

Agent Reading Pattern

Always read: **/t1_*.md (small and critical)
Usually read: **/t2_*.md (core architecture context)
Consult as needed: **/t3_*.md (reference material)
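
As a rough sketch, the reading pattern above maps onto three glob queries. The function name `files_to_load` and its parameters are hypothetical; the sketch assumes the canonical directory layout described later in this document.

```python
from pathlib import Path
from typing import Iterable, List


def files_to_load(root: Path, want_t2: bool = True,
                  t3_topics: Iterable[str] = ()) -> List[Path]:
    """Collect topic files in reading order: all T1, then T2, then requested T3."""
    selected = sorted(root.glob("**/t1_*.md"))        # always read
    if want_t2:
        selected += sorted(root.glob("**/t2_*.md"))   # usually read
    for topic in t3_topics:                           # consult as needed
        selected += sorted((root / "topics" / topic).glob("t3_*.md"))
    return selected
```

T3 files are loaded only for the topic directories the caller names, which keeps the default load small.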

Critical Context: Documents Represent Historical Systems

IMPORTANT: Source documents may have been written for previous versions of the system and contain outdated assumptions, terminology, and architectural patterns that no longer apply.

The canonicalization process must:

Extract architectural intent where it aligns with current canonical spec
Reject outdated assumptions from previous system iterations
Identify useful patterns (UI flows, interaction models) that can be adapted to current architecture
Not treat every detail as authoritative just because it's written down

Integration Priority

When integrating source documents into the canonical spec:

Canonical spec wins - Always prefer existing canonical definitions over source document claims
Verify alignment - Before accepting any architectural statement, verify it against existing canonical topics
Flag misalignment - Clearly mark where source documents contradict or invent concepts not in canonical spec
Extract value - Focus on what's useful (UI patterns, interaction flows, user needs) rather than treating docs as ground truth
Question authority - Source documents describing "how X works" have NO authority over canonical spec's definition of X

Example

If a UI spec document says "Domain blocks have jitter, spacing, and origin parameters," but the canonical spec defines DomainDecl with only shape parameters, the canonical spec wins. The UI spec's claims about domain parameters are rejected as outdated speculation from a previous system iteration.

Output Contract (Locked)

Contract: topic_dirs_with_tiers
Status: LOCKED (single allowed structure; enforce, do not prompt)

The canonical output organizes content by topic (directories) and tier (file prefixes):

CANONICAL-<topic>-<timestamp>/
├── INDEX.md                       # Master navigation and overview (derived)
├── TIERS.md                       # Tier system (derived)
├── ESSENTIAL-SPEC.md              # Minimal baseline (derived, required)
├── GLOSSARY.md                    # Authoritative terminology
├── RESOLUTION-LOG.md              # Decision history with rationale
├── QUESTIONS.md                   # Open + resolved items (authoritative record)
├── topics/
│   ├── <topic-id>-<slug>/         # Topic directory
│   │   ├── t1_<slug>.md           # Foundational content
│   │   ├── t2_<slug>.md           # Structural content
│   │   └── t3_<slug>.md           # Optional content
│   └── ...
└── appendices/
    ├── source-map.md              # Which sources contributed to what
    └── superseded-docs.md         # List of archived originals

Forbidden legacy layout (must not exist in canonical output):

topics/<topic-id>-<slug>.md
topics/<topic-id>-<slug>.INDEX.md

Why This Organization

Topics stay cohesive: All type-system content in one directory
Tiers easily filterable: **/t1_*.md, **/t2_*.md, **/t3_*.md
Conflict resolution: Lower tier number wins - simple and unambiguous
Agent filtering: Load all t1 files always, t2 for context, t3 on-demand
Flexibility: Not every topic needs all three tiers
Purpose alignment: Separates "cannot change" from "implementation details"

Input

The user has provided: $ARGUMENTS

This may be:

A directory path (analyze all markdown/text files within)
A space-separated list of file paths
A glob pattern
A description of where to write your output
A combination of these

First determine where to write your output. Then determine which files to use as input.

Step 0: Determine Output Contract + Run Type (DISPATCHER)

Before reading any source files, determine the output contract and run type.

Reference Files Location (Important)

This skill uses bundled reference files stored at:

.claude/skills/canonicalize-architecture/references/

When instructed to "load a reference file", use the full repo-relative path above (not references/..., which will not resolve from the repo root).

Do not try to “discover” these via search/glob. Some tooling/workflows ignore hidden directories like .claude/. Instead, directly read the exact file you need.

Required reference files (must exist):

.claude/skills/canonicalize-architecture/references/run-first.md
.claude/skills/canonicalize-architecture/references/run-middle.md
.claude/skills/canonicalize-architecture/references/run-review.md
.claude/skills/canonicalize-architecture/references/run-approval.md
.claude/skills/canonicalize-architecture/references/run-update.md
.claude/skills/canonicalize-architecture/references/run-final.md
.claude/skills/canonicalize-architecture/references/run-migrate.md

If any required reference file cannot be read, stop and report which path failed (do not “proceed anyway”).
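
A minimal sketch of this pre-flight check, assuming it runs from a known repository root. The function name `missing_references` is hypothetical; the paths are the ones listed above.

```python
from pathlib import Path
from typing import List

REFERENCE_DIR = Path(".claude/skills/canonicalize-architecture/references")
REQUIRED_REFERENCES = [
    "run-first.md", "run-middle.md", "run-review.md", "run-approval.md",
    "run-update.md", "run-final.md", "run-migrate.md",
]


def missing_references(repo_root: Path) -> List[str]:
    """Return repo-relative paths of required reference files that cannot be read."""
    missing = []
    for name in REQUIRED_REFERENCES:
        # Read each exact path directly instead of globbing, because some
        # tooling skips hidden directories such as .claude/.
        try:
            (repo_root / REFERENCE_DIR / name).read_text(encoding="utf-8")
        except OSError:
            missing.append((REFERENCE_DIR / name).as_posix())
    return missing
```

A non-empty result means the run should stop and report those paths.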

Output Contract Enforcement (NO PROMPTS)

There is one allowed structure: topic_dirs_with_tiers.

No prompting, no opt-in. If the on-disk canonical directory does not conform, run a deterministic migration (MIGRATE run) before doing any UPDATE/FIRST/MIDDLE/FINAL work.

Contract: `topic_dirs_with_tiers`

Within a canonical directory CANONICAL-<topic>-<timestamp>/:

Canonical topic content lives under topics/<topic-id>-<slug>/
Each topic directory contains tiered files:
- t1_*.md (foundational)
- t2_*.md (structural)
- t3_*.md (optional)

Forbidden legacy layout (must be migrated):

topics/<topic-id>-<slug>.md (flat topic files)
topics/<topic-id>-<slug>.INDEX.md (flat topic index files)

If you see forbidden legacy files, the run type is MIGRATE. Do not continue without migrating.
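
One way to detect the forbidden layout is to look for any Markdown file sitting directly under topics/. This is a sketch under that assumption; `legacy_topic_files` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List


def legacy_topic_files(canonical_dir: Path) -> List[Path]:
    """List flat Markdown files directly under topics/, which the contract forbids.

    The pattern '*.md' also matches '*.INDEX.md', so both legacy forms are caught.
    Tiered files live one level deeper and are not matched.
    """
    topics = canonical_dir / "topics"
    if not topics.is_dir():
        return []
    return sorted(p for p in topics.glob("*.md") if p.is_file())
```

If the returned list is non-empty, the run type is MIGRATE.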

Output Directory: Determine the common ancestor directory of all input files.
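
The common ancestor can be computed with the standard library. The sketch below assumes every input has already been expanded to a file path (directories and globs resolved beforehand); `output_directory` is a hypothetical name.

```python
import os
from typing import Sequence


def output_directory(input_files: Sequence[str]) -> str:
    """Return the deepest directory that contains every input file."""
    if not input_files:
        raise ValueError("at least one input file is required")
    # Use each file's parent so that a single input yields its directory,
    # not the file itself.
    parents = [os.path.dirname(os.path.abspath(f)) for f in input_files]
    return os.path.commonpath(parents)
```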

Check for existing files/directories:

CANONICAL-<topic>-*/ directory (completed encyclopedia)
CANONICALIZED-QUESTIONS-*.md
CANONICALIZED-GLOSSARY-*.md
CANONICALIZED-TOPICS-*.md (topic breakdown)
EDITORIAL-REVIEW-*.md (editorial review)
USER-APPROVAL-*.md (user approval record)

Decision table:

Condition	Run Type	Action
`CANONICAL-<topic>-*/` directory exists AND forbidden legacy layout exists (`topics/*.md` or `topics/*.INDEX.md`)	MIGRATE	Load `.claude/skills/canonicalize-architecture/references/run-migrate.md`
`CANONICAL-<topic>-*/` directory exists AND user provides any non-canonical source inputs	UPDATE	Load `.claude/skills/canonicalize-architecture/references/run-update.md`
`CANONICAL-<topic>-*/` directory exists AND user provides no non-canonical source inputs	UPDATE	Load `.claude/skills/canonicalize-architecture/references/run-update.md`
No `CANONICALIZED-*` files exist	FIRST	Load `.claude/skills/canonicalize-architecture/references/run-first.md`
`CANONICALIZED-*` exist with `indexed: true`, progress < 100%	MIDDLE	Load `.claude/skills/canonicalize-architecture/references/run-middle.md`
`CANONICALIZED-*` exist, progress = 100%, no `EDITORIAL-REVIEW-*.md` exists	REVIEW	Load `.claude/skills/canonicalize-architecture/references/run-review.md`
`EDITORIAL-REVIEW-*.md` exists, no `USER-APPROVAL-*.md` exists	APPROVAL	Load `.claude/skills/canonicalize-architecture/references/run-approval.md`
`USER-APPROVAL-*.md` exists with `approved: true`	FINAL	Load `.claude/skills/canonicalize-architecture/references/run-final.md`

Print the detected run type, then load and follow the appropriate reference file from .claude/skills/canonicalize-architecture/references/.
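
The decision table can be read as an ordered series of checks. The sketch below assumes the rows are evaluated top to bottom with the first match winning, which the table does not state explicitly. `WorkspaceState` and its fields are hypothetical names that summarize what was found on disk.

```python
from dataclasses import dataclass

REFERENCE_DIR = ".claude/skills/canonicalize-architecture/references"


@dataclass
class WorkspaceState:
    """Hypothetical summary of the files found before reading any sources."""
    canonical_dir_exists: bool = False
    legacy_layout_exists: bool = False
    working_files_exist: bool = False   # any CANONICALIZED-* file
    indexed: bool = False               # front-matter 'indexed: true'
    progress: int = 0                   # percent, 0-100
    editorial_review_exists: bool = False
    user_approval_exists: bool = False
    approved: bool = False              # 'approved: true' in USER-APPROVAL-*.md


def detect_run_type(s: WorkspaceState) -> str:
    """Evaluate the decision table in order and return the first matching run type."""
    if s.canonical_dir_exists and s.legacy_layout_exists:
        return "MIGRATE"
    if s.canonical_dir_exists:
        return "UPDATE"  # same result with or without new source inputs
    if not s.working_files_exist:
        return "FIRST"
    if s.indexed and s.progress < 100:
        return "MIDDLE"
    if s.progress == 100 and not s.editorial_review_exists:
        return "REVIEW"
    if s.editorial_review_exists and not s.user_approval_exists:
        return "APPROVAL"
    if s.user_approval_exists and s.approved:
        return "FINAL"
    # States the table does not cover, e.g. an approval file without 'approved: true'.
    raise ValueError("state not covered by the decision table")


def reference_file(run_type: str) -> str:
    """Map a run type to its reference file path."""
    return f"{REFERENCE_DIR}/run-{run_type.lower()}.md"
```

Raising on uncovered states, rather than guessing a run type, matches the "stop and report" stance taken for missing reference files.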

Shared Context

These rules apply to all run types:

Precedence Rules for Prior Outputs

Prior resolutions take precedence over source documents - If a QUESTIONS file contains a RESOLVED item, that resolution is authoritative
Carry forward all resolutions - Every RESOLVED item from prior QUESTIONS files must appear in the new output
Migrate resolved ambiguous terms - When a term in the Ambiguous Terms table is marked resolved, move it to the GLOSSARY file in the next run

Authoritative vs Derived Content

Authoritative (source of truth):

All **/t1_*.md, **/t2_*.md, **/t3_*.md topic files
GLOSSARY.md
RESOLUTION-LOG.md
appendices/source-map.md

Derived (regenerate every run or omit if unnecessary):

INDEX.md
TIERS.md
ESSENTIAL-SPEC.md (required minimal baseline)
Any extra summary or index files (e.g., *.INDEX.md, SUMMARY.md) only if the user opts in

Rule: Derived files must be regenerated from authoritative content on each run. If efficiency is a concern, prefer smaller authoritative files over larger derived summaries.

ESSENTIAL-SPEC.md Rules (Required)

Purpose: a minimal baseline for any agent or implementer.
Content: T1 content plus the smallest necessary T2 content for core flows (type system, compilation, runtime, renderer).
Must exclude T3/UI/implementation examples.
Must be short, consistent, and never introduce new concepts not in authoritative files.

QUESTIONS File Handling (User Preference)

FIRST/MIDDLE runs: Use CANONICALIZED-QUESTIONS-*.md working files.
UPDATE runs: Use the existing canonical QUESTIONS.md in-place. Append new sections; do not create a new questions file.

Progressive Disclosure for Agents (Required)

The tier system exists to let agents load a slim foundation and only the topic-specific details they need.

Requirements:

T1 files must be small and critical (no examples, no implementation details).
Each T2 file must start with a "Prerequisites" section listing the minimal T1 files and any cross-topic dependencies.
Each T2 file must include a "Touchpoints" section listing other topics/systems it interacts with.

This replaces the need for heavy derived summaries while still enabling efficient, targeted loading.

Pruning & Signal Discipline (Required)

Remove or avoid generating anything that is:

Distracting: process reports, compression stats, or workflow artifacts
Vague: non-actionable principles without constraints or enforcement
Redundant: indexes of indexes, duplicate summaries

If a document is not authoritative or directly useful for implementation, it should not be generated by default.

Topic Identification and Tier Classification

During analysis, identify distinct topics that warrant separate documents AND classify them into tiers.

Topic Boundaries:

Different architectural layers (type system, compiler, runtime, renderer)
Distinct subsystems with clear interfaces
Separable concerns (state management, time handling, error handling)
Different user-facing concepts (blocks, wires, buses, domains)

Tier Classification:

For each topic, ask: "How expensive would this be to change?"

T1 (Foundational): "Cannot change" / "Would make this a different application"
- Example: Core invariants, fundamental principles defining what this app IS
T2 (Structural): "Can change, but it's work" / "Touches many other things"
- Example: Type system architecture, core topology, data flow patterns
T3 (Optional): "Use it or don't" / "Change freely"
- Example: Implementation examples, detailed specifications, code patterns

Naming Convention: Use kebab-case slugs:

principles (tier 1)
type-system (tier 2)
examples/basic-flow (tier 3)

Timestamp Format

All timestamps: YYYYMMDD-HHMMSS
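
In Python, this format corresponds to the `strftime` pattern `%Y%m%d-%H%M%S`. The helper name below is hypothetical, and the sketch takes the moment as an argument because the source does not say which time zone to use.

```python
from datetime import datetime


def canonical_timestamp(moment: datetime) -> str:
    """Format a datetime as YYYYMMDD-HHMMSS, zero-padded."""
    return moment.strftime("%Y%m%d-%H%M%S")
```

For example, `canonical_timestamp(datetime(2024, 3, 5, 9, 7, 2))` returns `"20240305-090702"`.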

Front-matter (all output files)

---
command: /canonicalize-architecture $ARGUMENTS
files: [space-separated list of files processed this run]
indexed: true
source_files:
  - [path/to/source1.md]
  - [path/to/source2.md]
topics:
  - [topic-slug-1]
  - [topic-slug-2]
---

Cross-Linking Convention

Within the encyclopedia, use relative links.

From root-level files (e.g. INDEX.md):

[Type System](./topics/01-type-system/t2_type-system.md)
[Glossary: SignalType](./GLOSSARY.md#signaltype)

From within a topic file (located at topics/<topic>/...):

[Foundational Rules](../00-foundation/t1_foundation.md) (example)
[Another Topic](../02-block-system/t2_block-system.md) (example)
[Glossary: SignalType](../../GLOSSARY.md#signaltype)

Encyclopedia Index Requirements

The INDEX.md must include:

Status badge: CANONICAL / UPDATING / DRAFT / SUPERSEDED
Quick navigation: Links to all major sections
Topic map: Visual or tabular overview of all topics
Reading order: Suggested sequence for newcomers
Search hints: Key terms and where to find them
Version info: When generated, from what sources, approval status

Canonical Integrity Gate (Required)

After generating or updating outputs, run an integrity check. If any check fails, add a blocking item to the QUESTIONS file (existing canonical QUESTIONS.md for UPDATE runs; CANONICALIZED-QUESTIONS-*.md for FIRST/MIDDLE runs).

Minimum checks:

Contract conformity: Output matches topic_dirs_with_tiers (topic directories under topics/ + t1_/t2_/t3_ files; no legacy topics/*.md or topics/*.INDEX.md).
No deprecated tokens in derived files (e.g., known deprecated terms that were explicitly replaced).
Counts are computed, not handwritten (sources, topics, resolutions).
Cross-link validity: All links resolve to existing files/anchors.
Tier sanity: T1 is small and critical; T3 is non-critical; flag any misclassification.
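
The cross-link check is the easiest of these to automate. The sketch below is a simplified illustration: it verifies only that the file part of each relative link exists and does not validate anchors, and `broken_links` is a hypothetical helper name.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches inline Markdown links of the form [text](target).
_LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_links(root: Path) -> List[Tuple[str, str]]:
    """Return (file, target) pairs whose relative link target does not exist."""
    broken = []
    for doc in sorted(root.rglob("*.md")):
        for target in _LINK_RE.findall(doc.read_text(encoding="utf-8")):
            path_part = target.split("#", 1)[0]
            if "://" in target or not path_part:
                continue  # external URL or same-file anchor: not checked here
            # Relative links resolve against the directory of the linking file.
            if not (doc.parent / path_part).exists():
                broken.append((doc.relative_to(root).as_posix(), target))
    return broken
```

Each pair returned would become a blocking item in the QUESTIONS file.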