qa - SKILL.md Agent Skill

name: qa
description: Compare extracted WXR content against the original source site page by page. Find missing text, headings, images, and links. Fix by patching the WXR or re-extracting individual pages. Produces a health score and structured report. Use when asked to "qa", "check extraction", "compare content", or "verify extraction quality".
allowed-tools:
  - Bash
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - AskUserQuestion

QA: Compare → Fix → Verify

You are a QA engineer for content migrations. Compare every page in a WXR file against its original source URL — check that text, headings, images, and links made it through extraction intact. When you find gaps, fix them by patching the WXR or re-extracting the page. Produce a structured report with before/after evidence.

Setup

Parse the user's request for these parameters:

| Parameter | Default | Override example |
|-----------|---------|------------------|
| WXR file | Auto-detect `output.wxr` in the resolved site output dir | `<outputDir>/output.wxr` |
| Tier | Standard | `--quick`, `--exhaustive` |
| Scope | All pages | `Focus on the blog posts` |

Tiers determine which issues get fixed:

Quick: Fix critical (fail grade) only
Standard: Fix critical + warn grade (default)
Exhaustive: Fix all, including minor discrepancies

If no WXR path is given: Call liberate_paths({ url }) to resolve the site output dir (default base: ~/Studio/_liberations/<host>, overridable via --output / DLA_OUTPUT_DIR). If the user has not provided a URL, ask for it. If multiple sites exist, ask which to QA.

Workflow

Phase 1: Initialize

Locate the WXR file
Read it with readWxr() from src/lib/wxr-reader.ts
Count pages/posts with _source_url — these are testable
Count pages/posts without _source_url — these are skipped (warn the user)
Start timer for duration tracking
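
The counting step could look roughly like the sketch below. The `WxrItem` shape and the `partitionBySourceUrl` helper are hypothetical: the actual return type of readWxr() is not shown here, so the sketch only assumes that each item exposes its `_source_url` value as an optional string.

```typescript
// Hypothetical item shape; the real type returned by readWxr() may differ.
interface WxrItem {
  slug: string;
  type: "page" | "post";
  sourceUrl?: string; // value of the _source_url field, if one was recorded
}

// Split items into testable (has a source URL) and skipped (does not).
function partitionBySourceUrl(items: WxrItem[]): { testable: WxrItem[]; skipped: WxrItem[] } {
  const testable: WxrItem[] = [];
  const skipped: WxrItem[] = [];
  for (const item of items) {
    if (item.sourceUrl !== undefined && item.sourceUrl.trim() !== "") {
      testable.push(item);
    } else {
      skipped.push(item);
    }
  }
  return { testable, skipped };
}

const startedAt = Date.now(); // start the duration timer
const sampleItems: WxrItem[] = [
  { slug: "about", type: "page", sourceUrl: "https://www.example.com/about" },
  { slug: "orphan", type: "page" },
];
const partition = partitionBySourceUrl(sampleItems);
if (partition.skipped.length > 0) {
  console.warn(`${partition.skipped.length} item(s) have no _source_url and will be skipped`);
}
console.log(`${partition.testable.length} testable after ${Date.now() - startedAt} ms`);
```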

Phase 2: Compare

For each page/post with a _source_url:

Fetch the origin page via HTTP
Parse both — origin HTML and WXR content — into a content model (text, headings, images, links) using parseContent() from src/lib/content-parser.ts
Diff using diffContent() from src/lib/content-differ.ts
Grade the page:
- pass (>90% weighted match) — content faithfully extracted
- warn (70-90%) — minor gaps
- fail (<70%) — significant content missing
- error — fetch failed or page unreachable
Document immediately — don't batch. Log each result.
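
As a minimal sketch of the grading thresholds above, the hypothetical `gradePage` helper below maps a weighted match percentage to a grade. It assumes that a failed fetch is represented as `null`, and that the boundary values 70 and 90 both fall in the warn band, since pass is stated as strictly greater than 90 and fail as strictly less than 70.

```typescript
type Grade = "pass" | "warn" | "fail" | "error";

// weightedMatch is a percentage in [0, 100]; null stands for a failed fetch
// (an assumption of this sketch, not a documented convention).
function gradePage(weightedMatch: number | null): Grade {
  if (weightedMatch === null || Number.isNaN(weightedMatch)) return "error";
  if (weightedMatch > 90) return "pass";
  if (weightedMatch >= 70) return "warn";
  return "fail";
}

console.log(gradePage(98), gradePage(85), gradePage(42), gradePage(null));
```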

Per-page checks:

| Dimension | What to check | Weight |
|-----------|---------------|--------|
| Text | Word-level similarity (Jaccard on word sets) | 50% |
| Headings | h1-h6 count, text, order match | 20% |
| Images | Count match, missing images by filename | 20% |
| Links | Count match, missing hrefs | 10% |
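
To make the weighting concrete, here is a sketch of Jaccard similarity on word sets and of combining the four dimensions into one weighted match. The helper names are hypothetical, and scoring headings, images, and links as a simple 0-1 ratio is an assumption; the real diffContent() may tokenize and score differently.

```typescript
// Jaccard similarity on word sets: |A ∩ B| / |A ∪ B|.
function jaccard(a: string, b: string): number {
  const toWords = (s: string): Set<string> =>
    new Set(s.toLowerCase().split(/\s+/).filter((w) => w.length > 0));
  const setA = toWords(a);
  const setB = toWords(b);
  if (setA.size === 0 && setB.size === 0) return 1; // two empty texts match trivially
  let shared = 0;
  setA.forEach((w) => {
    if (setB.has(w)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

// Each dimension score is a ratio in [0, 1] (assumed representation).
interface DimensionScores {
  text: number;
  headings: number;
  images: number;
  links: number;
}

const WEIGHTS: DimensionScores = { text: 0.5, headings: 0.2, images: 0.2, links: 0.1 };

// Weighted match as a percentage in [0, 100].
function weightedMatch(s: DimensionScores): number {
  return (
    100 *
    (s.text * WEIGHTS.text +
      s.headings * WEIGHTS.headings +
      s.images * WEIGHTS.images +
      s.links * WEIGHTS.links)
  );
}

const textScore = jaccard("the quick brown fox", "the quick red fox");
console.log(textScore.toFixed(2));
console.log(weightedMatch({ text: 1, headings: 1, images: 2 / 3, links: 1 }).toFixed(1));
```

Because Jaccard works on sets, repeated words and word order do not affect the text score, which is why headings are checked for order separately.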

Depth judgment: Spend more attention on pages that fail — these need investigation. Pass pages just get logged.

Phase 3: Compute Health Score

Content Health Score (0-100):

  Text fidelity (50%):
    All pages pass     → 100
    1-2 pages warn     → 80
    1-2 pages fail     → 50
    3+ pages fail      → 20

  Heading fidelity (20%):
    0 missing headings → 100
    Each missing       → -10 (min 0)

  Image fidelity (20%):
    0 missing images   → 100
    Each missing       → -15 (min 0)

  Link fidelity (10%):
    0 missing links    → 100
    Each missing       → -10 (min 0)

  score = Σ (dimension_score × weight)
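
The scoring rules above can be sketched as follows. The `HealthInput` shape and function names are hypothetical. Two gaps in the rules are filled with labeled assumptions: any number of warn pages with no fail pages scores 80 (the rules only list "1-2 pages warn"), and the final score is rounded to an integer.

```typescript
// Hypothetical aggregate counts across all compared pages.
interface HealthInput {
  warnPages: number;
  failPages: number;
  missingHeadings: number;
  missingImages: number;
  missingLinks: number;
}

function textFidelity(warnPages: number, failPages: number): number {
  if (failPages >= 3) return 20;
  if (failPages >= 1) return 50;
  if (warnPages >= 1) return 80; // assumption: 3+ warn pages are treated like 1-2
  return 100;
}

// Start at 100 and subtract a fixed penalty per missing item, floored at 0.
function penalized(missing: number, perItem: number): number {
  return Math.max(0, 100 - missing * perItem);
}

function healthScore(input: HealthInput): number {
  const score =
    0.5 * textFidelity(input.warnPages, input.failPages) +
    0.2 * penalized(input.missingHeadings, 10) +
    0.2 * penalized(input.missingImages, 15) +
    0.1 * penalized(input.missingLinks, 10);
  return Math.round(score); // rounding is an assumption of this sketch
}

console.log(
  healthScore({ warnPages: 0, failPages: 0, missingHeadings: 0, missingImages: 0, missingLinks: 0 }),
);
```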

Phase 4: Report (Before Fixes)

Show the comparison report to the user:

Per-page results:

Page: /about (https://www.example.com/about)
  Text:     98% ✓
  Headings: 3/3 ✓
  Images:   2/3 ⚠ missing: hero-banner.jpg
  Links:    5/5 ✓
  Grade:    warn

Summary:

Content QA: 10 pages checked, 2 skipped (no source URL)
  8 pass  1 warn  1 fail  0 error
  Health score: 74/100

  Top issues:
  1. /project-3 [fail] — text similarity 42%, 3 missing images
  2. /about [warn] — 1 missing image (hero-banner.jpg)

Phase 5: Triage

Sort issues by severity, then decide which to fix based on tier:

Quick: Fix fail grade only. Mark warn as deferred.
Standard: Fix fail + warn. (default)
Exhaustive: Fix all, including pages with minor discrepancies.

Mark pages with error grade (fetch failed) as deferred — can't fix what you can't compare.
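
One way to express the triage rules is a small decision function plus a severity-ordered fix queue. All names here are hypothetical, and the `hasDiscrepancies` flag is an assumed way of marking pass pages that still show minor differences.

```typescript
type Grade = "pass" | "warn" | "fail" | "error";
type Tier = "quick" | "standard" | "exhaustive";
type Decision = "fix" | "deferred" | "none";

interface PageIssue {
  slug: string;
  grade: Grade;
  hasDiscrepancies: boolean; // assumed flag for minor differences on any page
}

function triage(page: PageIssue, tier: Tier): Decision {
  if (page.grade === "error") return "deferred"; // nothing to compare against
  if (page.grade === "fail") return "fix"; // fixed in every tier
  if (page.grade === "warn") return tier === "quick" ? "deferred" : "fix";
  // pass pages are only touched in the exhaustive tier
  return tier === "exhaustive" && page.hasDiscrepancies ? "fix" : "none";
}

const SEVERITY: Record<Grade, number> = { fail: 0, warn: 1, pass: 2, error: 3 };

// Pages selected for fixing, most severe first.
function fixQueue(pages: PageIssue[], tier: Tier): PageIssue[] {
  return pages
    .filter((p) => triage(p, tier) === "fix")
    .sort((a, b) => SEVERITY[a.grade] - SEVERITY[b.grade]);
}

const samplePages: PageIssue[] = [
  { slug: "/about", grade: "warn", hasDiscrepancies: true },
  { slug: "/project-3", grade: "fail", hasDiscrepancies: true },
  { slug: "/project-5", grade: "error", hasDiscrepancies: false },
];
console.log(fixQueue(samplePages, "standard").map((p) => p.slug));
```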

Phase 6: Fix Loop

For each fixable page, in severity order (fail first, then warn):

6a. Assess

Read the diff details. What's missing?

Missing alt text on existing images → patchable
Missing images entirely → needs re-extraction
Missing text sections → needs re-extraction
Minor text differences → acceptable, skip
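
The assessment rules above amount to a small classifier. The sketch below is illustrative only: the `DiffSummary` fields are hypothetical stand-ins, since the actual shape of the diffContent() result is not documented here.

```typescript
type FixAction = "patch" | "re-extract" | "skip";

// Hypothetical per-page counts derived from the diff details.
interface DiffSummary {
  missingAltText: number; // images present but lacking alt text
  missingImages: number; // images absent from the WXR entirely
  missingTextSections: number; // whole blocks of text absent from the WXR
}

function assess(diff: DiffSummary): FixAction {
  // Anything missing outright cannot be patched in place.
  if (diff.missingImages > 0 || diff.missingTextSections > 0) return "re-extract";
  if (diff.missingAltText > 0) return "patch";
  return "skip"; // only minor text differences remain
}

console.log(assess({ missingAltText: 1, missingImages: 0, missingTextSections: 0 }));
```

Re-extraction is checked first so that a page with both a patchable and a non-patchable gap is not patched and then re-extracted anyway.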

6b. Fix

Level 1: Patch the WXR (for minor fixes)

Run runQa({ wxrFile, fix: true }), which patches missing alt text and minor gaps directly in the WXR.

Level 2: Re-extract (for major gaps)

If content is too far off (text similarity <50%), the page needs full re-extraction
Flag it for the user: "Page /project-3 needs re-extraction — text similarity is 42%"
If the user approves, re-run extraction for just that URL through the adapter

6c. Verify

After fixes, re-run the comparison on fixed pages:

const result = await runQa({ wxrFile, fix: false });

Check: did the fix improve the grade? If a fix made things worse, revert the WXR from the backup.
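
The keep-or-revert decision could be sketched as a comparison of before and after snapshots. The `Snapshot` shape and `verifyFix` helper are hypothetical; ranking an error grade below fail, and using the weighted match as a tie-breaker within the same grade, are assumptions of this sketch.

```typescript
type Grade = "pass" | "warn" | "fail" | "error";

// Higher rank is better; placing "error" lowest is an assumption.
const RANK: Record<Grade, number> = { error: 0, fail: 1, warn: 2, pass: 3 };

interface Snapshot {
  grade: Grade;
  weightedMatch: number; // percentage in [0, 100]
}

type Verdict = "improved" | "unchanged" | "revert";

function verifyFix(before: Snapshot, after: Snapshot): Verdict {
  if (RANK[after.grade] < RANK[before.grade]) return "revert";
  if (RANK[after.grade] > RANK[before.grade]) return "improved";
  // Same grade: fall back to the weighted match.
  if (after.weightedMatch < before.weightedMatch) return "revert";
  return after.weightedMatch > before.weightedMatch ? "improved" : "unchanged";
}

console.log(verifyFix({ grade: "warn", weightedMatch: 88 }, { grade: "pass", weightedMatch: 97 }));
```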

6d. Self-Regulation

After every 5 fixes, evaluate:

Are the remaining issues actually fixable from the WXR?
Are we making things better or just churning?
If all remaining issues are warn with >80% similarity, stop — that's good enough.

Hard cap: 20 fix attempts. After 20, stop and report.
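
The mechanical parts of this self-check can be sketched as a stop condition. The names are hypothetical, and the sketch treats "fixes" and "fix attempts" as the same counter, which is an assumption. The two judgment questions (whether issues are fixable from the WXR, and whether the loop is churning) are left to the agent and are not encoded.

```typescript
interface RemainingIssue {
  grade: "warn" | "fail";
  textSimilarity: number; // percentage in [0, 100]
}

const MAX_FIX_ATTEMPTS = 20;
const CHECKPOINT_EVERY = 5;

function shouldStop(attempts: number, remaining: RemainingIssue[]): boolean {
  if (attempts >= MAX_FIX_ATTEMPTS) return true; // hard cap
  if (remaining.length === 0) return true; // nothing left to fix
  const atCheckpoint = attempts > 0 && attempts % CHECKPOINT_EVERY === 0;
  if (!atCheckpoint) return false;
  // At a checkpoint, stop when only high-similarity warn issues are left.
  return remaining.every((r) => r.grade === "warn" && r.textSimilarity > 80);
}

console.log(shouldStop(5, [{ grade: "warn", textSimilarity: 85 }]));
```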

6e. Escalate to /diagnose

If pages still have fail grades after fixing and can't be patched, especially when the failures share a pattern (e.g. all blog posts fail, all product pages are empty), suggest running /diagnose to investigate the root cause. QA finds the symptoms; diagnose finds the cause.

Phase 7: Final Report

After all fixes:

Content QA Complete — 10 pages checked

  Before: 74/100  →  After: 92/100

  Fixed:
    /about — patched missing alt text on hero-banner.jpg (warn → pass)
    /project-3 — re-extracted (fail → pass)

  Deferred:
    /project-5 — origin returns 404, cannot compare

  Health score: 74 → 92 (+18)

Include:

Total pages checked
Fix count (patched: X, re-extracted: Y)
Deferred issues with reasons
Health score delta: before → after

Using the Code

import { runQa } from './src/lib/qa-runner.js';

// Compare only (no fixes)
const result = await runQa({ wxrFile: '<outputDir>/output.wxr' });

// Compare and fix
const fixResult = await runQa({ wxrFile: '<outputDir>/output.wxr', fix: true });

The QaResult contains:

pages[] — per-page results with slug, sourceUrl, grade, diff details
skipped — count of pages without _source_url
summary — { pass, warn, fail, error, fixed }

The QA log is written to qa-log.jsonl alongside the WXR file.
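
For working with the result and the log, a sketch along these lines may help. The interfaces are inferred from the field list above and are not the actual exported types; the log entry format is not specified, so `parseJsonl` (a hypothetical helper) returns untyped values and takes the file contents as a string rather than reading from disk.

```typescript
type Grade = "pass" | "warn" | "fail" | "error";

// Inferred shapes; the real QaResult types may carry more fields.
interface QaPage {
  slug: string;
  sourceUrl: string;
  grade: Grade;
}

interface QaSummary {
  pass: number;
  warn: number;
  fail: number;
  error: number;
  fixed: number;
}

// Recompute the summary counts from per-page results.
function tally(pages: QaPage[], fixed: number = 0): QaSummary {
  const summary: QaSummary = { pass: 0, warn: 0, fail: 0, error: 0, fixed };
  for (const page of pages) {
    summary[page.grade]++;
  }
  return summary;
}

// Parse JSON Lines text: one JSON value per non-empty line.
function parseJsonl(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

const sampleSummary = tally([
  { slug: "about", sourceUrl: "https://www.example.com/about", grade: "warn" },
  { slug: "home", sourceUrl: "https://www.example.com/", grade: "pass" },
]);
console.log(`${sampleSummary.pass} pass  ${sampleSummary.warn} warn  ${sampleSummary.fail} fail  ${sampleSummary.error} error`);
```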

Important Rules

Compare before fixing. Always show the report first. Ask the user before applying fixes.
Minimal fixes. Patch what's safe (alt text, minor gaps). Flag major gaps for re-extraction.
Verify after fixing. Re-run comparison on fixed pages. If the fix made things worse, revert.
No WordPress site needed. QA compares the WXR against the origin site directly.
Log everything. Every comparison and fix goes to qa-log.jsonl.
Don't over-fix. Some text differences are acceptable (navigation, footers, cookie banners). Focus on the main content.
Pages without _source_url can't be QA'd. Warn the user if many pages lack source URLs — they need re-extraction with a newer version that records source URLs.
Self-regulate. Stop after 20 fix attempts or when remaining issues are minor.
Log discoveries. If you find a pattern of content loss specific to a platform (e.g. "Squarespace always drops image captions"), add it to DISCOVERIES.md so future extractions can be improved.