name: zbeam-crosslinker description: "Adds internal crosslinks to an existing page (target-first, density scaled to depth). Trigger on 'crosslink [slug]' or in a batch quality run."
Z-Beam Crosslinker
You add internal links to an existing page. Work target-first, not keyword-first.
The naive approach — scan the prose and link whatever common words appear — floods every
page with steel, aluminum, and a handful of other over-linked hubs, while hundreds of
less-known pages get nothing. Invert it: first decide which under-linked, topically
relevant pages deserve a link from here, then hunt this page's prose for an interesting,
distinctive place to introduce each one.
Two goals:
- Spread link equity to a variety of less-known pages — strongly favour targets that currently have few inbound links over the popular hubs. Variety of targets is the point.
- Place each link in a distinctive context with a specific anchor — never a bare, over-common anchor like "steel"; prefer a grade number, a full distinctive name, or a specific technical phrase, each sitting in a different section of the page.
- Create a context when none exists. When an under-linked target is genuinely relevant
but unmentioned, you may write a short, factually true clause that introduces it, then
link it (see
references/scoring-algorithm.md→ Creating a context).
Linking is one-directional by design — no reciprocal or back-link logic. Density scales with page depth (Step 3 budget); links spread across distinct contexts, never clustered.
Input required: slug of the page to crosslink.
Output: updated YAML file + diff summary at data/audit/crosslinks-[slug]-[YYYY-MM-DD].md
Modes
- Default (manual / reviewed runs): full behaviour — existing-anchor links and created contexts.
existing-anchor-only(autonomous cadence, e.g. weekly-batch): place only existing-anchor links — never create a context. When a relevant under-linked target has no natural anchor, log it todata/audit/crosslink-created-context-candidates-[slug]-[date].json. Everything else (target-first selection, relevance gate, exclusion list, budget) is unchanged.
Step 0: Governance — prior knowledge check + hypothesis write
python3 skills/shared/check-prior-knowledge.py --slug [slug]
If USE_PRIOR or SUPPLEMENT: note prior findings and dead ends. If dead ends present (e.g. "previous crosslink attempt showed no impression lift"), factor into hypothesis.
Infer hypothesis from scoring results and orphan status — never ask the user. Example: "Adding [n] crosslinks to [slug] (orphan score: [x]). Hypothesis: internal linking will reduce orphan status and increase discovery via related pages, improving inbound-link count on targets within 28 days."
python3 skills/shared/change-log-utils.py write \
--slug [slug] --skill zbeam-crosslinker --mode crosslink \
--hypothesis "[inferred — specific, testable, 28-day window]"
Skip change-log write if zero links were added (run Step 0 check first; defer write to after Step 5 if budget is unknown).
Step 1: Build the crosslink index
Read references/scoring-algorithm.md for the full cache-first index build (Materials, Applications,
Videos), inbound-link ledger construction, and bucket definitions (0–1 under-linked / 2–5 normal /
≥6 saturated). Remove the current page's own slug from the index — never link to self.
Step 2: Select target pages (target-first)
Read references/scoring-algorithm.md → Step 2: Target selection for the full relevance gate,
ranking rules, and saturated-hub demotion logic. Produce ~10–15 ranked candidates.
Step 3: Find a distinctive context + anchor for each target
A context is any reader-facing body prose field on the page — links may land in any
of them, not just page.description.
Discover contexts generically — do not use a fixed list. Walk the page's YAML and treat every
sentence-bearing body string as a candidate: FAQ answers, every heading.description, every
*.prose, page.description, problem / standard / regulatory *.description and itemDescription,
body.sections[*].items[*].text / .description / .details[*], localContext.regulatoryNotes,
enhancements.summary, and any other body field that holds prose.
Never place a link in these (not body prose, or not markdown-rendered):
- SEO metadata:
page.metaDescription,openGraph.*,twitter.* - Structured data:
jsonLd.*,schema.*, any*Schema/videoSchemablock - Titles / names / questions:
page.title, any*.title,*.name,heading.title, FAQquestion,longName, author-bio fields keywords,citations[*].*, and any URL / slug / id / date / image / alt field
Render dependency: body prose becomes HTML via renderMarkdown (section & heading descriptions,
prose, body sections) and parseSimpleMarkdown (FAQ answers — ExpertAnswersPanel); both turn
[text](url) into <a href>. Any body field rendered through those is link-safe.
For each ranked candidate, find the single best (context, anchor) pair:
- Acceptable anchor (best to worst): grade/model number (
316L,6061); full distinctive name (hastelloy,weld decay,passivation); specific multi-word technical phrase already in prose. - Reject generic anchors: bare
steel,aluminum,metal,surface,coating,cleaning,oxide. Drop the target if only a generic anchor is available. - Natural fit: the sentence must already be about the target's topic. Never bend prose to fit a link.
- One link per context.
If no distinctive natural anchor exists, read references/scoring-algorithm.md → Creating a context
for rules (all mandatory) before writing any new prose.
Assemble the final set:
- Walk ranked targets; take each target's best (context, anchor) pair.
- Global de-dup: one link per target, one link per anchor phrase; no self-link; no link inside existing
[...]. - One link per context (second allowed only if context >400 chars, anchors >120 chars apart, different targets).
- Per-page budget: ~1 link per 120 words of prose, capped at 10.
- Variety check: if more than ~⅓ of links point to the same category, drop the weakest and substitute an under-linked candidate.
Step 4: Apply crosslinks surgically
(a) Existing anchor — replace the matched keyword with a markdown link:
new_text = f'[{matched_keyword}]({target_url})'
(b) Created context — insert the short true clause with the link embedded at the natural position.
Apply replacements last to first by position. Use surgical string replacement (NOT yaml.dump):
with open(page_path, 'r') as f:
content = f.read()
field_block = extract_field_block(content, field_path)
new_block = field_block.replace(old_keyword_occurrence, new_text, 1)
content = content.replace(field_block, new_block, 1)
with open(page_path, 'w') as f:
f.write(content)
Validate after each replacement:
python3 -c "import yaml; yaml.safe_load(open('[path]'))" && echo "valid"
If validation fails, revert the replacement and skip it.
Step 5: Generate diff summary
Save to data/audit/crosslinks-[slug]-[YYYY-MM-DD].md:
# Crosslink Report: [slug]
**Date**: [date]
**Links added**: [n] (budget [b], from ~[w] words)
**Contexts covered**: [c] of [total] distinct contexts on the page
**Target mix**: [x] under-linked (0–1 inbound) · [y] normal · [z] saturated hubs
| Context (field) | Anchor | Target page | Target inbound (before) | URL |
|---|---|---|---|---|
| panels.faq.items[3].answer | 316L | Stainless Steel 316 | 1 | /materials/metal/alloy/stainless-steel-316-laser-cleaning |
**Candidates considered but not selected** (with reason):
- "stainless steel" hub → saturated (8 inbound) and only a generic anchor → skipped
**Remaining opportunities** (for future runs):
- [under-linked targets that were relevant but found no good context this run]
Rendering dependency
Crosslinks render as <a href> because renderMarkdown (BaseSection) and parseSimpleMarkdown
(ExpertAnswersPanel, lib/text-formatting.ts) turn [text](url) into links. If either converter's
link handling is removed, crosslinks in affected fields display as literal bracket syntax.
What this skill never does
- Never exceeds the per-page budget or places more than one link in a single context (save the long-context exception)
- Never links generic anchors (
steel,aluminum,metal,surface,coating,cleaning,oxide) - Never piles links onto saturated hubs (≥6 inbound) over under-linked targets
- Never places an irrelevant link just to spread equity — relevance is a hard gate
- Never creates reciprocal or back-links
- Never links to external sites
- Never fabricates facts when creating a context — a created clause must be true, or the target is skipped
- Never creates more than 2–3 contexts per page
- Never links to the page being crosslinked (no self-links)
- Never uses yaml.dump() — surgical string replacement only
- Never runs without YAML validation after each write
⏸ COMMIT GATE
Never auto-commit. Summarize what changed, provide the git command, wait for Todd to run it.