zbeam-category-enricher - SKILL.md Agent Skill

name: zbeam-category-enricher
description: "Researches a material category and produces a reusable data package for batch page updates. Run before zbeam-page-updater for any batch update."

Z-Beam Category Enricher

You are building a reusable data package for all pages in a material or application category. A single run here replaces individual content-researcher runs for every page in the category.

Input: category name and subcategory (from content auditor output or Todd's request). Examples: metal / ferrous, stone / sedimentary, applications / weld-prep.

Output: data/research/category-data/[category]-[subcategory]-[YYYY-MM-DD].json

Defined categories

Material categories to enrich (in priority order from auditor):

Category	Subcategory	Key materials	Primary research angle
metal	ferrous	steel, stainless steel, cast iron, wrought iron, galvanized	Mill scale removal, weld prep, SSPC standards
metal	non-ferrous	aluminum, copper, titanium, nickel	Oxide removal, EV/aerospace applications
metal	alloy	brass, bronze, zinc	Corrosion products, heritage restoration
stone	sedimentary	limestone, sandstone, travertine	Heritage conservation, biological growth
stone	metamorphic	marble, slate, quartzite	Cultural heritage, gypsum crust removal
stone	igneous	granite, basalt	Industrial and heritage, harder substrates
masonry	general	brick, concrete, mortar	Heritage, industrial, BAAQMD
ceramic	technical	alumina, zirconia, silicon carbide	Precision manufacturing, semiconductor
glass	optical	borosilicate, soda-lime, optical glass	Precision optics, contamination
polymer	engineering	PEEK, Nylon, PTFE	Industrial, mold cleaning
wood	hardwood	oak, walnut, mahogany	Heritage restoration, furniture

Research protocol

Track 1: Source citations for existing dataCard values

The most critical task. Every material page has dataCard values with confidence: high but no source attribution. Your job is to find the primary literature that supports (or corrects) these values for the category.

For each key dataCard metric, search for published measurements:

Fluence threshold / ablation threshold:

site:mdpi.com "laser cleaning" [category material] "fluence" J/cm²
site:mdpi.com "pulsed laser" ablation [category material] threshold nanosecond
"Journal of Laser Applications" [category material] cleaning fluence
"1064 nm" laser cleaning [category material] parameters measured
Nd:YAG OR "fiber laser" cleaning [category material] "J/cm²" results

Pulse duration effects:

site:mdpi.com nanosecond "laser cleaning" [category material] pulse duration effect
"pulsed laser ablation" [category material] pulse width ns

Surface roughness outcomes:

"laser cleaning" [category material] "surface roughness" Ra μm measured
"laser cleaning" [category material] adhesion improvement measured
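Each template above has to be run once per key material in the category. A minimal sketch of that expansion, assuming the placeholder is written exactly as [category material] and that material names contain no "/" or "&" characters; the function name expand_queries is hypothetical:

```shell
#!/bin/sh
# expand_queries TEMPLATE MATERIAL...
# Print one search query per material by substituting the material name
# for the "[category material]" placeholder used in the templates.
# Assumption: material names contain no "/" or "&" (both are special to sed).
expand_queries() {
  template=$1
  shift
  for material in "$@"; do
    printf '%s\n' "$template" | sed "s/\[category material\]/$material/g"
  done
}

# Example: the surface roughness template for the stone / sedimentary category.
expand_queries '"laser cleaning" [category material] "surface roughness" Ra μm measured' \
  limestone sandstone travertine
```

The example prints three queries, one each for limestone, sandstone, and travertine.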

For each value found, record:

The measured value (e.g., "ablation threshold 1.2 J/cm²")
The laser type and wavelength (note whether it is 1064 nm, which matches Netalux)
The substrate condition (oxidized, coated, bare)
The full source citation (author, journal, year, DOI if available)
Whether it matches, contradicts, or refines the existing dataCard value

This produces source attributions that can be added directly to the dataCard YAML.
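While reading sources, it can help to log each finding in a scratch file before assembling the final package. The sketch below is one possible way to do that, not part of the output schema: the file name findings.tsv, the function name record_finding, and the column order are all assumptions. It stores the five fields listed above and rejects any verdict other than matches, contradicts, or refines.

```shell
#!/bin/sh
# Hypothetical scratch log: one tab-separated row per measurement found.
# The real package schema lives in references/output-format.md.
FINDINGS="${FINDINGS:-findings.tsv}"

# record_finding VALUE LASER SUBSTRATE CITATION VERDICT
record_finding() {
  if [ "$#" -ne 5 ]; then
    echo "usage: record_finding VALUE LASER SUBSTRATE CITATION VERDICT" >&2
    return 2
  fi
  # The verdict must say how the finding relates to the existing dataCard value.
  case $5 in
    matches|contradicts|refines) ;;
    *) echo "verdict must be matches, contradicts, or refines" >&2; return 1 ;;
  esac
  printf '%s\t%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" "$5" >> "$FINDINGS"
}
```

A call might look like `record_finding "ablation threshold 1.2 J/cm²" "Nd:YAG, 1064 nm" "oxidized" "<author, journal, year, DOI>" refines`, where the citation is a placeholder to be filled from the paper itself.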

Extracting data from fetched papers — use grep, not subagents

When mcp__workspace__web_fetch saves a large paper to disk, do not spawn a subagent to read it. A subagent reading a 1,200-line file in 28-line chunks costs ~80k tokens. Grep costs zero.

Use this pattern instead:

FILE="/path/to/saved/paper.txt"

# Extract all quantitative measurements in one pass
grep -iE "J/cm|fluence|ablation|threshold|Ra |roughness|porosity|hardness|MPa|W/m|kHz|ns pulse|DOI" "$FILE" | head -40

Then read only the specific lines around each match:

grep -n "ablation threshold" "$FILE" | head -5
# → 342:... ablation threshold was 1.8 J/cm² ...
sed -n '337,347p' "$FILE"  # line 342 plus 5 lines of context on each side

This extracts the same data the subagent would find, but in milliseconds and with no token cost. Only spawn a subagent when grep fails to find structured data (e.g., results embedded in tables or images that grep can't read).
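The two-step pattern (find the line number, then print a window around it) can also be collapsed into a single call with grep's -C option, which GNU and BSD grep both support. A small sketch, where the helper name show_matches is hypothetical:

```shell
#!/bin/sh
# show_matches PATTERN FILE [CONTEXT]
# Print every case-insensitive match with its line number, plus CONTEXT
# lines (default 5) before and after it. Matching lines appear as "N:text",
# context lines as "N-text", and separate groups are divided by "--".
show_matches() {
  grep -n -i -C "${3:-5}" -e "$1" -- "$2"
}
```

For example, `show_matches "ablation threshold" "$FILE"` prints each hit with five lines of context on either side, so no separate sed call is needed.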

Track 2: Category-level technical specifics

What is distinctive about laser interaction with this category of material that is not currently in the pages?

For metals: wavelength absorption characteristics, thermal conductivity effects on cleaning efficiency, oxidation chemistry of common contaminants, how alloy composition affects threshold

For stone/masonry: photochemical vs photothermal removal of biological growth and gypsum crusts, wavelength selectivity for organic vs. inorganic contamination, risk of thermal spalling on porous materials

For ceramics/glass: plasma formation effects, surface damage thresholds, dielectric breakdown considerations

site:mdpi.com "laser cleaning" [category] mechanism photochemical OR photothermal
site:opg.optica.org laser ablation [category material] mechanism
"Applied Surface Science" laser cleaning [category] thermal effects

Track 3: Machine class parameter variation for this category

This track validates the byMachineClass multipliers for this category against primary literature. See references/output-format.md for the machineClassFindings[] output schema and ADR 008 default multiplier values. Record per-material refinements, or flag a multiplier as unvalidated when no source supports it.

Beam profile comparison (top-hat vs Gaussian):

site:mdpi.com "top-hat" OR "flat-top" laser cleaning [category material] Gaussian comparison
"beam profile" laser cleaning [category material] fluence ablation
site:opg.optica.org beam shaping laser cleaning [category material]

Pulse width effects:

"pulse width" OR "pulse duration" laser cleaning [category material] nanosecond comparison
MOPA laser cleaning [category material] pulse optimization

Track 4: Standards applicable to this category

SSPC "laser cleaning" [category application] standard
NACE [category material] surface preparation laser
AWS laser cleaning [category material] weld preparation
ISO 8501 laser cleaning acceptance criteria [category]

Track 5: Diverse FAQ questions from real user searches

The existing FAQ formula ("How is X laser cleaner used on X?") must be replaced with questions real users actually ask. Find them:

site:reddit.com "laser cleaning" [category material] question
site:reddit.com/r/metalworking OR site:reddit.com/r/DIY OR site:reddit.com/r/fabrication [category material] clean
"laser cleaning" [category material] "how do" OR "can you" OR "will it" site:quora.com
People Also Ask results for "laser cleaning [category material]"
[category material] cleaning problems forum OR discussion 2024 2025 2026

Extract 8-12 genuine user questions per category. These replace the formulaic FAQ questions across all pages in the category — each individual page will use the 4-5 most relevant to that specific material.

Track 6: Unusual effects specific to this category

Find at least 2 unusual or non-obvious effects documented in primary literature:

"laser cleaning" [category material] "unexpected" OR "anomalous" OR "surprising"
"laser cleaning" [category material] subsurface OR microstructure OR fatigue

Data sufficiency check and output format

Read references/output-format.md before saving — it contains the full JSON schema, data sufficiency thresholds (min 3 quantitative sources, 6 FAQ questions, 1 unusual effect), and the mandatory real-URL requirement for all source: fields.
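The three minimums quoted above can be checked mechanically once the counts are known. This is a rough sketch only: the function name check_sufficiency is hypothetical, the counts are passed in by hand, and references/output-format.md remains the authority if its thresholds differ.

```shell
#!/bin/sh
# check_sufficiency SOURCES FAQS EFFECTS
# Compare the three counts against the minimums quoted above
# (3 quantitative sources, 6 FAQ questions, 1 unusual effect).
# Prints one line per shortfall and returns non-zero if any check fails.
check_sufficiency() {
  sources=$1
  faqs=$2
  effects=$3
  status=0
  [ "$sources" -ge 3 ] || { echo "need at least 3 quantitative sources, have $sources"; status=1; }
  [ "$faqs" -ge 6 ]    || { echo "need at least 6 FAQ questions, have $faqs"; status=1; }
  [ "$effects" -ge 1 ] || { echo "need at least 1 unusual effect, have $effects"; status=1; }
  return "$status"
}
```

`check_sufficiency 4 9 2` returns success silently, while `check_sufficiency 2 9 0` reports the two shortfalls and returns 1.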

Save to: data/research/category-data/[category]-[subcategory]-[YYYY-MM-DD].json
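A short sketch of building that path, assuming the date is today's date in YYYY-MM-DD form; the function name package_path and its optional date override are hypothetical conveniences:

```shell
#!/bin/sh
# package_path CATEGORY SUBCATEGORY [DATE]
# Print the output path for a category package. DATE defaults to today
# in YYYY-MM-DD form; pass it explicitly to reproduce an earlier file name.
package_path() {
  printf 'data/research/category-data/%s-%s-%s.json\n' \
    "$1" "$2" "${3:-$(date +%Y-%m-%d)}"
}

# Example with an explicit, arbitrary date:
package_path metal ferrous 2025-01-31
# → data/research/category-data/metal-ferrous-2025-01-31.json
```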

After saving, confirm: pages this applies to, dataCard values sourced/corrected/flagged, diverse FAQ questions found, unusual effects documented, machineSettings confirmed as material-specific or flagged as generic.

How this feeds page updates

When zbeam-page-updater runs on an individual page in this category, it loads this package first. The package tells it:

Which dataCard values are now attributed (add source field to each metric)
Which values need correction (update the value and add source)
Which FAQ questions to use (drawn from diverseFAQQuestions, selecting the 4-5 most relevant to this specific material)
Which unusual effect to include in the properties description
Which standards to reference if applicable

This means the full research pipeline doesn't need to re-run per page — the category package is the research, and the page updater applies it.