name: zbeam-category-enricher
description: "Researches a material category and produces a reusable data package for batch page updates. Run before zbeam-page-updater for any batch update."
Z-Beam Category Enricher
You are building a reusable data package for all pages in a material or application category. A single run here replaces individual content-researcher runs for every page in the category.
Input: category name and subcategory (from content auditor output or Todd's request).
Examples: metal / ferrous, stone / sedimentary, applications / weld-prep
Output: data/research/category-data/[category]-[subcategory]-[YYYY-MM-DD].json
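The output path follows a fixed pattern, so it can be derived from the two inputs and the run date. A minimal sketch, assuming a hypothetical helper named package_path (not part of the skill) and that the date is the day of the run:

```shell
# Hypothetical helper: builds the package path from the documented pattern
# data/research/category-data/[category]-[subcategory]-[YYYY-MM-DD].json
package_path() (
  category="$1"
  subcategory="$2"
  run_date="${3:-$(date +%F)}"   # assumption: defaults to today, YYYY-MM-DD
  printf 'data/research/category-data/%s-%s-%s.json\n' \
    "$category" "$subcategory" "$run_date"
)

package_path metal ferrous 2025-01-15
# prints data/research/category-data/metal-ferrous-2025-01-15.json
```

Passing the date explicitly keeps the file name reproducible when a run spans midnight.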
Defined categories
Material categories to enrich (in priority order from auditor):
| Category | Subcategory | Key materials | Primary research angle |
|---|---|---|---|
| metal | ferrous | steel, stainless steel, cast iron, wrought iron, galvanized | Mill scale removal, weld prep, SSPC standards |
| metal | non-ferrous | aluminum, copper, titanium, nickel | Oxide removal, EV/aerospace applications |
| metal | alloy | brass, bronze, zinc | Corrosion products, heritage restoration |
| stone | sedimentary | limestone, sandstone, travertine | Heritage conservation, biological growth |
| stone | metamorphic | marble, slate, quartzite | Cultural heritage, gypsum crust removal |
| stone | igneous | granite, basalt | Industrial and heritage, harder substrates |
| masonry | general | brick, concrete, mortar | Heritage, industrial, BAAQMD |
| ceramic | technical | alumina, zirconia, silicon carbide | Precision manufacturing, semiconductor |
| glass | optical | borosilicate, soda-lime, optical glass | Precision optics, contamination |
| polymer | engineering | PEEK, Nylon, PTFE | Industrial, mold cleaning |
| wood | hardwood | oak, walnut, mahogany | Heritage restoration, furniture |
Research protocol
Track 1: Source citations for existing dataCard values
The most critical task. Every material page has dataCard values with confidence: high
but no source attribution. Your job is to find the primary literature that supports
(or corrects) these values for the category.
For each key dataCard metric, search for published measurements:
Fluence threshold / ablation threshold:
site:mdpi.com "laser cleaning" [category materials] "fluence" J/cm²
site:mdpi.com "pulsed laser" ablation [category material] threshold nanosecond
"Journal of Laser Applications" [category material] cleaning fluence
"1064 nm" laser cleaning [category material] parameters measured
Nd:YAG OR "fiber laser" cleaning [category material] "J/cm²" results
Pulse duration effects:
site:mdpi.com nanosecond "laser cleaning" [category material] pulse duration effect
"pulsed laser ablation" [category material] pulse width ns
Surface roughness outcomes:
"laser cleaning" [category material] "surface roughness" Ra μm measured
"laser cleaning" [category material] adhesion improvement measured
For each value found, record:
- The measured value (e.g., "ablation threshold 1.2 J/cm²")
- The laser type and wavelength (note if 1064 nm, matching Netalux)
- The substrate condition (oxidized, coated, bare)
- The full source citation (author, journal, year, DOI if available)
- Whether it matches, contradicts, or refines the existing dataCard value
This produces source attributions that can be added directly to the dataCard YAML.
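The search templates in this track use [category material] as a placeholder for each key material in the category. One way to expand a template across several materials is sketched below; expand_queries is a hypothetical helper name, and it assumes material names contain no characters that are special to sed (such as | or &).

```shell
# Hypothetical helper: substitutes each material into a query template.
# Usage: expand_queries TEMPLATE MATERIAL [MATERIAL...]
expand_queries() (
  template="$1"
  shift
  for material in "$@"; do
    # Replace every literal "[category material]" with the material name
    printf '%s\n' "$template" | sed "s|\[category material\]|$material|g"
  done
)

expand_queries 'site:mdpi.com "pulsed laser" ablation [category material] threshold nanosecond' \
  steel "cast iron"
# prints:
# site:mdpi.com "pulsed laser" ablation steel threshold nanosecond
# site:mdpi.com "pulsed laser" ablation cast iron threshold nanosecond
```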
Extracting data from fetched papers — use grep, not subagents
When mcp__workspace__web_fetch saves a large paper to disk, do not spawn a subagent to read it. A subagent reading a 1,200-line file in 28-line chunks costs ~80k tokens. Grep costs only the handful of lines it prints.
Use this pattern instead:
FILE="/path/to/saved/paper.txt"
# Extract all quantitative measurements in one pass
grep -iE "J/cm|fluence|ablation|threshold|Ra |roughness|porosity|hardness|MPa|W/m|kHz|ns pulse|DOI" "$FILE" | head -40
Then read only the specific lines around each match:
grep -in "ablation threshold" "$FILE" | head -5
# → line 342: "ablation threshold was 1.8 J/cm²"
sed -n '337,347p' "$FILE" # read 5 lines of context on each side of line 342
This extracts the same data the subagent would find, in milliseconds and at negligible token cost. Only spawn a subagent when grep cannot find the data (e.g., results embedded in tables or images that grep can't read).
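The locate-then-read steps can also be collapsed into one call where the installed grep supports the -C context option (GNU and BSD grep both do). A minimal sketch, with show_context as a hypothetical helper name:

```shell
# Sketch: print every match with N lines of context on each side.
# show_context is a hypothetical helper, not part of any tool.
# Usage: show_context PATTERN FILE [CONTEXT_LINES]
show_context() (
  pattern="$1"
  file="$2"
  lines="${3:-5}"   # assumption: 5 lines of context by default
  # -n prefixes line numbers, -i ignores case, -C adds surrounding lines
  grep -n -i -C "$lines" -e "$pattern" "$file"
)

# Example call (path is a placeholder):
# show_context "ablation threshold" "/path/to/saved/paper.txt"
```

Matching lines are printed as `342:text` and context lines as `341-text`, so the matched line remains easy to spot.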
Track 2: Category-level technical specifics
What is distinctive about laser interaction with this category of material that is not currently in the pages?
For metals: wavelength absorption characteristics, thermal conductivity effects on cleaning efficiency, oxidation chemistry of common contaminants, how alloy composition affects threshold
For stone/masonry: photochemical vs photothermal removal of biological growth and gypsum crusts, wavelength selectivity for organic vs. inorganic contamination, risk of thermal spalling on porous materials
For ceramics/glass: plasma formation effects, surface damage thresholds, dielectric breakdown considerations
site:mdpi.com "laser cleaning" [category] mechanism photochemical OR photothermal
site:opg.optica.org laser ablation [category material] mechanism
"Applied Surface Science" laser cleaning [category] thermal effects
Track 3: Machine class parameter variation for this category
This track validates the byMachineClass multipliers for this category against primary literature.
See references/output-format.md for the machineClassFindings[] output schema and
ADR 008 default multiplier values. Record per-material refinements, or flag the multipliers as unvalidated.
Beam profile comparison (top-hat vs Gaussian):
site:mdpi.com "top-hat" OR "flat-top" laser cleaning [category material] Gaussian comparison
"beam profile" laser cleaning [category material] fluence ablation
site:opg.optica.org beam shaping laser cleaning [category material]
Pulse width effects:
"pulse width" OR "pulse duration" laser cleaning [category material] nanosecond comparison
MOPA laser cleaning [category material] pulse optimization
Track 4: Standards applicable to this category
SSPC "laser cleaning" [category application] standard
NACE [category material] surface preparation laser
AWS laser cleaning [category material] weld preparation
ISO 8501 laser cleaning acceptance criteria [category]
Track 5: Diverse FAQ questions from real user searches
The existing FAQ formula ("How is X laser cleaner used on X?") must be replaced with questions real users actually ask. Find them:
site:reddit.com "laser cleaning" [category material] question
site:reddit.com/r/metalworking OR site:reddit.com/r/DIY OR site:reddit.com/r/fabrication [category material] clean
"laser cleaning" [category material] "how do" OR "can you" OR "will it" site:quora.com
People Also Ask results for "laser cleaning [category material]"
[category material] cleaning problems forum OR discussion 2024 2025 2026
Extract 8-12 genuine user questions per category. These replace the formulaic FAQ questions across all pages in the category — each individual page will use the 4-5 most relevant to that specific material.
Track 6: Unusual effects specific to this category
Find at least 2 unusual or non-obvious effects documented in primary literature:
"laser cleaning" [category material] "unexpected" OR "anomalous" OR "surprising"
"laser cleaning" [category material] subsurface OR microstructure OR fatigue
Data sufficiency check and output format
Read references/output-format.md before saving — it contains the full JSON schema,
data sufficiency thresholds (min 3 quantitative sources, 6 FAQ questions, 1 unusual effect),
and the mandatory real-URL requirement for all source: fields.
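The thresholds named above can be checked mechanically before saving. The sketch below is illustrative only: check_sufficiency and is_real_url are hypothetical helpers, the counts are passed in as plain integers because the JSON schema lives in references/output-format.md, and the URL test is a syntactic approximation that cannot confirm a page actually exists.

```shell
# Hypothetical pre-save check using the minimums stated in this section:
# 3 quantitative sources, 6 FAQ questions, 1 unusual effect.
# Usage: check_sufficiency SOURCES FAQS EFFECTS
check_sufficiency() (
  sources="$1"
  faqs="$2"
  effects="$3"
  status=0
  [ "$sources" -ge 3 ] || { echo "need >= 3 quantitative sources, have $sources"; status=1; }
  [ "$faqs" -ge 6 ]    || { echo "need >= 6 FAQ questions, have $faqs"; status=1; }
  [ "$effects" -ge 1 ] || { echo "need >= 1 unusual effect, have $effects"; status=1; }
  exit "$status"
)

# Assumption: a "real URL" is approximated as an http(s) URL with a dotted host.
# This only catches placeholders such as "TBD" or bare journal names.
is_real_url() {
  printf '%s\n' "$1" | grep -Eq '^https?://[^/[:space:]]+\.[^/[:space:]]+'
}
```

A run with `check_sufficiency 2 8 2` prints the shortfall in sources and returns a non-zero status, which makes it usable as a gate in a script.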
Save to: data/research/category-data/[category]-[subcategory]-[YYYY-MM-DD].json
After saving, confirm: pages this applies to, dataCard values sourced/corrected/flagged, diverse FAQ questions found, unusual effects documented, machineSettings confirmed as material-specific or flagged as generic.
How this feeds page updates
When zbeam-page-updater runs on an individual page in this category, it loads
this package first. The package tells it:
- Which dataCard values are now attributed (add source field to each metric)
- Which values need correction (update the value and add source)
- Which FAQ questions to use (drawn from diverseFAQQuestions, selecting the 4-5 most relevant to this specific material)
- Which unusual effect to include in the properties description
- Which standards to reference if applicable
This means the full research pipeline doesn't need to re-run per page — the category package is the research, and the page updater applies it.
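Because package file names end in an ISO date, the newest package for a category can be found with a plain text sort. The sketch below assumes the page updater wants the most recent file; how zbeam-page-updater actually selects a package is not specified here, and latest_package is a hypothetical helper name.

```shell
# Hypothetical lookup: newest package for a category/subcategory pair.
# Usage: latest_package CATEGORY SUBCATEGORY [DIRECTORY]
latest_package() (
  category="$1"
  subcategory="$2"
  dir="${3:-data/research/category-data}"
  # YYYY-MM-DD dates sort chronologically as plain text, so the last entry
  # after sorting is the newest. The date glob also keeps "metal ferrous"
  # from matching metal-non-ferrous-* files.
  ls "$dir/$category-$subcategory-"[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].json 2>/dev/null \
    | sort | tail -n 1
)

# Example call:
# latest_package metal ferrous
```

The function prints nothing when no package exists, which a caller can treat as a signal to run the enricher first.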