zbeam-weekly-batch - SKILL.md Agent Skill

name: zbeam-weekly-batch description: "Batch content session: reads gap-signals + content-improvement-queue, queues pages, runs full research→editorial→generation pipeline. Manual — invoke after running zbeam-pipeline-auditor to refresh gap-signals."

Z-Beam Weekly Batch

Pipeline position: zbeam-pipeline-auditor (manual) → YOU (manual)

What this skill does: Reads the pipeline auditor's output, builds a prioritized queue of pages to update, and runs the full upstream research + generation pipeline for each page — unattended. No git actions. Todd commits and pushes after reviewing results.

Input (no arguments — reads automatically):

data/audit/gap-signals-[latest].json — prioritized pages from auditor (must be < 24 h old)
data/audit/content-audit-[latest].json — thinness scores
data/research/category-data/ — enricher packages
skills/shared/validator-config.json — shared thresholds

Output:

Updated frontmatter files (via generation-loop → page-updater)
data/audit/weekly-batch-[YYYY-MM-DD].md — run summary
data/audit/weekly-batch-[YYYY-MM-DD].json — machine-readable results (includes lastUpdated per slug)
data/audit/weekly-batch-checkpoint-[YYYY-MM-DD].json — per-slug progress (enables resume on context overflow)

Session limit: max pages per run = batchLimits.maxPagesPerRun in validator-config.json (default 8).

Step 0a — Load session state

Load data/pipeline/session-state.json. Clear stale in-flight slugs (> 2 hours old). Use lastRunDecisions to deprioritize slugs that completed in the most recent run.

At end of batch: write updated session state back (stats, selfHealingAttempts, lastRunDecisions).

Session state shape:

{
  "lastRunDecisions": {},
  "inFlightSlugs": {},
  "recentSelfHealingAttempts": [],
  "stats": {"totalBatchRuns": 0, "totalPagesProcessed": 0, "totalNewPagesBuilt": 0}
}

Step 0 — Trigger check + load shared config

Trigger gate (idempotency): compare gap-signals mtime to last batch run date.

No gap-signals file → print "TRIGGER NOT MET", exit cleanly (not an error)
gap-signals date ≤ last batch date → print "TRIGGER NOT MET", exit cleanly
gap-signals newer → proceed

import json, glob, os
from datetime import date

cfg = json.load(open('skills/shared/validator-config.json'))
MAX_QUEUE       = cfg['batchLimits']['maxPagesPerRun']       # default 8
ENRICHER_TTL    = cfg['enricherTtlDays']                     # default 30
GAP_MAX_AGE_HRS = cfg['freshness']['gapSignalsMaxAgeHours']  # default 24
INTEL_MAX_AGE   = cfg['freshness']['intelligenceMaxAgeDays'] # default 45

Step 0 — Preamble: update research index

Run before any queue-building or research:

python3 skills/shared/index-research-briefs.py

This ensures data/research/research-index.json is current so all research skills (content-researcher, differentiation-researcher) can call check-prior-knowledge.py and avoid re-researching settled facts.

Step 1 — Safety checks

Duplicate run guard: if weekly-batch-[today].json exists with status: completed → exit.
Resume mode: if weekly-batch-checkpoint-[today].json exists → load completed_slugs set.
Gap signals freshness: abort if gap-signals older than GAP_MAX_AGE_HRS.
Intelligence freshness: warn only (do not abort) if intelligence briefing > INTEL_MAX_AGE days.

Step 2 — Build the slug queue

Read references/queue-tiers.md for the full Tier 1–4 definitions and content-improvement-queue processing logic.

Cross-week deduplication: skip slugs updated within last 14 days (load from all weekly-batch-*.json).

autoRunnable: false items are always skipped — they require human judgment.

New page candidates: read gaps.highPriorityGaps where currentCoverage: none and demandStrength: high|medium. Run autonomously (max 1 per batch, isNewPage: true). Todd reviews at commit time. See Step 5 for review-queue entry.

Max queue size: MAX_QUEUE. Stop adding once reached.

Plan

Print the full queue (slugs, tiers, estimated pages), then proceed immediately — no approval required.

Step 3 — Enricher readiness check

For each material slug in queue, check data/research/category-data/{cat}-{subcat}-*.json:

missing or older than ENRICHER_TTL days → run zbeam-category-enricher with category={cat}, subcategory={subcat}
Fresh → proceed

Run all missing/stale enrichers upfront (not mid-queue).

Step 4 — Process pages (parallel fan-out)

Launch all queue items concurrently via concurrent.futures.ThreadPoolExecutor(max_workers=MAX_QUEUE). Each slug is fully independent — separate input/output files, no shared mutable state.

Per-slug effort scaling — call get_track(slug, cat, subcat) before any skill:

Track	Condition	Skills invoked	Est. time
`updater`	enricher + editorial + both briefs present	editorial-brief (update) → generation-loop	10–15 min
`editorial_only`	enricher + both briefs, no editorial	editorial-brief → generation-loop	15–20 min
`research_light`	enricher only	research-director → researchers (parallel) → editorial → gen	25–30 min
`full_research`	no enricher	category-enricher → all above	40–50 min

Per-slug pipeline steps:

4a. Researchers — parallel (skip if track = updater or editorial_only) Invoke zbeam-content-researcher + zbeam-differentiation-researcher concurrently. After both complete: confidence floor check — skip slug if dataDensityAssessment.meetsMinimumThreshold: false.

4c. Editorial brief Invoke zbeam-editorial-brief with slug, mode=update. Check readyToWrite gate — skip to human_review if blocked.

4d. Generation loop Invoke zbeam-generation-loop with slug, mode=update. Read data/audit/generation-loop-{slug}-{today}.json — hard error if file missing or malformed.

`finalResult`	Batch result
`PASS` / `CONDITIONAL_PASS`	`status: updated`
`HUMAN_REVIEW_REQUIRED`	`status: human_review`
`MAX_ITERATIONS`	`status: human_review`

4e. Citation builder (PASS/CONDITIONAL_PASS pages only) Invoke zbeam-citation-builder with slug. If output has blockingIssues → override to status: human_review.

4f. Write checkpoint after every slug → data/audit/weekly-batch-checkpoint-{today}.json

Step 5 — Write batch report + review queue

After all pages complete, write:

data/audit/weekly-batch-{today}.json — full results including lastUpdated per slug
data/audit/weekly-batch-{today}.md — human-readable summary
Remove checkpoint file

Review queue entry (data/review-queue/{today}-summary.md):

Updated pages → one BATCH COMMIT entry with all slugs + commit command
New pages → separate NEW PAGE BUILT entry tagged Requires: approval (not just commit)
Human-review-only run → REVIEW NEEDED entry

Step 5b — Prose maintenance pass

Read references/batch-polisher-pass.md for the full candidate-loading, adversarial-directive priority, generation-loop invocation, and directive-marking protocol (max 4 pages per run).

Step 5c — Crosslink orphan pages (existing-anchor only)

After prose maintenance, identify orphan pages (zero internal links) and invoke zbeam-crosslinker with mode=existing-anchor-only for up to 5 per run. Safe for autonomous runs — never writes new prose, logs created-context candidates for human review. Log results under crosslinkPasses in the batch report. Never commit — leave changes for Todd's review.

Step 6 — Handoff note

Weekly batch complete.

  Updated:        [n] pages (content + structure)
  Prose-polished: [n] pages (voice-polisher pass on low-engagement candidates)
  Human review:   [n] pages (see weekly-batch-[date].md)
  Skipped:        [n] pages (confidence floor — need more research)
  Errors:         [n] pages (pipeline contract failures — see report)

Files written to frontmatter/ are ready to review and commit.
No git actions taken — Todd commits and pushes to main.

Rules

Never runs git add, git commit, git push, or any deployment command
Never modifies skill files or threshold configs
Never processes more than MAX_QUEUE pages per run
autoRunnable: false items are always skipped
Slugs updated within 14 days are skipped
If gap-signals.json older than 24 h → abort
Checkpoint written after every slug — context overflow is resumable
If generation-loop does not write its output file → hard error (logged, not silently ignored)
If enricher run fails for a category → all pages in that category → human_review: enricher_insufficient

Scheduling (self-registration)

On every invocation, check whether a scheduled task already exists and create if not:

# Call mcp__scheduled-tasks__list_scheduled_tasks
# If 'zbeam-weekly-batch' not found in name or prompt:
#   Call mcp__scheduled-tasks__create_scheduled_task with:
#     name="zbeam-weekly-batch"
#     prompt="Run the zbeam-weekly-batch skill. Read skills/writing/zbeam-weekly-batch/SKILL.md and execute it."
#     cronExpression="0 6 * * *"  # Daily 6am — trigger gate handles idempotency

Daily cron is safe: the Step 0 trigger gate exits in < 5 seconds if no new data.

Active schedule:

0 7 * * 6  zbeam-cross-page-quality-audit  (Saturday)
0 9 * * 0  zbeam-adversarial-sampling      (Sunday — reads Saturday audit)
0 5 * * 1  zbeam-improvement-loop          (Monday)
0 6 * * 1  zbeam-weekly-batch ← this skill (Monday — reads Sunday directives)

⏸ COMMIT GATE

Never auto-commit. Summarize what changed, provide the git command, wait for Todd to run it.