name: source-manifest
description: |
Build or validate the source manifest for the source-tutorial pipeline from user-provided URLs/files.
Trigger: source manifest, sources list, tutorial sources, url list, 资料清单, 教程来源.
Use when: source-tutorial 的 C1,需要把多源输入落成统一的 sources/manifest.yml,并在内容不完整时显式阻塞。
Skip if: 已经有完整且经过确认的 sources/manifest.yml。
Network: none.
Guardrail: 不要伪造来源;manifest 不完整时应返回 BLOCKED,而不是假装完成。
Source Manifest
Goal: collect the source candidates for source-tutorial into one explicit manifest before ingest starts.
Inputs
GOAL.mdDECISIONS.md
Outputs
sources/manifest.yml
Workflow
- Read
GOAL.mdand any existingDECISIONS.mdnotes to understand what kinds of sources the tutorial is expected to use. - If
sources/manifest.ymldoes not exist, scaffold it with a minimal example. - Validate that each source has:
source_idkindlocatorlabel- for
kind: video, prefertranscript_locatorunless the platform exposes public subtitles
- Reject placeholder/example-only manifests.
- Return success only when at least one real source exists.
Accepted kind
webpagepdfmarkdownrepodocs_sitevideo
For video:
- preferred: add
transcript_locator - Bilibili may succeed from public subtitles when available
- YouTube should currently be paired with
transcript_locator
Script
Quick Start
python .codex/skills/source-manifest/scripts/run.py --workspace <ws>
All Options
--workspace <dir>(required)--unit-id <U###>--inputs <semicolon-separated>--outputs <semicolon-separated>--checkpoint <C#>
Examples
- Scaffold or validate the manifest:
python .codex/skills/source-manifest/scripts/run.py --workspace <ws>
Troubleshooting
Issue: the unit keeps blocking after scaffold
Cause:
- The manifest still contains example placeholders.
Fix:
- Replace the scaffold entry with real URLs or local file paths, then rerun the unit.