name: sn-ppt-entry
description: |
  Entry point for PPT generation. Asks the user to choose a mode (fast, standard, or creative), then collects role / audience / scene / page_count as needed. For standard mode, also asks how images should be sourced (AI generation, web search, or none) and whether charts should use AI-generated infographics or ECharts. Parses uploaded pdf/docx/md/txt files, produces task_pack.json + info_pack.json in a new deck_dir, then dispatches to sn-ppt-creative or sn-ppt-standard. Fast mode skips optional questions and gets straight to building. Use when the user asks to make a PPT / presentation / 演示 / PPT.
metadata:
  project: SenseNova-Skills
  tier: 1
  category: scene
  user_visible: true
triggers:
  - "生成 PPT"
  - "做一套 PPT"
  - "做一份演示"
  - "sn-ppt-entry"
sn-ppt-entry
Hard preconditions
Run sn-ppt-doctor hard checks (SN_API_KEY or capability-specific API keys / node / sn-image-base) at the start of this skill. If any fails, stop and tell the user to run /skill sn-ppt-doctor.
Flow
Extract parameters from the user's message:
- `role` (speaker identity)
- `audience`
- `scene` (where the deck will be used)
- `page_count`
- `language`: detect from the user's query: `zh-Hans` (Simplified Chinese), `zh-Hant` (Traditional Chinese), or `en` (English). Do NOT ask the user; just infer and record it. If unsure, use `zh-Hans`.
If `task_pack.json` + `info_pack.json` already exist in a deck_dir the user refers to, read them, skip parameter collection, and go straight to dispatch (see "ask_user boundary conditions" below).

Always ask the user which mode to use first. Call `ask_user`:

Question (Mode): "Which generation mode should I use?"
- "Fast mode — build the slides now so you can review and iterate"
- "Standard mode — plan the style and content thoroughly first, then build"
- "Creative mode — full-page AI-generated images per slide"
Store as `ppt_mode` in `task_pack`.

Only ask image-related questions for standard mode. Fast mode and creative mode have fixed defaults; asking extra questions defeats the purpose of "fast".

If `ppt_mode == "standard"`, ask two more questions:

Question (Normal images, decorative / conceptual): "Should I include images, and how should they be sourced?"
- "AI generation — create images from scratch"
- "Web search — pull real photos from the web (requires Serper API key)"
- "No images — use text, charts, and CSS visuals only"
If the user picks web search and `SERPER_API_KEY` is not set, tell them how to get a free key at https://serper.dev. Store as `image_source` in `task_pack.params`.

Question (Infographics: charts, flowcharts, diagrams): "For charts and diagrams, should I use AI-generated infographics or ECharts?"
- "AI-generated infographics — U1 creates custom diagram images"
- "ECharts — rendered as interactive charts in the HTML"
Store as `infographic_source` in `task_pack.params` (`"ai-gen"` or `"echarts"`).

If `ppt_mode == "fast"`: skip image questions. Default to `image_source = "ai-gen"` and `infographic_source = "echarts"`. Also skip the role / audience / scene / page_count questions: infer reasonable defaults from the user's query and move directly to building slides. If the user didn't explicitly state a value, make your best guess and proceed.

If `ppt_mode == "creative"`: skip image questions. Default to `image_source = "ai-gen"` (full-page T2I rendering). Infographics are not applicable. Do not ask for role / audience / scene / page_count; use whatever the user stated explicitly and infer the rest.

Collect `role -> audience -> scene -> page_count` for standard mode only. Use the wording in `references/ask_user_templates.md`. Offer 2-3 options per question; do not write "其他" ("Other"). For fast/creative modes, infer from the query and move on.

Create deck_dir. The location is FIXED; do not guess:
- Parent: always `$(pwd)/ppt_decks/`. In OpenClaw, cwd at skill-invocation time is the agent's workspace directory (e.g. `~/.openclaw/workspace/`). Do NOT use `/tmp`, the home directory, the repo root, or `$SKILL_DIR` as the parent. Do NOT honor `$PPT_DECK_ROOT` either; it has been removed to avoid drift.
- Parent directory must be created if missing: `mkdir -p $(pwd)/ppt_decks`.
- Deck name: `<topic_concise>_<YYYYMMDD_HHMMSS>`.
- Full deck_dir path: `$(pwd)/ppt_decks/<topic_concise>_<YYYYMMDD_HHMMSS>/`.
- Immediately resolve to absolute (`realpath` / `Path.resolve()`) before writing it into `task_pack.json`; downstream must see an absolute path.
- Create subdirs: `pages/` always; `images/` if `ppt_mode in {standard, fast}`.
- If `$(pwd)/ppt_decks/` cannot be created (permission denied) → abort, tell the user to check workspace permissions.
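The deck_dir rules above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name `create_deck_dir` and its `now` / `cwd` parameters are hypothetical and exist only to make the sketch testable.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def create_deck_dir(topic_concise: str, ppt_mode: str,
                    now: Optional[datetime] = None,
                    cwd: Optional[Path] = None) -> Path:
    """Create <cwd>/ppt_decks/<topic>_<YYYYMMDD_HHMMSS>/ and return it as an absolute path."""
    # Hypothetical helper: the parent is always <cwd>/ppt_decks, never /tmp or the home dir.
    parent = (cwd or Path.cwd()) / "ppt_decks"
    try:
        parent.mkdir(parents=True, exist_ok=True)
    except PermissionError as exc:
        # Per the rules above: abort and point the user at workspace permissions.
        raise RuntimeError(f"cannot create {parent}; check workspace permissions") from exc

    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    # Resolve before the path is written anywhere, so downstream sees an absolute path.
    deck_dir = (parent / f"{topic_concise}_{stamp}").resolve()
    (deck_dir / "pages").mkdir(parents=True, exist_ok=True)  # always created
    if ppt_mode in {"standard", "fast"}:
        (deck_dir / "images").mkdir(exist_ok=True)  # standard / fast only
    return deck_dir
```

Passing `now` and `cwd` explicitly keeps the function deterministic in tests; in normal use both would be left at their defaults.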
If user attached reference_docs (pdf/docx/md/txt):
- Run `$SKILL_DIR/scripts/parse_user_docs.py --files <paths...> --output <deck_dir>/raw_documents.json`. The `--output` flag tells the script to write the JSON itself (recommended; works reliably even on agents that don't handle shell redirection well). The script prints a single-line JSON status `{"status":"ok","output":"...","documents":N,"errors":M}` to stdout when `--output` is used.
- Call the LLM with `$SKILL_DIR/prompts/document_digest.md` as system prompt + (user_query + concatenated document text) as user prompt. See "Invoking the LLM for document_digest" below.
- On success: write the `document_digest` JSON into `info_pack.document_digest`.
- On failure: degrade. Set `info_pack.document_digest = null` and continue (do NOT abort entry).
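As a rough sketch, the single-line status printed by `parse_user_docs.py --output` could be read like this. The helper name `read_parse_status` is hypothetical, and two things are assumptions rather than documented behavior: that the status is the last non-empty stdout line, and that any `status` value other than `"ok"` should be treated as a failure.

```python
import json


def read_parse_status(stdout: str) -> dict:
    """Parse the single-line JSON status printed by parse_user_docs.py --output."""
    lines = [ln for ln in stdout.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("no status line on stdout")
    # Assumption: the status object is the last non-empty line of stdout.
    status = json.loads(lines[-1])
    # Assumption: only "ok" is documented, so anything else is treated as a failure.
    if status.get("status") != "ok":
        raise ValueError(f"unexpected status: {status!r}")
    return status
```

A non-zero `errors` count with `status == "ok"` would correspond to per-file failures, which the flow records and continues past rather than aborting.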
Write `task_pack.json` + `info_pack.json` to deck_dir (see "Schemas" below). All path-bearing fields must be absolute.

Caption every image once with VLM (mandatory, idempotent; runs after `info_pack.json` is written so both pools are visible):

`python3 $SKILL_DIR/scripts/caption_images.py --deck-dir <deck_dir>`

This script is the single source of truth for image-content descriptions:

- Pool A, doc-embedded images (`raw_documents.json` `documents[*].inherited_images[*]`): caption written into the same JSON as `vlm_caption`.
- Pool B, standalone uploads (`info_pack.user_assets.reference_images`): caption written into a sister field `info_pack.user_assets.reference_image_captions: {abs_path: caption}`.
- Already-captioned images are skipped silently, so re-running is cheap and safe. Only newly added images incur a VLM call.
- Failures don't abort: the script reports them in the JSON status; downstream stages fall back to filename / alt / digest hint when a caption is missing.

Downstream (sn-ppt-standard `cmd_page_html`) reads these cached captions and never re-captions; that's the "single source of truth" rule. If you change image files in a deck, delete their `vlm_caption` (or `reference_image_captions[path]`) entry and re-run this script to refresh.
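The idempotent, non-aborting behavior described for the captioning step can be illustrated for Pool B with a short sketch. This is not the code of `caption_images.py`: `caption_pool_b` and the `caption_fn` callback (standing in for the VLM call) are hypothetical names.

```python
from typing import Callable, Dict, List


def caption_pool_b(user_assets: dict, caption_fn: Callable[[str], str]) -> List[str]:
    """Caption uncaptioned reference images in place; return the paths that failed."""
    captions: Dict[str, str] = user_assets.setdefault("reference_image_captions", {})
    failed: List[str] = []
    for path in user_assets.get("reference_images", []):
        if captions.get(path):
            continue  # already captioned: skip silently, no VLM call
        try:
            captions[path] = caption_fn(path)  # hypothetical VLM wrapper
        except Exception:
            failed.append(path)  # report the failure, do not abort
    return failed
```

Deleting an entry from `reference_image_captions` and calling the function again re-captions only that image, which mirrors the refresh procedure described above.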
Dispatch to `sn-ppt-creative` or `sn-ppt-standard` based on `task_pack.ppt_mode`.
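The mode-dependent defaults described in the flow above can be summarized in a small sketch. The function name `resolve_params` is hypothetical. Dropping `infographic_source` for creative mode is an assumption: the flow only says infographics are "not applicable" and does not say whether the key is omitted or set to null.

```python
def resolve_params(ppt_mode: str, answers: dict) -> dict:
    """Apply the per-mode image defaults from the flow to the collected answers."""
    params = dict(answers)
    if ppt_mode == "fast":
        # Fast mode: fixed defaults, no image questions.
        params["image_source"] = "ai-gen"
        params["infographic_source"] = "echarts"
    elif ppt_mode == "creative":
        # Creative mode: full-page T2I rendering; infographics not applicable.
        params["image_source"] = "ai-gen"
        params.pop("infographic_source", None)  # assumption: key is omitted
    elif ppt_mode == "standard":
        # Standard mode: both values must come from the user's answers.
        missing = [k for k in ("image_source", "infographic_source") if k not in params]
        if missing:
            raise ValueError(f"standard mode still needs answers for: {missing}")
    else:
        raise ValueError(f"unknown ppt_mode: {ppt_mode!r}")
    return params
```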
ask_user boundary conditions
- User answers multiple params in one turn -> extract all with a single `sn-text-optimize` call; skip params that were already answered.
- User's answer isn't in the 2-3 options -> record verbatim; don't force into the enumeration.
- Session interrupted before task_pack.json written -> discard temp params; next entry starts over.
- task_pack.json already exists -> skip param collection, go straight to dispatch.
Invoking the LLM for document_digest
parse_user_docs.py --output <deck_dir>/raw_documents.json already creates the file. Then call the LLM with a user prompt that gives only counts + indices of tables/images (not row contents) so the LLM can't accidentally paraphrase numbers:
python3 -c "
import sys, json, pathlib
sys.path.insert(0, '$PPT_STANDARD_DIR/lib')
from model_client import llm

deck_dir = pathlib.Path('<deck_dir>')
raw = json.loads((deck_dir / 'raw_documents.json').read_text(encoding='utf-8'))

# Build the digest-safe view: strip tables[] and image paths, keep text, doc index, and counts
docs_view = []
for d in raw.get('documents', []):
    docs_view.append({
        'doc_index': d['doc_index'],
        'type': d['type'],
        'text': d.get('text', ''),
        'tables_count': len(d.get('tables') or []),
        'images_count': len(d.get('inherited_images') or []),
    })

user_prompt = json.dumps({
    'user_query': '<user_query>',  # placeholder for the original query; escape any single quotes
    'documents': docs_view,
}, ensure_ascii=False)
sys_prompt = pathlib.Path('$SKILL_DIR/prompts/document_digest.md').read_text(encoding='utf-8')

# Call the LLM and parse its JSON; on any failure, degrade digest to null (do not abort entry)
try:
    digest = json.loads(llm(sys_prompt, user_prompt))
except Exception:
    digest = None
(deck_dir / 'digest_tmp.json').write_text(json.dumps(digest, ensure_ascii=False), encoding='utf-8')
"
The digest JSON then merges into info_pack.document_digest. Downstream stages (outline, page_html) read both info_pack.document_digest (structured summary + inherited_tables/images index lists) AND raw_documents.json (actual table rows + image paths).
Substitute $PPT_STANDARD_DIR with the sn-ppt-standard skill install dir.
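The merge step mentioned above, where the digest ends up in `info_pack.document_digest`, might look like the following. This is only a sketch: `attach_digest` is a hypothetical helper, and it assumes the digest was written to `digest_tmp.json` in the deck_dir as in the snippet above.

```python
import json
from pathlib import Path


def attach_digest(info_pack: dict, deck_dir: Path) -> dict:
    """Set info_pack['document_digest'] from digest_tmp.json, or None on any problem."""
    try:
        digest = json.loads((deck_dir / "digest_tmp.json").read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file or unparseable JSON: degrade to null, never abort entry.
        digest = None
    # Anything that is not a JSON object (including a literal null) degrades to None.
    info_pack["document_digest"] = digest if isinstance(digest, dict) else None
    return info_pack
```

Working on the in-memory `info_pack` dict avoids depending on whether `info_pack.json` has already been written at this point in the flow.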
Schemas
task_pack.json:
{
  "deck_id": "AI产品发布会_20260318_154500",
  "deck_dir": "/abs/path/ppt_decks/AI产品发布会_20260318_154500",
  "ppt_mode": "standard",
  "params": {
    "role": "...",
    "audience": "...",
    "scene": "...",
    "page_count": 10,
    "language": "zh-Hans",
    "image_source": "ai-gen",
    "infographic_source": "ai-gen"
  },
  "created_at": "2026-03-18T15:45:00+08:00",
  "skill_version": "0.1.0"
}
info_pack.json:
{
"user_query": "...",
"user_assets": {
"reference_images": ["/abs/..."],
"reference_docs": ["/abs/..."],
"reference_docs_failed": []
},
"document_digest": {
"topic_summary": "...",
"key_sections": [],
"key_points": [],
"data_highlights": [],
"inherited_tables": [{"doc_index": 0, "table_index": 2, "title_hint": "..."}],
"inherited_images": [{"doc_index": 0, "image_index": 0, "caption_hint": "..."}]
},
"raw_document_excerpts": {
"enabled": true,
"path": "/abs/.../raw_documents.json"
}
}
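Before dispatch, a quick sanity check of `task_pack` against the rules stated in this document (mode names, absolute `deck_dir`, the language codes listed in the flow) could be sketched as below. The function `check_task_pack` is hypothetical and is not part of the skill; it also assumes POSIX-style paths.

```python
from pathlib import PurePosixPath
from typing import List

VALID_MODES = {"fast", "standard", "creative"}
VALID_LANGUAGES = {"zh-Hans", "zh-Hant", "en"}


def check_task_pack(task_pack: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means the pack looks sane."""
    problems: List[str] = []
    if task_pack.get("ppt_mode") not in VALID_MODES:
        problems.append("ppt_mode must be fast, standard, or creative")
    # Assumption: POSIX paths, as in the schema examples.
    if not PurePosixPath(str(task_pack.get("deck_dir", ""))).is_absolute():
        problems.append("deck_dir must be an absolute path")
    if task_pack.get("params", {}).get("language") not in VALID_LANGUAGES:
        problems.append("params.language must be zh-Hans, zh-Hant, or en")
    return problems
```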
🚫 Hard rules
- Do NOT use python-pptx, pptxgenjs, or any alternative PPTX builder. PPTX is produced by the downstream mode skills (sn-ppt-standard / sn-ppt-creative) through their designated scripts. Never `pip install python-pptx` or write Node scripts that import `pptxgenjs`.
- Wait for `ask_user` responses. When you ask the user a question, do NOT proceed until they reply, and never substitute an assumed or default value for a pending answer. (Fast and creative modes skip their optional questions up front, so the defaults they infer do not fall under this rule.)
- Validate paths before writing. Always run `ls` or `pwd` to verify the current working directory before creating files. The only valid output location is `$(pwd)/ppt_decks/<deck_dir>/`. Never write to `/workspace/`, `/tmp/`, `~/`, or any hallucinated path. If a path doesn't start with the verified `$(pwd)`, it's wrong.
Failure handling
- Missing required env var -> stop, tell user to run `/skill sn-ppt-doctor`.
- `$(pwd)/ppt_decks/` not creatable / not writable -> stop, tell user to check workspace permissions.
- Per-file doc parse failure -> record in `reference_docs_failed`, continue.
- `document_digest` LLM failure -> set to null, continue.
Progress echo — MANDATORY
Emit a short chat reply at each boundary. Silence between ask_user rounds and mode dispatch is a bug.
| When | Example |
|---|---|
| Right after entering sn-ppt-entry | 已进入 sn-ppt-entry,开始收集参数... |
| Missing a param | 缺少参数:<role>,马上问你 (then ask_user) |
| All params collected | 参数齐备:mode=standard, image_source=ai-gen, role=...。开始创建 deck_dir... |
| Before doc parse | 检测到 2 个附件,开始解析... |
| After doc parse | 解析完成:sample.pdf (12 页) / sample.docx (45 段) |
| Before digest | [LLM] 正在汇总文档要点... |
| After digest | 文档摘要已入 info_pack.json |
| task_pack / info_pack written | task_pack.json / info_pack.json 已写入 <deck_dir> |
| Dispatching | 分发到 sn-ppt-creative(deck_dir=...) |
Output and handoff
Final message includes a short summary:
准备就绪:
- 模式: <creative | standard>
- 页数: <n>
- deck_dir: <abs path>
即将进入<创意 | 标准>模式...
Then dispatch:
- ppt_mode=creative -> invoke `/skill sn-ppt-creative deck_dir=<abs>`
- ppt_mode=standard -> invoke `/skill sn-ppt-standard deck_dir=<abs>`
Does NOT
- Do not generate any style / outline / page content (that's the mode skill's job).
- Do not run any image generation.