name: paper-ppt-generate
description: Generate an academic PowerPoint deck from a PDF or LaTeX source inside Paper PPT Agent's Agent-mode workspace. Use this when the task asks Claude Code or Codex to analyze a paper, plan a slide deck, author SVG slides, perform research/QA with subagents where useful, and write artifacts that the backend can preview and export.
Paper PPT Generate
You are working inside a Paper PPT Agent generation workspace. The backend has
prepared agent_task.json, the uploaded paper source, template assets, icon
assets, and output folders. You own paper understanding, research, deck
planning, slide SVG authoring, and ordinary QA. The backend only watches files,
validates SVGs, finalizes them, and exports PPTX.
Required Workflow
- Read agent_task.json.
- Read only the references you need:
  - skills/paper-ppt-generate/references/output-contract.md for required files and SVG rules.
  - skills/paper-ppt-generate/references/slide-authoring.md before writing SVG.
  - skills/paper-ppt-research/SKILL.md when external research is enabled.
  - skills/paper-ppt-deep-research/SKILL.md when deep research is enabled.
- If agent_task.json.presentation.template_id is set, inspect the selected template directory in agent_task.json.paths.selected_template before planning the deck. Read its design_spec.md, page SVG skeletons, and template_context.json; preserve its page chrome, coordinate system, colors, typography, and structural conventions unless the user explicitly asks otherwise.
- Inspect the paper source in sources/. Use commands/scripts as needed. For PDF sources, do not use the Read tool directly on *.pdf: it is a binary file and the tool may falsely report that the PDF is password protected. Use the Python interpreter from agent_task.json.paths.python or PAPER_PPT_PYTHON with a PDF library such as PyMuPDF (fitz) first, and only call the PDF password-protected if doc.needs_pass is true. Before drafting content, read source_assets/paper.md as the complete extracted paper, not merely as a summary, and read source_assets/figures.md / figures.json if present. If they are missing, stale, or source_assets/extraction_error.json exists, run: "<python>" skills/paper-ppt-generate/scripts/extract_paper_assets.py --task agent_task.json --out source_assets using agent_task.json.paths.python or PAPER_PPT_PYTHON. During long work, periodically check agent_feedback/ for user guidance.
- Write manuscript.md as slide-structured content. Slide 1 is the cover. Slide 2 is a mandatory table of contents whose chapter titles exactly match the final chapter plan. Choose chapter count and slide count so every TOC chapter has content slides that develop it.
- Write design_spec.md with deck-level visual rules and per-slide intent.
- Write each slide as soon as it is usable:
  - svg_output/slide_001.svg
  - svg_output/slide_002.svg
  - etc.

  Author each file directly and independently. Do not create or run a general script/program that generates multiple slide SVGs.
- Optionally write speaker notes in notes/slide_001.md, etc.
- Write agent_report.json when finished.
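
The interpreter choice and the extraction step in the workflow above can be sketched as follows. This is a minimal sketch, not the backend's own logic: the helper names are hypothetical, the precedence of paths.python over PAPER_PPT_PYTHON is an assumption, the fallback to the current interpreter is an assumption, and staleness detection is left out because the workflow does not define it.

```python
import os
import sys
from pathlib import Path
from typing import List, Mapping, Optional


def resolve_python(task: Mapping, env: Optional[Mapping] = None) -> str:
    """Pick paths.python first, then PAPER_PPT_PYTHON, then this interpreter."""
    env = os.environ if env is None else env
    configured = (task.get("paths") or {}).get("python")
    # Assumption: falling back to sys.executable is for illustration only.
    return configured or env.get("PAPER_PPT_PYTHON") or sys.executable


def needs_extraction(source_assets: Path) -> bool:
    """True when an extraction error was recorded or paper.md is missing."""
    # Staleness is not checked here; the workflow does not say how to detect it.
    if (source_assets / "extraction_error.json").exists():
        return True
    return not (source_assets / "paper.md").exists()


def extraction_command(task: Mapping, env: Optional[Mapping] = None) -> List[str]:
    """Build the argument list for the extraction script named in the workflow."""
    return [
        resolve_python(task, env),
        "skills/paper-ppt-generate/scripts/extract_paper_assets.py",
        "--task", "agent_task.json",
        "--out", "source_assets",
    ]
```

A caller would load agent_task.json with json.loads, check needs_extraction(Path("source_assets")), and pass the result of extraction_command to subprocess.run with check=True. The script only parses the paper, which stays within what scripts are allowed to do.
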
Agent Behavior
- Do not call Paper PPT Agent's provider model pipeline or request provider API keys.
- Treat agent_task.json.presentation.user_instruction and newer agent_feedback/ guidance as the highest product requirements after hard export validity. The backend needs SVG/PPTX-compatible files, but that is only the technical wrapper. Do not dismiss a user preference by saying the output contract requires SVG.
- If user wording conflicts with the file format, preserve the user's underlying goal and translate it into valid artifacts. For example, requests such as "不用 SVG" ("no SVG"), "只用图像生成" ("use image generation only"), "GPT image", or "更美观" ("more attractive") should become an image-led, polished visual system inside SVG slide wrappers, using local/generated raster assets where available and SVG composition only as the export container.
- Use agent_task.json.presentation.detail_profile only for slide-count planning and source/summary retrieval budgets. It must not determine per-slide density, bullet count, evidence count, typography, or layout.
- Follow agent_task.json.agent_policy.slide_authoring_policy as a hard contract. Write one explicitly named slide SVG at a time. Do not create or run Python, JavaScript, TypeScript, shell, PowerShell, batch, Ruby, or other code that emits multiple slides.
- Do not use loops, slide arrays/registries, shared base_svg/title/card renderers, or template expansion to generate the deck. Shared palette, typography, margins, and recurring chrome are allowed only as documented visual tokens; every slide's complete composition and markup must remain explicit.
- Final slide SVGs must use inline SVG attributes. Do not use <style> blocks or class= attributes; place fill, stroke, font-family, font-size, and related styling directly on the element.
- Scripts may parse the paper, retrieve research, inspect or validate SVGs read-only, render previews, and export PPTX. They may not write or regenerate slide SVGs.
- Follow agent_task.json.agent_policy.layout_policy during generation, not only during review. Design each slide's spatial layout when generating that slide, based on its content, figures, and layout family.
- Content slides should use a shared grid, 24-40px gutters, and normally occupy 65-85% of the content area. Do not leave an empty quadrant, a short bullet list floating beside blank space, or a bottom callout detached from the main layout.
- Visual depth is required: flat pages without elevation look unfinished. Use filter shadows on cards/panels (<filter> with feGaussianBlur + feOffset, flood-opacity 0.12-0.20). Use gradients (linearGradient/radialGradient) for backgrounds and overlays. Use accent top-bars (4px colored rect) or left-borders on cards. Add subtle decorative elements: rotated small shapes as corner accents, gradient divider lines, brand-color orbs/circles.
- Card design: every card/panel should have a shadow, an accent element (top-bar or left-border), and proper inner padding. Do not use plain flat rectangles with just a stroke border.
- Visual hierarchy: use large bold numbers (36-48px) with small gray labels for KPI/metrics. Use color-coded status (green=positive, red=negative, yellow=warning). Use accent callout boxes with light tinted backgrounds (fill-opacity="0.08") and colored left borders for key takeaways.
- Color rules: 60-30-10 rule (primary 60%, secondary 30%, accent 10%). Maximum 4 colors per page. Use monochromatic opacity variations for chart series, not rainbow colors.
- Before drawing a slide, plan the layout regions explicitly. Budget heading height, line-height, caption space, and padding. Reflow only when clipping or overlap would occur.
- Size table columns from the longest visible method/label/cell before placing any row text. A long method name must never collide with the first numeric column.
- Wrap Chinese body text dynamically from each region's actual width, font size, visual share, and current-page density. Avoid conservative early breaks, use semantic boundaries, and allow light raggedness when it improves composition.
- When a paper figure supports a slide, reserve the figure frame first and preserve the figure aspect ratio inside that frame. Balance the adjacent text column with grouped callouts, captions, and a clear takeaway instead of treating evidence as pasted snippets.
- Follow agent_task.json.agent_policy.factual_rendering_policy. Keep source numbers and units exact; preserve the paper's original notation and precision. Do not invent or round chart ticks, metrics, sample sizes, dates, or model settings. Label derived values and show their inputs/calculation nearby.
- Use paper-ppt-deep-research for deep work. When allow_deep_research is true, run that skill and produce research/deep/notes_index.json and research/deep/brief.md before authoring any deck output. Launch focused research SubAgents with the Task/SubAgent tool if it is available; skip only if the runtime has no such tool or it fails, and record the concrete reason in agent_report.json.subagents. When Agent review is enabled, start one focused reviewer after the first full deck draft; review the whole deck for layout overflow, missing assets, icon-policy violations, and narrative gaps instead of checking every page one by one. The main Agent remains responsible for the final story, visual consistency, and artifact integration.
- Generate slides in parallel when useful; keep consistency through design_spec.md.
- Before authoring a chapter slide, read at most the two most recent earlier chapter-slide SVGs. Before authoring a content slide, read at most the two most recent earlier content-slide SVGs and skip structural pages. Use those SVGs only for visual continuity; the current manuscript remains authoritative.
- Icons are optional visual aids. Add them only when they clarify the content (e.g., process steps, comparison dimensions, legend keys). Do not add icons as filler decoration.
- Use filesystem commands against agent_task.json.paths.icons to find real local .svg files. Verify the file exists before referencing it. Never invent icon names, use remote URLs, or use text glyphs (letters, emoji, Unicode symbols, arrows, stars) as icon substitutes.
- Define a small icon vocabulary in design_spec.md before drafting slides. Most content slides should use 0-3 icons. Color icons from the deck palette.
- Flow connectors must be real SVG geometry (line, path, polyline), not text glyph arrows.
- Use external research only when agent_task.json allows it. Run paper-ppt-research after reading the paper; choose search queries yourself from paper understanding, then produce research/raw_external_results.json, research/sources.json, and research/brief.md before authoring any deck output. Read research/external_search_summary.json before sampling raw search output, and treat documented sparse/no relevant results as a valid completion state.
- Treat agent_task.json.agent_policy.research_gate as mandatory. When enabled, the backend will reject writes to manuscript, design, notes, report, and slide SVG files until the required research artifacts exist.
- If deep research is enabled, run paper-ppt-deep-research, split work across focused readers/SubAgents where supported, and merge the synthesis into the manuscript.
- If Agent review is enabled, perform a whole-deck review after the first complete draft. Use multimodal inspection only when the runtime supports it; otherwise perform XML/static review and record the limitation.
- For any Python command, prefer agent_task.json.paths.python or PAPER_PPT_PYTHON over system python; this keeps PDF parsing, SVG checks, and export helpers in the backend environment where required packages are installed.
- Treat source_assets/figures.json as the paper-figure contract. It gives each figure's id, exact SVG href, caption, PDF page/bbox when available, surrounding context, and natural dimensions. Select figures by caption/page/context, not by filename alone.
- Use paper figures in content slides whenever they directly support the slide argument. In SVG, reference the exact manifest href such as ../source_assets/images/pdf_fig_001_p3_abcd.png; preserve the listed aspect ratio.
- For TeX archives, use the source graphics discovered from \includegraphics; do not screenshot every archive image. PDF graphics referenced by TeX may already be rendered into source_assets/images/.
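
Reserving a figure frame and preserving the figure's aspect ratio inside it comes down to a small calculation, sketched below. The Box type and fit_figure name are hypothetical, and centering the figure inside the frame is an assumed design choice. The helper only computes numbers from the natural dimensions listed in the figure manifest; the resulting x, y, width, and height are still typed by hand into that slide's own image element.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """A rectangle in slide coordinates (hypothetical helper type)."""
    x: float
    y: float
    width: float
    height: float


def fit_figure(natural_w: float, natural_h: float, frame: Box) -> Box:
    """Scale a figure uniformly to fit inside frame, then center it."""
    if natural_w <= 0 or natural_h <= 0:
        raise ValueError("natural dimensions must be positive")
    # One scale factor for both axes keeps the aspect ratio intact.
    scale = min(frame.width / natural_w, frame.height / natural_h)
    width = natural_w * scale
    height = natural_h * scale
    return Box(
        x=frame.x + (frame.width - width) / 2,
        y=frame.y + (frame.height - height) / 2,
        width=width,
        height=height,
    )
```

For instance, a 1600x900 figure placed in a 640x480 frame at (80, 160) is limited by width, so it is drawn at 640x360 and shifted 60px down to sit in the middle of the frame.
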
Live Preview
The backend watches svg_output/. Save each completed or repaired slide SVG
there immediately. Rewriting an existing slide_###.svg updates the app's live
preview.
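
A read-only check of what the preview folder currently holds could look like the sketch below. It assumes the three-digit slide_###.svg naming shown in this document and numbering that starts at 1; the function names are hypothetical. It only lists files and never writes to svg_output/.

```python
import re
from pathlib import Path
from typing import List

# Matches slide_001.svg, slide_002.svg, ... (three-digit padding assumed).
SLIDE_NAME = re.compile(r"^slide_(\d{3})\.svg$")


def slide_numbers(svg_dir: Path) -> List[int]:
    """Return the sorted slide numbers found in svg_dir."""
    numbers = []
    for path in svg_dir.iterdir():
        match = SLIDE_NAME.match(path.name)
        if match:
            numbers.append(int(match.group(1)))
    return sorted(numbers)


def missing_slides(numbers: List[int]) -> List[int]:
    """Return the gaps between slide 1 and the highest slide present."""
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(1, max(numbers) + 1) if n not in present]
```

Calling missing_slides(slide_numbers(Path("svg_output"))) during long work shows which slides still need to be authored or repaired.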
Quality Bar
- Every SVG must parse as XML and define a correct canvas viewBox.
- Do not use <style>, class=, mask, clipPath, foreignObject, scripts, event handlers, or remote URLs in final SVGs.
- Keep all needed assets local or embedded by relative path.
- Avoid external image URLs in final SVGs.
- Audit the final SVGs for icon-policy violations: icons are purposeful rather than decorative, the icon vocabulary is consistent, colors follow the deck palette, and there are no fake icon letters/symbols/emoji, made-up icon files, or missing local icon references.
- Audit for text glyph symbols used visually, including exclamation marks, stars, checkmarks, and arrow characters. Replace them with real local icon SVGs or SVG geometry before finishing.
- Preserve template chrome when a template is provided.
- Prefer clear academic storytelling over dense paper transcription.
- Audit layout: major elements should align to a consistent grid, content slides should have balanced occupancy, paper figures should not be stretched, and Chinese line breaks should not create repeated short orphan lines or large dead zones.
- Treat out-of-bounds text, text outside a card, table-column collision, and a last baseline below its assigned region as generation failures that must be repaired before completion.
- Audit authorship: no generated program may have written multiple slide SVGs, and agent_report.json.slide_authoring.direct_files must list every final slide.
- End only when all slides are present and agent_report.json explains what was produced, what research was used, what icon assets were used or why icons were omitted, which SubAgent tasks were used/skipped, and whether QA passed.
- Include paper_figures_used in agent_report.json with figure ids/hrefs/captions used in slides, or explain why no extracted paper figure was suitable.
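
Several of the structural checks above can be run as a read-only audit. The sketch below is one possible shape for it, not the backend's validator: audit_svg and local_name are hypothetical names, the viewBox check only confirms that the attribute is present, and treating every attribute that starts with "on" as an event handler is a heuristic. It does not cover layout, icon policy, or text glyph symbols.

```python
import xml.etree.ElementTree as ET
from typing import List

FORBIDDEN_TAGS = {"style", "mask", "clipPath", "foreignObject", "script"}
REMOTE_PREFIXES = ("http://", "https://", "//")


def local_name(qualified: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on tags and attributes."""
    return qualified.rsplit("}", 1)[-1]


def audit_svg(svg_text: str) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if local_name(root.tag) != "svg":
        problems.append("root element is not <svg>")
    if not root.get("viewBox"):
        problems.append("root element has no viewBox")

    for element in root.iter():
        tag = local_name(element.tag)
        if tag in FORBIDDEN_TAGS:
            problems.append(f"forbidden element <{tag}>")
        for name, value in element.attrib.items():
            attr = local_name(name)
            if attr == "class":
                problems.append(f"class attribute on <{tag}>")
            elif attr.startswith("on"):
                # Heuristic: onclick, onload, and similar handler attributes.
                problems.append(f"event handler {attr} on <{tag}>")
            elif attr == "href" and value.strip().lower().startswith(REMOTE_PREFIXES):
                problems.append(f"remote URL on <{tag}>: {value}")
    return problems
```

Reading each file with Path.read_text and passing the text to audit_svg keeps the check strictly read-only; any reported problem is then repaired by editing that slide's SVG directly.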