name: podcast-publishing-assistant description: Transcribe long-form audio, YouTube videos, podcasts, interviews, or panels; summarize them; extract chapter markers; and draft publishing assets like titles, YouTube descriptions, show notes, and tweet/X copy. Use when a user shares a YouTube link or audio/video file and asks for transcription, diarization, summaries with timestamps, chapter markers, show notes, titles, descriptions, promo copy, or social posts. Especially use for Whisper-based workflows, source-audio-first podcast processing, and any request to turn a recorded conversation into publishable assets.
Podcast Publishing Assistant
Use this skill for podcast or interview publishing workflows.
Core rules
- Prefer source audio over platform captions.
- Do not rely on YouTube captions unless the user explicitly asks for the fastest possible rough pass.
- For Whisper or API transcripts, be honest about uncertainty. Do not claim true diarization unless the tool actually produced it.
- If the user has a formatting preference, preserve it exactly on later turns.
- Default chapter output for swyx: YouTube timestamp format only, start times only, followed by a 2 to 10 word headline.
Workflow
Fetch source audio first.
- For YouTube URLs, use
yt-dlpto extract audio. - Prefer MP3 output for compatibility.
- For YouTube URLs, use
Transcribe with a fallback ladder.
- First try the OpenAI Whisper API when
OPENAI_API_KEYis available. - The OpenAI audio transcription upload cap is about 25 MB. Expect long episodes to exceed it.
- If the file is too large, compress before retrying, for example mono 16 kHz MP3 around 32 kbps.
- If API transcription still fails, fall back to local Whisper.
- If a script exists but is not executable, invoke it with
bashrather than failing on permissions.
- First try the OpenAI Whisper API when
Keep the user updated during long jobs.
- Brief progress notes are enough: downloading, compressing, transcribing, packaging.
- If a fallback happens, say what failed in one line and continue.
Produce the publishing pack.
- Summary tuned for someone deciding whether to watch or listen.
- Key takeaways or key arguments.
- 5 title ideas when asked.
- 5 description ideas when asked.
- Chapter timestamps.
- Optional: tweet thread, show notes, key quotes, newsletter blurb.
Package timestamps correctly.
- Prefer real timestamps if the transcription source provides segments.
- If the transcript is a single plain-text blob, estimate coarse chapter times conservatively and label them as approximate if needed.
- For swyx, default to this exact style:
0:00Intro and origin story3:30How the brand grew
Transcript quality notes
- OpenAI Whisper API often returns plain text unless a segment-rich response format is requested elsewhere.
- Plain text transcripts are good for summaries and rough chapters, but weak for precise diarization.
- When speaker labels are needed without true diarization, reconstruct heuristically from turn-taking and identify this clearly as approximate.
Launch template
When a user wants publishing help, default to a reusable launch pack.
Standard launch pack
YouTube titles
- Provide 3 to 4 options.
- Prefer thesis first, guest second.
- Keep them specific, not generic.
YouTube descriptions
- Provide 3 to 4 options.
- Each should include:
- one short framing paragraph
- 3 to 6 concrete bullets on what the audience will learn
- chapter markers in YouTube timestamp format when available
X main post
- Provide 3 to 4 options.
- Structure:
- hook or thesis
- guest credibility line
- 3 to 6 concrete bullets
- link placeholder or actual link
X follow-up reply
- Provide 1 concise timestamps-only reply unless the user asks for more.
- Use YouTube timestamp format only, start times only, 2 to 10 word headlines.
Native clip ideas
- Provide 3 to 4 options.
- For each clip idea, include:
- clip theme
- why it works as promo
- rough timestamp window if known
Quote card ideas
- Provide 3 to 4 options.
- Prefer clean, arguable lines with strong standalone meaning.
Packaging guidance
- Distinguish between:
- what the episode is about
- why it matters now
- Prefer concrete bullets over abstract topic labels.
- For X, optimize for curiosity plus utility, not just announcement tone.
- For YouTube, optimize for searchable thesis plus guest credibility.
- If the audience is technical, specificity beats hype.
Output format
Default output order:
- Short summary
- Key takeaways
- Chapter timestamps
- 3 to 4 title options
- 3 to 4 description options
- 3 to 4 X post options
- 3 to 4 clip ideas
- 3 to 4 quote card ideas
Scripts
scripts/transcribe_youtube.sh: download audio from YouTube, shrink oversized audio for Whisper API, and fall back cleanly.
Notes
- Prefer speed first, but do not fake certainty.
- If a transcript path is broken, say what failed and continue with the next fallback.
- For long episodes, give the user a progress update before the final packaging pass.
- If the user says a format like “always give me YouTube timestamp format”, treat that as a sticky preference for the rest of the task.