podcast-gen

Generate podcast-style audio conversations from text, files, or URLs using Podcastfy + MiniMax Speech 2.8 TTS by default.

By clawdwork. Updated 4/28/2026

name: podcast-gen
description: Generate podcast-style audio conversations from text, files, or URLs using Podcastfy + MiniMax Speech 2.8 TTS by default.
user-invocable: false
metadata:
  {
    "openclaw":
      {
        "emoji": "🎙️",
        "requires": { "bins": ["uv"], "env": ["GEMINI_API_KEY", "REPLICATE_API_TOKEN"] },
        "primaryEnv": "GEMINI_API_KEY",
      },
  }

podcast-gen

Generate engaging two-host podcast conversations from any text content using Podcastfy.

How it works

  1. Script generation: An LLM (Gemini Flash by default) reads the input and writes a natural two-person conversational script
  2. Text-to-speech: The script is voiced by two distinct TTS voices
  3. Output: A single MP3 file with the full podcast episode

Quick start

# From text
uv run {baseDir}/scripts/generate_podcast.py --text "Your content here" --filename "./media/generated/drafts/daily-intel.mp3"

# From a markdown file
uv run {baseDir}/scripts/generate_podcast.py -f ./notes/2026-02-09.md --filename "./media/generated/drafts/daily-digest.mp3"

# From multiple files
uv run {baseDir}/scripts/generate_podcast.py -f ./notes/monday.md -f ./notes/tuesday.md --filename "./media/generated/drafts/weekly-recap.mp3"

# From a URL
uv run {baseDir}/scripts/generate_podcast.py -u "https://example.com/article" --filename "./media/generated/drafts/article-pod.mp3"

# Mixed sources
uv run {baseDir}/scripts/generate_podcast.py -f ./report.md -u "https://blog.com/post" --text "Additional context" --filename "./media/generated/drafts/mixed.mp3"

Parameters

Input sources (at least one required)

| Parameter | Description |
| --- | --- |
| --text | Raw text to convert |
| --file, -f | Path to text/markdown/PDF file (repeatable) |
| --url, -u | URL to include as source (repeatable) |
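
The input flags above can be assembled programmatically. The following is a minimal sketch, not part of the skill itself: `build_podcast_command` is a hypothetical helper name, and the only behavior it assumes is what the table states (one `--text`, repeatable `-f` and `-u`, at least one source required).

```python
from typing import Optional, Sequence


def build_podcast_command(
    script_path: str,
    output_path: str,
    text: Optional[str] = None,
    files: Sequence[str] = (),
    urls: Sequence[str] = (),
) -> list[str]:
    """Assemble an argv list for generate_podcast.py (hypothetical helper)."""
    # The table requires at least one input source.
    if not text and not files and not urls:
        raise ValueError("at least one of text, files, or urls is required")

    argv = ["uv", "run", script_path]
    if text:
        argv += ["--text", text]
    for path in files:  # -f is repeatable: one flag per file
        argv += ["-f", path]
    for url in urls:  # -u is repeatable: one flag per URL
        argv += ["-u", url]
    argv += ["--filename", output_path]
    return argv
```

Returning a list rather than a single string means the result can be handed to a process launcher without shell quoting concerns, which matters when `--text` contains spaces or quotes.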

Output

| Parameter | Default | Description |
| --- | --- | --- |
| --filename | ./media/generated/drafts/podcast.mp3 | Output audio path |
| --save-transcript | (none) | Save conversation transcript to path |
| --transcript-only | false | Only generate transcript, skip audio |

Podcast style

| Parameter | Default | Description |
| --- | --- | --- |
| --name | (none) | Podcast name |
| --tagline | (none) | Podcast tagline |
| --language | English | Output language (40+ supported) |
| --style | engaging,informative,conversational | Comma-separated styles |
| --instructions | (none) | Custom focus/topic instructions |
| --creativity | 0.7 | Temperature (0.0–1.0) |
| --ending | "Thanks for listening!" | Closing message |
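
As an illustration of the `--style` and `--creativity` formats, here is a small sketch that joins styles with commas and checks the temperature range before a run. `style_flags` is a hypothetical helper; whether the script itself validates these values is not stated here.

```python
def style_flags(styles: list[str], creativity: float = 0.7) -> list[str]:
    """Return --style and --creativity arguments (hypothetical helper)."""
    # The table gives 0.0 to 1.0 as the temperature range.
    if not 0.0 <= creativity <= 1.0:
        raise ValueError("creativity must be between 0.0 and 1.0")

    flags: list[str] = []
    cleaned = [s.strip() for s in styles if s.strip()]
    if cleaned:
        # --style takes a single comma-separated value.
        flags += ["--style", ",".join(cleaned)]
    # With no styles given, --style is omitted so the script default applies.
    flags += ["--creativity", str(creativity)]
    return flags
```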

TTS provider

| Parameter | Default | Description |
| --- | --- | --- |
| --tts | minimax | TTS provider: minimax (default, studio voices), edge (free fallback), openai, elevenlabs, gemini, geminimulti |
| --voice1 | auto | Voice for host 1 (questioner); minimax default: English_ManWithDeepVoice |
| --voice2 | auto | Voice for host 2 (answerer); minimax default: Wise_Woman |

LLM

| Parameter | Default | Description |
| --- | --- | --- |
| --llm-model | gemini-3-flash-preview | LLM for script generation |
| --gemini-key | GEMINI_API_KEY env | API key override |

Length

| Parameter | Default | Description |
| --- | --- | --- |
| --longform | false | Generate longer podcast (10–30+ min) |
| --max-chunks | 5 (short) / 15 (long) | Max discussion rounds |
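
The `--max-chunks` default depends on `--longform`. A minimal sketch of that rule follows; `resolve_max_chunks` is a hypothetical name, and the script's actual resolution logic may differ.

```python
from typing import Optional


def resolve_max_chunks(longform: bool, max_chunks: Optional[int] = None) -> int:
    """Pick the number of discussion rounds (hypothetical helper)."""
    if max_chunks is not None:
        # An explicit value wins over the mode-dependent default.
        if max_chunks < 1:
            raise ValueError("max_chunks must be at least 1")
        return max_chunks
    # Documented defaults: 15 rounds for longform, 5 for short episodes.
    return 15 if longform else 5
```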

TTS providers comparison

| Provider | Cost | Quality | API Key | Best for |
| --- | --- | --- | --- | --- |
| edge | Free | Good | None | Daily digests, quick summaries |
| openai | ~$15/1M chars | Great | OPENAI_API_KEY | Professional quality |
| elevenlabs | ~$180/1M chars | Excellent | ELEVENLABS_API_KEY | Premium voice customization |
| minimax | ~$0.01/episode | Excellent | REPLICATE_API_TOKEN | Studio-grade, 300+ voices, 40+ languages |
| geminimulti | Varies | Excellent | GEMINI_API_KEY | English, natural multi-speaker |
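
To make the per-character prices concrete, the sketch below estimates TTS cost from script length using the approximate rates in the table. `estimate_tts_cost` is a hypothetical helper. It covers only the providers priced per character; minimax is priced per episode and geminimulti varies, so both are left out.

```python
# Approximate USD per one million characters, taken from the comparison table.
PER_MILLION_CHARS_USD = {"edge": 0.0, "openai": 15.0, "elevenlabs": 180.0}


def estimate_tts_cost(provider: str, script_chars: int) -> float:
    """Rough TTS cost in USD for a script of the given length (hypothetical helper)."""
    if provider not in PER_MILLION_CHARS_USD:
        raise KeyError(f"no per-character rate known for provider: {provider}")
    if script_chars < 0:
        raise ValueError("script_chars must be non-negative")
    return PER_MILLION_CHARS_USD[provider] * script_chars / 1_000_000
```

For a script of 6,000 characters (an assumed length, chosen only for illustration), this gives roughly $0.09 with openai and $1.08 with elevenlabs.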

Examples

Daily intelligence digest (2–5 min, free)

uv run {baseDir}/scripts/generate_podcast.py \
  -f ~/org/shared/memory/2026-02-09.md \
  --tts edge \
  --name "Daily Intel" \
  --tagline "Your morning briefing" \
  --style "concise,professional,informative" \
  --instructions "Summarize key findings and action items" \
  --filename "./media/generated/drafts/2026-02-09-daily-intel.mp3"

Weekly project recap (10+ min, longform)

uv run {baseDir}/scripts/generate_podcast.py \
  -f ~/org/shared/projects/acme/research/weekly-summary.md \
  -f ~/org/shared/memory/2026-02-09.md \
  --name "Project Pulse" \
  --longform \
  --style "analytical,engaging,thorough" \
  --instructions "Focus on progress, blockers, and decisions made this week" \
  --filename "./media/generated/drafts/2026-02-09-weekly-recap.mp3"

MiniMax voices (studio-grade, recommended)

uv run {baseDir}/scripts/generate_podcast.py \
  -f ~/org/shared/memory/2026-02-09.md \
  --tts minimax \
  --voice1 English_ManWithDeepVoice --voice2 Wise_Woman \
  --name "Daily Intel" \
  --filename "./media/generated/drafts/2026-02-09-daily-intel.mp3"

Premium quality with OpenAI voices

uv run {baseDir}/scripts/generate_podcast.py \
  -f ./report.md \
  --tts openai \
  --voice1 echo --voice2 shimmer \
  --filename "./media/generated/drafts/premium-podcast.mp3"

API keys

  • GEMINI_API_KEY: Required for script generation (already configured)
  • REPLICATE_API_TOKEN: Only if using --tts minimax (already configured)
  • OPENAI_API_KEY: Only if using --tts openai
  • ELEVENLABS_API_KEY: Only if using --tts elevenlabs
  • Edge TTS (--tts edge) requires no API key at all
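
The key requirements above suggest a simple pre-flight check: use the preferred provider if its key is present, otherwise fall back to edge, which needs none. This is a sketch under that assumption; `pick_tts_provider` is a hypothetical helper and is not something the script is known to do on its own.

```python
import os
from typing import Mapping, Optional

# Environment variable each provider needs; None means no key is required.
REQUIRED_KEY: dict[str, Optional[str]] = {
    "minimax": "REPLICATE_API_TOKEN",
    "openai": "OPENAI_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
    "geminimulti": "GEMINI_API_KEY",
    "edge": None,
}


def pick_tts_provider(
    preferred: str = "minimax", env: Optional[Mapping[str, str]] = None
) -> str:
    """Return the preferred provider if its key is set, else 'edge' (hypothetical helper)."""
    env = os.environ if env is None else env
    key = REQUIRED_KEY[preferred]  # raises KeyError for an unknown provider
    if key is None or env.get(key):
        return preferred
    return "edge"
```

The returned name can be passed directly as the value of `--tts`.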

Notes

  • Always use ./ relative paths for filenames so OpenClaw can auto-attach via chat.
  • The script prints a MEDIA: line for OpenClaw to auto-attach on supported chat providers.
  • Default TTS is MiniMax Speech 2.8 (studio quality, ~$0.01/episode). Use --tts edge for a free fallback when REPLICATE_API_TOKEN is unavailable.
  • Longform podcasts can take 2–5 minutes to generate depending on content length.
  • Supports 40+ languages for both script and TTS.
  • Do not read the audio back; report the saved path only.
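
For callers that capture the script's output, the saved path can be recovered from the MEDIA: line. The exact format of that line is not documented here, so the sketch below assumes it is `MEDIA:` followed by the path on a single line. `find_media_path` is a hypothetical helper.

```python
from typing import Optional


def find_media_path(output: str) -> Optional[str]:
    """Return the path from the last 'MEDIA:' line, or None (format is assumed)."""
    found: Optional[str] = None
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("MEDIA:"):
            # Assumption: everything after the prefix is the file path.
            found = stripped[len("MEDIA:"):].strip()
    return found or None
```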

Install via CLI

npx skills add https://github.com/clawdwork/openclaw --skill podcast-gen