name: podcast-gen
description: Generate podcast-style audio conversations from text, files, or URLs using Podcastfy + MiniMax Speech 2.8 TTS by default.
user-invocable: false
metadata:
{
"openclaw":
{
"emoji": "🎙️",
"requires": { "bins": ["uv"], "env": ["GEMINI_API_KEY", "REPLICATE_API_TOKEN"] },
"primaryEnv": "GEMINI_API_KEY",
},
}
podcast-gen
Generate engaging two-host podcast conversations from any text content using Podcastfy.
How it works
- Script generation: An LLM (Gemini Flash by default) reads the input and writes a natural two-person conversational script
- Text-to-speech: The script is voiced by two distinct TTS voices
- Output: A single MP3 file with the full podcast episode
Quick start
# From text
uv run {baseDir}/scripts/generate_podcast.py --text "Your content here" --filename "./media/generated/drafts/daily-intel.mp3"
# From a markdown file
uv run {baseDir}/scripts/generate_podcast.py -f ./notes/2026-02-09.md --filename "./media/generated/drafts/daily-digest.mp3"
# From multiple files
uv run {baseDir}/scripts/generate_podcast.py -f ./notes/monday.md -f ./notes/tuesday.md --filename "./media/generated/drafts/weekly-recap.mp3"
# From a URL
uv run {baseDir}/scripts/generate_podcast.py -u "https://example.com/article" --filename "./media/generated/drafts/article-pod.mp3"
# Mixed sources
uv run {baseDir}/scripts/generate_podcast.py -f ./report.md -u "https://blog.com/post" --text "Additional context" --filename "./media/generated/drafts/mixed.mp3"
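The same invocations can be assembled programmatically when the script is driven from another tool. The following is a minimal sketch, not part of the skill itself: `build_command` and its parameter names are hypothetical, and it uses only the flags shown in the quick start.

```python
from typing import List, Optional, Sequence


def build_command(
    script_path: str,
    filename: str,
    files: Sequence[str] = (),
    urls: Sequence[str] = (),
    text: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for generate_podcast.py (hypothetical helper)."""
    cmd = ["uv", "run", script_path]
    for path in files:
        cmd += ["-f", path]  # -f may be repeated, once per file
    for url in urls:
        cmd += ["-u", url]  # -u may be repeated, once per URL
    if text:
        cmd += ["--text", text]
    cmd += ["--filename", filename]
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with free-form `--text` content.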
Parameters
Input sources (at least one required)
| Parameter | Description |
| --- | --- |
| `--text` | Raw text to convert |
| `--file`, `-f` | Path to text/markdown/PDF file (repeatable) |
| `--url`, `-u` | URL to include as source (repeatable) |
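The "repeatable" and "at least one required" rules can be expressed with `argparse`. This is only a sketch of how such a parser could be declared; the real script's internals are not shown in this document, and `make_parser` and `parse_sources` are hypothetical names.

```python
import argparse
from typing import List, Optional


def make_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="generate_podcast.py")
    parser.add_argument("--text", help="Raw text to convert")
    # action="append" collects every occurrence of the flag into a list.
    parser.add_argument("--file", "-f", action="append", default=[],
                        help="Path to text/markdown/PDF file (repeatable)")
    parser.add_argument("--url", "-u", action="append", default=[],
                        help="URL to include as source (repeatable)")
    return parser


def parse_sources(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = make_parser()
    args = parser.parse_args(argv)
    # Enforce "at least one input source"; parser.error exits with status 2.
    if not (args.text or args.file or args.url):
        parser.error("provide at least one of --text, --file, or --url")
    return args
```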
Output
| Parameter | Default | Description |
| --- | --- | --- |
| `--filename` | `./media/generated/drafts/podcast.mp3` | Output audio path |
| `--save-transcript` | none | Save conversation transcript to path |
| `--transcript-only` | `false` | Only generate transcript, skip audio |
Podcast style
| Parameter | Default | Description |
| --- | --- | --- |
| `--name` | none | Podcast name |
| `--tagline` | none | Podcast tagline |
| `--language` | English | Output language (40+ supported) |
| `--style` | `engaging,informative,conversational` | Comma-separated styles |
| `--instructions` | none | Custom focus/topic instructions |
| `--creativity` | 0.7 | Temperature (0.0 to 1.0) |
| `--ending` | "Thanks for listening!" | Closing message |
TTS provider
| Parameter | Default | Description |
| --- | --- | --- |
| `--tts` | `minimax` | TTS provider: `minimax` (default, studio voices), `edge` (free fallback), `openai`, `elevenlabs`, `gemini`, `geminimulti` |
| `--voice1` | auto | Voice for host 1 (questioner); the minimax default is `English_ManWithDeepVoice` |
| `--voice2` | auto | Voice for host 2 (answerer); the minimax default is `Wise_Woman` |
LLM
| Parameter | Default | Description |
| --- | --- | --- |
| `--llm-model` | `gemini-3-flash-preview` | LLM for script generation |
| `--gemini-key` | `GEMINI_API_KEY` env | API key override |
Length
| Parameter | Default | Description |
| --- | --- | --- |
| `--longform` | `false` | Generate longer podcast (10-30+ min) |
| `--max-chunks` | 5 (short) / 15 (long) | Max discussion rounds |
TTS providers comparison
| Provider | Cost | Quality | API Key | Best for |
| --- | --- | --- | --- | --- |
| `edge` | Free | Good | None | Daily digests, quick summaries |
| `openai` | ~$15/1M chars | Great | `OPENAI_API_KEY` | Professional quality |
| `elevenlabs` | ~$180/1M chars | Excellent | `ELEVENLABS_API_KEY` | Premium voice customization |
| `minimax` | ~$0.01/episode | Excellent | `REPLICATE_API_TOKEN` | Studio-grade, 300+ voices, 40+ languages |
| `geminimulti` | Varies | Excellent | `GEMINI_API_KEY` | English, natural multi-speaker |
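For the providers priced per character, the cost of one episode scales linearly with transcript length. The sketch below works through that arithmetic using the approximate prices from the comparison above; `estimate_tts_cost` is a hypothetical helper, and the 6,000-character transcript is an illustrative length, not a measured one. `minimax` and `geminimulti` are left out because they are not priced per character here.

```python
# Approximate USD prices per 1M characters, copied from the comparison table.
PRICE_PER_MILLION_CHARS = {
    "edge": 0.0,
    "openai": 15.0,
    "elevenlabs": 180.0,
}


def estimate_tts_cost(num_chars: int, provider: str) -> float:
    """Return the approximate TTS cost in USD for a transcript of num_chars."""
    return num_chars / 1_000_000 * PRICE_PER_MILLION_CHARS[provider]


# Illustrative transcript length (an assumption, not a measurement).
transcript_chars = 6_000
for provider in PRICE_PER_MILLION_CHARS:
    cost = estimate_tts_cost(transcript_chars, provider)
    print(f"{provider}: ~${cost:.2f}")
```

At that length, OpenAI comes to roughly $0.09 and ElevenLabs to roughly $1.08 per episode.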
Examples
Daily intelligence digest (2-5 min, free with --tts edge)
uv run {baseDir}/scripts/generate_podcast.py \
-f ~/org/shared/memory/2026-02-09.md \
--name "Daily Intel" \
--tagline "Your morning briefing" \
--style "concise,professional,informative" \
--instructions "Summarize key findings and action items" \
--filename "./media/generated/drafts/2026-02-09-daily-intel.mp3"
Weekly project recap (10+ min, longform)
uv run {baseDir}/scripts/generate_podcast.py \
-f ~/org/shared/projects/acme/research/weekly-summary.md \
-f ~/org/shared/memory/2026-02-09.md \
--name "Project Pulse" \
--longform \
--style "analytical,engaging,thorough" \
--instructions "Focus on progress, blockers, and decisions made this week" \
--filename "./media/generated/drafts/2026-02-09-weekly-recap.mp3"
MiniMax voices (studio-grade, recommended)
uv run {baseDir}/scripts/generate_podcast.py \
-f ~/org/shared/memory/2026-02-09.md \
--tts minimax \
--voice1 English_ManWithDeepVoice --voice2 Wise_Woman \
--name "Daily Intel" \
--filename "./media/generated/drafts/2026-02-09-daily-intel.mp3"
Premium quality with OpenAI voices
uv run {baseDir}/scripts/generate_podcast.py \
-f ./report.md \
--tts openai \
--voice1 echo --voice2 shimmer \
--filename "./media/generated/drafts/premium-podcast.mp3"
API keys
- `GEMINI_API_KEY`: Required for script generation (already configured)
- `REPLICATE_API_TOKEN`: Only needed for `--tts minimax`, which is the default (already configured)
- `OPENAI_API_KEY`: Only if using `--tts openai`
- `ELEVENLABS_API_KEY`: Only if using `--tts elevenlabs`
- Edge TTS (`--tts edge`) requires no API key at all
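A caller can check these key requirements before invoking the script and fall back to Edge TTS when the needed key is missing. This is a minimal sketch: `REQUIRED_ENV` and `pick_tts` are hypothetical names, and falling back to `edge` for every provider is an assumption (the document only describes that fallback for a missing `REPLICATE_API_TOKEN`). The `gemini` provider is omitted because its key requirement is not stated here.

```python
import os
from typing import Mapping, Optional

# Provider -> environment variable it needs (None means no key required).
REQUIRED_ENV = {
    "minimax": "REPLICATE_API_TOKEN",
    "openai": "OPENAI_API_KEY",
    "elevenlabs": "ELEVENLABS_API_KEY",
    "geminimulti": "GEMINI_API_KEY",
    "edge": None,
}


def pick_tts(preferred: str = "minimax",
             env: Optional[Mapping[str, str]] = None) -> str:
    """Return preferred if its key is set, otherwise the free 'edge' fallback."""
    env = os.environ if env is None else env
    key = REQUIRED_ENV[preferred]
    if key is None or env.get(key):
        return preferred
    return "edge"
```

The returned name can be passed directly as the value of `--tts`.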
Notes
- Always use `./` relative paths for filenames so OpenClaw can auto-attach via chat.
- The script prints a `MEDIA:` line for OpenClaw to auto-attach on supported chat providers.
- Default TTS is MiniMax Speech 2.8 (studio quality, ~$0.01/episode). Use `--tts edge` for a free fallback when `REPLICATE_API_TOKEN` is unavailable.
- Longform podcasts can take 2-5 minutes to generate depending on content length.
- Supports 40+ languages for both script and TTS.
- Do not read the audio back; report the saved path only.
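To report the saved path, a wrapper can pick the `MEDIA:` line out of the script's standard output. The exact format of that line is not specified in this document, so the sketch below assumes it is the prefix `MEDIA:` followed by the path; `find_media_path` is a hypothetical helper.

```python
from typing import Optional

MEDIA_PREFIX = "MEDIA:"


def find_media_path(stdout: str) -> Optional[str]:
    """Return the path from the first 'MEDIA:' line, or None if absent.

    Assumes the line looks like 'MEDIA: <path>'; the real format may differ.
    """
    for line in stdout.splitlines():
        if line.startswith(MEDIA_PREFIX):
            return line[len(MEDIA_PREFIX):].strip()
    return None
```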