tts

Text-to-speech: convert text to audio using gTTS or edge-tts. Use when the user asks to read text aloud, generate an audio file from text, or create a voiceover. Not for speech-to-text/transcription (use Deepgram or Whisper), music generation, or audio editing.

By ericwang915 · Updated 3/6/2026

name: tts
description: "Text-to-speech — convert text to audio using gTTS or edge-tts. Use when: user asks to read text aloud, generate an audio file from text, or create a voiceover. NOT for: speech-to-text/transcription (use Deepgram or whisper), music generation, or audio editing."
dependencies: gTTS
metadata:
  emoji: "🔊"

Text-to-Speech

Convert text to spoken audio files.

When to Use

USE this skill when:

  • "Read this text aloud"
  • "Generate an audio file of this paragraph"
  • "Create a voiceover for this script"
  • "Convert this text to speech"
  • "Say this in Spanish"

When NOT to Use

DON'T use this skill when:

  • Speech-to-text / transcription → use Deepgram or Whisper
  • Music generation → use specialised music tools
  • Audio editing or effects → use ffmpeg or audio editors
  • Playing existing audio files → use system player

Usage

Basic TTS (using gTTS — Google Translate TTS)

python {skill_path}/speak.py "Hello, this is PythonClaw speaking." --output hello.mp3

Options

# Different language
python {skill_path}/speak.py "Bonjour le monde" --lang fr

# Slow speed
python {skill_path}/speak.py "Important message" --slow

# Custom output path
python {skill_path}/speak.py "Text here" --output ~/audio/speech.mp3
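
The source of speak.py is not reproduced here, so the following is only a minimal sketch of how the flags above could map onto the gTTS library. The function names `build_parser` and `speak_gtts` and the default file name `output.mp3` are assumptions made for illustration; only the `gTTS(text=..., lang=..., slow=...)` constructor and its `save()` method come from gTTS itself.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the --lang, --slow and --output flags."""
    parser = argparse.ArgumentParser(description="Text-to-speech (sketch)")
    parser.add_argument("text", help="text to convert to speech")
    parser.add_argument("--lang", default="en", help="language code, e.g. fr")
    parser.add_argument("--slow", action="store_true", help="speak more slowly")
    # Assumed default file name; the real script may choose a different one.
    parser.add_argument("--output", default="output.mp3", help="MP3 file to write")
    return parser


def speak_gtts(text: str, output: str, lang: str = "en", slow: bool = False) -> Path:
    """Synthesize text with gTTS and save it as an MP3 (needs network access)."""
    # Imported lazily so the parser can be used even when gTTS is not installed.
    from gtts import gTTS

    path = Path(output).expanduser()
    path.parent.mkdir(parents=True, exist_ok=True)
    gTTS(text=text, lang=lang, slow=slow).save(str(path))
    return path
```

With this sketch, `args = build_parser().parse_args()` followed by `speak_gtts(args.text, args.output, lang=args.lang, slow=args.slow)` would reproduce the three options shown above.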

Alternative: edge-tts (higher quality, more voices)

If edge-tts is installed (pip install edge-tts):

python {skill_path}/speak.py "Hello world" --engine edge --voice en-US-AriaNeural

Available edge-tts voices (examples)

  • en-US-AriaNeural — Female, US English (default)
  • en-US-GuyNeural — Male, US English
  • en-GB-SoniaNeural — Female, British English
  • zh-CN-XiaoxiaoNeural — Female, Chinese
  • ja-JP-NanamiNeural — Female, Japanese
  • de-DE-KatjaNeural — Female, German

List all voices: python -m edge_tts --list-voices
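
For readers who want to call edge-tts directly rather than through speak.py, here is a hedged sketch using the library's `Communicate` class and `list_voices()` coroutine. The helper names `speak_edge`, `speak_edge_sync`, `filter_voices` and `voices_for` are hypothetical, and the sketch assumes each voice entry is a dict with `ShortName` and `Locale` keys.

```python
import asyncio

DEFAULT_VOICE = "en-US-AriaNeural"  # the default voice listed above


async def speak_edge(text: str, output: str, voice: str = DEFAULT_VOICE) -> None:
    """Synthesize text with edge-tts and save it as an MP3 (needs network access)."""
    import edge_tts  # imported lazily so the helpers below work without it

    await edge_tts.Communicate(text, voice=voice).save(output)


def speak_edge_sync(text: str, output: str, voice: str = DEFAULT_VOICE) -> None:
    """Blocking wrapper for callers that are not already running an event loop."""
    asyncio.run(speak_edge(text, output, voice))


def filter_voices(voices: list[dict], locale_prefix: str) -> list[str]:
    """Return the sorted short names of voices whose locale starts with the prefix."""
    return sorted(
        v["ShortName"] for v in voices if v["Locale"].startswith(locale_prefix)
    )


async def voices_for(locale_prefix: str) -> list[str]:
    """Fetch the voice catalogue and keep only one locale, e.g. 'en-GB' or 'ja'."""
    import edge_tts

    return filter_voices(await edge_tts.list_voices(), locale_prefix)
```

For example, `speak_edge_sync("Hello world", "hello.mp3")` would write the file using the default voice, and `asyncio.run(voices_for("de"))` would narrow the full voice list to German voices.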

Notes

  • gTTS requires internet (uses Google Translate's TTS endpoint)
  • edge-tts requires internet (uses Microsoft Edge's TTS service)
  • Output format is MP3 by default
  • For offline TTS, consider pyttsx3 (lower quality but no network needed)
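
The notes above suggest a simple fallback: use a network engine when online, and pyttsx3 otherwise. The sketch below illustrates one way to express that choice. It is not taken from speak.py; the names `installed`, `choose_engine` and `speak_offline` and the preference order (gTTS, then edge-tts, then pyttsx3) are assumptions.

```python
import importlib.util
from typing import Callable


def installed(module: str) -> bool:
    """Report whether a module can be imported, without importing it."""
    return importlib.util.find_spec(module) is not None


def choose_engine(online: bool, is_installed: Callable[[str], bool] = installed) -> str:
    """Pick an engine name: a network engine if online, else pyttsx3."""
    if online:
        # Assumed preference order: the skill's declared dependency first.
        for name, module in (("gtts", "gtts"), ("edge", "edge_tts")):
            if is_installed(module):
                return name
    if is_installed("pyttsx3"):
        return "pyttsx3"
    raise RuntimeError("no usable TTS engine is installed")


def speak_offline(text: str, output: str) -> None:
    """Synthesize text locally with pyttsx3; no network is needed."""
    import pyttsx3  # imported lazily so choose_engine works without it

    engine = pyttsx3.init()
    engine.save_to_file(text, output)
    engine.runAndWait()  # blocks until the queued file has been written
```

Note that pyttsx3 relies on the platform's speech driver, so the file it writes may not be an MP3 even if the name ends in `.mp3`.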

Resources

File       Description
speak.py   Multi-engine TTS wrapper (gTTS + edge-tts)
Install via CLI
npx skills add https://github.com/ericwang915/PythonClaw --skill tts
Repository Details
Stars 32
Forks 10
Branch main
Path SKILL.md