name: whisper-transcribe-docker
description: Speech-to-text (逐字稿/转写) in Docker using faster-whisper (local, no API key). Use when you already have an audio file (e.g. from media-audio-download) and need a transcript with optional timestamps for summarization.
Whisper Transcribe (Docker, faster-whisper)
This skill turns an audio file into a transcript locally (no OpenAI key).
Use with media-audio-download:
- Download audio ->
out/*.m4a - Transcribe ->
out/*.txt(or JSON)
Quick Start
Build image:
docker build -t moltbot-whisper-transcribe {baseDir}
Transcribe an audio file (writes plain text to stdout by default):
docker run --rm -v "$PWD:/work" -v whisper-models:/models \
moltbot-whisper-transcribe /work/out/audio.m4a --model small
If huggingface.co is blocked/unreachable in your network, set a mirror endpoint:
docker run --rm -e HF_ENDPOINT='https://hf-mirror.com' -v "$PWD:/work" -v whisper-models:/models \
moltbot-whisper-transcribe /work/out/audio.m4a --model small
Write transcript to a file:
docker run --rm -v "$PWD:/work" -v whisper-models:/models \
moltbot-whisper-transcribe /work/out/audio.m4a --model small --out /work/out/audio.txt
With timestamps:
docker run --rm -v "$PWD:/work" -v whisper-models:/models \
moltbot-whisper-transcribe /work/out/audio.m4a --model small --timestamps --out /work/out/audio.txt
Notes:
- First run downloads model weights (cached in the
whisper-modelsDocker volume). - For speed, start with
--model tiny/--model base. - For quality, use
--model medium(CPU will be slower).