name: whispercpp-transcribe
description: Transcribe local audio/video files offline using whisper.cpp (the C++ port of OpenAI Whisper), generating plain text, timestamped, SRT, and JSON outputs. Use when the user wants fast native-speed transcription with GGML quantized models, or prefers whisper.cpp over Python-based alternatives like faster-whisper. Triggers on mentions of whisper.cpp, whisper-cli, GGML models, or requests for high-performance local transcription.
whisper.cpp Transcribe
Use this skill for local-only transcription with whisper.cpp (whisper-cli).
The key advantage over Python-based alternatives such as faster-whisper is speed: whisper.cpp runs optimized C++ inference with optional GPU acceleration, supports quantized GGML models, and has a small memory footprint.
Quick start
python3 scripts/transcribe_whispercpp.py "path/to/audio.mp4" \
--model-path ~/models/ggml-small.bin \
--output-dir ./output/transcribe-whispercpp
Workflow
- Ensure whisper-cli is installed and a GGML model is downloaded.
- Run the bundled script on one or more local media files.
- Read .transcript.txt for plain text and .transcript.timed.txt for timestamps.
- If quality is low, use a larger model (ggml-medium.bin or ggml-large-v3-q5_0.bin).
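The first workflow step can be checked programmatically before a long run. The following is a minimal sketch, not part of the bundled script: the function name `check_prerequisites` is hypothetical, and it only assumes that the binary is called whisper-cli and is expected on PATH.

```python
import shutil
from pathlib import Path


def check_prerequisites(model_path, binary="whisper-cli"):
    """Return a list of problems; an empty list means the run can start.

    Hypothetical helper for illustration only.
    """
    problems = []
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(binary) is None:
        problems.append(f"{binary} not found on PATH")
    # Expand "~" so paths like ~/models/ggml-small.bin resolve correctly.
    model = Path(model_path).expanduser()
    if not model.is_file():
        problems.append(f"model file not found: {model}")
    return problems
```

Calling `check_prerequisites("~/models/ggml-small.bin")` and printing the returned list gives a quick readiness report without starting a transcription.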
Commands
Single file:
python3 scripts/transcribe_whispercpp.py "./input/video.mp4" \
--model-path ~/models/ggml-small.bin \
--language pt \
--output-dir ./output/transcribe-whispercpp
Multiple files:
python3 scripts/transcribe_whispercpp.py "./a.mp3" "./b.wav" \
--model-path ~/models/ggml-small.bin \
--output-dir ./output/transcribe-whispercpp
Force WAV conversion (useful for formats whisper-cli struggles with):
python3 scripts/transcribe_whispercpp.py "./input/video.mp4" \
--model-path ~/models/ggml-small.bin \
--force-wav \
--output-dir ./output/transcribe-whispercpp
Outputs
For each input file <name>:
- <name>.transcript.txt: plain text transcript
- <name>.transcript.timed.txt: timed transcript in [start --> end] text format
- <name>.transcript.json: structured JSON with segments
- <name>.srt: SRT subtitle file (generated by whisper-cli)
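To make the relationship between these formats concrete, here is a rough sketch of how the three text outputs could be rendered from one list of segments. The segment dictionary keys (`start`, `end`, `text`), the one-segment-per-line plain text layout, and the top-level `segments` key in the JSON are all assumptions for illustration; the bundled script's actual schema may differ.

```python
import json


def render_plain(segments):
    """Plain transcript: segment text only, one segment per line (assumed layout)."""
    return "\n".join(seg["text"].strip() for seg in segments)


def render_timed(segments):
    """Timed transcript in the documented '[start --> end] text' format."""
    return "\n".join(
        f"[{seg['start']} --> {seg['end']}] {seg['text'].strip()}"
        for seg in segments
    )


def render_json(segments):
    """Structured JSON; the {'segments': [...]} wrapper is a hypothetical schema."""
    return json.dumps({"segments": segments}, ensure_ascii=False, indent=2)
```

Keeping timestamps as strings avoids rounding issues when the same values are written back out in the timed format.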
Model download
Download GGML models from Hugging Face:
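# Create the target directory first (curl -o does not create missing directories)
mkdir -p ~/models
# Small model (~500MB, good balance of speed and quality)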
curl -L -o ~/models/ggml-small.bin \
'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-small.bin'
# Large v3 quantized (~1GB, best quality with reasonable size)
curl -L -o ~/models/ggml-large-v3-q5_0.bin \
'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-large-v3-q5_0.bin'
Or use the bundled download script from whisper.cpp:
sh ./models/download-ggml-model.sh small
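Both curl URLs above follow the same `ggml-<name>.bin` pattern, so the download can also be scripted. This is a minimal sketch under that assumption; `model_url` and `download_model` are hypothetical helper names, and the sketch does not verify that a given model name actually exists in the repository.

```python
import urllib.request
from pathlib import Path
from urllib.parse import quote

# Base URL taken from the curl examples in this document.
BASE_URL = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main"


def model_url(name):
    """Build the download URL for a model name such as 'small' or 'large-v3-q5_0'."""
    return f"{BASE_URL}/ggml-{quote(name)}.bin"


def download_model(name, dest_dir="~/models"):
    """Download a model into dest_dir and return the local path."""
    dest = Path(dest_dir).expanduser()
    # Create the directory up front so the download does not fail on a fresh machine.
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / f"ggml-{name}.bin"
    urllib.request.urlretrieve(model_url(name), str(target))
    return target
```

For example, `download_model("small")` would fetch the same file as the first curl command and place it at ~/models/ggml-small.bin.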
Dependencies
Install whisper-cli (one of):
# macOS via Homebrew
brew install whisper-cpp
# pip (cross-platform, no GPU accel)
pip install whisper.cpp-cli
# Or build from source
git clone https://github.com/ggml-org/whisper.cpp.git
cd whisper.cpp && cmake -B build && cmake --build build -j --config Release
Required:
- ffmpeg: converts audio to 16 kHz WAV when needed.
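The conversion step can be sketched as follows. The ffmpeg flags are standard (`-ar 16000` for the sample rate, `-ac 1` for mono, `-c:a pcm_s16le` for 16-bit PCM), but the function names are hypothetical, the mono downmix is an assumption, and the set of natively supported extensions is taken from the Notes section of this document rather than from the script itself.

```python
import subprocess
from pathlib import Path

# Formats this document says whisper-cli reads directly.
NATIVE_EXTENSIONS = {".flac", ".mp3", ".ogg", ".wav"}


def needs_conversion(path, force_wav=False):
    """True when the file should go through ffmpeg before transcription."""
    return force_wav or Path(path).suffix.lower() not in NATIVE_EXTENSIONS


def ffmpeg_command(src, dst):
    """Build an ffmpeg call producing 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg", "-y",          # overwrite dst without prompting
        "-i", str(src),
        "-vn",                   # drop any video stream
        "-ac", "1",              # mono (assumed)
        "-ar", "16000",          # 16 kHz sample rate
        "-c:a", "pcm_s16le",     # 16-bit little-endian PCM
        str(dst),
    ]


def convert_to_wav(src, dst):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(ffmpeg_command(src, dst), check=True)
```

Passing the command as a list rather than a shell string means file names with spaces or quotes need no extra escaping.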
Notes
- This flow is local-only and does not use any API key.
- whisper-cli natively supports flac, mp3, ogg, and wav. For other formats the script auto-converts via ffmpeg.
- Use --threads N to control CPU thread count (default: 4).
- The script generates SRT output via whisper-cli's -osrt flag and then parses stdout for the timed text output.
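The last two notes can be illustrated with a short sketch of the invocation and the stdout parsing. The whisper-cli flags used here (`-m`, `-f`, `-t`, `-l`, `-osrt`, `-of`) are real options, but the function names and the segment dictionary layout are hypothetical, and the regular expression assumes whisper-cli's usual `[HH:MM:SS.mmm --> HH:MM:SS.mmm]   text` line format. The bundled script may do this differently.

```python
import re

# One transcript line per segment, e.g.
# [00:00:00.000 --> 00:00:04.000]   some text
SEGMENT_RE = re.compile(
    r"^\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s*(.*)$"
)


def whisper_command(model, audio, out_prefix, threads=4, language=None):
    """Build a whisper-cli call that also writes <out_prefix>.srt."""
    cmd = [
        "whisper-cli",
        "-m", str(model),
        "-f", str(audio),
        "-t", str(threads),
        "-osrt",                 # write an SRT file
        "-of", str(out_prefix),  # output path without extension
    ]
    if language:
        cmd += ["-l", language]
    return cmd


def parse_segments(stdout):
    """Extract timed segments from stdout, skipping log and blank lines."""
    segments = []
    for line in stdout.splitlines():
        match = SEGMENT_RE.match(line.strip())
        if match:
            start, end, text = match.groups()
            segments.append({"start": start, "end": end, "text": text.strip()})
    return segments
```

The command list can be passed to `subprocess.run(..., capture_output=True, text=True)`, and the resulting `stdout` string fed to `parse_segments`.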