transcribe


Transcribe audio and video files using the configured speech-to-text provider

By vellum-ai. Updated 6/11/2026

name: transcribe
description: Transcribe audio and video files using the configured speech-to-text provider
compatibility: "Designed for Vellum personal assistants"
metadata:
  emoji: "🎙️"
  vellum:
    display-name: "Transcribe"
    category: "voice"
    activation-hints:
      - "User has an audio or video file on disk they want converted to text"
      - "User wants speech-to-text on a recording, voice memo, podcast, or meeting capture"
      - "User asks for a transcript of a media file (mp3, wav, m4a, mp4, mov, etc.)"


Transcribe audio and video files using the configured speech-to-text (STT) provider. Several STT providers are supported, including OpenAI Whisper, Deepgram, and Google Gemini. The active provider is selected in Settings under Speech-to-Text (services.stt).
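
To make the provider selection concrete, here is a minimal Python sketch of how a caller might resolve the active provider and fail early when credentials are missing. Everything in it is an assumption for illustration: the provider identifiers, the nested layout of the services.stt setting, the `credentials` mapping, and the names `resolve_stt_provider`, `SttSelection`, and `SttConfigError` are hypothetical and are not taken from the Vellum codebase.

```python
from dataclasses import dataclass

# Hypothetical provider identifiers; the real values stored under
# services.stt may be spelled differently.
KNOWN_PROVIDERS = ("openai-whisper", "deepgram", "google-gemini")


class SttConfigError(Exception):
    """Raised when no usable STT provider can be resolved."""


@dataclass(frozen=True)
class SttSelection:
    provider: str
    api_key: str


def resolve_stt_provider(settings: dict, credentials: dict) -> SttSelection:
    """Pick the active provider from settings and check it has credentials."""
    # Assumed layout: settings["services"]["stt"]["provider"] holds the id.
    provider = settings.get("services", {}).get("stt", {}).get("provider")
    if provider not in KNOWN_PROVIDERS:
        raise SttConfigError(f"Unknown or missing STT provider: {provider!r}")
    api_key = credentials.get(provider)
    if not api_key:
        raise SttConfigError(
            f"No credentials configured for {provider}; "
            "add them in Settings under Speech-to-Text."
        )
    return SttSelection(provider=provider, api_key=api_key)
```

Raising a dedicated error type keeps the "no credentials" case distinguishable from transcription failures, so a caller can surface setup instructions instead of a generic error.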

Usage Notes

  • The tool accepts a file_path (absolute path to a local audio or video file) to transcribe.
  • Supported formats: any video (mp4, mov, etc.) or audio (mp3, wav, m4a, etc.) file.
  • For video files, audio is automatically extracted via ffmpeg before transcription.
  • Large files are automatically split into chunks for processing.
  • If no STT provider credentials are configured, the tool will return an error with setup instructions.
  • The STT provider (services.stt) is shared between transcription and telephony call paths.
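
The file handling described in these notes can be sketched roughly as follows. This is an illustrative Python sketch, not the tool's actual implementation: the extension sets, the mono 16 kHz output settings, the 600-second chunk length, and the function names `classify_media`, `ffmpeg_extract_audio_cmd`, and `plan_chunks` are all assumptions. Only the ffmpeg flags themselves (`-y`, `-i`, `-vn`, `-ac`, `-ar`) are standard ffmpeg options.

```python
import math
from pathlib import Path

# Illustrative extension sets; the tool's real list is not given here.
VIDEO_EXTS = {".mp4", ".mov"}
AUDIO_EXTS = {".mp3", ".wav", ".m4a"}


def classify_media(file_path: str) -> str:
    """Return "video" or "audio" for an absolute path, else raise ValueError."""
    path = Path(file_path)
    if not path.is_absolute():
        raise ValueError(f"file_path must be absolute: {file_path}")
    ext = path.suffix.lower()
    if ext in VIDEO_EXTS:
        return "video"
    if ext in AUDIO_EXTS:
        return "audio"
    raise ValueError(f"Unrecognized media extension: {ext or '(none)'}")


def ffmpeg_extract_audio_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg argv that drops video and writes mono 16 kHz audio."""
    # -y overwrite output, -vn disable video, -ac 1 mono, -ar 16000 sample rate
    # (the mono/16 kHz choice is an assumption, not a documented setting).
    return ["ffmpeg", "-y", "-i", src, "-vn", "-ac", "1", "-ar", "16000", dst]


def plan_chunks(duration_s: float, chunk_s: float = 600.0) -> list[tuple[float, float]]:
    """Split [0, duration_s) into consecutive (start, end) windows."""
    if duration_s <= 0 or chunk_s <= 0:
        return []
    count = math.ceil(duration_s / chunk_s)
    return [(i * chunk_s, min((i + 1) * chunk_s, duration_s)) for i in range(count)]
```

A caller would typically run the returned argument list with `subprocess.run(cmd, check=True)` when `classify_media` reports a video, then transcribe each planned window in turn and join the resulting text in order.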

Maintenance

When adding or modifying an STT provider, follow the onboarding checklist at assistant/docs/stt-provider-onboarding.md. That document covers the daemon catalog, config schema, adapter wiring, client catalog parity, and required tests.

Install via CLI
npx skills add https://github.com/vellum-ai/vellum-assistant --skill transcribe
Repository Details
Stars: 695
Forks: 102
Branch: main
Path: SKILL.md