transcribe

name: "transcribe"
description: "Use when the user wants to transcribe audio or video, extract speech to text, or label speakers with optional diarization; prefer the bundled `scripts/transcribe_diarize.py` and require `OPENAI_API_KEY`."

Use for converting recordings into text, optional speaker diarization, and structured transcript output for meetings, interviews, or media assets.

Confirm the audio source, expected output format, and whether the user needs plain text or diarized output.
Collect any hints that materially improve recognition quality: language, known speaker names, or reference audio.
Prefer the bundled `scripts/transcribe_diarize.py` so the workflow remains deterministic and reusable.
Start with the simplest successful output, then add diarization or richer structure only when the user actually needs it.
Validate transcript quality, speaker labels, and segment boundaries before calling it done.
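The validation step can be partly automated. The sketch below is one possible check, not part of the bundled script: it assumes a diarized transcript has already been parsed into a list of dicts with hypothetical keys `speaker`, `start`, `end`, and `text` (times in seconds), which may not match the script's actual output schema. It flags missing speaker labels, empty text, inverted boundaries, and overlapping segments.

```python
from typing import Dict, List


def validate_segments(segments: List[Dict]) -> List[str]:
    """Return human-readable problems; an empty list means nothing was flagged.

    Assumed (hypothetical) segment shape:
    {"speaker": str, "start": float, "end": float, "text": str}
    """
    problems = []
    previous_end = 0.0
    for index, segment in enumerate(segments):
        start, end = segment["start"], segment["end"]
        # Speaker label must be present and non-blank.
        if not str(segment.get("speaker", "")).strip():
            problems.append(f"segment {index}: missing speaker label")
        # Text must contain something other than whitespace.
        if not str(segment.get("text", "")).strip():
            problems.append(f"segment {index}: empty text")
        # A segment must end after it starts.
        if end <= start:
            problems.append(f"segment {index}: end {end} is not after start {start}")
        # Flag overlap with any earlier segment for manual review.
        if start < previous_end:
            problems.append(
                f"segment {index}: starts at {start}, before previous end {previous_end}"
            )
        previous_end = max(previous_end, end)
    return problems
```

Overlapping speech can be legitimate in real recordings, so treat the overlap message as a prompt to review the boundary rather than a hard failure. Transcript quality itself (misheard words, wrong names) still needs a human spot check against the audio.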

`scripts/transcribe_diarize.py` supports transcription plus optional diarization with OpenAI audio models.
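Before invoking the script, it can help to confirm the prerequisites named above. The following is a minimal preflight sketch; the `preflight` helper is hypothetical and not part of the bundled script. It only checks that `OPENAI_API_KEY` is set and that the audio file and script exist, and it deliberately makes no assumption about the script's command-line flags.

```python
import os
from pathlib import Path
from typing import List, Mapping, Optional


def preflight(
    audio_path: str,
    script_path: str = "scripts/transcribe_diarize.py",
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return a list of blocking problems; an empty list means ready to run."""
    env = os.environ if env is None else env
    problems = []
    # The skill requires an API key in the environment.
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    # The audio source must exist as a regular file.
    if not Path(audio_path).is_file():
        problems.append(f"audio file not found: {audio_path}")
    # The bundled script path is relative to the skill directory.
    if not Path(script_path).is_file():
        problems.append(f"bundled script not found: {script_path}")
    return problems
```

Passing `env` explicitly keeps the check easy to test; in normal use, call `preflight("meeting.wav")` from the skill directory and resolve anything it reports before running the script.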