ai-voice-memo

Transcribe voice memos, summarize key points, and extract action items.

By DAMediaCo. Updated 2/12/2026

name: ai-voice-memo
description: Transcribe voice memos, summarize key points, and extract action items.
metadata:
  {
    "openclaw":
      {
        "emoji": "🎙️",
        "requires": { "bins": ["curl"], "env": ["OPENAI_API_KEY"] },
        "primaryEnv": "OPENAI_API_KEY",
      },
  }

AI Voice Memo

Receive a voice note or audio file, transcribe it, summarize key points, and extract action items.

Workflow

  1. Receive audio: voice message (Telegram/Discord) or uploaded file
  2. Transcribe: send to OpenAI Whisper API via openai-whisper-api skill
  3. Analyze: summarize and extract action items using the LLM
  4. Respond: formatted output; optionally a TTS audio summary

Supported Formats

mp3, m4a, wav, ogg, webm (anything Whisper accepts)
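
A quick extension check can reject unsupported files before any API call is made. The sketch below is not part of the skill: `is_supported_audio` is a hypothetical name, and it only inspects the file extension against the list above rather than probing the actual audio container.

```shell
# Hypothetical helper (not shipped with the skill): returns 0 when the file
# extension is one of the formats listed above, 1 otherwise.
is_supported_audio() {
  local file="$1"
  local ext
  # A name without any dot has no extension to check.
  case "$file" in
    *.*) ;;
    *) return 1 ;;
  esac
  # Take the text after the last dot and lowercase it, so "NOTE.WAV" passes.
  ext=$(printf '%s' "${file##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    mp3|m4a|wav|ogg|webm) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_audio /path/to/audio.m4a || echo "unsupported format"` would skip the transcription step for files such as `notes.txt`.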

Usage

Step 1: Transcribe

Use the openai-whisper-api skill to get a transcript:

{baseDir}/../../../opt/homebrew/lib/node_modules/openclaw/skills/openai-whisper-api/scripts/transcribe.sh /path/to/audio.m4a --out /tmp/transcript.txt

Or use the bundled helper that does transcribe + analyze in one shot:

{baseDir}/scripts/process-memo.sh /path/to/audio.m4a

This outputs a JSON file with summary, action_items, and transcript.
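
If neither script is available, the underlying request can be approximated with curl alone. This is a sketch under assumptions, not the code inside `transcribe.sh`: `transcribe_direct` is a hypothetical name, and it assumes the `whisper-1` model and plain-text output from the OpenAI transcription endpoint.

```shell
# Sketch only: a direct call to the OpenAI audio transcription endpoint.
# Assumes OPENAI_API_KEY is set and curl is on PATH. transcribe_direct is a
# hypothetical name, not a function provided by either skill.
transcribe_direct() {
  local audio="$1" out="$2"
  if [ ! -f "$audio" ]; then
    echo "audio file not found: $audio" >&2
    return 1
  fi
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  # response_format=text asks for the bare transcript instead of JSON.
  curl -sS --fail https://api.openai.com/v1/audio/transcriptions \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -F "file=@$audio" \
    -F "model=whisper-1" \
    -F "response_format=text" \
    -o "$out"
}
```

Usage would look like `transcribe_direct /path/to/audio.m4a /tmp/transcript.txt`. The function checks the file and the key first so that a missing input fails locally instead of as an API error.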

Step 2: Analyze the Transcript

Feed the transcript to the LLM with this prompt structure:

Analyze this voice memo transcript. Provide:

## Summary
A concise summary of the key points (2-5 bullet points).

## Action Items
Extract any action items, tasks, or todos mentioned. Format as a checklist:
- [ ] Action item 1
- [ ] Action item 2

If no action items are found, say "No action items identified."

## Full Transcript
<paste transcript>
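
One way to assemble that prompt from a transcript file is sketched below. `build_memo_prompt` is a hypothetical helper, not something the skill ships; it prints the prompt text above and appends the transcript in place of the placeholder.

```shell
# Hypothetical helper: prints the analysis prompt followed by the transcript.
build_memo_prompt() {
  local transcript_file="$1"
  if [ ! -r "$transcript_file" ]; then
    echo "cannot read transcript: $transcript_file" >&2
    return 1
  fi
  # Quoted heredoc delimiter: the prompt text is emitted verbatim.
  cat <<'EOF'
Analyze this voice memo transcript. Provide:

## Summary
A concise summary of the key points (2-5 bullet points).

## Action Items
Extract any action items, tasks, or todos mentioned. Format as a checklist:
- [ ] Action item 1
- [ ] Action item 2

If no action items are found, say "No action items identified."

## Full Transcript
EOF
  cat "$transcript_file"
}
```

For instance, `build_memo_prompt /tmp/transcript.txt > /tmp/memo-prompt.txt` would produce a file ready to hand to the LLM.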

Step 3: Optional TTS Response

For an audio summary, use ElevenLabs TTS (voice: Charlie, ID: IKne3meq5aSn9XLyUdCD) to read back the summary and action items.
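
A curl call to the ElevenLabs text-to-speech endpoint might look like the sketch below. Several things here are assumptions: the function name `speak_summary`, the environment variable `ELEVENLABS_API_KEY` (the skill metadata only declares `OPENAI_API_KEY`), and the simple escaping, which flattens newlines and tabs to spaces and escapes only backslashes and double quotes.

```shell
# Sketch only. speak_summary and ELEVENLABS_API_KEY are assumed names; the
# voice ID is the one given in this step.
speak_summary() {
  local text="$1" out="$2"
  local voice_id="IKne3meq5aSn9XLyUdCD"
  local json_text
  if [ -z "${ELEVENLABS_API_KEY:-}" ]; then
    echo "ELEVENLABS_API_KEY is not set" >&2
    return 1
  fi
  # Flatten newlines, tabs, and carriage returns to spaces, then escape
  # backslashes and double quotes so the text fits inside a JSON string.
  # Other control characters are not handled here.
  json_text=$(printf '%s' "$text" | tr '\n\t\r' '   ' \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  curl -sS --fail "https://api.elevenlabs.io/v1/text-to-speech/$voice_id" \
    -H "xi-api-key: $ELEVENLABS_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{\"text\": \"$json_text\"}" \
    -o "$out"
}
```

A call such as `speak_summary "Three key points and two action items." /tmp/summary.mp3` would write the returned audio to the given path.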

Output Format

๐ŸŽ™๏ธ Voice Memo Summary

## Summary
โ€ข Key point 1
โ€ข Key point 2
โ€ข Key point 3

## Action Items
- [ ] Task 1
- [ ] Task 2

## Full Transcript
<transcript text>

Example Agent Flow

When a user sends a voice message:

  1. Download the audio file to a temp path
  2. Run: openai-whisper-api/scripts/transcribe.sh <audio> --out /tmp/memo-transcript.txt
  3. Read the transcript
  4. Analyze with the LLM using the prompt above
  5. Reply with the formatted summary
  6. If the user requested audio, generate a TTS rendition of the summary
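
Steps 2 and 3 of this flow can be sketched as a small wrapper. `memo_transcript` and the `TRANSCRIBE_SH` variable are hypothetical; the variable is assumed to hold the path to `openai-whisper-api/scripts/transcribe.sh`, and the only interface relied on is the `<audio> --out <file>` form shown above.

```shell
# Sketch of steps 2-3: run the transcribe script, then read the transcript.
# TRANSCRIBE_SH is an assumed variable pointing at transcribe.sh.
memo_transcript() {
  local audio="$1"
  local out="${2:-/tmp/memo-transcript.txt}"
  if [ -z "${TRANSCRIBE_SH:-}" ]; then
    echo "TRANSCRIBE_SH is not set" >&2
    return 1
  fi
  "$TRANSCRIBE_SH" "$audio" --out "$out" || return 1
  # Treat an empty transcript as a failure rather than analyzing nothing.
  if [ ! -s "$out" ]; then
    echo "empty transcript for $audio" >&2
    return 1
  fi
  cat "$out"
}
```

The agent could then capture the text with `transcript=$(memo_transcript /tmp/voice.ogg)` and move on to the analysis step only when the call succeeds.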

Tips

  • For long memos (>5 min), the summary becomes more valuable
  • Action items work best when the speaker is explicit ("I need to...", "remind me to...")
  • The --language flag on transcribe helps with non-English memos
  • Works great as an automatic handler for all voice messages in a chat
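
As a rough illustration of the tip about explicit phrasing, a cheap pre-check could flag transcripts that contain such cue phrases. This is a hypothetical helper, not part of the skill; it only looks for the two example phrases quoted above and does not replace the LLM analysis.

```shell
# Hypothetical pre-check: succeeds when the transcript file contains one of
# the explicit cue phrases from the tip above (case-insensitive).
has_action_cues() {
  grep -qiE 'i need to|remind me to' "$1"
}
```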

Install via CLI

npx skills add https://github.com/DAMediaCo/ai-voice-memo --skill ai-voice-memo

Repository Details

Stars: 0
Forks: 0
Branch: main
Path: SKILL.md