voice-agent

star 3

Local Voice Input/Output for Agents using the AI Voice Agent API.

duclm1x1 By duclm1x1 schedule Updated 2/10/2026

name: voice-agent display-name: AI Voice Agent Backend version: 1.0.0 description: Local Voice Input/Output for Agents using the AI Voice Agent API. author: trevisanricardo homepage: https://github.com/ricardotrevisan/ai-conversational-skill user-invocable: true disable-model-invocation: false

Voice Agent

This skill allows you to speak and listen to the user using the local Voice Agent API.

Behavior Guidelines

  • Audio First: When the user communicates via audio (files), your PRIMARY mode of response is Audio File.
  • Silent Delivery: When sending an audio response, DO NOT send a text explanation like "I sent an audio". Just send the audio file.
  • Workflow:
    1. User sends audio.
    2. You usage transcribe to read it.
    3. You think of a response.
    4. You usage synthesize to generate the audio file.
    5. You send the file.
    6. STOP. Do not add text commentary.

Tools

Transcribe File

To transcribe an audio file (Speech-to-Text), run the client script with the transcribe command.

python3 {baseDir}/scripts/client.py transcribe "/path/to/audio/file.ogg"

Synthesize to File

To generate audio from text and save it to a file (Text-to-Speech), run the client script with the synthesize command.

python3 {baseDir}/scripts/client.py synthesize "Text to speak" --output "/path/to/output.mp3"

Health Check

To check if the voice agent API is running and healthy:

python3 {baseDir}/scripts/client.py health

Service Management

If the Health Check fails or you receive a connection error, the service may be stopped. You can attempt to start it by running:

{baseDir}/scripts/start.sh
Install via CLI
npx skills add https://github.com/duclm1x1/Dive-Ai --skill voice-agent
Repository Details
star Stars 3
call_split Forks 2
navigation Branch main
article Path SKILL.md
More from Creator