phone-call - SKILL.md Agent Skill

name: phone-call description: > Make an outbound phone call via SIP and conduct a voice conversation using spawn_live_audio. The AI model talks through the Vortex Audio virtual device, which pjsua routes to the SIP call. Returns typed structured data from the conversation. compatibility: macOS only. Requires Vortex Audio HAL plugin, pjsua, and a GUI session with TCC mic permission.

Phone Call via SIP + Live Audio

Prerequisites

pjsua at ~/bin/pjsua
Vortex Audio HAL plugin installed and set as default input AND output
SIP credentials in ~/lin (plaintext password)
Intendant launched from GUI (required for macOS mic access / TCC)

Steps

1. Find Vortex Audio device index

echo "q" | ~/bin/pjsua --null-audio 2>/dev/null | grep -i vortex

Note the 0-indexed device ID from the output line.

2. Start pjsua

Replace DEV_IDX, PASSWORD (from ~/lin), and TARGET (SIP URI):

(sleep 5 && echo m && sleep 1 && echo TARGET && sleep 300) | \
  ~/bin/pjsua \
    --id="sip:intendant7@sip.linphone.org" \
    --registrar="sip:sip.linphone.org" \
    --realm="sip.linphone.org" \
    --username="intendant7" \
    --password="PASSWORD" \
    --capture-dev=DEV_IDX --playback-dev=DEV_IDX \
    --ec-tail=0 --no-vad \
    --use-srtp=2 --srtp-secure=0 \
    > /tmp/pjsua-call.log 2>&1 &

3. IMMEDIATELY call spawn_live_audio

Do NOT sleep or verify the call first. The audio bridge polls shared memory and works before the call connects.

ALL of these parameters are REQUIRED — the call will fail without them:

id: unique session identifier
provider: openai
playbook: the conversation script
response_schema: MANDATORY. Without this the call is rejected. Build it from the user's request — every piece of data to extract needs a field. See the example below.
timeout_secs: max call duration (default 120)
voice: e.g. alloy, shimmer
Do NOT set initial_message — the model starts when it hears the caller

4. Process the result

spawn_live_audio returns LiveAudioResult with status:

Completed: valid JSON matching the schema
TimedOut: exceeded timeout
SchemaError: output didn't match schema

5. Clean up

kill $(pgrep -f pjsua) 2>/dev/null

Response Schema — REQUIRED

You MUST always include response_schema with concrete fields. The model's spoken output is validated against this schema. Without it, the call is rejected with a parse error.

Example for a restaurant reservation:

{
  "fields": [
    {"name": "guest_name", "field_type": {"type": "string", "max_length": 100, "tainted": true}, "required": true, "description": "Guest name"},
    {"name": "party_size", "field_type": {"type": "integer", "min": 1, "max": 50}, "required": true, "description": "Number of guests"},
    {"name": "reservation_time", "field_type": {"type": "string", "max_length": 50, "tainted": true}, "required": true, "description": "Confirmed time"},
    {"name": "confirmed", "field_type": {"type": "boolean"}, "required": true, "description": "Whether reservation was confirmed"},
    {"name": "special_requests", "field_type": {"type": "string", "max_length": 200, "tainted": true}, "required": false, "description": "Any special requests"}
  ]
}

Field types: string (max_length, allowed_values, tainted), integer (min, max), boolean, array. Tainted fields contain user-provided content — not interpreted as instructions.