name: "Build voice and multimodal agents with Pipecat"
slug: "build-voice-and-multimodal-agents-with-pipecat"
description: "Use Pipecat to define realtime voice and multimodal agent pipelines with transports, model providers, tools, and turn-taking tests."
github_stars: 12703
verification: "security_reviewed"
source: "https://github.com/pipecat-ai/pipecat"
author: "Pipecat AI"
publisher_type: "open_source_project"
category: "Media & Transcription"
framework: "Custom Agents"
tool_ecosystem:
  github_repo: "pipecat-ai/pipecat"
  github_stars: 12703
Build voice and multimodal agents with Pipecat
Use Pipecat to define realtime voice and multimodal agent pipelines with transports, model providers, tools, and turn-taking tests.
Prerequisites
Pipecat, audio or video transport, model provider credentials
Installation
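Follow the upstream setup path, which uses uv:
- Install uv
- Create a new project: uv init my-pipecat-app
- Add the Pipecat package: uv add pipecat-ai
- Add optional extras as needed: uv add "pipecat-ai[option,...]" (replace option,... with the extras your project requires)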
Requirements and caveats from upstream:
- Pipecat is an open-source Python framework for building real-time voice and multimodal conversational agents. Build a single voice agent or a full multi-agent system where specialists hand off, fan out in parallel...
- Minimum Python Version: 3.11
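The minimum-version requirement above can be checked before installing anything. The following is a minimal sketch using only the standard library; the function names python_is_supported and require_supported_python are illustrative and are not part of Pipecat.

```python
import sys

# Minimum version taken from the upstream requirement listed above.
MIN_PYTHON = (3, 11)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) pair meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def require_supported_python():
    """Raise RuntimeError if the running interpreter is too old.

    Hypothetical helper for illustration; call it at the top of your
    own entry point if you want an early, readable failure.
    """
    if not python_is_supported():
        found = ".".join(str(part) for part in sys.version_info[:3])
        raise RuntimeError(f"Python 3.11 or newer is required, found {found}")
```

Comparing only the major and minor components keeps the check independent of patch releases.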
Getting-started notes:
- To dive right in, run pipecat init quickstart or follow the upstream quickstart guide.
- Multi-agent systems: specialists that hand off, fan out in parallel, or run as sidecars over a shared bus.
- Copy the example environment file, then fill in your model provider credentials: cp env.example .env
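After copying env.example to .env, it can help to confirm that the credentials you need are actually filled in before starting an agent. The sketch below is a small standard-library check. It assumes a simple KEY=value file format. The key names in the usage comment are hypothetical placeholders, not names taken from the upstream env.example.

```python
from pathlib import Path


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional matching quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text, required):
    """Return the required keys that are absent or empty, in order."""
    values = parse_env(text)
    return [key for key in required if not values.get(key)]


def check_env_file(path, required):
    """Read an env file from disk and report missing keys."""
    return missing_keys(Path(path).read_text(encoding="utf-8"), required)


# Usage (key names are hypothetical; substitute the ones your providers need):
# print(check_env_file(".env", ["EXAMPLE_LLM_API_KEY", "EXAMPLE_TTS_API_KEY"]))
```

This does not replace a full dotenv parser. It ignores multi-line values and export prefixes, which is usually acceptable for a quick pre-flight check.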
Extracted from upstream docs: https://raw.githubusercontent.com/pipecat-ai/pipecat/HEAD/README.md