name: open-montage description: > Agentic video production system. Describe a video idea in natural language; the agent orchestrates research, scripting, asset generation, editing, and rendering across 11 pipelines and 49 tools. Supports zero-key mode (Piper TTS + Pexels stock + Remotion + FFmpeg) and premium APIs (ElevenLabs, Runway, Kling, Veo 3, Suno). Use when the user says "make a video", "create an explainer", "produce a trailer", "video production", "animate", or "podcast to video". From calesthio/OpenMontage. version: 1.0.0 author: calesthio (OpenMontage) tags: - video - production - animation - tts - avatar - remotion - explainer - cinematic env_vars: - FAL_KEY - ELEVENLABS_API_KEY - OPENAI_API_KEY - SUNO_API_KEY - HEYGEN_API_KEY - RUNWAY_API_KEY - PEXELS_API_KEY - PIXABAY_API_KEY
Open Montage — Agentic Video Production
Describe your video idea. The agent orchestrates the entire production pipeline: research, scripting, asset generation, editing, and final render.
When to Use This Skill
- "Make a 60-second explainer about quantum computing"
- "Create a cinematic trailer for our product launch"
- "Produce an animated video in Ghibli style"
- "Turn this podcast episode into video highlights"
- "Create an avatar spokesperson video"
- "Make a screen demo of our app"
- Any video production from natural language description
When Not to Use
- For simple image generation — use
art(Nano Banana 2) orcomfyui - For video transcoding/editing only — use
ffmpeg-processing - For academic diagrams — use
paperbanana - For 3D modelling/rendering — use
blender - For 3D Gaussian Splatting — use
lichtfeld-studio
Architecture
User: "Make a 45-second explainer about black holes"
│
▼
┌─────────────────────────────────┐
│ Open Montage Dispatcher │
│ (this skill — reads manifests) │
└──────────────┬──────────────────┘
│ Selects pipeline
▼
┌─────────────────────────────────┐
│ Pipeline Manifest (YAML) │
│ e.g. animated-explainer.yaml │
│ Defines: stages, tools, order │
└──────────────┬──────────────────┘
│ Agent executes stages
▼
┌─────────────────────────────────┐
│ 49 Python Tools │ ← Cloned on first use
│ video/ audio/ graphics/ │ to ~/.open-montage/
│ enhancement/ analysis/ avatar/ │
└──────────────┬──────────────────┘
│ Renders
▼
┌─────────────────────────────────┐
│ Remotion Composer (React) │
│ + FFmpeg post-production │
└─────────────────────────────────┘
11 Production Pipelines
| Pipeline | Description | Key Tools |
|---|---|---|
animated-explainer |
Research-driven explainer with narration | Perplexity → Script → FLUX/Veo → ElevenLabs → Remotion |
animation |
Kinetic typography and motion graphics | Remotion → FFmpeg |
avatar-spokesperson |
Talking head presentations | HeyGen/SadTalker → TTS → Composite |
cinematic |
Trailers and mood-driven edits | Runway/Kling → Music (Suno) → Color grade |
clip-factory |
Batch short-form extraction from long video | WhisperX → Scene detect → Auto-trim |
hybrid |
Source footage + AI support visuals | Input video → AI B-roll → Stitch |
localization-dub |
Multilingual distribution | WhisperX → Translate → TTS (target lang) → Lip sync |
podcast-repurpose |
Highlight extraction from podcasts | WhisperX → Highlight detect → Visual overlay |
screen-demo |
Software walkthroughs | Screen capture → Annotation → TTS narration |
talking-head |
Speaker-focused footage editing | Face enhance → Background → Audio cleanup |
On First Use — Clone Full Repository
The pipeline defs are baked into this skill, but the 49 Python tools and 767 skill files live in the full repository. On first invocation:
# Clone or update OpenMontage
if [ ! -d ~/.open-montage ]; then
git clone --depth 1 https://github.com/calesthio/OpenMontage.git ~/.open-montage
cd ~/.open-montage && pip install -r requirements.txt --break-system-packages
cd remotion-composer && npm install
else
cd ~/.open-montage && git pull --rebase
fi
After cloning, tools are available at ~/.open-montage/tools/ and skills at
~/.open-montage/skills/.
Zero-Key Mode (No API Keys Required)
These capabilities work without any API keys:
- Piper TTS: Offline text-to-speech narration
- Pexels/Pixabay: Free stock footage and images
- Remotion: React-based video composition and animation
- FFmpeg: Post-production, trimming, mixing, encoding
- ManimCE: Mathematical animations
- Auto-captioning: WhisperX transcription → subtitle burn-in
# Run zero-key demo
cd ~/.open-montage && make demo
Premium API Capabilities
| Provider | Capability | Key |
|---|---|---|
| FAL.ai | FLUX images, Kling video, Veo 3 video, MiniMax, Recraft | FAL_KEY |
| ElevenLabs | Premium TTS (29 languages), Music, SFX | ELEVENLABS_API_KEY |
| OpenAI | DALL-E 3, OpenAI TTS | OPENAI_API_KEY |
| Suno | AI music generation (full songs) | SUNO_API_KEY |
| HeyGen | Avatar generation | HEYGEN_API_KEY |
| Runway | Gen-4 video generation | RUNWAY_API_KEY |
| Pexels | Higher-quality stock footage | PEXELS_API_KEY |
Execution Flow
- Route: Select pipeline from the table above based on user's request
- Research: Use
perplexity-researchskill or OpenMontage's built-in web research - Script: Agent writes script following pipeline's skill guide
- Assets: Generate images/video/audio using pipeline's tool chain
- Review: Agent self-reviews by extracting frames and transcribing
- Compose: Remotion renders or FFmpeg stitches final video
- Output: MP4 with subtitles, mixed audio, platform-optimised resolution
Checkpoint Protocol
Each stage saves state as JSON. If interrupted, resume from last checkpoint:
# State files at ~/.open-montage/checkpoints/
ls ~/.open-montage/checkpoints/
Cost Governance
Built-in cost estimation runs before each API call. Configure spend caps in
~/.open-montage/config.yaml:
cost_governance:
max_per_action: 2.00 # USD per API call
max_per_pipeline: 25.00 # USD per full production
require_approval_above: 5.00
Integration with Existing Skills
| Phase | Our Skill | How It Integrates |
|---|---|---|
| Research | perplexity-research |
Substitute for OpenMontage's built-in web search |
| Image gen | comfyui |
Alternative for local Stable Diffusion when available |
| Image gen | art |
Nano Banana 2 for stylised thumbnails and editorial art |
| Post-production | ffmpeg-processing |
Direct overlap — can use either |
| Narration review | notebooklm |
Generate audio overview of script for review |
| Script quality | codex-companion |
Cross-model review of script via GPT-5.4 |
Troubleshooting
Remotion build fails:
cd ~/.open-montage/remotion-composer && npm install && npx remotion --help
Piper TTS not found:
pip install piper-tts --break-system-packages
GPU tools (local video gen):
cd ~/.open-montage && make install-gpu
Attribution
OpenMontage by calesthio. MIT License. 11 pipelines, 49 tools, 400+ agent skills.