name: short-video-tool
description: Processes videos from YouTube/X/Twitter with ASR, translation, bilingual subtitle burning, and fast video summaries. Use when the user mentions video processing, subtitle generation, video translation, downloading from YouTube/X/Twitter, or quick summary-only video understanding.
Short Video Tool
Automated video processing: download → ASR → translation → bilingual subtitles.
What The Tool Supports
- Download from YouTube, TikTok, and X/Twitter
- Run ASR with mlx-whisper, faster-whisper, or whisper CLI
- Reuse chunk-level ASR cache for long videos
- Generate clips, subtitles, and integrated outputs
- Hard-burn or soft-embed bilingual subtitles
- Generate fast video summaries with --summary-only-fast
When Using This Skill
- Confirm whether the user wants:
  - full pipeline
  - summary only
  - burn-only
- Prefer the existing CLI in main.py over custom scripts
- Keep output paths explicit when the user cares about generated files
- Mention fast summary output under output/summary/ when discussing --summary-only-fast
- For Chinese audio, pass --language zh
- When the user asks for fast summary with cleanup, use --summary-only-fast --enCleanUp 1
- Do not replace --summary-only-fast with --summary-only just because --enCleanUp is present
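The mode-selection rules above can be sketched as a small helper that assembles the argument list for main.py. The function name build_args and its parameters are hypothetical and exist only for illustration; only the flags themselves come from this document.

```python
from typing import List, Optional


def build_args(mode: str, source: str, is_url: bool,
               language: Optional[str] = None,
               cleanup: bool = False) -> List[str]:
    """Assemble main.py arguments for one of the three modes (hypothetical helper)."""
    if mode == "burn-only":
        # Burn-only takes the video via --video, not --local-file.
        return ["--burn-only", "--video", source]
    if mode not in ("full", "summary-fast"):
        raise ValueError(f"unknown mode: {mode}")

    args: List[str] = []
    if mode == "summary-fast":
        args.append("--summary-only-fast")
    args += ["--url" if is_url else "--local-file", source]
    if language:
        args += ["--language", language]  # e.g. "zh" for Chinese audio
    if cleanup:
        if mode != "summary-fast":
            raise ValueError("--enCleanUp is documented only for --summary-only-fast")
        # Cleanup never downgrades the mode to --summary-only.
        args += ["--enCleanUp", "1"]
    return args


print(build_args("summary-fast", "video.mp4", is_url=False,
                 language="zh", cleanup=True))
```

The point of the sketch is that the cleanup flag is appended to the fast-summary argument list and never changes which summary flag is used.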
Quick Start
cd /Users/jackwl/Code/gitcode/short-video-tool
# Process YouTube video
./venv/bin/python main.py --url "https://youtube.com/watch?v=VIDEO_ID"
# Process X/Twitter video
./venv/bin/python main.py --url "https://x.com/i/status/STATUS_ID"
# Process local video with subtitle burning
./venv/bin/python main.py --local-file video.mp4 --burn-subtitles --summary
# Full video from URL (no clipping)
./venv/bin/python main.py --url "URL" --no-clip --burn-subtitles --summary
# Full video from local file (no clipping)
./venv/bin/python main.py --local-file video.mp4 --no-clip --burn-subtitles --summary
# Burn-only mode (skip ASR, use existing subtitles)
./venv/bin/python main.py --burn-only --video video.mp4
# Fast summary only
./venv/bin/python main.py --summary-only-fast --local-file video.mp4
./venv/bin/python main.py --summary-only-fast --url "https://x.com/i/status/STATUS_ID"
./venv/bin/python main.py --summary-only-fast --local-file video.mp4 --language zh --enCleanUp 1
# Summary after full pipeline
./venv/bin/python main.py --url "https://x.com/i/status/STATUS_ID" --summary
Key Parameters
Video source (required, choose one):
- --url <URL>: Video URL (YouTube / TikTok / X/Twitter)
- --local-file <path>: Local video file path (skips download)
Output control:
- --output <dir>: Output directory (default: output/)
Clipping control:
- --no-clip: Skip clipping, treat the whole video as a single segment
- --min-duration <sec>: Minimum clip duration in seconds (default: 15)
- --max-duration <sec>: Maximum clip duration in seconds (default: 60)
- --max-clips <n>: Maximum number of clips to extract (default: 5)
- --clip-strategy <strategy>: Clip selection strategy (default: opinion)
  - opinion: Opinion-driven, extracts segments with independent viewpoints
  - topic: Topic/chapter-driven, splits by subject structure
  - hybrid: Hybrid mode combining opinion and topic
Subtitle options:
- --embed-subtitles: Soft-embed subtitle track (toggle in player)
- --burn-subtitles: Hard-burn bilingual subtitles (EN top, ZH bottom)
- --subtitle-status <mode>: Subtitle strategy (default: auto)
  - auto: Auto-detect source language
  - en: Source is English, overlay Chinese translation only
  - zh: Source is Chinese, keep original subtitles
  - none: Burn bilingual subtitles (same as burn mode)
ASR & video quality:
- --language <code>: ASR speech recognition language (default: en; use zh for Chinese audio)
- --quality <res>: Download resolution, e.g. 1080p / 720p / best
Burn-only mode (skip ASR/analysis/translation, directly burn subtitles):
- --burn-only: Enable quick burn mode
- --video <path>: Video file path (required in burn-only mode)
- --en-subtitle <path>: English subtitle file (optional, auto-detects <name>_en.srt)
- --zh-subtitle <path>: Chinese subtitle file (optional, auto-detects <name>_zh.srt)
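The subtitle naming convention used by auto-detection can be illustrated with a short sketch. It assumes that <name> is the video's file stem and that the subtitle files sit next to the video; neither detail is confirmed here, and find_subtitles is a hypothetical helper rather than the tool's own code.

```python
from pathlib import Path
from typing import Dict, Optional


def find_subtitles(video: Path) -> Dict[str, Optional[Path]]:
    """Look for <stem>_en.srt and <stem>_zh.srt beside the video (assumed location)."""
    found: Dict[str, Optional[Path]] = {}
    for lang in ("en", "zh"):
        candidate = video.with_name(f"{video.stem}_{lang}.srt")
        # Record the path only if the file actually exists.
        found[lang] = candidate if candidate.is_file() else None
    return found


print(find_subtitles(Path("video.mp4")))
```

A check like this before calling --burn-only shows which subtitle files would have to be passed explicitly via --en-subtitle or --zh-subtitle.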
Fast summary:
- --summary-only-fast: run only audio extraction + external Cohere ASR runner + LLM summary
- --enCleanUp 0|1: optional cleanup pass for --summary-only-fast before summary (0 = skip, 1 = enable)
- --summary-only: different path; do not switch to it when the user explicitly asked for --summary-only-fast
Performance
- 53-min video: under 10m total (ASR 4m27s, translation 9m; the stage times sum to more than the total, so the stages presumably overlap)
- 57-sec video: 97s total (ASR 9s + translation 9s + burning 19s)
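As a worked example, the ASR timings above can be expressed as speed relative to real time. The helper below is illustrative arithmetic only and is not part of the tool.

```python
def realtime_factor(video_seconds: float, processing_seconds: float) -> float:
    """Seconds of video processed per second of wall-clock time."""
    return video_seconds / processing_seconds


# 53-min video, ASR in 4m27s
long_asr = realtime_factor(53 * 60, 4 * 60 + 27)
# 57-sec video, ASR in 9s
short_asr = realtime_factor(57, 9)

print(f"long video ASR:  {long_asr:.1f}x real time")   # 11.9x
print(f"short video ASR: {short_asr:.1f}x real time")  # 6.3x
```

The longer video comes out with the higher ratio, which would be consistent with fixed startup costs weighing more heavily on short inputs, though the two data points alone do not establish that.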
Output Structure
output/
├── analysis/
├── clips/
├── original/
├── subtitles/
├── summary/
│ └── <video_stem>_video_summary.md
├── clips_with_subtitles/ # optional
├── integration_metadata.json
└── summary.md
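To locate a fast summary programmatically, the path can be derived from the layout above. This sketch assumes that <video_stem> is the input file name without its extension and that --output only changes the base directory; summary_path is a hypothetical helper. For URL inputs the stem depends on the downloaded file name, which this sketch does not try to predict.

```python
from pathlib import Path


def summary_path(video: str, output_dir: str = "output") -> Path:
    """Expected location: <output_dir>/summary/<video_stem>_video_summary.md."""
    stem = Path(video).stem
    return Path(output_dir) / "summary" / f"{stem}_video_summary.md"


print(summary_path("talks/video.mp4", "result").as_posix())
# result/summary/video_video_summary.md
```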
Environment Setup
Required environment variable:
export SILICONFLOW_API_KEY="your_api_key"
Optional:
export MLX_WHISPER_LOCAL_MODEL_DIR="$HOME/models"
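A pre-flight check along the following lines can catch a missing key before a long run starts. It is a minimal sketch: missing_env and model_dir are hypothetical helpers, and whether the tool itself expands a leading ~ in the model directory is not stated here, so the sketch expands it explicitly (a shell leaves ~ literal inside double quotes).

```python
import os
from typing import List, Mapping, Optional

REQUIRED = ["SILICONFLOW_API_KEY"]


def missing_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]


def model_dir(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Optional local model directory, with a leading ~ expanded."""
    raw = env.get("MLX_WHISPER_LOCAL_MODEL_DIR")
    return os.path.expanduser(raw) if raw else None


if missing_env():
    print("Missing environment variables:", ", ".join(missing_env()))
```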
Common Scenarios
YouTube tutorial → bilingual:
cd /Users/jackwl/Code/gitcode/short-video-tool
./venv/bin/python main.py --url "YOUTUBE_URL" --no-clip --burn-subtitles
X/Twitter short video:
cd /Users/jackwl/Code/gitcode/short-video-tool
./venv/bin/python main.py --url "X_URL" --burn-subtitles
URL fast summary:
cd /Users/jackwl/Code/gitcode/short-video-tool
./venv/bin/python main.py --summary-only-fast --url "X_OR_YOUTUBE_URL" --output ./result
Local file fast summary:
cd /Users/jackwl/Code/gitcode/short-video-tool
./venv/bin/python main.py --summary-only-fast --local-file video.mp4 --output ./result
Chinese fast summary with cleanup:
cd /Users/jackwl/Code/gitcode/short-video-tool
./venv/bin/python main.py --summary-only-fast --local-file video.mp4 --output ./result --language zh --enCleanUp 1
Expected flag pairing:
./venv/bin/python main.py --summary-only-fast --local-file video.mp4 --language zh --enCleanUp 1 --output ./result
Do not rewrite that command to --summary-only.
Batch local videos:
cd /Users/jackwl/Code/gitcode/short-video-tool
for video in *.mp4; do
  ./venv/bin/python main.py --local-file "$video" --burn-subtitles
done
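The shell loop keeps going when one video fails and does not report which one. Where that matters, the same batch could be driven from Python and the failures collected. This is a sketch under the assumption that main.py signals failure with a non-zero exit code, which is not confirmed here; batch_commands and run_batch are hypothetical helpers.

```python
import subprocess
from pathlib import Path
from typing import List

PYTHON = "./venv/bin/python"


def batch_commands(folder: str) -> List[List[str]]:
    """One main.py invocation per .mp4 in folder, sorted for a stable order."""
    return [
        [PYTHON, "main.py", "--local-file", str(video), "--burn-subtitles"]
        for video in sorted(Path(folder).glob("*.mp4"))
    ]


def run_batch(folder: str) -> List[str]:
    """Run every command; return the videos whose run exited non-zero."""
    failed = []
    for cmd in batch_commands(folder):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            failed.append(cmd[3])  # index 3 holds the video path
    return failed
```

Calling run_batch(".") from the project directory mirrors the loop above and returns the list of videos to retry.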
Burn existing subtitles:
cd /Users/jackwl/Code/gitcode/short-video-tool
./venv/bin/python main.py --burn-only --video video.mp4
Troubleshooting
SSL error (X/Twitter):
cd /Users/jackwl/Code/gitcode/short-video-tool
./venv/bin/pip install --upgrade yt-dlp
GPU memory issue: Edit config.py, set whisper_model = "small"
Translation slow: Already optimized with 6 concurrent workers
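For context on what six concurrent workers means in practice, here is a generic sketch of fanning translation requests out over a thread pool. It is not the tool's actual implementation; translate_all and the stand-in translator are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def translate_all(lines: List[str], translate: Callable[[str], str],
                  workers: int = 6) -> List[str]:
    """Translate lines concurrently; map() returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(translate, lines))


# Stand-in translator for demonstration; a real one would call a translation API.
print(translate_all(["hello", "world"], str.upper))
# ['HELLO', 'WORLD']
```

With network-bound requests, raising the worker count further tends to run into API rate limits rather than yielding more speed, which is one plausible reason for a fixed cap.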
Additional Resources
- Roadmap: doc/roadmap.md
- Full docs: doc/README.md
- Main script: main.py