kuavi-deep-analyze - SKILL.md Agent Skill

name: kuavi-deep-analyze description: Deep multi-pass video analysis with parallel sharding and zoom agent: video-analyst context: fork argument-hint: disable-model-invocation: true

Perform comprehensive, multi-pass video analysis with parallel temporal sharding and progressive zoom.

Parse $ARGUMENTS for: video path and question/task.
Index: If the video is not yet indexed, call kuavi_index_video.
Orient: Call kuavi_get_scene_list to understand the video structure.
Parallel Shard Analysis: Call kuavi_analyze_shards with the question to get per-shard answers from parallel LLM workers. Use this to identify which temporal regions are most relevant.
Pass 1 — Visual Search: Use kuavi_search_video across multiple fields (summary, action, visual, temporal) to find relevant segments. Cross-reference with shard analysis results.
Pass 2 — Transcript Search: Use kuavi_search_transcript and kuavi_get_transcript to find spoken evidence. Note any discrepancies with visual findings.
Pass 3 — Cross-Reference: For key findings, use the 3-pass zoom protocol:
- Level 1 zoom to locate
- Level 2 zoom to read
- Level 3 zoom to confirm specific values
Pixel Analysis: When counting, measuring, or comparing, use pixel tools (crop_frame, diff_frames, threshold_frame, frame_info) for deterministic evidence.
Code Reasoning: Use kuavi_eval for complex computations, iterating over frames, or chaining multiple operations.
Budget Check: Monitor kuavi_get_session_stats — synthesize after 15 tool calls or 120 seconds.
Synthesize: Combine all evidence into a comprehensive, timestamped answer.

The first argument is the video path. Remaining text is the question.

Example: /kuavi-deep-analyze /path/to/video.mp4 What are the key events and when do they occur?