kuavi-pixel-analysis

star 3

Compositional pixel analysis patterns using kuavi_eval for counting, motion detection, change tracking, and visual comparison. Use when questions require precise measurements, object counting, or frame-by-frame comparison.

apicurius By apicurius schedule Updated 2/22/2026

name: kuavi-pixel-analysis description: Compositional pixel analysis patterns using kuavi_eval for counting, motion detection, change tracking, and visual comparison. Use when questions require precise measurements, object counting, or frame-by-frame comparison.

Pixel Analysis with kuavi_eval

Use kuavi_eval(code) to compose pixel tools programmatically when discrete MCP tool calls are insufficient. The persistent Python namespace has np, cv2, and all kuavi tools pre-loaded.

When to Use This

  • Counting objects in a frame (people, items, markers)
  • Detecting motion between frames (what changed?)
  • Measuring visual properties (brightness trends, color distribution)
  • Comparing regions across time (same area, different moments)
  • Reading values from charts, tables, or graphs
  • Iterating over multiple frames with the same analysis
  • Motion-focused search: Use field="temporal" with kuavi_search_video to find motion/dynamics-heavy segments before pixel analysis

Pattern 1: Object Counting

# Extract frames, threshold, count contours
kuavi_eval("""
frames = extract_frames(10.0, 15.0, fps=2.0, max_frames=5)
counts = []
for f in frames:
    mask = threshold_frame(f, value=120)
    counts.append(mask['contour_count'])
print(f"Object counts per frame: {counts}")
print(f"Average: {sum(counts)/len(counts):.1f}, Max: {max(counts)}")
""")

Pattern 2: Motion Detection Between Frames

# Compare consecutive frames to find when motion occurs
kuavi_eval("""
frames = extract_frames(20.0, 30.0, fps=1.0, max_frames=10)
for i in range(len(frames)-1):
    diff = diff_frames(frames[i], frames[i+1])
    print(f"Frame {i}→{i+1}: changed={diff['changed_pct']:.1f}%, mean_diff={diff['mean_diff']:.1f}")
""")

Pattern 3: Region-of-Interest Tracking

# Crop the same region across frames and compare
kuavi_eval("""
frames = extract_frames(5.0, 25.0, fps=0.5, max_frames=10)
for i, f in enumerate(frames):
    roi = crop_frame(f, x1_pct=0.6, y1_pct=0.1, x2_pct=0.95, y2_pct=0.3)
    info = frame_info(roi)
    print(f"Frame {i}: brightness={info['brightness']['mean']:.0f}, size={info['width']}x{info['height']}")
""")

Pattern 4: Brightness/Color Trend Analysis

# Track brightness over time to detect scene changes
kuavi_eval("""
frames = extract_frames(0.0, 60.0, fps=0.5, max_frames=30)
brightness = []
for f in frames:
    info = frame_info(f)
    brightness.append(info['brightness']['mean'])
print(f"Brightness over time: {[f'{b:.0f}' for b in brightness]}")
# Find sudden changes (potential scene boundaries)
for i in range(1, len(brightness)):
    delta = abs(brightness[i] - brightness[i-1])
    if delta > 30:
        print(f"  Sharp change at frame {i}: delta={delta:.0f}")
""")

Pattern 5: Composite/Blend for Background Extraction

# Blend multiple frames to extract static background
kuavi_eval("""
frames = extract_frames(10.0, 20.0, fps=1.0, max_frames=10)
composite = blend_frames(frames)
# Now diff a single frame against the composite to isolate foreground
single = frames[5]
fg = diff_frames(composite, single)
print(f"Foreground coverage: {fg['changed_pct']:.1f}%")
""")

Pattern 6: Multimodal LLM Analysis of Frames

# Ask an LLM to describe specific content in a single frame
kuavi_eval("""
frames = extract_frames(30.0, 40.0, fps=2.0, max_frames=6)
# Single frame — llm_query_with_frames(prompt, frame_or_frames)
result = llm_query_with_frames("What text or numbers are visible?", frames[0])
print(result)
""")

# Batch: ask the LLM about every frame in parallel
kuavi_eval("""
prompts = ["What text or numbers are visible in this frame? Be precise."] * len(frames)
# llm_query_with_frames_batched(prompt_texts, frames_list)
descriptions = llm_query_with_frames_batched(prompts, frames)
for i, desc in enumerate(descriptions):
    print(f"Frame {i}: {desc}")
""")

# You can also pass multiple frames to a single query for comparison:
kuavi_eval("""
result = llm_query_with_frames("Compare these two frames — what changed?", [frames[0], frames[-1]])
print(result)
""")

Pattern 7: Iterative Search + Extract Pipeline

# Programmatically search, then extract and analyze hits
kuavi_eval("""
hits = search_video("scoreboard results table", field="visual", top_k=5)
for h in hits:
    frames = extract_frames(h['start_time'], h['end_time'], fps=4.0, width=1280, height=960, max_frames=3)
    for f in frames:
        # Crop the likely scoreboard region (right side of frame)
        roi = crop_frame(f, x1_pct=0.5, y1_pct=0.0, x2_pct=1.0, y2_pct=0.5)
        info = frame_info(roi)
        print(f"  [{h['start_time']:.1f}s] ROI brightness={info['brightness']['mean']:.0f}")
""")

Key Reminders

  • Variables persist across kuavi_eval calls — set values in one call, use them in the next.
  • Use SHOW_VARS() to inspect what's in the namespace.
  • llm_query(prompt) for single text-only LLM calls; llm_query_batched(prompts) for parallel text-only.
  • llm_query_with_frames(prompt_text, frames) for multimodal (text + images); llm_query_with_frames_batched(texts, frames_list) for parallel multimodal.
  • All pixel tools return dicts with metadata — check the keys before accessing.
  • Keep each eval block focused on one task for clarity.
Install via CLI
npx skills add https://github.com/apicurius/VideoRLM --skill kuavi-pixel-analysis
Repository Details
star Stars 3
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator