ffmpeg-mixing

Stars: 322

Mix, trim, and concatenate video clips with ffmpeg without audio/video desync. Use when stitching generated clips into an original video, inserting scenes at timestamps, or any ffmpeg filter_complex work involving trim/concat with a continuous audio track.

By vargHQ. Updated 6/8/2026

name: ffmpeg-mixing
description: Mix, trim, and concatenate video clips with ffmpeg without audio/video desync. Use when stitching generated clips into an original video, inserting scenes at timestamps, or any ffmpeg filter_complex work involving trim/concat with a continuous audio track.
license: MIT
metadata:
  author: vargHQ
  version: "1.0.0"
compatibility: Requires ffmpeg installed locally.
allowed-tools: Bash(ffmpeg:) Bash(ffprobe:) Read

ffmpeg video mixing

Lessons for mixing video clips with ffmpeg while keeping audio and video in sync.

problem: audio/video desync when mixing clips

what went wrong

using -ss X -t Y to pre-trim the input, then applying trim filters with timestamps relative to the pre-trimmed start, caused timing drift:

# BAD: relative timestamps after pre-trim
ffmpeg -ss 64 -t 36 -i original.mp4 ...
  -filter_complex "
    [0:v]split=5[orig1][orig2]...;
    [orig1]trim=0:4,setpts=PTS-STARTPTS[o1];
    [orig2]trim=4:11,setpts=PTS-STARTPTS[o2];
    ..."

this produced the wrong duration (38s instead of 36s) and desynced the audio.
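a duration mismatch like this can be caught before listening for desync. the sketch below is an addition, not part of the original skill: probe_duration and duration_matches are hypothetical helper names, and the 0.1 s default tolerance is an assumption. it reads the container duration with ffprobe and compares it to the expected length.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original skill) for catching duration drift.

# Print the container duration of a media file in seconds.
probe_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Succeed if ACTUAL is within TOLERANCE seconds of EXPECTED.
# Usage: duration_matches EXPECTED ACTUAL [TOLERANCE]
# TOLERANCE defaults to 0.1 s, which is an assumed value.
duration_matches() {
  awk -v e="$1" -v a="$2" -v t="${3:-0.1}" \
    'BEGIN { d = a - e; if (d < 0) d = -d; exit !(d <= t) }'
}

# Example, assuming output.mp4 exists and should be 36 s long:
#   duration_matches 36 "$(probe_duration output.mp4)" || echo "duration drift" >&2
```

with the numbers above, duration_matches 36 38 fails, which flags the 2s drift.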

solution: use absolute timestamps from full input

trim directly from the full original using absolute timestamps:

# GOOD: absolute timestamps from full file
ffmpeg -i original.mp4 -i scene1.mp4 -i scene2.mp4 ...
  -filter_complex "
    [0:v]trim=64:68,setpts=PTS-STARTPTS[o1];
    [1:v]scale=1280:720,trim=4:6,setpts=PTS-STARTPTS[s1];
    [0:v]trim=70:75,setpts=PTS-STARTPTS[o2];
    ...
    [o1][s1][o2]...concat=n=N:v=1:a=0[outv];
    [0:a]atrim=64:100,asetpts=PTS-STARTPTS[outa]
  "
  -map "[outv]" -map "[outa]"
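if the cut points were first worked out relative to a pre-trimmed window (such as the one starting at 64s in the BAD command), they can be converted to absolute timestamps by adding the window start. a minimal sketch, assuming a hypothetical helper named abs_trim that is not part of the original skill:

```shell
#!/bin/sh
# abs_trim BASE REL_START REL_END
# Hypothetical helper: prints a trim expression with absolute timestamps for a
# range given relative to a window that starts at BASE seconds.
abs_trim() {
  awk -v b="$1" -v s="$2" -v e="$3" \
    'BEGIN { printf "trim=%.10g:%.10g\n", b + s, b + e }'
}

# Relative 0:4 in a window starting at 64 s is absolute 64:68.
abs_trim 64 0 4
# Relative 6:11 in the same window is absolute 70:75.
abs_trim 64 6 11
```

these two calls reproduce the trim ranges used for [o1] and [o2] in the GOOD command.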

key points

  1. absolute timestamps: trim from the full input file, not from an input pre-trimmed with -ss/-t
  2. separate audio handling: use atrim on the audio stream independently of the video segments
  3. setpts reset: always use setpts=PTS-STARTPTS after trim (asetpts=PTS-STARTPTS after atrim) to reset timestamps
  4. scale before trim: when mixing different resolutions, scale first then trim, so every concat input has the same resolution
  5. video duration = audio duration: ensure the total length of the video segments matches the length of the audio segment
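the key points above can be folded into small helpers that emit one filter chain per segment, so the trim, scale, and setpts order stays consistent. this is a sketch under assumptions: video_segment and audio_segment are hypothetical names, not part of the original skill, and they only print filter strings without running ffmpeg.

```shell
#!/bin/sh
# video_segment INPUT START END LABEL [SIZE]
# Hypothetical helper: prints one video chain using absolute timestamps.
# When SIZE (e.g. 1280:720) is given, scale is placed before trim.
video_segment() {
  if [ -n "${5:-}" ]; then
    printf '[%s:v]scale=%s,trim=%s:%s,setpts=PTS-STARTPTS[%s]\n' \
      "$1" "$5" "$2" "$3" "$4"
  else
    printf '[%s:v]trim=%s:%s,setpts=PTS-STARTPTS[%s]\n' "$1" "$2" "$3" "$4"
  fi
}

# audio_segment INPUT START END LABEL
# Hypothetical helper: prints the independent audio chain.
audio_segment() {
  printf '[%s:a]atrim=%s:%s,asetpts=PTS-STARTPTS[%s]\n' "$1" "$2" "$3" "$4"
}

video_segment 0 64 68 o1
video_segment 1 4 6 s1 1280:720
audio_segment 0 64 100 outa
```

the printed lines match the [o1], [s1], and [outa] chains in the GOOD command; they still need to be joined with ";" and combined with a concat filter to form a full -filter_complex value.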

example: inserting clips into original

to insert generated clips at specific timestamps while keeping continuous audio:

ffmpeg -y \
  -i original.mp4 \
  -i generated-scene.mp4 \
  -filter_complex "
    [0:v]trim=START1:END1,setpts=PTS-STARTPTS[o1];
    [1:v]scale=1280:720,trim=0:DURATION,setpts=PTS-STARTPTS[s1];
    [0:v]trim=START2:END2,setpts=PTS-STARTPTS[o2];
    [o1][s1][o2]concat=n=3:v=1:a=0[outv];
    [0:a]atrim=AUDIO_START:AUDIO_END,asetpts=PTS-STARTPTS[outa]
  " \
  -map "[outv]" -map "[outa]" \
  -c:v libx264 -preset fast -crf 18 \
  -c:a aac -b:a 192k \
  output.mp4

timestamps must add up: (END1-START1) + DURATION + (END2-START2) = AUDIO_END - AUDIO_START
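this sum can be checked before running ffmpeg. a minimal sketch, assuming a hypothetical helper named segments_match_audio (not part of the original skill) and an assumed tolerance of 0.001 s; the sample values are illustrative only:

```shell
#!/bin/sh
# segments_match_audio AUDIO_START AUDIO_END DURATION...
# Hypothetical helper: succeeds when the video segment durations sum to the
# audio span AUDIO_END - AUDIO_START, within an assumed 0.001 s tolerance.
segments_match_audio() {
  awk 'BEGIN {
    span = ARGV[2] - ARGV[1]
    for (i = 3; i < ARGC; i++) total += ARGV[i]
    diff = total - span
    if (diff < 0) diff = -diff
    exit !(diff < 0.001)
  }' "$@"
}

# Illustrative values: START1=64 END1=68, DURATION=2, START2=70 END2=75,
# with audio from 64 to 75. Segment durations are 4, 2 and 5 seconds.
if segments_match_audio 64 75 4 2 5; then
  echo "timestamps add up"
else
  echo "video and audio lengths differ" >&2
fi
```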

Install via CLI
npx skills add https://github.com/vargHQ/sdk --skill ffmpeg-mixing
Repository Details
Forks: 23
Branch: main
Path: SKILL.md