name: video-transcripts
description: Generate structured video transcripts from local files or video URLs using Gemini Files API. Use when a GitHub or Linear tracker item, comment, or attachment includes a screen recording, .mov, .mp4, or tracker-hosted video and you need a block instead of hand-written notes.
disable-model-invocation: true
Video Transcripts
Quick Start
Run the helper once per relevant video:
bash .agents/skills/video-transcripts/scripts/generate_video_transcript.sh \
"https://uploads.linear.app/.../screen-recording.mov" \
--title "PDF preview hyperlinks trigger leave-page modal"
Or for a GitHub attachment:
bash .agents/skills/video-transcripts/scripts/generate_video_transcript.sh \
"https://github.com/user-attachments/assets/..." \
--title "Slash menu loses selection after confirm"
Or for a local file:
bash .agents/skills/video-transcripts/scripts/generate_video_transcript.sh \
"/absolute/path/to/video.mov" \
--title "Preview hyperlink exits workflow"
For auth-gated Linear uploads, the helper automatically retries with cookies from the local Linear desktop app on macOS. For private GitHub asset URLs, it retries with GITHUB_TOKEN, GH_TOKEN, or gh auth token when available.
Use This When
- A GitHub or Linear issue, PR, or comment includes a screen recording.
- An attachment URL points to
uploads.linear.app. - An attachment URL points to a GitHub attachment or private GitHub asset host.
- You need timeline-style transcript lines, not a vague summary.
- You want the result in this exact XML shape:
<video-transcripts>
<video-transcript title="...">
[00:00] (...)
</video-transcript>
</video-transcripts>
Workflow
- Run the helper once for each relevant video.
- Give each run a short, bug-focused
--title. - For tracked work, the canonical shared cache should live in the tracker next to the evidence it describes.
- If the video is in the issue or PR body, use a top-level tracker comment.
- If the video is in a Linear comment, post the transcript cache as a reply to that specific comment.
- If the video is in a GitHub issue or PR comment, use one dedicated top-level cache comment for that source comment's video set. GitHub has no replies, so keep cache comments separated by source container instead of merging unrelated comment videos together.
- Cache comment or reply body should start with a transcript source link, then the timestamp lines:
[[Transcript](link to video or comment if not available)]
[00:00] (...)
- When caching helper XML, strip the
<video-transcripts>and<video-transcript ...>wrapper lines and paste only timestamp lines after the source link. - If there are multiple videos in the same source container, repeat the
[[Transcript](...)]source link plus timestamp lines for each video. - Do not hand-write or paraphrase video behavior when the helper can run. Use the actual transcript output.
- Link
[[Transcript](...)]to the video URL when available. If the video URL is not available or is unstable, link to the source comment that contains the video. - For signed tracker-hosted URLs like
uploads.linear.app, strip the query string so a new signature does not invalidate an otherwise valid cache entry. - Do not add decorative metadata like
titleto cached transcript entries unless a later workflow truly needs it. - Before re-transcribing for tracked work, match cache entries by source container first:
- one cache comment for issue or PR body videos
- one Linear reply per comment containing video(s)
- one dedicated GitHub cache comment per issue or PR comment containing video(s)
- If the matching cache already covers the current normalized video keys for that source container, reuse it. Only transcribe missing or new video evidence.
- Do not use
docs/for raw tracker transcript cache. That is durable repo knowledge space, not raw issue evidence. - If you invoke this through
codex exec, prefer-o <file>so the final XML is captured without CLI progress chatter.
Output Contract
- Return XML, not Markdown.
- Use one
[MM:SS] (...)line per observed action or system response. - Quote visible UI text when legible.
- Describe only visible actions, screen changes, and audible speech if present.
- Do not invent hidden state, motives, or implementation details.
- Prefer concise, high-signal lines over per-keystroke sludge.
Model Strategy
The helper defaults to gemini-3.1-flash-lite-preview for cost efficiency.
If that output is malformed, too thin, or obviously noisy, it retries with gemini-3-flash-preview.
For Gemini 3 models, the helper forces minimal thinking so output budget goes to the transcript instead of hidden reasoning.
Override with:
VIDEO_TRANSCRIPTS_MODEL=gemini-3-flash-preview bash .agents/skills/video-transcripts/scripts/generate_video_transcript.sh ...
Or:
bash .agents/skills/video-transcripts/scripts/generate_video_transcript.sh ... \
--model gemini-2.5-flash
Notes
- The helper accepts a local file path or remote URL.
- For
uploads.linear.appURLs, it first triesLINEAR_COOKIE_HEADER, thenLINEAR_COOKIES_DB, then falls back to the local Linear desktop cookie store at~/Library/Application Support/Linear/Cookies. - For GitHub asset URLs, it first tries
GITHUB_TOKEN, thenGH_TOKEN, thengh auth token, then falls back to an unauthenticated download. - It looks for
GEMINI_API_KEY, thenGOOGLE_API_KEY. - If neither is set, it tries
~/.bash_profilebefore failing. - Use
--debug-dir <dir>when you want request and response artifacts saved. - Quality gate rejects obvious low-signal transcripts such as repeated partial typing runs and falls back automatically.