thumbnail

name: thumbnail description: Run when the user wants to create a high-intent, high-fidelity, viral thumbnail/cover image for a video — typically a 9:16 reel, optionally 16:9 long-form — or uses the /thumbnail command. Pairs with /virality: reads its peak hook timestamp to extract the most cortically activating frame as the hero, optionally adds Will's headshot as talent reference, generates a headline overlay, and renders the cover via Nano Banana 2. Output saved under the active brand folder.

Designs a high-fidelity, viral-feeling cover image for a video.

The hero frame is not picked at random — it's pulled from the peak hook timestamp that /virality (TRIBE v2) identifies as the moment of maximum predicted brain activation. That frame becomes the visual anchor; a bold headline carries the hook.

Output saved to ./brands/[brand-name]/thumbnails/[output-name]/.

Aspect ratio is parameterised — defaults to 9:16 for reels; 16:9 is supported for YouTube long-form.

Step 1 — Select brand

Scan ./brands/ for subfolders that contain brand-identity/visual-guidelines.md.

One brand found: use it automatically and confirm: "Using brand: [name]"
Multiple brands found: list them and ask which to use
None found: tell the user to run /brand first, or accept a --no-brand mode and save to ./thumbnails/[output-name]/ at project root

Step 2 — Locate the source video

Two modes — try Mode A first.

Mode A — Linked to a /virality run (preferred):

Scan brands/[brand-name]/virality/ for subfolders. If any exist:

"Which scored video do you want a thumbnail for? Pick a slug from below, or drop a video path to start fresh.

• gels-launch-hook-v1 72/100 peak 1.4s • reid-monday-reel 65/100 peak 0.9s"

Read virality-spec.json and score.json from the chosen folder. Pull:

video_path → source video
peak_hook_seconds (from score.json) → frame extraction timestamp
engagement_score and hold_rate for the user-facing summary at the end

Mode B — Standalone (no virality run yet):

Ask:

"Drop the path to the video. MP4 / MOV / WebM. If you've got a hook timestamp in mind drop that too — otherwise I'll pick the most magnetic-looking second."

Validate file exists and extension is .mp4, .mov, .webm, .m4v. If no timestamp given, use 1.0s as the default extraction point — it's the typical hook moment for short-form. Strongly suggest running /virality first if the video is destined for publication.

Step 3 — Watch the video

This is non-negotiable per the skill's contract: Claude must view the video content before designing the cover, not just trust a single extracted frame.

Run from project root:

mkdir -p brands/[brand-name]/thumbnails/[output-name]/_frames
ffmpeg -y -i [video-path] -vf "fps=1" -frames:v 1 \
  -ss [peak-hook-seconds] \
  brands/[brand-name]/thumbnails/[output-name]/_frames/peak.png

# Plus context frames at start, mid, end, and ±0.4s around peak
ffmpeg -y -i [video-path] -ss 0.5  -frames:v 1 _frames/start.png
ffmpeg -y -i [video-path] -ss [peak-0.4]  -frames:v 1 _frames/peak-pre.png
ffmpeg -y -i [video-path] -ss [peak+0.4]  -frames:v 1 _frames/peak-post.png
ffmpeg -y -i [video-path] -ss [duration/2] -frames:v 1 _frames/mid.png
ffmpeg -y -i [video-path] -sseof -0.5      -frames:v 1 _frames/end.png

Read each extracted frame with the Read tool. Form a brief internal scene reading covering:

Subject — who/what is on screen at the peak (talking head? action? product? text-on-screen?)
Expression / pose / energy — at the peak frame specifically: is the subject's face open and visible, eyes engaged? Is there a gesture mid-motion?
Environment — interior/exterior, dominant colour palette, lighting quality
Composition — where is the focal point in the 9:16 frame? Centre, upper third, off-axis?
Hook reading — based on the cluster of frames + the peak position, what is the video's promise? (Curiosity, transformation, contrarian claim, demonstration, reveal?)

Do not show this analysis to the user. It feeds the prompt construction in Step 7.

If the peak frame is unusable (subject blurred, eyes closed, mouth mid-sentence in an awkward shape, motion-smeared), fall back to whichever of peak-pre or peak-post is cleanest. Note the substitution internally.

Step 4 — Headline / hook text

Ask:

"What's the cover headline? Drop the line you want plastered on the thumbnail. Or hit enter and I'll suggest three based on the video."

If user provides text: use it verbatim. Do not rewrite.

If user wants suggestions: generate 3 candidates grounded in the video reading from Step 3 + the brand's visual-guidelines.md voice. Each candidate:

3–7 words
Front-loads the hook word
Reads as a curiosity gap, contrarian claim, or specific number — never generic ("This changed everything")
All-caps friendly (will likely render in heavy uppercase)

Present numbered. Ask which to use, or accept an edit.

Step 5 — Optional talent reference

Ask:

"Include Will's headshot as a talent reference? (y/N) Use this when the peak frame doesn't feature you cleanly but the cover should — e.g. a B-roll heavy reel where you still want your face on the cover."

Default: no.

If yes, the skill uses ~/Desktop/STUDIO/wills-headshot.png as a face/identity reference. If a brand-specific talent file exists at brands/[brand-name]/talent/headshot.png, prefer that.

Step 6 — Aspect ratio + treatment

Ask in one message:

"Aspect ratio? Default 9:16 (reels). Other options: 16:9 (YouTube long-form), 1:1.

Treatment? Default bold-text (oversized headline, expression-led — best for scroll feeds). Other options: cinematic (photo dominates, minimal text — premium feel), split (subject one side, text-block opposite)."

Defaults: 9:16 + bold-text.

Step 7 — Name, spec, prompt

Derive a slug from the source video name + treatment, e.g. gels-launch-bold, reid-monday-cinematic. Confirm or let the user rename. Slug it: lowercase, hyphens, no spaces.

Create the folder and write the spec:

mkdir -p brands/[brand-name]/thumbnails/[output-name]

Write brands/[brand-name]/thumbnails/[output-name]/thumbnail-spec.json:

{
  "output_name": "gels-launch-bold",
  "brand": "puresport",
  "source_video": "absolute/path/to/video.mp4",
  "peak_hook_seconds": 1.4,
  "extracted_frame": "_frames/peak.png",
  "headline": "ONE SCOOP. SEVEN HOURS.",
  "headshot_used": false,
  "headshot_path": null,
  "aspect_ratio": "9:16",
  "treatment": "bold-text",
  "virality_link": "brands/puresport/virality/gels-launch-hook-v1",
  "prompt": "[fully resolved prompt — see template below]",
  "created_at": "2026-05-10T..."
}

Prompt construction — assemble from the treatment template + scene reading + brand visuals.

Read visual-guidelines.md first and explicitly name the brand's primary background colour, headline typeface and weight, and any accent colour the brand uses for emphasis. Never let these be inherited from the source frame.

Treatment templates

bold-text (default — viral feed thumbnail)

Use the attached image as the photographic base. Image 1 is a hero frame extracted from a video at its highest-engagement moment — preserve the subject's face, expression, body position, and the lighting on them exactly. [If headshot present: Image 2 is a face reference for the same subject — use it to keep facial likeness consistent and sharp; do not re-pose or re-style.]

Recompose into a [ASPECT RATIO] cover image. Reframe so the subject's face occupies the upper 45–60% of the frame, eyes near the upper third gridline. Increase contrast and saturation by ~15% versus source. Add a subtle [BRAND ACCENT COLOUR] colour-grade pull in the highlights for cohesion with the brand. Background may be photographically extended or replaced with a clean [BRAND BACKGROUND COLOUR] gradient if the source background is cluttered.

Overlay the headline "[HEADLINE]" in [BRAND HEADLINE TYPEFACE], heavy weight, all-caps, white with a 2px [BRAND ACCENT COLOUR] outer stroke for legibility on any backdrop. Position the headline in the lower third of the frame, two lines maximum, kerning tight, leading tight, scaled so the longest line spans 85–90% of frame width. The headline must remain readable at IG-grid thumbnail size (~120px wide).

No logo, no watermark, no URL, no decorative elements. The image is the subject's face + the headline only.

Safe zones: keep the top 8% and bottom 12% of the [ASPECT RATIO] frame free of text and critical facial features — these areas are clipped by the IG handle bar and caption row in feeds. Photographic content (hair, shoulders, ambient background) entering these zones is fine.

cinematic (photo-forward, premium)

Use the attached image as the base. Image 1 is the peak hook frame from a video. Preserve the subject, expression, environment, and lighting. [If headshot present: Image 2 is a face reference for likeness — apply only to the face.]

Recompose into a [ASPECT RATIO] cover image with cinematic colour grading: gentle film-stock look, slightly lifted blacks, [BRAND ACCENT COLOUR] in the warm highlights. Tighten contrast on the subject's face. Smooth the background — no busy detail behind the head.

Overlay the headline "[HEADLINE]" in [BRAND HEADLINE TYPEFACE], small and refined — sentence case if the headline reads as a sentence, all-caps only if 3 words or fewer. Place it in the lower-left or lower-right margin, never centred over the face. Maximum width: 35% of frame width. Colour: white over dark grade, or [BRAND ACCENT] over light grade.

Mood: editorial, magazine-cover, restrained. The subject and the lighting do most of the work.

Safe zones: top 8%, bottom 12%, free of text and faces.

split (subject one side, text block opposite)

Use the attached image as the photographic base. Image 1 is the peak hook frame. Preserve subject, expression, lighting. [If headshot present: Image 2 is a face reference for likeness.]

Recompose into a [ASPECT RATIO] frame with the subject occupying the right 55% (or left, pick whichever side the subject's gaze naturally points away from — they should be looking into the empty space, not out of frame). Subject's face occupies the upper-mid third of their half. The opposing 45% is a clean [BRAND BACKGROUND COLOUR] block — flat, no gradient, no texture.

In the empty block, place the headline "[HEADLINE]" in [BRAND HEADLINE TYPEFACE], heavy weight, vertically stacked one word per line where natural, all-caps. Colour: high-contrast against the block — [BRAND ACCENT COLOUR] on a light block, white on a dark block. Scale so the block of text fills 60% of the block's height with comfortable margins.

Add a thin 4px [BRAND ACCENT COLOUR] vertical rule on the seam between subject and text block.

Safe zones: top 8%, bottom 12%.

The chosen template is filled in fully (every bracket replaced) and stored in the spec's prompt field. The user never sees the prompt template itself.

Step 8 — Generate

Run from project root:

python3 skills/references/generate-thumbnail.py brands/[brand-name]/thumbnails/[output-name]

The script:

Reads thumbnail-spec.json
Uploads the extracted peak frame as Image 1 (and the headshot as Image 2 if headshot_used: true)
Calls Nano Banana 2 edit with the resolved prompt + aspect ratio
Downloads the result to [output-name]_v1.png (auto-incrementing on rerun)

Cost: ~$0.12 per thumbnail.

Step 9 — Present and iterate

Generated:
  ✓ brands/puresport/thumbnails/gels-launch-bold/gels-launch-bold_v1.png

  Source: peak hook at 1.4s of gels-launch-hook-v1.mp4 (engagement 72/100)
  Headline: "ONE SCOOP. SEVEN HOURS."
  Treatment: bold-text · 9:16

If a virality_link is set, also surface the score so the user sees the through-line:

"This is the cover for the variant your /virality run scored 72/100, hold rate 81. Scroll-feed friendly. If you score a stronger V4 later, regenerate the cover from that one."

Ask: "Happy with it? Regenerate (same prompt, new seed), tweak the headline, switch treatment, or change aspect ratio?"

To regenerate same prompt:

python3 skills/references/generate-thumbnail.py brands/[brand-name]/thumbnails/[output-name]

Saves as the next version (_v2, _v3, …) — does not overwrite.

To change headline or treatment: update the relevant field in thumbnail-spec.json (and rebuild the prompt field accordingly), then re-run.

To produce a 16:9 long-form cover from the same source: create a second output folder with _yt suffix, copy the spec, change aspect_ratio to 16:9 and treatment to cinematic (or bold-text — both work for YouTube), then run.

Notes

The peak hook timestamp from /virality is the deterministic "best frame" — that's the whole reason this skill pairs with TRIBE v2. Do not overwrite that logic with vibes-based frame picking unless the peak frame is technically unusable (Step 3 fallback).
_frames/ is kept inside the output folder so reruns can re-read the peak frame without re-running ffmpeg. It's safe to delete after final approval.
Will's headshot at ~/Desktop/STUDIO/wills-headshot.png is an optional talent reference — never auto-applied. The default is no headshot.
Versioning: reruns increment _v1 → _v2 → _v3. Previous outputs are never overwritten.
The thumbnail is the cover, not the post. Text is intentionally minimal — this is a scroll-feed object, not a full ad. For text-heavy ad creatives, use /static-ads.