thumbnail-creator

star 1

Specialized skill for generating YouTube-style thumbnails from a topic description, video title, or video URL. Use when: (1) creating a YouTube thumbnail for a video, (2) generating a click-worthy thumbnail image from a topic or title, (3) designing thumbnails with text overlays, faces, and bold visuals, (4) creating multiple thumbnail variations for A/B testing, (5) making social media or blog thumbnails optimized for clicks. Builds on: gemini-image.

vibecode By vibecode schedule Updated 3/1/2026

name: thumbnail-creator description: > Specialized skill for generating YouTube-style thumbnails from a topic description, video title, or video URL. Use when: (1) creating a YouTube thumbnail for a video, (2) generating a click-worthy thumbnail image from a topic or title, (3) designing thumbnails with text overlays, faces, and bold visuals, (4) creating multiple thumbnail variations for A/B testing, (5) making social media or blog thumbnails optimized for clicks. Builds on: gemini-image. metadata: {"openclaw": {"emoji": "๐Ÿ–ผ๏ธ"}}

Thumbnail Creator

Generate high-CTR YouTube-style thumbnails from a topic, title, or video URL. Uses AI image generation to produce bold, attention-grabbing visuals optimized for click-through.

Foundational Skills Used

  • gemini-image โ€” Image generation and editing via Gemini (Nano Banana). Read it for API details.

Cloud proxy fallback: If the direct Gemini API returns a "key expired" or 400 error, use the cloud proxy endpoint per the environment skill. The proxy URL replaces the domain (e.g., generativelanguage.googleapis.com โ†’ generativelanguage.googleapis.com.cloudproxy.vibecodeapp.com). Read the environment skill for details.

Thumbnail Design Principles

Before generating, understand what makes thumbnails click-worthy:

  1. Bold, simple composition โ€” One clear focal point. No clutter. Readable at 168ร—94px (mobile size).
  2. High contrast colors โ€” Bright backgrounds, saturated tones. Avoid muted palettes.
  3. Emotional faces โ€” Expressive human faces increase CTR by up to 30%. Exaggerated surprise, excitement, curiosity.
  4. Minimal text โ€” Under 5 words max. Large, bold, high-contrast text. Often ALL CAPS.
  5. Visual tension โ€” Before/after, contrast, unexpected juxtaposition, mystery elements.
  6. Brand consistency โ€” Recurring color scheme, text style, and layout across a channel's thumbnails.

What NOT to do

  • Don't overcrowd โ€” if you can't describe the thumbnail in one sentence, it's too busy
  • Don't use small text โ€” it must be readable on a phone
  • Don't use generic stock-photo compositions โ€” they look like ads, not content
  • Don't put important elements in the bottom-right corner โ€” YouTube's timestamp overlay covers it

Workflow

Step 1: Understand the Content

If given a video URL, extract context about the video first:

  • Use the supadata skill's /metadata endpoint (more reliable) or serpapi-youtube skill to get the video title, description, and key topics. Avoid Supadata's /youtube/video endpoint โ€” it's slower and prone to timeouts. If /metadata fails, fall back to serpapi-youtube.
  • If a transcript is available, scan it for the main hook or surprise

If given a topic or title, work directly with that.

Identify:

  • The hook โ€” What's the one thing that would make someone click?
  • The emotion โ€” Surprise? Curiosity? Excitement? Fear of missing out?
  • Key visual element โ€” What single image captures the concept?

Step 2: Craft the Image Prompt

Build a detailed prompt for the gemini-image skill. The prompt should describe a thumbnail, not a generic image.

Prompt formula:

A YouTube thumbnail showing [MAIN VISUAL ELEMENT]. [COMPOSITION DETAILS].
[COLOR/STYLE DETAILS]. The image is bold, high-contrast, and designed to
grab attention at small sizes. YouTube thumbnail style.

If the thumbnail needs text, include it directly in the prompt (see Step 4).

Prompt tips:

  • Always include "YouTube thumbnail style" or "YouTube thumbnail" in the prompt
  • Specify the emotional tone: "dramatic", "exciting", "shocking", "curious"
  • Describe the composition: "close-up face on the left, [object] on the right"
  • Request bold colors: "vibrant", "saturated", "neon accents", "bright background"
  • If faces are relevant, specify expression: "person with an shocked/excited/amazed expression"
  • Mention contrast: "dark subject on bright background" or "glowing text effect"

Example prompts by category:

Category Prompt Pattern
Tutorial "YouTube thumbnail of a person looking excited pointing at [SUBJECT], bright colorful background, bold and clean composition"
Listicle "YouTube thumbnail showing [NUMBER] items arranged dramatically, bright gradient background, bold visual hierarchy"
Reaction "YouTube thumbnail of a person with an exaggerated shocked expression looking at [SUBJECT], split composition, vibrant colors"
Comparison "YouTube thumbnail showing [THING A] vs [THING B] in a split-screen style, dramatic lighting, VS text in the center"
Story/Drama "YouTube thumbnail with a dramatic cinematic scene of [SUBJECT], moody lighting with one bright focal point, dark background"
How-to "YouTube thumbnail of [END RESULT] looking impressive, before/after split, bright clean background"

Step 3: Generate the Image

Use the gemini-image skill to generate the thumbnail.

  • Aspect ratio: Always use 16:9 โ€” this is YouTube's thumbnail ratio
  • Generate 2-3 variations with slightly different prompts (swap emotion, change composition, alter colors) so the user can pick the best one
  • If the first result isn't right, iterate โ€” adjust the prompt and regenerate

Output validation: After base64 decoding the generated image, check the file size is > 1KB. Files under 1KB (e.g., 3 bytes) indicate a failed generation โ€” typically from a null API response. Discard and retry.

File format note: Gemini may return JPEG data regardless of what the docs say. The saved file may actually be JPEG even with a .png extension. This usually doesn't matter (image viewers handle it), but if accuracy matters, run file <output> to verify the actual format and rename the extension if needed.

Step 4: Add Text Overlay (If Needed)

Most thumbnails benefit from short, punchy text (1-5 words). Gemini handles text generation well for short phrases.

How to do it: Include the desired text directly in your generation prompt. Specify the text content, position, and style:

...with bold white text saying "TOP 10" in the upper left, large block letters, high contrast against the background

Best practices for prompt-baked text:

  • Keep it under 5 words โ€” shorter text renders more reliably
  • Specify the style: "bold", "block letters", "ALL CAPS", "thick outline"
  • Specify high contrast: "white text with black outline", "bright yellow text on dark background"
  • Specify position: "upper left", "center", "across the top"
  • If the text doesn't render correctly on the first try, regenerate or use gemini-image's editing mode to fix it

If no text is needed: Add "Do NOT include any text, words, letters, or numbers in the image" to the prompt to prevent Gemini from adding unwanted text.

Text placement guidelines:

  • Top-left or center for primary text
  • Avoid bottom-right (YouTube timestamp covers it)
  • Maximum 5 words
  • Bold, thick sans-serif style works best for thumbnails

Step 5: Deliver

Save the thumbnail(s) and present to the user. Default save location: ~/Photos/thumbnails/.

  • Name files descriptively: thumbnail-[topic]-v1.png, thumbnail-[topic]-v2.png
  • If the user wants to host it, use the cloud-storage skill for a CDN URL
  • Show all variations so the user can pick

Quick Reference: Common Requests

User Says What To Do
"Make a thumbnail for [topic]" Steps 2โ†’3โ†’5 (skip transcript extraction)
"Make a thumbnail for this video [URL]" Steps 1โ†’2โ†’3โ†’5 (extract context first)
"Add text to this thumbnail" Step 4 only (use gemini-image editing mode to add text)
"Make variations" Regenerate Step 3 with prompt tweaks
"Make it more clickbaity" Increase emotional language, add more contrast, exaggerate expressions
"Make it cleaner/professional" Reduce elements, use more whitespace, softer colors, remove text

Tips

  • Generate before perfecting. Get a rough version fast, then iterate. Don't spend 10 minutes on the perfect prompt.
  • Mobile-first. Shrink the image mentally to phone size. If the focal point is lost, simplify.
  • Study the niche. If the user has a specific channel style, ask for examples and match the vibe.
  • Contrast is king. The thumbnail competes with dozens of others. High contrast wins attention.
  • Iterate with editing. Use gemini-image's editing mode to refine โ€” change backgrounds, swap colors, adjust composition โ€” rather than regenerating from scratch each time.
Install via CLI
npx skills add https://github.com/vibecode/Skillsforagent --skill thumbnail-creator
Repository Details
star Stars 1
call_split Forks 2
navigation Branch main
article Path SKILL.md
More from Creator