name: banana-maker description: Generate images using the Gemini image generation API with the Nano Banana Pro prompting methodology. Use when the user asks to generate, create, or make an image, picture, photo, illustration, or visual. argument-hint: ""PROMPT" [--model flash|pro|grounded] [--aspect-ratio AR] [--size SIZE] [--reference PATH]"
Banana Maker — Gemini Image Generation
When to Use
Activate this skill when the user asks to:
- Generate, create, or make an image / picture / photo / illustration
- Visualize a concept, scene, or idea as an image
- Edit or modify an existing image (with reference images)
- Create thumbnails, posters, logos, or any visual content
Workflow
1. Capture the User's Idea
Take the user's description or prompt as input. It can be a quick idea or a detailed description.
2. Ask for Model Choice
Present these options:
- Flash — Nano Banana 2 (
gemini-3.1-flash-image-preview). Fast, cost-effective. Supports 512 to 4K, controllable thinking, new ultra-wide aspect ratios. Good for iteration and quick concepts. - Pro — Nano Banana Pro (
gemini-3-pro-image-preview). Advanced reasoning ("Thinking"), high-fidelity text rendering. Best for final output and detailed work. 1K-4K. - Grounded — Same Flash 3.1 model + search tools. The model can search the web and Google Images for real visual references before generating. Best for real people, real products, specific brands, real places, or anything where factual accuracy matters.
3. Enhance the Prompt (Recommended)
Read the prompting guide at ~/.claude/skills/banana-maker/prompting-guide.md and apply the Nano Banana Pro methodology to enhance the user's idea into a detailed, high-quality prompt. Claude does this enhancement directly -- no extra API call needed.
Apply the Ultimate Prompt Structure:
- Subject -- Who/What
- Action -- Doing what
- Environment -- Where
- Lighting/Mood -- Time of day, atmosphere
- Technical Specs -- Camera angle, style, texture, materiality
- Context -- Intended audience/medium (this helps the model infer artistic decisions)
Key principles:
- Write in natural language, not tag soup
- Add materiality keywords to avoid the "plastic AI look" (brushed steel, soft velvet, matte finish, grainy film texture)
- Give context/purpose so the model can infer artistic decisions
Show the enhanced prompt to the user before generating.
4. Generate the Image
Run the script using the banana-maker venv:
~/.claude/skills/banana-maker/venv/bin/python ~/.claude/skills/banana-maker/generate_image.py "THE ENHANCED PROMPT" --model MODEL_CHOICE [OPTIONS]
5. Report Results
- Show the output file path
- Since Matthew browses from his Surface tablet, provide:
http://<host>:PORT/filenameif a file server is running, otherwise just the file path
Script Usage Reference
generate_image.py PROMPT --model MODEL [OPTIONS]
Positional:
prompt The image generation prompt
Required:
--model, -m flash | pro | grounded
Options:
--output, -o PATH Output file path (default: output/generated_TIMESTAMP.png)
--aspect-ratio, -a AR 1:1 | 16:9 | 9:16 | 4:3 | 3:4 | 1:4 | 4:1 | 1:8 | 8:1 (default: 1:1)
--size, -s SIZE 512 | 1K | 2K | 4K (default: 2K, 512 is Flash 3.1 only)
--reference, -r PATH Reference image path(s), can specify multiple times (max 14)
--search SEARCH web | image | both (requires --model grounded)
--thinking, -t LEVEL minimal | high (Flash 3.1 thinking control, flash/grounded only)
Model Mapping
| Flag | Model ID | Notes |
|---|---|---|
flash |
gemini-3.1-flash-image-preview |
Nano Banana 2: fast, 512-4K, thinking control |
pro |
gemini-3-pro-image-preview |
Nano Banana Pro: advanced reasoning, 1K-4K |
grounded |
gemini-3.1-flash-image-preview |
Flash 3.1 + search tools (Nano Banana 2) |
Reference Images
For identity locking, style transfer, or image editing, pass reference images:
~/.claude/skills/banana-maker/venv/bin/python ~/.claude/skills/banana-maker/generate_image.py \
"Transform this photo into a watercolor painting" \
--model flash \
--reference photo.jpg
Supports up to 14 reference images (10 objects + 4 characters for Flash 3.1).
Examples
Simple generation:
~/.claude/skills/banana-maker/venv/bin/python ~/.claude/skills/banana-maker/generate_image.py \
"A hyper-realistic close-up of an astronaut fixing a circuit board on Mars. Harsh sunlight from the right creates deep contrast. Shot on 35mm film with visible grain. For a sci-fi documentary poster." \
--model pro --aspect-ratio 16:9
With reference image:
~/.claude/skills/banana-maker/venv/bin/python ~/.claude/skills/banana-maker/generate_image.py \
"Keep this person's facial features identical but place them in a cyberpunk city at night" \
--model pro --reference portrait.jpg
With search grounding (real-world accuracy):
~/.claude/skills/banana-maker/venv/bin/python ~/.claude/skills/banana-maker/generate_image.py \
"The Sydney Opera House at sunset with a dramatic storm approaching from the west" \
--model grounded --search both --aspect-ratio 16:9
Fast low-res iteration with thinking:
~/.claude/skills/banana-maker/venv/bin/python ~/.claude/skills/banana-maker/generate_image.py \
"A minimalist logo for a coffee brand called 'Dark Roast Labs'" \
--model flash --size 512 --thinking high
Ultra-wide panoramic:
~/.claude/skills/banana-maker/venv/bin/python ~/.claude/skills/banana-maker/generate_image.py \
"A sweeping desert landscape at golden hour, endless sand dunes under dramatic cloud formations" \
--model flash --aspect-ratio 8:1 --size 4K