nano-banana-imagegen - SKILL.md Agent Skill

name: nano-banana-imagegen
description: Generate and edit images using Google Gemini image models via the nano-banana CLI. Use when the user asks to create, generate, make, or edit images with AI. Supports text-to-image, image editing, style transfer, and multi-image composition. Trigger on requests like "create an image", "generate a picture", "make me a logo", "edit this photo", "add X to this image".

Nano Banana Image Generation

Generate and edit images using Google's Gemini image models via the nano-banana CLI.

Prerequisites

GEMINI_API_KEY environment variable must be set
The CLI runs via npx @the-focus-ai/nano-banana (requires Node.js; npx downloads the package on first use)

Quick Reference

# Generate a new image
npx @the-focus-ai/nano-banana "a serene mountain landscape at sunset"

# Edit an existing image
npx @the-focus-ai/nano-banana "add a hot air balloon to the sky" --file photo.jpg

# Specify output path
npx @the-focus-ai/nano-banana "a minimalist logo" --output logo.png

# Use a specific model
npx @the-focus-ai/nano-banana "detailed illustration" --model gemini-2.0-flash-exp

Workflow

Step 1: Understand the Request

Before generating, clarify:

Subject: What should be in the image?
Style: Photorealistic, illustration, cartoon, abstract?
Mood: Bright, dark, moody, cheerful?
Composition: Close-up, wide shot, specific aspect ratio?
Use case: Hero image, icon, social media, print?

Step 2: Craft an Effective Prompt

Read references/prompting-guide.md for comprehensive guidance.

Key principles:

Be specific and descriptive
Include style references
Specify what you DON'T want
Describe composition and framing
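One way to apply these principles is to assemble the prompt from separate parts (subject, style, lighting, composition, things to avoid) so that each can be adjusted on its own. The sketch below is illustrative only: build_prompt is a hypothetical helper, not part of the nano-banana CLI, and the example parts are placeholders.

```shell
# Hypothetical helper (not part of the CLI): joins non-empty parts with ", ".
build_prompt() {
  local prompt="" part
  for part in "$@"; do
    [ -n "$part" ] || continue          # skip parts left empty
    if [ -z "$prompt" ]; then
      prompt=$part
    else
      prompt="$prompt, $part"
    fi
  done
  printf '%s\n' "$prompt"
}

# Subject, style, lighting, composition, negative guidance.
prompt=$(build_prompt \
  "a serene mountain landscape at sunset" \
  "photorealistic style" \
  "golden hour lighting" \
  "wide shot" \
  "no people or buildings")
printf '%s\n' "$prompt"
```

The resulting string can then be passed as the prompt argument, for example npx @the-focus-ai/nano-banana "$prompt".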

Example — Weak prompt:

"a cat"

Example — Strong prompt:

"A fluffy orange tabby cat curled up on a velvet armchair, soft afternoon sunlight streaming through a window, warm cozy interior, photorealistic style, shallow depth of field"

Step 3: Generate the Image

npx @the-focus-ai/nano-banana "your detailed prompt here"

Default output: output/generated-<timestamp>.png

Step 4: Iterate

If the result isn't right:

Refine the prompt: add more detail or constraints
Edit the image: use --file to modify the generated image
Try a different model: some models handle certain styles better; run --list-models to see what is available
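The generate-then-edit loop can be sketched as a small shell function. This is a minimal sketch under two assumptions: that --file and --output can be combined in one invocation, and that the NANO_BANANA variable (introduced here only so the command can be swapped for a stub during dry runs) is expanded by a shell that word-splits unquoted variables, such as bash or sh.

```shell
# Assumption: NANO_BANANA is a convenience variable defined here, not by the CLI.
NANO_BANANA=${NANO_BANANA:-"npx @the-focus-ai/nano-banana"}

# Generate a first draft at a known path, then apply one edit to that draft.
generate_then_edit() {
  local prompt=$1 edit=$2 draft=$3 final=$4
  $NANO_BANANA "$prompt" --output "$draft" || return 1
  # Assumption: --file and --output may be used together.
  $NANO_BANANA "$edit" --file "$draft" --output "$final"
}

# Usage:
#   generate_then_edit "a serene mountain landscape at sunset" \
#     "add a hot air balloon to the sky" draft.png final.png
```

Writing the draft to an explicit path avoids having to look up the timestamped default filename before editing.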

Commands

Text-to-Image Generation

npx @the-focus-ai/nano-banana "<prompt>"

Image Editing

npx @the-focus-ai/nano-banana "<edit instruction>" --file <input-image>

Edit instructions should describe the change:

"Remove the background and replace with a gradient"
"Add sunglasses to the person"
"Change the sky to sunset colors"
"Make it look like a watercolor painting"

Options

| Option | Description |
| --- | --- |
| `--file <image>` | Input image for editing |
| `--output <path>` | Custom output path |
| `--model <name>` | Specific Gemini model |
| `--flash` | Use gemini-2.0-flash (faster, simpler images) |
| `--prompt-file <path>` | Read prompt from file |
| `--list-models` | Show available models |
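For long prompts, --prompt-file keeps the text out of the shell command line. The sketch below assumes that --prompt-file takes the place of the positional prompt and can be combined with --output; neither is confirmed here, so check the CLI's help output. The function names and the NANO_BANANA variable are hypothetical.

```shell
# Assumption: NANO_BANANA is a convenience variable defined here, not by the CLI.
NANO_BANANA=${NANO_BANANA:-"npx @the-focus-ai/nano-banana"}

# Save a multi-line prompt to a file so it can be reused and versioned.
write_prompt_file() {
  cat > "$1" <<'EOF'
A fluffy orange tabby cat curled up on a velvet armchair,
soft afternoon sunlight streaming through a window,
warm cozy interior, photorealistic style, shallow depth of field
EOF
}

# Assumption: --prompt-file replaces the positional prompt argument.
generate_from_file() {
  $NANO_BANANA --prompt-file "$1" --output "$2"
}

# Usage:
#   write_prompt_file prompt.txt
#   generate_from_file prompt.txt cat.png
```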

Best Practices

For Better Results

Start with composition: Describe the layout first, then details
Use artistic references: "in the style of Studio Ghibli", "like a National Geographic photo"
Specify lighting: "golden hour lighting", "dramatic chiaroscuro", "soft diffused light"
Include negative guidance: Describe what to avoid in the prompt itself
Consider aspect ratio: The model generates square images by default; describe a wide or tall format in the prompt if needed

For Editing

Be specific about changes: "Add a blue butterfly to the top-left corner"
Preserve what works: "Keep the background unchanged, only modify the foreground"
Iterative refinement: Make one change at a time for better control
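The one-change-at-a-time approach can be sketched as a loop in which each edit reads the previous step's output. This is an illustration, not a documented workflow of the CLI: apply_edits, the edit-step-N.png naming scheme, and the NANO_BANANA variable are hypothetical, and it assumes --file and --output can be used together.

```shell
# Assumption: NANO_BANANA is a convenience variable defined here, not by the CLI.
NANO_BANANA=${NANO_BANANA:-"npx @the-focus-ai/nano-banana"}

# Apply each edit instruction in turn; every step edits the previous result.
apply_edits() {
  local current=$1 step=0 next edit
  shift
  for edit in "$@"; do
    step=$((step + 1))
    next="edit-step-$step.png"          # hypothetical naming scheme
    # CLI output goes to stderr so stdout carries only the final path.
    $NANO_BANANA "$edit" --file "$current" --output "$next" >&2 || return 1
    current=$next
  done
  printf '%s\n' "$current"              # path of the last image written
}

# Usage:
#   apply_edits photo.jpg "Add sunglasses to the person" \
#     "Change the sky to sunset colors"
```

Keeping every intermediate file makes it possible to go back to an earlier step if a later edit degrades the image.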

Environment Setup

Ensure GEMINI_API_KEY is set:

export GEMINI_API_KEY="your-api-key-here"

Or create a .env file in your project:

GEMINI_API_KEY=your-api-key-here
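A quick pre-flight check can catch a missing key before any request is made. The sketch below assumes the .env file lives in the current directory; has_gemini_key is a hypothetical helper and only tests that a value is present, not that the key is valid.

```shell
# Hypothetical helper: succeeds if the key is exported or defined in ./.env.
has_gemini_key() {
  # Exported in the environment?
  [ -n "${GEMINI_API_KEY:-}" ] && return 0
  # Otherwise look for a non-empty assignment in a local .env file.
  [ -f .env ] && grep -q '^GEMINI_API_KEY=.' .env
}

# Usage:
#   has_gemini_key || echo "GEMINI_API_KEY is not set" >&2
```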

Troubleshooting

| Problem | Solution |
| --- | --- |
| "No image in response" | The prompt may have triggered safety filters; rephrase it |
| Poor quality results | Add more specific style guidance, or try the `gemini-2.0-flash-exp` model |
| Image doesn't match description | Be more explicit about composition and add negative constraints |