name: generate-image
description: Generate 2K images using Gemini 3 Pro API. Creates high-quality images from text prompts with various aspect ratios.
Gemini 3 Pro Image Generator
Generate 2K resolution images from text prompts using Google's Gemini 3 Pro model (Nano Banana Pro).
Quick Start
python ~/.claude/commands/generate-image/scripts/generate_image.py "A beautiful sunset over mountains" -o sunset.png
Prompting Guide
Reference: Nano Banana Pro Prompting Tips
Key Elements for Effective Prompts
Include these elements for nuanced creative control:
Subject: Who or what is in the image? Be specific
- "a stoic robot barista with glowing blue optics"
- "a fluffy calico cat wearing a tiny wizard hat"
Composition: How elements are arranged
- "centered", "rule of thirds", "close-up", "wide shot"
Action: What is happening
- "pouring coffee", "reading a book", "running through rain"
Location: Where the scene takes place
- "in a neon-lit Tokyo alley", "on a misty mountain peak"
Style: Visual aesthetic
- "photorealistic", "watercolor painting", "retro 80s poster"
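One way to keep these five elements consistent is to assemble the prompt from named parts. The sketch below is illustrative only: `PromptParts` and `build_prompt` are hypothetical helpers, not part of `generate_image.py`, and the ordering of the elements is an arbitrary choice.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the five prompt elements."""
    subject: str
    composition: str = ""
    action: str = ""
    location: str = ""
    style: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty elements into one comma-separated prompt."""
    ordered = [parts.subject, parts.action, parts.location,
               parts.composition, parts.style]
    return ", ".join(p.strip() for p in ordered if p.strip())


prompt = build_prompt(PromptParts(
    subject="a stoic robot barista with glowing blue optics",
    composition="close-up",
    action="pouring coffee",
    location="in a neon-lit Tokyo alley",
    style="photorealistic",
))
print(prompt)
```

The resulting string can be passed as the first argument to the script shown in Quick Start.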
7 Tips for Best Results
Use Specific Details: Combine subject, composition, action, location, and style in one prompt
Leverage Real-World Knowledge: Gemini 3 Pro understands real-world context, so you can reference real places, products, or cultural elements
Translate and Localize: Generate text in multiple languages for international markets, posters, infographics
Refine with Technical Terms: Add camera angles, lighting conditions, text integration requirements
Blend Multiple Concepts: Combine ideas, translate images, generate visuals with embedded text
Maintain Brand Consistency: Apply designs with consistent styling, drape patterns/logos onto 3D objects while preserving lighting and texture
Know the Limitations: Complex text rendering and factual diagrams may need iteration
Example Prompts
Product Photography:
"Professional product photo of a minimalist smartwatch on white marble surface,
soft studio lighting, slight reflection, centered composition, 8K detail"
Marketing Banner:
"Modern tech startup banner with text 'Innovation Starts Here',
gradient blue to purple background, geometric shapes, clean typography"
Artistic Portrait:
"Cinematic portrait of a jazz musician in a smoky club,
warm amber lighting, shallow depth of field, 1950s New York aesthetic"
Options
-o, --output: Output file path (default: generated_image.png)
-a, --aspect-ratio: Aspect ratio (default: 16:9)
- Square: 1:1
- Portrait: 2:3, 3:4, 4:5, 9:16
- Landscape: 3:2, 4:3, 5:4, 16:9, 21:9
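The options above map naturally onto `argparse`. The following is a minimal sketch of how such a command line could be parsed and validated; it is an assumption about the shape of the interface, not the actual contents of `generate_image.py`. The flag names, defaults, and ratios are copied from the list above; `build_parser` is a hypothetical name.

```python
import argparse

# Ratios copied from the Options list, grouped the same way.
ASPECT_RATIOS = {
    "square": ["1:1"],
    "portrait": ["2:3", "3:4", "4:5", "9:16"],
    "landscape": ["3:2", "4:3", "5:4", "16:9", "21:9"],
}
ALLOWED = [ratio for group in ASPECT_RATIOS.values() for ratio in group]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that rejects aspect ratios outside the documented set."""
    parser = argparse.ArgumentParser(description="Generate an image from a text prompt.")
    parser.add_argument("prompt", help="Text prompt describing the image")
    parser.add_argument("-o", "--output", default="generated_image.png",
                        help="Output file path")
    parser.add_argument("-a", "--aspect-ratio", default="16:9", choices=ALLOWED,
                        help="Aspect ratio")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["A misty mountain peak", "-a", "9:16"])
print(args.output, args.aspect_ratio)
```

Using `choices` makes an unsupported ratio fail at parse time with a usage message instead of reaching the API.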
Examples
# 16:9 landscape image (default)
python ~/.claude/commands/generate-image/scripts/generate_image.py "Futuristic city skyline at night, cyberpunk aesthetic, neon lights reflecting on wet streets" -o city.png
# Square image for social media
python ~/.claude/commands/generate-image/scripts/generate_image.py "Flat lay of coffee and pastries on wooden table, morning light, cozy cafe aesthetic" -o product.png -a 1:1
# Portrait for mobile
python ~/.claude/commands/generate-image/scripts/generate_image.py "Fashion editorial, model in avant-garde outfit, dramatic studio lighting, high contrast" -o fashion.png -a 9:16
Requirements
- Python 3.8+
- GEMINI_API_KEY in ~/.env
API Details
- Model: gemini-3-pro-image-preview (fixed)
- Resolution: 2K (fixed)
- Endpoint: Google Generative Language API v1beta
- Output: JPEG format (the default output name ends in .png, so the file extension may not match the encoded format)
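For orientation, a request matching these details could look roughly like the sketch below. It only builds the request and does not send it. The URL pattern, the `x-goog-api-key` header, and the `generationConfig` field names (`responseModalities`, `imageConfig`, `aspectRatio`, `imageSize`) are assumptions based on the public Generative Language API documentation and should be checked against the current reference; `build_request` is a hypothetical helper, not the script's own code.

```python
import json
import urllib.request

API_BASE = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-3-pro-image-preview"


def build_request(prompt: str, api_key: str,
                  aspect_ratio: str = "16:9") -> urllib.request.Request:
    """Build (but do not send) a generateContent request for one image."""
    body = {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            # Assumed field names; verify against the API reference.
            "imageConfig": {"aspectRatio": aspect_ratio, "imageSize": "2K"},
        },
    }
    return urllib.request.Request(
        url=f"{API_BASE}/models/{MODEL}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )


req = build_request("A beautiful sunset over mountains", api_key="your_api_key_here")
print(req.full_url)
# Sending it would be: urllib.request.urlopen(req, timeout=120)
```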
Troubleshooting
API Key Not Found
Ensure ~/.env contains:
GEMINI_API_KEY=your_api_key_here
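To check that the key is readable, a small standalone parser for `KEY=value` files can help. This is a sketch under assumptions: the script's real loading logic is not shown here, `read_env_value` and `load_api_key` are hypothetical names, and checking the process environment before `~/.env` is an added convenience rather than documented behavior.

```python
import os
from pathlib import Path
from typing import Optional


def read_env_value(env_path: Path, name: str) -> Optional[str]:
    """Return the value of `name` from a KEY=value file, or None if absent."""
    if not env_path.is_file():
        return None
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip("'\"")
    return None


def load_api_key(env_path: Path = Path.home() / ".env") -> str:
    """Look up GEMINI_API_KEY in the environment, then in env_path (assumed order)."""
    key = os.environ.get("GEMINI_API_KEY") or read_env_value(env_path, "GEMINI_API_KEY")
    if not key:
        raise RuntimeError(f"GEMINI_API_KEY not found in environment or {env_path}")
    return key
```

Calling `load_api_key()` before generating an image surfaces a missing key as a clear error rather than a failed API call.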
Timeout Errors
The default timeout is 120 seconds, and complex prompts may take longer than that. If a request times out, retry it or simplify the prompt.
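If timeouts are frequent, retrying with a longer timeout is one option. The wrapper below is a generic sketch and does not reflect how `generate_image.py` handles timeouts: `call_with_retries` is a hypothetical helper, `send` stands for any function that performs the request with a given timeout, and the retry count and 1.5x growth factor are arbitrary choices.

```python
import socket
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(send: Callable[[float], T], timeout: float = 120.0,
                      attempts: int = 3, backoff: float = 2.0) -> T:
    """Call send(timeout); on a timeout, wait and retry with a longer timeout."""
    for attempt in range(1, attempts + 1):
        try:
            return send(timeout)
        except (TimeoutError, socket.timeout):
            if attempt == attempts:
                raise  # Out of attempts: surface the last timeout to the caller.
            time.sleep(backoff * attempt)  # Wait a little longer each time.
            timeout *= 1.5
    raise AssertionError("unreachable when attempts >= 1")
```

A caller might pass something like `lambda t: urllib.request.urlopen(req, timeout=t)` as `send`, where `req` is an already prepared request.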
No Image in Response
Some prompts may be rejected by safety filters. Try rephrasing the prompt.
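When debugging this case, it helps to inspect the raw JSON response. The sketch below assumes the REST response shape from the public API documentation (`candidates[].content.parts[].inlineData`, `promptFeedback.blockReason`, `finishReason`); those field names are assumptions to verify, and `extract_image` and `describe_failure` are hypothetical helpers.

```python
import base64
from typing import Optional, Tuple


def extract_image(response: dict) -> Optional[Tuple[str, bytes]]:
    """Return (mime_type, image_bytes) for the first inline image, or None."""
    for candidate in response.get("candidates", []):
        parts = (candidate.get("content") or {}).get("parts", [])
        for part in parts:
            inline = part.get("inlineData")
            if inline and "data" in inline:
                mime = inline.get("mimeType", "application/octet-stream")
                return mime, base64.b64decode(inline["data"])
    return None


def describe_failure(response: dict) -> str:
    """Summarize why no image came back, using whatever fields are present."""
    block = (response.get("promptFeedback") or {}).get("blockReason")
    if block:
        return f"prompt blocked: {block}"
    reasons = [c.get("finishReason") for c in response.get("candidates", [])
               if c.get("finishReason")]
    return f"no image part; finish reasons: {reasons or 'unknown'}"
```

If `extract_image` returns `None`, printing `describe_failure(response)` gives a starting point for rephrasing the prompt.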