design-assets - SKILL.md Agent Skill

name: design-assets description: "Generate and manipulate visual assets — icons, illustrations, images. Supports OpenAI (gpt-image-1, gpt-image-1-mini, dall-e-3), Google Vertex AI Imagen (Imagen 3/4), and Google Vertex AI Gemini (gemini-2.5-flash-image, gemini-3-pro-image-preview). Operations include generate, edit/inpaint, outpaint, background removal. Also fetches stock photos from Unsplash. Triggers on 'generate illustration', 'create image', 'remove background', 'find stock photo'." allowed-tools: ["Read", "Write", "Bash", "Glob", "Grep", "Edit", "Agent", "WebFetch", "WebSearch", "AskUserQuestion"]

Design Assets

Generate, edit, and source visual assets for design projects. Supports AI generation via OpenAI, Google Vertex AI Imagen, and Google Vertex AI Gemini, plus stock photography from Unsplash.

Reference files:

references/api-reference.md — API endpoints, parameters, prompt templates
references/asset-manifest-schema.md — manifest format for tracking assets

1. Pre-flight

Check for available API keys by testing environment variables:

echo "OpenAI: ${OPENAI_API_KEY:+set}" && echo "GCP Project: ${GOOGLE_CLOUD_PROJECT:+set}" && echo "GCP Location: ${GOOGLE_CLOUD_LOCATION:+set}" && echo "Unsplash: ${UNSPLASH_ACCESS_KEY:+set}"

Note: GCP credentials are shared between Imagen and Gemini image generation — both use the same project, location, and gcloud auth.

Report which providers are available. If none are set, tell the designer which environment variables to configure and stop.

Detect context:

Called from Style skill: Read state.json and directions.json from the style working directory. Extract the active direction's color palette, mood keywords, illustration style preference, and any existing assets. Use this context to inform prompts automatically.
Called standalone: Ask the designer what kind of asset they need and for what purpose.

2. Asset Type Selection

Determine what asset is needed. Ask or infer from context:

What kind of asset do you need? A) Icon — small UI element, single subject, clean lines B) Spot illustration — small decorative illustration for empty states, onboarding C) Full illustration — larger scene illustration for hero sections, features D) Photo/image — AI-generated photo-realistic image E) Stock photo — real photography from Unsplash (free, with attribution)

For AI-generated assets (A-D), ask about artistic style:

What style? A) Flat 2D minimal — clean, modern, works at any size B) 3D rendered — depth and realism, heavier visual weight C) Line art — elegant, lightweight, pairs well with text D) Hand-drawn — organic, approachable, distinctive character

If a Style direction is available and specifies illustrationStyle, recommend that style and explain why it fits the direction. Present pros and cons briefly for each option so the designer makes an informed choice.

3. Model Selection

Present model options based on the asset type and available API keys:

For quick drafts and iteration:

gpt-image-1-mini — ~80% cheaper than gpt-image-1, great for rapid exploration
Imagen 4 Fast (imagen-4.0-fast-generate-001) — fastest Imagen generation
Gemini 2.5 Flash (gemini-2.5-flash-image) — fast, cost-effective, conversational editing
dall-e-3 — reliable, good prompt adherence

For final quality assets:

Imagen 4 Ultra (imagen-4.0-ultra-generate-001) — highest quality from Google
gpt-image-1 — best from OpenAI, supports editing
Gemini 3 Pro (gemini-3-pro-image-preview) — highest quality Gemini image gen (preview)
Imagen 4 (imagen-4.0-generate-001) — strong quality, good balance

For stock photography:

Unsplash — real photos, free with attribution, no AI artifacts

For assets that need editing later:

gpt-image-1 / gpt-image-1-mini — supports inpaint/edit via mask
Gemini 2.5 Flash (gemini-2.5-flash-image) — conversational editing (pass image + text instruction)
Imagen 3 (imagen-3.0-generate-002) — supports generation and mask-based editing

If the Style direction specifies a preferred provider, recommend that. Ask the designer to choose a model.

4. Generation

Read references/api-reference.md for the exact API call format and prompt templates.

Build the prompt:

Start with the appropriate template from api-reference.md for the chosen asset type.
Fill in the template variables (subject, style, colors, mood, context).
If Style direction context is available, inject: color palette names, mood keywords, style descriptors. Example: "using a palette of teal, coral, and warm gray, with a calm and focused mood."
Add technical requirements: background color/transparency, dimensions, any constraints.

Execute the API call:

Use curl via Bash with the appropriate endpoint from api-reference.md.
For OpenAI (gpt-image-1, gpt-image-1-mini, dall-e-3): request b64_json response format for reliable saving.
For Vertex AI Imagen: authenticate with gcloud auth print-access-token. Response is in predictions[0].bytesBase64Encoded.
For Vertex AI Gemini: authenticate with gcloud auth print-access-token. Must include "responseModalities": ["TEXT", "IMAGE"] in generationConfig. Extract image from candidates[0].content.parts[] — find the part with inlineData.data (base64). Use the python3 extraction snippet from api-reference.md.
For Unsplash: search, present top results with descriptions, let designer pick.

Save the image:

Decode base64 and write to file: echo "$B64_DATA" | base64 --decode > output.png
File location:
- If called from Style: style/direction-{n}/assets/{descriptive-name}.png
- If standalone: current directory, or ask the designer where to save.
For Unsplash: download the image URL directly with curl -L -o output.jpg "$URL"

Report the file path so the designer can open it in a browser or image viewer.

5. Iteration

After generating an asset, present options:

Happy with this? I can: A) Generate another variation — same prompt, different result B) Edit a region (inpaint) — describe what to change in a specific area C) Remove background — re-generate with transparency or use edit API D) Extend/outpaint — expand the canvas in a specified direction E) Try different model — switch provider or model for comparison F) Keep this one — finalize and record in manifest

For each option:

A) Variation: Re-run the same API call. The model produces a different result each time.

B) Edit/Inpaint: Ask the designer to describe the region and desired change. For OpenAI, create a mask PNG (transparent where edits should happen) and call the edits endpoint. For Vertex AI Imagen 3, include base64 image and mask in the predict request. See api-reference.md for exact formats.

C) Background removal: Two approaches:

Re-generate with an explicit "transparent background, PNG with alpha channel" prompt addition.
Use the edit API with a mask covering the background area and prompt "remove background, make transparent."

D) Outpaint: Extend the canvas by creating a larger image with the original placed within it:

Create an expanded canvas PNG with the original image positioned and transparent regions where new content should appear.
Use the edit endpoint with this expanded image and a mask marking the new regions.

E) Different model: Switch to another model from the options in section 3 and regenerate.

F) Keep: Proceed to manifest update.

Loop through iterations until the designer is satisfied.

6. Manifest Update

After finalizing an asset, update assets-manifest.json. Read references/asset-manifest-schema.md for the full schema.

Steps:

Check if assets-manifest.json exists in the output directory. If not, create it with the project name and empty assets array.
Read the existing manifest to determine the next asset ID (increment from the last asset-NNN).
Build the asset entry with all required fields: id, type, file path (relative), provider, model, prompt or searchQuery, parameters, usedIn context, and ISO 8601 timestamp.
For Unsplash assets: record unsplashId and unsplashCredit (photographer name). Trigger the download tracking endpoint — this is required by Unsplash API terms.
Append the new asset to the array, update totalAssets, and write the manifest.

# Trigger Unsplash download tracking
curl -s "https://api.unsplash.com/photos/${PHOTO_ID}/download" \
  -H "Authorization: Client-ID ${UNSPLASH_ACCESS_KEY}"

Report the final asset summary: file path, dimensions, model used, and manifest entry ID.