name: logo-composite
description: How the generate-then-composite pipeline puts a pixel-perfect canonical logo onto a generated image without letting the LLM regenerate the logo.
Logo composite
LLMs, including Gemini Pro, degrade complex logos. They smear gradients, drop text, and rearrange elements. The Assets app sidesteps this with a generate-then-composite pipeline: the model generates the scene with a blank corner, and the real logo file is composited on afterwards.
How it works
- The library has a canonicalLogoUrl (set via set-canonical-logo --libraryId --assetId). The asset's role is logo_reference.
- When generate-image --includeLogo=true runs, the prompt envelope adds: "Leave a clean uncluttered area in the upper-right for the real brand logo; do not draw or approximate the logo yourself."
- Gemini returns an image with empty space in that corner.
- compositeLogo() from server/lib/image-processing.ts (Sharp) loads the canonical logo PNG / SVG, resizes it to ~16% of the image width with a small inset, and composites it onto the generated image.
- Output: the image with the actual logo, pixel-perfect. If the source is SVG, it is rasterized at the target size, so edges stay crisp.
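The prompt-envelope step above can be sketched as a small pure function. This is a minimal illustration, not the app's actual code: buildPromptEnvelope and GenerateOptions are hypothetical names, and only the instruction sentence itself is taken from the text.

```typescript
// Hypothetical sketch: names are illustrative, not the Assets app's real API.
interface GenerateOptions {
  prompt: string;
  includeLogo: boolean;
}

// Instruction sentence quoted from the prompt envelope described above.
const LOGO_PLACEHOLDER_INSTRUCTION =
  "Leave a clean uncluttered area in the upper-right for the real brand logo; " +
  "do not draw or approximate the logo yourself.";

// Appends the placeholder instruction only when a logo will be composited later.
function buildPromptEnvelope(options: GenerateOptions): string {
  const parts = [options.prompt.trim()];
  if (options.includeLogo) {
    parts.push(LOGO_PLACEHOLDER_INSTRUCTION);
  }
  return parts.join("\n\n");
}
```

Keeping the instruction in one constant makes it easier to keep the wording ("upper-right") in step with where the compositing code places the logo.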
When to use it
- The user explicitly toggled "Use logo" in the Generate popover.
- The agent infers the user wants the logo (e.g. "make a hero with our brand logo").
- The image will appear in a customer-facing context where logo accuracy matters.
When NOT to use it
- Logo on a product (a t-shirt mockup, a billboard scene, a coffee cup). Compositing onto a flat corner is fine; compositing onto a curved or perspective surface needs mask-based inpainting that Gemini doesn't expose. v2 will use OpenAI gpt-image-1's edit API for this. v1: tell the user to mock that up in design.
- Multi-logo scenes (a partner-logo wall, a footer sponsor row). Same reason; deferred to v2.
Setting a canonical logo
upload reference image (role: logo_reference, category: logo) →
set-canonical-logo --libraryId=<id> --assetId=<asset-id>
set-canonical-logo flips the asset's role to logo_reference AND its status to reference. This means the reference selector won't pick up generated logo candidates as canonical — only intentionally pinned uploads.
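The role-and-status flip can be pictured with a small in-memory model. Everything here is an assumption made for illustration: the Asset and Library shapes, the "other" role, the "generated" status, and both function names are hypothetical. Only the logo_reference role, the reference status, and the canonicalLogoAssetId name come from this document.

```typescript
// Hypothetical data model: field names and the extra role/status values are assumptions.
type AssetRole = "logo_reference" | "other";
type AssetStatus = "reference" | "generated";

interface Asset {
  id: string;
  role: AssetRole;
  status: AssetStatus;
}

interface Library {
  id: string;
  canonicalLogoAssetId?: string;
}

// Pins an asset as the canonical logo: flips role AND status (mutating the asset),
// then returns a copy of the library that records the pinned asset id.
function setCanonicalLogo(library: Library, assets: Asset[], assetId: string): Library {
  const asset = assets.find((a) => a.id === assetId);
  if (!asset) {
    throw new Error(`Asset ${assetId} not found`);
  }
  asset.role = "logo_reference";
  asset.status = "reference";
  return { ...library, canonicalLogoAssetId: assetId };
}

// Only intentionally pinned assets qualify; generated logo candidates are skipped
// because their status is not "reference".
function selectLogoReferences(assets: Asset[]): Asset[] {
  return assets.filter((a) => a.role === "logo_reference" && a.status === "reference");
}
```

Filtering on both fields is what keeps a generated candidate with role logo_reference from being treated as canonical.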
Sharp composite parameters (current defaults)
In image-processing.ts:compositeLogo():
- Logo width: max(120, round(imageWidth * 0.16)), i.e. ~16% of the image width, but never smaller than 120 px.
- Inset: max(24, round(min(width, height) * 0.035)), i.e. ~3.5% of the smaller dimension, but never less than 24 px.
- Position: upper-right (top: inset, left: width - logoWidth - inset).
- Output format: PNG (preserves transparency).
If you change these, also update the corresponding language in the prompt envelope ("upper-right") so the LLM's clean area aligns with where Sharp will composite.
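To make the defaults concrete, here is the placement arithmetic as a standalone function with one worked example. It is a sketch built from the formulas listed above; computeLogoPlacement and LogoPlacement are hypothetical names, and the real compositeLogo() may be structured differently.

```typescript
// Sketch of the default placement math; function and type names are assumptions.
interface LogoPlacement {
  logoWidth: number;
  inset: number;
  top: number;
  left: number;
}

function computeLogoPlacement(imageWidth: number, imageHeight: number): LogoPlacement {
  // ~16% of the image width, never smaller than 120 px.
  const logoWidth = Math.max(120, Math.round(imageWidth * 0.16));
  // ~3.5% of the smaller dimension, never less than 24 px.
  const inset = Math.max(24, Math.round(Math.min(imageWidth, imageHeight) * 0.035));
  // Upper-right corner: inset from the top edge and from the right edge.
  return { logoWidth, inset, top: inset, left: imageWidth - logoWidth - inset };
}

// Worked example for a 1024 x 1024 image:
//   logoWidth = max(120, round(163.84)) = 164
//   inset     = max(24, round(35.84))   = 36
//   left      = 1024 - 164 - 36         = 824
const examplePlacement = computeLogoPlacement(1024, 1024);
console.log(examplePlacement); // { logoWidth: 164, inset: 36, top: 36, left: 824 }
```

With Sharp, logoWidth would typically feed the resize width for the logo, and top and left would go into the overlay options passed to composite(). The floors mean that on small images (for example 512 x 512, where the logo is clamped to 120 px) the logo takes up noticeably more than 16% of the width.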
Why not in-image text?
The same logic applies to body and headline text. Image models still smear small letters and rearrange long strings. The Assets app's prompt envelope explicitly says:
Do not render headlines, body text, UI labels, or prompt wording inside the image unless the user explicitly asks for exact visible text.
Overlay text in HTML/CSS in the calling app (slides, design, mail) — it's more reliable, more accessible, and the user can edit it without re-running the generation.
Failure modes & detection
- Gemini ignores the placeholder ask and renders something in the corner. The composite still works, but the hand-drawn-looking element underneath will peek out behind a transparent logo. Fix: re-roll, or ask the user to crop.
- The canonical logo's transparency is lost on a non-PNG source. Fix: re-upload as PNG; SVG works too via Sharp's rasterization.
- The user swaps a logo mid-generation. The action reads canonicalLogoAssetId at generate time, so racing here is rare; but the variant slot will reflect whichever logo was current when the call landed.
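The transparency failure mode could be caught before compositing with a simple pre-flight check on the logo file name. This is one possible sketch and not something the document says the app does: checkLogoSourceFormat, its return shape, and the warning text are all hypothetical. It treats only PNG and SVG as safe, matching the fix described above.

```typescript
// Hypothetical pre-flight check; the function name and warning text are assumptions.
interface LogoFormatCheck {
  ok: boolean;
  warning?: string;
}

// Per the failure modes above, PNG and SVG sources are treated as safe;
// anything else is flagged because its transparency may be lost.
function checkLogoSourceFormat(fileName: string): LogoFormatCheck {
  const dot = fileName.lastIndexOf(".");
  const extension = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (extension === "png" || extension === "svg") {
    return { ok: true };
  }
  return {
    ok: false,
    warning: `Logo source type "${extension || "unknown"}" may lose transparency; re-upload as PNG or SVG.`,
  };
}
```

A check on the extension alone is deliberately crude. It cannot tell whether a PNG actually has an alpha channel, so it is best treated as an early warning for the user rather than a guarantee.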