name: fan-cam
description: >
  Create personalized live sports broadcast fan-cam videos with genmedia. Use this for realistic spectator cutaways, stadium or arena crowd reactions, broadcast screenshots, sports TV shots, scoreboard overlays, TV channel bugs, and identity-preserving fan reaction videos from a user photo.
Fan cam production with genmedia
Use this skill when the user wants a personalized spectator video that feels like a real live sports broadcast cutaway. The usual input is one photo of the person, event details, and a desired reaction or situation.
Runtime is the genmedia CLI. Use the genmedia skill for command syntax. Load
model-routing, fal-prompting, and genmedia-workflow when endpoint choice,
model-specific prompt craft, or pipeline execution details matter.
Do not encode private examples, local file paths, user-specific workflow names, or conversation-specific details into prompts or docs. Keep this skill generalized.
References
Load only what is needed:
- references/prompt-contract.md for the image prompt and Kling prompt rules.
- references/genmedia-commands.md for executable CLI command patterns.
- references/examples.md for sport-specific examples.
Required inputs
Ask only for missing information that changes execution:
- User photo: local path or URL. This is the identity reference.
- Event details: sport, matchup, venue, league, broadcast context, wardrobe, scoreboard idea, crowd behavior, and any specific scenario.
- Reaction or situation: excited, happy, laughing, sad, neutral, angry, surprised, nervous, focused, eating, distracted, caught on camera, noticing the stadium screen, celebrating, disappointed, or another user-specified moment.
- Budget or quality preference only when the user explicitly asks for economy,
preview, or native 4K final output. Otherwise use the standard fan-cam
defaults: GPT Image 2 edit at quality=high with a 3840x2160 frame, then
Kling v3 Pro.
If the user gives a local image path, upload it once with genmedia upload and
reuse the returned URL. If the user gives multiple references, treat the first
person image as the identity source and later images as optional venue,
broadcast, or styling references.
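The reference-ordering rule above can be sketched as a small helper. This is a minimal illustration, not part of genmedia: `split_references` is a hypothetical name, and it assumes the URLs were already returned by genmedia upload and that the first entry is the person photo.

```python
def split_references(image_urls: list[str]) -> tuple[str, list[str]]:
    """Split uploaded reference URLs into (identity photo, optional extras).

    Hypothetical helper: assumes the list is in the order the user supplied it
    and that the first entry shows the person to feature.
    """
    if not image_urls:
        raise ValueError("a user photo is required as the identity reference")
    identity_url = image_urls[0]
    # Later images are optional venue, broadcast, or styling references.
    extra_urls = image_urls[1:]
    return identity_url, extra_urls
```

If the first image does not clearly show the person, ask the user which image is the identity source rather than guessing.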
The user photo is an identity reference, not a Kling-ready start frame. Do not
skip GPT Image 2 edit just because the user supplied a person's photo. For a
personalized fan-cam, first use openai/gpt-image-2/edit to place the person
inside a realistic 16:9 broadcast scene, then use the approved generated frame
as Kling start_image_url.
Pipeline
Default graph:
photo URL -> prompt planning -> GPT Image 2 edit frame -> optional compression -> Kling v3 image-to-video -> downloaded video manifest
The GPT Image 2 edit frame is mandatory when the input is an ordinary person photo. Only bypass this step if the user explicitly provides an already approved 16:9 broadcast fan-cam frame and asks to animate that frame.
The planning step is performed by the agent using this skill. Do not call a separate LLM endpoint just to write prompts unless the user explicitly asks for a hosted planner. Write the image prompt and Kling multi prompts directly.
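The default graph and its single bypass condition can be sketched as a stage planner. This is an illustrative sketch only: `plan_stages` and its flags are hypothetical names, and the stage labels simply mirror the graph above.

```python
def plan_stages(has_approved_frame: bool = False,
                frame_needs_compression: bool = False) -> list[str]:
    """Return the ordered fan-cam stages for one request.

    has_approved_frame is True only when the user explicitly supplies an
    already approved 16:9 broadcast fan-cam frame and asks to animate it.
    """
    stages = ["prompt planning"]
    if not has_approved_frame:
        # Mandatory whenever the input is an ordinary person photo.
        stages.append("GPT Image 2 edit frame")
    if frame_needs_compression:
        stages.append("compression")
    stages += ["Kling v3 image-to-video", "downloaded video manifest"]
    return stages
```

Prompt planning stays in the list as a step the agent performs itself; it is not a call to a separate endpoint.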
Endpoint selection
Always verify endpoints before use:
genmedia models --endpoint_id openai/gpt-image-2/edit --json
genmedia models --endpoint_id fal-ai/kling-video/v3/standard/image-to-video --json
genmedia models --endpoint_id fal-ai/kling-video/v3/pro/image-to-video --json
genmedia models --endpoint_id fal-ai/kling-video/v3/4k/image-to-video --json
Inspect schemas before running:
genmedia schema openai/gpt-image-2/edit --json
genmedia schema fal-ai/kling-video/v3/pro/image-to-video --format openapi --json
Use --format openapi for Kling v3 image-to-video endpoints because compact
schema output may omit top-level fields such as multi_prompt,
start_image_url, duration, prompt, elements, shot_type,
negative_prompt, and cfg_scale.
Check pricing when cost matters:
genmedia pricing openai/gpt-image-2/edit --json
genmedia pricing fal-ai/kling-video/v3/standard/image-to-video --json
genmedia pricing fal-ai/kling-video/v3/pro/image-to-video --json
genmedia pricing fal-ai/kling-video/v3/4k/image-to-video --json
GPT Image 2 quality choice
- Use quality=high by default for personalized fan-cam frames. GPT Image 2 price is strongly affected by low vs high, but fan-cam identity, broadcast integration, and readable overlays need the stronger default.
- Use quality=low only when the user explicitly requests economy, preview, fast iteration, or lower-cost social drafts.
- Use output_format=jpeg for the generated broadcast frame unless the user needs transparency or lossless output.
- Use 16:9 4K frame size by default: {"width":3840,"height":2160}
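As a rough sketch, the defaults above could be collected into one argument dictionary before submission. The `quality`, `output_format`, and `image_size` names come from this skill; `image_edit_args`, the `prompt` key, and the `image_urls` key are assumptions, so confirm the real field names with genmedia schema before use.

```python
def image_edit_args(photo_url: str, prompt: str, economy: bool = False,
                    output_format: str = "jpeg") -> dict:
    """Build arguments for the GPT Image 2 edit step (hypothetical helper)."""
    return {
        "prompt": prompt,
        # Assumed input field name; verify against the endpoint schema.
        "image_urls": [photo_url],
        # quality=low only on an explicit economy or preview request.
        "quality": "low" if economy else "high",
        # Override jpeg only when transparency or lossless output is needed.
        "output_format": output_format,
        # Default 16:9 4K frame.
        "image_size": {"width": 3840, "height": 2160},
    }
```

The sketch keeps the 4K frame size in economy mode because this skill does not specify a smaller preview size.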
Kling v3 choice
Select the endpoint based on the brief:
- fal-ai/kling-video/v3/pro/image-to-video: default fan-cam endpoint. Use it for normal personalized sports cutaways, public examples, and any request where the user did not explicitly ask for economy.
- fal-ai/kling-video/v3/standard/image-to-video: use only when the user explicitly asks for economy, preview, fastest iteration, or lower cost.
- fal-ai/kling-video/v3/4k/image-to-video: use only for final premium 4K delivery or when the user explicitly asks for 4K video. Check pricing first.
Do not choose from memory alone. Verify model status and schema with genmedia in the current session.
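The selection rule can be sketched as follows. The endpoint IDs are the ones listed above; `choose_kling_endpoint` and its flags are hypothetical, and treating a simultaneous economy and 4K request as an error is an assumption made for this sketch.

```python
KLING_PRO = "fal-ai/kling-video/v3/pro/image-to-video"
KLING_STANDARD = "fal-ai/kling-video/v3/standard/image-to-video"
KLING_4K = "fal-ai/kling-video/v3/4k/image-to-video"


def choose_kling_endpoint(wants_economy: bool = False,
                          wants_4k: bool = False) -> str:
    """Pick a Kling v3 endpoint from explicit user preferences only."""
    if wants_economy and wants_4k:
        # Assumption: conflicting preferences go back to the user.
        raise ValueError("economy and 4K were both requested; ask the user which one wins")
    if wants_4k:
        return KLING_4K  # check pricing first
    if wants_economy:
        return KLING_STANDARD
    return KLING_PRO  # default fan-cam endpoint
```

The returned ID still has to be verified with genmedia models and genmedia schema in the current session.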
Shot and duration planning
The agent decides the number and duration of multi prompts.
Hard rules:
- Each multi prompt must be at least 3 seconds.
- Total video duration must be 15 seconds or less.
- Use 2 to 5 multi prompts.
- Set the top-level Kling duration equal to the sum of all beat durations.
- If a real user-provided or approved reference is supplied through Kling elements, every multi prompt must reference @Element1. Do not invent extra elements just to satisfy a prompt pattern.
- Keep every Kling prompt concise. Aim for 250-430 characters.
Recommended patterns:
- Simple cutaway: 2 beats, 6 seconds total.
- Standard reaction: 3 beats, 9 seconds total.
- Rich fan-cam moment: 4 beats, 12 seconds total.
- Full story beat: 5 beats, 15 seconds total.
Do not always use five beats. Pick the smallest number that expresses the moment clearly.
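The hard rules above reduce to a few arithmetic checks. A minimal sketch, assuming beat durations are whole seconds; `validate_beats` and `within_prompt_target` are hypothetical helper names.

```python
def validate_beats(durations: list[int]) -> int:
    """Check the hard rules and return the top-level Kling duration."""
    if not 2 <= len(durations) <= 5:
        raise ValueError("use 2 to 5 multi prompts")
    if any(seconds < 3 for seconds in durations):
        raise ValueError("each multi prompt must be at least 3 seconds")
    total = sum(durations)
    if total > 15:
        raise ValueError("total video duration must be 15 seconds or less")
    return total


def within_prompt_target(prompt: str) -> bool:
    """Soft check for the 250-430 character target."""
    return 250 <= len(prompt) <= 430
```

For example, the standard reaction pattern `validate_beats([3, 3, 3])` returns 9, while `[8, 8]` is rejected because the total exceeds 15 seconds.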
Scene planning
The fan-cam does not need to be only a zoom on the spectator. Design the scene from the event details:
- A nervous fan watching a decisive point.
- A supporter eating or drinking when the broadcast camera catches them.
- A spectator noticing themselves on the stadium screen.
- A quiet tennis audience reaction during a tiebreak.
- A basketball lower-bowl fan reacting to a buzzer-beater.
- A race grandstand spectator turning toward a pass or crash offscreen.
- A combat sports crowd cutaway during a tense round.
- A watch-party or esports arena reaction if the user specifies it.
Keep the whole video anchored to the generated frame. Use motion, camera correction, crowd behavior, expression changes, and offscreen event energy to create the sequence.
Broadcast logo and overlay
Add a small top-right TV channel bug when it fits the brief. It should feel sport-specific and broadcast-realistic, but generic unless the user supplies an exact approved logo or explicitly requests a named network.
Good generic examples:
- FOOTBALL LIVE
- COURT LIVE
- BASKET LIVE
- RACE LIVE
- FIGHT LIVE
- MATCH CAM
Use compact score or timing overlays when the event calls for them. Keep them small, integrated, and secondary to the spectator. Avoid fake sponsor marks, large UI graphics, unstable text, and logos that dominate the frame.
Image prompt requirements
The GPT Image 2 edit prompt must:
- Use the uploaded photo as the identity reference.
- Preserve the real face, age impression, skin tone, hair, facial hair, glasses, face structure, asymmetry, pores, wrinkles, blemishes, and ordinary imperfections.
- Create a horizontal 16:9 live TV broadcast screenshot.
- Place the person naturally in the spectator area.
- Make the selected reaction or situation visible but not theatrical.
- Include sport-specific venue, crowd, wardrobe, scoreboard, and broadcast language.
- Include realistic TV capture flaws: mild compression noise, subtle motion blur, off-center crop, foreground occlusion, focus falloff, imperfect background faces, natural venue light, and small exposure inconsistencies.
- Include a small top-right broadcast channel bug when appropriate.
The image prompt must avoid:
- Beauty retouching, AI influencer face, changed face anatomy, enlarged eyes, jawline sharpening, face slimming, porcelain skin, waxy skin.
- Studio portrait, passport photo, selfie framing, isolated subject, pasted face, face cutout, empty background.
- Fake sponsor marks, oversized logos, warped scoreboard text, random props not requested by the user, CGI crowd, cloned faces, anime, cartoon.
Kling prompt requirements
The Kling prompts must:
- Reference @Element1 in every beat only when the request actually includes a real user-provided or approved Kling elements entry. Otherwise describe the featured spectator from the start_image_url; do not invent extra elements.
- Always submit Kling with generate_audio=true. Do not use generate_audio=false in this skill.
- When using multi_prompt, do not send end_image_url; Kling rejects end_image_url together with multi_prompt.
- Preserve the same person, face, outfit, seat area, crowd, overlay, lighting, and channel bug.
- Animate realistic broadcast motion: small head movement, blinking, breath, slight hand motion, food/drink gesture if present, nearby fans shifting, camera push-in, pan, sidestep, operator correction, or crowd swell.
- Use sport-specific language. Never write generic alternatives like "field or court" or "stadium or arena".
- Fit the chosen beat duration.
- If a spoken phrase should be external narration, do not write it as something the featured spectator says. Phrase it as an off-screen broadcast commentator, arena PA voice, or non-diegetic voiceover, and explicitly state that the featured spectator stays silent with no lip sync and no mouth movement matching the voice.
- Avoid face morphing, beautification, unstable scoreboard text, unstable logo, wrong sport, impossible crowd action, excessive camera movement, and sudden scene resets.
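The structural rules above (audio always on, no end_image_url with multi_prompt, @Element1 only with a real elements entry) can be sketched as a request builder. The top-level field names come from this skill; `build_kling_request`, the per-beat `{"prompt", "duration"}` shape, and the integer duration type are assumptions to confirm against the openapi schema.

```python
def build_kling_request(start_image_url, beats, negative_prompt, elements=None):
    """Build a Kling v3 image-to-video request (hypothetical helper).

    beats is a list of (prompt, seconds) pairs; elements is passed through
    unchanged and only when a real approved reference exists.
    """
    for index, (prompt, _) in enumerate(beats, start=1):
        has_tag = "@Element1" in prompt
        if elements and not has_tag:
            raise ValueError(f"beat {index} must reference @Element1")
        if not elements and has_tag:
            raise ValueError(f"beat {index} references @Element1 without an elements entry")
    request = {
        "start_image_url": start_image_url,
        # Assumed per-beat shape; verify with the endpoint schema.
        "multi_prompt": [{"prompt": p, "duration": s} for p, s in beats],
        # Top-level duration equals the sum of beat durations.
        "duration": sum(s for _, s in beats),
        "negative_prompt": negative_prompt,
        "generate_audio": True,
    }
    # end_image_url is deliberately never set alongside multi_prompt.
    if elements:
        request["elements"] = elements
    return request
```

The builder only covers structure. Prompt content, such as sport-specific language and silent-spectator narration phrasing, still has to be written per beat.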
Negative prompt
Use a negative prompt like this and adapt only when needed:
low quality, smeared face, distorted faces, duplicated face, deformed hands, broken fingers, fake sponsor marks, oversized logos, unstable broadcast logo, watermark, text artifacts, unstable broadcast banner, flickering scoreboard, warped scoreboard text, unreadable names, passport photo, studio portrait, glamour portrait, beauty lighting, AI influencer, beautified face, changed face, enlarged eyes, sharpened jawline, pasted face, face cutout, over-smoothed skin, plastic skin, waxy skin, CGI crowd, cloned crowd, anime, cartoon, excessive camera movement, wrong sport, wrong venue
Quality gate
Before returning:
- The selected endpoint and schema were verified with genmedia.
- GPT Image 2 edit used quality=high and image_size={"width":3840,"height":2160} unless the user explicitly requested an economy or preview run.
- If the input was a person photo, a GPT Image 2 edit frame was generated and approved before Kling. The raw person photo was not sent directly to Kling as the fan-cam start frame.
- Kling used fal-ai/kling-video/v3/pro/image-to-video unless the user explicitly requested economy/preview or native 4K video.
- Multi prompt durations are each at least 3 seconds.
- Total duration is 15 seconds or less.
- Top-level Kling duration equals the sum of beat durations.
- Every beat references @Element1 when a real Kling elements entry is used.
- No invented extra elements were added.
- Kling request uses generate_audio=true.
- Kling request does not combine multi_prompt with end_image_url.
- Any intended external narration is not lip-synced to the featured spectator; verify the mouth does not move like the voice belongs to the person on camera.
- The generated frame is below Kling image limits. If not, compress it.
- The person remains recognizable and not beautified.
- The broadcast bug and scoreboard are small and stable enough.
- Final files were downloaded with --download.
Return a compact manifest with endpoint IDs, request IDs, model settings, prompts used, output URLs, downloaded files, and any visible defects.
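One possible shape for that manifest, as a hedged sketch: the listed contents come from the sentence above, while `build_manifest`, the key names, and the per-step record layout are hypothetical and not a genmedia output format.

```python
def build_manifest(steps, downloaded_files, visible_defects=()):
    """Assemble a compact manifest from per-step records (hypothetical layout).

    Each step is a dict with endpoint_id, request_id, settings, prompts,
    and output_urls, filled in from the actual run.
    """
    return {
        "endpoint_ids": [step["endpoint_id"] for step in steps],
        "request_ids": [step["request_id"] for step in steps],
        "model_settings": {step["endpoint_id"]: step["settings"] for step in steps},
        "prompts": {step["endpoint_id"]: step["prompts"] for step in steps},
        "output_urls": [url for step in steps for url in step["output_urls"]],
        "downloaded_files": list(downloaded_files),
        "visible_defects": list(visible_defects),
    }
```

Record only values observed in the run. If a request ID or output URL is missing, leave it out rather than reconstructing it.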