24-ai-avatar-production-global - SKILL.md Agent Skill

name: 24-ai-avatar-production-global
description: "AI Avatar production pipeline for global markets: 3-tier tools (Free/Pro/Enterprise), 4 workflows (single avatar, translate, batch, hybrid), reference image intake for avatar prompts, face/style/logo/palette replacement workflows, voice clone, anti-detection, QA Score 100. Has 4 region variants for DISCLOSURE LAW (US FTC, EU AI Act, SEA per country, LATAM mixed). Tools: HeyGen, Synthesia, ElevenLabs, Captions, Rask AI. Trigger: 'AI avatar', 'HeyGen', 'Synthesia', 'avatar AI video', 'talking head AI', 'AI video translate', 'batch AI video', 'avatar reference image', 'AI avatar prompt', 'replace avatar face'."
metadata:
  version: 1.1.0
  category: content
  license: MIT
  triggers:
    - "AI avatar"
    - "HeyGen"
    - "Synthesia"
    - "avatar AI video"
    - "talking head AI"
    - "AI video translate"
    - "batch AI video"
    - "avatar reference image"
    - "AI avatar prompt"
    - "replace avatar face"
  related:
    - 25-voice-clone-podcast-global
    - 04-script-video-global
    - 26-thought-leadership-content-global
    - references/ai-video-disclosure-global
    - references/voice-clone-prompts-global

AI Avatar Production (Global): 3-Tier Pipeline, 4 Workflows, QA Score 100

Flagship skill of the AI Content cluster. Covers the full pipeline from zero to publish, voice clone, anti-detection, and region-specific disclosure law.

For newbies

What is an AI Avatar?

An AI avatar video shows your face (or a stand-in) with AI-generated voice and motion. You provide one photo or a short selfie video; the AI produces a final video with natural-looking speech, gestures, and expressions. No filming crew, no studio, no actor required.

What do you need to start?

Method	Requirement	Quality
Portrait photo	1 forward-facing photo, clean background, 1024x1024+	Medium — mouth less natural
Selfie video	30s video, looking at the lens, speaking naturally	Good — better lipsync
Custom avatar	2-5 min recording with teleprompter + lavalier mic	Excellent — near photo-real

Minimum gear: Phone with HD front camera + lavalier mic (or headset mic).

How long does it take?

One single video (60s): 30-60 min (script + render)
Batch of 10: 1-2 days
Batch of 30: 4-5 days (with optimized process)

What does it cost?

Tier	USD/month	Output
Free	$0	1-3 videos, watermark
Pro	$30-100	10-30 videos, no watermark
Enterprise	$200-500+	30+ videos, custom avatar, API

5 common newbie mistakes

Lipsync drift: Script read too fast, or voice does not match the avatar -> slow the speech rate by 10-15% and use a voice clone instead of the default voice.
Voice doesn't sound like you: Sample too short or noisy -> re-record 3-5 minutes in a quiet room with a phonetically varied script.
Video flagged as "AI content": Platform pattern detection -> see Anti-detection section below.
Blurry / pixelated output: Low-quality input -> use 1024x1024+ photo, natural lighting, no filters.
Slow render: Free tier queue -> render during the tool's off-peak hours (for example, US night, which is early morning in some timezones) or upgrade to Pro.

Information collection (4 questions max)

Ask up to 4 questions before starting:

Primary use case? Brand awareness / Sales / Education / Internal training?
Primary platform? TikTok / YouTube / Facebook / Instagram / LinkedIn / X / Threads?
Budget tier? Free ($0) / Pro ($30-100/mo) / Enterprise ($200+/mo)?
Videos per month target? 1-5 / 10-30 / 30+?

Based on the 4 answers, auto-select Tier + Workflow. If the user has already uploaded reference images, do not ask a long intake form first; classify the images, create the setup/prompt, then ask only for missing assets.
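The tier half of that auto-selection could be sketched as below. This is a minimal sketch, not a definitive rule set: the function names are hypothetical, the thresholds are read off the tier tables in this document, and treating budget as a hard cap (picking the lower of the two candidate tiers) is an assumption. The mapping from answers to a workflow is not specified here, so it is left out.

```python
TIERS = ["Free", "Pro", "Enterprise"]  # ordered from cheapest to most expensive


def tier_for_budget(budget_usd: float) -> str:
    """Map a monthly budget to a tier (assumed cut-offs: $30 and $200)."""
    if budget_usd >= 200:
        return "Enterprise"
    if budget_usd >= 30:
        return "Pro"
    return "Free"


def tier_for_volume(videos_per_month: int) -> str:
    """Map a monthly video target to a tier (assumed cut-offs: 5 and 30)."""
    if videos_per_month > 30:
        return "Enterprise"
    if videos_per_month > 5:
        return "Pro"
    return "Free"


def select_tier(budget_usd: float, videos_per_month: int) -> tuple[str, bool]:
    """Return (tier, mismatch). Budget is treated as a hard cap (assumption),
    so the lower of the two candidate tiers wins; mismatch flags a conflict
    between budget and volume that is worth raising with the user."""
    by_budget = tier_for_budget(budget_usd)
    by_volume = tier_for_volume(videos_per_month)
    chosen = min(by_budget, by_volume, key=TIERS.index)
    return chosen, by_budget != by_volume


print(select_tier(50, 20))  # budget and volume both point at Pro
print(select_tier(0, 20))   # volume wants Pro, budget only allows Free
```

When the mismatch flag is set, it is probably better to ask the user which constraint matters more than to pick silently.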

Tier decision — Tools and pricing

Tier	Suggested tool	Price/month	Quality	Limit	Fits
Free	Captions Free, HeyGen Trial, D-ID Trial	$0	6/10 — watermark, limited duration	1-5 videos, max 60s/video	Personal test, new freelancers
Pro	HeyGen Creator ($29), Synthesia Starter ($29), ElevenLabs Pro ($22)	$30-100	8/10 — no watermark, HD	10-30 videos, max 5 min/video	SME, small agency, content creator
Enterprise	HeyGen Business ($89+), Synthesia Enterprise (custom)	$200-500+	9.5/10 — custom avatar, API, priority render	30+ videos, unlimited	Large agency, large brand, e-learning

Quick recommendations:

Just starting: HeyGen Trial (1 video free, full experience)
Serious but budget-limited: Captions Pro ($10/mo) for lipsync + ElevenLabs Starter ($5) for voice
Scale fast: HeyGen Creator + ElevenLabs Pro = best price/quality combo
Enterprise: Synthesia Enterprise + ElevenLabs Scale

Workflow 1: Single Avatar Production

One video, end-to-end in 30-60 minutes.

6-step process

Step	Task	Tool	Time
1. Script	150-300 words for a 60s video	Skill `04-script-video-global`	10 min
2. Voice	Generate or use voice clone	ElevenLabs / HeyGen Voice	5 min
3. Avatar	Pick stock avatar or upload your media	HeyGen / Synthesia / D-ID	3 min
4. Render	Combine voice + avatar, choose background, gestures	Tool from step 3	5-15 min (render)
5. QA	QA Score 100 review (see section below)	Manual review	5 min
6. Publish	Export MP4 -> post to platform	Manual / Scheduler	2 min

Script template for AI Avatar (60s)

[HOOK — 3s] Curiosity hook, frame the problem
[PROBLEM — 10s] Describe the customer pain
[SOLUTION — 25s] Your solution, 2-3 key points
[PROOF — 12s] Numbers, testimonial, result
[CTA — 10s] Concrete action: "Link in bio for..."
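To turn the segment timings above into a rough word budget, a small calculation helps. The sketch below assumes a speaking rate of 150 words per minute, which is an illustrative default and not a figure from this skill; pass a different rate to match the chosen voice.

```python
# Segment durations in seconds, taken from the 60s script template.
SEGMENTS = [
    ("HOOK", 3),
    ("PROBLEM", 10),
    ("SOLUTION", 25),
    ("PROOF", 12),
    ("CTA", 10),
]


def word_budget(words_per_minute: int = 150) -> dict[str, int]:
    """Approximate word count per segment, rounded down.
    words_per_minute is an assumed speaking rate, not a tool setting."""
    return {name: seconds * words_per_minute // 60 for name, seconds in SEGMENTS}


total_seconds = sum(seconds for _, seconds in SEGMENTS)
budget = word_budget()
print(total_seconds)         # 60
print(budget)                # {'HOOK': 7, 'PROBLEM': 25, 'SOLUTION': 62, 'PROOF': 30, 'CTA': 25}
print(sum(budget.values()))  # 149
```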

Workflow 2: Multi-language translate

One source video -> many languages for global rollout. Use cases: DTC brand expanding markets, multi-language courses, multi-country agency work.

Tool comparison

Tool	Languages	Price	Notes
Rask AI	130+	$50/mo (Pro)	Strongest translation option at the time of writing
HeyGen Translate	40+	Included Creator+	Built-in, convenient
Synthesia Translate	35+	Included Enterprise	Best for e-learning

Process

Create source video (Workflow 1)
Upload to translate tool (Rask AI recommended)
Pick target language — tool auto-translates and lipsyncs
Review with a native speaker
Export and publish per market

Caveat: Tonal languages (Mandarin, Vietnamese, Thai) have weaker lipsync. Workaround: produce native voice clone + native avatar per language.

See full disclosure law per region in the variant files.

Workflow 3: Batch Production

An assembly-line process. The 5-day cycle below yields 10 videos; repeat the cycle or enlarge each day's batch to reach 30.

Detailed timeline

Day	Task	Output	Tool
Day 1	Script batch — write 10 scripts from template	10 scripts (.md)	Skill `04-script-video-global` + AI assist
Day 2	Voice batch — render 10 audio files	10 audio (.mp3)	ElevenLabs API
Day 3	Avatar batch — upload audio + avatar, queue render	10 videos rendering	HeyGen Batch / Synthesia
Day 4	QA batch — review 10 videos, fix issues, re-render	10 QA'd videos	Manual + QA Score
Day 5	Publish batch — export, add captions, schedule	10 videos published	Buffer / Later / Manual

Repeat for 3 weeks to reach 30 videos, or scale Days 1-2 to 15 scripts per week.
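The cycle arithmetic can be sketched as follows, assuming the 5-day cycle from the timeline and a configurable batch size per cycle. The function name is hypothetical.

```python
import math

DAYS_PER_CYCLE = 5  # Day 1 script -> Day 5 publish, per the timeline


def plan_batches(target_videos: int, videos_per_cycle: int = 10) -> tuple[int, int]:
    """Return (cycles, working_days) needed to reach target_videos."""
    cycles = math.ceil(target_videos / videos_per_cycle)
    return cycles, cycles * DAYS_PER_CYCLE


print(plan_batches(30))      # (3, 15): three 5-day cycles of 10 videos
print(plan_batches(30, 15))  # (2, 10): two cycles at 15 videos per cycle
```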

Cost estimate batch 30 videos/month

Tier	Tool combo	Monthly cost	Per-video cost
Free	HeyGen Trial + Captions Free	$0 (limited 3-5 videos)	$0 (watermark)
Pro	HeyGen Creator + ElevenLabs Pro	~$51	~$1.70
Enterprise	HeyGen Business + ElevenLabs Scale	~$189	~$6.30
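The per-video figures in the table are simply the monthly cost divided by 30 videos. A minimal sketch of that arithmetic, using only the totals listed above (the function name is hypothetical):

```python
def per_video_cost(monthly_cost_usd: float, videos_per_month: int) -> float:
    """Monthly tool cost spread evenly over the videos produced that month."""
    if videos_per_month <= 0:
        raise ValueError("videos_per_month must be positive")
    return round(monthly_cost_usd / videos_per_month, 2)


print(per_video_cost(51, 30))   # 1.7  (Pro combo)
print(per_video_cost(189, 30))  # 6.3  (Enterprise combo)
```

Producing fewer videos on the same plan raises the per-video cost, so it is worth rerunning the number with the real monthly output.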

Batch optimization tips

Templated scripts: 3-5 frameworks, swap the core content
Voice consistency: One voice clone for the entire series
Off-peak rendering: Queue renders overnight to avoid peak-hour wait times
QA checklist: Print the QA scorecard and check each video against it, assembly-line style

Workflow 4: Hybrid Real + AI

Real footage of your face for trust + AI-generated footage for the rest of the video for speed.

Use cases

Real face intro 5s + AI-generated main segment 55s (save filming time)
AI video weekdays + Real video weekly (balance quality/effort)
Real talking head + AI B-roll (studio-grade output)

Assembly + tools

Film real intro 5-10s (eye contact, natural greeting); use Captions for lipsync fixes
Create AI for the rest with same outfit/background (HeyGen / Synthesia)
Edit in CapCut / Premiere (precise cuts, smooth transitions)
Color match AI to real footage (LUT or DaVinci Resolve free)

Trust gain: Real face up front -> 20-35% more engagement than full-AI.

Voice Clone Protocol

Voice sample requirements

Criterion	Requirement
Duration	3-5 minutes
Quality	WAV/FLAC, 44.1kHz+, mono, quiet room
Script content	Phonetically varied passages (all vowels, hard consonants)
Emotion	Read in a normal, natural tone; do not act
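The measurable requirements above (duration, sample rate, mono) can be checked before uploading. The sketch below uses Python's standard `wave` module, so it only handles WAV input; FLAC would need a third-party library. The function name and the decision to treat 3-5 minutes as hard limits are assumptions.

```python
import wave

MIN_SECONDS = 3 * 60
MAX_SECONDS = 5 * 60
MIN_SAMPLE_RATE = 44_100


def check_voice_sample(source) -> list[str]:
    """Return a list of problems found in a WAV voice sample.
    source is a file path or a binary file object. An empty list means the
    sample meets the duration, sample-rate, and mono requirements."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate

    problems = []
    if not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(f"duration {duration:.0f}s is outside 3-5 minutes")
    if rate < MIN_SAMPLE_RATE:
        problems.append(f"sample rate {rate} Hz is below 44.1 kHz")
    if channels != 1:
        problems.append(f"{channels} channels found, expected mono")
    return problems
```

Room noise and phonetic coverage still need a human listen; this only catches the mechanical issues.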

Tool comparison

Tool	Price	Quality	Notes
ElevenLabs	From $5/mo	9/10	Best overall, 30+ languages
HeyGen Voice	Included Creator+	6/10	Convenient if using HeyGen
Resemble AI	From $99/mo	7/10	Strong API
PlayHT	From $39/mo	7/10	Good for narration

Consent form template

MANDATORY before cloning anyone's voice.

VOICE USAGE CONSENT

I, [FULL NAME], consent to [COMPANY] using my voice for: [SPECIFIC PURPOSE].
Term: [X months / Until revoked]
Date: [YYYY-MM-DD]
Signature: _______________

Reference: See references/voice-clone-prompts-global.md

Avatar Setup Checklist

Before recording / uploading photo or video for an AI avatar:

Lighting: Natural light or softbox; no harsh shadows on the face
Background: Solid (white / gray) or real environment (office, store)
Wardrobe: On-brand; avoid small busy patterns (they cause moiré artifacts in AI renders)
Framing: Chest up; eyes on the upper-third line
Eye contact: Look directly at the lens (not the screen)
Gestures: Natural; hands can rest or do light gestures
Resolution: Minimum 1080p (1920x1080); 4K preferred
Aspect ratio: 9:16 (TikTok / Reels), 16:9 (YouTube), 1:1 (Feed)
File format: MP4 (H.264) for video, PNG / JPG for photo
Backup: Keep originals on cloud (Google Drive / OneDrive) before uploading to the tool
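The resolution and aspect-ratio items in the checklist can be verified from the frame dimensions alone. A minimal sketch, assuming "minimum 1080p" means the shorter side is at least 1080 pixels (that reading, and the function name, are assumptions):

```python
from math import gcd

# Aspect ratios and their platforms, as listed in the checklist.
PLATFORM_RATIOS = {
    (9, 16): "TikTok / Reels",
    (16, 9): "YouTube",
    (1, 1): "Feed",
}


def check_frame(width: int, height: int) -> dict:
    """Reduce width:height to its simplest ratio and check the 1080p minimum."""
    divisor = gcd(width, height)
    ratio = (width // divisor, height // divisor)
    return {
        "ratio": f"{ratio[0]}:{ratio[1]}",
        "fits": PLATFORM_RATIOS.get(ratio),  # None if the ratio is not listed
        "meets_1080p": min(width, height) >= 1080,
    }


print(check_frame(1080, 1920))  # {'ratio': '9:16', 'fits': 'TikTok / Reels', 'meets_1080p': True}
print(check_frame(1280, 720))   # {'ratio': '16:9', 'fits': 'YouTube', 'meets_1080p': False}
```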

Reference Image -> Avatar Prompt Director

Use this when the user drops one or more reference images and wants to create an avatar, replace a face, adapt brand colors, add a logo, or create the prompt before uploading assets into a tool.

Classify Input Images

Image type	Role	Requirement
Style ref	Mood, lighting, background, outfit, camera angle	Do not use as identity unless requested
Face ref	Identity preservation / face replacement	1-3 clear face images, no filter, front + 3/4 angle
Selfie video	Better custom avatar / natural lipsync	30s-2 min, looking at camera, speaking naturally
Logo/palette	Personal/company brand adaptation	PNG/SVG logo + 2-4 hex colors
Product/location	Prop or avatar environment	Clear product label or location/background image

Multiple Images = Multiple Flows

## Avatar Flows

| Flow | Input image | Role | Suggested tool | Missing assets |
|------|-------------|------|----------------|----------------|
| A | style-01 | style/background | Design Master -> HeyGen | face ref, logo |
| B | face-01 | identity | HeyGen custom avatar | script, voice sample |

If every image is a different style direction, create a separate prompt for each flow.
If images support one avatar, group by role: style + face + logo + palette + product.
Ask for each next asset explicitly: face image, selfie video, logo, hex colors, script, voice sample.
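The grouping and follow-up rules above could be sketched like this. The role labels, asset names, and function names are illustrative; the asset list mirrors the one in the last rule.

```python
from collections import defaultdict

# Assets to request explicitly, in the order given above.
REQUIRED_ASSETS = [
    "face image", "selfie video", "logo", "hex colors", "script", "voice sample",
]


def group_by_role(images: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Group (image_name, role) pairs by role for a single-avatar flow."""
    grouped = defaultdict(list)
    for name, role in images:
        grouped[role].append(name)
    return dict(grouped)


def next_assets_to_request(provided: set[str]) -> list[str]:
    """List the assets still missing, preserving the request order."""
    return [asset for asset in REQUIRED_ASSETS if asset not in provided]


images = [("style-01", "style"), ("face-01", "face"), ("face-02", "face")]
print(group_by_role(images))
# {'style': ['style-01'], 'face': ['face-01', 'face-02']}
print(next_assets_to_request({"face image", "logo"}))
# ['selfie video', 'hex colors', 'script', 'voice sample']
```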

Prompt Setup Output

## Avatar Prompt Setup — Flow A

- Style ref:
- Face ref:
- Brand assets:
- Target platform:
- Tool route:

## Copy-Paste Visual Prompt
[English prompt for avatar/source image generation]

## Upload Next
- Face/selfie video:
- Logo:
- Brand colors:
- Voice sample:
- Script:

For a static personal avatar only, route to 30-design-master-global personal-brand mode. For talking-head video, continue this workflow.

Anti-detection for FB / IG / TikTok / YouTube

5 detection signals and fixes

Signal	Platforms flagging	Fix
Stiff face, no natural blinking	FB, IG	Use selfie video over photo; pick avatars with micro-expressions
Monotone voice, no natural pauses	TikTok, FB	Use voice clone (natural pacing) over default TTS
Fully static background	FB, IG	Add slight noise/grain, or use real-world background
Isolated motion (only mouth moves)	TikTok	Pick avatars with gesture (hands, head); use HeyGen v3+
Metadata flagged as AI tool	YouTube (monetize)	Re-export through CapCut (strips metadata); add color grade

Techniques to add "human feel"

Add film grain / noise: 2-5% in CapCut or Premiere
Zoom and crop: 5-10% crop with subtle motion (Ken Burns)
Color grade: Apply film LUT or manually grade — avoid "too clean"
Text overlay: Add subtitles, callouts, stickers to cover AI weak spots
B-roll insert: Drop 2-3 b-roll clips (product, lifestyle) every 15-20s
Sound design: Background music + light SFX (immersion + masks AI voice)

Per platform

TikTok: Most lenient — content quality wins over AI checks
Facebook / Instagram: Moderate scrutiny — anti-detection matters
LinkedIn: Practically no detection — best fit for AI avatars
YouTube: Strict for monetized videos — must disclose per YPP policy

CRITICAL: NEVER use AI avatars to impersonate real people without consent. This is illegal in many jurisdictions and is grounds for permanent platform bans.

Ethics and Disclosure — Region selector

Disclosure laws differ dramatically by region. Pick the matching variant:

Region	Variant file	Key law
US / Canada	`variants/01-us.md`	FTC Endorsement Guides (16 CFR Part 255), 2023 update
EU / EEA / UK	`variants/02-eu.md`	EU AI Act Article 50 (always disclose) + UCPD + GDPR; the UK is not an EU member, so UK-only publishing falls under UK rules such as UK GDPR
Southeast Asia	`variants/03-sea.md`	Per-country: ASAS (SG), AKARI (ID), DTI (PH), MCMC (MY), TH
Latin America	`variants/04-latam.md`	CONAR + LGPD (BR), PROFECO (MX), AAIP (AR), per-country

ALWAYS read the matching variant BEFORE publishing AI avatar content in that region. Penalties range from warning to multi-thousand-USD fines per influencer (US) and can stack under EU AI Act + GDPR.
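A lookup like the one below can make the "read the variant first" step hard to skip in an automated pipeline. It is a minimal sketch: the region labels and file paths come from the table above, while the function name and the choice to fail loudly on unknown regions are assumptions.

```python
VARIANTS = {
    "US / Canada": "variants/01-us.md",
    "EU / EEA / UK": "variants/02-eu.md",
    "Southeast Asia": "variants/03-sea.md",
    "Latin America": "variants/04-latam.md",
}


def variant_for(region: str) -> str:
    """Return the disclosure variant file for a region, or raise if the
    region is not covered, so publishing never proceeds on a guess."""
    try:
        return VARIANTS[region]
    except KeyError:
        raise ValueError(f"No disclosure variant for region: {region!r}") from None


print(variant_for("Southeast Asia"))  # variants/03-sea.md
```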

Universal disclosure rule of thumb

When in doubt, disclose. Disclosure is rarely penalized; non-disclosure can be.

"This video uses AI Avatar technology for visuals and voice."

Placement: video description, first 3 seconds on-screen text, OR platform "AI-generated" tag (where available — Meta, TikTok, YouTube all now support this).

QA Score — 100 points

Scorecard

#	Criterion	Points	Description
1	Lipsync	/10	Mouth tracks speech within 0.2s
2	Voice match	/10	Voice sounds like the speaker (if clone) or natural (if TTS)
3	Visual quality	/10	Sharp image, no artifacts, no blur
4	Background	/10	Background suits context, no render glitches
5	Lighting	/10	Even light, no harsh shadows, matches background
6	Gesture	/10	Natural, no jitters, hand/head movement present
7	Script flow	/10	Hook -> Problem -> Solution -> Proof -> CTA
8	Disclosure	/10	AI disclosure compliant with region (see variant)
9	Platform fit	/10	Correct aspect ratio, duration, format for platform
10	CTA	/10	Clear call-to-action, easy to execute

Action thresholds

Tier	Score	Action
Excellent	90-100	Publish now
Good	70-89	Publish, note improvements for next round
Needs fix	50-69	Fix items scoring under 7, then re-render
Redo	<50	Rebuild from script + voice + avatar
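The scorecard and thresholds translate directly into a small function. The sketch below assumes each criterion is scored as an integer from 0 to 10; the function name is hypothetical, and the action strings are copied from the threshold table.

```python
CRITERIA = [
    "Lipsync", "Voice match", "Visual quality", "Background", "Lighting",
    "Gesture", "Script flow", "Disclosure", "Platform fit", "CTA",
]


def qa_action(scores: dict[str, int]) -> tuple[int, str, list[str]]:
    """Return (total, action, criteria_to_fix) for one video's scorecard."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"Missing scores for: {missing}")

    total = sum(scores[c] for c in CRITERIA)
    if total >= 90:
        action = "Publish now"
    elif total >= 70:
        action = "Publish, note improvements for next round"
    elif total >= 50:
        action = "Fix items scoring under 7, then re-render"
    else:
        action = "Rebuild from script + voice + avatar"

    # Items under 7 are the ones the "Needs fix" tier asks to repair.
    to_fix = [c for c in CRITERIA if scores[c] < 7]
    return total, action, to_fix


scores = {c: 8 for c in CRITERIA}
scores["Lipsync"] = 5
print(qa_action(scores))
# (77, 'Publish, note improvements for next round', ['Lipsync'])
```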

Output template

# AI Avatar Video — [Title] | [Region variant] | [Date]

1. Workflow used: [Single / Translate / Batch / Hybrid]
2. Script: [Content, 150-300 words]
3. Voice: [Tool] — [Voice ID / clone name] — Consent: [Yes / N/A]
4. Avatar: [Tool] — [Avatar ID / custom]
5. QA Score: [X]/100 (10 criteria)
6. Disclosure (per region variant): [Text + placement]
7. Publish: [Platform] — [Aspect ratio] — [Link]

Quality checklist

Information collection completed (4 questions)
Tier picked (Free / Pro / Enterprise) and aligns with budget + volume
Workflow picked (Single / Translate / Batch / Hybrid)
Voice clone consent recorded (if cloning a real person)
Avatar setup checklist completed before recording
Anti-detection techniques applied for the target platform
Region variant read and disclosure compliant
QA Score >= 70 before publishing

Related skills

25-voice-clone-podcast-global — voice clone deep-dive + podcast pipeline
04-script-video-global — script writing for AI avatar
26-thought-leadership-content-global — content strategy for personal brand
references/ai-video-disclosure-global — full legal reference
references/voice-clone-prompts-global — voice clone training prompts

Global Skill 24 (AI Avatar Production) | Over Powers Agency | v1.1.0