AI Avatar API

Generate lifelike AI avatars and talking-head videos from text or image inputs. One API for face synthesis, lip-sync, and custom digital human creation — unified key, unified billing.

Overview

What the AI Avatar API does

AI avatar generation turns a face image and a text script into a video of that person speaking — with accurate lip-sync, natural head movement, and optional emotion control. It's the infrastructure behind spokesperson videos, personalized video messages, and AI presenter tools.

Building this directly against individual providers means managing separate auth flows, output formats, watermark handling, and compliance rules for real-person generation. ImaRouter consolidates the integration: you pass a face reference and a script, we route to the right model and return the video.

Face image + text script → talking-head video in one API call
Lip-sync accuracy across English, Chinese, Spanish, and more
Emotion and expression control on supported models
Built-in compliance review layer for real-person video generation
Output: MP4 via CDN URL, consistent across all models

Supported models and generation types

ImaRouter integrates the leading digital human and avatar generation models. Each has a distinct profile for realism, speed, supported languages, and customization depth.

HeyGen Studio — High-realism avatar video, deep customization, enterprise-grade quality
Vidu Portrait — Fast portrait animation, good for social and UGC use cases
D-ID Creative Reality — Scalable talking-head generation, strong multilingual support
SadTalker — Open-weight face animation, image-driven with fine head pose control
MuseTalk — Real-time lip-sync for live streaming and interactive applications

Capabilities

What you can build

AI avatar generation unlocks product patterns that previously required expensive video production or on-camera talent. These are the integration patterns teams are using in production.

Personalized video outreach: generate unique spokesperson videos at scale for sales or marketing
E-learning content: produce instructor-led explainer videos from text scripts without studio time
Multilingual video localization: re-lip-sync existing video to a new language without re-recording
AI presenter tools: build a SaaS product where users create their own avatar video with one click
Customer support video: generate personalized video responses at support ticket scale
Brand ambassador content: create on-brand spokesperson clips from a single brand face asset

Real-person compliance layer

Generating video of real people requires content moderation to prevent misuse. ImaRouter routes real-person avatar requests through a human review layer before delivery — the same system used for Seedance real-person video generation.

Pass review_required: true in your request to enable the compliance flow. The review adds 1–4 hours to delivery time and ensures output meets commercial use standards. All reviewed outputs are logged with audit metadata.

Set review_required: true to route through the human review layer
Review SLA: 1–4 hours for standard requests, 30 min for priority review
Audit log: every reviewed job is stored with reviewer decision and timestamp
Non-compliant outputs are rejected with a structured reason code
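
Because review can add hours to delivery, a client that polls needs a longer wait budget for reviewed jobs. The sketch below adds the review_required flag to a request body and sizes a polling timeout from the SLA figures above. The 600-second generation budget and the local `priority` argument are assumptions for illustration; this page does not say how priority review is requested.

```python
# Upper bounds taken from the review SLA listed above.
STANDARD_REVIEW_MAX_SECONDS = 4 * 60 * 60   # 1-4 hours for standard requests
PRIORITY_REVIEW_MAX_SECONDS = 30 * 60       # 30 minutes for priority review


def with_review(payload: dict) -> dict:
    """Return a copy of the request body with the compliance flow enabled."""
    reviewed = dict(payload)
    reviewed["review_required"] = True
    return reviewed


def polling_timeout(payload: dict, generation_timeout: float = 600.0,
                    priority: bool = False) -> float:
    """Seconds to keep polling before giving up.

    generation_timeout is an assumed budget for rendering alone; the review
    window is added on top only when review_required is set.
    """
    if not payload.get("review_required"):
        return generation_timeout
    review_budget = (PRIORITY_REVIEW_MAX_SECONDS if priority
                     else STANDARD_REVIEW_MAX_SECONDS)
    return generation_timeout + review_budget
```

For long review windows, a callback_url is likely a better fit than holding a polling loop open for hours.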

Getting started

Submit a POST request to /v1/avatar/generate with your face image URL, script text, model preference, and output language. Use your ImaRouter API key. Avatar generation is asynchronous — you receive a job ID and poll for the completed video.

If you need real-person compliance review, add review_required: true to your request. The completed job response includes the video URL, model used, generation time, and cost in USD.

POST /v1/avatar/generate — async, returns {jobId, status: 'queued'}
Required params: face_image_url, script, model, language
GET /v1/jobs/{jobId} — poll for status: queued → processing → completed
Webhook: set callback_url to receive the completed video without polling
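
The submit-and-poll flow above can be sketched with the Python standard library. This is a minimal illustration, not an official client: the base URL is a placeholder, the Bearer authorization header is an assumption (this page only says to use your ImaRouter API key), and any terminal status other than completed is treated as a failure because failure status names are not listed here.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

# Placeholder host: the API base URL is not stated on this page.
BASE_URL = "https://api.imarouter.example"


def build_generate_payload(face_image_url: str, script: str, model: str,
                           language: str,
                           callback_url: Optional[str] = None) -> dict:
    """Assemble the JSON body for POST /v1/avatar/generate."""
    payload = {
        "face_image_url": face_image_url,
        "script": script,
        "model": model,
        "language": language,
    }
    if callback_url:
        # Optional: receive the completed video via webhook instead of polling.
        payload["callback_url"] = callback_url
    return payload


def call_api(method: str, path: str, api_key: str,
             body: Optional[dict] = None) -> dict:
    """Send one JSON request. Bearer auth is an assumption, not documented."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(fetch_job: Callable[[str], dict], job_id: str,
                 interval: float = 5.0, timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the job reports completed; stop on unknown status or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "completed":
            return job
        if status not in ("queued", "processing"):
            raise RuntimeError(f"job {job_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout}s")
        sleep(interval)


# Usage (needs a real key and host; the model and language values are hypothetical):
# body = build_generate_payload("https://example.com/face.png", "Hello!",
#                               "heygen-studio", "en")
# job = call_api("POST", "/v1/avatar/generate", api_key, body)
# done = wait_for_job(lambda jid: call_api("GET", f"/v1/jobs/{jid}", api_key),
#                     job["jobId"])
```

Passing the fetch function into wait_for_job keeps the polling logic independent of the HTTP layer, so it can be exercised with a stub before pointing it at the live endpoint.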

FAQ

Can I use my own face image for avatar generation?

Yes. Pass any face image URL in the face_image_url field. The image should be a clear front-facing photo with good lighting. For best results, use a 512×512 or larger image with the face occupying at least 50% of the frame.
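
A quick local check of the size guideline can catch undersized images before a job is submitted. The sketch below reads the pixel dimensions from a PNG header using only the standard library. It handles PNG only, and it reads "512×512 or larger" as both sides being at least 512 pixels, which is an interpretation rather than something stated above. It does not check how much of the frame the face occupies.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_SIDE = 512  # from the guideline above


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) from the IHDR chunk at the start of a PNG file."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at bytes 16-23.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_size_guideline(width: int, height: int,
                         min_side: int = MIN_SIDE) -> bool:
    """True when both sides are at least min_side pixels."""
    return width >= min_side and height >= min_side
```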

What languages are supported for lip-sync?

Language support varies by model. HeyGen Studio and D-ID support 40+ languages including English, Chinese (Mandarin and Cantonese), Spanish, French, German, Japanese, Korean, and Arabic. MuseTalk focuses on English and Chinese with real-time performance.

How does the compliance review work for real-person video?

When review_required: true is set, ImaRouter routes your job to a human review queue before delivering the output. A reviewer checks the output against commercial content standards and either approves delivery or rejects with a reason code. This process typically takes 1–4 hours.

What is the output resolution and format?

Avatar videos are delivered as MP4 files via CDN URL in the job completion response. Standard resolution is 720p. 1080p is available on HeyGen Studio and D-ID models. Output files are retained for 72 hours after generation — download and store before the window closes.
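
Since files expire 72 hours after generation, it helps to compute the deadline and copy the video to your own storage as soon as the job completes. A minimal sketch, assuming you already have the generation time as a timezone-aware datetime (the response field that carries it is not named on this page):

```python
import shutil
import urllib.request
from datetime import datetime, timedelta, timezone
from typing import Optional

RETENTION = timedelta(hours=72)  # retention window stated above


def download_deadline(generated_at: datetime) -> datetime:
    """Last moment the CDN file is expected to exist."""
    return generated_at + RETENTION


def is_still_available(generated_at: datetime,
                       now: Optional[datetime] = None) -> bool:
    """True while the current time is inside the retention window."""
    now = now or datetime.now(timezone.utc)
    return now < download_deadline(generated_at)


def save_video(video_url: str, destination: str) -> None:
    """Stream the MP4 from the CDN URL to a local file."""
    with urllib.request.urlopen(video_url, timeout=60) as response, \
            open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```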

Can I create a reusable avatar that I don't need to re-upload each time?

Yes. After your first generation, you can save the face asset to your ImaRouter account and reference it by asset ID in future requests. Saved assets are retained for 90 days, and you can re-upload the image at any time.
