muapi/sd-2-omni-reference-train

Training

Train a reusable character from a reference photo. Once training completes, reference the character in Omni Reference video prompts with @omni-character:<request_id> to generate videos that feature the character consistently.

Related Models

sd-2-t2v

SD 2.0 is the latest multimodal video generation model by ByteDance, offering advanced camera control, native audio-video sync, and high-resolution output.

Text to Video

sd-2-watermark-remover

🎉 FREE for a limited time — Remove SD 2.0 watermarks from videos using LaMa AI inpainting. Automatically detects the watermark region, builds a precise mask via Canny edge detection, and inpaints each frame for artifact-free results. No credits deducted — requires a positive balance to access.

Video to Video

sd-2-video-watermark-remover-pro

SD 2 Video Watermark Remover Pro uses the SD 2 AI model to remove watermarks, logos, and overlaid text from videos with high accuracy. Powered by ByteDance's SD 2 engine, it delivers superior quality compared to traditional inpainting approaches. Pricing: $0.013 per second, minimum charge for 5 seconds ($0.065).

Video to Video

sd-2-i2v-480p

SD 2.0 480p image-to-video generation. Faster and more cost-effective than the 720p variant, ideal for previews and drafts.

Image to Video

sd-2-omni-reference

SD 2.0 Omni Reference — generate videos with visual consistency using reference images, videos, and audio. Maintain character identity, style, and scene continuity. Supports up to 9 images, 3 video clips, and 3 audio clips. Use @image1, @video1, @audio1 syntax in your prompt.

Image to Video

sd-2-i2v

SD 2.0 is the latest multimodal video generation model by ByteDance, offering advanced camera control, native audio-video sync, and high-resolution output.

Image to Video

sd-2-video-edit

SD 2.0 Video Edit modifies existing videos based on text prompts and optional reference images.

Video to Video

sd-2-extend

SD 2.0 Extend Video continues an existing SD 2.0 generated video seamlessly. Provide the original request ID and an optional prompt to guide the extension — the model preserves visual style, motion, characters, and audio consistency across the new segment. Optional image, video, and audio references can be supplied to steer the extension: user-supplied references map to @image2…@image9, @video1…@video3, @audio1…@audio3 in the prompt (the source video's last frame is always @image1).

Text to Video

sd-2-character

[Beta] Turn fictional character references into reusable video characters. Upload reference images and describe the outfit to get a character_id you can use in SD 2.0 Omni Reference.

Image to Image

sd-2-omni-reference-480p

SD 2.0 480p Omni Reference — generate videos with visual consistency using reference images, videos, and audio at 480p resolution. More cost-effective than the 720p variant. Supports up to 9 images, 3 video clips, and 3 audio clips. Use @image1, @video1, @audio1 syntax in your prompt.

Image to Video

sd-2-t2v-480p

SD 2.0 480p text-to-video generation. Faster and more cost-effective than the 720p variant, ideal for previews and drafts.

Text to Video

sd-2-vip-extend

SD 2.0 VIP Extend Video continues an existing SD 2.0 generated video seamlessly at 720p. Provide the original request ID and an optional prompt to guide the extension — the model preserves visual style, motion, characters, and audio consistency across the new segment. Optional image, video, and audio references can be supplied to steer the extension: user-supplied references map to @image2…@image9, @video1…@video3, @audio1…@audio3 in the prompt (the source video's last frame is always @image1).

Text to Video

sd-2-vip-extend-1080p

SD 2.0 VIP Extend Video 1080p continues an existing SD 2.0 generated video seamlessly at 1080p resolution. Provide the original request ID and an optional prompt to guide the extension — the model preserves visual style, motion, characters, and audio consistency across the new segment. Optional image, video, and audio references can be supplied to steer the extension: user-supplied references map to @image2…@image9, @video1…@video3, @audio1…@audio3 in the prompt (the source video's last frame is always @image1).

Text to Video

Overview

About this model

Train a reusable character identity from a single reference photo. The trained character can then be used across multiple Omni Reference video generations by referencing it with @omni-character:<request_id> in your prompt — maintaining consistent appearance, style, and identity across different scenes and motion types.

  1. Character Consistency: Create a recurring character for a video series or story, keeping their appearance identical across all generated clips.
  2. Brand Mascots: Train a brand character once and reuse them in unlimited video campaigns.
  3. Storytelling: Build a cast of characters and direct them in different scenes without re-uploading reference images each time.

Pricing & Value

Cost analysis

muapiapp: $0.50 per character

One-time training cost per character. Trained characters can be reused in unlimited Omni Reference video generations.

Fal.ai: Not available

Character training for Omni Reference-style generation is not available on Fal.ai.

Replicate: Not available

Character training for Omni Reference-style generation is not available on Replicate.

* Competitor pricing is estimated based on similar model architectures and usage tiers.
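
The one-time pricing can be worked through with a short calculation. The following is a minimal sketch that only uses the $0.50 per-character figure stated above; the function names are illustrative, and the cost of the Omni Reference video generations themselves is not covered on this page and is not included.

```python
from decimal import Decimal

# One-time training price per character, taken from the pricing above.
TRAINING_COST_USD = Decimal("0.50")


def training_cost(num_characters: int) -> Decimal:
    """Total one-time training cost for a cast of characters."""
    if num_characters < 0:
        raise ValueError("num_characters must be non-negative")
    return TRAINING_COST_USD * num_characters


def amortized_training_cost(num_characters: int, num_videos: int) -> Decimal:
    """Training cost spread over the videos that reuse the characters.

    Video generation charges are NOT included here.
    """
    if num_videos <= 0:
        raise ValueError("num_videos must be positive")
    return training_cost(num_characters) / num_videos


if __name__ == "__main__":
    print(training_cost(3))                # cost of training three characters
    print(amortized_training_cost(1, 10))  # one character reused in ten videos
```

Because the charge is per character rather than per use, the training share of each video shrinks as the character is reused.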

Technical Details

Configuration schema

Reference Image (string)

Reference photo of the character (JPEG/PNG/WebP). One clear face portrait works best.

Default Value: https://d3adwkbyhxyrtq.cloudfront.net/webassets/videomodels/seedance-v2.0-omni-reference.png

Character Name (string)

Name for this character. The name is a label for your own reference; in Omni Reference prompts the character is referenced by its request ID, as @omni-character:<request_id>.

Default Value: Juno

Description (string)

Optional description of the character.

Default Value: A brave explorer with piercing blue eyes
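
The three parameters above could be assembled into a request body along the following lines. This is a sketch under assumptions: the page lists only display names, so the JSON keys `image_url`, `character_name`, and `description` are hypothetical, and checking the file extension of the URL is just one plausible way to enforce the JPEG/PNG/WebP requirement on the client side.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlparse

# Formats listed for the Reference Image parameter.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


@dataclass
class TrainCharacterInput:
    image_url: str                 # Reference Image (hypothetical key name)
    character_name: str = "Juno"   # Character Name, default from the schema
    description: str = ""          # Description is optional

    def to_payload(self) -> dict:
        """Validate the fields and return a JSON-serializable dict."""
        path = urlparse(self.image_url).path
        ext = os.path.splitext(path)[1].lower()
        if ext not in ALLOWED_EXTENSIONS:
            raise ValueError(f"unsupported image format: {ext or 'none'}")
        if not self.character_name.strip():
            raise ValueError("character_name must not be empty")
        payload = {
            "image_url": self.image_url,
            "character_name": self.character_name.strip(),
        }
        if self.description:
            payload["description"] = self.description
        return payload


if __name__ == "__main__":
    example = TrainCharacterInput(
        image_url="https://example.com/portraits/juno.png",
        description="A brave explorer with piercing blue eyes",
    )
    print(example.to_payload())
```

Check the actual key names against the API reference before sending a request; only the validation logic is meant to carry over.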

Implementation Guide

Developer documentation

How to Use Omni Reference Train Character

  1. Upload a reference photo: Provide a clear, well-lit portrait photo of the character. A single face photo works best — avoid group shots or heavily obscured faces.

  2. Set a character name: Give the character a memorable name (e.g. 'Alex'). This name is for your reference only and does not affect generation.

  3. Submit and wait: Training takes approximately 20–60 seconds. You'll receive a request_id when the job completes.

  4. Use in Omni Reference: In any Omni Reference video prompt, reference the trained character with @omni-character:<request_id>. Example: @omni-character:abc123ef walking through a neon-lit city at night.

  5. Combine with other references: You can combine character references with image, video, and audio references in the same prompt.
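
Steps 4 and 5 can be sketched as a small prompt helper. The `@omni-character:<request_id>` syntax comes from this page; the helper names are illustrative, and the assumption that a request ID contains only letters, digits, hyphens, and underscores is a guess made so that an unreplaced `<request_id>` placeholder is rejected.

```python
import re

# Assumed request ID alphabet (not documented on this page).
REQUEST_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")
CHARACTER_TOKEN_RE = re.compile(r"@omni-character:([A-Za-z0-9_-]+)")


def character_token(request_id: str) -> str:
    """Return the prompt token for a trained character."""
    if not REQUEST_ID_RE.match(request_id):
        raise ValueError(f"invalid request_id: {request_id!r}")
    return f"@omni-character:{request_id}"


def build_prompt(request_id: str, action: str) -> str:
    """Prefix a scene description with the character token."""
    return f"{character_token(request_id)} {action.strip()}"


def referenced_characters(prompt: str) -> list[str]:
    """List the request IDs of all characters referenced in a prompt."""
    return CHARACTER_TOKEN_RE.findall(prompt)


if __name__ == "__main__":
    prompt = build_prompt("abc123ef", "walking through a neon-lit city at night")
    print(prompt)
    # Other references such as @image1 or @audio1 can share the same prompt.
    combined = build_prompt("abc123ef", "dancing to @audio1 in front of @image1")
    print(referenced_characters(combined))
```

Building the token in one place makes it harder to send a prompt that still contains the literal `<request_id>` placeholder.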

Common Questions

Frequently asked

How do I use the trained character in a video?

After training completes, use @omni-character:<request_id> in any Omni Reference video prompt. Replace <request_id> with the request_id returned by this endpoint. The character will be automatically included as an image reference.

What makes a good reference photo?

A clear, front-facing portrait with good lighting works best. Avoid group photos, heavy shadows, extreme angles, or images where the face is partially obscured.

Can I use the same character in multiple videos?

Yes. Once trained, you can reuse the character's request_id in as many Omni Reference prompts as you like — you only pay the $0.50 training cost once.

How long does training take?

Training typically completes in 20–60 seconds. The request will time out after 5 minutes if training does not complete.
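
A client that waits for training might mirror these timings with a polling loop. The sketch below is not tied to any documented endpoint: `fetch_status` is a caller-supplied function, and the `"completed"` and `"failed"` status strings are hypothetical placeholders for whatever the real status response contains. Only the 5-minute limit is taken from this page.

```python
import time
from typing import Callable


def wait_for_training(
    fetch_status: Callable[[], dict],
    timeout_s: float = 300.0,       # 5-minute limit stated above
    poll_interval_s: float = 5.0,   # assumed interval; training takes ~20-60 s
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until training finishes, fails, or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")  # hypothetical field name
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"training failed: {result}")
        if clock() >= deadline:
            raise TimeoutError("training did not complete within the timeout")
        sleep(poll_interval_s)


if __name__ == "__main__":
    # Simulated responses stand in for real status calls.
    fake = iter([
        {"status": "processing"},
        {"status": "completed", "request_id": "abc123ef"},
    ])
    done = wait_for_training(lambda: next(fake), sleep=lambda s: None)
    print(done["request_id"])
```

Passing `clock` and `sleep` as parameters keeps the loop testable without real waiting.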