muapi/sd-2-vip-text-to-video

Text to Video

SD 2 Text-to-Video VIP (Pro) by ByteDance. Generates high-quality cinematic video from a text prompt with priority routing, native audio-visual sync, up to 2K resolution, and 4–15 second duration.

🚀 Related Models


sd-2-omni-reference-no-video

SD 2 Omni Reference by ByteDance. Generate videos using up to 9 image references and up to 3 audio references. Reference images in your prompt with @image1, @image2, etc. and audio with @audio1, @audio2, etc.

Image to Video

sd-2-image-to-video-fast

SD 2 Image-to-Video (Fast) by ByteDance. Quickly animates a start-frame image into video with 4–15 second duration at reduced cost.

Image to Video

sd-2-first-last-frame

SD 2 First & Last Frame (Pro) by ByteDance. Generate video that transitions between two reference images. Provide 1 image for start-frame-only, or 2 images for both start and end frames.

Image to Video

sd-2-vip-image-to-video-fast

SD 2 Image-to-Video VIP Fast by ByteDance. Faster animation of a start-frame image with priority routing, 4–15 second duration, and 2K resolution.

Image to Video

sd-2-vip-first-last-frame-1080p

SD 2 First & Last Frame VIP 1080p by ByteDance. Generate 1080p video that transitions between two reference images with priority routing. Provide 1 image for start-frame-only, or 2 images for both start and end frames.

Image to Video

sd-2-vip-image-to-video-1080p

SD 2 Image-to-Video VIP 1080p by ByteDance. Animates a still image into a cinematic 1080p video with priority routing and 4–15 second duration.

Image to Video

sd-2-omni-reference-no-video-fast

SD 2 Omni Reference (Fast) by ByteDance. Quickly generate videos using up to 9 image references and up to 3 audio references at reduced cost. Reference images in your prompt with @image1, @image2, etc. and audio with @audio1, @audio2, etc.

Image to Video

sd-2-vip-omni-reference-fast

SD 2 Omni Reference VIP Fast by ByteDance. Faster video generation using up to 9 image references, up to 3 video clips, and up to 3 audio references with priority routing. Reference materials in your prompt with @image1…@image9, @video1…@video3, and @audio1…@audio3.

Image to Video

sd-2-vip-text-to-video-1080p

SD 2 Text-to-Video VIP 1080p by ByteDance. Generates cinematic 1080p video from a text prompt with priority routing, native audio-visual sync, and 4–15 second duration.

Text to Video

sd-2-text-to-video

SD 2 Text-to-Video (Pro) by ByteDance. Generates high-quality cinematic video from a text prompt with native audio-visual sync, up to 2K resolution, and 4–15 second duration.

Text to Video

sd-2-image-to-video

SD 2 Image-to-Video (Pro) by ByteDance. Animates a start-frame image into a high-quality video with native audio, 4–15 second duration, and 2K resolution.

Image to Video

sd-2-first-last-frame-fast

SD 2 First & Last Frame (Fast) by ByteDance. Quickly generate video that transitions between reference images at reduced cost. Provide 1 or 2 images.

Image to Video

sd-2-vip-image-to-video

SD 2 Image-to-Video VIP (Pro) by ByteDance. Animates a start-frame image into a high-quality video with priority routing, native audio, 4–15 second duration, and 2K resolution.

Image to Video

sd-2-text-to-video-fast

SD 2 Text-to-Video (Fast) by ByteDance. Generates video from text at faster speeds with 4–15 second duration and 2K resolution.

Text to Video

sd-2-vip-first-last-frame

SD 2 First & Last Frame VIP (Pro) by ByteDance. Generate video that transitions between two reference images with priority routing. Provide 1 image for start-frame-only, or 2 images for both start and end frames.

Image to Video

sd-2-vip-first-last-frame-fast

SD 2 First & Last Frame VIP Fast by ByteDance. Faster generation of video transitions between two reference images with priority routing.

Image to Video

sd-2-vip-text-to-video-fast

SD 2 Text-to-Video VIP Fast by ByteDance. Faster generation from a text prompt with priority routing, 4–15 second duration, and 2K resolution.

Text to Video

sd-2-vip-omni-reference

SD 2 Omni Reference VIP (Pro) by ByteDance. Generate videos using up to 9 image references, up to 3 video clips, and up to 3 audio references with priority routing. Reference materials in your prompt with @image1…@image9, @video1…@video3, and @audio1…@audio3. Also supports @omni-character:<char_id> for trained characters.

Image to Video

sd-2-vip-omni-reference-1080p

SD 2 Omni Reference VIP 1080p by ByteDance. Generate full HD videos using up to 9 image references, up to 3 video clips, and up to 3 audio references with priority routing. Reference materials in your prompt with @image1…@image9, @video1…@video3, and @audio1…@audio3.

Image to Video

📝 Overview

About this model

SD 2 Text-to-Video VIP (Pro) generates high-quality cinematic videos directly from text prompts using priority routing for faster queue times. Featuring native audio-visual synchronization, up to 2K resolution, and support for 4–15 second durations, this VIP tier delivers the same top-quality output as the standard pro model with reduced wait times.

1. Commercial Production: Generate professional-grade product and brand videos from descriptive prompts with priority delivery.
2. Content Creation: Produce cinematic short-form videos for social media and marketing campaigns with shorter queue waits.
3. Creative Projects: Explore AI-generated video art and storytelling with high-quality output and fast turnaround.

💰 Pricing & Value

Cost analysis

muapiapp: $0.30/sec (pro). VIP priority routing with the same high-quality output as standard pro.

Fal.ai: Not available. SD 2 VIP priority tier not available.

Replicate: Not available. SD 2 VIP priority tier not available.

* Competitor pricing is estimated based on similar model architectures and usage tiers.

⚙️ Technical Details

Configuration schema

Prompt (string)

Text description of the video to generate. Use @character:<id> to anchor the video to a Seedance 2 character; doing so automatically switches the request to image-to-video mode. Use @omni-character:<char_id> for a trained Kinovi character.

Default Value: A cinematic shot of a futuristic city at night with neon lights reflecting on wet streets.

Aspect Ratio (enum, 6 options)

Output video aspect ratio.

Default Value: 16:9

Duration (seconds) (int)

Video duration in seconds.

Default Value: 5
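
The schema above can be mirrored in a small client-side check before a request is sent. The sketch below is illustrative only: the field names (prompt, aspect_ratio, duration) are assumptions, since this page lists display labels rather than wire names, and TextToVideoInput is a hypothetical helper. The six aspect ratios and the 4–15 second range are the ones stated on this page.

```python
from dataclasses import dataclass, asdict

# Values stated on this page: six aspect ratios, durations of 4 to 15 seconds.
ASPECT_RATIOS = ("16:9", "9:16", "1:1", "4:3", "3:4", "21:9")
MIN_DURATION, MAX_DURATION = 4, 15


@dataclass(frozen=True)
class TextToVideoInput:
    # Field names are assumed; check the API reference for the real keys.
    prompt: str
    aspect_ratio: str = "16:9"  # documented default
    duration: int = 5           # documented default, in seconds

    def __post_init__(self):
        # Reject inputs that fall outside the documented schema.
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.aspect_ratio not in ASPECT_RATIOS:
            raise ValueError(f"aspect_ratio must be one of {ASPECT_RATIOS}")
        if isinstance(self.duration, bool) or not isinstance(self.duration, int):
            raise ValueError("duration must be an integer number of seconds")
        if not MIN_DURATION <= self.duration <= MAX_DURATION:
            raise ValueError(
                f"duration must be between {MIN_DURATION} and {MAX_DURATION}"
            )

    def to_payload(self) -> dict:
        """Return a plain dict suitable for JSON encoding."""
        return asdict(self)
```

Validating locally avoids spending a round trip on a request that falls outside the documented ranges.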

📖 Implementation Guide

Developer documentation

How to Use SD 2 VIP Text-to-Video

  1. Write your prompt: Describe your video scene in detail. Include motion cues ("camera panning right"), lighting ("golden hour"), style ("cinematic"), and subject details.

  2. Set aspect ratio: Choose from 16:9, 9:16, 1:1, 4:3, 3:4, or 21:9 depending on your target platform.

  3. Choose duration: Set between 4 and 15 seconds. Longer durations increase cost proportionally.

  4. Use character references: Include @character:<request_id> from an SD 2 Character generation to anchor the video to a specific character, or @omni-character:<char_id> for a trained character.

  5. Submit and poll: The API returns a request_id. Poll /predictions/{request_id}/result or use a webhook URL for completion notification.
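
The submit-and-poll flow in step 5 might look roughly like the sketch below. Only the /predictions/{request_id}/result path and the request_id field come from this page. The base URL, the submit path, the x-api-key header, the request field names, and the "status" values are placeholders assumed for illustration, so confirm them against the API reference before use.

```python
import json
import time
import urllib.request

# Placeholder values: the base URL and submit path are assumptions,
# not taken from this page.
BASE_URL = "https://api.example.invalid/v1"
MODEL = "sd-2-vip-text-to-video"


def http_json(method, url, api_key, body=None):
    """Send a JSON request and decode the JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    req = urllib.request.Request(url, data=data, method=method)
    req.add_header("Content-Type", "application/json")
    req.add_header("x-api-key", api_key)  # assumed header name
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def submit(api_key, prompt, aspect_ratio="16:9", duration=5, send=http_json):
    """Submit a generation job and return its request_id."""
    body = {"prompt": prompt, "aspect_ratio": aspect_ratio, "duration": duration}
    reply = send("POST", f"{BASE_URL}/{MODEL}", api_key, body)
    return reply["request_id"]


def wait_for_result(api_key, request_id, send=http_json,
                    interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll /predictions/{request_id}/result until the job finishes."""
    url = f"{BASE_URL}/predictions/{request_id}/result"
    deadline = time.monotonic() + timeout
    while True:
        reply = send("GET", url, api_key)
        # The "status" field and its values are assumed, not documented here.
        if reply.get("status") in ("completed", "failed"):
            return reply
        if time.monotonic() >= deadline:
            raise TimeoutError(f"request {request_id} did not finish in {timeout}s")
        sleep(interval)
```

The send and sleep parameters are injectable so the polling logic can be exercised without network access. A webhook, where available, removes the need for the loop entirely.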

Common Questions

Frequently asked

What makes VIP different from the standard text-to-video model?

VIP endpoints use priority routing which reduces queue wait times, making them ideal for time-sensitive workflows. Output quality and model capabilities are identical to the standard pro tier.

Can I use character references in VIP text-to-video?

Yes. Use @character:<request_id> from a completed SD 2 Character generation to anchor the video to a character identity. The request automatically switches to image-to-video mode with the character sheet as the reference. You can also use @omni-character:<char_id> for trained Kinovi characters.
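
Because an @character reference changes how the request is handled, it can help to detect these tokens before submitting. The following is a minimal sketch with hypothetical helper names. It assumes an ID is a run of letters, digits, underscores, or hyphens, which this page does not specify.

```python
import re

# Assumption: IDs consist of letters, digits, underscores, or hyphens.
# The page does not document the ID format.
REFERENCE_RE = re.compile(r"@(omni-character|character):([A-Za-z0-9_-]+)")


def find_character_refs(prompt: str) -> list[tuple[str, str]]:
    """Return (kind, id) pairs for each character reference in the prompt."""
    return REFERENCE_RE.findall(prompt)


def switches_to_image_to_video(prompt: str) -> bool:
    """True if the prompt contains an @character reference, which this page
    says moves the request to image-to-video mode."""
    return any(kind == "character" for kind, _ in find_character_refs(prompt))
```

For example, find_character_refs("@character:req_123 walks beside @omni-character:hero-7") returns [("character", "req_123"), ("omni-character", "hero-7")].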

How does cost scale with duration?

Cost is charged per second of video generated. At the $0.30/sec pro rate listed under Pricing & Value, a 5-second video costs $1.50 and a 10-second video costs $3.00. The fast tier is billed the same way at a lower per-second rate.
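
Per-second billing reduces to a single multiplication. Here is a minimal sketch, assuming the $0.30/sec pro rate shown under Pricing & Value applies to every second of output and that no minimum charge or rounding rule applies (neither is stated on this page). The function name estimate_cost is hypothetical.

```python
from decimal import Decimal

# Rate taken from the Pricing & Value section of this page.
PRO_RATE_PER_SEC = Decimal("0.30")


def estimate_cost(duration_sec: int, rate: Decimal = PRO_RATE_PER_SEC) -> Decimal:
    """Estimate the cost in USD of one video of the given length."""
    if not 4 <= duration_sec <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    # Decimal avoids binary floating-point drift in currency arithmetic.
    return rate * duration_sec
```

Under these assumptions, estimate_cost(5) returns Decimal("1.50") and estimate_cost(15) returns Decimal("4.50"). Pass a different rate to model another tier.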