muapi/vidu-q2-pro-text-to-video

Text to Video

Vidu Q2 Pro Text-to-Video generates cinematic, prompt-faithful clips from text alone with strong temporal consistency and rich detail at up to 1080p. Pick this when you need polished output without a reference frame.

Related Models

vidu-q2-turbo-image-to-video

Vidu Q2 Turbo Image-to-Video animates a starting image into a fast, prompt-guided clip while preserving subject identity. Built for speed and cost efficiency.

Image to Video

vidu-q2-reference

Vidu Q2 Reference Video generates breathtaking cinematic clips from text prompts guided by multiple reference images. Each image refines the model’s understanding of subject, environment, and visual tone — ensuring perfect consistency in appearance and motion across every frame.

Image to Video

vidu-q2-reference-to-image

VIDU Reference-to-Image Q2 generates new high-quality images based on one or more reference images. It preserves the key identity, structure, or style of the reference while creating a new scene, variation, or enhanced composition. Ideal for character consistency, object re-interpretation, stylized redesigns, and cinematic recreations guided by reference inputs.

Image to Image

vidu-q2-turbo-text-to-video

Vidu Q2 Turbo Text-to-Video is the fast, affordable Q2 tier for prompt-only generation. Use it for storyboards, social cuts, and high-volume work where speed and cost matter.

Text to Video

vidu-q2-turbo-start-end-video

Vidu Q2 Turbo Start–End Video creates highly detailed cinematic sequences by interpolating between two visual states — your start frame and end frame. Built for story moments, cinematic transformations, product reveals, and artistic transitions, it captures smooth motion, realistic lighting shifts, and dynamic camera movements while maintaining fidelity and emotional tone.

Image to Video

vidu-q2-pro-start-end-video

Vidu Q2 Pro Start–End Video is a professional-grade model built for cinematic transformation storytelling. It evolves a scene, subject, or concept from one moment to another through smooth visual interpolation, natural lighting transitions, and dynamic motion.

Image to Video

vidu-q2-text-to-image

VIDU Text-to-Image Q2 is a high-quality generative model focused on producing vivid, dynamic, and cinematic still images using natural language prompts. It excels at atmospheric depth, expressive lighting, surreal concepts, and motion-infused compositions typical of VIDU’s visual identity.

Text to Image

vidu-q2-pro-image-to-video

Vidu Q2 Pro Image-to-Video animates a single starting image into a smooth, prompt-guided clip up to 1080p while preserving subject identity, lighting, and composition.

Image to Video

Overview

About this model

Vidu Q2 Pro Text-to-Video generates cinematic, prompt-faithful clips from text alone, with strong temporal consistency, accurate motion, and rich detail at up to 1080p. Pick this when you want polished output without a reference frame.

1. Cinematic: Hero shots and stylised sequences for trailers and short films.
2. Marketing: Polished promotional clips and brand storytelling.
3. Concepting: Motion explorations of characters and environments.
4. Social: High-quality vertical or square clips for premium social posts.

Pricing & Value

Cost analysis

muapiapp: $0.20 per 5s clip at 720p

Same Q2 Pro pricing as the start-end variant; the price varies with resolution and duration.

Fal.ai: Not available

Vidu Q2 Pro text-to-video is not listed on Fal.ai at this time.

Replicate: Not available

Vidu Q2 Pro text-to-video is not listed on Replicate at this time.

* Competitor pricing is estimated based on similar model architectures and usage tiers.
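
As a rough budgeting aid, the listed muapiapp rate can be turned into a batch estimate. This is only a sketch: it covers the one price point stated here ($0.20 per 5-second clip at 720p), and the function name `estimate_batch_cost` is hypothetical. Prices for other durations or for 1080p are not given on this page, so the sketch does not attempt them.

```python
from decimal import Decimal

# The only rate stated on this page: $0.20 per 5-second clip at 720p.
PRICE_PER_5S_720P_CLIP = Decimal("0.20")


def estimate_batch_cost(num_clips: int) -> Decimal:
    """Estimated USD cost of `num_clips` five-second 720p clips.

    Hypothetical helper for illustration; other resolutions and
    durations are priced differently and are not covered here.
    """
    if num_clips < 0:
        raise ValueError("num_clips must be non-negative")
    return PRICE_PER_5S_720P_CLIP * num_clips
```

For example, `estimate_batch_cost(50)` gives 10.00, i.e. about $10 for fifty 5-second clips at 720p.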

Technical Details

Configuration schema

Prompt (string)
Text prompt describing the video.
Default value: A lone astronaut walks slowly across a cracked Martian plain at dusk, her boots kicking up rust-coloured dust. The camera tracks beside her in a slow dolly as twin moons rise over distant mesas, soft volumetric light spilling across her visor.

Resolution (enum, 2 options)
The resolution of the generated video.
Default value: 720p

Aspect Ratio (enum, 3 options)
Aspect ratio of the output video.
Default value: 16:9

Duration (int)
The duration of the generated video in seconds.
Default value: 5

Bgm (boolean)
Add background music to the output. When enabled, duration must be exactly 4 seconds.
Default value: false

Movement Amplitude (enum, 4 options)
The movement amplitude of objects in the frame.
Default value: auto
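
The constraints in this schema can be checked client-side before a request is sent. The sketch below is illustrative only: the function `build_payload` is hypothetical, and the JSON key names (`aspect_ratio`, `movement_amplitude`, and so on) are assumed to be snake_case versions of the labels above rather than confirmed field names. The allowed values (720p or 1080p; 16:9, 9:16, or 1:1; 2 to 8 seconds; 4 seconds when bgm is on) come from this page. Movement amplitude is passed through unchecked because only its default, `auto`, is listed here.

```python
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
MIN_DURATION_S, MAX_DURATION_S = 2, 8
BGM_DURATION_S = 4  # duration required when background music is enabled


def build_payload(prompt, resolution="720p", aspect_ratio="16:9",
                  duration=5, bgm=False, movement_amplitude="auto"):
    """Validate inputs and return a request body as a dict.

    Key names are assumptions (snake_case of the schema labels).
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        raise ValueError("duration must be between 2 and 8 seconds")
    if bgm and duration != BGM_DURATION_S:
        raise ValueError("duration must be exactly 4 seconds when bgm is enabled")
    return {
        "prompt": prompt,
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "duration": duration,
        "bgm": bgm,
        "movement_amplitude": movement_amplitude,
    }
```

Note that the defaults cannot be combined with background music as-is: the default duration is 5, so enabling bgm also means setting the duration to 4.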

Implementation Guide

Developer documentation

How to Use Vidu Q2 Pro Text-to-Video

  1. Write a vivid prompt: Describe subject, action, environment, lighting, and camera move.

  2. Pick resolution and aspect ratio: 1080p for hero output, 720p (the default) for a balance of quality and cost. Aspect ratio supports 16:9, 9:16, and 1:1.

  3. Set duration: 2–8 seconds (default 5).

  4. Optional bgm: Toggle background music if needed. When bgm=true, duration must be exactly 4 seconds.

  5. Submit: Generation is asynchronous. Poll /api/v1/predictions/{request_id}/result until the result is ready, or pass a webhook_url with the request.
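
The polling option in the last step might look like the sketch below. Only the result path comes from this guide. Everything else is an assumption: the response is assumed to be JSON with a `status` field, the values `"completed"` and `"failed"` are placeholders for whatever the service actually returns, and `fetch_json` is a hypothetical caller-supplied function that performs the authenticated GET against the API host and returns the decoded body.

```python
import time

# Path taken from the guide; the host and auth scheme are left to fetch_json.
RESULT_PATH = "/api/v1/predictions/{request_id}/result"


def poll_result(request_id, fetch_json, interval_s=2.0, max_attempts=60,
                sleep=time.sleep):
    """Poll the result endpoint until the job finishes or attempts run out.

    `fetch_json(path)` must return the decoded JSON body as a dict.
    The "status" key and its values are assumptions, not confirmed API.
    """
    path = RESULT_PATH.format(request_id=request_id)
    for attempt in range(max_attempts):
        body = fetch_json(path)
        status = body.get("status")
        if status == "completed":
            return body
        if status == "failed":
            raise RuntimeError(f"generation failed: {body}")
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"no result after {max_attempts} attempts")
```

Injecting `fetch_json` and `sleep` keeps the loop independent of any particular HTTP client and makes it easy to test with canned responses. For long or high-volume jobs, passing a webhook_url avoids repeated polling altogether.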

Common Questions

Frequently asked

What resolutions and aspect ratios does Vidu Q2 Pro support?

Resolutions: 720p (default) and 1080p. Aspect ratios: 16:9, 9:16, and 1:1.

How long can the generated video be?

Between 2 and 8 seconds. Default is 5 seconds.

When should I pick Pro over Turbo?

Pick Pro when fidelity, motion quality, and prompt adherence matter most. Pick Turbo for rapid iteration and high-volume generation.