muapi/vidu-q2-text-to-image

Text to Image

VIDU Text-to-Image Q2 is a high-quality generative model focused on producing vivid, dynamic, and cinematic still images using natural language prompts. It excels at atmospheric depth, expressive lighting, surreal concepts, and motion-infused compositions typical of VIDU’s visual identity.

Related Models

vidu-q2-turbo-image-to-video

Vidu Q2 Turbo Image-to-Video animates a starting image into a fast, prompt-guided clip while preserving subject identity. Built for speed and cost efficiency.

Image to Video
vidu-q2-reference

Vidu Q2 Reference Video generates cinematic clips from text prompts guided by multiple reference images. Each image refines the model’s understanding of subject, environment, and visual tone, which keeps appearance and motion consistent across frames.

Image to Video
vidu-q2-reference-to-image

VIDU Reference-to-Image Q2 generates new high-quality images based on one or more reference images. It preserves the key identity, structure, or style of the reference while creating a new scene, variation, or enhanced composition. Ideal for character consistency, object re-interpretation, stylized redesigns, and cinematic recreations guided by reference inputs.

Image to Image
vidu-q2-turbo-text-to-video

Vidu Q2 Turbo Text-to-Video is the fast, affordable Q2 tier for prompt-only generation. Use it for storyboards, social cuts, and high-volume work where speed and cost matter.

Text to Video
vidu-q2-pro-text-to-video

Vidu Q2 Pro Text-to-Video generates cinematic, prompt-faithful clips from text alone with strong temporal consistency and rich detail at up to 1080p. Pick this when you need polished output without a reference frame.

Text to Video
vidu-q2-turbo-start-end-video

Vidu Q2 Turbo Start–End Video creates detailed cinematic sequences by interpolating between two visual states: your start frame and your end frame. Built for story moments, transformations, product reveals, and artistic transitions, it produces smooth motion, realistic lighting shifts, and dynamic camera movement while maintaining fidelity and emotional tone.

Image to Video
vidu-q2-pro-start-end-video

Vidu Q2 Pro Start–End Video is a professional-grade model built for cinematic transformation storytelling. It evolves a scene, subject, or concept from one moment to another through smooth visual interpolation, natural lighting transitions, and dynamic motion.

Image to Video
vidu-q2-pro-image-to-video

Vidu Q2 Pro Image-to-Video animates a single starting image into a smooth, prompt-guided clip up to 1080p while preserving subject identity, lighting, and composition.

Image to Video

Overview

About this model

VIDU Text-to-Image Q2 is a generative model that turns natural language prompts into vivid, cinematic still images. It is strongest at atmospheric depth, expressive lighting, and surreal, motion-infused compositions in keeping with VIDU’s visual identity, which makes it a good fit for professionals who need to visualize dynamic, imaginative concepts.

The model is designed to produce rich textures and fine detail at resolutions up to 4k, and it supports eight aspect ratios so a composition can be matched to its target format. It suits cinematic storyboarding, digital art creation, and conceptual design, and it is driven entirely by plain natural-language prompts.

Use cases:

1. Creating cinematic storyboards that capture the mood and atmosphere of a scene.
2. Designing dynamic and surreal digital art pieces for visual storytelling.
3. Visualizing complex or novel concepts for creative projects and presentations.
4. Generating atmospheric backgrounds for video games and virtual environments.
5. Producing high-resolution images for marketing and advertising campaigns.

Pricing & Value

Cost analysis

muapiapp: $0.04 per generation

muapiapp is positioned as 20-50% more affordable than its competitors at comparable image quality. At the prices listed here the difference is about 33%: $0.04 versus an estimated $0.06 per generation.

Fal.ai: $0.06 per generation (estimated)

Replicate: $0.06 per generation (estimated)

* Competitor pricing is estimated based on similar model architectures and usage tiers.
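
The price comparison above comes down to simple arithmetic. The following is a minimal Python sketch that works it through, assuming the listed $0.04 price and the estimated $0.06 competitor price. The function names are illustrative and not part of any API.

```python
from decimal import Decimal

# Prices quoted on this page. The competitor figure is an estimate.
MUAPI_PRICE = Decimal("0.04")
ESTIMATED_COMPETITOR_PRICE = Decimal("0.06")


def batch_cost(price_per_generation, generations):
    """Total cost in dollars for a number of generations."""
    return price_per_generation * generations


def savings_percent(our_price, their_price):
    """Percentage saved per generation relative to the other price."""
    return (their_price - our_price) / their_price * 100


print(batch_cost(MUAPI_PRICE, 1000))
print(batch_cost(ESTIMATED_COMPETITOR_PRICE, 1000))
print(round(savings_percent(MUAPI_PRICE, ESTIMATED_COMPETITOR_PRICE), 1))
```

Decimal is used instead of float so that per-image prices multiply out to exact cent values over large batches.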


Technical Details

Configuration schema

Prompt (string)

Text prompt describing the image.

Default value: A colossal floating serpent made of shimmering stardust coils around a broken moon suspended in deep space. Each scale glows with shifting nebula colors, sending ripples of light across the void. Meteor fragments drift slowly around the creature, leaving trails of violet plasma. Beneath the serpent, a crystalline ring structure orbits the shattered moon, reflecting cosmic beams in intricate patterns. The background is a star field swirling into a spiral galaxy, with vibrant energy storms crackling along the horizon. Ultra-cinematic cosmic fantasy, high contrast, 8k detail, volumetric glow, deep space atmosphere.

Aspect Ratio (enum, 8 options)

Aspect ratio of the output image.

Default value: 1:1

Resolution (enum, 3 options)

The target resolution of the generated image.

Default value: 1k
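
As a rough illustration of the schema, the sketch below builds and validates a request payload in Python. The JSON field names (`prompt`, `aspect_ratio`, `resolution`) are assumptions, since the schema shows only display labels. The resolution set uses the three values mentioned elsewhere on this page. Because the full list of eight aspect ratios is not given here, the ratio is checked only for a `W:H` shape.

```python
import json
import re

# Assumption: the three resolution values mentioned on this page.
ALLOWED_RESOLUTIONS = {"1k", "2k", "4k"}
# The eight allowed ratios are not listed here, so only the "W:H" shape is checked.
ASPECT_RATIO_PATTERN = re.compile(r"^[1-9]\d*:[1-9]\d*$")


def build_payload(prompt, aspect_ratio="1:1", resolution="1k"):
    """Return a payload dict using the documented defaults (1:1, 1k)."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if not ASPECT_RATIO_PATTERN.match(aspect_ratio):
        raise ValueError(f"aspect_ratio must look like 'W:H', got {aspect_ratio!r}")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    # Field names below are hypothetical; check them against the real API schema.
    return {
        "prompt": prompt.strip(),
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
    }


payload = build_payload("A lighthouse in a storm, volumetric light, high contrast")
print(json.dumps(payload, indent=2))
```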

Implementation Guide

Developer documentation

How to Use VIDU Text-to-Image Q2

  1. Prepare Your Input

    • Write a detailed text prompt that describes the image you envision. Include specific visual elements like lighting, atmosphere, and mood to guide the generation process.
    • Choose the desired aspect ratio (e.g., 16:9, 1:1) that fits the context of your project.
    • Select the resolution (e.g., 1k, 2k, 4k) for the output image based on your quality needs.
  2. Submit Your Request

    • Use the provided API endpoint vidu-q2-text-to-image to send your input data.
    • Ensure your JSON payload includes all required fields, especially the prompt.
  3. Interpret the Results

    • Once processed, the model will return a URL pointing to your generated image.
    • Review the image to assess if it meets your creative expectations and, if necessary, adjust your prompt for refinements.
  4. Iterate and Refine

    • Experiment with different prompts and settings to explore a variety of artistic styles and compositions.
    • Use feedback from each iteration to enhance your subsequent inputs.
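
Steps 2 and 3 above might be sketched in Python as follows. Only the endpoint name `vidu-q2-text-to-image` and the fact that a URL is returned come from this guide. The base URL, the `x-api-key` header name, and the response layout are assumptions, so the helper builds the request without sending it and looks for the first URL anywhere in the response rather than relying on a particular field name.

```python
import json
import urllib.request


def build_request(base_url, api_key, payload):
    """Build (but do not send) a JSON POST request for the endpoint."""
    url = base_url.rstrip("/") + "/vidu-q2-text-to-image"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumption: key-based auth via an "x-api-key" header.
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def find_image_url(response_json):
    """Return the first http(s) URL found anywhere in decoded JSON, else None."""
    if isinstance(response_json, str):
        if response_json.startswith(("http://", "https://")):
            return response_json
        return None
    if isinstance(response_json, dict):
        children = response_json.values()
    elif isinstance(response_json, list):
        children = response_json
    else:
        return None
    for child in children:
        found = find_image_url(child)
        if found:
            return found
    return None


# Placeholder base URL and key; replace with values from your account.
request = build_request("https://api.example.com/v1", "YOUR_API_KEY", {"prompt": "A lighthouse in a storm"})
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Keeping request construction separate from sending makes the payload easy to inspect before any generation is billed.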

Common Questions

Frequently asked

What types of images can VIDU Text-to-Image Q2 generate?

VIDU Text-to-Image Q2 excels in creating high-quality, dynamic, and cinematic still images. It is particularly effective at rendering atmospheric depth, dramatic lighting, and surreal compositions that align with VIDU’s visual identity.

How do I choose the right aspect ratio and resolution?

The aspect ratio and resolution should be selected based on your project needs. For wider cinematic images, a 16:9 or 21:9 ratio is ideal, while standard formats like 1:1 or 4:3 work well for general purposes. The resolution (1k, 2k, or 4k) determines the detail and clarity of the output image.
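
One way to pick a ratio for a given target canvas is to choose the closest supported value. The sketch below is illustrative only: the candidate list contains just the four ratios mentioned in this answer, not the full set of eight, and `closest_ratio` is a hypothetical helper name.

```python
import math

# Only the ratios mentioned in this FAQ; the model's full list has eight options.
MENTIONED_RATIOS = ("1:1", "4:3", "16:9", "21:9")


def ratio_value(ratio):
    """Convert a 'W:H' string into a width/height float."""
    width, height = ratio.split(":")
    return int(width) / int(height)


def closest_ratio(width, height, candidates=MENTIONED_RATIOS):
    """Pick the candidate whose shape is nearest to width x height.

    Distances are compared on a log scale so that being too wide and
    being too narrow by the same factor count equally.
    """
    target = math.log(width / height)
    return min(candidates, key=lambda r: abs(math.log(ratio_value(r)) - target))


print(closest_ratio(1920, 1080))
print(closest_ratio(2560, 1080))
print(closest_ratio(1024, 768))
```

Because the candidate list here has no portrait ratios, a tall target falls back to 1:1. Extend the tuple once the full option list is known.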

Is there a cost associated with each image generation?

Yes, each image generation using VIDU Text-to-Image Q2 costs $0.04, making it an affordable option for high-quality image generation.

Can I refine or edit my prompt based on the image output?

Yes. Iterative refinement is the expected workflow: if the generated image does not fully meet your expectations, adjust your text prompt with more detailed descriptions or alternative composition angles and try again.
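
A small helper can keep each refinement explicit by appending detail phrases to a base prompt. This is a hypothetical convenience function, not part of the model's API, and the example phrases are placeholders.

```python
def refine_prompt(base, *details):
    """Append comma-separated detail phrases to a base prompt."""
    parts = [base.strip().rstrip(".")]
    # Skip empty phrases and strip stray trailing periods before joining.
    parts += [d.strip().rstrip(".") for d in details if d.strip()]
    return ", ".join(parts) + "."


first_try = refine_prompt("A lighthouse in a storm")
second_try = refine_prompt(first_try, "low-angle shot", "volumetric light")
print(first_try)
print(second_try)
```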