muapi/gpt-image-2-image-to-image

Image to Image

Transform and edit existing images using GPT Image 2 with text instructions. Supports up to 16 input images for precise style transfer, editing, and image transformation.

🚀Related Models

gpt4o-text-to-image (Text to Image)

Generate images from text prompts using GPT-4o's vision capabilities. Ideal for basic concept visuals, diagrams, and abstract compositions.

gpt4o-image-to-image (Image to Image)

Transform an input image based on a new prompt, such as changing style, lighting, or composition. Useful for reinterpreting visuals while keeping structure.

gpt-5-nano (Text to Text)

GPT-5 Nano is a lightweight, high-speed language model from the GPT-5 family designed for instant text generation. It delivers intelligent, context-aware responses for creative writing, summarization, dialogue, code generation, and automation, all at low latency and cost. Perfect for chatbots, assistants, content tools, and real-time applications that need fast, reliable text output.

gpt-image-2-text-to-image (Text to Image)

Generate high-quality images from text prompts using GPT Image 2, supporting up to 20,000 character prompts for detailed and precise image creation.

gpt4o-edit (Image to Image)

Edit a specific part of an image using natural language. Ideal for object removal, replacement, or content-aware filling.

gpt-5-mini (Text to Text)

GPT-5 Mini is a compact yet powerful AI that converts plain text ideas into detailed, structured prompts suitable for use in text-to-image, text-to-video, and other generative AI models. It’s perfect for creators who want to quickly craft high-quality prompts without manually thinking about style, composition, and descriptive details. The model helps accelerate workflows for artists, video producers, and designers.

📝

Overview

About this model

GPT Image 2 Image to Image transforms and edits existing images using natural language instructions, supporting up to 16 input images for precise editing, style transfer, and creative transformation. It follows detailed instructions to modify composition, style, and content while preserving important elements, with selectable quality (low / medium / high) and 1K / 2K / 4K resolutions.

  1. E-commerce: Transform product photos into premium poster or lifestyle styles.
  2. Design: Apply style transfers and visual effects to existing images.
  3. Marketing: Repurpose existing brand images with new styles or compositions.
💰

Pricing & Value

Cost analysis

muapiapp: $0.025 – $0.150 per image

Pay only for what you use, regardless of the number of input images. Low: $0.025 / $0.040 / $0.075 (1K / 2K / 4K). Medium: $0.030 / $0.045 / $0.090. High: $0.060 / $0.090 / $0.150.
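The muapiapp price grid above can be expressed as a small lookup table, which makes batch cost estimates easier to reason about. The following is a minimal Python sketch; the names `PRICE_USD` and `estimate_cost` are illustrative only and are not part of any muapi SDK, and the figures are simply copied from the pricing text above.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the pricing text above
# (rows: quality, columns: resolution).
PRICE_USD = {
    "low":    {"1K": Decimal("0.025"), "2K": Decimal("0.040"), "4K": Decimal("0.075")},
    "medium": {"1K": Decimal("0.030"), "2K": Decimal("0.045"), "4K": Decimal("0.090")},
    "high":   {"1K": Decimal("0.060"), "2K": Decimal("0.090"), "4K": Decimal("0.150")},
}


def estimate_cost(quality: str, resolution: str, outputs: int = 1) -> Decimal:
    """Estimate the cost in USD of `outputs` generated images.

    The number of input images is not a factor, per the pricing text.
    """
    if outputs < 0:
        raise ValueError("outputs must be non-negative")
    try:
        unit_price = PRICE_USD[quality][resolution]
    except KeyError:
        raise ValueError(
            f"unknown quality/resolution: {quality!r}/{resolution!r}"
        ) from None
    return unit_price * outputs


# Ten high-quality 2K images.
print(estimate_cost("high", "2K", 10))  # prints 0.900
```

`Decimal` is used instead of `float` so that sub-cent prices add up exactly across large batches.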

Fal.ai: $0.040 – $0.110 per image

Low and medium quality only. Low/Medium: $0.040 / $0.060 / $0.110 (1K / 2K / 4K).

Replicate: Not available

GPT Image 2 is not available on Replicate.

* Competitor pricing is estimated based on similar model architectures and usage tiers.

⚙️

Technical Details

Configuration schema

Prompt (string)

Text instructions describing the desired transformation. Maximum 20,000 characters.

Default Value: Preserve the original subject identity, facial structure, hairstyle, pose, body proportions, camera framing, and overall composition from the source image. Transform the environment into a flooded underwater railway terminal with giant glass ceilings showing whales and ocean water above. Add realistic cinematic water reflections, soft underwater caustic lighting, floating atmospheric particles, subtle wetness on clothing and skin, and highly detailed environmental storytelling elements such as abandoned luggage, vines, cracked marble, and cinematic fog depth. Maintain natural realism and believable anatomy while enhancing texture fidelity, lighting realism, color harmony, and cinematic atmosphere. Keep the face sharp and recognizable with authentic skin detail and emotionally grounded expression. Avoid over-stylization, distorted anatomy, plastic skin, blurry textures, extra limbs, cartoon aesthetics, oversaturated colors, low-detail backgrounds, or artificial-looking lighting.

Image URLs (array)

Upload or provide input images to transform. Up to 16 images supported.

Default Value: https://d3adwkbyhxyrtq.cloudfront.net/webassets/videomodels/gpt-image-2-image-to-image-in.jpg

Aspect Ratio (Enum, 6 options)

Output image aspect ratio. Note: with aspect ratio 'auto' (or unspecified), only 1K output is supported, and 1:1 cannot be rendered at 4K. Requests that break either rule fail at task creation.

Default Value: auto

Resolution (Enum, 3 options)

Image resolution. Note: a 1:1 aspect ratio cannot be rendered at 4K, and aspect ratio 'auto' (or unspecified) supports only 1K. Requests that break either rule fail at task creation.

Default Value: 2K

Quality (Enum, 3 options)

Generation quality. 'low' and 'medium' use a faster, cheaper backend; 'high' uses the higher-fidelity backend.

Default Value: high
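The limits stated in this schema (prompt length, image count, and the aspect ratio and resolution rules) can be checked on the client before a task is submitted. The sketch below is one hedged way to do that in Python; `check_config` and its argument names are hypothetical, and since only some of the six aspect ratio options are named on this page, the function checks the two documented combination rules rather than the full enum. It applies the rules literally, so the listed defaults (auto with 2K) are reported as a conflict.

```python
MAX_PROMPT_CHARS = 20_000
MAX_INPUT_IMAGES = 16
RESOLUTIONS = ("1K", "2K", "4K")
QUALITIES = ("low", "medium", "high")


def check_config(prompt, image_urls, aspect_ratio="auto",
                 resolution="1K", quality="high"):
    """Return a list of problems; an empty list means no documented rule is broken."""
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(
            f"prompt has {len(prompt)} characters; the limit is {MAX_PROMPT_CHARS}"
        )
    if len(image_urls) > MAX_INPUT_IMAGES:
        problems.append(
            f"{len(image_urls)} input images given; the limit is {MAX_INPUT_IMAGES}"
        )
    if resolution not in RESOLUTIONS:
        problems.append(f"unknown resolution {resolution!r}")
    if quality not in QUALITIES:
        problems.append(f"unknown quality {quality!r}")
    # 'auto' (or unspecified, modeled here as None) is documented as 1K only.
    if aspect_ratio in (None, "auto") and resolution != "1K":
        problems.append("aspect ratio 'auto' supports only 1K")
    # 1:1 is documented as unavailable at 4K.
    if aspect_ratio == "1:1" and resolution == "4K":
        problems.append("1:1 cannot be rendered at 4K")
    return problems


example_images = ["https://example.com/in.jpg"]  # placeholder URL
print(check_config("Make it a watercolor", example_images, "16:9", "4K"))
print(check_config("Make it a watercolor", example_images, "1:1", "4K"))
```

Returning a list of messages rather than raising on the first problem lets a caller show every issue at once.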
📖

Implementation Guide

Developer documentation

How to Use GPT Image 2 Image to Image

  1. Upload your images: Provide up to 16 input images via the images_list field. These are the images to be transformed.

  2. Write your prompt: Describe the transformation you want. Be specific about the target style, changes, or effects.

  3. Pick a quality: Choose low or medium for fast, lower-cost edits, or high for the highest fidelity output. high is the default.

  4. Pick aspect ratio and resolution: Select an aspect ratio (e.g. 1:1, 16:9, 9:16) and a resolution (1K, 2K, or 4K). Note that 1:1 cannot be rendered at 4K, and auto aspect ratio is locked to 1K.

  5. Submit and review: Click Generate. The model applies your instructions to the input images and returns the transformed result.
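For programmatic use, the steps above roughly correspond to assembling one request body. The following Python sketch only builds and prints that body; it does not call any endpoint, because the request URL and authentication are not described here. The `images_list` key comes from the steps above, while the other key names (`prompt`, `aspect_ratio`, `resolution`, `quality`) and the requirement of at least one image are assumptions made for illustration.

```python
import json


def build_payload(prompt, images, aspect_ratio="16:9",
                  resolution="2K", quality="high"):
    """Assemble a request body for an image-to-image task.

    Only `images_list` is a documented field name; the remaining keys are
    assumed names for the parameters described in the steps.
    """
    # Assumption: an image-to-image task needs at least one input image.
    if not 1 <= len(images) <= 16:
        raise ValueError("provide between 1 and 16 input images")
    return {
        "prompt": prompt,
        "images_list": list(images),
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "quality": quality,
    }


payload = build_payload(
    "Turn this product photo into a premium poster with soft studio lighting.",
    ["https://example.com/product.jpg"],  # placeholder URL
)
print(json.dumps(payload, indent=2))
```

The example passes an explicit 16:9 aspect ratio with 2K, since an auto aspect ratio is locked to 1K.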

Common Questions

Frequently asked

How many input images can I provide?

You can provide up to 16 images in the `images_list` field. The model will use all of them as reference when generating the output.

What does the quality setting do?

`low` and `medium` are faster and cheaper and are well suited to drafts and bulk edits. `high` runs the full-fidelity model and is best for finals. Pricing scales with both quality and resolution.

What kinds of transformations are supported?

Style transfer, background changes, object modifications, composition adjustments, and creative reinterpretations are all supported via natural language instructions in the prompt.

Will the model preserve specific elements from my input image?

Yes, you can instruct the model to preserve specific elements (like a product shape or a person) while changing other aspects such as background or style.