muapi/gemini-3-5-flash-openai

Text to Text

Gemini 3.5 Flash (OpenAI-compatible) is a high-speed, multimodal language model built for real-time text generation, supporting text and image inputs natively. Token-based pricing: $0.60/M input tokens and $3.60/M output tokens. Two endpoints: standard async (/gemini-3-5-flash-openai) and live streaming (/gemini-3-5-flash-openai/stream) via SSE.

Token-based pricing

Type               Rate
Input tokens       $0.60/M
Output tokens      $3.60/M
Minimum per run    $0.0001

🚀 Related Models

claude-opus-4-6

Claude Opus 4.6 is Anthropic's most capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-6) and live streaming (/claude-opus-4-6/stream) via SSE.

Text to Text
gemini-2-5-pro

Gemini 2.5 Pro is Google's advanced multimodal reasoning model, optimized for complex coding, logical tasks, and deep analysis. Supports text and image inputs. Token-based pricing: $1.25/M input tokens, $10.00/M output tokens. Two endpoints: standard async (/gemini-2-5-pro) and live streaming (/gemini-2-5-pro/stream) via SSE.

Text to Text
claude-opus-4-7

Claude Opus 4.7 is Anthropic's highly capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-7) and live streaming (/claude-opus-4-7/stream) via SSE.

Text to Text
gemini-3-1-pro

Gemini 3.1 Pro is Google's next-generation multimodal model, optimized for complex reasoning, planning, coding, and multi-turn conversation. Supports text and image inputs. Token-based pricing: $4.00/M input tokens, $24.00/M output tokens. Two endpoints: standard async (/gemini-3-1-pro) and live streaming (/gemini-3-1-pro/stream) via SSE.

Text to Text
claude-sonnet-4-5

Claude Sonnet 4.5 is Anthropic's state-of-the-art model offering high intelligence, speed, and efficiency for code generation, writing, and logical analysis. Supports text and image inputs. Token-based pricing: $1.80/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/claude-sonnet-4-5) and live streaming (/claude-sonnet-4-5/stream) via SSE.

Text to Text
claude-haiku-4-5

Claude Haiku 4.5 is Anthropic's fastest and most cost-effective model, designed for high-frequency queries, simple tasks, and near-instant response times. Supports text and image inputs. Token-based pricing: $0.60/M input tokens, $3.00/M output tokens. Two endpoints: standard async (/claude-haiku-4-5) and live streaming (/claude-haiku-4-5/stream) via SSE.

Text to Text
gemini-2-5-flash

Gemini 2.5 Flash is Google's high-speed multimodal language model, optimized for rapid text generation, real-time image understanding, and high-frequency tasks. Supports text and image inputs. Token-based pricing: $0.30/M input tokens, $2.50/M output tokens. Two endpoints: standard async (/gemini-2-5-flash) and live streaming (/gemini-2-5-flash/stream) via SSE.

Text to Text
gpt-5-5

GPT 5.5 is OpenAI's state-of-the-art flagship reasoning model for high-complexity problems. Supports image and file uploads, system prompts, web search capabilities, and reasoning effort control. Pricing: $2.40/M input tokens, $16.00/M output tokens.

Text to Text
generate-social-video-script

Generate viral short-form video scripts for social media based on a topic and niche.

Text to Text
claude-opus-4-8

Claude Opus 4.8 is Anthropic's most capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-8) and live streaming (/claude-opus-4-8/stream) via SSE.

Text to Text
gemini-3-pro

Gemini 3 Pro is Google's powerful multimodal reasoning model, designed for complex problem solving, coding, and logical tasks. Supports text and image inputs. Token-based pricing: $4.00/M input tokens, $24.00/M output tokens. Two endpoints: standard async (/gemini-3-pro) and live streaming (/gemini-3-pro/stream) via SSE.

Text to Text
gpt-codex

OpenAI GPT Codex delivers advanced coding capabilities with scalable reasoning depth. Supports multiple model variants (gpt-5-codex through gpt-5.4-codex) and multimodal inputs. Token-based pricing: $1.25/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/gpt-codex) and live streaming (/gpt-codex/stream) via SSE.

Text to Text
gpt-5-2

GPT 5.2 is a lightweight reasoning model with fast response times and deep coding capabilities. Supports image inputs, system prompts, web search capabilities, and reasoning effort control. Pricing: $1.25/M input tokens, $9.00/M output tokens.

Text to Text
claude-fable-5

Claude Fable 5 is the latest flagship model from Anthropic. Supports text and image inputs with advanced reasoning and creative capabilities. Token-based pricing: $8.00/M input tokens, $40.00/M output tokens. Two endpoints: standard async (/claude-fable-5) and live streaming (/claude-fable-5/stream) via SSE.

Text to Text
claude-sonnet-4-6

Claude Sonnet 4.6 delivers strong reasoning, advanced coding, and native computer-use functionality. Supports text and image inputs with up to 1M token context. Token-based pricing: $1.80/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/claude-sonnet-4-6) and live streaming (/claude-sonnet-4-6/stream) via SSE.

Text to Text
claude-opus-4-5

Claude Opus 4.5 is Anthropic's highly capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-5) and live streaming (/claude-opus-4-5/stream) via SSE.

Text to Text
gemini-3-5-flash

Gemini 3.5 Flash is a high-speed, multimodal language model built for real-time text generation, supporting text and image inputs natively. Token-based pricing: $0.60/M input tokens and $3.60/M output tokens. Two endpoints: standard async (/gemini-3-5-flash) and live streaming (/gemini-3-5-flash/stream) via SSE.

Text to Text
gemini-3-flash

Gemini 3 Flash is a fast, multimodal language model for real-time text generation. Supports text and image inputs, function calling, and Google Search grounding. Token-based pricing: $0.30/M input tokens and $1.80/M output tokens. Two endpoints: standard async (/gemini-3-flash) and live streaming (/gemini-3-flash/stream) via SSE.

Text to Text
gpt-5-4

GPT-5.4 delivers powerful reasoning, coding, and professional knowledge work. Supports multimodal inputs (text and image) with adjustable reasoning depth. Token-based pricing: $1.25/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/gpt-5-4) and live streaming (/gpt-5-4/stream) via SSE.

Text to Text
📝

Overview

About this model

Gemini 3.5 Flash (OpenAI-compatible) is a high-speed, multimodal language model optimized for rapid text generation and real-time image understanding, accessed via an OpenAI-compatible API interface. Token-based pricing: $0.60/M input tokens and $3.60/M output tokens.

  1. Real-time customer support assistants
  2. Multimodal image analysis and captioning
  3. Fast content summarization and rewriting
  4. Quick code snippet generation and debugging
  5. General conversational agents
💰

Pricing & Value

Cost analysis

muapiapp: $0.60/M input, $3.60/M output tokens

Fast, high-quality, token-based pricing with an upfront minimum of $0.0001.

Google (official): $0.075/M input, $0.30/M output tokens (under 128k context)

Official API pricing. We scale our rates to match standard Gemini rates with preserved developer margin.

* Competitor pricing is estimated based on similar model architectures and usage tiers.

⚙️

Technical Details

Configuration schema

Prompt (string)

The user message or instruction for the model.

Default Value: Summarize the key points of the attached image.

Image URL (string)

Optional image URL to include as multimodal input.

Default Value: undefined

System Prompt (string)

Optional system-level instruction to guide model behavior.

Default Value: You are a helpful assistant that responds concisely.
📖

Implementation Guide

Developer documentation

How to Use Gemini 3.5 Flash (OpenAI)

  1. Prepare your Prompt: Write a clear text instruction or question in the prompt field.
  2. Include an Image (Optional): To analyze an image, pass its direct URL in the image_url field.
  3. Provide System Instructions (Optional): Pass system-level constraints in the system_prompt field.
  4. Submit the Request: Send the payload to the /gemini-3-5-flash-openai endpoint for async processing, or to /gemini-3-5-flash-openai/stream to receive tokens in real time.
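
The steps above can be sketched in Python as follows. This is a minimal illustration, not official client code: the field names (prompt, image_url, system_prompt) and the endpoint paths come from this page, while the base URL, the `x-api-key` header name, and the helper names `build_payload` and `build_request` are assumptions made for the example. Check your account documentation for the real host and authentication scheme.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical placeholder host; the real base URL is not stated on this page.
BASE_URL = "https://api.example.invalid"
ENDPOINT = "/gemini-3-5-flash-openai"


def build_payload(prompt: str,
                  image_url: Optional[str] = None,
                  system_prompt: Optional[str] = None) -> dict:
    """Assemble the request body, leaving out optional fields that are unset."""
    payload = {"prompt": prompt}
    if image_url:
        payload["image_url"] = image_url
    if system_prompt:
        payload["system_prompt"] = system_prompt
    return payload


def build_request(payload: dict, api_key: str,
                  stream: bool = False) -> urllib.request.Request:
    """Create a POST request for the async endpoint or its /stream variant."""
    path = ENDPOINT + ("/stream" if stream else "")
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # assumed header name
        },
        method="POST",
    )


payload = build_payload(
    "Summarize the key points of the attached image.",
    image_url="https://example.com/picture.png",
    system_prompt="You are a helpful assistant that responds concisely.",
)
request = build_request(payload, api_key="YOUR_API_KEY")
# To actually send it: urllib.request.urlopen(request)
```

Passing `stream=True` to `build_request` targets the /stream path instead; the body is the same in both cases.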

Common Questions

Frequently asked

What is the pricing model?

Billing is token-based. We charge $0.60/M input tokens and $3.60/M output tokens, with an upfront minimum charge of $0.0001 per call. The final price is adjusted post-call from actual token usage.
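
As a rough worked example of the rates above, the sketch below estimates the charge for a single call. It assumes the $0.0001 minimum acts as a floor on the final charge, which this page does not spell out. The function name `estimate_cost` is illustrative only.

```python
INPUT_RATE_PER_M = 0.60    # USD per million input tokens (from this page)
OUTPUT_RATE_PER_M = 3.60   # USD per million output tokens (from this page)
MINIMUM_PER_CALL = 0.0001  # USD; treated here as a floor (assumption)


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD charge for one call from its token counts."""
    usage_cost = (input_tokens * INPUT_RATE_PER_M
                  + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000
    return max(MINIMUM_PER_CALL, usage_cost)


# 1,000 input tokens and 500 output tokens:
# 1000 * 0.60 / 1e6 + 500 * 3.60 / 1e6 = 0.0006 + 0.0018 = 0.0024 USD
print(f"{estimate_cost(1000, 500):.4f}")
```

Under the floor assumption, a very small call (say 10 input and 10 output tokens, about $0.000042 of usage) would be billed at the $0.0001 minimum.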

Does Gemini 3.5 Flash (OpenAI) support streaming?

Yes. A streaming endpoint is available at `/gemini-3-5-flash-openai/stream`, which uses Server-Sent Events (SSE) to stream tokens back as they are generated.
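
A client reading the stream has to split the SSE response into events. The sketch below handles only the generic SSE framing (`data:` lines grouped into events separated by blank lines) and makes no assumption about what each payload contains, since this page does not document the event schema. The function name `iter_sse_data` is illustrative.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            # Lines starting with a colon are comments (often keep-alives).
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            buffer.append(value)
    if buffer:
        # Lenient choice: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)


sample = ["data: Hel\n", "\n", ": keep-alive\n", "data: lo\n", "data: !\n", "\n"]
for chunk in iter_sse_data(sample):
    print(repr(chunk))
```

In practice the `lines` argument would be the decoded lines of the HTTP response body; each yielded string can then be parsed according to whatever payload format the endpoint returns.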