muapi/gpt-codex

Text to Text

OpenAI GPT Codex delivers advanced coding capabilities with scalable reasoning depth. Supports multiple model variants (gpt-5-codex through gpt-5.4-codex) and multimodal inputs. Token-based pricing: $1.25/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/gpt-codex) and live streaming (/gpt-codex/stream) via SSE.

🚀Related Models

claude-opus-4-6

Claude Opus 4.6 is Anthropic's most capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-6) and live streaming (/claude-opus-4-6/stream) via SSE.

Text to Text
claude-sonnet-4-6

Claude Sonnet 4.6 delivers strong reasoning, advanced coding, and native computer-use functionality. Supports text and image inputs with up to 1M token context. Token-based pricing: $1.80/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/claude-sonnet-4-6) and live streaming (/claude-sonnet-4-6/stream) via SSE.

Text to Text
claude-haiku-4-5

Claude Haiku 4.5 is Anthropic's fastest and most cost-effective model, designed for high-frequency queries, simple tasks, and near-instant response times. Supports text and image inputs. Token-based pricing: $0.60/M input tokens, $3.00/M output tokens. Two endpoints: standard async (/claude-haiku-4-5) and live streaming (/claude-haiku-4-5/stream) via SSE.

Text to Text
claude-opus-4-8

Claude Opus 4.8 is Anthropic's most capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-8) and live streaming (/claude-opus-4-8/stream) via SSE.

Text to Text
gemini-3-5-flash

Gemini 3.5 Flash is a high-speed, multimodal language model built for real-time text generation, supporting text and image inputs natively. Token-based pricing: $0.60/M input tokens and $3.60/M output tokens. Two endpoints: standard async (/gemini-3-5-flash) and live streaming (/gemini-3-5-flash/stream) via SSE.

Text to Text
gemini-3-5-flash-openai

Gemini 3.5 Flash (OpenAI-compatible) is a high-speed, multimodal language model built for real-time text generation, supporting text and image inputs natively. Token-based pricing: $0.60/M input tokens and $3.60/M output tokens. Two endpoints: standard async (/gemini-3-5-flash-openai) and live streaming (/gemini-3-5-flash-openai/stream) via SSE.

Text to Text
claude-opus-4-7

Claude Opus 4.7 is Anthropic's highly capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-7) and live streaming (/claude-opus-4-7/stream) via SSE.

Text to Text
gemini-3-1-pro

Gemini 3.1 Pro is Google's next-generation multimodal model, optimized for complex reasoning, planning, coding, and multi-turn conversation. Supports text and image inputs. Token-based pricing: $4.00/M input tokens, $24.00/M output tokens. Two endpoints: standard async (/gemini-3-1-pro) and live streaming (/gemini-3-1-pro/stream) via SSE.

Text to Text
gemini-3-pro

Gemini 3 Pro is Google's powerful multimodal reasoning model, designed for complex problem solving, coding, and logical tasks. Supports text and image inputs. Token-based pricing: $4.00/M input tokens, $24.00/M output tokens. Two endpoints: standard async (/gemini-3-pro) and live streaming (/gemini-3-pro/stream) via SSE.

Text to Text
claude-opus-4-5

Claude Opus 4.5 is Anthropic's highly capable model for complex coding, long-context reasoning, and agentic workflows. Supports text and image inputs. Token-based pricing: $3.00/M input tokens, $15.00/M output tokens. Two endpoints: standard async (/claude-opus-4-5) and live streaming (/claude-opus-4-5/stream) via SSE.

Text to Text
claude-sonnet-4-5

Claude Sonnet 4.5 is Anthropic's state-of-the-art model offering high intelligence, speed, and efficiency for code generation, writing, and logical analysis. Supports text and image inputs. Token-based pricing: $1.80/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/claude-sonnet-4-5) and live streaming (/claude-sonnet-4-5/stream) via SSE.

Text to Text
gemini-2-5-pro

Gemini 2.5 Pro is Google's advanced multimodal reasoning model, optimized for complex coding, logical tasks, and deep analysis. Supports text and image inputs. Token-based pricing: $1.25/M input tokens, $10.00/M output tokens. Two endpoints: standard async (/gemini-2-5-pro) and live streaming (/gemini-2-5-pro/stream) via SSE.

Text to Text
gemini-2-5-flash

Gemini 2.5 Flash is Google's high-speed multimodal language model, optimized for rapid text generation, real-time image understanding, and high-frequency tasks. Supports text and image inputs. Token-based pricing: $0.30/M input tokens, $2.50/M output tokens. Two endpoints: standard async (/gemini-2-5-flash) and live streaming (/gemini-2-5-flash/stream) via SSE.

Text to Text
gpt-5-2

GPT 5.2 is a lightweight reasoning model with fast response times and deep coding capabilities. Supports image inputs, system prompts, web search capabilities, and reasoning effort control. Pricing: $1.25/M input tokens, $9.00/M output tokens.

Text to Text
claude-fable-5

Claude Fable 5 is the latest flagship model from Anthropic. Supports text and image inputs with advanced reasoning and creative capabilities. Token-based pricing: $8.00/M input tokens, $40.00/M output tokens. Two endpoints: standard async (/claude-fable-5) and live streaming (/claude-fable-5/stream) via SSE.

Text to Text
gpt-5-5

GPT 5.5 is OpenAI's state-of-the-art flagship reasoning model for high-complexity problems. Supports image and file uploads, system prompts, web search capabilities, and reasoning effort control. Pricing: $2.40/M input tokens, $16.00/M output tokens.

Text to Text
generate-social-video-script

Generate viral short-form video scripts for social media based on a topic and niche.

Text to Text
gemini-3-flash

Gemini 3 Flash is a fast, multimodal language model for real-time text generation. Supports text and image inputs, function calling, and Google Search grounding. Token-based pricing: $0.30/M input tokens and $1.80/M output tokens. Two endpoints: standard async (/gemini-3-flash) and live streaming (/gemini-3-flash/stream) via SSE.

Text to Text
gpt-5-4

GPT-5.4 delivers powerful reasoning, coding, and professional knowledge work. Supports multimodal inputs (text and image) with adjustable reasoning depth. Token-based pricing: $1.25/M input tokens, $9.00/M output tokens. Two endpoints: standard async (/gpt-5-4) and live streaming (/gpt-5-4/stream) via SSE.

Text to Text
📝

Overview

About this model

GPT Codex is OpenAI's code-specialized model series built on GPT-5 architecture. Optimized for code generation, debugging, and complex engineering workflows with scalable reasoning depth. Supports five model variants from gpt-5-codex to gpt-5.4-codex, multimodal inputs, and SSE streaming. Token-based pricing: $1.25 per million input tokens and $9.00 per million output tokens.

1. Code Generation: Write complete, production-ready functions, classes, and modules from natural language.
2. Debugging: Identify and fix bugs with deep understanding of code logic and edge cases.
3. Code Review: Analyze code for correctness, performance, security, and style.
4. Engineering Workflows: Automate repetitive coding tasks, generate tests, and produce documentation.
💰

Pricing & Value

Cost analysis

muapiapp: $1.25/M input tokens, $9.00/M output tokens

Token-based billing. Minimum $0.00023 per call. All 5 Codex variants share the same pricing. Supports Prompt Caching (0.1x for hits, 1.25x for creation).

OpenAI (official): ~$1.75/M input tokens, ~$14.00/M output tokens

Official pricing via api.openai.com.

Fal.ai: Not available

GPT Codex is not available on Fal.ai.

* Competitor pricing is estimated based on similar model architectures and usage tiers.

⚙️

Technical Details

Configuration schema

Prompt (string)

The coding task or instruction.

Default Value: Implement a binary search tree with insert, delete, and search methods in Python.

Model (Enum, 5 options)

Codex model variant to use.

Default Value: gpt-5.4-codex

Image URL (string)

Optional image URL for multimodal requests.

Default Value: undefined
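
As a rough illustration of the schema above, the following Python sketch validates the three parameters and assembles a request body. The `model` field name and the five variant names come from this page; the JSON keys `prompt` and `image_url`, and the helper `build_payload` itself, are assumptions made for the example and may not match the actual API.

```python
from typing import Optional

# Variant names as listed in this page's Implementation Guide.
CODEX_MODELS = (
    "gpt-5.4-codex",
    "gpt-5.3-codex",
    "gpt-5.2-codex",
    "gpt-5.1-codex",
    "gpt-5-codex",
)


def build_payload(prompt: str,
                  model: str = "gpt-5.4-codex",
                  image_url: Optional[str] = None) -> dict:
    """Build a request body (hypothetical helper).

    The keys "prompt" and "image_url" are assumed names; only "model"
    is confirmed by the page.
    """
    if not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if model not in CODEX_MODELS:
        raise ValueError(f"unknown Codex variant: {model!r}")
    payload = {"prompt": prompt, "model": model}
    if image_url is not None:
        # The image URL is optional, so it is left out when not set.
        payload["image_url"] = image_url
    return payload
```

Rejecting an unknown variant locally avoids spending a network round trip on a request that would presumably be refused anyway.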
📖

Implementation Guide

Developer documentation

Standard (Async)

POST /api/v1/gpt-codex returns a request_id; poll for the result via /api/v1/predictions/{id}/result.
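
The submit-then-poll flow can be sketched in Python as follows. Only the two paths and the `request_id` field come from this page. The base URL, the `x-api-key` header, the `status` field, and its `completed` / `failed` values are all assumptions for illustration, so check them against the official documentation before relying on this.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.muapi.ai"  # assumption: host is not stated on this page


def http_json(method, url, api_key, body=None):
    """Send a JSON request with the standard library and decode the reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(url, data=data, method=method, headers={
        "Content-Type": "application/json",
        "x-api-key": api_key,  # assumption: header name
    })
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def submit_and_poll(payload, api_key, send=http_json,
                    interval=2.0, max_polls=60, sleep=time.sleep):
    """POST the task, then poll the result URL until it finishes."""
    submitted = send("POST", f"{BASE_URL}/api/v1/gpt-codex", api_key, payload)
    request_id = submitted["request_id"]
    result_url = f"{BASE_URL}/api/v1/predictions/{request_id}/result"
    for _ in range(max_polls):
        result = send("GET", result_url, api_key)
        status = result.get("status")  # assumption: field name and values
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"request {request_id} failed: {result}")
        sleep(interval)
    raise TimeoutError(f"request {request_id} did not finish in time")
```

The `send` and `sleep` parameters are injectable so the polling logic can be exercised without network access.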

Streaming (SSE)

POST /api/v1/gpt-codex/stream returns a live SSE stream. Each chunk has the form data: {"choices":[{"delta":{"content":"text"}}]}, and the stream ends with data: [DONE].
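
A minimal Python sketch of consuming that chunk format is shown below. It assumes the stream has already been split into text lines (for example by an HTTP client's line iterator) and that chunks carry no fields beyond those shown above; `iter_deltas` is a hypothetical helper name, not part of any SDK.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text fragments carried by SSE `data:` lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments in order reconstructs the full response, and printing each one as it arrives gives the incremental display that an interactive UI needs.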

Model variants (pass in the model field): gpt-5.4-codex (default, most capable), gpt-5.3-codex, gpt-5.2-codex, gpt-5.1-codex, gpt-5-codex.

See Streaming Documentation for full code examples.

Common Questions

Frequently asked

How is pricing calculated?

Pricing is token-based: $1.25 per million input tokens and $9.00 per million output tokens regardless of which Codex model variant you use. The minimum charge per call is $0.00023.
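
As a worked example of this arithmetic, the Python sketch below estimates the charge for one call. The rates and the minimum come from this page; treating the minimum as a floor on the per-call total is an assumption, and `estimate_cost` is a hypothetical helper.

```python
INPUT_USD_PER_M = 1.25     # USD per million input tokens
OUTPUT_USD_PER_M = 9.00    # USD per million output tokens
MIN_CHARGE_USD = 0.00023   # minimum charge per call


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD charge for a single call."""
    cost = (input_tokens * INPUT_USD_PER_M
            + output_tokens * OUTPUT_USD_PER_M) / 1_000_000
    # Assumption: the minimum applies as a floor on the whole call.
    return max(cost, MIN_CHARGE_USD)
```

For instance, a call with 2,000 input tokens and 500 output tokens works out to (2,000 × 1.25 + 500 × 9.00) / 1,000,000 = $0.007, while a very small call of 10 input and 10 output tokens falls below the floor and would be billed at $0.00023 under this assumption.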

Which Codex variant should I use?

gpt-5.4-codex is the default and most capable variant. Use lower variants (gpt-5.3-codex, gpt-5.2-codex, etc.) if you need faster responses for simpler tasks. All variants share the same pricing.

What is the difference between /gpt-codex and /gpt-codex/stream?

/gpt-codex is asynchronous: you receive a request_id and poll for the result. /gpt-codex/stream returns a live SSE stream. Use streaming for interactive coding UIs; use the async endpoint for batch processing.

Does this model support Prompt Caching?

Yes. Prompt Caching allows you to reuse frequently used text prompts at reduced rates. Cache hits (reusing previously cached tokens) are charged at 0.1x of the input cost. New cache creation (writing new tokens to cache for future reuse) is charged at 1.25x of the standard input cost.
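
To make the multipliers concrete, here is a small Python sketch of the input-side cost with caching. The 0.1x and 1.25x factors and the $1.25/M rate come from this page; how the service classifies and reports tokens as uncached, cache hits, or cache writes is not described here, so the three-way split in `input_cost` is an assumption.

```python
INPUT_USD_PER_M = 1.25    # standard input rate, USD per million tokens
CACHE_HIT_MULT = 0.1      # cache hits: 0.1x the input rate
CACHE_WRITE_MULT = 1.25   # cache creation: 1.25x the input rate


def input_cost(uncached_tokens: int,
               cache_hit_tokens: int = 0,
               cache_write_tokens: int = 0) -> float:
    """Estimate the input-side USD cost of one call (hypothetical helper)."""
    weighted_tokens = (uncached_tokens
                       + cache_hit_tokens * CACHE_HIT_MULT
                       + cache_write_tokens * CACHE_WRITE_MULT)
    return weighted_tokens * INPUT_USD_PER_M / 1_000_000
```

With a 10,000-token shared prefix, sending it uncached costs $0.0125 per call, writing it to the cache costs $0.015625 once, and each later hit costs $0.00125. Under these figures the extra 0.25x paid at creation is recovered on the first reuse.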