The announcement landed without fanfare: two new models, a pricing sheet, some benchmark tables. But strip away the product marketing and you see something more interesting. OpenAI just made a structural bet on where AI development is heading.
TLDR
OpenAI released GPT-5.4 mini and nano on 17 March 2026, bringing near-flagship performance to developers at dramatically lower costs. Mini runs 2x faster than GPT-5 mini, scores 54.4% on SWE-Bench Pro (compared to 57.7% for the full GPT-5.4), and costs $0.75 per million input tokens. The company is also developing a desktop superapp that will merge ChatGPT, Codex, and its Atlas browser into a single product.
KEY TAKEAWAYS
GPT-5.4 mini and nano, released on 17 March 2026, are designed for a world where AI systems don't run as monolithic applications but as orchestrated networks of specialised agents, each optimised for a different cost-latency tradeoff.
The unit economics tell the story
GPT-5.4 mini costs $0.75 per million input tokens and $4.50 per million output tokens. That puts it at roughly one-tenth the cost of the full GPT-5.4. Nano goes cheaper still: $0.20 input, $1.25 output.
At these prices, a developer can run thousands of API calls for the cost of a single complex task on the flagship model. Workflows that were cost-prohibitive a year ago become routine: screenshot parsing at scale, real-time code review, batch document classification across hundreds of files.
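At these rates, per-call cost is simple arithmetic. A minimal sketch using the prices quoted above; the flagship figures are extrapolated from the "roughly one-tenth" ratio and are an assumption, not an official price:

```python
# Per-million-token prices in USD. Mini and nano figures are from the
# announcement; the flagship prices are ASSUMED from the ~10x ratio.
PRICES = {
    "gpt-5.4":      {"input": 7.50, "output": 45.00},  # assumed, not official
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single API call at the listed rates."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
```

A 10,000-token prompt with a 2,000-token response on mini comes out to well under two cents; the same call on the flagship would cost roughly ten times as much.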
The 400k context window matters because it can hold an entire codebase in memory during a debugging session. Combined with the 2x speed improvement over GPT-5 mini, the model maintains context across long reasoning chains without the latency penalty that made previous models frustrating for interactive use.
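Whether a codebase actually fits is easy to estimate up front. A rough sketch, assuming the common ~4-characters-per-token heuristic; a real integration would count with a proper tokenizer:

```python
CONTEXT_WINDOW = 400_000  # tokens, per the announcement

def estimate_tokens(text: str) -> int:
    # Rough heuristic: roughly 4 characters per token for English and code.
    return max(1, len(text) // 4)

def fits_in_context(file_texts: list[str], reserve: int = 50_000) -> bool:
    """Check whether a set of source files fits in the window,
    keeping `reserve` tokens free for the conversation itself."""
    total = sum(estimate_tokens(t) for t in file_texts)
    return total <= CONTEXT_WINDOW - reserve
```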
Benchmarks worth reading
OpenAI released performance numbers across coding, tool use, and computer interaction tasks. The spread tells you exactly what they optimised for.
On SWE-Bench Pro, the standard test for automated software engineering, mini scored 54.4% while the full GPT-5.4 scored 57.7%. Closing that 3.3-percentage-point gap means paying roughly ten times more per token, a tradeoff that rarely makes sense for production workloads.
The more interesting number is OSWorld-Verified, which measures computer use performance across screenshot interpretation and UI navigation tasks. Mini scored 72.1%, compared to 75% for the flagship and 39% for nano. The gap between mini and the full model is trivial; the gap between nano and everything else is not. Nano should stay away from computer use workflows.
"These models are built for the kinds of workloads where latency directly shapes the product experience: coding assistants that need to feel responsive, subagents that quickly complete supporting tasks, computer-using systems that capture and interpret screenshots."
— OpenAI announcement, March 2026
The architecture play
OpenAI is pushing developers toward a specific system design pattern where GPT-5.4 handles planning and coordination while mini subagents execute tasks in parallel.
This mirrors how distributed systems have always worked, where you tier compute rather than running everything on the most expensive option. The scheduler talks to the database through a connection pool, the frontend makes cheap API calls, and heavy processing happens asynchronously on dedicated workers.
AI agents are following the same path: a primary model decides what needs to happen, smaller models do the actual work, and the primary model reviews the results. This pattern lets you scale horizontally while keeping inference costs manageable.
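The plan-execute-review loop can be sketched with plain Python concurrency. The `plan`, `execute`, and `review` functions below are hypothetical stand-ins for calls to the flagship and mini endpoints, not a real client:

```python
from concurrent.futures import ThreadPoolExecutor

def plan(task: str) -> list[str]:
    # Stand-in for the flagship model decomposing a task into subtasks.
    return [f"{task}: part {i}" for i in range(3)]

def execute(subtask: str) -> str:
    # Stand-in for a cheap mini subagent completing one subtask.
    return f"done({subtask})"

def review(results: list[str]) -> str:
    # Stand-in for the flagship model merging and checking the results.
    return "; ".join(results)

def run(task: str) -> str:
    subtasks = plan(task)                   # flagship: plan and coordinate
    with ThreadPoolExecutor() as pool:      # minis: execute in parallel
        results = list(pool.map(execute, subtasks))
    return review(results)                  # flagship: review the output
```

The shape is what matters: one expensive call to fan out, many cheap calls in parallel, one expensive call to fan in.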
In Codex, OpenAI's coding assistant platform, mini already functions this way. The main agent delegates file searches, code reviews, and document processing to mini subagents that consume only 30% of the quota, presenting a single interface to the developer while multiple models collaborate behind the scenes.
Computer use goes mainstream
The flagship GPT-5.4, released earlier this month, introduced native computer use capabilities. The model can read screenshots, interpret UI elements, and generate keyboard and mouse actions. Mini inherits this capability with minimal performance loss.
For developers building automation tools, this changes the economics of desktop and web automation significantly. A screenshot costs roughly 1,000 tokens, which at mini's pricing works out to $0.00075 per image interpretation. An automation that takes 50 screenshots to complete a task costs less than four cents.
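That back-of-the-envelope math is easy to parameterise. A sketch using the article's figures; the 1,000-tokens-per-screenshot number is a rough estimate, and output tokens are ignored:

```python
MINI_INPUT_PRICE = 0.75 / 1_000_000  # USD per input token, from the pricing sheet
TOKENS_PER_SCREENSHOT = 1_000        # rough figure quoted in the article

def automation_cost(screenshots: int,
                    tokens_per_shot: int = TOKENS_PER_SCREENSHOT) -> float:
    """Input-token cost of a computer-use run, ignoring output tokens."""
    return screenshots * tokens_per_shot * MINI_INPUT_PRICE
```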
Anthropic pioneered this approach with Claude's computer use beta. OpenAI is now matching the capability at a lower price point, creating competitive pressure across the industry.
The superapp convergence
Two days after the mini and nano release, reports emerged that OpenAI is building a desktop superapp. The plan is to merge ChatGPT, Codex, and the Atlas browser into a single product.
The reasoning behind the consolidation is straightforward: OpenAI shipped too many standalone products too quickly, and users ended up with a chatbot in one app, a coding assistant in another, and a browser somewhere else. The fragmentation made it harder to build integrated workflows.
A superapp would let OpenAI surface the new agent capabilities across a unified interface. Computer use, coding assistance, web research, and conversational AI would all live in the same application. Mini and nano would handle the fast, cheap operations. The flagship model would step in for complex reasoning.
OpenAI has not announced a timeline for the superapp, though the company confirmed the mobile ChatGPT app will remain separate.
What this means for developers
If you're building AI-powered applications, the release shifts your cost model. Tasks you previously batched to save money can now run in real time. Workflows you previously ran through the flagship model can drop to mini with minimal quality loss.
The tiered architecture pattern is worth adopting even if you're not using OpenAI. The principle generalises: let expensive models plan, let cheap models execute. Use the right model for each task rather than running everything through your most capable option.
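In practice, the tiering principle often reduces to a small routing table. A minimal sketch; the task categories and model assignments here are illustrative, not an official taxonomy:

```python
# Route each task kind to the cheapest model that handles it well.
# The assignments below are illustrative assumptions, not OpenAI guidance.
ROUTES = {
    "planning":       "gpt-5.4",       # complex reasoning -> flagship
    "code_review":    "gpt-5.4-mini",  # latency-sensitive -> mini
    "classification": "gpt-5.4-nano",  # bulk, simple -> nano
}

def pick_model(task_kind: str) -> str:
    # Default to mini for unknown task kinds: cheap but still capable.
    return ROUTES.get(task_kind, "gpt-5.4-mini")
```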
For enterprises evaluating AI infrastructure, the mini and nano release adds another data point to the build-versus-buy calculation. OpenAI is making hosted inference cheaper and more flexible. The gap between running your own models and using API services continues to narrow.
GPT-5.4 mini is available now in the API, Codex, and ChatGPT. Nano is API-only. Free ChatGPT users can access mini through the Thinking feature.