Why GPT-5’s most controversial feature—the model router—might also be the future of AI   
Business News
Fortune


August 12, 2025
07:30 PM
7 min read

Key Takeaways

OpenAI’s latest upgrade was supposed to be a leap forward. Instead, it sparked backlash — and raises big questions about whether stitching together multiple models is the path to AI’s future.


AI · OpenAI

By Sharon Goldman, AI Reporter

Sharon Goldman is an AI reporter at Fortune and co-writes Eye on AI, Fortune’s flagship AI newsletter.

She has written about digital and enterprise tech for over a decade.

Sam Altman, CEO of OpenAI. Chris Jung/NurPhoto via Getty Images

OpenAI’s GPT-5 announcement last week was meant to be a triumph—proof that the company was still the undisputed leader in AI—until it wasn’t.

Over the weekend, a groundswell of pushback from customers turned the rollout into more than a PR firestorm: it became a product and trust crisis.

Users lamented the loss of their favorite models, which had doubled as therapists, friends, and romantic partners.

Developers complained of degraded performance.

Industry critic Gary Marcus predictably called GPT-5 “overdue, overhyped, and underwhelming.” The culprit, many argued, was hiding in plain sight: a new real-time model “router” that automatically decides which one of GPT-5’s several variants to spin up for every job.

Many users assumed GPT-5 was a single model trained from scratch; in reality, it’s a network of models—some weaker and cheaper, others stronger and more expensive—stitched together.
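Conceptually, such a router is a lightweight decision layer sitting in front of several backend models. The sketch below is a minimal illustration of the idea, not OpenAI's actual logic: the tier names and the complexity heuristic are invented assumptions.

```python
# Illustrative sketch of a model router. A cheap heuristic estimates how
# demanding a query is, then picks a model tier. The tier names and the
# heuristic are hypothetical; OpenAI's real router is not public.

def estimate_complexity(query: str) -> float:
    """Crude proxy: longer queries with reasoning cues score higher."""
    cues = ("prove", "step by step", "compare", "why", "analyze")
    score = min(len(query) / 200, 1.0)
    score += 0.5 * sum(cue in query.lower() for cue in cues)
    return score

def route(query: str) -> str:
    """Pick a model tier based on estimated complexity."""
    score = estimate_complexity(query)
    if score < 0.3:
        return "fast-mini"        # cheap, low-latency tier
    elif score < 0.8:
        return "main"             # default tier
    return "deep-reasoning"       # slow, expensive tier

print(route("What time is it in Tokyo?"))                                # fast-mini
print(route("Compare these two contracts and analyze risk step by step."))  # deep-reasoning
```

A production router would replace the keyword heuristic with a trained classifier, which is part of why, as the experts below note, getting routing right is so hard.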

Experts say that approach could be the future of AI as large language models advance and become more resource-intensive.

But in GPT-5’s debut, OpenAI demonstrated some of the inherent challenges of the approach and learned some important lessons about how user expectations are evolving in the AI era.

For all the benefits promised by model routing, many users of GPT-5 bristled at what they perceived as a lack of control; some even suggested OpenAI might purposefully be trying to pull the wool over their eyes.

In response to the GPT-5 uproar, OpenAI moved quickly to bring back the main earlier model, GPT-4o, for users.

It also said it fixed buggy routing, increased usage limits, and promised continual updates to regain user trust and stability.

Anand Chowdhary, co-founder of AI sales platform FirstQuadrant, summed the situation up bluntly: “When routing hits, it feels like magic.

When it whiffs, it feels broken.”

The promise and inconsistency of model routing

Jiaxuan You, an assistant professor of computer science at the University of Illinois Urbana-Champaign, told Fortune his lab has studied both the promise—and the inconsistency—of model routing.

In GPT-5’s case, he said, he believes (though he can’t confirm) that the model router sometimes sends parts of the same query to different models.

A cheaper, faster model might give one answer while a slower, reasoning-focused model gives another, and when the system stitches those responses together, subtle contradictions slip through.

The model routing idea is intuitive, he explained, but “making it really work is very non-trivial.” Perfecting a router, he added, can be as challenging as building Amazon-grade recommendation systems, which take years and many domain experts to refine. “GPT-5 is supposed to be built with maybe orders of magnitude more resources,” he explained, pointing out that even if the router picks a smaller model, it shouldn’t produce inconsistent answers.

Still, You believes routing is here to stay. “The community also believes model routing is promising,” he said, pointing to both technical and economic reasons.

Technically, single-model performance appears to be hitting a plateau: You pointed to the commonly cited scaling laws, which say that with more data and compute, models get better. “But we all know that the model wouldn’t get infinitely better,” he said. “Over the past year, we have all witnessed that the capacity of a single model is actually saturating.”

Economically, routing lets AI providers keep using older models rather than discarding them when a new one launches.

Current events require frequent updates, but static facts remain accurate for years.

Directing certain queries to older models avoids wasting the enormous time, compute, and money already spent on training them.
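The economic argument above can be sketched as a second routing layer that separates freshness-sensitive queries from stable factual ones. The cue list and model names here are assumptions for illustration only.

```python
# Illustrative sketch of cost-saving routing by recency: queries that
# appear to need fresh knowledge go to the newest model, while stable
# factual queries are served by an older, already-trained model.
# The cue words and model names are hypothetical.

FRESHNESS_CUES = ("today", "latest", "this week", "breaking", "current")

def needs_fresh_knowledge(query: str) -> bool:
    """Heuristic check for queries about current events."""
    q = query.lower()
    return any(cue in q for cue in FRESHNESS_CUES)

def route_by_recency(query: str) -> str:
    """Send fresh-knowledge queries to the new model, the rest to a legacy one."""
    return "new-model" if needs_fresh_knowledge(query) else "legacy-model"

print(route_by_recency("What's the latest on the Fed's rate decision?"))  # new-model
print(route_by_recency("What is the boiling point of water?"))            # legacy-model
```

In practice a provider would also weigh per-token cost and latency, but even this toy version shows how older models can keep earning their training cost.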

There are hard physical limits, too.

GPU memory has become a bottleneck for training ever-larger models, and chip technology is approaching the maximum memory that can be packed onto a single die.

In practice, You explained, physical limits mean the next model can’t be ten times bigger.

An older idea that is now being hyped

William Falcon, founder and CEO of AI platform Lightning AI, points out that the idea of using an ensemble of models is not new—it has been around since around 2018—and since OpenAI’s models are a black box, we don’t know that GPT-4 did not also use a model routing system. “I think maybe they’re being more explicit about it now, potentially,” he said.

Either way, the GPT-5 launch was heavily hyped—including the model routing system.

The blog post introducing the model called it the “smartest, fastest, and most useful model yet, with thinking built in.” In the official ChatGPT blog post, OpenAI confirmed that GPT‑5 within ChatGPT runs on a system of models coordinated by a behind-the-scenes router that switches to deeper reasoning when needed.

The GPT‑5 System Card went further, explicitly outlining multiple model variants—gpt‑5‑main, gpt‑5‑main‑mini for speed, and gpt‑5‑thinking, gpt‑5‑thinking‑mini, plus a thinking‑pro version—and explains how the unified system automatically routes between them.

In a press pre-briefing, OpenAI CEO Sam Altman touted the model router as a way to tackle what had been a hard-to-decipher list of models to choose from.

Altman called the previous model picker interface a “very confusing mess.”

But Falcon said the core problem was that GPT-5 simply didn’t feel like a leap. “GPT-1 to 2 to 3 to 4 — each time was a massive jump. Four to five was not noticeably better. That’s what people are upset about.”

Will multiple models add up to AGI?

The debate over model routing led some to call out the hype over the possibility of artificial general intelligence, or AGI, being developed soon.

(OpenAI officially defines AGI as “highly autonomous systems that outperform humans at most economically valuable work,” but Altman notably said last week that it is “not a super useful term.”)

“What about the promised AGI?” wrote Aiden Chaoyang He, an AI researcher and co-founder of TensorOpera, on X, criticizing the GPT-5 rollout. “Even a powerful company like OpenAI lacks the ability to train a super-large model, forcing them to resort to the Real-time Model Router.”

Robert Nishihara, CEO of AI production platform Anyscale, says scaling is still progressing in AI, but the idea of one all-powerful AI model remains elusive. “It’s hard to build one model that is the best at everything,” he said.

That’s why GPT-5 currently runs on a network of models linked by a router, not a single monolith.

OpenAI has said it hopes to unify these into one model in the future, but Nishihara points out that hybrid systems have real advantages: you can upgrade one piece at a time without disrupting the rest, and you get most of the benefits without the cost and complexity of retraining an entire giant model.

As a result, Nishihara thinks routing will stick around.

Aiden Chaoyang He agrees.

In theory, scaling laws still hold — more data and compute make models better — but in practice, he believes development will “spiral” between two approaches: routing specialized models together, then trying to consolidate them into one.

The deciding factors will be engineering costs, compute and energy limits, and pressures.

The hyped-up AGI narrative may need to adjust, too. “If anyone does anything that’s close to AGI, I don’t know if it’ll literally be one set of weights doing it,” Falcon said, referring to the “brains” behind LLMs. “If it’s a collection of models that feels like AGI, that’s fine.

No one’s a purist here.”