
OpenAI launches GPT-5 with router-based reasoning system
On August 7, 2025, OpenAI launched GPT-5 in a livestream led by chief executive Sam Altman, describing the unified model family as a "PhD-level expert" across domains, according to the BBC, NBC News, and OpenAI's announcement page. The release introduces a real-time router that selects between a base model and a longer-reasoning variant called GPT-5 thinking, with smaller GPT-5 mini and GPT-5 nano models offered to developers via the API.
GPT-5 consolidates the sprawl of GPT-4-era models (GPT-4, GPT-4o, GPT-4 Turbo, o3) under a single brand. OpenAI had been preparing the launch since at least mid-2024 and pushed the release back several times, with Axios and Reuters reporting that the company spent the spring of 2025 on alignment, evaluation, and safety work. Altman first publicly previewed the model on Theo Von's podcast in July 2025.
At the launch event, Altman compared the experience to past generations: "GPT-3, sort of felt to me like talking to a high school student. GPT-4 felt like you're kind of talking to a college student. With GPT-5, now it's like talking to an expert, a legitimate PhD-level expert in anything, in any area you need," in remarks reported by the BBC and PCMag. NBC News also quoted him saying the model felt like "a team of Ph.D. level experts in your pocket."
OpenAI reported state-of-the-art results on several public benchmarks: 74.9% on SWE-bench Verified for software-engineering tasks; 94.6% on AIME 2025 mathematics without tools; 85.7% on GPQA Diamond science questions; and 88% on the Aider Polyglot coding benchmark, according to OpenAI's developer post and Wired. The company also said GPT-5's responses are roughly 45% less likely to contain a factual error than GPT-4o, and about 80% less likely than the o3 reasoning model when GPT-5 thinking is engaged. API pricing is $1.25 per 1M input tokens and $10 per 1M output tokens for the flagship model, $0.25 / $2 for mini, and $0.05 / $0.40 for nano.
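To make the per-token prices above concrete, here is a minimal sketch of a cost estimate. The prices are the ones quoted in this article; the dictionary keys and the function name are illustrative labels chosen for this example, not confirmed API identifiers.

```python
# USD per 1M tokens as (input, output), taken from the prices quoted above.
# The keys are illustrative labels, not confirmed API model identifiers.
PRICES_PER_MILLION = {
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# One million tokens in and one million tokens out on the flagship model.
flagship_cost = estimate_cost("gpt-5", 1_000_000, 1_000_000)  # 11.25
```

At these rates output tokens cost eight times as much as input tokens on every tier, so response length is likely to dominate the bill for chat-style workloads.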
Coverage was mixed. The Register highlighted the hallucination-rate claims and the consolidation of OpenAI's model line-up, while Wired noted that the router approach hides model-selection details from end users. Some early testers and developers complained on social media that GPT-5's Plus-tier responses felt less personable than GPT-4o, prompting OpenAI to restore optional access to legacy models within days of launch, according to NBC News.
For us at Enpo Sekai, the most relevant change in GPT-5 is not the headline benchmark but the introduction of an automatic router. Our character and persona work depends on tight, predictable per-turn behavior (tone, latency, refusal patterns), and a router that silently switches models breaks the contract our designers and writers rely on. We are evaluating GPT-5 mini and GPT-5 thinking as separately addressable endpoints rather than the routed default, and we will keep our pipelines model-agnostic until that behavior settles.
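One way to picture the model-agnostic approach described above is a small routing table that pins each kind of persona turn to an explicitly named model and hands the call to an interchangeable backend. This is a sketch under stated assumptions, not our production pipeline: the turn types, latency budgets, model labels, and the `fake_backend` stand-in are all hypothetical, and no real vendor API is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any callable taking (model_label, prompt) and returning text.
# Swapping vendors means swapping this callable, not the routing table.
Backend = Callable[[str, str], str]


@dataclass(frozen=True)
class PersonaRoute:
    model: str           # explicitly pinned model label (hypothetical)
    max_latency_ms: int  # latency budget for this turn type (hypothetical)


# Hypothetical turn types, each pinned to one model instead of a routed default.
ROUTES: Dict[str, PersonaRoute] = {
    "banter": PersonaRoute(model="gpt-5-mini", max_latency_ms=800),
    "lore_lookup": PersonaRoute(model="gpt-5-thinking", max_latency_ms=5000),
}


def run_turn(turn_type: str, prompt: str, backend: Backend) -> str:
    """Send the prompt to the model pinned for this turn type."""
    route = ROUTES[turn_type]
    return backend(route.model, prompt)


def fake_backend(model: str, prompt: str) -> str:
    """Stand-in backend for local testing; a real one would call a vendor SDK."""
    return f"[{model}] {prompt}"


reply = run_turn("banter", "hi", fake_backend)
```

Keeping the model choice in data rather than in code means the pinning can be revisited, or relaxed to a routed default, without touching the calling code.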
We will be watching three things over the next twelve months: (1) whether OpenAI exposes deterministic per-call control over the GPT-5 router or keeps it opaque; (2) how the SWE-bench gains translate into real productivity for live game-engine and tooling work, where most of our internal coding happens; and (3) whether the hallucination-rate claims hold up in long-running, character-driven conversations, where small factual drifts compound into broken persona consistency.


