GPT-5 Launch: 5 Major Breakthroughs and Real User Reviews
GPT-5 Officially Released, OpenAI Calls It the 'Biggest Leap'
In May 2026, OpenAI released GPT-5, which it describes as its most significant model update since GPT-4. The company claims five major breakthroughs. This article evaluates those claims independently, using public benchmarks and early user feedback.
5 Official Breakthroughs
1. Comprehensive Multimodal Capabilities
GPT-5 natively accepts text, image, audio, and video input in a single model, so there is no need to switch between 'GPT-4o' for understanding and 'DALL-E' for image generation.
Test: Given a product demo video and the prompt 'extract key selling points and generate marketing copy', GPT-5 completed the task in full, with video understanding quality close to that of Gemini 2.5 Pro.
2. Reasoning Ability Close to o3
| Benchmark | GPT-4o | GPT-5 | o3 (Reference) |
|---|---|---|---|
| AIME 2024 | 13.4% | 72.3% | 96.7% |
| GPQA Diamond | 53% | 79.1% | 87.7% |
| SWE-bench | 38% | 57.6% | 71.7% |
GPT-5 still trails o3 on all three benchmarks, but by far less than GPT-4o does. For most users, there is no longer a need to choose between a 'fast model' and a 'reasoning model': GPT-5 handles most scenarios directly.
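One way to read the table is to ask how much of the distance between GPT-4o and o3 is covered by GPT-5 on each benchmark. The sketch below does that arithmetic using only the scores from the table; the "gap closed" ratio is an illustrative metric introduced here, not one reported by OpenAI.

```python
# Scores copied from the benchmark table (percent): (GPT-4o, GPT-5, o3).
SCORES = {
    "AIME 2024": (13.4, 72.3, 96.7),
    "GPQA Diamond": (53.0, 79.1, 87.7),
    "SWE-bench": (38.0, 57.6, 71.7),
}


def gap_closed(gpt4o: float, gpt5: float, o3: float) -> float:
    """Fraction of the GPT-4o-to-o3 gap that GPT-5 covers (illustrative metric)."""
    return (gpt5 - gpt4o) / (o3 - gpt4o)


for name, (gpt4o, gpt5, o3) in SCORES.items():
    print(f"{name}: {gap_closed(gpt4o, gpt5, o3):.0%} of the gap closed")
```

By this measure GPT-5 covers roughly 71% of the gap on AIME 2024, 75% on GPQA Diamond, and 58% on SWE-bench, which supports "close to o3" more strongly for science questions than for coding.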
3. Context Window Expanded to 256k
The context window doubles from 128k to 256k tokens, enough to handle longer documents and larger codebases in a single request.
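To get a feel for what the larger window means in practice, the sketch below estimates whether a document fits. It rests on several assumptions that are not from OpenAI: "256k" is taken as 256,000 tokens, token counts are approximated at 4 characters per token (real counts depend on the tokenizer and the language), and the 8,000 tokens reserved for the reply is an arbitrary illustrative value.

```python
import math


def fits_in_context(
    text: str,
    window_tokens: int = 256_000,      # assumed meaning of "256k"
    reserve_for_output: int = 8_000,   # hypothetical headroom for the reply
    chars_per_token: float = 4.0,      # rough heuristic, not a real tokenizer
) -> bool:
    """Estimate whether `text` fits in the window with room left for output."""
    estimated_tokens = math.ceil(len(text) / chars_per_token)
    return estimated_tokens <= window_tokens - reserve_for_output


doc = "x" * 600_000  # about 150,000 estimated tokens
print(fits_in_context(doc, window_tokens=128_000))  # old window
print(fits_in_context(doc, window_tokens=256_000))  # new window
```

Under these assumptions, a 600,000-character document overflows the 128k window but fits in the 256k one. For anything precise, count tokens with the provider's own tokenizer rather than a character heuristic.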
4. Tool Calling Reliability Greatly Improved
The Function Calling success rate rose from 84% to 96%, a significant gain for AI agent applications, where a single task often chains several tool calls.
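The reason a 12-point gain matters so much for agents is that errors compound across steps. The sketch below works through the arithmetic under simplifying assumptions: the quoted rates are per call, calls fail independently, and the task fails if any call fails. The five-step task is a hypothetical example, not a figure from the source.

```python
def chain_success(per_call_rate: float, steps: int) -> float:
    """Probability that every call in a chain succeeds, assuming independence."""
    return per_call_rate ** steps


STEPS = 5  # hypothetical agent task with five tool calls
for label, rate in [("84% per call", 0.84), ("96% per call", 0.96)]:
    print(f"{label}: {chain_success(rate, STEPS):.1%} end-to-end over {STEPS} calls")
```

With these assumptions, a five-call task succeeds end to end about 41.8% of the time at 84% per call and about 81.5% of the time at 96% per call, so the chance of completing the whole chain nearly doubles.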
5. Price Same as GPT-4o
Despite the capability gains, GPT-5 is priced on par with GPT-4o, at $2.50 per 1M input tokens.
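As a quick worked example of what that rate means per request, the sketch below computes input cost only. The $2.50 per 1M input tokens figure is the one quoted above; output-token pricing is not given in this article, so it is deliberately left out, and the request sizes are illustrative.

```python
INPUT_PRICE_PER_MILLION_USD = 2.50  # figure quoted in this article


def input_cost_usd(input_tokens: int, price_per_million: float = INPUT_PRICE_PER_MILLION_USD) -> float:
    """Input-side cost of one request; output tokens are not included."""
    return input_tokens / 1_000_000 * price_per_million


for tokens in (10_000, 128_000, 256_000):  # illustrative request sizes
    print(f"{tokens:>7,} input tokens: ${input_cost_usd(tokens):.3f}")
```

At this rate, filling an entire 256,000-token context costs about $0.64 in input tokens, so repeatedly resending a full context in a long agent loop adds up faster than the per-token price suggests.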
Real User Feedback
Developer:
'When writing complex business logic, my one-shot success rate went from 60% to over 80%. Function Calling is much more stable.'
Content Creator:
'Chinese writing quality has improved significantly, especially coherence and logical structure in long texts.'
Researcher:
'Math reasoning is weaker than o3, but sufficient for most research tasks. No need to wait for o3's slow responses.'
When to Use GPT-5? When to Use o3?
Use GPT-5: Daily work tasks, multimodal tasks, real-time conversations requiring fast responses, Agent tool calls
Continue with o3: Math proofs, high-precision code debugging, research tasks requiring highest reasoning quality
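The guidance above can be written down as a simple routing rule. This is only a sketch of the article's recommendation: the task labels are hypothetical names invented for illustration, and the strings "gpt-5" and "o3" are used as plain labels here, not as confirmed API model identifiers.

```python
# Hypothetical task labels derived from the two lists above.
O3_TASKS = {"math_proof", "high_precision_debugging", "top_tier_research"}
GPT5_TASKS = {"daily_work", "multimodal", "realtime_chat", "agent_tool_calls"}


def pick_model(task: str) -> str:
    """Return 'o3' for tasks needing the highest reasoning quality, else 'gpt-5'."""
    if task in O3_TASKS:
        return "o3"
    # Default follows the article's view that GPT-5 handles most scenarios.
    return "gpt-5"


for task in ("math_proof", "agent_tool_calls", "something_else"):
    print(f"{task} -> {pick_model(task)}")
```

Defaulting to GPT-5 for unlisted tasks mirrors the article's position; a team with stricter accuracy requirements might reasonably invert that default.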
Pressure on Claude and Gemini
- Claude 3.5 Sonnet's writing advantage is significantly narrowed by GPT-5
- Gemini 2.5 Pro's multimodal advantage remains, but GPT-5 has entered the same competitive tier
Anthropic is expected to release Claude 4 in Q3, and Google will accelerate Gemini updates.
Conclusion
GPT-5 is a genuinely meaningful iteration. For most users, it can replace GPT-4o, a separate image generation tool, and most o3 use cases in an existing toolchain.
If you use only one AI tool, GPT-5 is the most reasonable primary choice for the second half of 2026.