Chinese tech giant Alibaba’s artificial intelligence (AI) team has unveiled the latest version of its “Qwen-3 Max Thinking” model. According to recent test results, it outperforms competitors such as OpenAI’s “GPT-5.2” and Google’s “Gemini 3 Pro” on some specific tasks.

Strengths of the Qwen-3 Max Thinking model
Alibaba’s model is built on a hybrid architecture with more than a trillion parameters and was trained on over 36 trillion tokens.
The model can call external tools, such as web search, on its own, which reduces the number of so-called “hallucinations.” Its main strength, however, lies in the “Test-Time Scaling” method: the model spends additional computation on multi-step reasoning at inference time, which lets it solve complex tasks, such as programming or advanced mathematics problems, more accurately and consistently.
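Alibaba's exact procedure is not described here, so the following is only a minimal sketch of one generic form of test-time scaling, known as majority voting or self-consistency: ask for several answers to the same question and keep the most common one. The names `majority_vote`, `make_scripted_solver`, and `sample_answer` are hypothetical, and the scripted solver merely stands in for a real model call.

```python
from collections import Counter
from itertools import cycle
from typing import Callable


def majority_vote(sample_answer: Callable[[str], str], question: str, n_samples: int = 5) -> str:
    """Spend extra inference compute: draw several answers, return the most common one.

    `sample_answer` is a hypothetical stand-in for one model call.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    answers = [sample_answer(question) for _ in range(n_samples)]
    # most_common(1) returns a list holding one (answer, count) pair.
    return Counter(answers).most_common(1)[0][0]


def make_scripted_solver(scripted_answers: list[str]) -> Callable[[str], str]:
    """Build a fake solver that replays a fixed list of answers (illustration only)."""
    answers = cycle(scripted_answers)
    return lambda _question: next(answers)


if __name__ == "__main__":
    script = ["41", "42", "42", "40", "42"]
    question = "What is 6 * 7?"
    # One sample: the result is whatever the first attempt happens to be.
    print(majority_vote(make_scripted_solver(script), question, n_samples=1))
    # Five samples: occasional wrong attempts are outvoted.
    print(majority_vote(make_scripted_solver(script), question, n_samples=5))
```

In this toy run, the single attempt returns the wrong answer, while five attempts agree on the right one. The trade-off is cost: every extra sample is another full model call.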
Results on “Humanity’s Last Exam” and other tests
Independent tests show that Qwen-3 Max Thinking performed very well on “Humanity’s Last Exam,” a benchmark designed to measure how AI models cope with complex, academically demanding tasks.
In this test, Qwen-3 Max Thinking scored 49.8 percent correct, outperforming both Gemini 3 Pro (45.8 percent) and GPT-5.2 Thinking (45.5 percent).
Beyond Humanity’s Last Exam, the model also performed well on programming and math tasks. Its high scores on other tests indicate that Qwen-3 Max Thinking can solve both technical and academic problems effectively and can compete with the most advanced AI models currently on the market.

Qwen-3 Max Thinking, especially when using the “Test-Time Scaling” method, scores highly on a variety of science, math, and programming tests. In many evaluations, the model is on par with or outperforms GPT-5.2, Gemini 3 Pro, Claude Opus 4.5, and DeepSeek-V3.2, especially on tasks that allow the use of additional tools.
Is the DeepSeek “moment” repeating itself?
The Qwen-3 Max Thinking results inevitably recall last year’s DeepSeek breakthrough, when an AI model developed in China unexpectedly matched the models of the world’s largest tech companies.
As then, the test data suggest that China’s AI ecosystem can rapidly develop competitive models, particularly by improving reasoning methods and using computing resources efficiently.
However, whether this is a real breakthrough will only become clear later: individual tests show progress, but the model’s long-term adaptability and its ability to stay competitive in a fast-moving field matter far more.