Alibaba releases and open-sources Qwen3, with only one-third the parameters of DeepSeek-R1
2025-04-29 07:43:28

Alibaba has open-sourced Qwen3, the latest generation of its Tongyi Qianwen model family. The model has only one-third the parameters of DeepSeek-R1, and Alibaba says its cost has dropped significantly while its performance surpasses leading models such as R1 and OpenAI-o1. Qwen3 is a "hybrid reasoning model" that integrates "fast thinking" and "slow thinking" in a single model, which the company says substantially reduces compute consumption. Qwen3 reportedly uses a mixture-of-experts (MoE) architecture with 235B total parameters, of which only 22B are activated. It was pre-trained on 36T tokens, and multiple rounds of reinforcement learning in the post-training stage were used to integrate the non-thinking mode seamlessly into the thinking model. Alibaba says Qwen3 has improved markedly in reasoning, instruction following, tool calling, and multilingual capabilities. Alongside the performance gains, the deployment cost has also fallen sharply: only four H20 GPUs are needed to deploy the full version of Qwen3, and its GPU memory usage is only one-third that of models with similar performance.
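As a rough check on the parameter figures above, the ratios can be worked out in a few lines of Python. This is only an illustrative sketch: the 235B and 22B values come from the article, while the 671B total for DeepSeek-R1 is an assumption taken from that model's publicly reported size and is not stated here.

```python
# Parameter counts in billions.
QWEN3_TOTAL_B = 235    # from the article
QWEN3_ACTIVE_B = 22    # from the article
# Assumption: DeepSeek-R1's publicly reported total; not given in the article.
DEEPSEEK_R1_TOTAL_B = 671


def ratio(part: float, whole: float) -> float:
    """Return part / whole as a plain fraction."""
    return part / whole


# Share of Qwen3's parameters that are activated.
active_share = ratio(QWEN3_ACTIVE_B, QWEN3_TOTAL_B)

# Qwen3's total size relative to DeepSeek-R1 (under the assumption above).
size_vs_r1 = ratio(QWEN3_TOTAL_B, DEEPSEEK_R1_TOTAL_B)

print(f"Activated share of Qwen3 parameters: {active_share:.1%}")
print(f"Qwen3 total vs. DeepSeek-R1 total:   {size_vs_r1:.1%}")
```

Under these assumptions, roughly a tenth of Qwen3's parameters are activated, and its total size is a little over one-third of DeepSeek-R1's, which is consistent with the "one-third" figure in the headline.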