Alibaba's Qwen3 released and open-sourced, with only one-third the parameters of DeepSeek-R1
Alibaba has open-sourced Qwen3, the new generation of its Tongyi Qianwen model family. The model has only about one-third the parameters of DeepSeek-R1, and Alibaba announced that its cost has dropped significantly while its performance surpasses leading models such as DeepSeek-R1 and OpenAI-o1.

Qwen3 is a "hybrid reasoning model" that integrates "fast thinking" and "slow thinking" into a single model, substantially reducing compute consumption. It adopts a mixture-of-experts (MoE) architecture with 235B total parameters, of which only 22B are activated per inference step. Qwen3 was pre-trained on 36 trillion tokens, and after multiple rounds of reinforcement learning in the post-training stage, the non-thinking mode was integrated seamlessly into the thinking model. The result is a marked improvement in reasoning, instruction following, tool calling, and multilingual capability.

Alongside the performance gains, deployment cost has also dropped sharply: the full version of Qwen3 can reportedly be deployed on just four NVIDIA H20 GPUs, with a GPU memory footprint only about one-third that of comparably performing models.
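The "235B total, 22B active" figure comes from sparse expert routing: a small gating network picks the top-k experts per token, so only those experts' weights participate in the forward pass. The sketch below is a toy illustration of that top-k routing idea with made-up dimensions; it is not Qwen3's actual implementation, and all names and sizes here are assumptions for illustration only.

```python
import numpy as np

def moe_forward(x, gate_w, expert_ws, k=2):
    """Toy sparse MoE layer: route the input to its top-k experts,
    so only a fraction of the total parameters are active.
    Dimensions are illustrative, not Qwen3's real sizes."""
    logits = x @ gate_w                     # router score per expert
    topk = np.argsort(logits)[-k:]          # indices of the k best experts
    weights = np.exp(logits[topk])
    weights /= weights.sum()                # softmax over selected experts only
    # Only the chosen experts run; all others stay idle for this token.
    out = sum(w * (x @ expert_ws[i]) for w, i in zip(weights, topk))
    return out, topk

rng = np.random.default_rng(0)
d, n_experts = 8, 16
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d, d))

out, active = moe_forward(x, gate_w, expert_ws, k=2)
print(f"{len(active)} of {n_experts} experts activated")
```

With 2 of 16 experts active, only 1/8 of the expert parameters are touched per token; scaling the same idea up is how a 235B-parameter model can run with roughly 22B active parameters.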