
SAN JOSE, Calif. — Alibaba Group Holding claims to have broken yet another speed record in an escalating artificial intelligence (AI) race. On Wednesday, the company announced a new version of its Qwen model that it says “comprehensively outperformed” DeepSeek-V3 in certain benchmark tests.
Alibaba said its new Qwen 2.5-Max model also exceeded the performance of OpenAI’s GPT-4o and Meta Platforms Inc.’s Llama 3.1-405B on the large language model (LLM) benchmarks Arena-Hard and LiveBench. The new model equaled Anthropic’s Claude 3.5-Sonnet model, according to Alibaba Cloud.
“Qwen 2.5-Max outperforms…almost across the board GPT-4o, DeepSeek-V3 and Llama-3.1-405B,” Alibaba’s cloud unit said in an announcement posted on its WeChat account. Alibaba’s upgraded model can parse files, comprehend videos, count objects in images, and control a PC, the company said.
Alibaba’s claims are just the latest boasts in AI’s ongoing version of leapfrog. On Thursday, the U.S.-based Allen Institute for AI (Ai2) released what it asserts is a next-generation open model that outperforms DeepSeek-V3 and is on par with GPT-4o in early evaluations. Ai2 claimed the release shows the U.S. “can lead with competitive, open-source AI independent of the tech giants.”
The flurry of activity around performance claims started Jan. 10 with the introduction of DeepSeek’s AI assistant, followed by the Jan. 20 release of its low-cost R1 model. That release shook Silicon Valley, which has poured billions of dollars into AI development, and precipitated a $1 trillion sell-off of tech stocks.
It’s the furious pace of AI in general that has one analyst comparing it to previous innovation waves and the pitfalls each has faced. “While AI is new and unique, the competition is reminiscent of virtually all previous big tech launches,” says longtime tech analyst Jack Gold. “Remember the browser wars? Cloud wars? Even the database wars of many years ago? All the players looked for advantage in tech and often with outrageous marketing claims. Of course, the difference here is there are Chinese players involved and that causes some specific issues. But I see this as really a similar competition to what’s happened with so many new technologies in the past.”
Will Lu, co-founder of Orby AI, an AI agent platform used by Fortune 10 companies, added that the AI arms race among major players, whether in China or the U.S., is “less about who has the biggest model and more about how efficiently those models are deployed.”
Outside the U.S., DeepSeek’s success has prompted an equally frenzied scramble among the company’s domestic competitors to upgrade their own AI models.
On Jan. 22, TikTok owner ByteDance unveiled an update to its AI model, which it says outpaces OpenAI’s o1 on AIME, a benchmark drawn from the American Invitational Mathematics Examination that tests mathematical reasoning. DeepSeek had earlier claimed that R1 rivaled OpenAI’s o1 on several performance benchmarks.
The timing of Alibaba’s announcement, on the first day of the Lunar New Year, was significant: it was aimed at reaching as many people as possible to create buzz and shift a days-long fascination away from DeepSeek.
But in their haste to be the fastest, biggest AI gunslinger, are AI model makers merely drawing attention and funding to themselves? And are consumers and companies potentially exposing themselves to serious security issues and compromised personal data by adopting open-source models like DeepSeek? In other words, are they striking a Faustian bargain in their rush to get the fastest model as quickly as possible?
“Almost every engineer will talk of the triple constraint – good, fast, or cheap. At best one may get two of the three,” John Sheehy, senior vice president of research and strategy at IOActive, said in an email. “In the case of an all-out race, you likely will only get speed.”
Among the consequences of a strategic focus on speed for AI and machine learning, Sheehy said, are poor quality, weak security, more latent risks, and weaker understanding of the underlying science and supporting technologies.
“History shows that those who accept the benefits of technology without foresight often build their own gallows,” Peter Ackerson, general partner at Audere Capital, said.