The day after Christmas, a small Chinese start-up called DeepSeek unveiled a new artificial intelligence system that could match the capabilities of cutting-edge chatbots from companies like OpenAI and Google.
That alone would be a milestone. But the team behind the system, called DeepSeek-V3, described an even bigger step. In a research paper explaining how they built the technology, DeepSeek’s engineers said they used only a fraction of the highly specialized computer chips that leading AI companies relied on to train their systems.
These chips are at the center of a tense technological rivalry between the United States and China. As the US government works to maintain the country’s lead in the global artificial intelligence race, it is trying to limit the number of powerful chips, such as those made by Silicon Valley firm Nvidia, that can be sold to China and other competitors.
But the DeepSeek model’s performance raises questions about the unintended consequences of the US government’s trade restrictions. The controls have forced researchers in China to get creative with a wide range of tools freely available online.
The DeepSeek chatbot answered questions, solved logical problems and wrote its own computer programs as proficiently as anything already on the market, according to benchmark tests used by US AI companies.
And it was built on the cheap, challenging the prevailing notion that only the biggest companies in the tech industry, all of them based in the United States, could afford to build the most advanced AI systems. The Chinese engineers said they needed only about $6 million in raw computing power to build their new system. That is about one-tenth of what tech giant Meta spent on building its latest AI technology.
“The number of companies that have $6 million to spend is far greater than the number of companies that have $100 million or $1 billion to spend,” said Chris V. Nicholson, an investor at venture capital firm Page One Ventures, which focuses on AI technologies.
Ever since OpenAI sparked the AI boom in 2022 with the release of ChatGPT, many experts and investors had concluded that no company could compete with the market leaders without spending hundreds of millions of dollars on specialized chips.
The world’s leading AI companies train their chatbots using supercomputers that use 16,000 chips or more. DeepSeek’s engineers, on the other hand, said they needed only about 2,000 specialized computer chips from Nvidia.
The US chip restrictions forced DeepSeek’s engineers to “train it more efficiently so it can still be competitive,” said Jeffrey Ding, an assistant professor at George Washington University who specializes in emerging technology and international relations.
Earlier this month, the Biden administration issued new rules aimed at preventing China from acquiring advanced artificial intelligence chips through other countries. The rules build on multiple rounds of earlier restrictions that prevented Chinese companies from being able to buy or make cutting-edge computer chips. President Trump has not yet indicated whether he will keep the rules or rescind them.
The US government has tried to keep the advanced chips out of the hands of Chinese companies because of concerns that they could be used for military purposes. In response, some companies in China have stockpiled thousands of chips, while others have sourced them from a thriving underground market of smugglers.
DeepSeek is run by a quantitative stock trading firm called High-Flyer. By 2021, the firm had funneled its profits into acquiring thousands of Nvidia chips, which it used to train its earlier models. The company, which did not respond to requests for comment, has become known in China for luring talent from top universities with the promise of high salaries and the ability to pursue the research questions that interest them most.
Zihan Wang, a computer engineer who worked on an earlier DeepSeek model, said the company also hires people without a computer science background to help the technology understand and create poetry and ace questions on the extremely difficult Chinese college entrance exams.
DeepSeek does not make products for consumers, leaving its engineers to focus solely on research. That means its technology does not fall under the strictest aspect of China’s AI regulations, which require consumer-facing technology to comply with government controls on information.
Leading US companies continue to advance the state of the art in artificial intelligence. In December, OpenAI unveiled a new “thinking” system called o3 that outperforms existing technologies, although it is not yet widely available outside the company. But DeepSeek continues to show that it is not far behind. This month, it released an impressive reasoning model of its own.
(The New York Times sued OpenAI and its partner Microsoft, accusing them of copyright infringement of news content related to artificial intelligence systems. OpenAI and Microsoft have denied these claims.)
A critical part of this rapidly changing global market is an old concept: open source software. Like many other companies, DeepSeek has open-sourced its latest AI system, meaning it has shared the underlying code with other businesses and researchers. This allows others to build and distribute their own products using the same technologies.
While workers at large Chinese tech companies are limited to working with colleagues, “if you work in open source, you work with talent all over the world,” said Yineng Zhang, a lead software engineer at Baseten in San Francisco who works on the open source SGLang project. He helps other people and companies build products using DeepSeek’s system.
The open source AI ecosystem gathered steam in 2023 when Meta freely shared an AI system called Llama. Many assumed that this community would flourish only if companies like Meta, tech giants with massive data centers filled with specialized chips, continued to open source their technologies. But DeepSeek and others have shown that they can also extend the power of open source technologies.
Many executives and experts have argued that major US companies should not open source their technologies because they could be used to spread disinformation or do other serious harm. Some US lawmakers have explored the possibility of preventing or cracking down on the practice.
But others argue that if regulators stifle the progress of open source technology in the United States, China will gain a significant advantage. If the best open source technologies come from China, they argue, American developers will build their systems on those technologies. In the long term, this could put China at the center of AI research and development.
“The center of gravity of the open source community has shifted to China,” said Ion Stoica, a professor of computer science at the University of California, Berkeley. “This could be a huge risk for the US” because it allows China to accelerate the development of new technologies.
Hours after his inauguration, President Trump rescinded an executive order from the Biden administration that threatened to curtail open source technologies.
Dr. Stoica and his students recently built an AI system called Sky-T1 that rivals the performance of the latest OpenAI system, called OpenAI o1, in some benchmark tests. They only needed $450 in computing power.
They did this by building on top of two open source technologies released by Chinese tech giant Alibaba.
Their $450 system isn’t as powerful as OpenAI’s technology or DeepSeek’s new system. And the techniques they used are unlikely to yield systems that outperform state-of-the-art technologies. But the project showed that even businesses with tiny resources can build competitive systems.
Reuven Cohen, a technology consultant in Toronto, has been using DeepSeek-V3 since late December. He says it’s comparable to the latest systems from OpenAI, Google and San Francisco start-up Anthropic — and much cheaper to use.
“DeepSeek is a way for me to save money,” he said. “This is the kind of technology someone like me wants to use.”