DeepSeek Opts for Huawei Chips to Train Some Models
DeepSeek's decision is a sign that the Chinese AI developer is reducing its reliance on Nvidia chips.
The Takeaway
• DeepSeek chooses Huawei chips to train some AI models
• Decision signals its desire to reduce reliance on Nvidia
• DeepSeek tested chips from Baidu and Cambricon as well
DeepSeek, one of China’s leading artificial intelligence developers, has decided to use Huawei Technologies’ AI chips to train some of its AI models, a sign it is reducing its reliance on Nvidia chips, according to three people with knowledge of the effort. The move follows pressure from the Chinese government on local tech companies to make greater use of domestically made chips.
DeepSeek continues to use Nvidia chips for its largest and most powerful AI models. Even so, the decision to use Huawei chips for training smaller models signals a turning point in the use of U.S. technology by Chinese AI companies. It follows years of growing U.S. restrictions on the export of advanced chips to China, and a push by Chinese authorities for the country’s tech industry to become more self-reliant.
The shift could have a big impact on Nvidia’s China business in the long term: Nvidia CEO Jensen Huang estimated on Wednesday night that the Chinese AI chip market this year was worth $50 billion and would likely grow 50% a year. He also described China as “the second-largest computing market in the world.”
DeepSeek became a global sensation in late January after the company released R1, a deep-reasoning model with performance on par with OpenAI’s similar model at the time but trained at lower cost. In China, DeepSeek has been hailed as an example of the country’s tech innovation and resilience in the face of challenges due to the U.S. government’s export controls and other attempts to contain China’s technological advancements.
In recent months, DeepSeek has tested Chinese AI chips from Huawei, Baidu and Cambricon Technologies—a publicly listed AI chip designer founded by two brothers who were researchers at the Chinese Academy of Sciences—for use in training its models.
DeepSeek has selected Huawei and is working with its engineers to use the tech giant’s Ascend chips to train and refine smaller versions of DeepSeek’s next-generation R2 models, which haven’t been released yet. Huawei, one of the world’s biggest makers of telecom equipment and smartphones, is also China’s leading chip developer. Over the past year, Nvidia’s Huang has repeatedly called Huawei a “formidable” competitor.
Huawei and DeepSeek didn’t respond to requests for comment.
DeepSeek is still using Nvidia chips for the most powerful R2 models, a sign that it will take time to replace Nvidia with domestic alternatives, one of the people said. As Nvidia has long dominated the AI chip market everywhere, including in China, most Chinese AI developers are accustomed to training and operating their AI models using Nvidia chips and the Cuda software that accompanies them.
DeepSeek’s earlier models, such as the R1, were so deeply optimized for Nvidia’s hardware and software that running them with Chinese chips was harder to manage and less efficient, according to employees of Chinese cloud service providers that help their customers run DeepSeek models. This means DeepSeek needs to deepen its understanding of Huawei’s technology to make sure its AI models will work well with Huawei’s hardware and software.
Still, DeepSeek is renowned for its innovation in developing models at a fraction of the computing costs other AI companies spend on their models. The collaboration between DeepSeek and Huawei could help Huawei enhance its software and attract more users for its Ascend chips. Working together, they pose a greater threat to Nvidia’s dominance in the AI chip market.
Asked about DeepSeek’s partnership with Huawei, an Nvidia spokesperson said: “The competition has undeniably arrived. The world will choose the best technology stack for running the most popular applications and open-source models. To win the AI race, U.S. industry must earn the support of developers everywhere, including China.”
DeepSeek still hasn’t set the exact launch date for the R2, the highly anticipated successor to the company’s R1 models launched in January. The main reason DeepSeek hasn’t launched the new models yet is that CEO Liang Wenfeng still isn’t satisfied with the R2’s performance, said two of the people and two others with knowledge of DeepSeek’s work on the new models.
His expectations are high. DeepSeek wants to offer top-notch capabilities in reasoning, coding and math, but it also wants R2 models to excel in terms of efficiency and computing costs. To figure out how to perfect the models, DeepSeek’s researchers are conducting tests that remove parts of an AI model in order to understand each part’s contribution to overall performance, according to the people.
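Tests of this kind are generally known as ablation studies. As a rough illustration only, and not a description of DeepSeek's actual tooling, the idea can be sketched in a few lines of Python. The part names and scores below are hypothetical placeholders, and the `evaluate` function stands in for whatever benchmark a research team would really run.

```python
from typing import Callable, Dict, List


def ablation_study(
    parts: List[str],
    evaluate: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Score the full model, then re-score it with each part removed.

    Returns the drop in score attributable to each part.
    """
    baseline = evaluate(parts)
    contributions: Dict[str, float] = {}
    for name in parts:
        reduced = [p for p in parts if p != name]  # model without this part
        contributions[name] = baseline - evaluate(reduced)
    return contributions


# Hypothetical toy scores: each part adds a fixed amount to the benchmark.
TOY_SCORES = {"part_a": 0.5, "part_b": 0.25, "part_c": 0.125}


def toy_evaluate(active_parts: List[str]) -> float:
    """Stand-in benchmark: sum the toy scores of the parts still present."""
    return sum(TOY_SCORES[p] for p in active_parts)


result = ablation_study(list(TOY_SCORES), toy_evaluate)
```

In this toy setup the parts contribute independently, so each measured drop equals the part's own score. In a real model the parts interact, which is why such tests have to be run rather than calculated.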
While DeepSeek’s work on the R2 continues, the company is making progress in another effort to reduce its dependence on Nvidia. Earlier this month, when DeepSeek unveiled an upgraded version of its V3 foundation model, the company also introduced a new data processing format called UE8M0 FP8. Nvidia chips don’t typically support it, but the format is designed to work better with Chinese chips for AI models. The move shows how DeepSeek is trying to make sure its new AI models will run as smoothly with Chinese chips as they do on Nvidia chips.
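The format's name hints at how it works: "UE8M0" reads as unsigned, eight exponent bits and zero mantissa bits, meaning each stored value is a pure power of two. The Python sketch below illustrates that idea under stated assumptions. It borrows the bias of 127 and the NaN code of 0xFF from the Open Compute Project's Microscaling (MX) scale convention and rounds scales up to the next power of two. DeepSeek has not been shown here to use these exact choices, and the function names are hypothetical.

```python
import math

E8M0_BIAS = 127   # assumption: bias taken from the OCP Microscaling (MX) convention
E8M0_NAN = 0xFF   # assumption: NaN code under the same convention


def encode_ue8m0(scale: float) -> int:
    """Return the 8-bit code for a positive scale, rounded up to a power of two."""
    if not (math.isfinite(scale) and scale > 0):
        return E8M0_NAN
    exponent = math.ceil(math.log2(scale))
    # Clamp to the exponent range the 8-bit code can represent.
    exponent = max(-E8M0_BIAS, min(E8M0_BIAS, exponent))
    return exponent + E8M0_BIAS


def decode_ue8m0(code: int) -> float:
    """Turn an 8-bit code back into the power-of-two scale it stands for."""
    if code == E8M0_NAN:
        return math.nan
    return 2.0 ** (code - E8M0_BIAS)


code = encode_ue8m0(0.3)     # 0.3 rounds up to 0.5, i.e. 2 ** -1
scale = decode_ue8m0(code)
```

Because every scale is a power of two, applying it amounts to shifting an exponent rather than performing a full multiplication, which is one plausible reason such a format would be simpler to support in hardware.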
Political Landscape
DeepSeek’s collaboration with Huawei comes as Nvidia’s Chinese business has stalled amid conflicting directives from the U.S. and Chinese governments. To comply with U.S. export controls, Nvidia had developed chips tailor-made for the Chinese market that weren’t as powerful as its most advanced chips. In April, President Donald Trump’s administration blocked the sale of even those chips, but last month, Trump reversed that position and allowed Nvidia to sell the China-specific chips, known as H20s. He said earlier this month that he and Huang had agreed Nvidia would share 15% of the resulting Chinese revenue with the U.S. government.
But even as the U.S. government was softening its position, China’s internet regulator ordered local tech companies including ByteDance, Alibaba Group and Tencent Holdings to suspend their purchases of Nvidia chips, citing data security concerns. Further complicating matters, Nvidia Chief Financial Officer Colette Kress revealed on Wednesday that the U.S. government had not yet codified the revenue-sharing agreement with the chipmaker and Nvidia hadn’t shipped any of the chips it was now allowed to sell to China.
Meanwhile, since last year, the Chinese government has been ramping up its effort to promote the use of Chinese chips for domestic AI development, The Information previously reported.
While major Chinese tech firms have tested Huawei chips, few have fully adopted them for AI training. That’s partly because of the difficulty developers face in adapting to Huawei’s distinct software system, Cann, the Chinese company’s equivalent of Nvidia’s Cuda software.
DeepSeek’s decision to work more closely with Huawei indicates how the AI developer, an offshoot of Chinese quantitative hedge fund High-Flyer Capital Management, has become an important part of China’s national strategy in its intensifying technology competition against the U.S. Indicating DeepSeek’s growing significance for China, The Information reported in March that the company had asked some of its employees involved in the development of AI models to hand in their passports, restricting them from traveling abroad freely. At the time it told those employees their work made them privy to confidential information that could constitute trade secrets or even state secrets.