DeepSeek Claims Training Cost of AI Model R1 Significantly Lower than US Rivals - PRESS AI WORLD

DeepSeek Claims Training Cost of AI Model R1 Significantly Lower than US Rivals

Published: Friday, September 19 | Updated: Friday, September 19

Credited from: Channel News Asia

  • DeepSeek claims training its R1 AI model cost just $294,000, significantly lower than U.S. competitors.
  • The company aims to challenge U.S. dominance in AI development markets.
  • DeepSeek used 512 Nvidia H800 chips for training, raising questions about hardware sourcing amidst export restrictions.
  • OpenAI CEO Sam Altman has said that training foundational models costs "much more" than $100 million.
  • DeepSeek acknowledges using A100 chips during preparatory stages, complicating the transparency of its cost claims.

Chinese AI developer DeepSeek has revealed that it spent only $294,000 training its R1 model, a figure that has raised eyebrows in the technology sector because it is far below the costs reported by U.S. companies such as OpenAI. The figure appeared in a peer-reviewed article in the academic journal Nature, marking the first official disclosure of R1's training expenses by the Hangzhou-based company. The emergence of this cost-effective AI model has unsettled investors, prompting them to reassess the capital typically assumed necessary for AI development, particularly at leading firms like Nvidia, which has traditionally dominated the AI hardware market, according to Reuters, India Times, and Channel News Asia.

DeepSeek's announcement signals a calculated strategy to position itself as a serious challenger in an AI landscape that includes prominent companies like OpenAI and Google. Its R1 model stands out not only for its low training cost but also because it is open-source and freely accessible without usage limits, which may further democratize AI technology. By comparison, OpenAI CEO Sam Altman has indicated that training foundational models can cost well over $100 million, a striking disparity in the financial resources devoted to AI training, as highlighted in reporting by Reuters, India Times, and Channel News Asia.

However, DeepSeek's claims have been contested by U.S. companies and government officials, who raised concerns about how the company sourced its hardware. The H800 chips used to train the R1 model were designed specifically for the Chinese market after tighter U.S. export controls enacted in late 2022 restricted the sale of more powerful AI chips to China. U.S. officials have asserted that DeepSeek may have gained unauthorized access to H100 chips, while Nvidia maintains that only H800 chips were used. Additionally, in a supplementary document to the Nature article, DeepSeek acknowledged that it has access to A100 chips, which were used in the model's initial development phases, further clouding the transparency of its cost claims, according to Reuters, India Times, and Channel News Asia.
