The AI world was shaken last week when DeepSeek, a Chinese AI startup, announced its latest DeepSeek-R1 language model, which appeared to match the capabilities of leading American systems at a fraction of the cost. The announcement triggered a widespread market sell-off that wiped almost $200 billion off Nvidia's market value and sparked heated debates about the future of AI development.
The narrative that quickly emerged suggested that DeepSeek had fundamentally disrupted the economics of building advanced AI systems, supposedly achieving for just $6 million what US companies had spent billions to accomplish. This interpretation sent shockwaves through Silicon Valley, where companies such as OpenAI, Anthropic, and Google have justified massive investments in compute infrastructure to maintain their technological edge.
But amid the market turbulence and breathless headlines, Dario Amodei, Anthropic co-founder and one of the pioneering researchers behind today's large language models (LLMs), published a detailed analysis offering a more nuanced perspective on DeepSeek's achievements. His blog post cuts through the hysteria to deliver several crucial insights about what DeepSeek actually accomplished and what it means for the future of AI development.
Here are the four key insights from Amodei's analysis that reshape our understanding of the DeepSeek announcement.
DeepSeek's reported development costs need to be seen through a broader lens, Amodei argues, directly challenging the popular interpretation:
"DeepSeek does not 'do for $6M what cost US AI companies billions.' I can only speak for Anthropic, but Claude 3.5 Sonnet is a mid-sized model that cost a few $10M's to train (I won't give an exact number)."
This fundamentally changes the narrative around DeepSeek's cost efficiency. Considering that Sonnet was trained 9 to 12 months ago and still outperforms DeepSeek's model on many tasks, the achievement looks more like the natural progression of AI development costs than a revolutionary leap.

Timing and context also matter significantly. Given the historical trend of falling costs in AI development, which Amodei estimates at roughly 4x per year, DeepSeek's cost structure appears to be on trend rather than dramatically ahead of the curve.
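To make that trend concrete, here is a back-of-the-envelope sketch (all dollar figures are hypothetical placeholders, not numbers from Amodei's post): under a 4x-per-year decline, a training run that cost $60 million a year ago would be expected to cost about $15 million today.

```python
# Back-of-the-envelope sketch of the ~4x-per-year cost-decline trend.
# All dollar figures are hypothetical placeholders, not reported numbers.

ANNUAL_COST_DECLINE = 4.0  # Amodei's rough estimate: costs fall ~4x per year

def expected_cost(original_cost_usd: float, months_elapsed: float) -> float:
    """Project what an equivalent training run should cost after a delay."""
    years = months_elapsed / 12.0
    return original_cost_usd / (ANNUAL_COST_DECLINE ** years)

# A hypothetical $60M run from 12 (or 9) months ago, repriced for today:
print(f"${expected_cost(60e6, 12):,.0f}")  # -> $15,000,000
print(f"${expected_cost(60e6, 9):,.0f}")   # -> $21,213,203
```

On this curve, a newer model matching an older one at a quarter of the cost is roughly what the trend predicts, which is the core of Amodei's point.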
While markets and media focused intensely on DeepSeek's R1 model, Amodei points out that the company's most significant innovation arrived earlier.
"DeepSeek-V3 was actually the real innovation and what should have made people take notice a month ago (we certainly did). As a pretrained model, it appears to come close to the performance of state-of-the-art US models on some important tasks, while costing substantially less to train."
The distinction between V3 and R1 is crucial to understanding DeepSeek's true technological advance. V3 represented genuine engineering innovations, particularly in its management of the model's key-value cache and in pushing the limits of the mixture-of-experts (MoE) method.
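For readers unfamiliar with the technique, the toy sketch below (illustrative only, not DeepSeek's implementation; all sizes are arbitrary) shows the core idea of MoE: a small router sends each token to just a few "expert" subnetworks, so only a fraction of the model's parameters do work on any given token.

```python
import numpy as np

# Toy mixture-of-experts (MoE) layer -- an illustrative sketch, not
# DeepSeek's implementation. All sizes are arbitrary placeholders.
rng = np.random.default_rng(0)
D_MODEL, N_EXPERTS, TOP_K = 16, 8, 2  # hidden size, expert count, experts per token

# Each "expert" is a tiny feed-forward map; the router scores experts per token.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) * 0.1 for _ in range(N_EXPERTS)]
router_w = rng.normal(size=(D_MODEL, N_EXPERTS)) * 0.1

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route each token to its TOP_K highest-scoring experts and mix the outputs."""
    out = np.zeros_like(x)
    logits = x @ router_w                      # (tokens, N_EXPERTS) routing scores
    for t in range(x.shape[0]):
        top = np.argsort(logits[t])[-TOP_K:]   # indices of the chosen experts
        weights = np.exp(logits[t][top])
        weights /= weights.sum()               # softmax over the selected experts
        for w, e in zip(weights, top):
            out[t] += w * (x[t] @ experts[e])  # only TOP_K of N_EXPERTS ever run
    return out

tokens = rng.normal(size=(4, D_MODEL))         # a batch of 4 toy "tokens"
print(moe_forward(tokens).shape)               # -> (4, 16)
```

The design payoff is that total capacity grows with the number of experts while per-token compute grows only with TOP_K, which is how MoE models can be large yet comparatively cheap to train and run. (The key-value cache work is a separate optimization, reducing the memory that attention layers keep around during generation, and is not shown here.)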
The V3/R1 distinction helps explain why the market's dramatic reaction to R1 may have been misplaced. R1 essentially added reinforcement learning capabilities on top of the V3 foundation, a step that multiple companies are currently taking with their own models.
Perhaps the most revealing aspect of Amodei's analysis concerns DeepSeek's overall investment in AI development.
"It has been reported (we can't be certain it is true) that DeepSeek actually had 50,000 Hopper-generation chips, which I'd guess is within a factor of ~2-3x of what the major US AI companies have. Those 50,000 Hopper chips cost on the order of ~$1B. Thus, DeepSeek's total spend as a company (as distinct from spend to train an individual model) is not vastly different from US AI labs."
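The ~$1B figure follows from simple arithmetic; the per-chip price below is an assumed round number in the commonly cited range for Hopper-class GPUs, not a figure from Amodei's post.

```python
# Sanity check on the ~$1B hardware estimate. The unit price is an
# assumption (a round number for Hopper-class GPUs), not a quoted figure.
chips = 50_000
assumed_unit_price_usd = 20_000  # hypothetical per-chip price
print(f"~${chips * assumed_unit_price_usd / 1e9:.1f}B")  # -> ~$1.0B
```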
The reported chip count dramatically reframes the narrative around DeepSeek's resources. While the company may have achieved impressive results in training individual models, its overall investment in AI development appears roughly comparable to that of its US counterparts.

The distinction between model training costs and total corporate investment underscores the continued importance of substantial resources in AI development. It suggests that while engineering efficiency can improve, staying competitive in AI still requires significant capital investment.
Amodei describes the present moment in AI development as unique but fleeting.
"We're therefore at an interesting 'crossover point', where it is temporarily the case that several companies can produce good reasoning models," he wrote. "This will rapidly cease to be true as everyone moves further up the scaling curve on these models."
This observation provides crucial context for understanding the current state of AI competition. The ability of multiple companies to achieve similar results in reasoning capabilities is a temporary phenomenon rather than a new status quo.

The implications for the future of AI development are significant. As companies continue to scale up their models, particularly in the resource-intensive area of reinforcement learning, the field is likely to differentiate once again based on who can invest the most in training and infrastructure. This suggests that while DeepSeek has reached an impressive milestone, it has not fundamentally altered the long-term economics of advanced AI development.
Amodei's detailed analysis of DeepSeek's achievements cuts through a week of market speculation to expose the real economics of advanced AI systems. His blog post systematically dismantles both the panic and the enthusiasm that followed DeepSeek's announcement, showing how the company's $6 million model training cost fits within the steady march of AI development.

Markets and media gravitate toward simple narratives, and the story of a Chinese company dramatically undercutting US AI development costs was irresistible. Yet Amodei's breakdown reveals a more complex reality: DeepSeek's total investment, particularly its roughly $1 billion in computing hardware, mirrors the spending of its American counterparts.

This moment of cost parity between US and Chinese AI development marks what Amodei calls a "crossover point", a temporary window in which multiple companies can achieve similar results. His analysis suggests the window will close as capabilities advance and training demands intensify, and the field will likely come to favor the organizations with the deepest resources.

Building advanced AI remains an expensive endeavor, and Amodei's careful examination shows why measuring its true cost requires looking at the full scope of investment. His methodical deconstruction of DeepSeek's achievements may ultimately prove more significant than the initial announcement that caused such turbulence in the markets.