
OpenAI responds to the DeepSeek competition with detailed reasoning traces for o3-mini




OpenAI now shows more details of the reasoning process of o3-mini, its latest reasoning model. The change was announced on OpenAI's X account and comes as the AI lab faces growing pressure from DeepSeek-R1, an open rival model that fully displays its reasoning tokens.

Models such as o3 and R1 go through a long "chain of thought" (CoT) process in which they generate extra tokens to break the problem down, reason about it, try different answers, and arrive at a final solution. Previously, OpenAI's reasoning models hid their chain of thought and only produced a high-level overview of the reasoning steps. This made it hard for users and developers to understand the model's reasoning and to adjust their instructions and prompts to steer it in the right direction.

OpenAI considered the chain of thought a competitive advantage and hid it to prevent rivals from copying it to train their own models. But with R1 and other open models showing their full reasoning traces, that lack of transparency has become a disadvantage for OpenAI.

The new version of o3-mini shows a more detailed version of the CoT. Although we still don't see the raw tokens, it provides much more clarity into the reasoning process.

Why it matters for applications

In our previous experiments with o1 and R1, we found that o1 was slightly better at solving data analysis and reasoning problems. However, one key limitation was that there was no way to figure out why the model made mistakes, and it often made mistakes when faced with messy real-world data scraped from the web. R1's chain of thought, on the other hand, allowed us to troubleshoot problems and adjust our prompts to improve its reasoning.

For example, in one of our experiments, both models failed to provide the correct answer. But thanks to R1's detailed chain of thought, we were able to discover that the problem was not with the model itself but with the retrieval stage that gathered information from the web. In other experiments, R1's chain of thought gave us clues when it failed to parse the information we provided, while o1 only gave us a very rough overview of how it was formulating its answer.

We tested the new o3-mini model on a variant of a previous experiment we ran with o1. We gave the model a text file containing the prices of various stocks from January 2024 to January 2025. The file was noisy and unformatted, a mix of plain text and HTML elements. We then asked the model to calculate the value of a portfolio that invested $140 in the Magnificent 7 stocks on the first day of each month from January 2024 to January 2025, distributed evenly across all the stocks (we used the term "Mag 7" in the prompt to make it slightly more challenging).

o3-mini's CoT was genuinely useful this time. First, the model reasoned about what the Mag 7 was, filtered the data to keep only the relevant stocks (to make the problem challenging, we added some non-Mag 7 stocks to the data), calculated the monthly amount to invest in each stock, and did the final calculations to provide the correct answer (the portfolio would be worth around $2,200 at the last date recorded in the data we gave the model).
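For readers who want a sense of the arithmetic the model had to reconstruct from the noisy file, here is a minimal Python sketch of the dollar-cost-averaging calculation. The tickers, prices, and function name below are illustrative placeholders, not the actual data from our test file.

```python
# Sketch of the calculation o3-mini was asked to perform.
# All prices here are made-up placeholders, not the data from our test file.

MAG7 = ["AAPL", "MSFT", "GOOGL", "AMZN", "NVDA", "META", "TSLA"]
MONTHLY_INVESTMENT = 140.0  # dollars invested on the first day of each month

# Hypothetical first-of-month prices, {month: {ticker: price}}; the real file
# covered January 2024 through January 2025 and included non-Mag 7 tickers too.
prices = {
    "2024-01": {"AAPL": 185.0, "MSFT": 370.0, "GOOGL": 138.0, "AMZN": 150.0,
                "NVDA": 48.0, "META": 345.0, "TSLA": 248.0},
    "2024-02": {"AAPL": 188.0, "MSFT": 405.0, "GOOGL": 142.0, "AMZN": 155.0,
                "NVDA": 62.0, "META": 470.0, "TSLA": 190.0},
    "2025-01": {"AAPL": 230.0, "MSFT": 420.0, "GOOGL": 190.0, "AMZN": 220.0,
                "NVDA": 140.0, "META": 600.0, "TSLA": 400.0},
}

def portfolio_value(prices: dict, final_month: str) -> float:
    """Split $140 evenly across the Mag 7 each month, then value the
    accumulated shares at the final month's prices."""
    per_stock = MONTHLY_INVESTMENT / len(MAG7)   # $20 per stock per month
    shares = {ticker: 0.0 for ticker in MAG7}
    for month, quotes in prices.items():
        for ticker in MAG7:
            shares[ticker] += per_stock / quotes[ticker]  # shares bought that month
    final = prices[final_month]
    return sum(shares[t] * final[t] for t in MAG7)

print(f"Portfolio value: ${portfolio_value(prices, '2025-01'):,.2f}")
```

The hard part of the task was not this arithmetic but extracting clean first-of-month prices from the messy input and recognizing which tickers belong to the Mag 7, which is where the detailed CoT showed its value.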

Much more testing will be needed to see the limits of the new chain of thought, since OpenAI is still hiding many details. But in our experiments, the new format appears to be much more useful.

What it means for OpenAI

When DeepSeek-R1 was released, it had three clear advantages over OpenAI's reasoning models: it was open, cheap, and transparent.

Since then, OpenAI has managed to narrow the gap. While o1 costs $60 per million output tokens, o3-mini costs only $4.40, while outperforming o1 on many reasoning benchmarks. R1 costs around $7 to $8 per million tokens on U.S. providers. (DeepSeek offers R1 at $2.19 per million tokens on its own servers, but many organizations will not be able to use it because it is hosted in China.)

With the new change to its CoT output, OpenAI has managed to work around the transparency problem.

It remains to be seen what OpenAI will do about open-sourcing its models. Since its release, R1 has already been adapted, forked, and hosted by many different labs and companies, potentially making it the preferred reasoning model for enterprises. OpenAI CEO Sam Altman recently admitted that he was "on the wrong side of history" in the open-source debate. We will have to see how this realization manifests in OpenAI's future releases.
