
OpenAI’s o1 model does not show its thinking, giving open source an advantage




OpenAI has ushered in a new paradigm of reasoning in large language models (LLMs) with its o1 model, which recently received a major update. But while OpenAI has a strong lead in reasoning models, it could lose some ground to rapidly emerging open source rivals.

Models like o1, sometimes called large reasoning models (LRMs), use additional inference-time compute to “think” more, review their answers, and correct them. This lets them solve complex reasoning problems that classic LLMs struggle with, and makes them especially useful for tasks such as coding, mathematics, and data analysis.

However, in recent days, developers have shown mixed reactions to o1, especially after the updated release. Some have posted examples of o1 accomplishing amazing tasks, while others have expressed frustration with the model’s confusing responses. Developers have reported all kinds of problems, from illogical changes to code to ignored instructions.

Secrecy around o1’s details

Part of the confusion is due to OpenAI’s secrecy and its refusal to show the details of how o1 works. The secret ingredient behind the success of LRMs is the additional tokens the model generates as it arrives at the final answer, often called the model’s “thoughts” or “reasoning chain.” For example, if you ask a classic LLM to generate code for a task, it will generate the code immediately. An LRM, in contrast, will generate reasoning tokens that examine the problem, plan the code structure, and consider multiple solutions before issuing the final answer.

o1 hides its thinking process and shows only the final answer, along with a message indicating how long the model thought and possibly a summary of the reasoning process. This is partly to keep responses uncluttered and provide a smoother user experience. But more importantly, OpenAI considers the chain of reasoning a trade secret and wants to make it difficult for competitors to replicate o1’s capabilities.

The costs of training new models continue to rise while profit margins are not keeping pace, which is pushing some AI labs to become more secretive to extend their lead. Even Apollo Research, which red-teamed the model, did not have access to its chain of reasoning.

This lack of transparency has led users to all kinds of speculation, including accusations that OpenAI has degraded the model to reduce inference costs.

Fully transparent open source models

On the other hand, open source alternatives such as Alibaba’s QwQ (Qwen with Questions) and Marco-o1 show the complete reasoning chain of their models. Another alternative is DeepSeek R1, which is not open source but still reveals its reasoning tokens. Seeing the chain of reasoning allows developers to troubleshoot their prompts and find ways to improve the model’s responses by adding extra instructions or in-context examples.
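When a model exposes its reasoning, separating the chain of thought from the final answer is a one-liner. Below is a minimal sketch that splits an R1-style completion, assuming the reasoning is wrapped in `<think>` tags (the convention DeepSeek R1 uses); the sample output itself is invented for illustration:

```python
import re

def split_reasoning(completion: str) -> tuple[str, str]:
    """Separate the chain of thought from the final answer.

    Assumes the reasoning is wrapped in <think>...</think> tags,
    as DeepSeek R1-style outputs do.
    """
    match = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    if match is None:
        return "", completion.strip()          # no visible reasoning
    reasoning = match.group(1).strip()
    answer = completion[match.end():].strip()  # text after the closing tag
    return reasoning, answer

# Invented sample completion, for illustration only
sample = "<think>The user wants a sum. 2 + 3 = 5.</think>The answer is 5."
reasoning, answer = split_reasoning(sample)
```

With the reasoning isolated this way, a developer can log it alongside the answer and inspect where a prompt led the model astray.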

Visibility into the reasoning process is especially important when you want to integrate model responses into applications and tools that expect consistent results. Control over the underlying model also matters in enterprise applications. Private models, and the scaffolding that supports them, such as the safeguards and filters that check their inputs and outputs, are constantly changing. While this can improve overall performance, it can break many prompts and applications built on top of them. In contrast, open source models give developers full control of the model, which can be a stronger option for enterprise applications where performance on very specific tasks matters more than general skills.

QwQ and R1 are still preview versions, and o1 has the advantage in accuracy and ease of use. For many uses, such as general ad hoc prompts and one-off requests, o1 may still be a better choice than the open source alternatives.

But the open source community is quickly catching up with proprietary models, and we can expect more reasoning models to emerge in the coming months. They can become a suitable alternative for applications where visibility and control are crucial.



