Useful information
Prime News delivers timely, accurate news and insights on global events, politics, business, and technology
Useful information
Prime News delivers timely, accurate news and insights on global events, politics, business, and technology
Join our daily and weekly newsletters to get the latest updates and exclusive content on industry-leading AI coverage. More information
On the ninth day of its holiday-themed product announcement series known as “12 Days of OpenAI,” OpenAI is releasing its most advanced model, o1, to third-party developers through its application programming interface (API).
This marks a big step forward for developers looking to create new advanced AI applications or integrate the most advanced OpenAI technology into their existing applications and workflows, whether enterprise or consumer-facing.
If you’re not already familiar with OpenAI’s o1 series, here’s the rundown: Announced in September 2024, the first in a new “family” of models from the ChatGPT company, which goes beyond large language models ( LLM) of the GPT family series and which offers “reasoning” capabilities.
Basically, the o1 family of models (o1 and o1 mini) take longer to respond to user input, but they check for themselves. while formulating a response to see if they are correct and avoid hallucinations. At the time, OpenAI said that o1 could handle more complex PhD-level problems, something confirmed by real world users, too.
While developers previously had access to a pre-release version of o1 upon which they could build their own applications (for example, a PhD advisor or a lab assistant), the production-ready release of the full o1 model through the API offers improved performance and lower latency. and new features that facilitate integration into real-world applications.
OpenAI had already made o1 available to consumers through its ChatGPT Plus and Pro plans about two and a half weeks ago, and also added the models’ ability to analyze and respond to images and files uploaded by users.
Along with today’s launch, OpenAI announced major updates to its real-time API, along with price reductions and a new tuning method that gives developers greater control over their models.
The new o1 model, available as o1-2024-12-17, is designed to excel at complex, multi-step reasoning tasks. Compared with the previous version of o1, this version improves accuracy, efficiency and flexibility.
OpenAI reports significant gains across a variety of benchmarks, including coding, math, and visual reasoning tasks.
For example, coding results on the SWE-bench Verified rose from 41.3 to 48.9, while performance on the math-focused AIME test jumped from 42 to 79.2. These improvements make o1 well suited for creating tools that streamline customer service, optimize logistics, or solve challenging analytical problems.
Several new features improve the functionality of o1 for developers. Structured outputs allow responses to reliably match custom formats such as JSON schemas, ensuring consistency when interacting with external systems. Calling functions simplifies the process of connecting o1 to APIs and databases. And the ability to reason about visual input opens up use cases in manufacturing, science, and coding.
Developers can also tune the behavior of o1 using the new Reasoning_effort parameter, which controls how much time the model spends on a task to balance performance and response time.
OpenAI also announced updates to its real-time API, designed to power natural, low-latency conversation experiences, such as voice assistants, live translation tools or virtual tutors.
A new WebRTC integration simplifies building voice-based applications by providing direct support for streaming audio, noise suppression, and congestion control. Developers can now integrate real-time capabilities with minimal configuration, even under variable network conditions.
OpenAI is also introducing new pricing for its real-time API, reducing costs by 60% for GPT-4o audio to $40 for one million input tokens and $80 for one million output tokens.
Cached audio input costs are reduced by 87.5% and are now priced at $2.50 per million input tokens. To further improve affordability, OpenAI is adding GPT-4o mini, a smaller, more cost-effective model priced at $10 for one million input tokens and $20 for one million output tokens.
Text token fees for GPT-4o mini are also significantly lower, starting at $0.60 for input tokens and $2.40 for output tokens.
Beyond pricing, OpenAI gives developers more control over responses in the API in real time. Features like out-of-band simultaneous responses allow background tasks, such as content moderation, to run without interrupting the user experience. Developers can also customize input contexts to focus on specific parts of a conversation and control when voice responses are activated for more accurate and fluid interactions.
Another important addition is setting preferencesa method of customizing models based on user and developer preferences.
Unlike supervised fine-tuning, which relies on exact input and output pairs, preference fine-tuning uses pairwise comparisons to teach the model which responses are preferred. This approach is particularly effective for subjective tasks, such as summaries, creative writing, or scenarios where tone and style matter.
Early tests with partners like Rogo AI, which creates assistants for financial analysts, show promising results. Rogo reported that preference tuning helped their model handle complex out-of-distribution queries better than traditional tuning, improving task accuracy by more than 5%. The feature is now available for gpt-4o-2024-08-06 and gpt-4o-mini-2024-07-18, and there are plans to expand support to newer models early next year.
To streamline integration, OpenAI is expanding its official SDK offerings with beta versions for Go and Java. These SDKs join existing Python, Node.js, and .NET libraries, making it easier for developers to interact with OpenAI models in more programming environments. The Go SDK is particularly useful for building scalable backend systems, while the Java SDK is designed for enterprise-grade applications that rely on strong writing and robust ecosystems.
With these updates, OpenAI offers developers an expanded set of tools to create advanced and customizable AI-based applications. Whether through o1’s enhanced reasoning capabilities, real-time API enhancements, or tuning options, OpenAI’s latest offerings aim to deliver improved performance and cost-effectiveness for businesses that push the boundaries of integration. of AI.