Useful information
Prime News delivers timely, accurate news and insights on global events, politics, business, and technology
OpenAI has unveiled Operator, its first semi-autonomous AI agent, which is designed to “operate” a web browser on your behalf, much like a person would. The agent moves a point-and-click cursor, types on its own, surfs the web, and performs actions on various websites, such as making restaurant reservations through OpenTable and placing orders on Instacart and DoorDash. That’s instead of being limited to the ChatGPT interface or the OpenAI application programming interface (API).
“This product is the beginning of our move into agents,” CEO and co-founder Sam Altman said in a demo streamed live on the company’s YouTube channel today at 1 p.m. ET.
OpenAI president and co-founder Greg Brockman wrote: “2025 is the year of the agents.”
The preview, now available to US subscribers of OpenAI’s ChatGPT Pro plan ($200 per month), aims to demonstrate the potential of agentic AI while gathering critical feedback to refine its capabilities.
However, Operator does not take over your own web browser. Instead, users visit a new, standalone website, operator.chatgpt.com, and are presented with a ChatGPT-like prompt box.
When a user types a request in this box, such as “find me tickets to the Los Angeles Lakers game tonight,” Operator opens a separate virtual browser running in the cloud on OpenAI servers. The agent can then perform tasks such as filling out forms, managing online reservations (including booking tickets for sporting events and concerts), and navigating other common workflows. The user watches the cursor move on its own in the cloud-based browser in real time. If the agent encounters a problem, it stops and messages the user in text, similar to a ChatGPT response.
Additionally, below the virtual browser, the user sees suggestions for actions that Operator can perform on their behalf.
However, the user can take control at any time, similar to semi-autonomous driving systems in modern cars. Operator also prompts the user to enter their own payment credentials when it reaches a purchase screen on another website. Finally, users can save particular workflows that they want to reuse and run them again later.
Operator is powered by what OpenAI calls Computer-Using Agent (CUA) technology, a new variant of GPT-4o specifically trained to use computers.
Operator distinguishes itself from other automation tools by mimicking human interaction with graphical user interfaces (GUIs).
Instead of relying on specialized APIs, the system leverages screenshots for visual information and uses virtual mouse and keyboard actions to complete tasks.
The underlying CUA model combines the vision capabilities of GPT-4o with reinforcement learning, allowing the agent to perceive, reason, and act on the screen.
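The perceive, reason, and act cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenAI's implementation: `FakeBrowser`, `Action`, `run_agent`, and `scripted_policy` are hypothetical names, and the "policy" here is a fixed script standing in for a vision model.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "click", "type", or "done" (hypothetical action set)
    x: int = 0
    y: int = 0
    text: str = ""


class FakeBrowser:
    """Stand-in for a cloud-hosted virtual browser (hypothetical)."""

    def __init__(self):
        self.log = []  # records every virtual mouse/keyboard action

    def screenshot(self) -> bytes:
        # A real system would return pixels; here we return a frame label.
        return f"frame-{len(self.log)}".encode()

    def click(self, x, y):
        self.log.append(("click", x, y))

    def type_text(self, text):
        self.log.append(("type", text))


def run_agent(browser, policy, goal, max_steps=10):
    """Loop: take a screenshot, ask the policy for an action, apply it."""
    history = []
    for _ in range(max_steps):
        frame = browser.screenshot()            # perceive
        action = policy(goal, frame, history)   # reason
        if action.kind == "done":
            break
        if action.kind == "click":              # act
            browser.click(action.x, action.y)
        elif action.kind == "type":
            browser.type_text(action.text)
        else:
            raise ValueError(f"unknown action kind: {action.kind}")
        history.append(action)
    return history


def scripted_policy(goal, frame, history):
    """Placeholder for the model: click a search box, type the goal, stop."""
    script = [Action("click", 120, 80), Action("type", text=goal), Action("done")]
    return script[len(history)]


browser = FakeBrowser()
history = run_agent(browser, scripted_policy, "Lakers tickets tonight")
```

The point of the sketch is that the agent only ever sees screenshots and only ever emits mouse and keyboard actions, so no site-specific API is involved. The `max_steps` cap is an added assumption to keep the loop bounded.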
This approach allows Operator to handle a range of tasks, including e-commerce navigation, travel planning, and even repetitive chores like creating playlists or managing shopping lists. Notable benchmarks illustrate its effectiveness:
• 87% success rate on WebVoyager, a live web browsing test
• 58.1% success rate on WebArena, which simulates real-world e-commerce and content management scenarios
But there’s already stiff competition: Yesterday, Chinese tech company ByteDance (TikTok’s parent company) launched its own AI agent that controls web browsers and performs actions on a user’s behalf. Called UI-TARS, it’s fully open source and boasts similarly impressive benchmark performance (although the two don’t appear to have been directly compared on the same benchmarks). That means OpenAI’s Operator will have to be significantly better or more reliable to justify the relatively high cost ($200 per month) of accessing it via a ChatGPT Pro subscription.
OpenAI is partnering with several companies to ensure Operator meets real-world needs. Companies like Instacart, DoorDash, and Etsy are already testing the technology for use cases ranging from grocery delivery to personalized shopping.
Brett Keller, CEO of Priceline, highlighted its usefulness for travel planning, calling it “an important step in making travel more seamless and personalized.”
For public sector applications, the City of Stockton is exploring ways to use Operator to simplify civic participation. Jamil Niazi, the city’s chief information technology officer, highlighted the potential of AI to make it easier for residents to sign up for services.
However, there are limitations. The technology publication Every got an early preview, tested it over the past week, and found that:
“One of the quirks of Operator’s design is that it does not use your browser. Instead, it uses a browser in one of OpenAI’s data centers that you can watch and interact with remotely. The advantage of this design decision is that you can use Operator anywhere and anytime, for example, on any mobile device.
“The downside is that many sites like Reddit already block AI agents from browsing, so Operator cannot access them. In this research preview mode, OpenAI also blocks Operator’s access to certain resource-intensive sites like Figma or competitor-owned sites like YouTube for legal or performance reasons.”
Given its ability to act on behalf of users, Operator has been developed with strong security features:
• User control: Operator requests confirmation for sensitive actions, such as making purchases or sending emails.
• Watch mode: Ensures the user is monitoring critical tasks, particularly on sensitive sites such as email or financial platforms.
• Misuse prevention: The system can reject harmful requests and includes safeguards against adversarial attacks, such as malicious prompts embedded in websites.
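As a rough illustration of the user-control idea in the list above, the sketch below gates sensitive actions behind an explicit confirmation callback. It is an assumption-laden example, not Operator's actual mechanism: `SENSITIVE_KINDS`, `execute_with_confirmation`, and the action names are all hypothetical.

```python
from typing import Callable

# Hypothetical set of action types that require the user's sign-off.
SENSITIVE_KINDS = {"purchase", "send_email"}


def execute_with_confirmation(
    kind: str,
    perform: Callable[[], None],
    confirm: Callable[[str], bool],
) -> str:
    """Run `perform` only if the action is non-sensitive or the user confirms."""
    if kind in SENSITIVE_KINDS and not confirm(kind):
        return "cancelled"   # the user declined, so nothing is executed
    perform()
    return "done"


calls = []
declined = execute_with_confirmation(
    "purchase", lambda: calls.append("purchase"), confirm=lambda kind: False
)
scrolled = execute_with_confirmation(
    "scroll", lambda: calls.append("scroll"), confirm=lambda kind: False
)
approved = execute_with_confirmation(
    "send_email", lambda: calls.append("send_email"), confirm=lambda kind: True
)
```

In this sketch, routine actions such as scrolling never trigger a prompt, while a purchase or an email is only carried out after the callback returns `True`.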
OpenAI has also built in features to protect user privacy, including options to clear browsing data and opt out of data sharing to improve the model.
OpenAI envisions a broader role for Operator in both individual and enterprise environments. Over time, the company plans to expand access to Plus, Team, and Enterprise users, and eventually integrate Operator into ChatGPT.
There are also plans to make the underlying CUA technology available through an API, allowing developers to create custom agents that use computers.
Despite its potential, Operator remains a work in progress. OpenAI has been transparent about its limitations, such as difficulties with complex interfaces or unfamiliar workflows. Early user feedback will play a critical role in improving system accuracy, reliability, and security.
As OpenAI refines Operator through real-world use, it seeks to transform AI from a passive tool to an active participant in the digital ecosystem. Whether simplifying everyday tasks or innovating business workflows, OpenAI is positioning Operator as the next step in making AI accessible, practical and secure.