OpenAI released ChatGPT agent on July 17, 2025, enabling ChatGPT to "do work for you using its own computer, handling complex tasks from start to finish" through autonomous web navigation, code execution, and document creation.
The agent integrates capabilities from previous OpenAI tools, bringing together "Operator's ability to interact with websites, deep research's skill in synthesising information, and ChatGPT's intelligence and conversational fluency." Users can request complex workflows like analysing competitors and creating slide decks, with ChatGPT autonomously navigating websites, filtering results, running code, and delivering editable presentations and spreadsheets.
ChatGPT agent operates through "its own virtual computer, fluidly shifting between reasoning and action to handle complex workflows from start to finish." The system includes "a visual browser that interacts with the web through a graphical-user interface, a text-based browser for simpler reasoning-based web queries, a terminal, and direct API access."
OpenAI reports state-of-the-art results on several benchmarks, including a score of "41.6" on Humanity's Last Exam and "27.4% accuracy" on FrontierMath. On an internal knowledge-work benchmark, "ChatGPT agent's output is comparable to or better than that of humans in roughly half the cases."
The company treats the agent as having "High Biological and Chemical capabilities" under its Preparedness Framework and has activated the safeguards associated with that classification. Separately, the agent requires explicit user confirmation before consequential actions and active user supervision for critical tasks like sending emails.
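The confirmation requirement can be pictured as a simple gate in front of each action. The sketch below is illustrative only and does not reflect OpenAI's implementation: the `Action` type, the `CONSEQUENTIAL` set, and the `run_action` and `confirm` names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical set of action names treated as consequential (an assumption
# for illustration, not OpenAI's actual categories).
CONSEQUENTIAL = {"send_email", "make_purchase", "submit_form"}


@dataclass
class Action:
    name: str    # what the agent wants to do
    detail: str  # human-readable description shown to the user


def run_action(action: Action, confirm: Callable[[Action], bool]) -> str:
    """Run an action, asking the user first if it is consequential."""
    if action.name in CONSEQUENTIAL and not confirm(action):
        # The user declined, so the action is never executed.
        return f"skipped {action.name}: user declined"
    # Non-consequential actions, or confirmed ones, proceed.
    return f"executed {action.name}"
```

Here `confirm` stands in for whatever prompt the interface shows the user. Passing it as a callback keeps the gate independent of any particular UI.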
ChatGPT agent can automate professional tasks including financial modelling, presentation creation, meeting coordination, and data analysis. Pro users receive 400 messages per month and other paid tiers receive 40, with additional usage available through credit-based options.
OpenAI's agent release represents a shift toward autonomous AI task execution in enterprise environments, combining web interaction, code execution, and document generation. Its benchmark results position it for knowledge-work automation, while its safety framework is aimed at prompt injection risks and keeps users in the loop for consequential actions.