OpenAI's ChatGPT Agent Can Complete Complex Tasks

OpenAI has introduced an artificial intelligence (AI) agent in ChatGPT that can help a user run code, navigate a personal calendar, and generate presentations and slideshows. It may be the company’s boldest attempt yet to turn ChatGPT into an agentic product.

ChatGPT agent, announced Thursday, combines features from OpenAI’s previous agentic tools, including Operator (clicking around websites) and Deep Research (synthesizing information from dozens of websites into a research report). The new general-purpose agent also has access to a terminal and uses APIs to access certain apps.

ChatGPT agent is capable of pulling off difficult tasks such as ordering clothes for a wedding while taking into account the dress code and weather. The chatbot’s virtual computer comes with a number of tools that can interact with the web. It also allows the user to connect apps such as Gmail and GitHub so ChatGPT can find information relevant to a prompt.

Subscribers to ChatGPT’s Pro, Plus, and Team tiers can activate the chatbot’s agentic capabilities by selecting “agent mode” in ChatGPT’s dropdown tool menu.

The ability of OpenAI’s latest agent to complete complex tasks is designed to give it an edge in an AI agent race teeming with competition. In recent weeks, OpenAI, Alphabet Inc.’s Google, Perplexity AI, and others have saturated the market with dozens of AI agents, many of which have struggled to perform complex tasks.

OpenAI claims ChatGPT agent’s model boasts state-of-the-art performance on several benchmarks, scoring 41.6% on Humanity’s Last Exam (pass@1), a test composed of thousands of questions across more than 100 subjects. The score is roughly twice what OpenAI’s o3 and o4-mini scored on the test.

New safeguards for ChatGPT agent include a monitor that works in real time as users interact with the product, according to OpenAI.
