A new AI model from OpenAI, code-named “Strawberry,” that can perform human-like reasoning tasks could give the company a temporary edge in a crowded field.
The o1 model series is designed to spend more time computing the answer before responding to user queries, though it can’t browse the web or analyze files yet. OpenAI, which is expected to announce a $6.5 billion round of funding any day, said the o1 series works best in math and coding.
“These enhanced reasoning capabilities may be particularly useful if you’re tackling complex problems in science, coding, math and similar fields,” OpenAI said in a blog post Thursday. “For example, o1 can be used by health care researchers to annotate cell sequencing data, by physicists to generate complicated mathematical formulas needed for quantum optics, and by developers in all fields to build and execute multi-step workflows.”
The announcement — amid a week of AI-related news from Apple Inc., Oracle Corp., Salesforce Inc. and ServiceNow Inc. — further illustrates how OpenAI is building its AI tools for consumers beyond ChatGPT and how it intends to differentiate itself in the market from rivals Anthropic, Meta Platforms Inc., Alphabet Inc.’s Google, Elon Musk’s xAI and others.
Each company, in its own way, is attempting to develop AI products and services that can reason through complex tasks and emulate human-like thought processes.
The o1 series could “solidify [OpenAI’s] dominance in the AI landscape,” Paul Nashawaty, practice leader for application development at Futurum Group, said in an email. “By offering more advanced and capable models, OpenAI can attract more developers and enterprises, strengthening its market position.
“The o1 series also potentially alters the competitive landscape by continued investment in the AI tech stack,” Nashawaty added. “Other AI companies will need to invest heavily in research and development to keep pace with OpenAI’s advancements. This could lead to increased innovation and competition in the AI industry.”
Futurum research from nine months ago showed 18% of organizations were using AI in production applications; today, that number is up to 54%, he said.
In its blog post, OpenAI said it “trained these models to spend more time thinking through problems before they respond, much like a person would. Through training, they learn to refine their thinking process, try different strategies, and recognize their mistakes.”
In addition to its main o1 model, OpenAI said it is introducing a “faster, cheaper” version called o1-mini that it claims is “particularly effective” at coding. [To be clear, o1 is expensive. In the API, o1-preview is $15 per 1 million input tokens and $60 per 1 million output tokens. That is triple the cost of GPT-4o for input and four times the cost for output.]
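To put those per-token prices in concrete terms, here is a rough sketch of what a single API request might cost under each model. The o1-preview rates are the ones quoted above; the GPT-4o rates are not stated directly and are inferred here from the “triple” and “four times” ratios. The `request_cost` helper and the token counts are hypothetical, chosen only for illustration.

```python
# Prices in USD per 1 million tokens.
# o1-preview figures come from the article; GPT-4o figures are an
# assumption derived from the stated 3x (input) and 4x (output) ratios.
PRICES = {
    "o1-preview": {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input": 15.00 / 3, "output": 60.00 / 4},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 2,000 input tokens and 1,000 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 2_000, 1_000):.3f}")
```

Under these assumptions, the same illustrative request costs about 9 cents on o1-preview versus 2.5 cents on GPT-4o, a gap that widens as responses get longer because the output rate differs most.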
“In a qualifying exam for the International Mathematics Olympiad (IMO), GPT-4o correctly solved only 13% of problems, while the reasoning model scored 83%. Their coding abilities were evaluated in contests and reached the 89th percentile in Codeforces competitions,” OpenAI said in the post.
The new model also excels at fending off jailbreak attacks designed to make AI systems violate safeguards around security and responsible use. OpenAI says it recently formalized agreements with the U.S. and U.K. AI Safety Institutes.
“There is always excitement about AI advancements, but also concerns” around safety and overall impact, Lee Klarich, chief product officer of Palo Alto Networks Inc., said in an interview.
AI expert Varun Glover offered this instant review in an email: “I tried OpenAI o1-preview today and was genuinely impressed by its thinking feature. It takes time to structure responses, which leads to more accurate and helpful answers — a practical step forward for generative AI handling complex tasks. This advancement could significantly impact the market by raising the bar for AI capabilities and pushing others to innovate.”