Anthropic on Monday unveiled Claude Sonnet 4.5, a new artificial intelligence (AI) model the Amazon.com Inc.-backed startup says delivers major improvements in coding, computer operation, and business applications, particularly in cybersecurity, finance and research.

The model, available to all users, represents a major leap in autonomous capabilities and software development, according to the company. It can work independently for up to 30 hours with minimal supervision, enough time to build a complete software application. Opus 4, which launched just four months earlier, managed only seven hours of autonomous operation.

In addition to raw endurance, Claude Sonnet 4.5 reportedly surpasses Opus across critical performance measures and delivers stronger results on real-world business applications. The model excels particularly in coding, achieving what Anthropic calls state-of-the-art performance on SWE-Bench Verified, a benchmark for evaluating software development abilities.

In financial services testing, Anthropic’s latest model showed superior performance in research, financial modeling, and forecasting compared to earlier Claude versions.

Anthropic hopes to widen its competitive advantage in coding assistance and autonomous task execution. Recent performance data and usage patterns suggest the company is increasingly targeting enterprise and workplace markets.

Anthropic’s Claude Opus 4.1 model outperformed rivals on GDPval, OpenAI’s newly introduced benchmark measuring professional task completion. The evaluation assessed how AI models performed against human professionals across multiple industries and job functions.

Last week, OpenAI acknowledged its GPT-5 model and Anthropic’s Claude Opus 4.1 were “already approaching the quality of work produced by industry experts.”

Box Inc., which received early access to the model, found it delivered an 81% increase in accuracy, allowing teams across professional services, hospitality, energy, retail, and the public sector to onboard faster, cut errors, and automate extraction-heavy work.

Recent usage studies reinforce Claude’s reputation as a professionally focused tool, particularly when contrasted with ChatGPT’s growing consumer orientation. Research shows Claude users primarily engage with the model for workplace productivity, with coding and mathematical tasks representing 36% of global usage on Claude.ai.

The data reveals an even stronger enterprise pattern through API use, where approximately 77% of prompts involve task execution rather than advisory functions. Coding dominates this channel at 44% of API activity, with another 5% dedicated to AI system development and evaluation.

This pattern signals a fundamental shift in corporate AI adoption — from tools that support decision-making to systems that actually execute work. As models like Claude gain more autonomous capabilities in fields such as software engineering, the potential business impact grows considerably.
