Anthropic Releases Claude Sonnet 4.5

Oct 09, 2025 | 3 min read

On September 30, 2025, the artificial intelligence company Anthropic launched its new-generation model, Claude Sonnet 4.5. The company described it as “the world’s best coding model, the most powerful tool for building complex agents, and the best model for using computers,” positioning it as a step toward end-to-end autonomy in AI-assisted development. The model is generally available, with the API model identifier “claude-sonnet-4-5”. Pricing is unchanged from the previous generation, at $3 per million input tokens and $15 per million output tokens, so the performance gains come at no added cost.
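To make the quoted pricing concrete, here is a minimal sketch of a per-request cost estimate. It uses only the two rates and the model identifier given above; the function name and the example token counts are illustrative assumptions, and real bills may differ (for example through caching or batch discounts, which this sketch ignores).

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00
MODEL_ID = "claude-sonnet-4-5"  # API model identifier quoted in the article


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative numbers only: a 200,000-token prompt with a 20,000-token reply.
    cost = estimate_cost_usd(200_000, 20_000)
    print(f"{MODEL_ID}: ${cost:.2f}")
```

Because output tokens cost five times as much as input tokens, long generated answers dominate the bill much faster than long prompts do.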

Core Capabilities Achieve Three Breakthroughs

In coding, Claude Sonnet 4.5 posts a substantial gain. On SWE-bench Verified, which measures real-world software engineering ability, the model achieved an industry-leading score of 77.2%. This figure is the average of 10 runs over the 500-problem set, using a scaffold with bash and file-editing tools and a 200K thinking budget. More notable is its handling of long-running tasks: in testing it stayed focused on complex multi-step work for more than 30 hours, independently generated 11,000 lines of code, and built production-grade products such as an enterprise chat application end to end, from database configuration to compliance auditing. Replit reported that on its internal code-editing benchmark the error rate fell from 9% with the previous generation to 0%, alongside a higher tool success rate.

The model’s computer-use capabilities have also improved sharply. On the OSWorld benchmark, Claude Sonnet 4.5 scored 61.4%, ahead of the 42.2% that Sonnet 4 posted four months earlier by nearly 20 percentage points. Through the Claude for Chrome extension, the model can complete real-world tasks such as navigating websites and filling in spreadsheets directly in the browser, with fluency approaching that of a human operator. In addition, its reasoning in professional fields such as finance, law, and medicine is markedly better than that of older models, including Opus 4.1, and it performs strongly on assessments such as the MMMLU multilingual test and the AIME mathematics evaluation. These results illustrate how quickly AI is advancing in professional applications.
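The OSWorld comparison above is stated in percentage points, which is easy to confuse with a relative improvement. The short sketch below computes both from the two scores quoted in the article; the function name is a hypothetical helper, not part of any benchmark tooling.

```python
def score_gain(old_pct: float, new_pct: float) -> tuple[float, float]:
    """Return (gain in percentage points, relative gain in percent)."""
    if old_pct <= 0:
        raise ValueError("old score must be positive to compute a relative gain")
    points = new_pct - old_pct
    relative = points / old_pct * 100
    return points, relative


if __name__ == "__main__":
    # OSWorld scores quoted in the article: Sonnet 4 at 42.2%, Sonnet 4.5 at 61.4%.
    points, relative = score_gain(42.2, 61.4)
    print(f"+{points:.1f} percentage points, {relative:.1f}% relative improvement")
```

With the article's figures, the gap is 19.2 percentage points, which corresponds to a relative improvement of roughly 45% over the earlier score.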

Ecosystem Upgrade: Benefits for Both Developers and Enterprises

Alongside the model release, Anthropic upgraded its product suite and opened up more of its development tooling. Claude Code has been upgraded to version 2.0, adding a VS Code extension and a checkpoint feature: users can quickly undo changes with the Esc+Esc shortcut or the /rewind command. The Claude API has gained context editing and a memory tool, while the Claude apps now integrate code execution and file creation directly, supporting output formats such as slides and documents.

Of greater significance for the industry is the first public release of the Claude Agent SDK. This is the core infrastructure on which Claude Code is built, and it addresses key problems such as agent memory management, permission systems that balance autonomy with user control, and coordination among multiple sub-agents. It makes Anthropic’s own agent-building capabilities available to developers worldwide for building customized agent applications. As of the release date, 13 leading companies had reported results with the new model: GitHub Copilot stated that it improves the handling of complex tasks that span a codebase, while HackerOne reported that its Hai security agents cut vulnerability processing time by 44% and improved accuracy by 25% with this model.

In terms of safety, Anthropic describes Claude Sonnet 4.5 as the most aligned frontier model it has released. It is deployed under AI Safety Level 3 (ASL-3) protections and is equipped with classifiers for hazardous CBRN (chemical, biological, radiological, and nuclear) content, whose false positive rate has been reduced by 90% since the classifiers were first introduced. Its defenses against prompt injection attacks have been significantly strengthened, and undesirable behaviors such as sycophancy and deception have been reduced, balancing capability gains with risk management.

Concurrent with this release, Anthropic also launched a limited-time research preview of “Imagine with Claude”, which lets Max subscribers have Claude generate software dynamically in real time. Although it is still in the testing phase, this mode of creation, in which nothing is written in advance, points to the future potential of AI-driven development.