AI Coding Agents Hit 50%+ on SWE-Bench: Autonomous Bug Fixing Arrives

AI coding systems are reaching the 50% mark on SWE-Bench Verified, a benchmark for autonomous software engineering. Devin (Cognition AI) scores 53.8%, while Claude with Computer Use (49%) and OpenAI's internal system (48.9%) sit just below the threshold. SWE-Bench tests whether a system can resolve real GitHub issues on its own: reading the code, understanding the context, implementing a fix, and passing the tests. Industry analysts estimate that these systems can now handle 30-40% of straightforward bug fixes autonomously in real production codebases. Several companies report a 25-35% reduction in the developer time spent on bug fixes after deploying AI coding agents.

May 2, 2025 | Source: SWE-Bench
AI coding, SWE-Bench, coding agents, autonomous coding, developer tools
