AI Agent News
Real-time tracking of major events, funding moves, model releases, and technical breakthroughs in the AI Agent space
Latest Industry News
Major Events Timeline
OpenClaw Takes Off on GitHub
OpenClaw reached GitHub's global Top 10 in 10 days, gaining stars faster than the Linux kernel
Meta Acquires Manus for $2 Billion
Meta acquires Manus AI for $2 billion, putting the general-purpose agent space firmly in big tech's sights
DeepSeek-V3 Open-Sourced
The price-performance leader, at just 5% of the cost of GPT-4
Manus Goes Viral Overnight
The world's first general-purpose AI agent draws unprecedented attention on Chinese social media platforms
OpenAI Deep Research
OpenAI launches its Deep Research agent, which generates professional research reports in one click
MCP Servers Pass 500
The MCP ecosystem booms, with 500+ servers built in 3 months
DeepSeek-R1 Stuns the World
An open-source reasoning model at just 3% of OpenAI's cost, shaking up the global AI landscape
The MCP Protocol Is Born
Anthropic releases the Model Context Protocol, which becomes the de facto standard for agent interfaces
Claude Computer Use
Anthropic lets AI directly operate a computer screen for the first time, pioneering a new paradigm of computer use
Replit Agent: Full-Stack Automation
From natural language to a shipped product, aimed at non-engineers
Cursor ARR Passes $100 Million
The fastest-growing SaaS product in history and the new leader in AI coding tools
Claude 3.5 Tops SWE-bench
The strongest coding AI, with bug-fixing ability at the level of a junior engineer
Devin Released
The world's first autonomous AI software engineer, able to complete entire programming tasks independently
Anthropic Claude 4 Sonnet: Extended Context, Computer Use & Major Performance Leap
Anthropic has released Claude 4 Sonnet, offering a 500K token context window, significantly improved computer use capabilities (navigating browsers and GUIs autonomously), and 15% better performance across coding, analysis, and reasoning benchmarks. Claude 4 introduces a Projects feature for persistent memory across conversations, improved artifact generation (code, documents, data visualizations), and a new "thinking" mode that shows the model's reasoning process. It is available immediately via the API and Claude.ai, and claude-sonnet-4 becomes the default model in Anthropic's consumer app.
Google Veo 2 Reaches General Availability: State-of-the-Art AI Video Generation for Businesses
Google has made Veo 2, its second-generation AI video generation model, generally available through Vertex AI and the Gemini API. Veo 2 generates 1080p video up to 2 minutes long from text or image prompts, with improved physics simulation, character consistency across frames, and cinematic quality. Google has added SynthID watermarking and content safety filtering for all generated content. Early enterprise customers in the advertising, media, and entertainment industries report a 70% reduction in concept video production costs.
OpenAI GPT-5 Release: Reasoning Improvements and Native Multimodality
OpenAI has officially announced GPT-5, featuring significant improvements in reasoning, multimodal understanding, and agentic capabilities. GPT-5 natively processes text, images, audio, video, and code in a single model. The model achieves 72% on the ARC-AGI benchmark (vs. 49% for GPT-4o), 88% on GPQA (doctorate-level science), and 90%+ on most professional certification exams. New features include extended thinking mode for complex reasoning, native tool use, improved instruction following, and significantly reduced hallucination rates on factual queries.
Google Gemini 2.0 Flash Thinking: Fast Reasoning Model Challenges o1-mini
Google has released Gemini 2.0 Flash Thinking Experimental, a fast reasoning model that shows its chain-of-thought process before answering. Flash Thinking achieves scores competitive with OpenAI o1-mini on math and science benchmarks while being significantly faster and cheaper. The model excels at complex mathematical reasoning, coding problems, and multi-step scientific analysis. Google offers Flash Thinking through the Gemini API with a 1M token context window and the ability to process images, video, and audio alongside text. Latency is approximately 2-3x lower than OpenAI o1 for comparable tasks.