AI Safety

Anthropic Publishes Updated Model Spec: New Guidelines for AI Behavior

Anthropic has released a comprehensive update to the Claude Model Spec, detailing new guidelines for handling sensitive topics, improved calibration of confidence expressions, and enhanced corrigibility principles.

May 15, 2025 | Source: Anthropic
Anthropic, AI-safety, model-spec, alignment, Claude
