AI Safety | May 15, 2025
Anthropic Publishes Updated Model Spec: New Guidelines for AI Behavior
Anthropic has released a comprehensive update to the Claude Model Spec, detailing new guidelines for handling sensitive topics, improved calibration of confidence expressions, and enhanced corrigibility principles.