AI Safety | May 15, 2025

Anthropic Publishes Updated Model Spec: New Guidelines for AI Behavior

Anthropic has released a comprehensive update to the Claude Model Spec, detailing new guidelines for handling sensitive topics, improved calibration of confidence expressions, and enhanced corrigibility principles.

Also available in 中文.

