ModelsJul 1, 2026
GPT-5.6 Grayscale Testing: Users Discover Hidden "Juice Value" to Detect Model Upgrade
OpenAI released the GPT-5.6 series on June 26, 2025, including flagship Sol, mid-range Terra, and low-cost Luna, initially for invited partners only. Within 48 hours, users found a method to detect grayscale upgrades via Codex by sending specific prompts to reveal a hidden "Juice value" in the model's system prompt: GPT-5.5 in xhigh mode returns 768, GPT-5.6 Sol returns 128. Some users' usage panels already show gpt-5.6 call records. OpenAI officially states ChatGPT is unavailable during preview, but grayscale testing has covered some Plus users.
Model Specifications and Pricing
- Sol (Flagship): $5/M input tokens, $30/M output tokens, 1.5M context tokens (43% increase over GPT-5.5).
- Terra (Mid-range): Half the price, performance close to GPT-5.5.
- Luna (Low-cost): $1/M input tokens, $6/M output tokens.
- Introduced explicit cache breakpoints with a minimum 30-minute lifetime; cache writes billed at 1.25x, reads enjoy 90% discount.
- New inference-side max reasoning effort and ultra mode (via sub-agent collaboration).
Performance
- Terminal-Bench 2.1: Sol Ultra scores 91.9%, surpassing GPT-5.5 (88.0%), Claude Mythos 5 (84.3%), Claude Fable 5 (83.4%), and Gemini 3.1 Pro Preview (70.7%).
- ExploitBench: Sol achieves comparable performance to Claude Mythos Preview using about one-third the output tokens.
- Cybersecurity: Sol scores 96.7% in internal OpenAI tests, crossing the "High" risk threshold, but is emphasized to be better at finding and fixing vulnerabilities than launching attacks.
- GeneBench v1: Token efficiency superior to GPT-5.5 in long-range genomic analysis.
Safety and Access Restrictions
- Safety stack includes model-level refusal, real-time classifiers, cross-session review, and risk-level authorization.
- Red teaming invested over 700,000 A100-equivalent GPU hours, supplemented by third-party human testing.
- Communication with the U.S. government prior to release; currently limited to government-approved partners.
- OpenAI plans full rollout "in the coming weeks," with the community speculating a larger release as early as June 30.
Grayscale Detection Method
- Juice Value Test: In Codex, select gpt-5.5 with thinking intensity xhigh, send a specific XML prompt; if the answer is 128, it's GPT-5.6 Sol; if 768, it's GPT-5.5.
- Context Window Detection: Run /status in Codex CLI; if default context shows 353k, it may have been grayscale upgraded.
- Usage Panel: Visit the analytics page to check for gpt-5.6 call records (updated the next day).
- Note: Grayscale coverage is uneven, limited to Codex; web ChatGPT is not yet supported.
Also available in 中文.