DeepSeek V4 Official Release Scheduled for Mid-July, Introduces Peak-Valley Pricing and Launches Inference Acceleration Framework DSpark
DeepSeek recently notified developers by email that the official version of V4 is planned for release in mid-July 2025 and that its API will move to peak-valley pricing. During peak hours (9:00-12:00 and 14:00-18:00 Beijing time, daily), input and output token prices will double, while off-peak hours keep the current low prices. This is the first API price increase for DeepSeek, a company previously known for its low-cost strategy.
Meanwhile, DeepSeek, in collaboration with Peking University, has open-sourced the speculative decoding framework DSpark and the accompanying training library DeepSpec. DSpark has been deployed in the Flash and Pro versions of the V4 preview, boosting per-user generation speed by 60%-85% (Flash) and 57%-78% (Pro) while maintaining the same throughput. DSpark employs two innovations: semi-autoregressive generation and confidence scheduling, addressing the tail decay of parallel draft models and the computational waste caused by fixed verification length.
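For readers unfamiliar with speculative decoding, the general idea can be sketched in a few lines. This is a toy illustration, not DSpark's actual algorithm or API: the function names, the `max_len` and `min_conf` parameters, and the rule of stopping a draft once its running confidence product drops below a threshold are all assumptions made for this example, and semi-autoregressive drafting is not modeled at all.

```python
from typing import Callable, List, Tuple

Token = int
# Hypothetical interfaces: a cheap draft model returns (next_token, confidence),
# and the large target model returns its greedy next token.
DraftFn = Callable[[List[Token]], Tuple[Token, float]]
TargetFn = Callable[[List[Token]], Token]


def draft_with_confidence(prefix: List[Token], draft: DraftFn,
                          max_len: int, min_conf: float) -> List[Token]:
    """Propose tokens until the running confidence product falls below min_conf.

    This stands in for a variable (rather than fixed) verification length.
    """
    proposed: List[Token] = []
    running = 1.0
    while len(proposed) < max_len:
        token, conf = draft(prefix + proposed)
        running *= conf
        if running < min_conf:
            break  # further draft tokens are unlikely to be accepted
        proposed.append(token)
    return proposed


def verify(prefix: List[Token], proposed: List[Token],
           target: TargetFn) -> List[Token]:
    """Accept the longest draft prefix the target agrees with, plus one target token.

    Real systems score all draft positions in one batched target pass;
    the loop here only simulates that result.
    """
    accepted: List[Token] = []
    for token in proposed:
        expected = target(prefix + accepted)
        if token != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(token)
    accepted.append(target(prefix + accepted))  # extra token when all drafts match
    return accepted
```

Because every accepted token is either confirmed or produced by the target model, the output matches what the target would generate greedily on its own; the speedup comes from accepting several tokens per target pass.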
Since the V4 preview release in April, user feedback has mainly focused on issues such as high hallucination rates, insufficient stability for ultra-long contexts, and subpar performance on complex code tasks. The official version will optimize these areas and benefit from the inference speed improvements brought by DSpark.
Once peak-valley pricing takes effect, the peak-hour output price for V4-Pro will rise to 12 yuan per million tokens (original price 24 yuan, previously discounted to 6 yuan), and for V4-Flash to 4 yuan per million tokens (original price 8 yuan, previously discounted to 2 yuan). DeepSeek stated that the adjustment is intended to ease computing resource constraints by shifting load away from peak hours, while also providing financial support for its self-built data center in Ulanqab.
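The arithmetic behind these figures can be checked with a short sketch. The output prices, the 2x peak multiplier, and the Beijing-time windows come from the article; the function names are hypothetical, and treating each window as start-inclusive and end-exclusive is an assumption, since the announcement as reported does not say how the boundaries are handled.

```python
from datetime import datetime, time, timedelta, timezone

# Beijing time is UTC+8 year-round (no daylight saving).
BEIJING = timezone(timedelta(hours=8))

# Peak windows as reported; boundary handling is an assumption.
PEAK_WINDOWS = [(time(9, 0), time(12, 0)), (time(14, 0), time(18, 0))]

# Off-peak (current discounted) output prices, yuan per million tokens.
OFF_PEAK_OUTPUT_YUAN_PER_M = {"V4-Pro": 6.0, "V4-Flash": 2.0}
PEAK_MULTIPLIER = 2.0


def is_peak(ts: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls in a peak window."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    local = ts.astimezone(BEIJING).time()
    return any(start <= local < end for start, end in PEAK_WINDOWS)


def output_cost_yuan(model: str, output_tokens: int, ts: datetime) -> float:
    """Estimate the output-token cost in yuan for one request."""
    rate = OFF_PEAK_OUTPUT_YUAN_PER_M[model]
    if is_peak(ts):
        rate *= PEAK_MULTIPLIER
    return rate * output_tokens / 1_000_000
```

Under these assumptions, one million V4-Pro output tokens requested at 10:00 Beijing time cost 12 yuan, while the same request at 13:00 costs 6 yuan.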