Industry · Jun 25, 2026

OpenAI and Broadcom Unveil Custom Inference Chip Jalapeño, Achieving Record 9-Month Tape-Out

On June 24, 2026, OpenAI and Broadcom officially launched Jalapeño, their first custom accelerator chip for large language model inference. The chip went from design to tape-out in a record nine months. Engineering samples have already run cutting-edge models such as GPT-5.3-Codex-Spark, and early tests show significantly better performance per watt than current mainstream AI accelerators.

Chip Design and Collaboration

OpenAI designed Jalapeño's architecture, Broadcom handled chip implementation and networking, and Celestica handled board and system integration. The architecture is optimized for large-model inference: it raises real-world utilization by reducing data movement and balancing compute and memory resources. Broadcom's Tomahawk switching chips provide the networking foundation for large-scale deployment. OpenAI's hardware team, led by Richard Ho, a former senior engineering director on Google's TPU program, used proprietary AI models to assist with chip design, which accelerated development.

Performance and Deployment

OpenAI has not disclosed specific benchmark figures, but it stated that early tests show performance per watt significantly exceeding the current state of the art. Initial deployment is scheduled for the end of 2026 in the data centers of partners such as Microsoft, targeting gigawatt-scale superclusters. Jalapeño is the first product in a multi-generational AI computing platform, with further iterations planned.

Impact on NVIDIA

Jalapeño's launch may affect NVIDIA in several ways:

  • Short-term: OpenAI's own procurement of GPUs for inference is expected to fall sharply, with inference costs expected to drop by approximately 50%. High-speed networking shifts to Broadcom Ethernet solutions, replacing NVIDIA NVLink.
  • Medium to long-term: The ASIC share of the inference market continues to grow. Counterpoint predicts that custom ASICs will account for 27.8% of inference servers by 2026. Companies such as Meta and Anthropic may follow with custom chips of their own, forming an ASIC camp.
  • Ecosystem: Inference workloads depend less on CUDA, and ASICs can be paired with lightweight software stacks. OpenAI becomes self-sufficient across the full stack, from model to chip to data center, which weakens NVIDIA's position as an intermediary.

Industry Background and Trends

OpenAI joins Google, Amazon, Microsoft, and Meta as the latest tech giant to develop custom AI chips. It was preceded by Google's TPU (2016), Amazon's Inferentia (2018) and Trainium (2022), and Microsoft's Azure Maia (2023). OpenAI President Greg Brockman stated that in an era driven by computing power, owning hardware is a core strategy. Jalapeño's rapid tape-out validates the efficiency of the "AI giant + chip foundry" custom ASIC model and could accelerate the industry's move toward in-house chip development.
