OpenAI, Broadcom roll out Jalapeno AI chip for LLM inference, target gigawatt-scale data centres from 2026
New Delhi, June 24
OpenAI and Broadcom have unveiled Jalapeno, OpenAI's first custom Intelligence Processor designed from scratch for large language model inference. The chip was delivered to OpenAI's leadership after a nine-month design-to-tape-out cycle and will be deployed at gigawatt scale with data centre partners across multiple generations, starting by the end of 2026.
According to a press release by Broadcom, early lab tests show Jalapeno running ML workloads at production target frequency and power, with performance per watt substantially better than the current state-of-the-art.
Broadcom said Jalapeno marks the start of a multi-generation compute platform built with OpenAI. "Our collaboration with OpenAI represents a fundamental commitment to scaling the physical infrastructure required for the next decade of AI," said Hock Tan, President and CEO, Broadcom.
"This is just the beginning of a multi-generation roadmap. By co-developing our industry-leading silicon directly with OpenAI, we are enabling the deployment of gigawatt-scale data centres with Microsoft and other partners beginning in 2026," Broadcom said.
The chip was co-developed from initial design to manufacturing tape-out in just nine months, with Broadcom contributing silicon implementation expertise, board and rack integration, high-performance networking and scalable production systems.
"Jalapeno is a blank-slate design for modern LLM inference, not a general-purpose accelerator adapted from earlier AI workloads," Broadcom said.
The architecture reduces data movement and balances compute, memory and networking resources to achieve utilisation much closer to theoretical peak performance. Broadcom's silicon implementation and networking technologies, including Tomahawk networking silicon, help bring the platform to large-scale production. Engineering samples are already running ML workloads, including GPT-5.3-Codex-Spark, in the lab at production target frequency and power.
Broadcom highlighted that OpenAI designed the chip around its understanding of LLM fundamentals, kernels, serving systems and product needs, while Broadcom and Celestica industrialised the platform. The chip is "designed to be the best inference platform for LLMs," Broadcom said.
Jalapeno combines the power and throughput of today's leading AI accelerators with latency closer to the fastest specialised inference systems. That makes it suited for interactive LLM products at scale across ChatGPT, Codex, the API and future agentic products.
While OpenAI is still measuring final performance, Broadcom noted that early testing shows Jalapeno will deliver performance per watt substantially better than the current state-of-the-art, with a detailed technical report on performance to be presented in the coming months.
The companies said the custom ASIC program reflects one of the fastest development cycles ever in advanced semiconductors. Broadcom said the speed reflects deep software-hardware co-development with OpenAI's engineering teams and the use of OpenAI models to accelerate parts of the design and optimisation process.
Jalapeno is the first step in a platform that combines OpenAI-designed accelerators with Broadcom silicon and connectivity technologies and Celestica's board, rack and system expertise for initial deployment by the end of 2026.
— ANI
Reader Comments
Impressive to see OpenAI and Broadcom move so fast, but I wonder about the geopolitical implications. Will India get access to these chips at gigawatt scale or will it be another technology restricted to select countries? We should be developing our own LLM inference chips, not just relying on imports.
Finally, a chip designed specifically for LLM inference! 🚀 The nine-month development cycle is mind-blowing—our Indian IT companies should take notes on execution speed. With GPT-5.3-Codex-Spark already running on it, this could democratize AI access if costs come down. Waiting for that detailed technical report!
All this sounds great on paper, but let's be real—gigawatt-scale data centres need massive power infrastructure. India is still struggling with 24/7 reliable electricity in many areas. We need parallel investment in renewable energy and grid modernisation if we want to host these next-gen chips meaningfully.
Interesting move by OpenAI to own their silicon. India's AI ecosystem should watch closely—if latency improves dramatically for interactive LLM products, it opens doors for local language models serving rural India via mobile. Let's hope these chips don't just serve rich Western markets but enable global AI inclusion.