
// COOKEDBAKEDWORKED.COM

Models · Thu, Jun 25, 2026 · 5 min read

OpenAI and Broadcom reveal a custom inference chip — and why it matters for builders

OpenAI and Broadcom just announced a custom LLM inference chip called Jalapeño. Less dependence on Nvidia could mean cheaper, faster API calls for everyone building on OpenAI.

OpenAI and Broadcom announced a custom LLM inference chip — codenamed Jalapeño — on June 24. This is OpenAI's first in-house silicon, and it is designed specifically to run large language models faster and cheaper. If it works as described, the downstream effect for builders is lower API costs and higher throughput. That is worth paying attention to.

New hardware

The Jalapeño chip is built by Broadcom and optimized for inference, meaning running models, not training them. OpenAI has been paying Nvidia for GPU time at enormous scale. Building its own inference silicon is a direct move to cut that dependency and reduce per-token costs. The announcement was published on OpenAI's blog and covered by TechCrunch, and it scored 641 points on Hacker News, one of the stronger signals of genuine builder interest we have seen this week.

OpenAI gave no public date for when Jalapeño will be running production traffic, and it did not say when API prices will change as a result. This is a hardware announcement, not a pricing announcement.

Open-source releases

Krea-2, the image generation model from Krea, is now available on Hugging Face under the Comfy-Org organization. It ranked in the top 25 trending models this week, a trend that GitHub star activity corroborates. If you use ComfyUI for image workflows, you can drop it in and test it directly.
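If you would rather script the download than click through the model page, here is a minimal sketch using the `snapshot_download` function from the `huggingface_hub` package. The repo id comes from the model page URL in our sources. Everything else is an assumption: we have not checked how the repo lays out its files, so the sketch only pulls them into a staging folder and lists the weight files it finds, leaving it to you to move them into the right ComfyUI model folders. The suffix list is our guess at common weight formats, and the download may be large.

```python
from pathlib import Path

# Repo id taken from the model page URL listed in the sources.
REPO_ID = "Comfy-Org/Krea-2"

# Assumption: common weight file extensions; adjust to what the repo contains.
WEIGHT_SUFFIXES = {".safetensors", ".ckpt", ".pt", ".gguf"}


def list_weight_files(folder):
    """Return weight files under `folder` as sorted paths relative to it."""
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in WEIGHT_SUFFIXES
    )


def download_to_staging(staging_dir):
    """Download the repo into `staging_dir`, then list its weight files."""
    # Imported lazily so list_weight_files works without the package.
    # Requires: pip install huggingface_hub
    from huggingface_hub import snapshot_download

    snapshot_download(repo_id=REPO_ID, local_dir=staging_dir)
    return list_weight_files(staging_dir)
```

Calling `download_to_staging("krea2_staging")` (a hypothetical folder name) would fetch the files and print-ready paths you can then sort into your ComfyUI install by hand.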

LanceDB hit v0.31.0-beta.3 and Weaviate shipped v1.38.2. Both are vector databases used in RAG pipelines. Neither release is a major version bump, but if you have a production RAG app on either, check the changelogs before upgrading.
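To decide how carefully to read a changelog, it helps to know what kind of bump you are looking at. The sketch below is a small standard-library helper, not part of either project, that compares a version you have installed against a release tag and reports the bump size and whether the new tag is a pre-release. The installed versions in the usage note are hypothetical; only the two release tags come from this article.

```python
import re

# Matches tags like "v1.38.2" or "0.31.0-beta.3" (optional "v", optional pre-release).
SEMVER = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")


def parse(tag):
    """Split a version tag into (major, minor, patch, prerelease-or-None)."""
    match = SEMVER.match(tag)
    if match is None:
        raise ValueError(f"not a recognised version tag: {tag!r}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    return major, minor, patch, match.group(4)


def classify_upgrade(installed, candidate):
    """Report the bump size and whether the candidate is a pre-release."""
    old = parse(installed)
    new = parse(candidate)
    if new[0] != old[0]:
        bump = "major"
    elif new[1] != old[1]:
        bump = "minor"
    elif new[2] != old[2]:
        bump = "patch"
    else:
        bump = "same"
    return {"bump": bump, "prerelease": new[3] is not None}
```

For example, `classify_upgrade("v1.38.1", "v1.38.2")` reports a patch bump that is not a pre-release, while `classify_upgrade("0.30.0", "v0.31.0-beta.3")` reports a minor bump that is a pre-release. A pre-release flag is a reasonable cue to hold off in production until you have read the changelog and tested.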

OpenHands (the open-source AI coding agent) released cloud-1.39.0. OpenHands lets you run an AI agent that writes and executes code in a sandboxed environment — no coding required to use the hosted version.

Research worth reading

Hugging Face published the FFASR Leaderboard — a new benchmark for speech recognition (ASR) models tested on real-world audio, not clean studio recordings. If you are building voice features into any app, this leaderboard tells you which models actually hold up with background noise, accents, and phone-quality audio.

Baidu's Unlimited-OCR Space is trending on Hugging Face. It claims to handle OCR without the usual page-count or resolution limits. It is worth a test if you are processing scanned documents or receipts.

What builders can do this week

1. Test Krea-2 in ComfyUI: install the Comfy-Org/Krea-2 model from Hugging Face and run it against your current image generation workflow. Compare output quality on product mockups or social graphics.

2. Check your ASR model choice: go to the FFASR Leaderboard on Hugging Face and find the top-ranked model for your use case (phone audio, accented speech, etc.). Swap it into a small voice transcription project and measure word error rate against what you are using today.

3. Try Baidu's Unlimited-OCR on a messy PDF: drag a scanned invoice or handwritten form into the Hugging Face Space and see if it beats your current OCR setup. No account needed to run the demo.
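For step 2, word error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of words in the reference. Here is a minimal standard-library sketch. It assumes you already have a hand-checked reference transcript, and it only lowercases and splits on whitespace, so punctuation handling is left to you.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Run the same audio clips through your current model and the candidate, score both outputs against the same reference, and compare the averages. A lower number is better, and WER can exceed 1.0 when the output contains many extra words.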

// what we actually tested

What we can and cannot confirm

Confirmed: OpenAI and Broadcom announced the Jalapeño inference chip on June 24, 2026. The announcement is live on OpenAI's blog and covered by TechCrunch.

Not independently verified by CBW: We have not seen the chip's benchmark numbers, power specs, or any independent hardware review. All performance claims come from OpenAI's own announcement.

Worth noting: OpenAI gave no date for when Jalapeño will handle production API traffic, and no pricing changes were announced. Do not plan around cost reductions yet.

Not independently verified by CBW: We have not tested Krea-2, Unlimited-OCR, or the FFASR Leaderboard rankings against real workloads. Trending on Hugging Face does not mean production-ready.

Worth noting: The TechCrunch article on the Jalapeño chip is the primary detailed source. The OpenAI blog post is the official announcement. Both are linked below.

Source: OpenAI blog — Jalapeño inference chip announcement — https://openai.com/index/openai-broadcom-jalapeno-inference-chip

Source: TechCrunch — OpenAI unveils its first custom chip, built by Broadcom — https://techcrunch.com/2026/06/24/openai-unveils-its-first-custom-chip-built-by-broadcom/

Source: Hugging Face — FFASR Leaderboard blog post — https://huggingface.co/blog/ffasr-leaderboard

Source: Hugging Face — Comfy-Org/Krea-2 model page — https://huggingface.co/Comfy-Org/Krea-2

Source: Hugging Face — Baidu Unlimited-OCR Space — https://huggingface.co/spaces/baidu/Unlimited-OCR

Source: GitHub — LanceDB v0.31.0-beta.3 — https://github.com/lancedb/lancedb