// COOKEDBAKEDWORKED.COM

Models · Mon, Jun 8, 2026 · 5 min read

DeepSeek V4 Pro claims to beat GPT-5.5 Pro; llama.cpp adds Gemma 4 MTP support

DeepSeek V4 Pro is being reported as outperforming GPT-5.5 Pro on precision benchmarks. Meanwhile, llama.cpp just merged Gemma 4 MTP support, and you can run Gemma 4 26B on CPU alone.

DeepSeek V4 Pro is making noise today, with reports that it beats GPT-5.5 Pro on precision benchmarks. If that holds up under scrutiny, it matters for builders picking a backend model: lower-priced models from Chinese labs keep closing the gap with OpenAI's flagship tiers.

New models

DeepSeek V4 Pro is being reported as outperforming GPT-5.5 Pro on precision metrics, according to RuntimeWire, which also cited Hugging Face model listings. DeepSeek models have historically been available via API and as open weights, so if V4 Pro follows that pattern, builders could run it directly. No official DeepSeek release post has been independently confirmed by CBW at time of writing.

On the local side: llama.cpp merged Gemma 4 MTP (multi-token prediction) support in build b9553. MTP speeds up inference by having the model propose several upcoming tokens per step instead of one. If you run models locally, pull the latest llama.cpp (b9553 or later) for faster Gemma 4 generation, reportedly with no config changes, though that is worth checking against the merge notes for your setup.
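
For readers new to the idea, the draft-and-verify loop behind this kind of speedup can be sketched in a few lines. This is a toy illustration, not llama.cpp's implementation: `target_next` and `draft_next` are hypothetical stand-ins for the full model and a cheaper multi-token predictor, and the "tokens" are plain integers.

```python
from typing import List, Tuple


def target_next(ctx: List[int]) -> int:
    # Hypothetical "full model": a deterministic toy rule.
    return (ctx[-1] * 3 + 1) % 17


def draft_next(ctx: List[int]) -> int:
    # Hypothetical cheap predictor: matches the target except
    # when the last token is a multiple of 5.
    guess = (ctx[-1] * 3 + 1) % 17
    return guess if ctx[-1] % 5 else (guess + 1) % 17


def greedy(prompt: List[int], n: int, next_fn) -> List[int]:
    # Baseline: one model call per generated token.
    out = list(prompt)
    for _ in range(n):
        out.append(next_fn(out))
    return out[len(prompt):]


def draft_and_verify(prompt: List[int], n: int, k: int = 4) -> Tuple[List[int], int]:
    out = list(prompt)
    passes = 0  # counts verification rounds (one batched pass each, conceptually)
    while len(out) - len(prompt) < n:
        # 1) Draft k tokens cheaply.
        draft: List[int] = []
        for _ in range(k):
            draft.append(draft_next(out + draft))
        # 2) Verify every drafted position against the target.
        passes += 1
        accepted: List[int] = []
        for tok in draft:
            expected = target_next(out + accepted)
            accepted.append(expected)  # the target's token is always kept
            if tok != expected:
                break  # first mismatch: discard the rest of the draft
        out.extend(accepted)
    return out[len(prompt):][:n], passes


if __name__ == "__main__":
    tokens, passes = draft_and_verify([1], 20)
    print(tokens == greedy([1], 20, target_next), passes)
```

The output is identical to plain greedy decoding with the target, because a drafted token is only kept when the target agrees with it. The saving comes from needing fewer verification rounds than generated tokens.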

Also confirmed: Unsloth's quantized Gemma 4 12B (QAT GGUF) is trending on Hugging Face. QAT (quantization-aware training) generally preserves more quality than standard post-training quantization at the same file size. The 12B should fit on most mid-range machines.
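
To make the trade-off concrete, here is a minimal sketch of the rounding that 4-bit quantization applies to a block of weights. It assumes a simple symmetric per-block scheme chosen for illustration; actual GGUF quantization formats are more elaborate, and the weight values are made up. Quantization-aware training exposes the model to this rounding while it trains, whereas post-training quantization applies it to weights that never saw it.

```python
from typing import List, Tuple


def quantize_int4(block: List[float]) -> Tuple[float, List[int]]:
    """Symmetric 4-bit quantization: map floats to integer codes in [-7, 7]."""
    max_abs = max(abs(w) for w in block)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in block]
    return scale, codes


def dequantize(scale: float, codes: List[int]) -> List[float]:
    """Recover approximate floats from the stored scale and integer codes."""
    return [scale * c for c in codes]


# Made-up weights for illustration only.
weights = [0.82, -0.41, 0.05, -0.77, 0.33, 0.0, -0.12, 0.6]
scale, codes = quantize_int4(weights)
restored = dequantize(scale, codes)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

print("codes:", codes)
print(f"max rounding error: {max_err:.4f} (bound: {scale / 2:.4f})")
```

Each weight can move by up to half a quantization step. How well a model tolerates that shift is what separates a good 4-bit file from a bad one of the same size.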

Run big models without a GPU

A widely shared LocalLLaMA post this week makes a practical point: Gemma 4 26B A4B runs on CPU-only machines. The "A4B" suffix most likely denotes about 4B active parameters per token (a mixture-of-experts design) rather than 4-bit quantization, which would explain why CPU inference is feasible at all. It's slow, but it works. For builders who want a capable local model without buying a GPU, this is a real option right now. The Unsloth GGUF versions on Hugging Face are the easiest starting point.
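
A rough back-of-envelope estimate shows why this can work. The sketch below assumes a mixture-of-experts model with about 26B total and 4B active parameters, quantized to roughly 4.5 bits per weight, on a machine with 50 GB/s of memory bandwidth. All of these figures are illustrative assumptions, not measurements, and real throughput will be lower once compute and cache effects are included.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights in gigabytes."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def tokens_per_second_ceiling(
    active_params_billion: float, bits_per_weight: float, bandwidth_gb_s: float
) -> float:
    """Upper bound: each token must stream the active weights from RAM once."""
    return bandwidth_gb_s / weights_gb(active_params_billion, bits_per_weight)


# Illustrative assumptions, not measured values.
TOTAL_B = 26.0       # total parameters, in billions
ACTIVE_B = 4.0       # parameters used per token, in billions
BITS = 4.5           # average bits per weight after quantization
BANDWIDTH = 50.0     # memory bandwidth in GB/s

ram_needed = weights_gb(TOTAL_B, BITS)
moe_ceiling = tokens_per_second_ceiling(ACTIVE_B, BITS, BANDWIDTH)
dense_ceiling = tokens_per_second_ceiling(TOTAL_B, BITS, BANDWIDTH)

print(f"RAM for weights: {ram_needed:.1f} GB")
print(f"MoE ceiling:     {moe_ceiling:.1f} tokens/s")
print(f"Dense ceiling:   {dense_ceiling:.1f} tokens/s")
```

Under these assumptions the weights need roughly 15 GB of RAM, and the bandwidth ceiling is near 22 tokens per second, versus about 3 for a dense model of the same total size. The gap is why a small active-parameter count matters more than the headline size for CPU inference.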

Tools

office-open-xml-viewer is a new open-source tool that parses Office XML files (.docx, .xlsx, .pptx) and renders them to HTML Canvas — no Word or Excel needed. It scored 127 points on Hacker News. If you're building a document-handling app and need to display Office files in the browser without a paid conversion API, this is worth a look.
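
For context on what such a viewer has to do: Office Open XML files are ZIP archives of XML parts. The sketch below is not office-open-xml-viewer's API, which this article has not examined. It uses only Python's standard library to pull paragraph text out of a .docx, and `extract_paragraphs` is a hypothetical helper name.

```python
import zipfile
import xml.etree.ElementTree as ET
from typing import BinaryIO, List, Union

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_paragraphs(docx: Union[str, BinaryIO]) -> List[str]:
    """Return the plain text of each paragraph in a .docx file.

    Simplified on purpose: ignores styles, tables layout, images,
    headers, footers, and tracked changes.
    """
    with zipfile.ZipFile(docx) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs: List[str] = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across runs; join every <w:t> node.
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Calling `extract_paragraphs("report.docx")` on a local file (the filename is a placeholder) returns a list of strings, one per paragraph. Rendering to a canvas with layout and fonts intact is far more work, which is the gap a dedicated viewer fills.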

Industry moves

A GitHub issue requesting an official Claude Desktop app for Linux hit 480 points on Hacker News and is cross-confirmed on Reddit's Claude AI community. Anthropic has not responded publicly. Right now, Linux users have to run Claude in the browser or use the API. The volume of upvotes signals real demand, so it is worth watching for whether Anthropic responds.

What builders can do this week

1. Test DeepSeek V4 Pro against your current model on a real task you care about — use the DeepSeek API playground or OpenRouter if it's listed there. Compare output quality on your actual prompts, not just benchmark tables.

2. Update llama.cpp to build b9553 or later and run a Gemma 4 model locally. If you've been on an older build, the Gemma 4 MTP merge alone is worth the update for faster token generation.

3. Drop office-open-xml-viewer into a side project that needs to preview .docx or .xlsx files in the browser. It's open source, no API key required, and saves you from wiring up a paid document conversion service.
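
The first step above can be sketched as a small side-by-side harness. The model callables here are hypothetical stubs. In practice each would wrap whatever client you already use for the DeepSeek API, OpenRouter, or a local llama.cpp server, none of which is shown.

```python
import time
from typing import Callable, Dict, List

ModelFn = Callable[[str], str]


def compare(prompts: List[str], models: Dict[str, ModelFn]) -> List[dict]:
    """Run every prompt through every model and record answer and latency."""
    rows = []
    for prompt in prompts:
        for name, ask in models.items():
            start = time.perf_counter()
            answer = ask(prompt)
            rows.append(
                {
                    "prompt": prompt,
                    "model": name,
                    "answer": answer,
                    "seconds": time.perf_counter() - start,
                }
            )
    return rows


# Hypothetical stand-ins: replace with real client calls.
def stub_current(prompt: str) -> str:
    return f"[current model] {prompt.upper()}"


def stub_candidate(prompt: str) -> str:
    return f"[candidate model] {prompt[::-1]}"


if __name__ == "__main__":
    results = compare(
        ["Summarize this support ticket: printer offline after update."],
        {"current": stub_current, "candidate": stub_candidate},
    )
    for row in results:
        print(f"{row['model']:>10} | {row['seconds']:.4f}s | {row['answer']}")
```

Reading the answers side by side on your own prompts is the point. The timing column is a rough extra signal, not a benchmark.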

// what we actually tested

What we can and can't confirm

Not independently verified by CBW: The DeepSeek V4 Pro benchmark claim comes from RuntimeWire, a single source. We have not seen an official DeepSeek release post or tested the model ourselves. Treat precision benchmark comparisons with caution until more labs reproduce them.

Confirmed: Gemma 4 MTP support was merged into llama.cpp as of build b9553, cross-confirmed by the llama.cpp GitHub repo and LocalLLaMA community posts.

Confirmed: Unsloth's gemma-4-12B-it-qat-GGUF is trending on Hugging Face and cross-confirmed on LocalLLaMA.

Confirmed: The Linux Claude Desktop GitHub issue has 480 Hacker News points and Reddit cross-confirmation, but Anthropic has made no public commitment to ship it.

Worth noting: The Gemma 4 26B CPU-only claim comes from a Reddit post, not an official benchmark. Expect it to be very slow on most consumer hardware — usable, but not fast.

Source: RuntimeWire — DeepSeek V4 Pro vs GPT-5.5 Pro — https://runtimewire.com/article/deepseek-v4-pro-beats-gpt-5-5-pro-on-precision

Source: GitHub — llama.cpp build b9553 — https://github.com/ggml-org/llama.cpp

Source: LocalLLaMA — llama.cpp Gemma 4 MTP support merged — https://www.reddit.com/r/LocalLLaMA/comments/1tzbcyp/llamacpp_gemma4_mtp_support_merged/

Source: LocalLLaMA — Run Gemma 4 26B without a GPU — https://www.reddit.com/r/LocalLLaMA/comments/1tz5ffp/you_dont_need_a_gpu_to_run_gemma426ba4b/

Source: GitHub — Anthropic Claude Desktop for Linux issue — https://github.com/anthropics/claude-code/issues/65697

Source: Hugging Face — unsloth/gemma-4-12B-it-qat-GGUF — https://huggingface.co/unsloth/gemma-4-12B-it-qat-GGUF