LIVEReading: NewsUpdated: 10 min agoSubscribers: 23,400 LIVEReading: NewsUpdated: 10 min agoSubscribers: 23,400
CBW

NVIDIA Cosmos 3 lands on Hugging Face, llama.cpp ships b9444

NVIDIA's Cosmos 3 is now on Hugging Face — an open omni-model built for physical AI reasoning. Plus llama.cpp hits b9444 and Unsloth drops v0.1.43-beta.

NVIDIA just posted Cosmos 3 to Hugging Face — calling it the first open omni-model aimed at physical AI reasoning and action. That means robotics, simulation, and real-world decision-making, not just chat. If you build anything that needs a model to understand physical space and act on it, this is worth your attention today.

New models

Cosmos 3 is described as an omni-model: it takes in video, images, and text, and is designed to reason about physical environments and output actions. NVIDIA published it on Hugging Face under their blog channel. The model targets physical AI use cases — think robot control, simulation grounding, and embodied agents — rather than general-purpose chat. It is listed as open, but check the license before you ship anything commercial.

Separately, NVIDIA also quantized Qwen3.6-35B-A3B into NVFP4 format and posted it to Hugging Face, where it climbed to rank 22. The NVFP4 format is NVIDIA's own 4-bit float scheme, so you will need compatible hardware (Blackwell or later). Cross-confirmed on Reddit r/LocalLLaMA.

MiniMax M3 appeared on OpenRouter this week. Not much documentation is public yet, but it is live and callable via the OpenRouter API if you want to test it.

Open-source releases

llama.cpp tagged build b9444. This is a routine but important release — llama.cpp is the backbone for running local models on consumer hardware. If you are on an older build, update before trying any new GGUF files. Cross-confirmed on Reddit r/LocalLLaMA.

Unsloth released v0.1.43-beta. Unsloth is the fine-tuning library that cuts VRAM use and speeds up training on consumer GPUs. Beta means rough edges, but if you are actively fine-tuning models, it is worth testing. Check the GitHub changelog for what changed in this version before upgrading a production workflow.

A new paper — Light Interaction — proposes training-free inference acceleration for interactive video world models. Cross-confirmed across Reddit r/MachineLearning and r/StableDiffusion. The idea: speed up video generation at inference time without retraining. No released code yet as of this writing.

Worth reading

A Hacker News post titled 'The solution might be cancelling my AI subscription' hit 347 points. The author argues that AI subscriptions are delivering diminishing returns for many use cases and that the cost-to-value ratio is slipping. It is not a research paper, but it reflects real builder fatigue. Worth a read if you are deciding whether to renew Claude Pro, ChatGPT Plus, or similar.

Also on HN: the rsync project posted 'Please Do Not Vibe Fuck Up This Software' — a direct message to developers using AI coding tools to submit pull requests to rsync without understanding the codebase. 479 points. The maintainers are frustrated. If you use AI to generate open-source contributions, read the project's contribution guidelines first.

What builders can do this week

1. Pull Cosmos 3 from Hugging Face and run the demo notebook on a video clip of a physical space — a room, a table, a workshop. See whether it can answer spatial questions about what it sees. This is a concrete test of whether it is useful for your robotics or simulation project.

2. Update llama.cpp to b9444 and re-run your favorite local GGUF model. Time the inference before and after. Keep a simple log — it takes five minutes and tells you whether the update matters for your hardware.

3. If you pay for more than one AI subscription, spend 20 minutes this week listing which tasks you actually use each one for. The HN thread above is a prompt to audit, not just renew. Cancel what you are not using.

Honest note

// what we actually tested

Honest note

Confirmed: NVIDIA Cosmos 3 is live on Hugging Face as of the blog post date. The model page and blog post are publicly accessible.

Not independently verified by CBW: We have not run Cosmos 3 locally or tested its physical AI reasoning claims. NVIDIA's framing as 'first open omni-model for physical AI' is their own marketing language.

Not independently verified by CBW: MiniMax M3 is listed on OpenRouter but we have not tested it and no detailed spec sheet was available at time of writing.

Worth noting: The NVFP4 quantization of Qwen3.6-35B requires Blackwell-generation NVIDIA hardware. Most consumer GPU owners cannot run it yet.

Worth noting: The Light Interaction paper (2605.31158) has no public code release confirmed as of today. Cross-source buzz on Reddit does not equal a usable tool.

Source: NVIDIA Cosmos 3 — Hugging Face blog — https://huggingface.co/blog/nvidia/cosmos-3-for-physical-ai

Source: llama.cpp b9444 — GitHub — https://github.com/ggml-org/llama.cpp

Source: Unsloth v0.1.43-beta — GitHub — https://github.com/unslothai/unsloth

Source: MiniMax M3 — OpenRouter — https://openrouter.ai/minimax/minimax-m3

Source: The solution might be cancelling my AI subscription — HN — https://thoughts.hmmz.org/2026-05-31.html

Source: Please Do Not Vibe Fuck Up This Software — rsync GitHub issue — https://github.com/RsyncProject/rsync/issues/929

// daily build

One project. 5 minutes. Daily.

Get tomorrow's best AI project in your email. With a guide that works. Free. No spam.

23,400 builders read this