● LIVEReading: NewsUpdated: 10 min agoSubscribers: 23,400● LIVEReading: NewsUpdated: 10 min agoSubscribers: 23,400

// COOKEDBAKEDWORKED.COM

ModelsFri, Jun 5, 2026· 5 min read

ChatGPT gets 'Dreaming' memory, Gemma 4 12B lands on OpenRouter

OpenAI shipped a new memory system called Dreaming that lets ChatGPT consolidate what it learns about you over time. Meanwhile, Google's Gemma 4 12B is live and free to try.

OpenAI just shipped Dreaming — a new memory system for ChatGPT that processes your past conversations in the background and builds a richer picture of who you are and what you need. If you use ChatGPT daily, this is the biggest quality-of-life change in months.

New models

Google's Gemma 4 12B is now available on OpenRouter and Hugging Face. It is a unified, encoder-free multimodal model — meaning it handles text and images in a single architecture without a separate vision encoder bolted on. At 12 billion parameters it is small enough to run locally on a decent GPU. The HN thread hit 780 points, and it is cross-confirmed across Reddit r/LocalLLaMA, Hugging Face Models, and r/singularity, which makes it one of the most-discussed open model drops in weeks.

NVIDIA also released Nemotron 3.5 Content Safety on OpenRouter — and it is currently free to call. It is a multimodal safety classifier aimed at enterprises that need to filter or audit AI outputs across text and images. If you are building anything customer-facing that touches sensitive content, this is worth a look before you pay for a third-party moderation API.

Tools

A Medium post that climbed to 106 points on Hacker News describes a trick where Claude Code and OpenAI Codex communicate with each other in real time through Git commits. The idea: one agent writes code and commits, the other pulls, reviews, and responds via another commit. It is a clever hack for multi-agent workflows that does not require any special API bridge — just a shared repo. Cross-confirmed on Reddit r/ClaudeAI and r/singularity.

Hugging Face published a post on redesigning their hf CLI to be agent-friendly. The short version: the CLI now exposes structured outputs and flags that make it easier for an AI agent to call Hub operations (upload, download, search) without scraping HTML or guessing at command syntax. Useful if you are building agents that need to pull or push models programmatically.

Industry moves

Anthropic launched a Services Track and Partner Hub inside the Claude Partner Network. This is aimed at consultants and agencies — not end users. If you help businesses deploy AI, this is Anthropic's formal channel for getting listed, certified, and presumably referred.

OpenAI published a 'blueprint for democratic governance of frontier AI' — a policy document, not a product. It outlines how OpenAI thinks governments and international bodies should oversee the most powerful AI systems. Worth reading if you care about regulation; not relevant if you are trying to ship something this week.

Research worth reading

Anthropic released a report mapping a full year of AI-enabled cyber threats against the MITRE ATT&CK framework. It is one of the more concrete public datasets on how AI is actually being used in attacks — not speculation, but documented cases. Security-minded builders should skim it.

Huawei's CSL team released KVarN, a native vLLM backend for KV-cache quantization. The HN thread hit 127 points with cross-confirmation on r/LocalLLaMA and r/MachineLearning. If you are running inference locally and hitting memory limits, this is a real engineering tool — not a demo.

What builders can do this week

1. Pull Gemma 4 12B from Hugging Face and test it on a multimodal task you currently pay a closed API for — describe a product image, parse a receipt, caption a screenshot. It is free and local.

2. Set up a two-agent Git loop using the Claude Code + Codex pattern from the Medium post. Pick a small, real task — a script that needs both writing and review — and let the two agents pass commits back and forth. You will learn quickly where the handoff breaks.

3. Call NVIDIA's Nemotron 3.5 Content Safety endpoint on OpenRouter (currently free) to audit a batch of user-generated content from a project you are already running. Compare its flags against whatever you are doing today.

// what we actually tested

What we can and cannot confirm

Confirmed: OpenAI published the Dreaming memory post on openai.com, cross-confirmed by Reddit r/ChatGPT, r/singularity, and r/StableDiffusion.

Confirmed: Gemma 4 12B is live on Hugging Face and OpenRouter, cross-confirmed across r/LocalLLaMA, Hugging Face Models, and r/singularity.

Confirmed: Nemotron 3.5 Content Safety is listed as free on OpenRouter as of today's signals.

Not independently verified by CBW: We have not tested the Claude Code + Codex Git conversation trick ourselves. The Medium post describes a working prototype, but edge cases and reliability are unknown.

Worth noting: OpenAI's 'frontier AI governance blueprint' is a policy document. It contains no product announcements and no binding commitments — treat it as a position paper, not a roadmap.

Source: OpenAI: ChatGPT Dreaming memory — https://openai.com/index/chatgpt-memory-dreaming

Source: Google: Gemma 4 12B launch blog — https://blog.google/innovation-and-ai/technology/developers-tools/introducing-gemma-4-12b/

Source: Medium / HN: Claude Code and Codex via Git — https://medium.com/@Koukyosyumei/claude-code-and-codex-can-have-real-time-conversation-via-git-f95b696c1c05

Source: Hugging Face: Nemotron 3.5 Content Safety — https://huggingface.co/blog/nvidia/nemotron-3-5-content-safety

Source: Anthropic: Services Track and Partner Hub — https://www.anthropic.com/news/services-track-partner-hub

Source: GitHub: KVarN by Huawei CSL — https://github.com/huawei-csl/KVarN