
// COOKEDBAKEDWORKED.COM

Models · Wed, May 13, 2026 · 5 min read

OpenAI launches GPT-5 with built-in agent mode

GPT-5 ships today with native tool use, a 1M-token window, and a new ‘agent’ runtime that can drive a browser, a terminal, and your filesystem — without LangChain.

OpenAI announced GPT-5 this morning. It is a single model with two switches: one for thinking (slower, more thorough) and one for agent mode (a sandboxed runtime that can use a browser, a shell, and a virtual filesystem on your behalf). The default context window is 1M tokens. Pricing lands at roughly half of GPT-4o for the standard tier.

The headline trick: the model can now answer with a single tool call or a chain of tool calls without you wiring an agent loop. If you set agent_mode=true and pass it a goal, it produces, runs, inspects, and revises its own steps until it is done or until it asks for help.
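For context, here is a minimal sketch of the kind of hand-wired agent loop that paragraph says you can skip. It is not OpenAI's runtime: fake_model, the TOOLS registry, and the action format are all hypothetical stand-ins so the loop runs without any network access.

```python
from typing import Callable

# Hypothetical tool registry; names and behavior are illustrative only.
TOOLS: dict[str, Callable[[str], str]] = {
    "shell": lambda arg: f"ran: {arg}",
    "read_file": lambda arg: f"contents of {arg}",
}


def fake_model(goal: str, history: list) -> dict:
    """Stand-in for a model call: asks for two tool calls, then finishes."""
    plan = [
        {"type": "tool_call", "tool": "read_file", "arg": "schema.sql"},
        {"type": "tool_call", "tool": "shell", "arg": "pytest -q"},
    ]
    if len(history) < len(plan):
        return plan[len(history)]
    return {"type": "final", "text": f"done: {goal}"}


def run_agent_loop(goal: str, model=fake_model, max_steps: int = 10):
    """Ask the model for an action, run the tool, feed the result back, repeat."""
    history = []
    for _ in range(max_steps):
        action = model(goal, history)
        if action["type"] == "final":
            return action["text"], history
        result = TOOLS[action["tool"]](action["arg"])
        history.append({"action": action, "result": result})
    # Safety valve so a confused model cannot loop forever.
    return "stopped: step limit reached", history


if __name__ == "__main__":
    answer, steps = run_agent_loop("fix the tests")
    print(answer, len(steps))
```

The step limit is the part worth keeping even if you never write this loop yourself: any agent, managed or not, needs a hard stop.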

What changed under the hood

GPT-5 was trained with a new RLHF variant OpenAI is calling ‘interactive correction’. Annotators didn’t just rank final answers — they watched a model attempt a task end-to-end and corrected it mid-stream. The result is a model that backtracks more cleanly and asks better clarifying questions when the goal is ambiguous.

Two numbers worth memorizing: SWE-bench Verified is up to 78.4 (from 49.8 for GPT-4o), and the 1M-token window stays coherent in OpenAI’s long-context evals out past 700K tokens. That second number is the interesting one — long-context models often degrade past 100K, but GPT-5 reads a small codebase the way GPT-4o reads a long email.

Agent mode, in practice

Agent mode is a managed runtime. You hand the API a goal, optional starter files, and a cost ceiling. The model gets a shell, a headless browser, and a scratchpad. It can run code, save files, and re-read what it wrote. You get a streamed event log so you can stop it the moment it goes sideways.
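To make "stop it the moment it goes sideways" concrete, here is a rough sketch of a client-side watcher over an event log. The event shape (type, tool, cost_usd) is our assumption, not a documented format, and the events are simulated in a plain list rather than read from a real stream.

```python
from typing import Iterable, Iterator


def watch(events: Iterable[dict], cost_ceiling_usd: float) -> Iterator[dict]:
    """Pass events through until the run finishes or spend exceeds the ceiling."""
    spent = 0.0
    for event in events:
        spent += event.get("cost_usd", 0.0)
        if spent > cost_ceiling_usd:
            # Emit a synthetic event and stop reading the log.
            yield {"type": "aborted", "reason": "cost ceiling", "spent_usd": round(spent, 4)}
            return
        yield event
        if event["type"] == "done":
            return


# Simulated log; a real one would arrive incrementally.
SAMPLE_LOG = [
    {"type": "tool_call", "tool": "shell", "cost_usd": 0.04},
    {"type": "tool_call", "tool": "browser", "cost_usd": 0.04},
    {"type": "tool_call", "tool": "shell", "cost_usd": 0.04},
    {"type": "done", "cost_usd": 0.0},
]

if __name__ == "__main__":
    for event in watch(SAMPLE_LOG, cost_ceiling_usd=0.10):
        print(event)
```

This only stops reading on the client. Actually cancelling a run would need whatever cancel mechanism the API exposes, which we have not seen documented.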

The first time I watched it debug its own SQL by reading the error and adding LIMIT 5, then re-running, I closed the laptop and went for a walk.
— Indie dev who joined the preview last month

What it’s not

It is not a magic engineer. We tested four of our recent build guides under agent mode, and it nailed the two easy ones in a single shot. The medium one needed one nudge. The spicy one (the swarm-orchestrator multi-agent guide) still failed in the same place humans fail: it could not decide between two libraries with overlapping APIs.

Pricing and rollout

Standard tier: $5 / 1M input tokens, $15 / 1M output (about 1/2 of GPT-4o).
Agent mode: standard tokens + $0.04 per tool call.
1M context costs the same per-token; no premium for long inputs.
ChatGPT Plus users see GPT-5 today; Pro gets agent mode this week.
API access is gated until next Monday — sign up for the waitlist.

The cheaper price is what will matter for builders. If you were burning $40/month on GPT-4o for a side project, the same usage on GPT-5 is closer to $18. That alone is enough to make agent loops financially boring instead of a Big Deal.
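If you want to sanity-check your own bill, the arithmetic is short. This sketch uses only the list prices quoted above ($5 and $15 per 1M tokens, $0.04 per tool call); the function name and the sample token counts are ours, purely for illustration.

```python
# List prices as quoted in this article; adjust if the published pricing differs.
INPUT_USD_PER_M = 5.00
OUTPUT_USD_PER_M = 15.00
TOOL_CALL_USD = 0.04


def run_cost(input_tokens: int, output_tokens: int, tool_calls: int = 0) -> float:
    """Estimated cost in USD for one request or one agent run."""
    token_cost = (
        input_tokens / 1_000_000 * INPUT_USD_PER_M
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_M
    )
    return round(token_cost + tool_calls * TOOL_CALL_USD, 4)


if __name__ == "__main__":
    # Hypothetical agent run: 200K tokens in, 20K out, 25 tool calls.
    print(run_cost(200_000, 20_000, tool_calls=25))
    # A full 1M-token prompt with no output counted.
    print(run_cost(1_000_000, 0))
```

In that made-up agent run the tool-call fee matches the input-token cost, so a chatty agent can be priced more by its call count than by its tokens.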

// How to use this

Three things you can ship this weekend

01
Swap your existing GPT-4o calls with a one-line change
GPT-5 is a drop-in via the same /chat/completions endpoint. Once API access opens, change the model name and run your test suite. If your prompts relied on JSON-mode quirks of GPT-4o, double-check those: the new model is stricter about schemas.
See the model swap guide →
02
Build the email agent we’ve been waiting for
Agent mode + Gmail = an inbox assistant that drafts replies, files threads, and books meetings — all from natural-language goals. Our voice-clone guide already has the Gmail OAuth piece done; you can reuse it.
Voice Clone guide →
03
Stop renting a bigger context
If you’ve been chunking documents for retrieval just to fit GPT-4o’s window, try the naive ‘paste the whole thing’ approach on GPT-5 first. For docs under 700K tokens it is now usually faster and simpler than your RAG pipeline. Whether it is also cheaper depends on volume: at $5 per 1M input tokens, a 700K-token prompt costs about $3.50 per call.
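For the third item, a quick pre-flight check can tell you whether 'paste the whole thing' is even on the table. This is a rough sketch: the four-characters-per-token ratio is a crude heuristic we are assuming, not a tokenizer, and the 700K limit is the figure this article attributes to OpenAI's long-context evals.

```python
import math

COHERENT_LIMIT_TOKENS = 700_000  # figure attributed to OpenAI's evals in this article
INPUT_USD_PER_M = 5.00           # standard-tier input price quoted in this article
CHARS_PER_TOKEN = 4              # rough heuristic assumption, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def plan_prompt(document: str, question: str) -> dict:
    """Decide between pasting the whole document and chunking for retrieval."""
    tokens = estimate_tokens(document) + estimate_tokens(question)
    strategy = "paste_whole" if tokens <= COHERENT_LIMIT_TOKENS else "chunk_and_retrieve"
    return {
        "strategy": strategy,
        "estimated_tokens": tokens,
        # Input cost if the whole document were sent in one call.
        "input_cost_usd": round(tokens / 1_000_000 * INPUT_USD_PER_M, 4),
    }


if __name__ == "__main__":
    print(plan_prompt("x" * 400, "What changed?"))
```

For anything near the limit, swap the heuristic for a real tokenizer count before trusting the answer.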

// what we actually tested

What we actually tested

We pointed agent mode at four of our published guides and graded the results. Two passed first try. One needed a single nudge (‘use psycopg, not psycopg2-binary’). One failed exactly where humans fail. We didn’t test the 1M-token window past 400K — we don’t have anything that long worth feeding it.

Numbers in this article come from OpenAI’s release post and the SWE-bench leaderboard. We did not verify the SWE-bench number ourselves and will update if we get a re-run.
