CBW
Medium · github.com/mlc-ai/web-stable-diffusion · 2026-05-20

Run Stable Diffusion in your browser with WebGPU

Web Stable Diffusion compiles Stable Diffusion (SD) to WebGPU so it runs entirely inside Chrome: no Python, no GPU toolkit to set up, no install. Open the tab, type a prompt, get an image. Close the tab and the running model goes with it (the downloaded weights stay in the browser cache).

// Build stats

  • Total time: 10 min
  • Number of steps: 5
  • Difficulty: Medium
  • Worked first time: 60%
// Before you start

What you need

  • A computer with at least 6 GB of free disk space (the model is large and cached by the browser)
  • Chrome 113+ OR Edge 113+ with WebGPU enabled (Firefox/Safari support is partial — Chrome is the most reliable path)
  • A discrete GPU is strongly recommended. Integrated GPUs work but are slow
  • Roughly 8 GB of free RAM during generation
  • Stable internet for the one-time model download (~1.5 GB)
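
If you'd rather check the disk requirement than guess, here is a minimal Python sketch. It assumes the browser cache sits on the same volume as the path you pass in and that "6 GB" means decimal gigabytes; the guide states neither. The function names are made up for this example.

```python
import shutil

GB = 1000 ** 3  # decimal gigabyte (assumption: the guide does not say GB vs GiB)


def free_disk_gb(path):
    """Free space, in decimal GB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / GB


def has_room_for_model(free_gb, required_gb=6.0):
    """True when free space meets the 6 GB figure from the list above."""
    return free_gb >= required_gb


def disk_report(path):
    """One-line summary for the volume that holds `path`."""
    free = free_disk_gb(path)
    verdict = "ok" if has_room_for_model(free) else "too little"
    return f"{free:.1f} GB free on {path!r}: {verdict}"
```

Point `disk_report` at your home directory (or wherever your browser profile lives) to get a quick verdict. The RAM figure needs a third-party package such as psutil to check, so it is left out here.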
01
Step 1 of 5

Confirm your browser actually supports WebGPU

1 min

WebGPU is the new GPU API for the browser, distinct from the older WebGL. If WebGPU isn't enabled, Web Stable Diffusion can't run and the page just shows an error message. Check first.

Terminal · mac
$ # Open the report page in Chrome (swap in "Microsoft Edge" for Edge):
$ open -a "Google Chrome" https://webgpureport.org
$
$ # You should see a green 'WebGPU is supported' header
$ # and an Adapter section showing your GPU name.
What you should see
webgpureport.org loads a green banner that says WebGPU is supported, and lists your GPU (e.g. 'Apple M2 Pro' or 'NVIDIA GeForce RTX 3060').
This might happen

webgpureport.org says WebGPU is not enabled.

First confirm the browser is version 113 or later. Then open chrome://flags, search for 'WebGPU', set 'Unsafe WebGPU' or 'WebGPU Developer Features' to Enabled, and restart the browser. On Linux you may also need to enable Vulkan.

02
Step 2 of 5

Open the official demo (no install needed)

1 min

The MLC team hosts a live demo at mlc.ai/web-stable-diffusion that pulls the WASM/WebGPU runtime, the tokenizer, and the compiled SD model directly into your browser. No clone, no build. You'll only need step 3 (local install) if you want to embed it in your own site.

Terminal · mac
$ # Open the demo in Chrome:
$ open -a "Google Chrome" https://mlc.ai/web-stable-diffusion/
What you should see
Page loads with a prompt input, model dropdown (e.g. 'sd-v1.5'), and a 'Load model' button. The console (DevTools → Console) shows tvmjs WebGPU adapter initialized.
This might happen

The page hangs at 'Loading model' for several minutes.

The first load downloads ~1.5 GB of model weights into your browser's cache. Watch the Network tab: it really is downloading. Subsequent loads take seconds.
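
To judge whether "several minutes" is normal for your connection, a rough back-of-the-envelope helper follows. It assumes 1 GB = 8000 megabits and a steady link with no protocol overhead, so treat the result as a lower bound. The three bandwidth figures are illustrative, not measurements.

```python
def download_seconds(size_gb, mbit_per_s):
    """Seconds needed to move `size_gb` decimal gigabytes at `mbit_per_s` megabits/s."""
    if mbit_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    megabits = size_gb * 8 * 1000  # 1 GB = 8000 megabits
    return megabits / mbit_per_s


# Illustrative link speeds for the ~1.5 GB first-load download.
for speed in (10, 50, 200):
    minutes = download_seconds(1.5, speed) / 60
    print(f"{speed:>4} Mbit/s: about {minutes:.0f} min")
```

That works out to roughly 20 minutes at 10 Mbit/s, 4 minutes at 50 Mbit/s, and 1 minute at 200 Mbit/s. If the page has been stuck far longer than your estimate and the Network tab is idle, reload.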

03
Step 3 of 5

(Optional) Run it locally from a clone

3 min

If you want to host the demo on your own machine — for offline use or to customize the UI — clone the repo and serve the prebuilt artifacts with any static web server. No Python build needed; the compiled model artifacts ship in the repo.

Terminal · mac
$ git clone https://github.com/mlc-ai/web-stable-diffusion.git
$ cd web-stable-diffusion
$
$ # Any static server works. Python is easy:
$ python3 -m http.server 8080
$
$ # Now open: http://localhost:8080/site/index.html
What you should see
Localhost serves the same demo UI as the hosted version. Model still loads from the configured CDN URLs unless you mirror them.
This might happen

Browser blocks fetch with a CORS error when serving locally.

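Don't open the HTML file directly with file://. Use python3 -m http.server (or any HTTP server) so the page loads via http://localhost. Chrome blocks fetch() requests for local files from file:// pages, and WebGPU is only exposed in secure contexts, which include http://localhost.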
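
If you want a little more control than the one-liner gives you, here is a sketch of an equivalent server using only the Python standard library. It binds to 127.0.0.1 only and forces the `application/wasm` MIME type for `.wasm` files. Whether this demo needs that MIME type is an assumption: browsers require it for streaming WebAssembly compilation, but the guide doesn't say how the demo loads its runtime. `DemoHandler` and `make_server` are names made up for this example.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class DemoHandler(SimpleHTTPRequestHandler):
    # Serve .wasm as application/wasm even on Pythons whose MIME table lacks it.
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".wasm": "application/wasm",
    }


def make_server(directory=".", port=8080):
    """Build (but do not start) a static server for `directory`, localhost only."""
    handler = functools.partial(DemoHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)


# To serve the clone, run from the repo root:
#     make_server(".", 8080).serve_forever()
```

Binding to 127.0.0.1 keeps the server off your local network, which the plain `python3 -m http.server 8080` does not do unless you add `--bind 127.0.0.1`.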

04
Step 4 of 5

Generate your first image

1-3 min (mostly waiting for the GPU)

Once the model is loaded, type a prompt and hit Generate. The compiled model runs entirely on your GPU via WebGPU — no API call leaves the browser. Generation speed depends on your GPU: an M2 Pro produces a 512×512 image in ~30 seconds; an integrated Intel iGPU may take 3-5 minutes.

Terminal · mac
$ # In the demo UI:
$ # Prompt: 'a watercolor painting of a quiet harbor at sunrise'
$ # Steps: 20 (default is usually 50; lower to 20 for the first test)
$ # Click 'Generate'.
$
$ # Watch DevTools → Performance to see GPU utilization spike.
What you should see
An image appears below the prompt input within 30 seconds to 5 minutes depending on hardware. The console logs per-step latency.
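
Since the console logs per-step latency, you can turn one logged number into a wait-time estimate. The sketch below assumes total time is roughly steps times per-step latency plus a fixed overhead for text encoding and image decoding. The 1.5 s/step figure is hypothetical, not a measurement from the demo.

```python
def estimate_seconds(per_step_s, steps, fixed_overhead_s=0.0):
    """Rough wall-clock estimate: fixed overhead plus one latency per denoising step."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return fixed_overhead_s + per_step_s * steps


per_step = 1.5  # hypothetical value; read the real one from the console
for steps in (10, 20, 50):
    print(f"{steps} steps: about {estimate_seconds(per_step, steps):.0f} s")
```

With that hypothetical latency, 20 steps lands at about 30 s and 50 steps at about 75 s, which is why dropping the step count is the cheapest speed-up for a first test.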
This might happen

Tab crashes or runs out of memory.

Close all other heavy tabs. Lower steps to 10 and image size to 256×256 in the UI. If the browser still kills the tab, your GPU may not have enough VRAM — try a discrete GPU machine.
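
Why does a smaller image help? SD-1.5 denoises a latent tensor with 4 channels at 1/8 of the image resolution, so halving each side cuts the latent to a quarter of its size. The sketch below shows only that arithmetic. It is not a VRAM estimate: the model weights take the same space at any resolution, and attention layers grow faster than linearly with latent size.

```python
def latent_shape(width, height, channels=4, downscale=8):
    """Latent tensor shape (channels, rows, cols) for an SD-1.5 style model."""
    if width % downscale or height % downscale:
        raise ValueError("width and height must be multiples of the downscale factor")
    return (channels, height // downscale, width // downscale)


def latent_elements(width, height):
    """Number of values in the latent tensor for one image."""
    c, h, w = latent_shape(width, height)
    return c * h * w


print(latent_shape(512, 512), latent_elements(512, 512))
print(latent_shape(256, 256), latent_elements(256, 256))
```

A 512×512 image uses a 4×64×64 latent (16,384 values); 256×256 uses 4×32×32 (4,096 values). Expect lower quality at 256×256, since SD-1.5 was trained at 512×512.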

05
Step 5 of 5

Tune prompts and try a different model

open-ended

The demo bundles a small set of compiled models in its dropdown. Quality and style vary, so try the 'realistic-vision' or 'anything-v3' options if your demo build ships them. For deeper prompt engineering, start from the prompt tricks you'd use in any SD UI, such as negative prompts and emphasis weights like `(word:1.2)`; the weight syntax comes from other SD front ends and only has an effect if this demo's prompt parser supports it. The advantage here is that everything runs locally: no rate limit, no API bill.

Terminal · mac
$ # Sample prompt patterns that work well in SD-1.5 derived models:
$ # 'cinematic photograph of <subject>, golden hour, 35mm film'
$ # '<subject>, watercolor, soft pastel, paper texture'
$ # 'studio portrait of <subject>, dramatic lighting, shallow depth of field'
$ # Negative prompt: 'blurry, low quality, distorted hands, extra fingers'
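
If you plan to try many subjects, the patterns above are easy to fill in with a few lines of Python and paste into the demo. This is only a convenience sketch: the dictionary keys and the `build_prompt` helper are made up for this example, and the pattern strings are copied from the list above.

```python
PATTERNS = {
    "cinematic": "cinematic photograph of <subject>, golden hour, 35mm film",
    "watercolor": "<subject>, watercolor, soft pastel, paper texture",
    "portrait": "studio portrait of <subject>, dramatic lighting, shallow depth of field",
}
NEGATIVE_PROMPT = "blurry, low quality, distorted hands, extra fingers"


def build_prompt(style, subject):
    """Fill the <subject> placeholder of the chosen pattern."""
    if style not in PATTERNS:
        raise KeyError(f"unknown style {style!r}; choose from {sorted(PATTERNS)}")
    return PATTERNS[style].replace("<subject>", subject)


print(build_prompt("watercolor", "a quiet harbor at sunrise"))
# prints: a quiet harbor at sunrise, watercolor, soft pastel, paper texture
```

Keeping the subject fixed while cycling through styles makes it easier to see what each pattern contributes.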
What you should see
Each new prompt produces a fresh image in roughly the same time as step 4. Quality is comparable to SD-1.5 on a Python install — it's the same model.
This might happen

Images look noisy or under-formed.

Bump steps from 20 back up to 40-50 and try a different sampler if the UI exposes one (DPM++ 2M Karras is a solid default for SD-1.5).

// Status

cooked. baked. worked.

A browser tab that takes a text prompt and produces a 512×512 image entirely on your local GPU, with no network roundtrip after the first model load.

// the honest bit

The honest part

Heads up: this guide was drafted from the MLC project docs, not from a CBW hands-on run. WebGPU is still maturing, and results differ between Chrome versions, between macOS, Windows, and Linux, and especially between integrated and discrete GPUs. The 60% 'worked first time' figure reflects that WebGPU compatibility lottery: when it works, it's magical; when your browser doesn't have a usable WebGPU adapter, no amount of prompting helps. If the browser route fails, a Python install (AUTOMATIC1111 or InvokeAI) is more reliable. For an offline mobile install, web-stable-diffusion also has a Node.js wrapper documented in the repo's README.