A live-sentiment-analysis demo for a fictional café's experimental menu. Customer reviews stream in by the hundred, and a transformer model classifies each one as positive, neutral, or negative — entirely in the browser, on the device's NPU, GPU, or CPU, with no cloud round-trips.
Built on Windows ML via WebNN and ONNX Runtime Web.
- A real transformer (RoBERTa, fine-tuned for sentiment) running locally in the browser.
- The same JavaScript code routed to NPU / GPU / CPU by changing a single line.
- Live latency, throughput, and a rolling sentiment chart so you can see the hardware acceleration.
- No API calls. No server inference. No data leaving the device.
- Tokenize: text is converted to BPE token IDs with the model's tokenizer.
- Tensorize: tokens are packed into fixed `[1, 512]` `int32` buffers with an attention mask for padding.
- Inference: `ort.InferenceSession.run()` dispatches to the WebNN execution provider on the configured device.
- Decode: logits → softmax → argmax to pick the predicted class.
- Render: the message lands in the feed with its sentiment badge and inference latency.
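The Tensorize and Decode steps can be sketched in plain JavaScript. This is a minimal illustration, not the code in `js/sentimentAnalyzer.js`: the function names `tensorize`, `softmax`, and `decode` are hypothetical, the pad ID of 1 assumes the standard RoBERTa vocabulary, and the label order assumes the published configuration of the `cardiffnlp` model. Verify both against `model/vocab.json` and the model card.

```javascript
// Fixed sequence length, matching the [1, 512] int32 buffers described above.
const MAX_LEN = 512;
// Assumption: <pad> has ID 1 in the standard RoBERTa vocabulary.
const PAD_ID = 1;
// Assumption: class index order used by the sentiment model.
const LABELS = ['negative', 'neutral', 'positive'];

// Pack token IDs into fixed-length buffers plus an attention mask
// (1 = real token, 0 = padding). Longer inputs are naively truncated,
// which drops any trailing end-of-sequence token.
function tensorize(tokenIds, maxLen = MAX_LEN, padId = PAD_ID) {
  const inputIds = new Int32Array(maxLen).fill(padId);
  const attentionMask = new Int32Array(maxLen); // zero-initialized
  const n = Math.min(tokenIds.length, maxLen);
  for (let i = 0; i < n; i++) {
    inputIds[i] = tokenIds[i];
    attentionMask[i] = 1;
  }
  return { inputIds, attentionMask };
}

// Numerically stable softmax: subtract the max logit before exponentiating.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = Array.from(logits, (x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Argmax over the softmax probabilities picks the predicted class.
function decode(logits, labels = LABELS) {
  const probs = softmax(logits);
  let best = 0;
  for (let i = 1; i < probs.length; i++) {
    if (probs[i] > probs[best]) best = i;
  }
  return { label: labels[best], score: probs[best], probs };
}
```

In the browser, each buffer would typically be wrapped as `new ort.Tensor('int32', inputIds, [1, 512])` before being passed to `session.run()`. The feed names (commonly `input_ids` and `attention_mask`) depend on how the model was exported.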
The whole inference path lives in js/sentimentAnalyzer.js, in about 40 lines.
Open js/sentimentAnalyzer.js and change one constant:
```javascript
// ▶️ Device the model runs on. Change to 'cpu' or 'gpu' to compare silicon.
const DEVICE_TYPE = 'npu';
```

Reload the page. The badge in the bottom right reflects the active device, and the latency stat will jump (or drop) accordingly.
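One plausible way for that constant to reach ONNX Runtime Web is through the WebNN execution provider's `deviceType` option. The sketch below is an assumption about the wiring, not a copy of the project's code, and `buildSessionOptions` is a hypothetical helper name.

```javascript
// Same constant as above; 'cpu', 'gpu', and 'npu' are the values this README mentions.
const DEVICE_TYPE = 'npu';

// Hypothetical helper: build the options object passed to
// ort.InferenceSession.create() so the WebNN EP targets the chosen device.
function buildSessionOptions(deviceType = DEVICE_TYPE) {
  const allowed = ['cpu', 'gpu', 'npu'];
  if (!allowed.includes(deviceType)) {
    throw new Error(`Unsupported device type: ${deviceType}`);
  }
  return {
    executionProviders: [{ name: 'webnn', deviceType }],
  };
}

// In the browser, with onnxruntime-web loaded as `ort`:
//   const session = await ort.InferenceSession.create(
//     'model/model_opt.onnx',
//     buildSessionOptions()
//   );
```

Keeping the device in a single options object means the rest of the pipeline does not need to know which hardware is active.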
```shell
cd Sentiment-Analysis-WinML-WebNN
npm install
npm start
```

Then open http://localhost:8888.
The model files (`model/model_opt.onnx`, `model/model_opt.onnx.data`, `model/vocab.json`, `model/merges.txt`) are gitignored due to size (~478 MB). Place them in the `model/` folder before running.
- A browser with WebNN enabled — see Enabling WebNN flags below.
- Optional but recommended: a Copilot+ PC with an NPU for the most dramatic latency numbers.
WebNN is currently behind experimental flags. In Edge or Chrome, open edge://flags (or chrome://flags), search for WebNN, and enable the following three flags:
| Flag | Setting |
|---|---|
| Enables WebNN API (`#web-machine-learning-neural-network`) | Enabled |
| Enables experimental WebNN API features (`#experimental-web-machine-learning-neural-network`) | Enabled |
| ONNX Runtime backend for WebNN (`#webnn-onnxruntime`) | Enabled |
Restart the browser after enabling.
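To confirm the flags took effect, a quick feature check can be run from the DevTools console. This is a minimal sketch: `hasWebNN` is a hypothetical helper, and it only tests that the WebNN entry point (`navigator.ml.createContext`) is exposed, not that a particular device is usable.

```javascript
// Returns true when the WebNN API entry point is exposed on the given
// navigator-like object (defaults to the page's own navigator, if any).
function hasWebNN(nav = globalThis.navigator) {
  return Boolean(nav && nav.ml && typeof nav.ml.createContext === 'function');
}

// In the browser console:
//   hasWebNN();  // true once the flags are enabled and the browser restarted
```

If this returns `false` after a restart, recheck that all three flags are set to Enabled.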
- `onnxruntime-web`: ONNX Runtime in the browser, with the WebNN execution provider.
- Windows ML / WebNN: hardware-accelerated ML on Windows.
- Vanilla JavaScript / HTML / CSS: no frontend framework.
- Model: `cardiffnlp/twitter-roberta-base-sentiment-latest`, optimized to ONNX.
Sentiment-Analysis-WinML-WebNN/
├── index.html # App shell + EP badge
├── server.mjs # Tiny static file server
├── package.json
├── css/styles.css
├── js/
│ ├── app.js # UI, stream control, stats
│ ├── sentimentAnalyzer.js # WebNN session + analyze() pipeline
│ ├── tokenizer.js # BPE tokenizer
│ ├── messageGenerator.js # Simulated café reviews
│ └── sentimentChart.js # Live trend chart
└── model/ # ONNX model + tokenizer files (gitignored)
- aka.ms/winml — Windows ML overview, samples, and docs.
- WebNN API — W3C draft spec.
- ONNX Runtime Web — JS bindings and EP options.
Released under the MIT License.

