Add web-rwkv-wasm crate for npm package distribution #62
Merged
Introduce a new `crates/web-rwkv-wasm` crate exposing the runtime to JavaScript/WebAssembly, including session, cache, sampler, loader, ops, and visual bindings, plus build scripts and a GitHub Actions workflow to publish the npm package.
- cache.rs: add SAFETY comment on the as_token_slice repr(transparent) cast
- session.rs: replace user-input assert_eq! panics in softmax/back/checkout with descriptive JsError returns; propagate adapter-request failures as errors instead of .expect() panics
- visual.rs: return JsError on head-size mismatch and image-write failure
- add trailing newlines to LICENSE, mul_exp.wgsl, mul_w.wgsl
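The cache.rs and session.rs items can be illustrated with a minimal, std-only Rust sketch. The names `Token` and `check_len`, the `u16` token width, and the body of `as_token_slice` are assumptions for illustration, not the crate's actual definitions. A plain `String` stands in for `JsError` so the block compiles without `wasm-bindgen`.

```rust
/// Hypothetical token newtype. `repr(transparent)` guarantees the same
/// layout as the wrapped `u16` (the width is an assumption).
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token(pub u16);

/// Reinterprets a `&[u16]` as a `&[Token]` without copying.
pub fn as_token_slice(raw: &[u16]) -> &[Token] {
    // SAFETY: `Token` is `repr(transparent)` over `u16`, so both types have
    // identical size, alignment, and valid bit patterns. The pointer and
    // length come from a live slice, and the returned borrow shares its
    // lifetime.
    unsafe { std::slice::from_raw_parts(raw.as_ptr().cast::<Token>(), raw.len()) }
}

/// Validates caller-supplied input and returns an error instead of panicking.
/// In the wasm crate the error type would be `JsError`; `String` is a stand-in.
pub fn check_len(name: &str, actual: usize, expected: usize) -> Result<(), String> {
    if actual != expected {
        return Err(format!("{name}: expected {expected} elements, got {actual}"));
    }
    Ok(())
}
```

A caller would propagate the result with `?`, so a wrong-sized input from JavaScript surfaces as a catchable error rather than a panic that aborts the wasm module.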
Closes #60
Summary
Adds a new workspace crate, `crates/web-rwkv-wasm`, that exposes `web-rwkv`'s WebGPU RWKV inference to JavaScript via `wasm-bindgen`, packaged for npm. This lets a web app run RWKV inference locally in the browser with a plain `pnpm add @cryscan/web-rwkv-wasm` instead of setting up a Rust/wasm-pack toolchain. The binding surface is upstreamed from the official demo `web-rwkv-puzzles`: same API, only the packaging differs (`--target web` ESM instead of the demo's `--target no-modules` global).
What's included
- `crates/web-rwkv-wasm` (`cdylib` + `rlib`), depending on the parent `web-rwkv` crate with `default-features = false, features = ["web"]` (drops `tokio` and `subgroup-ops`, neither of which ports to browser WebGPU):
  - `session.rs`: `Session` with reader/prefab loading, `run`/`softmax`, state get/load, and a built-in prefix cache (`checkout`/`cache`)
  - `loader.rs`: `Tensor`/`TensorReader` (implements web-rwkv's `Reader`)
  - `sampler.rs`: `SimpleSampler` (argmax) and `NucleusSampler`
  - `cache.rs`, `ops.rs`, `visual.rs` (state heatmaps as base64 PNGs), and the `mul_exp.wgsl`/`mul_w.wgsl` shaders
  - `README.md` (API overview + Web Worker usage), `LICENSE`, `build.bash`/`build.cmd`
- `Cargo.toml`: introduces a `[workspace]` section registering `web-rwkv-derive` and `web-rwkv-wasm` as members.
- `.github/workflows/wasm-npm.yml`: builds the ESM package on PRs and on pushes to `main` that touch the crate (this is the CI verification); publishes to npm only via a manual `workflow_dispatch` with `publish=true`. The npm scope is parameterized (defaults to `@cryscan/web-rwkv-wasm`).
Requirements
- WebGPU support (`navigator.gpu`). No SharedArrayBuffer / threads, so no cross-origin isolation (COOP/COEP) needed.
- Model weights as `safetensors` (RWKV v4/v5/v6/v7) or a CBOR prefab, plus a tokenizer vocab JSON.
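As background for the `SimpleSampler` (argmax) and `NucleusSampler` bindings listed under What's included, the two strategies can be sketched in std-only Rust. This is an illustrative sketch, not the code in `sampler.rs`: the function names `argmax` and `nucleus`, their signatures, and the caller-supplied random number `u` are assumptions.

```rust
/// Greedy sampling: index of the largest probability (first one on ties).
pub fn argmax(probs: &[f32]) -> Option<usize> {
    let mut best: Option<(usize, f32)> = None;
    for (i, &p) in probs.iter().enumerate() {
        if best.map_or(true, |(_, b)| p > b) {
            best = Some((i, p));
        }
    }
    best.map(|(i, _)| i)
}

/// Nucleus (top-p) sampling over a probability vector.
/// `u` is a caller-supplied uniform random number in [0, 1), which keeps
/// the sketch deterministic and free of RNG dependencies.
pub fn nucleus(probs: &[f32], top_p: f32, u: f32) -> Option<usize> {
    // Sort token indices by descending probability.
    let mut order: Vec<usize> = (0..probs.len()).collect();
    order.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));

    // Keep the smallest prefix whose cumulative mass reaches `top_p`.
    let mut kept = 0;
    let mut mass = 0.0f32;
    for &i in &order {
        kept += 1;
        mass += probs[i];
        if mass >= top_p {
            break;
        }
    }

    // Draw from the kept tokens, renormalized by their total mass.
    let mut threshold = u * mass;
    for &i in &order[..kept] {
        threshold -= probs[i];
        if threshold < 0.0 {
            return Some(i);
        }
    }
    // Fall back to the last kept token if rounding left a remainder.
    order[..kept].last().copied()
}
```

With probabilities `[0.1, 0.6, 0.3]` and `top_p = 0.8`, only tokens 1 and 2 are kept, so token 0 can never be drawn. Both functions return `None` for an empty input.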
Verification
`cargo check -p web-rwkv-wasm --target wasm32-unknown-unknown` compiles cleanly (no errors/warnings).
Notes for reviewers / follow-ups
- Publishing requires an `NPM_TOKEN` repo secret and ownership of the target npm scope. The CI build job verifies on every PR regardless, so merge does not depend on the publish setup.
- `--target web` (ESM) is chosen over the demo's `--target no-modules` so the package works with bundlers and module workers; consumers must `await init()` before use.
- `wasm-opt = ['-Os']` is set for the release profile to trim module size; inference runs on the GPU, so CPU-side wasm speed is not the bottleneck.