@sauravpanda/flare 0.1.0

package/demo/README.md ADDED
# Flare Browser Demo

A minimal HTML page that loads the WASM build and runs inference entirely in the browser.

## Quick start

```bash
# 1. Build the WASM package
wasm-pack build flare-web --target web

# 2. Serve the demo (any HTTP server works)
cd flare-web
python3 -m http.server 8000

# 3. Open http://localhost:8000/demo/
```

## What it does

1. Loads the `flare_web` WASM module
2. Detects WebGPU availability
3. Lets you upload a GGUF file via `<input type="file">`
4. Calls `FlareEngine.load(bytes)` to parse and load the model
5. Calls `engine.generate_tokens(prompt, 30)` to generate from BOS

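The steps above can be sketched in JavaScript. This is a hypothetical wiring, not the demo's actual source: the module path `./pkg/flare_web.js` assumes wasm-pack's default output directory, and `hasWebGPU` / `runDemo` are illustrative names — only `FlareEngine.load` and `generate_tokens` come from the demo itself.

```javascript
// Step 2: feature-detect WebGPU. Kept pure (navigator passed in) so it
// can also be exercised outside a browser.
function hasWebGPU(nav = globalThis.navigator) {
  return Boolean(nav && nav.gpu);
}

// Steps 1 and 3-5: hypothetical end-to-end handler for the uploaded file.
async function runDemo(file, prompt) {
  // Load and initialize the wasm-pack bundle (path is an assumption).
  const { default: init, FlareEngine } = await import("./pkg/flare_web.js");
  await init();

  const bytes = new Uint8Array(await file.arrayBuffer());
  const engine = FlareEngine.load(bytes);    // parse + load the GGUF model
  return engine.generate_tokens(prompt, 30); // raw token IDs
}
```

In the page, `runDemo` would typically hang off the file input's `change` event, e.g. `input.onchange = (e) => runDemo(e.target.files[0], "Hello")`.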
## Limitations

- The demo shows raw token IDs (no BPE detokenization in JS yet)
- Loads the entire model into memory at once (no progressive loading yet)
- CPU only (WebGPU compute path not yet wired through wasm-bindgen)

## Try it with SmolLM2-135M Q8_0

```bash
# Download from HuggingFace (~138MB)
curl -L "https://huggingface.co/bartowski/SmolLM2-135M-Instruct-GGUF/resolve/main/SmolLM2-135M-Instruct-Q8_0.gguf" \
  -o SmolLM2-135M-Instruct-Q8_0.gguf
```

Then drag the file into the demo page.
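Before handing the upload to `FlareEngine.load`, the page could cheaply verify that the file really is GGUF: the format begins with the 4-byte ASCII magic `GGUF`. A minimal sketch — the `isGGUF` helper is illustrative, not part of the package:

```javascript
// Every GGUF file starts with the ASCII bytes "GGUF" (0x47 0x47 0x55 0x46),
// followed by a little-endian uint32 format version.
function isGGUF(bytes) {
  const magic = [0x47, 0x47, 0x55, 0x46]; // "G", "G", "U", "F"
  return bytes.length >= 4 && magic.every((b, i) => bytes[i] === b);
}
```

Rejecting non-GGUF uploads up front gives a clearer error than letting the parser fail partway through a 138MB read.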