@sauravpanda/flare 0.1.0

package/demo/README.md ADDED
# Flare Browser Demo

A minimal HTML page that loads the WASM build and runs inference entirely in the browser.

## Quick start

```bash
# 1. Build the WASM package
wasm-pack build flare-web --target web

# 2. Serve the demo (any HTTP server works)
cd flare-web
python3 -m http.server 8000

# 3. Open http://localhost:8000/demo/
```

## What it does

1. Loads the `flare_web` WASM module
2. Detects WebGPU availability
3. Lets you upload a GGUF file via `<input type="file">`
4. Calls `FlareEngine.load(bytes)` to parse and load the model
5. Calls `engine.generate_tokens(prompt, 30)` to generate from BOS

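The steps above can be sketched in JavaScript. This is a hypothetical wiring, not the demo's actual source: the module path `./pkg/flare_web.js` assumes wasm-pack's default output directory, and `hasWebGPU` / `runDemo` are illustrative names — only `FlareEngine.load` and `generate_tokens` come from the demo itself.

```javascript
// Step 2: feature-detect WebGPU. Kept pure (navigator passed in) so it
// can also be exercised outside a browser.
function hasWebGPU(nav = globalThis.navigator) {
  return Boolean(nav && nav.gpu);
}

// Steps 1 and 3-5: hypothetical end-to-end handler for the uploaded file.
async function runDemo(file, prompt) {
  // Load and initialize the wasm-pack bundle (path is an assumption).
  const { default: init, FlareEngine } = await import("./pkg/flare_web.js");
  await init();

  const bytes = new Uint8Array(await file.arrayBuffer());
  const engine = FlareEngine.load(bytes);    // parse + load the GGUF model
  return engine.generate_tokens(prompt, 30); // raw token IDs
}
```

In the page, `runDemo` would typically hang off the file input's `change` event, e.g. `input.onchange = (e) => runDemo(e.target.files[0], "Hello")`.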
## Limitations

- The demo shows raw token IDs (no BPE detokenization in JS yet)
- Loads the entire model into memory at once (no progressive loading yet)
- CPU only (WebGPU compute path not yet wired through wasm-bindgen)

## Try it with SmolLM2-135M Q8_0

```bash
# Download from HuggingFace (~138MB)
curl -L "https://huggingface.co/bartowski/SmolLM2-135M-Instruct-GGUF/resolve/main/SmolLM2-135M-Instruct-Q8_0.gguf" \
  -o SmolLM2-135M-Instruct-Q8_0.gguf
```

Then drag the file into the demo page.
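Before handing the upload to `FlareEngine.load`, the page could cheaply verify that the file really is GGUF: the format begins with the 4-byte ASCII magic `GGUF`. A minimal sketch — the `isGGUF` helper is illustrative, not part of the package:

```javascript
// Every GGUF file starts with the ASCII bytes "GGUF" (0x47 0x47 0x55 0x46),
// followed by a little-endian uint32 format version.
function isGGUF(bytes) {
  const magic = [0x47, 0x47, 0x55, 0x46]; // "G", "G", "U", "F"
  return bytes.length >= 4 && magic.every((b, i) => bytes[i] === b);
}
```

Rejecting non-GGUF uploads up front gives a clearer error than letting the parser fail partway through a 138MB read.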