@simulatte/doppler 0.1.1 → 0.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2)
  1. package/README.md +19 -20
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -10,7 +10,7 @@ Inference and training on raw WebGPU. Pure JS + WGSL.
  npm install @simulatte/doppler
  ```

- ## Quick Start
+ ## Quick start

  ```js
  import { doppler } from '@simulatte/doppler';
@@ -26,30 +26,14 @@ Tokens stream from a native `AsyncGenerator`. See [more examples](#more-examples

  ## Why Doppler

- **JS → WGSL → WebGPU.** One hop to the GPU. No ONNX runtime, no WASM blob, no bridge layer.
+ **JS → WGSL → WebGPU.** Direct JavaScript orchestration into native WebGPU kernels, avoiding ONNX runtimes, WASM blobs, and bridge layers.

- **`for await` streaming.** Not callbacks. Not a `TextStreamer` class. A loop.
+ **`for await` streaming.** Generation uses a native `AsyncGenerator` that fits normal app control flow.

  **LoRA hot-swap.** Swap adapters at runtime without reloading the base model.

  **Independent model instances.** Run multiple models concurrently. Each owns its pipeline, buffers, and KV cache.

- ## Under the Hood
-
- - Sharded weight loading via OPFS. Gigabytes into VRAM without blocking the main thread.
- - Quantized inference: Q4K, Q8, F16. Real models on consumer GPUs.
- - Kernel hot-swap between prefill and decode paths.
- - Config-driven runtime. Presets, kernel path selection, and sampling are policy, not code.
- - Reproducible benchmarks with deterministic knobs and auditable kernel traces.
-
- ## Browser Support
-
- - Chrome / Edge 113+ (WebGPU required)
- - Firefox (behind flag, WebGPU support varies)
- - Safari (WebGPU support in progress)
-
- ---
-
  ## Evidence

  ![Phase-latency comparison on one workload across models](benchmarks/vendors/results/compare_1b_multi-workload_favorable_phases.svg)
@@ -58,7 +42,15 @@ Snapshot artifacts:
  - [g3-1b-p064-d064-t0-k1.compare.json](benchmarks/vendors/fixtures/g3-1b-p064-d064-t0-k1.compare.json)
  - [lfm2-5-1-2b-p064-d064-t0-k1.compare.json](benchmarks/vendors/fixtures/lfm2-5-1-2b-p064-d064-t0-k1.compare.json)

- ## More Examples
+ ## Under the hood
+
+ - Sharded weight loading via OPFS moves multi-GB weights into VRAM without blocking the main thread.
+ - Quantized inference paths (Q4K, Q8, F16) support practical model sizes on consumer GPUs.
+ - Kernel hot-swap between prefill and decode paths.
+ - Config-driven runtime keeps presets, kernel-path selection, and sampling explicit.
+ - Reproducible benchmarks expose deterministic knobs and auditable kernel traces.
+
+ ## More examples

  ```js
  // Non-streaming
@@ -85,6 +77,13 @@ for await (const token of doppler('Hello', { model: 'gemma-3-1b' })) {
  - Runtime config contract: [docs/config.md](docs/config.md)
  - Architecture: [docs/architecture.md](docs/architecture.md)

+ ## Environment requirements
+
+ - WebGPU-capable browser runtime is required.
+ - Chrome / Edge 113+ supported.
+ - Firefox support varies (typically behind a flag).
+ - Safari support is evolving.
+
  ## License

  Apache License 2.0 (`Apache-2.0`). See [LICENSE](LICENSE) and [NOTICE](NOTICE).
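
For orientation, the snippet below sketches the usage pattern the updated README describes: `doppler(prompt, { model })` returning a native `AsyncGenerator` that is consumed with `for await`. Only the import, call signature, and the `gemma-3-1b` model id are taken from the diff above; the `navigator.gpu` guard and the token-accumulation variant are illustrative assumptions, and the package may expose its own non-streaming helper instead.

```js
import { doppler } from '@simulatte/doppler';

// The README's environment requirements call for a WebGPU-capable browser
// (Chrome / Edge 113+). This guard is an illustrative assumption, not part
// of the published README.
if (!navigator.gpu) {
  throw new Error('WebGPU is not available in this browser.');
}

// Streaming: tokens arrive through an ordinary `for await` loop over the
// AsyncGenerator returned by doppler().
for await (const token of doppler('Hello', { model: 'gemma-3-1b' })) {
  console.log(token);
}

// "Non-streaming" by accumulation: drain the same generator into a string.
// (The package may provide a dedicated non-streaming call; this is a stand-in.)
let text = '';
for await (const token of doppler('Hello again', { model: 'gemma-3-1b' })) {
  text += token;
}
console.log(text);
```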
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@simulatte/doppler",
-   "version": "0.1.1",
+   "version": "0.1.2",
    "description": "Browser-native WebGPU inference engine for local intent and inference loops",
    "main": "src/index.js",
    "types": "src/index.d.ts",