localm-web 0.1.0 → 0.2.0

package/CHANGELOG.md CHANGED
@@ -7,6 +7,104 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ## [Unreleased]
 
+ ### Added
+
+ - `docs/getting-started.md` — end-to-end guide covering prerequisites,
+ install, first chat snippet, the curated model registry with download /
+ RAM estimates, how a model downloads and where it caches, running the
+ example Vite app, cold-start expectations, inspecting / clearing the
+ Cache API, offline behavior and troubleshooting.
+ - README links to the new guide from the **Installation** and
+ **Vite usage** sections; the example app blurb now points at the
+ runnable folder instead of hedging with "once v0.1 lands".
+ - `Completion` task for raw text continuation (no chat template, no history).
+ Exposes `predict()` returning a `CompletionResult` and `stream()` returning
+ an async iterable of `TokenChunk` values. Mirrors the `Chat` task DX.
+ - `CompletionResult` class in `src/results.ts` — holds the generated text,
+ the original prompt, the number of tokens generated, and the finish reason.
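A usage sketch: per the README the shape mirrors `await Class.create(model)` then `predict(...)`, but the model id and the `text` / `finishReason` field names here are assumptions, not confirmed API.

```ts
import { Completion } from "localm-web";

// Raw continuation: no chat template, no history applied to the prompt.
const completion = await Completion.create("llama-3.2-1b-instruct"); // id illustrative

// One-shot: predict() resolves with the full CompletionResult.
const result = await completion.predict("The capital of France is");
console.log(result.text, result.finishReason); // field names assumed

// Streaming: stream() yields TokenChunks as the runtime emits them.
let out = "";
for await (const chunk of completion.stream("Once upon a time")) {
  out += chunk.text; // assuming TokenChunk carries a `text` field
}
```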
+ - `Engine.complete()` and `Engine.streamCompletion()` methods on the
+ runtime-agnostic engine contract. `WebLLMEngine` implements both via
+ `engine.completions.create()` (raw text mode, bypasses chat template).
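A plausible shape for the contract additions; the method names come from the entry above, while the `AbortSignal` parameter and the exported type names are assumptions:

```ts
import type { CompletionResult, TokenChunk } from "localm-web"; // exports assumed

// Sketch of the two members added to the runtime-agnostic engine contract.
interface EngineCompletionSurface {
  complete(prompt: string, signal?: AbortSignal): Promise<CompletionResult>;
  streamCompletion(prompt: string, signal?: AbortSignal): AsyncIterable<TokenChunk>;
}
```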
+ - `ModelLoadPhase` discriminated string type
+ (`"downloading" | "compiling" | "loading" | "ready"`) on `ModelLoadProgress`.
+ Lets consumers drive UI state machines (spinner → progress bar → ready
+ badge) without parsing the runtime's free-form status text.
+ - `WebLLMEngine.load()` classifies each progress report via
+ `classifyLoadPhase()` and emits a final `phase: "ready"` event exactly
+ once when the load resolves successfully.
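A sketch of the state machine this enables; only the `phase` values come from the type above, and the UI hook is hypothetical:

```ts
import type { ModelLoadProgress } from "localm-web";

// Hypothetical UI hook: anything that can render the three visual states.
declare function setStatus(state: "spinner" | "progress" | "ready", label?: string): void;

function onProgress(report: ModelLoadProgress) {
  switch (report.phase) {
    case "downloading": setStatus("progress", "Downloading weights"); break;
    case "compiling":   setStatus("spinner", "Compiling");            break;
    case "loading":     setStatus("spinner", "Loading into memory");  break;
    case "ready":       setStatus("ready");                           break; // fires exactly once
  }
}
```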
+ - `WorkerEngine` — `Engine` implementation that proxies all calls to a Web
+ Worker via a typed RPC protocol. Lets consumers run inference off the UI
+ thread.
+ - `createInferenceWorker()` helper that spawns a module-type Worker pointing
+ at the SDK's bundled worker entry. Exposed for advanced lifecycle
+ scenarios (pooling, custom termination); most consumers never call it
+ directly.
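A sketch of that advanced path, reusing `onProgress` from the phase sketch above; the `WorkerEngine` constructor and `load()` signatures are assumptions:

```ts
import { createInferenceWorker, WorkerEngine } from "localm-web";

// Own the Worker yourself when you need pooling or explicit termination.
const worker = createInferenceWorker();
const engine = new WorkerEngine(worker); // constructor shape assumed

await engine.load("llama-3.2-1b-instruct", onProgress); // signature assumed
// ... run generate / complete calls off the UI thread via the Engine contract ...
await engine.unload();
worker.terminate(); // explicit teardown once this worker is retired
```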
+ - `LMTaskCreateOptions.inWorker` flag (default `false` in v0.2). When
+ `true`, the task instantiates a worker-backed engine instead of running
+ inference on the main thread. Default flips to `true` in v0.3 once the
+ Cache API / OPFS integration validates worker-thread storage access.
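For most consumers the flag is the whole story; a minimal sketch, assuming the `Chat` task accepts `LMTaskCreateOptions` at `create()`:

```ts
import { Chat } from "localm-web";

// The task spawns and owns a worker-backed engine internally.
const chat = await Chat.create("llama-3.2-1b-instruct", { inWorker: true });
const reply = await chat.predict("Summarize the Cache API in one sentence.");
```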
+ - `src/worker/protocol.ts` — discriminated-union message contract between
+ main thread and worker (`load`, `generate`, `stream`, `complete`,
+ `stream-completion`, `abort`, `unload`, `isLoaded` requests; `loaded`,
+ `progress`, `generated`, `token`, `stream-end`, `error`, `unloaded`,
+ `is-loaded` responses). Numeric op ids isolate concurrent operations.
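An illustrative subset of that contract; the message names are from the list above, while the `opId` placement and payload fields are assumptions:

```ts
// Requests: each carries a numeric op id so concurrent operations stay isolated.
type WorkerRequest =
  | { type: "load"; opId: number; modelId: string }
  | { type: "generate"; opId: number; prompt: string }
  | { type: "abort"; opId: number }
  | { type: "unload"; opId: number };

// Responses: correlated back to the originating request via the same op id.
type WorkerResponse =
  | { type: "progress"; opId: number; report: unknown }
  | { type: "token"; opId: number; text: string }
  | { type: "stream-end"; opId: number }
  | { type: "error"; opId: number; message: string };
```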
+ - `WorkerLike` interface exported for tests and custom integrations that
+ need to inject a transport (mocks, Comlink wrappers, MessagePort
+ bridges).
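The interface presumably narrows the DOM `Worker` surface to the transport the RPC layer needs; the member set in this sketch is an assumption:

```ts
// Just enough surface for postMessage-based RPC plus teardown, so a mock,
// Comlink wrapper, or MessagePort bridge can stand in for a real Worker.
interface WorkerLike {
  postMessage(message: unknown): void;
  addEventListener(type: "message" | "error", listener: (event: Event) => void): void;
  removeEventListener(type: "message" | "error", listener: (event: Event) => void): void;
  terminate(): void;
}
```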
+ - 11 new unit tests in `test/worker-engine.test.ts` exercising load with
+ progress, generate round-trip, abort propagation, signal stripping,
+ streaming queue, error mapping, unload short-circuit, terminate, and
+ concurrent-load rejection.
+ - `ModelCache` class in `src/cache/model-cache.ts` — inspect and manage
+ cached model weights from a consuming app:
+   - `has(modelId)` / `delete(modelId)` wrap WebLLM's `hasModelInCache` /
+     `deleteModelInCache`, validating the friendly id against the
+     registry first.
+   - `list()` iterates `MODEL_PRESETS` and returns the cached subset as
+     `CachedModelEntry[]` with friendly id, backend id, family, params.
+     Returns an empty list when nothing is cached (per the project's
+     `*NotFoundError`-free convention).
+   - `clear()` deletes every registry model in parallel — useful for
+     logout / reset flows.
+   - `estimateUsage()` wraps `navigator.storage.estimate()` and returns
+     `{ usage, quota }`. Falls back to zeros when the API is missing.
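A usage sketch tying the methods together; the zero-argument constructor and the model id are assumptions:

```ts
import { ModelCache } from "localm-web";

const cache = new ModelCache(); // options bag assumed optional

if (await cache.has("llama-3.2-1b-instruct")) {
  console.table(await cache.list()); // CachedModelEntry[] for the cached subset
}

const { usage, quota } = await cache.estimateUsage(); // zeros when the API is missing
console.log(`${(usage / 1e6).toFixed(1)} MB of ${(quota / 1e6).toFixed(1)} MB used`);

await cache.clear(); // logout / reset: deletes every registry model in parallel
```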
+ - `ModelCache.assertKnown(modelId)` static guard that throws
+ `UnknownModelError` for ids outside the registry.
+ - Public types: `CachedModelEntry`, `CacheUsage`, `ModelCacheOptions`
+ re-exported from `src/index.ts`.
+ - Dependency-injectable backend (`hasModel`, `deleteModel`, `estimate`
+ hooks) so unit tests can mock the runtime + browser APIs without
+ touching the real Cache API or `@mlc-ai/web-llm`.
+ - 15 unit tests in `test/model-cache.test.ts` covering `has` / `delete`
+ / `list` / `clear` / `estimateUsage` and `assertKnown`, including
+ navigator fallbacks via `vi.stubGlobal`.
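A sketch of how the injectable backend keeps those tests off the real Cache API; the hook names come from the entry above, but wiring them through the constructor options is an assumption:

```ts
import { expect, it, vi } from "vitest";
import { ModelCache } from "localm-web";

it("answers has() from the injected backend, not the real Cache API", async () => {
  const cache = new ModelCache({
    hasModel: vi.fn().mockResolvedValue(true),         // stands in for hasModelInCache
    deleteModel: vi.fn().mockResolvedValue(undefined), // stands in for deleteModelInCache
    estimate: vi.fn().mockResolvedValue({ usage: 0, quota: 0 }),
  });
  await expect(cache.has("llama-3.2-1b-instruct")).resolves.toBe(true);
});
```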
+
+ ### Changed
+
+ - `ProgressCallback` payload shape gained a required `phase` field. This is
+ technically a breaking change, but the SDK is pre-1.0 and the type is
+ emitted only by the engine — consumers were already supposed to treat
+ the payload as opaque.
+ - `vite.config.ts` adds `worker.format = "es"` and externalizes ORT-Web /
+ HF deps from the worker bundle. `@mlc-ai/web-llm` is intentionally
+ bundled into the worker chunk because workers cannot resolve bare
+ specifiers at runtime — this trades a larger lazy-loaded chunk
+ (~6.5 MB pre-gzip, only fetched when `inWorker: true`) for a clean DX
+ (no consumer-side worker config). The main `dist/index.js` stays at
+ ~16 kB and webllm remains a peer dep there.
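A sketch of the relevant fragment; `worker.format` is from the entry above, while the externalized package names are illustrative:

```ts
// vite.config.ts (SDK build): a sketch, not the shipped config.
import { defineConfig } from "vite";

export default defineConfig({
  worker: {
    format: "es", // module workers, so the bundled worker entry can use imports
    rollupOptions: {
      // Keep heavyweight deps out of the worker chunk. @mlc-ai/web-llm stays
      // bundled: workers cannot resolve bare specifiers at runtime.
      external: ["onnxruntime-web", "@huggingface/transformers"],
    },
  },
});
```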
+ - `engines.node` bumped from `>=18.0.0` to `>=20.19.0`. Vite 7's worker
+ bundler depends on `crypto.hash()`, which is missing from Node 18 (it
+ landed in Node 20.12); Node 18 also reached end-of-life on 2025-04-30
+ per the Node release schedule.
+ - CI matrix dropped Node 18, kept 20 + 22.
+
+ ### Notes
+
+ - `ModelCache` is **inspection + management only**. Actual weight
+ download still flows through WebLLM's internal Cache-API path.
+ OPFS-as-primary-storage and resume-on-interrupted-download (also in
+ the v0.2 roadmap) require intercepting the WebLLM downloader and
+ are deferred to v0.3 to avoid forking upstream.
+
  ## [0.1.0] - 2026-05-10
 
  ### Added
package/README.md CHANGED
@@ -161,19 +161,19 @@ The shape mirrors `ort-vision-sdk-web`: `await Class.create(model)` then `predic
 
  ## Installation
 
- > Not yet published. Once v0.1 ships:
-
  ```bash
  npm install localm-web @mlc-ai/web-llm
  ```
 
  `@mlc-ai/web-llm` is a peer dependency — the consumer pins the version, which keeps the SDK lightweight and avoids version conflicts.
 
+ For a step-by-step walkthrough covering install, model selection, downloading weights, running the example app and troubleshooting, see **[docs/getting-started.md](./docs/getting-started.md)**.
+
 
  ## Vite usage
 
  The package is designed to drop into a Vite app with no extra config. The Web Worker is bundled via Vite's native worker support; just import the SDK and use it.
 
- A complete example will live under `examples/vite-chat/` once v0.1 lands.
+ A runnable example lives under [`examples/vite-chat/`](./examples/vite-chat/): `cd` into it, run `npm install`, then `npm run dev`, open the browser, pick a model, and send a prompt. The full guide in [`docs/getting-started.md`](./docs/getting-started.md#run-the-example-app) walks through it.
 
  ## Why not server-side?