localm-web 0.1.0 → 0.2.0
- package/CHANGELOG.md +98 -0
- package/README.md +3 -3
- package/dist/assets/index-ChQoBCqA.js +23168 -0
- package/dist/assets/index-ChQoBCqA.js.map +1 -0
- package/dist/assets/inference.worker-CwvQtobb.js +330 -0
- package/dist/assets/inference.worker-CwvQtobb.js.map +1 -0
- package/dist/index.d.ts +374 -0
- package/dist/index.js +518 -3
- package/dist/index.js.map +1 -1
- package/package.json +2 -2
package/CHANGELOG.md
CHANGED

@@ -7,6 +7,104 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+### Added
+
+- `docs/getting-started.md` — end-to-end guide covering prerequisites,
+  install, first chat snippet, the curated model registry with download /
+  RAM estimates, how a model downloads and where it caches, running the
+  example Vite app, cold-start expectations, inspecting / clearing the
+  Cache API, offline behavior and troubleshooting.
+- README links to the new guide from the **Installation** and
+  **Vite usage** sections; the example app blurb now points at the
+  runnable folder instead of hedging with "once v0.1 lands".
+- `Completion` task for raw text continuation (no chat template, no history).
+  Exposes `predict()` returning a `CompletionResult` and `stream()` yielding a
+  `TokenChunk` async iterable. Mirrors the `Chat` task DX.
+- `CompletionResult` class in `src/results.ts` — holds the generated text,
+  the original prompt, tokens generated and finish reason.
+- `Engine.complete()` and `Engine.streamCompletion()` methods on the
+  runtime-agnostic engine contract. `WebLLMEngine` implements both via
+  `engine.completions.create()` (raw text mode, bypasses chat template).
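As a sketch of the `Completion` surface the entries above describe — the `CompletionResult` field names, the `fullText` helper, and the `TokenChunk` shape are guesses from the changelog wording, not the SDK source:

```typescript
// Hypothetical reconstruction of CompletionResult: the changelog says it
// holds the generated text, the original prompt, tokens generated and
// the finish reason. Everything else here is illustrative.
type FinishReason = "stop" | "length" | "abort";

class CompletionResult {
  constructor(
    public readonly text: string,           // generated continuation
    public readonly prompt: string,         // prompt that was continued
    public readonly tokensGenerated: number,
    public readonly finishReason: FinishReason,
  ) {}

  /** Prompt and continuation joined back into one string (assumed helper). */
  get fullText(): string {
    return this.prompt + this.text;
  }
}

// `stream()` is described as yielding a TokenChunk async iterable; this mock
// generator stands in for the real engine so the consumption pattern runs.
interface TokenChunk {
  token: string;
}

async function* mockStream(tokens: string[]): AsyncGenerator<TokenChunk> {
  for (const token of tokens) yield { token };
}

// Typical consumer loop: accumulate streamed tokens into the final text.
async function collect(stream: AsyncIterable<TokenChunk>): Promise<string> {
  let out = "";
  for await (const chunk of stream) out += chunk.token;
  return out;
}
```

The same `for await` loop would apply to the real `stream()`, since any `AsyncIterable` of chunks is consumed identically.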
+- `ModelLoadPhase` discriminated string type
+  (`"downloading" | "compiling" | "loading" | "ready"`) on `ModelLoadProgress`.
+  Lets consumers drive UI state machines (spinner → progress bar → ready
+  badge) without parsing the runtime's free-form status text.
+- `WebLLMEngine.load()` classifies each progress report via
+  `classifyLoadPhase()` and emits a final `phase: "ready"` event exactly
+  once when the load resolves successfully.
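To illustrate the idea behind `classifyLoadPhase()` and the UI state machine the changelog mentions — the string patterns below are assumptions for the sketch; the SDK's real classification heuristics are internal:

```typescript
// The ModelLoadPhase union as listed in the changelog.
type ModelLoadPhase = "downloading" | "compiling" | "loading" | "ready";

// Illustrative classifier mapping a runtime's free-form status text to a
// phase. The substrings matched here are guesses, not the SDK's rules.
function classifyPhaseFromText(text: string): ModelLoadPhase {
  const t = text.toLowerCase();
  if (t.includes("fetch") || t.includes("download") || t.includes("cache")) {
    return "downloading";
  }
  if (t.includes("compil") || t.includes("shader")) return "compiling";
  return "loading"; // anything else: weights moving into memory
}

// Consumer-side state machine keyed on phase, as the entry suggests
// (spinner → progress bar → ready badge).
function uiStateFor(phase: ModelLoadPhase): "spinner" | "progress" | "badge" {
  switch (phase) {
    case "downloading":
      return "progress"; // byte-level progress is meaningful while fetching
    case "compiling":
    case "loading":
      return "spinner"; // duration unpredictable: indeterminate UI
    case "ready":
      return "badge";
  }
}
```

The point of the discriminated string type is exactly this: an exhaustive `switch` the compiler checks, instead of fragile substring matching in consumer code.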
+- `WorkerEngine` — `Engine` implementation that proxies all calls to a Web
+  Worker via a typed RPC protocol. Lets consumers run inference off the UI
+  thread.
+- `createInferenceWorker()` helper that spawns a module-type Worker pointing
+  at the SDK's bundled worker entry. Exposed for advanced lifecycle
+  scenarios (pooling, custom termination); most consumers never call it
+  directly.
+- `LMTaskCreateOptions.inWorker` flag (default `false` in v0.2). When
+  `true`, the task instantiates a worker-backed engine instead of running
+  inference on the main thread. Default flips to `true` in v0.3 once the
+  Cache API / OPFS integration validates worker-thread storage access.
+- `src/worker/protocol.ts` — discriminated-union message contract between
+  main thread and worker (`load`, `generate`, `stream`, `complete`,
+  `stream-completion`, `abort`, `unload`, `isLoaded` requests; `loaded`,
+  `progress`, `generated`, `token`, `stream-end`, `error`, `unloaded`,
+  `is-loaded` responses). Numeric op ids isolate concurrent operations.
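A trimmed-down sketch of how numeric op ids isolate concurrent operations in a discriminated-union protocol like the one described above. Only two message kinds are shown and the field names are assumptions; `src/worker/protocol.ts` is the real contract:

```typescript
// Minimal request/response unions discriminated on `kind`, each tagged
// with an opId so overlapping operations do not collide.
type WorkerRequest =
  | { kind: "generate"; opId: number; prompt: string }
  | { kind: "abort"; opId: number };

type WorkerResponse =
  | { kind: "generated"; opId: number; text: string }
  | { kind: "error"; opId: number; message: string };

// Omit distributed over a union, so callers pass requests without opId.
type WithoutOpId<T> = T extends { opId: number } ? Omit<T, "opId"> : never;

// The main-thread side: assign an opId per request, remember the pending
// promise, and resolve it when a response with the same opId arrives —
// regardless of arrival order.
class PendingOps {
  private next = 0;
  private waiting = new Map<number, (r: WorkerResponse) => void>();

  send(
    req: WithoutOpId<WorkerRequest>,
    post: (r: WorkerRequest) => void,
  ): Promise<WorkerResponse> {
    const opId = this.next++;
    post({ ...req, opId } as WorkerRequest);
    return new Promise((resolve) => this.waiting.set(opId, resolve));
  }

  dispatch(res: WorkerResponse): void {
    this.waiting.get(res.opId)?.(res);
    this.waiting.delete(res.opId); // one response settles one operation
  }
}
```

Because correlation is by opId rather than by order, responses may arrive out of order (streams interleaved with completions) and still settle the right promise.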
+- `WorkerLike` interface exported for tests and custom integrations that
+  need to inject a transport (mocks, Comlink wrappers, MessagePort
+  bridges).
+- 11 new unit tests in `test/worker-engine.test.ts` exercising load with
+  progress, generate round-trip, abort propagation, signal stripping,
+  streaming queue, error mapping, unload short-circuit, terminate, and
+  concurrent-load rejection.
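The kind of injectable transport `WorkerLike` enables might look like this loopback mock — the interface shape here is modeled on the DOM `Worker` API and is an assumption; the SDK's exported `WorkerLike` may differ:

```typescript
// Assumed transport surface: just enough of the DOM Worker API to proxy
// messages. Tests can implement this without spawning a real Web Worker.
interface WorkerLike {
  postMessage(msg: unknown): void;
  onmessage: ((ev: { data: unknown }) => void) | null;
  terminate(): void;
}

// A loopback mock: every posted message is echoed back asynchronously,
// which is enough to unit-test request/response plumbing.
function makeEchoWorker(): WorkerLike {
  const w: WorkerLike = {
    onmessage: null,
    postMessage(msg: unknown) {
      // Answer on a microtask, like a real worker answers asynchronously.
      queueMicrotask(() => w.onmessage?.({ data: { echoed: msg } }));
    },
    terminate() {
      w.onmessage = null; // no more deliveries after termination
    },
  };
  return w;
}
```

A `MessagePort` or a Comlink wrapper can satisfy the same three members, which is presumably why the interface is exported instead of hard-coding `Worker`.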
+- `ModelCache` class in `src/cache/model-cache.ts` — inspect and manage
+  cached model weights from a consuming app:
+  - `has(modelId)` / `delete(modelId)` wrap WebLLM's `hasModelInCache` /
+    `deleteModelInCache`, validating the friendly id against the
+    registry first.
+  - `list()` iterates `MODEL_PRESETS` and returns the cached subset as
+    `CachedModelEntry[]` with friendly id, backend id, family, params.
+    Empty list when nothing is cached (per the project's
+    `*NotFoundError`-free convention).
+  - `clear()` deletes every registry model in parallel — useful for
+    logout / reset flows.
+  - `estimateUsage()` wraps `navigator.storage.estimate()` and returns
+    `{ usage, quota }`. Falls back to zeros when the API is missing.
+- `ModelCache.assertKnown(modelId)` static guard that throws
+  `UnknownModelError` for ids outside the registry.
+- Public types: `CachedModelEntry`, `CacheUsage`, `ModelCacheOptions`
+  re-exported from `src/index.ts`.
+- Dependency-injectable backend (`hasModel`, `deleteModel`, `estimate`
+  hooks) so unit tests can mock the runtime + browser APIs without
+  touching the real Cache API or `@mlc-ai/web-llm`.
+- 15 unit tests in `test/model-cache.test.ts` covering `has` / `delete`
+  / `list` / `clear` / `estimateUsage` and `assertKnown`, including
+  navigator fallbacks via `vi.stubGlobal`.
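The injectable `estimate` hook and the zeros fallback described above can be sketched like this — the hook signature and `CacheUsage` field names are taken from the changelog; anything beyond them is an assumption, not the real `ModelCacheOptions` shape:

```typescript
// Shape the changelog gives for estimateUsage(): { usage, quota }.
interface CacheUsage {
  usage: number;
  quota: number;
}

// Injectable hook standing in for navigator.storage.estimate(), whose
// result fields are both optional per the StorageManager spec.
type EstimateHook = () => Promise<{ usage?: number; quota?: number } | undefined>;

async function estimateUsage(estimate?: EstimateHook): Promise<CacheUsage> {
  // API missing (old browser, stubbed-out navigator): fall back to zeros
  // rather than throwing, matching the behavior described above.
  if (!estimate) return { usage: 0, quota: 0 };
  const res = await estimate();
  return { usage: res?.usage ?? 0, quota: res?.quota ?? 0 };
}
```

In the browser the hook would wrap `navigator.storage.estimate()`; in unit tests a stub is injected instead, which is the seam the `vi.stubGlobal` fallback tests exercise through the real class.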
+
+### Changed
+
+- `ProgressCallback` payload shape gained a required `phase` field. This is
+  technically a breaking change but the SDK is pre-1.0 and the type is
+  emitted only by the engine — consumers were already supposed to treat
+  the payload as opaque.
+- `vite.config.ts` adds `worker.format = "es"` and externalizes ORT-Web /
+  HF deps from the worker bundle. `@mlc-ai/web-llm` is intentionally
+  bundled into the worker chunk because workers cannot resolve bare
+  specifiers at runtime — this trades a larger lazy-loaded chunk
+  (~6.5 MB pre-gzip, only fetched when `inWorker: true`) for a clean DX
+  (no consumer-side worker config). The main `dist/index.js` stays at
+  ~16 kB and webllm remains a peer dep there.
+- `engines.node` bumped from `>=18.0.0` to `>=20.19.0`. Vite 7's worker
+  bundler depends on `crypto.hash()`, which landed in Node 20.12; Node 18
+  also reached end-of-life on 2025-04-30 per the Node release schedule.
+- CI matrix dropped Node 18, kept 20 + 22.
+
+### Notes
+
+- `ModelCache` is **inspection + management only**. Actual weight
+  download still flows through WebLLM's internal Cache-API path.
+  OPFS-as-primary-storage and resume-on-interrupted-download (also in
+  the v0.2 roadmap) require intercepting the WebLLM downloader and
+  are deferred to v0.3 to avoid forking upstream.
+
 ## [0.1.0] - 2026-05-10

 ### Added
package/README.md
CHANGED
|
@@ -161,19 +161,19 @@ The shape mirrors `ort-vision-sdk-web`: `await Class.create(model)` then `predic
|
|
|
161
161
|
|
|
162
162
|
## Installation
|
|
163
163
|
|
|
164
|
-
> Not yet published. Once v0.1 ships:
|
|
165
|
-
|
|
166
164
|
```bash
|
|
167
165
|
npm install localm-web @mlc-ai/web-llm
|
|
168
166
|
```
|
|
169
167
|
|
|
170
168
|
`@mlc-ai/web-llm` is a peer dependency — the consumer pins the version, which keeps the SDK lightweight and avoids version conflicts.
|
|
171
169
|
|
|
170
|
+
For a step-by-step walkthrough covering install, model selection, downloading weights, running the example app and troubleshooting, see **[docs/getting-started.md](./docs/getting-started.md)**.
|
|
171
|
+
|
|
172
172
|
## Vite usage
|
|
173
173
|
|
|
174
174
|
The package is designed to drop into a Vite app with no extra config. The Web Worker is bundled via Vite's native worker support; just import the SDK and use it.
|
|
175
175
|
|
|
176
|
-
A
|
|
176
|
+
A runnable example lives under [`examples/vite-chat/`](./examples/vite-chat/) — `cd` into it, `npm install`, `npm run dev`, open the browser, pick a model, send a prompt. The full guide in [`docs/getting-started.md`](./docs/getting-started.md#run-the-example-app) walks through it.
|
|
177
177
|
|
|
178
178
|
## Why not server-side?
|
|
179
179
|
|