nuxt-edge-ai 0.1.3 → 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,349 +1,509 @@
1
- # nuxt-edge-ai
2
-
3
- [![npm version](https://img.shields.io/npm/v/nuxt-edge-ai/latest.svg)](https://www.npmjs.com/package/nuxt-edge-ai)
4
- [![npm downloads](https://img.shields.io/npm/dm/nuxt-edge-ai.svg)](https://www.npmjs.com/package/nuxt-edge-ai)
5
- [![license](https://img.shields.io/npm/l/nuxt-edge-ai.svg)](./LICENSE)
6
- [![nuxt](https://img.shields.io/badge/Nuxt-4.x-00DC82?logo=nuxt.js&logoColor=white)](https://nuxt.com/)
7
- [![ci](https://github.com/otadk/nuxt-edge-ai/actions/workflows/ci.yml/badge.svg)](https://github.com/otadk/nuxt-edge-ai/actions/workflows/ci.yml)
8
-
9
- `nuxt-edge-ai` is a Nuxt module for building local-first AI applications with a real server-side WASM inference runtime and an optional remote API fallback.
10
-
11
- It ships:
12
-
13
- - a Nuxt module install surface
14
- - Nitro API routes for health, model pull, and generation
15
- - a client composable for app-side usage
16
- - an `EdgeAI` SDK with an OpenAI-like `chat.completions.create()` surface
17
- - switchable `local`, `remote`, and `mock` providers behind one module API
18
- - a vendored `transformers.js` + `onnxruntime-web` runtime inside the package
19
- - no Ollama, no `llama.cpp`, no Rust/C++/native runtime dependency for consumers
20
-
21
- The model weights are not bundled. Users either point the module at a local model directory or allow it to download and cache the model on first run.
22
-
23
- ## Features
24
-
25
- - Nuxt module install surface designed for app integration
26
- - Nitro endpoints for health, pull, and generate workflows
27
- - local-first server-side inference with bundled WASM runtime assets
28
- - optional OpenAI-compatible remote provider for stronger hosted models
29
- - OpenAI-compatible `chat/completions` endpoint for SDK-style integration
30
- - published package includes vendored inference runtime files
31
- - no consumer requirement for Ollama, Rust, C++, Python, or native AI runtimes
32
-
33
- ## Why this exists
34
-
35
- The goal is to make `nuxt-edge-ai` a credible, publishable Nuxt module:
36
-
37
- - installable in a regular Nuxt app
38
- - able to run a real local model
39
- - packaged as JS/TS + WASM only
40
-
41
- ## Current runtime
42
-
43
- Current local runtime path:
44
-
45
- - `transformers.js` web build
46
- - `onnxruntime-web` WASM backend
47
- - server-side execution through Nitro
48
-
49
- Built-in local preset:
50
-
51
- - `distilgpt2`
52
-
53
- The local path is intentionally conservative now. When local inference is not enough, the module can fall back to a remote OpenAI-compatible API.
54
-
55
- ## Install
56
-
57
- ```bash
58
- pnpm add nuxt-edge-ai
59
- ```
60
-
61
- ```ts
62
- // nuxt.config.ts
63
- export default defineNuxtConfig({
64
- modules: ['nuxt-edge-ai'],
65
- edgeAI: {
66
- provider: 'local',
67
- cacheDir: './.cache/nuxt-edge-ai',
68
- preset: 'distilgpt2',
69
- remote: {
70
- enabled: true,
71
- fallback: true,
72
- baseUrl: 'https://api.openai.com/v1',
73
- apiKey: process.env.OPENAI_API_KEY,
74
- model: 'gpt-4o-mini',
75
- },
76
- },
77
- })
78
- ```
79
-
80
- ```vue
81
- <script setup lang="ts">
82
- const edgeAI = useEdgeAI()
83
-
84
- await edgeAI.pull()
85
-
86
- const result = await edgeAI.generate({
87
- prompt: 'Write a pitch for a local-first Nuxt AI module.',
88
- })
89
- </script>
90
- ```
91
-
92
- ## Configuration
93
-
94
- Top-level module options:
95
-
96
- | Option | Type | Default | Notes |
97
- | --- | --- | --- | --- |
98
- | `routeBase` | `string` | `/api/edge-ai` | Base path for module endpoints |
99
- | `provider` | `'local' \| 'remote' \| 'mock'` | `local` | Runtime backend selector |
100
- | `runtime` | `'transformers-wasm' \| 'mock'` | legacy | Backward-compatible alias for older configs |
101
- | `cacheDir` | `string` | `./.cache/nuxt-edge-ai` | Cache and model asset directory |
102
- | `warmup` | `boolean` | `false` | Warm the runtime on health checks |
103
- | `preset` | `string` | `distilgpt2` | Local model preset |
104
- | `presets` | `Record<string, ...>` | `undefined` | Register additional local presets |
105
- | `model` | `object` | see below | Override the local model preset |
106
- | `remote` | `object` | see below | Remote provider and fallback settings |
107
-
108
- Local model options:
109
-
110
- | Option | Type | Default | Notes |
111
- | --- | --- | --- | --- |
112
- | `id` | `string` | `Xenova/distilgpt2` | Model identifier used when no local path is set |
113
- | `task` | `'text-generation'` | `text-generation` | Current supported task |
114
- | `localPath` | `string \| undefined` | `undefined` | Local model directory |
115
- | `allowRemote` | `boolean` | `true` | Allow first-run download from remote model source |
116
- | `dtype` | `string \| undefined` | `q8` | Runtime dtype passed to Transformers.js |
117
- | `generation.maxNewTokens` | `number` | `96` | Max generated tokens |
118
- | `generation.temperature` | `number` | `0.7` | Sampling temperature |
119
- | `generation.topP` | `number` | `0.9` | Top-p sampling |
120
- | `generation.doSample` | `boolean` | `true` | Enable sampling |
121
- | `generation.repetitionPenalty` | `number` | `1.05` | Repetition penalty |
122
-
123
- Remote provider options:
124
-
125
- | Option | Type | Default | Notes |
126
- | --- | --- | --- | --- |
127
- | `enabled` | `boolean` | `false` | Enable remote provider settings |
128
- | `fallback` | `boolean` | `true` | Fall back to remote if local pull/generate fails |
129
- | `baseUrl` | `string` | `https://api.openai.com/v1` | Remote API base URL |
130
- | `path` | `string` | `/chat/completions` | OpenAI-compatible endpoint path |
131
- | `model` | `string` | `gpt-4o-mini` | Default remote model ID |
132
- | `apiKey` | `string \| undefined` | `undefined` | Inline API key |
133
- | `headers` | `Record<string, string> \| undefined` | `undefined` | Extra request headers |
134
- | `systemPrompt` | `string \| undefined` | `undefined` | Optional system instruction |
135
-
136
- ## Provider examples
137
-
138
- Local-only mode:
139
-
140
- ```ts
141
- export default defineNuxtConfig({
142
- modules: ['nuxt-edge-ai'],
143
- edgeAI: {
144
- provider: 'local',
145
- preset: 'distilgpt2',
146
- remote: {
147
- enabled: false,
148
- },
149
- },
150
- })
151
- ```
152
-
153
- Local with automatic remote fallback:
154
-
155
- ```ts
156
- export default defineNuxtConfig({
157
- modules: ['nuxt-edge-ai'],
158
- edgeAI: {
159
- provider: 'local',
160
- preset: 'distilgpt2',
161
- remote: {
162
- enabled: true,
163
- fallback: true,
164
- baseUrl: 'https://api.openai.com/v1',
165
- apiKey: process.env.OPENAI_API_KEY,
166
- model: 'gpt-4o-mini',
167
- },
168
- },
169
- })
170
- ```
171
-
172
- Custom preset registration:
173
-
174
- ```ts
175
- export default defineNuxtConfig({
176
- modules: ['nuxt-edge-ai'],
177
- edgeAI: {
178
- presets: {
179
- 'team-default': {
180
- label: 'Team Default',
181
- description: 'Project-specific local preset',
182
- model: {
183
- id: 'Xenova/distilgpt2',
184
- dtype: 'q8',
185
- generation: {
186
- maxNewTokens: 120,
187
- },
188
- },
189
- },
190
- },
191
- preset: 'team-default',
192
- },
193
- })
194
- ```
195
-
196
- ## Consumer runtime guarantees
197
-
198
- Consumers do not need to install:
199
-
200
- - Ollama
201
- - Rust
202
- - C++
203
- - Python
204
- - `llama.cpp`
205
- - extra runtime npm packages beyond this module
206
-
207
- What consumers do need:
208
-
209
- - a Node/Nitro server runtime
210
- - a model path or permission to download a compatible model
211
-
212
- ## API surface
213
-
214
- - `GET /api/edge-ai/health`
215
- - `POST /api/edge-ai/pull`
216
- - `POST /api/edge-ai/generate`
217
- - `POST /api/edge-ai/chat/completions`
218
- - `useEdgeAI().health()`
219
- - `useEdgeAI().pull()`
220
- - `useEdgeAI().generate()`
221
- - `useEdgeAI().chatCompletions()`
222
-
223
- Health responses also expose:
224
-
225
- - `provider`
226
- - `presets`
227
- - `remoteFallback`
228
- - `engine.ready`
229
- - `engine.lastError`
230
-
231
- ## OpenAI-compatible chat completions
232
-
233
- You can either point the official OpenAI client at the module's Nitro route, or use the package's own `EdgeAI` client with the same calling style.
234
-
235
- Using `EdgeAI` directly:
236
-
237
- ```ts
238
- import { EdgeAI } from 'nuxt-edge-ai'
239
-
240
- const client = new EdgeAI({
241
- baseURL: 'http://localhost:3000/api/edge-ai',
242
- })
243
-
244
- const response = await client.chat.completions.create({
245
- model: 'openai/gpt-oss-20b:free',
246
- messages: [
247
- {
248
- role: 'user',
249
- content: "How many r's are in strawberry?",
250
- },
251
- ],
252
- reasoning: { enabled: true },
253
- })
254
- ```
255
-
256
- Using the OpenAI SDK against the same route:
257
-
258
- ```ts
259
- import OpenAI from 'openai'
260
-
261
- const client = new OpenAI({
262
- baseURL: 'http://localhost:3000/api/edge-ai',
263
- apiKey: 'local-dev-token',
264
- })
265
-
266
- const response = await client.chat.completions.create({
267
- model: 'openai/gpt-oss-20b:free',
268
- messages: [
269
- {
270
- role: 'user',
271
- content: "How many r's are in strawberry?",
272
- },
273
- ],
274
- reasoning: { enabled: true },
275
- })
276
- ```
277
-
278
- Inside a Nuxt app you can also use `useEdgeAI().client.chat.completions.create(...)`.
279
-
280
- When the module is using a remote OpenAI-compatible backend, it forwards `messages`, `reasoning`, and any extra `remoteBody` fields. If the upstream provider returns `reasoning_details`, the module preserves them on `choices[0].message`.
281
-
282
- Example OpenRouter-style config:
283
-
284
- ```ts
285
- export default defineNuxtConfig({
286
- modules: ['nuxt-edge-ai'],
287
- edgeAI: {
288
- provider: 'remote',
289
- remote: {
290
- enabled: true,
291
- baseUrl: 'https://openrouter.ai/api/v1',
292
- apiKey: process.env.OPENROUTER_API_KEY,
293
- model: 'openai/gpt-oss-20b:free',
294
- },
295
- },
296
- })
297
- ```
298
-
299
- ## Troubleshooting
300
-
301
- Common checks:
302
-
303
- - Run `POST /api/edge-ai/health` first to confirm route wiring and runtime config.
304
- - Use `provider: 'mock'` to separate module wiring issues from model/runtime issues.
305
- - Remote fallback requires `edgeAI.remote.enabled: true` plus `edgeAI.remote.apiKey`.
306
- - If `pull` fails, inspect server logs first. Most early failures are model-path or packaged-runtime issues.
307
- - After changing vendored runtime files, always run `pnpm prepack` before validating a published-style install.
308
-
309
- ## Local development
310
-
311
- ```bash
312
- pnpm install
313
- pnpm dev
314
- ```
315
-
316
- Useful commands:
317
-
318
- ```bash
319
- pnpm vendor:runtime
320
- pnpm lint
321
- pnpm test
322
- pnpm test:types
323
- pnpm prepack
324
- ```
325
-
326
- ## Docs
327
-
328
- See [`docs/index.md`](./docs/index.md) for the project docs tree.
329
-
330
- Key docs:
331
-
332
- - [`docs/getting-started.md`](./docs/getting-started.md)
333
- - [`docs/api.md`](./docs/api.md)
334
- - [`docs/models.md`](./docs/models.md)
335
- - [`docs/architecture.md`](./docs/architecture.md)
336
- - [`docs/third-party.md`](./docs/third-party.md)
337
-
338
- ## Repository shape
339
-
340
- - `src/module.ts`: module entry and runtime config wiring
341
- - `src/runtime/`: composables, plugin, and Nitro runtime code
342
- - `playground/`: interactive demo app
343
- - `test/fixtures/`: module consumer fixtures
344
- - `docs/`: module documentation
345
- - `scripts/vendor-runtime.mjs`: vendored runtime generation
346
-
347
- ## Status
348
-
349
- This is still an MVP, but it now supports three execution modes behind one API: `local`, `remote`, and `mock`.
1
+ # nuxt-edge-ai
2
+
3
+ [![npm version](https://img.shields.io/npm/v/nuxt-edge-ai/latest.svg)](https://www.npmjs.com/package/nuxt-edge-ai)
4
+ [![npm downloads](https://img.shields.io/npm/dm/nuxt-edge-ai.svg)](https://www.npmjs.com/package/nuxt-edge-ai)
5
+ [![license](https://img.shields.io/npm/l/nuxt-edge-ai.svg)](./LICENSE)
6
+ [![nuxt](https://img.shields.io/badge/Nuxt-4.x-00DC82?logo=nuxt.js&logoColor=white)](https://nuxt.com/)
7
+ [![ci](https://github.com/otadk/nuxt-edge-ai/actions/workflows/ci.yml/badge.svg)](https://github.com/otadk/nuxt-edge-ai/actions/workflows/ci.yml)
8
+ [![oosmetrics](https://api.oosmetrics.com/api/v1/badge/achievement/69a77845-965e-4d85-a153-e43023059704.svg)](https://oosmetrics.com/repo/otadk/nuxt-edge-ai)
9
+ [![oosmetrics](https://api.oosmetrics.com/api/v1/badge/achievement/5e00ff2f-b279-4ba3-a4de-4f53d9ca2c0c.svg)](https://oosmetrics.com/repo/otadk/nuxt-edge-ai)
10
+
11
+ `nuxt-edge-ai` is a Nuxt module for building local-first AI applications with a real server-side WASM inference runtime and an optional remote API fallback.
12
+
13
+ It ships:
14
+
15
+ - a Nuxt module install surface
16
+ - Nitro API routes for health, model pull, and generation
17
+ - a client composable for app-side usage
18
+ - an `EdgeAI` SDK with an OpenAI-like `chat.completions.create()` surface
19
+ - switchable `local`, `remote`, and `mock` providers behind one module API
20
+ - a vendored `transformers.js` + `onnxruntime-web` runtime inside the package
21
+ - no Ollama, no `llama.cpp`, no Rust/C++/native runtime dependency for consumers
22
+
23
+ The model weights are not bundled. Users either point the module at a local model directory or allow it to download and cache the model on first run.
24
+
25
+ ## Demo
26
+
27
+ - Minimal local-only StackBlitz demo: https://stackblitz.com/edit/nuxt-starter-t2mavk3t?file=package.json
28
+
29
+ The StackBlitz example keeps the setup intentionally small and uses the OpenAI-style `chat.completions.create()` call shape against the local provider only.
30
+
31
+ ## Features
32
+
33
+ - Nuxt module install surface designed for app integration
34
+ - Nitro endpoints for health, pull, and generate workflows
35
+ - local-first server-side inference with bundled WASM runtime assets
36
+ - optional OpenAI-compatible remote provider for stronger hosted models
37
+ - OpenAI-compatible `chat/completions` endpoint for SDK-style integration
38
+ - **streaming chat completions** with SSE (Server-Sent Events) for a real-time typewriter effect
39
+ - compatible with `@ai-sdk/vue`'s `useChat()` for seamless integration
40
+ - published package includes vendored inference runtime files
41
+ - no consumer requirement for Ollama, Rust, C++, Python, or native AI runtimes
42
+
43
+ ## Why this exists
44
+
45
+ The goal is to make `nuxt-edge-ai` a credible, publishable Nuxt module:
46
+
47
+ - installable in a regular Nuxt app
48
+ - able to run a real local model
49
+ - packaged as JS/TS + WASM only
50
+
51
+ ## Current runtime
52
+
53
+ Current local runtime path:
54
+
55
+ - `transformers.js` web build
56
+ - `onnxruntime-web` WASM backend
57
+ - server-side execution through Nitro
58
+
59
+ Built-in local preset:
60
+
61
+ - `distilgpt2`
62
+
63
+ The local path is intentionally conservative for now. When local inference is not enough, the module can fall back to a remote OpenAI-compatible API.
64
+
65
+ ## Support matrix
66
+
67
+ | Surface | Status | Notes |
68
+ | --- | --- | --- |
69
+ | Nuxt | Supported | `^4.4.0` and newer Nuxt 4 releases |
70
+ | Runtime | Supported | Node/Nitro server runtime |
71
+ | Local inference | Supported | Bundled Transformers.js + ONNX Runtime WASM |
72
+ | Remote inference | Supported | OpenAI-compatible `chat/completions` providers |
73
+ | Mock mode | Supported | Fixture tests, CI, and integration smoke checks |
74
+ | Streaming | **Supported** | SSE-based streaming with AI SDK-compatible protocol |
75
+ | Edge runtime workers | Not yet supported | The local WASM runtime currently assumes a Node server process |
76
+
77
+ ## Validation
78
+
79
+ This module is validated through:
80
+
81
+ - fixture-based Nuxt module tests in `test/`
82
+ - type checks for both the module and the playground app
83
+ - a local playground app in `playground/`
84
+ - a published-style external consumer smoke test before release
85
+
86
+ That keeps the package focused on the real consumer path: install the module, register it in `nuxt.config.ts`, and call the exposed Nitro routes or injected client.
87
+
88
+ ## Install
89
+
90
+ ```bash
91
+ pnpm add nuxt-edge-ai
92
+ ```
93
+
94
+ ```ts
95
+ // nuxt.config.ts
96
+ export default defineNuxtConfig({
97
+ modules: ['nuxt-edge-ai'],
98
+ edgeAI: {
99
+ provider: 'local',
100
+ cacheDir: './.cache/nuxt-edge-ai',
101
+ preset: 'distilgpt2',
102
+ remote: {
103
+ enabled: true,
104
+ fallback: true,
105
+ baseUrl: 'https://api.openai.com/v1',
106
+ apiKey: process.env.OPENAI_API_KEY,
107
+ model: 'gpt-4o-mini',
108
+ },
109
+ },
110
+ })
111
+ ```
112
+
113
+ ```vue
114
+ <script setup lang="ts">
115
+ const edgeAI = useEdgeAI()
116
+
117
+ await edgeAI.pull()
118
+
119
+ const completion = await edgeAI.client.chat.completions.create({
120
+ model: edgeAI.defaultModel,
121
+ messages: [
122
+ {
123
+ role: 'user',
124
+ content: 'Write a pitch for a local-first Nuxt AI module.',
125
+ },
126
+ ],
127
+ })
128
+
129
+ const text = String(completion.choices[0]?.message.content ?? '')
130
+ </script>
131
+ ```
132
+
133
+ If you prefer the lower-level route wrapper, `useEdgeAI().chatCompletions()` accepts the same OpenAI-style payload shape.
134
+
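+ For example, a minimal sketch of the route wrapper (the response shape is assumed to mirror OpenAI's `chat.completions`):
+
+ ```ts
+ // Hypothetical call through the route wrapper; same body shape as the SDK examples below.
+ const res = await useEdgeAI().chatCompletions({
+   messages: [{ role: 'user', content: 'Say hello in five words.' }],
+ })
+ const text = String(res.choices[0]?.message.content ?? '')
+ ```
+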
135
+ ## Configuration
136
+
137
+ Top-level module options:
138
+
139
+ | Option | Type | Default | Notes |
140
+ | --- | --- | --- | --- |
141
+ | `routeBase` | `string` | `/api/edge-ai` | Base path for module endpoints |
142
+ | `provider` | `'local' \| 'remote' \| 'mock'` | `local` | Runtime backend selector |
143
+ | `runtime` | `'transformers-wasm' \| 'mock'` | `undefined` | Legacy alias kept for older configs; prefer `provider` |
144
+ | `cacheDir` | `string` | `./.cache/nuxt-edge-ai` | Cache and model asset directory |
145
+ | `warmup` | `boolean` | `false` | Warm the runtime on health checks |
146
+ | `preset` | `string` | `distilgpt2` | Local model preset |
147
+ | `presets` | `Record<string, ...>` | `undefined` | Register additional local presets |
148
+ | `model` | `object` | see below | Override the local model preset |
149
+ | `remote` | `object` | see below | Remote provider and fallback settings |
150
+
151
+ Local model options:
152
+
153
+ | Option | Type | Default | Notes |
154
+ | --- | --- | --- | --- |
155
+ | `id` | `string` | `Xenova/distilgpt2` | Model identifier used when no local path is set |
156
+ | `task` | `'text-generation'` | `text-generation` | Current supported task |
157
+ | `localPath` | `string \| undefined` | `undefined` | Local model directory |
158
+ | `allowRemote` | `boolean` | `true` | Allow first-run download from remote model source |
159
+ | `dtype` | `string \| undefined` | `q8` | Runtime dtype passed to Transformers.js |
160
+ | `generation.maxNewTokens` | `number` | `96` | Max generated tokens |
161
+ | `generation.temperature` | `number` | `0.7` | Sampling temperature |
162
+ | `generation.topP` | `number` | `0.9` | Top-p sampling |
163
+ | `generation.doSample` | `boolean` | `true` | Enable sampling |
164
+ | `generation.repetitionPenalty` | `number` | `1.05` | Repetition penalty |
165
+
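+ A sketch of overriding the preset via `model`, using only the options in the table above (the `localPath` value is illustrative):
+
+ ```ts
+ export default defineNuxtConfig({
+   modules: ['nuxt-edge-ai'],
+   edgeAI: {
+     model: {
+       id: 'Xenova/distilgpt2',
+       localPath: './models/distilgpt2', // illustrative directory; weights are not shipped
+       allowRemote: false, // fail instead of downloading if the local path is wrong
+       generation: { maxNewTokens: 64, temperature: 0.5 },
+     },
+   },
+ })
+ ```
+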
166
+ Remote provider options:
167
+
168
+ | Option | Type | Default | Notes |
169
+ | --- | --- | --- | --- |
170
+ | `enabled` | `boolean` | `false` | Enable remote provider settings |
171
+ | `fallback` | `boolean` | `true` | Fall back to remote if local pull/generate fails |
172
+ | `baseUrl` | `string` | `https://api.openai.com/v1` | Remote API base URL |
173
+ | `path` | `string` | `/chat/completions` | OpenAI-compatible endpoint path |
174
+ | `model` | `string` | `gpt-4o-mini` | Default remote model ID |
175
+ | `apiKey` | `string \| undefined` | `undefined` | Inline API key |
176
+ | `headers` | `Record<string, string> \| undefined` | `undefined` | Extra request headers |
177
+ | `systemPrompt` | `string \| undefined` | `undefined` | Optional system instruction |
178
+
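+ A sketch combining `headers` and `systemPrompt` (the header name is illustrative):
+
+ ```ts
+ export default defineNuxtConfig({
+   modules: ['nuxt-edge-ai'],
+   edgeAI: {
+     provider: 'remote',
+     remote: {
+       enabled: true,
+       apiKey: process.env.OPENAI_API_KEY,
+       headers: { 'X-Team': 'platform' }, // extra headers sent with each upstream request
+       systemPrompt: 'Answer in one short paragraph.',
+     },
+   },
+ })
+ ```
+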
179
+ ## Provider examples
180
+
181
+ Local-only mode:
182
+
183
+ ```ts
184
+ export default defineNuxtConfig({
185
+ modules: ['nuxt-edge-ai'],
186
+ edgeAI: {
187
+ provider: 'local',
188
+ preset: 'distilgpt2',
189
+ remote: {
190
+ enabled: false,
191
+ },
192
+ },
193
+ })
194
+ ```
195
+
196
+ Local with automatic remote fallback:
197
+
198
+ ```ts
199
+ export default defineNuxtConfig({
200
+ modules: ['nuxt-edge-ai'],
201
+ edgeAI: {
202
+ provider: 'local',
203
+ preset: 'distilgpt2',
204
+ remote: {
205
+ enabled: true,
206
+ fallback: true,
207
+ baseUrl: 'https://api.openai.com/v1',
208
+ apiKey: process.env.OPENAI_API_KEY,
209
+ model: 'gpt-4o-mini',
210
+ },
211
+ },
212
+ })
213
+ ```
214
+
215
+ Custom preset registration:
216
+
217
+ ```ts
218
+ export default defineNuxtConfig({
219
+ modules: ['nuxt-edge-ai'],
220
+ edgeAI: {
221
+ presets: {
222
+ 'team-default': {
223
+ label: 'Team Default',
224
+ description: 'Project-specific local preset',
225
+ model: {
226
+ id: 'Xenova/distilgpt2',
227
+ dtype: 'q8',
228
+ generation: {
229
+ maxNewTokens: 120,
230
+ },
231
+ },
232
+ },
233
+ },
234
+ preset: 'team-default',
235
+ },
236
+ })
237
+ ```
238
+
239
+ ## Consumer runtime guarantees
240
+
241
+ Consumers do not need to install:
242
+
243
+ - Ollama
244
+ - Rust
245
+ - C++
246
+ - Python
247
+ - `llama.cpp`
248
+ - extra runtime npm packages beyond this module
249
+
250
+ What consumers do need:
251
+
252
+ - a Node/Nitro server runtime
253
+ - a model path or permission to download a compatible model
254
+
255
+ ## API surface
256
+
257
+ - `GET /api/edge-ai/health`
258
+ - `POST /api/edge-ai/pull`
259
+ - `POST /api/edge-ai/generate`
260
+ - `POST /api/edge-ai/chat/completions`
261
+ - `useEdgeAI().health()`
262
+ - `useEdgeAI().pull()`
263
+ - `useEdgeAI().chatCompletions()`
264
+ - `useEdgeAI().client.chat.completions.create()`
265
+
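+ The routes can also be called directly; a sketch using Nuxt's `$fetch`, with request bodies assumed to mirror the composable arguments:
+
+ ```ts
+ // Warm the model cache, then generate from a prompt.
+ await $fetch('/api/edge-ai/pull', { method: 'POST' })
+ const generated = await $fetch('/api/edge-ai/generate', {
+   method: 'POST',
+   body: { prompt: 'One-line summary of this module.' },
+ })
+ ```
+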
266
+ Health responses also expose:
267
+
268
+ - `provider`
269
+ - `presets`
270
+ - `remoteFallback`
271
+ - `engine.ready`
272
+ - `engine.lastError`
273
+
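+ A hypothetical payload built from those fields:
+
+ ```ts
+ const health = await $fetch('/api/edge-ai/health')
+ // e.g. { provider: 'local', presets: ['distilgpt2'], remoteFallback: true,
+ //        engine: { ready: true, lastError: null } }
+ ```
+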
274
+ ## OpenAI-compatible chat completions
275
+
276
+ You can either point the official OpenAI client at the module's Nitro route, or use the package's own `EdgeAI` client with the same calling style.
277
+
278
+ Using `EdgeAI` directly:
279
+
280
+ ```ts
281
+ import { EdgeAI } from 'nuxt-edge-ai'
282
+
283
+ const client = new EdgeAI({
284
+ baseURL: 'http://localhost:3000/api/edge-ai',
285
+ })
286
+
287
+ const response = await client.chat.completions.create({
288
+ model: 'openai/gpt-oss-20b:free',
289
+ messages: [
290
+ {
291
+ role: 'user',
292
+ content: "How many r's are in strawberry?",
293
+ },
294
+ ],
295
+ reasoning: { enabled: true },
296
+ })
297
+ ```
298
+
299
+ Using `useEdgeAI()` inside a Nuxt app with the same calling style:
300
+
301
+ ```ts
302
+ const edgeAI = useEdgeAI()
303
+
304
+ await edgeAI.pull()
305
+
306
+ const response = await edgeAI.client.chat.completions.create({
307
+ model: edgeAI.defaultModel,
308
+ messages: [
309
+ {
310
+ role: 'user',
311
+ content: 'Summarize the module in one sentence.',
312
+ },
313
+ ],
314
+ })
315
+ ```
316
+
317
+ Using the OpenAI SDK against the same route:
318
+
319
+ ```ts
320
+ import OpenAI from 'openai'
321
+
322
+ const client = new OpenAI({
323
+ baseURL: 'http://localhost:3000/api/edge-ai',
324
+ apiKey: 'local-dev-token',
325
+ })
326
+
327
+ const response = await client.chat.completions.create({
328
+ model: 'openai/gpt-oss-20b:free',
329
+ messages: [
330
+ {
331
+ role: 'user',
332
+ content: "How many r's are in strawberry?",
333
+ },
334
+ ],
335
+ reasoning: { enabled: true },
336
+ })
337
+ ```
338
+
339
+ If you want to call the route wrapper directly, `useEdgeAI().chatCompletions(...)` maps to the same `/chat/completions` endpoint.
340
+
341
+ ## Streaming chat completions
342
+
343
+ The module now supports real-time streaming responses with Server-Sent Events (SSE). This enables the typewriter effect that modern AI applications expect.
344
+
345
+ ### Using `useEdgeAI()` with streaming
346
+
347
+ ```vue
348
+ <script setup lang="ts">
349
+ const edgeAI = useEdgeAI()
350
+ const messages = ref<Array<{ role: 'user' | 'assistant', content: string }>>([])
351
+ const input = ref('')
352
+
353
+ async function handleSubmit() {
354
+ const text = input.value.trim()
355
+ if (!text) return
356
+
357
+ // Add user message
358
+ messages.value.push({ role: 'user', content: text })
359
+ input.value = ''
360
+
361
+ // Add placeholder for assistant response
362
+ messages.value.push({ role: 'assistant', content: '' })
363
+
364
+ // Stream the response
365
+ try {
366
+ for await (const token of edgeAI.streamChatCompletionsGenerator({
367
+ model: edgeAI.defaultModel,
368
+ messages: messages.value.slice(0, -1), // send the history without the empty assistant placeholder
369
+ stream: true,
370
+ })) {
371
+ // Update the last message with each token
372
+ const lastMessage = messages.value[messages.value.length - 1]
373
+ lastMessage.content += token
374
+ }
375
+ }
376
+ catch (error) {
377
+ console.error('Stream error:', error)
378
+ }
379
+ }
380
+
381
+ // Stop streaming if needed
382
+ function stop() {
383
+ edgeAI.stop()
384
+ }
385
+ </script>
386
+ ```
387
+
388
+ ### Using the `EdgeAI` client with streaming
389
+
390
+ ```ts
391
+ import { EdgeAI } from 'nuxt-edge-ai'
392
+
393
+ const client = new EdgeAI({
394
+ baseURL: 'http://localhost:3000/api/edge-ai',
395
+ })
396
+
397
+ // Stream with callbacks
398
+ await client.chat.completions.stream(
399
+ {
400
+ model: 'distilgpt2',
401
+ messages: [{ role: 'user', content: 'Hello!' }],
402
+ stream: true,
403
+ },
404
+ {
405
+ onToken: (token) => console.log(token),
406
+ onCompletion: (text) => console.log('Done:', text),
407
+ onError: (error) => console.error(error),
408
+ }
409
+ )
410
+
411
+ // Or use async generator
412
+ for await (const token of client.streamChatCompletionGenerator({
413
+ model: 'distilgpt2',
414
+ messages: [{ role: 'user', content: 'Hello!' }],
415
+ stream: true,
416
+ })) {
417
+ console.log(token)
418
+ }
419
+ ```
420
+
421
+ ### Compatible with `@ai-sdk/vue`
422
+
423
+ The streaming protocol is compatible with Vercel AI SDK's `useChat()` composable. You can use `@ai-sdk/vue` with this module:
424
+
425
+ ```ts
426
+ import { useChat } from '@ai-sdk/vue'
427
+
428
+ const { messages, input, handleSubmit, isLoading, stop } = useChat({
429
+ api: '/api/edge-ai/chat/completions',
430
+ })
431
+ ```
432
+
433
+ When the module is using a remote OpenAI-compatible backend, it forwards `messages`, `reasoning`, and any extra `remoteBody` fields. If the upstream provider returns `reasoning_details`, the module preserves them on `choices[0].message`.
434
+
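+ A sketch of that passthrough, assuming `remoteBody` is supplied on the request and forwarded verbatim:
+
+ ```ts
+ import { EdgeAI } from 'nuxt-edge-ai'
+
+ const client = new EdgeAI({ baseURL: 'http://localhost:3000/api/edge-ai' })
+
+ const res = await client.chat.completions.create({
+   model: 'openai/gpt-oss-20b:free',
+   messages: [{ role: 'user', content: 'Why is the sky blue?' }],
+   reasoning: { enabled: true },
+   remoteBody: { max_tokens: 256 }, // assumption: extra fields forwarded to the upstream provider
+ })
+
+ // Preserved on the message when the upstream returns them:
+ const details = (res.choices[0]?.message as any)?.reasoning_details
+ ```
+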
435
+ Example OpenRouter-style config:
436
+
437
+ ```ts
438
+ export default defineNuxtConfig({
439
+ modules: ['nuxt-edge-ai'],
440
+ edgeAI: {
441
+ provider: 'remote',
442
+ remote: {
443
+ enabled: true,
444
+ baseUrl: 'https://openrouter.ai/api/v1',
445
+ apiKey: process.env.OPENROUTER_API_KEY,
446
+ model: 'openai/gpt-oss-20b:free',
447
+ },
448
+ },
449
+ })
450
+ ```
451
+
452
+ ## Troubleshooting
453
+
454
+ Common checks:
455
+
456
+ - Run `GET /api/edge-ai/health` first to confirm route wiring and runtime config.
457
+ - Use `provider: 'mock'` to separate module wiring issues from model/runtime issues (see the sketch after this list).
458
+ - Remote fallback requires `edgeAI.remote.enabled: true` plus `edgeAI.remote.apiKey`.
459
+ - If `pull` fails, inspect server logs first. Most early failures are model-path or packaged-runtime issues.
460
+ - After changing vendored runtime files, always run `pnpm prepack` before validating a published-style install.
461
+
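+ The mock-mode sketch referenced above; the mock provider needs no model assets, so failures here point at wiring rather than the runtime:
+
+ ```ts
+ export default defineNuxtConfig({
+   modules: ['nuxt-edge-ai'],
+   edgeAI: {
+     provider: 'mock',
+   },
+ })
+ ```
+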
462
+ ## Known limitations
463
+
464
+ - The local provider currently targets `text-generation` only.
465
+ - The local WASM runtime is designed for a Node/Nitro server process, not edge-worker runtimes.
466
+ - Model quality and latency depend heavily on the selected preset or upstream remote model.
467
+ - Model weights are not bundled in the npm package; local-first usage still requires either a local model path or a first-run download.
468
+
469
+ ## Local development
470
+
471
+ ```bash
472
+ pnpm install
473
+ pnpm dev
474
+ ```
475
+
476
+ Useful commands:
477
+
478
+ ```bash
479
+ pnpm vendor:runtime
480
+ pnpm lint
481
+ pnpm test
482
+ pnpm test:types
483
+ pnpm prepack
484
+ ```
485
+
486
+ ## Docs
487
+
488
+ See [`docs/index.md`](./docs/index.md) for the project docs tree.
489
+
490
+ Key docs:
491
+
492
+ - [`docs/getting-started.md`](./docs/getting-started.md)
493
+ - [`docs/api.md`](./docs/api.md)
494
+ - [`docs/models.md`](./docs/models.md)
495
+ - [`docs/architecture.md`](./docs/architecture.md)
496
+ - [`docs/third-party.md`](./docs/third-party.md)
497
+
498
+ ## Repository shape
499
+
500
+ - `src/module.ts`: module entry and runtime config wiring
501
+ - `src/runtime/`: composables, plugin, and Nitro runtime code
502
+ - `playground/`: interactive demo app
503
+ - `test/fixtures/`: module consumer fixtures
504
+ - `docs/`: module documentation
505
+ - `scripts/vendor-runtime.mjs`: vendored runtime generation
506
+
507
+ ## Status
508
+
509
+ This module is ready for community consumption, but it is still intentionally scoped. The stable contract today is a Nuxt module that exposes one AI surface across three execution modes: `local`, `remote`, and `mock`.