oomi-ai 0.2.28 → 0.2.38

package/README.md CHANGED
@@ -1,19 +1,26 @@
1
- # oomi-ai
2
-
3
- OpenClaw channel plugin and bridge tooling for Oomi managed chat and voice.
4
-
5
- ## Current Focus
6
-
7
- `0.2.28` keeps the persona automation lane and adds a usable local managed-voice validation path:
8
- - WebSpatial-based persona scaffolding for generated Oomi apps
9
- - a high-level `oomi personas create-managed` command for agent-driven persona creation
10
- - device-authenticated persona runtime registration and job callbacks
11
- - automatic bridge-side polling for queued `persona_job` control messages
12
- - one shared spoken-metadata normalizer used by both the extension and the bridge
13
- - a repo-backed local `tts-pipeline` replay that can validate assistant-final -> backend -> real Qwen TTS before publishing
14
- - spoken-metadata handling that preserves natural pauses like `...` and keeps the managed voice contract valid on the real chat session path
15
-
16
- This package is for two audiences:
1
+ # oomi-ai
2
+
3
+ OpenClaw channel plugin and bridge tooling for Oomi managed chat and voice.
4
+
5
+ ## Current Focus
6
+
7
+ `0.2.38` keeps the persona automation lane, adds a stable local persona runtime manager, upgrades the Docker dev harness from a package simulator to a real OpenClaw runtime, and introduces a shared OpenClaw profile contract so local onboarding, Docker bootstrap, and future hosted agents use the same setup model:
8
+ - WebSpatial-based persona scaffolding for generated Oomi apps
9
+ - a high-level `oomi personas create-managed` command for agent-driven persona creation
10
+ - a stable `oomi personas launch-managed` flow for local persona hosting under `~/.openclaw/personas`
11
+ - a matching `oomi personas delete` flow that stops managed runtimes and removes the persona workspace from the OpenClaw machine
12
+ - shared OpenClaw path handling for isolated local or containerized dev roots
13
+ - versioned `oomi openclaw profile init|apply` commands for deterministic local/dev or hosted setup flows
14
+ - explicit model auth modes so onboarding can default to `oomi-managed` while internal testing can still opt into direct provider auth
15
+ - a repo-local `openclaw debug persona-runtime` smoke test for managed persona runtime launch/reuse/stop
16
+ - a Docker-based OpenClaw dev harness that runs a real `openclaw gateway` inside an isolated container
17
+ - device-authenticated persona runtime registration and job callbacks
18
+ - automatic bridge-side polling for queued `persona_job` control messages
19
+ - one shared spoken-metadata normalizer used by both the extension and the bridge
20
+ - a repo-backed local `tts-pipeline` replay that can validate assistant-final -> backend -> real Qwen TTS before publishing
21
+ - spoken-metadata handling that preserves natural pauses like `...` and keeps the managed voice contract valid on the real chat session path
22
+
23
+ This package is for two audiences:
17
24
  - OpenClaw operators who need to connect a machine to Oomi and keep chat or voice healthy
18
25
  - Developers evaluating the plugin on npm and deciding whether it matches their OpenClaw + Oomi setup
19
26
 
@@ -122,9 +129,9 @@ Optional fields:
122
129
  - `defaultSessionKey`
123
130
  - `requestTimeoutMs`
124
131
 
125
- ## Runtime Model
126
-
127
- There are two runtime contracts worth understanding.
132
+ ## Runtime Model
133
+
134
+ There are two runtime contracts worth understanding.
128
135
 
129
136
  ### Managed Text Chat
130
137
 
@@ -143,84 +150,189 @@ That bridge:
143
150
 
144
151
  This is the part of the package most likely to matter when debugging voice turn failures.
145
152
 
146
- For managed cloned-voice replies, the canonical contract is:
147
- - visible assistant `content` stays user-facing
148
- - hidden `metadata.spoken` carries the backend TTS payload
149
- - the shared helper in `lib/spokenMetadata.js` is used by both the extension and the local bridge to preserve or normalize that sidecar before it reaches the backend
150
-
151
- The backend cloned-voice path is intentionally strict. If `metadata.spoken` does not reach Oomi, backend TTS fails instead of speaking a flat fallback voice.
152
-
153
- ## Local TTS Validation
154
-
155
- If you are developing this package inside the Oomi repo, you can now validate the managed voice path locally before publishing.
156
-
157
- This local gate does three things:
158
- - replays an assistant `chat.final` frame through the same spoken-metadata normalization path used by the OpenClaw extension and the bridge
159
- - feeds that normalized frame into the Rails backend replay harness
160
- - optionally calls the real Qwen cloned-voice provider and confirms that audio deltas come back
161
-
162
- Important:
163
- - this is a repo developer workflow, not a generic npm-only operator command
164
- - it expects the Oomi repo checkout, the Rails backend, and local provider env vars
165
- - the real-provider replay can auto-enroll a disposable default sample voice profile from `assets/voice/source/nemu-enrollment-sample.mp3`
166
-
167
- Assistant-final contract only:
168
-
169
- ```bash
170
- oomi openclaw debug assistant-final --text "Hey Justin! How is the testing going?" --json
171
- ```
172
-
173
- Full local backend replay:
174
-
175
- ```bash
176
- oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --json
153
+ For managed cloned-voice replies, the canonical contract is:
154
+ - visible assistant `content` stays user-facing
155
+ - hidden `metadata.spoken` carries the backend TTS payload
156
+ - the shared helper in `lib/spokenMetadata.js` is used by both the extension and the local bridge to preserve or normalize that sidecar before it reaches the backend
157
+
158
+ The backend cloned-voice path is intentionally strict. If `metadata.spoken` does not reach Oomi, backend TTS fails instead of speaking a flat fallback voice.
159
+
160
+ ## Bridge Logging
161
+
162
+ The bridge is intentionally quiet by default in production so normal deploys do not spam logs with frame-level transport noise.
163
+
164
+ To enable verbose bridge tracing temporarily, set:
165
+
166
+ ```bash
167
+ OOMI_BRIDGE_DEBUG=1
168
+ ```
169
+
170
+ With that flag enabled, the bridge will emit low-level session, frame, and spoken-metadata debug logs again.
171
+
172
+ ## Local TTS Validation
173
+
174
+ If you are developing this package inside the Oomi repo, you can now validate the managed voice path locally before publishing.
175
+
176
+ This local gate does three things:
177
+ - replays an assistant `chat.final` frame through the same spoken-metadata normalization path used by the OpenClaw extension and the bridge
178
+ - feeds that normalized frame into the Rails backend replay harness
179
+ - optionally calls the real Qwen cloned-voice provider and confirms that audio deltas come back
180
+
181
+ Important:
182
+ - this is a repo developer workflow, not a generic npm-only operator command
183
+ - it expects the Oomi repo checkout, the Rails backend, and local provider env vars
184
+ - the real-provider replay can auto-enroll a disposable default sample voice profile from `assets/voice/source/nemu-enrollment-sample.mp3`
185
+
186
+ Assistant-final contract only:
187
+
188
+ ```bash
189
+ oomi openclaw debug assistant-final --text "Hey Justin! How is the testing going?" --json
190
+ ```
191
+
192
+ Full local backend replay:
193
+
194
+ ```bash
195
+ oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --json
196
+ ```
197
+
198
+ Real Qwen provider replay:
199
+
200
+ ```bash
201
+ oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
202
+ ```
203
+
204
+ What a good result looks like:
205
+ - `backend.success = true`
206
+ - `managed.assistantSpeechFinal.present = true`
207
+ - `qwen.errorCode = null`
208
+ - `qwen.audioDeltaCount > 0` when `--live-provider` is used
209
+
210
+ This is the preferred pre-publish gate for managed voice regressions, because it is much faster than publishing to npm and testing through a live OpenClaw machine first.
211
+
212
+ ## Local OpenClaw Dev Harness
213
+
214
+ For plugin/runtime work, the preferred pre-publish loop is:
215
+
216
+ 1. run the repo-local CLI directly from source
217
+ 2. run the same flow inside the Dockerized OpenClaw dev harness using a local packed tarball
218
+ 3. only then update a real OpenClaw machine
219
+
220
+ Fast source smoke from the repo checkout:
221
+
222
+ ```bash
223
+ node packages/oomi-ai/bin/oomi-ai.js openclaw debug persona-runtime --name "Chef Dev" --json
224
+ ```
225
+
226
+ Containerized real-runtime smoke:
227
+
228
+ ```bash
229
+ docker compose -f docker/openclaw-dev/compose.yml build openclaw-dev
230
+ docker compose -f docker/openclaw-dev/compose.yml up -d openclaw-dev
231
+ docker compose -f docker/openclaw-dev/compose.yml exec -T openclaw-dev openclaw gateway health --url ws://127.0.0.1:18789 --token dev-gateway-token --json
232
+ docker compose -f docker/openclaw-dev/compose.yml exec -T openclaw-dev oomi-local openclaw debug persona-runtime --name "Chef Dev" --json
177
233
  ```
178
234
 
179
- Real Qwen provider replay:
235
+ The local managed-chat smoke uses a dedicated session key separate from the browser shell so repeated sentinel prompts do not leak into the interactive conversation history.
236
+
237
+ `oomi-local` is a deterministic container wrapper that executes the installed packed `oomi-ai` artifact directly with Node. In the Docker harness, it is only the package wrapper. The agent itself is the real OpenClaw runtime running in the foreground.
238
+
239
+ Shared profile contract smoke:
240
+
241
+ ```bash
242
+ node packages/oomi-ai/bin/oomi-ai.js openclaw profile init --profile-id oomi-dev-local --label "Oomi Local Dev" --backend-url http://127.0.0.1:3001 --device-token dev-device-token --json
243
+ node packages/oomi-ai/bin/oomi-ai.js openclaw profile apply --profile ~/.openclaw/oomi-openclaw-profile.json --openclaw-home ~/.openclaw --json
244
+ ```
245
+
246
+ What the harness does:
247
+
248
+ - bootstraps an isolated OpenClaw home rooted at `HOME/.openclaw`
249
+ - runs `openclaw onboard --non-interactive ...`
250
+ - writes and applies `HOME/.openclaw/oomi-dev-profile.json` using the same shared profile contract the future onboarding UI and hosted-agent bootstrap should use
251
+ - enables the Oomi channel account through that applied profile and relies on local OpenClaw plugin auto-discovery for the installed `oomi-ai` plugin
252
+ - writes device identity material used by the `oomi-ai` bridge tooling
253
+ - packs the local `packages/oomi-ai` checkout into a `.tgz`
254
+ - installs that tarball globally in the container
255
+ - installs the same tarball as a real OpenClaw plugin
256
+ - defaults model auth to `oomi-managed` so onboarding/bootstrap does not require end-user provider keys
257
+ - runs `openclaw gateway` as the foreground container process
258
+
259
+ Useful env overrides for local integration:
180
260
 
181
- ```bash
182
- oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
183
- ```
261
+ - `OOMI_DEV_BACKEND_URL`
262
+ - `OOMI_DEV_DEVICE_TOKEN`
263
+ - `OOMI_DEV_MODEL_AUTH_MODE`
264
+ - `OPENCLAW_GATEWAY_TOKEN`
265
+ - `OPENCLAW_GATEWAY_PASSWORD`
184
266
 
185
- What a good result looks like:
186
- - `backend.success = true`
187
- - `managed.assistantSpeechFinal.present = true`
188
- - `qwen.errorCode = null`
189
- - `qwen.audioDeltaCount > 0` when `--live-provider` is used
267
+ Recommended local modes:
190
268
 
191
- This is the preferred pre-publish gate for managed voice regressions, because it is much faster than publishing to npm and testing through a live OpenClaw machine first.
269
+ - onboarding/runtime checks without provider keys
270
+ - `OOMI_DEV_MODEL_AUTH_MODE=oomi-managed`
271
+ - internal real-response smoke before publish
272
+ - `OPENROUTER_API_KEY=...`
273
+ - optional explicit override: `OOMI_DEV_MODEL_AUTH_MODE=provider-env`
192
274
 
193
- ## Persona Scaffolding
275
+ The default container config is intentionally safe for onboarding and runtime testing. It does not require a published npm version, and it does not require end-user provider keys.
194
276
 
195
- Use the scaffold flow when OpenClaw needs to build a managed persona app that will live inside Oomi:
277
+ To make the Dockerized OpenClaw runtime actually answer managed chat locally today, add this to the repo `.env.local`:
196
278
 
197
279
  ```bash
198
- oomi personas scaffold market-analyst --name "Market Analyst" --description "Private app for reviewing my broker positions and risk." --out ~/.openclaw/personas/market-analyst
280
+ OOMI_DEV_MODEL_AUTH_MODE=provider-env
281
+ OPENROUTER_API_KEY=<your-openrouter-key>
199
282
  ```
200
283
 
201
- Use:
202
- - `oomi personas create <id>` for repo-local manifest work
203
- - `oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace"` for the end-to-end Oomi-managed persona flow
204
- - `oomi personas scaffold <slug>` for a WebSpatial-based Oomi app shell with runtime metadata and health documents
205
- - `oomi persona-jobs execute --message-file <job.json>` when OpenClaw receives a structured persona orchestration job from Oomi
206
-
207
- Additional persona runtime commands:
208
-
209
- ```bash
284
+ The local harness uses the `openrouter-free` preset for direct-provider smoke. If `OPENROUTER_API_KEY` is present in `.env.local`, `pnpm run dev:openclaw-local` automatically uses the provider-backed testing path. Without that key, it boots in `oomi-managed` mode and waits on a future Oomi-managed provider relay.
285
+
286
+ ## Persona Scaffolding
287
+
288
+ Use the scaffold flow when OpenClaw needs to build a managed persona app that will live inside Oomi:
289
+
290
+ ```bash
291
+ oomi personas scaffold market-analyst --name "Market Analyst" --description "Private app for reviewing my broker positions and risk." --out ~/.openclaw/personas/market-analyst
292
+ ```
293
+
294
+ Use:
295
+ - `oomi personas create <id>` for repo-local manifest work
296
+ - `oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace"` for the end-to-end Oomi-managed persona flow
297
+ - `oomi personas scaffold <slug>` for a WebSpatial-based Oomi app shell with runtime metadata and health documents
298
+ - `oomi persona-jobs execute --message-file <job.json>` when OpenClaw receives a structured persona orchestration job from Oomi
299
+
300
+ Additional persona runtime commands:
301
+
302
+ ```bash
303
+ oomi personas launch-managed market-analyst --name "Market Analyst" --description "Private app for reviewing my broker positions and risk."
304
+ oomi personas status market-analyst
305
+ oomi personas stop market-analyst
306
+ oomi personas delete market-analyst
210
307
  oomi personas runtime-register market-analyst --local-port 4789
211
- oomi personas heartbeat market-analyst --local-port 4789
212
- oomi persona-jobs start pj_123
213
- oomi persona-jobs succeed pj_123 --workspace-path ~/.openclaw/personas/market-analyst --local-port 4789
214
- oomi persona-jobs fail pj_123 --code JOB_FAILED --message "Scaffold generation failed."
215
- ```
216
-
217
- Recommended agent flow:
218
-
219
- ```bash
220
- oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace for recipes, meal planning, and kitchen notes."
221
- ```
222
-
223
- That command creates the managed persona record in Oomi using the linked device identity. The backend then enqueues the `persona_job`, and the running bridge consumes that job automatically. The poll path is filtered to `metadata.type = persona_job`, so it does not consume normal queued chat traffic.
308
+ oomi personas heartbeat market-analyst --local-port 4789
309
+ oomi persona-jobs start pj_123
310
+ oomi persona-jobs succeed pj_123 --workspace-path ~/.openclaw/personas/market-analyst --local-port 4789
311
+ oomi persona-jobs fail pj_123 --code JOB_FAILED --message "Scaffold generation failed."
312
+ ```
313
+
314
+ Recommended agent flow:
315
+
316
+ ```bash
317
+ oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace for recipes, meal planning, and kitchen notes."
318
+ ```
319
+
320
+ That command creates the managed persona record in Oomi using the linked device identity. The backend then enqueues the `persona_job`, and the running bridge consumes that job automatically. The poll path is filtered to `metadata.type = persona_job`, so it does not consume normal queued chat traffic.
321
+
322
+ If you want to explicitly host or reuse the persona app on the OpenClaw machine outside the queued-job path, use:
323
+
324
+ ```bash
325
+ oomi personas launch-managed cooking-persona --entry-url https://your-relay.example/oomi/cooking-persona
326
+ ```
327
+
328
+ This command:
329
+
330
+ - reuses `~/.openclaw/personas/<slug>` as the stable workspace
331
+ - scaffolds only when the workspace is missing
332
+ - installs dependencies only when needed or forced
333
+ - allocates or reuses a free local port
334
+ - starts or reuses the local runtime
335
+ - registers the runtime URL back to Oomi unless `--no-register` is set
224
336
 
225
337
  ## Bridge Health States
226
338
 
@@ -234,7 +346,7 @@ The bridge status file is written locally and should roughly be interpreted as:
234
346
 
235
347
  For voice support, a `voice_session_*` failure should be treated as narrower than a full provider outage.
236
348
 
237
- ## Troubleshooting
349
+ ## Troubleshooting
238
350
 
239
351
  ### `invalid handshake: first request must be connect`
240
352
 
@@ -265,32 +377,32 @@ What to check:
265
377
 
266
378
  If the process is alive but runtime faults are being caught, expect `degraded` rather than an immediate hard stop.
267
379
 
268
- ### Voice STT works but the agent does not answer
380
+ ### Voice STT works but the agent does not answer
269
381
 
270
382
  This usually means one of these:
271
383
  - the managed gateway/device side is not actually ready
272
384
  - the bridge or agent run failed after delivery
273
385
  - the OpenClaw run stopped with an upstream provider `network_error`
274
386
 
275
- In that situation, inspect:
276
- - `~/.openclaw/logs/gateway.log`
277
- - `~/.openclaw/logs/gateway.err.log`
278
- - the relevant session JSONL in `~/.openclaw/agents/main/sessions/`
279
-
280
- ### Voice text works but cloned TTS fails with `MISSING_SPOKEN_METADATA`
281
-
282
- Meaning:
283
- - the assistant text arrived
284
- - the backend voice relay never received valid hidden `metadata.spoken`
285
-
286
- What to check:
287
- - run the local replay gate before publishing:
288
- - `oomi openclaw debug assistant-final --text "..."`
289
- - `oomi openclaw debug tts-pipeline --text "..."`
290
- - if the package local replay succeeds but the live machine fails, verify the OpenClaw machine is actually running the updated bridge binary
291
- - if the local replay fails, fix the assistant-final contract first instead of debugging the browser or backend deployment
387
+ In that situation, inspect:
388
+ - `~/.openclaw/logs/gateway.log`
389
+ - `~/.openclaw/logs/gateway.err.log`
390
+ - the relevant session JSONL in `~/.openclaw/agents/main/sessions/`
391
+
392
+ ### Voice text works but cloned TTS fails with `MISSING_SPOKEN_METADATA`
292
393
 
293
- ## Developer Notes
394
+ Meaning:
395
+ - the assistant text arrived
396
+ - the backend voice relay never received valid hidden `metadata.spoken`
397
+
398
+ What to check:
399
+ - run the local replay gate before publishing:
400
+ - `oomi openclaw debug assistant-final --text "..."`
401
+ - `oomi openclaw debug tts-pipeline --text "..."`
402
+ - if the package local replay succeeds but the live machine fails, verify the OpenClaw machine is actually running the updated bridge binary
403
+ - if the local replay fails, fix the assistant-final contract first instead of debugging the browser or backend deployment
404
+
405
+ ## Developer Notes
294
406
 
295
407
  If you are inspecting this package on npm, the main architectural points are:
296
408
  - the extension path is the stable managed text contract
@@ -301,44 +413,44 @@ If you are inspecting this package on npm, the main architectural points are:
301
413
  - `idempotencyKey` handling
302
414
  - bridge status that does not report `connected` before managed subscription is ready
303
415
  - runtime fault isolation so local session failures are less likely to crash the whole provider
304
- - one shared hidden managed-voice speech metadata helper used by both the extension and the local bridge
416
+ - one shared hidden managed-voice speech metadata helper used by both the extension and the local bridge
305
417
 
306
- If you are developing the plugin, test the packaged surface with:
307
-
308
- ```bash
309
- cd packages/oomi-ai
310
- node --test test/*.test.mjs
311
- npm pack --dry-run
312
- ```
313
-
314
- For managed voice changes, do not stop at the package tests. Run the local replay gate from the repo root as well, especially before publishing:
315
-
316
- ```bash
317
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
318
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
319
- ```
320
-
321
- ## Release Process
322
-
323
- Before publishing:
324
-
325
- ```bash
326
- cd packages/oomi-ai
327
- node --test test/*.test.mjs
328
- npm pack --dry-run
329
- ```
330
-
331
- For voice-related changes, also run the repo-backed local replay gate before publish:
332
-
333
- ```bash
334
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
335
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
336
- ```
337
-
338
- Then publish the bumped version:
339
-
340
- ```bash
341
- pnpm check
342
- pnpm publish --dry-run --no-git-checks --access public
343
- pnpm publish --access public
344
- ```
418
+ If you are developing the plugin, test the packaged surface with:
419
+
420
+ ```bash
421
+ cd packages/oomi-ai
422
+ node --test test/*.test.mjs
423
+ npm pack --dry-run
424
+ ```
425
+
426
+ For managed voice changes, do not stop at the package tests. Run the local replay gate from the repo root as well, especially before publishing:
427
+
428
+ ```bash
429
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
430
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
431
+ ```
432
+
433
+ ## Release Process
434
+
435
+ Before publishing:
436
+
437
+ ```bash
438
+ cd packages/oomi-ai
439
+ node --test test/*.test.mjs
440
+ npm pack --dry-run
441
+ ```
442
+
443
+ For voice-related changes, also run the repo-backed local replay gate before publish:
444
+
445
+ ```bash
446
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
447
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
448
+ ```
449
+
450
+ Then publish the bumped version:
451
+
452
+ ```bash
453
+ pnpm check
454
+ pnpm publish --dry-run --no-git-checks --access public
455
+ pnpm publish --access public
456
+ ```
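
The Local OpenClaw Dev Harness section added in 0.2.38 says the harness uses the provider-backed path when `OPENROUTER_API_KEY` is present in `.env.local`, and otherwise boots in `oomi-managed` mode. It also lists `OOMI_DEV_MODEL_AUTH_MODE` as an optional explicit override. The sketch below restates that selection rule as a small shell function. It is illustrative only: `resolve_model_auth_mode` is a hypothetical name that does not exist in `oomi-ai`, the rule that an explicit override wins over key detection is an assumption, and the real logic behind `pnpm run dev:openclaw-local` may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper, NOT part of oomi-ai. It mirrors the selection rule the
# README describes for the local harness and prints the resulting auth mode.
resolve_model_auth_mode() {
  local env_file="${1:-.env.local}"
  local explicit="" key=""

  if [ -f "$env_file" ]; then
    # Use the last assignment of each variable. Lines starting with "#" do not
    # match the pattern, so commented-out entries are ignored.
    explicit="$(sed -n 's/^[[:space:]]*OOMI_DEV_MODEL_AUTH_MODE=//p' "$env_file" | tail -n 1)"
    key="$(sed -n 's/^[[:space:]]*OPENROUTER_API_KEY=//p' "$env_file" | tail -n 1)"
  fi

  if [ -n "$explicit" ]; then
    # Assumption: an explicit mode takes precedence over key detection.
    printf '%s\n' "$explicit"
  elif [ -n "$key" ]; then
    # Key present: provider-backed testing path.
    printf '%s\n' "provider-env"
  else
    # No key and no override: the safe default that needs no provider keys.
    printf '%s\n' "oomi-managed"
  fi
}
```

Calling `resolve_model_auth_mode .env.local` from the repo root prints the mode this sketch would choose. A missing file is treated the same as a file with no key. The value is not unquoted or validated, so a quoted empty value such as `OPENROUTER_API_KEY=""` would still count as present here.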