oomi-ai 0.2.29 → 0.2.38

package/README.md CHANGED
@@ -1,19 +1,26 @@
- # oomi-ai
-
- OpenClaw channel plugin and bridge tooling for Oomi managed chat and voice.
-
- ## Current Focus
-
- `0.2.29` keeps the persona automation lane, adds a usable local managed-voice validation path, and makes bridge logging quiet by default in production:
- - WebSpatial-based persona scaffolding for generated Oomi apps
- - a high-level `oomi personas create-managed` command for agent-driven persona creation
- - device-authenticated persona runtime registration and job callbacks
- - automatic bridge-side polling for queued `persona_job` control messages
- - one shared spoken-metadata normalizer used by both the extension and the bridge
- - a repo-backed local `tts-pipeline` replay that can validate assistant-final -> backend -> real Qwen TTS before publishing
- - spoken-metadata handling that preserves natural pauses like `...` and keeps the managed voice contract valid on the real chat session path
-
- This package is for two audiences:
+ # oomi-ai
+
+ OpenClaw channel plugin and bridge tooling for Oomi managed chat and voice.
+
+ ## Current Focus
+
+ `0.2.38` keeps the persona automation lane, adds a stable local persona runtime manager, upgrades the Docker dev harness from a package simulator to a real OpenClaw runtime, and introduces a shared OpenClaw profile contract so local onboarding, Docker bootstrap, and future hosted agents use the same setup model:
+ - WebSpatial-based persona scaffolding for generated Oomi apps
+ - a high-level `oomi personas create-managed` command for agent-driven persona creation
+ - a stable `oomi personas launch-managed` flow for local persona hosting under `~/.openclaw/personas`
+ - a matching `oomi personas delete` flow that stops managed runtimes and removes the persona workspace from the OpenClaw machine
+ - shared OpenClaw path handling for isolated local or containerized dev roots
+ - versioned `oomi openclaw profile init|apply` commands for deterministic local/dev or hosted setup flows
+ - explicit model auth modes so onboarding can default to `oomi-managed` while internal testing can still opt into direct provider auth
+ - a repo-local `openclaw debug persona-runtime` smoke test for managed persona runtime launch/reuse/stop
+ - a Docker-based OpenClaw dev harness that runs a real `openclaw gateway` inside an isolated container
+ - device-authenticated persona runtime registration and job callbacks
+ - automatic bridge-side polling for queued `persona_job` control messages
+ - one shared spoken-metadata normalizer used by both the extension and the bridge
+ - a repo-backed local `tts-pipeline` replay that can validate assistant-final -> backend -> real Qwen TTS before publishing
+ - spoken-metadata handling that preserves natural pauses like `...` and keeps the managed voice contract valid on the real chat session path
+
+ This package is for two audiences:
  - OpenClaw operators who need to connect a machine to Oomi and keep chat or voice healthy
  - Developers evaluating the plugin on npm and deciding whether it matches their OpenClaw + Oomi setup
 
@@ -122,9 +129,9 @@ Optional fields:
  - `defaultSessionKey`
  - `requestTimeoutMs`
 
- ## Runtime Model
-
- There are two runtime contracts worth understanding.
+ ## Runtime Model
+
+ There are two runtime contracts worth understanding.
 
  ### Managed Text Chat
 
@@ -141,98 +148,191 @@ That bridge:
  - preserves or synthesizes `idempotencyKey` for `chat.send`
  - keeps voice-session faults from poisoning normal provider health where possible
 
- This is the part of the package most likely to matter when debugging voice turn failures.
+ This is the part of the package most likely to matter when debugging voice turn failures.
 
- For managed cloned-voice replies, the canonical contract is:
- - visible assistant `content` stays user-facing
- - hidden `metadata.spoken` carries the backend TTS payload
- - the shared helper in `lib/spokenMetadata.js` is used by both the extension and the local bridge to preserve or normalize that sidecar before it reaches the backend
-
- The backend cloned-voice path is intentionally strict. If `metadata.spoken` does not reach Oomi, backend TTS fails instead of speaking a flat fallback voice.
-
- ## Bridge Logging
-
- The bridge is intentionally quiet by default in production so normal deploys do not spam logs with frame-level transport noise.
-
- To enable verbose bridge tracing temporarily, set:
-
- ```bash
- OOMI_BRIDGE_DEBUG=1
- ```
-
- With that flag enabled, the bridge will emit low-level session, frame, and spoken-metadata debug logs again.
-
- ## Local TTS Validation
-
- If you are developing this package inside the Oomi repo, you can now validate the managed voice path locally before publishing.
-
- This local gate does three things:
- - replays an assistant `chat.final` frame through the same spoken-metadata normalization path used by the OpenClaw extension and the bridge
- - feeds that normalized frame into the Rails backend replay harness
- - optionally calls the real Qwen cloned-voice provider and confirms that audio deltas come back
-
- Important:
- - this is a repo developer workflow, not a generic npm-only operator command
- - it expects the Oomi repo checkout, the Rails backend, and local provider env vars
- - the real-provider replay can auto-enroll a disposable default sample voice profile from `assets/voice/source/nemu-enrollment-sample.mp3`
-
- Assistant-final contract only:
-
- ```bash
- oomi openclaw debug assistant-final --text "Hey Justin! How is the testing going?" --json
- ```
-
- Full local backend replay:
-
- ```bash
- oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --json
+ For managed cloned-voice replies, the canonical contract is:
+ - visible assistant `content` stays user-facing
+ - hidden `metadata.spoken` carries the backend TTS payload
+ - the shared helper in `lib/spokenMetadata.js` is used by both the extension and the local bridge to preserve or normalize that sidecar before it reaches the backend
+
+ The backend cloned-voice path is intentionally strict. If `metadata.spoken` does not reach Oomi, backend TTS fails instead of speaking a flat fallback voice.
+
+ ## Bridge Logging
+
+ The bridge is intentionally quiet by default in production so normal deploys do not spam logs with frame-level transport noise.
+
+ To enable verbose bridge tracing temporarily, set:
+
+ ```bash
+ OOMI_BRIDGE_DEBUG=1
+ ```
+
+ With that flag enabled, the bridge will emit low-level session, frame, and spoken-metadata debug logs again.
+
+ ## Local TTS Validation
+
+ If you are developing this package inside the Oomi repo, you can now validate the managed voice path locally before publishing.
+
+ This local gate does three things:
+ - replays an assistant `chat.final` frame through the same spoken-metadata normalization path used by the OpenClaw extension and the bridge
+ - feeds that normalized frame into the Rails backend replay harness
+ - optionally calls the real Qwen cloned-voice provider and confirms that audio deltas come back
+
+ Important:
+ - this is a repo developer workflow, not a generic npm-only operator command
+ - it expects the Oomi repo checkout, the Rails backend, and local provider env vars
+ - the real-provider replay can auto-enroll a disposable default sample voice profile from `assets/voice/source/nemu-enrollment-sample.mp3`
+
+ Assistant-final contract only:
+
+ ```bash
+ oomi openclaw debug assistant-final --text "Hey Justin! How is the testing going?" --json
+ ```
+
+ Full local backend replay:
+
+ ```bash
+ oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --json
+ ```
+
+ Real Qwen provider replay:
+
+ ```bash
+ oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
+ ```
+
+ What a good result looks like:
+ - `backend.success = true`
+ - `managed.assistantSpeechFinal.present = true`
+ - `qwen.errorCode = null`
+ - `qwen.audioDeltaCount > 0` when `--live-provider` is used
+
+ This is the preferred pre-publish gate for managed voice regressions, because it is much faster than publishing to npm and testing through a live OpenClaw machine first.
+
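Reviewer note: the "What a good result looks like" checklist in the added Local TTS Validation section could be scripted roughly as follows. This is a sketch under assumptions, not part of the package: it assumes the dotted names in the README map to nested keys in the `--json` output, that the JSON has been saved to a file, and that `jq` is installed. The helper name `tts_result_ok` is hypothetical.

```bash
# Hypothetical helper, not part of oomi-ai. Assumes the --json output nests its
# keys exactly as the README's dotted names suggest, and that jq is available.
# Usage: tts_result_ok <result.json> [live]   (pass 1 when --live-provider was used)
tts_result_ok() {
  local file="$1" live="false"
  if [ "${2:-0}" = "1" ]; then
    live="true"
  fi
  # jq -e exits non-zero when the final expression is false or null.
  jq -e --argjson live "$live" '
    .backend.success == true
    and .managed.assistantSpeechFinal.present == true
    and .qwen.errorCode == null
    and (($live | not) or (.qwen.audioDeltaCount > 0))
  ' "$file" >/dev/null
}
```

If the replay output were redirected to a file such as `tts-result.json` (an assumed name), `tts_result_ok tts-result.json 1` would then act as a pass/fail gate for a live-provider run.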
212
+ ## Local OpenClaw Dev Harness
213
+
214
+ For plugin/runtime work, the preferred pre-publish loop is:
215
+
216
+ 1. run the repo-local CLI directly from source
217
+ 2. run the same flow inside the Dockerized OpenClaw dev harness using a local packed tarball
218
+ 3. only then update a real OpenClaw machine
219
+
220
+ Fast source smoke from the repo checkout:
221
+
222
+ ```bash
223
+ node packages/oomi-ai/bin/oomi-ai.js openclaw debug persona-runtime --name "Chef Dev" --json
224
+ ```
225
+
226
+ Containerized real-runtime smoke:
227
+
228
+ ```bash
229
+ docker compose -f docker/openclaw-dev/compose.yml build openclaw-dev
230
+ docker compose -f docker/openclaw-dev/compose.yml up -d openclaw-dev
231
+ docker compose -f docker/openclaw-dev/compose.yml exec -T openclaw-dev openclaw gateway health --url ws://127.0.0.1:18789 --token dev-gateway-token --json
232
+ docker compose -f docker/openclaw-dev/compose.yml exec -T openclaw-dev oomi-local openclaw debug persona-runtime --name "Chef Dev" --json
189
233
  ```
190
234
 
191
- Real Qwen provider replay:
235
+ The local managed-chat smoke uses a dedicated session key separate from the browser shell so repeated sentinel prompts do not leak into the interactive conversation history.
236
+
237
+ `oomi-local` is a deterministic container wrapper that executes the installed packed `oomi-ai` artifact directly with Node. In the Docker harness, it is only the package wrapper. The agent itself is the real OpenClaw runtime running in the foreground.
238
+
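Reviewer note: in the containerized smoke sequence, the gateway may still be starting when the first `exec` runs after `up -d`. A minimal wait-loop sketch, assuming the `openclaw gateway health` command exits non-zero until the gateway is ready (the README does not state this). `retry_until_ok` is a hypothetical helper, not something the package ships.

```bash
# Hypothetical helper: run a command up to <attempts> times, sleeping <delay>
# seconds between failed attempts. Returns 0 on the first success, 1 otherwise.
# Usage: retry_until_ok <attempts> <delay-seconds> <command> [args...]
retry_until_ok() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example (assumes the harness from the README is already up):
#   retry_until_ok 30 2 docker compose -f docker/openclaw-dev/compose.yml \
#     exec -T openclaw-dev openclaw gateway health \
#     --url ws://127.0.0.1:18789 --token dev-gateway-token --json
```

Bounding the attempts keeps a broken harness from hanging a pre-publish script indefinitely.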
+ Shared profile contract smoke:
+
+ ```bash
+ node packages/oomi-ai/bin/oomi-ai.js openclaw profile init --profile-id oomi-dev-local --label "Oomi Local Dev" --backend-url http://127.0.0.1:3001 --device-token dev-device-token --json
+ node packages/oomi-ai/bin/oomi-ai.js openclaw profile apply --profile ~/.openclaw/oomi-openclaw-profile.json --openclaw-home ~/.openclaw --json
+ ```
+
+ What the harness does:
+
+ - bootstraps an isolated OpenClaw home rooted at `HOME/.openclaw`
+ - runs `openclaw onboard --non-interactive ...`
+ - writes and applies `HOME/.openclaw/oomi-dev-profile.json` using the same shared profile contract the future onboarding UI and hosted-agent bootstrap should use
+ - enables the Oomi channel account through that applied profile and relies on local OpenClaw plugin auto-discovery for the installed `oomi-ai` plugin
+ - writes device identity material used by the `oomi-ai` bridge tooling
+ - packs the local `packages/oomi-ai` checkout into a `.tgz`
+ - installs that tarball globally in the container
+ - installs the same tarball as a real OpenClaw plugin
+ - defaults model auth to `oomi-managed` so onboarding/bootstrap does not require end-user provider keys
+ - runs `openclaw gateway` as the foreground container process
+
+ Useful env overrides for local integration:
 
- ```bash
- oomi openclaw debug tts-pipeline --text "When your voice reaches me, it gets turned into text, I read it and think about it, then I speak back through the managed chat session." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
- ```
+ - `OOMI_DEV_BACKEND_URL`
+ - `OOMI_DEV_DEVICE_TOKEN`
+ - `OOMI_DEV_MODEL_AUTH_MODE`
+ - `OPENCLAW_GATEWAY_TOKEN`
+ - `OPENCLAW_GATEWAY_PASSWORD`
 
- What a good result looks like:
- - `backend.success = true`
- - `managed.assistantSpeechFinal.present = true`
- - `qwen.errorCode = null`
- - `qwen.audioDeltaCount > 0` when `--live-provider` is used
+ Recommended local modes:
 
- This is the preferred pre-publish gate for managed voice regressions, because it is much faster than publishing to npm and testing through a live OpenClaw machine first.
+ - onboarding/runtime checks without provider keys
+ - `OOMI_DEV_MODEL_AUTH_MODE=oomi-managed`
+ - internal real-response smoke before publish
+ - `OPENROUTER_API_KEY=...`
+ - optional explicit override: `OOMI_DEV_MODEL_AUTH_MODE=provider-env`
 
- ## Persona Scaffolding
+ The default container config is intentionally safe for onboarding and runtime testing. It does not require a published npm version, and it does not require end-user provider keys.
 
- Use the scaffold flow when OpenClaw needs to build a managed persona app that will live inside Oomi:
+ To make the Dockerized OpenClaw runtime actually answer managed chat locally today, add this to the repo `.env.local`:
 
  ```bash
- oomi personas scaffold market-analyst --name "Market Analyst" --description "Private app for reviewing my broker positions and risk." --out ~/.openclaw/personas/market-analyst
+ OOMI_DEV_MODEL_AUTH_MODE=provider-env
+ OPENROUTER_API_KEY=<your-openrouter-key>
  ```
 
- Use:
- - `oomi personas create <id>` for repo-local manifest work
- - `oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace"` for the end-to-end Oomi-managed persona flow
- - `oomi personas scaffold <slug>` for a WebSpatial-based Oomi app shell with runtime metadata and health documents
- - `oomi persona-jobs execute --message-file <job.json>` when OpenClaw receives a structured persona orchestration job from Oomi
-
- Additional persona runtime commands:
-
- ```bash
+ The local harness uses the `openrouter-free` preset for direct-provider smoke. If `OPENROUTER_API_KEY` is present in `.env.local`, `pnpm run dev:openclaw-local` automatically uses the provider-backed testing path. Without that key, it boots in `oomi-managed` mode and waits on a future Oomi-managed provider relay.
+
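Reviewer note: the mode selection described in the added harness text (provider-backed when `OPENROUTER_API_KEY` is present, `oomi-managed` otherwise) can be sketched as a small resolver. This is an illustration, not the harness's actual code: `resolve_model_auth_mode` is a hypothetical name, it reads plain environment variables rather than parsing `.env.local`, and letting an explicit `OOMI_DEV_MODEL_AUTH_MODE` win over key detection is an assumption based on the "optional explicit override" wording.

```bash
# Hypothetical resolver, not the harness's real implementation.
# Precedence (assumed): explicit OOMI_DEV_MODEL_AUTH_MODE, then key detection,
# then the safe oomi-managed default described in the README.
resolve_model_auth_mode() {
  if [ -n "${OOMI_DEV_MODEL_AUTH_MODE:-}" ]; then
    printf '%s\n' "$OOMI_DEV_MODEL_AUTH_MODE"
  elif [ -n "${OPENROUTER_API_KEY:-}" ]; then
    printf '%s\n' "provider-env"
  else
    printf '%s\n' "oomi-managed"
  fi
}
```

Defaulting to `oomi-managed` when nothing is set keeps the no-provider-key onboarding path as the fallback, which matches the "intentionally safe" default the README describes.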
+ ## Persona Scaffolding
+
+ Use the scaffold flow when OpenClaw needs to build a managed persona app that will live inside Oomi:
+
+ ```bash
+ oomi personas scaffold market-analyst --name "Market Analyst" --description "Private app for reviewing my broker positions and risk." --out ~/.openclaw/personas/market-analyst
+ ```
+
+ Use:
+ - `oomi personas create <id>` for repo-local manifest work
+ - `oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace"` for the end-to-end Oomi-managed persona flow
+ - `oomi personas scaffold <slug>` for a WebSpatial-based Oomi app shell with runtime metadata and health documents
+ - `oomi persona-jobs execute --message-file <job.json>` when OpenClaw receives a structured persona orchestration job from Oomi
+
+ Additional persona runtime commands:
+
+ ```bash
+ oomi personas launch-managed market-analyst --name "Market Analyst" --description "Private app for reviewing my broker positions and risk."
+ oomi personas status market-analyst
+ oomi personas stop market-analyst
+ oomi personas delete market-analyst
  oomi personas runtime-register market-analyst --local-port 4789
- oomi personas heartbeat market-analyst --local-port 4789
- oomi persona-jobs start pj_123
- oomi persona-jobs succeed pj_123 --workspace-path ~/.openclaw/personas/market-analyst --local-port 4789
- oomi persona-jobs fail pj_123 --code JOB_FAILED --message "Scaffold generation failed."
- ```
-
- Recommended agent flow:
-
- ```bash
- oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace for recipes, meal planning, and kitchen notes."
- ```
-
- That command creates the managed persona record in Oomi using the linked device identity. The backend then enqueues the `persona_job`, and the running bridge consumes that job automatically. The poll path is filtered to `metadata.type = persona_job`, so it does not consume normal queued chat traffic.
+ oomi personas heartbeat market-analyst --local-port 4789
+ oomi persona-jobs start pj_123
+ oomi persona-jobs succeed pj_123 --workspace-path ~/.openclaw/personas/market-analyst --local-port 4789
+ oomi persona-jobs fail pj_123 --code JOB_FAILED --message "Scaffold generation failed."
+ ```
+
+ Recommended agent flow:
+
+ ```bash
+ oomi personas create-managed --name "Cooking Persona" --description "Private cooking workspace for recipes, meal planning, and kitchen notes."
+ ```
+
+ That command creates the managed persona record in Oomi using the linked device identity. The backend then enqueues the `persona_job`, and the running bridge consumes that job automatically. The poll path is filtered to `metadata.type = persona_job`, so it does not consume normal queued chat traffic.
+
+ If you want to explicitly host or reuse the persona app on the OpenClaw machine outside the queued-job path, use:
+
+ ```bash
+ oomi personas launch-managed cooking-persona --entry-url https://your-relay.example/oomi/cooking-persona
+ ```
+
+ This command:
+
+ - reuses `~/.openclaw/personas/<slug>` as the stable workspace
+ - scaffolds only when the workspace is missing
+ - installs dependencies only when needed or forced
+ - allocates or reuses a free local port
+ - starts or reuses the local runtime
+ - registers the runtime URL back to Oomi unless `--no-register` is set
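
Reviewer note: the reuse-or-create behavior listed for `launch-managed` can be illustrated with a small planner that decides which steps a launch would need. This is a sketch only, not the package's implementation: `plan_persona_launch` is a hypothetical name, and treating a `node_modules` directory as the sign that dependencies are already installed is an assumption. Port allocation and registration are left out.

```bash
# Hypothetical planner: prints, one per line, the steps a launch would need
# for a given workspace directory. It only inspects the filesystem.
# Usage: plan_persona_launch <workspace-dir> [force-install: 0|1]
plan_persona_launch() {
  local workspace="$1" force_install="${2:-0}"
  if [ ! -d "$workspace" ]; then
    echo "scaffold"   # scaffold only when the workspace is missing
    echo "install"    # a fresh scaffold has no dependencies yet
  elif [ "$force_install" = "1" ] || [ ! -d "$workspace/node_modules" ]; then
    echo "install"    # install only when needed or forced
  fi
  echo "start-or-reuse-runtime"
}
```

For example, `plan_persona_launch "$HOME/.openclaw/personas/cooking-persona"` prints only `start-or-reuse-runtime` when the workspace and its dependencies already exist, which is the idempotent repeat-launch case the list describes.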
 
  ## Bridge Health States
 
@@ -246,7 +346,7 @@ The bridge status file is written locally and should roughly be interpreted as:
 
  For voice support, a `voice_session_*` failure should be treated as narrower than a full provider outage.
 
- ## Troubleshooting
+ ## Troubleshooting
 
  ### `invalid handshake: first request must be connect`
 
@@ -277,32 +377,32 @@ What to check:
 
  If the process is alive but runtime faults are being caught, expect `degraded` rather than an immediate hard stop.
 
- ### Voice STT works but the agent does not answer
+ ### Voice STT works but the agent does not answer
 
  This usually means one of these:
  - the managed gateway/device side is not actually ready
  - the bridge or agent run failed after delivery
  - the OpenClaw run stopped with an upstream provider `network_error`
 
- In that situation, inspect:
- - `~/.openclaw/logs/gateway.log`
- - `~/.openclaw/logs/gateway.err.log`
- - the relevant session JSONL in `~/.openclaw/agents/main/sessions/`
-
- ### Voice text works but cloned TTS fails with `MISSING_SPOKEN_METADATA`
-
- Meaning:
- - the assistant text arrived
- - the backend voice relay never received valid hidden `metadata.spoken`
-
- What to check:
- - run the local replay gate before publishing:
- - `oomi openclaw debug assistant-final --text "..."`
- - `oomi openclaw debug tts-pipeline --text "..."`
- - if the package local replay succeeds but the live machine fails, verify the OpenClaw machine is actually running the updated bridge binary
- - if the local replay fails, fix the assistant-final contract first instead of debugging the browser or backend deployment
+ In that situation, inspect:
+ - `~/.openclaw/logs/gateway.log`
+ - `~/.openclaw/logs/gateway.err.log`
+ - the relevant session JSONL in `~/.openclaw/agents/main/sessions/`
+
+ ### Voice text works but cloned TTS fails with `MISSING_SPOKEN_METADATA`
 
- ## Developer Notes
+ Meaning:
+ - the assistant text arrived
+ - the backend voice relay never received valid hidden `metadata.spoken`
+
+ What to check:
+ - run the local replay gate before publishing:
+ - `oomi openclaw debug assistant-final --text "..."`
+ - `oomi openclaw debug tts-pipeline --text "..."`
+ - if the package local replay succeeds but the live machine fails, verify the OpenClaw machine is actually running the updated bridge binary
+ - if the local replay fails, fix the assistant-final contract first instead of debugging the browser or backend deployment
+
+ ## Developer Notes
 
  If you are inspecting this package on npm, the main architectural points are:
  - the extension path is the stable managed text contract
@@ -313,44 +413,44 @@ If you are inspecting this package on npm, the main architectural points are:
  - `idempotencyKey` handling
  - bridge status that does not report `connected` before managed subscription is ready
  - runtime fault isolation so local session failures are less likely to crash the whole provider
- - one shared hidden managed-voice speech metadata helper used by both the extension and the local bridge
+ - one shared hidden managed-voice speech metadata helper used by both the extension and the local bridge
 
- If you are developing the plugin, test the packaged surface with:
-
- ```bash
- cd packages/oomi-ai
- node --test test/*.test.mjs
- npm pack --dry-run
- ```
-
- For managed voice changes, do not stop at the package tests. Run the local replay gate from the repo root as well, especially before publishing:
-
- ```bash
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
- ```
-
- ## Release Process
-
- Before publishing:
-
- ```bash
- cd packages/oomi-ai
- node --test test/*.test.mjs
- npm pack --dry-run
- ```
-
- For voice-related changes, also run the repo-backed local replay gate before publish:
-
- ```bash
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
- oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
- ```
-
- Then publish the bumped version:
-
- ```bash
- pnpm check
- pnpm publish --dry-run --no-git-checks --access public
- pnpm publish --access public
- ```
+ If you are developing the plugin, test the packaged surface with:
+
+ ```bash
+ cd packages/oomi-ai
+ node --test test/*.test.mjs
+ npm pack --dry-run
+ ```
+
+ For managed voice changes, do not stop at the package tests. Run the local replay gate from the repo root as well, especially before publishing:
+
+ ```bash
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
+ ```
+
+ ## Release Process
+
+ Before publishing:
+
+ ```bash
+ cd packages/oomi-ai
+ node --test test/*.test.mjs
+ npm pack --dry-run
+ ```
+
+ For voice-related changes, also run the repo-backed local replay gate before publish:
+
+ ```bash
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --json
+ oomi openclaw debug tts-pipeline --text "Local managed voice validation text." --live-provider --env-file .env.local --provider-timeout-ms 20000 --json
+ ```
+
+ Then publish the bumped version:
+
+ ```bash
+ pnpm check
+ pnpm publish --dry-run --no-git-checks --access public
+ pnpm publish --access public
+ ```