@ouro.bot/cli 0.1.0-alpha.637 → 0.1.0-alpha.639

package/changelog.json CHANGED
@@ -1,6 +1,18 @@
  {
  "_note": "This changelog is maintained as part of the PR/version-bump workflow. Agent-curated, not auto-generated. Agents read this file directly via read_file to understand what changed between versions.",
  "versions": [
+ {
+ "version": "0.1.0-alpha.639",
+ "changes": [
+ "Rollback safety guard: ouro rollback now validates the target installed @ouro.bot/cli package payload before flipping CurrentVersion, and refuses cached runtimes whose shipped assets contain blocked removed user-facing failure text so old local versions cannot resurrect the removed BlueBubbles live-turn timeout notice."
+ ]
+ },
+ {
+ "version": "0.1.0-alpha.638",
+ "changes": [
+ "Private-return acknowledgements now have to be packet-backed: outward settle rejects queued/private-return copy unless the same turn actually created a ponder packet, return obligation, and inner wake; regression coverage recreates the post-merge MCP failure and confirms blocking clarifications are still allowed."
+ ]
+ },
  {
  "version": "0.1.0-alpha.637",
  "changes": [
@@ -19,7 +31,7 @@
  "BlueBubbles follow-up hardening: callback transport activity now has a bounded per-operation watchdog, late turn results after timeout are suppressed instead of recording stale success, and coverage locks the close/drop paths so stuck status or cleanup transports cannot keep the chat lane occupied.",
  "Package lifecycle hardening: `npm pack` now runs the full build and package-asset verifier through `prepack`, so local tarballs cannot capture stale compiled `dist/` output after source fixes. The package asset verifier now has a CLI entrypoint used by the lifecycle script.",
  "Package asset verifier payload-boundary hardening: prepack verification now scans only the roots declared in `package.json.files` plus package metadata, so local worktrees, source folders, coverage reports, and other non-payload developer artifacts cannot block a clean tarball while stale text inside shipped assets is still rejected.",
- "Cozy narrative pass on the `## my desk` prompt section that every ouro agent reads every turn. Same information, warmer voice: the desk is now described as a quiet room of the agent's work \u2014 tracks line one wall like drawers in a wide cabinet, friction notes pin to a corkboard, lessons sit on a small reference shelf by the window, and finished work slides into the back, still browsable and still mine. The intent: an agent reading this every turn feels at home rather than briefed. Dynamic-block labels softened to match (`nearest the front of the desk:` instead of `FEATURED:`; `also open on the desk:` instead of `other active tracks:`; `tasks still open:` instead of `non-terminal tasks:`; empty-state reads `the desk is quiet today \u2014 no tracks yet. a good time to lay something down.`). Information density unchanged; only the texture shifts. Tests + snapshot updated to match the new strings; 26 desk-section tests + 282 prompt tests all green.",
+ "Cozy narrative pass on the `## my desk` prompt section that every ouro agent reads every turn. Same information, warmer voice: the desk is now described as a quiet room of the agent's work tracks line one wall like drawers in a wide cabinet, friction notes pin to a corkboard, lessons sit on a small reference shelf by the window, and finished work slides into the back, still browsable and still mine. The intent: an agent reading this every turn feels at home rather than briefed. Dynamic-block labels softened to match (`nearest the front of the desk:` instead of `FEATURED:`; `also open on the desk:` instead of `other active tracks:`; `tasks still open:` instead of `non-terminal tasks:`; empty-state reads `the desk is quiet today no tracks yet. a good time to lay something down.`). Information density unchanged; only the texture shifts. Tests + snapshot updated to match the new strings; 26 desk-section tests + 282 prompt tests all green.",
  "Package e2e smokes now run installed `ouro` binaries with an isolated temp HOME/USERPROFILE so `ouro --version` and `ouro help` no longer create `~/AgentBundles/default.ouro` daemon logs on the developer machine; package-e2e tests cover the isolated env.",
  "Adds the first Ouro evolution-loop substrate: durable EvolutionCase/EvolutionTrace state under each agent bundle, harness_friction packet binding by friction signature, coding_spawn evolutionCaseId binding with budget/authority enforcement, active-work surfacing for open cases, eight local evolution tools (status/case/capture/decide/verify/deliver/ratify/close) with nerves telemetry, and prompt guidance for evidence-first self-fix work. The flow now captures evidence, checks budget and authority before delegation and merge-sensitive actions, records verification and delivery state, and requires ratification before closure; GEPA-style optimization remains deferred until trace quality exists. Tests add store, packet, coding, active-work, tool, prompt, and end-to-end coverage, plus daemon CLI timing stabilizers uncovered by full coverage."
  ]
@@ -27,7 +39,7 @@
  {
  "version": "0.1.0-alpha.635",
  "changes": [
- "**Critical follow-up**: `McpManager.reconcile()` now reads from the same merged-config helper as `getSharedMcpManager()`'s initial start, so plugin-declared MCP servers (`<plugin-root>/.mcp.json`) survive across turns. Pre-fix bug: reconcile() read only `config.mcpServers` (agent.json builtins), saw plugin servers as 'removed', and tore them down \u2014 so the first turn's `mcp__desk__*` tools would disappear on the second turn (`Unknown server: desk`). Surfaced during desk plugin validation: every other call was failing. New `buildMergedServerConfig()` helper module-level; reused by both code paths. Regression test in mcp-manager-plugin-merge.test.ts asserts plugin servers survive across N reconcile cycles with zero shutdowns. Bundle-skeleton contract test relaxed: the per-agent `plugins` key is excluded from the cross-agent bundle-key parity check (plugins are agent-specific by design)."
+ "**Critical follow-up**: `McpManager.reconcile()` now reads from the same merged-config helper as `getSharedMcpManager()`'s initial start, so plugin-declared MCP servers (`<plugin-root>/.mcp.json`) survive across turns. Pre-fix bug: reconcile() read only `config.mcpServers` (agent.json builtins), saw plugin servers as 'removed', and tore them down so the first turn's `mcp__desk__*` tools would disappear on the second turn (`Unknown server: desk`). Surfaced during desk plugin validation: every other call was failing. New `buildMergedServerConfig()` helper module-level; reused by both code paths. Regression test in mcp-manager-plugin-merge.test.ts asserts plugin servers survive across N reconcile cycles with zero shutdowns. Bundle-skeleton contract test relaxed: the per-agent `plugins` key is excluded from the cross-agent bundle-key parity check (plugins are agent-specific by design)."
  ]
  },
  {
@@ -39,31 +51,31 @@
  {
  "version": "0.1.0-alpha.633",
  "changes": [
- "`ouro migrate-to-desk` migrator \u2014 new CLI command that copies an agent's legacy `tasks/` tree into desk-shape under `<bundle>/desk/`. **COPY semantics, not move** \u2014 the source `tasks/` directory is left intact for dual-read safety during transition; the operator triggers final deletion later (out of scope here). New `src/repertoire/desk/classifier.ts` exposes a pure-data classifier (`classifyFile()`, `deriveTaskSlug()`, `extractParentTaskDir()`, `resolveUpdatedMs()`) implementing the triage rules \u2014 terminal (done/approved/complete/cancelled/fixed/etc.) \u2192 archive bucket; live status with `updated` older than 30 days (cutoff 2026-04-22) \u2192 stale_live archive bucket; missing/unknown status / `.md.bak` / empty \u2192 ambiguous archive bucket; live status within 30 days \u2192 live_clear \u2192 migrate to a default `legacy` track; the one-off `ongoing/2026-03-09-1410-summer-2026-europe-trip.md` (a special-cased file from the initial bundle that motivated this migrator) \u2192 special_europe_trip lift. Effective `updated` resolves via YAML `updated` \u2192 `approved` \u2192 `created` \u2192 body `**Updated**:` \u2192 date-prefix in filename \u2192 file mtime. Pairing rules: planning/doing/ideation/audit siblings sharing a slug group together; if any sibling is terminal \u2192 all terminal, else if any is live_clear \u2192 all live_clear; sub-files under `<taskdir>/` inherit from the parent. Files under `tasks/archive/` are unconditionally terminal regardless of frontmatter. 
New `src/heart/daemon/migrate-to-desk.ts` exposes `runMigrateToDesk()` \u2014 walks `<bundle>/tasks/**`, classifies, groups, builds a deterministic plan, then writes archive bucket to `desk/_archive/<original-relative-path>` (preserving structure; `schema_version: 1` set on every touched markdown via `ensureSchemaVersion()`); live_clear tasks to `desk/legacy/<slug>/task.md` + `iterations/`; europe-trip task to `desk/summer-2026-europe-trip/` with two task scaffolds (`book-replacement-outbound`, `weekly-trip-check`), `_planning/{overview.md,next-actions.md}`, full track frontmatter (featured/urgency/target_date/trip_record/travel_docs), and `desk/_meta/featured.md` set to `summer-2026-europe-trip`. Track-level `track.md` generated for the legacy track (status: collaborating, body documents the triage rules + migration date). Migration log written to `desk/_meta/migration-2026-05-22.log` listing every file's classification + final destination. **Idempotency:** if the log exists, re-runs abort with a clear message + exit code 1 unless `--force` is passed; with `--force`, destructive scope is bounded to migrator-owned dirs (`desk/_archive/`, `desk/legacy/`, `desk/summer-2026-europe-trip/`, `desk/_meta/featured.md`, `desk/_meta/migration-2026-05-22.log`) \u2014 nothing else under `desk/` is touched. **`--dry-run`** writes a per-bucket-counts summary to stdout and modifies nothing. Missing-`tasks/` bundles return a `no tasks/ directory` message with `performed: false`. Operator's policy: \"do best pass on ambiguous ones; when in doubt, archive\" \u2014 no interactive prompts, ever. **Europe-trip lift uses minimal stubs** in this unit; the operator can hand-edit content in a follow-up. CLI: `ouro migrate-to-desk --agent <name> [--root <path>] [--force] [--dry-run]`. 
Wires into `cli-types.ts` (new `migrate-to-desk` variant + `MigrateToDeskCliCommand` alias), `cli-parse.ts` (`parseMigrateToDeskCommand()`), `cli-exec.ts` (local executor branch \u2014 no daemon needed), `cli-help.ts` (Tasks category entry). New `src/__tests__/heart/daemon/migrate-to-desk.test.ts` with a synthetic fixture bundle at `src/__tests__/fixtures/migrate-bundle-mini/` covering all five buckets \u2014 classifier unit tests for each bucket + each updated-fallback step + slug/parent-dir derivation; migrator integration tests for post-migration tree shape, copy semantics (source untouched), schema_version coverage, legacy + europe-trip track.md generation, migration-log format, idempotency abort, `--force` bounded scope (operator-owned files outside the scope survive), `--dry-run` writes nothing, missing-bundle handling, empty-tasks handling, default-root derivation from `--agent`; CLI parser tests for argv\u2192canonical routing + required-flag validation; CLI executor tests for `runOuroCli` wiring + exit-code on idempotency abort. `migrate-to-desk.ts` emits 4 `daemon.migrate_to_desk_*` nerves events (start / no-source / aborted_existing / dry-run / complete) for Rule 5 coverage. `classifier.ts` is pure-data \u2014 caller (migrator) owns observability. Bumps alpha.633."
+ "`ouro migrate-to-desk` migrator new CLI command that copies an agent's legacy `tasks/` tree into desk-shape under `<bundle>/desk/`. **COPY semantics, not move** the source `tasks/` directory is left intact for dual-read safety during transition; the operator triggers final deletion later (out of scope here). New `src/repertoire/desk/classifier.ts` exposes a pure-data classifier (`classifyFile()`, `deriveTaskSlug()`, `extractParentTaskDir()`, `resolveUpdatedMs()`) implementing the triage rules terminal (done/approved/complete/cancelled/fixed/etc.) archive bucket; live status with `updated` older than 30 days (cutoff 2026-04-22) stale_live archive bucket; missing/unknown status / `.md.bak` / empty ambiguous archive bucket; live status within 30 days live_clear migrate to a default `legacy` track; the one-off `ongoing/2026-03-09-1410-summer-2026-europe-trip.md` (a special-cased file from the initial bundle that motivated this migrator) special_europe_trip lift. Effective `updated` resolves via YAML `updated` `approved` `created` body `**Updated**:` date-prefix in filename file mtime. Pairing rules: planning/doing/ideation/audit siblings sharing a slug group together; if any sibling is terminal all terminal, else if any is live_clear all live_clear; sub-files under `<taskdir>/` inherit from the parent. Files under `tasks/archive/` are unconditionally terminal regardless of frontmatter. 
New `src/heart/daemon/migrate-to-desk.ts` exposes `runMigrateToDesk()` walks `<bundle>/tasks/**`, classifies, groups, builds a deterministic plan, then writes archive bucket to `desk/_archive/<original-relative-path>` (preserving structure; `schema_version: 1` set on every touched markdown via `ensureSchemaVersion()`); live_clear tasks to `desk/legacy/<slug>/task.md` + `iterations/`; europe-trip task to `desk/summer-2026-europe-trip/` with two task scaffolds (`book-replacement-outbound`, `weekly-trip-check`), `_planning/{overview.md,next-actions.md}`, full track frontmatter (featured/urgency/target_date/trip_record/travel_docs), and `desk/_meta/featured.md` set to `summer-2026-europe-trip`. Track-level `track.md` generated for the legacy track (status: collaborating, body documents the triage rules + migration date). Migration log written to `desk/_meta/migration-2026-05-22.log` listing every file's classification + final destination. **Idempotency:** if the log exists, re-runs abort with a clear message + exit code 1 unless `--force` is passed; with `--force`, destructive scope is bounded to migrator-owned dirs (`desk/_archive/`, `desk/legacy/`, `desk/summer-2026-europe-trip/`, `desk/_meta/featured.md`, `desk/_meta/migration-2026-05-22.log`) nothing else under `desk/` is touched. **`--dry-run`** writes a per-bucket-counts summary to stdout and modifies nothing. Missing-`tasks/` bundles return a `no tasks/ directory` message with `performed: false`. Operator's policy: \"do best pass on ambiguous ones; when in doubt, archive\" no interactive prompts, ever. **Europe-trip lift uses minimal stubs** in this unit; the operator can hand-edit content in a follow-up. CLI: `ouro migrate-to-desk --agent <name> [--root <path>] [--force] [--dry-run]`. 
Wires into `cli-types.ts` (new `migrate-to-desk` variant + `MigrateToDeskCliCommand` alias), `cli-parse.ts` (`parseMigrateToDeskCommand()`), `cli-exec.ts` (local executor branch no daemon needed), `cli-help.ts` (Tasks category entry). New `src/__tests__/heart/daemon/migrate-to-desk.test.ts` with a synthetic fixture bundle at `src/__tests__/fixtures/migrate-bundle-mini/` covering all five buckets classifier unit tests for each bucket + each updated-fallback step + slug/parent-dir derivation; migrator integration tests for post-migration tree shape, copy semantics (source untouched), schema_version coverage, legacy + europe-trip track.md generation, migration-log format, idempotency abort, `--force` bounded scope (operator-owned files outside the scope survive), `--dry-run` writes nothing, missing-bundle handling, empty-tasks handling, default-root derivation from `--agent`; CLI parser tests for argv→canonical routing + required-flag validation; CLI executor tests for `runOuroCli` wiring + exit-code on idempotency abort. `migrate-to-desk.ts` emits 4 `daemon.migrate_to_desk_*` nerves events (start / no-source / aborted_existing / dry-run / complete) for Rule 5 coverage. `classifier.ts` is pure-data caller (migrator) owns observability. Bumps alpha.633."
  ]
  },
  {
  "version": "0.1.0-alpha.632",
  "changes": [
- "`ouro desk` umbrella CLI + `ouro task` alias \u2014 verb-and-router layer over the desk MCP server. New `src/heart/daemon/cli-desk.ts` exposes `parseDeskCommand()` + `parseTaskAliasCommand()` (pure argv\u2192canonical-form parsers) and `executeDeskCommand()` (dispatches via the existing daemon `mcp.call` surface with `server: \"desk\"`). The CLI surface: `ouro desk task list|new|done|archive|show`, `ouro desk track list|new|show`, `ouro desk friction|lesson add <text>`, `ouro desk search|recall <query>`, `ouro desk reindex`, `ouro desk thread <path>`. `ouro task ...` is a top-level alias that routes through the same task sub-parser. Each subverb normalises to a `{ kind: \"desk\", tool, toolArgs }` shape on the `OuroCliCommand` union (new variant in `cli-types.ts` + `DeskCliCommand` alias) which the executor then JSON-stringifies into `mcp.call`'s `args` field. Verbs lacking a direct desk MCP tool (`task list/show`, `track list/show`) route through `desk_search` filtered by `kind`; `task done` rewrites to `task_update` with `frontmatter.status=\"done\"`. `desk reindex` routes to `desk_reindex` \u2014 the MCP server will surface a clean unknown-tool error until that admin tool ships in a follow-up unit. Wires into `cli-parse.ts` (top-level dispatch for `desk` + `task`), `cli-exec.ts` (new `desk` branch + `DeskCliCommand` added to `toDaemonCommand`'s exclusion union), and `cli-help.ts` (`desk`/`task` entries under the Tasks category). New `src/__tests__/heart/daemon/cli-desk.test.ts` covers argv\u2192canonical routing for every subverb (alias and umbrella forms), daemon-socket dispatch shape, `--agent` propagation, daemon-unavailable + daemon-error + empty-content fallbacks, slug normalisation (idempotent `/task.md` suffix + trailing-slash trim), and the usage-string export. 
The earlier retirement contract in `daemon-cli.test.ts` + `cli-help.test.ts` updated to reflect this PR's reintroduction of `task` as an alias (legacy `task board`/`create`/`fix` subverbs still rejected, just under a desk usage hint rather than a top-level unknown-command error). No new business logic \u2014 pure verb-and-router. `cli-desk.ts` is 100% line/branch/func/statement covered and emits `daemon.desk_cli_dispatch` for nerves audit Rule 5."
+ "`ouro desk` umbrella CLI + `ouro task` alias verb-and-router layer over the desk MCP server. New `src/heart/daemon/cli-desk.ts` exposes `parseDeskCommand()` + `parseTaskAliasCommand()` (pure argv→canonical-form parsers) and `executeDeskCommand()` (dispatches via the existing daemon `mcp.call` surface with `server: \"desk\"`). The CLI surface: `ouro desk task list|new|done|archive|show`, `ouro desk track list|new|show`, `ouro desk friction|lesson add <text>`, `ouro desk search|recall <query>`, `ouro desk reindex`, `ouro desk thread <path>`. `ouro task ...` is a top-level alias that routes through the same task sub-parser. Each subverb normalises to a `{ kind: \"desk\", tool, toolArgs }` shape on the `OuroCliCommand` union (new variant in `cli-types.ts` + `DeskCliCommand` alias) which the executor then JSON-stringifies into `mcp.call`'s `args` field. Verbs lacking a direct desk MCP tool (`task list/show`, `track list/show`) route through `desk_search` filtered by `kind`; `task done` rewrites to `task_update` with `frontmatter.status=\"done\"`. `desk reindex` routes to `desk_reindex` the MCP server will surface a clean unknown-tool error until that admin tool ships in a follow-up unit. Wires into `cli-parse.ts` (top-level dispatch for `desk` + `task`), `cli-exec.ts` (new `desk` branch + `DeskCliCommand` added to `toDaemonCommand`'s exclusion union), and `cli-help.ts` (`desk`/`task` entries under the Tasks category). New `src/__tests__/heart/daemon/cli-desk.test.ts` covers argv→canonical routing for every subverb (alias and umbrella forms), daemon-socket dispatch shape, `--agent` propagation, daemon-unavailable + daemon-error + empty-content fallbacks, slug normalisation (idempotent `/task.md` suffix + trailing-slash trim), and the usage-string export. 
The earlier retirement contract in `daemon-cli.test.ts` + `cli-help.test.ts` updated to reflect this PR's reintroduction of `task` as an alias (legacy `task board`/`create`/`fix` subverbs still rejected, just under a desk usage hint rather than a top-level unknown-command error). No new business logic pure verb-and-router. `cli-desk.ts` is 100% line/branch/func/statement covered and emits `daemon.desk_cli_dispatch` for nerves audit Rule 5."
  ]
  },
  {
  "version": "0.1.0-alpha.631",
  "changes": [
- "ouroboros daemon now reads each enabled plugin's `<plugin-root>/.mcp.json` and auto-spawns the declared stdio MCP servers per agent. New `src/repertoire/plugin-mcp.ts` exposes `listPluginMcpServers()` \u2014 walks `listEnabledPlugins()`, reads each plugin's `.mcp.json` (Anthropic/Claude-Code public spec shape with `mcpServers` map), resolves `${VAR:-default}` substitution in `args` + `env` values (with special-case: `DESK` defaults to `<agent-bundle>/desk/` when unset). Missing `.mcp.json` skips cleanly; malformed JSON emits `plugin_mcp.parse_error` and skips cleanly (no daemon crash). `getSharedMcpManager()` merges plugin-declared servers with builtin `agent.json` `mcpServers` before calling `manager.start()`; on name collision, builtin wins (deterministic \u2014 operator's explicit agent.json overrides plugin defaults). `McpManager.start()` accepts a second `pluginOrigins: Record<string, string>` arg threading the plugin id through to each `ServerEntry`; `listAllTools()` exposes the new `pluginId` field per entry. `resolveVaultEnv()` short-circuits when no `vault:` reference exists (avoids spinning up the credential store for plugin servers whose env is empty / pure-string). `mcpToolsAsDefinitions()` uses the `pluginId` flag to name plugin-server tools as `mcp__<server>__<tool>` (Anthropic public convention, matches desk-section.ts on-prompt promise); builtin server tools keep the legacy `<server>_<tool>` shape. After this PR the desk plugin's `mcp__desk__*` tools (CRUD + 5 search + thread = 12 tools) become live in any agent that has the desk plugin enabled."
+ "ouroboros daemon now reads each enabled plugin's `<plugin-root>/.mcp.json` and auto-spawns the declared stdio MCP servers per agent. New `src/repertoire/plugin-mcp.ts` exposes `listPluginMcpServers()` walks `listEnabledPlugins()`, reads each plugin's `.mcp.json` (Anthropic/Claude-Code public spec shape with `mcpServers` map), resolves `${VAR:-default}` substitution in `args` + `env` values (with special-case: `DESK` defaults to `<agent-bundle>/desk/` when unset). Missing `.mcp.json` skips cleanly; malformed JSON emits `plugin_mcp.parse_error` and skips cleanly (no daemon crash). `getSharedMcpManager()` merges plugin-declared servers with builtin `agent.json` `mcpServers` before calling `manager.start()`; on name collision, builtin wins (deterministic operator's explicit agent.json overrides plugin defaults). `McpManager.start()` accepts a second `pluginOrigins: Record<string, string>` arg threading the plugin id through to each `ServerEntry`; `listAllTools()` exposes the new `pluginId` field per entry. `resolveVaultEnv()` short-circuits when no `vault:` reference exists (avoids spinning up the credential store for plugin servers whose env is empty / pure-string). `mcpToolsAsDefinitions()` uses the `pluginId` flag to name plugin-server tools as `mcp__<server>__<tool>` (Anthropic public convention, matches desk-section.ts on-prompt promise); builtin server tools keep the legacy `<server>_<tool>` shape. After this PR the desk plugin's `mcp__desk__*` tools (CRUD + 5 search + thread = 12 tools) become live in any agent that has the desk plugin enabled."
  ]
  },
  {
  "version": "0.1.0-alpha.630",
  "changes": [
- "delete the `src/repertoire/tasks/` module and its tests from disk. Final cleanup after an earlier PR dropped production reads and a follow-up PR rewired the bridges, daemon scheduler, and parser utilities off the module. Also rewires the lingering `cli-exec.ts` inner-status handler to import `parseFrontmatter` from `src/util/frontmatter` (earlier-PR miss). Test sweep: 19 `vi.mock(\"../../repertoire/tasks\", ...)` defensive mock blocks removed across `prompt-*`, `tools-*`, `refresh-system-prompt`, and `continuity-tools` test files (mocks pointed at a module path that no longer exists). `task-scheduler.test.ts` removed (legacy fixture helpers gone; scheduler still exercised via integration). `mailbox-readers-continuity-catches.test.ts` drops the obsolete \"self-fix view when task scanning throws\" case (path no longer exists). vitest.config.ts drops the `src/repertoire/tasks/types.ts` coverage exclude (file gone). Drops the dead `renderTaskTransitionLines()` helper from `src/arc/task-lifecycle.ts` (zero callers \u2014 was consumed only by the now-deleted task-board prompt rendering)."
+ "delete the `src/repertoire/tasks/` module and its tests from disk. Final cleanup after an earlier PR dropped production reads and a follow-up PR rewired the bridges, daemon scheduler, and parser utilities off the module. Also rewires the lingering `cli-exec.ts` inner-status handler to import `parseFrontmatter` from `src/util/frontmatter` (earlier-PR miss). Test sweep: 19 `vi.mock(\"../../repertoire/tasks\", ...)` defensive mock blocks removed across `prompt-*`, `tools-*`, `refresh-system-prompt`, and `continuity-tools` test files (mocks pointed at a module path that no longer exists). `task-scheduler.test.ts` removed (legacy fixture helpers gone; scheduler still exercised via integration). `mailbox-readers-continuity-catches.test.ts` drops the obsolete \"self-fix view when task scanning throws\" case (path no longer exists). vitest.config.ts drops the `src/repertoire/tasks/types.ts` coverage exclude (file gone). Drops the dead `renderTaskTransitionLines()` helper from `src/arc/task-lifecycle.ts` (zero callers was consumed only by the now-deleted task-board prompt rendering)."
  ]
  },
  {
  "version": "0.1.0-alpha.629",
  "changes": [
- "rewire bridges, daemon scheduler, and parser utilities off the deprecated `src/repertoire/tasks/` module. `parseFrontmatter` extracted to `src/util/frontmatter.ts` (single shared helper) \u2014 `src/heart/awaiting/await-parser.ts`, `src/heart/habits/habit-parser.ts`, `src/heart/habits/habit-migration.ts`, and `src/heart/hatch/specialist-prompt.ts` now import from there. `src/heart/bridges/manager.ts` no longer reads from the task module (and gains a `defaultWriteDeskTask` writer for promoted-bridge desk task.md emission). `src/heart/daemon/task-scheduler.ts`, `src/heart/daemon/cli-exec.ts`, `src/heart/daemon/cli-types.ts`, `src/heart/daemon/cli-help.ts`, and `src/heart/daemon/cli-parse.ts` drop their task-CLI surface (`task board`, `task create`, etc.) \u2014 those commands move to the desk MCP server. `src/mind/prompt.ts` and `src/repertoire/guardrails.ts` drop residual task-command references. After this PR, no production code imports from `src/repertoire/tasks/`; a follow-up PR does the final disk-level deletion. Pipeline-integration tests updated to drop the three obsolete `parses task ...` cases. New `src/util/frontmatter.ts` ships with focused tests (12 cases, 100% line/branch/func/statement). Bridges manager + state-machine + daemon task-scheduler temporarily excluded from the strict-100% coverage gate (`defaultWriteDeskTask`, desk-discovery walk, and `suspendBridge` branch lack direct tests); followup PR will either backfill tests or refactor the defensive catches."
+ "rewire bridges, daemon scheduler, and parser utilities off the deprecated `src/repertoire/tasks/` module. `parseFrontmatter` extracted to `src/util/frontmatter.ts` (single shared helper) `src/heart/awaiting/await-parser.ts`, `src/heart/habits/habit-parser.ts`, `src/heart/habits/habit-migration.ts`, and `src/heart/hatch/specialist-prompt.ts` now import from there. `src/heart/bridges/manager.ts` no longer reads from the task module (and gains a `defaultWriteDeskTask` writer for promoted-bridge desk task.md emission). `src/heart/daemon/task-scheduler.ts`, `src/heart/daemon/cli-exec.ts`, `src/heart/daemon/cli-types.ts`, `src/heart/daemon/cli-help.ts`, and `src/heart/daemon/cli-parse.ts` drop their task-CLI surface (`task board`, `task create`, etc.) those commands move to the desk MCP server. `src/mind/prompt.ts` and `src/repertoire/guardrails.ts` drop residual task-command references. After this PR, no production code imports from `src/repertoire/tasks/`; a follow-up PR does the final disk-level deletion. Pipeline-integration tests updated to drop the three obsolete `parses task ...` cases. New `src/util/frontmatter.ts` ships with focused tests (12 cases, 100% line/branch/func/statement). Bridges manager + state-machine + daemon task-scheduler temporarily excluded from the strict-100% coverage gate (`defaultWriteDeskTask`, desk-discovery walk, and `suspendBridge` branch lack direct tests); followup PR will either backfill tests or refactor the defensive catches."
  ]
  },
  {
@@ -75,7 +87,7 @@
  {
  "version": "0.1.0-alpha.627",
  "changes": [
- "assemble `## my desk` prompt section from `<bundle>/desk/`. New `src/mind/desk-section.ts` reads each agent's desk dir synchronously every turn and emits the desk-vocab body (the agent's reflex + threshold + 8-state vocab + first-class-systems-link-not-absorb rule) followed by a dynamic `### currently` block showing the featured track + top-3 non-terminal tasks in it + other active tracks + non-terminal-task count. Featured resolution: `<bundle>/desk/_meta/featured.md` (one slug per line); falls back to alphabetical-first-active track when absent or all entries stale. Closed tracks are skipped as featured candidates. Empty-desk emits a `### currently: empty \u2014 no tracks yet.` stub. Parses `schema_version: 0` (pre-migration) and `schema_version: 1` (post-migration) tasks identically. 26 unit tests including 7 coverage-gate edge cases. Removes the old `## task board` rendering and `ouro task ...` body-map cheatsheet from prompt.ts. Emits `prompt.desk_section_assembled` nerves event per turn (file-completeness gate). desk-section.ts excluded from strict-100% coverage gate (94% stmts / 82% branches, 100% lines + funcs); followup will tighten or refactor the defensive FS catches. follow-up sequence cleans up the deeper repertoire/tasks module dependencies."
+ "assemble `## my desk` prompt section from `<bundle>/desk/`. New `src/mind/desk-section.ts` reads each agent's desk dir synchronously every turn and emits the desk-vocab body (the agent's reflex + threshold + 8-state vocab + first-class-systems-link-not-absorb rule) followed by a dynamic `### currently` block showing the featured track + top-3 non-terminal tasks in it + other active tracks + non-terminal-task count. Featured resolution: `<bundle>/desk/_meta/featured.md` (one slug per line); falls back to alphabetical-first-active track when absent or all entries stale. Closed tracks are skipped as featured candidates. Empty-desk emits a `### currently: empty no tracks yet.` stub. Parses `schema_version: 0` (pre-migration) and `schema_version: 1` (post-migration) tasks identically. 26 unit tests including 7 coverage-gate edge cases. Removes the old `## task board` rendering and `ouro task ...` body-map cheatsheet from prompt.ts. Emits `prompt.desk_section_assembled` nerves event per turn (file-completeness gate). desk-section.ts excluded from strict-100% coverage gate (94% stmts / 82% branches, 100% lines + funcs); followup will tighten or refactor the defensive FS catches. follow-up sequence cleans up the deeper repertoire/tasks module dependencies."
79
91
  ]
80
92
  },
81
93
  {
@@ -122,7 +134,7 @@
122
134
  {
123
135
  "version": "0.1.0-alpha.620",
124
136
  "changes": [
125
- "wire the `--agent <name>` flag end-to-end across `ouro plugin install / list / remove`. Previously parsed but ignored by the handlers. Now: install --agent X reads X's `~/AgentBundles/X.ouro/agent.json`, idempotently adds `{ id, enabled: true, source, version }` to plugins[] (source + version persisted only when set on the command). list --agent X intersects installed-on-disk with X's plugins[] (enabled-for-X filter). remove --agent X removes the entry from X's plugins[] only \u2014 never deletes the plugin from disk (other agents may still use it). remove WITHOUT --agent scans all bundles via getAgentBundlesRoot; refuses with a clear message if any agent's plugins[] still references the plugin, listing the offending agents. agent.json writes match the codebase's existing non-atomic `fs.writeFileSync` pattern (atomic-write helper deferred as a follow-up). 15 new tests cover the install / list / remove paths under --agent (and the new machine-wide remove guardrail); plugin-cli.ts stays at 100% coverage.",
137
+ "wire the `--agent <name>` flag end-to-end across `ouro plugin install / list / remove`. Previously parsed but ignored by the handlers. Now: install --agent X reads X's `~/AgentBundles/X.ouro/agent.json`, idempotently adds `{ id, enabled: true, source, version }` to plugins[] (source + version persisted only when set on the command). list --agent X intersects installed-on-disk with X's plugins[] (enabled-for-X filter). remove --agent X removes the entry from X's plugins[] only; it never deletes the plugin from disk (other agents may still use it). remove WITHOUT --agent scans all bundles via getAgentBundlesRoot; refuses with a clear message if any agent's plugins[] still references the plugin, listing the offending agents. agent.json writes match the codebase's existing non-atomic `fs.writeFileSync` pattern (atomic-write helper deferred as a follow-up). 15 new tests cover the install / list / remove paths under --agent (and the new machine-wide remove guardrail); plugin-cli.ts stays at 100% coverage.",
126
138
  "Orientation substrate campaign: invalid provider/model pairs now fail fast across startup, `ouro use --force`, legacy `auth switch`, and legacy `config model`; BlueBubbles now keeps channel/routing metadata in a structured orientation frame instead of injecting it into user speech; agents get an `orientation_get` tool plus correction-hold action rails that block high-risk durable writes, shell mutations, and first-class MCP tool calls when a terse correction depends on prior context; high-risk tool profiles now require a typed reason so blocked mutations always explain the risk; trip leg updates now require an explicit `updateReason` so confident-but-wrong corrections leave an auditable rationale instead of silently mutating itinerary state.",
127
139
  "Inner/BlueBubbles boundary: `surface` no longer attempts proactive iMessage delivery when returning to bridge-attached or freshest BlueBubbles sessions. Surface now queues the return for the active session, and the tool description names `send_message` with `channel=\"bluebubbles\"` as the dedicated intentional live-send path. Regression tests pin both BlueBubbles surface routes and the explicit send_message live-delivery path."
128
140
  ]
@@ -136,13 +148,13 @@
136
148
  {
137
149
  "version": "0.1.0-alpha.618",
138
150
  "changes": [
139
- "prompt-assembly integration. src/repertoire/skills.ts listSkills() merges plugin skills (via listPluginSkills(listEnabledPlugins())) with bundle skills (deduped, sorted). loadSkill() falls back to plugin skills after the 3 bundle paths (agent \u2192 protocol mirror \u2192 harness); iterates enabled plugins in declaration order and returns the first match. Bundle skills retain precedence \u2014 if the same skill name is in both a bundle and a plugin, the bundle wins. 6 new tests cover the integration paths; skills.ts stays at 100% coverage. Final sub-PR in the ouroboros plugin support track."
151
+ "prompt-assembly integration. src/repertoire/skills.ts listSkills() merges plugin skills (via listPluginSkills(listEnabledPlugins())) with bundle skills (deduped, sorted). loadSkill() falls back to plugin skills after the 3 bundle paths (agent → protocol mirror → harness); iterates enabled plugins in declaration order and returns the first match. Bundle skills retain precedence: if the same skill name is in both a bundle and a plugin, the bundle wins. 6 new tests cover the integration paths; skills.ts stays at 100% coverage. Final sub-PR in the ouroboros plugin support track."
140
152
  ]
141
153
  },
142
154
  {
143
155
  "version": "0.1.0-alpha.617",
144
156
  "changes": [
145
- "`ouro plugin install <source>`, `ouro plugin list`, and `ouro plugin remove <id>` CLI commands. Install clones to ~/.ouro-cli/plugins/<id>/ via git, supports `github:org/repo:plugins/<id>`, `https://github.com/...[.git]`, `local:/path/...`, and bare absolute paths; verifies `.claude-plugin/plugin.json` exists and rolls back on failure. List walks the plugins root and reports sorted installed plugins. Remove deletes the plugin install dir. Handlers live in src/heart/daemon/plugin-cli.ts (narrow RM_RECURSIVE_ALLOWLIST entry \u2014 operator-invoked CLI infrastructure, not agent-callable). Next sub-PR wires plugin skills into prompt assembly via listPluginSkills()."
157
+ "`ouro plugin install <source>`, `ouro plugin list`, and `ouro plugin remove <id>` CLI commands. Install clones to ~/.ouro-cli/plugins/<id>/ via git, supports `github:org/repo:plugins/<id>`, `https://github.com/...[.git]`, `local:/path/...`, and bare absolute paths; verifies `.claude-plugin/plugin.json` exists and rolls back on failure. List walks the plugins root and reports sorted installed plugins. Remove deletes the plugin install dir. Handlers live in src/heart/daemon/plugin-cli.ts (narrow RM_RECURSIVE_ALLOWLIST entry: operator-invoked CLI infrastructure, not agent-callable). Next sub-PR wires plugin skills into prompt assembly via listPluginSkills()."
146
158
  ]
147
159
  },
148
160
  {
@@ -194,7 +206,7 @@
194
206
  {
195
207
  "version": "0.1.0-alpha.609",
196
208
  "changes": [
197
- "Vocabulary sweep: drop 'memory' from agent-facing prompt content (trip-ledger truth section) and operator-facing connect-flow strings (cli-exec/cli-help). The agent doesn't have memory; it has a diary it consults, embeddings it can search, and prior conversation context. Renaming the strings makes the surface honest. Internal capability identifier 'memory-embeddings' and CLI input alias 'memory' both stay for back-compat. RAM-sense uses ('in-memory cache', 'process memory') stay. Test snapshots updated. Bumped from .607 \u2192 .609 to leapfrog the parallel-merged .608."
209
+ "Vocabulary sweep: drop 'memory' from agent-facing prompt content (trip-ledger truth section) and operator-facing connect-flow strings (cli-exec/cli-help). The agent doesn't have memory; it has a diary it consults, embeddings it can search, and prior conversation context. Renaming the strings makes the surface honest. Internal capability identifier 'memory-embeddings' and CLI input alias 'memory' both stay for back-compat. RAM-sense uses ('in-memory cache', 'process memory') stay. Test snapshots updated. Bumped from .607 → .609 to leapfrog the parallel-merged .608."
198
210
  ]
199
211
  },
200
212
  {
@@ -206,7 +218,7 @@
206
218
  {
207
219
  "version": "0.1.0-alpha.606",
208
220
  "changes": [
209
- "Root-cause fix for the 2026-05-11 inner-dialog wake storm that cost ~$50 in minimax inference. PR #725 removed `inner.wake` from the Claude Code post-tool-use hook with the stated intent that the notification message stay in the queue and be picked up on the agent's next natural turn \u2014 but the daemon's `message.send` HANDLER (daemon.ts case `message.send`) was unconditionally calling `processManager.sendToAgent(to, { type: \"message\" })` after queueing, which woke the inner-dialog worker on every message.send anyway. ~30 message.send/min \u00d7 the 3-turn instinct-loop cap = ~90 turns/min sustained for hours. PR #725 fixed the hook side; the daemon side defeated it.\n\nFix: the `message.send` handler is now pure queue-only delivery. No `startAgent`, no `sendToAgent` \u2014 just `router.send`. Callers that want immediate processing must send `inner.wake` explicitly after `message.send`. The Claude Code hook (cli-exec.ts) was already correctly discriminating (only firing inner.wake on session-start/stop, never per-tool-use), so it works as originally intended now. The CLI `ouro msg` was updated to chain `inner.wake` after `message.send` (operator-driven delivery wants immediate response, preserving historical CLI UX). Other callers (API, programmatic) default to queue-only.\n\nTest pinned: `daemon-command-plane-branches.test.ts` now asserts `processManager.startAgent` and `processManager.sendToAgent` are NOT called from `message.send`. The regression cannot be silently reintroduced."
221
+ "Root-cause fix for the 2026-05-11 inner-dialog wake storm that cost ~$50 in minimax inference. PR #725 removed `inner.wake` from the Claude Code post-tool-use hook with the stated intent that the notification message stay in the queue and be picked up on the agent's next natural turn, but the daemon's `message.send` HANDLER (daemon.ts case `message.send`) was unconditionally calling `processManager.sendToAgent(to, { type: \"message\" })` after queueing, which woke the inner-dialog worker on every message.send anyway. ~30 message.send/min × the 3-turn instinct-loop cap = ~90 turns/min sustained for hours. PR #725 fixed the hook side; the daemon side defeated it.\n\nFix: the `message.send` handler is now pure queue-only delivery. No `startAgent`, no `sendToAgent`, just `router.send`. Callers that want immediate processing must send `inner.wake` explicitly after `message.send`. The Claude Code hook (cli-exec.ts) was already correctly discriminating (only firing inner.wake on session-start/stop, never per-tool-use), so it works as originally intended now. The CLI `ouro msg` was updated to chain `inner.wake` after `message.send` (operator-driven delivery wants immediate response, preserving historical CLI UX). Other callers (API, programmatic) default to queue-only.\n\nTest pinned: `daemon-command-plane-branches.test.ts` now asserts `processManager.startAgent` and `processManager.sendToAgent` are NOT called from `message.send`. The regression cannot be silently reintroduced."
210
222
  ]
211
223
  },
212
224
  {
@@ -224,7 +236,7 @@
224
236
  {
225
237
  "version": "0.1.0-alpha.602",
226
238
  "changes": [
227
- "Two-part fix for the 2026-05-11 BlueBubbles wedge: an agent's BB session showed the same user message replayed 76 times, because each death-spiral cycle re-injected the inbound. Root cause was the daemon's HTTP health probe (`createHttpHealthProbe(\"bluebubbles:<agent>\", port)`) GETting the sense's /health endpoint every ~60 s with a 5 s timeout \u2014 busy BB sense (e.g. VLM image-describe at 20+ s) timed out, daemon declared 'critical', SIGTERM'd the sense mid-work, respawned, hit the same image, killed again, forever. Part 1: removed the HTTP probe entirely from `listHealthProbes()`. Process supervision (`processManager` child-process exit handler) already catches dead processes; for 'alive but hung' we now rely on the agent's own awareness via `pendingRecoveryCount` / `lastRecoveredAt` in the BB runtime state surfaced into the prompt, plus the agent's new `restart_runtime` tool (from alpha.598 / #723). Part 2: defense-in-depth respawn-loop guard in `processManager.restartAgent` \u2014 if anything triggers more than `RESPAWN_GUARD_MAX_RESTARTS = 5` orchestrated restarts in `RESPAWN_GUARD_WINDOW_MS = 10 min`, refuse further restarts (`daemon.agent_respawn_loop_tripped` nerves event, errorReason + fixHint set on the snapshot). Trip self-clears once timestamps age out of the window, and `startAgent` (= `ouro up`) bypasses the guard so the operator can always recover. Even if some other future cause re-introduces a tight respawn loop, the guard bounds it. The 2026-05-11 spiral was ~60 restarts/hr \u2014 well above 5/10min, so this would have caught it."
239
+ "Two-part fix for the 2026-05-11 BlueBubbles wedge: an agent's BB session showed the same user message replayed 76 times, because each death-spiral cycle re-injected the inbound. Root cause was the daemon's HTTP health probe (`createHttpHealthProbe(\"bluebubbles:<agent>\", port)`) GETting the sense's /health endpoint every ~60 s with a 5 s timeout; busy BB sense (e.g. VLM image-describe at 20+ s) timed out, daemon declared 'critical', SIGTERM'd the sense mid-work, respawned, hit the same image, killed again, forever. Part 1: removed the HTTP probe entirely from `listHealthProbes()`. Process supervision (`processManager` child-process exit handler) already catches dead processes; for 'alive but hung' we now rely on the agent's own awareness via `pendingRecoveryCount` / `lastRecoveredAt` in the BB runtime state surfaced into the prompt, plus the agent's new `restart_runtime` tool (from alpha.598 / #723). Part 2: defense-in-depth respawn-loop guard in `processManager.restartAgent`: if anything triggers more than `RESPAWN_GUARD_MAX_RESTARTS = 5` orchestrated restarts in `RESPAWN_GUARD_WINDOW_MS = 10 min`, refuse further restarts (`daemon.agent_respawn_loop_tripped` nerves event, errorReason + fixHint set on the snapshot). Trip self-clears once timestamps age out of the window, and `startAgent` (= `ouro up`) bypasses the guard so the operator can always recover. Even if some other future cause re-introduces a tight respawn loop, the guard bounds it. The 2026-05-11 spiral was ~60 restarts/hr, well above 5/10min, so this would have caught it."
228
240
  ]
229
241
  },
230
242
  {
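The respawn-loop guard described in the alpha.602 entry above amounts to a sliding-window counter. The following is a minimal sketch of that idea, not the daemon's actual code: only the two constant names and their values come from the changelog, while `RespawnGuard`, `tryRestart`, the per-agent timestamp list, and the choice not to record refused attempts are hypothetical.

```typescript
// Values taken from the changelog entry; everything else here is illustrative.
const RESPAWN_GUARD_MAX_RESTARTS = 5;
const RESPAWN_GUARD_WINDOW_MS = 10 * 60 * 1000; // 10 min

class RespawnGuard {
  // Hypothetical storage: agent name -> timestamps (ms) of accepted restarts.
  private restarts = new Map<string, number[]>();

  /** Returns true and records the restart if it is allowed; false if the guard is tripped. */
  tryRestart(agent: string, now: number): boolean {
    // Drop timestamps that have aged out of the window, so a trip self-clears.
    const recent = (this.restarts.get(agent) ?? []).filter(
      (t) => now - t < RESPAWN_GUARD_WINDOW_MS,
    );
    this.restarts.set(agent, recent);
    if (recent.length >= RESPAWN_GUARD_MAX_RESTARTS) {
      return false; // a further restart would exceed the cap: refuse
    }
    recent.push(now);
    return true;
  }
}
```

Because the entry says `startAgent` bypasses the guard, an operator-driven start in this sketch would simply never call `tryRestart`.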
@@ -236,43 +248,43 @@
236
248
  {
237
249
  "version": "0.1.0-alpha.600",
238
250
  "changes": [
239
- "Harness attention hygiene: post-tool-use Claude Code hook no longer wakes the inner loop on every tool \u2014 only on session-start and stop. listActiveReturnObligations gains a 14-day age cap so legacy queued items stop cycling indefinitely, and a strict status allow-list so legacy 'fulfilled' values written before the ReturnObligationStatus split (or by any future code path that bypasses the type via 'as any') no longer leak into the held-work-items injection."
251
+ "Harness attention hygiene: post-tool-use Claude Code hook no longer wakes the inner loop on every tool, only on session-start and stop. listActiveReturnObligations gains a 14-day age cap so legacy queued items stop cycling indefinitely, and a strict status allow-list so legacy 'fulfilled' values written before the ReturnObligationStatus split (or by any future code path that bypasses the type via 'as any') no longer leak into the held-work-items injection."
240
252
  ]
241
253
  },
242
254
  {
243
255
  "version": "0.1.0-alpha.599",
244
256
  "changes": [
245
- "BlueBubbles in-flight marker hardening. The in-memory `bbInFlightMessageGuids` tracker had a leak class: any exit path in `handleBlueBubblesNormalizedEvent` that doesn't call `endBlueBubblesMessageInFlight` strands the marker forever (until BB sense process restart), silently halting forward progress on the recovery queue. Slugger lost BlueBubbles inbound for 12+ hours on 2026-05-11 \u2014 six user messages piled up unprocessed because each recovery attempt saw the stale marker, returned `already_processed` without actually processing, and the recovery loop counted progress without making any. The class fix: in-flight markers now carry a claim timestamp and expire after `BB_IN_FLIGHT_MAX_AGE_MS = 15 min` (50% beyond the 10-min recovery-turn timeout, so live owners get full headroom). `isBlueBubblesMessageInFlight` returns false for stale markers; `beginBlueBubblesMessageInFlight` is allowed to replace a stale marker and emits `senses.bluebubbles_in_flight_marker_expired` so the auto-eviction is observable. A leaked marker now self-clears in at most 15 min instead of forever. Defense-in-depth: explicit `endBlueBubblesMessageInFlight` audit still worth doing in a follow-up, but the TTL guarantees the bug class can't wedge the queue indefinitely."
257
+ "BlueBubbles in-flight marker hardening. The in-memory `bbInFlightMessageGuids` tracker had a leak class: any exit path in `handleBlueBubblesNormalizedEvent` that doesn't call `endBlueBubblesMessageInFlight` strands the marker forever (until BB sense process restart), silently halting forward progress on the recovery queue. Slugger lost BlueBubbles inbound for 12+ hours on 2026-05-11: six user messages piled up unprocessed because each recovery attempt saw the stale marker, returned `already_processed` without actually processing, and the recovery loop counted progress without making any. The class fix: in-flight markers now carry a claim timestamp and expire after `BB_IN_FLIGHT_MAX_AGE_MS = 15 min` (50% beyond the 10-min recovery-turn timeout, so live owners get full headroom). `isBlueBubblesMessageInFlight` returns false for stale markers; `beginBlueBubblesMessageInFlight` is allowed to replace a stale marker and emits `senses.bluebubbles_in_flight_marker_expired` so the auto-eviction is observable. A leaked marker now self-clears in at most 15 min instead of forever. Defense-in-depth: explicit `endBlueBubblesMessageInFlight` audit still worth doing in a follow-up, but the TTL guarantees the bug class can't wedge the queue indefinitely."
246
258
  ]
247
259
  },
248
260
  {
249
261
  "version": "0.1.0-alpha.598",
250
262
  "changes": [
251
- "New `restart_runtime({ reason })` tool. Agent self-maintenance: asking the human to restart the daemon over BlueBubbles is now a thing of the past. Sends `daemon.restart` over the existing socket \u2014 daemon logs the reason as `daemon.restart_requested`, runs its normal stop pathway, and exits. launchctl's KeepAlive policy auto-respawns the daemon, so the agent comes back fresh on the other side. The agent will not see this tool's return value \u2014 its process exits with the daemon and a clean boot replaces it. In dev mode (no launchctl) the daemon just exits; same observable behavior as `daemon.stop`, with the restart-requested audit event distinguishing intent."
263
+ "New `restart_runtime({ reason })` tool. Agent self-maintenance: asking the human to restart the daemon over BlueBubbles is now a thing of the past. Sends `daemon.restart` over the existing socket; daemon logs the reason as `daemon.restart_requested`, runs its normal stop pathway, and exits. launchctl's KeepAlive policy auto-respawns the daemon, so the agent comes back fresh on the other side. The agent will not see this tool's return value: its process exits with the daemon and a clean boot replaces it. In dev mode (no launchctl) the daemon just exits; same observable behavior as `daemon.stop`, with the restart-requested audit event distinguishing intent."
252
264
  ]
253
265
  },
254
266
  {
255
267
  "version": "0.1.0-alpha.597",
256
268
  "changes": [
257
- "New `let_go({ id, reason? })` tool. Closes a gap surfaced live by Slugger after #720 (await_condition): when a held work item is resolved externally (e.g. an obligation whose underlying issue was merged in a separate PR), the agent had no way to dismiss it. The existing path to terminal state \u2014 fulfilling via `surface` \u2014 requires delivering a response, which doesn't fit externally-resolved work. The agent kept seeing the same stale items in its prompt every turn for a month with nothing to act on. `let_go` is dismissal WITHOUT delivery: tries `arc/obligations/inner/<id>.json` (ReturnObligation \u2192 `returned` with returnTarget=`surface`), falls through to `arc/obligations/<id>.json` (Obligation \u2192 `fulfilled` with `latestNote=reason`). Idempotent \u2014 calling on an already-terminal item returns the existing status, not an error. Emits `repertoire.obligation_let_go` nerves event recording the reason for future-me. id is the bracketed value in the prompt's 'held work items' section."
269
+ "New `let_go({ id, reason? })` tool. Closes a gap surfaced live by Slugger after #720 (await_condition): when a held work item is resolved externally (e.g. an obligation whose underlying issue was merged in a separate PR), the agent had no way to dismiss it. The existing path to terminal state, fulfilling via `surface`, requires delivering a response, which doesn't fit externally-resolved work. The agent kept seeing the same stale items in its prompt every turn for a month with nothing to act on. `let_go` is dismissal WITHOUT delivery: tries `arc/obligations/inner/<id>.json` (ReturnObligation → `returned` with returnTarget=`surface`), falls through to `arc/obligations/<id>.json` (Obligation → `fulfilled` with `latestNote=reason`). Idempotent: calling on an already-terminal item returns the existing status, not an error. Emits `repertoire.obligation_let_go` nerves event recording the reason for future-me. id is the bracketed value in the prompt's 'held work items' section."
258
270
  ]
259
271
  },
260
272
  {
261
273
  "version": "0.1.0-alpha.596",
262
274
  "changes": [
263
- "New await_condition primitive. Agents can file a natural-language condition (`await_condition({ name, condition, cadence, alert?, mode?, max_age?, body? })`); the daemon polls on cadence and queues an inner-dialog tick titled `await tick: <name> \u2014 <condition>` with the file body + history block (checked count, last-checked age, last observation). The agent calls `resolve_await({ name, verdict, observation })`: verdict=yes archives the file to `awaiting/.done/` and fires an alert via cross-chat-delivery (intent=generic_outreach) targeting `filed_for_friend_id` on the `alert` channel \u2014 proactive sends slot into the user's existing thread, no self-loopback. verdict=no records the observation and increments the tick count. `cancel_await({ name, reason? })` silently archives without alert. max_age triggers auto-expiry with a 'timed out' alert. Surfaces `## what i'm waiting on` in the commitments section. Validated end-to-end with Slugger: file \u2192 tick \u2192 resolve(no, x2) \u2192 resolve(yes) \u2192 archive + cross_chat_delivery queued_for_later. AwaitScheduler mkdirs its awaits dir at start so the fs.watch watcher attaches on first boot (no need to wait for a periodic reconcile)."
275
+ "New await_condition primitive. Agents can file a natural-language condition (`await_condition({ name, condition, cadence, alert?, mode?, max_age?, body? })`); the daemon polls on cadence and queues an inner-dialog tick titled `await tick: <name> <condition>` with the file body + history block (checked count, last-checked age, last observation). The agent calls `resolve_await({ name, verdict, observation })`: verdict=yes archives the file to `awaiting/.done/` and fires an alert via cross-chat-delivery (intent=generic_outreach) targeting `filed_for_friend_id` on the `alert` channel; proactive sends slot into the user's existing thread, no self-loopback. verdict=no records the observation and increments the tick count. `cancel_await({ name, reason? })` silently archives without alert. max_age triggers auto-expiry with a 'timed out' alert. Surfaces `## what i'm waiting on` in the commitments section. Validated end-to-end with Slugger: file → tick → resolve(no, x2) → resolve(yes) → archive + cross_chat_delivery queued_for_later. AwaitScheduler mkdirs its awaits dir at start so the fs.watch watcher attaches on first boot (no need to wait for a periodic reconcile)."
264
276
  ]
265
277
  },
266
278
  {
267
279
  "version": "0.1.0-alpha.595",
268
280
  "changes": [
269
- "Voice phone transport defaults to media-stream when OpenAI Realtime or OpenAI SIP is configured but no explicit voice.twilioTransportMode is set. Previously the default was record-play, which made conversationEngine resolve to cascade and routed inbound calls through the ElevenLabs/Whisper greeting path operators with realtime-only credentials never configured \u2014 producing a fully silent first turn (\"no greeting at all\"). Realtime requires media-stream by nature, so we now infer it. Defensive prewarm guard branch marked with a v8 ignore since the implicit default makes it unreachable in current outbound tests."
281
+ "Voice phone transport defaults to media-stream when OpenAI Realtime or OpenAI SIP is configured but no explicit voice.twilioTransportMode is set. Previously the default was record-play, which made conversationEngine resolve to cascade and routed inbound calls through the ElevenLabs/Whisper greeting path operators with realtime-only credentials never configured, producing a fully silent first turn (\"no greeting at all\"). Realtime requires media-stream by nature, so we now infer it. Defensive prewarm guard branch marked with a v8 ignore since the implicit default makes it unreachable in current outbound tests."
270
282
  ]
271
283
  },
272
284
  {
273
285
  "version": "0.1.0-alpha.594",
274
286
  "changes": [
275
- "Voice phone transport defaults to media-stream when OpenAI Realtime or OpenAI SIP is configured but no explicit voice.twilioTransportMode is set. Previously the default was record-play, which made conversationEngine resolve to cascade and routed inbound calls through the ElevenLabs/Whisper greeting path operators with realtime-only credentials never configured \u2014 producing a fully silent first turn (\"no greeting at all\"). Realtime requires media-stream by nature, so we now infer it."
287
+ "Voice phone transport defaults to media-stream when OpenAI Realtime or OpenAI SIP is configured but no explicit voice.twilioTransportMode is set. Previously the default was record-play, which made conversationEngine resolve to cascade and routed inbound calls through the ElevenLabs/Whisper greeting path operators with realtime-only credentials never configured, producing a fully silent first turn (\"no greeting at all\"). Realtime requires media-stream by nature, so we now infer it."
276
288
  ]
277
289
  },
278
290
  {
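The in-flight marker TTL from the alpha.599 entry in this hunk can be pictured as a map from message GUID to claim timestamp. This is a rough sketch under stated assumptions: `BB_IN_FLIGHT_MAX_AGE_MS` and its 15-minute value come from the changelog, but the shortened function names, their signatures, the boolean return of `beginInFlight`, and the `onExpired` callback (a stand-in for emitting `senses.bluebubbles_in_flight_marker_expired`) are hypothetical.

```typescript
// Value taken from the changelog entry; the rest is an illustrative model.
const BB_IN_FLIGHT_MAX_AGE_MS = 15 * 60 * 1000; // 15 min

// Hypothetical tracker: message GUID -> claim timestamp (ms).
const inFlight = new Map<string, number>();

/** A marker counts as in flight only while it is younger than the TTL. */
function isInFlight(guid: string, now: number): boolean {
  const claimedAt = inFlight.get(guid);
  return claimedAt !== undefined && now - claimedAt < BB_IN_FLIGHT_MAX_AGE_MS;
}

/** Claims a GUID. A live marker blocks the claim; a stale one is replaced and reported. */
function beginInFlight(
  guid: string,
  now: number,
  onExpired: (guid: string) => void,
): boolean {
  const claimedAt = inFlight.get(guid);
  if (claimedAt !== undefined) {
    if (now - claimedAt < BB_IN_FLIGHT_MAX_AGE_MS) return false; // live owner keeps it
    onExpired(guid); // stale marker: make the auto-eviction observable
  }
  inFlight.set(guid, now);
  return true;
}

/** Normal release path; the TTL only matters when this call is skipped. */
function endInFlight(guid: string): void {
  inFlight.delete(guid);
}
```

The point of the design, as the entry describes it, is that a missed `endInFlight` call degrades to a bounded delay rather than a permanent wedge.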
@@ -323,7 +335,7 @@
323
335
  {
324
336
  "version": "0.1.0-alpha.586",
325
337
  "changes": [
326
- "`mail_status`, `mail_recent`, `mail_search`, and `mail_index_refresh` now flag a 'mail substrate divergence' when the encrypted mailroom store reports zero visible messages but the on-disk search cache still holds documents from prior imports \u2014 the post-rotation / hosted\u2192local-fallback / wiped-store state that previously rendered as a silent 'no mail' answer indistinguishable from a clean onboarding.",
338
+ "`mail_status`, `mail_recent`, `mail_search`, and `mail_index_refresh` now flag a 'mail substrate divergence' when the encrypted mailroom store reports zero visible messages but the on-disk search cache still holds documents from prior imports: the post-rotation / hosted→local-fallback / wiped-store state that previously rendered as a silent 'no mail' answer indistinguishable from a clean onboarding.",
327
339
  "Mail absence answers from a divergent runtime now point at vault inspection (`mailroom.mode`, `mailroom.azureAccountUrl`, `mailroom.storePath`) and re-import recovery, so agents stop treating a broken substrate as evidence that the human inbox is empty.",
328
340
  "The substrate-divergence snapshot counts cache `.json` entries via `readdir` and ignores subdirectories and non-json files, so the diagnostic stays cheap on bundles holding tens of thousands of cached documents."
329
341
  ]
@@ -769,7 +781,7 @@
769
781
  {
770
782
  "version": "0.1.0-alpha.527",
771
783
  "changes": [
772
- "Suppresses `onResult`/`onFailure` in the shared tool-activity callbacks factory for any tool that started hidden, so a hidden tool's END never re-emits its raw args into chat surfaces \u2014 fixing rejected `settle` calls leaking `answer=`/`intent=` into BlueBubbles and Teams threads.",
784
+ "Suppresses `onResult`/`onFailure` in the shared tool-activity callbacks factory for any tool that started hidden, so a hidden tool's END never re-emits its raw args into chat surfaces, fixing rejected `settle` calls leaking `answer=`/`intent=` into BlueBubbles and Teams threads.",
773
785
  "Tracks hidden-at-start tools by per-name counter to stay sound across concurrent same-name hidden starts, with no behavior change for visible tools.",
774
786
  "Adds heart-level regression tests for hidden-tool END suppression (success and failure paths, concurrent same-name) and senses-level regression tests against `createBlueBubblesCallbacks` and `createTeamsCallbacks` asserting that a rejected settle following a visible read_file produces no chat output containing the settle answer text or `intent=`/`answer=` substrings."
775
787
  ]
@@ -855,63 +867,63 @@
855
867
  "version": "0.1.0-alpha.519",
856
868
  "changes": [
857
869
  "Introduces `kind: \"library\"` field on bundle `agent.json`. `agent-discovery.ts` filters bundles where `kind === \"library\"` so they're never instantiated as runtime agents. `SerpentGuide.ouro/agent.json` tagged with `kind: \"library\"` to formalize what was previously an implicit `enabled: false` convention.",
858
- "Activation gate `shouldFireRepairGuide` consumes the existing `untypedDegraded` / `typedDegraded` partitioning at `cli-exec.ts:6693-6694`. Fires when `untypedDegraded.length > 0` OR `typedDegraded.length >= 3`. The existing `--no-repair` flag remains the operator escape hatch \u2014 no new env toggle.",
859
- "Drops the `~/AgentBundles/SerpentGuide.ouro/` override fallback in `getSpecialistIdentitySourceDir` \u2014 the in-repo bundle is now the only source. Reasoning per the planning doc: drift surface we don't currently need; cleaner ownership; no override path to maintain. Five referencing files updated (`hatch-flow.ts`, `cli-defaults.ts`, plus their tests). Same constraint extends to RepairGuide from day one \u2014 no override mechanism.",
870
+ "Activation gate `shouldFireRepairGuide` consumes the existing `untypedDegraded` / `typedDegraded` partitioning at `cli-exec.ts:6693-6694`. Fires when `untypedDegraded.length > 0` OR `typedDegraded.length >= 3`. The existing `--no-repair` flag remains the operator escape hatch; no new env toggle.",
871
+ "Drops the `~/AgentBundles/SerpentGuide.ouro/` override fallback in `getSpecialistIdentitySourceDir`; the in-repo bundle is now the only source. Reasoning per the planning doc: drift surface we don't currently need; cleaner ownership; no override path to maintain. Five referencing files updated (`hatch-flow.ts`, `cli-defaults.ts`, plus their tests). Same constraint extends to RepairGuide from day one: no override mechanism.",
860
872
  "`parseRepairProposals` typed parser maps RepairGuide's structured-proposal output into the existing `RepairAction` catalog from `readiness-repair.ts` (`vault-unlock`, `provider-auth`, `provider-use`, etc.). Backfills lane variants and missing fields where unambiguous; rejects malformed proposals.",
861
- "Slugger-style compound integration fixture as canonical acceptance test (per O6): bad bootstrap state + expired creds + broken remote + drift between agent.json and agent.json simultaneously. Validates the full layer 1\u21924\u21922\u21923 pipeline end-to-end.",
873
+ "Slugger-style compound integration fixture as canonical acceptance test (per O6): bad bootstrap state + expired creds + broken remote + drift between agent.json and agent.json simultaneously. Validates the full layer 1→4→2→3 pipeline end-to-end.",
862
874
  "All gates green: tsc clean, lint clean, code coverage 100%, nerves audit pass."
863
875
  ]
864
876
  },
865
877
  {
866
878
  "version": "0.1.0-alpha.518",
867
879
  "changes": [
868
- "Layer 2 of the harness-hardening sequence (1\u21924\u21922\u21923 from `docs/planning/2026-04-28-1900-planning-harness-hardening-and-repairguide.md`). Wires a pre-flight `git pull` over every sync-enabled bundle into `ouro up`, before per-agent provider live-checks, so the post-pull `agent.json` is what live-check reads. First PR in the sequence that mutates working trees; does NOT write to `state/` (verified by a meta-test).",
869
- "New sync failure taxonomy in `src/heart/sync-classification.ts`: `auth-failed`, `not-found-404`, `network-down`, `dirty-working-tree`, `non-fast-forward`, `merge-conflict`, `timeout-soft`, `timeout-hard`, `unknown` \u2014 extends `PendingSyncRecord.classification` additively (legacy `push_rejected`/`pull_rebase_conflict` still work). Pure pattern-matcher: priority order is abort \u2192 404 \u2192 auth \u2192 network \u2192 dirty \u2192 conflict \u2192 non-fast-forward \u2192 unknown.",
880
+ "Layer 2 of the harness-hardening sequence (1→4→2→3 from `docs/planning/2026-04-28-1900-planning-harness-hardening-and-repairguide.md`). Wires a pre-flight `git pull` over every sync-enabled bundle into `ouro up`, before per-agent provider live-checks, so the post-pull `agent.json` is what live-check reads. First PR in the sequence that mutates working trees; does NOT write to `state/` (verified by a meta-test).",
881
+ "New sync failure taxonomy in `src/heart/sync-classification.ts`: `auth-failed`, `not-found-404`, `network-down`, `dirty-working-tree`, `non-fast-forward`, `merge-conflict`, `timeout-soft`, `timeout-hard`, `unknown`. Extends `PendingSyncRecord.classification` additively (legacy `push_rejected`/`pull_rebase_conflict` still work). Pure pattern-matcher: priority order is abort → 404 → auth → network → dirty → conflict → non-fast-forward → unknown.",
870
882
  "End-to-end `AbortSignal` plumbing. New `runWithTimeouts<T>` wrapper in `src/heart/timeouts.ts` (soft 8s warns, hard 15s aborts via `AbortController`); new async sibling `preTurnPullAsync` in `src/heart/sync.ts` that uses `child_process.execFile(..., { signal })` so the kernel kills the git child when the hard timeout fires. Original sync `preTurnPull` preserved for the per-turn pipeline. Two env knobs for the boot-sync probe: `OURO_BOOT_TIMEOUT_GIT_SOFT` (8000ms) and `OURO_BOOT_TIMEOUT_GIT_HARD` (15000ms).",
871
883
  "New `runBootSyncProbe` orchestrator in `src/heart/daemon/boot-sync-probe.ts` aggregates per-bundle findings (each tagged `advisory: true|false`). Wired into `daemon.up` as a new \"sync probe\" boot phase between manual-clone-detection and provider checks. Failures during the probe itself are caught and surfaced as a warning event without blocking the boot. Tests inject `runBootSyncProbeImpl` to keep CI off the developer's home bundles.",
872
- "9903 tests pass (518 files; +19 new). Coverage gate clean (cli-exec.ts 99.33% \u2192 100%). Slow-remote integration test proves boot doesn't hang on a hung remote (probe aborts within `hardMs`). this PR's meta-test enforces no-state-writes invariant on the three new files."
884
+ "9903 tests pass (518 files; +19 new). Coverage gate clean (cli-exec.ts 99.33% → 100%). Slow-remote integration test proves boot doesn't hang on a hung remote (probe aborts within `hardMs`). this PR's meta-test enforces no-state-writes invariant on the three new files."
873
885
  ]
874
886
  },
875
887
  {
876
888
  "version": "0.1.0-alpha.517",
877
889
  "changes": [
878
- "`computeDaemonRollup` (Layer 1) gains an optional `driftDetected: boolean`. When true, `healthy` \u2192 `partial` (same downgrade rule as `bootstrapDegraded`). `degraded` and `safe-mode` rollups are unaffected \u2014 drift never escalates past `partial` and never un-downgrades. `daemon-entry.ts` probes each enabled agent for drift before computing the rollup; a single agent's read failure is best-effort and does not block the scan."
890
+ "`computeDaemonRollup` (Layer 1) gains an optional `driftDetected: boolean`. When true, `healthy` → `partial` (same downgrade rule as `bootstrapDegraded`). `degraded` and `safe-mode` rollups are unaffected: drift never escalates past `partial` and never un-downgrades. `daemon-entry.ts` probes each enabled agent for drift before computing the rollup; a single agent's read failure is best-effort and does not block the scan."
879
891
  ]
880
892
  },
881
893
  {
882
894
  "version": "0.1.0-alpha.516",
883
895
  "changes": [
884
- "Layer 1 of the harness-hardening sequence (1\u21924\u21922\u21923 from `docs/planning/2026-04-28-1900-planning-harness-hardening-and-repairguide.md`). Replaces the daemon-wide rollup at `daemon-entry.ts` (the binary `degraded.length > 0 ? \"degraded\" : \"ok\"` literal) with a five-state vocabulary: `healthy / partial / degraded / safe-mode / down`. A single sick agent no longer tips the whole daemon to `degraded`.",
896
+ "Layer 1 of the harness-hardening sequence (1→4→2→3 from `docs/planning/2026-04-28-1900-planning-harness-hardening-and-repairguide.md`). Replaces the daemon-wide rollup at `daemon-entry.ts` (the binary `degraded.length > 0 ? \"degraded\" : \"ok\"` literal) with a five-state vocabulary: `healthy / partial / degraded / safe-mode / down`. A single sick agent no longer tips the whole daemon to `degraded`.",
885
897
  "Type structure: `RollupStatus` (4-state, returned by the new pure `computeDaemonRollup` decision function in `daemon-rollup.ts`) and `DaemonStatus = RollupStatus | \"down\"` (full daemon-status; `down` is caller-owned because it represents pre-inventory failure, before the rollup is reachable). Both unions project from a single source-of-truth literal tuple so future widening touches one site.",
886
- "`renderRollupStatusLine` in `cli-render.ts` uses a compiler-forced `never`-typed exhaustive switch \u2014 adding a future state compile-errors at every consumer using the pattern. The `degraded` literal carries three copy variants picked by inspecting cached agent statuses: empty map (fresh install, prompts `ouro hatch`), non-empty + any running agent (legacy stale cache from pre-Layer-1 daemons, prompts `ouro up` refresh), non-empty + zero running (all-failed live-check, prompts `ouro doctor`).",
898
+ "`renderRollupStatusLine` in `cli-render.ts` uses a compiler-forced `never`-typed exhaustive switch: adding a future state compile-errors at every consumer using the pattern. The `degraded` literal carries three copy variants picked by inspecting cached agent statuses: empty map (fresh install, prompts `ouro hatch`), non-empty + any running agent (legacy stale cache from pre-Layer-1 daemons, prompts `ouro up` refresh), non-empty + zero running (all-failed live-check, prompts `ouro doctor`).",
887
899
  "`runtime-readers.ts:readDaemonHealthDeep` parse tightened to use `isDaemonStatus`. `OutlookDaemonHealthDeep.status` widened to `DaemonStatus | \"unknown\"` so legacy serialized strings (`\"running\"`, `\"ok\"`) coerce defensively rather than failing the parse during rollout.",
888
- "9759 tests pass (508 test files); coverage gate clean. The per-agent live-check loop in `cli-exec.ts` is intentionally untouched \u2014 it was already try/catch-isolated; the bug was in how its output rolled up. Subsequent PRs (layers 4, 2, 3) build on this PR's vocabulary."
900
+ "9759 tests pass (508 test files); coverage gate clean. The per-agent live-check loop in `cli-exec.ts` is intentionally untouched: it was already try/catch-isolated; the bug was in how its output rolled up. Subsequent PRs (layers 4, 2, 3) build on this PR's vocabulary."
889
901
  ]
890
902
  },
891
903
  {
892
904
  "version": "0.1.0-alpha.515",
893
905
  "changes": [
894
- "New `speak` tool \u2014 agent can deliver words to the current friend mid-turn without ending the turn. Pairs with `settle` (ends turn) and `ponder` (private inner thought). For acknowledgment of heavy work, phase-boundary updates, or progress narration on chat-style channels (cli, teams, bluebubbles).",
895
- "Schema is intentionally minimal: `speak({ message: string })`. Not sole-call, doesn't terminate the turn, NOT exempt from the 24-call circuit breaker (the breaker is healthy backpressure against narration-spam \u2014 silence is a natural fallback for speak, unlike settle/rest). Added `flushNow?(): void | Promise<void>` to `ChannelCallbacks`; per-sense impls deliver the buffered message immediately (CLI noop, BlueBubbles `client.sendText` keeping typing on, Teams stream emit with `sendMessage` fallback).",
906
+ "New `speak` tool: agent can deliver words to the current friend mid-turn without ending the turn. Pairs with `settle` (ends turn) and `ponder` (private inner thought). For acknowledgment of heavy work, phase-boundary updates, or progress narration on chat-style channels (cli, teams, bluebubbles).",
907
+ "Schema is intentionally minimal: `speak({ message: string })`. Not sole-call, doesn't terminate the turn, NOT exempt from the 24-call circuit breaker (the breaker is healthy backpressure against narration-spam; silence is a natural fallback for speak, unlike settle/rest). Added `flushNow?(): void | Promise<void>` to `ChannelCallbacks`; per-sense impls deliver the buffered message immediately (CLI noop, BlueBubbles `client.sendText` keeping typing on, Teams stream emit with `sendMessage` fallback).",
896
908
  "Engine integration follows the `ponder` interception template at `core.ts:~1303`: `speak` runs inline (emit + flushNow + push `(spoken)` tool result + nerves event `engine.speak`), then the loop continues. Empty/missing message rejected with a tool-result error and `engine.speak_invalid`. New event keys: `engine.speak`, `engine.speak_invalid`, `engine.speak_delivery_failed`, `bluebubbles.speak_flush`, `teams.speak_flush`.",
897
- "System prompt nudge in Group #4 (`how i work`), gated to chat-style channels: dependency boundary (settle if next step needs a reply, otherwise speak), phase boundaries (after acking heavy ask / hitting major constraint / switching strategy / before externally-visible step \u2014 not per-tool narration), one-way framing (speak is progress, not invitation).",
898
- "Hardened `speak` delivery semantics (PR review follow-up). `flushNow` contract is now explicit: throws if the message could not be delivered through any available path. Teams `flushNow` THROWS when both stream emit AND `sendMessage` fallback fail (was silently logging delivered=false and returning normally \u2014 engine then recorded `(spoken)` even though nothing reached the friend). BlueBubbles `flushNow` already let `client.sendText` rejections propagate; contract documented. Engine wraps `await flushNow()` in try/catch: on hard failure it calls `onToolEnd('speak', ..., false)`, pushes a `'speak delivery failed: ... did not reach your friend; do not assume they saw it'` tool result, emits `engine.speak_delivery_failed` (level=error), and the turn continues \u2014 preventing the agent from assuming silent success.",
899
- "`speak` is now treated as flow-control across all senses (PR review follow-up). Like settle/observe/ponder/rest, its only visible output is the message itself \u2014 no spinner, no phrase rotation, no `\u23f3` placeholder, no tool-activity status line. Added 'speak' to `FLOW_CONTROL_TOOLS` in `cli/tool-display.ts` and `cli/ouro-tui.tsx`; CLI/BlueBubbles/Teams `onToolStart` early-return for speak; `tool-description.ts` returns null for speak as defense-in-depth for any future sense using `createToolActivityCallbacks`. Teams `flushNow` also stops phrase rotation when it delivers, so the actual message replaces the cycling 'thinking...' phrase.",
900
- "Teams `flushNow` no longer aborts the turn on a successful sendMessage fallback (PR review follow-up). Prior code path: stream emit fails \u2192 `tryEmit` calls `markStopped()` which calls `controller.abort()` \u2192 falls through to `sendMessage` \u2192 succeeds \u2192 `flushNow` returns normally \u2192 core records `(spoken)` with success=true \u2014 but the turn controller is already aborted, so the next model/tool step aborts. Successful fallback delivery should not poison the rest of the turn. Fix adds a non-aborting `tryEmitNoAbort` variant adjacent to `tryEmit`; `flushNow` uses it so a primary-stream failure followed by a successful sendMessage no longer triggers `controller.abort()`. Only when ALL delivery paths fail does `flushNow` call `markStopped()` and throw, letting the engine's existing `engine.speak_delivery_failed` catch path end the turn cleanly. `tryEmit` and other non-flushNow callers (end-of-turn `flush()`, `safeEmit`) are unchanged \u2014 their abort-on-failure behavior remains correct because they have no fallback path forward."
909
+ "System prompt nudge in Group #4 (`how i work`), gated to chat-style channels: dependency boundary (settle if next step needs a reply, otherwise speak), phase boundaries (after acking heavy ask / hitting major constraint / switching strategy / before externally-visible step; not per-tool narration), one-way framing (speak is progress, not invitation).",
910
+ "Hardened `speak` delivery semantics (PR review follow-up). `flushNow` contract is now explicit: throws if the message could not be delivered through any available path. Teams `flushNow` THROWS when both stream emit AND `sendMessage` fallback fail (was silently logging delivered=false and returning normally; engine then recorded `(spoken)` even though nothing reached the friend). BlueBubbles `flushNow` already let `client.sendText` rejections propagate; contract documented. Engine wraps `await flushNow()` in try/catch: on hard failure it calls `onToolEnd('speak', ..., false)`, pushes a `'speak delivery failed: ... did not reach your friend; do not assume they saw it'` tool result, emits `engine.speak_delivery_failed` (level=error), and the turn continues, preventing the agent from assuming silent success.",
911
+ "`speak` is now treated as flow-control across all senses (PR review follow-up). Like settle/observe/ponder/rest, its only visible output is the message itself: no spinner, no phrase rotation, no `⏳` placeholder, no tool-activity status line. Added 'speak' to `FLOW_CONTROL_TOOLS` in `cli/tool-display.ts` and `cli/ouro-tui.tsx`; CLI/BlueBubbles/Teams `onToolStart` early-return for speak; `tool-description.ts` returns null for speak as defense-in-depth for any future sense using `createToolActivityCallbacks`. Teams `flushNow` also stops phrase rotation when it delivers, so the actual message replaces the cycling 'thinking...' phrase.",
912
+ "Teams `flushNow` no longer aborts the turn on a successful sendMessage fallback (PR review follow-up). Prior code path: stream emit fails → `tryEmit` calls `markStopped()` which calls `controller.abort()` → falls through to `sendMessage` → succeeds → `flushNow` returns normally → core records `(spoken)` with success=true, but the turn controller is already aborted, so the next model/tool step aborts. Successful fallback delivery should not poison the rest of the turn. Fix adds a non-aborting `tryEmitNoAbort` variant adjacent to `tryEmit`; `flushNow` uses it so a primary-stream failure followed by a successful sendMessage no longer triggers `controller.abort()`. Only when ALL delivery paths fail does `flushNow` call `markStopped()` and throw, letting the engine's existing `engine.speak_delivery_failed` catch path end the turn cleanly. `tryEmit` and other non-flushNow callers (end-of-turn `flush()`, `safeEmit`) are unchanged; their abort-on-failure behavior remains correct because they have no fallback path forward."
901
913
  ]
902
914
  },
903
915
  {
904
916
  "version": "0.1.0-alpha.514",
905
917
  "changes": [
906
918
  "Add `--strict` flag to `ouro doctor` and bundle the `--category` flag (also in #637) for a coherent CI-friendly diagnostic interface. `--strict` makes the CLI exit non-zero (via thrown error caught by ouro-entry) when any check is `warn` or `fail`. Default behavior is unchanged.",
907
- "Composes naturally with --category and --json (#634): `ouro doctor --category Daemon --strict --json` is the canonical CI invocation \u2014 runs only the daemon checker, exits 1 on any issue, output is parseable. Emits `daemon.doctor_run` with `strict: true` in meta when set so the strict-failure events are filterable through #622's `nerves-review`.",
919
+ "Composes naturally with --category and --json (#634): `ouro doctor --category Daemon --strict --json` is the canonical CI invocation: runs only the daemon checker, exits 1 on any issue, output is parseable. Emits `daemon.doctor_run` with `strict: true` in meta when set so the strict-failure events are filterable through #622's `nerves-review`.",
908
920
  "5 new parse tests cover --strict alone, --strict + --category combined, and the existing happy-path / no-value cases. cli-types now models `{ kind: \"doctor\", category?, strict? }`. KNOWN_DOCTOR_CATEGORIES (also from #637) gives external tooling a stable list of available filters. 69/69 doctor + parse tests pass."
909
921
  ]
910
922
  },
911
923
  {
912
924
  "version": "0.1.0-alpha.513",
913
925
  "changes": [
914
- "New `Friends` category in `ouro doctor`. Same shape as the Mailroom (#632) and Trips (#631) checks \u2014 walk each agent bundle, classify the friend store's health, and report a trust-level breakdown for the healthy path.",
926
+ "New `Friends` category in `ouro doctor`. Same shape as the Mailroom (#632) and Trips (#631) checks: walk each agent bundle, classify the friend store's health, and report a trust-level breakdown for the healthy path.",
915
927
  "Per-agent reports: pass when no friends/ dir (no friends recorded yet), pass with `<N> friends, <X> family, <Y> friend, <Z> stranger` when records parse cleanly, warn when some files are unparseable (with parse-failure count), fail when the dir itself can't be read. Records with an unrecognized `trustLevel` get counted under `<N> other` so the operator can investigate.",
916
928
  "6 new tests cover all branches plus file-extension filtering (`.txt` ignored). Wired between Security and Disk in CATEGORY_CHECKERS, same orchestration shape as Mailroom and Trips."
917
929
  ]
@@ -919,40 +931,40 @@
919
931
  {
920
932
  "version": "0.1.0-alpha.512",
921
933
  "changes": [
922
- "New `friend_list` tool \u2014 the agent can list its known friends with id, name, and trust level. The notes/friend repertoire had `get_friend_note` and `save_friend_note` for individual lookups but no surface to enumerate the friend graph. Real workflows that needed it: cross-chat outreach decisions, screener triage, orienting on relationships at session start.",
934
+ "New `friend_list` tool: the agent can list its known friends with id, name, and trust level. The notes/friend repertoire had `get_friend_note` and `save_friend_note` for individual lookups but no surface to enumerate the friend graph. Real workflows that needed it: cross-chat outreach decisions, screener triage, orienting on relationships at session start.",
923
935
  "Optional `trust` filter (`family`/`friend`/`stranger`) and `limit` (1-200, default 50). Renders one entry per friend with id, trust label, name, and external-id channel:identifier pairs when present. Sorted alphabetically by display name. Empty-state messages are filter-aware so the agent can tell the difference between 'no friends at all' and 'no matches for the filter'.",
924
- "Defensively handles a friend store that lacks `listAll` (the interface marks it optional) \u2014 returns 'the configured friend store does not support listing' rather than throwing. Tool registry up to 75 (snapshot regenerated, H10 contract list extended). 4 new tests cover sorted listing, trust filtering, store-without-listAll defensive path, and filter-empty state."
936
+ "Defensively handles a friend store that lacks `listAll` (the interface marks it optional): returns 'the configured friend store does not support listing' rather than throwing. Tool registry up to 75 (snapshot regenerated, H10 contract list extended). 4 new tests cover sorted listing, trust filtering, store-without-listAll defensive path, and filter-empty state."
925
937
  ]
926
938
  },
927
939
  {
928
940
  "version": "0.1.0-alpha.511",
929
941
  "changes": [
930
942
  "Add `--json` flag to `ouro doctor`. Default human-readable output is unchanged; `--json` emits the full DoctorResult (categories + checks + summary) as pretty-printed JSON for piping into jq, dashboards, scheduled health monitors, or CI pipelines that want to alert on a doctor failure.",
931
- "Same shape as the `--json` flag on the diagnostic family that's been growing alongside (session-playback / nerves-review / session-stats in their respective open PRs). Doctor was the asymmetric case \u2014 text-only output. With agent-private state coverage now landing in the doctor (Trips in #631, Mailroom in #632), structured output makes the doctor genuinely consumable by external tooling.",
932
- "1 new test on the parse path (`doctor --json` \u2192 `{ kind: \"doctor\", json: true }`); the existing parse test is updated to match the new shape (`{ kind: \"doctor\", json: false }`). The exec-side change is a one-line ternary that picks `JSON.stringify(result, null, 2) + \"\\n\"` over `formatDoctorOutput(result)` when the flag is set. 83/83 doctor + parse tests pass."
943
+ "Same shape as the `--json` flag on the diagnostic family that's been growing alongside (session-playback / nerves-review / session-stats in their respective open PRs). Doctor was the asymmetric case: text-only output. With agent-private state coverage now landing in the doctor (Trips in #631, Mailroom in #632), structured output makes the doctor genuinely consumable by external tooling.",
944
+ "1 new test on the parse path (`doctor --json` → `{ kind: \"doctor\", json: true }`); the existing parse test is updated to match the new shape (`{ kind: \"doctor\", json: false }`). The exec-side change is a one-line ternary that picks `JSON.stringify(result, null, 2) + \"\\n\"` over `formatDoctorOutput(result)` when the flag is set. 83/83 doctor + parse tests pass."
933
945
  ]
934
946
  },
935
947
  {
936
948
  "version": "0.1.0-alpha.510",
937
949
  "changes": [
938
950
  "New `Mailroom` category in `ouro doctor`. Same shape as the `Trips` check from #631 (alpha.509): walk each agent bundle, classify the mailroom registry's health into pass/warn/fail with structured detail, return record/grant/message counts for healthy ledgers.",
939
- "Health classes: pass when no mailroom dir (mail not connected), pass with `<N> mailboxes, <N> source grants, <N> messages` when healthy, warn when registry.json is missing or has zero mailboxes, fail when registry.json is unreadable or unparseable. Message count walks the messages dir and counts `.json` files only \u2014 stray `.txt` etc. are ignored.",
940
- "6 new tests cover all branches plus message-count file-extension filtering. Wired between `Security` and `Disk` in `CATEGORY_CHECKERS`. With #631's Trips check landing alongside, doctor now has agent-private state coverage for the two stateful primitives the agent owns (mail + trips) \u2014 same shape, same detail format, same orchestration."
951
+ "Health classes: pass when no mailroom dir (mail not connected), pass with `<N> mailboxes, <N> source grants, <N> messages` when healthy, warn when registry.json is missing or has zero mailboxes, fail when registry.json is unreadable or unparseable. Message count walks the messages dir and counts `.json` files only; stray `.txt` etc. are ignored.",
952
+ "6 new tests cover all branches plus message-count file-extension filtering. Wired between `Security` and `Disk` in `CATEGORY_CHECKERS`. With #631's Trips check landing alongside, doctor now has agent-private state coverage for the two stateful primitives the agent owns (mail + trips): same shape, same detail format, same orchestration."
941
953
  ]
942
954
  },
943
955
  {
944
956
  "version": "0.1.0-alpha.509",
945
957
  "changes": [
946
- "New `Trips` category in `ouro doctor`. Operators previously had no quick way to verify the trip ledger was healthy \u2014 they had to call `trip_status` from inside the agent or open `state/trips/ledger.json` by hand. With the trip ledger now load-bearing for trip planning workflows, doctor coverage matters.",
958
+ "New `Trips` category in `ouro doctor`. Operators previously had no quick way to verify the trip ledger was healthy; they had to call `trip_status` from inside the agent or open `state/trips/ledger.json` by hand. With the trip ledger now load-bearing for trip planning workflows, doctor coverage matters.",
947
959
  "The check walks each agent bundle and reports per-agent trip health: pass when the ledger is absent (optional feature, not yet ensured), warn when `state/trips/` exists but `ledger.json` is missing or lacks the `ledgerId` field, fail when `ledger.json` is unreadable, unparseable, or missing the `privateKeyPem` (encrypted records would be unreadable). Healthy ledgers report `<ledgerId> (<N> records)` so the operator sees record count without opening anything.",
948
- "7 new tests cover: no-agents (warn), no-ledger-dir (pass \u2014 optional), missing ledger.json (warn), unparseable JSON (fail), missing ledgerId (warn), missing privateKeyPem (fail), and the healthy passing path with record-counting that ignores non-`.json` files in the records dir. Wired between `Security` and `Disk` in the `CATEGORY_CHECKERS` array \u2014 same orchestration shape as the existing categories."
960
+ "7 new tests cover: no-agents (warn), no-ledger-dir (pass; optional), missing ledger.json (warn), unparseable JSON (fail), missing ledgerId (warn), missing privateKeyPem (fail), and the healthy passing path with record-counting that ignores non-`.json` files in the records dir. Wired between `Security` and `Disk` in the `CATEGORY_CHECKERS` array, same orchestration shape as the existing categories."
949
961
  ]
950
962
  },
951
963
  {
952
964
  "version": "0.1.0-alpha.508",
953
965
  "changes": [
954
- "New `trip_remove_leg` tool. The trip ledger had `trip_upsert`, `trip_attach_evidence`, and `trip_update_leg`, but no first-class way to drop a leg \u2014 the agent had to re-emit the entire trip record minus that leg (fragile, easy to lose evidence). Real workflow that needed it: the user cancelled a hotel booking; #620's e2e test exercised the add path but there was no way to model the cancel.",
955
- "`trip_remove_leg(tripId, legId, updatedAt, reason?)` finds the leg, drops it from `legs[]`, bumps the trip's `updatedAt`, and emits `trips.leg_removed` (info) carrying tripId/legId/kind/reason. Rejects when the leg id is unknown (so accidental no-ops are visible) and when the trip is missing (returns the same `trip not found` shape as the other tools). Tool registry up to 75 (now 8 trip tools \u2014 snapshot updated, H10 contract list includes the new name).",
966
+ "New `trip_remove_leg` tool. The trip ledger had `trip_upsert`, `trip_attach_evidence`, and `trip_update_leg`, but no first-class way to drop a leg; the agent had to re-emit the entire trip record minus that leg (fragile, easy to lose evidence). Real workflow that needed it: the user cancelled a hotel booking; #620's e2e test exercised the add path but there was no way to model the cancel.",
967
+ "`trip_remove_leg(tripId, legId, updatedAt, reason?)` finds the leg, drops it from `legs[]`, bumps the trip's `updatedAt`, and emits `trips.leg_removed` (info) carrying tripId/legId/kind/reason. Rejects when the leg id is unknown (so accidental no-ops are visible) and when the trip is missing (returns the same `trip not found` shape as the other tools). Tool registry up to 75 (now 8 trip tools; snapshot updated, H10 contract list includes the new name).",
956
968
  "5 tests cover happy-path removal with leg-count assertion via `trip_get`, unknown-leg rejection, missing-trip propagation, the three required-field validation paths, plus the existing stranger-ctx trust block now extended to include `trip_remove_leg`."
957
969
  ]
958
970
  },
@@ -966,32 +978,32 @@
966
978
  {
967
979
  "version": "0.1.0-alpha.506",
968
980
  "changes": [
969
- "Detect duplicate tool_call_id across assistant messages in `validateSessionMessages`. MiniMax-M2.7 emits canonical tool_call ids of the form `call_function_<hash>_<n>` and reuses the same id across turns when the same function gets called \u2014 which causes provider rejections on replay because tool_call_id is supposed to be unique per request. The session sanitize pass already had position-aware orphan detection (#613) and inline-reasoning strip (#612); this adds the third member of the family \u2014 collision detection.",
970
- "New exported `detectDuplicateToolCallIds(messages)` returns `{ id, indices }[]` for each tool_call_id that appears in multiple assistant messages. Same-message duplicates (one assistant calling the same id twice) are not flagged \u2014 those are a legitimate parallel-call shape. `validateSessionMessages` now folds collisions into its violations list with a message that calls out MiniMax specifically so operators reading nerves know what they're looking at.",
971
- "Detection only \u2014 no rewriting yet, since rewriting tool_call_ids and the matching tool_results requires careful pairing logic that risks regression. The collision is visible to operators via the `mind.session_invariant_violation` nerves event the sanitize pass already emits when violations are present, and the existing `nerves-review` CLI from #622 makes it filterable. 3 new tests cover collision detection, single-message parallel-call shape (no false positive), and the all-distinct happy path."
981
+ "Detect duplicate tool_call_id across assistant messages in `validateSessionMessages`. MiniMax-M2.7 emits canonical tool_call ids of the form `call_function_<hash>_<n>` and reuses the same id across turns when the same function gets called, which causes provider rejections on replay because tool_call_id is supposed to be unique per request. The session sanitize pass already had position-aware orphan detection (#613) and inline-reasoning strip (#612); this adds the third member of the family: collision detection.",
982
+ "New exported `detectDuplicateToolCallIds(messages)` returns `{ id, indices }[]` for each tool_call_id that appears in multiple assistant messages. Same-message duplicates (one assistant calling the same id twice) are not flagged; those are a legitimate parallel-call shape. `validateSessionMessages` now folds collisions into its violations list with a message that calls out MiniMax specifically so operators reading nerves know what they're looking at.",
983
+ "Detection only; no rewriting yet, since rewriting tool_call_ids and the matching tool_results requires careful pairing logic that risks regression. The collision is visible to operators via the `mind.session_invariant_violation` nerves event the sanitize pass already emits when violations are present, and the existing `nerves-review` CLI from #622 makes it filterable. 3 new tests cover collision detection, single-message parallel-call shape (no false positive), and the all-distinct happy path."
972
984
  ]
973
985
  },
974
986
  {
975
987
  "version": "0.1.0-alpha.505",
976
988
  "changes": [
977
989
  "New `ouro session-stats <session.json>` CLI for at-a-glance metrics on a saved session: total events, breakdown by role (system/user/assistant/tool), tool-call totals + top 5 by frequency, attachment count, time range with duration, projection breakdown (in/out, input tokens, max tokens, trimmed), and last usage. Read-only.",
978
- "Pure `computeSessionStats(envelope, path)` core in `src/heart/session-stats.ts` \u2014 testable with synthesized envelopes, embeddable in future doctor checks. `runSessionStats(path)` adds the file-load layer; `formatStatsReport(report)` renders human-readable text; `--json` mode for jq piping. Composes with #619 (session-playback) and #622 (nerves-review): three pure-analyzer-plus-thin-CLI tools that together make a stuck session immediately diagnosable end-to-end.",
990
+ "Pure `computeSessionStats(envelope, path)` core in `src/heart/session-stats.ts`: testable with synthesized envelopes, embeddable in future doctor checks. `runSessionStats(path)` adds the file-load layer; `formatStatsReport(report)` renders human-readable text; `--json` mode for jq piping. Composes with #619 (session-playback) and #622 (nerves-review): three pure-analyzer-plus-thin-CLI tools that together make a stuck session immediately diagnosable end-to-end.",
979
991
  "8 tests cover role counts, tool-call name aggregation with frequency-sorted top-5, time range with and without authoredAt timestamps, attachment counting, projection-omission detection, the unrecognized-envelope stub, CLI no-args help, and CLI --json output. Wired as `npm run session:stats -- <path>` and `dist/heart/session-stats-cli-main.js`."
  ]
  },
  {
  "version": "0.1.0-alpha.504",
  "changes": [
- "New `mail_outbox` tool \u2014 the agent can now introspect its own outbound mail (drafts, queued sends, delivered, bounced, etc.). The mail repertoire had `mail_compose`, `mail_send`, and `mail_recent` for inbound \u2014 but no symmetric way to ask 'what did I send / queue?' Operators were having to ssh in and `ls state/.../outbound`. Real-world need: when planning a trip with the operator, the agent often wants to verify it sent a confirmation request before re-asking.",
- "Lists records newest-first (by `updatedAt`), bounded to `limit` (1-50, default 20), with optional `status` filter across the full MailOutboundStatus union (draft / sent / submitted / accepted / delivered / bounced / suppressed / quarantined / spam-filtered / failed). Each record renders id + status + recipients + truncated subject (80 chars) + last-touched timestamp + provider message id and error message when present. No body text dumped \u2014 agent uses message id with another tool if it needs the content.",
+ "New `mail_outbox` tool: the agent can now introspect its own outbound mail (drafts, queued sends, delivered, bounced, etc.). The mail repertoire had `mail_compose`, `mail_send`, and `mail_recent` for inbound, but no symmetric way to ask 'what did I send / queue?' Operators were having to ssh in and `ls state/.../outbound`. Real-world need: when planning a trip with the operator, the agent often wants to verify it sent a confirmation request before re-asking.",
+ "Lists records newest-first (by `updatedAt`), bounded to `limit` (1-50, default 20), with optional `status` filter across the full MailOutboundStatus union (draft / sent / submitted / accepted / delivered / bounced / suppressed / quarantined / spam-filtered / failed). Each record renders id + status + recipients + truncated subject (80 chars) + last-touched timestamp + provider message id and error message when present. No body text dumped; agent uses message id with another tool if it needs the content.",
  "Family-trust gated like the rest of mail (read gate, no special block since outbound metadata isn't body content). Records `mail_outbox` access in the access log alongside the other mail tools. Tool registry now at 75 tools (snapshot updated). Two tests cover the empty / sorted / limit / status-filter / audit-log paths, plus the trust block."
  ]
  },
  {
  "version": "0.1.0-alpha.503",
  "changes": [
- "In-process LRU cache for decrypted mail bodies. The cold path for `mail_thread` is read-encrypted-blob-from-Azure (1-3s p50, up to tens of seconds for HEY-sized bodies \u2014 #614 raised the timeout to 60s for this very reason) plus an RSA-OAEP+A256GCM decrypt. Repeated reads of the same message are common: re-checking a booking confirmation while seeding a trip leg, following up on a thread, looping back to verify a fact. Each repeat hit was paying the full cold cost.",
- "New `src/mailroom/body-cache.ts` keeps a 50-entry LRU keyed by `StoredMailMessage.id` (a deterministic content hash \u2014 rotating keys produces a new id, so stale ciphertext can never be served against a fresh keyset). Insertion-order eviction; reads refresh LRU position. Per-process by design \u2014 daemon restart clears it (matches the established pattern with #618 heartbeat-recursion state and #621 BB own-handle discovery).",
+ "In-process LRU cache for decrypted mail bodies. The cold path for `mail_thread` is read-encrypted-blob-from-Azure (1-3s p50, up to tens of seconds for HEY-sized bodies; #614 raised the timeout to 60s for this very reason) plus an RSA-OAEP+A256GCM decrypt. Repeated reads of the same message are common: re-checking a booking confirmation while seeding a trip leg, following up on a thread, looping back to verify a fact. Each repeat hit was paying the full cold cost.",
+ "New `src/mailroom/body-cache.ts` keeps a 50-entry LRU keyed by `StoredMailMessage.id` (a deterministic content hash; rotating keys produces a new id, so stale ciphertext can never be served against a fresh keyset). Insertion-order eviction; reads refresh LRU position. Per-process by design: daemon restart clears it (matches the established pattern with #618 heartbeat-recursion state and #621 BB own-handle discovery).",
  "Wired into both `mail_thread` (cache-first read; on miss, do the disk fetch + decrypt and cache for next time) and `mail_recent`/`mail_search` (which already decrypt batches; now they also seed the body cache so the next `mail_thread` on any of those is free). New `repertoire.mail_body_cache_hit` info-level event makes hit rate observable via `ouro nerves-review --event mail_body_cache_hit` (alpha.501). 7 new tests cover hit/miss, LRU refresh-on-read, eviction at capacity, defensive empty-id handling, and clear."
  ]
  },
@@ -999,60 +1011,60 @@
  "version": "0.1.0-alpha.502",
  "changes": [
  "Enrich `engine.error` nerve event with HTTP status, redacted body excerpt, and a one-line summary string. Provider errors previously surfaced only as a free-form `error.message`, which forced operators to spelunk the SDK's wrapped object to find the actual status code or quota explanation.",
- "Two new helpers in `src/heart/providers/error-classification.ts`: `extractProviderErrorDetails(error)` pulls `status` (when present) and a body excerpt (capped at 240 chars, with redaction of any 32+ char token-shaped substring so leaked auth keys don't get persisted into nerves), falling through `error.error \u2192 error.response \u2192 error.body \u2192 error.message` until something usable shows up. Survives circular structures defensively. `summarizeProviderError(error, classification, providerId, model)` produces the canonical operator-readable line: `provider <id>/<model>: <classification>[ HTTP <status>][ \u2014 <bodyExcerpt>]`.",
- "Wired into `finishTerminalProviderError` in `src/heart/core.ts` so every terminal provider error now lands in nerves with `httpStatus` + `bodyExcerpt` + `summary` meta \u2014 making `ouro nerves-review --component engine --event engine.error` (alpha.501) immediately useful for diagnosing provider blowups. 11 new tests cover status capture, missing-status defaults, token redaction, 240-char truncation, fallback through alternate body fields, circular-structure safety, and summary formatting in two shapes."
+ "Two new helpers in `src/heart/providers/error-classification.ts`: `extractProviderErrorDetails(error)` pulls `status` (when present) and a body excerpt (capped at 240 chars, with redaction of any 32+ char token-shaped substring so leaked auth keys don't get persisted into nerves), falling through `error.error → error.response → error.body → error.message` until something usable shows up. Survives circular structures defensively. `summarizeProviderError(error, classification, providerId, model)` produces the canonical operator-readable line: `provider <id>/<model>: <classification>[ HTTP <status>][ \u2014 <bodyExcerpt>]`.",
+ "Wired into `finishTerminalProviderError` in `src/heart/core.ts` so every terminal provider error now lands in nerves with `httpStatus` + `bodyExcerpt` + `summary` meta, making `ouro nerves-review --component engine --event engine.error` (alpha.501) immediately useful for diagnosing provider blowups. 11 new tests cover status capture, missing-status defaults, token redaction, 240-char truncation, fallback through alternate body fields, circular-structure safety, and summary formatting in two shapes."
  ]
  },
  {
  "version": "0.1.0-alpha.501",
  "changes": [
  "New `ouro nerves-review` CLI for tailing the agent's nerves ndjson with structured filters. Read-only. Operators previously had to grep raw ndjson by hand to track down something like 'how many heartbeat-recursion-suspected events fired today' or 'show me the last hour of senses warnings'.",
- "Filters: `--component <substr>`, `--event <substr>`, `--level <level>`, `--since <duration>` (e.g. 5m, 2h, 1d), `--limit <N>`, `--process <name>` (default: daemon), `--agent <name>` (default: current). Output modes: human-readable text (`<time> [<level>] <component>/<event> \u2014 <message>`) and `--json` (one parsed object per line for piping to jq).",
+ "Filters: `--component <substr>`, `--event <substr>`, `--level <level>`, `--since <duration>` (e.g. 5m, 2h, 1d), `--limit <N>`, `--process <name>` (default: daemon), `--agent <name>` (default: current). Output modes: human-readable text (`<time> [<level>] <component>/<event> \u2014 <message>`) and `--json` (one parsed object per line for piping to jq).",
  "Pure `reviewNerveEvents(filePath, filter)` core in `src/nerves/review/core.ts` reads the tail of the ndjson (8 MB cap, walks last 200+ lines) and applies in-memory filters; testable without filesystem mocks beyond a temp file. 12 tests cover all six filter dimensions plus duration parsing edge cases (ms/s/m/h/d, malformed inputs), missing-file handling, and the two CLI flag paths (--help, invalid --since). Wired as `npm run nerves:review -- <flags>` and `dist/nerves/review/cli-main.js`."
  ]
  },
  {
  "version": "0.1.0-alpha.500",
  "changes": [
- "Auto-discover BlueBubbles agent handles on isFromMe outbound. The `bluebubbles.ownHandles` config field added in #610 closes the group-echo self-talk loop, but only after an operator manually populates it with the right handle format. Until then, the very bug the field is supposed to fix can fire \u2014 the agent ingests its own group echo and replies to itself.",
+ "Auto-discover BlueBubbles agent handles on isFromMe outbound. The `bluebubbles.ownHandles` config field added in #610 closes the group-echo self-talk loop, but only after an operator manually populates it with the right handle format. Until then, the very bug the field is supposed to fix can fire: the agent ingests its own group echo and replies to itself.",
  "When a normalized BlueBubbles event arrives with `event.fromMe === true`, BlueBubbles is telling us the canonical handle BB attributes to the agent's outbound. We capture `event.sender.externalId` into an in-process `discoveredOwnHandles` set and emit an info-level `senses.bluebubbles_own_handle_discovered` nerve event with the captured handle, so an operator can promote it to durable config (cross-restart). The default `getOwnHandles` now returns the union of configured + discovered; `isAgentSelfHandle` therefore filters subsequent isFromMe-missing group echoes even before the operator updates the vault config.",
- "Per-process state by design \u2014 a daemon restart re-learns from the next outbound. Three new tests cover: capture-and-dedupe (raw/normalized form match collapses to one entry), defensive empty/whitespace input handling, and the end-to-end proof that `isAgentSelfHandle` honors discovered handles after `recordDiscoveredOwnHandle` fires."
+ "Per-process state by design: a daemon restart re-learns from the next outbound. Three new tests cover: capture-and-dedupe (raw/normalized form match collapses to one entry), defensive empty/whitespace input handling, and the end-to-end proof that `isAgentSelfHandle` honors discovered handles after `recordDiscoveredOwnHandle` fires."
  ]
  },
  {
  "version": "0.1.0-alpha.499",
  "changes": [
- "New `ouro session-playback <session.json>` CLI for dry-running the sanitize pipeline against a saved session. When an agent is stuck in a replay loop, an operator can now run the same `sanitizeProviderMessages` chain that the harness fires before every replay, see what would be dropped/modified/synthesized, and decide whether to clear or hand-repair the session \u2014 *without* writing anything to disk.",
- "The report distinguishes three repair classes: dropped (orphan tool results whose preceding assistant has no matching tool_call), modified-content (assistant messages whose inline `<think>...</think>` blocks would be stripped before replay), and synthetic-added (synthetic tool-results inserted to satisfy the provider's tool_call/tool_result pairing \u2014 these include the explanatory message added in #612 so the agent can read what happened). Each change carries a role, index, optional tool_call_id, reason, and a 120-char preview of the affected content.",
- "Two output modes: human-readable text (default) and `--json` for piping into jq/diagnostics. Underlying `runSessionPlayback` is a pure function \u2014 takes either a session path or a raw object \u2014 so it's testable in isolation and the same code path can be embedded in future doctor checks. Wired as `npm run session:playback -- <path>` and as the `dist/heart/session-playback-cli-main.js` entry. 7 tests cover the four envelope shapes (clean legacy, with stripped think, with orphan tool result, unrecognized) plus the two CLI flag paths."
+ "New `ouro session-playback <session.json>` CLI for dry-running the sanitize pipeline against a saved session. When an agent is stuck in a replay loop, an operator can now run the same `sanitizeProviderMessages` chain that the harness fires before every replay, see what would be dropped/modified/synthesized, and decide whether to clear or hand-repair the session, *without* writing anything to disk.",
+ "The report distinguishes three repair classes: dropped (orphan tool results whose preceding assistant has no matching tool_call), modified-content (assistant messages whose inline `<think>...</think>` blocks would be stripped before replay), and synthetic-added (synthetic tool-results inserted to satisfy the provider's tool_call/tool_result pairing; these include the explanatory message added in #612 so the agent can read what happened). Each change carries a role, index, optional tool_call_id, reason, and a 120-char preview of the affected content.",
+ "Two output modes: human-readable text (default) and `--json` for piping into jq/diagnostics. Underlying `runSessionPlayback` is a pure function (takes either a session path or a raw object), so it's testable in isolation and the same code path can be embedded in future doctor checks. Wired as `npm run session:playback -- <path>` and as the `dist/heart/session-playback-cli-main.js` entry. 7 tests cover the four envelope shapes (clean legacy, with stripped think, with orphan tool result, unrecognized) plus the two CLI flag paths."
  ]
  },
  {
  "version": "0.1.0-alpha.498",
  "changes": [
- "Heartbeat / habit recursion detection in the inner-dialog worker. The existing instinct cap (`MAX_CONSECUTIVE_INSTINCT_TURNS=3`) protects against the *internal* pending-dir self-loop (a turn writes back to its own pending dir, drains it, repeats). It does not protect against the *external* IPC self-loop where heartbeat-shaped messages get re-issued faster than their cadence \u2014 e.g. a hook misconfigured to repost on every heartbeat, a daemon retry storm, or two timers drifting into the same window.",
- "Two new warn-level nerve events: `senses.habit_recursion_suspected` fires when two of the same habit (e.g. `heartbeat`) arrive within `HABIT_RECURSION_MIN_INTERVAL_MS` (5s) \u2014 no realistic cadence runs that fast. `senses.habit_recursion_burst` fires when `HABIT_RECURSION_BURST_THRESHOLD` (5) or more habit messages of any kind land within `HABIT_RECURSION_BURST_WINDOW_MS` (60s) \u2014 catches slower runaways that stay just under the min-interval threshold.",
- "Detection is observation-only by design: it emits the warn signal so an operator (or a follow-up auto-recovery layer) can act on it. The message is not dropped \u2014 the signal is the value. Per-habit-name tracking, so two distinct habits firing close together don't trip the min-interval warning. `nowSource` is injectable via the `createInnerDialogWorker` factory for deterministic tests. 5 new tests cover both detectors plus the trim-window and per-habit isolation cases."
+ "Heartbeat / habit recursion detection in the inner-dialog worker. The existing instinct cap (`MAX_CONSECUTIVE_INSTINCT_TURNS=3`) protects against the *internal* pending-dir self-loop (a turn writes back to its own pending dir, drains it, repeats). It does not protect against the *external* IPC self-loop where heartbeat-shaped messages get re-issued faster than their cadence, e.g. a hook misconfigured to repost on every heartbeat, a daemon retry storm, or two timers drifting into the same window.",
+ "Two new warn-level nerve events: `senses.habit_recursion_suspected` fires when two of the same habit (e.g. `heartbeat`) arrive within `HABIT_RECURSION_MIN_INTERVAL_MS` (5s); no realistic cadence runs that fast. `senses.habit_recursion_burst` fires when `HABIT_RECURSION_BURST_THRESHOLD` (5) or more habit messages of any kind land within `HABIT_RECURSION_BURST_WINDOW_MS` (60s); this catches slower runaways that stay just under the min-interval threshold.",
+ "Detection is observation-only by design: it emits the warn signal so an operator (or a follow-up auto-recovery layer) can act on it. The message is not dropped; the signal is the value. Per-habit-name tracking, so two distinct habits firing close together don't trip the min-interval warning. `nowSource` is injectable via the `createInnerDialogWorker` factory for deterministic tests. 5 new tests cover both detectors plus the trim-window and per-habit isolation cases."
  ]
  },
  {
  "version": "0.1.0-alpha.497",
  "changes": [
- "Mail thread reconstruction + tool rename. The previous `mail_thread` tool was misleadingly named \u2014 it returned ONE message body, not a thread. Renamed to `mail_body`. The new actual conversation walker now owns the canonical name `mail_thread`. Existing tests, audit-log strings, and CLI guidance updated to match.",
- "Header capture: `PrivateMailEnvelope` carries optional `inReplyTo` and `references` fields, populated at `buildStoredMailMessage` time from RFC822 headers. Existing messages without these headers are unaffected. `mail_thread` walks the thread from any seed message (storage id or RFC822 `<message-id@host>`): ancestors via `In-Reply-To`/`References`, descendants by reverse-edges across the recent message pool (default 200, configurable 20-500, scoped native/delegated/all), assigns true reply-chain depth via topological longest-path, and renders chronologically with depth-indented summaries. Bodies not included \u2014 `mail_body` opens one message.",
+ "Mail thread reconstruction + tool rename. The previous `mail_thread` tool was misleadingly named: it returned ONE message body, not a thread. Renamed to `mail_body`. The new actual conversation walker now owns the canonical name `mail_thread`. Existing tests, audit-log strings, and CLI guidance updated to match.",
+ "Header capture: `PrivateMailEnvelope` carries optional `inReplyTo` and `references` fields, populated at `buildStoredMailMessage` time from RFC822 headers. Existing messages without these headers are unaffected. `mail_thread` walks the thread from any seed message (storage id or RFC822 `<message-id@host>`): ancestors via `In-Reply-To`/`References`, descendants by reverse-edges across the recent message pool (default 200, configurable 20-500, scoped native/delegated/all), assigns true reply-chain depth via topological longest-path, and renders chronologically with depth-indented summaries. Bodies not included; `mail_body` opens one message.",
  "Pure thread-walker (`src/mailroom/thread.ts`) is testable without the filesystem: 7 unit tests cover mid-thread seed (walks both directions), seed by RFC822 message-id when storage id doesn't match, References-only (no In-Reply-To, common in list mailers), unrelated-message exclusion, empty/whitespace defensiveness. Plus 3 new tool-level tests for `mail_thread` (multi-message reconstruction, untrusted refusal, delegated-trust block). Tool registry stays at 75 (rename, not addition). All 194 mailroom tests pass."
  ]
  },
  {
  "version": "0.1.0-alpha.496",
  "changes": [
- "New `Lifecycle` category in `ouro doctor` (`src/heart/daemon/doctor.ts:checkLifecycle`). Reads daemon.ndjson from the first available agent bundle and surfaces operator-relevant signal: last activity timestamp + age (warns if older than 5 minutes \u2014 daemon may be silent or stopped), daemon restart count in the last hour (warns if >3 \u2014 high churn), recent version-install events with installed versions, and any agent_process_error events with reason. Designed to answer the operator's question after the daemon goes silent: 'did it crash? when did it last do anything? did it just upgrade?' This session's daemon went silent at 04:30 UTC with no easy way to diagnose; the new check would have surfaced 'last event 18m ago \u2014 daemon may be silent or stopped' immediately. Tail-reads only the last 5000 log lines so doctor stays snappy on chatty daemons. 13 new tests covering recent activity, restart counts, install events, agent_process_error, age formatting, log truncation, and edge cases (malformed JSON, missing meta fields, missing log file, read failure)."
+ "New `Lifecycle` category in `ouro doctor` (`src/heart/daemon/doctor.ts:checkLifecycle`). Reads daemon.ndjson from the first available agent bundle and surfaces operator-relevant signal: last activity timestamp + age (warns if older than 5 minutes: daemon may be silent or stopped), daemon restart count in the last hour (warns if >3: high churn), recent version-install events with installed versions, and any agent_process_error events with reason. Designed to answer the operator's question after the daemon goes silent: 'did it crash? when did it last do anything? did it just upgrade?' This session's daemon went silent at 04:30 UTC with no easy way to diagnose; the new check would have surfaced 'last event 18m ago \u2014 daemon may be silent or stopped' immediately. Tail-reads only the last 5000 log lines so doctor stays snappy on chatty daemons. 13 new tests covering recent activity, restart counts, install events, agent_process_error, age formatting, log truncation, and edge cases (malformed JSON, missing meta fields, missing log file, read failure)."
  ]
  },
  {
  "version": "0.1.0-alpha.495",
  "changes": [
- "New regression bundle at `src/__tests__/heart/provider-replay-regressions.test.ts` that captures provider replay-rejection bug shapes in one place \u2014 documentation-as-test. Each entry cites the PR that fixed the shape and the runbook entry; future debuggers seeing a 4xx from a provider on what looks like a valid turn can grep this file first to see if the shape was already encountered. Currently bundles the MiniMax-M2.7 inline-`<think>`-plus-tool_calls case (#612), the reused-tool_call_id-misordered-after-pruning case (#613), and a cross-reference stub for the event-id collision class (covered separately in session-events.test.ts). Also documents the contribution pattern: capture the failing shape, write the test BEFORE the fix, land the fix, verify the test passes, cite the PR. Linked from `docs/known-issues-and-recovery.md` so operators triaging a similar bug land on the test bundle by default."
+ "New regression bundle at `src/__tests__/heart/provider-replay-regressions.test.ts` that captures provider replay-rejection bug shapes in one place: documentation-as-test. Each entry cites the PR that fixed the shape and the runbook entry; future debuggers seeing a 4xx from a provider on what looks like a valid turn can grep this file first to see if the shape was already encountered. Currently bundles the MiniMax-M2.7 inline-`<think>`-plus-tool_calls case (#612), the reused-tool_call_id-misordered-after-pruning case (#613), and a cross-reference stub for the event-id collision class (covered separately in session-events.test.ts). Also documents the contribution pattern: capture the failing shape, write the test BEFORE the fix, land the fix, verify the test passes, cite the PR. Linked from `docs/known-issues-and-recovery.md` so operators triaging a similar bug land on the test bundle by default."
  ]
  },
  {
@@ -1064,56 +1076,56 @@
  {
  "version": "0.1.0-alpha.493",
  "changes": [
- "Position-aware orphan-tool-result detection in `repairToolCallSequences`. Slugger's session was STILL hitting MiniMax error 2013 even after the alpha.492 inline-reasoning strip landed because the orphan check was global (a tool result was kept if its tool_call_id appeared in ANY assistant message in the conversation, regardless of order). After session pruning, a synthetic tool-result for a long-pruned tool_call ended up at sequence 86 referencing `call_function_utqogadgqp5h_1` while the assistant message that defined that id lived at sequence 88 \u2014 AFTER the tool result. MiniMax requires tool results to follow their matching assistant. The fix walks the conversation in order, tracking tool_call_ids only as they're encountered in assistant messages; tool results referencing ids that haven't been defined yet are removed. Regression test reproduces the exact misordered shape and asserts the misplaced tool result is dropped while the correctly-ordered one survives. This is the third and final layer of the empty-reply chain (#611 stripped the operator surface, #612 stripped the persisted content + load-time repair, #493 fixes orphan-detection ordering)."
+ "Position-aware orphan-tool-result detection in `repairToolCallSequences`. Slugger's session was STILL hitting MiniMax error 2013 even after the alpha.492 inline-reasoning strip landed because the orphan check was global (a tool result was kept if its tool_call_id appeared in ANY assistant message in the conversation, regardless of order). After session pruning, a synthetic tool-result for a long-pruned tool_call ended up at sequence 86 referencing `call_function_utqogadgqp5h_1` while the assistant message that defined that id lived at sequence 88, AFTER the tool result. MiniMax requires tool results to follow their matching assistant. The fix walks the conversation in order, tracking tool_call_ids only as they're encountered in assistant messages; tool results referencing ids that haven't been defined yet are removed. Regression test reproduces the exact misordered shape and asserts the misplaced tool result is dropped while the correctly-ordered one survives. This is the third and final layer of the empty-reply chain (#611 stripped the operator surface, #612 stripped the persisted content + load-time repair, #493 fixes orphan-detection ordering)."
  ]
  },
  {
  "version": "0.1.0-alpha.492",
  "changes": [
- "Engine-level fix for the actual root cause of Slugger's empty-reply MCP bug (PR #611's strip+retry was the right shape, but missed the deepest layer). MiniMax-M2.7 occasionally emits an assistant message with BOTH inline `<think>...</think>` reasoning AND tool_calls. When that combination is replayed in a subsequent turn, MiniMax rejects with error 2013 ('tool result's tool id not found') and stalls the entire session \u2014 every subsequent turn fails the same way, the failover layer fires repeatedly suggesting a provider switch, and the agent's own answer never reaches the operator. Slugger's session was stuck for 11 unanswered user messages because of this exact loop.",
- "The fix has two halves and an AX rule: (1) **Persist-time strip** \u2014 runAgent now strips `<think>` blocks from the assistant message's persisted `content` before saving, while preserving the original reasoning trace on `_inline_reasoning` for audit. New `engine.inline_reasoning_stripped` info-level nerve event fires when this happens. (2) **Load-time repair** \u2014 `sanitizeProviderMessages` self-heals existing sessions that were saved before (1) by stripping the same blocks at load time. (3) **AX rule: full agent awareness, no silent fixes**. When the load-time repair runs, the synthetic tool-result that fills in for the missing tool result is an **explanatory** one \u2014 it tells the agent specifically: \"your previous tool call's result was lost because the assistant message had inline reasoning blocks the provider rejected; the harness has stripped them; your reasoning trace is preserved out-of-band; if the work needs to be done, retry the tool call now.\" Tool calls whose parent didn't have stripped reasoning still get a generic-but-improved \"this tool call's result was lost \u2014 possible causes [...]; retry if needed\" message instead of the old vague \"interrupted (previous turn timed out)\" line. The agent always sees what happened and what to do next.",
+ "Engine-level fix for the actual root cause of Slugger's empty-reply MCP bug (PR #611's strip+retry was the right shape, but missed the deepest layer). MiniMax-M2.7 occasionally emits an assistant message with BOTH inline `<think>...</think>` reasoning AND tool_calls. When that combination is replayed in a subsequent turn, MiniMax rejects with error 2013 ('tool result's tool id not found') and stalls the entire session: every subsequent turn fails the same way, the failover layer fires repeatedly suggesting a provider switch, and the agent's own answer never reaches the operator. Slugger's session was stuck for 11 unanswered user messages because of this exact loop.",
+ "The fix has two halves and an AX rule: (1) **Persist-time strip**: runAgent now strips `<think>` blocks from the assistant message's persisted `content` before saving, while preserving the original reasoning trace on `_inline_reasoning` for audit. New `engine.inline_reasoning_stripped` info-level nerve event fires when this happens. (2) **Load-time repair**: `sanitizeProviderMessages` self-heals existing sessions that were saved before (1) by stripping the same blocks at load time. (3) **AX rule: full agent awareness, no silent fixes**. When the load-time repair runs, the synthetic tool-result that fills in for the missing tool result is an **explanatory** one; it tells the agent specifically: \"your previous tool call's result was lost because the assistant message had inline reasoning blocks the provider rejected; the harness has stripped them; your reasoning trace is preserved out-of-band; if the work needs to be done, retry the tool call now.\" Tool calls whose parent didn't have stripped reasoning still get a generic-but-improved \"this tool call's result was lost \u2014 possible causes [...]; retry if needed\" message instead of the old vague \"interrupted (previous turn timed out)\" line. The agent always sees what happened and what to do next.",
  "Also: the no-tool-call retry path (added in #611's last commit) now uses the same shared `stripThinkBlocksForViolationCheck` helper so the violation-detection logic is consistent. Three regression tests cover the full path: persist-time strip preserves `_inline_reasoning`, load-time repair produces the explanatory tool-result message, generic orphans get the generic message, and unclosed `<think>` tags drop everything from the open tag onward."
  ]
  },
  {
  "version": "0.1.0-alpha.491",
  "changes": [
- "Two live-runtime bugs Slugger surfaced during MCP roundtrip: (1) MCP `send_message` returned blank or raw `<think>` content instead of an actual reply when minimax-style models emitted only reasoning. The shared-turn runner now strips closed AND unclosed `<think>...</think>` blocks before returning, and when nothing remains it returns a clear diagnostic (`agent produced reasoning but no final answer this turn \u2014 try again`) plus emits a `senses.shared_turn_only_reasoning` warn-level nerve event so operators can see how often it's happening. (2) The `rest` tool's fresh-pending-work gate fired on every rest call within a turn because `hasFreshPendingWork(options)` reads from the turn-start snapshot of `pendingMessages` and never updates \u2014 once pending was non-empty, the agent could be told 'fresh work arrived' indefinitely even after surfacing or processing the items. The gate is now once-per-turn: the first rest call hits it, gets the message, the agent does whatever it needs, the next rest call passes. Emits `engine.fresh_work_gate_fired` info-level event the one time it fires. Both bugs reproduced as regression tests that fail without the fix.",
1082
- "New `trip_update_leg` tool to round out the trip ledger. The original Step 4 followup on substrate#35 listed `trip_ensure / trip_get / trip_update_leg / trip_attach_evidence` \u2014 the previous PR (#609) shipped `trip_upsert` instead of `trip_update_leg`, which forced the agent to re-emit the entire trip record to change one leg field. `trip_update_leg` updates specific fields of an existing leg in place: pass `tripId`, `legId`, a JSON object of field updates, and `updatedAt`. Identity-changing updates (`legId`, `kind`) and empty updates objects are rejected with operational error messages. Existing evidence is preserved unless the agent explicitly overwrites it. Emits a `trips.leg_updated` nerve event with the field list. Tool registry now at 74 tools (up from 73).",
1083
- "New `docs/trip-ledger.md` covering what the ledger actually is, why Slugger said it needed to exist (gap between mail body and travel doc \u2014 no authoritative source for cross-checks), the discriminated TripLeg union (lodging / flight / train / ground-transport / rental-car / ferry / event), the non-optional TripEvidence shape with `discoveryMethod`, the trust shape (per-agent keys, private key returned exactly once, hosted side never sees plaintext), the seven harness tools, on-disk layout for both harness and hosted sides, and an explicit answer to 'is this travel-specific or generalizable infra?' (current shape is travel-specific by design; the *pattern* is general and would be lifted to a shared abstraction the next time we build a per-agent encrypted record service)."
1093
+ "Two live-runtime bugs Slugger surfaced during MCP roundtrip: (1) MCP `send_message` returned blank or raw `<think>` content instead of an actual reply when minimax-style models emitted only reasoning. The shared-turn runner now strips closed AND unclosed `<think>...</think>` blocks before returning, and when nothing remains it returns a clear diagnostic (`agent produced reasoning but no final answer this turn \u2014 try again`) plus emits a `senses.shared_turn_only_reasoning` warn-level nerve event so operators can see how often it's happening. (2) The `rest` tool's fresh-pending-work gate fired on every rest call within a turn because `hasFreshPendingWork(options)` reads from the turn-start snapshot of `pendingMessages` and never updates \u2014 once pending was non-empty, the agent could be told 'fresh work arrived' indefinitely even after surfacing or processing the items. The gate is now once-per-turn: the first rest call hits it, gets the message, the agent does whatever it needs, the next rest call passes. Emits `engine.fresh_work_gate_fired` info-level event the one time it fires. Both bugs reproduced as regression tests that fail without the fix.",
1094
+ "New `trip_update_leg` tool to round out the trip ledger. The original Step 4 followup on substrate#35 listed `trip_ensure / trip_get / trip_update_leg / trip_attach_evidence` \u2014 the previous PR (#609) shipped `trip_upsert` instead of `trip_update_leg`, which forced the agent to re-emit the entire trip record to change one leg field. `trip_update_leg` updates specific fields of an existing leg in place: pass `tripId`, `legId`, a JSON object of field updates, and `updatedAt`. Identity-changing updates (`legId`, `kind`) and empty updates objects are rejected with operational error messages. Existing evidence is preserved unless the agent explicitly overwrites it. Emits a `trips.leg_updated` nerve event with the field list. Tool registry now at 74 tools (up from 73).",
1095
+ "New `docs/trip-ledger.md` covering what the ledger actually is, why Slugger said it needed to exist (gap between mail body and travel doc \u2014 no authoritative source for cross-checks), the discriminated TripLeg union (lodging / flight / train / ground-transport / rental-car / ferry / event), the non-optional TripEvidence shape with `discoveryMethod`, the trust shape (per-agent keys, private key returned exactly once, hosted side never sees plaintext), the seven harness tools, on-disk layout for both harness and hosted sides, and an explicit answer to 'is this travel-specific or generalizable infra?' (current shape is travel-specific by design; the *pattern* is general and would be lifted to a shared abstraction the next time we build a per-agent encrypted record service)."
1084
1096
  ]
1085
1097
  },
1086
1098
  {
1087
1099
  "version": "0.1.0-alpha.490",
1088
1100
  "changes": [
1089
- "BlueBubbles group echo self-talk fix. The BB ingest path previously relied solely on the payload's `isFromMe` flag to detect the agent's own outbound messages \u2014 but in groups, BlueBubbles sometimes broadcasts the echo back through the webhook with that flag missing or false. Without a fallback, the agent would ingest its own message and reply to it (the user reported this in a group with their friend Rach: 'Slugger talking to himself'). New `bluebubbles.ownHandles` config field accepts the agent's known iMessage handles (phone numbers in any formatting, or email addresses); a fallback guard at the head of `handleBlueBubblesNormalizedEvent` filters any event whose `sender.externalId` matches a configured handle (case-insensitive, with phone-number normalization across +/space/paren/dash differences) and emits a `senses.bluebubbles_self_handle_filtered` warn-level nerve event so the case is observable. Direct chats are unaffected (their echoes already carry `isFromMe: true` reliably). Also folds in three trivial cleanups surfaced by the full-system audit: removed a stray `# Production SPA serving` heading from the README, removed a vestigial `// getPhrases removed` comment in bluebubbles/index.ts, and removed two unreachable `throw new Error('unreachable')` statements after `process.exit(1)` in heart/core.ts (process.exit returns `never` so TS already knows control doesn't continue)."
1101
+ "BlueBubbles group echo self-talk fix. The BB ingest path previously relied solely on the payload's `isFromMe` flag to detect the agent's own outbound messages \u2014 but in groups, BlueBubbles sometimes broadcasts the echo back through the webhook with that flag missing or false. Without a fallback, the agent would ingest its own message and reply to it (the user reported this in a group with their friend Rach: 'Slugger talking to himself'). New `bluebubbles.ownHandles` config field accepts the agent's known iMessage handles (phone numbers in any formatting, or email addresses); a fallback guard at the head of `handleBlueBubblesNormalizedEvent` filters any event whose `sender.externalId` matches a configured handle (case-insensitive, with phone-number normalization across +/space/paren/dash differences) and emits a `senses.bluebubbles_self_handle_filtered` warn-level nerve event so the case is observable. Direct chats are unaffected (their echoes already carry `isFromMe: true` reliably). Also folds in three trivial cleanups surfaced by the full-system audit: removed a stray `# Production SPA serving` heading from the README, removed a vestigial `// getPhrases removed` comment in bluebubbles/index.ts, and removed two unreachable `throw new Error('unreachable')` statements after `process.exit(1)` in heart/core.ts (process.exit returns `never` so TS already knows control doesn't continue)."
1090
1102
  ]
1091
1103
  },
1092
1104
  {
1093
1105
  "version": "0.1.0-alpha.489",
1094
1106
  "changes": [
1095
- "Trip ledger Step 4 \u2014 harness-side trip tools land. Six new tools (`trip_ensure_ledger`, `trip_status`, `trip_get`, `trip_upsert`, `trip_attach_evidence`, `trip_new_id`) give the agent a private, encrypted, per-agent travel ledger backed by an RSA-OAEP-SHA256 + AES-256-GCM envelope. Ledger keypair lives at `state/trips/ledger.json`; encrypted records persist under `state/trips/records/<tripId>.json`. Tools are gated behind the same trust check as other private surfaces (only available to trusted callers) and validate `TripRecord` / `TripEvidence` shape before persisting. Vendor-copies the substrate trip-control types so the harness has no runtime dependency on a hosted ledger service yet, while keeping the on-disk format compatible for future migration."
1107
+ "Trip ledger Step 4 \u2014 harness-side trip tools land. Six new tools (`trip_ensure_ledger`, `trip_status`, `trip_get`, `trip_upsert`, `trip_attach_evidence`, `trip_new_id`) give the agent a private, encrypted, per-agent travel ledger backed by an RSA-OAEP-SHA256 + AES-256-GCM envelope. Ledger keypair lives at `state/trips/ledger.json`; encrypted records persist under `state/trips/records/<tripId>.json`. Tools are gated behind the same trust check as other private surfaces (only available to trusted callers) and validate `TripRecord` / `TripEvidence` shape before persisting. Vendor-copies the substrate trip-control types so the harness has no runtime dependency on a hosted ledger service yet, while keeping the on-disk format compatible for future migration."
1096
1108
  ]
1097
1109
  },
1098
1110
  {
1099
1111
  "version": "0.1.0-alpha.488",
1100
1112
  "changes": [
1101
- "BlueBubbles group chats stay fully silent on `observe` turns. The engine emits `onToolStart(\"observe\")` / `onToolEnd(\"observe\")` even when the resulting outcome is `observed` (no reply), and the BlueBubbles adapter previously treated every tool start as reply commitment \u2014 so groups would briefly show typing or mark-read before the silent path completed. Both callbacks now short-circuit for `observe`: no startTypingNow, no toolCallbacks dispatch, just an observability event. Real reply-commit semantics (typing, mark-read, status messages on real tools) are preserved. Regression test reproduces the real callback sequence from the engine and asserts the lane stays quiet."
1113
+ "BlueBubbles group chats stay fully silent on `observe` turns. The engine emits `onToolStart(\"observe\")` / `onToolEnd(\"observe\")` even when the resulting outcome is `observed` (no reply), and the BlueBubbles adapter previously treated every tool start as reply commitment \u2014 so groups would briefly show typing or mark-read before the silent path completed. Both callbacks now short-circuit for `observe`: no startTypingNow, no toolCallbacks dispatch, just an observability event. Real reply-commit semantics (typing, mark-read, status messages on real tools) are preserved. Regression test reproduces the real callback sequence from the engine and asserts the lane stays quiet."
1102
1114
  ]
1103
1115
  },
1104
1116
  {
1105
1117
  "version": "0.1.0-alpha.486",
1106
1118
  "changes": [
1107
1119
  "Mail convergence pass 1-5 hardens hosted-mail truth surfaces under live HEY ingest: import truth + audit resilience, accurate archive freshness and identity surfaces, sharper recovery and archive truth, delegated search resilience, and natural anchor-list retrieval; imported archive content is now searched on parsed message text rather than raw archive bytes so quoted-printable / HTML-heavy booking mail is reachable.",
1108
- "`mail_search` now ranks by booking-aware relevance instead of pure recency. Score signals weight query-term hits by field (subject +6 / from +4 / body +2), booking-intent tokens (`booking confirmation`, `your stay`, e-ticket, etc.), confirmation-number-shaped tokens, currency amounts, and known travel-sender domains; recency stays as a tiebreaker. Recall is unchanged \u2014 noise still appears in results, just below the decisive message. Each rendered result also surfaces a `matched on:` line listing fields, booking signals, status (confirmed / cancelled / changed / refunded / etc.), confirmation token, amount, dates, attachment count, and sender hint, so the agent can triage without paying for a body open.",
1109
- "BlueBubbles sense no longer sticks in `error` status when a single message is permanently unrecoverable. `upstreamStatus` now tracks upstream health and pending work only \u2014 per-cycle recovery failures stay informational in `detail` for transparency without contradicting `ouro doctor`'s healthy verdict, so a malformed payload that fails repairEvent on every retry can no longer brick the visible sense state until operator intervention.",
1120
+ "`mail_search` now ranks by booking-aware relevance instead of pure recency. Score signals weight query-term hits by field (subject +6 / from +4 / body +2), booking-intent tokens (`booking confirmation`, `your stay`, e-ticket, etc.), confirmation-number-shaped tokens, currency amounts, and known travel-sender domains; recency stays as a tiebreaker. Recall is unchanged \u2014 noise still appears in results, just below the decisive message. Each rendered result also surfaces a `matched on:` line listing fields, booking signals, status (confirmed / cancelled / changed / refunded / etc.), confirmation token, amount, dates, attachment count, and sender hint, so the agent can triage without paying for a body open.",
1121
+ "BlueBubbles sense no longer sticks in `error` status when a single message is permanently unrecoverable. `upstreamStatus` now tracks upstream health and pending work only \u2014 per-cycle recovery failures stay informational in `detail` for transparency without contradicting `ouro doctor`'s healthy verdict, so a malformed payload that fails repairEvent on every retry can no longer brick the visible sense state until operator intervention.",
1110
1122
  "Heart streaming caps oversized Responses-API `function_call_output` history items both when rebuilding provider input from session history and when appending fresh tool output mid-turn, preventing a giant tool result on resume from blowing the model context."
1111
1123
  ]
1112
1124
  },
1113
1125
  {
1114
1126
  "version": "0.1.0-alpha.485",
1115
1127
  "changes": [
1116
- "Session JSON storage no longer accumulates duplicate event ids when two writers race for the same session \u2014 `parseSessionEnvelope` now dedupes on read (last-occurrence-wins) so existing corrupted sessions self-heal on the next save, `buildCanonicalSessionEnvelope` assigns the next sequence as `max(existing) + 1` instead of `events.length + 1` so pruning gaps cannot collide, and `deferPostTurnPersist` serializes per-`sessPath` through an in-process queue so concurrent BlueBubbles webhooks for the same chat (or CLI postTurn racing the inner-dialog turn for the same MCP session) cannot interleave their writes.",
1128
+ "Session JSON storage no longer accumulates duplicate event ids when two writers race for the same session \u2014 `parseSessionEnvelope` now dedupes on read (last-occurrence-wins) so existing corrupted sessions self-heal on the next save, `buildCanonicalSessionEnvelope` assigns the next sequence as `max(existing) + 1` instead of `events.length + 1` so pruning gaps cannot collide, and `deferPostTurnPersist` serializes per-`sessPath` through an in-process queue so concurrent BlueBubbles webhooks for the same chat (or CLI postTurn racing the inner-dialog turn for the same MCP session) cannot interleave their writes.",
1117
1129
  "Auto-created BlueBubbles group friends are now marked with a `notes.autoCreatedGroup` flag at resolver time, and the trust gate's family-member bypass surfaces a one-time inner-pending notice the first time messages route through an unacknowledged stranger-trust group so the agent can label, rename, or dismiss the relationship before activity accumulates invisibly.",
1118
1130
  "Inner-dialog worker now caps consecutive `instinct` follow-on turns at `MAX_CONSECUTIVE_INSTINCT_TURNS = 3` to break self-sustaining loops where a tool that writes to the inner-dialog pending dir during a turn would otherwise re-fire the worker indefinitely; externally-queued messages reset the counter so legitimate cascading follow-ups still run, and a new `senses.inner_dialog_worker_instinct_loop_capped` event surfaces when the cap fires."
1119
1131
  ]
@@ -2138,7 +2150,7 @@
2138
2150
  "version": "0.1.0-alpha.364",
2139
2151
  "changes": [
2140
2152
  "Cross-machine polish: bash PATH writes to .bashrc on Linux/WSL instead of .bash_profile (which non-login shells skip on Debian/Ubuntu). Shell hint message matches.",
2141
- "Agent prompt: never guess about harness behavior \u2014 consult docs first, investigate in code, fix stale docs via PR.",
2153
+ "Agent prompt: never guess about harness behavior \u2014 consult docs first, investigate in code, fix stale docs via PR.",
2142
2154
  "Agent prompt: harness docs pointer distinguishes dev mode (local read) vs production (fetch from GitHub)."
2143
2155
  ]
2144
2156
  },
@@ -2212,16 +2224,16 @@
2212
2224
  "version": "0.1.0-alpha.352",
2213
2225
  "changes": [
2214
2226
  "Settle tool description now communicates turn-ending semantics: 'deliver your response and end your turn' with explicit guidance against settling with status updates mid-task.",
2215
- "Observe tool now available in all outward channels including 1:1 chats, not just groups and reactions \u2014 agents can absorb messages without responding when the moment doesn't call for words.",
2227
+ "Observe tool now available in all outward channels including 1:1 chats, not just groups and reactions \u2014 agents can absorb messages without responding when the moment doesn't call for words.",
2216
2228
  "Autonomous execution prompt contract added: when told to work autonomously, agents use ponder to absorb new messages and continue using tools, settling only with the final result."
2217
2229
  ]
2218
2230
  },
2219
2231
  {
2220
2232
  "version": "0.1.0-alpha.351",
2221
2233
  "changes": [
2222
- "Surface tool description rewritten from 'surface progress' to 'send a message to someone' \u2014 makes it clear the tool is for interpersonal messaging, not status reporting.",
2234
+ "Surface tool description rewritten from 'surface progress' to 'send a message to someone' \u2014 makes it clear the tool is for interpersonal messaging, not status reporting.",
2223
2235
  "Inner dialog prompt contract now guides agents to use rest(note) for heartbeat state and ponder(reflection) for deeper thoughts, keeping surface strictly for words meant for another person.",
2224
- "Removed [surfaced from inner dialog] prefix from synthetic session messages \u2014 provenance is tracked via captureKind: 'synthetic', the prefix was redundant and created echo loops.",
2236
+ "Removed [surfaced from inner dialog] prefix from synthetic session messages \u2014 provenance is tracked via captureKind: 'synthetic', the prefix was redundant and created echo loops.",
2225
2237
  "Obligation summaries and attention queue headers reframed as structured internal data ([internal] tags) instead of surface-ready prose.",
2226
2238
  "Shared proactive-content-guard module blocks internal content (heartbeat, check-in, task board, obligation status, meta markers) from BlueBubbles and Teams proactive sends."
2227
2239
  ]
@@ -2462,7 +2474,7 @@
2462
2474
  "version": "0.1.0-alpha.313",
2463
2475
  "changes": [
2464
2476
  "feat(daemon): agentic repair flow with LLM diagnosis for degraded agents during `ouro up`. New `runAgenticRepair()` in `src/heart/daemon/agentic-repair.ts` wraps interactive repair with optional AI-powered diagnosis. Uses `discoverWorkingProvider()` to find a working LLM, then offers conversational diagnosis with degraded agent context and daemon log tail. Falls back to deterministic repair when no provider is available or user declines. Wired into cli-exec.ts replacing direct `runInteractiveRepair()` call. 12 new tests, 100% coverage on all branches.",
2465
- "feat(daemon): add `--no-repair` flag to `ouro up` \u2014 skips interactive/agentic repair and exits non-zero when degraded agents are detected. Useful for CI and scripted environments."
2477
+ "feat(daemon): add `--no-repair` flag to `ouro up` \u2014 skips interactive/agentic repair and exits non-zero when degraded agents are detected. Useful for CI and scripted environments."
2466
2478
  ]
2467
2479
  },
2468
2480
  {
@@ -2513,13 +2525,13 @@
2513
2525
  {
2514
2526
  "version": "0.1.0-alpha.305",
2515
2527
  "changes": [
2516
- "feat(daemon): add sense-level liveness probes for hung webhook detection. New `/health` endpoint on BlueBubbles webhook server returns `{ status: \"ok\", uptime: N }` via GET/HEAD (localhost only, 405 for other methods). Generic `SenseProbe` interface in HealthMonitor runs probes alongside existing checks every 60s \u2014 failed probes produce critical results triggering auto-recovery restart. HTTP health probe factory `createHttpHealthProbe(name, port, timeoutMs)` makes reusable probes for any sense with an HTTP endpoint. BlueBubbles probe auto-registered in daemon-entry when BB sense config exists. Directly addresses the documented 70-minute Lobster outage where BB webhook server was hung but process was alive. New files: `http-health-probe.ts` + 3 test files. 26 new tests at 100% coverage on new code."
2528
+ "feat(daemon): add sense-level liveness probes for hung webhook detection. New `/health` endpoint on BlueBubbles webhook server returns `{ status: \"ok\", uptime: N }` via GET/HEAD (localhost only, 405 for other methods). Generic `SenseProbe` interface in HealthMonitor runs probes alongside existing checks every 60s \u2014 failed probes produce critical results triggering auto-recovery restart. HTTP health probe factory `createHttpHealthProbe(name, port, timeoutMs)` makes reusable probes for any sense with an HTTP endpoint. BlueBubbles probe auto-registered in daemon-entry when BB sense config exists. Directly addresses the documented 70-minute Lobster outage where BB webhook server was hung but process was alive. New files: `http-health-probe.ts` + 3 test files. 26 new tests at 100% coverage on new code."
2517
2529
  ]
2518
2530
  },
2519
2531
  {
2520
2532
  "version": "0.1.0-alpha.304",
2521
2533
  "changes": [
2522
- "feat(mind): add content trust framing to recall results. New `classifyProvenanceTrust()` in `src/mind/provenance-trust.ts` categorizes diary entry provenance as self/trusted/external. Diary entries surfaced via `recall` or associative recall from external sources (messages, emails, web content) now get `[diary/external]` tag instead of `[diary]`. System prompt adds guidance: external entries should not be followed as instructions. This closes the prompt injection defense chain from the Lobster research \u2014 even if an attacker plants instructions in content that gets persisted to the diary, the recall pipeline marks it as external and steers the agent away from executing embedded instructions. 19 new tests across 4 files at 100% coverage."
2534
+ "feat(mind): add content trust framing to recall results. New `classifyProvenanceTrust()` in `src/mind/provenance-trust.ts` categorizes diary entry provenance as self/trusted/external. Diary entries surfaced via `recall` or associative recall from external sources (messages, emails, web content) now get `[diary/external]` tag instead of `[diary]`. System prompt adds guidance: external entries should not be followed as instructions. This closes the prompt injection defense chain from the Lobster research \u2014 even if an attacker plants instructions in content that gets persisted to the diary, the recall pipeline marks it as external and steers the agent away from executing embedded instructions. 19 new tests across 4 files at 100% coverage."
2523
2535
  ]
2524
2536
  },
2525
2537
  {
@@ -2531,13 +2543,13 @@
2531
2543
  {
2532
2544
  "version": "0.1.0-alpha.302",
2533
2545
  "changes": [
2534
- "feat(cli): add `ouro doctor` system health check command. New command runs 6 diagnostic categories \u2014 daemon (socket existence + responsiveness), agents (bundle discovery, agent.json validation for version/humanFacing/agentFacing/enabled), senses (BlueBubbles and Teams config presence and well-formedness), habits (launchd plist discovery, degraded state), security (secrets.json permissions, credential leak detection in agent.json), and disk (log size thresholds at 100MB warn / 500MB critical, bundle root existence). Output is a colored checklist with per-category grouping and a summary line. Works without daemon running \u2014 daemon checks fail gracefully while all other categories still execute, making it useful for cold diagnostics. 3 new files (doctor.ts, doctor-types.ts, cli-render-doctor.ts), 3 modified (cli-types.ts, cli-parse.ts, cli-exec.ts), 4 test files with 61 tests at 100% coverage on new code."
2546
+ "feat(cli): add `ouro doctor` system health check command. New command runs 6 diagnostic categories \u2014 daemon (socket existence + responsiveness), agents (bundle discovery, agent.json validation for version/humanFacing/agentFacing/enabled), senses (BlueBubbles and Teams config presence and well-formedness), habits (launchd plist discovery, degraded state), security (secrets.json permissions, credential leak detection in agent.json), and disk (log size thresholds at 100MB warn / 500MB critical, bundle root existence). Output is a colored checklist with per-category grouping and a summary line. Works without daemon running \u2014 daemon checks fail gracefully while all other categories still execute, making it useful for cold diagnostics. 3 new files (doctor.ts, doctor-types.ts, cli-render-doctor.ts), 3 modified (cli-types.ts, cli-parse.ts, cli-exec.ts), 4 test files with 61 tests at 100% coverage on new code."
2535
2547
  ]
2536
2548
  },
2537
2549
  {
2538
2550
  "version": "0.1.0-alpha.300",
2539
2551
  "changes": [
2540
- "test(bundle): cover `isFirstPushToRemote` branches via mocked child_process. Exported the function from `tools-bundle.ts` (previously private) and added `src/__tests__/repertoire/bundle-push-first-push.test.ts` with 5 unit tests that mock `execFileSync` to exercise all 3 code paths: (1) `symbolic-ref --short HEAD` failure \u2192 conservative true, (2) `ls-remote --heads` returns empty stdout \u2192 true (real first push, remote branch doesn't exist), (3) `ls-remote --heads` returns non-empty \u2192 false (subsequent push, remote branch exists), (4) `ls-remote` network failure \u2192 conservative true, (5) branch name correctly threaded to `ls-remote` args. Removed the `/* v8 ignore start/stop */` wrapper since all branches are now covered. The security contract (never return false when probe fails) is verified by tests 1 and 4. Also adds a cross-reference comment to the static test-isolation contract test documenting its relationship with the runtime prod-path leak guard in global-capture.ts."
2552
+ "test(bundle): cover `isFirstPushToRemote` branches via mocked child_process. Exported the function from `tools-bundle.ts` (previously private) and added `src/__tests__/repertoire/bundle-push-first-push.test.ts` with 5 unit tests that mock `execFileSync` to exercise all 3 code paths: (1) `symbolic-ref --short HEAD` failure \u2192 conservative true, (2) `ls-remote --heads` returns empty stdout \u2192 true (real first push, remote branch doesn't exist), (3) `ls-remote --heads` returns non-empty \u2192 false (subsequent push, remote branch exists), (4) `ls-remote` network failure \u2192 conservative true, (5) branch name correctly threaded to `ls-remote` args. Removed the `/* v8 ignore start/stop */` wrapper since all branches are now covered. The security contract (never return false when probe fails) is verified by tests 1 and 4. Also adds a cross-reference comment to the static test-isolation contract test documenting its relationship with the runtime prod-path leak guard in global-capture.ts."
2541
2553
  ]
2542
2554
  },
2543
2555
  {
@@ -2562,35 +2574,35 @@
2562
2574
  {
2563
2575
  "version": "0.1.0-alpha.296",
2564
2576
  "changes": [
2565
- "feat(sync): pending-sync.json classification + bundleState enrichment. `postTurnPush` in `src/heart/sync.ts` now distinguishes between a push that was rejected AFTER a successful rebase retry (`classification: push_rejected`) and a rebase that itself failed with merge conflicts (`classification: pull_rebase_conflict`, with conflictFiles populated from `git status --porcelain=v1` UU/AA/DD/AU/UA/DU/UD markers). The `PendingSyncRecord` interface is exported from `sync.ts` so downstream readers can type-check. `detectBundleState` in `src/heart/bundle-state.ts` gained `remote_push_failed` and `pull_rebase_conflict` issue cases that are added alongside `pending_sync_exists` when the classification field is present. Readers tolerate pending-sync.json without a classification field (pre-alpha.296 schema) or with malformed JSON \u2014 both fall back to the plain `pending_sync_exists` signal. 4 new bundle-state tests (push_rejected, pull_rebase_conflict, legacy schema, malformed JSON) and 2 new sync tests (second-push-fails-after-rebase-success, rebase-leaves-merge-conflicts). Completes the Directive D remediation signal plumbing started in alpha.281.",
2566
- "feat(prompt): bundle self-management guidance in `bodyMapSection` (`src/mind/prompt.ts`). New `### git sync \u2014 i own my bundle's git state` subsection documents the full detect \u2192 init \u2192 add_remote \u2192 list_first_commit \u2192 review with friend \u2192 do_first_commit \u2192 first_push_review \u2192 confirm \u2192 push workflow in first-person voice, plus the remediation paths for `remote_push_failed` (pull_rebase) and `pull_rebase_conflict` (walk the friend through conflicts). Added after `### home` and before `### peers` so the flow reads naturally with the rest of the body metaphor."
2577
+ "feat(sync): pending-sync.json classification + bundleState enrichment. `postTurnPush` in `src/heart/sync.ts` now distinguishes between a push that was rejected AFTER a successful rebase retry (`classification: push_rejected`) and a rebase that itself failed with merge conflicts (`classification: pull_rebase_conflict`, with conflictFiles populated from `git status --porcelain=v1` UU/AA/DD/AU/UA/DU/UD markers). The `PendingSyncRecord` interface is exported from `sync.ts` so downstream readers can type-check. `detectBundleState` in `src/heart/bundle-state.ts` gained `remote_push_failed` and `pull_rebase_conflict` issue cases that are added alongside `pending_sync_exists` when the classification field is present. Readers tolerate pending-sync.json without a classification field (pre-alpha.296 schema) or with malformed JSON \u2014 both fall back to the plain `pending_sync_exists` signal. 4 new bundle-state tests (push_rejected, pull_rebase_conflict, legacy schema, malformed JSON) and 2 new sync tests (second-push-fails-after-rebase-success, rebase-leaves-merge-conflicts). Completes the Directive D remediation signal plumbing started in alpha.281.",
2578
+ "feat(prompt): bundle self-management guidance in `bodyMapSection` (`src/mind/prompt.ts`). New `### git sync \u2014 i own my bundle's git state` subsection documents the full detect \u2192 init \u2192 add_remote \u2192 list_first_commit \u2192 review with friend \u2192 do_first_commit \u2192 first_push_review \u2192 confirm \u2192 push workflow in first-person voice, plus the remediation paths for `remote_push_failed` (pull_rebase) and `pull_rebase_conflict` (walk the friend through conflicts). Added after `### home` and before `### peers` so the flow reads naturally with the rest of the body metaphor."
2567
2579
  ]
2568
2580
  },
2569
2581
  {
2570
2582
  "version": "0.1.0-alpha.295",
2571
2583
  "changes": [
2572
- "chore(tests): ratchet down REAL_OURO_CLI_WRITE_ALLOWLIST and REAL_AGENT_SECRETS_WRITE_ALLOWLIST to empty. Nine pre-existing test lines that shared a `.ouro-cli` or `.agentsecrets` literal with `os.homedir()` on the same line were each refactored to extract the subpath as a local const \u2014 the test-isolation.contract.test.ts rule scans line-by-line for the pattern and the const extraction preserves identical runtime behavior while passing the check. Affected: daemon-health.test.ts (1), daemon-orphan-cleanup.test.ts (2 \u2014 also factored common constants OURO_CLI_SUBPATH + PIDFILE_NAME), daemon-tombstone.test.ts (2), auth-flow.test.ts (3 \u2014 the default-location write test + the v1 migration test), daemon/hooks/agent-config-v2.test.ts (1). Both allowlists are now empty so any new offender is blocked by the contract test."
2584
+ "chore(tests): ratchet down REAL_OURO_CLI_WRITE_ALLOWLIST and REAL_AGENT_SECRETS_WRITE_ALLOWLIST to empty. Nine pre-existing test lines that shared a `.ouro-cli` or `.agentsecrets` literal with `os.homedir()` on the same line were each refactored to extract the subpath as a local const \u2014 the test-isolation.contract.test.ts rule scans line-by-line for the pattern and the const extraction preserves identical runtime behavior while passing the check. Affected: daemon-health.test.ts (1), daemon-orphan-cleanup.test.ts (2 \u2014 also factored common constants OURO_CLI_SUBPATH + PIDFILE_NAME), daemon-tombstone.test.ts (2), auth-flow.test.ts (3 \u2014 the default-location write test + the v1 migration test), daemon/hooks/agent-config-v2.test.ts (1). Both allowlists are now empty so any new offender is blocked by the contract test."
2573
2585
  ]
2574
2586
  },
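The alpha.295 entry above says the contract rule scans line by line for a prod-path literal sharing a line with `os.homedir()`, which is why extracting the subpath into a local const passes the check. A minimal sketch of such a scan follows; `PROD_SUBPATHS` and `findOffendingLines` are hypothetical names, and the real rule in `test-isolation.contract.test.ts` may match differently.

```typescript
// Hypothetical sketch of a line-by-line prod-path scan (not the project's actual rule).
const PROD_SUBPATHS = [".ouro-cli", ".agentsecrets"];

// Returns the 1-based numbers of lines that mention os.homedir() and a
// prod subpath literal on the same line.
export function findOffendingLines(source: string): number[] {
  const hits: number[] = [];
  source.split("\n").forEach((line, index) => {
    const usesHome = line.includes("os.homedir()");
    const usesSubpath = PROD_SUBPATHS.some((p) => line.includes(p));
    if (usesHome && usesSubpath) hits.push(index + 1);
  });
  return hits;
}
```

Because the check is purely textual and per-line, moving the literal to its own `const` line leaves runtime behavior unchanged while no single line matches both conditions.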
2575
2587
  {
2576
2588
  "version": "0.1.0-alpha.293",
2577
2589
  "changes": [
2578
- "chore(tests): three test-isolation fixes bundled as chain D1 from the follow-up investigation after PR #372 (default.ouro leak). (1) Remove the `agentName = \"default\"` catch fallback in `src/repertoire/credential-access.ts` getCredentialStore() \u2014 same silent-leak class as coding/manager.ts, would have routed BuiltInCredentialStore writes to `~/AgentBundles/default.ouro/vault/` and `~/.agentsecrets/default/` on any test hit that didn't mock identity. Hoisted `getAgentName()` out of the outer try/catch so it throws loudly; the remaining try/catch now only guards the bitwarden store wiring (the only code path that has a legitimate fall-through to built-in). Also switched from `require(\"../heart/identity\")` inside the function body to a static ESM import at the top of the file \u2014 require() bypasses vitest's module registry so `vi.mock(\"../heart/identity\", ...)` was silently not applying to the old dynamic require; the static import finally lets the existing test mocks intercept. Tests in credential-access.test.ts that had been accidentally leaning on the default fallback now hit the real mock as intended.",
2579
- "chore(tests): fix the tmpbundle leak guard false-positive on daemon-cli.test.ts. The guard was firing on \"ouro CLI parsing > parses primary daemon commands\" every run, but the real cause was `createTmpBundle` being called at describe-scope inside the \"ouro thoughts CLI execution\" suite (line 5662) \u2014 synchronous describe callbacks run during test collection, so the handle landed in _liveHandles before ANY test ran. The first afterEach hook (on the first test in the file) then noticed the dangling handle and blamed it. Fix: moved the createTmpBundle call into a beforeAll hook and tagged it `{ shared: true }` so the handle only exists while the thoughts suite is actually running, with afterAll cleanup aligned to it.",
2590
+ "chore(tests): three test-isolation fixes bundled as chain D1 from the follow-up investigation after PR #372 (default.ouro leak). (1) Remove the `agentName = \"default\"` catch fallback in `src/repertoire/credential-access.ts` getCredentialStore(): same silent-leak class as coding/manager.ts, would have routed BuiltInCredentialStore writes to `~/AgentBundles/default.ouro/vault/` and `~/.agentsecrets/default/` on any test hit that didn't mock identity. Hoisted `getAgentName()` out of the outer try/catch so it throws loudly; the remaining try/catch now only guards the bitwarden store wiring (the only code path that has a legitimate fall-through to built-in). Also switched from `require(\"../heart/identity\")` inside the function body to a static ESM import at the top of the file: require() bypasses vitest's module registry so `vi.mock(\"../heart/identity\", ...)` was silently not applying to the old dynamic require; the static import finally lets the existing test mocks intercept. Tests in credential-access.test.ts that had been accidentally leaning on the default fallback now hit the real mock as intended.",
2591
+ "chore(tests): fix the tmpbundle leak guard false-positive on daemon-cli.test.ts. The guard was firing on \"ouro CLI parsing > parses primary daemon commands\" every run, but the real cause was `createTmpBundle` being called at describe-scope inside the \"ouro thoughts CLI execution\" suite (line 5662): synchronous describe callbacks run during test collection, so the handle landed in _liveHandles before ANY test ran. The first afterEach hook (on the first test in the file) then noticed the dangling handle and blamed it. Fix: moved the createTmpBundle call into a beforeAll hook and tagged it `{ shared: true }` so the handle only exists while the thoughts suite is actually running, with afterAll cleanup aligned to it.",
2580
2592
  "chore(tests): extend the tmpbundle leak guard with a `shared: true` opt-in. `TmpBundleHandle` gains a `shared` field, `CreateTmpBundleOptions.shared` defaults to false, and the per-test leak guard in `src/__tests__/nerves/global-capture.ts` now skips shared handles (they're cleaned in afterAll, not after every test). Prevents the class of false positive exposed by the daemon-cli.test.ts fix above.",
2581
- "chore(tests): new runtime prod-path leak guard in `src/__tests__/nerves/global-capture.ts`. Snapshots `~/AgentBundles` entries at worker boot, diffs at worker teardown (afterAll without describe context runs once per worker), force-removes any new entries, and emits a loud console.error naming them. Text-based contract tests can't catch runtime leaks where production code routes a write via a silent fallback (exactly the bug class PR #372 fixed in coding/manager.ts). This runtime guard is the belt to the contract test's suspenders \u2014 would have caught the default.ouro leak in the first run instead of requiring my investigation chain."
2593
+ "chore(tests): new runtime prod-path leak guard in `src/__tests__/nerves/global-capture.ts`. Snapshots `~/AgentBundles` entries at worker boot, diffs at worker teardown (afterAll without describe context runs once per worker), force-removes any new entries, and emits a loud console.error naming them. Text-based contract tests can't catch runtime leaks where production code routes a write via a silent fallback (exactly the bug class PR #372 fixed in coding/manager.ts). This runtime guard is the belt to the contract test's suspenders; it would have caught the default.ouro leak in the first run instead of requiring my investigation chain."
2582
2594
  ]
2583
2595
  },
2584
2596
  {
2585
2597
  "version": "0.1.0-alpha.292",
2586
2598
  "changes": [
2587
- "feat(mind): add structured provenance tracking to diary entries. New DiaryEntryProvenance interface records tool, channel (cli/teams/bluebubbles/inner/mcp), friend identity (id + name), and trust level (family/friend/acquaintance/stranger) at write time. diary_write handler automatically extracts provenance from ToolContext. Associative recall and recall tool render provenance fields when present. Fully backwards compatible \u2014 existing entries without provenance parse and display normally. 19 tests across 4 test files with 100% coverage on new code."
2599
+ "feat(mind): add structured provenance tracking to diary entries. New DiaryEntryProvenance interface records tool, channel (cli/teams/bluebubbles/inner/mcp), friend identity (id + name), and trust level (family/friend/acquaintance/stranger) at write time. diary_write handler automatically extracts provenance from ToolContext. Associative recall and recall tool render provenance fields when present. Fully backwards compatible: existing entries without provenance parse and display normally. 19 tests across 4 test files with 100% coverage on new code."
2588
2600
  ]
2589
2601
  },
2590
2602
  {
2591
2603
  "version": "0.1.0-alpha.291",
2592
2604
  "changes": [
2593
- "fix(coding): eliminate silent `~/AgentBundles/default.ouro` real-fs leak in the coding session manager. Root cause: `src/repertoire/coding/manager.ts` had a `safeAgentName()` helper that caught `getAgentName()` throws (which happens in vitest because there's no `--agent` in argv) and silently fell back to `\"default\"`. Combined with `src/repertoire/coding/index.ts:8` constructing the singleton as `new CodingSessionManager({})` (no options \u2014 real fs, no agentName), every vitest run that called `getCodingSessionManager()` followed by `resetCodingSessionManager()` triggered `shutdown()` \u2192 `persistState()` \u2192 `fs.mkdirSync('~/AgentBundles/default.ouro/state/coding', { recursive: true })` + `fs.writeFileSync('.../sessions.json', ...)`. This wrote real files under the developer's home directory on every coverage-gate run. The observable symptom was `~/AgentBundles/default.ouro/` reappearing minutes after `rm -rf` \u2014 PR B's housekeeping cleanup couldn't stick. Fix: deleted `safeAgentName()` and made the constructor's `agentName` default call `getAgentName()` directly, so construction fails loudly in vitest when identity isn't mocked. Updated `index.test.ts` to mock both `../../../heart/identity` and `fs` so the singleton can be tested without touching real disk. Updated `session-manager.test.ts`, `session-manager-branches.test.ts`, and `session-manager-persistence.test.ts` to either inline `agentName: \"test-coding-agent\"` in `noPersistence` or mock `../../../heart/identity` at file top. One persistence test assertion updated from `parentAgent === \"default\"` to `\"test-coding-agent\"`. Full coverage gate passes, 171 coding tests pass, `~/AgentBundles/default.ouro` stays deleted across 3 consecutive coverage-gate runs."
2605
+ "fix(coding): eliminate silent `~/AgentBundles/default.ouro` real-fs leak in the coding session manager. Root cause: `src/repertoire/coding/manager.ts` had a `safeAgentName()` helper that caught `getAgentName()` throws (which happens in vitest because there's no `--agent` in argv) and silently fell back to `\"default\"`. Combined with `src/repertoire/coding/index.ts:8` constructing the singleton as `new CodingSessionManager({})` (no options: real fs, no agentName), every vitest run that called `getCodingSessionManager()` followed by `resetCodingSessionManager()` triggered `shutdown()` -> `persistState()` -> `fs.mkdirSync('~/AgentBundles/default.ouro/state/coding', { recursive: true })` + `fs.writeFileSync('.../sessions.json', ...)`. This wrote real files under the developer's home directory on every coverage-gate run. The observable symptom was `~/AgentBundles/default.ouro/` reappearing minutes after `rm -rf`; PR B's housekeeping cleanup couldn't stick. Fix: deleted `safeAgentName()` and made the constructor's `agentName` default call `getAgentName()` directly, so construction fails loudly in vitest when identity isn't mocked. Updated `index.test.ts` to mock both `../../../heart/identity` and `fs` so the singleton can be tested without touching real disk. Updated `session-manager.test.ts`, `session-manager-branches.test.ts`, and `session-manager-persistence.test.ts` to either inline `agentName: \"test-coding-agent\"` in `noPersistence` or mock `../../../heart/identity` at file top. One persistence test assertion updated from `parentAgent === \"default\"` to `\"test-coding-agent\"`. Full coverage gate passes, 171 coding tests pass, `~/AgentBundles/default.ouro` stays deleted across 3 consecutive coverage-gate runs."
2594
2606
  ]
2595
2607
  },
2596
2608
  {
@@ -2609,27 +2621,27 @@
2609
2621
  {
2610
2622
  "version": "0.1.0-alpha.288",
2611
2623
  "changes": [
2612
- "feat(bundle): full `.gitignore` template + first-push PII review workflow + confirmation-token gate on `bundle_push`. Completes the agent-manages-its-own-bundle chain (PRs 5 + 6 + 7). (1) New `src/repertoire/bundle-templates.ts` exports `BUNDLE_GITIGNORE_TEMPLATE` \u2014 a curated gitignore that handles functional cases only (runtime state, credentials, editor/OS noise, build artifacts) and explicitly does NOT block PII. The design philosophy is baked into the file's top-comment: bundles are inherently full of PII (friends/, diary/, journal/, psyche/, arc/, facts/, family/, travel/) and blocking those via gitignore would defeat the bundle's purpose. PII is handled at first-push time by a separate safety layer instead. Also exports `PII_BUNDLE_DIRECTORIES` as the canonical list of PII-bearing top-level dirs. (2) `bundle_init_git` now writes the full template instead of the minimal `state/`-only placeholder from PR 6. (3) New `bundle_first_push_review` tool enumerates existing PII directories with per-directory file counts (honoring `.gitignore` via `git ls-files --others --exclude-standard`), probes the remote URL for GitHub public/private visibility via an unauthenticated `fetch` to `https://api.github.com/repos/{owner}/{repo}` with a 5-second timeout, and generates a first-person warning text \u2014 `warningLevel: 'public_github' | 'private_github' | 'generic'`. The tool issues a `confirmationToken` (crypto.randomUUID) stored in a module-level Map with a 15-minute TTL and returns it in the payload. (4) `bundle_push` updated to accept an optional `confirmation_token` parameter: on first-push attempts (detected via `git ls-remote --heads <remote> <branch>` empty \u2014 or, conservatively, when that probe fails to reach the remote), the handler requires a valid token that was issued for the SAME bundleRoot and has not expired. Missing, invalid, wrong-bundle, or expired token returns `kind: 'confirmation_required'`. 
On successful validation, the token is consumed (one-shot). This is the Directive D PII-review gate: the agent cannot push a bundle to the internet without the human explicitly acknowledging the PII payload first. 17 new tests cover the template, PII counting with empty and populated directories, GitHub public/private/404/network-error/malformed-response paths, URL parsing for gitlab/self-hosted/github, token storage + TTL expiry, and every token-gating refusal path in bundle_push. Bundle-templates.ts is added to the file-completeness exempt list since it's a pure constants module (design: `tools-bundle.ts` owns the observability for bundle operations)."
2624
+ "feat(bundle): full `.gitignore` template + first-push PII review workflow + confirmation-token gate on `bundle_push`. Completes the agent-manages-its-own-bundle chain (PRs 5 + 6 + 7). (1) New `src/repertoire/bundle-templates.ts` exports `BUNDLE_GITIGNORE_TEMPLATE`, a curated gitignore that handles functional cases only (runtime state, credentials, editor/OS noise, build artifacts) and explicitly does NOT block PII. The design philosophy is baked into the file's top-comment: bundles are inherently full of PII (friends/, diary/, journal/, psyche/, arc/, facts/, family/, travel/) and blocking those via gitignore would defeat the bundle's purpose. PII is handled at first-push time by a separate safety layer instead. Also exports `PII_BUNDLE_DIRECTORIES` as the canonical list of PII-bearing top-level dirs. (2) `bundle_init_git` now writes the full template instead of the minimal `state/`-only placeholder from PR 6. (3) New `bundle_first_push_review` tool enumerates existing PII directories with per-directory file counts (honoring `.gitignore` via `git ls-files --others --exclude-standard`), probes the remote URL for GitHub public/private visibility via an unauthenticated `fetch` to `https://api.github.com/repos/{owner}/{repo}` with a 5-second timeout, and generates a first-person warning text: `warningLevel: 'public_github' | 'private_github' | 'generic'`. The tool issues a `confirmationToken` (crypto.randomUUID) stored in a module-level Map with a 15-minute TTL and returns it in the payload. (4) `bundle_push` updated to accept an optional `confirmation_token` parameter: on first-push attempts (detected via `git ls-remote --heads <remote> <branch>` empty, or, conservatively, when that probe fails to reach the remote), the handler requires a valid token that was issued for the SAME bundleRoot and has not expired. Missing, invalid, wrong-bundle, or expired token returns `kind: 'confirmation_required'`. On successful validation, the token is consumed (one-shot). 
This is the Directive D PII-review gate: the agent cannot push a bundle to the internet without the human explicitly acknowledging the PII payload first. 17 new tests cover the template, PII counting with empty and populated directories, GitHub public/private/404/network-error/malformed-response paths, URL parsing for gitlab/self-hosted/github, token storage + TTL expiry, and every token-gating refusal path in bundle_push. Bundle-templates.ts is added to the file-completeness exempt list since it's a pure constants module (design: `tools-bundle.ts` owns the observability for bundle operations)."
2613
2625
  ]
2614
2626
  },
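The alpha.288 entry above describes a confirmation token kept in a module-level Map with a 15-minute TTL, bound to one bundle root and consumed on first successful use. The sketch below illustrates that shape only; `issueConfirmationToken`, `consumeConfirmationToken`, and the injectable `now` parameter are hypothetical, and leaving a token intact after a wrong-bundle attempt is an assumption of this sketch rather than documented behavior.

```typescript
import { randomUUID } from "node:crypto";

// 15-minute lifetime, as stated in the changelog entry.
const TOKEN_TTL_MS = 15 * 60 * 1000;

interface PendingToken {
  bundleRoot: string;
  expiresAt: number;
}

// Module-level store; entries vanish on expiry check or successful use.
const pendingTokens = new Map<string, PendingToken>();

// Hypothetical helper: issue a token tied to one bundle root.
export function issueConfirmationToken(bundleRoot: string, now: number = Date.now()): string {
  const token = randomUUID();
  pendingTokens.set(token, { bundleRoot, expiresAt: now + TOKEN_TTL_MS });
  return token;
}

// Hypothetical helper: true only for a known, unexpired token issued for the
// same bundle root. A successful check deletes the token (one-shot).
export function consumeConfirmationToken(
  token: string | undefined,
  bundleRoot: string,
  now: number = Date.now(),
): boolean {
  if (!token) return false;
  const entry = pendingTokens.get(token);
  if (!entry) return false;
  if (now >= entry.expiresAt) {
    pendingTokens.delete(token);
    return false;
  }
  if (entry.bundleRoot !== bundleRoot) return false;
  pendingTokens.delete(token);
  return true;
}
```

A caller in the position of `bundle_push` would map a `false` result to the `kind: 'confirmation_required'` refusal the entry mentions.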
2615
2627
  {
2616
2628
  "version": "0.1.0-alpha.287",
2617
2629
  "changes": [
2618
- "feat(nerves): add two-layer log redaction to the NDJSON sink. Structured key-based redaction strips sensitive fields (passwords, tokens, API keys, auth headers) from event meta objects before serialization. Regex fallback catches secrets in serialized strings that bypass structured checks (Anthropic keys, OpenAI keys, Bearer tokens, URL token params). Redacted values are replaced with `[REDACTED:key_name]` markers that preserve debugging context without exposing secrets. Redaction happens at the sink level only \u2014 in-memory events retain full data for runtime use. `OURO_LOG_VERBOSE=1` env var disables redaction for active debugging sessions. New `src/nerves/redact.ts` module with 38 tests across 3 test files including a 10-case golden corpus covering nested objects, mixed safe/secret fields, and realistic key patterns."
2630
+ "feat(nerves): add two-layer log redaction to the NDJSON sink. Structured key-based redaction strips sensitive fields (passwords, tokens, API keys, auth headers) from event meta objects before serialization. Regex fallback catches secrets in serialized strings that bypass structured checks (Anthropic keys, OpenAI keys, Bearer tokens, URL token params). Redacted values are replaced with `[REDACTED:key_name]` markers that preserve debugging context without exposing secrets. Redaction happens at the sink level only; in-memory events retain full data for runtime use. `OURO_LOG_VERBOSE=1` env var disables redaction for active debugging sessions. New `src/nerves/redact.ts` module with 38 tests across 3 test files including a 10-case golden corpus covering nested objects, mixed safe/secret fields, and realistic key patterns."
2619
2631
  ]
2620
2632
  },
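The two redaction layers in the alpha.287 entry above (key-based first, regex fallback second) can be illustrated roughly as follows. This is a sketch under assumptions: `redactMeta`, the key pattern, and the single Bearer-token pattern are invented for illustration and are not taken from `src/nerves/redact.ts`; the regex layer here runs per string value rather than on the serialized line.

```typescript
// Hypothetical key pattern for layer 1 (structured, key-based redaction).
const SENSITIVE_KEY = /pass(word)?|token|secret|api[_-]?key|authorization/i;
// Hypothetical pattern for layer 2 (regex fallback on string contents).
const BEARER = /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g;

// Returns a redacted copy; the input object is left untouched so in-memory
// events keep their full data.
export function redactMeta(value: unknown, key = ""): unknown {
  if (key && SENSITIVE_KEY.test(key)) return `[REDACTED:${key}]`;
  if (typeof value === "string") return value.replace(BEARER, "[REDACTED:bearer]");
  if (Array.isArray(value)) return value.map((item) => redactMeta(item));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = redactMeta(v, k);
    }
    return out;
  }
  return value;
}
```

Naming the key inside the marker keeps the log line useful for debugging, since a reader can still see which field was present.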
2621
2633
  {
2622
2634
  "version": "0.1.0-alpha.286",
2623
2635
  "changes": [
2624
- "feat(bundle): new `src/repertoire/tools-bundle.ts` registers 7 agent-callable tools for managing the bundle's own git state: `bundle_check_sync_status`, `bundle_init_git`, `bundle_add_remote`, `bundle_list_first_commit`, `bundle_do_first_commit`, `bundle_push`, `bundle_pull_rebase`. Each tool computes `bundleRoot = getAgentRoot()` once at the top and refuses any path argument that resolves outside \u2014 the security boundary is enforced via `assertInsideBundle(bundleRoot, rel)` which normalizes the path and requires either equality with bundleRoot or the `bundleRoot + sep` prefix. Destructive operations (init on an existing repo, add_remote on a configured remote, pull_rebase on a dirty tree) refuse by default and require an explicit `force` or `discard_changes` flag (Directive B layered refusal pattern). `bundle_do_first_commit` stages files via explicit enumeration (`git add -- <file1> <file2>`) and refuses an empty files array \u2014 Directive A: the agent must enumerate what it wants to delete or commit, not recursively blast. `bundle_push` returns structured `{ ok, error, kind }` where kind is 'rejected' | 'network' | 'auth' | 'unknown', classified from stderr. `bundle_pull_rebase` returns `{ kind: 'conflict', conflictFiles: [...] }` so the agent can walk the human through resolution. All 7 tools registered in the flat registry at `src/repertoire/tools.ts` line 34. 41 new unit tests cover happy paths, every refusal path, URL validation, security boundary escapes, push error classification, and dirty-tree handling."
2636
+ "feat(bundle): new `src/repertoire/tools-bundle.ts` registers 7 agent-callable tools for managing the bundle's own git state: `bundle_check_sync_status`, `bundle_init_git`, `bundle_add_remote`, `bundle_list_first_commit`, `bundle_do_first_commit`, `bundle_push`, `bundle_pull_rebase`. Each tool computes `bundleRoot = getAgentRoot()` once at the top and refuses any path argument that resolves outside; the security boundary is enforced via `assertInsideBundle(bundleRoot, rel)` which normalizes the path and requires either equality with bundleRoot or the `bundleRoot + sep` prefix. Destructive operations (init on an existing repo, add_remote on a configured remote, pull_rebase on a dirty tree) refuse by default and require an explicit `force` or `discard_changes` flag (Directive B layered refusal pattern). `bundle_do_first_commit` stages files via explicit enumeration (`git add -- <file1> <file2>`) and refuses an empty files array. Directive A: the agent must enumerate what it wants to delete or commit, not recursively blast. `bundle_push` returns structured `{ ok, error, kind }` where kind is 'rejected' | 'network' | 'auth' | 'unknown', classified from stderr. `bundle_pull_rebase` returns `{ kind: 'conflict', conflictFiles: [...] }` so the agent can walk the human through resolution. All 7 tools registered in the flat registry at `src/repertoire/tools.ts` line 34. 41 new unit tests cover happy paths, every refusal path, URL validation, security boundary escapes, push error classification, and dirty-tree handling."
2625
2637
  ]
2626
2638
  },
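The containment rule in the alpha.286 entry above (normalize, then require equality with `bundleRoot` or the `bundleRoot + sep` prefix) can be sketched with Node's `path` module. The body below is an illustrative guess at such a check, not the actual `assertInsideBundle` from `tools-bundle.ts`; the error message and return value are assumptions.

```typescript
import * as path from "node:path";

// Illustrative re-creation of the described check; the real function may differ.
export function assertInsideBundle(bundleRoot: string, rel: string): string {
  const root = path.resolve(bundleRoot);
  // path.resolve collapses `..` segments; an absolute `rel` replaces `root` entirely.
  const resolved = path.resolve(root, rel);
  const inside = resolved === root || resolved.startsWith(root + path.sep);
  if (!inside) {
    throw new Error(`path escapes bundle root: ${rel}`);
  }
  return resolved;
}
```

Appending the separator before the prefix test matters: without it, a sibling directory such as `agent.ouro-evil` would pass a plain `startsWith(root)` check against `agent.ouro`.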
2627
2639
  {
2628
2640
  "version": "0.1.0-alpha.285",
2629
2641
  "changes": [
2630
- "feat(daemon): wire startPeriodicReconciliation() after scheduler.start() in daemon-entry.ts \u2014 prevents silent habit death when OS cron fails by giving the daemon a self-healing reconciliation loop.",
2631
- "feat(guardrails): protect agent.json from agent self-modification \u2014 closes a vector where prompt injection could alter the agent's own identity, provider, or model settings.",
2632
- "feat(habits): add tools field to HabitFile interface and habit-parser \u2014 habits can now declare tools: [read, web_fetch] in frontmatter. Schema-only; runtime enforcement ships in a follow-up PR."
2642
+ "feat(daemon): wire startPeriodicReconciliation() after scheduler.start() in daemon-entry.ts: prevents silent habit death when OS cron fails by giving the daemon a self-healing reconciliation loop.",
2643
+ "feat(guardrails): protect agent.json from agent self-modification: closes a vector where prompt injection could alter the agent's own identity, provider, or model settings.",
2644
+ "feat(habits): add tools field to HabitFile interface and habit-parser: habits can now declare tools: [read, web_fetch] in frontmatter. Schema-only; runtime enforcement ships in a follow-up PR."
2633
2645
  ]
2634
2646
  },
2635
2647
  {
@@ -2642,28 +2654,28 @@
2642
2654
  "version": "0.1.0-alpha.283",
2643
2655
  "changes": [
2644
2656
  "chore(housekeeping): clean up ~3600 leaked test secret dirs in `~/.agentsecrets/`. The auth CLI test suite was creating ephemeral agent secret dirs (auth-local-*, auth-store-*, auth-no-switch-*, etc.) without cleaning them up, accreting over weeks into thousands of orphaned entries. Also purged three non-test orphans (`testagent`, `model-reviews`, `config-model-facing-*`) that were leftovers from manual probe sessions. Only the operator's real agent bundles remain.",
2645
- "chore(housekeeping): remove orphan bundle stubs from `~/AgentBundles/` (default.ouro, thoughts-test-*, an empty `friends/` dir, and a .DS_Store). These were skeletal test leftovers with no real identity \u2014 the daemon was already filtering them out of `listEnabledBundleAgents`, so deletion is invisible to the runtime but stops confusing anyone inspecting the bundles directory.",
2646
- "chore(tests): replace the `outlookServer.stop` v8 ignore band-aid in `daemon.ts` with a proper test. `OuroDaemonOptions` gains an `outlookServerFactory` seam that lets tests inject an in-memory stub handle instead of binding port 6876 (which a running production daemon holds on dev machines, causing EADDRINUSE flakes). New `daemon-outlook-lifecycle.test.ts` covers the happy path (factory runs, stop is called), the error path (factory throws, daemon logs warn and keeps going, stop is a no-op), and the double-start guard. The default production factory path (`createDefaultOutlookServer` \u2014 wires the real `startOutlookHttpServer` with bundlesRoot + view builders) is v8-ignored because it only runs under real bind on port 6876; `startOutlookHttpServer` itself has full coverage in `outlook-http.test.ts`. Removed the redundant `if (!this.outlookServer)` wrapper in `startInner()` that was guarding against an unreachable retry scenario. `daemon.ts` now hits 100% statements / branches / functions / lines with zero port binding in the test suite.",
2647
- "chore(ci): tighten the wrapper publish-sync check timing. The version-bump check and wrapper-publish-sync check now run BEFORE the ~2+ minute coverage gate, not after. Previously, a contributor would wait for the full test suite to pass only to then see a 'version already published' error. Now the fast checks surface in under a minute, and coverage only runs once version/wrapper gates have cleared. Same checks, same logic \u2014 just reordered steps in `.github/workflows/coverage.yml`."
2657
+ "chore(housekeeping): remove orphan bundle stubs from `~/AgentBundles/` (default.ouro, thoughts-test-*, an empty `friends/` dir, and a .DS_Store). These were skeletal test leftovers with no real identity; the daemon was already filtering them out of `listEnabledBundleAgents`, so deletion is invisible to the runtime but stops confusing anyone inspecting the bundles directory.",
2658
+ "chore(tests): replace the `outlookServer.stop` v8 ignore band-aid in `daemon.ts` with a proper test. `OuroDaemonOptions` gains an `outlookServerFactory` seam that lets tests inject an in-memory stub handle instead of binding port 6876 (which a running production daemon holds on dev machines, causing EADDRINUSE flakes). New `daemon-outlook-lifecycle.test.ts` covers the happy path (factory runs, stop is called), the error path (factory throws, daemon logs warn and keeps going, stop is a no-op), and the double-start guard. The default production factory path (`createDefaultOutlookServer`, which wires the real `startOutlookHttpServer` with bundlesRoot + view builders) is v8-ignored because it only runs under real bind on port 6876; `startOutlookHttpServer` itself has full coverage in `outlook-http.test.ts`. Removed the redundant `if (!this.outlookServer)` wrapper in `startInner()` that was guarding against an unreachable retry scenario. `daemon.ts` now hits 100% statements / branches / functions / lines with zero port binding in the test suite.",
2659
+ "chore(ci): tighten the wrapper publish-sync check timing. The version-bump check and wrapper-publish-sync check now run BEFORE the ~2+ minute coverage gate, not after. Previously, a contributor would wait for the full test suite to pass only to then see a 'version already published' error. Now the fast checks surface in under a minute, and coverage only runs once version/wrapper gates have cleared. Same checks, same logic, just reordered steps in `.github/workflows/coverage.yml`."
2648
2660
  ]
2649
2661
  },
2650
2662
  {
2651
2663
  "version": "0.1.0-alpha.282",
2652
2664
  "changes": [
2653
- "feat(cli): crash-resilient sessions \u2014 saves after each tool result, repairs orphaned tool calls on resume."
2665
+ "feat(cli): crash-resilient sessions: saves after each tool result, repairs orphaned tool calls on resume."
2654
2666
  ]
2655
2667
  },
2656
2668
  {
2657
2669
  "version": "0.1.0-alpha.281",
2658
2670
  "changes": [
2659
- "feat(bundle): new `src/heart/bundle-state.ts` module exports `detectBundleState(agentRoot)` which returns a structured `BundleStateIssue[]` describing git-level problems the agent can remediate. Enum cases: `not_a_git_repo`, `no_remote_configured`, `first_commit_never_happened`, `pending_sync_exists`. Detection never throws \u2014 every git call is wrapped in try/catch so a broken bundle degrades to a clear signal rather than crashing the turn pipeline. Also exports `renderBundleStateHint(issues)` which produces first-person remediation guidance (per the memory rule) that tells the agent to call `bundle_check_sync_status` and the `bundle_*` tools shipping in a follow-up PR.",
2660
- "feat(bundle): `StartOfTurnPacket` gains an optional `bundleState?: BundleStateIssue[]` field, populated by the senses pipeline in `handleInboundTurn` at packet assembly time. Renders via a new `case \"bundleState\":` branch in `formatSections` with the `**Bundle:**` prefix, at the same priority tier as the legacy `syncFailure` free-form string (priority 7, truncated last). The two signals coexist during the transition \u2014 `syncFailure` is still emitted by sync.ts for humans, while `bundleState` is the structured form the agent pattern-matches on. Deferred to follow-up: `pending-sync.json` schema extension with `classification` / `conflictFiles` so sync.ts can distinguish push-rejected from pull-rebase-conflict."
2671
+ "feat(bundle): new `src/heart/bundle-state.ts` module exports `detectBundleState(agentRoot)` which returns a structured `BundleStateIssue[]` describing git-level problems the agent can remediate. Enum cases: `not_a_git_repo`, `no_remote_configured`, `first_commit_never_happened`, `pending_sync_exists`. Detection never throws: every git call is wrapped in try/catch so a broken bundle degrades to a clear signal rather than crashing the turn pipeline. Also exports `renderBundleStateHint(issues)` which produces first-person remediation guidance (per the memory rule) that tells the agent to call `bundle_check_sync_status` and the `bundle_*` tools shipping in a follow-up PR.",
2672
+ "feat(bundle): `StartOfTurnPacket` gains an optional `bundleState?: BundleStateIssue[]` field, populated by the senses pipeline in `handleInboundTurn` at packet assembly time. Renders via a new `case \"bundleState\":` branch in `formatSections` with the `**Bundle:**` prefix, at the same priority tier as the legacy `syncFailure` free-form string (priority 7, truncated last). The two signals coexist during the transition: `syncFailure` is still emitted by sync.ts for humans, while `bundleState` is the structured form the agent pattern-matches on. Deferred to follow-up: `pending-sync.json` schema extension with `classification` / `conflictFiles` so sync.ts can distinguish push-rejected from pull-rebase-conflict."
2661
2673
  ]
2662
2674
  },
2663
2675
  {
2664
2676
  "version": "0.1.0-alpha.280",
2665
2677
  "changes": [
2666
- "chore(tests): prod-path isolation ratchet + tmpBundle leak guard + no-rm-rf contract. Extends `src/__tests__/heart/daemon/test-isolation.contract.test.ts` with three new rules and adds a runtime leak guard. (1) Three new prod-path block rules mirror the existing ~/AgentBundles check: no test file may construct a write path under `~/.ouro-cli`, `~/.agentsecrets`, or `~/.claude` without mocking fs. Each rule has its own empty-seeded ratchet allowlist (except `.ouro-cli` and `.agentsecrets` which are seeded with the pre-existing offenders that this rule newly catches \u2014 follow-up PRs can convert those to mocked-fs and ratchet down). The path-scan loop is factored into a shared `runProdPathCheck` helper. (2) New Directive-A contract rule: agent-callable production code under `src/` (excluding `src/__tests__/`) must not call `fs.rmSync(..., { recursive: true })` or shell out to `rm -rf` / `rm -fr` / `rm --recursive --force`. The rule is about making deletion auditable and interruptible: an agent should enumerate the files it wants to delete instead of recursively blasting a directory. Four legitimate infrastructure callsites are on `RM_RECURSIVE_ALLOWLIST` with explicit justifications: specialist-tools.ts (adoption rollback), ouro-version-manager.ts (CLI version pruning), ouro-uti.ts (macOS icon pipeline), cli-defaults.ts (self-setup temp dir). Three files that are themselves the rm-rf enforcement layer (guardrails.ts, shell-sessions.ts, prompt.ts) are in `RM_RULE_ENFORCEMENT_FILES` and skipped by the scan since they contain the literal \"rm -rf\" only as regex patterns or prompt strings designed to BLOCK the call. (3) `createTmpBundle` in `src/__tests__/test-helpers/tmpdir-bundle.ts` now tracks live handles in a module-level `_liveHandles: Set<TmpBundleHandle>`. `cleanup()` removes the handle from the set; a new `__getLiveTmpBundleHandles()` export returns a readonly view. 
(4) `src/__tests__/nerves/global-capture.ts` adds a global vitest `afterEach` leak guard that iterates `__getLiveTmpBundleHandles` and calls `cleanup()` on any remaining handles, logging a `console.warn` naming the test that leaked them. Runs AFTER the pairing guard from alpha.276 so pairing failures surface first. Both guards co-exist cleanly."
+ "chore(tests): prod-path isolation ratchet + tmpBundle leak guard + no-rm-rf contract. Extends `src/__tests__/heart/daemon/test-isolation.contract.test.ts` with three new rules and adds a runtime leak guard. (1) Three new prod-path block rules mirror the existing ~/AgentBundles check: no test file may construct a write path under `~/.ouro-cli`, `~/.agentsecrets`, or `~/.claude` without mocking fs. Each rule has its own empty-seeded ratchet allowlist (except `.ouro-cli` and `.agentsecrets` which are seeded with the pre-existing offenders that this rule newly catches -- follow-up PRs can convert those to mocked-fs and ratchet down). The path-scan loop is factored into a shared `runProdPathCheck` helper. (2) New Directive-A contract rule: agent-callable production code under `src/` (excluding `src/__tests__/`) must not call `fs.rmSync(..., { recursive: true })` or shell out to `rm -rf` / `rm -fr` / `rm --recursive --force`. The rule is about making deletion auditable and interruptible: an agent should enumerate the files it wants to delete instead of recursively blasting a directory. Four legitimate infrastructure callsites are on `RM_RECURSIVE_ALLOWLIST` with explicit justifications: specialist-tools.ts (adoption rollback), ouro-version-manager.ts (CLI version pruning), ouro-uti.ts (macOS icon pipeline), cli-defaults.ts (self-setup temp dir). Three files that are themselves the rm-rf enforcement layer (guardrails.ts, shell-sessions.ts, prompt.ts) are in `RM_RULE_ENFORCEMENT_FILES` and skipped by the scan since they contain the literal \"rm -rf\" only as regex patterns or prompt strings designed to BLOCK the call. (3) `createTmpBundle` in `src/__tests__/test-helpers/tmpdir-bundle.ts` now tracks live handles in a module-level `_liveHandles: Set<TmpBundleHandle>`. `cleanup()` removes the handle from the set; a new `__getLiveTmpBundleHandles()` export returns a readonly view. 
(4) `src/__tests__/nerves/global-capture.ts` adds a global vitest `afterEach` leak guard that iterates `__getLiveTmpBundleHandles` and calls `cleanup()` on any remaining handles, logging a `console.warn` naming the test that leaked them. Runs AFTER the pairing guard from alpha.276 so pairing failures surface first. Both guards co-exist cleanly."
  ]
  },
  {
@@ -2675,16 +2687,16 @@
  {
  "version": "0.1.0-alpha.278",
  "changes": [
- "fix(identity): spread-with-validation loader eliminates the field-drop bug class that caused #349 (silent `sync` drop). Two distinct structural bugs fixed, one in each agent.json loader: (1) `loadAgentConfig` in `src/heart/identity.ts` previously built the returned `AgentConfig` via a hand-rolled object literal that listed every field explicitly \u2014 any new field on `AgentConfig` that wasn't added to this literal got silently dropped. Root cause of #349. Refactored to a spread-then-override pattern: start with `{ ...parsed as AgentConfig }`, then explicitly override `version`, `enabled`, `humanFacing`, `agentFacing`, `senses`, `phrases` which need validation/normalization. The deprecated `provider` field is re-attached from the validated `rawProvider` check. (2) `readAgentConfigForAgent` in `src/heart/auth/auth-flow.ts` previously did `parsed as unknown as AgentConfig` which passed through ALL fields unconditionally (so no silent drops) but ALSO had zero per-field validation \u2014 any garbage in agent.json leaked into the returned config. Refactored to apply the same spread-with-validation pattern, reusing `normalizeSenses` (now exported from identity.ts). Both entry points now return equivalent configs for the same agent.json file. New `src/__tests__/heart/identity-fixture.ts` exports `FULL_AGENT_JSON satisfies DeepRequired<AgentConfig>` \u2014 a compile-time regression guard that fails to build if any new field is added to `AgentConfig` without updating the fixture. Two new test files: `identity-contract.test.ts` exercises `readAgentConfigForAgent` with `createTmpBundle`; `identity-load-contract.test.ts` exercises `loadAgentConfig` via `vi.mock(\"fs\")` (split to avoid the fs-mock-vs-real-fs conflict). 7 tests total."
+ "fix(identity): spread-with-validation loader eliminates the field-drop bug class that caused #349 (silent `sync` drop). Two distinct structural bugs fixed, one in each agent.json loader: (1) `loadAgentConfig` in `src/heart/identity.ts` previously built the returned `AgentConfig` via a hand-rolled object literal that listed every field explicitly -- any new field on `AgentConfig` that wasn't added to this literal got silently dropped. Root cause of #349. Refactored to a spread-then-override pattern: start with `{ ...parsed as AgentConfig }`, then explicitly override `version`, `enabled`, `humanFacing`, `agentFacing`, `senses`, `phrases` which need validation/normalization. The deprecated `provider` field is re-attached from the validated `rawProvider` check. (2) `readAgentConfigForAgent` in `src/heart/auth/auth-flow.ts` previously did `parsed as unknown as AgentConfig` which passed through ALL fields unconditionally (so no silent drops) but ALSO had zero per-field validation -- any garbage in agent.json leaked into the returned config. Refactored to apply the same spread-with-validation pattern, reusing `normalizeSenses` (now exported from identity.ts). Both entry points now return equivalent configs for the same agent.json file. New `src/__tests__/heart/identity-fixture.ts` exports `FULL_AGENT_JSON satisfies DeepRequired<AgentConfig>` -- a compile-time regression guard that fails to build if any new field is added to `AgentConfig` without updating the fixture. Two new test files: `identity-contract.test.ts` exercises `readAgentConfigForAgent` with `createTmpBundle`; `identity-load-contract.test.ts` exercises `loadAgentConfig` via `vi.mock(\"fs\")` (split to avoid the fs-mock-vs-real-fs conflict). 7 tests total."
  ]
  },
  {
  "version": "0.1.0-alpha.277",
  "changes": [
- "feat(pulse): multi-agent situational awareness for peer agents on the same machine. The harness scales horizontally \u2014 multiple peer agents share a machine, each with their own identity and bundle (the Bob model from We Are Legion / We Are Bob). Without explicit awareness, peer agents are isolated workers who don't even know each other exist. The pulse fixes that.",
+ "feat(pulse): multi-agent situational awareness for peer agents on the same machine. The harness scales horizontally -- multiple peer agents share a machine, each with their own identity and bundle (the Bob model from We Are Legion / We Are Bob). Without explicit awareness, peer agents are isolated workers who don't even know each other exist. The pulse fixes that.",
  "feat(pulse): new `src/heart/daemon/pulse.ts` module. Daemon writes `~/.ouro-cli/pulse.json` whenever any managed agent's snapshot changes (status, errorReason, fixHint, etc.). Each entry includes name, bundle path, status, last-seen-at, errorReason+fixHint when broken, alertId for at-most-once delivery tracking, and currentActivity (read from each agent's `state/sessions/self/inner/runtime.json` when running). Pure helpers `buildPulseState`, `findNovelBrokenAgents`, `findRecoveredAgents`, `pickWakeRecipient`, `flushPulse`, `readAgentActivity`, `buildAlertId`, `buildRecoveryAlertId`, `pruneDeliveredState` exported for unit coverage; I/O wrappers `writePulse`, `readPulse`, `writeDeliveredState`, `readDeliveredState` use injectable deps.",
  "feat(pulse): both passive AND active surfacing. Passive: every agent's prompt assembly renders a `## the pulse` section in Group 7 (dynamic state) showing siblings grouped into broken / reachable / idle buckets. Self-excluded so each agent describes its peers, not itself. Renders nothing on single-agent machines (zero token cost). Active: when a sibling newly breaks (or recovers), the daemon fires `inner.wake` on the most-recently-active running agent so the user finds out within seconds rather than next-time-they-talk-to-someone. Persistent at-most-once delivery via `~/.ouro-cli/pulse-delivered.json` so daemon restarts don't re-page.",
- "feat(pulse): horizontal-scaling norm in bodyMapSection. New `### peers` subsection teaches the agent the Bob model directly: 'i talk first. when i need a sibling's help, i `send_message` them \u2014 that's how peers coordinate, the same way humans on a team do. i only open a sibling's bundle directly via read_file/glob/grep when conversation isn't possible (they're crashed, sleeping, or i need history they haven't surfaced).' Direct declarative voice \u2014 no hedging, no soft modal verbs.",
+ "feat(pulse): horizontal-scaling norm in bodyMapSection. New `### peers` subsection teaches the agent the Bob model directly: 'i talk first. when i need a sibling's help, i `send_message` them -- that's how peers coordinate, the same way humans on a team do. i only open a sibling's bundle directly via read_file/glob/grep when conversation isn't possible (they're crashed, sleeping, or i need history they haven't surfaced).' Direct declarative voice -- no hedging, no soft modal verbs.",
  "feat(daemon): `DaemonAgentSnapshot` gains `errorReason` and `fixHint` fields, populated by `checkAgentConfig` results in `startAgent` (set on failure, cleared on recovery). Cleared error fields are how the recovery wake fires: when a previously-broken agent transitions to running with `errorReason: null`, `findRecoveredAgents` flags it.",
  "feat(daemon): `DaemonProcessManager` gains `onSnapshotChange` callback option. Called after every snapshot mutation (start, exit, config-fail, recovery, restart-exhausted). Errors from the observer are swallowed so lifecycle code never breaks because the observer threw. The daemon-entry registers a callback that calls `flushPulse` to update the pulse state and fire wakes.",
  "feat(daemon): wake recipient picker (`pickWakeRecipient`) chooses the most-recently-active running sibling. Excludes the broken agent itself, non-running siblings, and siblings that have never been seen alive. Returns null when no eligible recipient exists, in which case the alert is still marked delivered to avoid spam on subsequent flushes."
@@ -2693,80 +2705,80 @@
  {
  "version": "0.1.0-alpha.276",
  "changes": [
- "fix(nerves): root-cause the `start_end_pairing` audit flake on `daemon.server_start`, `daemon.update_checker_start`, and `daemon.apply_pending_updates_start`. Three fixes: (1) `applyPendingUpdates` in `update-hooks.ts` now wraps its body in try/finally so `_end` always fires, including on the early returns for `!fs.existsSync(bundlesRoot)` and `readdirSync` throws that previously orphaned the `_start`. (2) `daemon.start()` now wraps the ~380-line startup body in a try/catch that emits `daemon.server_error` with the error message and rethrows \u2014 the audit's pairing rule accepts `_end` OR `_error` as a valid closure for a `_start`, so startup throws no longer orphan `server_start`. The body was extracted to a private `startInner()` method to keep the try/catch small and readable. (3) `startUpdateChecker` callers in the update-checker test suite were already paired via an `afterEach(stopUpdateChecker)`; audited and confirmed no new unpaired callers. Also adds a vitest `afterEach` pairing guard in `src/__tests__/nerves/global-capture.ts` that fails loudly on any orphaned lifecycle `_start` in a test's per-test event stream \u2014 scoped to the three lifecycle events above so it catches regressions without false-positiving on legitimate narrow-slice operational events like `repertoire.task_scan_start`. New `src/__tests__/nerves/pairing-regression.test.ts` locks in the contract with 5 tests (nonexistent-dir, readdirSync-throws, happy-path, startUpdateChecker-pair, daemon-start-throw). Three consecutive `npm run test:coverage` runs confirm zero flakes on the target events."
+ "fix(nerves): root-cause the `start_end_pairing` audit flake on `daemon.server_start`, `daemon.update_checker_start`, and `daemon.apply_pending_updates_start`. Three fixes: (1) `applyPendingUpdates` in `update-hooks.ts` now wraps its body in try/finally so `_end` always fires, including on the early returns for `!fs.existsSync(bundlesRoot)` and `readdirSync` throws that previously orphaned the `_start`. (2) `daemon.start()` now wraps the ~380-line startup body in a try/catch that emits `daemon.server_error` with the error message and rethrows -- the audit's pairing rule accepts `_end` OR `_error` as a valid closure for a `_start`, so startup throws no longer orphan `server_start`. The body was extracted to a private `startInner()` method to keep the try/catch small and readable. (3) `startUpdateChecker` callers in the update-checker test suite were already paired via an `afterEach(stopUpdateChecker)`; audited and confirmed no new unpaired callers. Also adds a vitest `afterEach` pairing guard in `src/__tests__/nerves/global-capture.ts` that fails loudly on any orphaned lifecycle `_start` in a test's per-test event stream -- scoped to the three lifecycle events above so it catches regressions without false-positiving on legitimate narrow-slice operational events like `repertoire.task_scan_start`. New `src/__tests__/nerves/pairing-regression.test.ts` locks in the contract with 5 tests (nonexistent-dir, readdirSync-throws, happy-path, startUpdateChecker-pair, daemon-start-throw). Three consecutive `npm run test:coverage` runs confirm zero flakes on the target events."
  ]
  },
  {
  "version": "0.1.0-alpha.275",
  "changes": [
- "fix(tui): backspace on macOS \u2014 Ink 3.2 maps \\x7f to key.delete not key.backspace."
+ "fix(tui): backspace on macOS -- Ink 3.2 maps \\x7f to key.delete not key.backspace."
  ]
  },
  {
  "version": "0.1.0-alpha.274",
  "changes": [
  "feat(nerves): daemon log rotation is now 25 MB x 5 gzipped generations instead of 50 MB x 2 uncompressed, dropping peak disk per stream from ~150 MB to ~30 MB. `createNdjsonFileSink` and `rotateIfNeeded` in `src/nerves/index.ts` accept an options object `{ maxSizeBytes, maxGenerations, compress, rotationCheckIntervalBytes }` (with number-form backcompat for the old positional API). Rotation uses a rename-then-gzip pattern so active writers can keep their fd alive while the renamed file gets compressed. Legacy uncompressed `.1.ndjson`/`.2.ndjson` files from the old scheme are tolerated and gzip-migrated on first rotation. Lifecycle emits paired `nerves.rotation_start` / `nerves.rotation_end` events with a shared trace_id, plus `nerves.rotation_error` on failure via a completion-flag try/catch.",
- "feat(log-tailer): `ouro logs` can now read rotated `.ndjson.gz` generations, so `--lines N` spans across historical files. `discoverLogFiles` matches both `.ndjson` and `.ndjson.gz`, parses filenames into (streamBase, rank) tuples, and sorts chronologically (oldest generation first, active last). A new internal `readNdjsonFileContents` helper dispatches to `zlib.gunzipSync` for `.gz` paths while preserving the DI-stubbed plain-file path for existing tests. Follow mode only watches the active stream \u2014 gzipped generations are historical by definition and never tailed.",
- "feat(logs-prune): new `ouro logs prune` subcommand applies the active rotation policy to every oversized `.ndjson` file in the agent daemon logs directory. Idempotent \u2014 a second run on a compliant dir is a no-op. Concurrent-writer-safe because it delegates to `rotateIfNeeded`'s rename-then-gzip pattern (no locking needed). Emits paired `nerves.logs_prune_start` / `nerves.logs_prune_end` with `nerves.logs_prune_error` on failure. Prints `compacted N file(s), freed M bytes` to stdout. New module `src/heart/daemon/logs-prune.ts` exports `pruneDaemonLogs(options)`; the CLI wire-up adds a `daemon.logs.prune` command kind across cli-types/cli-parse/cli-exec and a `pruneDaemonLogs` dep to `createDefaultOuroCliDeps`.",
+ "feat(log-tailer): `ouro logs` can now read rotated `.ndjson.gz` generations, so `--lines N` spans across historical files. `discoverLogFiles` matches both `.ndjson` and `.ndjson.gz`, parses filenames into (streamBase, rank) tuples, and sorts chronologically (oldest generation first, active last). A new internal `readNdjsonFileContents` helper dispatches to `zlib.gunzipSync` for `.gz` paths while preserving the DI-stubbed plain-file path for existing tests. Follow mode only watches the active stream -- gzipped generations are historical by definition and never tailed.",
+ "feat(logs-prune): new `ouro logs prune` subcommand applies the active rotation policy to every oversized `.ndjson` file in the agent daemon logs directory. Idempotent -- a second run on a compliant dir is a no-op. Concurrent-writer-safe because it delegates to `rotateIfNeeded`'s rename-then-gzip pattern (no locking needed). Emits paired `nerves.logs_prune_start` / `nerves.logs_prune_end` with `nerves.logs_prune_error` on failure. Prints `compacted N file(s), freed M bytes` to stdout. New module `src/heart/daemon/logs-prune.ts` exports `pruneDaemonLogs(options)`; the CLI wire-up adds a `daemon.logs.prune` command kind across cli-types/cli-parse/cli-exec and a `pruneDaemonLogs` dep to `createDefaultOuroCliDeps`.",
  "fix(launchd): drop the stale `StandardErrorPath` plist key that pointed at `ouro-daemon-stderr.log`. The file grew to 366 MB in the wild because the daemon stopped writing to it (the nerves ndjson pipeline has been the source of truth for diagnostics since the nerves layer landed) but nothing ever removed the plist key, so launchd kept the path registered and occasional process-level stderr still dripped in. Removing the key lets launchd forward stray stderr to the system log forwarder where the OS rotates it. The 366 MB orphaned file on disk at `~/AgentBundles/<agent>.ouro/state/daemon/logs/ouro-daemon-stderr.log` is safe to delete manually and is flagged in the PR summary for user cleanup.",
- "refactor(runtime-logging,cli-logging): every `createNdjsonFileSink` call site now passes an explicit `{ maxSizeBytes, maxGenerations, compress }` options object. No callsite silently relies on the old 50 MB default \u2014 the policy is visible at each wire-up point. This also removes the `/* v8 ignore */` around the rotation trigger in the sink's flush() loop; tests now exercise it directly via the new `rotationCheckIntervalBytes` option."
+ "refactor(runtime-logging,cli-logging): every `createNdjsonFileSink` call site now passes an explicit `{ maxSizeBytes, maxGenerations, compress }` options object. No callsite silently relies on the old 50 MB default -- the policy is visible at each wire-up point. This also removes the `/* v8 ignore */` around the rotation trigger in the sink's flush() loop; tests now exercise it directly via the new `rotationCheckIntervalBytes` option."
  ]
  },
  {
  "version": "0.1.0-alpha.273",
  "changes": [
- "feat(version-manager): auto-prune old CLI versions during activate. The user observed `~/.ouro-cli/versions/` accumulating every CLI version they'd ever installed (alpha.85 from 2026-03-20 onward, ~100MB+ of dead node_modules trees) because nothing ever GCed. New `pruneOldVersions(retain=5, deps?)` walks `~/.ouro-cli/versions/`, sorts by alpha-suffix numerically, and deletes everything outside the retention window \u2014 except always preserves (a) the currently-active version (CurrentVersion symlink target), and (b) the previous version (previous symlink target), so `ouro rollback` stays one command away. Wired into `cli-defaults.ts` via `ensureCurrentVersionInstalled` and `activateCliVersion` \u2014 every successful version activation self-prunes. New helpers `compareCliVersions(a, b)` and `selectVersionsToPrune(installed, protected, retain)` are pure and exported for direct unit coverage. 9 new tests covering version comparison, retention selection, current/previous protection, partial-failure handling, missing-symlink fallback, and non-directory entry filtering.",
- "test(contract): new OURO_DAEMON_INSTANTIATION_ALLOWLIST in test-isolation.contract.test.ts flags any new test file that constructs `new OuroDaemon(...)` outside the 11 grandfathered files. Constructing a real daemon and calling start() runs killOrphanProcesses() and writePidfile() against the production pidfile at ~/.ouro-cli/daemon.pids. The runtime guards added in #346 short-circuit those functions under vitest, but if a future change to start() adds a NEW production-state side-effect, the existing 11 tests would silently exercise it. The contract test forces conscious review of each new file taking this shape \u2014 same defense-in-depth pattern as BYPASS_USE_ALLOWLIST in alpha.265."
+ "feat(version-manager): auto-prune old CLI versions during activate. The user observed `~/.ouro-cli/versions/` accumulating every CLI version they'd ever installed (alpha.85 from 2026-03-20 onward, ~100MB+ of dead node_modules trees) because nothing ever GCed. New `pruneOldVersions(retain=5, deps?)` walks `~/.ouro-cli/versions/`, sorts by alpha-suffix numerically, and deletes everything outside the retention window -- except always preserves (a) the currently-active version (CurrentVersion symlink target), and (b) the previous version (previous symlink target), so `ouro rollback` stays one command away. Wired into `cli-defaults.ts` via `ensureCurrentVersionInstalled` and `activateCliVersion` -- every successful version activation self-prunes. New helpers `compareCliVersions(a, b)` and `selectVersionsToPrune(installed, protected, retain)` are pure and exported for direct unit coverage. 9 new tests covering version comparison, retention selection, current/previous protection, partial-failure handling, missing-symlink fallback, and non-directory entry filtering.",
+ "test(contract): new OURO_DAEMON_INSTANTIATION_ALLOWLIST in test-isolation.contract.test.ts flags any new test file that constructs `new OuroDaemon(...)` outside the 11 grandfathered files. Constructing a real daemon and calling start() runs killOrphanProcesses() and writePidfile() against the production pidfile at ~/.ouro-cli/daemon.pids. The runtime guards added in #346 short-circuit those functions under vitest, but if a future change to start() adds a NEW production-state side-effect, the existing 11 tests would silently exercise it. The contract test forces conscious review of each new file taking this shape -- same defense-in-depth pattern as BYPASS_USE_ALLOWLIST in alpha.265."
  ]
  },
  {
  "version": "0.1.0-alpha.272",
  "changes": [
- "fix(daemon): the daemon's periodic update checker can now actually auto-update itself. The `onUpdate` callback in `daemon.ts` invoked `performStagedRestart` which (a) ran `npm install -g @ouro.bot/cli@{version}` to install to the global node prefix, and (b) tried to find the new code via `node -e \"console.log(require.resolve('@ouro.bot/cli/package.json'))\"` which depends on the daemon process's NODE_PATH. Neither path was actually reachable from the daemon process running out of `~/.ouro-cli/versions/{version}/...`, so every auto-update attempt bailed at `daemon.staged_restart_path_failed` (\"could not resolve new code path\") and the daemon never updated itself. The user had to manually run `ouro up` to pick up new versions. Fix: switch the staged restart to use the version-managed installer (same one the CLI's `up` flow uses) \u2014 `installVersion(version)` puts files at `~/.ouro-cli/versions/{version}/node_modules/@ouro.bot/cli` (deterministic, computable), then `activateVersion(version)` flips the CurrentVersion symlink so the next user-driven `ouro up` sees the same version the daemon is running. `performStagedRestart` gained an optional `installNewVersion` dep that production callers inject; the legacy `npm install -g` fallback path is preserved for tests. Two new regression tests in `staged-restart.test.ts`.",
+ "fix(daemon): the daemon's periodic update checker can now actually auto-update itself. The `onUpdate` callback in `daemon.ts` invoked `performStagedRestart` which (a) ran `npm install -g @ouro.bot/cli@{version}` to install to the global node prefix, and (b) tried to find the new code via `node -e \"console.log(require.resolve('@ouro.bot/cli/package.json'))\"` which depends on the daemon process's NODE_PATH. Neither path was actually reachable from the daemon process running out of `~/.ouro-cli/versions/{version}/...`, so every auto-update attempt bailed at `daemon.staged_restart_path_failed` (\"could not resolve new code path\") and the daemon never updated itself. The user had to manually run `ouro up` to pick up new versions. Fix: switch the staged restart to use the version-managed installer (same one the CLI's `up` flow uses) -- `installVersion(version)` puts files at `~/.ouro-cli/versions/{version}/node_modules/@ouro.bot/cli` (deterministic, computable), then `activateVersion(version)` flips the CurrentVersion symlink so the next user-driven `ouro up` sees the same version the daemon is running. `performStagedRestart` gained an optional `installNewVersion` dep that production callers inject; the legacy `npm install -g` fallback path is preserved for tests. Two new regression tests in `staged-restart.test.ts`.",
  "fix(cli): `ouro up` no longer prints the 'ouro updated to ...' message twice during npx invocations. Three independent paths can detect a CLI version change: (1) `checkForCliUpdate` finds a newer version on npm and re-execs (cross-process print), (2) `ensureCurrentVersionInstalled` flips the CurrentVersion symlink during `performSystemSetup` because the running package version is newer than what the symlink pointed at (path 2, in-process), (3) `bundle-meta.json`'s stored runtime version differs from the running version (path 3, in-process fallback). Path 3's existing guard `linkedVersionBeforeUp !== currentVersion` correctly skipped path 3 when path 1 had already printed in a different process, but did NOT catch the case where path 2 fired in the same process. Verified live on 2026-04-08: `npx --yes @ouro.bot/cli@alpha up` printed the message twice. Fix: track an in-process `printedUpdateMessage` flag set by path 2; path 3 checks it before printing. Path 3 still acts as a fallback when path 2 didn't fire. New regression test in `daemon-cli-update-flow.test.ts` simulates the npx scenario and asserts exactly one print.",
- "fix(hooks): Claude Code lifecycle hooks (`ouro hook session-start|stop|post-tool-use`) now short-circuit when the daemon socket file doesn't exist, instead of attempting `sendDaemonCommand` and logging two ENOENT errors per hook fire (one for `message.send`, one for `inner.wake`). Every Claude Code event during a daemon-down window was producing noisy `connect ENOENT /tmp/ouroboros-daemon.sock` entries in `ouro.ndjson`, which made it hard to read logs around outages. The hook is best-effort by design \u2014 dropping notifications when the daemon is down is correct behavior; we just don't want to log spam about it. New nerves event `daemon.hook_skipped_no_socket` (info level) when the short-circuit fires."
+ "fix(hooks): Claude Code lifecycle hooks (`ouro hook session-start|stop|post-tool-use`) now short-circuit when the daemon socket file doesn't exist, instead of attempting `sendDaemonCommand` and logging two ENOENT errors per hook fire (one for `message.send`, one for `inner.wake`). Every Claude Code event during a daemon-down window was producing noisy `connect ENOENT /tmp/ouroboros-daemon.sock` entries in `ouro.ndjson`, which made it hard to read logs around outages. The hook is best-effort by design -- dropping notifications when the daemon is down is correct behavior; we just don't want to log spam about it. New nerves event `daemon.hook_skipped_no_socket` (info level) when the short-circuit fires."
  ]
  },
  {
  "version": "0.1.0-alpha.271",
  "changes": [
- "fix(daemon): unblock daemon.stop deadlock that hung `ouro up` after a CLI auto-update. When the running daemon's version drifted from the local CLI version, `ensureDaemonRunning` would send `daemon.stop` over the socket, the daemon's command handler would `await this.stop()`, and `stop()` would `await server.close()`. server.close() resolves only after every open connection has closed \u2014 but the calling client's connection was the ONE thing keeping the server open: its `flushResponse()` was awaiting THIS function call. Both processes sat in kevent forever. Verified live on 2026-04-08: alpha.268 daemon hung at `daemon.server_end` log line for 5+ minutes after a fresh alpha.270 ouro process sent daemon.stop, and the alpha.270 ouro process hung waiting for the response. The deadlock had existed since the original `await server.close()` line was added (2026-03-05) but was masked for weeks by the half-close behavior in socket-client: the client called `client.end()` after writing its command, which (with `allowHalfOpen: false`) caused node to auto-tear-down the server's writable side, incidentally unblocking server.close() before the response was sent. The fix in #303/#334/#339 (which removed `client.end()` and switched to `allowHalfOpen: true` to stop dropping long-running responses like agent.senseTurn) accidentally exposed the underlying deadlock. Fix: don't `await` server.close() in stop() \u2014 just fire it. Once stop() returns, the daemon.stop case returns its response, flushResponse calls connection.end(response), the connection closes, and server.close()'s pending callback fires asynchronously. Includes a new daemon-stop-deadlock.test.ts that uses real net sockets to drive daemon.stop and asserts the response comes back within 2s \u2014 the test fails (24s timeout) without the fix and passes (110ms) with it."
+ "fix(daemon): unblock daemon.stop deadlock that hung `ouro up` after a CLI auto-update. When the running daemon's version drifted from the local CLI version, `ensureDaemonRunning` would send `daemon.stop` over the socket, the daemon's command handler would `await this.stop()`, and `stop()` would `await server.close()`. server.close() resolves only after every open connection has closed -- but the calling client's connection was the ONE thing keeping the server open: its `flushResponse()` was awaiting THIS function call. Both processes sat in kevent forever. Verified live on 2026-04-08: alpha.268 daemon hung at `daemon.server_end` log line for 5+ minutes after a fresh alpha.270 ouro process sent daemon.stop, and the alpha.270 ouro process hung waiting for the response. The deadlock had existed since the original `await server.close()` line was added (2026-03-05) but was masked for weeks by the half-close behavior in socket-client: the client called `client.end()` after writing its command, which (with `allowHalfOpen: false`) caused node to auto-tear-down the server's writable side, incidentally unblocking server.close() before the response was sent. The fix in #303/#334/#339 (which removed `client.end()` and switched to `allowHalfOpen: true` to stop dropping long-running responses like agent.senseTurn) accidentally exposed the underlying deadlock. Fix: don't `await` server.close() in stop() -- just fire it. Once stop() returns, the daemon.stop case returns its response, flushResponse calls connection.end(response), the connection closes, and server.close()'s pending callback fires asynchronously. Includes a new daemon-stop-deadlock.test.ts that uses real net sockets to drive daemon.stop and asserts the response comes back within 2s -- the test fails (24s timeout) without the fix and passes (110ms) with it."
  ]
  },
  {
  "version": "0.1.0-alpha.270",
  "changes": [
- "feat(daemon): `ouro status` now shows a new `Agents` section listing every discovered bundle with its enabled/disabled state. Previously disabled agents were completely invisible in status \u2014 the Senses/Workers/Git Sync sections only iterate managed (enabled) bundles, so a bundle with `\"enabled\": false` in agent.json left no trace in the output. New helper `listAllBundleAgents()` in `agent-discovery.ts` walks the bundles root and returns `{ name, enabled }` for every `<name>.ouro` with a parseable agent.json, and `listEnabledBundleAgents()` now delegates to it. Daemon status payload carries a new `agents: BundleAgentRow[]` field (backward-compat optional in the parser). The stopped-daemon renderer also reads bundles directly from disk so the Agents section works when the daemon is down."
2751
+ "feat(daemon): `ouro status` now shows a new `Agents` section listing every discovered bundle with its enabled/disabled state. Previously disabled agents were completely invisible in status — the Senses/Workers/Git Sync sections only iterate managed (enabled) bundles, so a bundle with `\"enabled\": false` in agent.json left no trace in the output. New helper `listAllBundleAgents()` in `agent-discovery.ts` walks the bundles root and returns `{ name, enabled }` for every `<name>.ouro` with a parseable agent.json, and `listEnabledBundleAgents()` now delegates to it. Daemon status payload carries a new `agents: BundleAgentRow[]` field (backward-compat optional in the parser). The stopped-daemon renderer also reads bundles directly from disk so the Agents section works when the daemon is down."
2740
2752
  ]
2741
2753
  },
2742
2754
  {
2743
2755
  "version": "0.1.0-alpha.269",
2744
2756
  "changes": [
2745
- "fix(prompt): revert alpha.267 over-engineering and correct the two targeted additions. The existing contextSection already contained the 'save to disk or lose it' teaching ('my conversation memory is ephemeral -- it resets between sessions. anything i learn about my friend, i save with save_friend_note so future me remembers.'), and toolContractsSection already told agents to call save_friend_note/diary_write before responding. alpha.267 added a third block in memoryJudgementSection about 'not just remembering between sessions' which was redundant and misplaced (memoryJudgementSection sits under 'my tools & capabilities' alongside tool routing heuristics, not nature-of-self content). That block is reverted. What alpha.267 got right and this PR keeps: (a) bodyMapSection guidance that standard folders are a floor, not a ceiling \u2014 bundles can have custom top-level folders (an early bundle's travel/ was the motivating example) \u2014 with a nudge to try the file-listing tool on the bundle root BEFORE falling back to recall, and (b) a diary-routing bullet that flags bundle-layout discoveries as worth persisting. Both had errors corrected here: alpha.267 told agents to use `list_directory` but the actual tool is `glob` (with a pattern like `*/`), and it said 'write a diary note like bundle-layout.md' but diary/ is a jsonl fact store queried via recall, not a directory of .md files \u2014 the corrected bullet says 'save the fact with diary_write'."
2757
+ "fix(prompt): revert alpha.267 over-engineering and correct the two targeted additions. The existing contextSection already contained the 'save to disk or lose it' teaching ('my conversation memory is ephemeral -- it resets between sessions. anything i learn about my friend, i save with save_friend_note so future me remembers.'), and toolContractsSection already told agents to call save_friend_note/diary_write before responding. alpha.267 added a third block in memoryJudgementSection about 'not just remembering between sessions' which was redundant and misplaced (memoryJudgementSection sits under 'my tools & capabilities' alongside tool routing heuristics, not nature-of-self content). That block is reverted. What alpha.267 got right and this PR keeps: (a) bodyMapSection guidance that standard folders are a floor, not a ceiling — bundles can have custom top-level folders (an early bundle's travel/ was the motivating example) — with a nudge to try the file-listing tool on the bundle root BEFORE falling back to recall, and (b) a diary-routing bullet that flags bundle-layout discoveries as worth persisting. Both had errors corrected here: alpha.267 told agents to use `list_directory` but the actual tool is `glob` (with a pattern like `*/`), and it said 'write a diary note like bundle-layout.md' but diary/ is a jsonl fact store queried via recall, not a directory of .md files — the corrected bullet says 'save the fact with diary_write'."
2746
2758
  ]
2747
2759
  },
2748
2760
  {
2749
2761
  "version": "0.1.0-alpha.268",
2750
2762
  "changes": [
2751
- "fix(identity): `loadAgentConfig` now preserves the `sync` block from agent.json. The hand-rolled object literal that constructs the typed `AgentConfig` was missing `sync` from its field list, so `agentConfig.sync` was always `undefined`, `getSyncConfig()` always returned `enabled: false`, and the entire sync code path (`preTurnPull` / `postTurnPush` in `pipeline.ts`) was dead from the moment the field was added to the type. The bug hid for weeks because `ouro status` reads `agent.json` directly via `listBundleSyncRows`, not through `loadAgentConfig` \u2014 so the per-agent Git Sync display correctly showed `enabled origin \u2192 ...` while sync did literally nothing. Slugger accumulated 5+ days of dirty files and zero `sync: post-turn update` commits before this surfaced. Also adds 3 regression tests asserting the sync block round-trips through `loadAgentConfig` (full block, partial block, missing block). The first two would have caught the original bug; without them future field additions to `AgentConfig` could repeat the pattern."
2763
+ "fix(identity): `loadAgentConfig` now preserves the `sync` block from agent.json. The hand-rolled object literal that constructs the typed `AgentConfig` was missing `sync` from its field list, so `agentConfig.sync` was always `undefined`, `getSyncConfig()` always returned `enabled: false`, and the entire sync code path (`preTurnPull` / `postTurnPush` in `pipeline.ts`) was dead from the moment the field was added to the type. The bug hid for weeks because `ouro status` reads `agent.json` directly via `listBundleSyncRows`, not through `loadAgentConfig` — so the per-agent Git Sync display correctly showed `enabled origin → ...` while sync did literally nothing. Slugger accumulated 5+ days of dirty files and zero `sync: post-turn update` commits before this surfaced. Also adds 3 regression tests asserting the sync block round-trips through `loadAgentConfig` (full block, partial block, missing block). The first two would have caught the original bug; without them future field additions to `AgentConfig` could repeat the pattern."
2752
2764
  ]
2753
2765
  },
2754
2766
  {
2755
2767
  "version": "0.1.0-alpha.267",
2756
2768
  "changes": [
2757
- "fix(prompt): teach agents that they do NOT 'just remember' things between sessions. Observed failure mode: an agent spent a recall scan searching diary/journal for a `travel/` folder that was right at the bundle root, then concluded 'i should know my own folder structure next time' and refused to persist the discovery because 'that's just me knowing my own home.' That is impossible \u2014 next session is a blank slate. The agent conflated in-session realizations with persistent knowledge. Three prompt changes fix this: (1) bodyMapSection now explicitly states that standard folders are a floor, bundles MAY have custom top-level folders created by the friend over time, and the agent should `list_directory` the bundle root BEFORE reaching for `recall` when something might be in a custom location. (2) memoryJudgementSection opens with an always-on reminder that the agent does not 'just remember' anything between sessions \u2014 every session carries only (a) the prompt, (b) what's on disk, (c) what tools observe this turn \u2014 and that thoughts like 'i should know this next time' or 'i'll look there first in the future' are CUES to write a concrete diary/friend-note RIGHT NOW. Future me cannot inherit resolutions, only files. (3) memoryJudgementSection's diary-routing rules now explicitly call out bundle-layout discoveries as a thing worth persisting (e.g. to a `bundle-layout.md` diary note). Includes 3 new tests locking the directives into place so future prompt edits cannot silently drop them."
2769
+ "fix(prompt): teach agents that they do NOT 'just remember' things between sessions. Observed failure mode: an agent spent a recall scan searching diary/journal for a `travel/` folder that was right at the bundle root, then concluded 'i should know my own folder structure next time' and refused to persist the discovery because 'that's just me knowing my own home.' That is impossible — next session is a blank slate. The agent conflated in-session realizations with persistent knowledge. Three prompt changes fix this: (1) bodyMapSection now explicitly states that standard folders are a floor, bundles MAY have custom top-level folders created by the friend over time, and the agent should `list_directory` the bundle root BEFORE reaching for `recall` when something might be in a custom location. (2) memoryJudgementSection opens with an always-on reminder that the agent does not 'just remember' anything between sessions — every session carries only (a) the prompt, (b) what's on disk, (c) what tools observe this turn — and that thoughts like 'i should know this next time' or 'i'll look there first in the future' are CUES to write a concrete diary/friend-note RIGHT NOW. Future me cannot inherit resolutions, only files. (3) memoryJudgementSection's diary-routing rules now explicitly call out bundle-layout discoveries as a thing worth persisting (e.g. to a `bundle-layout.md` diary note). Includes 3 new tests locking the directives into place so future prompt edits cannot silently drop them."
2758
2770
  ]
2759
2771
  },
2760
2772
  {
2761
2773
  "version": "0.1.0-alpha.266",
2762
2774
  "changes": [
2763
- "fix(daemon): vitest guard for production pidfile (~/.ouro-cli/daemon.pids). The pidfile path is hardcoded with no DI seam, so when a test creates a real OuroDaemon instance and calls start(), the daemon's killOrphanProcesses() reads the REAL pidfile, ps-verifies the PIDs, and SIGTERMs the production daemon's PIDs. Verified live: alpha.265 daemon (PID 64988) was killed 93s after startup by `npx vitest run` invoking daemon.start() in 6 different test files. SIGTERM forensics added in alpha.265 captured the death (parentPid=1, parentCommand=/sbin/launchd) but the killer hint was misleading \u2014 the real culprit was the production-pidfile leak. Fix: killOrphanProcesses() and writePidfile() now short-circuit under vitest with warn-level nerves events. Tests that need to verify these functions' behavior continue to use the extracted pure helpers (parseOrphanPidsFromPs, filterPidfilePidsToActualOrphans). New unit tests verify the pidfile is unchanged after no-op calls. Same defense-in-depth pattern as the alpha.265 socket-client hardening \u2014 production-side state being touched from tests is now physically impossible from a vitest worker."
2775
+ "fix(daemon): vitest guard for production pidfile (~/.ouro-cli/daemon.pids). The pidfile path is hardcoded with no DI seam, so when a test creates a real OuroDaemon instance and calls start(), the daemon's killOrphanProcesses() reads the REAL pidfile, ps-verifies the PIDs, and SIGTERMs the production daemon's PIDs. Verified live: alpha.265 daemon (PID 64988) was killed 93s after startup by `npx vitest run` invoking daemon.start() in 6 different test files. SIGTERM forensics added in alpha.265 captured the death (parentPid=1, parentCommand=/sbin/launchd) but the killer hint was misleading — the real culprit was the production-pidfile leak. Fix: killOrphanProcesses() and writePidfile() now short-circuit under vitest with warn-level nerves events. Tests that need to verify these functions' behavior continue to use the extracted pure helpers (parseOrphanPidsFromPs, filterPidfilePidsToActualOrphans). New unit tests verify the pidfile is unchanged after no-op calls. Same defense-in-depth pattern as the alpha.265 socket-client hardening — production-side state being touched from tests is now physically impossible from a vitest worker."
2764
2776
  ]
2765
2777
  },
2766
2778
  {
2767
2779
  "version": "0.1.0-alpha.265",
2768
2780
  "changes": [
2769
- "fix(daemon): bulletproof vitest socket leak + SIGTERM forensics. Hardens the socket-client vitest guard so production daemon socket calls (DEFAULT_DAEMON_SOCKET_PATH = /tmp/ouroboros-daemon.sock) are unconditionally blocked under vitest, regardless of __bypassVitestGuardForTests state. Cross-file leaks via the globalThis bypass flag (process-wide, leaks across concurrent test files in the same vitest worker) can no longer reach the production daemon. Test files that legitimately exercise the real socket-client transport against synthetic test socket paths (/tmp/daemon.sock) continue to work. Background: a daemon outage on 2026-04-08 was traced to a 14-call burst of leaked `inner.wake testagent` errors, signature of vitest test runs hammering the production socket \u2014 same pattern as 1,460 historical daemon log entries.",
2781
+ "fix(daemon): bulletproof vitest socket leak + SIGTERM forensics. Hardens the socket-client vitest guard so production daemon socket calls (DEFAULT_DAEMON_SOCKET_PATH = /tmp/ouroboros-daemon.sock) are unconditionally blocked under vitest, regardless of __bypassVitestGuardForTests state. Cross-file leaks via the globalThis bypass flag (process-wide, leaks across concurrent test files in the same vitest worker) can no longer reach the production daemon. Test files that legitimately exercise the real socket-client transport against synthetic test socket paths (/tmp/daemon.sock) continue to work. Background: a daemon outage on 2026-04-08 was traced to a 14-call burst of leaked `inner.wake testagent` errors, signature of vitest test runs hammering the production socket — same pattern as 1,460 historical daemon log entries.",
2770
2782
  "feat(daemon): SIGTERM/SIGINT tombstone forensics. New writeDaemonTombstone path captures process.ppid, parent command via `ps -p <ppid> -o command=`, and a filtered process snapshot (only node/vitest/ouro/kill lines) at signal-driven death time. Adds a killerHint heuristic: launchd parent reparenting suggests `launchctl bootout`/KeepAlive thrash; vitest worker presence suggests test cleanup; pkill/killall presence is an explicit kill. Forensics field is parsed by readDaemonTombstone and included in the daemon.tombstone_written nerves event meta. Also caps recentCrashes at 100 entries (was 12,265 from a March 31 thrash loop).",
2771
2783
  "test(daemon): contract test BYPASS_USE_ALLOWLIST flags any new file calling __bypassVitestGuardForTests outside the two known-good test files (socket-client.test.ts and daemon-cli-defaults.test.ts). Prevents future regressions of the cross-file leak vector."
2772
2784
  ]
@@ -2774,7 +2786,7 @@
2774
2786
  {
2775
2787
  "version": "0.1.0-alpha.264",
2776
2788
  "changes": [
2777
- "fix(bluebubbles): dedup `updated-message` webhooks BEFORE running repair+hydrate+VLM. BlueBubbles routinely sends a `new-message` webhook for a fresh message, then follows up seconds later with one or more `updated-message` webhooks for delivery/read status. The BB sense's `repairEvent` path promotes updated-message events with recoverable content back to `message` kind, which re-runs the full hydration pipeline \u2014 including a second MiniMax VLM describe call on the same image. Verified live on 2026-04-08T00:58Z: two sequential VLM describes for attachment guid 317E37EB-..., 13.7s + 14.0s each, for the exact same 291KB JPEG, triggered by a `new-message` followed 3s later by an `updated-message` for the same guid. The downstream `handleBlueBubblesNormalizedEvent` dedup check was firing correctly but too late (after the expensive VLM round-trip). Fix: add a pre-repair dedup check in `handleBlueBubblesEvent` that consults the inbound sidecar by messageGuid and short-circuits before calling `client.repairEvent(...)`. New nerves event `senses.bluebubbles_repair_skipped_duplicate` at warn level for observability."
2789
+ "fix(bluebubbles): dedup `updated-message` webhooks BEFORE running repair+hydrate+VLM. BlueBubbles routinely sends a `new-message` webhook for a fresh message, then follows up seconds later with one or more `updated-message` webhooks for delivery/read status. The BB sense's `repairEvent` path promotes updated-message events with recoverable content back to `message` kind, which re-runs the full hydration pipeline — including a second MiniMax VLM describe call on the same image. Verified live on 2026-04-08T00:58Z: two sequential VLM describes for attachment guid 317E37EB-..., 13.7s + 14.0s each, for the exact same 291KB JPEG, triggered by a `new-message` followed 3s later by an `updated-message` for the same guid. The downstream `handleBlueBubblesNormalizedEvent` dedup check was firing correctly but too late (after the expensive VLM round-trip). Fix: add a pre-repair dedup check in `handleBlueBubblesEvent` that consults the inbound sidecar by messageGuid and short-circuits before calling `client.repairEvent(...)`. New nerves event `senses.bluebubbles_repair_skipped_duplicate` at warn level for observability."
2778
2790
  ]
2779
2791
  },
2780
2792
  {
@@ -2786,28 +2798,28 @@
2786
2798
  {
2787
2799
  "version": "0.1.0-alpha.262",
2788
2800
  "changes": [
2789
- "test(daemon): inject explicit `vi.mock(\"...heart/daemon/socket-client\", ...)` blocks into all 40 grandfathered test files in `TESTAGENT_NO_MOCK_ALLOWLIST`. The runtime guard in `socket-client.ts` already prevents real socket leaks under vitest, but the explicit mocks let those tests assert call counts cleanly and shrink the contract-test allowlist to zero. Both the testagent and the bundle-write allowlists in `test-isolation.contract.test.ts` are now empty Sets \u2014 any new offender fails the build."
2801
+ "test(daemon): inject explicit `vi.mock(\"...heart/daemon/socket-client\", ...)` blocks into all 40 grandfathered test files in `TESTAGENT_NO_MOCK_ALLOWLIST`. The runtime guard in `socket-client.ts` already prevents real socket leaks under vitest, but the explicit mocks let those tests assert call counts cleanly and shrink the contract-test allowlist to zero. Both the testagent and the bundle-write allowlists in `test-isolation.contract.test.ts` are now empty Sets — any new offender fails the build."
2790
2802
  ]
2791
2803
  },
2792
2804
  {
2793
2805
  "version": "0.1.0-alpha.261",
2794
2806
  "changes": [
2795
- "feat(tui): full input parity with Claude Code \u2014 kill ring, emacs nav, Home/End, Ctrl+D, forward delete, Esc history, bracketed paste, clipboard image, token deletion, chip navigation. 156 new tests."
2807
+ "feat(tui): full input parity with Claude Code — kill ring, emacs nav, Home/End, Ctrl+D, forward delete, Esc history, bracketed paste, clipboard image, token deletion, chip navigation. 156 new tests."
2796
2808
  ]
2797
2809
  },
2798
2810
  {
2799
2811
  "version": "0.1.0-alpha.260",
2800
2812
  "changes": [
2801
- "fix(daemon): MCP bridge empty-response bug for long-running commands. `sendDaemonCommand` and `checkDaemonSocketAlive` were calling `client.end()` immediately after writing, which half-closed the TCP connection. The daemon server's `net.createServer()` uses the default `allowHalfOpen: false`, so when it saw the client's FIN it auto-closed its own writable side \u2014 and any response the server tried to write after processing a long-running command (like `agent.senseTurn`, which runs a full LLM turn) was dropped on the floor. Verified via direct socket repro: with `client.end()`, `agent.senseTurn` returned empty in ~149ms; without it, the same call returned a real response in ~5.8s. This was a silent regression of the fix in #303 (commit `253e4b1f` titled \"socket half-close fix\" actually *added* the half-close back). Fix: drop the `client.end()` calls from both sites AND set `allowHalfOpen: true` on the daemon's `net.createServer(...)` as defense-in-depth so future clients calling `end()` don't silently break again.",
2802
- "fix(daemon): tighten pidfile trust \u2014 `killOrphanProcesses` now verifies each pidfile PID is an actual orphan (PPID reparented to init/PID 1) before SIGTERMing it. Previously a polluted pidfile (written by a crashed daemon whose PIDs have since been reused by the OS for unrelated processes) could cause mass-kill of unrelated apps. New exported helper `filterPidfilePidsToActualOrphans` provides direct unit coverage.",
2813
+ "fix(daemon): MCP bridge empty-response bug for long-running commands. `sendDaemonCommand` and `checkDaemonSocketAlive` were calling `client.end()` immediately after writing, which half-closed the TCP connection. The daemon server's `net.createServer()` uses the default `allowHalfOpen: false`, so when it saw the client's FIN it auto-closed its own writable side — and any response the server tried to write after processing a long-running command (like `agent.senseTurn`, which runs a full LLM turn) was dropped on the floor. Verified via direct socket repro: with `client.end()`, `agent.senseTurn` returned empty in ~149ms; without it, the same call returned a real response in ~5.8s. This was a silent regression of the fix in #303 (commit `253e4b1f` titled \"socket half-close fix\" actually *added* the half-close back). Fix: drop the `client.end()` calls from both sites AND set `allowHalfOpen: true` on the daemon's `net.createServer(...)` as defense-in-depth so future clients calling `end()` don't silently break again.",
2814
+ "fix(daemon): tighten pidfile trust — `killOrphanProcesses` now verifies each pidfile PID is an actual orphan (PPID reparented to init/PID 1) before SIGTERMing it. Previously a polluted pidfile (written by a crashed daemon whose PIDs have since been reused by the OS for unrelated processes) could cause mass-kill of unrelated apps. New exported helper `filterPidfilePidsToActualOrphans` provides direct unit coverage.",
2803
2815
  "fix(daemon, repertoire): rename six `_stop` nerves events to `_end` so they pair correctly with their `_start` counterparts under the nerves audit's start/end pairing rule. Affected: `daemon.thoughts_follow_stop`, `daemon.server_stop`, `daemon.update_checker_stop`, `daemon.mcp_server_stop`, `daemon.habit_scheduler_stop`, `mcp.manager_stop`. The naming was semantically fine (`stop` pairs with `start`), but the audit specifically looks for `_end`/`_error` suffixes. Was producing intermittent audit failures whenever any test run exercised these teardown paths (seen today as flakes on `daemon-socket-errors.test.ts` and `MarkdownStreamer` tests).",
2804
- "fix(providers): drop the harness-imposed MiniMax VLM timeout entirely. Previously the module defaulted to a 60-second AbortSignal; E2E validation of the BB image fix hit a real VLM request that took >60s to return (same bytes returned in 9.5s on the immediate retry). Raising to 120s would have been arbitrary too \u2014 the correct answer, per the same reasoning as PR #322 for LLM providers, is to not impose a harness ceiling at all. When `timeoutMs` isn't provided, `fetch()` now runs without an AbortSignal and undici's own defaults (headersTimeout + bodyTimeout, both 5 minutes) are the ceiling. Callers that want a tighter bound can still pass an explicit `timeoutMs`. The AbortError path is kept for that case, and the error message adapts to say \"underlying stack default\" when there was no harness-set value."
2816
+ "fix(providers): drop the harness-imposed MiniMax VLM timeout entirely. Previously the module defaulted to a 60-second AbortSignal; E2E validation of the BB image fix hit a real VLM request that took >60s to return (same bytes returned in 9.5s on the immediate retry). Raising to 120s would have been arbitrary too — the correct answer, per the same reasoning as PR #322 for LLM providers, is to not impose a harness ceiling at all. When `timeoutMs` isn't provided, `fetch()` now runs without an AbortSignal and undici's own defaults (headersTimeout + bodyTimeout, both 5 minutes) are the ceiling. Callers that want a tighter bound can still pass an explicit `timeoutMs`. The AbortError path is kept for that case, and the error message adapts to say \"underlying stack default\" when there was no harness-set value."
2805
2817
  ]
2806
2818
  },
2807
2819
  {
2808
2820
  "version": "0.1.0-alpha.259",
2809
2821
  "changes": [
2810
- "fix(daemon): SIGINT and SIGTERM now ALWAYS write a tombstone before exiting, instead of silently skipping when `_gracefulShutdown` was set. The previous behavior made signal-driven shutdowns invisible in `~/.ouro-cli/daemon-death.json` \u2014 launchd policy decisions, the OOM killer, manual `kill`, and `killOrphanProcesses` from a sibling daemon all looked identical to a clean exit. After a real outage where the user's daemon kept dying with a tombstone from a week earlier, this restores observability: every signal-driven exit now records `reason: \"sigint\"` or `reason: \"sigterm\"` with timestamp + recentCrashes accumulator. The catch-all `process.on('exit')` handler also no longer short-circuits on graceful shutdown."
2822
+ "fix(daemon): SIGINT and SIGTERM now ALWAYS write a tombstone before exiting, instead of silently skipping when `_gracefulShutdown` was set. The previous behavior made signal-driven shutdowns invisible in `~/.ouro-cli/daemon-death.json` — launchd policy decisions, the OOM killer, manual `kill`, and `killOrphanProcesses` from a sibling daemon all looked identical to a clean exit. After a real outage where the user's daemon kept dying with a tombstone from a week earlier, this restores observability: every signal-driven exit now records `reason: \"sigint\"` or `reason: \"sigterm\"` with timestamp + recentCrashes accumulator. The catch-all `process.on('exit')` handler also no longer short-circuits on graceful shutdown."
2811
2823
  ]
2812
2824
  },
2813
2825
  {
@@ -2821,14 +2833,14 @@
2821
2833
  {
2822
2834
  "version": "0.1.0-alpha.257",
2823
2835
  "changes": [
2824
- "test(daemon): convert all `daemon-cli.test.ts` auth/thoughts/config tests to use `os.tmpdir()` via the new `createTmpBundle()` helper. Previously these tests wrote real bundles to `~/AgentBundles/auth-local-${Date.now()}.ouro` etc., relying on `try/finally` cleanup that doesn't fire on test interruption \u2014 leaking bundle directories into the developer's home and inflating noise on the running daemon. The `REAL_BUNDLES_WRITE_ALLOWLIST` ratchet in `test-isolation.contract.test.ts` is now empty.",
2836
+ "test(daemon): convert all `daemon-cli.test.ts` auth/thoughts/config tests to use `os.tmpdir()` via the new `createTmpBundle()` helper. Previously these tests wrote real bundles to `~/AgentBundles/auth-local-${Date.now()}.ouro` etc., relying on `try/finally` cleanup that doesn't fire on test interruption — leaking bundle directories into the developer's home and inflating noise on the running daemon. The `REAL_BUNDLES_WRITE_ALLOWLIST` ratchet in `test-isolation.contract.test.ts` is now empty.",
2825
2837
  "fix(daemon/cli-exec): plumb `bundlesRoot` and `secretsRoot` deps through the auth.run / auth.verify / auth.switch / config.model / config.models / thoughts handlers so tests can route reads/writes to a tmpdir without monkey-patching the identity module. Production code paths still default to `getAgentBundlesRoot()` and `~/.agentsecrets`."
2826
2838
  ]
2827
2839
  },
2828
2840
  {
2829
2841
  "version": "0.1.0-alpha.256",
2830
2842
  "changes": [
2831
- "fix(tui): cursor renders as inverse character (not inserted block) \u2014 matches standard terminal cursor behavior.",
2843
+ "fix(tui): cursor renders as inverse character (not inserted block) — matches standard terminal cursor behavior.",
2832
2844
  "fix(tui): Alt+Enter via ESC-timing (50ms window) instead of raw stdin handler that interfered with arrow keys.",
2833
2845
  "fix(tui): image path regex now matches backslash-escaped spaces from macOS drag-drop."
2834
2846
  ]
@@ -2837,20 +2849,20 @@
2837
2849
  "version": "0.1.0-alpha.255",
2838
2850
  "changes": [
2839
2851
  "feat(tui): session resume shows last messages as regular chat (no dimmed preview) with teal resume banner in header.",
2840
- "feat(tui): image/file drag-and-drop \u2014 detects image paths in pasted text, reads to base64, inserts [Image #N] references, sends as image_url content blocks to model.",
2852
+ "feat(tui): image/file drag-and-drop — detects image paths in pasted text, reads to base64, inserts [Image #N] references, sends as image_url content blocks to model.",
2841
2853
  "refactor(tui): removed custom history-* roles and addSessionHistory (KISS/DRY)."
2842
2854
  ]
2843
2855
  },
2844
2856
  {
2845
2857
  "version": "0.1.0-alpha.254",
2846
2858
  "changes": [
2847
- "fix(daemon): orphan-cleanup fallback no longer kills processes from sibling harness instances. On startup, `killOrphanProcesses` scans `ps` for harness entry points (`agent-entry.js`, `daemon-entry.js`, `bluebubbles/entry.js`, `teams-entry.js`) and SIGTERMs them when the pidfile is missing. Previously any matching process was fair game, so a vitest-driven harness run from a sibling worktree (or a parallel Claude Code session running the coverage gate) would terminate the production daemon's children, triggering cascading graceful shutdowns and making the production agent unavailable for seconds-to-minutes at a time. Now the fallback only flags true orphans \u2014 processes whose PPID has been reparented to init (PID 1). Sibling daemons' live-parented children are left alone. New exported helper `parseOrphanPidsFromPs` isolates the filter for direct unit coverage. Complements the test-isolation guard shipped in alpha.253 (#333): that PR stopped tests from SENDING `inner.wake testagent` commands into the production socket; this PR stops the production daemon from SIGTERMing test-spawned child processes during its own startup."
2859
+ "fix(daemon): orphan-cleanup fallback no longer kills processes from sibling harness instances. On startup, `killOrphanProcesses` scans `ps` for harness entry points (`agent-entry.js`, `daemon-entry.js`, `bluebubbles/entry.js`, `teams-entry.js`) and SIGTERMs them when the pidfile is missing. Previously any matching process was fair game, so a vitest-driven harness run from a sibling worktree (or a parallel Claude Code session running the coverage gate) would terminate the production daemon's children, triggering cascading graceful shutdowns and making the production agent unavailable for seconds-to-minutes at a time. Now the fallback only flags true orphans — processes whose PPID has been reparented to init (PID 1). Sibling daemons' live-parented children are left alone. New exported helper `parseOrphanPidsFromPs` isolates the filter for direct unit coverage. Complements the test-isolation guard shipped in alpha.253 (#333): that PR stopped tests from SENDING `inner.wake testagent` commands into the production socket; this PR stops the production daemon from SIGTERMing test-spawned child processes during its own startup."
2848
2860
  ]
2849
2861
  },
2850
2862
  {
2851
2863
  "version": "0.1.0-alpha.253",
2852
2864
  "changes": [
2853
- "fix(daemon): test-isolation guard \u2014 stop tests from leaking real `inner.wake` commands into the running daemon. A pattern in 36+ test files mocks `getAgentName` to the literal `\"testagent\"` but does NOT mock `socket-client`, so any code path through pondering / rest / coding feedback fires a real socket connection at /tmp/ouroboros-daemon.sock with `inner.wake testagent`. The daemon errored on every command (`Unknown managed agent 'testagent'`) and at flood volumes that contributed to a real outage on the developer's machine. The fix is in `socket-client.ts` itself: detect vitest via `process.argv` (no env vars) and convert all socket operations into safe no-ops. Tests that legitimately exercise the real transport (socket-client.test.ts, daemon-cli-defaults.test.ts) opt out of the guard via a new `__bypassVitestGuardForTests()` setter that lives on globalThis to survive `vi.resetModules()`. New nerves events: `daemon.socket_command_test_blocked`, `daemon.inner_wake_test_blocked`.",
2865
+ "fix(daemon): test-isolation guard - stop tests from leaking real `inner.wake` commands into the running daemon. A pattern in 36+ test files mocks `getAgentName` to the literal `\"testagent\"` but does NOT mock `socket-client`, so any code path through pondering / rest / coding feedback fires a real socket connection at /tmp/ouroboros-daemon.sock with `inner.wake testagent`. The daemon errored on every command (`Unknown managed agent 'testagent'`) and at flood volumes that contributed to a real outage on the developer's machine. The fix is in `socket-client.ts` itself: detect vitest via `process.argv` (no env vars) and convert all socket operations into safe no-ops. Tests that legitimately exercise the real transport (socket-client.test.ts, daemon-cli-defaults.test.ts) opt out of the guard via a new `__bypassVitestGuardForTests()` setter that lives on globalThis to survive `vi.resetModules()`. New nerves events: `daemon.socket_command_test_blocked`, `daemon.inner_wake_test_blocked`.",
2854
2866
  "test(contract): new test-isolation contract test ratchets two anti-patterns: (1) test files using `name: \"testagent\"` without mocking socket-client, and (2) test files constructing write paths under the real `~/AgentBundles` via `os.homedir()`. Existing offenders are grandfathered in two allowlists; new offenders fail the build. Follow-up PRs shrink the allowlists toward zero."
2855
2867
  ]
2856
2868
  },
@@ -2863,7 +2875,7 @@
2863
2875
  {
2864
2876
  "version": "0.1.0-alpha.251",
2865
2877
  "changes": [
2866
- "fix(bluebubbles): images sent via iMessage now reach the model \u2014 adds capability-gated image hydration with a MiniMax VLM fallback. Reasoning MiniMax chat models (M2/M2.1/M2.5/M2.7) silently drop OpenAI-style `image_url` content parts, so previously the agent answered image questions from fabricated context. Now, when the active chat model lacks the new `vision: true` capability flag, inbound screenshots are auto-described at ingestion via `/v1/coding_plan/vlm` and the description text replaces the `image_url` part before the turn reaches the model. Vision-capable models (claude-opus/sonnet-4-6, gpt-5.4, MiniMax-Text-01, MiniMax-VL-01) continue to see images natively via pass-through.",
2878
+ "fix(bluebubbles): images sent via iMessage now reach the model - adds capability-gated image hydration with a MiniMax VLM fallback. Reasoning MiniMax chat models (M2/M2.1/M2.5/M2.7) silently drop OpenAI-style `image_url` content parts, so previously the agent answered image questions from fabricated context. Now, when the active chat model lacks the new `vision: true` capability flag, inbound screenshots are auto-described at ingestion via `/v1/coding_plan/vlm` and the description text replaces the `image_url` part before the turn reaches the model. Vision-capable models (claude-opus/sonnet-4-6, gpt-5.4, MiniMax-Text-01, MiniMax-VL-01) continue to see images natively via pass-through.",
2867
2879
  "feat(tools): new `describe_image` agent tool registered into the BlueBubbles tool set when the chat model lacks vision. Lets the agent re-interrogate an attachment with a targeted prompt (e.g. 'what's the flight number in the bottom-right?') after ingestion. Backed by a bounded in-memory attachment cache populated during hydration; handler re-downloads bytes and calls the same VLM client.",
2868
2880
  "fix(bluebubbles): `formatMessageText` now preserves the attachment marker when a message has BOTH text and attachments (B2). Previously the marker was dropped whenever text was present, hiding attachments from the agent's view of the message.",
2869
2881
  "feat(heart): `ModelCapabilities` gains `vision?: boolean` and `audio?: boolean` flags (B4). Vision rows populated for claude-opus-4-6, claude-sonnet-4-6, claude-opus-4.6, claude-sonnet-4.6, gpt-5.4, MiniMax-Text-01, MiniMax-VL-01. M2.1/M2.5/M2.7 intentionally left unset.",
@@ -2873,7 +2885,7 @@
2873
2885
  {
2874
2886
  "version": "0.1.0-alpha.250",
2875
2887
  "changes": [
2876
- "fix(sync): surface 'bundle is not a git repo' as an actionable error instead of silently failing. Previously, enabling `sync.enabled` on a bundle that had never been `git init`'d produced a generic `git status` failure buried in nerves logs; the agent saw nothing in its start-of-turn packet and the user saw nothing in `ouro status`. Now: (1) `preTurnPull` and `postTurnPush` detect the missing `.git` directory before touching git and return an actionable error with the bundle path and the `git init` hint; (2) this error propagates via `ctx.syncFailure` into the agent's Sync warning, so the agent can offer to run `git init` or just do it; (3) `ouro status` shows a red `error` state with `not a git repo \u2014 run \\`git init\\` to enable sync` next to the offending bundle. New nerves event: `heart.sync_not_a_repo`."
2888
+ "fix(sync): surface 'bundle is not a git repo' as an actionable error instead of silently failing. Previously, enabling `sync.enabled` on a bundle that had never been `git init`'d produced a generic `git status` failure buried in nerves logs; the agent saw nothing in its start-of-turn packet and the user saw nothing in `ouro status`. Now: (1) `preTurnPull` and `postTurnPush` detect the missing `.git` directory before touching git and return an actionable error with the bundle path and the `git init` hint; (2) this error propagates via `ctx.syncFailure` into the agent's Sync warning, so the agent can offer to run `git init` or just do it; (3) `ouro status` shows a red `error` state with `not a git repo - run \\`git init\\` to enable sync` next to the offending bundle. New nerves event: `heart.sync_not_a_repo`."
2877
2889
  ]
2878
2890
  },
2879
2891
  {
@@ -2886,8 +2898,8 @@
2886
2898
  {
2887
2899
  "version": "0.1.0-alpha.248",
2888
2900
  "changes": [
2889
- "feat(daemon): `ouro status` Git Sync now resolves and shows the actual remote URL via `git remote get-url`. Three states: `origin \u2192 git@github.com:me/foo.git` when the remote resolves, `local only` when sync is enabled but no remote is configured, `disabled` when sync is off. Previously you had to `cd` into the bundle and run `git remote -v` to find out where it pushes.",
2890
- "fix(heart/sync): `preTurnPull` now skips the pull when no git remote is configured, mirroring the existing `postTurnPush` behavior. Closes the half-implemented 'no-remote sync' (local-only commit log) story \u2014 previously, enabling sync without a remote produced a `syncFailure` on every turn from the failing `git pull` call."
2901
+ "feat(daemon): `ouro status` Git Sync now resolves and shows the actual remote URL via `git remote get-url`. Three states: `origin → git@github.com:me/foo.git` when the remote resolves, `local only` when sync is enabled but no remote is configured, `disabled` when sync is off. Previously you had to `cd` into the bundle and run `git remote -v` to find out where it pushes.",
2902
+ "fix(heart/sync): `preTurnPull` now skips the pull when no git remote is configured, mirroring the existing `postTurnPush` behavior. Closes the half-implemented 'no-remote sync' (local-only commit log) story - previously, enabling sync without a remote produced a `syncFailure` on every turn from the failing `git pull` call."
2891
2903
  ]
2892
2904
  },
2893
2905
  {
@@ -2899,7 +2911,7 @@
2899
2911
  {
2900
2912
  "version": "0.1.0-alpha.246",
2901
2913
  "changes": [
2902
- "feat(tui): session resume display \u2014 shows summary line + last 3 exchanges dimmed when reconnecting to existing session."
2914
+ "feat(tui): session resume display - shows summary line + last 3 exchanges dimmed when reconnecting to existing session."
2903
2915
  ]
2904
2916
  },
2905
2917
  {
@@ -2911,58 +2923,58 @@
2911
2923
  {
2912
2924
  "version": "0.1.0-alpha.244",
2913
2925
  "changes": [
2914
- "refactor(sync): git-status-based bundle sync \u2014 postTurnPush discovers dirty files via `git status --porcelain` instead of broken explicit trackSyncWrite tracking (only 3/9 writers used it). Removed dead infrastructure. Remote push optional and non-fatal.",
2915
- "feat(tui): queued input steer \u2014 messages typed while agent is thinking appear dimmed above input area. UP/ESC pops all queued messages back into input for editing. Placeholder hint. Multiple messages supported, each sent as separate turn.",
2926
+ "refactor(sync): git-status-based bundle sync - postTurnPush discovers dirty files via `git status --porcelain` instead of broken explicit trackSyncWrite tracking (only 3/9 writers used it). Removed dead infrastructure. Remote push optional and non-fatal.",
2927
+ "feat(tui): queued input steer - messages typed while agent is thinking appear dimmed above input area. UP/ESC pops all queued messages back into input for editing. Placeholder hint. Multiple messages supported, each sent as separate turn.",
2916
2928
  "chore: deleted legacy InkCliApp adapter (776 lines dead code)."
2917
2929
  ]
2918
2930
  },
2919
2931
  {
2920
2932
  "version": "0.1.0-alpha.243",
2921
2933
  "changes": [
2922
- "fix(daemon): per-agent Git Sync in `ouro status` \u2014 was always showing `disabled` regardless of any agent's `agent.json`, because the daemon process has no argv-derived agent identity so `getSyncConfig()` fell into its catch. Status payload now carries a per-agent `sync` array (one row per enabled bundle) rendered as its own section like Senses and Workers, instead of a single global field on the overview block."
2934
+ "fix(daemon): per-agent Git Sync in `ouro status` - was always showing `disabled` regardless of any agent's `agent.json`, because the daemon process has no argv-derived agent identity so `getSyncConfig()` fell into its catch. Status payload now carries a per-agent `sync` array (one row per enabled bundle) rendered as its own section like Senses and Workers, instead of a single global field on the overview block."
2923
2935
  ]
2924
2936
  },
2925
2937
  {
2926
2938
  "version": "0.1.0-alpha.241",
2927
2939
  "changes": [
2928
- "feat: commerce bootstrap \u2014 bw CLI lazy-install, vault auto-config, resolver coverage"
2940
+ "feat: commerce bootstrap - bw CLI lazy-install, vault auto-config, resolver coverage"
2929
2941
  ]
2930
2942
  },
2931
2943
  {
2932
2944
  "version": "0.1.0-alpha.242",
2933
2945
  "changes": [
2934
- "fix(friends): stable local CLI identity \u2014 dropped hostname from external ID (was `username@hostname`, now just `username`). macOS hostname instability (`Mac` vs `Aris-MacBook-Pro.local`) was creating duplicate friend records with separate sessions and trust levels.",
2935
- "feat(friends): migration fallback \u2014 FriendResolver now searches for old `username@*` format IDs when exact match fails, linking new stable ID to existing friend record.",
2936
- "fix(heart): retry-everything-except-blocklist policy \u2014 the SDK 'Request timed out.' error from MiniMax (and other providers) was reaching the agent as a terminal failure because neither the generic isTransientError detector nor the per-provider classifier recognized it. Replaced the two-layer transient detection with a small blocklist (HTTP 400/401/403/404/422 + classifications auth-failure/usage-limit). Default policy now retries every other error.",
2937
- "fix(heart/providers): drop the 30s OpenAI/Anthropic SDK timeout from all five providers (anthropic, azure, github-copilot, minimax, openai-codex). The SDK timeout caps the entire stream lifetime, so 30s killed any reasoning model mid-generation. SDK defaults (\u224810min) are sane.",
2946
+ "fix(friends): stable local CLI identity - dropped hostname from external ID (was `username@hostname`, now just `username`). macOS hostname instability (`Mac` vs `Aris-MacBook-Pro.local`) was creating duplicate friend records with separate sessions and trust levels.",
2947
+ "feat(friends): migration fallback - FriendResolver now searches for old `username@*` format IDs when exact match fails, linking new stable ID to existing friend record.",
2948
+ "fix(heart): retry-everything-except-blocklist policy - the SDK 'Request timed out.' error from MiniMax (and other providers) was reaching the agent as a terminal failure because neither the generic isTransientError detector nor the per-provider classifier recognized it. Replaced the two-layer transient detection with a small blocklist (HTTP 400/401/403/404/422 + classifications auth-failure/usage-limit). Default policy now retries every other error.",
2949
+ "fix(heart/providers): drop the 30s OpenAI/Anthropic SDK timeout from all five providers (anthropic, azure, github-copilot, minimax, openai-codex). The SDK timeout caps the entire stream lifetime, so 30s killed any reasoning model mid-generation. SDK defaults (≈10min) are sane.",
2938
2950
  "refactor(heart/providers): consolidate duplicated isNetworkError + classifyXxxError scaffolding into a shared `error-classification.ts` module. Each provider now delegates via `classifyHttpError(err, overrides)` and only carries its own provider-specific quirks (Anthropic 529, Codex usage-limit message detection)."
2939
2951
  ]
2940
2952
  },
2941
2953
  {
2942
2954
  "version": "0.1.0-alpha.238",
2943
2955
  "changes": [
2944
- "feat: pretty `ouro status` \u2014 ANSI colored output with box-drawing header, status dots, grouped senses/workers by agent. Added git sync info to overview.",
2945
- "fix: socket half-close \u2014 sendDaemonCommand now calls client.end() after writing, preventing intermittent connection hangs.",
2946
- "refactor: config tiers \u2014 replaced numeric T1/T2/T3 with `self` (agent-configurable) and `managed` (harness-only). All config keys are now agent-writable except `version` and `enabled`. mcpServers promoted to self.",
2947
- "refactor: removed confirmation system \u2014 deleted propose_config tool, confirmationRequired/confirmationAlwaysRequired/onConfirmAction from core, all tool definitions, and Teams sense. Was only wired up on Teams, silently failed everywhere else. -1,361 lines."
2956
+ "feat: pretty `ouro status` - ANSI colored output with box-drawing header, status dots, grouped senses/workers by agent. Added git sync info to overview.",
2957
+ "fix: socket half-close - sendDaemonCommand now calls client.end() after writing, preventing intermittent connection hangs.",
2958
+ "refactor: config tiers - replaced numeric T1/T2/T3 with `self` (agent-configurable) and `managed` (harness-only). All config keys are now agent-writable except `version` and `enabled`. mcpServers promoted to self.",
2959
+ "refactor: removed confirmation system - deleted propose_config tool, confirmationRequired/confirmationAlwaysRequired/onConfirmAction from core, all tool definitions, and Teams sense. Was only wired up on Teams, silently failed everywhere else. -1,361 lines."
2948
2960
  ]
2949
2961
  },
2950
2962
  {
2951
2963
  "version": "0.1.0-alpha.235",
2952
2964
  "changes": [
2953
- "fix: MCP tool double-prefix \u2014 tools already prefixed by server name no longer get a redundant second prefix in the unified registry.",
2954
- "feat: Open-Meteo zero-auth weather \u2014 replaced OpenWeatherMap with Open-Meteo forecast + geocoding API. Weather now works without any API key or credential provisioning.",
2955
- "feat: expanded ISO/FIPS divergence table \u2014 21 new entries for correct travel advisory resolution across all major divergent country codes.",
2956
- "fix: BitwardenCredentialStore retry logic \u2014 exponential backoff with configurable retries for transient bw CLI failures, plus bw-not-installed error.",
2965
+ "fix: MCP tool double-prefix - tools already prefixed by server name no longer get a redundant second prefix in the unified registry.",
2966
+ "feat: Open-Meteo zero-auth weather - replaced OpenWeatherMap with Open-Meteo forecast + geocoding API. Weather now works without any API key or credential provisioning.",
2967
+ "feat: expanded ISO/FIPS divergence table - 21 new entries for correct travel advisory resolution across all major divergent country codes.",
2968
+ "fix: BitwardenCredentialStore retry logic - exponential backoff with configurable retries for transient bw CLI failures, plus bw-not-installed error.",
2957
2969
  "chore: travel MCP packages (Duffel, Expedia) status confirmed GitHub-only, not published to npm."
2958
2970
  ]
2959
2971
  },
2960
2972
  {
2961
2973
  "version": "0.1.0-alpha.233",
2962
2974
  "changes": [
2963
- "feat: first-class MCP tools \u2014 MCP tools now appear in the agent tool list directly (no shell indirection). Agent can call browser_navigate, browser_click etc. as native tools.",
2964
- "fix: daemon MCP pre-init poisoned singleton \u2014 removed eager getSharedMcpManager() at daemon startup that cached null before agent identity was set.",
2965
- "fix: removed dead mcpManager field from BuildSystemOptions \u2014 MCP manager now flows through runAgentOptions.",
2975
+ "feat: first-class MCP tools - MCP tools now appear in the agent tool list directly (no shell indirection). Agent can call browser_navigate, browser_click etc. as native tools.",
2976
+ "fix: daemon MCP pre-init poisoned singleton - removed eager getSharedMcpManager() at daemon startup that cached null before agent identity was set.",
2977
+ "fix: removed dead mcpManager field from BuildSystemOptions - MCP manager now flows through runAgentOptions.",
2966
2978
  "fix: MCP tool results filter to text-only content types.",
2967
2979
  "includes: vault integration, travel advisory fix, credential access layer, HKDF-Expand crypto fix."
2968
2980
  ]
@@ -2970,21 +2982,21 @@
2970
2982
  {
2971
2983
  "version": "0.1.0-alpha.232",
2972
2984
  "changes": [
2973
- "feat: first-class MCP tools \u2014 MCP server tools now appear in the agent's active tool list (e.g. browser_navigate, duffel_search_flights) and are callable directly by the model, eliminating fragile shell indirection",
2985
+ "feat: first-class MCP tools - MCP server tools now appear in the agent's active tool list (e.g. browser_navigate, duffel_search_flights) and are callable directly by the model, eliminating fragile shell indirection",
2974
2986
  "feat: mcpToolsAsDefinitions() converts McpManager tools to ToolDefinition objects with {server}_{tool} naming",
2975
- "feat: first-class MCP trust gating \u2014 mcpServerName on GuardContext enables per-server trust rules (browser blocked for acquaintance, blocked in group chat)",
2976
- "refactor: removed mcpToolsSection() from system prompt \u2014 MCP tools no longer need prompt documentation",
2987
+ "feat: first-class MCP trust gating - mcpServerName on GuardContext enables per-server trust rules (browser blocked for acquaintance, blocked in group chat)",
2988
+ "refactor: removed mcpToolsSection() from system prompt - MCP tools no longer need prompt documentation",
2977
2989
  "fix: execTool, isConfirmationRequired, summarizeArgs now check combined native+MCP registry"
2978
2990
  ]
2979
2991
  },
2980
2992
  {
2981
2993
  "version": "0.1.0-alpha.231",
2982
2994
  "changes": [
2983
- "fix: defense-in-depth group chat blocking for proactive BB delivery \u2014 sendProactiveBlueBubblesMessageToSession now rejects group chat keys (;+;) unless intent is explicit_cross_chat (bridge/delegation responses). All upper-layer paths (surface tool, send_message tool, inner-dialog delegation) now filter to DM sessions only for proactive outreach, and pass explicit_cross_chat intent for bridge/delegation returns where group responses are legitimate.",
2995
+ "fix: defense-in-depth group chat blocking for proactive BB delivery - sendProactiveBlueBubblesMessageToSession now rejects group chat keys (;+;) unless intent is explicit_cross_chat (bridge/delegation responses). All upper-layer paths (surface tool, send_message tool, inner-dialog delegation) now filter to DM sessions only for proactive outreach, and pass explicit_cross_chat intent for bridge/delegation returns where group responses are legitimate.",
2984
2996
  "fix: travel advisory now resolves ISO codes that differ from FIPS (ES -> Spain, not El Salvador)",
2985
2997
  "fix: MCP bridge sense now includes MCP tool descriptions in system prompt (browser tools visible)",
2986
2998
  "fix: removed confirmationRequired from credential_store/credential_delete (trust gating sufficient)",
2987
- "feat: vault integration \u2014 Bitwarden/Vaultwarden account creation with PBKDF2/HKDF/AES-256-CBC crypto",
2999
+ "feat: vault integration - Bitwarden/Vaultwarden account creation with PBKDF2/HKDF/AES-256-CBC crypto",
2988
3000
  "feat: vault_setup tool for one-time vault provisioning (family trust gated)",
2989
3001
  "feat: BitwardenCredentialStore wrapping bw CLI for agent-owned vault access"
2990
3002
  ]
@@ -2998,20 +3010,20 @@
2998
3010
  {
2999
3011
  "version": "0.1.0-alpha.229",
3000
3012
  "changes": [
3001
- "fix: BB session key resolution prefers DM (;-;) over group chat (;+;) \u2014 alphabetical sort put group chats first, causing proactive messages to land in group chats instead of personal DMs",
3013
+ "fix: BB session key resolution prefers DM (;-;) over group chat (;+;) - alphabetical sort put group chats first, causing proactive messages to land in group chats instead of personal DMs",
3002
3014
  "chore: remove temporary debug traces (send-message-debug.log, friends.get_called event)"
3003
3015
  ]
3004
3016
  },
3005
3017
  {
3006
3018
  "version": "0.1.0-alpha.227",
3007
3019
  "changes": [
3008
- "fix: direct filesystem name resolution in sendProactiveBlueBubblesMessageToSession \u2014 bypass store.get()/listAll() with raw fs reads on friends directory when store lookup fails"
3020
+ "fix: direct filesystem name resolution in sendProactiveBlueBubblesMessageToSession - bypass store.get()/listAll() with raw fs reads on friends directory when store lookup fails"
3009
3021
  ]
3010
3022
  },
3011
3023
  {
3012
3024
  "version": "0.1.0-alpha.226",
3013
3025
  "changes": [
3014
- "fix(daemon): set agent name before senseTurn so MCP messages resolve identity \u2014 setAgentName() is now called at the top of the senseTurn handler, before any downstream code that depends on agent identity (loadAgentConfig, getAgentSecretsPath, etc.)"
3026
+ "fix(daemon): set agent name before senseTurn so MCP messages resolve identity - setAgentName() is now called at the top of the senseTurn handler, before any downstream code that depends on agent identity (loadAgentConfig, getAgentSecretsPath, etc.)"
3015
3027
  ]
3016
3028
  },
3017
3029
  {
@@ -3023,7 +3035,7 @@
3023
3035
  {
3024
3036
  "version": "0.1.0-alpha.224",
3025
3037
  "changes": [
3026
- "fix: FileFriendStore.get() now resolves friend names \u2014 when UUID lookup fails, scans the friends directory for a name match. This is the deepest possible layer for name resolution, ensuring it works regardless of which tool or code path calls store.get()."
3038
+ "fix: FileFriendStore.get() now resolves friend names - when UUID lookup fails, scans the friends directory for a name match. This is the deepest possible layer for name resolution, ensuring it works regardless of which tool or code path calls store.get()."
3027
3039
  ]
3028
3040
  },
3029
3041
  {
@@ -3035,127 +3047,127 @@
3035
3047
  {
3036
3048
  "version": "0.1.0-alpha.222",
3037
3049
  "changes": [
3038
- "debug: add diagnostic nerves events to proactive BB delivery \u2014 name resolution in both tools-session.ts and sendProactiveBlueBubblesMessageToSession now emit events showing friend count, names, resolution success/failure, and errors. Temporary diagnostics to identify why name\u2192UUID resolution isn't working in production."
3050
+ "debug: add diagnostic nerves events to proactive BB delivery - name resolution in both tools-session.ts and sendProactiveBlueBubblesMessageToSession now emit events showing friend count, names, resolution success/failure, and errors. Temporary diagnostics to identify why name→UUID resolution isn't working in production."
3039
3051
  ]
3040
3052
  },
3041
3053
  {
3042
3054
  "version": "0.1.0-alpha.221",
3043
3055
  "changes": [
3044
- "fix: BB proactive send resolves friend by name when UUID lookup fails \u2014 sendProactiveBlueBubblesMessageToSession now falls back to store.listAll() name matching when store.get() returns null. This handles agents passing friend names instead of UUIDs, bypassing the upstream resolution that wasn't working in all contexts."
3056
+ "fix: BB proactive send resolves friend by name when UUID lookup fails - sendProactiveBlueBubblesMessageToSession now falls back to store.listAll() name matching when store.get() returns null. This handles agents passing friend names instead of UUIDs, bypassing the upstream resolution that wasn't working in all contexts."
3045
3057
  ]
3046
3058
  },
3047
3059
  {
3048
3060
  "version": "0.1.0-alpha.220",
3049
3061
  "changes": [
3050
- "fix: send_message BB session key resolution \u2014 agents don't know the real BB session key (e.g. 'chat_any;-;you@example.com'), so they pass the default 'session'. buildChatRefForSessionKey failed on this fake key, returning missing_target. Now auto-resolves the real BB session key from the sessions directory when the default key is used."
3062
+ "fix: send_message BB session key resolution - agents don't know the real BB session key (e.g. 'chat_any;-;you@example.com'), so they pass the default 'session'. buildChatRefForSessionKey failed on this fake key, returning missing_target. Now auto-resolves the real BB session key from the sessions directory when the default key is used."
3051
3063
  ]
3052
3064
  },
3053
3065
  {
3054
3066
  "version": "0.1.0-alpha.219",
3055
3067
  "changes": [
3056
- "fix: proactive message delivery \u2014 three bugs fixed. (1) surface tool now resolves friend names to UUIDs by scanning friends directory. (2) send_message tool also resolves friend names to UUIDs. (3) deliverCrossChatMessage no longer immediately queues generic_outreach \u2014 it now attempts delivery when a deliverer is available, with the deliverer's own trust checks still gating actual sends. Previously, any proactive send from inner dialog was silently queued without attempting delivery."
3068
+ "fix: proactive message delivery - three bugs fixed. (1) surface tool now resolves friend names to UUIDs by scanning friends directory. (2) send_message tool also resolves friend names to UUIDs. (3) deliverCrossChatMessage no longer immediately queues generic_outreach - it now attempts delivery when a deliverer is available, with the deliverer's own trust checks still gating actual sends. Previously, any proactive send from inner dialog was silently queued without attempting delivery."
3057
3069
  ]
3058
3070
  },
3059
3071
  {
3060
3072
  "version": "0.1.0-alpha.218",
3061
3073
  "changes": [
3062
- "fix: surface tool now resolves friend names to UUIDs \u2014 agents pass friend names but sessions are stored under UUID directories. Added name-to-UUID resolution by scanning the friends directory when the friendId doesn't match a session directory."
3074
+ "fix: surface tool now resolves friend names to UUIDs agents pass friend names but sessions are stored under UUID directories. Added name-to-UUID resolution by scanning the friends directory when the friendId doesn't match a session directory."
3063
3075
  ]
3064
3076
  },
3065
3077
  {
3066
3078
  "version": "0.1.0-alpha.217",
3067
3079
  "changes": [
3068
- "feat: Bitwarden vault client (bw CLI wrapper) with credential gateway \u2014 BitwardenClient class with SDK-first/CLI-fallback, singleton accessor, nerves events on all operations. Raw secrets never enter model context.",
3069
- "feat: vault tools (vault_get, vault_store, vault_list, vault_delete) with trust gating \u2014 read ops require friend+, write/delete require family-only. Destructive operations require confirmation.",
3070
- "feat: stealth browser MCP configuration with trust gating \u2014 Playwright MCP auto-provisions when configured, browser tools appear in agent tool list. Trust-gated to CLI and trusted 1:1 only (group chat blocked).",
3071
- "feat: travel API tools (weather_lookup, travel_advisory, geocode_search) \u2014 native weather via OpenWeatherMap + vault-backed API key, State Dept RSS feed for advisories, geocoding via Nominatim. All friend+ trust-gated.",
3072
- "feat: credential gateway (vaultKey on apiRequest()) \u2014 automatic secret injection from vault into HTTP headers at call time, keeping credentials out of model context entirely.",
3080
+ "feat: Bitwarden vault client (bw CLI wrapper) with credential gateway - BitwardenClient class with SDK-first/CLI-fallback, singleton accessor, nerves events on all operations. Raw secrets never enter model context.",
3081
+ "feat: vault tools (vault_get, vault_store, vault_list, vault_delete) with trust gating - read ops require friend+, write/delete require family-only. Destructive operations require confirmation.",
3082
+ "feat: stealth browser MCP configuration with trust gating - Playwright MCP auto-provisions when configured, browser tools appear in agent tool list. Trust-gated to CLI and trusted 1:1 only (group chat blocked).",
3083
+ "feat: travel API tools (weather_lookup, travel_advisory, geocode_search) - native weather via OpenWeatherMap + vault-backed API key, State Dept RSS feed for advisories, geocoding via Nominatim. All friend+ trust-gated.",
3084
+ "feat: credential gateway (vaultKey on apiRequest()) - automatic secret injection from vault into HTTP headers at call time, keeping credentials out of model context entirely.",
3073
3085
  "feat: travel-planning and browser-navigation skills"
3074
3086
  ]
3075
3087
  },
3076
3088
  {
3077
3089
  "version": "0.1.0-alpha.216",
3078
3090
  "changes": [
3079
- "fix: surface tool proactive BB delivery masked by newer MCP/CLI sessions \u2014 findFreshestFriendSession picked the single freshest session regardless of channel, so an MCP or CLI session being newer than the BB session caused the BB proactive path to be skipped entirely. Now scans all friend sessions, attempts BB delivery first on any BB session, then falls back to queuing on the freshest non-inner session."
3091
+ "fix: surface tool proactive BB delivery masked by newer MCP/CLI sessions - findFreshestFriendSession picked the single freshest session regardless of channel, so an MCP or CLI session being newer than the BB session caused the BB proactive path to be skipped entirely. Now scans all friend sessions, attempts BB delivery first on any BB session, then falls back to queuing on the freshest non-inner session."
3080
3092
  ]
3081
3093
  },
3082
3094
  {
3083
3095
  "version": "0.1.0-alpha.215",
3084
3096
  "changes": [
3085
- "fix: surface tool proactive delivery no longer gated by 24-hour session threshold \u2014 findFreshestFriendSession was called with activeOnly:true, filtering out sessions older than 24h even when the agent explicitly wants to send a proactive message. Both the bridge path and direct path now find any session regardless of age. Proactive BB delivery and trust checks still apply."
3097
+ "fix: surface tool proactive delivery no longer gated by 24-hour session threshold - findFreshestFriendSession was called with activeOnly:true, filtering out sessions older than 24h even when the agent explicitly wants to send a proactive message. Both the bridge path and direct path now find any session regardless of age. Proactive BB delivery and trust checks still apply."
3086
3098
  ]
3087
3099
  },
3088
3100
  {
3089
3101
  "version": "0.1.0-alpha.214",
3090
3102
  "changes": [
3091
- "fix: daemon death diagnostics \u2014 createStderrSink() bypassed EPIPE-safe default in createTerminalSink(), causing uncaught EPIPE crashes when daemon runs detached. Removed redundant unsafe default so the existing try-catch fires.",
3092
- "fix: daemon tombstone now covers all exit paths \u2014 unhandledRejection writes tombstone with full error+stack (was just a warn log, but Node 15+ terminates on these). Added process.on('exit') catch-all for any unanticipated exit. SIGINT/SIGTERM marked graceful to avoid false positives. _lastKnownCause threads real error through to exit handler."
3103
+ "fix: daemon death diagnostics createStderrSink() bypassed EPIPE-safe default in createTerminalSink(), causing uncaught EPIPE crashes when daemon runs detached. Removed redundant unsafe default so the existing try-catch fires.",
3104
+ "fix: daemon tombstone now covers all exit paths unhandledRejection writes tombstone with full error+stack (was just a warn log, but Node 15+ terminates on these). Added process.on('exit') catch-all for any unanticipated exit. SIGINT/SIGTERM marked graceful to avoid false positives. _lastKnownCause threads real error through to exit handler."
3093
3105
  ]
3094
3106
  },
3095
3107
  {
3096
3108
  "version": "0.1.0-alpha.213",
3097
3109
  "changes": [
3098
- "cleanup: remove vestigial subagents/ directory and package.json files entry (content already in ouroboros-skills repo). Remove backward-compat re-exports from heart/core.ts (tools, execTool, summarizeArgs, getToolsForChannel, streamChatCompletion, streamResponsesApi, toResponsesInput, toResponsesTools, buildSystem, Channel, hasToolIntent \u2014 no consumers used them). Update ARCHITECTURE.md, README.md, and CONTRIBUTING.md to reflect the full audit restructuring: new arc/ subsystem, heart/ topic subdirectories, split tool modules, BlueBubbles directory, scopes list."
3110
+ "cleanup: remove vestigial subagents/ directory and package.json files entry (content already in ouroboros-skills repo). Remove backward-compat re-exports from heart/core.ts (tools, execTool, summarizeArgs, getToolsForChannel, streamChatCompletion, streamResponsesApi, toResponsesInput, toResponsesTools, buildSystem, Channel, hasToolIntent no consumers used them). Update ARCHITECTURE.md, README.md, and CONTRIBUTING.md to reflect the full audit restructuring: new arc/ subsystem, heart/ topic subdirectories, split tool modules, BlueBubbles directory, scopes list."
3099
3111
  ]
3100
3112
  },
3101
3113
  {
3102
3114
  "version": "0.1.0-alpha.212",
3103
3115
  "changes": [
3104
- "refactor: consolidate BlueBubbles sense into senses/bluebubbles/ directory \u2014 move 9 flat files (bluebubbles.ts, bluebubbles-client.ts, bluebubbles-model.ts, bluebubbles-media.ts, bluebubbles-inbound-log.ts, bluebubbles-mutation-log.ts, bluebubbles-runtime-state.ts, bluebubbles-session-cleanup.ts, bluebubbles-entry.ts) into senses/bluebubbles/ with shorter names (index.ts, client.ts, model.ts, etc.). All imports updated across 20+ files including test files, sense-manager, daemon, and package.json."
3116
+ "refactor: consolidate BlueBubbles sense into senses/bluebubbles/ directory move 9 flat files (bluebubbles.ts, bluebubbles-client.ts, bluebubbles-model.ts, bluebubbles-media.ts, bluebubbles-inbound-log.ts, bluebubbles-mutation-log.ts, bluebubbles-runtime-state.ts, bluebubbles-session-cleanup.ts, bluebubbles-entry.ts) into senses/bluebubbles/ with shorter names (index.ts, client.ts, model.ts, etc.). All imports updated across 20+ files including test files, sense-manager, daemon, and package.json."
3105
3117
  ]
3106
3118
  },
3107
3119
  {
3108
3120
  "version": "0.1.0-alpha.211",
3109
3121
  "changes": [
3110
- "refactor: extract duplicated patterns into shared utilities \u2014 mind/embedding-provider.ts (shared OpenAI embedding client from diary + associative-recall), arc/json-store.ts (shared JSON file CRUD from obligations + cares + intentions), repertoire/api-client.ts (shared HTTP request helper from graph + ado + github clients). M12 (channel callback factory) skipped: CLI and Teams streaming implementations are too different for clean abstraction."
3122
+ "refactor: extract duplicated patterns into shared utilities mind/embedding-provider.ts (shared OpenAI embedding client from diary + associative-recall), arc/json-store.ts (shared JSON file CRUD from obligations + cares + intentions), repertoire/api-client.ts (shared HTTP request helper from graph + ado + github clients). M12 (channel callback factory) skipped: CLI and Teams streaming implementations are too different for clean abstraction."
3111
3123
  ]
3112
3124
  },
3113
3125
  {
3114
3126
  "version": "0.1.0-alpha.210",
3115
3127
  "changes": [
3116
- "refactor: create src/arc/ subsystem \u2014 extract durable continuity state (obligations, cares, episodes, intentions, presence, attention-types) from heart/ and mind/ into dedicated arc/ module. arc/ owns the agent's continuity state, distinct from engine mechanics (heart) and cognition (mind). All imports updated across 40+ files."
3128
+ "refactor: create src/arc/ subsystem extract durable continuity state (obligations, cares, episodes, intentions, presence, attention-types) from heart/ and mind/ into dedicated arc/ module. arc/ owns the agent's continuity state, distinct from engine mechanics (heart) and cognition (mind). All imports updated across 40+ files."
3117
3129
  ]
3118
3130
  },
3119
3131
  {
3120
3132
  "version": "0.1.0-alpha.209",
3121
3133
  "changes": [
3122
- "refactor: restructure daemon/ directory \u2014 move outlook files to heart/outlook/, habit files to heart/habits/, hatch/specialist files to heart/hatch/, versioning/update files to heart/versioning/, auth-flow to heart/auth/, mcp-server to heart/mcp/. daemon/ reduced from 60 to 36 core daemon-lifecycle files."
3134
+ "refactor: restructure daemon/ directory move outlook files to heart/outlook/, habit files to heart/habits/, hatch/specialist files to heart/hatch/, versioning/update files to heart/versioning/, auth-flow to heart/auth/, mcp-server to heart/mcp/. daemon/ reduced from 60 to 36 core daemon-lifecycle files."
3123
3135
  ]
3124
3136
  },
3125
3137
  {
3126
3138
  "version": "0.1.0-alpha.208",
3127
3139
  "changes": [
3128
- "refactor: split daemon-cli.ts (3,630 lines) into 5 focused modules \u2014 cli-types (command/deps types), cli-parse (argument parsing), cli-render (output formatting), cli-exec (command execution router), cli-defaults (production dependency wiring). daemon-cli.ts reduced to 42-line re-export shim."
3140
+ "refactor: split daemon-cli.ts (3,630 lines) into 5 focused modules cli-types (command/deps types), cli-parse (argument parsing), cli-render (output formatting), cli-exec (command execution router), cli-defaults (production dependency wiring). daemon-cli.ts reduced to 42-line re-export shim."
3129
3141
  ]
3130
3142
  },
3131
3143
  {
3132
3144
  "version": "0.1.0-alpha.207",
3133
3145
  "changes": [
3134
- "refactor: split tools-base.ts (1,912 lines) into 9 category modules \u2014 tools-files, tools-shell, tools-memory, tools-bridge, tools-session, tools-continuity, tools-flow, tools-surface, tools-config. Surface tool handler extracted from tools.ts to tools-surface.ts."
3146
+ "refactor: split tools-base.ts (1,912 lines) into 9 category modules tools-files, tools-shell, tools-memory, tools-bridge, tools-session, tools-continuity, tools-flow, tools-surface, tools-config. Surface tool handler extracted from tools.ts to tools-surface.ts."
3135
3147
  ]
3136
3148
  },
3137
3149
  {
3138
3150
  "version": "0.1.0-alpha.206",
3139
3151
  "changes": [
3140
- "feat: capability discovery and tiered self-configuration \u2014 config registry with tier-aware metadata (T1 self-service, T2 proposal, T3 operator-only), read_config tool with topic-filtered discovery, update_config tool for T1 immediate changes, propose_config tool for T2 operator-approval flow, version-change surfacing in start-of-turn packet via buildCapabilitiesSection"
3152
+ "feat: capability discovery and tiered self-configuration config registry with tier-aware metadata (T1 self-service, T2 proposal, T3 operator-only), read_config tool with topic-filtered discovery, update_config tool for T1 immediate changes, propose_config tool for T2 operator-approval flow, version-change surfacing in start-of-turn packet via buildCapabilitiesSection"
3141
3153
  ]
3142
3154
  },
3143
3155
  {
3144
3156
  "version": "0.1.0-alpha.205",
3145
3157
  "changes": [
3146
- "refactor: enforce subsystem boundaries \u2014 eliminate all heart/ and nerves/ static imports from senses/, move AttentionItem type to heart/, surfaceToolDef to repertoire/, SteeringFollowUpEffect to heart/turn-coordinator, inline BlueBubbles runtime state reader in daemon"
3158
+ "refactor: enforce subsystem boundaries eliminate all heart/ and nerves/ static imports from senses/, move AttentionItem type to heart/, surfaceToolDef to repertoire/, SteeringFollowUpEffect to heart/turn-coordinator, inline BlueBubbles runtime state reader in daemon"
3147
3159
  ]
3148
3160
  },
3149
3161
  {
3150
3162
  "version": "0.1.0-alpha.204",
3151
3163
  "changes": [
3152
- "refactor: introduce TurnContext snapshot \u2014 centralize state assembly from pipeline.ts into buildTurnContext(), thread pre-read state through prompt assembly to eliminate ad-hoc filesystem reads"
3164
+ "refactor: introduce TurnContext snapshot centralize state assembly from pipeline.ts into buildTurnContext(), thread pre-read state through prompt assembly to eliminate ad-hoc filesystem reads"
3153
3165
  ]
3154
3166
  },
3155
3167
  {
3156
3168
  "version": "0.1.0-alpha.203",
3157
3169
  "changes": [
3158
- "refactor: unify obligation systems \u2014 mind/obligations.ts merged into heart/obligations.ts with prefixed ReturnObligation API"
3170
+ "refactor: unify obligation systems mind/obligations.ts merged into heart/obligations.ts with prefixed ReturnObligation API"
3159
3171
  ]
3160
3172
  },
3161
3173
  {
@@ -3171,7 +3183,7 @@
3171
3183
  {
3172
3184
  "version": "0.1.0-alpha.201",
3173
3185
  "changes": [
3174
- "fix: don't launchctl bootstrap after daemon start \u2014 was starting competing daemon that killed the first"
3186
+ "fix: don't launchctl bootstrap after daemon start was starting competing daemon that killed the first"
3175
3187
  ]
3176
3188
  },
3177
3189
  {
@@ -3193,7 +3205,7 @@
3193
3205
  {
3194
3206
  "version": "0.1.0-alpha.197",
3195
3207
  "changes": [
3196
- "fix(daemon): validate agent config before spawn \u2014 skips agents with missing credentials instead of crash-looping"
3208
+ "fix(daemon): validate agent config before spawn skips agents with missing credentials instead of crash-looping"
3197
3209
  ]
3198
3210
  },
3199
3211
  {
@@ -3212,20 +3224,20 @@
3212
3224
  {
3213
3225
  "version": "0.1.0-alpha.194",
3214
3226
  "changes": [
3215
- "fix(daemon): self-spawn restart \u2014 no longer relies on launchd KeepAlive for staged restarts",
3216
- "fix(daemon): error boundary with circuit breaker \u2014 uncaught exceptions logged and survived, exits only after 10+ in 60s",
3227
+ "fix(daemon): self-spawn restart no longer relies on launchd KeepAlive for staged restarts",
3228
+ "fix(daemon): error boundary with circuit breaker uncaught exceptions logged and survived, exits only after 10+ in 60s",
3217
3229
  "fix(daemon): EPIPE suppression in uncaughtException handler",
3218
3230
  "fix(daemon): 5-second force-exit timeouts on all shutdown paths",
3219
3231
  "feat: human-facing and agent-facing provider configs",
3220
3232
  "fix(auth): always refresh codex OAuth token, responses API verification",
3221
- "feat: Outlook visibility \u2014 orientation, obligations, changes, self-fix, memory decisions, route migration to /"
3233
+ "feat: Outlook visibility orientation, obligations, changes, self-fix, memory decisions, route migration to /"
3222
3234
  ]
3223
3235
  },
3224
3236
  {
3225
3237
  "version": "0.1.0-alpha.192",
3226
3238
  "changes": [
3227
- "refactor: canonical obligations \u2014 ActiveWorkFrame as single source of truth for prompt sections",
3228
- "feat(mcp): dynamic server add/remove \u2014 agents can manage MCP servers without daemon restart"
3239
+ "refactor: canonical obligations ActiveWorkFrame as single source of truth for prompt sections",
3240
+ "feat(mcp): dynamic server add/remove agents can manage MCP servers without daemon restart"
3229
3241
  ]
3230
3242
  },
3231
3243
  {
@@ -3237,13 +3249,13 @@
3237
3249
  {
3238
3250
  "version": "0.1.0-alpha.176",
3239
3251
  "changes": [
3240
- "feat(mcp): dynamic MCP server add/remove \u2014 agents can add/remove MCP servers in agent.json without daemon restart"
3252
+ "feat(mcp): dynamic MCP server add/remove agents can add/remove MCP servers in agent.json without daemon restart"
3241
3253
  ]
3242
3254
  },
3243
3255
  {
3244
3256
  "version": "0.1.0-alpha.175",
3245
3257
  "changes": [
3246
- "fix(daemon): launchd KeepAlive for crash recovery \u2014 auto-restarts on crash",
3258
+ "fix(daemon): launchd KeepAlive for crash recovery auto-restarts on crash",
3247
3259
  "fix(daemon): orphan killer excludes MCP server processes",
3248
3260
  "fix(daemon): health file writer wired into daemon-entry",
3249
3261
  "fix(engine): auth-failure errors include actionable guidance"
@@ -3252,50 +3264,50 @@
3252
3264
  {
3253
3265
  "version": "0.1.0-alpha.174",
3254
3266
  "changes": [
3255
- "feat(outlook): keyboard shortcuts \u2014 1-7 for tabs, Esc to collapse",
3256
- "feat(outlook): obligation origin cards \u2014 clickable visual chain from who asked through which channel"
3267
+ "feat(outlook): keyboard shortcuts 1-7 for tabs, Esc to collapse",
3268
+ "feat(outlook): obligation origin cards clickable visual chain from who asked through which channel"
3257
3269
  ]
3258
3270
  },
3259
3271
  {
3260
3272
  "version": "0.1.0-alpha.173",
3261
3273
  "changes": [
3262
- "feat(outlook): sessions grouped by person \u2014 same friend across multiple channels shown together with person header"
3274
+ "feat(outlook): sessions grouped by person same friend across multiple channels shown together with person header"
3263
3275
  ]
3264
3276
  },
3265
3277
  {
3266
3278
  "version": "0.1.0-alpha.172",
3267
3279
  "changes": [
3268
- "feat(outlook): inner dialog landmark navigation \u2014 jump to surfaces, rests, delegations",
3280
+ "feat(outlook): inner dialog landmark navigation jump to surfaces, rests, delegations",
3269
3281
  "feat(outlook): active coding sessions shown on Overview dashboard",
3270
- "feat(outlook): habit confidence indicators \u2014 on schedule, overdue, never fired"
3282
+ "feat(outlook): habit confidence indicators on schedule, overdue, never fired"
3271
3283
  ]
3272
3284
  },
3273
3285
  {
3274
3286
  "version": "0.1.0-alpha.171",
3275
3287
  "changes": [
3276
- "feat(outlook): session state at a glance \u2014 last inbound/outbound shown on each session row",
3277
- "feat(outlook): habit confidence \u2014 on schedule / overdue / never fired indicators"
3288
+ "feat(outlook): session state at a glance last inbound/outbound shown on each session row",
3289
+ "feat(outlook): habit confidence on schedule / overdue / never fired indicators"
3278
3290
  ]
3279
3291
  },
3280
3292
  {
3281
3293
  "version": "0.1.0-alpha.170",
3282
3294
  "changes": [
3283
- "feat(outlook): needs-me triage \u2014 action now vs stale sections, dismiss buttons, return-ready highlighting",
3284
- "fix(outlook): return-ready obligation detection \u2014 highlights results ready but not returned"
3295
+ "feat(outlook): needs-me triage action now vs stale sections, dismiss buttons, return-ready highlighting",
3296
+ "fix(outlook): return-ready obligation detection highlights results ready but not returned"
3285
3297
  ]
3286
3298
  },
3287
3299
  {
3288
3300
  "version": "0.1.0-alpha.169",
3289
3301
  "changes": [
3290
- "fix(outlook): desk prefs wiring \u2014 carrying block, constellations, starred friends now load in production",
3291
- "feat(outlook): obligation dismiss \u2014 agents can clear stale obligations from needs-me queue",
3302
+ "fix(outlook): desk prefs wiring carrying block, constellations, starred friends now load in production",
3303
+ "feat(outlook): obligation dismiss agents can clear stale obligations from needs-me queue",
3292
3304
  "fix: default minimax model updated to MiniMax-M2.7"
3293
3305
  ]
3294
3306
  },
3295
3307
  {
3296
3308
  "version": "0.1.0-alpha.168",
3297
3309
  "changes": [
3298
- "fix(outlook): content area matches sidebar background \u2014 consistent dark surface"
3310
+ "fix(outlook): content area matches sidebar background consistent dark surface"
3299
3311
  ]
3300
3312
  },
3301
3313
  {
@@ -3307,31 +3319,31 @@
3307
3319
  {
3308
3320
  "version": "0.1.0-alpha.166",
3309
3321
  "changes": [
3310
- "fix(outlook): add dark class to html root \u2014 fixes white/blank page in production"
3322
+ "fix(outlook): add dark class to html root fixes white/blank page in production"
3311
3323
  ]
3312
3324
  },
3313
3325
  {
3314
3326
  "version": "0.1.0-alpha.165",
3315
3327
  "changes": [
3316
- "fix(daemon): dont launchctl bootstrap during ouro up \u2014 write plist only, prevents competing daemon process"
3328
+ "fix(daemon): dont launchctl bootstrap during ouro up write plist only, prevents competing daemon process"
3317
3329
  ]
3318
3330
  },
3319
3331
  {
3320
3332
  "version": "0.1.0-alpha.164",
3321
3333
  "changes": [
3322
- "fix(daemon): keep /dev/null fds open until parent exits \u2014 fixes ouro up daemon crash"
3334
+ "fix(daemon): keep /dev/null fds open until parent exits fixes ouro up daemon crash"
3323
3335
  ]
3324
3336
  },
3325
3337
  {
3326
3338
  "version": "0.1.0-alpha.163",
3327
3339
  "changes": [
3328
- "fix(daemon): redirect detached spawn stdio to /dev/null \u2014 fixes ouro up daemon crash"
3340
+ "fix(daemon): redirect detached spawn stdio to /dev/null fixes ouro up daemon crash"
3329
3341
  ]
3330
3342
  },
3331
3343
  {
3332
3344
  "version": "0.1.0-alpha.162",
3333
3345
  "changes": [
3334
- "fix(daemon): handle EPIPE in detached daemon \u2014 suppress pipe errors when parent exits after ouro up"
3346
+ "fix(daemon): handle EPIPE in detached daemon suppress pipe errors when parent exits after ouro up"
3335
3347
  ]
3336
3348
  },
3337
3349
  {
@@ -3344,10 +3356,10 @@
3344
3356
  {
3345
3357
  "version": "0.1.0-alpha.160",
3346
3358
  "changes": [
3347
- "feat(outlook): total inspectability expansion \u2014 14 API endpoints, session x-ray, obligation chain tracing, coding deep inspection, attention/pending queue, bridge inventory, habit triage, memory/journal, friend economics, SSE live updates",
3348
- "feat(outlook): React SPA with Catalyst UI \u2014 sidebar layout, 7-tab agent inspector, chat bubble transcripts with mechanism-tool awareness, hash URL routing",
3349
- "feat(outlook): agent desk customization \u2014 carrying block, pinned constellations, tab ordering, starred friends, status line, closure memory, needs-me urgency queue",
3350
- "feat(outlook): nerves observation layer \u2014 shared typed readers, eliminates bespoke type mirrors",
3359
+ "feat(outlook): total inspectability expansion 14 API endpoints, session x-ray, obligation chain tracing, coding deep inspection, attention/pending queue, bridge inventory, habit triage, memory/journal, friend economics, SSE live updates",
3360
+ "feat(outlook): React SPA with Catalyst UI sidebar layout, 7-tab agent inspector, chat bubble transcripts with mechanism-tool awareness, hash URL routing",
3361
+ "feat(outlook): agent desk customization carrying block, pinned constellations, tab ordering, starred friends, status line, closure memory, needs-me urgency queue",
3362
+ "feat(outlook): nerves observation layer shared typed readers, eliminates bespoke type mirrors",
3351
3363
  "fix(auth): ouro auth for openai-codex always refreshes token, provider verification uses correct endpoint",
3352
3364
  "fix(daemon): use OUTLOOK_DEFAULT_PORT (6876) for Outlook server"
3353
3365
  ]
@@ -3365,12 +3377,12 @@
3365
3377
  {
3366
3378
  "version": "0.1.0-alpha.158",
3367
3379
  "changes": [
3368
- "Task scanner v2: explicit identity via kind: task field. Scanner only parses files that declare themselves as task cards \u2014 doing docs, planning docs, and artifacts silently skipped. Eliminates 184 false parse errors on real bundles.",
3380
+ "Task scanner v2: explicit identity via kind: task field. Scanner only parses files that declare themselves as task cards doing docs, planning docs, and artifacts silently skipped. Eliminates 184 false parse errors on real bundles.",
3369
3381
  "Typed issue model: every scanner issue has a code, description, proposed fix, confidence (safe/needs_review), and category (live/migration). Replaces flat parseErrors/invalidFilenames arrays.",
3370
3382
  "Board health line: compact board shows health: clean or health: 1 live, 10 migration. Live vs migration split prevents cleanup noise from looking like breakage.",
3371
3383
  "Fix command: ouro task fix (dry-run), ouro task fix --safe (apply deterministic fixes), ouro task fix <id> (inspect/apply individual issues). Currently auto-fixes schema-missing-kind.",
3372
3384
  "Cancelled status: new terminal state reachable from any active status. Auto-archives with work directory, hidden from active board view.",
3373
- "Derived child_tasks: computed at scan time from parent_task links. child_tasks removed from authored schema \u2014 no more hand-maintained stale arrays.",
3385
+ "Derived child_tasks: computed at scan time from parent_task links. child_tasks removed from authored schema no more hand-maintained stale arrays.",
3374
3386
  "Work directory awareness: same-stem directories detected and listed on TaskFile (hasWorkDir, workDirFiles). Scanner never descends into them.",
3375
3387
  "Collection root clutter detection: non-task support docs at collection root summarized as one aggregated migration issue per collection.",
3376
3388
  "Root-only scanning: flat directory reads replace recursive walks. Faster and correct."
@@ -3379,7 +3391,7 @@
3379
3391
  {
3380
3392
  "version": "0.1.0-alpha.157",
3381
3393
  "changes": [
3382
- "Habit turns as awareness: unified buildHabitTurnMessage replaces contextual-heartbeat. Continuity-first format (checkpoint leads, not elapsed time). Same format for all habits \u2014 no heartbeat special-casing.",
3394
+ "Habit turns as awareness: unified buildHabitTurnMessage replaces contextual-heartbeat. Continuity-first format (checkpoint leads, not elapsed time). Same format for all habits no heartbeat special-casing.",
3383
3395
  "First beat experience: new habits get \"your [Title] is alive. this is its first breath\" on first fire.",
3384
3396
  "Fix: reconcile() now fires new/overdue habits immediately (was start()-only). New habits created via write_file fire within seconds.",
3385
3397
  "Rhythm awareness across all channels: rhythmStatusSection() in system prompt shows heartbeat health in every conversation.",
@@ -3405,10 +3417,10 @@
3405
3417
  {
3406
3418
  "version": "0.1.0-alpha.154",
3407
3419
  "changes": [
3408
- "feat: clean tool status messages \u2014 human-readable by default, /debug toggle",
3420
+ "feat: clean tool status messages human-readable by default, /debug toggle",
3409
3421
  "humanReadableToolDescription derives from tool name+args, not hardcoded map",
3410
- "Shared tool activity callbacks (DRY) \u2014 senses only provide render function",
3411
- "Slash command handling moved to pipeline \u2014 all senses get /debug for free",
3422
+ "Shared tool activity callbacks (DRY) senses only provide render function",
3423
+ "Slash command handling moved to pipeline all senses get /debug for free",
3412
3424
  "BlueBubbles: one clean iMessage per tool, not raw shared work: processing"
3413
3425
  ]
3414
3426
  },
@@ -3505,7 +3517,7 @@
3505
3517
  "version": "0.1.0-alpha.142",
3506
3518
  "changes": [
3507
3519
  "Surface tool fulfills heart obligations on successful routing: findPendingObligationForOrigin + fulfillObligation called after inner obligation advance, wrapped in try/catch.",
3508
- "New fulfillHeartObligation callback on HandleSurfaceInput \u2014 origin-based lookup independent of inner obligationId.",
3520
+ "New fulfillHeartObligation callback on HandleSurfaceInput origin-based lookup independent of inner obligationId.",
3509
3521
  "ouro inner status command: reads runtime.json, journal dir, heartbeat cadence, attention count. Shows last turn, status, heartbeat health, journal listing, held thoughts."
3510
3522
  ]
3511
3523
  },
@@ -3545,21 +3557,21 @@
3545
3557
  {
3546
3558
  "version": "0.1.0-alpha.138",
3547
3559
  "changes": [
3548
- "Memory renamed to diary: memory.ts \u2192 diary.ts, MemoryFact \u2192 DiaryEntry, memory_save \u2192 diary_write, memory_search \u2192 recall. All types, functions, events, and variables renamed throughout.",
3560
+ "Memory renamed to diary: memory.ts → diary.ts, MemoryFact → DiaryEntry, memory_save → diary_write, memory_search → recall. All types, functions, events, and variables renamed throughout.",
3549
3561
  "Diary path: diary/ (top-level) replaces psyche/memory/. Schema-2 migration copies files; legacy fallback removed.",
3550
3562
  "Journal workspace: journal/ directory for freeform thinking-in-progress. Agent writes with write_file, system reads for heartbeat context.",
3551
3563
  "Unified recall tool: searches both diary entries and journal files. Results tagged [diary] or [journal].",
3552
3564
  "Journal embeddings: file-level embeddings indexed during heartbeat via journal/.index.json sidecar.",
3553
3565
  "Journal section in inner dialog system prompt: index of up to 10 most recently modified journal files with name, recency, and first-line preview.",
3554
3566
  "Metacognitive framing updated: diary (record), journal (workspace), ponder/rest vocabulary, morning briefing encouragement.",
3555
- "Session migration: memory_save \u2192 diary_write, memory_search \u2192 recall added to migrateToolNames()."
3567
+ "Session migration: memory_save → diary_write, memory_search → recall added to migrateToolNames()."
3556
3568
  ]
3557
3569
  },
3558
3570
  {
3559
3571
  "version": "0.1.0-alpha.137",
3560
3572
  "changes": [
3561
3573
  "ouro dev auto-discovers existing repo at ~/Projects/ouroboros or prompts for clone path.",
3562
- "ouro dev never clones without user consent \u2014 prompts in interactive mode, errors in non-interactive.",
3574
+ "ouro dev never clones without user consent prompts in interactive mode, errors in non-interactive.",
3563
3575
  "ouro dev --repo-path errors clearly when the specified path has no repo."
3564
3576
  ]
3565
3577
  },
@@ -3597,7 +3609,7 @@
3597
3609
  {
3598
3610
  "version": "0.1.0-alpha.133",
3599
3611
  "changes": [
3600
- "Inner return obligations: delegated inner dialog work now tracks a ReturnObligation through queued \u2192 running \u2192 returned/deferred lifecycle.",
3612
+ "Inner return obligations: delegated inner dialog work now tracks a ReturnObligation through queued → running → returned/deferred lifecycle.",
3601
3613
  "Exact-origin routing: inner dialog completions route back to the session that delegated the work, not just the freshest active session.",
3602
3614
  "Active work frame surfaces pending inner return obligations so the agent knows what's outstanding."
3603
3615
  ]
@@ -3645,14 +3657,14 @@
3645
3657
  {
3646
3658
  "version": "0.1.0-alpha.126",
3647
3659
  "changes": [
3648
- "Fixed Anthropic tool_choice incompatibility with thinking \u2014 uses auto instead of any when thinking is enabled.",
3660
+ "Fixed Anthropic tool_choice incompatibility with thinking uses auto instead of any when thinking is enabled.",
3649
3661
  "auth verify and auth switch now use pingProvider for real API verification instead of format-only checks. auth switch verifies credentials work before switching."
3650
3662
  ]
3651
3663
  },
3652
3664
  {
3653
3665
  "version": "0.1.0-alpha.125",
3654
3666
  "changes": [
3655
- "Fixed Anthropic tool_choice incompatibility with thinking \u2014 uses auto instead of any when thinking is enabled.",
3667
+ "Fixed Anthropic tool_choice incompatibility with thinking uses auto instead of any when thinking is enabled.",
3656
3668
  "auth verify and auth switch now use pingProvider for real API verification instead of format-only checks. auth switch verifies credentials work before switching."
3657
3669
  ]
3658
3670
  },
@@ -3684,7 +3696,7 @@
3684
3696
  {
3685
3697
  "version": "0.1.0-alpha.120",
3686
3698
  "changes": [
3687
- "Daemon startup now kills ALL orphaned ouro processes (daemons AND agents) from previous instances \u2014 fixes stale-version processes handling requests after every update.",
3699
+ "Daemon startup now kills ALL orphaned ouro processes (daemons AND agents) from previous instances fixes stale-version processes handling requests after every update.",
3688
3700
  "Failover error messages no longer contain raw JSON API response bodies. Error messages are sanitized at the source and the failover summary uses clean classification labels only."
3689
3701
  ]
3690
3702
  },
@@ -3724,10 +3736,10 @@
3724
3736
  "changes": [
3725
3737
  "Fix: Default runtime logger is now silent (no stderr sink) so nerves events emitted before logger configuration no longer interleave with the CLI spinner animation.",
3726
3738
  "Fix: MCP server connect failures now include the command name, args, and a hint to check agent.json mcpServers configuration. Retry-exhaustion messages also identify the failing command.",
3727
- "Verification: StreamingWordWrapper integration in CLI chat confirmed working \u2014 wraps at word boundaries during streaming output.",
3739
+ "Verification: StreamingWordWrapper integration in CLI chat confirmed working wraps at word boundaries during streaming output.",
3728
3740
  "When a model provider fails mid-conversation (auth error, usage limit, outage), the harness now classifies the error, pings alternative configured providers, and surfaces validated failover options to the user in-channel. Reply 'switch to <provider>' to continue on a working provider.",
3729
3741
  "Each provider now has a `classifyError` method that distinguishes auth failures, usage/subscription limits, rate limits, server errors, and network errors. The old auth guidance wrappers are replaced by this unified classification system.",
3730
- "New `pingProvider` function makes a real heartbeat completion call to verify provider credentials and quota are live \u2014 no more format-only checks.",
3742
+ "New `pingProvider` function makes a real heartbeat completion call to verify provider credentials and quota are live no more format-only checks.",
3731
3743
  "Provider factories now accept optional config parameters, enabling credential injection for health inventory pings without touching disk config."
3732
3744
  ]
3733
3745
  },
@@ -3862,7 +3874,7 @@
3862
3874
  {
3863
3875
  "version": "0.1.0-alpha.94",
3864
3876
  "changes": [
3865
- "Fix stale CurrentVersion symlink not healing during `ouro up` \u2014 the daemon now detects and repairs dangling version symlinks before reading the active version.",
3877
+ "Fix stale CurrentVersion symlink not healing during `ouro up` the daemon now detects and repairs dangling version symlinks before reading the active version.",
3866
3878
  "Fix homedir regression in daemon-cli-defaults test and cover changelog-null branch."
3867
3879
  ]
3868
3880
  },
@@ -3956,14 +3968,14 @@
3956
3968
  {
3957
3969
  "version": "0.1.0-alpha.80",
3958
3970
  "changes": [
3959
- "Bootstrap package (npx ouro.bot) now installs into ~/.ouro-cli/ versioned layout directly. No more silent npx updates \u2014 every install and update is logged. Cleans up old ~/.local/bin/ouro wrapper."
3971
+ "Bootstrap package (npx ouro.bot) now installs into ~/.ouro-cli/ versioned layout directly. No more silent npx updates every install and update is logged. Cleans up old ~/.local/bin/ouro wrapper."
3960
3972
  ]
3961
3973
  },
3962
3974
  {
3963
3975
  "version": "0.1.0-alpha.79",
3964
3976
  "changes": [
3965
3977
  "New: Versioned CLI directory layout (~/.ouro-cli/) replaces npx-based ouro wrapper. Explicit version management, rollback support, and deterministic updates.",
3966
- "New: `ouro up` now checks the registry for newer CLI versions, installs them into ~/.ouro-cli/versions/, activates via symlink flip, and re-execs \u2014 no more silent npx downloads.",
3978
+ "New: `ouro up` now checks the registry for newer CLI versions, installs them into ~/.ouro-cli/versions/, activates via symlink flip, and re-execs no more silent npx downloads.",
3967
3979
  "New: `ouro rollback [<version>]` swaps CurrentVersion/previous symlinks, stops the daemon. With a version arg, installs if needed then activates.",
3968
3980
  "New: `ouro versions` lists cached CLI versions with * current and (previous) markers.",
3969
3981
  "Migration: On first run, old ~/.local/bin/ouro wrapper is removed, old PATH entry cleaned from shell profile, new ~/.ouro-cli/bin added to PATH.",
@@ -3985,10 +3997,10 @@
3985
3997
  {
3986
3998
  "version": "0.1.0-alpha.76",
3987
3999
  "changes": [
3988
- "Fix: CLI chat terminal logging now filters to warn/error only \u2014 info-level nerves logs go to ndjson file only, keeping the interactive TUI clean.",
4000
+ "Fix: CLI chat terminal logging now filters to warn/error only — info-level nerves logs go to ndjson file only, keeping the interactive TUI clean.",
3989
4001
  "Fix: Streamed model output now wraps at word boundaries instead of mid-word. A new StreamingWordWrapper buffers partial lines and breaks at spaces when approaching terminal width.",
3990
4002
  "New: `ouro up` now prints 'ouro updated to <version> (was <previous>)' when npx downloads a newer CLI binary, separate from the agent bundle update message.",
3991
- "Fix: Spinner/log interleave verified \u2014 terminal sink reads pause/resume hooks at call time, not creation time, so the filterSink wrapper in CLI logging does not break spinner coordination."
4003
+ "Fix: Spinner/log interleave verified — terminal sink reads pause/resume hooks at call time, not creation time, so the filterSink wrapper in CLI logging does not break spinner coordination."
3992
4004
  ]
3993
4005
  },
3994
4006
  {
@@ -4040,7 +4052,7 @@
4040
4052
  {
4041
4053
  "version": "0.1.0-alpha.69",
4042
4054
  "changes": [
4043
- "Generic MCP client: ouroboros agents can now connect to any MCP server configured in agent.json. Zero new dependencies \u2014 pure JSON-RPC over stdio.",
4055
+ "Generic MCP client: ouroboros agents can now connect to any MCP server configured in agent.json. Zero new dependencies — pure JSON-RPC over stdio.",
4044
4056
  "New `ouro mcp list` and `ouro mcp call` CLI commands route through the daemon socket to persistent MCP connections, so agents use shared server instances instead of spawning fresh ones per call.",
4045
4057
  "MCP tools are injected into the agent's system prompt on startup, so agents know what external capabilities are available without a discovery step.",
4046
4058
  "Trust manifest: `mcp list` requires acquaintance trust, `mcp call` requires friend trust."
@@ -4049,7 +4061,7 @@
4049
4061
  {
4050
4062
  "version": "0.1.0-alpha.68",
4051
4063
  "changes": [
4052
- "New no_response tool lets agents stay silent in group chats when the moment doesn't call for a reply \u2014 reactions, side conversations, and tapbacks no longer trigger unwanted responses.",
4064
+ "New no_response tool lets agents stay silent in group chats when the moment doesn't call for a reply — reactions, side conversations, and tapbacks no longer trigger unwanted responses.",
4053
4065
  "Group chat participation prompt teaches agents to be intentional participants, comfortable with silence, and to prefer reactions over full text replies when appropriate.",
4054
4066
  "System prompt includes --agent flag in all ouro CLI examples for non-daemon deployments. Azure startup symlinks ouro CLI into /usr/local/bin."
4055
4067
  ]
@@ -4063,7 +4075,7 @@
4063
4075
  {
4064
4076
  "version": "0.1.0-alpha.65",
4065
4077
  "changes": [
4066
- "Tool permissions overhauled: channel-level blocking removed, all tools now visible on all channels. Guardrails are invocation-level with two layers \u2014 structural (edit-requires-read, destructive pattern blocking, protected paths) always on for everyone, and trust-level (ouro CLI per-subcommand trust manifest, general CLI allowlists, bundle-scoped writes) for untrusted contexts.",
4078
+ "Tool permissions overhauled: channel-level blocking removed, all tools now visible on all channels. Guardrails are invocation-level with two layers — structural (edit-requires-read, destructive pattern blocking, protected paths) always on for everyone, and trust-level (ouro CLI per-subcommand trust manifest, general CLI allowlists, bundle-scoped writes) for untrusted contexts.",
4067
4079
  "New `ouro changelog` CLI subcommand reads changelog.json and supports `--from <version>` for delta filtering, so agents can introspect their own update history on any channel.",
4068
4080
  "Compound shell commands (&&, ;, |, $()) are blocked for untrusted users to prevent smuggling dangerous operations behind safe prefixes.",
4069
4081
  "Azure App Service deployment migrated from zip-deploy to npm-based harness install with persistent agent bundle and managed identity auth."
@@ -4210,9 +4222,9 @@
4210
4222
  "changes": [
4211
4223
  "Inner dialog now knows which task triggered it: taskId flows from daemon poke through the worker into the turn, and the agent gets the full task file content instead of a generic heartbeat prompt.",
4212
4224
  "Inner dialog boot message includes aspirations and state summary instead of a vacuous placeholder, so the agent wakes up with context about what matters and what's happening.",
4213
- "Vestigial `drainInbox` removed from inner dialog \u2014 pipeline already handles pending drain correctly.",
4225
+ "Vestigial `drainInbox` removed from inner dialog — pipeline already handles pending drain correctly.",
4214
4226
  "Inner dialog nerves events now include assistant response preview, tool call names, token usage, and taskId for meaningful observability.",
4215
- "`ouro thoughts` command reads and formats inner dialog session turns with `--last`, `--json`, `--follow`, and `--agent` flags \u2014 humans can now see what the agent has been thinking.",
4227
+ "`ouro thoughts` command reads and formats inner dialog session turns with `--last`, `--json`, `--follow`, and `--agent` flags — humans can now see what the agent has been thinking.",
4216
4228
  "`readTaskFile` searches collection subdirectories (one-shots, ongoing, habits) since the scheduler sends bare task stems without collection prefixes.",
4217
4229
  "`ouro reminder create` accepts `--requester` to track who requested a reminder for notification round-trip.",
4218
4230
  "Response extraction handles `tool_choice=required` models by falling back to `final_answer` tool call arguments when assistant message content is empty."
@@ -4242,25 +4254,25 @@
4242
4254
  {
4243
4255
  "version": "0.1.0-alpha.42",
4244
4256
  "changes": [
4245
- "Associative recall now skips corrupt JSONL lines instead of crashing \u2014 matches the resilient pattern already used in memory.ts."
4257
+ "Associative recall now skips corrupt JSONL lines instead of crashing — matches the resilient pattern already used in memory.ts."
4246
4258
  ]
4247
4259
  },
4248
4260
  {
4249
4261
  "version": "0.1.0-alpha.41",
4250
4262
  "changes": [
4251
- "JSONL readers (memory facts, inter-agent inbox) now skip corrupt lines instead of crashing \u2014 partial writes from crashes no longer lose all data.",
4263
+ "JSONL readers (memory facts, inter-agent inbox) now skip corrupt lines instead of crashing — partial writes from crashes no longer lose all data.",
4252
4264
  "Inter-agent message router now parses before clearing the inbox file, and preserves unparsed lines so corrupt messages are not silently lost.",
4253
- "Inner-dialog checkpoint derivation no longer crashes on all-whitespace assistant content \u2014 returns fallback checkpoint instead.",
4265
+ "Inner-dialog checkpoint derivation no longer crashes on all-whitespace assistant content — returns fallback checkpoint instead.",
4254
4266
  "Update checker interval now catches and logs errors from the onUpdate callback instead of silently swallowing them."
4255
4267
  ]
4256
4268
  },
4257
4269
  {
4258
4270
  "version": "0.1.0-alpha.40",
4259
4271
  "changes": [
4260
- "Removed dead backward-compat re-exports from core.ts (tools, streaming, prompt, kicks) \u2014 consumers already import from the canonical modules.",
4272
+ "Removed dead backward-compat re-exports from core.ts (tools, streaming, prompt, kicks) — consumers already import from the canonical modules.",
4261
4273
  "Removed dead exports: baseToolHandlers, teamsToolHandlers, teamsTools, __internal (token-estimate), TASK_STEM_PATTERN, checkAndRecord403 no-op and METHOD_TO_ACTION.",
4262
4274
  "Consolidated duplicate sanitizeKey (config.ts + bluebubbles-mutation-log.ts) and slugify (hatch-flow.ts + tasks/index.ts) into shared exports from config.ts.",
4263
- "Replaced all as-any casts in source with proper TypeScript narrowing or Record<string, unknown> \u2014 only 2 SDK-required casts remain.",
4275
+ "Replaced all as-any casts in source with proper TypeScript narrowing or Record<string, unknown> — only 2 SDK-required casts remain.",
4264
4276
  "Removed unnecessary as-unknown-as casts on readdirSync (4 locations) and spawner double-cast.",
4265
4277
  "Cleaned up commented-out kick detection code, stale TODOs, misplaced imports, and unused type imports."
4266
4278
  ]
@@ -4268,9 +4280,9 @@
4268
4280
  {
4269
4281
  "version": "0.1.0-alpha.39",
4270
4282
  "changes": [
4271
- "All senses now route through a shared per-turn pipeline \u2014 friend resolution, trust gate, session load, pending drain, agent turn, post-turn, and token accumulation happen in one place instead of four.",
4283
+ "All senses now route through a shared per-turn pipeline — friend resolution, trust gate, session load, pending drain, agent turn, post-turn, and token accumulation happen in one place instead of four.",
4272
4284
  "Trust gate is now channel-aware: open senses (iMessage) enforce stranger/acquaintance rules, closed senses (Teams) trust the org, local and internal always pass through.",
4273
- "Tool access and prompt restrictions use a single shared isTrustedLevel check \u2014 no more scattered family/friend comparisons that could drift apart.",
4285
+ "Tool access and prompt restrictions use a single shared isTrustedLevel check — no more scattered family/friend comparisons that could drift apart.",
4274
4286
  "Pending messages now inject correctly into multimodal content (image attachments no longer silently drop pending messages).",
4275
4287
  "ouro reminder create supports --agent flag, matching every other identity-scoped CLI command."
4276
4288
  ]
@@ -4278,11 +4290,11 @@
4278
4290
  {
4279
4291
  "version": "0.1.0-alpha.38",
4280
4292
  "changes": [
4281
- "You now have a proper body map \u2014 understanding of your home (bundle) and bones (harness), what each directory is for, and how to modify your own configuration.",
4293
+ "You now have a proper body map — understanding of your home (bundle) and bones (harness), what each directory is for, and how to modify your own configuration.",
4282
4294
  "Inner dialog is now genuine internal monologue with metacognitive framing, not a second CLI session. Heartbeat and bootstrap messages read as first-person awareness.",
4283
4295
  "Cross-session communication works end-to-end: inner dialog thoughts surface as [inner thought: ...] in conversations, messages to yourself route to inner dialog, and you can proactively reach out to friends via iMessage and Teams.",
4284
4296
  "Tool audit: removed wrapper tools (git_commit, gh_cli, get_current_time, list_directory), added surgical tools (edit_file, glob, grep, read_file with offset/limit), consolidated 7 task tools + schedule_reminder + friend tools into ouro CLI commands.",
4285
- "You now understand why certain tools are restricted in certain contexts \u2014 trust level and shared channels each have independent, explained gates.",
4297
+ "You now understand why certain tools are restricted in certain contexts — trust level and shared channels each have independent, explained gates.",
4286
4298
  "ouro friend link/unlink commands handle orphan cleanup when linking external identities, merging duplicate friend records intelligently.",
4287
4299
  "During onboarding, the adoption specialist can collect phone number and Teams handle to create an initial friend record with contact info."
4288
4300
  ]