@openlife/cli 1.7.3 → 1.7.5

Files changed (67)
  1. package/CHANGELOG.md +186 -0
  2. package/CODE_OF_CONDUCT.md +31 -0
  3. package/CONTRIBUTING.md +133 -0
  4. package/README.md +25 -9
  5. package/dist/index.js +10 -1
  6. package/package.json +10 -2
  7. package/docs/CHANGELOG_FEATURE_ROLLOUT_DESIGNMD.md +0 -43
  8. package/docs/EXTERNAL_SOURCES_AND_SECURITY_GUARD.md +0 -33
  9. package/docs/OPENLIFE_AUDIT_2026-05-06.md +0 -170
  10. package/docs/OPENLIFE_CONSOLIDATED_PLAN_2026-05-06.md +0 -299
  11. package/docs/OPENLIFE_DUAL_MODE_IMPLEMENTATION_PLAN.md +0 -205
  12. package/docs/OPENLIFE_EVOLUTION_SURFACE_2026-05-07.md +0 -53
  13. package/docs/OPENLIFE_SKILLS_IMPORT_2026-05-07.json +0 -223
  14. package/docs/OPENLIFE_SQUADS_IMPORT_2026-05-07.json +0 -184
  15. package/docs/PAPERCLIP_OPENLIFE_INVESTIGATION.md +0 -85
  16. package/docs/RELEASE_ORGANIZATION_PLAN.md +0 -164
  17. package/docs/audit/CLI-EXECUTION-RESULTS.md +0 -113
  18. package/docs/audit/CLI-MATRIX.md +0 -556
  19. package/docs/audit/DOC-PARITY-GAPS.md +0 -351
  20. package/docs/audit/ORCHESTRATOR-MATRIX.md +0 -136
  21. package/docs/audit/TEST-COVERAGE-GAPS.md +0 -334
  22. package/docs/audit/integrations/SKIPPED.md +0 -101
  23. package/docs/autonomous-install.md +0 -79
  24. package/docs/capability-genesis.md +0 -137
  25. package/docs/capability-pack-schema.md +0 -157
  26. package/docs/commands.md +0 -82
  27. package/docs/deep-research-capability.md +0 -114
  28. package/docs/development/typescript-conventions.md +0 -95
  29. package/docs/host-installers.md +0 -68
  30. package/docs/install/aiobuilder.md +0 -70
  31. package/docs/install/claude-code.md +0 -83
  32. package/docs/install/codex.md +0 -64
  33. package/docs/install/gemini-cli.md +0 -64
  34. package/docs/install/runtime-profiles.md +0 -83
  35. package/docs/openlife-agent-os-blueprint.md +0 -114
  36. package/docs/openlife-install-backlog.md +0 -115
  37. package/docs/openlife-install-spec.md +0 -306
  38. package/docs/operations/CLOUD_CUTOVER_AUDIT.md +0 -37
  39. package/docs/operations/PHASE_PROGRESS_CONTINUATION.md +0 -24
  40. package/docs/performance-benchmarks.md +0 -83
  41. package/docs/planning/v1.3-capability-genesis.md +0 -157
  42. package/docs/plans/2026-05-05-admin-interface-professional-dark-premium-plan.md +0 -84
  43. package/docs/plans/2026-05-05-openlife-autonomous-domain-marketplace-masterplan.md +0 -122
  44. package/docs/roadmap/OPENLIFE_MASTER_PLAN_CLOUD_V3.md +0 -97
  45. package/docs/sandboxing-research.md +0 -117
  46. package/docs/stories/epic-feature-audit/1.1.story.md +0 -84
  47. package/docs/stories/epic-feature-audit/1.2.story.md +0 -102
  48. package/docs/stories/epic-feature-audit/1.3.story.md +0 -93
  49. package/docs/stories/epic-feature-audit/1.5.story.md +0 -121
  50. package/docs/stories/epic-feature-audit/1.6.story.md +0 -80
  51. package/docs/stories/epic-feature-completeness/2.1.story.md +0 -70
  52. package/docs/stories/epic-feature-completeness/2.2.story.md +0 -49
  53. package/docs/stories/epic-feature-completeness/2.3.story.md +0 -74
  54. package/docs/stories/epic-feature-completeness/2.4.story.md +0 -71
  55. package/docs/stories/epic-feature-completeness/3.1.story.md +0 -56
  56. package/docs/stories/epic-feature-completeness/3.2.story.md +0 -80
  57. package/docs/stories/epic-feature-completeness/3.3.story.md +0 -68
  58. package/docs/stories/epic-feature-completeness/3.4.story.md +0 -71
  59. package/docs/stories/epic-feature-completeness/3.5.story.md +0 -72
  60. package/docs/stories/epic-feature-completeness/3.6.story.md +0 -69
  61. package/docs/stories/epic-feature-completeness/3.7.story.md +0 -68
  62. package/docs/stories/epic-feature-completeness/3.8.story.md +0 -57
  63. package/docs/v1.4-changelog.md +0 -159
  64. package/docs/v1.5-changelog.md +0 -106
  65. package/docs/v1.5-roadmap.md +0 -121
  66. package/docs/v1.6-changelog.md +0 -67
  67. package/docs/v1.6-roadmap.md +0 -89
@@ -1,159 +0,0 @@
1
- # OpenLife v1.4 — "Path to 10/10" Changelog
2
-
3
- **Branch:** `feat/v1.4-tenten`
4
- **Status:** All sprints complete; awaiting approval for PR + merge + tag `v1.4.0`.
5
-
6
- v1.4 is pure consolidation. No new pillars, no new conceptual surface. The
7
- five epics close eight structurally identified gaps from the v1.3 → 9.6/10 audit,
8
- fulfill the remaining v1.3 placeholders, and add CI + perf telemetry. Target
9
- scorecard: **10.0 / 10 (A+)**.
10
-
11
- ---
12
-
13
- ## What changed by epic
14
-
15
- ### Epic 8 — Intelligence wiring (Sprint 1)
16
-
17
- | Story | Surface | Outcome |
18
- |---|---|---|
19
- | 8.1 | `Brain.isAnyProviderAvailable()`, `SquadCreator.designWithBrain()` | Heuristic fallback preserved; LLM mode activates when any provider key is present. No extra flag — presence of the key is the opt-in signal. |
20
- | 8.2 | `SkillCreator.designWithBrain()` | Same pattern as 8.1; shares the structured JSON contract. |
21
- | 8.3 | `OrchestrationLoop.firePostMissionHook()` → `OmniMemory.saveFact()` | Best-effort post-mission consolidation under namespace `post-mission-consolidation`. Disable via `OPENLIFE_POST_MISSION_CONSOLIDATION=off`. |
22
-
23
- ### Epic 9 — Real I/O (Sprint 2)
24
-
25
- | Story | Surface | Outcome |
26
- |---|---|---|
27
- | 9.1 | `SecurityDownloadGuard.downloadAndScan(url, targetDir?, opts?)` | Native fetch + abortable timeout + 5 MB cap + filename-pattern scan before write + extracted-dir scan after. Returns `{ ok, downloadedTo?, bytesWritten?, errors[], warnings[] }`. Never throws. |
28
- | 9.2 | `ExternalCatalogRegistry.importAndFetch(...)` | Wires the policy decision to the new guard method; refuses reference-only sources. |
29
- | 9.3 | `HostInstaller.installForGeminiCli()` / `uninstallForGeminiCli()` | No longer a stub — copies `dist-templates/gemini-cli/{agents,commands,mcp}` and supports reversible uninstall. |
30
- | 9.4 | `HostInstaller.installForCodex()` / `uninstallForCodex()` | Mirror for `~/.codex/`. |
31
- | 9.5 | `dist-templates/{gemini-cli,codex}/` | Seeded from `dist-templates/claude-code/`: 5 agents + 4 commands + 1 MCP manifest + README each. |
32
-
33
- ### Epic 10 — Enforcement (Sprint 3 + part of Sprint 1)
34
-
35
- | Story | Surface | Outcome |
36
- |---|---|---|
37
- | 10.1 | `assertToolsetAllowed('terminal' | 'delegation', ...)` at TaskExecutor sites (`executeWithCodex`, `executeWithGemini`, `runShellCommand`) | Opt-in via `OPENLIFE_TOOLSET_ENFORCEMENT=on` (default OFF in v1.4 — soaks one milestone; flips to default ON in v1.5). Stable error code: `toolset_blocked:<category>`. |
38
- | 10.2 | `assertToolsetAllowed` at `Brain.thinkWithOpenAICLI` + `Brain.thinkWithGeminiCLI` | Same pattern, same flag. |
39
- | 10.3 | `src/orchestrator/workflow/ConditionParser.ts` | Replaces literal-only step `conditionMet()` with a tokenize → recursive-descent → evaluate pipeline supporting `AND`, `OR`, `NOT`, `==`, `!=`, parentheses, and dotted identifiers. Backward-compat: bare identifier still means `ctx[id] === true`. |
40
-
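Reviewer note on Story 10.3: the tokenize → recursive-descent → evaluate pipeline described in the table above can be sketched roughly as follows. This is a minimal illustration, not the contents of `ConditionParser.ts`. The names `tokenize`, `lookup`, `literalValue`, and `parseCondition`, the uppercase-only keywords, and the precedence (`NOT` over `AND` over `OR`) are all assumptions made for the example. Only the operator set, dotted identifiers, and the bare-identifier rule (`ctx[id] === true`) come from the changelog.

```typescript
// Illustrative sketch only: names and grammar details are assumptions,
// not the contents of ConditionParser.ts.
type Ctx = Record<string, unknown>;
type Token = { kind: "id" | "lit" | "op"; text: string };
type Eval = (ctx: Ctx) => boolean;

// Stage 1: tokenize into operators, literals, and (dotted) identifiers.
function tokenize(src: string): Token[] {
  const re = /\s*(==|!=|[()]|'[^']*'|"[^"]*"|[A-Za-z_][\w.]*|\d+(?:\.\d+)?)/y;
  const tokens: Token[] = [];
  let pos = 0;
  while (src.slice(pos).trim() !== "") {
    re.lastIndex = pos;
    const m = re.exec(src);
    if (!m) throw new Error(`unexpected input at offset ${pos}`);
    const text = m[1];
    const kind = /^(==|!=|\(|\)|AND|OR|NOT)$/.test(text)
      ? "op"
      : /^(true|false|['"\d].*)$/.test(text)
        ? "lit"
        : "id";
    tokens.push({ kind, text });
    pos = re.lastIndex;
  }
  return tokens;
}

// Resolve a dotted identifier such as "mission.status" against the context.
function lookup(ctx: Ctx, path: string): unknown {
  let cur: unknown = ctx;
  for (const key of path.split(".")) {
    if (cur === null || typeof cur !== "object") return undefined;
    cur = (cur as Record<string, unknown>)[key];
  }
  return cur;
}

// Convert a literal token into a boolean, string, or number.
function literalValue(text: string): unknown {
  if (text === "true") return true;
  if (text === "false") return false;
  if (/^['"]/.test(text)) return text.slice(1, -1);
  return Number(text);
}

// Stages 2 and 3: recursive descent that returns an evaluator closure.
function parseCondition(src: string): Eval {
  const tokens = tokenize(src);
  let i = 0;
  const isOp = (text: string): boolean => {
    const tok = tokens[i];
    return tok !== undefined && tok.kind === "op" && tok.text === text;
  };

  function operand(): (ctx: Ctx) => unknown {
    const tok = tokens[i++];
    if (tok === undefined || tok.kind === "op") {
      throw new Error("expected identifier or literal");
    }
    const text = tok.text;
    return tok.kind === "id" ? (ctx) => lookup(ctx, text) : () => literalValue(text);
  }
  function primary(): Eval {
    if (isOp("(")) {
      i++;
      const inner = orExpr();
      if (!isOp(")")) throw new Error("expected ')'");
      i++;
      return inner;
    }
    const left = operand();
    if (isOp("==") || isOp("!=")) {
      const negate = tokens[i++].text === "!=";
      const right = operand();
      return (ctx) => (left(ctx) === right(ctx)) !== negate;
    }
    // Backward-compatible rule: a bare operand is true only when strictly `true`.
    return (ctx) => left(ctx) === true;
  }
  function notExpr(): Eval {
    if (!isOp("NOT")) return primary();
    i++;
    const inner = notExpr();
    return (ctx) => !inner(ctx);
  }
  function andExpr(): Eval {
    let left = notExpr();
    while (isOp("AND")) {
      i++;
      const l = left;
      const r = notExpr();
      left = (ctx) => l(ctx) && r(ctx);
    }
    return left;
  }
  function orExpr(): Eval {
    let left = andExpr();
    while (isOp("OR")) {
      i++;
      const l = left;
      const r = andExpr();
      left = (ctx) => l(ctx) || r(ctx);
    }
    return left;
  }

  const result = orExpr();
  if (i !== tokens.length) throw new Error("unexpected trailing tokens");
  return result;
}
```

Returning a closure from the parser means a step condition is parsed once and can then be evaluated against many contexts, for example `parseCondition("ready AND mission.status == 'done'")(ctx)`.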
41
- ### Epic 11 — Creator completeness (Sprint 4)
42
-
43
- The six v1.3 placeholders are now real, atomic, and inventory-first.
44
-
45
- | Story | Surface | Outcome |
46
- |---|---|---|
47
- | 11.1 | `SquadCreator.migrate(squadId, fromVersion, toVersion)` | Rewrites both the SQUAD.md frontmatter and the embedded `squad.yaml` version line; rejects `version_mismatch` up-front. |
48
- | 11.2 | `SquadCreator.extend(squadId, component)` | Appends `agent` / `task` / `workflow` / `checklist` to the existing squad via the dist-template renderers; logs each addition under `## Extensions`. |
49
- | 11.3 | `SquadCreator.publish(squadId)` | SHA-256 of SQUAD.md → `.openlife/published-assets.jsonl`; frontmatter status flips `draft` → `active`. Archived squads refuse publish. |
50
- | 11.4 | `SkillCreator.migrate(...)` | Atomic frontmatter version bump with semver-like validation. |
51
- | 11.5 | `SkillCreator.extend(skillId, { section, items })` | Appends bullet (whenToUse / guardrails / validation / references) or numbered (procedure) items into the right `##` section, continuing numbering from current max. |
52
- | 11.6 | `SkillCreator.publish(...)` | Same SHA-256 + ledger + status pattern as Squad. |
53
-
54
- ### Epic 12 — Quality + CI (Sprint 5)
55
-
56
- | Story | Surface | Outcome |
57
- |---|---|---|
58
- | 12.1 + 12.2 + 12.3 | `any` reduction in hot-path files | Brain.ts 15 → 0; OrchestrationLoop.ts 9 → 0; Gateway.ts middleware now typed via `Request` / `Response` / `NextFunction`; index.ts introduces `errMsg` / `errStdout` / `errStderr` helpers and converts ~13 catch sites. Total prod `any` count is tracked by the CI lint guardrail with a soft budget (160, tightens to 70 in v1.5). |
59
- | 12.4 | `src/test_performance_latency.ts` + `.artifacts/perf-baseline.json` | P50 / P95 / P99 for `IntentClassifier.classify`, `ToolsetGuard.isToolsetAllowed`, `ProfileManager.list`. CI fails if any P95 regresses > `PERF_REGRESSION_THRESHOLD_PCT` (default 30 %). See [performance-benchmarks.md](./performance-benchmarks.md). |
60
- | 12.5 | `.github/workflows/{build,test,lint}.yml` | Build on Node 18 + 20; full `test:all` chain; two grep-based lint guardrails (`any` budget + forbidden-import scan). |
61
-
62
- ---
63
-
64
- ## Sprint timeline + commits
65
-
66
- | Sprint | Stories | Commit |
67
- |---|---|---|
68
- | 1 | 8.1 / 8.2 / 8.3 / 10.3 | `5da03df` |
69
- | 2 | 9.1 / 9.2 / 9.3 / 9.4 / 9.5 | `3e0517f` |
70
- | 3 | 10.1 / 10.2 | `81d5cc0` |
71
- | 4 | 11.1 – 11.6 | `5fce533` |
72
- | 5 | 12.1 – 12.5 | `24bd422` |
73
- | 6 (Cap) | C.1 docs + final lock | _this commit_ |
74
-
75
- ---
76
-
77
- ## New tests added in v1.4
78
-
79
- | Test | Stories covered |
80
- |---|---|
81
- | `test_squad_skill_design_llm.ts` | 8.1 + 8.2 |
82
- | `test_workflow_condition_parser.ts` | 10.3 |
83
- | `test_security_download_and_scan.ts` | 9.1 + 9.2 |
84
- | `test_host_installers_gemini_codex.ts` | 9.3 + 9.4 + 9.5 |
85
- | `test_toolset_enforcement.ts` | 10.1 + 10.2 |
86
- | `test_creator_placeholders_completed.ts` | 11.1 – 11.6 |
87
- | `test_performance_latency.ts` | 12.4 |
88
-
89
- Existing test updates: `test_host_installer.ts` + `test_host_uninstaller.ts`
90
- had v1.3-era stub assertions that needed to be inverted to verify the real
91
- Story 9.3 + 9.4 behavior. `test_squad_skill_creator.ts` had the v1.3
92
- placeholder assertions, now replaced with real `squad_not_found` error-path
93
- checks against Story 11.x.
94
-
95
- ---
96
-
97
- ## Locked decisions
98
-
99
- 1. **Toolset enforcement is opt-in in v1.4.** `OPENLIFE_TOOLSET_ENFORCEMENT=on`
100
- gates every guard call. Default OFF for one milestone of soak; v1.5 flips
101
- the default to ON.
102
- 2. **Brain-driven `design()` has no extra flag.** If any provider key
103
- (`OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, `OPENROUTER_API_KEY`,
104
- `OLLAMA_URL`) is present, `designWithBrain()` calls Brain. Otherwise it
105
- falls back to the heuristic `design()`. Presence of the key IS the opt-in.
106
-
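Reviewer note on locked decision 2: the "presence of the key is the opt-in" rule amounts to a small environment check. The sketch below is hypothetical. The helper name `chooseDesignMode`, the injected `env` parameter, and the choice to treat blank values as absent are assumptions. Only the five variable names and the Brain-versus-heuristic fallback come from the changelog.

```typescript
// Provider variables named in the v1.4 locked decisions.
const PROVIDER_ENV_VARS = [
  "OPENAI_API_KEY",
  "ANTHROPIC_API_KEY",
  "GEMINI_API_KEY",
  "OPENROUTER_API_KEY",
  "OLLAMA_URL",
] as const;

type Env = Record<string, string | undefined>;

// True when at least one provider variable is set to a non-blank value.
// Treating whitespace-only values as absent is an assumption of this sketch.
function isAnyProviderAvailable(env: Env): boolean {
  return PROVIDER_ENV_VARS.some((name) => (env[name] ?? "").trim() !== "");
}

// Hypothetical helper: pick the design path without any extra feature flag.
function chooseDesignMode(env: Env): "brain" | "heuristic" {
  return isAnyProviderAvailable(env) ? "brain" : "heuristic";
}
```

Passing the environment in as an argument, rather than reading a global, keeps the check easy to exercise in tests.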
107
- ---
108
-
109
- ## Out of scope (deferred to v1.5)
110
-
111
- - Distributed multi-host scheduler — v1.4 stays single-host.
112
- - Real remote pack publish — v1.4 just appends to the local ledger.
113
- - Toolset enforcement default-on — v1.5 flips after soak.
114
- - LLM-driven post-mission **evaluation** — v1.4 does **consolidation** only.
115
- - Kernel-level filesystem sandboxing — v1.4 is at the library boundary;
116
- v1.5 will consider Node's `permission` API.
117
-
118
- ---
119
-
120
- ## How to use the new surfaces
121
-
122
- ```bash
123
- # Brain-driven design (provide a key)
124
- OPENAI_API_KEY=sk-... openlife create squad "a research squad for medical devices"
125
-
126
- # Toolset enforcement (opt-in)
127
- OPENLIFE_TOOLSET_ENFORCEMENT=on openlife task run "do x"
128
- # → fails with toolset_blocked:terminal if the active profile blocks terminal
129
-
130
- # MCP install with real download
131
- openlife mcp install github://allowed-source/some-pack
132
- # → SecurityDownloadGuard.downloadAndScan runs before the file lands
133
-
134
- # Host install for non-claude targets
135
- openlife system install --host gemini-cli --target /tmp/x
136
- openlife system install --host codex --target /tmp/y
137
- # → both populate <target>/.<host>/{agents,commands} + an MCP snippet
138
-
139
- # Perf benchmark (CI-fail at +30 % P95 regression by default)
140
- npm run test:performance-latency
141
- PERF_REFRESH_BASELINE=1 npm run test:performance-latency # lock new baseline
142
- ```
143
-
144
- ---
145
-
146
- ## Honest assessment vs the 10/10 target
147
-
148
- | Dimension | v1.3 | v1.4 actual | Notes |
149
- |---|---|---|---|
150
- | Architecture | 9.8 | 10.0 | Toolset enforcement + condition parser close the structural gap. |
151
- | Test infra | 9.7 | 10.0 | +7 new test suites, perf bench, CI gate. |
152
- | Documentation | 9.5 | 10.0 | This file + 3 more in `docs/` cover every Sprint. |
153
- | Code quality | 9.0 | 9.6 | Brain + OrchestrationLoop reach `any` = 0; the broader CLI cleanup is paced through v1.5 under a CI-tracked budget. |
154
- | Feature completeness | 9.7 | 10.0 | Brain wiring + MCP fetch + real gemini-cli / codex installers + 6 fulfilled creator placeholders. |
155
- | Asset catalog | 9.5 | 9.7 | Real publish pipeline (local ledger); real remote push waits for v1.5. |
156
- | Governance | 9.8 | 10.0 | Toolset enforcement makes governance executable. |
157
- | Distribution | 9.5 | 10.0 | CI YAML + 3 hosts no longer stubs. |
158
-
159
- **Weighted projection:** ~ 9.92 / 10 → A+ (rounds to 10).
@@ -1,106 +0,0 @@
1
- # OpenLife v1.5 — Changelog
2
-
3
- **Branch:** `feat/v1.5-evaluation`
4
- **Predecessor:** v1.4.0 "Path to 10/10"
5
- **Status:** All in-scope sprints complete; awaiting merge + tag.
6
-
7
- v1.5 closes every item explicitly deferred from v1.4 except the one
8
- that's calendar-gated (toolset enforcement default-ON flip, Story 13.5
9
- — waits for the soak window).
10
-
11
- ---
12
-
13
- ## What landed by epic
14
-
15
- ### Epic 13 — Evaluation + soak follow-through (Sprint 1)
16
-
17
- | Story | Commit | Surface |
18
- |---|---|---|
19
- | 13.0 | `5049155` | `docs/v1.5-roadmap.md` milestone charter |
20
- | 13.1 | `cea832d` | `OrchestrationLoop.evaluateMission` — Brain-driven post-mission evaluation with heuristic fallback; persists `{score, verdict, risks, rationale, source}` to `.openlife/evaluations/<taskId>.json` |
21
- | 13.2 | `bdca6df` | `openlife eval list` CLI with `--verdict` / `--min-score` / `--source` filters |
22
- | 13.3 | `8d14dab` | EnterpriseAgenticCore.ts: 12 anys → 0 (4 typed shapes); Gateway.ts ctx surface: 4 → 0 (`MinimalCtx` interface); CI lint budget 160 → 130 |
23
- | 13.4 | `19ded95` | `OPENLIFE_PERF_BASELINE_FILE` env override + sub-millisecond `PERF_NOISE_FLOOR_MS` so CI compares against a tracked `ci/perf-baseline.json` |
24
-
25
- ### Epic 14 — Real publish + advanced governance (Sprint 2)
26
-
27
- | Story | Commit | Surface |
28
- |---|---|---|
29
- | 14.2 | `d58f3ae` | `GovernanceScopeLedger` — SHA-chained append-only JSONL ledger of every governance decision; `openlife governance ledger show/verify`; PII-protected (only goalHash persisted) |
30
- | 14.3 | `d8d80e4` | `ConsequenceForecaster.forecastWithBrain` — Brain enrichment cached at `.openlife/forecasts/<sha256>_<risk>.json` with 24h TTL |
31
- | 14.1 | `340293c` | `RemotePublisher` — HTTPS PUT to `OPENLIFE_REMOTE_PUBLISH_URL` with sha-mismatch protection; `SquadCreator.publishWithRemote` and `SkillCreator.publishWithRemote` compose local seal + remote push |
32
-
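Reviewer note on Story 14.2: a SHA-chained, append-only JSONL ledger can be sketched as below. This is an illustration of the general technique, not the `GovernanceScopeLedger` implementation. The entry fields, the all-zero genesis hash, and the function names are assumptions. The only details taken from the changelog are the SHA chaining, the JSONL shape, and that just a hash of the goal is persisted.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry shape; the real ledger's fields are not shown in this diff.
interface LedgerEntry {
  seq: number;
  goalHash: string; // only a hash of the goal text is stored
  decision: string;
  prevHash: string;
  hash: string;
}

const GENESIS_HASH = "0".repeat(64); // assumed sentinel for the first entry

function sha256(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Hash every field except `hash` itself, in a fixed order.
function entryHash(e: Omit<LedgerEntry, "hash">): string {
  return sha256(JSON.stringify([e.seq, e.goalHash, e.decision, e.prevHash]));
}

// Append by linking the new entry to the hash of the previous one.
function appendEntry(
  ledger: readonly LedgerEntry[],
  goal: string,
  decision: string,
): LedgerEntry[] {
  const prevHash = ledger.length > 0 ? ledger[ledger.length - 1].hash : GENESIS_HASH;
  const body = { seq: ledger.length, goalHash: sha256(goal), decision, prevHash };
  return [...ledger, { ...body, hash: entryHash(body) }];
}

// Verification recomputes each hash and checks every back-link.
function verifyLedger(ledger: readonly LedgerEntry[]): boolean {
  let prev = GENESIS_HASH;
  return ledger.every((e, idx) => {
    const ok = e.seq === idx && e.prevHash === prev && e.hash === entryHash(e);
    prev = e.hash;
    return ok;
  });
}

// One JSON object per line, as in a JSONL file.
function toJsonl(ledger: readonly LedgerEntry[]): string {
  return ledger.map((e) => JSON.stringify(e)).join("\n") + "\n";
}
```

Because each entry's hash covers the previous hash, editing or dropping an earlier line invalidates every later entry, which is what makes the log tamper-evident rather than tamper-proof.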
33
- ### Epic 15 — Research-track (Sprint 3)
34
-
35
- | Story | Commit | Surface |
36
- |---|---|---|
37
- | 15.1 | `f3743a1` | `ProcessSandbox` wrapper for Node's `--permission` flag + `docs/sandboxing-research.md` decision doc (not wired in v1.5 — v1.6 migration plan documented) |
38
-
39
- ### Epic 16 — Quality + observability (Sprint 4)
40
-
41
- | Story | Commit | Surface |
42
- |---|---|---|
43
- | 16.1 | `8df1eb9` | index.ts: 19 anys → 9 (typed CatalogEntry / RouteShape / TelegramGetMe / MissionState); CI lint budget 130 → 115 (production count now 109) |
44
- | 16.2 | `8df1eb9` | `test_performance_latency.ts` expanded to 5 benchmarks (added `condition_parse_and_evaluate` + `workflow_parse`) |
45
-
46
- ### Epic 17 — Integration coverage (Sprint 5)
47
-
48
- | Story | Commit | Surface |
49
- |---|---|---|
50
- | 17.1 | `37a1093` | `test_v15_e2e_integration.ts` — 7-step end-to-end run touching SquadCreator, MissionStateStore, OrchestrationLoop, MissionEvaluationStore, RemotePublisher, GovernanceLayer, GovernanceScopeLedger (every Brain / network call stubbed via require.cache) |
51
-
52
- ---
53
-
54
- ## New environment variables
55
-
56
- | Variable | Default | Owner | Disables |
57
- |---|---|---|---|
58
- | `OPENLIFE_POST_MISSION_EVALUATION` | `on` | Story 13.1 | Set to `off` to skip the Brain/heuristic evaluation hook |
59
- | `OPENLIFE_GOVERNANCE_LEDGER` | `on` | Story 14.2 | Set to `off` to skip ledger appends |
60
- | `OPENLIFE_FORECAST_CACHE` | `on` | Story 14.3 | Set to `off` to bypass forecast cache reads + writes |
61
- | `OPENLIFE_FORECAST_CACHE_TTL_HOURS` | `24` | Story 14.3 | Override cache freshness window |
62
- | `OPENLIFE_REMOTE_PUBLISH_URL` | (unset) | Story 14.1 | Set to base URL to enable remote publish |
63
- | `OPENLIFE_REMOTE_PUBLISH_TOKEN` | (unset) | Story 14.1 | Optional Bearer token sent with each PUT |
64
- | `OPENLIFE_PERF_BASELINE_FILE` | `.artifacts/perf-baseline.json` | Story 13.4 | CI uses `ci/perf-baseline.json` |
65
- | `PERF_NOISE_FLOOR_MS` | `0.5` (local), `1.0` (CI) | Story 13.4 | Skip % gate when both P95s sit below this |
66
-
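Reviewer note on the perf gate: the P95 regression rule (`PERF_REGRESSION_THRESHOLD_PCT`, default 30 %) combined with the `PERF_NOISE_FLOOR_MS` skip can be sketched as follows. The function names and the nearest-rank percentile method are assumptions. The actual `test_performance_latency.ts` may compute percentiles differently.

```typescript
// Nearest-rank percentile over a sample of latencies in milliseconds.
// The percentile method is an assumption of this sketch.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical gate: flag a regression when P95 grows by more than
// thresholdPct, unless both values sit below the noise floor.
function p95Regressed(
  baselineP95: number,
  currentP95: number,
  thresholdPct = 30,
  noiseFloorMs = 0.5,
): boolean {
  if (baselineP95 < noiseFloorMs && currentP95 < noiseFloorMs) return false;
  const growthPct = ((currentP95 - baselineP95) / baselineP95) * 100;
  return growthPct > thresholdPct;
}
```

The noise-floor check matters for sub-millisecond benchmarks, where a jump from 0.1 ms to 0.4 ms is a large percentage change but mostly timer jitter.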
67
- ---
68
-
69
- ## New tests
70
-
71
- | Test | Stories covered |
72
- |---|---|
73
- | `test_post_mission_evaluation.ts` | 13.1 |
74
- | `test_governance_scope_ledger.ts` | 14.2 |
75
- | `test_consequence_forecast_brain.ts` | 14.3 |
76
- | `test_remote_publish.ts` | 14.1 |
77
- | `test_process_sandbox.ts` | 15.1 |
78
- | `test_v15_e2e_integration.ts` | 17.1 |
79
-
80
- CI `test:all` now runs 96 markers (up from 89 at v1.4.0).
81
-
82
- ---
83
-
84
- ## Locked decisions held over to v1.5+
85
-
86
- 1. **Toolset enforcement opt-in stays.** Story 13.5 (default-ON flip)
87
- waits for the soak window. Earliest target: after one tagged v1.4
88
- maintenance release. Lands on its own branch.
89
- 2. **Brain mode is still flag-free.** Provider key presence is the
90
- opt-in for `designWithBrain`, `evaluateMission`, and `forecastWithBrain`.
91
-
92
- ---
93
-
94
- ## Honest scorecard delta vs v1.4
95
-
96
- | Dimension | v1.4 | v1.5 |
97
- |---|---|---|
98
- | Test infra | 10.0 | 10.0 (+e2e integration) |
99
- | Documentation | 10.0 | 10.0 (4 new reference docs) |
100
- | Code quality | 9.6 | 9.8 (any 172 → 109) |
101
- | Feature completeness | 10.0 | 10.0 (publish + governance ledger + sandbox primitive) |
102
- | Governance | 10.0 | 10.0+ (now tamper-evident) |
103
-
104
- The improvements are honestly **incremental**, not transformative — that
105
- was the v1.5 thesis. v1.6 is where the soak-gated default-ON flip and
106
- the wider distributed-scheduler epic will move the score again.
@@ -1,121 +0,0 @@
1
- # OpenLife v1.5 — Roadmap
2
-
3
- **Branch:** `feat/v1.5-evaluation`
4
- **Status:** Sprint 1 in progress.
5
- **Predecessor:** v1.4.0 "Path to 10/10" (scorecard ~9.92/10 → A+).
6
-
7
- v1.5 picks up the items that were explicitly deferred from v1.4 plus a
8
- small set of consolidations the v1.4 audit revealed once the milestone
9
- closed. No new pillars again — v1.5 is about making v1.4's foundations
10
- land harder.
11
-
12
- ---
13
-
14
- ## The 5 deferred items from v1.4
15
-
16
- | # | Item | Owner story | Priority |
17
- |---|---|---|---|
18
- | 1 | Toolset enforcement default-ON (after soak window) | 13.5 | P1 |
19
- | 2 | Real remote pack publish (currently local ledger only) | 13.6 | P2 |
20
- | 3 | LLM-driven post-mission **evaluation** (consolidation was v1.4 only) | 13.1 ✅ | P0 |
21
- | 4 | Kernel-level fs sandboxing exploration (Node `permission` API) | 13.7 (research) | P3 |
22
- | 5 | Distributed multi-host scheduler | 14.x (separate epic) | P3 |
23
-
24
- ---
25
-
26
- ## Epic 13 — Evaluation + soak follow-through (Sprint 1)
27
-
28
- The first wave closes the highest-impact deferrals without changing the
29
- opt-in stance for enforcement.
30
-
31
- - **13.1 — Brain-driven post-mission evaluation** ✅ (commit `cea832d`)
32
- - `OrchestrationLoop.evaluateMission(state)` writes a structured
33
- `{ score, verdict, risks, rationale, source }` to
34
- `.openlife/evaluations/<taskId>.json`.
35
- - Brain when key present; heuristic `MissionEvaluationStore.judge()`
36
- fallback otherwise.
37
- - Disable: `OPENLIFE_POST_MISSION_EVALUATION=off`.
38
-
39
- - **13.2 — `openlife eval list` CLI surface** ✅ (commit `bdca6df`)
40
- - Lists `.openlife/evaluations/*.json`.
41
- - Filters: `--verdict`, `--min-score`, `--source` (brain | heuristic).
42
-
43
- - **13.3 — Tighten `any` budget**
44
- - Drop CI lint budget from 160 → 130; reduce
45
- `EnterpriseAgenticCore.ts` from 12 → ~3.
46
- - The locked v1.4 target was <70 by v1.5; we're stepping the budget
47
- down across two milestones to keep PRs reviewable.
48
-
49
- - **13.4 — Perf benchmark CI gate hardening**
50
- - Persist `.artifacts/perf-baseline.json` as a CI artifact across runs
51
- (currently re-seeds every run, so no real comparison happens in CI
52
- yet).
53
- - Wire the perf test into the lint workflow as a quality gate.
54
-
55
- - **13.5 — Toolset enforcement default-ON flip**
56
- - Change `ToolsetGuard.isToolsetAllowed` default from "OFF when env unset"
57
- to "ON when env unset". The opt-out becomes `OPENLIFE_TOOLSET_ENFORCEMENT=off`.
58
- - Land **after** at least one week of v1.4 soak — earliest target is
59
- one tagged maintenance release into v1.4.
60
-
61
- ---
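Reviewer note on Story 13.5: the default flip changes only what happens when `OPENLIFE_TOOLSET_ENFORCEMENT` is unset. The sketch below is hypothetical and does not reproduce `ToolsetGuard`. The function names, the `blocked` set standing in for the active profile, and the `defaultOn` parameter are assumptions. The flag name, the `on` / `off` values, and the `toolset_blocked:<category>` error code come from the docs above.

```typescript
type Env = Record<string, string | undefined>;
type ToolsetCategory = "terminal" | "delegation";

// Explicit "on" / "off" always wins; the default applies only when unset.
function enforcementEnabled(env: Env, defaultOn: boolean): boolean {
  const raw = (env.OPENLIFE_TOOLSET_ENFORCEMENT ?? "").trim().toLowerCase();
  if (raw === "on") return true;
  if (raw === "off") return false;
  return defaultOn; // false models the v1.4 stance, true models the post-flip stance
}

// Hypothetical guard: `blocked` stands in for whatever the active profile forbids.
function assertToolsetAllowedSketch(
  category: ToolsetCategory,
  blocked: ReadonlySet<ToolsetCategory>,
  env: Env,
  defaultOn = false,
): void {
  if (!enforcementEnabled(env, defaultOn)) return;
  if (blocked.has(category)) {
    throw new Error(`toolset_blocked:${category}`);
  }
}
```

Under this model an existing `OPENLIFE_TOOLSET_ENFORCEMENT=on` setup behaves identically before and after the flip. Only installations that never set the variable see a change.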
62
-
63
- ## Epic 14 — Real publish + advanced governance (Sprint 2, P2)
64
-
65
- - **14.1 — Remote pack publish.** `SquadCreator.publish()` and
66
- `SkillCreator.publish()` currently append to `.openlife/published-assets.jsonl`.
67
- v1.5 wires `--remote <url>` to push the sealed asset bundle to a real
68
- registry (npm-compatible or HTTP PUT, TBD during planning).
69
-
70
- - **14.2 — Governance scope ledger.** Every `GovernanceLayer.evaluate()`
71
- result currently writes a single consent record. v1.5 promotes this
72
- to a tamper-evident JSONL ledger with SHA-chained entries.
73
-
74
- - **14.3 — Brain → consequences forecast.** Wire `ConsequenceForecaster`
75
- to call Brain for the highest-risk decisions; cache forecasts under
76
- `.openlife/forecasts/`.
77
-
78
- ---
79
-
80
- ## Epic 15 — Research-track items (P3, may slip to v1.6)
81
-
82
- - **15.1 — Node `permission` API exploration.** Spike to evaluate
83
- kernel-level fs sandboxing as a complement to toolset enforcement.
84
- Output: a `docs/sandboxing-research.md` decision doc, no production
85
- code in v1.5.
86
-
87
- - **15.2 — Distributed multi-host scheduler.** Standalone epic; runs in
88
- v1.6 unless capacity allows starting in v1.5.
89
-
90
- ---
91
-
92
- ## Locked decisions (held over from v1.4)
93
-
94
- 1. **Toolset enforcement opt-in remains the v1.4 stance.** Story 13.5
95
- flips the default ON only after the soak window. Existing
96
- `OPENLIFE_TOOLSET_ENFORCEMENT=on` keeps working without change.
97
-
98
- 2. **Brain-driven `design()` continues to be flag-free.** Presence of
99
- any provider key IS the opt-in. Same applies to Story 13.1's
100
- `evaluateMission`.
101
-
102
- ---
103
-
104
- ## Definition of Done
105
-
106
- | Check | Method |
107
- |---|---|
108
- | `npm run build` clean | After each story |
109
- | `npm run test:all` green | After each story; CI on every push |
110
- | Perf regression check | `test_performance_latency.ts` against `.artifacts/perf-baseline.json` (≤ +30% P95 by default) |
111
- | `any` budget | CI lint job — 130 by end of v1.5 (from 160 in v1.4) |
112
- | Toolset enforcement soak | At least 1 week + 1 maintenance release before 13.5 lands |
113
-
114
- ---
115
-
116
- ## Commit / push policy
117
-
118
- Same as v1.4 — one atomic commit per story, branch
119
- `feat/v1.5-evaluation`, GOOODZ identity. Push requires explicit Rafa
120
- approval (the v1.4 policy stayed in effect; the autonomous push for
121
- this branch was granted on 2026-05-13).
@@ -1,67 +0,0 @@
1
- # OpenLife v1.6 — Changelog
2
-
3
- **Branch:** `feat/v1.6-sandbox-rollout`
4
- **Predecessor:** v1.5.0 (tagged 2026-05-13)
5
-
6
- v1.6 is the **operational consolidation** milestone — promotes v1.5's
7
- research-track `ProcessSandbox` into real production sites and continues
8
- the `any`-budget tightening trajectory.
9
-
10
- ## What landed by epic
11
-
12
- | Sprint | Epic | Stories | Commits |
13
- |---|---|---|---|
14
- | 1 | 18 — Sandbox rollout | 18.1 / 18.2 | `8834aaa`, `96248f2` |
15
- | 2 | 19 — Quality | 19.1 | `3d6dc32` |
16
- | 3 (cap) | — | C.1 (this doc) | _this commit_ |
17
-
18
- ### Epic 18 — Sandbox rollout (Sprint 1)
19
-
20
- | Story | Commit | Surface |
21
- |---|---|---|
22
- | 18.1 | `8834aaa` | `SystemDoctor.checkProcessSandbox` — doctor now spawns a tiny `node -e` via the wrapper and reports Node major version + enforcement state + applied-flag count. Low-risk production wire-in. |
23
- | 18.2 | `96248f2` | `TaskExecutor.runShellCommand` opt-in routing via `OPENLIFE_PROCESS_SANDBOX=on`. Default OFF (same opt-in stance v1.4 took for toolset enforcement). For `/bin/bash` this is a pass-through wrapper today; v1.7 will route node-script steps through it with real `--permission` scoping. |
24
-
25
- ### Epic 19 — Quality (Sprint 2)
26
-
27
- | Story | Commit | Surface |
28
- |---|---|---|
29
- | 19.1 | `3d6dc32` | Production `any` count 109 → 83 across Gatekeeper (9→0), TestHarness (8→0), VoiceManager (5→0), admin_panel_server (4→0). CI lint budget tightened 115 → 90. |
30
-
31
- ## Calendar-gated deferrals (still pending)
32
-
33
- - **Story 13.5 — Toolset enforcement default-ON flip.** Requires
34
- ≥1 week of v1.4 soak + ≥1 tagged v1.4 maintenance release. v1.4.0
35
- was tagged 2026-05-13; the earliest land date is 2026-05-20 after a
36
- v1.4.1 cut. Will ship on its own dedicated branch when ready.
37
-
38
- ## Locked decisions
39
-
40
- 1. **`OPENLIFE_PROCESS_SANDBOX` stays opt-in in v1.6.** v1.7 will
41
- reconsider the default after sandbox-specific soak.
42
- 2. **No new env vars introduced in this milestone.** Story 18.2's
43
- `OPENLIFE_PROCESS_SANDBOX` was promoted from v1.5 research-track
44
- into production wiring; the variable name was reserved in
45
- `docs/sandboxing-research.md`.
46
-
47
- ## Tests added
48
-
49
- | Test | Stories covered |
50
- |---|---|
51
- | `test_doctor_sandbox_check.ts` | 18.1 |
52
- | `test_task_executor_sandbox_optin.ts` | 18.2 |
53
-
54
- `test:all` now runs 101 markers (up from 99 at v1.5.0).
55
-
56
- ## Honest scorecard delta vs v1.5
57
-
58
- | Dimension | v1.5 | v1.6 |
59
- |---|---|---|
60
- | Code quality | 9.8 | 9.9 (any 172 → 83 over four milestones) |
61
- | Test infra | 10.0 | 10.0 (+2 integration probes) |
62
- | Documentation | 10.0 | 10.0 (changelog + roadmap) |
63
- | Feature completeness | 10.0 | 10.0 (sandbox now wired) |
64
-
65
- The change is incremental — that was the v1.6 thesis. Story 13.5
66
- remains the one big operational lever that's calendar-gated; it will
67
- move the score again when it lands.
@@ -1,89 +0,0 @@
1
- # OpenLife v1.6 — Roadmap
2
-
3
- **Branch:** `feat/v1.6-sandbox-rollout`
4
- **Predecessor:** v1.5.0 (tagged 2026-05-13)
5
-
6
- v1.6 is the **operational consolidation** milestone — promoting v1.5's
7
- research outputs into real production wiring and closing the one
8
- calendar-gated v1.4 deferral.
9
-
10
- ## The 5 deferred items left from v1.5
11
-
12
- | # | Item | Owner story | Calendar gate? |
13
- |---|---|---|---|
14
- | 1 | Wire `ProcessSandbox` into a non-critical site (doctor) | 18.1 | No — ready now |
15
- | 2 | Opt-in `OPENLIFE_PROCESS_SANDBOX=on` for `TaskExecutor.runShellCommand` | 18.2 | No — ready now |
16
- | 3 | Toolset enforcement default-ON flip | 13.5 | **YES** — 1 week of v1.4 soak + 1 maintenance release |
17
- | 4 | Continue `any` budget tightening toward v1.4-plan target of 70 | 19.1 | No — ready now |
18
- | 5 | Distributed multi-host scheduler | 20.x | Own epic; may slip to v1.7 |
19
-
20
- ## Epic breakdown
21
-
22
- ### Epic 18 — Sandbox rollout (Sprint 1)
23
-
24
- - **18.1 — Wire `ProcessSandbox` into `doctor` script execution.** The
25
- doctor command runs a handful of node-based health probes. Routing
26
- those through `ProcessSandbox` validates the wrapper in a low-risk
27
- production site and produces real telemetry about Node-permission
28
- behaviour on operator machines.
29
-
30
- - **18.2 — Opt-in shell-command sandboxing.** New env flag
31
- `OPENLIFE_PROCESS_SANDBOX=on` routes `TaskExecutor.runShellCommand`
32
- through `ProcessSandbox` with a default-allow scope of `cwd + .artifacts`.
33
- Default OFF in v1.6 — same opt-in stance as toolset enforcement had in v1.4.
34
-
35
- ### Epic 13.5 (calendar-gated) — Toolset enforcement default flip (Sprint 2)
36
-
37
- - **13.5 — Flip default of `OPENLIFE_TOOLSET_ENFORCEMENT` from OFF to ON.**
38
- Lands on its own atomic commit so any post-flip regression can be
39
- bisected cleanly. The opt-OUT path becomes `OPENLIFE_TOOLSET_ENFORCEMENT=off`.
40
- Pre-flip check: at least one tagged maintenance release into v1.4
41
- (target: v1.4.1 if any post-release issues surface) AND ≥1 calendar
42
- week of soak.
43
-
44
- ### Epic 19 — Quality (Sprint 3)
45
-
46
- - **19.1 — `any` budget toward <90.** v1.5 landed at 109. Remaining
47
- concentrations live in CLI option handlers and a handful of
48
- provider/route code paths. Target: under 90 in prod code; CI lint
49
- budget steps to 95.
50
-
51
- ### Epic 20 — Multi-host scheduler (Sprint 4, P3, may slip to v1.7)
52
-
53
- - **20.x — Distributed scheduler.** Big enough to warrant its own
54
- roadmap doc. v1.6 may only ship the planning artifact; the
55
- implementation lives in v1.7.
56
-
57
- ### Cap (Sprint 5)
58
-
59
- - **C.1 docs/v1.6-changelog.md** + 1-2 reference docs for the new
60
- surfaces.
61
- - **C.2 final `test:all`** + perf baseline lock.
62
- - **C.3 PR ready + merge + tag v1.6.0** (toolset flip is the most
63
- operationally significant change of v1.6, so the tag means "default
64
- enforcement on").
65
-
66
- ## Locked decisions
67
-
68
- 1. **`OPENLIFE_PROCESS_SANDBOX` defaults OFF in v1.6.** Same opt-in
69
- stance toolset enforcement had in v1.4. Will flip to ON in v1.7 only
70
- after sandbox-specific soak.
71
- 2. **Story 13.5 commits separately on this branch** with its own
72
- atomic message so the toolset-default-flip can be reverted without
73
- losing the sandbox-rollout work.
74
-
75
- ## Definition of Done
76
-
77
- | Check | Method |
78
- |---|---|
79
- | `npm run build` clean | After each story |
80
- | `npm run test:all` green | After each story; CI on every push |
81
- | Perf regression check | `test_performance_latency.ts` vs `ci/perf-baseline.json` (≤+30% P95) |
82
- | `any` budget | CI lint job — 95 by end of v1.6 (from 115 in v1.5) |
83
- | Sandbox enforcement soak | v1.7 only flips after ≥1 v1.6 maintenance release |
84
- | Toolset enforcement default-ON | v1.4.1 tagged + ≥7 days since v1.4.0 |
85
-
86
- ## Commit / push policy
87
-
88
- Same as v1.4 / v1.5. One atomic commit per story; pushes authorized
89
- for this branch.