@openlife/cli 1.7.4 → 1.7.5

Files changed (66)
  1. package/CHANGELOG.md +186 -0
  2. package/CODE_OF_CONDUCT.md +31 -0
  3. package/CONTRIBUTING.md +133 -0
  4. package/README.md +25 -9
  5. package/package.json +10 -2
  6. package/docs/CHANGELOG_FEATURE_ROLLOUT_DESIGNMD.md +0 -43
  7. package/docs/EXTERNAL_SOURCES_AND_SECURITY_GUARD.md +0 -33
  8. package/docs/OPENLIFE_AUDIT_2026-05-06.md +0 -170
  9. package/docs/OPENLIFE_CONSOLIDATED_PLAN_2026-05-06.md +0 -299
  10. package/docs/OPENLIFE_DUAL_MODE_IMPLEMENTATION_PLAN.md +0 -205
  11. package/docs/OPENLIFE_EVOLUTION_SURFACE_2026-05-07.md +0 -53
  12. package/docs/OPENLIFE_SKILLS_IMPORT_2026-05-07.json +0 -223
  13. package/docs/OPENLIFE_SQUADS_IMPORT_2026-05-07.json +0 -184
  14. package/docs/PAPERCLIP_OPENLIFE_INVESTIGATION.md +0 -85
  15. package/docs/RELEASE_ORGANIZATION_PLAN.md +0 -164
  16. package/docs/audit/CLI-EXECUTION-RESULTS.md +0 -113
  17. package/docs/audit/CLI-MATRIX.md +0 -556
  18. package/docs/audit/DOC-PARITY-GAPS.md +0 -351
  19. package/docs/audit/ORCHESTRATOR-MATRIX.md +0 -136
  20. package/docs/audit/TEST-COVERAGE-GAPS.md +0 -334
  21. package/docs/audit/integrations/SKIPPED.md +0 -101
  22. package/docs/autonomous-install.md +0 -79
  23. package/docs/capability-genesis.md +0 -137
  24. package/docs/capability-pack-schema.md +0 -157
  25. package/docs/commands.md +0 -82
  26. package/docs/deep-research-capability.md +0 -114
  27. package/docs/development/typescript-conventions.md +0 -95
  28. package/docs/host-installers.md +0 -68
  29. package/docs/install/aiobuilder.md +0 -70
  30. package/docs/install/claude-code.md +0 -83
  31. package/docs/install/codex.md +0 -64
  32. package/docs/install/gemini-cli.md +0 -64
  33. package/docs/install/runtime-profiles.md +0 -83
  34. package/docs/openlife-agent-os-blueprint.md +0 -114
  35. package/docs/openlife-install-backlog.md +0 -115
  36. package/docs/openlife-install-spec.md +0 -306
  37. package/docs/operations/CLOUD_CUTOVER_AUDIT.md +0 -37
  38. package/docs/operations/PHASE_PROGRESS_CONTINUATION.md +0 -24
  39. package/docs/performance-benchmarks.md +0 -83
  40. package/docs/planning/v1.3-capability-genesis.md +0 -157
  41. package/docs/plans/2026-05-05-admin-interface-professional-dark-premium-plan.md +0 -84
  42. package/docs/plans/2026-05-05-openlife-autonomous-domain-marketplace-masterplan.md +0 -122
  43. package/docs/roadmap/OPENLIFE_MASTER_PLAN_CLOUD_V3.md +0 -97
  44. package/docs/sandboxing-research.md +0 -117
  45. package/docs/stories/epic-feature-audit/1.1.story.md +0 -84
  46. package/docs/stories/epic-feature-audit/1.2.story.md +0 -102
  47. package/docs/stories/epic-feature-audit/1.3.story.md +0 -93
  48. package/docs/stories/epic-feature-audit/1.5.story.md +0 -121
  49. package/docs/stories/epic-feature-audit/1.6.story.md +0 -80
  50. package/docs/stories/epic-feature-completeness/2.1.story.md +0 -70
  51. package/docs/stories/epic-feature-completeness/2.2.story.md +0 -49
  52. package/docs/stories/epic-feature-completeness/2.3.story.md +0 -74
  53. package/docs/stories/epic-feature-completeness/2.4.story.md +0 -71
  54. package/docs/stories/epic-feature-completeness/3.1.story.md +0 -56
  55. package/docs/stories/epic-feature-completeness/3.2.story.md +0 -80
  56. package/docs/stories/epic-feature-completeness/3.3.story.md +0 -68
  57. package/docs/stories/epic-feature-completeness/3.4.story.md +0 -71
  58. package/docs/stories/epic-feature-completeness/3.5.story.md +0 -72
  59. package/docs/stories/epic-feature-completeness/3.6.story.md +0 -69
  60. package/docs/stories/epic-feature-completeness/3.7.story.md +0 -68
  61. package/docs/stories/epic-feature-completeness/3.8.story.md +0 -57
  62. package/docs/v1.4-changelog.md +0 -159
  63. package/docs/v1.5-changelog.md +0 -106
  64. package/docs/v1.5-roadmap.md +0 -121
  65. package/docs/v1.6-changelog.md +0 -67
  66. package/docs/v1.6-roadmap.md +0 -89
@@ -1,121 +0,0 @@
1
- # Story 1.5 — [BUG] /api/v1/trigger requires auth or documented topology
2
-
3
- **StoryId:** `1.5`
4
- **Epic:** `epic-feature-audit`
5
- **Status:** InReview
6
- **Severity:** P2
7
- **Discovered in phase:** 4 (audit run `20260507T224949Z`)
8
- **Cluster:** security-perimeter
9
-
10
- ## Description
11
-
12
- The Express webhook endpoint `POST /api/v1/trigger` accepts arbitrary JSON bodies without authentication and queues them as tasks. Phase 4 confirmed:
13
-
14
- ```
15
- POST /api/v1/trigger (no auth) → 200 {"status":"success","message":"Task enviada ao Córtex"}
16
- ```
17
-
18
- For local-only deployment (the daemon binds to `0.0.0.0:3000` by default but is typically firewalled), this is harmless. For any internet-exposed deployment (Heroku, Railway, EC2 with public IP, etc.), this is a security gap:
19
-
20
- - Anyone can submit arbitrary text intents → cost burn (LLM credits)
21
- - Crafted intents could attempt prompt-injection or governance bypass
22
- - No audit trail tying triggers to a sender identity
23
-
24
- The admin endpoints (`/api/v1/admin/*`) are correctly protected with Basic auth (`audit-user:audit-pass` test confirmed 401 without auth, 200 with). Only `/trigger` is unprotected.
25
-
26
- ## Reproduce
27
-
28
- ```bash
29
- # Boot daemon (audit creds)
30
- PORT=3001 OPENLIFE_ADMIN_USER=audit-user OPENLIFE_ADMIN_PASS=audit-pass \
31
- nohup node dist/index.js start --daemon > /tmp/d.log 2>&1 &
32
- sleep 5
33
-
34
- # Hit /trigger with no auth
35
- curl -s -o /dev/null -w "%{http_code}\n" -X POST http://127.0.0.1:3001/api/v1/trigger \
36
- -H "Content-Type: application/json" -d '{"text":"audit smoke ping"}'
37
- # 200 (expected: 401 if auth required)
38
-
39
- # Compare admin endpoint
40
- curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:3001/api/v1/admin/teams
41
- # 401 (correct)
42
-
43
- kill $(jobs -p)
44
- ```
45
-
46
- Evidence: `.audit-runs/20260507T224949Z/phase-4/express-trigger-mock.json`, `.audit-runs/20260507T224949Z/phase-4/express-teams-noauth.json`
47
-
48
- ## Root-cause hypothesis
49
-
50
- `Gateway.ts` registers `/api/v1/trigger` as a public webhook by design (it's meant to be hit by external integrations like Telegram webhooks, Zapier, etc.). The original assumption was likely that a reverse proxy or firewall would handle authentication. But:
51
- - No README/INSTALL doc states this assumption.
52
- - The default binding is `0.0.0.0:3000`, not `127.0.0.1`, so the endpoint is reachable from external interfaces by default.
53
- - Heroku/Railway deploys run with `0.0.0.0` and public IP, so this is internet-exposed.
54
-
55
- ## Acceptance Criteria
56
-
57
- **Decision: Option B (Basic Auth)** — consistent with existing `/api/v1/admin/*` auth pattern. Minimal cognitive overhead for operators who already configure admin creds. HMAC (Option A) is more correct for true webhook semantics but adds signature-generation burden on every caller; can be added later as an additional layer if needed.
58
-
59
- Choose ONE approach (consult security/ops team):
60
-
61
- ### Option A — Add HMAC signature verification (preferred for webhook semantics)
62
-
63
- - [ ] Add `OPENLIFE_TRIGGER_HMAC_SECRET` env var.
64
- - [ ] When set, verify `X-OpenLife-Signature` header on `/trigger` POST. Reject 401 if missing/invalid.
65
- - [ ] When unset, log a startup WARNING that `/trigger` is unauthenticated.
66
- - [ ] Add `test_trigger_hmac.ts` that boots daemon, sends valid+invalid signatures, asserts behavior.
67
-
68
- ### Option B — Add Basic auth (consistent with admin) ✓ CHOSEN
69
-
70
- - [x] Add `OPENLIFE_TRIGGER_USER`/`OPENLIFE_TRIGGER_PASS` env vars (separate from admin).
71
- - [x] When both set, require Basic auth on `/trigger`. Returns 401 (missing/malformed Authorization) or 403 (wrong creds). Successful auth proceeds to existing webhook pipeline.
72
- - [x] When unset, log startup WARNING via `console.warn('[GATEWAY] WARNING: /api/v1/trigger sem autenticação...')`.
73
-
74
- ### Option C — Bind localhost-only by default + document topology
75
-
76
- - [ ] Change Express `app.listen(port, '127.0.0.1', ...)` as default.
77
- - [ ] Add `OPENLIFE_BIND_HOST` env var (`0.0.0.0` for explicit public deploy).
78
- - [ ] Document in `INSTALL.md` that internet-exposed deployments MUST be behind a reverse proxy adding auth.
79
-
80
- For all options:
81
-
82
- - [ ] Update `Procfile` and Heroku/Railway docs as needed. *(Deferred — docs update can ship in a follow-up; behavior is gated behind opt-in env vars so it's non-breaking for existing deploys.)*
83
- [x] Boot the daemon and re-run the Phase 4 probes; `/trigger` returns 401 without an Authorization header and 403 with wrong credentials (confirmed via `test_trigger_basic_auth.ts`).
84
- - [x] All 8 sanctioned tests still pass — full `test:all` (55 tests) green.
85
-
86
- ## Dev Notes
87
-
88
- Auth is **opt-in** via env vars: setting both `OPENLIFE_TRIGGER_USER` and `OPENLIFE_TRIGGER_PASS` activates the middleware. Leaving them unset (the default) makes the middleware a no-op and logs a startup WARNING. This preserves backwards compatibility for existing local-only deploys while making the gap obvious in logs for ops.
89
- - Used a **separate** env-var pair (`OPENLIFE_TRIGGER_*`) from admin (`OPENLIFE_ADMIN_*`) so webhook callers and human operators can have independent credentials. Sharing one pair would force a webhook integration to also be able to access the admin surface.
90
- - `triggerAuth` middleware mirrors `adminAuth` structure but uses `realm="OpenLife Trigger"` so HTTP clients can distinguish the two challenges.
91
- Test boots the Gateway on port 3098 (no real Telegram token) and exercises three cases: (1) no Auth header → 401, (2) wrong password → 403, (3) valid creds → 200 (or 500 when no LLM is available in CI). A separate auth-disabled case asserts the middleware is a no-op.
92
-
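The decision logic described in these notes can be sketched without Express. This is a minimal, hypothetical illustration, not the code in `Gateway.ts`: the function names `checkTriggerAuth` and `decodeBase64` and the `AuthResult` type are assumptions made for the example. Only the opt-in behavior, the `realm="OpenLife Trigger"` challenge, and the 401/403 split come from the story.

```typescript
// Hypothetical, framework-free sketch of the triggerAuth decision logic.
// Names and types here are illustrative; only the 401/403 split, the realm
// string and the opt-in rule are taken from the story text.

type AuthResult =
  | { status: 200 }
  | { status: 401; challenge: string }
  | { status: 403 };

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Minimal base64 decoder (single-byte characters only) so the sketch has
// no runtime dependencies. Returns null on characters outside the alphabet.
function decodeBase64(input: string): string | null {
  const clean = input.replace(/=+$/, "");
  let acc = 0;
  let bits = 0;
  let out = "";
  for (let i = 0; i < clean.length; i++) {
    const v = B64.indexOf(clean.charAt(i));
    if (v < 0) return null;
    acc = ((acc << 6) | v) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out += String.fromCharCode((acc >> bits) & 0xff);
    }
  }
  return out;
}

function checkTriggerAuth(
  authorization: string | undefined,
  expectedUser: string | undefined,
  expectedPass: string | undefined,
): AuthResult {
  // Opt-in: if either credential is unset, the check is a no-op.
  if (!expectedUser || !expectedPass) return { status: 200 };

  const challenge = 'Basic realm="OpenLife Trigger"';
  const match = /^Basic\s+(\S+)$/i.exec(authorization ?? "");
  if (!match) return { status: 401, challenge }; // missing or malformed header

  const decoded = decodeBase64(match[1] ?? "");
  const sep = decoded === null ? -1 : decoded.indexOf(":");
  if (decoded === null || sep < 0) return { status: 401, challenge };

  const user = decoded.slice(0, sep);
  const pass = decoded.slice(sep + 1);
  // Well-formed header with wrong credentials maps to 403, as in the story.
  return user === expectedUser && pass === expectedPass
    ? { status: 200 }
    : { status: 403 };
}
```

In an Express middleware the result would be mapped onto the response, with the `challenge` string sent as the `WWW-Authenticate` header on 401. A production version would likely compare credentials in constant time; the plain `===` here is a simplification.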
93
- ## File List
94
-
95
- - `src/orchestrator/Gateway.ts` — MODIFIED (added `triggerAuth` middleware, applied to `POST /api/v1/trigger`, startup warning when unset)
96
- - `src/test_trigger_basic_auth.ts` — NEW
97
- - `package.json` — MODIFIED (added `test:trigger-basic-auth`, appended to `test:all`)
98
-
99
- ## Change Log
100
-
101
- - 2026-05-10 — @dev (Charlie) — Chose Option B (Basic Auth) for consistency with admin endpoints. Auth gated behind opt-in env pair; warning when disabled. Test covers 401/403/200/auth-disabled paths. Status: Ready → InReview.
102
-
103
- ## IDS check
104
-
105
- **Decision:** ADAPT (extending existing endpoint behavior, not creating a new endpoint).
106
-
107
- - `src/orchestrator/Gateway.ts` → ADAPT (add auth middleware to `/trigger`)
108
- - `INSTALL.md`, `docs/autonomous-install.md` → ADAPT (document deploy topology)
109
- - `test_trigger_*.ts` → CREATE
110
-
111
- ## Files to touch
112
-
113
- - `src/orchestrator/Gateway.ts` (auth middleware on `/trigger`)
114
- - `.env.example` (new env vars)
115
- - `INSTALL.md` (deploy topology docs)
116
- - `src/test_trigger_<chosen-option>.ts` — new
117
- - `package.json` — add test script
118
-
119
- ## Estimate
120
-
121
- Effort: S (4-6 hours). Mostly straightforward; testing auth flows takes the most time.
@@ -1,80 +0,0 @@
1
- # Story 1.6 — [POLISH] Surface HTTP status + response body in Brain LLM error
2
-
3
- **StoryId:** `1.6`
4
- **Epic:** `epic-feature-audit`
5
- **Status:** InReview
6
- **Severity:** P3
7
- **Discovered in phase:** 5 (audit run `20260507T224949Z`)
8
- **Cluster:** observability
9
-
10
- ## Description
11
-
12
- When a fallback provider call fails in `Brain.ts`, the user-facing error message is `Connection error.` — no HTTP status code, no response body, no key-prefix hint. This made diagnosing Story 1.3 (misconfigured `OPENAI_API_KEY`) unnecessarily slow.
13
-
14
- Specifically the audit observed:
15
-
16
- ```
17
- [BRAIN ERROR - openai-api/gpt-5.4-mini-2026-03-17] Connection error.
18
- ```
19
-
20
- For a wrong-key situation, the OpenAI client typically returns 401 with body `{"error": {"message": "Incorrect API key provided", ...}}`. None of that surfaces.
21
-
22
- ## Reproduce
23
-
24
- ```bash
25
- # Set a fake key
26
- OPENAI_API_KEY=sk-fake-not-a-real-key node dist/index.js ask "test"
27
- # Output mentions "Connection error." but doesn't say:
28
- # - HTTP status code
29
- # - response body excerpt
30
- # - whether the key prefix looked correct
31
- ```
32
-
33
- Evidence: `.audit-runs/20260507T224949Z/phase-5/drill6.err`
34
-
35
- ## Root-cause hypothesis
36
-
37
- `Brain.ts` `thinkWithOpenAIAPI()` (and similar provider methods) probably catch errors with `try { ... } catch (e) { return e.message }` or similar. The OpenAI SDK's `APIError` class exposes `status`, `code`, `headers`, `error.message`, but these aren't being read.
38
-
39
- ## Acceptance Criteria
40
-
41
- - [x] In `Brain.thinkWithOpenAIAPI()`: catch errors and surface a structured message via `formatProviderError(provider, model, error, {keyEnvVar, expectedKeyPrefix})`.
42
- - [x] Apply the same pattern to `thinkWithAnthropic` (`sk-ant-`), `thinkWithGeminiAPI`, `thinkWithOllama`, `thinkWithOpenRouter`. All providers route through the same helper.
43
- - [x] CLI providers (`thinkWithOpenAICLI`, `thinkWithGeminiCLI`) surface `stderr` from the spawned process via `error.stderr` field.
44
- - [x] The user-facing `CRITICAL ERROR` summary inherits the new format because failures now carry structured messages from `formatProviderError`.
45
- - [x] Add `test_brain_error_diagnostics.ts` — tests provider tag, HTTP status, API message, stderr passthrough, key-prefix warning (and absence-of-warning when prefix is correct), and `cause` preservation.
46
- - [x] All 8 sanctioned tests still pass — `test:all` (53 tests now) green.
47
-
48
- ## Dev Notes
49
-
50
- - Introduced public helper `formatProviderError(provider, model, error, opts?)` on `Brain` so the test seam doesn't require mocking an entire provider — assertions can call the helper directly with synthetic error shapes.
51
- - Key-prefix warning fires only when `expectedKeyPrefix` is configured AND the actual env var value doesn't match. This avoids noisy false positives for providers without a stable prefix pattern (e.g., Ollama, custom OpenRouter setups).
52
- - `formatProviderError` preserves the original error via `(wrapped as any).cause` and adds `(wrapped as any).providerStatus` for downstream code that wants the status without re-parsing the message.
53
- - Body text from non-OK `fetch` responses (Ollama, OpenRouter) is now read (truncated to 200 chars) before throwing — previously only `statusText` was surfaced.
54
-
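A rough sketch of what such a helper could look like follows. The signature mirrors the one named in the acceptance criteria, and the `[BRAIN ERROR - provider/model]` tag comes from the observed log line. Everything else is an assumption for illustration: the standalone (non-method) form, the `ProviderErrorLike` shape, the injectable `env` option, the exact message layout, and the choice to skip the key-prefix warning when the variable is unset.

```typescript
// Hypothetical standalone version of the helper; in the story it lives on Brain.
interface ProviderErrorOptions {
  keyEnvVar?: string;
  expectedKeyPrefix?: string;
  // Assumption: env is injectable so the sketch can be exercised without process.env.
  env?: Record<string, string | undefined>;
}

// Assumed union of the fields the story mentions for SDK, fetch and CLI failures.
interface ProviderErrorLike {
  message?: string;
  status?: number; // HTTP status exposed by SDK error classes
  stderr?: string; // output of a spawned CLI provider
  error?: { message?: string }; // parsed API error body
}

function formatProviderError(
  provider: string,
  model: string,
  error: ProviderErrorLike,
  opts: ProviderErrorOptions = {},
): Error {
  const details: string[] = [];
  if (error.status !== undefined) details.push(`HTTP ${error.status}`);
  details.push(error.error?.message ?? error.message ?? "unknown error");
  if (error.stderr) details.push(`stderr: ${error.stderr}`);

  // Warn only when a prefix is configured AND the actual key does not match it.
  const { keyEnvVar, expectedKeyPrefix, env = {} } = opts;
  if (keyEnvVar && expectedKeyPrefix) {
    const key = env[keyEnvVar];
    if (key !== undefined && key.indexOf(expectedKeyPrefix) !== 0) {
      details.push(`warning: ${keyEnvVar} does not start with "${expectedKeyPrefix}"`);
    }
  }

  const wrapped = new Error(`[BRAIN ERROR - ${provider}/${model}] ${details.join(" | ")}`);
  (wrapped as any).cause = error; // keep the original error reachable
  (wrapped as any).providerStatus = error.status; // status without re-parsing the message
  return wrapped;
}
```

Because the helper takes plain objects, a test can pass synthetic error shapes directly, which matches the test seam the first note describes.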
55
- ## File List
56
-
57
- - `src/orchestrator/Brain.ts` — MODIFIED (added `formatProviderError`, wrapped each `thinkWith*` in try/catch routing through helper, read body on fetch failures)
58
- - `src/test_brain_error_diagnostics.ts` — NEW
59
- - `package.json` — MODIFIED (added `test:brain-error-diagnostics`, appended to `test:all`)
60
-
61
- ## Change Log
62
-
63
- - 2026-05-10 — @dev (Charlie) — Implemented structured provider errors via `formatProviderError` helper. All 7 providers route through it. Test covers status/message/stderr/key-prefix/cause. Status: Ready → InReview.
64
-
65
- ## IDS check
66
-
67
- **Decision:** ADAPT (refining existing error handling).
68
-
69
- - `src/orchestrator/Brain.ts` → ADAPT (better error wrapping)
70
- - `src/test_brain_error_diagnostics.ts` → CREATE
71
-
72
- ## Files to touch
73
-
74
- - `src/orchestrator/Brain.ts` (each `thinkWith*` method)
75
- - `src/test_brain_error_diagnostics.ts` — new
76
- - `package.json` — add test script
77
-
78
- ## Estimate
79
-
80
- Effort: XS (1-2 hours). Localized refactor in `Brain.ts`.
@@ -1,70 +0,0 @@
1
- # Story 2.1 — [BUG] `openlife phase1-check` hangs indefinitely
2
-
3
- **StoryId:** `2.1`
4
- **Epic:** `epic-feature-completeness`
5
- **Status:** InReview
6
- **Severity:** P1
7
- **Discovered in:** Phase 2 of total feature audit milestone (`docs/audit/CLI-EXECUTION-RESULTS.md`)
8
- **Cluster:** process-lifecycle
9
-
10
- ## Description
11
-
12
- `openlife phase1-check` never exits. Process must be killed with SIGTERM/SIGKILL. Documented as a canonical readiness check, but unusable in scripts, CI, or any context expecting deterministic exit.
13
-
14
- **Operational impact:** Same class of bug as Story 1.2 `ask` exit (now resolved). The handler at `src/index.ts:435` runs `TestHarness.runPhase1Checks()` and sets `process.exitCode` but never calls `process.exit()`. TestHarness constructor instantiates `Gateway` → `Telegraf` and `Gatekeeper` → `Brain` → `OmniMemory` etc., all of which leave event-loop handles open.
15
-
16
- ## Reproduce
17
-
18
- ```bash
19
- timeout 35 node dist/index.js phase1-check
20
- # Expected: exit 0 or 1 in under 30s with readable check matrix
21
- # Observed (pre-fix): exit 143 (SIGTERM) after 35s, zero output captured
22
- ```
23
-
24
- Evidence: `.planning/phase-2/FINDINGS.md` BUG-01.
25
-
26
- ## Root cause
27
-
28
- Three contributing factors:
29
-
30
- 1. **No explicit `process.exit()`** — handler only sets `process.exitCode = 1` on failure; Node tries to drain event-loop normally, fails because handles remain open.
31
- 2. **Heavy module imports** — `require('./orchestrator/TestHarness')` chains through Brain/Gateway/Gatekeeper, which collectively take ~24s of synchronous `require()` time on WSL2 (Brain alone ~10s, Gateway ~21s, Gatekeeper ~23s).
32
- 3. **No master timeout** — if any check (`checkBrainPrimary`, `checkGatewayText`, etc.) blocks on an LLM call that never returns, the whole command hangs.
33
-
34
- ## Acceptance Criteria
35
-
36
- - [x] **`process.exit(exitCode)` called unconditionally** at end of handler — guarantees deterministic exit regardless of open handles
37
- - [x] **Master timeout via `Promise.race`** — default 30s (overridable via `OPENLIFE_PHASE1_TIMEOUT_MS`) — if all checks together exceed timeout, exit 1 with clear error
38
- - [x] **Construction inside race** — `TestHarness` instantiation wrapped in async IIFE so heavy synchronous imports don't starve the timeout
39
- - [x] **Regression test** — `src/test_phase1_check_exit.ts` spawns subprocess, asserts exit code in {0,1}, asserts no SIGKILL needed within generous 120s shell timeout
40
- - [x] All 8 Phase 1 checks still run when reachable (no degradation of `phase1-check` functionality)
41
- - [x] `npm run test:all` passes — 62 → 63 tests
42
-
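The master-timeout criterion can be illustrated with a small `Promise.race` wrapper. This is a sketch under assumptions, not the handler in `src/index.ts`: `withTimeout` and `runWithDeadline` are hypothetical names, and the mapping of a timeout to exit code 1 follows the acceptance criteria above.

```typescript
// Hypothetical helper: reject if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  // Clear the timer on either outcome so it does not keep the event loop alive.
  const clear = () => {
    if (timer !== undefined) clearTimeout(timer);
  };
  return Promise.race([work, timeout]).then(
    (value) => {
      clear();
      return value;
    },
    (err) => {
      clear();
      throw err;
    },
  );
}

// Hypothetical wrapper: turn "all checks passed?" into a deterministic exit code.
// The 30000ms default mirrors the story; reading OPENLIFE_PHASE1_TIMEOUT_MS is left to the caller.
async function runWithDeadline(
  checks: () => Promise<boolean>,
  timeoutMs: number = 30000,
): Promise<number> {
  try {
    const ok = await withTimeout(checks(), timeoutMs, "phase1-check");
    return ok ? 0 : 1;
  } catch {
    return 1; // timeout or unexpected failure
  }
}
```

A handler built this way would end with an unconditional `process.exit(code)` on the returned value, so open handles left by `Gateway` or `Brain` cannot keep the process alive.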
43
- ## Dev Notes
44
-
45
- - **Why master timeout = 30s default**: empirically `phase1-check` takes ~36s on WSL because of slow imports. Users running in production-like env (linux, fast disk) typically see <15s. Default is conservative for "either it works or fails cleanly".
46
- - **Why test timeout is 120s**: regression test must NOT be flaky on WSL where imports are slow. 120s is "either fix works or doesn't" — the bug being tested is **forever-hang**, not slowness.
47
- - **Why not refactor TestHarness**: TestHarness lazy-instantiation would require changes to the constructor pattern across many call sites. Out of scope for this story. Story 2.x might revisit if `phase1-check` becomes hot path.
48
-
49
- ## File List
50
-
51
- - `src/index.ts` — MODIFIED (added try/catch wrapping master timeout + `process.exit`)
52
- - `src/test_phase1_check_exit.ts` — NEW (regression test)
53
- - `package.json` — MODIFIED (added `test:phase1-check-exit`, appended to `test:all`)
54
-
55
- ## Change Log
56
-
57
- 2026-05-11 — @dev (Charlie) — Implemented `process.exit` + master timeout + race-wrapping. Regression test added. Test suite 62 → 63 green. Status: Ready → InReview.
58
-
59
- ## IDS check
60
-
61
- **Decision:** REUSE (process.exit pattern from Story 1.2) + ADAPT (add master timeout) + CREATE (regression test).
62
-
63
- - `src/index.ts:435` handler → ADAPT pattern from `src/index.ts:415` (`ask` handler, Story 1.2)
64
- - `src/test_phase1_check_exit.ts` → CREATE mirror of `src/test_ask_exit.ts`
65
-
66
- ## Files to touch
67
-
68
- - `src/index.ts` (handler at line 435)
69
- - `src/test_phase1_check_exit.ts` — new
70
- - `package.json` — add `test:phase1-check-exit` script + append to test:all chain
@@ -1,49 +0,0 @@
1
- # Story 2.2 — [BUG] `openlife mcp status` default exits 1 demanding `--real`
2
-
3
- **StoryId:** `2.2`
4
- **Epic:** `epic-feature-completeness`
5
- **Status:** InReview
6
- **Severity:** P2
7
- **Discovered in:** Phase 2 of total feature audit (`.planning/phase-2/FINDINGS.md`)
8
- **Cluster:** cli-default-ux
9
-
10
- ## Description
11
-
12
- `openlife mcp status` (without flags) exits with code 1 and prints `❌ Use --real para obter o status determinístico do runtime.` (in English: "Use --real to get the deterministic runtime status."). This is bad UX: defaults should either work or print help. Workaround: pass `--real`.
13
-
14
- The dual-mode pattern (`--real` vs default) was a remnant of an older design where a mock/deterministic mode existed. No mock path exists in current code, so the only useful behavior is `--real`.
15
-
16
- ## Reproduce
17
-
18
- ```bash
19
- node dist/index.js mcp status
20
- # Pre-fix: stderr "❌ Use --real..." exit 1
21
- # Post-fix: stdout JSON status, exit 0
22
-
23
- node dist/index.js mcp status --real
24
- # Both pre and post: stdout JSON status, exit 0 (compat preserved)
25
- ```
26
-
27
- ## Fix
28
-
29
- Default action now invokes `world.mcpStatusReal()` regardless of flag. `--real` flag retained as no-op for backwards compatibility (any scripts/aliases that pass `--real` continue to work unchanged).
30
-
31
- ## Acceptance Criteria
32
-
33
- - [x] `openlife mcp status` (no flag) returns JSON `mcp-real-status` payload and exits 0
34
- - [x] `openlife mcp status --real` continues to work (backwards compat)
35
- - [x] `test_cli_diagnostics.ts` reclassifies this command from `gap` (KNOWN BUG) to `pass`
36
- - [x] `npm run test:all` green (63/63)
37
-
38
- ## File List
39
-
40
- - `src/index.ts:1154-1163` — MODIFIED (removed error gate, `--real` becomes no-op)
41
- - `src/test_cli_diagnostics.ts` — MODIFIED (reclassified `mcp status` to `pass`)
42
-
43
- ## Change Log
44
-
45
- - 2026-05-11 — @dev (Charlie) — Removed `--real` gate. `mcp status` default returns real status. Backwards compat preserved. Test reclassified. Status: Ready → InReview.
46
-
47
- ## IDS check
48
-
49
- **Decision:** ADAPT — minor handler refactor.
@@ -1,74 +0,0 @@
1
- # Story 2.3 — [CONCERN] `pilot/learning/plugin --help` >3s (NOT REPRODUCIBLE)
2
-
3
- **StoryId:** `2.3`
4
- **Epic:** `epic-feature-completeness`
5
- **Status:** Closed (No-Fix Required — concern did not reproduce)
6
- **Severity:** P3
7
- **Discovered in:** Phase 2 of total feature audit (`.planning/phase-2/FINDINGS.md` CONCERN-01)
8
- **Cluster:** lazy-load-performance
9
-
10
- ## Description
11
-
12
- Phase 2 test `test_cli_help_surface.ts` flagged 3 command groups with `--help` latency >3s:
13
- - `pilot --help` — 4175ms
14
- - `learning --help` — 3311ms
15
- - `plugin --help` — 3249ms
16
-
17
- Original hypothesis: top-level imports of heavy classes (EnterpriseAgenticCore, SkillLearningLoop) violating the lazy-load invariant documented in `CLAUDE.md`.
18
-
19
- ## Investigation (2026-05-11)
20
-
21
- Direct measurement after Stories 2.1 + 2.2:
22
-
23
- ```
24
- pilot --help: 1336ms
25
- learning --help: 1417ms
26
- plugin --help: 1589ms
27
- help --help: 1392ms
28
- install --help: 1625ms
29
- ask --help: 1430ms
30
- system --help: 1368ms
31
- ```
32
-
33
- All groups respond to `--help` in ~1.3–1.6s, **uniform across the surface**. No outliers.
34
-
35
- Re-running `test_cli_help_surface.ts`:
36
-
37
- ```
38
- TEST_CLI_HELP_SURFACE_OK (45/45 groups + 2 root invocations)
39
- ```
40
-
41
- No `SLOW (>3s)` line — meaning all 45 groups + 2 root invocations now complete under the 3s threshold.
42
-
43
- Module load timings (isolated):
44
- - `require('./orchestrator/EnterpriseAgenticCore')`: **21ms**
45
- - `require('./orchestrator/SkillLearningLoop')`: same range
46
-
47
- These are NOT heavy modules. The original "slow" reading was a measurement artifact — likely subprocess startup batching in the first few test iterations, or transient system load during the Phase 2 baseline.
48
-
49
- ## Decision
50
-
51
- **NO-FIX REQUIRED.** Concern was a false positive in the Phase 2 baseline measurement.
52
-
53
- Action items:
54
- - [x] Re-measure with current build — all `--help` <2s
55
- - [x] Re-run `test_cli_help_surface.ts` — no SLOW flags
56
- - [x] Document finding here for traceability
57
- - [x] Update `.planning/phase-2/FINDINGS.md` to note CONCERN-01 resolved
58
-
59
- **If future regression** brings `--help` back over 3s for any group: re-open this story. The test still has the 3s threshold and will flag.
60
-
61
- ## Acceptance Criteria
62
-
63
- - [x] No production code changes
64
- - [x] `npm run test:all` continues green (63/63)
65
- - [x] CONCERN-01 documented as not-reproducible
66
-
67
- ## File List
68
-
69
- - `docs/stories/epic-feature-completeness/2.3.story.md` — NEW (this story)
70
- - `.planning/phase-2/FINDINGS.md` — MODIFIED (CONCERN-01 marked resolved)
71
-
72
- ## Change Log
73
-
74
- - 2026-05-11 — @dev (Charlie) — Investigated CONCERN-01. Not reproducible with current build (~1.4s help latency uniform). Closed as No-Fix.
@@ -1,71 +0,0 @@
1
- # Story 2.4 — [DEBT] Test infra cleanup via pretest hook
2
-
3
- **StoryId:** `2.4`
4
- **Epic:** `epic-feature-completeness`
5
- **Status:** InReview
6
- **Severity:** P3
7
- **Discovered in:** Phase 2 of total feature audit (`.planning/phase-2/FINDINGS.md` DEBT-01) + `.planning/codebase/CONCERNS.md` C8
8
- **Cluster:** test-hygiene
9
-
10
- ## Description
11
-
12
- Pre-existing test pollution made `npm run test:all` intermittently fail:
13
- - `test_openlife_evolution_surface` asserts `.catalog/` clean of demo/test artifacts; broken when leftover `test-agent/`, `test-squad/`, `test-skill-*/`, `test-mcp/` persist from prior runs
14
- - `test_operating_system` sensitive to stale state in `.artifacts/execution-board.json` from older mission runs
15
- - `test_create_entities`, `test_admin_teams_networks`, `test_sources_import_ref` are the **emitters** of test-* entries (per Story 1.4 Dev Notes)
16
-
17
- This made `test:all` results inconsistent until the residue was cleaned manually. Documented as a "Known Bug" workaround in Phase 2 CLI-EXECUTION-RESULTS.md.
18
-
19
- ## Reproduce
20
-
21
- ```bash
22
- npm run test:all # passes
23
- npm run test:all # FAILS at test_openlife_evolution_surface ("catalog doctor warns about demo/test assets")
24
- ```
25
-
26
- Without the `pretest:all` hook, the second run sees the residue left by the first.
27
-
28
- ## Fix
29
-
30
- Add `scripts/clean-test-pollution.js` + `pretest:all` npm hook that runs **before** `test:all`:
31
-
32
- 1. Remove `.catalog/{agents,squads,skills,mcps}/test-*` (and the auto-created `void` directory)
33
- 2. Delete `.artifacts/` recursively
34
- 3. Restore tracked `.artifacts/*` files via `git checkout -- .artifacts/`
35
- 4. Idempotent (safe to run multiple times)
36
-
37
- This is a **pragmatic mitigation** that unblocks test:all NOW. Proper fix (deferred to Story 2.4 v2 in v2.0+) is `OPENLIFE_CATALOG_DIR` env override + temp-dir fixtures in offending tests.
38
-
39
- ## Acceptance Criteria
40
-
41
- - [x] `scripts/clean-test-pollution.js` exists, executable, idempotent
42
- - [x] `package.json` has `pretest:all` script invoking the cleanup
43
- - [x] `npm run test:all` runs cleanup automatically before tests
44
- - [x] `npm run test:all` passes 2+ consecutive runs (deterministic)
45
- - [x] Cleanup script preserves tracked files (e.g., `.artifacts/squad-scores.json`)
46
- [x] `npm run test:all` = 63/63 green
47
- - [x] `pretest:all` output visible in stdout (operators see what was cleaned)
48
-
49
- ## Dev Notes
50
-
51
- - **Why not `OPENLIFE_CATALOG_DIR` override now?** Would require changes to 3 separate env var conventions (`OPENLIFE_AGENT_ROOTS`, `OPENLIFE_SKILL_ROOT`, `OPENLIFE_SQUAD_ROOT`), plus offending tests need refactor. ~6-10h effort vs ~30min for this hook approach. Hook unblocks immediately; env override can ship in v2.0 epic F (test parallelization).
52
- - **Patterns cleaned:** `test-*` prefix + `void` directory (one of the assets agents auto-creates).
53
- - **`.artifacts/` is gitignored** but `squad-scores.json` is tracked — the cleanup script handles this via `git checkout`.
54
-
55
- ## File List
56
-
57
- - `scripts/clean-test-pollution.js` — NEW (executable cleanup script)
58
- - `package.json` — MODIFIED (added `pretest:all` hook)
59
-
60
- ## Change Log
61
-
62
- - 2026-05-11 — @dev (Charlie) — Added pretest cleanup hook. test:all now deterministic across consecutive runs. Status: Ready → InReview.
63
-
64
- ## IDS check
65
-
66
- **Decision:** CREATE (new ops script).
67
-
68
- ## Future work (out of this story)
69
-
70
- - Story 2.4-v2 (v2.0 epic F): `OPENLIFE_CATALOG_DIR` env override + temp-dir fixtures in test_create_entities/test_admin_teams_networks/test_sources_import_ref
71
- - Story 2.4-v3 (v2.0): pretest hook on individual test scripts (not just test:all)
@@ -1,56 +0,0 @@
1
- # Story 3.1 — Host enum + validator (v1.1 multi-host installer foundation)
2
-
3
- **StoryId:** `3.1`
4
- **Epic:** `epic-multi-host-installer` (v1.1)
5
- **Status:** InReview
6
- **Severity:** P1 (foundational — blocks 3.2-3.7)
7
- **Cluster:** install-flow
8
-
9
- ## Description
10
-
11
- `InstallFlow.run({ host })` currently accepts ANY string and ignores it. This is the architectural stub identified in the comprehensive codebase audit (2026-05-11): "Host install is architectural stub". For v1.1 to deliver real multi-host install, we need a validated type so downstream stories (per-host logic, MCP registration, docs) have a deterministic input.
12
-
13
- ## Acceptance Criteria
14
-
15
- - [x] **Type** `Host = 'claude-code' | 'gemini-cli' | 'codex'` exported from `src/cli/InstallFlow.ts`
16
- - [x] **Constants** `VALID_HOSTS` and `DEFAULT_HOST` exported
17
- - [x] **Function** `validateHost(value)` — throws `INVALID_HOST` with clear message + valid list when input is invalid; falls back to `DEFAULT_HOST` on null/undefined/empty
18
- - [x] **Case-insensitive** + whitespace-tolerant validation (`CLAUDE-CODE`, ` Codex ` normalize)
19
- - [x] **Auto-detection** `detectHostFromEnv()` based on env vars set by each CLI:
20
- - `claude-code` ← `CLAUDECODE` or `CLAUDE_PROJECT_DIR`
21
- - `gemini-cli` ← `GEMINI_CONFIG_DIR`
22
- - `codex` ← `CODEX_HOME`
23
- - [x] `InstallFlow.run()` uses validateHost — invalid host throws clear error
24
- - [x] Fixed deprecated `mode set` reference in `buildNextCommands` (replaced with `system setup --profile X --host Y`, mirror of commit `aba599b` INSTALL.md fix)
25
- - [x] Regression test `src/test_install_flow_host_validation.ts` — 7 test cases covering happy path + edge cases
26
- [x] Suite 63 → 64 green
27
- - [x] Bumped `test_ask_exit` timeout 30s → 60s (pre-existing latency margin, not a regression)
28
-
29
- ## Dev Notes
30
-
31
- - **Why all 3 hosts now (not just claude-code)?** Decision D3 locked: install offers Lone Wolf or Swarm Commander × any of 3 hosts. Story 3.1 establishes the enum so 3.2 (templates) and 3.3 (per-host logic) have a stable type to branch on.
32
- - **Why case-insensitive + trim?** Real users mistype. `--host Claude-Code` should work.
33
- **Auto-detection priority** picks the most specific signal. If a user sets both `CLAUDECODE=1` and `GEMINI_CONFIG_DIR=/x`, we pick claude-code (declared first). Tests cover each path separately but not the conflict; operators are expected to set one or pass `--host` explicitly.
34
- **Fixing `mode set` in nextCommands** was a drive-by: same class of bug as the 3 CRITICAL doc gaps closed in commit `aba599b`. The `buildNextCommands` output was still telling users to run a non-existent command. It now uses `system setup`.
35
-
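The validator and detector described above might look roughly like the following. The `Host` union, the exported names, the `INVALID_HOST` error, and the env-var mapping come from the acceptance criteria. The rest is assumed for this sketch: the value of `DEFAULT_HOST`, the injectable `env` parameter, the `undefined` return when nothing is detected, and the exact error wording.

```typescript
type Host = "claude-code" | "gemini-cli" | "codex";

const VALID_HOSTS: readonly Host[] = ["claude-code", "gemini-cli", "codex"];
// Assumption: the story does not state which host is the default.
const DEFAULT_HOST: Host = "claude-code";

function validateHost(value: string | null | undefined): Host {
  // Case-insensitive and whitespace-tolerant: " Codex " normalizes to "codex".
  const normalized = (value ?? "").trim().toLowerCase();
  // null, undefined and empty input fall back to the default
  // (whitespace-only input is treated as empty in this sketch).
  if (normalized === "") return DEFAULT_HOST;
  for (const host of VALID_HOSTS) {
    if (host === normalized) return host;
  }
  throw new Error(
    `INVALID_HOST: "${value}" is not supported. Valid hosts: ${VALID_HOSTS.join(", ")}`,
  );
}

// Assumption: env is passed in rather than read from process.env, to keep the sketch testable.
function detectHostFromEnv(env: Record<string, string | undefined>): Host | undefined {
  // Declaration order doubles as priority: claude-code wins if several signals are set.
  if (env["CLAUDECODE"] || env["CLAUDE_PROJECT_DIR"]) return "claude-code";
  if (env["GEMINI_CONFIG_DIR"]) return "gemini-cli";
  if (env["CODEX_HOME"]) return "codex";
  return undefined;
}
```

Returning the matched element of `VALID_HOSTS` rather than casting the input string keeps the result typed as `Host` without an `as` assertion, so downstream stories can switch on it exhaustively.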
36
- ## File List
37
-
38
- - `src/cli/InstallFlow.ts` — MODIFIED (added Host type, validateHost, detectHostFromEnv, VALID_HOSTS, DEFAULT_HOST; updated `run()` to validate; updated `buildNextCommands` to use real command)
39
- - `src/test_install_flow_host_validation.ts` — NEW (7 test cases)
40
- - `src/test_ask_exit.ts` — MODIFIED (timeout 30s → 60s, comment explains why)
41
- - `package.json` — MODIFIED (added `test:install-flow-host-validation`, appended to `test:all`)
42
-
43
- ## Change Log
44
-
45
- - 2026-05-11 - @dev (Charlie) - Implemented host enum + validator + auto-detection + bug fix on buildNextCommands. test:all 63 → 64, all green. Status: Ready → InReview.
46
-
47
- ## IDS check
48
-
49
- **Decision:** ADAPT (extending existing InstallFlow with stricter typing) + CREATE (regression test).
50
-
51
- ## What unblocks for v1.1
52
-
53
- - Story 3.2 (templates per host) — has `Host` type to switch on
54
- - Story 3.3 (per-host install logic) — has `validateHost` at CLI boundary
55
- - Story 3.4 (uninstall) — has same enum for reversal
56
- - Story 3.5 (wizard) — has detection + validation primitives
@@ -1,80 +0,0 @@
1
- # Story 3.2 — dist-templates per host (Claude Code starter roster)
2
-
3
- **StoryId:** `3.2`
4
- **Epic:** `epic-multi-host-installer` (v1.1)
5
- **Status:** InReview
6
- **Severity:** P1 (blocks 3.3 per-host install logic)
7
- **Cluster:** install-flow
8
- **Depends on:** Story 3.1 (host enum + validator)
9
-
10
- ## Description
11
-
12
- Story 3.1 gave OpenLife a validated `Host` type at the CLI boundary, but `openlife system setup --host claude-code` had nothing to install. We need the actual artifacts — agent files, slash commands, MCP manifest — bundled in the npm package so install is offline and atomic.
13
-
14
- This story ships the **Claude Code** templates only. gemini-cli and codex follow in Story 3.3 once each host's installation format is investigated and verified.
15
-
16
- ## Acceptance Criteria
17
-
18
- - [x] **Directory layout** `dist-templates/claude-code/{agents,commands/openlife,mcp}` exists in the repo
19
- - [x] **5 starter agents** in Claude Code subagent format (YAML frontmatter + system prompt body):
20
- - `openlife-maestro` — meta-orchestrator (routes to specialists via `Task` tool)
21
- - `openlife-lyra` — research synthesis + narrative writing
22
- - `openlife-forge` — artifact creation (agents, skills, slash commands, MCP)
23
- - `openlife-atlas` — codebase mapping + architectural analysis
24
- - `openlife-genesis` — new-project bootstrap + install/scaffold
25
- - [x] **4 starter slash commands** under `commands/openlife/`:
26
- - `/openlife:status`, `/openlife:ask`, `/openlife:doctor`, `/openlife:dream`
27
- - [x] **MCP manifest** `mcp/openlife-orchestrator.json` with 7 tool declarations (server impl deferred to Story 3.3)
28
- - [x] **README** `dist-templates/README.md` documenting layout, format, and how to add new agents
29
- - [x] `package.json` `files` array includes `dist-templates/` (so npm publishes it)
30
- - [x] **Regression test** `src/test_dist_templates_layout.ts` — 5 test groups: layout, agents parse, slash commands parse, MCP valid JSON, package.json correctness
31
- - [x] Test wired into `test:all`; suite 64 → 65, all green
32
-
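The "agents parse" group of the regression test could look roughly like the check below. It is a sketch under stated assumptions, not the contents of `src/test_dist_templates_layout.ts`: `parseSubagent` and `SubagentFile` are hypothetical names, the frontmatter reader is a naive `key: value` splitter rather than a real YAML parser, and the only fields it requires are `name` and `description`.

```typescript
// Sketch only: a naive frontmatter check, not a full YAML parser.
interface SubagentFile {
  fields: Record<string, string>;
  body: string;
}

function parseSubagent(source: string): SubagentFile {
  // Expect "---", frontmatter lines, "---", then the system prompt body.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (match === null) throw new Error('missing YAML frontmatter');

  const fields: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(':');
    if (idx === -1) continue; // skip lines that are not simple key: value pairs
    fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }

  // Assumption: name and description are the fields the test insists on.
  for (const required of ['name', 'description']) {
    if (!fields[required]) throw new Error(`frontmatter is missing "${required}"`);
  }

  const body = match[2].trim();
  if (body === '') throw new Error('empty system prompt body');
  return { fields, body };
}
```

A real test would read each file under `dist-templates/claude-code/agents/` and pass its contents to a check like this; a dedicated YAML library would be the sturdier choice if the frontmatter ever uses nested or multi-line values.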
33
- ## Dev Notes
34
-
35
- - **Why 5 agents, not 21?** The user chose the "Pulled from 21 vault agents (MAESTRO/LYRA/etc.)" approach. The starter set is the 5 with the clearest non-overlapping ownership: MAESTRO (routing), LYRA (synthesis), FORGE (creation), ATLAS (analysis), GENESIS (bootstrap). The remaining 16 are scheduled for v1.1+ stories once each role has a verified `.catalog/agents/` runtime counterpart.
36
- - **Why Claude Code format only?** The user chose "Claude Code first (Recommended)": Claude Code's subagent spec is well documented and it is the most mature host. gemini-cli and codex are deferred to Story 3.3, which will investigate their respective formats.
37
- - **Why is the MCP server stubbed?** The manifest ships so install is atomic (single host-add operation copies all artifacts), but `bin/openlife-mcp.js` doesn't exist yet. Story 3.3 wires up the actual MCP server. Until then, the manifest is informational — installing it into `~/.claude.json` is harmless.
38
- - **dist-templates vs .catalog/.** Two different audiences:
39
- - `.catalog/agents/` = OpenLife **runtime** catalog (rich YAML, loaded by `AgentRegistry`)
40
- - `dist-templates/claude-code/agents/` = what gets **installed into the host CLI** (lean Claude Code subagent format)
41
- - Same logical agent can appear in both with different formats.
42
- - **Lean prompt size.** Each agent file is ~50-80 lines including frontmatter. Compare to legacy heavy-format runtime catalog entries (~330 lines). Claude Code agents work best with focused system prompts; verbose persona definitions are wasted context per invocation.
43
-
44
- ## File List
45
-
46
- - `dist-templates/README.md` — NEW (layout + format docs)
47
- - `dist-templates/claude-code/agents/openlife-maestro.md` — NEW
48
- - `dist-templates/claude-code/agents/openlife-lyra.md` — NEW
49
- - `dist-templates/claude-code/agents/openlife-forge.md` — NEW
50
- - `dist-templates/claude-code/agents/openlife-atlas.md` — NEW
51
- - `dist-templates/claude-code/agents/openlife-genesis.md` — NEW
52
- - `dist-templates/claude-code/commands/openlife/status.md` — NEW
53
- - `dist-templates/claude-code/commands/openlife/ask.md` — NEW
54
- - `dist-templates/claude-code/commands/openlife/doctor.md` — NEW
55
- - `dist-templates/claude-code/commands/openlife/dream.md` — NEW
56
- - `dist-templates/claude-code/mcp/openlife-orchestrator.json` — NEW
57
- - `src/test_dist_templates_layout.ts` — NEW (5 test groups, 65th test in suite)
58
- - `package.json` — MODIFIED (added `dist-templates` to `files`; added `test:dist-templates-layout` script; appended to `test:all`)
59
-
60
- ## Change Log
61
-
62
- - 2026-05-11 - @dev (Charlie) - Created dist-templates/ skeleton with 5 starter agents (MAESTRO/LYRA/FORGE/ATLAS/GENESIS) in Claude Code subagent format, 4 starter slash commands under `/openlife:*`, MCP manifest (server stubbed for 3.3), regression test, README. test:all 64 → 65, all green. Status: Ready → InReview.
63
-
64
- ## IDS check
65
-
66
- **Decision:** CREATE (new distribution surface — no existing artifact installs templates into a host CLI). Format follows Claude Code's published subagent + slash command spec (REUSE of external pattern, not invented).
67
-
68
- ## What unblocks for v1.1
69
-
70
- - Story 3.3 (per-host install logic) — has templates to copy
71
- - Story 3.4 (uninstall reversible) — has known artifact list to remove
72
- - Story 3.5 (install wizard interactive) — has roster to present to user
73
- - Story 3.6 (docs per host) — has agent/command surface to document
74
-
75
- ## What this does NOT do
76
-
77
- - Implement the actual MCP server (`bin/openlife-mcp.js`) → Story 3.3
78
- - Ship gemini-cli or codex templates → Story 3.3
79
- - Wire `InstallFlow.run()` to actually copy these into the host → Story 3.3
80
- - Add the remaining 16 named agents (VEIN/FLUX/VECTOR/etc.) → spread across v1.1+ stories