@openlife/cli 1.7.3 → 1.7.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +186 -0
- package/CODE_OF_CONDUCT.md +31 -0
- package/CONTRIBUTING.md +133 -0
- package/README.md +25 -9
- package/dist/index.js +10 -1
- package/package.json +10 -2
- package/docs/CHANGELOG_FEATURE_ROLLOUT_DESIGNMD.md +0 -43
- package/docs/EXTERNAL_SOURCES_AND_SECURITY_GUARD.md +0 -33
- package/docs/OPENLIFE_AUDIT_2026-05-06.md +0 -170
- package/docs/OPENLIFE_CONSOLIDATED_PLAN_2026-05-06.md +0 -299
- package/docs/OPENLIFE_DUAL_MODE_IMPLEMENTATION_PLAN.md +0 -205
- package/docs/OPENLIFE_EVOLUTION_SURFACE_2026-05-07.md +0 -53
- package/docs/OPENLIFE_SKILLS_IMPORT_2026-05-07.json +0 -223
- package/docs/OPENLIFE_SQUADS_IMPORT_2026-05-07.json +0 -184
- package/docs/PAPERCLIP_OPENLIFE_INVESTIGATION.md +0 -85
- package/docs/RELEASE_ORGANIZATION_PLAN.md +0 -164
- package/docs/audit/CLI-EXECUTION-RESULTS.md +0 -113
- package/docs/audit/CLI-MATRIX.md +0 -556
- package/docs/audit/DOC-PARITY-GAPS.md +0 -351
- package/docs/audit/ORCHESTRATOR-MATRIX.md +0 -136
- package/docs/audit/TEST-COVERAGE-GAPS.md +0 -334
- package/docs/audit/integrations/SKIPPED.md +0 -101
- package/docs/autonomous-install.md +0 -79
- package/docs/capability-genesis.md +0 -137
- package/docs/capability-pack-schema.md +0 -157
- package/docs/commands.md +0 -82
- package/docs/deep-research-capability.md +0 -114
- package/docs/development/typescript-conventions.md +0 -95
- package/docs/host-installers.md +0 -68
- package/docs/install/aiobuilder.md +0 -70
- package/docs/install/claude-code.md +0 -83
- package/docs/install/codex.md +0 -64
- package/docs/install/gemini-cli.md +0 -64
- package/docs/install/runtime-profiles.md +0 -83
- package/docs/openlife-agent-os-blueprint.md +0 -114
- package/docs/openlife-install-backlog.md +0 -115
- package/docs/openlife-install-spec.md +0 -306
- package/docs/operations/CLOUD_CUTOVER_AUDIT.md +0 -37
- package/docs/operations/PHASE_PROGRESS_CONTINUATION.md +0 -24
- package/docs/performance-benchmarks.md +0 -83
- package/docs/planning/v1.3-capability-genesis.md +0 -157
- package/docs/plans/2026-05-05-admin-interface-professional-dark-premium-plan.md +0 -84
- package/docs/plans/2026-05-05-openlife-autonomous-domain-marketplace-masterplan.md +0 -122
- package/docs/roadmap/OPENLIFE_MASTER_PLAN_CLOUD_V3.md +0 -97
- package/docs/sandboxing-research.md +0 -117
- package/docs/stories/epic-feature-audit/1.1.story.md +0 -84
- package/docs/stories/epic-feature-audit/1.2.story.md +0 -102
- package/docs/stories/epic-feature-audit/1.3.story.md +0 -93
- package/docs/stories/epic-feature-audit/1.5.story.md +0 -121
- package/docs/stories/epic-feature-audit/1.6.story.md +0 -80
- package/docs/stories/epic-feature-completeness/2.1.story.md +0 -70
- package/docs/stories/epic-feature-completeness/2.2.story.md +0 -49
- package/docs/stories/epic-feature-completeness/2.3.story.md +0 -74
- package/docs/stories/epic-feature-completeness/2.4.story.md +0 -71
- package/docs/stories/epic-feature-completeness/3.1.story.md +0 -56
- package/docs/stories/epic-feature-completeness/3.2.story.md +0 -80
- package/docs/stories/epic-feature-completeness/3.3.story.md +0 -68
- package/docs/stories/epic-feature-completeness/3.4.story.md +0 -71
- package/docs/stories/epic-feature-completeness/3.5.story.md +0 -72
- package/docs/stories/epic-feature-completeness/3.6.story.md +0 -69
- package/docs/stories/epic-feature-completeness/3.7.story.md +0 -68
- package/docs/stories/epic-feature-completeness/3.8.story.md +0 -57
- package/docs/v1.4-changelog.md +0 -159
- package/docs/v1.5-changelog.md +0 -106
- package/docs/v1.5-roadmap.md +0 -121
- package/docs/v1.6-changelog.md +0 -67
- package/docs/v1.6-roadmap.md +0 -89
@@ -1,306 +0,0 @@
# OpenLife Install Spec (Terminal-First, Dual Mode)

## Command Goal
`openlife install` is the single onboarding entrypoint with two primary modes:

1. **Install CLI**
2. **Install Autonomous Agent**

Mandatory qualities:
- modular architecture
- idempotent execution
- no silent mock fallback
- real validation at each critical step
- failure recovery with retry/resume
- terminal chat functional at the end
- Telegram setup available in both modes

---

## High-Level Architecture

- `install.entry.ts`: mode routing and flow orchestration
- `install.wizard.ts`: interactive prompts + navigation
- `modules/precheck.ts`
- `modules/core-install.ts`
- `modules/model-auth.ts`
- `modules/model-routing.ts`
- `modules/skills.ts`
- `modules/agent-manager.ts`
- `modules/telegram.ts`
- `modules/chat-terminal.ts`
- `modules/service-manager.ts`
- `modules/validate.ts`
- `modules/recovery.ts`
- `modules/state-store.ts`

---

## Main Flow Tree

```text
openlife install
├─ Welcome screen
├─ Mode selection
│  ├─ [1] CLI
│  └─ [2] Autonomous Agent
├─ Run selected mode flow
├─ Shared finalization
│  ├─ terminal chat enable/validate
│  ├─ install summary
│  └─ next commands (copy/paste)
└─ End
```

---

## Flow A: CLI Installation

1. **Precheck**
   - Node/npm/runtime checks
   - path permission checks
   - network checks
   - optional auto-fix

2. **Workspace selection**
   - default or custom path

3. **Core CLI install**
   - install runtime artifacts/binary
   - write baseline config

4. **Initial CLI config**
   - language
   - log verbosity
   - default profile

5. **Telegram setup (optional, recommended)**
   - collect `BOT_TOKEN`
   - collect optional `CHAT_ID`
   - validate token with Telegram API
   - save secret securely
   - optional send test message if `CHAT_ID` provided

6. **Terminal chat setup (mandatory)**
   - enable `openlife chat`
   - run chat smoke test

7. **Final validation**
   - `openlife --version`
   - `openlife doctor`
   - `openlife status`
   - `openlife chat --test`

---

## Flow B: Autonomous Agent Installation

1. **Autonomous precheck**
   - dependency checks
   - process manager availability (systemd/supervisor)

2. **Agent runtime installation**
   - install autonomous runtime
   - install control commands

3. **Model setup wizard**
   User selects one option:
   - only Model 1
   - Model 1 + Model 2
   - Model 1 + Model 2 + Model 3

   For each selected model:
   - choose auth method: API Key or OAuth/Auth
   - input credentials
   - validate immediately

4. **Model routing/fallback**
   - define order M1 → M2 → M3
   - run real routing/fallback test

5. **Optional skill installation**
   - choose additional skills to install now
   - configure required API credentials per skill
   - validate skill readiness

6. **Agent management (modular core)**
   options:
   - continue with default agent
   - rename default agent (e.g. Lara → custom)
   - create new agent
   - create additional agent

   rule:
   - default agent always exists
   - default name can be customized

7. **Telegram setup (optional, recommended)**
   - token/chat setup and validation
   - anti-conflict polling checks

8. **Continuous activation**
   - install/start/enable service via systemd/supervisor

9. **Terminal chat setup (mandatory)**
   - `openlife chat` must work as manual control interface

10. **End-to-end validation**
    - service healthy
    - agent responds
    - model chain validated
    - Telegram validated (if configured)
    - terminal chat validated
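
The M1 → M2 → M3 routing order from step 4 of Flow B could be sketched roughly as follows. This is a minimal illustration, not the contents of `modules/model-routing.ts`: the `ModelCall` type, the `routeWithFallback` name, and the synchronous calls are all assumptions made to keep the example short.

```typescript
// Hypothetical types: the spec fixes the M1 → M2 → M3 order but no API.
type ModelId = string;
type ModelCall = (prompt: string) => string; // throws on auth/network failure

interface RoutingResult {
  model: ModelId;
  reply: string;
  failures: { model: ModelId; error: string }[];
}

// Try each model in the configured order and return the first reply.
// Failures are collected instead of hidden, and an exhausted chain throws,
// so there is no silent mock fallback.
function routeWithFallback(
  chain: [ModelId, ModelCall][],
  prompt: string,
): RoutingResult {
  const failures: { model: ModelId; error: string }[] = [];
  for (const [model, call] of chain) {
    try {
      return { model, reply: call(prompt), failures };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      failures.push({ model, error });
    }
  }
  const tried = failures.map((f) => f.model).join(", ");
  throw new Error(`all models failed: ${tried}`);
}
```

Returning the collected failures alongside the reply lets a routing test confirm that a fallback really happened, rather than only that some model answered.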

---

## Telegram Module Spec (Both Modes)

### Prompt
- “Connect Telegram now?”
- Yes / No

If Yes:
1. ask for `BOT_TOKEN`
2. ask for optional `CHAT_ID`
3. validate via `getMe`
4. if `CHAT_ID` provided, run send test
5. persist secure secret

### Security Rules
- mask token in UI/logs (`123456:ABC...XYZ`)
- never log full token
- support token rotation command post-install
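
A token-masking helper in the spirit of the `123456:ABC...XYZ` example might look like the sketch below. The spec shows the masked shape but not the exact rule, so keeping the numeric bot id plus the first and last three characters of the secret is an assumption, as is the `maskToken` name.

```typescript
// Mask a Telegram bot token for UI/log output, e.g. "123456:ABC...XYZ".
// Assumption: keep the bot id and the first/last three characters of the
// secret; the spec shows the output shape but not the exact rule.
function maskToken(token: string): string {
  const sep = token.indexOf(":");
  if (sep < 0) return "***"; // not in <id>:<secret> form, reveal nothing
  const id = token.slice(0, sep);
  const secret = token.slice(sep + 1);
  if (secret.length <= 6) return `${id}:***`; // too short to show both ends
  return `${id}:${secret.slice(0, 3)}...${secret.slice(-3)}`;
}
```

Routing every log and prompt line that mentions the token through one helper like this is one way to satisfy the "never log full token" rule in a single place.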

### Conflict-409 Protection
Before polling start:
- detect existing long-poller with same token
- if conflict:
  - option A: stop conflicting process
  - option B: provide different token
  - option C: skip Telegram for now

---

## Terminal Chat Spec (Mandatory)

### Command
- `openlife chat`
- optional direct alias: `openlife`

### Minimum behavior
- interactive prompt/reply loop
- shows active agent/model/session status

### Minimum slash commands
- `/help`
- `/model`
- `/tools`
- `/exit`

### Test commands
- `openlife chat --test`
- `openlife chat --agent <name>`
- `openlife chat --model <id>`

---

## Official Wizard Prompt Sequence

1. Welcome
   - “How do you want to install OpenLife?”
   - (1) CLI
   - (2) Autonomous Agent

2. Precheck
   - “Run environment checks now?”

3. Workspace
   - “Choose installation directory: default/custom”

4. Model setup (autonomous mode)
   - “How many models to configure now?”
   - 1 / 1+2 / 1+2+3

5. For each model
   - “Auth method: API Key or OAuth/Auth?”
   - “Enter credential”
   - “Validate now?”

6. Skills (autonomous mode)
   - “Install additional skills now?”

7. Agent management (autonomous mode)
   - use default / rename default / create new / create additional

8. Telegram (both modes)
   - “Connect Telegram now?”
   - token + optional chat id + validation

9. Terminal chat (both modes)
   - mandatory setup and smoke validation

10. Finish
    - “Run full validation now?”

---

## State Persistence + Resume

State file: `.openlife/install-state.json`

Stores:
- selected mode
- completed steps
- current step
- retries/errors
- timestamps

Resume command:
- `openlife install --resume`
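
One possible shape for the state file, with the lookup a resume would need, is sketched here. The spec lists what is stored but not the field names, so every key in `InstallState` and all three helper functions are hypothetical. Disk I/O is left out so the logic stays testable.

```typescript
type InstallMode = "cli" | "autonomous";

// Hypothetical shape for .openlife/install-state.json; all keys are assumed.
interface InstallState {
  mode: InstallMode;
  completedSteps: string[];
  currentStep: string | null;
  retries: Record<string, number>;
  errors: { step: string; message: string; at: string }[];
  startedAt: string;
  updatedAt: string;
}

function newState(mode: InstallMode, now: string): InstallState {
  return {
    mode,
    completedSteps: [],
    currentStep: null,
    retries: {},
    errors: [],
    startedAt: now,
    updatedAt: now,
  };
}

// Idempotent: completing the same step twice records it once.
function completeStep(state: InstallState, step: string, now: string): InstallState {
  const completedSteps = state.completedSteps.includes(step)
    ? state.completedSteps
    : [...state.completedSteps, step];
  return { ...state, completedSteps, currentStep: null, updatedAt: now };
}

// What a resume would run next: the first planned step not yet completed.
function nextPendingStep(state: InstallState, plan: string[]): string | null {
  return plan.find((step) => !state.completedSteps.includes(step)) ?? null;
}
```

Because the state is plain JSON-compatible data, it can be serialized with `JSON.stringify` and read back with `JSON.parse` without custom handling.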

---

## Error Handling + Recovery

Per step:
- show human-readable error + concise technical cause
- provide actionable fix suggestions
- offer:
  - retry step
  - skip optional step
  - abort and save state

Forbidden:
- fake success
- hidden auth/network failures
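
The retry / skip / abort choice could be wired into a step runner along these lines. This is a sketch under assumptions: `Step`, `Choice`, and `runStep` are illustrative names, the `decide` callback stands in for the interactive prompt, and steps are synchronous for brevity.

```typescript
type StepOutcome = "ok" | "skipped" | "aborted";
type Choice = "retry" | "skip" | "abort";

interface Step {
  name: string;
  optional: boolean;
  run: () => void; // throws with the technical cause on failure
}

// `decide` stands in for the interactive prompt shown after a failure.
function runStep(
  step: Step,
  decide: (error: string, attempt: number) => Choice,
): StepOutcome {
  for (let attempt = 1; ; attempt++) {
    try {
      step.run();
      return "ok"; // reported only after the step really succeeded
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      const choice = decide(message, attempt);
      if (choice === "retry") continue;
      if (choice === "skip" && step.optional) return "skipped";
      return "aborted"; // also covers "skip" requested on a mandatory step
    }
  }
}
```

Treating a skip request on a mandatory step as an abort keeps the runner from ever reporting success for work that did not happen.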

---

## Definition of Done

Installation status is **SUCCESS** only if:
- chosen flow completed without critical failures
- `openlife chat` works
- autonomous mode: service active + responsive
- Telegram mode (if enabled): valid token and no active polling conflict
- final summary includes operational next commands

---

## Final Output Commands (Always Show)

- `openlife chat`
- `openlife status`
- `openlife doctor`
- `openlife models list`
- `openlife agents list`
- `openlife telegram status` (if Telegram configured)

---

## Future-Proof Extensions

- `openlife install --mode cli|autonomous`
- `openlife install --from-file setup.yaml`
- `openlife install --headless` (CI/devops)
- multi-profile install presets (dev/stage/prod)
@@ -1,37 +0,0 @@
# Cloud Cutover Audit (Phase 1)

## Scope executed
- Introduced a provider abstraction for the registries:
  - `src/orchestrator/providers/AgentProvider.ts`
  - `src/orchestrator/providers/SquadProvider.ts`
  - `src/orchestrator/providers/SkillProvider.ts`
- Implemented file-based compatibility providers:
  - `FileAgentProvider.ts`
  - `FileSquadProvider.ts`
  - `FileSkillProvider.ts`
- Refactored the registries to depend on a provider (not an inline parser):
  - `AgentRegistry.ts`
  - `SquadRegistry.ts`
  - `SkillRegistryV2.ts`
- Initial documentation migration (study -> git docs):
  - `docs/imported-from-study/*`

## Tests executed
- TypeScript build: ✅ `npm run build`

## Result
- Compatibility with the file-based source is preserved through the provider.
- The structure is ready to plug in `Cloud*Provider` without breaking the registries' API.
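
The provider split could look roughly like the sketch below. Only the names `AgentProvider`, `FileAgentProvider`, and `AgentRegistry` come from the audit. Their members, the `AgentRecord` shape, and the `FallbackAgentProvider` wrapper are assumptions added for illustration.

```typescript
// Hypothetical record shape; the audit does not describe the agent fields.
interface AgentRecord {
  id: string;
  name: string;
}

// Assumed contract: the registry only needs a way to load records.
interface AgentProvider {
  load(): AgentRecord[];
}

// File-backed provider; the reader is injected so no disk access is needed here.
class FileAgentProvider implements AgentProvider {
  private readonly readSource: () => string;
  constructor(readSource: () => string) {
    this.readSource = readSource;
  }
  load(): AgentRecord[] {
    return JSON.parse(this.readSource()) as AgentRecord[];
  }
}

// Assumed wrapper: try a primary (e.g. cloud) provider, then fall back.
class FallbackAgentProvider implements AgentProvider {
  private readonly primary: AgentProvider;
  private readonly fallback: AgentProvider;
  constructor(primary: AgentProvider, fallback: AgentProvider) {
    this.primary = primary;
    this.fallback = fallback;
  }
  load(): AgentRecord[] {
    try {
      return this.primary.load();
    } catch {
      return this.fallback.load();
    }
  }
}

// The registry depends on the provider interface, not on a parser.
class AgentRegistry {
  private readonly provider: AgentProvider;
  constructor(provider: AgentProvider) {
    this.provider = provider;
  }
  get(id: string): AgentRecord | undefined {
    return this.provider.load().find((agent) => agent.id === id);
  }
}
```

Since `AgentRegistry` sees only the interface, swapping the file provider for a cloud-backed one would not change the registry's public methods.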

## Pending items to complete the full plan
1. ~~Implement `CloudAgentProvider`, `CloudSquadProvider`, `CloudSkillProvider`.~~ ✅
2. ~~Enable cloud-first + flag-controlled fallback.~~ ✅ (composite providers)
3. ~~Implement Skill Manager (create/patch/activate/audit).~~ ✅ (initial module)
4. Full auto-creation of squads (4 artifacts) in cloud storage.
5. Subagent persistence with lifecycle.
6. Learn-in-loop with pattern detector and promotion.
7. Reverse-engineering module and AIOBUILDER rebuild.
8. Telegram smoke test + final production audit.

## Audit conclusion for this phase
**APPROVED (partial phase)**: stable technical base for evolution without breakage.
@@ -1,24 +0,0 @@
# Phase Continuation Progress

## Newly implemented in this continuation

### 1) Squad auto-creation (cloud asset bundle)
- Module: `src/orchestrator/SquadAutoCreator.ts`
- Generates required 4 artifacts in cloud assets dir:
  1. principal-agent.json
  2. usage-index.json
  3. workflow-initial.json
  4. operational-note.json

### 2) Subagent lifecycle persistence
- Module: `src/orchestrator/SubagentLifecycle.ts`
- Lifecycle states:
  - proposed
  - trial
  - active
  - archived
- Supports create, list, transition with persisted JSON backing store.
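
A create / list / transition store over the four lifecycle states might be sketched as below. The notes name the states but not the allowed moves between them, so the `ALLOWED` table is an assumption, as are the `SubagentStore` class and its method names. Writing to disk is left out.

```typescript
type LifecycleState = "proposed" | "trial" | "active" | "archived";

interface Subagent {
  id: string;
  state: LifecycleState;
}

// Assumed transition table: the notes list the states, not the legal moves.
const ALLOWED: Record<LifecycleState, LifecycleState[]> = {
  proposed: ["trial", "archived"],
  trial: ["active", "archived"],
  active: ["archived"],
  archived: [],
};

class SubagentStore {
  private readonly items = new Map<string, Subagent>();

  create(id: string): Subagent {
    if (this.items.has(id)) throw new Error(`subagent ${id} already exists`);
    const agent: Subagent = { id, state: "proposed" };
    this.items.set(id, agent);
    return agent;
  }

  list(): Subagent[] {
    return Array.from(this.items.values());
  }

  transition(id: string, to: LifecycleState): Subagent {
    const agent = this.items.get(id);
    if (!agent) throw new Error(`unknown subagent ${id}`);
    if (!ALLOWED[agent.state].includes(to)) {
      throw new Error(`illegal transition ${agent.state} -> ${to}`);
    }
    agent.state = to;
    return agent;
  }

  // Serialized form that a JSON backing store could write to disk.
  serialize(): string {
    return JSON.stringify(this.list());
  }
}
```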

## Notes
- Runtime remains backward compatible.
- This is implementation groundwork; wiring into orchestrator flow is next step.
@@ -1,83 +0,0 @@
# OpenLife: performance benchmarks (Story 12.4)

OpenLife ships a lightweight latency benchmark that runs inside `test:all`.
It exists to catch order-of-magnitude regressions on hot paths between
releases, not to do exact numerical profiling.

## What it measures

`src/test_performance_latency.ts` measures P50 / P95 / P99 / mean for three
benchmarks:

| Name | Function under test | Notes |
|---|---|---|
| `intent_classify_cold` | `IntentClassifier.classify(...)` | Cold path through the heuristic dispatcher. |
| `toolset_guard_off` | `isToolsetAllowed('terminal')` | With enforcement OFF (default), should take only a few microseconds. |
| `profile_manager_list` | `ProfileManager.list()` | Real disk read with 5 seeded profiles in a tmp dir. |

Each benchmark runs 5 warm-up iterations + 200–500 timed iterations. Samples
are sorted; percentiles are positional (`p[i] = sorted[floor((n-1) * i)]`,
where `i` is the percentile expressed as a fraction, e.g. 0.95 for P95).
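
The positional percentile rule can be written out as a small helper. This is a sketch of the formula only, not the code in `src/test_performance_latency.ts`; the `percentile` and `summarize` names are assumptions, and the fraction `q` plays the role of `i` in the formula.

```typescript
// Positional percentile: sorted[floor((n - 1) * q)] for a fraction q in [0, 1].
function percentile(samples: number[], q: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b); // numeric sort, input untouched
  return sorted[Math.floor((sorted.length - 1) * q)]!;
}

interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
  mean: number;
}

function summarize(samples: number[]): LatencySummary {
  const total = samples.reduce((sum, value) => sum + value, 0);
  return {
    p50: percentile(samples, 0.5),
    p95: percentile(samples, 0.95),
    p99: percentile(samples, 0.99),
    mean: total / samples.length,
  };
}
```

With 100 samples holding the values 1 to 100, the P95 index is `floor(99 * 0.95) = 94`, which selects the value 95. No interpolation between neighbouring samples takes place.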

## Where the baseline lives

`.artifacts/perf-baseline.json` is autogenerated on first run, then read on
every subsequent run. The repo's `.gitignore` excludes `.artifacts/`, so
each environment (dev machine, CI runner) seeds its own baseline. The CI
job in `.github/workflows/test.yml` uploads the baseline as an artifact so
operators can pull it down for inspection or to lock it across runs.

## Regression threshold

A run fails (process exit code 1) if any benchmark's **P95** is more than
`PERF_REGRESSION_THRESHOLD_PCT` % above the baseline's P95. Default `30`.
Override for a noisier CI runner:

```bash
PERF_REGRESSION_THRESHOLD_PCT=50 npm run test:performance-latency
```

The threshold is intentionally generous because Node's microbenchmark
variance on shared CI runners is non-trivial (~10–20 %). Tighten it
manually if you know your runner is dedicated.
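
The threshold rule amounts to a percentage comparison of two P95 values. The sketch below shows that arithmetic only. The `checkRegression` name and its return shape are assumptions, and the threshold is passed as a parameter here rather than read from `PERF_REGRESSION_THRESHOLD_PCT`.

```typescript
interface RegressionCheck {
  name: string;
  deltaPct: number; // how far the current P95 sits above the baseline, in percent
  regressed: boolean;
}

// A benchmark regresses when its P95 is more than thresholdPct % above baseline.
function checkRegression(
  name: string,
  nowP95: number,
  baselineP95: number,
  thresholdPct = 30,
): RegressionCheck {
  const deltaPct = ((nowP95 - baselineP95) / baselineP95) * 100;
  return { name, deltaPct, regressed: deltaPct > thresholdPct };
}
```

For example, a current P95 of 12.5 ms against a baseline of 7.5 ms is a 66.7 % increase, which exceeds the default 30 % threshold but would pass if the threshold were raised to 70.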

## Refreshing the baseline

After a deliberate change that legitimately changes performance, lock the
new baseline:

```bash
PERF_REFRESH_BASELINE=1 npm run test:performance-latency
```

Commit the resulting `.artifacts/perf-baseline.json` (force-add, since it is
otherwise gitignored) if you want it to travel with the repo. Most teams
just keep it in CI artifact storage and re-seed locally as needed.

## Reading the output

```
[perf] ✓ intent_classify_cold P95 now=0.002ms vs baseline=0.002ms (+0.0%)
[perf] ✓ toolset_guard_off P95 now=0.001ms vs baseline=0.001ms (+0.0%)
[perf] ❌ profile_manager_list P95 now=12.500ms vs baseline=7.500ms (+66.7%)
[perf] REGRESSION DETECTED:
  • profile_manager_list: P95 +66.7% > threshold 30%
```

`✓` means within budget. `❌` flags a regression and exits non-zero.

## Why not a CLI-spawn benchmark?

The plan called for one (P50/P95/P99 of `openlife --help`, `workflow list`,
etc.). That was discarded for three reasons:

1. **Variance.** Spawning a fresh `node` process adds ~50 ms of cold-start
   noise that swamps everything else.
2. **Suite budget.** `test:all` already runs ~4 minutes. Adding a
   spawn-benchmark would push it over the 5-minute target.
3. **Signal/noise.** The hot paths we actually care about (Brain, intent
   routing, toolset checks) are well-isolated library functions. Measuring
   them directly gives more actionable telemetry.

A spawn-based smoke test of the CLI surface still exists (see
`src/test_cli_diagnostics.ts`), but it is correctness-focused, not
latency-focused.
@@ -1,157 +0,0 @@
# v1.3 Capability Genesis Engine: Planning Note

**Status:** Backlog. To be opened as a formal epic after v1.2 (`feat/v1.2-royal-stack`) merges.

**Source:** Hermes (Obsidian-stored knowledge agent), 2026-05-12 conversational spec.

---

## Why this exists

The current v1.2 plan defines Workflow + Squad + Skill creators and the Workflow engine. The Hermes spec **goes deeper**:

- Workflow is not just an executable YAML; it is a **living contract of execution** with 18 mandatory components (identity, objective, triggers, context, mode, agents, squads, skills, tools allowed/restricted/forbidden, policies, steps, decisions, checkpoints, validation, evidence, recovery, audit, learning).
- Auto-creation is the **evolutionary mechanism**: the system detects capability gaps and converts experience into new workflows, skills, agents, squads automatically.
- The atomic unit of evolution is a **Capability Pack**: a self-contained bundle of workflow + squad + agents + skills + checklists + templates + policies + tests.

This will be the v1.3 epic. v1.2 ships the foundation (engine, creators, atomic writes, locks, checkpoints, heartbeat); v1.3 ships the genesis loop.

---

## v1.3 Epic Sketch: "Capability Genesis Engine"

### Epic Goal
Transform OpenLife from a "library of workflows" into a system that **builds its own library** as it operates.

### Pillar 1: Workflow Schema Evolution (extends v1.2 WorkflowSchema)

New mandatory fields in `WorkflowSchema.ts`:
- `mode: task | service` (currently implicit)
- `triggers: [user_command | event | schedule | recurrence | failure_signal]`
- `context_required: [paths or env keys]`
- `tools: { allowed, restricted, forbidden }`
- `policies: { require_human_approval, block_if, on_destructive }`
- `validation: { quality_gates }` (not just `creates`)
- `evidence: { required_artifacts, success_criteria }`
- `failure_handling: { on_test_failure, on_deploy_failure, on_policy_block, retry, fallback, rollback }`
- `audit: { emit_events, ledger_path }`
- `learning: { update_memory, propose_skill_patch, propose_workflow_patch }`
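
As a rough illustration, the new fields could be typed and checked for presence as follows. The field names come from the list above, but every value type is an assumption (the note does not define them), and `missingFields` is a hypothetical helper rather than part of `WorkflowSchema.ts`.

```typescript
type Trigger = "user_command" | "event" | "schedule" | "recurrence" | "failure_signal";

// Field names follow the planning note; the value types are assumptions.
interface WorkflowV13Fields {
  mode: "task" | "service";
  triggers: Trigger[];
  context_required: string[];
  tools: { allowed: string[]; restricted: string[]; forbidden: string[] };
  policies: Record<string, unknown>;
  validation: { quality_gates: unknown[] };
  evidence: { required_artifacts: string[]; success_criteria: string[] };
  failure_handling: Record<string, unknown>;
  audit: { emit_events: boolean; ledger_path: string };
  learning: Record<string, unknown>;
}

const REQUIRED_FIELDS: (keyof WorkflowV13Fields)[] = [
  "mode",
  "triggers",
  "context_required",
  "tools",
  "policies",
  "validation",
  "evidence",
  "failure_handling",
  "audit",
  "learning",
];

// Report which mandatory fields a parsed workflow document is missing.
function missingFields(doc: Record<string, unknown>): string[] {
  return REQUIRED_FIELDS.filter((field) => !(field in doc));
}
```

A presence check like this only confirms that the keys exist; validating the nested values would need a fuller schema.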

### Pillar 2: Capability Pack as canonical bundle

New artifact type:
```
deep-research.capability/
├── workflow.yaml
├── squad.yaml
├── agents/*.yaml
├── skills/*.md
├── checklists/*.md
├── templates/*.md
├── policies.yaml
├── tests/*.ts
└── pack.json
```

CLI:
- `openlife capability list`
- `openlife capability show <id>`
- `openlife capability publish <id>` (promote from draft → tested → active)
- `openlife capability install <pack.tar.gz>`

### Pillar 3: Capability Genesis Engine (auto-creation)

Components:
- `WorkflowCreator`: already planned in v1.2 (extend in v1.3)
- `SquadCreator`: already planned
- `SkillCreator`: already planned
- `AgentCreator`: new
- `ChecklistCreator`: new
- `TemplateCreator`: new
- `CapabilityValidator`: checks pack integrity
- `CatalogPublisher`: publishes to `.catalog/` with status lifecycle

Conversational UX:
```
openlife create capability "pesquisa profunda profissional"
```
→ AI detects this needs workflow + squad + 7 agents + 5 skills + 1 checklist + 1 template
→ Asks ~10 calibration questions (fast / professional / elite mode)
→ Generates the whole Capability Pack as `draft` status
→ Validates internal consistency
→ Returns to operator for review or executes in experimental mode

### Pillar 4: Genesis Loop (`Task Genesis Loop`)

```
User request → classify intent → lookup workflow → if exists, run
→ if missing or insufficient:
  → Capability Genesis
  → create draft pack
  → execute draft
  → validate
  → if quality ≥ threshold and user permits:
    promote draft → active
```

States:
- `draft`: AI-generated, not yet used by router
- `tested`: passed at least one execution + validation
- `active`: router may use without prompt
- `deprecated`: replaced
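
One way to read the promotion rule in the genesis loop is sketched below. It is an interpretation, not the planned implementation: a validated run moves a `draft` to `tested`, and the pack becomes `active` only when quality meets the threshold and the user permits it. The `RunResult`, `PromotionPolicy`, `nextStatus`, and `routerMayUse` names are hypothetical.

```typescript
type PackStatus = "draft" | "tested" | "active" | "deprecated";

interface RunResult {
  validated: boolean;
  quality: number;
}

interface PromotionPolicy {
  threshold: number;
  userPermits: boolean;
}

// Assumed reading of the loop: validation earns "tested"; quality at or above
// the threshold plus user permission earns "active".
function nextStatus(
  status: PackStatus,
  run: RunResult,
  policy: PromotionPolicy,
): PackStatus {
  if (status === "active" || status === "deprecated") return status;
  if (!run.validated) return status; // a failed run never promotes
  const promote = run.quality >= policy.threshold && policy.userPermits;
  return promote ? "active" : "tested";
}

// Only active packs may be picked by the router without a prompt.
function routerMayUse(status: PackStatus): boolean {
  return status === "active";
}
```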

### Pillar 5: Service Mode (long-running operations)

Distinguish missions:
- **Task mode** (default): finite delivery, completes when validated
- **Service mode**: continuous, has KPIs/events/health, never "completes"

Examples:
- Task: "Fix this bug" → ends when PR merges
- Service: "Monitor this Telegram channel" → indefinite, with degraded states

Engine: `WorkflowEngine` supports both. Service mode promotes lifecycle to `ServiceState` (already exists, now formalized as workflow target).

### Pillar 6: Learning Loop

After every execution:
- What worked? What failed?
- Should a new skill be created? Workflow updated? Agent improved?
- Any new risks discovered?
- User preferences learned?

Outputs:
- `learning_proposals.jsonl` event log
- `*-skill-patch.draft.md` candidates
- `*-workflow-patch.draft.yaml` candidates

User reviews proposals, promotes to active.

---

## Canonical definition (from Hermes, to be added to root docs after v1.2)

> In OpenLife, a workflow is an executable, governed, auditable, and evolving operational specification that turns an intent into a validated result. It coordinates agents, squads, skills, tools, memory, policies, checkpoints, decisions, validations, evidence, and recovery mechanisms to execute tasks or services in a way that is safe, reusable, and continuously improvable.
>
> Beyond organizing execution, a workflow takes part in the evolution of the system itself: when it identifies gaps, recurrences, or new work patterns, it can trigger the auto-creation or update of skills, agents, squads, checklists, templates, and other workflows, converting operational learning into permanent OpenLife capability.

---

## Mapping to v1.2 work

| v1.2 deliverable | v1.3 extension |
|---|---|
| `WorkflowSchema` (Story 4.1) | Extend with 10 new mandatory fields (mode/triggers/context/tools/policies/validation/evidence/failure_handling/audit/learning) |
| `WorkflowEngine` (Story 4.2) | Add tool-permission enforcement, policy blocks, learning hooks, service mode |
| `SquadCreator` (Story 5.1) | Add to Capability Genesis pipeline as one of N creators |
| `SkillCreator` (Story 5.2) | Same |
| `AIOBuilder CLI` (Story 5.4) | `openlife create capability` becomes the universal entry; sub-creators are internal |
| `dist-templates/workflows/` (Story 4.4) | Become "starter Capability Packs" not just standalone workflows |

---

## Status

- v1.2 finishes the foundation (target: ~14-18 days from 2026-05-12).
- v1.3 epic to be opened immediately after v1.2 merges to `main`.
- This document is the seed; the v1.3 plan will refine into ~25-30 stories across 6 pillars.
@@ -1,84 +0,0 @@
# OpenLife Professional Interface: Dark Premium Plan

## Objective
Evolve from MVP to an enterprise-grade interface for operating autonomous services by domain, with a dark premium aesthetic, high information density, and operational (NOC-style) UX.

## Design Principles
- Real dark premium: high contrast, layered depth, strong typography, discreet micro-interactions.
- "Operator-first": less marketing, more control/state/action.
- Everything traceable: every decision visible via timeline and tool-trace.
- Security by default: RBAC, per-tenant scope, auditing.

## Frontend Architecture (professional phase)
- Stack: Next.js + TypeScript + Tailwind + shadcn/ui + TanStack Query.
- Design tokens: theme via CSS variables (color, spacing, radius, motion).
- State: server-state with cache and optimistic updates.
- UI observability: Sentry + OpenTelemetry web.

## AI/Operations Screens (professional v1)
1. Command Center
   - Global health (tenants, active services, incidents, cost/h).
   - Real-time event stream.
   - Alert rail with severity.

2. Agent Fleet
   - List/grid of agents by domain and tenant.
   - Start/stop/pause, fallback chain, SLA profile.
   - Runtime trace (tool labels + duration + artifact links).

3. Service Templates Marketplace
   - Catalog by domain (legal, health, sales, social, support).
   - Install/provision wizard.
   - Versioning, changelog, compatibility.

4. Mission Console
   - Conversation with the agent + execution panel side by side.
   - Step tree (planner/executor/reviewer/synthesizer).
   - Evidence: files, logs, audit trail, and cost.

5. Governance & Security
   - Policies, consents, destructive-guard approvals.
   - ACL matrix by role and tenant.
   - Filterable/exportable audit log.

## Dark Premium Design System
- Base palette: graphite/navy + electric accent (indigo/cyan).
- Tokens: bg/base/surface/elevated, border subtle/strong, semantic success/warn/error/info.
- Typography: Inter + JetBrains Mono (telemetry).
- Motion: 120–180ms, smooth easing, no excessive animation.
- Components: dense cards, virtualized tables, timeline, command palette, split pane.

## Roadmap
### Sprint A (7-10 days)
- Next.js structure + auth + dark premium layout shell.
- Screens: Teams/Networks/Status integrated with the existing API.
- Basic RBAC and admin session.

### Sprint B (10-14 days)
- Agent Fleet + Mission Console (state stream).
- Timeline of tool-trace and artifacts.
- Filters and global search.

### Sprint C (10-14 days)
- Marketplace templates + guided provisioning.
- Billing preview + visual SLA.
- Full audit and export.

## Acceptance Criteria
- Navigation P95 < 250ms on a warm screen.
- 100% of critical actions logged/audited.
- Complete flow: acquire template -> provision service -> execute mission -> document delivery with evidence.
- Consistent dark premium UX across all screens.

## Technical Dependencies
- Additional endpoints: /admin/agents, /admin/missions, /admin/events, /admin/templates.
- SSE/WebSocket for the operational stream.
- Event persistence in a dedicated store (trace store).

## Risks and Mitigation
- Risk: becoming a pretty dashboard with no real operation.
  Mitigation: prioritize Mission Console + actionable actions.
- Risk: telemetry cost.
  Mitigation: sampling + retention by tenant/policy.
- Risk: multi-tenant complexity.
  Mitigation: workspace isolation from the start.