role-os 1.9.0 → 2.0.0

package/CHANGELOG.md CHANGED
@@ -1,297 +1,332 @@
1
- # Changelog
2
-
3
- ## 1.9.0
4
-
5
- ### Added
6
-
7
- #### Unified Entry Path (Phase T)
8
- - `roleos start <task>` — auto-decides mission vs pack vs free routing
9
- - Three-level fallback ladder with confidence scores and alternatives
10
- - Composite task detection warns when a task should be decomposed
11
- - `--json` flag for machine-readable entry decisions
12
- - 46 new tests: entry engine, comparison trials, CLI integration
13
-
14
- #### Handbook Updates
15
- - New Missions handbook page with full mission documentation
16
- - Updated Getting Started to lead with `roleos start`
17
- - Updated Reference with all CLI commands (start, mission, packs, artifacts, status, doctor)
18
- - Updated handbook index with entry levels and 9 operating layers
19
-
20
- #### README Overhaul
21
- - "How it works" section leads with `roleos start` examples
22
- - Quick Start updated with mission and start commands
23
- - Added 6 Missions table
24
- - Updated project structure with all 18 source modules
25
- - Updated status history through v1.9.0
26
-
27
- ### Evidence
28
- - 527 tests, zero failures (46 new)
29
- - Entry path trials validated against 20+ real task descriptions
30
- - Fallback ladder tested: mission, pack, free-routing, composite, empty input
31
-
32
- ## 1.8.0
33
-
34
- ### Added
35
-
36
- #### Mission Library (Phase S Mission Hardening)
37
- - 6 named, repeatable mission types: feature-ship, bugfix, treatment, docs-release, security-hardening, research-launch
38
- - Each mission declares: pack, role chain, artifact flow, escalation branches, honest-partial definition, stop conditions, dispatch defaults, trial evidence
39
- - Mission runner: create → step through → complete/fail → generate completion report
40
- - Completion proof reporter with honest-partial and formatted text output
41
- - `roleos mission list` — list all missions
42
- - `roleos mission show <key>` — full mission detail
43
- - `roleos mission suggest <text>` — signal-based mission suggestion
44
- - `roleos mission validate [key]` — validate mission wiring against packs/roles
45
-
46
- #### Mission Runner Engine
47
- - `createRun()` — instantiate a mission with tracked steps
48
- - `startNextStep()` / `completeStep()` / `failStep()` — step lifecycle
49
- - `recordEscalation()` — re-opens completed steps on escalation loops
50
- - `getRunPosition()` / `getArtifactChain()` — run introspection
51
- - `generateCompletionReport()` / `formatCompletionReport()` — honest outcome reporting
52
-
53
- ### Evidence
54
- - 465 tests, zero failures (67 new)
55
- - All 6 missions validate against live pack/role catalog
56
- - Full lifecycle tests: end-to-end runs, escalation loops, partial completions, failure reporting
57
-
58
- ## 1.7.0
59
-
60
- ### Added
61
-
62
- #### Completion Proof (Phase R)
63
- - `roleos artifacts` CLI command: list, show, validate, chain subcommands
64
- - 13 new CLI integration tests for artifact inspection
65
- - Real task completion missions through the full stack
66
-
67
- #### Completion Proof Evidence
68
- - R1-1 Feature mission: `roleos artifacts` command shipped through feature pack
69
- - Pack: feature (high confidence, correct)
70
- - Chain: 5 roles, 0 escalations, 1 minor correction
71
- - Artifact contracts: all 4 used and valid
72
- - R1-2 Bugfix mission: README.zh.md npm anomaly
73
- - Diagnosed correctly: npm auto-includes README* regardless of files field
74
- - Escalated honestly: fix requires structural decision (translation file organization)
75
- - Not force-closed: deferred to treatment pass
76
-
77
- ### Evidence
78
- - 398 tests, zero failures
79
- - 3 missions run through the full stack
80
- - Completion metrics recorded per mission
81
-
82
- ## 1.6.0
83
-
84
- ### Added
85
-
86
- #### Artifact Spine (Phase Q)
87
- - 20 per-role artifact contracts: each defines artifact type, required sections, evidence references, downstream consumers, and completion rules
88
- - `validateArtifact(role, content)` — structural validation against role contracts (missing sections, evidence references, content depth)
89
- - 7 pack-level handoff contracts: define the expected artifact flow between steps for each pack (e.g., strategy-brief → implementation-spec → change-plan → test-package → verdict)
90
- - `validatePackChain(pack, artifacts)` — validates an entire pack's artifact chain for completeness
91
- - `getArtifactContract(role)` / `getHandoffContract(pack)` — lookup APIs
92
- - `formatArtifactValidation()` / `formatPackChain()` — display formatters
93
-
94
- #### Artifact contract coverage
95
- - Product Strategist → strategy-brief (problem-framing, scope, non-goals, tradeoffs)
96
- - Spec Writer → implementation-spec (acceptance-criteria, edge-cases, interface-spec)
97
- - Backend/Frontend Engineer → change-plan (files-to-change, implementation-approach, risk-notes)
98
- - Test Engineer → test-package (test-plan, test-cases, false-confidence-assessment)
99
- - Security Reviewer → security-findings (findings, severity-assessment, recommendations)
100
- - Critic Reviewer → verdict (verdict, evidence, required-corrections)
101
- - And 14 more roles with full contracts
102
-
103
- ### Evidence
104
- - 385 tests, zero failures
105
- - 27 new artifact tests
106
-
107
- ## 1.5.0
108
-
109
- ### Added
110
-
111
- #### Hook Spine / Runtime Enforcement (Phase R)
112
- - 5 lifecycle hooks: SessionStart, UserPromptSubmit, PreToolUse, SubagentStart, Stop
113
- - `scaffoldHooks()` generates all 5 hook scripts in .claude/hooks/
114
- - `roleos init claude` now scaffolds hooks + settings.local.json with hook config
115
- - `roleos doctor` now checks for hook scripts (check 7) and settings hooks (check 8)
116
-
117
- #### SessionStart hook
118
- - Establishes session contract on every new session
119
- - Records session ID, timestamp, initializes state tracking
120
- - Adds context reminding Claude to use /roleos-route for non-trivial tasks
121
-
122
- #### UserPromptSubmit hook
123
- - Classifies prompts as substantial (>50 chars + action verbs)
124
- - After 2+ substantial prompts without a route card, adds context reminder
125
- - Does not block — advisory enforcement
126
-
127
- #### PreToolUse hook
128
- - Records all tool usage in session state
129
- - Flags write tools (Bash, Write, Edit) used without route card after substantial work
130
- - Advisory, not blocking — preserves operator control
131
-
132
- #### SubagentStart hook
133
- - Injects active role contract into delegated agents
134
- - Ensures subagents inherit the Role OS session context
135
-
136
- #### Stop hook
137
- - Warns when substantial sessions end without route card or outcome artifact
138
- - Advisory — does not block session exit
139
- - Trivial sessions (< 2 substantial prompts) are exempt
140
-
141
- ### Evidence
142
- - 358 tests, zero failures
143
- - 23 new hook tests covering all 5 lifecycle hooks
144
-
145
- ## 1.4.0
146
-
147
- ### Added
148
-
149
- #### Session Spine (Phase Q)
150
- - `roleos init claude` scaffolds Claude Code integration: CLAUDE.md instructions, /roleos-route + /roleos-review + /roleos-status slash commands
151
- - `roleos doctor` — verifies repo is correctly wired for Role OS sessions (6 checks: .claude/ dir, CLAUDE.md section, /roleos-route command, context files, role contracts, packets)
152
- - Route card generation — session header artifact proving Role OS was engaged (task type, pack, confidence, composite status, success artifact)
153
- - CLAUDE.md template instructs Claude to route through Role OS before non-trivial work
154
- - /roleos-route command produces structured route cards
155
- - /roleos-review command guides structured verdict production
156
- - /roleos-status command shows active work and context health
157
- - Appends to existing CLAUDE.md without overwriting (detects Role OS section)
158
- - --force flag overwrites existing command files
159
-
160
- ### Evidence
161
- - 335 tests, zero failures
162
-
163
- ## 1.3.0
164
-
165
- ### Added
166
-
167
- #### Outcome Calibration (Phase M)
168
- - Run outcome ledger — append-only JSONL recording pack selection, confidence, overrides, escalations, corrections, completion status
169
- - `computeCalibration()` — pack usage rates, high-confidence accuracy, operator override rates, per-pack performance
170
- - `computePackBoosts()` — weight tuning from clean completed runs (+0.5/run, capped at 2.0)
171
- - `computeConfidenceAdjustment()` — raises threshold when high-confidence is often overridden, lowers when medium is often accepted
172
- - Auto-generated calibration suggestions when metrics drift
173
- - Safety constraint: calibration never overrides mismatch guards, conflict rules, escalation honesty, or evidence requirements
174
-
175
- #### Mixed-Task Decomposition (Phase N)
176
- - `detectComposite()` — 7 subtask categories (build, bugfix, security, docs, research, launch, treatment) with signal-based detection
177
- - Structural connector detection ("and then", "after that", "plus", "also")
178
- - Confidence levels: high (3+ categories or 2+ with connectors), medium, low
179
- - `decompose()` — generates linked child packets sorted by phase order
180
- - `createRunPlan()` — dependency-aware parent plan with child tracking
181
- - Honest fallback: medium/low confidence shows uncertainty warning with `--no-split` override
182
-
183
- #### Composite Execution (Phase O)
184
- - `initExecution()` / `advance()` — dependency-driven child execution with artifact passing
185
- - 7 artifact contracts defining what each category produces and expects
186
- - Artifact ledger tracking all cross-packet handoffs
187
- - `blockChild()` / `recoverChild()` / `failChild()` — branch recovery with transitive cascade
188
- - `invalidateDownstream()` — resets stale children when upstream changes, removes stale artifacts
189
- - `synthesize()` — truthful parent-level completion report
190
- - Independent branches continue unaffected when a sibling fails
191
-
192
- #### Adaptive Replanning (Phase P)
193
- - 6 structured change event types: scope-change, artifact-changed, new-requirement, review-finding, dependency-discovered, priority-change
194
- - `analyzeImpact()` — identifies valid/stale children, stale artifacts, whether new children or reorder needed
195
- - `replan()` — selective replanning: invalidates only affected branches, inserts new children, updates dependencies
196
- - Plan diff: shows what changed, what stayed valid, what reopened, what was inserted
197
- - Execution resumes from next valid child after replan — no restart required
198
-
199
- ### Evidence
200
- - 317 tests, zero failures
201
- - Calibration, decomposition, composite execution, and replanning each have dedicated test suites
202
-
203
- ## 1.2.0
204
-
205
- ### Added
206
- - Pack auto-selection in `roleos route` — suggests best pack when confidence is high
207
- - `roleos route --pack=<name>` — use a specific pack for routing
208
- - Pack mismatch detection — warns when a pack doesn't fit the task, suggests the correct alternative
209
- - Pack fallback — mismatched or unknown packs fall back to free routing automatically
210
- - `checkPackMismatch()` API with 7 guard sets covering all pack×task-type combinations
211
- - `getPackRoles()` API with conditional Orchestrator support
212
-
213
- ### Changed
214
- - Docs pack: Support Triage Lead now opens (was Feedback Synthesizer). Feedback Synthesizer is second. Release Engineer + Deployment Verifier moved to optional (overhead for docs-only tasks).
215
- - Pack calibration applied from comparison evidence: conditional Orchestrator, Security Reviewer in Treatment, Product Strategist opens Research, mismatch guards on all 7 packs.
216
-
217
- ### Evidence
218
- - Pack comparison: calibrated packs now win or tie 6/7 (was 2/7 pre-calibration)
219
- - Misfit honesty: 0 full bluffs, 0 undetected partial bluffs (was 1 + 3)
220
- - 230 tests, zero failures
221
-
222
- ## 1.1.0
223
-
224
- ### Added
225
-
226
- #### Routing
227
- - Full 31-role catalog — all roles scored by keyword, trigger phrase, packet type bias, and deliverable affinity
228
- - Dynamic chain builder — phase-ordered assembly replacing static templates
229
- - Routing confidence assessment (high/medium/low)
230
- - `excludeWhen` enforcement — roles suppressed when exclusion patterns match packet content
231
- - `detectType` false-positive prevention — "integration testing" no longer triggers integration type
232
- - `--verbose` flag for `roleos route` — hides scoring noise by default
233
-
234
- #### Conflict Detection
235
- - 4-pass conflict engine: hard conflicts, sequence, redundancy, coverage gaps
236
- - Per-role constraint registry: lateOnly, requiresBeforePacks
237
- - Overlap pair detection
238
- - Repair suggestions on every finding
239
-
240
- #### Escalation Auto-Routing
241
- - Blocked/rejected/conflict/split work auto-routes to named resolver
242
- - Every escalation includes: target role, recovery type, required artifact, handoff context
243
-
244
- #### Structured Evidence
245
- - 12 evidence kinds, 4 statuses, closed 4-verdict enum (accept/accept-with-notes/reject/blocked)
246
- - Role-aware evidence requirements for 15 roles
247
- - Sufficiency checks with contradiction detection
248
-
249
- #### Runtime Dispatch
250
- - Execution manifests for multi-claude with per-role tool profiles and budgets
251
- - 8 execution states with auto-advance
252
- - Escalation packet generation for blocked/rejected steps
253
-
254
- #### Proven Team Packs
255
- - 7 battle-tested packs: feature, bugfix, security, docs, launch, research, treatment
256
- - `roleos packs list` — show all packs with role counts
257
- - `roleos packs suggest <packet>` — suggest best pack for a packet
258
- - `roleos packs show <name>` — show pack details (roles, artifacts, stop conditions)
259
- - Pack suggestion engine with confidence levels
260
-
261
- #### Trials
262
- - Full roster proven: 30/30 gold-task trials + 5/5 negative (wrong-task honesty) trials
263
- - 7 pack execution trials — all packs ran full chains with honest Critic verdicts
264
- - Trial framework: buildClusterTrials, evaluateTrialOutput, formatTrialReport
265
-
266
- ### Changed
267
- - 32 → 31 roles: Information Architect merged into Docs Architect
268
- - Verdict vocabulary unified: evidence.mjs now uses accept/reject/blocked (matching review.mjs)
269
- - "worker" terminology replaced with "role" in dispatch.mjs
270
-
271
- ### Fixed
272
- - `excludeWhen` was declared on 14 roles but never enforced — now active in scoreRole
273
- - `detectType` false-positived on "integration testing" — now uses word-boundary regex
274
- - "Not triggered: N roles" noise hidden by default (shown with --verbose)
275
- - Handbook: Team Packs page added, reference sidebar reordered
276
-
277
- ## 1.0.2
278
-
279
- ### Fixed
280
- - Fix double-nested `.claude/.claude/` directory created by `roleos init` — `starter-pack/.claude/workflows/full-treatment.md` moved to `starter-pack/workflows/`
281
- - Read VERSION from `package.json` at runtime instead of hardcoded constant — prevents version drift between CLI and package metadata
282
-
283
- ### Added
284
- - `roleos init --force` — update canonical scaffolded files while always protecting user-filled `context/` files
285
- - 4 regression tests: no double-nesting, correct workflow placement, version sync, --force context protection
286
-
287
- ## 1.0.0
288
-
289
- ### Added
290
- - `roleos init` — scaffold Role OS starter pack into `.claude/`
291
- - `roleos packet new <type>` — create feature, integration, or identity packets
292
- - `roleos route <packet-file>` — recommend smallest valid role chain with dependency verification
293
- - `roleos review <packet-file> <verdict>` — record accept/reject/blocked verdicts
294
- - Full starter pack: 8 role contracts, 3 schemas, 4 policies, 3 workflows
295
- - Guided context templates with inline prompts
296
- - 3 canonical example packets (feature, integration, identity)
297
- - Adoption handbook
1
+ # Changelog
2
+
3
+ ## 2.0.0
4
+
5
+ ### Added
6
+
7
+ #### Operator Friction Pass (Phase U)
8
+ - `roleos run "<task>"` — one command from task description to active execution
9
+ - Persistent disk-backed runs in `.claude/runs/` — survive session interruptions
10
+ - Entry level auto-selection: mission, pack, or free routing with force overrides (`--mission=`, `--pack=`)
11
+ - Step-local operator guidance at every step: role, artifact, required sections, completion rule, stop conditions
12
+ - `roleos resume [id]` — continue interrupted runs from disk
13
+ - `roleos next` — start the next step or show what's active
14
+ - `roleos explain [id]` — full run state with guidance, escalations, interventions
15
+ - `roleos complete <artifact> [note]` — complete the active step with artifact reference
16
+ - `roleos fail <partial|failed> <reason>` — fail with honest downstream blocking
17
+ - `roleos run list` — list all runs with status icons
18
+ - `roleos run show <id>` — full run detail
19
+
20
+ #### Intervention Shortcuts
21
+ - `roleos retry <step>` — retry a failed/partial step, unblock downstream
22
+ - `roleos reroute <step> <role> <reason>` — swap a step to a different role
23
+ - `roleos escalate <from> <to> <trigger> <action>` — escalate between roles with step re-opening
24
+ - `roleos block <step> <reason>` — manually block a step
25
+ - `roleos reopen <step> <reason>` — reopen a completed step for re-execution
26
+
27
+ #### Friction Measurement
28
+ - `roleos report [id]` — generate completion report with honest-partial
29
+ - `roleos friction [id]` — measure operator touches: interventions, escalations, manual steps
30
+ - Friction score: low/medium/high based on touch count vs step count
31
+
32
+ ### Evidence
33
+ - 613 tests, zero failures (86 new)
34
+ - 6 friction trials validated: clean run, reroute, retry, pack-level, free-routing, disk resume
35
+ - All entry levels produce low/medium friction scores
36
+ - Disk round-trip verified: create → pause → load → resume → complete
37
+
38
+ ## 1.9.0
39
+
40
+ ### Added
41
+
42
+ #### Unified Entry Path (Phase T)
43
+ - `roleos start <task>` — auto-decides mission vs pack vs free routing
44
+ - Three-level fallback ladder with confidence scores and alternatives
45
+ - Composite task detection warns when a task should be decomposed
46
+ - `--json` flag for machine-readable entry decisions
47
+ - 46 new tests: entry engine, comparison trials, CLI integration
48
+
49
+ #### Handbook Updates
50
+ - New Missions handbook page with full mission documentation
51
+ - Updated Getting Started to lead with `roleos start`
52
+ - Updated Reference with all CLI commands (start, mission, packs, artifacts, status, doctor)
53
+ - Updated handbook index with entry levels and 9 operating layers
54
+
55
+ #### README Overhaul
56
+ - "How it works" section leads with `roleos start` examples
57
+ - Quick Start updated with mission and start commands
58
+ - Added 6 Missions table
59
+ - Updated project structure with all 18 source modules
60
+ - Updated status history through v1.9.0
61
+
62
+ ### Evidence
63
+ - 527 tests, zero failures (46 new)
64
+ - Entry path trials validated against 20+ real task descriptions
65
+ - Fallback ladder tested: mission, pack, free-routing, composite, empty input
66
+
67
+ ## 1.8.0
68
+
69
+ ### Added
70
+
71
+ #### Mission Library (Phase S Mission Hardening)
72
+ - 6 named, repeatable mission types: feature-ship, bugfix, treatment, docs-release, security-hardening, research-launch
73
+ - Each mission declares: pack, role chain, artifact flow, escalation branches, honest-partial definition, stop conditions, dispatch defaults, trial evidence
74
+ - Mission runner: create → step through → complete/fail → generate completion report
75
+ - Completion proof reporter with honest-partial and formatted text output
76
+ - `roleos mission list` — list all missions
77
+ - `roleos mission show <key>` — full mission detail
78
+ - `roleos mission suggest <text>` — signal-based mission suggestion
79
+ - `roleos mission validate [key]` — validate mission wiring against packs/roles
80
+
81
+ #### Mission Runner Engine
82
+ - `createRun()` — instantiate a mission with tracked steps
83
+ - `startNextStep()` / `completeStep()` / `failStep()` — step lifecycle
84
+ - `recordEscalation()` — re-opens completed steps on escalation loops
85
+ - `getRunPosition()` / `getArtifactChain()` — run introspection
86
+ - `generateCompletionReport()` / `formatCompletionReport()` — honest outcome reporting
87
+
88
+ ### Evidence
89
+ - 465 tests, zero failures (67 new)
90
+ - All 6 missions validate against live pack/role catalog
91
+ - Full lifecycle tests: end-to-end runs, escalation loops, partial completions, failure reporting
92
+
93
+ ## 1.7.0
94
+
95
+ ### Added
96
+
97
+ #### Completion Proof (Phase R)
98
+ - `roleos artifacts` CLI command: list, show, validate, chain subcommands
99
+ - 13 new CLI integration tests for artifact inspection
100
+ - Real task completion missions through the full stack
101
+
102
+ #### Completion Proof Evidence
103
+ - R1-1 Feature mission: `roleos artifacts` command shipped through feature pack
104
+ - Pack: feature (high confidence, correct)
105
+ - Chain: 5 roles, 0 escalations, 1 minor correction
106
+ - Artifact contracts: all 4 used and valid
107
+ - R1-2 Bugfix mission: README.zh.md npm anomaly
108
+ - Diagnosed correctly: npm auto-includes README* regardless of files field
109
+ - Escalated honestly: fix requires structural decision (translation file organization)
110
+ - Not force-closed: deferred to treatment pass
111
+
112
+ ### Evidence
113
+ - 398 tests, zero failures
114
+ - 3 missions run through the full stack
115
+ - Completion metrics recorded per mission
116
+
117
+ ## 1.6.0
118
+
119
+ ### Added
120
+
121
+ #### Artifact Spine (Phase Q)
122
+ - 20 per-role artifact contracts: each defines artifact type, required sections, evidence references, downstream consumers, and completion rules
123
+ - `validateArtifact(role, content)` — structural validation against role contracts (missing sections, evidence references, content depth)
124
+ - 7 pack-level handoff contracts: define the expected artifact flow between steps for each pack (e.g., strategy-brief → implementation-spec → change-plan → test-package → verdict)
125
+ - `validatePackChain(pack, artifacts)` — validates an entire pack's artifact chain for completeness
126
+ - `getArtifactContract(role)` / `getHandoffContract(pack)` — lookup APIs
127
+ - `formatArtifactValidation()` / `formatPackChain()` — display formatters
128
+
129
+ #### Artifact contract coverage
130
+ - Product Strategist → strategy-brief (problem-framing, scope, non-goals, tradeoffs)
131
+ - Spec Writer → implementation-spec (acceptance-criteria, edge-cases, interface-spec)
132
+ - Backend/Frontend Engineer → change-plan (files-to-change, implementation-approach, risk-notes)
133
+ - Test Engineer → test-package (test-plan, test-cases, false-confidence-assessment)
134
+ - Security Reviewer → security-findings (findings, severity-assessment, recommendations)
135
+ - Critic Reviewer → verdict (verdict, evidence, required-corrections)
136
+ - And 14 more roles with full contracts
137
+
138
+ ### Evidence
139
+ - 385 tests, zero failures
140
+ - 27 new artifact tests
141
+
142
+ ## 1.5.0
143
+
144
+ ### Added
145
+
146
+ #### Hook Spine / Runtime Enforcement (Phase R)
147
+ - 5 lifecycle hooks: SessionStart, UserPromptSubmit, PreToolUse, SubagentStart, Stop
148
+ - `scaffoldHooks()` generates all 5 hook scripts in .claude/hooks/
149
+ - `roleos init claude` now scaffolds hooks + settings.local.json with hook config
150
+ - `roleos doctor` now checks for hook scripts (check 7) and settings hooks (check 8)
151
+
152
+ #### SessionStart hook
153
+ - Establishes session contract on every new session
154
+ - Records session ID, timestamp, initializes state tracking
155
+ - Adds context reminding Claude to use /roleos-route for non-trivial tasks
156
+
157
+ #### UserPromptSubmit hook
158
+ - Classifies prompts as substantial (>50 chars + action verbs)
159
+ - After 2+ substantial prompts without a route card, adds context reminder
160
+ - Does not block — advisory enforcement
161
+
162
+ #### PreToolUse hook
163
+ - Records all tool usage in session state
164
+ - Flags write tools (Bash, Write, Edit) used without route card after substantial work
165
+ - Advisory, not blocking — preserves operator control
166
+
167
+ #### SubagentStart hook
168
+ - Injects active role contract into delegated agents
169
+ - Ensures subagents inherit the Role OS session context
170
+
171
+ #### Stop hook
172
+ - Warns when substantial sessions end without route card or outcome artifact
173
+ - Advisory — does not block session exit
174
+ - Trivial sessions (< 2 substantial prompts) are exempt
175
+
176
+ ### Evidence
177
+ - 358 tests, zero failures
178
+ - 23 new hook tests covering all 5 lifecycle hooks
179
+
180
+ ## 1.4.0
181
+
182
+ ### Added
183
+
184
+ #### Session Spine (Phase Q)
185
+ - `roleos init claude` scaffolds Claude Code integration: CLAUDE.md instructions, /roleos-route + /roleos-review + /roleos-status slash commands
186
+ - `roleos doctor` — verifies repo is correctly wired for Role OS sessions (6 checks: .claude/ dir, CLAUDE.md section, /roleos-route command, context files, role contracts, packets)
187
+ - Route card generation — session header artifact proving Role OS was engaged (task type, pack, confidence, composite status, success artifact)
188
+ - CLAUDE.md template instructs Claude to route through Role OS before non-trivial work
189
+ - /roleos-route command produces structured route cards
190
+ - /roleos-review command guides structured verdict production
191
+ - /roleos-status command shows active work and context health
192
+ - Appends to existing CLAUDE.md without overwriting (detects Role OS section)
193
+ - --force flag overwrites existing command files
194
+
195
+ ### Evidence
196
+ - 335 tests, zero failures
197
+
198
+ ## 1.3.0
199
+
200
+ ### Added
201
+
202
+ #### Outcome Calibration (Phase M)
203
+ - Run outcome ledger — append-only JSONL recording pack selection, confidence, overrides, escalations, corrections, completion status
204
+ - `computeCalibration()` — pack usage rates, high-confidence accuracy, operator override rates, per-pack performance
205
+ - `computePackBoosts()` — weight tuning from clean completed runs (+0.5/run, capped at 2.0)
206
+ - `computeConfidenceAdjustment()` — raises threshold when high-confidence is often overridden, lowers when medium is often accepted
207
+ - Auto-generated calibration suggestions when metrics drift
208
+ - Safety constraint: calibration never overrides mismatch guards, conflict rules, escalation honesty, or evidence requirements
209
+
210
+ #### Mixed-Task Decomposition (Phase N)
211
+ - `detectComposite()` — 7 subtask categories (build, bugfix, security, docs, research, launch, treatment) with signal-based detection
212
+ - Structural connector detection ("and then", "after that", "plus", "also")
213
+ - Confidence levels: high (3+ categories or 2+ with connectors), medium, low
214
+ - `decompose()` — generates linked child packets sorted by phase order
215
+ - `createRunPlan()` — dependency-aware parent plan with child tracking
216
+ - Honest fallback: medium/low confidence shows uncertainty warning with `--no-split` override
217
+
218
+ #### Composite Execution (Phase O)
219
+ - `initExecution()` / `advance()` — dependency-driven child execution with artifact passing
220
+ - 7 artifact contracts defining what each category produces and expects
221
+ - Artifact ledger tracking all cross-packet handoffs
222
+ - `blockChild()` / `recoverChild()` / `failChild()` — branch recovery with transitive cascade
223
+ - `invalidateDownstream()` — resets stale children when upstream changes, removes stale artifacts
224
+ - `synthesize()` — truthful parent-level completion report
225
+ - Independent branches continue unaffected when a sibling fails
226
+
227
+ #### Adaptive Replanning (Phase P)
228
+ - 6 structured change event types: scope-change, artifact-changed, new-requirement, review-finding, dependency-discovered, priority-change
229
+ - `analyzeImpact()` — identifies valid/stale children, stale artifacts, whether new children or reorder needed
230
+ - `replan()` — selective replanning: invalidates only affected branches, inserts new children, updates dependencies
231
+ - Plan diff: shows what changed, what stayed valid, what reopened, what was inserted
232
+ - Execution resumes from next valid child after replan — no restart required
233
+
234
+ ### Evidence
235
+ - 317 tests, zero failures
236
+ - Calibration, decomposition, composite execution, and replanning each have dedicated test suites
237
+
238
+ ## 1.2.0
239
+
240
+ ### Added
241
+ - Pack auto-selection in `roleos route` — suggests best pack when confidence is high
242
+ - `roleos route --pack=<name>` — use a specific pack for routing
243
+ - Pack mismatch detection — warns when a pack doesn't fit the task, suggests the correct alternative
244
+ - Pack fallback — mismatched or unknown packs fall back to free routing automatically
245
+ - `checkPackMismatch()` API with 7 guard sets covering all pack×task-type combinations
246
+ - `getPackRoles()` API with conditional Orchestrator support
247
+
248
+ ### Changed
249
+ - Docs pack: Support Triage Lead now opens (was Feedback Synthesizer). Feedback Synthesizer is second. Release Engineer + Deployment Verifier moved to optional (overhead for docs-only tasks).
250
+ - Pack calibration applied from comparison evidence: conditional Orchestrator, Security Reviewer in Treatment, Product Strategist opens Research, mismatch guards on all 7 packs.
251
+
252
+ ### Evidence
253
+ - Pack comparison: calibrated packs now win or tie 6/7 (was 2/7 pre-calibration)
254
+ - Misfit honesty: 0 full bluffs, 0 undetected partial bluffs (was 1 + 3)
255
+ - 230 tests, zero failures
256
+
257
+ ## 1.1.0
258
+
259
+ ### Added
260
+
261
+ #### Routing
262
+ - Full 31-role catalog — all roles scored by keyword, trigger phrase, packet type bias, and deliverable affinity
263
+ - Dynamic chain builder — phase-ordered assembly replacing static templates
264
+ - Routing confidence assessment (high/medium/low)
265
+ - `excludeWhen` enforcement — roles suppressed when exclusion patterns match packet content
266
+ - `detectType` false-positive prevention — "integration testing" no longer triggers integration type
267
+ - `--verbose` flag for `roleos route` — hides scoring noise by default
268
+
269
+ #### Conflict Detection
270
+ - 4-pass conflict engine: hard conflicts, sequence, redundancy, coverage gaps
271
+ - Per-role constraint registry: lateOnly, requiresBeforePacks
272
+ - Overlap pair detection
273
+ - Repair suggestions on every finding
274
+
275
+ #### Escalation Auto-Routing
276
+ - Blocked/rejected/conflict/split work auto-routes to named resolver
277
+ - Every escalation includes: target role, recovery type, required artifact, handoff context
278
+
279
+ #### Structured Evidence
280
+ - 12 evidence kinds, 4 statuses, closed 4-verdict enum (accept/accept-with-notes/reject/blocked)
281
+ - Role-aware evidence requirements for 15 roles
282
+ - Sufficiency checks with contradiction detection
283
+
284
+ #### Runtime Dispatch
285
+ - Execution manifests for multi-claude with per-role tool profiles and budgets
286
+ - 8 execution states with auto-advance
287
+ - Escalation packet generation for blocked/rejected steps
288
+
289
+ #### Proven Team Packs
290
+ - 7 battle-tested packs: feature, bugfix, security, docs, launch, research, treatment
291
+ - `roleos packs list` — show all packs with role counts
292
+ - `roleos packs suggest <packet>` — suggest best pack for a packet
293
+ - `roleos packs show <name>` — show pack details (roles, artifacts, stop conditions)
294
+ - Pack suggestion engine with confidence levels
295
+
296
+ #### Trials
297
+ - Full roster proven: 30/30 gold-task trials + 5/5 negative (wrong-task honesty) trials
298
+ - 7 pack execution trials — all packs ran full chains with honest Critic verdicts
299
+ - Trial framework: buildClusterTrials, evaluateTrialOutput, formatTrialReport
300
+
301
+ ### Changed
302
+ - 32 → 31 roles: Information Architect merged into Docs Architect
303
+ - Verdict vocabulary unified: evidence.mjs now uses accept/reject/blocked (matching review.mjs)
304
+ - "worker" terminology replaced with "role" in dispatch.mjs
305
+
306
+ ### Fixed
307
+ - `excludeWhen` was declared on 14 roles but never enforced — now active in scoreRole
308
+ - `detectType` false-positived on "integration testing" — now uses word-boundary regex
309
+ - "Not triggered: N roles" noise hidden by default (shown with --verbose)
310
+ - Handbook: Team Packs page added, reference sidebar reordered
311
+
312
+ ## 1.0.2
313
+
314
+ ### Fixed
315
+ - Fix double-nested `.claude/.claude/` directory created by `roleos init` — `starter-pack/.claude/workflows/full-treatment.md` moved to `starter-pack/workflows/`
316
+ - Read VERSION from `package.json` at runtime instead of hardcoded constant — prevents version drift between CLI and package metadata
317
+
318
+ ### Added
319
+ - `roleos init --force` — update canonical scaffolded files while always protecting user-filled `context/` files
320
+ - 4 regression tests: no double-nesting, correct workflow placement, version sync, --force context protection
321
+
322
+ ## 1.0.0
323
+
324
+ ### Added
325
+ - `roleos init` — scaffold Role OS starter pack into `.claude/`
326
+ - `roleos packet new <type>` — create feature, integration, or identity packets
327
+ - `roleos route <packet-file>` — recommend smallest valid role chain with dependency verification
328
+ - `roleos review <packet-file> <verdict>` — record accept/reject/blocked verdicts
329
+ - Full starter pack: 8 role contracts, 3 schemas, 4 policies, 3 workflows
330
+ - Guided context templates with inline prompts
331
+ - 3 canonical example packets (feature, integration, identity)
332
+ - Adoption handbook
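
Reviewer note on the 1.3.0 entry for `computePackBoosts()`: the changelog states the rule as "+0.5/run, capped at 2.0" but does not show the implementation. The sketch below is one way that rule could be expressed, not the package's actual code. The ledger entry shape (`pack`, `status`, `corrections`, `escalations`, `overridden`), the definition of a "clean" run, and the function name `sketchPackBoosts` are all assumptions made for illustration.

```javascript
// Hypothetical sketch of the boost rule described in the 1.3.0 changelog entry.
// This is NOT the role-os implementation; field names below are assumed.

const BOOST_PER_CLEAN_RUN = 0.5; // from the changelog: +0.5 per clean completed run
const BOOST_CAP = 2.0;           // from the changelog: capped at 2.0

// Assumption: a run is "clean" when it completed with no corrections,
// no escalations, and no operator override.
function isCleanRun(run) {
  return (
    run.status === "completed" &&
    run.corrections === 0 &&
    run.escalations === 0 &&
    !run.overridden
  );
}

// Accumulate a per-pack boost from ledger entries, clamping at the cap.
function sketchPackBoosts(ledger) {
  const boosts = {};
  for (const run of ledger) {
    if (!isCleanRun(run)) continue;
    const current = boosts[run.pack] ?? 0;
    boosts[run.pack] = Math.min(BOOST_CAP, current + BOOST_PER_CLEAN_RUN);
  }
  return boosts;
}

// Usage: five clean feature runs hit the cap; the corrected docs run earns nothing.
const cleanRun = (pack) => ({
  pack,
  status: "completed",
  corrections: 0,
  escalations: 0,
  overridden: false,
});

const exampleLedger = [
  ...Array.from({ length: 5 }, () => cleanRun("feature")),
  cleanRun("bugfix"),
  { ...cleanRun("docs"), corrections: 1 },
];

console.log(sketchPackBoosts(exampleLedger));
```

Clamping inside the loop keeps every intermediate value within the cap, so the result does not depend on ledger order. Packs with no clean runs are simply absent from the result rather than listed with a zero boost.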