role-os 1.8.0 → 2.0.0

package/CHANGELOG.md CHANGED
@@ -1,268 +1,332 @@
1
- # Changelog
2
-
3
- ## 1.8.0
4
-
5
- ### Added
6
-
7
- #### Mission Library (Phase S — Mission Hardening)
8
- - 6 named, repeatable mission types: feature-ship, bugfix, treatment, docs-release, security-hardening, research-launch
9
- - Each mission declares: pack, role chain, artifact flow, escalation branches, honest-partial definition, stop conditions, dispatch defaults, trial evidence
10
- - Mission runner: create → step through → complete/fail → generate completion report
11
- - Completion proof reporter with honest-partial and formatted text output
12
- - `roleos mission list` — list all missions
13
- - `roleos mission show <key>` — full mission detail
14
- - `roleos mission suggest <text>` — signal-based mission suggestion
15
- - `roleos mission validate [key]` — validate mission wiring against packs/roles
16
-
17
- #### Mission Runner Engine
18
- - `createRun()` — instantiate a mission with tracked steps
19
- - `startNextStep()` / `completeStep()` / `failStep()` — step lifecycle
20
- - `recordEscalation()` — re-opens completed steps on escalation loops
21
- - `getRunPosition()` / `getArtifactChain()` — run introspection
22
- - `generateCompletionReport()` / `formatCompletionReport()` — honest outcome reporting
23
-
24
- ### Evidence
25
- - 465 tests, zero failures (67 new)
26
- - All 6 missions validate against live pack/role catalog
27
- - Full lifecycle tests: end-to-end runs, escalation loops, partial completions, failure reporting
28
-
29
- ## 1.7.0
30
-
31
- ### Added
32
-
33
- #### Completion Proof (Phase R)
34
- - `roleos artifacts` CLI command: list, show, validate, chain subcommands
35
- - 13 new CLI integration tests for artifact inspection
36
- - Real task completion missions through the full stack
37
-
38
- #### Completion Proof Evidence
39
- - R1-1 Feature mission: `roleos artifacts` command shipped through feature pack
40
- - Pack: feature (high confidence, correct)
41
- - Chain: 5 roles, 0 escalations, 1 minor correction
42
- - Artifact contracts: all 4 used and valid
43
- - R1-2 Bugfix mission: README.zh.md npm anomaly
44
- - Diagnosed correctly: npm auto-includes README* regardless of files field
45
- - Escalated honestly: fix requires structural decision (translation file organization)
46
- - Not force-closed: deferred to treatment pass
47
-
48
- ### Evidence
49
- - 398 tests, zero failures
50
- - 3 missions run through the full stack
51
- - Completion metrics recorded per mission
52
-
53
- ## 1.6.0
54
-
55
- ### Added
56
-
57
- #### Artifact Spine (Phase Q)
58
- - 20 per-role artifact contracts: each defines artifact type, required sections, evidence references, downstream consumers, and completion rules
59
- - `validateArtifact(role, content)` — structural validation against role contracts (missing sections, evidence references, content depth)
60
- - 7 pack-level handoff contracts: define the expected artifact flow between steps for each pack (e.g., strategy-brief → implementation-spec → change-plan → test-package → verdict)
61
- - `validatePackChain(pack, artifacts)` — validates an entire pack's artifact chain for completeness
62
- - `getArtifactContract(role)` / `getHandoffContract(pack)` — lookup APIs
63
- - `formatArtifactValidation()` / `formatPackChain()` — display formatters
64
-
65
- #### Artifact contract coverage
66
- - Product Strategist → strategy-brief (problem-framing, scope, non-goals, tradeoffs)
67
- - Spec Writer → implementation-spec (acceptance-criteria, edge-cases, interface-spec)
68
- - Backend/Frontend Engineer → change-plan (files-to-change, implementation-approach, risk-notes)
69
- - Test Engineer → test-package (test-plan, test-cases, false-confidence-assessment)
70
- - Security Reviewer → security-findings (findings, severity-assessment, recommendations)
71
- - Critic Reviewer → verdict (verdict, evidence, required-corrections)
72
- - And 14 more roles with full contracts
73
-
74
- ### Evidence
75
- - 385 tests, zero failures
76
- - 27 new artifact tests
77
-
78
- ## 1.5.0
79
-
80
- ### Added
81
-
82
- #### Hook Spine / Runtime Enforcement (Phase R)
83
- - 5 lifecycle hooks: SessionStart, UserPromptSubmit, PreToolUse, SubagentStart, Stop
84
- - `scaffoldHooks()` generates all 5 hook scripts in .claude/hooks/
85
- - `roleos init claude` now scaffolds hooks + settings.local.json with hook config
86
- - `roleos doctor` now checks for hook scripts (check 7) and settings hooks (check 8)
87
-
88
- #### SessionStart hook
89
- - Establishes session contract on every new session
90
- - Records session ID, timestamp, initializes state tracking
91
- - Adds context reminding Claude to use /roleos-route for non-trivial tasks
92
-
93
- #### UserPromptSubmit hook
94
- - Classifies prompts as substantial (>50 chars + action verbs)
95
- - After 2+ substantial prompts without a route card, adds context reminder
96
- - Does not block — advisory enforcement
97
-
98
- #### PreToolUse hook
99
- - Records all tool usage in session state
100
- - Flags write tools (Bash, Write, Edit) used without route card after substantial work
101
- - Advisory, not blocking — preserves operator control
102
-
103
- #### SubagentStart hook
104
- - Injects active role contract into delegated agents
105
- - Ensures subagents inherit the Role OS session context
106
-
107
- #### Stop hook
108
- - Warns when substantial sessions end without route card or outcome artifact
109
- - Advisory — does not block session exit
110
- - Trivial sessions (< 2 substantial prompts) are exempt
111
-
112
- ### Evidence
113
- - 358 tests, zero failures
114
- - 23 new hook tests covering all 5 lifecycle hooks
115
-
116
- ## 1.4.0
117
-
118
- ### Added
119
-
120
- #### Session Spine (Phase Q)
121
- - `roleos init claude` — scaffolds Claude Code integration: CLAUDE.md instructions, /roleos-route + /roleos-review + /roleos-status slash commands
122
- - `roleos doctor` verifies repo is correctly wired for Role OS sessions (6 checks: .claude/ dir, CLAUDE.md section, /roleos-route command, context files, role contracts, packets)
123
- - Route card generation — session header artifact proving Role OS was engaged (task type, pack, confidence, composite status, success artifact)
124
- - CLAUDE.md template instructs Claude to route through Role OS before non-trivial work
125
- - /roleos-route command produces structured route cards
126
- - /roleos-review command guides structured verdict production
127
- - /roleos-status command shows active work and context health
128
- - Appends to existing CLAUDE.md without overwriting (detects Role OS section)
129
- - --force flag overwrites existing command files
130
-
131
- ### Evidence
132
- - 335 tests, zero failures
133
-
134
- ## 1.3.0
135
-
136
- ### Added
137
-
138
- #### Outcome Calibration (Phase M)
139
- - Run outcome ledger — append-only JSONL recording pack selection, confidence, overrides, escalations, corrections, completion status
140
- - `computeCalibration()` — pack usage rates, high-confidence accuracy, operator override rates, per-pack performance
141
- - `computePackBoosts()` — weight tuning from clean completed runs (+0.5/run, capped at 2.0)
142
- - `computeConfidenceAdjustment()` — raises threshold when high-confidence is often overridden, lowers when medium is often accepted
143
- - Auto-generated calibration suggestions when metrics drift
144
- - Safety constraint: calibration never overrides mismatch guards, conflict rules, escalation honesty, or evidence requirements
145
-
146
- #### Mixed-Task Decomposition (Phase N)
147
- - `detectComposite()` — 7 subtask categories (build, bugfix, security, docs, research, launch, treatment) with signal-based detection
148
- - Structural connector detection ("and then", "after that", "plus", "also")
149
- - Confidence levels: high (3+ categories or 2+ with connectors), medium, low
150
- - `decompose()` — generates linked child packets sorted by phase order
151
- - `createRunPlan()` — dependency-aware parent plan with child tracking
152
- - Honest fallback: medium/low confidence shows uncertainty warning with `--no-split` override
153
-
154
- #### Composite Execution (Phase O)
155
- - `initExecution()` / `advance()` — dependency-driven child execution with artifact passing
156
- - 7 artifact contracts defining what each category produces and expects
157
- - Artifact ledger tracking all cross-packet handoffs
158
- - `blockChild()` / `recoverChild()` / `failChild()` — branch recovery with transitive cascade
159
- - `invalidateDownstream()` resets stale children when upstream changes, removes stale artifacts
160
- - `synthesize()` — truthful parent-level completion report
161
- - Independent branches continue unaffected when a sibling fails
162
-
163
- #### Adaptive Replanning (Phase P)
164
- - 6 structured change event types: scope-change, artifact-changed, new-requirement, review-finding, dependency-discovered, priority-change
165
- - `analyzeImpact()` — identifies valid/stale children, stale artifacts, whether new children or reorder needed
166
- - `replan()` — selective replanning: invalidates only affected branches, inserts new children, updates dependencies
167
- - Plan diff: shows what changed, what stayed valid, what reopened, what was inserted
168
- - Execution resumes from next valid child after replan — no restart required
169
-
170
- ### Evidence
171
- - 317 tests, zero failures
172
- - Calibration, decomposition, composite execution, and replanning each have dedicated test suites
173
-
174
- ## 1.2.0
175
-
176
- ### Added
177
- - Pack auto-selection in `roleos route` — suggests best pack when confidence is high
178
- - `roleos route --pack=<name>` — use a specific pack for routing
179
- - Pack mismatch detection — warns when a pack doesn't fit the task, suggests the correct alternative
180
- - Pack fallback — mismatched or unknown packs fall back to free routing automatically
181
- - `checkPackMismatch()` API with 7 guard sets covering all pack×task-type combinations
182
- - `getPackRoles()` API with conditional Orchestrator support
183
-
184
- ### Changed
185
- - Docs pack: Support Triage Lead now opens (was Feedback Synthesizer). Feedback Synthesizer is second. Release Engineer + Deployment Verifier moved to optional (overhead for docs-only tasks).
186
- - Pack calibration applied from comparison evidence: conditional Orchestrator, Security Reviewer in Treatment, Product Strategist opens Research, mismatch guards on all 7 packs.
187
-
188
- ### Evidence
189
- - Pack comparison: calibrated packs now win or tie 6/7 (was 2/7 pre-calibration)
190
- - Misfit honesty: 0 full bluffs, 0 undetected partial bluffs (was 1 + 3)
191
- - 230 tests, zero failures
192
-
193
- ## 1.1.0
194
-
195
- ### Added
196
-
197
- #### Routing
198
- - Full 31-role catalog — all roles scored by keyword, trigger phrase, packet type bias, and deliverable affinity
199
- - Dynamic chain builder — phase-ordered assembly replacing static templates
200
- - Routing confidence assessment (high/medium/low)
201
- - `excludeWhen` enforcement — roles suppressed when exclusion patterns match packet content
202
- - `detectType` false-positive prevention — "integration testing" no longer triggers integration type
203
- - `--verbose` flag for `roleos route` — hides scoring noise by default
204
-
205
- #### Conflict Detection
206
- - 4-pass conflict engine: hard conflicts, sequence, redundancy, coverage gaps
207
- - Per-role constraint registry: lateOnly, requiresBeforePacks
208
- - Overlap pair detection
209
- - Repair suggestions on every finding
210
-
211
- #### Escalation Auto-Routing
212
- - Blocked/rejected/conflict/split work auto-routes to named resolver
213
- - Every escalation includes: target role, recovery type, required artifact, handoff context
214
-
215
- #### Structured Evidence
216
- - 12 evidence kinds, 4 statuses, closed 4-verdict enum (accept/accept-with-notes/reject/blocked)
217
- - Role-aware evidence requirements for 15 roles
218
- - Sufficiency checks with contradiction detection
219
-
220
- #### Runtime Dispatch
221
- - Execution manifests for multi-claude with per-role tool profiles and budgets
222
- - 8 execution states with auto-advance
223
- - Escalation packet generation for blocked/rejected steps
224
-
225
- #### Proven Team Packs
226
- - 7 battle-tested packs: feature, bugfix, security, docs, launch, research, treatment
227
- - `roleos packs list` — show all packs with role counts
228
- - `roleos packs suggest <packet>` — suggest best pack for a packet
229
- - `roleos packs show <name>` — show pack details (roles, artifacts, stop conditions)
230
- - Pack suggestion engine with confidence levels
231
-
232
- #### Trials
233
- - Full roster proven: 30/30 gold-task trials + 5/5 negative (wrong-task honesty) trials
234
- - 7 pack execution trials — all packs ran full chains with honest Critic verdicts
235
- - Trial framework: buildClusterTrials, evaluateTrialOutput, formatTrialReport
236
-
237
- ### Changed
238
- - 32 → 31 roles: Information Architect merged into Docs Architect
239
- - Verdict vocabulary unified: evidence.mjs now uses accept/reject/blocked (matching review.mjs)
240
- - "worker" terminology replaced with "role" in dispatch.mjs
241
-
242
- ### Fixed
243
- - `excludeWhen` was declared on 14 roles but never enforced — now active in scoreRole
244
- - `detectType` false-positived on "integration testing" — now uses word-boundary regex
245
- - "Not triggered: N roles" noise hidden by default (shown with --verbose)
246
- - Handbook: Team Packs page added, reference sidebar reordered
247
-
248
- ## 1.0.2
249
-
250
- ### Fixed
251
- - Fix double-nested `.claude/.claude/` directory created by `roleos init` — `starter-pack/.claude/workflows/full-treatment.md` moved to `starter-pack/workflows/`
252
- - Read VERSION from `package.json` at runtime instead of hardcoded constant — prevents version drift between CLI and package metadata
253
-
254
- ### Added
255
- - `roleos init --force` — update canonical scaffolded files while always protecting user-filled `context/` files
256
- - 4 regression tests: no double-nesting, correct workflow placement, version sync, --force context protection
257
-
258
- ## 1.0.0
259
-
260
- ### Added
261
- - `roleos init` — scaffold Role OS starter pack into `.claude/`
262
- - `roleos packet new <type>` — create feature, integration, or identity packets
263
- - `roleos route <packet-file>` — recommend smallest valid role chain with dependency verification
264
- - `roleos review <packet-file> <verdict>` — record accept/reject/blocked verdicts
265
- - Full starter pack: 8 role contracts, 3 schemas, 4 policies, 3 workflows
266
- - Guided context templates with inline prompts
267
- - 3 canonical example packets (feature, integration, identity)
268
- - Adoption handbook
1
+ # Changelog
2
+
3
+ ## 2.0.0
4
+
5
+ ### Added
6
+
7
+ #### Operator Friction Pass (Phase U)
8
+ - `roleos run "<task>"` — one command from task description to active execution
9
+ - Persistent disk-backed runs in `.claude/runs/` survive session interruptions
10
+ - Entry level auto-selection: mission, pack, or free routing with force overrides (`--mission=`, `--pack=`)
11
+ - Step-local operator guidance at every step: role, artifact, required sections, completion rule, stop conditions
12
+ - `roleos resume [id]` — continue interrupted runs from disk
13
+ - `roleos next` — start the next step or show what's active
14
+ - `roleos explain [id]` — full run state with guidance, escalations, interventions
15
+ - `roleos complete <artifact> [note]` — complete the active step with artifact reference
16
+ - `roleos fail <partial|failed> <reason>` — fail with honest downstream blocking
17
+ - `roleos run list` — list all runs with status icons
18
+ - `roleos run show <id>` — full run detail
19
+
20
+ #### Intervention Shortcuts
21
+ - `roleos retry <step>` — retry a failed/partial step, unblock downstream
22
+ - `roleos reroute <step> <role> <reason>` — swap a step to a different role
23
+ - `roleos escalate <from> <to> <trigger> <action>` — escalate between roles with step re-opening
24
+ - `roleos block <step> <reason>` — manually block a step
25
+ - `roleos reopen <step> <reason>` — reopen a completed step for re-execution
26
+
27
+ #### Friction Measurement
28
+ - `roleos report [id]` — generate completion report with honest-partial
29
+ - `roleos friction [id]` — measure operator touches: interventions, escalations, manual steps
30
+ - Friction score: low/medium/high based on touch count vs step count
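As a rough illustration of the friction score entry above, the sketch below buckets the ratio of operator touches to steps. The function name `frictionScore` and the 0.25 and 0.75 thresholds are assumptions made for this example; the changelog only states that the score is low/medium/high based on touch count versus step count.

```typescript
type Friction = "low" | "medium" | "high";

// Hypothetical helper: the thresholds are illustrative, not taken from role-os.
function frictionScore(touches: number, steps: number): Friction {
  if (steps <= 0) {
    throw new RangeError("steps must be positive");
  }
  const ratio = touches / steps; // operator touches per step
  if (ratio <= 0.25) return "low";
  if (ratio <= 0.75) return "medium";
  return "high";
}

console.log(frictionScore(1, 5)); // 0.2 touches per step, prints "low"
```

Using a ratio rather than a raw touch count keeps long runs from being penalized merely for having more steps.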
31
+
32
+ ### Evidence
33
+ - 613 tests, zero failures (86 new)
34
+ - 6 friction trials validated: clean run, reroute, retry, pack-level, free-routing, disk resume
35
+ - All entry levels produce low/medium friction scores
36
+ - Disk round-trip verified: create → pause → load → resume → complete
37
+
38
+ ## 1.9.0
39
+
40
+ ### Added
41
+
42
+ #### Unified Entry Path (Phase T)
43
+ - `roleos start <task>` — auto-decides mission vs pack vs free routing
44
+ - Three-level fallback ladder with confidence scores and alternatives
45
+ - Composite task detection warns when a task should be decomposed
46
+ - `--json` flag for machine-readable entry decisions
47
+ - 46 new tests: entry engine, comparison trials, CLI integration
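The three-level fallback ladder listed above could look roughly like the following sketch. Everything here is hypothetical: the `decideEntry` name, the `Candidate` shape, the 0..1 confidence scale, and the 0.7 threshold are assumptions for illustration, not the package's actual entry engine.

```typescript
type EntryLevel = "mission" | "pack" | "free-routing";

interface Candidate {
  key: string;
  confidence: number; // assumed 0..1 scale
}

interface EntryDecision {
  level: EntryLevel;
  key: string | null;
  alternatives: string[];
}

// Sort a copy so the caller's array is left untouched.
function rank(candidates: Candidate[]): Candidate[] {
  return candidates.slice().sort((a, b) => b.confidence - a.confidence);
}

// Hypothetical ladder: try a mission first, then a pack, then free routing.
function decideEntry(
  missions: Candidate[],
  packs: Candidate[],
  threshold: number = 0.7 // assumed cut-off
): EntryDecision {
  const rankedMissions = rank(missions);
  const topMission = rankedMissions[0];
  if (topMission !== undefined && topMission.confidence >= threshold) {
    return {
      level: "mission",
      key: topMission.key,
      alternatives: rankedMissions.slice(1).map((c) => c.key),
    };
  }
  const rankedPacks = rank(packs);
  const topPack = rankedPacks[0];
  if (topPack !== undefined && topPack.confidence >= threshold) {
    return {
      level: "pack",
      key: topPack.key,
      alternatives: rankedPacks.slice(1).map((c) => c.key),
    };
  }
  // Nothing cleared the threshold: fall through to free routing.
  return { level: "free-routing", key: null, alternatives: [] };
}
```

Empty input falls through to free routing in this sketch, which is one plausible reading of the "empty input" case mentioned in the evidence list.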
48
+
49
+ #### Handbook Updates
50
+ - New Missions handbook page with full mission documentation
51
+ - Updated Getting Started to lead with `roleos start`
52
+ - Updated Reference with all CLI commands (start, mission, packs, artifacts, status, doctor)
53
+ - Updated handbook index with entry levels and 9 operating layers
54
+
55
+ #### README Overhaul
56
+ - "How it works" section leads with `roleos start` examples
57
+ - Quick Start updated with mission and start commands
58
+ - Added 6 Missions table
59
+ - Updated project structure with all 18 source modules
60
+ - Updated status history through v1.9.0
61
+
62
+ ### Evidence
63
+ - 527 tests, zero failures (46 new)
64
+ - Entry path trials validated against 20+ real task descriptions
65
+ - Fallback ladder tested: mission, pack, free-routing, composite, empty input
66
+
67
+ ## 1.8.0
68
+
69
+ ### Added
70
+
71
+ #### Mission Library (Phase S — Mission Hardening)
72
+ - 6 named, repeatable mission types: feature-ship, bugfix, treatment, docs-release, security-hardening, research-launch
73
+ - Each mission declares: pack, role chain, artifact flow, escalation branches, honest-partial definition, stop conditions, dispatch defaults, trial evidence
74
+ - Mission runner: create → step through → complete/fail → generate completion report
75
+ - Completion proof reporter with honest-partial and formatted text output
76
+ - `roleos mission list` — list all missions
77
+ - `roleos mission show <key>` — full mission detail
78
+ - `roleos mission suggest <text>` — signal-based mission suggestion
79
+ - `roleos mission validate [key]` — validate mission wiring against packs/roles
80
+
81
+ #### Mission Runner Engine
82
+ - `createRun()` — instantiate a mission with tracked steps
83
+ - `startNextStep()` / `completeStep()` / `failStep()` — step lifecycle
84
+ - `recordEscalation()` — re-opens completed steps on escalation loops
85
+ - `getRunPosition()` / `getArtifactChain()` — run introspection
86
+ - `generateCompletionReport()` / `formatCompletionReport()` — honest outcome reporting
87
+
88
+ ### Evidence
89
+ - 465 tests, zero failures (67 new)
90
+ - All 6 missions validate against live pack/role catalog
91
+ - Full lifecycle tests: end-to-end runs, escalation loops, partial completions, failure reporting
92
+
93
+ ## 1.7.0
94
+
95
+ ### Added
96
+
97
+ #### Completion Proof (Phase R)
98
+ - `roleos artifacts` CLI command: list, show, validate, chain subcommands
99
+ - 13 new CLI integration tests for artifact inspection
100
+ - Real task completion missions through the full stack
101
+
102
+ #### Completion Proof Evidence
103
+ - R1-1 Feature mission: `roleos artifacts` command shipped through feature pack
104
+ - Pack: feature (high confidence, correct)
105
+ - Chain: 5 roles, 0 escalations, 1 minor correction
106
+ - Artifact contracts: all 4 used and valid
107
+ - R1-2 Bugfix mission: README.zh.md npm anomaly
108
+ - Diagnosed correctly: npm auto-includes README* regardless of files field
109
+ - Escalated honestly: fix requires structural decision (translation file organization)
110
+ - Not force-closed: deferred to treatment pass
111
+
112
+ ### Evidence
113
+ - 398 tests, zero failures
114
+ - 3 missions run through the full stack
115
+ - Completion metrics recorded per mission
116
+
117
+ ## 1.6.0
118
+
119
+ ### Added
120
+
121
+ #### Artifact Spine (Phase Q)
122
+ - 20 per-role artifact contracts: each defines artifact type, required sections, evidence references, downstream consumers, and completion rules
123
+ - `validateArtifact(role, content)` — structural validation against role contracts (missing sections, evidence references, content depth)
124
+ - 7 pack-level handoff contracts: define the expected artifact flow between steps for each pack (e.g., strategy-brief → implementation-spec → change-plan → test-package → verdict)
125
+ - `validatePackChain(pack, artifacts)` — validates an entire pack's artifact chain for completeness
126
+ - `getArtifactContract(role)` / `getHandoffContract(pack)` — lookup APIs
127
+ - `formatArtifactValidation()` / `formatPackChain()` — display formatters
128
+
129
+ #### Artifact contract coverage
130
+ - Product Strategist → strategy-brief (problem-framing, scope, non-goals, tradeoffs)
131
+ - Spec Writer → implementation-spec (acceptance-criteria, edge-cases, interface-spec)
132
+ - Backend/Frontend Engineer → change-plan (files-to-change, implementation-approach, risk-notes)
133
+ - Test Engineer → test-package (test-plan, test-cases, false-confidence-assessment)
134
+ - Security Reviewer → security-findings (findings, severity-assessment, recommendations)
135
+ - Critic Reviewer → verdict (verdict, evidence, required-corrections)
136
+ - And 14 more roles with full contracts
137
+
138
+ ### Evidence
139
+ - 385 tests, zero failures
140
+ - 27 new artifact tests
141
+
142
+ ## 1.5.0
143
+
144
+ ### Added
145
+
146
+ #### Hook Spine / Runtime Enforcement (Phase R)
147
+ - 5 lifecycle hooks: SessionStart, UserPromptSubmit, PreToolUse, SubagentStart, Stop
148
+ - `scaffoldHooks()` generates all 5 hook scripts in .claude/hooks/
149
+ - `roleos init claude` now scaffolds hooks + settings.local.json with hook config
150
+ - `roleos doctor` now checks for hook scripts (check 7) and settings hooks (check 8)
151
+
152
+ #### SessionStart hook
153
+ - Establishes session contract on every new session
154
+ - Records session ID, timestamp, initializes state tracking
155
+ - Adds context reminding Claude to use /roleos-route for non-trivial tasks
156
+
157
+ #### UserPromptSubmit hook
158
+ - Classifies prompts as substantial (>50 chars + action verbs)
159
+ - After 2+ substantial prompts without a route card, adds context reminder
160
+ - Does not block — advisory enforcement
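The UserPromptSubmit rules above (more than 50 characters plus an action verb, reminder after two or more substantial prompts without a route card) can be sketched as follows. The verb list, the function names, and the reminder text are assumptions for illustration; the changelog does not give the hook's actual vocabulary or wording.

```typescript
// Hypothetical verb list: the real hook's vocabulary is not given in the changelog.
const ACTION_VERBS: string[] = ["add", "build", "fix", "implement", "refactor", "write"];

// A prompt counts as substantial when it is over 50 characters and contains an action verb.
function isSubstantial(prompt: string): boolean {
  const longEnough = prompt.length > 50;
  const words = prompt.toLowerCase().split(/\W+/);
  const hasVerb = words.some((w) => ACTION_VERBS.indexOf(w) !== -1);
  return longEnough && hasVerb;
}

// Advisory only: returns a reminder string or null, and never blocks the prompt.
function reminderFor(prompts: string[], hasRouteCard: boolean): string | null {
  const substantialCount = prompts.filter(isSubstantial).length;
  if (!hasRouteCard && substantialCount >= 2) {
    return "Consider running /roleos-route before continuing.";
  }
  return null;
}
```

Returning a value instead of throwing mirrors the advisory nature of the hook: the caller decides whether to surface the reminder.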
161
+
162
+ #### PreToolUse hook
163
+ - Records all tool usage in session state
164
+ - Flags write tools (Bash, Write, Edit) used without route card after substantial work
165
+ - Advisory, not blocking — preserves operator control
166
+
167
+ #### SubagentStart hook
168
+ - Injects active role contract into delegated agents
169
+ - Ensures subagents inherit the Role OS session context
170
+
171
+ #### Stop hook
172
+ - Warns when substantial sessions end without route card or outcome artifact
173
+ - Advisory — does not block session exit
174
+ - Trivial sessions (< 2 substantial prompts) are exempt
175
+
176
+ ### Evidence
177
+ - 358 tests, zero failures
178
+ - 23 new hook tests covering all 5 lifecycle hooks
179
+
180
+ ## 1.4.0
181
+
182
+ ### Added
183
+
184
+ #### Session Spine (Phase Q)
185
+ - `roleos init claude` — scaffolds Claude Code integration: CLAUDE.md instructions, /roleos-route + /roleos-review + /roleos-status slash commands
186
+ - `roleos doctor` verifies repo is correctly wired for Role OS sessions (6 checks: .claude/ dir, CLAUDE.md section, /roleos-route command, context files, role contracts, packets)
187
+ - Route card generation — session header artifact proving Role OS was engaged (task type, pack, confidence, composite status, success artifact)
188
+ - CLAUDE.md template instructs Claude to route through Role OS before non-trivial work
189
+ - /roleos-route command produces structured route cards
190
+ - /roleos-review command guides structured verdict production
191
+ - /roleos-status command shows active work and context health
192
+ - Appends to existing CLAUDE.md without overwriting (detects Role OS section)
193
+ - --force flag overwrites existing command files
194
+
195
+ ### Evidence
196
+ - 335 tests, zero failures
197
+
198
+ ## 1.3.0
199
+
200
+ ### Added
201
+
202
+ #### Outcome Calibration (Phase M)
203
+ - Run outcome ledger — append-only JSONL recording pack selection, confidence, overrides, escalations, corrections, completion status
204
+ - `computeCalibration()` — pack usage rates, high-confidence accuracy, operator override rates, per-pack performance
205
+ - `computePackBoosts()` — weight tuning from clean completed runs (+0.5/run, capped at 2.0)
206
+ - `computeConfidenceAdjustment()` — raises threshold when high-confidence is often overridden, lowers when medium is often accepted
207
+ - Auto-generated calibration suggestions when metrics drift
208
+ - Safety constraint: calibration never overrides mismatch guards, conflict rules, escalation honesty, or evidence requirements
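The boost rule stated above (+0.5 per clean completed run, capped at 2.0) amounts to a clamped accumulation per pack. The sketch below is a minimal illustration; the `RunOutcome` shape, its `clean` flag, and the `packBoosts` name are assumptions, and only the two constants come from the changelog entry.

```typescript
const BOOST_PER_CLEAN_RUN = 0.5; // from the changelog entry
const MAX_BOOST = 2.0; // from the changelog entry

interface RunOutcome {
  pack: string;
  clean: boolean; // hypothetical flag: completed without overrides or corrections
}

// Accumulate a per-pack boost from clean runs, never exceeding MAX_BOOST.
function packBoosts(outcomes: RunOutcome[]): Record<string, number> {
  const boosts: Record<string, number> = {};
  for (const outcome of outcomes) {
    if (!outcome.clean) continue; // only clean completed runs count
    const current = boosts[outcome.pack] ?? 0;
    boosts[outcome.pack] = Math.min(MAX_BOOST, current + BOOST_PER_CLEAN_RUN);
  }
  return boosts;
}
```

With these constants a pack reaches the cap after four clean runs, so later clean runs no longer change its weight.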
209
+
210
+ #### Mixed-Task Decomposition (Phase N)
211
+ - `detectComposite()` — 7 subtask categories (build, bugfix, security, docs, research, launch, treatment) with signal-based detection
212
+ - Structural connector detection ("and then", "after that", "plus", "also")
213
+ - Confidence levels: high (3+ categories or 2+ with connectors), medium, low
214
+ - `decompose()` — generates linked child packets sorted by phase order
215
+ - `createRunPlan()` — dependency-aware parent plan with child tracking
216
+ - Honest fallback: medium/low confidence shows uncertainty warning with `--no-split` override
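The confidence rule above (high for 3+ categories, or 2+ categories with a structural connector) can be sketched as below. The connector strings come from the changelog; the function names are hypothetical, and the split between medium and low is an assumption because the changelog does not define it.

```typescript
type Confidence = "high" | "medium" | "low";

const CONNECTORS: string[] = ["and then", "after that", "plus", "also"]; // from the changelog

// Naive substring check: a real detector would likely respect word boundaries.
function hasConnector(task: string): boolean {
  const text = task.toLowerCase();
  return CONNECTORS.some((c) => text.indexOf(c) !== -1);
}

// categoryCount: number of distinct subtask categories detected upstream.
function compositeConfidence(categoryCount: number, task: string): Confidence {
  if (categoryCount >= 3) return "high";
  if (categoryCount >= 2 && hasConnector(task)) return "high";
  // Assumption: two categories without a connector is treated as medium.
  if (categoryCount >= 2) return "medium";
  return "low";
}
```

Anything below high would then trigger the uncertainty warning and leave the `--no-split` override to the operator.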
217
+
218
+ #### Composite Execution (Phase O)
219
+ - `initExecution()` / `advance()` — dependency-driven child execution with artifact passing
220
+ - 7 artifact contracts defining what each category produces and expects
221
+ - Artifact ledger tracking all cross-packet handoffs
222
+ - `blockChild()` / `recoverChild()` / `failChild()` — branch recovery with transitive cascade
223
+ - `invalidateDownstream()` resets stale children when upstream changes, removes stale artifacts
224
+ - `synthesize()` — truthful parent-level completion report
225
+ - Independent branches continue unaffected when a sibling fails
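To make the transitive cascade above concrete, the sketch below walks a dependency map and collects every child that depends, directly or indirectly, on a changed one. The `Plan` shape and the `staleAfterChange` name are hypothetical and are not the package's `invalidateDownstream()` signature.

```typescript
// Hypothetical plan shape: child id -> ids of the children it depends on.
type Plan = Record<string, string[]>;

// Breadth-first walk from the changed child to everything downstream of it.
function staleAfterChange(plan: Plan, changed: string): string[] {
  const stale: string[] = [];
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const id of Object.keys(plan)) {
      const deps = plan[id] ?? [];
      if (deps.indexOf(current) !== -1 && stale.indexOf(id) === -1) {
        stale.push(id); // depends on something stale, so it is stale too
        queue.push(id);
      }
    }
  }
  return stale;
}

const examplePlan: Plan = {
  build: [],
  security: ["build"],
  docs: ["security"],
  research: [],
};
console.log(staleAfterChange(examplePlan, "build")); // security and docs, not research
```

The `research` child is never visited because nothing links it to `build`, which matches the note that independent branches continue unaffected.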
226
+
227
+ #### Adaptive Replanning (Phase P)
228
+ - 6 structured change event types: scope-change, artifact-changed, new-requirement, review-finding, dependency-discovered, priority-change
229
+ - `analyzeImpact()` — identifies valid/stale children, stale artifacts, whether new children or reorder needed
230
+ - `replan()` — selective replanning: invalidates only affected branches, inserts new children, updates dependencies
231
+ - Plan diff: shows what changed, what stayed valid, what reopened, what was inserted
232
+ - Execution resumes from next valid child after replan — no restart required
233
+
234
+ ### Evidence
235
+ - 317 tests, zero failures
236
+ - Calibration, decomposition, composite execution, and replanning each have dedicated test suites
237
+
238
+ ## 1.2.0
239
+
240
+ ### Added
241
+ - Pack auto-selection in `roleos route` — suggests best pack when confidence is high
242
+ - `roleos route --pack=<name>` — use a specific pack for routing
243
+ - Pack mismatch detection — warns when a pack doesn't fit the task, suggests the correct alternative
244
+ - Pack fallback — mismatched or unknown packs fall back to free routing automatically
245
+ - `checkPackMismatch()` API with 7 guard sets covering all pack×task-type combinations
246
+ - `getPackRoles()` API with conditional Orchestrator support
247
+
248
+ ### Changed
249
+ - Docs pack: Support Triage Lead now opens (was Feedback Synthesizer). Feedback Synthesizer is second. Release Engineer + Deployment Verifier moved to optional (overhead for docs-only tasks).
250
+ - Pack calibration applied from comparison evidence: conditional Orchestrator, Security Reviewer in Treatment, Product Strategist opens Research, mismatch guards on all 7 packs.
251
+
252
+ ### Evidence
253
+ - Pack comparison: calibrated packs now win or tie 6/7 (was 2/7 pre-calibration)
254
+ - Misfit honesty: 0 full bluffs, 0 undetected partial bluffs (was 1 + 3)
255
+ - 230 tests, zero failures
256
+
257
+ ## 1.1.0
258
+
259
+ ### Added
260
+
261
+ #### Routing
262
+ - Full 31-role catalog — all roles scored by keyword, trigger phrase, packet type bias, and deliverable affinity
263
+ - Dynamic chain builder — phase-ordered assembly replacing static templates
264
+ - Routing confidence assessment (high/medium/low)
265
+ - `excludeWhen` enforcement — roles suppressed when exclusion patterns match packet content
266
+ - `detectType` false-positive prevention — "integration testing" no longer triggers integration type
267
+ - `--verbose` flag for `roleos route` — hides scoring noise by default
268
+
269
+ #### Conflict Detection
270
+ - 4-pass conflict engine: hard conflicts, sequence, redundancy, coverage gaps
271
+ - Per-role constraint registry: lateOnly, requiresBeforePacks
272
+ - Overlap pair detection
273
+ - Repair suggestions on every finding
274
+
275
+ #### Escalation Auto-Routing
276
+ - Blocked/rejected/conflict/split work auto-routes to named resolver
277
+ - Every escalation includes: target role, recovery type, required artifact, handoff context
278
+
279
+ #### Structured Evidence
280
+ - 12 evidence kinds, 4 statuses, closed 4-verdict enum (accept/accept-with-notes/reject/blocked)
281
+ - Role-aware evidence requirements for 15 roles
282
+ - Sufficiency checks with contradiction detection
283
+
284
+ #### Runtime Dispatch
285
+ - Execution manifests for multi-claude with per-role tool profiles and budgets
286
+ - 8 execution states with auto-advance
287
+ - Escalation packet generation for blocked/rejected steps
288
+
289
+ #### Proven Team Packs
290
+ - 7 battle-tested packs: feature, bugfix, security, docs, launch, research, treatment
291
+ - `roleos packs list` — show all packs with role counts
292
+ - `roleos packs suggest <packet>` — suggest best pack for a packet
293
+ - `roleos packs show <name>` — show pack details (roles, artifacts, stop conditions)
294
+ - Pack suggestion engine with confidence levels
295
+
296
+ #### Trials
297
+ - Full roster proven: 30/30 gold-task trials + 5/5 negative (wrong-task honesty) trials
298
+ - 7 pack execution trials — all packs ran full chains with honest Critic verdicts
299
+ - Trial framework: buildClusterTrials, evaluateTrialOutput, formatTrialReport
300
+
301
+ ### Changed
302
+ - 32 → 31 roles: Information Architect merged into Docs Architect
303
+ - Verdict vocabulary unified: evidence.mjs now uses accept/reject/blocked (matching review.mjs)
304
+ - "worker" terminology replaced with "role" in dispatch.mjs
305
+
306
+ ### Fixed
307
+ - `excludeWhen` was declared on 14 roles but never enforced — now active in scoreRole
308
+ - `detectType` false-positived on "integration testing" — now uses word-boundary regex
309
+ - "Not triggered: N roles" noise hidden by default (shown with --verbose)
310
+ - Handbook: Team Packs page added, reference sidebar reordered
311
+
312
+ ## 1.0.2
313
+
314
+ ### Fixed
315
+ - Fix double-nested `.claude/.claude/` directory created by `roleos init` — `starter-pack/.claude/workflows/full-treatment.md` moved to `starter-pack/workflows/`
316
+ - Read VERSION from `package.json` at runtime instead of hardcoded constant — prevents version drift between CLI and package metadata
317
+
318
+ ### Added
319
+ - `roleos init --force` — update canonical scaffolded files while always protecting user-filled `context/` files
320
+ - 4 regression tests: no double-nesting, correct workflow placement, version sync, --force context protection
321
+
322
+ ## 1.0.0
323
+
324
+ ### Added
325
+ - `roleos init` — scaffold Role OS starter pack into `.claude/`
326
+ - `roleos packet new <type>` — create feature, integration, or identity packets
327
+ - `roleos route <packet-file>` — recommend smallest valid role chain with dependency verification
328
+ - `roleos review <packet-file> <verdict>` — record accept/reject/blocked verdicts
329
+ - Full starter pack: 8 role contracts, 3 schemas, 4 policies, 3 workflows
330
+ - Guided context templates with inline prompts
331
+ - 3 canonical example packets (feature, integration, identity)
332
+ - Adoption handbook