role-os 1.9.0 → 2.0.0

package/README.md CHANGED
@@ -1,250 +1,287 @@
1
- <p align="center">
2
- <a href="README.ja.md">日本語</a> | <a href="README.zh.md">中文</a> | <a href="README.es.md">Español</a> | <a href="README.fr.md">Français</a> | <a href="README.hi.md">हिन्दी</a> | <a href="README.it.md">Italiano</a> | <a href="README.pt-BR.md">Português (BR)</a>
3
- </p>
4
-
5
- # Role OS
6
-
7
- <p align="center">
8
- <img src="https://raw.githubusercontent.com/mcp-tool-shop-org/brand/main/logos/role-os/readme.png" alt="Role OS" width="400">
9
- </p>
10
-
11
- <p align="center">
12
- <a href="https://github.com/mcp-tool-shop-org/role-os/actions"><img src="https://github.com/mcp-tool-shop-org/role-os/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
13
- <a href="https://www.npmjs.com/package/role-os"><img src="https://img.shields.io/npm/v/role-os" alt="npm"></a>
14
- <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue" alt="MIT License"></a>
15
- <a href="https://mcp-tool-shop-org.github.io/role-os/"><img src="https://img.shields.io/badge/Landing_Page-live-brightgreen" alt="Landing Page"></a>
16
- </p>
17
-
18
- A multi-Claude operating system that staffs, routes, validates, and runs work through 31 specialized role contracts. Creates task packets, assembles the right team from scored role matching, detects broken chains before execution, auto-routes recovery when work is blocked or rejected, and requires structured evidence in every verdict.
19
-
20
- ## What it does
21
-
22
- Role OS is the professional way to use multi-Claude. It prevents the specific failures that generic AI workflows produce:
23
-
24
- - **Drift** — roles stay in lane. Product doesn't redesign. Frontend doesn't redefine scope. Backend doesn't invent product direction.
25
- - **False completion** — the done definition is concrete. Work that hides gaps, skips verification, or solves a different problem gets rejected.
26
- - **Contamination** — forked or inherited projects carry identity residue. Role OS detects and rejects cross-project drift in terminology, visuals, and mental models.
27
- - **Vibes-based progress** — every handoff is structured. Every verdict ties to evidence. "It feels done" is not a valid state.
28
-
29
- ## How it works
30
-
31
- Describe your task. Role OS decides the right level of orchestration automatically.
32
-
33
- ```bash
34
- roleos start "fix the crash in save handler"
35
- # → MISSION: Bugfix & Diagnosis (70% confidence)
36
- # Chain: Repo Researcher → Backend Engineer → Test Engineer → Critic Reviewer
37
-
38
- roleos start "add a new export command"
39
- # PACK: Feature Build (50% confidence)
40
- # Roles: Orchestrator, Product Strategist, Spec Writer, Backend Engineer, Test Engineer, Critic Reviewer
41
-
42
- roleos start "something completely novel"
43
- # FREE-ROUTING (10% confidence)
44
- # Hint: Create a packet and run `roleos route` for role-level routing
45
- ```
46
-
47
- **The fallback ladder:**
48
-
49
- 1. **Mission** — when the task matches a proven recurring workflow (bugfix, treatment, feature-ship, docs, security, research). Known role chain, artifact flow, escalation branches, and honest-partial definitions.
50
- 2. **Pack** — when the task is a known family but not a full mission shape. 7 calibrated team packs with auto-selection and mismatch guards.
51
- 3. **Free routing** — when the task is novel, mixed, or uncertain. Scores all 31 roles against packet content and assembles a dynamic chain.
52
-
53
- The system never forces work through the wrong abstraction. It explains why it chose each level and offers alternatives.
54
-
55
- **Once routed:**
56
-
57
- 1. **Each role produces a handoff** — structured output with evidence items that reduce ambiguity for the next role
58
- 2. **Critic reviews against contract** — accepts, rejects, or blocks based on structured evidence, not impression
59
- 3. **Recovery routes automatically** — blocked or rejected work gets routed to the right resolver with a reason, recovery type, and required artifact
60
-
61
- ## Org rollout state
62
-
63
- Org-wide rollout state (queue, decisions, audit records, per-repo lock packets) lives in a separate private repo: [`role-os-rollout`](https://github.com/mcp-tool-shop-org/role-os-rollout). This repo is the product; that repo is operational state.
64
-
65
- ## Memory and continuity
66
-
67
- Role OS does not own or duplicate the memory layer. Where Claude project memory exists, it is the canonical continuity system — repo facts, decisions, open loops, and treatment history live there.
68
-
69
- Role OS integrates with Claude project memory. It does not replace it.
70
-
71
- ## Full treatment and shipcheck
72
-
73
- Full treatment is a canonical 7-phase protocol defined in Claude project memory (`memory/full-treatment.md`). Role OS routes and reviews treatments using role contracts, handoffs, and critic gates — it does not redefine the protocol.
74
-
75
- **Shipcheck** is the 31-item quality gate that runs before full treatment. Hard gates A-D must pass before any treatment begins. Canonical reference: `memory/shipcheck.md`.
76
-
77
- Order: Shipcheck first, then full treatment. No v1.0.0 without passing hard gates.
78
-
79
- ## 31 roles across 8 packs
80
-
81
- | Pack | Roles |
82
- |------|-------|
83
- | **Core** (3) | Orchestrator, Product Strategist, Critic Reviewer |
84
- | **Engineering** (7) | Frontend Developer, Backend Engineer, Test Engineer, Refactor Engineer, Performance Engineer, Dependency Auditor, Security Reviewer |
85
- | **Design** (2) | UI Designer, Brand Guardian |
86
- | **Marketing** (1) | Launch Copywriter |
87
- | **Treatment** (7) | Repo Researcher, Repo Translator, Docs Architect, Metadata Curator, Coverage Auditor, Deployment Verifier, Release Engineer |
88
- | **Product** (3) | Feedback Synthesizer, Roadmap Prioritizer, Spec Writer |
89
- | **Research** (4) | UX Researcher, Competitive Analyst, Trend Researcher, User Interview Synthesizer |
90
- | **Growth** (4) | Launch Strategist, Content Strategist, Community Manager, Support Triage Lead |
91
-
92
- Every role has a full contract: mission, use when, do not use when, expected inputs, required outputs, quality bar, and escalation triggers. Every role is routable — `roleos route` can recommend any of them based on packet content.
93
-
94
- ## Quick start
95
-
96
- ```bash
97
- npx role-os init
98
-
99
- # Describe what you need — Role OS picks the right level:
100
- roleos start "fix the crash in save handler"
101
-
102
- # Or go manual:
103
- roleos packet new feature
104
- roleos route .claude/packets/my-feature.md
105
- roleos review .claude/packets/my-feature.md accept
106
- roleos status
107
-
108
- # Explore missions and packs:
109
- roleos mission list
110
- roleos mission show bugfix
111
- roleos packs list
112
- roleos packs show feature
113
- ```
114
-
115
- ## When not to use Role OS
116
-
117
- - Single-line fixes, typos, or obvious bugs
118
- - Exploratory research with no defined output
119
- - Work that fits in one person's head in 5 minutes
120
- - Emergency hotfixes that need to ship before a review chain completes
121
- - Projects where you want speed over structure
122
-
123
- ## Evidence
124
-
125
- Role OS was proven across three trial shapes in two structurally different repos:
126
-
127
- **Trial 001 — Feature work** (Crew Screen, Star Freight)
128
- - 7-role chain, 45 test scenarios, 0 role collisions
129
- - Prevented contamination from fork ancestor, caught inline invention, surfaced honest blockers
130
-
131
- **Trial 002 — Integration work** (CampaignState wiring, Star Freight)
132
- - 5-role chain, resolved architectural seam without fallback lies
133
- - Anti-fallback tests proved the live path is real, not placeholder
134
-
135
- **Trial 003 — Identity work** (Contamination purge, Star Freight)
136
- - 6-role chain, 51 test scenarios including durable CI contamination defense
137
- - Repaired inherited fiction drift without collapsing into broad redesign
138
-
139
- **Portability trial** (Persona consistency, sensor-humor)
140
- - Same spine, different language/domain/stack
141
- - Adopted with context changes only — no core contract modifications
142
-
143
- **Full treatment FT-001** (portlight-desktop)
144
- - 7-phase staffed treatment with Treatment Pack roles
145
- - Shipcheck gating proven, zero role collisions
146
-
147
- **Full treatment FT-002** (studioflow)
148
- - Same treatment pack, structurally different repo (creative workspace vs game)
149
- - Treatment Pack portable — no contract modifications needed
150
-
151
- ## Core properties
152
-
153
- These are non-negotiable. If a change weakens any of them, reject it.
154
-
155
- - Role boundaries hold
156
- - Review has teeth
157
- - Escalation stays honest
158
- - Packets stay testable
159
- - Portability requires context adaptation, not core surgery
160
-
161
- ## Project structure
162
-
163
- ```
164
- role-os/
165
- bin/roleos.mjs ← CLI entrypoint
166
- src/
167
- entry.mjs ← Unified entry: mission → pack → free routing
168
- entry-cmd.mjs ← `roleos start` CLI command
169
- mission.mjs ← 6 named mission types (feature, bugfix, treatment, docs, security, research)
170
- mission-run.mjs ← Mission runner: create → step → complete → report
171
- mission-cmd.mjs ← `roleos mission` CLI commands
172
- route.mjs ← 31-role routing + dynamic chain builder
173
- packs.mjs ← 7 calibrated team packs + auto-selection
174
- conflicts.mjs ← 4-pass conflict detection
175
- escalation.mjs ← Auto-routing for blocked/rejected/split
176
- evidence.mjs ← Structured evidence + role-aware requirements
177
- dispatch.mjs ← Runtime dispatch manifests for multi-claude
178
- artifacts.mjs ← 20 per-role artifact contracts + 7 pack handoffs
179
- decompose.mjs ← Composite task detection + splitting
180
- composite.mjs ← Dependency-ordered execution + recovery
181
- replan.mjs ← Mid-run adaptive replanning
182
- calibration.mjs ← Outcome recording + weight tuning
183
- hooks.mjs ← 5 lifecycle hooks for runtime enforcement
184
- session.mjs ← Session scaffolding + doctor
185
- test/ ← 527 tests across 20 test files
186
- starter-pack/ ← Drop-in role contracts, policies, schemas, workflows
187
- ```
188
-
189
- ## Security
190
-
191
- Role OS operates **locally only**. It copies markdown templates and writes packet/verdict files to your repository's `.claude/` directory. It does not access the network, handle secrets, or collect telemetry. No dangerous operations — all file writes use skip-if-exists by default. See [SECURITY.md](SECURITY.md) for the full policy.
192
-
193
- ## The operating system
194
-
195
- | Layer | What it does | Status |
196
- |-------|-------------|--------|
197
- | **Routing** | Scores all 31 roles against packet content, explains recommendations, assesses confidence | ✓ Shipped |
198
- | **Chain builder** | Assembles phase-ordered chains from scored roles, packet-type biased not template-locked | ✓ Shipped |
199
- | **Conflict detection** | 4-pass validation: hard conflicts, sequence, redundancy, coverage gaps. Repair suggestions. | ✓ Shipped |
200
- | **Escalation** | Auto-routes blocked/rejected/split work to the right resolver with reason + required artifact | ✓ Shipped |
201
- | **Evidence** | Role-aware structured evidence in verdicts. Sufficiency checks. 12 evidence kinds. | ✓ Shipped |
202
- | **Dispatch** | Generates execution manifests for multi-claude. Per-role tool profiles, system prompts, budgets. | ✓ Shipped |
203
- | **Trials** | Full roster proven: 30/30 gold-task + 5/5 negative trials. 7 pack trials complete. | ✓ Complete |
204
- | **Team Packs** | 7 calibrated packs with auto-selection, mismatch guards, and free-routing fallback. | ✓ Shipped |
205
- | **Outcome calibration** | Records run outcomes, tunes pack/role weights from results, adjusts confidence thresholds. | ✓ Shipped |
206
- | **Mixed-task decomposition** | Detects composite work, splits into child packets, assigns packs, preserves dependencies. | ✓ Shipped |
207
- | **Composite execution** | Runs child packets in dependency order with artifact passing, branch recovery, and synthesis. | ✓ Shipped |
208
- | **Adaptive replanning** | Mid-run scope changes, findings, or new requirements update the plan without restarting. | ✓ Shipped |
209
- | **Session spine** | `roleos init claude` scaffolds CLAUDE.md, /roleos-route, /roleos-review, /roleos-status. `roleos doctor` verifies wiring. Route cards prove engagement. | ✓ Shipped |
210
- | **Hook spine** | 5 lifecycle hooks (SessionStart, PromptSubmit, PreToolUse, SubagentStart, Stop). Advisory enforcement: route card reminders, write-tool gating, subagent role injection, completion audit. | ✓ Shipped |
211
- | **Artifact spine** | 20 per-role artifact contracts. 7 pack handoff contracts. Structural validation. Chain completeness checks. Downstream roles never guess what they received. | ✓ Shipped |
212
- | **Mission library** | 6 named missions (feature-ship, bugfix, treatment, docs-release, security-hardening, research-launch). Each declares pack, role chain, artifact flow, escalation branches, honest-partial definition. All 6 trial-run and hardened. | ✓ Shipped |
213
- | **Mission runner** | Create runs, step through with tracked state, complete/fail with honest reporting. Blocked-step propagation, out-of-chain escalation warnings, last-step re-opening. | ✓ Shipped |
214
- | **Unified entry** | `roleos start` decides mission vs pack vs free routing automatically. Fallback ladder with confidence scores, alternatives, and composite detection. | ✓ Shipped |
215
-
216
- ## 6 missions
217
-
218
- | Mission | Pack | Roles | When to use |
219
- |---------|------|-------|-------------|
220
- | `feature-ship` | feature | 5 | Full feature delivery: scope → spec → implement → test → review |
221
- | `bugfix` | bugfix | 4 | Diagnose root cause, fix, test, verify |
222
- | `treatment` | treatment | 4 | Shipcheck + polish + docs + CI verify + review |
223
- | `docs-release` | docs | 2 | Write/update documentation, release notes |
224
- | `security-hardening` | security | 4 | Threat model, audit, fix vulnerabilities, re-audit, verify |
225
- | `research-launch` | research | 4 | Frame question, research, document findings, decide |
226
-
227
- Each mission includes honest-partial definitions — when work stalls, the system documents what was completed and what remains instead of bluffing completion.
228
-
229
- ## Status
230
-
231
- - v0.1–v0.4: Foundation — trials, adoption, treatment pack, starter pack
232
- - v1.0.0: 32 roles, full CLI, proven treatment, multi-repo portability
233
- - v1.0.2: Role OS lockdown (bootstrap truth fixes, init --force)
234
- - v1.1.0: 31 roles, full routing spine, conflict detection, escalation, evidence, dispatch, 7 proven team packs. 35 execution trials. 212 tests.
235
- - v1.2.0: Calibrated packs promoted to default entry. Auto-selection, mismatch detection, alternative suggestion, free-routing fallback. 246 tests.
236
- - v1.3.0: Outcome calibration, mixed-task decomposition, composite execution, adaptive replanning. 317 tests.
237
- - v1.4.0: Session spine — `roleos init claude`, `roleos doctor`, route cards, /roleos-route + /roleos-review + /roleos-status commands. 335 tests.
237
- - v1.5.0: Hook spine — 5 lifecycle hooks for runtime enforcement. 358 tests.
238
- - v1.6.0: Artifact spine — 20 per-role artifact contracts, 7 pack handoff contracts, structural validation. 385 tests.
239
- - v1.7.0: Completion proof — real tasks run through the full stack. `roleos artifacts` CLI. Honest escalation on structural fixes. 398 tests.
240
- - v1.8.0: Mission library (Phase S) — 6 named missions, runner engine, completion reports. Hardened from 6 real trial runs. 481 tests.
241
- - **v1.9.0**: Unified entry path (Phase T) — `roleos start` auto-decides mission vs pack vs free routing. Fallback ladder, composite detection, entry-path comparison trials. 527 tests.
243
-
244
- ## License
245
-
246
- MIT
247
-
248
- ---
249
-
250
- Built by <a href="https://mcp-tool-shop.github.io/">MCP Tool Shop</a>
1
+ <p align="center">
2
+ <a href="README.ja.md">日本語</a> | <a href="README.zh.md">中文</a> | <a href="README.es.md">Español</a> | <a href="README.fr.md">Français</a> | <a href="README.hi.md">हिन्दी</a> | <a href="README.it.md">Italiano</a> | <a href="README.pt-BR.md">Português (BR)</a>
3
+ </p>
4
+
5
+
6
+ <p align="center">
7
+ <img src="https://raw.githubusercontent.com/mcp-tool-shop-org/brand/main/logos/role-os/readme.png" alt="Role OS" width="600">
8
+ </p>
9
+
10
+ <p align="center">
11
+ <a href="https://github.com/mcp-tool-shop-org/role-os/actions"><img src="https://github.com/mcp-tool-shop-org/role-os/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
12
+ <a href="https://www.npmjs.com/package/role-os"><img src="https://img.shields.io/npm/v/role-os" alt="npm"></a>
13
+ <a href="LICENSE"><img src="https://img.shields.io/badge/license-MIT-blue" alt="MIT License"></a>
14
+ <a href="https://mcp-tool-shop-org.github.io/role-os/"><img src="https://img.shields.io/badge/Landing_Page-live-brightgreen" alt="Landing Page"></a>
15
+ </p>
16
+
17
+ A multi-Claude operating system that staffs, routes, validates, and runs work through 31 specialized role contracts. Creates task packets, assembles the right team from scored role matching, detects broken chains before execution, auto-routes recovery when work is blocked or rejected, and requires structured evidence in every verdict.
18
+
19
+ ## What it does
20
+
21
+ Role OS is the professional way to use multi-Claude. It prevents the specific failures that generic AI workflows produce:
22
+
23
+ - **Drift** — roles stay in lane. Product doesn't redesign. Frontend doesn't redefine scope. Backend doesn't invent product direction.
24
+ - **False completion** — the done definition is concrete. Work that hides gaps, skips verification, or solves a different problem gets rejected.
25
+ - **Contamination** — forked or inherited projects carry identity residue. Role OS detects and rejects cross-project drift in terminology, visuals, and mental models.
26
+ - **Vibes-based progress** — every handoff is structured. Every verdict ties to evidence. "It feels done" is not a valid state.
27
+
28
+ ## How it works
29
+
30
+ Describe your task. Role OS decides the right level of orchestration automatically.
31
+
32
+ ```bash
33
+ roleos start "fix the crash in save handler"
34
+ # → MISSION: Bugfix & Diagnosis (70% confidence)
35
+ # Chain: Repo Researcher → Backend Engineer → Test Engineer → Critic Reviewer
36
+
37
+ roleos start "add a new export command"
38
+ # PACK: Feature Build (50% confidence)
39
+ # Roles: Orchestrator, Product Strategist, Spec Writer, Backend Engineer, Test Engineer, Critic Reviewer
40
+
41
+ roleos start "something completely novel"
42
+ # FREE-ROUTING (10% confidence)
43
+ # Hint: Create a packet and run `roleos route` for role-level routing
44
+ ```
45
+
46
+ **The fallback ladder:**
47
+
48
+ 1. **Mission** — when the task matches a proven recurring workflow (bugfix, treatment, feature-ship, docs, security, research). Known role chain, artifact flow, escalation branches, and honest-partial definitions.
49
+ 2. **Pack** — when the task is a known family but not a full mission shape. 7 calibrated team packs with auto-selection and mismatch guards.
50
+ 3. **Free routing** — when the task is novel, mixed, or uncertain. Scores all 31 roles against packet content and assembles a dynamic chain.
51
+
52
+ The system never forces work through the wrong abstraction. It explains why it chose each level and offers alternatives.
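As a reading aid for the ladder described above, here is a minimal sketch of a confidence-gated fallback. The thresholds, the `chooseEntry` name, and the match-object shape are all assumptions made for illustration; the README does not state how Role OS actually scores or gates each level.

```javascript
// Hypothetical cutoffs: the real values used by Role OS are not documented in this diff.
const MISSION_MIN = 0.6;
const PACK_MIN = 0.4;

// Return the candidate with the highest confidence, or null for an empty list.
function best(matches) {
  return matches.reduce(
    (top, m) => (top === null || m.confidence > top.confidence ? m : top),
    null
  );
}

// Walk the ladder top-down and stop at the first rung whose best match clears its cutoff.
function chooseEntry(missionMatches, packMatches) {
  const mission = best(missionMatches);
  if (mission && mission.confidence >= MISSION_MIN) {
    return { level: "mission", name: mission.name };
  }
  const pack = best(packMatches);
  if (pack && pack.confidence >= PACK_MIN) {
    return { level: "pack", name: pack.name };
  }
  // Nothing matched well enough: fall through to role-level routing.
  return { level: "free-routing", name: null };
}

const bugfix = chooseEntry(
  [{ name: "bugfix", confidence: 0.7 }],
  [{ name: "bugfix", confidence: 0.5 }]
);
const feature = chooseEntry(
  [{ name: "feature-ship", confidence: 0.3 }],
  [{ name: "feature", confidence: 0.5 }]
);
const novel = chooseEntry([], []);
console.log(bugfix.level, feature.level, novel.level);
```

The point of the shape is that a weak mission match never blocks a stronger pack match, and an empty result still yields a usable answer rather than an error.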
53
+
54
+ **One command to active execution:**
55
+
56
+ ```bash
57
+ roleos run "fix the crash in save handler"
58
+ # Created run: run-1234
59
+ # Entry: MISSION (bugfix)
60
+ # → Started step 0: Repo Researcher → diagnosis-report
61
+ # Guidance: Required sections: entrypoints, module-map, build-test-commands
62
+
63
+ roleos next # Start the next step
64
+ roleos complete diagnosis.md # Complete the active step with artifact
65
+ roleos explain # Show full run state and guidance
66
+ roleos resume # Continue an interrupted run
67
+ roleos report # Generate completion report
68
+ roleos friction # Measure operator touches
69
+ ```
70
+
71
+ **Interventions when things go wrong:**
72
+
73
+ ```bash
74
+ roleos retry 0 # Retry a failed step
75
+ roleos reroute 1 "Frontend Developer" "UI bug" # Swap a role
76
+ roleos escalate "Test Engineer" "Repo Researcher" "missed edge case" "re-diagnose"
77
+ roleos block 2 "waiting for API spec"
78
+ roleos reopen 0 "found issue in review"
79
+ ```
80
+
81
+ Runs persist to disk (`.claude/runs/`), so interrupted sessions resume cleanly. Every step includes operator guidance: what to produce, required sections, and stop conditions.
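The persistence claim above can be pictured with a small sketch. The run shape, status names, and helper functions here are hypothetical and are not the Role OS schema; serialisation to a string stands in for the file that a real implementation would write under `.claude/runs/`.

```javascript
// Hypothetical run shape: field names are illustrative, not taken from Role OS.
function createRun(id, chain) {
  return {
    id,
    steps: chain.map((role) => ({ role, status: "pending", artifact: null })),
  };
}

// Start the first pending step, unless a step is already active.
function next(run) {
  if (run.steps.some((s) => s.status === "active")) return run;
  const step = run.steps.find((s) => s.status === "pending");
  if (step) step.status = "active";
  return run;
}

// Mark the active step complete and record the artifact it produced.
function complete(run, artifact) {
  const step = run.steps.find((s) => s.status === "active");
  if (!step) throw new Error("no active step to complete");
  step.status = "completed";
  step.artifact = artifact;
  return run;
}

// Saving after every change is what makes resuming possible.
const save = (run) => JSON.stringify(run);
const resume = (text) => JSON.parse(text);

let run = createRun("run-1234", [
  "Repo Researcher",
  "Backend Engineer",
  "Test Engineer",
  "Critic Reviewer",
]);
run = next(run);
run = complete(run, "diagnosis.md");
const stored = save(run); // the session is interrupted here
const resumed = next(resume(stored)); // a later session picks up at step 1
console.log(resumed.steps.map((s) => `${s.role}: ${s.status}`).join("\n"));
```

Because all state lives in the serialised record, the resumed session needs nothing from the interrupted one.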
82
+
83
+ **Once routed:**
84
+
85
+ 1. **Each role produces a handoff** — structured output with evidence items that reduce ambiguity for the next role
86
+ 2. **Critic reviews against contract** — accepts, rejects, or blocks based on structured evidence, not impression
87
+ 3. **Recovery routes automatically** — blocked or rejected work gets routed to the right resolver with a reason, recovery type, and required artifact
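The three steps above can be sketched as an evidence-gated verdict followed by a recovery record. Everything named here (`review`, `recover`, the evidence kinds, the choice of resolver) is an assumption for illustration only; the README does not describe the actual rules.

```javascript
// Hypothetical verdict rules: a blocker blocks, missing evidence rejects, otherwise accept.
function review(handoff, requiredKinds) {
  const present = new Set(handoff.evidence.map((e) => e.kind));
  const missing = requiredKinds.filter((kind) => !present.has(kind));

  if (handoff.blockers.length > 0) {
    return { verdict: "block", reason: handoff.blockers[0], missing };
  }
  if (missing.length > 0) {
    return { verdict: "reject", reason: `missing evidence: ${missing.join(", ")}`, missing };
  }
  return { verdict: "accept", reason: "all required evidence present", missing };
}

// Hypothetical recovery routing: rejected work returns to its producer,
// blocked work goes to the Orchestrator. The real resolver table may differ.
function recover(result, producerRole) {
  if (result.verdict === "accept") return null;
  return {
    resolver: result.verdict === "block" ? "Orchestrator" : producerRole,
    reason: result.reason,
    requiredArtifact: result.missing.length > 0 ? result.missing : ["unblock-note"],
  };
}

const handoff = {
  role: "Backend Engineer",
  evidence: [{ kind: "diff", ref: "patch.md" }],
  blockers: [],
};
const result = review(handoff, ["diff", "test-run"]);
const recovery = recover(result, handoff.role);
console.log(result.verdict, recovery);
```

The verdict is computed only from listed evidence, so a handoff that merely asserts completion cannot be accepted.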
88
+
89
+ ## Org rollout state
90
+
91
+ Org-wide rollout state (queue, decisions, audit records, per-repo lock packets) lives in a separate private repo: [`role-os-rollout`](https://github.com/mcp-tool-shop-org/role-os-rollout). This repo is the product; that repo is operational state.
92
+
93
+ ## Memory and continuity
94
+
95
+ Role OS does not own or duplicate the memory layer. Where Claude project memory exists, it is the canonical continuity system — repo facts, decisions, open loops, and treatment history live there.
96
+
97
+ Role OS integrates with Claude project memory. It does not replace it.
98
+
99
+ ## Full treatment and shipcheck
100
+
101
+ Full treatment is a canonical 7-phase protocol defined in Claude project memory (`memory/full-treatment.md`). Role OS routes and reviews treatments using role contracts, handoffs, and critic gates — it does not redefine the protocol.
102
+
103
+ **Shipcheck** is the 31-item quality gate that runs before full treatment. Hard gates A-D must pass before any treatment begins. Canonical reference: `memory/shipcheck.md`.
104
+
105
+ Order: Shipcheck first, then full treatment. No v1.0.0 without passing hard gates.
106
+
107
+ ## 31 roles across 8 packs
108
+
109
+ | Pack | Roles |
110
+ |------|-------|
111
+ | **Core** (3) | Orchestrator, Product Strategist, Critic Reviewer |
112
+ | **Engineering** (7) | Frontend Developer, Backend Engineer, Test Engineer, Refactor Engineer, Performance Engineer, Dependency Auditor, Security Reviewer |
113
+ | **Design** (2) | UI Designer, Brand Guardian |
114
+ | **Marketing** (1) | Launch Copywriter |
115
+ | **Treatment** (7) | Repo Researcher, Repo Translator, Docs Architect, Metadata Curator, Coverage Auditor, Deployment Verifier, Release Engineer |
116
+ | **Product** (3) | Feedback Synthesizer, Roadmap Prioritizer, Spec Writer |
117
+ | **Research** (4) | UX Researcher, Competitive Analyst, Trend Researcher, User Interview Synthesizer |
118
+ | **Growth** (4) | Launch Strategist, Content Strategist, Community Manager, Support Triage Lead |
119
+
120
+ Every role has a full contract: mission, use when, do not use when, expected inputs, required outputs, quality bar, and escalation triggers. Every role is routable — `roleos route` can recommend any of them based on packet content.
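A contract with the seven sections listed above lends itself to a completeness check. The sketch below assumes a camelCase object encoding and a `missingSections` helper, neither of which comes from the package; Role OS ships its contracts as Markdown.

```javascript
// The seven section names come from the README; the camelCase keys are an assumed encoding.
const CONTRACT_SECTIONS = [
  "mission",
  "useWhen",
  "doNotUseWhen",
  "expectedInputs",
  "requiredOutputs",
  "qualityBar",
  "escalationTriggers",
];

// List every section that is absent or empty (empty string or empty array).
function missingSections(contract) {
  return CONTRACT_SECTIONS.filter((key) => {
    const value = contract[key];
    return value === undefined || value === null || value.length === 0;
  });
}

const draft = {
  mission: "Diagnose root cause",
  useWhen: ["a crash is reported"],
  requiredOutputs: [],
};
const gaps = missingSections(draft);
console.log(gaps);
```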
121
+
122
+ ## Quick start
123
+
124
+ ```bash
125
+ npx role-os init
126
+
127
+ # Describe what you need — Role OS picks the right level:
128
+ roleos run "fix the crash in save handler"
129
+ # Creates run, picks bugfix mission, starts first step with guidance
130
+
131
+ # Step through:
132
+ roleos next # Start next step
133
+ roleos complete artifact.md # Complete with artifact
134
+ roleos explain # Show full state
135
+ roleos report # Completion report
136
+
137
+ # Or go manual:
138
+ roleos start "fix the crash" # Entry decision only (no run)
139
+ roleos packet new feature
140
+ roleos route .claude/packets/my-feature.md
141
+ roleos review .claude/packets/my-feature.md accept
142
+
143
+ # Explore missions and packs:
144
+ roleos mission list
145
+ roleos packs list
146
+ ```
147
+
148
+ ## When not to use Role OS
149
+
150
+ - Single-line fixes, typos, or obvious bugs
151
+ - Exploratory research with no defined output
152
+ - Work that fits in one person's head in 5 minutes
153
+ - Emergency hotfixes that need to ship before a review chain completes
154
+ - Projects where you want speed over structure
155
+
156
+ ## Evidence
157
+
158
+ Role OS was proven across three trial shapes in two structurally different repos:
159
+
160
+ **Trial 001 — Feature work** (Crew Screen, Star Freight)
161
+ - 7-role chain, 45 test scenarios, 0 role collisions
162
+ - Prevented contamination from fork ancestor, caught inline invention, surfaced honest blockers
163
+
164
+ **Trial 002 — Integration work** (CampaignState wiring, Star Freight)
165
+ - 5-role chain, resolved architectural seam without fallback lies
166
+ - Anti-fallback tests proved the live path is real, not placeholder
167
+
168
+ **Trial 003 — Identity work** (Contamination purge, Star Freight)
169
+ - 6-role chain, 51 test scenarios including durable CI contamination defense
170
+ - Repaired inherited fiction drift without collapsing into broad redesign
171
+
172
+ **Portability trial** (Persona consistency, sensor-humor)
173
+ - Same spine, different language/domain/stack
174
+ - Adopted with context changes only — no core contract modifications
175
+
176
+ **Full treatment FT-001** (portlight-desktop)
177
+ - 7-phase staffed treatment with Treatment Pack roles
178
+ - Shipcheck gating proven, zero role collisions
179
+
180
+ **Full treatment FT-002** (studioflow)
181
+ - Same treatment pack, structurally different repo (creative workspace vs game)
182
+ - Treatment Pack portable — no contract modifications needed
183
+
184
+ ## Core properties
185
+
186
+ These are non-negotiable. If a change weakens any of them, reject it.
187
+
188
+ - Role boundaries hold
189
+ - Review has teeth
190
+ - Escalation stays honest
191
+ - Packets stay testable
192
+ - Portability requires context adaptation, not core surgery
193
+
194
+ ## Project structure
195
+
196
+ ```
197
+ role-os/
198
+ bin/roleos.mjs ← CLI entrypoint
199
+ src/
200
+ entry.mjs ← Unified entry: mission → pack → free routing
201
+ entry-cmd.mjs ← `roleos start` CLI command
202
+ run.mjs ← Persistent run engine: create → step → pause → resume → report
203
+ run-cmd.mjs ← `roleos run/resume/next/explain/complete/fail` + interventions
204
+ mission.mjs ← 6 named mission types (feature, bugfix, treatment, docs, security, research)
205
+ mission-run.mjs ← Mission runner: create → step → complete → report
206
+ mission-cmd.mjs ← `roleos mission` CLI commands
207
+ route.mjs ← 31-role routing + dynamic chain builder
208
+ packs.mjs ← 7 calibrated team packs + auto-selection
209
+ conflicts.mjs ← 4-pass conflict detection
210
+ escalation.mjs ← Auto-routing for blocked/rejected/split
211
+ evidence.mjs ← Structured evidence + role-aware requirements
212
+ dispatch.mjs ← Runtime dispatch manifests for multi-claude
213
+ artifacts.mjs ← 20 per-role artifact contracts + 7 pack handoffs
214
+ decompose.mjs ← Composite task detection + splitting
215
+ composite.mjs ← Dependency-ordered execution + recovery
216
+ replan.mjs ← Mid-run adaptive replanning
217
+ calibration.mjs ← Outcome recording + weight tuning
218
+ hooks.mjs ← 5 lifecycle hooks for runtime enforcement
219
+ session.mjs ← Session scaffolding + doctor
220
+ test/ ← 613 tests across 25 test files
221
+ starter-pack/ ← Drop-in role contracts, policies, schemas, workflows
222
+ ```
223
+
224
+ ## Security
225
+
226
+ Role OS operates **locally only**. It copies markdown templates and writes packet/verdict files to your repository's `.claude/` directory. It does not access the network, handle secrets, or collect telemetry. No dangerous operations — all file writes use skip-if-exists by default. See [SECURITY.md](SECURITY.md) for the full policy.
227
+
228
+ ## The operating system
229
+
230
+ | Layer | What it does | Status |
231
+ |-------|-------------|--------|
232
+ | **Routing** | Scores all 31 roles against packet content, explains recommendations, assesses confidence | ✓ Shipped |
233
+ | **Chain builder** | Assembles phase-ordered chains from scored roles, packet-type biased not template-locked | ✓ Shipped |
234
+ | **Conflict detection** | 4-pass validation: hard conflicts, sequence, redundancy, coverage gaps. Repair suggestions. | ✓ Shipped |
235
+ | **Escalation** | Auto-routes blocked/rejected/split work to the right resolver with reason + required artifact | ✓ Shipped |
236
+ | **Evidence** | Role-aware structured evidence in verdicts. Sufficiency checks. 12 evidence kinds. | ✓ Shipped |
237
+ | **Dispatch** | Generates execution manifests for multi-claude. Per-role tool profiles, system prompts, budgets. | ✓ Shipped |
238
+ | **Trials** | Full roster proven: 30/30 gold-task + 5/5 negative trials. 7 pack trials complete. | ✓ Complete |
239
+ | **Team Packs** | 7 calibrated packs with auto-selection, mismatch guards, and free-routing fallback. | ✓ Shipped |
240
+ | **Outcome calibration** | Records run outcomes, tunes pack/role weights from results, adjusts confidence thresholds. | ✓ Shipped |
241
+ | **Mixed-task decomposition** | Detects composite work, splits into child packets, assigns packs, preserves dependencies. | ✓ Shipped |
242
+ | **Composite execution** | Runs child packets in dependency order with artifact passing, branch recovery, and synthesis. | ✓ Shipped |
243
+ | **Adaptive replanning** | Mid-run scope changes, findings, or new requirements update the plan without restarting. | ✓ Shipped |
244
+ | **Session spine** | `roleos init claude` scaffolds CLAUDE.md, /roleos-route, /roleos-review, /roleos-status. `roleos doctor` verifies wiring. Route cards prove engagement. | ✓ Shipped |
245
+ | **Hook spine** | 5 lifecycle hooks (SessionStart, PromptSubmit, PreToolUse, SubagentStart, Stop). Advisory enforcement: route card reminders, write-tool gating, subagent role injection, completion audit. | ✓ Shipped |
246
+ | **Artifact spine** | 20 per-role artifact contracts. 7 pack handoff contracts. Structural validation. Chain completeness checks. Downstream roles never guess what they received. | ✓ Shipped |
247
+ | **Mission library** | 6 named missions (feature-ship, bugfix, treatment, docs-release, security-hardening, research-launch). Each declares pack, role chain, artifact flow, escalation branches, honest-partial definition. All 6 trial-run and hardened. | ✓ Shipped |
248
+ | **Mission runner** | Create runs, step through with tracked state, complete/fail with honest reporting. Blocked-step propagation, out-of-chain escalation warnings, last-step re-opening. | ✓ Shipped |
249
+ | **Unified entry** | `roleos start` decides mission vs pack vs free routing automatically. Fallback ladder with confidence scores, alternatives, and composite detection. | ✓ Shipped |
250
+ | **Persistent runs** | `roleos run` creates disk-backed runs. `resume`, `next`, `explain`, `complete`, `fail`. Interventions: reroute, escalate, retry, block, reopen. Step-local guidance. Friction measurement. | ✓ Shipped |
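The "Composite execution" row above mentions running child packets in dependency order. One common way to get such an order is a layered topological sort in the spirit of Kahn's algorithm, sketched below. The packet shape (`id`, `dependsOn`) and the `executionOrder` name are assumptions; the table does not say which algorithm `composite.mjs` uses.

```javascript
// Layered topological sort over hypothetical child packets: { id, dependsOn: [ids] }.
// Assumes every id in dependsOn refers to a packet in the list.
function executionOrder(packets) {
  const remaining = new Map(packets.map((p) => [p.id, new Set(p.dependsOn)]));
  const order = [];

  while (remaining.size > 0) {
    // A packet is ready once all of its dependencies have been scheduled.
    const ready = [...remaining.keys()].filter((id) => remaining.get(id).size === 0);
    if (ready.length === 0) {
      throw new Error("dependency cycle among child packets");
    }
    for (const id of ready) {
      order.push(id);
      remaining.delete(id);
    }
    // Scheduled packets no longer block anything that waits on them.
    for (const deps of remaining.values()) {
      for (const id of ready) deps.delete(id);
    }
  }
  return order;
}

const order = executionOrder([
  { id: "docs", dependsOn: ["api"] },
  { id: "api", dependsOn: ["schema"] },
  { id: "schema", dependsOn: [] },
  { id: "ui", dependsOn: ["api"] },
]);
console.log(order.join(" -> "));
```

Packets that become ready in the same pass (here `docs` and `ui`) have no ordering constraint between them, which is where a real runner could branch or recover independently.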
251
+
252
+ ## 6 missions
253
+
254
+ | Mission | Pack | Roles | When to use |
255
+ |---------|------|-------|-------------|
256
+ | `feature-ship` | feature | 5 | Full feature delivery: scope → spec → implement → test → review |
257
+ | `bugfix` | bugfix | 4 | Diagnose root cause, fix, test, verify |
258
+ | `treatment` | treatment | 4 | Shipcheck + polish + docs + CI verify + review |
259
+ | `docs-release` | docs | 2 | Write/update documentation, release notes |
260
+ | `security-hardening` | security | 4 | Threat model, audit, fix vulnerabilities, re-audit, verify |
261
+ | `research-launch` | research | 4 | Frame question, research, document findings, decide |
262
+
263
+ Each mission includes honest-partial definitions — when work stalls, the system documents what was completed and what remains instead of bluffing completion.
264
+
265
+ ## Status
266
+
267
+ - v0.1–v0.4: Foundation — trials, adoption, treatment pack, starter pack
268
+ - v1.0.0: 32 roles, full CLI, proven treatment, multi-repo portability
269
+ - v1.0.2: Role OS lockdown (bootstrap truth fixes, init --force)
270
+ - v1.1.0: 31 roles, full routing spine, conflict detection, escalation, evidence, dispatch, 7 proven team packs. 35 execution trials. 212 tests.
271
+ - v1.2.0: Calibrated packs promoted to default entry. Auto-selection, mismatch detection, alternative suggestion, free-routing fallback. 246 tests.
272
+ - v1.3.0: Outcome calibration, mixed-task decomposition, composite execution, adaptive replanning. 317 tests.
273
+ - v1.4.0: Session spine — `roleos init claude`, `roleos doctor`, route cards, /roleos-route + /roleos-review + /roleos-status commands. 335 tests.
274
+ - v1.5.0: Hook spine — 5 lifecycle hooks for runtime enforcement. 358 tests.
275
+ - v1.6.0: Artifact spine — 20 per-role artifact contracts, 7 pack handoff contracts, structural validation. 385 tests.
276
+ - v1.7.0: Completion proof — real tasks run through the full stack. `roleos artifacts` CLI. Honest escalation on structural fixes. 398 tests.
277
+ - v1.8.0: Mission library (Phase S) — 6 named missions, runner engine, completion reports. Hardened from 6 real trial runs. 481 tests.
278
+ - v1.9.0: Unified entry path (Phase T) — `roleos start` auto-decides mission vs pack vs free routing. Fallback ladder, composite detection, entry-path comparison trials. 527 tests.
279
+ - **v2.0.0**: Operator friction pass (Phase U) — `roleos run` creates persistent disk-backed runs. Resume, next, explain, complete, fail. Interventions: reroute, escalate, retry, block, reopen. Step-local guidance at every step. Friction measurement. 6 friction trials. 613 tests.
280
+
281
+ ## License
282
+
283
+ MIT
284
+
285
+ ---
286
+
287
+ Built by <a href="https://mcp-tool-shop.github.io/">MCP Tool Shop</a>