@sulhadin/orchestrator 3.0.0-beta.9 → 3.0.0

package/README.md CHANGED
@@ -4,7 +4,7 @@ AI team orchestration for [Claude Code](https://docs.anthropic.com/en/docs/claud
4
4
 
5
5
  ## What is Orchestra?
6
6
 
7
- Orchestra turns a single Claude Code session into a coordinated development team. A Product Manager plans features, a Conductor executes them — switching between specialized roles (backend, frontend, architect) automatically. Each role has strict boundaries, every commit passes verification, and the system learns from past milestones.
7
+ Orchestra turns a single Claude Code session into a coordinated development team. A Product Manager plans features, a Conductor orchestrates them — delegating each phase to a sub-agent with the right role (backend, frontend, architect). Sub-agents own implementation and verification; conductor owns commits. Each role has strict boundaries, every commit passes verification, and the system learns from past milestones.
8
8
 
9
9
  No infrastructure. No API keys. Just markdown files and Claude Code.
10
10
 
@@ -23,12 +23,12 @@ Terminal 1 (PM): Terminal 2 (Conductor):
23
23
  /orchestra pm /orchestra start
24
24
  │ │
25
25
  ├─ Discuss features ├─ Scan milestones
26
- ├─ Create milestones ├─ Activate architect → RFC
27
- ├─ Groom phases ├─ Activate backend → code + tests
28
- │ ├─ Activate frontend → UI
26
+ ├─ Create milestones ├─ Delegate to architect → RFC
27
+ ├─ Groom phases ├─ Delegate to backend → code + tests
28
+ │ ├─ Delegate to frontend → UI
29
29
  │ (plan M2 while M1 runs) ├─ Call reviewer → code review
30
30
  │ ├─ Push → milestone done
31
- │ └─ Loop next milestone
31
+ │ └─ Stop (inline) or next milestone (agent)
32
32
  ```
33
33
 
34
34
  ## Quick Example
@@ -49,8 +49,7 @@ PM challenges scope, creates M1-user-auth with 3 phases
49
49
  ⚙️ backend → phase-2: API endpoints → committed
50
50
  🎨 frontend → phase-3: Login UI → committed
51
51
  🔍 reviewer → approved
52
- 🚦 Push? yes
53
- ✅ M1-user-auth done. Checking for next milestone...
52
+ M1-user-auth done. Pushed to origin.
54
53
  ```
55
54
 
56
55
  ## Commands
@@ -63,6 +62,8 @@ PM challenges scope, creates M1-user-auth with 3 phases
63
62
  | `/orchestra start --auto` | Fully autonomous — warns once, then auto-push |
64
63
  | `/orchestra hotfix {desc}` | Ultra-fast fix: implement → verify → commit → push |
65
64
  | `/orchestra status` | Milestone status report (PM only) |
65
+ | `/orchestra verifier [N]` | Verify milestones match PRD/RFC requirements (PM only) |
66
+ | `/orchestra rewind [N]` | Review execution history: decisions, metrics, insights (PM only) |
66
67
  | `/orchestra blueprint {name}` | Generate milestones from template |
67
68
  | `/orchestra blueprint add` | Save current work as reusable template |
68
69
  | `/orchestra create-role` | Create a new role interactively (Orchestrator only) |
@@ -86,7 +87,7 @@ PM challenges scope, creates M1-user-auth with 3 phases
86
87
  │ ├── conductor.md ← Autonomous milestone executor
87
88
  │ └── reviewer.md ← Independent code review
88
89
  ├── skills/*.orchestra.md ← 14 domain checklists
89
- ├── rules/*.orchestra.md ← 8 discipline rules
90
+ ├── rules/*.orchestra.md ← Discipline rules (auto-loaded)
90
91
  └── commands/orchestra/ ← /orchestra commands
91
92
 
92
93
  .orchestra/ ← Project data + config
@@ -101,10 +102,11 @@ PM challenges scope, creates M1-user-auth with 3 phases
101
102
 
102
103
  **Config-driven pipeline** — `.orchestra/config.yml` controls everything: verification commands (customize for Go, Python, Rust), approval gates, thresholds, parallel execution. No hardcoded assumptions.
103
104
 
104
- **Three complexity levels** — PM sets per milestone:
105
- - `quick` Engineer → Commit → Push (trivial changes)
106
- - `standard` Engineer → Review → Push (typical features)
107
- - `full` Architect → Engineer → Review → Push (complex work)
105
+ **Four complexity levels with model tiering** — PM sets per phase:
106
+ - `trivial` (haiku) → Config changes, version bumps
107
+ - `quick` (sonnet) → Single-file fixes, simple CRUD
108
+ - `standard` (sonnet) → Typical features (default)
109
+ - `complex` (opus) → New subsystems, architectural changes
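
The tiering in the added lines can be read as a lookup from phase complexity to model tier. The sketch below is only an illustration: `MODEL_BY_COMPLEXITY` and `modelFor` are hypothetical names, and the fallback to `standard` is an assumption based on the "(default)" label above, not code taken from the package.

```javascript
// Mapping copied from the list above; the identifiers themselves are hypothetical.
const MODEL_BY_COMPLEXITY = new Map([
  ["trivial", "haiku"],
  ["quick", "sonnet"],
  ["standard", "sonnet"],
  ["complex", "opus"],
]);

// Assumption: a phase with a missing or unknown complexity falls back to `standard`.
function modelFor(complexity) {
  return MODEL_BY_COMPLEXITY.get(complexity) ?? MODEL_BY_COMPLEXITY.get("standard");
}

console.log(modelFor("complex")); // opus
console.log(modelFor(undefined)); // sonnet
```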
108
110
 
109
111
  **Verification gate** — Tests + lint must pass before every commit. Commands come from config. Fails 3 times → phase marked failed, escalated to user.
110
112
 
@@ -116,6 +118,8 @@ PM challenges scope, creates M1-user-auth with 3 phases
116
118
 
117
119
  **Role boundaries** — Enforced via `.claude/rules/`. PM cannot write code. Engineers cannot modify system files. Orchestrator cannot write features. Boundaries checked by file path, not by words.
118
120
 
121
+ **Milestone isolation** — `inline` mode stops after each milestone (user compacts manually). `agent` mode spawns each milestone in its own sub-agent — context freed automatically, enabling 20+ milestones in a single `--auto` session.
122
+
119
123
  **Stuck detection** — Detects repeated failures, circular fixes, over-engineering. Tries different approach once, then escalates. Auto mode skips to next phase.
120
124
 
121
125
  ## Upgrading
@@ -135,7 +139,7 @@ Smart merge on upgrade:
135
139
  | Blueprints (your custom) | Preserved |
136
140
  | milestones/ | Untouched |
137
141
  | knowledge.md | Preserved |
138
- | config.yml | Preserved |
142
+ | config.yml | Smart merged (user values preserved, new keys added) |
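
The new `config.yml` row describes a merge in which user values are preserved and new keys are added. A minimal sketch of that rule on already-parsed config objects follows. `smartMerge` and `isPlainObject` are hypothetical helpers, the `disabled` value is made up for the example, and the package's real merge code is not part of this diff.

```javascript
// Hypothetical helpers: merge upgraded defaults into the user's parsed config.
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function smartMerge(userConfig, newDefaults) {
  const merged = { ...userConfig };
  for (const [key, defaultValue] of Object.entries(newDefaults)) {
    if (!(key in merged)) {
      merged[key] = defaultValue; // key introduced by the upgrade
    } else if (isPlainObject(merged[key]) && isPlainObject(defaultValue)) {
      merged[key] = smartMerge(merged[key], defaultValue); // recurse into sections
    }
    // Otherwise the user's existing value is kept as-is.
  }
  return merged;
}

const user = { pipeline: { parallel: "enabled" } };
const defaults = { pipeline: { parallel: "disabled", milestone_isolation: "inline" } };
console.log(smartMerge(user, defaults));
// { pipeline: { parallel: 'enabled', milestone_isolation: 'inline' } }
```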
139
143
 
140
144
  ## Documentation
141
145
 
@@ -6,9 +6,17 @@ const path = require("path");
6
6
  const rootDir = process.cwd();
7
7
  const templateDir = path.join(rootDir, "template");
8
8
 
9
+ // Dev-only agents that should NOT be published to users
10
+ const DEV_ONLY_AGENTS = new Set([
11
+ "codebase-deep-analyzer.md",
12
+ "orchestra-analyzer.md",
13
+ "orchestra-reviewer.md",
14
+ "repo-deep-analyzer.md",
15
+ ]);
16
+
9
17
  // System files to include in the template
10
18
  const SYSTEM_PATHS = [
11
- { src: ".claude/agents", dest: ".claude/agents" },
19
+ { src: ".claude/agents", dest: ".claude/agents", filter: (f) => !DEV_ONLY_AGENTS.has(f) },
12
20
  { src: ".claude/commands/orchestra", dest: ".claude/commands/orchestra" },
13
21
  { src: ".claude/rules", dest: ".claude/rules", filter: (f) => f.endsWith(".orchestra.md") },
14
22
  { src: ".claude/skills", dest: ".claude/skills", filter: (f) => f.endsWith(".orchestra.md") },
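
Each `SYSTEM_PATHS` entry may carry a `filter` predicate over file names. The sketch below shows one way such a predicate could be applied before copying. `selectFiles` is a hypothetical helper and the copy loop of the real build script is not shown in this hunk, so treat this as an assumption about how the entries are consumed.

```javascript
// Set and entry copied from the hunk above.
const DEV_ONLY_AGENTS = new Set([
  "codebase-deep-analyzer.md",
  "orchestra-analyzer.md",
  "orchestra-reviewer.md",
  "repo-deep-analyzer.md",
]);

const agentsEntry = {
  src: ".claude/agents",
  dest: ".claude/agents",
  filter: (f) => !DEV_ONLY_AGENTS.has(f),
};

// Hypothetical helper: keep only the file names an entry's filter accepts.
// Entries without a filter keep every file.
function selectFiles(entry, fileNames) {
  return entry.filter ? fileNames.filter((name) => entry.filter(name)) : fileNames;
}

console.log(selectFiles(agentsEntry, ["conductor.md", "reviewer.md", "orchestra-reviewer.md"]));
// [ 'conductor.md', 'reviewer.md' ]
```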
package/package.json CHANGED
@@ -1,10 +1,10 @@
1
1
  {
2
2
  "name": "@sulhadin/orchestrator",
3
- "version": "3.0.0-beta.9",
3
+ "version": "3.0.0",
4
4
  "description": "AI Team Orchestration System — multi-role coordination for Claude Code",
5
5
  "bin": "bin/index.js",
6
6
  "scripts": {
7
- "test": "node --test bin/**/*.test.js",
7
+ "test": "node --test test/**/*.test.js",
8
8
  "template": "node bin/build-template.js",
9
9
  "prepare": "husky"
10
10
  },
@@ -18,8 +18,10 @@ by delegating each phase to a sub-agent. You NEVER implement code yourself.
18
18
 
19
19
  When started:
20
20
 
21
- 1. If `--auto`: print `Warning: Auto mode — all gates skipped, auto-push enabled.` and proceed.
21
+ 1. If `--auto`: print `Warning: Auto mode — RFC gate skipped, fully autonomous.` and proceed.
22
22
  2. Read `.orchestra/config.yml` for pipeline settings and thresholds.
23
+ - Read `pipeline.milestone_isolation` (default: `inline`).
24
+ - If `--auto` and `milestone_isolation: inline`: warn once: "Inline mode with --auto: conductor stops after each milestone. Consider `milestone_isolation: agent` for batch runs."
23
25
  3. Read `.orchestra/README.md` for orchestration rules.
24
26
  4. Read `.orchestra/knowledge.md` Active Knowledge section (skip Archive).
25
27
  5. Scan milestones:
@@ -136,10 +138,11 @@ that affect this phase. Omit for first phase.}
136
138
  ### 4. Process Sub-Agent Result (Conductor does this)
137
139
 
138
140
  - If **done** (verification passed):
139
- 1. Conductor commits → update phase status → `done`, update context.md
140
- 2. Store sub-agent ID for potential review fix cycle
141
+ 1. Conductor commits
142
+ 2. Update context.md: set phase `done`, add commit hash + files_changed, append decisions from notes
143
+ 3. Store sub-agent ID for potential review fix cycle
141
144
  - If **failed** (verification failed after max retries):
142
- 1. Log in context.md: phase name, last error summary, retry count
145
+ 1. Update context.md: set phase `failed`, add error summary + last-error + retry count
143
146
  2. Decide: retry with new sub-agent or escalate to user
144
147
 
145
148
  **Note:** Conductor owns commit only. Sub-agents own implementation + verification.
@@ -168,8 +171,8 @@ If config.yml `pipeline.parallel: enabled`:
168
171
  After all implementation phases (unless config says `review: skip`):
169
172
  1. Call reviewer agent (`.claude/agents/reviewer.md`) as sub-agent
170
173
  2. Reviewer reads git diff independently, applies checklist, returns verdict
171
- 3. **approved** → push gate
172
- 4. **approved-with-comments** → push gate, log comments in context.md
174
+ 3. **approved** → push immediately
175
+ 4. **approved-with-comments** → push immediately, log comments in context.md
173
176
  5. **changes-requested** → fix cycle:
174
177
  - Use SendMessage to continue the last phase's sub-agent with reviewer findings
175
178
  (if sub-agent no longer available, launch new sub-agent with findings + role)
@@ -179,16 +182,17 @@ After all implementation phases (unless config says `review: skip`):
179
182
  ## Approval Gates
180
183
 
181
184
  Read gate behavior from config.yml:
182
- - **Normal mode:** Ask user at configured gates (rfc_approval, push_approval).
185
+ - **Normal mode:** Ask user at RFC gate (rfc_approval). Push is automatic after review passes.
183
186
  - **Auto mode:** Skip all gates. Print status but don't wait.
184
187
 
185
188
  ## Rejection Flow
186
189
 
187
190
  - **RFC Rejected:** Ask feedback → architect revises → re-submit (max config.yml `pipeline.max_rfc_rounds`).
188
- - **Push Rejected:** Ask feedback → create fix phase → re-submit.
189
191
 
190
192
  ## Milestone Completion
191
193
 
194
+ ### Inline Mode (default)
195
+
192
196
  After push:
193
197
  1. Update milestone.md `status: done`, remove `Locked-By`.
194
198
  2. Append 5-line retrospective to knowledge.md:
@@ -200,26 +204,180 @@ After push:
200
204
  - Review findings: {N blocking, N non-blocking} — {top issue}
201
205
  - Missing skill: {name or "none"}
202
206
  ```
207
+ 3. Proceed to "Next Milestone — Mode-Dependent Behavior" → Inline Mode.
208
+
209
+ ### Agent Mode
210
+
211
+ Milestone agent handles push and returns structured result (see Milestone Agent Delegation).
212
+ Conductor processes the return:
213
+ 1. Update milestone.md `status: done`, remove `Locked-By`.
214
+ 2. Append retro from milestone agent's return to knowledge.md.
215
+ 3. Proceed to "Next Milestone — Mode-Dependent Behavior" → Agent Mode.
216
+
217
+ ## Next Milestone — Mode-Dependent Behavior
218
+
219
+ Behavior after milestone completion depends on `pipeline.milestone_isolation`:
220
+
221
+ ### Inline Mode (default)
222
+
223
+ After push and retro:
224
+ 1. **STOP.** Print: "Milestone {id} complete and pushed."
225
+ 2. Do NOT loop to next milestone.
226
+
227
+ ### Agent Mode
228
+
229
+ After milestone agent returns (retro already written in Milestone Completion above):
230
+ 1. Re-read knowledge.md Active section (may have new retros)
231
+ 2. Re-scan `.orchestra/milestones/` using Glob (PM may have created new ones)
232
+ 3. If pending → spawn next milestone agent
233
+ 4. If none → "All milestones complete. Waiting for new work from PM."
234
+
235
+ Context stays lean because all phase-level context lived in the (now ended)
236
+ milestone agent. Conductor only accumulates ~1-2k tokens per milestone
237
+ (prompt + structured result).
238
+
239
+ ## Milestone Agent Delegation (Agent Mode Only)
240
+
241
+ This section applies ONLY when config `pipeline.milestone_isolation: agent`.
242
+
243
+ In agent mode, the conductor becomes a two-tier dispatcher:
244
+ - Conductor spawns one milestone agent per milestone
245
+ - Milestone agent spawns phase sub-agents (same as current phase delegation)
246
+ - When milestone agent completes, its context is freed entirely
247
+
248
+ ### Milestone Agent Prompt Template
249
+
250
+ ```
251
+ You are a Milestone Agent executing milestone {milestone_id}: {title}.
252
+ Rules from `.claude/rules/*.orchestra.md` are automatically loaded.
253
+
254
+ **Config:**
255
+ {config_yml_content}
256
+
257
+ **Orchestration Rules:**
258
+ {readme_content}
259
+
260
+ **Active Knowledge:**
261
+ {knowledge_active_section}
262
+
263
+ **Milestone:**
264
+ {milestone.md content}
265
+
266
+ **Grooming:**
267
+ {grooming.md content}
268
+
269
+ **Context (if resuming):**
270
+ {context.md content}
271
+ Sections: `## Status` (milestone state), `## Phases` (per-phase status — skip `done` phases),
272
+ `## Decisions` (cross-phase context), `## Metrics` (duration + retries per phase).
203
273
 
204
- ## Next Milestone
274
+ **Phase files:**
275
+ {all phase file contents, in order}
205
276
 
206
- After completion:
207
- - Re-scan `.orchestra/milestones/` using Glob (PM may have created new ones)
208
- - If found → start next milestone
209
- - If none → "All milestones complete. Waiting for new work from PM."
277
+ **Role files (unique, one per role used in phases):**
278
+ {role file contents deduplicated}
279
+
280
+ **Skills (unique, one per skill used in phases):**
281
+ {skill file contents — deduplicated}
282
+
283
+ ## Your Task
284
+ Execute this milestone using the Phase Execution protocol:
285
+ 1. For each phase: pre-flight → compose prompt → delegate to phase sub-agent → process result
286
+ 2. Conductor (you) commits after each successful phase, updates context.md
287
+ 3. After all phases: trigger review (unless config says skip)
288
+ 4. After review passes: push to origin
289
+ 5. On phase failure after max retries: set phase to `failed`, log in context.md
290
+ - If stuck: set milestone status to `failed`, return immediately
291
+ 6. You own exactly ONE milestone — do NOT loop to other milestones
292
+
293
+ ## Return Format
294
+ - status: done | failed
295
+ - phases_completed: [list of phase names]
296
+ - phases_failed: [list with error summaries]
297
+ - review_verdict: approved | approved-with-comments | changes-requested | skipped
298
+ - pushed: true | false
299
+ - retro: |
300
+ ## Retro: {id} — {title} ({date})
301
+ - Longest phase: {name} (~{duration}) — {why}
302
+ - Verification retries: {count} — {which phases}
303
+ - Stuck: {yes/no} — {root cause if yes}
304
+ - Review findings: {N blocking, N non-blocking} — {top issue}
305
+ - Missing skill: {name or "none"}
306
+ - notes: {anything conductor should know for subsequent milestones}
307
+
308
+ IMPORTANT: Return retro text in your result. Do NOT write to knowledge.md — conductor handles this.
309
+ ```
310
+
311
+ ### Processing Milestone Agent Result
312
+
313
+ Conductor processes the return:
314
+
315
+ - **status: done + pushed: true** → Write retro to knowledge.md, update milestone.md status to `done`, remove `Locked-By`, proceed to next milestone.
316
+ - **status: failed** → Log failure to context.md, write partial retro to knowledge.md.
317
+ - `--auto` mode: move to next milestone.
318
+ - Normal mode: stop and report to user with options: (a) retry with fresh agent, (b) skip, (c) stop.
319
+ - **status: done + pushed: false** → Log error, escalate to user.
320
+
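
The three result cases above amount to a small decision table. The sketch below encodes it as a function. The field names `status` and `pushed` come from the Return Format, while `nextAction` and the returned action labels are hypothetical, since the conductor is a prompt rather than code.

```javascript
// Hypothetical decision helper mirroring the cases listed above.
// `auto` is true when the conductor was started with --auto.
function nextAction(result, auto) {
  if (result.status === "done" && result.pushed) return "next-milestone";
  if (result.status === "done") return "escalate"; // done, but the push did not happen
  if (result.status === "failed") return auto ? "next-milestone" : "ask-user";
  throw new Error(`Unexpected status: ${result.status}`);
}

console.log(nextAction({ status: "done", pushed: true }, false));    // next-milestone
console.log(nextAction({ status: "failed", pushed: false }, false)); // ask-user
```

In the `ask-user` case the options from the text apply: retry with a fresh agent, skip, or stop.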
321
+ ### Milestone Agent Configuration
322
+
323
+ - Use default (general-purpose) subagent_type — milestone identity is in the prompt
324
+ - Do NOT use `isolation: "worktree"` — milestones run sequentially, not in parallel
325
+ - Milestone agent inherits all conductor capabilities: git, Agent tool, file access
326
+ - On resume (milestone was `in-progress`): include context.md in prompt — milestone agent reads phase statuses and continues from last completed phase
210
327
 
211
328
  ## Context Persistence
212
329
 
213
- Update context.md at: phase start, phase completion (with sub-agent summary), errors.
214
- On resume: read context.md, continue from last completed phase.
330
+ context.md uses a fixed structure. Conductor updates it at phase start, completion, and on errors.
331
+
332
+ ### context.md Format
333
+
334
+ ```markdown
335
+ ## Status
336
+ milestone: {milestone-id}
337
+ started: {YYYY-MM-DD}
338
+ pipeline: {quick | standard | full}
339
+
340
+ ## Phases
341
+ - phase-1: {done | in-progress | failed | pending} | commit: {hash} | files: {changed files}
342
+ - phase-2: {status} | error: {error summary, retry count} | last-error: {specific error}
343
+ - phase-3: pending
344
+ ...
345
+
346
+ ## Codebase Map
347
+ {path — one-line description, generated by scout sub-agent}
348
+
349
+ ## Decisions
350
+ - phase-1: {key decision or trade-off made during implementation}
351
+ - phase-2: {why a specific approach was chosen}
352
+
353
+ ## Metrics
354
+ - phase-1: duration: ~{N}min | verification_retries: {N}
355
+ - phase-2: duration: ~{N}min | verification_retries: {N}
356
+ ```
357
+
358
+ ### Update Rules
359
+
360
+ - **Phase start:** Set phase status to `in-progress`
361
+ - **Phase done:** Set status to `done`, add commit hash and files_changed from sub-agent result
362
+ - **Phase failed:** Set status to `failed`, add error summary and last-error
363
+ - **Decisions:** Append key decisions from sub-agent's `notes` field — only non-obvious choices that affect later phases
364
+ - **Metrics:** Record approximate phase duration and verification_retries from sub-agent result
365
+ - **Milestone complete:** Retro is written to knowledge.md (see Milestone Completion)
366
+
367
+ ### On Resume
368
+
369
+ Read context.md → skip phases marked `done` → resume from first non-done phase.
370
+ `## Decisions` from completed phases are included in "previous phase summary" for the next sub-agent — this preserves cross-phase context even after session restart.
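
The resume rule (skip `done` phases, continue from the first non-done one) can be sketched as a small parser over the `## Phases` section. `firstPendingPhase` is hypothetical, and the code assumes phase lines follow the `- phase-N: status | ...` shape from the format above.

```javascript
// Hypothetical parser: return the first phase whose status is not `done`.
function firstPendingPhase(contextMd) {
  const lines = contextMd.split("\n");
  const start = lines.findIndex((line) => line.trim() === "## Phases");
  if (start === -1) return null; // no Phases section at all
  for (const line of lines.slice(start + 1)) {
    if (line.startsWith("## ")) break; // reached the next section
    const match = line.match(/^- (phase-\d+):\s*([a-z-]+)/);
    if (match && match[2] !== "done") return match[1];
  }
  return null; // every listed phase is done
}

// Sample content; the commit hash, file name, and errors are made up.
const sample = [
  "## Status",
  "milestone: M1-user-auth",
  "",
  "## Phases",
  "- phase-1: done | commit: abc123 | files: src/auth.js",
  "- phase-2: failed | error: tests failing, 3 retries | last-error: timeout",
  "- phase-3: pending",
  "",
  "## Decisions",
  "- phase-1: chose JWT",
].join("\n");

console.log(firstPendingPhase(sample)); // phase-2
```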
215
371
 
216
372
  ## Hotfix Pipeline
217
373
 
374
+ Hotfix always runs inline regardless of `milestone_isolation` setting — single-phase fast path, sub-agent isolation adds no value.
375
+
218
376
  When user types `/orchestra hotfix {description}`:
219
377
  1. Auto-create hotfix milestone + single phase
220
378
  2. Launch implementation sub-agent (model: standard) — implements, verifies, reports
221
379
  3. If done → conductor commits → push immediately (no RFC, no review, no gates)
222
- 5. Append one-liner to knowledge.md
380
+ 4. Append one-liner to knowledge.md
223
381
  6. Return to normal execution if active
224
382
 
225
383
  ## What Conductor Does NOT Do
@@ -10,7 +10,7 @@ Review code independently. No implementation context by design — only the code
10
10
 
11
11
  ## Process
12
12
 
13
- 1. Read context.md for objectives and acceptance criteria
13
+ 1. Read milestone.md for objectives, phase files for acceptance criteria, context.md for codebase map and decisions
14
14
  2. Read RFC if exists
15
15
  3. `git log origin/{branch}..HEAD` + `git diff origin/{branch}...HEAD`
16
16
  4. Detect mode from diff: backend / frontend / both → apply relevant checklist
@@ -11,6 +11,8 @@ COMMANDS:
11
11
  /orchestra start --auto Fully autonomous (warns once, then auto-push)
12
12
  /orchestra hotfix {desc} Ultra-fast fix: implement → verify → commit → push
13
13
  /orchestra status Milestone status report (PM only)
14
+ /orchestra verifier [N] Verify milestones match requirements (PM only)
15
+ /orchestra rewind [N] Review milestone execution history (PM only)
14
16
  /orchestra help Show this help
15
17
  /orchestra blueprint {name} Generate milestones from template (PM only)
16
18
  /orchestra blueprint add Save current work as blueprint (PM only)
@@ -41,7 +43,7 @@ FILES:
41
43
  .claude/skills/*.orchestra.md Domain checklists (auth, CRUD, deploy, etc.)
42
44
  .claude/rules/*.orchestra.md Discipline rules (verification, commit format, etc.)
43
45
  .claude/commands/orchestra/ Orchestra commands
44
- .orchestra/roles/ Role identities (slim, 15 lines each)
46
+ .orchestra/roles/ Role identities (one file per role)
45
47
  .orchestra/config.yml Pipeline configuration
46
48
  .orchestra/blueprints/ Project/component templates
47
49
  .orchestra/knowledge.md Append-only project knowledge
@@ -0,0 +1,60 @@
1
+ Review milestone execution history for actionable insights. PM role only.
2
+
3
+ **Usage:**
4
+ - `/orchestra rewind` — rewind all `done` milestones
5
+ - `/orchestra rewind 1,2,3` — rewind only specified milestone numbers
6
+
7
+ 1. Read `.orchestra/roles/product-manager.md` to activate PM.
8
+ 2. Scan `.orchestra/milestones/` — collect milestones to review:
9
+ - No arguments: all milestones with `status: done`
10
+ - With numbers: only milestones matching those numbers (e.g., `1` matches `M1-*`)
11
+ 3. For each milestone, read execution artifacts:
12
+ - `context.md` — structured sections:
13
+ - `## Decisions` — key choices made during implementation
14
+ - `## Metrics` — phase duration and verification retries
15
+ - `## Phases` — status, commits, errors per phase
16
+ - `knowledge.md` — retro entry for this milestone
17
+ - `grooming.md` — original scope vs what actually happened
18
+ - Review verdict and comments (from context.md or git log)
19
+ 4. Extract and present — focus on **what the user needs to know**, not execution mechanics:
20
+
21
+ ```
22
+ ## Rewind: M1-user-auth
23
+
24
+ ### Key Decisions Made During Execution
25
+ - phase-1: Used Stripe SDK v4 instead of raw API (architect RFC recommendation)
26
+ - phase-2: Split webhook handler into separate file for testability
27
+ - phase-3: Chose CSS modules over Tailwind (frontend preference)
28
+
29
+ ### Performance
30
+ - Total phases: 5 | Completed: 5 | Failed: 0
31
+ - Longest phase: phase-3 (~12min) — complex UI with form validation
32
+ - Verification retries: 3 total (phase-2: 2, phase-4: 1)
33
+ - Stuck: No
34
+
35
+ ### Review Findings
36
+ - Verdict: approved-with-comments
37
+ - Comments:
38
+ - "Consider adding index on user_email for login query" (non-blocking)
39
+ - "Error messages expose internal details" (non-blocking, logged)
40
+
41
+ ### Scope Changes
42
+ - Original grooming planned 4 phases, executed 5 (phase-3 was split during implementation)
43
+ - phase-2 scope expanded: webhook handler was not in original PRD, added during RFC
44
+
45
+ ### Unresolved Items
46
+ - 🔧 DB index on user_email — reviewer flagged, not addressed
47
+ - 🔧 Error message sanitization — reviewer flagged, not addressed
48
+ - 🔧 phase-2 workaround: hardcoded timeout — flagged as tech debt in Decisions
49
+
50
+ ### What We Learned
51
+ - 📝 Webhook handler pattern — reusable for future integrations
52
+ - ⏱️ Form validation phases consistently slow — consider a form-validation skill
53
+ - 💡 Splitting phase-3 mid-execution worked well — complex UI benefits from smaller phases
54
+ ```
55
+
56
+ 5. After all milestones, present a cross-milestone summary:
57
+ - **Unresolved items** — review comments and flagged workarounds never addressed, across all milestones
58
+ - **Recurring patterns** — same review comments, same slow phase types, same failure modes
59
+ - **Skill gaps** — missing skills that would have helped
60
+ - **Strategic suggestions** — new skills to create, process improvements, items to fix in upcoming work
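
Step 2 selects milestones by number, where `1` matches `M1-*`. A minimal sketch of that matching follows. `selectMilestones` is a hypothetical helper, the directory names other than `M1-user-auth` are made up, and the `status: done` filter for the no-argument case is left out because it requires reading each `milestone.md`.

```javascript
// Hypothetical helper: pick milestone directory names for an argument like "1,2,3".
// With no argument, every directory is returned (status filtering not shown).
function selectMilestones(arg, milestoneDirs) {
  if (!arg || arg.trim() === "") return milestoneDirs;
  const wanted = new Set(
    arg.split(",").map((n) => n.trim()).filter((n) => n !== "")
  );
  return milestoneDirs.filter((dir) => {
    const match = dir.match(/^M(\d+)-/); // compare the whole number, so "1" does not match M10
    return match !== null && wanted.has(match[1]);
  });
}

const dirs = ["M1-user-auth", "M2-billing", "M10-search"];
console.log(selectMilestones("1", dirs));    // [ 'M1-user-auth' ]
console.log(selectMilestones("2, 10", dirs)); // [ 'M2-billing', 'M10-search' ]
```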
@@ -7,7 +7,11 @@ The conductor will:
7
7
  2. Execute phases sequentially (or parallel if configured)
8
8
  3. Activate roles, load skills, implement code
9
9
  4. Trigger code review via reviewer agent
10
- 5. Push after approval (or auto-push in --auto mode)
11
- 6. Loop to next milestone until all complete
10
+ 5. Push automatically after review passes
11
+ 6. Behavior after milestone: stop (inline mode) or continue to next (agent mode)
12
+
13
+ Config `pipeline.milestone_isolation` controls post-milestone behavior:
14
+ - `inline` (default): stops after each milestone. User compacts and restarts.
15
+ - `agent`: spawns each milestone in sub-agent. Loops automatically. Best with `--auto`.
12
16
 
13
17
  Pass `--auto` flag for fully autonomous mode (warns once, then skips all gates).
@@ -5,7 +5,7 @@ Show full milestone status report. PM role only.
5
5
  3. For active milestones, read context.md for progress details.
6
6
  4. Report:
7
7
  - All milestones with status, current phase, next action
8
- - Phase details for active milestone (role, status, cost tracking)
8
+ - Phase details for active milestone (role, status, metrics)
9
9
  - Git status (branch, unpushed commits)
10
- - Cost summary (from context.md)
10
+ - Metrics summary from context.md `## Metrics` section (duration + retries per phase)
11
11
  - Actions needed (specific next steps)
@@ -0,0 +1,52 @@
1
+ Verify that implemented milestones match their requirements. PM role only.
2
+
3
+ **Usage:**
4
+ - `/orchestra verifier` — verify all `done` milestones
5
+ - `/orchestra verifier 1,2,3` — verify only specified milestone numbers
6
+
7
+ 1. Read `.orchestra/roles/product-manager.md` to activate PM.
8
+ 2. Scan `.orchestra/milestones/` — collect milestones to verify:
9
+ - No arguments: all milestones with `status: done`
10
+ - With numbers: only milestones matching those numbers (e.g., `1` matches `M1-*`)
11
+ 3. For each milestone, read:
12
+ - `prd.md` — product requirements and acceptance criteria
13
+ - `rfc.md` — technical design decisions (if exists)
14
+ - `milestone.md` — summary and acceptance criteria
15
+ - `grooming.md` — scope decisions and phase breakdown
16
+ - All `phases/*.md` — phase acceptance criteria
17
+ 4. For each milestone, read execution context:
18
+ - `context.md` — `## Decisions` section (why specific approaches were chosen)
19
+ - `context.md` — `## Phases` section (which phases completed, which failed)
20
+ 5. For each milestone, read the actual implementation:
21
+ - Run `git log --oneline` filtered to commits from that milestone's phases
22
+ - Run `git diff` for those commits to see what changed
23
+ - Read the current state of modified files — diff shows changes, but current code shows completeness
24
+ 6. Compare requirements vs implementation. For each requirement/acceptance criterion:
25
+ - **met** — implementation satisfies the requirement
26
+ - **partial** — partially implemented, missing aspects noted
27
+ - **missed** — not implemented at all
28
+ - **deviated** — implemented differently than specified
29
+ 6. Report:
30
+
31
+ ```
32
+ ## Verification: M1-user-auth
33
+
34
+ ### Requirements Coverage
35
+ - ✅ met: JWT authentication endpoint (phase-1, commit abc123)
36
+ - ⚠️ partial: Rate limiting — implemented but no Redis backing (phase-2)
37
+ - ❌ missed: Password reset flow — not in any commit
38
+ - 🔀 deviated: Token refresh — RFC said rotating tokens, implemented static expiry
39
+
40
+ ### Summary
41
+ 4 requirements: 1 met, 1 partial, 1 missed, 1 deviated
42
+
43
+ ### Severity
44
+ - 🔴 critical: Password reset flow missing (core auth feature)
45
+ - 🟡 moderate: Rate limiting without Redis (works but won't scale)
46
+ - 🟡 moderate: Token refresh deviation (security concern)
47
+ ```
48
+
49
+ 8. After reporting all milestones, if there are critical or moderate gaps:
50
+ - List gaps grouped by severity
51
+ - Suggest: "Use `/orchestra pm` to plan fix milestones for these gaps."
52
+ - Do NOT create milestones directly — PM decides scope and priority
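
The Summary line in the report template is a tally over the four statuses from step 6. The sketch below shows that tally. `summarize` and the shape of the `results` array are hypothetical, and the requirement names are taken from the example report.

```javascript
// Hypothetical tally: count requirement statuses and format the Summary line.
function summarize(results) {
  const counts = { met: 0, partial: 0, missed: 0, deviated: 0 };
  for (const { status } of results) {
    if (!Object.hasOwn(counts, status)) throw new Error(`Unknown status: ${status}`);
    counts[status] += 1;
  }
  return (
    `${results.length} requirements: ${counts.met} met, ${counts.partial} partial, ` +
    `${counts.missed} missed, ${counts.deviated} deviated`
  );
}

const results = [
  { requirement: "JWT authentication endpoint", status: "met" },
  { requirement: "Rate limiting", status: "partial" },
  { requirement: "Password reset flow", status: "missed" },
  { requirement: "Token refresh", status: "deviated" },
];
console.log(summarize(results));
// 4 requirements: 1 met, 1 partial, 1 missed, 1 deviated
```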
@@ -10,14 +10,13 @@ Terminal 1 (PM): Terminal 2 (Conductor):
10
10
  /orchestra pm /orchestra start
11
11
  │ │
12
12
  ├─ Discuss features with user ├─ Scan milestones
13
- ├─ Create milestones ├─ 🏗️ architect → RFC
13
+ ├─ Create milestones ├─ 🏗️ delegate to architect → RFC
14
14
  ├─ Groom phases ├─ 🚦 User approves RFC
15
- ├─ Always available ├─ ⚙️ backend → phase by phase
16
- │ ├─ 🎨 frontend → phase by phase
15
+ ├─ Always available ├─ ⚙️ delegate to backend → phase by phase
16
+ │ ├─ 🎨 delegate to frontend → phase by phase
17
17
  │ (can plan M2 while M1 runs) ├─ 🔍 reviewer → review commits
18
- │ ├─ 🚦 User approves push
19
18
  │ ├─ git push → milestone done
20
- │ └─ Loop next milestone
19
+ │ └─ Stop (inline) or next milestone (agent)
21
20
  ```
22
21
 
23
22
  ## Directory Structure
@@ -25,7 +24,7 @@ Terminal 1 (PM): Terminal 2 (Conductor):
25
24
  ```
26
25
  .orchestra/
27
26
  ├── README.md # This file
28
- ├── roles/ # Role identities (slim, ~15 lines each)
27
+ ├── roles/ # Role identities (one file per role)
29
28
  │ ├── product-manager.md
30
29
  │ ├── architect.md
31
30
  │ ├── backend-engineer.md
@@ -56,8 +55,10 @@ You can plan new milestones while the conductor is executing another one.
56
55
 
57
56
  ### Terminal 2: `/orchestra start` (Execution)
58
57
 
59
- Conductor reads milestones, executes phases autonomously. Activates roles per phase.
60
- Loops to the next milestone when done. Maintains `context.md` for resume capability.
58
+ Conductor reads milestones, delegates each phase to a sub-agent with the right role.
59
+ Sub-agents implement + verify; conductor commits. After milestone completion, behavior
60
+ depends on `milestone_isolation` config: stops (inline) or continues to next (agent).
61
+ Maintains `context.md` for resume capability.
61
62
 
62
63
  ```
63
64
  /orchestra start
@@ -81,7 +82,6 @@ PM discusses feature with user
81
82
  → Conductor executes frontend phases (sequential, each → commit)
82
83
  → Conductor calls reviewer agent (reviews unpushed commits)
83
84
  → FIX cycle if changes-requested (re-review if fix >= 30 lines)
84
- → [USER APPROVAL GATE: Push to origin]
85
85
  → Conductor pushes, PM verifies acceptance criteria, closes milestone
86
86
  → Conductor appends 5-line retrospective to knowledge.md
87
87
 
@@ -94,19 +94,45 @@ Hotfix (production bugs):
94
94
  ### Milestone Lock
95
95
 
96
96
  Conductor claims a milestone by writing `Locked-By: {timestamp}` to milestone.md before execution.
97
- Other conductors skip locked milestones. Lock expires after 2 hours (stale protection).
97
+ Other conductors skip locked milestones. Lock expires after config.yml `thresholds.milestone_lock_timeout` minutes (default 120).
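
The lock expiry rule can be sketched as a staleness check against `thresholds.milestone_lock_timeout`. This assumes the `Locked-By` timestamp is ISO 8601, which the diff does not specify. `isLockStale` is hypothetical, and treating an unreadable timestamp as stale is a choice made for the example.

```javascript
// Hypothetical check: has a `Locked-By` timestamp outlived the configured timeout?
// `timeoutMinutes` defaults to 120, the documented default.
function isLockStale(lockedAtIso, timeoutMinutes = 120, now = new Date()) {
  const lockedAt = new Date(lockedAtIso);
  if (Number.isNaN(lockedAt.getTime())) return true; // assumption: unreadable lock counts as stale
  const ageMinutes = (now.getTime() - lockedAt.getTime()) / 60000;
  return ageMinutes > timeoutMinutes;
}

const lockedAt = "2025-01-01T00:00:00Z";
console.log(isLockStale(lockedAt, 120, new Date("2025-01-01T01:00:00Z"))); // false
console.log(isLockStale(lockedAt, 120, new Date("2025-01-01T02:30:00Z"))); // true
```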
98
98
 
99
99
  ### Pipeline Modes (Complexity)
100
100
 
101
- PM sets a `Complexity` level on each milestone that determines the pipeline:
101
+ PM sets `Complexity` on milestone (pipeline) and `complexity` on each phase (model selection):
102
102
 
103
- | Complexity | Pipeline | Use when |
104
- |------------|----------|----------|
105
- | `quick` | Engineer → Commit → Push | Config tweaks, copy changes, trivial fixes |
106
- | `standard` | Engineer → Review → Push | Typical features, clear requirements |
107
- | `full` | Architect → Engineer → Review → Push | Complex features, new subsystems |
103
+ | Complexity | Model | Pipeline | Use when |
104
+ |------------|-------|----------|----------|
105
+ | `trivial` | Haiku | Phases → Commit → Push | Version bumps, env vars, config changes |
106
+ | `quick` | Sonnet | Phases → Commit → Push (skip review) | Single-file fixes, simple CRUD |
107
+ | `standard` | Sonnet | Phases → Review → Push | Typical features, clear requirements |
108
+ | `complex` | Opus | Architect → Phases → Review → Push | New subsystems, unfamiliar territory |
108
109
 
109
- Default is `full` if not specified. Conductor reads the `Complexity` field from `milestone.md`.
110
+ Defaults: config.yml `pipeline.default_pipeline` and `pipeline.default_complexity`.
111
+
112
+ ### Milestone Isolation
113
+
114
+ Config `pipeline.milestone_isolation` controls how the conductor handles multiple milestones:
115
+
116
+ | Mode | Behavior | Best for |
117
+ |------|----------|----------|
118
+ | `inline` (default) | Conductor runs milestone directly, **stops** after completion. User runs `/compact` then `/orchestra start` for next milestone. | Manual sessions, PC-based work |
119
+ | `agent` | Conductor spawns a sub-agent per milestone. Context freed automatically after each. Loops to next milestone. | `--auto` overnight batch runs |
120
+
121
+ ```
122
+ Inline mode: Agent mode:
123
+ /orchestra start /orchestra start --auto
124
+ → M1 executes → done → STOP → Spawn Agent(M1) → done → freed
125
+ user: /compact → Spawn Agent(M2) → done → freed
126
+ /orchestra start → Spawn Agent(M3) → done → freed
127
+ → M2 executes → done → STOP → All done
128
+ ```
129
+
130
+ In agent mode, the delegation is two-tier:
131
+ ```
132
+ Conductor (lean dispatcher)
133
+ └── Milestone Agent (fresh context)
134
+ └── Phase Agent (unchanged)
135
+ ```
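The difference between the two isolation modes comes down to how many milestones a single `/orchestra start` covers. A minimal sketch of that rule, assuming a hypothetical `milestonesForRun` helper and an already-ordered list of pending milestone ids:

```javascript
// Hypothetical helper: which pending milestones does one conductor run cover?
// mode mirrors config.yml `pipeline.milestone_isolation`.
function milestonesForRun(pending, mode = "inline") {
  if (mode === "inline") {
    // Inline: run one milestone, then stop so the user can /compact and restart.
    return pending.slice(0, 1);
  }
  if (mode === "agent") {
    // Agent: one sub-agent per milestone, looping until nothing is left.
    return [...pending];
  }
  throw new Error(`Unknown milestone_isolation mode: ${mode}`);
}
```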
110
136
 
111
137
  ### Milestone Statuses
112
138
 
@@ -142,8 +168,8 @@ Within each domain (backend/frontend), phases run in order: phase-1 → phase-2
142
168
  **Parallel execution:** If PM sets `depends_on` in phase frontmatter, independent phases
143
169
  can run in parallel via subagent worktree isolation. No `depends_on` = sequential (default).
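One way to picture the scheduling rule is to group phases into waves, where every phase in a wave has all of its dependencies finished. The sketch below is illustrative only: `parallelWaves` is a hypothetical function, and modelling a missing `depends_on` as "depends on the previous phase" is an assumption made here to reproduce the sequential default.

```javascript
// Hypothetical sketch: group phases into waves that could run in parallel.
// Each phase is { id, dependsOn? }; dependsOn mirrors `depends_on` in frontmatter.
function parallelWaves(phases) {
  const deps = new Map(
    phases.map((phase, i) => [
      phase.id,
      // Assumption: no depends_on means "wait for the previous phase".
      phase.dependsOn ?? (i > 0 ? [phases[i - 1].id] : []),
    ])
  );
  const done = new Set();
  const waves = [];
  while (done.size < phases.length) {
    const ready = phases
      .map((phase) => phase.id)
      .filter((id) => !done.has(id) && deps.get(id).every((d) => done.has(d)));
    if (ready.length === 0) {
      throw new Error("Unresolvable depends_on (cycle or unknown phase id)");
    }
    waves.push(ready);
    ready.forEach((id) => done.add(id));
  }
  return waves;
}
```

With this model, two phases that both declare `depends_on: []` land in the same wave, while phases without the field run one after another.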
144
170
 
145
- **Verification Gate:** Before every commit, conductor MUST pass type check + tests + lint
146
- (commands from config.yml). Commit is blocked until all checks pass.
171
+ **Verification Gate:** Sub-agents run typecheck + tests + lint (from config.yml) before reporting.
172
+ Conductor NEVER commits unless verification passes.
147
173
 
148
174
  ---
149
175
 
@@ -151,7 +177,8 @@ can run in parallel via subagent worktree isolation. No `depends_on` = sequentia
151
177
 
152
178
  - Each phase completion → **one conventional commit** on the current branch
153
179
  - No branch creation or switching — work happens on whatever branch is checked out
154
- - Milestone completion → **push to origin** (after user approval)
180
+ - Milestone completion → **push to origin** (automatic after review passes)
181
+ - Commits stay local until milestone fully completes — no partial push on failure
155
182
  - Reviewer reviews unpushed commits: `git log origin/{branch}..HEAD`
156
183
  - Clean git history: each commit maps to a phase
157
184
 
@@ -185,16 +212,14 @@ Rules:
185
212
 
186
213
  The user must approve before these transitions:
187
214
  - **Milestone creation** — PM discusses and plans, but must get user approval before creating the milestone directory and files
188
- - **RFC → Implementation** — user reviews architect's RFC
189
- - **Push to origin** — user approves the final changeset
215
+ - **RFC → Implementation** — user reviews architect's RFC (if `rfc_approval` is not `skip`)
190
216
 
191
- All other transitions are automatic.
217
+ Push is automatic after review passes. All other transitions are automatic.
192
218
 
193
219
  ### Rejection Handling
194
220
 
195
221
  If the user says **no** at any gate:
196
- - **RFC rejected** → Architect revises based on feedback, re-submits (max 3 rounds)
197
- - **Push rejected** → Conductor creates fix phase, implements, re-submits push gate
222
+ - **RFC rejected** → Architect revises based on feedback, re-submits (max config `pipeline.max_rfc_rounds`)
198
223
  - **Milestone rejected** → PM revises in PM terminal
199
224
 
200
225
  Rejections are normal. The system does not stall — it loops back with feedback.
@@ -213,12 +238,12 @@ Conductor calls reviewer agent
213
238
  → Returns: approved / approved-with-comments / changes-requested
214
239
  ```
215
240
 
216
- **If approved** → proceed to push gate.
241
+ **If approved** → push immediately.
217
242
 
218
- **If approved-with-comments** → proceed to push gate. Comments are logged in context.md.
243
+ **If approved-with-comments** → push immediately. Comments are logged in context.md.
219
244
 
220
- **If changes-requested** → Conductor switches to the relevant role, fixes
221
- and commits. Re-review triggered if fix >= config `re_review_lines` threshold.
245
+ **If changes-requested** → Conductor continues the phase's sub-agent via SendMessage with
246
+ reviewer findings. Re-review triggered if fix >= config `re_review_lines` threshold.
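The three verdicts and the re-review threshold can be summarised as a small decision function. This is a sketch under assumptions: `afterReview` and its return labels are hypothetical, the default of 30 lines is the `re_review_lines` value shown in the template config elsewhere in this diff, and treating a small fix as "push without re-review" is an inference from the text rather than confirmed behavior.

```javascript
// Hypothetical decision helper for the reviewer verdicts described above.
// fixLines: size of the fix made after a changes-requested verdict.
// reReviewLines: config `thresholds.re_review_lines`.
function afterReview(verdict, fixLines = 0, reReviewLines = 30) {
  switch (verdict) {
    case "approved":
    case "approved-with-comments":
      return "push";
    case "changes-requested":
      // Assumption: fixes below the threshold go straight to push.
      return fixLines >= reReviewLines ? "re-review" : "push";
    default:
      throw new Error(`Unknown verdict: ${verdict}`);
  }
}
```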
222
247
 
223
248
  ---
224
249
 
@@ -283,16 +308,21 @@ PM and conductor run in **separate terminals**. They communicate through milesto
283
308
 
284
309
  ### Context Persistence
285
310
 
286
- Conductor maintains `context.md` in each milestone directory. This allows:
287
- - Resume after terminal close/reopen
288
- - Track decisions made during implementation
289
- - Record what was committed in each phase
311
+ Conductor maintains `context.md` in each milestone directory with a fixed structure:
312
+ - `## Status` — milestone id, start date, pipeline type
313
+ - `## Phases` — per-phase status, commit hash, files changed, errors
314
+ - `## Codebase Map` — scout-generated file map (survives milestone clear)
315
+ - `## Decisions` — key choices from each phase that affect later phases
316
+ - `## Metrics` — phase duration and verification retries (used by `/orchestra status`)
317
+
318
+ This enables resume after terminal close/reopen. On restart, conductor reads context.md and skips completed phases.
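Resuming amounts to finding the first phase that is not yet complete. The sketch below assumes the `## Phases` section has already been parsed into objects with a `status` field and that completed phases are marked `"done"`; both the parsed shape and the status value are assumptions, since the diff does not show the file's exact contents.

```javascript
// Hypothetical resume helper: skip completed phases, return the rest in order.
function remainingPhases(phases) {
  const firstPending = phases.findIndex((phase) => phase.status !== "done");
  return firstPending === -1 ? [] : phases.slice(firstPending);
}
```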
290
319
 
291
320
  ### Approval Gates (Conductor Terminal)
292
321
 
293
- Conductor asks the user directly (not PM) at these points:
294
- 1. **RFC ready** — "Approve RFC to start implementation?"
295
- 2. **Push to origin** — "All done. Push to origin?"
322
+ Conductor asks the user directly (not PM) at this point:
323
+ 1. **RFC ready** — "Approve RFC to start implementation?" (if `rfc_approval` is not `skip`)
324
+
325
+ Push is automatic after review passes — no approval needed.
296
326
 
297
327
  ---
298
328
 
@@ -330,16 +360,18 @@ sequenceDiagram
330
360
  C->>C: Fix → commit
331
361
  end
332
362
 
333
- C->>U: Push to origin?
334
- U->>C: Yes
335
363
  C->>C: git push → milestone done
336
364
 
337
- C->>C: Next milestone? → loop or done
365
+ alt Inline mode (default)
366
+ C->>C: STOP — user compacts and restarts
367
+ else Agent mode
368
+ C->>C: Next milestone? → loop or done
369
+ end
338
370
 
339
371
  Note over PM: PM is free the entire time<br/>Can plan M2 while M1 executes
340
372
  ```
341
373
 
342
- ### 2. Conductor Execution Loop
374
+ ### 2. Conductor Execution Loop (Inline Mode)
343
375
 
344
376
  ```mermaid
345
377
  sequenceDiagram
@@ -354,11 +386,27 @@ sequenceDiagram
354
386
  C->>C: reviewer → approved
355
387
  C->>C: Push → M1 done
356
388
 
357
- C->>C: Start M2
358
- C->>C: architect → RFC
359
- C->>C: backend phase-1
360
- C->>C: reviewer → approved
361
- C->>C: Push → M2 done
389
+ Note over C: STOP. "Run /compact or /clear then /orchestra start"
390
+ ```
391
+
392
+ ### 3. Conductor Execution Loop (Agent Mode)
393
+
394
+ ```mermaid
395
+ sequenceDiagram
396
+ participant C as Conductor
397
+ participant MA as Milestone Agent
398
+
399
+ C->>C: Scan milestones/
400
+
401
+ C->>MA: Spawn Agent(M1)
402
+ MA->>MA: phase-1 → phase-2 → review → push
403
+ MA-->>C: {status: done, retro: ...}
404
+ Note over C: Write retro, ~1-2k tokens retained
405
+
406
+ C->>MA: Spawn Agent(M2)
407
+ MA->>MA: phase-1 → phase-2 → review → push
408
+ MA-->>C: {status: done, retro: ...}
409
+ Note over C: Write retro, ~1-2k tokens retained
362
410
 
363
411
  C->>C: No more milestones
364
412
  Note over C: "All done. Waiting for new work."
@@ -13,10 +13,7 @@ pipeline:
13
13
  standard: sonnet
14
14
  complex: opus
15
15
  # RFC approval gate: required | optional | skip
16
- rfc_approval: required
17
-
18
- # Push approval gate: required | auto
19
- push_approval: required
16
+ rfc_approval: skip
20
17
 
21
18
  # Code review: required | optional | skip
22
19
  review: required
@@ -25,6 +22,11 @@ pipeline:
25
22
  # When enabled, phases with depends_on: [] run in parallel
26
23
  parallel: disabled
27
24
 
25
+ # Milestone isolation mode: inline | agent
26
+ # inline: conductor runs milestones directly, stops after each. User compacts manually. (default)
27
+ # agent: each milestone runs in its own sub-agent. Context freed automatically. Best for --auto.
28
+ milestone_isolation: inline
29
+
28
30
  # Default pipeline when milestone Complexity is missing
29
31
  default_pipeline: full # quick | standard | full
30
32
 
@@ -34,6 +36,9 @@ pipeline:
34
36
  # Max RFC rejection rounds before escalating to user
35
37
  max_rfc_rounds: 3
36
38
 
39
+ # Max milestone review rounds before proceeding anyway with warnings
40
+ max_milestone_review_rounds: 3
41
+
37
42
  thresholds:
38
43
  # Milestone lock timeout in minutes (stale locks are ignored)
39
44
  milestone_lock_timeout: 120
@@ -69,7 +69,7 @@ Last 5 milestones. Conductor reads before every milestone start. PM reads before
69
69
 
70
70
  ### Decisions
71
71
  - Skill System (markdown-only): Lightweight `.orchestra/skills/` with domain checklists (auth, CRUD, deployment). No registry, no keyword matching — PM manually assigns via `skills:` frontmatter in phase files. Preserves zero-infrastructure philosophy.
72
- - Cost Awareness: Track duration + verification retries per phase in context.md Cost Tracking table. PM sees this in #status. No token counting (unreliable from prompt), focus on observable metrics.
72
+ - Cost Awareness: Track duration + verification retries per phase in context.md `## Metrics` section. PM sees this in `/orchestra status`. No token counting (unreliable from prompt), focus on observable metrics.
73
73
  - Re-review Threshold: Fix < 30 lines → no re-review. Fix >= 30 lines → abbreviated re-review (only the fix commit). Balances quality vs speed.
74
74
  - Rejection Flow: RFC rejected → architect revises (max 3 rounds). Push rejected → create fix phase. System no longer stalls on "no".
75
75
 
@@ -42,13 +42,44 @@ Cannot write: feature code, RFCs, architecture docs, review findings, system fil
42
42
  └── phase-2.md
43
43
  ```
44
44
 
45
- ### Pre-flight Checklist
45
+ ### Milestone Review Loop
46
46
 
47
+ After creating milestone files, launch a milestone-reviewer sub-agent before
48
+ marking the milestone as ready. This catches planning errors before conductor executes.
49
+
50
+ **Flow:** PM creates → reviewer sub-agent → PM fixes → reviewer again → max `pipeline.max_milestone_review_rounds`
51
+
52
+ Launch sub-agent (general-purpose, model: sonnet) with this prompt:
53
+
54
+ ```
55
+ You are reviewing a milestone for quality before execution. Read these files
56
+ in {milestone_path}/: prd.md, milestone.md, grooming.md, and all files in phases/.
57
+ (rfc.md and context.md don't exist yet — don't flag them as missing.)
58
+
59
+ ## Checklist
47
60
  1. Every phase has `role:` set?
48
- 2. Every phase has `skills:` reviewed?
49
- 3. Every phase has clear, testable acceptance criteria?
50
- 4. `milestone.md` has `Complexity:` set?
51
- 5. Phase order and dependencies correct?
61
+ 2. Every phase has `complexity:` set?
62
+ 3. Every phase has `skills:` appropriate for the role and task?
63
+ 4. Every phase has `scope:` defining which files/dirs to touch?
64
+ 5. Acceptance criteria are testable? (not vague like "works well" — specific like "returns 200")
65
+ 6. `milestone.md` has `Complexity:` set?
66
+ 7. Phase order and `depends_on` are correct? (frontend depends on backend, etc.)
67
+ 8. No overlapping scope between phases? (two phases writing same files)
68
+ 9. PRD explains WHY, not just WHAT?
69
+
70
+ ## Return Format
71
+ verdict: approved | changes-requested
72
+ issues:
73
+ - [severity: blocking|suggestion] {description} — {file}
74
+ summary: {2-3 sentences}
75
+ ```
76
+
77
+ **Process:**
78
+ 1. If **approved** → proceed, milestone is ready for conductor
79
+ 2. If **changes-requested** → PM reads issues, fixes milestone files, re-launches reviewer
80
+ 3. After max rounds with no blocking issues → proceed with suggestions logged in grooming.md
81
+ 4. After max rounds with blocking issues still open → escalate to user, do NOT proceed
82
+ 5. Present verdict to user before finalizing
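The process above can be expressed as one decision per review round. The following is a minimal sketch: `decideAfterRound` and its return labels are hypothetical, and the `result` shape (`verdict`, `issues[].severity`) follows the Return Format in the reviewer prompt.

```javascript
// Hypothetical helper: what should the PM do after review round `round`?
// maxRounds mirrors config `pipeline.max_milestone_review_rounds`.
function decideAfterRound(result, round, maxRounds = 3) {
  if (result.verdict === "approved") {
    return "proceed";
  }
  if (round < maxRounds) {
    // Rounds remain: PM fixes the milestone files and re-launches the reviewer.
    return "fix-and-rerun";
  }
  const hasBlocking = (result.issues ?? []).some(
    (issue) => issue.severity === "blocking"
  );
  // Out of rounds: blocking issues go to the user, suggestions get logged.
  return hasBlocking ? "escalate" : "proceed-with-suggestions";
}
```

In every case the verdict would still be presented to the user before the milestone is finalized, as step 5 requires.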
52
83
 
53
84
  ### milestone.md Format
54
85
 
@@ -59,7 +90,7 @@ Cannot write: feature code, RFCs, architecture docs, review findings, system fil
59
90
  |-------|-------|
60
91
  | Status | planning / in-progress / review / done |
61
92
  | Priority | P0 / P1 / P2 |
62
- | Complexity | quick / standard / full |
93
+ | Complexity | trivial / quick / standard / complex |
63
94
  | PRD | prd.md |
64
95
  | Created | {date} |
65
96
  ```
@@ -85,11 +116,12 @@ depends_on: []
85
116
 
86
117
  ### Complexity Levels
87
118
 
88
- | Level | Pipeline | When |
89
- |-------|----------|------|
90
- | `quick` | Engineer → Commit → Push | Trivial: config, copy, single-file fix |
91
- | `standard` | Engineer → Review → Push | Typical features, clear requirements |
92
- | `full` | Architect → Engineer → Review → Push | Complex: new subsystems, unfamiliar territory |
119
+ | Level | Model | Pipeline | When |
120
+ |-------|-------|----------|------|
121
+ | `trivial` | Haiku | Phases → Commit → Push | Version bumps, env vars, config changes |
122
+ | `quick` | Sonnet | Phases Commit → Push (skip review) | Single-file fixes, simple CRUD |
123
+ | `standard` | Sonnet | Phases → Review → Push | Typical features (default) |
124
+ | `complex` | Opus | Architect → Phases → Review → Push | New subsystems, unfamiliar territory |
93
125
 
94
126
  ### Blueprint Command
95
127
 
@@ -1,6 +1,6 @@
1
- # CLAUDE.md — Orchestra Setup Instructions
1
+ # CLAUDE.md
2
2
 
3
- This file is automatically read by Claude at the start of every session.
3
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
4
4
 
5
5
  <!-- orchestra -->
6
6
  ## Orchestra — AI Team Orchestration System
@@ -46,6 +46,22 @@ Role IDs: orchestrator, product-manager, architect, backend-engineer, frontend-e
46
46
  - Rules (`.claude/rules/*.orchestra.md`) auto-loaded. Skills loaded per phase.
47
47
  - **PROTECTED:** Non-Orchestrator roles NEVER modify `.orchestra/roles/`, `.orchestra/config.yml`, `.orchestra/README.md`, `.orchestra/blueprints/`, `.claude/agents/`, `.claude/rules/*.orchestra.md`, `.claude/skills/*.orchestra.md`, `.claude/commands/orchestra/`, `CLAUDE.md`, or `docs/`.
48
48
 
49
+ ## Development
50
+
51
+ This is an npm package (`@sulhadin/orchestrator`) — a CLI installer that copies Orchestra template files into user projects.
52
+
53
+ ```bash
54
+ yarn test # Run tests (node:test, test/**/*.test.js)
55
+ yarn template # Rebuild template/ from source files (bin/build-template.js)
56
+ yarn build # Full build (defined in lint-staged)
57
+ ```
58
+
59
+ **Architecture:** `bin/index.js` is the CLI entry point (runs via `npx`). It copies files from `template/` into the user's project, with smart YAML merge for `config.yml` (preserves user values, adds new keys). `bin/build-template.js` generates the `template/` directory from the source `.orchestra/` and `.claude/` files.
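To make "preserves user values, adds new keys" concrete, here is a line-based sketch of such a merge. It is not the package's `mergeConfigYaml`: `mergeConfigSketch` and `walk` are hypothetical, the approach only handles plain `key: value` mappings with space indentation, and it deliberately keeps the template's comments and ordering while swapping in user scalars where the key path matches.

```javascript
const KEY_LINE = /^(\s*)([A-Za-z_][\w-]*):\s*(.*)$/;

// Visit every "key: value" line, tracking its dotted path by indentation.
// Non-key lines (comments, blanks) pass through unchanged.
function walk(yamlText, visit) {
  const stack = [];
  return yamlText.split("\n").map((line) => {
    const m = KEY_LINE.exec(line);
    if (!m) return line;
    const [, indent, key, rest] = m;
    while (stack.length && stack[stack.length - 1].width >= indent.length) {
      stack.pop();
    }
    stack.push({ width: indent.length, key });
    // A section header (possibly followed by a comment) has no scalar value.
    const value = rest.startsWith("#") ? "" : rest.trim();
    return visit(stack.map((s) => s.key).join("."), value, indent, key, line);
  });
}

// Template drives structure; user scalars win wherever the same path exists.
function mergeConfigSketch(userYaml, templateYaml) {
  const userValues = new Map();
  walk(userYaml, (path, value, indent, key, line) => {
    if (value !== "") userValues.set(path, value);
    return line;
  });
  return walk(templateYaml, (path, value, indent, key, line) =>
    value !== "" && userValues.has(path)
      ? `${indent}${key}: ${userValues.get(path)}`
      : line
  ).join("\n");
}
```

Keys that exist only in the user's file are dropped by this sketch; whether the real merge keeps them is not stated in the diff.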
60
+
61
+ **npm publishes:** Only `bin/` and `template/` directories (see `package.json` `files` field). Tests, docs, and source orchestra files are excluded.
62
+
63
+ **Pre-commit:** Husky + lint-staged runs `yarn template && yarn build` on staged `.js`, `.md`, `.yml`, `.json` files.
64
+
49
65
  ## Installation
50
66
 
51
67
  See `docs/getting-started.md` for setup instructions.
@@ -1,135 +0,0 @@
1
- const { describe, it } = require("node:test");
2
- const assert = require("node:assert");
3
- const fs = require("fs");
4
- const path = require("path");
5
-
6
- // Extract mergeConfigYaml from index.js
7
- const src = fs.readFileSync(path.join(__dirname, "index.js"), "utf-8");
8
- const match = src.match(/function mergeConfigYaml\([\s\S]*?^}/m);
9
- eval(match[0]);
10
-
11
- const userConfig = `pipeline:
12
- models:
13
- quick: haiku
14
- standard: sonnet
15
- complex: opus
16
- rfc_approval: skip
17
- push_approval: auto
18
- review: required
19
- parallel: disabled
20
-
21
- thresholds:
22
- re_review_lines: 50
23
- phase_time_limit: 20
24
- phase_tool_limit: 40
25
- stuck_retry_limit: 5
26
-
27
- verification:
28
- typecheck: "yarn tsc --noEmit"
29
- test: "yarn test"
30
- lint: "yarn lint"
31
- `;
32
-
33
- const templateConfig = `# Orchestra Configuration
34
- # Customize pipeline behavior, thresholds, and verification commands.
35
-
36
- pipeline:
37
- # Model selection per phase complexity
38
- models:
39
- trivial: haiku
40
- quick: sonnet
41
- standard: sonnet
42
- complex: opus
43
- rfc_approval: required
44
- push_approval: required
45
- review: required
46
- parallel: disabled
47
- default_pipeline: full
48
- default_complexity: standard
49
- max_rfc_rounds: 3
50
-
51
- thresholds:
52
- milestone_lock_timeout: 120
53
- re_review_lines: 30
54
- phase_time_limit: 15
55
- phase_tool_limit: 40
56
- stuck_retry_limit: 3
57
-
58
- verification:
59
- typecheck: "npx tsc --noEmit"
60
- test: "npm test"
61
- lint: "npm run lint"
62
- `;
63
-
64
- describe("mergeConfigYaml", () => {
65
- const result = mergeConfigYaml(userConfig, templateConfig);
66
-
67
- describe("new template keys are added", () => {
68
- it("adds trivial model tier", () => {
69
- assert.ok(result.includes("trivial: haiku"));
70
- });
71
-
72
- it("adds default_pipeline", () => {
73
- assert.ok(result.includes("default_pipeline: full"));
74
- });
75
-
76
- it("adds default_complexity", () => {
77
- assert.ok(result.includes("default_complexity: standard"));
78
- });
79
-
80
- it("adds max_rfc_rounds", () => {
81
- assert.ok(result.includes("max_rfc_rounds: 3"));
82
- });
83
-
84
- it("adds milestone_lock_timeout", () => {
85
- assert.ok(result.includes("milestone_lock_timeout: 120"));
86
- });
87
- });
88
-
89
- describe("user values are preserved", () => {
90
- it("keeps user models.quick value", () => {
91
- assert.ok(result.includes("quick: haiku"));
92
- });
93
-
94
- it("keeps user rfc_approval", () => {
95
- assert.ok(result.includes("rfc_approval: skip"));
96
- });
97
-
98
- it("keeps user push_approval", () => {
99
- assert.ok(result.includes("push_approval: auto"));
100
- });
101
-
102
- it("keeps user re_review_lines", () => {
103
- assert.ok(result.includes("re_review_lines: 50"));
104
- });
105
-
106
- it("keeps user phase_time_limit", () => {
107
- assert.ok(result.includes("phase_time_limit: 20"));
108
- });
109
-
110
- it("keeps user stuck_retry_limit", () => {
111
- assert.ok(result.includes("stuck_retry_limit: 5"));
112
- });
113
-
114
- it("keeps user verification commands", () => {
115
- assert.ok(result.includes('typecheck: "yarn tsc --noEmit"'));
116
- assert.ok(result.includes('test: "yarn test"'));
117
- assert.ok(result.includes('lint: "yarn lint"'));
118
- });
119
- });
120
-
121
- describe("template structure is preserved", () => {
122
- it("keeps comments from template", () => {
123
- assert.ok(result.includes("# Orchestra Configuration"));
124
- assert.ok(result.includes("# Model selection per phase complexity"));
125
- });
126
-
127
- it("maintains section order", () => {
128
- const pipelineIdx = result.indexOf("pipeline:");
129
- const thresholdsIdx = result.indexOf("thresholds:");
130
- const verificationIdx = result.indexOf("verification:");
131
- assert.ok(pipelineIdx < thresholdsIdx);
132
- assert.ok(thresholdsIdx < verificationIdx);
133
- });
134
- });
135
- });