agent-composer 0.3.1 → 0.4.1

Files changed (180)
  1. package/README.md +495 -180
  2. package/composer.config.schema.json +206 -2
  3. package/dist/cli/cleanup.d.ts +24 -0
  4. package/dist/cli/cleanup.js +151 -0
  5. package/dist/cli/cleanup.js.map +1 -0
  6. package/dist/cli/doctor.d.ts +12 -0
  7. package/dist/cli/doctor.js +244 -4
  8. package/dist/cli/doctor.js.map +1 -1
  9. package/dist/cli/goal.d.ts +28 -0
  10. package/dist/cli/goal.js +251 -0
  11. package/dist/cli/goal.js.map +1 -0
  12. package/dist/cli/help.d.ts +3 -0
  13. package/dist/cli/help.js +31 -0
  14. package/dist/cli/help.js.map +1 -0
  15. package/dist/cli/init.d.ts +5 -0
  16. package/dist/cli/init.js +116 -21
  17. package/dist/cli/init.js.map +1 -1
  18. package/dist/cli/initArgs.d.ts +16 -0
  19. package/dist/cli/initArgs.js +19 -0
  20. package/dist/cli/initArgs.js.map +1 -0
  21. package/dist/cli/installGitHook.d.ts +7 -0
  22. package/dist/cli/installGitHook.js +61 -0
  23. package/dist/cli/installGitHook.js.map +1 -0
  24. package/dist/cli/mode.d.ts +6 -0
  25. package/dist/cli/mode.js +25 -0
  26. package/dist/cli/mode.js.map +1 -0
  27. package/dist/cli/status.d.ts +105 -0
  28. package/dist/cli/status.js +400 -0
  29. package/dist/cli/status.js.map +1 -0
  30. package/dist/config/env.d.ts +1 -1
  31. package/dist/config/modes.d.ts +10 -0
  32. package/dist/config/modes.js +26 -0
  33. package/dist/config/modes.js.map +1 -0
  34. package/dist/config/oracleRole.d.ts +10 -0
  35. package/dist/config/oracleRole.js +11 -0
  36. package/dist/config/oracleRole.js.map +1 -0
  37. package/dist/config/schema.d.ts +246 -0
  38. package/dist/config/schema.js +127 -2
  39. package/dist/config/schema.js.map +1 -1
  40. package/dist/evolve/reflection.d.ts +9 -0
  41. package/dist/evolve/reflection.js +14 -0
  42. package/dist/evolve/reflection.js.map +1 -1
  43. package/dist/evolve/runner.d.ts +2 -1
  44. package/dist/evolve/runner.js +2 -1
  45. package/dist/evolve/runner.js.map +1 -1
  46. package/dist/index.js +115 -6
  47. package/dist/index.js.map +1 -1
  48. package/dist/providers/AnthropicCompatibleProvider.d.ts +13 -1
  49. package/dist/providers/AnthropicCompatibleProvider.js +115 -9
  50. package/dist/providers/AnthropicCompatibleProvider.js.map +1 -1
  51. package/dist/providers/CLIProvider.d.ts +18 -0
  52. package/dist/providers/CLIProvider.js +265 -62
  53. package/dist/providers/CLIProvider.js.map +1 -1
  54. package/dist/providers/IProvider.d.ts +12 -0
  55. package/dist/providers/SpendGuardProvider.d.ts +32 -0
  56. package/dist/providers/SpendGuardProvider.js +98 -0
  57. package/dist/providers/SpendGuardProvider.js.map +1 -0
  58. package/dist/registry.d.ts +5 -2
  59. package/dist/registry.js +17 -2
  60. package/dist/registry.js.map +1 -1
  61. package/dist/server/activeRuns.d.ts +17 -0
  62. package/dist/server/activeRuns.js +114 -0
  63. package/dist/server/activeRuns.js.map +1 -0
  64. package/dist/server/codexLifecycleRunner.d.ts +29 -0
  65. package/dist/server/codexLifecycleRunner.js +188 -0
  66. package/dist/server/codexLifecycleRunner.js.map +1 -0
  67. package/dist/server/configMutation.d.ts +22 -0
  68. package/dist/server/configMutation.js +121 -0
  69. package/dist/server/configMutation.js.map +1 -0
  70. package/dist/server/handoffContext.d.ts +1 -0
  71. package/dist/server/handoffContext.js +12 -0
  72. package/dist/server/handoffContext.js.map +1 -0
  73. package/dist/server/progress.d.ts +24 -0
  74. package/dist/server/progress.js +109 -0
  75. package/dist/server/progress.js.map +1 -0
  76. package/dist/server/toolDescriptions.d.ts +60 -0
  77. package/dist/server/toolDescriptions.js +134 -0
  78. package/dist/server/toolDescriptions.js.map +1 -0
  79. package/dist/server.d.ts +19 -25
  80. package/dist/server.js +87 -377
  81. package/dist/server.js.map +1 -1
  82. package/dist/tools/audit.d.ts +2 -0
  83. package/dist/tools/audit.js +66 -0
  84. package/dist/tools/audit.js.map +1 -0
  85. package/dist/tools/code.d.ts +2 -0
  86. package/dist/tools/code.js +160 -0
  87. package/dist/tools/code.js.map +1 -0
  88. package/dist/tools/codexLifecycle.d.ts +2 -0
  89. package/dist/tools/codexLifecycle.js +206 -0
  90. package/dist/tools/codexLifecycle.js.map +1 -0
  91. package/dist/tools/config.d.ts +2 -0
  92. package/dist/tools/config.js +183 -0
  93. package/dist/tools/config.js.map +1 -0
  94. package/dist/tools/context.d.ts +31 -0
  95. package/dist/tools/context.js +2 -0
  96. package/dist/tools/context.js.map +1 -0
  97. package/dist/tools/goal.d.ts +2 -0
  98. package/dist/tools/goal.js +159 -0
  99. package/dist/tools/goal.js.map +1 -0
  100. package/dist/tools/handoff.d.ts +2 -0
  101. package/dist/tools/handoff.js +57 -0
  102. package/dist/tools/handoff.js.map +1 -0
  103. package/dist/tools/oracle.d.ts +2 -0
  104. package/dist/tools/oracle.js +248 -0
  105. package/dist/tools/oracle.js.map +1 -0
  106. package/dist/tools/research.d.ts +2 -0
  107. package/dist/tools/research.js +51 -0
  108. package/dist/tools/research.js.map +1 -0
  109. package/dist/tools/review.d.ts +2 -0
  110. package/dist/tools/review.js +233 -0
  111. package/dist/tools/review.js.map +1 -0
  112. package/dist/tools/route.d.ts +2 -0
  113. package/dist/tools/route.js +69 -0
  114. package/dist/tools/route.js.map +1 -0
  115. package/dist/tools/session.d.ts +2 -0
  116. package/dist/tools/session.js +37 -0
  117. package/dist/tools/session.js.map +1 -0
  118. package/dist/tools/status.d.ts +2 -0
  119. package/dist/tools/status.js +34 -0
  120. package/dist/tools/status.js.map +1 -0
  121. package/dist/tools/workflow.d.ts +2 -0
  122. package/dist/tools/workflow.js +27 -0
  123. package/dist/tools/workflow.js.map +1 -0
  124. package/dist/util/applyFileBlocks.d.ts +18 -0
  125. package/dist/util/applyFileBlocks.js +163 -0
  126. package/dist/util/applyFileBlocks.js.map +1 -0
  127. package/dist/util/asyncControl.d.ts +14 -0
  128. package/dist/util/asyncControl.js +106 -0
  129. package/dist/util/asyncControl.js.map +1 -0
  130. package/dist/util/auditLog.d.ts +56 -0
  131. package/dist/util/auditLog.js +232 -0
  132. package/dist/util/auditLog.js.map +1 -0
  133. package/dist/util/codexLifecycle.d.ts +55 -0
  134. package/dist/util/codexLifecycle.js +102 -0
  135. package/dist/util/codexLifecycle.js.map +1 -0
  136. package/dist/util/codexLifecycleJob.d.ts +209 -0
  137. package/dist/util/codexLifecycleJob.js +360 -0
  138. package/dist/util/codexLifecycleJob.js.map +1 -0
  139. package/dist/util/composerDisabled.d.ts +6 -0
  140. package/dist/util/composerDisabled.js +27 -0
  141. package/dist/util/composerDisabled.js.map +1 -0
  142. package/dist/util/dispatchHint.d.ts +5 -3
  143. package/dist/util/dispatchHint.js +62 -2
  144. package/dist/util/dispatchHint.js.map +1 -1
  145. package/dist/util/goal.d.ts +132 -0
  146. package/dist/util/goal.js +616 -0
  147. package/dist/util/goal.js.map +1 -0
  148. package/dist/util/goalReport.d.ts +51 -0
  149. package/dist/util/goalReport.js +164 -0
  150. package/dist/util/goalReport.js.map +1 -0
  151. package/dist/util/jobPolling.d.ts +9 -0
  152. package/dist/util/jobPolling.js +17 -0
  153. package/dist/util/jobPolling.js.map +1 -0
  154. package/dist/util/oracleJob.d.ts +66 -0
  155. package/dist/util/oracleJob.js +295 -0
  156. package/dist/util/oracleJob.js.map +1 -0
  157. package/dist/util/oracleLock.d.ts +38 -0
  158. package/dist/util/oracleLock.js +182 -0
  159. package/dist/util/oracleLock.js.map +1 -0
  160. package/dist/util/reviewDiff.d.ts +8 -0
  161. package/dist/util/reviewDiff.js +29 -0
  162. package/dist/util/reviewDiff.js.map +1 -0
  163. package/dist/util/reviewJob.d.ts +57 -0
  164. package/dist/util/reviewJob.js +207 -0
  165. package/dist/util/reviewJob.js.map +1 -0
  166. package/dist/util/workflowPlan.d.ts +24 -0
  167. package/dist/util/workflowPlan.js +49 -0
  168. package/dist/util/workflowPlan.js.map +1 -0
  169. package/package.json +8 -1
  170. package/plugin/composer-mastermind/commands/evolve.md +4 -0
  171. package/plugin/composer-mastermind/hooks/boundary_guard.sh +43 -2
  172. package/plugin/composer-mastermind/hooks/codex_warm_review.sh +161 -9
  173. package/plugin/composer-mastermind/hooks/learn.sh +172 -32
  174. package/plugin/composer-mastermind/hooks/precommit_codex_review.sh +438 -64
  175. package/plugin/composer-mastermind/skills/composer-mastermind/SKILL.md +190 -4
  176. package/scripts/composer-oracle-router-safe.sh +47 -0
  177. package/scripts/composer-statusline-segment.mjs +40 -0
  178. package/scripts/oracle-codex-handoff-safe.sh +49 -0
  179. package/scripts/oracle-plan-mcp.sh +66 -0
  180. package/scripts/oracle-pro-safe.sh +572 -0
package/README.md CHANGED
@@ -1,265 +1,580 @@
1
- # Composer — multi-agent orchestration for Claude Code
1
+ # Composer — multi-agent orchestration for daily coding
2
2
 
3
- [![npm](https://img.shields.io/badge/npm-agent--composer-blue)](#install) [![tests](https://img.shields.io/badge/vitest-435%20passing-brightgreen)](#contributing) [![license](https://img.shields.io/badge/license-MIT-lightgrey)](#license)
3
+ [![npm](https://img.shields.io/badge/npm-agent--composer-blue)](#install) [![MCP](https://img.shields.io/badge/MCP-Claude%20Code%20server-purple)](#architecture) [![license](https://img.shields.io/badge/license-MIT-lightgrey)](#license)
4
4
 
5
- > **Claude orchestrates. GLM, Codex, and `agy` execute — and *apply* — off your Claude quota.** Composer is an MCP server + Claude Code plugin that lets the most-capable model hold the plan while worker models generate *and write* the code in their own context. Because the executors apply files themselves (instead of returning text the main session must re-ingest), composer keeps the orchestrator's context lean and every change reviewable.
5
+ > **Claude Code stays the main brain. Composer routes planning, coding, research, review, and safety gates to the right worker model.**
6
6
 
7
- ## What it is
7
+ Composer is an MCP server plus Claude Code plugin for people who want the strongest model to hold the product and architecture context, while cheaper or more specialized models do the mechanical work. It keeps the main Claude session focused on intent, integration, and decisions instead of spending tokens on raw code generation, patch application, repeated research, or long review logs.
8
8
 
9
- Two coordinated artefacts:
9
+ ## Why this exists
10
10
 
11
- | Artefact | Purpose |
12
- |---|---|
13
- | **`agent-composer`** (this npm package) | MCP server exposing `composer_handoff_create`, `composer_research`, `composer_code`, `composer_code_chain`, `composer_code_cli`, `composer_review`, and `composer_review_claude`. Wraps GLM (via Anthropic-compatible endpoint) and CLI executors such as Codex, `agy`, or bounded `claude -p`. |
14
- | **`composer-mastermind`** (Claude Code plugin) | Orchestrator skill + haiku-wrapped subagents (`coder`, `researcher`, `reviewer`, optional `reviewer-claude`) + `boundary_guard` PreToolUse hook + `/evolve` slash command. |
11
+ Modern LLM development gets expensive and messy when one chat session does everything:
15
12
 
16
- Combined, they turn the main Claude session into a coordinator that never writes code or edits files directly. The main session may use Bash for inspection and verification, while code changes are dispatched through Composer MCP tools. The boundary hook fails closed if a denied file-mutating tool is requested.
13
+ - planning the feature,
14
+ - searching docs,
15
+ - editing files,
16
+ - debugging failures,
17
+ - reviewing the diff,
18
+ - remembering local workflow rules,
19
+ - and enforcing commit gates.
17
20
 
18
- ## Tools
21
+ Composer separates those jobs.
19
22
 
20
- Seven MCP tools, all routing work off the main Claude session:
23
+ | Need | Composer answer |
24
+ | ------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------- |
25
+ | Keep the best model as the strategist | Claude Code orchestrates; optional ChatGPT Pro via Oracle acts as a slow co-oracle for hard planning/review/debugging. |
26
+ | Save Claude Code tokens | Codex, GLM, `agy`, and bounded `claude -p` calls run outside the main Claude context and return compact summaries. |
27
+ | Let workers edit locally | `composer_code_cli` lets Codex or another CLI executor generate **and apply** code directly in the repo. |
28
+ | Avoid copy/paste between agents | `composer_handoff_create` writes shared packets under `.composer/handoffs/`. |
29
+ | Add quality gates | Review lanes, Codex pre-commit review, Claude Code hooks, terminal git hooks, and doctor checks make failures visible. |
30
+ | Improve over time | The `composer-mastermind` skill and `/evolve` loop let routing guidance improve from real failures, with manual promotion only. |
21
31
 
22
- | Tool | Executor | What it does |
23
- |---|---|---|
24
- | `composer_handoff_create` | Composer server | Writes a compact shared packet under `.composer/handoffs/`; pass `handoffPath` to Codex, GLM, agy, researcher, and reviewer calls so every worker shares the same objective and constraints. |
25
- | `composer_code_cli` | Codex CLI or agy | **Default for code edits.** The configured CLI executor generates **and applies** files itself off-CC, from the MCP server root, then returns a bounded summary. Use Codex here for complex coding work. |
26
- | `composer_code_chain` | GLM authors → server applies | GLM fallback. GLM writes the complete files off-CC (`FILE: <path>` + fenced blocks); the MCP server applies them deterministically off-CC; the orchestrator only relays a summary. ~71% fewer total-CC tokens on multi-file tasks. |
27
- | `composer_code` | GLM | Legacy patch-only lane. Use only when you explicitly need GLM diff/text output instead of an apply-capable lane. |
28
- | `composer_research` | Codex CLI search | Direct docs/web/current-context lane → bounded structured summary. Runs Codex with live web search and a read-only sandbox. |
29
- | `composer_review` | agy | Direct diff-review lane. Ask it to run repo-appropriate targeted checks off-CC; use a reviewer model different from the author for cross-model rigor (e.g. GLM writes → agy reviews). |
30
- | `composer_review_claude` | Claude Code CLI | Premium second-opinion review for high-risk/security-sensitive diffs or explicit user requests. Default config runs bounded `claude -p --model opus` with read/test tools only and `--max-budget-usd 0.50`. |
32
+ ## Real-world scenarios
31
33
 
32
- **Why "off-CC" matters:** GLM (z.ai), Codex, and agy run on *separate* quotas. Generating and *applying* code in their own context — not returning text the main Claude session must re-ingest — is what actually preserves your Max5 quota. The eval harness scores on **total-CC tokens** (every Claude model in a run = real Max5 burn), with a correctness gate (tsc/tests) and N-run averaging.
34
+ ### 1. Build a feature without burning the main context
33
35
 
34
- ## Install
36
+ ```text
37
+ You describe the feature
38
+ → Claude Code creates a compact handoff
39
+ → Codex implements with composer_code_cli
40
+ → agy reviews with composer_review
41
+ → Claude Code integrates only the summary and decisions
42
+ ```
43
+
44
+ ### 2. Use ChatGPT Pro only when it is worth the wait
45
+
46
+ ```text
47
+ Architecture unclear? → composer_oracle_plan(mode="deep")
48
+ Hard root cause? → composer_oracle_plan(mode="debug")
49
+ Large risky diff? → composer_oracle_plan(mode="review")
50
+ Long research, not urgent? → composer_oracle_job_start + composer_oracle_job_result
51
+ Routine docs lookup? → composer_research, not Oracle
52
+ ```
53
+
54
+ Oracle is deliberately opt-in. It drives a real ChatGPT Pro browser session through `steipete/oracle`, so it is slower and should be used for high-value reasoning, not every small question.
55
+
56
+ ### 3. Keep commits gated
57
+
58
+ ```text
59
+ Claude-issued git commit → PreToolUse pre-commit gate can deny
60
+ Manual terminal git commit → install the real .git/hooks/pre-commit bridge
61
+ CI / protected branch → recommended final backstop
62
+ ```
63
+
64
+ ### 4. Recover when a bug fight stalls
65
+
66
+ ```text
67
+ 1 failed fix → inspect locally and retry normally
68
+ 2+ failed attempts → Codex lifecycle/rescue or Oracle debug
69
+ Patch produced → composer_review before reporting done
70
+ ```
71
+
72
+ ## Quick start
73
+
74
+ ### Install
35
75
 
36
76
  ```bash
37
- # 1. Install the MCP server
38
77
  npm install -g agent-composer
78
+ ```
79
+
80
+ ### Recommended global setup
39
81
 
40
- # 2. Bootstrap a project (creates composer.config.json + .env.json template +
41
- # .gitignore + .claude/settings.json with mcpServers.composer entry)
82
+ Use this when you want Composer available from many projects.
83
+
84
+ ```bash
85
+ agent-composer init --global
86
+ $EDITOR ~/.config/composer/.env.json
87
+ agent-composer doctor
88
+ ```
89
+
90
+ Then launch Claude Code from any repo:
91
+
92
+ ```bash
93
+ claude
94
+ ```
95
+
96
+ ### Project-local setup
97
+
98
+ Use this when a repo needs its own provider routing or stricter gates.
99
+
100
+ ```bash
42
101
  cd your-project
43
102
  agent-composer init
103
+ $EDITOR .env.json
104
+ agent-composer doctor
105
+ ```
44
106
 
45
- # 3. Fill credentials
46
- $EDITOR .env.json # ANTHROPIC_BASE_URL + ANTHROPIC_AUTH_TOKEN
107
+ ### Optional ChatGPT Pro / Oracle setup
47
108
 
48
- # 4. Install the plugin (manual until Claude Code plugin marketplace lands)
49
- mkdir -p ~/.claude/plugins
50
- git clone <this-repo> /tmp/composer
51
- cp -R /tmp/composer/plugin/composer-mastermind ~/.claude/plugins/
109
+ Oracle is not enabled by default.
52
110
 
53
- # 5. Launch
54
- claude
111
+ ```bash
112
+ cd your-project
113
+ agent-composer init --oracle
114
+ scripts/oracle-pro-safe.sh --mode quick -- "Say OK."
115
+ agent-composer doctor
116
+ ```
117
+
118
+ The first real Oracle run may open a browser for login. Complete the ChatGPT login once, then rerun the smoke test. The adapter probes optional Oracle browser flags before using them, so it should degrade gracefully across compatible Oracle builds.
119
+
120
+ ## Cheat sheet
121
+
122
+ ### User prompts
123
+
124
+ | What you want | Say this |
125
+ | ---------------------------------- | ------------------------------------------------------------ |
126
+ | Normal code change | “Implement this using Composer.” |
127
+ | Multi-file feature | “Create a handoff, implement via Codex, then review.” |
128
+ | Current docs or API lookup | “Use `composer_research` first.” |
129
+ | Hard architecture planning | “Use `composer_oracle_plan` with mode `deep`.” |
130
+ | Risky diff review | “Run `composer_review`, then Oracle review if risk remains.” |
131
+ | Stuck debugging | “Use Oracle debug after the failed attempts.” |
132
+ | Do not block while Oracle thinks | “Start an async Oracle job and poll it later.” |
133
+ | Premium Claude review | “Escalate to `composer_review_claude`.” |
134
+ | Toggle Composer enforcement | `/composer disable` (this session) · `/composer enable` · `/composer status` |
135
+
136
+ ### MCP tools
137
+
138
+ | Tool | Use it for |
139
+ | --------------------------------- | ------------------------------------------------------------------------- |
140
+ | `composer_handoff_create` | Create a compact shared packet for multi-agent work. |
141
+ | `composer_research` | Fast docs/current-context lookup through the researcher role. |
142
+ | `composer_code_cli` | Default code-edit lane. Codex/CLI executor writes files directly. |
143
+ | `composer_code_chain` | GLM writes complete file blocks; Composer applies them deterministically. |
144
+ | `composer_code` | Legacy patch/text-only GLM lane. Rare fallback. |
145
+ | `composer_review` | Default diff review lane. |
146
+ | `composer_review_claude` | Expensive second-opinion review. |
147
+ | `composer_oracle_plan` | Synchronous ChatGPT Pro planning/review/debug lane. |
148
+ | `composer_oracle_job_start` | Non-blocking Oracle job for long work. |
149
+ | `composer_oracle_job_result` | Poll/read an Oracle job result. |
150
+ | `composer_codex_lifecycle_decide` | Cheap policy decision: skip, ask, or run Codex. |
151
+ | `composer_codex_lifecycle_run` | Run an advisory Codex lifecycle checkpoint. |
152
+ | `composer_codex_lifecycle_result` | Read a lifecycle checkpoint result. |
153
+ | `composer_route_decide` | Choose the next Composer lane for a task. |
154
+ | `composer_workflow_plan` | Produce an ordered workflow plan for multi-step work. |
155
+ | `composer_audit_record` | Append a structured audit event. |
156
+ | `composer_audit_read` | Read recent audit events for project context. |
157
+ | `composer_audit_summary` | Summarize recent routing, review, test, and outcome audit events. |
158
+ | `composer_session_get` | Inspect the current Composer session settings. |
159
+ | `composer_session_set` | Update session-local mode, oracle, or profile settings. |
160
+ | `composer_status` | Read config, integration, activity, and recommendation status. |
161
+ | `composer_goal_start` | Start one project goal with objective, condition, checks, and budget. |
162
+ | `composer_goal_status` | Inspect active or named goal state and next advisory action. |
163
+ | `composer_goal_step` | Advance the advisory goal loop from deterministic check results. |
164
+ | `composer_goal_clear` | Cancel or clear a project goal record. |
165
+ | `composer_config_get` | Inspect active/project/global Composer config. |
166
+ | `composer_config_set` | Safely patch lifecycle and review-gate settings. |
167
+
168
+ ### Oracle force tags
169
+
170
+ Use these when you want deterministic routing inside an Oracle prompt or wrapper:
171
+
172
+ ```text
173
+ [oracle:quick] Small sanity check through ChatGPT web.
174
+ [oracle:standard] Moderate design question.
175
+ [oracle:deep] Feature planning, architecture, migration design.
176
+ [oracle:plan] Same as deep, but semantically a plan.
177
+ [oracle:review] High-risk design or diff review.
178
+ [oracle:debug] Hard root-cause analysis.
179
+ [oracle:research] Slow research/synthesis.
180
+ [oracle:async] Orchestrator hint: use the async Oracle job tools.
181
+ [codex] Keep it on the cheaper Codex lane.
55
182
  ```
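To make the force-tag routing above concrete, here is a minimal sketch of how a wrapper might detect a tag in a prompt. The tag names come from the list above; the function `parseForceTag`, the `ForceTag` shape, and the "first recognized tag wins" rule are assumptions for illustration, not Composer's actual implementation.

```typescript
// Hypothetical helper: tag names mirror the README list, the parsing rules are assumed.
type OracleTag =
  | "quick" | "standard" | "deep" | "plan"
  | "review" | "debug" | "research" | "async";

interface ForceTag {
  lane: "oracle" | "codex";
  tag?: OracleTag; // only set for the oracle lane
}

const ORACLE_TAGS: readonly OracleTag[] = [
  "quick", "standard", "deep", "plan", "review", "debug", "research", "async",
];

// Returns the first recognized force tag in the prompt, or null when none is present.
function parseForceTag(prompt: string): ForceTag | null {
  const pattern = /\[(codex|oracle:([a-z]+))\]/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(prompt)) !== null) {
    if (match[1] === "codex") return { lane: "codex" };
    const name = match[2];
    const known = ORACLE_TAGS.find((t) => t === name);
    if (known !== undefined) return { lane: "oracle", tag: known };
    // Unknown oracle:* tags are skipped rather than guessed at.
  }
  return null;
}
```

In this sketch an unrecognized tag such as `[oracle:bogus]` is ignored, so a typo falls back to normal routing instead of silently selecting a slow lane.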
56
183
 
57
- Verify the orchestrator skill loaded:
184
+ ## Daily workflows
185
+
186
+ ### Feature work
58
187
 
188
+ ```text
189
+ 1. composer_handoff_create
190
+ 2. composer_research if current docs/API context is needed
191
+ 3. composer_oracle_plan(mode="deep") only if architecture is unclear
192
+ 4. composer_code_cli to implement
193
+ 5. composer_review on the diff
194
+ 6. targeted tests / typecheck
195
+ 7. optional composer_review_claude for high-risk changes
59
196
  ```
60
- /composer-mastermind
197
+
198
+ ### Hard debugging
199
+
200
+ ```text
201
+ 1. Capture the smallest failing command/output.
202
+ 2. Try the obvious local fix once.
203
+ 3. After repeated failure, call composer_oracle_plan(mode="debug") or Codex rescue.
204
+ 4. Feed the diagnosis to composer_code_cli.
205
+ 5. Run tests and composer_review.
61
206
  ```
62
207
 
63
- Smoke-test the self-evolution loop:
208
+ ### Review before merge
64
209
 
210
+ ```text
211
+ Routine diff → composer_review
212
+ Large/risky diff → composer_review + composer_oracle_plan(mode="review")
213
+ Security-sensitive → composer_review + composer_review_claude, optionally Oracle review
65
214
  ```
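The review escalation above can be read as a small lookup from diff risk to review lanes. The sketch below encodes it that way; the `DiffRisk` labels and the `reviewLanes` function are hypothetical names chosen for illustration, and only the tool names are taken from the README.

```typescript
// Hypothetical mapping of the "Review before merge" table; not Composer's own code.
type DiffRisk = "routine" | "large-or-risky" | "security-sensitive";

const ORACLE_REVIEW = 'composer_oracle_plan(mode="review")';

// Returns the ordered list of review lanes to run for a diff of the given risk.
function reviewLanes(risk: DiffRisk, withOptionalOracle = false): string[] {
  const lanes: string[] = ["composer_review"]; // every diff gets the default lane
  if (risk === "large-or-risky") {
    lanes.push(ORACLE_REVIEW);
  }
  if (risk === "security-sensitive") {
    lanes.push("composer_review_claude");
    if (withOptionalOracle) lanes.push(ORACLE_REVIEW); // the README marks this as optional
  }
  return lanes;
}
```

Keeping `composer_review` first in every case matches the README's framing of it as the default gate, with the more expensive lanes added only as risk grows.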
66
- /evolve --eval-mode synthetic
215
+
216
+ ### Long research
217
+
218
+ ```text
219
+ Blocking decision → composer_oracle_plan(mode="research")
220
+ Useful but not urgent → composer_oracle_job_start(mode="research"), then poll result
221
+ Routine lookup → composer_research
67
222
  ```
68
223
 
224
+ ## Architecture
225
+
226
+ ```mermaid
227
+ flowchart TD
228
+ U[You] --> CC[Claude Code main session<br/>orchestrator: plan, integrate, decide]
229
+ CC -->|MCP tools| MCP[Composer MCP server]
230
+ MCP --> R[composer_research → researcher]
231
+ MCP --> CODE[composer_code_cli / code_chain → coder / coderCli]
232
+ MCP --> REV[composer_review / review_claude → reviewer]
233
+ MCP --> HO[composer_handoff_create → .composer/handoffs/]
234
+ MCP --> OR[composer_oracle_plan / job_* → oraclePlanner]
235
+ R --> P1[Codex / GLM research]
236
+ CODE --> P2[Codex / GLM apply files in repo]
237
+ REV --> P3[agy / Claude review]
238
+ OR --> P4[ChatGPT Pro via steipete/oracle browser]
239
+ P2 -->|bounded summary| CC
240
+ P3 -->|verdict| CC
241
+ P4 -->|advisory plan| CC
242
+ CC --> G[Gates: boundary_guard · codexReview · git hook · doctor]
243
+ ```
244
+
245
+ ### Main components
246
+
247
+ | Path | Purpose |
248
+ | ----------------------------- | --------------------------------------------------------------------- |
249
+ | `src/server.ts` | Registers Composer MCP tools. |
250
+ | `src/providers/` | Provider adapters: CLI, Anthropic-compatible, mock. |
251
+ | `src/util/` | Handoffs, lifecycle jobs, Oracle jobs/locks, dispatch hints, helpers. |
252
+ | `plugin/composer-mastermind/` | Claude Code plugin: skill, subagents, hooks, `/evolve`. |
253
+ | `scripts/` | Oracle adapters, review hooks, release/dev helpers. |
254
+ | `docs/adr/` | Architecture decisions and append-only contracts. |
255
+ | `tests/` | Vitest, hook tests, script tests. |
256
+
257
+ ### Execution model
258
+
259
+ Composer has one invariant:
260
+
261
+ > The main Claude session coordinates. Worker lanes execute. Review gates decide whether work is good enough.
262
+
263
+ That means:
264
+
265
+ - Claude Code should plan, inspect, verify, and integrate.
266
+ - File writes should go through `composer_code_cli` or `composer_code_chain`.
267
+ - Review should run through a different model/provider than the author whenever practical.
268
+ - Oracle is advisory only; it never edits files.
269
+ - Background jobs are persisted as state records, but Oracle async jobs are server-lifetime, not OS-detached workers.
270
+
271
+ ### Bounded execution
272
+
273
+ Every dispatch the main Claude session awaits has a hard deadline and is cancellable; no lane can block the brain indefinitely.
274
+
275
+ | Surface | Bound |
276
+ | ------- | ----- |
277
+ | Providers | Internal default timeout plus the caller's `AbortSignal`. CLI providers bound total wall-clock across retries and kill the child process tree on timeout. |
278
+ | Background jobs | Oracle, review, and Codex lifecycle jobs run under wall-clock deadlines, propagate brain cancellation, and flush to a terminal state on server `SIGTERM`. |
279
+ | Status | Persistence is async and best-effort, off the event loop. Startup prunes stale in-flight entries after `COMPOSER_ACTIVE_RUN_TTL_MS` (default 2h). |
280
+ | Hooks | `boundary_guard`, `dispatch_guard`, precommit, and `learn` run with bounded timeouts and fail closed. |
281
+ | Config | Role `timeoutMs` values in `composer.config.json` override sane defaults. |
282
+
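The "bound total wall-clock across retries" rule in the Providers row can be illustrated with a small budget calculation: each attempt gets the smaller of its own timeout and whatever remains of the overall budget. This is a sketch under assumed names (`nextAttemptTimeoutMs` and its parameters are hypothetical); the diff does not show how the CLI provider actually computes this.

```typescript
// Hypothetical helper: caps each retry so the sum of attempts never exceeds the total budget.
function nextAttemptTimeoutMs(
  totalBudgetMs: number, // overall wall-clock limit for the dispatch (e.g. a role's timeoutMs)
  elapsedMs: number,     // time already spent on earlier attempts
  perAttemptMs: number,  // upper bound for a single attempt
): number | null {
  const remaining = totalBudgetMs - elapsedMs;
  if (remaining <= 0) return null; // budget exhausted: do not start another attempt
  return Math.min(perAttemptMs, remaining);
}
```

With a 120000 ms budget and 90000 ms per attempt, a first attempt that uses its full 90000 ms leaves a retry only 30000 ms, so one retry cannot double the time the caller waits.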
283
+ ### Live status
284
+
285
+ Active Composer runs are tracked in `~/.composer/state/active-runs.json`, enriched with the tool, provider label/role, phase, detail, and start time. Two surfaces consume it:
286
+
287
+ - `composer_status` reports in-flight runs with elapsed time, provider, and phase.
288
+ - `scripts/composer-statusline-segment.mjs` renders a compact statusline segment (e.g. `⚡composer: review(glm) 2m`) for the terminal status bar.
289
+
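As an illustration of the statusline format quoted above, the following sketch renders one active run into a segment string. The `ActiveRun` field names and both functions are assumptions; the real record layout in `active-runs.json` and the logic in `composer-statusline-segment.mjs` are not shown in this part of the diff.

```typescript
// Hypothetical record shape: the README lists tool, provider, phase, detail, and start time.
interface ActiveRun {
  tool: string;        // e.g. "composer_review"
  provider: string;    // provider label, e.g. "glm"
  startedAtMs: number; // epoch milliseconds
}

// Formats an elapsed duration as a coarse single unit: seconds, minutes, or hours.
function formatElapsed(ms: number): string {
  const totalSeconds = Math.max(0, Math.floor(ms / 1000));
  if (totalSeconds < 60) return `${totalSeconds}s`;
  const minutes = Math.floor(totalSeconds / 60);
  if (minutes < 60) return `${minutes}m`;
  return `${Math.floor(minutes / 60)}h`;
}

// Builds a compact segment such as "⚡composer: review(glm) 2m".
function formatSegment(run: ActiveRun, nowMs: number): string {
  const lane = run.tool.replace(/^composer_/, "");
  return `⚡composer: ${lane}(${run.provider}) ${formatElapsed(nowMs - run.startedAtMs)}`;
}
```

Passing `nowMs` explicitly instead of calling `Date.now()` inside the function keeps the formatter deterministic and easy to test.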
69
290
  ## Configuration
70
291
 
71
- Two files at the consumer-project root, both gitignored or partially gitignored:
292
+ Composer reads config from the active project first, then from the global Composer config when no project config exists.
72
293
 
73
- **`composer.config.json`** (committed) provider routing + spend caps:
294
+ `composer.config.json` is hot-reloaded on every provider dispatch, so role, lifecycle, and review-gate edits take effect without restarting the MCP server.
295
+
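The lookup order described above (project config first, global config only when no project config exists) can be sketched as a small resolver. The function name, the injected `exists` callback, and the example paths are hypothetical; in particular, the exact location of the global `composer.config.json` is an assumption here.

```typescript
// Hypothetical resolver: project config wins, the global config is only a fallback.
function resolveConfigPath(
  projectConfig: string,
  globalConfig: string,
  exists: (path: string) => boolean, // injected so the sketch does not touch the filesystem
): string | null {
  if (exists(projectConfig)) return projectConfig;
  if (exists(globalConfig)) return globalConfig;
  return null; // no config found anywhere
}

// Example paths are illustrative only.
const onDisk = new Set<string>(["/home/me/.config/composer/composer.config.json"]);
const chosen = resolveConfigPath(
  "/work/repo/composer.config.json",
  "/home/me/.config/composer/composer.config.json",
  (p) => onDisk.has(p),
);
```

Note that this models fallback, not merging: when a project config exists, the global file is not consulted at all, which is how the sentence above reads.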
296
+ | File | Purpose |
297
+ | ----------------------- | -------------------------------------------------------------------------- |
298
+ | `composer.config.json` | Provider roles, lifecycle policy, review-gate settings. Usually committed. |
299
+ | `.env.json` | Provider credentials. Never commit. |
300
+ | `.claude/settings.json` | Claude Code MCP server wiring and hooks. |
301
+
302
+ ### Minimal provider roles
74
303
 
75
304
  ```json
76
305
  {
77
306
  "roles": {
78
- "researcher": { "provider": "cli", "cli": ["codex", "--search", "--ask-for-approval", "never", "exec", "--ephemeral", "--sandbox", "read-only"], "timeoutMs": 180000, "retries": 0 },
79
- "coder": { "provider": "anthropic", "baseUrl": "https://api.z.ai/api/anthropic", "apiKeyEnv": "ANTHROPIC_AUTH_TOKEN" },
80
- "coderCli": { "provider": "cli", "cli": ["codex", "exec", "--ephemeral", "--sandbox", "workspace-write", "-c", "approval_policy=\"never\"", "-c", "model_reasoning_effort=\"medium\""], "timeoutMs": 900000, "retries": 0 },
81
- "reviewer": { "provider": "cli", "cli": ["agy", "--dangerously-skip-permissions", "--print-timeout", "90s", "-p"], "timeoutMs": 120000, "retries": 0 },
82
- "reviewerClaude": {
307
+ "researcher": {
308
+ "provider": "cli",
309
+ "cli": ["codex", "--search", "--ask-for-approval", "never", "exec", "--ephemeral", "--sandbox", "read-only"],
310
+ "timeoutMs": 180000,
311
+ "retries": 0
312
+ },
313
+ "coder": {
314
+ "provider": "anthropic",
315
+ "baseUrl": "https://api.z.ai/api/anthropic",
316
+ "apiKeyEnv": "ANTHROPIC_AUTH_TOKEN"
317
+ },
318
+ "coderCli": {
83
319
  "provider": "cli",
84
- "model": "claude-opus-review",
85
- "cli": ["claude", "-p", "--model", "opus", "--permission-mode", "bypassPermissions", "--setting-sources", "project", "--disable-slash-commands", "--no-session-persistence", "--max-budget-usd", "0.50", "--tools", "Read,Glob,Grep,Bash", "--allowedTools", "Read,Glob,Grep,Bash(npx tsc --noEmit),Bash(npm test),Bash(npm run test:*),Bash(npx vitest*)"],
86
- "timeoutMs": 300000,
320
+ "cli": ["codex", "exec", "--ephemeral", "--sandbox", "workspace-write", "-c", "approval_policy=\"never\"", "-c", "model_reasoning_effort=\"medium\""],
321
+ "timeoutMs": 900000,
87
322
  "retries": 0
323
+ },
324
+ "reviewer": {
325
+ "provider": "cli",
326
+ "cli": ["agy", "--dangerously-skip-permissions", "--print-timeout", "110s", "-p"],
327
+ "timeoutMs": 120000,
328
+ "retries": 1
88
329
  }
89
- },
90
- "spendAuthorization": {
91
- "mode": "interactive",
92
- "maxUsdPerCall": 0.50,
93
- "maxUsdPerSession": 5.00
94
330
  }
95
331
  }
96
332
  ```
97
333
 
98
- For the old agy-only coding path, set `coderCli.cli` back to
99
- `["agy", "--dangerously-skip-permissions", "-p"]`. For the old agy-only
100
- research path, set `researcher.cli` to the same agy argv. The provider
101
- contract does not change; Codex is piloted as the existing CLI executor.
102
- When `coderCli` or `researcher` use `codex ... exec`, Composer captures
103
- Codex's final message with `--output-last-message` automatically, so the
104
- main session receives a short outcome instead of raw event output. Composer
105
- refuses explicit `codex exec --sandbox danger-full-access` and
106
- `--dangerously-bypass-approvals-and-sandbox` configs by default; set
107
- `COMPOSER_ALLOW_DANGEROUS_CODEX=1` only inside an external sandbox.
108
- The default Codex coding lane sets `timeoutMs` to 15 minutes and overrides
109
- the nested Codex run to `model_reasoning_effort="medium"` so it does not
110
- inherit slower global high-effort settings intended for the main orchestrator.
111
- Keep `reviewer` as the default gate. Use `reviewerClaude` only when the user
112
- asks for Claude review or when a risky diff needs an expensive second opinion.
113
-
114
- ### Fast direct-tool mode
115
-
116
- Composer keeps the CLI executor path, but the plugin now treats it more like a
117
- small SDK harness:
118
-
119
- - `composer_code_cli` is the default edit lane; the legacy `coder` subagent is
120
- only for rare patch-only GLM fallback.
121
- - `composer_research`, `composer_review`, and `composer_review_claude` can be
122
- called directly because their providers already run off the main Claude Code
123
- context and return bounded summaries.
124
- - The `researcher`, `reviewer`, and `reviewer-claude` subagents remain available
125
- when raw upstream output is expected to be large enough to need an isolated
126
- wrapper context.
127
- - CLI calls append best-effort timing records to
128
- `/tmp/composer-cli-usage.jsonl`; GLM calls append timing/cache records to
129
- `/tmp/composer-glm-usage.jsonl`. These files contain durations and character
130
- counts plus success/error status, not prompts.
131
-
132
- **`.env.json`** (NEVER commit) — credentials only:
334
+ Currently wired provider IDs are `mock`, `anthropic`, and `cli`. The schema reserves `openai_compatible`, but the runtime does not yet implement that adapter.
335
+
336
+ ### Optional Oracle role
337
+
338
+ Created by `agent-composer init --oracle`:
133
339
 
134
340
  ```json
135
341
  {
136
- "ANTHROPIC_BASE_URL": "https://api.z.ai/api/anthropic",
137
- "ANTHROPIC_AUTH_TOKEN": "<your-glm-or-anthropic-compatible-token>"
342
+ "roles": {
343
+ "oraclePlanner": {
344
+ "provider": "cli",
345
+ "cli": ["bash", "scripts/oracle-plan-mcp.sh", "--mode", "auto", "--"],
346
+ "timeoutMs": 1200000,
347
+ "retries": 0,
348
+ "maxResultChars": 14000
349
+ }
350
+ }
351
+ }
352
+ ```
353
+
354
+ Keep `researcher` on Codex. Oracle is for opt-in planning/review/debugging, not routine search.
355
+
356
+ ### Codex lifecycle policy
357
+
358
+ `codexLifecycle` controls advisory Codex participation at lifecycle points such as post-plan, post-code, test failure, or repeated failed attempts.
359
+
360
+ ```json
361
+ {
362
+ "codexLifecycle": {
363
+ "enabled": true,
364
+ "mode": "ask",
365
+ "execution": "background",
366
+ "model": "gpt-5.4-mini",
367
+ "triggers": {
368
+ "postPlan": true,
369
+ "postCodeApply": true,
370
+ "postTestFailure": true,
371
+ "afterFailedAttempts": true,
372
+ "preCommit": false,
373
+ "stopWarm": false
374
+ },
375
+ "thresholds": {
376
+ "minScore": 60,
377
+ "minExpectedOutputTokens": 500,
378
+ "minChangedFiles": 2,
379
+ "minDiffLines": 80,
380
+ "failedAttempts": 2
381
+ },
382
+ "fallback": {
383
+ "enabled": true,
384
+ "order": ["reviewerClaude", "reviewer", "coder"]
385
+ }
386
+ }
138
387
  }
139
388
  ```
140
389
 
141
- The MCP server reads `.env.json` via `fs.readFileSync`; it is **never** exposed to the orchestrator session.
390
+ Lifecycle runs are advisory. They should not silently mutate files; apply suggestions through the normal code lane and review them.
391
+
392
+ ### Codex review gate
142
393
 
143
- ### Soft-disable Composer
394
+ `codexReview` controls optional cross-LLM review and pre-commit gating.
395
+ `codexReview.preCommitHook.maxConsecutiveBlocks` is an escape hatch for review-gate oscillation. Keep it unset by default. When set to `N` (> 0), after `N` consecutive blocks on the same branch the gate allows the commit once, emits an audited `allow-cap` event, and resets the counter.
144
396
 
145
- Composer hooks can be disabled without editing Claude Code settings:
397
+ ```json
398
+ {
399
+ "codexReview": {
400
+ "enabled": true,
401
+ "preCommitCommand": "adversarial-review",
402
+ "scope": "auto",
403
+ "model": "gpt-5.5",
404
+ "preCommitHook": {
405
+ "enabled": true,
406
+ "blockOnSeverity": "high",
407
+ "timeoutMs": 900000,
408
+ "failClosed": true
409
+ },
410
+ "warmCache": {
411
+ "enabled": true,
412
+ "maxAgeMinutes": 30
413
+ }
414
+ }
415
+ }
416
+ ```
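
The `maxConsecutiveBlocks` escape hatch described above can be read as a small state machine. The following is a minimal sketch under stated assumptions, not Composer's actual gate code: `decideGate`, `GateState`, and `GateDecision` are hypothetical names, and the sketch assumes that a passing review clears the streak and that "after `N` consecutive blocks" means the attempt following the `N`th block is the one let through.

```typescript
// Hypothetical sketch of the maxConsecutiveBlocks escape hatch; not Composer's real code.
type GateDecision = "allow" | "block" | "allow-cap";

interface GateState {
  // Consecutive blocks recorded so far for the current branch.
  consecutiveBlocks: number;
}

function decideGate(
  reviewWantsBlock: boolean,
  state: GateState,
  maxConsecutiveBlocks?: number, // unset (the default) means no cap
): { decision: GateDecision; state: GateState } {
  if (!reviewWantsBlock) {
    // Assumption: a passing review resets the streak.
    return { decision: "allow", state: { consecutiveBlocks: 0 } };
  }
  const capReached =
    maxConsecutiveBlocks !== undefined &&
    maxConsecutiveBlocks > 0 &&
    state.consecutiveBlocks >= maxConsecutiveBlocks;
  if (capReached) {
    // Let one commit through, report it as "allow-cap", and reset the counter.
    return { decision: "allow-cap", state: { consecutiveBlocks: 0 } };
  }
  return {
    decision: "block",
    state: { consecutiveBlocks: state.consecutiveBlocks + 1 },
  };
}
```

With a cap of 2 and a review that keeps blocking, this sketch yields block, block, allow-cap, and then starts counting again. A real gate would also need to persist the counter per branch and write the audit event, which the sketch leaves out.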
417
+
418
+ To gate manual terminal commits, install the real Git hook bridge:
146
419
 
147
420
  ```bash
148
- # Disable for one launch
149
- COMPOSER_ENABLED=0 claude
421
+ printf '#!/usr/bin/env bash\nexec "$(git rev-parse --show-toplevel)/scripts/precommit_codex_review.sh" --git-hook\n' \
422
+ > .git/hooks/pre-commit
423
+ chmod +x .git/hooks/pre-commit
424
+ ```
425
+
426
+ `git commit --no-verify` still bypasses local Git hooks. Use CI and branch protection when every commit path must be enforced.
427
+
428
+ ### Spend and consent policy
150
429
 
151
- # Disable globally for already-configured hooks
152
- touch ~/.claude/composer.disabled
430
+ `spendAuthorization` is a routing/consent policy exposed to the orchestrator and config tools.
153
431
 
154
- # Re-enable globally
155
- rm -f ~/.claude/composer.disabled
432
+ ```json
433
+ {
434
+ "spendAuthorization": {
435
+ "mode": "interactive",
436
+ "maxUsdPerCall": 0.5,
437
+ "maxUsdPerSession": 5.0
438
+ }
439
+ }
156
440
  ```
157
441
 
158
- Project-local disable is also supported with `touch .composer-disabled`.
159
- For scripts or tests, set `COMPOSER_DISABLED_FILE=/path/to/sentinel`.
160
- This disables Composer hooks immediately. To fully suppress skill autoload,
161
- also set `"composer-mastermind": "off"` in Claude Code `skillOverrides` and
162
- restart CC.
442
+ CLI providers such as Codex and `agy` are billed through their own authenticated accounts and do not share a billing meter with Composer. Keep provider budgets conservative and run `agent-composer doctor` before relying on gates.
163
443
 
164
- ## How dispatch works
444
+ ### Config tools
165
445
 
166
- Inside a Claude Code session, dispatch flow:
446
+ Inside Claude Code, prefer MCP config tools over hand-editing JSON:
167
447
 
448
+ ```text
449
+ composer_config_get(scope="active")
450
+ composer_config_set(scope="project", codexLifecycle={...})
451
+ composer_config_set(scope="project", codexReview={...})
168
452
  ```
169
- User asks for code work
170
-
171
- Composer-mastermind SKILL.md picks a direct MCP tool or fallback subagent
172
-
173
- Direct MCP call → composer_code_cli / composer_research / composer_review
174
- or Task fallback coder.md / researcher.md / reviewer.md / reviewer-claude.md
175
-
176
- MCP server routes to GLM (anthropic) or Codex/agy CLI per composer.config.json
177
-
178
- Provider returns bounded summary; orchestrator integrates
453
+
454
+ `composer_config_set` intentionally accepts narrow patches for lifecycle and review-gate settings, validates the result, and refuses implicit writes to the global fallback path.
455
+
456
+ ## Trust, security, and limits
457
+
458
+ Composer is designed for supervised local development.
459
+
460
+ ### Global enforcement
461
+
462
+ The boundary_guard hook is installed once at the user level and applies in every repository. The main Claude session cannot call `Edit`/`Write`/`Update`/`NotebookEdit` directly anywhere — those route through `composer_code_cli` / `composer_code_chain`. Enforcement defaults to ON and is gated only by kill switches, read fresh on every tool call (no restart needed):
463
+
464
+ | Switch | Scope | Effect |
465
+ | ------ | ----- | ------ |
466
+ | `~/.claude/composer.disabled` | Global | Suspends enforcement in all repos. Toggle with `/composer disable` / `/composer enable`. |
467
+ | `$CLAUDE_PROJECT_DIR/.composer-disabled` | Per-repo | Opts a single repo out of enforcement. |
468
+ | `COMPOSER_DANGEROUSLY_BYPASS_PERMISSIONS=1` | Process env | Lets authorized headless jobs/workers author files directly. Dev-only escape hatch. |
469
+
470
+ The `/composer` slash command (`enable` / `disable` / `status`) flips `~/.claude/composer.disabled` live. It affects hook enforcement only — it does not stop or reconfigure the MCP server.
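
To make the kill-switch table concrete, here is a minimal sketch of how the three switches could be evaluated on each tool call. It is illustrative only: the real hook is a shell script, `enforcementSuspendedBy` and `SwitchContext` are hypothetical names, and the order in which the switches are checked is an assumption, since the table does not state a precedence.

```typescript
// Hypothetical sketch of the kill-switch check; the real boundary_guard hook is a shell script.
type Env = Record<string, string | undefined>;

interface SwitchContext {
  env: Env;
  homeDir: string;
  // Injected so the check can be exercised without touching the real filesystem.
  fileExists: (path: string) => boolean;
}

// Returns the scope of the switch that suspends enforcement, or null when enforcement is ON.
function enforcementSuspendedBy(ctx: SwitchContext): string | null {
  if (ctx.env["COMPOSER_DANGEROUSLY_BYPASS_PERMISSIONS"] === "1") {
    return "process-env";
  }
  if (ctx.fileExists(`${ctx.homeDir}/.claude/composer.disabled`)) {
    return "global";
  }
  const projectDir = ctx.env["CLAUDE_PROJECT_DIR"];
  if (projectDir !== undefined && ctx.fileExists(`${projectDir}/.composer-disabled`)) {
    return "per-repo";
  }
  return null; // no switch set: enforcement defaults to ON
}
```

Because nothing is cached between calls, creating or removing a sentinel file takes effect on the next tool call, which matches the "read fresh on every tool call" behaviour described above.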
471
+
472
+ ### What is mechanically enforced
473
+
474
+ - `boundary_guard.sh` denies main-thread file mutation tools and MCP write/edit/exec wrappers in every repo (global user-level hook), fails closed on malformed input, and canonicalizes paths via the nearest existing ancestor so new-directory writes are not false-gated.
475
+ - `composer_code_chain` rejects path traversal and symlink escapes before applying files.
476
+ - `CLIProvider` uses argv arrays, not shell interpolation, and refuses dangerous Codex sandbox configs by default.
477
+ - Codex pre-commit gates can fail closed in Claude Code hook mode and terminal git-hook mode.
478
+ - Oracle browser runs are protected by a single-holder lock and stored under local Composer state / `.composer/oracle/` artefacts.
479
+ - Provider execution, polling loops, CLI retries, the Codex lifecycle chain, Oracle jobs, and hook/reaper/lock runtimes are all time-bounded with cancellation propagated, so a stuck worker cannot hang the main session.
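
As an illustration of the `CLIProvider` bullet, the sketch below shows one way an argv array could be screened for dangerous Codex sandbox settings before it is executed. It is not the package's implementation: `isDangerousCodexArgv` and `assertSafeCodexArgv` are hypothetical names. The two flags and the `COMPOSER_ALLOW_DANGEROUS_CODEX=1` override are taken from the 0.3.1 README text removed earlier in this diff; whether 0.4.1 keeps the same override is an assumption.

```typescript
// Hypothetical sketch of screening a Codex argv array; not the real CLIProvider code.
function isDangerousCodexArgv(argv: readonly string[]): boolean {
  if (argv[0] !== "codex") return false;
  if (argv.includes("--dangerously-bypass-approvals-and-sandbox")) return true;
  // Only the two-token form `--sandbox danger-full-access` is checked here.
  const sandboxIndex = argv.indexOf("--sandbox");
  return sandboxIndex !== -1 && argv[sandboxIndex + 1] === "danger-full-access";
}

function assertSafeCodexArgv(
  argv: readonly string[],
  env: Record<string, string | undefined>,
): void {
  // Assumption: the override is honoured only when set to exactly "1".
  const overridden = env["COMPOSER_ALLOW_DANGEROUS_CODEX"] === "1";
  if (isDangerousCodexArgv(argv) && !overridden) {
    throw new Error("refusing dangerous Codex sandbox config");
  }
}
```

Working on the argv array directly means the check sees each argument as a discrete token and never has to parse a shell string. A production check would likely also cover `-c` overrides and a `--sandbox=value` spelling, which this sketch ignores.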
480
+
481
+ ### What still needs project discipline
482
+
483
+ - Use branch protection and CI to catch `git commit --no-verify` and pushes made outside the local workflow; both bypass local hooks.
484
+ - Do not pass secret files to Oracle or any external provider.
485
+ - Treat Oracle browser automation as a personal, supervised workflow, not a high-volume API.
486
+ - Review all code changes before reporting success.
487
+
488
+ ### Local artefacts to keep out of Git
489
+
490
+ At minimum:
491
+
492
+ ```gitignore
493
+ .env.json
494
+ .composer/handoffs/
495
+ .composer/codex-lifecycle/
496
+ .composer/oracle/
497
+ .composer/results/
498
+ .composer/briefs/
179
499
  ```
180
500
 
181
- Composer also emits a deterministic dispatch hint for `Task`/`Agent` calls
182
- when `scripts/dispatch_guard.sh` is installed. The hint classifies the
183
- request before the worker starts, so the orchestrator can choose a cheaper
184
- lane when the task is simple and reserve expensive paths for the cases that
185
- need isolation or extra reasoning.
186
-
187
- | Task shape | Default route |
188
- |---|---|
189
- | Tiny rename/comment/non-mutating request | Inline |
190
- | Small self-contained diff review | Inline review |
191
- | File mutation with path references | `composer_code_cli` |
192
- | Research-first implementation | `composer_research`, then `composer_code_cli` |
193
- | Security or large review | `composer_review` first; escalate to `composer_review_claude` only when needed |
194
- | Explicit premium/Claude review | `composer_review_claude` |
195
-
196
- ## Measuring trust
197
-
198
- Composer's route-confidence harness compares the same tasks across direct
199
- Claude, GLM-chain, and Codex-CLI routes. The `cc-only` route removes the
200
- worktree-local `.claude/` directory before running so the project plugin does
201
- not bias the baseline. It writes JSONL records with
202
- success, route adherence, typecheck status, changed-file count, wall time,
203
- and **total Claude Code tokens** from `modelUsage`.
501
+ ## Doctor and validation
502
+
503
+ Run this whenever setup feels suspicious:
204
504
 
205
505
  ```bash
206
- # Build first so the MCP server entry exists.
207
- npm run build
506
+ agent-composer doctor
507
+ ```
208
508
 
209
- # Run one representative task across all routes, three replicas each.
210
- npm run eval:routes -- --task t8-csv-module --runs 3
509
+ Add `--json` for a machine-readable report (full JSON on stdout; exit 0 = healthy, exit 1 = unhealthy):
211
510
 
212
- # Re-summarize an existing JSONL file without spending more tokens.
213
- npm run eval:routes -- --summary-only --input /tmp/composer-route-runs.jsonl
511
+ ```bash
512
+ agent-composer doctor --json
214
513
  ```
215
514
 
216
- The headline checks are: `composer-codex-cli` should preserve or improve
217
- success/typecheck rate while lowering median total-CC tokens versus
218
- `cc-only`; `routeHonored` must stay high enough to prove the orchestrator is
219
- actually using the route under test.
515
+ Useful local checks for contributors:
220
516
 
221
- Five resilience layers ensure unattended `/evolve` runs cannot damage the host repo:
222
-
223
- 1. **Sandbox isolation** — each per-task eval runs in a throwaway `git worktree` at `/tmp/composer-eval-<pid>-<taskId>`
224
- 2. **Per-task fault isolation** — one task's spawn failure records `score: 0` and continues
225
- 3. **Stat-gate precondition guards** — Wilcoxon paired-test skips when arrays are asymmetric
226
- 4. **Spawn diagnostics** — stderr/stdout tail appended to error messages
227
- 5. **Per-task wall-time bound** — `execFile` `timeout: 180_000` with SIGTERM; absorbed by layer 2
517
+ ```bash
518
+ npm run typecheck
519
+ npm run test
520
+ npm run test:hooks
521
+ npm run test:scripts
522
+ npm run schema:lint
523
+ ```
228
524
 
229
- ## Security model
525
+ Oracle-specific smoke tests:
230
526
 
231
- - **`agent-composer` publish surface**: `dist/`, `plugin/`, `composer.config.schema.json`, `README.md`, `package.json`. No tests, no source, no `.env*` (gitignored). Current npm dry-run package size is 84.4 KB.
232
- - **Spend caps**: per-call (`maxUsdPerCall`, default $0.50) and per-session (`maxUsdPerSession`, default $5.00) enforced in the runner before any external API call. Configurable per project.
233
- - **Self-evolution scope** (see ADR 0003): five layers gate any SKILL.md mutation: diff-path regex, text deny-list, stat gate, human-promote-only, audit trail. Auto-promote is permanently off the table.
234
- - **Boundary hook**: PreToolUse fail-closed denial of `Edit`/`Update`/`Write`/`NotebookEdit` in the orchestrator session, plus MCP write/edit/exec variants. Native Bash is allowed for inspection and verification. The C0.5 subagent tools allowlist is append-only.
527
+ ```bash
528
+ scripts/oracle-pro-safe.sh --dry-run --mode quick -- "Smoke test. Say OK."
529
+ scripts/oracle-pro-safe.sh --mode quick -- "Say OK and identify the mode/model."
530
+ ```
235
531
 
236
- ## Contributing
532
+ ## Project layout
533
+
534
+ ```text
535
+ agent-composer/
536
+ ├── src/
537
+ │ ├── server.ts # MCP tool registration
538
+ │ ├── providers/ # CLI / Anthropic-compatible / mock providers
539
+ │ ├── config/ # config loader, paths, schema mirror
540
+ │ ├── cli/ # init, doctor, dispatch hint helpers
541
+ │ └── util/ # handoffs, jobs, locks, routing, apply helpers
542
+ ├── plugin/composer-mastermind/ # Claude Code plugin, skill, subagents, hooks
543
+ ├── scripts/ # Oracle adapters, review gates, release helpers
544
+ ├── docs/adr/ # architecture decisions
545
+ ├── tests/ # Vitest, hook, and script tests
546
+ └── composer.config.schema.json # user-facing config schema
547
+ ```
237
548
 
238
- Clone, install, run tests:
549
+ ## Development
239
550
 
240
551
  ```bash
241
- git clone <this-repo>
242
- cd composer
552
+ git clone https://github.com/xicv/agent-composer.git
553
+ cd agent-composer
243
554
  npm install
244
- npx tsc --noEmit # type check
245
- ./node_modules/.bin/vitest run # 435 tests
246
- ./node_modules/.bin/ajv validate \ # schema lint
247
- --strict=false -c ajv-formats \
248
- -s composer.config.schema.json \
249
- -d composer.config.json
555
+ npm run typecheck
556
+ npm run test
557
+ npm run test:hooks
558
+ npm run test:scripts
559
+ npm run schema:lint
560
+ npm run build
250
561
  ```
251
562
 
252
- Per-task layer reference docs (in the source tree):
563
+ Before publishing or merging a large routing change, run a real dogfood task through:
564
+
565
+ ```text
566
+ handoff → code_cli → review → targeted tests → doctor
567
+ ```
253
568
 
254
- - `docs/STATUS.md` — current state + dogfood audit log + every /evolve run
255
- - `docs/multi_agent_orchestration_plan.md` — architecture
256
- - `docs/tdd_plan.md` — build sequence + quality rubric
257
- - `docs/self_evolving_composer.md` — autonomous skill evolution (T1/T2/T3)
258
- - `docs/adr/0001-contracts.md` — frozen C0.1–C0.5 contracts (append-only)
259
- - `docs/adr/0002-meta-mcp.md` — Wave 4 packaging contract (M0.1–M0.5)
260
- - `docs/adr/0003-self-evolution.md` — self-evolution mutation scope (S1–S5)
569
+ ## Design principles
261
570
 
262
- The `/evolve` loop is a GEPA-style reflective optimizer: it evaluates the parent skill, captures **failing-task transcripts**, and routes them into mutation operators (`add_counterexample` / `add_constraint` / `add_negative_example` / `reflect_and_rewrite`) so each candidate is shaped by real failures. A no-op guard skips mutations that produce no change. Recommended supervised invocation: `--eval-mode real --length-lambda 0.0001 --replicas 3 --tasks <code subset>`. It mutates only the project-local `.claude/skills/composer-mastermind/SKILL.md`, writes `SKILL.candidate.md` for **manual review** (auto-promote is permanently off), and the published plugin install is read-only. Release sync from dev to plugin happens via `scripts/release-sync.mjs --bump <semver>`.
571
+ 1. **One brain, many workers.** The orchestrator owns intent; providers own execution.
572
+ 2. **Offload complete work, not fragments.** Prefer workers that apply files and return summaries.
573
+ 3. **Review with a different model.** Author and reviewer should usually be different providers.
574
+ 4. **Keep Oracle rare and valuable.** ChatGPT Pro is for hard reasoning, not routine lookup.
575
+ 5. **Make state durable and inspectable.** Handoffs, jobs, answers, and reviews should have paths.
576
+ 6. **Prefer opt-in gates.** Strong gates are available, but users choose the strictness per repo.
577
+ 7. **Never hide uncertainty.** Failed providers, skipped reviews, and orphaned jobs should surface as records, not disappear.
263
578
 
264
579
  ## License
265
580