agent-composer 0.1.13 → 0.1.15

package/README.md CHANGED
@@ -1,8 +1,8 @@
  # Composer — multi-agent orchestration for Claude Code
 
- [![npm](https://img.shields.io/badge/npm-agent--composer-blue)](#install) [![tests](https://img.shields.io/badge/vitest-319%20passing-brightgreen)](#contributing) [![license](https://img.shields.io/badge/license-MIT-lightgrey)](#license)
+ [![npm](https://img.shields.io/badge/npm-agent--composer-blue)](#install) [![tests](https://img.shields.io/badge/vitest-376%20passing-brightgreen)](#contributing) [![license](https://img.shields.io/badge/license-MIT-lightgrey)](#license)
 
- > **Claude orchestrates. GLM and `agy` execute.** Composer is an MCP server + Claude Code plugin that lets the most-capable model hold the plan while cheaper models do the typing saving Claude Max5 tokens and keeping every dispatched task reviewable.
+ > **Claude orchestrates. GLM and `agy` execute — and *apply* — off your Claude quota.** Composer is an MCP server + Claude Code plugin that lets the most-capable model hold the plan while cheaper models generate *and write* the code in their own context. Because the executors apply files themselves (instead of returning text the main session must re-ingest), composer measurably cuts Claude Max5 token burn (~64–71% on substantial multi-file tasks) while keeping the orchestrator's context lean and every change reviewable.
 
  ## What it is
 
@@ -10,11 +10,25 @@ Two coordinated artefacts:
 
  | Artefact | Purpose |
  |---|---|
- | **`agent-composer`** (this npm package) | MCP server exposing `composer_research`, `composer_code`, `composer_review` tools. Wraps GLM (via Anthropic-compatible endpoint) and the `agy` CLI (Gemini). |
+ | **`agent-composer`** (this npm package) | MCP server exposing `composer_research`, `composer_code`, `composer_code_chain`, `composer_code_cli`, `composer_review`. Wraps GLM (via Anthropic-compatible endpoint) and the `agy` CLI (Gemini). |
  | **`composer-mastermind`** (Claude Code plugin) | Orchestrator skill + three haiku-wrapped subagents (`coder`, `researcher`, `reviewer`) + `boundary_guard` PreToolUse hook + `/evolve` slash command. |
 
  Combined, they turn the main Claude session into a coordinator that never writes code, runs bash, or edits files directly. Work is dispatched through the three MCP tools; the boundary hook fails closed if a denied tool is requested.
 
+ ## Tools
+
+ Five MCP tools, all routing work off the main Claude session:
+
+ | Tool | Executor | What it does |
+ |---|---|---|
+ | `composer_code_chain` | GLM authors → server applies | **Default for substantial edits.** GLM writes the complete files off-CC (`FILE: <path>` + fenced blocks); the MCP server applies them deterministically off-CC; the orchestrator only relays a summary. ~71% fewer total-CC tokens on multi-file tasks. |
+ | `composer_code_cli` | agy (Gemini CLI) | agy generates **and applies** the files itself off-CC, returns a summary. ~64% fewer total-CC tokens. |
+ | `composer_code` | GLM | Returns a patch as text; the caller integrates it (patch-only / legacy). |
+ | `composer_research` | agy | Research, docs, web lookup → structured summary. |
+ | `composer_review` | agy | Reviews the diff **and runs `tsc`/tests off-CC**; use a reviewer model different from the author for cross-model rigor (e.g. GLM writes → agy reviews). |
+
+ **Why "off-CC" matters:** GLM (z.ai) and agy (Gemini) run on *separate* quotas. Generating and *applying* code in their own context — not returning text the main Claude session must re-ingest — is what actually preserves your Max5 quota. The eval harness scores on **total-CC tokens** (every Claude model in a run = real Max5 burn), with a correctness gate (tsc/tests) and N-run averaging.
+
  ## Install
 
  ```bash
@@ -61,6 +75,7 @@ Two files at the consumer-project root, both gitignored or partially gitignored:
  "roles": {
  "researcher": { "provider": "cli", "cli": ["agy", "--dangerously-skip-permissions", "-p"] },
  "coder": { "provider": "anthropic", "baseUrl": "https://api.z.ai/api/anthropic", "apiKeyEnv": "ANTHROPIC_AUTH_TOKEN" },
+ "coderCli": { "provider": "cli", "cli": ["agy", "--dangerously-skip-permissions", "-p"] },
  "reviewer": { "provider": "cli", "cli": ["agy", "--dangerously-skip-permissions", "-p"] }
  },
  "spendAuthorization": {
@@ -124,7 +139,7 @@ git clone <this-repo>
  cd composer
  npm install
  npx tsc --noEmit # type check
- ./node_modules/.bin/vitest run # 319 tests
+ ./node_modules/.bin/vitest run # 376 tests
  ./node_modules/.bin/ajv validate \ # schema lint
  --strict=false -c ajv-formats \
  -s composer.config.schema.json \
@@ -141,7 +156,7 @@ Per-task layer reference docs (in the source tree):
  - `docs/adr/0002-meta-mcp.md` — Wave 4 packaging contract (M0.1–M0.5)
  - `docs/adr/0003-self-evolution.md` — self-evolution mutation scope (S1–S5)
 
- The `/evolve` loop mutates only the project-local `.claude/skills/composer-mastermind/SKILL.md` the published plugin install is read-only. Release sync from dev to plugin happens via `scripts/release-sync.mjs --bump <semver>`.
+ The `/evolve` loop is a GEPA-style reflective optimizer: it evaluates the parent skill, captures **failing-task transcripts**, and routes them into mutation operators (`add_counterexample` / `add_constraint` / `add_negative_example` / `reflect_and_rewrite`) so each candidate is shaped by real failures. A no-op guard skips mutations that produce no change. Recommended supervised invocation: `--eval-mode real --length-lambda 0.0001 --replicas 3 --tasks <code subset>`. It mutates only the project-local `.claude/skills/composer-mastermind/SKILL.md`, writes `SKILL.candidate.md` for **manual review** (auto-promote is permanently off), and the published plugin install is read-only. Release sync from dev to plugin happens via `scripts/release-sync.mjs --bump <semver>`.
 
  ## License
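
The `composer_code_chain` row in the README diff above says GLM emits complete files as `FILE: <path>` headers followed by fenced blocks, which the MCP server then applies. Below is a minimal TypeScript sketch of how output in that shape could be parsed. It is an illustration under assumptions, not the package's actual parser: `FileBlock` and `parseFileBlocks` are hypothetical names, and the exact wire format (one `FILE:` line, optional blank lines, a triple-backtick fence, LF line endings) is inferred from the one-line description in the README.

```typescript
// Hypothetical sketch: NOT agent-composer's real parser.
// Assumed format: a "FILE: <path>" line, then a fenced block holding the full file.

interface FileBlock {
  path: string;
  content: string;
}

// Built with repeat() so this example can itself sit inside a fenced block.
const FENCE = "`".repeat(3);

function parseFileBlocks(output: string): FileBlock[] {
  const lines = output.split("\n");
  const blocks: FileBlock[] = [];
  let i = 0;
  while (i < lines.length) {
    const header = /^FILE:\s*(\S.*)$/.exec(lines[i]);
    if (header === null) {
      i++;
      continue;
    }
    const path = header[1].trim();
    i++;
    // Allow blank lines between the header and the opening fence.
    while (i < lines.length && lines[i].trim() === "") i++;
    // A header with no fence after it is ignored; the current line is re-examined.
    if (i >= lines.length || !lines[i].startsWith(FENCE)) continue;
    i++; // step past the opening fence (any language tag on it is discarded)
    const body: string[] = [];
    while (i < lines.length && !lines[i].startsWith(FENCE)) {
      body.push(lines[i]);
      i++;
    }
    // Unterminated block: drop it rather than return a partial file.
    if (i >= lines.length) break;
    i++; // step past the closing fence
    blocks.push({ path, content: body.join("\n") + "\n" });
  }
  return blocks;
}
```

In this sketch an unterminated block is dropped instead of half-applied, which seems in keeping with the README's "deterministically" wording, though the real server may behave differently. File contents that themselves contain a line starting with a triple backtick would need a more robust delimiter than the one assumed here.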
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "agent-composer",
- "version": "0.1.13",
+ "version": "0.1.15",
  "type": "module",
  "description": "Multi-agent orchestration MCP server. Claude orchestrates; GLM and agy do the work.",
  "bin": {
@@ -1,7 +1,7 @@
  ---
  name: coder
  description: Use when the orchestrator needs code written, refactored, debugged, or implemented. Delegates code generation to composer_code (GLM) and applies the patch to disk.
- tools: mcp__composer__composer_code, Read, Glob, Edit, Write
+ tools: mcp__composer__composer_code, Read, Glob, Edit, Write, Bash
  model: haiku
  ---
 
@@ -27,5 +27,6 @@ You are the Composer **Coder** subagent. Your job is two-step:
  - DO NOT re-Read after Edit/Write — trust the tool's return value. PostToolUse hooks run lint + tsc as the verification gate. If a real bug shipped, the reviewer subagent catches it on the next pass.
  - DO NOT write code yourself or modify GLM's output beyond mechanical patch application.
  - DO NOT call composer_code more than once — if it fails, return the error.
- - DO NOT use Bash/sed/awk/perl. Edit/Write only.
+ - DO use `Bash` for filesystem setup and verification: `mkdir -p` before a Write, `ls`/`cat` to confirm a patch actually landed on disk, and the self-check gate (`npm run typecheck`, `vitest run <file>`). This prevents the "wrote files" / "cannot access filesystem" contradiction.
+ - DO NOT hand-author code edits through Bash (no `sed`/`awk`/`perl` to rewrite source). Apply GLM's actual code via `Edit`/`Write` only — Bash is for setup, inspection, and verification, never for authoring.
  - DO NOT critique the returned code — that is the reviewer's job.
@@ -80,7 +80,7 @@ case "$TOOL" in
  Bash|Edit|Write|NotebookEdit \
  | mcp__*__write_file | mcp__*__edit_file | mcp__*__bash \
  | mcp__*__write | mcp__*__edit | mcp__*__exec)
- emit_deny "DENY: dispatch Task(subagent_type=\"coder\"); no Bash workaround."
+ emit_deny "DENY (main thread): route Edit/Write via Task(subagent_type=\"coder\"). Coder applies the patch and may use Bash to verify."
  ;;
  esac
  esac
86
86
 
@@ -1,6 +1,6 @@
  {
  "name": "composer-mastermind",
- "version": "0.1.13",
+ "version": "0.1.15",
  "description": "Multi-agent orchestrator: Claude as brain, GLM/agy as executors. Dispatches code/research/review work to subagents wired through the @composer-mcp/server MCP server.",
  "claudeCodeVersion": ">=4.6",
  "requires": [