glm-mcp-claude 1.0.0 → 1.1.0

package/README.md CHANGED
@@ -19,9 +19,9 @@ files directly. One command to install.
  > **no separate pay-per-token Anthropic API key required**. Only GLM needs a (cheap) Z.ai key.
  > Opus orchestrates on your subscription; GLM does the heavy lifting for a fraction of the cost.

- ![The glm subagent (orchestrated by Haiku 4.5, the cheap layer) reading this repo and offloading generation to GLM](assets/demo-glm-subagent-summary.png)
+ ![Directly calling the GLM agent to write a file end-to-end on disk](assets/demo-glm-agent-umbrella.png)

- <sub>↑ The `glm` subagent (orchestrated by Haiku 4.5, the cheap layer) reading the repo and offloading the heavy work to GLM via the MCP tools — the Opus → Haiku → GLM hybrid in action.</sub>
+ <sub>↑ **Directly calling the GLM agent (`glm_agent`).** Prompt: *"write a 2000-word Shakespearean essay about the usefulness of an umbrella into my Desktop."* GLM created the file itself — **18 iterations, ~$0.064** — Opus never touched the keys. GLM reads, writes, edits, and runs your files directly.</sub>

  ```bash
  # from npm:
@@ -122,21 +122,16 @@ tool-heavy dependent loops, huge context, vision, or anything you mark sensitive
  | `glm_delegate` | GLM tokens | Text in → text out. GLM drafts; you place it. |
  | `glm_agent` | GLM tokens | GLM works your repo directly (read/write/edit/bash). Returns a diff + action log + git revert; supports `dry_run` (propose, don't write). |

- ### Example: directly calling the GLM agent
+ ### Example: delegating a read-and-summarize task

- A real run — asking GLM (via `glm_agent`) to write a file end-to-end on disk:
+ The `glm` subagent — its cheap Haiku layer driving, GLM doing the heavy lifting — reading this
+ whole repo and summarizing it:

- ![GLM agent writing a 2000-word Shakespearean essay to disk in 18 iterations for about 6 cents](assets/demo-glm-agent-umbrella.png)
+ ![The glm subagent (Haiku 4.5) reading the repo and offloading generation to GLM](assets/demo-glm-subagent-summary.png)

- > **Prompt:** *"Using the GLM agent `glm_agent`, write a 2000-word essay in Shakespearean format about the usefulness of an umbrella, into my Desktop."*
-
- GLM did it itself — created the file directly, no round-tripping the content through the main agent:
-
- - **Output:** `Umbrella-Essay-Shakespeare.md` — ~2,260 words of Early Modern English (*thee/thou/thy*, *doth/hath*) with two blank-verse interludes
- - **Work:** 18 tool-loop iterations; **one** file created, nothing existing touched
- - **Cost:** ~**$0.064** — a fraction of running the same task on Opus
-
- That's the point: the orchestrator stays on Opus while `glm_agent` does the heavy, file-touching work for cents.
+ <sub>Orchestrated by Haiku 4.5 (the cheap layer), offloading the token-heavy work to GLM via the
+ MCP tools — the Opus → Haiku → GLM hybrid in action. The orchestrator stays on Opus while GLM does
+ the file-touching / heavy work for cents.</sub>

  ---

package/agents/glm.md CHANGED
@@ -36,6 +36,10 @@ call `glm_delegate` (task + pasted context), then apply it with your own Write/E
  2. **Serialize GLM calls** — one at a time (GLM caps concurrency ~1).
  3. **Verify before returning.** Build/lint/test or re-read. If GLM's output is wrong or it loops,
  retry once with a sharper prompt; if still bad, do the critical part yourself or escalate to Opus.
+ 4. **Always end your report with the GLM stats.** `glm_agent` prints a `=== GLM STATS ===` block
+ (model, tokens delegated, iterations, cost); `glm_delegate` prints a `[GLM delegated … tokens to
+ <model>]` line. Surface these in your final message so every run clearly states **which GLM model
+ ran (e.g. glm-5.2) and how many tokens were delegated.**

  ## Operating rules
  - **Serialize GLM calls.** GLM caps concurrent requests (~1); one `glm_delegate` at a time.
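Rule 4 asks the subagent to surface the stats that `glm_agent` prints. The following is a minimal sketch of how a caller might pull those numbers out of the tool's text output, assuming the footer layout visible in the server hunk of this diff. `extractGlmStats` and its return shape are hypothetical and not part of the package, and the token counts and ratio in the sample are invented for illustration.

```javascript
// Hypothetical helper (not part of the package): parse the stats footer
// that glm_agent appends to its text output, per the layout in this diff.
function extractGlmStats(text) {
  const start = text.indexOf("=== GLM STATS");
  if (start === -1) return null; // no footer, e.g. glm_delegate output

  const block = text.slice(start);
  const model = /^model:\s*(\S+)/m.exec(block);
  const tokens = /^tokens:\s*(\d+) delegated to GLM \((\d+) in \/ (\d+) out\)/m.exec(block);
  const iterations = /^iterations:\s*(\d+)/m.exec(block);
  const cost = /^est\. cost:\s*\$([\d.]+)/m.exec(block);

  return {
    model: model ? model[1] : null,
    totalTokens: tokens ? Number(tokens[1]) : null,
    inputTokens: tokens ? Number(tokens[2]) : null,
    outputTokens: tokens ? Number(tokens[3]) : null,
    iterations: iterations ? Number(iterations[1]) : null,
    estCostUsd: cost ? Number(cost[1]) : null,
  };
}

// Sample output; the token counts and the "20x" figure are made up.
const sampleOutput = [
  "=== GLM SUMMARY ===",
  "Wrote the essay to disk.",
  "",
  "=== GLM STATS (this subagent) ===",
  "model: glm-5.2",
  "tokens: 1500 delegated to GLM (1200 in / 300 out)",
  "iterations: 18 files changed: 1",
  "est. cost: $0.064 (~20x cheaper than Opus)",
].join("\n");

console.log(extractGlmStats(sampleOutput));
```

Returning `null` when the footer is absent lets a caller fall back to the single-line `[GLM delegated … tokens to <model>]` form that `glm_delegate` prints, which this sketch does not parse.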
@@ -1,6 +1,6 @@
  {
  "name": "glm-mcp",
- "version": "1.0.0",
+ "version": "1.1.0",
  "description": "MCP server that delegates self-contained subtasks to the GLM (Zhipu/Z.ai) Anthropic-compatible API, so Claude Code can use GLM as a cheap, peak-aware subagent.",
  "type": "module",
  "bin": {
@@ -177,16 +177,26 @@ server.registerTool(
  const chosen = resolveModel(model, now);
  try {
  const r = await runGlmAgent({ model: chosen, task, context, workdir, maxTokens: resolveMaxTokens(max_tokens), thinking, dryRun: dry_run });
- const cost = estimateCost(chosen, r.usage.input_tokens, r.usage.output_tokens, now);
+ const inTok = r.usage.input_tokens || 0;
+ const outTok = r.usage.output_tokens || 0;
+ const totalTok = inTok + outTok;
+ const cost = estimateCost(chosen, inTok, outTok, now);
+ const opusCost = estimateCost("claude-opus", inTok, outTok, now);
+ const xCheaper = cost > 0 ? Math.round(opusCost / cost) : "?";
  const banner = r.dryRun ? "*** DRY RUN — nothing was written; this is GLM's PROPOSED change for you to approve ***\n" : "";
- const totalTok = (r.usage.input_tokens || 0) + (r.usage.output_tokens || 0);
  const header =
- `[GLM agent] delegated ${totalTok} tokens (${r.usage.input_tokens || 0} in / ${r.usage.output_tokens || 0} out) to ${chosen} est $${cost} | ` +
- `dir=${r.root} | iterations=${r.iters}${r.hitCap ? " (HIT CAP -- may be incomplete)" : ""} | actions=${r.actions.length} | files=${r.changedFiles.length}`;
+ `[GLM agent] ${chosen} | dir=${r.root} | ${r.iters} iterations${r.hitCap ? " (HIT CAP -- may be incomplete)" : ""} | ${r.actions.length} actions | ${r.changedFiles.length} files`;
  const actions = r.actions.length ? `\nActions:\n- ${r.actions.join("\n- ")}` : "";
  const diff = r.diff ? `\n\n=== DIFF (review this) ===\n${r.diff}` : "\n\n(no file changes)";
  const revert = !r.dryRun && r.git && r.git.revertHint ? `\n\nRevert: ${r.git.revertHint}` : "";
- return { content: [{ type: "text", text: clip(`${banner}${header}${actions}${diff}${revert}\n\n=== GLM SUMMARY ===\n${r.text}`) }] };
+ // Prominent stats footer, shown after every glm_agent run finishes.
+ const stats =
+ `\n\n=== GLM STATS (this subagent) ===\n` +
+ `model: ${chosen}\n` +
+ `tokens: ${totalTok} delegated to GLM (${inTok} in / ${outTok} out)\n` +
+ `iterations: ${r.iters}${r.hitCap ? " (hit cap)" : ""} files changed: ${r.changedFiles.length}\n` +
+ `est. cost: $${cost} (~${xCheaper}x cheaper than Opus)`;
+ return { content: [{ type: "text", text: clip(`${banner}${header}${actions}${diff}${revert}\n\n=== GLM SUMMARY ===\n${r.text}${stats}`) }] };
  } catch (e) {
  return {
  isError: true,
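The `xCheaper` figure in the hunk above is a ratio of two cost estimates computed for the same token counts. The sketch below reproduces that arithmetic in isolation. `estimateCostSketch`, `cheaperFactor`, the model keys, and the price table are all hypothetical stand-ins, since the real `estimateCost` and its pricing data are not shown in this diff, and the prices are invented rather than real rates.

```javascript
// Invented per-million-token prices in USD, for illustration only (not real rates).
const PRICES_PER_MILLION = {
  "cheap-model": { input: 1, output: 2 },
  "reference-model": { input: 10, output: 20 },
};

// Hypothetical stand-in for estimateCost: USD cost for one model and token counts.
function estimateCostSketch(model, inTok, outTok) {
  const price = PRICES_PER_MILLION[model];
  if (!price) return 0; // unknown model: no estimate
  return (inTok * price.input + outTok * price.output) / 1e6;
}

// Same guard as the diff: divide only when the cheaper cost is positive.
function cheaperFactor(cheapCost, referenceCost) {
  return cheapCost > 0 ? Math.round(referenceCost / cheapCost) : "?";
}

const sampleIn = 1000000;
const sampleOut = 500000;
const cheapCost = estimateCostSketch("cheap-model", sampleIn, sampleOut);
const referenceCost = estimateCostSketch("reference-model", sampleIn, sampleOut);
console.log(`~${cheaperFactor(cheapCost, referenceCost)}x cheaper`);
```

One consequence of the guard: when the cheaper estimate is zero, for example if it is rounded down for a very small run, the footer shows `?` instead of `Infinity`.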
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "glm-mcp-claude",
- "version": "1.0.0",
+ "version": "1.1.0",
  "description": "GLM (Zhipu/Z.ai) as a cheap, full-capability subagent for Claude Code — auto-routing between Opus and GLM, a file-editing agent with diff/dry-run/git-revert oversight, and a one-command installer.",
  "type": "module",
  "bin": {