npm - glm-mcp-claude - Versions diffs - 1.0.0 → 1.1.0 - Mend

glm-mcp-claude 1.0.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/README.md CHANGED Viewed

@@ -19,9 +19,9 @@ files directly. One command to install.
 > **no separate pay-per-token Anthropic API key required**. Only GLM needs a (cheap) Z.ai key.
 > Opus orchestrates on your subscription; GLM does the heavy lifting for a fraction of the cost.
-![The glm subagent (orchestrated by Haiku 4.5, the cheap layer) reading this repo and offloading generation to GLM](assets/demo-glm-subagent-summary.png)
+![Directly calling the GLM agent to write a file end-to-end on disk](assets/demo-glm-agent-umbrella.png)
-<sub>↑ The `glm` subagent (orchestrated by Haiku 4.5, the cheap layer) reading the repo and offloading the heavy work to GLM via the MCP tools — the Opus → Haiku → GLM hybrid in action.</sub>
+<sub>↑ **Directly calling the GLM agent (`glm_agent`).** Prompt: *"write a 2000-word Shakespearean essay about the usefulness of an umbrella into my Desktop."* GLM created the file itself — **18 iterations, ~$0.064** — Opus never touched the keys. GLM reads, writes, edits, and runs your files directly.</sub>
 ```bash
 # from npm:
@@ -122,21 +122,16 @@ tool-heavy dependent loops, huge context, vision, or anything you mark sensitive
 | `glm_delegate` | GLM tokens | Text in → text out. GLM drafts; you place it. |
 | `glm_agent` | GLM tokens | GLM works your repo directly (read/write/edit/bash). Returns a diff + action log + git revert; supports `dry_run` (propose, don't write). |
-### Example: directly calling the GLM agent
+### Example: delegating a read-and-summarize task
-A real run — asking GLM (via `glm_agent`) to write a file end-to-end on disk:
+The `glm` subagent — its cheap Haiku layer driving, GLM doing the heavy lifting — reading this
+whole repo and summarizing it:
-![GLM agent writing a 2000-word Shakespearean essay to disk in 18 iterations for about 6 cents](assets/demo-glm-agent-umbrella.png)
+![The glm subagent (Haiku 4.5) reading the repo and offloading generation to GLM](assets/demo-glm-subagent-summary.png)
-> **Prompt:** *"Using the GLM agent `glm_agent`, write a 2000-word essay in Shakespearean format about the usefulness of an umbrella, into my Desktop."*
-GLM did it itself — created the file directly, no round-tripping the content through the main agent:
-- **Output:** `Umbrella-Essay-Shakespeare.md` — ~2,260 words of Early Modern English (*thee/thou/thy*, *doth/hath*) with two blank-verse interludes
-- **Work:** 18 tool-loop iterations; **one** file created, nothing existing touched
-- **Cost:** ~**$0.064** — a fraction of running the same task on Opus
-That's the point: the orchestrator stays on Opus while `glm_agent` does the heavy, file-touching work for cents.
+<sub>Orchestrated by Haiku 4.5 (the cheap layer), offloading the token-heavy work to GLM via the
+MCP tools — the Opus → Haiku → GLM hybrid in action. The orchestrator stays on Opus while GLM does
+the file-touching / heavy work for cents.</sub>
 ---

package/agents/glm.md CHANGED Viewed

@@ -36,6 +36,10 @@ call `glm_delegate` (task + pasted context), then apply it with your own Write/E
 2. **Serialize GLM calls** — one at a time (GLM caps concurrency ~1).
 3. **Verify before returning.** Build/lint/test or re-read. If GLM's output is wrong or it loops,
    retry once with a sharper prompt; if still bad, do the critical part yourself or escalate to Opus.
+4. **Always end your report with the GLM stats.** `glm_agent` prints a `=== GLM STATS ===` block
+   (model, tokens delegated, iterations, cost); `glm_delegate` prints a `[GLM delegated … tokens to
+   <model>]` line. Surface these in your final message so every run clearly states **which GLM model
+   ran (e.g. glm-5.2) and how many tokens were delegated.**
 ## Operating rules
 - **Serialize GLM calls.** GLM caps concurrent requests (~1); one `glm_delegate` at a time.

package/glm-mcp/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "glm-mcp",
-  "version": "1.0.0",
+  "version": "1.1.0",
   "description": "MCP server that delegates self-contained subtasks to the GLM (Zhipu/Z.ai) Anthropic-compatible API, so Claude Code can use GLM as a cheap, peak-aware subagent.",
   "type": "module",
   "bin": {

package/glm-mcp/src/index.js CHANGED Viewed

@@ -177,16 +177,26 @@ server.registerTool(
     const chosen = resolveModel(model, now);
     try {
       const r = await runGlmAgent({ model: chosen, task, context, workdir, maxTokens: resolveMaxTokens(max_tokens), thinking, dryRun: dry_run });
-      const cost = estimateCost(chosen, r.usage.input_tokens, r.usage.output_tokens, now);
+      const inTok = r.usage.input_tokens || 0;
+      const outTok = r.usage.output_tokens || 0;
+      const totalTok = inTok + outTok;
+      const cost = estimateCost(chosen, inTok, outTok, now);
+      const opusCost = estimateCost("claude-opus", inTok, outTok, now);
+      const xCheaper = cost > 0 ? Math.round(opusCost / cost) : "?";
       const banner = r.dryRun ? "*** DRY RUN — nothing was written; this is GLM's PROPOSED change for you to approve ***\n" : "";
-      const totalTok = (r.usage.input_tokens || 0) + (r.usage.output_tokens || 0);
       const header =
-        `[GLM agent] delegated ${totalTok} tokens (${r.usage.input_tokens || 0} in / ${r.usage.output_tokens || 0} out) to ${chosen} — est $${cost} | ` +
-        `dir=${r.root} | iterations=${r.iters}${r.hitCap ? " (HIT CAP -- may be incomplete)" : ""} | actions=${r.actions.length} | files=${r.changedFiles.length}`;
+        `[GLM agent] ${chosen} | dir=${r.root} | ${r.iters} iterations${r.hitCap ? " (HIT CAP -- may be incomplete)" : ""} | ${r.actions.length} actions | ${r.changedFiles.length} files`;
       const actions = r.actions.length ? `\nActions:\n- ${r.actions.join("\n- ")}` : "";
       const diff = r.diff ? `\n\n=== DIFF (review this) ===\n${r.diff}` : "\n\n(no file changes)";
       const revert = !r.dryRun && r.git && r.git.revertHint ? `\n\nRevert: ${r.git.revertHint}` : "";
-      return { content: [{ type: "text", text: clip(`${banner}${header}${actions}${diff}${revert}\n\n=== GLM SUMMARY ===\n${r.text}`) }] };
+      // Prominent stats footer, shown after every glm_agent run finishes.
+      const stats =
+        `\n\n=== GLM STATS (this subagent) ===\n` +
+        `model:      ${chosen}\n` +
+        `tokens:     ${totalTok} delegated to GLM (${inTok} in / ${outTok} out)\n` +
+        `iterations: ${r.iters}${r.hitCap ? " (hit cap)" : ""}   files changed: ${r.changedFiles.length}\n` +
+        `est. cost:  $${cost}  (~${xCheaper}x cheaper than Opus)`;
+      return { content: [{ type: "text", text: clip(`${banner}${header}${actions}${diff}${revert}\n\n=== GLM SUMMARY ===\n${r.text}${stats}`) }] };
     } catch (e) {
       return {
         isError: true,

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "glm-mcp-claude",
-  "version": "1.0.0",
+  "version": "1.1.0",
   "description": "GLM (Zhipu/Z.ai) as a cheap, full-capability subagent for Claude Code — auto-routing between Opus and GLM, a file-editing agent with diff/dry-run/git-revert oversight, and a one-command installer.",
   "type": "module",
   "bin": {