gm-skill 2.0.1623 → 2.0.1625

package/AGENTS.md CHANGED
@@ -104,7 +104,7 @@ Every skill's `allowed-tools:` is reduced to `Skill, Read, Write` (plus the SKIL
 
  **Every possible aspect that can be checked for jank is a PRD row; the architecture is pliable**: at PLAN, for every surface the prompt concerns, enumerate every aspect checkable for `jank` -- every immaturity, unfinished edge, half-wired path -- across gui/ux/ui/client-state/server-state/the boundary and any surface reached, each its own row including a profiling row and a security row per surface. `jank` is load-bearing: hunt the rough/unpolished/almost-done, not only outright bugs. Scoped to the prompt's concern + its reachable closure, exhaustive within it. Every issue found opens its own debug-and-repair plan spooled the same turn; every quick improvement is spooled too. `pliable`: every architectural change that clearly improves or reduces maintenance burden is a spooled plan -- replacing bespoke code with native functionality or a popular well-maintained library is encouraged ONLY when it nets a smaller maintained surface (a heavy dep for a few lines is the guarded failure mode). Fan-out is the spool-native shape (parallel `prd-add`/`codesearch`/`exec_js`, plugkit task-spawn), never the platform's Task/Explore subagent. One tell-tale AI design element (boilerplate flourish, over-hedged comment, generic scaffold name, machine-authored shape) spawns a full-codebase sweep plan -- scan/per-cluster/fix-and-verify rows, exhaustive over every file, never a one-off fix.
 
- **Client-side debugging exposes globals and evaluates in-browser, never blind-restarts**: surface the relevant state as a `window.*` global and read it live via the `browser` verb's `page.evaluate`, running experiments in the browser, rather than blind experimentation + server restarts. The live page is the debugger; the same `browser` surface that witnesses an edit also diagnoses it.
+ **Client-side debugging exposes globals and evaluates in-browser, never blind-restarts**: the live page is the debugger (rs-learn: `recall: client-side-debug-globals-live-page`).
 
  **Mundane user-facing output is suppressed or stripped to the bone**: drop articles, preamble, play-by-play; boot-probe narration, dispatch echoes, restating prose just read, status recaps do not ship. What survives is substantive: a real finding, a decision + one-line reason, a blocker, the single-line PRD-read declaration. Terse = fewer/shorter words, NEVER zero tool calls and NEVER silent work -- the turn still ends in the chain-advancing tool call.
 
@@ -162,7 +162,7 @@ Orchestration state is tracked via `.gm/` marker files, not hook events; the CLI
 
  **Apparent tooling failure is mechanical self-recovery, NEVER a question for the user and never an a/b-test/blind-restart.** A missing spool response / stale watcher is the agent's own job: honor a future `busy_until` else boot the watcher and re-dispatch -- the spooler is sound by construction, so asking the user to do what a verb can do is a paper-spirit violation. Recovery mechanics (atomic `.status.json`, `FailedToOpenSocket` retry, debug-via-`window.*`-globals) in rs-learn (`recall: spooler self-recovery mechanics`).
 
- **Process-of-elimination is the debugging paradigm EVERYWHERE, and manual real-services witness is the verification paradigm EVERYWHERE.** Every debug -- code, wasm, cascade, browser, the spooler itself -- enumerates candidate causes as mutables and eliminates each by a witness read against real input (`exec_js`/`codesearch`/`Read`/`browser page.evaluate`), each elimination revealing the next, never guess-and-restart/a-b-test/shotgun. Every verification is manual labour against the real thing -- the single mock-free `test.js`, the live page, the real service, the live wasm -- never an automated unit/mock suite standing in for the real-services witness (the conventional-testing tell-tale gm replaces). Stated in `instructions/execute.md` (the served EXECUTE prose) so it reaches every LLM in-session.
+ **Process-of-elimination is the debugging paradigm EVERYWHERE, and manual real-services witness is the verification paradigm EVERYWHERE** -- both stated in `instructions/execute.md` (served EXECUTE prose). Detail in rs-learn (`recall: process-of-elimination manual-real-services-witness paradigm`).
 
  **The first verb after a genuine multi-minute IDLE is `instruction`, to reset the long-gap clock**: only spool verbs reset it, so a long investigation in platform tools trips a false stall -- interleave `instruction`/`prd-add` to stay warm, and dispatch `instruction` BEFORE any predictable blocking wait. Threshold + platform-tool exception in rs-learn (`recall: first verb after multi-minute wait instruction long-gap`).
 
package/bin/install.js CHANGED
@@ -7,7 +7,6 @@ const os = require('os');
  const readline = require('readline');
 
  const SKILL_NAME = 'gm';
- const AUTOCOMPACT_WINDOW = 380000;
 
  function out(msg) { process.stdout.write(msg + '\n'); }
  function err(msg) { process.stderr.write(msg + '\n'); }
@@ -102,8 +101,6 @@ function applyClaudeSettings(home) {
  const backup = settingsPath + '.bak';
  try { fs.copyFileSync(settingsPath, backup); err(`existing settings.json was malformed; backed up to ${backup}`); } catch (_) {}
  }
- obj.autoCompactEnabled = true;
- obj.autoCompactWindow = AUTOCOMPACT_WINDOW;
  obj.effortLevel = 'low';
  obj.alwaysThinkingEnabled = false;
  fs.mkdirSync(path.dirname(settingsPath), { recursive: true });
@@ -116,8 +113,6 @@ function applyClaudeSettings(home) {
 
  const SETTINGS_EXPLAINER = [
  'Claude Code settings applied:',
- ' autoCompactEnabled = true keep long sessions coherent by auto-compacting context',
- ` autoCompactWindow = ${AUTOCOMPACT_WINDOW} absolute token count (38% of a 1M window), not a percentage`,
  " effortLevel = low thinking effort lowered",
  ' alwaysThinkingEnabled = false explicit thinking turned off',
  '',
@@ -136,7 +131,7 @@ async function offerClaudeSettings(home) {
  try {
  out('');
  out('Claude Code detected. gm works best with reasoning-in-code rather than hidden thinking tokens.');
- out('Offer to set: autoCompactEnabled=true, autoCompactWindow=' + AUTOCOMPACT_WINDOW + ', effortLevel=low, alwaysThinkingEnabled=false.');
+ out('Offer to set: effortLevel=low, alwaysThinkingEnabled=false.');
  const ans = (await ask(rl, 'Apply these Claude Code settings now? [y/N] ')).trim().toLowerCase();
  if (ans === 'y' || ans === 'yes') {
  const r = applyClaudeSettings(home);
@@ -1,6 +1,6 @@
  {
  "name": "gm-plugkit",
- "version": "2.0.1623",
+ "version": "2.0.1625",
  "description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
  "main": "index.js",
  "bin": {
@@ -163,7 +163,6 @@ function dispatchVerbToWasmInternal(instance, verb, body) {
  if (!dispatch) return null;
  const verbBytes = new TextEncoder().encode(verb);
  const bodyBytes = new TextEncoder().encode(body || '');
- // writeWasmInput re-reads memory.buffer fresh after each alloc (avoids the detached-buffer write bug).
  let verbPtr = 0, bodyPtr = 0;
  try { verbPtr = writeWasmInput(instance, verbBytes, `dispatch_verb(${verb}).verb`); }
  catch (e) { throw new Error(`wasm-alloc-failed for dispatch_verb(${verb}): ${e.message}`); }
@@ -1054,10 +1053,6 @@ function startManagedBrowser(pw, profileDir) {
  '--disable-default-apps',
  '--disable-gpu-process-crash-limit',
  ];
- // In containers where unprivileged user namespaces are disabled, Chromium's
- // sandbox cannot initialize and the remote-debugging port never binds (the CDP
- // "did not become ready" failure). Opt in to running without the sandbox (plus
- // the small-/dev/shm workaround common in containers) via GM_BROWSER_NO_SANDBOX=1.
  if (process.env.GM_BROWSER_NO_SANDBOX === '1') {
  args.push('--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage');
  }
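The comment removed in this hunk explained that the no-sandbox flags are opt-in because Chromium's sandbox cannot initialize in containers where unprivileged user namespaces are disabled. A minimal sketch of that opt-in, assuming a hypothetical `browserArgs` helper that is not part of the package (the real code pushes onto `args` inside `startManagedBrowser`):

```javascript
// Hypothetical helper, for illustration only: append the sandbox-disabling flags
// when, and only when, GM_BROWSER_NO_SANDBOX is exactly the string '1'.
function browserArgs(env, base) {
  const args = [...base];
  if (env.GM_BROWSER_NO_SANDBOX === '1') {
    args.push('--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage');
  }
  return args;
}

const base = ['--disable-default-apps', '--disable-gpu-process-crash-limit'];
const sandboxed = browserArgs({}, base);                                 // default: sandbox kept
const unsandboxed = browserArgs({ GM_BROWSER_NO_SANDBOX: '1' }, base);   // explicit opt-in
const ignored = browserArgs({ GM_BROWSER_NO_SANDBOX: 'true' }, base);    // any other value is ignored
```

Only the literal value `1` enables the flags, so an unset or differently spelled variable leaves the sandbox on.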
@@ -1375,33 +1370,18 @@ function guardWasmRange(buffer, ptr, len, where) {
  }
  }
 
- // Decode a packed (ptr,len) i64 dispatch result into a JS string, the ONE correct way.
- // Two bugs this consolidates (they only surface once the wasm memory grows past a threshold --
- // e.g. a large .gm state file -> a big plugkit_alloc -> the memory grows past ~2GB / the linear
- // memory is re-grown mid-dispatch):
- // 1. SIGNED i64 result. dispatch_verb returns an i64; a high bit set (large ptr or a packed
- // len in the top 32 bits) makes `result` a NEGATIVE BigInt. `result >> 32n` on a negative
- // BigInt arithmetic-shifts in sign bits -> a garbage/negative len, and the low-word mask can
- // misread too. Normalize to unsigned 64-bit FIRST: BigInt.asUintN(64, result).
- // 2. DETACHED buffer. `instance.exports.memory.buffer` captured before plugkit_alloc/dispatch is
- // a STALE ArrayBuffer once the wasm linear memory grows (the old buffer detaches). Reading the
- // result against it throws 'Start offset N is outside the bounds of the buffer'. Always re-read
- // instance.exports.memory.buffer FRESH at the moment of the view, never reuse a captured one.
  function decodeWasmResult(instance, result, where) {
- const u = BigInt.asUintN(64, BigInt(result)); // (1) normalize the i64 to unsigned before splitting
+ const u = BigInt.asUintN(64, BigInt(result));
  const ptr = Number(u & 0xffffffffn);
  const len = Number(u >> 32n);
  if (ptr === 0 || len === 0) return '';
- const buffer = instance.exports.memory.buffer; // (2) FRESH buffer (post-grow), never a stale capture
+ const buffer = instance.exports.memory.buffer;
  guardWasmRange(buffer, ptr, len, where);
  const out = new TextDecoder().decode(new Uint8Array(buffer, ptr, len));
  try { instance.exports.plugkit_free(ptr, len); } catch (_) {}
  return out;
  }
 
- // Write input bytes into wasm memory, re-reading memory.buffer FRESH after the alloc so a memory
- // grow during plugkit_alloc never leaves us writing into a detached buffer (the write-side half of
- // the detached-buffer bug). Returns the ptr (caller frees) or throws on alloc failure.
  function writeWasmInput(instance, bytes, where) {
  if (bytes.length === 0) return 0;
  const ptr = instance.exports.plugkit_alloc(bytes.length);
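The comments removed from `decodeWasmResult` in this hunk explained why the i64 result is normalized with `BigInt.asUintN(64, ...)` before it is split into a pointer and a length. A minimal sketch of that sign issue follows. The `pack`, `splitSigned`, and `splitUnsigned` helpers and the numeric values are made up for illustration; only the low-word/high-word layout and the `asUintN` call come from the diff.

```javascript
// Hypothetical helper: build a signed i64 with ptr in the low 32 bits and len in
// the high 32 bits, matching the layout decodeWasmResult reads.
function pack(ptr, len) {
  return BigInt.asIntN(64, (BigInt(len) << 32n) | BigInt(ptr));
}

// Split WITHOUT normalizing: `>>` on a negative BigInt shifts in sign bits,
// so the high word comes out negative.
function splitSigned(result) {
  return { ptr: Number(result & 0xffffffffn), len: Number(result >> 32n) };
}

// Split AFTER normalizing to unsigned 64-bit, as the diffed code does.
function splitUnsigned(result) {
  const u = BigInt.asUintN(64, BigInt(result));
  return { ptr: Number(u & 0xffffffffn), len: Number(u >> 32n) };
}

// Made-up values: a high word with its top bit set makes the packed i64 negative.
const packed = pack(0x1000, 0x80000010);
const naive = splitSigned(packed);
const fixed = splitUnsigned(packed);

console.log(packed < 0n, naive.len, fixed.len);
```

With these values the unnormalized split yields a negative length, while the normalized split recovers the original pair.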
@@ -3275,7 +3255,6 @@ async function runSpoolWatcher(instance, spoolDir) {
  }
  }
 
- // writeWasmInput re-reads memory.buffer fresh after each alloc (detached-buffer write fix).
  const verbPtr = writeWasmInput(instance, verbBytes, `spool-dispatch:${verb}.verb`);
  const bodyPtr = writeWasmInput(instance, bodyBytes, `spool-dispatch:${verb}.body`);
 
@@ -3283,8 +3262,6 @@ async function runSpoolWatcher(instance, spoolDir) {
  const result = dispatch(verbPtr, verbBytes.length, bodyPtr, bodyBytes.length);
  clearVerbActive();
 
- // decodeWasmResult normalizes the i64 (BigInt.asUintN), re-reads the buffer FRESH (post-grow),
- // guards the range, AND frees the result ptr -- so the (ptr,len) free below is dropped.
  let resultStr = decodeWasmResult(instance, result, `spool-dispatch:${verb}`);
 
  if (autoRecallPayload) {
@@ -3338,7 +3315,6 @@ async function runSpoolWatcher(instance, spoolDir) {
 
  try { instance.exports.plugkit_free(verbPtr, verbBytes.length); } catch (_) {}
  try { instance.exports.plugkit_free(bodyPtr, bodyBytes.length); } catch (_) {}
- // (the result ptr is freed inside decodeWasmResult above)
 
  try { if (fs.existsSync(filePath)) fs.unlinkSync(filePath); } catch (_) {}
  unmarkProcessed(key);
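Several comments removed in these hunks referred to re-reading `memory.buffer` fresh after an allocation, because a buffer captured before the wasm linear memory grows is detached afterwards. A minimal sketch of that behavior, using a standalone `WebAssembly.Memory` rather than the plugkit instance (the page counts and offsets are arbitrary illustration values):

```javascript
// Standalone memory for illustration: 1 page (64 KiB), allowed to grow to 2.
const memory = new WebAssembly.Memory({ initial: 1, maximum: 2 });

const stale = memory.buffer;  // captured BEFORE the grow
memory.grow(1);               // growing detaches the previously returned ArrayBuffer
const fresh = memory.buffer;  // re-read AFTER the grow

const staleLength = stale.byteLength;  // a detached buffer reports length 0
const freshLength = fresh.byteLength;  // 2 pages

// Building a view over the stale buffer fails; the fresh buffer works.
let staleViewError = null;
try {
  new Uint8Array(stale, 16, 4);
} catch (e) {
  staleViewError = e;
}
const freshView = new Uint8Array(fresh, 16, 4);

console.log(staleLength, freshLength, staleViewError !== null, freshView.length);
```

This is why the diffed helpers read `instance.exports.memory.buffer` at the moment they build a view instead of reusing an earlier reference.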
package/gm.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "gm",
- "version": "2.0.1623",
+ "version": "2.0.1625",
  "description": "Spool-dispatch orchestration engine with unified state machine, skills, and automated git enforcement",
  "author": "AnEntrypoint",
  "license": "MIT",
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "gm-skill",
- "version": "2.0.1623",
+ "version": "2.0.1625",
  "description": "Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation. Install in any AI coding agent host.",
  "author": "AnEntrypoint",
  "license": "MIT",
@@ -85,3 +85,5 @@ The chain is not COMPLETE until changes are on origin. Commit and push at the en
  **Prune bad memory on sight -- a wrong recall hit is worse than a miss.** A stale/superseded/wrong `recall` or `auto_recall` hit gets `memorize-prune {key}` (deletes text + embedding). For an uncertain set, `memorize-prune {query}` returns review-only candidates; judge, then re-dispatch the stale `{keys:[...]}` -- never a blind similarity-delete.
 
  On turn entry plugkit attaches an `auto_recall` pack derived from the prompt; read its hits alongside `recall_hits` (the phase+PRD-subject pack). It fires once per turn entry on its own -- do not re-trigger it.
+
+ If the instructions amount to doing more than one step or imply it, use or create a workflow, or set a goal to track progress, and if subagents are available fan out subagents that use gm for everything, up to 8 in parallel