gm-skill 2.0.1576 → 2.0.1578

package/AGENTS.md CHANGED
@@ -118,11 +118,11 @@ Every skill's `allowed-tools:` is reduced to `Skill, Read, Write` (plus the SKIL
118
118
 
119
119
  **AGENTS.md / CLAUDE.md are inline-edited AND dual-written to the store**: edit them inline for structural rules (the only doc surviving context summarization), AND `memorize-fire` the same rule so `recall`/`auto_recall` surface it later -- complementary, not alternatives. Never `namespace:"AGENTS.md"`; load-bearing rules go to the default namespace. Mechanics in rs-learn (`recall: memorize-fire ingestion classifier`).
120
120
 
121
- **A memorized workaround is a tool defect; transform it, never accumulate it**: we work USING gm, not ON it, so a `recall` memo framed as a workaround, known-limitation, or internal-advice is tribal knowledge a fresh user/LLM lacks -- the tool then surprises them, and surprises are never allowed; everything must be abundantly predictable at face value. Resolve every such memo one of three ways, then prune it: (a) already covered by the standing prose (SKILL.md / instruction bundle) -> prune the redundant recall; (b) prose-worthy but absent -> add the rule to the prose, then prune; (c) genuine surprising behavior -> fix the code so it is predictable, then prune. `recall` carries project work-context (what the work surfaced about the user's problem), never tool-operation advice -- the tool's prose + behavior alone make every operation predictable. Witnessed transforms: CRLF/LF const-drift -> sync LF-normalization; codesearch cwd-scope -> the search-routing prose clause.
121
+ **A memorized workaround is a tool defect; transform it, never accumulate it**: we work USING gm, not ON it, so a `recall` memo framed as a workaround, known-limitation, or internal-advice is tribal knowledge a fresh user/LLM lacks -- the tool then surprises them, and surprises are never allowed; everything must be abundantly predictable at face value. Resolve: (a) already in standing prose -> prune recall; (b) prose-worthy but absent -> add to prose then prune; (c) genuinely surprising behavior -> fix code so it is predictable then prune.
122
122
 
123
123
  **Behavioral discipline lives in plugkit's `instruction` verb**: dispatch `instruction` for the live phase-specific prose (Three-Layer Admission Filter, maturity-first emit, closure anti-shapes, code invariants); do not duplicate it here. Enumeration in rs-learn (`recall: instruction-verb behavioral discipline invariants`).
124
124
 
125
- **The agent IS the LLM rs-learn calls**: rs-learn never reaches a separate judge model for a quality score, relevance, prune, route, or loss signal -- plugkit IS the harness and the agent IS the model, each an inline decision reported through the spool. Per-core internals in rs-learn (`recall: rs-learn self-report core internals`).
125
+ **The agent IS the LLM rs-learn calls**: no separate judge model; all decisions are inline via spool. Internals in rs-learn (`recall: rs-learn self-report core internals`).
126
126
 
127
127
  **host_exec_js is synchronous**: pass a real per-call `timeoutMs` (zero/missing is a hard error). Detail in rs-learn (`recall: host_exec_js synchronous`).
128
128
 
@@ -148,13 +148,13 @@ Push to any rs-* sibling triggers `cascade.yml` -> rs-plugkit `release.yml` -> s
148
148
 
149
149
  Orchestration state is tracked via `.gm/` marker files, not hook events; the CLI layer calls `checkDispatchGates()` before tool execution to gate Write/Edit/git. Marker set (`prd.yml, mutables.yml, needs-gm, gm-fired-<sessionId>, residual-check-fired`) + SpoolDispatcher mechanism in rs-learn (`recall: gate enforcement layer`, `recall: spool dispatch gates marker files`).
150
150
 
151
- **gm-skill tool-use sequencing**: `Skill(skill="gm-skill")` writes `.gm/gm-fired-<sessionId>` to clear the needs-gm gate (cleared at turn start to reset it). One shipped skill, no subagent variant.
151
+ **gm-skill tool-use sequencing**: `Skill(skill="gm-skill")` clears the needs-gm gate. One shipped skill, no subagent variant. Marker mechanics in rs-learn (`recall: gm-skill tool-use sequencing mechanics`).
152
152
 
153
153
  **The skill is the driver, not a post-hoc witness**: when a request carries the standing instruction to use gm-skill (every `/loop` fire, any prompt naming `/gm-skill`), the FIRST working action is `Skill(skill="gm-skill")`, and the skill prose drives the chain PLAN->COMPLETE. Dispatching spool verbs directly without first entering the skill executes the work outside the skill the user asked to drive it; entering only at the end to confirm terminal state does NOT satisfy the instruction. The boot probe (`cat .gm/exec-spool/.status.json` ...) is prescribed by the skill and may precede invocation; everything that mutates state happens inside the skill-driven session.
154
154
 
155
155
  **Dead-watcher recovery uses `bun x gm-plugkit@latest spool`, never direct-node boot** (mechanism in rs-learn: `recall: dead-watcher recovery bun x not direct-node`).
156
156
 
157
- **The first verb after a genuine multi-minute IDLE is `instruction`, to reset the long-gap clock**: the gate fires on genuine idle only (>300s since the last instruction AND >300s since any verb), so active back-to-back work verbs keep the chain alive without an interleaved `instruction` -- do not inject defensive instruction dispatches between active work. A true wait (version download, overnight, long external CI watch) trips it, and the first verb back is `instruction`. When the wait is self-inflicted and predictable (a blocking `TaskOutput`/`gh run watch`), dispatch `instruction` immediately BEFORE entering the wait, not only after. "Work verbs"/"any verb" here means SPOOL dispatches -- platform `Bash`/`Read`/`Edit`/`Grep` do NOT reset the clock, so a long investigation run purely in them (the audit `gmsniff`/`ccsniff` sweep + source reading/editing exceeding 300s) trips a false `mid-chain-stall` even while actively working; interleave a `prd-add` (convert each finding as it emerges per density-grows-along-the-walk) or an `instruction` to keep the clock warm. Mechanism in rs-learn (`recall: first verb after multi-minute wait instruction long-gap`).
157
+ **The first verb after a genuine multi-minute IDLE is `instruction`, to reset the long-gap clock**: gate fires when >300s since last instruction AND >300s since any SPOOL verb. Platform `Bash`/`Read`/`Edit`/`Grep` do NOT reset the clock -- a long investigation run in them trips a false stall; interleave `prd-add` or `instruction` to keep warm. For a predictable blocking wait (`TaskOutput`/`gh run watch`), dispatch `instruction` BEFORE entering the wait. Detail + platform-tool exception in rs-learn (`recall: first verb after multi-minute wait instruction long-gap`).
158
158
 
159
159
  **A stop-hook firing on a terminal chain does not authorize re-polling**: when a stop-hook fires while already at `phase=COMPLETE` AND `prd_pending_count=0`, re-dispatching `instruction`/`phase-status` to "re-confirm" is a deviation (`deviation.complete-chain-poll`, `instructions/mod.rs`). Two admissible responses: (a) a prose-only turn (COMPLETE is in hand), or (b) genuinely new planned work opened with a FRESH `{"prompt":...}` body (resets phase to PLAN, driven through the skill). Repeatedly answering the same hook is a loop; state the terminal facts once and stop, or open new work.
160
160
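Review note: the rewritten long-gap bullet reduces to a predicate over two clocks. The sketch below is only an illustration of that reading. The names `longGapGateFires`, `lastInstructionTs` and `lastSpoolVerbTs` are hypothetical and do not appear in the package; only the 300s threshold and the AND condition come from the bullet itself.

```javascript
// Hypothetical illustration of the long-gap gate; none of these names are from gm-skill.
const LONG_GAP_MS = 300_000; // ">300s" per the AGENTS.md bullet

// The gate fires only when BOTH clocks are older than the threshold.
function longGapGateFires(now, lastInstructionTs, lastSpoolVerbTs) {
  const sinceInstruction = now - lastInstructionTs;
  const sinceSpoolVerb = now - lastSpoolVerbTs;
  return sinceInstruction > LONG_GAP_MS && sinceSpoolVerb > LONG_GAP_MS;
}

const now = 10_000_000;

// A spool verb 10s ago keeps the chain alive even though `instruction` is old.
const activeWork = longGapGateFires(now, now - 600_000, now - 10_000);

// Ten minutes spent only in platform tools (Bash/Read/Edit/Grep) moves neither clock.
const platformOnly = longGapGateFires(now, now - 600_000, now - 600_000);

console.log({ activeWork, platformOnly });
```

Under this reading, interleaving a `prd-add` or an `instruction` during a long platform-tool investigation refreshes one of the clocks, which is why the bullet recommends it.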
 
@@ -1,4 +1,4 @@
1
- #!/usr/bin/env node
1
+ #!/usr/bin/env node
2
2
  'use strict';
3
3
 
4
4
  const fs = require('fs');
@@ -7,10 +7,6 @@ const os = require('os');
7
7
  const crypto = require('crypto');
8
8
  const { spawn, spawnSync } = require('child_process');
9
9
 
10
- // Resolve a bare command name to its actual .exe on Windows. cmd.exe + .cmd
11
- // shim chains re-enter conhost (visible window flash) even with
12
- // windowsHide:true on the parent. Spawning the real .exe directly lets
13
- // CREATE_NO_WINDOW propagate. See [[windows-spawn-cmd-shim-flash]].
14
10
  function resolveWindowsExe(cmd) {
15
11
  if (process.platform !== 'win32') return cmd;
16
12
  try {
@@ -683,11 +679,6 @@ function copyWasmToGmTools(wasmPath, version) {
683
679
  } catch (_) {}
684
680
  }
685
681
  if (!wasmFresh) {
686
- // copyFileSync truncates the target before streaming ~149MB, leaving a window where
687
- // a crash or a concurrent watcher load sees a truncated/absent wasm (the
688
- // "self-heal: wasm not installed" crash-loop during an upgrade). Copy to a
689
- // pid-suffixed temp and rename over the target: same-volume rename is atomic,
690
- // with the Windows EEXIST/EPERM unlink+retry.
691
682
  const tmp = `${target}.partial-${process.pid}`;
692
683
  fs.copyFileSync(wasmPath, tmp);
693
684
  try { fs.renameSync(tmp, target); }
@@ -1,7 +1,4 @@
1
1
  #!/usr/bin/env node
2
- // Legacy fallback. The canonical surface for lang/*.js plugins is the wasm
3
- // `lang` verb in rs-plugkit, dispatched via .gm/exec-spool/in/lang/<N>.txt.
4
- // This standalone runner is kept for direct CLI debug + pre-cascade situations.
5
2
  'use strict';
6
3
  const fs = require('fs');
7
4
  const path = require('path');
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm-plugkit",
3
- "version": "2.0.1576",
3
+ "version": "2.0.1578",
4
4
  "description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
5
5
  "main": "index.js",
6
6
  "bin": {
@@ -1,4 +1,4 @@
1
- import fs from 'fs';
1
+ import fs from 'fs';
2
2
  import path from 'path';
3
3
  import os from 'os';
4
4
  import crypto from 'crypto';
@@ -13,16 +13,8 @@ const _httpModule = http;
13
13
  const _httpsModule = https;
14
14
  import { fileURLToPath } from 'url';
15
15
 
16
- // Set by the spool watcher's writeStatus closure once it is live. Lets long synchronous verbs
17
- // (browser/chromium spawn, long exec) stamp a busy_until window into .status.json before the
18
- // blocking call, so a liveness probe reads "busy" not "dead" while the event loop is blocked.
19
16
  let _writeStatusBusy = () => {};
20
- // Latest busy_until epoch ms stamped by a long synchronous verb (codesearch rebuild, chromium
21
- // spawn). scanStalledTurns reads it so a busy watcher is not mis-flagged as an idle stall.
22
17
  let _lastBusyUntil = 0;
23
- // First 12 hex of sha256 of this watcher's own gmTools wrapper. Module-scoped so writeStatus
24
- // (a different function scope) can stamp status.wrapper_sha, which the supervisor compares
25
- // against the on-disk wrapper to recycle a watcher running a stale wrapper-only fix.
26
18
  let _ownWrapperSha12 = '';
27
19
 
28
20
  function spawnSync(cmd, args, opts) {
@@ -346,18 +338,12 @@ function turnTick(sess, verb, taskBase, phase, prdPending) {
346
338
  const key = sess || '(no-session)';
347
339
  const now = Date.now();
348
340
  let t = _turns.get(key);
349
- // Any verb arriving after an idle gap closes the stale turn -- not just instruction.
350
- // Otherwise a non-instruction verb (prd-add, mutable-resolve, transition) landing
351
- // after an overnight sleep stamps t.lastTs forward without splitting, and dur_ms
352
- // (lastTs - startTs) balloons to wall-clock-with-sleep instead of active work time.
353
341
  if (t && (now - t.lastTs) > TURN_IDLE_MS) {
354
342
  endTurn(sess, t, true);
355
343
  _turns.delete(key);
356
344
  t = null;
357
345
  }
358
346
  if (!t) {
359
- // Only an instruction dispatch opens a new turn; a stray non-instruction verb after
360
- // idle is recorded against no turn (the next instruction starts the real turn).
361
347
  if (verb !== 'instruction') return;
362
348
  const idx = ((_turns.get(key + ':lastIdx') || 0) + 1);
363
349
  _turns.set(key + ':lastIdx', idx);
@@ -367,27 +353,15 @@ function turnTick(sess, verb, taskBase, phase, prdPending) {
367
353
  }
368
354
  t.lastTs = now;
369
355
  t.dispatches++;
370
- // A verb arriving resumes the turn -- clear any prior stall flag so a later re-stall
371
- // is a fresh episode, not silently suppressed by the one-shot guard.
372
356
  t.stallEmitted = false;
373
357
  t.verbs.set(verb, (t.verbs.get(verb) || 0) + 1);
374
358
  if (phase) { t.phases.add(phase); t.lastPhase = phase; }
375
359
  if (typeof prdPending === 'number') t.prdPending = prdPending;
376
360
  }
377
361
 
378
- // turn.end fires only when a NEW verb arrives after idle, so a turn that simply never
379
- // receives another verb stays open forever and emits no signal -- a permanent stall is
380
- // silence, not an event, which is how a mid-EXECUTE stop stays invisible for days. The
381
- // heartbeat scan closes that hole: for each open turn idle past STALL_MS whose last phase
382
- // is non-terminal (or carries open PRD rows), emit turn.stalled once. One-shot per episode
383
- // (stallEmitted), reset when a verb resumes the turn. A COMPLETE turn with no open rows
384
- // idling is the authorized prose-only state and never stalls.
385
362
  const STALL_MS = 300_000;
386
363
  function scanStalledTurns() {
387
364
  const now = Date.now();
388
- // A long synchronous verb (codesearch index rebuild, chromium spawn) stamps busy_until and
389
- // blocks the spool -- the agent is legitimately waiting, not stalled. Honor it exactly as
390
- // supervisor.js checkWatcherHealth does, so a busy watcher never emits a false mid-chain-stall.
391
365
  if (_lastBusyUntil && _lastBusyUntil > now) return;
392
366
  for (const [key, t] of _turns) {
393
367
  if (!t || typeof t !== 'object' || !Number.isFinite(t.startTs)) continue;
@@ -396,9 +370,6 @@ function scanStalledTurns() {
396
370
  const terminal = t.lastPhase === 'COMPLETE' && (t.prdPending === 0 || t.prdPending == null);
397
371
  if (terminal) continue;
398
372
  t.stallEmitted = true;
399
- // key is the _turns map key (sess || '(no-session)'). When it is the sentinel, the turn was
400
- // unattributed, so do not override logEvent's own cwd+sess base fields with '(no-session)' --
401
- // let the cwd-based attribution stand. Pass an explicit sess only when key is a real session.
402
373
  const fields = {
403
374
  turn_idx: t.idx,
404
375
  ended_in_phase: t.lastPhase || null,
@@ -411,10 +382,6 @@ function scanStalledTurns() {
411
382
  }
412
383
  }
413
384
 
414
- // Every spool dispatch is the agent actively driving the chain, including wasm-direct verbs
415
- // (recall/codesearch/exec_js/git/fetch) that never reach turnTick. Refresh the open turn's stall
416
- // clock so a Bash-free stretch of pure wasm-direct verbs does not trip a false mid-chain-stall
417
- // (the recurring audit-fire own-defect). Never create or split a turn -- that stays turnTick's job.
418
385
  function touchActiveTurn(sess) {
419
386
  const t = _turns.get(sess || '(no-session)');
420
387
  if (!t) return;
@@ -888,8 +855,6 @@ function runBrowserRunner(pw, args, timeoutMs, cwd, claudeSessionId) {
888
855
  const sockDir = playwriterHomeFor(cwd, claudeSessionId);
889
856
  try { fs.mkdirSync(sockDir, { recursive: true }); } catch (_) {}
890
857
  env.PLAYWRITER_HOME = sockDir;
891
- // Stamp a busy window before the synchronous spawn so the blocked event loop's stale heartbeat
892
- // is not misread as a dead watcher. Pad past the spawn timeout for teardown.
893
858
  _writeStatusBusy((timeoutMs || 30000) + 5000);
894
859
  return spawnSync(spawnCmd, spawnArgs, {
895
860
  encoding: 'utf-8',
@@ -911,9 +876,6 @@ function scrubBrowserRunnerText(s) {
911
876
  return t;
912
877
  }
913
878
 
914
- // Standard OS install locations for a Chrome/Chromium that speaks CDP. Used as a
915
- // fallback when the managed ms-playwright cache is absent (e.g. cache evicted),
916
- // so the browser verb keeps working off the system browser instead of failing.
917
879
  function findSystemChromiumBinary() {
918
880
  const candidates = process.platform === 'win32'
919
881
  ? [
@@ -1721,8 +1683,6 @@ function makeHostFunctions(instanceRef) {
1721
1683
  const key = readWasmStr(instanceRef.value, keyPtr, keyLen);
1722
1684
  if (!ns || !key) return 0;
1723
1685
  let removed = 0;
1724
- // Delete the key from the namespace AND its -vec sibling across every enabled discipline dir,
1725
- // so a pruned memory leaves no orphan embedding that host_vec_search would still surface.
1726
1686
  for (const baseNs of [ns, `${ns}-vec`]) {
1727
1687
  for (const dir of kvNamespaceDirs(baseNs)) {
1728
1688
  const fp = path.join(dir, safeName(key) + '.json');
@@ -2139,8 +2099,8 @@ async function runSpoolWatcher(instance, spoolDir) {
2139
2099
  }
2140
2100
  function lockBody() { return `${process.pid}|${Date.now()}|${_ownWrapperSha12}`; }
2141
2101
  function acquireLock() {
2142
- try {
2143
- if (fs.existsSync(LOCK_PATH)) {
2102
+ function checkExistingHolder() {
2103
+ try {
2144
2104
  const content = fs.readFileSync(LOCK_PATH, 'utf-8').trim();
2145
2105
  const parts = content.split('|');
2146
2106
  const pidStr = parts[0];
@@ -2164,6 +2124,7 @@ async function runSpoolWatcher(instance, spoolDir) {
2164
2124
  }));
2165
2125
  } catch (_) {}
2166
2126
  try { process.kill(parseInt(pidStr, 10), 'SIGTERM'); } catch (_) {}
2127
+ return 'takeover';
2167
2128
  } else {
2168
2129
  const msg = JSON.stringify({ ok: false, reason: 'another-watcher-active', pid: pidStr, age_ms: age });
2169
2130
  console.error(`[plugkit-wasm] ${msg}; refusing to start`);
@@ -2181,11 +2142,32 @@ async function runSpoolWatcher(instance, spoolDir) {
2181
2142
  } else if (!holderAlive) {
2182
2143
  console.error(`[plugkit-wasm] stale lock (holder pid=${pidStr} dead, age=${age}ms); taking over`);
2183
2144
  try { logEvent('plugkit', 'watcher.lock-pid-dead-takeover', { stale_pid: pidStr, lock_age_ms: age }); } catch (_) {}
2145
+ return 'takeover';
2184
2146
  } else {
2185
2147
  console.error(`[plugkit-wasm] stale lock (age=${age}ms); taking over`);
2148
+ return 'takeover';
2186
2149
  }
2150
+ } catch (_) {
2151
+ return 'takeover';
2152
+ }
2153
+ }
2154
+ try {
2155
+ let fd;
2156
+ try {
2157
+ fd = fs.openSync(LOCK_PATH, 'wx');
2158
+ } catch (e) {
2159
+ if (e.code !== 'EEXIST') throw e;
2160
+ const action = checkExistingHolder();
2161
+ if (action !== 'takeover') return;
2162
+ try { fs.unlinkSync(LOCK_PATH); } catch (_) {}
2163
+ fd = fs.openSync(LOCK_PATH, 'wx');
2164
+ }
2165
+ try {
2166
+ const body = Buffer.from(lockBody(), 'utf-8');
2167
+ fs.writeSync(fd, body);
2168
+ } finally {
2169
+ fs.closeSync(fd);
2187
2170
  }
2188
- fs.writeFileSync(LOCK_PATH, lockBody());
2189
2171
  } catch (e) {
2190
2172
  console.error(`[plugkit-wasm] lock acquire failed: ${e.message}`);
2191
2173
  process.exit(1);
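Review note: this hunk swaps an `existsSync`-then-`writeFileSync` sequence for an exclusive create (`'wx'`), so two watchers starting at the same instant cannot both believe they hold the lock. A minimal sketch of that pattern follows. `tryAcquireLock` and the `holderIsStale` callback are hypothetical stand-ins for `acquireLock` and `checkExistingHolder`, and the staleness policy is deliberately left out.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper, not the package's acquireLock. Returns true when this
// process now owns lockPath, false when a live holder keeps it.
function tryAcquireLock(lockPath, body, holderIsStale) {
  let fd;
  try {
    fd = fs.openSync(lockPath, 'wx'); // exclusive create: EEXIST if the file is present
  } catch (e) {
    if (e.code !== 'EEXIST') throw e;
    if (!holderIsStale(fs.readFileSync(lockPath, 'utf-8'))) return false;
    try { fs.unlinkSync(lockPath); } catch (_) {}
    // A racing peer may recreate the file first; EEXIST then propagates to the caller.
    fd = fs.openSync(lockPath, 'wx');
  }
  try {
    fs.writeSync(fd, body);
  } finally {
    fs.closeSync(fd);
  }
  return true;
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'lock-demo-'));
const lockPath = path.join(dir, '.lock');

const first = tryAcquireLock(lockPath, `${process.pid}|${Date.now()}`, () => false);
const second = tryAcquireLock(lockPath, 'late-arrival', () => false); // holder judged live
const takeover = tryAcquireLock(lockPath, 'new-holder', () => true);  // holder judged stale

console.log({ first, second, takeover });
```

The second `openSync` after the unlink is still exclusive, so a takeover race resolves to one winner, and the loser sees `EEXIST` rather than silently overwriting the lock body.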
@@ -2227,13 +2209,6 @@ async function runSpoolWatcher(instance, spoolDir) {
2227
2209
  detected_at: Date.now(),
2228
2210
  });
2229
2211
  try { console.error(`[plugkit-wasm] VERB ABORT detected: prior watcher pid=${priorVerb.pid} died inside verb=${priorVerb.verb} task=${priorVerb.task}`); } catch (_) {}
2230
- // The aborted dispatch otherwise gets NO response file: the in-file was consumed,
2231
- // the prior watcher died before writing out/, and the agent waits forever on a
2232
- // Read that never lands (then must git-archaeology whether the verb's side effects
2233
- // happened). Write a definite failure response so the agent's Read returns
2234
- // immediately and it re-dispatches. Both out-name shapes are written because
2235
- // .verb-active.json does not record whether the dispatch was root or nested;
2236
- // the agent reads whichever its dispatch shape expects, the other is swept.
2237
2212
  if (priorVerb.verb && priorVerb.task) {
2238
2213
  try {
2239
2214
  const abortBody = JSON.stringify({
@@ -2513,10 +2488,6 @@ async function runSpoolWatcher(instance, spoolDir) {
2513
2488
  child.unref();
2514
2489
  try { logEvent('plugkit', 'gm-plugkit.self-stale-respawn', { running_version: own, latest_version: latest, cache_busted: bustCache, attempt: (respawnGuard.attempts || 0) + 1 }); } catch (_) {}
2515
2490
  try { fs.writeFileSync(path.join(spoolDir, '.shutdown-reason.json'), JSON.stringify({ reason: 'gm-plugkit-self-stale', ts: Date.now(), pid: process.pid, running_version: own, latest_version: latest })); } catch (_) {}
2516
- // Wait for the replacement's fresh heartbeat before exiting (mirror the
2517
- // version-drift path) instead of a blind 2s exit: the gm-plugkit download can
2518
- // take many seconds, and exiting early lets the supervisor relaunch the SAME
2519
- // stale version before the new one lands, so the update never sticks.
2520
2491
  const myPid = process.pid;
2521
2492
  const respawnDeadline = Date.now() + 90000;
2522
2493
  const exitSelfStale = () => { try { process.exit(0); } catch (_) {} };
@@ -2575,11 +2546,6 @@ async function runSpoolWatcher(instance, spoolDir) {
2575
2546
  setTimeout(probeGmPlugkitSelfStale, 5000);
2576
2547
  setInterval(probeGmPlugkitSelfStale, 300_000);
2577
2548
 
2578
- // A supervised watcher self-exits on drift assuming the supervisor respawns it. If the
2579
- // supervisor has died, that bare exit leaves the spool dead (worse than staying up). Treat a
2580
- // dead/absent supervisor as unsupervised so the drift loops take the self-respawn-and-wait path
2581
- // (spawn replacement, wait for its heartbeat, then exit) instead. False-negative is self-correcting:
2582
- // if both the supervisor and this watcher respawn, the single-instance lock admits exactly one.
2583
2549
  function _supervisorIsDead() {
2584
2550
  try {
2585
2551
  const sp = parseInt(fs.readFileSync(path.join(spoolDir, '.supervisor.pid'), 'utf8').trim(), 10);
@@ -3051,11 +3017,6 @@ async function runSpoolWatcher(instance, spoolDir) {
3051
3017
  const relPath = path.relative(inDir, filePath);
3052
3018
  const dir = path.dirname(relPath);
3053
3019
  const verb = dir === '.' ? path.basename(filePath, path.extname(filePath)) : dir;
3054
- // Defense-in-depth beyond walkDir's dot-dir skip: a real verb is a single clean
3055
- // segment (e.g. instruction, prd-resolve). A derived verb containing a path
3056
- // separator or a dot-segment means the file lives under a stray nested spool
3057
- // (in/prd-resolve/.gm/exec-spool/...); dispatching it builds a bogus verb+outName
3058
- // and ENOENT-storms every tick. Skip + unmark so it never re-enters the loop.
3059
3020
  if (/[\\/]/.test(verb) || verb.split(/[\\/]/).some(seg => seg.startsWith('.'))) {
3060
3021
  try { logEvent('plugkit', 'spool.skip-nested-verb', { rel: relPath, derived_verb: verb }); } catch (_) {}
3061
3022
  unmarkProcessed(key);
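Review note: the comment removed here explained that a real verb is a single clean path segment, and that anything containing a separator or a dot-segment comes from a stray nested spool. The sketch below reuses the two expressions visible in the hunk. The wrapper functions `deriveVerb` and `isDispatchableVerb` are hypothetical names added for illustration.

```javascript
const path = require('path');

// Same derivation as the hunk: a root-level file names the verb itself,
// otherwise the containing directory (relative to in/) is the verb.
function deriveVerb(relPath) {
  const dir = path.dirname(relPath);
  return dir === '.' ? path.basename(relPath, path.extname(relPath)) : dir;
}

// Same guard as the hunk, inverted: true only for a single clean segment.
function isDispatchableVerb(verb) {
  return !(/[\\/]/.test(verb) || verb.split(/[\\/]/).some(seg => seg.startsWith('.')));
}

const samples = [
  'instruction.txt',
  path.join('prd-resolve', '3.txt'),
  path.join('prd-resolve', '.gm', 'exec-spool', 'in', 'instruction', '1.txt'),
  path.join('.hidden', '1.txt'),
];

for (const rel of samples) {
  const verb = deriveVerb(rel);
  console.log(rel, '->', verb, isDispatchableVerb(verb) ? 'dispatch' : 'skip');
}
```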
@@ -3075,15 +3036,6 @@ async function runSpoolWatcher(instance, spoolDir) {
3075
3036
  console.log(`[dispatch] -> verb=${verb} task=${taskBase} body=${bodyBytes.length}b`);
3076
3037
  logEvent('plugkit', 'dispatch.start', { verb, task: taskBase, body_bytes: bodyBytes.length, cwd: process.cwd() });
3077
3038
 
3078
- // Network-bound git verbs block the event loop for the duration of a push/fetch (~30s),
3079
- // so the 5s heartbeat cannot fire and the supervisor would reap the watcher as hung
3080
- // (the VERB ABORT). Stamp a busy_until window before the synchronous dispatch so the
3081
- // supervisor's heartbeat-stale check honors it, exactly as the browser runner does.
3082
- // codesearch is the longest synchronous verb: a cold first call loads the 133MB bge-small
3083
- // bert model AND re-indexes the tree. A cold build was witnessed at ~252s (dispatch log
3084
- // codesearch ms=251772), so a 180s window let the supervisor reap the watcher mid-index and
3085
- // respawn it, cold-loading again = respawn-thrash that never completes (the codeinsight-stale
3086
- // symptom). codesearch gets a 360s window; the bounded git verbs keep 180s.
3087
3039
  if (verb === 'codesearch') {
3088
3040
  try { _writeStatusBusy(360000); } catch (_) {}
3089
3041
  } else if (verb === 'git_finalize' || verb === 'git_push' || verb === 'git_fetch') {
@@ -3188,11 +3140,6 @@ async function runSpoolWatcher(instance, spoolDir) {
3188
3140
  try {
3189
3141
  for (const entry of fs.readdirSync(dir)) {
3190
3142
  if (/\.tmp\.\d+(\.|$)/.test(entry)) continue;
3191
- // The verb tree is in/<verb>/[<sub>/]<N>.<ext> -- at most two levels deep. A
3192
- // dot-prefixed dir (e.g. a stray nested .gm/exec-spool/ created by a misfire)
3193
- // is never a verb dir; recursing into it derives a bogus verb like
3194
- // `prd-resolve\.gm\exec-spool` and dispatch-errors on every tick forever.
3195
- // Skip dot-dirs and cap depth so a spool-inside-spool cannot wedge the watcher.
3196
3143
  if (entry.startsWith('.')) continue;
3197
3144
  const fullPath = path.join(dir, entry);
3198
3145
  let stat;
@@ -3229,12 +3176,6 @@ async function runSpoolWatcher(instance, spoolDir) {
3229
3176
  wrapper_sha: _ownWrapperSha12 || null,
3230
3177
  idle_limit_ms: IDLE_LIMIT_MS,
3231
3178
  };
3232
- // A synchronous verb (chromium spawn, long exec) blocks the event loop, so the 5s
3233
- // heartbeat interval cannot fire for the duration. Without a hint, a liveness probe that
3234
- // checks ts-within-15s reads the busy watcher as dead and may kill/respawn it mid-verb.
3235
- // busy_until tells probes "alive but synchronously busy until this epoch ms" -- read it
3236
- // alongside ts: a stale ts whose busy_until is still in the future is a busy watcher, not
3237
- // a dead one. The pre-verb writeStatus(busyMs) stamps it before the blocking call.
3238
3179
  if (busyMs && busyMs > 0) { rec.busy_until = now + busyMs; _lastBusyUntil = rec.busy_until; }
3239
3180
  fs.writeFileSync(STATUS_PATH, JSON.stringify(rec));
3240
3181
  } catch (_) {}
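Review note: the comment dropped from this hunk described how a liveness probe should read `ts` together with `busy_until`. A minimal sketch of such a probe, assuming the 15s staleness window mentioned in that comment; `classifyWatcher` and the three return labels are hypothetical and not part of the package.

```javascript
// Hypothetical probe; only the `ts` and `busy_until` field names and the
// 15s window come from the removed comment.
const HEARTBEAT_STALE_MS = 15_000;

function classifyWatcher(status, now) {
  if (!status || !Number.isFinite(status.ts)) return 'dead';
  if (now - status.ts <= HEARTBEAT_STALE_MS) return 'alive';
  // Stale heartbeat, but a synchronous verb announced it would block until busy_until.
  if (status.busy_until && status.busy_until > now) return 'busy';
  return 'dead';
}

const now = 1_700_000_000_000;

const fresh = classifyWatcher({ ts: now - 4_000 }, now);
const blocked = classifyWatcher({ ts: now - 60_000, busy_until: now + 120_000 }, now);
const expired = classifyWatcher({ ts: now - 60_000, busy_until: now - 1 }, now);

console.log({ fresh, blocked, expired });
```

The point of the third case is that `busy_until` is a bounded excuse: once the announced window has passed, a stale heartbeat is treated as dead again.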
@@ -3435,9 +3376,6 @@ async function runSpoolWatcher(instance, spoolDir) {
3435
3376
  logEvent('plugkit', 'update.available', { installed, latest });
3436
3377
  _lastKnownDrift = latest;
3437
3378
  }
3438
- // NOTE: no version-file bump here either -- see the network-path comment above. Bumping the version
3439
- // file ahead of a verified binary download poisons installedVersionAtTools() and causes an infinite
3440
- // drift-respawn thrash. Auto-update is notify-only until a sha-verified force-download path exists.
3441
3379
  }
3442
3380
  function checkUpdateViaNpm(installed) {
3443
3381
  const req = https.get({
@@ -3615,9 +3553,6 @@ async function runSpoolWatcher(instance, spoolDir) {
3615
3553
  watch(inDir, { recursive: true }, (eventType, filename) => {
3616
3554
  if (!filename) return;
3617
3555
  if (/\.tmp\.\d+(\.|$)/.test(filename)) return;
3618
- // Skip any path with a dot-prefixed segment (e.g. a stray nested
3619
- // prd-resolve/.gm/exec-spool/...): it is not a real verb dispatch and walking it
3620
- // derives a bogus verb that dispatch-errors on every tick. Matches walkDir's guard.
3621
3556
  if (filename.split(/[\\/]/).some(seg => seg.startsWith('.'))) return;
3622
3557
  const fullPath = path.join(inDir, filename);
3623
3558
  markActivity('watch');
@@ -3681,11 +3616,6 @@ async function selfHealFromGithubReleases() {
3681
3616
  }
3682
3617
  const toolsDir = GM_TOOLS_ROOT;
3683
3618
  fs.mkdirSync(toolsDir, { recursive: true });
3684
- // Replace the live wasm atomically. A direct writeFileSync truncates the target
3685
- // before streaming ~149MB, so a crash mid-write or a concurrent watcher load in
3686
- // that window sees a truncated or absent wasm ("self-heal: wasm not installed"
3687
- // crash-loop). Write to a pid-suffixed temp and rename over the target; rename
3688
- // on the same volume is atomic, with the Windows EEXIST/EPERM unlink+retry.
3689
3619
  const wasmTarget = path.join(toolsDir, 'plugkit.wasm');
3690
3620
  const wasmTmp = `${wasmTarget}.partial-${process.pid}`;
3691
3621
  fs.writeFileSync(wasmTmp, wasm);
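Review note: the removed comment documented why the wasm is written to a pid-suffixed temp file and renamed over the target, with an unlink-and-retry on Windows `EEXIST`/`EPERM`. The sketch below illustrates that pattern in isolation. `replaceFileAtomically` is a hypothetical helper, not the package's function, and the retry branch is an assumption about how the fallback described in the comment might look.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: readers see either the old file or the complete new one,
// never a truncated target, because the bytes land in a temp file first.
function replaceFileAtomically(target, data) {
  const tmp = `${target}.partial-${process.pid}`;
  fs.writeFileSync(tmp, data);
  try {
    fs.renameSync(tmp, target); // same-volume rename replaces the target in one step
  } catch (e) {
    if (e.code !== 'EEXIST' && e.code !== 'EPERM') {
      try { fs.unlinkSync(tmp); } catch (_) {}
      throw e;
    }
    // Assumed Windows fallback: remove the old target, then retry the rename once.
    try { fs.unlinkSync(target); } catch (_) {}
    fs.renameSync(tmp, target);
  }
}

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'atomic-demo-'));
const target = path.join(dir, 'plugkit.wasm');

replaceFileAtomically(target, Buffer.from('old build'));
replaceFileAtomically(target, Buffer.from('new build'));

const finalContent = fs.readFileSync(target, 'utf-8');
const leftovers = fs.readdirSync(dir);
console.log({ finalContent, leftovers });
```

Note that the fallback branch is not atomic: between the unlink and the retry the target is briefly absent, which is the trade-off the original comment accepted for Windows.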
@@ -3738,10 +3668,6 @@ async function tryInstantiate(wasmPath) {
3738
3668
  return { instance, instanceRef };
3739
3669
  }
3740
3670
 
3741
- // In-process API. Lets a host (e.g. freddie) drive memorize/recall/auto-recall against
3742
- // .gm/rs-learn.db WITHOUT running the spool daemon loop: the wasm instance is created once
3743
- // and cached, and dispatch() returns parsed JSON. The wasm host functions resolve the project
3744
- // .gm dir from CLAUDE_PROJECT_DIR/cwd, so set those in the host process before first dispatch.
3745
3671
  let _sharedPlugkit = null;
3746
3672
  export async function createPlugkit(opts = {}) {
3747
3673
  if (_sharedPlugkit && !opts.fresh) return _sharedPlugkit;
@@ -50,7 +50,7 @@ function logEvent(event, fields) {
50
50
  ...fields,
51
51
  }) + '\n';
52
52
  fs.appendFileSync(path.join(dir, 'plugkit.jsonl'), line);
53
- } catch (_) {}
53
+ } catch (e) { try { console.error('[supervisor] logEvent write failed:', e); } catch (_) {} }
54
54
  }
55
55
 
56
56
  function writeSupervisorStatus(state, extra) {
@@ -69,14 +69,7 @@ function pidAlive(pid) {
69
69
  try { process.kill(pid, 0); return true; } catch (_) { return false; }
70
70
  }
71
71
 
72
- // Single-instance guard. findSupervisorPid (skill-bootstrap.js) reads .supervisor.pid to early-return
73
- // when a supervisor is already running; without it every bootstrap spawns a duplicate supervisor,
74
- // and duplicates spawn duplicate watchers that lock-fight in an endless spawn-reject churn. Write the
75
- // pid file on startup and refuse to start if a live peer already holds it.
76
72
  function acquireSingleInstance() {
77
- // Atomic via O_EXCL ('wx'): exclusive-create fails if the file exists, so when N supervisors
78
- // race to start in the same instant exactly one wins. A plain existsSync->write is TOCTOU and
79
- // lets a concurrent burst all pass, which is the duplicate-supervisor churn this guards against.
80
73
  for (let attempt = 0; attempt < 2; attempt++) {
81
74
  try {
82
75
  const fd = fs.openSync(SUPERVISOR_PID_PATH, 'wx');
@@ -87,13 +80,6 @@ function acquireSingleInstance() {
87
80
  let other = NaN;
88
81
  try { other = parseInt(fs.readFileSync(SUPERVISOR_PID_PATH, 'utf-8').trim(), 10); } catch (_) {}
89
82
  if (Number.isFinite(other) && other !== process.pid && pidAlive(other)) {
90
- // An alive holder pid is not the same as a working holder: a wedged supervisor
91
- // (event loop stuck, watcher dead, neither .supervisor.json nor .status.json
92
- // advancing) blocks every newcomer forever under a pidAlive-only check, forcing
93
- // manual process kills to recover the spool. Discriminate by progress, not
94
- // liveness: holder is wedged only when its own status heartbeat AND the spool
95
- // status are both stale past the takeover window, honoring a future busy_until
96
- // exactly as checkWatcherHealth does.
97
83
  const TAKEOVER_STALE_MS = 45_000;
98
84
  const now = Date.now();
99
85
  let supTs = 0;
@@ -119,7 +105,6 @@ function acquireSingleInstance() {
119
105
  try { spawnSync('taskkill', ['/F', '/T', '/PID', String(other)], { stdio: 'ignore', windowsHide: true, timeout: 3000 }); } catch (_) {}
120
106
  }
121
107
  }
122
- // Holder is dead/stale/wedged: remove and retry the exclusive create once.
123
108
  try { fs.unlinkSync(SUPERVISOR_PID_PATH); } catch (_) {}
124
109
  continue;
125
110
  }
@@ -298,10 +283,6 @@ function checkWatcherHealth() {
298
283
  return;
299
284
  }
300
285
  const now = Date.now();
301
- // A long synchronous verb (git_finalize's ~30s network push, a chromium spawn)
302
- // blocks the heartbeat write. The verb advertises busy_until in .status.json; while
303
- // that is in the future the watcher is busy, not hung -- reaping it kills the verb
304
- // mid-flight (the VERB ABORT). Honor busy_until exactly as the agent boot probe does.
305
286
  if (status.busy_until && status.busy_until > now) {
306
287
  return;
307
288
  }
@@ -320,10 +301,6 @@ function checkWatcherHealth() {
320
301
  }
321
302
  return;
322
303
  }
323
- // A published wrapper-only fix (no wasm version bump) lands in ~/.gm-tools via the next
324
- // bootstrap's ensureWrapperFresh, but a healthy running watcher keeps the old wrapper until it
325
- // restarts. On wrapper_sha drift (watcher's reported sha != on-disk), recycle so the fix goes
326
- // live without a manual kill. busy_until already returned above, so the watcher is not mid-verb.
327
304
  const reported = status.wrapper_sha || null;
328
305
  const onDisk = wrapperSha12OnDisk();
329
306
  if (reported && onDisk && reported !== onDisk) {
@@ -339,13 +316,6 @@ function checkWatcherHealth() {
339
316
  }
340
317
  return;
341
318
  }
342
- // The watcher reads the wasm's embedded instance_version at load and compares it to the
343
- // plugkit.version text file (file_version), exposing version_drifted when they disagree.
344
- // This catches the case where the version text was bumped (e.g. ensureReady's remote-latest
345
- // override) but the cached plugkit.wasm bytes are a different build -- the text claims 635
346
- // while the binary embeds 634, so ensureReady's text-only drift check never re-downloads.
347
- // On that drift, evict the stale cached wasm so the next bootstrap fails isReady() and
348
- // redownloads the correct build, then recycle the child to load it.
349
319
  if (status.version_drifted === true) {
350
320
  if (now - lastVersionDriftActionAt < VERSION_DRIFT_COOLDOWN_MS) {
351
321
  return;
package/gm.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm",
3
- "version": "2.0.1576",
3
+ "version": "2.0.1578",
4
4
  "description": "Spool-dispatch orchestration engine with unified state machine, skills, and automated git enforcement",
5
5
  "author": "AnEntrypoint",
6
6
  "license": "MIT",
package/lang/ssh.js CHANGED
@@ -146,7 +146,6 @@ module.exports = {
146
146
  if (!cmd) return '[no command provided]';
147
147
  const target = loadTarget(targetName);
148
148
 
149
- // Detect background-only commands (fire-and-forget: ends with & or uses nohup/systemd-run)
150
149
  const isBackground = /(&\s*$|^\s*(nohup|systemd-run|setsid)\s)/m.test(cmd);
151
150
 
152
151
  if (isBackground) {
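Review note: with the explanatory comment gone, the intent of the `isBackground` pattern is easier to see against a few inputs. The regular expression below is copied from the hunk. The sample commands are made up for illustration.

```javascript
// Pattern copied from package/lang/ssh.js; the sample commands are hypothetical.
const BACKGROUND_RE = /(&\s*$|^\s*(nohup|systemd-run|setsid)\s)/m;

const samples = [
  'sleep 60 &',                 // trailing & : fire-and-forget
  'nohup ./server.sh',          // nohup prefix
  '  setsid worker --daemon',   // leading whitespace is tolerated
  'make build && make test',    // && is not a trailing &
  'ls -la',
];

const results = samples.map(cmd => [cmd, BACKGROUND_RE.test(cmd)]);
for (const [cmd, bg] of results) console.log(JSON.stringify(cmd), bg);
```

Because of the `m` flag, `$` and `^` match at every line boundary, so a multi-line command is classified as background if any single line ends in `&` or starts with one of the three launchers.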
package/lib/spool.js CHANGED
@@ -49,7 +49,7 @@ function writeSpool(body, lang = 'nodejs', options = {}) {
49
49
  fs.mkdirSync(inDir, { recursive: true });
50
50
 
51
51
  const sessionId = options.sessionId || process.env.CLAUDE_SESSION_ID;
52
- const code = sessionId ? `const SESSION_ID = '${sessionId}';\n${body}` : body;
52
+ const code = sessionId ? `const SESSION_ID = ${JSON.stringify(sessionId)};\n${body}` : body;
53
53
 
54
54
  fs.writeFileSync(inFile, code, 'utf8');
55
55
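Review note: the one-line change in `writeSpool` matters because the session id is spliced into generated source. Wrapping it in single quotes by hand breaks (or injects code) as soon as the value contains a quote, while `JSON.stringify` always emits a valid string literal. The sketch below contrasts the two. `buildLegacy`, `buildFixed`, `run` and the hostile value are hypothetical, and nothing in the diff says such a session id was ever observed.

```javascript
// Hypothetical reproductions of the old and new template lines.
function buildLegacy(sessionId, body) {
  return `const SESSION_ID = '${sessionId}';\n${body}`;
}
function buildFixed(sessionId, body) {
  return `const SESSION_ID = ${JSON.stringify(sessionId)};\n${body}`;
}

// Evaluate the generated prelude and report what SESSION_ID ended up as.
const run = (code) => new Function(`${code}\nreturn SESSION_ID;`)();

const hostile = "x'; globalThis.__injected = true; //";

const legacyResult = run(buildLegacy(hostile, ''));
const injectedByLegacy = globalThis.__injected === true;
delete globalThis.__injected;

const fixedResult = run(buildFixed(hostile, ''));
const injectedByFixed = globalThis.__injected === true;

console.log({ legacyResult, injectedByLegacy, fixedResult, injectedByFixed });
```

With the hand-quoted template the value is truncated at the first quote and the remainder runs as code. With `JSON.stringify` the whole value survives as data.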
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm-skill",
3
- "version": "2.0.1576",
3
+ "version": "2.0.1578",
4
4
  "description": "Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation. Install in any AI coding agent host.",
5
5
  "author": "AnEntrypoint",
6
6
  "license": "MIT",