gm-skill 2.0.1576 → 2.0.1578
- package/AGENTS.md +4 -4
- package/gm-plugkit/bootstrap.js +1 -10
- package/gm-plugkit/lang-host-runner.js +0 -3
- package/gm-plugkit/package.json +1 -1
- package/gm-plugkit/plugkit-wasm-wrapper.js +26 -100
- package/gm-plugkit/supervisor.js +1 -31
- package/gm.json +1 -1
- package/lang/ssh.js +0 -1
- package/lib/spool.js +1 -1
- package/package.json +1 -1
package/AGENTS.md
CHANGED
|
@@ -118,11 +118,11 @@ Every skill's `allowed-tools:` is reduced to `Skill, Read, Write` (plus the SKIL
|
|
|
118
118
|
|
|
119
119
|
**AGENTS.md / CLAUDE.md are inline-edited AND dual-written to the store**: edit them inline for structural rules (the only doc surviving context summarization), AND `memorize-fire` the same rule so `recall`/`auto_recall` surface it later -- complementary, not alternatives. Never `namespace:"AGENTS.md"`; load-bearing rules go to the default namespace. Mechanics in rs-learn (`recall: memorize-fire ingestion classifier`).
|
|
120
120
|
|
|
121
|
-
**A memorized workaround is a tool defect; transform it, never accumulate it**: we work USING gm, not ON it, so a `recall` memo framed as a workaround, known-limitation, or internal-advice is tribal knowledge a fresh user/LLM lacks -- the tool then surprises them, and surprises are never allowed; everything must be abundantly predictable at face value. Resolve
|
|
121
|
+
**A memorized workaround is a tool defect; transform it, never accumulate it**: we work USING gm, not ON it, so a `recall` memo framed as a workaround, known-limitation, or internal-advice is tribal knowledge a fresh user/LLM lacks -- the tool then surprises them, and surprises are never allowed; everything must be abundantly predictable at face value. Resolve: (a) already in standing prose -> prune recall; (b) prose-worthy but absent -> add to prose then prune; (c) genuinely surprising behavior -> fix code so it is predictable then prune.
|
|
122
122
|
|
|
123
123
|
**Behavioral discipline lives in plugkit's `instruction` verb**: dispatch `instruction` for the live phase-specific prose (Three-Layer Admission Filter, maturity-first emit, closure anti-shapes, code invariants); do not duplicate it here. Enumeration in rs-learn (`recall: instruction-verb behavioral discipline invariants`).
|
|
124
124
|
|
|
125
|
-
**The agent IS the LLM rs-learn calls**:
|
|
125
|
+
**The agent IS the LLM rs-learn calls**: no separate judge model; all decisions are inline via spool. Internals in rs-learn (`recall: rs-learn self-report core internals`).
|
|
126
126
|
|
|
127
127
|
**host_exec_js is synchronous**: pass a real per-call `timeoutMs` (zero/missing is a hard error). Detail in rs-learn (`recall: host_exec_js synchronous`).
|
|
128
128
|
|
|
@@ -148,13 +148,13 @@ Push to any rs-* sibling triggers `cascade.yml` -> rs-plugkit `release.yml` -> s
|
|
|
148
148
|
|
|
149
149
|
Orchestration state is tracked via `.gm/` marker files, not hook events; the CLI layer calls `checkDispatchGates()` before tool execution to gate Write/Edit/git. Marker set (`prd.yml, mutables.yml, needs-gm, gm-fired-<sessionId>, residual-check-fired`) + SpoolDispatcher mechanism in rs-learn (`recall: gate enforcement layer`, `recall: spool dispatch gates marker files`).
|
|
150
150
|
|
|
151
|
-
**gm-skill tool-use sequencing**: `Skill(skill="gm-skill")`
|
|
151
|
+
**gm-skill tool-use sequencing**: `Skill(skill="gm-skill")` clears the needs-gm gate. One shipped skill, no subagent variant. Marker mechanics in rs-learn (`recall: gm-skill tool-use sequencing mechanics`).
|
|
152
152
|
|
|
153
153
|
**The skill is the driver, not a post-hoc witness**: when a request carries the standing instruction to use gm-skill (every `/loop` fire, any prompt naming `/gm-skill`), the FIRST working action is `Skill(skill="gm-skill")`, and the skill prose drives the chain PLAN->COMPLETE. Dispatching spool verbs directly without first entering the skill executes the work outside the skill the user asked to drive it; entering only at the end to confirm terminal state does NOT satisfy the instruction. The boot probe (`cat .gm/exec-spool/.status.json` ...) is prescribed by the skill and may precede invocation; everything that mutates state happens inside the skill-driven session.
|
|
154
154
|
|
|
155
155
|
**Dead-watcher recovery uses `bun x gm-plugkit@latest spool`, never direct-node boot** (mechanism in rs-learn: `recall: dead-watcher recovery bun x not direct-node`).
|
|
156
156
|
|
|
157
|
-
**The first verb after a genuine multi-minute IDLE is `instruction`, to reset the long-gap clock**:
|
|
157
|
+
**The first verb after a genuine multi-minute IDLE is `instruction`, to reset the long-gap clock**: gate fires when >300s since last instruction AND >300s since any SPOOL verb. Platform `Bash`/`Read`/`Edit`/`Grep` do NOT reset the clock -- a long investigation run in them trips a false stall; interleave `prd-add` or `instruction` to keep warm. For a predictable blocking wait (`TaskOutput`/`gh run watch`), dispatch `instruction` BEFORE entering the wait. Detail + platform-tool exception in rs-learn (`recall: first verb after multi-minute wait instruction long-gap`).
|
|
158
158
|
|
|
159
159
|
**A stop-hook firing on a terminal chain does not authorize re-polling**: when a stop-hook fires while already at `phase=COMPLETE` AND `prd_pending_count=0`, re-dispatching `instruction`/`phase-status` to "re-confirm" is a deviation (`deviation.complete-chain-poll`, `instructions/mod.rs`). Two admissible responses: (a) a prose-only turn (COMPLETE is in hand), or (b) genuinely new planned work opened with a FRESH `{"prompt":...}` body (resets phase to PLAN, driven through the skill). Repeatedly answering the same hook is a loop; state the terminal facts once and stop, or open new work.
|
|
160
160
|
|
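The long-gap rule added to AGENTS.md above (gate fires when more than 300s have passed since the last `instruction` AND since any spool verb) can be sketched as a small predicate. This is an illustration only: `longGapGateFires`, `LONG_GAP_MS`, and the timestamp parameters are hypothetical names, not part of gm's API, and the real check may live elsewhere and consider more state.

```javascript
// Hypothetical helper: names and signature are illustrative, not gm's API.
const LONG_GAP_MS = 300_000; // the 300s threshold quoted in the AGENTS.md rule

function longGapGateFires(nowMs, lastInstructionMs, lastSpoolVerbMs) {
  const sinceInstruction = nowMs - lastInstructionMs;
  const sinceSpoolVerb = nowMs - lastSpoolVerbMs;
  // Both clocks must have expired. Platform tools (Bash/Read/Edit/Grep)
  // update neither timestamp, so they cannot keep the gate from firing.
  return sinceInstruction > LONG_GAP_MS && sinceSpoolVerb > LONG_GAP_MS;
}
```

Under this reading, interleaving any spool verb such as `prd-add` moves `lastSpoolVerbMs` forward and keeps the gate closed, which matches the "keep warm" advice in the rule.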
package/gm-plugkit/bootstrap.js
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
#!/usr/bin/env node
|
|
1
|
+
#!/usr/bin/env node
|
|
2
2
|
'use strict';
|
|
3
3
|
|
|
4
4
|
const fs = require('fs');
|
|
@@ -7,10 +7,6 @@ const os = require('os');
|
|
|
7
7
|
const crypto = require('crypto');
|
|
8
8
|
const { spawn, spawnSync } = require('child_process');
|
|
9
9
|
|
|
10
|
-
// Resolve a bare command name to its actual .exe on Windows. cmd.exe + .cmd
|
|
11
|
-
// shim chains re-enter conhost (visible window flash) even with
|
|
12
|
-
// windowsHide:true on the parent. Spawning the real .exe directly lets
|
|
13
|
-
// CREATE_NO_WINDOW propagate. See [[windows-spawn-cmd-shim-flash]].
|
|
14
10
|
function resolveWindowsExe(cmd) {
|
|
15
11
|
if (process.platform !== 'win32') return cmd;
|
|
16
12
|
try {
|
|
@@ -683,11 +679,6 @@ function copyWasmToGmTools(wasmPath, version) {
|
|
|
683
679
|
} catch (_) {}
|
|
684
680
|
}
|
|
685
681
|
if (!wasmFresh) {
|
|
686
|
-
// copyFileSync truncates the target before streaming ~149MB, leaving a window where
|
|
687
|
-
// a crash or a concurrent watcher load sees a truncated/absent wasm (the
|
|
688
|
-
// "self-heal: wasm not installed" crash-loop during an upgrade). Copy to a
|
|
689
|
-
// pid-suffixed temp and rename over the target: same-volume rename is atomic,
|
|
690
|
-
// with the Windows EEXIST/EPERM unlink+retry.
|
|
691
682
|
const tmp = `${target}.partial-${process.pid}`;
|
|
692
683
|
fs.copyFileSync(wasmPath, tmp);
|
|
693
684
|
try { fs.renameSync(tmp, target); }
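The comment removed in this hunk described why the copy goes to a pid-suffixed temp file and is then renamed over the target. A minimal sketch of that pattern follows, assuming a hypothetical helper name (`replaceFileAtomically`) and a single unlink-and-retry on `EEXIST`/`EPERM`; the package's actual retry logic is not visible in the diff.

```javascript
const fs = require('fs');

// Hypothetical helper; the retry policy is an assumption based on the removed comment.
function replaceFileAtomically(source, target) {
  const tmp = `${target}.partial-${process.pid}`;
  // The slow copy writes to the temp name, so readers never see a truncated target.
  fs.copyFileSync(source, tmp);
  try {
    fs.renameSync(tmp, target); // rename on the same volume swaps the file in one step
  } catch (e) {
    if (e.code !== 'EEXIST' && e.code !== 'EPERM') {
      try { fs.unlinkSync(tmp); } catch (_) {}
      throw e;
    }
    // Windows can refuse to rename over an existing file: unlink, then retry once.
    try { fs.unlinkSync(target); } catch (_) {}
    fs.renameSync(tmp, target);
  }
}
```

The temp file must live next to the target (same directory, hence same volume) for the rename to stay a single step.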
|
package/gm-plugkit/lang-host-runner.js
CHANGED
|
@@ -1,7 +1,4 @@
|
|
|
1
1
|
#!/usr/bin/env node
|
|
2
|
-
// Legacy fallback. The canonical surface for lang/*.js plugins is the wasm
|
|
3
|
-
// `lang` verb in rs-plugkit, dispatched via .gm/exec-spool/in/lang/<N>.txt.
|
|
4
|
-
// This standalone runner is kept for direct CLI debug + pre-cascade situations.
|
|
5
2
|
'use strict';
|
|
6
3
|
const fs = require('fs');
|
|
7
4
|
const path = require('path');
|
package/gm-plugkit/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "gm-plugkit",
|
|
3
|
-
"version": "2.0.
|
|
3
|
+
"version": "2.0.1578",
|
|
4
4
|
"description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
|
|
5
5
|
"main": "index.js",
|
|
6
6
|
"bin": {
|
package/gm-plugkit/plugkit-wasm-wrapper.js
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
import fs from 'fs';
|
|
1
|
+
import fs from 'fs';
|
|
2
2
|
import path from 'path';
|
|
3
3
|
import os from 'os';
|
|
4
4
|
import crypto from 'crypto';
|
|
@@ -13,16 +13,8 @@ const _httpModule = http;
|
|
|
13
13
|
const _httpsModule = https;
|
|
14
14
|
import { fileURLToPath } from 'url';
|
|
15
15
|
|
|
16
|
-
// Set by the spool watcher's writeStatus closure once it is live. Lets long synchronous verbs
|
|
17
|
-
// (browser/chromium spawn, long exec) stamp a busy_until window into .status.json before the
|
|
18
|
-
// blocking call, so a liveness probe reads "busy" not "dead" while the event loop is blocked.
|
|
19
16
|
let _writeStatusBusy = () => {};
|
|
20
|
-
// Latest busy_until epoch ms stamped by a long synchronous verb (codesearch rebuild, chromium
|
|
21
|
-
// spawn). scanStalledTurns reads it so a busy watcher is not mis-flagged as an idle stall.
|
|
22
17
|
let _lastBusyUntil = 0;
|
|
23
|
-
// First 12 hex of sha256 of this watcher's own gmTools wrapper. Module-scoped so writeStatus
|
|
24
|
-
// (a different function scope) can stamp status.wrapper_sha, which the supervisor compares
|
|
25
|
-
// against the on-disk wrapper to recycle a watcher running a stale wrapper-only fix.
|
|
26
18
|
let _ownWrapperSha12 = '';
|
|
27
19
|
|
|
28
20
|
function spawnSync(cmd, args, opts) {
|
|
@@ -346,18 +338,12 @@ function turnTick(sess, verb, taskBase, phase, prdPending) {
|
|
|
346
338
|
const key = sess || '(no-session)';
|
|
347
339
|
const now = Date.now();
|
|
348
340
|
let t = _turns.get(key);
|
|
349
|
-
// Any verb arriving after an idle gap closes the stale turn -- not just instruction.
|
|
350
|
-
// Otherwise a non-instruction verb (prd-add, mutable-resolve, transition) landing
|
|
351
|
-
// after an overnight sleep stamps t.lastTs forward without splitting, and dur_ms
|
|
352
|
-
// (lastTs - startTs) balloons to wall-clock-with-sleep instead of active work time.
|
|
353
341
|
if (t && (now - t.lastTs) > TURN_IDLE_MS) {
|
|
354
342
|
endTurn(sess, t, true);
|
|
355
343
|
_turns.delete(key);
|
|
356
344
|
t = null;
|
|
357
345
|
}
|
|
358
346
|
if (!t) {
|
|
359
|
-
// Only an instruction dispatch opens a new turn; a stray non-instruction verb after
|
|
360
|
-
// idle is recorded against no turn (the next instruction starts the real turn).
|
|
361
347
|
if (verb !== 'instruction') return;
|
|
362
348
|
const idx = ((_turns.get(key + ':lastIdx') || 0) + 1);
|
|
363
349
|
_turns.set(key + ':lastIdx', idx);
|
|
@@ -367,27 +353,15 @@ function turnTick(sess, verb, taskBase, phase, prdPending) {
|
|
|
367
353
|
}
|
|
368
354
|
t.lastTs = now;
|
|
369
355
|
t.dispatches++;
|
|
370
|
-
// A verb arriving resumes the turn -- clear any prior stall flag so a later re-stall
|
|
371
|
-
// is a fresh episode, not silently suppressed by the one-shot guard.
|
|
372
356
|
t.stallEmitted = false;
|
|
373
357
|
t.verbs.set(verb, (t.verbs.get(verb) || 0) + 1);
|
|
374
358
|
if (phase) { t.phases.add(phase); t.lastPhase = phase; }
|
|
375
359
|
if (typeof prdPending === 'number') t.prdPending = prdPending;
|
|
376
360
|
}
|
|
377
361
|
|
|
378
|
-
// turn.end fires only when a NEW verb arrives after idle, so a turn that simply never
|
|
379
|
-
// receives another verb stays open forever and emits no signal -- a permanent stall is
|
|
380
|
-
// silence, not an event, which is how a mid-EXECUTE stop stays invisible for days. The
|
|
381
|
-
// heartbeat scan closes that hole: for each open turn idle past STALL_MS whose last phase
|
|
382
|
-
// is non-terminal (or carries open PRD rows), emit turn.stalled once. One-shot per episode
|
|
383
|
-
// (stallEmitted), reset when a verb resumes the turn. A COMPLETE turn with no open rows
|
|
384
|
-
// idling is the authorized prose-only state and never stalls.
|
|
385
362
|
const STALL_MS = 300_000;
|
|
386
363
|
function scanStalledTurns() {
|
|
387
364
|
const now = Date.now();
|
|
388
|
-
// A long synchronous verb (codesearch index rebuild, chromium spawn) stamps busy_until and
|
|
389
|
-
// blocks the spool -- the agent is legitimately waiting, not stalled. Honor it exactly as
|
|
390
|
-
// supervisor.js checkWatcherHealth does, so a busy watcher never emits a false mid-chain-stall.
|
|
391
365
|
if (_lastBusyUntil && _lastBusyUntil > now) return;
|
|
392
366
|
for (const [key, t] of _turns) {
|
|
393
367
|
if (!t || typeof t !== 'object' || !Number.isFinite(t.startTs)) continue;
|
|
@@ -396,9 +370,6 @@ function scanStalledTurns() {
|
|
|
396
370
|
const terminal = t.lastPhase === 'COMPLETE' && (t.prdPending === 0 || t.prdPending == null);
|
|
397
371
|
if (terminal) continue;
|
|
398
372
|
t.stallEmitted = true;
|
|
399
|
-
// key is the _turns map key (sess || '(no-session)'). When it is the sentinel, the turn was
|
|
400
|
-
// unattributed, so do not override logEvent's own cwd+sess base fields with '(no-session)' --
|
|
401
|
-
// let the cwd-based attribution stand. Pass an explicit sess only when key is a real session.
|
|
402
373
|
const fields = {
|
|
403
374
|
turn_idx: t.idx,
|
|
404
375
|
ended_in_phase: t.lastPhase || null,
|
|
@@ -411,10 +382,6 @@ function scanStalledTurns() {
|
|
|
411
382
|
}
|
|
412
383
|
}
|
|
413
384
|
|
|
414
|
-
// Every spool dispatch is the agent actively driving the chain, including wasm-direct verbs
|
|
415
|
-
// (recall/codesearch/exec_js/git/fetch) that never reach turnTick. Refresh the open turn's stall
|
|
416
|
-
// clock so a Bash-free stretch of pure wasm-direct verbs does not trip a false mid-chain-stall
|
|
417
|
-
// (the recurring audit-fire own-defect). Never create or split a turn -- that stays turnTick's job.
|
|
418
385
|
function touchActiveTurn(sess) {
|
|
419
386
|
const t = _turns.get(sess || '(no-session)');
|
|
420
387
|
if (!t) return;
|
|
@@ -888,8 +855,6 @@ function runBrowserRunner(pw, args, timeoutMs, cwd, claudeSessionId) {
|
|
|
888
855
|
const sockDir = playwriterHomeFor(cwd, claudeSessionId);
|
|
889
856
|
try { fs.mkdirSync(sockDir, { recursive: true }); } catch (_) {}
|
|
890
857
|
env.PLAYWRITER_HOME = sockDir;
|
|
891
|
-
// Stamp a busy window before the synchronous spawn so the blocked event loop's stale heartbeat
|
|
892
|
-
// is not misread as a dead watcher. Pad past the spawn timeout for teardown.
|
|
893
858
|
_writeStatusBusy((timeoutMs || 30000) + 5000);
|
|
894
859
|
return spawnSync(spawnCmd, spawnArgs, {
|
|
895
860
|
encoding: 'utf-8',
|
|
@@ -911,9 +876,6 @@ function scrubBrowserRunnerText(s) {
|
|
|
911
876
|
return t;
|
|
912
877
|
}
|
|
913
878
|
|
|
914
|
-
// Standard OS install locations for a Chrome/Chromium that speaks CDP. Used as a
|
|
915
|
-
// fallback when the managed ms-playwright cache is absent (e.g. cache evicted),
|
|
916
|
-
// so the browser verb keeps working off the system browser instead of failing.
|
|
917
879
|
function findSystemChromiumBinary() {
|
|
918
880
|
const candidates = process.platform === 'win32'
|
|
919
881
|
? [
|
|
@@ -1721,8 +1683,6 @@ function makeHostFunctions(instanceRef) {
|
|
|
1721
1683
|
const key = readWasmStr(instanceRef.value, keyPtr, keyLen);
|
|
1722
1684
|
if (!ns || !key) return 0;
|
|
1723
1685
|
let removed = 0;
|
|
1724
|
-
// Delete the key from the namespace AND its -vec sibling across every enabled discipline dir,
|
|
1725
|
-
// so a pruned memory leaves no orphan embedding that host_vec_search would still surface.
|
|
1726
1686
|
for (const baseNs of [ns, `${ns}-vec`]) {
|
|
1727
1687
|
for (const dir of kvNamespaceDirs(baseNs)) {
|
|
1728
1688
|
const fp = path.join(dir, safeName(key) + '.json');
|
|
@@ -2139,8 +2099,8 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
2139
2099
|
}
|
|
2140
2100
|
function lockBody() { return `${process.pid}|${Date.now()}|${_ownWrapperSha12}`; }
|
|
2141
2101
|
function acquireLock() {
|
|
2142
|
-
|
|
2143
|
-
|
|
2102
|
+
function checkExistingHolder() {
|
|
2103
|
+
try {
|
|
2144
2104
|
const content = fs.readFileSync(LOCK_PATH, 'utf-8').trim();
|
|
2145
2105
|
const parts = content.split('|');
|
|
2146
2106
|
const pidStr = parts[0];
|
|
@@ -2164,6 +2124,7 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
2164
2124
|
}));
|
|
2165
2125
|
} catch (_) {}
|
|
2166
2126
|
try { process.kill(parseInt(pidStr, 10), 'SIGTERM'); } catch (_) {}
|
|
2127
|
+
return 'takeover';
|
|
2167
2128
|
} else {
|
|
2168
2129
|
const msg = JSON.stringify({ ok: false, reason: 'another-watcher-active', pid: pidStr, age_ms: age });
|
|
2169
2130
|
console.error(`[plugkit-wasm] ${msg}; refusing to start`);
|
|
@@ -2181,11 +2142,32 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
2181
2142
|
} else if (!holderAlive) {
|
|
2182
2143
|
console.error(`[plugkit-wasm] stale lock (holder pid=${pidStr} dead, age=${age}ms); taking over`);
|
|
2183
2144
|
try { logEvent('plugkit', 'watcher.lock-pid-dead-takeover', { stale_pid: pidStr, lock_age_ms: age }); } catch (_) {}
|
|
2145
|
+
return 'takeover';
|
|
2184
2146
|
} else {
|
|
2185
2147
|
console.error(`[plugkit-wasm] stale lock (age=${age}ms); taking over`);
|
|
2148
|
+
return 'takeover';
|
|
2186
2149
|
}
|
|
2150
|
+
} catch (_) {
|
|
2151
|
+
return 'takeover';
|
|
2152
|
+
}
|
|
2153
|
+
}
|
|
2154
|
+
try {
|
|
2155
|
+
let fd;
|
|
2156
|
+
try {
|
|
2157
|
+
fd = fs.openSync(LOCK_PATH, 'wx');
|
|
2158
|
+
} catch (e) {
|
|
2159
|
+
if (e.code !== 'EEXIST') throw e;
|
|
2160
|
+
const action = checkExistingHolder();
|
|
2161
|
+
if (action !== 'takeover') return;
|
|
2162
|
+
try { fs.unlinkSync(LOCK_PATH); } catch (_) {}
|
|
2163
|
+
fd = fs.openSync(LOCK_PATH, 'wx');
|
|
2164
|
+
}
|
|
2165
|
+
try {
|
|
2166
|
+
const body = Buffer.from(lockBody(), 'utf-8');
|
|
2167
|
+
fs.writeSync(fd, body);
|
|
2168
|
+
} finally {
|
|
2169
|
+
fs.closeSync(fd);
|
|
2187
2170
|
}
|
|
2188
|
-
fs.writeFileSync(LOCK_PATH, lockBody());
|
|
2189
2171
|
} catch (e) {
|
|
2190
2172
|
console.error(`[plugkit-wasm] lock acquire failed: ${e.message}`);
|
|
2191
2173
|
process.exit(1);
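The rewritten `acquireLock` above replaces a plain `writeFileSync` with an exclusive create (`'wx'`), so two watchers racing for the lock cannot both succeed. The standalone sketch below shows the same pattern in isolation. `acquireLockFile` and the `isStale` callback are hypothetical; `isStale` stands in for the holder-pid and age checks that `checkExistingHolder` performs in the diff.

```javascript
const fs = require('fs');

// Hypothetical helper; `isStale` stands in for the holder checks the diff performs.
function acquireLockFile(lockPath, body, isStale) {
  let fd;
  try {
    fd = fs.openSync(lockPath, 'wx'); // exclusive create: EEXIST if the file exists
  } catch (e) {
    if (e.code !== 'EEXIST') throw e;
    // A live holder keeps the lock; only a stale one is taken over.
    if (!isStale(fs.readFileSync(lockPath, 'utf-8'))) return false;
    try { fs.unlinkSync(lockPath); } catch (_) {}
    fd = fs.openSync(lockPath, 'wx'); // throws if another process re-created it first
  }
  try {
    fs.writeSync(fd, body);
  } finally {
    fs.closeSync(fd);
  }
  return true;
}
```

The second `openSync` is deliberately exclusive as well: if a competitor wins the race after the unlink, this caller fails rather than overwriting the winner's lock.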
|
|
@@ -2227,13 +2209,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
2227
2209
|
detected_at: Date.now(),
|
|
2228
2210
|
});
|
|
2229
2211
|
try { console.error(`[plugkit-wasm] VERB ABORT detected: prior watcher pid=${priorVerb.pid} died inside verb=${priorVerb.verb} task=${priorVerb.task}`); } catch (_) {}
|
|
2230
|
-
// The aborted dispatch otherwise gets NO response file: the in-file was consumed,
|
|
2231
|
-
// the prior watcher died before writing out/, and the agent waits forever on a
|
|
2232
|
-
// Read that never lands (then must git-archaeology whether the verb's side effects
|
|
2233
|
-
// happened). Write a definite failure response so the agent's Read returns
|
|
2234
|
-
// immediately and it re-dispatches. Both out-name shapes are written because
|
|
2235
|
-
// .verb-active.json does not record whether the dispatch was root or nested;
|
|
2236
|
-
// the agent reads whichever its dispatch shape expects, the other is swept.
|
|
2237
2212
|
if (priorVerb.verb && priorVerb.task) {
|
|
2238
2213
|
try {
|
|
2239
2214
|
const abortBody = JSON.stringify({
|
|
@@ -2513,10 +2488,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
2513
2488
|
child.unref();
|
|
2514
2489
|
try { logEvent('plugkit', 'gm-plugkit.self-stale-respawn', { running_version: own, latest_version: latest, cache_busted: bustCache, attempt: (respawnGuard.attempts || 0) + 1 }); } catch (_) {}
|
|
2515
2490
|
try { fs.writeFileSync(path.join(spoolDir, '.shutdown-reason.json'), JSON.stringify({ reason: 'gm-plugkit-self-stale', ts: Date.now(), pid: process.pid, running_version: own, latest_version: latest })); } catch (_) {}
|
|
2516
|
-
// Wait for the replacement's fresh heartbeat before exiting (mirror the
|
|
2517
|
-
// version-drift path) instead of a blind 2s exit: the gm-plugkit download can
|
|
2518
|
-
// take many seconds, and exiting early lets the supervisor relaunch the SAME
|
|
2519
|
-
// stale version before the new one lands, so the update never sticks.
|
|
2520
2491
|
const myPid = process.pid;
|
|
2521
2492
|
const respawnDeadline = Date.now() + 90000;
|
|
2522
2493
|
const exitSelfStale = () => { try { process.exit(0); } catch (_) {} };
|
|
@@ -2575,11 +2546,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
2575
2546
|
setTimeout(probeGmPlugkitSelfStale, 5000);
|
|
2576
2547
|
setInterval(probeGmPlugkitSelfStale, 300_000);
|
|
2577
2548
|
|
|
2578
|
-
// A supervised watcher self-exits on drift assuming the supervisor respawns it. If the
|
|
2579
|
-
// supervisor has died, that bare exit leaves the spool dead (worse than staying up). Treat a
|
|
2580
|
-
// dead/absent supervisor as unsupervised so the drift loops take the self-respawn-and-wait path
|
|
2581
|
-
// (spawn replacement, wait for its heartbeat, then exit) instead. False-negative is self-correcting:
|
|
2582
|
-
// if both the supervisor and this watcher respawn, the single-instance lock admits exactly one.
|
|
2583
2549
|
function _supervisorIsDead() {
|
|
2584
2550
|
try {
|
|
2585
2551
|
const sp = parseInt(fs.readFileSync(path.join(spoolDir, '.supervisor.pid'), 'utf8').trim(), 10);
|
|
@@ -3051,11 +3017,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
3051
3017
|
const relPath = path.relative(inDir, filePath);
|
|
3052
3018
|
const dir = path.dirname(relPath);
|
|
3053
3019
|
const verb = dir === '.' ? path.basename(filePath, path.extname(filePath)) : dir;
|
|
3054
|
-
// Defense-in-depth beyond walkDir's dot-dir skip: a real verb is a single clean
|
|
3055
|
-
// segment (e.g. instruction, prd-resolve). A derived verb containing a path
|
|
3056
|
-
// separator or a dot-segment means the file lives under a stray nested spool
|
|
3057
|
-
// (in/prd-resolve/.gm/exec-spool/...); dispatching it builds a bogus verb+outName
|
|
3058
|
-
// and ENOENT-storms every tick. Skip + unmark so it never re-enters the loop.
|
|
3059
3020
|
if (/[\\/]/.test(verb) || verb.split(/[\\/]/).some(seg => seg.startsWith('.'))) {
|
|
3060
3021
|
try { logEvent('plugkit', 'spool.skip-nested-verb', { rel: relPath, derived_verb: verb }); } catch (_) {}
|
|
3061
3022
|
unmarkProcessed(key);
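The guard in this hunk rejects any derived verb that contains a path separator or a dot-prefixed segment. The sketch below wraps the same expressions in a hypothetical function (`deriveVerb`, returning `null` for a rejected path) so the behaviour can be tried on sample paths; the wrapper and its return convention are assumptions, not the package's code.

```javascript
const path = require('path');

// Hypothetical wrapper around the verb derivation shown in the hunk.
function deriveVerb(inDir, filePath) {
  const relPath = path.relative(inDir, filePath);
  const dir = path.dirname(relPath);
  // Root-level file: the verb is the file name; otherwise it is the directory.
  const verb = dir === '.' ? path.basename(filePath, path.extname(filePath)) : dir;
  // A real verb is one clean segment; anything nested or dot-prefixed is skipped.
  const nested = /[\\/]/.test(verb) || verb.split(/[\\/]/).some(seg => seg.startsWith('.'));
  return nested ? null : verb;
}
```

With this sketch, a file under a stray nested spool such as `in/prd-resolve/.gm/exec-spool/...` yields `null` instead of a bogus multi-segment verb.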
|
|
@@ -3075,15 +3036,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
3075
3036
|
console.log(`[dispatch] -> verb=${verb} task=${taskBase} body=${bodyBytes.length}b`);
|
|
3076
3037
|
logEvent('plugkit', 'dispatch.start', { verb, task: taskBase, body_bytes: bodyBytes.length, cwd: process.cwd() });
|
|
3077
3038
|
|
|
3078
|
-
// Network-bound git verbs block the event loop for the duration of a push/fetch (~30s),
|
|
3079
|
-
// so the 5s heartbeat cannot fire and the supervisor would reap the watcher as hung
|
|
3080
|
-
// (the VERB ABORT). Stamp a busy_until window before the synchronous dispatch so the
|
|
3081
|
-
// supervisor's heartbeat-stale check honors it, exactly as the browser runner does.
|
|
3082
|
-
// codesearch is the longest synchronous verb: a cold first call loads the 133MB bge-small
|
|
3083
|
-
// bert model AND re-indexes the tree. A cold build was witnessed at ~252s (dispatch log
|
|
3084
|
-
// codesearch ms=251772), so a 180s window let the supervisor reap the watcher mid-index and
|
|
3085
|
-
// respawn it, cold-loading again = respawn-thrash that never completes (the codeinsight-stale
|
|
3086
|
-
// symptom). codesearch gets a 360s window; the bounded git verbs keep 180s.
|
|
3087
3039
|
if (verb === 'codesearch') {
|
|
3088
3040
|
try { _writeStatusBusy(360000); } catch (_) {}
|
|
3089
3041
|
} else if (verb === 'git_finalize' || verb === 'git_push' || verb === 'git_fetch') {
|
|
@@ -3188,11 +3140,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
3188
3140
|
try {
|
|
3189
3141
|
for (const entry of fs.readdirSync(dir)) {
|
|
3190
3142
|
if (/\.tmp\.\d+(\.|$)/.test(entry)) continue;
|
|
3191
|
-
// The verb tree is in/<verb>/[<sub>/]<N>.<ext> -- at most two levels deep. A
|
|
3192
|
-
// dot-prefixed dir (e.g. a stray nested .gm/exec-spool/ created by a misfire)
|
|
3193
|
-
// is never a verb dir; recursing into it derives a bogus verb like
|
|
3194
|
-
// `prd-resolve\.gm\exec-spool` and dispatch-errors on every tick forever.
|
|
3195
|
-
// Skip dot-dirs and cap depth so a spool-inside-spool cannot wedge the watcher.
|
|
3196
3143
|
if (entry.startsWith('.')) continue;
|
|
3197
3144
|
const fullPath = path.join(dir, entry);
|
|
3198
3145
|
let stat;
|
|
@@ -3229,12 +3176,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
3229
3176
|
wrapper_sha: _ownWrapperSha12 || null,
|
|
3230
3177
|
idle_limit_ms: IDLE_LIMIT_MS,
|
|
3231
3178
|
};
|
|
3232
|
-
// A synchronous verb (chromium spawn, long exec) blocks the event loop, so the 5s
|
|
3233
|
-
// heartbeat interval cannot fire for the duration. Without a hint, a liveness probe that
|
|
3234
|
-
// checks ts-within-15s reads the busy watcher as dead and may kill/respawn it mid-verb.
|
|
3235
|
-
// busy_until tells probes "alive but synchronously busy until this epoch ms" -- read it
|
|
3236
|
-
// alongside ts: a stale ts whose busy_until is still in the future is a busy watcher, not
|
|
3237
|
-
// a dead one. The pre-verb writeStatus(busyMs) stamps it before the blocking call.
|
|
3238
3179
|
if (busyMs && busyMs > 0) { rec.busy_until = now + busyMs; _lastBusyUntil = rec.busy_until; }
|
|
3239
3180
|
fs.writeFileSync(STATUS_PATH, JSON.stringify(rec));
|
|
3240
3181
|
} catch (_) {}
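The removed comment explained how a probe should read `busy_until` alongside `ts`: a stale heartbeat with a future `busy_until` means the watcher is blocked in a synchronous verb, not dead. A minimal sketch of such a probe follows. `classifyWatcher` and `HEARTBEAT_FRESH_MS` are hypothetical names, and the 15s window is taken from the removed comment rather than from verified probe code.

```javascript
// Hypothetical probe; the 15s freshness window is the figure quoted in the removed comment.
const HEARTBEAT_FRESH_MS = 15_000;

function classifyWatcher(status, nowMs) {
  if (!status || !Number.isFinite(status.ts)) return 'dead';
  if (nowMs - status.ts <= HEARTBEAT_FRESH_MS) return 'alive';
  // Stale heartbeat, but a future busy_until means a synchronous verb is
  // blocking the event loop and the heartbeat simply cannot fire.
  if (Number.isFinite(status.busy_until) && status.busy_until > nowMs) return 'busy';
  return 'dead';
}
```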
|
|
@@ -3435,9 +3376,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
3435
3376
|
logEvent('plugkit', 'update.available', { installed, latest });
|
|
3436
3377
|
_lastKnownDrift = latest;
|
|
3437
3378
|
}
|
|
3438
|
-
// NOTE: no version-file bump here either -- see the network-path comment above. Bumping the version
|
|
3439
|
-
// file ahead of a verified binary download poisons installedVersionAtTools() and causes an infinite
|
|
3440
|
-
// drift-respawn thrash. Auto-update is notify-only until a sha-verified force-download path exists.
|
|
3441
3379
|
}
|
|
3442
3380
|
function checkUpdateViaNpm(installed) {
|
|
3443
3381
|
const req = https.get({
|
|
@@ -3615,9 +3553,6 @@ async function runSpoolWatcher(instance, spoolDir) {
|
|
|
3615
3553
|
watch(inDir, { recursive: true }, (eventType, filename) => {
|
|
3616
3554
|
if (!filename) return;
|
|
3617
3555
|
if (/\.tmp\.\d+(\.|$)/.test(filename)) return;
|
|
3618
|
-
// Skip any path with a dot-prefixed segment (e.g. a stray nested
|
|
3619
|
-
// prd-resolve/.gm/exec-spool/...): it is not a real verb dispatch and walking it
|
|
3620
|
-
// derives a bogus verb that dispatch-errors on every tick. Matches walkDir's guard.
|
|
3621
3556
|
if (filename.split(/[\\/]/).some(seg => seg.startsWith('.'))) return;
|
|
3622
3557
|
const fullPath = path.join(inDir, filename);
|
|
3623
3558
|
markActivity('watch');
|
|
@@ -3681,11 +3616,6 @@ async function selfHealFromGithubReleases() {
|
|
|
3681
3616
|
}
|
|
3682
3617
|
const toolsDir = GM_TOOLS_ROOT;
|
|
3683
3618
|
fs.mkdirSync(toolsDir, { recursive: true });
|
|
3684
|
-
// Replace the live wasm atomically. A direct writeFileSync truncates the target
|
|
3685
|
-
// before streaming ~149MB, so a crash mid-write or a concurrent watcher load in
|
|
3686
|
-
// that window sees a truncated or absent wasm ("self-heal: wasm not installed"
|
|
3687
|
-
// crash-loop). Write to a pid-suffixed temp and rename over the target; rename
|
|
3688
|
-
// on the same volume is atomic, with the Windows EEXIST/EPERM unlink+retry.
|
|
3689
3619
|
const wasmTarget = path.join(toolsDir, 'plugkit.wasm');
|
|
3690
3620
|
const wasmTmp = `${wasmTarget}.partial-${process.pid}`;
|
|
3691
3621
|
fs.writeFileSync(wasmTmp, wasm);
|
|
@@ -3738,10 +3668,6 @@ async function tryInstantiate(wasmPath) {
|
|
|
3738
3668
|
return { instance, instanceRef };
|
|
3739
3669
|
}
|
|
3740
3670
|
|
|
3741
|
-
// In-process API. Lets a host (e.g. freddie) drive memorize/recall/auto-recall against
|
|
3742
|
-
// .gm/rs-learn.db WITHOUT running the spool daemon loop: the wasm instance is created once
|
|
3743
|
-
// and cached, and dispatch() returns parsed JSON. The wasm host functions resolve the project
|
|
3744
|
-
// .gm dir from CLAUDE_PROJECT_DIR/cwd, so set those in the host process before first dispatch.
|
|
3745
3671
|
let _sharedPlugkit = null;
|
|
3746
3672
|
export async function createPlugkit(opts = {}) {
|
|
3747
3673
|
if (_sharedPlugkit && !opts.fresh) return _sharedPlugkit;
|
package/gm-plugkit/supervisor.js
CHANGED
|
@@ -50,7 +50,7 @@ function logEvent(event, fields) {
|
|
|
50
50
|
...fields,
|
|
51
51
|
}) + '\n';
|
|
52
52
|
fs.appendFileSync(path.join(dir, 'plugkit.jsonl'), line);
|
|
53
|
-
} catch (_) {}
|
|
53
|
+
} catch (e) { try { console.error('[supervisor] logEvent write failed:', e); } catch (_) {} }
|
|
54
54
|
}
|
|
55
55
|
|
|
56
56
|
function writeSupervisorStatus(state, extra) {
|
|
@@ -69,14 +69,7 @@ function pidAlive(pid) {
|
|
|
69
69
|
try { process.kill(pid, 0); return true; } catch (_) { return false; }
|
|
70
70
|
}
|
|
71
71
|
|
|
72
|
-
// Single-instance guard. findSupervisorPid (skill-bootstrap.js) reads .supervisor.pid to early-return
|
|
73
|
-
// when a supervisor is already running; without it every bootstrap spawns a duplicate supervisor,
|
|
74
|
-
// and duplicates spawn duplicate watchers that lock-fight in an endless spawn-reject churn. Write the
|
|
75
|
-
// pid file on startup and refuse to start if a live peer already holds it.
|
|
76
72
|
function acquireSingleInstance() {
|
|
77
|
-
// Atomic via O_EXCL ('wx'): exclusive-create fails if the file exists, so when N supervisors
|
|
78
|
-
// race to start in the same instant exactly one wins. A plain existsSync->write is TOCTOU and
|
|
79
|
-
// lets a concurrent burst all pass, which is the duplicate-supervisor churn this guards against.
|
|
80
73
|
for (let attempt = 0; attempt < 2; attempt++) {
|
|
81
74
|
try {
|
|
82
75
|
const fd = fs.openSync(SUPERVISOR_PID_PATH, 'wx');
|
|
@@ -87,13 +80,6 @@ function acquireSingleInstance() {
|
|
|
87
80
|
let other = NaN;
|
|
88
81
|
try { other = parseInt(fs.readFileSync(SUPERVISOR_PID_PATH, 'utf-8').trim(), 10); } catch (_) {}
|
|
89
82
|
if (Number.isFinite(other) && other !== process.pid && pidAlive(other)) {
|
|
90
|
-
// An alive holder pid is not the same as a working holder: a wedged supervisor
|
|
91
|
-
// (event loop stuck, watcher dead, neither .supervisor.json nor .status.json
|
|
92
|
-
// advancing) blocks every newcomer forever under a pidAlive-only check, forcing
|
|
93
|
-
// manual process kills to recover the spool. Discriminate by progress, not
|
|
94
|
-
// liveness: holder is wedged only when its own status heartbeat AND the spool
|
|
95
|
-
// status are both stale past the takeover window, honoring a future busy_until
|
|
96
|
-
// exactly as checkWatcherHealth does.
|
|
97
83
|
const TAKEOVER_STALE_MS = 45_000;
|
|
98
84
|
const now = Date.now();
|
|
99
85
|
let supTs = 0;
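The comment removed here described the takeover rule: a live holder pid counts as wedged only when both its own status heartbeat and the spool status are stale past the takeover window, and a future `busy_until` always wins. The predicate below is a sketch of that rule under stated assumptions; `holderIsWedged` and its parameters are hypothetical, and only `TAKEOVER_STALE_MS = 45_000` comes from the hunk.

```javascript
const TAKEOVER_STALE_MS = 45_000; // value shown in the hunk

// Hypothetical predicate; parameter names follow the removed comment, not verified code.
function holderIsWedged(nowMs, supervisorTs, spoolTs, busyUntil) {
  // A future busy_until means a long synchronous verb is running: busy, not wedged.
  if (busyUntil && busyUntil > nowMs) return false;
  const supervisorStale = nowMs - supervisorTs > TAKEOVER_STALE_MS;
  const spoolStale = nowMs - spoolTs > TAKEOVER_STALE_MS;
  // Progress on either heartbeat is enough to leave the holder alone.
  return supervisorStale && spoolStale;
}
```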
|
|
@@ -119,7 +105,6 @@ function acquireSingleInstance() {
|
|
|
119
105
|
try { spawnSync('taskkill', ['/F', '/T', '/PID', String(other)], { stdio: 'ignore', windowsHide: true, timeout: 3000 }); } catch (_) {}
|
|
120
106
|
}
|
|
121
107
|
}
|
|
122
|
-
// Holder is dead/stale/wedged: remove and retry the exclusive create once.
|
|
123
108
|
try { fs.unlinkSync(SUPERVISOR_PID_PATH); } catch (_) {}
|
|
124
109
|
continue;
|
|
125
110
|
}
|
|
@@ -298,10 +283,6 @@ function checkWatcherHealth() {
|
|
|
298
283
|
return;
|
|
299
284
|
}
|
|
300
285
|
const now = Date.now();
|
|
301
|
-
// A long synchronous verb (git_finalize's ~30s network push, a chromium spawn)
|
|
302
|
-
// blocks the heartbeat write. The verb advertises busy_until in .status.json; while
|
|
303
|
-
// that is in the future the watcher is busy, not hung -- reaping it kills the verb
|
|
304
|
-
// mid-flight (the VERB ABORT). Honor busy_until exactly as the agent boot probe does.
|
|
305
286
|
if (status.busy_until && status.busy_until > now) {
|
|
306
287
|
return;
|
|
307
288
|
}
|
|
@@ -320,10 +301,6 @@ function checkWatcherHealth() {
|
|
|
320
301
|
}
|
|
321
302
|
return;
|
|
322
303
|
}
|
|
323
|
-
// A published wrapper-only fix (no wasm version bump) lands in ~/.gm-tools via the next
|
|
324
|
-
// bootstrap's ensureWrapperFresh, but a healthy running watcher keeps the old wrapper until it
|
|
325
|
-
// restarts. On wrapper_sha drift (watcher's reported sha != on-disk), recycle so the fix goes
|
|
326
|
-
// live without a manual kill. busy_until already returned above, so the watcher is not mid-verb.
|
|
327
304
|
const reported = status.wrapper_sha || null;
|
|
328
305
|
const onDisk = wrapperSha12OnDisk();
|
|
329
306
|
if (reported && onDisk && reported !== onDisk) {
|
|
@@ -339,13 +316,6 @@ function checkWatcherHealth() {
|
|
|
339
316
|
}
|
|
340
317
|
return;
|
|
341
318
|
}
|
|
342
|
-
// The watcher reads the wasm's embedded instance_version at load and compares it to the
|
|
343
|
-
// plugkit.version text file (file_version), exposing version_drifted when they disagree.
|
|
344
|
-
// This catches the case where the version text was bumped (e.g. ensureReady's remote-latest
|
|
345
|
-
// override) but the cached plugkit.wasm bytes are a different build -- the text claims 635
|
|
346
|
-
// while the binary embeds 634, so ensureReady's text-only drift check never re-downloads.
|
|
347
|
-
// On that drift, evict the stale cached wasm so the next bootstrap fails isReady() and
|
|
348
|
-
// redownloads the correct build, then recycle the child to load it.
|
|
349
319
|
if (status.version_drifted === true) {
|
|
350
320
|
if (now - lastVersionDriftActionAt < VERSION_DRIFT_COOLDOWN_MS) {
|
|
351
321
|
return;
|
package/gm.json
CHANGED
package/lang/ssh.js
CHANGED
|
@@ -146,7 +146,6 @@ module.exports = {
|
|
|
146
146
|
if (!cmd) return '[no command provided]';
|
|
147
147
|
const target = loadTarget(targetName);
|
|
148
148
|
|
|
149
|
-
// Detect background-only commands (fire-and-forget: ends with & or uses nohup/systemd-run)
|
|
150
149
|
const isBackground = /(&\s*$|^\s*(nohup|systemd-run|setsid)\s)/m.test(cmd);
|
|
151
150
|
|
|
152
151
|
if (isBackground) {
|
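The `isBackground` test in `lang/ssh.js` relies on a multiline regex, so it is worth seeing which commands it classifies as fire-and-forget. The sketch below copies the regex from the hunk into a hypothetical helper (`isBackgroundCommand`); the sample commands are illustrative.

```javascript
// Regex copied from the hunk; the helper name is hypothetical.
const BACKGROUND_RE = /(&\s*$|^\s*(nohup|systemd-run|setsid)\s)/m;

function isBackgroundCommand(cmd) {
  return BACKGROUND_RE.test(cmd);
}

const samples = ['sleep 100 &', 'nohup ./server', 'ls -la', 'make && make install'];
for (const cmd of samples) {
  console.log(JSON.stringify(cmd), isBackgroundCommand(cmd));
}
```

Because of the `m` flag, a trailing `&` on any line of a multi-line command matches, not only on the last line.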
package/lib/spool.js
CHANGED
|
@@ -49,7 +49,7 @@ function writeSpool(body, lang = 'nodejs', options = {}) {
|
|
|
49
49
|
fs.mkdirSync(inDir, { recursive: true });
|
|
50
50
|
|
|
51
51
|
const sessionId = options.sessionId || process.env.CLAUDE_SESSION_ID;
|
|
52
|
-
const code = sessionId ? `const SESSION_ID =
|
|
52
|
+
const code = sessionId ? `const SESSION_ID = ${JSON.stringify(sessionId)};\n${body}` : body;
|
|
53
53
|
|
|
54
54
|
fs.writeFileSync(inFile, code, 'utf8');
|
|
55
55
|
|
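The added line in `writeSpool` embeds the session id through `JSON.stringify`, which produces a valid JavaScript string literal whatever characters the id contains. The removed line is truncated in this diff, so the sketch below only demonstrates the new behaviour; `prependSessionId` is a hypothetical helper mirroring the added expression.

```javascript
// Hypothetical helper mirroring the added line in writeSpool.
function prependSessionId(body, sessionId) {
  return sessionId ? `const SESSION_ID = ${JSON.stringify(sessionId)};\n${body}` : body;
}

// An id containing a double quote and a newline still yields a valid string literal.
const generated = prependSessionId('return SESSION_ID;', 'abc"\n123');
const roundTripped = new Function(generated)();
console.log(roundTripped === 'abc"\n123');
```

Evaluating the generated source returns the original id unchanged, which is the property that matters when the prefix is written into a spool file and executed later.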
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "gm-skill",
|
|
3
|
-
"version": "2.0.
|
|
3
|
+
"version": "2.0.1578",
|
|
4
4
|
"description": "Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation. Install in any AI coding agent host.",
|
|
5
5
|
"author": "AnEntrypoint",
|
|
6
6
|
"license": "MIT",
|