gm-plugkit 2.0.1522 → 2.0.1524
- package/SKILL.md +13 -13
- package/bootstrap.js +4 -3
- package/cli.js +5 -5
- package/package.json +1 -1
- package/plugkit-wasm-wrapper.js +51 -22
- package/supervisor.js +1 -1
package/SKILL.md
CHANGED
@@ -6,11 +6,11 @@ allowed-tools: Skill, Read, Write, Bash(bun *), Bash(npx *)
# gm
-
**A turn that ends without a tool call is a stop, and stopping mid-chain is the cardinal failure.** Every programming agent reads only tool calls and their outputs; an assistant message that ends in prose with no tool call IS the turn ending, and the session halts there. So while the chain is in-flight (phase
+
**A turn that ends without a tool call is a stop, and stopping mid-chain is the cardinal failure.** Every programming agent reads only tool calls and their outputs; an assistant message that ends in prose with no tool call IS the turn ending, and the session halts there. So while the chain is in-flight (phase != COMPLETE OR prd_pending_count > 0) you NEVER end a turn in prose, every turn ends in a tool call that advances the chain (the next `instruction` dispatch, the next verb the prose named, the next `transition`). You do NOT summarize, you do NOT write a "here's what I did" wrap-up, you do NOT narrate closure, summary is a stop, and stopping is only authorized when plugkit itself returns `phase=COMPLETE` AND `prd_pending_count=0`. A turn-final sentence that names the next move instead of making it is the same stop facing forward -- announcing a read, a verb, a re-dispatch is not doing it, and the chain strands exactly where the prose pointed. Take the move you were about to describe. Surfacing a decision is a tool call too (`AskUserQuestion` or `prd-add`), never a prose-only "confirming direction." Before you are ever tempted to stop, you dispatch `phase-status` and read it: if it is not terminal, that temptation was a drift signal, dispatch `instruction` and keep walking. This is tool-agnostic by construction: it depends on nothing but the verb spool, so it holds identically on every agent, with no hook and no tool-specific feature. The one and only thing that authorizes the prose-only turn is plugkit's COMPLETE pronouncement; until then, the answer to "am I done?" is another tool call.
**Done is what plugkit says is done, never your claim.** The COMPLETE gate is the single arbiter. If the chain is not at COMPLETE, there is a next transition to seek; idle mid-chain is a deviation.
-
**Standing approval
+
**Standing approval -- finish everything, every session, at maximum granularity.** Every further session is pre-approved to keep going until ALL requested work is done: the plan as written, plus every addition that joins the plan along the way, plus any unfinished or newly-discovered work the run surfaces. You do not stop to ask whether to continue, you do not defer reachable work to a hypothetical future session, and you do not narrow scope to a convenient slice. Plan at maximum granularity (one PRD row per real unit of work; `every possible` applied to every noun) and walk the whole closure. The only admissible stop is plugkit's COMPLETE pronouncement with `prd_pending_count=0` after the full closure is shipped. `blockedBy: external` is reserved for authority you genuinely cannot reach (another team's repo you lack rights to, a hardware credential, a product decision owned by someone not in-session) -- never for work that is merely large, tedious, spread across many components, or contended by a concurrent writer (rebase and land alongside them). Deferring reachable work behind a coordination excuse is drift; the standing instruction is to finish it.
**Every possible action begins and ends with `instruction`.** When in doubt, dispatch instruction. When denied, dispatch instruction. When the next move is unclear, dispatch instruction. There is no other recovery primitive and there is no situation in which improvising beats re-reading the prose.
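The stop rule in this hunk reduces to a two-field predicate. A minimal sketch, assuming the status arrives as an object with `phase` and `prd_pending_count` fields: the field names appear in the text, but the object shape and the helper names `isChainInFlight` and `mayEndTurnInProse` are illustrative, not plugkit's API.

```javascript
// Hypothetical helpers: field names come from SKILL.md, the object shape is assumed.
function isChainInFlight(status) {
  // In flight means: phase != COMPLETE OR prd_pending_count > 0.
  return status.phase !== 'COMPLETE' || status.prd_pending_count > 0;
}

// A prose-only turn is admissible only once the chain is no longer in flight.
function mayEndTurnInProse(status) {
  return !isChainInFlight(status);
}
```

Under this reading, a `COMPLETE` phase with pending PRD rows still counts as in flight, so the turn must end in a tool call.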
@@ -18,17 +18,17 @@ allowed-tools: Skill, Read, Write, Bash(bun *), Bash(npx *)
This is the only thing that makes the discipline work. Drop this and every possible other rule collapses: mutables get resolved without witness, COMPLETE gets claimed without VERIFY, residuals get narrated away instead of scanned, and the chain becomes a story you tell instead of work you ship.
-
Every turn: dispatch `instruction` (you are the one dispatching it), read the response body, follow the imperative prose, dispatch the next verb the prose names. Re-dispatch `instruction` against every possible drift, stall, gate-denial, or moment of uncertainty about the next move, it is the cheap synchronous recovery primitive that puts you back on the chain. While the chain is in-flight (phase
+
Every turn: dispatch `instruction` (you are the one dispatching it), read the response body, follow the imperative prose, dispatch the next verb the prose names. Re-dispatch `instruction` against every possible drift, stall, gate-denial, or moment of uncertainty about the next move, it is the cheap synchronous recovery primitive that puts you back on the chain. While the chain is in-flight (phase != COMPLETE OR prd_pending_count > 0) there is no cost to over-dispatching it and unbounded cost to acting without it. A session that stops dispatching instruction mid-chain has stopped walking the chain. The phase-specific discipline lives in plugkit's instruction tables; this file does not duplicate it. What this file does is name the load-bearing identity: **you are the state machine, plugkit is your scratchpad and gate, no one else is going to walk the chain for you.**
-
**Once `phase=COMPLETE` AND `prd_pending_count=0`, the chain is closed and you stop dispatching.** Polling `instruction` on a terminal chain returns the same UPDATE-DOCS prose every time and produces `turn.end dispatches:1 verbs:{instruction:1}` events in gmsniff that mark the agent as polling rather than walking. The user reactivates the chain by sending a new prompt; that prompt carrying a fresh `{"prompt": "..."}` body is the intent to reset phase to PLAN. The reset is not always automatic: if your first `instruction` dispatch on intended-new work still returns `phase=COMPLETE` / UPDATE-DOCS prose, dispatch `transition to=PLAN` **once** to reopen the chain
+
**Once `phase=COMPLETE` AND `prd_pending_count=0`, the chain is closed and you stop dispatching.** Polling `instruction` on a terminal chain returns the same UPDATE-DOCS prose every time and produces `turn.end dispatches:1 verbs:{instruction:1}` events in gmsniff that mark the agent as polling rather than walking. The user reactivates the chain by sending a new prompt; that prompt carrying a fresh `{"prompt": "..."}` body is the intent to reset phase to PLAN. The reset is not always automatic: if your first `instruction` dispatch on intended-new work still returns `phase=COMPLETE` / UPDATE-DOCS prose, dispatch `transition to=PLAN` **once** to reopen the chain -- this is opening new work the user authorized, NOT a `complete-chain-poll`. Do not re-dispatch `instruction`/`phase-status` to "re-confirm" a terminal chain (that is the poll deviation); the single admissible reopen is the `transition to=PLAN`.
-
**Client-side edits are gated by Browser Witness (paper
+
**Client-side edits are gated by Browser Witness (paper section 23, hard rule).** If you Write or Edit any file with a client-side extension, `.html`, `.js`, `.jsx`, `.ts`, `.tsx`, `.vue`, `.svelte`, `.mjs`, `.css`, or anything loaded from an HTML entry, you dispatch the `browser` verb in the **same turn** with a `page.evaluate` body that asserts the invariant the edit establishes. The transition gate refuses `transition to=COMPLETE` until `.turn-browser-witnessed` covers every entry in `.turn-browser-edits.json` with matching sha; absence or mismatch fires `deviation.client-edit-no-witness`. There is no "validate later", later does not arrive in the chain you are walking; the same response that contains the client-side Write/Edit also contains the `browser` Write + Read.
-
**The live page is the debugger
+
**The live page is the debugger -- expose globals, evaluate in-browser, never blind-restart.** To debug client-side code you expose the relevant state as a `window.*` global and read it live through the `browser` verb's `page.evaluate`, running experiments IN the browser, rather than blind experimentation paired with continuous server restarts. The restart-and-eyeball loop observes almost nothing per cycle and burns a turn each time; a global plus one `page.evaluate` reads the actual runtime state in a single dispatch and runs the experiment against the real page. When a client behavior is unclear, surface the state as a global, evaluate it live, assert against what the page actually holds -- do not restart the server and guess from rendered output. The same `browser` surface that witnesses an edit also diagnoses it.
-
**Search routes through the spool, never a platform search agent.** For any code, file, or symbol search
+
**Search routes through the spool, never a platform search agent.** For any code, file, or symbol search -- whereabouts, "where is X defined", "what calls Y", grepping the tree -- you dispatch the `codesearch` verb (`.gm/exec-spool/in/codesearch/<N>.txt` with `{"query":"..."}`), and for prior-knowledge you dispatch `recall`. You do NOT reach for the platform's Explore agent, a Task/general-purpose search subagent, raw `grep`/`Glob`, or any host-native code-search; those are not substitutes for `codesearch`, exactly as puppeteer is not a substitute for the `browser` verb. They bypass the spool, the committed code-search index, and the recall-grounded discipline -- the search becomes invisible to gmsniff, ungrounded in what the project already learned, and non-portable across harnesses. The orient fan-out at PLAN is `recall` + `codesearch` in parallel; every ad-hoc lookup mid-EXECUTE is a `codesearch` dispatch too. Reaching outside the spool for search is the same drift as reaching outside it for the browser: the capability exists as a verb, so you use the verb.
-
**This is one instance of a class rule: every platform-native capability that has a plugkit verb is forbidden in favor of the verb.** Your `allowed-tools` already blocks raw shell beyond the boot commands, but a harness can still offer the capability as its own first-class tool or subagent that slips past that restriction
+
**This is one instance of a class rule: every platform-native capability that has a plugkit verb is forbidden in favor of the verb.** Your `allowed-tools` already blocks raw shell beyond the boot commands, but a harness can still offer the capability as its own first-class tool or subagent that slips past that restriction -- a search/Explore agent, a web-fetch or web-search tool, a plan/architect subagent, a notebook editor. For each, the plugkit verb is the only admissible surface: code/file/symbol search -> `codesearch`; prior knowledge -> `recall`; fetching a URL or searching the web -> the `fetch` verb (`.gm/exec-spool/in/fetch/`); running code -> `exec_js` / the exec spool; a real browser -> the `browser` verb; persisting memory -> `memorize-fire`; **any git operation -> the git verbs** (`git_status`/`git_log`/`git_diff`/`git_show`/`git_branch` to inspect, `git_add`/`git_commit`/`git_finalize`/`git_push` to stage-commit-push, `git_checkout`/`git_fetch`/`git_rm`/`git_revert`/`git_reset` to mutate) -- `git_finalize {message}` bundles add->commit->porcelain-gate->push in ONE dispatch and is the COMPLETE-phase push surface, so you never shell `git` via Bash and never spend 4 tool-use events on what is one verb; a `bash`/`sh`/`powershell` body that invokes git is gated (`deviation.bash-git-bypass`). The native tool is never the substitute, for the same three reasons every time: it bypasses the spool (invisible to the ledger), it bypasses the project's committed index and learned memory (ungrounded), and it is non-portable across harnesses (a different agent host has a different native tool, so a discipline built on the native tool does not transport -- only the verb does). When you reach for any capability, the question is not "what tool does my platform give me" but "what verb does plugkit expose for this"; if a verb exists, the native tool is off-limits, and if no verb exists the gap is a missing verb to add, not a license to reach around the spool.
**Boot before dispatching. Always check first.** Writing to `.gm/exec-spool/in/instruction/N.txt` while the watcher is dead is the canonical cold-start failure, the request sits forever, you read no response, you fabricate the chain from memory of the prose. The spool directory's existence does NOT mean the watcher is alive; `.status.json` mtime within the last 15s does. The leftover `.status.json` from yesterday's dead watcher is the most common trap.
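A dispatch, as this hunk describes it, is a file written under `.gm/exec-spool/in/<verb>/<N>.txt`. A rough sketch of that write, assuming the request body is JSON (as in the `codesearch` example) and that the caller chooses `N`. `writeSpoolRequest` is a hypothetical helper, not part of gm-plugkit, and it does not check whether the watcher is alive.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper; the directory layout follows the paths quoted in SKILL.md.
function writeSpoolRequest(projectDir, verb, n, body) {
  const dir = path.join(projectDir, '.gm', 'exec-spool', 'in', verb);
  fs.mkdirSync(dir, { recursive: true });
  const file = path.join(dir, `${n}.txt`);
  // Node writes strings as UTF-8 without a BOM when given the 'utf8' encoding.
  fs.writeFileSync(file, JSON.stringify(body), 'utf8');
  return file;
}
```

A call such as `writeSpoolRequest(process.cwd(), 'codesearch', 1, { query: 'where is capHitText defined' })` would only be meaningful after the watcher has been confirmed alive, per the rule above.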
@@ -52,7 +52,7 @@ bun x gm-plugkit@latest spool > /dev/null 2>&1 &
Never poll the spool dir with `sleep && ls` or `Start-Sleep && Test-Path`, plugkit is synchronous from your view; if the response is not there, the watcher is dead (re-check `.status.json` mtime) or the verb is slow (check `.gm/exec-spool/.watcher.log`), not "still processing."
-
**Dead-watcher recovery is mandatory, never abandon the dispatch.** If two consecutive re-Reads return "file does not exist" AND `.status.json` ts is stale (>15s gap from current epoch) AND `busy_until` is absent or in the past, the watcher is dead. (A future `busy_until` means a long synchronous verb is running, the response will land when it finishes; wait, do not boot.) Your next call is `bun x gm-plugkit@latest spool` to boot a fresh watcher (the wrapper has self-respawn paths now, one boot deploys every queued fix to disk). Then re-dispatch the original verb. Do NOT reach for an alternative tool, puppeteer-core, agent-browser, WebFetch, raw `chrome.exe`, none of these substitute for the `browser` verb. Reaching outside plugkit when the spool surface is reachable orphans state the next session cannot reap, bypasses paper
+
**Dead-watcher recovery is mandatory, never abandon the dispatch.** If two consecutive re-Reads return "file does not exist" AND `.status.json` ts is stale (>15s gap from current epoch) AND `busy_until` is absent or in the past, the watcher is dead. (A future `busy_until` means a long synchronous verb is running, the response will land when it finishes; wait, do not boot.) Your next call is `bun x gm-plugkit@latest spool` to boot a fresh watcher (the wrapper has self-respawn paths now, one boot deploys every queued fix to disk). Then re-dispatch the original verb. Do NOT reach for an alternative tool, puppeteer-core, agent-browser, WebFetch, raw `chrome.exe`, none of these substitute for the `browser` verb. Reaching outside plugkit when the spool surface is reachable orphans state the next session cannot reap, bypasses paper section 23 witness gates, and ages the project's discipline. The recovery is always: notice dead -> boot -> re-dispatch. The full chain from spool-write to disk-Read-success is the only admissible loop; any short-circuit produces unreconcilable state.
When writing the spool input from PowerShell, pass `-Encoding utf8` (or use `[System.IO.File]::WriteAllText($path, $body)` which defaults to UTF-8 no-BOM). PowerShell 5.1's default `Out-File` / `Set-Content` write UTF-16 LE with BOM, which the watcher detects and re-decodes (`spool.body-encoding-recoded` event in gmsniff), but the deviation is a fingerprint of an instruction you missed. Use `bash -c "echo -n '...' > ..."` or `Write` tool instead when the body is structured JSON.
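The dead-watcher test in this hunk combines three conditions: two consecutive missing reads, a stale `ts`, and no future `busy_until`. A minimal sketch of that conjunction, assuming `ts` and `busy_until` are epoch milliseconds in the parsed `.status.json` (the source does not state the unit). `isWatcherDead` is an illustrative name, not a plugkit function.

```javascript
const STALE_MS = 15_000; // the 15s freshness window named in SKILL.md

// Hypothetical helper. `status` is the parsed .status.json, or null if it could not be read.
// Assumption: `ts` and `busy_until` are epoch milliseconds.
function isWatcherDead(status, missingReads, nowMs) {
  // A future busy_until means a long synchronous verb is running: wait, do not boot.
  const busy = typeof status?.busy_until === 'number' && status.busy_until > nowMs;
  // A missing or old timestamp counts as stale.
  const stale = typeof status?.ts !== 'number' || nowMs - status.ts > STALE_MS;
  return missingReads >= 2 && stale && !busy;
}
```

When this returns `true`, the recovery described above applies: boot a fresh watcher, then re-dispatch the original verb.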
@@ -66,13 +66,13 @@ The chain is not COMPLETE until your changes are on origin. Commit and push at t
**Apply "every possible" to every noun.** PLAN is exhaustive, not minimal. For every noun the request names, you write every possible task, every possible validation, every possible mutable, every possible corner case, every possible caveat, every possible failure mode, every possible empty/overflow/reentry/degenerate state as PRD rows. Single-digit PRDs on a non-trivial request mean you stopped enumerating before the disposition finished, re-orient and re-enumerate. After the first pass, your existing list is input to a second pass: for each row, what every possible corner case looks like becomes additional rows. The expansion closes when applying "every possible" yields nothing new, not when you feel done. Long-horizon prompts routinely produce PRDs in the high tens or hundreds. Density at PLAN is the only protection against silent residuals at COMPLETE.
-
**Sweep every possible aspect for jank, and every aspect is a PRD row.** At PLAN, for every surface the user's prompt concerns, you add to the PRD a complete enumeration of every possible aspect that can be checked for jank
+
**Sweep every possible aspect for jank, and every aspect is a PRD row.** At PLAN, for every surface the user's prompt concerns, you add to the PRD a complete enumeration of every possible aspect that can be checked for jank -- every immaturity, every unfinished edge, every half-wired path -- across gui, ux, ui, client state, server state, the client/server state boundary, and any other surface the request reaches. `jank` is the load-bearing word: you hunt the rough, the unpolished, the almost-done, not just outright bugs. Each aspect that must be improved or validated is its own PRD row, including a profiling row and a security row for every surface that can have them. The sweep is scoped to what the prompt concerns and its reachable closure, not an unbounded repo-wide audit -- but within that closure it is exhaustive. Every issue you find along the way opens its own debug-and-repair plan, spooled to the PRD as rows the same turn, never handled inline-and-forgotten; every outstanding quick improvement is spooled too. Fan out for the sweep: parallel spool dispatches (many `prd-add`/`codesearch`/`exec_js` in one block) and plugkit's own task-spawn surface are the fan-out shape, never the platform's native Task/Explore subagent (that is the forbidden search bypass). You fan out subagents for everything that parallelizes.
-
**One tell-tale AI design element spawns a full-codebase sweep.** If any tell-tale AI design element is found along the way
+
**One tell-tale AI design element spawns a full-codebase sweep.** If any tell-tale AI design element is found along the way -- the boilerplate flourish, the over-hedged comment, the generic scaffold name, the unmistakable machine-authored shape -- you set up a full-sweep plan that scans every possible part of the codebase for any other tell-tale AI design element, finding them and fixing them across the board. A single sighting is never a one-off local fix: it is the witness that the same shape is likely elsewhere, so it spawns a complete codebase-wide resolution run, spooled to the PRD as its own rows (one for the scan, one per cluster of findings, one for the fix-and-verify), and fanned out across the tree. The sweep is exhaustive -- every possible file, every possible surface -- because a tell-tale left standing anywhere is the tell that the whole was machine-shaped.
**Graphical symbols are forbidden; convert them to industry-standard text on sight.** Decorative glyphs have no place in output or source: arrow glyphs, box and geometric glyphs, stars, filled or hollow dots and bullets, checkmarks and crosses, emojis, and any non-ASCII decorative symbol are a machine-shaped tell. The moment you see one anywhere, you convert it to the industry-standard ASCII equivalent in the same turn: an arrow glyph becomes `->`, a bullet glyph becomes `-` or `*`, a checkmark or cross becomes `[x]`/`[ ]` or the plain words done/todo/pass/fail, a status dot becomes the word it means. This is one more instance of the tell-tale-AI-design class: a single sighting spawns the full-codebase sweep, not a one-off local edit. The exemptions are narrow and concrete: functional code operators (`=>`, `??`, `?.`, comparison and math) are not decorative; historical changelog and git-log entries are frozen; a binary store is not text; an intentional icon-font or CSS-content glyph that is real product design stays. Everything else converts the instant it is found.
-
**Treat the architecture as pliable.** `pliable` is the load-bearing word: the architecture is not fixed, it is reshapeable, and every possible architectural change that clearly improves it or clearly reduces the code-maintenance burden is a PRD plan you spool. Replacing bespoke code with native functionality or a very-popular, well-maintained library is encouraged
+
**Treat the architecture as pliable.** `pliable` is the load-bearing word: the architecture is not fixed, it is reshapeable, and every possible architectural change that clearly improves it or clearly reduces the code-maintenance burden is a PRD plan you spool. Replacing bespoke code with native functionality or a very-popular, well-maintained library is encouraged -- but only when it reduces the codebase, a net-smaller shipped-and-maintained surface. Adding a heavy dependency to delete a few lines net-grows the maintenance surface and is the failure mode this rule guards against; check first whether a published library already provides the surface, and never carry a drift-prone local reimplementation of an upstream solution. You make every improvement that is clearly outstanding.
**Noticing is a planning event.** At any phase, in any dispatch window, anything you observe that should be done, anything outstanding, anything unfinished, anything improvable, anything misaligned with user preferences, you dispatch `prd-add` for this turn. Observations carried only in your response body evaporate; only the PRD store survives. The default response to noticing is to convert. "I'll mention it in the summary" / "future work" / "note for later" are the drift signatures, the observation does not persist, the turn does not return, the residual goes silent. Density grows along the walk, not just at PLAN-time. When you observe structural improvements ("X has no test coverage", "Y is not documented", "Z violates the residual-triage rule"), each becomes its own PRD row with the witness that motivated it.
@@ -82,7 +82,7 @@ The chain is not COMPLETE until your changes are on origin. Commit and push at t
Response body is not a mutation surface either. Memory writes route through `memorize-fire`; tool ops route through their spool verbs. Narration in the response is for the user, never as the persistence mechanism.
-
**Suppress mundane output; strip it to the bone.** Every possible mundane line of user-facing text is suppressed or cut to the bone
+
**Suppress mundane output; strip it to the bone.** Every possible mundane line of user-facing text is suppressed or cut to the bone -- drop articles, drop preamble, drop the play-by-play. Boot-probe narration, dispatch echoes, "now I'll read the response", restating the prose you just read, status recaps -- none of it ships to the user. What survives is substantive: a real finding, a decision and its one-line reason, a blocker, the single-line PRD-read declaration the PLAN prose requires. Terse means fewer and shorter words, NEVER zero tool calls and NEVER silent work -- a turn still ends in the tool call that advances the chain (the cardinal rule is untouched), and you still state in one terse clause what you are about to do before the first tool call. You cut the mundane, you do not cut the chain. The target for every user-facing response is the tersest achievable form, emitted only when words are absolutely needed: if the tool calls carry the meaning, the prose shrinks to near-zero. A finding, a decision and its one-line reason, a blocker -- those earn words; nothing else does.
**Prune bad memory on sight, a wrong recall hit is worse than a miss.** When a `recall` or `auto_recall` hit is stale, superseded, or wrong, you dispatch `memorize-prune` with `{key}` (the hit's key) to delete it, text and embedding both. Preserving a bad memory poisons every future recall that surfaces it; pruning it costs one dispatch. Pruning bad memory matters more than preserving good memory. For an uncertain set, dispatch `memorize-prune {query}` to get review-only candidates, judge them, then re-dispatch with the stale `{keys:[...]}`, never a blind similarity-delete.
package/bootstrap.js
CHANGED
@@ -567,7 +567,7 @@ async function bootstrap(opts) {
     }
     log('sha256 verified');
   } else {
-    log('no sha256 manifest
+    log('no sha256 manifest -- skipping verify');
   }
 
   try { fs.renameSync(partialPath, finalPath); }
@@ -808,7 +808,7 @@ async function ensureReady(opts) {
       instruction: `gm-plugkit running ${selfStale.own} but npm has ${selfStale.latest}. The npx/bun cache served a stale copy. Clear the cache so the next invocation picks up the latest wrapper fixes: bun pm cache rm; or npx clear-npx-cache; or rm -rf ~/.npm/_npx ~/AppData/Local/npm-cache/_npx`,
     };
     try { fs.writeFileSync(path.join(spoolDir, '.gm-plugkit-stale.json'), JSON.stringify(marker, null, 2)); } catch (_) {}
-    log(`gm-plugkit self-stale: running ${selfStale.own}, latest npm ${selfStale.latest}
+    log(`gm-plugkit self-stale: running ${selfStale.own}, latest npm ${selfStale.latest} -- cache served old code (marker at .gm/exec-spool/.gm-plugkit-stale.json)`);
   } else if (selfStale && selfStale.own) {
     try {
       const projectDir = process.env.CLAUDE_PROJECT_DIR || process.cwd();
@@ -914,7 +914,7 @@ function startSpoolDaemon() {
   try {
     const wrapper = path.join(gmToolsDir(), 'plugkit-wasm-wrapper.js');
     if (!fs.existsSync(wrapper)) {
-      return { ok: false, error: `wrapper not at ${wrapper}
+      return { ok: false, error: `wrapper not at ${wrapper} -- ensureReady() must run first` };
     }
     const runtime = resolveNodeRuntime();
     const projectDir = process.env.CLAUDE_PROJECT_DIR || process.cwd();
@@ -968,6 +968,7 @@ function startSpoolDaemon() {
 module.exports = {
   bootstrap,
   ensureReady,
+  gmToolsDir,
   getWasmPath,
   getBinaryPath,
   startSpoolDaemon,
package/cli.js
CHANGED
@@ -5,9 +5,9 @@ const fs = require('fs');
 const os = require('os');
 const path = require('path');
 const cp = require('child_process');
-const { ensureReady, startSpoolDaemon } = require('./bootstrap');
+const { ensureReady, startSpoolDaemon, gmToolsDir } = require('./bootstrap');
 
-const usage = `gm-plugkit
+const usage = `gm-plugkit -- Bootstrap and daemon-spawn for gm plugkit binary.
 
 Usage:
   bun x gm-plugkit@latest    Bootstrap + start spool daemon
@@ -24,7 +24,7 @@ Usage:
 
 function readDiskWasmVersion() {
   try {
-    const versionFile = path.join(
+    const versionFile = path.join(gmToolsDir(), 'plugkit.version');
     return fs.readFileSync(versionFile, 'utf-8').trim() || null;
   } catch (_) { return null; }
 }
@@ -47,9 +47,9 @@ function readWatcherInstanceVersion(pid) {
 
 function killStaleWatchers() {
   try {
-    const wrapperPath = path.join(
+    const wrapperPath = path.join(gmToolsDir(), 'plugkit-wasm-wrapper.js');
     if (!fs.existsSync(wrapperPath)) {
-      console.log(JSON.stringify({ ok: false, error:
+      console.log(JSON.stringify({ ok: false, error: `wrapper not installed at ${wrapperPath}` }));
       return 1;
     }
     const diskMtime = fs.statSync(wrapperPath).mtimeMs;
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "gm-plugkit",
-  "version": "2.0.
+  "version": "2.0.1524",
   "description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
   "main": "index.js",
   "bin": {
package/plugkit-wasm-wrapper.js
CHANGED
@@ -297,7 +297,7 @@ function capHitText(hits, maxLen, maxCount) {
   if (!Array.isArray(hits)) return hits;
   return hits.slice(0, maxCount).map((h) => {
     if (!h || typeof h !== 'object' || typeof h.text !== 'string' || h.text.length <= maxLen) return h;
-    return { ...h, text: h.text.slice(0, maxLen) + '
+    return { ...h, text: h.text.slice(0, maxLen) + '...[+' + (h.text.length - maxLen) + 'ch]' };
   });
 }
 
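To see what the new truncation suffix produces, the function can be reassembled from the hunk. This is a sketch built only from the lines shown; any lines of `capHitText` outside the hunk are not visible here and may differ.

```javascript
// Reassembled from the hunk above; lines outside the hunk are not shown in the diff.
function capHitText(hits, maxLen, maxCount) {
  if (!Array.isArray(hits)) return hits;
  return hits.slice(0, maxCount).map((h) => {
    // Non-objects, hits without string text, and short texts pass through unchanged.
    if (!h || typeof h !== 'object' || typeof h.text !== 'string' || h.text.length <= maxLen) return h;
    // Truncate and append an ASCII marker carrying the number of dropped characters.
    return { ...h, text: h.text.slice(0, maxLen) + '...[+' + (h.text.length - maxLen) + 'ch]' };
  });
}

console.log(capHitText([{ key: 'a', text: 'abcdefghij' }], 4, 5));
// prints [ { key: 'a', text: 'abcd...[+6ch]' } ]
```

The `key` field in the sample hit is illustrative; the function copies whatever other fields a hit carries.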
@@ -346,7 +346,7 @@ function turnTick(sess, verb, taskBase, phase, prdPending) {
   const key = sess || '(no-session)';
   const now = Date.now();
   let t = _turns.get(key);
-  // Any verb arriving after an idle gap closes the stale turn
+  // Any verb arriving after an idle gap closes the stale turn -- not just instruction.
   // Otherwise a non-instruction verb (prd-add, mutable-resolve, transition) landing
   // after an overnight sleep stamps t.lastTs forward without splitting, and dur_ms
   // (lastTs - startTs) balloons to wall-clock-with-sleep instead of active work time.
@@ -367,7 +367,7 @@ function turnTick(sess, verb, taskBase, phase, prdPending) {
   }
   t.lastTs = now;
   t.dispatches++;
-  // A verb arriving resumes the turn
+  // A verb arriving resumes the turn -- clear any prior stall flag so a later re-stall
   // is a fresh episode, not silently suppressed by the one-shot guard.
   t.stallEmitted = false;
   t.verbs.set(verb, (t.verbs.get(verb) || 0) + 1);
@@ -376,7 +376,7 @@ function turnTick(sess, verb, taskBase, phase, prdPending) {
 }
 
 // turn.end fires only when a NEW verb arrives after idle, so a turn that simply never
-// receives another verb stays open forever and emits no signal
+// receives another verb stays open forever and emits no signal -- a permanent stall is
 // silence, not an event, which is how a mid-EXECUTE stop stays invisible for days. The
 // heartbeat scan closes that hole: for each open turn idle past STALL_MS whose last phase
 // is non-terminal (or carries open PRD rows), emit turn.stalled once. One-shot per episode
@@ -386,7 +386,7 @@ const STALL_MS = 300_000;
 function scanStalledTurns() {
   const now = Date.now();
   // A long synchronous verb (codesearch index rebuild, chromium spawn) stamps busy_until and
-  // blocks the spool
+  // blocks the spool -- the agent is legitimately waiting, not stalled. Honor it exactly as
   // supervisor.js checkWatcherHealth does, so a busy watcher never emits a false mid-chain-stall.
   if (_lastBusyUntil && _lastBusyUntil > now) return;
   for (const [key, t] of _turns) {
@@ -397,7 +397,7 @@ function scanStalledTurns() {
     if (terminal) continue;
     t.stallEmitted = true;
     // key is the _turns map key (sess || '(no-session)'). When it is the sentinel, the turn was
-    // unattributed, so do not override logEvent's own cwd+sess base fields with '(no-session)'
+    // unattributed, so do not override logEvent's own cwd+sess base fields with '(no-session)' --
     // let the cwd-based attribution stand. Pass an explicit sess only when key is a real session.
     const fields = {
       turn_idx: t.idx,
@@ -946,6 +946,35 @@ function scrubBrowserRunnerText(s) {
   return t;
 }
 
+// Standard OS install locations for a Chrome/Chromium that speaks CDP. Used as a
+// fallback when the managed ms-playwright cache is absent (e.g. cache evicted),
+// so the browser verb keeps working off the system browser instead of failing.
+function findSystemChromiumBinary() {
+  const candidates = process.platform === 'win32'
+    ? [
+        path.join(process.env.PROGRAMFILES || 'C:\\Program Files', 'Google', 'Chrome', 'Application', 'chrome.exe'),
+        path.join(process.env['PROGRAMFILES(X86)'] || 'C:\\Program Files (x86)', 'Google', 'Chrome', 'Application', 'chrome.exe'),
+        path.join(process.env.PROGRAMFILES || 'C:\\Program Files', 'Chromium', 'Application', 'chrome.exe'),
+      ]
+    : process.platform === 'darwin'
+      ? [
+          '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
+          '/Applications/Chromium.app/Contents/MacOS/Chromium',
+        ]
+      : [
+          '/usr/bin/chromium',
+          '/usr/bin/chromium-browser',
+          '/usr/bin/google-chrome',
+          '/usr/bin/google-chrome-stable',
+          '/opt/google/chrome/chrome',
+          '/snap/bin/chromium',
+        ];
+  for (const c of candidates) {
+    try { if (fs.existsSync(c)) return c; } catch (_) {}
+  }
+  return null;
+}
+
 function findInstalledChromiumBinary() {
   try {
     const explicit = process.env.GM_BROWSER_RUNNER_PATH || process.env.PLAYWRITER_BROWSER_PATH;
@@ -982,11 +1011,11 @@ function findInstalledChromiumBinary() {
         }
       }
     }
-    if (found.length === 0) return
+    if (found.length === 0) return findSystemChromiumBinary();
     found.sort((a, b) => b.ver - a.ver);
     return found[0].candidate;
   } catch (_) {
-    return
+    return findSystemChromiumBinary();
   }
 }
 
@@ -1872,7 +1901,7 @@ function makeHostFunctions(instanceRef) {
       ok: false,
       error: 'missing timeoutMs',
       required: 'positive integer milliseconds',
-      paper_ref: '
+      paper_ref: 'section 20',
       received: rawTimeout === undefined ? null : rawTimeout,
     });
   }
@@ -1882,7 +1911,7 @@ function makeHostFunctions(instanceRef) {
       error: 'timeoutMs below floor',
       min: MIN_TIMEOUT_MS,
       received: rawTimeout,
-      paper_ref: '
+      paper_ref: 'section 20',
     });
   }
   const timeoutMs = rawTimeout;
@@ -2573,7 +2602,7 @@ async function runSpoolWatcher(instance, spoolDir) {
     action: 'spawn-replacement-and-exit',
     boot_reason: bootReason,
   });
-  console.error(`[plugkit-wasm] version drift detected: instance=${instV} file=${fileV}
+  console.error(`[plugkit-wasm] version drift detected: instance=${instV} file=${fileV} -- spawning replacement via bun x gm-plugkit@latest spool, waiting for its heartbeat before exiting`);
   let spawnOk = false;
   try {
     const cp = _childProcess;
@@ -2670,7 +2699,7 @@ async function runSpoolWatcher(instance, spoolDir) {
     action: 'spawn-replacement-and-exit',
     boot_reason: bootReason,
   });
-  console.error(`[plugkit-wasm] wrapper.js drift detected
+  console.error(`[plugkit-wasm] wrapper.js drift detected -- spawning replacement directly from installed wrapper then exiting`);
   try {
     const cp = _childProcess;
     const child = cp.spawn(process.execPath, [_wrapperPathInstalled, 'spool'], {
@@ -2910,7 +2939,7 @@ async function runSpoolWatcher(instance, spoolDir) {
     instruction: _isPlannedBoot
       ? `Planned restart: prior watcher exited with reason="${_priorShutdown.reason}". No action required.`
       : (_severity === 'warn'
-        ? 'Prior watcher disappeared with a recent heartbeat
+        ? 'Prior watcher disappeared with a recent heartbeat -- likely a clean shutdown that did not write .shutdown-reason.json. Inspect .watcher.log if recurrent.'
         : 'Prior watcher died without a planned shutdown and without a recent heartbeat. This is treated as a critical failure. Inspect .watcher.log and gm-log/<day>/plugkit.jsonl events supervisor.watcher-exited-unexpectedly + supervisor.heartbeat-stale around the prior_status.ts timestamp to diagnose root cause.'),
   };
   logEvent('plugkit', _isPlannedBoot ? 'watcher.planned-restart' : 'watcher.unplanned-restart', incidentPayload);
@@ -2928,7 +2957,7 @@ async function runSpoolWatcher(instance, spoolDir) {
     if (_isPlannedBoot) {
       console.log(`[plugkit-wasm] planned restart: prior reason="${_priorShutdown.reason}" boot_reason=${_bootReason}`);
     } else {
-      console.error(`[plugkit-wasm] UNPLANNED RESTART detected
+      console.error(`[plugkit-wasm] UNPLANNED RESTART detected -- prior watcher died without writing .shutdown-reason.json. prior_status_age_ms=${restartContext.prior_status_age_ms} boot_reason=${_bootReason}`);
     }
   }
   try { fs.unlinkSync(SHUTDOWN_REASON_PATH); } catch (_) {}
@@ -3023,7 +3052,7 @@ async function runSpoolWatcher(instance, spoolDir) {
   // Defense-in-depth beyond walkDir's dot-dir skip: a real verb is a single clean
   // segment (e.g. instruction, prd-resolve). A derived verb containing a path
   // separator or a dot-segment means the file lives under a stray nested spool
-  // (in/prd-resolve/.gm/exec-spool
+  // (in/prd-resolve/.gm/exec-spool/...); dispatching it builds a bogus verb+outName
   // and ENOENT-storms every tick. Skip + unmark so it never re-enters the loop.
   if (/[\\/]/.test(verb) || verb.split(/[\\/]/).some(seg => seg.startsWith('.'))) {
     try { logEvent('plugkit', 'spool.skip-nested-verb', { rel: relPath, derived_verb: verb }); } catch (_) {}
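To see what the guard in this hunk accepts and rejects, the condition can be lifted into a small predicate. The function name `isStrayNestedVerb` is hypothetical and exists only for this sketch; the expression itself is copied from the diff.

```javascript
// Hypothetical wrapper around the guard expression shown in the hunk above.
// A real verb is a single clean segment; anything containing a path separator
// or a dot-prefixed segment is treated as a stray nested spool entry.
function isStrayNestedVerb(verb) {
  return /[\\/]/.test(verb) || verb.split(/[\\/]/).some(seg => seg.startsWith('.'));
}

console.log(isStrayNestedVerb('instruction'));                  // false
console.log(isStrayNestedVerb('prd-resolve'));                  // false
console.log(isStrayNestedVerb('prd-resolve/.gm/exec-spool'));   // true
console.log(isStrayNestedVerb('prd-resolve\\.gm\\exec-spool')); // true (Windows separators)
```

Note that the first clause already rejects any verb containing a separator, so the dot-segment clause only adds coverage for a single-segment verb that itself starts with a dot.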
@@ -3161,7 +3190,7 @@ async function runSpoolWatcher(instance, spoolDir) {
   try {
     for (const entry of fs.readdirSync(dir)) {
       if (/\.tmp\.\d+(\.|$)/.test(entry)) continue;
-      // The verb tree is in/<verb>/[<sub>/]<N>.<ext>
+      // The verb tree is in/<verb>/[<sub>/]<N>.<ext> -- at most two levels deep. A
       // dot-prefixed dir (e.g. a stray nested .gm/exec-spool/ created by a misfire)
       // is never a verb dir; recursing into it derives a bogus verb like
       // `prd-resolve\.gm\exec-spool` and dispatch-errors on every tick forever.
@@ -3205,7 +3234,7 @@ async function runSpoolWatcher(instance, spoolDir) {
   // A synchronous verb (chromium spawn, long exec) blocks the event loop, so the 5s
   // heartbeat interval cannot fire for the duration. Without a hint, a liveness probe that
   // checks ts-within-15s reads the busy watcher as dead and may kill/respawn it mid-verb.
-  // busy_until tells probes "alive but synchronously busy until this epoch ms"
+  // busy_until tells probes "alive but synchronously busy until this epoch ms" -- read it
   // alongside ts: a stale ts whose busy_until is still in the future is a busy watcher, not
   // a dead one. The pre-verb writeStatus(busyMs) stamps it before the blocking call.
   if (busyMs && busyMs > 0) { rec.busy_until = now + busyMs; _lastBusyUntil = rec.busy_until; }
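The comment in this hunk describes how a probe should read `ts` and `busy_until` together. A minimal sketch of such a probe, assuming a status record shaped like `{ ts, busy_until }` and the 15-second staleness window mentioned in the comment, might look like the following. The function name `classifyWatcher` and its return labels are hypothetical and not part of the package.

```javascript
const STALE_MS = 15_000; // the "ts-within-15s" window from the comment above

// Hypothetical probe: classifies a watcher from its status record.
function classifyWatcher(status, now = Date.now()) {
  if (!status || typeof status.ts !== 'number') return 'dead';
  if (now - status.ts <= STALE_MS) return 'alive';
  // Stale heartbeat, but the watcher advertised a blocking verb still in progress.
  if (typeof status.busy_until === 'number' && status.busy_until > now) return 'busy';
  return 'dead';
}
```

Under this reading, a supervisor would only reap a watcher classified as `dead`; a `busy` watcher is left alone until `busy_until` has passed.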
@@ -3408,7 +3437,7 @@ async function runSpoolWatcher(instance, spoolDir) {
       logEvent('plugkit', 'update.available', { installed, latest });
       _lastKnownDrift = latest;
     }
-    // NOTE: no version-file bump here either
+    // NOTE: no version-file bump here either -- see the network-path comment above. Bumping the version
     // file ahead of a verified binary download poisons installedVersionAtTools() and causes an infinite
     // drift-respawn thrash. Auto-update is notify-only until a sha-verified force-download path exists.
   }
@@ -3489,7 +3518,7 @@ async function runSpoolWatcher(instance, spoolDir) {
   // NOTE: do NOT bump the disk version file here to "arm" a drift-respawn. installedVersionAtTools()
   // reads that file as the source of truth for the installed version; bumping it ahead of the actual
   // wasm download makes ensureReady compute versionDrift=false (file==target) and isReady()=true, so it
-  // returns already-ready WITHOUT downloading the new binary
+  // returns already-ready WITHOUT downloading the new binary -- while the running instance is still the
   // old version. The drift-check then sees instance(old) != file(new) forever and self-respawns in an
   // infinite loop, each respawn reloading the same old wasm. The version file must only advance AFTER
   // a verified binary download (bootstrap's job). Auto-update stays notify-only until ensureReady gains
@@ -3630,7 +3659,7 @@ async function runSpoolWatcher(instance, spoolDir) {
   const base = path.basename(fp, path.extname(fp));
   const outName = verbDir === '.' ? `${base}.json` : `${verbDir}-${base}.json`;
   try {
-    fs.writeFileSync(path.join(outDir, outName), JSON.stringify({ ok: false, error: 'stale input
+    fs.writeFileSync(path.join(outDir, outName), JSON.stringify({ ok: false, error: 'stale input -- never dispatched or watcher crash mid-flight' }));
   } catch (e) { console.error(`[stale-sweep] failed to write error for ${rel}: ${e.message}`); }
   try { fs.unlinkSync(fp); stale++; } catch (e) { console.error(`[stale-sweep] failed to unlink ${rel}: ${e.message}`); }
   console.error(`[stale-sweep] auto-failed ${rel} (age >${600}s)`);
@@ -3659,7 +3688,7 @@ async function runSpoolWatcher(instance, spoolDir) {
   if (!filename) return;
   if (/\.tmp\.\d+(\.|$)/.test(filename)) return;
   // Skip any path with a dot-prefixed segment (e.g. a stray nested
-  // prd-resolve/.gm/exec-spool
+  // prd-resolve/.gm/exec-spool/...): it is not a real verb dispatch and walking it
   // derives a bogus verb that dispatch-errors on every tick. Matches walkDir's guard.
   if (filename.split(/[\\/]/).some(seg => seg.startsWith('.'))) return;
   const fullPath = path.join(inDir, filename);
@@ -3782,7 +3811,7 @@ async function tryInstantiate(wasmPath) {
   if (isLink || isCompile) {
     const healed = await selfHeal(`${e.name || 'instantiate'}: ${e.message}`);
     if (!healed) {
-      console.error('[plugkit-wasm] wrapper/wasm version skew
+      console.error('[plugkit-wasm] wrapper/wasm version skew -- run: bun x gm-plugkit@latest spool');
       process.exit(1);
     }
     ({ instance, instanceRef } = await tryInstantiate(wasmPath));
package/supervisor.js
CHANGED
|
@@ -261,7 +261,7 @@ function checkWatcherHealth() {
   const now = Date.now();
   // A long synchronous verb (git_finalize's ~30s network push, a chromium spawn)
   // blocks the heartbeat write. The verb advertises busy_until in .status.json; while
-  // that is in the future the watcher is busy, not hung
+  // that is in the future the watcher is busy, not hung -- reaping it kills the verb
   // mid-flight (the VERB ABORT). Honor busy_until exactly as the agent boot probe does.
   if (status.busy_until && status.busy_until > now) {
     return;