gm-skill 2.0.1507 → 2.0.1509

package/AGENTS.md CHANGED
@@ -106,7 +106,7 @@ Every possible skill's `allowed-tools:` frontmatter is reduced to `Skill, Read,
 
  **Noticing is a planning event, at every phase, in every dispatch window**: any observation the agent makes during the chain, anything that should be done, anything outstanding, anything unfinished, anything improvable, anything misaligned with user preferences, anything the work itself surfaces about what *else* the work touches, is a `prd-add` the agent dispatches this turn. Observations carried in the response body without conversion to a PRD row evaporate when the turn ends; only the PRD store survives. The default response to noticing is to convert. The discovery surface keeps producing new in-scope items as the chain walks PLAN->EXECUTE->EMIT->VERIFY, every phase has its own noticing-to-PRD pressure. Skipping the conversion ("I'll mention it in the summary" / "future work" / "note for later") is the canonical drift mechanism: the observation does not persist, the future turn does not arrive, the residual goes silent. Density grows along the walk, not just at PLAN-time; a chain that exits PLAN with N rows and reaches COMPLETE with N rows has either had no real discoveries (unlikely on a non-trivial task) or has lost them. When the discovery is structural rather than concrete, "the project would benefit from X", "this surface has no test coverage", "the docs do not mention Y", "the agent's preference for Z is being violated here", it is still a PRD row, written with the witness that motivated it. Preference-aware noticing applies the same conversion: when the agent observes that current state diverges from user-stated preferences (dense PRDs, residual-triage, no name-and-defer, every-possible expansion, browser-witness coverage, push-on-clean), each divergence is a `prd-add` describing what the aligned state looks like.
 
- **A turn without a tool call is a stop; summary is a stop; both are forbidden until plugkit pronounces COMPLETE**: every programming agent, regardless of vendor, reads only tool calls and their outputs, so an assistant message that ends in prose with no tool call IS the turn ending and the session halts there. This is the mechanical root of the "agent did one small piece and stopped" failure: the model wrote a wrap-up paragraph, emitted no tool call, and the harness read that as done. The rule is therefore absolute and tool-agnostic: while the chain is in-flight (phase COMPLETE OR prd_pending_count > 0) the agent NEVER ends a turn in prose and NEVER writes a summary/recap/"what I did" closure, every turn terminates in a tool call that advances the chain (`instruction`, the next named verb, `transition`, `phase-status`). The only event that authorizes a prose-only turn is plugkit returning `phase=COMPLETE` AND `prd_pending_count=0`; the agent's own sense that "the work feels done" authorizes nothing. Before any apparent stop or any summary, the agent dispatches `phase-status` and rechecks, a non-terminal phase means the urge to stop was drift, and the recovery is to dispatch `instruction` and continue. This depends on nothing but the verb spool, so it holds on every agent with no hook and no tool-specific feature; any continuation mechanism that relies on a hook or a single tool's behavior is non-portable and must be replaced by this spool-only discipline.
+ **A turn without a tool call is a stop; summary is a stop; both are forbidden until plugkit pronounces COMPLETE**: every programming agent, regardless of vendor, reads only tool calls and their outputs, so an assistant message that ends in prose with no tool call IS the turn ending and the session halts there. This is the mechanical root of the "agent did one small piece and stopped" failure: the model wrote a wrap-up paragraph, emitted no tool call, and the harness read that as done. Deferred intent is the same stop facing forward — a turn-final sentence naming the next move ("let me read X", "I'll re-dispatch instruction") instead of making it; the chain strands where the prose pointed (one real run halted at EXECUTE with 22 open rows on an announced-but-unmade read). The rule is absolute and tool-agnostic: while the chain is in-flight (phase != COMPLETE OR prd_pending_count > 0) the agent NEVER ends a turn in prose every turn terminates in a tool call that advances the chain (`instruction`, the next named verb, `transition`, `phase-status`). Take the move you were about to describe; surface a decision through `AskUserQuestion` or `prd-add`, never a prose-only "confirming direction." The only event that authorizes a prose-only turn is plugkit returning `phase=COMPLETE` AND `prd_pending_count=0`; the agent's own sense that "the work feels done" authorizes nothing. Before any apparent stop or any summary, the agent dispatches `phase-status` and rechecks, a non-terminal phase means the urge to stop was drift, and the recovery is to dispatch `instruction` and continue. This depends on nothing but the verb spool, so it holds on every agent with no hook and no tool-specific feature; any continuation mechanism that relies on a hook or a single tool's behavior is non-portable and must be replaced by this spool-only discipline.
 
  **Always seek the next state transition**: if the chain is not COMPLETE, there is a next move. Idle mid-chain is a deviation. The agent who finishes a verb and stops without dispatching the next instruction has stopped walking the chain. `phase-status` tells you where you are; `instruction` tells you what's next. There is no "I'll wait for the user" mid-chain, the user authorized closure at request time, not phase-by-phase.
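The in-flight rule stated in this hunk reduces to a small predicate. The following is a minimal sketch, assuming a status object shaped like `{ phase, prd_pending_count }` as named in the text. The helper names `mayEndInProse` and `isInFlight` are hypothetical and not part of plugkit, and treating a missing count as in-flight is this sketch's own choice rather than documented behavior.

```javascript
// Hypothetical helpers; only `phase` and `prd_pending_count` come from the text above.

// A prose-only turn is authorized only when phase=COMPLETE AND prd_pending_count=0.
function mayEndInProse(status) {
  return status.phase === 'COMPLETE' && status.prd_pending_count === 0;
}

// In flight means: phase != COMPLETE OR prd_pending_count > 0.
// Assumption of this sketch: a missing count is also treated as in flight.
function isInFlight(status) {
  return !mayEndInProse(status);
}

// The run described in the changed paragraph: halted at EXECUTE with 22 open rows.
console.log(isInFlight({ phase: 'EXECUTE', prd_pending_count: 22 }));
// COMPLETE with open rows is still in flight.
console.log(isInFlight({ phase: 'COMPLETE', prd_pending_count: 3 }));
// The only state that authorizes a prose-only turn.
console.log(mayEndInProse({ phase: 'COMPLETE', prd_pending_count: 0 }));
```

Defining one predicate as the negation of the other keeps the two conditions from drifting apart if either is edited later.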
 
@@ -140,7 +140,7 @@ Every possible skill's `allowed-tools:` frontmatter is reduced to `Skill, Read,
 
  Push to any rs-* sibling triggers `cascade.yml` -> rs-plugkit `release.yml` -> single `plugkit.wasm` (npm `plugkit-wasm` + `plugkit-bin` Releases) -> auto-bump `gm.json::plugkitVersion` -> `publish.yml` ships gm-skill + gm-plugkit + the SKILL.md mirror. Full step sequence + PUBLISHER_TOKEN setup in rs-learn (`recall: cascade pipeline`).
 
- Three npm packages publish from this repo: `gm-skill` (the skill harness), `gm-plugkit` (bootstrap + watcher), `plugkit-wasm` (wasm binary). publish.yml + the rs-plugkit cascade ships all three on every version-bump commit. The legacy 15 downstream repos (gm-cc, gm-gc, gm-oc, gm-kilo, gm-codex, gm-qwen, gm-copilot-cli, gm-hermes, gm-thebird, gm-vscode, gm-cursor, gm-zed, gm-jetbrains, gm-antigravity, gm-windsurf) are archived on GitHub, no further releases, no orphan-commit publish step.
+ Three npm packages publish from this repo: `gm-skill` (the skill harness), `gm-plugkit` (bootstrap + watcher), `plugkit-wasm` (wasm binary). publish.yml + the rs-plugkit cascade ships all three on every version-bump commit. The legacy 15 downstream repos are archived on GitHub, no further releases, no orphan-commit publish step.
 
  **Repos involved (push to every possible one triggers cascade):**
  - `AnEntrypoint/rs-exec`, exec runner, browser sessions, idle cleanup, session task isolation
@@ -1,6 +1,6 @@
 {
   "name": "gm-plugkit",
-  "version": "2.0.1507",
+  "version": "2.0.1509",
   "description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
   "main": "index.js",
   "bin": {
@@ -330,7 +330,7 @@ function endTurn(sess, t, idleSpanned) {
   });
 }
 
-function turnTick(sess, verb, taskBase, phase) {
+function turnTick(sess, verb, taskBase, phase, prdPending) {
   const key = sess || '(no-session)';
   const now = Date.now();
   let t = _turns.get(key);
@@ -349,14 +349,50 @@ function turnTick(sess, verb, taskBase, phase) {
     if (verb !== 'instruction') return;
     const idx = ((_turns.get(key + ':lastIdx') || 0) + 1);
     _turns.set(key + ':lastIdx', idx);
-    t = { idx, startTs: now, lastTs: now, dispatches: 0, verbs: new Map(), phases: new Set(), deviations: 0, lastPhase: phase };
+    t = { idx, startTs: now, lastTs: now, dispatches: 0, verbs: new Map(), phases: new Set(), deviations: 0, lastPhase: phase, prdPending: null, stallEmitted: false };
     _turns.set(key, t);
     logEvent('plugkit', 'turn.start', { sess, turn_idx: idx, phase: phase || null });
   }
   t.lastTs = now;
   t.dispatches++;
+  // A verb arriving resumes the turn — clear any prior stall flag so a later re-stall
+  // is a fresh episode, not silently suppressed by the one-shot guard.
+  t.stallEmitted = false;
   t.verbs.set(verb, (t.verbs.get(verb) || 0) + 1);
   if (phase) { t.phases.add(phase); t.lastPhase = phase; }
+  if (typeof prdPending === 'number') t.prdPending = prdPending;
+}
+
+// turn.end fires only when a NEW verb arrives after idle, so a turn that simply never
+// receives another verb stays open forever and emits no signal — a permanent stall is
+// silence, not an event, which is how a mid-EXECUTE stop stays invisible for days. The
+// heartbeat scan closes that hole: for each open turn idle past STALL_MS whose last phase
+// is non-terminal (or carries open PRD rows), emit turn.stalled once. One-shot per episode
+// (stallEmitted), reset when a verb resumes the turn. A COMPLETE turn with no open rows
+// idling is the authorized prose-only state and never stalls.
+const STALL_MS = 300_000;
+function scanStalledTurns() {
+  const now = Date.now();
+  for (const [key, t] of _turns) {
+    if (!t || typeof t !== 'object' || !Number.isFinite(t.startTs)) continue;
+    if (t.stallEmitted) continue;
+    if ((now - t.lastTs) < STALL_MS) continue;
+    const terminal = t.lastPhase === 'COMPLETE' && (t.prdPending === 0 || t.prdPending == null);
+    if (terminal) continue;
+    t.stallEmitted = true;
+    // key is the _turns map key (sess || '(no-session)'). When it is the sentinel, the turn was
+    // unattributed, so do not override logEvent's own cwd+sess base fields with '(no-session)' —
+    // let the cwd-based attribution stand. Pass an explicit sess only when key is a real session.
+    const fields = {
+      turn_idx: t.idx,
+      ended_in_phase: t.lastPhase || null,
+      prd_pending: t.prdPending,
+      idle_ms: now - t.lastTs,
+      dispatches: t.dispatches,
+    };
+    if (key && key !== '(no-session)') fields.sess = key;
+    logEvent('hook', 'deviation.mid-chain-stall', fields);
+  }
 }
 
 let __sessCache = { value: '', mtimeMs: 0, readAt: 0, srcMtimeMs: 0 };
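The stall decision added in this hunk can be read in isolation as a pure function over one `_turns` entry. The sketch below restates the four guards from `scanStalledTurns` without the logging side effect. The helper name `shouldReportStall` and the sample session keys are hypothetical; the field names and the `STALL_MS` value are taken from the diff.

```javascript
const STALL_MS = 300_000; // same threshold as in the diff

// Hypothetical pure helper: should this _turns entry be reported as stalled at time `now`?
function shouldReportStall(t, now) {
  // The map also holds numeric ':lastIdx' counters; only turn objects qualify.
  if (!t || typeof t !== 'object' || !Number.isFinite(t.startTs)) return false;
  if (t.stallEmitted) return false;            // one report per stall episode
  if (now - t.lastTs < STALL_MS) return false; // not idle long enough yet
  // COMPLETE with zero (or unknown) open rows is the authorized resting state.
  const terminal = t.lastPhase === 'COMPLETE' && (t.prdPending === 0 || t.prdPending == null);
  return !terminal;
}

const now = 1_000_000;
const turns = new Map([
  ['sess-a', { startTs: 0, lastTs: now - 400_000, lastPhase: 'EXECUTE', prdPending: 22, stallEmitted: false }],
  ['sess-a:lastIdx', 3],
  ['sess-b', { startTs: 0, lastTs: now - 400_000, lastPhase: 'COMPLETE', prdPending: 0, stallEmitted: false }],
  ['sess-c', { startTs: 0, lastTs: now - 10_000, lastPhase: 'EXECUTE', prdPending: 5, stallEmitted: false }],
  ['sess-d', { startTs: 0, lastTs: now - 400_000, lastPhase: 'COMPLETE', prdPending: 2, stallEmitted: false }],
]);

const stalled = [...turns].filter(([, t]) => shouldReportStall(t, now)).map(([key]) => key);
console.log(stalled); // sess-a (idle mid-EXECUTE) and sess-d (COMPLETE with open rows)
```

Keeping the decision separate from the `logEvent` call makes each guard easy to exercise with fixed timestamps, which the in-place loop does not allow.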
@@ -504,7 +540,7 @@ function emitOrchestratorEvents(verb, taskBase, resultStr) {
   }
   const data = parsed.data || {};
   const sess = readCurrentSess();
-  turnTick(sess, verb, taskBase, data.phase);
+  turnTick(sess, verb, taskBase, data.phase, typeof data.prd_pending_count === 'number' ? data.prd_pending_count : undefined);
   switch (verb) {
     case 'transition':
       logEvent('plugkit', 'phase.transitioned', { task: taskBase, phase: data.phase, next_skill: data.nextSkill, recall_count: Array.isArray(data.recall_hits) ? data.recall_hits.length : 0 });
@@ -3065,6 +3101,7 @@ async function runSpoolWatcher(instance, spoolDir) {
   }
   _writeStatusBusy = (ms) => { try { writeStatus(ms); } catch (_) {} };
   setInterval(() => writeStatus(), 5000);
+  setInterval(() => { try { scanStalledTurns(); } catch (_) {} }, 30000);
   writeStatus();
 
   const TURN_SUMMARY_PATH = path.join(spoolDir, '.turn-summary.json');
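Because the scan runs on a 30000 ms interval while the threshold is 300000 ms, a stall is not reported at exactly five minutes of idleness but on the first scan at or after that point. The sketch below works out that delay under the assumption of ideal timers with no drift. `SCAN_INTERVAL_MS` and `firstDetectionDelay` are hypothetical names; only the two numeric values come from the diff.

```javascript
const STALL_MS = 300_000;        // threshold from the diff
const SCAN_INTERVAL_MS = 30_000; // interval passed to setInterval in the diff (name is hypothetical)

// Given how many ms after the last verb the next scan fires, return the idle time
// (ms since the last verb) at which the first scan sees idle >= STALL_MS.
function firstDetectionDelay(msUntilNextScan) {
  let idleAtScan = msUntilNextScan;
  while (idleAtScan < STALL_MS) idleAtScan += SCAN_INTERVAL_MS;
  return idleAtScan;
}

console.log(firstDetectionDelay(30_000)); // scan lands exactly on the threshold
console.log(firstDetectionDelay(29_999)); // scan just misses it and waits one more interval
```

Under these assumptions the reported `idle_ms` falls between `STALL_MS` and just under `STALL_MS + SCAN_INTERVAL_MS`, so consumers of the event should not expect a fixed value.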
package/gm.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "gm",
-  "version": "2.0.1507",
+  "version": "2.0.1509",
   "description": "Spool-dispatch orchestration engine with unified state machine, skills, and automated git enforcement",
   "author": "AnEntrypoint",
   "license": "MIT",
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "gm-skill",
-  "version": "2.0.1507",
+  "version": "2.0.1509",
   "description": "Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation. Install in any AI coding agent host.",
   "author": "AnEntrypoint",
   "license": "MIT",
@@ -6,7 +6,7 @@ allowed-tools: Skill, Read, Write, Bash(bun *), Bash(npx *)
 
  # gm
 
- **A turn that ends without a tool call is a stop, and stopping mid-chain is the cardinal failure.** Every programming agent reads only tool calls and their outputs; an assistant message that ends in prose with no tool call IS the turn ending, and the session halts there. So while the chain is in-flight (phase ≠ COMPLETE OR prd_pending_count > 0) you NEVER end a turn in prose, every turn ends in a tool call that advances the chain (the next `instruction` dispatch, the next verb the prose named, the next `transition`). You do NOT summarize, you do NOT write a "here's what I did" wrap-up, you do NOT narrate closure, summary is a stop, and stopping is only authorized when plugkit itself returns `phase=COMPLETE` AND `prd_pending_count=0`. Before you are ever tempted to stop or summarize, you dispatch `phase-status` and read it: if it is not terminal, that temptation was a drift signal, dispatch `instruction` and keep walking. This is tool-agnostic by construction: it depends on nothing but the verb spool, so it holds identically on every agent, with no hook and no tool-specific feature. The one and only thing that authorizes the prose-only turn is plugkit's COMPLETE pronouncement; until then, the answer to "am I done?" is another tool call.
+ **A turn that ends without a tool call is a stop, and stopping mid-chain is the cardinal failure.** Every programming agent reads only tool calls and their outputs; an assistant message that ends in prose with no tool call IS the turn ending, and the session halts there. So while the chain is in-flight (phase ≠ COMPLETE OR prd_pending_count > 0) you NEVER end a turn in prose, every turn ends in a tool call that advances the chain (the next `instruction` dispatch, the next verb the prose named, the next `transition`). You do NOT summarize, you do NOT write a "here's what I did" wrap-up, you do NOT narrate closure, summary is a stop, and stopping is only authorized when plugkit itself returns `phase=COMPLETE` AND `prd_pending_count=0`. A turn-final sentence that names the next move instead of making it is the same stop facing forward — announcing a read, a verb, a re-dispatch is not doing it, and the chain strands exactly where the prose pointed. Take the move you were about to describe. Surfacing a decision is a tool call too (`AskUserQuestion` or `prd-add`), never a prose-only "confirming direction." Before you are ever tempted to stop, you dispatch `phase-status` and read it: if it is not terminal, that temptation was a drift signal, dispatch `instruction` and keep walking. This is tool-agnostic by construction: it depends on nothing but the verb spool, so it holds identically on every agent, with no hook and no tool-specific feature. The one and only thing that authorizes the prose-only turn is plugkit's COMPLETE pronouncement; until then, the answer to "am I done?" is another tool call.
 
  **Done is what plugkit says is done, never your claim.** The COMPLETE gate is the single arbiter. If the chain is not at COMPLETE, there is a next transition to seek; idle mid-chain is a deviation.