gm-skill 2.0.1622 → 2.0.1624

package/AGENTS.md CHANGED
@@ -104,7 +104,7 @@ Every skill's `allowed-tools:` is reduced to `Skill, Read, Write` (plus the SKIL
 
  **Every possible aspect that can be checked for jank is a PRD row; the architecture is pliable**: at PLAN, for every surface the prompt concerns, enumerate every aspect checkable for `jank` -- every immaturity, unfinished edge, half-wired path -- across gui/ux/ui/client-state/server-state/the boundary and any surface reached, each its own row including a profiling row and a security row per surface. `jank` is load-bearing: hunt the rough/unpolished/almost-done, not only outright bugs. Scoped to the prompt's concern + its reachable closure, exhaustive within it. Every issue found opens its own debug-and-repair plan spooled the same turn; every quick improvement is spooled too. `pliable`: every architectural change that clearly improves or reduces maintenance burden is a spooled plan -- replacing bespoke code with native functionality or a popular well-maintained library is encouraged ONLY when it nets a smaller maintained surface (a heavy dep for a few lines is the guarded failure mode). Fan-out is the spool-native shape (parallel `prd-add`/`codesearch`/`exec_js`, plugkit task-spawn), never the platform's Task/Explore subagent. One tell-tale AI design element (boilerplate flourish, over-hedged comment, generic scaffold name, machine-authored shape) spawns a full-codebase sweep plan -- scan/per-cluster/fix-and-verify rows, exhaustive over every file, never a one-off fix.
 
- **Client-side debugging exposes globals and evaluates in-browser, never blind-restarts**: surface the relevant state as a `window.*` global and read it live via the `browser` verb's `page.evaluate`, running experiments in the browser, rather than blind experimentation + server restarts. The live page is the debugger; the same `browser` surface that witnesses an edit also diagnoses it.
+ **Client-side debugging exposes globals and evaluates in-browser, never blind-restarts**: the live page is the debugger (rs-learn: `recall: client-side-debug-globals-live-page`).
 
  **Mundane user-facing output is suppressed or stripped to the bone**: drop articles, preamble, play-by-play; boot-probe narration, dispatch echoes, restating prose just read, status recaps do not ship. What survives is substantive: a real finding, a decision + one-line reason, a blocker, the single-line PRD-read declaration. Terse = fewer/shorter words, NEVER zero tool calls and NEVER silent work -- the turn still ends in the chain-advancing tool call.
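The client-side debugging rule changed in this hunk (surface state as a `window.*` global, then read it live through `page.evaluate`) might look roughly like the sketch below. `exposeDebugGlobal`, the `__gmDebug` name, and the `cart` state are hypothetical. `globalThis` stands in for `window` so the snippet also runs outside a browser. The `page.evaluate` call in the comment assumes a Playwright-style page object behind the `browser` verb, which the source does not confirm.

```javascript
// Hypothetical helper: publish live state under one namespaced global so it
// can be inspected from the page instead of guessed at across restarts.
function exposeDebugGlobal(name, getState, root = globalThis) {
  // A getter re-reads the state on every access, so each read is live.
  Object.defineProperty(root, name, {
    get: getState,
    configurable: true, // allows re-registering after a hot reload
    enumerable: false,
  });
}

// Example state owned by the app (illustrative only).
const cart = { items: [], total: 0 };
exposeDebugGlobal('__gmDebug', () => ({ cart }));

cart.items.push('sku-1');
cart.total = 9.5;

// In a real browser the read side would be roughly:
//   await page.evaluate(() => window.__gmDebug.cart)
// (assumes a Playwright-style `page`; not confirmed by the source).
console.log(JSON.stringify(globalThis.__gmDebug.cart));
```

Using a getter rather than a one-time snapshot means every read reflects the current state, which suits debugging by elimination better than re-deploying to see a new value.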
 
@@ -160,9 +160,9 @@ Orchestration state is tracked via `.gm/` marker files, not hook events; the CLI
 
  **Dead-watcher recovery uses `bun x gm-plugkit@latest spool`, never direct-node boot** (mechanism in rs-learn: `recall: dead-watcher recovery bun x not direct-node`).
 
- **Apparent tooling failure is mechanical self-recovery, NEVER a question for the user and never an a/b-test/blind-restart.** "The spooler is not working" / a missing spool response / a stale watcher is the agent's own job to fix: honor a future `busy_until` (wait), else boot the watcher and re-dispatch; on a transient boot hiccup (`FailedToOpenSocket`) retry `@latest`, never the non-`@latest` cache (stale). The spooler is sound by construction -- `.status.json` is written atomically (temp+rename, `atomicWriteJson`) and every long verb advertises `busy_until` -- so a transient unreadable/stale read is a respawn/idle-teardown window to boot through, not a broken tool; asking the user to do what a verb can do is a paper-spirit violation. Debug the live page via `window.*` globals + the `browser` verb's `page.evaluate` as a process of elimination, never variant-after-variant a/b testing. This IS the core gm method on every surface including its own tooling: record all mutables, eliminate each by witness, discover more, keep going.
+ **Apparent tooling failure is mechanical self-recovery, NEVER a question for the user and never an a/b-test/blind-restart.** A missing spool response / stale watcher is the agent's own job: honor a future `busy_until` else boot the watcher and re-dispatch -- the spooler is sound by construction, so asking the user to do what a verb can do is a paper-spirit violation. Recovery mechanics (atomic `.status.json`, `FailedToOpenSocket` retry, debug-via-`window.*`-globals) in rs-learn (`recall: spooler self-recovery mechanics`).
 
- **Process-of-elimination is the debugging paradigm EVERYWHERE, and manual real-services witness is the verification paradigm EVERYWHERE.** Every debug -- code, wasm, cascade, browser, the spooler itself -- enumerates candidate causes as mutables and eliminates each by a witness read against real input (`exec_js`/`codesearch`/`Read`/`browser page.evaluate`), each elimination revealing the next, never guess-and-restart/a-b-test/shotgun. Every verification is manual labour against the real thing -- the single mock-free `test.js`, the live page, the real service, the live wasm -- never an automated unit/mock suite standing in for the real-services witness (the conventional-testing tell-tale gm replaces). Stated in `instructions/execute.md` (the served EXECUTE prose) so it reaches every LLM in-session.
+ **Process-of-elimination is the debugging paradigm EVERYWHERE, and manual real-services witness is the verification paradigm EVERYWHERE** -- both stated in `instructions/execute.md` (served EXECUTE prose). Detail in rs-learn (`recall: process-of-elimination manual-real-services-witness paradigm`).
 
  **The first verb after a genuine multi-minute IDLE is `instruction`, to reset the long-gap clock**: only spool verbs reset it, so a long investigation in platform tools trips a false stall -- interleave `instruction`/`prd-add` to stay warm, and dispatch `instruction` BEFORE any predictable blocking wait. Threshold + platform-tool exception in rs-learn (`recall: first verb after multi-minute wait instruction long-gap`).
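The self-recovery rule in this hunk (honor a future `busy_until`, otherwise boot the watcher and re-dispatch) can be sketched as a small decision function. `nextRecoveryAction` and `parseStatus` are hypothetical names, and the sketch assumes `busy_until` is an epoch-millisecond timestamp, which the source does not state. The command string is the one the unchanged dead-watcher line names.

```javascript
// Hypothetical decision helper for the self-recovery rule.
// Assumption: `busy_until` is an epoch-millisecond timestamp.
function nextRecoveryAction(status, now = Date.now()) {
  // A future busy_until means a long verb is still running: wait it out.
  if (status && typeof status.busy_until === 'number' && status.busy_until > now) {
    return { action: 'wait', ms: status.busy_until - now };
  }
  // Missing, unreadable, or stale status: boot the watcher and re-dispatch.
  return { action: 'boot-and-redispatch', command: 'bun x gm-plugkit@latest spool' };
}

// Tolerant read: an unparseable status is treated as "no status",
// not as a broken tool.
function parseStatus(text) {
  try {
    return JSON.parse(text);
  } catch {
    return null;
  }
}

const now = 1000000;
console.log(nextRecoveryAction(parseStatus('{"busy_until": 1005000}'), now));
console.log(nextRecoveryAction(parseStatus('{"busy_until": 999000}'), now));
console.log(nextRecoveryAction(parseStatus('{"busy_un'), now));
```

Neither branch returns an "ask the user" outcome, which mirrors the rule that a stale or unreadable read is a window to boot through rather than a question to escalate.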
 
@@ -178,5 +178,7 @@ One-shot system-state probe: dispatch `plugkit health` via the file-spool before
 
  Site build + landing render is single-surface detail, fully drained to rs-learn (`recall: gm site build details`).
 
+ **The site consumes the `anentrypoint-design` SDK pro-rata, never overriding it.** `site/theme.mjs` loads the SDK at runtime (`unpkg.com/anentrypoint-design@latest`) and the local `<style>` carries ONLY render-mode plumbing (flatspace html-class toggles `article-flow`/`landing-cap`, the crumb media query) plus site article-layout rhythm that is not an SDK component -- never a themed visual component. Every graphic-design change (a token, a component's look, TOC/cli/panel/card/callout styling) is made IN the SDK repo (`../anentrypoint-design`, GitHub `AnEntrypoint/design`, npm `anentrypoint-design`) as a token-only sheet and published; the site picks it up via `@latest`. A new local CSS rule that styles a visual component is a deviation -- it belongs in the SDK. SDK component sheets are lint-gated literal-free (every color a `var(--token)`); the SDK build prefixes all selectors with the `.ds-247420` scope. Mechanism in rs-learn (`recall: design SDK pro-rata consumption`).
+
 
  @.gm/next-step.md
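The added SDK rule says component sheets are lint-gated literal-free, with every color a `var(--token)`. A minimal sketch of such a check follows. It is not the SDK's actual lint gate: `findColorLiterals`, the regular expressions, and the sample sheet are all hypothetical, and the check only handles one declaration per line.

```javascript
// Hypothetical lint: flag color literals in a component sheet so that every
// color has to come from a var(--token). Not the SDK's actual lint gate.
const COLOR_LITERAL = /#[0-9a-fA-F]{3,8}\b|\b(?:rgb|rgba|hsl|hsla)\(/;

function findColorLiterals(css) {
  const offenders = [];
  css.split('\n').forEach((line, index) => {
    // Only inspect declarations (prop: value); skip selectors and braces.
    const match = line.match(/^\s*([\w-]+)\s*:\s*(.+?);?\s*$/);
    if (match && COLOR_LITERAL.test(match[2])) {
      offenders.push({ line: index + 1, property: match[1], value: match[2] });
    }
  });
  return offenders;
}

// Illustrative sheet: one token-based color and two literals.
const sheet = [
  '.card {',
  '  color: var(--text);',
  '  background: #ffffff;',
  '  border-color: rgb(0, 0, 0);',
  '}',
].join('\n');

console.log(findColorLiterals(sheet));
```

A check like this would also flag the token definitions themselves, so it only makes sense to run it on component sheets, not on the sheet that declares the tokens.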
@@ -1,6 +1,6 @@
  {
  "name": "gm-plugkit",
- "version": "2.0.1622",
+ "version": "2.0.1624",
  "description": "Bootstrap and daemon-spawn tool for gm plugkit binary. Downloads the correct platform binary, verifies SHA256, and starts the spool watcher daemon. Includes plugkit-wasm-wrapper for WASM-based spool watching.",
  "main": "index.js",
  "bin": {
package/gm.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "gm",
- "version": "2.0.1622",
+ "version": "2.0.1624",
  "description": "Spool-dispatch orchestration engine with unified state machine, skills, and automated git enforcement",
  "author": "AnEntrypoint",
  "license": "MIT",
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "gm-skill",
- "version": "2.0.1622",
+ "version": "2.0.1624",
  "description": "Canonical universal harness — AI-native software engineering via skill-driven orchestration; bootstraps plugkit for task execution and session isolation. Install in any AI coding agent host.",
  "author": "AnEntrypoint",
  "license": "MIT",
@@ -85,3 +85,5 @@ The chain is not COMPLETE until changes are on origin. Commit and push at the en
  **Prune bad memory on sight -- a wrong recall hit is worse than a miss.** A stale/superseded/wrong `recall` or `auto_recall` hit gets `memorize-prune {key}` (deletes text + embedding). For an uncertain set, `memorize-prune {query}` returns review-only candidates; judge, then re-dispatch the stale `{keys:[...]}` -- never a blind similarity-delete.
 
  On turn entry plugkit attaches an `auto_recall` pack derived from the prompt; read its hits alongside `recall_hits` (the phase+PRD-subject pack). It fires once per turn entry on its own -- do not re-trigger it.
+
+ If the instructions amount to doing more than one step or imply it, use or create a workflow, or set a goal to track progress, and if subagents are available fan out subagents that use gm for everything, up to 8 in parallel
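
The added line caps parallel fan-out at 8. A generic bounded-concurrency pool illustrates that cap in the sketch below. `fanOut` and the task shape are hypothetical, and how gm or a host actually spawns subagents is not shown in the source.

```javascript
// Hypothetical bounded fan-out: run async tasks with at most `limit` in
// flight, preserving result order. The cap of 8 comes from the added line;
// everything else here is illustrative.
async function fanOut(tasks, limit = 8) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker pulls the next unclaimed task until none remain.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workers = Array.from({ length: Math.min(limit, tasks.length) }, worker);
  await Promise.all(workers);
  return results;
}

// Illustrative workload: 20 short tasks that record peak concurrency.
let inFlight = 0;
let peak = 0;
const tasks = Array.from({ length: 20 }, (_, i) => async () => {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 5));
  inFlight--;
  return i * 2;
});

fanOut(tasks).then((results) => {
  console.log(results.length, peak);
});
```

A fixed set of workers pulling from a shared index keeps the number of in-flight tasks at or below the limit without batching, so one slow task does not hold up the other seven slots.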