npm - gm-cc - Versions diffs - 2.0.663 → 2.0.664 - Mend

gm-cc 2.0.663 → 2.0.664

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7) hide show

package/.claude-plugin/marketplace.json +1 -1
package/agents/memorize.md +33 -7
package/package.json +1 -1
package/plugin.json +1 -1
package/skills/gm/SKILL.md +17 -3
package/skills/gm-execute/SKILL.md +23 -0
package/skills/planning/SKILL.md +22 -0

package/.claude-plugin/marketplace.json CHANGED Viewed

@@ -4,7 +4,7 @@
     "name": "AnEntrypoint"
   },
   "description": "State machine agent with hooks, skills, and automated git enforcement",
-  "version": "2.0.663",
+  "version": "2.0.664",
   "metadata": {
     "description": "State machine agent with hooks, skills, and automated git enforcement"
   },

package/agents/memorize.md CHANGED Viewed

@@ -28,18 +28,28 @@ Discard:
 ## STEP 2: INGEST INTO RS-LEARN
-For each classified fact, invoke the plugkit memorize subcommand (prefers in-process HTTP to a running `rs-learn serve`, falls back to subprocess automatically — fast either way):
+For each classified fact, invoke `exec:memorize` (HTTP-preferred, subprocess fallback — fast either way):
-```bash
-plugkit memorize --source "<type>/<slug>" "<fact text>"
 ```
+exec:memorize
+<type>/<slug>
+<fact body — one to three self-contained sentences>
+```
+Line 1 of the body is the source tag (e.g. `feedback/terse-responses`, `project/merge-freeze`). Lines 2+ are the fact itself. Use kebab-case slugs.
+To invalidate previously-memorized content (correction or retraction):
-Where `<fact text>` is a self-contained one-to-three sentence summary of the fact, and `<slug>` is a short kebab-case label (e.g. `feedback/terse-responses`, `project/merge-freeze`).
+```
+exec:forget
+by-source <tag>
+```
-For multi-paragraph facts, prefer the file form to avoid argv length limits:
+Or by content:
-```bash
-plugkit memorize --source "<type>/<slug>" --file <path>
+```
+exec:forget
+by-query <2-6 search words>
 ```
 ## STEP 3: AGENTS.md
@@ -52,3 +62,19 @@ For each qualifying fact from context:
 - If genuinely non-obvious → append to the appropriate section
 Never add: obvious patterns, active task progress, redundant restatements.
+## STEP 4: AGENTS.md → RS-LEARN MIGRATION (BENCHMARK + DRAIN)
+AGENTS.md is the **always-on context buffer** — every prompt sees it. rs-learn is the **conditional retrieval store** — only relevant facts surface. The migration loop turns AGENTS.md into a benchmark for rs-learn's recall quality:
+1. Pick **5 random items** from AGENTS.md (sections, paragraphs, or numbered points). Don't pick the most recent additions — pick the oldest stable items.
+2. For each item, derive a 2-6 word query that a future agent would naturally use to find this fact.
+3. Run `exec:recall` with that query.
+4. Decide:
+   - **Recall accurate AND complete** → the rs-learn store has internalized this fact; **remove it from AGENTS.md**. Frees buffer space and confirms learning.
+   - **Recall partial / outdated / missing** → keep the AGENTS.md item AND ingest a refined version of the fact via `exec:memorize` so next round it can pass. Note the outcome in your run log.
+5. Record the audit cycle: how many items checked, how many removed, how many refined. Append this single-line summary to AGENTS.md under a `## Learning audit` section so future audits can see drift over time.
+Why: AGENTS.md grows monotonically without this loop. rs-learn already filters by relevance per-prompt, so duplicating stable facts in AGENTS.md just inflates the always-on context. The migration drains AGENTS.md into the retrieval store as the store proves it can recall. Failed migrations leave the fact in AGENTS.md (safe default) and improve the store. Success rate over time = a metric for how well gm is learning this project.
+Don't migrate if the fact is genuinely about agent meta-behavior that must be active every prompt (e.g. "always invoke gm:gm first") — those stay permanently.

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "gm-cc",
-  "version": "2.0.663",
+  "version": "2.0.664",
   "description": "State machine agent with hooks, skills, and automated git enforcement",
   "author": "AnEntrypoint",
   "license": "MIT",

package/plugin.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "gm",
-  "version": "2.0.663",
+  "version": "2.0.664",
   "description": "State machine agent with hooks, skills, and automated git enforcement",
   "author": {
     "name": "AnEntrypoint",

package/skills/gm/SKILL.md CHANGED Viewed

@@ -14,6 +14,17 @@ Each phase loads protocols via Skill invocation only. Reading summary ≠ being
 `gm-execute` = execution contract (all phases). `governance` = route/legitimacy reference (load once).
+## RECALL — HARD RULE
+Before resolving any unknown via fresh execution, check past sessions. Memorized facts only help if recalled.
+```
+exec:recall
+<2-6 word query>
+```
+Triggers: unknown feels familiar | sub-task on a known project | about to ask user something likely already discussed | about to design where prior decision exists. Hits = weak_prior; still witness before adopting. ~200 tokens, ~5ms when serve is running.
 ## MEMORIZE — HARD RULE
 Unknown→known = memorize same turn it resolves. Background, non-blocking.
@@ -26,11 +37,14 @@ Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt
 Multiple facts → parallel Agent calls in ONE message. End-of-turn: scan for un-memorized resolutions → spawn now.
+**Recall + memorize together = learning loop.** Skipping either breaks it.
 ## EXECUTION ORDER
-1. Code execution (exec:<lang>, exec:codesearch) — 90%+ of unknowns
-2. Web (WebFetch/WebSearch) — env facts not in codebase
-3. User — only when 1+2 exhausted AND decision is destructive-irreversible
+1. Recall — `plugkit recall` for any familiar-feeling unknown (cheapest, 200 tokens)
+2. Code execution (exec:<lang>, exec:codesearch) — 90%+ of unknowns
+3. Web (WebFetch/WebSearch) — env facts not in codebase
+4. User — only when 1+2+3 exhausted AND decision is destructive-irreversible
 "Should I..." mid-chain = invoke next skill instead.

package/skills/gm-execute/SKILL.md CHANGED Viewed

@@ -93,6 +93,27 @@ Real services, real data, real timing. Mocks/stubs/simulations = delete. Scatter
 Browser escalation: exec:browser → browser skill → navigate/click → screenshot (last resort).
+## RECALL — HARD RULE
+Before resolving any new unknown via fresh execution, check whether past sessions already answered it. Memorized facts are useless if not recalled.
+Triggers (any = run recall NOW, before exec/codesearch):
+- About to investigate a "did we hit this before" question
+- About to ask the user something likely already discussed
+- About to design an approach where a prior decision exists
+- Encountered an error/quirk that "feels familiar"
+- Starting a new sub-task on a project worked on before
+- About to write a comment explaining a non-obvious choice
+```
+exec:recall
+<2-6 word query>
+```
+Recall hits are weak_prior — still witness via execution before acting. Empty result = no prior memory; proceed normally. Never block on recall — capped at 6s timeout, ~5ms when rs-learn serve is running.
+Recall costs ~200 tokens for 5 hits. Cheaper than re-investigating the same problem.
 ## MEMORIZE — HARD RULE
 Unknown→known = memorize same turn it resolves.
@@ -105,6 +126,8 @@ Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt
 N facts → N parallel Agent calls in ONE message. End-of-turn self-check mandatory.
+**Recall + memorize together = the learning loop.** Recall before investigating; memorize after resolving. Skipping either breaks the loop.
 **Never**: Bash(node/npm/npx/bun) | fake data | mocks | scattered tests | fallbacks | Grep/Glob/Find/Explore | sequential independent items | respond to user mid-phase | edit before witnessing | duplicate code | if/else where dispatch table suffices | one-liners that obscure | reinvent what native/library provides
 **Always**: witness every hypothesis | import real modules | scan before edit | regress on new unknown | delete mocks/comments/scattered tests on discovery | test.js for every behavior change | invoke next skill immediately when done

package/skills/planning/SKILL.md CHANGED Viewed

@@ -16,6 +16,26 @@ Output = every fault surface work could fail on. Unknown named+resolved = cheape
 Later-phase unknown → return here. Not failure — machine working. Patch-around-in-place = compounding debt.
+## RECALL — HARD RULE
+Before naming any unknown, check past sessions. Cross-session memory is the cheapest mutable resolver.
+Triggers (any = run `plugkit recall` BEFORE adding the .prd item):
+- Unknown matches a previously-discussed topic in this project
+- About to investigate a "have we seen this" question
+- About to design an approach where a prior decision likely exists
+- Quirk/error feels familiar
+- Sub-task in a project worked on before
+```
+exec:recall
+<2-6 word query>
+```
+Hits are weak_prior — feed the proposed approach into PLAN, but witness via EXECUTE before adopting. Empty hits = proceed normally; no signal lost.
+Skip recall only when: brand-new project (no DB) | trivially-bounded edit with zero unknowns | user gave surgical instructions.
 ## MEMORIZE — HARD RULE
 Every unknown resolved → memorize same turn. Not batched, not deferred.
@@ -28,6 +48,8 @@ Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt
 Multiple facts in one turn → parallel Agent calls in ONE message. End-of-turn: scan for missed → spawn now.
+**Recall + memorize together = the learning loop.** Recall before investigating; memorize after resolving.
 ## STATE MACHINE
 **FORWARD**: PLAN → `gm-execute` | EXECUTE → `gm-emit` | EMIT → `gm-complete` | VERIFY .prd remains → `gm-execute` | VERIFY .prd empty+pushed → `update-docs`