gm-cc 2.0.663 → 2.0.664

@@ -4,7 +4,7 @@
4
4
  "name": "AnEntrypoint"
5
5
  },
6
6
  "description": "State machine agent with hooks, skills, and automated git enforcement",
7
- "version": "2.0.663",
7
+ "version": "2.0.664",
8
8
  "metadata": {
9
9
  "description": "State machine agent with hooks, skills, and automated git enforcement"
10
10
  },
@@ -28,18 +28,28 @@ Discard:
28
28
 
29
29
  ## STEP 2: INGEST INTO RS-LEARN
30
30
 
31
- For each classified fact, invoke the plugkit memorize subcommand (prefers in-process HTTP to a running `rs-learn serve`, falls back to subprocess automatically — fast either way):
31
+ For each classified fact, invoke `exec:memorize` (HTTP-preferred, subprocess fallback — fast either way):
32
32
 
33
- ```bash
34
- plugkit memorize --source "<type>/<slug>" "<fact text>"
35
33
  ```
34
+ exec:memorize
35
+ <type>/<slug>
36
+ <fact body — one to three self-contained sentences>
37
+ ```
38
+
39
+ Line 1 of the body is the source tag (e.g. `feedback/terse-responses`, `project/merge-freeze`). Lines 2+ are the fact itself. Use kebab-case slugs.
40
+
41
+ To invalidate previously-memorized content (correction or retraction):
36
42
 
37
- Where `<fact text>` is a self-contained one-to-three sentence summary of the fact, and `<slug>` is a short kebab-case label (e.g. `feedback/terse-responses`, `project/merge-freeze`).
43
+ ```
44
+ exec:forget
45
+ by-source <tag>
46
+ ```
38
47
 
39
- For multi-paragraph facts, prefer the file form to avoid argv length limits:
48
+ Or by content:
40
49
 
41
- ```bash
42
- plugkit memorize --source "<type>/<slug>" --file <path>
50
+ ```
51
+ exec:forget
52
+ by-query <2-6 search words>
43
53
  ```
44
54
 
45
55
  ## STEP 3: AGENTS.md
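The `exec:memorize` body layout described in the hunk above (line 1 is the source tag, lines 2+ are the fact) can be sketched as a small parser. This is an illustrative Python sketch, not plugkit's implementation: `MemorizeRequest` and `parse_memorize_body` are hypothetical names, and the kebab-case check is an assumed reading of "use kebab-case slugs".

```python
from dataclasses import dataclass


@dataclass
class MemorizeRequest:
    source: str  # "<type>/<slug>" tag taken from line 1 of the body
    fact: str    # fact text taken from lines 2+ of the body


def parse_memorize_body(body: str) -> MemorizeRequest:
    """Split a memorize body into its source tag and fact text."""
    lines = body.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("body needs a source tag line and at least one fact line")

    source = lines[0].strip()
    kind, sep, slug = source.partition("/")
    if not (kind and sep and slug):
        raise ValueError(f"source tag must look like <type>/<slug>: {source!r}")

    # Assumption: kebab-case means lowercase alphanumeric words joined by
    # single hyphens. The real tool may accept more or less than this.
    if not all(part.isalnum() and part == part.lower() for part in slug.split("-")):
        raise ValueError(f"slug is not kebab-case: {slug!r}")

    fact = "\n".join(line.strip() for line in lines[1:]).strip()
    if not fact:
        raise ValueError("fact text is empty")
    return MemorizeRequest(source=source, fact=fact)
```

For example, `parse_memorize_body("project/merge-freeze\nMerges are frozen until release.")` yields the source `project/merge-freeze` and the single-sentence fact.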
@@ -52,3 +62,19 @@ For each qualifying fact from context:
52
62
  - If genuinely non-obvious → append to the appropriate section
53
63
 
54
64
  Never add: obvious patterns, active task progress, redundant restatements.
65
+
66
+ ## STEP 4: AGENTS.md → RS-LEARN MIGRATION (BENCHMARK + DRAIN)
67
+
68
+ AGENTS.md is the **always-on context buffer** — every prompt sees it. rs-learn is the **conditional retrieval store** — only relevant facts surface. The migration loop turns AGENTS.md into a benchmark for rs-learn's recall quality:
69
+
70
+ 1. Pick **5 random items** from AGENTS.md (sections, paragraphs, or numbered points). Don't pick the most recent additions — pick the oldest stable items.
71
+ 2. For each item, derive a 2-6 word query that a future agent would naturally use to find this fact.
72
+ 3. Run `exec:recall` with that query.
73
+ 4. Decide:
74
+ - **Recall accurate AND complete** → the rs-learn store has internalized this fact; **remove it from AGENTS.md**. Frees buffer space and confirms learning.
75
+ - **Recall partial / outdated / missing** → keep the AGENTS.md item AND ingest a refined version of the fact via `exec:memorize` so next round it can pass. Note the outcome in your run log.
76
+ 5. Record the audit cycle: how many items checked, how many removed, how many refined. Append this single-line summary to AGENTS.md under a `## Learning audit` section so future audits can see drift over time.
77
+
78
+ Why: AGENTS.md grows monotonically without this loop. rs-learn already filters by relevance per-prompt, so duplicating stable facts in AGENTS.md just inflates the always-on context. The migration drains AGENTS.md into the retrieval store as the store proves it can recall. Failed migrations leave the fact in AGENTS.md (safe default) and improve the store. Success rate over time = a metric for how well gm is learning this project.
79
+
80
+ Don't migrate if the fact is genuinely about agent meta-behavior that must be active every prompt (e.g. "always invoke gm:gm first") — those stay permanently.
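The STEP 4 audit cycle can be read as a loop over a small sample. The Python sketch below is one possible shape for it under stated assumptions: `audit_cycle` and all of its callback parameters (`derive_query`, `recall`, `judge`, `memorize`) are hypothetical stand-ins for the agent's own judgment and for `exec:recall` / `exec:memorize`, and the caller is assumed to pass only old, stable items with the permanent meta-behavior rules already excluded.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple


def audit_cycle(
    items: Sequence[str],
    derive_query: Callable[[str], str],
    recall: Callable[[str], List[str]],
    judge: Callable[[str, List[str]], str],
    memorize: Callable[[str], None],
    sample_size: int = 5,
    rng: Optional[random.Random] = None,
) -> Tuple[List[str], str]:
    """Run one audit pass; return surviving items and a one-line summary."""
    rng = rng or random.Random()
    picked = rng.sample(list(items), min(sample_size, len(items)))

    kept = list(items)
    removed = refined = 0
    for item in picked:
        hits = recall(derive_query(item))
        if judge(item, hits) == "complete":
            # Recall was accurate and complete: drain the item from AGENTS.md.
            kept.remove(item)
            removed += 1
        else:
            # Partial, outdated, or missing: keep the item and re-ingest it.
            # Refining the wording is left to the hypothetical memorize callback.
            memorize(item)
            refined += 1

    summary = f"checked={len(picked)} removed={removed} refined={refined}"
    return kept, summary
```

The returned summary string corresponds to the single line the step asks to append under `## Learning audit`; its exact format here is an assumption.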
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm-cc",
3
- "version": "2.0.663",
3
+ "version": "2.0.664",
4
4
  "description": "State machine agent with hooks, skills, and automated git enforcement",
5
5
  "author": "AnEntrypoint",
6
6
  "license": "MIT",
package/plugin.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm",
3
- "version": "2.0.663",
3
+ "version": "2.0.664",
4
4
  "description": "State machine agent with hooks, skills, and automated git enforcement",
5
5
  "author": {
6
6
  "name": "AnEntrypoint",
@@ -14,6 +14,17 @@ Each phase loads protocols via Skill invocation only. Reading summary ≠ being
14
14
 
15
15
  `gm-execute` = execution contract (all phases). `governance` = route/legitimacy reference (load once).
16
16
 
17
+ ## RECALL — HARD RULE
18
+
19
+ Before resolving any unknown via fresh execution, check past sessions. Memorized facts only help if recalled.
20
+
21
+ ```
22
+ exec:recall
23
+ <2-6 word query>
24
+ ```
25
+
26
+ Triggers: unknown feels familiar | sub-task on a known project | about to ask user something likely already discussed | about to design where prior decision exists. Hits = weak_prior; still witness before adopting. ~200 tokens, ~5ms when serve is running.
27
+
17
28
  ## MEMORIZE — HARD RULE
18
29
 
19
30
  Unknown→known = memorize same turn it resolves. Background, non-blocking.
@@ -26,11 +37,14 @@ Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt
26
37
 
27
38
  Multiple facts → parallel Agent calls in ONE message. End-of-turn: scan for un-memorized resolutions → spawn now.
28
39
 
40
+ **Recall + memorize together = learning loop.** Skipping either breaks it.
41
+
29
42
  ## EXECUTION ORDER
30
43
 
31
- 1. Code execution (exec:<lang>, exec:codesearch) — 90%+ of unknowns
32
- 2. Web (WebFetch/WebSearch) — env facts not in codebase
33
- 3. User — only when 1+2 exhausted AND decision is destructive-irreversible
44
+ 1. Recall — `plugkit recall` for any familiar-feeling unknown (cheapest, 200 tokens)
45
+ 2. Code execution (exec:<lang>, exec:codesearch) — 90%+ of unknowns
46
+ 3. Web (WebFetch/WebSearch) — env facts not in codebase
47
+ 4. User — only when 1+2+3 exhausted AND decision is destructive-irreversible
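The new four-step order can be pictured as a fallback chain in which recall only supplies a weak prior and the user is consulted last. The Python sketch below is an assumed illustration, not code from gm: `resolve` and its callback parameters are hypothetical, and passing the recalled prior into the later steps is one way to model "still witness before adopting".

```python
from typing import Callable, List, Optional, Tuple

Step = Callable[[str, List[str]], Optional[str]]


def resolve(
    unknown: str,
    recall: Callable[[str], List[str]],
    execute: Step,
    web: Step,
    ask_user: Callable[[str], str],
    destructive_irreversible: bool,
) -> Tuple[str, Optional[str]]:
    """Try each resolver in the documented order; return (step name, answer)."""
    # Step 1: recall is cheapest, but its hits are only a weak prior.
    prior = recall(unknown)

    # Steps 2 and 3: witness via code execution, then fall back to the web.
    for name, step in (("execute", execute), ("web", web)):
        answer = step(unknown, prior)
        if answer is not None:
            return name, answer

    # Step 4: the user, only when 1+2+3 are exhausted AND the decision
    # is destructive-irreversible.
    if destructive_irreversible:
        return "user", ask_user(unknown)
    return "unresolved", None
```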
34
48
 
35
49
  "Should I..." mid-chain = invoke next skill instead.
36
50
 
@@ -93,6 +93,27 @@ Real services, real data, real timing. Mocks/stubs/simulations = delete. Scatter
93
93
 
94
94
  Browser escalation: exec:browser → browser skill → navigate/click → screenshot (last resort).
95
95
 
96
+ ## RECALL — HARD RULE
97
+
98
+ Before resolving any new unknown via fresh execution, check whether past sessions already answered it. Memorized facts are useless if not recalled.
99
+
100
+ Triggers (any = run recall NOW, before exec/codesearch):
101
+ - About to investigate a "did we hit this before" question
102
+ - About to ask the user something likely already discussed
103
+ - About to design an approach where a prior decision exists
104
+ - Encountered an error/quirk that "feels familiar"
105
+ - Starting a new sub-task on a project worked on before
106
+ - About to write a comment explaining a non-obvious choice
107
+
108
+ ```
109
+ exec:recall
110
+ <2-6 word query>
111
+ ```
112
+
113
+ Recall hits are weak_prior — still witness via execution before acting. Empty result = no prior memory; proceed normally. Never block on recall — capped at 6s timeout, ~5ms when rs-learn serve is running.
114
+
115
+ Recall costs ~200 tokens for 5 hits. Cheaper than re-investigating the same problem.
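The "never block on recall" rule with its 6s cap can be sketched as a timeout wrapper that treats a slow lookup the same as an empty result. This is a minimal Python sketch under assumptions: `recall_with_timeout` is a hypothetical helper, and the `recall` callable stands in for whatever actually backs `exec:recall`.

```python
import concurrent.futures as cf
from typing import Callable, List


def recall_with_timeout(
    recall: Callable[[str], List[str]],
    query: str,
    timeout_s: float = 6.0,
) -> List[str]:
    """Return recall hits, or an empty list if the lookup exceeds the cap."""
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(recall, query)
    try:
        return list(future.result(timeout=timeout_s))
    except cf.TimeoutError:
        # Timed out: behave as if there is no prior memory and proceed normally.
        return []
    finally:
        # Do not wait for a slow lookup; let the worker finish in the background.
        pool.shutdown(wait=False)
```

Whatever the wrapper returns is still only a weak prior: the caller goes on to witness via execution before acting on any hit.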
116
+
96
117
  ## MEMORIZE — HARD RULE
97
118
 
98
119
  Unknown→known = memorize same turn it resolves.
@@ -105,6 +126,8 @@ Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt
105
126
 
106
127
  N facts → N parallel Agent calls in ONE message. End-of-turn self-check mandatory.
107
128
 
129
+ **Recall + memorize together = the learning loop.** Recall before investigating; memorize after resolving. Skipping either breaks the loop.
130
+
108
131
  **Never**: Bash(node/npm/npx/bun) | fake data | mocks | scattered tests | fallbacks | Grep/Glob/Find/Explore | sequential independent items | respond to user mid-phase | edit before witnessing | duplicate code | if/else where dispatch table suffices | one-liners that obscure | reinvent what native/library provides
109
132
 
110
133
  **Always**: witness every hypothesis | import real modules | scan before edit | regress on new unknown | delete mocks/comments/scattered tests on discovery | test.js for every behavior change | invoke next skill immediately when done
@@ -16,6 +16,26 @@ Output = every fault surface work could fail on. Unknown named+resolved = cheape
16
16
 
17
17
  Later-phase unknown → return here. Not failure — machine working. Patch-around-in-place = compounding debt.
18
18
 
19
+ ## RECALL — HARD RULE
20
+
21
+ Before naming any unknown, check past sessions. Cross-session memory is the cheapest mutable resolver.
22
+
23
+ Triggers (any = run `plugkit recall` BEFORE adding the .prd item):
24
+ - Unknown matches a previously-discussed topic in this project
25
+ - About to investigate a "have we seen this" question
26
+ - About to design an approach where a prior decision likely exists
27
+ - Quirk/error feels familiar
28
+ - Sub-task in a project worked on before
29
+
30
+ ```
31
+ exec:recall
32
+ <2-6 word query>
33
+ ```
34
+
35
+ Hits are weak_prior — feed the proposed approach into PLAN, but witness via EXECUTE before adopting. Empty hits = proceed normally; no signal lost.
36
+
37
+ Skip recall only when: brand-new project (no DB) | trivially-bounded edit with zero unknowns | user gave surgical instructions.
38
+
19
39
  ## MEMORIZE — HARD RULE
20
40
 
21
41
  Every unknown resolved → memorize same turn. Not batched, not deferred.
@@ -28,6 +48,8 @@ Agent(subagent_type='gm:memorize', model='haiku', run_in_background=true, prompt
28
48
 
29
49
  Multiple facts in one turn → parallel Agent calls in ONE message. End-of-turn: scan for missed → spawn now.
30
50
 
51
+ **Recall + memorize together = the learning loop.** Recall before investigating; memorize after resolving.
52
+
31
53
  ## STATE MACHINE
32
54
 
33
55
  **FORWARD**: PLAN → `gm-execute` | EXECUTE → `gm-emit` | EMIT → `gm-complete` | VERIFY .prd remains → `gm-execute` | VERIFY .prd empty+pushed → `update-docs`