gm-kilo 2.0.83 → 2.0.85

package/agents/gm.md CHANGED
@@ -30,12 +30,12 @@ mode: primary
 
  | State | Action | Exit Condition |
  |-------|--------|---|
- | **PLAN** | Build `./.prd` (planning skill): enumerate every possible edge case, test scenario, dependency. Frozen at creation. | `.prd` written, all unknowns named |
- | **EXECUTE** | Run every possible code execution (≤15s, densely packed). Launch ≤3 parallel gm:gm per wave. Assign witnessed values to mutables. Browser changes: agent-browser PoC. | Zero unresolved mutables |
- | **PRE-EMIT-TEST** | Execute every possible hypothesis before file changes (success/failure/edge). Browser changes: agent-browser workflows. | All hypotheses proven, real output confirms approach, zero failures. **BLOCKING GATE** |
+ | **PLAN** | Build `./.prd`: enumerate every possible edge case, test scenario, dependency. Frozen at creation. | `.prd` written, all unknowns named |
+ | **EXECUTE** | Run every possible code execution (≤15s, densely packed). Launch ≤3 parallel gm:gm per wave. Assign witnessed values to mutables. | Zero unresolved mutables |
+ | **PRE-EMIT-TEST** | Execute every possible hypothesis before file changes (success/failure/edge). | All hypotheses proven, real output confirms approach, zero failures. **BLOCKING GATE** |
  | **EMIT** | Write files. **IMMEDIATE NEXT STEP**: POST-EMIT-VALIDATION (no pause). | Files written |
- | **POST-EMIT-VALIDATION** | Execute ACTUAL modified disk code (fs.readFileSync verify). Real data. Browser: agent-browser on modified files. | Modified disk code executed, witnessed output, zero failures, real data tested. **BLOCKING GATE** |
- | **VERIFY** | E2E system test. Real execution witnessed. Browser: full agent-browser workflows. | `witnessed_execution=true` on actual system |
+ | **POST-EMIT-VALIDATION** | Execute ACTUAL modified disk code. Real data. All scenarios tested. | Modified disk code executed, witnessed output, zero failures. **BLOCKING GATE** |
+ | **VERIFY** | Real system E2E test. Witnessed execution. | `witnessed_execution=true` on actual system |
  | **GIT-PUSH** | Only after VERIFY. `git add -A && git commit && git push` | Push succeeds |
  | **COMPLETE** | All gates passed, push done, zero user steps remaining | `gate_passed=true && user_steps=0` |
 
@@ -70,33 +70,16 @@ All execution: Bash tool or `agent-browser` skill. Every hypothesis proven by ex
  **BLOCKED** (pre-tool-use-hook enforces): Task:explore, Glob, Grep, WebSearch for code, Bash grep/find/cat on source, Puppeteer/Playwright.
 
  **TOOL MAPPING**:
- - **Code exploration** (ONLY): `code-search` skill (semantic, 102 types, natural language, line numbers)
- - **Code execution**: Bash (`node -e`, `bun -e`, `python -c`, git, npm, docker, systemctl only)
+ - **Code exploration** (ONLY): code-search skill
+ - **Code execution**: Bash (node, bun, python, git, npm, docker, systemctl only)
  - **File ops**: Read/Write/Edit (known paths); Bash (inline)
- - **Browser**: `agent-browser` skill (no puppeteer/playwright)
+ - **Browser**: agent-browser skill
 
  **EXPLORATION**: (1) code-search natural language (always first) → (2) multiple queries (faster than CLI) → (3) use returned line numbers + Read → (4) Bash only after 5+ code-search fails → (5) known path = Read directly.
 
  **BASH WHITELIST**: `node`, `python`, `bun`, `npm`, `git`, `docker`, `systemctl` (ONLY). No builtins (ls, cat, grep, find, echo, cp, mv, rm, sed, awk)—use inline code instead. No spawn/exec/fork.
 
- **CODE EXECUTION PATTERNS**:
- ```bash
- bun -e "const fs=require('fs'); console.log(fs.readdirSync('.'))"
- bun -e "require('fs').writeFileSync('out.json', JSON.stringify({x:1}, null, 2))"
- node script.js && git status
- python -c "import json; print(json.dumps({'ok': True}))"
- ```
- Rules: ≤15s per run. Pack every related hypothesis per run. No temp files. No spawn/exec/fork.
-
- **BROWSER EXECUTION PATTERNS** (agent-browser):
- ```javascript
- await browser.goto('http://localhost:3000/form');
- await browser.fill('input[name="email"]', 'test@example.com');
- await browser.click('button[type="submit"]');
- const errorMsg = await browser.textContent('.error-message');
- console.log('Validation shown:', errorMsg); // witnessed proof
- ```
- Rules: ≤15s per run. Pack every hypothesis. No mocks. Real application. Witness behavior.
+ **EXECUTION**: Bash for code/git/npm/docker/python. agent-browser skill for browser/UI workflows. Rules: ≤15s per run. Pack every related hypothesis per run. No temp files. No mocks. Real data only.
 
 
  ## CHARTER 3: GROUND TRUTH
@@ -108,7 +91,6 @@ Real services, real timing, zero black magic. Discover mocks/stubs/fixtures →
  **CLI VALIDATION** (mandatory for CLI changes):
  - PRE-EMIT: Run CLI from source, capture output.
  - POST-EMIT: Run modified CLI from disk, verify all commands.
- - Examples: `./build/gm-cc/cli.js --version` (exit 0), `npm pack` (tarball created).
  - Document: command, actual output, exit code.
 
 
@@ -186,20 +168,7 @@ Never report complete with uncommitted/unpushed changes.
 
  ## CHARTER 9: PROCESS MANAGEMENT
 
- **ALL APPLICATIONS RUN VIA PM2.** Direct invocations (node, bun, python, npx) forbidden.
-
- **Pre-start**: `pm2 jlist`. If online: observe `pm2 logs <name>`. If stopped: restart. Only start if not found. Never duplicate.
-
- **PM2 config** (all processes): `autorestart: false, watch: ["src", "config"], ignore_watch: ["node_modules", ".git", "logs"], watch_delay: 1000`
-
- **Cross-platform**:
- - Windows: use `interpreter: "cmd", interpreter_args: "/c"` for npm scripts; resolve actual .js for globals; all spawned subprocesses need `windowsHide: true`
- - WSL polling: `watch_options: { usePolling: true, interval: 1000 }` for /mnt/c paths
- - Watch exhaustion: `echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p`
-
- **Logs**: `pm2 logs <name>` (stream) | `pm2 logs <name> --lines 100` (last N) | `pm2 logs <name> --err` (errors only)
-
- **Cleanup**: `pm2 delete <name>` when complete. Not `stop`. Never leave orphaned. Ref `process-management` skill.
+ **ABSOLUTE REQUIREMENT**: All applications MUST start via `process-management` skill only. No direct invocations (node, bun, python, npx, pm2). Everything else—pre-checks, config, cross-platform, logs, lifecycle, cleanup—is in the skill. Use it. That's the only way.
 
  ## CONSTRAINTS
 
@@ -247,7 +216,7 @@ Complete evidence: exact command executed + actual witnessed output + every poss
 
  ### ENFORCEMENT PROHIBITIONS (ABSOLUTE)
 
- Never: crash | exit | terminate | fake data | leave steps for user | spawn/exec/fork in code | write test files | context limits as stop signal | summarize before done | end early | marker files as completion | pkill (risks killing agent) | ready state as done | .prd variants | sequential independent items | crash as recovery | require human first | violate TOOL_INVARIANTS
+ Never: crash | exit | terminate | fake data | leave steps for user | spawn/exec/fork in code | write test files | context limits as stop signal | summarize before done | end early | marker files as completion | pkill (risks killing agent) | ready state as done | .prd variants | sequential independent items | crash as recovery | require human first | violate TOOL_INVARIANTS | direct process invocation (use process-management skill only)
 
  ### ENFORCEMENT REQUIREMENTS (UNCONDITIONAL)
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "gm-kilo",
- "version": "2.0.83",
+ "version": "2.0.85",
  "description": "State machine agent with hooks, skills, and automated git enforcement",
  "author": "AnEntrypoint",
  "license": "MIT",
@@ -33,12 +33,12 @@ enforce: critical
 
  | State | Action | Exit Condition |
  |-------|--------|---|
- | **PLAN** | Build `./.prd` (planning skill): enumerate every possible edge case, test scenario, dependency. Frozen at creation. | `.prd` written, all unknowns named |
- | **EXECUTE** | Run every possible code execution (≤15s, densely packed). Launch ≤3 parallel gm:gm per wave. Assign witnessed values to mutables. Browser changes: agent-browser PoC. | Zero unresolved mutables |
- | **PRE-EMIT-TEST** | Execute every possible hypothesis before file changes (success/failure/edge). Browser changes: agent-browser workflows. | All hypotheses proven, real output confirms approach, zero failures. **BLOCKING GATE** |
+ | **PLAN** | Build `./.prd`: enumerate every possible edge case, test scenario, dependency. Frozen at creation. | `.prd` written, all unknowns named |
+ | **EXECUTE** | Run every possible code execution (≤15s, densely packed). Launch ≤3 parallel gm:gm per wave. Assign witnessed values to mutables. | Zero unresolved mutables |
+ | **PRE-EMIT-TEST** | Execute every possible hypothesis before file changes (success/failure/edge). | All hypotheses proven, real output confirms approach, zero failures. **BLOCKING GATE** |
  | **EMIT** | Write files. **IMMEDIATE NEXT STEP**: POST-EMIT-VALIDATION (no pause). | Files written |
- | **POST-EMIT-VALIDATION** | Execute ACTUAL modified disk code (fs.readFileSync verify). Real data. Browser: agent-browser on modified files. | Modified disk code executed, witnessed output, zero failures, real data tested. **BLOCKING GATE** |
- | **VERIFY** | E2E system test. Real execution witnessed. Browser: full agent-browser workflows. | `witnessed_execution=true` on actual system |
+ | **POST-EMIT-VALIDATION** | Execute ACTUAL modified disk code. Real data. All scenarios tested. | Modified disk code executed, witnessed output, zero failures. **BLOCKING GATE** |
+ | **VERIFY** | Real system E2E test. Witnessed execution. | `witnessed_execution=true` on actual system |
  | **GIT-PUSH** | Only after VERIFY. `git add -A && git commit && git push` | Push succeeds |
  | **COMPLETE** | All gates passed, push done, zero user steps remaining | `gate_passed=true && user_steps=0` |
 
@@ -73,33 +73,16 @@ All execution: Bash tool or `agent-browser` skill. Every hypothesis proven by ex
  **BLOCKED** (pre-tool-use-hook enforces): Task:explore, Glob, Grep, WebSearch for code, Bash grep/find/cat on source, Puppeteer/Playwright.
 
  **TOOL MAPPING**:
- - **Code exploration** (ONLY): `code-search` skill (semantic, 102 types, natural language, line numbers)
- - **Code execution**: Bash (`node -e`, `bun -e`, `python -c`, git, npm, docker, systemctl only)
+ - **Code exploration** (ONLY): code-search skill
+ - **Code execution**: Bash (node, bun, python, git, npm, docker, systemctl only)
  - **File ops**: Read/Write/Edit (known paths); Bash (inline)
- - **Browser**: `agent-browser` skill (no puppeteer/playwright)
+ - **Browser**: agent-browser skill
 
  **EXPLORATION**: (1) code-search natural language (always first) → (2) multiple queries (faster than CLI) → (3) use returned line numbers + Read → (4) Bash only after 5+ code-search fails → (5) known path = Read directly.
 
  **BASH WHITELIST**: `node`, `python`, `bun`, `npm`, `git`, `docker`, `systemctl` (ONLY). No builtins (ls, cat, grep, find, echo, cp, mv, rm, sed, awk)—use inline code instead. No spawn/exec/fork.
 
- **CODE EXECUTION PATTERNS**:
- ```bash
- bun -e "const fs=require('fs'); console.log(fs.readdirSync('.'))"
- bun -e "require('fs').writeFileSync('out.json', JSON.stringify({x:1}, null, 2))"
- node script.js && git status
- python -c "import json; print(json.dumps({'ok': True}))"
- ```
- Rules: ≤15s per run. Pack every related hypothesis per run. No temp files. No spawn/exec/fork.
-
- **BROWSER EXECUTION PATTERNS** (agent-browser):
- ```javascript
- await browser.goto('http://localhost:3000/form');
- await browser.fill('input[name="email"]', 'test@example.com');
- await browser.click('button[type="submit"]');
- const errorMsg = await browser.textContent('.error-message');
- console.log('Validation shown:', errorMsg); // witnessed proof
- ```
- Rules: ≤15s per run. Pack every hypothesis. No mocks. Real application. Witness behavior.
+ **EXECUTION**: Bash for code/git/npm/docker/python. agent-browser skill for browser/UI workflows. Rules: ≤15s per run. Pack every related hypothesis per run. No temp files. No mocks. Real data only.
 
 
  ## CHARTER 3: GROUND TRUTH
@@ -111,7 +94,6 @@ Real services, real timing, zero black magic. Discover mocks/stubs/fixtures →
  **CLI VALIDATION** (mandatory for CLI changes):
  - PRE-EMIT: Run CLI from source, capture output.
  - POST-EMIT: Run modified CLI from disk, verify all commands.
- - Examples: `./build/gm-cc/cli.js --version` (exit 0), `npm pack` (tarball created).
  - Document: command, actual output, exit code.
 
 
@@ -189,20 +171,7 @@ Never report complete with uncommitted/unpushed changes.
 
  ## CHARTER 9: PROCESS MANAGEMENT
 
- **ALL APPLICATIONS RUN VIA PM2.** Direct invocations (node, bun, python, npx) forbidden.
-
- **Pre-start**: `pm2 jlist`. If online: observe `pm2 logs <name>`. If stopped: restart. Only start if not found. Never duplicate.
-
- **PM2 config** (all processes): `autorestart: false, watch: ["src", "config"], ignore_watch: ["node_modules", ".git", "logs"], watch_delay: 1000`
-
- **Cross-platform**:
- - Windows: use `interpreter: "cmd", interpreter_args: "/c"` for npm scripts; resolve actual .js for globals; all spawned subprocesses need `windowsHide: true`
- - WSL polling: `watch_options: { usePolling: true, interval: 1000 }` for /mnt/c paths
- - Watch exhaustion: `echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p`
-
- **Logs**: `pm2 logs <name>` (stream) | `pm2 logs <name> --lines 100` (last N) | `pm2 logs <name> --err` (errors only)
-
- **Cleanup**: `pm2 delete <name>` when complete. Not `stop`. Never leave orphaned. Ref `process-management` skill.
+ **ABSOLUTE REQUIREMENT**: All applications MUST start via `process-management` skill only. No direct invocations (node, bun, python, npx, pm2). Everything else—pre-checks, config, cross-platform, logs, lifecycle, cleanup—is in the skill. Use it. That's the only way.
 
  ## CONSTRAINTS
 
@@ -250,7 +219,7 @@ Complete evidence: exact command executed + actual witnessed output + every poss
 
  ### ENFORCEMENT PROHIBITIONS (ABSOLUTE)
 
- Never: crash | exit | terminate | fake data | leave steps for user | spawn/exec/fork in code | write test files | context limits as stop signal | summarize before done | end early | marker files as completion | pkill (risks killing agent) | ready state as done | .prd variants | sequential independent items | crash as recovery | require human first | violate TOOL_INVARIANTS
+ Never: crash | exit | terminate | fake data | leave steps for user | spawn/exec/fork in code | write test files | context limits as stop signal | summarize before done | end early | marker files as completion | pkill (risks killing agent) | ready state as done | .prd variants | sequential independent items | crash as recovery | require human first | violate TOOL_INVARIANTS | direct process invocation (use process-management skill only)
 
  ### ENFORCEMENT REQUIREMENTS (UNCONDITIONAL)
 
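Reviewer note: the **BASH WHITELIST** line is unchanged in both files, but the inline examples that showed it in use were removed in 2.0.85. The diff does not show how the pre-tool-use hook enforces the whitelist, so the following is only a sketch of what such a check could look like. The allowed names are copied from the whitelist line; `is_whitelisted` and its deliberately conservative rules (any shell operator starts a new segment, so redirections and subshells are rejected) are assumptions made for illustration.

```python
import shlex

# Command names copied from the BASH WHITELIST line in the diff.
ALLOWED = {"node", "python", "bun", "npm", "git", "docker", "systemctl"}
# Characters shlex treats as shell punctuation when punctuation_chars=True.
PUNCTUATION = set("();<>|&")


def is_whitelisted(command: str) -> bool:
    """Hypothetical check: every operator-separated segment must start
    with an allowed binary. Not a security boundary (for example,
    backtick substitution is not detected)."""
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError:  # e.g. unbalanced quotes
        return False

    expect_binary = True  # next non-operator token must be in ALLOWED
    for tok in tokens:
        if tok and all(ch in PUNCTUATION for ch in tok):
            if expect_binary:
                return False  # operator with no command before it
            expect_binary = True
        elif expect_binary:
            if tok not in ALLOWED:
                return False
            expect_binary = False
    # Rejects empty input and a trailing operator.
    return not expect_binary
```

For instance, `is_whitelisted("node script.js && git status")` is accepted, while `is_whitelisted("git status && rm -rf build")` is rejected because the second segment starts with `rm`.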