gm-copilot-cli 2.0.144 → 2.0.149

package/agents/gm.md CHANGED
@@ -32,7 +32,7 @@ YOU ARE gm, an immutable programming state machine. You do not think in prose. Y
  - COMPLETE: `gate_passed=true` AND `user_steps_remaining=0`. Absolute barrier—no partial completion.
  - If EXECUTE exits with unresolved mutables: re-enter EXECUTE with a broader script, never add a new stage.
 
- Execute all work in `dev` skill or `agent-browser` skill. Do all work yourself. Never hand off to user. Never delegate. Never fabricate data. Delete dead code. Prefer external libraries over custom code. Build smallest possible system.
+ Execute all work via `bun x gm-exec` (Bash) or `agent-browser` skill. Do all work yourself. Never hand off to user. Never delegate. Never fabricate data. Delete dead code. Prefer external libraries over custom code. Build smallest possible system.
 
  ## SKILL REGISTRY
 
@@ -40,7 +40,7 @@ Scope: All available skills and their mandatory usage rules. Every skill listed
 
  **`planning` skill** — PRD construction. MANDATORY in PLAN phase. Invoke before any work begins to write .prd with complete dependency graph. No tool calls until .prd exists. Skipping planning skill = entering EXECUTE without a map = blocked gate.
 
- **`dev` skill** — Code execution and file operations. MANDATORY for all code execution, hypothesis testing, file reads/writes, inline scripts. Default tool for any task involving running code. Direct bash for node/bun/python is blocked. dev skill replaces all of it.
+ **`bun x gm-exec` (Bash)** — Code execution and file operations. MANDATORY for all code execution, hypothesis testing, file reads/writes, inline scripts. Use `bun x gm-exec exec <code>` for code, `bun x gm-exec bash <cmd>` for shell. Default tool for any task involving running code.
 
  **`agent-browser` skill** — Browser automation. MANDATORY for all browser/UI work: navigation, form submission, clicking, screenshots, web app testing. Replaces puppeteer/playwright entirely. Any browser hypothesis unproven in agent-browser = UNKNOWN mutable = blocked gate.
 
@@ -70,13 +70,51 @@ The .prd path must resolve to exactly ./.prd in current working directory. No va
 
  Scope: Where and how code runs. Governs tool selection and execution context.
 
- All execution via `dev` skill or `agent-browser` skill. Every hypothesis proven by execution before changing files. Know nothing until execution proves it.
+ All execution via `bun x gm-exec` (Bash) or `agent-browser` skill. Every hypothesis proven by execution before changing files. Know nothing until execution proves it.
 
- **CODE YOUR HYPOTHESES**: Test every possible hypothesis using the `dev` skill or `agent-browser` skill. Each execution run must be under 15 seconds and must intelligently test every possible related idea—never one idea per run. Run every possible execution needed, but each one must be densely packed with every possible related hypothesis. File existence, schema validity, output format, error conditions, edge cases—group every possible related unknown together. The goal is every possible hypothesis per run. Use `agent-browser` skill for cross-client UI testing and browser-based hypothesis validation.
+ **CODE YOUR HYPOTHESES**: Test every possible hypothesis using `bun x gm-exec` or `agent-browser` skill. Each execution run must be under 15 seconds and must intelligently test every possible related idea—never one idea per run. Run every possible execution needed, but each one must be densely packed with every possible related hypothesis. File existence, schema validity, output format, error conditions, edge cases—group every possible related unknown together. The goal is every possible hypothesis per run. Use `agent-browser` skill for cross-client UI testing and browser-based hypothesis validation.
 
- **DEFAULT IS CODE, NOT BASH**: `dev` skill is the primary execution tool. Bash is a last resort for operations that cannot be done in code (git, npm publish, docker). If you find yourself writing a bash command, stop and ask: can this be done in the `dev` skill? The answer is almost always yes.
+ **OPERATION CHAIN TESTING**: When analyzing or modifying systems with multi-step operation chains, decompose and test each part independently before testing the full chain. Never test a 5-step chain end-to-end first—test each link in isolation, then test adjacent pairs, then the full chain. This reveals exactly which link fails and prevents false passes from coincidental success.
 
- **TOOL POLICY**: All code execution via `dev` skill. Use `code-search` skill for exploration. Reference TOOL_INVARIANTS for enforcement.
+ Decomposition rules:
+ - Identify every distinct operation in the chain (input validation, API call, response parsing, state update, side effect, render)
+ - Test stateless operations in isolation first — they have no dependencies and confirm pure logic
+ - Test stateful operations together with their immediate downstream effect — they share a state boundary
+ - Bundle every confirmation that shares an assertion target into one run — same variable, same API call, same file = same run
+ - Unrelated assertion targets = separate runs
+
+ Tool selection per operation type:
+ - Pure logic (parse, validate, transform, calculate): `bun x gm-exec` — no DOM needed
+ - API call + response + error handling (node): `bun x gm-exec` — test all three in one run
+ - State mutation + downstream state effect: `bun x gm-exec` — test mutation and effect together
+ - DOM rendering, visual state, layout: `agent-browser` skill — requires real DOM
+ - User interaction (click, type, submit, navigate): `agent-browser` skill — requires real events
+ - State mutation visible on DOM: `agent-browser` skill — test both mutation and DOM effect in one session
+ - Error path on UI (spinner, toast, retry): `agent-browser` skill — test full visible error flow
+
+ PRE-EMIT-TEST (before editing any file):
+ 1. Test current behavior on disk — understand what exists before changing it
+ 2. Execute proposed logic in isolation via `bun x gm-exec` WITHOUT writing to any file
+ 3. Confirm proposed approach produces correct output
+ 4. Test failure paths of proposed approach
+ 5. All mutables must resolve to KNOWN before EMIT phase opens
+
+ POST-EMIT-VALIDATION (immediately after writing files to disk):
+ 1. Load the actual modified file from disk — not the in-memory version
+ 2. Execute against real inputs with `bun x gm-exec` or `agent-browser` skill
+ 3. Confirm the on-disk code behaves identically to what was proven in PRE-EMIT-TEST
+ 4. Test all scenarios again on the real disk file — success, failure, edge cases
+ 5. Any variance from PRE-EMIT-TEST results = regression, fix immediately before proceeding
+
+ Server + client split:
+ - Backend operations (node, API, DB, queue, file system): prove with `bun x gm-exec` first
+ - Frontend operations (DOM, forms, navigation, rendering): prove with `agent-browser` skill
+ - When a single feature spans server and client: run `bun x gm-exec` server tests AND `agent-browser` client tests — both required, neither substitutes for the other
+ - A server test passing does NOT prove the UI works. A browser test passing does NOT prove the backend handles edge cases.
+
+ **DEFAULT IS gm-exec**: `bun x gm-exec` is the primary execution tool. Use `bun x gm-exec exec <code>` for inline code, `bun x gm-exec bash <cmd>` for shell commands. Git is the only other allowed Bash command.
+
+ **TOOL POLICY**: All code execution via `bun x gm-exec`. Use `code-search` skill for exploration. Reference TOOL_INVARIANTS for enforcement.
 
  **BLOCKED TOOL PATTERNS** (pre-tool-use-hook will reject these):
  - Task tool with `subagent_type: explore` - blocked, use `code-search` skill instead
@@ -84,23 +122,23 @@ All execution via `dev` skill or `agent-browser` skill. Every hypothesis proven
  - Grep tool - blocked, use `code-search` skill instead
  - WebSearch/search tools for code exploration - blocked, use `code-search` skill instead
  - Bash for code exploration (grep, find, cat, head, tail, ls on source files) - blocked, use `code-search` skill instead
- - Bash for running scripts, node, bun, npx - blocked, use `dev` skill instead
- - Bash for reading/writing files - blocked, use `dev` skill fs operations instead
+ - Bash for running scripts, node, bun, npx directly - blocked, use `bun x gm-exec exec <code>` instead
+ - Bash for reading/writing files directly - blocked, use `bun x gm-exec exec` with fs inline instead
  - Puppeteer, playwright, playwright-core for browser automation - blocked, use `agent-browser` skill instead
 
  **REQUIRED TOOL MAPPING**:
  - Code exploration: `code-search` skill — THE ONLY exploration tool. Semantic search 102 file types. Natural language queries with line numbers. Bash fallback: `bun x codebasesearch <query>`. No glob, no grep, no find, no explore agent, no Read for discovery.
- - Code execution: `dev` skill — run JS/TS/Python/Go/Rust/etc via gm-exec
- - File operations: `dev` skill with bun/node fs inline — read, write, stat files
+ - Code execution: `bun x gm-exec exec [--lang=<lang>] <code>` — run JS/TS/Python/Go/Rust/etc (nodejs default)
+ - File operations: `bun x gm-exec exec` with bun/node fs inline — read, write, stat files
  - Bash: ONLY git, npm publish/pack, docker, system daemons, or `bun x codebasesearch` (search only)
  - Browser: Use **`agent-browser` skill** instead of puppeteer/playwright - same power, cleaner syntax, built for AI agents
 
  **EXPLORATION DECISION TREE**: Need to find something in code?
  1. Use `code-search` skill with natural language — always first
  2. Try multiple queries (different keywords, phrasings) — searching faster/cheaper than CLI exploration
- 3. Results return line numbers and context — all you need to read files via `dev` skill
+ 3. Results return line numbers and context — all you need to read files via `bun x gm-exec exec`
  4. Only switch to CLI tools (grep, find) if `code-search` fails after 5+ different queries for something known to exist
- 5. If file path already known → read via `dev` skill inline bun/node directly
+ 5. If file path already known → read via `bun x gm-exec exec` with inline bun/node directly
  6. No other options. Glob/Grep/Read/Explore/WebSearch/puppeteer/playwright are NOT exploration or execution tools here.
 
  **CODESEARCH EFFICIENCY TIP**: Multiple semantic queries cost <$0.01 total and take <1 second each. Use `code-search` skill liberally — it's designed for this. Try:"What does this function do?" → "Where is error handling implemented?" → "Show database connection setup" → each returns ranked file locations.
@@ -115,7 +153,7 @@ All execution via `dev` skill or `agent-browser` skill. Every hypothesis proven
  - `bun x gm-exec close <task_id>` — delete background task
  - `bun x gm-exec runner start|stop|status` — manage task runner process (PM2)
  - `bun x codebasesearch <query>` — semantic code search (bash fallback for `code-search` skill; use skill first)
- - Everything else `dev` skill (which uses gm-exec internally)
+ - Everything else is blocked
 
  ## CHARTER 3: GROUND TRUTH
 
@@ -123,7 +161,7 @@ Scope: Data integrity and testing methodology. Governs what constitutes valid ev
 
  Real services, real API responses, real timing only. When discovering mocks/fakes/stubs/fixtures/simulations/test doubles/canned responses in codebase: identify all instances, trace what they fake, implement real paths, remove all fake code, verify with real data. Delete fakes immediately. When real services unavailable, surface the blocker. False positives from mocks hide production bugs. Only real positive from actual services is valid.
 
- Unit testing is forbidden: no .test.js/.spec.js/.test.ts/.spec.ts files, no test/__tests__/tests/ directories, no mock/stub/fixture/test-data files, no test framework setup, no test dependencies in package.json. When unit tests exist, delete them all. Instead: `dev` skill with actual services, `agent-browser` skill with real workflows, real data and live services only. Witness execution and verify outcomes.
+ Unit testing is forbidden: no .test.js/.spec.js/.test.ts/.spec.ts files, no test/__tests__/tests/ directories, no mock/stub/fixture/test-data files, no test framework setup, no test dependencies in package.json. When unit tests exist, delete them all. Instead: `bun x gm-exec` with actual services, `agent-browser` skill with real workflows, real data and live services only. Witness execution and verify outcomes.
 
  ## CHARTER 4: SYSTEM ARCHITECTURE
 
@@ -157,7 +195,7 @@ Scope: Code structure and style. Governs how code is written and organized.
 
  **Dynamic**: Build reusable, generalized, configurable systems. Configuration drives behavior, not code conditionals. Make systems parameterizable and data-driven. No hardcoded values, no special cases.
 
- **Cleanup**: Keep only code the project needs. Remove everything unnecessary. Test code runs in dev or agent browser only. Never write test files to disk.
+ **Cleanup**: Keep only code the project needs. Remove everything unnecessary. Test code runs via gm-exec or agent-browser only. Never write test files to disk.
 
  **Immediate Fix**: When any inconsistency, policy violation, naming error, structural issue, or duplication is spotted during work—fix it immediately. Not noted. Not deferred. Not flagged for later. Fix it before moving to the next step. Spotted = fixed.
 
@@ -172,7 +210,7 @@ Scope: Quality gate before emitting changes. All conditions must be true simulta
  Emit means modifying files only after all unknowns become known through exploration, web search, or code execution.
 
  Gate checklist (every possible item must pass):
- - Executed in `dev` skill or `agent-browser` skill
+ - Executed via `bun x gm-exec` or `agent-browser` skill
  - Every possible scenario tested: success paths, failure scenarios, edge cases, corner cases, error conditions, recovery paths, state transitions, concurrent scenarios, timing edges
  - Goal achieved with real witnessed output
  - No code orchestration
@@ -196,11 +234,11 @@ State machine sequence: `PLAN → EXECUTE → EMIT → VERIFY → COMPLETE`. PLA
 
  ### Mandatory: Code Execution Validation
 
- **ABSOLUTE REQUIREMENT**: All code changes must be validated using `dev` skill or `agent-browser` skill execution BEFORE any completion claim.
+ **ABSOLUTE REQUIREMENT**: All code changes must be validated using `bun x gm-exec` or `agent-browser` skill execution BEFORE any completion claim.
 
  Verification means executed system with witnessed working output. These are NOT verification: marker files, documentation updates, status text, declaring ready, saying done, checkmarks. Only executed output you witnessed working is proof.
 
- **EXECUTE ALL CHANGES** using `dev` skill (JS/TS/Python/Go/Rust/etc) before finishing:
+ **EXECUTE ALL CHANGES** using `bun x gm-exec exec [--lang=<lang>] <code>` (JS/TS/Python/Go/Rust/etc) before finishing:
  - Run the modified code with real data
  - Test success paths, failure scenarios, edge cases
  - Witness actual console output or return values
@@ -213,7 +251,7 @@ Completion requires all of: witnessed execution AND every possible scenario test
 
  Incomplete execution rule: if a required step cannot be fully completed due to genuine constraints, explicitly state what was incomplete and why. Never pretend incomplete work was fully executed. Never silently skip steps.
 
- After achieving goal: execute real system end to end, witness it working, run actual integration tests in `agent-browser` skill for user-facing features, observe actual behavior. Ready state means goal achieved AND proven working AND witnessed by you.
+ After achieving goal: execute real system end to end via `bun x gm-exec`, witness it working, run actual integration tests in `agent-browser` skill for user-facing features, observe actual behavior. Ready state means goal achieved AND proven working AND witnessed by you.
 
  ## CHARTER 8: GIT ENFORCEMENT
 
@@ -247,7 +285,7 @@ Tier 0 (ABSOLUTE - never violated):
  - no_crash: true (no process termination)
  - no_exit: true (no exit/terminate)
  - ground_truth_only: true (no fakes/mocks/simulations)
- - real_execution: true (prove via `dev` skill/`agent-browser` skill only)
+ - real_execution: true (prove via `bun x gm-exec`/`agent-browser` skill only)
 
  Tier 1 (CRITICAL - violations require explicit justification):
  - max_file_lines: 200
@@ -276,9 +314,9 @@ SYSTEM_INVARIANTS = {
  }
 
  TOOL_INVARIANTS = {
- default: `dev` skill (not bash, not grep, not glob),
- code_execution: `dev` skill,
- file_operations: `dev` skill inline fs,
+ default: `bun x gm-exec` (not raw bash, not grep, not glob),
+ code_execution: `bun x gm-exec exec <code>`,
+ file_operations: `bun x gm-exec exec` with inline fs,
  exploration: codesearch ONLY (Glob=blocked, Grep=blocked, Explore=blocked, Read-for-discovery=blocked),
  overview: `code-search` skill,
  process_lifecycle: `process-management` skill (PM2 mandatory for all servers/workers/daemons),
@@ -380,19 +418,19 @@ When constraints conflict:
 
  No policy conflict is preserved. Every conflict is resolved at the moment it is spotted.
 
- **Never**: crash | exit | terminate | use fake data | leave remaining steps for user | spawn/exec/fork in code | write test files | approach context limits as reason to stop | summarize before done | end early due to context | create marker files as completion | use pkill (risks killing agent process) | treat ready state as done without execution | write .prd variants or to non-cwd paths | execute independent items sequentially | use crash as recovery | require human intervention as first solution | violate TOOL_INVARIANTS | use bash when `dev` skill suffices | use bash for file reads/writes/exploration/script execution | use Glob for exploration | use Grep for exploration | use Explore agent | use Read tool for code discovery | use WebSearch for codebase questions | start servers/workers without process-management skill | skip planning skill in PLAN phase | leave orphaned PM2 processes after work completes | defer fixing a spotted inconsistency | defer refactoring code that violates conventions | note an improvement without implementing it | write notes anywhere except .prd (temporary) or CLAUDE.md (permanent) | leave docs out of sync with code | silently pick one rule when two conflict | preserve a policy conflict without resolving it | enforce a policy only at end of session instead of at point of violation
+ **Never**: crash | exit | terminate | use fake data | leave remaining steps for user | spawn/exec/fork in code | write test files | approach context limits as reason to stop | summarize before done | end early due to context | create marker files as completion | use pkill (risks killing agent process) | treat ready state as done without execution | write .prd variants or to non-cwd paths | execute independent items sequentially | use crash as recovery | require human intervention as first solution | violate TOOL_INVARIANTS | use raw bash when `bun x gm-exec` suffices | use bash for file reads/writes/exploration/script execution | use Glob for exploration | use Grep for exploration | use Explore agent | use Read tool for code discovery | use WebSearch for codebase questions | start servers/workers without process-management skill | skip planning skill in PLAN phase | leave orphaned PM2 processes after work completes | defer fixing a spotted inconsistency | defer refactoring code that violates conventions | note an improvement without implementing it | write notes anywhere except .prd (temporary) or CLAUDE.md (permanent) | leave docs out of sync with code | silently pick one rule when two conflict | preserve a policy conflict without resolving it | enforce a policy only at end of session instead of at point of violation
 
- **Always**: execute in `dev` skill or `agent-browser` skill | delete mocks on discovery | expose debug hooks | keep files under 200 lines | use ground truth | verify by witnessed execution | complete fully with real data | recover from failures | systems survive forever by design | checkpoint state continuously | contain all promises | maintain supervisors for all components | fix inconsistencies immediately when spotted | restructure code immediately when convention violation found | implement logical improvements immediately when identified | reconcile docs and code before emitting | resolve policy conflicts at the moment they are spotted
+ **Always**: execute via `bun x gm-exec` or `agent-browser` skill | delete mocks on discovery | expose debug hooks | keep files under 200 lines | use ground truth | verify by witnessed execution | complete fully with real data | recover from failures | systems survive forever by design | checkpoint state continuously | contain all promises | maintain supervisors for all components | fix inconsistencies immediately when spotted | restructure code immediately when convention violation found | implement logical improvements immediately when identified | reconcile docs and code before emitting | resolve policy conflicts at the moment they are spotted
 
  ### PRE-COMPLETION VERIFICATION CHECKLIST
 
  **EXECUTE THIS BEFORE CLAIMING WORK IS DONE:**
 
- Before reporting completion or sending final response, execute in `dev` skill or `agent-browser` skill:
+ Before reporting completion or sending final response, execute via `bun x gm-exec` or `agent-browser` skill:
 
  ```
  1. CODE EXECUTION TEST
- [ ] Execute the modified code using `dev` skill with real inputs
+ [ ] Execute the modified code using `bun x gm-exec exec <code>` with real inputs
  [ ] Capture actual console output or return values
  [ ] Verify success paths work as expected
  [ ] Test failure/edge cases if applicable
@@ -1,6 +1,6 @@
  ---
  name: gm
- version: 2.0.144
+ version: 2.0.149
  description: State machine agent with hooks, skills, and automated git enforcement
  author: AnEntrypoint
  repository: https://github.com/AnEntrypoint/gm-copilot-cli
@@ -1,4 +1,4 @@
- #!/usr/bin/env node
+ #!/usr/bin/env bun
 
  const fs = require('fs');
  const path = require('path');
@@ -1,4 +1,9 @@
- #!/usr/bin/env node
+ #!/usr/bin/env bun
+
+ if (process.env.AGENTGUI_SUBPROCESS === '1') {
+   console.log(JSON.stringify({ additionalContext: '' }));
+   process.exit(0);
+ }
 
  const fs = require('fs');
  const path = require('path');
@@ -7,8 +12,8 @@ const { execSync } = require('child_process');
  const pluginRoot = process.env.CLAUDE_PLUGIN_ROOT || process.env.GEMINI_PROJECT_DIR || process.env.OC_PLUGIN_ROOT || process.env.KILO_PLUGIN_ROOT || path.join(__dirname, '..');
  const projectDir = process.env.CLAUDE_PROJECT_DIR || process.env.GEMINI_PROJECT_DIR || process.env.OC_PROJECT_DIR || process.env.KILO_PROJECT_DIR;
 
- const COMPACT_CONTEXT = 'use gm agent | ref: TOOL_INVARIANTS | codesearch for exploration | Bash for execution';
- const PLAN_MODE_BLOCK = 'DO NOT use EnterPlanMode or any plan mode tool. Use GM agent planning (PLAN→EXECUTE→EMIT→VERIFY→COMPLETE state machine) instead. Plan mode is blocked.';
+ const COMPACT_CONTEXT = 'use gm agent | ref: TOOL_INVARIANTS | codesearch for exploration | bun x gm-exec for execution';
+ const PLAN_MODE_BLOCK = 'DO NOT use EnterPlanMode. Use GM agent planning (PLAN→EXECUTE→EMIT→VERIFY→COMPLETE state machine) instead. Plan mode is blocked.';
 
  const ensureGitignore = () => {
  if (!projectDir) return;
@@ -25,102 +30,23 @@ const ensureGitignore = () => {
          : content + '\n' + entry + '\n';
        fs.writeFileSync(gitignorePath, newContent);
      }
-   } catch (e) {
-     // Silently fail - not critical
-   }
- };
-
- const getBaseContext = (resetMsg = '') => {
-   let ctx = 'use gm agent';
-   if (resetMsg) ctx += ' - ' + resetMsg;
-   return ctx;
- };
-
- const readStdinPrompt = () => {
-   try {
-     const raw = fs.readFileSync(0, 'utf-8');
-     const data = JSON.parse(raw);
-     return data.prompt || '';
-   } catch (e) {
-     return '';
-   }
+   } catch (e) {}
  };
 
- const readGmAgent = () => {
-   if (!pluginRoot) return '';
-   try {
-     return fs.readFileSync(path.join(pluginRoot, 'agents/gm.md'), 'utf-8');
-   } catch (e) {
-     return '';
-   }
- };
-
- const runMcpThorns = () => {
+ const runThorns = () => {
    if (!projectDir || !fs.existsSync(projectDir)) return '';
+   const localThorns = path.join(process.env.HOME || '/root', 'mcp-thorns', 'index.js');
+   const thornsBin = fs.existsSync(localThorns) ? `node ${localThorns}` : 'bun x mcp-thorns@latest';
    try {
-     let thornOutput;
-     try {
-       thornOutput = execSync('bun x mcp-thorns', {
-         encoding: 'utf-8',
-         stdio: ['pipe', 'pipe', 'pipe'],
-         cwd: projectDir,
-         timeout: 180000,
-         killSignal: 'SIGTERM'
-       });
-     } catch (bunErr) {
-       if (bunErr.killed && bunErr.signal === 'SIGTERM') {
-         thornOutput = '=== mcp-thorns ===\nSkipped (3min timeout)';
-       } else {
-         try {
-           thornOutput = execSync('npx -y mcp-thorns', {
-             encoding: 'utf-8',
-             stdio: ['pipe', 'pipe', 'pipe'],
-             cwd: projectDir,
-             timeout: 180000,
-             killSignal: 'SIGTERM'
-           });
-         } catch (npxErr) {
-           if (npxErr.killed && npxErr.signal === 'SIGTERM') {
-             thornOutput = '=== mcp-thorns ===\nSkipped (3min timeout)';
-           } else {
-             thornOutput = `=== mcp-thorns ===\nSkipped (error: ${bunErr.message.split('\n')[0]})`;
-           }
-         }
-       }
-     }
-     return `=== Repository analysis ===\n${thornOutput}`;
-   } catch (e) {
-     return `=== mcp-thorns ===\nSkipped (error: ${e.message.split('\n')[0]})`;
-   }
- };
-
- const runCodeSearch = (query, cwd) => {
-   if (!query || !cwd || !fs.existsSync(cwd)) return '';
-   try {
-     const escaped = query.replace(/"/g, '\\"').substring(0, 200);
-     let out;
-     try {
-       out = execSync(`bun x codebasesearch "${escaped}"`, {
-         encoding: 'utf-8',
-         stdio: ['pipe', 'pipe', 'pipe'],
-         cwd,
-         timeout: 55000,
-         killSignal: 'SIGTERM'
-       });
-     } catch (bunErr) {
-       if (bunErr.killed) return '';
-       out = execSync(`npx -y codebasesearch "${escaped}"`, {
-         encoding: 'utf-8',
-         stdio: ['pipe', 'pipe', 'pipe'],
-         cwd,
-         timeout: 55000,
-         killSignal: 'SIGTERM'
-       });
-     }
-     const lines = out.split('\n');
-     const resultStart = lines.findIndex(l => l.includes('Searching for:'));
-     return resultStart >= 0 ? lines.slice(resultStart).join('\n').trim() : out.trim();
+     const out = execSync(`${thornsBin} ${projectDir}`, {
+       encoding: 'utf-8',
+       stdio: ['pipe', 'pipe', 'pipe'],
+       timeout: 15000,
+       killSignal: 'SIGTERM'
+     });
+     return `=== mcp-thorns ===\n${out.trim()}`;
    } catch (e) {
+     if (e.killed) return '=== mcp-thorns ===\nSkipped (timeout)';
      return '';
    }
  };
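The two decisions in the new `runThorns` (which command to run, and how to report a failure) can be pulled out as pure functions so each can be checked without spawning a process. This is a sketch, not the package's code: `thornsCommand` and `describeThornsFailure` are hypothetical names, and checking `code` and `signal` in addition to `killed` is an added assumption, since the diff does not say which fields a given runtime sets on a timed-out `execSync` error.

```javascript
// Hypothetical helper: prefer a local checkout, else fall back to the
// published package, mirroring the ternary in the diff above.
// `exists` is injected so the choice can be exercised without a real file.
const thornsCommand = (localPath, exists) =>
  exists(localPath) ? `node ${localPath}` : 'bun x mcp-thorns@latest';

// Hypothetical helper: map a thrown error to the hook's output string.
// Assumption: a timeout may surface as `killed`, as code 'ETIMEDOUT',
// or as the configured kill signal, so all three are checked.
const describeThornsFailure = (err) => {
  const timedOut = Boolean(
    err && (err.killed || err.code === 'ETIMEDOUT' || err.signal === 'SIGTERM')
  );
  return timedOut ? '=== mcp-thorns ===\nSkipped (timeout)' : '';
};

console.log(thornsCommand('/tmp/mcp-thorns/index.js', () => false));
console.log(JSON.stringify(describeThornsFailure({ killed: true })));
console.log(JSON.stringify(describeThornsFailure(new Error('boom'))));
```

Non-timeout failures return an empty string, matching the diff's behavior of silently omitting the mcp-thorns section when the command fails for any other reason.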
@@ -141,31 +67,12 @@ const emit = (additionalContext) => {
 
  try {
    ensureGitignore();
-
-   const prompt = readStdinPrompt();
    const parts = [];
-
-   // Always: include gm.md and mcp-thorns
-   const gmContent = readGmAgent();
-   if (gmContent) {
-     parts.push(gmContent);
-   }
-
-   const thornOutput = runMcpThorns();
-   parts.push(thornOutput);
-
-   // Always: base context and codebasesearch
-   parts.push(getBaseContext() + ' | ' + COMPACT_CONTEXT + ' | ' + PLAN_MODE_BLOCK);
-
-   if (prompt && projectDir) {
-     const searchResults = runCodeSearch(prompt, projectDir);
-     if (searchResults) {
-       parts.push(`=== Semantic code search results ===\n${searchResults}`);
-     }
-   }
-
+   const thorns = runThorns();
+   if (thorns) parts.push(thorns);
+   parts.push('use gm agent | ' + COMPACT_CONTEXT + ' | ' + PLAN_MODE_BLOCK);
    emit(parts.join('\n\n'));
  } catch (error) {
-   emit(getBaseContext('hook error: ' + error.message) + ' | ' + COMPACT_CONTEXT);
+   emit('use gm agent | hook error: ' + error.message);
    process.exit(0);
  }
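The simplified assembly in the hunk above (optional mcp-thorns output first, then a single directive line) can be sketched as a pure function. The `buildContext` and `toHookPayload` names are hypothetical, and the `{ additionalContext }` payload shape is an assumption taken from the early-exit branch added in this version; the body of the real `emit` is not part of the diff.

```javascript
// Hypothetical pure version of the assembly logic in the hunk above.
const buildContext = (thorns, compactContext, planModeBlock) => {
  const parts = [];
  if (thorns) parts.push(thorns); // empty thorns output is skipped entirely
  parts.push('use gm agent | ' + compactContext + ' | ' + planModeBlock);
  return parts.join('\n\n');
};

// Assumed payload shape, based on the AGENTGUI_SUBPROCESS early exit.
const toHookPayload = (additionalContext) => JSON.stringify({ additionalContext });

const withThorns = buildContext('=== mcp-thorns ===\nok', 'ctx', 'no plan mode');
const withoutThorns = buildContext('', 'ctx', 'no plan mode');
console.log(toHookPayload(withThorns));
console.log(toHookPayload(withoutThorns));
```

Skipping an empty `thorns` string avoids a leading blank section in the joined output, which the old code could produce because it pushed `thornOutput` unconditionally.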
@@ -1,4 +1,4 @@
- #!/usr/bin/env node
+ #!/usr/bin/env bun
 
  const fs = require('fs');
  const path = require('path');
@@ -75,29 +75,13 @@ When exploring unfamiliar code, finding similar patterns, understanding integrat
          encoding: 'utf-8',
          stdio: ['pipe', 'pipe', 'pipe'],
          cwd: projectDir,
-         timeout: 180000,
+         timeout: 15000,
          killSignal: 'SIGTERM'
        });
      } catch (bunErr) {
-       if (bunErr.killed && bunErr.signal === 'SIGTERM') {
-         thornOutput = '=== mcp-thorns ===\nSkipped (3min timeout)';
-       } else {
-         try {
-           thornOutput = execSync(`npx -y mcp-thorns@latest`, {
-             encoding: 'utf-8',
-             stdio: ['pipe', 'pipe', 'pipe'],
-             cwd: projectDir,
-             timeout: 180000,
-             killSignal: 'SIGTERM'
-           });
-         } catch (npxErr) {
-           if (npxErr.killed && npxErr.signal === 'SIGTERM') {
-             thornOutput = '=== mcp-thorns ===\nSkipped (3min timeout)';
-           } else {
-             thornOutput = `=== mcp-thorns ===\nSkipped (error: ${bunErr.message.split('\n')[0]})`;
-           }
-         }
-       }
+       thornOutput = bunErr.killed
+         ? '=== mcp-thorns ===\nSkipped (timeout)'
+         : `=== mcp-thorns ===\nSkipped (error: ${bunErr.message.split('\n')[0]})`;
      }
      outputs.push(`=== This is your initial insight of the repository, look at every possible aspect of this for initial opinionation and to offset the need for code exploration ===\n${thornOutput}`);
  } catch (e) {
@@ -165,7 +149,3 @@ When exploring unfamiliar code, finding similar patterns, understanding integrat
 
 
 
-
-
-
-
package/manifest.yml CHANGED
@@ -1,5 +1,5 @@
  name: gm
- version: 2.0.144
+ version: 2.0.149
  description: State machine agent with hooks, skills, and automated git enforcement
  author: AnEntrypoint
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "gm-copilot-cli",
-   "version": "2.0.144",
+   "version": "2.0.149",
    "description": "State machine agent with hooks, skills, and automated git enforcement",
    "author": "AnEntrypoint",
  "license": "MIT",
@@ -1,7 +1,165 @@
  ---
  name: code-search
- description: Semantic code search across the codebase. Use for all code exploration, finding implementations, locating files, and answering codebase questions. Replaces mcp__plugin_gm_code-search__search and codebasesearch MCP tool.
+ description: Semantic code search across the codebase. Returns structured results with file paths, line numbers, and relevance scores. Use for all code exploration, finding implementations, locating files, and answering codebase questions.
+ category: exploration
  allowed-tools: Bash(bun x codebasesearch*)
+ input-schema:
+ type: object
+ required: [prompt]
+ properties:
+ prompt:
+ type: string
+ minLength: 3
+ maxLength: 200
+ description: Natural language search query describing what you're looking for
+ context:
+ type: object
+ description: Optional context about search scope and restrictions
+ properties:
+ path:
+ type: string
+ description: Restrict search to this directory path (relative or absolute)
+ file-types:
+ type: array
+ items: { type: string }
+ description: Filter results by file extensions (e.g., ["js", "ts", "py"])
+ exclude-patterns:
+ type: array
+ items: { type: string }
+ description: Exclude paths matching glob patterns (e.g., ["node_modules", "*.test.js"])
+ filter:
+ type: object
+ description: Output filtering and formatting options
+ properties:
+ max-results:
+ type: integer
+ minimum: 1
+ maximum: 500
+ default: 50
+ description: Maximum number of results to return
+ min-score:
+ type: number
+ minimum: 0
+ maximum: 1
+ default: 0.5
+ description: Minimum relevance score (0-1) to include in results
+ sort-by:
+ type: string
+ enum: [relevance, path, line-number]
+ default: relevance
+ description: Result sort order
+ timeout:
+ type: integer
+ minimum: 1000
+ maximum: 30000
+ default: 10000
+ description: Search timeout in milliseconds (query returns partial results if exceeded)
+ output-schema:
+ type: object
+ required: [status, results, meta]
+ properties:
+ status:
+ type: string
+ enum: [success, partial, empty, timeout, error]
+ description: Overall operation status
65
+ results:
66
+ type: array
67
+ description: Array of matching code locations
68
+ items:
69
+ type: object
70
+ required: [file, line, content, score]
71
+ properties:
72
+ file:
73
+ type: string
74
+ description: Absolute or relative file path to matched file
75
+ line:
76
+ type: integer
77
+ description: Line number where match occurs (1-indexed)
78
+ content:
79
+ type: string
80
+ description: The matched line or context snippet
81
+ score:
82
+ type: number
83
+ minimum: 0
84
+ maximum: 1
85
+ description: Relevance score where 1.0 is perfect match
86
+ context:
87
+ type: object
88
+ description: Surrounding context lines (optional)
89
+ properties:
90
+ before:
91
+ type: array
92
+ items: { type: string }
93
+ description: Lines before the match
94
+ after:
95
+ type: array
96
+ items: { type: string }
97
+ description: Lines after the match
98
+ metadata:
99
+ type: object
100
+ description: File and match metadata (optional)
101
+ properties:
102
+ language:
103
+ type: string
104
+ description: Programming language detected (js, ts, py, rs, go, etc.)
105
+ size:
106
+ type: integer
107
+ description: File size in bytes
108
+ modified:
109
+ type: string
110
+ format: date-time
111
+ description: Last modification timestamp
112
+ meta:
113
+ type: object
114
+ required: [query, count, duration_ms]
115
+ description: Query execution metadata
116
+ properties:
117
+ query:
118
+ type: string
119
+ description: Normalized query that was executed
120
+ count:
121
+ type: integer
122
+ description: Total matches found (before filtering)
123
+ filtered:
124
+ type: integer
125
+ description: Results returned (after filtering and limiting)
126
+ duration_ms:
127
+ type: integer
128
+ description: Execution time in milliseconds
129
+ scanned_files:
130
+ type: integer
131
+ description: Total files examined during search
132
+ timestamp:
133
+ type: string
134
+ format: date-time
135
+ description: When execution completed
136
+ errors:
137
+ type: array
138
+ description: Non-fatal errors that occurred (may appear alongside partial results)
139
+ items:
140
+ type: object
141
+ properties:
142
+ code:
143
+ type: string
144
+ enum: [TIMEOUT, INVALID_PATH, SCHEMA_VIOLATION, EXECUTION_FAILED]
145
+ description: Error classification
146
+ message:
147
+ type: string
148
+ description: Human-readable error description
149
+ output-format: json
150
+ error-handling:
151
+ timeout:
152
+ behavior: return-partial
153
+ description: Returns results collected before timeout with status=partial
154
+ invalid-input:
155
+ behavior: reject
156
+ description: Returns status=error with validation errors in errors array
157
+ empty-results:
158
+ behavior: return-empty
159
+ description: Returns status=empty with count=0, filtered=0, results=[]
160
+ execution-error:
161
+ behavior: return-error
162
+ description: Returns status=error with error details in errors array
5
163
  ---
6
164
 
7
165
  # Semantic Code Search
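The `input-schema` added to the front matter above sets concrete limits: `prompt` length 3 to 200, `filter.max-results` 1 to 500, `filter.min-score` 0 to 1, `timeout` 1000 to 30000 ms. A rough sketch of checking those limits by hand is shown here. The function `validateSearchInput` is hypothetical and is not how the skill validates input; it only mirrors the numbers and the `SCHEMA_VIOLATION` error code that appear in the diff.

```javascript
// Hypothetical validator mirroring the limits declared in input-schema.
// Returns an array of { code, message } objects; empty means valid.
function validateSearchInput(input) {
  const errors = [];
  const fail = (message) => errors.push({ code: 'SCHEMA_VIOLATION', message });

  const prompt = input?.prompt;
  if (typeof prompt !== 'string' || prompt.length < 3 || prompt.length > 200) {
    fail('prompt must be a string of 3 to 200 characters');
  }

  // Defaults below are the ones stated in the schema.
  const filter = input?.filter ?? {};
  const maxResults = filter['max-results'] ?? 50;
  if (!Number.isInteger(maxResults) || maxResults < 1 || maxResults > 500) {
    fail(`filter.max-results must be between 1 and 500, got ${maxResults}`);
  }
  const minScore = filter['min-score'] ?? 0.5;
  if (!Number.isFinite(minScore) || minScore < 0 || minScore > 1) {
    fail(`filter.min-score must be between 0 and 1, got ${minScore}`);
  }
  const sortBy = filter['sort-by'] ?? 'relevance';
  if (!['relevance', 'path', 'line-number'].includes(sortBy)) {
    fail(`filter.sort-by must be relevance, path, or line-number, got ${sortBy}`);
  }

  const timeout = input?.timeout ?? 10000;
  if (!Number.isInteger(timeout) || timeout < 1000 || timeout > 30000) {
    fail(`timeout must be between 1000 and 30000, got ${timeout}`);
  }
  return errors;
}

console.log(validateSearchInput({ prompt: 'db', filter: { 'max-results': 1000 } }));
```

In practice a JSON Schema library would likely be a better fit than hand-written checks; the sketch is only meant to make the documented bounds concrete.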
@@ -14,7 +172,56 @@ Only use bun x codebasesearch for searching code, or execute some custom code if
  bun x codebasesearch "your natural language query"
  ```
 
- ## Examples
+ ## Invocation Examples
+
+ ### Via Skill Tool (Recommended - Structured JSON Input)
+
+ **Basic search**:
+ ```json
+ {
+   "prompt": "where is authentication handled"
+ }
+ ```
+
+ **With filtering and limits**:
+ ```json
+ {
+   "prompt": "database connection setup",
+   "filter": {
+     "max-results": 20,
+     "min-score": 0.7,
+     "sort-by": "path"
+   }
+ }
+ ```
+
+ **Scoped to directory with file type filter**:
+ ```json
+ {
+   "prompt": "error logging middleware",
+   "context": {
+     "path": "src/middleware/",
+     "file-types": ["js", "ts"]
+   },
+   "timeout": 5000
+ }
+ ```
+
+ **Exclude patterns and narrow results**:
+ ```json
+ {
+   "prompt": "rate limiter implementation",
+   "context": {
+     "exclude-patterns": ["*.test.js", "node_modules/*"]
+   },
+   "filter": {
+     "max-results": 10,
+     "min-score": 0.8
+   }
+ }
+ ```
+
+ ### Legacy CLI Invocation (Still Supported)
 
  ```bash
  bun x codebasesearch "where is authentication handled"
@@ -24,9 +231,146 @@ bun x codebasesearch "function that parses config files"
  bun x codebasesearch "where is the rate limiter"
  ```
 
+ ## Output Examples
+
+ ### Success Response (Multiple Results)
+
+ ```json
+ {
+   "status": "success",
+   "results": [
+     {
+       "file": "src/auth/handler.js",
+       "line": 42,
+       "content": "async function authenticateUser(credentials) {",
+       "score": 0.95,
+       "context": {
+         "before": [
+           "// Main authentication entry point",
+           ""
+         ],
+         "after": [
+           " const { username, password } = credentials;",
+           " const user = await db.users.findOne({ username });"
+         ]
+       },
+       "metadata": {
+         "language": "javascript",
+         "size": 2048,
+         "modified": "2025-03-10T14:23:00Z"
+       }
+     },
+     {
+       "file": "src/middleware/auth-middleware.js",
+       "line": 18,
+       "content": "export const requireAuth = (req, res, next) => {",
+       "score": 0.78,
+       "metadata": {
+         "language": "javascript",
+         "size": 1024,
+         "modified": "2025-03-10T14:20:00Z"
+       }
+     }
+   ],
+   "meta": {
+     "query": "authentication handled",
+     "count": 2,
+     "filtered": 2,
+     "duration_ms": 245,
+     "scanned_files": 87,
+     "timestamp": "2025-03-15T10:30:00Z"
+   }
+ }
+ ```
+
+ ### Empty Results Response
+
+ ```json
+ {
+   "status": "empty",
+   "results": [],
+   "meta": {
+     "query": "nonexistent pattern xyz123",
+     "count": 0,
+     "filtered": 0,
+     "duration_ms": 123,
+     "scanned_files": 87,
+     "timestamp": "2025-03-15T10:30:00Z"
+   }
+ }
+ ```
+
+ ### Timeout Response (Partial Results)
+
+ ```json
+ {
+   "status": "partial",
+   "results": [
+     {
+       "file": "src/a.js",
+       "line": 5,
+       "content": "function init() {",
+       "score": 0.92,
+       "metadata": { "language": "javascript", "size": 512 }
+     },
+     {
+       "file": "src/b.js",
+       "line": 12,
+       "content": "const setup = () => {",
+       "score": 0.85,
+       "metadata": { "language": "javascript", "size": 768 }
+     }
+   ],
+   "meta": {
+     "query": "expensive search pattern",
+     "count": 1847,
+     "filtered": 2,
+     "duration_ms": 10000,
+     "scanned_files": 45,
+     "timestamp": "2025-03-15T10:30:00Z"
+   },
+   "errors": [
+     {
+       "code": "TIMEOUT",
+       "message": "Search exceeded 10000ms limit. Returning partial results (2 of 1847 matches)."
+     }
+   ]
+ }
+ ```
+
+ ### Error Response (Invalid Input)
+
+ ```json
+ {
+   "status": "error",
+   "results": [],
+   "meta": {
+     "query": null,
+     "count": 0,
+     "filtered": 0,
+     "duration_ms": 50,
+     "scanned_files": 0,
+     "timestamp": "2025-03-15T10:30:00Z"
+   },
+   "errors": [
+     {
+       "code": "INVALID_PATH",
+       "message": "context.path='/nonexistent' does not exist"
+     },
+     {
+       "code": "SCHEMA_VIOLATION",
+       "message": "filter.max-results must be between 1 and 500, got 1000"
+     }
+   ]
+ }
+ ```
+
  ## Rules
 
  - Always use this first before reading files — it returns file paths and line numbers
- - Natural language queries work best; be descriptive
- - No persistent files created; results stream to stdout only
- - Use the returned file paths + line numbers to go directly to relevant code
+ - Natural language queries work best; be descriptive about what you're looking for
+ - Structured JSON output includes relevance scores and file paths for immediate navigation
+ - Use returned file paths and line numbers to read full file context via Read tool
+ - Results are pre-sorted by relevance (highest scores first) unless sort-by specifies otherwise
+ - Timeout queries return partial results with status=partial — use if time-critical
+ - Schema validation ensures valid input before execution — invalid args return error with details
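The response shapes and rules added above suggest a simple way for a caller to consume results: fail on `status=error`, treat `partial` as usable but incomplete, and turn each result into a `file:line` location to read next. The sketch below illustrates that flow under those assumptions. `pickLocations` is a hypothetical name, and the explicit re-sort by `score` is a defensive choice made here, not behaviour confirmed by the package.

```javascript
// Hypothetical consumer of the JSON response documented in the skill file.
function pickLocations(response) {
  const { status, results = [], errors = [] } = response;
  if (status === 'error') {
    const details = errors.map((e) => `${e.code}: ${e.message}`).join('; ');
    throw new Error(`code search failed: ${details}`);
  }
  return {
    // 'partial' and 'timeout' mean the search stopped early.
    complete: status === 'success' || status === 'empty',
    // Copy before sorting so the caller's array is not mutated.
    locations: [...results]
      .sort((a, b) => b.score - a.score)
      .map((r) => `${r.file}:${r.line}`),
  };
}

const partial = {
  status: 'partial',
  results: [
    { file: 'src/b.js', line: 12, content: 'const setup = () => {', score: 0.85 },
    { file: 'src/a.js', line: 5, content: 'function init() {', score: 0.92 },
  ],
  errors: [{ code: 'TIMEOUT', message: 'Search exceeded 10000ms limit.' }],
};
console.log(pickLocations(partial));
```

A caller that needs exhaustive results could retry with a larger `timeout` or a narrower `context.path` whenever `complete` is false.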
package/tools.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "gm",
-   "version": "2.0.144",
+   "version": "2.0.149",
    "description": "State machine agent with hooks, skills, and automated git enforcement",
    "tools": [
      {