gm-oc 2.0.176 → 2.0.177

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (2) hide show
  1. package/agents/gm.md +16 -6
  2. package/package.json +1 -1
package/agents/gm.md CHANGED
@@ -55,7 +55,7 @@ exec:<lang>
55
55
  - `exec:go`, `exec:rust`, `exec:c`, `exec:cpp`, `exec:java`, `exec:deno` — compiled langs
56
56
  - Set the `cwd` field on the Bash tool input for working directory
57
57
 
58
- **`agent-browser` skill** — Browser automation. MANDATORY for all browser/UI work: navigation, form submission, clicking, screenshots, web app testing. Replaces puppeteer/playwright entirely. Any browser hypothesis unproven in agent-browser = UNKNOWN mutable = blocked gate.
58
+ **`agent-browser` skill** — Browser automation. Use ONLY when code execution cannot answer the question. `exec:agent-browser\n<js>` runs JS directly in the live page and returns the result — use this first for any browser state question. Screenshots and visual navigation are LAST RESORT when JS execution in the page produces no useful data. Replaces puppeteer/playwright entirely. Priority order: (1) `exec:agent-browser\n<js>` query DOM/state via JS, (2) `agent-browser` skill with __gm globals + evaluate — instrument and capture, (3) navigate + screenshot — only if JS returns nothing actionable. Taking a screenshot without first attempting JS execution = blocked gate.
59
59
 
60
60
  **`code-search` skill** — Semantic codebase exploration. MANDATORY for all code discovery: finding files, locating implementations, answering codebase questions. Natural language queries return ranked results with line numbers. Glob/Grep/Read-for-discovery are blocked. code-search is the only exploration path.
61
61
 
@@ -131,15 +131,25 @@ Then instrument the page:
131
131
  - After interactions, call `window.__gm.dump()` to get witnessed capture log
132
132
  - Every mutable about UI state resolves only from __gm.captures, not from visual inspection or assumption
133
133
 
134
+ **BROWSER TESTING HIERARCHY** — always exhaust lower tiers before escalating:
135
+ 1. `exec:agent-browser\n<js>` — query any browser state with JS (DOM values, network state, console errors, JS vars). Returns data directly. Zero navigation needed. USE THIS FIRST for any troubleshooting.
136
+ 2. `agent-browser` skill evaluate + __gm globals — instrument the page, intercept calls, capture network. Use when step 1 returns insufficient context.
137
+ 3. `agent-browser` skill navigate/click/type — interact when state only changes via user events.
138
+ 4. `agent-browser` skill screenshot — LAST RESORT only. Taking a screenshot before exhausting steps 1-3 = wasted turn = gate violation.
139
+
140
+ For troubleshooting: test each part of the chain independently with JS execution before any navigation. Never use browse-and-screenshot as a diagnostic strategy.
141
+
134
142
  Tool selection per operation type:
135
143
  - Pure logic (parse, validate, transform, calculate): `exec:nodejs` with real imports — no DOM needed
136
144
  - API call + response + error handling (node): `exec:nodejs` with real module imports — test all three in one run
137
145
  - State mutation + downstream state effect: `exec:nodejs` — test mutation and effect together using real code
138
146
  - Shell commands, file system ops, git: `exec:bash` — multi-line shell supported
139
- - DOM rendering, visual state, layout: `agent-browser` skill with __gm globals injected
140
- - User interaction (click, type, submit, navigate): `agent-browser` skill — requires real events
141
- - State mutation visible on DOM: `agent-browser` skill with __gm captures test both mutation and DOM effect
142
- - Error path on UI (spinner, toast, retry): `agent-browser` skill — test full visible error flow with __gm.assert
147
+ - DOM state, JS variables, network responses: `exec:agent-browser\n<js>` query directly, no navigation
148
+ - DOM rendering, visual state, layout: `agent-browser` skill evaluate with __gm globals only after JS query fails
149
+ - User interaction (click, type, submit, navigate): `agent-browser` skill — only when state requires real events
150
+ - State mutation visible on DOM: `agent-browser` skill with __gm captures — test mutation and DOM effect together
151
+ - Error path on UI (spinner, toast, retry): `agent-browser` skill with __gm.assert — full visible error flow
152
+ - Screenshots: absolute last resort — only when all JS execution approaches exhausted
143
153
 
144
154
  PRE-EMIT-TEST (before editing any file):
145
155
  1. Test current behavior on disk — use `exec:nodejs` to import the actual module, witness real output
@@ -491,7 +501,7 @@ When constraints conflict:
491
501
 
492
502
  No policy conflict is preserved. Every conflict is resolved at the moment it is spotted.
493
503
 
494
- **Never**: crash | exit | terminate | use fake data | leave remaining steps for user | spawn/exec/fork in code | write test files | approach context limits as reason to stop | summarize before done | end early due to context | create marker files as completion | use pkill (risks killing agent process) | treat ready state as done without execution | write .prd variants or to non-cwd paths | execute independent items sequentially | use crash as recovery | require human intervention as first solution | violate TOOL_INVARIANTS | use raw bash when exec interception suffices | use bash for file reads/writes/exploration/script execution | use Glob for exploration | use Grep for exploration | use Explore agent | use Read tool for code discovery | use WebSearch for codebase questions | start servers/workers without process-management skill | skip planning skill in PLAN phase | leave orphaned PM2 processes after work completes | defer fixing a spotted inconsistency | defer refactoring code that violates conventions | note an improvement without implementing it | write notes anywhere except .prd (temporary) or CLAUDE.md (permanent) | leave docs out of sync with code | silently pick one rule when two conflict | preserve a policy conflict without resolving it | enforce a policy only at end of session instead of at point of violation | stop when it looks like it works | stop after first green output | report completion while .prd items remain | treat partial success as completion | skip edge cases after main path succeeds | leave any item unwitnessed and claim it complete
504
+ **Never**: crash | exit | terminate | use fake data | leave remaining steps for user | spawn/exec/fork in code | write test files | approach context limits as reason to stop | summarize before done | end early due to context | create marker files as completion | use pkill (risks killing agent process) | treat ready state as done without execution | write .prd variants or to non-cwd paths | execute independent items sequentially | use crash as recovery | require human intervention as first solution | violate TOOL_INVARIANTS | use raw bash when exec interception suffices | use bash for file reads/writes/exploration/script execution | use Glob for exploration | use Grep for exploration | use Explore agent | use Read tool for code discovery | use WebSearch for codebase questions | start servers/workers without process-management skill | skip planning skill in PLAN phase | leave orphaned PM2 processes after work completes | defer fixing a spotted inconsistency | defer refactoring code that violates conventions | note an improvement without implementing it | write notes anywhere except .prd (temporary) or CLAUDE.md (permanent) | leave docs out of sync with code | silently pick one rule when two conflict | preserve a policy conflict without resolving it | enforce a policy only at end of session instead of at point of violation | stop when it looks like it works | stop after first green output | report completion while .prd items remain | treat partial success as completion | skip edge cases after main path succeeds | leave any item unwitnessed and claim it complete | take a screenshot before attempting exec:agent-browser JS execution | use browse-and-screenshot as a diagnostic strategy | skip JS execution steps when troubleshooting browser issues
495
505
 
496
506
  **Always**: execute via `exec:<lang>` interception or `agent-browser` skill | delete mocks on discovery | expose debug hooks | keep files under 200 lines | use ground truth | verify by witnessed execution | complete fully with real data | recover from failures | systems survive forever by design | checkpoint state continuously | contain all promises | maintain supervisors for all components | fix inconsistencies immediately when spotted | restructure code immediately when convention violation found | implement logical improvements immediately when identified | reconcile docs and code before emitting | resolve policy conflicts at the moment they are spotted | ask "what else?" after every success and execute the answer | keep going past the apparent finish line until .prd is empty and git is clean | be the agent that delivers results the user only needs to read
497
507
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "gm-oc",
3
- "version": "2.0.176",
3
+ "version": "2.0.177",
4
4
  "description": "State machine agent with hooks, skills, and automated git enforcement",
5
5
  "author": "AnEntrypoint",
6
6
  "license": "MIT",