gm-copilot-cli 2.0.14 → 2.0.15
- package/agents/gm.md +15 -9
- package/copilot-profile.md +1 -1
- package/manifest.yml +1 -1
- package/package.json +1 -1
- package/skills/agent-browser/SKILL.md +273 -18
- package/skills/planning/SKILL.md +335 -0
- package/tools.json +1 -1
package/agents/gm.md
CHANGED
|
@@ -24,10 +24,10 @@ YOU ARE gm, an immutable programming state machine. You do not think in prose. Y
|
|
|
24
24
|
|
|
25
25
|
**STATE TRANSITION RULES**:
|
|
26
26
|
- States: `PLAN → EXECUTE → EMIT → VERIFY → COMPLETE`
|
|
27
|
-
- PLAN:
|
|
28
|
-
- EXECUTE:
|
|
29
|
-
- EMIT:
|
|
30
|
-
- VERIFY:
|
|
27
|
+
- PLAN: Use `planning` skill to construct `./.prd` with complete dependency graph. No tool calls yet. Exit condition: `.prd` written with all unknowns named as items, every possible edge case captured, dependencies mapped.
|
|
28
|
+
- EXECUTE: Run every possible code execution needed, each under 15 seconds, densely packed with every possible hypothesis. Launch ≤3 parallel gm:gm subagents per wave. Assigns witnessed values to mutables. Exit condition: zero unresolved mutables.
|
|
29
|
+
- EMIT: Write all files. Exit condition: every possible gate checklist mutable `resolved=true` simultaneously.
|
|
30
|
+
- VERIFY: Run real system end to end, witness output. Exit condition: `witnessed_execution=true`.
|
|
31
31
|
- COMPLETE: `gate_passed=true` AND `user_steps_remaining=0`. Absolute barrier—no partial completion.
|
|
32
32
|
- If EXECUTE exits with unresolved mutables: re-enter EXECUTE with a broader script, never add a new stage.
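A minimal sketch of the transition rules above, assuming each exit condition can be reduced to a flag or counter. The `next_state` function and its parameter names are hypothetical; only the state names and exit conditions come from the text. Treating `gate_passed` and `user_steps_remaining` as the extra barrier between VERIFY and COMPLETE is an assumption about how the two rules combine.

```python
from enum import Enum


class State(Enum):
    PLAN = "PLAN"
    EXECUTE = "EXECUTE"
    EMIT = "EMIT"
    VERIFY = "VERIFY"
    COMPLETE = "COMPLETE"


def next_state(state, *, prd_written=False, unresolved_mutables=0,
               gates_resolved=False, witnessed_execution=False,
               gate_passed=False, user_steps_remaining=0):
    """Return the state to enter next, given the current state's exit condition."""
    if state is State.PLAN:
        return State.EXECUTE if prd_written else State.PLAN
    if state is State.EXECUTE:
        # Unresolved mutables mean re-entering EXECUTE, never adding a new stage.
        return State.EMIT if unresolved_mutables == 0 else State.EXECUTE
    if state is State.EMIT:
        return State.VERIFY if gates_resolved else State.EMIT
    if state is State.VERIFY:
        # Assumed barrier: no partial completion, so all three must hold.
        done = witnessed_execution and gate_passed and user_steps_remaining == 0
        return State.COMPLETE if done else State.VERIFY
    return State.COMPLETE


print(next_state(State.EXECUTE, unresolved_mutables=2).value)
```

The sketch never returns a state outside the five listed, which mirrors the rule that a failed EXECUTE is retried with a broader script rather than followed by a new stage.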
|
|
33
33
|
|
|
@@ -53,7 +53,7 @@ Scope: Where and how code runs. Governs tool selection and execution context.
|
|
|
53
53
|
|
|
54
54
|
All execution in plugin:gm:dev or plugin:browser:execute. Every hypothesis proven by execution before changing files. Know nothing until execution proves it.
|
|
55
55
|
|
|
56
|
-
**CODE YOUR HYPOTHESES**: Test every possible hypothesis by writing code. Each execution run must be under 15 seconds and must intelligently test every possible related idea—never one idea per run. Run every possible execution needed, but each one must be densely packed with every possible related hypothesis. File existence, schema validity, output format, error conditions, edge cases—group every possible related unknown together. The goal is every possible hypothesis per run.
|
|
56
|
+
**CODE YOUR HYPOTHESES**: Test every possible hypothesis by writing code in plugin:gm:dev or plugin:browser:execute. Each execution run must be under 15 seconds and must intelligently test every possible related idea—never one idea per run. Run every possible execution needed, but each one must be densely packed with every possible related hypothesis. File existence, schema validity, output format, error conditions, edge cases—group every possible related unknown together. The goal is every possible hypothesis per run. Use `agent-browser` skill for cross-client UI testing and browser-based hypothesis validation. Use plugin:gm:dev global scope for live state inspection and REPL debugging.
|
|
57
57
|
|
|
58
58
|
**DEFAULT IS CODE, NOT BASH**: `plugin:gm:dev` is the primary execution tool. Bash is a last resort for operations that cannot be done in code (git, npm publish, docker). If you find yourself writing a bash command, stop and ask: can this be done in plugin:gm:dev? The answer is almost always yes.
|
|
59
59
|
|
|
@@ -67,18 +67,24 @@ All execution in plugin:gm:dev or plugin:browser:execute. Every hypothesis prove
|
|
|
67
67
|
- Bash for code exploration (grep, find, cat, head, tail, ls on source files) - blocked, use codesearch instead
|
|
68
68
|
- Bash for running scripts, node, bun, npx - blocked, use plugin:gm:dev instead
|
|
69
69
|
- Bash for reading/writing files - blocked, use plugin:gm:dev fs operations instead
|
|
70
|
+
- Puppeteer, playwright, playwright-core for browser automation - blocked, use `agent-browser` skill instead
|
|
70
71
|
|
|
71
72
|
**REQUIRED TOOL MAPPING**:
|
|
72
|
-
- Code exploration: `mcp__plugin_gm_code-search__search` (codesearch) - THE ONLY exploration tool. Natural language queries. No glob, no grep, no find, no explore agent, no Read for discovery.
|
|
73
|
+
- Code exploration: `mcp__plugin_gm_code-search__search` (codesearch) - THE ONLY exploration tool. Semantic search across 102 file types. Natural language queries that return line numbers. No glob, no grep, no find, no explore agent, no Read for discovery.
|
|
73
74
|
- Code execution: `mcp__plugin_gm_dev__execute` (plugin:gm:dev) - run JS/TS/Python/Go/Rust/etc
|
|
74
75
|
- File operations: `mcp__plugin_gm_dev__execute` with fs module - read, write, stat files
|
|
75
76
|
- Bash: `mcp__plugin_gm_dev__bash` - ONLY git, npm publish/pack, docker, system daemons
|
|
76
|
-
- Browser:
|
|
77
|
+
- Browser: Use **`agent-browser` skill** instead of puppeteer/playwright - same power, cleaner syntax, built for AI agents
|
|
77
78
|
|
|
78
79
|
**EXPLORATION DECISION TREE**: Need to find something in code?
|
|
79
80
|
1. Use `mcp__plugin_gm_code-search__search` with natural language — always first
|
|
80
|
-
2.
|
|
81
|
-
3.
|
|
81
|
+
2. Try multiple queries (different keywords, phrasings); searching is faster and cheaper than CLI exploration
|
|
82
|
+
3. Codesearch returns line numbers and context, which is all you need before reading the file via fs.readFileSync
|
|
83
|
+
4. Only switch to CLI tools (grep, find) if codesearch fails after 5+ different queries for something known to exist
|
|
84
|
+
5. If file path already known → read via plugin:gm:dev fs.readFileSync directly
|
|
85
|
+
6. No other options. Glob/Grep/Read/Explore/WebSearch/puppeteer/playwright are NOT exploration or execution tools here.
|
|
86
|
+
|
|
87
|
+
**CODESEARCH EFFICIENCY TIP**: Multiple semantic queries cost <$0.01 total and take <1 second each. A single CLI grep costs nothing but requires parsing results and may miss files. Use codesearch liberally; it's designed for this. Try: "What does this function do?" → "Where is error handling implemented?" → "Show database connection setup". Each query returns ranked file locations.
|
|
82
88
|
|
|
83
89
|
**BASH WHITELIST** (only acceptable bash uses):
|
|
84
90
|
- `git` commands (status, add, commit, push, pull, log, diff)
|
package/copilot-profile.md
CHANGED
package/manifest.yml
CHANGED
package/package.json
CHANGED
package/skills/agent-browser/SKILL.md
CHANGED
|
@@ -231,27 +231,282 @@ agent-browser eval -b "$(echo -n 'Array.from(document.querySelectorAll("a")).map
|
|
|
231
231
|
- Nested quotes, arrow functions, template literals, or multiline -> use `eval --stdin <<'EVALEOF'`
|
|
232
232
|
- Programmatic/generated scripts -> use `eval -b` with base64
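For the generated-script case, a minimal sketch of building the `eval -b` invocation from Python. The helper name `eval_b64_argv` and the sample JavaScript are hypothetical; the only thing taken from the text is that `eval -b` receives the script as base64. The sketch only builds the argument list and does not launch the CLI.

```python
import base64


def eval_b64_argv(js_source):
    """Build the argv for `agent-browser eval -b` from a JavaScript snippet."""
    encoded = base64.b64encode(js_source.encode("utf-8")).decode("ascii")
    return ["agent-browser", "eval", "-b", encoded]


# Illustrative script with nested quotes, which base64 keeps away from the shell.
script = 'document.querySelectorAll("a[href]").length'
argv = eval_b64_argv(script)
print(argv[:3])
```

Passing `argv` as a list to `subprocess.run` avoids a shell entirely, so the script needs no further escaping.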
|
|
233
233
|
|
|
234
|
-
##
|
|
234
|
+
## Complete Command Reference
|
|
235
235
|
|
|
236
|
-
|
|
237
|
-
|
|
238
|
-
|
|
239
|
-
|
|
240
|
-
|
|
241
|
-
|
|
242
|
-
|
|
243
|
-
|
|
236
|
+
### Core Navigation & Lifecycle
|
|
237
|
+
```bash
|
|
238
|
+
agent-browser open <url> # Navigate (aliases: goto, navigate)
|
|
239
|
+
agent-browser close # Close browser (aliases: quit, exit)
|
|
240
|
+
agent-browser back # Go back
|
|
241
|
+
agent-browser forward # Go forward
|
|
242
|
+
agent-browser reload # Reload page
|
|
243
|
+
```
|
|
244
|
+
|
|
245
|
+
### Snapshots & Element References
|
|
246
|
+
```bash
|
|
247
|
+
agent-browser snapshot # Accessibility tree with semantic refs
|
|
248
|
+
agent-browser snapshot -i # Interactive elements with @e refs
|
|
249
|
+
agent-browser snapshot -i -C # Include cursor-interactive divs (onclick, pointer)
|
|
250
|
+
agent-browser snapshot -s "#sel" # Scope snapshot to CSS selector
|
|
251
|
+
agent-browser snapshot --json # JSON output for parsing
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
### Interaction - Click, Fill, Type, Select
|
|
255
|
+
```bash
|
|
256
|
+
agent-browser click <sel> # Click element
|
|
257
|
+
agent-browser click <sel> --new-tab # Open link in new tab
|
|
258
|
+
agent-browser dblclick <sel> # Double-click
|
|
259
|
+
agent-browser focus <sel> # Focus element
|
|
260
|
+
agent-browser type <sel> <text> # Type into element (append)
|
|
261
|
+
agent-browser fill <sel> <text> # Clear and fill
|
|
262
|
+
agent-browser select <sel> <val> # Select dropdown option
|
|
263
|
+
agent-browser check <sel> # Check checkbox
|
|
264
|
+
agent-browser uncheck <sel> # Uncheck checkbox
|
|
265
|
+
agent-browser press <key> # Press key (Enter, Tab, Control+a, etc.) (alias: key)
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
### Keyboard & Text Input
|
|
269
|
+
```bash
|
|
270
|
+
agent-browser keyboard type <text> # Type with real keystrokes (no selector, uses focus)
|
|
271
|
+
agent-browser keyboard inserttext <text> # Insert text without triggering key events
|
|
272
|
+
agent-browser keydown <key> # Hold key down
|
|
273
|
+
agent-browser keyup <key> # Release key
|
|
274
|
+
```
|
|
275
|
+
|
|
276
|
+
### Mouse & Drag
|
|
277
|
+
```bash
|
|
278
|
+
agent-browser hover <sel> # Hover element
|
|
279
|
+
agent-browser drag <src> <tgt> # Drag and drop
|
|
280
|
+
agent-browser mouse move <x> <y> # Move mouse to coordinates
|
|
281
|
+
agent-browser mouse down [button] # Press mouse button (left/right/middle)
|
|
282
|
+
agent-browser mouse up [button] # Release mouse button
|
|
283
|
+
agent-browser mouse wheel <dy> [dx] # Scroll wheel
|
|
284
|
+
```
|
|
285
|
+
|
|
286
|
+
### Scrolling & Viewport
|
|
287
|
+
```bash
|
|
288
|
+
agent-browser scroll <dir> [px] # Scroll (up/down/left/right, optional px)
|
|
289
|
+
agent-browser scrollintoview <sel> # Scroll element into view (alias: scrollinto)
|
|
290
|
+
agent-browser set viewport <w> <h> # Set viewport size (e.g., 1920 1080)
|
|
291
|
+
agent-browser set device <name> # Emulate device (e.g., "iPhone 14")
|
|
292
|
+
```
|
|
293
|
+
|
|
294
|
+
### Get Information
|
|
295
|
+
```bash
|
|
296
|
+
agent-browser get text <sel> # Get text content
|
|
297
|
+
agent-browser get html <sel> # Get innerHTML
|
|
298
|
+
agent-browser get value <sel> # Get input value
|
|
299
|
+
agent-browser get attr <sel> <attr> # Get attribute value
|
|
300
|
+
agent-browser get title # Get page title
|
|
301
|
+
agent-browser get url # Get current URL
|
|
302
|
+
agent-browser get count <sel> # Count matching elements
|
|
303
|
+
agent-browser get box <sel> # Get bounding box {x, y, width, height}
|
|
304
|
+
agent-browser get styles <sel> # Get computed CSS styles
|
|
305
|
+
```
|
|
306
|
+
|
|
307
|
+
### Check State
|
|
308
|
+
```bash
|
|
309
|
+
agent-browser is visible <sel> # Check if visible
|
|
310
|
+
agent-browser is enabled <sel> # Check if enabled (not disabled)
|
|
311
|
+
agent-browser is checked <sel> # Check if checked (checkbox/radio)
|
|
312
|
+
```
|
|
313
|
+
|
|
314
|
+
### File Operations
|
|
315
|
+
```bash
|
|
316
|
+
agent-browser upload <sel> <files> # Upload files to file input
|
|
317
|
+
agent-browser screenshot [path] # Screenshot to temp or custom path
|
|
318
|
+
agent-browser screenshot --full # Full page screenshot
|
|
319
|
+
agent-browser screenshot --annotate # Annotated with numbered element labels
|
|
320
|
+
agent-browser pdf <path> # Save as PDF
|
|
321
|
+
```
|
|
322
|
+
|
|
323
|
+
### Semantic Locators (Alternative to Selectors)
|
|
324
|
+
```bash
|
|
325
|
+
agent-browser find role <role> <action> [value] # By ARIA role
|
|
326
|
+
agent-browser find text <text> <action> # By text content
|
|
327
|
+
agent-browser find label <label> <action> [value] # By form label
|
|
328
|
+
agent-browser find placeholder <ph> <action> [value] # By placeholder text
|
|
329
|
+
agent-browser find alt <text> <action> # By alt text
|
|
330
|
+
agent-browser find title <text> <action> # By title attribute
|
|
331
|
+
agent-browser find testid <id> <action> [value] # By data-testid
|
|
332
|
+
agent-browser find first <sel> <action> [value] # First matching element
|
|
333
|
+
agent-browser find last <sel> <action> [value] # Last matching element
|
|
334
|
+
agent-browser find nth <n> <sel> <action> [value] # Nth matching element
|
|
335
|
+
|
|
336
|
+
# Role examples: button, link, textbox, combobox, checkbox, radio, heading, list, etc.
|
|
337
|
+
# Actions: click, fill, type, hover, focus, check, uncheck, text
|
|
338
|
+
# Options: --name <name> (filter by accessible name), --exact (exact text match)
|
|
339
|
+
```
|
|
244
340
|
|
|
245
|
-
|
|
341
|
+
### Waiting
|
|
342
|
+
```bash
|
|
343
|
+
agent-browser wait <selector> # Wait for element to be visible
|
|
344
|
+
agent-browser wait <ms> # Wait for time in milliseconds
|
|
345
|
+
agent-browser wait --text "Welcome" # Wait for text to appear
|
|
346
|
+
agent-browser wait --url "**/dash" # Wait for URL pattern
|
|
347
|
+
agent-browser wait --load networkidle # Wait for load state (load, domcontentloaded, networkidle)
|
|
348
|
+
agent-browser wait --fn "window.ready === true" # Wait for JS condition
|
|
349
|
+
```
|
|
246
350
|
|
|
247
|
-
|
|
248
|
-
|
|
249
|
-
|
|
250
|
-
|
|
251
|
-
|
|
351
|
+
### JavaScript Evaluation
|
|
352
|
+
```bash
|
|
353
|
+
agent-browser eval <js> # Run JavaScript in browser
|
|
354
|
+
agent-browser eval -b "<base64>" # Base64-encoded JS (avoid shell escaping)
|
|
355
|
+
agent-browser eval --stdin <<'EOF' # JS from stdin (heredoc, recommended for complex code)
|
|
356
|
+
```
|
|
252
357
|
|
|
358
|
+
### Browser Environment
|
|
253
359
|
```bash
|
|
254
|
-
|
|
255
|
-
|
|
256
|
-
|
|
360
|
+
agent-browser set geo <lat> <lng> # Set geolocation
|
|
361
|
+
agent-browser set offline [on|off] # Toggle offline mode
|
|
362
|
+
agent-browser set headers <json> # Set HTTP headers
|
|
363
|
+
agent-browser set credentials <u> <p> # HTTP basic auth
|
|
364
|
+
agent-browser set media [dark|light] # Emulate color scheme (prefers-color-scheme)
|
|
257
365
|
```
|
|
366
|
+
|
|
367
|
+
### Cookies & Storage
|
|
368
|
+
```bash
|
|
369
|
+
agent-browser cookies # Get all cookies
|
|
370
|
+
agent-browser cookies set <name> <val> # Set cookie
|
|
371
|
+
agent-browser cookies clear # Clear cookies
|
|
372
|
+
agent-browser storage local # Get all localStorage
|
|
373
|
+
agent-browser storage local <key> # Get specific key
|
|
374
|
+
agent-browser storage local set <k> <v> # Set value
|
|
375
|
+
agent-browser storage local clear # Clear all localStorage
|
|
376
|
+
agent-browser storage session # Same for sessionStorage
|
|
377
|
+
agent-browser storage session <key> # Get sessionStorage key
|
|
378
|
+
agent-browser storage session set <k> <v> # Set sessionStorage
|
|
379
|
+
agent-browser storage session clear # Clear sessionStorage
|
|
380
|
+
```
|
|
381
|
+
|
|
382
|
+
### Network & Interception
|
|
383
|
+
```bash
|
|
384
|
+
agent-browser network route <url> # Intercept requests
|
|
385
|
+
agent-browser network route <url> --abort # Block requests
|
|
386
|
+
agent-browser network route <url> --body <json> # Mock response with JSON
|
|
387
|
+
agent-browser network unroute [url] # Remove routes
|
|
388
|
+
agent-browser network requests # View tracked requests
|
|
389
|
+
agent-browser network requests --filter api # Filter by keyword
|
|
390
|
+
```
|
|
391
|
+
|
|
392
|
+
### Tabs & Windows
|
|
393
|
+
```bash
|
|
394
|
+
agent-browser tab # List active tabs
|
|
395
|
+
agent-browser tab new [url] # Open new tab (optionally with URL)
|
|
396
|
+
agent-browser tab <n> # Switch to tab n
|
|
397
|
+
agent-browser tab close [n] # Close tab (current or specific)
|
|
398
|
+
agent-browser window new # Open new window
|
|
399
|
+
```
|
|
400
|
+
|
|
401
|
+
### Frames
|
|
402
|
+
```bash
|
|
403
|
+
agent-browser frame <sel> # Switch to iframe by selector
|
|
404
|
+
agent-browser frame main # Switch back to main frame
|
|
405
|
+
```
|
|
406
|
+
|
|
407
|
+
### Dialogs
|
|
408
|
+
```bash
|
|
409
|
+
agent-browser dialog accept [text] # Accept alert/confirm (with optional prompt text)
|
|
410
|
+
agent-browser dialog dismiss # Dismiss dialog
|
|
411
|
+
```
|
|
412
|
+
|
|
413
|
+
### State Persistence (Auth, Sessions)
|
|
414
|
+
```bash
|
|
415
|
+
agent-browser state save <path> # Save authenticated session
|
|
416
|
+
agent-browser state load <path> # Load session state
|
|
417
|
+
agent-browser state list # List saved state files
|
|
418
|
+
agent-browser state show <file> # Show state summary
|
|
419
|
+
agent-browser state rename <old> <new> # Rename state
|
|
420
|
+
agent-browser state clear [name] # Clear specific session
|
|
421
|
+
agent-browser state clear --all # Clear all states
|
|
422
|
+
agent-browser state clean --older-than <days> # Delete old states
|
|
423
|
+
```
|
|
424
|
+
|
|
425
|
+
### Debugging & Analysis
|
|
426
|
+
```bash
|
|
427
|
+
agent-browser highlight <sel> # Highlight element visually
|
|
428
|
+
agent-browser console # View console messages (log, error, warn)
|
|
429
|
+
agent-browser console --clear # Clear console
|
|
430
|
+
agent-browser errors # View JavaScript errors
|
|
431
|
+
agent-browser errors --clear # Clear errors
|
|
432
|
+
agent-browser trace start [path] # Start DevTools trace
|
|
433
|
+
agent-browser trace stop [path] # Stop and save trace
|
|
434
|
+
agent-browser profiler start # Start Chrome DevTools profiler
|
|
435
|
+
agent-browser profiler stop [path] # Stop and save .json profile
|
|
436
|
+
```
|
|
437
|
+
|
|
438
|
+
### Visual Debugging
|
|
439
|
+
```bash
|
|
440
|
+
agent-browser --headed open <url> # Headless=false, show visual browser
|
|
441
|
+
agent-browser record start <file.webm> # Record session
|
|
442
|
+
agent-browser record stop # Stop recording
|
|
443
|
+
```
|
|
444
|
+
|
|
445
|
+
### Comparisons & Diffs
|
|
446
|
+
```bash
|
|
447
|
+
agent-browser diff snapshot # Compare current vs last snapshot
|
|
448
|
+
agent-browser diff snapshot --baseline before.txt # Compare current vs saved snapshot
|
|
449
|
+
agent-browser diff snapshot --selector "#main" --compact # Scoped diff
|
|
450
|
+
agent-browser diff screenshot --baseline before.png # Visual pixel diff
|
|
451
|
+
agent-browser diff screenshot --baseline b.png -o d.png # Save diff to custom path
|
|
452
|
+
agent-browser diff screenshot --baseline b.png -t 0.2 # Color threshold 0-1
|
|
453
|
+
agent-browser diff url https://v1.com https://v2.com # Compare two URLs
|
|
454
|
+
agent-browser diff url https://v1.com https://v2.com --screenshot # With visual diff
|
|
455
|
+
agent-browser diff url https://v1.com https://v2.com --selector "#main" # Scoped
|
|
456
|
+
```
|
|
457
|
+
|
|
458
|
+
### Sessions & Parallelism
|
|
459
|
+
```bash
|
|
460
|
+
agent-browser --session <name> <cmd> # Run in named session (isolated instance)
|
|
461
|
+
agent-browser session list # List active sessions
|
|
462
|
+
agent-browser session show # Show current session
|
|
463
|
+
# Example: agent-browser --session agent1 open site.com
|
|
464
|
+
# agent-browser --session agent2 open other.com
|
|
465
|
+
```
|
|
466
|
+
|
|
467
|
+
### Browser Connection
|
|
468
|
+
```bash
|
|
469
|
+
agent-browser connect <port> # Connect via Chrome DevTools Protocol
|
|
470
|
+
agent-browser --auto-connect open <url> # Auto-discover running Chrome
|
|
471
|
+
agent-browser --cdp 9222 <cmd> # Explicit CDP port
|
|
472
|
+
```
|
|
473
|
+
|
|
474
|
+
### Setup & Installation
|
|
475
|
+
```bash
|
|
476
|
+
agent-browser install # Download Chromium browser
|
|
477
|
+
agent-browser install --with-deps # Also install system dependencies (Linux)
|
|
478
|
+
```
|
|
479
|
+
|
|
480
|
+
### Advanced: Local Files & Protocols
|
|
481
|
+
```bash
|
|
482
|
+
agent-browser --allow-file-access open file:///path/to/file.pdf
|
|
483
|
+
agent-browser --allow-file-access open file:///path/to/page.html
|
|
484
|
+
```
|
|
485
|
+
|
|
486
|
+
### Advanced: iOS/Mobile Testing
|
|
487
|
+
```bash
|
|
488
|
+
agent-browser device list # List available iOS simulators
|
|
489
|
+
agent-browser -p ios --device "iPhone 16 Pro" open <url> # Launch on device
|
|
490
|
+
agent-browser -p ios snapshot -i # Snapshot on iOS
|
|
491
|
+
agent-browser -p ios tap @e1 # Tap (alias for click)
|
|
492
|
+
agent-browser -p ios swipe up # Mobile gestures
|
|
493
|
+
agent-browser -p ios screenshot mobile.png
|
|
494
|
+
agent-browser -p ios close # Close simulator
|
|
495
|
+
# Requires: macOS, Xcode, Appium (npm install -g appium && appium driver install xcuitest)
|
|
496
|
+
```
|
|
497
|
+
|
|
498
|
+
## Key Patterns for Agents
|
|
499
|
+
|
|
500
|
+
**Always use agent-browser instead of puppeteer, playwright, or playwright-core** — it has the same capabilities with simpler syntax and better integration with AI agents.
|
|
501
|
+
|
|
502
|
+
**Multi-step workflows**:
|
|
503
|
+
1. `agent-browser open <url>`
|
|
504
|
+
2. `agent-browser snapshot -i` (get refs)
|
|
505
|
+
3. `agent-browser fill @e1 "value"`
|
|
506
|
+
4. `agent-browser click @e2`
|
|
507
|
+
5. `agent-browser wait --load networkidle` (after navigation)
|
|
508
|
+
6. `agent-browser snapshot -i` (re-snapshot for new refs)
|
|
509
|
+
|
|
510
|
+
**Debugging complex interactions**: Use `agent-browser --headed open <url>` to see visual browser, then `agent-browser highlight @e1` to verify element targeting.
|
|
511
|
+
|
|
512
|
+
**Ground truth verification**: Combine `agent-browser eval` for JavaScript inspection with `agent-browser screenshot` for visual confirmation.
|
|
package/skills/planning/SKILL.md
ADDED
|
@@ -0,0 +1,335 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: planning
|
|
3
|
+
description: PRD construction for work planning. Use this skill in PLAN phase to build .prd file with complete dependency graph of all items, edge cases, and subtasks before execution begins.
|
|
4
|
+
allowed-tools: Write
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Work Planning with PRD Construction
|
|
8
|
+
|
|
9
|
+
## Overview
|
|
10
|
+
|
|
11
|
+
This skill constructs `./.prd` (Product Requirements Document) files for structured work tracking. The PRD is a **single source of truth** that captures every possible item to complete, organized as a dependency graph for parallel execution.
|
|
12
|
+
|
|
13
|
+
**CRITICAL**: The PRD must be created in PLAN phase before any work begins. It blocks all other work until complete. It is frozen after creation—only items may be removed as they complete. No additions or reorganizations after plan is created.
|
|
14
|
+
|
|
15
|
+
## When to Use This Skill
|
|
16
|
+
|
|
17
|
+
Use `planning` skill when:
|
|
18
|
+
- Starting a new task or initiative
|
|
19
|
+
- User requests multiple items/features/fixes that need coordination
|
|
20
|
+
- Work has dependencies, parallelizable items, or complex stages
|
|
21
|
+
- You need to track progress across multiple independent work streams
|
|
22
|
+
|
|
23
|
+
**Do NOT use** if task is trivial (single item under 5 minutes).
|
|
24
|
+
|
|
25
|
+
## PRD Structure
|
|
26
|
+
|
|
27
|
+
Each PRD contains:
|
|
28
|
+
- **items**: Array of work items with dependencies
|
|
29
|
+
- **completed**: Empty list (populated as items finish)
|
|
30
|
+
- **metadata**: Total estimates, phases, notes
|
|
31
|
+
|
|
32
|
+
### Item Fields
|
|
33
|
+
|
|
34
|
+
```json
|
|
35
|
+
{
|
|
36
|
+
"id": "1",
|
|
37
|
+
"subject": "imperative verb describing outcome",
|
|
38
|
+
"status": "pending",
|
|
39
|
+
"description": "detailed requirement",
|
|
40
|
+
"blocking": ["2", "3"],
|
|
41
|
+
"blockedBy": ["4"],
|
|
42
|
+
"effort": "small|medium|large",
|
|
43
|
+
"category": "feature|bug|refactor|docs",
|
|
44
|
+
"notes": "contextual info"
|
|
45
|
+
}
|
|
46
|
+
```
|
|
47
|
+
|
|
48
|
+
### Key Rules
|
|
49
|
+
|
|
50
|
+
**Subject**: Use imperative form - "Fix auth bug", "Add webhook support", "Consolidate templates", not "Bug: auth", "New feature", etc.
|
|
51
|
+
|
|
52
|
+
**Blocking/Blocked By**: Map dependency graph
|
|
53
|
+
- If item 2 waits for item 1: `"blockedBy": ["1"]`
|
|
54
|
+
- If item 1 blocks items 2 & 3: `"blocking": ["2", "3"]`
|
|
55
|
+
|
|
56
|
+
**Status**: Only three values
|
|
57
|
+
- `pending` - not started
|
|
58
|
+
- `in_progress` - currently working
|
|
59
|
+
- `completed` - fully done
|
|
60
|
+
|
|
61
|
+
**Effort**: Estimate relative scope
|
|
62
|
+
- `small`: 1-2 items in 15 min
|
|
63
|
+
- `medium`: 3-5 items in 30-45 min
|
|
64
|
+
- `large`: 6+ items or 1+ hours
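The key rules above can be checked mechanically. This is a minimal sketch, assuming a PRD item is a plain dict; `validate_item` and the choice of required fields are hypothetical and not part of the skill.

```python
ALLOWED_STATUS = {"pending", "in_progress", "completed"}
ALLOWED_EFFORT = {"small", "medium", "large"}
# Assumed minimum set of fields; the skill does not say which are mandatory.
REQUIRED_FIELDS = ("id", "subject", "status", "blocking", "blockedBy")


def validate_item(item):
    """Return a list of human-readable problems; empty means the item passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in item]
    if "status" in item and item["status"] not in ALLOWED_STATUS:
        problems.append(f"unknown status: {item['status']}")
    if "effort" in item and item["effort"] not in ALLOWED_EFFORT:
        problems.append(f"unknown effort: {item['effort']}")
    return problems


item = {"id": "1", "subject": "Fix auth bug", "status": "done",
        "blocking": [], "blockedBy": [], "effort": "small"}
print(validate_item(item))
```

Returning a list of problems instead of raising on the first one lets a planner report every issue in a single pass.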
|
|
65
|
+
|
|
66
|
+
## Complete Item Template
|
|
67
|
+
|
|
68
|
+
Use this when planning complex work:
|
|
69
|
+
|
|
70
|
+
```json
|
|
71
|
+
{
|
|
72
|
+
"id": "task-name-1",
|
|
73
|
+
"subject": "Consolidate duplicate template builders",
|
|
74
|
+
"status": "pending",
|
|
75
|
+
"description": "Extract shared generatePackageJson() and buildHooksMap() logic from cli-adapter.js and extension-adapter.js into TemplateBuilder methods. Current duplication causes maintenance burden.",
|
|
76
|
+
"category": "refactor",
|
|
77
|
+
"effort": "medium",
|
|
78
|
+
"blocking": ["task-name-2"],
|
|
79
|
+
"blockedBy": [],
|
|
80
|
+
"acceptance": [
|
|
81
|
+
"Single generatePackageJson() method in TemplateBuilder",
|
|
82
|
+
"Both adapters call TemplateBuilder methods",
|
|
83
|
+
"All 9 platforms generate identical package.json structure",
|
|
84
|
+
"No duplication in adapter code"
|
|
85
|
+
],
|
|
86
|
+
"edge_cases": [
|
|
87
|
+
"Platforms without package.json (JetBrains IDE)",
|
|
88
|
+
"Custom fields for CLI vs extension platforms"
|
|
89
|
+
],
|
|
90
|
+
"verification": "All 9 build outputs pass validation, adapters <150 lines each"
|
|
91
|
+
}
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
## Comprehensive Planning Checklist
|
|
95
|
+
|
|
96
|
+
When creating PRD, cover:
|
|
97
|
+
|
|
98
|
+
### Requirements
|
|
99
|
+
- [ ] Main objective clearly stated
|
|
100
|
+
- [ ] Success criteria defined
|
|
101
|
+
- [ ] User-facing changes vs internal
|
|
102
|
+
- [ ] Backwards compatibility implications
|
|
103
|
+
- [ ] Data migration needed?
|
|
104
|
+
|
|
105
|
+
### Edge Cases
|
|
106
|
+
- [ ] Empty inputs/missing files
|
|
107
|
+
- [ ] Large scale (1000s of items?)
|
|
108
|
+
- [ ] Concurrent access patterns
|
|
109
|
+
- [ ] Timeout/hang scenarios
|
|
110
|
+
- [ ] Recovery from failures
|
|
111
|
+
|
|
112
|
+
### Dependencies
|
|
113
|
+
- [ ] External services/APIs required?
|
|
114
|
+
- [ ] Third-party library versions
|
|
115
|
+
- [ ] Environment setup (DB, redis, etc)
|
|
116
|
+
- [ ] Breaking changes from upgrades?
|
|
117
|
+
|
|
118
|
+
### Acceptance Criteria
|
|
119
|
+
- [ ] Code changed meets goal
|
|
120
|
+
- [ ] Tests pass (if applicable)
|
|
121
|
+
- [ ] Performance requirements met
|
|
122
|
+
- [ ] Security concerns addressed
|
|
123
|
+
- [ ] Documentation updated
|
|
124
|
+
|
|
125
|
+
### Integration Points
|
|
126
|
+
- [ ] Does it touch other systems?
|
|
127
|
+
- [ ] API compatibility impacts?
|
|
128
|
+
- [ ] Database schema changes?
|
|
129
|
+
- [ ] Message queue formats?
|
|
130
|
+
- [ ] Configuration propagation?
|
|
131
|
+
|
|
132
|
+
### Error Handling
|
|
133
|
+
- [ ] What fails gracefully?
|
|
134
|
+
- [ ] What fails hard?
|
|
135
|
+
- [ ] Recovery mechanisms?
|
|
136
|
+
- [ ] Fallback options?
|
|
137
|
+
- [ ] User notification strategy?
|
|
138
|
+
|
|
139
|
+
## PRD Lifecycle
|
|
140
|
+
|
|
141
|
+
### Creation Phase
|
|
142
|
+
1. Enumerate **every possible unknown** as work item
|
|
143
|
+
2. Map dependencies (blocking/blockedBy)
|
|
144
|
+
3. Group parallelizable items into waves
|
|
145
|
+
4. Verify all edge cases captured
|
|
146
|
+
5. Write `./.prd` to disk
|
|
147
|
+
6. **FREEZE** - no modifications except item removal
|
|
148
|
+
|
|
149
|
+
### Execution Phase
|
|
150
|
+
1. Read `.prd`
|
|
151
|
+
2. Find all `pending` items with no `blockedBy`
|
|
152
|
+
3. Launch ≤3 parallel workers (gm:gm subagents) per wave
|
|
153
|
+
4. As items complete, update status to `completed`
|
|
154
|
+
5. Remove completed items from `.prd` file
|
|
155
|
+
6. Launch next wave when previous completes
|
|
156
|
+
7. Continue until `.prd` is empty
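The execution loop above can be sketched as follows, assuming the PRD has been parsed into a dict with an `items` list. The function names are hypothetical. One assumption matters: because completed items are removed from the file, a `blockedBy` id counts as a blocker only while that id is still listed.

```python
import json

MAX_WORKERS = 3  # at most 3 parallel workers per wave


def next_wave(prd, max_workers=MAX_WORKERS):
    """Return up to max_workers pending items whose blockers are all gone."""
    remaining = {item["id"] for item in prd["items"]}
    ready = [
        item for item in prd["items"]
        if item["status"] == "pending"
        and not any(dep in remaining for dep in item.get("blockedBy", []))
    ]
    return ready[:max_workers]


def remove_completed(prd, finished_ids):
    """Drop finished items, mirroring step 5 of the execution phase."""
    prd["items"] = [i for i in prd["items"] if i["id"] not in finished_ids]
    return prd


prd = json.loads("""
{"items": [
  {"id": "1", "status": "pending", "blockedBy": []},
  {"id": "2", "status": "pending", "blockedBy": ["1"]},
  {"id": "3", "status": "pending", "blockedBy": ["2"]}
]}
""")
order = []
while prd["items"]:
    wave = [item["id"] for item in next_wave(prd)]
    if not wave:
        raise ValueError("no runnable items: dependency cycle or stuck status")
    order.append(wave)
    remove_completed(prd, set(wave))  # stands in for the workers finishing
print(order)  # [['1'], ['2'], ['3']]
```

The guard on an empty wave is there because a dependency cycle would otherwise leave the loop spinning on a file that never empties.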
|
|
157
|
+
|
|
158
|
+
### Completion Phase
|
|
159
|
+
- `.prd` file is empty (all items removed)
|
|
160
|
+
- All work committed and pushed
|
|
161
|
+
- Tests passing
|
|
162
|
+
- No remaining `pending` or `in_progress` items
|
|
163
|
+
|
|
164
|
+
## File Location
|
|
165
|
+
|
|
166
|
+
**CRITICAL**: PRD must be at exactly `./.prd` (current working directory root).
|
|
167
|
+
|
|
168
|
+
- ✅ `/home/user/plugforge/.prd`
|
|
169
|
+
- ❌ `/home/user/plugforge/.prd-temp`
|
|
170
|
+
- ❌ `/home/user/plugforge/build/.prd`
|
|
171
|
+
- ❌ `/home/user/plugforge/.prd.json`
|
|
172
|
+
|
|
173
|
+
No variants, no subdirectories, no extensions. Absolute path must resolve to `cwd + .prd`.
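A minimal sketch of that location rule, assuming paths are compared after resolution. `is_valid_prd_path` is a hypothetical helper, and the paths are the examples listed above.

```python
from pathlib import Path


def is_valid_prd_path(candidate, cwd):
    """True only when candidate resolves to exactly <cwd>/.prd."""
    return Path(candidate).resolve() == (Path(cwd) / ".prd").resolve()


cwd = "/home/user/plugforge"
for candidate in (
    "/home/user/plugforge/.prd",
    "/home/user/plugforge/.prd-temp",
    "/home/user/plugforge/build/.prd",
    "/home/user/plugforge/.prd.json",
):
    print(candidate, is_valid_prd_path(candidate, cwd))
```

Comparing resolved paths rather than strings means a spelling such as `build/../.prd` is still accepted, since it points at the same file.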
|
|
174
|
+
|
|
175
|
+
## JSON Format
|
|
176
|
+
|
|
177
|
+
PRD files are **valid JSON** for easy parsing and manipulation.
|
|
178
|
+
|
|
179
|
+
```json
|
|
180
|
+
{
|
|
181
|
+
"project": "plugforge",
|
|
182
|
+
"created": "2026-02-24",
|
|
183
|
+
"objective": "Unify agent tooling and planning infrastructure",
|
|
184
|
+
"items": [
|
|
185
|
+
{
|
|
186
|
+
"id": "1",
|
|
187
|
+
"subject": "Update agent-browser skill documentation",
|
|
188
|
+
"status": "pending",
|
|
189
|
+
"description": "Add complete command reference with all 100+ commands",
|
|
190
|
+
"blocking": ["2"],
|
|
191
|
+
"blockedBy": [],
|
|
192
|
+
"effort": "small",
|
|
193
|
+
"category": "docs"
|
|
194
|
+
},
|
|
195
|
+
{
|
|
196
|
+
"id": "2",
|
|
197
|
+
"subject": "Create planning skill for PRD construction",
|
|
198
|
+
"status": "pending",
|
|
199
|
+
"description": "New skill that creates .prd files with dependency graphs",
|
|
200
|
+
"blocking": ["3"],
|
|
201
|
+
"blockedBy": ["1"],
|
|
202
|
+
"effort": "medium",
|
|
203
|
+
"category": "feature"
|
|
204
|
+
},
|
|
205
|
+
{
|
|
206
|
+
"id": "3",
|
|
207
|
+
"subject": "Update gm.md agent instructions",
|
|
208
|
+
"status": "pending",
|
|
209
|
+
"description": "Reference new skills, emphasize codesearch over cli tools",
|
|
210
|
+
"blocking": [],
|
|
211
|
+
"blockedBy": ["2"],
|
|
212
|
+
"effort": "medium",
|
|
213
|
+
"category": "docs"
|
|
214
|
+
}
|
|
215
|
+
],
|
|
216
|
+
"completed": []
|
|
217
|
+
}
|
|
218
|
+
```
|
|
219
|
+
|
|
220
|
+
## Execution Guidelines
|
|
221
|
+
|
|
222
|
+
**Wave Orchestration**: Maximum 3 subagents per wave (gm:gm agents via Task tool).
|
|
223
|
+
|
|
224
|
+
```
|
|
225
|
+
Wave 1: Items 1, 2, 3 (all pending, no dependencies)
|
|
226
|
+
└─ 3 subagents launched in parallel
|
|
227
|
+
|
|
228
|
+
Wave 2: Items 4, 5 (depend on Wave 1 completion)
|
|
229
|
+
└─ Items 6, 7 (wait for Wave 2)
|
|
230
|
+
|
|
231
|
+
Wave 3: Items 6, 7
|
|
232
|
+
└─ 2 subagents (since only 2 items)
|
|
233
|
+
|
|
234
|
+
Wave 4: Item 8 (depends on Wave 3)
|
|
235
|
+
└─ Completes work
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
After each wave completes:
|
|
239
|
+
1. Remove finished items from `.prd`
|
|
240
|
+
2. Write `.prd` (now shorter)
|
|
241
|
+
3. Check for newly unblocked items
|
|
242
|
+
4. Launch next wave
|
|
243
|
+
|
|
244
|
+
## Example: Multi-Platform Builder Updates
|
|
245
|
+
|
|
246
|
+
```json
|
|
247
|
+
{
|
|
248
|
+
"project": "plugforge",
|
|
249
|
+
"objective": "Add hooks support to 5 CLI platforms",
|
|
250
|
+
"items": [
|
|
251
|
+
{
|
|
252
|
+
"id": "hooks-cc",
|
|
253
|
+
"subject": "Add hooks to gm-cc platform",
|
|
254
|
+
"status": "pending",
|
|
255
|
+
"blocking": ["test-hooks"],
|
|
256
|
+
"blockedBy": [],
|
|
257
|
+
"effort": "small"
|
|
258
|
+
},
|
|
259
|
+
{
|
|
260
|
+
"id": "hooks-gc",
|
|
261
|
+
"subject": "Add hooks to gm-gc platform",
|
|
262
|
+
"status": "pending",
|
|
263
|
+
"blocking": ["test-hooks"],
|
|
264
|
+
"blockedBy": [],
|
|
265
|
+
"effort": "small"
|
|
266
|
+
},
|
|
267
|
+
{
|
|
268
|
+
"id": "hooks-oc",
|
|
269
|
+
"subject": "Add hooks to gm-oc platform",
|
|
270
|
+
"status": "pending",
|
|
271
|
+
"blocking": ["test-hooks"],
|
|
272
|
+
"blockedBy": [],
|
|
273
|
+
"effort": "small"
|
|
274
|
+
},
|
|
275
|
+
{
|
|
276
|
+
"id": "test-hooks",
|
|
277
|
+
"subject": "Test all 5 platforms with hooks",
|
|
278
|
+
"status": "pending",
|
|
279
|
+
"blocking": [],
|
|
280
|
+
"blockedBy": ["hooks-cc", "hooks-gc", "hooks-oc"],
|
|
281
|
+
"effort": "large"
|
|
282
|
+
}
|
|
283
|
+
]
|
|
284
|
+
}
|
|
285
|
+
```
|
|
286
|
+
|
|
287
|
+
**Execution**:
|
|
288
|
+
- Wave 1: Launch 3 subagents for `hooks-cc`, `hooks-gc`, `hooks-oc` in parallel
|
|
289
|
+
- After all 3 complete, launch `test-hooks`
|
|
290
|
+
|
|
291
|
+
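This cuts wall-clock time for the three hook items from 45 min (sequential, at roughly 15 min per `small` item) to ~15 min (parallel). `test-hooks` still runs afterwards in either case.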
|
|
292
|
+
|
|
293
|
+
## Best Practices
|
|
294
|
+
|
|
295
|
+
### Cover All Scenarios
|
|
296
|
+
Don't under-estimate work. If you think it's 3 items, list 8. Missing items cause restarts.
|
|
297
|
+
|
|
298
|
+
### Name Dependencies Clearly
|
|
299
|
+
- `blocking`: What does THIS item prevent?
|
|
300
|
+
- `blockedBy`: What must complete before THIS?
|
|
301
|
+
- Bidirectional: If A blocks B, then B blockedBy A
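The bidirectional rule lends itself to an automated check at creation time. This is a minimal sketch, assuming items are dicts with `blocking` and `blockedBy` lists; `find_asymmetric_edges` is a hypothetical name.

```python
def find_asymmetric_edges(items):
    """Return (blocker, blocked) pairs that are recorded on only one side."""
    by_id = {item["id"]: item for item in items}
    problems = []
    for item in items:
        for target in item.get("blocking", []):
            other = by_id.get(target)
            if other is None or item["id"] not in other.get("blockedBy", []):
                problems.append((item["id"], target))
        for source in item.get("blockedBy", []):
            other = by_id.get(source)
            if other is None or item["id"] not in other.get("blocking", []):
                problems.append((source, item["id"]))
    return sorted(set(problems))


items = [
    {"id": "A", "blocking": ["B"], "blockedBy": []},
    {"id": "B", "blocking": [], "blockedBy": []},  # forgot blockedBy: ["A"]
]
print(find_asymmetric_edges(items))  # [('A', 'B')]
```

The check is meant for the moment the PRD is written. Once completed items start being removed, references to their ids will dangle by design and would be reported here.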
|
|
302
|
+
|
|
303
|
+
### Use Consistent Categories
|
|
304
|
+
- `feature`: New capability
|
|
305
|
+
- `bug`: Fix broken behavior
|
|
306
|
+
- `refactor`: Improve structure without changing behavior
|
|
307
|
+
- `docs`: Documentation
|
|
308
|
+
- `infra`: Build, CI, deployment
|
|
309
|
+
|
|
310
|
+
### Track Edge Cases Separately
|
|
311
|
+
Even if an item seems small, if it has edge cases, call them out. They often take 50% of the time.
|
|
312
|
+
|
|
313
|
+
### Estimate Effort Realistically
|
|
314
|
+
- `small`: Coding + testing in 1 attempt
|
|
315
|
+
- `medium`: May need 2 rounds of refinement
|
|
316
|
+
- `large`: Multiple rounds, unexpected issues likely
|
|
317
|
+
|
|
318
|
+
## Stop Hook Enforcement
|
|
319
|
+
|
|
320
|
+
When a session ends, a **stop hook** checks whether `.prd` exists and still has `pending` or `in_progress` items. If so, the session is blocked from ending: you cannot leave work incomplete.
|
|
321
|
+
|
|
322
|
+
This forces disciplined work closure: every PRD must reach empty state or explicitly pause with documented reason.
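The check such a hook performs might look like the sketch below. This is an illustration under stated assumptions, not the package's actual hook: `open_items` is a hypothetical name, and treating both a missing file and a zero-byte file as "nothing left to do" is an assumption about what an empty `.prd` means.

```python
import json
from pathlib import Path

OPEN_STATUSES = {"pending", "in_progress"}


def open_items(prd_path):
    """Return ids of items that would block the session from ending."""
    path = Path(prd_path)
    if not path.exists():
        return []  # no .prd, nothing to enforce
    text = path.read_text(encoding="utf-8").strip()
    if not text:
        return []  # assumed: a zero-byte file counts as finished work
    prd = json.loads(text)
    return [i["id"] for i in prd.get("items", []) if i.get("status") in OPEN_STATUSES]


blocking = open_items("./.prd")
print("blocked by:" if blocking else "free to stop", blocking)
```

A real hook would also need a way to record the "explicitly pause with documented reason" case, which the text does not specify and this sketch leaves out.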
|
|
323
|
+
|
|
324
|
+
## Integration with gm Agent
|
|
325
|
+
|
|
326
|
+
The gm agent (immutable state machine) reads `.prd` in PLAN phase:
|
|
327
|
+
1. Verifies `.prd` exists and has valid JSON
|
|
328
|
+
2. Extracts items with `status: pending`
|
|
329
|
+
3. Finds items with no `blockedBy` constraints
|
|
330
|
+
4. Launches ≤3 gm:gm subagents per wave
|
|
331
|
+
5. Each subagent completes one item
|
|
332
|
+
6. On completion, PRD is updated (item removed)
|
|
333
|
+
7. Process repeats until `.prd` is empty
|
|
334
|
+
|
|
335
|
+
This creates structured, auditable work flow for complex projects.
|