@protolabsai/proto 0.22.0 → 0.24.0
- package/README.md +153 -1
- package/bundled/browser-automation/SKILL.md +358 -0
- package/bundled/harness-reference/SKILL.md +146 -0
- package/bundled/subagent-driven-development/SKILL.md +10 -0
- package/cli.js +7201 -5138
- package/package.json +2 -2
- package/bundled/qc-helper/docs/_meta.ts +0 -30
- package/bundled/qc-helper/docs/common-workflow.md +0 -571
- package/bundled/qc-helper/docs/configuration/_meta.ts +0 -10
- package/bundled/qc-helper/docs/configuration/auth.md +0 -366
- package/bundled/qc-helper/docs/configuration/memory.md +0 -0
- package/bundled/qc-helper/docs/configuration/model-providers.md +0 -542
- package/bundled/qc-helper/docs/configuration/qwen-ignore.md +0 -55
- package/bundled/qc-helper/docs/configuration/settings.md +0 -661
- package/bundled/qc-helper/docs/configuration/themes.md +0 -160
- package/bundled/qc-helper/docs/configuration/trusted-folders.md +0 -61
- package/bundled/qc-helper/docs/extension/_meta.ts +0 -9
- package/bundled/qc-helper/docs/extension/extension-releasing.md +0 -204
- package/bundled/qc-helper/docs/extension/getting-started-extensions.md +0 -299
- package/bundled/qc-helper/docs/extension/introduction.md +0 -331
- package/bundled/qc-helper/docs/features/_meta.ts +0 -20
- package/bundled/qc-helper/docs/features/approval-mode.md +0 -263
- package/bundled/qc-helper/docs/features/arena.md +0 -218
- package/bundled/qc-helper/docs/features/checkpointing.md +0 -77
- package/bundled/qc-helper/docs/features/commands.md +0 -314
- package/bundled/qc-helper/docs/features/export.md +0 -51
- package/bundled/qc-helper/docs/features/followup-suggestions.md +0 -109
- package/bundled/qc-helper/docs/features/headless.md +0 -318
- package/bundled/qc-helper/docs/features/hooks.md +0 -356
- package/bundled/qc-helper/docs/features/language.md +0 -139
- package/bundled/qc-helper/docs/features/lsp.md +0 -453
- package/bundled/qc-helper/docs/features/mcp.md +0 -299
- package/bundled/qc-helper/docs/features/sandbox.md +0 -241
- package/bundled/qc-helper/docs/features/scheduled-tasks.md +0 -139
- package/bundled/qc-helper/docs/features/skills.md +0 -289
- package/bundled/qc-helper/docs/features/sub-agents.md +0 -307
- package/bundled/qc-helper/docs/features/token-caching.md +0 -29
- package/bundled/qc-helper/docs/ide-integration/_meta.ts +0 -4
- package/bundled/qc-helper/docs/ide-integration/ide-companion-spec.md +0 -182
- package/bundled/qc-helper/docs/ide-integration/ide-integration.md +0 -144
- package/bundled/qc-helper/docs/integration-github-action.md +0 -241
- package/bundled/qc-helper/docs/integration-jetbrains.md +0 -81
- package/bundled/qc-helper/docs/integration-vscode.md +0 -39
- package/bundled/qc-helper/docs/integration-zed.md +0 -72
- package/bundled/qc-helper/docs/overview.md +0 -65
- package/bundled/qc-helper/docs/quickstart.md +0 -273
- package/bundled/qc-helper/docs/reference/_meta.ts +0 -4
- package/bundled/qc-helper/docs/reference/keyboard-shortcuts.md +0 -72
- package/bundled/qc-helper/docs/reference/sdk-api.md +0 -524
- package/bundled/qc-helper/docs/support/Uninstall.md +0 -42
- package/bundled/qc-helper/docs/support/_meta.ts +0 -6
- package/bundled/qc-helper/docs/support/tos-privacy.md +0 -112
- package/bundled/qc-helper/docs/support/troubleshooting.md +0 -123
package/README.md
CHANGED
|
@@ -262,10 +262,76 @@ A `MEMORY.md` index is auto-generated and loaded into the system prompt at the s
|
|
|
262
262
|
|
|
263
263
|
After each conversation turn, a background extraction agent reviews recent messages and auto-creates memories for notable facts. This runs fire-and-forget with restricted tools (read/write/glob in the memory directory only).
|
|
264
264
|
|
|
265
|
+
## Agent Harness
|
|
266
|
+
|
|
267
|
+
proto includes a harness system that enforces quality gates, limits scope, and recovers from failures automatically.
|
|
268
|
+
|
|
269
|
+
### Sprint Contract (Scope Lock)
|
|
270
|
+
|
|
271
|
+
Prevents agents from modifying files outside an agreed scope. Before coding begins, negotiate a contract that defines exactly which files will be created or modified. Once the contract is confirmed, the scope lock is armed: any write outside the agreed scope is rejected with a recovery message.
|
|
272
|
+
|
|
273
|
+
**Workflow:**
|
|
274
|
+
|
|
275
|
+
```bash
|
|
276
|
+
proto
|
|
277
|
+
/sprint-contract
|
|
278
|
+
> Task: Refactor auth module
|
|
279
|
+
> Files: src/auth.ts, src/utils.ts
|
|
280
|
+
> Confirm
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
**Behavior:**
|
|
284
|
+
|
|
285
|
+
- Write to `src/auth.ts` → ALLOWED
|
|
286
|
+
- Write to `tests/foo.test.ts` → BLOCKED with scope violation message
|
|
287
|
+
|
|
288
|
+
Contracts persist at `.proto/sprint-contract.json` and auto-restore on session resume.
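
The allow/block behavior above can be pictured with a small check like the one below. This is a minimal sketch under stated assumptions, not proto's implementation: `isWriteAllowed`, `normalize`, and the contract's `task`/`files` fields are hypothetical names chosen for illustration, and real path handling would likely be more thorough.

```javascript
// Hypothetical sketch of a scope-lock check; names are illustrative, not proto's API.
function normalize(path) {
  // Strip a leading "./" and collapse duplicate slashes so equivalent paths compare equal.
  return path.replace(/^\.\//, '').replace(/\/+/g, '/');
}

function isWriteAllowed(contract, targetPath) {
  const allowed = new Set(contract.files.map(normalize));
  return allowed.has(normalize(targetPath));
}

// Assumed contract shape, mirroring the workflow example above.
const contract = { task: 'Refactor auth module', files: ['src/auth.ts', 'src/utils.ts'] };
console.log(isWriteAllowed(contract, 'src/auth.ts')); // true
console.log(isWriteAllowed(contract, 'tests/foo.test.ts')); // false
```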
|
|
289
|
+
|
|
290
|
+
### Behavior Verification Gate
|
|
291
|
+
|
|
292
|
+
Post-run smoke tests that verify changes actually work. After a subagent completes, the gate runs your defined scenarios (shell commands) in parallel. Failures inject a remediation message back to the agent for self-correction.
|
|
293
|
+
|
|
294
|
+
**Setup** — create `.proto/verify-scenarios.json`:
|
|
295
|
+
|
|
296
|
+
```json
|
|
297
|
+
[
|
|
298
|
+
{ "name": "tests pass", "command": "npm test -- --run", "timeoutMs": 60000 },
|
|
299
|
+
{ "name": "build works", "command": "npm run build", "timeoutMs": 30000 },
|
|
300
|
+
{ "name": "no TypeScript errors", "command": "npm run typecheck" }
|
|
301
|
+
]
|
|
302
|
+
```
|
|
303
|
+
|
|
304
|
+
**Behavior:**
|
|
305
|
+
|
|
306
|
+
1. Agent completes task, reports GOAL
|
|
307
|
+
2. Gate fires, runs all scenarios in parallel
|
|
308
|
+
3. If any fail → remediation message injected, agent self-corrects
|
|
309
|
+
4. Gate fires again until all pass
|
|
310
|
+
|
|
311
|
+
### Multi-Sample Retry
|
|
312
|
+
|
|
313
|
+
When a subagent fails (ERROR, MAX_TURNS, or TIMEOUT), proto retries up to 2 more times with escalating temperatures (0.7 → 1.0 → 1.3). Each retry gets a `[RETRY CONTEXT]` block summarizing previous failures. The best-scoring result is returned.
|
|
314
|
+
|
|
315
|
+
This reduces false negatives from single-run failures and gives the model multiple chances with different sampling strategies.
|
|
316
|
+
|
|
317
|
+
### Repo Map
|
|
318
|
+
|
|
319
|
+
PageRank-based file importance ranking. Analyzes the project's TypeScript/JS import graph to surface the most central files. Useful for understanding codebase structure or finding related files.
|
|
320
|
+
|
|
321
|
+
**Usage:**
|
|
322
|
+
|
|
323
|
+
```bash
|
|
324
|
+
proto -p "Use the repo_map tool to find the most important files in this codebase"
|
|
325
|
+
proto -p "Use repo_map with seedFiles=['src/auth.ts'] to find related files"
|
|
326
|
+
```
|
|
327
|
+
|
|
328
|
+
Results are cached at `.proto/repo-map-cache.json` and auto-invalidate on file changes.
|
|
329
|
+
|
|
265
330
|
## Skills
|
|
266
331
|
|
|
267
|
-
proto ships with 16 bundled skills for agentic workflows:
|
|
332
|
+
proto ships with 22 bundled skills for agentic workflows:
|
|
268
333
|
|
|
334
|
+
- **browser-automation** — Web browser automation
|
|
269
335
|
- **brainstorming** — Structured ideation
|
|
270
336
|
- **dispatching-parallel-agents** — Fan-out/fan-in subagent patterns
|
|
271
337
|
- **executing-plans** — Step-by-step plan execution
|
|
@@ -285,6 +351,92 @@ proto ships with 16 bundled skills for agentic workflows:
|
|
|
285
351
|
|
|
286
352
|
Use `/skills` to list available skills in a session.
|
|
287
353
|
|
|
354
|
+
### Browser Automation
|
|
355
|
+
|
|
356
|
+
proto includes a native browser automation tool powered by [agent-browser](https://github.com/nickinack/agent-browser). This enables AI agents to interact with websites — navigate, click, fill forms, take screenshots, and extract content.
|
|
357
|
+
|
|
358
|
+
#### Installation
|
|
359
|
+
|
|
360
|
+
```bash
|
|
361
|
+
npm install -g agent-browser
|
|
362
|
+
agent-browser install # Downloads Chrome
|
|
363
|
+
```
|
|
364
|
+
|
|
365
|
+
#### Usage
|
|
366
|
+
|
|
367
|
+
```javascript
|
|
368
|
+
// Open a website
|
|
369
|
+
browser({ action: 'open', url: 'https://example.com' });
|
|
370
|
+
|
|
371
|
+
// Get interactive elements
|
|
372
|
+
browser({ action: 'snapshot', flags: JSON.stringify({ interactive: true }) });
|
|
373
|
+
|
|
374
|
+
// Click an element
|
|
375
|
+
browser({ action: 'click', selector: '@e2' });
|
|
376
|
+
|
|
377
|
+
// Fill a form
|
|
378
|
+
browser({ action: 'fill', selector: '@e1', text: 'user@example.com' });
|
|
379
|
+
|
|
380
|
+
// Take screenshot
|
|
381
|
+
browser({ action: 'screenshot', outputPath: '/path/to/screenshot.png' });
|
|
382
|
+
```
|
|
383
|
+
|
|
384
|
+
#### Key Actions
|
|
385
|
+
|
|
386
|
+
| Action | Description |
|
|
387
|
+
| ------------------------------ | ------------------------------------------ |
|
|
388
|
+
| `open` / `close` | Navigate to URL or close browser |
|
|
389
|
+
| `click` / `dblclick` / `hover` | Element interaction |
|
|
390
|
+
| `fill` / `type` | Form input |
|
|
391
|
+
| `snapshot` | Get accessibility tree with element refs |
|
|
392
|
+
| `screenshot` | Capture page screenshot |
|
|
393
|
+
| `get` / `is` / `find` | Query element properties |
|
|
394
|
+
| `wait` | Wait for elements, network, or URL changes |
|
|
395
|
+
| `batch` | Execute multiple commands in sequence |
|
|
396
|
+
|
|
397
|
+
The browser skill (`/skills` → browser-automation) provides comprehensive documentation for all 38 available actions.
|
|
398
|
+
|
|
399
|
+
## Agent Teams
|
|
400
|
+
|
|
401
|
+
Run multiple coordinated agents that share tasks and communicate directly with each other.
|
|
402
|
+
|
|
403
|
+
```
|
|
404
|
+
/team start my-team lead:coordinator scout:Explore coder:general-purpose
|
|
405
|
+
```
|
|
406
|
+
|
|
407
|
+
This spawns three live agents immediately. Each member runs as an in-process agent and gets two extra tools injected automatically:
|
|
408
|
+
|
|
409
|
+
- **`mailbox_send`** — send a message to a teammate by their agentId
|
|
410
|
+
- **`mailbox_receive`** — drain all unread messages from your inbox
|
|
411
|
+
|
|
412
|
+
Members share the same task list (`task_create`, `task_list`, `task_update`) so any agent can create tasks and others can claim them.
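
To make the send and drain semantics concrete, here is a minimal in-memory sketch. It is an illustration only: `createMailbox` and the message shape are hypothetical, and the real `mailbox_send` / `mailbox_receive` tools may behave differently in detail.

```javascript
// Hypothetical in-memory mailbox illustrating "send" and "drain all unread".
function createMailbox() {
  const inboxes = new Map(); // agentId -> array of unread messages
  return {
    send(from, to, body) {
      if (!inboxes.has(to)) inboxes.set(to, []);
      inboxes.get(to).push({ from, body });
    },
    receive(agentId) {
      const unread = inboxes.get(agentId) || [];
      inboxes.set(agentId, []); // draining leaves the inbox empty
      return unread;
    },
  };
}

const mailbox = createMailbox();
mailbox.send('lead-0', 'scout-1', 'Map the auth module');
console.log(mailbox.receive('scout-1').length); // 1
console.log(mailbox.receive('scout-1').length); // 0
```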
|
|
413
|
+
|
|
414
|
+
### Team commands
|
|
415
|
+
|
|
416
|
+
| Command | Description |
|
|
417
|
+
| -------------------------------------- | ------------------------------------- |
|
|
418
|
+
| `/team start <name> [member:type ...]` | Spawn live agents and start the team |
|
|
419
|
+
| `/team status <name>` | Show live member status |
|
|
420
|
+
| `/team stop <name>` | Kill all agents and release resources |
|
|
421
|
+
| `/team list` | List all teams in the project |
|
|
422
|
+
| `/team delete <name>` | Delete a team config |
|
|
423
|
+
|
|
424
|
+
**Default team** (no members specified): `lead` (coordinator) + `scout` (Explore).
|
|
425
|
+
|
|
426
|
+
Agent IDs follow the pattern `<name>-<index>` (e.g. `lead-0`, `scout-1`). Use these when sending mailbox messages between agents.
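
As a rough sketch of the ID pattern, the snippet below derives agent IDs from `member:type` arguments. It assumes the index is the member's position in the `/team start` command, which matches the `lead-0` / `scout-1` example but is not confirmed here; `assignAgentIds` is a hypothetical helper.

```javascript
// Hypothetical sketch: turn `member:type` arguments into IDs of the form <name>-<index>.
// Assumption: the index is the member's position in the start command.
function assignAgentIds(memberSpecs) {
  return memberSpecs.map((spec, index) => {
    const [name, type] = spec.split(':');
    return { agentId: `${name}-${index}`, name, type };
  });
}

const members = assignAgentIds(['lead:coordinator', 'scout:Explore', 'coder:general-purpose']);
console.log(members.map((m) => m.agentId).join(', ')); // lead-0, scout-1, coder-2
```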
|
|
427
|
+
|
|
428
|
+
### Member types
|
|
429
|
+
|
|
430
|
+
| Type | Purpose |
|
|
431
|
+
| ----------------- | ----------------------------------------- |
|
|
432
|
+
| `coordinator` | Orchestrate subtasks across other members |
|
|
433
|
+
| `Explore` | Fast codebase search and analysis |
|
|
434
|
+
| `general-purpose` | Multi-step implementation tasks |
|
|
435
|
+
| `verify` | Review and correctness checking |
|
|
436
|
+
| `plan` | Design plans before implementation |
|
|
437
|
+
|
|
438
|
+
Any user-defined sub-agent from `.proto/agents/` can also be used as a member type.
|
|
439
|
+
|
|
288
440
|
## Commands
|
|
289
441
|
|
|
290
442
|
| Command | Description |
|
|
package/bundled/browser-automation/SKILL.md
@@ -0,0 +1,358 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: browser-automation
|
|
3
|
+
description: Use browser automation to interact with web pages - navigate, click, fill forms, take screenshots, and extract content from websites. Perfect for testing, data collection, and web interactions.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Browser Automation
|
|
7
|
+
|
|
8
|
+
Use the native `browser` tool to automate web browser interactions. This skill enables AI agents to navigate websites, interact with elements, fill forms, and extract information.
|
|
9
|
+
|
|
10
|
+
## Prerequisites
|
|
11
|
+
|
|
12
|
+
**Before using this skill, verify:**
|
|
13
|
+
|
|
14
|
+
1. **agent-browser is installed:**
|
|
15
|
+
|
|
16
|
+
```bash
|
|
17
|
+
agent-browser --version
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
If not installed:
|
|
21
|
+
|
|
22
|
+
```bash
|
|
23
|
+
npm install -g agent-browser
|
|
24
|
+
agent-browser install # Downloads Chrome
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
2. **Chrome is installed:**
|
|
28
|
+
The browser tool will auto-download Chrome if needed via `agent-browser install`.
|
|
29
|
+
|
|
30
|
+
## Core Workflow
|
|
31
|
+
|
|
32
|
+
### Step 1: Open a Website
|
|
33
|
+
|
|
34
|
+
```javascript
|
|
35
|
+
browser({
|
|
36
|
+
action: 'open',
|
|
37
|
+
url: 'https://example.com',
|
|
38
|
+
headed: false, // true to see browser window
|
|
39
|
+
});
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
### Step 2: Get Interactive Elements
|
|
43
|
+
|
|
44
|
+
Use `snapshot` to get an accessibility tree with numbered element references:
|
|
45
|
+
|
|
46
|
+
```javascript
|
|
47
|
+
browser({
|
|
48
|
+
action: 'snapshot',
|
|
49
|
+
flags: JSON.stringify({ interactive: true }),
|
|
50
|
+
});
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
**Output shows elements with refs like:**
|
|
54
|
+
|
|
55
|
+
```text
|
|
56
|
+
[e1] Button: "Submit"
|
|
57
|
+
[e2] Textbox: "Email"
|
|
58
|
+
[e3] Textbox: "Password"
|
|
59
|
+
```
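
If you need to pick a ref programmatically, output in the shape shown above can be parsed with a small helper. This is a sketch that assumes every line looks exactly like `[e1] Role: "Name"`; the actual snapshot format may carry more detail, and `parseSnapshot` is a hypothetical name.

```javascript
// Hypothetical parser for snapshot lines shaped like: [e2] Textbox: "Email"
function parseSnapshot(text) {
  const refs = [];
  for (const line of text.split('\n')) {
    const match = /^\[(e\d+)\]\s+(\w+):\s+"(.*)"$/.exec(line.trim());
    // Lines that do not match the assumed shape are skipped.
    if (match) refs.push({ selector: '@' + match[1], role: match[2], name: match[3] });
  }
  return refs;
}

const snapshot = '[e1] Button: "Submit"\n[e2] Textbox: "Email"\n[e3] Textbox: "Password"';
const email = parseSnapshot(snapshot).find((el) => el.name === 'Email');
console.log(email.selector); // @e2
```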
|
|
60
|
+
|
|
61
|
+
### Step 3: Interact with Elements
|
|
62
|
+
|
|
63
|
+
**Click an element:**
|
|
64
|
+
|
|
65
|
+
```javascript
|
|
66
|
+
browser({
|
|
67
|
+
action: 'click',
|
|
68
|
+
selector: '@e2', // or CSS selector like "#submit-btn"
|
|
69
|
+
});
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
**Fill a form field:**
|
|
73
|
+
|
|
74
|
+
```javascript
|
|
75
|
+
browser({
|
|
76
|
+
action: 'fill',
|
|
77
|
+
selector: '@e2',
|
|
78
|
+
text: 'user@example.com',
|
|
79
|
+
});
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
**Type with keystroke simulation:**
|
|
83
|
+
|
|
84
|
+
```javascript
|
|
85
|
+
browser({
|
|
86
|
+
action: 'type',
|
|
87
|
+
selector: '@e3',
|
|
88
|
+
text: 'password123',
|
|
89
|
+
});
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
### Step 4: Take Screenshots
|
|
93
|
+
|
|
94
|
+
**Page screenshot:**
|
|
95
|
+
|
|
96
|
+
```javascript
|
|
97
|
+
browser({
|
|
98
|
+
action: 'screenshot',
|
|
99
|
+
outputPath: '/path/to/screenshot.png',
|
|
100
|
+
});
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
**Full-page screenshot:**
|
|
104
|
+
|
|
105
|
+
```javascript
|
|
106
|
+
browser({
|
|
107
|
+
action: 'screenshot',
|
|
108
|
+
outputPath: '/path/to/full.png',
|
|
109
|
+
flags: JSON.stringify({ full: true }),
|
|
110
|
+
});
|
|
111
|
+
```
|
|
112
|
+
|
|
113
|
+
## Selector Types
|
|
114
|
+
|
|
115
|
+
| Type | Example | Use Case |
|
|
116
|
+
| ------------ | ------------------------------- | -------------------- |
|
|
117
|
+
| Element ref | `@e1`, `@e2` | From snapshot output |
|
|
118
|
+
| CSS selector | `#id`, `.class`, `div > button` | Standard web dev |
|
|
119
|
+
| Semantic | `role:button`, `text:"Sign In"` | AI-friendly |
|
|
120
|
+
|
|
121
|
+
## Common Actions Reference
|
|
122
|
+
|
|
123
|
+
### Navigation
|
|
124
|
+
|
|
125
|
+
```javascript
|
|
126
|
+
// Open URL
|
|
127
|
+
browser({ action: 'open', url: 'https://example.com' });
|
|
128
|
+
|
|
129
|
+
// Close browser
|
|
130
|
+
browser({ action: 'close' });
|
|
131
|
+
|
|
132
|
+
// New tab
|
|
133
|
+
browser({ action: 'tab', text: 'new', url: 'https://example.com' });
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
### Element Interaction
|
|
137
|
+
|
|
138
|
+
```javascript
|
|
139
|
+
// Click
|
|
140
|
+
browser({ action: 'click', selector: '@e1' });
|
|
141
|
+
|
|
142
|
+
// Double-click
|
|
143
|
+
browser({ action: 'dblclick', selector: '@e1' });
|
|
144
|
+
|
|
145
|
+
// Hover
|
|
146
|
+
browser({ action: 'hover', selector: '@e1' });
|
|
147
|
+
|
|
148
|
+
// Fill (clears and types)
|
|
149
|
+
browser({ action: 'fill', selector: '@e2', text: 'value' });
|
|
150
|
+
|
|
151
|
+
// Type (appends)
|
|
152
|
+
browser({ action: 'type', selector: '@e2', text: 'value' });
|
|
153
|
+
|
|
154
|
+
// Press key
|
|
155
|
+
browser({ action: 'press', key: 'Enter' });
|
|
156
|
+
browser({ action: 'press', key: 'Tab' });
|
|
157
|
+
browser({ action: 'press', key: 'Escape' });
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
### Page Information
|
|
161
|
+
|
|
162
|
+
```javascript
|
|
163
|
+
// Get page title
|
|
164
|
+
browser({ action: 'get', text: 'title' });
|
|
165
|
+
|
|
166
|
+
// Get current URL
|
|
167
|
+
browser({ action: 'get', text: 'url' });
|
|
168
|
+
|
|
169
|
+
// Get element text
|
|
170
|
+
browser({ action: 'get', selector: '@e1', text: 'text' });
|
|
171
|
+
|
|
172
|
+
// Get element attribute
|
|
173
|
+
browser({ action: 'get', selector: '@e1', text: 'attr', attribute: 'href' });
|
|
174
|
+
|
|
175
|
+
// Check if visible
|
|
176
|
+
browser({ action: 'is', text: 'visible', selector: '@e1' });
|
|
177
|
+
```
|
|
178
|
+
|
|
179
|
+
### Find Elements
|
|
180
|
+
|
|
181
|
+
```javascript
|
|
182
|
+
// Find by role and click
|
|
183
|
+
browser({ action: 'find', selector: 'role button', text: 'click' });
|
|
184
|
+
|
|
185
|
+
// Find by text
|
|
186
|
+
browser({ action: 'find', selector: 'text "Sign In"', text: 'click' });
|
|
187
|
+
|
|
188
|
+
// Find by label
|
|
189
|
+
browser({ action: 'find', selector: 'label "Email"', text: 'fill' });
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
### Waiting
|
|
193
|
+
|
|
194
|
+
```javascript
|
|
195
|
+
// Wait for URL pattern
|
|
196
|
+
browser({ action: 'wait', text: '**/dashboard' });
|
|
197
|
+
|
|
198
|
+
// Wait for text
|
|
199
|
+
browser({ action: 'wait', text: 'Welcome' });
|
|
200
|
+
|
|
201
|
+
// Wait for network idle
|
|
202
|
+
browser({ action: 'wait', text: 'networkidle' });
|
|
203
|
+
|
|
204
|
+
// Wait for selector
|
|
205
|
+
browser({ action: 'wait', selector: '@e1' });
|
|
206
|
+
|
|
207
|
+
// Wait with the default timeout (25s)
|
|
208
|
+
browser({ action: 'wait', selector: '@e1' });
|
|
209
|
+
```
|
|
210
|
+
|
|
211
|
+
### Batch Operations
|
|
212
|
+
|
|
213
|
+
Execute multiple commands in sequence:
|
|
214
|
+
|
|
215
|
+
```javascript
|
|
216
|
+
browser({
|
|
217
|
+
action: 'batch',
|
|
218
|
+
commands: [
|
|
219
|
+
'open https://example.com',
|
|
220
|
+
'wait --load networkidle',
|
|
221
|
+
'fill @e1 user@example.com',
|
|
222
|
+
'fill @e2 password',
|
|
223
|
+
'click @e3',
|
|
224
|
+
'wait --url **/dashboard',
|
|
225
|
+
],
|
|
226
|
+
});
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
## Sessions
|
|
230
|
+
|
|
231
|
+
Sessions persist authentication and state between commands:
|
|
232
|
+
|
|
233
|
+
```javascript
|
|
234
|
+
// Create named session
|
|
235
|
+
browser({
|
|
236
|
+
action: 'open',
|
|
237
|
+
url: 'https://app.example.com',
|
|
238
|
+
session: 'myapp-auth',
|
|
239
|
+
});
|
|
240
|
+
|
|
241
|
+
// Continue in same session
|
|
242
|
+
browser({
|
|
243
|
+
action: 'click',
|
|
244
|
+
selector: '@e1',
|
|
245
|
+
session: 'myapp-auth',
|
|
246
|
+
});
|
|
247
|
+
```
|
|
248
|
+
|
|
249
|
+
## Advanced Features
|
|
250
|
+
|
|
251
|
+
### Headed Mode (see browser window)
|
|
252
|
+
|
|
253
|
+
```javascript
|
|
254
|
+
browser({
|
|
255
|
+
action: 'open',
|
|
256
|
+
url: 'https://example.com',
|
|
257
|
+
headed: true,
|
|
258
|
+
});
|
|
259
|
+
```
|
|
260
|
+
|
|
261
|
+
### Network Interception
|
|
262
|
+
|
|
263
|
+
```javascript
|
|
264
|
+
// Block a URL
|
|
265
|
+
browser({
|
|
266
|
+
action: 'network',
|
|
267
|
+
text: 'route',
|
|
268
|
+
selector: 'https://ads.example.com',
|
|
269
|
+
attribute: 'abort', // or mock response
|
|
270
|
+
});
|
|
271
|
+
|
|
272
|
+
// View requests
|
|
273
|
+
browser({ action: 'network', text: 'requests' });
|
|
274
|
+
```
|
|
275
|
+
|
|
276
|
+
### Cookies & Storage
|
|
277
|
+
|
|
278
|
+
```javascript
|
|
279
|
+
// Get cookies
|
|
280
|
+
browser({ action: 'cookies' });
|
|
281
|
+
|
|
282
|
+
// Get localStorage
|
|
283
|
+
browser({ action: 'storage', text: 'local' });
|
|
284
|
+
|
|
285
|
+
// Set localStorage
|
|
286
|
+
browser({
|
|
287
|
+
action: 'storage',
|
|
288
|
+
text: 'local',
|
|
289
|
+
selector: 'set myKey myValue',
|
|
290
|
+
});
|
|
291
|
+
```
|
|
292
|
+
|
|
293
|
+
### Clipboard
|
|
294
|
+
|
|
295
|
+
```javascript
|
|
296
|
+
// Read clipboard
|
|
297
|
+
browser({ action: 'clipboard' });
|
|
298
|
+
|
|
299
|
+
// Write to clipboard
|
|
300
|
+
browser({ action: 'clipboard', text: 'write Hello World' });
|
|
301
|
+
```
|
|
302
|
+
|
|
303
|
+
## Best Practices
|
|
304
|
+
|
|
305
|
+
1. **Always start with `snapshot`**: Get the accessibility tree to understand page structure
|
|
306
|
+
|
|
307
|
+
2. **Use element refs from snapshot**: They're more reliable than CSS selectors for AI automation
|
|
308
|
+
|
|
309
|
+
3. **Wait for network idle**: After form submissions or page loads:
|
|
310
|
+
|
|
311
|
+
```javascript
|
|
312
|
+
browser({ action: 'wait', text: 'networkidle' });
|
|
313
|
+
```
|
|
314
|
+
|
|
315
|
+
4. **Use sessions for multi-step flows**: Maintains login state and cookies:
|
|
316
|
+
|
|
317
|
+
```javascript
|
|
318
|
+
browser({ action: 'open', url: 'https://app.com', session: 'app' });
|
|
319
|
+
```
|
|
320
|
+
|
|
321
|
+
5. **Take screenshots for debugging**: When automation fails, a screenshot helps pinpoint the cause:
|
|
322
|
+
|
|
323
|
+
```javascript
|
|
324
|
+
browser({ action: 'screenshot', outputPath: '/tmp/debug.png' });
|
|
325
|
+
```
|
|
326
|
+
|
|
327
|
+
6. **Close browser when done**: Free resources:
|
|
328
|
+
```javascript
|
|
329
|
+
browser({ action: 'close' });
|
|
330
|
+
```
|
|
331
|
+
|
|
332
|
+
## Error Handling
|
|
333
|
+
|
|
334
|
+
If agent-browser is not installed, you'll get:
|
|
335
|
+
|
|
336
|
+
```text
|
|
337
|
+
agent-browser is not installed. Please install it with: npm install -g agent-browser
|
|
338
|
+
```
|
|
339
|
+
|
|
340
|
+
**Installation troubleshooting:**
|
|
341
|
+
|
|
342
|
+
```bash
|
|
343
|
+
# Install Chrome dependency
|
|
344
|
+
agent-browser install --with-deps
|
|
345
|
+
|
|
346
|
+
# Check installation
|
|
347
|
+
which agent-browser
|
|
348
|
+
agent-browser --version
|
|
349
|
+
```
|
|
350
|
+
|
|
351
|
+
## Use Cases
|
|
352
|
+
|
|
353
|
+
- **Web testing**: Automate UI testing workflows
|
|
354
|
+
- **Form filling**: Submit forms programmatically
|
|
355
|
+
- **Data extraction**: Scrape content from websites
|
|
356
|
+
- **Login flows**: Handle authentication
|
|
357
|
+
- **Screenshot capture**: Document web pages
|
|
358
|
+
- **Accessibility testing**: Verify page structure
|
|
package/bundled/harness-reference/SKILL.md
@@ -0,0 +1,146 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: harness-reference
|
|
3
|
+
description: Reference guide for all agent harness safety features — doom loop detection, scope lock, git checkpoints, observation masking, sprint contract, reminders, repo map, behavior verification, and multi-sample retry
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Agent Harness Reference
|
|
7
|
+
|
|
8
|
+
The proto harness is a set of safety and reliability features that wrap every agent execution. They fire automatically — you don't need to invoke them manually. This skill documents each feature so you can understand what's protecting you and how to configure it.
|
|
9
|
+
|
|
10
|
+
## Features
|
|
11
|
+
|
|
12
|
+
### Doom Loop Detection
|
|
13
|
+
|
|
14
|
+
**What it does:** Detects when the agent is repeating the same tool call pattern in a sliding 20-call window. If the same fingerprint (tool + args hash) appears 3+ times, the harness injects a recovery message and records a Langfuse span.
|
|
15
|
+
|
|
16
|
+
**You don't need to do anything.** The harness detects this automatically.
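
For intuition, the sliding-window check can be sketched as follows. This is an illustration rather than the harness code: the window size and threshold come from the description above, while `fingerprint` uses serialized arguments in place of a real hash, and `createDoomLoopDetector` is a hypothetical name.

```javascript
// Hypothetical sketch of sliding-window loop detection; constants mirror the text above.
const WINDOW_SIZE = 20;
const DOOM_REPEAT_THRESHOLD = 3;

function fingerprint(call) {
  // Assumption: tool name plus serialized args stands in for the "args hash".
  return call.tool + ':' + JSON.stringify(call.args);
}

function createDoomLoopDetector() {
  const recent = []; // fingerprints of the last WINDOW_SIZE calls
  return function record(call) {
    recent.push(fingerprint(call));
    if (recent.length > WINDOW_SIZE) recent.shift();
    const latest = recent[recent.length - 1];
    const repeats = recent.filter((fp) => fp === latest).length;
    return repeats >= DOOM_REPEAT_THRESHOLD; // true means "inject a recovery message"
  };
}

const record = createDoomLoopDetector();
const call = { tool: 'read_file', args: { path: 'src/auth.ts' } };
console.log(record(call), record(call), record(call)); // false false true
```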
|
|
17
|
+
|
|
18
|
+
### Scope Lock (Sprint Contract)
|
|
19
|
+
|
|
20
|
+
**What it does:** Before coding, the `sprint-contract` skill negotiates an explicit contract — the set of files that may change. Once activated, any write outside that set is blocked with a structured error.
|
|
21
|
+
|
|
22
|
+
**To activate:** Use the `sprint-contract` skill at the start of an implementation task. It writes `.proto/sprint-contract.json` and arms the in-memory scope lock. The lock is restored on session restart.
|
|
23
|
+
|
|
24
|
+
**To check status:** If a write is blocked, the error message tells you the violating path and the permitted set.
|
|
25
|
+
|
|
26
|
+
### Git Checkpoints
|
|
27
|
+
|
|
28
|
+
**What it does:** Before every file-mutating tool call (`write_file`, `edit`, `replace`), the harness creates a shadow-repo commit. This lets you diff or roll back to any pre-edit state.
|
|
29
|
+
|
|
30
|
+
**To roll back:** Use `git log` to find the checkpoint commit and `git checkout <hash> -- <file>` to restore.
|
|
31
|
+
|
|
32
|
+
### Observation Masking
|
|
33
|
+
|
|
34
|
+
**What it does:** When the context window gets large, the harness applies a rolling verbatim window — tool-call/result pairs older than the window are summarized as `[OBSERVATION_MASK: N pairs omitted]`. This keeps recent context intact while reducing token usage.
|
|
35
|
+
|
|
36
|
+
**You don't need to do anything.** Fires automatically during LLM compaction.
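
A rolling verbatim window of this kind might look like the sketch below. The function name and the representation of pairs as strings are assumptions made for illustration; the real window size is governed by the `INCREMENTAL_PROTECTED_TAIL` constant, whose value is not shown here.

```javascript
// Hypothetical sketch: keep the newest `tailSize` tool-call/result pairs verbatim
// and replace everything older with a single mask marker.
function maskObservations(pairs, tailSize) {
  if (pairs.length <= tailSize) return pairs.slice();
  const omitted = pairs.length - tailSize;
  return [`[OBSERVATION_MASK: ${omitted} pairs omitted]`, ...pairs.slice(omitted)];
}

const pairs = ['read a.ts', 'read b.ts', 'edit a.ts', 'run tests'];
const masked = maskObservations(pairs, 2);
console.log(masked[0]); // [OBSERVATION_MASK: 2 pairs omitted]
console.log(masked.length); // 3
```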
|
|
37
|
+
|
|
38
|
+
### Harness Reminders
|
|
39
|
+
|
|
40
|
+
**What it does:** The harness injects periodic reminders into context based on three triggers:
|
|
41
|
+
|
|
42
|
+
- Every 50 tool calls: warns about high tool usage
|
|
43
|
+
- After 3 consecutive test failures: suggests pausing to diagnose
|
|
44
|
+
- After 8 turns without any file write: suggests the agent may be over-analyzing
|
|
45
|
+
|
|
46
|
+
**You don't need to do anything.** The harness injects these automatically.
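
The three triggers can be expressed as a small function over running counters. This is a sketch only: the thresholds come from the list above, but the counter names, the reminder labels, and the choice to keep firing once a threshold is reached are all assumptions.

```javascript
// Hypothetical sketch of the three reminder triggers; thresholds come from the list above.
function dueReminders(state) {
  const reminders = [];
  if (state.toolCalls > 0 && state.toolCalls % 50 === 0) reminders.push('high_tool_usage');
  if (state.consecutiveTestFailures >= 3) reminders.push('diagnose_failures');
  if (state.turnsSinceLastWrite >= 8) reminders.push('possible_over_analysis');
  return reminders;
}

const due = dueReminders({ toolCalls: 100, consecutiveTestFailures: 1, turnsSinceLastWrite: 9 });
console.log(due.join(', ')); // high_tool_usage, possible_over_analysis
```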
|
|
47
|
+
|
|
48
|
+
### Repo Map (`repo_map` tool)
|
|
49
|
+
|
|
50
|
+
**What it does:** Analyzes the import graph of the codebase and runs PageRank to surface the most-connected (and most-relevant) files. Call it at the start of any exploration or implementation task for fast orientation.
|
|
51
|
+
|
|
52
|
+
**To use:**
|
|
53
|
+
|
|
54
|
+
```
|
|
55
|
+
repo_map {} # globally most-connected files
|
|
56
|
+
repo_map { seedFiles: ["/abs/path"] } # personalized from known-relevant files
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
Results are cached at `.proto/repo-map-cache.json` and invalidated on file changes.
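
To illustrate the idea, here is a minimal personalized PageRank over a toy import graph. It is a sketch, not the `repo_map` implementation: the damping factor, iteration count, and graph representation are assumptions, and it expects every imported file (and every seed file) to appear as a key in `graph`.

```javascript
// Minimal personalized PageRank over an import graph (illustrative sketch).
// `graph` maps each file to the files it imports; `seedFiles` biases the
// restart distribution toward known-relevant files (uniform when empty).
function pageRank(graph, seedFiles = [], damping = 0.85, iterations = 50) {
  const nodes = Object.keys(graph);
  const seeds = seedFiles.length > 0 ? seedFiles : nodes;
  const restart = Object.fromEntries(
    nodes.map((n) => [n, seeds.includes(n) ? 1 / seeds.length : 0])
  );
  let rank = { ...restart };
  for (let i = 0; i < iterations; i++) {
    const next = Object.fromEntries(nodes.map((n) => [n, (1 - damping) * restart[n]]));
    for (const n of nodes) {
      const targets = graph[n];
      if (targets.length > 0) {
        // A file passes its rank to the files it imports.
        for (const t of targets) next[t] += (damping * rank[n]) / targets.length;
      } else {
        // A file with no imports hands its rank back via the restart distribution.
        for (const m of nodes) next[m] += damping * rank[n] * restart[m];
      }
    }
    rank = next;
  }
  return nodes.sort((a, b) => rank[b] - rank[a]);
}

const graph = {
  'src/cli.ts': ['src/auth.ts', 'src/utils.ts'],
  'src/auth.ts': ['src/utils.ts'],
  'src/utils.ts': [],
};
console.log(pageRank(graph)[0]); // src/utils.ts
console.log(pageRank(graph, ['src/auth.ts'])[0]); // src/auth.ts
```

Without seeds, the file imported by everything else ranks first; with a seed, rank concentrates on the seed and the files it reaches.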
|
|
60
|
+
|
|
61
|
+
### Behavior Verification Gate
|
|
62
|
+
|
|
63
|
+
**What it does:** After every subagent task that completes successfully, the harness runs user-configured "verification scenarios" — shell commands that check your feature actually works. Failures are injected back to the agent for self-correction.
|
|
64
|
+
|
|
65
|
+
**To configure:** Create `.proto/verify-scenarios.json`:
|
|
66
|
+
|
|
67
|
+
```json
|
|
68
|
+
[
|
|
69
|
+
{
|
|
70
|
+
"name": "Unit tests pass",
|
|
71
|
+
"command": "npm test -- --run",
|
|
72
|
+
"timeoutMs": 60000
|
|
73
|
+
},
|
|
74
|
+
{
|
|
75
|
+
"name": "Build succeeds",
|
|
76
|
+
"command": "npm run build",
|
|
77
|
+
"timeoutMs": 30000
|
|
78
|
+
},
|
|
79
|
+
{
|
|
80
|
+
"name": "API health check",
|
|
81
|
+
"command": "curl -sf http://localhost:3000/health",
|
|
82
|
+
"expectedPattern": "ok",
|
|
83
|
+
"timeoutMs": 5000
|
|
84
|
+
}
|
|
85
|
+
]
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
See `.proto/verify-scenarios.example.json` for a full reference.
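
One way to think about a scenario's outcome is sketched below. The pass criterion is an assumption: a scenario is treated as passing when the command exits with 0 and, if `expectedPattern` is set, the pattern matches the captured stdout. The `result` shape and both function names are hypothetical.

```javascript
// Hypothetical sketch: decide whether a scenario passed, given its captured result.
// Assumption: pass = exit code 0, no timeout, and `expectedPattern` (if any) matches stdout.
function scenarioPassed(scenario, result) {
  if (result.timedOut || result.exitCode !== 0) return false;
  if (scenario.expectedPattern === undefined) return true;
  return new RegExp(scenario.expectedPattern).test(result.stdout);
}

// Collect failures into a message the agent could act on; null means all passed.
function remediationMessage(scenarios, results) {
  const failed = scenarios.filter((s, i) => !scenarioPassed(s, results[i]));
  if (failed.length === 0) return null;
  return 'Verification failed: ' + failed.map((s) => s.name).join(', ');
}

const scenarios = [
  { name: 'Build succeeds', command: 'npm run build', timeoutMs: 30000 },
  { name: 'API health check', command: 'curl -sf http://localhost:3000/health', expectedPattern: 'ok', timeoutMs: 5000 },
];
const results = [
  { exitCode: 0, stdout: 'done', timedOut: false },
  { exitCode: 0, stdout: 'degraded', timedOut: false },
];
console.log(remediationMessage(scenarios, results)); // Verification failed: API health check
```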
|
|
89
|
+
|
|
90
|
+
### Multi-Sample Retry (`multi_sample: true`)
|
|
91
|
+
|
|
92
|
+
**What it does:** When a subagent fails (doom loop, error, or max turns exceeded), the harness automatically retries up to 2 more times with escalating temperatures (0.7 → 1.0 → 1.3) and injects the failure context into each retry prompt. Each attempt is scored, and the best-scoring result is returned.
|
|
93
|
+
|
|
94
|
+
**Scoring:**
|
|
95
|
+
|
|
96
|
+
- GOAL + behavior gate pass → 3 (perfect)
|
|
97
|
+
- GOAL + no gate configured → 3
|
|
98
|
+
- GOAL + gate fail → 2 (completed but not verified)
|
|
99
|
+
- MAX_TURNS / TIMEOUT → 1 (partial)
|
|
100
|
+
- ERROR → 0 (failure)
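
The scoring rules can be written down as a short function. This sketch makes a few assumptions that are labeled in the comments: the attempt shape, the mapping of 0.7 to the first attempt, and the tie-breaking rule are illustrative and not taken from the harness source.

```javascript
// Hypothetical sketch of the scoring rules above; not the harness's actual code.
// Assumption: the first attempt runs at 0.7 and the two retries at 1.0 and 1.3.
const TEMPERATURES = [0.7, 1.0, 1.3];

function scoreAttempt(attempt) {
  if (attempt.status === 'GOAL') return attempt.gate === 'fail' ? 2 : 3;
  if (attempt.status === 'MAX_TURNS' || attempt.status === 'TIMEOUT') return 1;
  return 0; // ERROR
}

// Return the highest-scoring attempt; on ties the earliest attempt wins (an assumption).
function pickBest(attempts) {
  return attempts.reduce((best, a) => (scoreAttempt(a) > scoreAttempt(best) ? a : best));
}

const attempts = [
  { temperature: TEMPERATURES[0], status: 'ERROR' },
  { temperature: TEMPERATURES[1], status: 'MAX_TURNS' },
  { temperature: TEMPERATURES[2], status: 'GOAL', gate: 'fail' },
];
console.log(pickBest(attempts).temperature); // 1.3
```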
|
|
101
|
+
|
|
102
|
+
**To enable:** Set `multi_sample: true` on the Agent tool call:
|
|
103
|
+
|
|
104
|
+
```
|
|
105
|
+
Agent {
|
|
106
|
+
subagent_type: "general-purpose",
|
|
107
|
+
prompt: "implement the auth service",
|
|
108
|
+
multi_sample: true
|
|
109
|
+
}
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
Use for complex tasks with a history of failure, not for simple searches.
|
|
113
|
+
|
|
114
|
+
### Sprint Contract Service
|
|
115
|
+
|
|
116
|
+
**What it does:** Manages the full sprint contract lifecycle — parse, activate scope lock, persist to disk, load on resume. See the `sprint-contract` skill for usage.
|
|
117
|
+
|
|
118
|
+
**Files involved:**
|
|
119
|
+
|
|
120
|
+
- `.proto/sprint-contract.json` — persisted contract (restored on session start)
|
|
121
|
+
- `SprintContractService` — programmatic API
|
|
122
|
+
|
|
123
|
+
## Langfuse Fine-Tuning Data
|
|
124
|
+
|
|
125
|
+
All harness interventions emit OTel spans routed to Langfuse via OTLP → Tempo. To build fine-tuning datasets:
|
|
126
|
+
|
|
127
|
+
1. In Langfuse > Traces, filter by span name = `harness.intervention`
|
|
128
|
+
2. Use `harness.intervention.type` attribute to segment by type:
|
|
129
|
+
- `doom_loop` — recovery from loops
|
|
130
|
+
- `scope_violation` — scope lock enforcement
|
|
131
|
+
- `verification_failed` — post-edit and behavior gate failures
|
|
132
|
+
- `reminder.*` — context reminders
|
|
133
|
+
3. Export matching traces → dataset items
|
|
134
|
+
4. Annotate `harness.outcome` = `"recovered"` | `"not_recovered"`
|
|
135
|
+
5. Train on (input_context, intervention_message) pairs where outcome = recovered
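
Steps 2 to 5 amount to a filter and a projection over exported spans. The sketch below assumes a simplified span shape (`name`, `attributes`, `inputContext`, `interventionMessage`) that is hypothetical and does not reflect Langfuse's actual export schema.

```javascript
// Hypothetical sketch: select recovered interventions and shape them as training pairs.
// The span shape here is assumed for illustration only.
function buildDataset(spans) {
  return spans
    .filter((s) => s.name === 'harness.intervention')
    .filter((s) => s.attributes['harness.outcome'] === 'recovered')
    .map((s) => ({
      type: s.attributes['harness.intervention.type'],
      input: s.inputContext,
      output: s.interventionMessage,
    }));
}

const spans = [
  {
    name: 'harness.intervention',
    attributes: { 'harness.intervention.type': 'doom_loop', 'harness.outcome': 'recovered' },
    inputContext: 'agent repeated read_file three times',
    interventionMessage: 'You are repeating the same call; try a different approach.',
  },
  {
    name: 'harness.intervention',
    attributes: { 'harness.intervention.type': 'scope_violation', 'harness.outcome': 'not_recovered' },
    inputContext: 'agent wrote outside scope',
    interventionMessage: 'Write blocked: path is outside the sprint contract.',
  },
  { name: 'llm.call', attributes: {}, inputContext: '', interventionMessage: '' },
];
console.log(buildDataset(spans).length); // 1
```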
|
|
136
|
+
|
|
137
|
+
## Configuration Summary
|
|
138
|
+
|
|
139
|
+
| Feature | Config location | Default |
|
|
140
|
+
| ----------------------- | ------------------------------------------------ | ------------------------------------ |
|
|
141
|
+
| Doom loop threshold | Code constant (`DOOM_REPEAT_THRESHOLD = 3`) | Always on |
|
|
142
|
+
| Scope lock | `.proto/sprint-contract.json` | Off until sprint-contract skill runs |
|
|
143
|
+
| Behavior gate scenarios | `.proto/verify-scenarios.json` | No scenarios (off) |
|
|
144
|
+
| Multi-sample retry | `multi_sample: true` on Agent call | Off (opt-in) |
|
|
145
|
+
| Observation mask window | Code constant (`INCREMENTAL_PROTECTED_TAIL`) | Always on |
|
|
146
|
+
| Harness reminders | Code constants (50 calls / 3 failures / 8 turns) | Always on |
|
|
package/bundled/subagent-driven-development/SKILL.md
@@ -120,6 +120,16 @@ Implementer subagents report one of four statuses. Handle each appropriately:
|
|
|
120
120
|
|
|
121
121
|
**Never** ignore an escalation or force the same model to retry without changes. If the implementer said it's stuck, something needs to change.
|
|
122
122
|
|
|
123
|
+
## Harness Features
|
|
124
|
+
|
|
125
|
+
The harness provides automatic safety nets you can leverage when dispatching implementers:
|
|
126
|
+
|
|
127
|
+
**Multi-sample retry** (`multi_sample: true`): For complex or high-risk tasks, set this on the Agent tool call. If the implementer fails (doom loop, error, max turns), the harness automatically retries up to 2 more times with escalating temperatures (0.7 → 1.0 → 1.3) and injects the failure context into each retry prompt. Returns the best result. Use for tasks that have previously failed or that touch many files.
|
|
128
|
+
|
|
129
|
+
**Behavior verification gate**: If `.proto/verify-scenarios.json` exists in the project, the harness runs those scenarios after every successful implementer completion. Failures are injected back to the model for self-correction. Add scenarios for smoke tests, build checks, and HTTP health checks.
|
|
130
|
+
|
|
131
|
+
**Sprint contract scope lock**: If the implementing agent was given a sprint contract (via the `sprint-contract` skill), the scope lock prevents it from writing files outside the agreed set. Any violation is blocked and reported.
|
|
132
|
+
|
|
123
133
|
## Prompt Templates
|
|
124
134
|
|
|
125
135
|
- `./implementer-prompt.md` - Dispatch implementer subagent
|