@protolabsai/proto 0.22.0 → 0.24.0

Files changed (53)
  1. package/README.md +153 -1
  2. package/bundled/browser-automation/SKILL.md +358 -0
  3. package/bundled/harness-reference/SKILL.md +146 -0
  4. package/bundled/subagent-driven-development/SKILL.md +10 -0
  5. package/cli.js +7201 -5138
  6. package/package.json +2 -2
  7. package/bundled/qc-helper/docs/_meta.ts +0 -30
  8. package/bundled/qc-helper/docs/common-workflow.md +0 -571
  9. package/bundled/qc-helper/docs/configuration/_meta.ts +0 -10
  10. package/bundled/qc-helper/docs/configuration/auth.md +0 -366
  11. package/bundled/qc-helper/docs/configuration/memory.md +0 -0
  12. package/bundled/qc-helper/docs/configuration/model-providers.md +0 -542
  13. package/bundled/qc-helper/docs/configuration/qwen-ignore.md +0 -55
  14. package/bundled/qc-helper/docs/configuration/settings.md +0 -661
  15. package/bundled/qc-helper/docs/configuration/themes.md +0 -160
  16. package/bundled/qc-helper/docs/configuration/trusted-folders.md +0 -61
  17. package/bundled/qc-helper/docs/extension/_meta.ts +0 -9
  18. package/bundled/qc-helper/docs/extension/extension-releasing.md +0 -204
  19. package/bundled/qc-helper/docs/extension/getting-started-extensions.md +0 -299
  20. package/bundled/qc-helper/docs/extension/introduction.md +0 -331
  21. package/bundled/qc-helper/docs/features/_meta.ts +0 -20
  22. package/bundled/qc-helper/docs/features/approval-mode.md +0 -263
  23. package/bundled/qc-helper/docs/features/arena.md +0 -218
  24. package/bundled/qc-helper/docs/features/checkpointing.md +0 -77
  25. package/bundled/qc-helper/docs/features/commands.md +0 -314
  26. package/bundled/qc-helper/docs/features/export.md +0 -51
  27. package/bundled/qc-helper/docs/features/followup-suggestions.md +0 -109
  28. package/bundled/qc-helper/docs/features/headless.md +0 -318
  29. package/bundled/qc-helper/docs/features/hooks.md +0 -356
  30. package/bundled/qc-helper/docs/features/language.md +0 -139
  31. package/bundled/qc-helper/docs/features/lsp.md +0 -453
  32. package/bundled/qc-helper/docs/features/mcp.md +0 -299
  33. package/bundled/qc-helper/docs/features/sandbox.md +0 -241
  34. package/bundled/qc-helper/docs/features/scheduled-tasks.md +0 -139
  35. package/bundled/qc-helper/docs/features/skills.md +0 -289
  36. package/bundled/qc-helper/docs/features/sub-agents.md +0 -307
  37. package/bundled/qc-helper/docs/features/token-caching.md +0 -29
  38. package/bundled/qc-helper/docs/ide-integration/_meta.ts +0 -4
  39. package/bundled/qc-helper/docs/ide-integration/ide-companion-spec.md +0 -182
  40. package/bundled/qc-helper/docs/ide-integration/ide-integration.md +0 -144
  41. package/bundled/qc-helper/docs/integration-github-action.md +0 -241
  42. package/bundled/qc-helper/docs/integration-jetbrains.md +0 -81
  43. package/bundled/qc-helper/docs/integration-vscode.md +0 -39
  44. package/bundled/qc-helper/docs/integration-zed.md +0 -72
  45. package/bundled/qc-helper/docs/overview.md +0 -65
  46. package/bundled/qc-helper/docs/quickstart.md +0 -273
  47. package/bundled/qc-helper/docs/reference/_meta.ts +0 -4
  48. package/bundled/qc-helper/docs/reference/keyboard-shortcuts.md +0 -72
  49. package/bundled/qc-helper/docs/reference/sdk-api.md +0 -524
  50. package/bundled/qc-helper/docs/support/Uninstall.md +0 -42
  51. package/bundled/qc-helper/docs/support/_meta.ts +0 -6
  52. package/bundled/qc-helper/docs/support/tos-privacy.md +0 -112
  53. package/bundled/qc-helper/docs/support/troubleshooting.md +0 -123
package/README.md CHANGED
@@ -262,10 +262,76 @@ A `MEMORY.md` index is auto-generated and loaded into the system prompt at the s
262
262
 
263
263
  After each conversation turn, a background extraction agent reviews recent messages and auto-creates memories for notable facts. This runs fire-and-forget with restricted tools (read/write/glob in the memory directory only).
264
264
 
265
+ ## Agent Harness
266
+
267
+ proto includes a harness system that enforces quality gates, limits scope, and recovers from failures automatically.
268
+
269
+ ### Sprint Contract (Scope Lock)
270
+
271
+ Prevents agents from modifying files outside an agreed scope. Before coding begins, negotiate a contract that defines exactly which files will be created or modified. Once the contract is confirmed, the scope lock is armed and any write outside scope is rejected with a recovery message.
272
+
273
+ **Workflow:**
274
+
275
+ ```bash
276
+ proto
277
+ /sprint-contract
278
+ > Task: Refactor auth module
279
+ > Files: src/auth.ts, src/utils.ts
280
+ > Confirm
281
+ ```
282
+
283
+ **Behavior:**
284
+
285
+ - Write to `src/auth.ts` → ALLOWED
286
+ - Write to `tests/foo.test.ts` → BLOCKED with scope violation message
287
+
288
+ Contracts persist at `.proto/sprint-contract.json` and auto-restore on session resume.
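
The allow/block behavior above can be illustrated with a minimal sketch. It assumes the contract reduces to a plain list of project-relative file paths. The function names (`normalize`, `createScopeLock`, `checkWrite`) and the wording of the rejection message are hypothetical and are not taken from proto's source.

```javascript
// Hypothetical sketch of a scope-lock check; not proto's actual implementation.
// Assumes the contract is a plain list of project-relative file paths.

// Normalize a relative path so "./src/auth.ts" and "src/auth.ts" compare equal.
function normalize(p) {
  return p.replace(/\\/g, '/').replace(/^\.\//, '');
}

// Build a checker from the files named in the contract.
function createScopeLock(allowedFiles) {
  const allowed = new Set(allowedFiles.map(normalize));
  return function checkWrite(path) {
    const target = normalize(path);
    if (allowed.has(target)) {
      return { allowed: true };
    }
    return {
      allowed: false,
      message:
        `Scope violation: ${target} is outside the contract. ` +
        `Permitted files: ${[...allowed].join(', ')}`,
    };
  };
}

const checkWrite = createScopeLock(['src/auth.ts', 'src/utils.ts']);
console.log(checkWrite('./src/auth.ts').allowed); // true
console.log(checkWrite('tests/foo.test.ts').allowed); // false
```

Returning the permitted set in the message mirrors the description of a recovery message: the agent learns both what was rejected and what it may still touch.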
289
+
290
+ ### Behavior Verification Gate
291
+
292
+ Post-run smoke tests that verify changes actually work. After a subagent completes, the gate runs your defined scenarios (shell commands) in parallel. Failures inject a remediation message back to the agent for self-correction.
293
+
294
+ **Setup** — create `.proto/verify-scenarios.json`:
295
+
296
+ ```json
297
+ [
298
+ { "name": "tests pass", "command": "npm test -- --run", "timeoutMs": 60000 },
299
+ { "name": "build works", "command": "npm run build", "timeoutMs": 30000 },
300
+ { "name": "no TypeScript errors", "command": "npm run typecheck" }
301
+ ]
302
+ ```
303
+
304
+ **Behavior:**
305
+
306
+ 1. Agent completes task, reports GOAL
307
+ 2. Gate fires, runs all scenarios in parallel
308
+ 3. If any fail → remediation message injected, agent self-corrects
309
+ 4. Gate fires again until all pass
310
+
311
+ ### Multi-Sample Retry
312
+
313
+ When a subagent fails (ERROR, MAX_TURNS, or TIMEOUT), proto retries up to 2 more times with escalating temperatures (0.7 → 1.0 → 1.3). Each retry gets a `[RETRY CONTEXT]` block summarizing previous failures. The best-scoring result is returned.
314
+
315
+ This reduces false negatives from single-run failures and gives the model multiple chances with different sampling strategies.
316
+
317
+ ### Repo Map
318
+
319
+ PageRank-based file importance ranking. Analyzes the project's TypeScript/JS import graph to surface the most central files. Useful for understanding codebase structure or finding related files.
320
+
321
+ **Usage:**
322
+
323
+ ```bash
324
+ proto -p "Use the repo_map tool to find the most important files in this codebase"
325
+ proto -p "Use repo_map with seedFiles=['src/auth.ts'] to find related files"
326
+ ```
327
+
328
+ Results are cached at `.proto/repo-map-cache.json` and auto-invalidate on file changes.
329
+
265
330
  ## Skills
266
331
 
267
- proto ships with 16 bundled skills for agentic workflows:
332
+ proto ships with 22 bundled skills for agentic workflows:
268
333
 
334
+ - **browser-automation** — Web browser automation
269
335
  - **brainstorming** — Structured ideation
270
336
  - **dispatching-parallel-agents** — Fan-out/fan-in subagent patterns
271
337
  - **executing-plans** — Step-by-step plan execution
@@ -285,6 +351,92 @@ proto ships with 16 bundled skills for agentic workflows:
285
351
 
286
352
  Use `/skills` to list available skills in a session.
287
353
 
354
+ ### Browser Automation
355
+
356
+ proto includes a native browser automation tool powered by [agent-browser](https://github.com/nickinack/agent-browser). This enables AI agents to interact with websites — navigate, click, fill forms, take screenshots, and extract content.
357
+
358
+ #### Installation
359
+
360
+ ```bash
361
+ npm install -g agent-browser
362
+ agent-browser install # Downloads Chrome
363
+ ```
364
+
365
+ #### Usage
366
+
367
+ ```javascript
368
+ // Open a website
369
+ browser({ action: 'open', url: 'https://example.com' });
370
+
371
+ // Get interactive elements
372
+ browser({ action: 'snapshot', flags: JSON.stringify({ interactive: true }) });
373
+
374
+ // Click an element
375
+ browser({ action: 'click', selector: '@e2' });
376
+
377
+ // Fill a form
378
+ browser({ action: 'fill', selector: '@e1', text: 'user@example.com' });
379
+
380
+ // Take screenshot
381
+ browser({ action: 'screenshot', outputPath: '/path/to/screenshot.png' });
382
+ ```
383
+
384
+ #### Key Actions
385
+
386
+ | Action | Description |
387
+ | ------------------------------ | ------------------------------------------ |
388
+ | `open` / `close` | Navigate to URL or close browser |
389
+ | `click` / `dblclick` / `hover` | Element interaction |
390
+ | `fill` / `type` | Form input |
391
+ | `snapshot` | Get accessibility tree with element refs |
392
+ | `screenshot` | Capture page screenshot |
393
+ | `get` / `is` / `find` | Query element properties |
394
+ | `wait` | Wait for elements, network, or URL changes |
395
+ | `batch` | Execute multiple commands in sequence |
396
+
397
+ The browser skill (`/skills` → browser-automation) provides comprehensive documentation for all 38 available actions.
398
+
399
+ ## Agent Teams
400
+
401
+ Run multiple coordinated agents that share tasks and communicate directly with each other.
402
+
403
+ ```
404
+ /team start my-team lead:coordinator scout:Explore coder:general-purpose
405
+ ```
406
+
407
+ This spawns three live agents immediately. Each member runs as an in-process agent and gets two extra tools injected automatically:
408
+
409
+ - **`mailbox_send`** — send a message to a teammate by their agentId
410
+ - **`mailbox_receive`** — drain all unread messages from your inbox
411
+
412
+ Members share the same task list (`task_create`, `task_list`, `task_update`) so any agent can create tasks and others can claim them.
413
+
414
+ ### Team commands
415
+
416
+ | Command | Description |
417
+ | -------------------------------------- | ------------------------------------- |
418
+ | `/team start <name> [member:type ...]` | Spawn live agents and start the team |
419
+ | `/team status <name>` | Show live member status |
420
+ | `/team stop <name>` | Kill all agents and release resources |
421
+ | `/team list` | List all teams in the project |
422
+ | `/team delete <name>` | Delete a team config |
423
+
424
+ **Default team** (no members specified): `lead` (coordinator) + `scout` (Explore).
425
+
426
+ Agent IDs follow the pattern `<name>-<index>` (e.g. `lead-0`, `scout-1`). Use these when sending mailbox messages between agents.
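
To make the mailbox semantics concrete, here is a minimal in-memory sketch. It assumes the index in an agent ID is the member's position in the `/team start` argument list, and that receiving drains the inbox. `createMailbox` and `agentIds` are hypothetical names for illustration, not proto's API.

```javascript
// Hypothetical in-memory mailbox sketch; not proto's actual implementation.
function createMailbox() {
  const inboxes = new Map(); // agentId -> array of unread messages
  return {
    // Queue a message for a teammate, addressed by agent ID.
    send(from, to, body) {
      if (!inboxes.has(to)) inboxes.set(to, []);
      inboxes.get(to).push({ from, body });
    },
    // Drain: return all unread messages and leave the inbox empty.
    receive(agentId) {
      const messages = inboxes.get(agentId) || [];
      inboxes.set(agentId, []);
      return messages;
    },
  };
}

// Build agent IDs following the <name>-<index> pattern.
// Assumption: the index is the member's position in the member list.
function agentIds(memberNames) {
  return memberNames.map((name, index) => `${name}-${index}`);
}

const ids = agentIds(['lead', 'scout', 'coder']);
const mailbox = createMailbox();
mailbox.send(ids[0], ids[1], 'Map the auth module');
console.log(mailbox.receive('scout-1').length); // 1
console.log(mailbox.receive('scout-1').length); // 0 (already drained)
```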
427
+
428
+ ### Member types
429
+
430
+ | Type | Purpose |
431
+ | ----------------- | ----------------------------------------- |
432
+ | `coordinator` | Orchestrate subtasks across other members |
433
+ | `Explore` | Fast codebase search and analysis |
434
+ | `general-purpose` | Multi-step implementation tasks |
435
+ | `verify` | Review and correctness checking |
436
+ | `plan` | Design plans before implementation |
437
+
438
+ Any user-defined sub-agent from `.proto/agents/` can also be used as a member type.
439
+
288
440
  ## Commands
289
441
 
290
442
  | Command | Description |
package/bundled/browser-automation/SKILL.md ADDED
@@ -0,0 +1,358 @@
1
+ ---
2
+ name: browser-automation
3
+ description: Use browser automation to interact with web pages - navigate, click, fill forms, take screenshots, and extract content from websites. Perfect for testing, data collection, and web interactions.
4
+ ---
5
+
6
+ # Browser Automation
7
+
8
+ Use the native `browser` tool to automate web browser interactions. This skill enables AI agents to navigate websites, interact with elements, fill forms, and extract information.
9
+
10
+ ## Prerequisites
11
+
12
+ **Before using this skill, verify:**
13
+
14
+ 1. **agent-browser is installed:**
15
+
16
+ ```bash
17
+ agent-browser --version
18
+ ```
19
+
20
+ If not installed:
21
+
22
+ ```bash
23
+ npm install -g agent-browser
24
+ agent-browser install # Downloads Chrome
25
+ ```
26
+
27
+ 2. **Chrome is installed:**
28
+ If Chrome is missing, run `agent-browser install` to download it.
29
+
30
+ ## Core Workflow
31
+
32
+ ### Step 1: Open a Website
33
+
34
+ ```javascript
35
+ browser({
36
+ action: 'open',
37
+ url: 'https://example.com',
38
+ headed: false, // true to see browser window
39
+ });
40
+ ```
41
+
42
+ ### Step 2: Get Interactive Elements
43
+
44
+ Use `snapshot` to get an accessibility tree with numbered element references:
45
+
46
+ ```javascript
47
+ browser({
48
+ action: 'snapshot',
49
+ flags: JSON.stringify({ interactive: true }),
50
+ });
51
+ ```
52
+
53
+ **Output shows elements with refs like:**
54
+
55
+ ```text
56
+ [e1] Button: "Submit"
57
+ [e2] Textbox: "Email"
58
+ [e3] Textbox: "Password"
59
+ ```
60
+
61
+ ### Step 3: Interact with Elements
62
+
63
+ **Click an element:**
64
+
65
+ ```javascript
66
+ browser({
67
+ action: 'click',
68
+ selector: '@e2', // or CSS selector like "#submit-btn"
69
+ });
70
+ ```
71
+
72
+ **Fill a form field:**
73
+
74
+ ```javascript
75
+ browser({
76
+ action: 'fill',
77
+ selector: '@e2',
78
+ text: 'user@example.com',
79
+ });
80
+ ```
81
+
82
+ **Type with keystroke simulation:**
83
+
84
+ ```javascript
85
+ browser({
86
+ action: 'type',
87
+ selector: '@e3',
88
+ text: 'password123',
89
+ });
90
+ ```
91
+
92
+ ### Step 4: Take Screenshots
93
+
94
+ **Page screenshot:**
95
+
96
+ ```javascript
97
+ browser({
98
+ action: 'screenshot',
99
+ outputPath: '/path/to/screenshot.png',
100
+ });
101
+ ```
102
+
103
+ **Full-page screenshot:**
104
+
105
+ ```javascript
106
+ browser({
107
+ action: 'screenshot',
108
+ outputPath: '/path/to/full.png',
109
+ flags: JSON.stringify({ full: true }),
110
+ });
111
+ ```
112
+
113
+ ## Selector Types
114
+
115
+ | Type | Example | Use Case |
116
+ | ------------ | ------------------------------- | -------------------- |
117
+ | Element ref | `@e1`, `@e2` | From snapshot output |
118
+ | CSS selector | `#id`, `.class`, `div > button` | Standard web dev |
119
+ | Semantic | `role:button`, `text:"Sign In"` | AI-friendly |
120
+
121
+ ## Common Actions Reference
122
+
123
+ ### Navigation
124
+
125
+ ```javascript
126
+ // Open URL
127
+ browser({ action: 'open', url: 'https://example.com' });
128
+
129
+ // Close browser
130
+ browser({ action: 'close' });
131
+
132
+ // New tab
133
+ browser({ action: 'tab', text: 'new', url: 'https://example.com' });
134
+ ```
135
+
136
+ ### Element Interaction
137
+
138
+ ```javascript
139
+ // Click
140
+ browser({ action: 'click', selector: '@e1' });
141
+
142
+ // Double-click
143
+ browser({ action: 'dblclick', selector: '@e1' });
144
+
145
+ // Hover
146
+ browser({ action: 'hover', selector: '@e1' });
147
+
148
+ // Fill (clears and types)
149
+ browser({ action: 'fill', selector: '@e2', text: 'value' });
150
+
151
+ // Type (appends)
152
+ browser({ action: 'type', selector: '@e2', text: 'value' });
153
+
154
+ // Press key
155
+ browser({ action: 'press', key: 'Enter' });
156
+ browser({ action: 'press', key: 'Tab' });
157
+ browser({ action: 'press', key: 'Escape' });
158
+ ```
159
+
160
+ ### Page Information
161
+
162
+ ```javascript
163
+ // Get page title
164
+ browser({ action: 'get', text: 'title' });
165
+
166
+ // Get current URL
167
+ browser({ action: 'get', text: 'url' });
168
+
169
+ // Get element text
170
+ browser({ action: 'get', selector: '@e1', text: 'text' });
171
+
172
+ // Get element attribute
173
+ browser({ action: 'get', selector: '@e1', text: 'attr', attribute: 'href' });
174
+
175
+ // Check if visible
176
+ browser({ action: 'is', text: 'visible', selector: '@e1' });
177
+ ```
178
+
179
+ ### Find Elements
180
+
181
+ ```javascript
182
+ // Find by role and click
183
+ browser({ action: 'find', selector: 'role button', text: 'click' });
184
+
185
+ // Find by text
186
+ browser({ action: 'find', selector: 'text "Sign In"', text: 'click' });
187
+
188
+ // Find by label
189
+ browser({ action: 'find', selector: 'label "Email"', text: 'fill' });
190
+ ```
191
+
192
+ ### Waiting
193
+
194
+ ```javascript
195
+ // Wait for URL pattern
196
+ browser({ action: 'wait', text: '**/dashboard' });
197
+
198
+ // Wait for text
199
+ browser({ action: 'wait', text: 'Welcome' });
200
+
201
+ // Wait for network idle
202
+ browser({ action: 'wait', text: 'networkidle' });
203
+
204
+ // Wait for selector
205
+ browser({ action: 'wait', selector: '@e1' });
206
+
207
+ // Waits time out after 25s by default
208
+ browser({ action: 'wait', selector: '@e1' });
209
+ ```
210
+
211
+ ### Batch Operations
212
+
213
+ Execute multiple commands in sequence:
214
+
215
+ ```javascript
216
+ browser({
217
+ action: 'batch',
218
+ commands: [
219
+ 'open https://example.com',
220
+ 'wait --load networkidle',
221
+ 'fill @e1 user@example.com',
222
+ 'fill @e2 password',
223
+ 'click @e3',
224
+ 'wait --url **/dashboard',
225
+ ],
226
+ });
227
+ ```
228
+
229
+ ## Sessions
230
+
231
+ Sessions persist authentication and state between commands:
232
+
233
+ ```javascript
234
+ // Create named session
235
+ browser({
236
+ action: 'open',
237
+ url: 'https://app.example.com',
238
+ session: 'myapp-auth',
239
+ });
240
+
241
+ // Continue in same session
242
+ browser({
243
+ action: 'click',
244
+ selector: '@e1',
245
+ session: 'myapp-auth',
246
+ });
247
+ ```
248
+
249
+ ## Advanced Features
250
+
251
+ ### Headed Mode (see browser window)
252
+
253
+ ```javascript
254
+ browser({
255
+ action: 'open',
256
+ url: 'https://example.com',
257
+ headed: true,
258
+ });
259
+ ```
260
+
261
+ ### Network Interception
262
+
263
+ ```javascript
264
+ // Block a URL
265
+ browser({
266
+ action: 'network',
267
+ text: 'route',
268
+ selector: 'https://ads.example.com',
269
+ attribute: 'abort', // or mock response
270
+ });
271
+
272
+ // View requests
273
+ browser({ action: 'network', text: 'requests' });
274
+ ```
275
+
276
+ ### Cookies & Storage
277
+
278
+ ```javascript
279
+ // Get cookies
280
+ browser({ action: 'cookies' });
281
+
282
+ // Get localStorage
283
+ browser({ action: 'storage', text: 'local' });
284
+
285
+ // Set localStorage
286
+ browser({
287
+ action: 'storage',
288
+ text: 'local',
289
+ selector: 'set myKey myValue',
290
+ });
291
+ ```
292
+
293
+ ### Clipboard
294
+
295
+ ```javascript
296
+ // Read clipboard
297
+ browser({ action: 'clipboard' });
298
+
299
+ // Write to clipboard
300
+ browser({ action: 'clipboard', text: 'write Hello World' });
301
+ ```
302
+
303
+ ## Best Practices
304
+
305
+ 1. **Always start with `snapshot`**: Get the accessibility tree to understand page structure
306
+
307
+ 2. **Use element refs from snapshot**: They're more reliable than CSS selectors for AI automation
308
+
309
+ 3. **Wait for network idle**: After form submissions or page loads:
310
+
311
+ ```javascript
312
+ browser({ action: 'wait', text: 'networkidle' });
313
+ ```
314
+
315
+ 4. **Use sessions for multi-step flows**: Maintains login state and cookies:
316
+
317
+ ```javascript
318
+ browser({ action: 'open', url: 'https://app.com', session: 'app' });
319
+ ```
320
+
321
+ 5. **Take screenshots for debugging**: When automation fails, a screenshot helps diagnose the problem:
322
+
323
+ ```javascript
324
+ browser({ action: 'screenshot', outputPath: '/tmp/debug.png' });
325
+ ```
326
+
327
+ 6. **Close browser when done**: Free resources:
328
+ ```javascript
329
+ browser({ action: 'close' });
330
+ ```
331
+
332
+ ## Error Handling
333
+
334
+ If agent-browser is not installed, you'll get:
335
+
336
+ ```text
337
+ agent-browser is not installed. Please install it with: npm install -g agent-browser
338
+ ```
339
+
340
+ **Installation troubleshooting:**
341
+
342
+ ```bash
343
+ # Install Chrome along with its system dependencies
344
+ agent-browser install --with-deps
345
+
346
+ # Check installation
347
+ which agent-browser
348
+ agent-browser --version
349
+ ```
350
+
351
+ ## Use Cases
352
+
353
+ - **Web testing**: Automate UI testing workflows
354
+ - **Form filling**: Submit forms programmatically
355
+ - **Data extraction**: Scrape content from websites
356
+ - **Login flows**: Handle authentication
357
+ - **Screenshot capture**: Document web pages
358
+ - **Accessibility testing**: Verify page structure
package/bundled/harness-reference/SKILL.md ADDED
@@ -0,0 +1,146 @@
1
+ ---
2
+ name: harness-reference
3
+ description: Reference guide for all agent harness safety features — doom loop detection, scope lock, git checkpoints, observation masking, sprint contract, reminders, repo map, behavior verification, and multi-sample retry
4
+ ---
5
+
6
+ # Agent Harness Reference
7
+
8
+ The proto harness is a set of safety and reliability features that wrap every agent execution. They fire automatically — you don't need to invoke them manually. This skill documents each feature so you can understand what's protecting you and how to configure it.
9
+
10
+ ## Features
11
+
12
+ ### Doom Loop Detection
13
+
14
+ **What it does:** Detects when the agent is repeating the same tool call pattern in a sliding 20-call window. If the same fingerprint (tool + args hash) appears 3+ times, the harness injects a recovery message and records a Langfuse span.
15
+
16
+ **You don't need to do anything.** The harness detects this automatically.
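
The detection rule can be sketched roughly as follows. The window size (20) and threshold (3) come from the description above. Everything else is an assumption: the fingerprint here is a plain string rather than a hash, `createDoomLoopDetector` is a hypothetical name, and `read_file` is only a placeholder tool name.

```javascript
// Hypothetical sketch of sliding-window loop detection; not proto's code.
const WINDOW_SIZE = 20;
const REPEAT_THRESHOLD = 3;

// Fingerprint a call by tool name plus serialized arguments.
// A real implementation would likely hash this string; this sketch also
// assumes the arguments serialize with a stable key order.
function fingerprint(toolName, args) {
  return `${toolName}:${JSON.stringify(args)}`;
}

function createDoomLoopDetector() {
  const recent = []; // fingerprints of the most recent calls, oldest first
  return function record(toolName, args) {
    const fp = fingerprint(toolName, args);
    recent.push(fp);
    if (recent.length > WINDOW_SIZE) recent.shift(); // keep only the last 20
    const count = recent.filter((x) => x === fp).length;
    return count >= REPEAT_THRESHOLD; // true means "inject a recovery message"
  };
}

const record = createDoomLoopDetector();
console.log(record('read_file', { path: 'a.ts' })); // false
console.log(record('read_file', { path: 'a.ts' })); // false
console.log(record('read_file', { path: 'a.ts' })); // true
```

Because the window slides, repeats that are spread far apart do not trip the detector; only repetition within the last 20 calls counts.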
17
+
18
+ ### Scope Lock (Sprint Contract)
19
+
20
+ **What it does:** Before coding, the `sprint-contract` skill negotiates an explicit contract — the set of files that may change. Once activated, any write outside that set is blocked with a structured error.
21
+
22
+ **To activate:** Use the `sprint-contract` skill at the start of an implementation task. It writes `.proto/sprint-contract.json` and arms the in-memory scope lock. The lock is restored on session restart.
23
+
24
+ **To check status:** If a write is blocked, the error message tells you the violating path and the permitted set.
25
+
26
+ ### Git Checkpoints
27
+
28
+ **What it does:** Before every file-mutating tool call (`write_file`, `edit`, `replace`), the harness creates a shadow-repo commit. This lets you diff or roll back to any pre-edit state.
29
+
30
+ **To roll back:** Use `git log` to find the checkpoint commit and `git checkout <hash> -- <file>` to restore.
31
+
32
+ ### Observation Masking
33
+
34
+ **What it does:** When the context window gets large, the harness applies a rolling verbatim window — tool-call/result pairs older than the window are summarized as `[OBSERVATION_MASK: N pairs omitted]`. This keeps recent context intact while reducing token usage.
35
+
36
+ **You don't need to do anything.** Fires automatically during LLM compaction.
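
A rolling verbatim window of this kind might look like the sketch below. The placeholder text follows the format quoted above. The function name `maskObservations`, the shape of a pair, and the choice to collapse all older pairs into a single placeholder entry are assumptions for illustration.

```javascript
// Hypothetical sketch of a rolling verbatim window; not proto's code.
// `pairs` is an array of tool-call/result pairs, oldest first.
// The most recent `windowSize` pairs are kept verbatim; older ones are
// collapsed into one placeholder entry.
function maskObservations(pairs, windowSize) {
  if (pairs.length <= windowSize) return pairs.slice();
  const omitted = pairs.length - windowSize;
  const placeholder = {
    masked: true,
    text: `[OBSERVATION_MASK: ${omitted} pairs omitted]`,
  };
  return [placeholder, ...pairs.slice(omitted)];
}

const history = [1, 2, 3, 4, 5].map((n) => ({ call: `call-${n}`, result: `result-${n}` }));
const masked = maskObservations(history, 2);
console.log(masked[0].text); // [OBSERVATION_MASK: 3 pairs omitted]
console.log(masked.length); // 3
```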
37
+
38
+ ### Harness Reminders
39
+
40
+ **What it does:** The harness injects periodic reminders into context based on three triggers:
41
+
42
+ - Every 50 tool calls: warns about high tool usage
43
+ - After 3 consecutive test failures: suggests pausing to diagnose
44
+ - After 8 turns without any file write: suggests the agent may be over-analyzing
45
+
46
+ **You don't need to do anything.** The harness injects these automatically.
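
As a rough illustration of the three triggers, the sketch below checks each threshold against a counter. The thresholds (50, 3, 8) come from the list above. The state field names, the reminder wording, and the choice to keep firing once a threshold has been passed are assumptions.

```javascript
// Hypothetical sketch of the three reminder triggers; not proto's code.
function collectReminders(state) {
  const reminders = [];
  // Fires on every 50th tool call (50, 100, 150, ...).
  if (state.toolCalls > 0 && state.toolCalls % 50 === 0) {
    reminders.push(`High tool usage: ${state.toolCalls} tool calls so far.`);
  }
  // Fires once three or more test runs in a row have failed.
  if (state.consecutiveTestFailures >= 3) {
    reminders.push('Several consecutive test failures: pause and diagnose.');
  }
  // Fires once eight or more turns have passed without a file write.
  if (state.turnsWithoutWrite >= 8) {
    reminders.push('No file writes for a while: you may be over-analyzing.');
  }
  return reminders;
}

console.log(collectReminders({ toolCalls: 50, consecutiveTestFailures: 0, turnsWithoutWrite: 0 }).length); // 1
```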
47
+
48
+ ### Repo Map (`repo_map` tool)
49
+
50
+ **What it does:** Analyzes the import graph of the codebase and runs PageRank to surface the most-connected (and most-relevant) files. Call it at the start of any exploration or implementation task for fast orientation.
51
+
52
+ **To use:**
53
+
54
+ ```
55
+ repo_map {} # globally most-connected files
56
+ repo_map { seedFiles: ["/abs/path"] } # personalized from known-relevant files
57
+ ```
58
+
59
+ Results are cached at `.proto/repo-map-cache.json` and invalidated on file changes.
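
For readers unfamiliar with PageRank over an import graph, here is a minimal sketch of the idea, including the personalized variant that `seedFiles` suggests. It is a textbook power iteration, not proto's implementation. The function name, the `imports` input shape, the damping factor (0.85), and the iteration count are all assumptions. Edges point from the importing file to the imported file, so heavily imported files rank highest.

```javascript
// Hypothetical sketch of (personalized) PageRank over an import graph.
// `imports` maps each file to the list of files it imports. Not proto's code.
function pageRank(imports, { seedFiles = [], damping = 0.85, iterations = 50 } = {}) {
  const files = new Set(Object.keys(imports));
  for (const targets of Object.values(imports)) targets.forEach((t) => files.add(t));
  const nodes = [...files];
  const n = nodes.length;

  // Teleport distribution: uniform by default, concentrated on seeds if given.
  const seeds = seedFiles.filter((f) => files.has(f));
  const teleport = {};
  for (const f of nodes) {
    teleport[f] = seeds.length > 0 ? (seeds.includes(f) ? 1 / seeds.length : 0) : 1 / n;
  }

  let rank = { ...teleport };
  for (let i = 0; i < iterations; i++) {
    const next = {};
    for (const f of nodes) next[f] = (1 - damping) * teleport[f];
    for (const f of nodes) {
      const targets = imports[f] || [];
      if (targets.length === 0) {
        // Dangling file (imports nothing): spread its rank via the teleport distribution.
        for (const g of nodes) next[g] += damping * rank[f] * teleport[g];
      } else {
        for (const t of targets) next[t] += (damping * rank[f]) / targets.length;
      }
    }
    rank = next;
  }
  return nodes
    .sort((a, b) => rank[b] - rank[a])
    .map((file) => ({ file, score: rank[file] }));
}

const graph = {
  'src/app.ts': ['src/auth.ts', 'src/utils.ts'],
  'src/routes.ts': ['src/auth.ts', 'src/utils.ts'],
  'src/auth.ts': ['src/utils.ts'],
  'src/utils.ts': [],
};
console.log(pageRank(graph)[0].file); // src/utils.ts
```

With `seedFiles` set, rank only flows outward from the seeds, so files that the seeds cannot reach through imports score zero. That is one plausible reading of "personalized from known-relevant files".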
60
+
61
+ ### Behavior Verification Gate
62
+
63
+ **What it does:** After every subagent task that completes successfully, the harness runs user-configured "verification scenarios" — shell commands that check your feature actually works. Failures are injected back to the agent for self-correction.
64
+
65
+ **To configure:** Create `.proto/verify-scenarios.json`:
66
+
67
+ ```json
68
+ [
69
+ {
70
+ "name": "Unit tests pass",
71
+ "command": "npm test -- --run",
72
+ "timeoutMs": 60000
73
+ },
74
+ {
75
+ "name": "Build succeeds",
76
+ "command": "npm run build",
77
+ "timeoutMs": 30000
78
+ },
79
+ {
80
+ "name": "API health check",
81
+ "command": "curl -sf http://localhost:3000/health",
82
+ "expectedPattern": "ok",
83
+ "timeoutMs": 5000
84
+ }
85
+ ]
86
+ ```
87
+
88
+ See `.proto/verify-scenarios.example.json` for a full reference.
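
The pass/fail decision for a scenario, and the remediation message built from failures, can be sketched as below. This covers only the evaluation step, not running the commands. It assumes each raw result carries the scenario plus `exitCode`, `stdout`, and a `timedOut` flag, and that `expectedPattern` is a regular expression matched against stdout. Those semantics, and both function names, are assumptions rather than documented behavior.

```javascript
// Hypothetical sketch of scenario evaluation; not proto's code.
function evaluateScenario({ scenario, exitCode, stdout, timedOut }) {
  const name = scenario.name;
  if (timedOut) {
    return { name, passed: false, reason: `timed out after ${scenario.timeoutMs}ms` };
  }
  if (exitCode !== 0) {
    return { name, passed: false, reason: `exited with code ${exitCode}` };
  }
  // Assumption: expectedPattern is a regex tested against stdout.
  if (scenario.expectedPattern && !new RegExp(scenario.expectedPattern).test(stdout)) {
    return { name, passed: false, reason: `output did not match /${scenario.expectedPattern}/` };
  }
  return { name, passed: true };
}

// Build the message injected back to the agent, or null if everything passed.
function buildRemediation(rawResults) {
  const failures = rawResults.map(evaluateScenario).filter((r) => !r.passed);
  if (failures.length === 0) return null;
  return ['Verification failed:', ...failures.map((f) => `- ${f.name}: ${f.reason}`)].join('\n');
}

const message = buildRemediation([
  { scenario: { name: 'Build succeeds', timeoutMs: 30000 }, exitCode: 0, stdout: '', timedOut: false },
  { scenario: { name: 'API health check', expectedPattern: 'ok', timeoutMs: 5000 }, exitCode: 0, stdout: 'down', timedOut: false },
]);
console.log(message);
```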
89
+
90
+ ### Multi-Sample Retry (`multi_sample: true`)
91
+
92
+ **What it does:** When a subagent fails (doom loop, error, or max turns exceeded), the harness automatically retries up to 2 more times with escalating temperatures (0.7 → 1.0 → 1.3) and injects the failure context into each retry prompt. Each attempt is scored, and the best-scoring result is returned.
93
+
94
+ **Scoring:**
95
+
96
+ - GOAL + behavior gate pass → 3 (perfect)
97
+ - GOAL + no gate configured → 3
98
+ - GOAL + gate fail → 2 (completed but not verified)
99
+ - MAX_TURNS / TIMEOUT → 1 (partial)
100
+ - ERROR → 0 (failure)
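
The scoring table above translates into a small function. The sketch below also shows one way the best attempt could be selected. The attempt shape (`status`, `gate`) and the tie-breaking rule (earliest attempt wins) are assumptions, and the function names are hypothetical.

```javascript
// Hypothetical sketch of attempt scoring and best-of-N selection; not proto's code.
// `status` is one of 'GOAL', 'MAX_TURNS', 'TIMEOUT', 'ERROR'.
// `gate` is 'pass', 'fail', or undefined when no behavior gate is configured.
function scoreAttempt({ status, gate }) {
  if (status === 'GOAL') return gate === 'fail' ? 2 : 3;
  if (status === 'MAX_TURNS' || status === 'TIMEOUT') return 1;
  return 0; // ERROR
}

// Return the highest-scoring attempt. Assumes at least one attempt;
// ties go to the earliest attempt (an assumption).
function pickBest(attempts) {
  return attempts.reduce((best, a) => (scoreAttempt(a) > scoreAttempt(best) ? a : best));
}

const best = pickBest([
  { id: 1, status: 'ERROR' },
  { id: 2, status: 'GOAL', gate: 'fail' },
  { id: 3, status: 'MAX_TURNS' },
]);
console.log(best.id); // 2
```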
101
+
102
+ **To enable:** Set `multi_sample: true` on the Agent tool call:
103
+
104
+ ```
105
+ Agent {
106
+ subagent_type: "general-purpose",
107
+ prompt: "implement the auth service",
108
+ multi_sample: true
109
+ }
110
+ ```
111
+
112
+ Use for complex tasks with a history of failure, not for simple searches.
113
+
114
+ ### Sprint Contract Service
115
+
116
+ **What it does:** Manages the full sprint contract lifecycle — parse, activate scope lock, persist to disk, load on resume. See the `sprint-contract` skill for usage.
117
+
118
+ **Files involved:**
119
+
120
+ - `.proto/sprint-contract.json` — persisted contract (restored on session start)
121
+ - `SprintContractService` — programmatic API
122
+
123
+ ## Langfuse Fine-Tuning Data
124
+
125
+ All harness interventions emit OTel spans routed to Langfuse via OTLP → Tempo. To build fine-tuning datasets:
126
+
127
+ 1. In Langfuse > Traces, filter by span name = `harness.intervention`
128
+ 2. Use `harness.intervention.type` attribute to segment by type:
129
+ - `doom_loop` — recovery from loops
130
+ - `scope_violation` — scope lock enforcement
131
+ - `verification_failed` — post-edit and behavior gate failures
132
+ - `reminder.*` — context reminders
133
+ 3. Export matching traces → dataset items
134
+ 4. Annotate `harness.outcome` = `"recovered"` | `"not_recovered"`
135
+ 5. Train on (input_context, intervention_message) pairs where outcome = recovered
136
+
137
+ ## Configuration Summary
138
+
139
+ | Feature | Config location | Default |
140
+ | ----------------------- | ------------------------------------------------ | ------------------------------------ |
141
+ | Doom loop threshold | Code constant (`DOOM_REPEAT_THRESHOLD = 3`) | Always on |
142
+ | Scope lock | `.proto/sprint-contract.json` | Off until sprint-contract skill runs |
143
+ | Behavior gate scenarios | `.proto/verify-scenarios.json` | No scenarios (off) |
144
+ | Multi-sample retry | `multi_sample: true` on Agent call | Off (opt-in) |
145
+ | Observation mask window | Code constant (`INCREMENTAL_PROTECTED_TAIL`) | Always on |
146
+ | Harness reminders | Code constants (50 calls / 3 failures / 8 turns) | Always on |
package/bundled/subagent-driven-development/SKILL.md CHANGED
@@ -120,6 +120,16 @@ Implementer subagents report one of four statuses. Handle each appropriately:
120
120
 
121
121
  **Never** ignore an escalation or force the same model to retry without changes. If the implementer said it's stuck, something needs to change.
122
122
 
123
+ ## Harness Features
124
+
125
+ The harness provides automatic safety nets you can leverage when dispatching implementers:
126
+
127
+ **Multi-sample retry** (`multi_sample: true`): For complex or high-risk tasks, set this on the Agent tool call. If the implementer fails (doom loop, error, max turns), the harness automatically retries up to 2 more times with escalating temperatures (0.7 → 1.0 → 1.3) and injects the failure context into each retry prompt. Returns the best result. Use for tasks that have previously failed or that touch many files.
128
+
129
+ **Behavior verification gate**: If `.proto/verify-scenarios.json` exists in the project, the harness runs those scenarios after every successful implementer completion. Failures are injected back to the model for self-correction. Add scenarios for smoke tests, build checks, and HTTP health checks.
130
+
131
+ **Sprint contract scope lock**: If the implementing agent was given a sprint contract (via the `sprint-contract` skill), the scope lock prevents it from writing files outside the agreed set. Any violation is blocked and reported.
132
+
123
133
  ## Prompt Templates
124
134
 
125
135
  - `./implementer-prompt.md` - Dispatch implementer subagent