@sebastianandreasson/pi-autonomous-agents 0.9.1 → 0.11.0

package/README.md CHANGED
@@ -89,6 +89,7 @@ pi-harness report
  pi-harness clear-history
  pi-harness visual-once
  pi-harness visualize
+ pi-harness debug-live
  pi-harness visual-review-worker
  ```
 
@@ -319,4 +320,35 @@ npm run check
  npm test
  ```
 
+ For local visualizer iteration against fake live SDK agent:
+
+ ```bash
+ npm run debug:live-ui
+ ```
+
+ Scenario variants:
+
+ ```bash
+ node src/cli.mjs debug-live --reset --scenario noisy --task-count 24
+ node src/cli.mjs debug-live --reset --scenario retry
+ ```
+
+ For React/Vite visualizer UI dev loop:
+
+ ```bash
+ npm run dev:visualizer:ui
+ ```
+
+ For production visualizer UI build:
+
+ ```bash
+ npm run build:visualizer:ui
+ ```
+
+ Publish now auto-runs check, tests, and UI build via `prepublishOnly`.
+
+ This seeds `.pi-debug/live-ui/`, runs harness there with streaming fake SDK fixture, hosts visualizer, and gives stable local repro loop for UI work. React app lives in `visualizer-ui/`. Visualizer server now serves built assets from `visualizer-ui/dist/` and falls back to build-instructions page if build artifacts are missing.
+
+ See `docs/VISUALIZER_UI_PLAN.md` for migration plan.
+
  The package requires Node `>=20`.
@@ -35,8 +35,10 @@ Main package files:
  - `src/pi-prompts.mjs`: default prompt builders
  - `src/pi-visual-review.mjs`: multimodal visual-review worker
  - `src/pi-visual-once.mjs`: one-shot manual visual review runner
- - `src/pi-visualizer.mjs`: local web UI for orchestration flow and active stage
+ - `src/pi-visualizer.mjs`: local web UI entrypoint
+ - `src/pi-visualizer-server.mjs`: shared visualizer server/runtime
  - `src/pi-visualizer-shared.mjs`: flow-state helpers for visualizer
+ - `src/pi-debug-live.mjs`: local fake-live sandbox runner for visualizer debugging
  - `src/pi-report.mjs`: telemetry summary report
  - `templates/DEVELOPER.md`: default developer-role instructions template
  - `templates/TESTER.md`: default tester-role instructions template
@@ -49,6 +51,7 @@ pi-harness run
  pi-harness report
  pi-harness visual-once
  pi-harness visualize
+ pi-harness debug-live
  ```
 
  The package reads `PI_CONFIG_FILE` if provided. Otherwise it falls back to the bundled generic `pi.config.json`.
@@ -59,6 +62,8 @@ The package reads `PI_CONFIG_FILE` if provided. Otherwise it falls back to the b
 
  Visualizer reads active-run lock, TODO file, per-run state, per-run iteration summary, per-run last output snapshot, live feed JSONL, and telemetry to show current stage plus historical runs.
 
+ For local UI iteration in this package repo, use `pi-harness debug-live` to run against seeded fake live SDK sandbox. Useful variants: `--scenario noisy`, `--scenario retry`, `--task-count 24`.
+
  ## Config Contract
 
  Projects typically provide their own `pi.config.json` with fields such as:
package/docs/VISUALIZER_UI_PLAN.md ADDED
@@ -0,0 +1,117 @@
+ # Visualizer UI migration plan
+
+ ## Goal
+
+ Replace inline browser JS/HTML in `src/pi-visualizer-server.mjs` with maintainable React frontend, while keeping Node visualizer server as API + SSE + static asset host.
+
+ ## Chosen stack
+
+ - React
+ - Vite
+ - TypeScript
+ - Zustand
+ - Plain CSS
+
+ ## Repo layout
+
+ ```text
+ src/
+   pi-visualizer-server.mjs   # API, SSE, static asset host
+ visualizer-ui/
+   package.json
+   tsconfig.json
+   vite.config.ts
+   index.html
+   src/
+     main.tsx
+     App.tsx
+     api.ts
+     store.ts
+     types.ts
+     styles.css
+     components/
+       TodoList.tsx
+       FlowStrip.tsx
+       LiveFeed.tsx
+       CurrentEdits.tsx
+       DiagnosticsPanel.tsx
+ ```
+
+ ## State model
+
+ Zustand store owns:
+
+ - latest snapshot
+ - selected run
+ - selected todo
+ - selected event
+ - feed toggles
+ - SSE lifecycle
+ - initial load status / error
+
+ ## API contract
+
+ Current server routes:
+
+ - `GET /api/state` → full snapshot
+ - `GET /api/stream` → SSE full snapshots
+
+ Short term: React app consumes current full-snapshot SSE.
+
+ Next improvement:
+
+ - add monotonic `seq` to live feed entries
+ - optionally move SSE from full snapshots to patch events
+ - add snapshot version to ignore stale payloads
+
+ ## Migration phases
+
+ ### Phase 1
+ - scaffold `visualizer-ui/`
+ - keep current inline HTML as fallback
+ - add built asset serving from `visualizer-ui/dist`
+ - add `/api/state` alias
+
+ ### Phase 2
+ - port current layout into React components
+ - use Zustand store + initial snapshot fetch
+ - use SSE reconnect from frontend
+
+ ### Phase 3
+ - move feed/timeline ordering to stable `seq`
+ - reduce full rerenders
+ - preserve scroll behavior inside components
+
+ ### Phase 4
+ - remove inline browser app from server once built UI covers current features
+ - keep server only as API/static host
+
+ ## Dev workflow
+
+ Backend + fake live harness:
+
+ ```bash
+ npm run debug:live-ui
+ ```
+
+ Frontend dev server:
+
+ ```bash
+ npm run dev:visualizer:ui
+ ```
+
+ Frontend build:
+
+ ```bash
+ npm run build:visualizer:ui
+ ```
+
+ ## Notes
+
+ Current state:
+
+ - React/Vite/Zustand UI scaffold added
+ - built assets generated under `visualizer-ui/dist/`
+ - server serves built UI directly
+ - legacy inline browser app removed
+ - fallback page only shows build instructions when dist missing
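The "Next improvement" list in the plan can be read as two small client-side rules: drop snapshots older than the one already held, and order feed entries by `seq`. Below is a minimal sketch under those assumptions. The `version` field and both function names are hypothetical; only `seq` corresponds to something this release actually writes (see `sanitizeLiveFeedEvent` in `src/pi-client.mjs`).

```javascript
// Hypothetical sketch: `version` on snapshots is the field the plan proposes,
// not something the server is known to send today.
function applySnapshot(current, incoming) {
  if (current && incoming.version <= current.version) {
    // Stale or duplicate payload: keep the snapshot we already hold.
    return current
  }
  return incoming
}

// Merge feed entries keyed by `seq`, letting newer copies replace older ones,
// and return them in ascending `seq` order.
function mergeFeed(existing, incomingEntries) {
  const bySeq = new Map(existing.map((entry) => [entry.seq, entry]))
  for (const entry of incomingEntries) {
    bySeq.set(entry.seq, entry)
  }
  return [...bySeq.values()].sort((a, b) => a.seq - b.seq)
}

const held = applySnapshot({ version: 3 }, { version: 2 })
const merged = mergeFeed(
  [{ seq: 1, text: 'a' }, { seq: 3, text: 'old' }],
  [{ seq: 2, text: 'b' }, { seq: 3, text: 'new' }],
)
console.log(held.version, merged.map((entry) => entry.seq))
```

Keying by `seq` makes a reconnect that replays overlapping entries idempotent, which is one plausible reason the plan asks for a monotonic counter before moving to patch events.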
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "@sebastianandreasson/pi-autonomous-agents",
  "private": false,
- "version": "0.9.1",
+ "version": "0.11.0",
  "type": "module",
  "description": "Portable unattended PI harness for developer/tester/visual-review loops.",
  "license": "MIT",
@@ -19,13 +19,18 @@
  "@mariozechner/pi-coding-agent": "^0.66.1"
  },
  "scripts": {
- "check": "node --check src/cli.mjs && node --check src/pi-clear-history.mjs && node --check src/pi-client.mjs && node --check src/pi-config.mjs && node --check src/pi-flow.mjs && node --check src/pi-heartbeat.mjs && node --check src/pi-history.mjs && node --check src/pi-preflight.mjs && node --check src/pi-prompts.mjs && node --check src/pi-repo.mjs && node --check src/pi-report.mjs && node --check src/pi-sdk-turn.mjs && node --check src/pi-supervisor.mjs && node --check src/pi-telemetry.mjs && node --check src/pi-visual-once.mjs && node --check src/pi-visual-review.mjs && node --check src/pi-visualizer.mjs && node --check src/pi-visualizer-server.mjs && node --check src/pi-visualizer-shared.mjs && node --check src/index.mjs && node --check test/pi-heartbeat.test.mjs && node --check test/pi-lifecycle.test.mjs && node --check test/pi-role-models.test.mjs && node --check test/pi-flow.test.mjs && node --check test/pi-history.test.mjs && node --check test/pi-prompts.test.mjs && node --check test/pi-preflight.test.mjs && node --check test/pi-repo.test.mjs && node --check test/pi-sdk-supervisor.test.mjs && node --check test/pi-sdk-turn.test.mjs && node --check test/pi-telemetry.test.mjs && node --check test/pi-visualizer-shared.test.mjs && node --check test/fixtures/fake-pi.mjs && node --check test/fixtures/fake-pi-sdk.mjs",
- "test": "node --test test/pi-heartbeat.test.mjs test/pi-lifecycle.test.mjs test/pi-role-models.test.mjs test/pi-flow.test.mjs test/pi-history.test.mjs test/pi-prompts.test.mjs test/pi-preflight.test.mjs test/pi-repo.test.mjs test/pi-sdk-supervisor.test.mjs test/pi-sdk-turn.test.mjs test/pi-telemetry.test.mjs test/pi-visualizer-shared.test.mjs"
+ "check": "node --check src/cli.mjs && node --check src/pi-clear-history.mjs && node --check src/pi-client.mjs && node --check src/pi-config.mjs && node --check src/pi-debug-live.mjs && node --check src/pi-flow.mjs && node --check src/pi-heartbeat.mjs && node --check src/pi-history.mjs && node --check src/pi-preflight.mjs && node --check src/pi-prompts.mjs && node --check src/pi-repo.mjs && node --check src/pi-report.mjs && node --check src/pi-sdk-turn.mjs && node --check src/pi-supervisor.mjs && node --check src/pi-telemetry.mjs && node --check src/pi-visual-once.mjs && node --check src/pi-visual-review.mjs && node --check src/pi-visualizer.mjs && node --check src/pi-visualizer-server.mjs && node --check src/pi-visualizer-shared.mjs && node --check src/index.mjs && node --check test/pi-heartbeat.test.mjs && node --check test/pi-lifecycle.test.mjs && node --check test/pi-role-models.test.mjs && node --check test/pi-flow.test.mjs && node --check test/pi-history.test.mjs && node --check test/pi-prompts.test.mjs && node --check test/pi-preflight.test.mjs && node --check test/pi-repo.test.mjs && node --check test/pi-sdk-supervisor.test.mjs && node --check test/pi-sdk-turn.test.mjs && node --check test/pi-telemetry.test.mjs && node --check test/pi-visualizer-shared.test.mjs && node --check test/fixtures/fake-pi.mjs && node --check test/fixtures/fake-pi-sdk.mjs && node --check test/fixtures/fake-live-pi-sdk.mjs",
+ "test": "node --test test/pi-heartbeat.test.mjs test/pi-lifecycle.test.mjs test/pi-role-models.test.mjs test/pi-flow.test.mjs test/pi-history.test.mjs test/pi-prompts.test.mjs test/pi-preflight.test.mjs test/pi-repo.test.mjs test/pi-sdk-supervisor.test.mjs test/pi-sdk-turn.test.mjs test/pi-telemetry.test.mjs test/pi-visualizer-shared.test.mjs",
+ "debug:live-ui": "node src/cli.mjs debug-live --reset",
+ "dev:visualizer:ui": "npm --prefix visualizer-ui run dev",
+ "build:visualizer:ui": "npm --prefix visualizer-ui run build",
+ "prepublishOnly": "npm run check && npm test && npm run build:visualizer:ui"
  },
  "files": [
  "src",
  "templates",
  "docs",
+ "visualizer-ui/dist",
  "pi.config.json",
  "SETUP.md",
  "README.md"
package/src/cli.mjs CHANGED
@@ -19,6 +19,7 @@ const COMMANDS = new Map([
    ['clear-history', 'pi-clear-history.mjs'],
    ['visual-once', 'pi-visual-once.mjs'],
    ['visualize', 'pi-visualizer.mjs'],
+   ['debug-live', 'pi-debug-live.mjs'],
    ['visual-review-worker', 'pi-visual-review.mjs'],
  ])
 
package/src/pi-client.mjs CHANGED
@@ -3,6 +3,9 @@ import path from 'node:path'
  import { randomUUID } from 'node:crypto'
 
  const liveFeedWriteQueues = new Map()
+ const liveFeedSequences = new Map()
+ const MAX_LIVE_FEED_TEXT = 2000
+ const MAX_LIVE_FEED_SUMMARY = 600
  import {
    appendLog,
    writeTextFile,
@@ -41,6 +44,63 @@ async function writeAgentOutputSnapshot(config, content) {
    }
  }
 
+ function truncateText(value, maxChars) {
+   const text = String(value ?? '')
+   if (text.length <= maxChars) {
+     return text
+   }
+   return `${text.slice(0, maxChars - 16)}\n... [truncated]`
+ }
+
+ function summarizeValue(value, maxChars = MAX_LIVE_FEED_SUMMARY) {
+   if (value === null || value === undefined) {
+     return ''
+   }
+   if (typeof value === 'string') {
+     return truncateText(value, maxChars)
+   }
+   try {
+     return truncateText(JSON.stringify(value), maxChars)
+   } catch {
+     return truncateText(String(value), maxChars)
+   }
+ }
+
+ function sanitizeLiveFeedEvent(filePath, event) {
+   const nextSeq = (liveFeedSequences.get(filePath) ?? 0) + 1
+   liveFeedSequences.set(filePath, nextSeq)
+
+   const normalized = {
+     seq: nextSeq,
+     timestamp: String(event?.timestamp ?? new Date().toISOString()),
+     iteration: Number(event?.iteration ?? 0),
+     retryCount: Number(event?.retryCount ?? 0),
+     reason: String(event?.reason ?? ''),
+     phase: String(event?.phase ?? ''),
+     role: String(event?.role ?? ''),
+     kind: String(event?.kind ?? ''),
+     type: String(event?.type ?? 'event'),
+     toolName: String(event?.toolName ?? ''),
+     isError: event?.isError === true,
+     text: truncateText(event?.text ?? '', MAX_LIVE_FEED_TEXT),
+   }
+
+   const argsSummary = summarizeValue(event?.args)
+   const partialSummary = summarizeValue(event?.partialResult)
+   const resultSummary = summarizeValue(event?.result)
+   if (argsSummary !== '') {
+     normalized.argsSummary = argsSummary
+   }
+   if (partialSummary !== '') {
+     normalized.partialSummary = partialSummary
+   }
+   if (resultSummary !== '') {
+     normalized.resultSummary = resultSummary
+   }
+
+   return normalized
+ }
+
  async function appendLiveFeedEvent(config, event) {
    if (!config.runLiveFeedFile) {
      return
@@ -51,8 +111,9 @@ async function appendLiveFeedEvent(config, event) {
    const next = previous
      .catch(() => {})
      .then(async () => {
+       const sanitized = sanitizeLiveFeedEvent(filePath, event)
        await fs.mkdir(path.dirname(filePath), { recursive: true })
-       await fs.appendFile(filePath, `${JSON.stringify(event)}\n`, 'utf8')
+       await fs.appendFile(filePath, `${JSON.stringify(sanitized)}\n`, 'utf8')
      })
 
    liveFeedWriteQueues.set(filePath, next)
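The `pi-client.mjs` hunks above cap live-feed text at `MAX_LIVE_FEED_TEXT` and stamp each entry with a per-file `seq`. The following is a minimal standalone sketch of those two behaviors, reproduced from the diff rather than imported from the package. `nextSeq` is a hypothetical name for the counter logic that lives inside `sanitizeLiveFeedEvent`.

```javascript
// Sketch only: mirrors truncateText and the per-file counter from the diff.
const sequences = new Map()

function truncateText(value, maxChars) {
  const text = String(value ?? '')
  if (text.length <= maxChars) {
    return text
  }
  // The marker "\n... [truncated]" is 16 characters long, so the result
  // is exactly maxChars characters.
  return `${text.slice(0, maxChars - 16)}\n... [truncated]`
}

// Hypothetical helper name; the package inlines this in sanitizeLiveFeedEvent.
function nextSeq(filePath) {
  const seq = (sequences.get(filePath) ?? 0) + 1
  sequences.set(filePath, seq)
  return seq
}

const clipped = truncateText('x'.repeat(5000), 2000)
console.log(clipped.length) // 2000
console.log(nextSeq('a.jsonl'), nextSeq('a.jsonl'), nextSeq('b.jsonl')) // 1 2 1
```

Because the counter lives in a module-level `Map`, it appears to restart at 1 in a new process even when the JSONL file already has entries. Consumers that order by `seq` may want to account for that.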
package/src/pi-debug-live.mjs ADDED
@@ -0,0 +1,153 @@
+ #!/usr/bin/env node
+
+ import fs from 'node:fs/promises'
+ import path from 'node:path'
+ import process from 'node:process'
+ import { spawn, execFileSync } from 'node:child_process'
+ import { fileURLToPath } from 'node:url'
+
+ const scriptDir = path.dirname(fileURLToPath(import.meta.url))
+ const packageRoot = path.resolve(scriptDir, '..')
+ const cliFile = path.join(scriptDir, 'cli.mjs')
+ const fakePiFile = path.join(packageRoot, 'test', 'fixtures', 'fake-pi.mjs')
+ const fakeLiveSdkFile = path.join(packageRoot, 'test', 'fixtures', 'fake-live-pi-sdk.mjs')
+ const sandboxDir = path.join(packageRoot, '.pi-debug', 'live-ui')
+ const DEFAULT_TASK_COUNT = 12
+
+ function shellQuote(value) {
+   return JSON.stringify(String(value))
+ }
+
+ function readFlagValue(flag) {
+   const index = process.argv.indexOf(flag)
+   if (index === -1) {
+     return ''
+   }
+   return String(process.argv[index + 1] ?? '').trim()
+ }
+
+ function readScenario() {
+   const value = readFlagValue('--scenario') || process.env.PI_FAKE_LIVE_SCENARIO || 'default'
+   return String(value).trim() || 'default'
+ }
+
+ function readTaskCount() {
+   const raw = Number.parseInt(readFlagValue('--task-count') || process.env.PI_DEBUG_TASK_COUNT || `${DEFAULT_TASK_COUNT}`, 10)
+   return Number.isFinite(raw) && raw > 0 ? raw : DEFAULT_TASK_COUNT
+ }
+
+ function buildTodoLines(taskCount) {
+   const lines = []
+   for (let index = 1; index <= taskCount; index += 1) {
+     const phase = index <= Math.ceil(taskCount / 3)
+       ? 'Phase 1'
+       : index <= Math.ceil((taskCount * 2) / 3)
+         ? 'Phase 2'
+         : 'Phase 3'
+     const label = `Fake live task ${index}`
+     if (lines.length === 0 || lines[lines.length - 1] !== `## ${phase}`) {
+       if (lines.length > 0) {
+         lines.push('')
+       }
+       lines.push(`## ${phase}`)
+       lines.push('')
+     }
+     lines.push(`- [ ] ${label}`)
+   }
+   return `${lines.join('\n')}\n`
+ }
+
+ async function ensureRepo(cwd) {
+   try {
+     execFileSync('git', ['rev-parse', '--is-inside-work-tree'], { cwd, stdio: 'ignore' })
+   } catch {
+     execFileSync('git', ['init'], { cwd, stdio: 'ignore' })
+     execFileSync('git', ['config', 'user.name', 'PI Harness Debug'], { cwd, stdio: 'ignore' })
+     execFileSync('git', ['config', 'user.email', 'pi-harness-debug@example.com'], { cwd, stdio: 'ignore' })
+   }
+ }
+
+ async function seedFiles(cwd, { taskCount, scenario }) {
+   await fs.mkdir(path.join(cwd, 'pi'), { recursive: true })
+   await fs.writeFile(path.join(cwd, 'TODOS.md'), buildTodoLines(taskCount), 'utf8')
+   await fs.writeFile(path.join(cwd, 'DEVELOPER.md'), `Developer instructions for local visualizer debugging.\nScenario: ${scenario}\n`, 'utf8')
+   await fs.writeFile(path.join(cwd, 'TESTER.md'), `Tester instructions for local visualizer debugging.\nScenario: ${scenario}\n`, 'utf8')
+   await fs.writeFile(path.join(cwd, 'pi.config.json'), `${JSON.stringify({
+     transport: 'sdk',
+     taskFile: 'TODOS.md',
+     developerInstructionsFile: 'DEVELOPER.md',
+     testerInstructionsFile: 'TESTER.md',
+     piCli: fakePiFile,
+     piModel: 'fake-model',
+     roleModels: {
+       developer: 'fake-model',
+       developerRetry: 'fake-model',
+       developerFix: 'fake-model',
+       tester: 'fake-model',
+       testerCommit: 'fake-model',
+     },
+     testCommand: `${shellQuote(process.execPath)} -e ${shellQuote('setTimeout(()=>process.exit(0), 250)')}`,
+     streamTerminal: true,
+     continueAfterSeconds: 3600,
+     noEventTimeoutSeconds: 3600,
+     toolContinueAfterSeconds: 3600,
+     toolNoEventTimeoutSeconds: 3600,
+     sleepBetweenSeconds: 1,
+     maxIterations: Math.max(taskCount * 3, 20),
+   }, null, 2)}\n`, 'utf8')
+ }
+
+ async function ensureInitialCommit(cwd) {
+   try {
+     execFileSync('git', ['rev-parse', 'HEAD'], { cwd, stdio: 'ignore' })
+   } catch {
+     execFileSync('git', ['add', '.'], { cwd, stdio: 'ignore' })
+     execFileSync('git', ['commit', '-m', 'chore(debug): seed fake live sandbox'], { cwd, stdio: 'ignore' })
+   }
+ }
+
+ async function main() {
+   const reset = process.argv.includes('--reset')
+   const scenario = readScenario()
+   const taskCount = readTaskCount()
+
+   if (reset) {
+     await fs.rm(sandboxDir, { recursive: true, force: true })
+   }
+
+   await fs.mkdir(sandboxDir, { recursive: true })
+   await ensureRepo(sandboxDir)
+   await seedFiles(sandboxDir, { taskCount, scenario })
+   await ensureInitialCommit(sandboxDir)
+
+   process.stdout.write(`PI debug sandbox: ${sandboxDir}\n`)
+   process.stdout.write(`Using fake live SDK fixture: ${fakeLiveSdkFile}\n`)
+   process.stdout.write(`Scenario: ${scenario}\n`)
+   process.stdout.write(`Task count: ${taskCount}\n`)
+
+   const child = spawn(process.execPath, [cliFile, 'run'], {
+     cwd: sandboxDir,
+     env: {
+       ...process.env,
+       PI_CONFIG_FILE: 'pi.config.json',
+       PI_SDK_MODULE: fakeLiveSdkFile,
+       PI_FAKE_LIVE_SCENARIO: scenario,
+       PI_VISUALIZER_HOST: process.env.PI_VISUALIZER_HOST || '127.0.0.1',
+       PI_VISUALIZER_PORT: process.env.PI_VISUALIZER_PORT || '4317',
+     },
+     stdio: 'inherit',
+   })
+
+   child.on('exit', (code, signal) => {
+     if (signal) {
+       process.exitCode = 128
+       return
+     }
+     process.exitCode = code ?? 1
+   })
+ }
+
+ main().catch((error) => {
+   console.error(error instanceof Error ? error.stack ?? error.message : String(error))
+   process.exitCode = 1
+ })
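One detail in `buildTodoLines` is worth a second look. The heading check compares `## ${phase}` against the last pushed line, but after the first iteration that line is always a task entry, so the phase heading appears to be emitted again before every task. Below is a minimal sketch of a variant that tracks the current phase in its own variable. `phaseFor` and `buildTodoLinesGrouped` are hypothetical names, not part of the package, and whether the harness's TODO parser cares about repeated headings is not visible in this diff.

```javascript
// Hypothetical variant of buildTodoLines: same phase thresholds as the diff,
// but the heading is emitted only when the phase actually changes.
function phaseFor(index, taskCount) {
  if (index <= Math.ceil(taskCount / 3)) return 'Phase 1'
  if (index <= Math.ceil((taskCount * 2) / 3)) return 'Phase 2'
  return 'Phase 3'
}

function buildTodoLinesGrouped(taskCount) {
  const lines = []
  let currentPhase = ''
  for (let index = 1; index <= taskCount; index += 1) {
    const phase = phaseFor(index, taskCount)
    if (phase !== currentPhase) {
      if (lines.length > 0) {
        lines.push('') // blank line between the previous group and the next heading
      }
      lines.push(`## ${phase}`, '')
      currentPhase = phase
    }
    lines.push(`- [ ] Fake live task ${index}`)
  }
  return `${lines.join('\n')}\n`
}

const todo = buildTodoLinesGrouped(6)
console.log(todo)
```

With six tasks this yields three headings with two tasks under each, which is presumably the layout the original loop was aiming for.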
package/src/pi-prompts.mjs CHANGED
@@ -119,6 +119,36 @@ function repoInstructionsAuthorityLine(config, instructionsFile, usesBundledInst
    return `Repo-local instructions in ${displayPath(config, instructionsFile)} are the primary role contract. Follow them over package defaults when they differ.\n`
  }
 
+ export function classifyTaskType(task) {
+   const text = String(task ?? '').trim().toLowerCase()
+   if (text === '') {
+     return 'general'
+   }
+
+   if (
+     /\b(write|add|create|implement|expand|improve|fix|update)\b.*\b(test|tests|coverage|regression test|spec|specs)\b/.test(text)
+     || /\b(test|tests|coverage|regression test|spec|specs)\b.*\b(write|add|create|implement|expand|improve|fix|update)\b/.test(text)
+   ) {
+     return 'test'
+   }
+
+   return 'general'
+ }
+
+ function formatTaskTypeGuidance(taskType) {
+   if (taskType !== 'test') {
+     return ''
+   }
+
+   return [
+     'Test-task guidance:',
+     '- This TODO is primarily test-focused. Do not fail solely because changes are mostly or entirely tests.',
+     '- PASS if the new or updated test adds meaningful behavioral or regression coverage and verification passes.',
+     '- FAIL if the test is brittle, redundant, weakly asserted, or not tied to real behavior.',
+     '- Prefer checking whether the test would have failed before the change, or whether developer notes justify why missing coverage mattered.',
+   ].join('\n')
+ }
+
  function testerPassOwnershipRules(config) {
    if (config.commitMode === 'plan') {
      return {
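To make the keyword heuristic in `classifyTaskType` concrete, the sketch below copies the function from the hunk (without the `export`) and runs it on a few sample task strings. The sample strings are made up for illustration.

```javascript
// Copied from the diff so the example is self-contained.
function classifyTaskType(task) {
  const text = String(task ?? '').trim().toLowerCase()
  if (text === '') {
    return 'general'
  }
  if (
    /\b(write|add|create|implement|expand|improve|fix|update)\b.*\b(test|tests|coverage|regression test|spec|specs)\b/.test(text)
    || /\b(test|tests|coverage|regression test|spec|specs)\b.*\b(write|add|create|implement|expand|improve|fix|update)\b/.test(text)
  ) {
    return 'test'
  }
  return 'general'
}

// Illustrative inputs only; none of these come from the package.
const samples = [
  'Add regression tests for login flow',
  'Implement dark mode toggle',
  'Update latest build script',
  'Fix the bug found while running the test suite',
]
const results = samples.map((task) => classifyTaskType(task))
console.log(results) // [ 'test', 'general', 'general', 'test' ]
```

The word boundaries keep `latest` from matching `test`. The last sample shows the flip side of a keyword match: a bug-fix task that merely mentions a test suite is also labeled `test`, so the extra guidance may occasionally appear on tasks that are not primarily about tests.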
@@ -353,6 +383,9 @@ export function buildTesterPrompt(config, {
      developerNotes || '(none provided)',
      configMaxLines(config, 'maxPromptNotesLines', 16),
    )
+   const taskType = classifyTaskType(task)
+   const taskTypeLabel = taskType === 'test' ? 'test-focused' : 'general'
+   const taskTypeGuidance = formatTaskTypeGuidance(taskType)
    const verificationCommand = config.testCommand.trim() === '' ? '(not configured)' : config.testCommand
    const visualCaptureNote = config.visualReviewEnabled
      ? `\n- Keep the screenshot capture flow working so the harness still produces current visual artifacts for review.`
@@ -364,6 +397,7 @@ export function buildTesterPrompt(config, {
    )
    const passOwnership = testerPassOwnershipRules(config)
    const largeFileRiskHint = formatLargeFileRiskHint(largeFileWarnings)
+   const taskTypeRuleBlock = taskTypeGuidance === '' ? '' : `${taskTypeGuidance}\n`
 
    if (!config.usingBundledTesterInstructions) {
      return `Read ${taskFile} and ${instructionsFile}.
@@ -375,6 +409,7 @@ You are the TESTER role. You are reviewing the most recent developer work from a
 
  Current phase: ${phase}
  Current task: ${task}
+ Current task type: ${taskTypeLabel}
  Reason for this tester pass: ${reason}
 
  Developer notes:
@@ -391,7 +426,7 @@ Rules:
  - If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
  - If blocked or inconclusive, return VERDICT: BLOCKED.
  - Do not hide real bugs with brittle tests.
- - ${passOwnership.successRule.slice(2)}
+ ${taskTypeRuleBlock}- ${passOwnership.successRule.slice(2)}
  - ${passOwnership.isolationRule.slice(2)}
  - ${passOwnership.extraRule.slice(2)}${visualCaptureNote}
 
@@ -417,6 +452,7 @@ You are the TESTER role. You are reviewing the most recent developer work from a
 
  Current phase: ${phase}
  Current task: ${task}
+ Current task type: ${taskTypeLabel}
  Reason for this tester pass: ${reason}
 
  Developer notes:
@@ -433,7 +469,7 @@ ${indentBlock(innerLoopValidationRules(verificationCommand), '\t')}
  - Prefer one focused browser-driven review pass.
  - If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
  - Do not hide real bugs with brittle tests.
- - If blocked or inconclusive, return VERDICT: BLOCKED.
+ ${taskTypeGuidance === '' ? '' : `${indentBlock(taskTypeGuidance, '\t')}\n`} - If blocked or inconclusive, return VERDICT: BLOCKED.
  ${indentBlock(passOwnership.successRule, '\t')}
  ${indentBlock(passOwnership.isolationRule, '\t')}
  ${indentBlock(passOwnership.extraRule, '\t')}${visualCaptureNote}
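The template changes above splice the test-task guidance into the tester prompt only when the guidance string is non-empty, so prompts for general tasks keep their previous rule list. Below is a minimal sketch of that conditional splice. `renderRules` and the sample rule strings are illustrative assumptions; only the empty-string check mirrors `taskTypeRuleBlock` from the diff.

```javascript
// Illustrative only: renderRules is a hypothetical helper, not part of the package.
function renderRules(taskTypeGuidance, successRule) {
  // Same shape as taskTypeRuleBlock in the diff: empty guidance adds nothing,
  // non-empty guidance gets a trailing newline so the next rule starts on its own line.
  const taskTypeRuleBlock = taskTypeGuidance === '' ? '' : `${taskTypeGuidance}\n`
  return [
    '- Do not hide real bugs with brittle tests.',
    `${taskTypeRuleBlock}- ${successRule}`,
  ].join('\n')
}

// Sample strings, made up for the example.
const general = renderRules('', 'Commit only on PASS.')
const testFocused = renderRules(
  'Test-task guidance:\n- PASS if coverage is meaningful.',
  'Commit only on PASS.',
)
console.log(general.split('\n').length) // 2
console.log(testFocused.split('\n').length) // 4
```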