@sebastianandreasson/pi-autonomous-agents 0.9.1 → 0.11.0
- package/README.md +32 -0
- package/docs/PI_SUPERVISOR.md +6 -1
- package/docs/VISUALIZER_UI_PLAN.md +117 -0
- package/package.json +8 -3
- package/src/cli.mjs +1 -0
- package/src/pi-client.mjs +62 -1
- package/src/pi-debug-live.mjs +153 -0
- package/src/pi-prompts.mjs +38 -2
- package/src/pi-supervisor.mjs +37 -6
- package/src/pi-telemetry.mjs +70 -5
- package/src/pi-visualizer-server.mjs +121 -554
- package/src/pi-visualizer-shared.mjs +29 -10
- package/visualizer-ui/dist/assets/index-C5V0jXPE.css +1 -0
- package/visualizer-ui/dist/assets/index-CpHvuv0C.js +12 -0
- package/visualizer-ui/dist/index.html +13 -0
package/README.md
CHANGED
|
@@ -89,6 +89,7 @@ pi-harness report
|
|
|
89
89
|
pi-harness clear-history
|
|
90
90
|
pi-harness visual-once
|
|
91
91
|
pi-harness visualize
|
|
92
|
+
pi-harness debug-live
|
|
92
93
|
pi-harness visual-review-worker
|
|
93
94
|
```
|
|
94
95
|
|
|
@@ -319,4 +320,35 @@ npm run check
|
|
|
319
320
|
npm test
|
|
320
321
|
```
|
|
321
322
|
|
|
323
|
+
For local visualizer iteration against fake live SDK agent:
|
|
324
|
+
|
|
325
|
+
```bash
|
|
326
|
+
npm run debug:live-ui
|
|
327
|
+
```
|
|
328
|
+
|
|
329
|
+
Scenario variants:
|
|
330
|
+
|
|
331
|
+
```bash
|
|
332
|
+
node src/cli.mjs debug-live --reset --scenario noisy --task-count 24
|
|
333
|
+
node src/cli.mjs debug-live --reset --scenario retry
|
|
334
|
+
```
|
|
335
|
+
|
|
336
|
+
For React/Vite visualizer UI dev loop:
|
|
337
|
+
|
|
338
|
+
```bash
|
|
339
|
+
npm run dev:visualizer:ui
|
|
340
|
+
```
|
|
341
|
+
|
|
342
|
+
For production visualizer UI build:
|
|
343
|
+
|
|
344
|
+
```bash
|
|
345
|
+
npm run build:visualizer:ui
|
|
346
|
+
```
|
|
347
|
+
|
|
348
|
+
Publish now auto-runs check, tests, and UI build via `prepublishOnly`.
|
|
349
|
+
|
|
350
|
+
`npm run debug:live-ui` seeds `.pi-debug/live-ui/`, runs the harness there with a streaming fake SDK fixture, hosts the visualizer, and gives a stable local repro loop for UI work. The React app lives in `visualizer-ui/`. The visualizer server now serves built assets from `visualizer-ui/dist/` and falls back to a build-instructions page if build artifacts are missing.
|
|
351
|
+
|
|
352
|
+
See `docs/VISUALIZER_UI_PLAN.md` for migration plan.
|
|
353
|
+
|
|
322
354
|
The package requires Node `>=20`.
|
package/docs/PI_SUPERVISOR.md
CHANGED
|
@@ -35,8 +35,10 @@ Main package files:
|
|
|
35
35
|
- `src/pi-prompts.mjs`: default prompt builders
|
|
36
36
|
- `src/pi-visual-review.mjs`: multimodal visual-review worker
|
|
37
37
|
- `src/pi-visual-once.mjs`: one-shot manual visual review runner
|
|
38
|
-
- `src/pi-visualizer.mjs`: local web UI
|
|
38
|
+
- `src/pi-visualizer.mjs`: local web UI entrypoint
|
|
39
|
+
- `src/pi-visualizer-server.mjs`: shared visualizer server/runtime
|
|
39
40
|
- `src/pi-visualizer-shared.mjs`: flow-state helpers for visualizer
|
|
41
|
+
- `src/pi-debug-live.mjs`: local fake-live sandbox runner for visualizer debugging
|
|
40
42
|
- `src/pi-report.mjs`: telemetry summary report
|
|
41
43
|
- `templates/DEVELOPER.md`: default developer-role instructions template
|
|
42
44
|
- `templates/TESTER.md`: default tester-role instructions template
|
|
@@ -49,6 +51,7 @@ pi-harness run
|
|
|
49
51
|
pi-harness report
|
|
50
52
|
pi-harness visual-once
|
|
51
53
|
pi-harness visualize
|
|
54
|
+
pi-harness debug-live
|
|
52
55
|
```
|
|
53
56
|
|
|
54
57
|
The package reads `PI_CONFIG_FILE` if provided. Otherwise it falls back to the bundled generic `pi.config.json`.
|
|
@@ -59,6 +62,8 @@ The package reads `PI_CONFIG_FILE` if provided. Otherwise it falls back to the b
|
|
|
59
62
|
|
|
60
63
|
Visualizer reads active-run lock, TODO file, per-run state, per-run iteration summary, per-run last output snapshot, live feed JSONL, and telemetry to show current stage plus historical runs.
|
|
61
64
|
|
|
65
|
+
For local UI iteration in this package repo, use `pi-harness debug-live` to run against seeded fake live SDK sandbox. Useful variants: `--scenario noisy`, `--scenario retry`, `--task-count 24`.
|
|
66
|
+
|
|
62
67
|
## Config Contract
|
|
63
68
|
|
|
64
69
|
Projects typically provide their own `pi.config.json` with fields such as:
|
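The `--scenario` and `--task-count` variants mentioned in this file are resolved in `src/pi-debug-live.mjs` (later in this diff) in the order flag, then environment variable, then default. A minimal sketch of that precedence follows. It mirrors the source functions, except that `argv` and `env` are passed as parameters here (an adaptation for illustration; the source reads `process.argv` and `process.env` directly).

```javascript
const DEFAULT_TASK_COUNT = 12

// Return the token that follows `flag`, or '' when the flag is absent.
function readFlagValue(argv, flag) {
  const index = argv.indexOf(flag)
  if (index === -1) {
    return ''
  }
  return String(argv[index + 1] ?? '').trim()
}

// Flag wins over PI_FAKE_LIVE_SCENARIO, which wins over 'default'.
function readScenario(argv, env = {}) {
  const value = readFlagValue(argv, '--scenario') || env.PI_FAKE_LIVE_SCENARIO || 'default'
  return String(value).trim() || 'default'
}

// Non-numeric, zero, or negative counts fall back to the default.
function readTaskCount(argv, env = {}) {
  const raw = Number.parseInt(
    readFlagValue(argv, '--task-count') || env.PI_DEBUG_TASK_COUNT || `${DEFAULT_TASK_COUNT}`,
    10,
  )
  return Number.isFinite(raw) && raw > 0 ? raw : DEFAULT_TASK_COUNT
}

console.log(readScenario(['--reset', '--scenario', 'noisy']))
console.log(readTaskCount(['--task-count', '24']))
console.log(readTaskCount(['--task-count', 'abc']))
```

Because the lookup takes whatever token follows the flag, a flag given without a value picks up the next argument as its value; the sketch keeps that behavior rather than changing it.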
|
package/docs/VISUALIZER_UI_PLAN.md
@@ -0,0 +1,117 @@
|
|
|
1
|
+
# Visualizer UI migration plan
|
|
2
|
+
|
|
3
|
+
## Goal
|
|
4
|
+
|
|
5
|
+
Replace inline browser JS/HTML in `src/pi-visualizer-server.mjs` with maintainable React frontend, while keeping Node visualizer server as API + SSE + static asset host.
|
|
6
|
+
|
|
7
|
+
## Chosen stack
|
|
8
|
+
|
|
9
|
+
- React
|
|
10
|
+
- Vite
|
|
11
|
+
- TypeScript
|
|
12
|
+
- Zustand
|
|
13
|
+
- Plain CSS
|
|
14
|
+
|
|
15
|
+
## Repo layout
|
|
16
|
+
|
|
17
|
+
```text
|
|
18
|
+
src/
|
|
19
|
+
pi-visualizer-server.mjs # API, SSE, static asset host
|
|
20
|
+
visualizer-ui/
|
|
21
|
+
package.json
|
|
22
|
+
tsconfig.json
|
|
23
|
+
vite.config.ts
|
|
24
|
+
index.html
|
|
25
|
+
src/
|
|
26
|
+
main.tsx
|
|
27
|
+
App.tsx
|
|
28
|
+
api.ts
|
|
29
|
+
store.ts
|
|
30
|
+
types.ts
|
|
31
|
+
styles.css
|
|
32
|
+
components/
|
|
33
|
+
TodoList.tsx
|
|
34
|
+
FlowStrip.tsx
|
|
35
|
+
LiveFeed.tsx
|
|
36
|
+
CurrentEdits.tsx
|
|
37
|
+
DiagnosticsPanel.tsx
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
## State model
|
|
41
|
+
|
|
42
|
+
Zustand store owns:
|
|
43
|
+
|
|
44
|
+
- latest snapshot
|
|
45
|
+
- selected run
|
|
46
|
+
- selected todo
|
|
47
|
+
- selected event
|
|
48
|
+
- feed toggles
|
|
49
|
+
- SSE lifecycle
|
|
50
|
+
- initial load status / error
|
|
51
|
+
|
|
52
|
+
## API contract
|
|
53
|
+
|
|
54
|
+
Current server routes:
|
|
55
|
+
|
|
56
|
+
- `GET /api/state` → full snapshot
|
|
57
|
+
- `GET /api/stream` → SSE full snapshots
|
|
58
|
+
|
|
59
|
+
Short term: React app consumes current full-snapshot SSE.
|
|
60
|
+
|
|
61
|
+
Next improvement:
|
|
62
|
+
|
|
63
|
+
- add monotonic `seq` to live feed entries
|
|
64
|
+
- optionally move SSE from full snapshots to patch events
|
|
65
|
+
- add snapshot version to ignore stale payloads
|
|
66
|
+
|
|
67
|
+
## Migration phases
|
|
68
|
+
|
|
69
|
+
### Phase 1
|
|
70
|
+
- scaffold `visualizer-ui/`
|
|
71
|
+
- keep current inline HTML as fallback
|
|
72
|
+
- add built asset serving from `visualizer-ui/dist`
|
|
73
|
+
- add `/api/state` alias
|
|
74
|
+
|
|
75
|
+
### Phase 2
|
|
76
|
+
- port current layout into React components
|
|
77
|
+
- use Zustand store + initial snapshot fetch
|
|
78
|
+
- use SSE reconnect from frontend
|
|
79
|
+
|
|
80
|
+
### Phase 3
|
|
81
|
+
- move feed/timeline ordering to stable `seq`
|
|
82
|
+
- reduce full rerenders
|
|
83
|
+
- preserve scroll behavior inside components
|
|
84
|
+
|
|
85
|
+
### Phase 4
|
|
86
|
+
- remove inline browser app from server once built UI covers current features
|
|
87
|
+
- keep server only as API/static host
|
|
88
|
+
|
|
89
|
+
## Dev workflow
|
|
90
|
+
|
|
91
|
+
Backend + fake live harness:
|
|
92
|
+
|
|
93
|
+
```bash
|
|
94
|
+
npm run debug:live-ui
|
|
95
|
+
```
|
|
96
|
+
|
|
97
|
+
Frontend dev server:
|
|
98
|
+
|
|
99
|
+
```bash
|
|
100
|
+
npm run dev:visualizer:ui
|
|
101
|
+
```
|
|
102
|
+
|
|
103
|
+
Frontend build:
|
|
104
|
+
|
|
105
|
+
```bash
|
|
106
|
+
npm run build:visualizer:ui
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
## Notes
|
|
110
|
+
|
|
111
|
+
Current state:
|
|
112
|
+
|
|
113
|
+
- React/Vite/Zustand UI scaffold added
|
|
114
|
+
- built assets generated under `visualizer-ui/dist/`
|
|
115
|
+
- server serves built UI directly
|
|
116
|
+
- legacy inline browser app removed
|
|
117
|
+
- fallback page only shows build instructions when dist missing
|
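The "API contract" section of the plan proposes adding a snapshot version so the UI can ignore stale payloads. One way such a guard could look is sketched below. The `version` field and the `createSnapshotStore` helper are hypothetical: the plan names the idea but does not define the payload shape.

```javascript
// Hypothetical snapshot shape: { version: number, ...rest }.
// The plan proposes a version field; its name and type are assumed here.
function createSnapshotStore() {
  let latest = null
  return {
    // Most recent snapshot that was accepted, or null before the first one.
    get latest() {
      return latest
    },
    // Accept a snapshot only if it is strictly newer than the current one.
    apply(snapshot) {
      if (latest !== null && snapshot.version <= latest.version) {
        return false
      }
      latest = snapshot
      return true
    },
  }
}

const store = createSnapshotStore()
console.log(store.apply({ version: 2, todos: [] }))
console.log(store.apply({ version: 1, todos: [] }))
console.log(store.latest.version)
```

In the browser, both the initial `GET /api/state` response and each `GET /api/stream` payload could be passed through `apply`, so a late-arriving initial fetch would not overwrite a newer streamed snapshot.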
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@sebastianandreasson/pi-autonomous-agents",
|
|
3
3
|
"private": false,
|
|
4
|
-
"version": "0.
|
|
4
|
+
"version": "0.11.0",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"description": "Portable unattended PI harness for developer/tester/visual-review loops.",
|
|
7
7
|
"license": "MIT",
|
|
@@ -19,13 +19,18 @@
|
|
|
19
19
|
"@mariozechner/pi-coding-agent": "^0.66.1"
|
|
20
20
|
},
|
|
21
21
|
"scripts": {
|
|
22
|
-
"check": "node --check src/cli.mjs && node --check src/pi-clear-history.mjs && node --check src/pi-client.mjs && node --check src/pi-config.mjs && node --check src/pi-flow.mjs && node --check src/pi-heartbeat.mjs && node --check src/pi-history.mjs && node --check src/pi-preflight.mjs && node --check src/pi-prompts.mjs && node --check src/pi-repo.mjs && node --check src/pi-report.mjs && node --check src/pi-sdk-turn.mjs && node --check src/pi-supervisor.mjs && node --check src/pi-telemetry.mjs && node --check src/pi-visual-once.mjs && node --check src/pi-visual-review.mjs && node --check src/pi-visualizer.mjs && node --check src/pi-visualizer-server.mjs && node --check src/pi-visualizer-shared.mjs && node --check src/index.mjs && node --check test/pi-heartbeat.test.mjs && node --check test/pi-lifecycle.test.mjs && node --check test/pi-role-models.test.mjs && node --check test/pi-flow.test.mjs && node --check test/pi-history.test.mjs && node --check test/pi-prompts.test.mjs && node --check test/pi-preflight.test.mjs && node --check test/pi-repo.test.mjs && node --check test/pi-sdk-supervisor.test.mjs && node --check test/pi-sdk-turn.test.mjs && node --check test/pi-telemetry.test.mjs && node --check test/pi-visualizer-shared.test.mjs && node --check test/fixtures/fake-pi.mjs && node --check test/fixtures/fake-pi-sdk.mjs",
|
|
23
|
-
"test": "node --test test/pi-heartbeat.test.mjs test/pi-lifecycle.test.mjs test/pi-role-models.test.mjs test/pi-flow.test.mjs test/pi-history.test.mjs test/pi-prompts.test.mjs test/pi-preflight.test.mjs test/pi-repo.test.mjs test/pi-sdk-supervisor.test.mjs test/pi-sdk-turn.test.mjs test/pi-telemetry.test.mjs test/pi-visualizer-shared.test.mjs"
|
|
22
|
+
"check": "node --check src/cli.mjs && node --check src/pi-clear-history.mjs && node --check src/pi-client.mjs && node --check src/pi-config.mjs && node --check src/pi-debug-live.mjs && node --check src/pi-flow.mjs && node --check src/pi-heartbeat.mjs && node --check src/pi-history.mjs && node --check src/pi-preflight.mjs && node --check src/pi-prompts.mjs && node --check src/pi-repo.mjs && node --check src/pi-report.mjs && node --check src/pi-sdk-turn.mjs && node --check src/pi-supervisor.mjs && node --check src/pi-telemetry.mjs && node --check src/pi-visual-once.mjs && node --check src/pi-visual-review.mjs && node --check src/pi-visualizer.mjs && node --check src/pi-visualizer-server.mjs && node --check src/pi-visualizer-shared.mjs && node --check src/index.mjs && node --check test/pi-heartbeat.test.mjs && node --check test/pi-lifecycle.test.mjs && node --check test/pi-role-models.test.mjs && node --check test/pi-flow.test.mjs && node --check test/pi-history.test.mjs && node --check test/pi-prompts.test.mjs && node --check test/pi-preflight.test.mjs && node --check test/pi-repo.test.mjs && node --check test/pi-sdk-supervisor.test.mjs && node --check test/pi-sdk-turn.test.mjs && node --check test/pi-telemetry.test.mjs && node --check test/pi-visualizer-shared.test.mjs && node --check test/fixtures/fake-pi.mjs && node --check test/fixtures/fake-pi-sdk.mjs && node --check test/fixtures/fake-live-pi-sdk.mjs",
|
|
23
|
+
"test": "node --test test/pi-heartbeat.test.mjs test/pi-lifecycle.test.mjs test/pi-role-models.test.mjs test/pi-flow.test.mjs test/pi-history.test.mjs test/pi-prompts.test.mjs test/pi-preflight.test.mjs test/pi-repo.test.mjs test/pi-sdk-supervisor.test.mjs test/pi-sdk-turn.test.mjs test/pi-telemetry.test.mjs test/pi-visualizer-shared.test.mjs",
|
|
24
|
+
"debug:live-ui": "node src/cli.mjs debug-live --reset",
|
|
25
|
+
"dev:visualizer:ui": "npm --prefix visualizer-ui run dev",
|
|
26
|
+
"build:visualizer:ui": "npm --prefix visualizer-ui run build",
|
|
27
|
+
"prepublishOnly": "npm run check && npm test && npm run build:visualizer:ui"
|
|
24
28
|
},
|
|
25
29
|
"files": [
|
|
26
30
|
"src",
|
|
27
31
|
"templates",
|
|
28
32
|
"docs",
|
|
33
|
+
"visualizer-ui/dist",
|
|
29
34
|
"pi.config.json",
|
|
30
35
|
"SETUP.md",
|
|
31
36
|
"README.md"
|
package/src/cli.mjs
CHANGED
package/src/pi-client.mjs
CHANGED
|
@@ -3,6 +3,9 @@ import path from 'node:path'
|
|
|
3
3
|
import { randomUUID } from 'node:crypto'
|
|
4
4
|
|
|
5
5
|
const liveFeedWriteQueues = new Map()
|
|
6
|
+
const liveFeedSequences = new Map()
|
|
7
|
+
const MAX_LIVE_FEED_TEXT = 2000
|
|
8
|
+
const MAX_LIVE_FEED_SUMMARY = 600
|
|
6
9
|
import {
|
|
7
10
|
appendLog,
|
|
8
11
|
writeTextFile,
|
|
@@ -41,6 +44,63 @@ async function writeAgentOutputSnapshot(config, content) {
|
|
|
41
44
|
}
|
|
42
45
|
}
|
|
43
46
|
|
|
47
|
+
function truncateText(value, maxChars) {
|
|
48
|
+
const text = String(value ?? '')
|
|
49
|
+
if (text.length <= maxChars) {
|
|
50
|
+
return text
|
|
51
|
+
}
|
|
52
|
+
return `${text.slice(0, maxChars - 16)}\n... [truncated]`
|
|
53
|
+
}
|
|
54
|
+
|
|
55
|
+
function summarizeValue(value, maxChars = MAX_LIVE_FEED_SUMMARY) {
|
|
56
|
+
if (value === null || value === undefined) {
|
|
57
|
+
return ''
|
|
58
|
+
}
|
|
59
|
+
if (typeof value === 'string') {
|
|
60
|
+
return truncateText(value, maxChars)
|
|
61
|
+
}
|
|
62
|
+
try {
|
|
63
|
+
return truncateText(JSON.stringify(value), maxChars)
|
|
64
|
+
} catch {
|
|
65
|
+
return truncateText(String(value), maxChars)
|
|
66
|
+
}
|
|
67
|
+
}
|
|
68
|
+
|
|
69
|
+
function sanitizeLiveFeedEvent(filePath, event) {
|
|
70
|
+
const nextSeq = (liveFeedSequences.get(filePath) ?? 0) + 1
|
|
71
|
+
liveFeedSequences.set(filePath, nextSeq)
|
|
72
|
+
|
|
73
|
+
const normalized = {
|
|
74
|
+
seq: nextSeq,
|
|
75
|
+
timestamp: String(event?.timestamp ?? new Date().toISOString()),
|
|
76
|
+
iteration: Number(event?.iteration ?? 0),
|
|
77
|
+
retryCount: Number(event?.retryCount ?? 0),
|
|
78
|
+
reason: String(event?.reason ?? ''),
|
|
79
|
+
phase: String(event?.phase ?? ''),
|
|
80
|
+
role: String(event?.role ?? ''),
|
|
81
|
+
kind: String(event?.kind ?? ''),
|
|
82
|
+
type: String(event?.type ?? 'event'),
|
|
83
|
+
toolName: String(event?.toolName ?? ''),
|
|
84
|
+
isError: event?.isError === true,
|
|
85
|
+
text: truncateText(event?.text ?? '', MAX_LIVE_FEED_TEXT),
|
|
86
|
+
}
|
|
87
|
+
|
|
88
|
+
const argsSummary = summarizeValue(event?.args)
|
|
89
|
+
const partialSummary = summarizeValue(event?.partialResult)
|
|
90
|
+
const resultSummary = summarizeValue(event?.result)
|
|
91
|
+
if (argsSummary !== '') {
|
|
92
|
+
normalized.argsSummary = argsSummary
|
|
93
|
+
}
|
|
94
|
+
if (partialSummary !== '') {
|
|
95
|
+
normalized.partialSummary = partialSummary
|
|
96
|
+
}
|
|
97
|
+
if (resultSummary !== '') {
|
|
98
|
+
normalized.resultSummary = resultSummary
|
|
99
|
+
}
|
|
100
|
+
|
|
101
|
+
return normalized
|
|
102
|
+
}
|
|
103
|
+
|
|
44
104
|
async function appendLiveFeedEvent(config, event) {
|
|
45
105
|
if (!config.runLiveFeedFile) {
|
|
46
106
|
return
|
|
@@ -51,8 +111,9 @@ async function appendLiveFeedEvent(config, event) {
|
|
|
51
111
|
const next = previous
|
|
52
112
|
.catch(() => {})
|
|
53
113
|
.then(async () => {
|
|
114
|
+
const sanitized = sanitizeLiveFeedEvent(filePath, event)
|
|
54
115
|
await fs.mkdir(path.dirname(filePath), { recursive: true })
|
|
55
|
-
await fs.appendFile(filePath, `${JSON.stringify(
|
|
116
|
+
await fs.appendFile(filePath, `${JSON.stringify(sanitized)}\n`, 'utf8')
|
|
56
117
|
})
|
|
57
118
|
|
|
58
119
|
liveFeedWriteQueues.set(filePath, next)
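The two mechanisms added to `src/pi-client.mjs` above, bounded text and a per-file `seq` counter, can be exercised in isolation. In the sketch below, `truncateText` follows the diff; `nextSeq` and `TRUNCATION_SUFFIX` are hypothetical names for logic the source inlines in `sanitizeLiveFeedEvent` and `truncateText`.

```javascript
const MAX_LIVE_FEED_TEXT = 2000
const TRUNCATION_SUFFIX = '\n... [truncated]' // 16 characters, matching the source's `maxChars - 16`

function truncateText(value, maxChars) {
  const text = String(value ?? '')
  if (text.length <= maxChars) {
    return text
  }
  // Reserve room for the suffix so the result is exactly maxChars long.
  return `${text.slice(0, maxChars - TRUNCATION_SUFFIX.length)}${TRUNCATION_SUFFIX}`
}

// Hypothetical helper: one monotonic counter per live-feed file path.
const sequences = new Map()
function nextSeq(filePath) {
  const seq = (sequences.get(filePath) ?? 0) + 1
  sequences.set(filePath, seq)
  return seq
}

const clipped = truncateText('x'.repeat(5000), MAX_LIVE_FEED_TEXT)
console.log(clipped.length) // 2000
console.log(nextSeq('run-a.jsonl'), nextSeq('run-a.jsonl'), nextSeq('run-b.jsonl')) // 1 2 1
```

Since the counters live in a module-level `Map`, they are per process: a new process appending to an existing feed file would start again at 1 unless the counter is seeded from the file.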
|
|
package/src/pi-debug-live.mjs
@@ -0,0 +1,153 @@
|
|
|
1
|
+
#!/usr/bin/env node
|
|
2
|
+
|
|
3
|
+
import fs from 'node:fs/promises'
|
|
4
|
+
import path from 'node:path'
|
|
5
|
+
import process from 'node:process'
|
|
6
|
+
import { spawn, execFileSync } from 'node:child_process'
|
|
7
|
+
import { fileURLToPath } from 'node:url'
|
|
8
|
+
|
|
9
|
+
const scriptDir = path.dirname(fileURLToPath(import.meta.url))
|
|
10
|
+
const packageRoot = path.resolve(scriptDir, '..')
|
|
11
|
+
const cliFile = path.join(scriptDir, 'cli.mjs')
|
|
12
|
+
const fakePiFile = path.join(packageRoot, 'test', 'fixtures', 'fake-pi.mjs')
|
|
13
|
+
const fakeLiveSdkFile = path.join(packageRoot, 'test', 'fixtures', 'fake-live-pi-sdk.mjs')
|
|
14
|
+
const sandboxDir = path.join(packageRoot, '.pi-debug', 'live-ui')
|
|
15
|
+
const DEFAULT_TASK_COUNT = 12
|
|
16
|
+
|
|
17
|
+
function shellQuote(value) {
|
|
18
|
+
return JSON.stringify(String(value))
|
|
19
|
+
}
|
|
20
|
+
|
|
21
|
+
function readFlagValue(flag) {
|
|
22
|
+
const index = process.argv.indexOf(flag)
|
|
23
|
+
if (index === -1) {
|
|
24
|
+
return ''
|
|
25
|
+
}
|
|
26
|
+
return String(process.argv[index + 1] ?? '').trim()
|
|
27
|
+
}
|
|
28
|
+
|
|
29
|
+
function readScenario() {
|
|
30
|
+
const value = readFlagValue('--scenario') || process.env.PI_FAKE_LIVE_SCENARIO || 'default'
|
|
31
|
+
return String(value).trim() || 'default'
|
|
32
|
+
}
|
|
33
|
+
|
|
34
|
+
function readTaskCount() {
|
|
35
|
+
const raw = Number.parseInt(readFlagValue('--task-count') || process.env.PI_DEBUG_TASK_COUNT || `${DEFAULT_TASK_COUNT}`, 10)
|
|
36
|
+
return Number.isFinite(raw) && raw > 0 ? raw : DEFAULT_TASK_COUNT
|
|
37
|
+
}
|
|
38
|
+
|
|
39
|
+
function buildTodoLines(taskCount) {
|
|
40
|
+
const lines = []
|
|
41
|
+
for (let index = 1; index <= taskCount; index += 1) {
|
|
42
|
+
const phase = index <= Math.ceil(taskCount / 3)
|
|
43
|
+
? 'Phase 1'
|
|
44
|
+
: index <= Math.ceil((taskCount * 2) / 3)
|
|
45
|
+
? 'Phase 2'
|
|
46
|
+
: 'Phase 3'
|
|
47
|
+
const label = `Fake live task ${index}`
|
|
48
|
+
if (lines.length === 0 || lines[lines.length - 1] !== `## ${phase}`) {
|
|
49
|
+
if (lines.length > 0) {
|
|
50
|
+
lines.push('')
|
|
51
|
+
}
|
|
52
|
+
lines.push(`## ${phase}`)
|
|
53
|
+
lines.push('')
|
|
54
|
+
}
|
|
55
|
+
lines.push(`- [ ] ${label}`)
|
|
56
|
+
}
|
|
57
|
+
return `${lines.join('\n')}\n`
|
|
58
|
+
}
|
|
59
|
+
|
|
60
|
+
async function ensureRepo(cwd) {
|
|
61
|
+
try {
|
|
62
|
+
execFileSync('git', ['rev-parse', '--is-inside-work-tree'], { cwd, stdio: 'ignore' })
|
|
63
|
+
} catch {
|
|
64
|
+
execFileSync('git', ['init'], { cwd, stdio: 'ignore' })
|
|
65
|
+
execFileSync('git', ['config', 'user.name', 'PI Harness Debug'], { cwd, stdio: 'ignore' })
|
|
66
|
+
execFileSync('git', ['config', 'user.email', 'pi-harness-debug@example.com'], { cwd, stdio: 'ignore' })
|
|
67
|
+
}
|
|
68
|
+
}
|
|
69
|
+
|
|
70
|
+
async function seedFiles(cwd, { taskCount, scenario }) {
|
|
71
|
+
await fs.mkdir(path.join(cwd, 'pi'), { recursive: true })
|
|
72
|
+
await fs.writeFile(path.join(cwd, 'TODOS.md'), buildTodoLines(taskCount), 'utf8')
|
|
73
|
+
await fs.writeFile(path.join(cwd, 'DEVELOPER.md'), `Developer instructions for local visualizer debugging.\nScenario: ${scenario}\n`, 'utf8')
|
|
74
|
+
await fs.writeFile(path.join(cwd, 'TESTER.md'), `Tester instructions for local visualizer debugging.\nScenario: ${scenario}\n`, 'utf8')
|
|
75
|
+
await fs.writeFile(path.join(cwd, 'pi.config.json'), `${JSON.stringify({
|
|
76
|
+
transport: 'sdk',
|
|
77
|
+
taskFile: 'TODOS.md',
|
|
78
|
+
developerInstructionsFile: 'DEVELOPER.md',
|
|
79
|
+
testerInstructionsFile: 'TESTER.md',
|
|
80
|
+
piCli: fakePiFile,
|
|
81
|
+
piModel: 'fake-model',
|
|
82
|
+
roleModels: {
|
|
83
|
+
developer: 'fake-model',
|
|
84
|
+
developerRetry: 'fake-model',
|
|
85
|
+
developerFix: 'fake-model',
|
|
86
|
+
tester: 'fake-model',
|
|
87
|
+
testerCommit: 'fake-model',
|
|
88
|
+
},
|
|
89
|
+
testCommand: `${shellQuote(process.execPath)} -e ${shellQuote('setTimeout(()=>process.exit(0), 250)')}`,
|
|
90
|
+
streamTerminal: true,
|
|
91
|
+
continueAfterSeconds: 3600,
|
|
92
|
+
noEventTimeoutSeconds: 3600,
|
|
93
|
+
toolContinueAfterSeconds: 3600,
|
|
94
|
+
toolNoEventTimeoutSeconds: 3600,
|
|
95
|
+
sleepBetweenSeconds: 1,
|
|
96
|
+
maxIterations: Math.max(taskCount * 3, 20),
|
|
97
|
+
}, null, 2)}\n`, 'utf8')
|
|
98
|
+
}
|
|
99
|
+
|
|
100
|
+
async function ensureInitialCommit(cwd) {
|
|
101
|
+
try {
|
|
102
|
+
execFileSync('git', ['rev-parse', 'HEAD'], { cwd, stdio: 'ignore' })
|
|
103
|
+
} catch {
|
|
104
|
+
execFileSync('git', ['add', '.'], { cwd, stdio: 'ignore' })
|
|
105
|
+
execFileSync('git', ['commit', '-m', 'chore(debug): seed fake live sandbox'], { cwd, stdio: 'ignore' })
|
|
106
|
+
}
|
|
107
|
+
}
|
|
108
|
+
|
|
109
|
+
async function main() {
|
|
110
|
+
const reset = process.argv.includes('--reset')
|
|
111
|
+
const scenario = readScenario()
|
|
112
|
+
const taskCount = readTaskCount()
|
|
113
|
+
|
|
114
|
+
if (reset) {
|
|
115
|
+
await fs.rm(sandboxDir, { recursive: true, force: true })
|
|
116
|
+
}
|
|
117
|
+
|
|
118
|
+
await fs.mkdir(sandboxDir, { recursive: true })
|
|
119
|
+
await ensureRepo(sandboxDir)
|
|
120
|
+
await seedFiles(sandboxDir, { taskCount, scenario })
|
|
121
|
+
await ensureInitialCommit(sandboxDir)
|
|
122
|
+
|
|
123
|
+
process.stdout.write(`PI debug sandbox: ${sandboxDir}\n`)
|
|
124
|
+
process.stdout.write(`Using fake live SDK fixture: ${fakeLiveSdkFile}\n`)
|
|
125
|
+
process.stdout.write(`Scenario: ${scenario}\n`)
|
|
126
|
+
process.stdout.write(`Task count: ${taskCount}\n`)
|
|
127
|
+
|
|
128
|
+
const child = spawn(process.execPath, [cliFile, 'run'], {
|
|
129
|
+
cwd: sandboxDir,
|
|
130
|
+
env: {
|
|
131
|
+
...process.env,
|
|
132
|
+
PI_CONFIG_FILE: 'pi.config.json',
|
|
133
|
+
PI_SDK_MODULE: fakeLiveSdkFile,
|
|
134
|
+
PI_FAKE_LIVE_SCENARIO: scenario,
|
|
135
|
+
PI_VISUALIZER_HOST: process.env.PI_VISUALIZER_HOST || '127.0.0.1',
|
|
136
|
+
PI_VISUALIZER_PORT: process.env.PI_VISUALIZER_PORT || '4317',
|
|
137
|
+
},
|
|
138
|
+
stdio: 'inherit',
|
|
139
|
+
})
|
|
140
|
+
|
|
141
|
+
child.on('exit', (code, signal) => {
|
|
142
|
+
if (signal) {
|
|
143
|
+
process.exitCode = 128
|
|
144
|
+
return
|
|
145
|
+
}
|
|
146
|
+
process.exitCode = code ?? 1
|
|
147
|
+
})
|
|
148
|
+
}
|
|
149
|
+
|
|
150
|
+
main().catch((error) => {
|
|
151
|
+
console.error(error instanceof Error ? error.stack ?? error.message : String(error))
|
|
152
|
+
process.exitCode = 1
|
|
153
|
+
})
|
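In `buildTodoLines` above, the heading check compares `## ${phase}` against the last pushed line. A blank line is pushed right after every heading and a task line after that, so the last line never equals the heading, and as written a `## Phase N` heading appears to be emitted before every task. If one heading per phase is the intent (an assumption; the fake fixture may not depend on it), tracking the current phase in a variable is one way to get there. `currentPhase` is a hypothetical name that is not in the source.

```javascript
function buildTodoLines(taskCount) {
  const lines = []
  let currentPhase = '' // hypothetical: remembers the last heading emitted
  for (let index = 1; index <= taskCount; index += 1) {
    // Same thirds-based bucketing as the source.
    const phase = index <= Math.ceil(taskCount / 3)
      ? 'Phase 1'
      : index <= Math.ceil((taskCount * 2) / 3)
        ? 'Phase 2'
        : 'Phase 3'
    if (phase !== currentPhase) {
      if (lines.length > 0) {
        lines.push('')
      }
      lines.push(`## ${phase}`)
      lines.push('')
      currentPhase = phase
    }
    lines.push(`- [ ] Fake live task ${index}`)
  }
  return `${lines.join('\n')}\n`
}

console.log(buildTodoLines(6))
```

With `taskCount` set to 6, this yields three headings with two tasks under each.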
package/src/pi-prompts.mjs
CHANGED
|
@@ -119,6 +119,36 @@ function repoInstructionsAuthorityLine(config, instructionsFile, usesBundledInst
|
|
|
119
119
|
return `Repo-local instructions in ${displayPath(config, instructionsFile)} are the primary role contract. Follow them over package defaults when they differ.\n`
|
|
120
120
|
}
|
|
121
121
|
|
|
122
|
+
export function classifyTaskType(task) {
|
|
123
|
+
const text = String(task ?? '').trim().toLowerCase()
|
|
124
|
+
if (text === '') {
|
|
125
|
+
return 'general'
|
|
126
|
+
}
|
|
127
|
+
|
|
128
|
+
if (
|
|
129
|
+
/\b(write|add|create|implement|expand|improve|fix|update)\b.*\b(test|tests|coverage|regression test|spec|specs)\b/.test(text)
|
|
130
|
+
|| /\b(test|tests|coverage|regression test|spec|specs)\b.*\b(write|add|create|implement|expand|improve|fix|update)\b/.test(text)
|
|
131
|
+
) {
|
|
132
|
+
return 'test'
|
|
133
|
+
}
|
|
134
|
+
|
|
135
|
+
return 'general'
|
|
136
|
+
}
|
|
137
|
+
|
|
138
|
+
function formatTaskTypeGuidance(taskType) {
|
|
139
|
+
if (taskType !== 'test') {
|
|
140
|
+
return ''
|
|
141
|
+
}
|
|
142
|
+
|
|
143
|
+
return [
|
|
144
|
+
'Test-task guidance:',
|
|
145
|
+
'- This TODO is primarily test-focused. Do not fail solely because changes are mostly or entirely tests.',
|
|
146
|
+
'- PASS if the new or updated test adds meaningful behavioral or regression coverage and verification passes.',
|
|
147
|
+
'- FAIL if the test is brittle, redundant, weakly asserted, or not tied to real behavior.',
|
|
148
|
+
'- Prefer checking whether the test would have failed before the change, or whether developer notes justify why missing coverage mattered.',
|
|
149
|
+
].join('\n')
|
|
150
|
+
}
|
|
151
|
+
|
|
122
152
|
function testerPassOwnershipRules(config) {
|
|
123
153
|
if (config.commitMode === 'plan') {
|
|
124
154
|
return {
|
|
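To make the keyword heuristic in `classifyTaskType` concrete, the function from the hunk above is reproduced below and run against a few sample TODO strings. The sample tasks are invented for illustration.

```javascript
function classifyTaskType(task) {
  const text = String(task ?? '').trim().toLowerCase()
  if (text === '') {
    return 'general'
  }

  // A task counts as test-focused when an action verb and a test noun
  // both appear, in either order.
  if (
    /\b(write|add|create|implement|expand|improve|fix|update)\b.*\b(test|tests|coverage|regression test|spec|specs)\b/.test(text)
    || /\b(test|tests|coverage|regression test|spec|specs)\b.*\b(write|add|create|implement|expand|improve|fix|update)\b/.test(text)
  ) {
    return 'test'
  }

  return 'general'
}

// Invented sample tasks.
const samples = [
  'Add regression test for login redirect',
  'Coverage: expand parser edge cases',
  'Fix flaky login redirect',
  'Refactor the latest build script',
  'Update spec sheet for hardware',
]
for (const sample of samples) {
  console.log(`${classifyTaskType(sample)}\t${sample}`)
}
```

The first two samples classify as `test` and the next two as `general`; word boundaries keep `latest` from matching `test`. The last sample also classifies as `test`, which shows that the match is purely lexical and can trigger on non-testing uses of words such as "spec".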
@@ -353,6 +383,9 @@ export function buildTesterPrompt(config, {
|
|
|
353
383
|
developerNotes || '(none provided)',
|
|
354
384
|
configMaxLines(config, 'maxPromptNotesLines', 16),
|
|
355
385
|
)
|
|
386
|
+
const taskType = classifyTaskType(task)
|
|
387
|
+
const taskTypeLabel = taskType === 'test' ? 'test-focused' : 'general'
|
|
388
|
+
const taskTypeGuidance = formatTaskTypeGuidance(taskType)
|
|
356
389
|
const verificationCommand = config.testCommand.trim() === '' ? '(not configured)' : config.testCommand
|
|
357
390
|
const visualCaptureNote = config.visualReviewEnabled
|
|
358
391
|
? `\n- Keep the screenshot capture flow working so the harness still produces current visual artifacts for review.`
|
|
@@ -364,6 +397,7 @@ export function buildTesterPrompt(config, {
|
|
|
364
397
|
)
|
|
365
398
|
const passOwnership = testerPassOwnershipRules(config)
|
|
366
399
|
const largeFileRiskHint = formatLargeFileRiskHint(largeFileWarnings)
|
|
400
|
+
const taskTypeRuleBlock = taskTypeGuidance === '' ? '' : `${taskTypeGuidance}\n`
|
|
367
401
|
|
|
368
402
|
if (!config.usingBundledTesterInstructions) {
|
|
369
403
|
return `Read ${taskFile} and ${instructionsFile}.
|
|
@@ -375,6 +409,7 @@ You are the TESTER role. You are reviewing the most recent developer work from a
|
|
|
375
409
|
|
|
376
410
|
Current phase: ${phase}
|
|
377
411
|
Current task: ${task}
|
|
412
|
+
Current task type: ${taskTypeLabel}
|
|
378
413
|
Reason for this tester pass: ${reason}
|
|
379
414
|
|
|
380
415
|
Developer notes:
|
|
@@ -391,7 +426,7 @@ Rules:
|
|
|
391
426
|
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
392
427
|
- If blocked or inconclusive, return VERDICT: BLOCKED.
|
|
393
428
|
- Do not hide real bugs with brittle tests.
|
|
394
|
-
- ${passOwnership.successRule.slice(2)}
|
|
429
|
+
${taskTypeRuleBlock}- ${passOwnership.successRule.slice(2)}
|
|
395
430
|
- ${passOwnership.isolationRule.slice(2)}
|
|
396
431
|
- ${passOwnership.extraRule.slice(2)}${visualCaptureNote}
|
|
397
432
|
|
|
@@ -417,6 +452,7 @@ You are the TESTER role. You are reviewing the most recent developer work from a
|
|
|
417
452
|
|
|
418
453
|
Current phase: ${phase}
|
|
419
454
|
Current task: ${task}
|
|
455
|
+
Current task type: ${taskTypeLabel}
|
|
420
456
|
Reason for this tester pass: ${reason}
|
|
421
457
|
|
|
422
458
|
Developer notes:
|
|
@@ -433,7 +469,7 @@ ${indentBlock(innerLoopValidationRules(verificationCommand), '\t')}
|
|
|
433
469
|
- Prefer one focused browser-driven review pass.
|
|
434
470
|
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
435
471
|
- Do not hide real bugs with brittle tests.
|
|
436
|
-
- If blocked or inconclusive, return VERDICT: BLOCKED.
|
|
472
|
+
${taskTypeGuidance === '' ? '' : `${indentBlock(taskTypeGuidance, '\t')}\n`} - If blocked or inconclusive, return VERDICT: BLOCKED.
|
|
437
473
|
${indentBlock(passOwnership.successRule, '\t')}
|
|
438
474
|
${indentBlock(passOwnership.isolationRule, '\t')}
|
|
439
475
|
${indentBlock(passOwnership.extraRule, '\t')}${visualCaptureNote}
|