@sebastianandreasson/pi-autonomous-agents 0.5.0 → 0.5.1
- package/README.md +249 -81
- package/package.json +1 -1
package/README.md
CHANGED
@@ -1,58 +1,79 @@
-# PI
+# PI Autonomous Agents
 
-
+`@sebastianandreasson/pi-autonomous-agents` is an npm package for running a bounded unattended [PI](https://pi.dev/) workflow inside another repository.
 
-
-- a fast verification step
-- a skeptical `tester` pass
-- optional periodic multimodal visual review
-- tester-owned final commit by default
+It orchestrates:
 
-
+- a `developer` turn
+- a fast local verification step
+- an independent `tester` turn
+- an optional focused `developerFix` turn when verification/tester finds a real issue
+- optional periodic visual review from screenshots
 
-
+The package is intentionally generic. It handles supervision, prompts, runtime state, telemetry, retries, and guardrails. The consuming repo still owns its own tasks, instructions, tests, model endpoints, and screenshot capture flow.
 
-
-
+## Install
+
+```bash
+npm install -D @sebastianandreasson/pi-autonomous-agents
+```
+
+Then in the consuming repo, tell your agent:
+
+```text
+Find SETUP.md in @sebastianandreasson/pi-autonomous-agents and set everything up for this repository.
+```
+
+The package ships a top-level [SETUP.md](./SETUP.md) specifically for that workflow.
+
+## What This Package Owns
+
+- unattended loop orchestration
+- PI adapter integration
 - config loading
--
--
--
--
--
+- prompt assembly
+- verification/tester/visual-review handoff
+- timeout and loop guards
+- telemetry and run summaries
+- runtime isolation and stale-run recovery
 
-## What
+## What Each Repo Must Provide
 
 - `TODOS.md`
--
--
--
--
--
+- repo-specific `pi/DEVELOPER.md`
+- repo-specific `pi/TESTER.md`
+- a fast bounded `testCommand`
+- model configuration that actually matches the local/cloud providers in use
+- optionally a screenshot capture command for visual review
 
-##
+## Quick Start In A Repo
+
+The normal setup shape is:
 
 ```text
-
-
-
-
-
-
-
-
-
-
-
-
-pi-
-pi-
-pi-
-pi-
-pi-visual-once
-
+TODOS.md
+pi.config.json
+pi/
+  DEVELOPER.md
+  TESTER.md
+```
+
+Typical scripts:
+
+```json
+{
+  "scripts": {
+    "pi:mock": "PI_CONFIG_FILE=pi.config.json PI_TRANSPORT=mock PI_TEST_CMD= pi-harness once",
+    "pi:once": "PI_CONFIG_FILE=pi.config.json pi-harness once",
+    "pi:run": "PI_CONFIG_FILE=pi.config.json pi-harness run",
+    "pi:report": "PI_CONFIG_FILE=pi.config.json pi-harness report",
+    "pi:visual:once": "PI_CONFIG_FILE=pi.config.json pi-harness visual-once"
+  }
+}
 ```
 
+Start from [templates/pi.config.example.json](./templates/pi.config.example.json), [templates/DEVELOPER.md](./templates/DEVELOPER.md), [templates/TESTER.md](./templates/TESTER.md), and [templates/gitignore.fragment](./templates/gitignore.fragment).
+
 ## CLI
 
 ```bash
@@ -65,65 +86,212 @@ pi-harness adapter
 pi-harness visual-review-worker
 ```
 
-Use `PI_CONFIG_FILE` to point
-
-## Setup In Another Repo
-
-After installing the package:
+Use `PI_CONFIG_FILE` to point at the repo-local config file:
 
 ```bash
-
+PI_CONFIG_FILE=pi.config.json pi-harness once
 ```
 
-
-
-
-
+If `PI_CONFIG_FILE` is not set, the package falls back to the bundled generic [pi.config.json](./pi.config.json).
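
The fallback described above can be pictured with a small sketch. It is illustrative only and is not the package's code: the function name `resolve_config_path`, the `bundled_default` argument, and the choice to treat an empty `PI_CONFIG_FILE` as unset are all assumptions.

```python
import os
from pathlib import Path


def resolve_config_path(env, bundled_default):
    """Pick the config file: PI_CONFIG_FILE if set, else the bundled default.

    `env` is any mapping of environment variables (for example os.environ).
    Treating an empty value as "not set" is an assumption of this sketch.
    """
    override = env.get("PI_CONFIG_FILE", "").strip()
    if override:
        return Path(override)
    return Path(bundled_default)


if __name__ == "__main__":
    # The bundled path here is a placeholder, not the real install location.
    print(resolve_config_path(os.environ, "pi.config.json"))
```

Passing the environment in as a mapping keeps the lookup easy to exercise without touching the real process environment.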
+
+## Core Workflow
+
+Each real iteration works like this:
+
+1. `developer` implements one unchecked task from `TODOS.md`.
+2. The harness runs the configured fast verification command.
+3. If verification passes, `tester` reviews the change independently.
+4. If tester or verification fails, the findings go back to `developerFix` for one focused repair pass.
+5. If tester reaches `PASS`, tester creates the final commit directly by default.
+6. Every `N` successful iterations, optional visual review can inspect screenshots and veto the success if it finds a real problem.
+
+The default commit model is `commitMode: "agent"`. The older harness-managed parsed commit-plan flow still exists as `commitMode: "plan"`, but it is now a compatibility mode rather than the default.
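
As a reading aid, steps 1 to 4 of the iteration can be sketched as a small control flow. This is a sketch under stated assumptions, not the harness implementation: the four callables are hypothetical stand-ins for the agent turns and the verification command, and re-running verification and `tester` once after the repair pass is an assumption, since the README only says the findings go back to `developerFix`. Committing and visual review are left out.

```python
def run_iteration(developer, verify, tester, developer_fix):
    """Run one developer/verify/tester cycle with at most one repair pass.

    `verify` and `tester` are callables returning (ok, findings).
    All four callables are hypothetical stand-ins for the real turns.
    """

    def check():
        # The tester only runs when the fast verification step passes.
        ok, findings = verify()
        if not ok:
            return False, findings
        return tester()

    developer()
    ok, findings = check()
    if ok:
        return "PASS"

    # One focused repair pass, fed with whatever the failing step reported.
    developer_fix(findings)
    ok, _ = check()  # assumption: the same checks are repeated after the fix
    return "PASS" if ok else "FAIL"
```

Keeping the repair pass bounded to a single attempt mirrors the "one focused repair pass" wording in step 4.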
+
+## Recommended Model Setup
+
+The package supports:
+
+- one default text model via `piModel`
+- one default visual-review model via `visualReviewModel`
+- optional per-role overrides via `roleModels`
+- per-model endpoint config in `models`
+
+Typical pattern:
+
+- local model for `developer`
+- local model for `developerRetry`
+- local model for `developerFix`
+- local or slightly stronger model for `tester`
+- stronger frontier model only for `visualReview`
+
+Example:
+
+```json
+{
+  "piModel": "local/text-model",
+  "visualReviewModel": "cloud/vision-model",
+  "models": {
+    "local/text-model": {
+      "baseUrl": "http://localhost:8000/v1",
+      "apiKey": "local",
+      "vision": false
+    },
+    "local/tester-model": {
+      "baseUrl": "http://localhost:8000/v1",
+      "apiKey": "local",
+      "vision": false
+    },
+    "cloud/vision-model": {
+      "baseUrl": "https://api.openai.com/v1",
+      "apiKeyEnv": "OPENAI_API_KEY",
+      "vision": true
+    }
+  },
+  "roleModels": {
+    "developer": "local/text-model",
+    "developerRetry": "local/text-model",
+    "developerFix": "local/text-model",
+    "tester": "local/tester-model",
+    "visualReview": "cloud/vision-model"
+  }
+}
 ```
 
-
+Important:
+
+- do not guess model ids
+- if using a custom OpenAI-compatible provider, verify `<baseUrl>/models`
+- if using PI models directly, verify `pi --list-models`
+- if `PI_CODING_AGENT_DIR` points at a repo-local PI home, make sure it is bootstrapped and contains `models.json`
+
+The harness now preflights those checks before starting a real run.
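
One way to picture how the model fields relate is the sketch below. It assumes that a `roleModels` entry wins over the defaults, that `visualReview` falls back to `visualReviewModel`, and that every other role falls back to `piModel`; the README lists these fields but does not spell out the precedence, so treat the order as a guess. Both function names are hypothetical.

```python
def resolve_role_model(config, role):
    """Return the model id a role would use under the assumed precedence."""
    override = config.get("roleModels", {}).get(role)
    if override:
        return override
    if role == "visualReview":
        return config.get("visualReviewModel")
    return config.get("piModel")


def models_without_endpoint_config(config, roles):
    """List resolved model ids that have no entry under `models`.

    Such ids are not necessarily wrong: they may be PI models that need
    checking with `pi --list-models` instead of a `<baseUrl>/models` call.
    """
    defined = config.get("models", {})
    resolved = {resolve_role_model(config, role) for role in roles}
    return sorted(m for m in resolved if m and m not in defined)
```

A check like this only catches ids missing from the config file; it does not replace querying the provider itself.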
 
-
+## Important Config Fields
 
-
-
-
+Common fields in `pi.config.json`:
+
+- `taskFile`
+- `developerInstructionsFile`
+- `testerInstructionsFile`
+- `transport`
+- `adapterCommand`
+- `piModel`
+- `models`
+- `roleModels`
+- `commitMode`
+- `promptMode`
+- `testCommand`
+- `visualReviewEnabled`
+- `visualCaptureCommand`
+- `continueAfterSeconds`
+- `toolContinueAfterSeconds`
+- `noEventTimeoutSeconds`
+- `toolNoEventTimeoutSeconds`
+- `largeFileWarningLines`
+- `largeSpecWarningLines`
+
+Key defaults:
+
+- `transport`: `adapter`
+- `commitMode`: `agent`
+- `promptMode`: `compact`
+- `piTools`: `read,edit,write,find,ls,bash`
+- `continueAfterSeconds`: `300`
+- `toolContinueAfterSeconds`: `900`
+- `noEventTimeoutSeconds`: `900`
+- `toolNoEventTimeoutSeconds`: `1800`
+
+## Prompt and Tooling Behavior
+
+The package is optimized for local models by default:
+
+- prompts are compacted before handoff
+- changed-file lists and feedback excerpts are capped
+- prompts prefer `read` for source inspection
+- shell is intended for `git`, tests, and narrow diagnostics
+- the adapter warns on obvious oversized shell-based file reads
+- the supervisor emits large-file/spec warnings when touched files are getting risky
+
+This is deliberate. Large monolith files, huge e2e specs, and broad TODO items are one of the main causes of local-model drift and retry loops.
+
+Recommended repo shape:
+
+- keep TODO items very small and implementation-shaped
+- split giant stores/modules before they become constant edit hotspots
+- split ever-growing end-to-end specs into scenario files
+- keep the default `testCommand` to a bounded smoke check, not a multi-minute happy-path run
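
The large-file warning idea can be illustrated with a line-count check against a threshold such as `largeFileWarningLines`. This is a minimal sketch, not the supervisor's logic: the function name, the message format, and the use of `>=` are assumptions, and the README does not state the default threshold.

```python
def large_file_warnings(touched_files, warning_lines):
    """Return one warning string per touched file at or above the threshold.

    `touched_files` maps a path to that file's text; `warning_lines` plays
    the role of a setting like `largeFileWarningLines` (value is up to you).
    """
    warnings = []
    for path, text in sorted(touched_files.items()):
        line_count = len(text.splitlines())
        if line_count >= warning_lines:
            warnings.append(f"{path}: {line_count} lines (threshold {warning_lines})")
    return warnings
```

A repo could run a check like this over the files changed in an iteration to spot edit hotspots before they start causing retry loops.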
+
+## Runtime Isolation And Recovery
 
-
+Recent versions of the package isolate each run more aggressively:
 
-
-
-
+- active ownership lock at `.pi-runtime/active-run.json`
+- per-run runtime directory under `.pi-runtime/runs/<runId>/`
+- per-run PI sessions and telemetry
+- `runId` added to telemetry
+- in-progress iteration state persisted before agent work starts
+- stale run locks recovered when the owning PID is gone
+- timeout cleanup kills the full spawned process group, not only the direct child
 
-
+That is meant to prevent orphaned timed-out agents or concurrent supervisors from corrupting shared state.
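
Stale-lock recovery of the kind listed above usually comes down to asking whether the recorded process still exists. The sketch below shows that check on POSIX systems. It is not the package's implementation: the `pid` field name is an assumption about the contents of `.pi-runtime/active-run.json`, which the README does not document, and the signal-0 probe must not be used on Windows.

```python
import json
import os


def pid_is_alive(pid):
    """POSIX-only liveness probe: signal 0 checks existence without killing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def lock_is_stale(lock_text):
    """Return True when the lock's owning process is gone.

    `lock_text` is the JSON content of the lock file; the "pid" key is a
    hypothetical field name used only for this illustration.
    """
    lock = json.loads(lock_text)
    return not pid_is_alive(int(lock["pid"]))
```

A PID can be reused by an unrelated process, so a real check would likely compare more than the PID, for example a start time or the `runId`.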
 
-
-- `developerInstructionsFile`: per-project developer instructions
-- `testerInstructionsFile`: per-project tester instructions
-- `roleModels`: optional per-role model overrides
-- `commitMode`: `agent` by default, `plan` only for legacy harness-managed commit parsing
-- `promptMode`: `compact` by default
-- `testCommand`: fast verification command
-- `visualCaptureCommand`: project-defined screenshot capture command
-- `visualFeedbackFile`: latest visual-review handoff
-- `testerFeedbackFile`: latest tester-review handoff
+## Debugging Artifacts
 
-
+Useful files during a run:
 
-
+- `.pi-last-prompt.txt`
+  Exact assembled prompt for the current role.
+- `.pi-last-output.txt`
+  Latest agent output snapshot.
+- `.pi-last-verification.txt`
+  Latest verification output snapshot.
+- `.pi-last-iteration.json`
+  Structured summary of the last completed iteration.
+- `.pi-state.json`
+  Persistent harness state, including in-progress iteration data.
+- `pi.log`
+  Main run log.
+- `pi_telemetry.jsonl`
+- `pi_telemetry.csv`
+- `.pi-runtime/active-run.json`
+- `.pi-runtime/runs/<runId>/...`
 
-
+`pi-harness report` summarizes recent telemetry and surfaces things like terminal reasons and large-file warnings.
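
For ad hoc inspection outside `pi-harness report`, a JSON Lines telemetry file can be tallied with a few lines of code. The sketch below is illustrative only: the `terminalReason` field name is an assumption, since the README does not describe the record schema of `pi_telemetry.jsonl`.

```python
import json
from collections import Counter


def count_terminal_reasons(jsonl_text):
    """Count how often each terminal reason appears in JSON Lines telemetry.

    Blank lines are skipped; records without the (hypothetical)
    "terminalReason" field are ignored.
    """
    counts = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        reason = record.get("terminalReason")
        if reason:
            counts[reason] += 1
    return counts
```

Reading the file with `Path("pi_telemetry.jsonl").read_text()` and passing the result in would give a quick breakdown, assuming the field exists under that name.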
 
-
+## Visual Review Contract
 
-
+Visual review is optional and generic. The harness does not know how to navigate your app.
 
-
+If enabled, your repo must provide a real screenshot capture command that writes a manifest under the configured capture directory. The manifest shape is documented in [docs/PI_SUPERVISOR.md](./docs/PI_SUPERVISOR.md).
 
-
+Visual review should be used as a periodic audit, not as the default inner-loop gate.
 
-
+## Resetting Harness State
 
-
+If you want to wipe harness-generated state and start fresh:
+
+```bash
+PI_CONFIG_FILE=pi.config.json pi-harness clear-history
+```
+
+That clears configured harness runtime/history artifacts and verifies they are gone. It does not remove project source files.
+
+## Docs
+
+- [SETUP.md](./SETUP.md)
+  Agent-facing setup instructions for consuming repos.
+- [docs/PI_SUPERVISOR.md](./docs/PI_SUPERVISOR.md)
+  More detailed flow, adapter, and runtime documentation.
+- [templates/PROJECT_SETUP.md](./templates/PROJECT_SETUP.md)
+  Minimal consuming-repo layout summary.
+
+## Development
+
+In this package repo:
+
+```bash
+npm run check
+npm test
+```
 
-The
+The package requires Node `>=20`.
package/package.json
CHANGED