@sebastianandreasson/pi-autonomous-agents 0.3.0 → 0.4.0
- package/README.md +6 -2
- package/SETUP.md +3 -0
- package/docs/PI_SUPERVISOR.md +4 -2
- package/package.json +1 -1
- package/src/index.mjs +1 -0
- package/src/pi-config.mjs +3 -1
- package/src/pi-prompts.mjs +47 -0
- package/src/pi-repo.mjs +59 -0
- package/src/pi-report.mjs +11 -0
- package/src/pi-rpc-adapter.mjs +42 -0
- package/src/pi-supervisor.mjs +58 -1
- package/src/pi-telemetry.mjs +2 -1
- package/templates/DEVELOPER.md +3 -0
- package/templates/TESTER.md +7 -4
- package/templates/pi.config.example.json +2 -0
package/README.md
CHANGED
|
@@ -6,7 +6,7 @@
|
|
|
6
6
|
- a fast verification step
|
|
7
7
|
- a skeptical `tester` pass
|
|
8
8
|
- optional periodic multimodal visual review
|
|
9
|
-
-
|
|
9
|
+
- tester-owned final commit by default
|
|
10
10
|
|
|
11
11
|
The package is intentionally generic. It does not know how to navigate or test a specific app on its own.
|
|
12
12
|
|
|
@@ -18,7 +18,7 @@ The package is intentionally generic. It does not know how to navigate or test a
|
|
|
18
18
|
- telemetry
|
|
19
19
|
- loop guards, timeout guards, and retries
|
|
20
20
|
- tester feedback + visual feedback handoff
|
|
21
|
-
- harness
|
|
21
|
+
- optional legacy harness git finalize step for `commitMode: "plan"`
|
|
22
22
|
- multimodal visual review client
|
|
23
23
|
|
|
24
24
|
## What Stays Per Project
|
|
@@ -119,4 +119,8 @@ By default, successful tester passes should stage and create the commit directly
|
|
|
119
119
|
|
|
120
120
|
Prompt/context handoff is compact by default. The harness now caps prior feedback excerpts, changed-file lists, verification excerpts, and prompt note handoff. If needed, tune `maxPromptChangedFiles`, `maxVisualFeedbackLines`, `maxTesterFeedbackLines`, `maxPromptNotesLines`, and `maxVerificationExcerptLines`.
|
|
121
121
|
|
|
122
|
+
The default coding tool mix is now safer for local models: `read,edit,write,find,ls,bash`. Prompts explicitly steer source inspection toward `read` and reserve shell usage for `git`, tests, and narrow diagnostics.
|
|
123
|
+
|
|
124
|
+
The harness also emits lightweight large-file warnings for touched source/spec files and carries them into `.pi-last-iteration.json`, `pi-harness report`, and relevant prompts. Tune `largeFileWarningLines` and `largeSpecWarningLines` if needed.
|
|
125
|
+
|
|
122
126
|
The harness expects screenshot capture to produce a `manifest.json` plus image files under the configured visual capture directory.
|
package/SETUP.md
CHANGED
|
@@ -47,6 +47,7 @@ If the repo uses another package manager already, use the repo-native equivalent
|
|
|
47
47
|
- `developerInstructionsFile`: `pi/DEVELOPER.md`
|
|
48
48
|
- `testerInstructionsFile`: `pi/TESTER.md`
|
|
49
49
|
- `commitMode`: normally `agent`
|
|
50
|
+
- `promptMode`: normally `compact`
|
|
50
51
|
- `testCommand`: a fast bounded verification command for this repo
|
|
51
52
|
- `visualCaptureCommand`: only if this repo has a real screenshot capture flow
|
|
52
53
|
- `models` / `piModel` / `visualReviewModel` / `roleModels`: configure the models actually available in this environment
|
|
@@ -125,6 +126,7 @@ Recommended pattern:
|
|
|
125
126
|
- local or slightly stronger model for `tester`
|
|
126
127
|
- stronger frontier model for `visualReview` only if available
|
|
127
128
|
- keep `commitMode` as `agent` unless the repo explicitly needs legacy harness-managed commit-plan parsing
|
|
129
|
+
- keep large-file thresholds sensible for local models (`largeFileWarningLines`, `largeSpecWarningLines`)
|
|
128
130
|
|
|
129
131
|
Example shape:
|
|
130
132
|
|
|
@@ -192,6 +194,7 @@ For flow debugging, inspect `.pi-last-iteration.json` after a run. It summarizes
|
|
|
192
194
|
- Do not enable visual review unless the repo actually has a usable capture command and model config.
|
|
193
195
|
- Keep changes minimal and local to harness setup.
|
|
194
196
|
- Prefer very small, implementation-shaped TODO items for local models. Broad tasks tend to create long turns, retries, and weak tester behavior.
|
|
197
|
+
- Prefer `read` for code inspection and keep shell usage focused on `git`, tests, and narrow diagnostics, especially for weaker local models.
|
|
195
198
|
|
|
196
199
|
## What To Report Back
|
|
197
200
|
|
package/docs/PI_SUPERVISOR.md
CHANGED
|
@@ -30,7 +30,7 @@ Main package files:
|
|
|
30
30
|
- `src/pi-client.mjs`: transport layer
|
|
31
31
|
- `src/pi-rpc-adapter.mjs`: built-in adapter from supervisor JSON to `pi --mode rpc`
|
|
32
32
|
- `src/pi-config.mjs`: config loader
|
|
33
|
-
- `src/pi-repo.mjs`: repo helpers, verification runner, git finalize step
|
|
33
|
+
- `src/pi-repo.mjs`: repo helpers, verification runner, and optional legacy git finalize step
|
|
34
34
|
- `src/pi-telemetry.mjs`: telemetry writer/reader
|
|
35
35
|
- `src/pi-prompts.mjs`: default prompt builders
|
|
36
36
|
- `src/pi-visual-review.mjs`: multimodal visual-review worker
|
|
@@ -126,7 +126,7 @@ Request shape:
|
|
|
126
126
|
"runtimeDir": "/absolute/repo/path/.pi-runtime",
|
|
127
127
|
"piCli": "pi",
|
|
128
128
|
"model": "local/model-name",
|
|
129
|
-
"tools": "read,
|
|
129
|
+
"tools": "read,edit,write,find,ls,bash",
|
|
130
130
|
"thinking": "",
|
|
131
131
|
"noExtensions": false,
|
|
132
132
|
"noSkills": false,
|
|
@@ -170,6 +170,8 @@ The default flow keeps commit ownership with the active agent:
|
|
|
170
170
|
|
|
171
171
|
If a repo explicitly needs the older harness-managed commit-plan flow, set `commitMode` to `plan`. In that mode, `testerCommit` and parsed commit plans are used as a compatibility path rather than the default.
|
|
172
172
|
|
|
173
|
+
For source inspection, prompts prefer `read` and reserve shell usage for `git`, tests, and narrow diagnostics. Large shell file reads are more likely to truncate under context pressure than focused `read` calls.
|
|
174
|
+
|
|
173
175
|
## Persistent Handoffs
|
|
174
176
|
|
|
175
177
|
The harness persists two cross-iteration handoff files:
|
package/package.json
CHANGED
package/src/index.mjs
CHANGED
package/src/pi-config.mjs
CHANGED
|
@@ -258,7 +258,9 @@ export function loadConfig(mode = 'once') {
|
|
|
258
258
|
maxTesterFeedbackLines: readInt('PI_MAX_TESTER_FEEDBACK_LINES', file.maxTesterFeedbackLines, 32),
|
|
259
259
|
maxPromptNotesLines: readInt('PI_MAX_PROMPT_NOTES_LINES', file.maxPromptNotesLines, 16),
|
|
260
260
|
maxVerificationExcerptLines: readInt('PI_MAX_VERIFICATION_EXCERPT_LINES', file.maxVerificationExcerptLines, 40),
|
|
261
|
-
|
|
261
|
+
largeFileWarningLines: readInt('PI_LARGE_FILE_WARNING_LINES', file.largeFileWarningLines, 500),
|
|
262
|
+
largeSpecWarningLines: readInt('PI_LARGE_SPEC_WARNING_LINES', file.largeSpecWarningLines, 300),
|
|
263
|
+
piTools: readString('PI_TOOLS', file.piTools, 'read,edit,write,find,ls,bash'),
|
|
262
264
|
piThinking: readString('PI_THINKING', file.piThinking, ''),
|
|
263
265
|
piNoExtensions: readBool('PI_NO_EXTENSIONS', file.piNoExtensions, false),
|
|
264
266
|
piNoSkills: readBool('PI_NO_SKILLS', file.piNoSkills, false),
|
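The three new keys in `loadConfig` follow the existing call shape: an environment variable name, the config-file value, and a default. The bodies of `readInt` and `readString` are not part of this diff, so the sketch below only illustrates one plausible precedence (environment variable, then config file, then default). `readIntSketch` and the sample values are hypothetical and are not the package's implementation.

```javascript
// Hypothetical stand-in for the package's readInt helper, which this diff does not show.
// Assumed precedence: environment variable, then config-file value, then default.
function readIntSketch(env, name, fileValue, fallback) {
  for (const candidate of [env[name], fileValue]) {
    const parsed = Number.parseInt(String(candidate ?? ''), 10)
    if (Number.isFinite(parsed)) {
      return parsed
    }
  }
  return fallback
}

const sampleFileConfig = { largeFileWarningLines: 800 } // as if read from pi.config.json
const sampleEnv = { PI_LARGE_SPEC_WARNING_LINES: '200' } // as if taken from process.env

const thresholds = {
  largeFileWarningLines: readIntSketch(sampleEnv, 'PI_LARGE_FILE_WARNING_LINES', sampleFileConfig.largeFileWarningLines, 500),
  largeSpecWarningLines: readIntSketch(sampleEnv, 'PI_LARGE_SPEC_WARNING_LINES', sampleFileConfig.largeSpecWarningLines, 300),
}
console.log(thresholds)
```

Under that assumed precedence, the file value wins for `largeFileWarningLines` because no environment variable is set, and the environment value wins for `largeSpecWarningLines`.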
package/src/pi-prompts.mjs
CHANGED
|
@@ -40,6 +40,20 @@ function formatChangedFilesSection(files, maxFiles) {
|
|
|
40
40
|
return lines.join('\n')
|
|
41
41
|
}
|
|
42
42
|
|
|
43
|
+
function formatLargeFileRiskHint(warnings) {
|
|
44
|
+
const list = Array.isArray(warnings) ? warnings.filter(Boolean) : []
|
|
45
|
+
if (list.length === 0) {
|
|
46
|
+
return ''
|
|
47
|
+
}
|
|
48
|
+
|
|
49
|
+
const lines = list
|
|
50
|
+
.slice(0, 3)
|
|
51
|
+
.map((warning) => `- ${warning.file} (${warning.lineCount} lines${warning.kind === 'large_spec' ? ', spec' : ''})`)
|
|
52
|
+
.join('\n')
|
|
53
|
+
|
|
54
|
+
return `\nLarge file risk in touched files:\n${lines}\nPrefer helper extraction, smaller scoped edits, or test splitting over broad in-place edits.\n`
|
|
55
|
+
}
|
|
56
|
+
|
|
43
57
|
function displayPath(config, filePath) {
|
|
44
58
|
const relativePath = path.relative(config.cwd, filePath)
|
|
45
59
|
if (
|
|
@@ -160,6 +174,9 @@ Harness rules:
|
|
|
160
174
|
- Start by checking git status so you know whether unrelated changes already exist.
|
|
161
175
|
- Update code, config, and docs only as needed for the selected task.
|
|
162
176
|
- Tick only the checkbox items that are actually completed.
|
|
177
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
178
|
+
- Do not build edits from large sed/grep output or from memory after partial shell reads.
|
|
179
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
163
180
|
- If blocked, add a brief note directly under the relevant task in ${taskFile} explaining the blocker, then stop.
|
|
164
181
|
- Do not create the final commit during the developer pass.
|
|
165
182
|
${staleEditRecoveryRules()}
|
|
@@ -180,6 +197,9 @@ Rules:
|
|
|
180
197
|
- Start with git status.
|
|
181
198
|
- Select the first unchecked actionable checkbox in phase order.
|
|
182
199
|
- Keep changes minimal and scoped.
|
|
200
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
201
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
202
|
+
- Do not edit from memory after partial shell output.
|
|
183
203
|
- Tick only completed items.
|
|
184
204
|
- If blocked, note it under the task in ${taskFile} and stop.
|
|
185
205
|
- Do not touch lockfiles, generated files, or unrelated assets.
|
|
@@ -203,11 +223,13 @@ export function buildFixPrompt(config, recentVerificationOutput, options = {}) {
|
|
|
203
223
|
config.usingBundledDeveloperInstructions,
|
|
204
224
|
)
|
|
205
225
|
const findings = clampLines(recentVerificationOutput, configMaxLines(config, 'maxVerificationExcerptLines', 40))
|
|
226
|
+
const largeFileRiskHint = formatLargeFileRiskHint(options.largeFileWarnings)
|
|
206
227
|
|
|
207
228
|
if (!config.usingBundledDeveloperInstructions) {
|
|
208
229
|
return `Read ${taskFile} and ${instructionsFile}.
|
|
209
230
|
${authorityLine}${visualFeedbackSection}
|
|
210
231
|
${testerFeedbackSection}
|
|
232
|
+
${largeFileRiskHint}
|
|
211
233
|
|
|
212
234
|
The tester step found a real problem in the current implementation. Fix only the product behavior related to the current phase and current task.
|
|
213
235
|
|
|
@@ -218,6 +240,9 @@ Harness rules:
|
|
|
218
240
|
- Start by checking git status so you know which files are already dirty.
|
|
219
241
|
- Do not paper over product bugs by weakening tests.
|
|
220
242
|
- Keep changes minimal and focused on the failing behavior.
|
|
243
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
244
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
245
|
+
- Do not edit from memory after partial shell output.
|
|
221
246
|
- Do not perform speculative cleanup or unrelated refactors in this pass.
|
|
222
247
|
- Do not create the final commit during the developer fix pass.
|
|
223
248
|
${staleEditRecoveryRules()}
|
|
@@ -230,6 +255,7 @@ Before stopping:
|
|
|
230
255
|
return `Read ${taskFile} and ${instructionsFile}.
|
|
231
256
|
${authorityLine}${visualFeedbackSection}
|
|
232
257
|
${testerFeedbackSection}
|
|
258
|
+
${largeFileRiskHint}
|
|
233
259
|
|
|
234
260
|
The tester step found a real problem in the current implementation. Fix only the product behavior related to the current phase and current task.
|
|
235
261
|
|
|
@@ -240,6 +266,9 @@ Rules:
|
|
|
240
266
|
- Start with git status.
|
|
241
267
|
- Keep the fix narrow.
|
|
242
268
|
- Do not weaken tests to hide product bugs.
|
|
269
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
270
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
271
|
+
- Do not edit from memory after partial shell output.
|
|
243
272
|
- Do not perform speculative cleanup or unrelated refactors.
|
|
244
273
|
- Do not create the final commit.
|
|
245
274
|
${staleEditRecoveryRules()}
|
|
@@ -259,12 +288,14 @@ export function buildSteeringPrompt(config, reason, options = {}) {
|
|
|
259
288
|
config.developerInstructionsFile,
|
|
260
289
|
config.usingBundledDeveloperInstructions,
|
|
261
290
|
)
|
|
291
|
+
const largeFileRiskHint = formatLargeFileRiskHint(options.largeFileWarnings)
|
|
262
292
|
|
|
263
293
|
if (!config.usingBundledDeveloperInstructions) {
|
|
264
294
|
return `Continue from the current repo state.
|
|
265
295
|
Read ${taskFile} and ${instructionsFile}.
|
|
266
296
|
${authorityLine}${visualFeedbackSection}
|
|
267
297
|
${testerFeedbackSection}
|
|
298
|
+
${largeFileRiskHint}
|
|
268
299
|
|
|
269
300
|
Reason for this follow-up: ${reason}
|
|
270
301
|
|
|
@@ -272,9 +303,11 @@ Select the first unchecked actionable checkbox in the current phase, complete on
|
|
|
272
303
|
|
|
273
304
|
Additional harness guardrails:
|
|
274
305
|
- Start by checking git status.
|
|
306
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
275
307
|
- Do not repeat the same tool call over and over.
|
|
276
308
|
- If you already read a file, use that context instead of rereading it unless something changed.
|
|
277
309
|
- If an edit fails once, reread the file before retrying. Do not repeat the same exact edit attempt.
|
|
310
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
278
311
|
- If you are stuck, make the smallest decisive next action or stop and state the blocker.`
|
|
279
312
|
}
|
|
280
313
|
|
|
@@ -282,15 +315,18 @@ Additional harness guardrails:
|
|
|
282
315
|
Read ${taskFile} and ${instructionsFile}.
|
|
283
316
|
${authorityLine}${visualFeedbackSection}
|
|
284
317
|
${testerFeedbackSection}
|
|
318
|
+
${largeFileRiskHint}
|
|
285
319
|
|
|
286
320
|
Reason for this follow-up: ${reason}
|
|
287
321
|
|
|
288
322
|
Select the first unchecked actionable checkbox in the current phase, complete one coherent task, tick completed items, run verification, and stop.
|
|
289
323
|
|
|
290
324
|
Additional guardrails:
|
|
325
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
291
326
|
- Do not repeat the same tool call over and over.
|
|
292
327
|
- If you already read a file, use that context instead of rereading it unless something changed.
|
|
293
328
|
- If an edit fails once, reread the file before retrying. Do not repeat the same exact edit attempt.
|
|
329
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
294
330
|
- Prefer the configured smoke verification path and one narrow targeted check over long full-flow Playwright specs.
|
|
295
331
|
- If you are stuck, make the smallest decisive next action or stop and state the blocker.`
|
|
296
332
|
}
|
|
@@ -303,6 +339,7 @@ export function buildTesterPrompt(config, {
|
|
|
303
339
|
reason = 'tester_review',
|
|
304
340
|
visualFeedback = '',
|
|
305
341
|
testerFeedback = '',
|
|
342
|
+
largeFileWarnings = [],
|
|
306
343
|
}) {
|
|
307
344
|
const taskFile = displayPath(config, config.taskFile)
|
|
308
345
|
const instructionsFile = displayPath(config, config.testerInstructionsFile)
|
|
@@ -326,11 +363,13 @@ export function buildTesterPrompt(config, {
|
|
|
326
363
|
config.usingBundledTesterInstructions,
|
|
327
364
|
)
|
|
328
365
|
const passOwnership = testerPassOwnershipRules(config)
|
|
366
|
+
const largeFileRiskHint = formatLargeFileRiskHint(largeFileWarnings)
|
|
329
367
|
|
|
330
368
|
if (!config.usingBundledTesterInstructions) {
|
|
331
369
|
return `Read ${taskFile} and ${instructionsFile}.
|
|
332
370
|
${authorityLine}${visualFeedbackSection}
|
|
333
371
|
${testerFeedbackSection}
|
|
372
|
+
${largeFileRiskHint}
|
|
334
373
|
|
|
335
374
|
You are the TESTER role. You are reviewing the most recent developer work from an independent quality and functionality perspective.
|
|
336
375
|
|
|
@@ -348,6 +387,8 @@ Rules:
|
|
|
348
387
|
- Start with git status.
|
|
349
388
|
- Follow repo-local tester instructions for what to verify and which commands to run.
|
|
350
389
|
- Prefer one focused review pass.
|
|
390
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
391
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
351
392
|
- If blocked or inconclusive, return VERDICT: BLOCKED.
|
|
352
393
|
- Do not hide real bugs with brittle tests.
|
|
353
394
|
- ${passOwnership.successRule.slice(2)}
|
|
@@ -370,6 +411,7 @@ Before stopping, end your final response with exactly one verdict line:
|
|
|
370
411
|
return `Read ${taskFile} and ${instructionsFile}.
|
|
371
412
|
${authorityLine}${visualFeedbackSection}
|
|
372
413
|
${testerFeedbackSection}
|
|
414
|
+
${largeFileRiskHint}
|
|
373
415
|
|
|
374
416
|
You are the TESTER role. You are reviewing the most recent developer work from an independent quality and functionality perspective.
|
|
375
417
|
|
|
@@ -385,9 +427,11 @@ ${changedFilesSection}
|
|
|
385
427
|
|
|
386
428
|
Rules:
|
|
387
429
|
- Start with git status.
|
|
430
|
+
- Use read for source inspection. Use bash only for git, tests, and narrow diagnostics.
|
|
388
431
|
- Run the repo verification command yourself: ${verificationCommand}
|
|
389
432
|
${indentBlock(innerLoopValidationRules(verificationCommand), '\t')}
|
|
390
433
|
- Prefer one focused browser-driven review pass.
|
|
434
|
+
- If a snippet seems incomplete, reread a smaller exact window with read instead of another large overlapping shell range.
|
|
391
435
|
- Do not hide real bugs with brittle tests.
|
|
392
436
|
- If blocked or inconclusive, return VERDICT: BLOCKED.
|
|
393
437
|
${indentBlock(passOwnership.successRule, '\t')}
|
|
@@ -415,6 +459,7 @@ export function buildCommitPrompt(config, {
|
|
|
415
459
|
reason = 'tester_passed_without_commit',
|
|
416
460
|
visualFeedback = '',
|
|
417
461
|
testerFeedback = '',
|
|
462
|
+
largeFileWarnings = [],
|
|
418
463
|
}) {
|
|
419
464
|
const taskFile = displayPath(config, config.taskFile)
|
|
420
465
|
const instructionsFile = displayPath(config, config.testerInstructionsFile)
|
|
@@ -433,10 +478,12 @@ export function buildCommitPrompt(config, {
|
|
|
433
478
|
developerNotes || '(none provided)',
|
|
434
479
|
configMaxLines(config, 'maxPromptNotesLines', 16),
|
|
435
480
|
)
|
|
481
|
+
const largeFileRiskHint = formatLargeFileRiskHint(largeFileWarnings)
|
|
436
482
|
|
|
437
483
|
return `Read ${taskFile} and ${instructionsFile}.
|
|
438
484
|
${authorityLine}${visualFeedbackSection}
|
|
439
485
|
${testerFeedbackSection}
|
|
486
|
+
${largeFileRiskHint}
|
|
440
487
|
|
|
441
488
|
You are the TESTER role. The implementation already passed functional review, but the final commit was not created.
|
|
442
489
|
|
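To see what the new prompt section looks like, the sketch below copies `formatLargeFileRiskHint` from the added lines above and feeds it sample data. The warning objects (file names and line counts) are made up for illustration; only the function body comes from the diff.

```javascript
// Copied from the added lines in src/pi-prompts.mjs.
function formatLargeFileRiskHint(warnings) {
  const list = Array.isArray(warnings) ? warnings.filter(Boolean) : []
  if (list.length === 0) {
    return ''
  }

  const lines = list
    .slice(0, 3)
    .map((warning) => `- ${warning.file} (${warning.lineCount} lines${warning.kind === 'large_spec' ? ', spec' : ''})`)
    .join('\n')

  return `\nLarge file risk in touched files:\n${lines}\nPrefer helper extraction, smaller scoped edits, or test splitting over broad in-place edits.\n`
}

// Hypothetical warnings; the file names and counts are invented for this example.
const sampleWarnings = [
  { file: 'src/app.mjs', lineCount: 912, kind: 'large_file' },
  { file: 'e2e/checkout.spec.ts', lineCount: 640, kind: 'large_spec' },
  { file: 'src/router.mjs', lineCount: 577, kind: 'large_file' },
  { file: 'src/store.mjs', lineCount: 501, kind: 'large_file' },
]

const hint = formatLargeFileRiskHint(sampleWarnings)
console.log(hint)
```

Only the first three warnings reach the prompt, and an empty or missing list yields an empty string, so the `${largeFileRiskHint}` placeholder adds no hint text when nothing was flagged.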
package/src/pi-repo.mjs
CHANGED
|
@@ -225,6 +225,65 @@ export function findFirstUncheckedTaskInfo(taskFile) {
|
|
|
225
225
|
}
|
|
226
226
|
}
|
|
227
227
|
|
|
228
|
+
function countLines(text) {
|
|
229
|
+
const normalized = String(text ?? '')
|
|
230
|
+
if (normalized === '') {
|
|
231
|
+
return 0
|
|
232
|
+
}
|
|
233
|
+
return normalized.split('\n').length
|
|
234
|
+
}
|
|
235
|
+
|
|
236
|
+
function isSpecLikeFile(filePath) {
|
|
237
|
+
const normalized = String(filePath ?? '').replaceAll('\\', '/')
|
|
238
|
+
return /(^|\/)(e2e|test|tests|spec|specs)\//.test(normalized)
|
|
239
|
+
|| /\.(spec|test)\.[cm]?[jt]sx?$/.test(normalized)
|
|
240
|
+
}
|
|
241
|
+
|
|
242
|
+
export function collectLargeFileWarnings(cwd, files, {
|
|
243
|
+
largeFileWarningLines = 500,
|
|
244
|
+
largeSpecWarningLines = 300,
|
|
245
|
+
} = {}) {
|
|
246
|
+
const warnings = []
|
|
247
|
+
const seen = new Set()
|
|
248
|
+
|
|
249
|
+
for (const file of Array.isArray(files) ? files : []) {
|
|
250
|
+
const relativePath = String(file ?? '').trim()
|
|
251
|
+
if (relativePath === '' || seen.has(relativePath)) {
|
|
252
|
+
continue
|
|
253
|
+
}
|
|
254
|
+
seen.add(relativePath)
|
|
255
|
+
|
|
256
|
+
const absolutePath = path.resolve(cwd, relativePath)
|
|
257
|
+
let raw = ''
|
|
258
|
+
try {
|
|
259
|
+
raw = readFileSync(absolutePath, 'utf8')
|
|
260
|
+
} catch {
|
|
261
|
+
continue
|
|
262
|
+
}
|
|
263
|
+
|
|
264
|
+
const lineCount = countLines(raw)
|
|
265
|
+
const isSpec = isSpecLikeFile(relativePath)
|
|
266
|
+
if (isSpec && lineCount >= largeSpecWarningLines) {
|
|
267
|
+
warnings.push({
|
|
268
|
+
file: relativePath,
|
|
269
|
+
lineCount,
|
|
270
|
+
kind: 'large_spec',
|
|
271
|
+
})
|
|
272
|
+
continue
|
|
273
|
+
}
|
|
274
|
+
|
|
275
|
+
if (lineCount >= largeFileWarningLines) {
|
|
276
|
+
warnings.push({
|
|
277
|
+
file: relativePath,
|
|
278
|
+
lineCount,
|
|
279
|
+
kind: 'large_file',
|
|
280
|
+
})
|
|
281
|
+
}
|
|
282
|
+
}
|
|
283
|
+
|
|
284
|
+
return warnings.sort((left, right) => right.lineCount - left.lineCount)
|
|
285
|
+
}
|
|
286
|
+
|
|
228
287
|
export async function runShellCommand({
|
|
229
288
|
cwd,
|
|
230
289
|
command,
|
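The classification rules in `collectLargeFileWarnings` depend on two small helpers. The sketch below copies `countLines` and `isSpecLikeFile` from the added lines above and runs them on a few sample inputs; the paths are hypothetical and chosen only to exercise the directory and suffix patterns.

```javascript
// Both helpers are copied from the added lines in src/pi-repo.mjs.
function countLines(text) {
  const normalized = String(text ?? '')
  if (normalized === '') {
    return 0
  }
  return normalized.split('\n').length
}

function isSpecLikeFile(filePath) {
  const normalized = String(filePath ?? '').replaceAll('\\', '/')
  return /(^|\/)(e2e|test|tests|spec|specs)\//.test(normalized)
    || /\.(spec|test)\.[cm]?[jt]sx?$/.test(normalized)
}

// Hypothetical paths; none of them come from the package itself.
const specChecks = Object.fromEntries([
  'e2e/login.ts',
  'src\\tests\\cart.js',
  'src/cart.test.mjs',
  'src/contest/score.js',
  'src/cart.test.py',
].map((filePath) => [filePath, isSpecLikeFile(filePath)]))
console.log(specChecks)

// A trailing newline adds one to the count, so the result can differ from `wc -l`.
const lineCounts = [countLines('a\nb'), countLines('a\nb\n'), countLines('')]
console.log(lineCounts)
```

Two details follow from the code as written: a directory such as `contest/` does not match the `test/` pattern because the match must start at a path separator, and the suffix pattern only covers JavaScript and TypeScript extensions, so a file like `cart.test.py` is measured against `largeFileWarningLines` rather than `largeSpecWarningLines`.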
package/src/pi-report.mjs
CHANGED
|
@@ -35,6 +35,17 @@ async function main() {
|
|
|
35
35
|
console.log(`- ${kind}: ${count}`)
|
|
36
36
|
}
|
|
37
37
|
|
|
38
|
+
const iterationSummaries = recent.filter((event) => event.kind === 'iteration_summary')
|
|
39
|
+
const warningsByIteration = iterationSummaries
|
|
40
|
+
.filter((event) => String(event.riskWarnings ?? '').trim() !== '')
|
|
41
|
+
|
|
42
|
+
if (warningsByIteration.length > 0) {
|
|
43
|
+
console.log('\nLarge file warnings:')
|
|
44
|
+
for (const event of warningsByIteration.slice(-5)) {
|
|
45
|
+
console.log(`- iteration ${event.iteration}: ${event.riskWarnings}`)
|
|
46
|
+
}
|
|
47
|
+
}
|
|
48
|
+
|
|
38
49
|
const last = recent.at(-1)
|
|
39
50
|
if (!last) {
|
|
40
51
|
return
|
package/src/pi-rpc-adapter.mjs
CHANGED
|
@@ -54,6 +54,44 @@ function extractToolTarget(toolName, args) {
|
|
|
54
54
|
return ''
|
|
55
55
|
}
|
|
56
56
|
|
|
57
|
+
function extractShellCommand(args) {
|
|
58
|
+
if (!args || typeof args !== 'object') {
|
|
59
|
+
return ''
|
|
60
|
+
}
|
|
61
|
+
|
|
62
|
+
if (typeof args.command === 'string') {
|
|
63
|
+
return args.command
|
|
64
|
+
}
|
|
65
|
+
|
|
66
|
+
if (typeof args.cmd === 'string') {
|
|
67
|
+
return args.cmd
|
|
68
|
+
}
|
|
69
|
+
|
|
70
|
+
return ''
|
|
71
|
+
}
|
|
72
|
+
|
|
73
|
+
function isLargeShellRead(command) {
|
|
74
|
+
const text = String(command ?? '').trim()
|
|
75
|
+
if (text === '') {
|
|
76
|
+
return false
|
|
77
|
+
}
|
|
78
|
+
|
|
79
|
+
if (/^\s*cat\s+\S+/.test(text)) {
|
|
80
|
+
return true
|
|
81
|
+
}
|
|
82
|
+
|
|
83
|
+
const sedMatch = text.match(/sed\s+-n\s+['"]?(\d+)\s*,\s*(\d+)p['"]?/)
|
|
84
|
+
if (sedMatch) {
|
|
85
|
+
const start = Number.parseInt(sedMatch[1], 10)
|
|
86
|
+
const end = Number.parseInt(sedMatch[2], 10)
|
|
87
|
+
if (Number.isFinite(start) && Number.isFinite(end) && end >= start) {
|
|
88
|
+
return (end - start) >= 120
|
|
89
|
+
}
|
|
90
|
+
}
|
|
91
|
+
|
|
92
|
+
return false
|
|
93
|
+
}
|
|
94
|
+
|
|
57
95
|
function extractAssistantText(message) {
|
|
58
96
|
if (!message || message.role !== 'assistant' || !Array.isArray(message.content)) {
|
|
59
97
|
return ''
|
|
@@ -295,6 +333,7 @@ async function run() {
|
|
|
295
333
|
activeToolName = String(data.toolName ?? '')
|
|
296
334
|
activeToolStartedAt = Date.now()
|
|
297
335
|
const target = extractToolTarget(data.toolName, data.args)
|
|
336
|
+
const shellCommand = data.toolName === 'bash' ? extractShellCommand(data.args) : ''
|
|
298
337
|
if (signature === lastToolSignature) {
|
|
299
338
|
repeatedToolCount += 1
|
|
300
339
|
} else {
|
|
@@ -325,6 +364,9 @@ async function run() {
|
|
|
325
364
|
}
|
|
326
365
|
|
|
327
366
|
writeLive(`[PI tool:start] ${data.toolName}${suffix}\n`)
|
|
367
|
+
if (data.toolName === 'bash' && isLargeShellRead(shellCommand)) {
|
|
368
|
+
writeLive('[PI warning] large bash file read detected; prefer read or a smaller exact window to avoid truncated context.\n')
|
|
369
|
+
}
|
|
328
370
|
}
|
|
329
371
|
|
|
330
372
|
if (data.type === 'tool_execution_end') {
|
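The live warning is driven by a simple heuristic. The sketch below copies `isLargeShellRead` from the added lines above and applies it to a handful of commands; the commands themselves are hypothetical examples of what an agent might send through the `bash` tool.

```javascript
// Copied from the added lines in src/pi-rpc-adapter.mjs.
function isLargeShellRead(command) {
  const text = String(command ?? '').trim()
  if (text === '') {
    return false
  }

  if (/^\s*cat\s+\S+/.test(text)) {
    return true
  }

  const sedMatch = text.match(/sed\s+-n\s+['"]?(\d+)\s*,\s*(\d+)p['"]?/)
  if (sedMatch) {
    const start = Number.parseInt(sedMatch[1], 10)
    const end = Number.parseInt(sedMatch[2], 10)
    if (Number.isFinite(start) && Number.isFinite(end) && end >= start) {
      return (end - start) >= 120
    }
  }

  return false
}

// Hypothetical commands; the file path is only an illustration.
const shellReadChecks = {
  catWholeFile: isLargeShellRead('cat src/pi-supervisor.mjs'),
  sedWideRange: isLargeShellRead("sed -n '1,200p' src/pi-supervisor.mjs"),
  sedAtThreshold: isLargeShellRead("sed -n '1,121p' src/pi-supervisor.mjs"),
  sedJustBelow: isLargeShellRead("sed -n '1,120p' src/pi-supervisor.mjs"),
  headNotDetected: isLargeShellRead('head -n 500 src/pi-supervisor.mjs'),
  gitStatus: isLargeShellRead('git status --short'),
}
console.log(shellReadChecks)
```

As written, any command that begins with `cat <path>` is flagged regardless of file size, a `sed -n` range is flagged once `end - start` reaches 120, and other ways of dumping a file (such as `head`) pass through without a warning.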
package/src/pi-supervisor.mjs
CHANGED
|
@@ -13,6 +13,7 @@ import {
|
|
|
13
13
|
import { appendTelemetry, ensureTelemetryFiles } from './pi-telemetry.mjs'
|
|
14
14
|
import {
|
|
15
15
|
appendLog,
|
|
16
|
+
collectLargeFileWarnings,
|
|
16
17
|
commitStagedFiles,
|
|
17
18
|
didRepoChange,
|
|
18
19
|
ensureFileExists,
|
|
@@ -79,6 +80,10 @@ function printTerminalSummary(config, summary) {
|
|
|
79
80
|
lines.push(`[PI supervisor] notes=${summary.notes}`)
|
|
80
81
|
}
|
|
81
82
|
|
|
83
|
+
if (Array.isArray(summary.largeFileWarnings) && summary.largeFileWarnings.length > 0) {
|
|
84
|
+
lines.push(`[PI supervisor] large_file_warnings=${formatLargeFileWarningsInline(summary.largeFileWarnings)}`)
|
|
85
|
+
}
|
|
86
|
+
|
|
82
87
|
if (summary.terminalReason) {
|
|
83
88
|
lines.push(`[PI supervisor] terminal_reason=${summary.terminalReason}`)
|
|
84
89
|
}
|
|
@@ -162,6 +167,7 @@ function createIterationSummary({
|
|
|
162
167
|
gitFinalizeStatus,
|
|
163
168
|
visualStatus,
|
|
164
169
|
terminalReason,
|
|
170
|
+
largeFileWarnings,
|
|
165
171
|
sessionId,
|
|
166
172
|
developerModel,
|
|
167
173
|
testerModel,
|
|
@@ -180,6 +186,7 @@ function createIterationSummary({
|
|
|
180
186
|
gitFinalizeStatus,
|
|
181
187
|
visualStatus,
|
|
182
188
|
terminalReason,
|
|
189
|
+
largeFileWarnings,
|
|
183
190
|
sessionId,
|
|
184
191
|
developerModel,
|
|
185
192
|
testerModel,
|
|
@@ -191,6 +198,39 @@ function didInvocationCreateCommit(invocation) {
|
|
|
191
198
|
return invocation?.beforeSnapshot?.head !== invocation?.afterSnapshot?.head
|
|
192
199
|
}
|
|
193
200
|
|
|
201
|
+
function mergeLargeFileWarnings(existing, incoming) {
|
|
202
|
+
const merged = new Map()
|
|
203
|
+
for (const warning of [...(existing || []), ...(incoming || [])]) {
|
|
204
|
+
if (!warning?.file) {
|
|
205
|
+
continue
|
|
206
|
+
}
|
|
207
|
+
const key = `${warning.kind}:${warning.file}`
|
|
208
|
+
const current = merged.get(key)
|
|
209
|
+
if (!current || Number(warning.lineCount) > Number(current.lineCount)) {
|
|
210
|
+
merged.set(key, warning)
|
|
211
|
+
}
|
|
212
|
+
}
|
|
213
|
+
return [...merged.values()].sort((left, right) => right.lineCount - left.lineCount)
|
|
214
|
+
}
|
|
215
|
+
|
|
216
|
+
function findLargeFileWarnings(config, files) {
|
|
217
|
+
return collectLargeFileWarnings(config.cwd, files, {
|
|
218
|
+
largeFileWarningLines: config.largeFileWarningLines,
|
|
219
|
+
largeSpecWarningLines: config.largeSpecWarningLines,
|
|
220
|
+
})
|
|
221
|
+
}
|
|
222
|
+
|
|
223
|
+
function formatLargeFileWarningsInline(warnings) {
|
|
224
|
+
const list = Array.isArray(warnings) ? warnings : []
|
|
225
|
+
if (list.length === 0) {
|
|
226
|
+
return ''
|
|
227
|
+
}
|
|
228
|
+
return list
|
|
229
|
+
.slice(0, 3)
|
|
230
|
+
.map((warning) => `${warning.file}(${warning.lineCount}${warning.kind === 'large_spec' ? ',spec' : ''})`)
|
|
231
|
+
.join(', ')
|
|
232
|
+
}
|
|
233
|
+
|
|
194
234
|
function clampPromptLines(text, maxLines) {
|
|
195
235
|
const normalized = String(text ?? '').trim()
|
|
196
236
|
if (normalized === '') {
|
|
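Because `runIteration` recomputes warnings after each turn, the merge step decides what ends up in the iteration summary. The sketch below copies `mergeLargeFileWarnings` from the added lines above and merges two sample lists; the warning objects are hypothetical stand-ins for what a developer turn and a later tester turn might produce.

```javascript
// Copied from the added lines in src/pi-supervisor.mjs.
function mergeLargeFileWarnings(existing, incoming) {
  const merged = new Map()
  for (const warning of [...(existing || []), ...(incoming || [])]) {
    if (!warning?.file) {
      continue
    }
    const key = `${warning.kind}:${warning.file}`
    const current = merged.get(key)
    if (!current || Number(warning.lineCount) > Number(current.lineCount)) {
      merged.set(key, warning)
    }
  }
  return [...merged.values()].sort((left, right) => right.lineCount - left.lineCount)
}

// Hypothetical warnings; file names and counts are invented for this example.
const afterDeveloper = [{ file: 'src/app.mjs', lineCount: 510, kind: 'large_file' }]
const afterTester = [
  { file: 'e2e/flow.spec.ts', lineCount: 320, kind: 'large_spec' },
  { file: 'src/app.mjs', lineCount: 540, kind: 'large_file' },
  { lineCount: 9999, kind: 'large_file' }, // no file name, so it is skipped
]

const mergedWarnings = mergeLargeFileWarnings(afterDeveloper, afterTester)
console.log(mergedWarnings)
```

Entries are keyed by kind and file, the larger line count wins for a repeated key, and the result is sorted from largest to smallest. One consequence is that a file that shrinks during an iteration keeps its earlier, larger count in the merged list.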
@@ -644,6 +684,7 @@ async function runMainTurnWithRetries({ config, iteration, phase, sessionId, ses
|
|
|
644
684
|
prompt = buildSteeringPrompt(config, reason, {
|
|
645
685
|
visualFeedback: await readLatestVisualFeedback(config),
|
|
646
686
|
testerFeedback: await readLatestTesterFeedback(config),
|
|
687
|
+
largeFileWarnings: findLargeFileWarnings(config, listChangedFiles(config.cwd)),
|
|
647
688
|
})
|
|
648
689
|
|
|
649
690
|
if (shouldRetryForTimeout || shouldRetryForNoChange) {
|
|
@@ -656,12 +697,14 @@ async function runMainTurnWithRetries({ config, iteration, phase, sessionId, ses
|
|
|
656
697
|
}
|
|
657
698
|
|
|
658
699
|
async function runFixTurn({ config, iteration, phase, sessionId, sessionFile, testerOutput }) {
|
|
700
|
+
const largeFileWarnings = findLargeFileWarnings(config, listChangedFiles(config.cwd))
|
|
659
701
|
const fixPrompt = buildFixPrompt(
|
|
660
702
|
config,
|
|
661
703
|
clampPromptLines(testerOutput, Number(config.maxVerificationExcerptLines) || 40),
|
|
662
704
|
{
|
|
663
705
|
visualFeedback: await readLatestVisualFeedback(config),
|
|
664
706
|
testerFeedback: await readLatestTesterFeedback(config),
|
|
707
|
+
largeFileWarnings,
|
|
665
708
|
}
|
|
666
709
|
)
|
|
667
710
|
return await runAgentInvocation({
|
|
@@ -762,6 +805,7 @@ async function runTesterTurn({
|
|
|
762
805
|
developerNotes,
|
|
763
806
|
reason,
|
|
764
807
|
}) {
|
|
808
|
+
const largeFileWarnings = findLargeFileWarnings(config, changedFiles)
|
|
765
809
|
const prompt = buildTesterPrompt(config, {
|
|
766
810
|
phase,
|
|
767
811
|
task,
|
|
@@ -770,6 +814,7 @@ async function runTesterTurn({
|
|
|
770
814
|
reason,
|
|
771
815
|
visualFeedback: await readLatestVisualFeedback(config),
|
|
772
816
|
testerFeedback: await readLatestTesterFeedback(config),
|
|
817
|
+
largeFileWarnings,
|
|
773
818
|
})
|
|
774
819
|
|
|
775
820
|
const invocation = await runAgentInvocation({
|
|
@@ -835,6 +880,7 @@ async function runTesterCommitTurn({
|
|
|
835
880
|
developerNotes,
|
|
836
881
|
reason,
|
|
837
882
|
}) {
|
|
883
|
+
const largeFileWarnings = findLargeFileWarnings(config, changedFiles)
|
|
838
884
|
const prompt = buildCommitPrompt(config, {
|
|
839
885
|
phase,
|
|
840
886
|
task,
|
|
@@ -843,6 +889,7 @@ async function runTesterCommitTurn({
|
|
|
843
889
|
reason,
|
|
844
890
|
visualFeedback: await readLatestVisualFeedback(config),
|
|
845
891
|
testerFeedback: await readLatestTesterFeedback(config),
|
|
892
|
+
largeFileWarnings,
|
|
846
893
|
})
|
|
847
894
|
|
|
848
895
|
const invocation = await runAgentInvocation({
|
|
@@ -1054,6 +1101,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1054
1101
|
gitFinalizeStatus: 'not_run',
|
|
1055
1102
|
visualStatus: 'not_run',
|
|
1056
1103
|
terminalReason: 'all_tasks_complete',
|
|
1104
|
+
largeFileWarnings: [],
|
|
1057
1105
|
notes: 'No unchecked tasks remain in TODOS.md.',
|
|
1058
1106
|
sessionId: state.sessionId || '',
|
|
1059
1107
|
outputPath: config.lastAgentOutputFile,
|
|
@@ -1103,6 +1151,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1103
1151
|
let commitPlanFound = false
|
|
1104
1152
|
let gitFinalizeStatus = 'not_run'
|
|
1105
1153
|
let terminalReason = mainInvocation.result.terminalReason || ''
|
|
1154
|
+
let largeFileWarnings = findLargeFileWarnings(config, mainInvocation.changedFiles)
|
|
1106
1155
|
const noteParts = [`developer: ${mainInvocation.result.notes}`]
|
|
1107
1156
|
|
|
1108
1157
|
if (mainInvocation.result.status === 'success' && config.transport === 'mock') {
|
|
@@ -1157,6 +1206,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1157
1206
|
testerVerdict = testerInvocation.testerVerdict
|
|
1158
1207
|
commitPlanFound = testerInvocation.commitPlanFound === true
|
|
1159
1208
|
terminalReason = testerInvocation.result.terminalReason || terminalReason
|
|
1209
|
+
largeFileWarnings = mergeLargeFileWarnings(largeFileWarnings, findLargeFileWarnings(config, listChangedFiles(config.cwd)))
|
|
1160
1210
|
noteParts.push(`tester: ${testerInvocation.result.notes}`)
|
|
1161
1211
|
await writeTesterFeedback(config, {
|
|
1162
1212
|
iteration,
|
|
@@ -1184,6 +1234,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1184
1234
|
testerVerdict = testerCommitInvocation.testerVerdict
|
|
1185
1235
|
commitPlanFound = testerCommitInvocation.commitPlanFound === true
|
|
1186
1236
|
terminalReason = testerCommitInvocation.result.terminalReason || terminalReason
|
|
1237
|
+
largeFileWarnings = mergeLargeFileWarnings(largeFileWarnings, findLargeFileWarnings(config, listChangedFiles(config.cwd)))
|
|
1187
1238
|
noteParts.push(`tester_commit: ${testerCommitInvocation.result.notes}`)
|
|
1188
1239
|
await writeTesterFeedback(config, {
|
|
1189
1240
|
iteration,
|
|
@@ -1241,6 +1292,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1241
1292
|
sessionFile = fixInvocation.result.sessionFile || sessionFile
|
|
1242
1293
|
developerStatus = fixInvocation.result.status
|
|
1243
1294
|
terminalReason = fixInvocation.result.terminalReason || 'developer_fix_incomplete'
|
|
1295
|
+
largeFileWarnings = mergeLargeFileWarnings(largeFileWarnings, findLargeFileWarnings(config, listChangedFiles(config.cwd)))
|
|
1244
1296
|
noteParts.push(`developer_fix: ${fixInvocation.result.notes}`)
|
|
1245
1297
|
|
|
1246
1298
|
if (fixInvocation.result.status === 'success') {
|
|
@@ -1258,6 +1310,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1258
1310
|
testerVerdict = testerRecheck.testerVerdict
|
|
1259
1311
|
commitPlanFound = testerRecheck.commitPlanFound === true
|
|
1260
1312
|
terminalReason = testerRecheck.result.terminalReason || terminalReason
|
|
1313
|
+
largeFileWarnings = mergeLargeFileWarnings(largeFileWarnings, findLargeFileWarnings(config, listChangedFiles(config.cwd)))
|
|
1261
1314
|
noteParts.push(`tester_recheck: ${testerRecheck.result.notes}`)
|
|
1262
1315
|
await writeTesterFeedback(config, {
|
|
1263
1316
|
iteration,
|
|
@@ -1285,6 +1338,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1285
1338
|
testerVerdict = testerCommitInvocation.testerVerdict
|
|
1286
1339
|
commitPlanFound = testerCommitInvocation.commitPlanFound === true
|
|
1287
1340
|
terminalReason = testerCommitInvocation.result.terminalReason || terminalReason
|
|
1341
|
+
largeFileWarnings = mergeLargeFileWarnings(largeFileWarnings, findLargeFileWarnings(config, listChangedFiles(config.cwd)))
|
|
1288
1342
|
noteParts.push(`tester_commit: ${testerCommitInvocation.result.notes}`)
|
|
1289
1343
|
await writeTesterFeedback(config, {
|
|
1290
1344
|
iteration,
|
|
@@ -1436,7 +1490,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1436
1490
|
|
|
1437
1491
|
await appendLog(
|
|
1438
1492
|
config.logFile,
|
|
1439
|
-
`Finished iteration ${iteration} with status=${finalStatus} verification=${finalVerificationStatus} tester_verdict=${testerVerdict} commit_plan_found=${commitPlanFound} terminal_reason=${terminalReason}`
|
|
1493
|
+
`Finished iteration ${iteration} with status=${finalStatus} verification=${finalVerificationStatus} tester_verdict=${testerVerdict} commit_plan_found=${commitPlanFound} terminal_reason=${terminalReason}${largeFileWarnings.length > 0 ? ` large_file_warnings=${formatLargeFileWarningsInline(largeFileWarnings)}` : ''}`
|
|
1440
1494
|
)
|
|
1441
1495
|
|
|
1442
1496
|
const iterationEndSnapshot = getRepoSnapshot(config.cwd)
|
|
@@ -1453,6 +1507,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1453
1507
|
gitFinalizeStatus,
|
|
1454
1508
|
visualStatus,
|
|
1455
1509
|
terminalReason,
|
|
1510
|
+
largeFileWarnings,
|
|
1456
1511
|
sessionId,
|
|
1457
1512
|
developerModel: developerModelName,
|
|
1458
1513
|
testerModel: testerModelName,
|
|
@@ -1486,6 +1541,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1486
1541
|
testerVerdict,
|
|
1487
1542
|
commitPlanFound,
|
|
1488
1543
|
terminalReason,
|
|
1544
|
+
riskWarnings: formatLargeFileWarningsInline(largeFileWarnings),
|
|
1489
1545
|
notes: noteParts.join(' | '),
|
|
1490
1546
|
})
|
|
1491
1547
|
|
|
@@ -1504,6 +1560,7 @@ async function runIteration({ config, state, iteration }) {
|
|
|
1504
1560
|
gitFinalizeStatus,
|
|
1505
1561
|
visualStatus,
|
|
1506
1562
|
terminalReason,
|
|
1563
|
+
largeFileWarnings,
|
|
1507
1564
|
notes: noteParts.join(' | '),
|
|
1508
1565
|
sessionId,
|
|
1509
1566
|
outputPath: config.lastAgentOutputFile,
|
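The hunks above thread a `largeFileWarnings` value through each invocation and fold new findings in with `mergeLargeFileWarnings`. Those helpers are defined outside the hunks shown here, so the following is only a minimal sketch of how such a merge and inline formatter could behave. The warning shape (`{ file, lines }`), the keep-the-larger-count rule, the inline format, and every `*Sketch` function name are assumptions for illustration, not the package's implementation.

```javascript
// Hypothetical warning shape: { file: string, lines: number }.
// Assumption: when the same file is reported twice, keep the larger line count.
function mergeWarningsSketch(existing, incoming) {
  const byFile = new Map()
  for (const warning of [...existing, ...incoming]) {
    const previous = byFile.get(warning.file)
    if (!previous || warning.lines > previous.lines) {
      byFile.set(warning.file, warning)
    }
  }
  return [...byFile.values()]
}

// Hypothetical inline format, e.g. "src/a.mjs(1200); src/b.mjs(900)".
function formatWarningsInlineSketch(warnings) {
  return warnings.map((w) => `${w.file}(${w.lines})`).join('; ')
}

// Mirrors the conditional suffix in the log line:
// nothing is appended when the list is empty.
function logSuffixSketch(warnings) {
  return warnings.length > 0
    ? ` large_file_warnings=${formatWarningsInlineSketch(warnings)}`
    : ''
}

// Illustrative data only; file names and counts are made up.
const afterDeveloper = [{ file: 'src/a.mjs', lines: 1100 }]
const afterTester = [
  { file: 'src/a.mjs', lines: 1200 },
  { file: 'src/b.mjs', lines: 900 },
]
const merged = mergeWarningsSketch(afterDeveloper, afterTester)
console.log(logSuffixSketch(merged))
```

Merging after every invocation, as the diff does, means a file flagged early in the iteration is still reported at the end even if a later pass no longer lists it as changed.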
package/src/pi-telemetry.mjs
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
import fs from 'node:fs/promises'
|
|
2
2
|
|
|
3
|
-
const CSV_HEADER = 'timestamp,iteration,phase,kind,status,transport,session_id,timed_out,exit_code,duration_seconds,commit_before,commit_after,repo_changed,changed_files_count,verification_status,retry_count,role,model,tool_calls,tool_errors,message_updates,stop_reason,loop_detected,loop_signature,tester_verdict,commit_plan_found,terminal_reason,notes\n'
|
|
3
|
+
const CSV_HEADER = 'timestamp,iteration,phase,kind,status,transport,session_id,timed_out,exit_code,duration_seconds,commit_before,commit_after,repo_changed,changed_files_count,verification_status,retry_count,role,model,tool_calls,tool_errors,message_updates,stop_reason,loop_detected,loop_signature,tester_verdict,commit_plan_found,terminal_reason,risk_warnings,notes\n'
|
|
4
4
|
|
|
5
5
|
function csvEscape(value) {
|
|
6
6
|
const text = String(value ?? '')
|
|
@@ -56,6 +56,7 @@ export async function appendTelemetry(config, event) {
|
|
|
56
56
|
event.testerVerdict,
|
|
57
57
|
event.commitPlanFound,
|
|
58
58
|
event.terminalReason,
|
|
59
|
+
event.riskWarnings,
|
|
59
60
|
event.notes,
|
|
60
61
|
].map(csvEscape).join(',')
|
|
61
62
|
|
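The telemetry change adds a `risk_warnings` column to `CSV_HEADER` and a matching `event.riskWarnings` entry at the same position in the row, between `terminal_reason` and `notes`. Because the header and the row are written separately, their order has to stay in sync. The sketch below illustrates one way to guarantee that, using a shortened, hypothetical column list and an assumed RFC 4180 style `csvEscapeSketch`. The package's own `csvEscape` body is only partly visible in the diff and may differ.

```javascript
// Shortened, hypothetical column list; the real header has many more columns.
const COLUMNS = ['iteration', 'terminal_reason', 'risk_warnings', 'notes']

// Assumed escaping rule (RFC 4180 style): quote fields that contain
// a comma, a double quote, or a line break, and double any inner quotes.
function csvEscapeSketch(value) {
  const text = String(value ?? '')
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text
}

// Build the row from the same column list as the header so the two cannot drift.
function toCsvRow(event) {
  return COLUMNS.map((column) => csvEscapeSketch(event[column])).join(',')
}

const header = COLUMNS.join(',')
// Illustrative event; the warning text format is made up.
const row = toCsvRow({
  iteration: 3,
  terminal_reason: 'all_tasks_complete',
  risk_warnings: 'src/a.mjs(1200), src/b.mjs(900)',
  notes: 'developer: ok | tester: ok',
})

console.log(header)
console.log(row)
```

A telemetry file written with the older header would have one column fewer than rows appended after the upgrade. How the harness treats existing files is not shown in this diff.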
package/templates/DEVELOPER.md
CHANGED
|
@@ -20,6 +20,9 @@ Rules:
|
|
|
20
20
|
- Use the configured smoke verification path as the fast inner-loop gate. Do not replace it with a long full-flow Playwright spec unless the task explicitly requires it.
|
|
21
21
|
- If a long Playwright happy-path spec changes, validate with smoke plus one narrow targeted spec or deterministic state hook, not the entire full-flow run.
|
|
22
22
|
- Reserve long full-flow Playwright specs for an explicit nightly or post-run lane, not the developer turn.
|
|
23
|
+
- Use `read` for source inspection. Use shell only for `git`, tests, and narrow diagnostics.
|
|
24
|
+
- If a snippet seems incomplete, reread a smaller exact window instead of another huge overlapping shell range.
|
|
25
|
+
- Do not build edits from large `sed`/`grep` output or from memory after partial shell reads.
|
|
23
26
|
- Trust tool output over your own guesses.
|
|
24
27
|
- Do not repeatedly reread or rewrite the same file when one focused fix will do.
|
|
25
28
|
- After one failed edit attempt, reread the file before retrying.
|
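The new rules ask the agent to reread "a smaller exact window" instead of another large overlapping range. As a rough illustration of what an exact window means, the sketch below selects an inclusive, 1-indexed line range from text that is already in memory. `readWindowSketch` is a hypothetical helper, not the agent's `read` tool, and the prefix format for line numbers is an assumption.

```javascript
// Hypothetical helper: return lines start..end (1-indexed, inclusive),
// each prefixed with its line number. Out-of-range bounds are clamped.
function readWindowSketch(text, start, end) {
  const lines = text.split(/\r?\n/)
  const first = Math.max(1, start)
  const last = Math.min(lines.length, end)
  const selected = []
  for (let n = first; n <= last; n += 1) {
    selected.push(`${n}: ${lines[n - 1]}`)
  }
  return selected
}

// Illustrative source text only.
const source = [
  'const a = 1',
  'const b = 2',
  'const c = a + b',
  'console.log(c)',
].join('\n')

console.log(readWindowSketch(source, 2, 3).join('\n'))
```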
package/templates/TESTER.md
CHANGED
|
@@ -7,7 +7,7 @@ Your job:
|
|
|
7
7
|
- review the developer's change from an independent user-facing perspective
|
|
8
8
|
- add or improve focused verification where needed
|
|
9
9
|
- verify actual functionality, not just plausibility
|
|
10
|
-
-
|
|
10
|
+
- create the final commit only when the work is truly ready
|
|
11
11
|
|
|
12
12
|
Rules:
|
|
13
13
|
|
|
@@ -16,6 +16,9 @@ Rules:
|
|
|
16
16
|
- Run the configured smoke verification command as the default inner-loop gate.
|
|
17
17
|
- Do not run long full-flow Playwright happy-path specs in the tester turn unless the task explicitly requires them.
|
|
18
18
|
- If a long spec changed, validate with smoke plus one narrow targeted spec or deterministic state setup instead of replaying the entire run.
|
|
19
|
+
- Use `read` for source inspection. Use shell only for `git`, tests, and narrow diagnostics.
|
|
20
|
+
- If a snippet seems incomplete, reread a smaller exact window instead of another huge overlapping shell range.
|
|
21
|
+
- Do not build edits from large `sed`/`grep` output or from memory after partial shell reads.
|
|
19
22
|
- Treat player-facing dead ends, missing affordances, broken progression, console/runtime failures, and unusable UI as real failures.
|
|
20
23
|
- If the task affects menus, unlocks, progression, classes, routes, shops, onboarding, or gating, verify a fresh-save path.
|
|
21
24
|
- Do not hide product bugs by weakening tests.
|
|
@@ -23,7 +26,7 @@ Rules:
|
|
|
23
26
|
- After one failed edit attempt, reread the file before retrying.
|
|
24
27
|
- Do not repeat the same exact oldText-based edit on the same file.
|
|
25
28
|
- If visual review is enabled, maintain the screenshot capture flow and manifest expected by the harness.
|
|
26
|
-
- If the change passes,
|
|
29
|
+
- If the change passes, stage only the related files and create the commit yourself.
|
|
27
30
|
- If the working tree cannot be isolated safely, return `VERDICT: BLOCKED`.
|
|
28
31
|
|
|
29
32
|
Before stopping:
|
|
@@ -31,7 +34,7 @@ Before stopping:
|
|
|
31
34
|
- include `Observed flow:`
|
|
32
35
|
- include `Player-facing result:`
|
|
33
36
|
- include `Regression check:`
|
|
37
|
+
- if passing, include `COMMIT_CREATED: true`
|
|
34
38
|
- if passing, include `COMMIT_MESSAGE: ...`
|
|
35
|
-
- if passing, include `
|
|
36
|
-
- if passing, include one `- path/to/file` line per file
|
|
39
|
+
- if passing, include `COMMIT_SHA: ...`
|
|
37
40
|
- end with exactly one verdict line: `VERDICT: PASS`, `VERDICT: FAIL`, or `VERDICT: BLOCKED`
|
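The updated `TESTER.md` asks a passing tester to report `COMMIT_CREATED: true`, `COMMIT_MESSAGE: ...`, and `COMMIT_SHA: ...`, and to end with exactly one `VERDICT:` line. As a rough illustration of how a consumer might read those markers, here is a minimal sketch. The function name `parseTesterReportSketch`, the returned object shape, and the rule of taking the last matching line are assumptions; the harness's actual parser is not part of this section.

```javascript
// Hypothetical parser for the marker lines listed in TESTER.md.
function parseTesterReportSketch(output) {
  const lines = output.split(/\r?\n/).map((line) => line.trim())

  // Return the text after "KEY:" from the last line that starts with it, or ''.
  const lastValue = (key) => {
    const prefix = `${key}:`
    const match = lines.filter((line) => line.startsWith(prefix)).pop()
    return match ? match.slice(prefix.length).trim() : ''
  }

  const verdict = lastValue('VERDICT')
  return {
    // Anything other than the three documented verdicts is treated as missing.
    verdict: ['PASS', 'FAIL', 'BLOCKED'].includes(verdict) ? verdict : '',
    commitCreated: lastValue('COMMIT_CREATED') === 'true',
    commitMessage: lastValue('COMMIT_MESSAGE'),
    commitSha: lastValue('COMMIT_SHA'),
  }
}

// Illustrative tester output; the message and SHA are made up.
const sample = [
  'Observed flow: opened the menu and started a new run',
  'Player-facing result: the new option is reachable',
  'Regression check: smoke verification passed',
  'COMMIT_CREATED: true',
  'COMMIT_MESSAGE: Add menu option',
  'COMMIT_SHA: abc1234',
  'VERDICT: PASS',
].join('\n')

console.log(parseTesterReportSketch(sample))
```

A stricter consumer could additionally check that a `PASS` verdict is accompanied by `COMMIT_CREATED: true` and a non-empty `COMMIT_SHA` before trusting that a commit exists.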