pi-goal-x 0.16.1 → 0.17.0
- package/README.md +9 -4
- package/docs/CHANGELOG.md +145 -0
- package/extensions/goal-policy.ts +21 -2
- package/extensions/goal-questionnaire.ts +45 -9
- package/extensions/goal-record.ts +2 -0
- package/extensions/goal-settings.ts +8 -0
- package/extensions/goal.ts +90 -18
- package/extensions/prompts/goal-prompts.ts +9 -3
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -31,6 +31,10 @@ All core features of [@capyup/pi-goal](https://github.com/capyup/pi-goal) are pr
|
|
|
31
31
|
- **Recursive subtasks** — tasks can have nested sub-tasks via `subtasks?: GoalTask[]` (full recursive type). Subtask depth is controlled globally by `subtaskDepth` in `.pi/pi-goal-x-settings.json` (default: 1 level). Too-deep subtrees are rejected at proposal.
|
|
32
32
|
- **Lightweight subtasks** — each task has an optional `lightweightSubtasks?: boolean` flag. When true, the parent can complete regardless of subtask status. When false/absent (full subtasks), all subtasks must be individually complete before the parent can close.
|
|
33
33
|
- **Per-task completion** — `complete_task` marks individual tasks done with optional evidence/verificationSummary, and `skip_task` marks tasks as skipped with a required reason. Neither stops the turn, so the agent can continue uninterrupted.
|
|
34
|
+
- **Recursive lookup** — `findTaskInTree` and `updateTaskInTree` search and update tasks at any depth. Subtask IDs are valid targets for `complete_task` and `skip_task`.
|
|
35
|
+
- **Subtask gate** — parent tasks with full subtasks require all sub-items to be completed or skipped before the parent can close, enforced by recursive `checkSubtasksComplete`.
|
|
36
|
+
- **Duplicate ID validation** — `validateTaskListProposal` recursively checks all task IDs across the entire tree, preventing collisions between parent/subtask or sibling subtasks.
|
|
37
|
+
- **Agent workflow guidance** — prompts include a `[TASK WORKFLOW]` section directing agents to use tasks as progress trackers, completing subtasks immediately when work finishes (not batch-marking at the end).
|
|
34
38
|
- **Hierarchical display** — task lists with subtasks render with indentation in prompts (`taskListBlock`, `goalPrompt`, `continuationPrompt`) and in the TUI widget (recursive count, BFS next-pending).
|
|
35
39
|
- **Optional `taskList`** — goals without a task list work exactly as before. The feature is entirely opt-in.
|
|
36
40
|
- **Soft `complete_goal` gate** — when `blockCompletion: true` is set, `complete_goal` surfaces a warning if pending tasks remain (prompt-level only; the agent can still complete).
|
|
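The recursive lookup and subtask gate described in the README bullets above can be pictured roughly as follows. This is a minimal sketch, not the package's implementation: the `GoalTask` shape is assumed (only `id`, `status`, `subtasks`, and `lightweightSubtasks` are named in the README), the `"pending"` status value is an assumption, and the bodies of `findTaskInTree` and `checkSubtasksComplete` are illustrative guesses at the behavior the README describes.

```typescript
// Assumed task shape; the real GoalTask type likely carries more fields.
type TaskStatus = "pending" | "complete" | "skipped";

interface GoalTask {
  id: string;
  title: string;
  status: TaskStatus;
  lightweightSubtasks?: boolean;
  subtasks?: GoalTask[];
}

// Depth-first search for a task id at any depth of the tree.
function findTaskInTree(tasks: GoalTask[], id: string): GoalTask | undefined {
  for (const task of tasks) {
    if (task.id === id) return task;
    const nested = task.subtasks ? findTaskInTree(task.subtasks, id) : undefined;
    if (nested) return nested;
  }
  return undefined;
}

// A parent with full subtasks may close only when every descendant is
// complete or skipped; lightweight subtasks never block the parent.
function checkSubtasksComplete(task: GoalTask): boolean {
  if (task.lightweightSubtasks || !task.subtasks) return true;
  return task.subtasks.every(
    (sub) => sub.status !== "pending" && checkSubtasksComplete(sub),
  );
}
```

Under these assumptions, a parent whose subtasks are `complete` and `pending` is blocked until the pending one is completed or skipped, and a subtask id is a valid lookup target exactly like a top-level id.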
@@ -48,19 +52,20 @@ All core features of [@capyup/pi-goal](https://github.com/capyup/pi-goal) are pr
|
|
|
48
52
|
### E2e test infrastructure
|
|
49
53
|
|
|
50
54
|
- **Deterministic fork tests using `--mode json`**: the e2e suite spawns a real `pi --fork --mode json` session, parses structured `tool_execution_start`/`tool_execution_end` JSON events for field-level assertions — no free-text AI output parsing. Uses `--append-system-prompt` + `--tools` to force deterministic tool calls.
|
|
51
|
-
- **Full coverage**:
|
|
55
|
+
- **Full coverage**: 310 tests total — function-level integration tests, mock-pi handler tests, file-validity checks, real `pi --fork --mode json` E2E tests, propose_goal_tweak unit/integration/e2e tests, task list policy/round-trip/render tests (including subtasks), and verification contract tests.
|
|
52
56
|
|
|
53
57
|
### Completion auditor
|
|
54
58
|
|
|
55
59
|
- **Live progress widget** — when the auditor runs, the TUI shows a spinner, a progress bar (`[████░░░░] 40%`), step labels (`Inspecting files...`, `Verifying success criteria...`), the current tool being executed, and recent output lines. No more wondering if anything is happening.
|
|
60
|
+
- **Per-goal auditor toggle** — during goal confirmation, press `a` to toggle the auditor on/off for that goal. The toggle uses a ●/○ indicator between the goal summary and confirm options. The default position comes from settings; the per-goal override persists within the session.
|
|
56
61
|
- **Escape to skip** — press Escape during an audit to abort it and complete the goal immediately. The skip is recorded in the ledger as `audit_skipped` with reason `user_aborted` and auditor model metadata.
|
|
57
|
-
- **Disable the auditor entirely** — set `disabled: true` in `.pi/pi-goal-x-settings.json` (or toggle it via `/goal-settings`). The agent can still bypass with user confirmation by passing `confirmBypassAuditor: true` to `
|
|
62
|
+
- **Disable the auditor entirely** — set `disabled: true` in `.pi/pi-goal-x-settings.json` (or toggle it via `/goal-settings`). The agent can still bypass with user confirmation by passing `confirmBypassAuditor: true` to `complete_goal`.
|
|
58
63
|
- **Skipped audits are recorded** — every skip (whether disabled or Escape-aborted) is logged to the ledger with the reason, provider, model, and thinking level for full traceability.
|
|
59
64
|
- **Robust abort detection** — the auditor detects aborts both from exceptions *and* from `session.prompt()` returning after an abort signal, preventing stuck goals or ghost states.
|
|
60
65
|
- **Cleaner lifecycle** — `AbortSignal` is properly wired to `session.abort()`, animation timers are cleaned up, and the unsubscribe path is always executed. No more having to kill the session.
|
|
61
66
|
- **Completion report includes full auditor output** — the auditor's full report is included in the goal completion conversation message upon approval, not just a verdict.
|
|
62
67
|
- **Session factory injection** — `runGoalCompletionAuditor` accepts an optional `createSession` parameter for testability, enabling mock auditor sessions in tests.
|
|
63
|
-
- **Structured test evidence** — the executor can pass `testResults` (exit code, suite name, output, timestamp) via `
|
|
68
|
+
- **Structured test evidence** — the executor can pass `testResults` (exit code, suite name, output, timestamp) via `complete_goal({testResults})`. The auditor receives a `<test_evidence>` block and is instructed to check it before re-running test suites, skipping redundant re-runs.
|
|
64
69
|
|
|
65
70
|
### Drafting & UX
|
|
66
71
|
|
|
@@ -232,7 +237,7 @@ The completion result prints a full report into the conversation:
|
|
|
232
237
|
- the auditor's approval report
|
|
233
238
|
- full current goal details, including objective, status, usage, mode, and file path
|
|
234
239
|
|
|
235
|
-
Sisyphus goals use the same completion tool as regular goals. The stricter part is the prompt/criteria standard: the agent should only call completion after the whole ordered objective is actually satisfied and likely to survive independent auditing. A paused goal can also be completed directly when the agent already has enough evidence that every requirement is satisfied; it does not need a resume just to call `
|
|
240
|
+
Sisyphus goals use the same completion tool as regular goals. The stricter part is the prompt/criteria standard: the agent should only call completion after the whole ordered objective is actually satisfied and likely to survive independent auditing. A paused goal can also be completed directly when the agent already has enough evidence that every requirement is satisfied; it does not need a resume just to call `complete_goal`.
|
|
236
241
|
|
|
237
242
|
## Schema gates
|
|
238
243
|
|
|
package/docs/CHANGELOG.md
ADDED
@@ -0,0 +1,145 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
## 0.17.0 (2026-05-29)
|
|
4
|
+
|
|
5
|
+
### Features
|
|
6
|
+
|
|
7
|
+
- **Per-goal auditor toggle** — press `a` during the confirmation dialog to toggle the auditor on/off for a specific goal. Default from settings; override persists within session.
|
|
8
|
+
- **Task workflow prompt guidance** — added `[TASK WORKFLOW]` section to both `goalPrompt` and `continuationPrompt`, directing agents to complete subtasks one-by-one as progress trackers (not batch-marking at the end).
|
|
9
|
+
- **Recursive duplicate ID validation** — `validateTaskListProposal` now checks all task IDs across the entire tree, preventing collisions between parent/subtask or sibling subtask IDs.
|
|
10
|
+
- **Escape dialog during audit** — pressing Escape during a completion audit shows a TUI dialog with "Mark complete without audit" or "Continue working" options.
|
|
11
|
+
|
|
12
|
+
### Fixes
|
|
13
|
+
|
|
14
|
+
- `validateTaskCompletion` and `validateTaskSkip` now use recursive `findTaskInTree` instead of flat `Array.find()` for nested subtask support.
|
|
15
|
+
- Updated README references from legacy `update_goal` to `complete_goal`.
|
|
16
|
+
|
|
17
|
+
### Tests
|
|
18
|
+
|
|
19
|
+
- 310 total tests (up from 308).
|
|
20
|
+
- Added tests for recursive duplicate ID detection across nested subtask trees.
|
|
21
|
+
- Added e2e test for `skipAuditor=true` path.
|
|
22
|
+
|
|
23
|
+
## 0.16.1 (2026-05-29)
|
|
24
|
+
|
|
25
|
+
### Features
|
|
26
|
+
|
|
27
|
+
- **Escape-to-skip audit** — press Escape during an auditor run to abort it and complete the goal immediately. The skip is recorded in the ledger with the reason `user_aborted` and auditor model metadata.
|
|
28
|
+
- **Audit progress widget** — the TUI shows a spinner, progress bar, step labels, current tool, and output lines while the auditor runs.
|
|
29
|
+
- **Audit abort detection** — the auditor detects aborts both from exceptions and from `session.prompt()` returning after an abort signal, preventing stuck goals or ghost states.
|
|
30
|
+
- **Goal status for Sisyphus** — `COMPLETED` status label for completed Sisyphus goals.
|
|
31
|
+
- **Multi-session focus isolation** — goal focus data uses `goalFocusDetails` which includes the goal id and reason but not full balance data, preventing cross-session focus leakage.
|
|
32
|
+
|
|
33
|
+
### Fixes
|
|
34
|
+
|
|
35
|
+
- Fixed a merge bug where `propose_task_list` could produce a duplicate task list when called during a continuation.
|
|
36
|
+
|
|
37
|
+
## 0.16.0 (2026-05-29)
|
|
38
|
+
|
|
39
|
+
### Features
|
|
40
|
+
|
|
41
|
+
- **`delete_goal` tool** — new lifecycle tool for archiving goals by id. Accepts a required `goalId` and optional `reason`. Agent-facing only; not intended for user use.
|
|
42
|
+
- **`complete_goal` `status` optional** — the `status` parameter on `complete_goal` is now optional. When omitted, defaults to `"complete"`. Explicitly setting an invalid value (anything other than `"complete"`) still produces an error.
|
|
43
|
+
- **SCROLL FIX** — the confirmation dialog no longer scrolls to the bottom when the user is scrolled up and new content arrives. Uses `addContextWrapped()` which suppresses viewport resets.
|
|
44
|
+
- **Task list shown first** — the task list section now appears FIRST in the confirmation dialog context (before the objective), with context capped at 12 lines so tasks don't scroll off-screen.
|
|
45
|
+
- **Audit completion flow** — the completion report card no longer says "Goal audit approved." when the auditor was skipped (now shows "Goal audit skipped." with reason).
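The optional-`status` behavior described in the 0.16.0 entry above might look something like the sketch below. The `CompleteGoalArgs` and `StatusResult` types, the `resolveCompletionStatus` helper, and the error wording are all hypothetical; only the `status` parameter and its `"complete"` default come from the changelog.

```typescript
// Hypothetical argument and result shapes, for illustration only.
interface CompleteGoalArgs {
  status?: string;
}

type StatusResult =
  | { ok: true; status: "complete" }
  | { ok: false; message: string };

// Omitted status defaults to "complete"; any other explicit value is rejected.
function resolveCompletionStatus(args: CompleteGoalArgs): StatusResult {
  const status = args.status ?? "complete";
  if (status !== "complete") {
    return { ok: false, message: `Invalid status "${status}"; only "complete" is accepted.` };
  }
  return { ok: true, status: "complete" };
}
```

With this shape, `resolveCompletionStatus({})` succeeds, while `resolveCompletionStatus({ status: "done" })` returns an error result rather than silently coercing the value.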
|
|
46
|
+
|
|
47
|
+
### Fixes
|
|
48
|
+
|
|
49
|
+
- Fixed task completion/skip validation for nested subtasks (uses recursive `findTaskInTree`).
|
|
50
|
+
- All `complete_goal` calls default to `status: "complete"` when no explicit status is provided.
|
|
51
|
+
- Updated prompts and tool descriptions to reflect the `complete_goal` naming.
|
|
52
|
+
|
|
53
|
+
### Tests
|
|
54
|
+
|
|
55
|
+
- Updated e2e tests to verify `complete_goal` accepts calls without status.
|
|
56
|
+
- Added e2e test verifying `complete_goal` rejects invalid explicit status.
|
|
57
|
+
|
|
58
|
+
## 0.15.1 (2026-05-28)
|
|
59
|
+
|
|
60
|
+
### Fixes
|
|
61
|
+
|
|
62
|
+
- Fixed settings file reference in storage writes.
|
|
63
|
+
|
|
64
|
+
### Documentation
|
|
65
|
+
|
|
66
|
+
- Reorganized README settings documentation for clarity.
|
|
67
|
+
|
|
68
|
+
## 0.14.0 (2026-05-27)
|
|
69
|
+
|
|
70
|
+
### Features
|
|
71
|
+
|
|
72
|
+
- **Subtask hierarchy** — tasks can have nested sub-tasks via `subtasks?: GoalTask[]`. Subtask depth controlled by `subtaskDepth` setting (default: 1). Deep subtrees are rejected at proposal.
|
|
73
|
+
- **Lightweight subtasks** — `lightweightSubtasks?: boolean` on tasks. When true, parent can complete regardless of subtask status. Full subtasks require all sub-items completed first.
|
|
74
|
+
- **Per-task contracts** — `propose_task_list` supports optional `verificationContract` per task. If set, `complete_task` requires a non-empty `verificationSummary`.
|
|
75
|
+
- **Task list block** — tasks are listed in prompts with checkboxes and status indicators.
|
|
76
|
+
|
|
77
|
+
### Tests
|
|
78
|
+
|
|
79
|
+
- Added e2e tests for goal creation with task list, scroll fix, and subtask validation.
|
|
80
|
+
|
|
81
|
+
## 0.13.0 (2026-05-22)
|
|
82
|
+
|
|
83
|
+
### Features
|
|
84
|
+
|
|
85
|
+
- **Verification contract system** — goals can include a `Verification contract:` section. Extracted and stored on the goal record. `complete_goal` rejects calls without `verificationSummary` when a contract is set.
|
|
86
|
+
- **Per-goal verification contracts** — the contract is extracted during goal drafting and enforced by tools and prompts.
|
|
87
|
+
- **`complete_goal` `testResults` removed** — replaced with `verificationSummary`. The old structured test results interface is gone.
|
|
88
|
+
- **Auditor integration** — the independent completion auditor receives both the `verificationContract` and `verificationSummary` and cross-checks claims against real artifacts.
|
|
89
|
+
|
|
90
|
+
### Tests
|
|
91
|
+
|
|
92
|
+
- Updated verification contract tests.
|
|
93
|
+
|
|
94
|
+
## 0.12.0 (2026-04-29)
|
|
95
|
+
|
|
96
|
+
### Features
|
|
97
|
+
|
|
98
|
+
- **Task list system** — `propose_task_list` tool with confirmation dialog. Tasks stored on goal record, rendered in prompts and widget, serialized to disk.
|
|
99
|
+
- **Unified goal + task acceptance** — `propose_goal_draft` accepts optional `tasks` array. Single dialog shows goal + task list together.
|
|
100
|
+
- **`complete_task` and `skip_task` tools** — per-task completion with evidence/verificationSummary. Neither stops the turn.
|
|
101
|
+
- **`update_goal` renamed to `complete_goal`** — the core completion tool now uses `complete_goal({status: "complete"})` and requires explicit status acceptance.
|
|
102
|
+
- **Completion report heading fix** — the report now shows `Goal complete.` instead of `Goal audit approved.` when no contract or auditor is involved.
|
|
103
|
+
|
|
104
|
+
### Tests
|
|
105
|
+
|
|
106
|
+
- Full task lifecycle tests (policy, round-trip, render, edge cases).
|
|
107
|
+
- Verification contract tests for both goal-level and per-task contracts.
|
|
108
|
+
|
|
109
|
+
## 0.11.0 (2026-04-23)
|
|
110
|
+
|
|
111
|
+
### Features
|
|
112
|
+
|
|
113
|
+
- **Deferred archival** — goals are archived at `turn_end`, not inline in the tool handler. Prevents premature archiving before the agent sees the audit result.
|
|
114
|
+
- **`propose_goal_tweak`** — sole mechanism for updating the goal objective during `/goal-tweak`. Uses the same Confirm/Continue Chatting dialog as goal creation.
|
|
115
|
+
- **Focus isolation** — goal focus is stored as a branch-local session entry, not in goal markdown metadata. Multiple sessions can have different focused goals.
|
|
116
|
+
- **Auditor bypass with user confirmation** — `confirmBypassAuditor: true` bypasses the auditor when the user explicitly opts out.
|
|
117
|
+
|
|
118
|
+
### Fixes
|
|
119
|
+
|
|
120
|
+
- Cleaned up lifecycle issues with AbortSignal wiring and timer cleanup.
|
|
121
|
+
|
|
122
|
+
## 0.10.0 (2026-04-15)
|
|
123
|
+
|
|
124
|
+
### Features
|
|
125
|
+
|
|
126
|
+
- **Completion audit system** — independent pi auditor agent verifies completion claims before archiving.
|
|
127
|
+
- **Audit progress** — real-time TUI progress widget with spinner, progress bar, and step labels.
|
|
128
|
+
- **Ledger system** — structured event log for all goal lifecycle events.
|
|
129
|
+
|
|
130
|
+
## 0.9.0 (2026-04-08)
|
|
131
|
+
|
|
132
|
+
### Features
|
|
133
|
+
|
|
134
|
+
- **`goal_question` and `goal_questionnaire`** — structured drafting question tools.
|
|
135
|
+
- **`/goal-settings`** — interactive settings configuration.
|
|
136
|
+
- **Sisyphus goal style** — patient ordered execution with prompt/criteria variant.
|
|
137
|
+
|
|
138
|
+
## 0.8.1 (2026-04-01)
|
|
139
|
+
|
|
140
|
+
### Features
|
|
141
|
+
|
|
142
|
+
- Initial fork from @capyup/pi-goal.
|
|
143
|
+
- Pause/resume/abort lifecycle.
|
|
144
|
+
- Multiple open goals.
|
|
145
|
+
- Auto-continue loop.
|
|
package/extensions/goal-policy.ts
CHANGED
@@ -185,7 +185,7 @@ export function validateTaskCompletion(args: {
|
|
|
185
185
|
}): PolicyValidation {
|
|
186
186
|
if (!args.goal) return { ok: false, message: "No goal is set." };
|
|
187
187
|
if (!args.goal.taskList) return { ok: false, message: "Goal has no task list." };
|
|
188
|
-
const task = args.goal.taskList.tasks
|
|
188
|
+
const task = findTaskInTree(args.goal.taskList.tasks, args.taskId);
|
|
189
189
|
if (!task) return { ok: false, message: `Task "${args.taskId}" not found.` };
|
|
190
190
|
if (task.status === "complete") return { ok: false, message: `Task "${args.taskId}" is already complete.` };
|
|
191
191
|
if (task.status === "skipped") return { ok: false, message: `Task "${args.taskId}" was already skipped.` };
|
|
@@ -199,7 +199,7 @@ export function validateTaskSkip(args: {
|
|
|
199
199
|
}): PolicyValidation {
|
|
200
200
|
if (!args.goal) return { ok: false, message: "No goal is set." };
|
|
201
201
|
if (!args.goal.taskList) return { ok: false, message: "Goal has no task list." };
|
|
202
|
-
const task = args.goal.taskList.tasks
|
|
202
|
+
const task = findTaskInTree(args.goal.taskList.tasks, args.taskId);
|
|
203
203
|
if (!task) return { ok: false, message: `Task "${args.taskId}" not found.` };
|
|
204
204
|
if (task.status === "complete") return { ok: false, message: `Task "${args.taskId}" is already complete.` };
|
|
205
205
|
// Skipped tasks toggle via the executor; reason is only required for first-time skips.
|
|
@@ -241,6 +241,20 @@ export function findSubtaskDepthViolation(tasks: GoalTask[], maxDepth: number):
|
|
|
241
241
|
return undefined;
|
|
242
242
|
}
|
|
243
243
|
|
|
244
|
+
function checkDuplicateTaskIds(tasks: GoalTask[], ids: Set<string>): string | undefined {
|
|
245
|
+
for (const t of tasks) {
|
|
246
|
+
const id = t.id.trim();
|
|
247
|
+
if (!id) return "All tasks must have a non-empty id.";
|
|
248
|
+
if (ids.has(id)) return `Duplicate task id: "${id}".`;
|
|
249
|
+
ids.add(id);
|
|
250
|
+
if (t.subtasks) {
|
|
251
|
+
const childErr = checkDuplicateTaskIds(t.subtasks, ids);
|
|
252
|
+
if (childErr) return childErr;
|
|
253
|
+
}
|
|
254
|
+
}
|
|
255
|
+
return undefined;
|
|
256
|
+
}
|
|
257
|
+
|
|
244
258
|
export function validateTaskListProposal(args: {
|
|
245
259
|
goal: GoalPolicyRecordLike | null;
|
|
246
260
|
tasks: GoalTask[];
|
|
@@ -254,6 +268,11 @@ export function validateTaskListProposal(args: {
|
|
|
254
268
|
if (!t.title.trim()) return { ok: false, message: `Task "${t.id}" must have a non-empty title.` };
|
|
255
269
|
if (ids.has(t.id)) return { ok: false, message: `Duplicate task id: "${t.id}".` };
|
|
256
270
|
ids.add(t.id);
|
|
271
|
+
// Recursively check subtask ids against the same global set
|
|
272
|
+
if (t.subtasks && t.subtasks.length > 0) {
|
|
273
|
+
const childErr = checkDuplicateTaskIds(t.subtasks, ids);
|
|
274
|
+
if (childErr) return { ok: false, message: childErr };
|
|
275
|
+
}
|
|
257
276
|
}
|
|
258
277
|
// Check subtask depth limit
|
|
259
278
|
const maxDepth = args.maxSubtaskDepth ?? 1;
|
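To see why the shared `ids` set matters, the helper added in this hunk can be exercised on a small tree. The sketch below copies `checkDuplicateTaskIds` from the diff and pairs it with a stand-in `GoalTask` interface that keeps only `id` and `subtasks`; the sample task ids are made up for illustration.

```typescript
// Minimal stand-in for the package's GoalTask; only `id` and `subtasks` matter here.
interface GoalTask {
  id: string;
  subtasks?: GoalTask[];
}

// Same logic as the helper in the diff: one Set is threaded through the whole tree.
function checkDuplicateTaskIds(tasks: GoalTask[], ids: Set<string>): string | undefined {
  for (const t of tasks) {
    const id = t.id.trim();
    if (!id) return "All tasks must have a non-empty id.";
    if (ids.has(id)) return `Duplicate task id: "${id}".`;
    ids.add(id);
    if (t.subtasks) {
      const childErr = checkDuplicateTaskIds(t.subtasks, ids);
      if (childErr) return childErr;
    }
  }
  return undefined;
}

// "build" appears under two different parents, so a per-level check would miss it.
const proposal: GoalTask[] = [
  { id: "setup", subtasks: [{ id: "install" }, { id: "build" }] },
  { id: "release", subtasks: [{ id: "build" }] },
];
const duplicateError = checkDuplicateTaskIds(proposal, new Set<string>());
console.log(duplicateError); // Duplicate task id: "build".
```

Because the set is shared rather than recreated per level, a collision between cousins, or between a parent and its own subtask, is reported on the second occurrence.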
|
package/extensions/goal-questionnaire.ts
CHANGED
@@ -26,6 +26,7 @@ export interface GoalQuestionnaireResult {
|
|
|
26
26
|
questions: GoalQuestionnaireQuestion[];
|
|
27
27
|
answers: GoalQuestionnaireAnswer[];
|
|
28
28
|
cancelled: boolean;
|
|
29
|
+
auditorEnabled?: boolean;
|
|
29
30
|
}
|
|
30
31
|
|
|
31
32
|
export type ProposalDecision = "confirm" | "continue";
|
|
@@ -82,7 +83,7 @@ export function proposalDialogFailureMessage(error: unknown): string {
|
|
|
82
83
|
* the internal draft-confirm prompt. This keeps pi-goal self-contained and
|
|
83
84
|
* avoids depending on external question/questionnaire packages.
|
|
84
85
|
*/
|
|
85
|
-
export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions: GoalQuestionnaireQuestion[]): Promise<GoalQuestionnaireResult> {
|
|
86
|
+
export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions: GoalQuestionnaireQuestion[], auditorToggleInit?: { defaultEnabled: boolean }): Promise<GoalQuestionnaireResult> {
|
|
86
87
|
if (!ctx.hasUI) {
|
|
87
88
|
return { questions: [], answers: [], cancelled: true };
|
|
88
89
|
}
|
|
@@ -102,6 +103,7 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
102
103
|
let inputMode = false;
|
|
103
104
|
let inputQuestionId: string | null = null;
|
|
104
105
|
let cachedLines: string[] | undefined;
|
|
106
|
+
let auditorEnabled = auditorToggleInit?.defaultEnabled ?? true;
|
|
105
107
|
const answers = new Map<string, GoalQuestionnaireAnswer>();
|
|
106
108
|
const drafts = new Map<string, string>();
|
|
107
109
|
|
|
@@ -126,7 +128,7 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
126
128
|
// Restore hardware cursor now that the dialog is closing
|
|
127
129
|
tui.setShowHardwareCursor(wasHardwareCursorShown);
|
|
128
130
|
const ordered = questions.map((q) => answers.get(q.id)).filter((a): a is GoalQuestionnaireAnswer => !!a);
|
|
129
|
-
done({ questions, answers: ordered, cancelled });
|
|
131
|
+
done({ questions, answers: ordered, cancelled, auditorEnabled: auditorToggleInit ? auditorEnabled : undefined });
|
|
130
132
|
}
|
|
131
133
|
|
|
132
134
|
function currentQuestion(): GoalQuestionnaireQuestion | undefined {
|
|
@@ -272,6 +274,13 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
272
274
|
return;
|
|
273
275
|
}
|
|
274
276
|
|
|
277
|
+
// Auditor toggle hotkey
|
|
278
|
+
if (matchesKey(data, "a") && auditorToggleInit) {
|
|
279
|
+
auditorEnabled = !auditorEnabled;
|
|
280
|
+
refresh();
|
|
281
|
+
return;
|
|
282
|
+
}
|
|
283
|
+
|
|
275
284
|
if (matchesKey(data, Key.enter) && q) {
|
|
276
285
|
if (q.options.length === 0 || opts[optionIndex]?.isCustom) {
|
|
277
286
|
inputMode = true;
|
|
@@ -293,7 +302,9 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
293
302
|
if (matchesKey(data, Key.escape)) submit(true);
|
|
294
303
|
}
|
|
295
304
|
|
|
296
|
-
|
|
305
|
+
const MAX_CONTEXT_LINES = 12; // prevent viewport jumping by capping context display
|
|
306
|
+
|
|
307
|
+
function render(width: number): string[] {
|
|
297
308
|
if (cachedLines) return cachedLines;
|
|
298
309
|
const safeWidth = Math.max(20, width);
|
|
299
310
|
const lines: string[] = [];
|
|
@@ -302,6 +313,18 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
302
313
|
const add = (s: string) => lines.push(truncateToWidth(s, safeWidth, "…", true));
|
|
303
314
|
const addWrapped = (s: string) => lines.push(...wrapTextWithAnsi(s, safeWidth));
|
|
304
315
|
|
|
316
|
+
/** Wraps text and caps at MAX_CONTEXT_LINES to prevent viewport jumping. */
|
|
317
|
+
const addContextWrapped = (s: string) => {
|
|
318
|
+
const wrapped = wrapTextWithAnsi(s, safeWidth);
|
|
319
|
+
if (wrapped.length <= MAX_CONTEXT_LINES) {
|
|
320
|
+
lines.push(...wrapped);
|
|
321
|
+
} else {
|
|
322
|
+
lines.push(...wrapped.slice(0, MAX_CONTEXT_LINES));
|
|
323
|
+
const overflow = wrapped.length - MAX_CONTEXT_LINES;
|
|
324
|
+
lines.push(theme.fg("dim", ` ... ${overflow} more line${overflow === 1 ? "" : "s"} (full details after confirmation)`));
|
|
325
|
+
}
|
|
326
|
+
};
|
|
327
|
+
|
|
305
328
|
add(theme.fg("accent", "─".repeat(safeWidth)));
|
|
306
329
|
if (isMulti) {
|
|
307
330
|
const tabs: string[] = ["← "];
|
|
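The capping step inside `addContextWrapped` can be isolated as a pure function for illustration. In this sketch, `capContextLines` is a hypothetical name: it takes lines that are already wrapped and returns plain strings, leaving out the `wrapTextWithAnsi` and `theme.fg` calls that the real helper uses.

```typescript
const MAX_CONTEXT_LINES = 12;

// Hypothetical pure variant of the capping logic shown in the hunk above.
function capContextLines(wrapped: string[], max: number = MAX_CONTEXT_LINES): string[] {
  if (wrapped.length <= max) return [...wrapped];
  const overflow = wrapped.length - max;
  return [
    ...wrapped.slice(0, max),
    ` ... ${overflow} more line${overflow === 1 ? "" : "s"} (full details after confirmation)`,
  ];
}

const fifteen = Array.from({ length: 15 }, (_, i) => `line ${i + 1}`);
const capped = capContextLines(fifteen);
console.log(capped.length); // 13: twelve context lines plus the overflow notice
```

The dialog therefore never grows past a fixed height for the context section, which is what keeps the options visible regardless of how long the goal draft is.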
@@ -331,7 +354,7 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
331
354
|
|
|
332
355
|
if (inputMode && q) {
|
|
333
356
|
addWrapped(theme.fg("text", ` ${q.question}`));
|
|
334
|
-
if (q.context)
|
|
357
|
+
if (q.context) addContextWrapped(theme.fg("muted", ` ${q.context}`));
|
|
335
358
|
lines.push("");
|
|
336
359
|
if (q.options.length > 0) {
|
|
337
360
|
renderOptions();
|
|
@@ -352,7 +375,15 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
352
375
|
add(allAnswered() ? theme.fg("success", " Press Enter to submit") : theme.fg("warning", ` Unanswered: ${questions.filter((qq) => !answers.has(qq.id)).map((qq) => qq.id).join(", ")}`));
|
|
353
376
|
} else if (q) {
|
|
354
377
|
addWrapped(theme.fg("text", ` ${q.question}`));
|
|
355
|
-
if (q.context)
|
|
378
|
+
if (q.context) addContextWrapped(theme.fg("muted", ` ${q.context}`));
|
|
379
|
+
// Auditor toggle line between context and options
|
|
380
|
+
if (auditorToggleInit) {
|
|
381
|
+
const circle = auditorEnabled ? "●" : "○";
|
|
382
|
+
const label = auditorEnabled ? "Auditor enabled" : "Auditor disabled";
|
|
383
|
+
const color = auditorEnabled ? "success" : "warning";
|
|
384
|
+
add(theme.fg(color, ` ${circle} ${label}`) + theme.fg("dim", " (press 'a' to toggle)"));
|
|
385
|
+
lines.push("");
|
|
386
|
+
}
|
|
356
387
|
const existing = answers.get(q.id);
|
|
357
388
|
if (existing) add(theme.fg("dim", ` Current: ${existing.wasCustom ? "(wrote) " : ""}${existing.answer}`));
|
|
358
389
|
lines.push("");
|
|
@@ -361,7 +392,10 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
|
|
|
361
392
|
}
|
|
362
393
|
|
|
363
394
|
lines.push("");
|
|
364
|
-
if (!inputMode)
|
|
395
|
+
if (!inputMode) {
|
|
396
|
+
const auditorHint = auditorToggleInit ? " • a toggle auditor" : "";
|
|
397
|
+
add(theme.fg("dim", isMulti ? " Tab/←→ navigate • ↑↓ select • Enter confirm • Esc cancel" + auditorHint : " ↑↓ navigate • Enter select • Esc cancel" + auditorHint));
|
|
398
|
+
}
|
|
365
399
|
add(theme.fg("accent", "─".repeat(safeWidth)));
|
|
366
400
|
cachedLines = lines;
|
|
367
401
|
return lines;
|
|
@@ -379,7 +413,8 @@ export async function showProposalDialog(
|
|
|
379
413
|
ctx: ExtensionContext,
|
|
380
414
|
confirmationText: string,
|
|
381
415
|
focus: GoalDraftingFocus,
|
|
382
|
-
|
|
416
|
+
defaultAuditorEnabled?: boolean,
|
|
417
|
+
): Promise<{ decision: ProposalDecision; auditorEnabled: boolean }> {
|
|
383
418
|
const headerTitle = focus === "sisyphus" ? "Confirm Sisyphus Goal Draft" : "Confirm Goal Draft";
|
|
384
419
|
const result = await runGoalQuestionnaire(ctx, [{
|
|
385
420
|
id: "confirm",
|
|
@@ -388,11 +423,12 @@ export async function showProposalDialog(
|
|
|
388
423
|
options: ["Confirm — create this goal now", "Continue chatting — keep refining"],
|
|
389
424
|
recommended: 0,
|
|
390
425
|
allowCustom: false,
|
|
391
|
-
}]);
|
|
392
|
-
|
|
426
|
+
}], defaultAuditorEnabled !== undefined ? { defaultEnabled: defaultAuditorEnabled } : undefined);
|
|
427
|
+
const decision = proposalDecisionFromQuestionnaireResult({
|
|
393
428
|
cancelled: result.cancelled,
|
|
394
429
|
answer: result.answers[0]?.answer,
|
|
395
430
|
});
|
|
431
|
+
return { decision, auditorEnabled: result.auditorEnabled ?? true };
|
|
396
432
|
}
|
|
397
433
|
|
|
398
434
|
export function registerQuestionnaireTools(pi: ExtensionAPI): void {
|
|
package/extensions/goal-record.ts
CHANGED
@@ -45,6 +45,7 @@ export interface GoalRecord {
|
|
|
45
45
|
// Set by the agent's pause_goal tool. Cleared when the goal becomes active again.
|
|
46
46
|
pauseReason?: string;
|
|
47
47
|
pauseSuggestedAction?: string;
|
|
48
|
+
skipAuditor?: boolean;
|
|
48
49
|
taskList?: GoalTaskList;
|
|
49
50
|
/** Plain-text description of what verification evidence is required before completing this goal. */
|
|
50
51
|
verificationContract?: string;
|
|
@@ -247,6 +248,7 @@ export function normalizeGoalRecord(value: unknown): GoalRecord | null {
|
|
|
247
248
|
stopReason: raw.stopReason === "agent" || raw.stopReason === "user" ? raw.stopReason : undefined,
|
|
248
249
|
pauseReason: typeof raw.pauseReason === "string" && raw.pauseReason.trim() ? raw.pauseReason : undefined,
|
|
249
250
|
pauseSuggestedAction: typeof raw.pauseSuggestedAction === "string" && raw.pauseSuggestedAction.trim() ? raw.pauseSuggestedAction : undefined,
|
|
251
|
+
skipAuditor: raw.skipAuditor === true ? true : undefined,
|
|
250
252
|
taskList: normalizeTaskList(raw.taskList),
|
|
251
253
|
verificationContract: typeof raw.verificationContract === "string" ? raw.verificationContract : undefined,
|
|
252
254
|
};
|
|
package/extensions/goal-settings.ts
CHANGED
@@ -143,6 +143,14 @@ export function loadGoalSettings(cwd: string, env: NodeJS.ProcessEnv = process.e
|
|
|
143
143
|
* Save settings to the unified settings file on disk.
|
|
144
144
|
* Persists only non-default values using the canonical key names.
|
|
145
145
|
*/
|
|
146
|
+
/**
|
|
147
|
+
* Determine whether the auditor should be enabled by default based on settings.
|
|
148
|
+
* The auditor is enabled by default unless settings.disabled === true.
|
|
149
|
+
*/
|
|
150
|
+
export function isAuditorEnabledByDefault(settings: GoalSettings): boolean {
|
|
151
|
+
return settings.disabled !== true;
|
|
152
|
+
}
|
|
153
|
+
|
|
146
154
|
export function saveGoalSettingsFileConfig(cwd: string, settings: GoalSettings): GoalSettings {
|
|
147
155
|
const clean: GoalSettings = {};
|
|
148
156
|
const provider = asNonEmptyString(settings.provider);
|
package/extensions/goal.ts
CHANGED
|
@@ -21,6 +21,7 @@ import {
|
|
|
21
21
|
} from "./goal-auditor.ts";
|
|
22
22
|
import {
|
|
23
23
|
goalSettingsPath,
|
|
24
|
+
isAuditorEnabledByDefault,
|
|
24
25
|
loadGoalSettings,
|
|
25
26
|
loadGoalSettingsFileConfig,
|
|
26
27
|
saveGoalSettingsFileConfig,
|
|
@@ -1578,7 +1579,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
|
|
|
1578
1579
|
}
|
|
1579
1580
|
}
|
|
1580
1581
|
const lifecycleHint = view && (view.status === "active" || view.status === "paused")
|
|
1581
|
-
? "\nLifecycle tools: if evidence proves the objective is satisfied, call complete_goal({
|
|
1582
|
+
? "\nLifecycle tools: if evidence proves the objective is satisfied, call complete_goal({verificationSummary: \"evidence\"}); if blocked, call pause_goal({reason, suggestedAction?}); if abandoned/obsolete/unsafe, call abort_goal({reason}). For file or shell work, use the normal work tools directly (write/read/bash/edit); do not call get_goal repeatedly just to look for tools."
|
|
1582
1583
|
: "";
|
|
1583
1584
|
const text = view
|
|
1584
1585
|
? `${detailedSummary(view)}${lifecycleHint}${nudge}${otherCount > 0 ? `\nOther open goals: ${otherCount} (human can run /goal-list or /goal-focus)` : ""}`
|
|
@@ -1737,20 +1738,25 @@ export default function goalExtension(pi: ExtensionAPI): void {
 const draftSummary = buildDraftConfirmationText({
 focus: activeIntent.focus,
 originalTopic: activeIntent.originalTopic,
-
+// Tasks section appears FIRST in the context so it stays visible
+// even when the dialog caps long context lines.
+objective: taskSummarySection ? `${taskSummarySection}
+
+${objective}` : objective,
 autoContinue: autoContinueFlag,
 });

 const headless = shouldAutoConfirmProposal({ hasUI: ctx.hasUI, autoConfirmEnv: process.env.PI_GOAL_AUTO_CONFIRM });

-let decision: "confirm" | "continue";
+let decision: { decision: "confirm" | "continue"; auditorEnabled: boolean };
+const auditorDefault = isAuditorEnabledByDefault(loadGoalSettings(ctx.cwd));
 if (headless) {
 // Headless: auto-confirm (tests and non-TUI sessions).
-decision = "confirm";
+decision = { decision: "confirm", auditorEnabled: auditorDefault };
 } else {
 // TUI: show overlay dialog.
 try {
-decision = await showProposalDialog(ctx, draftSummary, activeIntent.focus);
+decision = await showProposalDialog(ctx, draftSummary, activeIntent.focus, auditorDefault);
 } catch (err) {
 const message = proposalDialogFailureMessage(err);
 ctx.ui.notify(message, "error");
|
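The hunk above replaces the bare string decision with an object that also carries the auditor toggle. A minimal sketch of that shape follows. The field names `decision` and `auditorEnabled` come from the diff; the type name `ProposalDecision` and the helpers `headlessDecision` and `isConfirmed` are hypothetical and exist only for illustration.

```typescript
// Hypothetical sketch: only the field names `decision` and `auditorEnabled`
// are taken from the diff; every other name here is illustrative.
type ProposalDecision = {
  decision: "confirm" | "continue";
  auditorEnabled: boolean;
};

// Headless sessions never show a dialog, so the toggle keeps its default.
function headlessDecision(auditorDefault: boolean): ProposalDecision {
  return { decision: "confirm", auditorEnabled: auditorDefault };
}

// Callers now compare the nested field instead of the value itself.
function isConfirmed(result: ProposalDecision): boolean {
  return result.decision === "confirm";
}
```

Returning one object keeps the confirm/continue choice and the toggle together, which is presumably why the later `decision === "confirm"` checks in this diff become `decision.decision === "confirm"`.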
@@ -1761,7 +1767,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 }
 }

-if (decision === "confirm") {
+if (decision.decision === "confirm") {
 // Extract verification contract from objective before creation
 const { objective: cleanedObjective, verificationContract } = extractVerificationContract(objective);
 const config: GoalCreationConfig = {
@@ -1772,6 +1778,12 @@ export default function goalExtension(pi: ExtensionAPI): void {
 confirmationIntent = null;
 replaceGoal(config, ctx, false, verificationContract);

+// Set skipAuditor on the goal if user toggled auditor off
+if (!decision.auditorEnabled && state.goal) {
+state.goal = { ...state.goal, skipAuditor: true };
+setGoal(state.goal, ctx);
+}
+
 // Set task list if provided
 if (tasksToCreate && tasksToCreate.length > 0 && state.goal) {
 const now = nowIso();
@@ -1887,12 +1899,12 @@ export default function goalExtension(pi: ExtensionAPI): void {

 const headless = shouldAutoConfirmProposal({ hasUI: ctx.hasUI, autoConfirmEnv: process.env.PI_GOAL_AUTO_CONFIRM });

-let decision: "confirm" | "continue";
+let decision: { decision: "confirm" | "continue"; auditorEnabled: boolean };
 if (headless) {
-decision = "confirm";
+decision = { decision: "confirm", auditorEnabled: !state.goal.skipAuditor };
 } else {
 try {
-decision = await showProposalDialog(ctx, draftSummary, state.goal.sisyphus ? "sisyphus" : "goal");
+decision = await showProposalDialog(ctx, draftSummary, state.goal.sisyphus ? "sisyphus" : "goal", !state.goal.skipAuditor);
 } catch (err) {
 const message = proposalDialogFailureMessage(err);
 ctx.ui.notify(message, "error");
@@ -1903,7 +1915,11 @@ export default function goalExtension(pi: ExtensionAPI): void {
 }
 }

-if (decision === "confirm") {
+if (decision.decision === "confirm") {
+// Persist any auditor toggle change
+if (state.goal) {
+state.goal = { ...state.goal, skipAuditor: !decision.auditorEnabled };
+}
 // Extract verification contract from revised objective
 const { objective: cleanedObjective, verificationContract } = extractVerificationContract(newObjective);
 // Apply the tweak: write the new objective to disk authoritatively.
|
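The two confirmation paths in this diff store the toggle differently: goal creation sets `skipAuditor: true` only when the auditor was switched off, while the tweak path always overwrites the flag with `!decision.auditorEnabled`. The sketch below contrasts the two under stated assumptions. `GoalLike`, `applyToggleOnCreate`, and `applyToggleOnUpdate` are hypothetical names, and persistence (`setGoal`, disk writes) is deliberately left out.

```typescript
// Hypothetical sketch: `GoalLike` stands in for the real GoalRecord, of which
// only the optional `skipAuditor` flag is modeled here.
interface GoalLike {
  id: string;
  skipAuditor?: boolean;
}

// Creation path: record the flag only when the user turned the auditor off.
function applyToggleOnCreate(goal: GoalLike, auditorEnabled: boolean): GoalLike {
  return auditorEnabled ? goal : { ...goal, skipAuditor: true };
}

// Tweak path: always overwrite, so re-enabling the auditor clears the flag.
function applyToggleOnUpdate(goal: GoalLike, auditorEnabled: boolean): GoalLike {
  return { ...goal, skipAuditor: !auditorEnabled };
}
```

Both helpers return a new object rather than mutating the input, mirroring the spread-based updates in the hunks above.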
@@ -1981,7 +1997,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 description: "Mark the current active or paused pi goal complete. Only call this when the goal objective is actually achieved — no required work remains.",
 promptSnippet: "Mark the active or paused pi goal complete — only when every requirement is satisfied.",
 promptGuidelines: [
-"Call complete_goal
+"Call complete_goal only when the pi goal objective has actually been achieved and no required work remains.",
 "Before calling complete_goal, you MUST provide a verificationSummary that addresses every success criterion and any verification contract on the goal. Fold all verification evidence (test output, grep results, requirements coverage) into this single field.",
 "The auditor is authoritative: completion is archived only if the auditor report ends with <approved/>. If it ends with <disapproved/> or no approval marker, complete_goal is rejected and the goal remains open.",
 "Do NOT call complete_goal if any work remains, even if substantial progress was made. Do not use it merely because work is stopping, tests passed, or you are blocked.",
@@ -2001,7 +2017,8 @@ export default function goalExtension(pi: ExtensionAPI): void {
 reconcileFocusedGoalFromDisk(ctx);

 // -- Phase 2: Status validation --
-
+const effectiveStatus = params.status ?? COMPLETE_STATUS;
+if (effectiveStatus !== COMPLETE_STATUS) {
 throw new Error("complete_goal requires status=complete when marking a goal complete.");
 }

|
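The status check above now treats an omitted `status` as complete by applying the nullish coalescing operator. A small self-contained sketch of that behavior follows. The value `"complete"` for `COMPLETE_STATUS` is an assumption inferred from the error message, and `validateCompletionStatus` is a hypothetical helper name.

```typescript
// Assumed value: the diff does not show the definition of COMPLETE_STATUS.
const COMPLETE_STATUS = "complete";

// Hypothetical helper that isolates the validation shown in the hunk.
function validateCompletionStatus(status?: string | null): void {
  // `??` substitutes the default only for null or undefined, so an explicit
  // wrong value (including an empty string) is still rejected.
  const effectiveStatus = status ?? COMPLETE_STATUS;
  if (effectiveStatus !== COMPLETE_STATUS) {
    throw new Error("complete_goal requires status=complete when marking a goal complete.");
  }
}
```

Using `??` rather than `||` matters here: `||` would also replace an empty string with the default and silently accept it.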
@@ -2059,7 +2076,58 @@ export default function goalExtension(pi: ExtensionAPI): void {
 ? `${settings.provider ?? "default"}/${settings.model ?? "default"}${settings.thinkingLevel ? `:${settings.thinkingLevel}` : ""}`
 : "default";

-// Check if auditor is disabled
+// Check if auditor is disabled per-goal (user toggled it off during goal confirmation)
+if (auditTarget.skipAuditor) {
+pi.sendMessage<GoalAuditEventDetails>({
+customType: GOAL_AUDIT_ENTRY,
+content: `Goal completed — per-goal auditor disabled.`,
+display: true,
+details: { phase: "skipped", goalId: auditTarget.id, auditor: auditorLabel },
+});
+try {
+appendGoalEvent(ctx, {
+type: "audit_skipped",
+goalId: auditTarget.id,
+reason: "disabled",
+provider: settings.provider,
+model: settings.model,
+thinkingLevel: settings.thinkingLevel,
+at: nowIso(),
+});
+} catch {
+// Ledger append failure should not block completion
+}
+accountProgress(ctx);
+auditProgress = null;
+goalWidgetComponent?.invalidate();
+state.goal = {
+...auditTarget,
+status: "complete",
+stopReason: "agent",
+updatedAt: nowIso(),
+};
+state.goal = writeActiveGoalFile(ctx, state.goal);
+pi.appendEntry(STATE_ENTRY, goalDetails(state.goal));
+turnStoppedFor = state.goal?.id ?? null;
+resetGetGoalNudgeState(state.goal?.id);
+syncGoalTools();
+updateUI(ctx);
+return {
+content: [{
+type: "text",
+text: buildCompletionReport({
+detailedSummary: detailedSummary(state.goal),
+completionSummary: params.completionSummary,
+auditSkippedReason: "per-goal auditor disabled",
+taskSummary: state.goal?.taskList ? buildTaskSummary(state.goal.taskList) : null,
+}),
+}],
+details: goalDetails(state.goal),
+terminate: true,
+};
+}
+
+// Check if auditor is disabled in settings
 if (settings.disabled === true) {
 if (params.confirmBypassAuditor !== true) {
 return {
|
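The hunk above puts the per-goal `skipAuditor` check ahead of the existing settings-level check, so the per-goal flag wins when both apply. The sketch below models only that ordering. `routeAudit` and the `AuditRoute` labels are hypothetical, and what the package actually does in each settings-disabled branch is not visible in this diff, so those labels describe the branch taken rather than its effect.

```typescript
// Hypothetical labels for the branch taken; not identifiers from pi-goal-x.
type AuditRoute =
  | "skip-per-goal"
  | "settings-disabled-unconfirmed"
  | "settings-disabled-confirmed"
  | "run-auditor";

function routeAudit(
  goal: { skipAuditor?: boolean },
  settings: { disabled?: boolean },
  confirmBypassAuditor?: boolean,
): AuditRoute {
  // 1. The per-goal toggle is checked first and completes without an audit.
  if (goal.skipAuditor) return "skip-per-goal";
  // 2. Globally disabled auditor: the diff shows an early return unless the
  //    caller passed confirmBypassAuditor === true.
  if (settings.disabled === true) {
    return confirmBypassAuditor === true
      ? "settings-disabled-confirmed"
      : "settings-disabled-unconfirmed";
  }
  // 3. Otherwise the auditor is expected to run (assumed from the prompts).
  return "run-auditor";
}
```

One visible consequence of this ordering is that a goal with `skipAuditor` set never reaches the `confirmBypassAuditor` check.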
@@ -2524,7 +2592,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 promptSnippet: "Legacy no-op: Sisyphus no longer requires step_complete.",
 promptGuidelines: [
 "Do not call this in normal operation. Sisyphus mode shares the normal goal lifecycle and completion gate.",
-"
+"Call complete_goal only when the full objective is actually satisfied.",
 ],
 parameters: Type.Object({
 stepIndex: Type.Integer({ minimum: 1, description: "Legacy step index. Ignored." }),
@@ -2534,7 +2602,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 executionMode: "sequential",
 async execute(_toolCallId, _params, _signal, _onUpdate, _ctx) {
 return {
-content: [{ type: "text", text: "step_complete is no longer required. Sisyphus is now a prompt/criteria style that uses the normal goal lifecycle. Continue working from the objective, or call complete_goal
+content: [{ type: "text", text: "step_complete is no longer required. Sisyphus is now a prompt/criteria style that uses the normal goal lifecycle. Continue working from the objective, or call complete_goal only when the full objective is actually satisfied." }],
 details: goalDetails(state.goal),
 };
 },
@@ -2648,13 +2716,17 @@ export default function goalExtension(pi: ExtensionAPI): void {
 const gateLabel = blockCompletion ? " (blockCompletion enabled)" : "";
 const proposalText = [`Proposed task list${gateLabel}:`, "", ...taskLines].join("\n");

-const
-if (decision !== "confirm") {
+const dialogResult = await showProposalDialog(ctx, proposalText, "goal", !state.goal?.skipAuditor);
+if (dialogResult.decision !== "confirm") {
 return {
 content: [{ type: "text", text: "Task list proposal declined." }],
 details: goalDetails(state.goal),
 };
 }
+// Persist any auditor toggle change
+if (state.goal) {
+state.goal = { ...state.goal, skipAuditor: !dialogResult.auditorEnabled };
+}

 // Apply
 state.goal = mergeGoalPromptFromDisk(ctx, state.goal);
@@ -3165,7 +3237,7 @@ promptGuidelines: [
 // Ledger read failure should not break the prompt
 }
 return {
-systemPrompt: `${currentSystemPrompt()}\n\n[PI GOAL PAUSED goalId=${current.id}]\n${untrustedObjectiveBlock(current)}${pauseExtras.join("\n")}${auditorExtra}\n\nThe goal is paused. Do not autonomously continue substantive work unless the user resumes it with /goal-resume. If the user explicitly asks to finish or abandon the paused goal, or the objective is already satisfied based on available evidence, you may call complete_goal
+systemPrompt: `${currentSystemPrompt()}\n\n[PI GOAL PAUSED goalId=${current.id}]\n${untrustedObjectiveBlock(current)}${pauseExtras.join("\n")}${auditorExtra}\n\nThe goal is paused. Do not autonomously continue substantive work unless the user resumes it with /goal-resume. If the user explicitly asks to finish or abandon the paused goal, or the objective is already satisfied based on available evidence, you may call complete_goal or abort_goal without resuming. Do not call pause_goal again.`,
 };
 }
 const activeGoal = state.goal;
package/extensions/prompts/goal-prompts.ts
CHANGED
|
@@ -114,7 +114,7 @@ export function sisyphusDisciplineBlock(goal: GoalRecord): string {
 "- Work patiently and sequentially. Do not rush to a shortcut just because it looks more efficient.",
 "- Verify each meaningful action against the objective's own success criteria before moving on.",
 "- If a step is unclear, blocked, fails, or seems wrong: call pause_goal({reason, suggestedAction?}) instead of inventing a workaround.",
-"- Call complete_goal
+"- Call complete_goal only after the full objective is actually satisfied. There is no separate step counter or step_complete requirement.",
 ].join("\n");
 }

@@ -132,11 +132,14 @@ Available work tools for pursuing the active goal include write, read, bash, and

 After goal confirmation, you may call propose_task_list once to set up an initial task list if the objective decomposes into trackable milestones. If a task list already exists, only restructure it when the user asks or the goal structurally changes — do not restructure autonomously. Do not add a task list for simple, single-step goals.

+[TASK WORKFLOW]
+Use tasks and subtasks as PROGRESS TRACKERS during your work — not as a post-hoc checklist to batch-mark at the end. As soon as you finish a concrete unit of work that corresponds to a task or subtask, call complete_task immediately with evidence of what you did. The system enforces that all subtasks must be completed (or skipped) before their parent task can be completed, so work from the leaves up: finish subtasks first, then mark the parent task complete. If a subtask is blocked and cannot proceed, call pause_goal rather than skipping it. This keeps the task list accurate and prevents the "all work done, now batch-mark everything" pattern.
+
 To ask the user a structured question (e.g. when the user's spec changes and you need to clarify before updating the goal), use goal_question. It opens a question dialog and returns the user's answer as tool output. Use plain conversation for simple clarifications.

 Task skipping restrictions: Only skip a task when the user explicitly asks you to, or when the task directly contradicts a hard constraint (e.g. an impossible requirement). Do NOT autonomously skip tasks to avoid work, or because they look optional, inconvenient, or out of scope. When in doubt, ask the user first. Calling skip_task on an already-skipped task toggles it back to pending (unskip).

-Keep this goal in force until it is actually achieved. Do not pause for confirmation just because a phase, chapter, file, or checklist item is finished. At each natural stopping point, compare every explicit requirement with concrete evidence from the workspace/session. If the objective is complete, call complete_goal
+Keep this goal in force until it is actually achieved. Do not pause for confirmation just because a phase, chapter, file, or checklist item is finished. At each natural stopping point, compare every explicit requirement with concrete evidence from the workspace/session. If the objective is complete, call complete_goal and provide a verificationSummary; complete_goal will launch an independent pi auditor agent and only archive if that auditor returns <approved/>. If it is not complete, choose the next concrete action and do it.

 The completion auditor is independent and semantic, not a paperwork checklist. It may inspect files and command output, and it will reject scaffold-only, alpha, template, proxy-metric, or weakly verified completions with <disapproved/>.

|
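The [TASK WORKFLOW] text added above tells the agent to work from the leaves up because a parent cannot close while its subtasks are open. A minimal sketch of such a gate follows, assuming a simplified task shape. `TaskNode`, `isSettled`, and `canCompleteTask` are hypothetical names, and the task status value `"complete"` is assumed; this is not the package's own `checkSubtasksComplete`.

```typescript
// Hypothetical, simplified task shape; the real GoalTask type is not shown here.
type TaskStatus = "pending" | "complete" | "skipped";

interface TaskNode {
  id: string;
  status: TaskStatus;
  lightweightSubtasks?: boolean;
  subtasks?: TaskNode[];
}

// A subtask counts as settled once it is completed or skipped.
function isSettled(task: TaskNode): boolean {
  return task.status === "complete" || task.status === "skipped";
}

// A parent may close when it uses lightweight subtasks, or when every
// subtask is settled and itself passes the same check recursively.
function canCompleteTask(task: TaskNode): boolean {
  if (task.lightweightSubtasks === true) return true;
  return (task.subtasks ?? []).every(
    (child) => isSettled(child) && canCompleteTask(child),
  );
}
```

A task with no subtasks passes trivially, which is what makes leaf-first completion the only order that never hits the gate.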
@@ -174,6 +177,9 @@ export function continuationPrompt(goal: GoalRecord, settings?: GoalSettings): s
 "",
 "Task skipping restrictions: Only skip a task when the user explicitly asks you to, or when the task directly contradicts a hard constraint (e.g. an impossible requirement). Do NOT autonomously skip tasks to avoid work, or because they look optional, inconvenient, or out of scope. When in doubt, ask the user first. Calling skip_task on an already-skipped task toggles it back to pending (unskip).",
 "",
+"[TASK WORKFLOW]",
+"Use tasks and subtasks as PROGRESS TRACKERS during your work — not as a post-hoc checklist to batch-mark at the end. As soon as you finish a concrete unit of work that corresponds to a task or subtask, call complete_task immediately with evidence of what you did. Subtasks must be completed (or skipped) before their parent task can be completed, so work from the leaves up: finish subtasks first, then mark the parent task complete. If a subtask is blocked and cannot proceed, call pause_goal rather than skipping it.",
+"",
 "Avoid repeating work that is already done. Choose the next concrete action toward the objective.",
 "",
 "Before deciding that the goal is achieved, perform a completion audit against the actual current state:",
@@ -186,7 +192,7 @@ export function continuationPrompt(goal: GoalRecord, settings?: GoalSettings): s
 "- Treat uncertainty as not achieved; do more verification or continue the work.",
 "- For content/research/book/tutorial/report/reader-outcome goals, explicitly audit semantic quality: not merely scaffold/template/alpha, substantive content reviewed, and intended reader/user task outcome supported.",
 "",
-"Do not rely on intent, partial progress, elapsed effort, memory of earlier work, or a plausible final answer as proof of completion. Only mark the goal achieved when your own audit shows that the objective has actually been achieved and no required work remains. If any requirement is missing, incomplete, or unverified, keep working instead of marking the goal complete. If the objective is achieved, call complete_goal with
+"Do not rely on intent, partial progress, elapsed effort, memory of earlier work, or a plausible final answer as proof of completion. Only mark the goal achieved when your own audit shows that the objective has actually been achieved and no required work remains. If any requirement is missing, incomplete, or unverified, keep working instead of marking the goal complete. If the objective is achieved, call complete_goal with a verificationSummary that addresses every success criterion and any verification contract; the tool will launch an independent pi auditor agent and only archive if it returns <approved/>.",
 "",
 "Before marking any sub-item or task as complete (including ✅ checkmarks in your output), verify thoroughly against the relevant success criteria and any verification contract. Do NOT use completion indicators for items you have not fully verified.",
 "",
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
 {
 "name": "pi-goal-x",
-"version": "0.
+"version": "0.17.0",
 "description": "Goal mode extension for pi: persistent long-running objectives, /goal-set drafting, Sisyphus prompt style, autoContinue, and an above-editor status overlay. Fork of @capyup/pi-goal.",
 "license": "MIT",
 "author": "pi-goal-x contributors",