pi-goal-x 0.16.1 → 0.17.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/README.md +9 -4
package/docs/CHANGELOG.md +145 -0
package/extensions/goal-policy.ts +21 -2
package/extensions/goal-questionnaire.ts +45 -9
package/extensions/goal-record.ts +2 -0
package/extensions/goal-settings.ts +8 -0
package/extensions/goal.ts +90 -18
package/extensions/prompts/goal-prompts.ts +9 -3
package/package.json +1 -1

package/README.md CHANGED

@@ -31,6 +31,10 @@ All core features of [@capyup/pi-goal](https://github.com/capyup/pi-goal) are pr
 - **Recursive subtasks** — tasks can have nested sub-tasks via `subtasks?: GoalTask[]` (full recursive type). Subtask depth is controlled globally by `subtaskDepth` in `.pi/pi-goal-x-settings.json` (default: 1 level). Too-deep subtrees are rejected at proposal.
 - **Lightweight subtasks** — each task has an optional `lightweightSubtasks?: boolean` flag. When true, the parent can complete regardless of subtask status. When false/absent (full subtasks), all subtasks must be individually complete before the parent can close.
 - **Per-task completion** — `complete_task` marks individual tasks done with optional evidence/verificationSummary, and `skip_task` marks tasks as skipped with a required reason. Neither stops the turn, so the agent can continue uninterrupted.
+- **Recursive lookup** — `findTaskInTree` and `updateTaskInTree` search and update tasks at any depth. Subtask IDs are valid targets for `complete_task` and `skip_task`.
+- **Subtask gate** — parent tasks with full subtasks require all sub-items to be completed or skipped before the parent can close, enforced by recursive `checkSubtasksComplete`.
+- **Duplicate ID validation** — `validateTaskListProposal` recursively checks all task IDs across the entire tree, preventing collisions between parent/subtask or sibling subtasks.
+- **Agent workflow guidance** — prompts include a `[TASK WORKFLOW]` section directing agents to use tasks as progress trackers, completing subtasks immediately when work finishes (not batch-marking at the end).
 - **Hierarchical display** — task lists with subtasks render with indentation in prompts (`taskListBlock`, `goalPrompt`, `continuationPrompt`) and in the TUI widget (recursive count, BFS next-pending).
 - **Optional `taskList`** — goals without a task list work exactly as before. The feature is entirely opt-in.
 - **Soft `complete_goal` gate** — when `blockCompletion: true` is set, `complete_goal` surfaces a warning if pending tasks remain (prompt-level only; the agent can still complete).
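The recursive lookup and subtask gate described in the bullets above can be sketched as follows. This is a minimal illustration, not the package's implementation: the function names `findTaskInTree` and `checkSubtasksComplete` and the fields `subtasks` and `lightweightSubtasks` come from the README text, while the `GoalTask` shape, the `"pending"` status value, and the function bodies are assumptions.

```typescript
// Assumed task shape; only `id`, `status`, `subtasks`, and `lightweightSubtasks`
// are named in the README, and the "pending" status value is a guess.
type TaskStatus = "pending" | "complete" | "skipped";

interface GoalTask {
	id: string;
	title: string;
	status: TaskStatus;
	lightweightSubtasks?: boolean;
	subtasks?: GoalTask[];
}

// Depth-first search for a task id at any depth of the tree.
function findTaskInTree(tasks: GoalTask[], id: string): GoalTask | undefined {
	for (const task of tasks) {
		if (task.id === id) return task;
		const nested = task.subtasks ? findTaskInTree(task.subtasks, id) : undefined;
		if (nested) return nested;
	}
	return undefined;
}

// A parent with full (non-lightweight) subtasks may close only when every
// descendant is complete or skipped; lightweight parents are never blocked.
function checkSubtasksComplete(task: GoalTask): boolean {
	if (task.lightweightSubtasks || !task.subtasks) return true;
	return task.subtasks.every(
		(sub) => (sub.status === "complete" || sub.status === "skipped") && checkSubtasksComplete(sub),
	);
}
```

Under these assumptions, a subtask id resolves the same way as a top-level id, which is what lets `complete_task` and `skip_task` target nested items.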
@@ -48,19 +52,20 @@ All core features of [@capyup/pi-goal](https://github.com/capyup/pi-goal) are pr
 ### E2e test infrastructure
 - **Deterministic fork tests using `--mode json`**: the e2e suite spawns a real `pi --fork --mode json` session, parses structured `tool_execution_start`/`tool_execution_end` JSON events for field-level assertions — no free-text AI output parsing. Uses `--append-system-prompt` + `--tools` to force deterministic tool calls.
-- **Full coverage**: 281 tests total — function-level integration tests, mock-pi handler tests, file-validity checks, real `pi --fork --mode json` E2E tests, propose_goal_tweak unit/integration/e2e tests, task list policy/round-trip/render tests (including subtasks), and verification contract tests.
+- **Full coverage**: 310 tests total — function-level integration tests, mock-pi handler tests, file-validity checks, real `pi --fork --mode json` E2E tests, propose_goal_tweak unit/integration/e2e tests, task list policy/round-trip/render tests (including subtasks), and verification contract tests.
 ### Completion auditor
 - **Live progress widget** — when the auditor runs, the TUI shows a spinner, a progress bar (`[████░░░░] 40%`), step labels (`Inspecting files...`, `Verifying success criteria...`), the current tool being executed, and recent output lines. No more wondering if anything is happening.
+- **Per-goal auditor toggle** — during goal confirmation, press `a` to toggle the auditor on/off for that goal. The toggle uses a ●/○ indicator between the goal summary and confirm options. The default position comes from settings; the per-goal override persists within the session.
 - **Escape to skip** — press Escape during an audit to abort it and complete the goal immediately. The skip is recorded in the ledger as `audit_skipped` with reason `user_aborted` and auditor model metadata.
-- **Disable the auditor entirely** — set `disabled: true` in `.pi/pi-goal-x-settings.json` (or toggle it via `/goal-settings`). The agent can still bypass with user confirmation by passing `confirmBypassAuditor: true` to `update_goal`.
+- **Disable the auditor entirely** — set `disabled: true` in `.pi/pi-goal-x-settings.json` (or toggle it via `/goal-settings`). The agent can still bypass with user confirmation by passing `confirmBypassAuditor: true` to `complete_goal`.
 - **Skipped audits are recorded** — every skip (whether disabled or Escape-aborted) is logged to the ledger with the reason, provider, model, and thinking level for full traceability.
 - **Robust abort detection** — the auditor detects aborts both from exceptions *and* from `session.prompt()` returning after an abort signal, preventing stuck goals or ghost states.
 - **Cleaner lifecycle** — `AbortSignal` is properly wired to `session.abort()`, animation timers are cleaned up, and the unsubscribe path is always executed. No more having to kill the session.
 - **Completion report includes full auditor output** — the auditor's full report is included in the goal completion conversation message upon approval, not just a verdict.
 - **Session factory injection** — `runGoalCompletionAuditor` accepts an optional `createSession` parameter for testability, enabling mock auditor sessions in tests.
-- **Structured test evidence** — the executor can pass `testResults` (exit code, suite name, output, timestamp) via `update_goal({testResults})`. The auditor receives a `<test_evidence>` block and is instructed to check it before re-running test suites, skipping redundant re-runs.
+- **Structured test evidence** — the executor can pass `testResults` (exit code, suite name, output, timestamp) via `complete_goal({testResults})`. The auditor receives a `<test_evidence>` block and is instructed to check it before re-running test suites, skipping redundant re-runs.
 ### Drafting & UX
@@ -232,7 +237,7 @@ The completion result prints a full report into the conversation:
 - the auditor's approval report
 - full current goal details, including objective, status, usage, mode, and file path
-Sisyphus goals use the same completion tool as regular goals. The stricter part is the prompt/criteria standard: the agent should only call completion after the whole ordered objective is actually satisfied and likely to survive independent auditing. A paused goal can also be completed directly when the agent already has enough evidence that every requirement is satisfied; it does not need a resume just to call `update_goal`.
+Sisyphus goals use the same completion tool as regular goals. The stricter part is the prompt/criteria standard: the agent should only call completion after the whole ordered objective is actually satisfied and likely to survive independent auditing. A paused goal can also be completed directly when the agent already has enough evidence that every requirement is satisfied; it does not need a resume just to call `complete_goal`.
 ## Schema gates

package/docs/CHANGELOG.md ADDED

@@ -0,0 +1,145 @@
+# Changelog
+## 0.17.0 (2026-05-29)
+### Features
+- **Per-goal auditor toggle** — press `a` during the confirmation dialog to toggle the auditor on/off for a specific goal. Default from settings; override persists within session.
+- **Task workflow prompt guidance** — added `[TASK WORKFLOW]` section to both `goalPrompt` and `continuationPrompt`, directing agents to complete subtasks one-by-one as progress trackers (not batch-marking at the end).
+- **Recursive duplicate ID validation** — `validateTaskListProposal` now checks all task IDs across the entire tree, preventing collisions between parent/subtask or sibling subtask IDs.
+- **Escape dialog during audit** — pressing Escape during a completion audit shows a TUI dialog with "Mark complete without audit" or "Continue working" options.
+### Fixes
+- `validateTaskCompletion` and `validateTaskSkip` now use recursive `findTaskInTree` instead of flat `Array.find()` for nested subtask support.
+- Updated README references from legacy `update_goal` to `complete_goal`.
+### Tests
+- 310 total tests (up from 308).
+- Added tests for recursive duplicate ID detection across nested subtask trees.
+- Added e2e test for `skipAuditor=true` path.
+## 0.16.1 (2026-05-29)
+### Features
+- **Escape-to-skip audit** — press Escape during an auditor run to abort it and complete the goal immediately. The skip is recorded in the ledger with the reason `user_aborted` and auditor model metadata.
+- **Audit progress widget** — the TUI shows a spinner, progress bar, step labels, current tool, and output lines while the auditor runs.
+- **Audit abort detection** — the auditor detects aborts both from exceptions and from `session.prompt()` returning after an abort signal, preventing stuck goals or ghost states.
+- **Goal status for Sisyphus** — `COMPLETED` status label for completed Sisyphus goals.
+- **Multi-session focus isolation** — goal focus data uses `goalFocusDetails` which includes the goal id and reason but not full balance data, preventing cross-session focus leakage.
+### Fixes
+- Fixed a merge bug where `propose_task_list` could produce duplicate task list when called during a continuation.
+## 0.16.0 (2026-05-29)
+### Features
+- **`delete_goal` tool** — new lifecycle tool for archiving goals by id. Accepts a required `goalId` and optional `reason`. Agent-facing only; not intended for user use.
+- **`complete_goal` `status` optional** — the `status` parameter on `complete_goal` is now optional. When omitted, defaults to `"complete"`. Explicitly setting an invalid value (anything other than `"complete"`) still produces an error.
+- **SCROLL FIX** — the confirmation dialog no longer scrolls to the bottom when the user is scrolled up and new content arrives. Uses `addContextWrapped()` which suppresses viewport resets.
+- **Task list shown first** — the task list section now appears FIRST in the confirmation dialog context (before the objective), with context capped at 12 lines so tasks don't scroll off-screen.
+- **Audit completion flow** — the completion report card no longer says "Goal audit approved." when the auditor was skipped (now shows "Goal audit skipped." with reason).
+### Fixes
+- Fixed task completion/skip validation for nested subtasks (uses recursive `findTaskInTree`).
+- All `complete_goal` calls default to `status: "complete"` when no explicit status is provided.
+- Updated prompts and tool descriptions to reflect the `complete_goal` naming.
+### Tests
+- Updated e2e tests to verify `complete_goal` accepts calls without status.
+- Added e2e test verifying `complete_goal` rejects invalid explicit status.
+## 0.15.1 (2026-05-28)
+### Fixes
+- Fixed settings file reference in storage writes.
+### Documentation
+- Reorganized README settings documentation for clarity.
+## 0.14.0 (2026-05-27)
+### Features
+- **Subtask hierarchy** — tasks can have nested sub-tasks via `subtasks?: GoalTask[]`. Subtask depth controlled by `subtaskDepth` setting (default: 1). Deep subtrees are rejected at proposal.
+- **Lightweight subtasks** — `lightweightSubtasks?: boolean` on tasks. When true, parent can complete regardless of subtask status. Full subtasks require all sub-items completed first.
+- **Per-task contracts** — `propose_task_list` supports optional `verificationContract` per task. If set, `complete_task` requires a non-empty `verificationSummary`.
+- **Task list block** — tasks are listed in prompts with checkboxes and status indicators.
+### Tests
+- Added e2e tests for goal creation with task list, scroll fix, and subtask validation.
+## 0.13.0 (2026-05-22)
+### Features
+- **Verification contract system** — goals can include a `Verification contract:` section. Extracted and stored on the goal record. `complete_goal` rejects calls without `verificationSummary` when a contract is set.
+- **Per-goal verification contracts** — the contract is extracted during goal drafting and enforced by tools and prompts.
+- **`complete_goal` `testResults` removed** — replaced with `verificationSummary`. The old structured test results interface is gone.
+- **Auditor integration** — the independent completion auditor receives both the `verificationContract` and `verificationSummary` and cross-checks claims against real artifacts.
+### Tests
+- Updated verification contract tests.
+## 0.12.0 (2026-04-29)
+### Features
+- **Task list system** — `propose_task_list` tool with confirmation dialog. Tasks stored on goal record, rendered in prompts and widget, serialized to disk.
+- **Unified goal + task acceptance** — `propose_goal_draft` accepts optional `tasks` array. Single dialog shows goal + task list together.
+- **`complete_task` and `skip_task` tools** — per-task completion with evidence/verificationSummary. Neither stops the turn.
+- **`update_goal` renamed to `complete_goal`** — the core completion tool now uses `complete_goal({status: "complete"})` and requires explicit status acceptance.
+- **Completion report heading fix** — the report now shows `Goal complete.` instead of `Goal audit approved.` when no contract or auditor is involved.
+### Tests
+- Full task lifecycle tests (policy, round-trip, render, edge cases).
+- Verification contract tests for both goal-level and per-task contracts.
+## 0.11.0 (2026-04-23)
+### Features
+- **Deferred archival** — goals are archived at `turn_end`, not inline in the tool handler. Prevents premature archiving before the agent sees the audit result.
+- **`propose_goal_tweak`** — sole mechanism for updating the goal objective during `/goal-tweak`. Uses the same Confirm/Continue Chatting dialog as goal creation.
+- **Focus isolation** — goal focus is stored as a branch-local session entry, not in goal markdown metadata. Multiple sessions can have different focused goals.
+- **Auditor bypass with user confirmation** — `confirmBypassAuditor: true` bypasses the auditor when the user explicitly opts out.
+### Fixes
+- Cleaned up lifecycle issues with AbortSignal wiring and timer cleanup.
+## 0.10.0 (2026-04-15)
+### Features
+- **Completion audit system** — independent pi auditor agent verifies completion claims before archiving.
+- **Audit progress** — real-time TUI progress widget with spinner, progress bar, and step labels.
+- **Ledger system** — structured event log for all goal lifecycle events.
+## 0.9.0 (2026-04-08)
+### Features
+- **`goal_question` and `goal_questionnaire`** — structured drafting question tools.
+- **`/goal-settings`** — interactive settings configuration.
+- **Sisyphus goal style** — patient ordered execution with prompt/criteria variant.
+## 0.8.1 (2026-04-01)
+### Features
+- Initial fork from @capyup/pi-goal.
+- Pause/resume/abort lifecycle.
+- Multiple open goals.
+- Auto-continue loop.

package/extensions/goal-policy.ts CHANGED

@@ -185,7 +185,7 @@ export function validateTaskCompletion(args: {
 }): PolicyValidation {
 	if (!args.goal) return { ok: false, message: "No goal is set." };
 	if (!args.goal.taskList) return { ok: false, message: "Goal has no task list." };
-	const task = args.goal.taskList.tasks.find((t) => t.id === args.taskId);
+	const task = findTaskInTree(args.goal.taskList.tasks, args.taskId);
 	if (!task) return { ok: false, message: `Task "${args.taskId}" not found.` };
 	if (task.status === "complete") return { ok: false, message: `Task "${args.taskId}" is already complete.` };
 	if (task.status === "skipped") return { ok: false, message: `Task "${args.taskId}" was already skipped.` };
@@ -199,7 +199,7 @@ export function validateTaskSkip(args: {
 }): PolicyValidation {
 	if (!args.goal) return { ok: false, message: "No goal is set." };
 	if (!args.goal.taskList) return { ok: false, message: "Goal has no task list." };
-	const task = args.goal.taskList.tasks.find((t) => t.id === args.taskId);
+	const task = findTaskInTree(args.goal.taskList.tasks, args.taskId);
 	if (!task) return { ok: false, message: `Task "${args.taskId}" not found.` };
 	if (task.status === "complete") return { ok: false, message: `Task "${args.taskId}" is already complete.` };
 	// Skipped tasks toggle via the executor; reason is only required for first-time skips.
@@ -241,6 +241,20 @@ export function findSubtaskDepthViolation(tasks: GoalTask[], maxDepth: number):
 	return undefined;
 }
+function checkDuplicateTaskIds(tasks: GoalTask[], ids: Set<string>): string | undefined {
+	for (const t of tasks) {
+		const id = t.id.trim();
+		if (!id) return "All tasks must have a non-empty id.";
+		if (ids.has(id)) return `Duplicate task id: "${id}".`;
+		ids.add(id);
+		if (t.subtasks) {
+			const childErr = checkDuplicateTaskIds(t.subtasks, ids);
+			if (childErr) return childErr;
+		}
+	}
+	return undefined;
+}
 export function validateTaskListProposal(args: {
 	goal: GoalPolicyRecordLike | null;
 	tasks: GoalTask[];
@@ -254,6 +268,11 @@ export function validateTaskListProposal(args: {
 		if (!t.title.trim()) return { ok: false, message: `Task "${t.id}" must have a non-empty title.` };
 		if (ids.has(t.id)) return { ok: false, message: `Duplicate task id: "${t.id}".` };
 		ids.add(t.id);
+		// Recursively check subtask ids against the same global set
+		if (t.subtasks && t.subtasks.length > 0) {
+			const childErr = checkDuplicateTaskIds(t.subtasks, ids);
+			if (childErr) return { ok: false, message: childErr };
+		}
 	}
 	// Check subtask depth limit
 	const maxDepth = args.maxSubtaskDepth ?? 1;
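The shared-set technique used by `checkDuplicateTaskIds` above can be tried in isolation with the sketch below. It is a simplified stand-in, not the package code: `TaskNode` and `findDuplicateTaskId` are hypothetical names, and the function returns the offending id rather than an error message. Note that the helper in the diff trims ids before comparing, whereas the visible top-level loop compares `t.id` as-is; this sketch trims at every level.

```typescript
// Hypothetical minimal node shape; the real GoalTask carries more fields.
interface TaskNode {
	id: string;
	subtasks?: TaskNode[];
}

// Walks the whole tree with one shared Set so that collisions are caught
// between parent and subtask as well as between siblings.
// Returns the first duplicated id, or undefined when all ids are unique.
function findDuplicateTaskId(tasks: TaskNode[], seen: Set<string> = new Set()): string | undefined {
	for (const task of tasks) {
		const id = task.id.trim();
		if (seen.has(id)) return id;
		seen.add(id);
		if (task.subtasks) {
			const dup = findDuplicateTaskId(task.subtasks, seen);
			if (dup !== undefined) return dup;
		}
	}
	return undefined;
}
```

Passing the same `Set` down the recursion is what makes the check global; a fresh set per level would only catch duplicates among siblings.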

package/extensions/goal-questionnaire.ts CHANGED

@@ -26,6 +26,7 @@ export interface GoalQuestionnaireResult {
 	questions: GoalQuestionnaireQuestion[];
 	answers: GoalQuestionnaireAnswer[];
 	cancelled: boolean;
+	auditorEnabled?: boolean;
 }
 export type ProposalDecision = "confirm" | "continue";
@@ -82,7 +83,7 @@ export function proposalDialogFailureMessage(error: unknown): string {
  * the internal draft-confirm prompt. This keeps pi-goal self-contained and
  * avoids depending on external question/questionnaire packages.
  */
-export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions: GoalQuestionnaireQuestion[]): Promise<GoalQuestionnaireResult> {
+export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions: GoalQuestionnaireQuestion[], auditorToggleInit?: { defaultEnabled: boolean }): Promise<GoalQuestionnaireResult> {
 	if (!ctx.hasUI) {
 		return { questions: [], answers: [], cancelled: true };
 	}
@@ -102,6 +103,7 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 		let inputMode = false;
 		let inputQuestionId: string | null = null;
 		let cachedLines: string[] | undefined;
+		let auditorEnabled = auditorToggleInit?.defaultEnabled ?? true;
 		const answers = new Map<string, GoalQuestionnaireAnswer>();
 		const drafts = new Map<string, string>();
@@ -126,7 +128,7 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 			// Restore hardware cursor now that the dialog is closing
 			tui.setShowHardwareCursor(wasHardwareCursorShown);
 			const ordered = questions.map((q) => answers.get(q.id)).filter((a): a is GoalQuestionnaireAnswer => !!a);
-			done({ questions, answers: ordered, cancelled });
+			done({ questions, answers: ordered, cancelled, auditorEnabled: auditorToggleInit ? auditorEnabled : undefined });
 		}
 		function currentQuestion(): GoalQuestionnaireQuestion | undefined {
@@ -272,6 +274,13 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 				return;
 			}
+			// Auditor toggle hotkey
+			if (matchesKey(data, "a") && auditorToggleInit) {
+				auditorEnabled = !auditorEnabled;
+				refresh();
+				return;
+			}
 			if (matchesKey(data, Key.enter) && q) {
 				if (q.options.length === 0 || opts[optionIndex]?.isCustom) {
 					inputMode = true;
@@ -293,7 +302,9 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 			if (matchesKey(data, Key.escape)) submit(true);
 		}
-		function render(width: number): string[] {
+		const MAX_CONTEXT_LINES = 12; // prevent viewport jumping by capping context display
+			function render(width: number): string[] {
 			if (cachedLines) return cachedLines;
 			const safeWidth = Math.max(20, width);
 			const lines: string[] = [];
@@ -302,6 +313,18 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 			const add = (s: string) => lines.push(truncateToWidth(s, safeWidth, "…", true));
 			const addWrapped = (s: string) => lines.push(...wrapTextWithAnsi(s, safeWidth));
+			/** Wraps text and caps at MAX_CONTEXT_LINES to prevent viewport jumping. */
+			const addContextWrapped = (s: string) => {
+				const wrapped = wrapTextWithAnsi(s, safeWidth);
+				if (wrapped.length <= MAX_CONTEXT_LINES) {
+					lines.push(...wrapped);
+				} else {
+					lines.push(...wrapped.slice(0, MAX_CONTEXT_LINES));
+					const overflow = wrapped.length - MAX_CONTEXT_LINES;
+					lines.push(theme.fg("dim", `  ... ${overflow} more line${overflow === 1 ? "" : "s"} (full details after confirmation)`));
+				}
+			};
 			add(theme.fg("accent", "─".repeat(safeWidth)));
 			if (isMulti) {
 				const tabs: string[] = ["← "];
@@ -331,7 +354,7 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 			if (inputMode && q) {
 				addWrapped(theme.fg("text", ` ${q.question}`));
-				if (q.context) addWrapped(theme.fg("muted", ` ${q.context}`));
+				if (q.context) addContextWrapped(theme.fg("muted", ` ${q.context}`));
 				lines.push("");
 				if (q.options.length > 0) {
 					renderOptions();
@@ -352,7 +375,15 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 				add(allAnswered() ? theme.fg("success", " Press Enter to submit") : theme.fg("warning", ` Unanswered: ${questions.filter((qq) => !answers.has(qq.id)).map((qq) => qq.id).join(", ")}`));
 			} else if (q) {
 				addWrapped(theme.fg("text", ` ${q.question}`));
-				if (q.context) addWrapped(theme.fg("muted", ` ${q.context}`));
+				if (q.context) addContextWrapped(theme.fg("muted", ` ${q.context}`));
+				// Auditor toggle line between context and options
+				if (auditorToggleInit) {
+					const circle = auditorEnabled ? "●" : "○";
+					const label = auditorEnabled ? "Auditor enabled" : "Auditor disabled";
+					const color = auditorEnabled ? "success" : "warning";
+					add(theme.fg(color, ` ${circle} ${label}`) + theme.fg("dim", "  (press 'a' to toggle)"));
+					lines.push("");
+				}
 				const existing = answers.get(q.id);
 				if (existing) add(theme.fg("dim", ` Current: ${existing.wasCustom ? "(wrote) " : ""}${existing.answer}`));
 				lines.push("");
@@ -361,7 +392,10 @@ export async function runGoalQuestionnaire(ctx: ExtensionContext, rawQuestions:
 			}
 			lines.push("");
-			if (!inputMode) add(theme.fg("dim", isMulti ? " Tab/←→ navigate • ↑↓ select • Enter confirm • Esc cancel" : " ↑↓ navigate • Enter select • Esc cancel"));
+			if (!inputMode) {
+				const auditorHint = auditorToggleInit ? " • a toggle auditor" : "";
+				add(theme.fg("dim", isMulti ? " Tab/←→ navigate • ↑↓ select • Enter confirm • Esc cancel" + auditorHint : " ↑↓ navigate • Enter select • Esc cancel" + auditorHint));
+			}
 			add(theme.fg("accent", "─".repeat(safeWidth)));
 			cachedLines = lines;
 			return lines;
@@ -379,7 +413,8 @@ export async function showProposalDialog(
 	ctx: ExtensionContext,
 	confirmationText: string,
 	focus: GoalDraftingFocus,
-): Promise<ProposalDecision> {
+	defaultAuditorEnabled?: boolean,
+): Promise<{ decision: ProposalDecision; auditorEnabled: boolean }> {
 	const headerTitle = focus === "sisyphus" ? "Confirm Sisyphus Goal Draft" : "Confirm Goal Draft";
 	const result = await runGoalQuestionnaire(ctx, [{
 		id: "confirm",
@@ -388,11 +423,12 @@ export async function showProposalDialog(
 		options: ["Confirm — create this goal now", "Continue chatting — keep refining"],
 		recommended: 0,
 		allowCustom: false,
-	}]);
-	return proposalDecisionFromQuestionnaireResult({
+	}], defaultAuditorEnabled !== undefined ? { defaultEnabled: defaultAuditorEnabled } : undefined);
+	const decision = proposalDecisionFromQuestionnaireResult({
 		cancelled: result.cancelled,
 		answer: result.answers[0]?.answer,
 	});
+	return { decision, auditorEnabled: result.auditorEnabled ?? true };
 }
 export function registerQuestionnaireTools(pi: ExtensionAPI): void {
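The line cap applied by `addContextWrapped` in this file can be illustrated with a standalone sketch. The names `wrapPlain` and `capContext` are hypothetical, and `wrapPlain` is a naive fixed-width wrapper standing in for the ANSI-aware `wrapTextWithAnsi` used in the diff; only the 12-line limit and the overflow message are taken from the source.

```typescript
const MAX_CONTEXT_LINES = 12;

// Naive fixed-width wrap (assumption: no ANSI escape handling, unlike the real helper).
function wrapPlain(text: string, width: number): string[] {
	const out: string[] = [];
	for (const line of text.split("\n")) {
		if (line.length === 0) {
			out.push("");
			continue;
		}
		for (let i = 0; i < line.length; i += width) out.push(line.slice(i, i + width));
	}
	return out;
}

// Keeps at most `maxLines` wrapped lines and appends a one-line overflow notice.
function capContext(text: string, width: number, maxLines: number = MAX_CONTEXT_LINES): string[] {
	const wrapped = wrapPlain(text, width);
	if (wrapped.length <= maxLines) return wrapped;
	const overflow = wrapped.length - maxLines;
	return [
		...wrapped.slice(0, maxLines),
		`  ... ${overflow} more line${overflow === 1 ? "" : "s"} (full details after confirmation)`,
	];
}
```

Because the cap counts wrapped lines rather than source lines, a narrow terminal reaches the limit sooner than a wide one for the same context text.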

package/extensions/goal-record.ts CHANGED

@@ -45,6 +45,7 @@ export interface GoalRecord {
 	// Set by the agent's pause_goal tool. Cleared when the goal becomes active again.
 	pauseReason?: string;
 	pauseSuggestedAction?: string;
+	skipAuditor?: boolean;
 	taskList?: GoalTaskList;
 	/** Plain-text description of what verification evidence is required before completing this goal. */
 	verificationContract?: string;
@@ -247,6 +248,7 @@ export function normalizeGoalRecord(value: unknown): GoalRecord | null {
 		stopReason: raw.stopReason === "agent" || raw.stopReason === "user" ? raw.stopReason : undefined,
 		pauseReason: typeof raw.pauseReason === "string" && raw.pauseReason.trim() ? raw.pauseReason : undefined,
 		pauseSuggestedAction: typeof raw.pauseSuggestedAction === "string" && raw.pauseSuggestedAction.trim() ? raw.pauseSuggestedAction : undefined,
+		skipAuditor: raw.skipAuditor === true ? true : undefined,
 		taskList: normalizeTaskList(raw.taskList),
 		verificationContract: typeof raw.verificationContract === "string" ? raw.verificationContract : undefined,
 	};

package/extensions/goal-settings.ts CHANGED

@@ -143,6 +143,14 @@ export function loadGoalSettings(cwd: string, env: NodeJS.ProcessEnv = process.e
  * Save settings to the unified settings file on disk.
  * Persists only non-default values using the canonical key names.
  */
+/**
+ * Determine whether the auditor should be enabled by default based on settings.
+ * The auditor is enabled by default unless settings.disabled === true.
+ */
+export function isAuditorEnabledByDefault(settings: GoalSettings): boolean {
+	return settings.disabled !== true;
+}
 export function saveGoalSettingsFileConfig(cwd: string, settings: GoalSettings): GoalSettings {
 	const clean: GoalSettings = {};
 	const provider = asNonEmptyString(settings.provider);
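The settings default and the per-goal `skipAuditor` flag from the two files above fit together roughly as sketched here. `isAuditorEnabledByDefault` and the `raw.skipAuditor === true` normalization are copied from the diff; the trimmed-down interfaces and the helper `applyAuditorDecision` are hypothetical and only illustrate how a dialog result might be written onto a goal record.

```typescript
// Trimmed-down stand-ins for the real GoalSettings and GoalRecord types.
interface GoalSettingsLike {
	disabled?: boolean;
}
interface GoalRecordLike {
	skipAuditor?: boolean;
}

// From the diff: the auditor is on unless settings.disabled is exactly true.
function isAuditorEnabledByDefault(settings: GoalSettingsLike): boolean {
	return settings.disabled !== true;
}

// From the diff: only a literal `true` survives normalization, so `false`,
// strings, and missing values all become undefined on load.
function normalizeSkipAuditor(raw: unknown): true | undefined {
	return raw === true ? true : undefined;
}

// Hypothetical helper: store the toggle result in the normalized form.
function applyAuditorDecision(goal: GoalRecordLike, auditorEnabled: boolean): GoalRecordLike {
	return { ...goal, skipAuditor: auditorEnabled ? undefined : true };
}
```

How the stored flag is consumed at completion time is not visible in this part of the diff, so the sketch stops at persisting it.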

package/extensions/goal.ts CHANGED

@@ -21,6 +21,7 @@ import {
 } from "./goal-auditor.ts";
 import {
 	goalSettingsPath,
+	isAuditorEnabledByDefault,
 	loadGoalSettings,
 	loadGoalSettingsFileConfig,
 	saveGoalSettingsFileConfig,
@@ -1578,7 +1579,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 				}
 			}
 			const lifecycleHint = view && (view.status === "active" || view.status === "paused")
-				? "\nLifecycle tools: if evidence proves the objective is satisfied, call complete_goal({status: \"complete\"}); if blocked, call pause_goal({reason, suggestedAction?}); if abandoned/obsolete/unsafe, call abort_goal({reason}). For file or shell work, use the normal work tools directly (write/read/bash/edit); do not call get_goal repeatedly just to look for tools."
+				? "\nLifecycle tools: if evidence proves the objective is satisfied, call complete_goal({verificationSummary: \"evidence\"}); if blocked, call pause_goal({reason, suggestedAction?}); if abandoned/obsolete/unsafe, call abort_goal({reason}). For file or shell work, use the normal work tools directly (write/read/bash/edit); do not call get_goal repeatedly just to look for tools."
 				: "";
 			const text = view
 				? `${detailedSummary(view)}${lifecycleHint}${nudge}${otherCount > 0 ? `\nOther open goals: ${otherCount} (human can run /goal-list or /goal-focus)` : ""}`
@@ -1737,20 +1738,25 @@ export default function goalExtension(pi: ExtensionAPI): void {
 			const draftSummary = buildDraftConfirmationText({
 				focus: activeIntent.focus,
 				originalTopic: activeIntent.originalTopic,
-				objective: objective + taskSummarySection,
+				// Tasks section appears FIRST in the context so it stays visible
+				// even when the dialog caps long context lines.
+				objective: taskSummarySection ? `${taskSummarySection}
+${objective}` : objective,
 				autoContinue: autoContinueFlag,
 			});
 			const headless = shouldAutoConfirmProposal({ hasUI: ctx.hasUI, autoConfirmEnv: process.env.PI_GOAL_AUTO_CONFIRM });
-			let decision: "confirm" | "continue";
+			let decision: { decision: "confirm" | "continue"; auditorEnabled: boolean };
+			const auditorDefault = isAuditorEnabledByDefault(loadGoalSettings(ctx.cwd));
 			if (headless) {
 				// Headless: auto-confirm (tests and non-TUI sessions).
-				decision = "confirm";
+				decision = { decision: "confirm", auditorEnabled: auditorDefault };
 			} else {
 				// TUI: show overlay dialog.
 				try {
-					decision = await showProposalDialog(ctx, draftSummary, activeIntent.focus);
+					decision = await showProposalDialog(ctx, draftSummary, activeIntent.focus, auditorDefault);
 				} catch (err) {
 					const message = proposalDialogFailureMessage(err);
 					ctx.ui.notify(message, "error");
@@ -1761,7 +1767,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 				}
 			}
-			if (decision === "confirm") {
+			if (decision.decision === "confirm") {
 				// Extract verification contract from objective before creation
 				const { objective: cleanedObjective, verificationContract } = extractVerificationContract(objective);
 				const config: GoalCreationConfig = {
@@ -1772,6 +1778,12 @@ export default function goalExtension(pi: ExtensionAPI): void {
 				confirmationIntent = null;
 				replaceGoal(config, ctx, false, verificationContract);
+				// Set skipAuditor on the goal if user toggled auditor off
+				if (!decision.auditorEnabled && state.goal) {
+					state.goal = { ...state.goal, skipAuditor: true };
+					setGoal(state.goal, ctx);
+				}
 				// Set task list if provided
 				if (tasksToCreate && tasksToCreate.length > 0 && state.goal) {
 					const now = nowIso();
@@ -1887,12 +1899,12 @@ export default function goalExtension(pi: ExtensionAPI): void {
 			const headless = shouldAutoConfirmProposal({ hasUI: ctx.hasUI, autoConfirmEnv: process.env.PI_GOAL_AUTO_CONFIRM });
-			let decision: "confirm" | "continue";
+			let decision: { decision: "confirm" | "continue"; auditorEnabled: boolean };
 			if (headless) {
-				decision = "confirm";
+				decision = { decision: "confirm", auditorEnabled: !state.goal.skipAuditor };
 			} else {
 				try {
-					decision = await showProposalDialog(ctx, draftSummary, state.goal.sisyphus ? "sisyphus" : "goal");
+					decision = await showProposalDialog(ctx, draftSummary, state.goal.sisyphus ? "sisyphus" : "goal", !state.goal.skipAuditor);
 				} catch (err) {
 					const message = proposalDialogFailureMessage(err);
 					ctx.ui.notify(message, "error");
@@ -1903,7 +1915,11 @@ export default function goalExtension(pi: ExtensionAPI): void {
 				}
 			}
-			if (decision === "confirm") {
+			if (decision.decision === "confirm") {
+				// Persist any auditor toggle change
+				if (state.goal) {
+					state.goal = { ...state.goal, skipAuditor: !decision.auditorEnabled };
+				}
 				// Extract verification contract from revised objective
 				const { objective: cleanedObjective, verificationContract } = extractVerificationContract(newObjective);
 				// Apply the tweak: write the new objective to disk authoritatively.
@@ -1981,7 +1997,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 		description: "Mark the current active or paused pi goal complete. Only call this when the goal objective is actually achieved — no required work remains.",
 		promptSnippet: "Mark the active or paused pi goal complete — only when every requirement is satisfied.",
 		promptGuidelines: [
-			"Call complete_goal with status=complete only when the pi goal objective has actually been achieved and no required work remains.",
+			"Call complete_goal only when the pi goal objective has actually been achieved and no required work remains.",
 			"Before calling complete_goal, you MUST provide a verificationSummary that addresses every success criterion and any verification contract on the goal. Fold all verification evidence (test output, grep results, requirements coverage) into this single field.",
 			"The auditor is authoritative: completion is archived only if the auditor report ends with <approved/>. If it ends with <disapproved/> or no approval marker, complete_goal is rejected and the goal remains open.",
 			"Do NOT call complete_goal if any work remains, even if substantial progress was made. Do not use it merely because work is stopping, tests passed, or you are blocked.",
@@ -2001,7 +2017,8 @@ export default function goalExtension(pi: ExtensionAPI): void {
 			reconcileFocusedGoalFromDisk(ctx);
 			// -- Phase 2: Status validation --
-			if (params.status !== COMPLETE_STATUS) {
+			const effectiveStatus = params.status ?? COMPLETE_STATUS;
+			if (effectiveStatus !== COMPLETE_STATUS) {
 				throw new Error("complete_goal requires status=complete when marking a goal complete.");
 			}
@@ -2059,7 +2076,58 @@ export default function goalExtension(pi: ExtensionAPI): void {
 				? `${settings.provider ?? "default"}/${settings.model ?? "default"}${settings.thinkingLevel ? `:${settings.thinkingLevel}` : ""}`
 				: "default";
-			// Check if auditor is disabled
+			// Check if auditor is disabled per-goal (user toggled it off during goal confirmation)
+			if (auditTarget.skipAuditor) {
+				pi.sendMessage<GoalAuditEventDetails>({
+					customType: GOAL_AUDIT_ENTRY,
+					content: `Goal completed — per-goal auditor disabled.`,
+					display: true,
+					details: { phase: "skipped", goalId: auditTarget.id, auditor: auditorLabel },
+				});
+				try {
+					appendGoalEvent(ctx, {
+						type: "audit_skipped",
+						goalId: auditTarget.id,
+						reason: "disabled",
+						provider: settings.provider,
+						model: settings.model,
+						thinkingLevel: settings.thinkingLevel,
+						at: nowIso(),
+					});
+				} catch {
+					// Ledger append failure should not block completion
+				}
+				accountProgress(ctx);
+				auditProgress = null;
+				goalWidgetComponent?.invalidate();
+				state.goal = {
+					...auditTarget,
+					status: "complete",
+					stopReason: "agent",
+					updatedAt: nowIso(),
+				};
+				state.goal = writeActiveGoalFile(ctx, state.goal);
+				pi.appendEntry(STATE_ENTRY, goalDetails(state.goal));
+				turnStoppedFor = state.goal?.id ?? null;
+				resetGetGoalNudgeState(state.goal?.id);
+				syncGoalTools();
+				updateUI(ctx);
+				return {
+					content: [{
+						type: "text",
+						text: buildCompletionReport({
+							detailedSummary: detailedSummary(state.goal),
+							completionSummary: params.completionSummary,
+							auditSkippedReason: "per-goal auditor disabled",
+							taskSummary: state.goal?.taskList ? buildTaskSummary(state.goal.taskList) : null,
+						}),
+					}],
+					details: goalDetails(state.goal),
+					terminate: true,
+				};
+			}
+			// Check if auditor is disabled in settings
 			if (settings.disabled === true) {
 				if (params.confirmBypassAuditor !== true) {
 					return {
@@ -2524,7 +2592,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 		promptSnippet: "Legacy no-op: Sisyphus no longer requires step_complete.",
 		promptGuidelines: [
 			"Do not call this in normal operation. Sisyphus mode shares the normal goal lifecycle and completion gate.",
-			"Complete the goal with complete_goal(status=complete) only when the full objective is actually satisfied.",
+			"Call complete_goal only when the full objective is actually satisfied.",
 		],
 		parameters: Type.Object({
 			stepIndex: Type.Integer({ minimum: 1, description: "Legacy step index. Ignored." }),
@@ -2534,7 +2602,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 		executionMode: "sequential",
 		async execute(_toolCallId, _params, _signal, _onUpdate, _ctx) {
 			return {
-				content: [{ type: "text", text: "step_complete is no longer required. Sisyphus is now a prompt/criteria style that uses the normal goal lifecycle. Continue working from the objective, or call complete_goal(status=complete) only when the full objective is satisfied." }],
+				content: [{ type: "text", text: "step_complete is no longer required. Sisyphus is now a prompt/criteria style that uses the normal goal lifecycle. Continue working from the objective, or call complete_goal only when the full objective is actually satisfied." }],
 				details: goalDetails(state.goal),
 			};
 		},
@@ -2648,13 +2716,17 @@ export default function goalExtension(pi: ExtensionAPI): void {
 			const gateLabel = blockCompletion ? " (blockCompletion enabled)" : "";
 			const proposalText = [`Proposed task list${gateLabel}:`, "", ...taskLines].join("\n");
-			const decision = await showProposalDialog(ctx, proposalText, "goal");
-			if (decision !== "confirm") {
+			const dialogResult = await showProposalDialog(ctx, proposalText, "goal", !state.goal?.skipAuditor);
+			if (dialogResult.decision !== "confirm") {
 				return {
 					content: [{ type: "text", text: "Task list proposal declined." }],
 					details: goalDetails(state.goal),
 				};
 			}
+			// Persist any auditor toggle change
+			if (state.goal) {
+				state.goal = { ...state.goal, skipAuditor: !dialogResult.auditorEnabled };
+			}
 			// Apply
 			state.goal = mergeGoalPromptFromDisk(ctx, state.goal);
@@ -3165,7 +3237,7 @@ promptGuidelines: [
 				// Ledger read failure should not break the prompt
 			}
 			return {
-				systemPrompt: `${currentSystemPrompt()}\n\n[PI GOAL PAUSED goalId=${current.id}]\n${untrustedObjectiveBlock(current)}${pauseExtras.join("\n")}${auditorExtra}\n\nThe goal is paused. Do not autonomously continue substantive work unless the user resumes it with /goal-resume. If the user explicitly asks to finish or abandon the paused goal, or the objective is already satisfied based on available evidence, you may call complete_goal(status=complete) or abort_goal without resuming. Do not call pause_goal again.`,
+				systemPrompt: `${currentSystemPrompt()}\n\n[PI GOAL PAUSED goalId=${current.id}]\n${untrustedObjectiveBlock(current)}${pauseExtras.join("\n")}${auditorExtra}\n\nThe goal is paused. Do not autonomously continue substantive work unless the user resumes it with /goal-resume. If the user explicitly asks to finish or abandon the paused goal, or the objective is already satisfied based on available evidence, you may call complete_goal or abort_goal without resuming. Do not call pause_goal again.`,
 			};
 		}
 		const activeGoal = state.goal;
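
The `complete_goal` hunks above appear to run three checks in order: default a missing `status` to complete, honor the per-goal `skipAuditor` flag, then fall through to the global `settings.disabled` check. The following is a minimal sketch of that ordering, not the package's implementation. `resolveCompletionPath`, `CompletionInput`, and `CompletionPath` are hypothetical names introduced for illustration, and the value of `COMPLETE_STATUS` is assumed to be `"complete"` based on the error message in the diff.

```typescript
// Hypothetical, simplified model of the ordering shown in the diff.
// None of these names exist in pi-goal-x except COMPLETE_STATUS, whose value is assumed here.
type CompletionPath = "skip-per-goal" | "settings-disabled" | "run-auditor";

const COMPLETE_STATUS = "complete";

interface CompletionInput {
  status?: string;            // optional as of 0.17.0
  skipAuditor?: boolean;      // per-goal flag, set when the user toggles the auditor off
  settingsDisabled?: boolean; // stands in for settings.disabled
}

function resolveCompletionPath(input: CompletionInput): CompletionPath {
  // A missing status defaults to "complete"; any other explicit value is still rejected.
  const effectiveStatus = input.status ?? COMPLETE_STATUS;
  if (effectiveStatus !== COMPLETE_STATUS) {
    throw new Error("complete_goal requires status=complete when marking a goal complete.");
  }
  // The per-goal flag is consulted before the global setting.
  if (input.skipAuditor) return "skip-per-goal";
  if (input.settingsDisabled === true) return "settings-disabled";
  return "run-auditor";
}
```

In this sketch a goal with `skipAuditor: true` never reaches the settings check, which matches the early `return` with `terminate: true` in the new per-goal branch.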

package/extensions/prompts/goal-prompts.ts CHANGED

@@ -114,7 +114,7 @@ export function sisyphusDisciplineBlock(goal: GoalRecord): string {
 		"- Work patiently and sequentially. Do not rush to a shortcut just because it looks more efficient.",
 		"- Verify each meaningful action against the objective's own success criteria before moving on.",
 		"- If a step is unclear, blocked, fails, or seems wrong: call pause_goal({reason, suggestedAction?}) instead of inventing a workaround.",
-		"- Call complete_goal(status=complete) only after the full objective is actually satisfied. There is no separate step counter or step_complete requirement.",
+		"- Call complete_goal only after the full objective is actually satisfied. There is no separate step counter or step_complete requirement.",
 	].join("\n");
 }
@@ -132,11 +132,14 @@ Available work tools for pursuing the active goal include write, read, bash, and
 After goal confirmation, you may call propose_task_list once to set up an initial task list if the objective decomposes into trackable milestones. If a task list already exists, only restructure it when the user asks or the goal structurally changes — do not restructure autonomously. Do not add a task list for simple, single-step goals.
+[TASK WORKFLOW]
+Use tasks and subtasks as PROGRESS TRACKERS during your work — not as a post-hoc checklist to batch-mark at the end. As soon as you finish a concrete unit of work that corresponds to a task or subtask, call complete_task immediately with evidence of what you did. The system enforces that all subtasks must be completed (or skipped) before their parent task can be completed, so work from the leaves up: finish subtasks first, then mark the parent task complete. If a subtask is blocked and cannot proceed, call pause_goal rather than skipping it. This keeps the task list accurate and prevents the "all work done, now batch-mark everything" pattern.
 To ask the user a structured question (e.g. when the user's spec changes and you need to clarify before updating the goal), use goal_question. It opens a question dialog and returns the user's answer as tool output. Use plain conversation for simple clarifications.
 Task skipping restrictions: Only skip a task when the user explicitly asks you to, or when the task directly contradicts a hard constraint (e.g. an impossible requirement). Do NOT autonomously skip tasks to avoid work, or because they look optional, inconvenient, or out of scope. When in doubt, ask the user first. Calling skip_task on an already-skipped task toggles it back to pending (unskip).
-Keep this goal in force until it is actually achieved. Do not pause for confirmation just because a phase, chapter, file, or checklist item is finished. At each natural stopping point, compare every explicit requirement with concrete evidence from the workspace/session. If the objective is complete, call complete_goal with status=complete and provide a verificationSummary; complete_goal will launch an independent pi auditor agent and only archive if that auditor returns <approved/>. If it is not complete, choose the next concrete action and do it.
+Keep this goal in force until it is actually achieved. Do not pause for confirmation just because a phase, chapter, file, or checklist item is finished. At each natural stopping point, compare every explicit requirement with concrete evidence from the workspace/session. If the objective is complete, call complete_goal and provide a verificationSummary; complete_goal will launch an independent pi auditor agent and only archive if that auditor returns <approved/>. If it is not complete, choose the next concrete action and do it.
 The completion auditor is independent and semantic, not a paperwork checklist. It may inspect files and command output, and it will reject scaffold-only, alpha, template, proxy-metric, or weakly verified completions with <disapproved/>.
@@ -174,6 +177,9 @@ export function continuationPrompt(goal: GoalRecord, settings?: GoalSettings): s
 		"",
 		"Task skipping restrictions: Only skip a task when the user explicitly asks you to, or when the task directly contradicts a hard constraint (e.g. an impossible requirement). Do NOT autonomously skip tasks to avoid work, or because they look optional, inconvenient, or out of scope. When in doubt, ask the user first. Calling skip_task on an already-skipped task toggles it back to pending (unskip).",
 		"",
+		"[TASK WORKFLOW]",
+		"Use tasks and subtasks as PROGRESS TRACKERS during your work — not as a post-hoc checklist to batch-mark at the end. As soon as you finish a concrete unit of work that corresponds to a task or subtask, call complete_task immediately with evidence of what you did. Subtasks must be completed (or skipped) before their parent task can be completed, so work from the leaves up: finish subtasks first, then mark the parent task complete. If a subtask is blocked and cannot proceed, call pause_goal rather than skipping it.",
+		"",
 		"Avoid repeating work that is already done. Choose the next concrete action toward the objective.",
 		"",
 		"Before deciding that the goal is achieved, perform a completion audit against the actual current state:",
@@ -186,7 +192,7 @@ export function continuationPrompt(goal: GoalRecord, settings?: GoalSettings): s
 		"- Treat uncertainty as not achieved; do more verification or continue the work.",
 		"- For content/research/book/tutorial/report/reader-outcome goals, explicitly audit semantic quality: not merely scaffold/template/alpha, substantive content reviewed, and intended reader/user task outcome supported.",
 		"",
-		"Do not rely on intent, partial progress, elapsed effort, memory of earlier work, or a plausible final answer as proof of completion. Only mark the goal achieved when your own audit shows that the objective has actually been achieved and no required work remains. If any requirement is missing, incomplete, or unverified, keep working instead of marking the goal complete. If the objective is achieved, call complete_goal with status \"complete\" and a verificationSummary that addresses every success criterion and any verification contract; the tool will launch an independent pi auditor agent and only archive if it returns <approved/>.",
+		"Do not rely on intent, partial progress, elapsed effort, memory of earlier work, or a plausible final answer as proof of completion. Only mark the goal achieved when your own audit shows that the objective has actually been achieved and no required work remains. If any requirement is missing, incomplete, or unverified, keep working instead of marking the goal complete. If the objective is achieved, call complete_goal with a verificationSummary that addresses every success criterion and any verification contract; the tool will launch an independent pi auditor agent and only archive if it returns <approved/>.",
 		"",
 		"Before marking any sub-item or task as complete (including ✅ checkmarks in your output), verify thoroughly against the relevant success criteria and any verification contract. Do NOT use completion indicators for items you have not fully verified.",
 		"",
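
The `[TASK WORKFLOW]` text added above tells the agent to work from the leaves up because a parent cannot close until its subtasks are completed or skipped. The following is a rough sketch of such a gate, assuming a simplified `Task` shape. `canCloseParent` is a hypothetical helper, not the package's `checkSubtasksComplete`, and the `"complete"` status string and the rule that nested subtasks are also checked under a skipped subtask are assumptions.

```typescript
// Hypothetical, simplified task shape; the real GoalTask type has more fields.
type TaskStatus = "pending" | "complete" | "skipped";

interface Task {
  id: string;
  status: TaskStatus;
  lightweightSubtasks?: boolean; // when true, subtasks do not gate the parent
  subtasks?: Task[];
}

// Returns true when every subtask (at any depth) is complete or skipped,
// or when the task opts out of the gate via lightweightSubtasks.
function canCloseParent(task: Task): boolean {
  if (task.lightweightSubtasks) return true;
  return (task.subtasks ?? []).every(
    (sub) => (sub.status === "complete" || sub.status === "skipped") && canCloseParent(sub),
  );
}
```

A task with no subtasks passes the gate trivially, so leaves can always be closed first and parents afterwards.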

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-goal-x",
-  "version": "0.16.1",
+  "version": "0.17.0",
   "description": "Goal mode extension for pi: persistent long-running objectives, /goal-set drafting, Sisyphus prompt style, autoContinue, and an above-editor status overlay. Fork of @capyup/pi-goal.",
   "license": "MIT",
   "author": "pi-goal-x contributors",