npm - pi-goal-x - Versions diffs - 0.10.0 → 0.10.2 - Mend

pi-goal-x 0.10.0 → 0.10.2

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.

Files changed (5)

package/README.md +4 -3
package/extensions/goal-auditor.ts +36 -4
package/extensions/goal.ts +14 -51
package/extensions/prompts/goal-prompts.ts +6 -2
package/package.json +1 -1

package/README.md CHANGED

@@ -10,10 +10,10 @@ The extension is designed around one rule: **the user owns intent; the agent exe
 All core features of [@capyup/pi-goal](https://github.com/capyup/pi-goal) are preserved. The following changes are specific to pi-goal-x:
-### Mid-flight objective updates
+### Goal objective is immutable
-- **`update_goal({updatedObjective})`** — the agent can now sync the goal objective mid-flight when user requirements change, *without* completing the goal. This ensures the completion auditor evaluates against the latest requirements. The combined path (`updatedObjective` + `status: "complete"`) applies the update first, then runs the normal completion+audit flow.
-- **`apply_goal_tweak`** remains available for `/goal-tweak` drafting revisions; the new parameter is the lightest possible touch on the existing `update_goal` tool.
+- The goal objective is immutable — the agent **must not** modify it autonomously. Objective changes are only possible through `apply_goal_tweak`, which is gated behind the user-initiated `/goal-tweak` drafting flow. This prevents the agent from silently changing the goal contract.
+- **`apply_goal_tweak`** is the sole mechanism for updating the objective, available exclusively during a `/goal-tweak` drafting interview. If the user's requirements change, they must run `/goal-tweak` to initiate the revision flow.
 ### Deferred archival
@@ -35,6 +35,7 @@ All core features of [@capyup/pi-goal](https://github.com/capyup/pi-goal) are pr
 - **Cleaner lifecycle** — `AbortSignal` is properly wired to `session.abort()`, animation timers are cleaned up, and the unsubscribe path is always executed. No more having to kill the session.
 - **Completion report includes full auditor output** — the auditor's full report is included in the goal completion conversation message upon approval, not just a verdict.
 - **Session factory injection** — `runGoalCompletionAuditor` accepts an optional `createSession` parameter for testability, enabling mock auditor sessions in tests.
+- **Structured test evidence** — the executor can pass `testResults` (exit code, suite name, output, timestamp) via `update_goal({testResults})`. The auditor receives a `<test_evidence>` block and is instructed to check it before re-running test suites, skipping redundant re-runs.
 ### Drafting & UX
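
For illustration, the `testResults` payload described in the "Structured test evidence" entry above could be assembled roughly as follows. This is a minimal sketch: the `TestResults` interface mirrors the four fields the entry lists, while `buildTestResults` and its `maxLines` and `now` parameters are hypothetical helpers introduced here, not part of pi-goal-x.

```typescript
// Shape of the evidence object, mirroring the fields named in the README entry.
interface TestResults {
	exitCode: number;
	suiteName?: string;
	output?: string;
	timestamp?: string;
}

// Hypothetical helper (not part of pi-goal-x): keeps only the tail of the
// test output, since the field is described as "last lines of test output".
function buildTestResults(
	exitCode: number,
	suiteName: string,
	fullOutput: string,
	maxLines = 20,
	now: Date = new Date(),
): TestResults {
	const lines = fullOutput.split("\n");
	// Math.max guards against a negative start index when output is short.
	const tail = lines.slice(Math.max(0, lines.length - maxLines));
	return {
		exitCode,
		suiteName,
		output: tail.join("\n"),
		timestamp: now.toISOString(),
	};
}

// Example: evidence for a passing run, trimmed to the last two lines.
const evidence = buildTestResults(0, "npm test", "setup\nrunning\nTests: 12 passed", 2);
console.log(JSON.stringify(evidence, null, 2));
```

An object of this shape would then be passed as the `testResults` field of an `update_goal` call alongside `status` and `completionSummary`.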

package/extensions/goal-auditor.ts CHANGED

@@ -127,10 +127,22 @@ export function parseAuditorDecision(output: string): { approved: boolean; disap
 	return { approved: approved && !disapproved, disapproved };
 }
+export interface AuditorTestResults {
+	/** Exit code of the test run (0 = success) */
+	exitCode: number;
+	/** Test suite name, e.g. 'npm test' */
+	suiteName?: string;
+	/** Last lines of test output showing results */
+	output?: string;
+	/** ISO timestamp of when tests were run */
+	timestamp?: string;
+}
 export function buildGoalAuditorPrompt(args: {
 	goal: GoalRecord;
 	completionSummary?: string | null;
 	detailedSummary: string;
+	testResults?: AuditorTestResults | null;
 }): string {
 	return [
 		"You are the independent completion auditor for pi-goal.",
@@ -157,12 +169,31 @@ export function buildGoalAuditorPrompt(args: {
 		"<goal_details>",
 		args.detailedSummary,
 		"</goal_details>",
+		...(args.testResults ? [
+			"",
+			"Executor test evidence:",
+			"<test_evidence>",
+			`  Suite: ${args.testResults.suiteName ?? "(not specified)"}`,
+			`  Exit code: ${args.testResults.exitCode}`,
+			`  Timestamp: ${args.testResults.timestamp ?? "(not specified)"}`,
+			`  Output:`,
+			...(args.testResults.output ? args.testResults.output.split("\n").map((l) => `    ${l}`) : ["    (none provided)"]),
+			"</test_evidence>",
+		] : []),
 		"",
 		"Audit checklist:",
-		"1. Extract the real success criteria from the objective, including quality/reader outcomes.",
-		"2. Inspect artifacts or command output that can prove or disprove those criteria.",
-		"3. Explain missing or weak evidence, especially scaffold-vs-final quality gaps.",
-		"4. End with exactly <approved/> only if the objective is truly complete; otherwise end with exactly <disapproved/>.",
+		...(args.testResults ? [
+			"1. Extract the real success criteria from the objective, including quality/reader outcomes.",
+			"2. Inspect artifacts or command output that can prove or disprove those criteria.",
+			"3. Before running a test suite with bash, check the <test_evidence> block. If the executor has provided recent passing test results for that suite, accept them as evidence rather than re-running the tests.",
+			"4. Explain missing or weak evidence, especially scaffold-vs-final quality gaps.",
+			"5. End with exactly <approved/> only if the objective is truly complete; otherwise end with exactly <disapproved/>.",
+		] : [
+			"1. Extract the real success criteria from the objective, including quality/reader outcomes.",
+			"2. Inspect artifacts or command output that can prove or disprove those criteria.",
+			"3. Explain missing or weak evidence, especially scaffold-vs-final quality gaps.",
+			"4. End with exactly <approved/> only if the objective is truly complete; otherwise end with exactly <disapproved/>.",
+		]),
 		"",
 		"Progress reporting:",
 		"You have the report_auditor_progress tool available to report your progress to the user.",
@@ -240,6 +271,7 @@ export async function runGoalCompletionAuditor(args: {
 	goal: GoalRecord;
 	completionSummary?: string | null;
 	detailedSummary: string;
+	testResults?: AuditorTestResults | null;
 	signal?: AbortSignal;
 	onProgress?: AuditorProgressCallback;
 	/**
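
To make the prompt change above easier to inspect in isolation, the `<test_evidence>` rendering can be reproduced as a standalone function. This is a sketch under stated assumptions: `renderTestEvidence` is a hypothetical name, and the body only restates the conditional spread added to `buildGoalAuditorPrompt` in this diff.

```typescript
interface AuditorTestResults {
	exitCode: number;
	suiteName?: string;
	output?: string;
	timestamp?: string;
}

// Hypothetical standalone version of the added hunk: returns the prompt lines
// for the evidence block, or no lines at all when no evidence was supplied.
function renderTestEvidence(testResults?: AuditorTestResults | null): string[] {
	if (!testResults) return [];
	return [
		"",
		"Executor test evidence:",
		"<test_evidence>",
		`  Suite: ${testResults.suiteName ?? "(not specified)"}`,
		`  Exit code: ${testResults.exitCode}`,
		`  Timestamp: ${testResults.timestamp ?? "(not specified)"}`,
		`  Output:`,
		// Each output line is indented; an empty or missing output gets a placeholder.
		...(testResults.output
			? testResults.output.split("\n").map((l) => `    ${l}`)
			: ["    (none provided)"]),
		"</test_evidence>",
	];
}

console.log(renderTestEvidence({ exitCode: 0, suiteName: "npm test", output: "ok 1\nok 2" }).join("\n"));
```

Note that the block is rendered whenever `testResults` is present, regardless of `exitCode`; judging whether the results are "recent" and "passing" is left to the auditor via checklist item 3.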

package/extensions/goal.ts CHANGED

@@ -106,7 +106,6 @@ import {
 	shouldInjectPostCompactReminder,
 	validateGoalAbort,
 	validateGoalCompletion,
-	validateGoalUpdate,
 	validatePauseGoal,
 	validateResumeGoal,
 } from "./goal-policy.ts";
@@ -441,6 +440,8 @@ export default function goalExtension(pi: ExtensionAPI): void {
 				active.add(QUESTIONNAIRE_TOOL_NAME);
 			} else if (state.goal?.status === "active") {
 				for (const name of goalExecutionWorkTools) active.add(name);
+				active.add(QUESTION_TOOL_NAME);
+				active.add(QUESTIONNAIRE_TOOL_NAME);
 			}
 			pi.setActiveTools(Array.from(active));
 		} catch {}
@@ -1705,66 +1706,27 @@ export default function goalExtension(pi: ExtensionAPI): void {
 			"Do not call update_goal merely because work is stopping, substantial progress was made, or tests passed without covering every requirement.",
 			"Do not use update_goal=complete as an escape hatch when you are blocked. If you are blocked, call pause_goal({reason, suggestedAction?}) instead so the user can intervene.",
 			"For sisyphus goals, do not mark complete until every numbered step has been executed and individually verified against its done criterion.",
-			"If the user gives requirements, feedback, or corrections that differ from the goal objective, the goal is stale. Use update_goal with updatedObjective to sync the objective before continuing work or before marking the goal complete. This ensures the auditor evaluates against the latest requirements.",
+			"The goal objective is immutable. The agent MUST NOT modify the goal objective on its own initiative. If the user gives requirements, feedback, or corrections that differ from the goal objective, ask the user to run /goal-tweak to revise the goal. Use goal_question to confirm when the change is ambiguous.",
+			"If you have just run the test suite successfully and the tests all pass, include a testResults object with the exit code (0) and relevant output. The auditor will see this evidence and can skip re-running the tests.",
 		],
 		parameters: Type.Object({
 			status: Type.Optional(StringEnum([COMPLETE_STATUS] as const, { description: "Set to complete only when the objective is achieved." })),
 			completionSummary: Type.Optional(Type.String({ description: "Concise completion claim and evidence summary passed to the independent auditor agent." })),
 			confirmBypassAuditor: Type.Optional(Type.Boolean({ description: "Set to true to confirm bypassing the independent auditor when it is disabled in settings." })),
-			updatedObjective: Type.Optional(Type.String({ description: "Revised goal objective. Use when the user's requirements have changed mid-flight. The goal remains active so the agent can continue working toward the new objective. Can be combined with status=complete to update the objective before the completion audit." })),
-		}),
+			testResults: Type.Optional(Type.Object({
+				exitCode: Type.Number({ description: "Exit code of the test run (0 = success)" }),
+				suiteName: Type.Optional(Type.String({ description: "Test suite name, e.g. 'npm test'" })),
+				output: Type.Optional(Type.String({ description: "Last lines of test output showing results" })),
+				timestamp: Type.Optional(Type.String({ description: "ISO timestamp of when tests were run" })),
+			}, { description: "Structured test evidence passed to the auditor so it can skip redundant test re-runs. If you have just run the test suite successfully, include this so the auditor accepts the results without re-running." })),
+		}, { additionalProperties: false }),
 		executionMode: "sequential",
 		async execute(_toolCallId, params, signal, _onUpdate, ctx) {
 			reconcileFocusedGoalFromDisk(ctx);
-			// -- Phase 1: Objective update (quick sync) --
-			// Apply updatedObjective before any completion logic so the completion
-			// flow (if status=complete is also set) reads the latest objective.
-			if (params.updatedObjective !== undefined) {
-				const newObjective = params.updatedObjective.trim();
-				if (!newObjective) throw new Error("update_goal requires a non-empty updatedObjective.");
-				const updateGate = validateGoalUpdate({ goal: state.goal });
-				if (!updateGate.ok) {
-					return {
-						content: [{ type: "text", text: updateGate.message }],
-						details: goalDetails(state.goal),
-					};
-				}
-				if (!state.goal) throw new Error("Goal disappeared during objective update.");
-				const next: GoalRecord = {
-					...state.goal,
-					objective: newObjective,
-					updatedAt: nowIso(),
-				};
-				state.goal = writeActiveGoalFile(ctx, next);
-				pi.appendEntry(STATE_ENTRY, goalDetails(state.goal));
-				try {
-					appendGoalEvent(ctx, {
-						type: "goal_tweaked",
-						goalId: state.goal.id,
-						changeSummary: "Objective updated via update_goal",
-						at: state.goal.updatedAt,
-					});
-				} catch {
-					// Ledger append failure should not block update
-				}
-				updateUI(ctx);
-				// Quick sync only (no status=complete) — return without terminating
-				if (params.status !== COMPLETE_STATUS) {
-					return {
-						content: [{ type: "text", text: `Goal objective updated.` }],
-						details: goalDetails(state.goal),
-					};
-				}
-				// Fall through: status=complete also set, proceed with completion below
-			}
 			// -- Phase 2: Status validation --
 			if (params.status !== COMPLETE_STATUS) {
-				if (params.updatedObjective === undefined) {
-					throw new Error("update_goal requires either status=complete or updatedObjective.");
-				}
 				throw new Error("update_goal requires status=complete when marking a goal complete.");
 			}
@@ -1913,6 +1875,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 				goal: auditTarget,
 				completionSummary: params.completionSummary,
 				detailedSummary: detailedSummary(auditTarget),
+				testResults: params.testResults,
 				signal: auditAbortController.signal,
 				onProgress: (progress) => {
 					auditProgress = {
@@ -2064,7 +2027,7 @@ export default function goalExtension(pi: ExtensionAPI): void {
 			};
 		},
 		renderCall(args, theme) {
-			const label = args?.status ?? args?.updatedObjective ? "sync" : "";
+			const label = args?.status ?? "";
 			return new Text(theme.fg("toolTitle", "update_goal ") + theme.fg("success", label), 0, 0);
 		},
 		renderResult(result, _options, theme) {
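
The `renderCall` change above is worth a closer look, because `??` binds more tightly than the conditional operator. The sketch below compares the two label expressions outside the extension; `oldLabel`, `newLabel`, and the `Args` type are hypothetical names used only for this comparison.

```typescript
type Args = { status?: string; updatedObjective?: string } | undefined;

// 0.10.0 expression, copied from the removed line. It parses as
// (args?.status ?? args?.updatedObjective) ? "sync" : "",
// so the result is always either "sync" or the empty string.
function oldLabel(args: Args): string {
	return args?.status ?? args?.updatedObjective ? "sync" : "";
}

// 0.10.2 expression, copied from the added line: the status, or "".
function newLabel(args: Args): string {
	return args?.status ?? "";
}

console.log(JSON.stringify(oldLabel({ status: "complete" }))); // "sync"
console.log(JSON.stringify(newLabel({ status: "complete" }))); // "complete"
console.log(JSON.stringify(newLabel(undefined)));              // ""
```

Under this reading, the 0.10.0 label showed "sync" even for a plain completion call, and the 0.10.2 form shows the status value itself.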

package/extensions/prompts/goal-prompts.ts CHANGED

@@ -36,6 +36,8 @@ ${untrustedObjectiveBlock(goal)}
 Available work tools for pursuing the active goal include write, read, bash, and edit. Use those tools directly for file and shell work; do not call get_goal repeatedly to discover tools.
+To ask the user a structured question (e.g. when the user's spec changes and you need to clarify before updating the goal), use goal_question. It opens a question dialog and returns the user's answer as tool output. Use plain conversation for simple clarifications.
 Keep this goal in force until it is actually achieved. Do not pause for confirmation just because a phase, chapter, file, or checklist item is finished. At each natural stopping point, compare every explicit requirement with concrete evidence from the workspace/session. If the objective is complete, call update_goal with status=complete and summarize the evidence; update_goal will launch an independent pi auditor agent and only archive if that auditor returns <approved/>. If it is not complete, choose the next concrete action and do it.
 The completion auditor is independent and semantic, not a paperwork checklist. It may inspect files and command output, and it will reject scaffold-only, alpha, template, proxy-metric, or weakly verified completions with <disapproved/>.
@@ -46,7 +48,7 @@ If the user explicitly asks to abandon/cancel this goal, or the objective is obs
 Do NOT silently invent workarounds, fake completion, or quietly redefine the objective. Do NOT call update_goal=complete to escape a blocker.
-Goal evolution: if the user gives requirements, feedback, or corrections that differ from the goal objective, the goal is stale. Propose the updated objective concisely and wait for the user to confirm before continuing. Use update_goal with updatedObjective for narrow focus-area changes, or suggest /goal-tweak for broader revisions (boundaries, constraints, multiple sections). Do NOT mark the goal complete with a stale objective.${sisyphusDisciplineBlock(goal) ? `\n${sisyphusDisciplineBlock(goal)}` : ""}`;
+Goal evolution: if the user gives requirements, feedback, or corrections that differ from the goal objective, the goal is stale. The goal objective is immutable — the agent must NOT modify it autonomously. Propose the updated objective concisely and ask the user to run /goal-tweak to revise it. Do NOT mark the goal complete with a stale objective.${sisyphusDisciplineBlock(goal) ? `\n${sisyphusDisciplineBlock(goal)}` : ""}`;
 }
 export function continuationPrompt(goal: GoalRecord): string {
@@ -62,6 +64,8 @@ export function continuationPrompt(goal: GoalRecord): string {
 		"",
 		"Available work tools for pursuing the active goal include write, read, bash, and edit. Use those tools directly for file and shell work; do not call get_goal repeatedly to discover tools.",
 		"",
+		"To ask the user a structured question (e.g. when the user's spec changes and you need to clarify before updating the goal), use goal_question. It opens a question dialog and returns the user's answer as tool output. Use plain conversation for simple clarifications.",
+		"",
 		"Avoid repeating work that is already done. Choose the next concrete action toward the objective.",
 		"",
 		"Before deciding that the goal is achieved, perform a completion audit against the actual current state:",
@@ -79,7 +83,7 @@ export function continuationPrompt(goal: GoalRecord): string {
 		"Do not call update_goal unless the goal is complete enough to survive independent semantic auditing. Do not mark a goal complete merely because work is stopping.",
 		"Do not ask the user for confirmation unless there is a real blocker.",
 		"",
-		"Goal evolution: if the user gives requirements, feedback, or corrections that differ from the goal objective, the goal is stale. Propose the updated objective concisely and wait for the user to confirm before continuing. Use update_goal with updatedObjective for narrow focus-area changes, or suggest /goal-tweak for broader revisions (boundaries, constraints, multiple sections). Do NOT mark the goal complete with a stale objective.",
+		"Goal evolution: if the user gives requirements, feedback, or corrections that differ from the goal objective, the goal is stale. The goal objective is immutable — the agent must NOT modify it autonomously. Propose the updated objective concisely and ask the user to run /goal-tweak to revise it. Do NOT mark the goal complete with a stale objective.",
 		"",
 		"If you hit a real blocker (missing credentials, contradictory spec, file/permission you cannot access, dangerous operation pending user approval, or an unclear Sisyphus-style ordered plan), call pause_goal({reason, suggestedAction?}) and stop. If the user explicitly asks to abandon/cancel, or the objective is obsolete, impossible, or unsafe to continue, call abort_goal({reason}) and stop. Do not silently invent workarounds. Do not fake completion. pause_goal and abort_goal are structured lifecycle exits; update_goal=complete is not an escape hatch for blockers.",
 		...(goal.sisyphus ? ["", sisyphusDisciplineBlock(goal)] : []),

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pi-goal-x",
-  "version": "0.10.0",
+  "version": "0.10.2",
   "description": "Goal mode extension for pi: persistent long-running objectives, /goal-set drafting, Sisyphus prompt style, autoContinue, and an above-editor status overlay. Fork of @capyup/pi-goal.",
   "license": "MIT",
   "author": "pi-goal-x contributors",