@oisincoveney/pipeline 2.4.0 → 2.6.0

Files changed (20)

package/.agents/skills/orchestrate/SKILL.md +12 -0
package/dist/config/schemas.d.ts +16 -3
package/dist/config/schemas.js +17 -0
package/dist/moka-submit.d.ts +1 -1
package/dist/pipeline-runtime.js +1 -0
package/dist/planning/generate.js +3 -1
package/dist/runner-command-contract.d.ts +2 -2
package/dist/runtime/agent-node/agent-node.js +69 -3
package/dist/runtime/builtins/builtins.js +2 -0
package/dist/runtime/handoff.d.ts +1 -0
package/dist/runtime/handoff.js +91 -0
package/dist/runtime/node-state-store.js +9 -0
package/dist/runtime/opencode-session-executor.js +5 -2
package/dist/runtime/parallel-node/parallel-node.js +36 -3
package/dist/runtime/parallel-worktrees/parallel-worktrees.js +132 -0
package/dist/runtime/select-candidate/select-candidate.js +116 -0
package/dist/schedule/passes/candidates.js +42 -0
package/dist/schedule/passes/index.js +1 -0
package/dist/token-estimator.js +6 -5
package/package.json +1 -1

package/.agents/skills/orchestrate/SKILL.md CHANGED

@@ -70,6 +70,18 @@ Whichever host you are on, run the same six steps:
 5. **Learn** — Once the gates pass, run `MoKa Learner` to store durable lessons from the run (qdrant memory) when there is something worth reusing. This mirrors the canonical pipeline's LEARN phase; skip it only when the run produced nothing reusable.
 6. **Synthesize** — Report only the evidence the agents actually returned: what passed, what the diff is, what the reviewers proved. Never fabricate or assume an outcome an agent did not report.
+## Task sizing, reliability & token budget
+Token usage is the dominant cost and quality lever — it explains the bulk of agent performance variance, and context degrades well before a model's window fills. But the first job of sizing is **reliable completion**: a lane an agent can't finish is worthless however cheap. Size the work accordingly:
+- **Size for reliable completion first.** Each lane must be small enough that a single agent session finishes it cleanly. If an agent times out, stalls, or returns having only *planned* without producing its artifact, the lane was **too big** — split it into smaller lanes (one file, section, or concern each); do **not** just raise the timeout, that re-runs the same flake. **Slow is fine; flaky is not** — many small lanes that each reliably complete beat one big lane that gambles. Lanes that share a file run sequentially; only truly independent lanes fan out. Treat repeated stalls as a decomposition bug, not bad luck.
+- **Under-timeouts and permission walls are the real flake sources — not a step cap.** Per opencode's docs, an agent with no `steps`/`maxSteps` set "will continue to iterate until the model chooses to stop or the user interrupts the session" — i.e. **no hard step budget by default** (the MoKa agents set none). So a Code Writer that returns having only *planned* was not hitting a step limit; it was killed by too short a dispatch timeout or blocked on a denied read (e.g. `external_directory: deny`). Fixes: give long multi-file authoring runs **generous wall-clock** (do not kill them early — they are slow, not capped); scope lanes so they don't need denied/external reads; and only if you must *bound* a runaway agent, set `steps` in its config. Smaller lanes still help (less work = faster, fewer surprises), but "multi-file authoring can't be delegated" is a timeout/scoping issue, not an opencode limit.
+- **Scale fan-out to complexity, not ambition.** A trivial change is one agent (or just do it inline); a bounded change is 1–3 lanes; only go wide for genuinely independent breadth. Code parallelizes poorly — keep writer lanes narrow (the pipeline caps `green`/code fan-out at 2 for exactly this reason).
+- **Keep each agent's context small and high-signal.** Pass context by path and hand over the distilled `research.json`, never raw repo dumps. A lane that needs half the repo in its context is mis-scoped — split it.
+- **Distilled returns.** Expect each sub-agent to return a ~1–2k-token summary of its result, not its full transcript. Gather the summary; don't re-read the work.
+- **Re-dispatch once, with evidence.** On a gate `FAIL`, re-dispatch the failing lane a *single* time with concentrated failure evidence — do not thrash. Each fresh `opencode run` re-pays the full cold-start context tax (~35k tokens of standup before any work), so a retry loop is expensive; fix the input, not the dice.
+- **Smallest roster that covers the work.** Every extra lane is another cold standup. Default to the fewest specialists that close the task; add a lane only when it genuinely runs independently.
 ## Rules
 - **Doctrine is host-neutral; only the Dispatch section is host-specific.** Do not leak `opencode run` syntax into an OpenCode run or Task-tool talk into a Claude run.

package/dist/config/schemas.d.ts CHANGED

@@ -344,8 +344,8 @@ declare const configSchema: z.ZodObject<{
   rules: z.ZodDefault<z.ZodRecord<z.ZodString, z.ZodObject<{
     path: z.ZodString;
     source_root: z.ZodDefault<z.ZodEnum<{
-      project: "project";
       package: "package";
+      project: "project";
     }>>;
   }, z.core.$strict>>>;
   runners: z.ZodDefault<z.ZodRecord<z.ZodString, z.ZodObject<{
@@ -472,8 +472,8 @@ declare const configSchema: z.ZodObject<{
   schedules: z.ZodDefault<z.ZodRecord<z.ZodString, z.ZodObject<{
     description: z.ZodOptional<z.ZodString>;
     baseline: z.ZodEnum<{
-      quick: "quick";
       execute: "execute";
+      quick: "quick";
     }>;
     max_parallel_nodes: z.ZodOptional<z.ZodNumber>;
     node_catalog: z.ZodOptional<z.ZodString>;
@@ -485,13 +485,26 @@ declare const configSchema: z.ZodObject<{
   skills: z.ZodDefault<z.ZodRecord<z.ZodString, z.ZodObject<{
     path: z.ZodString;
     source_root: z.ZodDefault<z.ZodEnum<{
-      project: "project";
       package: "package";
+      project: "project";
     }>>;
   }, z.core.$strict>>>;
   task_context: z.ZodOptional<z.ZodObject<{
     type: z.ZodString;
   }, z.core.$loose>>;
+  best_of_n: z.ZodOptional<z.ZodObject<{
+    categories: z.ZodDefault<z.ZodArray<z.ZodString>>;
+    enabled: z.ZodDefault<z.ZodBoolean>;
+    judge_model: z.ZodOptional<z.ZodString>;
+    n: z.ZodDefault<z.ZodNumber>;
+  }, z.core.$strict>>;
+  context_handoff: z.ZodOptional<z.ZodObject<{
+    enabled: z.ZodDefault<z.ZodBoolean>;
+    model: z.ZodOptional<z.ZodString>;
+  }, z.core.$strict>>;
+  parallel_worktrees: z.ZodOptional<z.ZodObject<{
+    enabled: z.ZodDefault<z.ZodBoolean>;
+  }, z.core.$strict>>;
   token_budget: z.ZodDefault<z.ZodObject<{
     default_context_window: z.ZodDefault<z.ZodNumber>;
     max_context_pct: z.ZodDefault<z.ZodNumber>;

package/dist/config/schemas.js CHANGED

@@ -461,6 +461,17 @@ const DEFAULT_TOKEN_BUDGET = {
 		by_category: {}
 	}
 };
+const contextHandoffSchema = z.object({
+	enabled: z.boolean().default(false),
+	model: z.string().optional()
+}).strict();
+const parallelWorktreesSchema = z.object({ enabled: z.boolean().default(false) }).strict();
+const bestOfNSchema = z.object({
+	categories: z.array(z.string()).default(["green"]),
+	enabled: z.boolean().default(false),
+	judge_model: z.string().optional(),
+	n: z.number().int().positive().default(1)
+}).strict();
 const pipelineFileSchema = z.object({
 	default_workflow: z.string(),
 	entrypoints: strictRecord(entrypointSchema).default({}),
@@ -482,6 +493,9 @@ const pipelineFileSchema = z.object({
 	}),
 	schedules: strictRecord(schedulePolicySchema).default({}),
 	task_context: taskContextResolverSchema.optional(),
+	best_of_n: bestOfNSchema.optional(),
+	context_handoff: contextHandoffSchema.optional(),
+	parallel_worktrees: parallelWorktreesSchema.optional(),
 	token_budget: tokenBudgetSchema.default(DEFAULT_TOKEN_BUDGET),
 	workflows: strictRecord(workflowSchema).default({}),
 	version: z.literal(1)
@@ -513,6 +527,9 @@ const configSchema = z.object({
 	schedules: strictRecord(schedulePolicySchema).default({}),
 	skills: strictRecord(pathRefSchema).default({}),
 	task_context: taskContextResolverSchema.optional(),
+	best_of_n: bestOfNSchema.optional(),
+	context_handoff: contextHandoffSchema.optional(),
+	parallel_worktrees: parallelWorktreesSchema.optional(),
 	token_budget: tokenBudgetSchema.default(DEFAULT_TOKEN_BUDGET),
 	version: z.literal(1),
 	workflows: strictRecord(workflowSchema).default({})
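
The three new top-level keys (`best_of_n`, `context_handoff`, `parallel_worktrees`) are optional, and each feature defaults to disabled. As a rough illustration only, the sketch below mirrors the defaults declared in `bestOfNSchema`, `contextHandoffSchema`, and `parallelWorktreesSchema` using plain objects instead of zod. `applyFeatureDefaults` and `FEATURE_DEFAULTS` are hypothetical names, not package exports, and the sketch skips the validation (`.strict()`, `.int().positive()`) that the real schemas perform.

```javascript
// Hypothetical helper (not part of @oisincoveney/pipeline): fills in the
// defaults that the zod schemas in this diff declare, without validating.
const FEATURE_DEFAULTS = {
  best_of_n: { categories: ["green"], enabled: false, n: 1 },
  context_handoff: { enabled: false },
  parallel_worktrees: { enabled: false },
};

function applyFeatureDefaults(config) {
  const out = { ...config };
  for (const [key, defaults] of Object.entries(FEATURE_DEFAULTS)) {
    // Each block is optional: an omitted key stays undefined.
    if (config[key] !== undefined) out[key] = { ...defaults, ...config[key] };
  }
  return out;
}

// Example input: best-of-3 switched on, context handoff present but empty.
const resolved = applyFeatureDefaults({
  best_of_n: { enabled: true, n: 3 },
  context_handoff: {},
});
console.log(JSON.stringify(resolved, null, 2));
```

Because an omitted block stays `undefined`, runtime checks such as `context.config.parallel_worktrees?.enabled` evaluate to a falsy value unless the block is present and explicitly enabled.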

package/dist/moka-submit.d.ts CHANGED

@@ -160,8 +160,8 @@ declare const mokaSubmitOptionsSchema: z.ZodDiscriminatedUnion<[z.ZodObject<{
   }, z.core.$strict>>;
   serviceAccountName: z.ZodOptional<z.ZodString>;
   mode: z.ZodEnum<{
-    full: "full";
     quick: "quick";
+    full: "full";
   }>;
   schedulePath: z.ZodOptional<z.ZodString>;
   scheduleYaml: z.ZodOptional<z.ZodString>;

package/dist/pipeline-runtime.js CHANGED

@@ -628,6 +628,7 @@ async function executeNodeAttemptCycle(node, context, attempt, previous) {
 	const beforeSnapshot = context.nodeStateStore.getSnapshot(node.id);
 	if (beforeSnapshot) context.nodeStateStore.setSnapshot(node.id, diffChangedFiles(beforeSnapshot, afterSnapshot, context.worktreePath));
 	context.nodeStateStore.recordOutput(node.id, last.output);
+	context.nodeStateStore.recordHandoff(node.id, last.handoff);
 	emitNodeOutputRecorded(context, node, attempt, last.output);
 	recordNodeEvent(context, node.id, {
 		at: now(),

package/dist/planning/generate.js CHANGED

@@ -5,6 +5,7 @@ import { createRunnerLaunchPlan, runLaunchPlan } from "../runner.js";
 import { normalizeRunnerOutput } from "../runner-output.js";
 import { loadBacklogPlanningContext } from "../schedule/backlog-context.js";
 import { baselineScheduleArtifact } from "../schedule/baseline.js";
+import { expandBestOfNCandidates } from "../schedule/passes/candidates.js";
 import { dependentsByNeed, flattenNodes, hasReachableDependent } from "./graph.js";
 import { isCoverageNode, isImplementationNode } from "../schedule/scheduling-roles.js";
 import { addGeneratedImplementationCoverage } from "../schedule/passes/coverage.js";
@@ -91,7 +92,7 @@ async function generateScheduleArtifact(options) {
 	const planningContext = { ...loadBacklogPlanningContext(options.task, options.worktreePath) };
 	const generatedArtifact = await planScheduleArtifact(baseline, policy.planner_profile, options, planningContext);
 	assertSchedulePassOrder();
-	const artifact = hydrateScheduleTaskContexts(canonicalizeGeneratedScheduleIds(applyNodeCatalogModelFallbacks(options.config, policy.node_catalog, addGeneratedImplementationCoverage(options.config, generatedArtifact))), planningContext);
+	const artifact = hydrateScheduleTaskContexts(canonicalizeGeneratedScheduleIds(applyNodeCatalogModelFallbacks(options.config, policy.node_catalog, expandBestOfNCandidates(options.config, addGeneratedImplementationCoverage(options.config, generatedArtifact)))), planningContext);
 	validateScheduleArtifact(options.config, artifact, planningContext);
 	compileScheduleArtifact(options.config, artifact, options.worktreePath);
 	return {
@@ -102,6 +103,7 @@ async function generateScheduleArtifact(options) {
 function assertSchedulePassOrder() {
 	if (SCHEDULE_PASS_ORDER.join("\0") !== [
 		"coverage",
+		"candidates",
 		"models",
 		"ids",
 		"references"

package/dist/runner-command-contract.d.ts CHANGED

@@ -43,8 +43,8 @@ declare const runnerDeliverySchema: z.ZodObject<{
 declare const mokaSubmissionSchema: z.ZodDiscriminatedUnion<[z.ZodObject<{
   kind: z.ZodLiteral<"graph">;
   mode: z.ZodEnum<{
-    full: "full";
     quick: "quick";
+    full: "full";
   }>;
 }, z.core.$strict>, z.ZodObject<{
   argv: z.ZodArray<z.ZodString>;
@@ -104,8 +104,8 @@ declare const runnerCommandPayloadSchema: z.ZodObject<{
   submission: z.ZodDefault<z.ZodDiscriminatedUnion<[z.ZodObject<{
     kind: z.ZodLiteral<"graph">;
     mode: z.ZodEnum<{
-      full: "full";
       quick: "quick";
+      full: "full";
     }>;
   }, z.core.$strict>, z.ZodObject<{
     argv: z.ZodArray<z.ZodString>;

package/dist/runtime/agent-node/agent-node.js CHANGED

@@ -9,6 +9,7 @@ import "../events/index.js";
 import { gatewayServerForProfile } from "../../mcp/gateway.js";
 import { selectNodeModel } from "../../model-resolver.js";
 import { estimateTokens } from "../../token-estimator.js";
+import { handoffFinalizerPrompt, parseHandoff, renderHandoff, synthesizeMinimalHandoff } from "../handoff.js";
 import { readFileSync } from "node:fs";
 //#region src/runtime/agent-node/agent-node.ts
 async function executeAgentNode(node, context, attempt) {
@@ -63,7 +64,8 @@ async function executeAgentNode(node, context, attempt) {
 		result,
 		attempt
 	});
-	return {
+	const handoff = await maybeDeriveHandoff(context, node, finalized.output, attempt);
+	return withOptionalHandoff({
 		evidence: [
 			`agent boundary node=${node.id} profile=${node.profile} runner=${plan.runnerId}`,
 			`estimated context tokens: ${decision.estimatedTokens}`,
@@ -76,7 +78,62 @@ async function executeAgentNode(node, context, attempt) {
 		exitCode: result.exitCode,
 		output: finalized.output,
 		timedOut: result.timedOut
+	}, handoff);
+}
+function withOptionalHandoff(result, handoff) {
+	return handoff ? {
+		...result,
+		handoff
+	} : result;
+}
+function profileRunner(context, node) {
+	return node.profile ? context.config.profiles[node.profile]?.runner : void 0;
+}
+/**
+* PIPE-83.1: derive a structured NodeHandoff for this node when context_handoff
+* is enabled. Fast-path reuses an already-handoff-shaped output; otherwise a
+* cheap read-only finalizer (mirroring createOutputRepairPlan) summarizes the
+* raw output, falling back to a synthesized minimal handoff. Returns undefined
+* when disabled so behaviour is unchanged.
+*/
+async function maybeDeriveHandoff(context, node, rawOutput, attempt) {
+	if (!context.config.context_handoff?.enabled) return;
+	return parseHandoff(rawOutput) ?? await runHandoffFinalizer(context, node, rawOutput, attempt);
+}
+async function runHandoffFinalizer(context, node, rawOutput, attempt) {
+	const runner = profileRunner(context, node);
+	if (!(runner && rawOutput.trim())) return synthesizeMinimalHandoff(rawOutput);
+	const plan = createHandoffFinalizerPlan(context, node, runner, rawOutput);
+	context.agentInvocations.push(plan);
+	emitAgentStart(context, plan, attempt);
+	const result = await context.executor(plan, { signal: context.signal });
+	emitAgentFinish(context, plan, attempt, result);
+	return parseHandoff(normalizeAgentOutput(plan, result.stdout).output) ?? synthesizeMinimalHandoff(rawOutput);
+}
+function createHandoffFinalizerPlan(context, node, runner, rawOutput) {
+	const finalizerProfileId = `${node.id}:handoff`;
+	const finalizerConfig = {
+		...context.config,
+		profiles: {
+			...context.config.profiles,
+			[finalizerProfileId]: {
+				filesystem: { mode: "read-only" },
+				instructions: { inline: "Summarize the agent output into a NodeHandoff JSON." },
+				network: { mode: "disabled" },
+				output: { format: "text" },
+				runner,
+				tools: []
+			}
+		}
 	};
+	const model = context.config.context_handoff?.model;
+	return createRunnerLaunchPlan(finalizerConfig, {
+		nodeId: finalizerProfileId,
+		profileId: finalizerProfileId,
+		prompt: handoffFinalizerPrompt(rawOutput),
+		worktreePath: context.worktreePath,
+		...model ? { model } : {}
+	});
 }
 /**
 * Pure model-routing decision for a node: estimate the assembled prompt size and
@@ -274,9 +331,18 @@ function renderAgentPrompt(node, context) {
 		"",
 		...inheritedOutputSections(node, context),
 		"Dependency outputs:",
-		...node.needs.map((need) => `## ${need}\n${context.nodeStateStore.outputText(need)}`)
+		...node.needs.map((need) => renderDependencySection(need, context))
 	].filter(Boolean).join("\n");
 }
+/**
+* PIPE-83.5: render a dependency's curated NodeHandoff when one was derived
+* (PIPE-83.1), otherwise fall back to its raw output text. The fallback keeps
+* behaviour identical when context_handoff is disabled (no handoffs recorded).
+*/
+function renderDependencySection(nodeId, context) {
+	const handoff = context.nodeStateStore.handoff(nodeId);
+	return handoff ? renderHandoff(nodeId, handoff) : `## ${nodeId}\n${context.nodeStateStore.outputText(nodeId)}`;
+}
 function renderGateOutputContract(node) {
 	const gates = node.gates ?? [];
 	const hasAcceptanceGate = gates.some((gate) => gate.kind === "acceptance" && (gate.target === void 0 || gate.target === "stdout"));
@@ -319,7 +385,7 @@ function inheritedOutputSections(node, context) {
 	if (inherited.length === 0) return [];
 	return [
 		"Inherited dependency outputs:",
-		...inherited.map((id) => `## ${id}\n${context.nodeStateStore.outputText(id)}`),
+		...inherited.map((id) => renderDependencySection(id, context)),
 		""
 	];
 }

package/dist/runtime/builtins/builtins.js CHANGED

@@ -1,10 +1,12 @@
 import { runFallow, runJscpd, runLint, runSemgrep, runTests, runTypecheck } from "../../gates.js";
 import { executeDrainMergeBuiltin } from "../drain-merge/drain-merge.js";
 import "../drain-merge/index.js";
+import { executeSelectCandidateBuiltin } from "../select-candidate/select-candidate.js";
 //#region src/runtime/builtins/builtins.ts
 async function executeBuiltin(builtin, context, node) {
 	switch (builtin) {
 		case "drain-merge": return executeDrainMergeBuiltin(context, node);
+		case "select-candidate": return executeSelectCandidateBuiltin(context, node);
 		case "test": {
 			const result = await runTests(context.worktreePath, context.signal);
 			return {

package/dist/runtime/handoff.d.ts ADDED

@@ -0,0 +1 @@
+import { z } from "zod";

package/dist/runtime/handoff.js ADDED

@@ -0,0 +1,91 @@
+import { z } from "zod";
+//#region src/runtime/handoff.ts
+/**
+* NodeHandoff (PIPE-83.1) — the curated, typed envelope a node hands to its
+* dependents in place of its raw transcript. PIPE-83.5 makes renderAgentPrompt
+* consume these instead of re-hydrating every upstream node's full output text;
+* PIPE-83.10 persists them durably as the unit of cross-node state.
+*
+* Produced by DERIVING from a node's raw output via a cheap finalizer (see
+* agent-node), with a synthesized minimal fallback when no structured handoff
+* is available so existing consumers keep working unchanged.
+*/
+const MARKDOWN_JSON_FENCE_RE = /^\s*```(?:json)?\s*\r?\n([\s\S]*?)\r?\n```\s*$/i;
+const SUMMARY_FALLBACK_MAX_CHARS = 600;
+const handoffArtifactSchema = z.object({
+	lineRange: z.tuple([z.number().int().nonnegative(), z.number().int().nonnegative()]).optional(),
+	path: z.string().min(1)
+});
+const nodeHandoffSchema = z.object({
+	artifacts: z.array(handoffArtifactSchema).default([]),
+	decisions: z.array(z.string()).default([]),
+	openQuestions: z.array(z.string()).default([]),
+	summary: z.string(),
+	testNames: z.array(z.string()).default([])
+});
+/**
+* Parse a candidate handoff JSON string (tolerant of a Markdown ```json fence).
+* Returns null when the text is not JSON or does not satisfy the schema, so the
+* caller can fall back rather than throw.
+*/
+function parseHandoff(raw) {
+	const source = MARKDOWN_JSON_FENCE_RE.exec(raw.trim())?.[1].trim() ?? raw.trim();
+	let value;
+	try {
+		value = JSON.parse(source);
+	} catch {
+		return null;
+	}
+	const result = nodeHandoffSchema.safeParse(value);
+	return result.success ? result.data : null;
+}
+/**
+* Minimal handoff synthesized from a node's raw output text. Used when no
+* structured handoff is derived, preserving the pre-PIPE-83 behaviour (the
+* summary stands in for the raw text downstream consumers used to receive).
+*/
+function synthesizeMinimalHandoff(outputText) {
+	return {
+		artifacts: [],
+		decisions: [],
+		openQuestions: [],
+		summary: outputText.trim().slice(0, SUMMARY_FALLBACK_MAX_CHARS),
+		testNames: []
+	};
+}
+/**
+* Render a handoff into the compact text a dependent node receives (PIPE-83.5):
+* the curated summary + non-empty sections, in place of the full raw transcript.
+*/
+function renderHandoff(nodeId, handoff) {
+	const sections = [
+		["Decisions:", handoff.decisions],
+		["Artifacts:", handoff.artifacts.map((a) => a.lineRange ? `${a.path}:${a.lineRange[0]}-${a.lineRange[1]}` : a.path)],
+		["Tests:", handoff.testNames],
+		["Open questions:", handoff.openQuestions]
+	];
+	const lines = [`## ${nodeId}`, handoff.summary];
+	for (const [heading, items] of sections) if (items.length > 0) lines.push(heading, ...items.map((item) => `- ${item}`));
+	return lines.join("\n");
+}
+/** Prompt for the cheap finalizer that derives a handoff from raw node output. */
+function handoffFinalizerPrompt(rawOutput) {
+	return [
+		"You are a handoff summarizer for a pipeline node.",
+		"Read the agent output below and return ONLY a JSON object describing what a",
+		"downstream node needs to continue — no Markdown fences, no prose outside JSON.",
+		"",
+		"Fields:",
+		"- \"summary\": string — concise description of what this node accomplished.",
+		"- \"decisions\": string[] — explicit choices made (libraries, APIs, approaches).",
+		"- \"artifacts\": {\"path\": string, \"lineRange\"?: [number, number]}[] — files touched.",
+		"- \"testNames\": string[] — tests added or changed.",
+		"- \"openQuestions\": string[] — unresolved items the next node should know.",
+		"Use empty arrays where nothing applies. Preserve facts; do not invent.",
+		"",
+		"Agent output:",
+		rawOutput
+	].join("\n");
+}
+//#endregion
+export { handoffFinalizerPrompt, parseHandoff, renderHandoff, synthesizeMinimalHandoff };
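
To make the rendered shape concrete, the sketch below copies `renderHandoff` and `synthesizeMinimalHandoff` from the added `handoff.js` and runs them on invented data. The node id `green-1`, the sample handoff, and the file paths are assumptions made up for illustration. Schema validation via zod is left out, so this shows only the rendering and fallback steps, not parsing.

```javascript
// Copied from the added handoff.js: cap for the fallback summary.
const SUMMARY_FALLBACK_MAX_CHARS = 600;

// Copied from handoff.js: summary plus every non-empty section as bullets.
function renderHandoff(nodeId, handoff) {
  const sections = [
    ["Decisions:", handoff.decisions],
    ["Artifacts:", handoff.artifacts.map((a) => a.lineRange ? `${a.path}:${a.lineRange[0]}-${a.lineRange[1]}` : a.path)],
    ["Tests:", handoff.testNames],
    ["Open questions:", handoff.openQuestions]
  ];
  const lines = [`## ${nodeId}`, handoff.summary];
  for (const [heading, items] of sections) if (items.length > 0) lines.push(heading, ...items.map((item) => `- ${item}`));
  return lines.join("\n");
}

// Copied from handoff.js: fallback when no structured handoff can be derived.
function synthesizeMinimalHandoff(outputText) {
  return {
    artifacts: [],
    decisions: [],
    openQuestions: [],
    summary: outputText.trim().slice(0, SUMMARY_FALLBACK_MAX_CHARS),
    testNames: []
  };
}

// Invented sample data (not from the package).
const sample = {
  summary: "Added retry logic to the fetch client.",
  decisions: ["Use exponential backoff"],
  artifacts: [{ path: "src/fetch.js", lineRange: [10, 42] }, { path: "README.md" }],
  testNames: [],
  openQuestions: ["Should retries be configurable?"]
};
const rendered = renderHandoff("green-1", sample);
const fallback = synthesizeMinimalHandoff("  " + "x".repeat(1000));
console.log(rendered);
console.log(fallback.summary.length);
```

The empty `testNames` array produces no `Tests:` heading, and the fallback keeps only the first 600 characters of the trimmed output, so a dependent node's prompt stays bounded even when the finalizer returns nothing usable.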

package/dist/runtime/node-state-store.js CHANGED

@@ -1,11 +1,13 @@
 //#region src/runtime/node-state-store.ts
 var NodeStateStore = class NodeStateStore {
+	handoffByNode;
 	inheritedOutputNodeIds;
 	lastOutputByNode;
 	nodeSnapshots;
 	nodeStates;
 	structuredOutputs;
 	constructor(input = {}) {
+		this.handoffByNode = input.handoffByNode ?? /* @__PURE__ */ new Map();
 		this.inheritedOutputNodeIds = input.inheritedOutputNodeIds ?? /* @__PURE__ */ new Set();
 		this.lastOutputByNode = input.lastOutputByNode ?? /* @__PURE__ */ new Map();
 		this.nodeSnapshots = input.nodeSnapshots ?? /* @__PURE__ */ new Map();
@@ -14,6 +16,7 @@ var NodeStateStore = class NodeStateStore {
 	}
 	forkForParallelChildren(children) {
 		return new NodeStateStore({
+			handoffByNode: new Map(this.handoffByNode),
 			inheritedOutputNodeIds: new Set(this.lastOutputByNode.keys()),
 			lastOutputByNode: new Map(this.lastOutputByNode),
 			nodeSnapshots: /* @__PURE__ */ new Map(),
@@ -34,6 +37,9 @@ var NodeStateStore = class NodeStateStore {
 	getOutput(nodeId) {
 		return this.lastOutputByNode.get(nodeId);
 	}
+	handoff(nodeId) {
+		return this.handoffByNode.get(nodeId);
+	}
 	outputText(nodeId) {
 		return this.lastOutputByNode.get(nodeId) ?? "";
 	}
@@ -47,6 +53,9 @@ var NodeStateStore = class NodeStateStore {
 	markInheritedOutput(nodeId) {
 		this.inheritedOutputNodeIds.add(nodeId);
 	}
+	recordHandoff(nodeId, handoff) {
+		if (handoff) this.handoffByNode.set(nodeId, handoff);
+	}
 	recordOutput(nodeId, output) {
 		this.lastOutputByNode.set(nodeId, output);
 	}
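
The store changes pair with `renderDependencySection` in `agent-node.js`: a dependent node sees the curated handoff when one was recorded and the raw output text otherwise. The following is a minimal sketch of that interplay, assuming a cut-down stand-in for the store. `MiniStore`, `dependencySection`, and the stub renderer are hypothetical names used only here, and the real `NodeStateStore` tracks more state than this.

```javascript
// Simplified stand-in for NodeStateStore: only the handoff and output maps.
class MiniStore {
  constructor() {
    this.handoffByNode = new Map();
    this.lastOutputByNode = new Map();
  }
  recordOutput(nodeId, output) {
    this.lastOutputByNode.set(nodeId, output);
  }
  // As in the diff: an undefined handoff (feature disabled) records nothing.
  recordHandoff(nodeId, handoff) {
    if (handoff) this.handoffByNode.set(nodeId, handoff);
  }
  handoff(nodeId) {
    return this.handoffByNode.get(nodeId);
  }
  outputText(nodeId) {
    return this.lastOutputByNode.get(nodeId) ?? "";
  }
}

// Same branching as renderDependencySection; the renderer is passed in so the
// sketch stays self-contained (an assumption, the real code calls renderHandoff).
function dependencySection(store, nodeId, render) {
  const handoff = store.handoff(nodeId);
  return handoff ? render(nodeId, handoff) : `## ${nodeId}\n${store.outputText(nodeId)}`;
}

const stubRender = (nodeId, handoff) => `## ${nodeId}\n${handoff.summary}`;
const store = new MiniStore();
store.recordOutput("research", "long raw transcript");
store.recordHandoff("research", undefined); // context_handoff disabled
store.recordOutput("plan", "another long raw transcript");
store.recordHandoff("plan", { summary: "Plan: split into two lanes." });

const rawSection = dependencySection(store, "research", stubRender);
const curatedSection = dependencySection(store, "plan", stubRender);
console.log(rawSection);
console.log(curatedSection);
```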

package/dist/runtime/opencode-session-executor.js CHANGED

@@ -26,6 +26,9 @@ function createOpencodeExecutor(deps) {
 		}
 	};
 }
+function sessionDirectory(deps, plan) {
+	return plan.cwd ?? deps.directory;
+}
 async function driveSession(deps, plan, options) {
 	const sessionId = await resolveSessionId(deps, plan);
 	deps.onSession?.(plan.nodeId, sessionId);
@@ -34,7 +37,7 @@ async function driveSession(deps, plan, options) {
 		const data = unwrap(await deps.client.session.prompt({
 			body: promptBody(plan),
 			path: { id: sessionId },
-			query: { directory: deps.directory }
+			query: { directory: sessionDirectory(deps, plan) }
 		}));
 		return {
 			...data.info ? { assistant: data.info } : {},
@@ -50,7 +53,7 @@ async function resolveSessionId(deps, plan) {
 	if (existing) return existing;
 	const session = unwrap(await deps.client.session.create({
 		body: { title: `moka:${plan.nodeId}` },
-		query: { directory: deps.directory }
+		query: { directory: plan.cwd ?? deps.directory }
 	}));
 	deps.registry.sessions.set(plan.nodeId, session.id);
 	return session.id;

package/dist/runtime/parallel-node/parallel-node.js CHANGED

@@ -1,5 +1,6 @@
 import { childReporter } from "../events/events.js";
 import "../events/index.js";
+import { createChildWorktree, gcParallelWorktrees } from "../parallel-worktrees/parallel-worktrees.js";
 import pLimit from "p-limit";
 //#region src/runtime/parallel-node/parallel-node.ts
 async function executeParallelNode(node, context, runtime) {
@@ -9,6 +10,7 @@ async function executeParallelNode(node, context, runtime) {
 		exitCode: 1,
 		output: ""
 	};
+	gcStaleWorktrees(context);
 	const linkedAbort = createLinkedAbortController(context.signal);
 	const childContext = createParallelChildContext(context, node.id, children, context.plan.execution.failFast ? linkedAbort.controller.signal : context.signal);
 	try {
@@ -23,6 +25,37 @@ async function executeParallelNode(node, context, runtime) {
 		linkedAbort.cleanup();
 	}
 }
+function gcStaleWorktrees(context) {
+	if (context.config.parallel_worktrees?.enabled) gcParallelWorktrees(context.worktreePath);
+}
+/**
+* PIPE-83.4: run a parallel child in its own git worktree when enabled, so
+* concurrent candidate edits can't collide. The lease is created inside the
+* per-child callback (not before scheduling) so failFast-cleared children never
+* allocate a worktree; release retains dirty/unpushed work for downstream
+* selection. Default-off path is byte-identical to the prior behaviour.
+*/
+function runChildInWorktree(child, context, runtime) {
+	return context.config.parallel_worktrees?.enabled ? runInLease(child, context, runtime, createChildLease(child, context)) : runtime.executeNode(child, context);
+}
+function createChildLease(child, context) {
+	return createChildWorktree({
+		childNodeId: child.id,
+		parentNodeId: context.parentParallelNodeId ?? "parallel",
+		repoRoot: context.worktreePath,
+		...context.runId ? { runId: context.runId } : {}
+	});
+}
+async function runInLease(child, context, runtime, lease) {
+	try {
+		return await runtime.executeNode(child, {
+			...context,
+			worktreePath: lease.path
+		});
+	} finally {
+		lease.release();
+	}
+}
 function createParallelChildContext(context, parentNodeId, children, signal) {
 	return {
 		...context,
@@ -60,9 +93,9 @@ function createLinkedAbortController(signal) {
 }
 function executeParallelChildren(children, context, runtime) {
 	for (const child of children) runtime.markNodeReady(context, child.id);
-	if (!context.maxParallelNodes) return Promise.all(children.map((child) => runtime.executeNode(child, context)));
+	if (!context.maxParallelNodes) return Promise.all(children.map((child) => runChildInWorktree(child, context, runtime)));
 	const limit = pLimit(context.maxParallelNodes);
-	return Promise.all(children.map((child) => limit(() => runtime.executeNode(child, context))));
+	return Promise.all(children.map((child) => limit(() => runChildInWorktree(child, context, runtime))));
 }
 async function executeFailFastParallelChildren(children, context, abortController, runtime) {
 	for (const child of children) runtime.markNodeReady(context, child.id);
@@ -71,7 +104,7 @@ async function executeFailFastParallelChildren(children, context, abortControlle
 		rejectOnClear: true
 	});
 	return (await Promise.allSettled(children.map((child) => limit(async () => {
-		const result = await runtime.executeNode(child, context);
+		const result = await runChildInWorktree(child, context, runtime);
 		if (result.status === "failed") {
 			abortController.abort();
 			limit.clearQueue();

package/dist/runtime/parallel-worktrees/parallel-worktrees.js ADDED

@@ -0,0 +1,132 @@
+import { existsSync, mkdirSync, readFileSync, readdirSync, writeFileSync } from "node:fs";
+import { join } from "node:path";
+import { execFileSync } from "node:child_process";
+//#region src/runtime/parallel-worktrees/parallel-worktrees.ts
+/**
+* PIPE-83.4: git-worktree isolation for parallel candidate nodes. Each parallel
+* child runs in its own worktree on an auto-named branch so concurrent edits do
+* not collide. Teardown is idempotent and crash-safe: a worktree with dirty or
+* unpushed work is RETAINED (never deleted), and orphaned worktrees are GC'd on
+* startup using the same safety guard. A worktree is NOT a sandbox — node_modules
+* and build state are shared; real isolation remains k8s mode.
+*/
+const WORKTREE_ROOT = ".pipeline/worktrees";
+const REGISTRY_DIR = join(WORKTREE_ROOT, "registry");
+const OWNER = "oisin-pipeline";
+function git(cwd, args) {
+	return execFileSync("git", args, {
+		cwd,
+		encoding: "utf8"
+	}).trim();
+}
+function sanitize(id) {
+	return id.replace(/[^A-Za-z0-9._-]/g, "-");
+}
+function writeManifest(path, manifest) {
+	writeFileSync(path, `${JSON.stringify(manifest, null, 2)}\n`, "utf8");
+}
+function readManifest(path) {
+	return JSON.parse(readFileSync(path, "utf8"));
+}
+function createChildWorktree(opts) {
+	const runSeg = sanitize(opts.runId ?? "local");
+	const parentSeg = sanitize(opts.parentNodeId);
+	const childSeg = sanitize(opts.childNodeId);
+	const baseSha = git(opts.repoRoot, ["rev-parse", "HEAD"]);
+	const relPath = join(WORKTREE_ROOT, "trees", runSeg, parentSeg, childSeg);
+	const absPath = join(opts.repoRoot, relPath);
+	const branch = `pipeline/worktrees/${runSeg}/${parentSeg}/${childSeg}`;
+	const leaseId = `${runSeg}__${parentSeg}__${childSeg}`;
+	const registryAbs = join(opts.repoRoot, REGISTRY_DIR);
+	mkdirSync(registryAbs, { recursive: true });
+	const manifestPath = join(registryAbs, `${leaseId}.json`);
+	const manifest = {
+		baseSha,
+		branch,
+		childNodeId: opts.childNodeId,
+		leaseId,
+		owner: OWNER,
+		parentNodeId: opts.parentNodeId,
+		path: relPath,
+		runId: opts.runId,
+		schemaVersion: 1,
+		state: "creating"
+	};
+	writeManifest(manifestPath, manifest);
+	if (!existsSync(absPath)) git(opts.repoRoot, [
+		"worktree",
+		"add",
+		"-b",
+		branch,
+		absPath,
+		baseSha
+	]);
+	writeManifest(manifestPath, {
+		...manifest,
+		state: "active"
+	});
+	return {
+		baseSha,
+		branch,
+		leaseId,
+		path: absPath,
+		release: () => releaseWorktree(opts.repoRoot, manifestPath)
+	};
+}
+/** Idempotent, crash-safe teardown. Retains (never deletes) dirty/unpushed work. */
+function releaseWorktree(repoRoot, manifestPath) {
+	if (!existsSync(manifestPath)) return "removed";
+	const manifest = readManifest(manifestPath);
+	const absPath = join(repoRoot, manifest.path);
+	git(repoRoot, ["worktree", "prune"]);
+	if (!existsSync(absPath)) {
+		writeManifest(manifestPath, {
+			...manifest,
+			state: "removed"
+		});
+		return "removed";
+	}
+	const guarded = retentionState(absPath, manifest.baseSha);
+	if (guarded) {
+		writeManifest(manifestPath, {
+			...manifest,
+			state: guarded
+		});
+		return guarded;
+	}
+	git(repoRoot, [
+		"worktree",
+		"remove",
+		"--force",
+		absPath
+	]);
+	git(repoRoot, [
+		"branch",
+		"-D",
+		manifest.branch
+	]);
+	writeManifest(manifestPath, {
+		...manifest,
+		state: "removed"
+	});
+	return "removed";
+}
+/** Returns a retention reason when the worktree must be kept, else undefined. */
+function retentionState(absPath, baseSha) {
+	if (git(absPath, [
+		"status",
+		"--porcelain",
+		"--untracked-files=all"
+	]).length > 0) return "retained-dirty";
+	if (git(absPath, ["rev-parse", "HEAD"]) !== baseSha) return "retained-unpushed";
+}
+/** Startup GC: release every pipeline-owned lease using the same safety guard. */
+function gcParallelWorktrees(repoRoot) {
+	const registryAbs = join(repoRoot, REGISTRY_DIR);
+	if (!existsSync(registryAbs)) return [];
+	const results = readdirSync(registryAbs).sort().filter((file) => file.endsWith(".json")).map((file) => join(registryAbs, file)).filter((manifestPath) => readManifest(manifestPath).owner === OWNER).map((manifestPath) => releaseWorktree(repoRoot, manifestPath));
+	git(repoRoot, ["worktree", "prune"]);
+	return results;
+}
+//#endregion
+export { createChildWorktree, gcParallelWorktrees };
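
The retention guard in `releaseWorktree` decides whether a child worktree may be deleted. A minimal sketch of that decision follows, assuming the git output has already been read into strings. `classifyRetention` and the SHAs are hypothetical names and values for illustration, not part of the package. As written in the diff, the second check compares `HEAD` to the base commit rather than to a remote, so `retained-unpushed` is reported for any local commit.

```javascript
// Hypothetical pure version of the guard: the git output is passed in
// instead of being read from a real worktree.
function classifyRetention({ porcelainStatus, headSha, baseSha }) {
	// Any `git status --porcelain` output means uncommitted or untracked files.
	if (porcelainStatus.length > 0) return "retained-dirty";
	// A HEAD that moved off the base commit means the child committed work.
	if (headSha !== baseSha) return "retained-unpushed";
	// undefined: nothing to protect, so the worktree may be removed.
	return undefined;
}

const base = "a1b2c3d"; // made-up SHA
const clean = classifyRetention({ porcelainStatus: "", headSha: base, baseSha: base });
const dirty = classifyRetention({ porcelainStatus: "?? notes.txt", headSha: base, baseSha: base });
const ahead = classifyRetention({ porcelainStatus: "", headSha: "e4f5a6b", baseSha: base });
console.log(clean, dirty, ahead);
```

Only the `clean` case would reach the `git worktree remove --force` and `git branch -D` calls; the other two are recorded in the manifest and left on disk.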

package/dist/runtime/select-candidate/select-candidate.js ADDED

@@ -0,0 +1,116 @@
+import { createRunnerLaunchPlan } from "../../runner.js";
+import { normalizeRunnerOutput } from "../../runner-output.js";
+import { parseJsonObject } from "../json-validation/json-validation.js";
+import "../json-validation/index.js";
+//#region src/runtime/select-candidate/select-candidate.ts
+const SCORE_RE = /-?\d+(?:\.\d+)?/;
+function selectBestCandidate(candidates) {
+	const passing = candidates.filter((candidate) => candidate.status === "PASS");
+	if (passing.length === 0) return null;
+	return passing.reduce((best, candidate) => (candidate.judgeScore ?? 0) > (best.judgeScore ?? 0) ? candidate : best);
+}
+async function executeSelectCandidateBuiltin(context, node) {
+	const candidates = await scoreCandidates(context, readCandidates(context, node?.needs.at(0) ?? null));
+	const selected = selectBestCandidate(candidates);
+	if (!selected) return {
+		evidence: [`select-candidate: no passing candidate among ${candidates.length}`, ...candidates.map((candidate) => `- ${candidate.nodeId}: FAIL`)],
+		exitCode: 1,
+		output: ""
+	};
+	return {
+		evidence: [`select-candidate: selected '${selected.nodeId}' (judge=${selected.judgeScore ?? "n/a"}) from ${candidates.length} candidates`],
+		exitCode: 0,
+		output: selected.output
+	};
+}
+async function scoreCandidates(context, candidates) {
+	const model = context.config.best_of_n?.judge_model;
+	const runner = Object.keys(context.config.runners).at(0);
+	if (!(model && runner)) return candidates;
+	return await Promise.all(candidates.map((candidate) => scoreCandidate(context, candidate, runner, model)));
+}
+async function scoreCandidate(context, candidate, runner, model) {
+	const plan = judgePlan(context, candidate, runner, model);
+	context.agentInvocations.push(plan);
+	const judgeScore = parseScore(normalizeRunnerOutput(plan, (await context.executor(plan, { signal: context.signal })).stdout).output);
+	return judgeScore === null ? candidate : {
+		...candidate,
+		judgeScore
+	};
+}
+function judgePlan(context, candidate, runner, model) {
+	const profileId = `select-candidate:judge:${candidate.nodeId}`;
+	return createRunnerLaunchPlan({
+		...context.config,
+		profiles: {
+			...context.config.profiles,
+			[profileId]: {
+				filesystem: { mode: "read-only" },
+				instructions: { inline: "Score the candidate implementation." },
+				network: { mode: "disabled" },
+				output: { format: "text" },
+				runner,
+				tools: []
+			}
+		}
+	}, {
+		model,
+		nodeId: profileId,
+		profileId,
+		prompt: judgePrompt(context.task, candidate.output),
+		worktreePath: context.worktreePath
+	});
+}
+function judgePrompt(task, output) {
+	return [
+		"Score how well this candidate implementation satisfies the task.",
+		"Return ONLY a number between 0 and 1 (1 = best). No prose, no fences.",
+		"",
+		`Task: ${task}`,
+		"",
+		"Candidate result:",
+		output
+	].join("\n");
+}
+function parseScore(text) {
+	const match = SCORE_RE.exec(text);
+	if (!match) return null;
+	const value = Number(match[0]);
+	return Number.isFinite(value) ? Math.max(0, Math.min(1, value)) : null;
+}
+function readCandidates(context, upstreamNodeId) {
+	if (!upstreamNodeId) return [];
+	const upstream = context.plan.graph.node(upstreamNodeId);
+	const childrenOutput = parseJsonObject(parseJsonObject(context.nodeStateStore.getOutput(upstreamNodeId)).children);
+	return (upstream?.children ?? []).flatMap((child) => {
+		const raw = childrenOutput[child.id];
+		return raw === void 0 ? [] : [parseCandidate(child.id, raw)];
+	});
+}
+function parseCandidate(nodeId, raw) {
+	const output = typeof raw === "string" ? raw : JSON.stringify(raw);
+	const parsed = safeParseObject(output);
+	return {
+		judgeScore: candidateJudgeScore(parsed),
+		nodeId,
+		output,
+		status: candidateStatus(parsed)
+	};
+}
+function candidateStatus(parsed) {
+	if (!parsed) return "PASS";
+	return parsed.verdict === "FAIL" || parsed.status === "FAIL" ? "FAIL" : "PASS";
+}
+function candidateJudgeScore(parsed) {
+	return typeof parsed?.judge_score === "number" ? parsed.judge_score : null;
+}
+function safeParseObject(text) {
+	try {
+		const value = JSON.parse(text);
+		return value && typeof value === "object" ? value : null;
+	} catch {
+		return null;
+	}
+}
+//#endregion
+export { executeSelectCandidateBuiltin };
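
To see how the selector behaves on edge cases, the sketch below copies `SCORE_RE`, `parseScore` and `selectBestCandidate` from the diff above and runs them on made-up judge replies and candidates. The node ids and scores are assumptions for illustration only.

```javascript
// Copied from the diff above.
const SCORE_RE = /-?\d+(?:\.\d+)?/;
function parseScore(text) {
	const match = SCORE_RE.exec(text);
	if (!match) return null;
	const value = Number(match[0]);
	return Number.isFinite(value) ? Math.max(0, Math.min(1, value)) : null;
}
function selectBestCandidate(candidates) {
	const passing = candidates.filter((candidate) => candidate.status === "PASS");
	if (passing.length === 0) return null;
	return passing.reduce((best, candidate) => (candidate.judgeScore ?? 0) > (best.judgeScore ?? 0) ? candidate : best);
}

// The first number in the reply is used and clamped to [0, 1].
const plain = parseScore("0.85");          // 0.85
const outOfTen = parseScore("Score: 7/10"); // 7 is clamped to 1
const negative = parseScore("-0.3");        // clamped to 0
const missing = parseScore("no number");    // null, so the candidate keeps its prior score

// Hypothetical candidates: a failing one never wins, whatever its score.
const winner = selectBestCandidate([
	{ nodeId: "green--c1", status: "PASS", judgeScore: 0.4 },
	{ nodeId: "green--c2", status: "FAIL", judgeScore: 0.9 },
	{ nodeId: "green--c3", status: "PASS", judgeScore: 0.7 }
]);
// With no scores at all, every candidate counts as 0 and the first passing one is kept.
const tie = selectBestCandidate([
	{ nodeId: "green--c1", status: "PASS", judgeScore: null },
	{ nodeId: "green--c2", status: "PASS", judgeScore: null }
]);
console.log(plain, outOfTen, negative, missing, winner.nodeId, tie.nodeId);
```

Because the regex takes the first number it finds, a judge reply that ignores the "ONLY a number" instruction (for example a score out of 10) is silently clamped to 1 rather than rejected.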

package/dist/schedule/passes/candidates.js ADDED

@@ -0,0 +1,42 @@
+//#region src/schedule/passes/candidates.ts
+/**
+* PIPE-83.7: best-of-N candidate generation. When config.best_of_n is enabled
+* with n > 1, each agent node whose id carries a configured category (e.g.
+* "green") is expanded into a kind:parallel node holding N candidate children
+* (each a full copy with a fresh id and no inter-candidate deps). The wrapper
+* keeps the original id + upstream needs, so downstream consumers and the
+* PIPE-83.9 selector see a single dependency. Default off / n=1 is identity, so
+* generated schedules and the PIPE-57 goldens are unchanged.
+*/
+function expandBestOfNCandidates(config, artifact) {
+	const bestOfN = config.best_of_n;
+	if (!bestOfN?.enabled || bestOfN.n <= 1) return artifact;
+	return {
+		...artifact,
+		workflows: Object.fromEntries(Object.entries(artifact.workflows).map(([id, workflow]) => [id, {
+			...workflow,
+			nodes: workflow.nodes.flatMap((node) => expandNode(node, bestOfN.categories, bestOfN.n))
+		}]))
+	};
+}
+function expandNode(node, categories, n) {
+	if (node.kind !== "agent" || !categories.some((category) => node.id.includes(category))) return [node];
+	const candidatesId = `${node.id}--candidates`;
+	return [{
+		id: candidatesId,
+		kind: "parallel",
+		nodes: Array.from({ length: n }, (_, index) => ({
+			...node,
+			id: `${node.id}--c${index + 1}`,
+			needs: []
+		})),
+		...node.needs ? { needs: node.needs } : {}
+	}, {
+		builtin: "select-candidate",
+		id: node.id,
+		kind: "builtin",
+		needs: [candidatesId]
+	}];
+}
+//#endregion
+export { expandBestOfNCandidates };
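
The expansion described in the doc comment can be illustrated by running `expandNode` (copied from the diff above) on a hypothetical agent node. The node ids, the `profile` field and the category list are assumptions for illustration. In the code as shown, the parallel wrapper takes the `--candidates` id and the upstream `needs`, while the `select-candidate` builtin keeps the original id.

```javascript
// Copied from the diff above.
function expandNode(node, categories, n) {
	if (node.kind !== "agent" || !categories.some((category) => node.id.includes(category))) return [node];
	const candidatesId = `${node.id}--candidates`;
	return [{
		id: candidatesId,
		kind: "parallel",
		nodes: Array.from({ length: n }, (_, index) => ({
			...node,
			id: `${node.id}--c${index + 1}`,
			needs: []
		})),
		...node.needs ? { needs: node.needs } : {}
	}, {
		builtin: "select-candidate",
		id: node.id,
		kind: "builtin",
		needs: [candidatesId]
	}];
}

// Hypothetical nodes for illustration.
const green = { id: "green-impl", kind: "agent", needs: ["red-tests"], profile: "coder" };
const review = { id: "review", kind: "agent", needs: ["green-impl"] };

const expanded = expandNode(green, ["green"], 2);
const untouched = expandNode(review, ["green"], 2);
console.log(JSON.stringify(expanded, null, 2));
```

Downstream nodes that need `green-impl` now depend on the selector's output, so they need no rewriting. Category matching is a substring test on the node id, so an id such as `evergreen-docs` would also be expanded under the `green` category.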

package/dist/schedule/passes/index.js CHANGED

@@ -1,6 +1,7 @@
 //#region src/schedule/passes/index.ts
 const SCHEDULE_PASS_ORDER = [
 	"coverage",
+	"candidates",
 	"models",
 	"ids",
 	"references"

package/dist/token-estimator.js CHANGED

@@ -1,13 +1,14 @@
 import { getEncoding } from "js-tiktoken";
 //#region src/token-estimator.ts
 /**
-* Token estimation for node sizing. Uses the `o200k_base` encoding (the GPT-5.5
-* family the MoKa agents run on).
+* Token estimation for node sizing. Uses the `o200k_base` BPE as a
+* model-agnostic heuristic — NOT a guarantee of any specific model's tokenizer.
 *
 * This is a cross-model ESTIMATE, not a billing-accurate count: the pipeline
-* routes nodes across OpenAI/Kimi/Qwen models whose tokenizers differ, so the
-* value is a sizing heuristic for budget/routing decisions. For exact counts on
-* Anthropic runners, use the Anthropic `count_tokens` API instead.
+* routes nodes across OpenAI/Kimi/Qwen models whose exact tokenizers differ (and
+* are not all known here), so treat the value as a sizing heuristic for
+* budget/routing decisions only. For exact counts on Anthropic runners, use the
+* Anthropic `count_tokens` API.
 */
 let encoder;
 function encoding() {

package/package.json CHANGED

@@ -121,7 +121,7 @@
     "prepack": "bun run build:cli"
   },
   "type": "module",
-  "version": "2.4.0",
+  "version": "2.6.0",
   "description": "Config-driven multi-agent pipeline runner for repository work",
   "main": "./dist/index.js",
   "types": "./dist/index.d.ts",