npm - sneakoscope - Versions diffs - 2.0.12 → 2.0.14 - Mend

sneakoscope 2.0.12 → 2.0.14

Files changed (86)

package/README.md CHANGED

@@ -16,7 +16,7 @@ Set up this agent project with Sneakoscope Codex. Use [[mandarange/Sneakoscope-C
 ## Current Release
-SKS **2.0.12** is the public-ready parallel runtime stabilization release. It closes release DAG coverage around Zellij slot renderer proof semantics, wires Naruto allocation/rebalance into the production scheduler, keeps pre-run worker smoke opt-in, and requires GPT Final approval before local/worktree candidate output can apply.
+SKS **2.0.14** is the public-ready research runtime stabilization release. It upgrades Research from a quality-contract package into a real stage-aware runtime with parallel source shards, source-ledger merge, claim matrix builder, repository-aware implementation blueprint, Codex/GPT final reviewer, and blackbox gates while preserving the 2.0.13 research artifact contract.
 What changed:
@@ -466,6 +466,8 @@ sks code-structure scan --json
 `sks research` prepares a named genius-lens agent council, requires every agent to run at `xhigh`, records one literal `Eureka!` idea per agent, runs an evidence-bound debate, and creates `research-source-skill.md` as a route-local source collection skill before synthesis. Research is not a code-change route: real runs may write only their own mission artifacts under `.sneakoscope/missions/<id>/`, and source/package/docs/config mutations block the run with `research-code-mutation-blocker.json`. The required Research persona lenses are Einstein Agent, Feynman Agent, Turing Agent, von Neumann Agent, and Skeptic Agent; they are cognitive roles, not impersonations, and `agent-ledger.json` must include `display_name`, `persona`, `persona_boundary`, `reasoning_effort`, falsifiers, cheap probes, and `challenge_or_response`. Normal Research is not a fixed three-cycle flow: it repeats source gathering, Eureka ideas, debate, falsification, and synthesis pressure until every agent records final agreement, or pauses at the explicit max-cycle safety cap with an unpassed gate. `debate-ledger.json` must include `consensus_iterations`, `unanimous_consensus`, and per-agent agreements; `research-gate.json` cannot pass until unanimous consensus is true for all agents. Normal Research is intentionally allowed to take one or two hours when the problem needs it; `--mock` is only for selftests or dry harness checks, and a real run blocks with `research-blocker.json` instead of silently substituting mock output when the Codex execution path is unavailable. 
The source layer contract separates latest papers, official/government or leading-institution sources, standards/primary docs, current news such as BBC/CNN/GDELT-style sources, public discourse such as X/Reddit, developer/practitioner knowledge such as Stack Overflow/GitHub, traditional background sources, and counterevidence/fact-checking; `source-ledger.json` must record layer coverage, source quality, blockers, citations, and cross-layer triangulation. Context7 is optional for `$Research` and only becomes relevant when the research topic specifically depends on package, API, framework, or SDK documentation. Research runs require `research-report.md`, `research-paper.md`, `genius-opinion-summary.md`, `research-source-skill.md`, `source-ledger.json`, `agent-ledger.json`, `debate-ledger.json`, `novelty-ledger.json`, `falsification-ledger.json`, and `research-gate.json` so they stay source-backed, adversarially checked, falsifiable, paper-ready, and clear about every agent lens opinion. `research status` reports source entries, source-layer coverage, triangulation checks, counterevidence, xhigh agent count, Eureka moments, debate exchanges, consensus iterations, unanimous consensus, paper presence/sections, genius-opinion summary coverage, agent findings, and falsification cases alongside the gate.
+In 2.0.14, Research also writes a quality contract and handoff package: `research-quality-contract.json`, parallel `research/cycle-N/source-shards/*.json`, `source-ledger.json`, `claim-evidence-matrix.json`, `source-quality-report.json`, `implementation-blueprint.json`/`.md`, `team-handoff-goal.md`, `experiment-plan.json`/`.md`, `replication-pack.json`, `research-work-graph.json`, `research-final-review.static.json`, `research-final-review.codex.json`, and `research-final-review.json`. The default gate requires 12 total sources, 5 source layers, 2 counterevidence sources, 8 key claims, 6 triangulated claims, 8 blueprint sections, 4 falsification cases, 5 experiment steps, a 2200-word report, and approved static plus Codex/GPT final review before `research-gate.json` can pass. See `docs/research-pipeline.md`, `docs/research-artifacts.md`, and `docs/research-implementation-handoff.md`.
 `sks recallpulse` is the 0.8.0 report-only RecallPulse utility. It writes `recallpulse-decision.json`, `mission-status-ledger.json`, `route-proof-capsule.json`, `evidence-envelope.json`, `recallpulse-governance-report.json`, `recallpulse-task-goal-ledger.json`, and `recallpulse-eval-report.json` for the current mission. RecallPulse does not replace route gates, Honest Mode, DB safety, imagegen evidence, or TriWiki validation; it records cache hits, hydration needs, duplicate suppression, route-governance risks, and final-summary-ready durable status so later releases can promote only measured improvements. Checklist updates are sequential: every `Txxx` row is treated as a child `$Goal` checkpoint, and `sks recallpulse checklist ... --task T001 --apply` refuses out-of-order checks unless explicitly overridden.
 `sks pipeline plan` shows the active route lane, kept/skipped stages, verification commands, and no-unrequested-fallback invariant. The 0.9.0 Decision Lattice augments this planning surface with report-only A*/proof-debt evidence: frontier paths considered, the selected path, and rejected paths with rejection reasons. `sks proof-field scan` remains the lightweight rubric for small changes; risky or broad signals return to the full Team/Honest path, and no speedup claim is valid without replay or eval evidence.
@@ -531,6 +533,8 @@ $DB inspect this migration for destructive risk
 Local model workers are off by default, so SKS stays GPT-only unless you explicitly enable them. Use the Codex App prompt commands:
+![SKS Local LLM mode workflow](docs/sks-local-llm-mode/assets/sks-local-llm-flow.png)
 ```text
 $with-local-llm-on
 $with-local-llm-off

package/crates/sks-core/Cargo.lock CHANGED

@@ -76,7 +76,7 @@ dependencies = [
 [[package]]
 name = "sks-core"
-version = "2.0.12"
+version = "2.0.14"
 dependencies = [
  "serde_json",
 ]

package/crates/sks-core/Cargo.toml CHANGED

@@ -1,6 +1,6 @@
 [package]
 name = "sks-core"
-version = "2.0.12"
+version = "2.0.14"
 edition = "2021"
 [dependencies]

package/crates/sks-core/src/main.rs CHANGED

@@ -4,7 +4,7 @@ use std::io::{self, Read, Seek, SeekFrom};
 fn main() {
     let mut args = std::env::args().skip(1);
     match args.next().as_deref() {
-        Some("--version") => println!("sks-rs 2.0.12"),
+        Some("--version") => println!("sks-rs 2.0.14"),
         Some("compact-info") => {
             let mut input = String::new();
             let _ = io::stdin().read_to_string(&mut input);

package/dist/.sks-build-stamp.json CHANGED

@@ -1,8 +1,8 @@
 {
   "schema": "sks.dist-build-stamp.v1",
   "package_name": "sneakoscope",
-  "package_version": "2.0.12",
-  "source_digest": "6f9f20ed184ebe714b41b9176d8414b7fded251a30591ada3c3bac53e49fcf61",
-  "source_file_count": 2053,
-  "built_at_source_time": 1780833723608
+  "package_version": "2.0.14",
+  "source_digest": "e559602363886e06dc0d079905dacbaddcadccd0617447573f90240aa9b8f2f4",
+  "source_file_count": 2104,
+  "built_at_source_time": 1780852294757
 }

package/dist/bin/sks.js CHANGED

@@ -1,5 +1,5 @@
 #!/usr/bin/env node
-const FAST_PACKAGE_VERSION = '2.0.12';
+const FAST_PACKAGE_VERSION = '2.0.14';
 const args = process.argv.slice(2);
 try {
     if (args[0] === '--agent' && args[1] === 'worker') {

package/dist/core/agents/agent-orchestrator.js CHANGED

@@ -209,7 +209,7 @@ export async function runNativeAgentOrchestrator(opts = {}) {
         microWins: strategyCompiled.gate.micro_wins
     });
     if (opts.narutoWorkGraph?.work_items?.length) {
-        partition = applyNarutoWorkGraphToPartition(partition, opts.narutoWorkGraph, roster, targetActiveSlots);
+        partition = applyNarutoWorkGraphToPartition(partition, opts.narutoWorkGraph, roster, targetActiveSlots, prompt);
         augmentVerificationRollbackDagForNaruto(strategyCompiled.verification_rollback_dag, partition.slices);
     }
     await runAgentJanitor({ missionDir: dir, missionId, projectHash: namespace.root_hash });
@@ -620,7 +620,7 @@ function withFinalGptPatchEnvelopes(results, patchEnvelopes = []) {
         next[0] = { ...next[0], patch_envelopes: patchEnvelopes };
     return next;
 }
-function applyNarutoWorkGraphToPartition(partition, graph, roster, targetActiveSlots) {
+function applyNarutoWorkGraphToPartition(partition, graph, roster, targetActiveSlots, parentPrompt = '') {
     const activeRoster = (Array.isArray(roster?.roster) ? roster.roster : []).slice(0, Math.max(1, targetActiveSlots));
     const activeAgentIds = new Set(activeRoster.map((row) => String(row.id || '')).filter(Boolean));
     const fallbackOwners = activeRoster.length ? activeRoster : [{ id: 'naruto_clone_001', role: 'verifier' }];
@@ -639,6 +639,7 @@ function applyNarutoWorkGraphToPartition(partition, graph, roster, targetActiveS
         const targetPaths = normalizePathList(item.target_paths);
         const verificationNodeId = writePaths.length ? `verify:${sliceId}` : null;
         const rollbackNodeId = writePaths.length ? `rollback:${sliceId}` : null;
+        const parentObjective = normalizeWorkerPromptText(parentPrompt);
         return {
             id: sliceId,
             owner_agent_id: owner,
@@ -669,12 +670,15 @@ function applyNarutoWorkGraphToPartition(partition, graph, roster, targetActiveS
             source_intelligence_refs: sourceIntelligenceRefs,
             goal_mode_ref: goalModeRef,
             strategy_refs: strategyRefs,
+            parent_prompt: parentObjective,
             max_attempts: 1,
             description: [
+                parentObjective ? `Parent Naruto objective:\n${parentObjective}` : null,
                 String(item.title || item.id || 'Naruto work item'),
                 `Naruto owner: ${owner}`,
                 item.allocation_reason ? `Allocation: ${item.allocation_reason}` : null,
-                writePaths.length ? `Write paths: ${writePaths.join(', ')}` : 'Read-only or no-write work item.'
+                writePaths.length ? `Write paths: ${writePaths.join(', ')}` : 'Read-only or no-write work item.',
+                writePaths.length ? null : 'Read-only instruction: inspect the requested files/artifacts and do not run package scripts, build commands, tests, or temp-file-creating checks unless the parent objective explicitly requires them.'
             ].filter(Boolean).join('\n')
         };
     });
@@ -1687,10 +1691,13 @@ function buildDirectSdkWorkerPrompt(slice) {
         '',
         write.length
             ? `Write-capable slice. Return JSON matching ${CODEX_AGENT_WORKER_RESULT_SCHEMA_ID}; include patch_envelopes for write_paths=${JSON.stringify(write)}. Each patch envelope must include schema, source "model_authored", agent_id, session_id, slot_id, generation_index, task_slice_id, lease_id, allowed_paths, operations, and rationale. Each operation must include op, path, search, replace, content, and diff; use empty strings for operation fields that do not apply.`
-            : `Read-only slice. Return JSON matching ${CODEX_AGENT_WORKER_RESULT_SCHEMA_ID}; do not report pre-existing repository dirtiness as changed_files.`,
+            : `Read-only slice. Return JSON matching ${CODEX_AGENT_WORKER_RESULT_SCHEMA_ID}; inspect relevant files/artifacts, do not mutate files, do not create temporary/build outputs, do not run package scripts/build/test commands unless explicitly required, and do not report pre-existing repository dirtiness as changed_files.`,
         'Required JSON fields: status, summary, findings, changed_files, patch_envelopes, verification, rollback_notes, blockers.'
     ].join('\n');
 }
+function normalizeWorkerPromptText(value) {
+    return String(value || '').replace(/\s+/g, ' ').trim().slice(0, 4000);
+}
 function buildDirectNoPatchReason(slice, opts) {
     const writePathCount = sdkWritePaths(slice, opts).length;
     return {
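
The hunks above thread the parent prompt into each Naruto slice description after passing it through `normalizeWorkerPromptText`. The following is a minimal sketch of that behavior, not the package's actual code path: `normalizeWorkerPromptText` is copied from the diff, while `describeSlice` is a hypothetical helper that reproduces only part of the `description` array for illustration.

```javascript
// Copied from the diff: collapse whitespace runs to single spaces,
// trim the ends, then cap the result at 4000 characters.
function normalizeWorkerPromptText(value) {
    return String(value || '').replace(/\s+/g, ' ').trim().slice(0, 4000);
}

// Hypothetical helper (not in the package): mirrors a subset of the
// description lines assembled in applyNarutoWorkGraphToPartition.
function describeSlice(parentPrompt, title, writePaths) {
    const parentObjective = normalizeWorkerPromptText(parentPrompt);
    return [
        // The objective line is omitted entirely when the prompt is empty.
        parentObjective ? `Parent Naruto objective:\n${parentObjective}` : null,
        String(title || 'Naruto work item'),
        writePaths.length
            ? `Write paths: ${writePaths.join(', ')}`
            : 'Read-only or no-write work item.'
    ].filter(Boolean).join('\n');
}

console.log(describeSlice('  Audit\n\tthe   scheduler ', 'Inspect queue', []));
```

Because the default for `parentPrompt` is an empty string, callers that do not pass a prompt should get the same description lines as in 2.0.12, apart from the new read-only instruction line.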

package/dist/core/agents/agent-output-validator.js CHANGED

@@ -33,7 +33,7 @@ export const AGENT_RESULT_RUNTIME_SCHEMA = {
         persona_id: { type: 'string' },
         task_slice_id: { type: 'string' },
         status: { enum: ['done', 'blocked', 'failed'] },
-        backend: { enum: ['fake', 'process', 'codex-sdk', 'zellij', 'ollama'] },
+        backend: { enum: ['fake', 'process', 'codex-sdk', 'python-codex-sdk', 'zellij', 'ollama', 'local-llm'] },
         summary: { type: 'string' },
         findings: { type: 'array', items: { type: 'string' } },
         proposed_changes: { type: 'array', items: { type: 'string' } },

package/dist/core/codex-control/codex-fake-sdk-adapter.js CHANGED

@@ -34,11 +34,11 @@ function fakeStructuredOutput(input) {
             summary: unsafe
                 ? 'Fake Codex SDK GPT final arbiter rejected an unsafe candidate for hermetic verification.'
                 : 'Fake Codex SDK GPT final arbiter approved the candidate for hermetic verification.',
-            gpt_review_findings: unsafe ? [{ severity: 'high', message: 'unsafe candidate rejected' }] : [],
+            gpt_review_findings: unsafe ? [{ id: 'unsafe-candidate', severity: 'high', summary: 'unsafe candidate rejected' }] : [],
             accepted_patch_envelopes: unsafe ? [] : [],
             modified_patch_envelopes: [],
-            rejected_patch_envelopes: unsafe ? [{ reason: 'unsafe candidate' }] : [],
-            required_followup_work: unsafe ? [{ blocker: 'unsafe_candidate_patch' }] : [],
+            rejected_patch_envelopes: unsafe ? [{ id: 'unsafe-candidate', summary: 'unsafe candidate', patch_envelope_json: '{}' }] : [],
+            required_followup_work: unsafe ? [{ id: 'unsafe_candidate_patch', severity: 'high', summary: 'unsafe_candidate_patch' }] : [],
             verification_plan: ['schema validation', 'local collaboration final gate'],
             rollback_notes: [],
             blockers: unsafe ? ['unsafe_candidate_patch'] : [],

package/dist/core/codex-control/codex-sdk-adapter.js CHANGED

@@ -1,5 +1,8 @@
+import path from 'node:path';
+import { appendJsonl } from '../fsx.js';
 import { buildCodexSdkConfig } from './codex-sdk-config-policy.js';
 import { buildCodexSdkEnv } from './codex-sdk-env-policy.js';
+import { translateCodexSdkEvent } from './codex-event-translator.js';
 export async function runRealCodexSdkTask(input, policy) {
     const mod = await import('@openai/codex-sdk');
     const Codex = mod.Codex || mod.default?.Codex || mod.default;
@@ -22,9 +25,15 @@ export async function runRealCodexSdkTask(input, policy) {
     const thread = resumeId ? codex.resumeThread(resumeId, threadOptions) : codex.startThread(threadOptions);
     const events = [];
     let finalResponse = '';
+    let liveEventsWritten = false;
+    const liveEventPath = input.mutationLedgerRoot ? path.join(input.mutationLedgerRoot, 'codex-sdk-events.jsonl') : null;
     const streamed = await thread.runStreamed(buildSdkInput(input), { outputSchema: input.outputSchema });
     for await (const event of streamed.events) {
         events.push(event);
+        if (liveEventPath) {
+            await appendJsonl(liveEventPath, translateCodexSdkEvent(event));
+            liveEventsWritten = true;
+        }
         if (event?.type === 'item.completed' && event?.item?.type === 'agent_message')
             finalResponse = String(event.item.text || '');
     }
@@ -37,6 +46,7 @@ export async function runRealCodexSdkTask(input, policy) {
         finalResponse,
         structuredOutput,
         blockers: [],
+        liveEventsWritten,
         raw: { item_count: events.filter((event) => String(event?.type || '').startsWith('item.')).length }
     };
 }

package/dist/core/codex-control/codex-task-runner.js CHANGED

@@ -67,8 +67,10 @@ export async function runCodexTask(input) {
     }
     const events = Array.isArray(adapterResult?.events) ? adapterResult.events : [];
     const translatedEvents = translateCodexSdkEvents(events);
-    for (const event of translatedEvents)
-        await appendJsonl(path.join(root, 'codex-sdk-events.jsonl'), event);
+    if (adapterResult?.liveEventsWritten !== true) {
+        for (const event of translatedEvents)
+            await appendJsonl(path.join(root, 'codex-sdk-events.jsonl'), event);
+    }
     if (adapterResult?.reliabilityShield)
         await writeJsonAtomic(path.join(root, 'codex-reliability-shield.json'), adapterResult.reliabilityShield);
     const structuredOutput = adapterResult?.structuredOutput;
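
Taken together, the adapter and runner hunks appear to form a write-once handshake: the adapter streams events to `codex-sdk-events.jsonl` as they arrive and reports `liveEventsWritten`, and the runner backfills the file only when that flag is not `true`. The sketch below illustrates the pattern under simplifying assumptions: `runAdapter`, `runTask`, and the in-memory `log` array are hypothetical stand-ins, and no event translation or file I/O is performed.

```javascript
// Hypothetical adapter: writes each event to the live sink when one is
// configured, and records whether any live write happened.
async function runAdapter(events, liveSink) {
    let liveEventsWritten = false;
    for (const event of events) {
        if (liveSink) {
            await liveSink(event);
            liveEventsWritten = true;
        }
    }
    return { events, liveEventsWritten };
}

// Hypothetical runner: backfills the log only if the adapter did not
// already write the events live, so each event is appended once.
async function runTask(events, { live }) {
    const log = []; // in-memory stand-in for the JSONL file
    const sink = async (event) => { log.push(event); };
    const result = await runAdapter(events, live ? sink : null);
    if (result.liveEventsWritten !== true) {
        for (const event of result.events)
            await sink(event);
    }
    return log;
}

runTask([{ type: 'item.started' }, { type: 'item.completed' }], { live: true })
    .then((log) => console.log(log.length));
```

One design consequence worth noting: the flag is set only after the first live write, so a stream with zero events leaves it `false`, and the runner's backfill loop then has nothing to append.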

package/dist/core/codex-control/gpt-final-review-schema.js CHANGED

@@ -1,5 +1,25 @@
 export const GPT_FINAL_ARBITER_RESULT_SCHEMA_ID = 'sks.gpt-final-arbiter-result.v1';
 export const GPT_FINAL_ARBITER_INPUT_SCHEMA = 'sks.gpt-final-arbiter-input.v1';
+const reviewItemSchema = {
+    type: 'object',
+    required: ['id', 'severity', 'summary'],
+    properties: {
+        id: { type: 'string' },
+        severity: { type: 'string', enum: ['low', 'medium', 'high'] },
+        summary: { type: 'string' }
+    },
+    additionalProperties: false
+};
+const patchDecisionSchema = {
+    type: 'object',
+    required: ['id', 'summary', 'patch_envelope_json'],
+    properties: {
+        id: { type: 'string' },
+        summary: { type: 'string' },
+        patch_envelope_json: { type: 'string' }
+    },
+    additionalProperties: false
+};
 export const gptFinalArbiterResultSchema = {
     type: 'object',
     required: [
@@ -17,20 +37,20 @@ export const gptFinalArbiterResultSchema = {
         'confidence'
     ],
     properties: {
-        schema: { const: GPT_FINAL_ARBITER_RESULT_SCHEMA_ID },
+        schema: { type: 'string', enum: [GPT_FINAL_ARBITER_RESULT_SCHEMA_ID] },
         status: { enum: ['approved', 'modified', 'rejected', 'needs_more_work'] },
         summary: { type: 'string' },
-        gpt_review_findings: { type: 'array', items: { type: 'object' } },
-        accepted_patch_envelopes: { type: 'array', items: { type: 'object' } },
-        modified_patch_envelopes: { type: 'array', items: { type: 'object' } },
-        rejected_patch_envelopes: { type: 'array', items: { type: 'object' } },
-        required_followup_work: { type: 'array', items: { type: 'object' } },
+        gpt_review_findings: { type: 'array', items: reviewItemSchema },
+        accepted_patch_envelopes: { type: 'array', items: patchDecisionSchema },
+        modified_patch_envelopes: { type: 'array', items: patchDecisionSchema },
+        rejected_patch_envelopes: { type: 'array', items: patchDecisionSchema },
+        required_followup_work: { type: 'array', items: reviewItemSchema },
         verification_plan: { type: 'array', items: { type: 'string' } },
         rollback_notes: { type: 'array', items: { type: 'string' } },
         blockers: { type: 'array', items: { type: 'string' } },
         confidence: { enum: ['low', 'medium', 'high'] }
     },
-    additionalProperties: true
+    additionalProperties: false
 };
 export function normalizeGptFinalArbiterResult(value) {
     const status = normalizeStatus(value?.status);
@@ -38,11 +58,11 @@ export function normalizeGptFinalArbiterResult(value) {
         schema: GPT_FINAL_ARBITER_RESULT_SCHEMA_ID,
         status,
         summary: String(value?.summary || defaultSummary(status)),
-        gpt_review_findings: array(value?.gpt_review_findings),
-        accepted_patch_envelopes: array(value?.accepted_patch_envelopes),
-        modified_patch_envelopes: array(value?.modified_patch_envelopes),
-        rejected_patch_envelopes: array(value?.rejected_patch_envelopes),
-        required_followup_work: array(value?.required_followup_work),
+        gpt_review_findings: reviewItems(value?.gpt_review_findings),
+        accepted_patch_envelopes: patchDecisionItems(value?.accepted_patch_envelopes),
+        modified_patch_envelopes: patchDecisionItems(value?.modified_patch_envelopes),
+        rejected_patch_envelopes: patchDecisionItems(value?.rejected_patch_envelopes),
+        required_followup_work: reviewItems(value?.required_followup_work),
         verification_plan: stringArray(value?.verification_plan),
         rollback_notes: stringArray(value?.rollback_notes),
         blockers: stringArray(value?.blockers),
@@ -57,12 +77,39 @@ function normalizeStatus(value) {
 function normalizeConfidence(value) {
     return value === 'low' || value === 'medium' || value === 'high' ? value : 'medium';
 }
-function array(value) {
-    return Array.isArray(value) ? value : [];
+function reviewItems(value) {
+    if (!Array.isArray(value))
+        return [];
+    return value.map((entry, index) => {
+        const raw = typeof entry === 'object' && entry !== null ? entry : { summary: entry };
+        return {
+            id: String(raw.id || raw.blocker || raw.reason || `item-${index + 1}`),
+            severity: normalizeSeverity(raw.severity),
+            summary: String(raw.summary || raw.message || raw.blocker || raw.reason || entry || '').trim()
+        };
+    }).filter((entry) => entry.summary);
+}
+function patchDecisionItems(value) {
+    if (!Array.isArray(value))
+        return [];
+    return value.map((entry, index) => {
+        const raw = typeof entry === 'object' && entry !== null ? entry : { summary: entry };
+        const patch = typeof raw.patch_envelope_json === 'string'
+            ? raw.patch_envelope_json
+            : JSON.stringify(entry ?? {});
+        return {
+            id: String(raw.id || raw.schema || raw.reason || `patch-${index + 1}`),
+            summary: String(raw.summary || raw.reason || raw.rationale || raw.schema || entry || '').trim() || `Patch decision ${index + 1}`,
+            patch_envelope_json: patch
+        };
+    });
 }
 function stringArray(value) {
     return Array.isArray(value) ? value.map((entry) => String(entry || '').trim()).filter(Boolean) : [];
 }
+function normalizeSeverity(value) {
+    return value === 'low' || value === 'medium' || value === 'high' ? value : 'medium';
+}
 function defaultSummary(status) {
     return status === 'approved' || status === 'modified'
         ? 'GPT final arbiter accepted the candidate result.'
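
With `additionalProperties: false` and typed item schemas, older free-form entries such as `{ severity, message }` or `{ blocker }` no longer validate as-is, so the new `reviewItems` normalizer maps them onto the `id`/`severity`/`summary` shape. The sketch below copies `reviewItems` and `normalizeSeverity` from the diff and runs them on sample inputs; the `legacy` array is illustrative data modeled on the 2.0.12 fake-adapter output, not taken from a real run.

```javascript
// Copied from the diff: unknown or missing severities fall back to 'medium'.
function normalizeSeverity(value) {
    return value === 'low' || value === 'medium' || value === 'high' ? value : 'medium';
}

// Copied from the diff: coerce each entry into { id, severity, summary }
// and drop entries whose summary ends up empty.
function reviewItems(value) {
    if (!Array.isArray(value))
        return [];
    return value.map((entry, index) => {
        const raw = typeof entry === 'object' && entry !== null ? entry : { summary: entry };
        return {
            id: String(raw.id || raw.blocker || raw.reason || `item-${index + 1}`),
            severity: normalizeSeverity(raw.severity),
            summary: String(raw.summary || raw.message || raw.blocker || raw.reason || entry || '').trim()
        };
    }).filter((entry) => entry.summary);
}

// Illustrative inputs in the older shapes, plus a bare string and a null.
const legacy = [
    { severity: 'high', message: 'unsafe candidate rejected' },
    { blocker: 'unsafe_candidate_patch' },
    'plain string finding',
    null
];

console.log(reviewItems(legacy));
```

The fallback `id` is derived from the entry's position in the input array, so the bare string above becomes `item-3` even though the `null` entry after it is filtered out.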

package/dist/core/commands/naruto-command.js CHANGED

@@ -444,6 +444,7 @@ async function runNarutoControlPlaneSmoke(input) {
             item,
             placement,
             backend: 'fake',
+            parentPrompt: input.prompt,
             worktreePolicy: smokeWorktreePolicy,
             zellijSessionName: `sks-${input.missionId}`,
             visiblePaneCap: input.zellijVisiblePanes

package/dist/core/commands/research-command.js CHANGED

@@ -14,6 +14,15 @@ import { scanDbSafety } from '../db-safety.js';
 import { maybeFinalizeRoute } from '../proof/auto-finalize.js';
 import { runNativeAgentOrchestrator } from '../agents/agent-orchestrator.js';
 import { flag, positionalArgs, readFlagValue, readMaxCycles, readBoundedIntegerFlag, resolveMissionId, safeReadTextFile } from './command-utils.js';
+import { writeResearchWorkGraph } from '../research/research-work-graph.js';
+import { runResearchCycle } from '../research/research-cycle-runner.js';
+import { readResearchQualityContract } from '../research/research-quality-contract.js';
+import { readClaimEvidenceMatrix } from '../research/claim-evidence-matrix.js';
+import { readSourceQualityReport } from '../research/source-quality-report.js';
+import { readImplementationBlueprint, validateImplementationBlueprint } from '../research/implementation-blueprint.js';
+import { readExperimentPlan, validateExperimentPlan } from '../research/experiment-plan.js';
+import { readReplicationPack, validateReplicationPack } from '../research/replication-pack.js';
+import { readResearchFinalReview } from '../research/research-final-reviewer.js';
 const RESEARCH_DEFAULT_MAX_CYCLES = 12;
 const RESEARCH_DEFAULT_CYCLE_TIMEOUT_MINUTES = 120;
 const RESEARCH_MIN_CYCLE_TIMEOUT_MINUTES = 15;
@@ -128,30 +137,75 @@ async function researchRun(args) {
     const dryRunPatches = flag(args, '--dry-run-patches') || flag(args, '--dryrun-patches');
     const maxWriteAgents = readBoundedIntegerFlag(args, '--max-write-agents', Math.min(requestedAgents, 5), 1, 20);
     const mock = flag(args, '--mock');
+    const researchWorkGraph = await writeResearchWorkGraph(dir, plan);
+    const graphWorkItemCount = Math.max(1, Number(researchWorkGraph.total_work_items || researchWorkGraph.work_items?.length || 0));
+    await runResearchCycle(dir, researchWorkGraph, { cycle: 0, status: mock ? 'mock_native_orchestrator_planned' : 'native_orchestrator_planned' });
     await setCurrent(root, { mission_id: id, mode: 'RESEARCH', phase: 'RESEARCH_RUNNING_NO_QUESTIONS', questions_allowed: false, implementation_allowed: false, research_real_run_required: !mock, research_cycle_timeout_minutes: cycleTimeoutMinutes });
     await appendJsonlBounded(path.join(dir, 'events.jsonl'), { ts: nowIso(), type: 'research.run.started', maxCycles, mock, cycleTimeoutMinutes, real_run_required: !mock });
-    const nativeAgentRun = await runNativeAgentOrchestrator({ root, missionId: id, route: flag(args, '--autoresearch') ? '$AutoResearch' : '$Research', prompt: mission.prompt || plan.prompt || 'Research run', backend: mock ? 'fake' : 'codex-sdk', mock, agents: requestedAgents, targetActiveSlots, desiredWorkItemCount, minimumWorkItems, maxQueueExpansion, concurrency: Math.min(requestedAgents, 5), readonly: !(applyPatches && writeMode !== 'off'), profile, writeMode: writeMode, applyPatches, dryRunPatches, maxWriteAgents, roster: plan.native_agent_plan, routeCommand: 'sks research run', routeBlackboxKind: 'actual_research_command' });
+    const nativeAgentRun = await runNativeAgentOrchestrator({ root, missionId: id, route: flag(args, '--autoresearch') ? '$AutoResearch' : '$Research', prompt: mission.prompt || plan.prompt || 'Research run', backend: mock ? 'fake' : 'codex-sdk', mock, agents: requestedAgents, targetActiveSlots, desiredWorkItemCount: Math.max(desiredWorkItemCount, graphWorkItemCount), minimumWorkItems: Math.max(minimumWorkItems, Math.min(graphWorkItemCount, targetActiveSlots)), maxQueueExpansion, concurrency: Math.min(requestedAgents, 5), readonly: true, profile, writeMode: writeMode, applyPatches: false, dryRunPatches, maxWriteAgents, roster: plan.native_agent_plan, routeCommand: 'sks research run', routeBlackboxKind: 'actual_research_command', narutoWorkGraph: researchWorkGraph });
     await writeJsonAtomic(path.join(dir, 'research-native-agent-run.json'), nativeAgentRun);
     await appendJsonlBounded(path.join(dir, 'events.jsonl'), { ts: nowIso(), type: 'research.native_agents.completed', backend: nativeAgentRun.backend, ok: nativeAgentRun.ok, proof: nativeAgentRun.proof?.status });
-    if (mock) {
-        let gate = await writeMockResearchResult(dir, plan);
-        const nativeGate = { ...(gate.gate || gate), native_agent_proof: nativeAgentRun.proof?.ok === true, agent_central_ledger: true };
-        await writeJsonAtomic(path.join(dir, 'research-gate.json'), nativeGate);
-        gate = { ...gate, gate: nativeGate, passed: nativeGate.passed };
-        const proof = await maybeFinalizeRoute(root, { missionId: id, route: '$Research', gateFile: 'research-gate.json', gate: gate.gate || gate, artifacts: ['agents/agent-proof-evidence.json', 'research-native-agent-run.json', 'research-gate.json', 'research-report.md', researchPaperArtifactForPlan(plan), 'source-ledger.json', 'agent-ledger.json', 'debate-ledger.json', 'completion-proof.json'], mock, command: { cmd: `sks research run ${id} --mock`, status: 0 } });
-        await setCurrent(root, { mission_id: id, mode: 'RESEARCH', phase: gate.passed ? 'RESEARCH_DONE' : 'RESEARCH_PAUSED', questions_allowed: true, implementation_allowed: false });
-        if (flag(args, '--json'))
-            return console.log(JSON.stringify({ schema: flag(args, '--autoresearch') ? 'sks.autoresearch-run.v1' : 'sks.research-run.v1', ok: proof.ok, mission_id: id, gate, proof: proof.validation, native_agent_run: nativeAgentRun, agent_batches: plan.agent_batches, autoresearch_cycle_policy: plan.autoresearch_cycle_policy }, null, 2));
-        console.log(`Mock research done: ${id}`);
-        console.log(`Gate: ${gate.passed ? 'passed' : 'blocked'}`);
-        return;
-    }
     if (!nativeAgentRun.ok) {
         await maybeFinalizeRoute(root, { missionId: id, route: '$Research', gateFile: 'research-gate.json', gate: await readJson(path.join(dir, 'research-gate.json'), null), artifacts: ['agents/agent-proof-evidence.json', 'research-native-agent-run.json', 'completion-proof.json'], statusHint: 'blocked', blockers: nativeAgentRun.proof?.blockers || ['native_agent_backend_blocked'], command: { cmd: `sks research run ${id}`, status: 2 } });
         await setCurrent(root, { mission_id: id, mode: 'RESEARCH', phase: 'RESEARCH_BLOCKED_NATIVE_AGENTS', questions_allowed: true, implementation_allowed: false, blocker: 'agents/agent-proof-evidence.json' });
         process.exitCode = 2;
         return;
     }
+    const legacyResearchCycle = flag(args, '--legacy-research-cycle') || process.env.SKS_RESEARCH_LEGACY_CYCLE === '1';
+    const sourceMutationBaseline = await researchCodeMutationSnapshot(root, id);
+    if (!legacyResearchCycle) {
+        const cycleResult = await runResearchCycle({
+            root,
+            dir,
+            plan,
+            graph: researchWorkGraph,
+            cycle: 1,
+            backend: mock ? 'mock' : 'codex-sdk',
+            timeoutMs: cycleTimeoutMs,
+            maxParallelStages: readBoundedIntegerFlag(args, '--research-stage-parallelism', 4, 1, 16),
+            mock
+        });
+        const mutation = await researchCodeMutationDelta(root, sourceMutationBaseline, id);
+        if (mutation.blocked) {
+            const blocker = {
+                schema_version: 1,
+                mission_id: id,
+                ts: nowIso(),
+                phase: 'RESEARCH_BLOCKED_CODE_MUTATION',
+                reason: 'Research mode must not modify repository source files. Only route-local mission artifacts are allowed.',
+                changed_paths: mutation.changed_paths,
+                allowed_prefixes: mutation.allowed_prefixes,
+                implementation_allowed: false
+            };
+            await writeJsonAtomic(path.join(dir, 'research-code-mutation-blocker.json'), blocker);
+            await maybeFinalizeRoute(root, { missionId: id, route: '$Research', gateFile: 'research-gate.json', gate: await readJson(path.join(dir, 'research-gate.json'), null), artifacts: ['research-code-mutation-blocker.json', 'completion-proof.json'], statusHint: 'blocked', blockers: ['research_code_mutation_detected'], command: { cmd: `sks research run ${id}`, status: 2 } });
+            await setCurrent(root, { mission_id: id, mode: 'RESEARCH', phase: 'RESEARCH_BLOCKED_CODE_MUTATION', questions_allowed: true, implementation_allowed: false, blocker: 'research-code-mutation-blocker.json' });
+            process.exitCode = 2;
+            return;
+        }
+        const gate = await evaluateResearchGate(dir);
+        const passed = cycleResult.status === 'passed' && gate.passed === true;
+        const proof = await maybeFinalizeRoute(root, {
+            missionId: id,
+            route: '$Research',
+            gateFile: 'research-gate.json',
+            gate: gate.gate || gate,
+            artifacts: ['agents/agent-proof-evidence.json', 'research-native-agent-run.json', 'research-cycle-runner.json', 'research-gate.json', 'research-report.md', researchPaperArtifactForPlan(plan), 'source-ledger.json', 'claim-evidence-matrix.json', 'implementation-blueprint.json', 'team-handoff-goal.md', 'completion-proof.json'],
+            statusHint: passed ? undefined : 'blocked',
+            blockers: passed ? [] : [...(cycleResult.blockers || []), ...(gate.reasons || [])],
+            mock,
+            command: { cmd: `sks research run ${id}${mock ? ' --mock' : ''}`, status: passed ? 0 : 2 }
+        });
+        await setCurrent(root, { mission_id: id, mode: 'RESEARCH', phase: passed ? 'RESEARCH_DONE' : 'RESEARCH_BLOCKED_STAGE_CYCLE', questions_allowed: true, implementation_allowed: false });
+        await appendJsonlBounded(path.join(dir, 'events.jsonl'), { ts: nowIso(), type: passed ? 'research.done' : 'research.stage_cycle.blocked', cycle: 1, cycle_status: cycleResult.status });
+        await enforceRetention(root).catch(() => { });
+        if (flag(args, '--json'))
+            return console.log(JSON.stringify({ schema: flag(args, '--autoresearch') ? 'sks.autoresearch-run.v1' : 'sks.research-run.v1', ok: proof.ok && passed, mission_id: id, gate, quality_metrics: gate.metrics || null, proof: proof.validation, native_agent_run: nativeAgentRun, research_work_graph: researchWorkGraph, research_cycle: cycleResult, agent_batches: plan.agent_batches, autoresearch_cycle_policy: plan.autoresearch_cycle_policy }, null, 2));
+        printResearchCompletion(id, root, dir, plan, gate);
+        if (!passed)
+            process.exitCode = 2;
+        return;
+    }
     const codex = await getCodexInfo();
     if (!codex.bin) {
         const blocker = {
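Note on the hunk above: the added stage-cycle branch marks a run as passed only when the cycle runner reports `passed` and the gate's `passed` field is strictly `true`. Otherwise it merges the cycle blockers with the gate reasons. A minimal sketch of that decision follows, assuming plain objects as inputs. `summarizeStageCycle` and the reason string `unanimous_consensus_missing` are hypothetical names used for illustration and are not part of the package.

```javascript
// Hypothetical helper: mirrors the pass/blocked decision in the added branch.
function summarizeStageCycle(cycleResult, gate) {
    // Both signals must agree; a merely truthy gate.passed does not count.
    const passed = cycleResult.status === 'passed' && gate.passed === true;
    return {
        passed,
        phase: passed ? 'RESEARCH_DONE' : 'RESEARCH_BLOCKED_STAGE_CYCLE',
        exitCode: passed ? 0 : 2,
        // Cycle blockers come first, then gate reasons; missing arrays count as empty.
        blockers: passed ? [] : [...(cycleResult.blockers || []), ...(gate.reasons || [])]
    };
}

// Illustrative input: the cycle passed but the gate did not.
const blocked = summarizeStageCycle(
    { status: 'passed' },
    { passed: false, reasons: ['unanimous_consensus_missing'] }
);
console.log(blocked.phase, blocked.exitCode, blocked.blockers);
```

In the diff itself the same values feed `maybeFinalizeRoute`, `setCurrent`, and `process.exitCode`. The sketch only isolates the boolean logic.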
@@ -173,11 +227,10 @@ async function researchRun(args) {
     }
     let last = '';
     const researchCodexArgs = ['-c', 'service_tier="fast"', '-c', 'model_reasoning_effort="xhigh"'];
-    const sourceMutationBaseline = await researchCodeMutationSnapshot(root, id);
     for (let cycle = 1; cycle <= maxCycles; cycle += 1) {
         const cycleDir = path.join(dir, 'research', `cycle-${cycle}`);
         const outputFile = path.join(cycleDir, 'final.md');
-        await appendJsonlBounded(path.join(dir, 'events.jsonl'), { ts: nowIso(), type: 'research.cycle.start', cycle, timeoutMinutes: cycleTimeoutMinutes, profile, enforced_reasoning_effort: 'xhigh' });
+        await appendJsonlBounded(path.join(dir, 'events.jsonl'), { ts: nowIso(), type: 'research.legacy_cycle.start', cycle, timeoutMinutes: cycleTimeoutMinutes, profile, enforced_reasoning_effort: 'xhigh', legacy_final_md_loop: true });
         const prompt = buildResearchPrompt({ id, mission, plan, cycle, previous: last });
         const result = await runCodexExec({ root, prompt, outputFile, json: true, profile, extraArgs: researchCodexArgs, logDir: cycleDir, timeoutMs: cycleTimeoutMs });
         await writeJsonAtomic(path.join(cycleDir, 'process.json'), { code: result.code, stdout_tail: result.stdout, stderr_tail: result.stderr, stdout_bytes: result.stdoutBytes, stderr_bytes: result.stderrBytes, truncated: result.truncated, timed_out: result.timedOut });
@@ -212,8 +265,8 @@ async function researchRun(args) {
             await appendJsonlBounded(path.join(dir, 'events.jsonl'), { ts: nowIso(), type: 'research.done', cycle });
             await enforceRetention(root).catch(() => { });
             if (flag(args, '--json'))
-                return console.log(JSON.stringify({ schema: flag(args, '--autoresearch') ? 'sks.autoresearch-run.v1' : 'sks.research-run.v1', ok: proof.ok, mission_id: id, gate, proof: proof.validation, agent_batches: plan.agent_batches, autoresearch_cycle_policy: plan.autoresearch_cycle_policy }, null, 2));
-            console.log(`Research done: ${id}`);
+                return console.log(JSON.stringify({ schema: flag(args, '--autoresearch') ? 'sks.autoresearch-run.v1' : 'sks.research-run.v1', ok: proof.ok, mission_id: id, gate, quality_metrics: gate.metrics || null, proof: proof.validation, research_work_graph: researchWorkGraph, agent_batches: plan.agent_batches, autoresearch_cycle_policy: plan.autoresearch_cycle_policy }, null, 2));
+            printResearchCompletion(id, root, dir, plan, gate);
             return;
         }
     }
@@ -222,6 +275,20 @@ async function researchRun(args) {
     await setCurrent(root, { mission_id: id, mode: 'RESEARCH', phase: 'RESEARCH_PAUSED_MAX_CYCLES', questions_allowed: true, implementation_allowed: false });
     console.log(`Research paused after max cycles without unanimous agent consensus: ${id}`);
 }
+function printResearchCompletion(id, root, dir, plan, gate) {
+    const metrics = gate?.metrics || {};
+    const rel = (artifact) => path.relative(root, path.join(dir, artifact));
+    console.log(`Research done: ${id}`);
+    console.log(`Report: ${rel('research-report.md')}`);
+    console.log(`Paper: ${rel(researchPaperArtifactForPlan(plan))}`);
+    console.log(`Implementation blueprint: ${rel('implementation-blueprint.json')}`);
+    console.log(`Claim-evidence matrix: ${rel('claim-evidence-matrix.json')}`);
+    console.log(`Experiment plan: ${rel('experiment-plan.json')}`);
+    console.log(`Replication pack: ${rel('replication-pack.json')}`);
+    console.log(`Gate: ${gate?.passed ? 'passed' : 'blocked'}`);
+    console.log(`Quality: ${metrics.source_entries_total_with_counterevidence ?? metrics.source_entries ?? 0} sources / ${metrics.source_layers_covered ?? 0} layers / ${metrics.key_claims ?? 0} key claims / ${metrics.falsification_cases ?? 0} falsification cases`);
+    console.log(`Handoff: ${rel('team-handoff-goal.md')}`);
+}
 async function researchStatus(args) {
     const root = await sksRoot();
     const id = await resolveMissionId(root, args[0]);
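Note on `printResearchCompletion` in the hunk above: the `Quality:` line uses the nullish coalescing operator, so a metric falls back only when it is `null` or `undefined`. A recorded `0` is kept as `0`. The sketch below reproduces only that template string. `qualityLine` is a hypothetical name and the metric values are made up for illustration.

```javascript
// Hypothetical helper reproducing the fallback chain of the "Quality:" line.
function qualityLine(metrics) {
    const m = metrics || {};
    // ?? falls through only on null/undefined, so an explicit 0 is preserved.
    const sources = m.source_entries_total_with_counterevidence ?? m.source_entries ?? 0;
    return `Quality: ${sources} sources / ${m.source_layers_covered ?? 0} layers / ${m.key_claims ?? 0} key claims / ${m.falsification_cases ?? 0} falsification cases`;
}

// Falls back to source_entries when the combined count is absent.
console.log(qualityLine({ source_entries: 7, key_claims: 3 }));
// Keeps the explicit 0 instead of falling back to source_entries.
console.log(qualityLine({ source_entries_total_with_counterevidence: 0, source_entries: 7 }));
```

With `||` instead of `??`, the second call would report 7 sources, which would hide a combined count that was genuinely zero.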
@@ -246,6 +313,16 @@ async function researchStatus(args) {
     const agentRows = Array.isArray(agentLedger?.agents) ? agentLedger.agents : [];
     const sourceLayerRows = Array.isArray(sourceLedger?.source_layers) ? sourceLedger.source_layers : [];
     const sourceLayersCovered = sourceLayerRows.filter((layer) => layer.status === 'covered' && ((Array.isArray(layer.source_ids) && layer.source_ids.length) || (Array.isArray(layer.counterevidence_ids) && layer.counterevidence_ids.length))).length;
+    const qualityContract = await readResearchQualityContract(dir);
+    const claimMatrix = await readClaimEvidenceMatrix(dir);
+    const sourceQualityReport = await readSourceQualityReport(dir);
+    const implementationBlueprint = await readImplementationBlueprint(dir);
+    const experimentPlan = await readExperimentPlan(dir);
+    const replicationPack = await readReplicationPack(dir);
+    const finalReview = await readResearchFinalReview(dir);
+    const blueprintValidation = validateImplementationBlueprint(implementationBlueprint, qualityContract);
+    const experimentValidation = validateExperimentPlan(experimentPlan, qualityContract);
+    const replicationValidation = validateReplicationPack(replicationPack);
     console.log(JSON.stringify({
         mission,
         state,
@@ -275,7 +352,23 @@ async function researchStatus(args) {
         research_paper_artifact: paperArtifact.name,
         paper_present: Boolean(paperText.trim()),
         paper_sections: countResearchPaperSections(paperText),
-        falsification_cases: falsificationLedger?.cases?.length ?? null
+        falsification_cases: falsificationLedger?.cases?.length ?? null,
+        research_quality: {
+            contract: qualityContract,
+            report_word_count: gate?.metrics?.report_word_count ?? null,
+            claim_evidence_matrix_present: claimMatrix.present,
+            key_claims: claimMatrix.key_claim_ids.length,
+            triangulated_claims: claimMatrix.triangulated_claim_count,
+            claim_matrix_blockers: claimMatrix.blockers,
+            source_quality_report_ok: sourceQualityReport?.ok === true,
+            implementation_blueprint_sections: Array.isArray(implementationBlueprint?.sections) ? implementationBlueprint.sections.length : null,
+            implementation_blueprint_ok: blueprintValidation.ok,
+            experiment_steps: Array.isArray(experimentPlan?.steps) ? experimentPlan.steps.length : null,
+            experiment_plan_ok: experimentValidation.ok,
+            replication_pack_ok: replicationValidation.ok,
+            final_review_approved: finalReview?.approved === true,
+            final_review_blockers: finalReview?.blockers || []
+        }
     }, null, 2));
 }
 async function researchCodeMutationSnapshot(root, missionId = null) {
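Note on the mutation guard: the bodies of `researchCodeMutationSnapshot` and `researchCodeMutationDelta` are not shown in this diff. The diff shows only that the delta exposes `blocked`, `changed_paths`, and `allowed_prefixes`. The sketch below is one plausible way to compute such a delta. It assumes each snapshot is a `Map` from relative path to content hash. `mutationDelta`, the snapshot shape, and the sample paths are all assumptions and do not describe the package's actual implementation.

```javascript
// Hypothetical sketch: snapshots are assumed to be Map<relativePath, contentHash>.
function mutationDelta(before, after, allowedPrefixes) {
    const paths = new Set([...before.keys(), ...after.keys()]);
    const changed = [...paths]
        // A path counts as changed when it was added, removed, or its hash differs.
        .filter((p) => before.get(p) !== after.get(p))
        // Changes under an allowed prefix (route-local mission artifacts) are ignored.
        .filter((p) => !allowedPrefixes.some((prefix) => p.startsWith(prefix)))
        .sort();
    return { blocked: changed.length > 0, changed_paths: changed, allowed_prefixes: allowedPrefixes };
}

const before = new Map([['src/index.js', 'aaa'], ['README.md', 'bbb']]);
const after = new Map([
    ['src/index.js', 'ccc'],
    ['README.md', 'bbb'],
    ['.sneakoscope/missions/M-1/research-report.md', 'ddd']
]);
const delta = mutationDelta(before, after, ['.sneakoscope/missions/M-1/']);
console.log(delta.blocked, delta.changed_paths);
```

Taking the baseline before the cycle and diffing afterwards matches the order in the hunk, where `sourceMutationBaseline` is now captured before `runResearchCycle` is called.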

package/dist/core/fsx.js CHANGED

@@ -5,7 +5,7 @@ import os from 'node:os';
 import crypto from 'node:crypto';
 import { spawn } from 'node:child_process';
 import { fileURLToPath } from 'node:url';
-export const PACKAGE_VERSION = '2.0.12';
+export const PACKAGE_VERSION = '2.0.14';
 export const DEFAULT_PROCESS_TAIL_BYTES = 256 * 1024;
 export const DEFAULT_PROCESS_TIMEOUT_MS = 30 * 60 * 1000;
 export function nowIso() {

package/dist/core/naruto/naruto-real-worker-child.js CHANGED

@@ -32,7 +32,7 @@ async function main() {
             generationIndex: 1,
             sessionId: String(intake.item.id || ''),
             cwd: String(intake.worktree_path || process.cwd()),
-            prompt: buildNarutoWorkerPrompt(intake.item),
+            prompt: buildNarutoWorkerPrompt(intake.item, intake.parent_prompt),
             outputSchemaId: CODEX_AGENT_WORKER_RESULT_SCHEMA_ID,
             outputSchema: codexAgentWorkerResultSchema,
             sandboxPolicy: intake.item.write_allowed === true ? 'workspace-write' : 'read-only',
@@ -109,10 +109,12 @@ function backendPreference(value) {
         return ['local-llm', 'codex-sdk'];
     return ['codex-sdk'];
 }
-function buildNarutoWorkerPrompt(item) {
+function buildNarutoWorkerPrompt(item, parentPrompt) {
     const writeAllowed = item?.write_allowed === true;
+    const parentObjective = normalizeWorkerPromptText(parentPrompt);
     return [
         'You are a Naruto route worker. Complete only this assigned work item and return JSON matching the required schema.',
+        parentObjective ? `Parent Naruto objective:\n${parentObjective}` : null,
         `Work item: ${String(item?.id || '')} ${String(item?.title || item?.kind || '')}`,
         `Role: ${String(item?.required_role || 'worker')}`,
         `Kind: ${String(item?.kind || 'verification')}`,
@@ -122,8 +124,14 @@ function buildNarutoWorkerPrompt(item) {
         writeAllowed
             ? 'If changes are needed, return model-authored patch_envelopes scoped to write paths.'
             : 'This is read-only work. Do not mutate files and return an empty patch_envelopes array.',
+        writeAllowed
+            ? null
+            : 'For read-only work, inspect requested files/artifacts only; do not run package scripts, build commands, tests, or temp-file-creating checks unless the parent objective explicitly requires them.',
         'Include verification checks, rollback notes, blockers, findings, and changed_files.'
-    ].join('\n');
+    ].filter(Boolean).join('\n');
+}
+function normalizeWorkerPromptText(value) {
+    return String(value || '').replace(/\s+/g, ' ').trim().slice(0, 4000);
 }
 main().then(() => {
     process.exit(0);
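Note on the prompt builder change: the array now contains conditional entries that evaluate to `null`, which is why `.join('\n')` became `.filter(Boolean).join('\n')`. `Array.prototype.join` renders `null` as an empty string, so without the filter the prompt would contain blank lines. A reduced sketch of the pattern follows. `buildPromptLines` and its shortened strings are hypothetical stand-ins for `buildNarutoWorkerPrompt`.

```javascript
// Hypothetical reduced version of the conditional-line pattern in the diff.
function buildPromptLines(item, parentObjective) {
    const writeAllowed = item?.write_allowed === true;
    return [
        'You are a worker.',
        // Emitted only when a parent objective was supplied.
        parentObjective ? `Parent objective:\n${parentObjective}` : null,
        `Work item: ${String(item?.id || '')}`,
        // Extra guidance applies to read-only work only.
        writeAllowed ? null : 'Read-only: do not mutate files.'
    ].filter(Boolean).join('\n');
}

console.log(buildPromptLines({ id: 'W1', write_allowed: true }, 'Ship it'));
console.log(buildPromptLines({ id: 'W1' }, ''));
```

Note that `filter(Boolean)` also drops empty strings, so any entry that resolves to `''` disappears along with the `null` entries.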

package/dist/core/naruto/naruto-real-worker-runtime.js CHANGED

@@ -29,6 +29,7 @@ export async function spawnActualNarutoWorker(input) {
         generated_at: nowIso(),
         mission_id: input.missionId,
         item: input.item,
+        parent_prompt: normalizeWorkerPromptText(input.parentPrompt),
         placement: input.placement,
         backend: input.backend,
         result_path: resultPath,
@@ -85,6 +86,9 @@ export async function collectActualNarutoWorker(handle) {
         blockers
     };
 }
+function normalizeWorkerPromptText(value) {
+    return String(value || '').replace(/\s+/g, ' ').trim().slice(0, 4000);
+}
 function actualWorkerEntrypoint() {
     return fileURLToPath(new URL('./naruto-real-worker-child.js', import.meta.url));
 }
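Note on `normalizeWorkerPromptText`: the same helper is added to both the runtime and the child module. It coerces the value to a string, collapses every whitespace run (including newlines) to a single space, trims, and caps the result at 4000 characters. The function body below is copied from the diff. The sample inputs are made up for illustration.

```javascript
// Copied verbatim from the diff above.
function normalizeWorkerPromptText(value) {
    return String(value || '').replace(/\s+/g, ' ').trim().slice(0, 4000);
}

// Multi-line input collapses to a single line.
const multiLine = normalizeWorkerPromptText('  Fix the\n\n  flaky   test\t');
// Long input is capped at 4000 characters.
const capped = normalizeWorkerPromptText('a'.repeat(5000));
// A missing value becomes an empty string.
const missing = normalizeWorkerPromptText(undefined);

console.log(JSON.stringify(multiLine), capped.length, JSON.stringify(missing));
// prints: "Fix the flaky test" 4000 ""
```

Because an empty string is falsy, a missing parent prompt makes the child's `parentObjective ? ... : null` entry resolve to `null`, and the `Parent Naruto objective` line is omitted from the worker prompt.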