npm - thoth-agents - Versions diffs - 0.1.8 → 0.1.10 - Mend

thoth-agents 0.1.8 → 0.1.10

This diff compares publicly available package versions that have been released to one of the supported registries. It is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/dist/{chunk-OES76C67.js → chunk-R2AP6O5Q.js} +34 -21
package/dist/{chunk-OJCEGZSA.js → chunk-U5FPYFMX.js} +15 -2
package/dist/cli/index.js +2 -2
package/dist/cli/tui/index.js +2 -2
package/dist/hooks/phase-reminder/index.d.ts +1 -1
package/dist/index.js +2 -2
package/package.json +1 -1

package/dist/{chunk-OES76C67.js → chunk-R2AP6O5Q.js} RENAMED

@@ -719,12 +719,12 @@ var OPENCODE_PROMPT_DIALECT = {
 var CODEX_PROMPT_DIALECT = {
   harness: "codex",
   tools: {
-    delegationTool: "Codex custom-agent task",
-    backgroundDelegationTool: "Codex background role-agent run",
-    backgroundStatusTool: "Codex host status surface",
+    delegationTool: "multi_agent_v1.spawn_agent",
+    backgroundDelegationTool: "multi_agent_v1.spawn_agent",
+    backgroundStatusTool: "multi_agent_v1.wait_agent",
     userQuestionTool: "request_user_input",
-    progressTool: "Codex progress tracking surface",
-    hostStatusSurface: "Codex host status surface",
+    progressTool: "functions.update_plan",
+    hostStatusSurface: "multi_agent_v1.wait_agent",
     roleReference: (role) => `${role} role agent`
   },
   capabilities: {
@@ -736,9 +736,9 @@ var CODEX_PROMPT_DIALECT = {
       case "root-coordinator":
         return "ambient Codex root session coordinator";
       case "task":
-        return "Codex custom-agent task";
+        return "multi_agent_v1.spawn_agent";
       case "synchronous-task-only":
-        return "synchronous Codex custom-agent task only";
+        return "synchronous multi_agent_v1.spawn_agent only";
     }
   },
   renderRoleInvocation(role) {
@@ -854,21 +854,29 @@ Push back when context, risk, or assumptions are weak. Avoid verbosity.
 </style>
 <core-rules>
-- Mode: primary coordinator. Mutation: none.
+- Mode: primary coordinator. Mutation: coordination artifacts only.
 - Load \`thoth-mem-agents\` and \`requirements-interview\`.
-- You MUST NOT read or write any file in the workspace except \`openspec/\` coordination artifacts for the SDD pipeline.
-- Delegate all inspection, writing, searching, debugging, and verification.
+- Mutation remains limited to coordination artifacts such as \`openspec/\` during the SDD pipeline; implementation edits belong to write-capable sub-agents.
+- You may perform small bounded local inspection when cheaper, faster, or clearer than delegation: read a known file, confirm a script name, inspect a narrow artifact, or verify one concrete claim.
+- Keep any direct check narrow and evidence-led; do not become the default discovery, implementation, or verification worker.
 - Own the thinking: analyze, choose approach, handle task sequencing, synthesize facts, decide, ask \`{{userQuestionTool}}\` for blocking user input, manage progress, own root-session memory, and write the final report.
 - Use sub-agents for evidence and action, not to outsource architecture or planning.
 - Never request raw file dumps from sub-agents; ask for findings, paths, line anchors, diffs, verification, and blockers.
 - Use openspec/ for coordination artifacts, especially
   openspec/changes/{change-name}/tasks.md.
 - Visual or UX work and screenshots always go to {{role.designer}}.
-- Verify through delegation, not inline.
+- Delegate broad search, multi-file edits, risky verification, UI visual QA, independent review, correctness-heavy debugging, and implementation-heavy work.
 - Verification should follow the user's project instructions and use the smallest sufficient delegated checks: typecheck, lint, focused tests, or build when appropriate.
 - When a harness cannot enforce a rule directly, preserve the rule as instruction-only guidance and disclose the enforcement gap instead of weakening the contract.
 </core-rules>
+<epistemic-rigor>
+- Verify material user or agent claims before relying on them when they affect implementation, architecture, verification, safety, or guidance.
+- Use the cheapest reliable evidence: bounded direct check, delegated local discovery, or authoritative external documentation.
+- If evidence disproves a user or agent assumption, correct it plainly with the evidence, explain relevant tradeoffs, and offer viable alternatives.
+- Allow low-risk assumptions only when brief and not correctness-critical. Stay warm, direct, concise, and evidence-led.
+</epistemic-rigor>
 <session-bootstrap>
 - At the start of a new root session, when thoth-mem tools are available, load \`thoth-mem-agents\` and \`requirements-interview\`, call \`mem_session_start\` with the current project and session identity, then save the real user prompt with \`mem_save_prompt\`.
 - Save only the real user request with \`mem_save_prompt\`; never save generated sub-agent prompts, handoffs, summaries, or tool scaffolding as user intent.
@@ -889,6 +897,13 @@ Tiebreakers:
 - Do not use {{role.oracle}} for routine synthesis. After {{role.explorer}}/{{role.librarian}} results, you combine facts, inferences, unknowns, confidence, and next step.
 </routing>
+<delegation-economics>
+- Choose direct action, delegation, parallelization, or review by net quality, speed, cost, and reliability.
+- Do not delegate when overhead exceeds a bounded direct check; delegate when breadth, risk, specialization, or independent review materially improves the result.
+- Parallelize only independent delegations; reconcile dependent steps after evidence returns.
+- Keep validation and final synthesis accountable to the root even when sub-agents gather evidence, implement, review, or verify.
+</delegation-economics>
 <subagent-prompts>
 - Every sub-agent prompt you write must be in English, regardless of the user's language.
 - Keep user-facing replies in the user's language, but translate delegated task prompts, internal handoffs, SDD envelopes, and verification requests into English.
@@ -897,13 +912,13 @@ Tiebreakers:
 </subagent-prompts>
 <internal-handoff>
-Before dispatching {{role.designer}}, {{role.quick}}, or {{role.deep}} after discovery, synthesize a compact internal handoff. This is an implementation detail between you and sub-agents, not a user-facing step or artifact.
+Before dispatching {{role.designer}}, {{role.quick}}, or {{role.deep}} after discovery, synthesize a compact internal handoff for the sub-agent; it is not user-facing.
 Internal handoff fields: Goal, Decision, Evidence, Scope, Steps, Verification, and Uncertainty. Include relevant files, symbols, anchors, constraints, non-goals, and what to escalate instead of guessing.
-Never mention the internal handoff to the user, ask the user to prepare it, or present handoff preparation as the recommended next step. To the user, describe the actual work: discovery, design, implementation, verification, or the concrete decision needed.
+Never mention the internal handoff to the user, ask the user to prepare it, or present handoff preparation as the recommended next step. Describe the actual work instead.
-For {{role.explorer}}/{{role.librarian}}, ask narrow fact-finding questions for likely files, symbols, call sites, constraints, examples, versioned API facts, and verification targets. Require decision-ready findings, not raw context.
+For {{role.explorer}}/{{role.librarian}}, ask narrow fact-finding questions for files, symbols, constraints, examples, API facts, and verification targets. Require decision-ready findings, not raw context.
 </internal-handoff>
 <dispatch>
@@ -943,14 +958,12 @@ After each phase, verify the sub-agent reported the openspec path and/or thoth-m
 Artifact governance handoff:
 - After \`sdd-tasks\`, you may surface report-only artifact governance findings before execution preparation starts.
-- Delegate governance inspection; do not inspect repository artifacts inline.
-- Do not treat governance findings as an execution gate.
-- Do not let governance validation replace \`plan-reviewer\` or \`executing-plans\`.
-- Root thoth-mem ownership stays with you; sub-agents may surface findings but must not own session memory, prompts, or progress checkpoints.
+- Delegate governance inspection; do not treat findings as an execution gate or replacement for \`plan-reviewer\`/\`executing-plans\`.
+- Root thoth-mem ownership stays with you; sub-agents must not own session memory, prompts, or progress checkpoints.
 Plan gate: after tasks, ask with \`{{userQuestionTool}}\`: "Review plan with {{role.oracle}} before executing (Recommended)" or "Proceed to execution".
 If reviewed, the review loop is complete only after [OKAY].
-If {{role.oracle}} returns [OKAY], ask the user with \`{{userQuestionTool}}\` whether to proceed to implementation or stop with the approved plan.
+If {{role.oracle}} returns [OKAY], give a deep approved-plan overview, then ask with \`{{userQuestionTool}}\` whether to implement or stop. Cover goals, scope, sequence, key decisions, verification, risks/trade-offs, and uncertainty so the user has full context.
 Do not dispatch \`sdd-apply\` after oracle approval until the user confirms implementation.
 Post-execution: delegate sdd-verify, then sdd-archive when verification passes.
 </sdd>
@@ -959,8 +972,8 @@ Post-execution: delegate sdd-verify, then sdd-archive when verification passes.
 - Keep {{progressTool}} top-level and lean for multi-step work.
 - When SDD is active, update both {{progressTool}} and openspec/changes/{change-name}/tasks.md before dispatch and after results.
 - Root-session memory is yours: search before repeated work; save durable decisions, discoveries, bugs, patterns, constraints, and session summaries.
-- Durable \`mem_save\` guidance: save architecture decisions, accepted or rejected recommendations, bug fixes with root cause, non-obvious discoveries, conventions, configuration changes, and durable user preferences. Use stable topic keys for evolving topics, and keep general observations outside the protected \`sdd/*\` namespace.
-- Targeted 3-layer recall protocol: \`mem_search\` with compact results -> \`mem_timeline\` around promising observations -> \`mem_get_observation\` only for records needed in full. Use preview search only when compact results do not disambiguate.
+- Durable \`mem_save\`: save durable decisions, bug root causes, discoveries, conventions, config changes, and preferences. Use stable topic keys; keep general observations outside \`sdd/*\`.
+- Targeted 3-layer recall protocol: \`mem_search\` compact -> \`mem_timeline\` -> \`mem_get_observation\` only for records needed in full.
 - SDD memory artifacts use deterministic topic keys only in thoth-mem or hybrid persistence modes: \`sdd/{change}/{artifact}\`.
 - Before ending the root session, call \`mem_session_summary\` with a concise Goal, Instructions, Discoveries, Accomplished, Next Steps, and Relevant Files summary. Do not claim memory was saved unless the tool call succeeded.
 - After compaction, first preserve the compacted summary with \`mem_session_summary\`, then recover recent context and use the 3-layer recall protocol before continuing work.
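
Note on the hunks above: the prompt text refers to tools through placeholders such as {{progressTool}} and {{role.designer}}, while CODEX_PROMPT_DIALECT now maps those names to concrete identifiers. The following is a minimal sketch of how such substitution could work. It assumes a simple regex-based renderer; `renderTemplate` and `CODEX_TOOLS` are hypothetical names for illustration and are not taken from the package, whose actual rendering code is not shown in this diff.

```javascript
// Hypothetical sketch: substitute {{name}} placeholders from a dialect's tools map.
// Only the tool values below are copied from the diff; everything else is illustrative.
const CODEX_TOOLS = {
  delegationTool: "multi_agent_v1.spawn_agent",
  backgroundDelegationTool: "multi_agent_v1.spawn_agent",
  backgroundStatusTool: "multi_agent_v1.wait_agent",
  userQuestionTool: "request_user_input",
  progressTool: "functions.update_plan",
  hostStatusSurface: "multi_agent_v1.wait_agent",
  roleReference: (role) => `${role} role agent`,
};

// Replace {{key}} with the matching string tool name and {{role.x}} with the
// role reference; unknown placeholders are left untouched.
function renderTemplate(template, tools) {
  return template.replace(/\{\{([\w.]+)\}\}/g, (match, key) => {
    if (key.startsWith("role.")) {
      return tools.roleReference(key.slice("role.".length));
    }
    const value = tools[key];
    return typeof value === "string" ? value : match;
  });
}

const rendered = renderTemplate(
  "- Keep {{progressTool}} top-level; send visual work to {{role.designer}}.",
  CODEX_TOOLS
);
console.log(rendered);
```

Under this assumption, replacing descriptive strings such as "Codex progress tracking surface" with "functions.update_plan" changes every rendered prompt line that mentions the placeholder, which is why a small edit to the dialect table has a wide effect on the generated instructions.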

package/dist/{chunk-OJCEGZSA.js → chunk-U5FPYFMX.js} RENAMED

@@ -31,7 +31,7 @@ import {
   renderRolePrompt,
   writeConfig,
   writeLiteConfig
-} from "./chunk-OES76C67.js";
+} from "./chunk-R2AP6O5Q.js";
 // src/harness/adapters/codex.ts
 import { existsSync as existsSync2, readFileSync as readFileSync2 } from "fs";
@@ -1353,6 +1353,18 @@ function renderCodexRolePrompt(roleName, config, model) {
     codexStepBudgetPromptSection(override?.steps)
   );
 }
+function codexInternalHandoffGuidance() {
+  return [
+    "<codex-delegation-guidance>",
+    "- Delegate by calling `multi_agent_v1.spawn_agent` with `agent_type` set to one of explorer, librarian, oracle, designer, quick, or deep.",
+    "- Pass the self-contained delegated prompt in `message`; do not pass both `message` and `items`.",
+    "- Use `items` only for structured attachments or mentions when they are truly required.",
+    "- Include the internal handoff in `message` for write-capable agents so they can act without rediscovering context.",
+    "- Leave `fork_context` omitted or false by default; set `fork_context: true` only when the exact current thread history is required.",
+    "- Use `multi_agent_v1.wait_agent` only when the root needs the result, `multi_agent_v1.send_input` for follow-up or redirect, `multi_agent_v1.resume_agent` only for a closed agent that must continue, and `multi_agent_v1.close_agent` after completion.",
+    "</codex-delegation-guidance>"
+  ].join("\n");
+}
 function codexRoleInstructions(role) {
   return [
     "<role-operational-contract>",
@@ -1386,12 +1398,13 @@ function renderCodexRootInstructions(config) {
   return [
     CODEX_ROOT_START,
     rootPrompt,
+    codexInternalHandoffGuidance(),
     "<codex-runtime>",
     "- The ambient Codex root session is the root/main orchestrator; orchestrator-only and root-owned instructions apply to it because Codex does not generate a selectable orchestrator agent TOML.",
     "- On each new root session, when thoth-mem tools are installed and session/project identity is known, call mem_session_start with the active project and session identity, then save the real user prompt with mem_save_prompt before later delegation.",
     "- If thoth-mem tools or identity values are unavailable, disclose that memory bootstrap could not run and continue without claiming memory was saved.",
     "- Use the ambient Codex root session as the delegate-first root coordinator; do not generate or select an orchestrator TOML.",
-    "- Delegate by invoking the installed Codex role agents: explorer, librarian, oracle, designer, quick, and deep.",
+    "- Delegate by invoking `multi_agent_v1.spawn_agent` for the installed Codex role agents: explorer, librarian, oracle, designer, quick, and deep.",
     "- After receiving a delegated subagent response, close that subagent session unless you will retry or intentionally keep using that exact same session; explorer and librarian sessions must always be closed immediately after their response, and retry sessions must be closed after the retry result unless explicit same-session reuse is still required.",
     "- Use packaged thoth-agents plugin capabilities through Codex plugin, skill, MCP, and hook review surfaces after enabling them with /plugins and /hooks.",
     "- For blocking user decisions in Codex Default mode, use request_user_input after features.default_mode_request_user_input is enabled; do not ask those questions in plain prose.",

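Note on the hunks above: the new codexInternalHandoffGuidance() text describes how a delegation call should be shaped (a role in `agent_type`, the prompt in `message`, `items` only when required, `fork_context` off by default). The sketch below turns those sentences into a small argument builder. It is an illustration only: `buildSpawnAgentArgs` and `CODEX_ROLES` are hypothetical names, the field names are taken from the guidance strings rather than from a verified tool schema, and treating "neither `message` nor `items`" as an error is an added assumption.

```javascript
// Hypothetical helper; field names (agent_type, message, items, fork_context)
// come from the guidance strings in the diff, not from a verified tool schema.
const CODEX_ROLES = ["explorer", "librarian", "oracle", "designer", "quick", "deep"];

function buildSpawnAgentArgs({ agentType, message, items, forkContext = false }) {
  if (!CODEX_ROLES.includes(agentType)) {
    throw new Error(`Unknown agent_type: ${agentType}`);
  }
  const hasMessage = typeof message === "string" && message.length > 0;
  const hasItems = Array.isArray(items) && items.length > 0;
  // The guidance forbids passing both; requiring at least one is an assumption.
  if (hasMessage === hasItems) {
    throw new Error("Pass exactly one of message or items");
  }
  const args = { agent_type: agentType };
  if (hasMessage) {
    args.message = message;
  } else {
    args.items = items;
  }
  // fork_context is omitted unless explicitly requested, per the guidance.
  if (forkContext === true) {
    args.fork_context = true;
  }
  return args;
}

const spawnArgs = buildSpawnAgentArgs({
  agentType: "explorer",
  message: "Find call sites of renderRolePrompt and report paths with line anchors.",
});
console.log(JSON.stringify(spawnArgs));
```

Keeping `fork_context` out of the payload by default mirrors the guidance that the delegated prompt should be self-contained rather than dependent on the full thread history.
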
package/dist/cli/index.js CHANGED

@@ -25,7 +25,7 @@ import {
   getOperationHarness,
   installRecommendedSkill,
   listOperationHarnesses
-} from "../chunk-OJCEGZSA.js";
+} from "../chunk-U5FPYFMX.js";
 import {
   ALL_AGENT_NAMES,
   CUSTOM_SKILLS,
@@ -37,7 +37,7 @@ import {
   getExistingLiteConfigPath,
   installCustomSkills,
   writeLiteConfig
-} from "../chunk-OES76C67.js";
+} from "../chunk-R2AP6O5Q.js";
 // src/cli/index.ts
 import { pathToFileURL } from "url";

package/dist/cli/tui/index.js CHANGED

@@ -15,13 +15,13 @@ import {
   getOpenCodeStatus,
   listOperationHarnesses,
   parseRoleTomlModel
-} from "../../chunk-OJCEGZSA.js";
+} from "../../chunk-U5FPYFMX.js";
 import {
   ALL_AGENT_NAMES,
   DEFAULT_MODELS,
   getExistingLiteConfigPath,
   parseConfig
-} from "../../chunk-OES76C67.js";
+} from "../../chunk-R2AP6O5Q.js";
 // src/cli/tui/index.tsx
 import { render } from "ink";

package/dist/hooks/phase-reminder/index.d.ts CHANGED

@@ -1,4 +1,4 @@
-export declare const PHASE_REMINDER = "<reminder>Recall Workflow Rules:\nUnderstand \u2192 split discovery into surgical probes with explorer/librarian \u2192 synthesize the decision and internal handoff \u2192 execute \u2192 verify.\nIf delegating, write sub-agent prompts in English and launch the specialist in the same turn you mention it. If multiple delegations are independent, emit all tool calls in a single response.\nBefore write-capable dispatch, give concrete scope, anchors, steps, non-goals, and verification.\nIn SDD, after oracle returns [OKAY], ask the user before implementation.</reminder>";
+export declare const PHASE_REMINDER = "<reminder>Recall Workflow Rules:\nUnderstand \u2192 split discovery into surgical probes with explorer/librarian \u2192 synthesize the decision and internal handoff \u2192 execute \u2192 verify.\nIf delegating, write sub-agent prompts in English and launch the specialist in the same turn you mention it. If multiple delegations are independent, emit all tool calls in a single response.\nBefore write-capable dispatch, give concrete scope, anchors, steps, non-goals, and verification.\nIn SDD, after oracle returns [OKAY], give a deep approved-plan overview, then ask the user before implementation.</reminder>";
 interface MessageInfo {
     role: string;
     agent?: string;

package/dist/index.js CHANGED

@@ -26,7 +26,7 @@ import {
   loadPluginConfig,
   renderRolePrompt,
   stripJsonComments
-} from "./chunk-OES76C67.js";
+} from "./chunk-R2AP6O5Q.js";
 // src/index.ts
 import path4 from "path";
@@ -1743,7 +1743,7 @@ var PHASE_REMINDER = `<reminder>Recall Workflow Rules:
 Understand \u2192 split discovery into surgical probes with explorer/librarian \u2192 synthesize the decision and internal handoff \u2192 execute \u2192 verify.
 If delegating, write sub-agent prompts in English and launch the specialist in the same turn you mention it. If multiple delegations are independent, emit all tool calls in a single response.
 Before write-capable dispatch, give concrete scope, anchors, steps, non-goals, and verification.
-In SDD, after oracle returns [OKAY], ask the user before implementation.</reminder>`;
+In SDD, after oracle returns [OKAY], give a deep approved-plan overview, then ask the user before implementation.</reminder>`;
 function createPhaseReminderHook() {
   return {
     "experimental.chat.messages.transform": async (_input, output) => {

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "thoth-agents",
-  "version": "0.1.8",
+  "version": "0.1.10",
   "description": "Delegate-first OpenCode plugin with seven agents, thoth-mem persistence, and bundled SDD skills.",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",