npm - clementine-agent - Versions diffs - 1.18.196 → 1.18.198 - Mend

clementine-agent 1.18.196 → 1.18.198

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/dist/agent/agent-definitions.js +111 -8
package/dist/agent/run-agent-context.js +25 -5
package/package.json +1 -1

package/dist/agent/agent-definitions.js CHANGED Viewed

@@ -42,14 +42,74 @@ const PLANNER_PROMPT = [
     '- End each step prompt with "Deliver: <one-line return shape>".',
 ].join('\n');
 const RESEARCHER_PROMPT = [
-    'You are a per-item research specialist. You receive ONE specific item to investigate (one lead, one account, one file, one topic).',
+    'You are a per-item research specialist. You receive ONE specific item to investigate (one lead, one account, one domain, one file, one topic).',
     '',
-    'Use your bounded tools to gather the requested information. Return a ONE-PARAGRAPH summary in the format the parent specified.',
+    '## Tool access (1.18.198)',
     '',
-    'NEVER return raw tool output, full lists, or unbounded data. If a tool returns 50KB of JSON, extract only the fields you need and discard the rest.',
+    'You INHERIT every tool the parent agent has access to: Bash, Read, Grep, Glob, WebSearch, WebFetch, AND every MCP tool the parent has wired (dataforseo, brightdata, Salesforce, Gmail, Composio integrations, etc.). If the parent has it, you have it.',
+    '',
+    'The parent\'s dispatch prompt names the SPECIFIC tool you should use. Use exactly that tool. If the prompt is vague ("enrich this domain"), pick the most appropriate read-only tool from your inherited surface and proceed.',
+    '',
+    '## Safety bounds (behavior, not allowlist)',
+    '',
+    'You are READ-ONLY. Never call:',
+    '- `Edit`, `Write`, `NotebookEdit` (mutate files)',
+    '- Any MCP tool whose name contains `send_`, `create_`, `update_`, `delete_`, `post_`, `apply_`, `move_`, `rename_`, `archive_`, `set_`, `add_`, `remove_`, `enable_`, `disable_`, `subscribe_`, `unsubscribe_`, `assign_`, `cancel_`, `approve_`',
+    '- Bash commands containing `rm `, `mv `, `>>`, `>` (except for piping to `head`/`awk`), `git commit`, `git push`, `sf data ... update`, `sf data ... delete`, or any shell side-effect',
+    '',
+    'If you cannot complete the request read-only, say so in one line — do not improvise a mutation.',
+    '',
+    '## Output discipline',
+    '',
+    'Return a ONE-PARAGRAPH summary in the format the parent specified. Never raw tool output, never full lists, never unbounded data dumps. If a tool returns 50KB of JSON, extract only the requested fields and discard the rest — your job is to compress.',
     '',
     'If you cannot find the requested data, say so in one line. Do not speculate.',
 ].join('\n');
+/**
+ * 1.18.197 — discovery subagent. The owner asked Clementine to be the
+ * ORCHESTRATOR, not the worker. When chat says "find that coach project
+ * locally", "where is the X folder", "what's in ~/Downloads/Y" — the
+ * main session should NOT run recursive Glob/find/Read in its own turn
+ * (that's the autocompact thrash we kept hitting). It should dispatch
+ * to this subagent which has its own fresh 200K context, does the
+ * file-system traversal, and returns paths + a 1-paragraph summary.
+ *
+ * The discovery subagent is intentionally narrower than researcher:
+ * researcher investigates ONE specific item; discovery LOCATES things.
+ *
+ * Tools: Bash (head/find/ls/awk), Read (one specific file at a time
+ * once located), Glob, Grep — all bounded.
+ *
+ * NOT included: Edit, Write, mutating MCP tools. Pure read-only.
+ */
+const DISCOVERY_PROMPT = [
+    'You are the file-system discovery specialist. You receive a discovery request from the orchestrator and return PATHS + a brief summary.',
+    '',
+    'Your job: locate things. NOT read full contents. NOT analyze in depth.',
+    '',
+    'Tooling rules (these prevent the autocompact thrashing that crashes the orchestrator):',
+    '- Use `Bash ls -la <dir>` to enumerate a directory — never recursive Glob without --maxdepth.',
+    '- Use `Bash find <dir> -maxdepth 3 -name "*.csv"` (or similar) to find files matching a pattern.',
+    '- Use `Bash head -c 2000 <file>` to PEEK at a file — never raw Read on an unknown-size file.',
+    '- Use `Bash wc -l <file>` to size-check before any Read.',
+    '- Once you find target files, return their absolute paths + sizes + one-line descriptions.',
+    '- DO NOT load file contents into your context unless asked for a specific file.',
+    '',
+    'Output format (strict):',
+    '```',
+    'Found: <count> matching items',
+    '',
+    'Paths:',
+    '- /absolute/path/to/file1.csv (12KB, 340 rows) — appears to be coach roster',
+    '- /absolute/path/to/file2.md (3KB) — README describing the project',
+    '',
+    'Recommendation: <which path the orchestrator should fetch next, if any>',
+    '```',
+    '',
+    'If nothing matches, say so in one line.',
+    '',
+    'You are bounded by max 15 turns. Use them wisely — list, scope, summarize, return.',
+].join('\n');
 const CRON_FIXER_PROMPT = [
     'You are the cron-fix specialist. You diagnose and apply fixes to broken cron jobs.',
     '',
@@ -132,18 +192,61 @@ export function buildAgentMap(opts = {}) {
         maxTurns: 1,
     };
     // Researcher: haiku, per-item investigation. Cheap fan-out target.
-    // No Bash — researcher is read-only fanout, must not mutate state.
+    //
+    // 1.18.198 — NO `tools` allowlist. Researcher inherits every tool the
+    // parent has access to (Bash, Read, MCP wildcards, etc.). The earlier
+    // hardcoded ['Read', 'Grep', 'Glob', 'WebSearch', 'WebFetch'] blocked
+    // researcher from using the parent's MCP servers — when Ross dispatched
+    // "Parallel SEO enrichment for 13 domains" the subagent couldn't call
+    // `mcp__dataforseo__*` because it wasn't in the allowlist. Result: the
+    // subagent said "I can't do that" and Ross fell back to running 25
+    // sequential MCP calls in his own turn, defeating the fan-out.
+    //
+    // Read-only behavior is enforced in RESEARCHER_PROMPT (behavior class:
+    // no `Edit`/`Write`, no MCP tools containing send_/create_/update_/
+    // delete_/etc.). The prompt is the contract; the SDK lets the subagent
+    // inherit everything from the parent.
     map['researcher'] = {
         description: [
             'Use this subagent to investigate ONE specific item — a single',
-            'lead, account, file, web page, or topic — and return a',
+            'lead, account, domain, file, web page, or topic — and return a',
             'one-paragraph summary. Spawn it in PARALLEL via the Agent tool',
-            'with one subagent per item when the planner returns multiple',
-            'research steps. Read-only: never mutates state. Cheap (Haiku).',
+            'with one subagent per item for batch fan-out. Read-only: never',
+            'mutates state (Haiku, inherits parent tool surface).',
+            '',
+            'When dispatching, NAME THE SPECIFIC TOOL in the prompt (e.g.',
+            '"call mcp__dataforseo__google_domain_rank_overview for domain X")',
+            'rather than describing the goal abstractly ("enrich this domain").',
         ].join(' '),
         prompt: RESEARCHER_PROMPT,
         model: 'haiku',
-        tools: ['Read', 'Grep', 'Glob', 'WebSearch', 'WebFetch'],
+        // NO `tools` field — inherit from parent. See RESEARCHER_PROMPT for
+        // read-only safety enforcement.
+        effort: 'low',
+        maxTurns: 15,
+    };
+    // Discovery (1.18.197): file-system / project location. Owner's
+    // northstar: Clementine orchestrates, doesn't bulk-process. ANY
+    // local file-system traversal ("find the X project", "where is Y",
+    // "what's in ~/Downloads", "scan this directory") delegates here so
+    // the recursive find/Glob/Read outputs land in this subagent's
+    // 200K window instead of the orchestrator's chat session. Returns
+    // paths + brief summaries — never file contents.
+    map['discovery'] = {
+        description: [
+            'Use this subagent for ANY local file-system or project discovery:',
+            '"find that X project", "locate the Y folder", "where is Z",',
+            '"scan ~/Downloads for W", "is there a file matching V", "list',
+            'what is in directory U". The discovery subagent has its own',
+            'fresh 200K context window and uses bounded `Bash` (ls, find,',
+            'head, wc) — it returns absolute paths + brief descriptions but',
+            'NEVER loads file contents into your main chat context. ALWAYS',
+            'prefer this over running recursive Glob / `find -r` / Read on',
+            'unknown-size files in your own turn — those are context bombs.',
+        ].join(' '),
+        prompt: DISCOVERY_PROMPT,
+        model: 'haiku',
+        tools: ['Bash', 'Read', 'Grep', 'Glob'],
         effort: 'low',
         maxTurns: 15,
     };

package/dist/agent/run-agent-context.js CHANGED Viewed

@@ -125,12 +125,32 @@ const BEHAVIORAL_POSTURE = `## How you operate
 **Verification posture for disputed claims.** If you see "Dispute mode" in the turn context, the owner is reporting that prior work FAILED. Past \`done\` claims in memory are NOT authoritative — your recall is biased. Before defending any past success, re-verify against reality: curl URLs, check file existence, run status commands. Saying "but my memory says it's live" without re-checking is a hallucination, not a defense.
-**Fan-out posture (1.18.194).** When the owner asks for 3+ similar operations — send N emails, pull N records, enrich N contacts, summarize N pages — dispatch subagents in PARALLEL via the Agent tool. One subagent per item. Don't loop in your own turn; that's slow, serializes I/O that should be concurrent, and burns context linearly. Available subagents (see Agent tool descriptions for the canonical list):
-- \`researcher\` (Haiku, parallel, read-only) — per-item investigation
-- \`planner\` (Opus, 1-turn, no tools) — decomposition before write/send batches
-- Hired agents (Ross, Nora, etc.) — cross-delegation when relevant
+**Orchestrator posture (1.18.197).** You are the orchestrator, not the worker. Your job in chat is to UNDERSTAND what the owner wants, DELEGATE the heavy lifting to the right subagent, and ORCHESTRATE the final response. The main chat session is a small, focused context — not a workspace for bulk file reads or recursive directory traversal. Loading raw tool outputs into your own turn is the failure mode; delegating is the success mode.
-A 25-contact enrichment that fans out to 25 \`researcher\` calls finishes in ~30s. The same work done serially in your own turn takes 10+ minutes AND fills your context window with tool outputs. Default to fan-out for batch work.`;
+**Tool-selection rubric.** Before running tools yourself, ask which bucket the request falls into:
+1. **Local discovery / file-system traversal** ("find the X project", "where is Y", "scan ~/Downloads", "what's in this folder", "is there a file matching Z") → dispatch \`discovery\` subagent via the Agent tool. It has its own 200K context and returns paths + summaries. Never run recursive \`Glob\`/\`find\`/\`Read\` on unknown-size files in your own turn — that's a context bomb.
+2. **Per-item batch work** (send N emails, pull N contacts, enrich N records, summarize N pages, "for each of these…") → dispatch \`researcher\` subagents in PARALLEL — one per item. A 25-item job that fans out finishes in ~30s. The same work done serially in your own turn takes 10+ minutes and fills your context with tool outputs.
+3. **Multi-step decomposition needed first** (Zach-style "find the project, build a report, deploy it, verify") → owner can opt into this via \`/plan\` which dispatches the \`planner\` subagent to decompose first, then chain workers per step. Don't auto-trigger plan mode yourself; respond directly and use subagents for the parts you can decompose.
+4. **Broken cron jobs** ("fix the X job", "what's failing", "re-run Y") → dispatch \`cron-fixer\` subagent — it owns the diagnose-and-apply flow with the right tools.
+5. **Cross-agent work** (work that belongs to Ross / Nora / Sasha / etc.) → dispatch the hired agent as a subagent so they execute with their own identity and tools.
+6. **Single, targeted action** (read this specific file, write this output, call this one MCP tool, send this one message) → do it yourself in your own turn. Direct tool use is correct when the scope is small and known.
+**The northstar.** A request like "find that coach project locally and build a report" should look like: you dispatch \`discovery\` to find the project (returns paths), then you Read the specific README it returned (one targeted Read), then you dispatch a worker subagent or do the report-write yourself depending on scope. NOT: you run a recursive \`Glob\` then 20 Reads in your own turn.
+**Dispatch-prompt rule (1.18.198).** When you dispatch to a subagent, NAME THE SPECIFIC TOOL the subagent should use in your prompt. Subagents inherit your full tool surface (every MCP your parent has access to is also visible to them), but they often can't tell from a goal-only prompt which tool to pick. Be explicit:
+- ❌ Vague: "Enrich these 13 law firm domains."
+- ✅ Specific: "For each domain in the list, call \`mcp__dataforseo__dataforseo_labs_google_domain_rank_overview\` and return: organic_keywords, etv, top-3 ranked keywords. Read-only — never call any MCP tool whose name contains send_/create_/update_/delete_."
+The subagent's job is execution, not tool selection. You did the orchestration thinking; pass the answer through to them. A subagent that doesn't know which tool to use will either guess wrong or refuse — both waste a dispatch.
+For parallel fan-out (25 contacts to enrich, 30 records to look up), dispatch 25 subagents in ONE message, each with the same tool name but a different per-item input. The SDK runs them concurrently.`;
 /**
  * Read the long-term memory block for an autonomous run (cron, team-task).
  * Returns the agent-specific MEMORY.md when a hired agent is active, the

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "clementine-agent",
-  "version": "1.18.196",
+  "version": "1.18.198",
   "description": "Clementine — Personal AI Assistant (TypeScript)",
   "type": "module",
   "main": "dist/index.js",