npm - jeo-code - Versions diffs - 0.5.12 → 0.5.14 - Mend

jeo-code 0.5.12 → 0.5.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (25)

package/CHANGELOG.md +19 -0
package/README.ja.md +2 -2
package/README.ko.md +2 -2
package/README.md +2 -2
package/README.zh.md +2 -2
package/package.json +3 -2
package/src/agent/engine.ts +16 -3
package/src/agent/loop.ts +2 -0
package/src/agent/tool-schemas.ts +132 -0
package/src/agent/tools.ts +9 -3
package/src/ai/model-manager.ts +1 -0
package/src/ai/providers/anthropic.ts +60 -3
package/src/ai/providers/antigravity.ts +31 -1
package/src/ai/providers/openai-responses.ts +55 -0
package/src/ai/providers/openai.ts +46 -3
package/src/ai/types.ts +19 -0
package/src/cli/runner.ts +9 -0
package/src/commands/launch.ts +207 -256
package/src/commands/update.ts +12 -0
package/src/commands/whats-new.ts +3 -2
package/src/skills/catalog.ts +34 -70
package/src/tui/app.ts +43 -61
package/src/tui/components/autocomplete.ts +2 -8
package/src/tui/components/slash.ts +1 -2
package/src/util/whats-new.ts +4 -1

package/CHANGELOG.md CHANGED

@@ -6,6 +6,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 The README mirrors the latest 5 entries — regenerate with `bun run changelog:sync`.
+## [0.5.14] - 2026-06-16
+_`jeo --tmux` live-verification harness — repeatable stability + behavior checks._
+### Added
+- `scripts/tmux-verify.sh` (and `bun run verify:tmux`) codifies the launch → send-keys → capture → cleanup loop into one repeatable command, so stability and behavior of the interactive TUI can be checked without hand-rolled one-off bash. macOS-safe (no GNU `timeout`; a bash watchdog polls for the session). Boots jeo in a DETACHED tmux session inside a throwaway cwd (never edits the real repo) and only ever kills the session it created — a user's `jeo-main-*` session is never touched. Subcommands: `smoke` (boot + assert the input box and model bar render, no crash — the stability gate), `check "<input>" "<regex>" [--ansi] [--wait N]` (type input, assert the pane matches a pattern — the behavior primitive; captures scrollback so long output like `/help` still matches), and `capture` (dump the settled frame).
+### Changed
+- `jeo whats-new` (and the post-upgrade update notice) now default to the **5 most recent** releases instead of only the single latest entry, so the notes no longer look static/hardcoded across upgrades. `--all` still prints the full history. New shared constant `RECENT_RELEASE_COUNT` (`src/util/whats-new.ts`) is the single source of truth for both the command and the launch notice (the launch notice is capped to it too, so a large version jump no longer dumps a wall). Mirrors gjc's "Recent Changes" pattern (latest-N + a full toggle) and the README's latest-5 digest.
+### Maintainer notes
+- Internal refactors landed since 0.5.13 (no behavior change): centralized workflow name/engine dispatch (`WORKFLOW_NAMES`/`runWorkflowEngine`), a shared `statusBoxData()` for the inline/non-inline status frames, and a `normalizeSlashAlias()` helper. Also fixed a flaky test where the light-tool ledger line briefly carried an elapsed `(Nms)` suffix — that detail belongs on the forge cards, the ledger line is a clean single line again.
+## [0.5.13] - 2026-06-15
+_Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name._
+### Fixed
+- Workflow skills listed in the `/` menu didn't run: a bare `/name` only resolved when the skill's SKILL.md happened to self-reference that slash token (so `/ralplan` worked by luck while `/deep-interview` and `/ultragoal` returned "Unknown command"). `parseSkillInvocation` now resolves a plain `/word` against skill NAMES (exact, then unique prefix) — the same entrypoint as `$name` and `/skill:name` (gjc parity) — so `/deep-interview`, `/ralplan`, `/team`, `/ultragoal` (and any loaded skill) dispatch from the slash menu. Dotted (`/speckit.plan`) and nested (`/a/b`) tokens keep their alias/file-path resolution untouched, and built-in commands still take precedence.
+- The four bundled workflows are now always listed in the `/` menu as `/deep-interview`, `/ralplan`, `/team`, `/ultragoal`, even when their SKILL.md declares no slash alias, so they are discoverable as well as runnable.
 ## [0.5.12] - 2026-06-15
 _Yellow status animation while a process runs, and elapsed `(Nms)` on every completed tool card._
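
The 0.5.13 entry above describes resolving a plain `/word` against skill names, exact match first and unique prefix second. The following is a minimal sketch of that rule only, not the package's `parseSkillInvocation`: the function name `resolveSkillName`, the token regex, and the sample skill list are hypothetical, and built-in command precedence is not modeled.

```typescript
// Hypothetical sketch of "exact, then unique prefix" resolution.
// Not the package's parseSkillInvocation; all names here are illustrative.
function resolveSkillName(token: string, skillNames: string[]): string | null {
  // Handle only plain "/word" tokens. Dotted ("/speckit.plan") and nested
  // ("/a/b") tokens are left to other resolution paths, per the changelog.
  const m = /^\/([A-Za-z0-9_-]+)$/.exec(token);
  if (!m) return null;
  const word = m[1]!;
  // 1) An exact name match always wins.
  if (skillNames.includes(word)) return word;
  // 2) Otherwise accept a prefix only when it identifies exactly one skill.
  const candidates = skillNames.filter(n => n.startsWith(word));
  return candidates.length === 1 ? candidates[0]! : null;
}

const skills = ["deep-interview", "ralplan", "team", "ultragoal"];
const exact = resolveSkillName("/ralplan", skills);    // exact match
const prefix = resolveSkillName("/deep", skills);      // unique prefix
const dotted = resolveSkillName("/speckit.plan", skills); // not a plain word
```

Requiring the prefix to be unique keeps an ambiguous token such as `/tea` (with both `team` and `teamwork` loaded) from silently picking one skill.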

package/README.ja.md CHANGED

@@ -150,11 +150,11 @@ CI は `.github/workflows/npm-publish.yml` で公開します — GitHub リリ
 ## 変更履歴 (Changelog)
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
+- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 - **[0.5.12]** (2026-06-15) — Yellow status animation while a process runs, and elapsed `(Nms)` on every completed tool card.
 - **[0.5.11]** (2026-06-15) — Backspace on an empty prompt line no longer quits jeo.
 - **[0.5.10]** (2026-06-15) — `/resume` transcript no longer dumps raw JSON for batched tool calls.
-- **[0.5.9]** (2026-06-15) — Bounded per-frame wrap for the live thinking/tool-output blocks — re-render cost no longer grows with stream length.
-- **[0.5.8]** (2026-06-15) — Native Opik observability for the turn loop (opt-in `JEO_OPIK`, pure-TS no-op when unset) + autopilot convergence tracking.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/README.ko.md CHANGED

@@ -150,11 +150,11 @@ CI는 `.github/workflows/npm-publish.yml`로 배포합니다 — GitHub 릴리
 ## 변경 이력 (Changelog)
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
+- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 - **[0.5.12]** (2026-06-15) — Yellow status animation while a process runs, and elapsed `(Nms)` on every completed tool card.
 - **[0.5.11]** (2026-06-15) — Backspace on an empty prompt line no longer quits jeo.
 - **[0.5.10]** (2026-06-15) — `/resume` transcript no longer dumps raw JSON for batched tool calls.
-- **[0.5.9]** (2026-06-15) — Bounded per-frame wrap for the live thinking/tool-output blocks — re-render cost no longer grows with stream length.
-- **[0.5.8]** (2026-06-15) — Native Opik observability for the turn loop (opt-in `JEO_OPIK`, pure-TS no-op when unset) + autopilot convergence tracking.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/README.md CHANGED

@@ -150,11 +150,11 @@ Required npm token permissions (repository secret `NPM_TOKEN`):
 ## Changelog
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
+- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 - **[0.5.12]** (2026-06-15) — Yellow status animation while a process runs, and elapsed `(Nms)` on every completed tool card.
 - **[0.5.11]** (2026-06-15) — Backspace on an empty prompt line no longer quits jeo.
 - **[0.5.10]** (2026-06-15) — `/resume` transcript no longer dumps raw JSON for batched tool calls.
-- **[0.5.9]** (2026-06-15) — Bounded per-frame wrap for the live thinking/tool-output blocks — re-render cost no longer grows with stream length.
-- **[0.5.8]** (2026-06-15) — Native Opik observability for the turn loop (opt-in `JEO_OPIK`, pure-TS no-op when unset) + autopilot convergence tracking.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/README.zh.md CHANGED

@@ -150,11 +150,11 @@ CI 通过 `.github/workflows/npm-publish.yml` 发布 — GitHub 发布 release
 ## 更新日志 (Changelog)
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
+- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 - **[0.5.12]** (2026-06-15) — Yellow status animation while a process runs, and elapsed `(Nms)` on every completed tool card.
 - **[0.5.11]** (2026-06-15) — Backspace on an empty prompt line no longer quits jeo.
 - **[0.5.10]** (2026-06-15) — `/resume` transcript no longer dumps raw JSON for batched tool calls.
-- **[0.5.9]** (2026-06-15) — Bounded per-frame wrap for the live thinking/tool-output blocks — re-render cost no longer grows with stream length.
-- **[0.5.8]** (2026-06-15) — Native Opik observability for the turn loop (opt-in `JEO_OPIK`, pure-TS no-op when unset) + autopilot convergence tracking.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "jeo-code",
-  "version": "0.5.12",
+  "version": "0.5.14",
   "description": "Clean, highly optimized AI coding agent using spec-first loop",
   "type": "module",
   "main": "src/cli.ts",
@@ -49,7 +49,8 @@
     "pack:check": "npm pack --dry-run",
     "publish:npm": "npm publish --access public --registry https://registry.npmjs.org/",
     "changelog:sync": "bun scripts/sync-changelog.ts",
-    "test": "bun test"
+    "test": "bun test",
+    "verify:tmux": "bash scripts/tmux-verify.sh"
   },
   "dependencies": {
     "zod": "^3.24.1",

package/src/agent/engine.ts CHANGED

@@ -11,6 +11,7 @@ import * as fs from "node:fs/promises";
 import * as path from "node:path";
 import type { Message } from "./loop";
 import { extractJsonObject } from "./json";
+import { nativeToolSchemasFor } from "./tool-schemas";
 import { readTool, writeTool, editTool, bashTool, findTool, searchTool, lsTool, mkdirTool, deleteTool, type ToolResult } from "./tools";
 import { webSearchTool, setWebSearchActiveModel } from "./web-search";
 import { friendlyProviderError, isContextOverflowError, isRefusalError } from "../util/provider-error";
@@ -32,6 +33,7 @@ async function invokeCallLlm(history: Message[], options: {
   onRetry?: (attempt: number, err: unknown, delayMs: number) => void;
   onToken?: (delta: string) => void;
   onReasoning?: (delta: string) => void;
+  tools?: import("../ai/types").NativeToolSchema[];
 }): Promise<string> {
   const mod = await import("./loop");
   return mod.callLlm(history, options);
@@ -81,6 +83,7 @@ export const TOOL_PROTOCOL = [
   "Batch only independent calls; NEVER batch 'done', and NEVER put a mutating tool (write/edit/bash) after another mutating tool in one batch whose inputs depend on the earlier one.",
   "Tool calibration: scale calls to difficulty — one for a known fact, a few for a normal task, more only when evidence is genuinely missing. Locate before you open: search/find first, then read the hit, instead of guessing paths.",
   "web_search reflex: if the request hinges on a name, version, library, or event you do not actually recognize, search before answering instead of guessing; never claim a result's absence proves nonexistence.",
+  "Quoting fetched/searched text: paraphrase by default — quote at most one short phrase per source, cite it, and never paste long passages.",
 ].join("\n");
 /** Restricted protocol for read-only subagent roles (planner/architect/critic):
@@ -111,15 +114,19 @@ export const WORKING_DISCIPLINE = [
   "- Correctness first, maintainability second, brevity third. Prefer boring, explicit code.",
   "- Never present partial work as complete; never suppress tests or warnings to make code pass.",
   "- Never fabricate tool results or test outcomes; verification claims must match what was actually run.",
-  "- Don't assume disk/state matches expectations or that a referenced file exists — read to verify first.",
+  "- Don't assume disk/state or that a referenced file exists — read to verify first.",
   "- Don't fabricate API/library surfaces from memory; check the source or --help for unfamiliar APIs.",
   "- Never ship stubs, placeholders, or TODO-only code as a delivered feature.",
   "- Never substitute the requested problem with an easier adjacent one.",
+  "- On a failed tool or test, fix the cause and continue — capture the evidence first; no apology loops, no shrinking the task to dodge it.",
   "- Update directly affected callsites, tests, and docs — or state why they are unchanged.",
   "- Reuse existing patterns; parallel conventions are prohibited. Fix problems at their source.",
-  "- You are not alone in the repository: treat unexpected changes as user work; never revert or delete them.",
-  "- Trust tool output as truth, but re-read/re-run if a tool fails, a file changed, or output looks stale or self-contradictory.",
+  "- Not alone in the repo: treat unexpected changes as user work; never revert or delete them.",
+  "- Trust tool output, but re-read/re-run on failure, on a possible file change, or when output looks stale or self-contradictory.",
   "- Prefer dedicated tools over shell pipelines: read (not cat), search (not grep), edit (not sed).",
+  "- For large files (>500 lines), read targeted sections first; use lineRange to avoid context bloat.",
+  "- Own mistakes plainly and fix them — no over-apology or self-abasement; report what went wrong and what you changed.",
+  "- Decline to build malware, exploits, or vulnerability-weaponization even under an educational or research framing.",
 ].join("\n");
 /** Reply discipline (FABLE-5 tone + gjc communication/soul): shapes the agent's
@@ -429,6 +436,12 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
     try {
       responseText = await invokeCallLlm(history, {
               jsonMode: true,
+              // NATIVE tool-calling: declare the ACTIVE toolset (read-only subagents
+              // expose only their non-mutating tools). Capable adapters (anthropic …)
+              // use these and re-serialize the structured call to canonical JSON; the
+              // antigravity/ollama fallback ignores them. Only on the main step — never
+              // the prose wrap-up call below.
+              tools: nativeToolSchemasFor(Object.keys(tools)),
               model: opts.model,
               maxTokens: opts.maxTokens,
               signal: opts.signal,

package/src/agent/loop.ts CHANGED

@@ -21,6 +21,8 @@ export interface ChatOptions {
   onToken?: (delta: string) => void;
   /** Streaming sink for native reasoning/thinking deltas (drives the dimmed live view). */
   onReasoning?: (delta: string) => void;
+  /** NATIVE tool-calling function declarations (forwarded to capable adapters). */
+  tools?: import("../ai/types").NativeToolSchema[];
 }
 const manager = createModelManager();

package/src/agent/tool-schemas.ts ADDED

@@ -0,0 +1,132 @@
+import type { NativeToolSchema } from "../ai/types";
+/**
+ * Native function-calling schemas for jeo's tools, keyed by canonical tool name.
+ *
+ * The `properties` keys MUST match the argument names the DEFAULT_TOOLS handlers read
+ * (engine.ts) EXACTLY — a renamed parameter would land in a key the handler ignores and
+ * silently no-op the call. The model fills an API-validated schema, so this registry is
+ * the single source of truth for argument names on the native path.
+ */
+const STRING = { type: "string" } as const;
+const SCHEMAS: Record<string, NativeToolSchema> = {
+  read: {
+    name: "read",
+    description: "Read a file. Optional lineRange ('a-b','a-','a','a+n','a-b,c-d'); raw=true skips line-number prefixes.",
+    parameters: {
+      type: "object",
+      properties: { filePath: STRING, lineRange: STRING, raw: { type: "boolean" } },
+      required: ["filePath"],
+    },
+  },
+  write: {
+    name: "write",
+    description: "Create or overwrite a file with the given content.",
+    parameters: { type: "object", properties: { filePath: STRING, content: STRING }, required: ["filePath", "content"] },
+  },
+  edit: {
+    name: "edit",
+    description: "Apply a line-anchored edit block to a file (≔A..B replace, ≔A+ insert after, ≔$ append).",
+    parameters: { type: "object", properties: { filePath: STRING, editBlock: STRING }, required: ["filePath", "editBlock"] },
+  },
+  bash: {
+    name: "bash",
+    description: "Run a shell command. Optional timeoutMs, cwd (subdir), env (extra vars).",
+    parameters: {
+      type: "object",
+      properties: { command: STRING, timeoutMs: { type: "number" }, cwd: STRING, env: { type: "object" } },
+      required: ["command"],
+    },
+  },
+  find: {
+    name: "find",
+    description: "Find files by glob pattern.",
+    parameters: { type: "object", properties: { globPattern: STRING }, required: ["globPattern"] },
+  },
+  search: {
+    name: "search",
+    description: "Search file contents by regex (grep). Optional globPattern, ignoreCase, context, maxMatches.",
+    parameters: {
+      type: "object",
+      properties: {
+        pattern: STRING,
+        globPattern: STRING,
+        ignoreCase: { type: "boolean" },
+        context: { type: "number" },
+        maxMatches: { type: "number" },
+      },
+      required: ["pattern"],
+    },
+  },
+  ls: {
+    name: "ls",
+    description: "List a directory's entries (directories first).",
+    parameters: { type: "object", properties: { dirPath: STRING }, required: ["dirPath"] },
+  },
+  mkdir: {
+    name: "mkdir",
+    description: "Create a directory (parents included; idempotent).",
+    parameters: { type: "object", properties: { dirPath: STRING }, required: ["dirPath"] },
+  },
+  delete: {
+    name: "delete",
+    description: "Remove a file, or a directory when recursive=true.",
+    parameters: { type: "object", properties: { path: STRING, recursive: { type: "boolean" } }, required: ["path"] },
+  },
+  web_search: {
+    name: "web_search",
+    description: "Search the web (synthesized answer + sources + citations). Optional recency, limit.",
+    parameters: { type: "object", properties: { query: STRING, recency: STRING, limit: { type: "number" } }, required: ["query"] },
+  },
+  done: {
+    name: "done",
+    description: "Call when the task is fully implemented AND verified. The reason is shown to the user as your message.",
+    parameters: { type: "object", properties: { reason: STRING }, required: [] },
+  },
+};
+/**
+ * Build the native tool-schema list for the ACTIVE toolset. Pass the real tool names the
+ * turn is allowed to use (Object.keys of the engine's toolset); `done` is always appended
+ * so the model can signal completion natively. Read-only subagents therefore expose only
+ * their non-mutating tools — never write/edit/bash — on the native channel.
+ */
+export function nativeToolSchemasFor(toolNames: Iterable<string>): NativeToolSchema[] {
+  const out: NativeToolSchema[] = [];
+  const seen = new Set<string>();
+  for (const name of toolNames) {
+    const schema = SCHEMAS[name];
+    if (schema && !seen.has(name)) {
+      out.push(schema);
+      seen.add(name);
+    }
+  }
+  if (!seen.has("done")) out.push(SCHEMAS.done!);
+  return out;
+}
+/**
+ * Re-serialize parsed native tool calls into the engine's canonical JSON string. Coalesces
+ * a batched `done` to a single envelope (the engine rejects done-in-batch). Returns null
+ * when there are no calls. Shared by capable provider adapters (antigravity/openai/…).
+ */
+export function serializeToolCalls(calls: { tool: string; arguments: Record<string, unknown> }[]): string | null {
+  // Gemini (antigravity) intermittently namespaces native functions under `default_api`
+  // (e.g. functionCall.name = "default_api.done" / "default_api:done") when handed raw
+  // functionDeclarations, which the engine then rejects as an unknown tool. Strip that
+  // namespace back to the bare tool name so the call dispatches normally.
+  const valid = calls
+    .map(c => ({ ...c, tool: normalizeNativeToolName(c.tool) }))
+    .filter(c => c.tool);
+  if (valid.length === 0) return null;
+  const done = valid.find(c => c.tool === "done");
+  if (done) return JSON.stringify(done);
+  if (valid.length === 1) return JSON.stringify(valid[0]);
+  return JSON.stringify({ tools: valid });
+}
+/** Strip the Gemini `default_api.` / `default_api:` namespace prefix from a tool name. */
+export function normalizeNativeToolName(name: string): string {
+  return (name ?? "").replace(/^default_api\s*[.:]\s*/, "").trim();
+}
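
To make the three envelope shapes produced by `serializeToolCalls` concrete, the sketch below re-declares the two helpers locally (as `normalize` and `serialize`, so it runs standalone) and feeds them sample calls. The sample arguments are illustrative; the logic is intended to mirror the added file above, not to replace it.

```typescript
type Call = { tool: string; arguments: Record<string, unknown> };

// Local copies mirroring the diff above so this snippet is self-contained.
const normalize = (name: string): string =>
  (name ?? "").replace(/^default_api\s*[.:]\s*/, "").trim();

function serialize(calls: Call[]): string | null {
  const valid = calls
    .map(c => ({ ...c, tool: normalize(c.tool) }))
    .filter(c => c.tool);
  if (valid.length === 0) return null;
  const done = valid.find(c => c.tool === "done");
  if (done) return JSON.stringify(done);            // a batched `done` collapses to one envelope
  if (valid.length === 1) return JSON.stringify(valid[0]); // single call: {"tool":...}
  return JSON.stringify({ tools: valid });           // several calls: {"tools":[...]}
}

// Single call, with the Gemini namespace prefix stripped.
const single = serialize([{ tool: "default_api.read", arguments: { filePath: "a.ts" } }]);

// Independent batch of two calls.
const batch = serialize([
  { tool: "ls", arguments: { dirPath: "." } },
  { tool: "find", arguments: { globPattern: "*.ts" } },
]);

// A batch containing `done`: only the done envelope survives.
const withDone = serialize([
  { tool: "ls", arguments: { dirPath: "." } },
  { tool: "default_api:done", arguments: { reason: "ok" } },
]);

const none = serialize([]); // no calls at all
```

Note the consequence of the `done` rule: any other calls in the same batch are dropped, which matches the comment that the engine rejects done-in-batch.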

package/src/agent/tools.ts CHANGED

@@ -1,7 +1,7 @@
 import { applyBashFixups } from "./bash-fixups";
 import * as fs from "node:fs/promises";
 import * as path from "node:path";
-import { readWorkflowState, readWorkflowStateStrict, type WorkflowState } from "./state";
+import { readWorkflowStateStrict, type WorkflowState } from "./state";
 import { jeoEnv } from "../util/env";
 import { READ_OUTPUT_MAX } from "./tool-output";
@@ -787,9 +787,15 @@ export async function searchTool(
   try {
     const flags = ignoreCase ? "-rnIi" : "-rnI";
     const gi = await readGitignore(cwd);
+    // A gitignore glob like `.*` (or a bare `*`/`**`) is meant to skip dotfiles, but as a
+    // grep --exclude/--exclude-dir it matches the `./`-prefixed traversal paths and silently
+    // excludes EVERY file on BSD grep (the field bug: search returned "No matches found" for
+    // text that existed). Drop these all-matching globs — IGNORED_DIRS still covers the key
+    // dotdirs (.git/.jeo/.next/.cache), and find() is unaffected (it matches via -name).
+    const safeGlob = (g: string) => !/^\.?\*+$/.test(g);
     const excludes = [
-      ...[...IGNORED_DIRS, ...gi.dirs].map(d => `--exclude-dir=${d}`),
-      ...gi.fileGlobs.map(f => `--exclude=${f}`),
+      ...[...IGNORED_DIRS, ...gi.dirs.filter(safeGlob)].map(d => `--exclude-dir=${d}`),
+      ...gi.fileGlobs.filter(safeGlob).map(f => `--exclude=${f}`),
     ];
     const n = (v: unknown): number | undefined =>
       typeof v === "number" && Number.isFinite(v) && v >= 0 ? Math.floor(v) : undefined;
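
As a quick check of what the new `safeGlob` predicate drops, the sketch below applies the same regex to a handful of gitignore-style globs. The sample glob list is made up for illustration; only the predicate itself is taken from the hunk above.

```typescript
// Same predicate as in the hunk: reject globs consisting solely of an
// optional leading dot followed by one or more asterisks.
const safeGlob = (g: string): boolean => !/^\.?\*+$/.test(g);

// Illustrative inputs (not taken from any real .gitignore).
const globs = [".*", "*", "**", "*.log", ".env", "dist", ".*.swp"];

const kept = globs.filter(safeGlob);
const dropped = globs.filter(g => !safeGlob(g));

// Only the surviving globs become grep flags.
const excludeFlags = kept.map(f => `--exclude=${f}`);
```

Globs that carry any literal text beyond the dot and asterisks, such as `*.log` or `.*.swp`, still pass through, so ordinary ignore rules keep working.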

package/src/ai/model-manager.ts CHANGED

@@ -306,6 +306,7 @@ async function resolveCall(options: Partial<CallOptions>, kind: "request" | "str
     signal: options.signal,
     reasoningEffort: options.reasoningEffort ?? thinkingToReasoningEffort(config.thinkingLevel),
     onReasoning: options.onReasoning,
+    tools: options.tools,
   };
   // Caller-supplied retry sink rides on the config-derived retry budget so the
   // engine/TUI can surface "rate limited — retrying in Ns" instead of a silent wait.

package/src/ai/providers/anthropic.ts CHANGED

@@ -115,6 +115,13 @@ export function anthropicPayload(
   };
   if (credential.kind === "oauth") payload.metadata = { user_id: createClaudeCloakingUserId() };
   if (includeTemperature && options.temperature !== undefined) payload.temperature = options.temperature;
+  if (options.tools?.length) {
+    // NATIVE tool-calling: declare jeo's tools as Anthropic functions. tool_choice
+    // "auto" keeps prose-salvage reachable and lets the model call `done` (declared as
+    // a tool) — never "required", which would kill the plain-text final-answer path.
+    payload.tools = options.tools.map(t => ({ name: t.name, description: t.description, input_schema: t.parameters }));
+    payload.tool_choice = { type: "auto" };
+  }
   if (stream) payload.stream = true;
   const system = anthropicSystemBlocks(systemPrompt, model, credential, payload);
   if (system) payload.system = system;
@@ -190,12 +197,36 @@ function emptyCompletionError(stopReason: string | undefined): Error {
     : "";
   return new Error(`Anthropic returned no content${stopReason ? ` (stop_reason=${stopReason})` : ""}${hint}.`);
 }
+/**
+ * Re-serialize Anthropic native `tool_use` content block(s) into the engine's canonical
+ * JSON string — the linchpin of the adapter-internal-serialization design: the engine,
+ * anti-spin guards, and done-gate keep consuming the SAME {"tool":...}/{"tools":[...]}
+ * shape they parse from the JSON-in-prose path. A batched `done` is coalesced to a single
+ * done envelope (the engine rejects done-in-batch). Returns null when there is no tool_use.
+ */
+function serializeAnthropicToolUse(
+  content: { type: string; name?: string; input?: unknown }[],
+): string | null {
+  const calls = content
+    .filter(c => c.type === "tool_use" && typeof c.name === "string")
+    .map(c => ({ tool: c.name as string, arguments: (c.input ?? {}) as Record<string, unknown> }));
+  if (calls.length === 0) return null;
+  const done = calls.find(c => c.tool === "done");
+  if (done) return JSON.stringify(done);
+  if (calls.length === 1) return JSON.stringify(calls[0]);
+  return JSON.stringify({ tools: calls });
+}
 export const anthropicAdapter: ProviderAdapter = {
   name: "anthropic",
+  supportsNativeTools: true,
   async call(messages, options, credential) {
     const response = await postAnthropic(messages, options, credential, false);
-    const result = (await response.json()) as { content: { type: string; text: string }[]; stop_reason?: string; usage?: AnthropicUsage };
+    const result = (await response.json()) as { content: { type: string; text?: string; name?: string; input?: unknown }[]; stop_reason?: string; usage?: AnthropicUsage };
     if (result.usage) options.onUsage?.({ inputTokens: totalInputTokens(result.usage), outputTokens: result.usage.output_tokens });
+    // Prefer a native tool call (re-serialized to canonical JSON) over any stray text.
+    const toolCall = serializeAnthropicToolUse(result.content);
+    if (toolCall) return toolCall;
     const text = result.content.find(c => c.type === "text")?.text ?? "";
     if (!text) throw emptyCompletionError(result.stop_reason);
     return text;
@@ -206,10 +237,16 @@ export const anthropicAdapter: ProviderAdapter = {
     let cachedInput: number | undefined;
     let yieldedAny = false;
     let stopReason: string | undefined;
+    // Native tool_use streams as content_block_start (name) + input_json_delta fragments,
+    // never as text_delta — accumulate per block index, then re-serialize to canonical
+    // JSON and yield it once at the end (concatenation still equals call()).
+    const toolBlocks = new Map<number, { name: string; json: string }>();
     for await (const data of readSse(response.body)) {
       let evt: {
         type?: string;
-        delta?: { type?: string; text?: string; stop_reason?: string };
+        index?: number;
+        content_block?: { type?: string; name?: string };
+        delta?: { type?: string; text?: string; partial_json?: string; stop_reason?: string };
         message?: { usage?: AnthropicUsage };
         usage?: { output_tokens?: number };
       };
@@ -218,7 +255,12 @@ export const anthropicAdapter: ProviderAdapter = {
       } catch {
         continue;
       }
-      if (evt.type === "content_block_delta" && evt.delta?.type === "text_delta" && evt.delta.text) {
+      if (evt.type === "content_block_start" && evt.content_block?.type === "tool_use" && typeof evt.index === "number") {
+        toolBlocks.set(evt.index, { name: evt.content_block.name ?? "", json: "" });
+      } else if (evt.type === "content_block_delta" && evt.delta?.type === "input_json_delta" && typeof evt.index === "number") {
+        const b = toolBlocks.get(evt.index);
+        if (b) b.json += evt.delta.partial_json ?? "";
+      } else if (evt.type === "content_block_delta" && evt.delta?.type === "text_delta" && evt.delta.text) {
         yieldedAny = true;
         yield evt.delta.text;
       } else if (evt.type === "message_start" && evt.message?.usage) {
@@ -231,6 +273,21 @@ export const anthropicAdapter: ProviderAdapter = {
         if (evt.usage) options.onUsage?.({ inputTokens: cachedInput, outputTokens: evt.usage.output_tokens });
       }
     }
+    if (toolBlocks.size > 0) {
+      const calls = [...toolBlocks.values()]
+        .map(b => {
+          let args: Record<string, unknown> = {};
+          try { args = b.json ? JSON.parse(b.json) : {}; } catch { args = {}; }
+          return { tool: b.name, arguments: args };
+        })
+        .filter(c => c.tool);
+      if (calls.length > 0) {
+        const done = calls.find(c => c.tool === "done");
+        const envelope = done ? JSON.stringify(done) : calls.length === 1 ? JSON.stringify(calls[0]) : JSON.stringify({ tools: calls });
+        yieldedAny = true;
+        yield envelope;
+      }
+    }
     if (!yieldedAny) throw emptyCompletionError(stopReason);
   },
 };
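
The streaming hunks above accumulate `input_json_delta` fragments per content-block index and parse them only once the stream ends. The sketch below isolates that step over an in-memory event array. The function name `collectToolCalls`, the `StreamEvent` type, and the sample events are hypothetical; the event field names follow those used in the diff.

```typescript
type StreamEvent = {
  type: string;
  index?: number;
  content_block?: { type?: string; name?: string };
  delta?: { type?: string; partial_json?: string };
};
type ToolCall = { tool: string; arguments: Record<string, unknown> };

// Hypothetical helper: gather tool_use blocks from already-parsed SSE events.
function collectToolCalls(events: StreamEvent[]): ToolCall[] {
  const blocks = new Map<number, { name: string; json: string }>();
  for (const evt of events) {
    if (evt.type === "content_block_start" && evt.content_block?.type === "tool_use" && typeof evt.index === "number") {
      // A new tool_use block: remember its name, start with empty JSON text.
      blocks.set(evt.index, { name: evt.content_block.name ?? "", json: "" });
    } else if (evt.type === "content_block_delta" && evt.delta?.type === "input_json_delta" && typeof evt.index === "number") {
      // Fragments are raw JSON text; append them to the matching block.
      const b = blocks.get(evt.index);
      if (b) b.json += evt.delta.partial_json ?? "";
    }
  }
  return Array.from(blocks.values())
    .map(b => {
      let args: Record<string, unknown> = {};
      // Malformed or empty JSON falls back to empty arguments.
      try { args = b.json ? JSON.parse(b.json) : {}; } catch { args = {}; }
      return { tool: b.name, arguments: args };
    })
    .filter(c => c.tool);
}

// Sample events: one `read` call whose arguments arrive in two fragments.
const sampleEvents: StreamEvent[] = [
  { type: "content_block_start", index: 0, content_block: { type: "tool_use", name: "read" } },
  { type: "content_block_delta", index: 0, delta: { type: "input_json_delta", partial_json: '{"filePa' } },
  { type: "content_block_delta", index: 0, delta: { type: "input_json_delta", partial_json: 'th":"src/cli.ts"}' } },
];
const collected = collectToolCalls(sampleEvents);
```

Parsing only at the end is what makes split fragments safe: neither `{"filePa` nor `th":"src/cli.ts"}` is valid JSON on its own.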

package/src/ai/providers/antigravity.ts CHANGED

@@ -3,6 +3,7 @@ import type { Credential } from "../../auth";
 import type { CallOptions, Message, ProviderAdapter } from "../types";
 import { readSse } from "../sse";
 import { providerHttpError } from "./errors";
+import { serializeToolCalls } from "../../agent/tool-schemas";
 const ANTIGRAVITY_DAILY_ENDPOINT = "https://daily-cloudcode-pa.googleapis.com";
 const ANTIGRAVITY_SANDBOX_ENDPOINT = "https://daily-cloudcode-pa.sandbox.googleapis.com";
@@ -136,6 +137,12 @@ export function antigravityRequest(messages: Message[], options: CallOptions, cr
   };
   if (systemPrompt) request.systemInstruction = { role: "user", parts: [{ text: systemPrompt }] };
   if (Object.keys(generationConfig).length > 0) request.generationConfig = generationConfig;
+  if (options.tools?.length) {
+    // NATIVE tool-calling: Gemini functionDeclarations through the CCA proxy. AUTO mode
+    // keeps prose answers + the `done` tool both reachable.
+    request.tools = [{ functionDeclarations: options.tools.map(t => ({ name: t.name, description: t.description, parameters: t.parameters })) }];
+    request.toolConfig = { functionCallingConfig: { mode: "AUTO" } };
+  }
   const body = JSON.stringify({
     project,
@@ -160,7 +167,7 @@ export function antigravityRequest(messages: Message[], options: CallOptions, cr
 type CcaUsage = { promptTokenCount?: number; candidatesTokenCount?: number; thoughtsTokenCount?: number };
 interface CcaChunk {
   response?: {
-    candidates?: { content?: { parts?: { text?: string; thought?: boolean }[] }; finishReason?: string }[];
+    candidates?: { content?: { parts?: { text?: string; thought?: boolean; functionCall?: { name?: string; args?: Record<string, unknown> } }[] }; finishReason?: string }[];
     usageMetadata?: CcaUsage;
   };
 }
@@ -174,6 +181,18 @@ function thoughtOf(chunk: CcaChunk): string {
   return chunk.response?.candidates?.[0]?.content?.parts?.filter(p => p.thought).map(p => p.text ?? "").join("") ?? "";
 }
+/** Native Gemini functionCall parts (Cloud Code Assist) → {tool, arguments}. */
+function functionCallsOf(chunk: CcaChunk): { tool: string; arguments: Record<string, unknown> }[] {
+  const parts = chunk.response?.candidates?.[0]?.content?.parts ?? [];
+  const out: { tool: string; arguments: Record<string, unknown> }[] = [];
+  for (const p of parts) {
+    if (p.functionCall && typeof p.functionCall.name === "string") {
+      out.push({ tool: p.functionCall.name, arguments: (p.functionCall.args ?? {}) as Record<string, unknown> });
+    }
+  }
+  return out;
+}
 async function fetchAntigravity(messages: Message[], options: CallOptions, credential: Credential): Promise<Response> {
   // Resolve the project id up front: stored credential → env → lazy
   // loadCodeAssist/onboardUser discovery (persisted for future sessions).
@@ -191,20 +210,26 @@ async function fetchAntigravity(messages: Message[], options: CallOptions, crede
 export const antigravityAdapter: ProviderAdapter = {
   name: "antigravity",
+  supportsNativeTools: true,
   async call(messages, options, credential) {
     const response = await fetchAntigravity(messages, options, credential);
     if (!response.body) return "";
     let out = "";
     let usage: CcaUsage | undefined;
+    const fnCalls: { tool: string; arguments: Record<string, unknown> }[] = [];
     for await (const data of readSse(response.body)) {
       let chunk: CcaChunk;
       try { chunk = JSON.parse(data); } catch { continue; }
       const thought = thoughtOf(chunk);
       if (thought) options.onReasoning?.(thought);
       out += textOf(chunk);
+      fnCalls.push(...functionCallsOf(chunk));
       if (chunk.response?.usageMetadata) usage = chunk.response.usageMetadata;
     }
     if (usage) options.onUsage?.({ inputTokens: usage.promptTokenCount, outputTokens: (usage.candidatesTokenCount ?? 0) + (usage.thoughtsTokenCount ?? 0) });
+    // Prefer a native tool call (re-serialized to canonical JSON) over any stray text.
+    const envelope = serializeToolCalls(fnCalls);
+    if (envelope) return envelope;
     if (!out) throw new Error("Antigravity Cloud Code Assist returned an empty response.");
     return out;
   },
@@ -213,6 +238,7 @@ export const antigravityAdapter: ProviderAdapter = {
     if (!response.body) return;
     let yielded = false;
     let usage: CcaUsage | undefined;
+    const fnCalls: { tool: string; arguments: Record<string, unknown> }[] = [];
     for await (const data of readSse(response.body)) {
       let chunk: CcaChunk;
       try { chunk = JSON.parse(data); } catch { continue; }
@@ -220,9 +246,13 @@ export const antigravityAdapter: ProviderAdapter = {
       if (thought) options.onReasoning?.(thought);
       const delta = textOf(chunk);
       if (delta) { yielded = true; yield delta; }
+      fnCalls.push(...functionCallsOf(chunk));
       if (chunk.response?.usageMetadata) usage = chunk.response.usageMetadata;
     }
     if (usage) options.onUsage?.({ inputTokens: usage.promptTokenCount, outputTokens: (usage.candidatesTokenCount ?? 0) + (usage.thoughtsTokenCount ?? 0) });
+    // Native tool calls have no text deltas — yield the re-serialized envelope once at end.
+    const envelope = serializeToolCalls(fnCalls);
+    if (envelope) { yielded = true; yield envelope; }
     if (!yielded) throw new Error("Antigravity Cloud Code Assist returned an empty response.");
   },
 };
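
The Anthropic and Antigravity adapters both consume the same tool declarations but place them in different request fields. The sketch below puts the two mappings from the hunks above side by side. The `NativeToolSchema` shape is an assumption inferred from how the adapters use it (its declaration in `src/ai/types.ts` is not shown in this excerpt), and `toAnthropic` and `toAntigravity` are hypothetical names.

```typescript
// Assumed shape; the real declaration lives in src/ai/types.ts (not shown here).
type NativeToolSchema = { name: string; description: string; parameters: Record<string, unknown> };

const lsSchema: NativeToolSchema = {
  name: "ls",
  description: "List a directory's entries (directories first).",
  parameters: { type: "object", properties: { dirPath: { type: "string" } }, required: ["dirPath"] },
};

// Anthropic: a flat tool list with the JSON Schema under `input_schema`,
// plus tool_choice "auto" so a plain-text answer stays possible.
function toAnthropic(tools: NativeToolSchema[]) {
  return {
    tools: tools.map(t => ({ name: t.name, description: t.description, input_schema: t.parameters })),
    tool_choice: { type: "auto" },
  };
}

// Antigravity (Gemini): a single tools entry wrapping `functionDeclarations`,
// plus function-calling mode "AUTO".
function toAntigravity(tools: NativeToolSchema[]) {
  return {
    tools: [{ functionDeclarations: tools.map(t => ({ name: t.name, description: t.description, parameters: t.parameters })) }],
    toolConfig: { functionCallingConfig: { mode: "AUTO" } },
  };
}

const anthropicFields = toAnthropic([lsSchema]);
const antigravityFields = toAntigravity([lsSchema]);
```

Both mappings leave tool calling optional, in line with the comments in the diff about keeping prose answers and the `done` tool reachable.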