jeo-code 0.6.27 → 0.6.29
- package/CHANGELOG.md +26 -0
- package/README.ja.md +2 -6
- package/README.ko.md +2 -6
- package/README.md +2 -6
- package/README.zh.md +2 -6
- package/package.json +1 -1
- package/src/agent/compaction.ts +10 -1
- package/src/agent/engine.ts +62 -16
- package/src/agent/loop.ts +3 -0
- package/src/ai/model-catalog.ts +12 -5
- package/src/ai/model-manager.ts +1 -0
- package/src/ai/providers/anthropic.ts +121 -21
- package/src/ai/providers/antigravity.ts +6 -0
- package/src/ai/providers/errors.ts +18 -0
- package/src/ai/providers/gemini.ts +84 -28
- package/src/ai/providers/openai-compatible-catalog.ts +10 -4
- package/src/ai/providers/openai-responses.ts +76 -19
- package/src/ai/types.ts +55 -2
- package/src/commands/launch.ts +90 -22
- package/src/tui/app.ts +38 -6
- package/src/tui/components/ascii-art.ts +27 -31
package/CHANGELOG.md
CHANGED
@@ -6,6 +6,32 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 The README mirrors the latest 5 entries — regenerate with `bun run changelog:sync`.
 
+## [0.6.29] - 2026-06-19
+_Signature-only thinking-block replay (Anthropic opus-4-7/4-8), plus a tmux mouse-flood memory guard confirming `jeo --tmux` does not leak._
+
+### Fixed
+- **Anthropic thinking-block replay now covers signature-only artifacts.** Newer Opus models (opus-4-7/opus-4-8) think internally — tokens billed, a valid `signature` present — but return empty thinking text. The cross-turn replay required both `signature` AND `text`, so those models' reasoning was dropped between steps. Replay now sends a signed `thinking` block whenever a `signature` (or `redacted`) is present (text defaults to `""`), restoring multi-step reasoning continuity for signature-only models. API-key requests also send the `interleaved-thinking` + `prompt-caching-scope` betas so thinking+tools and scoped caching work outside OAuth.
+
+### Added
+- **`claude-opus-4-7` catalogued** (FULL thinking, 200k ctx) and a dynamic context-window fallback for uncatalogued ids (claude 200k / gpt-5 400k / gemini-3 1M).
+- **tmux mouse-report-flood memory guard** (`test/mouse-report-filter.test.ts`): 100k SGR mouse-move reports through `queuePromptInputChunk` leave the prompt queue at zero accumulation — the regression guard for the "`jeo --tmux` slows down over time" concern.
+
+### Verified
+- **`jeo --tmux` has no bun memory leak.** The in-process lifecycle probe (`scripts/mem-probe.ts`, 3000 turns) reports a per-turn heap slope of ≈0 (returns to baseline, exit-listeners flat); a real `jeo --tmux` process plateaus in RSS under sustained mouse/resize/keystroke churn instead of climbing; and mouse reports are filtered (not buffered) with `activityLog` bounded to a 200-entry per-turn ring.
+
+## [0.6.28] - 2026-06-19
+_Signed thinking-block replay: native reasoning is now sent BACK to providers across steps/turns, restoring multi-step reasoning continuity (gajae parity)._
+
+### Added
+- **Provider-native reasoning replay across all three first-party providers.** jeo now captures each provider's opaque/signed reasoning artifact during streaming and replays it on later turns to the SAME provider+model, so the model keeps its chain of thought across tool steps instead of re-deriving it. New `Message.reasoningArtifacts` plus structured `Message.toolUse` / `toolResults` (stable ids) let capable adapters reconstruct **native** tool blocks (the key to continuity — plain-text tool feedback makes Claude strip prior thinking):
+  - **Anthropic**: captures `signature_delta` + `redacted_thinking`; replays `thinking`(+signature) → `tool_use` → `tool_result` blocks (gated on same-model + thinking-enabled).
+  - **OpenAI Responses**: requests `include: ["reasoning.encrypted_content"]` (store stays false), captures reasoning item id+encrypted_content, replays native `reasoning` + `function_call` + `function_call_output` items.
+  - **Gemini**: captures per-part `thoughtSignature`, replays native `functionCall`(+thoughtSignature) / `functionResponse` parts (coalescing-safe). This was previously deferred — structured `toolUse` unblocks the functionCall binding.
+- **Fail-safe strip-and-retry.** A 400 naming a thinking/signature/encrypted/reasoning field retries the step ONCE with artifacts stripped (plain history), so an expired signature or edited history can never wedge a turn. Per provider (Anthropic/OpenAI/Gemini).
+
+### Changed
+- **Reasoning artifacts ride the session record + token accounting.** `reasoningArtifacts` round-trips through session save/load (so `/resume` preserves replay continuity) and counts toward `estimateMessageTokens` (OpenAI encrypted blobs are KB-scale) so compaction/overflow stay honest. Markdown export is unchanged (artifacts are opaque). The engine's ~11 assistant-push sites are unified behind `pushAssistantTurn`, so every step (not just the final reply) carries its reasoning + artifacts. Antigravity is explicitly out of scope (no capture/replay; the provider-keyed match guard prevents any cross-adapter leakage).
+
 ## [0.6.27] - 2026-06-19
 _Ponytail pass on the reasoning-tier mapper, plus a real-tmux verification of `jeo --tmux`._
 
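The 0.6.29 "Fixed" entry amounts to relaxing the replay predicate. The sketch below is only an illustration of that rule as the changelog words it, not the package's adapter code: `ArtifactSketch`, `ReplayBlock`, `toReplayBlockOld`, and `toReplayBlock` are hypothetical names, and mapping `redacted` to a `redacted_thinking` block is an assumption drawn from the 0.6.28 entry.

```typescript
// Hypothetical shapes for illustration; the real types live in package/src/ai/types.ts.
interface ArtifactSketch {
  text?: string;      // visible thinking text (empty for signature-only models)
  signature?: string; // provider signature over the thinking block
  redacted?: string;  // opaque redacted-thinking payload
}

type ReplayBlock =
  | { type: "thinking"; thinking: string; signature: string }
  | { type: "redacted_thinking"; data: string };

// Old rule, per the changelog: replay needed BOTH signature and text,
// so signature-only artifacts were silently dropped.
function toReplayBlockOld(a: ArtifactSketch): ReplayBlock | null {
  if (a.signature && a.text) {
    return { type: "thinking", thinking: a.text, signature: a.signature };
  }
  return null;
}

// New rule: a signature (or redacted payload) alone qualifies; text defaults to "".
// ASSUMPTION: redacted payloads are replayed as redacted_thinking blocks.
function toReplayBlock(a: ArtifactSketch): ReplayBlock | null {
  if (a.redacted) return { type: "redacted_thinking", data: a.redacted };
  if (a.signature) {
    return { type: "thinking", thinking: a.text ?? "", signature: a.signature };
  }
  return null;
}
```

With a signature-only artifact such as `{ signature: "sig" }`, the old predicate yields `null` while the new one yields a signed block with empty text, which is the continuity gap the entry describes.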
package/README.ja.md
CHANGED
@@ -2,10 +2,6 @@
 <img src="assets/hero.png" alt="jeo-code 自律コーディングエージェントのヒーローイラスト" width="100%" />
 </p>
 
-<p align="center">
-<img src="assets/icon.png" alt="jeo-code icon" width="96" />
-</p>
-
 <h1 align="center">jeo-code (jeo)</h1>
 
 <p align="center">
@@ -204,11 +200,11 @@ CI は `.github/workflows/npm-publish.yml` で公開します — GitHub リリ
 ## 変更履歴 (Changelog)
 
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.29]** (2026-06-19) — Signature-only thinking-block replay (Anthropic opus-4-7/4-8), plus a tmux mouse-flood memory guard confirming `jeo --tmux` does not leak.
+- **[0.6.28]** (2026-06-19) — Signed thinking-block replay: native reasoning is now sent BACK to providers across steps/turns, restoring multi-step reasoning continuity (gajae parity).
 - **[0.6.27]** (2026-06-19) — Ponytail pass on the reasoning-tier mapper, plus a real-tmux verification of `jeo --tmux`.
 - **[0.6.26]** (2026-06-19) — The forge emblem is redrawn again as the mascot crayfish, foregrounding its signature pincer claws (집게).
 - **[0.6.25]** (2026-06-19) — Reasoning works at every thinking level (gajae parity), and the forge emblem is redrawn as the neon-lens coding wizard.
-- **[0.6.24]** (2026-06-19) — `/provider` opens an interactive onboarding selector (OAuth vs API-compatible), and OpenAI-compatible backends gain per-vendor native-reasoning formats.
-- **[0.6.23]** (2026-06-19) — Live reasoning/thinking streams in the TUI across every provider, three new OpenAI-compatible backends (LM Studio, xAI, Kimi) join the auth/discovery/catalog surface, and Gemini gains native function-calling.
 
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->
package/README.ko.md
CHANGED
@@ -2,10 +2,6 @@
 <img src="assets/hero.png" alt="jeo-code 자율 코딩 에이전트 히어로 일러스트" width="100%" />
 </p>
 
-<p align="center">
-<img src="assets/icon.png" alt="jeo-code icon" width="96" />
-</p>
-
 <h1 align="center">jeo-code (jeo)</h1>
 
 <p align="center">
@@ -204,11 +200,11 @@ CI는 `.github/workflows/npm-publish.yml`로 배포합니다 — GitHub 릴리
 ## 변경 이력 (Changelog)
 
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.29]** (2026-06-19) — Signature-only thinking-block replay (Anthropic opus-4-7/4-8), plus a tmux mouse-flood memory guard confirming `jeo --tmux` does not leak.
+- **[0.6.28]** (2026-06-19) — Signed thinking-block replay: native reasoning is now sent BACK to providers across steps/turns, restoring multi-step reasoning continuity (gajae parity).
 - **[0.6.27]** (2026-06-19) — Ponytail pass on the reasoning-tier mapper, plus a real-tmux verification of `jeo --tmux`.
 - **[0.6.26]** (2026-06-19) — The forge emblem is redrawn again as the mascot crayfish, foregrounding its signature pincer claws (집게).
 - **[0.6.25]** (2026-06-19) — Reasoning works at every thinking level (gajae parity), and the forge emblem is redrawn as the neon-lens coding wizard.
-- **[0.6.24]** (2026-06-19) — `/provider` opens an interactive onboarding selector (OAuth vs API-compatible), and OpenAI-compatible backends gain per-vendor native-reasoning formats.
-- **[0.6.23]** (2026-06-19) — Live reasoning/thinking streams in the TUI across every provider, three new OpenAI-compatible backends (LM Studio, xAI, Kimi) join the auth/discovery/catalog surface, and Gemini gains native function-calling.
 
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->
package/README.md
CHANGED
@@ -2,10 +2,6 @@
 <img src="assets/hero.png" alt="jeo-code autonomous coding-agent hero illustration" width="100%" />
 </p>
 
-<p align="center">
-<img src="assets/icon.png" alt="jeo-code icon" width="96" />
-</p>
-
 <h1 align="center">jeo-code (jeo)</h1>
 
 <p align="center">
@@ -204,11 +200,11 @@ Required npm token permissions (repository secret `NPM_TOKEN`):
 ## Changelog
 
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.29]** (2026-06-19) — Signature-only thinking-block replay (Anthropic opus-4-7/4-8), plus a tmux mouse-flood memory guard confirming `jeo --tmux` does not leak.
+- **[0.6.28]** (2026-06-19) — Signed thinking-block replay: native reasoning is now sent BACK to providers across steps/turns, restoring multi-step reasoning continuity (gajae parity).
 - **[0.6.27]** (2026-06-19) — Ponytail pass on the reasoning-tier mapper, plus a real-tmux verification of `jeo --tmux`.
 - **[0.6.26]** (2026-06-19) — The forge emblem is redrawn again as the mascot crayfish, foregrounding its signature pincer claws (집게).
 - **[0.6.25]** (2026-06-19) — Reasoning works at every thinking level (gajae parity), and the forge emblem is redrawn as the neon-lens coding wizard.
-- **[0.6.24]** (2026-06-19) — `/provider` opens an interactive onboarding selector (OAuth vs API-compatible), and OpenAI-compatible backends gain per-vendor native-reasoning formats.
-- **[0.6.23]** (2026-06-19) — Live reasoning/thinking streams in the TUI across every provider, three new OpenAI-compatible backends (LM Studio, xAI, Kimi) join the auth/discovery/catalog surface, and Gemini gains native function-calling.
 
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->
package/README.zh.md
CHANGED
@@ -2,10 +2,6 @@
 <img src="assets/hero.png" alt="jeo-code 自主编码代理主视觉插图" width="100%" />
 </p>
 
-<p align="center">
-<img src="assets/icon.png" alt="jeo-code icon" width="96" />
-</p>
-
 <h1 align="center">jeo-code (jeo)</h1>
 
 <p align="center">
@@ -204,11 +200,11 @@ CI 通过 `.github/workflows/npm-publish.yml` 发布 — GitHub 发布 release
 ## 更新日志 (Changelog)
 
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.29]** (2026-06-19) — Signature-only thinking-block replay (Anthropic opus-4-7/4-8), plus a tmux mouse-flood memory guard confirming `jeo --tmux` does not leak.
+- **[0.6.28]** (2026-06-19) — Signed thinking-block replay: native reasoning is now sent BACK to providers across steps/turns, restoring multi-step reasoning continuity (gajae parity).
 - **[0.6.27]** (2026-06-19) — Ponytail pass on the reasoning-tier mapper, plus a real-tmux verification of `jeo --tmux`.
 - **[0.6.26]** (2026-06-19) — The forge emblem is redrawn again as the mascot crayfish, foregrounding its signature pincer claws (집게).
 - **[0.6.25]** (2026-06-19) — Reasoning works at every thinking level (gajae parity), and the forge emblem is redrawn as the neon-lens coding wizard.
-- **[0.6.24]** (2026-06-19) — `/provider` opens an interactive onboarding selector (OAuth vs API-compatible), and OpenAI-compatible backends gain per-vendor native-reasoning formats.
-- **[0.6.23]** (2026-06-19) — Live reasoning/thinking streams in the TUI across every provider, three new OpenAI-compatible backends (LM Studio, xAI, Kimi) join the auth/discovery/catalog surface, and Gemini gains native function-calling.
 
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->
package/package.json
CHANGED
package/src/agent/compaction.ts
CHANGED
@@ -78,7 +78,16 @@ const messageTokenCache = new WeakMap<Message, number>();
 export function estimateMessageTokens(msg: Message): number {
   const hit = messageTokenCache.get(msg);
   if (hit !== undefined) return hit;
-
+  let n = estimateTokens(msg.role) + estimateTokens(msg.content) + (msg.images?.length ?? 0) * IMAGE_TOKEN_ESTIMATE + 1;
+  // Native reasoning artifacts (signature / encrypted_content / thought text) are NOT in
+  // `content` but become REAL input tokens once an adapter replays them — count them so
+  // the context meter and compaction trigger stay honest (OpenAI encrypted blobs are KB-scale).
+  // toolUse/toolResults/toolResultExtra are already reflected in `content`, so they are not re-added.
+  for (const a of msg.reasoningArtifacts ?? []) {
+    n += estimateTokens(a.text ?? "") + estimateTokens(a.signature ?? "")
+      + estimateTokens(a.redacted ?? "") + estimateTokens(a.thoughtSignature ?? "")
+      + estimateTokens(a.encrypted ?? "");
+  }
   messageTokenCache.set(msg, n);
   return n;
 }
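To see why the artifact loop matters for the context meter, here is a small self-contained sketch of the same accounting. It is not the package's `estimateMessageTokens`: `roughTokens` is a stand-in chars/4 heuristic (the real `estimateTokens` is not shown in this diff), image accounting is omitted, and `MsgSketch` / `ArtifactFields` are hypothetical types.

```typescript
// Hypothetical, trimmed-down shapes; field names follow the hunk above.
interface ArtifactFields {
  text?: string;
  signature?: string;
  redacted?: string;
  thoughtSignature?: string;
  encrypted?: string;
}

interface MsgSketch {
  role: string;
  content: string;
  reasoningArtifacts?: ArtifactFields[];
}

// ASSUMPTION: a crude chars/4 estimate stands in for the package's estimateTokens.
const roughTokens = (s: string): number => Math.ceil(s.length / 4);

function estimateWithArtifacts(msg: MsgSketch): number {
  // Base cost: role + visible content + 1 token of per-message overhead.
  let n = roughTokens(msg.role) + roughTokens(msg.content) + 1;
  // Artifacts are not part of `content`, but they are sent on replay, so count them.
  for (const a of msg.reasoningArtifacts ?? []) {
    n += roughTokens(a.text ?? "") + roughTokens(a.signature ?? "")
      + roughTokens(a.redacted ?? "") + roughTokens(a.thoughtSignature ?? "")
      + roughTokens(a.encrypted ?? "");
  }
  return n;
}
```

Under this stand-in heuristic, a short assistant reply carrying a 4 KB encrypted blob is estimated at about a thousand tokens more than the same reply without it, which is the kind of gap that would otherwise let compaction trigger too late.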
package/src/agent/engine.ts
CHANGED
@@ -34,11 +34,30 @@ async function invokeCallLlm(history: Message[], options: {
   onRetry?: (attempt: number, err: unknown, delayMs: number) => void;
   onToken?: (delta: string) => void;
   onReasoning?: (delta: string) => void;
+  onReasoningArtifact?: (artifact: import("../ai/types").ReasoningArtifact) => void;
   tools?: import("../ai/types").NativeToolSchema[];
 }): Promise<string> {
   const mod = await import("./loop");
   return mod.callLlm(history, options);
 }
+
+/** Push an assistant turn, attaching the step's reasoning + native replay records when
+ * present. Centralizes the assistant-push sites so reasoning/artifacts attach uniformly
+ * (not just the final reply). Omits empty fields so back-compat serialization and the
+ * identity-keyed token cache are unaffected. */
+function pushAssistantTurn(
+  history: Message[],
+  content: string,
+  reasoning: string,
+  artifacts: import("../ai/types").ReasoningArtifact[],
+  toolUse?: import("../ai/types").ToolUseRecord[],
+): void {
+  const msg: Message = { role: "assistant", content };
+  if (reasoning.trim()) msg.reasoning = reasoning;
+  if (artifacts.length) msg.reasoningArtifacts = artifacts;
+  if (toolUse && toolUse.length) msg.toolUse = toolUse;
+  history.push(msg);
+}
 export interface ToolInvocation {
   tool: string;
   arguments?: Record<string, any>;
@@ -176,6 +195,9 @@ export interface AgentLoopEvents {
   /** Accumulated native reasoning/thinking text so far — drives a transient dimmed
    * "thinking" view. Only requested when a consumer (TUI) attaches. */
   onReasoningStream?(textSoFar: string): void;
+  /** Each provider-native reasoning ARTIFACT as it is captured (signature / thoughtSignature /
+   * reasoning item). Lets the final-reply path (launch.ts) persist artifacts for replay. */
+  onReasoningArtifactStream?(artifact: import("../ai/types").ReasoningArtifact): void;
   /** Step-budget change (gjc-style retry flow): the limit was extended because the
    * turn is making progress. `limit` is the new max; `reason` is display-ready. */
   onBudget?(limit: number, reason: string): void;
@@ -345,7 +367,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   );
   const consolidated = wrapUp.trim();
   if (consolidated) {
-    history
+    pushAssistantTurn(history, consolidated, "", []);
     return finish({
       done: false,
       steps: step,
@@ -493,6 +515,14 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   const onReasoning = ev.onReasoningStream
     ? (delta: string) => { reasonBuf += delta; ev.onReasoningStream!(reasonBuf); }
     : undefined;
+  // Capture provider-native reasoning ARTIFACTS for replay (always — independent of any
+  // TUI display sink). Stays scoped to THIS step so a later consolidation push can't
+  // inherit a prior step's signatures.
+  const artifactBuf: import("../ai/types").ReasoningArtifact[] = [];
+  const onReasoningArtifact = (a: import("../ai/types").ReasoningArtifact) => {
+    artifactBuf.push(a);
+    ev.onReasoningArtifactStream?.(a);
+  };
   let responseText: string;
   try {
     responseText = await invokeCallLlm(history, {
@@ -510,6 +540,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   onUsage: u => { acc.inputTokens += u.inputTokens ?? 0; acc.outputTokens += u.outputTokens ?? 0; sawUsage = true; },
   onToken,
   onReasoning,
+  onReasoningArtifact,
   // Make provider auto-retry visible: previously a rate-limited call sat in a
   // silent backoff wait, then surfaced "auto-retry was exhausted" with no trace
   // of the retries that DID happen.
@@ -604,10 +635,10 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   const trimmed = responseText.trim();
   parseFailures++;
   if (trimmed && (!trimmed.includes("{") || parseFailures > MAX_PARSE_BOUNCES)) {
-    history
+    pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
     return finish({ done: true, steps: step, doneReason: trimmed });
   }
-  history
+  pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
   history.push({
     role: "user",
     content:
@@ -654,7 +685,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
       doneReason: `Stopped: the model returned no valid tool call ${MAX_INVALID_CALLS}× (a JSON reply with no valid "tool" or "tools" field). The selected model may be too small to follow the JSON tool protocol — switch to a stronger model with /model.`,
     });
   }
-  history
+  pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
   history.push({
     role: "user",
     content: `Your last reply had no "tool" or "tools" field. Reply with exactly one JSON object, e.g. {"tool":"find","arguments":{"globPattern":"src/**"}} or {"tools":[{"tool":"read","arguments":{"filePath":"src/main.ts"}}, ...]}.`,
@@ -674,7 +705,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   if (toolCalls.length === 1 && toolCalls[0].tool === "done") {
     if (sawMutation && (!sawVerification || pendingHookFailure !== null) && !donePushbackUsed) {
       donePushbackUsed = true; // second done always passes — escape hatch
-      history
+      pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
       history.push({
         role: "user",
         content: pendingHookFailure !== null
@@ -696,7 +727,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   const nudge = await ev.onBeforeDone((toolCalls[0].arguments?.reason as string) ?? "");
   if (nudge) {
     beforeDoneNudgeUsed = true;
-    history
+    pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
     history.push({ role: "user", content: nudge });
     ev.onNotice?.("done deferred once — final plan reconciliation requested");
     step++;
@@ -709,7 +740,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   if (opts.steer) {
     const pending = opts.steer().map(s => (s ?? "").trim()).filter(Boolean);
     if (pending.length) {
-      history
+      pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
       for (const text of pending) {
         history.push({
           role: "user",
@@ -754,7 +785,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   const lastChance = repeatCount === MAX_REPEAT - 1
     ? "This is your LAST attempt: if you emit the same call again the turn will end. "
     : "";
-  history
+  pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
   history.push({
     role: "user",
     content:
@@ -784,7 +815,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   if (!cycleBounceUsed) {
     cycleBounceUsed = true;
     recentStepSigs.length = 0; // fresh window: the correction earns a real retry
-    history
+    pushAssistantTurn(history, responseText, reasonBuf, artifactBuf);
     history.push({
       role: "user",
       content:
@@ -944,6 +975,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   );
   // Append the batch's hook diagnostics once so the model can self-correct. Two
   // DISTINCT hooks with identical output collapse to one full block + a cross-ref.
+  let hookExtra = "";
   if (hookDiags.length > 0) {
     const seenHookFeedback = new Set<string>();
     const diagLines: string[] = [];
@@ -956,14 +988,28 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
         diagLines.push(`[post-turn hook "${d.run}" — exit ${d.exitCode}]:\n${truncateToolOutput(d.output)}`);
       }
     }
-
+    hookExtra = diagLines.join("\n");
+    resultBlocks.push(hookExtra);
   }
 
-
-
-
-
-
+  // Structured native replay records: stable ids correlate the assistant tool_use
+  // turn with its tool_result user turn (the string `content` stays the source of
+  // truth for display / compaction / fallback adapters).
+  const idFor = (idx: number) => `call_${step}_${idx}`;
+  const toolUse: import("../ai/types").ToolUseRecord[] = indices.map(idx => ({
+    id: idFor(idx),
+    tool: toolCalls[idx].tool,
+    arguments: toolCalls[idx].arguments ?? {},
+  }));
+  const toolResults: import("../ai/types").ToolResultRecord[] = indices.map((idx, i) => ({
+    id: idFor(idx),
+    output: bodies[i],
+    isError: !results[idx].success,
+  }));
+  pushAssistantTurn(history, responseText, reasonBuf, artifactBuf, toolUse);
+  const resultMsg: Message = { role: "user", content: resultBlocks.join("\n\n"), toolResults };
+  if (hookExtra) resultMsg.toolResultExtra = hookExtra;
+  history.push(resultMsg);
 };
 
 if (aborted) {
@@ -1053,7 +1099,7 @@ export async function runAgentLoop(history: Message[], opts: AgentLoopOptions):
   );
   const consolidated = wrapUp.trim();
   if (consolidated) {
-    history
+    pushAssistantTurn(history, consolidated, "", []);
     return finish({
       done: false,
       steps: budget.limit(),
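The id scheme in the `-956,14 +988,28` hunk is what lets an adapter pair each native tool call with its result. A minimal sketch of that pairing follows, under simplifying assumptions: every call in the batch is executed (the real code maps over an `indices` subset), and `ToolCallSketch`, `ToolUseSketch`, `ToolResultSketch`, and `buildReplayRecords` are hypothetical names rather than the package's exports.

```typescript
// Hypothetical record shapes mirroring the fields used in the hunk above.
interface ToolCallSketch {
  tool: string;
  arguments?: Record<string, unknown>;
}
interface ToolUseSketch {
  id: string;
  tool: string;
  arguments: Record<string, unknown>;
}
interface ToolResultSketch {
  id: string;
  output: string;
  isError: boolean;
}

// Build the assistant-side and user-side records for one step.
// The id is derived from (step, call index), so both sides compute the same
// value independently and stay stable across session save/load.
function buildReplayRecords(
  step: number,
  calls: ToolCallSketch[],
  outcomes: { body: string; success: boolean }[],
): { toolUse: ToolUseSketch[]; toolResults: ToolResultSketch[] } {
  const idFor = (idx: number): string => `call_${step}_${idx}`;
  const toolUse = calls.map((c, idx) => ({
    id: idFor(idx),
    tool: c.tool,
    arguments: c.arguments ?? {},
  }));
  const toolResults = outcomes.map((o, idx) => ({
    id: idFor(idx),
    output: o.body,
    isError: !o.success,
  }));
  return { toolUse, toolResults };
}
```

Deriving ids from the step and index, rather than generating random ones, means no extra state has to be threaded between the assistant push and the result push.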
package/src/agent/loop.ts
CHANGED
@@ -26,6 +26,9 @@ export interface ChatOptions {
   onToken?: (delta: string) => void;
   /** Streaming sink for native reasoning/thinking deltas (drives the dimmed live view). */
   onReasoning?: (delta: string) => void;
+  /** Streaming sink for provider-native reasoning ARTIFACTS (signature / thoughtSignature /
+   * reasoning item id+encrypted) — the replay channel, separate from onReasoning. */
+  onReasoningArtifact?: (artifact: import("../ai/types").ReasoningArtifact) => void;
   /** NATIVE tool-calling function declarations (forwarded to capable adapters). */
   tools?: import("../ai/types").NativeToolSchema[];
 }
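Artifacts captured through this replay channel are also what the 0.6.28 changelog's fail-safe strips when a provider rejects them. The sketch below illustrates that "retry once with plain history" rule only; it is not the code in `package/src/ai/providers/errors.ts`, which this section does not show. It is written synchronously for brevity (real adapter calls would be asynchronous), and `HistoryMsg`, `isArtifactRejection`, `stripArtifacts`, `sendWithStripRetry`, and the keyword regex are all assumptions.

```typescript
// Hypothetical message shape: only the fields this sketch needs.
interface HistoryMsg {
  role: string;
  content: string;
  reasoningArtifacts?: unknown[];
}

// ASSUMPTION: the rejection is recognised by HTTP status 400 plus a keyword
// in the error message, following the changelog's wording.
const ARTIFACT_ERROR = /thinking|signature|encrypted|reasoning/i;

function isArtifactRejection(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; message?: unknown };
  return e.status === 400 && typeof e.message === "string" && ARTIFACT_ERROR.test(e.message);
}

// Plain history: keep role and content, drop every replay artifact.
function stripArtifacts(history: HistoryMsg[]): HistoryMsg[] {
  return history.map(m => ({ role: m.role, content: m.content }));
}

// Try with artifacts; on an artifact-related 400, retry exactly once without them.
// Any other error, or a failure of the retry itself, propagates to the caller.
function sendWithStripRetry<T>(history: HistoryMsg[], send: (h: HistoryMsg[]) => T): T {
  try {
    return send(history);
  } catch (err) {
    if (!isArtifactRejection(err)) throw err;
    return send(stripArtifacts(history));
  }
}
```

Capping the retry at one attempt keeps an expired signature from wedging a turn without masking unrelated provider errors.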
package/src/ai/model-catalog.ts
CHANGED
@@ -37,6 +37,8 @@ const STD: ThinkLevel[] = ["minimal", "low", "medium", "high"];
 export const ANTIGRAVITY_MODELS = [
   "claude-opus-4-5-thinking",
   "claude-opus-4-6-thinking",
+  "claude-opus-4-7",
+  "claude-opus-4-7-thinking",
   "claude-opus-4-8",
   "claude-opus-4-8-thinking",
   "claude-sonnet-4-5",
@@ -52,6 +54,7 @@ export const ANTIGRAVITY_MODELS = [
   "gemini-3.1-pro-high",
   "gemini-3.1-pro-low",
   "gpt-oss-120b-medium",
+  "gpt-5.5",
 ] as const;
 
 /** A curated set of common public models with their documented capabilities. */
@@ -62,9 +65,13 @@ export const MODEL_CATALOG: readonly CatalogModel[] = [
   { canonical: "claude-sonnet-4-5", provider: "anthropic", providerModel: "claude-sonnet-4-5-20250929", contextTokens: 200_000, maxOutputTokens: 64_000, thinking: FULL, images: true },
   { canonical: "claude-opus-4-1", provider: "anthropic", providerModel: "claude-opus-4-1-20250805", contextTokens: 200_000, maxOutputTokens: 32_000, thinking: FULL, images: true },
   { canonical: "claude-opus-4-5", provider: "anthropic", providerModel: "claude-opus-4-5-20251101", contextTokens: 200_000, maxOutputTokens: 64_000, thinking: FULL, images: true },
-  // NOTE:
-  //
+  // NOTE: opus-4-7 accepts extended thinking but currently returns 0 thinking tokens
+  // (model-internal, no visible thought). opus-4-8 thinks internally (tokens billed,
+  // signature present) but returns empty thinking text. Both are FULL-capable in the
+  // catalog so the budget is always sent — the nativizable path handles signature-only
+  // artifacts for cross-turn continuity.
   { canonical: "claude-opus-4-6", provider: "anthropic", providerModel: "claude-opus-4-6", contextTokens: 200_000, maxOutputTokens: 64_000, thinking: FULL, images: true },
+  { canonical: "claude-opus-4-7", provider: "anthropic", providerModel: "claude-opus-4-7", contextTokens: 200_000, maxOutputTokens: 64_000, thinking: FULL, images: true },
   { canonical: "claude-opus-4-8", provider: "anthropic", providerModel: "claude-opus-4-8", contextTokens: 200_000, maxOutputTokens: 64_000, thinking: FULL, images: true },
   // OpenAI
   { canonical: "gpt-4o", provider: "openai", providerModel: "gpt-4o", contextTokens: 128_000, maxOutputTokens: 16_384, thinking: [], images: true },
@@ -96,9 +103,9 @@ export const MODEL_CATALOG: readonly CatalogModel[] = [
     canonical: `antigravity/${id}`,
     provider: "antigravity",
     providerModel: id,
-    contextTokens: id.includes("claude") ? 200_000 : id.includes("gemini-3") ? 1_000_000 : 1_000_000,
-    maxOutputTokens: id.includes("claude") ? 64_000 : 65_536,
-    thinking: id.includes("thinking") || id.includes("-high") || id.includes("-low") || id.includes("gemini-3") ? FULL : STD,
+    contextTokens: id.includes("claude") ? 200_000 : id.startsWith("gpt-5") ? 400_000 : id.includes("gemini-3") ? 1_000_000 : 1_000_000,
+    maxOutputTokens: id.includes("claude") ? 64_000 : id.startsWith("gpt-5") ? 128_000 : 65_536,
+    thinking: id.includes("thinking") || id.includes("-high") || id.includes("-low") || id.includes("gemini-3") || id.startsWith("gpt-5") ? FULL : STD,
     images: !id.includes("gpt-oss"),
     company: id.includes("claude") ? "Anthropic via Antigravity" : id.includes("gpt") ? "OpenAI via Antigravity" : "Google Antigravity",
   })),
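The id-pattern matching in the last hunk, and the changelog's "dynamic context-window fallback for uncatalogued ids", can be read as a small lookup. The function below is a sketch under assumptions: the name `fallbackContextTokens` is hypothetical, the match rules copy the ternary chain above, and returning `undefined` for unrecognised ids is a choice made for this example (the Antigravity mapping itself defaults to 1,000,000).

```typescript
// Hypothetical helper: guess a context window for a model id missing from the catalog.
// Sizes follow the changelog entry (claude 200k / gpt-5 400k / gemini-3 1M).
function fallbackContextTokens(modelId: string): number | undefined {
  if (modelId.includes("claude")) return 200_000;
  if (modelId.startsWith("gpt-5")) return 400_000;
  if (modelId.includes("gemini-3")) return 1_000_000;
  // ASSUMPTION: unknown families get no guess, so the caller picks its own default.
  return undefined;
}

// Example: a catalog lookup that falls back to the pattern-based guess.
const knownWindows: Record<string, number> = { "gpt-4o": 128_000 };
function contextWindowFor(modelId: string, defaultTokens: number): number {
  return knownWindows[modelId] ?? fallbackContextTokens(modelId) ?? defaultTokens;
}
```

Keeping the fallback separate from the catalog means a newly released id in a known family gets a sensible window before anyone adds a catalog row for it.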
package/src/ai/model-manager.ts
CHANGED
@@ -332,6 +332,7 @@ async function resolveCall(options: Partial<CallOptions>, kind: "request" | "str
     signal: options.signal,
     reasoningEffort: options.reasoningEffort ?? thinkingToReasoningEffort(config.thinkingLevel),
     onReasoning: options.onReasoning,
+    onReasoningArtifact: options.onReasoningArtifact,
     tools: options.tools,
   };
   // Caller-supplied retry sink rides on the config-derived retry budget so the