npm - jeo-code - Versions diffs - 0.6.0 → 0.6.1 - Mend

jeo-code 0.6.0 → 0.6.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12) hide show

package/CHANGELOG.md +15 -0
package/README.ja.md +9 -1
package/README.ko.md +9 -1
package/README.md +9 -1
package/README.zh.md +9 -1
package/package.json +1 -1
package/src/ai/providers/anthropic.ts +28 -3
package/src/ai/providers/antigravity.ts +6 -0
package/src/ai/providers/openai-responses.ts +14 -1
package/src/commands/launch.ts +137 -27
package/src/skills/catalog.ts +9 -2
package/src/tui/app.ts +9 -0

package/CHANGELOG.md CHANGED Viewed

@@ -6,6 +6,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 The README mirrors the latest 5 entries — regenerate with `bun run changelog:sync`.
+## [0.6.1] - 2026-06-16
+_Live reasoning progress (no more frozen "calling model"), thinking-level fixes for Anthropic/Antigravity, and input-box/Ctrl+O TUI fixes._
+### Added
+- **Live reasoning progress.** Codex/OpenAI reasoning models now stream their thinking into the live frame (`reasoning.summary: "auto"` + `response.reasoning*.delta` events surfaced via `onReasoning`), and the status row reads `reasoning (model)…` / `thinking — reasoning, no token stream yet…` after a silent wait instead of a frozen `calling model (Ns)…`.
+### Fixed
+- Thinking level is now applied to the **Anthropic and Antigravity** providers (it was a silent no-op there).
+- The **input box + caret stay in place after running a command** — no more vanishing box / caret parked at the reservation top.
+- **Skill runs render a compact `[skill]` card** instead of dumping the injected `SKILL.md` into a user box.
+- **Ctrl+O fold toggle** + incremental session durability across interruption.
+### Changed
+- Trimmed `fastThinkingLevelForModel` fallback to the real gap (ponytail pass); added a usage guide + demo video, linked from all READMEs.
 ## [0.6.0] - 2026-06-16
 _TUI quality of life: durable input history (↑ recalls past queries across launches), clean `/resume` rendering, and a scrollable mid-turn Ctrl+O panel._

package/README.ja.md CHANGED Viewed

@@ -28,6 +28,14 @@
 リポジトリ内で `jeo` を実行すると、ファイルを読み・編集し・コマンドを実行してタスクを完了まで進めます — 全ステップがスクロールバック親和なインライン TUI でライブ配信されます。
+## ドキュメント
+📖 **[使い方ガイド](docs/usage-guide.md)** — インストール、TUI操作（↑履歴、Ctrl+O、`!`シェル）、スラッシュコマンド、`/resume`、スペックファーストワークフローをデモ動画付きで解説。
+<video src="https://raw.githubusercontent.com/akillness/jeo-code/main/docs/jeo-code-promo.mp4" controls muted playsinline width="100%"></video>
+> インライン再生されない場合は ▶ [デモ動画を再生/ダウンロード](docs/jeo-code-promo.mp4)。
 ## ハイライト
 - **マルチプロバイダ・単一ループ** — Anthropic / OpenAI(+Codex) / Gemini / Antigravity / Ollama を均一な JSON ツールループで。入力欄から OAuth ログイン(`/provider login`)、モデル選択は即座にデフォルトとして永続化。
@@ -150,11 +158,11 @@ CI は `.github/workflows/npm-publish.yml` で公開します — GitHub リリ
 ## 変更履歴 (Changelog)
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.1]** (2026-06-16) — Live reasoning progress (no more frozen "calling model"), thinking-level fixes for Anthropic/Antigravity, and input-box/Ctrl+O TUI fixes.
 - **[0.6.0]** (2026-06-16) — TUI quality of life: durable input history (↑ recalls past queries across launches), clean `/resume` rendering, and a scrollable mid-turn Ctrl+O panel.
 - **[0.5.16]** (2026-06-16) — `/resume` and Ctrl+O no longer corrupt the TUI — clean screen restore + scrollback expand.
 - **[0.5.15]** (2026-06-16) — `jeo update` now actually upgrades — bare command installs the latest release instead of just printing a manual command.
 - **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
-- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/README.ko.md CHANGED Viewed

@@ -28,6 +28,14 @@
 저장소 안에서 `jeo`를 실행하면 파일을 읽고, 수정하고, 명령을 실행하며 작업을 완료까지 끌고 갑니다 — 모든 스텝이 스크롤백 친화적인 인라인 TUI로 실시간 스트리밍됩니다.
+## 문서
+📖 **[사용 가이드](docs/usage-guide.md)** — 설치, TUI 조작(↑ 이전 쿼리, Ctrl+O, `!` 셸), 슬래시 명령, `/resume`, 스펙 우선 워크플로를 데모 영상과 함께 설명합니다.
+<video src="https://raw.githubusercontent.com/akillness/jeo-code/main/docs/jeo-code-promo.mp4" controls muted playsinline width="100%"></video>
+> 인라인 재생이 안 되면 ▶ [데모 영상 재생/다운로드](docs/jeo-code-promo.mp4).
 ## 하이라이트
 - **멀티 프로바이더, 단일 루프** — Anthropic / OpenAI(+Codex) / Gemini / Antigravity / Ollama를 균일한 JSON 도구 루프로. 입력창에서 바로 OAuth 로그인(`/provider login`), 모델 선택은 즉시 기본값으로 영속.
@@ -150,11 +158,11 @@ CI는 `.github/workflows/npm-publish.yml`로 배포합니다 — GitHub 릴리
 ## 변경 이력 (Changelog)
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.1]** (2026-06-16) — Live reasoning progress (no more frozen "calling model"), thinking-level fixes for Anthropic/Antigravity, and input-box/Ctrl+O TUI fixes.
 - **[0.6.0]** (2026-06-16) — TUI quality of life: durable input history (↑ recalls past queries across launches), clean `/resume` rendering, and a scrollable mid-turn Ctrl+O panel.
 - **[0.5.16]** (2026-06-16) — `/resume` and Ctrl+O no longer corrupt the TUI — clean screen restore + scrollback expand.
 - **[0.5.15]** (2026-06-16) — `jeo update` now actually upgrades — bare command installs the latest release instead of just printing a manual command.
 - **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
-- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/README.md CHANGED Viewed

@@ -28,6 +28,14 @@
 Run `jeo` inside a repository and it reads files, edits them, runs commands, and drives the task to completion — streaming every step live in an inline, scrollback-friendly TUI.
+## Documentation
+📖 **[Usage guide](docs/usage-guide.md)** — install, TUI controls (↑ recall, Ctrl+O, `!` shell), slash commands, `/resume`, and the spec-first workflow, with a demo video.
+<video src="https://raw.githubusercontent.com/akillness/jeo-code/main/docs/jeo-code-promo.mp4" controls muted playsinline width="100%"></video>
+> Demo not playing inline? ▶ [Play / download the demo video](docs/jeo-code-promo.mp4).
 ## Highlights
 - **Multi-provider, one loop** — Anthropic / OpenAI (+Codex) / Gemini / Antigravity / Ollama behind a uniform JSON tool loop. OAuth login from the input box (`/provider login`), every model pick persists as the new default.
@@ -150,11 +158,11 @@ Required npm token permissions (repository secret `NPM_TOKEN`):
 ## Changelog
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.1]** (2026-06-16) — Live reasoning progress (no more frozen "calling model"), thinking-level fixes for Anthropic/Antigravity, and input-box/Ctrl+O TUI fixes.
 - **[0.6.0]** (2026-06-16) — TUI quality of life: durable input history (↑ recalls past queries across launches), clean `/resume` rendering, and a scrollable mid-turn Ctrl+O panel.
 - **[0.5.16]** (2026-06-16) — `/resume` and Ctrl+O no longer corrupt the TUI — clean screen restore + scrollback expand.
 - **[0.5.15]** (2026-06-16) — `jeo update` now actually upgrades — bare command installs the latest release instead of just printing a manual command.
 - **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
-- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/README.zh.md CHANGED Viewed

@@ -28,6 +28,14 @@
 在仓库内运行 `jeo`，它会读取文件、编辑代码、执行命令，并把任务推进到完成 — 每一步都通过滚动友好的内联 TUI 实时呈现。
+## 文档
+📖 **[使用指南](docs/usage-guide.md)** — 安装、TUI 操作（↑ 历史、Ctrl+O、`!` shell）、斜杠命令、`/resume`、规格优先工作流，附演示视频。
+<video src="https://raw.githubusercontent.com/akillness/jeo-code/main/docs/jeo-code-promo.mp4" controls muted playsinline width="100%"></video>
+> 无法内联播放？▶ [播放/下载演示视频](docs/jeo-code-promo.mp4)。
 ## 亮点
 - **多提供商、单一循环** — Anthropic / OpenAI(+Codex) / Gemini / Antigravity / Ollama 统一在一个 JSON 工具循环中。输入框内直接 OAuth 登录(`/provider login`)，模型选择即刻持久化为默认值。
@@ -150,11 +158,11 @@ CI 通过 `.github/workflows/npm-publish.yml` 发布 — GitHub 发布 release
 ## 更新日志 (Changelog)
 <!-- CHANGELOG:START (auto-generated from CHANGELOG.md — run `bun run changelog:sync`) -->
+- **[0.6.1]** (2026-06-16) — Live reasoning progress (no more frozen "calling model"), thinking-level fixes for Anthropic/Antigravity, and input-box/Ctrl+O TUI fixes.
 - **[0.6.0]** (2026-06-16) — TUI quality of life: durable input history (↑ recalls past queries across launches), clean `/resume` rendering, and a scrollable mid-turn Ctrl+O panel.
 - **[0.5.16]** (2026-06-16) — `/resume` and Ctrl+O no longer corrupt the TUI — clean screen restore + scrollback expand.
 - **[0.5.15]** (2026-06-16) — `jeo update` now actually upgrades — bare command installs the latest release instead of just printing a manual command.
 - **[0.5.14]** (2026-06-16) — `jeo --tmux` live-verification harness — repeatable stability + behavior checks.
-- **[0.5.13]** (2026-06-15) — Workflow `/` commands actually run — `/deep-interview`, `/team`, `/ultragoal`, `/ralplan` dispatch by name.
 See [CHANGELOG.md](CHANGELOG.md) for the full history.
 <!-- CHANGELOG:END -->

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "jeo-code",
-  "version": "0.6.0",
+  "version": "0.6.1",
   "description": "Clean, highly optimized AI coding agent using spec-first loop",
   "type": "module",
   "main": "src/cli.ts",

package/src/ai/providers/anthropic.ts CHANGED Viewed

@@ -72,6 +72,18 @@ function anthropicSystemBlocks(
   return blocks;
 }
+/** Anthropic extended-thinking budget by reasoning effort (kept under max_tokens). Off for
+ *  low/minimal/unset effort so /fast and minimal thinking stay non-thinking (cheaper/faster). */
+function anthropicThinkingBudget(effort: CallOptions["reasoningEffort"], maxTokens: number): number | undefined {
+  let budget: number;
+  switch (effort) {
+    case "medium": budget = 4096; break;
+    case "high": budget = 10000; break;
+    default: return undefined;
+  }
+  return Math.min(budget, Math.max(1024, maxTokens - 1024));
+}
 export function anthropicPayload(
   messages: Message[],
   options: CallOptions,
@@ -109,13 +121,24 @@ export function anthropicPayload(
       last.content[last.content.length - 1] = { ...tail, cache_control: { type: "ephemeral" } };
     }
   }
+  const maxTokens = options.maxTokens ?? 4000;
+  const thinkingBudget = anthropicThinkingBudget(options.reasoningEffort, maxTokens);
   const payload: Record<string, unknown> = {
     model,
     messages: anthropicMessages,
-    max_tokens: options.maxTokens ?? 4000,
+    // Extended thinking requires max_tokens strictly above the thinking budget.
+    max_tokens: thinkingBudget !== undefined ? Math.max(maxTokens, thinkingBudget + 1024) : maxTokens,
   };
   if (credential.kind === "oauth") payload.metadata = { user_id: createClaudeCloakingUserId() };
-  if (includeTemperature && options.temperature !== undefined) payload.temperature = options.temperature;
+  if (thinkingBudget !== undefined) {
+    // Apply the thinking level: enable Claude extended thinking (the interleaved-thinking
+    // beta is already in the headers). Extended thinking forbids a custom temperature, so
+    // temperature is only set on the non-thinking path. Previously the thinking level only
+    // changed max_tokens and never reached Claude as actual reasoning depth.
+    payload.thinking = { type: "enabled", budget_tokens: thinkingBudget };
+  } else if (includeTemperature && options.temperature !== undefined) {
+    payload.temperature = options.temperature;
+  }
   if (options.tools?.length) {
     // NATIVE tool-calling: declare jeo's tools as Anthropic functions. tool_choice
     // "auto" keeps prose-salvage reachable and lets the model call `done` (declared as
@@ -232,7 +255,7 @@ export const anthropicAdapter: ProviderAdapter = {
         type?: string;
         index?: number;
         content_block?: { type?: string; name?: string };
-        delta?: { type?: string; text?: string; partial_json?: string; stop_reason?: string };
+        delta?: { type?: string; text?: string; partial_json?: string; thinking?: string; stop_reason?: string };
         message?: { usage?: AnthropicUsage };
         usage?: { output_tokens?: number };
       };
@@ -249,6 +272,8 @@ export const anthropicAdapter: ProviderAdapter = {
       } else if (evt.type === "content_block_delta" && evt.delta?.type === "text_delta" && evt.delta.text) {
         yieldedAny = true;
         yield evt.delta.text;
+      } else if (evt.type === "content_block_delta" && evt.delta?.type === "thinking_delta" && evt.delta.thinking) {
+        options.onReasoning?.(evt.delta.thinking);
       } else if (evt.type === "message_start" && evt.message?.usage) {
         // Cache only — usage is reported ONCE at message_delta so an accumulating
         // sink can't double-count input (and a pre-first-chunk retry that replays

package/src/ai/providers/antigravity.ts CHANGED Viewed

@@ -4,6 +4,7 @@ import type { CallOptions, Message, ProviderAdapter } from "../types";
 import { readSse } from "../sse";
 import { providerHttpError } from "./errors";
 import { serializeToolCalls } from "../../agent/tool-schemas";
+import { geminiThinkingBudget } from "./gemini";
 const ANTIGRAVITY_DAILY_ENDPOINT = "https://daily-cloudcode-pa.googleapis.com";
 const ANTIGRAVITY_SANDBOX_ENDPOINT = "https://daily-cloudcode-pa.sandbox.googleapis.com";
@@ -130,6 +131,11 @@ export function antigravityRequest(messages: Message[], options: CallOptions, cr
   if (options.temperature !== undefined) generationConfig.temperature = options.temperature;
   // Upstream Antigravity strips maxOutputTokens for non-Claude models; do the same.
   if (model.toLowerCase().includes("claude")) generationConfig.maxOutputTokens = options.maxTokens ?? 4000;
+  // Apply the thinking level: antigravity serves Gemini models through CCA, so reuse the
+  // Gemini thinkingConfig budget (off at minimal, scaling with reasoning effort). Without
+  // this the thinking level only changed token budget, never actual reasoning depth.
+  const agThinkingBudget = geminiThinkingBudget(model, options.reasoningEffort);
+  if (agThinkingBudget !== undefined) generationConfig.thinkingConfig = { thinkingBudget: agThinkingBudget };
   const request: Record<string, unknown> = {
     contents: antigravityContents(messages),

package/src/ai/providers/openai-responses.ts CHANGED Viewed

@@ -72,7 +72,9 @@ export function codexResponsesRequest(
   // Map thinkingLevel → reasoning effort for Codex reasoning models (gjc parity).
   // Drop out-of-enum values instead of forwarding them — the backend 400s on unknown efforts.
   if (options.reasoningEffort && VALID_REASONING_EFFORTS.has(options.reasoningEffort)) {
-    payload.reasoning = { effort: options.reasoningEffort };
+    // `summary: "auto"` makes the backend stream reasoning-summary deltas so the live
+    // frame can show the model's thinking instead of a frozen "calling model (Ns)…".
+    payload.reasoning = { effort: options.reasoningEffort, summary: "auto" };
   }
   const accountId = extractChatgptAccountId(token);
   const headers: Record<string, string> = {
@@ -88,6 +90,9 @@ export function codexResponsesRequest(
 export interface ResponsesEvent {
   delta?: string;
+  /** Reasoning-summary text delta (`response.reasoning_summary_text.delta` and the
+   *  Codex backend's `reasoning_text` variant) — streamed live as the model thinks. */
+  reasoningDelta?: string;
   usage?: { inputTokens?: number; outputTokens?: number };
   error?: string;
   /** `response.incomplete` cause (e.g. max_output_tokens) — surfaced when the
@@ -125,6 +130,12 @@ export function parseResponsesEvent(data: string): ResponsesEvent {
     return { toolCallArgsDelta: o.delta, toolCallIndex: o.output_index };
   }
   if (o.type === "response.output_text.delta" && typeof o.delta === "string") return { delta: o.delta };
+  // Reasoning-summary streaming: surface the model's thinking live. Accept the
+  // documented `response.reasoning_summary_text.delta` and the Codex backend's
+  // `response.reasoning_text.delta` (any reasoning*.delta variant) uniformly.
+  if (typeof o.delta === "string" && /^response\.reasoning[a-z_]*\.delta$/.test(o.type ?? "")) {
+    return { reasoningDelta: o.delta };
+  }
   // `response.incomplete` (max_output_tokens / content filter) also carries usage — don't drop it.
   if ((o.type === "response.completed" || o.type === "response.incomplete") && o.response?.usage) {
     return {
@@ -175,6 +186,7 @@ export async function codexResponsesCall(messages: Message[], options: CallOptio
   for await (const data of readSse(response.body)) {
     const ev = parseResponsesEvent(data);
     if (ev.delta) out += ev.delta;
+    if (ev.reasoningDelta) options.onReasoning?.(ev.reasoningDelta);
     accumulateResponsesToolCall(toolAcc, ev);
     if (ev.usage) options.onUsage?.(ev.usage);
     if (ev.incompleteReason) incompleteReason = ev.incompleteReason;
@@ -202,6 +214,7 @@ export async function* codexResponsesStream(
   const toolAcc = new Map<number, { name: string; args: string }>();
   for await (const data of readSse(response.body)) {
     const ev = parseResponsesEvent(data);
+    if (ev.reasoningDelta) options.onReasoning?.(ev.reasoningDelta);
     if (ev.delta) {
       yieldedAny = true;
       yield ev.delta;

package/src/commands/launch.ts CHANGED Viewed

@@ -142,6 +142,12 @@ function fastThinkingLevelForModel(modelId: string): ThinkLevel | undefined {
   const supported = catalogMetadata(modelId)?.thinking ?? [];
   if (supported.includes("minimal")) return "minimal";
   if (supported.includes("low")) return "low";
+  // Fallback for the one thinking-capable family that misses a catalog entry in practice:
+  // the dynamic antigravity `gemini-3.x` variants (gemini-3.5-flash-low/-extra-low, …) the
+  // CCA backend serves but the static catalog can't enumerate. They apply a thinking budget,
+  // so /fast offers `minimal` as the fast level instead of reporting "unsupported". (openai
+  // reasoning models and *-thinking/-high/-low antigravity ids are already catalogued.)
+  if (/gemini-(2\.5|[3-9])/.test(modelId.toLowerCase())) return "minimal";
   return undefined;
 }
@@ -1311,6 +1317,19 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
       base.onToolResult?.(tool, success, output);
     },
   });
+  /** Compose a session-persistence flush into onStep so each completed step is
+   *  written as it lands (durability across mid-turn interruption) without
+   *  disturbing the original onStep sink. */
+  const withStepPersistence = (
+    base: AgentLoopEvents,
+    persist: () => Promise<void>,
+  ): AgentLoopEvents => ({
+    ...base,
+    onStep: async (step: number) => {
+      await base.onStep?.(step);
+      await persist();
+    },
+  });
   /** The Ctrl+O detail block (shared by the prompt-time keypress handler and the
    *  mid-turn TUI binding): full last reply + full last tool output. */
   const composeDetailLines = (): string[] => {
@@ -1372,7 +1391,12 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
   const runTurn = async (
     userInput: string,
     useTui: boolean,
-    images?: ImageAttachment[]
+    images?: ImageAttachment[],
+    // What to show as the user card in the live frame: undefined → the prompt
+    // itself (normal turns); null → suppress it entirely (skill runs, where the
+    // injected SKILL.md is NOT user-authored and the [skill] card already names it,
+    // gjc-style); a string → show that compact label instead of the raw input.
+    opts?: { userCard?: string | null },
   ): Promise<{ done: boolean; steps: number; reply: string; rendered: boolean; usage: string }> => {
     const turnConfig = await readGlobalConfig();
     const activeModel = sessionModel || turnConfig.defaultModel;
@@ -1389,6 +1413,17 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
     // AFTER compaction (which mutates history) and consumed by the post-turn
     // persistence block below.
     let beforeLen = history.length;
+    // Incremental session persistence (durability across mid-turn interruption):
+    // persistTurnTail() flushes history messages added since the last flush — called
+    // right after the user prompt, on every onStep boundary, and once post-turn — so
+    // Ctrl+C / crash / ESC can't lose the prompt + finished steps before /resume.
+    let persistedLen = beforeLen;
+    const persistTurnTail = async (): Promise<void> => {
+      if (!sessionId || history.length <= persistedLen) return;
+      const tail = history.slice(persistedLen);
+      persistedLen = history.length;
+      try { await appendMessages(sessionId, tail, cwd); } catch { /* best-effort */ }
+    };
     let result;
     try {
       // Paint the live frame + spinner the INSTANT the turn is accepted, BEFORE the
@@ -1413,17 +1448,24 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
         tui?.events().onNotice?.(`(compacted ${compRes.removed} older message${compRes.removed === 1 ? "" : "s"})`);
       }
       beforeLen = history.length;
+      persistedLen = beforeLen; // re-baseline after compaction mutated history
       if (images?.length && catalogMetadata(activeModel)?.images === false) {
         const warn = `! ${activeModel} does not advertise image input — sending the attachment anyway.`;
         if (tui) tui.events().onNotice?.(warn);
         else console.log(warn);
       }
       history.push(images?.length ? { role: "user", content: userInput, images } : { role: "user", content: userInput });
+      // Persist the user prompt immediately so an interrupted turn keeps it on disk.
+      await persistTurnTail(); // persist the user prompt immediately
       // Keep the submitted query in scrollback: the prompt that STARTS a turn shows
       // only as the transient HUD turn-title otherwise, which vanishes when the live
       // frame clears at turn-end — so the conversation transcript lost every user
       // prompt. Flush a `user` card (same surface as a mid-turn steer) so it persists.
-      if (tui && userInput.trim()) tui.flushUserCard(userInput);
+      // Flush a `user` card (same surface as a mid-turn steer) so the prompt persists
+      // in scrollback. opts.userCard overrides it: null suppresses the card entirely
+      // (skill runs), a string shows a compact label instead of the raw input.
+      const cardText = opts?.userCard === undefined ? userInput : opts.userCard;
+      if (tui && cardText && cardText.trim()) tui.flushUserCard(cardText);
       tui?.setContextUsage(historyTokens(history), contextTokens);
       // Per-turn steering inbox (gjc parity): additional queries typed mid-turn land
@@ -1548,7 +1590,7 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
           maxTokens: sessionThinking ? thinkingMaxTokens(sessionThinking) : undefined,
           signal: ac.signal,
           steer: drainSteer,
-          events: wrapEvents({ ...withToolDetailCapture(tui ? tui.events() : streamEvents), onBeforeDone }, opik),
+          events: wrapEvents(withStepPersistence({ ...withToolDetailCapture(tui ? tui.events() : streamEvents), onBeforeDone }, persistTurnTail), opik),
         });
         if (result.done && looksLikeSkillEcho(result.doneReason ?? "", resolvedSkills)) {
           history.push({
@@ -1603,13 +1645,12 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
       || (result.done
         ? `(done in ${result.steps} step${result.steps === 1 ? "" : "s"} — the model returned no summary)`
         : `(reached the ${result.steps}-step limit without signaling done)`);
-    // Full-fidelity persistence: append every message the engine added this turn
-    // (user prompt + intermediate tool-call/tool-result turns), then the final reply.
+    // Persist any messages this turn produced that onStep hasn't flushed yet (the
+    // final step's tool-call/result + any retry-loop messages), then the reply.
+    // Incremental onStep flushes already wrote the prompt and completed steps, so
+    // this only covers the tail — net content is the full turn either way.
     try {
-      if (sessionId) {
-        // One batched fs append for the whole turn (was: one awaited append per message).
-        await appendMessages(sessionId, history.slice(beforeLen), cwd);
-      }
+      await persistTurnTail();
       history.push({ role: "assistant", content: reply });
       if (sessionId) await appendMessage(sessionId, { role: "assistant", content: reply }, cwd);
       if (tui) tui.finish(reply);
@@ -1729,7 +1770,7 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
         console.log(`▶ Running skill: ${inv.skill.name}${inv.intent ? ` — ${inv.intent}` : ""}`);
       }
       const task = buildSkillTask(inv.skill, inv.intent, inv.invokedAs);
-      const { reply, rendered, usage } = await runTurn(task, useOneShotTui);
+      const { reply, rendered, usage } = await runTurn(task, useOneShotTui, undefined, { userCard: null });
       if (!rendered) console.log(stripMarkdown(renderMarkdownTables(reply)) + usage);
       else if (usage) console.log(usage.trim());
     };
@@ -1837,7 +1878,7 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
     // starts — skill name, resolved SKILL.md path, and the prompt size.
     {
       const card = formatForgeBox(
-        { title: "[skill]", lines: skillInvocationCard(skill) },
+        { title: "[skill]", lines: skillInvocationCard(skill, intent) },
         { width: Math.min(100, Math.max(40, (process.stdout.columns ?? 80) - 2)), unicode: supportsUnicode(), paint: accentPaint(uiTheme), paintShadow: accentShadowPaint(uiTheme), color: uiTheme.color },
       );
       logLines(card);
@@ -1918,7 +1959,7 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
       // final reply is the skill's result.
       if (!useTui) console.log(`▶ Running skill: ${skill.name}${intent ? ` — ${intent}` : ""}`);
       const task = buildSkillTask(skill, intent, invokedAs);
-      const { reply, rendered, usage } = await runTurn(task, useTui);
+      const { reply, rendered, usage } = await runTurn(task, useTui, undefined, { userCard: null });
       if (!rendered) console.log(`jeo> ${stripMarkdown(renderMarkdownTables(reply))}${usage}`);
       else if (usage) console.log(usage.trim());
     }
@@ -2299,6 +2340,13 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
     }
   };
   let previewPending = false;
+  // Idle-prompt Ctrl+O detail panel: a REVERSIBLE, scrollable overlay drawn inside
+  // the footer reservation. Toggle open/closed with Ctrl+O; ↑↓/PgUp/PgDn scroll so
+  // long/CJK content is fully reachable (no "… N more" clip). null = closed.
+  let promptHistoryLines: string[] | null = null;
+  let promptHistoryScroll = 0;
+  let promptHistoryMaxScroll = 0;
+  let promptHistoryPage = 1;
   // Inline boxed-footer rendering with a FIXED reservation (the "@-mention typing
   // pushes the box down" fix). The footer reserves its full `footerRows` height
@@ -2450,6 +2498,44 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
     const preview = (slash.length ? slash : args).map(l => chalk.gray(truncateAnsi(l, cols)));
     return [statusBarLine(cols), "", ...input, ...preview].slice(0, footerRows);
   };
+  // Render the reversible Ctrl+O detail panel into the footer reservation: a status
+  // bar, a title (with scroll hint when needed), then a windowed slice of the detail
+  // with ↑/↓ counters. Recomputes the scroll bound/page size each paint so scroll
+  // keys can clamp correctly. Mirrors the mid-turn LaunchTui panel.
+  const historyPreviewLines = (detail: string[]): string[] => {
+    const cols = Math.max(24, (process.stdout.columns ?? 80) - 1);
+    const physical = detail.flatMap(line => line.split("\n")).map(line => truncateAnsi(line, cols));
+    const bodyLimit = Math.max(1, footerRows - 2); // status bar + title rows
+    const scrollable = physical.length > bodyLimit;
+    const cap = scrollable ? Math.max(1, bodyLimit - 2) : bodyLimit;
+    promptHistoryPage = cap;
+    promptHistoryMaxScroll = Math.max(0, physical.length - cap);
+    if (promptHistoryScroll > promptHistoryMaxScroll) promptHistoryScroll = promptHistoryMaxScroll;
+    const hint = scrollable ? "· ↑↓/PgUp/PgDn scroll · Ctrl+O closes" : "· Ctrl+O closes";
+    const title = `${uiAccent("history")} ${chalk.dim(hint)}`;
+    let body: string[];
+    if (!scrollable) {
+      body = physical;
+    } else {
+      const start = promptHistoryScroll;
+      const above = start;
+      const reserveTop = above > 0 ? 1 : 0;
+      let innerLimit = Math.max(1, bodyLimit - reserveTop - 1);
+      let win = physical.slice(start, start + innerLimit);
+      let below = physical.length - (start + win.length);
+      if (below === 0) {
+        innerLimit = Math.max(1, bodyLimit - reserveTop);
+        win = physical.slice(start, start + innerLimit);
+        below = physical.length - (start + win.length);
+      }
+      body = [];
+      if (above > 0) body.push(chalk.dim(`↑ ${above} more above`));
+      body.push(...win);
+      if (below > 0) body.push(chalk.dim(`↓ ${below} more below`));
+    }
+    footerCursor = { row: Math.min(1, footerRows - 1), col: 1 };
+    return [statusBarLine(cols), title, ...body].slice(0, footerRows);
+  };
   const drawFooter = (lines: string[]) => {
     if (!previewArmed || footerRendered === 0) return;
     // ALWAYS paint exactly footerRendered rows so the reservation is fully covered
@@ -2896,24 +2982,41 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
         return;
       }
       if (!previewArmed || pickerActive) return;
-      // Ctrl+O: expand the last assistant response into scrollback for a full,
-      // scrollable read. The live-turn TUI path uses LaunchTui.showDetail(); this
-      // idle-prompt path prints the same content above a cleanly re-armed footer.
-      // (Cmd+O is intercepted by macOS/terminal and never reaches the app.)
+      // Ctrl+O: toggle a reversible, scrollable detail panel inside the footer
+      // (expand on the first press, FOLD on the next). ↑↓/PgUp/PgDn scroll it while
+      // open — long/CJK content is fully reachable, nothing is clipped. The live-turn
+      // path uses LaunchTui.showDetail(). (Cmd+O is intercepted by the terminal.)
       if (key?.ctrl && key.name === "o") {
+        if (promptHistoryLines) {
+          promptHistoryLines = null; // fold
+          drawFooter(previewLines(typedLine, navIdx));
+          return;
+        }
         const detail = composeDetailLines();
         if (detail.length === 0) return;
-        // Expand the last response into scrollback CLEANLY instead of cramming it
-        // into the ~10-row footer reservation (which clipped long/CJK content and
-        // could garble the box). Disarm → print full detail → re-arm + repaint is
-        // the same path the resize handler/main loop use, so the input box and the
-        // typed draft restore without corruption.
-        disarmPreview();
-        logLines(detail);
-        armPreview();
-        drawFooter(previewLines(typedLine, navIdx));
+        promptHistoryLines = detail;
+        promptHistoryScroll = 0;
+        drawFooter(historyPreviewLines(detail));
         return;
       }
+      // While the detail panel is open, arrows / PgUp / PgDn scroll it instead of
+      // navigating slash/history; every other key (below) closes it.
+      if (promptHistoryLines && key && (key.name === "up" || key.name === "down" || key.name === "pageup" || key.name === "pagedown")) {
+        const page = key.name === "pageup" || key.name === "pagedown";
+        const dir = key.name === "up" || key.name === "pageup" ? -1 : 1;
+        const step = page ? Math.max(1, promptHistoryPage - 1) : 1;
+        const next = Math.min(promptHistoryMaxScroll, Math.max(0, promptHistoryScroll + dir * step));
+        if (next !== promptHistoryScroll) {
+          promptHistoryScroll = next;
+          drawFooter(historyPreviewLines(promptHistoryLines));
+        }
+        return;
+      }
+      if (promptHistoryLines) {
+        // Any other key dismisses the panel, then falls through to normal handling.
+        promptHistoryLines = null;
+        drawFooter(previewLines(typedLine, navIdx));
+      }
       // Ctrl+V: attach a clipboard IMAGE to the next message. Terminal text paste
       // never arrives as a ctrl+v keypress (it streams as plain stdin data), so this
       // binding is image-only; when the clipboard holds no image it's a silent no-op.
@@ -2956,7 +3059,13 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
         if (!previewArmed) return;
         try {
           if (key && (key.name === "return" || key.name === "enter")) {
-            drawFooter([]);
+            // Redraw the REAL box (not drawFooter([]), which erases it): this runs in a
+            // setImmediate and can fire AFTER the loop already re-armed + repainted the box
+            // for the next prompt (slash command / turn). An empty draw there wiped the fresh
+            // box and parked the cursor at the reservation top — the "input box vanishes and
+            // the caret leaves the prompt after a command" bug. Drawing previewLines keeps the
+            // box present and the caret in it whether this fires before submit or after re-arm.
+            drawFooter(previewLines(typedLine, navIdx));
             return;
           }
           // Arrow up/down: move the highlight over the slash keyword preview list.
@@ -3016,7 +3125,7 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
       try {
         disarmPreview();
         armPreview();
-        drawFooter(previewLines(typedLine, navIdx));
+        drawFooter(promptHistoryLines ? historyPreviewLines(promptHistoryLines) : previewLines(typedLine, navIdx));
       } catch { /* ignore resize render races */ }
     };
     process.stdout.on("resize", idleResizeHandler);
@@ -3055,6 +3164,7 @@ export async function runLaunchCommand(args: string[]): Promise<void> {
       // window where the box is gone and keystrokes echo nowhere (the "no response
       // after the result" gap). readline's own echo stays gated while armed.
       armPreview();
+      promptHistoryLines = null; // each fresh prompt starts with the Ctrl+O panel closed
       if (prefilledLine) {
         rli.line = prefilledLine;
         rli.cursor = prefilledLine.length;

package/src/skills/catalog.ts CHANGED Viewed

@@ -450,13 +450,20 @@ export async function loadSkills(cwd: string = process.cwd()): Promise<SkillDoc[
 /** gjc-style skill-invocation card body (the `[skill]` block shown in the TUI
  *  when `$name`//skill runs): name, resolved SKILL.md path (or the bundled
  *  module path), and the prompt size actually injected. Pure — testable. */
-export function skillInvocationCard(skill: SkillDoc): string[] {
+export function skillInvocationCard(skill: SkillDoc, intent?: string): string[] {
   const promptLines = (skill.raw ?? skill.details ?? "").split("\n").filter(l => l.trim().length > 0).length;
   // jeo-ref tree-connector detail: the skill name leads, resolved metadata hangs
-  // off ├─/└─ connectors so the card scans like the reference's Skill panel.
+  // off ├─/└─ connectors so the card scans like the reference's Skill panel. The
+  // intent (when given) is shown here so it isn't lost — the injected SKILL.md is
+  // NOT echoed as a user box (gjc-style: compact card, not the raw doc).
+  const trimmedIntent = intent?.trim();
+  const intentLine = trimmedIntent
+    ? [`├─ intent: ${trimmedIntent.length > 88 ? `${trimmedIntent.slice(0, 87)}…` : trimmedIntent}`]
+    : [];
   return [
     `Skill: ${skill.name}`,
     `├─ path: ${skill.sourcePath ?? `(bundled) src/prompts/skills/${skill.name}/SKILL.md`}`,
+    ...intentLine,
     `└─ prompt: ${promptLines} lines`,
   ];
 }

package/src/tui/app.ts CHANGED Viewed

@@ -766,6 +766,15 @@ export class LaunchTui {
       // (e.g. "rate limited (HTTP 429) — auto-retry #2 in 4s") instead of an opaque
       // ever-growing "calling model (18.4s)…".
       if (this.retryNotice) return `${this.retryNotice} (${elapsed}s)`;
+      // Reasoning is streaming → the live thought block already shows it; label the row.
+      if (this.streamingThought.trim() || this.streamingReasoning.trim()) {
+        return `reasoning (${this.footer.model}) (${elapsed}s)…`;
+      }
+      // No tokens after a few seconds: the model is almost certainly reasoning
+      // server-side (e.g. OpenAI hidden reasoning), NOT hung — say so instead of a
+      // frozen "calling model …" so a long silent wait still reads as progress.
+      const waited = this.currentStepStartedAt ? (Date.now() - this.currentStepStartedAt) / 1000 : 0;
+      if (waited >= 8) return `thinking (${this.footer.model}) — reasoning, no token stream yet (${elapsed}s)…`;
       return `calling model (${this.footer.model}) (${elapsed}s)…`;
     }
     if (running) {