kimiflare 0.10.0 → 0.12.0

package/README.md CHANGED
@@ -11,36 +11,30 @@
11
11
  </p>
12
12
 
13
13
  <p align="center">
14
- A terminal coding agent powered by <strong><a href="https://developers.cloudflare.com/workers-ai/models/kimi-k2.6/">Kimi-K2.6</a></strong> on Cloudflare Workers AI. Moonshot's 1T-parameter open-source model runs directly on your Cloudflare account. You bring the token, your traffic goes straight to Cloudflare.
14
+ <strong>A terminal coding agent powered by <a href="https://developers.cloudflare.com/workers-ai/models/kimi-k2.6/">Kimi-K2.6</a> on Cloudflare Workers AI.</strong><br>
15
+ Moonshot's 1T-parameter open-source model, running directly on your Cloudflare account.
15
16
  </p>
16
17
 
17
- ```
18
- $ kimiflare
19
- kimiflare · /help for commands · ctrl-c to exit
20
-
21
- what files are here?
22
- ✓ glob(*)
23
- /Users/you/proj/package.json
24
- /Users/you/proj/src/index.ts
25
- ...
26
-
27
- › add a /health endpoint to server.ts
28
- ✓ read(src/server.ts)
29
- ◐ edit src/server.ts
30
- ─── permission requested ──────────────────
31
- @@ -42,6 +42,10 @@
32
- app.get('/', …)
33
- + app.get('/health', (_, res) => res.json({ ok: true }))
34
- ─────────────────────────────────────────────
35
- [Allow once] [Allow for session] [Deny]
36
- ```
18
+ <p align="center">
19
+ <img src="docs/screenshot.png" alt="kimiflare TUI" width="900">
20
+ </p>
21
+
22
+ ## Why kimiflare
37
23
 
38
- ## Install
24
+ - **262k context window** — Read entire modules, large configs, and full stack traces without the model losing track.
25
+ - **Image understanding** — Drop image paths into your prompt (PNG, JPG, WebP, GIF, BMP). The model sees them inline — great for UI reviews, diagrams, screenshots, and mockups.
26
+ - **Direct to Cloudflare** — No AI Gateway, no proxy, no OpenAI SDK. Your traffic goes straight to Workers AI from your account.
27
+ - **Plan mode** — Ask the agent to research and produce a plan without touching your filesystem. Review it, then exit plan mode to execute.
28
+
29
+ ## Quick start
39
30
 
40
31
  ```sh
41
32
  npm install -g kimiflare
33
+ kimiflare
42
34
  ```
43
35
 
36
+ On first run, an interactive onboarding wizard asks for your Cloudflare Account ID and API Token. That's it — you're ready.
37
+
44
38
  Or run without installing:
45
39
 
46
40
  ```sh
@@ -49,6 +43,25 @@ npx kimiflare
49
43
 
50
44
  Requires Node.js ≥ 20.
51
45
 
46
+ ## Features
47
+
48
+ | Feature | What it does |
49
+ |---------|-------------|
50
+ | **Plan / Edit / Auto modes** | `plan` blocks all mutating tools for safe research. `edit` (default) prompts per mutating call. `auto` approves everything for trusted tasks. |
51
+ | **Live task panel** | For multi-step work, the agent publishes a task list with progress icons (■ active, ☐ pending, ✓ done), elapsed time, and token deltas. |
52
+ | **14 terminal themes** | dark, light, high-contrast, dracula, nord, one-dark, monokai, solarized-dark/light, tokyo-night, gruvbox-dark/light, catppuccin-mocha, rose-pine. Interactive picker with live preview (`Ctrl+T`). |
53
+ | **Paste collapse** | Large pastes (≥200 chars or ≥2 newlines) collapse to `[pasted N lines #id]`. Full content still goes to the model — scrollback stays clean. |
54
+ | **Type-ahead queue** | Type your next prompt while the model is still working. Queued prompts show as `⏳ …` and fire in order. `Ctrl-C` aborts current + clears queue. |
55
+ | **Auto-compaction** | At ~80% context usage, kimiflare nudges you to run `/compact`. It summarizes older turns into a dense summary, keeping the last 4 turns intact. |
56
+ | **Streaming reasoning** | Toggle the model's chain-of-thought with `/reasoning` or `Ctrl-R`. See how it thinks in real time. |
57
+ | **Image understanding** | Drop image paths (PNG, JPG, WebP, GIF, BMP up to 5 MB) into any prompt. The model sees them inline — perfect for UI reviews, diagrams, and screenshots. |
58
+ | **Live cost tracking** | Status bar shows real-time cost based on Cloudflare pricing: `$0.95/M input`, `$0.16/M cached`, `$4.00/M output`. |
59
+ | **Session persistence** | Every turn is auto-saved. `/resume` lists past sessions (with message counts) in a paginated picker. |
60
+ | **Smart permissions** | Bash session-allow is keyed by the first token (e.g., allow all `git` commands). Write/edit show a unified diff before you approve. |
61
+ | **Project context (`/init`)** | Scans your repo and writes a concise `KIMI.md` — build commands, layout, conventions. Auto-loaded on every launch. |
62
+ | **Co-author auto-append** | Detects `git commit` commands and auto-injects `Co-authored-by: kimiflare <kimiflare@proton.me>`. |
63
+ | **Resilient transport** | Retries Cloudflare capacity errors (code 3040) and 5xx with exponential backoff up to 5 attempts. |
64
+
52
65
  ## Configure
53
66
 
54
67
  Get credentials from Cloudflare:
@@ -79,50 +92,98 @@ chmod 600 ~/.config/kimiflare/config.json
79
92
 
80
93
  ## Usage
81
94
 
95
+ ### Interactive TUI
96
+
82
97
  ```sh
83
- kimiflare # interactive TUI
84
- kimiflare -p "summarize PLAN.md" # one-shot, streams answer to stdout
85
- kimiflare -p "..." --dangerously-allow-all # auto-approve mutating tools (for scripts)
98
+ kimiflare # launch TUI
86
99
  kimiflare --model @cf/moonshotai/kimi-k2.6 # override model
87
- kimiflare --reasoning # (print mode) stream chain-of-thought to stderr
88
100
  ```
89
101
 
90
- Interactive slash commands:
102
+ ### Print mode (one-shot, non-interactive)
103
+
104
+ ```sh
105
+ kimiflare -p "summarize PLAN.md" # stream answer to stdout
106
+ kimiflare -p "..." --dangerously-allow-all # auto-approve mutating tools (for scripts)
107
+ kimiflare -p "..." --reasoning # include chain-of-thought in stderr
108
+ ```
109
+
110
+ ### Image understanding
111
+
112
+ Reference image files directly in your prompt — the model sees them inline:
113
+
114
+ ```sh
115
+ kimiflare
116
+ › fix the layout bug in this screenshot docs/bug.png
117
+ › convert this mockup design.png to Tailwind HTML
118
+ › explain this architecture diagram.png
119
+ ```
120
+
121
+ Supported formats: PNG, JPG, JPEG, WebP, GIF, BMP (up to 5 MB each, 10 per message).
91
122
 
92
- | Command | Effect |
93
- |-----------------------------|---------------------------------------------------------------------------------|
94
- | `/mode edit\|plan\|auto` | Switch mode. `edit` prompts for permission (default), `plan` is read-only research, `auto` auto-approves every tool call. |
95
- | `/plan` `/auto` `/edit` | Shortcuts for the three modes. |
123
+ ### CLI flags
124
+
125
+ | Flag | Short | Description |
126
+ |------|-------|-------------|
127
+ | `--print <prompt>` | `-p` | One-shot mode: send prompt, stream reply, exit |
128
+ | `--model <id>` | `-m` | Model ID (default: `@cf/moonshotai/kimi-k2.6`) |
129
+ | `--dangerously-allow-all` | — | Auto-approve every permission prompt (print mode only) |
130
+ | `--reasoning` | — | Stream chain-of-thought to stderr (print mode only) |
131
+ | `--version` | `-V` | Show version |
132
+ | `--help` | `-h` | Show help |
133
+
134
+ ## Slash commands
135
+
136
+ | Command | Effect |
137
+ |---------|--------|
138
+ | `/mode edit\|plan\|auto` | Switch mode. `edit` prompts for permission (default), `plan` is read-only research, `auto` auto-approves every tool call. |
139
+ | `/plan` `/auto` `/edit` | Shortcuts for the three modes. |
96
140
  | `/thinking low\|medium\|high` | Reasoning effort. `low` = fastest, shallow; `medium` = balanced (default); `high` = deepest, slowest. Saved to config. |
97
- | `/theme NAME` | Switch color scheme: `dark` (default), `light` (bright terminals), `high-contrast`. Saved to config. |
98
- | `/resume` | Pick a past conversation to restore. |
99
- | `/compact` | Summarize older turns to free context. Suggested automatically at ~80% full. |
100
- | `/init` | Scan the repo and write a `KIMI.md` so future agents have project context. |
101
- | `/reasoning` | Toggle chain-of-thought display. |
102
- | `/clear` | Reset the current conversation. |
103
- | `/cost` `/model` `/update` | Info commands. |
104
- | `/logout` | Clear saved credentials. |
105
- | `/help` `/exit` | List commands / quit. |
106
-
107
- Keys: `Shift+Tab` cycles mode · `Ctrl-R` toggles reasoning · `Ctrl-O` toggles verbose tool output · `Ctrl-C` interrupts an in-flight turn (press again to exit) · `↑`/`↓` walks prompt history.
108
-
109
- Editing keys (macOS):
110
-
111
- - `⌥←` / `⌥→` — jump word left/right (also works with `Esc b` / `Esc f`)
112
- - `⌘←` / `⌘→` — jump to start / end of line (in iTerm2's default profile; in Terminal.app you may need to map these to send `Ctrl-A` / `Ctrl-E`)
113
- - `⌥⌫` — delete word backward
114
- - `⌘⌫` — delete to start of line (iTerm2 sends this as `Ctrl-U`; map in Terminal.app if needed)
115
- - `⌥⌦` delete word forward
116
- - `Ctrl-A` / `Ctrl-E` — start / end of line (always works)
117
- - `Ctrl-W` / `Ctrl-U` / `Ctrl-K` delete word backward / to start of line / to end of line
118
-
119
- ### Modes
141
+ | `/theme` | Interactive theme picker with live preview (`Ctrl+T`). Saved to config. |
142
+ | `/theme NAME` | Set theme by name directly. |
143
+ | `/resume` | Pick a past conversation to restore. |
144
+ | `/compact` | Summarize older turns to free context. Suggested automatically at ~80% full. |
145
+ | `/init` | Scan the repo and write a `KIMI.md` so future agents have project context. |
146
+ | `/reasoning` | Toggle chain-of-thought display. |
147
+ | `/clear` | Reset the current conversation. |
148
+ | `/cost` | Show token usage for the current turn. |
149
+ | `/model` | Show current model. |
150
+ | `/update` | Check for updates manually. |
151
+ | `/logout` | Clear saved credentials. |
152
+ | `/help` | List all commands. |
153
+ | `/exit` | Quit. |
154
+
155
+ ## Keyboard shortcuts
156
+
157
+ ### Global
158
+
159
+ | Shortcut | Action |
160
+ |----------|--------|
161
+ | `Ctrl+C` | Interrupt current turn (press again to exit) |
162
+ | `Ctrl+R` | Toggle reasoning display |
163
+ | `Ctrl+O` | Toggle verbose tool output |
164
+ | `Ctrl+T` | Open theme picker |
165
+ | `Shift+Tab` | Cycle mode (edit → plan → auto) |
166
+ | `↑` / `↓` | Walk prompt history |
167
+
168
+ ### Editing (macOS / Linux)
169
+
170
+ | Shortcut | Action |
171
+ |----------|--------|
172
+ | `⌥←` / `⌥→` | Jump word left/right |
173
+ | `⌘←` / `⌘→` | Jump to start / end of line |
174
+ | `⌥⌫` | Delete word backward |
175
+ | `⌘⌫` | Delete to start of line |
176
+ | `⌥⌦` | Delete word forward |
177
+ | `Ctrl+A` / `Ctrl+E` | Start / end of line |
178
+ | `Ctrl+W` / `Ctrl+U` / `Ctrl+K` | Delete word backward / to start / to end of line |
179
+
180
+ ## Modes
120
181
 
121
182
  - **edit** — default. The agent calls tools freely for read-only work; mutating tools (`write`, `edit`, `bash`) pause for your approval.
122
183
  - **plan** — read-only. Mutating tools are hard-blocked. Ask "plan a refactor" and the agent will investigate and produce a plan without touching the filesystem. Exit plan mode to execute.
123
184
  - **auto** — autonomous. Every tool call is auto-approved. Use for trusted, well-scoped tasks.
124
185
 
125
- ### Thinking level (quality vs speed)
186
+ ## Thinking level (quality vs speed)
126
187
 
127
188
  Kimi-K2.6 always reasons, but you can cap the effort:
128
189
 
@@ -132,52 +193,26 @@ Kimi-K2.6 always reasons, but you can cap the effort:
132
193
 
133
194
  Set with `/thinking medium` (persists), or per-launch via `KIMI_REASONING_EFFORT=high`.
134
195
 
135
- ### Type-ahead queue
136
-
137
- You can type the next prompt while the model is still executing. Submitted prompts show up as `⏳ …` and fire in order as each turn completes. `Ctrl-C` aborts the current turn and clears the queue.
138
-
139
- ### Session persistence
140
-
141
- Sessions are saved to `~/.local/share/kimiflare/sessions/` after each turn. `/resume` lists the most recent (with first prompt + message count) so you can pick one up later.
142
-
143
- ### Task panel
144
-
145
- For multi-step requests, the agent can publish a live task list via the `tasks_set` tool. The panel shows progress inline with status icons (`■` active, `☐` pending, `✓` done), elapsed time, and tokens consumed for the current task batch. Press `Ctrl-O` while a turn is running to switch tool output between compact (first line) and verbose (full output) modes.
146
-
147
- ### Paste collapse
148
-
149
- Paste a large block (≥ 200 chars or ≥ 3 newlines in one paste) into the prompt and the input collapses it to `[pasted N lines #id]`. The full content still goes to the model on submit — only the on-screen display and chat history are collapsed, so scrollback doesn't get buried by a wall of code.
150
-
151
- ### Project context (KIMI.md)
152
-
153
- Run `/init` inside a repo and kimiflare scans the project (reads `package.json`, `README`, source layout, etc.) and writes a concise `KIMI.md` at the repo root — project overview, build/test commands, conventions, quirks. On every subsequent launch in that directory, `KIMI.md` (or `KIMIFLARE.md` or `AGENT.md`, whichever exists) is auto-loaded into the system prompt so the agent already "knows" the project. If the file already exists, `/init` refuses so you don't overwrite hand-edited context.
154
-
155
- ## Why
156
-
157
- - **262k context.** Read entire modules without pagination.
158
- - **Native tool use.** File I/O, shell, globs, grep, web fetch — all wired up, with per-call approval for anything mutating.
159
- - **Streaming reasoning + content.** The model's chain-of-thought streams separately; toggle with `/reasoning` or `Ctrl-R`.
160
- - **Pay your own way.** Your Cloudflare account, your credits, your rate limits. `$0.95 / M input`, `$0.16 / M cached input`, `$4.00 / M output`. The bottom status line shows live cost.
161
-
162
196
  ## Tools
163
197
 
164
198
  All tool calls show inline; mutating ones require per-call approval the first time, with an option to allow for the rest of the session.
165
199
 
166
- | Tool | Permission | What it does |
167
- |-------------|------------|--------------|
168
- | `read` | auto | Read a text file (≤ 2MB) with optional line range. |
169
- | `write` | prompt | Create or overwrite a file. Shows a unified diff before you approve. |
170
- | `edit` | prompt | Replace an exact substring. Fails unless `old_string` is unique (or `replace_all=true`). |
171
- | `bash` | prompt | Run a shell command via `bash -lc`. Session-allow is keyed by the first token of the command. |
172
- | `glob` | auto | Match files by pattern (`**/*.ts`), sorted by mtime. |
173
- | `grep` | auto | Regex search. Uses `rg` if installed; falls back to a JS walk. |
174
- | `web_fetch` | auto | Fetch a URL, convert HTML → markdown (≤ 100KB). |
200
+ | Tool | Permission | What it does |
201
+ |------|------------|--------------|
202
+ | `read` | auto | Read a text file (≤ 2MB) with optional line range. |
203
+ | `write` | prompt | Create or overwrite a file. Shows a unified diff before you approve. |
204
+ | `edit` | prompt | Replace an exact substring. Fails unless `old_string` is unique (or `replace_all=true`). |
205
+ | `bash` | prompt | Run a shell command via `bash -lc`. Session-allow is keyed by the first token of the command. |
206
+ | `glob` | auto | Match files by pattern (`**/*.ts`), sorted by mtime. |
207
+ | `grep` | auto | Regex search. Uses `rg` if installed; falls back to a JS walk. |
208
+ | `web_fetch` | auto | Fetch a URL, convert HTML → markdown (≤ 100KB). |
209
+ | `tasks_set` | auto | Publish a live task list for multi-step work. |
175
210
 
176
211
  ## How it works
177
212
 
178
213
  ```
179
214
  ┌───────────────────────────────────────────────────────────┐
180
- │ kimiflare (Node + Ink TUI)
215
+ │ kimiflare (Node.js TUI)
181
216
  user ─▶ │ │
182
217
  │ user msg ─▶ agent loop ─▶ runKimi() ──[POST SSE]──▶ │
183
218
  │ ▲ │
@@ -204,9 +239,23 @@ npm run build
204
239
  npm link # or: ln -s "$PWD/bin/kimiflare.mjs" ~/.local/bin/kimiflare
205
240
  ```
206
241
 
207
- ## Status
242
+ Scripts:
243
+ - `npm run build` — bundle with tsup (`dist/` + `bin/kimiflare.mjs`)
244
+ - `npm run dev` — run via tsx (`tsx src/index.tsx`)
245
+ - `npm run typecheck` — `tsc --noEmit`
246
+ - `npm start` — run compiled bin
247
+
248
+ ## Contributing
249
+
250
+ Contributions are welcome!
208
251
 
209
- Early but functional. Transport + tools + agent loop + print mode are verified end-to-end. Interactive TUI ships modes, themes, thinking levels, session resume, compaction, and type-ahead queue.
252
+ 1. Fork the repository
253
+ 2. Create a branch: `git checkout -b feat/your-feature`
254
+ 3. Make your changes
255
+ 4. Run `npm run typecheck` and `npm run build`
256
+ 5. Commit: `git commit -m "feat: description"`
257
+ 6. Push: `git push origin feat/your-feature`
258
+ 7. Open a Pull Request
210
259
 
211
260
  ## License
212
261
 
package/dist/index.js CHANGED
@@ -296,10 +296,19 @@ async function* parseStream(body, signal) {
296
296
  }
297
297
  function sanitizeMessagesForApi(messages) {
298
298
  return messages.map((m) => {
299
- if (!m.tool_calls || m.tool_calls.length === 0) return m;
299
+ let next = m;
300
+ if (Array.isArray(m.content)) {
301
+ next = {
302
+ ...m,
303
+ content: m.content.map(
304
+ (part) => part.type === "text" ? { ...part, text: sanitizeString(part.text) } : part
305
+ )
306
+ };
307
+ }
308
+ if (!next.tool_calls || next.tool_calls.length === 0) return next;
300
309
  return {
301
- ...m,
302
- tool_calls: m.tool_calls.map((tc) => ({
310
+ ...next,
311
+ tool_calls: next.tool_calls.map((tc) => ({
303
312
  ...tc,
304
313
  function: {
305
314
  name: tc.function.name,
@@ -1533,15 +1542,16 @@ async function compactMessages(opts2) {
1533
1542
  return { summary: "", newMessages: messages, replacedCount: 0 };
1534
1543
  }
1535
1544
  const transcript = toSummarize.map((m) => {
1545
+ const contentStr = typeof m.content === "string" ? m.content : m.content?.map((p) => p.type === "text" ? p.text : "[image]").join(" ") ?? "";
1536
1546
  if (m.role === "tool") {
1537
- const snippet = (m.content ?? "").slice(0, 500);
1547
+ const snippet = contentStr.slice(0, 500);
1538
1548
  return `[tool ${m.name ?? ""}] ${snippet}`;
1539
1549
  }
1540
1550
  if (m.role === "assistant") {
1541
1551
  const calls = m.tool_calls ? ` (tool_calls: ${m.tool_calls.map((c) => c.function.name).join(", ")})` : "";
1542
- return `[assistant]${calls} ${m.content ?? ""}`;
1552
+ return `[assistant]${calls} ${contentStr}`;
1543
1553
  }
1544
- return `[${m.role}] ${m.content ?? ""}`;
1554
+ return `[${m.role}] ${contentStr}`;
1545
1555
  }).join("\n");
1546
1556
  let summary = "";
1547
1557
  const events = runKimi({
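Note on the hunk above: `compactMessages` now flattens message content before building the transcript, since `content` may be either a plain string or an array of parts. A minimal sketch of that flattening follows. The helper name `contentToString` is hypothetical (the bundle inlines the expression as `contentStr`), and the `Array.isArray` guard is an assumption added here for safety rather than something the diff shows.

```javascript
// Hypothetical helper mirroring the inline `contentStr` expression in the diff.
// Text parts keep their text; any other part (e.g. image_url) becomes "[image]".
function contentToString(content) {
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return ""; // null/undefined content -> empty string
  return content.map((p) => (p.type === "text" ? p.text : "[image]")).join(" ");
}

const mixed = [
  { type: "text", text: "fix the layout bug" },
  { type: "image_url", image_url: { url: "data:image/png;base64,AAAA" } },
];

console.log(contentToString("plain prompt")); // plain prompt
console.log(contentToString(mixed)); // fix the layout bug [image]
console.log(JSON.stringify(contentToString(undefined))); // ""
```

Replacing images with a short placeholder keeps base64 payloads out of the summarization prompt, which is presumably the point of the change.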
@@ -1867,12 +1877,18 @@ function EventView({
1867
1877
  verbose
1868
1878
  }) {
1869
1879
  if (evt.kind === "user") {
1870
- return /* @__PURE__ */ jsxs4(Box4, { children: [
1871
- /* @__PURE__ */ jsxs4(Text4, { bold: true, color: theme.user, children: [
1872
- "\u203A",
1873
- " "
1880
+ return /* @__PURE__ */ jsxs4(Box4, { flexDirection: "column", children: [
1881
+ /* @__PURE__ */ jsxs4(Box4, { children: [
1882
+ /* @__PURE__ */ jsxs4(Text4, { bold: true, color: theme.user, children: [
1883
+ "\u203A",
1884
+ " "
1885
+ ] }),
1886
+ /* @__PURE__ */ jsx4(Text4, { bold: true, children: evt.text })
1874
1887
  ] }),
1875
- /* @__PURE__ */ jsx4(Text4, { bold: true, children: evt.text })
1888
+ evt.images && evt.images.length > 0 && /* @__PURE__ */ jsx4(Box4, { paddingLeft: 2, children: /* @__PURE__ */ jsxs4(Text4, { color: theme.info.color, dimColor: theme.info.dim, children: [
1889
+ "\u{1F5BC}\uFE0F ",
1890
+ evt.images.join(", ")
1891
+ ] }) })
1876
1892
  ] });
1877
1893
  }
1878
1894
  if (evt.kind === "assistant") {
@@ -3470,7 +3486,7 @@ async function listSessions(limit = 30) {
3470
3486
  const [s, raw] = await Promise.all([stat2(path), readFile7(path, "utf8")]);
3471
3487
  const parsed = JSON.parse(raw);
3472
3488
  const firstUser = parsed.messages.find((m) => m.role === "user");
3473
- const firstPrompt = typeof firstUser?.content === "string" ? firstUser.content : "(no prompt)";
3489
+ const firstPrompt = typeof firstUser?.content === "string" ? firstUser.content : firstUser?.content ? firstUser.content.find((p) => p.type === "text")?.text ?? "(no prompt)" : "(no prompt)";
3474
3490
  summaries.push({
3475
3491
  id: parsed.id,
3476
3492
  filePath: path,
@@ -3495,6 +3511,45 @@ var init_sessions = __esm({
3495
3511
  }
3496
3512
  });
3497
3513
 
3514
+ // src/util/image.ts
3515
+ import { readFile as readFile8 } from "fs/promises";
3516
+ import { basename as basename2 } from "path";
3517
+ async function encodeImageFile(filePath) {
3518
+ const buf = await readFile8(filePath);
3519
+ if (buf.byteLength > MAX_IMAGE_BYTES) {
3520
+ throw new Error(
3521
+ `image too large (${(buf.byteLength / 1024 / 1024).toFixed(1)} MB); max is ${MAX_IMAGE_BYTES / 1024 / 1024} MB`
3522
+ );
3523
+ }
3524
+ const ext = filePath.slice(filePath.lastIndexOf(".")).toLowerCase();
3525
+ const mime = EXT_TO_MIME[ext] ?? "image/jpeg";
3526
+ const b64 = buf.toString("base64");
3527
+ return {
3528
+ filename: basename2(filePath),
3529
+ mime,
3530
+ dataUrl: `data:${mime};base64,${b64}`
3531
+ };
3532
+ }
3533
+ function isImagePath(path) {
3534
+ const ext = path.slice(path.lastIndexOf(".")).toLowerCase();
3535
+ return ext in EXT_TO_MIME;
3536
+ }
3537
+ var MAX_IMAGE_BYTES, EXT_TO_MIME;
3538
+ var init_image = __esm({
3539
+ "src/util/image.ts"() {
3540
+ "use strict";
3541
+ MAX_IMAGE_BYTES = 5 * 1024 * 1024;
3542
+ EXT_TO_MIME = {
3543
+ ".png": "image/png",
3544
+ ".jpg": "image/jpeg",
3545
+ ".jpeg": "image/jpeg",
3546
+ ".gif": "image/gif",
3547
+ ".webp": "image/webp",
3548
+ ".bmp": "image/bmp"
3549
+ };
3550
+ }
3551
+ });
3552
+
3498
3553
  // src/app.tsx
3499
3554
  var app_exports = {};
3500
3555
  __export(app_exports, {
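Note on the new `src/util/image.ts` module above: the encoding step can be exercised without touching the filesystem. The sketch below is an in-memory variant, assuming the same size cap and extension table as the diff. `encodeImageBuffer` is a hypothetical name; the real `encodeImageFile` reads the bytes from disk first.

```javascript
const path = require("node:path");

// Same values as MAX_IMAGE_BYTES and EXT_TO_MIME in the diff.
const MAX_IMAGE_BYTES = 5 * 1024 * 1024;
const EXT_TO_MIME = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".gif": "image/gif",
  ".webp": "image/webp",
  ".bmp": "image/bmp",
};

// Hypothetical in-memory variant of encodeImageFile: takes bytes plus a path
// (used only for the extension and filename) and returns a base64 data URL.
function encodeImageBuffer(buf, filePath) {
  if (buf.byteLength > MAX_IMAGE_BYTES) {
    const mb = (buf.byteLength / 1024 / 1024).toFixed(1);
    throw new Error(`image too large (${mb} MB); max is ${MAX_IMAGE_BYTES / 1024 / 1024} MB`);
  }
  // Extension lookup is case-insensitive, as in the diff.
  const ext = filePath.slice(filePath.lastIndexOf(".")).toLowerCase();
  const mime = EXT_TO_MIME[ext] ?? "image/jpeg"; // unknown extensions fall back to JPEG
  return {
    filename: path.basename(filePath),
    mime,
    dataUrl: `data:${mime};base64,${buf.toString("base64")}`,
  };
}

// First four bytes of the PNG signature.
const pngMagic = Buffer.from([0x89, 0x50, 0x4e, 0x47]);
console.log(encodeImageBuffer(pngMagic, "docs/bug.PNG").dataUrl);
// data:image/png;base64,iVBORw==
```

Because base64 inflates the payload by roughly a third, a 5 MB file becomes close to 7 MB of request body per image.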
@@ -3510,6 +3565,16 @@ function capEvents(prev) {
3510
3565
  if (prev.length <= MAX_EVENTS) return prev;
3511
3566
  return prev.slice(prev.length - MAX_EVENTS);
3512
3567
  }
3568
+ function findImagePaths(text) {
3569
+ const paths = [];
3570
+ for (const token of text.split(/\s+/)) {
3571
+ const clean = token.replace(/^["']|["',;:!?]$/g, "").replace(/[.,;:!?]$/, "");
3572
+ if (isImagePath(clean) && existsSync(clean)) {
3573
+ paths.push(clean);
3574
+ }
3575
+ }
3576
+ return [...new Set(paths)];
3577
+ }
3513
3578
  function App({ initialCfg, initialUpdateResult }) {
3514
3579
  const { exit } = useApp();
3515
3580
  const [cfg, setCfg] = useState6(initialCfg);
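Note on `findImagePaths` above: the token cleanup is easier to reason about in isolation. The sketch below reuses the two regular expressions from the diff verbatim. It replaces the `existsSync` check with an injectable predicate so it runs without real files. `findImageTokens` and the `exists` parameter are hypothetical names introduced for illustration.

```javascript
const IMAGE_EXTS = new Set([".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp"]);

// Hypothetical variant of findImagePaths: same splitting and cleanup,
// but file existence is decided by the `exists` callback (default: always true).
function findImageTokens(text, exists = () => true) {
  const found = [];
  for (const token of text.split(/\s+/)) {
    // Strip one leading quote and one trailing quote/punctuation char, then one trailing [.,;:!?].
    const clean = token.replace(/^["']|["',;:!?]$/g, "").replace(/[.,;:!?]$/, "");
    const ext = clean.slice(clean.lastIndexOf(".")).toLowerCase();
    if (IMAGE_EXTS.has(ext) && exists(clean)) found.push(clean);
  }
  return [...new Set(found)]; // de-duplicate, preserving first-seen order
}

console.log(findImageTokens('compare "docs/bug.png" with design.PNG, then docs/bug.png.'));
// [ 'docs/bug.png', 'design.PNG' ]
```

Two behaviours follow from this approach. Paths containing spaces are never matched, because the prompt is split on whitespace. A quoted path directly followed by a comma (for example `"a.png",`) also appears to be missed, since only one trailing character is stripped and the closing quote remains.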
@@ -3679,7 +3744,13 @@ function App({ initialCfg, initialUpdateResult }) {
3679
3744
  if (!cfg) return;
3680
3745
  if (!sessionIdRef.current) {
3681
3746
  const firstUser = messagesRef.current.find((m) => m.role === "user");
3682
- const firstText = typeof firstUser?.content === "string" ? firstUser.content : "session";
3747
+ let firstText = "session";
3748
+ if (typeof firstUser?.content === "string") {
3749
+ firstText = firstUser.content;
3750
+ } else if (Array.isArray(firstUser?.content)) {
3751
+ const textPart = firstUser.content.find((p) => p.type === "text");
3752
+ if (textPart?.text) firstText = textPart.text;
3753
+ }
3683
3754
  sessionIdRef.current = makeSessionId(firstText);
3684
3755
  }
3685
3756
  try {
@@ -3967,7 +4038,12 @@ function App({ initialCfg, initialUpdateResult }) {
3967
4038
  text: `resumed session ${picked.id} (${picked.messageCount} msgs)`
3968
4039
  }
3969
4040
  ]);
3970
- const userMsgs = file.messages.filter((m) => m.role === "user" && typeof m.content === "string").map((m) => m.content);
4041
+ const userMsgs = file.messages.filter((m) => m.role === "user" && m.content).map((m) => {
4042
+ if (!m.content) return "";
4043
+ if (typeof m.content === "string") return m.content;
4044
+ const textPart = m.content.find((p) => p.type === "text");
4045
+ return textPart?.text ?? "";
4046
+ }).filter((text) => text.length > 0);
3971
4047
  if (userMsgs.length > 0) setHistory(userMsgs);
3972
4048
  setUsage(null);
3973
4049
  } catch (e) {
@@ -4223,8 +4299,36 @@ use: /thinking low | medium | high`
4223
4299
  if (!trimmed) return;
4224
4300
  if (trimmed.startsWith("/") && handleSlash(trimmed)) return;
4225
4301
  const display = displayText?.trim() || trimmed;
4226
- setEvents((e) => [...e, { kind: "user", key: mkKey(), text: display }]);
4227
- messagesRef.current.push({ role: "user", content: sanitizeString(trimmed) });
4302
+ const imagePaths = findImagePaths(trimmed).slice(0, MAX_IMAGES_PER_MESSAGE);
4303
+ let images = [];
4304
+ let content = sanitizeString(trimmed);
4305
+ if (imagePaths.length > 0) {
4306
+ const encoded = await Promise.all(
4307
+ imagePaths.map(async (path) => {
4308
+ try {
4309
+ const img = await encodeImageFile(path);
4310
+ return { path, img };
4311
+ } catch (e) {
4312
+ setEvents((es) => [
4313
+ ...es,
4314
+ { kind: "error", key: mkKey(), text: `failed to encode image ${path}: ${e.message}` }
4315
+ ]);
4316
+ return null;
4317
+ }
4318
+ })
4319
+ );
4320
+ const valid = encoded.filter((x) => x !== null);
4321
+ if (valid.length > 0) {
4322
+ images = valid.map((v) => v.img.filename);
4323
+ const parts = [
4324
+ { type: "text", text: sanitizeString(trimmed) },
4325
+ ...valid.map((v) => ({ type: "image_url", image_url: { url: v.img.dataUrl } }))
4326
+ ];
4327
+ content = parts;
4328
+ }
4329
+ }
4330
+ setEvents((e) => [...e, { kind: "user", key: mkKey(), text: display, images: images.length > 0 ? images : void 0 }]);
4331
+ messagesRef.current.push({ role: "user", content });
4228
4332
  setBusy(true);
4229
4333
  setTurnStartedAt(Date.now());
4230
4334
  const controller = new AbortController();
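Note on the submit hunk above: the user message stays a plain string unless at least one image encodes successfully, in which case it becomes an array with one text part followed by `image_url` parts. A minimal sketch of that shape follows. `buildUserContent` is a hypothetical helper, its `images` argument is assumed to hold objects with a `dataUrl` field like those returned by `encodeImageFile`, and the `sanitizeString` call from the diff is omitted.

```javascript
const MAX_IMAGES_PER_MESSAGE = 10; // same cap as in the diff

// Hypothetical helper: returns a string when there are no images,
// otherwise an array of content parts (text first, then one part per image).
function buildUserContent(text, images) {
  const kept = images.slice(0, MAX_IMAGES_PER_MESSAGE);
  if (kept.length === 0) return text;
  return [
    { type: "text", text },
    ...kept.map((img) => ({ type: "image_url", image_url: { url: img.dataUrl } })),
  ];
}

const one = [{ filename: "bug.png", dataUrl: "data:image/png;base64,iVBORw==" }];
console.log(buildUserContent("what files are here?", []));
// what files are here?
console.log(JSON.stringify(buildUserContent("fix this docs/bug.png", one)));
// [{"type":"text","text":"fix this docs/bug.png"},{"type":"image_url","image_url":{"url":"data:image/png;base64,iVBORw=="}}]
```

Note that the diff applies the cap to the detected paths before encoding, so images beyond the tenth are never read from disk; the sketch applies it to already-encoded images only for brevity.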
@@ -4522,7 +4626,7 @@ async function renderApp(cfg, updateResult) {
4522
4626
  const instance = render(/* @__PURE__ */ jsx13(App, { initialCfg: cfg, initialUpdateResult: updateResult }));
4523
4627
  await instance.waitUntilExit();
4524
4628
  }
4525
- var CONTEXT_LIMIT, AUTO_COMPACT_SUGGEST_PCT, MAX_EVENTS, nextAssistantId, nextKey, mkKey, EFFORT_DESCRIPTIONS;
4629
+ var CONTEXT_LIMIT, AUTO_COMPACT_SUGGEST_PCT, MAX_EVENTS, nextAssistantId, nextKey, mkKey, MAX_IMAGES_PER_MESSAGE, EFFORT_DESCRIPTIONS;
4526
4630
  var init_app = __esm({
4527
4631
  "src/app.tsx"() {
4528
4632
  "use strict";
@@ -4546,12 +4650,14 @@ var init_app = __esm({
4546
4650
  init_theme();
4547
4651
  init_mode();
4548
4652
  init_sessions();
4653
+ init_image();
4549
4654
  CONTEXT_LIMIT = 262e3;
4550
4655
  AUTO_COMPACT_SUGGEST_PCT = 0.8;
4551
4656
  MAX_EVENTS = 500;
4552
4657
  nextAssistantId = 1;
4553
4658
  nextKey = 1;
4554
4659
  mkKey = () => `evt_${nextKey++}`;
4660
+ MAX_IMAGES_PER_MESSAGE = 10;
4555
4661
  EFFORT_DESCRIPTIONS = {
4556
4662
  low: "low \u2014 fastest; lightest reasoning. Best for simple Q&A, small edits, quick coordination.",
4557
4663
  medium: "medium \u2014 balanced (default). Solid quality on most edits, fast on trivial prompts.",