kimiflare 0.1.0

package/PLAN.md ADDED
@@ -0,0 +1,172 @@
1
+ # Plan: `kimiflare` — Claude Code-style CLI powered by Kimi-K2.6
2
+
3
+ > Living document. Keep updated at every milestone so context survives compaction.
4
+
5
+ ## Context
6
+
7
+ Build a terminal coding agent, similar in spirit to Claude Code, driven by the `@cf/moonshotai/kimi-k2.6` model hosted on Cloudflare Workers AI. The user has substantial Cloudflare credits and wants them used directly: no AI Gateway and no OpenAI-compat layer; calls go straight to the native Workers AI `ai/run` REST endpoint.
8
+
9
+ - **Model**: `@cf/moonshotai/kimi-k2.6`. 262,144-token context, native function calling, streaming, reasoning, vision (unused in v1).
10
+ - **Endpoint**: `POST https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/@cf/moonshotai/kimi-k2.6` · `Authorization: Bearer $CLOUDFLARE_API_TOKEN`.
11
+ - **Pricing**: $0.95 / M input, $0.16 / M cached input, $4.00 / M output.
12
+
13
+ Outcome: a `kimiflare` binary that opens a TUI, lets the user chat, calls Kimi-k2.6 with tools (file I/O, bash, search, web fetch), streams tokens, asks permission before mutating anything, and loops tool results back to the model until it stops.
14
+
15
+ ## Probe findings (verified against the live API)
16
+
17
+ - Non-streaming body is wrapped: `{result: {...OpenAI chat.completion...}, success, errors, messages}`. Unwrap `result`.
18
+ - Streaming body is raw SSE — `data: {json}\n\n`, terminal `data: [DONE]`. OpenAI-compatible chunks.
19
+ - Kimi emits `delta.reasoning_content` BEFORE `delta.content`. Both are independent streams that share the `max_completion_tokens` budget. Use a generous cap (16384) so reasoning doesn't eat the final answer.
20
+ - Tool calls: first chunk has `{id, name, arguments: ""}`; later chunks carry only `arguments` delta, `index` identifies which call. Key accumulator by `index`, not `id`.
21
+ - Tool round-trip matches OpenAI spec: `role=assistant` with `tool_calls[]`, followed by `role=tool` with `tool_call_id`.
22
+ - Transient `HTTP 200 + success:false + errors[0].code:3040` ("Capacity temporarily exceeded"). Must retry with exponential backoff.
23
+ - Extra `data: {"response":"","usage":{...}}` event arrives before `[DONE]` — parser should ignore unknown events.
24
+ - `usage` rolls up in every streaming chunk → free live cost display.
25
+ - Tool-call `id` format is `"functions.<name>:<N>"`; treat as opaque string.
26
+
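The index-keyed accumulation described in the probe findings can be sketched as below. This is a minimal illustration, not the code in `src/agent/stream.ts`: the `ToolCallDelta` shape assumes the OpenAI-style `function: { name, arguments }` nesting, and the names `ToolCallDelta`, `AccumulatedToolCall`, and `accumulateToolCalls` are hypothetical.

```typescript
// Assumed shape of one entry in `delta.tool_calls` (OpenAI-style nesting).
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface AccumulatedToolCall {
  id: string;
  name: string;
  arguments: string; // raw JSON text, parsed only once the stream ends
}

// Merge streamed deltas into complete tool calls, keyed by `index`
// because only the first chunk of each call carries `id` and `name`.
function accumulateToolCalls(deltas: ToolCallDelta[]): AccumulatedToolCall[] {
  const byIndex = new Map<number, AccumulatedToolCall>();
  for (const d of deltas) {
    let call = byIndex.get(d.index);
    if (!call) {
      call = { id: "", name: "", arguments: "" };
      byIndex.set(d.index, call);
    }
    if (d.id) call.id = d.id; // treated as an opaque string
    if (d.function?.name) call.name = d.function.name;
    if (d.function?.arguments) call.arguments += d.function.arguments;
  }
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([, call]) => call);
}
```

Keeping `arguments` as raw text until the stream finishes avoids parsing partial JSON while deltas for several calls are interleaved.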
27
+ ## Stack
28
+
29
+ - Node 20+ (built-in fetch / ReadableStream / AbortController), ESM.
30
+ - TypeScript. `tsup` builds, `tsx` for dev.
31
+ - Ink + `ink-text-input`, `ink-spinner`, `ink-select-input`. React 18.
32
+ - `commander` for CLI args. `fast-glob` for globs. `diff` for unified diffs. `turndown` for HTML→markdown.
33
+ - No AI Gateway. No `openai` SDK. Direct `fetch`.
34
+
35
+ ## File layout
36
+
37
+ ```
38
+ /Users/sinameraji/kimiflare/
39
+ ├── package.json
40
+ ├── tsconfig.json
41
+ ├── tsup.config.ts
42
+ ├── bin/kimiflare.mjs # shebang shim → dist/index.js
43
+ ├── src/
44
+ │ ├── index.tsx # CLI entry (commander → Ink render or one-shot)
45
+ │ ├── app.tsx # Ink root: chat + input + permission modal
46
+ │ ├── config.ts # env + ~/.config/kimiflare/config.json
47
+ │ ├── agent/
48
+ │ │ ├── client.ts # Cloudflare Workers AI client (fetch + SSE + retry)
49
+ │ │ ├── stream.ts # delta accumulator (reasoning/content/tool_calls)
50
+ │ │ ├── loop.ts # model ↔ tools orchestration
51
+ │ │ ├── messages.ts # message types
52
+ │ │ └── system-prompt.ts # cwd/platform/date/tools
53
+ │ ├── tools/
54
+ │ │ ├── registry.ts
55
+ │ │ ├── executor.ts
56
+ │ │ ├── read.ts | write.ts | edit.ts | bash.ts | glob.ts | grep.ts | web-fetch.ts
57
+ │ ├── ui/
58
+ │ │ ├── chat.tsx | input.tsx | permission.tsx | tool-view.tsx | diff-view.tsx | spinner.tsx
59
+ │ └── util/
60
+ │ ├── sse.ts
61
+ │ ├── paths.ts
62
+ │ └── errors.ts
63
+ ```
64
+
65
+ ## Core designs
66
+
67
+ ### Cloudflare Workers AI client (`src/agent/client.ts`)
68
+
69
+ `runKimi({ messages, tools, signal })`:
70
+
71
+ - POSTs to `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/@cf/moonshotai/kimi-k2.6` with Bearer token.
72
+ - Body: `{ messages, tools, tool_choice: "auto", parallel_tool_calls: true, stream: true, temperature: 0.2, max_completion_tokens: 16384 }`.
73
+ - Returns an async iterator: `{type:"reasoning",delta}`, `{type:"text",delta}`, `{type:"tool_call_start",index,id,name}`, `{type:"tool_call_args",index,argsDelta}`, `{type:"usage",usage}`, `{type:"done",finishReason,usage}`.
74
+ - Retries on `code:3040` up to 5× with backoff (500ms, 1s, 2s, 4s, 8s + jitter).
75
+
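A rough sketch of the retry policy above, under stated assumptions: the envelope type `CfEnvelope`, the helper names, and the 250 ms jitter ceiling are all hypothetical, since the plan only fixes the base delays and the `3040` code.

```typescript
// Assumed subset of the Cloudflare response envelope.
interface CfEnvelope {
  success?: boolean;
  errors?: Array<{ code?: number; message?: string }>;
}

// True for the transient "Capacity temporarily exceeded" response.
function isCapacityError(body: CfEnvelope): boolean {
  return body.success === false && (body.errors ?? []).some((e) => e.code === 3040);
}

// attempt 0..4 gives base delays of 500, 1000, 2000, 4000, 8000 ms.
// The jitter ceiling (250 ms) is an assumption, not taken from the plan.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  return 500 * Math.pow(2, attempt) + random() * 250;
}

// Call `send` until it stops reporting code 3040 or 5 retries are used up.
async function retryOnCapacity(
  send: () => Promise<CfEnvelope>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<CfEnvelope> {
  const maxRetries = 5;
  for (let attempt = 0; ; attempt++) {
    const body = await send();
    if (!isCapacityError(body) || attempt >= maxRetries) return body;
    await sleep(backoffDelayMs(attempt));
  }
}
```

Injecting `random` and `sleep` keeps the policy testable without real timers.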
76
+ ### SSE reader (`src/util/sse.ts`)
77
+
78
+ `async function* readSSE(stream: ReadableStream<Uint8Array>)` — splits on `\n\n`, strips `data: ` prefix, yields parsed JSON, stops on `[DONE]`.
79
+
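One way the reader described above might look. This is a sketch, not the contents of `src/util/sse.ts`: the helper `drainSSEBuffer` and the `Drained` type are hypothetical, and the code assumes each `data:` line carries one complete JSON document, as the probe findings suggest.

```typescript
interface Drained {
  events: unknown[]; // parsed JSON payloads from complete frames
  rest: string;      // trailing partial frame to keep for the next chunk
  done: boolean;     // true once `[DONE]` was seen
}

// Pull every complete `\n\n`-terminated frame out of the buffer.
function drainSSEBuffer(buffer: string): Drained {
  const events: unknown[] = [];
  let rest = buffer;
  let sep = rest.indexOf("\n\n");
  while (sep !== -1) {
    const frame = rest.slice(0, sep);
    rest = rest.slice(sep + 2);
    for (const line of frame.split("\n")) {
      if (!line.startsWith("data:")) continue; // ignore comments and other fields
      const payload = line.slice(5).trim();
      if (payload === "") continue;
      if (payload === "[DONE]") return { events, rest: "", done: true };
      events.push(JSON.parse(payload));
    }
    sep = rest.indexOf("\n\n");
  }
  return { events, rest, done: false };
}

async function* readSSE(stream: ReadableStream<Uint8Array>): AsyncGenerator<unknown> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  try {
    for (;;) {
      const { value, done } = await reader.read();
      if (done) return;
      // `stream: true` keeps multi-byte characters split across chunks intact.
      buffer += decoder.decode(value, { stream: true });
      const drained = drainSSEBuffer(buffer);
      buffer = drained.rest;
      for (const event of drained.events) yield event;
      if (drained.done) return;
    }
  } finally {
    reader.releaseLock();
  }
}
```

Splitting the buffer handling from the stream loop makes the split-chunk case easy to unit test without a network stream.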
80
+ ### Agent loop (`src/agent/loop.ts`)
81
+
82
+ ```
83
+ loop:
84
+ stream = runKimi(messages, tools)
85
+ collect content/reasoning/tool_calls from events
86
+ push final assistant message to messages[]
87
+ if tool_calls:
88
+ for each tc (serialize mutating, parallel for readonly):
89
+ result = executor.run(tc, askPermission)
90
+ push { role:"tool", tool_call_id: tc.id, content: result }
91
+ continue
92
+ else:
93
+ yield to UI
94
+ ```
95
+
96
+ Max 50 iterations per user turn.
97
+
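The "serialize mutating, parallel for readonly" step of the loop could be scheduled as in the sketch below. It is one possible reading, not the code in `src/agent/loop.ts`: it assumes the tools marked `auto` in the Tools table are the read-only ones, and `planBatches`, `runToolCalls`, and `READONLY_TOOLS` are hypothetical names.

```typescript
interface ToolCall {
  id: string;
  name: string;
  arguments: string;
}

interface ToolMessage {
  role: "tool";
  tool_call_id: string;
  content: string;
}

// Assumption: "auto" permission in the Tools table means read-only.
const READONLY_TOOLS = new Set(["read", "glob", "grep", "web_fetch"]);

// Group consecutive read-only calls into one batch; each mutating call
// gets a batch of its own so it never overlaps with anything else.
function planBatches(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let readonlyRun: ToolCall[] = [];
  for (const call of calls) {
    if (READONLY_TOOLS.has(call.name)) {
      readonlyRun.push(call);
      continue;
    }
    if (readonlyRun.length > 0) {
      batches.push(readonlyRun);
      readonlyRun = [];
    }
    batches.push([call]);
  }
  if (readonlyRun.length > 0) batches.push(readonlyRun);
  return batches;
}

// Run batches in order; calls inside a batch run concurrently.
async function runToolCalls(
  calls: ToolCall[],
  run: (call: ToolCall) => Promise<string>,
): Promise<ToolMessage[]> {
  const out: ToolMessage[] = [];
  for (const batch of planBatches(calls)) {
    const messages = await Promise.all(
      batch.map(async (call) => ({
        role: "tool" as const,
        tool_call_id: call.id,
        content: await run(call),
      })),
    );
    out.push(...messages);
  }
  return out;
}
```

Because batches preserve the original order, the resulting `role:"tool"` messages line up with the order of `tool_calls[]` in the assistant message.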
98
+ ### Tools
99
+
100
+ | Tool | Permission | Notes |
101
+ |-------------|------------|-------|
102
+ | `read` | auto | `path`, `offset?`, `limit?`. UTF-8 text, ≤2MB. |
103
+ | `write` | prompt | `path`, `content`. Diff preview. |
104
+ | `edit` | prompt | `path`, `old_string`, `new_string`, `replace_all?`. Unique-match enforced. |
105
+ | `bash` | prompt | `command`, `timeout_ms?` (default 120s, max 600s). Output capped 30KB. |
106
+ | `glob` | auto | `pattern`, `path?`. `fast-glob`, 200 results sorted by mtime. |
107
+ | `grep` | auto | Shells out to `rg` if present, JS fallback otherwise. |
108
+ | `web_fetch` | auto | `url`, 20s timeout, HTML→markdown, ≤100KB. |
109
+
110
+ Permission modal scopes: **this call** / **this session for this tool** / **deny**. Bash session-allow is keyed by command prefix.
111
+
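A minimal sketch of the session-scoped allowlist, assuming the "command prefix" is the first whitespace-delimited token of the command. The class and helper names are hypothetical.

```typescript
type ToolArgs = { command?: string };

// Assumption: the prefix key is the first whitespace-delimited token.
function bashKey(command: string): string {
  return command.trim().split(/\s+/)[0] ?? "";
}

class SessionPermissions {
  private readonly allowedTools = new Set<string>();
  private readonly allowedBashKeys = new Set<string>();

  // Record an "allow for this session" decision from the modal.
  allowForSession(tool: string, args: ToolArgs = {}): void {
    if (tool !== "bash") {
      this.allowedTools.add(tool);
      return;
    }
    const key = bashKey(args.command ?? "");
    if (key !== "") this.allowedBashKeys.add(key);
  }

  // True when the call can skip the permission modal.
  isAllowed(tool: string, args: ToolArgs = {}): boolean {
    if (tool !== "bash") return this.allowedTools.has(tool);
    const key = bashKey(args.command ?? "");
    return key !== "" && this.allowedBashKeys.has(key);
  }
}
```

Under this keying, approving `git status` for the session would also cover `git push`, which is worth keeping in mind when choosing how coarse the prefix should be.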
112
+ ### System prompt
113
+
114
+ Injects cwd, platform, shell, date, and one line per registered tool. Emphasizes: prefer tools over guessing, read before editing, explain briefly before asking permission.
115
+
116
+ ### UI
117
+
118
+ - Chat scrollback: user (cyan), assistant text, collapsed tool blocks, streaming tokens live.
119
+ - Reasoning: dim collapsed block above final answer; off by default; toggle with `/reasoning` or Ctrl+R.
120
+ - Input: multi-line `ink-text-input`, slash commands `/clear /exit /model /cost /history /reasoning`.
121
+ - Permission modal: overlay with tool name + args (diff preview for edit/write, raw command for bash); 3-option select.
122
+ - Status line: model, running tokens, cost estimate.
123
+
124
+ ### Config
125
+
126
+ Resolution order: env vars → `~/.config/kimiflare/config.json` → first-run prompt that writes the file with mode 600.
127
+
128
+ ```jsonc
129
+ { "accountId": "…", "apiToken": "…", "model": "@cf/moonshotai/kimi-k2.6" }
130
+ ```
131
+
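The first two resolution steps could be expressed as a pure function, as sketched here. The per-field fallback (an env var for one field, the file for the other) is an assumption, as are the names `KimiConfig` and `resolveConfig`; reading and parsing the file is left to the caller.

```typescript
interface KimiConfig {
  accountId: string;
  apiToken: string;
  model: string;
}

const DEFAULT_MODEL = "@cf/moonshotai/kimi-k2.6";

// Env vars win over the parsed config file; returns null when
// credentials are still missing so the caller can prompt or error out.
function resolveConfig(
  env: Record<string, string | undefined>,
  file: Partial<KimiConfig> | null,
): KimiConfig | null {
  const accountId = env["CLOUDFLARE_ACCOUNT_ID"] || file?.accountId;
  const apiToken = env["CLOUDFLARE_API_TOKEN"] || file?.apiToken;
  if (!accountId || !apiToken) return null;
  return { accountId, apiToken, model: file?.model || DEFAULT_MODEL };
}
```

A caller would typically pass `process.env` and the result of `JSON.parse` on the config file, or `null` if the file does not exist.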
132
+ ### Entry modes
133
+
134
+ - `kimiflare` — interactive TUI
135
+ - `kimiflare -p "prompt"` — one-shot to stdout; permissions auto-deny unless `--dangerously-allow-all`
136
+ - `kimiflare --model <id>` — override model
137
+ - Session transcripts persisted at `~/.local/share/kimiflare/sessions/*.jsonl` (resume v1-optional)
138
+
139
+ ## Verification scenarios
140
+
141
+ 1. `npm install && npm run build && kimiflare --help` (after `npm link` or symlink).
142
+ 2. TUI opens with valid creds.
143
+ 3. Plain chat streams tokens live.
144
+ 4. Readonly tool auto-runs (glob/read).
145
+ 5. Mutating tool triggers permission modal, write+bash round-trip works.
146
+ 6. Multi-tool loop (grep across tree).
147
+ 7. Web fetch + summarize.
148
+ 8. `/cost` matches `0.95·in/1M + 4.00·out/1M`.
149
+ 9. `kimiflare -p --dangerously-allow-all` one-shot exits cleanly.
150
+ 10. Missing creds / network-kill errors are graceful.
151
+
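For scenario 8, the `/cost` check can be written down as a small function. This is a sketch with hypothetical names; treating cached input tokens as a subset of input tokens billed at the cached rate is an assumption, and with no cached tokens it reduces to the formula in scenario 8.

```typescript
interface TokenUsage {
  inputTokens: number;        // total prompt tokens
  outputTokens: number;       // completion tokens (reasoning + content)
  cachedInputTokens?: number; // assumed to be a subset of inputTokens
}

// USD per million tokens, from the Pricing line in the Context section.
const PRICE_INPUT = 0.95;
const PRICE_CACHED_INPUT = 0.16;
const PRICE_OUTPUT = 4.0;

function estimateCostUsd(usage: TokenUsage): number {
  const cached = usage.cachedInputTokens ?? 0;
  const uncached = usage.inputTokens - cached;
  return (
    (uncached * PRICE_INPUT + cached * PRICE_CACHED_INPUT + usage.outputTokens * PRICE_OUTPUT) /
    1_000_000
  );
}
```

Since `usage` arrives with every streaming chunk, the status line can call a function like this on each update.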
152
+ Unit tests: SSE reader split-chunks + `[DONE]`; stream accumulator against recorded Kimi tool-call transcript; edit unique-match; bash command-prefix allowlist.
153
+
154
+ ## Progress log
155
+
156
+ | Milestone | Status |
157
+ |-----------|--------|
158
+ | API probes (5) | ✅ — see "Probe findings" |
159
+ | Scaffolding (package.json / tsconfig / tsup / bin) | ✅ |
160
+ | SSE reader + client + accumulator | ✅ validated via scripts/replay-stream.ts against recorded probe-4 body |
161
+ | Tools + registry + executor + permissions | ✅ |
162
+ | Agent loop + system prompt + messages | ✅ |
163
+ | Config loader + print-mode CLI entry | ✅ — verified end-to-end (plain chat, readonly tools, mutating tools) |
164
+ | Ink TUI (chat / input / permission / status) | ✅ renders cleanly under a pty; awaiting real-TTY interactive test |
165
+ | Interactive CLI entry | ✅ |
166
+ | End-to-end verification (TUI) | 🔄 in progress — user will run `kimiflare` in their terminal |
167
+
168
+ ## Deferred (pre-launch)
169
+
170
+ - **First-run credential wizard**: when config.json and env vars are both missing, walk the user through creating a Cloudflare API token, validate it with a tiny test call, save to `~/.config/kimiflare/config.json` (chmod 600). Today we just error out with instructions. Product is BYO-creds — every user supplies their own Cloudflare token; kimiflare does not proxy requests.
171
+ - **Npm publish / homebrew tap** so `npm i -g kimiflare` works.
172
+ - **Session transcripts** (`~/.local/share/kimiflare/sessions/*.jsonl`) + `--resume`.
package/README.md ADDED
@@ -0,0 +1,137 @@
1
+ # kimiflare
2
+
3
+ A terminal coding agent powered by **[Kimi-K2.6](https://developers.cloudflare.com/workers-ai/models/kimi-k2.6/)** on Cloudflare Workers AI. It's Claude Code, but the model is Moonshot's 1T-parameter open-source Kimi running directly on your Cloudflare account — no middleman, no AI Gateway, no OpenAI SDK. You bring the token, your traffic goes straight to Cloudflare.
4
+
5
+ ```
6
+ $ kimiflare
7
+ kimiflare · /help for commands · ctrl-c to exit
8
+
9
+ › what files are here?
10
+ ✓ glob(*)
11
+ /Users/you/proj/package.json
12
+ /Users/you/proj/src/index.ts
13
+ ...
14
+
15
+ › add a /health endpoint to server.ts
16
+ ✓ read(src/server.ts)
17
+ ◐ edit src/server.ts
18
+ ─── permission requested ──────────────────
19
+ @@ -42,6 +42,10 @@
20
+ app.get('/', …)
21
+ + app.get('/health', (_, res) => res.json({ ok: true }))
22
+ ─────────────────────────────────────────────
23
+ [Allow once] [Allow for session] [Deny]
24
+ ```
25
+
26
+ ## Why
27
+
28
+ - **262k context.** Read entire modules without pagination.
29
+ - **Native tool use.** File I/O, shell, globs, grep, web fetch — all wired up, with per-call approval for anything mutating.
30
+ - **Streaming reasoning + content.** The model's chain-of-thought streams separately; toggle with `/reasoning` or `Ctrl-R`.
31
+ - **Pay your own way.** Your Cloudflare account, your credits, your rate limits. `$0.95 / M input`, `$0.16 / M cached input`, `$4.00 / M output`. The bottom status line shows live cost.
32
+
33
+ ## Install
34
+
35
+ ```sh
36
+ git clone https://github.com/sinameraji/kimiflare
37
+ cd kimiflare
38
+ npm install
39
+ npm run build
40
+ npm link # or: ln -s "$PWD/bin/kimiflare.mjs" ~/.local/bin/kimiflare
41
+ ```
42
+
43
+ Published npm package coming soon.
44
+
45
+ ## Configure
46
+
47
+ Get credentials from Cloudflare:
48
+
49
+ 1. https://dash.cloudflare.com → your account → copy **Account ID**.
50
+ 2. https://dash.cloudflare.com/profile/api-tokens → **Create Token** → Custom token with **Account › Workers AI › Read** on your account → **Create** → copy.
51
+
52
+ Then either export them in each shell session:
53
+
54
+ ```sh
55
+ export CLOUDFLARE_ACCOUNT_ID=...
56
+ export CLOUDFLARE_API_TOKEN=...
57
+ ```
58
+
59
+ or save them once in a config file, restricted to your user with `chmod 600`:
60
+
61
+ ```sh
62
+ mkdir -p ~/.config/kimiflare
63
+ cat > ~/.config/kimiflare/config.json <<'EOF'
64
+ {
65
+ "accountId": "YOUR_ACCOUNT_ID",
66
+ "apiToken": "YOUR_API_TOKEN",
67
+ "model": "@cf/moonshotai/kimi-k2.6"
68
+ }
69
+ EOF
70
+ chmod 600 ~/.config/kimiflare/config.json
71
+ ```
72
+
73
+ ## Usage
74
+
75
+ ```sh
76
+ kimiflare # interactive TUI
77
+ kimiflare -p "summarize PLAN.md" # one-shot, streams answer to stdout
78
+ kimiflare -p "..." --dangerously-allow-all # auto-approve mutating tools (for scripts)
79
+ kimiflare --model @cf/moonshotai/kimi-k2.6 # override model
80
+ kimiflare --reasoning # (print mode) stream chain-of-thought to stderr
81
+ ```
82
+
83
+ Interactive slash commands:
84
+
85
+ | Command | Effect |
86
+ |---------------|--------------------------------------------------|
87
+ | `/clear` | Reset the conversation (keeps system prompt) |
88
+ | `/reasoning` | Toggle chain-of-thought display |
89
+ | `/cost` | Show token usage so far |
90
+ | `/model` | Show current model |
91
+ | `/help` | List commands |
92
+ | `/exit` | Quit |
93
+
94
+ Keys: `Ctrl-R` toggles reasoning, `Ctrl-C` interrupts an in-flight turn (press again to exit).
95
+
96
+ ## Tools
97
+
98
+ All tool calls show inline; mutating ones require per-call approval the first time, with an option to allow for the rest of the session.
99
+
100
+ | Tool | Permission | What it does |
101
+ |-------------|------------|--------------|
102
+ | `read` | auto | Read a text file (≤ 2MB) with optional line range. |
103
+ | `write` | prompt | Create or overwrite a file. Shows a unified diff before you approve. |
104
+ | `edit` | prompt | Replace an exact substring. Fails unless `old_string` is unique (or `replace_all=true`). |
105
+ | `bash` | prompt | Run a shell command via `bash -lc`. Session-allow is keyed by the first token of the command. |
106
+ | `glob` | auto | Match files by pattern (`**/*.ts`), sorted by mtime. |
107
+ | `grep` | auto | Regex search. Uses `rg` if installed; falls back to a JS walk. |
108
+ | `web_fetch` | auto | Fetch a URL, convert HTML → markdown (≤ 100KB). |
109
+
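The unique-match rule for `edit` might be enforced along these lines. This is an illustrative sketch with a hypothetical `applyEdit` function, not the code in `src/tools/edit.ts`; the error messages are made up.

```typescript
// Replace `oldString` with `newString` in `text`.
// Fails when the match is missing, or ambiguous without `replaceAll`.
function applyEdit(
  text: string,
  oldString: string,
  newString: string,
  replaceAll = false,
): string {
  if (oldString === "") throw new Error("old_string must not be empty");
  const parts = text.split(oldString);
  const count = parts.length - 1; // number of non-overlapping matches
  if (count === 0) throw new Error("old_string not found");
  if (count > 1 && !replaceAll) {
    throw new Error(`old_string matches ${count} times; pass replace_all or add context`);
  }
  // split/join sidesteps the `$` replacement patterns of String.replace.
  return parts.join(newString);
}
```

Requiring a unique match forces the model to include enough surrounding context that the edit lands where it intended.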
110
+ ## How it works
111
+
112
+ ```
113
+ ┌───────────────────────────────────────────────────────────┐
114
+ │ kimiflare (Node + Ink TUI) │
115
+ user ─▶ │ │
116
+ │ user msg ─▶ agent loop ─▶ runKimi() ──[POST SSE]──▶ │
117
+ │ ▲ │
118
+ │ │ │
119
+ │ tool result ◀──tool executor──◀ tool_calls │
120
+ │ (permission modal for write / edit / bash) │
121
+ └───────────────────────────────────────────────────────────┘
122
+
123
+
124
+ api.cloudflare.com/client/v4
125
+ /accounts/{ID}/ai/run/
126
+ @cf/moonshotai/kimi-k2.6
127
+ ```
128
+
129
+ No AI Gateway, no proxy, no OpenAI SDK. Direct `fetch` to Workers AI, OpenAI-compatible `messages` + `tools` payload, SSE stream with reasoning + content + tool-call deltas accumulated by index.
130
+
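As a rough illustration of the direct `fetch` path, the request could be assembled as below. The body fields mirror the ones listed in `PLAN.md`; the `buildKimiRequest` helper, the `ChatMessage` type, and the explicit `Content-Type` header are assumptions for the sake of the example.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  tool_call_id?: string;
}

// Build the URL and fetch options for one streaming call.
function buildKimiRequest(
  accountId: string,
  apiToken: string,
  messages: ChatMessage[],
  tools: unknown[] = [],
) {
  const url =
    `https://api.cloudflare.com/client/v4/accounts/${accountId}` +
    `/ai/run/@cf/moonshotai/kimi-k2.6`;
  const init = {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      messages,
      tools,
      tool_choice: "auto",
      parallel_tool_calls: true,
      stream: true,
      temperature: 0.2,
      max_completion_tokens: 16384,
    }),
  };
  return { url, init };
}
```

The result can be passed straight to `fetch(url, init)`, and the response body then read as an SSE stream.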
131
+ ## Status
132
+
133
+ Early. Transport + tools + agent loop + print mode are verified end-to-end; interactive TUI renders cleanly under a pty and awaits real-terminal shakedown. See `PLAN.md` for milestone log and deferred items (first-run wizard, npm publish, session resume).
134
+
135
+ ## License
136
+
137
+ TBD.
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env node
2
+ import "../dist/index.js";