kimiflare 0.68.0 → 0.68.2

package/README.md CHANGED
@@ -12,22 +12,25 @@
  </p>
 
  <p align="center">
- <strong>A terminal coding agent powered by <a href="https://developers.cloudflare.com/workers-ai/models/kimi-k2.6/">Kimi-K2.6</a> on Cloudflare Workers AI.</strong><br>
- Moonshot's 1T-parameter open-source model, running directly in your terminal.
+ <strong>A terminal coding agent powered by <a href="https://developers.cloudflare.com/workers-ai/models/kimi-k2.6/">Kimi-K2.6</a>, routed through your own <a href="https://developers.cloudflare.com/ai-gateway/">Cloudflare AI Gateway</a>.</strong><br>
+ Moonshot's 1T-parameter open-source model, with first-class observability, caching, and authoritative cost — all on your Cloudflare account.
  </p>
 
  <p align="center">
  <img src="docs/screenshot.png" alt="kimiflare TUI" width="900">
  </p>
 
- ## Two ways to run
+ ## How it works
 
- | Mode | How it works | Best for |
- |------|-------------|----------|
- | **BYOK** | Bring your own Cloudflare Account ID + API Token. Traffic goes straight to Workers AI from your account. | Power users who want full control and direct billing. |
- | ~~**Kimiflare Cloud**~~ | ~~Device auth — no API key needed. We proxy requests through our managed endpoint.~~ | ~~Getting started quickly without a Cloudflare account.~~ |
+ You bring your own Cloudflare **Account ID** + **API Token**. KimiFlare provisions (or reuses) an **AI Gateway** in your account and routes every model request through it. Nothing leaves your Cloudflare tenancy.
 
- > ~~🎁 **Try Kimiflare Cloud free** — sign up and get **5 million tokens** on us until May 14, 2026. Run `kimiflare --cloud` or pick "Cloud (managed)" during onboarding.~~
+ You get this for free:
+
+ - **Per-request logs** with full payload, latency, and status — visible in the Cloudflare dashboard
+ - **Response caching** with configurable TTL (`/gateway cache-ttl <seconds>`)
+ - **Authoritative per-turn cost** pulled from the Gateway logs API — no estimates
+ - **Cache-hit ratio and per-feature cost breakdown** in `/cost`
+ - **Auto-tagging** of every request with `feature` / `sessionId` / `turnIdx` metadata for downstream attribution
 
  ## What to remember
 
@@ -39,7 +42,7 @@
  - **Smart permission modal** — Denying a tool opens inline feedback so you can tell the agent what to do instead. Keyboard-native navigation (`↑/↓`, `j/k`, `Alt+1/2/3`).
  - **Loop guardrails** — Agent hard-stops when all tools in a turn are blocked, preventing infinite token-burning cycles.
  - **Persistent all-time cost history** — Append-only `history.jsonl` tracks daily usage forever, so `/cost` shows true all-time and monthly totals that survive across sessions and version updates.
- - **Live cost tracking** — Status bar shows real-time spend based on Cloudflare pricing. Know exactly what each turn costs.
+ - **Live, gateway-confirmed cost tracking** — Status bar shows a fast local estimate (`≈$0.12`) that flips to the real, Cloudflare-billed number once the AI Gateway log reconciles. Per-turn latency renders next to cost.
  - **LSP + MCP** — Semantic code intelligence (hover, go-to-definition, references, diagnostics) via Language Server Protocol. Extend with external tools via Model Context Protocol.
  - **Local structured memory** — SQLite + embeddings cross-session memory. The agent recalls facts, instructions, and preferences across sessions via `remember`, `recall`, and `forget` tools.
  - **Web search, GitHub, and headless browser** — Research the web, read GitHub repos, and fetch JavaScript-rendered pages without leaving your terminal.
@@ -66,7 +69,7 @@ npm install -g kimiflare
  kimiflare
  ```
 
- On first run, an interactive onboarding wizard asks how you want to connect BYOK ~~or Cloud~~. That's it.
+ On first run, an interactive onboarding wizard collects your Cloudflare credentials and provisions (or picks) an AI Gateway. That's it.
 
  Or run without installing:
 
@@ -76,16 +79,9 @@ npx kimiflare
 
  Requires Node.js ≥ 20.
 
- ### AI Gateway (default)
-
- KimiFlare now routes Workers AI requests through your own **Cloudflare AI Gateway**. This unlocks:
-
- - Per-request payload logs in the Cloudflare dashboard
- - Response caching (set TTL with `/gateway cache-ttl <seconds>`)
- - Authoritative cost via the Gateway logs API, replacing local cost heuristics
- - Auto-tagging of every request with `feature` / `sessionId` / `turnIdx` metadata
+ ### Cloudflare API token
 
- The onboarding wizard creates or picks an AI Gateway for you. Your Cloudflare API token needs these permissions:
+ The onboarding wizard provisions or picks an AI Gateway in your account. Your Cloudflare API token needs:
 
  - `Workers AI:Read`
  - `AI Gateway:Read` (to list gateways)
@@ -93,9 +89,7 @@ The onboarding wizard creates or picks an AI Gateway for you. Your Cloudflare AP
 
  Edit your token at: https://dash.cloudflare.com/profile/api-tokens
 
- Once configured, run `/cost` in the TUI to see a Gateway section with cache hit ratio and direct dashboard links to each request log.
-
- For emergencies, set `KIMIFLARE_DISABLE_AI_GATEWAY=1` to fall back to the direct Workers AI path.
+ Once configured, `/cost` shows the Gateway-confirmed totals, cache hit ratio, per-feature breakdown, and direct dashboard links to each request log. `/gateway status` shows the current TTL, skip-cache flag, metadata tags, and live cache-hit ratio.
 
  ### One-shot mode
 
@@ -117,6 +111,7 @@ const { session } = await createAgentSession({
  config: {
  accountId: process.env.CLOUDFLARE_ACCOUNT_ID,
  apiToken: process.env.CLOUDFLARE_API_TOKEN,
+ aiGatewayId: process.env.CLOUDFLARE_AI_GATEWAY_ID,
  model: "@cf/moonshotai/kimi-k2.6",
  },
  });
@@ -149,7 +144,7 @@ session.dispose();
 
  #### SDK Authentication
 
- The SDK needs a Cloudflare **Account ID** and **API Token** to call Workers AI directly. Credentials are resolved in this priority order:
+ The SDK needs a Cloudflare **Account ID**, **API Token**, and AI Gateway ID. Credentials are resolved in this priority order:
 
  1. **Explicit `config` object** (recommended for apps)
  2. **Environment variables**: `CLOUDFLARE_ACCOUNT_ID` / `CF_ACCOUNT_ID`, `CLOUDFLARE_API_TOKEN` / `CF_API_TOKEN`
@@ -169,8 +164,6 @@ const { session } = await createAgentSession({
  });
  ```
 
- ~~**For zero-credential onboarding**, use KimiFlare Cloud mode. The user authenticates via GitHub device flow and a Cloudflare Worker proxies AI requests. Your app never sees raw Cloudflare credentials — only a GitHub token and `remoteWorkerUrl`.~~
-
  #### RPC mode (subprocess)
 
  If you need process isolation or a non-Node consumer, run KimiFlare in JSONL-over-stdio RPC mode:
@@ -228,7 +221,8 @@ kimiflare
  | `/memory` | Show memory stats and search |
  | `/mcp list` / `/mcp reload` | Manage MCP servers |
  | `/reasoning` | Toggle chain-of-thought display |
- | `/cost` | Show token usage for current turn |
+ | `/cost` | Show Gateway-confirmed cost, cache hit ratio, and per-feature breakdown |
+ | `/gateway status` | Show AI Gateway config and live cache-hit ratio |
  | `/update` | Check for updates |
  | `/help` | List all commands |