mimo2codex 0.1.15 → 0.1.16

Files changed (37)
  1. package/AGENTS.md +24 -5
  2. package/README.md +46 -5
  3. package/README.zh.md +46 -5
  4. package/dist/admin/router.js +117 -2
  5. package/dist/admin/router.js.map +1 -1
  6. package/dist/cli.js +67 -147
  7. package/dist/cli.js.map +1 -1
  8. package/dist/config.js +16 -10
  9. package/dist/config.js.map +1 -1
  10. package/dist/db/logs.js +80 -0
  11. package/dist/db/logs.js.map +1 -1
  12. package/dist/providers/generic.js +96 -0
  13. package/dist/providers/generic.js.map +1 -0
  14. package/dist/providers/genericLoader.js +229 -0
  15. package/dist/providers/genericLoader.js.map +1 -0
  16. package/dist/providers/registry.js +48 -10
  17. package/dist/providers/registry.js.map +1 -1
  18. package/dist/server.js +201 -1
  19. package/dist/server.js.map +1 -1
  20. package/dist/setup/snippets.js +187 -0
  21. package/dist/setup/snippets.js.map +1 -0
  22. package/dist/translate/reqToChat.js +1 -1
  23. package/dist/translate/reqToChat.js.map +1 -1
  24. package/dist/upstream/openaiCompatClient.js +32 -11
  25. package/dist/upstream/openaiCompatClient.js.map +1 -1
  26. package/dist/web/assets/index-D19ffnSJ.css +1 -0
  27. package/dist/web/assets/index-DPLJprJ4.js +67 -0
  28. package/dist/web/index.html +2 -2
  29. package/doc/generic-providers.md +399 -0
  30. package/doc/generic-providers.zh.md +399 -0
  31. package/mimoskill/SKILL.md +69 -8
  32. package/mimoskill/references/ocr_workflow.md +216 -0
  33. package/mimoskill/scripts/generate_image.py +163 -0
  34. package/mimoskill/scripts/ocr.py +396 -0
  35. package/package.json +5 -4
  36. package/dist/web/assets/index-BoykBCnY.js +0 -67
  37. package/dist/web/assets/index-DAJbSznk.css +0 -1
@@ -4,8 +4,8 @@
4
4
  <meta charset="UTF-8" />
5
5
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
6
6
  <title>mimo2codex 控制台</title>
7
- <script type="module" crossorigin src="/admin/assets/index-BoykBCnY.js"></script>
8
- <link rel="stylesheet" crossorigin href="/admin/assets/index-DAJbSznk.css">
7
+ <script type="module" crossorigin src="/admin/assets/index-DPLJprJ4.js"></script>
8
+ <link rel="stylesheet" crossorigin href="/admin/assets/index-D19ffnSJ.css">
9
9
  </head>
10
10
  <body>
11
11
  <div id="root"></div>
@@ -0,0 +1,399 @@
1
+ # Generic OpenAI-compatible Providers · Detailed Guide
2
+
3
+ > English · [中文](./generic-providers.zh.md)
4
+ >
5
+ > Back to: [README English](../README.md) · [README 中文](../README.zh.md)
6
+
7
+ mimo2codex ships with two built-in providers — MiMo and DeepSeek. The **generic provider mechanism** lets you wire any **OpenAI Chat Completions-compatible** or **native Responses API** upstream to the latest Codex without modifying any code: Qwen, GLM, Kimi, Zhipu, OpenAI itself, local vLLM, Ollama, LM Studio — anything with an OpenAI-shaped HTTP interface.
8
+
9
+ ## What it solves
10
+
11
+ The latest Codex hard-requires `wire_api = "responses"`, but almost every third-party model only exposes Chat Completions. mimo2codex does the translation; you just register your upstream in a config file.
12
+
13
+ Two wire-protocol modes are supported:
14
+
15
+ | `wireApi` | Upstream protocol | When to use |
16
+ |---|---|---|
17
+ | `chat` (default) | OpenAI Chat Completions | Most third-party providers (Qwen / GLM / DeepSeek / Kimi / Ollama / vLLM …) |
18
+ | `responses` | OpenAI Responses API | Upstream natively speaks Responses (OpenAI itself, future-leaning providers). Direct passthrough — no protocol translation |
19
+
20
+ The `responses` passthrough has a side benefit: **when the upstream's protocol evolves, you don't wait for mimo2codex to catch up** — new fields flow straight through without being stripped by an outdated translator.
21
+
22
+ ## 60-second start
23
+
24
+ **Simplest path** — one provider, three env vars.
25
+
26
+ ```bash
27
+ export GENERIC_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
28
+ export GENERIC_API_KEY=sk-your-qwen-key
29
+ export GENERIC_DEFAULT_MODEL=qwen3-max
30
+ mimo2codex --model generic
31
+ ```
32
+
33
+ The startup banner prints `provider: generic` and `upstream: https://dashscope...`. Running `mimo2codex print-config --model generic` then outputs the `auth.json` and `config.toml` snippets, which you paste into `~/.codex/`.
34
+
35
+ > ⚠️ env-only mode supports **one** upstream. For multiple, use the `providers.json` route below.
36
+
37
+ ## Config-file route (multi-instance, recommended)
38
+
39
+ Write a `providers.json` with one entry per upstream. Default path:
40
+
41
+ | OS | Path |
42
+ |---|---|
43
+ | macOS / Linux | `~/.mimo2codex/providers.json` |
44
+ | Windows | `%USERPROFILE%\.mimo2codex\providers.json` |
45
+
46
+ Override with `MIMO2CODEX_PROVIDERS_FILE=/some/path/providers.json`.
47
+
48
+ Full example:
49
+
50
+ ```json
51
+ {
52
+ "providers": [
53
+ {
54
+ "id": "qwen",
55
+ "shortcut": "qwen",
56
+ "displayName": "Qwen (DashScope)",
57
+ "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
58
+ "envKey": "QWEN_API_KEY",
59
+ "defaultModel": "qwen3-max",
60
+ "wireApi": "chat",
61
+ "models": [
62
+ { "id": "qwen3-max", "contextWindow": 262144 },
63
+ { "id": "qwen3-coder-plus", "contextWindow": 1048576 }
64
+ ],
65
+ "features": { "forceParallelToolCalls": true }
66
+ },
67
+ {
68
+ "id": "kimi",
69
+ "shortcut": "kimi",
70
+ "displayName": "Kimi K2",
71
+ "baseUrl": "https://api.moonshot.cn/v1",
72
+ "envKey": "KIMI_API_KEY",
73
+ "defaultModel": "kimi-k2-0905-preview"
74
+ },
75
+ {
76
+ "id": "ollama",
77
+ "shortcut": "ol",
78
+ "displayName": "Ollama (local)",
79
+ "baseUrl": "http://127.0.0.1:11434/v1",
80
+ "envKey": "OLLAMA_API_KEY",
81
+ "defaultModel": "qwen2.5-coder:7b"
82
+ },
83
+ {
84
+ "id": "openai-native",
85
+ "displayName": "OpenAI (native Responses)",
86
+ "baseUrl": "https://api.openai.com/v1",
87
+ "envKey": "OPENAI_API_KEY",
88
+ "defaultModel": "gpt-5",
89
+ "wireApi": "responses"
90
+ }
91
+ ]
92
+ }
93
+ ```
94
+
95
+ Then start:
96
+
97
+ ```bash
98
+ export QWEN_API_KEY=sk-...
99
+ export KIMI_API_KEY=sk-...
100
+ mimo2codex --model qwen # default provider = qwen
101
+ ```
102
+
103
+ `--model` accepts either `id` or `shortcut`.
104
+
105
+ ## Field reference
106
+
107
+ | Field | Required | Default | Notes |
108
+ |---|---|---|---|
109
+ | `id` | ✓ | — | Unique identifier. Cannot be `mimo` / `deepseek` (reserved). Alphanumeric + `-` / `_` only |
110
+ | `displayName` | — | id | Shown in UI and print-config output |
111
+ | `shortcut` | — | id | Used with `--model <shortcut>` |
112
+ | `baseUrl` | ✓ | — | Upstream base URL (**do not** include `/chat/completions` — mimo2codex appends paths) |
113
+ | `envKey` | ✓ | — | Env var to read the API key from (e.g. `QWEN_API_KEY`) |
114
+ | `defaultModel` | ✓ | — | Fallback when the client specifies no model or sends an unknown id |
115
+ | `wireApi` | — | `"chat"` | `"chat"` or `"responses"`, see above |
116
+ | `models` | — | `[]` | Declared model catalog for this provider (see next section) |
117
+ | `features.forceParallelToolCalls` | — | `false` | Force `parallel_tool_calls: true` (recommended for agentic-coding upstreams) |
118
+ | `features.webSearch` | — | `false` | Forward Codex's `web_search` tool (only meaningful if upstream has a builtin web_search) |
119
+ | `docsUrl` | — | — | Link shown in the "missing API key" error |
120
+
121
+ Each `models[]` entry:
122
+
123
+ | Field | Required | Notes |
124
+ |---|---|---|
125
+ | `id` | ✓ | The upstream's real model id |
126
+ | `aliases` | — | Client-side names that route to this model too |
127
+ | `displayName` | — | UI label |
128
+ | `contextWindow` | — | Emitted as `model_context_window` in print-config |
129
+ | `maxOutputTokens` | — | Emitted as `model_max_output_tokens` in print-config |
130
+ | `supportsImages` / `supportsReasoning` / `supportsWebSearch` | — | Metadata, surfaced in admin UI |
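
The two field tables can be read as a type plus a small id check. The sketch below is illustrative only: the names `GenericProviderSpec`, `GenericModelSpec`, `validateProviderId`, and `applyDefaults` are hypothetical and are not taken from the package source. The error strings are copied from the Troubleshooting section, and the defaults follow the "Default" column above.

```typescript
// Hypothetical shapes mirroring the field tables; not the package's real types.
interface GenericModelSpec {
  id: string;
  aliases?: string[];
  displayName?: string;
  contextWindow?: number;
  maxOutputTokens?: number;
  supportsImages?: boolean;
  supportsReasoning?: boolean;
  supportsWebSearch?: boolean;
}

interface GenericProviderSpec {
  id: string;
  displayName?: string;
  shortcut?: string;
  baseUrl: string;
  envKey: string;
  defaultModel: string;
  wireApi?: "chat" | "responses";
  models?: GenericModelSpec[];
  features?: { forceParallelToolCalls?: boolean; webSearch?: boolean };
  docsUrl?: string;
}

// Built-in provider ids that a generic provider may not reuse.
const RESERVED_IDS: string[] = ["mimo", "deepseek"];

// Returns an error message, or null when the id is acceptable.
function validateProviderId(id: string): string | null {
  if (!/^[A-Za-z0-9_-]+$/.test(id)) {
    return `provider id "${id}" must be alphanumeric + dash/underscore`;
  }
  if (RESERVED_IDS.indexOf(id) !== -1) {
    return `generic provider id "${id}" conflicts with a built-in provider`;
  }
  return null;
}

// Fills in the documented defaults for the optional top-level fields.
function applyDefaults(spec: GenericProviderSpec) {
  return {
    ...spec,
    displayName: spec.displayName ?? spec.id,
    shortcut: spec.shortcut ?? spec.id,
    wireApi: spec.wireApi ?? "chat",
    models: spec.models ?? [],
  };
}
```

With this reading, a minimal entry that sets only the four required fields ends up with `shortcut` equal to its `id`, `wireApi` set to `"chat"`, and an empty `models` list, which is the open-catalog case.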
131
+
132
+ ## Model identification strategy
133
+
134
+ `models[]` is **optional**. Two modes:
135
+
136
+ **1. With declared `models[]` (strict mode)**
137
+
138
+ Only ids in `models[]` (or their aliases) are considered "owned" by this provider. `byClientModel` routes by exact match. If the client sends an unlisted id:
139
+ - And this provider is the **default** → rewrite to `defaultModel`, log a `rewriteNotice`
140
+ - Otherwise → falls through to the default provider's fallback
141
+
142
+ Good for: known model lineup, wanting print-config to emit `model_context_window`, clean admin UI catalog.
143
+
144
+ **2. No `models[]` (open-catalog passthrough)**
145
+
146
+ Whatever model id the client sends, forward verbatim. **No rewriting, no errors.**
147
+
148
+ Good for: upstreams with fast-changing catalogs (Ollama, OpenRouter), when you just want a pipe.
149
+
150
+ > Open-catalog generics are **not** auto-matched by `byClientModel` — otherwise they'd "swallow" every mimo / deepseek id. To route to them, set them as the default provider with `--model <id>`.
151
+
152
+ ## wireApi explained
153
+
154
+ **`chat`**: mimo2codex translates Codex's Responses request to Chat Completions, sends to `${baseUrl}/chat/completions`, translates the response back.
155
+
156
+ ```
157
+ Codex ──[Responses]──> mimo2codex ──[Chat]──> upstream ──[Chat]──> mimo2codex ──[Responses]──> Codex
158
+ ```
159
+
160
+ **`responses`**: mimo2codex forwards Codex's request **as-is** to `${baseUrl}/responses`. No translation either direction.
161
+
162
+ ```
163
+ Codex ──[Responses]──> mimo2codex ──[Responses raw]──> upstream ──[Responses raw]──> mimo2codex ──> Codex
164
+ ```
165
+
166
+ Use `responses` when:
167
+
168
+ - Upstream is OpenAI itself
169
+ - Upstream claims "full OpenAI Responses API parity"
170
+ - Upstream supports fields chat completions can't carry (`reasoning.effort`, `text.verbosity`, new tool types) and you don't want them dropped
171
+
172
+ Notes:
173
+
174
+ - Streaming passthrough is a **byte-level pipe**: upstream SSE frames are forwarded unmodified and Codex's parser handles the framing. This lowers overhead, but it also means mimo2codex makes no modifications mid-stream
175
+ - Admin UI's per-model token stats can only extract top-level `usage` fields on the `responses` path; nested usage breakdowns aren't parsed
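
To make the path difference between the two modes concrete, here is a small sketch of how the upstream URL could be derived from `baseUrl` and `wireApi`. The function name `upstreamUrl` is hypothetical, and the trailing-slash trimming is an assumption; the guide only states that the paths `/chat/completions` and `/responses` are appended to `baseUrl`.

```typescript
type WireApi = "chat" | "responses";

// Hypothetical helper: builds the endpoint the proxy would call for a provider.
function upstreamUrl(baseUrl: string, wireApi: WireApi = "chat"): string {
  // Assumption: tolerate a trailing slash so the joined path has exactly one "/".
  const base = baseUrl.replace(/\/+$/, "");
  return wireApi === "responses"
    ? `${base}/responses`
    : `${base}/chat/completions`;
}

// Example inputs taken from the providers.json sample in this guide.
const openaiUrl = upstreamUrl("https://api.openai.com/v1", "responses");
const ollamaUrl = upstreamUrl("http://127.0.0.1:11434/v1");
```

This also shows why `baseUrl` should stop at `/v1` (or the provider's equivalent): including `/chat/completions` in it would duplicate the path segment.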
176
+
177
+ ## Real-world examples
178
+
179
+ ### Alibaba Qwen (DashScope OpenAI-compatible mode)
180
+
181
+ ```json
182
+ {
183
+ "id": "qwen",
184
+ "displayName": "Qwen (DashScope)",
185
+ "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
186
+ "envKey": "QWEN_API_KEY",
187
+ "defaultModel": "qwen3-max",
188
+ "models": [
189
+ { "id": "qwen3-max", "contextWindow": 262144 },
190
+ { "id": "qwen3-coder-plus", "contextWindow": 1048576, "supportsReasoning": true }
191
+ ],
192
+ "features": { "forceParallelToolCalls": true }
193
+ }
194
+ ```
195
+
196
+ ### Zhipu GLM
197
+
198
+ ```json
199
+ {
200
+ "id": "glm",
201
+ "displayName": "Zhipu GLM-4.6",
202
+ "baseUrl": "https://open.bigmodel.cn/api/paas/v4",
203
+ "envKey": "ZHIPU_API_KEY",
204
+ "defaultModel": "glm-4.6",
205
+ "models": [
206
+ { "id": "glm-4.6", "contextWindow": 200000 }
207
+ ]
208
+ }
209
+ ```
210
+
211
+ ### Moonshot Kimi
212
+
213
+ ```json
214
+ {
215
+ "id": "kimi",
216
+ "displayName": "Kimi K2",
217
+ "baseUrl": "https://api.moonshot.cn/v1",
218
+ "envKey": "KIMI_API_KEY",
219
+ "defaultModel": "kimi-k2-0905-preview",
220
+ "models": [
221
+ { "id": "kimi-k2-0905-preview", "contextWindow": 256000 }
222
+ ]
223
+ }
224
+ ```
225
+
226
+ ### Local Ollama / LM Studio (open-catalog)
227
+
228
+ ```json
229
+ {
230
+ "id": "ollama",
231
+ "shortcut": "ol",
232
+ "displayName": "Ollama (local)",
233
+ "baseUrl": "http://127.0.0.1:11434/v1",
234
+ "envKey": "OLLAMA_API_KEY",
235
+ "defaultModel": "qwen2.5-coder:7b"
236
+ }
237
+ ```
238
+
239
+ Ollama doesn't validate API keys, but `envKey` is schema-required — just set anything (`OLLAMA_API_KEY=ignored`).
240
+
241
+ ### OpenAI native Responses (passthrough)
242
+
243
+ ```json
244
+ {
245
+ "id": "openai-native",
246
+ "displayName": "OpenAI (native Responses)",
247
+ "baseUrl": "https://api.openai.com/v1",
248
+ "envKey": "OPENAI_API_KEY",
249
+ "defaultModel": "gpt-5",
250
+ "wireApi": "responses"
251
+ }
252
+ ```
253
+
254
+ ## Default provider & routing rules (important)
255
+
256
+ After adding generic providers, routing priority:
257
+
258
+ 1. Client `model` matches some provider's `models[]` (incl. aliases) **and that provider has a key** → route there
259
+ 2. Catalog matched but the provider has no key → falls through to the default provider; model is rewritten to its `defaultModel`, logged as `client_model_rewritten`
260
+ 3. Open-catalog provider (no `models[]`) → skipped during auto-routing (so it doesn't swallow unknown ids); reachable only by setting it as the default with `--model <id>`
261
+ 4. Nothing matches → falls back to the default provider, rewriting model to its `defaultModel`, logged as `client_model_rewritten`
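
The four rules above can be sketched as a single function. This is one reading of the documented behavior, not the package's implementation: the `Provider` and `Route` shapes and the `route` function are hypothetical, and the branch that forwards verbatim to an open-catalog default is inferred from the "open-catalog passthrough" description earlier in this guide.

```typescript
// Hypothetical runtime view of a registered provider.
interface Provider {
  id: string;
  defaultModel: string;
  hasKey: boolean;
  // Declared model ids plus aliases; an empty list means open-catalog.
  modelIds: string[];
}

interface Route {
  providerId: string;
  model: string;
  rewritten: boolean; // true corresponds to a client_model_rewritten log entry
}

function route(clientModel: string, providers: Provider[], defaultId: string): Route {
  const def = providers.find((p) => p.id === defaultId);
  if (!def) throw new Error(`unknown default provider "${defaultId}"`);

  for (const p of providers) {
    // Rule 3: open-catalog providers (empty modelIds) never auto-match.
    // Rule 1: a catalog match only wins when that provider has a key.
    if (p.modelIds.indexOf(clientModel) !== -1 && p.hasKey) {
      return { providerId: p.id, model: clientModel, rewritten: false };
    }
  }

  // Assumption: an open-catalog default forwards the client's id verbatim.
  if (def.modelIds.length === 0) {
    return { providerId: def.id, model: clientModel, rewritten: false };
  }

  // Rules 2 and 4: fall back to the default provider and rewrite the model.
  return { providerId: def.id, model: def.defaultModel, rewritten: true };
}
```

For example, with a keyed `mimo` default and a `qwen` entry that declares `qwen3-max` but has no key, `route("qwen3-max", ...)` resolves to `mimo` with the model rewritten to `mimo`'s default.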
262
+
263
+ Default provider selection:
264
+
265
+ - `--model <id-or-shortcut>` takes priority
266
+ - Otherwise `MIMO2CODEX_DEFAULT_PROVIDER` env var
267
+ - Otherwise falls back to `"mimo"`
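
The selection order above, combined with the earlier statement that `--model` accepts either `id` or `shortcut`, could look like the sketch below. The names `ProviderRef` and `resolveDefaultProvider` are hypothetical. Whether `MIMO2CODEX_DEFAULT_PROVIDER` also accepts a shortcut is not stated in this guide; the sketch assumes it does.

```typescript
// Hypothetical minimal view of a registered provider for lookup purposes.
interface ProviderRef {
  id: string;
  shortcut?: string; // defaults to id per the field reference
}

function resolveDefaultProvider(
  cliModel: string | undefined,
  env: Record<string, string | undefined>,
  providers: ProviderRef[],
): string {
  // Priority: --model, then the env var, then the built-in "mimo".
  const wanted = cliModel || env["MIMO2CODEX_DEFAULT_PROVIDER"] || "mimo";
  const hit = providers.find(
    (p) => p.id === wanted || (p.shortcut ?? p.id) === wanted,
  );
  if (!hit) throw new Error(`unknown provider "${wanted}"`);
  return hit.id;
}
```

With the sample `providers.json` from this guide, passing `ol` as the `--model` value would resolve to the `ollama` entry.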
268
+
269
+ ### What "no key" actually does
270
+
271
+ A common foot-gun: you configure qwen / kimi / glm in `providers.json` but only set `MIMO_API_KEY` at startup. Then:
272
+
273
+ ```bash
274
+ # Client sends qwen3-max
275
+ # → byClientModel matches the qwen catalog
276
+ # → qwen has no key → fall through
277
+ # → default provider mimo → model rewritten to mimo-v2.5-pro
278
+ # → the response actually comes from MiMo on mimo-v2.5-pro
279
+ ```
280
+
281
+ **No mid-conversation warning.** The admin "model mappings" table shows the `qwen3-max → mimo-v2.5-pro` rewrite, and chat logs carry the `client_model_rewritten` error code. But if you don't open the admin UI, it's easy to believe you're using Qwen when you're actually using MiMo.
282
+
283
+ To avoid this silent fallback, today's options:
284
+
285
+ 1. **Make sure all keys are set up-front** — the admin Dashboard's Provider cards explicitly show "key detected / not detected" per provider; set all the keys you intend to use
286
+ 2. **Single-provider startup**: to use qwen specifically, start with `mimo2codex --model qwen` and leave `MIMO_API_KEY` unset. If qwen then has no key, startup errors out instead of silently downgrading
287
+
288
+ > Existing mimo / deepseek users **are not affected**: without `providers.json`, the default provider stays `mimo` and all behavior is byte-identical.
289
+
290
+ ## Manage in admin webui (no manual JSON editing)
291
+
292
+ Open `http://127.0.0.1:8788/admin/`:
293
+
294
+ - **Generic Providers page** (sidebar, [`/admin/providers`](http://127.0.0.1:8788/admin/providers)): visual CRUD for generic providers
295
+ - Table lists every entry in `providers.json`; each row has Edit / Delete
296
+ - "+ Add Provider" opens a form with placeholders, inline validation (id can't collide with builtins, no spaces, baseUrl required, etc.)
297
+ - Models list is dynamically editable — each model takes contextWindow / maxOutputTokens / vision / reasoning / web search metadata
298
+ - "Edit raw JSON" escape hatch — edit the full `providers.json` text, only writes when JSON validates
299
+ - On save, writes `~/.mimo2codex/providers.json` and shows a **"Restart mimo2codex to apply"** banner — there is no hot reload; the registry initializes once at startup
300
+ - **Setup guide** ([`/admin/setup`](http://127.0.0.1:8788/admin/setup)): provider dropdown, three tabs auto-render `auth.json + config.toml` for direct / env-key / cc-switch flows. Each code block has a Copy button
301
+ - **Dashboard**: all registered providers (including generics) shown in Provider cards with key-presence status
302
+ - **Logs**: filter by provider (generic ids appear in the dropdown)
303
+
304
+ > Note: the UI **does not manage API keys** — keys are not stored in the database or any config file; they must be injected via environment variables (e.g. `QWEN_API_KEY=sk-...`). This avoids credentials landing on disk and getting backed up or leaked. UI handles schema config, env handles secrets.
305
+
306
+ ## CLI subcommands
307
+
308
+ ```bash
309
+ mimo2codex print-config --model qwen # qwen's auth.json + config.toml snippets
310
+ mimo2codex print-config --model qwen --env-key # env-key variant (Codex CLI only)
311
+ mimo2codex print-cc-switch --model qwen # cc-switch custom-provider snippets
312
+ ```
313
+
314
+ `model_provider` naming convention in the toml:
315
+
316
+ - mimo → `[model_providers.mimo]` (legacy preserved)
317
+ - deepseek → `[model_providers.mimo2codex]` (legacy preserved)
318
+ - other generics → `[model_providers.mimo2codex-<id>]` (prefixed to avoid colliding with the user's existing toml sections)
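
The naming convention above maps directly to a small function. The name `tomlProviderKey` is hypothetical; only the three mapping rules come from this guide.

```typescript
// Hypothetical helper: returns the key used under [model_providers.<key>].
function tomlProviderKey(providerId: string): string {
  if (providerId === "mimo") return "mimo"; // legacy name preserved
  if (providerId === "deepseek") return "mimo2codex"; // legacy name preserved
  return `mimo2codex-${providerId}`; // prefixed to avoid clobbering user sections
}

// Section header as it would appear in config.toml for a generic "qwen" provider.
const qwenSection = `[model_providers.${tomlProviderKey("qwen")}]`;
```

An existing `[model_providers.qwen]` section in the user's `~/.codex/config.toml` therefore stays untouched, because the generated section is `[model_providers.mimo2codex-qwen]`.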
319
+
320
+ ## Troubleshooting
321
+
322
+ <details>
323
+ <summary><b><code>provider id "xxx" must be alphanumeric + dash/underscore</code></b></summary>
324
+
325
+ `id` only allows letters, digits, `-`, `_`. No spaces, dots, slashes. Use `kimi`, `my-qwen`, `local_dev` etc.
326
+
327
+ </details>
328
+
329
+ <details>
330
+ <summary><b><code>generic provider id "mimo" conflicts with a built-in provider</code></b></summary>
331
+
332
+ `mimo` and `deepseek` are reserved. Rename to e.g. `mimo-custom`.
333
+
334
+ </details>
335
+
336
+ <details>
337
+ <summary><b><code>missing API key for ...</code> but I set the env</b></summary>
338
+
339
+ Check:
340
+ 1. The env var name matches `envKey` in the spec exactly (case-sensitive)
341
+ 2. The variable is set in the same shell that launches mimo2codex. A PowerShell `$env:X` assignment is not visible in a separate cmd session, and vice versa
342
+ 3. You actually set the key for the provider that `MIMO2CODEX_DEFAULT_PROVIDER` (or `--model`) points at — the default provider must have a key, or startup fails
343
+
344
+ </details>
345
+
346
+ <details>
347
+ <summary><b>Startup banner doesn't show my generic provider</b></summary>
348
+
349
+ - Banner only lists providers with **API keys set**. Check `envKey`
350
+ - Verify the providers.json path: `~/.mimo2codex/` or explicit `MIMO2CODEX_PROVIDERS_FILE`
351
+ - JSON syntax errors fail startup loudly — they don't silently skip
352
+
353
+ </details>
354
+
355
+ <details>
356
+ <summary><b>Routing wrong — sent qwen3-max but got mimo</b></summary>
357
+
358
+ If your generic doesn't declare `models[]`, it **won't** be auto-matched by `byClientModel`. Two fixes:
359
+ - Add `"models": [{ "id": "qwen3-max" }]` to the spec (recommended)
360
+ - Or make the generic the default provider: `mimo2codex --model qwen`
361
+
362
+ </details>
363
+
364
+ <details>
365
+ <summary><b>Upstream 400, error says reasoning / thinking field not recognized</b></summary>
366
+
367
+ Non-MiMo upstreams usually don't understand MiMo's `thinking` field. Generic providers already strip these by default. If you still see this, run with `--verbose` to inspect the actual forwarded body — likely something Codex itself emitted that the upstream doesn't accept (a Codex-side compat issue, unrelated to the proxy).
368
+
369
+ </details>
370
+
371
+ <details>
372
+ <summary><b>wireApi: "responses" upstream returns 404 / 405</b></summary>
373
+
374
+ The upstream probably doesn't implement `/v1/responses`. Most third parties only have `/v1/chat/completions` today — set `wireApi` back to `"chat"` (or omit it; default is chat).
375
+
376
+ </details>
377
+
378
+ <details>
379
+ <summary><b>Same id appears twice in providers.json</b></summary>
380
+
381
+ Startup fails with an error. Each id must be unique.
382
+
383
+ </details>
384
+
385
+ ## Design notes
386
+
387
+ - **Why is the default provider still mimo?** Backwards compatibility. Existing mimo / deepseek users see zero behavior change after upgrading
388
+ - **Why don't open-catalog generics participate in `byClientModel`?** They'd swallow every unknown id, including mimo / deepseek's legit ids. They need an explicit `--model <id>` to be used as a catch-all
389
+ - **Why the `mimo2codex-` prefix on toml provider keys?** Users' `~/.codex/config.toml` may already have `[model_providers.qwen]` (pointing directly at Qwen). The prefix avoids overwriting it
390
+ - **Why no admin-UI form for editing generic providers?** Ship "it works" first. UI forms can be added later without changing the architecture (providers.json was already a config file)
391
+
392
+ ## Source files
393
+
394
+ - [src/providers/generic.ts](../src/providers/generic.ts) — factory function
395
+ - [src/providers/genericLoader.ts](../src/providers/genericLoader.ts) — config loading + env fallback
396
+ - [src/providers/registry.ts](../src/providers/registry.ts) — runtime registration + routing guards
397
+ - [src/upstream/openaiCompatClient.ts](../src/upstream/openaiCompatClient.ts) — chat / responses upstream clients
398
+ - [src/server.ts](../src/server.ts) — wireApi branch in `handleResponses`
399
+ - [test/providers.generic.test.ts](../test/providers.generic.test.ts) — 18 test cases