@oneciel-ai/claude-any 0.1.24

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (60) hide show
  1. package/LICENSE +22 -0
  2. package/NOTICE +9 -0
  3. package/README.md +435 -0
  4. package/claude-any-menu.py +1851 -0
  5. package/claude-any-tool-guard.py +440 -0
  6. package/claude_any.py +6039 -0
  7. package/docs/README.ja.md +372 -0
  8. package/docs/README.ko.md +373 -0
  9. package/docs/README.zh.md +352 -0
  10. package/docs/assets/claude-any-base-url.en.png +0 -0
  11. package/docs/assets/claude-any-base-url.ja.png +0 -0
  12. package/docs/assets/claude-any-base-url.ko.png +0 -0
  13. package/docs/assets/claude-any-base-url.png +0 -0
  14. package/docs/assets/claude-any-base-url.zh.png +0 -0
  15. package/docs/assets/claude-any-demo.en.gif +0 -0
  16. package/docs/assets/claude-any-demo.en.mp4 +0 -0
  17. package/docs/assets/claude-any-demo.gif +0 -0
  18. package/docs/assets/claude-any-demo.ja.gif +0 -0
  19. package/docs/assets/claude-any-demo.ja.mp4 +0 -0
  20. package/docs/assets/claude-any-demo.ko.gif +0 -0
  21. package/docs/assets/claude-any-demo.ko.mp4 +0 -0
  22. package/docs/assets/claude-any-demo.mp4 +0 -0
  23. package/docs/assets/claude-any-demo.zh.gif +0 -0
  24. package/docs/assets/claude-any-demo.zh.mp4 +0 -0
  25. package/docs/assets/claude-any-main.en.png +0 -0
  26. package/docs/assets/claude-any-main.ja.png +0 -0
  27. package/docs/assets/claude-any-main.ko.png +0 -0
  28. package/docs/assets/claude-any-main.png +0 -0
  29. package/docs/assets/claude-any-main.zh.png +0 -0
  30. package/docs/assets/claude-any-model.en.png +0 -0
  31. package/docs/assets/claude-any-model.ja.png +0 -0
  32. package/docs/assets/claude-any-model.ko.png +0 -0
  33. package/docs/assets/claude-any-model.png +0 -0
  34. package/docs/assets/claude-any-model.zh.png +0 -0
  35. package/docs/assets/claude-any-nvidia-nim.gif +0 -0
  36. package/docs/assets/claude-any-ollama-cloud.gif +0 -0
  37. package/docs/assets/claude-any-options.en.png +0 -0
  38. package/docs/assets/claude-any-options.ja.png +0 -0
  39. package/docs/assets/claude-any-options.ko.png +0 -0
  40. package/docs/assets/claude-any-options.png +0 -0
  41. package/docs/assets/claude-any-options.zh.png +0 -0
  42. package/docs/assets/claude-any-provider.en.png +0 -0
  43. package/docs/assets/claude-any-provider.ja.png +0 -0
  44. package/docs/assets/claude-any-provider.ko.png +0 -0
  45. package/docs/assets/claude-any-provider.png +0 -0
  46. package/docs/assets/claude-any-provider.zh.png +0 -0
  47. package/docs/assets/claude-any-test.en.png +0 -0
  48. package/docs/assets/claude-any-test.ja.png +0 -0
  49. package/docs/assets/claude-any-test.ko.png +0 -0
  50. package/docs/assets/claude-any-test.png +0 -0
  51. package/docs/assets/claude-any-test.zh.png +0 -0
  52. package/docs/github-descriptions.md +235 -0
  53. package/docs/manual.md +496 -0
  54. package/install.ps1 +24 -0
  55. package/install.sh +19 -0
  56. package/npm-bin/claude-any-stop.js +6 -0
  57. package/npm-bin/claude-any.js +5 -0
  58. package/npm-bin/claude-anyctl.js +5 -0
  59. package/npm-bin/run-claude-any.js +51 -0
  60. package/package.json +45 -0
package/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 One Ciel LLC
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
22
+
package/NOTICE ADDED
@@ -0,0 +1,9 @@
1
+ Claude Any
2
+ Copyright (c) 2026 One Ciel LLC
3
+
4
+ Credits: One Ciel LLC
5
+
6
+ Claude Code is a product of Anthropic. NVIDIA, NIM, Ollama, and vLLM names
7
+ belong to their respective owners. This project is an independent compatibility
8
+ wrapper and is not endorsed by those providers.
9
+
package/README.md ADDED
@@ -0,0 +1,435 @@
1
+ # Claude Any
2
+
3
+ | English | [한국어](docs/README.ko.md) | [日本語](docs/README.ja.md) | [中文](docs/README.zh.md) |
4
+ | --- | --- | --- | --- |
5
+
6
+ > ## 🚀 Use the full Claude Code experience with free or low-cost LLMs
7
+ >
8
+ > - **Free** — [NVIDIA hosted NIM](https://build.nvidia.com/) (qwen3-coder-480b, gpt-oss, and friends) through the API Catalog.
9
+ > - **Low-cost** — [Ollama Cloud](https://ollama.com/cloud) for GLM, Qwen, DeepSeek, and other open-weight models at a fraction of frontier-model pricing.
10
+ > - **Free + local** — [Ollama](https://ollama.com/) or [vLLM](https://github.com/vllm-project/vllm) on your own GPU, fully offline.
11
+ >
12
+ > Provider, model, base URL, API key, streaming behavior, and LLM options are all selected from a console menu **before** Claude Code starts. Claude Code itself runs untouched with all of its native tooling, slash commands, and workflows.
13
+
14
+ ### Demo
15
+
16
+ ![NVIDIA hosted NIM driving Claude Code (deepseek-4-flash)](docs/assets/claude-any-nvidia-nim.gif)
17
+
18
+ NVIDIA hosted NIM (deepseek-4-flash) driving Claude Code through the claude-any router.  [full mp4 ⤓](https://github.com/OneCielAI/claude-any/raw/main/demo/claude-any-nvidia-nim.mp4)
19
+
20
+ ![Ollama Cloud streamed through the claude-any router (glm-5.1)](docs/assets/claude-any-ollama-cloud.gif)
21
+
22
+ Ollama Cloud (glm-5.1) streamed through the claude-any router with SSE word-boundary chunking enabled.  [full mp4 ⤓](https://github.com/OneCielAI/claude-any/raw/main/demo/claude-any-ollama-cloud.mp4)
23
+
24
+ ---
25
+
26
+ Claude Any is a provider selector and compatibility launcher for Claude Code.
27
+ It lets you choose Anthropic, Ollama, Ollama Cloud, vLLM, NVIDIA hosted models,
28
+ or self-hosted NIM before Claude Code starts, then passes normal Claude Code
29
+ arguments through unchanged.
30
+
31
+ Credits: One Ciel LLC
32
+
33
+ Current version: `0.1.24`
34
+
35
+ ## Why This Exists
36
+
37
+ Claude Any started from a practical need: even on the highest Claude Code plan,
38
+ long sessions can run out of available tokens or become blocked while waiting
39
+ for the next quota window. The goal is not to replace Claude Code, but to keep
40
+ work moving. Slower but usable providers such as NVIDIA NIM, Ollama Cloud,
41
+ vLLM, and local Ollama can act as hybrid third-party agents for summaries,
42
+ research, journaling, simple coding tasks, and delegated background work.
43
+
44
+ Another design goal is to keep as much of Claude Code's native experience as
45
+ possible. When a provider exposes an Anthropic-compatible endpoint, Claude Any
46
+ prefers that path so Claude Code tooling, permissions, model selection, and
47
+ workflow behavior remain close to the original. For capabilities that remote
48
+ providers cannot supply directly, such as web search, Claude Any adds separate
49
+ MCP-based tooling.
50
+
51
+ The pre-launch menu is console-first. Provider, model, base URL, API key, and
52
+ options are meant to be easy to review and change before Claude Code starts,
53
+ including over SSH.
54
+
55
+ macOS has not been fully tested by the maintainer yet, but Claude Any uses
56
+ portable Python and shell wrappers. If you hit a macOS issue, please report it.
57
+
58
+ - D. Yun
59
+
60
+ ## Install
61
+
62
+ Requirements:
63
+
64
+ - Python 3.10+
65
+ - Claude Code installed and available as `claude`
66
+ - Node/npm only if you enable MCP web tooling
67
+
68
+ Current install from GitHub:
69
+
70
+ ```sh
71
+ npm install -g https://github.com/OneCielAI/claude-any.git
72
+ claude-any
73
+ ```
74
+
75
+ Source install:
76
+
77
+ ```sh
78
+ git clone https://github.com/OneCielAI/claude-any.git
79
+ cd claude-any
80
+ ./install.sh
81
+ claude-any
82
+ ```
83
+
84
+ Windows PowerShell source install:
85
+
86
+ ```powershell
87
+ git clone https://github.com/OneCielAI/claude-any.git
88
+ cd claude-any
89
+ .\install.ps1
90
+ claude-any
91
+ ```
92
+
93
+ Registry install, after the first npm publish:
94
+
95
+ ```sh
96
+ npm install -g @oneciel-ai/claude-any
97
+ claude-any
98
+ ```
99
+
100
+ Upgrade:
101
+
102
+ ```sh
103
+ # GitHub install, current recommended path
104
+ npm install -g https://github.com/OneCielAI/claude-any.git --force
105
+ claude-any version
106
+ ```
107
+
108
+ To make `npm update -g @oneciel-ai/claude-any` work, the package must be
109
+ published to the public npm registry under the same package name:
110
+
111
+ ```sh
112
+ npm login
113
+ npm publish --access public
114
+ npm install -g @oneciel-ai/claude-any
115
+ npm update -g @oneciel-ai/claude-any
116
+ ```
117
+
118
+ For automated publishing, create an npm automation token, save it as the
119
+ repository secret `NPM_TOKEN`, then publish a GitHub Release or run the
120
+ `Publish to npm` workflow manually.
121
+
122
+ Versioning uses SemVer. For future releases, bump `version` in `package.json`,
123
+ create a matching Git tag such as `v0.1.1`, and publish a GitHub Release to
124
+ trigger the npm publish workflow. After registry publication, the normal
125
+ registry upgrade command will be:
126
+
127
+ ```sh
128
+ npm update -g @oneciel-ai/claude-any
129
+ ```
130
+
131
+
132
+ ![Claude Any menu](docs/assets/claude-any-main.en.png)
133
+
134
+ ## Demo
135
+
136
+ ![Claude Any demo](docs/assets/claude-any-demo.en.gif)
137
+
138
+ The demo sequence now shows provider selection, Base URL entry, model selection,
139
+ LLM options, and the compatibility test. The compatibility test checks a plain
140
+ text response, a required `tool_use`, and a `tool_result` follow-up before the
141
+ launcher recommends starting Claude Code.
142
+
143
+ Additional current screenshots:
144
+
145
+ | Provider | Base URL | Model | LLM options | Compatibility |
146
+ | --- | --- | --- | --- | --- |
147
+ | ![Provider](docs/assets/claude-any-provider.en.png) | ![Base URL](docs/assets/claude-any-base-url.en.png) | ![Model](docs/assets/claude-any-model.en.png) | ![Options](docs/assets/claude-any-options.en.png) | ![Test](docs/assets/claude-any-test.en.png) |
148
+
149
+ See the [full manual](docs/manual.md) for provider setup, headless flags, and
150
+ troubleshooting. A downloadable demo video is available at
151
+ [docs/assets/claude-any-demo.mp4](docs/assets/claude-any-demo.mp4).
152
+
153
+ ## Development Story
154
+
155
+ Claude Any was built through real integration tests: provider switching, model
156
+ discovery, API-key entry, compatibility tests, web-search tooling, timeout
157
+ handling, and native Claude Code behavior. The main lesson was that
158
+ Anthropic-compatible Messages endpoints are the cleanest integration path when a
159
+ provider supports them. Ollama, vLLM, and NIM can expose Anthropic-compatible
160
+ routes that preserve more of Claude Code's tooling model than a generic
161
+ OpenAI-compatible chat route.
162
+
163
+ Local inference was also tested with Qwen 3.6 27B Q4 through Ollama and vLLM on
164
+ RTX 5090 and MSI GB10-class hardware. It worked, but the speed should not be
165
+ judged against native Claude Code or Codex. In practice, some hosted/cloud
166
+ choices such as NVIDIA NIM and Ollama Cloud felt more useful for this hybrid
167
+ workflow than expected.
168
+
169
+ OpenAI-compatible endpoints were deliberately kept out of the primary path for
170
+ Claude Code use. In testing, tool-call translation through generic OpenAI chat
171
+ compatibility was more brittle around tool parameters, tool results, repeated
172
+ calls, retries, and model selection.
173
+
174
+ The most recent vLLM finding is that server-side tool-call parsing must match
175
+ the model family. For Claude Code, a vLLM server can be reachable and still fail
176
+ if `--tool-call-parser` is wrong. In particular, Qwen3-Coder should be served
177
+ with `--enable-auto-tool-choice --tool-call-parser qwen3_xml`; Hermes is for
178
+ Hermes-style models and some older Qwen tool templates. Claude Any now surfaces
179
+ this in the compatibility test instead of treating a simple text response as
180
+ enough.
181
+
182
+ ## Recommended Uses
183
+
184
+ Claude Any is most useful where speed is less important than keeping background
185
+ work moving. Good fits include Docker host maintenance, Windows or Linux system
186
+ administration, cleanup scripts for unused files, periodic security checklists,
187
+ log review, Windows Event Log review, intrusion-attempt triage, and report
188
+ drafting.
189
+
190
+ It is not a replacement for dedicated security products, but it can help
191
+ administrators turn routine checks into repeatable scripts and readable reports.
192
+ It is useful for summarizing possible virus, ransomware, brute-force, or
193
+ remote-access intrusion attempts. In that sense, Claude Any can help you build a
194
+ free or low-cost system security watcher for routine checks, alerts, and
195
+ human-readable summaries.
196
+
197
+ For example, it can help turn requests such as "install PostgreSQL in a Docker
198
+ container" or "analyze today's Docker logs and email me a report" into concrete
199
+ commands, scripts, scheduled jobs, and summaries.
200
+
201
+ A practical pattern is tiered supervision: use smaller or cheaper models to
202
+ watch logs and detect possible issues, use a larger model to review findings,
203
+ write policy, and plan the response, then let smaller models execute routine
204
+ steps under that larger model's supervision.
205
+
206
+ ## Features
207
+
208
+ - Pre-launch provider picker with English, Korean, Japanese, and Chinese UI.
209
+ - Provider-aware model list and custom model entry.
210
+ - API key entry outside the Claude Code chat input.
211
+ - LLM option presets for context window, output tokens, timeout, sampling, and
212
+ native compatibility.
213
+ - Compatibility test before launch, including text response, tool use, and
214
+ tool-result round trip checks.
215
+ - Runtime context reporting for vLLM/NIM when `/v1/models` exposes
216
+ `max_model_len`.
217
+ - Console-first pre-launch menu for SSH and terminal workflows.
218
+ - Native paths where providers expose Claude/Anthropic-compatible endpoints.
219
+ - Router mode for providers that need request/response adaptation.
220
+ - DuckDuckGo and fetch MCP wiring for non-native providers.
221
+ - Headless setup flags such as `--ca-provider`, `--ca-model`, `--ca-base-url`,
222
+ `--ca-api-key-env`, `--ca-ollama-option`, and `--ca-max-output-tokens`.
223
+ - Streaming proxy for Ollama/Ollama Cloud router path — tokens are delivered
224
+ to Claude Code as they arrive instead of waiting for the full response.
225
+ - Per-provider `stream` on/off toggle and `stream_word_chunking` option to
226
+ batch text deltas at word boundaries, mitigating SSE fragmentation that can
227
+ break tool-call / JSON parsing in long streamed responses.
228
+ - LLM options menu shows the meaning of the highlighted row at the bottom of
229
+ the panel in the selected language (English, Korean, Japanese, Chinese), and
230
+ boolean rows (`Stream`, `Stream word chunking`, `Native compatibility`,
231
+ `Think`) toggle in place when you press Enter — no input prompt.
232
+ - Tool guard hook coverage extended to the full Claude Code hook surface,
233
+ including `WorktreeCreate` / `WorktreeRemove`, so non-git working directories
234
+ no longer fail Agent isolation with
235
+ `Cannot create agent worktree: not in a git repository...`.
236
+ - Config file caching — settings are read from disk once and reused until the
237
+ file changes, reducing per-request overhead in the router.
238
+
239
+ ## Changelog
240
+
241
+ ### 0.1.24
242
+
243
+ - **First public npm release** under the correct scope: `@oneciel-ai/claude-any`. Earlier 0.1.x versions were never published to the registry; this is the version that is actually installable via `npm install -g @oneciel-ai/claude-any`.
244
+
245
+ ### 0.1.23
246
+
247
+ - **Stream toggle**: each non-Anthropic provider now has a `stream_enabled`
248
+ knob in the LLM options menu, in `claude-anyctl ollama-options` /
249
+ `provider-options`, and in headless flags. When off, the router forces
250
+ `stream:false` upstream and returns the full response to Claude Code — a
251
+ workaround when streaming fragmentation breaks tool-call or JSON parsing.
252
+ - **Word-boundary streaming**: new `stream_word_chunking` option buffers SSE
253
+ text deltas to whitespace/word boundaries before flushing. Implemented for
254
+ both the Ollama router path and the native pass-through path (vLLM, NVIDIA
255
+ hosted, self-hosted NIM). Tool deltas and non-text SSE events pass through
256
+ unchanged.
257
+ - **All-hooks tool guard**: `install_tool_guard_hooks` now registers the full
258
+ set of Claude Code hook events (PreToolUse, PostToolUse, PostToolUseFailure,
259
+ PostToolBatch, PermissionRequest, PermissionDenied, SessionStart/End, Setup,
260
+ UserPromptSubmit/Expansion, Stop, StopFailure, InstructionsLoaded,
261
+ ConfigChange, CwdChanged, Notification, SubagentStart/Stop, TeammateIdle,
262
+ TaskCreated, TaskCompleted, PreCompact, PostCompact, WorktreeCreate,
263
+ WorktreeRemove, Elicitation, ElicitationResult). The WorktreeCreate handler
264
+ emits `worktreePath = base_path` so Agent isolation works in non-git
265
+ directories.
266
+ - **Windows hook compatibility**: `shell_command_string` now emits forward
267
+ slashes and POSIX quoting on Windows so Claude Code's sh-based hook runner
268
+ doesn't strip backslashes from paths like `C:\Users\...`.
269
+ - **LLM options UX**: per-row description footer rendered in the user's
270
+ selected language, and boolean toggles (`Stream`, `Stream word chunking`,
271
+ `Native compatibility`, `Think`) flip on Enter without a prompt.
272
+
273
+ ### 0.1.22
274
+
275
+ - **Headless manual expansion**: expand the manual with practical headless setup, launch, testing, passthrough, and cleanup examples for automation and remote-server use.
276
+
277
+ ### 0.1.21
278
+
279
+ - **Service lifecycle documentation**: clarify that Claude Any starts only the router/proxy services required for the selected provider at launch time, and `claude-any stop` is the explicit cleanup command.
280
+
281
+ ### 0.1.20
282
+
283
+ - **NVIDIA hosted quick test**: `auto` mode now uses a text-only quick test for NVIDIA hosted providers, avoiding slow or flaky tool_use requests during menu checks. Use `smoke` for text + tool_use, or `full` for the full text/tool_use/tool_result round trip.
284
+ - **Menu test timeout**: the terminal menu now runs `claude-any test 60 auto`, which keeps the pre-launch test responsive for hosted models.
285
+
286
+ ### 0.1.19
287
+
288
+ - **Faster compatibility tests**: `claude-any test` now supports `auto`, `smoke`, and `full` modes.
289
+ - **Menu default speedup**: the terminal menu runs `claude-any test 120 auto`, so NVIDIA hosted compatibility checks finish faster while full validation remains available with `claude-any test 180 full`.
290
+
291
+ ### 0.1.18
292
+
293
+ - **NVIDIA hosted transient diagnostics**: compatibility tests now identify `RemoteDisconnected`, connection resets, and 502/503/504 responses from NVIDIA hosted backends as transient upstream/API Catalog failures.
294
+ - **NVIDIA proxy cleanup**: `claude-any stop` now also matches `nvd-claude-proxy` executable processes so stale proxy sessions are cleaned up reliably.
295
+
296
+ ### 0.1.17
297
+
298
+ - **Menu compatibility-test timeout**: the terminal menu now runs compatibility tests with an explicit 180 s timeout and stops the child process if it exceeds the menu hard limit, so slow hosted models cannot leave the menu appearing indefinitely pending.
299
+
300
+ ### 0.1.16
301
+
302
+ - **NVIDIA hosted proxy startup fix**: detect and launch an installed `nvd-claude-proxy`/`ncp` executable before falling back to `python -m nvd_claude_proxy.main`. This supports uv-tool installs where the proxy is available as a command but not importable from Claude Any's Python interpreter.
303
+
304
+ ### 0.1.15
305
+
306
+ - **Ollama/Ollama Cloud tool-call streaming fix**: emit streamed tool calls using sequential Anthropic SSE content block indexes and `input_json_delta` payloads. This prevents Claude Code from rejecting malformed streamed tool-use blocks with `Invalid tool parameters`.
307
+ - **Tool guard auto-install**: non-Anthropic launches now merge the Claude Any tool guard into `~/.claude/settings.json` so generated tool inputs can be normalized before execution.
308
+ - **Tool-call diagnostics**: router-side tool calls are logged to `~/.config/claude-any/tool-calls.jsonl`, and Claude Code hook inputs are logged to `~/.claude/claude-any-tool-guard/tool-events.jsonl` for precise debugging.
309
+ - **Tool input normalization**: the guard now maps common aliases such as `path` to `file_path`, `cmd` to `command`, and `query` to `pattern`, and returns explicit guidance when required fields are missing.
310
+
311
+ ### 0.1.14
312
+
313
+ - **SSH/terminal arrow-key compatibility**: rewrote `read_menu_key()` with a proper ANSI escape sequence parser and moved raw terminal setup into `portable_select()` so the terminal stays in raw mode for the entire menu loop. This prevents escape sequences from leaking to the screen when `ECHO` is restored between keystrokes. Arrow keys, Home, and End now work reliably in SSH sessions.
314
+ - **Test timeout**: default compatibility test timeout increased from 60 s to 120 s for slower cloud providers.
315
+ - **Ollama Cloud compatibility test fix**: added `"stream": false` to compatibility test requests so the router fetches a single JSON response from Ollama Cloud instead of SSE streaming, which was causing `post_json` to timeout while collecting all chunks.
316
+
317
+ ### 0.1.13
318
+
319
+ - **Ollama streaming proxy**: The router now streams Ollama and Ollama Cloud
320
+ responses through to Claude Code in real time using Anthropic SSE format,
321
+ instead of buffering the entire response before delivery.
322
+ - **Config caching**: `load_config()` now caches the configuration file in
323
+ memory and only re-reads from disk when the file modification time changes.
324
+ This eliminates repeated file I/O and JSON parsing on every router request.
325
+ - **Token estimation caching**: `estimate_tokens()` now accepts an optional
326
+ cache dict to avoid redundant `json.dumps()` calls within a single request.
327
+ `ollama_chat_request` and `cap_output_tokens_for_context` share the same
328
+ cache when computing context window sizing.
329
+
330
+ ### 0.1.12
331
+
332
+ - Refresh docs and demo assets.
333
+
334
+ ### 0.1.11
335
+
336
+ - Validate tool call compatibility.
337
+
338
+ ### 0.1.10
339
+
340
+ - Show runtime context in tests.
341
+
342
+ ### 0.1.9
343
+
344
+ - Cap presets to server context.
345
+
346
+ ### 0.1.8
347
+
348
+ - Localize LLM presets.
349
+
350
+ ## Provider Notes
351
+
352
+ | Provider | Mode | Notes |
353
+ | --- | --- | --- |
354
+ | Anthropic | Native Claude Code | Uses Claude login or Anthropic API key. |
355
+ | Ollama | Native when available, router otherwise | Local Ollama normally needs no API key. Cloud models through local Ollama require `ollama signin` on the Ollama host. |
356
+ | Ollama Cloud | Router | Calls `https://ollama.com/api`; requires an Ollama API key. |
357
+ | vLLM | Native Anthropic-compatible endpoint | Use a vLLM endpoint that exposes Anthropic-compatible `/v1/messages`; match `--tool-call-parser` to the model family. |
358
+ | NVIDIA hosted | Router/proxy | Uses NVIDIA hosted API through the compatibility path. |
359
+ | self-hosted NIM | Native Anthropic-compatible endpoint | Use the self-hosted NIM Anthropic-compatible endpoint. |
360
+
361
+ ## Service Lifecycle
362
+
363
+ Claude Any does not keep every possible backend helper running all the time. The
364
+ normal lifecycle is:
365
+
366
+ - Before launch, managed router/proxy processes can be stopped with
367
+ `claude-any stop`.
368
+ - When `claude-any` starts Claude Code, it starts only the services required by
369
+ the selected provider.
370
+ - Ollama and Ollama Cloud router mode use the Claude Any router on
371
+ `127.0.0.1:8799`.
372
+ - NVIDIA hosted router mode uses the Claude Any router on `127.0.0.1:8799` and
373
+ starts `nvd-claude-proxy` on `127.0.0.1:8788` only when that provider needs it.
374
+ - Switching away from NVIDIA hosted does not require keeping the NVIDIA proxy
375
+ alive; stale sessions should be cleaned with `claude-any stop` before a fresh
376
+ test or launch.
377
+
378
+ This keeps Claude Code pointed at one stable Claude Any entry point while still
379
+ letting provider-specific helpers start on demand.
380
+
381
+ For Qwen3-Coder on vLLM, start the server with a matching tool parser:
382
+
383
+ ```sh
384
+ vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct \
385
+ --host 0.0.0.0 \
386
+ --port 8000 \
387
+ --served-model-name qwen3-coder-30b \
388
+ --max-model-len 65536 \
389
+ --enable-auto-tool-choice \
390
+ --tool-call-parser qwen3_xml
391
+ ```
392
+
393
+ ## Provider Links
394
+
395
+ - Ollama Cloud: [cloud overview](https://ollama.com/cloud), [API key settings](https://ollama.com/settings/keys), [authentication docs](https://docs.ollama.com/api/authentication).
396
+ - Ollama local Anthropic compatibility: [Ollama Anthropic API docs](https://docs.ollama.com/api/anthropic-compatibility).
397
+ - vLLM: [Claude Code integration](https://docs.vllm.ai/en/latest/serving/integrations/claude_code/), [tool calling](https://docs.vllm.ai/en/stable/features/tool_calling/), [project GitHub](https://github.com/vllm-project/vllm).
398
+ - NVIDIA hosted NIM: [NVIDIA API Catalog](https://build.nvidia.com/), [API Catalog quickstart](https://docs.api.nvidia.com/nim/docs/api-quickstart).
399
+ - Self-hosted NVIDIA NIM: [Claude Code with NIM](https://docs.nvidia.com/nim/large-language-models/latest/ai-assistant-integrations/claude-code.html), [NIM for LLMs getting started](https://docs.nvidia.com/nim/large-language-models/1.14.0/getting-started.html), [NGC personal keys](https://org.ngc.nvidia.com/setup/personal-keys).
400
+
401
+ ## Headless Examples
402
+
403
+ Headless commands skip the pre-launch menu and launch Claude Code immediately.
404
+ Claude Any consumes `--ca-*` setup flags, starts the required router/proxy
405
+ services, then passes the remaining arguments to Claude Code.
406
+
407
+ ```sh
408
+ claude-any --ca-provider ollama-cloud --ca-model glm-5.1
409
+ claude-any --ca-provider ollama --ca-base-url http://127.0.0.1:11434 --ca-model qwen3-coder
410
+ claude-any --ca-provider ollama-cloud --ca-api-key-env OLLAMA_API_KEY --ca-model qwen3-coder:480b:cloud
411
+ claude-any --ca-provider vllm --ca-base-url http://127.0.0.1:8000 --ca-model Qwen/Qwen3-Coder
412
+ claude-any --ca-no-update-check -p "Reply with OK only." --output-format text
413
+ ```
414
+
415
+ All other arguments are passed through to Claude Code.
416
+
417
+ ## Security
418
+
419
+ Do not commit runtime configuration or API keys. Claude Any stores local runtime
420
+ configuration under `~/.config/claude-any/`. NVIDIA hosted credentials used by
421
+ the optional proxy are stored under `~/.config/nvd-claude-proxy/`.
422
+
423
+ This repository should contain source, documentation, and demo assets only.
424
+
425
+ ## Development
426
+
427
+ ```sh
428
+ python -m py_compile claude_any.py claude-any-menu.py claude-any-tool-guard.py
429
+ python -m ruff check .
430
+ python scripts/make_demo_assets.py
431
+ ```
432
+
433
+ ## License
434
+
435
+ MIT. See [LICENSE](LICENSE).