@oneciel-ai/claude-any 0.1.28 → 0.1.30

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,9 +1,9 @@
1
- # Claude Any
2
-
3
- ![Claude Any: full Claude Code experience with free or low-cost LLMs](claude-any-adv.png)
4
-
5
- | English | [한국어](docs/README.ko.md) | [日本語](docs/README.ja.md) | [中文](docs/README.zh.md) |
6
- | --- | --- | --- | --- |
1
+ # Claude Any
2
+
3
+ ![Claude Any: full Claude Code experience with free or low-cost LLMs](claude-any-adv.png)
4
+
5
+ | English | [한국어](docs/README.ko.md) | [日本語](docs/README.ja.md) | [中文](docs/README.zh.md) |
6
+ | --- | --- | --- | --- |
7
7
 
8
8
  [![npm version](https://img.shields.io/npm/v/@oneciel-ai/claude-any?logo=npm&label=npm)](https://www.npmjs.com/package/@oneciel-ai/claude-any)
9
9
  [![npm downloads](https://img.shields.io/npm/dm/@oneciel-ai/claude-any?logo=npm&label=downloads)](https://www.npmjs.com/package/@oneciel-ai/claude-any)
@@ -11,23 +11,23 @@
11
11
 
12
12
  > ## 🚀 Use the full Claude Code experience with free or low-cost LLMs
13
13
  >
14
- > - **Free** — [NVIDIA hosted NIM](https://build.nvidia.com/) (qwen3-coder-480b, gpt-oss, and friends) through the API Catalog.
15
- > - **Low-cost** — [Ollama Cloud](https://ollama.com/cloud) for GLM, Qwen, DeepSeek, and other open-weight models at a fraction of frontier-model pricing.
16
- > - **Free + local** — [Ollama](https://ollama.com/) or [vLLM](https://github.com/vllm-project/vllm) on your own GPU, fully offline.
17
- > - **Plan Mode + Advisor ready** — Claude Any preserves Claude Code Plan Mode on non-Anthropic providers and adds an optional long-context Advisor model for review.
18
- > - **Smooth free-model pacing** — Claude Code spends time reading files and running tools, and Claude Any uses that natural gap for RPM pacing so NVIDIA hosted free models feel usable even with strict per-minute limits.
19
- >
20
- > Provider, model, base URL, API key, streaming behavior, and LLM options are all selected from a console menu **before** Claude Code starts. Claude Code itself runs untouched with all of its native tooling, slash commands, and workflows.
21
-
22
- ## Today's Top 3 Benefits
23
-
24
- 1. **Plan Mode works on non-Anthropic models** — Claude Any keeps Claude Code's Plan Mode usable even when the upstream provider is NVIDIA hosted, Ollama Cloud, local Ollama, vLLM, or NIM.
25
- 2. **Advisor review with a bigger model** — pick a long-context Advisor Model at launch, then use `/advisor` inside Claude Code to review the current task, blockers, and next concrete action.
26
- 3. **Free-model RPM limits feel smoother** — router-side RPM pacing uses the natural time spent reading files and running tools, so NVIDIA hosted free models can stay within per-minute limits with less visible waiting.
27
-
28
- ### Demo
29
-
30
- ![NVIDIA hosted NIM driving Claude Code (deepseek-4-flash)](docs/assets/claude-any-nvidia-nim.gif)
14
+ > - **Free** — [NVIDIA hosted NIM](https://build.nvidia.com/) (qwen3-coder-480b, gpt-oss, and friends) through the API Catalog.
15
+ > - **Low-cost** — [Ollama Cloud](https://ollama.com/cloud) for GLM, Qwen, DeepSeek, and other open-weight models at a fraction of frontier-model pricing.
16
+ > - **Free + local** — [Ollama](https://ollama.com/) or [vLLM](https://github.com/vllm-project/vllm) on your own GPU, fully offline.
17
+ > - **Plan Mode + Advisor ready** — Claude Any preserves Claude Code Plan Mode on non-Anthropic providers and adds an optional long-context Advisor model for review.
18
+ > - **Smooth free-model pacing** — Claude Code spends time reading files and running tools, and Claude Any uses that natural gap for RPM pacing so NVIDIA hosted free models feel usable even with strict per-minute limits.
19
+ >
20
+ > Provider, model, base URL, API key, streaming behavior, and LLM options are all selected from a console menu **before** Claude Code starts. Claude Code itself runs untouched with all of its native tooling, slash commands, and workflows.
21
+
22
+ ## Today's Top 3 Benefits
23
+
24
+ 1. **Plan Mode works on non-Anthropic models** — Claude Any keeps Claude Code's Plan Mode usable even when the upstream provider is NVIDIA hosted, Ollama Cloud, local Ollama, vLLM, or NIM.
25
+ 2. **Advisor review with a bigger model** — pick a long-context Advisor Model at launch, then use `/advisor` inside Claude Code to review the current task, blockers, and next concrete action.
26
+ 3. **Free-model RPM limits feel smoother** — router-side RPM pacing uses the natural time spent reading files and running tools, so NVIDIA hosted free models can stay within per-minute limits with less visible waiting.
27
+
28
+ ### Demo
29
+
30
+ ![NVIDIA hosted NIM driving Claude Code (deepseek-4-flash)](docs/assets/claude-any-nvidia-nim.gif)
31
31
 
32
32
  NVIDIA hosted NIM (deepseek-4-flash) driving Claude Code through the claude-any router.  [full mp4 ⤓](https://github.com/OneCielAI/claude-any/raw/main/demo/claude-any-nvidia-nim.mp4)
33
33
 
@@ -44,7 +44,7 @@ arguments through unchanged.
44
44
 
45
45
  Credits: One Ciel LLC
46
46
 
47
- Current version: `0.1.28`
47
+ Current version: `0.1.30`
48
48
 
49
49
  ## Why This Exists
50
50
 
@@ -84,23 +84,57 @@ Requirements:
84
84
 
85
85
  **Install from the npm registry (recommended):**
86
86
 
87
- ```sh
88
- npm install -g @oneciel-ai/claude-any
89
- ```
90
-
91
- ```sh
92
- claude-any
93
- ```
87
+ ```sh
88
+ npm install -g @oneciel-ai/claude-any
89
+ ```
90
+
91
+ ```sh
92
+ claude-any
93
+ ```
94
+
95
+ ## Run Claude Code Headlessly
96
+
97
+ Use headless mode when a script, SSH session, CI job, or parent agent needs to
98
+ launch Claude Code directly without opening the pre-launch menu. `claude-any`
99
+ consumes `--ca-*` options, starts the required local router, and passes the
100
+ remaining arguments to Claude Code.
101
+
102
+ ```sh
103
+ claude-any --ca-provider nvidia-hosted --ca-model z-ai/glm-4.7
104
+ ```
105
+
106
+ ```sh
107
+ claude-any --ca-provider ollama-cloud --ca-model glm-5.1
108
+ ```
109
+
110
+ ```sh
111
+ claude-any --ca-provider ollama --ca-base-url http://127.0.0.1:11434 --ca-model qwen3-coder
112
+ ```
113
+
114
+ Run one non-interactive Claude Code prompt:
115
+
116
+ ```sh
117
+ claude-any --ca-provider nvidia-hosted --ca-model z-ai/glm-4.7 --ca-no-update-check -p "Reply with OK only." --output-format text
118
+ ```
119
+
120
+ Use the saved provider/model and only skip the menu:
121
+
122
+ ```sh
123
+ CLAUDE_ANY_SKIP_MENU=1 claude-any -p "Summarize this repository." --output-format text
124
+ ```
125
+
126
+ More examples are in [Headless Examples](#headless-examples) and
127
+ [the full manual](docs/manual.md#headless-usage).
94
128
 
95
129
  **Upgrade:**
96
130
 
97
- ```sh
98
- npm update -g @oneciel-ai/claude-any
99
- ```
100
-
101
- ```sh
102
- claude-any version
103
- ```
131
+ ```sh
132
+ npm update -g @oneciel-ai/claude-any
133
+ ```
134
+
135
+ ```sh
136
+ claude-any version
137
+ ```
104
138
 
105
139
  **Uninstall:**
106
140
 
@@ -113,49 +147,49 @@ npm uninstall -g @oneciel-ai/claude-any
113
147
  Install directly from the GitHub repository (useful for testing unreleased
114
148
  commits between npm publishes):
115
149
 
116
- ```sh
117
- npm install -g https://github.com/OneCielAI/claude-any.git
118
- ```
119
-
120
- ```sh
121
- claude-any
122
- ```
150
+ ```sh
151
+ npm install -g https://github.com/OneCielAI/claude-any.git
152
+ ```
153
+
154
+ ```sh
155
+ claude-any
156
+ ```
123
157
 
124
158
  POSIX source install:
125
159
 
126
- ```sh
127
- git clone https://github.com/OneCielAI/claude-any.git
128
- ```
129
-
130
- ```sh
131
- cd claude-any
132
- ```
133
-
134
- ```sh
135
- ./install.sh
136
- ```
137
-
138
- ```sh
139
- claude-any
140
- ```
160
+ ```sh
161
+ git clone https://github.com/OneCielAI/claude-any.git
162
+ ```
163
+
164
+ ```sh
165
+ cd claude-any
166
+ ```
167
+
168
+ ```sh
169
+ ./install.sh
170
+ ```
171
+
172
+ ```sh
173
+ claude-any
174
+ ```
141
175
 
142
176
  Windows PowerShell source install:
143
177
 
144
- ```powershell
145
- git clone https://github.com/OneCielAI/claude-any.git
146
- ```
147
-
148
- ```powershell
149
- cd claude-any
150
- ```
151
-
152
- ```powershell
153
- .\install.ps1
154
- ```
155
-
156
- ```powershell
157
- claude-any
158
- ```
178
+ ```powershell
179
+ git clone https://github.com/OneCielAI/claude-any.git
180
+ ```
181
+
182
+ ```powershell
183
+ cd claude-any
184
+ ```
185
+
186
+ ```powershell
187
+ .\install.ps1
188
+ ```
189
+
190
+ ```powershell
191
+ claude-any
192
+ ```
159
193
 
160
194
  ### Releasing (maintainers)
161
195
 
@@ -263,24 +297,24 @@ steps under that larger model's supervision.
263
297
  - Native paths where providers expose Claude/Anthropic-compatible endpoints.
264
298
  - Router mode for providers that need request/response adaptation.
265
299
  - DuckDuckGo and fetch MCP wiring for non-native providers.
266
- - Headless setup flags such as `--ca-provider`, `--ca-model`, `--ca-base-url`,
267
- `--ca-api-key-env`, `--ca-ollama-option`, and `--ca-max-output-tokens`.
268
- - Claude Code Plan Mode support on router-backed non-Anthropic providers,
269
- including local handling for `EnterPlanMode` and plan artifacts.
270
- - Optional `/advisor` slash command that routes the current task state to a
271
- selected Advisor Model, useful for long-context review and next-step checks.
272
- - Claude Code `statusLine` integration showing router RPM usage and wait time
273
- in the bottom status area instead of polluting the chat transcript.
274
- - Router-side RPM control for NVIDIA hosted, self-hosted NIM, Ollama, and
275
- Ollama Cloud. `rate_limit_rpm=0` disables throttling while still showing the
276
- last-60-seconds usage rate.
277
- - Soft pacing subtracts time already spent reading files, running commands, and
278
- waiting for tool results. In real coding sessions, those tool-call gaps absorb
279
- much of the RPM spacing naturally, so providers such as NVIDIA hosted NIM can
280
- stay within free-model limits without making every Claude Code turn feel
281
- rate-limited.
282
- - Streaming proxy for Ollama/Ollama Cloud router path — tokens are delivered
283
- to Claude Code as they arrive instead of waiting for the full response.
300
+ - Headless setup flags such as `--ca-provider`, `--ca-model`, `--ca-base-url`,
301
+ `--ca-api-key-env`, `--ca-ollama-option`, and `--ca-max-output-tokens`.
302
+ - Claude Code Plan Mode support on router-backed non-Anthropic providers,
303
+ including local handling for `EnterPlanMode` and plan artifacts.
304
+ - Optional `/advisor` slash command that routes the current task state to a
305
+ selected Advisor Model, useful for long-context review and next-step checks.
306
+ - Claude Code `statusLine` integration showing router RPM usage and wait time
307
+ in the bottom status area instead of polluting the chat transcript.
308
+ - Router-side RPM control for NVIDIA hosted, self-hosted NIM, Ollama, and
309
+ Ollama Cloud. `rate_limit_rpm=0` disables throttling while still showing the
310
+ last-60-seconds usage rate.
311
+ - Soft pacing subtracts time already spent reading files, running commands, and
312
+ waiting for tool results. In real coding sessions, those tool-call gaps absorb
313
+ much of the RPM spacing naturally, so providers such as NVIDIA hosted NIM can
314
+ stay within free-model limits without making every Claude Code turn feel
315
+ rate-limited.
316
+ - Streaming proxy for Ollama/Ollama Cloud router path — tokens are delivered
317
+ to Claude Code as they arrive instead of waiting for the full response.
284
318
  - Per-provider `stream` on/off toggle and `stream_word_chunking` option to
285
319
  batch text deltas at word boundaries, mitigating SSE fragmentation that can
286
320
  break tool-call / JSON parsing in long streamed responses.
@@ -292,58 +326,75 @@ steps under that larger model's supervision.
292
326
  including `WorktreeCreate` / `WorktreeRemove`, so non-git working directories
293
327
  no longer fail Agent isolation with
294
328
  `Cannot create agent worktree: not in a git repository...`.
295
- - Config file caching — settings are read from disk once and reused until the
296
- file changes, reducing per-request overhead in the router.
297
- - Router control-plane endpoints for headless agent coordination:
298
- `/ca/chat/messages`, `/ca/chat/wait`, `/ca/chat/stream`, `/ca/chat/files`,
299
- and `/ca/plan/artifacts`.
329
+ - Config file caching — settings are read from disk once and reused until the
330
+ file changes, reducing per-request overhead in the router.
331
+ - Router control-plane endpoints for headless agent coordination:
332
+ `/ca/chat/messages`, `/ca/chat/wait`, `/ca/chat/stream`, `/ca/chat/files`,
333
+ and `/ca/plan/artifacts`.
300
334
 
301
335
  ## Changelog
302
336
 
303
- ### 0.1.28
304
-
305
- - **Plan Mode + Advisor headline**: document Claude Any's Plan Mode support for
306
- router-backed non-Anthropic providers and the `/advisor` slash command backed
307
- by a selected long-context Advisor Model.
308
- - **Status-line RPM telemetry**: Claude Any installs a Claude Code `statusLine`
309
- command that shows router RPM usage and the latest wait time in the bottom
310
- status area, keeping rate-limit telemetry out of the chat transcript.
311
- - **Soft RPM pacing for free hosted models**: NVIDIA hosted, self-hosted NIM,
312
- Ollama, and Ollama Cloud can use router-side RPM pacing. The pacing subtracts
313
- time already spent in file reads, command execution, and tool-result waits, so
314
- normal coding tool-call gaps naturally absorb much of the RPM spacing.
315
- - **Unlimited usage display**: `rate_limit_rpm=0` disables throttling while
316
- still displaying the last-60-seconds request rate.
317
-
318
- ### 0.1.27
337
+ ### 0.1.30
319
338
 
320
- - **Plan mode support for non-Anthropic providers**: the router now keeps
321
- `EnterPlanMode` available and supports Claude Code Plan mode even when the
322
- upstream model does not reliably choose that internal tool. Forced
323
- `tool_choice=EnterPlanMode` is answered locally with a valid Anthropic
324
- `tool_use`, and long implementation requests that receive only a short or
325
- empty non-actionable text response are promoted to `EnterPlanMode` using
326
- language-agnostic structure checks.
327
- - **Plan-mode self-tool handling**: unsupported Claude Code self-tools are
328
- still stripped for non-Anthropic providers, but Plan-mode tools are handled
329
- separately so planning can work instead of being disabled.
339
+ - **Headless launch docs moved up**: README now shows copy-ready examples for
340
+ launching Claude Code directly with `--ca-provider`, `--ca-model`, `-p`, and
341
+ `CLAUDE_ANY_SKIP_MENU=1` immediately after install.
342
+ - **NVIDIA hosted wording cleanup**: provider and lifecycle docs now describe
343
+ NVIDIA hosted as using the Claude Any local router, with no separate hosted
344
+ API Catalog proxy requirement.
330
345
 
331
- ### 0.1.25
346
+ ### 0.1.29
332
347
 
333
- - **Plan-mode diagnostics**: set `~/.config/claude-any/log-level` to `TRACE`
334
- to capture redacted request and response summaries in `requests.jsonl` /
335
- `responses.jsonl`.
336
- - **Headless agent chat service**: the router exposes a small HTTP control
337
- plane for sub coding agents. Agents can post messages, poll updates after
338
- the last seen message id, or wait on an SSE stream when they do not have
339
- their own loop.
340
- - **Plan artifact serving**: agents can create plan files through the router
341
- and share stable local URLs, matching Claude Code's file/artifact-oriented
342
- Plan-mode workflow without copying Anthropic's internal implementation.
343
-
344
- ### 0.1.24
345
-
346
- - **First public npm release** under the correct scope: `@oneciel-ai/claude-any`. Earlier 0.1.x versions were never published to the registry; this is the version that is actually installable via `npm install -g @oneciel-ai/claude-any`.
348
+ - **NVIDIA compatibility test fix**: `claude-any test` now restarts the local
349
+ router before router-mode tests, so upgraded installs do not accidentally use
350
+ an old long-running router that still expects `nvd-claude-proxy`.
351
+ - **Clear NVIDIA router wording**: menu status now describes NVIDIA hosted as
352
+ using the local claude-any router instead of the retired local proxy path.
353
+
354
+ ### 0.1.28
355
+
356
+ - **Plan Mode + Advisor headline**: document Claude Any's Plan Mode support for
357
+ router-backed non-Anthropic providers and the `/advisor` slash command backed
358
+ by a selected long-context Advisor Model.
359
+ - **Status-line RPM telemetry**: Claude Any installs a Claude Code `statusLine`
360
+ command that shows router RPM usage and the latest wait time in the bottom
361
+ status area, keeping rate-limit telemetry out of the chat transcript.
362
+ - **Soft RPM pacing for free hosted models**: NVIDIA hosted, self-hosted NIM,
363
+ Ollama, and Ollama Cloud can use router-side RPM pacing. The pacing subtracts
364
+ time already spent in file reads, command execution, and tool-result waits, so
365
+ normal coding tool-call gaps naturally absorb much of the RPM spacing.
366
+ - **Unlimited usage display**: `rate_limit_rpm=0` disables throttling while
367
+ still displaying the last-60-seconds request rate.
368
+
369
+ ### 0.1.27
370
+
371
+ - **Plan mode support for non-Anthropic providers**: the router now keeps
372
+ `EnterPlanMode` available and supports Claude Code Plan mode even when the
373
+ upstream model does not reliably choose that internal tool. Forced
374
+ `tool_choice=EnterPlanMode` is answered locally with a valid Anthropic
375
+ `tool_use`, and long implementation requests that receive only a short or
376
+ empty non-actionable text response are promoted to `EnterPlanMode` using
377
+ language-agnostic structure checks.
378
+ - **Plan-mode self-tool handling**: unsupported Claude Code self-tools are
379
+ still stripped for non-Anthropic providers, but Plan-mode tools are handled
380
+ separately so planning can work instead of being disabled.
381
+
382
+ ### 0.1.25
383
+
384
+ - **Plan-mode diagnostics**: set `~/.config/claude-any/log-level` to `TRACE`
385
+ to capture redacted request and response summaries in `requests.jsonl` /
386
+ `responses.jsonl`.
387
+ - **Headless agent chat service**: the router exposes a small HTTP control
388
+ plane for sub coding agents. Agents can post messages, poll updates after
389
+ the last seen message id, or wait on an SSE stream when they do not have
390
+ their own loop.
391
+ - **Plan artifact serving**: agents can create plan files through the router
392
+ and share stable local URLs, matching Claude Code's file/artifact-oriented
393
+ Plan-mode workflow without copying Anthropic's internal implementation.
394
+
395
+ ### 0.1.24
396
+
397
+ - **First public npm release** under the correct scope: `@oneciel-ai/claude-any`. Earlier 0.1.x versions were never published to the registry; this is the version that is actually installable via `npm install -g @oneciel-ai/claude-any`.
347
398
 
348
399
  ### 0.1.23
349
400
 
@@ -458,7 +509,7 @@ steps under that larger model's supervision.
458
509
  | Ollama | Native when available, router otherwise | Local Ollama normally needs no API key. Cloud models through local Ollama require `ollama signin` on the Ollama host. |
459
510
  | Ollama Cloud | Router | Calls `https://ollama.com/api`; requires an Ollama API key. |
460
511
  | vLLM | Native Anthropic-compatible endpoint | Use a vLLM endpoint that exposes Anthropic-compatible `/v1/messages`; match `--tool-call-parser` to the model family. |
461
- | NVIDIA hosted | Router/proxy | Uses NVIDIA hosted API through the compatibility path. |
512
+ | NVIDIA hosted | Router | Uses the NVIDIA hosted API Catalog through the Claude Any local router. |
462
513
  | self-hosted NIM | Native Anthropic-compatible endpoint | Use the self-hosted NIM Anthropic-compatible endpoint. |
463
514
 
464
515
  ## Service Lifecycle
@@ -466,17 +517,16 @@ steps under that larger model's supervision.
466
517
  Claude Any does not keep every possible backend helper running all the time. The
467
518
  normal lifecycle is:
468
519
 
469
- - Before launch, managed router/proxy processes can be stopped with
520
+ - Before launch, managed router processes can be stopped with
470
521
  `claude-any stop`.
471
522
  - When `claude-any` starts Claude Code, it starts only the services required by
472
523
  the selected provider.
473
524
  - Ollama and Ollama Cloud router mode use the Claude Any router on
474
525
  `127.0.0.1:8799`.
475
- - NVIDIA hosted router mode uses the Claude Any router on `127.0.0.1:8799` and
476
- starts `nvd-claude-proxy` on `127.0.0.1:8788` only when that provider needs it.
477
- - Switching away from NVIDIA hosted does not require keeping the NVIDIA proxy
478
- alive; stale sessions should be cleaned with `claude-any stop` before a fresh
479
- test or launch.
526
+ - NVIDIA hosted router mode uses the Claude Any router on `127.0.0.1:8799`;
527
+ hosted API Catalog models do not require a separate NVIDIA proxy.
528
+ - Run `claude-any stop` before a fresh provider-switch test if an old router is
529
+ still bound to the local port.
480
530
 
481
531
  This keeps Claude Code pointed at one stable Claude Any entry point while still
482
532
  letting provider-specific helpers start on demand.
@@ -504,7 +554,7 @@ vllm serve Qwen/Qwen3-Coder-30B-A3B-Instruct \
504
554
  ## Headless Examples
505
555
 
506
556
  Headless commands skip the pre-launch menu and launch Claude Code immediately.
507
- Claude Any consumes `--ca-*` setup flags, starts the required router/proxy
557
+ Claude Any consumes `--ca-*` setup flags, starts the required router
508
558
  services, then passes the remaining arguments to Claude Code.
509
559
 
510
560
  ```sh
@@ -515,32 +565,32 @@ claude-any --ca-provider vllm --ca-base-url http://127.0.0.1:8000 --ca-model Qwe
515
565
  claude-any --ca-no-update-check -p "Reply with OK only." --output-format text
516
566
  ```
517
567
 
518
- All other arguments are passed through to Claude Code.
519
-
520
- ## Headless Agent Chat
521
-
522
- When the claude-any router is running, sub agents can coordinate through local
523
- HTTP endpoints without opening the menu:
524
-
525
- ```sh
526
- # Send a message to a channel.
527
- curl -s http://127.0.0.1:8799/ca/chat/messages \
528
- -H 'content-type: application/json' \
529
- -d '{"channel":"agents","sender_id":"codex","recipients":["kimi"],"message":"Need logs after id 42"}'
530
-
531
- # Poll updates after the last message id.
532
- curl -s 'http://127.0.0.1:8799/ca/chat/messages?channel=agents&recipient=kimi&after=42'
533
-
534
- # Wait on a stream until messages arrive.
535
- curl -N 'http://127.0.0.1:8799/ca/chat/stream?channel=agents&recipient=kimi&after=42&timeout=300'
536
-
537
- # Publish a plan file and get a served URL.
538
- curl -s http://127.0.0.1:8799/ca/plan/artifacts \
539
- -H 'content-type: application/json' \
540
- -d '{"title":"handoff","name":"handoff.md","content":"# Plan\n- step 1"}'
541
- ```
542
-
543
- ## Security
568
+ All other arguments are passed through to Claude Code.
569
+
570
+ ## Headless Agent Chat
571
+
572
+ When the claude-any router is running, sub agents can coordinate through local
573
+ HTTP endpoints without opening the menu:
574
+
575
+ ```sh
576
+ # Send a message to a channel.
577
+ curl -s http://127.0.0.1:8799/ca/chat/messages \
578
+ -H 'content-type: application/json' \
579
+ -d '{"channel":"agents","sender_id":"codex","recipients":["kimi"],"message":"Need logs after id 42"}'
580
+
581
+ # Poll updates after the last message id.
582
+ curl -s 'http://127.0.0.1:8799/ca/chat/messages?channel=agents&recipient=kimi&after=42'
583
+
584
+ # Wait on a stream until messages arrive.
585
+ curl -N 'http://127.0.0.1:8799/ca/chat/stream?channel=agents&recipient=kimi&after=42&timeout=300'
586
+
587
+ # Publish a plan file and get a served URL.
588
+ curl -s http://127.0.0.1:8799/ca/plan/artifacts \
589
+ -H 'content-type: application/json' \
590
+ -d '{"title":"handoff","name":"handoff.md","content":"# Plan\n- step 1"}'
591
+ ```
592
+
593
+ ## Security
544
594
 
545
595
  Do not commit runtime configuration or API keys. Claude Any stores local runtime
546
596
  configuration under `~/.config/claude-any/`. NVIDIA hosted credentials used by
@@ -656,7 +656,7 @@ def probe_base_url(provider: str, pcfg: dict) -> str:
656
656
  if "your-" in base:
657
657
  return f"Base URL: placeholder ({base})"
658
658
  if provider == "nvidia-hosted":
659
- return f"Base URL: NVIDIA hosted ({base}); proxy starts on launch"
659
+ return f"Base URL: NVIDIA hosted ({base}); local router http://127.0.0.1:8799 starts on launch"
660
660
  path = "/api/tags" if provider in ("ollama", "ollama-cloud") else "/v1/models"
661
661
  url = join_url(base, path)
662
662
  headers = {}