@oneciel-ai/claude-any 0.1.63 → 0.1.65

package/README.md CHANGED
@@ -23,11 +23,19 @@
  >
  > Provider, model, base URL, API key, streaming behavior, and LLM options are all selected from a console menu **before** Claude Code starts. Claude Code itself runs untouched with all of its native tooling, slash commands, and workflows.

- ## Today's Top 3 Benefits
-
- 1. **Plan Mode works on non-Anthropic models** — Claude Any keeps Claude Code's Plan Mode usable even when the upstream provider is NVIDIA hosted, Ollama Cloud, local Ollama, vLLM, or NIM.
- 2. **Advisor review with a bigger model** — pick a long-context Advisor Model at launch, then use `/advisor` inside Claude Code to review the current task, blockers, and next concrete action.
- 3. **Free-model RPM limits feel smoother** — router-side RPM pacing uses the natural time spent reading files and running tools, so NVIDIA hosted free models can stay within per-minute limits with less visible waiting.
+ ## Today's Top 3 Benefits
+
+ ### 2026-05-14
+
+ 1. **Plan Mode loop recovery is semantic, not hard-coded** — unchanged `Read` results are now converted with the previous authoritative observation and current Plan Mode state, so Claude Code can move to `ExitPlanMode` or the next real step instead of rereading the same slice.
+ 2. **Remote test router mode is easier to expose** — set `CLAUDE_ANY_ROUTER_BIND_HOST=0.0.0.0` when you intentionally want to test the router from another machine, while Claude Code still talks to the safe local client base.
+ 3. **Cleaner router transcripts for third-party models** — attachment-only metadata, historical no-op tool results, and orphan tool results are normalized before reaching Ollama, Ollama Cloud, NVIDIA hosted, vLLM, or NIM.
+
+ ### 2026-05-13
+
+ 1. **Plan Mode works on non-Anthropic models** — Claude Any keeps Claude Code's Plan Mode usable even when the upstream provider is NVIDIA hosted, Ollama Cloud, local Ollama, vLLM, or NIM.
+ 2. **Advisor review with a bigger model** — pick a long-context Advisor Model at launch, then use `/advisor` inside Claude Code to review the current task, blockers, and next concrete action.
+ 3. **Free-model RPM limits feel smoother** — router-side RPM pacing uses the natural time spent reading files and running tools, so NVIDIA hosted free models can stay within per-minute limits with less visible waiting.

  ### Demo

@@ -48,7 +56,7 @@ arguments through unchanged.

  Credits: One Ciel LLC

- Current version: `0.1.63`
+ Current version: `0.1.65`

  ## Why This Exists

@@ -130,7 +138,7 @@ CLAUDE_ANY_SKIP_MENU=1 claude-any -p "Summarize this repository." --output-forma
  Configure every launch option with flags:

  ```sh
- claude-any --ca-provider nvidia-hosted --ca-base-url https://integrate.api.nvidia.com/v1 --ca-model z-ai/glm-4.7 --ca-advisor-model deepseek-ai/deepseek-v4-pro --ca-api-key-env NVIDIA_API_KEY --ca-max-output-tokens 4096 --ca-context-window 65536 --ca-request-timeout-ms 300000 --ca-rate-limit-rpm 40 --ca-rate-limit-status on --ca-no-update-check -p "Reply with OK only." --output-format text
+ claude-any --ca-provider nvidia-hosted --ca-base-url https://integrate.api.nvidia.com/v1 --ca-model z-ai/glm-4.7 --ca-advisor-model deepseek-ai/deepseek-v4-pro --ca-api-key-env NVIDIA_API_KEY --ca-max-output-tokens 4096 --ca-context-window 65536 --ca-request-timeout-ms 120000 --ca-rate-limit-rpm 40 --ca-rate-limit-status on --ca-no-update-check -p "Reply with OK only." --output-format text
  ```

  Or put the same values in environment variables:
@@ -144,7 +152,7 @@ export CLAUDE_ANY_ADVISOR_MODEL=deepseek-ai/deepseek-v4-pro
  export CLAUDE_ANY_API_KEY_ENV=NVIDIA_API_KEY
  export CLAUDE_ANY_MAX_OUTPUT_TOKENS=4096
  export CLAUDE_ANY_CONTEXT_WINDOW=65536
- export CLAUDE_ANY_REQUEST_TIMEOUT_MS=300000
+ export CLAUDE_ANY_REQUEST_TIMEOUT_MS=120000
  export CLAUDE_ANY_RATE_LIMIT_RPM=40
  export CLAUDE_ANY_RATE_LIMIT_STATUS=on
  claude-any -p "Reply with OK only." --output-format text
@@ -385,6 +393,28 @@ steps under that larger model's supervision.

  ## Changelog

+ ### 0.1.65
+
+ - **Plan Mode unchanged-Read loop recovery**: router conversion now preserves the
+ previous successful `Read` result for unchanged/no-op reads, exposes the
+ current Plan Mode state to third-party models, and avoids arbitrary retry
+ thresholds.
+ - **Cleaner third-party transcripts**: attachment-only metadata, historical
+ no-op tool results, and orphan tool results are normalized before reaching
+ Ollama, Ollama Cloud, NVIDIA hosted, vLLM, or NIM.
+ - **Remote router test binding**: `CLAUDE_ANY_ROUTER_BIND_HOST=0.0.0.0` can be
+ used for intentional remote testing while Claude Code keeps using the local
+ client base URL.
+
+ ### 0.1.64
+
+ - **Model-aware native auto-compact**: claude-any now injects
+ `CLAUDE_CODE_AUTO_COMPACT_WINDOW` at launch using the selected provider/model
+ context window, including the cached Ollama/Ollama Cloud model catalog. Smaller
+ custom models now let Claude Code's native auto-compact trigger against their
+ real context budget instead of falling back to Claude Code's generic 200K
+ assumption.
+
  ### 0.1.63

  - **Plan Mode stop guard**: when a non-Anthropic model is already in Plan Mode
@@ -420,7 +450,7 @@ steps under that larger model's supervision.

  - **Dynamic timeout help**: the LLM options panel now describes
  `request_timeout_ms` using the currently selected value instead of always
- showing the old `300000 ms = 5 minutes` example.
+ showing a hard-coded timeout example.

  ### 0.1.49

@@ -549,8 +579,7 @@ steps under that larger model's supervision.

  ### 0.1.31

- - **5-minute default upstream timeout**: existing saved 10/30-minute defaults
- are migrated to 300000 ms so gateway stalls fail faster.
+ - **2-minute default upstream timeout**: existing saved longer bundled defaults are migrated to 120000 ms so gateway stalls fail faster.
  - **Localized gateway retries**: 502/503/504 and socket timeout responses are
  retried automatically, with retry progress shown in the selected UI language.

@@ -4,7 +4,6 @@ from __future__ import annotations
  import json
  import os
  import re
- import hashlib
  import sys
  import time
  from pathlib import Path
@@ -13,6 +12,42 @@ from typing import Any

  NON_NATIVE_PROVIDERS = {"ollama", "ollama-cloud", "vllm", "nvidia-hosted", "self-hosted-nim"}
  TASK_STATUS = {"pending", "in_progress", "completed", "deleted"}
+ TASK_STATUS_ALIASES = {
+ "active": "in_progress",
+ "assigned": "in_progress",
+ "current": "in_progress",
+ "doing": "in_progress",
+ "inprogress": "in_progress",
+ "in_progress": "in_progress",
+ "in-progress": "in_progress",
+ "in progress": "in_progress",
+ "ongoing": "in_progress",
+ "processing": "in_progress",
+ "running": "in_progress",
+ "started": "in_progress",
+ "working": "in_progress",
+ "complete": "completed",
+ "completed": "completed",
+ "done": "completed",
+ "finished": "completed",
+ "resolved": "completed",
+ "success": "completed",
+ "closed": "completed",
+ "open": "pending",
+ "pending": "pending",
+ "queued": "pending",
+ "todo": "pending",
+ "to_do": "pending",
+ "to-do": "pending",
+ "waiting": "pending",
+ "cancel": "deleted",
+ "cancelled": "deleted",
+ "canceled": "deleted",
+ "delete": "deleted",
+ "deleted": "deleted",
+ "remove": "deleted",
+ "removed": "deleted",
+ }
  DESCRIPTION_OK = {"Bash", "TaskCreate", "TaskUpdate"}
  DROP_DESCRIPTION = {"Read", "Write", "Edit", "MultiEdit", "Glob", "Grep", "LS"}
  BASH_KEYS = {"command", "description", "timeout", "run_in_background"}
@@ -24,7 +59,17 @@ GLOB_KEYS = {"pattern", "path"}
  GREP_KEYS = {"pattern", "path", "glob", "type", "output_mode", "-A", "-B", "-C", "head_limit", "multiline"}
  LS_KEYS = {"path", "ignore"}
  TASKLIST_KEYS: set[str] = set()
- TASKUPDATE_KEYS = {"taskId", "status"}
+ TASKUPDATE_KEYS = {
+ "taskId",
+ "subject",
+ "description",
+ "activeForm",
+ "status",
+ "addBlocks",
+ "addBlockedBy",
+ "owner",
+ "metadata",
+ }
  STRICT_KEYS = {
  "Bash": BASH_KEYS,
  "Read": READ_KEYS,
@@ -45,7 +90,7 @@ REQUIRED_KEYS = {
  "MultiEdit": {"file_path", "edits"},
  "Glob": {"pattern"},
  "Grep": {"pattern"},
- "TaskUpdate": {"taskId", "status"},
+ "TaskUpdate": {"taskId"},
  }
  TOOL_HINTS = {
  "Bash": "Use Bash with command, description, timeout, and run_in_background only.",
@@ -55,7 +100,7 @@ TOOL_HINTS = {
  "MultiEdit": "Use MultiEdit with file_path and edits only.",
  "Glob": "Use Glob with pattern and optional path only.",
  "Grep": "Use Grep with pattern, path, glob, type, output_mode, context, head_limit, or multiline only.",
- "TaskUpdate": "Use TaskUpdate with taskId and status.",
+ "TaskUpdate": "Use TaskUpdate with taskId and optional status pending, in_progress, completed, or deleted.",
  }
  PLAN_GUARD_MARKER = "[claude-any-plan-guard]"

@@ -421,37 +466,6 @@ def should_block_plan_stop(transcript_path: str | None) -> tuple[bool, str]:
  return True, reason


- def stop_block_count_path(session_id: str) -> Path:
- return cache_dir() / f"stop-block-{session_id or 'unknown'}.json"
-
-
- def increment_stop_block_count(session_id: str | None, text: str) -> int:
- path = stop_block_count_path(session_id or "unknown")
- key = hashlib.sha256(text.strip().encode("utf-8", errors="ignore")).hexdigest()[:16]
- try:
- data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
- if not isinstance(data, dict):
- data = {}
- except Exception:
- data = {}
- count = int(data.get(key) or 0) + 1
- data[key] = count
- tmp = path.with_suffix(".tmp")
- tmp.write_text(json.dumps(data, ensure_ascii=False) + "\n", encoding="utf-8")
- tmp.replace(path)
- return count
-
-
- def reset_stop_block_count(session_id: str | None) -> None:
- if not session_id:
- return
- path = stop_block_count_path(session_id)
- try:
- path.unlink(missing_ok=True)
- except Exception:
- pass
-
-
  def handle_stop(event: dict[str, Any]) -> int:
  log_json_event(event)
  if str(event.get("hook_event_name") or "") == "SubagentStop":
@@ -462,14 +476,11 @@ def handle_stop(event: dict[str, Any]) -> int:
  if active():
  should_block, reason = should_block_plan_stop(transcript_path)
  if should_block:
- count = increment_stop_block_count(session_id, reason)
- if count <= 3:
- out = {"decision": "block", "reason": reason, "suppressOutput": True}
- log_json_event(event, out)
- log_event(f"Stop guard blocked plan idle session={session_id} count={count} transcript={transcript_path}")
- emit(out)
- return 0
- log_event(f"Stop guard allowed repeated plan idle session={session_id} count={count} transcript={transcript_path}")
+ out = {"decision": "block", "reason": reason, "suppressOutput": True}
+ log_json_event(event, out)
+ log_event(f"Stop guard blocked plan idle session={session_id} transcript={transcript_path}")
+ emit(out)
+ return 0
  log_event(f"Stop guard observed session={session_id}")
  return 0

@@ -478,12 +489,19 @@ def normalize_aliases(tool: str, tool_input: dict[str, Any]) -> tuple[dict[str,
  updated = dict(tool_input)
  changed: list[str] = []

+ def present(value: Any) -> bool:
+ if value is None:
+ return False
+ if isinstance(value, str):
+ return bool(value.strip())
+ return True
+
  def alias(target: str, *names: str) -> None:
- if target in updated:
+ if present(updated.get(target)):
  return
  for name in names:
  value = updated.get(name)
- if value not in (None, ""):
+ if present(value):
  updated[target] = value
  changed.append(f"{name}->{target}")
  return
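Reviewer note: the hunk above changes when an alias is applied. In 0.1.63 the check was `if target in updated`, so a blank or `None` canonical key blocked the alias; the new `present()` helper treats those values as absent. The sketch below illustrates that behavior in isolation. `apply_alias` is a hypothetical standalone stand-in for the nested `alias()` closure, not the package's own function.

```python
from typing import Any


def present(value: Any) -> bool:
    """Same rule as the helper in the hunk: None and blank strings count as absent."""
    if value is None:
        return False
    if isinstance(value, str):
        return bool(value.strip())
    return True


def apply_alias(params: dict[str, Any], target: str, *names: str) -> list[str]:
    """Hypothetical standalone version of alias(); returns the list of renames made."""
    changed: list[str] = []
    if present(params.get(target)):
        return changed  # canonical key already carries a usable value
    for name in names:
        value = params.get(name)
        if present(value):
            params[target] = value
            changed.append(f"{name}->{target}")
            return changed
    return changed


# A whitespace-only taskId no longer blocks the task_id alias.
tool_input = {"taskId": "  ", "task_id": "7"}
print(apply_alias(tool_input, "taskId", "task_id", "id"))  # ['task_id->taskId']
print(tool_input["taskId"])  # 7
```

As in the hunk, the alias source key (`task_id` here) is left in place by this step; removing it is presumably handled elsewhere.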
@@ -500,9 +518,36 @@ def normalize_aliases(tool: str, tool_input: dict[str, Any]) -> tuple[dict[str,
  alias("path", "file_path", "directory")
  elif tool == "TaskUpdate":
  alias("taskId", "task_id", "id")
+ status = normalize_task_status(updated.get("status"))
+ if status and updated.get("status") != status:
+ before = updated.get("status")
+ updated["status"] = status
+ changed.append(f"status:{before}->{status}")
+ for key in ("addBlocks", "addBlockedBy"):
+ value = updated.get(key)
+ if isinstance(value, str) and value.strip():
+ updated[key] = [value.strip()]
+ changed.append(f"{key}:string->array")
+ metadata = updated.get("metadata")
+ if metadata is not None and not isinstance(metadata, dict):
+ updated.pop("metadata", None)
+ changed.append("metadata dropped")
  return updated, changed


+ def normalize_task_status(value: Any) -> str | None:
+ if value is None:
+ return None
+ text = str(value).strip()
+ if not text:
+ return None
+ normalized = re.sub(r"[\s\-]+", "_", text.lower())
+ normalized = re.sub(r"[^a-z0-9_]", "", normalized)
+ if normalized in TASK_STATUS:
+ return normalized
+ return TASK_STATUS_ALIASES.get(text.lower()) or TASK_STATUS_ALIASES.get(normalized)
+
+
  def missing_required_keys(tool: str, tool_input: dict[str, Any]) -> list[str]:
  required = REQUIRED_KEYS.get(tool, set())
  missing: list[str] = []
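Reviewer note: to see how the `normalize_task_status` helper added in this hunk maps loose model output onto the four canonical statuses, here is a runnable copy with indentation restored. The alias table is deliberately trimmed to a few entries for illustration; the full table appears earlier in the diff.

```python
import re
from typing import Any, Optional

TASK_STATUS = {"pending", "in_progress", "completed", "deleted"}
# Trimmed subset of TASK_STATUS_ALIASES, for illustration only.
TASK_STATUS_ALIASES = {
    "done": "completed",
    "todo": "pending",
    "to-do": "pending",
    "cancelled": "deleted",
}


def normalize_task_status(value: Any) -> Optional[str]:
    if value is None:
        return None
    text = str(value).strip()
    if not text:
        return None
    # Collapse whitespace and hyphens to underscores, then drop other punctuation.
    normalized = re.sub(r"[\s\-]+", "_", text.lower())
    normalized = re.sub(r"[^a-z0-9_]", "", normalized)
    if normalized in TASK_STATUS:
        return normalized
    # Try the raw lowercase text first, then the cleaned form.
    return TASK_STATUS_ALIASES.get(text.lower()) or TASK_STATUS_ALIASES.get(normalized)


print(normalize_task_status("In Progress"))  # in_progress
print(normalize_task_status("Done!"))        # completed
print(normalize_task_status("To-Do"))        # pending
print(normalize_task_status("blocked"))      # None
```

Unrecognized values return `None`, so the caller leaves the original `status` untouched and the later enum check in `handle_pre_tool` can still reject it.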
@@ -533,7 +578,6 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
  if tool.startswith("mcp__"):
  return
  log_json_event(event)
- reset_stop_block_count(str(event.get("session_id") or ""))
  raw = event.get("tool_input")
  if not isinstance(raw, dict):
  pre_deny(
@@ -561,9 +605,11 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
  )
  return

+ updated, dropped, changed = strip_unknown_keys(tool, raw)
+
  if tool == "TaskUpdate":
- task_id = raw.get("taskId")
- status = raw.get("status")
+ task_id = updated.get("taskId")
+ status = updated.get("status")
  if not isinstance(task_id, str) or not task_id.strip():
  tasks = known_tasks(str(event.get("session_id") or ""))
  known = ", ".join(f"{tid} ({info.get('subject')})" for tid, info in sorted(tasks.items())[:8] if isinstance(info, dict))
@@ -572,14 +618,13 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
  context += f" Known task ids for this session: {known}."
  pre_deny("TaskUpdate requires parameter taskId.", context)
  return
- if not isinstance(status, str) or status not in TASK_STATUS:
+ if status is not None and (not isinstance(status, str) or status not in TASK_STATUS):
  pre_deny(
  "TaskUpdate status must be one of pending, in_progress, completed, or deleted.",
  "Regenerate TaskUpdate with a valid status enum and preserve the taskId.",
  )
  return

- updated, dropped, changed = strip_unknown_keys(tool, raw)
  missing = missing_required_keys(tool, updated)
  if missing:
  log_event(f"PreToolUse denied tool={tool} missing={missing} keys={list(raw.keys())}")
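Reviewer note: the net effect of this hunk and the `REQUIRED_KEYS` change is that `taskId` stays mandatory while `status` becomes optional, and is only checked against the enum when it is supplied. A minimal sketch of that rule follows. `check_task_update` is a hypothetical helper that returns the denial message instead of calling `pre_deny`, and it assumes `status` has already been through alias normalization.

```python
from typing import Any, Optional

TASK_STATUS = {"pending", "in_progress", "completed", "deleted"}


def check_task_update(params: dict[str, Any]) -> Optional[str]:
    """Return a denial message, or None when the TaskUpdate call may proceed."""
    task_id = params.get("taskId")
    status = params.get("status")
    if not isinstance(task_id, str) or not task_id.strip():
        return "TaskUpdate requires parameter taskId."
    # status is optional as of this change; validate only when present.
    if status is not None and (not isinstance(status, str) or status not in TASK_STATUS):
        return "TaskUpdate status must be one of pending, in_progress, completed, or deleted."
    return None


print(check_task_update({"taskId": "4", "subject": "Rename module"}))  # None
print(check_task_update({"taskId": "4", "status": "blocked"}))
print(check_task_update({"status": "completed"}))
```

Under the 0.1.63 rule the first call would have been denied for lacking `status`; a subject-only update now passes.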
@@ -593,7 +638,7 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
  if dropped:
  reason_parts.append(f"removed unsupported parameter(s): {', '.join(dropped)}")
  if changed:
- reason_parts.append(f"normalized parameter name(s): {', '.join(changed)}")
+ reason_parts.append(f"normalized parameter/value(s): {', '.join(changed)}")
  reason = "; ".join(reason_parts)
  log_event(f"PreToolUse sanitized tool={tool} dropped={dropped} changed={changed} keys={list(raw.keys())}")
  pre_allow(