npm - @oneciel-ai/claude-any - Versions diffs - 0.1.62 → 0.1.64 - Mend

@oneciel-ai/claude-any 0.1.62 → 0.1.64

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8) hide show

package/README.md +22 -1
package/claude-any-tool-guard.py +136 -7
package/claude_any.py +36 -15
package/docs/README.ja.md +20 -1
package/docs/README.ko.md +20 -1
package/docs/README.zh.md +19 -1
package/docs/manual.md +1 -1
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -48,7 +48,7 @@ arguments through unchanged.
 Credits: One Ciel LLC
-Current version: `0.1.62`
+Current version: `0.1.64`
 ## Why This Exists
@@ -385,6 +385,27 @@ steps under that larger model's supervision.
 ## Changelog
+### 0.1.64
+- **Model-aware native auto-compact**: claude-any now injects
+  `CLAUDE_CODE_AUTO_COMPACT_WINDOW` at launch using the selected provider/model
+  context window, including the cached Ollama/Ollama Cloud model catalog. Smaller
+  custom models now let Claude Code's native auto-compact trigger against their
+  real context budget instead of falling back to Claude Code's generic 200K
+  assumption.
+### 0.1.63
+- **Plan Mode stop guard**: when a non-Anthropic model is already in Plan Mode
+  and stops after a short acknowledgement without a tool call, the Stop hook
+  now returns structured JSON feedback so Claude Code continues with a
+  plan-mode-safe tool instead of leaking text into the prompt box.
+- **Guard-feedback filtering**: claude-any filters its own plan-guard marker
+  from router history for all roles, preventing Stop hook recovery messages from
+  being sent back to upstream models.
+- **Safer retry budget**: the Stop guard retry counter now resets once a real
+  tool call is attempted, while `SubagentStop` events are kept observational.
 ### 0.1.62
 - **Ollama context catalog**: added `claude-any ollama-catalog`, which downloads

package/claude-any-tool-guard.py CHANGED Viewed

@@ -57,6 +57,7 @@ TOOL_HINTS = {
     "Grep": "Use Grep with pattern, path, glob, type, output_mode, context, head_limit, or multiline only.",
     "TaskUpdate": "Use TaskUpdate with taskId and status.",
 }
+PLAN_GUARD_MARKER = "[claude-any-plan-guard]"
 def active() -> bool:
@@ -299,6 +300,82 @@ def transcript_plan_mode_active(transcript_path: str | None) -> bool:
     return active
+def message_text(message: dict[str, Any]) -> str:
+    content = message.get("content")
+    if isinstance(content, str):
+        return content.strip()
+    if not isinstance(content, list):
+        return ""
+    parts: list[str] = []
+    for block in content:
+        if isinstance(block, str):
+            parts.append(block)
+        elif isinstance(block, dict) and block.get("type") == "text":
+            parts.append(str(block.get("text") or ""))
+    return "\n".join(part for part in parts if part).strip()
+def message_has_tool_use(message: dict[str, Any]) -> bool:
+    content = message.get("content")
+    if not isinstance(content, list):
+        return False
+    return any(isinstance(block, dict) and block.get("type") == "tool_use" for block in content)
+def transcript_latest_turn(transcript_path: str | None) -> dict[str, Any]:
+    if not transcript_path:
+        return {}
+    path = Path(transcript_path)
+    if not path.exists():
+        return {}
+    try:
+        lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()[-160:]
+    except Exception:
+        return {}
+    latest_assistant: dict[str, Any] | None = None
+    latest_assistant_index = -1
+    parsed: list[dict[str, Any]] = []
+    for line in lines:
+        try:
+            data = json.loads(line)
+        except Exception:
+            continue
+        parsed.append(data)
+        message = data.get("message")
+        if isinstance(message, dict) and message.get("role") == "assistant":
+            latest_assistant = message
+            latest_assistant_index = len(parsed) - 1
+    if not latest_assistant:
+        return {}
+    latest_user_text = ""
+    for data in reversed(parsed[:latest_assistant_index]):
+        if data.get("type") != "user":
+            continue
+        message = data.get("message")
+        if not isinstance(message, dict):
+            continue
+        if message.get("isMeta") is True:
+            continue
+        text = message_text(message)
+        if not text:
+            continue
+        if text.startswith("Stop hook feedback:") or PLAN_GUARD_MARKER in text:
+            continue
+        if text.startswith("Claude Any plan guard:"):
+            continue
+        latest_user_text = text
+        break
+    return {
+        "assistant_text": message_text(latest_assistant),
+        "assistant_has_tool_use": message_has_tool_use(latest_assistant),
+        "user_text": latest_user_text,
+    }
 def short_resume_prompt(text: str) -> bool:
     normalized = re.sub(r"\s+", " ", text or "").strip()
     if not normalized or len(normalized) > 32:
@@ -307,16 +384,43 @@ def short_resume_prompt(text: str) -> bool:
 def non_actionable_stop_text(text: str) -> bool:
-    normalized = re.sub(r"\s+", " ", text or "").strip()
+    stripped = (text or "").strip()
+    normalized = re.sub(r"\s+", " ", stripped).strip()
     if not normalized or len(normalized) > 220:
         return False
-    if "\n" in text:
+    if "\n" in stripped:
         return False
     if re.search(r"[`{};/\\\\]|https?://", normalized):
         return False
     return True
+def should_block_plan_stop(transcript_path: str | None) -> tuple[bool, str]:
+    if not transcript_plan_mode_active(transcript_path):
+        return False, ""
+    turn = transcript_latest_turn(transcript_path)
+    assistant_text = str(turn.get("assistant_text") or "")
+    user_text = str(turn.get("user_text") or "")
+    if turn.get("assistant_has_tool_use"):
+        return False, ""
+    if not non_actionable_stop_text(assistant_text):
+        return False, ""
+    if re.search(r"[?？]", assistant_text):
+        return False, ""
+    if not short_resume_prompt(user_text):
+        return False, ""
+    reason = (
+        f"{PLAN_GUARD_MARKER} Claude Any plan guard: Claude Code is still in plan mode, "
+        "but the latest response ended as a short "
+        "acknowledgement without any concrete tool call. Continue now by calling the next required Claude Code "
+        "plan-mode-safe tool, such as Read, Glob, Grep, or ExitPlanMode. Use TaskUpdate only when an existing "
+        "task is being updated. If mutation is required, call ExitPlanMode with the plan first. Do not put the "
+        "next step into the user input box and do not wait for the user unless you are asking a real "
+        "clarification question."
+    )
+    return True, reason
 def stop_block_count_path(session_id: str) -> Path:
     return cache_dir() / f"stop-block-{session_id or 'unknown'}.json"
@@ -331,17 +435,41 @@ def increment_stop_block_count(session_id: str | None, text: str) -> int:
     except Exception:
         data = {}
     count = int(data.get(key) or 0) + 1
-    path.write_text(json.dumps({key: count}, ensure_ascii=False) + "\n", encoding="utf-8")
+    data[key] = count
+    tmp = path.with_suffix(".tmp")
+    tmp.write_text(json.dumps(data, ensure_ascii=False) + "\n", encoding="utf-8")
+    tmp.replace(path)
     return count
+def reset_stop_block_count(session_id: str | None) -> None:
+    if not session_id:
+        return
+    path = stop_block_count_path(session_id)
+    try:
+        path.unlink(missing_ok=True)
+    except Exception:
+        pass
 def handle_stop(event: dict[str, Any]) -> int:
     log_json_event(event)
-    # Claude Code 2.1.x records Stop hook stderr as a suggestion
-    # (`preventedContinuation: false`) in some interactive flows. That pollutes
-    # the transcript and can leak into the input buffer, so keep Stop events
-    # observational and do continuation control in the router instead.
+    if str(event.get("hook_event_name") or "") == "SubagentStop":
+        log_event(f"SubagentStop guard observed session={event.get('session_id') or ''}")
+        return 0
     session_id = str(event.get("session_id") or "")
+    transcript_path = str(event.get("transcript_path") or "")
+    if active():
+        should_block, reason = should_block_plan_stop(transcript_path)
+        if should_block:
+            count = increment_stop_block_count(session_id, reason)
+            if count <= 3:
+                out = {"decision": "block", "reason": reason, "suppressOutput": True}
+                log_json_event(event, out)
+                log_event(f"Stop guard blocked plan idle session={session_id} count={count} transcript={transcript_path}")
+                emit(out)
+                return 0
+            log_event(f"Stop guard allowed repeated plan idle session={session_id} count={count} transcript={transcript_path}")
     log_event(f"Stop guard observed session={session_id}")
     return 0
@@ -405,6 +533,7 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
     if tool.startswith("mcp__"):
         return
     log_json_event(event)
+    reset_stop_block_count(str(event.get("session_id") or ""))
     raw = event.get("tool_input")
     if not isinstance(raw, dict):
         pre_deny(

package/claude_any.py CHANGED Viewed

@@ -90,7 +90,7 @@ PROVIDER_LABELS = {
     "self-hosted-nim": "Self Hosted NIM",
 }
 APP_NAME = "Claude Any"
-VERSION = "0.1.62"
+VERSION = "0.1.64"
 CREDITS = "Credits: One Ciel LLC"
 LOG_LEVELS = {"SILENT": 0, "ERROR": 1, "WARN": 2, "INFO": 3, "DEBUG": 4, "TRACE": 5}
@@ -106,6 +106,7 @@ _RATE_LIMIT_LOCK = threading.Lock()
 _CHAT_CONDITION = threading.Condition()
 _CHAT_NEXT_ID: int | None = None
 ADVISOR_FEEDBACK_MARKER = "CLAUDE_ANY_ADVISOR_FEEDBACK"
+PLAN_GUARD_MARKER = "[claude-any-plan-guard]"
 # Tools Claude Code injects into every model's tool list that misfire when called
 # by non-Anthropic models. See docs/notes from anthropics/claude-code issues
@@ -2110,19 +2111,28 @@ def plan_mode_tool_name_for_emit(body: dict[str, Any], name: str, tool_input: di
         router_log("WARN", "dropped ExitPlanMode while plan mode is not active")
         return None, tool_input
     return name, tool_input
-def latest_user_text(body: dict[str, Any]) -> str:
+def is_guard_feedback_text(text: str) -> bool:
+    stripped = (text or "").strip()
+    return (
+        stripped.startswith("Stop hook feedback:")
+        or stripped.startswith("Claude Any plan guard:")
+        or PLAN_GUARD_MARKER in stripped
+    )
+def latest_user_text(body: dict[str, Any]) -> str:
     for message in reversed(body.get("messages") or []):
         if not isinstance(message, dict) or message.get("role") != "user":
             continue
         if message.get("isMeta") is True:
             continue
-        content = message.get("content")
-        if isinstance(content, str):
-            if content.startswith("Stop hook feedback:"):
-                continue
-            return content
+        content = message.get("content")
+        if isinstance(content, str):
+            if is_guard_feedback_text(content):
+                continue
+            return content
         if not isinstance(content, list):
             # Claude Code can inject user-role attachment records such as
             # plan_mode_exit. They are state metadata, not new user intent.
@@ -2133,11 +2143,11 @@ def latest_user_text(body: dict[str, Any]) -> str:
         text_blocks = [
             block for block in content
             if isinstance(block, str) or (isinstance(block, dict) and block.get("type") == "text")
-        ]
-        text = anthropic_content_to_text(text_blocks)
-        if not text or text.startswith("Stop hook feedback:"):
-            continue
-        return text
+        ]
+        text = anthropic_content_to_text(text_blocks)
+        if not text or is_guard_feedback_text(text):
+            continue
+        return text
     return ""
@@ -3910,7 +3920,7 @@ def should_skip_upstream_message(message: dict[str, Any]) -> bool:
     if role == "user" and message.get("isMeta") is True:
         return True
     text = anthropic_content_to_text(content).strip()
-    if role == "user" and text.startswith("Stop hook feedback:"):
+    if is_guard_feedback_text(text):
         return True
     # Router diagnostics must never be fed back to the upstream model. In Claude
     # Code they can also appear in the prompt input after a malformed/empty turn.
@@ -8764,6 +8774,13 @@ def claude_code_output_token_limit(provider: str, pcfg: dict[str, Any]) -> int |
     return None
+def claude_code_auto_compact_window(provider: str, pcfg: dict[str, Any]) -> int | None:
+    limit = context_limit_for_status(provider, pcfg)
+    if limit:
+        return limit
+    return None
 def claude_code_context_model_alias(provider: str, pcfg: dict[str, Any], model: str) -> str:
     model = strip_claude_context_suffix(model)
     limit = context_limit_for_status(provider, pcfg)
@@ -8780,6 +8797,9 @@ def apply_common_claude_env(provider: str, pcfg: dict[str, Any], env: dict[str,
     output_tokens = claude_code_output_token_limit(provider, pcfg)
     if output_tokens:
         env["CLAUDE_CODE_MAX_OUTPUT_TOKENS"] = str(output_tokens)
+    compact_window = claude_code_auto_compact_window(provider, pcfg)
+    if compact_window:
+        env["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = str(compact_window)
     advisor_model = str(pcfg.get("advisor_model") or "").strip()
     if advisor_model:
         env["CLAUDE_ANY_ADVISOR_MODEL"] = advisor_model
@@ -8870,6 +8890,7 @@ def cmd_env(_: argparse.Namespace) -> None:
         "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS",
         "CLAUDE_CODE_ATTRIBUTION_HEADER",
         "CLAUDE_CODE_MAX_OUTPUT_TOKENS",
+        "CLAUDE_CODE_AUTO_COMPACT_WINDOW",
         "ANTHROPIC_MODEL",
         "ANTHROPIC_CUSTOM_MODEL_OPTION",
         "ANTHROPIC_DEFAULT_HAIKU_MODEL",

package/docs/README.ja.md CHANGED Viewed

@@ -47,7 +47,7 @@ vLLM、NVIDIA hosted、self-hosted NIM を選択し、通常の Claude Code 引
 Credits: One Ciel LLC
-現在のバージョン: `0.1.62`
+現在のバージョン: `0.1.64`
 ## 作られた理由
@@ -351,6 +351,25 @@ Windows/Linux 管理、クリーンアップスクリプト、定期的なセキ
 ## 変更履歴
+### 0.1.64
+- **モデル context 対応の native auto-compact**: claude-any は起動時に、選択中の
+  provider/model の context window を使って `CLAUDE_CODE_AUTO_COMPACT_WINDOW`
+  を注入します。Ollama/Ollama Cloud ではディスクにキャッシュした model catalog
+  も利用するため、小さい custom model でも Claude Code の汎用 200K 仮定ではなく、
+  実際の context budget に合わせて native auto-compact が発火します。
+### 0.1.63
+- **Plan Mode stop guard**: non-Anthropic モデルが Plan Mode 中に短い確認文だけで
+  tool call なしに停止した場合、Stop hook が構造化 JSON フィードバックを返し、
+  Claude Code が plan-mode-safe tool で続行できるようにしました。
+- **Guard feedback filtering**: claude-any の plan-guard marker をすべての role
+  の router history から除外し、Stop hook の復旧メッセージが upstream モデルへ
+  戻らないようにしました。
+- **より安全な retry budget**: 実際の tool call が試行されたら Stop guard の
+  カウンターをリセットし、`SubagentStop` は観察専用のままにします。
 ### 0.1.62
 - **Ollama context catalog**: `claude-any ollama-catalog` を追加しました。

package/docs/README.ko.md CHANGED Viewed

@@ -47,7 +47,7 @@ NVIDIA hosted, self-hosted NIM을 선택하고, Claude Code의 일반 인자는
 Credits: One Ciel LLC
-현재 버전: `0.1.62`
+현재 버전: `0.1.64`
 ## 왜 만들었나
@@ -351,6 +351,25 @@ Windows 이벤트 로그 리뷰, 바이러스/랜섬웨어 침입 시도 정리,
 ## 변경 이력
+### 0.1.64
+- **모델 컨텍스트 인식 native auto-compact**: claude-any가 실행 시 선택된
+  provider/model의 context window를 기준으로 `CLAUDE_CODE_AUTO_COMPACT_WINDOW`를
+  주입합니다. Ollama/Ollama Cloud는 디스크에 캐시된 model catalog도 활용하므로,
+  작은 커스텀 모델도 Claude Code의 기본 200K 가정이 아니라 실제 context budget에
+  맞춰 native auto-compact가 발동됩니다.
+### 0.1.63
+- **Plan Mode stop guard**: non-Anthropic 모델이 Plan Mode 안에서 짧은 확인
+  문장만 내고 tool call 없이 멈추는 경우, Stop hook이 구조화된 JSON 피드백을
+  반환해 Claude Code가 plan-mode-safe tool로 계속 진행하도록 했습니다.
+- **Guard 피드백 필터링**: claude-any의 plan-guard marker를 모든 role의 router
+  history에서 제거하여, Stop hook 복구 메시지가 upstream 모델로 다시 전달되지
+  않게 했습니다.
+- **더 안전한 retry budget**: 실제 tool call이 시도되면 Stop guard 카운터를
+  리셋하고, `SubagentStop` 이벤트는 관찰 전용으로 유지합니다.
 ### 0.1.62
 - **Ollama 컨텍스트 카탈로그**: `claude-any ollama-catalog` 명령을 추가했습니다.

package/docs/README.zh.md CHANGED Viewed

@@ -47,7 +47,7 @@ NIM，并把普通 Claude Code 参数原样传递。
 Credits: One Ciel LLC
-当前版本: `0.1.62`
+当前版本: `0.1.64`
 ## 为什么存在
@@ -337,6 +337,24 @@ Hermes 格式模型或部分较旧的 Qwen tool template。
 ## 更新日志
+### 0.1.64
+- **按模型上下文触发 native auto-compact**：claude-any 启动时会根据当前
+  provider/model 的 context window 注入 `CLAUDE_CODE_AUTO_COMPACT_WINDOW`。
+  Ollama/Ollama Cloud 会同时使用磁盘缓存的 model catalog，因此较小的 custom
+  model 也会按真实 context budget 触发 Claude Code 原生 auto-compact，而不是
+  退回到 Claude Code 通用的 200K 假设。
+### 0.1.63
+- **Plan Mode stop guard**：当 non-Anthropic 模型已经处于 Plan Mode，却只输出
+  简短确认文本且没有 tool call 就停止时，Stop hook 现在会返回结构化 JSON
+  反馈，让 Claude Code 继续调用 plan-mode-safe tool。
+- **Guard feedback filtering**：claude-any 会从所有 role 的 router history 中过滤
+  自己的 plan-guard marker，避免 Stop hook 恢复消息再次发送给 upstream 模型。
+- **更安全的 retry budget**：一旦真正的 tool call 被尝试，Stop guard 计数器会
+  重置；`SubagentStop` 事件保持仅观察模式。
 ### 0.1.62
 - **Ollama 上下文目录**：新增 `claude-any ollama-catalog` 命令。它会下载

package/docs/manual.md CHANGED Viewed

@@ -10,7 +10,7 @@ Code starts, while passing normal Claude Code arguments through unchanged.
 Credits: One Ciel LLC
-Current version: `0.1.62`
+Current version: `0.1.64`
 ## Install

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "@oneciel-ai/claude-any",
-  "version": "0.1.62",
+  "version": "0.1.64",
   "description": "Claude Code provider selector for Anthropic, Ollama, Ollama Cloud, vLLM, NVIDIA hosted, and self-hosted NIM.",
   "license": "MIT",
   "author": "One Ciel LLC",