npm - @oneciel-ai/claude-any - Versions diffs - 0.1.46 → 0.1.63 - Mend

@oneciel-ai/claude-any 0.1.46 → 0.1.63

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8) hide show

package/README.md +68 -5
package/claude-any-tool-guard.py +155 -7
package/claude_any.py +2163 -612
package/docs/README.ja.md +60 -4
package/docs/README.ko.md +60 -4
package/docs/README.zh.md +57 -5
package/docs/manual.md +1 -1
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -48,7 +48,7 @@ arguments through unchanged.
 Credits: One Ciel LLC
-Current version: `0.1.46`
+Current version: `0.1.63`
 ## Why This Exists
@@ -338,9 +338,13 @@ steps under that larger model's supervision.
   native compatibility.
 - Compatibility test before launch, including text response, tool use, and
   tool-result round trip checks.
-- Runtime context reporting for vLLM/NIM when `/v1/models` exposes
-  `max_model_len`.
-- Console-first pre-launch menu for SSH and terminal workflows.
+- Runtime context reporting for vLLM/NIM when `/v1/models` exposes
+  `max_model_len`.
+- Ollama model-context catalog: `claude-any ollama-catalog` downloads
+  `https://ollama.com/api/tags` and Ollama library tag pages, then caches real
+  context windows such as 256K Kimi and 1M DeepSeek models for preset filtering
+  and status display.
+- Console-first pre-launch menu for SSH and terminal workflows.
 - Native paths where providers expose Claude/Anthropic-compatible endpoints.
 - Router mode for providers that need request/response adaptation.
 - DuckDuckGo and fetch MCP wiring for non-native providers.
@@ -381,6 +385,63 @@ steps under that larger model's supervision.
 ## Changelog
+### 0.1.63
+- **Plan Mode stop guard**: when a non-Anthropic model is already in Plan Mode
+  and stops after a short acknowledgement without a tool call, the Stop hook
+  now returns structured JSON feedback so Claude Code continues with a
+  plan-mode-safe tool instead of leaking text into the prompt box.
+- **Guard-feedback filtering**: claude-any filters its own plan-guard marker
+  from router history for all roles, preventing Stop hook recovery messages from
+  being sent back to upstream models.
+- **Safer retry budget**: the Stop guard retry counter now resets once a real
+  tool call is attempted, while `SubagentStop` events are kept observational.
+### 0.1.62
+- **Ollama context catalog**: added `claude-any ollama-catalog`, which downloads
+  Ollama's model list and library tag pages, strips suffixes such as `:cloud`
+  and `:latest`, and caches per-model context windows under
+  `~/.config/claude-any/ollama-model-catalog.json`.
+- **Context-aware presets**: the pre-launch menu now uses the selected model's
+  known context capacity to hide impossible presets and expose 1M-context
+  presets only for models that can actually use them.
+- **Native Claude Code compacting preserved**: removed the
+  `CLAUDE_CODE_AUTO_COMPACT_WINDOW` override so Claude Code's own compact
+  behavior stays in control instead of being capped too early by claude-any.
+- **Live context/status accounting**: the statusline prefers Claude Code's
+  current session context-window telemetry when available, while router mode
+  continues to report upstream request tokens, retries, RPM usage, and errors.
+- **Advisor and plan-mode hardening**: kept Advisor review support, stale
+  `ExitPlanMode` recovery, queued-command handling, and broader Claude Code hook
+  coverage for agent/task/team workflows on non-Anthropic providers.
+### 0.1.50
+- **Dynamic timeout help**: the LLM options panel now describes
+  `request_timeout_ms` using the currently selected value instead of always
+  showing the old `300000 ms = 5 minutes` example.
+### 0.1.49
+- **Streaming buffer fix**: Ollama/OpenAI-compatible streams now flush any
+  briefly held plan-detection text as soon as normal text streaming resumes,
+  instead of replaying it at the end of the response.
+- **Plan mode guard**: `ExitPlanMode` tool calls are dropped when Claude Code is
+  no longer in plan mode, avoiding the “You are not in plan mode” dead end.
+### 0.1.48
+- **Unreachable model list fix**: when a provider model endpoint cannot be
+  reached, the model picker no longer repopulates stale `current_model` or
+  `custom_models` entries from config as if they came from the new endpoint.
+### 0.1.47
+- **Base URL model reset**: changing a provider Base URL now clears stale
+  custom/current model entries and refreshes model caches, so the model picker
+  cannot keep showing models from the previous endpoint.
 ### 0.1.46
 - **Cleaner stream options**: the LLM options menu now hides `Stream word
@@ -419,7 +480,9 @@ steps under that larger model's supervision.
 ### 0.1.40
 - **RPM 0 is preserved**: setting `rate_limit_rpm=0` now stores an explicit
-  unlimited mode instead of falling back to the provider default.
+  unmanaged router mode instead of falling back to the provider default.
+  Claude Any still shows recent 60-second request usage when enabled, but it
+  does not claim the upstream provider is unlimited.
 ### 0.1.39

package/claude-any-tool-guard.py CHANGED Viewed

@@ -57,6 +57,7 @@ TOOL_HINTS = {
     "Grep": "Use Grep with pattern, path, glob, type, output_mode, context, head_limit, or multiline only.",
     "TaskUpdate": "Use TaskUpdate with taskId and status.",
 }
+PLAN_GUARD_MARKER = "[claude-any-plan-guard]"
 def active() -> bool:
@@ -299,6 +300,82 @@ def transcript_plan_mode_active(transcript_path: str | None) -> bool:
     return active
+def message_text(message: dict[str, Any]) -> str:
+    content = message.get("content")
+    if isinstance(content, str):
+        return content.strip()
+    if not isinstance(content, list):
+        return ""
+    parts: list[str] = []
+    for block in content:
+        if isinstance(block, str):
+            parts.append(block)
+        elif isinstance(block, dict) and block.get("type") == "text":
+            parts.append(str(block.get("text") or ""))
+    return "\n".join(part for part in parts if part).strip()
+def message_has_tool_use(message: dict[str, Any]) -> bool:
+    content = message.get("content")
+    if not isinstance(content, list):
+        return False
+    return any(isinstance(block, dict) and block.get("type") == "tool_use" for block in content)
+def transcript_latest_turn(transcript_path: str | None) -> dict[str, Any]:
+    if not transcript_path:
+        return {}
+    path = Path(transcript_path)
+    if not path.exists():
+        return {}
+    try:
+        lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()[-160:]
+    except Exception:
+        return {}
+    latest_assistant: dict[str, Any] | None = None
+    latest_assistant_index = -1
+    parsed: list[dict[str, Any]] = []
+    for line in lines:
+        try:
+            data = json.loads(line)
+        except Exception:
+            continue
+        parsed.append(data)
+        message = data.get("message")
+        if isinstance(message, dict) and message.get("role") == "assistant":
+            latest_assistant = message
+            latest_assistant_index = len(parsed) - 1
+    if not latest_assistant:
+        return {}
+    latest_user_text = ""
+    for data in reversed(parsed[:latest_assistant_index]):
+        if data.get("type") != "user":
+            continue
+        message = data.get("message")
+        if not isinstance(message, dict):
+            continue
+        if message.get("isMeta") is True:
+            continue
+        text = message_text(message)
+        if not text:
+            continue
+        if text.startswith("Stop hook feedback:") or PLAN_GUARD_MARKER in text:
+            continue
+        if text.startswith("Claude Any plan guard:"):
+            continue
+        latest_user_text = text
+        break
+    return {
+        "assistant_text": message_text(latest_assistant),
+        "assistant_has_tool_use": message_has_tool_use(latest_assistant),
+        "user_text": latest_user_text,
+    }
 def short_resume_prompt(text: str) -> bool:
     normalized = re.sub(r"\s+", " ", text or "").strip()
     if not normalized or len(normalized) > 32:
@@ -307,16 +384,43 @@ def short_resume_prompt(text: str) -> bool:
 def non_actionable_stop_text(text: str) -> bool:
-    normalized = re.sub(r"\s+", " ", text or "").strip()
+    stripped = (text or "").strip()
+    normalized = re.sub(r"\s+", " ", stripped).strip()
     if not normalized or len(normalized) > 220:
         return False
-    if "\n" in text:
+    if "\n" in stripped:
         return False
     if re.search(r"[`{};/\\\\]|https?://", normalized):
         return False
     return True
+def should_block_plan_stop(transcript_path: str | None) -> tuple[bool, str]:
+    if not transcript_plan_mode_active(transcript_path):
+        return False, ""
+    turn = transcript_latest_turn(transcript_path)
+    assistant_text = str(turn.get("assistant_text") or "")
+    user_text = str(turn.get("user_text") or "")
+    if turn.get("assistant_has_tool_use"):
+        return False, ""
+    if not non_actionable_stop_text(assistant_text):
+        return False, ""
+    if re.search(r"[?？]", assistant_text):
+        return False, ""
+    if not short_resume_prompt(user_text):
+        return False, ""
+    reason = (
+        f"{PLAN_GUARD_MARKER} Claude Any plan guard: Claude Code is still in plan mode, "
+        "but the latest response ended as a short "
+        "acknowledgement without any concrete tool call. Continue now by calling the next required Claude Code "
+        "plan-mode-safe tool, such as Read, Glob, Grep, or ExitPlanMode. Use TaskUpdate only when an existing "
+        "task is being updated. If mutation is required, call ExitPlanMode with the plan first. Do not put the "
+        "next step into the user input box and do not wait for the user unless you are asking a real "
+        "clarification question."
+    )
+    return True, reason
 def stop_block_count_path(session_id: str) -> Path:
     return cache_dir() / f"stop-block-{session_id or 'unknown'}.json"
@@ -331,17 +435,41 @@ def increment_stop_block_count(session_id: str | None, text: str) -> int:
     except Exception:
         data = {}
     count = int(data.get(key) or 0) + 1
-    path.write_text(json.dumps({key: count}, ensure_ascii=False) + "\n", encoding="utf-8")
+    data[key] = count
+    tmp = path.with_suffix(".tmp")
+    tmp.write_text(json.dumps(data, ensure_ascii=False) + "\n", encoding="utf-8")
+    tmp.replace(path)
     return count
+def reset_stop_block_count(session_id: str | None) -> None:
+    if not session_id:
+        return
+    path = stop_block_count_path(session_id)
+    try:
+        path.unlink(missing_ok=True)
+    except Exception:
+        pass
 def handle_stop(event: dict[str, Any]) -> int:
     log_json_event(event)
-    # Claude Code 2.1.x records Stop hook stderr as a suggestion
-    # (`preventedContinuation: false`) in some interactive flows. That pollutes
-    # the transcript and can leak into the input buffer, so keep Stop events
-    # observational and do continuation control in the router instead.
+    if str(event.get("hook_event_name") or "") == "SubagentStop":
+        log_event(f"SubagentStop guard observed session={event.get('session_id') or ''}")
+        return 0
     session_id = str(event.get("session_id") or "")
+    transcript_path = str(event.get("transcript_path") or "")
+    if active():
+        should_block, reason = should_block_plan_stop(transcript_path)
+        if should_block:
+            count = increment_stop_block_count(session_id, reason)
+            if count <= 3:
+                out = {"decision": "block", "reason": reason, "suppressOutput": True}
+                log_json_event(event, out)
+                log_event(f"Stop guard blocked plan idle session={session_id} count={count} transcript={transcript_path}")
+                emit(out)
+                return 0
+            log_event(f"Stop guard allowed repeated plan idle session={session_id} count={count} transcript={transcript_path}")
     log_event(f"Stop guard observed session={session_id}")
     return 0
@@ -405,6 +533,7 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
     if tool.startswith("mcp__"):
         return
     log_json_event(event)
+    reset_stop_block_count(str(event.get("session_id") or ""))
     raw = event.get("tool_input")
     if not isinstance(raw, dict):
         pre_deny(
@@ -413,6 +542,25 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
         )
         return
+    if tool in {"EnterPlanMode", "ExitPlanMode"}:
+        transcript_path = str(event.get("transcript_path") or "")
+        if transcript_path:
+            in_plan_mode = transcript_plan_mode_active(transcript_path)
+            if tool == "EnterPlanMode" and in_plan_mode:
+                log_event(f"PreToolUse denied repeated EnterPlanMode transcript={transcript_path}")
+                pre_deny(
+                    "Claude Code is already in plan mode.",
+                    "Continue the current plan-mode exploration. Do not call EnterPlanMode again.",
+                )
+                return
+            if tool == "ExitPlanMode" and not in_plan_mode:
+                log_event(f"PreToolUse denied stale ExitPlanMode transcript={transcript_path}")
+                pre_deny(
+                    "Claude Code is not currently in plan mode.",
+                    "If the plan was already approved or plan mode was exited, continue with concrete work instead of calling ExitPlanMode. If planning is required again, enter plan mode first.",
+                )
+                return
     if tool == "TaskUpdate":
         task_id = raw.get("taskId")
         status = raw.get("status")