@oneciel-ai/claude-any 0.1.46 → 0.1.63

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -48,7 +48,7 @@ arguments through unchanged.
48
48
 
49
49
  Credits: One Ciel LLC
50
50
 
51
- Current version: `0.1.46`
51
+ Current version: `0.1.63`
52
52
 
53
53
  ## Why This Exists
54
54
 
@@ -338,9 +338,13 @@ steps under that larger model's supervision.
338
338
  native compatibility.
339
339
  - Compatibility test before launch, including text response, tool use, and
340
340
  tool-result round trip checks.
341
- - Runtime context reporting for vLLM/NIM when `/v1/models` exposes
342
- `max_model_len`.
343
- - Console-first pre-launch menu for SSH and terminal workflows.
341
+ - Runtime context reporting for vLLM/NIM when `/v1/models` exposes
342
+ `max_model_len`.
343
+ - Ollama model-context catalog: `claude-any ollama-catalog` downloads
344
+ `https://ollama.com/api/tags` and Ollama library tag pages, then caches real
345
+ context windows such as 256K Kimi and 1M DeepSeek models for preset filtering
346
+ and status display.
347
+ - Console-first pre-launch menu for SSH and terminal workflows.
344
348
  - Native paths where providers expose Claude/Anthropic-compatible endpoints.
345
349
  - Router mode for providers that need request/response adaptation.
346
350
  - DuckDuckGo and fetch MCP wiring for non-native providers.
@@ -381,6 +385,63 @@ steps under that larger model's supervision.
381
385
 
382
386
  ## Changelog
383
387
 
388
+ ### 0.1.63
389
+
390
+ - **Plan Mode stop guard**: when a non-Anthropic model is already in Plan Mode
391
+ and stops after a short acknowledgement without a tool call, the Stop hook
392
+ now returns structured JSON feedback so Claude Code continues with a
393
+ plan-mode-safe tool instead of leaking text into the prompt box.
394
+ - **Guard-feedback filtering**: claude-any filters its own plan-guard marker
395
+ from router history for all roles, preventing Stop hook recovery messages from
396
+ being sent back to upstream models.
397
+ - **Safer retry budget**: the Stop guard retry counter now resets once a real
398
+ tool call is attempted, while `SubagentStop` events are kept observational.
399
+
400
+ ### 0.1.62
401
+
402
+ - **Ollama context catalog**: added `claude-any ollama-catalog`, which downloads
403
+ Ollama's model list and library tag pages, strips suffixes such as `:cloud`
404
+ and `:latest`, and caches per-model context windows under
405
+ `~/.config/claude-any/ollama-model-catalog.json`.
406
+ - **Context-aware presets**: the pre-launch menu now uses the selected model's
407
+ known context capacity to hide impossible presets and expose 1M-context
408
+ presets only for models that can actually use them.
409
+ - **Native Claude Code compacting preserved**: removed the
410
+ `CLAUDE_CODE_AUTO_COMPACT_WINDOW` override so Claude Code's own compact
411
+ behavior stays in control instead of being capped too early by claude-any.
412
+ - **Live context/status accounting**: the statusline prefers Claude Code's
413
+ current session context-window telemetry when available, while router mode
414
+ continues to report upstream request tokens, retries, RPM usage, and errors.
415
+ - **Advisor and plan-mode hardening**: kept Advisor review support, stale
416
+ `ExitPlanMode` recovery, queued-command handling, and broader Claude Code hook
417
+ coverage for agent/task/team workflows on non-Anthropic providers.
418
+
419
+ ### 0.1.50
420
+
421
+ - **Dynamic timeout help**: the LLM options panel now describes
422
+ `request_timeout_ms` using the currently selected value instead of always
423
+ showing the old `300000 ms = 5 minutes` example.
424
+
425
+ ### 0.1.49
426
+
427
+ - **Streaming buffer fix**: Ollama/OpenAI-compatible streams now flush any
428
+ briefly held plan-detection text as soon as normal text streaming resumes,
429
+ instead of replaying it at the end of the response.
430
+ - **Plan mode guard**: `ExitPlanMode` tool calls are dropped when Claude Code is
431
+ no longer in plan mode, avoiding the “You are not in plan mode” dead end.
432
+
433
+ ### 0.1.48
434
+
435
+ - **Unreachable model list fix**: when a provider model endpoint cannot be
436
+ reached, the model picker no longer repopulates stale `current_model` or
437
+ `custom_models` entries from config as if they came from the new endpoint.
438
+
439
+ ### 0.1.47
440
+
441
+ - **Base URL model reset**: changing a provider Base URL now clears stale
442
+ custom/current model entries and refreshes model caches, so the model picker
443
+ cannot keep showing models from the previous endpoint.
444
+
384
445
  ### 0.1.46
385
446
 
386
447
  - **Cleaner stream options**: the LLM options menu now hides `Stream word
@@ -419,7 +480,9 @@ steps under that larger model's supervision.
419
480
  ### 0.1.40
420
481
 
421
482
  - **RPM 0 is preserved**: setting `rate_limit_rpm=0` now stores an explicit
422
- unlimited mode instead of falling back to the provider default.
483
+ unmanaged router mode instead of falling back to the provider default.
484
+ Claude Any still shows recent 60-second request usage when enabled, but it
485
+ does not claim the upstream provider is unlimited.
423
486
 
424
487
  ### 0.1.39
425
488
 
@@ -57,6 +57,7 @@ TOOL_HINTS = {
57
57
  "Grep": "Use Grep with pattern, path, glob, type, output_mode, context, head_limit, or multiline only.",
58
58
  "TaskUpdate": "Use TaskUpdate with taskId and status.",
59
59
  }
60
+ PLAN_GUARD_MARKER = "[claude-any-plan-guard]"
60
61
 
61
62
 
62
63
  def active() -> bool:
@@ -299,6 +300,82 @@ def transcript_plan_mode_active(transcript_path: str | None) -> bool:
299
300
  return active
300
301
 
301
302
 
303
+ def message_text(message: dict[str, Any]) -> str:
304
+ content = message.get("content")
305
+ if isinstance(content, str):
306
+ return content.strip()
307
+ if not isinstance(content, list):
308
+ return ""
309
+ parts: list[str] = []
310
+ for block in content:
311
+ if isinstance(block, str):
312
+ parts.append(block)
313
+ elif isinstance(block, dict) and block.get("type") == "text":
314
+ parts.append(str(block.get("text") or ""))
315
+ return "\n".join(part for part in parts if part).strip()
316
+
317
+
318
+ def message_has_tool_use(message: dict[str, Any]) -> bool:
319
+ content = message.get("content")
320
+ if not isinstance(content, list):
321
+ return False
322
+ return any(isinstance(block, dict) and block.get("type") == "tool_use" for block in content)
323
+
324
+
325
+ def transcript_latest_turn(transcript_path: str | None) -> dict[str, Any]:
326
+ if not transcript_path:
327
+ return {}
328
+ path = Path(transcript_path)
329
+ if not path.exists():
330
+ return {}
331
+ try:
332
+ lines = path.read_text(encoding="utf-8", errors="ignore").splitlines()[-160:]
333
+ except Exception:
334
+ return {}
335
+
336
+ latest_assistant: dict[str, Any] | None = None
337
+ latest_assistant_index = -1
338
+ parsed: list[dict[str, Any]] = []
339
+ for line in lines:
340
+ try:
341
+ data = json.loads(line)
342
+ except Exception:
343
+ continue
344
+ parsed.append(data)
345
+ message = data.get("message")
346
+ if isinstance(message, dict) and message.get("role") == "assistant":
347
+ latest_assistant = message
348
+ latest_assistant_index = len(parsed) - 1
349
+
350
+ if not latest_assistant:
351
+ return {}
352
+
353
+ latest_user_text = ""
354
+ for data in reversed(parsed[:latest_assistant_index]):
355
+ if data.get("type") != "user":
356
+ continue
357
+ message = data.get("message")
358
+ if not isinstance(message, dict):
359
+ continue
360
+ if message.get("isMeta") is True:
361
+ continue
362
+ text = message_text(message)
363
+ if not text:
364
+ continue
365
+ if text.startswith("Stop hook feedback:") or PLAN_GUARD_MARKER in text:
366
+ continue
367
+ if text.startswith("Claude Any plan guard:"):
368
+ continue
369
+ latest_user_text = text
370
+ break
371
+
372
+ return {
373
+ "assistant_text": message_text(latest_assistant),
374
+ "assistant_has_tool_use": message_has_tool_use(latest_assistant),
375
+ "user_text": latest_user_text,
376
+ }
377
+
378
+
302
379
  def short_resume_prompt(text: str) -> bool:
303
380
  normalized = re.sub(r"\s+", " ", text or "").strip()
304
381
  if not normalized or len(normalized) > 32:
@@ -307,16 +384,43 @@ def short_resume_prompt(text: str) -> bool:
307
384
 
308
385
 
309
386
  def non_actionable_stop_text(text: str) -> bool:
310
- normalized = re.sub(r"\s+", " ", text or "").strip()
387
+ stripped = (text or "").strip()
388
+ normalized = re.sub(r"\s+", " ", stripped).strip()
311
389
  if not normalized or len(normalized) > 220:
312
390
  return False
313
- if "\n" in text:
391
+ if "\n" in stripped:
314
392
  return False
315
393
  if re.search(r"[`{};/\\\\]|https?://", normalized):
316
394
  return False
317
395
  return True
318
396
 
319
397
 
398
+ def should_block_plan_stop(transcript_path: str | None) -> tuple[bool, str]:
399
+ if not transcript_plan_mode_active(transcript_path):
400
+ return False, ""
401
+ turn = transcript_latest_turn(transcript_path)
402
+ assistant_text = str(turn.get("assistant_text") or "")
403
+ user_text = str(turn.get("user_text") or "")
404
+ if turn.get("assistant_has_tool_use"):
405
+ return False, ""
406
+ if not non_actionable_stop_text(assistant_text):
407
+ return False, ""
408
+ if re.search(r"[??]", assistant_text):
409
+ return False, ""
410
+ if not short_resume_prompt(user_text):
411
+ return False, ""
412
+ reason = (
413
+ f"{PLAN_GUARD_MARKER} Claude Any plan guard: Claude Code is still in plan mode, "
414
+ "but the latest response ended as a short "
415
+ "acknowledgement without any concrete tool call. Continue now by calling the next required Claude Code "
416
+ "plan-mode-safe tool, such as Read, Glob, Grep, or ExitPlanMode. Use TaskUpdate only when an existing "
417
+ "task is being updated. If mutation is required, call ExitPlanMode with the plan first. Do not put the "
418
+ "next step into the user input box and do not wait for the user unless you are asking a real "
419
+ "clarification question."
420
+ )
421
+ return True, reason
422
+
423
+
320
424
  def stop_block_count_path(session_id: str) -> Path:
321
425
  return cache_dir() / f"stop-block-{session_id or 'unknown'}.json"
322
426
 
@@ -331,17 +435,41 @@ def increment_stop_block_count(session_id: str | None, text: str) -> int:
331
435
  except Exception:
332
436
  data = {}
333
437
  count = int(data.get(key) or 0) + 1
334
- path.write_text(json.dumps({key: count}, ensure_ascii=False) + "\n", encoding="utf-8")
438
+ data[key] = count
439
+ tmp = path.with_suffix(".tmp")
440
+ tmp.write_text(json.dumps(data, ensure_ascii=False) + "\n", encoding="utf-8")
441
+ tmp.replace(path)
335
442
  return count
336
443
 
337
444
 
445
+ def reset_stop_block_count(session_id: str | None) -> None:
446
+ if not session_id:
447
+ return
448
+ path = stop_block_count_path(session_id)
449
+ try:
450
+ path.unlink(missing_ok=True)
451
+ except Exception:
452
+ pass
453
+
454
+
338
455
  def handle_stop(event: dict[str, Any]) -> int:
339
456
  log_json_event(event)
340
- # Claude Code 2.1.x records Stop hook stderr as a suggestion
341
- # (`preventedContinuation: false`) in some interactive flows. That pollutes
342
- # the transcript and can leak into the input buffer, so keep Stop events
343
- # observational and do continuation control in the router instead.
457
+ if str(event.get("hook_event_name") or "") == "SubagentStop":
458
+ log_event(f"SubagentStop guard observed session={event.get('session_id') or ''}")
459
+ return 0
344
460
  session_id = str(event.get("session_id") or "")
461
+ transcript_path = str(event.get("transcript_path") or "")
462
+ if active():
463
+ should_block, reason = should_block_plan_stop(transcript_path)
464
+ if should_block:
465
+ count = increment_stop_block_count(session_id, reason)
466
+ if count <= 3:
467
+ out = {"decision": "block", "reason": reason, "suppressOutput": True}
468
+ log_json_event(event, out)
469
+ log_event(f"Stop guard blocked plan idle session={session_id} count={count} transcript={transcript_path}")
470
+ emit(out)
471
+ return 0
472
+ log_event(f"Stop guard allowed repeated plan idle session={session_id} count={count} transcript={transcript_path}")
345
473
  log_event(f"Stop guard observed session={session_id}")
346
474
  return 0
347
475
 
@@ -405,6 +533,7 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
405
533
  if tool.startswith("mcp__"):
406
534
  return
407
535
  log_json_event(event)
536
+ reset_stop_block_count(str(event.get("session_id") or ""))
408
537
  raw = event.get("tool_input")
409
538
  if not isinstance(raw, dict):
410
539
  pre_deny(
@@ -413,6 +542,25 @@ def handle_pre_tool(event: dict[str, Any]) -> None:
413
542
  )
414
543
  return
415
544
 
545
+ if tool in {"EnterPlanMode", "ExitPlanMode"}:
546
+ transcript_path = str(event.get("transcript_path") or "")
547
+ if transcript_path:
548
+ in_plan_mode = transcript_plan_mode_active(transcript_path)
549
+ if tool == "EnterPlanMode" and in_plan_mode:
550
+ log_event(f"PreToolUse denied repeated EnterPlanMode transcript={transcript_path}")
551
+ pre_deny(
552
+ "Claude Code is already in plan mode.",
553
+ "Continue the current plan-mode exploration. Do not call EnterPlanMode again.",
554
+ )
555
+ return
556
+ if tool == "ExitPlanMode" and not in_plan_mode:
557
+ log_event(f"PreToolUse denied stale ExitPlanMode transcript={transcript_path}")
558
+ pre_deny(
559
+ "Claude Code is not currently in plan mode.",
560
+ "If the plan was already approved or plan mode was exited, continue with concrete work instead of calling ExitPlanMode. If planning is required again, enter plan mode first.",
561
+ )
562
+ return
563
+
416
564
  if tool == "TaskUpdate":
417
565
  task_id = raw.get("taskId")
418
566
  status = raw.get("status")