PyPI - coderouter-cli - Versions diffs - 2.3.0a3__tar.gz → 2.4.0__tar.gz - Mend

coderouter-cli 2.3.0a3__tar.gz → 2.4.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (188)

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/.gitignore RENAMED

@@ -32,6 +32,11 @@ env/
 ENV/
 .python-version-local
+# ============================================================
+# _OUTPUTS
+# ============================================================
+_OUTPUTS/
 # uv
 # Note: uv.lock SHOULD be committed (per plan.md §5.4)
 # Do NOT add uv.lock here.

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/CHANGELOG.md RENAMED

@@ -6,6 +6,127 @@ versioning follows [SemVer](https://semver.org/).
 ---
+## [v2.4.0] — 2026-05-15 (Goal-session awareness — P1-4/5/6)
+Stable release following v2.3.0a4. Promotes the Plugin SDK to stable,
+adds three goal-session features, and ships a rule-suggestion CLI.
+### Added
+- **`coderouter/guards/_fingerprint.py`** (P1-4): Response fingerprinting
+  helper.  `fingerprint_response(text)` returns a 12-hex SHA-256 digest
+  of the top-N content words (stop-word-filtered, order-independent).
+  Used by the new `goal_progress_stall` drift signal to detect when a
+  model repeats itself without making progress.
+- **Signal 6 — `goal_progress_stall`** (`drift_detection.py`, P1-4):
+  Sixth drift signal added to `detect_drift()`.  Fires (mild) when the
+  fraction of fingerprinted responses that repeat an already-seen
+  fingerprint exceeds `repetition_rate_threshold` (default 0.4).
+  Requires `response_fingerprint` to be populated on observations; when
+  absent the signal is silently skipped (backward-compatible).
+- **`DriftThresholds.repetition_rate_threshold`** (P1-4): New field on
+  `DriftThresholds`, present on all three presets.  `THRESHOLDS_GOAL`
+  preset added (`min_window_fill=4`, `repetition_rate_threshold=0.2`,
+  tighter across the board) and exposed via `SENSITIVITY_PRESETS["goal"]`.
+- **`FallbackChain.goal_mode: bool = False`** (`config/schemas.py`, P1-5):
+  Profile-level flag.  When `True`, the drift detector ignores
+  `drift_detection_sensitivity` and uses `THRESHOLDS_GOAL` instead
+  (stricter thresholds + `min_window_fill=4`).  Designed for `/goal`
+  agent sessions where forward-progress stalls are more actionable.
+- **`coderouter/state/suggest_rules.py`** (P1-6): Statistical rule
+  suggestion engine.  `suggest_rules(WindowSummary) → list[RuleSuggestion]`
+  analyses the request journal and emits copy-paste YAML snippets.
+  Five rules: `provider_reorder` (cost rank), `enable_prompt_cache`
+  (high-token / low-hit providers), `enable_drift_detection` (reminder),
+  `low_sensitivity_small_window` (sparse-traffic guard), `goal_profile`
+  (output-divergence → `goal_mode: true`).  Pure statistics — no LLM.
+- **`coderouter replay --suggest-rules`** (`cli.py`, P1-6): New flag on
+  the existing `replay` subcommand.  Reads the full request journal,
+  runs `suggest_rules`, and prints a formatted terminal report with
+  confidence badges and YAML snippets.
+### Changed
+- **`ResponseObservation.response_fingerprint: str | None = None`**
+  (`drift_detection.py`): New optional field (slots-safe, defaults to
+  `None`).  Fully backward-compatible — existing callers that don't
+  populate it get the same five-signal behaviour as before.
+- **`FallbackEngine._observe_drift_signal`** (`fallback.py`): Accepts
+  new `response_fingerprint` kwarg.  Non-streaming and streaming success
+  paths now compute and pass a fingerprint for the `goal_progress_stall`
+  signal.  `goal_mode` check applies `THRESHOLDS_GOAL` when the profile
+  flag is set.
+### Files touched
+```
+A  coderouter/guards/_fingerprint.py
+M  coderouter/guards/__init__.py          — module registry comment
+M  coderouter/guards/drift_detection.py   — Signal 6, THRESHOLDS_GOAL, new fields
+M  coderouter/config/schemas.py           — FallbackChain.goal_mode
+M  coderouter/routing/fallback.py         — fingerprint wiring, goal_mode dispatch
+A  coderouter/state/suggest_rules.py
+M  coderouter/state/__init__.py           — module registry comment
+M  coderouter/cli.py                      — replay --suggest-rules
+A  docs/articles/v1-saga/note-14-v0-4-goal-mode.md
+M  docs/articles/v1-saga/INDEX.md
+M  docs/inside/future.md
+M  CHANGELOG.md, pyproject.toml          — 2.3.0a4 → 2.4.0
+```
+---
+## [v2.3.0a4] — 2026-05-08 (Plugin SDK — ruff cleanup)
+Patch over `v2.3.0a3`. CI's `ruff check .` job surfaced six lint
+findings in the new Plugin SDK code. None affect runtime behavior.
+### Fixed
+- **RUF022**: `__all__` in `coderouter/plugins/__init__.py` is now
+  isort-sorted alphabetically.
+- **RUF006**: `_fanout_observers` was using `asyncio.create_task`
+  without holding a strong reference. Asyncio's task tracker only
+  keeps a weakref, so a fanout-in-flight task could be GC'd before
+  the observer ran. Fixed by storing tasks in a per-engine
+  ``_observer_tasks: set[asyncio.Task[None]]`` and removing each
+  via ``task.add_done_callback(set.discard)`` on completion. The
+  attribute is lazy-initialized in `_fanout_observers` itself so
+  engines built via ``__new__`` (which bypass ``__init__``) still
+  work.
+- **I001 + F841**: `tests/test_plugins_integration.py` had unused
+  imports (`AnthropicResponse`, `AnthropicUsage`) and an unused
+  local (`captured_chat`) left over from a build-engine helper
+  whose code path never ran. Removed the helper entirely; the
+  remaining tests exercise the engine's hook surface
+  (``_apply_input_filters`` / ``_fanout_observers`` /
+  ``_safe_observe``) directly, which is what they always actually
+  did.
+- **I001**: `tests/test_plugins_loader.py` import block reordered
+  alphabetically by module name.
+### Files touched
+```
+M  coderouter/plugins/__init__.py        — __all__ alphabetical
+M  coderouter/routing/fallback.py        — task strong-ref set
+M  tests/test_plugins_integration.py     — drop dead helper
+M  tests/test_plugins_loader.py          — import order
+M  tests/test_plugins_registry.py        — formatting nit (blank line)
+M  CHANGELOG.md, pyproject.toml          — 2.3.0a3 → 2.3.0a4
+```
+After this patch, ``ruff check .`` passes against every tracked
+Python file in the repo.
+---
 ## [v2.3.0a3] — 2026-05-08 (Plugin SDK — LogRecord.module collision fix)
 Patch over `v2.3.0a2`. The wheel-install-and-test job in CI surfaced

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: coderouter-cli
-Version: 2.3.0a3
+Version: 2.4.0
 Summary: Local-first, free-first, fallback-built-in LLM router. Claude Code / OpenAI compatible.
 Project-URL: Homepage, https://github.com/zephel01/CodeRouter
 Project-URL: Repository, https://github.com/zephel01/CodeRouter
@@ -47,7 +47,7 @@ Description-Content-Type: text/markdown
 <p align="center">
   <a href="https://github.com/zephel01/CodeRouter/actions/workflows/ci.yml"><img src="https://github.com/zephel01/CodeRouter/actions/workflows/ci.yml/badge.svg?branch=main" alt="CI"></a>
-  <a href=""><img src="https://img.shields.io/badge/version-2.2.0-blue" alt="version"></a>
+  <a href="https://pypi.org/project/coderouter-cli/"><img src="https://img.shields.io/pypi/v/coderouter-cli?include_prereleases&color=blue&label=pypi" alt="pypi"></a>
   <a href=""><img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="python"></a>
   <a href=""><img src="https://img.shields.io/badge/deps-5-brightgreen" alt="deps"></a>
   <a href=""><img src="https://img.shields.io/badge/license-MIT-yellow" alt="license"></a>

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/README.en.md RENAMED

@@ -6,7 +6,7 @@
 <p align="center">
   <a href="https://github.com/zephel01/CodeRouter/actions/workflows/ci.yml"><img src="https://github.com/zephel01/CodeRouter/actions/workflows/ci.yml/badge.svg?branch=main" alt="CI"></a>
-  <a href=""><img src="https://img.shields.io/badge/version-2.2.0-blue" alt="version"></a>
+  <a href="https://pypi.org/project/coderouter-cli/"><img src="https://img.shields.io/pypi/v/coderouter-cli?include_prereleases&color=blue&label=pypi" alt="pypi"></a>
   <a href=""><img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="python"></a>
   <a href=""><img src="https://img.shields.io/badge/deps-5-brightgreen" alt="deps"></a>
   <a href=""><img src="https://img.shields.io/badge/license-MIT-yellow" alt="license"></a>

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/README.md RENAMED

@@ -6,7 +6,7 @@
 <p align="center">
   <a href="https://github.com/zephel01/CodeRouter/actions/workflows/ci.yml"><img src="https://github.com/zephel01/CodeRouter/actions/workflows/ci.yml/badge.svg?branch=main" alt="CI"></a>
-  <a href=""><img src="https://img.shields.io/badge/version-2.2.0-blue" alt="version"></a>
+  <a href="https://pypi.org/project/coderouter-cli/"><img src="https://img.shields.io/pypi/v/coderouter-cli?include_prereleases&color=blue&label=pypi" alt="pypi"></a>
   <a href=""><img src="https://img.shields.io/badge/python-3.12%2B-blue" alt="python"></a>
   <a href=""><img src="https://img.shields.io/badge/deps-5-brightgreen" alt="deps"></a>
   <a href=""><img src="https://img.shields.io/badge/license-MIT-yellow" alt="license"></a>

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/coderouter/cli.py RENAMED

@@ -293,6 +293,18 @@ def _build_parser() -> argparse.ArgumentParser:
         metavar="N",
         help="Use only the last N entries (applied after --since and --provider filters).",
     )
+    # P1-6: --suggest-rules — statistical analysis → routing rule proposals.
+    replay.add_argument(
+        "--suggest-rules",
+        action="store_true",
+        help=(
+            "P1-6: analyse the request journal and print actionable routing "
+            "rule suggestions as copy-paste YAML snippets. Suggestions cover "
+            "provider reordering by cost, prompt_cache enablement, drift "
+            "detection configuration, and goal profile creation. "
+            "Can be combined with --since / --limit to scope the analysis window."
+        ),
+    )
     return parser
@@ -684,6 +696,25 @@ def _run_replay(args: argparse.Namespace) -> int:
         print("replay: no matching entries found.")
         return 0
+    if getattr(args, "suggest_rules", False):
+        # P1-6: statistical rule suggestion mode.
+        # Always compute a full window summary (ignores --compare / --provider).
+        from coderouter.state.suggest_rules import format_suggestions, suggest_rules
+        from coderouter.state.replay import summarize_window as _sw
+        # Re-read without provider filter so we see all providers.
+        all_entries = read_request_log(log_path, since=args.since)
+        if args.limit is not None and args.limit > 0:
+            all_entries = all_entries[-args.limit:]
+        full_summary = _sw(all_entries)
+        suggestions = suggest_rules(full_summary)
+        print(f"Request journal: {len(all_entries)} entries analysed")
+        print(f"  Window: {full_summary.first_ts} → {full_summary.last_ts}")
+        print(f"  Providers: {', '.join(sorted(full_summary.providers))}")
+        print()
+        print(format_suggestions(suggestions))
+        return 0
     if args.compare:
         provider_a, provider_b = args.compare
         comparison = compare_providers(entries, provider_a, provider_b)
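
A minimal sketch of the flag-plus-window scoping shown in the hunk above. The parser, `scope_entries`, and the sample journal are hypothetical stand-ins written for illustration and are not part of `coderouter`. The point it illustrates is why the `limit > 0` guard matters: in Python, `entries[-0:]` is the whole list, so the guard is what makes the "zero means no limit" behaviour explicit rather than accidental.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the replay subparser, reduced to two flags.
    parser = argparse.ArgumentParser(prog="replay-sketch")
    parser.add_argument("--limit", type=int, default=None, metavar="N")
    parser.add_argument("--suggest-rules", action="store_true")
    return parser


def scope_entries(entries: list[str], limit: int | None) -> list[str]:
    # Keep only the last `limit` entries; None or non-positive means "all".
    if limit is not None and limit > 0:
        return entries[-limit:]
    return entries


args = build_parser().parse_args(["--suggest-rules", "--limit", "2"])
journal = ["e1", "e2", "e3", "e4"]  # placeholder journal entries
scoped = scope_entries(journal, args.limit)
```

argparse exposes `--suggest-rules` as `args.suggest_rules`, which is the attribute name the `getattr` check in the hunk reads.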

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/coderouter/config/schemas.py RENAMED

@@ -658,6 +658,28 @@ class FallbackChain(BaseModel):
         ),
     )
+    # --- P1-5: goal_mode — tighter drift thresholds for /goal sessions -------
+    #
+    # When True, the drift detector automatically switches to the
+    # ``THRESHOLDS_GOAL`` preset regardless of ``drift_detection_sensitivity``,
+    # and lowers ``min_window_fill`` to 4 so stall detection fires faster.
+    #
+    # Intended for profiles routed by the ``/goal`` meta-command where
+    # the agent is expected to make steady forward progress. Repetition and
+    # length collapse are much more meaningful signals in that context than
+    # in a general-purpose chat session.
+    goal_mode: bool = Field(
+        default=False,
+        description=(
+            "P1-5: when True, automatically applies the ``goal`` drift "
+            "threshold preset (stricter thresholds, lower ``min_window_fill`` "
+            "of 4) for this profile. Overrides ``drift_detection_sensitivity`` "
+            "when drift_detection_action is not ``off``. Designed for "
+            "agent/goal sessions where forward-progress stalls are more "
+            "actionable than in ad-hoc chat."
+        ),
+    )
     # --- v2.0-H (L6): Mid-stream partial stitching --------------------------
     #   * ``off``      — discard partial content on mid-stream failure (legacy).
     #   * ``surface``  — return partial content as a truncated-but-valid response.
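
The override described in the `goal_mode` field can be sketched as a small selection function. This is an illustrative reduction, not the project's code: `Thresholds`, `PRESETS`, and `pick_thresholds` are hypothetical names, only two of the threshold fields are modelled, and the `normal` entry is assumed to equal the class defaults because its definition does not appear in this diff. The `low`, `high`, and `goal` values are copied from the `drift_detection.py` hunks.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Only the two fields relevant to goal_mode are modelled here.
    repetition_rate_threshold: float = 0.4
    min_window_fill: int = 6


PRESETS = {
    "low": Thresholds(0.6, 10),
    "normal": Thresholds(),      # assumption: normal == class defaults
    "high": Thresholds(0.25, 4),
    "goal": Thresholds(0.2, 4),
}


def pick_thresholds(goal_mode: bool, sensitivity: str) -> Thresholds:
    # goal_mode wins over whatever sensitivity the profile configured.
    if goal_mode:
        return PRESETS["goal"]
    # Unknown sensitivity names fall back to "normal".
    return PRESETS.get(sensitivity, PRESETS["normal"])
```

With `goal_mode=True` the configured sensitivity is ignored entirely, so a profile set to `low` still gets the goal preset.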

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/coderouter/guards/__init__.py RENAMED

@@ -12,6 +12,8 @@ to hit:
   * :mod:`coderouter.guards.self_healing`    — v2.0-J auto-exclude +
                                                  restart + recovery probe
   * :mod:`coderouter.guards.continuous_probe` — v2.0-I background probing
+  * :mod:`coderouter.guards._fingerprint`     — P1-4 response fingerprinting
+                                                 for goal_progress_stall signal
 Each guard is a pure-functional / single-class module that the engine
 consults at the appropriate dispatch point. Guards never block the

coderouter_cli-2.4.0/coderouter/guards/_fingerprint.py ADDED

@@ -0,0 +1,125 @@
+"""Response fingerprinting for goal_progress_stall detection (P1-4).
+A "fingerprint" is a compact, order-independent signature of the *content*
+of an assistant response — independent of surface variation (filler phrases,
+minor rewordings).  Two responses with the same fingerprint are considered
+semantically repetitive for stall-detection purposes.
+Algorithm
+---------
+1. Normalise: lowercase, strip punctuation, collapse whitespace.
+2. Extract the N most-frequent content words (excluding a small stop-list).
+3. Sort alphabetically, join with '|', SHA-256 → 12-hex prefix.
+The 12-hex prefix gives 281 trillion distinct values — collision probability
+across any 20-response window is negligible (< 1 in 10^15).
+Why top-N content words instead of full hash?
+----------------------------------------------
+A verbatim hash would fail to catch "I cannot do X. Let me try Y" vs
+"Let me try Y as I cannot do X" — same stall, different hash.  By
+extracting the dominant vocabulary we get useful fuzzy equality without
+the overhead of embedding models.
+Usage
+-----
+    from coderouter.guards._fingerprint import fingerprint_response
+    fp = fingerprint_response(response_text)
+    obs = ResponseObservation(..., response_fingerprint=fp)
+"""
+from __future__ import annotations
+import hashlib
+import re
+import unicodedata
+# ---------------------------------------------------------------------------
+# Stop-word list (English + common LLM filler)
+# ---------------------------------------------------------------------------
+_STOP_WORDS: frozenset[str] = frozenset(
+    {
+        # English function words
+        "a", "an", "the", "and", "or", "but", "if", "in", "on", "at", "to",
+        "for", "of", "with", "by", "from", "as", "is", "it", "its", "be",
+        "was", "are", "were", "been", "has", "have", "had", "do", "does",
+        "did", "will", "would", "could", "should", "may", "might", "shall",
+        "this", "that", "these", "those", "i", "you", "he", "she", "we",
+        "they", "me", "him", "her", "us", "them", "my", "your", "his",
+        "their", "our", "what", "which", "who", "how", "when", "where",
+        "why", "not", "no", "so", "up", "out", "into", "about", "than",
+        "then", "there", "here", "also", "just", "can", "get", "all",
+        # Common LLM assistant filler
+        "certainly", "sure", "absolutely", "great", "happy", "help",
+        "please", "let", "know", "feel", "free", "answer", "question",
+        "response", "following", "based", "provide", "using",
+    }
+)
+# ---------------------------------------------------------------------------
+# Number of top content words to include in the fingerprint
+# ---------------------------------------------------------------------------
+_TOP_N: int = 12
+# ---------------------------------------------------------------------------
+# Public API
+# ---------------------------------------------------------------------------
+def fingerprint_response(text: str, *, top_n: int = _TOP_N) -> str:
+    """Return a 12-hex fingerprint string for *text*.
+    Parameters
+    ----------
+    text:
+        Raw assistant response text (plain text, not JSON).
+    top_n:
+        Number of most-frequent content words to include in the signature.
+        Defaults to ``_TOP_N`` (12).  Lower values are more fuzzy; higher
+        values are more precise.
+    Returns
+    -------
+    A 12-character lowercase hexadecimal string, e.g. ``"a3f7b2c091de"``.
+    Returns ``""`` for empty / whitespace-only input.
+    """
+    if not text or not text.strip():
+        return ""
+    # 1. Unicode normalisation + lowercase
+    normalised = unicodedata.normalize("NFKC", text).lower()
+    # 2. Strip punctuation / digits, collapse whitespace
+    normalised = re.sub(r"[^\w\s]", " ", normalised)
+    normalised = re.sub(r"\d+", " ", normalised)
+    normalised = re.sub(r"\s+", " ", normalised).strip()
+    # 3. Tokenise and filter stop words (also skip very short tokens)
+    tokens = [w for w in normalised.split() if len(w) > 2 and w not in _STOP_WORDS]
+    if not tokens:
+        return ""
+    # 4. Count frequencies, take top-N
+    freq: dict[str, int] = {}
+    for tok in tokens:
+        freq[tok] = freq.get(tok, 0) + 1
+    # Require at least 3 distinct content words; single-word or near-empty
+    # responses (e.g. "xxxxx..." test stubs, error codes, bare ACKs) produce
+    # the same fingerprint every time and would falsely inflate the repetition
+    # rate.  Returning "" marks these as "not fingerprinted" so detect_drift
+    # skips them entirely.
+    if len(freq) < 3:
+        return ""
+    top_words = sorted(freq, key=lambda w: (-freq[w], w))[:top_n]
+    # 5. Sort alphabetically → stable join → hash
+    signature = "|".join(sorted(top_words))
+    digest = hashlib.sha256(signature.encode()).hexdigest()
+    return digest[:12]
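
A simplified re-statement of the fingerprinting steps above, useful for checking the order-independence claim by hand. `sketch_fingerprint` and its tiny `STOP` set are hypothetical: they mirror the normalise, filter, top-N, sort, hash sequence from the hunk but use a much shorter stop list, so the digests will not match the real module's. The last three lines work out the size of the 12-hex value space and a birthday-style collision estimate for a 20-response window.

```python
import hashlib
import re
import unicodedata
from collections import Counter

# Deliberately tiny stop list for illustration only (hypothetical subset).
STOP = frozenset({"the", "and", "for", "with", "that", "this", "let", "not"})


def sketch_fingerprint(text: str, top_n: int = 12) -> str:
    """Simplified version of the steps in the hunk above."""
    norm = unicodedata.normalize("NFKC", text).lower()
    norm = re.sub(r"[^\w\s]", " ", norm)   # punctuation -> space
    norm = re.sub(r"\d+", " ", norm)       # digits -> space
    tokens = [w for w in norm.split() if len(w) > 2 and w not in STOP]
    freq = Counter(tokens)
    if len(freq) < 3:                      # too little content to fingerprint
        return ""
    # Most frequent first, ties broken alphabetically, then keep top_n.
    top = sorted(freq, key=lambda w: (-freq[w], w))[:top_n]
    signature = "|".join(sorted(top))      # alphabetical join is order-independent
    return hashlib.sha256(signature.encode()).hexdigest()[:12]


a = sketch_fingerprint("The parser rejects nested brackets, so retry with escaping.")
b = sketch_fingerprint("Retry with escaping: the parser rejects nested brackets!")
c = sketch_fingerprint("Tests pass after switching the tokenizer backend.")

space = 16 ** 12                     # distinct 12-hex values (2**48)
pairs = 20 * 19 // 2                 # response pairs in a 20-response window
collision_estimate = pairs / space   # birthday-style approximation
```

Here `a` and `b` share a fingerprint because they contain the same content words in a different order, while `c` differs. By this approximation the collision estimate is about 6.8e-13, roughly 1 in 1.5 × 10^12: still negligible for stall detection, though larger than the "< 1 in 10^15" figure quoted in the docstring.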

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/coderouter/guards/drift_detection.py RENAMED

@@ -34,6 +34,10 @@ Signals
   * ``stop_anomaly_rate`` — fraction of responses with unexpected stop_reason
     (not "end_turn" / "tool_use" / "max_tokens")
   * ``error_rate`` — fraction of attempts that ended in failure
+  * ``goal_progress_stall`` (P1-4) — fraction of fingerprinted responses
+    whose fingerprint matches a previously-seen fingerprint in the window,
+    indicating the model is repeating itself without making progress.
+    Only fires when ``response_fingerprint`` is populated on observations.
 Thresholds are bundled as :class:`DriftThresholds` with three presets
 (``low`` / ``normal`` / ``high`` sensitivity).
@@ -71,6 +75,15 @@ class ResponseObservation:
     is_error: bool = False
     """True if the attempt ended in provider-failed / provider-failed-midstream."""
     stream: bool = False
+    response_fingerprint: str | None = None
+    """P1-4: compact content fingerprint of the response text.
+    When set, used by the ``goal_progress_stall`` signal to detect
+    repetition: the same fingerprint appearing multiple times in the
+    window indicates the model is not making progress. Computed by
+    :func:`coderouter.guards._fingerprint.fingerprint_response`.
+    Pass ``None`` (default) to opt-out — the signal is silently skipped.
+    """
 # ---------------------------------------------------------------------------
@@ -100,6 +113,12 @@ class DriftThresholds:
     length_collapse_ratio: float = 0.5
     """If recent half median is < 50% of earlier half median → collapse."""
+    # P1-4: repetition/stall threshold
+    repetition_rate_threshold: float = 0.4
+    """P1-4: fraction of fingerprinted responses whose fingerprint has
+    appeared before in the window. Above this rate → goal_progress_stall
+    signal fires (mild). Default 0.4 = 2 out of 5 responses are repeats."""
     # Minimum observations before detection fires
     min_window_fill: int = 6
     """Don't trigger until at least this many observations in the window."""
@@ -112,6 +131,7 @@ THRESHOLDS_LOW = DriftThresholds(
     tool_silence_rate=0.8,
     stop_anomaly_rate=0.6,
     error_rate=0.4,
+    repetition_rate_threshold=0.6,
     min_window_fill=10,
 )
@@ -123,6 +143,19 @@ THRESHOLDS_HIGH = DriftThresholds(
     tool_silence_rate=0.5,
     stop_anomaly_rate=0.3,
     error_rate=0.15,
+    repetition_rate_threshold=0.25,
+    min_window_fill=4,
+)
+# P1-5: goal-mode preset — tighter thresholds + lower min_window_fill.
+# Applied automatically when the profile has goal_mode=True.
+THRESHOLDS_GOAL = DriftThresholds(
+    empty_response_rate=0.2,
+    length_collapse_ratio=0.6,
+    tool_silence_rate=0.5,
+    stop_anomaly_rate=0.3,
+    error_rate=0.15,
+    repetition_rate_threshold=0.2,
     min_window_fill=4,
 )
@@ -130,6 +163,7 @@ SENSITIVITY_PRESETS: dict[str, DriftThresholds] = {
     "low": THRESHOLDS_LOW,
     "normal": THRESHOLDS_NORMAL,
     "high": THRESHOLDS_HIGH,
+    "goal": THRESHOLDS_GOAL,
 }
@@ -244,6 +278,27 @@ def detect_drift(
     if error_rate > thresholds.error_rate:
         mild_flags.append(f"error_rate={error_rate:.2f}")
+    # --- Signal 6: Goal progress stall (P1-4) ---
+    # Only active when at least some observations have a fingerprint.
+    # Computes: how many fingerprinted responses repeat a fingerprint
+    # already seen earlier in the window.  High repetition → stall.
+    fingerprinted = [
+        obs for obs in window if obs.response_fingerprint  # excludes None and ""
+    ]
+    if len(fingerprinted) >= 3:
+        seen: set[str] = set()
+        repeat_count = 0
+        for obs in fingerprinted:
+            fp = obs.response_fingerprint  # guaranteed non-empty by filter above
+            if fp in seen:
+                repeat_count += 1
+            else:
+                seen.add(fp)
+        repetition_rate = repeat_count / len(fingerprinted)
+        signals["goal_progress_stall"] = round(repetition_rate, 3)
+        if repetition_rate > thresholds.repetition_rate_threshold:
+            mild_flags.append(f"goal_progress_stall={repetition_rate:.2f}")
     # --- Severity synthesis ---
     if severe_flags:
         severity: Literal["none", "mild", "severe"] = "severe"
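
The Signal 6 arithmetic can be traced on a toy window. The function below is a standalone sketch with a hypothetical name (`repetition_rate`) that takes bare fingerprint strings instead of `ResponseObservation` objects; the counting logic follows the hunk above, including the minimum of three fingerprinted responses and the strict `>` comparison.

```python
from __future__ import annotations


def repetition_rate(fingerprints: list[str | None]) -> float | None:
    # None and "" both mean "not fingerprinted" and are ignored.
    usable = [fp for fp in fingerprints if fp]
    if len(usable) < 3:
        return None  # too few fingerprinted responses; signal is skipped
    seen: set[str] = set()
    repeats = 0
    for fp in usable:
        if fp in seen:
            repeats += 1   # a fingerprint already seen earlier in the window
        else:
            seen.add(fp)
    return repeats / len(usable)


rate = repetition_rate(["aa", "bb", "aa", None, "aa", "cc", ""])
fires_default = rate is not None and rate > 0.4   # default threshold
fires_goal = rate is not None and rate > 0.2      # goal preset threshold
```

Five of the seven entries are usable and two of them repeat `"aa"`, so the rate is exactly 0.4. Because the comparison is strict, that window does not trip the default threshold of 0.4 but does trip the goal preset's 0.2.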

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/coderouter/plugins/__init__.py RENAMED

@@ -43,17 +43,14 @@ from coderouter.plugins.loader import (
 from coderouter.plugins.registry import PluginRegistry
 __all__ = [
-    # Active hooks
-    "InputFilter",
-    "Observer",
-    # Future hooks (Protocol-only, no engine integration yet)
+    "PLUGIN_GROUPS_FUTURE",
+    "PLUGIN_GROUPS_V2_3",
+    "Adapter",
     "Frontend",
     "Guard",
+    "InputFilter",
+    "Observer",
     "OutputFilter",
-    "Adapter",
-    # Discovery + container
     "PluginRegistry",
     "discover_and_load",
-    "PLUGIN_GROUPS_V2_3",
-    "PLUGIN_GROUPS_FUTURE",
 ]

{coderouter_cli-2.3.0a3 → coderouter_cli-2.4.0}/coderouter/routing/fallback.py RENAMED

@@ -838,6 +838,12 @@ class FallbackEngine:
         # so tests that build the engine via ``FallbackEngine.__new__``
         # see an empty registry instead of AttributeError.
         self._plugin_registry: PluginRegistry = plugins or PluginRegistry.empty()
+        # v2.3.0: holds strong refs to in-flight Observer fanout tasks
+        # so the asyncio event loop's weak-ref bookkeeping doesn't GC
+        # them mid-flight (RUF006).  Tasks remove themselves on done
+        # via ``add_done_callback(_observer_tasks.discard)`` in
+        # :meth:`_fanout_observers`.
+        self._observer_tasks: set[asyncio.Task[None]] = set()
         # Cache adapters so we don't re-instantiate per request
         self._adapters: dict[str, BaseAdapter] = {
             p.name: build_adapter(p) for p in config.providers
@@ -1277,6 +1283,7 @@ class FallbackEngine:
         stop_reason: str | None = None,
         is_error: bool = False,
         stream: bool = False,
+        response_fingerprint: str | None = None,
     ) -> DriftVerdict | None:
         """v2.0-G (L4): record an observation and check for drift.
@@ -1288,9 +1295,18 @@ class FallbackEngine:
         - Emits ``drift-detected`` log.
         - If action is ``promote`` or ``reload``, demotes the provider
           via the adaptive rank machinery.
+        Parameters
+        ----------
+        response_fingerprint:
+            P1-4: compact content fingerprint from
+            :func:`coderouter.guards._fingerprint.fingerprint_response`.
+            When set, enables the ``goal_progress_stall`` signal.
+            Pass ``None`` (default) to skip that signal.
         """
         from coderouter.guards.drift_detection import (
             SENSITIVITY_PRESETS,
+            THRESHOLDS_GOAL,
             ResponseObservation,
             detect_drift,
         )
@@ -1316,6 +1332,7 @@ class FallbackEngine:
             stop_reason=stop_reason,
             is_error=is_error,
             stream=stream,
+            response_fingerprint=response_fingerprint,
         )
         self._drift_window.record(obs)
@@ -1338,10 +1355,15 @@ class FallbackEngine:
             return None
         # Run detection
+        # P1-5: goal_mode overrides the sensitivity preset with the tighter
+        # THRESHOLDS_GOAL regardless of drift_detection_sensitivity setting.
         window = self._drift_window.get_window(provider)
-        thresholds = SENSITIVITY_PRESETS.get(
-            chain_cfg.drift_detection_sensitivity, SENSITIVITY_PRESETS["normal"]
-        )
+        if getattr(chain_cfg, "goal_mode", False):
+            thresholds = THRESHOLDS_GOAL
+        else:
+            thresholds = SENSITIVITY_PRESETS.get(
+                chain_cfg.drift_detection_sensitivity, SENSITIVITY_PRESETS["normal"]
+            )
         verdict = detect_drift(window, thresholds)
         if not verdict.drifted:
@@ -1881,10 +1903,22 @@ class FallbackEngine:
         observers = self.plugins.observers
         if not observers:
             return
+        # Lazy-init the task set for engines built via ``__new__`` —
+        # mirrors the lazy ``plugins`` property pattern so legacy
+        # tests that bypass __init__ don't crash here.
+        if not hasattr(self, "_observer_tasks"):
+            self._observer_tasks = set()
         for obs in observers:
-            asyncio.create_task(
+            task = asyncio.create_task(
                 self._safe_observe(obs, event_type, payload)
             )
+            # Strong-ref keeps the task alive past the loop iteration;
+            # ``discard`` cleans up after the task completes (success
+            # or exception). Avoids the RUF006 footgun where
+            # asyncio.create_task's weakref-only bookkeeping can let
+            # the loop GC a fanout-in-progress task.
+            self._observer_tasks.add(task)
+            task.add_done_callback(self._observer_tasks.discard)
     async def _safe_observe(
         self,
@@ -2065,6 +2099,13 @@ class FallbackEngine:
                 adapter.name, profile=request.profile
             )
             # v2.0-G (L4): drift detection observation (success path).
+            # P1-4: compute response fingerprint for goal_progress_stall.
+            _fp_text = " ".join(
+                getattr(b, "text", "") or (b.get("text", "") if isinstance(b, dict) else "")
+                for b in (resp.content or [])
+                if (getattr(b, "type", None) or (b.get("type") if isinstance(b, dict) else None)) == "text"
+            )
+            from coderouter.guards._fingerprint import fingerprint_response as _fp
             self._observe_drift_signal(
                 adapter.name,
                 profile=request.profile,
@@ -2075,6 +2116,7 @@ class FallbackEngine:
                 request_had_tools=bool(request.tools),
                 stop_reason=resp.stop_reason,
                 stream=False,
+                response_fingerprint=_fp(_fp_text) if _fp_text else None,
             )
             # v1.9-A: pair every successful Anthropic response with a
             # cache-observed log line. Native Anthropic / LM Studio
@@ -2294,6 +2336,11 @@ class FallbackEngine:
                     adapter.name, exc, partial_content=acc.partial_content
                 ) from exc
             # v2.0-G (L4): drift detection observation (stream success).
+            # P1-4: compute response fingerprint for goal_progress_stall.
+            _stream_fp_text = " ".join(
+                b.get("text", "") for b in acc.partial_content if b.get("type") == "text"
+            )
+            from coderouter.guards._fingerprint import fingerprint_response as _fp_s
             self._observe_drift_signal(
                 adapter.name,
                 profile=request.profile,
@@ -2302,6 +2349,7 @@ class FallbackEngine:
                 request_had_tools=bool(request.tools),
                 stop_reason=acc.stop_reason,
                 stream=True,
+                response_fingerprint=_fp_s(_stream_fp_text) if _stream_fp_text else None,
             )
             # v1.9-B2: pair the successful stream with a cache-observed
             # log line carrying the aggregated usage counters that the
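
The strong-reference pattern applied to `_fanout_observers` in this file can be shown in isolation. The `Fanout` class and `demo` coroutine below are hypothetical names for a minimal sketch; only `asyncio.create_task`, `Task.add_done_callback`, and `set.discard` are taken from the hunk.

```python
import asyncio


class Fanout:
    def __init__(self) -> None:
        # Strong references to in-flight tasks; the event loop itself
        # only keeps weak references to them.
        self._tasks: set[asyncio.Task[None]] = set()

    def spawn(self, coro) -> None:
        task = asyncio.create_task(coro)
        self._tasks.add(task)
        # Remove the task from the set once it finishes (success or error).
        task.add_done_callback(self._tasks.discard)


async def demo() -> tuple[int, int, list[str]]:
    seen: list[str] = []

    async def observe(name: str) -> None:
        await asyncio.sleep(0)
        seen.append(name)

    fan = Fanout()
    fan.spawn(observe("a"))
    fan.spawn(observe("b"))
    in_flight = len(fan._tasks)
    await asyncio.gather(*list(fan._tasks))
    await asyncio.sleep(0)  # give pending done-callbacks a chance to run
    return in_flight, len(fan._tasks), sorted(seen)


result = asyncio.run(demo())
```

Two tasks are tracked while in flight and the set is empty again once both have completed, so the set never grows without bound.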

coderouter_cli-2.4.0/coderouter/state/__init__.py ADDED

@@ -0,0 +1,19 @@
+"""Persistent state layer (v2.0-K).
+Five modules:
+* :mod:`coderouter.state.store`         — sqlite3 KV store for operational
+                                           metadata (budget totals, health
+                                           state, self-healing exclusions).
+* :mod:`coderouter.state.audit_log`     — JSONL structured event log with
+                                           rotation and CLI reader.
+* :mod:`coderouter.state.request_log`   — JSONL request metadata journal
+                                           (per-request token counts, cost,
+                                           provider — no request body).
+* :mod:`coderouter.state.replay`        — Statistical A/B analysis engine
+                                           over request journal entries.
+* :mod:`coderouter.state.suggest_rules` — P1-6 rule suggestion engine:
+                                           analyses WindowSummary and emits
+                                           copy-paste YAML snippets for
+                                           routing optimisation.
+"""
