PyPI - cli-agent-runner - Versions diffs - 0.1.40__tar.gz → 0.1.42__tar.gz - Mend

cli-agent-runner 0.1.40tar.gz → 0.1.42tar.gz


Files changed (235)

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/CHANGELOG.md RENAMED

@@ -5,6 +5,27 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.1.42] - 2026-06-25
+### Added
+- `crash_loop` defense — serve stops after 5 consecutive *unknown* short crashes (non-zero exit, <60s, no classified transient), escalating the restart delay and recording the failure reason. Ends the respawn-forever crash loop; recoverable-slow failures (rate-limit / quota / 5xx / timeout) still ride the transient-error backoff unchanged.
+- `config_broken` defense — a permanent startup-battery failure now halts serve (distinct no-retry exit code `78`) instead of respawning a broken config every round.
+### Fixed
+- `vcs.dirty_action` no longer sweeps the runner's own `log_dir` bookkeeping when `log_dir` is inside `work_dir`: `auto_commit` excludes it from the commit (no more phantom `git_head` advance on a zero-work round) and `stash` excludes it from `git stash push -u` (logs no longer vanish). `.evolving/` and agent work are unaffected.
+### Removed
+- The inert `smoke_fail_rate` monitor alert (could never fire — superseded by the always-on `config_broken` stop). Monitor now ships 11 detectors.
+### Docs
+- `thesis.md`: the stuck-loop defense is described honestly as a notify-level, opt-in-to-auto-stop monitor detector (`anomaly_repetitive_active`), not a default hard-stop; fixed the `stuck_loop_detected` naming drift.
+## [0.1.41] - 2026-06-07
+### Added
+- New `codewhale` preset — supervise Hmbown/CodeWhale (DeepSeek terminal agent) via `codewhale exec --auto --output-format stream-json`. `agent-runner init --preset codewhale`.
+- New built-in `codewhale_error_detector` plugin — emits `agent_usage_recorded` (model + token counts) from codewhale's stream-json output. Transient-error classification is best-effort (mappable buckets only); auth failures surface via the existing monitor `oauth_fail` detector.
 ## [0.1.40] - 2026-05-31
 ### Security

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: cli-agent-runner
-Version: 0.1.40
+Version: 0.1.42
 Summary: Restart-on-exit supervisor for autonomous CLI agents
 Project-URL: Homepage, https://github.com/wan9yu/cli-agent-runner
 Project-URL: Documentation, https://github.com/wan9yu/cli-agent-runner#readme
@@ -49,7 +49,7 @@ full disks, runaway memory.
 ```
 ┌──────────────────────────────────────────┐
-│ Layer 3: The Witness (monitor)           │  12 detectors + auto-stop
+│ Layer 3: The Witness (monitor)           │  11 detectors + auto-stop
 ├──────────────────────────────────────────┤
 │ Layer 2: The Loop (serve, ~120 LOC)      │  signal-trapping restart loop
 ├──────────────────────────────────────────┤
@@ -86,14 +86,14 @@ Full walkthrough: [`docs/quickstart.md`](docs/quickstart.md).
 |---|---|
 | `init` / `install` / `uninstall` | `peek` — state snapshot |
 | `start` / `stop` / `kill` / `cancel` | `watch` — peek in a refresh loop |
-| `restart` / `status` | `monitor` — 12 detectors, alerts, auto-stop |
+| `restart` / `status` | `monitor` — 11 detectors, alerts, auto-stop |
 | `round` / `serve` / `upgrade` | `events` — query / stream events.jsonl |
 Verb reference: [`docs/commands.md`](docs/commands.md).
 ## Defenses (built in)
-11 named defenses, structured as data — see `agent-runner peek --select defenses`.
+12 named defenses, structured as data — see `agent-runner peek --select defenses`.
 Each carries the historical incident it codifies and the invariant test that
 guards it. Highlights:
@@ -106,7 +106,7 @@ guards it. Highlights:
 Full list and rationale: [`docs/architecture.md`](docs/architecture.md).
-## Monitor: 12 detectors
+## Monitor: 11 detectors
 Notify only: `timeout_rate`, `hung`, `orphan_chain`, `disk_warning`,
 `mem_pressure`, `smoke_fail_rate`, `network_fail`, `rate_limit_active`,

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/README.md RENAMED

@@ -12,7 +12,7 @@ full disks, runaway memory.
 ```
 ┌──────────────────────────────────────────┐
-│ Layer 3: The Witness (monitor)           │  12 detectors + auto-stop
+│ Layer 3: The Witness (monitor)           │  11 detectors + auto-stop
 ├──────────────────────────────────────────┤
 │ Layer 2: The Loop (serve, ~120 LOC)      │  signal-trapping restart loop
 ├──────────────────────────────────────────┤
@@ -49,14 +49,14 @@ Full walkthrough: [`docs/quickstart.md`](docs/quickstart.md).
 |---|---|
 | `init` / `install` / `uninstall` | `peek` — state snapshot |
 | `start` / `stop` / `kill` / `cancel` | `watch` — peek in a refresh loop |
-| `restart` / `status` | `monitor` — 12 detectors, alerts, auto-stop |
+| `restart` / `status` | `monitor` — 11 detectors, alerts, auto-stop |
 | `round` / `serve` / `upgrade` | `events` — query / stream events.jsonl |
 Verb reference: [`docs/commands.md`](docs/commands.md).
 ## Defenses (built in)
-11 named defenses, structured as data — see `agent-runner peek --select defenses`.
+12 named defenses, structured as data — see `agent-runner peek --select defenses`.
 Each carries the historical incident it codifies and the invariant test that
 guards it. Highlights:
@@ -69,7 +69,7 @@ guards it. Highlights:
 Full list and rationale: [`docs/architecture.md`](docs/architecture.md).
-## Monitor: 12 detectors
+## Monitor: 11 detectors
 Notify only: `timeout_rate`, `hung`, `orphan_chain`, `disk_warning`,
 `mem_pressure`, `smoke_fail_rate`, `network_fail`, `rate_limit_active`,

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/README.zh.md RENAMED

@@ -6,7 +6,7 @@
 把任意 CLI agent（Claude Code、自研 agent、任何长跑命令）包装成可被
 systemd / launchd 拉起、能被远程观测的服务。**每轮跑完进程退出**，外层
-supervisor 重启 —— 这是核心模式。中间穿插 11 条防御，避开 production 上
+supervisor 重启 —— 这是核心模式。中间穿插 12 条防御，避开 production 上
 最容易翻车的几条路：
 - 轮卡死、Tool 调用空转 → 硬墙 timeout
@@ -20,7 +20,7 @@ supervisor 重启 —— 这是核心模式。中间穿插 11 条防御，避开
 ```
 ┌──────────────────────────────────────────┐
-│ Layer 3：Witness（monitor）              │  12 个检测器 + 自动停服
+│ Layer 3：Witness（monitor）              │  11 个检测器 + 自动停服
 ├──────────────────────────────────────────┤
 │ Layer 2：Loop（serve，~120 LOC 薄壳）    │  捕获信号，循环拉起 round
 ├──────────────────────────────────────────┤
@@ -63,7 +63,7 @@ agent-runner monitor              # 实时异常检测，OAuth/磁盘 critical
 |---|---|
 | `init` / `install` / `uninstall` | `peek` —— 项目状态快照 |
 | `start` / `stop` / `kill` / `cancel` | `watch` —— peek 在刷新循环里 |
-| `restart` / `status` | `monitor` —— 12 个检测器 + 告警 + 自动停服 |
+| `restart` / `status` | `monitor` —— 11 个检测器 + 告警 + 自动停服 |
 | `round` / `serve` / `upgrade` | `events` —— 查询 / 流式订阅 events.jsonl |
 **停服三动词**有清晰的语义分层：
@@ -73,7 +73,7 @@ agent-runner monitor              # 实时异常检测，OAuth/磁盘 critical
 动词参考：[`docs/commands.md`](docs/commands.md)。
-## 内置防御（11 条）
+## 内置防御（12 条）
 防御以数据形式定义在 `agent_runner/defenses.py`，可通过
 `agent-runner peek --select defenses` 直接拿到。每条防御自带：
@@ -95,7 +95,7 @@ agent-runner monitor              # 实时异常检测，OAuth/磁盘 critical
 完整列表 + 历史出处：[`docs/architecture.md`](docs/architecture.md)。
-## Monitor：12 个检测器
+## Monitor：11 个检测器
 **只告警**（warning 级，服务继续跑）：
 `timeout_rate` / `hung` / `orphan_chain` / `disk_warning` /

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/_emit.py RENAMED

@@ -45,6 +45,29 @@ def emit_max_rounds_reached(log_dir: Path, *, rounds_completed: int, max_rounds:
     emit(log_dir, MAX_ROUNDS_REACHED, rounds_completed=rounds_completed, max_rounds=max_rounds)
+def emit_config_broken(log_dir: Path, *, reason: str) -> None:
+    """Emit config_broken (serve stopped on a permanent startup-battery failure)."""
+    from agent_runner.events import CONFIG_BROKEN, emit
+    emit(log_dir, CONFIG_BROKEN, reason=reason)
+def emit_crash_loop(log_dir: Path, *, consecutive: int, exit_code: int, log_path: Path) -> None:
+    """Emit crash_loop (serve stopped after consecutive unknown short crashes).
+    Captures the failure reason — a redacted tail of the round log — so a
+    recurring unknown crash can later be classified into a transient bucket.
+    """
+    from agent_runner._redact import redact_secrets
+    from agent_runner.events import CRASH_LOOP, emit
+    try:
+        reason = redact_secrets(log_path.read_text(errors="replace")[-2000:])
+    except OSError:
+        reason = ""
+    emit(log_dir, CRASH_LOOP, consecutive=consecutive, exit_code=exit_code, reason=reason)
 def emit_stop_file_detected(
     log_dir: Path, *, stop_file: Path, content: str, rounds_completed: int
 ) -> None:
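
For reviewers: the `emit_crash_loop` hunk above reads the last 2000 characters of the round log, redacts them, and falls back to an empty reason when the file is unreadable. A minimal sketch of that pattern follows. The `redact_secrets` body here is a hypothetical stand-in (the real `agent_runner._redact.redact_secrets` is not shown in this diff), and `crash_reason` is an illustrative name rather than a function in the package.

```python
import re
import tempfile
from pathlib import Path

TAIL_CHARS = 2000  # same slice width as the [-2000:] in the diff


def redact_secrets(text: str) -> str:
    """Hypothetical stand-in: mask tokens that look like 'sk-...' API keys."""
    return re.sub(r"sk-[A-Za-z0-9]{8,}", "[REDACTED]", text)


def crash_reason(log_path: Path) -> str:
    """Return a redacted tail of the round log, or '' if it cannot be read."""
    try:
        return redact_secrets(log_path.read_text(errors="replace")[-TAIL_CHARS:])
    except OSError:  # missing file, permission error, etc.
        return ""


with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "round-1.log"
    log.write_text("x" * 5000 + " key=sk-abcdef1234567890\n")
    reason = crash_reason(log)
    missing = crash_reason(Path(tmp) / "does-not-exist.log")
```

Slicing before redacting keeps the work bounded on large logs, at the cost that a secret straddling the 2000-character boundary could be cut in half and then escape a pattern-based redactor.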

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/_version.py RENAMED

@@ -18,7 +18,7 @@ version_tuple: tuple[int | str, ...]
 commit_id: str | None
 __commit_id__: str | None
-__version__ = version = '0.1.40'
-__version_tuple__ = version_tuple = (0, 1, 40)
+__version__ = version = '0.1.42'
+__version_tuple__ = version_tuple = (0, 1, 42)
 __commit_id__ = commit_id = None

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/api.py RENAMED

@@ -18,7 +18,7 @@ import sysconfig
 import time
 from collections.abc import Iterator
 from pathlib import Path
-from typing import Any
+from typing import Any, Literal
 from agent_runner import events, lifecycle
 from agent_runner.api_types import (
@@ -45,6 +45,59 @@ from agent_runner.service_unit import (
     serve_unit_filename,
 )
+# Exit code for a permanent (no-retry) startup-battery failure. A broken config
+# does not self-heal between rounds, so serve STOPS rather than respawning it
+# forever. 78 = EX_CONFIG (sysexits) — avoids argparse's 2 and the generic 1.
+# Lives here (not runner.py) so serve_cmd can import it from the sanctioned api
+# facade without coupling to runner (runner imports api, not the reverse).
+PERMANENT_CONFIG_EXIT = 78
+# Crash-loop circuit breaker (b12). The serve loop escalates the restart delay
+# on consecutive UNKNOWN short crashes (non-zero exit, short duration, no
+# classified transient) and STOPS after CRASH_LOOP_THRESHOLD of them — the Run 6
+# ~100-empty-rounds scar. Recoverable-slow failures (rate limit / 5h quota / 5xx
+# / timeout) are already handled by the transient-error throttle and never reach
+# this path. A clean (exit 0), long, or classified-transient round resets the run.
+CRASH_LOOP_THRESHOLD = 5
+CRASH_LOOP_SHORT_EXIT_S = 60  # mirrors monitor.SHORT_EXIT_THRESHOLD_S
+CRASH_LOOP_MAX_DELAY_S = 1800  # cap the escalating restart delay (30 min)
+def post_round_decision(
+    *,
+    returncode: int,
+    duration_s: float,
+    throttle_active: bool,
+    consecutive: int,
+    restart_delay_s: int,
+) -> tuple[Literal["config_broken", "crash_loop", "continue"], int, int]:
+    """Restart policy after one round — keeps the serve loop a thin dispatcher.
+    Returns ``(action, delay_s, consecutive)`` where action is:
+    - ``"config_broken"`` — permanent startup failure (b18): stop.
+    - ``"crash_loop"`` — CRASH_LOOP_THRESHOLD consecutive unknown short crashes
+      (b12): stop. An unknown short crash is a non-zero, fast exit with no
+      classified transient (rate-limit/5xx/timeout are handled by the throttle).
+    - ``"continue"`` — sleep ``delay_s`` then run the next round.
+    A clean (exit 0), long, or transient round resets ``consecutive`` to 0; an
+    unknown short crash escalates the delay (restart × 2ⁿ, capped) until the stop.
+    """
+    if returncode == PERMANENT_CONFIG_EXIT:
+        return ("config_broken", 0, consecutive)
+    unknown_short_crash = (
+        returncode != 0 and duration_s < CRASH_LOOP_SHORT_EXIT_S and not throttle_active
+    )
+    if unknown_short_crash:
+        consecutive += 1
+        if consecutive >= CRASH_LOOP_THRESHOLD:
+            return ("crash_loop", 0, consecutive)
+        delay = min(restart_delay_s * 2**consecutive, CRASH_LOOP_MAX_DELAY_S)
+        return ("continue", delay, consecutive)
+    delay = restart_delay_s if returncode == 0 else restart_delay_s * 2
+    return ("continue", delay, 0)
 _PROJECT_NAME_RE = re.compile(r"^[A-Za-z0-9._-]+$")
 _LINGER_HINT = (
@@ -730,6 +783,8 @@ def check_self_terminated_sentinel(log_dir: Path) -> bool:
 from agent_runner._emit import (  # noqa: E402,F401 — intentional bottom re-export
     emit_agent_usage_recorded,
     emit_anomaly_repetitive_tool,
+    emit_config_broken,
+    emit_crash_loop,
     emit_fresh_eyes_round_triggered,
     emit_max_rounds_reached,
     emit_rate_limit_stop,
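
To make the restart policy in the `api.py` hunk concrete, the sketch below copies `post_round_decision` and its constants from the diff into a standalone script and feeds it a run of identical fast failures. The `simulate` helper and the sample inputs are illustrative additions, not part of the package.

```python
from __future__ import annotations

# Constants and policy copied from the api.py hunk above.
PERMANENT_CONFIG_EXIT = 78
CRASH_LOOP_THRESHOLD = 5
CRASH_LOOP_SHORT_EXIT_S = 60
CRASH_LOOP_MAX_DELAY_S = 1800


def post_round_decision(
    *,
    returncode: int,
    duration_s: float,
    throttle_active: bool,
    consecutive: int,
    restart_delay_s: int,
) -> tuple[str, int, int]:
    if returncode == PERMANENT_CONFIG_EXIT:
        return ("config_broken", 0, consecutive)
    unknown_short_crash = (
        returncode != 0 and duration_s < CRASH_LOOP_SHORT_EXIT_S and not throttle_active
    )
    if unknown_short_crash:
        consecutive += 1
        if consecutive >= CRASH_LOOP_THRESHOLD:
            return ("crash_loop", 0, consecutive)
        delay = min(restart_delay_s * 2**consecutive, CRASH_LOOP_MAX_DELAY_S)
        return ("continue", delay, consecutive)
    delay = restart_delay_s if returncode == 0 else restart_delay_s * 2
    return ("continue", delay, 0)


def simulate(restart_delay_s: int, rounds: list[tuple[int, float]]) -> list[tuple[str, int, int]]:
    """Illustrative helper: run (returncode, duration_s) pairs through the policy."""
    consecutive = 0
    trace = []
    for returncode, duration_s in rounds:
        action, delay, consecutive = post_round_decision(
            returncode=returncode,
            duration_s=duration_s,
            throttle_active=False,
            consecutive=consecutive,
            restart_delay_s=restart_delay_s,
        )
        trace.append((action, delay, consecutive))
        if action != "continue":
            break  # serve would stop here
    return trace


# Six 2-second failures with restart_delay_s=3: delays 6, 12, 24, 48, then stop.
trace = simulate(3, [(1, 2.0)] * 6)
for step in trace:
    print(step)
```

With the default-looking `restart_delay_s = 3` from the codewhale preset, the fifth consecutive unknown short crash returns `crash_loop`, so the sixth round in the input never runs. A single long or clean round in between resets the counter to 0.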

cli_agent_runner-0.1.42/agent_runner/builtin_plugins/codewhale.py ADDED

@@ -0,0 +1,133 @@
+"""Built-in post_round_hook for codewhale CLI: usage events + transient classifier.
+Third built-in plugin (after claude, gemini). Parses codewhale's `exec
+--output-format stream-json` NDJSON stdout tail; emits agent_usage_recorded
+from the terminal metadata record. Transient-error classification is
+best-effort and emits ONLY when an error maps to an existing bucket (like
+gemini): codewhale's exec stdout surfaces a {"type":"error"} record, but the
+only observed case so far is auth failure (oauth_fail territory, not a
+transient bucket), so nothing maps yet -- usage-only today. 429/5xx mapping
+is added when a real rate-limit sample is captured.
+"""
+from __future__ import annotations
+import json
+import time
+from collections import deque
+from pathlib import Path
+from typing import Any
+from agent_runner.api import (
+    emit_agent_usage_recorded,
+    emit_transient_error_detected,
+)
+from agent_runner.builtin_plugins._constants import (
+    _5XX_STATUSES,
+    _BACK_OFF_DEFAULTS,
+    _RAW_CAP,
+    _TAIL_LINES,
+)
+from agent_runner.hooks import HookContext, register_post_round_hook
+class CodewhaleErrorDetector:
+    """Parse codewhale round log tail; emit usage + transient_error_detected events."""
+    name = "codewhale_error_detector"
+    def after_round(self, ctx: HookContext, result: Any) -> None:
+        if ctx.agent_binary != "codewhale":
+            return
+        log_path = ctx.agent_log_path
+        if log_path is None or not log_path.exists():
+            return
+        parsed = _parse_codewhale_log(log_path)
+        if parsed.get("transient_error"):
+            emit_transient_error_detected(
+                ctx.log_dir, round_num=ctx.round_num, **parsed["transient_error"]
+            )
+        if parsed.get("usage"):
+            emit_agent_usage_recorded(
+                ctx.log_dir,
+                round_num=ctx.round_num,
+                phase=ctx.phase or "",
+                success=(result.exit_code == 0 and not result.timed_out),
+                **parsed["usage"],
+            )
+def _parse_codewhale_log(log_path: Path) -> dict[str, Any]:
+    """Scan last _TAIL_LINES of codewhale NDJSON; extract usage from the metadata
+    record; classify any {"type":"error"} that maps to a transient bucket.
+    Tolerates non-JSON lines (codewhale prefixes some stdout with terminal
+    escapes) via per-line try/except.
+    """
+    with log_path.open("r", encoding="utf-8", errors="replace") as f:
+        tail = deque(f, maxlen=_TAIL_LINES)
+    metadata: dict | None = None
+    error_event: dict | None = None
+    for line in tail:
+        line = line.strip()
+        if not line:
+            continue
+        try:
+            event = json.loads(line)
+        except json.JSONDecodeError:
+            continue
+        if not isinstance(event, dict):
+            continue
+        etype = event.get("type")
+        if etype == "metadata":
+            metadata = event.get("meta") or {}
+        elif etype == "error":
+            error_event = event
+    out: dict[str, Any] = {}
+    if metadata:
+        out["usage"] = {
+            "agent": "codewhale",
+            "model": str(metadata.get("model", "unknown")),
+            "input_tokens": int(metadata.get("input_tokens", 0)),
+            "output_tokens": int(metadata.get("output_tokens", 0)),
+            "cached_tokens": 0,  # codewhale exec stdout exposes no cache counts
+            "cost_usd": None,  # codewhale exec stdout exposes no USD
+            "duration_ms": 0,  # not in exec metadata
+        }
+    if error_event is not None:
+        classification = _classify_codewhale_error(error_event)
+        if classification:
+            duration = _BACK_OFF_DEFAULTS[classification]
+            out["transient_error"] = {
+                "classification": classification,
+                "agent": "codewhale",
+                "reset_at_epoch": int(time.time() + duration),
+                "raw": str(error_event.get("error", "error"))[:_RAW_CAP],
+            }
+    return out
+def _classify_codewhale_error(error_event: dict[str, Any]) -> str | None:
+    """Map a codewhale {"type":"error"} record to a transient bucket, or None.
+    None means 'not a transient error' (e.g. auth failure -> handled by the
+    monitor's oauth_fail log-scan, not the transient classifier). codewhale's
+    error record currently carries only a free-text 'error' string with no
+    status code; until a real rate-limit/5xx sample is captured we cannot map
+    to rate_limit_model / api_transient_5xx / api_timeout, so we return None.
+    A future revision keys on a numeric status field once observed.
+    """
+    code = error_event.get("code") or error_event.get("status_code")
+    if code == 429:
+        return "rate_limit_model"
+    if code in _5XX_STATUSES:
+        return "api_transient_5xx"
+    if code == 408:
+        return "api_timeout"
+    return None
+register_post_round_hook(CodewhaleErrorDetector())
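
The tail-scanning approach in `_parse_codewhale_log` can be tried in isolation. The sketch below keeps only the usage path and runs it on a made-up log. The `TAIL_LINES` value, the `parse_usage` name, the sample records and the model name are all assumptions for illustration; only the `{"type": "metadata", "meta": {...}}` shape and the field names come from the diff.

```python
from __future__ import annotations

import json
import tempfile
from collections import deque
from pathlib import Path

TAIL_LINES = 200  # assumption: the real _TAIL_LINES value is not shown in this diff


def parse_usage(log_path: Path) -> dict | None:
    """Scan the last TAIL_LINES lines of an NDJSON log for a metadata record."""
    with log_path.open("r", encoding="utf-8", errors="replace") as f:
        tail = deque(f, maxlen=TAIL_LINES)  # keeps only the final lines in memory
    meta = None
    for line in tail:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate escape-prefixed or otherwise non-JSON lines
        if isinstance(event, dict) and event.get("type") == "metadata":
            meta = event.get("meta") or {}
    if not meta:
        return None
    return {
        "model": str(meta.get("model", "unknown")),
        "input_tokens": int(meta.get("input_tokens", 0)),
        "output_tokens": int(meta.get("output_tokens", 0)),
    }


# Made-up sample log: one junk line, one unrelated record, one metadata record.
sample = "\n".join(
    [
        "\x1b[0m{not json",
        json.dumps({"type": "text", "text": "working..."}),
        json.dumps({"type": "metadata", "meta": {"model": "example-model", "input_tokens": 120, "output_tokens": 45}}),
    ]
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "round-1.log"
    path.write_text(sample + "\n", encoding="utf-8")
    usage = parse_usage(path)
    empty_path = Path(tmp) / "round-2.log"
    empty_path.write_text("plain text only\n", encoding="utf-8")
    no_usage = parse_usage(empty_path)
```

Because the loop overwrites `meta` on every match, the last metadata record in the tail wins, which matches the diff's description of it as the terminal record.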

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/cli/init_cmd.py RENAMED

@@ -2,15 +2,27 @@
 from __future__ import annotations
+import importlib.resources
 from agent_runner import api
 from agent_runner.cli.common import emit, fail, work_dir_from_args
+def _preset_names() -> list[str]:
+    """Discover scaffold presets from the shipped ``agent_runner/presets/*.toml``.
+    Derived (not hardcoded) so adding a preset is a single new .toml file — the
+    ``--preset`` choices and validation track the filesystem automatically.
+    """
+    presets = importlib.resources.files("agent_runner.presets")
+    return sorted(p.name[:-5] for p in presets.iterdir() if p.name.endswith(".toml"))
 def add_parser(sub, parent) -> None:
     p = sub.add_parser("init", parents=[parent], help="Scaffold agent-runner project files")
     p.add_argument(
         "--preset",
-        choices=["claude", "aider", "gemini"],
+        choices=_preset_names(),
         default="claude",
         help="Which agent CLI preset to scaffold (default: claude)",
     )
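
The change above replaces a hardcoded `choices` list with names derived from the shipped `.toml` files. A rough way to see the effect without installing the package is to run the same filename logic over a temporary directory using `pathlib` instead of `importlib.resources`. The directory contents and the `preset_names` signature below are assumptions for illustration.

```python
import argparse
import tempfile
from pathlib import Path


def preset_names(presets_dir: Path) -> list[str]:
    """Same filename logic as the diff's _preset_names, over a plain directory."""
    return sorted(p.name[:-5] for p in presets_dir.iterdir() if p.name.endswith(".toml"))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Stand-in for agent_runner/presets/: four presets plus a non-preset file.
    for filename in ("claude.toml", "aider.toml", "gemini.toml", "codewhale.toml", "__init__.py"):
        (root / filename).write_text("")
    names = preset_names(root)

parser = argparse.ArgumentParser(prog="agent-runner init")
parser.add_argument("--preset", choices=names, default="claude")
args = parser.parse_args(["--preset", "codewhale"])
print(names, args.preset)
```

One consequence of deriving the list at parser-construction time is that a stray `.toml` file in the presets directory becomes a valid `--preset` value, so the directory contents are effectively part of the CLI surface.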

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/cli/serve_cmd.py RENAMED

@@ -23,12 +23,15 @@ from agent_runner._throttle import _check_throttle_state
 from agent_runner._throttle import reset_counters as _reset_counters
 from agent_runner.api import (
     check_self_terminated_sentinel,
+    emit_config_broken,
+    emit_crash_loop,
     emit_fresh_eyes_round_triggered,
     emit_max_rounds_reached,
     emit_rate_limit_stop,
     emit_round_substrate_after,
     emit_round_substrate_before,
     emit_stop_file_detected,
+    post_round_decision,
 )
 from agent_runner.cli.common import cfg_from_args
 from agent_runner.hooks import run_serve_startup_hooks
@@ -135,6 +138,7 @@ def cmd(args) -> int:
     stop_file = cfg.runtime.stop_file  # cache: same pattern as effective_max_rounds
     work_dir = cfg.runtime.work_dir
     rounds_completed = 0
+    consecutive_crashes = 0  # b12: consecutive UNKNOWN short crashes (crash-loop breaker)
     try:
         pid_file.write(os.getpid())
@@ -197,6 +201,7 @@ def cmd(args) -> int:
                     every_n=cfg.runtime.fresh_eyes_every_n,
                 )
             round_log_path = log_dir / f"round-{round_num}.log"
+            round_started = time.monotonic()
             with round_log_path.open("w") as f:
                 r = subprocess.run(
                     [
@@ -211,6 +216,7 @@ def cmd(args) -> int:
                     stdout=f,
                     stderr=subprocess.STDOUT,
                 )
+            round_duration_s = time.monotonic() - round_started
             atomic_relink(log_dir / ROUND_CURRENT_LINK, round_log_path)
             git_head_after = compute_git_head(work_dir)
             paths_hash_after = compute_paths_hash(work_dir, cfg.runtime.substrate_fingerprint_paths)
@@ -221,13 +227,28 @@ def cmd(args) -> int:
                 paths_hash=paths_hash_after,
             )
             rounds_completed += 1
+            # Restart policy (config_broken / crash_loop / continue) lives in the
+            # tested api.post_round_decision helper so this loop stays thin.
+            action, delay, consecutive_crashes = post_round_decision(
+                returncode=r.returncode,
+                duration_s=round_duration_s,
+                throttle_active=_check_throttle_state(log_dir) is not None,
+                consecutive=consecutive_crashes,
+                restart_delay_s=cfg.runtime.restart_delay_s,
+            )
+            if action == "config_broken":
+                emit_config_broken(log_dir, reason="startup battery permanent failure")
+                break
+            if action == "crash_loop":
+                emit_crash_loop(
+                    log_dir,
+                    consecutive=consecutive_crashes,
+                    exit_code=r.returncode,
+                    log_path=round_log_path,
+                )
+                break
             if args.once or stop["requested"]:
                 break
-            delay = (
-                cfg.runtime.restart_delay_s
-                if r.returncode == 0
-                else cfg.runtime.restart_delay_s * 2
-            )
             time.sleep(delay)
     finally:
         pid_file.unlink()

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/defenses.py RENAMED

@@ -83,8 +83,18 @@ def catalog(cfg: Config) -> list[Defense]:
         Defense(
             name="startup_smoke_check",
             value="6 checks (config / log_dir / agent_cli / git / prompt_file / prompt_smoke)",
-            codifies="R721 + #446 — _common.md frontmatter caused 4h/123-round silent burn",
-            guarded_by=None,
+            codifies=(
+                "R721 + #446 — _common.md frontmatter caused 4h/123-round silent burn; "
+                "now halts serve (config_broken) instead of respawning a broken config"
+            ),
+            guarded_by=Path("tests/unit/test_serve_config_broken.py"),
+            current_state="active",
+        ),
+        Defense(
+            name="crash_loop_breaker",
+            value="stop after 5 consecutive short crashes; exp-escalating delay",
+            codifies="Run 6 — crashing agent respawned ~100 empty rounds at a fixed 2x delay",
+            guarded_by=Path("tests/unit/test_serve_crash_loop.py"),
             current_state="active",
         ),
         Defense(

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/events.py RENAMED

@@ -32,6 +32,8 @@ ANOMALY_REPETITIVE_TOOL = "anomaly_repetitive_tool"
 AGENT_NETWORK_BLIP = "agent_network_blip"
 AGENT_SPAWN = "agent_spawn"
 AGENT_USAGE_RECORDED = "agent_usage_recorded"
+CONFIG_BROKEN = "config_broken"
+CRASH_LOOP = "crash_loop"
 DIRTY_COMMIT_FAILED = "dirty_commit_failed"
 DIRTY_DETECTED = "dirty_detected"
 FRESH_EYES_ROUND_TRIGGERED = "fresh_eyes_round_triggered"

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/monitor.py RENAMED

@@ -49,7 +49,6 @@ KNOWN_ALERT_KINDS: frozenset[str] = frozenset(
         "disk_warning",
         "disk_critical",
         "mem_pressure",
-        "smoke_fail_rate",
         "oauth_fail",
         "network_fail",
         "rate_limit_active",
@@ -265,29 +264,6 @@ def detect_mem_pressure(metrics: list[dict[str, Any]], *, threshold_mb: int = 20
     )
-def detect_smoke_fail_rate(
-    events: list[dict[str, Any]], *, window: int = 10, threshold: float = 0.1
-) -> Alert | None:
-    ends = [e for e in events if e.get("event") == "round_end"]
-    if len(ends) < window:
-        return None
-    recent_round_nums = [e.get("round_num") for e in ends[-window:]]
-    fails = sum(
-        1
-        for e in events
-        if e.get("event") == "smoke_check_failed" and e.get("round_num") in recent_round_nums
-    )
-    rate = fails / window
-    if rate < threshold:
-        return None
-    return _alert(
-        "smoke_fail_rate",
-        "warning",
-        f"{fails}/{window} recent rounds had smoke_check_failed",
-        {"rate": rate, "threshold": threshold, "hint": "Inspect events.jsonl for failure reasons"},
-    )
 def detect_oauth_fail(
     events: list[dict[str, Any]],
     log_tails: dict[int, str],
@@ -603,7 +579,6 @@ def run_all_detectors(
         ),
         detect_disk_critical(metrics, threshold_pct=disk_critical_pct),
         detect_mem_pressure(metrics, threshold_mb=mem_avail_min_mb),
-        detect_smoke_fail_rate(events),
         detect_oauth_fail(events, log_tails, patterns=compiled_auth_pats, hint=auth_fail_hint),
         detect_network_fail(events, log_tails),
         detect_rate_limit_active(events, now=now.timestamp()),

cli_agent_runner-0.1.42/agent_runner/presets/codewhale.toml ADDED

@@ -0,0 +1,30 @@
+# agent-runner.toml — generated by `agent-runner init --preset codewhale`.
+#
+# Prereqs:
+#   - codewhale installed (ships `codewhale` + `codewhale-tui`; both on PATH):
+#       npm i -g codewhale     (or cargo/brew per CodeWhale docs)
+#   - DEEPSEEK_API_KEY set on the supervisor host (or a key saved via
+#     `codewhale auth set`; resolution order is config > keyring > env)
+#   - work_dir is a git repo
+[agent]
+command = ["codewhale", "exec", "--auto", "--output-format", "stream-json"]
+prompt_arg_template = ["{prompt}"]
+name = "codewhale"
+[runtime]
+work_dir = "."
+log_dir = "~/.agent-runner/{project}/logs"
+round_timeout_s = 1800
+restart_delay_s = 3
+[prompt]
+file = "./prompts/main.md"
+inject_context = true
+[vcs]
+dirty_action = "stash"
+stash_idempotency_s = 5
+[monitor]
+auth_fail_hint = "Run `codewhale auth status` to inspect provider/credentials, or set DEEPSEEK_API_KEY on the supervisor host."

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/runner.py RENAMED

@@ -369,7 +369,7 @@ def run_one_round(cfg: Config, *, phase_override: str | None = None) -> RoundRes
                 file=sys.stderr,
             )
             events.emit(log_dir, "smoke_check_failed", reason=f"{r.name}: {r.reason}")
-        sys.exit(1)
+        sys.exit(api.PERMANENT_CONFIG_EXIT)
     # Concurrency lock (per-project)
     lock_path = log_dir / "agent-runner.lock"
@@ -521,6 +521,7 @@ def _run_one_round_inner(cfg: Config, *, phase_override: str | None = None) -> R
                 round_num=round_num,
                 phase=phase,
                 idempotency_s=cfg.vcs.stash_idempotency_s,
+                log_dir=cfg.runtime.log_dir,
             )
             if ref is not None:
                 context_store.write_orphan_state(
@@ -546,7 +547,9 @@ def _run_one_round_inner(cfg: Config, *, phase_override: str | None = None) -> R
             # Leave tree dirty for next round; dirty_detected already emitted
             pass
         elif action == "auto_commit":
-            err = vcs_state.try_auto_commit(cfg.runtime.work_dir, round_num, phase)
+            err = vcs_state.try_auto_commit(
+                cfg.runtime.work_dir, round_num, phase, log_dir=cfg.runtime.log_dir
+            )
             if err is not None:
                 events.emit(
                     log_dir,

{cli_agent_runner-0.1.40 → cli_agent_runner-0.1.42}/agent_runner/scaffold.py RENAMED

@@ -5,8 +5,8 @@ Writes three files into a git repo:
   prompts/main.md        — neutral 8-line placeholder
   .gitignore             — append "logs/" if missing
-Available presets ship as package data in `agent_runner/presets/*.toml`.
-Currently: `claude`, `aider`, `gemini`.
+Available presets ship as package data in `agent_runner/presets/*.toml`;
+`agent-runner init --preset <name>` discovers them from that directory.
 Optionally commits in one step (default true via the CLI).
 """
