PyPI - cli-agent-runner - Versions diffs - 0.1.7__tar.gz → 0.1.9__tar.gz - Mend

cli-agent-runner 0.1.7__tar.gz → 0.1.9__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (142)

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/CHANGELOG.md RENAMED

@@ -7,6 +7,85 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.1.9] - 2026-05-13
+### Acknowledgements
+Thanks to the argus-gateway team for the dev/qa/product wall-time data
+(Phase 4 feedback §3.1) that drove this API shape. Their three-role
+distribution made the case for per-phase overrides concrete.
+### Added
+- `[runtime.round_timeout_per_phase]` TOML block — per-phase overrides for
+  `round_timeout_s`. Unconfigured phases fall back to global. Keys validated
+  against `[phases] list` at config-load (typo catcher); non-positive values
+  rejected; bool / float values rejected (would otherwise silently coerce
+  to int).
+- `agent_runner.runner._round_timeout_for(cfg, phase)` helper — single
+  lookup point for phase-aware timeout resolution.
+### Migration
+No breaking changes. Existing configs without the new block keep using a
+single global timeout — identical to 0.1.8 behavior.
+Plugin authors: no public API change. `RuntimeConfig` is not in the
+documented plugin-author public surface.
+## [0.1.8] - 2026-05-13
+### Acknowledgements
+Thanks to the argus-gateway team for Phase 4 dogfooding feedback that drove
+every item in this release. 3 audit memos (~90KB) silently swept into an
+orphan stash is a real-world failure mode; this release closes that loop.
+### Added
+- `agent_runner.vcs_state.register_plugin_owned_paths()` — plugins opt-out
+  files/dirs from orphan-stash defense. Matching: trailing-slash prefix or
+  `pathlib.PurePath.match` glob (recognizes `**` for recursive segments via
+  `fnmatch` fallback on Python 3.11). Call at module import (entry_point
+  side-effect).
+- `agent_runner.vcs_state.plugin_owned_paths()` — snapshot accessor for peek.
+- `ProjectState.recent_hook_failures: list[dict]` — last 10 `hook_failed`
+  events filtered from `recent_events` for debugging hook integration.
+- peek schema bumped 1.4 → 1.5. `plugins` block now includes
+  `pre_round_hooks`, `post_round_hooks`, `owned_paths` lists.
+### Changed
+- `docs/plugins.md` register-pattern examples corrected: registration must
+  happen as module-top side effect; entry_point loaders only import, they
+  do not invoke. Old `_register()` wrapper pattern silently didn't fire.
+- `docs/plugins.md` gained "Declaring plugin-owned paths" and "Plugin tests
+  + consumer pytest collision" sections.
+### Fixed
+- Plugin outputs in plugin-declared paths (e.g. `proposals/`,
+  `logs/plugins/my_plugin/`) no longer silently swept into orphan stashes
+  by `process_orphan_wip`. Previously: 90KB Argus audit memos invisible
+  after Phase 4 round; required stash archaeology to recover.
+### Migration
+No breaking changes. Plugin authors:
+- If your plugin writes files to `work_dir` and they keep getting stashed
+  between rounds, opt them out:
+  ```python
+  from agent_runner.vcs_state import register_plugin_owned_paths
+  register_plugin_owned_paths(["your-output-dir/", "logs/your-plugin/**/*"])
+  ```
+- If you followed the old `_register()` pattern from docs and noticed
+  registrations not firing: move the call to module top:
+  ```python
+  # was: def _register(): register_pre_round_hook(MyHook())
+  # now: register_pre_round_hook(MyHook())  # module-top side-effect
+  ```
 ## [0.1.7] - 2026-05-13
 ### Migration for existing 0.1.6 users (DOWNSTREAM CONSUMERS READ THIS)
@@ -288,6 +367,10 @@ Initial public release on PyPI as `cli-agent-runner`.
 - Tag-triggered release publishing to PyPI via Trusted Publishing OIDC,
   gated by a manual approval on the `pypi` GitHub environment.
-[Unreleased]: https://github.com/wan9yu/cli-agent-runner/compare/v0.1.1...HEAD
-[0.1.1]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.1
-[0.1.0]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.0
+[Unreleased]: https://github.com/wan9yu/cli-agent-runner/compare/v0.1.9...HEAD
+[0.1.9]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.9
+[0.1.8]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.8
+[0.1.7]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.7
+[0.1.6]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.6
+[0.1.5]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.5
+[0.1.4]: https://github.com/wan9yu/cli-agent-runner/releases/tag/v0.1.4

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: cli-agent-runner
-Version: 0.1.7
+Version: 0.1.9
 Summary: Restart-on-exit supervisor for autonomous CLI agents
 Project-URL: Homepage, https://github.com/wan9yu/cli-agent-runner
 Project-URL: Documentation, https://github.com/wan9yu/cli-agent-runner#readme

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/_version.py RENAMED

@@ -18,7 +18,7 @@ version_tuple: tuple[int | str, ...]
 commit_id: str | None
 __commit_id__: str | None
-__version__ = version = '0.1.7'
-__version_tuple__ = version_tuple = (0, 1, 7)
+__version__ = version = '0.1.9'
+__version_tuple__ = version_tuple = (0, 1, 9)
 __commit_id__ = commit_id = None

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/api.py RENAMED

@@ -240,6 +240,9 @@ def _log_dir_for_project(project: str | Path) -> Path:
 # for callers that only use lifecycle verbs.
 from agent_runner import defenses, monitor  # noqa: E402
+from agent_runner.events import HOOK_FAILED  # noqa: E402
+_RECENT_HOOK_FAILURES_LIMIT = 10
 def peek(
@@ -266,6 +269,16 @@ def peek(
         if current is None:
             raise KeyError(f"round {round_num} not found under {log_dir}/rounds/")
     recent = parsed_events[-events:] if events else []
+    # Walk the tail in reverse so we stop as soon as the limit is filled.
+    # parsed_events grows unboundedly over a project's lifetime; a full-scan
+    # comprehension here would dominate watch-loop peek cost.
+    recent_hook_failures: list[dict[str, Any]] = []
+    for e in reversed(parsed_events):
+        if e.get("event") == HOOK_FAILED:
+            recent_hook_failures.append(e)
+            if len(recent_hook_failures) == _RECENT_HOOK_FAILURES_LIMIT:
+                break
+    recent_hook_failures.reverse()
     state = ProjectState(
         project=base_state.project,
@@ -286,6 +299,7 @@ def peek(
         system=base_state.system,
         service=status(project if project is not None else work_dir),
         recent_events=recent,
+        recent_hook_failures=recent_hook_failures,
     )
     return state if select is None else select_path(state, select)
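
The reverse tail scan added to `peek()` can be tried in isolation. The following is a minimal sketch, not the package's code: `last_matching` is a hypothetical helper name, and the sample events are made up for illustration (only the `event` and `hook_name` keys come from the diff).

```python
from typing import Any

HOOK_FAILED = "hook_failed"  # mirrors the constant added in agent_runner/events.py


def last_matching(events: list[dict[str, Any]], kind: str, limit: int) -> list[dict[str, Any]]:
    """Return up to `limit` most recent events of `kind`, oldest first.

    Hypothetical helper: same loop shape as the diff, lifted out of peek().
    """
    found: list[dict[str, Any]] = []
    # Walk newest-to-oldest so the loop can stop once `limit` hits are found.
    for e in reversed(events):
        if e.get("event") == kind:
            found.append(e)
            if len(found) == limit:
                break
    # Restore chronological order for the caller.
    found.reverse()
    return found


# Illustrative event log (invented data).
events = [
    {"event": "round_start", "round_num": 1},
    {"event": HOOK_FAILED, "hook_name": "a"},
    {"event": "round_end", "round_num": 1},
    {"event": HOOK_FAILED, "hook_name": "b"},
    {"event": HOOK_FAILED, "hook_name": "c"},
]
recent = last_matching(events, HOOK_FAILED, limit=2)
print([e["hook_name"] for e in recent])  # ['b', 'c']
```

With a limit of 2 the scan stops after two matches counted from the tail, so the oldest failure (`a`) is never visited.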

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/api_types.py RENAMED

@@ -72,6 +72,7 @@ class ProjectState:
     system: SystemMetrics
     service: ServiceStatus
     recent_events: list[dict[str, Any]] = field(default_factory=list)
+    recent_hook_failures: list[dict[str, Any]] = field(default_factory=list)
 @dataclass(frozen=True)
@@ -130,7 +131,7 @@ class InitResult:
     work_dir: Path
     files_created: list[Path]
     committed: bool
-    preset: str = "claude"  # 0.1.7+; default for backward compat with synthesised InitResults
+    preset: str = "claude"  # default keeps synthesised InitResults working
 @dataclass(frozen=True)

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/cli/common.py RENAMED

@@ -12,10 +12,11 @@ from typing import Any
 from agent_runner.api_types import ProjectState
 from agent_runner.config import Config, load_config
 from agent_runner.events import plugin_event_kinds
-from agent_runner.hooks import plugin_context_enrichers
+from agent_runner.hooks import plugin_context_enrichers, post_round_hooks, pre_round_hooks
 from agent_runner.monitor import plugin_detectors
+from agent_runner.vcs_state import plugin_owned_paths
-PEEK_SCHEMA_VERSION = "1.4"
+PEEK_SCHEMA_VERSION = "1.5"
 def cfg_from_args(args) -> Config:
@@ -48,7 +49,10 @@ def emit(value: Any, *, json_mode: bool) -> None:
                 "plugins": {
                     "event_kinds": plugin_event_kinds(),
                     "context_enrichers": plugin_context_enrichers(),
+                    "pre_round_hooks": [h.name for h in pre_round_hooks()],
+                    "post_round_hooks": [h.name for h in post_round_hooks()],
                     "detectors": plugin_detectors(),
+                    "owned_paths": plugin_owned_paths(),
                 },
                 **_to_jsonable(value),
             }

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/config.py RENAMED

@@ -24,6 +24,7 @@ class RuntimeConfig:
     log_dir: Path
     round_timeout_s: int = 1800
     restart_delay_s: int = 3
+    round_timeout_per_phase: dict[str, int] = field(default_factory=dict)
 @dataclass(frozen=True)
@@ -86,6 +87,32 @@ def _expand_path(s: str, project_name: str) -> Path:
     return Path(s.replace("{project}", project_name)).expanduser()
+def _require_positive_int(value: Any, *, field: str) -> int:
+    """Validate a TOML value is a positive int. Rejects bool (subclass of int
+    in Python, would silently coerce e.g. ``true`` → 1) and any non-int."""
+    if isinstance(value, bool) or not isinstance(value, int):
+        raise ValueError(f"{field}: must be an integer, got {type(value).__name__} ({value!r})")
+    if value <= 0:
+        raise ValueError(f"{field}: must be positive, got {value}")
+    return value
+def _validate_round_timeout_per_phase_keys(
+    per_phase: dict[str, int], phases: list[str] | None
+) -> None:
+    """All keys must appear in [phases] list (typo catcher)."""
+    if not per_phase:
+        return
+    if phases is None:
+        raise ValueError("runtime.round_timeout_per_phase requires [phases] list to be defined")
+    unknown = set(per_phase) - set(phases)
+    if unknown:
+        raise ValueError(
+            f"runtime.round_timeout_per_phase keys not in phases list: "
+            f"{sorted(unknown)}; available phases: {phases}"
+        )
 def load_config(toml_path: Path) -> Config:
     if not toml_path.exists():
         raise FileNotFoundError(f"config not found: {toml_path}")
@@ -103,12 +130,28 @@ def load_config(toml_path: Path) -> Config:
     work_dir = _expand_path(raw_work_dir, "").resolve()
     project_name = work_dir.name or "default"
+    # Phases first — needed for per-phase round_timeout validation below.
+    phases_d = raw.get("phases", {})
+    phases = list(phases_d["list"]) if "list" in phases_d else None
     runtime_d = raw.get("runtime", {})
+    per_phase_raw = runtime_d.get("round_timeout_per_phase", {})
+    per_phase: dict[str, int] = {
+        str(k): _require_positive_int(v, field=f"runtime.round_timeout_per_phase[{str(k)!r}]")
+        for k, v in per_phase_raw.items()
+    }
+    _validate_round_timeout_per_phase_keys(per_phase, phases)
     runtime = RuntimeConfig(
         work_dir=work_dir,
         log_dir=_expand_path(str(_require(raw, "runtime", "log_dir")), project_name),
-        round_timeout_s=int(runtime_d.get("round_timeout_s", 1800)),
-        restart_delay_s=int(runtime_d.get("restart_delay_s", 3)),
+        round_timeout_s=_require_positive_int(
+            runtime_d.get("round_timeout_s", 1800), field="runtime.round_timeout_s"
+        ),
+        restart_delay_s=_require_positive_int(
+            runtime_d.get("restart_delay_s", 3), field="runtime.restart_delay_s"
+        ),
+        round_timeout_per_phase=per_phase,
     )
     prompt_d = raw.get("prompt", {})
     mode = prompt_d.get("context_injection_mode", "prepend")
@@ -125,7 +168,9 @@ def load_config(toml_path: Path) -> Config:
     vcs_d = raw.get("vcs", {})
     vcs = VcsConfig(
         orphan_action=str(vcs_d.get("orphan_action", "stash")),
-        stash_idempotency_s=int(vcs_d.get("stash_idempotency_s", 5)),
+        stash_idempotency_s=_require_positive_int(
+            vcs_d.get("stash_idempotency_s", 5), field="vcs.stash_idempotency_s"
+        ),
     )
     monitor_d = raw.get("monitor", {})
     monitor = MonitorConfig(
@@ -133,8 +178,6 @@ def load_config(toml_path: Path) -> Config:
         auth_fail_hint=str(monitor_d.get("auth_fail_hint", _DEFAULT_AUTH_HINT)),
         auto_stop_on=list(monitor_d.get("auto_stop_on", _DEFAULT_AUTO_STOP_ON)),
     )
-    phases_d = raw.get("phases", {})
-    phases = list(phases_d["list"]) if "list" in phases_d else None
     plugins_d = raw.get("plugins")
     return Config(
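
To see what the new validation accepts and rejects, the helper can be exercised on its own. This is a sketch under assumptions: `require_positive_int` is a copy of the diff's `_require_positive_int` with the underscore dropped, and `check` plus the candidate values are hypothetical, chosen only to hit each branch.

```python
from typing import Any


def require_positive_int(value: Any, *, field: str) -> int:
    # Same logic as the helper in the diff: bool is checked first because
    # bool is a subclass of int and would otherwise pass isinstance(value, int).
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{field}: must be an integer, got {type(value).__name__} ({value!r})")
    if value <= 0:
        raise ValueError(f"{field}: must be positive, got {value}")
    return value


def check(value: Any) -> str:
    """Hypothetical wrapper that turns the outcome into a printable string."""
    try:
        return f"ok: {require_positive_int(value, field='runtime.round_timeout_s')}"
    except ValueError as exc:
        return f"rejected: {exc}"


# Illustrative inputs: a valid int, a bool, a float, a string, zero, a negative.
for candidate in (1800, True, 12.5, "60", 0, -5):
    print(repr(candidate), "->", check(candidate))
```

Only `1800` passes. Note that `0` and the string `"60"` are rejected here, whereas the previous `int(...)` coercion would have accepted both; since the diff routes `restart_delay_s` and `stash_idempotency_s` through the same helper, configs that set either of those to `0` would appear to fail at load time under 0.1.9.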

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/events.py RENAMED

@@ -22,6 +22,11 @@ from datetime import UTC, datetime
 from pathlib import Path
 from typing import Any
+# Cross-module event-kind constants. Most kinds are emitted in only one place
+# (runner.py), but kinds that are also CONSUMED elsewhere (filtered, surfaced
+# in peek, asserted in tests) earn a constant to keep the spelling honest.
+HOOK_FAILED = "hook_failed"
 _BUILTIN_KINDS: frozenset[str] = frozenset(
     {
         "round_start",
@@ -38,7 +43,7 @@ _BUILTIN_KINDS: frozenset[str] = frozenset(
         "round_end",
         "monitor_alert_emitted",
         "monitor_auto_stop_triggered",
-        "hook_failed",
+        HOOK_FAILED,
     }
 )

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/runner.py RENAMED

@@ -49,6 +49,17 @@ def _phase_for(round_num: int, phases: list[str] | None) -> tuple[str | None, in
     return phases[idx], idx
+def _round_timeout_for(cfg: Config, phase: str | None) -> int:
+    """Per-phase override of round_timeout_s; falls back to global default.
+    Phase=None (no phases configured) → global. Phase not in override dict →
+    global. Phase in override dict → that phase's configured timeout.
+    """
+    if phase is None:
+        return cfg.runtime.round_timeout_s
+    return cfg.runtime.round_timeout_per_phase.get(phase, cfg.runtime.round_timeout_s)
 def _previous_block(prev: context_store.Status | None, dirty_last: bool) -> dict[str, Any] | None:
     if prev is None:
         return None
@@ -95,7 +106,7 @@ def _stitch_enricher_slices(
             payload = hooks._summarize_error(exc, tb=tb_mod.format_exc())
             events.emit(
                 log_dir,
-                "hook_failed",
+                events.HOOK_FAILED,
                 hook_name=enricher.name,
                 hook_kind="context_enricher",
                 **payload,
@@ -114,7 +125,7 @@ def _run_pre_round_hooks(hook_ctx: hooks.HookContext, log_dir: Path) -> None:
             payload = hooks._summarize_error(exc, tb=tb_mod.format_exc())
             events.emit(
                 log_dir,
-                "hook_failed",
+                events.HOOK_FAILED,
                 hook_name=hook.name,
                 hook_kind="pre_round",
                 **payload,
@@ -136,7 +147,7 @@ def _run_post_round_hooks(
             payload = hooks._summarize_error(exc, tb=tb_mod.format_exc())
             events.emit(
                 log_dir,
-                "hook_failed",
+                events.HOOK_FAILED,
                 hook_name=hook.name,
                 hook_kind="post_round",
                 **payload,
@@ -175,6 +186,7 @@ def _run_one_round_inner(cfg: Config) -> RoundResult:
     round_num = (prev_status.round_num if prev_status else 0) + 1
     phase, phase_idx = _phase_for(round_num, cfg.phases)
+    timeout_s = _round_timeout_for(cfg, phase)
     started_at = now_iso_ms()
     orphan = context_store.read_orphan_state(log_dir)
@@ -223,12 +235,12 @@ def _run_one_round_inner(cfg: Config) -> RoundResult:
         mode=cfg.prompt.context_injection_mode,
     )
-    events.emit(log_dir, "agent_spawn", round_num=round_num, timeout_s=cfg.runtime.round_timeout_s)
+    events.emit(log_dir, "agent_spawn", round_num=round_num, timeout_s=timeout_s)
     result = agent_runtime.run(
         command=cfg.agent.command,
         prompt_arg_template=cfg.agent.prompt_arg_template,
         prompt=prompt,
-        timeout_s=cfg.runtime.round_timeout_s,
+        timeout_s=timeout_s,
         log_path=log_path,
         env_extra=dict(cfg.agent.env),
     )
@@ -281,7 +293,7 @@ def _run_one_round_inner(cfg: Config) -> RoundResult:
             log_dir,
             "round_timeout_kill",
             round_num=round_num,
-            reason=f"exceeded round_timeout_s={cfg.runtime.round_timeout_s}",
+            reason=f"exceeded round_timeout_s={timeout_s}",
         )
     completed_at = now_iso_ms()
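
The per-phase lookup combines with the phase rotation rule quoted in `docs/configuration.md` (round N gets `phases[(N-1) % len(phases)]`). A minimal sketch of the two together, assuming a stand-in `Runtime` dataclass in place of the real `RuntimeConfig` and hypothetical function names without the leading underscore:

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Runtime:
    """Hypothetical stand-in for RuntimeConfig; only the two timeout fields."""

    round_timeout_s: int = 1800
    round_timeout_per_phase: dict[str, int] = field(default_factory=dict)


def phase_for(round_num: int, phases: list[str] | None) -> str | None:
    # Rotation rule as documented: round N gets phases[(N-1) % len(phases)].
    if not phases:
        return None
    return phases[(round_num - 1) % len(phases)]


def round_timeout_for(runtime: Runtime, phase: str | None) -> int:
    # No phase, or a phase without an override, falls back to the global value.
    if phase is None:
        return runtime.round_timeout_s
    return runtime.round_timeout_per_phase.get(phase, runtime.round_timeout_s)


runtime = Runtime(round_timeout_s=1800, round_timeout_per_phase={"dev": 3600, "qa": 1200})
phases = ["dev", "qa", "product"]
for round_num in range(1, 5):
    phase = phase_for(round_num, phases)
    print(round_num, phase, round_timeout_for(runtime, phase))
```

In this example `product` has no override, so round 3 runs with the global 1800 seconds while rounds 1 and 4 (`dev`) get 3600.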

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/service_unit.py RENAMED

@@ -5,7 +5,9 @@ Two units per project:
   agent-runner-monitor@<project>.service  - runs `agent-runner monitor`
 Install command writes these to ~/.config/systemd/user/. The graceful-stop
-contract relies on KillSignal=SIGTERM + TimeoutStopSec=round_timeout_s+60.
+contract relies on KillSignal=SIGTERM + TimeoutStopSec=max(round_timeout_s,
+round_timeout_per_phase.values())+60 — the LARGEST possible round budget
+plus grace.
 """
 from __future__ import annotations
@@ -32,7 +34,12 @@ def _config_path(cfg: Config) -> Path:
 def render_serve_unit(cfg: Config, *, venv_bin: Path) -> str:
     """Generate the serve systemd unit body."""
-    timeout_total = cfg.runtime.round_timeout_s + _GRACE_S
+    # TimeoutStopSec covers the largest possible round so `systemctl stop`
+    # doesn't SIGKILL a long per-phase round mid-flight.
+    max_round_timeout = max(
+        [cfg.runtime.round_timeout_s, *cfg.runtime.round_timeout_per_phase.values()]
+    )
+    timeout_total = max_round_timeout + _GRACE_S
     return (
         f"[Unit]\n"
         f"Description=Agent Runner Supervisor ({cfg.runtime.work_dir.name})\n"
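
The stop-timeout arithmetic in `render_serve_unit` is small enough to check by hand. A sketch, assuming the grace period is 60 seconds (inferred from the "+60" in the module docstring; the actual `_GRACE_S` value is not shown in this diff) and using a hypothetical `stop_timeout` function name:

```python
GRACE_S = 60  # assumption: taken from the "+60" in the docstring, not from _GRACE_S itself


def stop_timeout(round_timeout_s: int, per_phase: dict[str, int]) -> int:
    """Largest possible round budget plus grace, as in the diff's max(...) expression."""
    # The global timeout is always in the list, so max() never sees an empty sequence.
    return max([round_timeout_s, *per_phase.values()]) + GRACE_S


print(stop_timeout(1800, {}))                          # 1860
print(stop_timeout(1800, {"dev": 3600, "qa": 1200}))   # 3660
```

Including the global value in the list also covers the case where every override is shorter than the global timeout but some phase is left unconfigured.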

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/agent_runner/vcs_state.py RENAMED

@@ -7,14 +7,69 @@ Stash safety rules (R820 + §9 IMMUTABLE):
   design means external concurrent ``git stash push`` is not a defended scenario.
 - "Auto-tool change vs human change" detection uses set-based diff vs HEAD,
   not unified-diff +/-line parsing (R2110 lesson).
+- Also hosts the plugin-owned-paths registry consumed by
+  ``detect_dirty_files()`` so plugins can opt files/dirs out of the
+  orphan-stash defense.
 """
 from __future__ import annotations
+import fnmatch
 import subprocess  # noqa: TID251 — vcs_state.py is the only sanctioned git CLI caller
 import time
 from dataclasses import dataclass
-from pathlib import Path
+from pathlib import Path, PurePath
+# Plugin-owned paths registry — set via register_plugin_owned_paths().
+# detect_dirty_files() filters its return through this list, so plugin-declared
+# paths are not flagged as orphan WIP and not stashed by the supervisor.
+_PLUGIN_OWNED_PATHS: list[str] = []
+def register_plugin_owned_paths(paths: list[str]) -> None:
+    """Register paths the plugin considers its own deliverables.
+    Paths are relative to the work_dir. Matching:
+      - Trailing ``/`` → prefix match (e.g. ``"proposals/"`` matches
+        ``"proposals/dev-round1.md"`` and the bare directory name).
+      - Anything else without ``**`` → ``pathlib.PurePath.match`` glob
+        (e.g. ``"reports/*.md"``). Single ``*`` does not cross slashes.
+      - Patterns containing ``**`` → ``fnmatch.fnmatch`` (e.g.
+        ``"logs/plugins/**/*"``). ``**`` matches recursive directory
+        segments. (``PurePath.full_match`` would handle this natively
+        but requires Python 3.13+; this project's minimum is 3.11.)
+    Plugins call this at module import time (entry_point side-effect) so the
+    paths are known before the first round runs.
+    Raises ValueError on non-string entries.
+    """
+    for p in paths:
+        if not isinstance(p, str):
+            raise ValueError(f"register_plugin_owned_paths: non-string entry {p!r}")
+    _PLUGIN_OWNED_PATHS.extend(paths)
+def plugin_owned_paths() -> list[str]:
+    """Snapshot of registered plugin-owned paths (for peek visibility)."""
+    return list(_PLUGIN_OWNED_PATHS)
+def _matches_owned_path(path: str) -> bool:
+    """True if `path` matches any registered plugin-owned pattern."""
+    for pattern in _PLUGIN_OWNED_PATHS:
+        if pattern.endswith("/"):
+            stripped = pattern.rstrip("/")
+            if path == stripped or path.startswith(pattern):
+                return True
+        elif "**" in pattern:
+            # fnmatch's "*" also matches "/", so "**" spans directory
+            # segments; PurePath.match (3.11) does not.
+            if fnmatch.fnmatch(path, pattern):
+                return True
+        elif PurePath(path).match(pattern):
+            return True
+    return False
 @dataclass(frozen=True)
@@ -74,6 +129,9 @@ def detect_dirty_files(repo: Path) -> list[str]:
         else:
             out.append(path)
             i += 1
+    # Early-out preserves zero behavior change when no plugin has registered.
+    if _PLUGIN_OWNED_PATHS:
+        out = [p for p in out if not _matches_owned_path(p)]
     return out
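
The three matching branches behave differently enough that a few concrete paths help. The following sketch re-implements the logic of `_matches_owned_path` as a pure function; `matches_owned` and its explicit `patterns` argument are hypothetical (the real function reads the module-level registry), and the sample paths are invented.

```python
import fnmatch
from pathlib import PurePath


def matches_owned(path: str, patterns: list[str]) -> bool:
    """Hypothetical pure-function variant of the diff's _matches_owned_path."""
    for pattern in patterns:
        if pattern.endswith("/"):
            # Prefix branch: the directory itself or anything beneath it.
            if path == pattern.rstrip("/") or path.startswith(pattern):
                return True
        elif "**" in pattern:
            # fnmatch's "*" matches any characters, including "/".
            if fnmatch.fnmatch(path, pattern):
                return True
        elif PurePath(path).match(pattern):
            # PurePath.match: "*" stays within one path segment.
            return True
    return False


cases = [
    ("proposals/dev-round1.md", "proposals/"),
    ("proposals", "proposals/"),
    ("proposals-old/x.md", "proposals/"),
    ("logs/plugins/my_plugin/out.txt", "logs/plugins/**/*"),
    ("logs/plugins/out.txt", "logs/plugins/**/*"),
    ("reports/summary.md", "reports/*.md"),
    ("reports/sub/summary.md", "reports/*.md"),
]
for path, pattern in cases:
    print(f"{path!r} vs {pattern!r}: {matches_owned(path, [pattern])}")
```

One corner worth knowing: with `fnmatch`, the pattern `logs/plugins/**/*` requires at least one further `/` after `logs/plugins/`, so a file sitting directly in `logs/plugins/` does not match it. A plugin that also writes files at that level would probably want the trailing-slash form `logs/plugins/` instead.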

{cli_agent_runner-0.1.7 → cli_agent_runner-0.1.9}/docs/configuration.md RENAMED

@@ -23,6 +23,7 @@ writes a templated copy you can edit.
 | `log_dir` | `Path` | — |
 | `round_timeout_s` | `int` | 1800 |
 | `restart_delay_s` | `int` | 3 |
+| `round_timeout_per_phase` | `dict[str, int]` | {} |
 ### `[prompt]`
@@ -76,6 +77,31 @@ Override in your `agent-runner.toml` if you ship a custom CLI.
 |---|---|---|---|
 | `list` | list[str] | (none → no phase rotation) | round N gets `phases[(N-1) % len(phases)]` |
+## Per-phase timeouts (0.1.9+)
+If your `[phases]` rotation has phases with different wall-clock budgets,
+override the global timeout per phase:
+```toml
+[runtime]
+round_timeout_s = 1800           # fallback for unconfigured phases
+[runtime.round_timeout_per_phase]
+dev = 3600                       # implementation work, longer budget
+qa = 1200                        # test review, tighter budget
+product = 1200                   # docs writing, tighter budget
+[phases]
+list = ["dev", "qa", "product"]
+```
+Validation: typos in phase names (keys not in `[phases] list`) and
+non-positive / non-integer values are caught at config-load time with
+`ValueError`.
+Unconfigured phases (and configs without `[phases]`) keep using the
+global `round_timeout_s`.
 ## `[monitor]` (optional, defaults shown)
 ```toml
