moonbridge (PyPI): moonbridge-0.8.0.tar.gz → moonbridge-0.9.0.tar.gz

Files changed (37)

moonbridge-0.9.0/.github/workflows/cerberus.yml ADDED

@@ -0,0 +1,46 @@
+# Cerberus AI Code Review Council
+# https://github.com/misty-step/cerberus
+name: Cerberus Council
+on:
+  pull_request:
+    types: [opened, synchronize, reopened]
+permissions:
+  contents: read
+  pull-requests: write
+concurrency:
+  group: cerberus-${{ github.event.pull_request.number }}
+  cancel-in-progress: true
+jobs:
+  review:
+    name: "${{ matrix.reviewer }}"
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        include:
+          - { reviewer: APOLLO, perspective: correctness }
+          - { reviewer: ATHENA, perspective: architecture }
+          - { reviewer: SENTINEL, perspective: security }
+          - { reviewer: VULCAN, perspective: performance }
+          - { reviewer: ARTEMIS, perspective: maintainability }
+      fail-fast: false
+    steps:
+      - uses: actions/checkout@v4
+      - uses: misty-step/cerberus@v1
+        with:
+          perspective: ${{ matrix.perspective }}
+          github-token: ${{ secrets.GITHUB_TOKEN }}
+          kimi-api-key: ${{ secrets.MOONSHOT_API_KEY }}
+  verdict:
+    name: "Council Verdict"
+    needs: review
+    if: always()
+    runs-on: ubuntu-latest
+    steps:
+      - uses: misty-step/cerberus/verdict@v1
+        with:
+          github-token: ${{ secrets.GITHUB_TOKEN }}

moonbridge-0.9.0/.release-please-manifest.json ADDED

@@ -0,0 +1,3 @@
+{
+  ".": "0.9.0"
+}

{moonbridge-0.8.0 → moonbridge-0.9.0}/CHANGELOG.md RENAMED

@@ -5,6 +5,14 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [0.9.0](https://github.com/misty-step/moonbridge/compare/moonbridge-v0.8.0...moonbridge-v0.9.0) (2026-02-06)
+### Features
+* add structured output parsing with quality signals ([#76](https://github.com/misty-step/moonbridge/issues/76)) ([1318a03](https://github.com/misty-step/moonbridge/commit/1318a03f42556fa6d9366bd8e67ae465fd8a235a))
+* validate MOONBRIDGE_ALLOWED_DIRS and expose config health ([#67](https://github.com/misty-step/moonbridge/issues/67)) ([#78](https://github.com/misty-step/moonbridge/issues/78)) ([bf5af9b](https://github.com/misty-step/moonbridge/commit/bf5af9b7e5d13e8c24776d3e4ff154af04e1b2a7))
 ## [0.8.0](https://github.com/misty-step/moonbridge/compare/moonbridge-v0.7.0...moonbridge-v0.8.0) (2026-02-06)

{moonbridge-0.8.0 → moonbridge-0.9.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: moonbridge
-Version: 0.8.0
+Version: 0.9.0
 Summary: MCP server for spawning AI coding agents (Kimi, Codex, and more)
 Project-URL: Homepage, https://github.com/misty-step/moonbridge
 Project-URL: Repository, https://github.com/misty-step/moonbridge
@@ -90,6 +90,32 @@ export MOONBRIDGE_SKIP_UPDATE_CHECK=1
 **Best for:** Tasks that benefit from parallel execution or volume.
+## How it Works
+### Connection Flow
+1. MCP client (Claude Code, Cursor, etc.) connects to Moonbridge over stdio
+2. Client discovers available tools via `list_tools`
+3. Client calls `spawn_agent` or `spawn_agents_parallel`
+### Spawn Process
+1. Moonbridge validates the prompt and working directory
+2. Resolves which adapter to use (Kimi, Codex)
+3. Adapter builds the CLI command with appropriate flags
+4. Spawns subprocess in a separate process group
+5. Captures stdout/stderr, enforces timeout
+6. Returns structured JSON result
+### Parallel Execution
+- `spawn_agents_parallel` runs up to 10 agents concurrently via `asyncio.gather`
+- Each agent is independent (separate process, separate output)
+- All results returned together when the last agent finishes (or times out)
+```
+MCP Client → stdio → Moonbridge → adapter → CLI subprocess
+                                          → CLI subprocess (parallel)
+                                          → CLI subprocess (parallel)
+```
 ## Tools
 | Tool | Use case |
@@ -170,6 +196,31 @@ All tools return JSON with these fields:
 | `MOONBRIDGE_SANDBOX_MAX_COPY` | Max sandbox copy size in bytes (default 500MB) |
 | `MOONBRIDGE_LOG_LEVEL` | Set to `DEBUG` for verbose logging |
+## Security
+### 1. Directory Restrictions (`MOONBRIDGE_ALLOWED_DIRS`)
+Default: agents can operate in any directory. Set `MOONBRIDGE_ALLOWED_DIRS` to restrict: colon-separated allowed paths. Symlinks resolved via `os.path.realpath` before checking. Strict mode (`MOONBRIDGE_STRICT=1`) exits on startup if no valid allowed directories are configured.
+```bash
+export MOONBRIDGE_ALLOWED_DIRS="/home/user/projects:/home/user/work"
+export MOONBRIDGE_STRICT=1  # require restrictions
+```
+### 2. Environment Sanitization
+Only whitelisted env vars are passed to spawned agents. Each adapter defines its own allowlist (`PATH`, `HOME`, plus adapter-specific like `OPENAI_API_KEY` for Codex). Your shell environment (secrets, tokens, SSH keys) is not inherited by default.
+### 3. Input Validation
+Model parameters are validated to prevent flag injection (values starting with `-` are rejected). Prompts are capped at 100,000 characters and cannot be empty.
+### 4. Process Isolation
+Agents run in separate process groups (`start_new_session=True`). Orphan cleanup on exit. Sandbox mode available (`MOONBRIDGE_SANDBOX=1`) for copy-on-run isolation.
+> **Not OS-level sandboxing.** Agents can still read arbitrary host files. For strong isolation, use containers/VMs.
 ## Troubleshooting
 ### "CLI not found"
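
Reviewer note on the "Parallel Execution" text added in this hunk: the fan-out it describes can be sketched as below. This is a minimal illustration, not Moonbridge's implementation. `run_agent`, `run_parallel`, and `MAX_PARALLEL` are hypothetical names, and `asyncio.sleep` stands in for a real CLI subprocess.

```python
import asyncio

MAX_PARALLEL = 10  # assumed cap, mirroring the "up to 10 agents" wording


async def run_agent(index: int, prompt: str, timeout: float) -> dict:
    """Hypothetical stand-in for spawning one CLI subprocess."""
    try:
        # asyncio.sleep simulates the agent doing work; wait_for enforces the timeout.
        await asyncio.wait_for(asyncio.sleep(0.01 * (index + 1)), timeout=timeout)
        return {"agent_index": index, "status": "success", "output": prompt.upper()}
    except asyncio.TimeoutError:
        return {"agent_index": index, "status": "timeout", "output": ""}


async def run_parallel(prompts: list, timeout: float = 1.0) -> list:
    """Run every prompt concurrently and return all results together."""
    if len(prompts) > MAX_PARALLEL:
        raise ValueError(f"at most {MAX_PARALLEL} agents per call")
    tasks = [run_agent(i, p, timeout) for i, p in enumerate(prompts)]
    # gather preserves input order and returns once every task has finished.
    return await asyncio.gather(*tasks)


results = asyncio.run(run_parallel(["fix tests", "update docs"]))
```

Because each task converts its own timeout into a result dict, one slow agent does not discard the output of the others; the call simply returns when the last task has either finished or timed out.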

{moonbridge-0.8.0 → moonbridge-0.9.0}/README.md RENAMED

@@ -61,6 +61,32 @@ export MOONBRIDGE_SKIP_UPDATE_CHECK=1
 **Best for:** Tasks that benefit from parallel execution or volume.
+## How it Works
+### Connection Flow
+1. MCP client (Claude Code, Cursor, etc.) connects to Moonbridge over stdio
+2. Client discovers available tools via `list_tools`
+3. Client calls `spawn_agent` or `spawn_agents_parallel`
+### Spawn Process
+1. Moonbridge validates the prompt and working directory
+2. Resolves which adapter to use (Kimi, Codex)
+3. Adapter builds the CLI command with appropriate flags
+4. Spawns subprocess in a separate process group
+5. Captures stdout/stderr, enforces timeout
+6. Returns structured JSON result
+### Parallel Execution
+- `spawn_agents_parallel` runs up to 10 agents concurrently via `asyncio.gather`
+- Each agent is independent (separate process, separate output)
+- All results returned together when the last agent finishes (or times out)
+```
+MCP Client → stdio → Moonbridge → adapter → CLI subprocess
+                                          → CLI subprocess (parallel)
+                                          → CLI subprocess (parallel)
+```
 ## Tools
 | Tool | Use case |
@@ -141,6 +167,31 @@ All tools return JSON with these fields:
 | `MOONBRIDGE_SANDBOX_MAX_COPY` | Max sandbox copy size in bytes (default 500MB) |
 | `MOONBRIDGE_LOG_LEVEL` | Set to `DEBUG` for verbose logging |
+## Security
+### 1. Directory Restrictions (`MOONBRIDGE_ALLOWED_DIRS`)
+Default: agents can operate in any directory. Set `MOONBRIDGE_ALLOWED_DIRS` to restrict: colon-separated allowed paths. Symlinks resolved via `os.path.realpath` before checking. Strict mode (`MOONBRIDGE_STRICT=1`) exits on startup if no valid allowed directories are configured.
+```bash
+export MOONBRIDGE_ALLOWED_DIRS="/home/user/projects:/home/user/work"
+export MOONBRIDGE_STRICT=1  # require restrictions
+```
+### 2. Environment Sanitization
+Only whitelisted env vars are passed to spawned agents. Each adapter defines its own allowlist (`PATH`, `HOME`, plus adapter-specific like `OPENAI_API_KEY` for Codex). Your shell environment (secrets, tokens, SSH keys) is not inherited by default.
+### 3. Input Validation
+Model parameters are validated to prevent flag injection (values starting with `-` are rejected). Prompts are capped at 100,000 characters and cannot be empty.
+### 4. Process Isolation
+Agents run in separate process groups (`start_new_session=True`). Orphan cleanup on exit. Sandbox mode available (`MOONBRIDGE_SANDBOX=1`) for copy-on-run isolation.
+> **Not OS-level sandboxing.** Agents can still read arbitrary host files. For strong isolation, use containers/VMs.
 ## Troubleshooting
 ### "CLI not found"
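
Reviewer note on the "Security" text added in this hunk: the directory restriction and the flag-injection check it describes could look roughly like the sketch below. The helper names (`parse_allowed_dirs`, `is_allowed`, `validate_model`) are hypothetical and are not taken from the package. Only the behaviours named in the README are modelled: colon-separated paths, `os.path.realpath` resolution, and rejection of values starting with `-`.

```python
import os


def parse_allowed_dirs(value: str) -> list:
    """Split a colon-separated list and resolve symlinks in each entry."""
    return [os.path.realpath(p) for p in value.split(":") if p]


def is_allowed(cwd: str, allowed: list) -> bool:
    """True when cwd is inside one of the allowed roots (or no roots are set)."""
    if not allowed:
        return True  # unrestricted default, as the README describes
    real = os.path.realpath(cwd)
    for root in allowed:
        try:
            # commonpath compares whole path components, so a sibling such as
            # "/a/projects-evil" is not treated as being inside "/a/projects".
            if os.path.commonpath([real, root]) == root:
                return True
        except ValueError:
            continue  # e.g. paths on different drives on Windows
    return False


def validate_model(model: str) -> str:
    """Reject values that a CLI could interpret as a flag."""
    if model.startswith("-"):
        raise ValueError("model must not start with '-'")
    return model
```

Comparing path components rather than raw string prefixes is a design choice of this sketch; the diff does not show how Moonbridge itself performs the containment check.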

{moonbridge-0.8.0 → moonbridge-0.9.0}/src/moonbridge/__init__.py RENAMED

@@ -2,7 +2,7 @@
 from __future__ import annotations
-__version__ = "0.8.0"
+__version__ = "0.9.0"
 from .server import main, run, server

{moonbridge-0.8.0 → moonbridge-0.9.0}/src/moonbridge/server.py RENAMED

@@ -21,6 +21,7 @@ from mcp.types import TextContent, Tool
 from moonbridge.adapters import ADAPTER_REGISTRY, CLIAdapter, get_adapter
 from moonbridge.adapters.base import AgentResult
+from moonbridge.signals import extract_quality_signals
 from moonbridge.tools import build_tools
 server = Server("moonbridge")
@@ -78,6 +79,22 @@ def _warn_if_unrestricted() -> None:
     print(message, file=sys.stderr)
+def _validate_allowed_dirs() -> None:
+    if not ALLOWED_DIRS:
+        return
+    missing_count = 0
+    for path in ALLOWED_DIRS:
+        if os.path.isdir(path):
+            continue
+        missing_count += 1
+        logger.warning("MOONBRIDGE_ALLOWED_DIRS entry does not exist: %s", path)
+    if missing_count == len(ALLOWED_DIRS) and STRICT_MODE:
+        message = "MOONBRIDGE_ALLOWED_DIRS entries do not exist"
+        logger.error(message)
+        print(message, file=sys.stderr)
+        sys.exit(1)
 def _safe_env(adapter: CLIAdapter) -> dict[str, str]:
     env = {key: os.environ[key] for key in adapter.config.safe_env_keys if key in os.environ}
     if "PATH" not in env and "PATH" in os.environ:
@@ -327,7 +344,7 @@ def _run_cli_sync(
             )
         status = "success" if proc.returncode == 0 else "error"
         logger.info("Agent %s completed with status: %s", agent_index, status)
-        return AgentResult(
+        result = AgentResult(
             status=status,
             output=stdout,
             stderr=stderr_value,
@@ -335,6 +352,13 @@ def _run_cli_sync(
             duration_ms=duration_ms,
             agent_index=agent_index,
         )
+        if result.status == "success":
+            signals = extract_quality_signals(result.output, result.stderr)
+            if signals:
+                raw = dict(result.raw or {})
+                raw["quality_signals"] = signals
+                result = replace(result, raw=raw)
+        return result
     except TimeoutExpired:
         _terminate_process(proc)
         duration_ms = int((time.monotonic() - start) * 1000)
@@ -401,6 +425,12 @@ def _json_text(payload: Any) -> list[TextContent]:
 def _status_check(cwd: str, adapter: CLIAdapter) -> dict[str, Any]:
+    config = {
+        "strict_mode": STRICT_MODE,
+        "allowed_dirs": ALLOWED_DIRS or None,
+        "unrestricted": not ALLOWED_DIRS,
+        "cwd": cwd,
+    }
     installed, _path = adapter.check_installed()
     if not installed:
         return {
@@ -408,20 +438,27 @@ def _status_check(cwd: str, adapter: CLIAdapter) -> dict[str, Any]:
             "message": (
                 f"{adapter.config.name} CLI not found. Install: {adapter.config.install_hint}"
             ),
+            "config": config,
         }
     timeout = min(DEFAULT_TIMEOUT, 60)
     result = _run_cli_sync(adapter, "status check", False, cwd, timeout, 0)
     if result.status == "auth_error":
-        return {"status": "auth_error", "message": adapter.config.auth_message}
+        return {
+            "status": "auth_error",
+            "message": adapter.config.auth_message,
+            "config": config,
+        }
     if result.status == "success":
         return {
             "status": "success",
             "message": f"{adapter.config.name} CLI available and authenticated",
+            "config": config,
         }
     return {
         "status": "error",
         "message": f"{adapter.config.name} CLI error",
         "details": result.to_dict(),
+        "config": config,
     }
@@ -444,6 +481,7 @@ def _adapter_info(cwd: str, adapter: CLIAdapter) -> dict[str, Any]:
 @server.list_tools()
 async def list_tools() -> list[Tool]:
+    """Build MCP tool metadata for the active adapter."""
     adapter = get_adapter()
     tool_desc = adapter.config.tool_description
     status_desc = f"Verify {adapter.config.name} CLI is installed and authenticated"
@@ -456,7 +494,15 @@ async def list_tools() -> list[Tool]:
 async def handle_tool(name: str, arguments: dict[str, Any]) -> list[TextContent]:
-    """Handle tool calls. Exposed for testing."""
+    """Dispatch a tool invocation with validation and stable error payloads.
+    Separated from ``call_tool`` so tests can invoke tool logic without the
+    MCP decorator.
+    Args:
+        name: MCP tool name (``spawn_agent``, ``spawn_agents_parallel``, etc.).
+        arguments: Tool argument payload from the MCP client.
+    """
     try:
         cwd = _validate_cwd(None)
         if name == "spawn_agent":
@@ -556,11 +602,12 @@ async def handle_tool(name: str, arguments: dict[str, Any]) -> list[TextContent]
 @server.call_tool()
 async def call_tool(name: str, arguments: dict[str, Any]) -> list[TextContent]:
-    """MCP tool handler - delegates to handle_tool."""
+    """MCP tool handler -- delegates to ``handle_tool`` for testability."""
     return await handle_tool(name, arguments)
 async def run() -> None:
+    """Run the MCP server over stdio until the client disconnects."""
     async with stdio_server() as (read_stream, write_stream):
         await server.run(
             read_stream,
@@ -570,8 +617,10 @@ async def run() -> None:
 def main() -> None:
+    """CLI entry point that validates prerequisites then starts the server."""
     _configure_logging()
     _warn_if_unrestricted()
+    _validate_allowed_dirs()
     from moonbridge import __version__
     from moonbridge.version_check import check_for_updates
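
Reviewer note on the `_run_cli_sync` hunk above: the result is now built first and then enriched through `replace(result, raw=raw)`. The sketch below shows that copy-then-replace pattern in isolation. `Result` is a hypothetical stand-in for `AgentResult`, whose real field list and frozen status are not visible in this diff, and `attach_signals` is an illustrative helper name.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class Result:
    """Hypothetical stand-in for AgentResult; only the fields used here."""

    status: str
    output: str
    raw: Optional[dict] = None


def attach_signals(result: Result, signals: dict) -> Result:
    """Return a copy of result carrying signals under raw['quality_signals']."""
    if result.status != "success" or not signals:
        return result  # failed runs and empty signal sets are left untouched
    raw = dict(result.raw or {})  # shallow copy so the original mapping is not mutated
    raw["quality_signals"] = signals
    return replace(result, raw=raw)


original = Result(status="success", output="== 5 passed ==", raw={"exit": 0})
enriched = attach_signals(original, {"tests_passed": 5})
```

`dataclasses.replace` builds a new instance, so the pattern works whether or not the dataclass is frozen, and the caller's original object keeps its old `raw` value.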

moonbridge-0.9.0/src/moonbridge/signals.py ADDED

@@ -0,0 +1,68 @@
+"""Heuristic extraction of quality signals from agent output."""
+from __future__ import annotations
+import re
+from typing import Any
+# Diff markers at line start.
+_DIFF_MARKER_RE = re.compile(r"^(?:\+\+\+ |--- |@@ )", re.MULTILINE)
+# File headers in unified diffs.
+_DIFF_FILE_RE = re.compile(r"^(?:\+\+\+ b/|--- a/)(.+)$", re.MULTILINE)
+# Git-style summary lines.
+_FILES_CHANGED_RE = re.compile(r"\b(\d+)\s+files?\s+changed\b", re.IGNORECASE)
+_MODIFIED_FILES_RE = re.compile(r"\bModified\s+(\d+)\s+files?\b", re.IGNORECASE)
+# Pytest-style summaries.
+_PASSED_RE = re.compile(r"(?<!\w)(\d+)\s+passed\b", re.IGNORECASE)
+_FAILED_RE = re.compile(r"(?<!\w)(\d+)\s+failed\b", re.IGNORECASE)
+# stderr error markers.
+_ERROR_RE = re.compile(r"(Traceback \(most recent call last\)|\berror:)", re.IGNORECASE)
+def _last_int(pattern: re.Pattern[str], text: str) -> int | None:
+    matches = pattern.findall(text)
+    if not matches:
+        return None
+    return int(matches[-1])
+def _count_files_changed(output: str) -> int:
+    paths = {path for path in _DIFF_FILE_RE.findall(output) if path and path != "/dev/null"}
+    if paths:
+        return len(paths)
+    match = _FILES_CHANGED_RE.search(output) or _MODIFIED_FILES_RE.search(output)
+    if match:
+        return int(match.group(1))
+    return 0
+def extract_quality_signals(output: str, stderr: str | None = None) -> dict[str, Any]:
+    """Extract heuristic quality signals from agent output."""
+    signals: dict[str, Any] = {}
+    if not output and not stderr:
+        return signals
+    has_diff = bool(_DIFF_MARKER_RE.search(output))
+    if has_diff:
+        signals["has_diff"] = True
+    files_changed = _count_files_changed(output)
+    if files_changed:
+        signals["files_changed"] = files_changed
+    combined = output
+    if stderr:
+        combined = f"{output}\n{stderr}"
+    tests_passed = _last_int(_PASSED_RE, combined)
+    if tests_passed is not None:
+        signals["tests_passed"] = tests_passed
+    tests_failed = _last_int(_FAILED_RE, combined)
+    if tests_failed is not None:
+        signals["tests_failed"] = tests_failed
+    if stderr and _ERROR_RE.search(stderr):
+        signals["has_errors"] = True
+    return signals
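
Reviewer note on `signals.py`: two details of the test-count extraction are easy to miss, namely that the last match wins and that `0` is reported rather than dropped. The snippet below reproduces the `_PASSED_RE` pattern and the `_last_int` logic from the diff under local names (`PASSED_RE`, `last_int`) so it runs without the package installed.

```python
import re

# Same pattern as _PASSED_RE in the diff above.
PASSED_RE = re.compile(r"(?<!\w)(\d+)\s+passed\b", re.IGNORECASE)


def last_int(pattern, text):
    """Return the final captured integer, or None when nothing matches."""
    matches = pattern.findall(text)
    return int(matches[-1]) if matches else None


log = "== 3 passed in 0.1s ==\n== 5 passed, 1 failed in 0.2s ==\n"
print(last_int(PASSED_RE, log))          # 5: the later run overrides the earlier one
print(last_int(PASSED_RE, "0 passed"))   # 0: a real count, distinct from None
print(last_int(PASSED_RE, "v2 passed"))  # None: the lookbehind rejects digits glued to a word
```

Since these are heuristics over free-form text, any prose that happens to contain a phrase like "3 passed" will also be counted; consumers should treat `quality_signals` as hints rather than verified results.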

{moonbridge-0.8.0 → moonbridge-0.9.0}/tests/test_sandbox.py RENAMED

@@ -2,8 +2,6 @@ import importlib
 from pathlib import Path
 from typing import Any
-import pytest
 from moonbridge.adapters.base import AgentResult
 sandbox_module = importlib.import_module("moonbridge.sandbox")

{moonbridge-0.8.0 → moonbridge-0.9.0}/tests/test_server.py RENAMED

@@ -56,6 +56,19 @@ async def test_spawn_agent_thinking_adds_flag(mock_popen: Any) -> None:
     assert "--thinking" in args[0]
+@pytest.mark.asyncio
+async def test_spawn_agent_adds_quality_signals(mock_popen: Any) -> None:
+    process = mock_popen.return_value
+    process.communicate.return_value = ("== 5 passed in 0.12s ==", "")
+    process.returncode = 0
+    result = await server_module.handle_tool("spawn_agent", {"prompt": "Hello"})
+    payload = json.loads(result[0].text)
+    assert payload["status"] == "success"
+    assert payload["raw"]["quality_signals"] == {"tests_passed": 5}
 @pytest.mark.asyncio
 async def test_spawn_agents_parallel_runs_concurrently(monkeypatch: Any) -> None:
     starts: list[float] = []
@@ -261,6 +274,25 @@ async def test_check_status_not_installed(mock_which_no_kimi: Any) -> None:
     assert payload["status"] == "error"
+@pytest.mark.asyncio
+async def test_check_status_includes_config(
+    mock_which_no_kimi: Any, monkeypatch: Any
+) -> None:
+    monkeypatch.setattr(server_module, "ALLOWED_DIRS", [])
+    monkeypatch.setattr(server_module, "STRICT_MODE", True)
+    monkeypatch.setattr(server_module.os, "getcwd", lambda: "/workdir")
+    result = await server_module.handle_tool("check_status", {})
+    payload = json.loads(result[0].text)
+    assert "config" in payload
+    config = payload["config"]
+    assert config["strict_mode"] is True
+    assert config["allowed_dirs"] is None
+    assert config["unrestricted"] is True
+    assert config["cwd"] == server_module.os.path.realpath("/workdir")
 @pytest.mark.asyncio
 async def test_list_adapters_tool_output(monkeypatch: Any) -> None:
     def fake_run(
@@ -434,6 +466,45 @@ def test_warn_if_unrestricted_no_warning_when_restricted(
     assert not caplog.records
+def test_validate_allowed_dirs_warns_missing(monkeypatch: Any, caplog: Any) -> None:
+    monkeypatch.setattr(server_module, "ALLOWED_DIRS", ["/missing"])
+    monkeypatch.setattr(server_module.os.path, "isdir", lambda _path: False)
+    caplog.set_level(logging.WARNING, logger="moonbridge")
+    server_module._validate_allowed_dirs()
+    assert any(
+        record.levelno == logging.WARNING
+        and record.getMessage() == "MOONBRIDGE_ALLOWED_DIRS entry does not exist: /missing"
+        for record in caplog.records
+    )
+def test_validate_allowed_dirs_strict_all_missing_exits(
+    monkeypatch: Any, caplog: Any, mocker: Any
+) -> None:
+    monkeypatch.setattr(server_module, "ALLOWED_DIRS", ["/missing", "/missing2"])
+    monkeypatch.setattr(server_module, "STRICT_MODE", True)
+    monkeypatch.setattr(server_module.os.path, "isdir", lambda _path: False)
+    exit_mock = mocker.patch("moonbridge.server.sys.exit")
+    caplog.set_level(logging.ERROR, logger="moonbridge")
+    server_module._validate_allowed_dirs()
+    exit_mock.assert_called_once_with(1)
+def test_validate_allowed_dirs_some_valid_no_exit(monkeypatch: Any, mocker: Any) -> None:
+    monkeypatch.setattr(server_module, "ALLOWED_DIRS", ["/missing", "/valid"])
+    monkeypatch.setattr(server_module, "STRICT_MODE", True)
+    monkeypatch.setattr(server_module.os.path, "isdir", lambda path: path == "/valid")
+    exit_mock = mocker.patch("moonbridge.server.sys.exit")
+    server_module._validate_allowed_dirs()
+    exit_mock.assert_not_called()
 def test_resolve_timeout_uses_adapter_default(monkeypatch: Any) -> None:
     """Adapter-specific default takes precedence over global default."""
     from moonbridge.adapters import get_adapter
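
Reviewer note on the three `_validate_allowed_dirs` tests above: together they pin down when startup aborts. The decision can be restated as a pure function, shown below as a sketch. `strict_exit_needed` is a hypothetical name, and `isdir` is injected so the example needs no real filesystem.

```python
from typing import Callable, Sequence


def strict_exit_needed(
    allowed: Sequence[str], strict: bool, isdir: Callable[[str], bool]
) -> bool:
    """Mirror the exit condition of _validate_allowed_dirs in the diff above."""
    if not allowed:
        # The diffed function returns early here, so an empty list never
        # triggers the exit in this code path, even in strict mode.
        return False
    missing = [path for path in allowed if not isdir(path)]
    # Exit only when strict mode is on and every configured entry is missing.
    return strict and len(missing) == len(allowed)


def nothing_exists(_path: str) -> bool:
    return False


print(strict_exit_needed(["/missing", "/missing2"], True, nothing_exists))         # True
print(strict_exit_needed(["/missing", "/valid"], True, lambda p: p == "/valid"))   # False
print(strict_exit_needed(["/missing"], False, nothing_exists))                     # False
```

One valid entry is enough to keep the server running in strict mode; missing entries in that case only produce warnings.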

moonbridge-0.9.0/tests/test_signals.py ADDED

@@ -0,0 +1,109 @@
+from moonbridge.signals import extract_quality_signals
+def test_extract_quality_signals_empty_output() -> None:
+    assert extract_quality_signals("", None) == {}
+def test_extract_quality_signals_pytest_counts() -> None:
+    output = "== 5 passed, 2 failed in 0.12s =="
+    assert extract_quality_signals(output) == {"tests_passed": 5, "tests_failed": 2}
+def test_extract_quality_signals_diff_markers() -> None:
+    output = (
+        "diff --git a/foo.py b/foo.py\n"
+        "index 123..456 100644\n"
+        "--- a/foo.py\n"
+        "+++ b/foo.py\n"
+        "@@ -1 +1 @@\n"
+        "-old\n"
+        "+new\n"
+        "diff --git a/bar.py b/bar.py\n"
+        "--- a/bar.py\n"
+        "+++ b/bar.py\n"
+        "@@ -1 +1 @@\n"
+        "-old\n"
+        "+new\n"
+    )
+    assert extract_quality_signals(output) == {"has_diff": True, "files_changed": 2}
+def test_extract_quality_signals_traceback() -> None:
+    stderr = "Traceback (most recent call last):\n  boom\n"
+    assert extract_quality_signals("", stderr) == {"has_errors": True}
+def test_extract_quality_signals_combined() -> None:
+    output = "2 passed, 1 failed\n--- a/app.py\n+++ b/app.py\n@@ -1 +1 @@\n"
+    stderr = "error: something went wrong\n"
+    assert extract_quality_signals(output, stderr) == {
+        "tests_passed": 2,
+        "tests_failed": 1,
+        "has_diff": True,
+        "files_changed": 1,
+        "has_errors": True,
+    }
+def test_extract_quality_signals_zero_passed() -> None:
+    """Zero passed should be reported, not silently dropped."""
+    output = "== 0 passed, 3 failed in 0.05s =="
+    assert extract_quality_signals(output) == {"tests_passed": 0, "tests_failed": 3}
+def test_extract_quality_signals_zero_failed() -> None:
+    """Zero failed is a meaningful signal (all tests passed)."""
+    output = "== 5 passed, 0 failed in 0.12s =="
+    assert extract_quality_signals(output) == {"tests_passed": 5, "tests_failed": 0}
+def test_extract_quality_signals_files_changed_summary() -> None:
+    """Fallback to git summary line when no diff headers present."""
+    output = "3 files changed, 10 insertions(+), 2 deletions(-)\n"
+    assert extract_quality_signals(output) == {"files_changed": 3}
+def test_extract_quality_signals_modified_files_summary() -> None:
+    """Fallback to Modified N files format."""
+    output = "Modified 2 files\n"
+    assert extract_quality_signals(output) == {"files_changed": 2}
+def test_extract_quality_signals_last_match_wins() -> None:
+    """When output has multiple test runs, the last result is used."""
+    output = (
+        "== 3 passed in 0.1s ==\n"
+        "== 5 passed, 1 failed in 0.2s ==\n"
+    )
+    assert extract_quality_signals(output) == {"tests_passed": 5, "tests_failed": 1}
+def test_extract_quality_signals_no_signals_on_plain_output() -> None:
+    output = "All done, no issues found.\n"
+    assert extract_quality_signals(output) == {}
+def test_extract_quality_signals_real_worldish_codex_output() -> None:
+    output = (
+        "Running: uv run pytest -v\n"
+        "============================= test session starts ==============================\n"
+        "collected 7 items\n"
+        "tests/test_server.py ....F..\n"
+        "=========================== short test summary info ============================\n"
+        "FAILED tests/test_server.py::test_spawn_agent - AssertionError\n"
+        "========================= 6 passed, 1 failed in 0.45s =========================\n"
+        "diff --git a/src/app.py b/src/app.py\n"
+        "index 123..456 100644\n"
+        "--- a/src/app.py\n"
+        "+++ b/src/app.py\n"
+        "@@ -1,2 +1,2 @@\n"
+        "-old\n"
+        "+new\n"
+    )
+    assert extract_quality_signals(output) == {
+        "tests_passed": 6,
+        "tests_failed": 1,
+        "has_diff": True,
+        "files_changed": 1,
+    }