veris-cli 2.12.0.tar.gz → 2.13.0.tar.gz

Files changed (16)

{veris_cli-2.12.0 → veris_cli-2.13.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: veris-cli
-Version: 2.12.0
+Version: 2.13.0
 Summary: CLI to connect local agents to the Veris backend
 Project-URL: Homepage, https://github.com/veris-ai/veris-cli
 Project-URL: Bug Tracker, https://github.com/veris-ai/veris-cli/issues
@@ -309,6 +309,13 @@ veris evaluation-runs list --run-id <id>
 veris evaluation-runs status <eval-run-id> --run-id <id> [--watch]
 ```
+### CI
+```bash
+# Run simulation + evaluation pipeline (config-driven)
+veris ci run [--scenario-set-id <id>] [--env-id <id>] [--concurrency <n>] [--image-tag <tag>] [--simulation-timeout <seconds>]
+```
 ### Runs
 ```bash
@@ -458,6 +465,79 @@ Each scenario runs in an isolated container with:
 - Mounted logs directory at `/sessions` (for output)
 - Environment variables from `.env` plus `SCENARIO_ID`
+## CI/CD Integration
+Run simulations automatically on every pull request and post results as a PR comment.
+### Setup
+1. Run `veris ci run` interactively once to select a scenario set — this saves the config to `.veris/config.yaml`:
+```bash
+veris ci run
+```
+2. Commit `.veris/config.yaml` to your repo (make sure it's not gitignored).
+3. Add a `VERIS_API_KEY` secret to your repo (Settings → Secrets → Actions).
+### GitHub Actions Workflow
+```yaml
+name: Veris Simulation
+on:
+  pull_request:
+    branches: [main]
+jobs:
+  simulate:
+    runs-on: ubuntu-latest
+    # Restrict to maintainers via GitHub environment protection rules
+    environment: veris-sim-ci
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+      - name: Install veris-cli
+        run: pip install veris-cli
+      - name: Build & push agent image
+        env:
+          VERIS_API_KEY: ${{ secrets.VERIS_API_KEY }}
+        run: |
+          veris login "$VERIS_API_KEY"
+          veris env push --tag ${{ github.sha }} --remote
+      - name: Run simulation & evaluation
+        run: |
+          veris ci run --image-tag ${{ github.sha }} > veris-summary.md
+      - name: Comment on PR
+        uses: marocchino/sticky-pull-request-comment@v2
+        with:
+          path: veris-summary.md
+```
+The `environment: veris-sim-ci` line uses [GitHub environment protection rules](https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment): with required reviewers configured on that environment, the job only starts, and its secrets are only exposed, after a maintainer approves the run, even on public repos.
+### CLI Usage
+```bash
+# Everything from config — zero flags needed
+veris ci run
+# Override image tag (common in CI for PR builds)
+veris ci run --image-tag $(git rev-parse --short HEAD)
+# Override everything
+veris ci run --scenario-set-id X --env-id Y --concurrency 5
+```
+Progress is printed to stderr, clean markdown to stdout. The command exits non-zero if the run or evaluation fails.
 ## How It Works
 ```
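
The stdout/stderr split described in the CLI Usage notes above is what makes `veris ci run ... > veris-summary.md` safe in the workflow: the redirect captures only the report, while progress still shows up in the job log. Below is a minimal Python sketch of that pattern using only the standard library. `run_pipeline` and its messages are hypothetical stand-ins, not part of veris-cli.

```python
import contextlib
import io
import sys


def run_pipeline() -> int:
    """Hypothetical stand-in for a CI command: progress to stderr, report to stdout."""
    print("Creating run...", file=sys.stderr)  # progress: not captured by `>`
    print("Run completed", file=sys.stderr)
    print("## Simulation Report")              # report: captured by `> summary.md`
    print("| Field | Value |")
    return 0  # a real command would return non-zero on failure


# Capture the two streams separately to show what a shell redirect would see.
out, err = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
    exit_code = run_pipeline()

report = out.getvalue()    # what would land in the summary file
progress = err.getvalue()  # what would stay in the CI job log
```

Because the report and the progress lines never share a stream, the PR comment step can post the captured file as-is.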

{veris_cli-2.12.0 → veris_cli-2.13.0}/README.md RENAMED

@@ -287,6 +287,13 @@ veris evaluation-runs list --run-id <id>
 veris evaluation-runs status <eval-run-id> --run-id <id> [--watch]
 ```
+### CI
+```bash
+# Run simulation + evaluation pipeline (config-driven)
+veris ci run [--scenario-set-id <id>] [--env-id <id>] [--concurrency <n>] [--image-tag <tag>] [--simulation-timeout <seconds>]
+```
 ### Runs
 ```bash
@@ -436,6 +443,79 @@ Each scenario runs in an isolated container with:
 - Mounted logs directory at `/sessions` (for output)
 - Environment variables from `.env` plus `SCENARIO_ID`
+## CI/CD Integration
+Run simulations automatically on every pull request and post results as a PR comment.
+### Setup
+1. Run `veris ci run` interactively once to select a scenario set — this saves the config to `.veris/config.yaml`:
+```bash
+veris ci run
+```
+2. Commit `.veris/config.yaml` to your repo (make sure it's not gitignored).
+3. Add a `VERIS_API_KEY` secret to your repo (Settings → Secrets → Actions).
+### GitHub Actions Workflow
+```yaml
+name: Veris Simulation
+on:
+  pull_request:
+    branches: [main]
+jobs:
+  simulate:
+    runs-on: ubuntu-latest
+    # Restrict to maintainers via GitHub environment protection rules
+    environment: veris-sim-ci
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with:
+          python-version: "3.11"
+      - name: Install veris-cli
+        run: pip install veris-cli
+      - name: Build & push agent image
+        env:
+          VERIS_API_KEY: ${{ secrets.VERIS_API_KEY }}
+        run: |
+          veris login "$VERIS_API_KEY"
+          veris env push --tag ${{ github.sha }} --remote
+      - name: Run simulation & evaluation
+        run: |
+          veris ci run --image-tag ${{ github.sha }} > veris-summary.md
+      - name: Comment on PR
+        uses: marocchino/sticky-pull-request-comment@v2
+        with:
+          path: veris-summary.md
+```
+The `environment: veris-sim-ci` line uses [GitHub environment protection rules](https://docs.github.com/en/actions/deployment/targeting-different-environments/using-environments-for-deployment): with required reviewers configured on that environment, the job only starts, and its secrets are only exposed, after a maintainer approves the run, even on public repos.
+### CLI Usage
+```bash
+# Everything from config — zero flags needed
+veris ci run
+# Override image tag (common in CI for PR builds)
+veris ci run --image-tag $(git rev-parse --short HEAD)
+# Override everything
+veris ci run --scenario-set-id X --env-id Y --concurrency 5
+```
+Progress is printed to stderr, clean markdown to stdout. The command exits non-zero if the run or evaluation fails.
 ## How It Works
 ```

{veris_cli-2.12.0 → veris_cli-2.13.0}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "veris-cli"
-version = "2.12.0"
+version = "2.13.0"
 description = "CLI to connect local agents to the Veris backend"
 readme = "README.md"
 requires-python = ">=3.11"

veris_cli-2.13.0/src/veris_cli/ci_output.py ADDED

@@ -0,0 +1,80 @@
+"""Markdown summary formatter for CI runs."""
+from __future__ import annotations
+def _fmt_duration(seconds: int | None) -> str:
+    if seconds is None:
+        return "—"
+    m, s = divmod(seconds, 60)
+    return f"{m}m {s:02d}s"
+def format_markdown_summary(
+    run: dict,
+    scenario_set: dict,
+    simulations: list[dict],
+    eval_run: dict,
+) -> str:
+    """Build a markdown report from run + evaluation data."""
+    # Build lookup maps
+    scenarios = scenario_set.get("scenarios") or []
+    scenario_map: dict[str, str] = {s["id"]: s["title"] for s in scenarios}
+    sim_scenario_map: dict[str, str] = {s["id"]: s["scenario_id"] for s in simulations}
+    lines: list[str] = []
+    lines.append("## Veris Simulation Report")
+    lines.append("")
+    # Metadata table
+    lines.append("| Field | Value |")
+    lines.append("|-------|-------|")
+    lines.append(f"| Run | `{run['id']}` |")
+    lines.append(f"| Status | {run['status']} |")
+    set_title = scenario_set.get("title", "—")
+    set_id = scenario_set.get("id", "—")
+    lines.append(f"| Scenario Set | {set_title} (`{set_id}`) |")
+    lines.append(f"| Scenarios | {len(scenarios)} |")
+    lines.append(f"| Duration | {_fmt_duration(run.get('duration_seconds'))} |")
+    lines.append("")
+    # Grading results
+    grading_results = eval_run.get("grading_results") or []
+    if grading_results:
+        lines.append("### Grading Results")
+        lines.append("")
+        lines.append("| Scenario | Simulation | Score | Status |")
+        lines.append("|----------|------------|-------|--------|")
+        for gr in grading_results:
+            sim_id = gr["simulation_id"]
+            scenario_id = sim_scenario_map.get(sim_id, "—")
+            scenario_title = scenario_map.get(scenario_id, scenario_id)
+            score = gr.get("score")
+            score_str = f"{score:.2f}" if score is not None else "—"
+            lines.append(f"| {scenario_title} | `{sim_id}` | {score_str} | {gr['status']} |")
+        lines.append("")
+    # Assertion results
+    assertion_results = eval_run.get("assertion_results") or []
+    if assertion_results:
+        lines.append("### Assertion Results")
+        lines.append("")
+        lines.append("| Scenario | Verdict | Criteria |")
+        lines.append("|----------|---------|----------|")
+        for ar in assertion_results:
+            sim_id = ar["simulation_id"]
+            scenario_id = sim_scenario_map.get(sim_id, "—")
+            scenario_title = scenario_map.get(scenario_id, scenario_id)
+            verdict = ar.get("verdict") or "—"
+            # Compute criteria summary
+            criteria_results = (ar.get("result") or {}).get("criteria_results", [])
+            if criteria_results:
+                passed = sum(1 for c in criteria_results if c.get("result") == "PASS")
+                criteria_str = f"{passed}/{len(criteria_results)}"
+            else:
+                criteria_str = "—"
+            lines.append(f"| {scenario_title} | {verdict} | {criteria_str} |")
+        lines.append("")
+    return "\n".join(lines)
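
To see how the formatter joins its inputs: each evaluation result carries a `simulation_id`, which maps to a `scenario_id`, which in turn maps to a scenario title. The sketch below replays that lookup chain, the criteria tally, and the duration arithmetic on made-up data. The IDs, titles, and result values are invented for illustration; only the key names come from the module above, and the `"-"` placeholder is a simplification of the one the module prints.

```python
from __future__ import annotations


def fmt_duration(seconds: int | None) -> str:
    """Same arithmetic as `_fmt_duration`: whole minutes plus zero-padded seconds."""
    if seconds is None:
        return "-"  # placeholder for a missing duration
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs:02d}s"


# Invented sample data; only the key names mirror the formatter's inputs.
scenarios = [{"id": "sc-1", "title": "Refund request"}]
simulations = [{"id": "sim-1", "scenario_id": "sc-1"}]
assertion = {
    "simulation_id": "sim-1",
    "verdict": "FAIL",
    "result": {
        "criteria_results": [{"result": "PASS"}, {"result": "FAIL"}, {"result": "PASS"}]
    },
}

# simulation id -> scenario id -> scenario title
scenario_titles = {s["id"]: s["title"] for s in scenarios}
sim_to_scenario = {s["id"]: s["scenario_id"] for s in simulations}
scenario_id = sim_to_scenario.get(assertion["simulation_id"], "-")
title = scenario_titles.get(scenario_id, scenario_id)

# Count criteria whose result is exactly "PASS".
criteria = (assertion.get("result") or {}).get("criteria_results", [])
passed = sum(1 for c in criteria if c.get("result") == "PASS")
row = f"| {title} | {assertion['verdict']} | {passed}/{len(criteria)} |"

print(row)                # | Refund request | FAIL | 2/3 |
print(fmt_duration(125))  # 2m 05s
```

If a simulation ID is missing from the simulations list, the title falls back to the placeholder, which matches the fallback behavior of the two `.get(...)` calls in the module.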

{veris_cli-2.12.0 → veris_cli-2.13.0}/src/veris_cli/cli.py RENAMED

@@ -1674,6 +1674,322 @@ def run_local(
         print()
+## CI commands
+@cli.group()
+def ci():
+    """CI/CD integration commands."""
+    pass
+@ci.command(name="run")
+@click.option("--scenario-set-id", default=None, help="Scenario set ID (overrides config)")
+@click.option("--env-id", default=None, help="Environment ID (overrides config)")
+@click.option("--concurrency", default=None, type=int, help="Parallel jobs (overrides config)")
+@click.option("--image-tag", default=None, help="Image tag to run (default: latest)")
+@click.option("--simulation-timeout", default=None, type=int, help="Timeout per sim in seconds")
+@click.pass_context
+def ci_run(ctx, scenario_set_id, env_id, concurrency, image_tag, simulation_timeout):
+    """Run simulations and evaluations for CI, output markdown summary.
+    Reads scenario_set_id, concurrency, and environment_id from
+    .veris/config.yaml so zero flags are needed in CI. All values
+    are overridable via CLI flags.
+    Progress is printed to stderr; markdown summary to stdout.
+    \b
+    Examples:
+      veris ci run                              # everything from config
+      veris ci run --image-tag $(git rev-parse --short HEAD)
+      veris ci run --scenario-set-id X --env-id Y --concurrency 5
+    """
+    from veris_cli.ci_output import format_markdown_summary
+    profile = _get_profile(ctx)
+    project_config = ProjectConfig(profile=profile)
+    ci_config = project_config.get_ci_config()
+    is_tty = sys.stdin.isatty()
+    # ── Resolve env_id ──
+    env_id = env_id or project_config.get_environment_id()
+    if not env_id:
+        output.print_error("No environment_id found. Run `veris init` or pass --env-id.")
+        sys.exit(1)
+    try:
+        api = VerisAPI(profile=profile)
+    except ValueError as e:
+        output.print_error(str(e))
+        sys.exit(1)
+    # ── Resolve scenario_set_id ──
+    prompted_scenario_set = False
+    scenario_set_id = scenario_set_id or ci_config.get("scenario_set_id")
+    if not scenario_set_id:
+        if not is_tty:
+            click.echo(
+                "Error: CI not configured. Run `veris ci run` interactively first "
+                "or pass --scenario-set-id.",
+                err=True,
+            )
+            sys.exit(1)
+        scenario_set_id = _ci_prompt_scenario_set(api, env_id)
+        prompted_scenario_set = True
+    # ── Resolve concurrency ──
+    concurrency = concurrency or ci_config.get("concurrency", 10)
+    # ── Save to config only when user interactively selected ──
+    if prompted_scenario_set:
+        new_ci = {"scenario_set_id": scenario_set_id}
+        if concurrency != 10:
+            new_ci["concurrency"] = concurrency
+        project_config.set_ci_config(new_ci)
+        click.echo("Saved CI config to .veris/config.yaml", err=True)
+    # ── Build run config ──
+    run_config = {}
+    if image_tag:
+        run_config["image_tag"] = image_tag
+    if simulation_timeout:
+        run_config["simulation_timeout"] = simulation_timeout
+    # ── Step 1: Create run ──
+    click.echo(
+        f"Creating run (scenario_set={scenario_set_id}, concurrency={concurrency})...",
+        err=True,
+    )
+    try:
+        run = api.create_run(
+            scenario_set_id=scenario_set_id,
+            environment_id=env_id,
+            parallel_jobs=concurrency,
+            config=run_config if run_config else None,
+        )
+    except Exception as e:
+        click.echo(f"Error: Failed to create run: {e}", err=True)
+        sys.exit(1)
+    run_id = run["id"]
+    click.echo(f"Run {run_id}: created", err=True)
+    # ── Step 2: Poll run ──
+    run = _ci_poll_run(api, run_id)
+    if run["status"] not in ("completed", "failed"):
+        click.echo(f"Run {run_id}: {run['status']} (unexpected terminal state)", err=True)
+        sys.exit(1)
+    run_failed = run["status"] == "failed"
+    if run_failed:
+        click.echo(f"Run {run_id}: failed — {run.get('error_message', 'unknown error')}", err=True)
+        if run.get("completed_simulations", 0) == 0:
+            sys.exit(1)
+        click.echo("Some simulations completed — continuing to evaluation...", err=True)
+    # ── Step 3: Resolve grader ──
+    click.echo("Resolving grader...", err=True)
+    try:
+        graders_resp = api.list_graders(env_id, scenario_set_id=scenario_set_id)
+    except Exception as e:
+        click.echo(f"Error: Failed to list graders: {e}", err=True)
+        sys.exit(1)
+    graders = graders_resp.get("graders", [])
+    if not graders:
+        click.echo("Error: No grader found for this environment/scenario set.", err=True)
+        sys.exit(1)
+    grader_id = graders[0]["id"]
+    click.echo(f"Using grader {grader_id}", err=True)
+    # ── Step 4: Trigger evaluation ──
+    click.echo("Triggering evaluation...", err=True)
+    try:
+        eval_resp = api.trigger_evaluation(run_id, grader_id)
+    except Exception as e:
+        click.echo(f"Error: Failed to trigger evaluation: {e}", err=True)
+        sys.exit(1)
+    eval_run_id = eval_resp["evaluation_run_id"]
+    click.echo(f"Evaluation run {eval_run_id}: started", err=True)
+    # ── Step 5: Poll evaluation ──
+    eval_run = _ci_poll_eval(api, run_id, eval_run_id)
+    if eval_run["status"] == "failed":
+        click.echo(f"Evaluation run {eval_run_id}: failed", err=True)
+    # ── Step 6: Fetch supplementary data ──
+    try:
+        scenario_set = api.get_scenario_set(scenario_set_id)
+        sims_resp = api.list_run_simulations(run_id)
+    except Exception as e:
+        click.echo(f"Error: Failed to fetch data: {e}", err=True)
+        sys.exit(1)
+    simulations = sims_resp.get("simulations", [])
+    # ── Step 7: Output markdown ──
+    md = format_markdown_summary(run, scenario_set, simulations, eval_run)
+    click.echo(md)
+    # ── Exit non-zero if run or evaluation failed ──
+    if run_failed or eval_run.get("status") == "failed":
+        sys.exit(1)
+def _ci_prompt_scenario_set(api: VerisAPI, env_id: str) -> str:
+    """Interactively prompt user to select a scenario set for CI config."""
+    click.echo("Fetching scenario sets...", err=True)
+    try:
+        sets = api.list_scenario_sets(environment_id=env_id)
+    except Exception as e:
+        output.print_error(f"Failed to list scenario sets: {e}")
+        sys.exit(1)
+    if not sets:
+        output.print_error("No scenario sets found for this environment.")
+        sys.exit(1)
+    choices = [
+        {
+            "id": s["id"],
+            "title": f"{s.get('title', '')} ({s['id']}) — {s.get('scenario_count', '?')} scenarios",
+        }
+        for s in sets
+    ]
+    selected = prompts.select_from_list(
+        "Select a scenario set for CI:", choices, flag_hint="--scenario-set-id"
+    )
+    if not selected:
+        output.print_error("No scenario set selected")
+        sys.exit(1)
+    return selected
+def _ci_poll_run(api: VerisAPI, run_id: str) -> dict:
+    """Poll run status until terminal state. Uses Rich spinner on TTY, plain lines in CI."""
+    terminal = {"completed", "failed", "cancelled"}
+    is_tty = sys.stderr.isatty()
+    last_msg = None
+    live = None
+    if is_tty:
+        from rich.console import Console
+        from rich.live import Live
+        from rich.spinner import Spinner
+        stderr_console = Console(stderr=True)
+        live = Live(console=stderr_console, refresh_per_second=8, transient=True)
+        live.start()
+    try:
+        while True:
+            try:
+                run = api.get_run(run_id)
+            except Exception as e:
+                if live:
+                    live.stop()
+                click.echo(f"Error: Failed to poll run: {e}", err=True)
+                sys.exit(1)
+            status = run["status"]
+            completed = run.get("completed_simulations", 0)
+            failed = run.get("failed_simulations", 0)
+            total = run.get("total_simulations", 0)
+            if total == 0:
+                total = run.get("total_jobs", 0)
+            running = total - completed - failed
+            parts = []
+            if running > 0:
+                parts.append(f"{running} running")
+            if completed > 0:
+                parts.append(f"{completed} done")
+            if failed > 0:
+                parts.append(f"{failed} failed")
+            detail = ", ".join(parts) if parts else "waiting"
+            msg = f"{status} — {detail} ({total} total)"
+            if status in terminal:
+                if live:
+                    live.stop()
+                    icon = "[green]✓[/green]" if status == "completed" else "[red]✗[/red]"
+                    stderr_console.print(f"{icon} Run {status} — {detail}")
+                else:
+                    icon = "✓" if status == "completed" else "✗"
+                    click.echo(f"{icon} Run {status} — {detail}", err=True)
+                return run
+            if is_tty:
+                live.update(Spinner("dots", text=msg))
+            elif msg != last_msg:
+                click.echo(f"  Run: {msg}", err=True)
+                last_msg = msg
+            time.sleep(3)
+    finally:
+        if live:
+            live.stop()
+def _ci_poll_eval(api: VerisAPI, run_id: str, eval_run_id: str) -> dict:
+    """Poll evaluation run until terminal state. Uses Rich spinner on TTY, plain lines in CI."""
+    terminal = {"completed", "failed"}
+    is_tty = sys.stderr.isatty()
+    last_msg = None
+    live = None
+    if is_tty:
+        from rich.console import Console
+        from rich.live import Live
+        from rich.spinner import Spinner
+        stderr_console = Console(stderr=True)
+        live = Live(console=stderr_console, refresh_per_second=8, transient=True)
+        live.start()
+    try:
+        while True:
+            try:
+                ev = api.get_evaluation_run(run_id, eval_run_id)
+            except Exception as e:
+                if live:
+                    live.stop()
+                click.echo(f"Error: Failed to poll evaluation: {e}", err=True)
+                sys.exit(1)
+            status = ev["status"]
+            gr_done = ev.get("completed_grading_results", 0) + ev.get("failed_grading_results", 0)
+            gr_total = ev.get("total_grading_results", 0)
+            ar_done = ev.get("completed_assertion_results", 0) + ev.get(
+                "failed_assertion_results", 0
+            )
+            ar_total = ev.get("total_assertion_results", 0)
+            msg = (
+                f"Evaluation {eval_run_id}: {status} "
+                f"(grading {gr_done}/{gr_total}, assertions {ar_done}/{ar_total})"
+            )
+            if status in terminal:
+                if live:
+                    live.stop()
+                    icon = "[green]✓[/green]" if status == "completed" else "[red]✗[/red]"
+                    stderr_console.print(f"{icon} {msg}")
+                else:
+                    icon = "✓" if status == "completed" else "✗"
+                    click.echo(f"{icon} {msg}", err=True)
+                return ev
+            if is_tty:
+                live.update(Spinner("dots", text=msg))
+            elif msg != last_msg:
+                click.echo(f"  {msg}", err=True)
+                last_msg = msg
+            time.sleep(5)
+    finally:
+        if live:
+            live.stop()
 def main():
     """Main entry point"""
     cli()
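
`ci_run` resolves each setting with a `flag or config value` expression, so the precedence is: CLI flag, then the `ci` block from `.veris/config.yaml`, then a built-in default of 10 for concurrency. The sketch below isolates that rule in a hypothetical helper named `resolve_concurrency` (the real command inlines the expression). It also shows a side effect of using `or`: a falsy flag value such as `0` is treated as if the flag were not passed.

```python
from __future__ import annotations


def resolve_concurrency(flag: int | None, ci_config: dict, default: int = 10) -> int:
    """Mirror `concurrency or ci_config.get("concurrency", 10)` from ci_run."""
    return flag or ci_config.get("concurrency", default)


print(resolve_concurrency(5, {"concurrency": 3}))     # 5: the flag wins
print(resolve_concurrency(None, {"concurrency": 3}))  # 3: falls back to config
print(resolve_concurrency(None, {}))                  # 10: built-in default
print(resolve_concurrency(0, {"concurrency": 3}))     # 3: 0 is falsy, so it counts as unset
```

The same `or` pattern is applied to `env_id` and `scenario_set_id`, where an empty string would likewise fall through to the configured value.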

{veris_cli-2.12.0 → veris_cli-2.13.0}/src/veris_cli/config.py RENAMED

@@ -200,3 +200,13 @@ class ProjectConfig:
         profile_data["environment_id"] = environment_id
         profile_data["environment_name"] = environment_name
         self._save_profile(profile_data)
+    def get_ci_config(self) -> dict:
+        """Get CI config block from project config."""
+        return self._load_profile().get("ci", {})
+    def set_ci_config(self, ci_config: dict) -> None:
+        """Save CI config block to project config."""
+        profile_data = self._load_profile()
+        profile_data["ci"] = ci_config
+        self._save_profile(profile_data)
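
The two new methods read and replace a single `ci` key inside the active profile's data. A rough in-memory sketch of that behavior follows. `InMemoryProjectConfig` is a hypothetical stand-in that keeps the profile in a dict instead of going through `_load_profile` and `_save_profile`, whose file handling is not shown in this diff.

```python
class InMemoryProjectConfig:
    """Hypothetical stand-in: same get/set logic, but no file I/O."""

    def __init__(self) -> None:
        self._profile: dict = {}

    def get_ci_config(self) -> dict:
        # A missing "ci" block reads as an empty dict, never as an error.
        return self._profile.get("ci", {})

    def set_ci_config(self, ci_config: dict) -> None:
        # The whole block is replaced; existing keys are not merged.
        self._profile["ci"] = ci_config


config = InMemoryProjectConfig()
empty = config.get_ci_config()

config.set_ci_config({"scenario_set_id": "set-123", "concurrency": 5})
config.set_ci_config({"scenario_set_id": "set-456"})
current = config.get_ci_config()
print(current)  # {'scenario_set_id': 'set-456'}
```

Since `set_ci_config` overwrites rather than merges, a later interactive `veris ci run` that saves only `scenario_set_id` would drop a previously stored `concurrency` value.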