PyPI - veris-cli - Versions diffs - 2.1.2__tar.gz → 2.2.0__tar.gz - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

{veris_cli-2.1.2 → veris_cli-2.2.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: veris-cli
-Version: 2.1.2
+Version: 2.2.0
 Summary: CLI to connect local agents to the Veris backend
 Project-URL: Homepage, https://github.com/veris-ai/veris-cli
 Project-URL: Bug Tracker, https://github.com/veris-ai/veris-cli/issues
@@ -125,18 +125,28 @@ This will:
 **Note:** On macOS, this uses `docker buildx` for multi-platform builds targeting `linux/amd64` (GKE platform).
-### 6. List Available Scenarios
+### 6. Generate Scenarios (Optional)
+You can write scenarios by hand (see [Local Development](#local-development--testing)) or generate them automatically:
+```bash
+# Generate 5 scenarios + graders using Claude Code
+veris scenarios generate --num 5
+```
+This launches a K8s job that explores your agent's source code and produces test scenarios and graders. Poll status with:
 ```bash
 veris scenarios list
 ```
-Or filter by visibility:
+### 7. List Available Scenarios
 ```bash
-veris scenarios list --visibility public
+veris scenarios list
 ```
-### 7. Create and Run a Simulation
+### 8. Create and Run a Simulation
 ```bash
 # Interactive mode (prompts for scenario and environment)
@@ -146,7 +156,7 @@ veris run create
 veris run create --scenario-set-id scenset_abc123 --env-id env_xyz789
 ```
-### 8. Monitor Your Run
+### 9. Monitor Your Run
 ```bash
 # Check status
@@ -162,7 +172,23 @@ veris run logs run_abc123
 veris run logs run_abc123 --follow
 ```
-### 9. Cancel a Run (if needed)
+### 10. Evaluate Results (Optional)
+Once a run completes and graders are available:
+```bash
+# List available graders
+veris eval list
+# Trigger evaluation (interactive prompts for run and grader)
+veris evaluation-runs create
+# Check evaluation status
+veris evaluation-runs list --run-id run_abc123
+veris evaluation-runs status evalrun_abc123 --run-id run_abc123
+```
+### 11. Cancel a Run (if needed)
 ```bash
 veris run cancel run_abc123
@@ -208,8 +234,31 @@ veris env list [--status ready]
 ### Scenarios
 ```bash
-# List scenarios
-veris scenarios list [--visibility public|private|org]
+# List scenario sets
+veris scenarios list [--env-id <id>]
+# Generate scenarios + graders via K8s job
+veris scenarios generate [--env-id <id>] [--num 5] [--image-tag latest]
+```
+### Eval (Graders)
+```bash
+# List graders for an environment
+veris eval list [--env-id <id>]
+```
+### Evaluation Runs
+```bash
+# Trigger grading on a completed run
+veris evaluation-runs create [--run-id <id>] [--grader-id <id>]
+# List evaluation runs for a run
+veris evaluation-runs list --run-id <id>
+# Get evaluation run status and results
+veris evaluation-runs status <eval-run-id> --run-id <id> [--watch]
 ```
 ### Runs
@@ -347,13 +396,25 @@ Each scenario runs in an isolated container with:
 └─────────────────────────────────────────────────────────────┘
                             ↓
 ┌─────────────────────────────────────────────────────────────┐
-│  3. Run Simulations                                         │
+│  3. Generate Scenarios (optional)                           │
+│     veris scenarios generate → Claude Code explores agent   │
+│                              → produces scenarios + graders │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│  4. Run Simulations                                         │
 │     veris run create → Veris spawns your agent in K8s       │
 │                     → Runs scenarios against it             │
 └─────────────────────────────────────────────────────────────┘
                             ↓
 ┌─────────────────────────────────────────────────────────────┐
-│  4. Monitor & Analyze                                       │
+│  5. Evaluate Results (optional)                             │
+│     veris evaluation-runs create → grades simulation traces │
+│     veris evaluation-runs status → view grading results     │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│  6. Monitor & Analyze                                       │
 │     veris run status → check progress                       │
 │     veris run logs → view events                            │
 └─────────────────────────────────────────────────────────────┘

{veris_cli-2.1.2 → veris_cli-2.2.0}/README.md RENAMED

@@ -104,18 +104,28 @@ This will:
 **Note:** On macOS, this uses `docker buildx` for multi-platform builds targeting `linux/amd64` (GKE platform).
-### 6. List Available Scenarios
+### 6. Generate Scenarios (Optional)
+You can write scenarios by hand (see [Local Development](#local-development--testing)) or generate them automatically:
+```bash
+# Generate 5 scenarios + graders using Claude Code
+veris scenarios generate --num 5
+```
+This launches a K8s job that explores your agent's source code and produces test scenarios and graders. Poll status with:
 ```bash
 veris scenarios list
 ```
-Or filter by visibility:
+### 7. List Available Scenarios
 ```bash
-veris scenarios list --visibility public
+veris scenarios list
 ```
-### 7. Create and Run a Simulation
+### 8. Create and Run a Simulation
 ```bash
 # Interactive mode (prompts for scenario and environment)
@@ -125,7 +135,7 @@ veris run create
 veris run create --scenario-set-id scenset_abc123 --env-id env_xyz789
 ```
-### 8. Monitor Your Run
+### 9. Monitor Your Run
 ```bash
 # Check status
@@ -141,7 +151,23 @@ veris run logs run_abc123
 veris run logs run_abc123 --follow
 ```
-### 9. Cancel a Run (if needed)
+### 10. Evaluate Results (Optional)
+Once a run completes and graders are available:
+```bash
+# List available graders
+veris eval list
+# Trigger evaluation (interactive prompts for run and grader)
+veris evaluation-runs create
+# Check evaluation status
+veris evaluation-runs list --run-id run_abc123
+veris evaluation-runs status evalrun_abc123 --run-id run_abc123
+```
+### 11. Cancel a Run (if needed)
 ```bash
 veris run cancel run_abc123
@@ -187,8 +213,31 @@ veris env list [--status ready]
 ### Scenarios
 ```bash
-# List scenarios
-veris scenarios list [--visibility public|private|org]
+# List scenario sets
+veris scenarios list [--env-id <id>]
+# Generate scenarios + graders via K8s job
+veris scenarios generate [--env-id <id>] [--num 5] [--image-tag latest]
+```
+### Eval (Graders)
+```bash
+# List graders for an environment
+veris eval list [--env-id <id>]
+```
+### Evaluation Runs
+```bash
+# Trigger grading on a completed run
+veris evaluation-runs create [--run-id <id>] [--grader-id <id>]
+# List evaluation runs for a run
+veris evaluation-runs list --run-id <id>
+# Get evaluation run status and results
+veris evaluation-runs status <eval-run-id> --run-id <id> [--watch]
 ```
 ### Runs
@@ -326,13 +375,25 @@ Each scenario runs in an isolated container with:
 └─────────────────────────────────────────────────────────────┘
                             ↓
 ┌─────────────────────────────────────────────────────────────┐
-│  3. Run Simulations                                         │
+│  3. Generate Scenarios (optional)                           │
+│     veris scenarios generate → Claude Code explores agent   │
+│                              → produces scenarios + graders │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│  4. Run Simulations                                         │
 │     veris run create → Veris spawns your agent in K8s       │
 │                     → Runs scenarios against it             │
 └─────────────────────────────────────────────────────────────┘
                             ↓
 ┌─────────────────────────────────────────────────────────────┐
-│  4. Monitor & Analyze                                       │
+│  5. Evaluate Results (optional)                             │
+│     veris evaluation-runs create → grades simulation traces │
+│     veris evaluation-runs status → view grading results     │
+└─────────────────────────────────────────────────────────────┘
+                            ↓
+┌─────────────────────────────────────────────────────────────┐
+│  6. Monitor & Analyze                                       │
 │     veris run status → check progress                       │
 │     veris run logs → view events                            │
 └─────────────────────────────────────────────────────────────┘

{veris_cli-2.1.2 → veris_cli-2.2.0}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "veris-cli"
-version = "2.1.2"
+version = "2.2.0"
 description = "CLI to connect local agents to the Veris backend"
 readme = "README.md"
 requires-python = ">=3.11"

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/api.py RENAMED

@@ -7,6 +7,28 @@ import httpx
 from veris_cli.config import Config
+class APIError(Exception):
+    """Raised when the backend returns an error response with details."""
+    def __init__(self, status_code: int, detail: str, url: str):
+        self.status_code = status_code
+        self.detail = detail
+        self.url = url
+        super().__init__(f"[{status_code}] {detail} ({url})")
+def _raise_for_status(response: httpx.Response) -> None:
+    """Like response.raise_for_status() but includes the response body."""
+    if response.is_success:
+        return
+    try:
+        body = response.json()
+        detail = body.get("detail", response.text)
+    except Exception:
+        detail = response.text or response.reason_phrase
+    raise APIError(response.status_code, detail, str(response.url))
 class VerisAPI:
     """Simple HTTP client for Veris backend API."""
@@ -30,7 +52,7 @@ class VerisAPI:
                 "/v1/environments",
                 json={"name": name, "description": description},
             )
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     def create_environment_tag(self, environment_id: str, tag: str = "latest") -> dict[str, Any]:
@@ -40,7 +62,7 @@ class VerisAPI:
                 f"/v1/environments/{environment_id}/tags",
                 json={"tag": tag},
             )
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     def list_environments(
@@ -52,33 +74,33 @@ class VerisAPI:
             params["status"] = status
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.get("/v1/environments", params=params)
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     def delete_environment(self, env_id: str) -> None:
         """Delete an environment."""
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.delete(f"/v1/environments/{env_id}")
-            response.raise_for_status()
+            _raise_for_status(response)
     # Scenario Sets
     def list_scenario_sets(
-        self, visibility: Optional[str] = None, limit: int = 100, skip: int = 0
+        self, environment_id: Optional[str] = None, limit: int = 100, skip: int = 0
     ) -> list[dict[str, Any]]:
         """List scenario sets."""
         params = {"limit": limit, "skip": skip}
-        if visibility:
-            params["visibility"] = visibility
+        if environment_id:
+            params["environment_id"] = environment_id
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.get("/v1/scenario-sets", params=params)
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     def get_scenario_set(self, set_id: str) -> dict[str, Any]:
         """Get scenario set details."""
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.get(f"/v1/scenario-sets/{set_id}")
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     # Runs
@@ -100,7 +122,7 @@ class VerisAPI:
                     "config": config or {},
                 },
             )
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     def list_runs(
@@ -112,26 +134,92 @@ class VerisAPI:
             params["status"] = status
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.get("/v1/runs", params=params)
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     def get_run(self, run_id: str) -> dict[str, Any]:
         """Get run details."""
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.get(f"/v1/runs/{run_id}")
-            response.raise_for_status()
+            _raise_for_status(response)
             return response.json()
     def cancel_run(self, run_id: str) -> None:
         """Cancel a run."""
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.delete(f"/v1/runs/{run_id}")
-            response.raise_for_status()
+            _raise_for_status(response)
     def get_run_events(self, run_id: str, limit: int = 100, offset: int = 0) -> dict[str, Any]:
         """Get run events/logs."""
         params = {"limit": limit, "offset": offset}
         with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
             response = client.get(f"/v1/runs/{run_id}/events", params=params)
-            response.raise_for_status()
+            _raise_for_status(response)
+            return response.json()
+    # Scenario Generation
+    def generate_scenario_set(
+        self,
+        environment_id: str,
+        num_scenarios: int = 5,
+        image_tag: Optional[str] = None,
+    ) -> dict[str, Any]:
+        """Trigger async scenario + grader generation via K8s job."""
+        payload: dict[str, Any] = {
+            "environment_id": environment_id,
+            "num_scenarios": num_scenarios,
+        }
+        if image_tag:
+            payload["image_tag"] = image_tag
+        with httpx.Client(base_url=self.base_url, headers=self._headers(), timeout=30) as client:
+            response = client.post("/v1/scenario-sets/generate", json=payload)
+            _raise_for_status(response)
+            return response.json()
+    # Graders
+    def list_graders(
+        self,
+        environment_id: str,
+        scenario_set_id: Optional[str] = None,
+        limit: int = 20,
+        offset: int = 0,
+    ) -> dict[str, Any]:
+        """List graders for an environment."""
+        params: dict[str, Any] = {
+            "environment_id": environment_id,
+            "limit": limit,
+            "offset": offset,
+        }
+        if scenario_set_id:
+            params["scenario_set_id"] = scenario_set_id
+        with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
+            response = client.get("/v1/graders", params=params)
+            _raise_for_status(response)
+            return response.json()
+    # Evaluations
+    def trigger_evaluation(self, run_id: str, grader_id: str) -> dict[str, Any]:
+        """Trigger grading on a completed run."""
+        with httpx.Client(base_url=self.base_url, headers=self._headers(), timeout=30) as client:
+            response = client.post(
+                f"/v1/runs/{run_id}/evaluate",
+                params={"grader_id": grader_id},
+            )
+            _raise_for_status(response)
+            return response.json()
+    def list_evaluation_runs(self, run_id: str, limit: int = 20, offset: int = 0) -> dict[str, Any]:
+        """List evaluation runs for a given run."""
+        params = {"limit": limit, "offset": offset}
+        with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
+            response = client.get(f"/v1/runs/{run_id}/evaluation-runs", params=params)
+            _raise_for_status(response)
+            return response.json()
+    def get_evaluation_run(self, run_id: str, eval_run_id: str) -> dict[str, Any]:
+        """Get evaluation run details including per-simulation results."""
+        with httpx.Client(base_url=self.base_url, headers=self._headers()) as client:
+            response = client.get(f"/v1/runs/{run_id}/evaluation-runs/{eval_run_id}")
+            _raise_for_status(response)
             return response.json()
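
The `api.py` hunks above replace every `response.raise_for_status()` call with a helper that surfaces the backend's `detail` field. The following is a minimal sketch of that behaviour, not the package's code: `FakeResponse` is a hypothetical stand-in exposing only the `httpx.Response` attributes the helper reads, the example URL is made up, and `describe` is an illustrative wrapper.

```python
import json
from dataclasses import dataclass


class APIError(Exception):
    """Same shape as the exception added in the diff."""

    def __init__(self, status_code: int, detail: str, url: str):
        self.status_code = status_code
        self.detail = detail
        self.url = url
        super().__init__(f"[{status_code}] {detail} ({url})")


@dataclass
class FakeResponse:
    """Hypothetical stand-in for httpx.Response (illustration only)."""

    status_code: int
    text: str
    url: str = "https://backend.example/v1/runs"  # made-up URL
    reason_phrase: str = ""

    @property
    def is_success(self) -> bool:
        # httpx treats 2xx status codes as success.
        return 200 <= self.status_code < 300

    def json(self):
        return json.loads(self.text)


def raise_for_status(response) -> None:
    """Same logic as the diff's _raise_for_status, duck-typed."""
    if response.is_success:
        return
    try:
        body = response.json()
        detail = body.get("detail", response.text)
    except Exception:
        # Non-JSON body, or JSON that is not an object: fall back to raw text.
        detail = response.text or response.reason_phrase
    raise APIError(response.status_code, detail, str(response.url))


def describe(response) -> str:
    """Illustrative wrapper: return the error message, or 'ok' on success."""
    try:
        raise_for_status(response)
    except APIError as exc:
        return str(exc)
    return "ok"
```

With this sketch, a JSON error body yields its `detail` text, while an empty body falls back to the HTTP reason phrase. Callers that only catch `Exception`, as the CLI commands in this release do, print the richer message without further changes.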

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/cli.py RENAMED

@@ -13,6 +13,7 @@ from pathlib import Path
 from urllib.parse import parse_qs, urlparse
 import click
+import httpx
 from veris_cli import output, prompts, templates
 from veris_cli.api import VerisAPI
@@ -224,6 +225,11 @@ def init(name: str):
     except ValueError as e:
         output.print_error(str(e))
         output.print_info("You can create the environment later with 'veris env push'")
+    except httpx.ConnectError:
+        output.print_error(f"Could not connect to backend at {Config().get_backend_url()}")
+        output.print_info(
+            "Is the backend running? You can create the environment later with 'veris env push'"
+        )
     except Exception as e:
         output.print_error(f"Failed to create environment: {e}")
         output.print_info("You can create the environment later with 'veris env push'")
@@ -275,7 +281,7 @@ def env_build(tag: str, no_cache: bool):
         username = push_creds.get("username", "_token")
         password = push_creds.get("password", "")
-        registry = push_creds.get("registry", "gcr.io")
+        registry = push_creds.get("registry", "us-docker.pkg.dev")
         output.print_success(f"Tag created: {tag}")
         output.print_info(f"Building image: {image_uri}\n")
@@ -346,7 +352,7 @@ def env_push(tag: str, no_cache: bool):
         username = push_creds.get("username", "_token")
         password = push_creds.get("password", "")
-        registry = push_creds.get("registry", "gcr.io")
+        registry = push_creds.get("registry", "us-docker.pkg.dev")
         output.print_success(f"Tag created: {tag}")
         output.print_info(f"Building and pushing image: {image_uri}\n")
@@ -408,12 +414,12 @@ def scenarios():
 @scenarios.command(name="list")
-@click.option("--visibility", default=None, help="Filter by visibility (public/private/org)")
-def scenarios_list(visibility: str):
+@click.option("--env-id", default=None, help="Filter by environment ID")
+def scenarios_list(env_id: str):
     """List scenarios"""
     try:
         api = VerisAPI()
-        result = api.list_scenario_sets(visibility=visibility)
+        result = api.list_scenario_sets(environment_id=env_id)
         output.print_scenario_sets_table(result)
     except ValueError as e:
         output.print_error(str(e))
@@ -423,6 +429,51 @@ def scenarios_list(visibility: str):
         sys.exit(1)
+@scenarios.command(name="generate")
+@click.option("--env-id", default=None, help="Environment ID")
+@click.option("--num", default=5, help="Number of scenarios to generate (default: 5)")
+@click.option("--image-tag", default=None, help="Image tag to use (default: latest)")
+def scenarios_generate(env_id: str, num: int, image_tag: str):
+    """Generate scenarios + grader via K8s job.
+    Launches an async job that explores your agent code, generates test
+    scenarios and a grader definition. Poll with 'veris scenarios list'
+    to check when generation is complete.
+    """
+    try:
+        api = VerisAPI()
+        if not env_id:
+            project_config = ProjectConfig()
+            env_id = project_config.get_environment_id()
+        if not env_id:
+            result = api.list_environments(status="ready")
+            env_id = prompts.select_environment(result.get("environments", []))
+            if not env_id:
+                output.print_error("No environment selected")
+                sys.exit(1)
+        output.print_info(f"Generating {num} scenario(s) for environment {env_id}...")
+        result = api.generate_scenario_set(
+            environment_id=env_id,
+            num_scenarios=num,
+            image_tag=image_tag,
+        )
+        set_id = result.get("id", "")
+        output.print_success(f"Scenario generation started: {set_id}")
+        output.print_info("Status: generating")
+        output.print_info("Poll with 'veris scenarios list' to check when generation is complete")
+    except ValueError as e:
+        output.print_error(str(e))
+        sys.exit(1)
+    except Exception as e:
+        output.print_error(f"Failed to generate scenarios: {e}")
+        sys.exit(1)
 # Run commands
 @cli.group()
 def run():
@@ -558,6 +609,161 @@ def run_cancel(run_id: str):
         sys.exit(1)
+# Evaluation-run commands
+@cli.group(name="evaluation-runs")
+def eval_group():
+    """Evaluation run commands"""
+    pass
+@eval_group.command(name="create")
+@click.option("--run-id", default=None, help="Run ID to evaluate")
+@click.option("--grader-id", default=None, help="Grader ID to use")
+def eval_create(run_id: str, grader_id: str):
+    """Trigger grading on a completed run.
+    Launches an async K8s grading job that evaluates every simulation
+    in the run against the specified grader. Poll with 'veris eval list'
+    to check progress.
+    """
+    try:
+        api = VerisAPI()
+        if not run_id:
+            result = api.list_runs(status="completed")
+            runs = result.get("runs", [])
+            if not runs:
+                output.print_error("No completed runs found")
+                sys.exit(1)
+            choices = [{"id": r.get("id", ""), "title": r.get("id", "")} for r in runs]
+            run_id = prompts.select_from_list("Select a completed run:", choices)
+            if not run_id:
+                output.print_error("No run selected")
+                sys.exit(1)
+        if not grader_id:
+            run_data = api.get_run(run_id)
+            env_id = run_data.get("environment_id")
+            if not env_id:
+                output.print_error("Could not determine environment from run")
+                sys.exit(1)
+            graders_result = api.list_graders(environment_id=env_id)
+            graders = graders_result.get("graders", [])
+            if not graders:
+                output.print_error(
+                    f"No graders found for environment {env_id}. "
+                    "Generate scenarios first with 'veris scenarios generate'."
+                )
+                sys.exit(1)
+            choices = [
+                {
+                    "id": g.get("id", ""),
+                    "title": f"{g.get('id', '')} (tags: {g.get('tags', [])})",
+                }
+                for g in graders
+            ]
+            grader_id = prompts.select_from_list("Select a grader:", choices)
+            if not grader_id:
+                output.print_error("No grader selected")
+                sys.exit(1)
+        output.print_info(f"Triggering evaluation on run {run_id} with grader {grader_id}...")
+        result = api.trigger_evaluation(run_id=run_id, grader_id=grader_id)
+        eval_run_id = result.get("evaluation_run_id", "")
+        output.print_success(f"Evaluation started: {eval_run_id}")
+        output.print_info(f"Check progress with 'veris evaluation-runs list --run-id {run_id}'")
+        output.print_info(
+            f"View results with 'veris evaluation-runs status --run-id {run_id} {eval_run_id}'"
+        )
+    except ValueError as e:
+        output.print_error(str(e))
+        sys.exit(1)
+    except Exception as e:
+        output.print_error(f"Failed to create evaluation: {e}")
+        sys.exit(1)
+@eval_group.command(name="list")
+@click.option("--run-id", required=True, help="Run ID to list evaluations for")
+def eval_list(run_id: str):
+    """List evaluation runs for a given run."""
+    try:
+        api = VerisAPI()
+        result = api.list_evaluation_runs(run_id=run_id)
+        output.print_evaluation_runs_table(result.get("evaluation_runs", []))
+    except ValueError as e:
+        output.print_error(str(e))
+        sys.exit(1)
+    except Exception as e:
+        output.print_error(f"Failed to list evaluations: {e}")
+        sys.exit(1)
+@eval_group.command(name="status")
+@click.argument("eval_run_id")
+@click.option("--run-id", required=True, help="Parent run ID")
+@click.option("--watch", is_flag=True, help="Poll every 5 seconds until complete")
+def eval_status(eval_run_id: str, run_id: str, watch: bool):
+    """Get evaluation run status and results."""
+    try:
+        api = VerisAPI()
+        if watch:
+            while True:
+                data = api.get_evaluation_run(run_id=run_id, eval_run_id=eval_run_id)
+                output.print_evaluation_run_details(data)
+                status = data.get("status", "")
+                if status in ["completed", "failed"]:
+                    break
+                time.sleep(5)
+        else:
+            data = api.get_evaluation_run(run_id=run_id, eval_run_id=eval_run_id)
+            output.print_evaluation_run_details(data)
+    except ValueError as e:
+        output.print_error(str(e))
+        sys.exit(1)
+    except Exception as e:
+        output.print_error(f"Failed to get evaluation status: {e}")
+        sys.exit(1)
+# Eval commands (graders)
+@cli.group(name="eval")
+def eval_graders():
+    """Eval commands (graders)"""
+    pass
+@eval_graders.command(name="list")
+@click.option("--env-id", default=None, help="Environment ID (uses project config if omitted)")
+def graders_list(env_id: str):
+    """List graders for an environment."""
+    try:
+        api = VerisAPI()
+        if not env_id:
+            project_config = ProjectConfig()
+            env_id = project_config.get_environment_id()
+        if not env_id:
+            output.print_error("No environment ID. Use --env-id or run 'veris init' first.")
+            sys.exit(1)
+        result = api.list_graders(environment_id=env_id)
+        output.print_graders_table(result.get("graders", []))
+    except ValueError as e:
+        output.print_error(str(e))
+        sys.exit(1)
+    except Exception as e:
+        output.print_error(f"Failed to list graders: {e}")
+        sys.exit(1)
 def _load_dotenv(path: Path) -> dict[str, str]:
     """Parse and return environment variables from a .env file."""
     env = {}
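
The `--watch` flag added to `evaluation-runs status` polls every 5 seconds until the status is `completed` or `failed`. Below is a minimal sketch of that loop with the fetch, render and sleep steps injected so it can run without a backend. The function name `watch` and its parameters are illustrative, not part of the package.

```python
import time
from typing import Any, Callable

# The two statuses the diff's loop treats as terminal.
TERMINAL_STATUSES = {"completed", "failed"}


def watch(
    fetch: Callable[[], dict[str, Any]],
    render: Callable[[dict[str, Any]], None],
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Fetch, render, and repeat until a terminal status is seen."""
    while True:
        data = fetch()
        render(data)
        if data.get("status", "") in TERMINAL_STATUSES:
            return data
        sleep(interval)


# Usage with canned responses instead of api.get_evaluation_run(...).
responses = iter(
    [{"status": "running"}, {"status": "running"}, {"status": "completed"}]
)
seen: list[str] = []
final = watch(
    fetch=lambda: next(responses),
    render=lambda d: seen.append(d["status"]),
    sleep=lambda _seconds: None,  # skip real waiting in the example
)
```

As in the diff, the loop has no timeout. If the backend were to report any other terminal status, it would keep polling; whether such a status exists is not visible from this diff.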

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/output.py RENAMED

@@ -50,7 +50,7 @@ def print_scenario_sets_table(scenario_sets: list[dict[str, Any]]) -> None:
     table.add_column("ID", style="cyan")
     table.add_column("Title", style="green")
     table.add_column("Scenarios", style="blue")
-    table.add_column("Visibility", style="yellow")
+    table.add_column("Environment", style="yellow")
     table.add_column("Description", style="white")
     for ss in scenario_sets:
@@ -59,7 +59,7 @@ def print_scenario_sets_table(scenario_sets: list[dict[str, Any]]) -> None:
             ss.get("id", ""),
             ss.get("title", ""),
             str(ss.get("scenario_count", 0)),
-            ss.get("visibility", ""),
+            ss.get("environment_id") or "—",
             desc[:50] + "..." if len(desc) > 50 else desc,
         )
@@ -119,3 +119,96 @@ def print_run_events(events: list[dict[str, Any]]) -> None:
         console.print(
             f"[dim]{timestamp}[/dim] [{level_color}]{service}:{event_type}[/{level_color}] {data}"
         )
+def print_evaluation_runs_table(eval_runs: list[dict[str, Any]]) -> None:
+    """Print evaluation runs in a table."""
+    if not eval_runs:
+        console.print("No evaluation runs found.")
+        return
+    table = Table(title="Evaluation Runs")
+    table.add_column("ID", style="cyan")
+    table.add_column("Grader ID", style="green")
+    table.add_column("Status", style="yellow")
+    table.add_column("Total", style="blue")
+    table.add_column("Completed", style="green")
+    table.add_column("Failed", style="red")
+    table.add_column("Created", style="magenta")
+    for er in eval_runs:
+        table.add_row(
+            er.get("id", ""),
+            er.get("grader_id", ""),
+            er.get("status", ""),
+            str(er.get("total_evaluations", 0)),
+            str(er.get("completed_evaluations", 0)),
+            str(er.get("failed_evaluations", 0)),
+            er.get("created_at", ""),
+        )
+    console.print(table)
+def print_evaluation_run_details(data: dict[str, Any]) -> None:
+    """Print detailed evaluation run info."""
+    console.print("\n[bold]Evaluation Run[/bold]")
+    console.print(f"ID:       [cyan]{data.get('id', '')}[/cyan]")
+    console.print(f"Run:      [blue]{data.get('run_id', '')}[/blue]")
+    console.print(f"Grader:   [green]{data.get('grader_id', '')}[/green]")
+    console.print(f"Status:   [yellow]{data.get('status', '')}[/yellow]")
+    console.print(
+        f"Progress: {data.get('completed_evaluations', 0)}"
+        f"/{data.get('total_evaluations', 0)} completed"
+        f", {data.get('failed_evaluations', 0)} failed"
+    )
+    console.print(f"Created:  [magenta]{data.get('created_at', '')}[/magenta]")
+    evaluations = data.get("evaluations", [])
+    if evaluations:
+        console.print(f"\n[bold]Evaluations ({len(evaluations)})[/bold]")
+        table = Table()
+        table.add_column("Simulation ID", style="cyan")
+        table.add_column("Status", style="yellow")
+        table.add_column("Result", style="white", max_width=60)
+        for ev in evaluations:
+            result_str = ""
+            result = ev.get("result")
+            if result:
+                import json
+                result_str = json.dumps(result, indent=None)[:60]
+            table.add_row(
+                ev.get("simulation_id", ""),
+                ev.get("status", ""),
+                result_str,
+            )
+        console.print(table)
+def print_graders_table(graders: list[dict[str, Any]]) -> None:
+    """Print graders in a table."""
+    if not graders:
+        console.print("No graders found.")
+        return
+    table = Table(title="Graders")
+    table.add_column("ID", style="cyan")
+    table.add_column("Environment", style="blue")
+    table.add_column("Scenario Set", style="green")
+    table.add_column("Tags", style="yellow")
+    table.add_column("Created", style="magenta")
+    for g in graders:
+        tags = g.get("tags") or []
+        table.add_row(
+            g.get("id", ""),
+            g.get("environment_id", ""),
+            g.get("scenario_set_id") or "global",
+            ", ".join(tags) if tags else "",
+            g.get("created_at", ""),
+        )
+    console.print(table)
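
The new table printers apply a few fallbacks before handing values to `rich`: a missing scenario set is shown as `global`, missing tags become an empty cell, and evaluation results are serialized and cut to 60 characters. The sketch below isolates those rules as pure functions. The names `grader_row` and `result_cell` are hypothetical and the sample IDs are made up.

```python
import json
from typing import Any


def grader_row(g: dict[str, Any]) -> tuple[str, str, str, str, str]:
    """Build one Graders table row with the same fallbacks as the diff."""
    tags = g.get("tags") or []
    return (
        g.get("id", ""),
        g.get("environment_id", ""),
        g.get("scenario_set_id") or "global",  # None or "" means not tied to a set
        ", ".join(tags) if tags else "",
        g.get("created_at", ""),
    )


def result_cell(result: Any, width: int = 60) -> str:
    """Serialize an evaluation result and truncate it to the column width."""
    if not result:
        return ""
    return json.dumps(result, indent=None)[:width]


# Made-up sample data for illustration.
row = grader_row(
    {"id": "grader_1", "environment_id": "env_1", "scenario_set_id": None, "tags": None}
)
short = result_cell({"score": 1})
long = result_cell({"notes": "x" * 100})
```

Note that truncation happens on the serialized string, so a long result is cut mid-JSON rather than summarized.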

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/prompts.py RENAMED

@@ -79,3 +79,16 @@ def prompt_environment_name() -> Optional[str]:
     ).ask()
     return answer
+def select_from_list(prompt: str, items: list[dict[str, Any]]) -> Optional[str]:
+    """Generic interactive selection from a list of {id, title} dicts."""
+    if not items:
+        return None
+    choices = [
+        questionary.Choice(title=item.get("title", item.get("id", "")), value=item.get("id", ""))
+        for item in items
+    ]
+    return questionary.select(prompt, choices=choices).ask()

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/scripts/docker_build.sh RENAMED

@@ -27,6 +27,12 @@ TEMP_DOCKER_CONFIG=$(mktemp -d)
 export DOCKER_CONFIG="$TEMP_DOCKER_CONFIG"
 trap "rm -rf $TEMP_DOCKER_CONFIG" EXIT
+# Preserve CLI plugins (buildx, etc.) from the default config directory
+DEFAULT_DOCKER_CONFIG="${HOME}/.docker"
+if [ -d "$DEFAULT_DOCKER_CONFIG/cli-plugins" ]; then
+    ln -s "$DEFAULT_DOCKER_CONFIG/cli-plugins" "$TEMP_DOCKER_CONFIG/cli-plugins"
+fi
 # Login to registry first so we can pull the base image
 echo "Authenticating with Docker registry..."
 echo "  Registry: $REGISTRY"
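
Pointing `DOCKER_CONFIG` at a fresh temporary directory hides anything under `~/.docker`, including the `cli-plugins` directory where plugins such as `buildx` are commonly installed. The added shell lines link that directory back in. The sketch below expresses the same step in Python purely for illustration; the script itself is shell, and `make_temp_docker_config` is a hypothetical name.

```python
import os
import tempfile
from pathlib import Path


def make_temp_docker_config(home: Path) -> Path:
    """Create an isolated Docker config dir that still exposes CLI plugins."""
    temp_config = Path(tempfile.mkdtemp())
    plugins = home / ".docker" / "cli-plugins"
    if plugins.is_dir():
        # Equivalent of: ln -s "$DEFAULT_DOCKER_CONFIG/cli-plugins" "$TEMP_DOCKER_CONFIG/cli-plugins"
        os.symlink(plugins, temp_config / "cli-plugins")
    return temp_config


# Usage against a throwaway "home" directory rather than the real one.
fake_home = Path(tempfile.mkdtemp())
(fake_home / ".docker" / "cli-plugins").mkdir(parents=True)
(fake_home / ".docker" / "cli-plugins" / "docker-buildx").touch()
config_dir = make_temp_docker_config(fake_home)
```

Because only the plugin directory is linked, stored credentials in the default config stay out of the temporary directory, which appears to be the point of isolating `DOCKER_CONFIG` in the first place.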

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/templates.py RENAMED

@@ -1,7 +1,7 @@
 """Static file templates for veris init."""
 DOCKERFILE_SANDBOX = """# Extends veris-gvisor base with your agent code
-FROM gcr.io/veris-ai-dev/veris-gvisor:latest
+FROM us-docker.pkg.dev/veris-ai-dev/veris-sandbox-dev/veris-gvisor:latest
 # Copy agent code and dependencies
 # NOTE: Build context is project root, so paths are relative to project root

{veris_cli-2.1.2 → veris_cli-2.2.0}/.gitignore RENAMED

File without changes

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/__init__.py RENAMED

File without changes

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/config.py RENAMED

File without changes

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/scripts/__init__.py RENAMED

File without changes

{veris_cli-2.1.2 → veris_cli-2.2.0}/src/veris_cli/scripts/docker_push.sh RENAMED

File without changes
