PyPI - applied-cli - Versions diffs - 0.6.6__tar.gz → 0.6.7__tar.gz - Mend

applied-cli 0.6.6.tar.gz → 0.6.7.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (70)

{applied_cli-0.6.6 → applied_cli-0.6.7}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: applied-cli
-Version: 0.6.6
+Version: 0.6.7
 Summary: CLI and shared client library for Applied Labs AI support agents
 Author: Applied Labs
 License-Expression: MIT
@@ -80,6 +80,57 @@ applied metrics --metric-name conversation.resolve --start 2026-04-01 --end 2026
 object. `analytics` returns grouped rows and currently supports `--metrics count`.
 Raw analytics SQL is not available through the public CLI surface.
+## Benchmarks & Scenarios
+A **benchmark** is a named regression suite; a **scenario** is one test conversation
+(built from a real `input_conversation_id`) that can belong to one or more benchmarks.
+The typical loop is: build a suite → run it → review the pass rate → fix → re-run.
+```bash
+# Inspect benchmarks and their scenarios
+applied benchmarks --agent-id <agent_id> --format json
+applied benchmark <benchmark_id> --format json
+applied scenarios --benchmark-id <benchmark_id> --format json
+# Build a suite
+applied benchmark-create --agent-id <agent_id> --name "Cancel Regression"
+applied scenario-create --input-conversation-id <conversation_id> --name "<name>" \
+  --benchmark-id <benchmark_id>
+# Port a suite to another agent (e.g. email -> chat). Cross-agent recreates the
+# scenarios under the destination agent; same-agent just tags them in.
+# Dry-run by default; add --apply to write.
+applied benchmark-clone <source_benchmark_id> --dest-benchmark-name "Chat Regression" \
+  --target-agent-id <chat_agent_id> --apply
+# Run a benchmark and wait for results in one command.
+# --contact-email runs as a contact that has an email, fixing
+# "Email is not present in the conversation" on test conversations.
+applied scenario-bulk-run --benchmark-id <benchmark_id> \
+  --contact-email test@example.com --wait
+applied scenario-bulk-status <job_id> --include-runs --format json
+# Kill a stuck bulk run (deletes its queued/running runs; finished runs preserved)
+applied scenario-bulk-cancel <job_id> --apply
+# Review pass/fail health (pass_status reflects the latest run per scenario)
+applied benchmark-results <benchmark_id> --format json
+# Rate scenarios as you evaluate
+applied scenario-update <scenario_id> --pass-status pass --feedback "<note>"
+# Safe delete — refuses to wipe scenarios unless you opt in
+applied benchmark-delete <benchmark_id> --detach-scenarios   # preserve scenarios
+applied benchmark-delete <benchmark_id> --force              # cascade delete
+# Recover deleted benchmark/scenario rows from a local PITR export
+applied scenario-recover-catalog --recovery-dir <dir> --apply
+```
+Deleting a benchmark cascades and permanently deletes its scenarios and runs, so
+`benchmark-delete` refuses a non-empty benchmark unless you pass `--detach-scenarios`
+(unlink the scenarios first so they survive under their agent) or `--force`.
 ## Library Usage
 ```python
@@ -113,6 +164,11 @@ conversations = await tools.conversation_query(
 | `analytics_report` | Read standard dashboard/report analytics views |
 | `analytics_query` | Aggregate supported conversation dimensions with count |
 | `metrics_query` | Roll up named metric events |
+| `benchmark_clone` | Copy all scenarios from one benchmark into another |
+| `benchmark_delete` | Delete a benchmark (guards against wiping scenarios) |
+| `benchmark_results` | Pass/fail/unrated tally and pass rate for a benchmark |
+| `scenario_bulk_run` | Run scenarios (contact override + wait-to-completion) |
+| `scenario_bulk_cancel` | Cancel a stuck bulk run's queued/running scenario runs |
 ## Examples
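
Note on the `benchmark-delete` guard described in the added README text above: the rule (refuse a non-empty benchmark unless `--detach-scenarios` or `--force` is passed) can be sketched in Python as follows. This is a minimal illustration, not the package's implementation; `Benchmark`, `NonEmptyBenchmarkError`, and `plan_benchmark_delete` are hypothetical names, and giving `detach_scenarios` precedence when both options are set is an assumption the diff does not confirm.

```python
from dataclasses import dataclass, field


class NonEmptyBenchmarkError(Exception):
    """Raised when a delete would wipe scenarios without an explicit opt-in."""


@dataclass
class Benchmark:
    # Hypothetical in-memory stand-in for a benchmark record.
    benchmark_id: str
    scenario_ids: list[str] = field(default_factory=list)


def plan_benchmark_delete(benchmark, *, detach_scenarios=False, force=False):
    """Return the action a guarded delete would take for this benchmark."""
    if not benchmark.scenario_ids:
        # Nothing to lose, so a plain delete is safe.
        return "delete"
    if detach_scenarios:
        # Assumption: the non-destructive option wins if both flags are set.
        return "detach_then_delete"
    if force:
        # Explicit opt-in to removing scenarios and runs with the benchmark.
        return "cascade_delete"
    raise NonEmptyBenchmarkError(
        f"benchmark {benchmark.benchmark_id} still has "
        f"{len(benchmark.scenario_ids)} scenario(s); "
        "pass detach_scenarios=True or force=True"
    )
```

Returning a plan instead of performing the delete keeps the decision easy to test in isolation, which may be roughly what `tests/test_benchmark_delete_guardrail.py` exercises, although the diff does not show that file's contents.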

{applied_cli-0.6.6 → applied_cli-0.6.7}/README.md RENAMED

@@ -54,6 +54,57 @@ applied metrics --metric-name conversation.resolve --start 2026-04-01 --end 2026
 object. `analytics` returns grouped rows and currently supports `--metrics count`.
 Raw analytics SQL is not available through the public CLI surface.
+## Benchmarks & Scenarios
+A **benchmark** is a named regression suite; a **scenario** is one test conversation
+(built from a real `input_conversation_id`) that can belong to one or more benchmarks.
+The typical loop is: build a suite → run it → review the pass rate → fix → re-run.
+```bash
+# Inspect benchmarks and their scenarios
+applied benchmarks --agent-id <agent_id> --format json
+applied benchmark <benchmark_id> --format json
+applied scenarios --benchmark-id <benchmark_id> --format json
+# Build a suite
+applied benchmark-create --agent-id <agent_id> --name "Cancel Regression"
+applied scenario-create --input-conversation-id <conversation_id> --name "<name>" \
+  --benchmark-id <benchmark_id>
+# Port a suite to another agent (e.g. email -> chat). Cross-agent recreates the
+# scenarios under the destination agent; same-agent just tags them in.
+# Dry-run by default; add --apply to write.
+applied benchmark-clone <source_benchmark_id> --dest-benchmark-name "Chat Regression" \
+  --target-agent-id <chat_agent_id> --apply
+# Run a benchmark and wait for results in one command.
+# --contact-email runs as a contact that has an email, fixing
+# "Email is not present in the conversation" on test conversations.
+applied scenario-bulk-run --benchmark-id <benchmark_id> \
+  --contact-email test@example.com --wait
+applied scenario-bulk-status <job_id> --include-runs --format json
+# Kill a stuck bulk run (deletes its queued/running runs; finished runs preserved)
+applied scenario-bulk-cancel <job_id> --apply
+# Review pass/fail health (pass_status reflects the latest run per scenario)
+applied benchmark-results <benchmark_id> --format json
+# Rate scenarios as you evaluate
+applied scenario-update <scenario_id> --pass-status pass --feedback "<note>"
+# Safe delete — refuses to wipe scenarios unless you opt in
+applied benchmark-delete <benchmark_id> --detach-scenarios   # preserve scenarios
+applied benchmark-delete <benchmark_id> --force              # cascade delete
+# Recover deleted benchmark/scenario rows from a local PITR export
+applied scenario-recover-catalog --recovery-dir <dir> --apply
+```
+Deleting a benchmark cascades and permanently deletes its scenarios and runs, so
+`benchmark-delete` refuses a non-empty benchmark unless you pass `--detach-scenarios`
+(unlink the scenarios first so they survive under their agent) or `--force`.
 ## Library Usage
 ```python
@@ -87,6 +138,11 @@ conversations = await tools.conversation_query(
 | `analytics_report` | Read standard dashboard/report analytics views |
 | `analytics_query` | Aggregate supported conversation dimensions with count |
 | `metrics_query` | Roll up named metric events |
+| `benchmark_clone` | Copy all scenarios from one benchmark into another |
+| `benchmark_delete` | Delete a benchmark (guards against wiping scenarios) |
+| `benchmark_results` | Pass/fail/unrated tally and pass rate for a benchmark |
+| `scenario_bulk_run` | Run scenarios (contact override + wait-to-completion) |
+| `scenario_bulk_cancel` | Cancel a stuck bulk run's queued/running scenario runs |
 ## Examples
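
Note on `benchmark_results`: the table row above describes a pass/fail/unrated tally plus a pass rate, and the README comment says `pass_status` reflects the latest run per scenario. A rough sketch of such a tally follows. The function name `summarize_pass_status`, the `"fail"` status value, and the choice to compute the rate over rated scenarios only are all assumptions; the diff does not state how the real tool defines its denominator.

```python
from collections import Counter


def summarize_pass_status(latest_status_by_scenario):
    """Tally pass/fail/unrated from a {scenario_id: status} mapping.

    Any status other than "pass" or "fail" (including None) counts as unrated.
    """
    counts = Counter({"pass": 0, "fail": 0, "unrated": 0})
    for status in latest_status_by_scenario.values():
        key = status if status in ("pass", "fail") else "unrated"
        counts[key] += 1

    # Assumption: unrated scenarios are excluded from the pass rate.
    rated = counts["pass"] + counts["fail"]
    pass_rate = counts["pass"] / rated if rated else None
    return {**counts, "pass_rate": pass_rate}
```

With two passing, one failing, and one unrated scenario, this sketch reports a pass rate of 2/3; if unrated scenarios were counted in the denominator instead, the same input would give 2/4.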

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: applied-cli
-Version: 0.6.6
+Version: 0.6.7
 Summary: CLI and shared client library for Applied Labs AI support agents
 Author: Applied Labs
 License-Expression: MIT
@@ -80,6 +80,57 @@ applied metrics --metric-name conversation.resolve --start 2026-04-01 --end 2026
 object. `analytics` returns grouped rows and currently supports `--metrics count`.
 Raw analytics SQL is not available through the public CLI surface.
+## Benchmarks & Scenarios
+A **benchmark** is a named regression suite; a **scenario** is one test conversation
+(built from a real `input_conversation_id`) that can belong to one or more benchmarks.
+The typical loop is: build a suite → run it → review the pass rate → fix → re-run.
+```bash
+# Inspect benchmarks and their scenarios
+applied benchmarks --agent-id <agent_id> --format json
+applied benchmark <benchmark_id> --format json
+applied scenarios --benchmark-id <benchmark_id> --format json
+# Build a suite
+applied benchmark-create --agent-id <agent_id> --name "Cancel Regression"
+applied scenario-create --input-conversation-id <conversation_id> --name "<name>" \
+  --benchmark-id <benchmark_id>
+# Port a suite to another agent (e.g. email -> chat). Cross-agent recreates the
+# scenarios under the destination agent; same-agent just tags them in.
+# Dry-run by default; add --apply to write.
+applied benchmark-clone <source_benchmark_id> --dest-benchmark-name "Chat Regression" \
+  --target-agent-id <chat_agent_id> --apply
+# Run a benchmark and wait for results in one command.
+# --contact-email runs as a contact that has an email, fixing
+# "Email is not present in the conversation" on test conversations.
+applied scenario-bulk-run --benchmark-id <benchmark_id> \
+  --contact-email test@example.com --wait
+applied scenario-bulk-status <job_id> --include-runs --format json
+# Kill a stuck bulk run (deletes its queued/running runs; finished runs preserved)
+applied scenario-bulk-cancel <job_id> --apply
+# Review pass/fail health (pass_status reflects the latest run per scenario)
+applied benchmark-results <benchmark_id> --format json
+# Rate scenarios as you evaluate
+applied scenario-update <scenario_id> --pass-status pass --feedback "<note>"
+# Safe delete — refuses to wipe scenarios unless you opt in
+applied benchmark-delete <benchmark_id> --detach-scenarios   # preserve scenarios
+applied benchmark-delete <benchmark_id> --force              # cascade delete
+# Recover deleted benchmark/scenario rows from a local PITR export
+applied scenario-recover-catalog --recovery-dir <dir> --apply
+```
+Deleting a benchmark cascades and permanently deletes its scenarios and runs, so
+`benchmark-delete` refuses a non-empty benchmark unless you pass `--detach-scenarios`
+(unlink the scenarios first so they survive under their agent) or `--force`.
 ## Library Usage
 ```python
@@ -113,6 +164,11 @@ conversations = await tools.conversation_query(
 | `analytics_report` | Read standard dashboard/report analytics views |
 | `analytics_query` | Aggregate supported conversation dimensions with count |
 | `metrics_query` | Roll up named metric events |
+| `benchmark_clone` | Copy all scenarios from one benchmark into another |
+| `benchmark_delete` | Delete a benchmark (guards against wiping scenarios) |
+| `benchmark_results` | Pass/fail/unrated tally and pass rate for a benchmark |
+| `scenario_bulk_run` | Run scenarios (contact override + wait-to-completion) |
+| `scenario_bulk_cancel` | Cancel a stuck bulk run's queued/running scenario runs |
 ## Examples
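
Note on `scenario_bulk_run` with wait-to-completion: the README pairs `scenario-bulk-run --wait` with `scenario-bulk-status <job_id>`, which suggests a poll-until-done loop. The sketch below shows that general pattern only. `wait_for_job`, the `fetch_status` callable, the state names in `TERMINAL_STATES`, and the interval and timeout defaults are all hypothetical; none of them are taken from the package.

```python
import time

# Assumed terminal state names; the real job states are not shown in the diff.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(
    fetch_status,
    job_id,
    *,
    poll_interval=2.0,
    timeout=600.0,
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Poll fetch_status(job_id) until it returns a terminal state.

    fetch_status is any callable returning the job's current state string.
    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(job_id)
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        sleep(poll_interval)
```

A timeout matters here because the README also documents `scenario-bulk-cancel` for stuck bulk runs; a caller could catch `TimeoutError` and decide whether to cancel the job.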

{applied_cli-0.6.6 → applied_cli-0.6.7}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "applied-cli"
-version = "0.6.6"
+version = "0.6.7"
 description = "CLI and shared client library for Applied Labs AI support agents"
 readme = "README.md"
 requires-python = ">=3.11"

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/__init__.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/agent_scoped_flows.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/auth.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/cli.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/client.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/conversation_lookup.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/conversations.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/credentials.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/flow_helpers.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/formatters.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/mcp.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/recovery.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/toolkit.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/tools.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/__init__.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/agents.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/articles.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/catalog.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/connectors.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/content.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/conversations.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/domains.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/flows.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/knowledge.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/manifest.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/products.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/scenarios.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/taxonomy.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli/v2/tickets.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli.egg-info/SOURCES.txt RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli.egg-info/dependency_links.txt RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli.egg-info/entry_points.txt RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli.egg-info/requires.txt RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/applied_cli.egg-info/top_level.txt RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/setup.cfg RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_agent_scoped_flows.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_audit_tools.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_auth_context.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_benchmark_clone.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_benchmark_delete_guardrail.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_benchmark_results.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_benchmark_scenario_tools.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_cli.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_cli_v2.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_client.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_client_v2.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_conversation_tools.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_flow_tools.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_knowledge_content_tools.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_recovery.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_scenario_bulk_cancel.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_scenario_bulk_run_contact.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_scenario_bulk_run_wait.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_toolkit_contract.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_agents.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_articles.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_catalog_and_mcp.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_connectors.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_content.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_conversations.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_flows.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_knowledge.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_products.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_scenarios.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_taxonomy.py RENAMED

File without changes

{applied_cli-0.6.6 → applied_cli-0.6.7}/tests/test_v2_tickets.py RENAMED

File without changes
