PyPI - adversarial-workflow - Versions diffs - 0.7.0__tar.gz → 0.9.0__tar.gz - Mend

adversarial-workflow 0.7.0__tar.gz → 0.9.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (71)

{adversarial_workflow-0.7.0 → adversarial_workflow-0.9.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: adversarial-workflow
-Version: 0.7.0
+Version: 0.9.0
 Summary: Multi-stage AI evaluation system for task plans, code review, and test validation
 Author: Fredrik Matheson
 License: MIT
@@ -57,7 +57,7 @@ Evaluate proposals, sort out ideas, and prevent "phantom work" (AI claiming to i
 - 🎯 **Tool-agnostic**: Use with Claude Code, Cursor, Aider, manual coding, or any workflow
 - ✨ **Interactive onboarding**: Guided setup wizard gets you started in <5 minutes
-## What's New in v0.6.3
+## What's New in v0.9.0
 ### Upgrade
@@ -65,6 +65,67 @@ Evaluate proposals, sort out ideas, and prevent "phantom work" (AI claiming to i
 pip install --upgrade adversarial-workflow
 ```
+### v0.9.0 - Run Library Evaluators
+**Finally run your installed evaluators!** Use the new `--evaluator` flag:
+```bash
+# Install an evaluator from the library
+adversarial library install google/gemini-flash
+# Run it with --evaluator flag
+adversarial evaluate --evaluator gemini-flash task.md
+adversarial evaluate -e gemini-flash task.md  # short form
+# Works with model_requirement for portable evaluators
+# Automatically resolves to best available model
+```
+**Key Features:**
+- Run any installed evaluator by name
+- Supports evaluator aliases
+- Automatic model resolution via `model_requirement`
+- Falls back to legacy `model` field if resolution fails
+- Full backward compatibility - no flag uses existing behavior
+See [Evaluator Library](#evaluator-library) for full documentation.
+### v0.8.1 - BugBot Fixes
+- **CI/CD compatibility**: `--category --dry-run` no longer hangs in non-TTY environments
+- **Proper exit codes**: Dry-run returns 1 when all previews fail
+- **Config robustness**: Non-dict YAML configs no longer crash
+### v0.7.0 - Evaluator Library
+Browse, install, and update evaluators from the community [adversarial-evaluator-library](https://github.com/movito/adversarial-evaluator-library):
+```bash
+# Browse available evaluators
+adversarial library list
+# Filter by provider or category
+adversarial library list --provider google
+adversarial library list --category quick-check
+# Install evaluators
+adversarial library install google/gemini-flash openai/fast-check
+# Check for updates
+adversarial library check-updates
+# Update installed evaluators
+adversarial library update --all
+```
+**Key Features:**
+- Index caching with 1-hour TTL for faster lookups
+- Offline support with stale cache fallback
+- Provenance tracking via `_meta` block in installed files
+- Diff preview before applying updates
+See [Evaluator Library](#evaluator-library) for full documentation.
 ### v0.6.3 - Configurable Timeouts
 - **Per-evaluator timeout**: Add `timeout: 300` to evaluator YAML for slow models like Mistral Large
@@ -429,7 +490,8 @@ adversarial health                      # Comprehensive system health check
 adversarial agent onboard               # Set up agent coordination system
 # Workflow
-adversarial evaluate task.md            # Phase 1: Evaluate plan
+adversarial evaluate task.md            # Phase 1: Evaluate plan (uses config.yml)
+adversarial evaluate -e <name> task.md  # Phase 1: Evaluate with installed evaluator
 adversarial split task.md               # Split large files into smaller parts
 adversarial split task.md --dry-run     # Preview split without creating files
 adversarial review                      # Phase 3: Review implementation
@@ -437,6 +499,99 @@ adversarial validate "pytest"           # Phase 4: Validate with tests
 adversarial list-evaluators             # List all available evaluators
 ```
+## Evaluator Library
+Browse and install pre-configured evaluators from the community [adversarial-evaluator-library](https://github.com/movito/adversarial-evaluator-library).
+### Quick Start
+```bash
+# Browse available evaluators
+adversarial library list
+# Filter by provider or category
+adversarial library list --provider google
+adversarial library list --category quick-check
+# Install an evaluator
+adversarial library install google/gemini-flash
+# Run it with --evaluator flag
+adversarial evaluate --evaluator gemini-flash task.md
+adversarial evaluate -e gemini-flash task.md  # short form
+```
+### Available Commands
+| Command | Description |
+|---------|-------------|
+| `adversarial library list` | Browse available evaluators |
+| `adversarial library install <provider>/<name>` | Install evaluator to project |
+| `adversarial library check-updates` | Check for updates to installed evaluators |
+| `adversarial library update <name>` | Update an evaluator (with diff preview) |
+### Running Installed Evaluators
+Use the `--evaluator` flag to run any installed evaluator:
+```bash
+# Run by name
+adversarial evaluate --evaluator plan-evaluator task.md
+# Short form
+adversarial evaluate -e security-reviewer task.md
+# Evaluators with model_requirement auto-resolve to best available model
+adversarial evaluate -e gemini-flash task.md
+```
+**How it works:**
+- Looks up evaluator in `.adversarial/evaluators/*.yml`
+- Uses the evaluator's model, prompt, and output settings
+- Supports evaluator aliases
+- If evaluator has `model_requirement`, resolves to best available model
+- Falls back to legacy `model` field if resolution fails
+**Without --evaluator flag**: Uses existing shell script behavior (backward compatible)
+### Philosophy: Copy, Don't Link
+Installed evaluators are **copied** to your project, not referenced at runtime:
+- Projects remain self-contained and work offline
+- You can customize your local copies freely
+- Updates are explicit and user-controlled
+### Provenance Tracking
+Installed evaluators include metadata for tracking updates:
+```yaml
+_meta:
+  source: adversarial-evaluator-library
+  source_path: google/gemini-flash
+  version: "1.2.0"
+  installed: "2026-02-03T10:00:00Z"
+name: gemini-flash
+# ... rest of evaluator config
+```
+### Options
+```bash
+# Bypass cache (1-hour TTL by default)
+adversarial library list --no-cache
+# Force overwrite existing files
+adversarial library install google/gemini-flash --force
+# Update all outdated evaluators
+adversarial library update --all
+# Preview changes without applying
+adversarial library update gemini-flash --diff-only
+```
 ## Custom Evaluators
 Starting with v0.6.0, you can define project-specific evaluators without modifying the package.
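The v0.7.0 notes above mention index caching with a 1-hour TTL and a stale-cache fallback for offline use. The package's cache code is not part of this excerpt, so the following is only a minimal sketch of that pattern. `load_index`, the `fetch` callable, and the JSON cache layout are hypothetical names chosen for illustration.

```python
import json
import time
from pathlib import Path
from typing import Callable

CACHE_TTL_SECONDS = 3600  # 1-hour TTL, as stated in the release notes


def load_index(cache_file: Path, fetch: Callable[[], dict], no_cache: bool = False) -> dict:
    """Return the library index, preferring a fresh cache and falling back to a stale one.

    Hypothetical helper: the real package may structure this differently.
    """
    cached = None
    if cache_file.exists():
        try:
            cached = json.loads(cache_file.read_text())
        except (OSError, json.JSONDecodeError):
            cached = None  # an unreadable cache is treated as missing

    # Fresh cache hit: skip the fetch unless the caller asked to bypass the cache.
    if cached and not no_cache and time.time() - cached["fetched_at"] < CACHE_TTL_SECONDS:
        return cached["index"]

    try:
        index = fetch()
    except OSError:
        # Offline or fetch failed: serve stale data if any exists.
        if cached:
            return cached["index"]
        raise

    cache_file.write_text(json.dumps({"fetched_at": time.time(), "index": index}))
    return index
```

In this sketch a `--no-cache` style bypass still falls back to stale data when the fetch fails. Whether the real CLI behaves that way is not shown in the diff.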

{adversarial_workflow-0.7.0 → adversarial_workflow-0.9.0}/README.md RENAMED

@@ -20,7 +20,7 @@ Evaluate proposals, sort out ideas, and prevent "phantom work" (AI claiming to i
 - 🎯 **Tool-agnostic**: Use with Claude Code, Cursor, Aider, manual coding, or any workflow
 - ✨ **Interactive onboarding**: Guided setup wizard gets you started in <5 minutes
-## What's New in v0.6.3
+## What's New in v0.9.0
 ### Upgrade
@@ -28,6 +28,67 @@ Evaluate proposals, sort out ideas, and prevent "phantom work" (AI claiming to i
 pip install --upgrade adversarial-workflow
 ```
+### v0.9.0 - Run Library Evaluators
+**Finally run your installed evaluators!** Use the new `--evaluator` flag:
+```bash
+# Install an evaluator from the library
+adversarial library install google/gemini-flash
+# Run it with --evaluator flag
+adversarial evaluate --evaluator gemini-flash task.md
+adversarial evaluate -e gemini-flash task.md  # short form
+# Works with model_requirement for portable evaluators
+# Automatically resolves to best available model
+```
+**Key Features:**
+- Run any installed evaluator by name
+- Supports evaluator aliases
+- Automatic model resolution via `model_requirement`
+- Falls back to legacy `model` field if resolution fails
+- Full backward compatibility - no flag uses existing behavior
+See [Evaluator Library](#evaluator-library) for full documentation.
+### v0.8.1 - BugBot Fixes
+- **CI/CD compatibility**: `--category --dry-run` no longer hangs in non-TTY environments
+- **Proper exit codes**: Dry-run returns 1 when all previews fail
+- **Config robustness**: Non-dict YAML configs no longer crash
+### v0.7.0 - Evaluator Library
+Browse, install, and update evaluators from the community [adversarial-evaluator-library](https://github.com/movito/adversarial-evaluator-library):
+```bash
+# Browse available evaluators
+adversarial library list
+# Filter by provider or category
+adversarial library list --provider google
+adversarial library list --category quick-check
+# Install evaluators
+adversarial library install google/gemini-flash openai/fast-check
+# Check for updates
+adversarial library check-updates
+# Update installed evaluators
+adversarial library update --all
+```
+**Key Features:**
+- Index caching with 1-hour TTL for faster lookups
+- Offline support with stale cache fallback
+- Provenance tracking via `_meta` block in installed files
+- Diff preview before applying updates
+See [Evaluator Library](#evaluator-library) for full documentation.
 ### v0.6.3 - Configurable Timeouts
 - **Per-evaluator timeout**: Add `timeout: 300` to evaluator YAML for slow models like Mistral Large
@@ -392,7 +453,8 @@ adversarial health                      # Comprehensive system health check
 adversarial agent onboard               # Set up agent coordination system
 # Workflow
-adversarial evaluate task.md            # Phase 1: Evaluate plan
+adversarial evaluate task.md            # Phase 1: Evaluate plan (uses config.yml)
+adversarial evaluate -e <name> task.md  # Phase 1: Evaluate with installed evaluator
 adversarial split task.md               # Split large files into smaller parts
 adversarial split task.md --dry-run     # Preview split without creating files
 adversarial review                      # Phase 3: Review implementation
@@ -400,6 +462,99 @@ adversarial validate "pytest"           # Phase 4: Validate with tests
 adversarial list-evaluators             # List all available evaluators
 ```
+## Evaluator Library
+Browse and install pre-configured evaluators from the community [adversarial-evaluator-library](https://github.com/movito/adversarial-evaluator-library).
+### Quick Start
+```bash
+# Browse available evaluators
+adversarial library list
+# Filter by provider or category
+adversarial library list --provider google
+adversarial library list --category quick-check
+# Install an evaluator
+adversarial library install google/gemini-flash
+# Run it with --evaluator flag
+adversarial evaluate --evaluator gemini-flash task.md
+adversarial evaluate -e gemini-flash task.md  # short form
+```
+### Available Commands
+| Command | Description |
+|---------|-------------|
+| `adversarial library list` | Browse available evaluators |
+| `adversarial library install <provider>/<name>` | Install evaluator to project |
+| `adversarial library check-updates` | Check for updates to installed evaluators |
+| `adversarial library update <name>` | Update an evaluator (with diff preview) |
+### Running Installed Evaluators
+Use the `--evaluator` flag to run any installed evaluator:
+```bash
+# Run by name
+adversarial evaluate --evaluator plan-evaluator task.md
+# Short form
+adversarial evaluate -e security-reviewer task.md
+# Evaluators with model_requirement auto-resolve to best available model
+adversarial evaluate -e gemini-flash task.md
+```
+**How it works:**
+- Looks up evaluator in `.adversarial/evaluators/*.yml`
+- Uses the evaluator's model, prompt, and output settings
+- Supports evaluator aliases
+- If evaluator has `model_requirement`, resolves to best available model
+- Falls back to legacy `model` field if resolution fails
+**Without --evaluator flag**: Uses existing shell script behavior (backward compatible)
+### Philosophy: Copy, Don't Link
+Installed evaluators are **copied** to your project, not referenced at runtime:
+- Projects remain self-contained and work offline
+- You can customize your local copies freely
+- Updates are explicit and user-controlled
+### Provenance Tracking
+Installed evaluators include metadata for tracking updates:
+```yaml
+_meta:
+  source: adversarial-evaluator-library
+  source_path: google/gemini-flash
+  version: "1.2.0"
+  installed: "2026-02-03T10:00:00Z"
+name: gemini-flash
+# ... rest of evaluator config
+```
+### Options
+```bash
+# Bypass cache (1-hour TTL by default)
+adversarial library list --no-cache
+# Force overwrite existing files
+adversarial library install google/gemini-flash --force
+# Update all outdated evaluators
+adversarial library update --all
+# Preview changes without applying
+adversarial library update gemini-flash --diff-only
+```
 ## Custom Evaluators
 Starting with v0.6.0, you can define project-specific evaluators without modifying the package.
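The Provenance Tracking section above shows a `_meta` block carrying a `version` field. As a rough illustration of how such a block could drive an update check, here is a small sketch. `parse_version` and `update_available` are hypothetical helpers, and the plain dotted-integer comparison is an assumption, since the diff does not show how the package compares versions.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as "1.2.0" into a comparable tuple.

    Assumption: versions are plain dotted integers with no pre-release tags.
    """
    return tuple(int(part) for part in text.split("."))


def update_available(installed: dict, library_version: str) -> bool:
    """Report whether the library offers a newer version than the installed copy.

    `installed` is the parsed evaluator YAML as a dict (hypothetical input shape).
    """
    meta = installed.get("_meta")
    if not isinstance(meta, dict) or "version" not in meta:
        # No provenance block: treat the file as hand-written and leave it alone.
        return False
    return parse_version(library_version) > parse_version(str(meta["version"]))
```

Comparing tuples rather than strings keeps `1.10.0` ordered after `1.2.0`.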

{adversarial_workflow-0.7.0 → adversarial_workflow-0.9.0}/adversarial_workflow/__init__.py RENAMED

@@ -12,7 +12,7 @@ Usage:
     adversarial validate "pytest"
 """
-__version__ = "0.7.0"
+__version__ = "0.9.0"
 __author__ = "Fredrik Matheson"
 __license__ = "MIT"

{adversarial_workflow-0.7.0 → adversarial_workflow-0.9.0}/adversarial_workflow/cli.py RENAMED

@@ -30,7 +30,7 @@ from typing import Dict, List, Optional, Tuple
 import yaml
 from dotenv import dotenv_values, load_dotenv
-__version__ = "0.7.0"
+__version__ = "0.9.0"
 # ANSI color codes for better output
 RESET = "\033[0m"
@@ -2944,6 +2944,7 @@ def main():
     from adversarial_workflow.evaluators import (
         BUILTIN_EVALUATORS,
+        discover_local_evaluators,
         get_all_evaluators,
         run_evaluator,
     )
@@ -2959,6 +2960,7 @@ def main():
         "health",
         "quickstart",
         "agent",
+        "library",
         "split",
         "validate",
         "review",
@@ -2982,6 +2984,8 @@ Examples:
   adversarial validate "npm test"       # Validate with tests
   adversarial split large-task.md       # Split large files
   adversarial check-citations doc.md    # Verify URLs in document
+  adversarial library list              # Browse available evaluators
+  adversarial library install google/gemini-flash  # Install evaluator
 For more information: https://github.com/movito/adversarial-workflow
         """,
@@ -3028,6 +3032,98 @@ For more information: https://github.com/movito/adversarial-workflow
         "--path", default=".", help="Project path (default: current directory)"
     )
+    # library command (with subcommands)
+    library_parser = subparsers.add_parser(
+        "library", help="Browse and install evaluators from the community library"
+    )
+    library_subparsers = library_parser.add_subparsers(
+        dest="library_subcommand", help="Library subcommand"
+    )
+    # library list subcommand
+    library_list_parser = library_subparsers.add_parser(
+        "list", help="List available evaluators from the library"
+    )
+    library_list_parser.add_argument(
+        "--provider", "-p", help="Filter by provider (e.g., google, openai)"
+    )
+    library_list_parser.add_argument(
+        "--category", "-c", help="Filter by category (e.g., quick-check, deep-reasoning)"
+    )
+    library_list_parser.add_argument(
+        "--verbose", "-v", action="store_true", help="Show detailed information"
+    )
+    library_list_parser.add_argument(
+        "--no-cache", action="store_true", help="Bypass cache and fetch fresh data"
+    )
+    # library info subcommand
+    library_info_parser = library_subparsers.add_parser(
+        "info", help="Show detailed information about an evaluator"
+    )
+    library_info_parser.add_argument(
+        "evaluator_spec", help="Evaluator to show info for (format: provider/name)"
+    )
+    # library install subcommand
+    library_install_parser = library_subparsers.add_parser(
+        "install", help="Install evaluator(s) from the library"
+    )
+    library_install_parser.add_argument(
+        "evaluators", nargs="*", help="Evaluator(s) to install (format: provider/name)"
+    )
+    library_install_parser.add_argument(
+        "--force", "-f", action="store_true", help="Overwrite existing files"
+    )
+    library_install_parser.add_argument(
+        "--skip-validation", action="store_true", help="Skip YAML validation (advanced)"
+    )
+    library_install_parser.add_argument(
+        "--dry-run", action="store_true", help="Preview without making changes"
+    )
+    library_install_parser.add_argument("--category", help="Install all evaluators in a category")
+    library_install_parser.add_argument(
+        "--yes", "-y", action="store_true", help="Skip confirmation prompts (required for CI/CD)"
+    )
+    # library check-updates subcommand
+    library_check_parser = library_subparsers.add_parser(
+        "check-updates", help="Check for updates to installed evaluators"
+    )
+    library_check_parser.add_argument(
+        "name", nargs="?", help="Specific evaluator to check (optional)"
+    )
+    library_check_parser.add_argument(
+        "--no-cache", action="store_true", help="Bypass cache and fetch fresh data"
+    )
+    # library update subcommand
+    library_update_parser = library_subparsers.add_parser(
+        "update", help="Update installed evaluator(s) to newer versions"
+    )
+    library_update_parser.add_argument("name", nargs="?", help="Evaluator name to update")
+    library_update_parser.add_argument(
+        "--all",
+        "-a",
+        action="store_true",
+        dest="all_evaluators",
+        help="Update all outdated evaluators",
+    )
+    library_update_parser.add_argument(
+        "--yes", "-y", action="store_true", help="Skip confirmation prompts"
+    )
+    library_update_parser.add_argument(
+        "--diff-only", action="store_true", help="Show diff without applying changes"
+    )
+    library_update_parser.add_argument(
+        "--dry-run",
+        action="store_true",
+        help="Preview without making changes (same as --diff-only)",
+    )
+    library_update_parser.add_argument(
+        "--no-cache", action="store_true", help="Bypass cache and fetch fresh data"
+    )
     # review command (static - reviews git changes, no file argument)
     subparsers.add_parser("review", help="Run Phase 3: Code review")
@@ -3149,6 +3245,15 @@ For more information: https://github.com/movito/adversarial-workflow
             action="store_true",
             help="Verify URLs in document before evaluation",
         )
+        # Add --evaluator flag for the "evaluate" command only
+        # This allows selecting a library-installed evaluator
+        if config.name == "evaluate":
+            eval_parser.add_argument(
+                "--evaluator",
+                "-e",
+                metavar="NAME",
+                help="Use a specific evaluator from .adversarial/evaluators/",
+            )
         # Store config for later execution
         eval_parser.set_defaults(evaluator_config=config)
@@ -3160,15 +3265,45 @@ For more information: https://github.com/movito/adversarial-workflow
     # Check for evaluator command first (has evaluator_config attribute)
     if hasattr(args, "evaluator_config"):
+        # Default to the command's evaluator config
+        config_to_use = args.evaluator_config
+        # Check if --evaluator flag was specified (only on evaluate command)
+        evaluator_override = getattr(args, "evaluator", None)
+        if evaluator_override:
+            local_evaluators = discover_local_evaluators()
+            if not local_evaluators:
+                print(f"{RED}Error: No evaluators installed.{RESET}")
+                print("Install evaluators with: adversarial library install <name>")
+                return 1
+            if evaluator_override not in local_evaluators:
+                print(f"{RED}Error: Evaluator '{evaluator_override}' not found.{RESET}")
+                print()
+                print("Available evaluators:")
+                # Show unique evaluators (avoid duplicates from aliases)
+                seen = set()
+                for _, cfg in sorted(local_evaluators.items()):
+                    if id(cfg) not in seen:
+                        print(f"  {cfg.name}")
+                        if cfg.aliases:
+                            print(f"    aliases: {', '.join(cfg.aliases)}")
+                        seen.add(id(cfg))
+                return 1
+            config_to_use = local_evaluators[evaluator_override]
+            print(f"Using evaluator: {config_to_use.name}")
         # Determine timeout: CLI flag > YAML config > default (180s)
         if args.timeout is not None:
             timeout = args.timeout
             source = "CLI override"
-        elif args.evaluator_config.timeout != 180:
-            timeout = args.evaluator_config.timeout
+        elif config_to_use.timeout != 180:
+            timeout = config_to_use.timeout
             source = "evaluator config"
         else:
-            timeout = args.evaluator_config.timeout  # 180 (default)
+            timeout = config_to_use.timeout  # 180 (default)
             source = "default"
         # Validate CLI timeout (consistent with YAML validation)
@@ -3195,7 +3330,7 @@ For more information: https://github.com/movito/adversarial-workflow
             print()
         return run_evaluator(
-            args.evaluator_config,
+            config_to_use,
             args.file,
             timeout=timeout,
         )
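The two hunks above add alias-aware evaluator lookup and route the timeout decision through `config_to_use`. The sketch below isolates both pieces so they can be exercised without the CLI. The `EvaluatorConfig` dataclass here is a hypothetical stand-in with only the fields the hunk touches, and it assumes (as the `id(cfg)` check suggests) that a name and its aliases map to the same config object.

```python
from dataclasses import dataclass, field
from typing import Optional

DEFAULT_TIMEOUT = 180  # seconds, the default named in the diff


@dataclass
class EvaluatorConfig:
    """Hypothetical stand-in carrying only the fields this sketch needs."""

    name: str
    aliases: list[str] = field(default_factory=list)
    timeout: int = DEFAULT_TIMEOUT


def unique_evaluators(registry: dict[str, EvaluatorConfig]) -> list[EvaluatorConfig]:
    """Collapse alias entries so each config object is listed once."""
    seen: set[int] = set()
    unique = []
    for _, cfg in sorted(registry.items()):
        if id(cfg) not in seen:
            seen.add(id(cfg))
            unique.append(cfg)
    return unique


def resolve_timeout(cli_timeout: Optional[int], cfg: EvaluatorConfig) -> tuple[int, str]:
    """Apply the precedence from the diff: CLI flag > YAML config > default."""
    if cli_timeout is not None:
        return cli_timeout, "CLI override"
    if cfg.timeout != DEFAULT_TIMEOUT:
        return cfg.timeout, "evaluator config"
    return cfg.timeout, "default"
```

One consequence of comparing against the default value: an evaluator that explicitly sets `timeout: 180` is reported with the source "default", because the check cannot distinguish it from an unset field.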
@@ -3220,6 +3355,59 @@ For more information: https://github.com/movito/adversarial-workflow
             print(f"{RED}Error: agent command requires a subcommand{RESET}")
             print("Usage: adversarial agent onboard")
             return 1
+    elif args.command == "library":
+        from adversarial_workflow.library import (
+            library_check_updates,
+            library_info,
+            library_install,
+            library_list,
+            library_update,
+        )
+        if args.library_subcommand == "list":
+            return library_list(
+                provider=args.provider,
+                category=args.category,
+                verbose=args.verbose,
+                no_cache=args.no_cache,
+            )
+        elif args.library_subcommand == "info":
+            return library_info(
+                evaluator_spec=args.evaluator_spec,
+            )
+        elif args.library_subcommand == "install":
+            return library_install(
+                evaluator_specs=args.evaluators,
+                force=args.force,
+                skip_validation=args.skip_validation,
+                dry_run=args.dry_run,
+                category=args.category,
+                yes=args.yes,
+            )
+        elif args.library_subcommand == "check-updates":
+            return library_check_updates(
+                name=args.name,
+                no_cache=args.no_cache,
+            )
+        elif args.library_subcommand == "update":
+            return library_update(
+                name=args.name,
+                all_evaluators=args.all_evaluators,
+                yes=args.yes,
+                diff_only=args.diff_only,
+                no_cache=args.no_cache,
+                dry_run=args.dry_run,
+            )
+        else:
+            # No subcommand provided
+            print(f"{RED}Error: library command requires a subcommand{RESET}")
+            print("Usage:")
+            print("  adversarial library list")
+            print("  adversarial library info <provider>/<name>")
+            print("  adversarial library install <provider>/<name>")
+            print("  adversarial library check-updates")
+            print("  adversarial library update <name>")
+            return 1
     elif args.command == "review":
         return review()
     elif args.command == "validate":
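The `library` command above nests a second level of subparsers under the top-level parser, and the dispatch code ends with an `else` branch for a missing subcommand. A condensed, runnable sketch of that structure shows why the branch is reachable. It reuses argument names from the diff but trims the parser to the `update` subcommand; `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a trimmed two-level parser mirroring the `library update` wiring."""
    parser = argparse.ArgumentParser(prog="adversarial")
    subparsers = parser.add_subparsers(dest="command")

    library = subparsers.add_parser("library")
    library_sub = library.add_subparsers(dest="library_subcommand")

    update = library_sub.add_parser("update")
    update.add_argument("name", nargs="?")
    # dest= renames the attribute, since `all` would shadow a builtin name.
    update.add_argument("--all", "-a", action="store_true", dest="all_evaluators")
    update.add_argument("--diff-only", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["library", "update", "--all"])
    print(args.command, args.library_subcommand, args.all_evaluators, args.name)
    # prints: library update True None

    # Subparsers are optional by default, so a bare `library` parses cleanly
    # and leaves the nested dest unset.
    args = build_parser().parse_args(["library"])
    print(args.library_subcommand)
    # prints: None
```

Because argparse does not require a subcommand unless `required=True` is passed to `add_subparsers`, the explicit usage message in the dispatch code is what tells the user a subcommand is needed.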

{adversarial_workflow-0.7.0 → adversarial_workflow-0.9.0}/adversarial_workflow/evaluators/__init__.py RENAMED

@@ -1,12 +1,18 @@
-"""Evaluators module for adversarial-workflow plugin architecture."""
+"""Evaluators module for adversarial-workflow plugin architecture.
+Supports dual-field model specification (ADV-0015):
+- Legacy: model + api_key_env fields (backwards compatible)
+- New: model_requirement field (resolved via ModelResolver)
+"""
 from .builtins import BUILTIN_EVALUATORS
-from .config import EvaluatorConfig
+from .config import EvaluatorConfig, ModelRequirement
 from .discovery import (
     EvaluatorParseError,
     discover_local_evaluators,
     parse_evaluator_yaml,
 )
+from .resolver import ModelResolver, ResolutionError
 from .runner import run_evaluator
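The docstring above describes a dual-field model specification, and the README states that resolution falls back to the legacy `model` field when it fails. The `ModelResolver` API itself is not visible in this excerpt, so the sketch below uses hypothetical names throughout (`pick_model`, `ResolutionFailed`, and a caller-supplied `resolve` callable) to illustrate only the fallback order.

```python
from typing import Callable, Optional


class ResolutionFailed(Exception):
    """Hypothetical stand-in for the package's ResolutionError."""


def pick_model(
    model_requirement: Optional[dict],
    legacy_model: Optional[str],
    resolve: Callable[[dict], str],
) -> str:
    """Prefer model_requirement; fall back to the legacy model field on failure."""
    if model_requirement is not None:
        try:
            return resolve(model_requirement)
        except ResolutionFailed:
            if legacy_model is None:
                raise  # nothing to fall back to
    if legacy_model is not None:
        return legacy_model
    raise ResolutionFailed("evaluator defines neither model_requirement nor model")
```

Passing the resolver in as a callable keeps the sketch independent of how the real resolver is constructed or configured.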
@@ -38,6 +44,9 @@ def get_all_evaluators() -> dict[str, EvaluatorConfig]:
 __all__ = [
     "EvaluatorConfig",
     "EvaluatorParseError",
+    "ModelRequirement",
+    "ModelResolver",
+    "ResolutionError",
     "run_evaluator",
     "get_all_evaluators",
     "discover_local_evaluators",
