
driftless-0.2.4.tar.gz → driftless-0.2.5.tar.gz


Files changed (88)

{driftless-0.2.4 → driftless-0.2.5}/CHANGELOG.md RENAMED

@@ -17,6 +17,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ---
+## [0.2.5] - 2026-07-01
+### Added
+- **`init-ci` label-audit workflow** — scaffold `driftless-label-audit.yml` (or
+  `-all` matrix) with `audit-labels --fail` on eval dataset path changes.
+- **`init-ci` judge-check workflow** — scaffold `driftless-judge-check.yml` when
+  `eval.judge.calibration_path` is set; uses `--enforce` when gate thresholds
+  are configured.
+---
 ## [0.2.4] - 2026-07-01
 ### Fixed
@@ -120,8 +132,9 @@ First public release on [PyPI](https://pypi.org/project/driftless/0.1.0/).
 - **Docs** — project overview, repair algorithm spec, 2×2 migration methodology,
   Poetry + Dependabot product framing.
-[Unreleased]: https://github.com/driftless-dev/driftless/compare/v0.2.4...HEAD
-[0.2.4]: https://github.com/driftless-dev/driftless/releases/tag/v0.2.4
+[Unreleased]: https://github.com/driftless-dev/driftless/compare/v0.2.5...HEAD
+[0.2.5]: https://github.com/driftless-dev/driftless/releases/tag/v0.2.5
+[0.2.4]: https://github.com/driftless-dev/driftless/compare/v0.2.4...v0.2.5
 [0.2.3]: https://github.com/driftless-dev/driftless/compare/v0.2.3...v0.2.4
 [0.2.2]: https://github.com/driftless-dev/driftless/compare/v0.2.2...v0.2.3
 [0.2.1]: https://github.com/driftless-dev/driftless/releases/tag/v0.2.1
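
The judge-check changelog entry above says the scaffolded workflow passes `--enforce` only when gate thresholds are configured. A minimal sketch of that decision, assuming a hypothetical `JudgeSpec` stand-in for the judge section of a workflow's eval config (the field names `calibration_path`, `max_mae`, and `min_correlation` come from this release's `init_ci.py` changes; the class and the `judge_check_args` helper are illustrative only):

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class JudgeSpec:
    """Hypothetical stand-in for the judge section of a workflow's eval config."""

    calibration_path: str | None = None
    max_mae: float | None = None
    min_correlation: float | None = None


def judge_check_args(spec: JudgeSpec) -> str | None:
    """Return the args string for a judge-check step, or None if not eligible."""
    if not spec.calibration_path:
        # No human-scored calibration set: no judge-check workflow at all.
        return None
    # A threshold counts as configured when it is not None, so 0.0 still enforces.
    enforce = spec.max_mae is not None or spec.min_correlation is not None
    return "--enforce" if enforce else ""
```

With this shape, a workflow that sets only `calibration_path` gets a report-only check, while adding either threshold turns the check into a gate.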

{driftless-0.2.4 → driftless-0.2.5}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: driftless
-Version: 0.2.4
+Version: 0.2.5
 Summary: Keep prompts in sync when model or eval data changes — Poetry-style lock regeneration, Dependabot-style PRs.
 Project-URL: Homepage, https://github.com/driftless-dev/driftless
 Project-URL: Repository, https://github.com/driftless-dev/driftless
@@ -96,7 +96,7 @@ optimizes against it, with your team owning the definition of "good":
 |---|---|
 | `init` | Scaffold a `driftless.yml`. |
 | `init-policy` | Scaffold a `.driftless/policy.yml` (when to migrate). |
-| `init-ci` | Scaffold `.github/workflows/` for scan, migrate, refine, and poll. |
+| `init-ci` | Scaffold `.github/workflows/` for scan, migrate, refine, poll, label audit, and judge check. |
 | `scan` | Find probable LLM usage and at-risk models. |
 | `plan` | Discover at-risk workflows and apply the migration policy (CI triage). |
 | `plan --act` | Migrate + open a PR/issue for every actionable trigger (close the loop). |
@@ -133,7 +133,7 @@ can run in CI. See `.github/workflows/` for a scheduled deprecation scan and a
 manually-triggered migration that opens a PR (or an issue when blocked).
 ```yaml
-- uses: driftless-dev/driftless@v0.2.4
+- uses: driftless-dev/driftless@v0.2.5
   with:
     command: scan
 ```

{driftless-0.2.4 → driftless-0.2.5}/README.md RENAMED

@@ -57,7 +57,7 @@ optimizes against it, with your team owning the definition of "good":
 |---|---|
 | `init` | Scaffold a `driftless.yml`. |
 | `init-policy` | Scaffold a `.driftless/policy.yml` (when to migrate). |
-| `init-ci` | Scaffold `.github/workflows/` for scan, migrate, refine, and poll. |
+| `init-ci` | Scaffold `.github/workflows/` for scan, migrate, refine, poll, label audit, and judge check. |
 | `scan` | Find probable LLM usage and at-risk models. |
 | `plan` | Discover at-risk workflows and apply the migration policy (CI triage). |
 | `plan --act` | Migrate + open a PR/issue for every actionable trigger (close the loop). |
@@ -94,7 +94,7 @@ can run in CI. See `.github/workflows/` for a scheduled deprecation scan and a
 manually-triggered migration that opens a PR (or an issue when blocked).
 ```yaml
-- uses: driftless-dev/driftless@v0.2.4
+- uses: driftless-dev/driftless@v0.2.5
   with:
     command: scan
 ```

{driftless-0.2.4 → driftless-0.2.5}/docs/RELEASE.md RENAMED

@@ -153,7 +153,7 @@ After a release, users can pin the composite Action by release tag
 (`action.yml` lives at the repo root — no `/action` path segment):
 ```yaml
-- uses: driftless-dev/driftless@v0.2.4
+- uses: driftless-dev/driftless@v0.2.5
   with:
     command: scan
 ```
@@ -161,9 +161,9 @@ After a release, users can pin the composite Action by release tag
 Or pin the PyPI package in the Action input:
 ```yaml
-- uses: driftless-dev/driftless@v0.2.4
+- uses: driftless-dev/driftless@v0.2.5
   with:
-    version: "==0.2.4"
+    version: "==0.2.5"
     command: migrate
 ```
@@ -171,7 +171,7 @@ Optionally maintain a floating **`v1`** tag on the latest stable minor release
 (point it at the current release tag after each publish):
 ```bash
-git tag -f v1 v0.2.4 && git push origin v1 --force
+git tag -f v1 v0.2.5 && git push origin v1 --force
 ```
 Update [`action.yml`](../action.yml) default `version` input when cutting releases.

{driftless-0.2.4 → driftless-0.2.5}/site/docs.html RENAMED

@@ -428,7 +428,7 @@ driftless view -w support_classifier</code></pre>
     <span class="tok-k">runs-on</span>: ubuntu-latest
     <span class="tok-k">steps</span>:
       - <span class="tok-k">uses</span>: actions/checkout@v4
-      - <span class="tok-k">uses</span>: driftless-dev/driftless@v0.2.4
+      - <span class="tok-k">uses</span>: driftless-dev/driftless@v0.2.5
         <span class="tok-k">with</span>:
           <span class="tok-k">command</span>: <span class="tok-s">plan</span></code></pre>
         <p>A scheduled <code class="inline">plan</code> gates CI when a deprecated model needs attention; a manually-triggered <code class="inline">migrate</code> opens a PR (or an issue when blocked) with the evidence attached.</p>

{driftless-0.2.4 → driftless-0.2.5}/src/driftless/__init__.py RENAMED

@@ -1,3 +1,3 @@
 """driftless: Dependabot for LLM models."""
-__version__ = "0.2.4"
+__version__ = "0.2.5"

{driftless-0.2.4 → driftless-0.2.5}/src/driftless/cli.py RENAMED

@@ -136,6 +136,16 @@ def init_ci(
     plan: bool = typer.Option(
         False, "--plan/--no-plan", help="Scaffold scheduled plan --act workflow."
     ),
+    audit_labels: bool | None = typer.Option(
+        None,
+        "--audit-labels/--no-audit-labels",
+        help="Scaffold label-audit CI workflow (default: on if labels_path is set).",
+    ),
+    judge_check: bool | None = typer.Option(
+        None,
+        "--judge-check/--no-judge-check",
+        help="Scaffold judge-calibration CI workflow (default: on if calibration_path is set).",
+    ),
 ) -> None:
     """Scaffold GitHub Actions workflows wired to the driftless composite Action."""
     from .init_ci import CHECKLIST, scaffold_ci_from_path
@@ -151,6 +161,8 @@ def init_ci(
             include_refine=refine,
             include_poll=poll,
             include_plan=plan,
+            include_audit_labels=audit_labels,
+            include_judge_check=judge_check,
         )
     except DriftlessError as exc:
         _fail(exc)
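
The two new options above are tri-state (`bool | None`): leaving the flag unset means "decide from the config", while an explicit `--audit-labels` or `--judge-check` demands the workflow and fails if nothing qualifies. A minimal sketch of that resolution, assuming a hypothetical `resolve_include` helper and using `ValueError` as a stand-in for the project's own error type:

```python
from __future__ import annotations


def resolve_include(explicit: bool | None, eligible: list[str], what: str) -> bool:
    """Resolve a tri-state flag against the list of eligible workflow names.

    None  -> auto-detect: include only if something is eligible
    True  -> require: raise if nothing is eligible
    False -> skip unconditionally
    """
    needed = bool(eligible) if explicit is None else explicit
    if needed and not eligible:
        # Stand-in for the project's error; the real code attaches a hint too.
        raise ValueError(f"{what} requested but no eligible workflow is configured")
    return needed
```

The design choice worth noting is that auto-detection can never raise: the error path is reachable only when the user forces the flag on.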

{driftless-0.2.4 → driftless-0.2.5}/src/driftless/init_ci.py RENAMED

@@ -2,6 +2,7 @@
 from __future__ import annotations
+from dataclasses import dataclass
 from pathlib import Path
 from . import __version__
@@ -203,6 +204,204 @@ jobs:
 """
+def label_audit_workflows(contract: Contract) -> list[str]:
+    """Workflow names eligible for gold-label auditing (classification + labels_path)."""
+    names: list[str] = []
+    for name, wf in contract.workflows.items():
+        if wf.eval.grading != "label":
+            continue
+        if not wf.eval.labels_path:
+            continue
+        names.append(name)
+    return names
+def label_audit_paths(contract: Contract) -> list[str]:
+    """Union of dataset paths for workflows included in label audit."""
+    paths: list[str] = []
+    for name in label_audit_workflows(contract):
+        for path in dataset_paths(contract.workflows[name]):
+            if path not in paths:
+                paths.append(path)
+    return paths
+def render_audit_labels_workflow(
+    action_ref: str,
+    workflow_names: list[str],
+    paths: list[str],
+) -> str:
+    if not workflow_names:
+        raise ValueError("workflow_names must not be empty")
+    title = (
+        f"driftless label audit ({workflow_names[0]})"
+        if len(workflow_names) == 1
+        else "driftless label audit"
+    )
+    if len(workflow_names) == 1:
+        matrix_block = ""
+        workflow_arg = workflow_names[0]
+        workflow_step = f"""\
+      - name: Audit gold labels ({workflow_names[0]})
+        uses: {action_ref}
+        with:
+          command: audit-labels
+          workflow: {workflow_arg}
+          args: "--fail"
+"""
+    else:
+        matrix_yaml = "\n".join(f"          - {name!r}" for name in workflow_names)
+        matrix_block = f"""\
+    strategy:
+      fail-fast: false
+      matrix:
+        workflow:
+{matrix_yaml}
+"""
+        workflow_step = f"""\
+      - name: Audit gold labels (${{{{ matrix.workflow }}}})
+        uses: {action_ref}
+        with:
+          command: audit-labels
+          workflow: ${{{{ matrix.workflow }}}}
+          args: "--fail"
+"""
+    return f"""\
+name: {title}
+# Fail CI when duplicate/near-duplicate inputs carry disagreeing gold labels.
+on:
+  pull_request:
+    paths:
+{_path_filter_block(paths)}\
+  push:
+    branches: [main]
+    paths:
+{_path_filter_block(paths)}\
+  workflow_dispatch:
+jobs:
+  audit:
+    runs-on: ubuntu-latest
+{matrix_block}\
+    steps:
+      - uses: actions/checkout@v4
+{workflow_step}\
+"""
+@dataclass(frozen=True)
+class JudgeCheckTarget:
+    name: str
+    calibration_path: str
+    enforce: bool
+def judge_check_targets(contract: Contract) -> list[JudgeCheckTarget]:
+    """Judge-graded workflows with a human calibration set configured."""
+    targets: list[JudgeCheckTarget] = []
+    for name, wf in contract.workflows.items():
+        if wf.eval.grading != "judge" or wf.eval.judge is None:
+            continue
+        spec = wf.eval.judge
+        if not spec.calibration_path:
+            continue
+        enforce = spec.max_mae is not None or spec.min_correlation is not None
+        targets.append(
+            JudgeCheckTarget(
+                name=name,
+                calibration_path=spec.calibration_path,
+                enforce=enforce,
+            )
+        )
+    return targets
+def judge_check_paths(contract: Contract) -> list[str]:
+    paths: list[str] = []
+    for target in judge_check_targets(contract):
+        if target.calibration_path not in paths:
+            paths.append(target.calibration_path)
+    return paths
+def render_judge_check_workflow(
+    action_ref: str,
+    targets: list[JudgeCheckTarget],
+    paths: list[str],
+) -> str:
+    if not targets:
+        raise ValueError("targets must not be empty")
+    title = (
+        f"driftless judge check ({targets[0].name})"
+        if len(targets) == 1
+        else "driftless judge check"
+    )
+    if len(targets) == 1:
+        target = targets[0]
+        matrix_block = ""
+        args = '"--enforce"' if target.enforce else '""'
+        workflow_step = f"""\
+      - name: Judge calibration check ({target.name})
+        uses: {action_ref}
+        with:
+          command: judge-check
+          workflow: {target.name}
+          args: {args}
+        env:
+{_provider_env_block()}\
+"""
+    else:
+        include_lines: list[str] = []
+        for target in targets:
+            args = '"--enforce"' if target.enforce else '""'
+            include_lines.append(
+                f"          - workflow: {target.name!r}\n"
+                f"            args: {args}"
+            )
+        matrix_block = (
+            "    strategy:\n"
+            "      fail-fast: false\n"
+            "      matrix:\n"
+            "        include:\n"
+            + "\n".join(include_lines)
+            + "\n\n"
+        )
+        workflow_step = f"""\
+      - name: Judge calibration check (${{{{ matrix.workflow }}}})
+        uses: {action_ref}
+        with:
+          command: judge-check
+          workflow: ${{{{ matrix.workflow }}}}
+          args: ${{{{ matrix.args }}}}
+        env:
+{_provider_env_block()}\
+"""
+    return f"""\
+name: {title}
+# Measure LLM-judge agreement against human-scored calibration records.
+on:
+  pull_request:
+    paths:
+{_path_filter_block(paths)}\
+  push:
+    branches: [main]
+    paths:
+{_path_filter_block(paths)}\
+  workflow_dispatch:
+jobs:
+  judge-check:
+    runs-on: ubuntu-latest
+{matrix_block}\
+    steps:
+      - uses: actions/checkout@v4
+{workflow_step}\
+"""
 def render_plan_workflow(action_ref: str) -> str:
     return f"""\
 name: driftless plan (deprecation triage)
@@ -251,6 +450,8 @@ def scaffold_ci(
     include_refine: bool = True,
     include_poll: bool | None = None,
     include_plan: bool = False,
+    include_audit_labels: bool | None = None,
+    include_judge_check: bool | None = None,
 ) -> list[Path]:
     """Write GitHub workflow YAML files under ``out_dir``."""
     action_ref = action_ref or default_action_ref()
@@ -293,10 +494,52 @@ def scaffold_ci(
     if include_plan:
         write(out_dir / "driftless-plan-act.yml", render_plan_workflow(action_ref))
+    audit_names = label_audit_workflows(contract)
+    audit_needed = include_audit_labels
+    if audit_needed is None:
+        audit_needed = bool(audit_names)
+    if audit_needed:
+        if not audit_names:
+            raise DriftlessError(
+                "label audit workflow requires a classification workflow with eval.labels_path",
+                hint="add labels_path to a workflow or pass --no-audit-labels",
+            )
+        audit_paths = label_audit_paths(contract)
+        fname = (
+            "driftless-label-audit.yml"
+            if len(audit_names) == 1
+            else "driftless-label-audit-all.yml"
+        )
+        write(
+            out_dir / fname,
+            render_audit_labels_workflow(action_ref, audit_names, audit_paths),
+        )
+    judge_targets = judge_check_targets(contract)
+    judge_needed = include_judge_check
+    if judge_needed is None:
+        judge_needed = bool(judge_targets)
+    if judge_needed:
+        if not judge_targets:
+            raise DriftlessError(
+                "judge-check workflow requires eval.judge.calibration_path",
+                hint="add a human-scored calibration set or pass --no-judge-check",
+            )
+        judge_paths = judge_check_paths(contract)
+        fname = (
+            "driftless-judge-check.yml"
+            if len(judge_targets) == 1
+            else "driftless-judge-check-all.yml"
+        )
+        write(
+            out_dir / fname,
+            render_judge_check_workflow(action_ref, judge_targets, judge_paths),
+        )
     if not written:
         raise DriftlessError(
             "nothing to scaffold",
-            hint="enable at least one of scan, migrate, refine, poll, or plan",
+            hint="enable at least one of scan, migrate, refine, poll, plan, audit-labels, or judge-check",
         )
     return written
@@ -321,5 +564,7 @@ Next steps:
   2. For poll workflows: DRIFTLESS_DATASOURCE_TOKEN if eval.data_source URLs need auth.
   3. Confirm workflow path filters match your eval dataset paths in driftless.yml.
   4. Run driftless validate -w <workflow> locally before enabling scheduled jobs.
-  5. Pin the Action ref when upgrading: uses: driftless-dev/driftless@vX.Y.Z
+  5. Run driftless audit-labels -w <workflow> locally; CI uses --fail on label conflicts.
+  6. For judge-graded workflows: driftless judge-check -w <workflow> --enforce when gates are set.
+  7. Pin the Action ref when upgrading: uses: driftless-dev/driftless@vX.Y.Z
 """
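
The renderers in this file build YAML with f-strings, which is why the GitHub Actions expression appears with four braces on each side in the source. A minimal sketch of that escaping and of the matrix list, assuming a hypothetical `render_matrix_job` helper that emits only a fragment of the job rather than a full workflow:

```python
from __future__ import annotations


def render_matrix_job(workflow_names: list[str]) -> str:
    """Render the strategy block and one step name for a matrix audit job."""
    # repr() quotes each name ('alpha'); YAML reads that as a single-quoted string.
    matrix_yaml = "\n".join(f"          - {name!r}" for name in workflow_names)
    # In an f-string, "{{" and "}}" emit one literal brace each, so four braces
    # are needed to produce the two that GitHub Actions expects.
    return f"""\
    strategy:
      fail-fast: false
      matrix:
        workflow:
{matrix_yaml}
    steps:
      - name: Audit gold labels (${{{{ matrix.workflow }}}})
"""
```

Quoting with `repr()` is adequate for simple identifiers like the ones in the tests; names containing quotes or backslashes would need a real YAML serializer, since Python and YAML escape those differently.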

driftless-0.2.5/tests/test_init_ci.py ADDED

@@ -0,0 +1,314 @@
+from pathlib import Path
+from typer.testing import CliRunner
+from driftless.cli import app
+from driftless.init_ci import (
+    dataset_paths,
+    default_action_ref,
+    judge_check_targets,
+    label_audit_paths,
+    label_audit_workflows,
+    render_audit_labels_workflow,
+    render_judge_check_workflow,
+    render_migrate_workflow,
+    render_refine_workflow,
+)
+runner = CliRunner()
+def test_init_ci_scaffolds_workflows(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    Path("driftless.yml").write_text(
+        """
+version: 1
+workflows:
+  support_classifier:
+    run:
+      command: echo ok
+      input_path: data/inputs.jsonl
+      output_path: .driftless/out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      labels_path: data/labels.jsonl
+""".lstrip()
+    )
+    out = tmp_path / ".github" / "workflows"
+    result = runner.invoke(app, ["init-ci", "--out-dir", str(out)])
+    assert result.exit_code == 0
+    assert (out / "driftless-model-scan.yml").is_file()
+    assert (out / "driftless-model-migrate.yml").is_file()
+    assert (out / "driftless-prompt-refine.yml").is_file()
+    assert (out / "driftless-label-audit.yml").is_file()
+    refine = (out / "driftless-prompt-refine.yml").read_text()
+    audit = (out / "driftless-label-audit.yml").read_text()
+    assert "data/labels.jsonl" in refine
+    assert "data/inputs.jsonl" in refine
+    assert "data/labels.jsonl" in audit
+    assert "audit-labels" in audit
+    assert '--fail' in audit or '"--fail"' in audit
+    assert default_action_ref() in refine
+    assert "OPENAI_API_KEY" in result.output
+def test_init_ci_poll_when_data_source(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    Path("driftless.yml").write_text(
+        """
+version: 1
+workflows:
+  rag:
+    run:
+      command: echo ok
+      input_path: data/inputs.jsonl
+      output_path: .driftless/out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      labels_path: data/labels.jsonl
+      data_source:
+        labels_url: https://example.com/labels.jsonl
+""".lstrip()
+    )
+    out = tmp_path / "workflows"
+    result = runner.invoke(app, ["init-ci", "--out-dir", str(out), "--no-refine"])
+    assert result.exit_code == 0
+    assert (out / "driftless-prompt-refine-poll.yml").is_file()
+def test_init_ci_refuses_overwrite_without_force(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    Path("driftless.yml").write_text(
+        """
+version: 1
+workflows:
+  smoke:
+    run:
+      command: echo ok
+      input_path: in.jsonl
+      output_path: out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      labels_path: labels.jsonl
+""".lstrip()
+    )
+    out = tmp_path / "workflows"
+    assert runner.invoke(app, ["init-ci", "--out-dir", str(out)]).exit_code == 0
+    retry = runner.invoke(app, ["init-ci", "--out-dir", str(out)])
+    assert retry.exit_code == 1
+    assert "already exists" in retry.output
+def test_dataset_paths_dedupes():
+    from driftless.contract import Contract
+    contract = Contract.model_validate(
+        {
+            "version": 1,
+            "workflows": {
+                "w": {
+                    "run": {
+                        "command": "x",
+                        "input_path": "data/x.jsonl",
+                        "output_path": "out.jsonl",
+                    },
+                    "model": {"current": "gpt-4o-mini", "env_var": "M"},
+                    "eval": {"labels_path": "data/x.jsonl"},
+                }
+            },
+        }
+    )
+    wf = contract.workflows["w"]
+    assert dataset_paths(wf) == ["data/x.jsonl"]
+def test_init_ci_skips_audit_for_judge_graded_workflow(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    Path("driftless.yml").write_text(
+        """
+version: 1
+workflows:
+  summarizer:
+    run:
+      command: echo ok
+      input_path: data/inputs.jsonl
+      output_path: .driftless/out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      judge:
+        rubric: "Score quality."
+""".lstrip()
+    )
+    out = tmp_path / "workflows"
+    result = runner.invoke(app, ["init-ci", "--out-dir", str(out), "--no-refine"])
+    assert result.exit_code == 0
+    assert not any(p.name.startswith("driftless-label-audit") for p in out.iterdir())
+def test_init_ci_audit_matrix_for_multiple_workflows(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    Path("driftless.yml").write_text(
+        """
+version: 1
+workflows:
+  alpha:
+    run:
+      command: echo ok
+      input_path: data/a-in.jsonl
+      output_path: .driftless/a-out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      labels_path: data/a-labels.jsonl
+  beta:
+    run:
+      command: echo ok
+      input_path: data/b-in.jsonl
+      output_path: .driftless/b-out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      labels_path: data/b-labels.jsonl
+""".lstrip()
+    )
+    out = tmp_path / "workflows"
+    result = runner.invoke(
+        app, ["init-ci", "--out-dir", str(out), "--no-scan", "--no-migrate"]
+    )
+    assert result.exit_code == 0
+    audit = (out / "driftless-label-audit-all.yml").read_text()
+    assert "matrix:" in audit
+    assert "'alpha'" in audit or '"alpha"' in audit
+    assert "'beta'" in audit or '"beta"' in audit
+    assert "data/a-labels.jsonl" in audit
+    assert "data/b-labels.jsonl" in audit
+def test_init_ci_judge_check_when_calibration_path(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    Path("driftless.yml").write_text(
+        """
+version: 1
+workflows:
+  summarizer:
+    run:
+      command: echo ok
+      input_path: data/in.jsonl
+      output_path: data/out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      judge:
+        rubric: "Score summary quality."
+        calibration_path: data/calib.jsonl
+        max_mae: 0.15
+""".lstrip()
+    )
+    out = tmp_path / "workflows"
+    result = runner.invoke(
+        app, ["init-ci", "--out-dir", str(out), "--no-scan", "--no-migrate", "--no-refine"]
+    )
+    assert result.exit_code == 0
+    judge = (out / "driftless-judge-check.yml").read_text()
+    assert "judge-check" in judge
+    assert "data/calib.jsonl" in judge
+    assert "--enforce" in judge
+    assert "OPENAI_API_KEY" in judge
+def test_init_ci_skips_judge_check_without_calibration(tmp_path, monkeypatch):
+    monkeypatch.chdir(tmp_path)
+    Path("driftless.yml").write_text(
+        """
+version: 1
+workflows:
+  summarizer:
+    run:
+      command: echo ok
+      input_path: data/in.jsonl
+      output_path: data/out.jsonl
+    model:
+      current: gpt-4o-mini
+      env_var: MODEL
+    eval:
+      judge:
+        rubric: "Score summary quality."
+""".lstrip()
+    )
+    out = tmp_path / "workflows"
+    result = runner.invoke(
+        app,
+        ["init-ci", "--out-dir", str(out), "--no-scan", "--no-migrate", "--no-refine", "--no-audit-labels"],
+    )
+    assert result.exit_code == 1
+    assert "nothing to scaffold" in result.output
+def test_label_audit_helpers():
+    from driftless.contract import Contract
+    contract = Contract.model_validate(
+        {
+            "version": 1,
+            "workflows": {
+                "cls": {
+                    "run": {
+                        "command": "x",
+                        "input_path": "in.jsonl",
+                        "output_path": "out.jsonl",
+                    },
+                    "model": {"current": "gpt-4o-mini", "env_var": "M"},
+                    "eval": {"labels_path": "labels.jsonl"},
+                },
+                "sum": {
+                    "run": {
+                        "command": "x",
+                        "input_path": "in2.jsonl",
+                        "output_path": "out2.jsonl",
+                    },
+                    "model": {"current": "gpt-4o-mini", "env_var": "M"},
+                    "eval": {"judge": {"rubric": "ok"}},
+                },
+            },
+        }
+    )
+    assert label_audit_workflows(contract) == ["cls"]
+    assert label_audit_paths(contract) == ["labels.jsonl", "in.jsonl"]
+def test_rendered_workflows_use_action_ref():
+    ref = "driftless-dev/driftless@v9.9.9"
+    assert ref in render_migrate_workflow(ref)
+    assert "support_classifier" in render_refine_workflow(
+        ref, "support_classifier", ["data/labels.jsonl"]
+    )
+    audit = render_audit_labels_workflow(ref, ["support_classifier"], ["data/labels.jsonl"])
+    assert ref in audit
+    assert "audit-labels" in audit
+    assert "--fail" in audit
+    from driftless.init_ci import JudgeCheckTarget
+    judge = render_judge_check_workflow(
+        ref,
+        [JudgeCheckTarget("summarizer", "data/calib.jsonl", True)],
+        ["data/calib.jsonl"],
+    )
+    assert "judge-check" in judge
+    assert "--enforce" in judge
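
`test_label_audit_helpers` above expects the audit paths in first-seen order with no repeats, which is what makes the generated `paths:` filters stable across runs. A minimal sketch of an order-preserving union, assuming a hypothetical `union_in_order` helper; it tracks seen entries in a set, whereas the diff checks membership against the result list directly:

```python
from __future__ import annotations

from typing import Iterable


def union_in_order(groups: Iterable[Iterable[str]]) -> list[str]:
    """Merge several path lists, keeping the first occurrence of each path."""
    seen: set[str] = set()
    merged: list[str] = []
    for group in groups:
        for path in group:
            if path not in seen:
                seen.add(path)
                merged.append(path)
    return merged
```

Both approaches give the same result; the set only matters when the number of paths is large.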

driftless-0.2.4/tests/test_init_ci.py DELETED

@@ -1,128 +0,0 @@
-from pathlib import Path
-from typer.testing import CliRunner
-from driftless.cli import app
-from driftless.init_ci import (
-    dataset_paths,
-    default_action_ref,
-    render_migrate_workflow,
-    render_refine_workflow,
-)
-runner = CliRunner()
-def test_init_ci_scaffolds_workflows(tmp_path, monkeypatch):
-    monkeypatch.chdir(tmp_path)
-    Path("driftless.yml").write_text(
-        """
-version: 1
-workflows:
-  support_classifier:
-    run:
-      command: echo ok
-      input_path: data/inputs.jsonl
-      output_path: .driftless/out.jsonl
-    model:
-      current: gpt-4o-mini
-      env_var: MODEL
-    eval:
-      labels_path: data/labels.jsonl
-""".lstrip()
-    )
-    out = tmp_path / ".github" / "workflows"
-    result = runner.invoke(app, ["init-ci", "--out-dir", str(out)])
-    assert result.exit_code == 0
-    assert (out / "driftless-model-scan.yml").is_file()
-    assert (out / "driftless-model-migrate.yml").is_file()
-    assert (out / "driftless-prompt-refine.yml").is_file()
-    refine = (out / "driftless-prompt-refine.yml").read_text()
-    assert "data/labels.jsonl" in refine
-    assert "data/inputs.jsonl" in refine
-    assert default_action_ref() in refine
-    assert "OPENAI_API_KEY" in result.output
-def test_init_ci_poll_when_data_source(tmp_path, monkeypatch):
-    monkeypatch.chdir(tmp_path)
-    Path("driftless.yml").write_text(
-        """
-version: 1
-workflows:
-  rag:
-    run:
-      command: echo ok
-      input_path: data/inputs.jsonl
-      output_path: .driftless/out.jsonl
-    model:
-      current: gpt-4o-mini
-      env_var: MODEL
-    eval:
-      labels_path: data/labels.jsonl
-      data_source:
-        labels_url: https://example.com/labels.jsonl
-""".lstrip()
-    )
-    out = tmp_path / "workflows"
-    result = runner.invoke(app, ["init-ci", "--out-dir", str(out), "--no-refine"])
-    assert result.exit_code == 0
-    assert (out / "driftless-prompt-refine-poll.yml").is_file()
-def test_init_ci_refuses_overwrite_without_force(tmp_path, monkeypatch):
-    monkeypatch.chdir(tmp_path)
-    Path("driftless.yml").write_text(
-        """
-version: 1
-workflows:
-  smoke:
-    run:
-      command: echo ok
-      input_path: in.jsonl
-      output_path: out.jsonl
-    model:
-      current: gpt-4o-mini
-      env_var: MODEL
-    eval:
-      labels_path: labels.jsonl
-""".lstrip()
-    )
-    out = tmp_path / "workflows"
-    assert runner.invoke(app, ["init-ci", "--out-dir", str(out)]).exit_code == 0
-    retry = runner.invoke(app, ["init-ci", "--out-dir", str(out)])
-    assert retry.exit_code == 1
-    assert "already exists" in retry.output
-def test_dataset_paths_dedupes():
-    from driftless.contract import Contract
-    contract = Contract.model_validate(
-        {
-            "version": 1,
-            "workflows": {
-                "w": {
-                    "run": {
-                        "command": "x",
-                        "input_path": "data/x.jsonl",
-                        "output_path": "out.jsonl",
-                    },
-                    "model": {"current": "gpt-4o-mini", "env_var": "M"},
-                    "eval": {"labels_path": "data/x.jsonl"},
-                }
-            },
-        }
-    )
-    wf = contract.workflows["w"]
-    assert dataset_paths(wf) == ["data/x.jsonl"]
-def test_rendered_workflows_use_action_ref():
-    ref = "driftless-dev/driftless@v9.9.9"
-    assert ref in render_migrate_workflow(ref)
-    assert "support_classifier" in render_refine_workflow(
-        ref, "support_classifier", ["data/labels.jsonl"]
-    )