smartselect 0.1.3__py3-none-any.whl (PyPI)

Files changed (19)

smartselect-0.1.3.dist-info/METADATA +269 -0
smartselect-0.1.3.dist-info/RECORD +19 -0
smartselect-0.1.3.dist-info/WHEEL +4 -0
smartselect-0.1.3.dist-info/entry_points.txt +6 -0
smartselect-0.1.3.dist-info/licenses/LICENSE +21 -0
testwise/__init__.py +3 -0
testwise/cli.py +204 -0
testwise/config.py +135 -0
testwise/context_builder.py +185 -0
testwise/diff_analyzer.py +273 -0
testwise/exceptions.py +33 -0
testwise/llm_selector.py +239 -0
testwise/models.py +190 -0
testwise/parsers/__init__.py +78 -0
testwise/parsers/generic_parser.py +81 -0
testwise/parsers/pytest_parser.py +204 -0
testwise/reporter.py +200 -0
testwise/test_discovery.py +165 -0
testwise/test_runner.py +188 -0

smartselect-0.1.3.dist-info/METADATA ADDED

@@ -0,0 +1,269 @@
+Metadata-Version: 2.4
+Name: smartselect
+Version: 0.1.3
+Summary: LLM-powered test selection for CI/CD pipelines
+Project-URL: Homepage, https://github.com/mattfrautnick/testwise
+Project-URL: Repository, https://github.com/mattfrautnick/testwise
+Project-URL: Issues, https://github.com/mattfrautnick/testwise/issues
+Author: Matthew Frautnick
+License-Expression: MIT
+License-File: LICENSE
+Keywords: ai,cd,ci,llm,test-selection,testing
+Classifier: Development Status :: 3 - Alpha
+Classifier: Intended Audience :: Developers
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Software Development :: Testing
+Requires-Python: >=3.10
+Requires-Dist: click>=8.0
+Requires-Dist: litellm>=1.40.0
+Requires-Dist: pydantic>=2.0
+Requires-Dist: pyyaml>=6.0
+Requires-Dist: tiktoken>=0.7.0
+Provides-Extra: dev
+Requires-Dist: mypy; extra == 'dev'
+Requires-Dist: pytest-cov; extra == 'dev'
+Requires-Dist: pytest-mock; extra == 'dev'
+Requires-Dist: pytest>=8.0; extra == 'dev'
+Requires-Dist: ruff; extra == 'dev'
+Requires-Dist: types-pyyaml; extra == 'dev'
+Description-Content-Type: text/markdown
+<p align="center">
+  <h1 align="center">Testwise</h1>
+  <p align="center">
+    LLM-powered test selection for CI/CD pipelines
+    <br />
+    <em>Run only the tests that matter. Save CI time without sacrificing coverage.</em>
+  </p>
+  <p align="center">
+    <a href="https://github.com/mattfrautnick/testwise/actions/workflows/ci.yml"><img src="https://github.com/mattfrautnick/testwise/actions/workflows/ci.yml/badge.svg" alt="CI"></a>
+    <a href="https://pypi.org/project/testwise/"><img src="https://img.shields.io/pypi/v/testwise.svg" alt="PyPI"></a>
+    <a href="https://pypi.org/project/testwise/"><img src="https://img.shields.io/pypi/pyversions/testwise.svg" alt="Python"></a>
+    <a href="https://github.com/mattfrautnick/testwise/blob/main/LICENSE"><img src="https://img.shields.io/github/license/mattfrautnick/testwise" alt="License"></a>
+  </p>
+</p>
+---
+Testwise analyzes your git diff and uses an LLM to classify every test as `must_run`, `should_run`, or `skip` — then executes only what's needed. It supports **test-level granularity** for languages with parser plugins and falls back to file-level selection for everything else.
+## Why Testwise?
+Large test suites slow down CI. Most changes only affect a fraction of your tests, but running the full suite every time wastes minutes (or hours). Existing static-analysis approaches miss indirect dependencies and cross-cutting concerns. Testwise uses an LLM that actually understands your code changes and test structure to make smarter decisions — with a safe fallback to run everything if it's ever uncertain.
+## How It Works
+```
+git diff ─> Discover Tests ─> Parse with Plugins ─> LLM Classifies ─> Run Selected ─> Report
+```
+1. **Diff Analysis** — Extracts the git diff between base and head refs
+2. **Test Discovery** — Finds all test files and parses individual test functions via parser plugins
+3. **LLM Classification** — Sends diff + test inventory to an LLM with structured output
+4. **Selective Execution** — Runs only selected tests and reports results with GitHub annotations
+## Features
+- **Hybrid Granularity** — Test-level selection for languages with parser plugins (pytest built-in), file-level fallback for others
+- **Plugin Architecture** — Extensible parser system via Python entry points. [Write a parser](#writing-a-parser-plugin) for any test framework.
+- **Any LLM Provider** — Uses [litellm](https://github.com/BerriAI/litellm) to support Claude, GPT, Gemini, and 100+ other models
+- **GitHub Actions** — Ships as a composite action with step summary, annotations, and outputs
+- **Safe Fallback** — If the LLM fails or is uncertain, falls back to running all tests
+- **Test Annotations** — Supports `@pytest.mark.covers()` to explicitly map tests to code areas
+## Quick Start
+### Install
+```bash
+pip install smartselect
+```
+### Configure
+Create `.testwise.yml` in your repo root:
+```yaml
+runners:
+  - name: pytest
+    command: pytest
+    args: ["-v", "--tb=short"]
+    test_patterns: ["tests/**/*.py", "test_*.py"]
+    parser: pytest
+    select_mode: test
+llm:
+  model: anthropic/claude-sonnet-4-20250514
+  api_key_env: ANTHROPIC_API_KEY
+```
+### Run
+```bash
+# Dry run — see what the LLM would select
+testwise --dry-run
+# Run selected tests
+testwise
+# Force all tests (bypass LLM)
+testwise --fallback
+```
+### GitHub Actions
+```yaml
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+        with:
+          fetch-depth: 0  # Full history needed for diff
+      - uses: mattfrautnick/testwise@v1
+        with:
+          api-key: ${{ secrets.ANTHROPIC_API_KEY }}
+          run-level: should_run
+```
+The action writes a Markdown summary to `$GITHUB_STEP_SUMMARY` and emits `::error::` annotations for failing tests inline in your PR diff.
+## Test Annotations
+Testwise's pytest parser understands standard markers and a custom `@covers` annotation that explicitly maps tests to code areas:
+```python
+import pytest
+@pytest.mark.covers("auth_module", "user.login")
+def test_login_success(client, db):
+    """Verify successful login flow."""
+    ...
+@pytest.mark.integration
+@pytest.mark.covers("payment_service")
+def test_checkout_flow(client):
+    ...
+@pytest.mark.parametrize("role", ["admin", "user", "guest"])
+def test_permissions(role):
+    ...
+```
+The parser also extracts imports and fixture references automatically — no annotation required for basic dependency mapping.
+## Parser Plugins
+Testwise uses a plugin architecture for language-specific test parsing. Plugins are registered via Python entry points.
+### Built-in Parsers
+| Parser | Language | Granularity | Features |
+|--------|----------|-------------|----------|
+| `pytest` | Python | Test-level | Markers, covers, parametrize, fixtures, imports |
+| `generic` | Any | File-level | Fallback for unsupported languages |
+### Writing a Parser Plugin
+Implement `BaseParser` and register it as an entry point:
+```python
+from testwise.parsers import BaseParser
+from testwise.models import ParsedTest, ParsedTestFile, RunnerConfig
+from pathlib import Path
+class JestParser(BaseParser):
+    name = "jest"
+    languages = ["javascript", "typescript"]
+    file_patterns = ["*.test.ts", "*.test.js", "*.spec.ts", "*.spec.js"]
+    def parse_test_file(self, file_path: Path, content: str) -> ParsedTestFile:
+        # Parse describe/it blocks, extract test names
+        ...
+    def build_run_command(self, tests, runner_config, repo_root):
+        # Build jest --testNamePattern command
+        ...
+```
+```toml
+# pyproject.toml
+[project.entry-points."testwise.parsers"]
+jest = "my_package.jest_parser:JestParser"
+```
+See [CONTRIBUTING.md](CONTRIBUTING.md) for a full guide on writing and testing parser plugins.
+## CLI Reference
+```
+testwise [OPTIONS]
+Options:
+  -c, --config PATH                    Path to .testwise.yml
+  -b, --base-ref TEXT                  Base git ref to diff against
+      --head-ref TEXT                  Head git ref (default: HEAD)
+  -o, --output [text|json|github]      Output format (default: text)
+      --output-file PATH               Write JSON report to file
+      --dry-run                        Show selections without running tests
+      --fallback                       Skip LLM, run all tests
+      --run-level [must_run|should_run|all]  Minimum classification to run
+  -v, --verbose                        Verbose logging
+      --version                        Show version
+      --help                           Show this message
+```
+## Configuration Reference
+See [`.testwise.example.yml`](.testwise.example.yml) for a fully commented example.
+| Key | Type | Default | Description |
+|-----|------|---------|-------------|
+| `runners[].name` | string | required | Runner identifier |
+| `runners[].command` | string | required | Test runner command |
+| `runners[].args` | list | `[]` | Additional arguments |
+| `runners[].test_patterns` | list | `[]` | Glob patterns for test files |
+| `runners[].parser` | string | `"generic"` | Parser plugin name |
+| `runners[].select_mode` | string | `"file"` | `"test"` or `"file"` |
+| `runners[].timeout_seconds` | int | `300` | Per-runner timeout |
+| `llm.model` | string | `"anthropic/claude-sonnet-4-20250514"` | LLM model ([litellm format](https://docs.litellm.ai/docs/providers)) |
+| `llm.api_key_env` | string | `"ANTHROPIC_API_KEY"` | Env var containing API key |
+| `llm.max_context_tokens` | int | `100000` | Token budget for context |
+| `llm.temperature` | float | `0.0` | LLM temperature |
+| `fallback_on_error` | bool | `true` | Run all tests if LLM fails |
+| `run_should_run` | bool | `true` | Also run "should_run" tests |
+## Roadmap
+Testwise is in early development. Here's what's planned:
+- [ ] Jest/Vitest parser plugin
+- [ ] Go test parser plugin
+- [ ] Caching layer — skip LLM call for identical diffs
+- [ ] Cost tracking — log token usage and estimated cost per run
+- [ ] Confidence threshold — auto-fallback below a configurable confidence
+- [ ] Test impact analysis — learn from historical runs which tests fail for which changes
+- [ ] GitLab CI integration
+Have an idea? [Open an issue](https://github.com/mattfrautnick/testwise/issues) or [start a discussion](https://github.com/mattfrautnick/testwise/discussions).
+## Contributing
+Contributions are welcome! Whether it's a bug fix, a new parser plugin, or documentation improvements — all contributions help.
+See [CONTRIBUTING.md](CONTRIBUTING.md) for development setup, architecture overview, and the full guide to writing parser plugins.
+## Community
+- [GitHub Issues](https://github.com/mattfrautnick/testwise/issues) — Bug reports and feature requests
+- [GitHub Discussions](https://github.com/mattfrautnick/testwise/discussions) — Questions, ideas, and show & tell
+## License
+[MIT](LICENSE)
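
The header block at the top of this METADATA file (everything before the README body) uses RFC 822-style `Key: value` fields, so Python's standard `email` parser can read it. A minimal sketch, assuming a shortened copy of the fields shown in the hunk above; `parse_core_metadata` and `SAMPLE` are illustrative names, not part of the package:

```python
from email import message_from_string

# Shortened copy of the header fields from the METADATA hunk (illustrative sample).
SAMPLE = """\
Metadata-Version: 2.4
Name: smartselect
Version: 0.1.3
Requires-Python: >=3.10
Requires-Dist: click>=8.0
Requires-Dist: litellm>=1.40.0
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

Testwise README body goes here.
"""


def parse_core_metadata(text: str) -> dict:
    """Return name, version, and the runtime/extra dependency lists."""
    msg = message_from_string(text)
    requires = msg.get_all("Requires-Dist") or []
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        # Requirements without an environment marker are unconditional.
        "runtime": [r for r in requires if ";" not in r],
        # Requirements gated on an extra, such as the 'dev' extra above.
        "extras": [r for r in requires if "extra ==" in r],
    }


info = parse_core_metadata(SAMPLE)
print(info["name"], info["version"], info["runtime"])
```

One thing the metadata makes visible: the distribution is named `smartselect`, while the import package, the CLI command, and the README title are all `testwise`. `pip install smartselect` therefore installs a `testwise` module and a `testwise` console script.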

smartselect-0.1.3.dist-info/RECORD ADDED

@@ -0,0 +1,19 @@
+testwise/__init__.py,sha256=XMM2xTN3zWPtFivFCrsL8RYwDvcSBTa0Nu970-kwWag,88
+testwise/cli.py,sha256=wg5a_1shcrjjN7b5aFX18uz6r7hDEzhIZ1SE3AQMK5I,6910
+testwise/config.py,sha256=TlkgWnUgbnZf_yX6VQGuPymsr66LSMvnYpBGF0V4aZY,4301
+testwise/context_builder.py,sha256=To8cCECLJvSLnNKuDM0PAzewrp5hNsbMYrr2W3QJ3yc,6438
+testwise/diff_analyzer.py,sha256=dixnZS7q89FadFMyBXuON5aEMyzu-3kRrgn-c1wQFao,8517
+testwise/exceptions.py,sha256=e8_5INl5641uPfzkVbd9A7ZusqXpTcjNfXkNsNhVdW4,683
+testwise/llm_selector.py,sha256=uMqJ5DE594k5x1nF1mYFyG6x8lDQpe0rsftqc9bdQzI,7998
+testwise/models.py,sha256=_YaKkVxdZzVRYIrmSSEywkt7dVx-KU2YHWF2RJgaSGA,4795
+testwise/reporter.py,sha256=RYA038VusjWFWw9ZPdQ2_OXv2i7F7Rms1sKgQaIQfvI,6905
+testwise/test_discovery.py,sha256=PgDxcDZjDX-BqQUokNkwgrWO-Io_yzpVkMG89aaGHFE,4981
+testwise/test_runner.py,sha256=kmf_UjAFDqWvkh5ePEsGmJeTlvUviTKKa7ruC6q6Q0M,6107
+testwise/parsers/__init__.py,sha256=jKAthPsrcU88Cc8K5D_4rjCFp9MLUkfnXrNF_9oOndw,2174
+testwise/parsers/generic_parser.py,sha256=jEA_1_ggOKwiFpLKqJcJvGhA9FUqOb849QMYDjtbKIY,2074
+testwise/parsers/pytest_parser.py,sha256=T-Nrs2zj6IGjNHRbMKB_Flxs1_Bj3X0DAwcVo_YsXVQ,6536
+smartselect-0.1.3.dist-info/METADATA,sha256=hh2VAciBA2aNQnd7dEKlRgPAVDendQ8fJGxFFW8GumM,10182
+smartselect-0.1.3.dist-info/WHEEL,sha256=QccIxa26bgl1E6uMy58deGWi-0aeIkkangHcxk2kWfw,87
+smartselect-0.1.3.dist-info/entry_points.txt,sha256=FgsKwPmLYN5GPIIotb1i87qVqVvm4OHs20DtoQxtO-E,176
+smartselect-0.1.3.dist-info/licenses/LICENSE,sha256=3H4rvlve13gMeaLQf8iewTht1Gh3DDQIi61XgPquRCY,1074
+smartselect-0.1.3.dist-info/RECORD,,
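
Each RECORD row is `path,hash,size`, where the hash is a SHA-256 digest encoded as URL-safe base64 with the trailing `=` padding removed, following the wheel format's convention. The sketch below shows how a row could be parsed and how a file's hash could be recomputed for comparison; `record_hash` and `parse_record_line` are illustrative helper names:

```python
from __future__ import annotations

import base64
import csv
import hashlib


def record_hash(data: bytes) -> str:
    """Encode a SHA-256 digest the way wheel RECORD files do."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def parse_record_line(line: str) -> tuple[str, str, int | None]:
    """Split one RECORD row into (path, hash, size).

    The row for RECORD itself has empty hash and size fields.
    """
    path, hash_spec, size = next(csv.reader([line]))
    return path, hash_spec, int(size) if size else None


row = "testwise/__init__.py,sha256=XMM2xTN3zWPtFivFCrsL8RYwDvcSBTa0Nu970-kwWag,88"
path, expected, size = parse_record_line(row)
print(path, size)
# Against an unpacked wheel, the check would be:
#   record_hash(pathlib.Path(path).read_bytes()) == expected
```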

smartselect-0.1.3.dist-info/WHEEL ADDED

@@ -0,0 +1,4 @@
+Wheel-Version: 1.0
+Generator: hatchling 1.29.0
+Root-Is-Purelib: true
+Tag: py3-none-any

smartselect-0.1.3.dist-info/entry_points.txt ADDED

@@ -0,0 +1,6 @@
+[console_scripts]
+testwise = testwise.cli:main
+[testwise.parsers]
+generic = testwise.parsers.generic_parser:GenericParser
+pytest = testwise.parsers.pytest_parser:PytestParser

smartselect-0.1.3.dist-info/licenses/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Matthew Frautnick
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

testwise/__init__.py ADDED

@@ -0,0 +1,3 @@
+"""Testwise - LLM-powered test selection for CI/CD pipelines."""
+__version__ = "0.1.0"
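
Note that `__version__` here reads `0.1.0`, while the distribution metadata in this wheel says `0.1.3`. The CLI declares its `--version` flag with `click.version_option(package_name="smartselect")`, which resolves the version from installed metadata rather than from this constant, so the two values can disagree. A small sketch of reading the installed version directly; `installed_version` is an illustrative helper, not part of the package:

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str) -> str | None:
    """Return the installed distribution's version, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the metadata version when smartselect is installed, otherwise None.
print(installed_version("smartselect"))
```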

testwise/cli.py ADDED

@@ -0,0 +1,204 @@
+"""CLI entry point and orchestration."""
+from __future__ import annotations
+import logging
+import sys
+import time
+from pathlib import Path
+import click
+from testwise.config import get_repo_root, load_config
+from testwise.context_builder import build_context
+from testwise.diff_analyzer import filter_diff_files, get_diff, truncate_diff
+from testwise.exceptions import LLMError, TestwiseError
+from testwise.llm_selector import fallback_all_tests, select_tests
+from testwise.models import RunReport, TestClassification
+from testwise.reporter import report_results
+from testwise.test_discovery import discover_tests, parse_test_files
+from testwise.test_runner import run_selected_tests
+@click.command()
+@click.option(
+    "--config",
+    "-c",
+    "config_path",
+    type=click.Path(exists=True, path_type=Path),
+    help="Path to .testwise.yml",
+)
+@click.option("--base-ref", "-b", help="Base git ref to diff against")
+@click.option("--head-ref", help="Head git ref (default: HEAD)")
+@click.option(
+    "--output",
+    "-o",
+    "output_format",
+    type=click.Choice(["text", "json", "github"]),
+    default="text",
+    help="Output format",
+)
+@click.option("--output-file", type=click.Path(path_type=Path), help="Write JSON report to file")
+@click.option("--dry-run", is_flag=True, help="Show selections without running tests")
+@click.option("--fallback", is_flag=True, help="Skip LLM, run all tests")
+@click.option(
+    "--run-level",
+    type=click.Choice(["must_run", "should_run", "all"]),
+    default="should_run",
+    help="Minimum classification to execute",
+)
+@click.option("--verbose", "-v", is_flag=True, help="Verbose logging")
+@click.version_option(package_name="smartselect")
+def main(
+    config_path: Path | None,
+    base_ref: str | None,
+    head_ref: str | None,
+    output_format: str,
+    output_file: Path | None,
+    dry_run: bool,
+    fallback: bool,
+    run_level: str,
+    verbose: bool,
+) -> None:
+    """Testwise - LLM-powered test selection for CI/CD pipelines."""
+    logging.basicConfig(
+        level=logging.DEBUG if verbose else logging.INFO,
+        format="%(name)s: %(message)s",
+    )
+    start_time = time.monotonic()
+    try:
+        # 1. Load config
+        config = load_config(config_path)
+        # 2. Get repo root
+        repo_root = get_repo_root()
+        # 3. Get diff
+        diff = get_diff(base_ref=base_ref, head_ref=head_ref, repo_path=repo_root)
+        if not diff.files:
+            click.echo("No changes detected.")
+            sys.exit(0)
+        # Filter diff files
+        diff.files = filter_diff_files(
+            diff.files,
+            include=config.include_patterns,
+            exclude=config.exclude_patterns,
+        )
+        # Truncate if needed
+        diff = truncate_diff(diff, config.context.max_diff_lines)
+        click.echo(
+            f"Changes: {len(diff.files)} files (+{diff.total_additions}/-{diff.total_deletions})"
+        )
+        # 4. Discover and parse test files
+        test_files = discover_tests(repo_root, config.runners)
+        if not test_files:
+            click.echo("No test files found. Check your runner patterns in .testwise.yml", err=True)
+            sys.exit(2)
+        parsed_files = parse_test_files(test_files, config.runners, repo_root)
+        total_tests = sum(len(pf.tests) for pf in parsed_files)
+        click.echo(f"Discovered: {len(test_files)} test files, {total_tests} individual tests")
+        # 5. Get test selections
+        llm_latency = 0.0
+        fallback_triggered = False
+        if fallback:
+            # User forced fallback
+            llm_response = fallback_all_tests(parsed_files, "User requested --fallback")
+            fallback_triggered = True
+        else:
+            # Build context and call LLM
+            messages = build_context(
+                diff=diff,
+                parsed_files=parsed_files,
+                runners=config.runners,
+                max_context_tokens=config.llm.max_context_tokens,
+                model=config.llm.model,
+            )
+            try:
+                llm_response, llm_latency = select_tests(messages, config.llm)
+                if llm_response.fallback_recommended:
+                    click.echo("LLM recommended fallback — running all tests")
+                    llm_response = fallback_all_tests(parsed_files, "LLM recommended fallback")
+                    fallback_triggered = True
+            except LLMError as e:
+                click.echo(f"LLM error: {e}", err=True)
+                if config.fallback_on_error:
+                    click.echo("Falling back to running all tests")
+                    llm_response = fallback_all_tests(parsed_files, str(e))
+                    fallback_triggered = True
+                else:
+                    sys.exit(2)
+        # 6. Filter by run level
+        min_classifications = {
+            "must_run": {TestClassification.must_run},
+            "should_run": {TestClassification.must_run, TestClassification.should_run},
+            "all": {
+                TestClassification.must_run,
+                TestClassification.should_run,
+                TestClassification.skip,
+            },
+        }
+        allowed = min_classifications[run_level]
+        active_selections = [s for s in llm_response.selections if s.classification in allowed]
+        skipped_selections = [s for s in llm_response.selections if s.classification not in allowed]
+        click.echo(f"Selected: {len(active_selections)} tests (skipping {len(skipped_selections)})")
+        # 7. Execute tests (unless dry run)
+        results = []
+        if not dry_run and active_selections:
+            click.echo("Running selected tests...")
+            results = run_selected_tests(
+                selections=active_selections,
+                parsed_files=parsed_files,
+                runners=config.runners,
+                repo_root=repo_root,
+            )
+        # 8. Build report
+        total_duration = time.monotonic() - start_time
+        passed = sum(1 for r in results if r.passed)
+        failed = sum(1 for r in results if not r.passed)
+        report = RunReport(
+            total_tests_discovered=total_tests,
+            tests_selected=len(active_selections),
+            tests_skipped=len(skipped_selections),
+            tests_passed=passed,
+            tests_failed=failed,
+            llm_model_used=config.llm.model,
+            llm_latency_seconds=llm_latency,
+            total_duration_seconds=total_duration,
+            results=results,
+            selections=llm_response.selections,
+            fallback_triggered=fallback_triggered,
+        )
+        # 9. Report
+        report_results(report, output_format, output_file)
+        # 10. Exit code
+        if failed > 0:
+            sys.exit(1)
+    except TestwiseError as e:
+        click.echo(f"Error: {e}", err=True)
+        sys.exit(2)
+if __name__ == "__main__":
+    main()
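
Step 6 of `main()` decides what actually runs. The sketch below reproduces that filter in isolation so the effect of each `--run-level` value is easy to see. `Classification`, `ALLOWED`, and `partition` are stand-ins written for this example; the real code uses `TestClassification` from `testwise/models.py` and selection objects rather than tuples:

```python
from enum import Enum


class Classification(str, Enum):
    must_run = "must_run"
    should_run = "should_run"
    skip = "skip"


# Same shape as the min_classifications table in main().
ALLOWED = {
    "must_run": {Classification.must_run},
    "should_run": {Classification.must_run, Classification.should_run},
    "all": set(Classification),
}


def partition(selections, run_level):
    """Split (test_id, classification) pairs into (active, skipped) for a run level."""
    allowed = ALLOWED[run_level]
    active = [s for s in selections if s[1] in allowed]
    skipped = [s for s in selections if s[1] not in allowed]
    return active, skipped


selections = [
    ("test_login", Classification.must_run),
    ("test_profile", Classification.should_run),
    ("test_docs", Classification.skip),
]
for level in ("must_run", "should_run", "all"):
    active, skipped = partition(selections, level)
    print(level, len(active), len(skipped))
```

Two details follow from the code above. With `--run-level all`, tests the LLM classified as `skip` are executed as well. And in the `main()` shown here, the run level comes only from the `--run-level` option (default `should_run`); the `run_should_run` config key is not consulted in this function.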

testwise/config.py ADDED

@@ -0,0 +1,135 @@
+"""Configuration loading and validation."""
+from __future__ import annotations
+import os
+import subprocess
+from pathlib import Path
+import yaml
+from testwise.exceptions import ConfigError
+from testwise.models import TestwiseConfig
+def get_repo_root() -> Path:
+    """Get the root of the current git repository."""
+    try:
+        result = subprocess.run(
+            ["git", "rev-parse", "--show-toplevel"],
+            capture_output=True,
+            text=True,
+            check=True,
+        )
+        return Path(result.stdout.strip())
+    except (subprocess.CalledProcessError, FileNotFoundError) as e:
+        raise ConfigError(f"Not in a git repository: {e}") from e
+def find_config_file(repo_root: Path) -> Path | None:
+    """Look for .testwise.yml or .testwise.yaml in the repo root."""
+    for name in (".testwise.yml", ".testwise.yaml"):
+        path = repo_root / name
+        if path.exists():
+            return path
+    return None
+def load_config(
+    config_path: Path | None = None,
+    overrides: dict[str, object] | None = None,
+) -> TestwiseConfig:
+    """Load configuration from file and environment variables.
+    Resolution order:
+    1. Explicit config_path argument
+    2. TESTWISE_CONFIG environment variable
+    3. Auto-discover in repo root
+    4. Defaults
+    """
+    raw: dict[str, object] = {}
+    # Find config file
+    if config_path is None:
+        env_path = os.environ.get("TESTWISE_CONFIG")
+        if env_path:
+            config_path = Path(env_path)
+    if config_path is None:
+        try:
+            repo_root = get_repo_root()
+            config_path = find_config_file(repo_root)
+        except ConfigError:
+            pass
+    # Load YAML
+    if config_path is not None:
+        if not config_path.exists():
+            raise ConfigError(f"Config file not found: {config_path}")
+        try:
+            with open(config_path) as f:
+                raw = yaml.safe_load(f) or {}
+        except yaml.YAMLError as e:
+            raise ConfigError(f"Invalid YAML in {config_path}: {e}") from e
+    # Apply environment variable overrides
+    _apply_env_overrides(raw)
+    # Apply explicit overrides
+    if overrides:
+        _deep_merge(raw, overrides)
+    # Validate and return
+    try:
+        return TestwiseConfig.model_validate(raw)
+    except Exception as e:
+        raise ConfigError(f"Invalid configuration: {e}") from e
+def _apply_env_overrides(raw: dict[str, object]) -> None:
+    """Apply TESTWISE_* environment variables as config overrides."""
+    env_map = {
+        "TESTWISE_LLM_MODEL": ("llm", "model"),
+        "TESTWISE_LLM_TEMPERATURE": ("llm", "temperature"),
+        "TESTWISE_LLM_MAX_CONTEXT_TOKENS": ("llm", "max_context_tokens"),
+        "TESTWISE_LLM_TIMEOUT": ("llm", "timeout_seconds"),
+        "TESTWISE_FALLBACK_ON_ERROR": ("fallback_on_error",),
+        "TESTWISE_RUN_SHOULD_RUN": ("run_should_run",),
+    }
+    for env_var, path in env_map.items():
+        value = os.environ.get(env_var)
+        if value is None:
+            continue
+        # Coerce types
+        if path[-1] in ("temperature",):
+            value = float(value)  # type: ignore[assignment]
+        elif path[-1] in ("max_context_tokens", "timeout_seconds"):
+            value = int(value)  # type: ignore[assignment]
+        elif path[-1] in ("fallback_on_error", "run_should_run"):
+            value = value.lower() in ("true", "1", "yes")  # type: ignore[assignment]
+        # Set in raw dict
+        target: dict[str, object] = raw
+        for key in path[:-1]:
+            nested = target.setdefault(key, {})
+            assert isinstance(nested, dict)
+            target = nested
+        target[path[-1]] = value
+    # API key env var name override
+    api_key_env = os.environ.get("TESTWISE_API_KEY_ENV")
+    if api_key_env:
+        llm = raw.setdefault("llm", {})
+        assert isinstance(llm, dict)
+        llm["api_key_env"] = api_key_env
+def _deep_merge(base: dict[str, object], override: dict[str, object]) -> None:
+    """Merge override into base, modifying base in place."""
+    for key, value in override.items():
+        if key in base and isinstance(base[key], dict) and isinstance(value, dict):
+            _deep_merge(base[key], value)  # type: ignore[arg-type]
+        else:
+            base[key] = value
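
`load_config` layers its sources in a fixed order: YAML file first, then `TESTWISE_*` environment variables, then explicit `overrides`. The sketch below copies the merge rule and the boolean coercion into standalone functions to show two behaviours that are easy to miss. `deep_merge` and `coerce_bool` are renamed copies made for this example, and the sample dictionary is invented:

```python
def deep_merge(base: dict, override: dict) -> None:
    """Merge override into base in place: nested dicts merge, anything else replaces."""
    for key, value in override.items():
        if key in base and isinstance(base[key], dict) and isinstance(value, dict):
            deep_merge(base[key], value)
        else:
            base[key] = value


def coerce_bool(value: str) -> bool:
    """Same rule as the TESTWISE_FALLBACK_ON_ERROR handling: only true/1/yes are truthy."""
    return value.lower() in ("true", "1", "yes")


raw = {
    "llm": {"model": "anthropic/claude-sonnet-4-20250514", "temperature": 0.0},
    "runners": [{"name": "pytest"}],
}
deep_merge(raw, {"llm": {"temperature": 0.2}, "runners": []})
print(raw)
print(coerce_bool("YES"), coerce_bool("on"))
```

Lists are replaced wholesale rather than merged, so an override that supplies `runners` discards the runners from the file. Any boolean environment value outside `true`, `1`, or `yes` (case-insensitive) is treated as false, including `on`. Also, the `float(...)` and `int(...)` conversions in `_apply_env_overrides` run outside the `try` block in `load_config`, so a non-numeric value such as `TESTWISE_LLM_TEMPERATURE=abc` would appear to surface as a bare `ValueError` rather than a `ConfigError`.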