weco 0.3.5__tar.gz → 0.3.7__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {weco-0.3.5 → weco-0.3.7}/PKG-INFO +4 -1
- {weco-0.3.5 → weco-0.3.7}/README.md +2 -0
- {weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/README.md +26 -6
- {weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/eval.py +47 -11
- weco-0.3.7/examples/extract-line-plot/optimize.py +97 -0
- {weco-0.3.5 → weco-0.3.7}/pyproject.toml +3 -3
- weco-0.3.7/tests/__init__.py +1 -0
- weco-0.3.7/tests/test_byok.py +192 -0
- weco-0.3.7/tests/test_cli.py +70 -0
- {weco-0.3.5 → weco-0.3.7}/weco/api.py +31 -172
- {weco-0.3.5 → weco-0.3.7}/weco/cli.py +104 -90
- {weco-0.3.5 → weco-0.3.7}/weco/constants.py +3 -3
- {weco-0.3.5 → weco-0.3.7}/weco/optimizer.py +20 -9
- {weco-0.3.5 → weco-0.3.7}/weco/utils.py +33 -5
- {weco-0.3.5 → weco-0.3.7}/weco.egg-info/PKG-INFO +4 -1
- {weco-0.3.5 → weco-0.3.7}/weco.egg-info/SOURCES.txt +3 -1
- {weco-0.3.5 → weco-0.3.7}/weco.egg-info/requires.txt +1 -0
- weco-0.3.5/examples/extract-line-plot/optimize.py +0 -116
- weco-0.3.5/weco/chatbot.py +0 -827
- {weco-0.3.5 → weco-0.3.7}/.github/workflows/lint.yml +0 -0
- {weco-0.3.5 → weco-0.3.7}/.github/workflows/release.yml +0 -0
- {weco-0.3.5 → weco-0.3.7}/.gitignore +0 -0
- {weco-0.3.5 → weco-0.3.7}/LICENSE +0 -0
- {weco-0.3.5 → weco-0.3.7}/assets/example-optimization.gif +0 -0
- {weco-0.3.5 → weco-0.3.7}/assets/weco.svg +0 -0
- {weco-0.3.5 → weco-0.3.7}/contributing.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/README.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/cuda/README.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/cuda/evaluate.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/cuda/module.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/cuda/requirements.txt +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/guide.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/prepare_data.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/pyproject.toml +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/hello-world/README.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/hello-world/colab_notebook_walkthrough.ipynb +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/hello-world/evaluate.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/hello-world/module.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/hello-world/requirements.txt +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/prompt/README.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/prompt/eval.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/prompt/optimize.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/prompt/prompt_guide.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/spaceship-titanic/README.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/spaceship-titanic/competition_description.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/spaceship-titanic/data/sample_submission.csv +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/spaceship-titanic/data/test.csv +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/spaceship-titanic/data/train.csv +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/spaceship-titanic/evaluate.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/spaceship-titanic/train.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/triton/README.md +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/triton/evaluate.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/triton/module.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/examples/triton/requirements.txt +0 -0
- {weco-0.3.5 → weco-0.3.7}/setup.cfg +0 -0
- {weco-0.3.5 → weco-0.3.7}/weco/__init__.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/weco/auth.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/weco/credits.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/weco/panels.py +0 -0
- {weco-0.3.5 → weco-0.3.7}/weco.egg-info/dependency_links.txt +0 -0
- {weco-0.3.5 → weco-0.3.7}/weco.egg-info/entry_points.txt +0 -0
- {weco-0.3.5 → weco-0.3.7}/weco.egg-info/top_level.txt +0 -0
{weco-0.3.5 → weco-0.3.7}/PKG-INFO

````diff
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: weco
-Version: 0.3.5
+Version: 0.3.7
 Summary: Documentation for `weco`, a CLI for using Weco AI's code optimizer.
 Author-email: Weco AI Team <contact@weco.ai>
 License:
@@ -224,6 +224,7 @@ Provides-Extra: dev
 Requires-Dist: ruff; extra == "dev"
 Requires-Dist: build; extra == "dev"
 Requires-Dist: setuptools_scm; extra == "dev"
+Requires-Dist: pytest>=7.0.0; extra == "dev"
 Dynamic: license-file
 
 <div align="center">
@@ -323,6 +324,7 @@ For more advanced examples, including [Triton](/examples/triton/README.md), [CUD
 | `--eval-timeout` | Timeout in seconds for each step in evaluation. | No timeout (unlimited) | `--eval-timeout 3600` |
 | `--save-logs` | Save execution output from each optimization step to disk. Creates timestamped directories with raw output files and a JSONL index for tracking execution history. | `False` | `--save-logs` |
 | `--apply-change` | Automatically apply the best solution to the source file without prompting. | `False` | `--apply-change` |
+| `--api-key` | API keys for LLM providers (BYOK). Format: `provider=key`. Can specify multiple providers. | `None` | `--api-key openai=sk-xxx` |
 
 ---
 
@@ -377,6 +379,7 @@ Arguments for `weco resume`:
 |----------|-------------|---------|
 | `run-id` | The UUID of the run to resume (shown at the start of each run) | `0002e071-1b67-411f-a514-36947f0c4b31` |
 | `--apply-change` | Automatically apply the best solution to the source file without prompting | `--apply-change` |
+| `--api-key` | (Optional) API keys for LLM providers (BYOK). Format: `provider=key` | `--api-key openai=sk-xxx` |
 
 Notes:
 - Works only for interrupted runs (status: `error`, `terminated`, etc.).
````
{weco-0.3.5 → weco-0.3.7}/README.md

````diff
@@ -95,6 +95,7 @@ For more advanced examples, including [Triton](/examples/triton/README.md), [CUD
 | `--eval-timeout` | Timeout in seconds for each step in evaluation. | No timeout (unlimited) | `--eval-timeout 3600` |
 | `--save-logs` | Save execution output from each optimization step to disk. Creates timestamped directories with raw output files and a JSONL index for tracking execution history. | `False` | `--save-logs` |
 | `--apply-change` | Automatically apply the best solution to the source file without prompting. | `False` | `--apply-change` |
+| `--api-key` | API keys for LLM providers (BYOK). Format: `provider=key`. Can specify multiple providers. | `None` | `--api-key openai=sk-xxx` |
 
 ---
 
@@ -149,6 +150,7 @@ Arguments for `weco resume`:
 |----------|-------------|---------|
 | `run-id` | The UUID of the run to resume (shown at the start of each run) | `0002e071-1b67-411f-a514-36947f0c4b31` |
 | `--apply-change` | Automatically apply the best solution to the source file without prompting | `--apply-change` |
+| `--api-key` | (Optional) API keys for LLM providers (BYOK). Format: `provider=key` | `--api-key openai=sk-xxx` |
 
 Notes:
 - Works only for interrupted runs (status: `error`, `terminated`, etc.).
````
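Both tables document the same `provider=key` format. Judging by the tests added in this release (`tests/test_cli.py`, reproduced near the end of this diff), repeated `--api-key` flags merge into one provider→key mapping, with the provider name lowercased and the value split on the first `=` only, so keys that themselves contain `=` survive intact. A minimal sketch of that merge:

```python
# Sketch only: how repeated --api-key flags are expected to combine,
# matching the behavior pinned down by tests/test_cli.py below.
flags = ["OpenAI=sk-xxx", "anthropic=sk-ant-yyy", "openai=sk-zzz"]

keys = {}
for flag in flags:
    provider, _, key = flag.partition("=")  # split on the first "=" only
    keys[provider.strip().lower()] = key.strip()

print(keys)  # {'openai': 'sk-zzz', 'anthropic': 'sk-ant-yyy'}
```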
{weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/README.md

````diff
@@ -1,6 +1,6 @@
-## Extract Line Plot (Chart → CSV)
+## Extract Line Plot (Chart → CSV): Accuracy/Cost Optimization for Agentic Workflow
 
-This example
+This example demonstrates optimizing an AI feature that turns chart images into CSV tables, showcasing how to use Weco to improve accuracy or reduce cost of a VLM-based extraction workflow.
 
 ### Prerequisites
 
@@ -15,8 +15,9 @@ export OPENAI_API_KEY=your_key_here
 ### Files
 
 - `prepare_data.py`: downloads ChartQA (full) and prepares a 100-sample subset of line charts.
-- `optimize.py`:
+- `optimize.py`: exposes `extract_csv(image_path)` which returns CSV text plus the per-call cost (helpers stay private).
 - `eval.py`: evaluation harness that runs the baseline on images and reports a similarity score as "accuracy".
+- `guide.md`: optional additional instructions you can feed to Weco via `--additional-instructions guide.md`.
 
 Generated artifacts (gitignored):
 - `subset_line_100/` and `subset_line_100.zip`
@@ -47,12 +48,21 @@ Metric definition (summarized):
 - Per-sample score = 0.2 × header match + 0.8 × Jaccard(similarity of content rows).
 - Reported `accuracy` is the mean score over all evaluated samples.
 
+To emit a secondary `cost` metric that Weco can minimize (while enforcing `accuracy > 0.45`), append `--cost-metric`:
+
+```bash
+uv run --with openai python eval.py --max-samples 10 --num-workers 4 --cost-metric
+```
+
+If the final accuracy falls at or below `0.45`, the reported cost is replaced with a large penalty so Weco keeps searching for higher-accuracy solutions.
+You can tighten or relax this constraint with `--cost-accuracy-threshold`, e.g. `--cost-accuracy-threshold 0.50`.
+
 ### 3) Optimize the baseline with Weco
 
 Run Weco to iteratively improve `optimize.py` using 100 examples and many workers:
 
 ```bash
-weco run --source optimize.py --eval-command 'uv run --with openai python eval.py --max-samples 100 --num-workers 50' --metric accuracy --goal maximize --steps 20 --model gpt-5
+weco run --source optimize.py --eval-command 'uv run --with openai python eval.py --max-samples 100 --num-workers 50' --metric accuracy --goal maximize --steps 20 --model gpt-5 --additional-instructions guide.md
 ```
 
 Arguments:
````
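The per-sample metric in the hunk above (0.2 × header match + 0.8 × Jaccard over content rows) is easy to sanity-check by hand. A minimal sketch, assuming an exact-match header test and set-based Jaccard over rows; `eval.py`'s actual canonicalization of cells may differ:

```python
def jaccard(a: set, b: set) -> float:
    # |A ∩ B| / |A ∪ B|; define two empty sets as perfectly similar.
    return len(a & b) / len(a | b) if (a | b) else 1.0


def per_sample_score(gt_header, pred_header, gt_rows, pred_rows) -> float:
    # 0.2 × header match + 0.8 × Jaccard(content rows), per the README.
    header_match = 1.0 if gt_header == pred_header else 0.0
    return 0.2 * header_match + 0.8 * jaccard(set(gt_rows), set(pred_rows))


# Matching header, 3 rows shared out of a union of 5:
# 0.2 * 1.0 + 0.8 * (3 / 5) ≈ 0.68
print(per_sample_score(("Year", "Sales"), ("Year", "Sales"),
                       {"2001,10", "2002,12", "2003,14", "2004,16"},
                       {"2001,10", "2002,12", "2003,14", "2004,99"}))
```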
{weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/README.md (continued)

````diff
@@ -63,10 +73,20 @@ Arguments:
 - `--steps 20`: number of optimization iterations.
 - `--model gpt-5`: model used by Weco to propose edits (change as desired).
 
+To minimize cost instead (subject to the accuracy constraint), enable the flag in the eval command and switch the optimization target:
+
+```bash
+weco run --source optimize.py --eval-command 'uv run --with openai python eval.py --max-samples 100 --num-workers 50 --cost-metric' --metric cost --goal minimize --steps 20 --model gpt-5 --additional-instructions guide.md
+```
+
+#### Cost optimization workflow
+- Run the evaluation command with `--cost-metric` once to confirm accuracy meets your threshold and note the baseline cost.
+- Adjust `--cost-accuracy-threshold` if you want to tighten or relax the constraint before launching optimization.
+- Kick off Weco with `--metric cost --goal minimize --additional-instructions guide.md` so the optimizer respects the constraint while acting on the extra tips.
+
 ### Tips
 
 - Ensure your OpenAI key has access to a vision-capable model (default: `gpt-4o-mini` in the eval; change via `--model`).
 - Adjust `--num-workers` to balance throughput and rate limits.
 - You can tweak baseline behavior in `optimize.py` (prompt, temperature) — Weco will explore modifications automatically during optimization.
-
-
+- Include `--additional-instructions guide.md` whenever you run Weco so those cost-conscious hints influence the generated proposals.
````
{weco-0.3.5 → weco-0.3.7}/examples/extract-line-plot/eval.py

````diff
@@ -8,7 +8,7 @@ from concurrent.futures import ThreadPoolExecutor, as_completed
 from pathlib import Path
 from typing import Dict, List, Optional, Tuple
 
-from optimize import
+from optimize import extract_csv
 
 try:
     import matplotlib
@@ -18,6 +18,9 @@ try:
 except Exception:  # pragma: no cover - optional dependency
     plt = None
 
+COST_ACCURACY_THRESHOLD_DEFAULT = 0.45
+COST_CONSTRAINT_PENALTY = 1_000_000.0
+
 
 def read_index(index_csv_path: Path) -> List[Tuple[str, Path, Path]]:
     rows: List[Tuple[str, Path, Path]] = []
@@ -259,14 +262,14 @@ def evaluate_predictions(gt_csv_path: Path, pred_csv_path: Path) -> float:
 
 
 def process_one(
-
-) -> Tuple[str, float, Path, Path]:
+    base_dir: Path, example_id: str, image_rel: Path, gt_table_rel: Path, output_dir: Path
+) -> Tuple[str, float, Path, Path, float]:
     image_path = base_dir / image_rel
     gt_csv_path = base_dir / gt_table_rel
-    pred_csv_text =
+    pred_csv_text, cost_usd = extract_csv(image_path)
     pred_path = write_csv(output_dir, example_id, pred_csv_text)
     score = evaluate_predictions(gt_csv_path, pred_path)
-    return example_id, score, pred_path, gt_csv_path
+    return example_id, score, pred_path, gt_csv_path, cost_usd
 
 
 def main() -> None:
@@ -276,6 +279,20 @@ def main() -> None:
     parser.add_argument("--out-dir", type=str, default="predictions")
     parser.add_argument("--max-samples", type=int, default=100)
     parser.add_argument("--num-workers", type=int, default=4)
+    parser.add_argument(
+        "--cost-metric",
+        action="store_true",
+        help=(
+            "When set, also report a `cost:` metric suitable for Weco minimization. "
+            "Requires final accuracy to exceed --cost-accuracy-threshold; otherwise a large penalty is reported."
+        ),
+    )
+    parser.add_argument(
+        "--cost-accuracy-threshold",
+        type=float,
+        default=COST_ACCURACY_THRESHOLD_DEFAULT,
+        help="Minimum accuracy required when --cost-metric is set (default: 0.45).",
+    )
     parser.add_argument(
         "--visualize-dir",
         type=str,
@@ -307,7 +324,6 @@ def main() -> None:
         sys.exit(1)
 
     rows = read_index(index_path)[: args.max_samples]
-    extractor = VLMExtractor()
 
     visualize_dir: Optional[Path] = Path(args.visualize_dir) if args.visualize_dir else None
     visualize_max = max(0, args.visualize_max)
@@ -315,22 +331,24 @@ def main() -> None:
         print("[warn] matplotlib not available; skipping visualization.", file=sys.stderr)
         visualize_dir = None
 
-    print(f"[setup] evaluating {len(rows)} samples
+    print(f"[setup] evaluating {len(rows)} samples …", flush=True)
    start = time.time()
    scores: List[float] = []
+    costs: List[float] = []
    saved_visualizations = 0

    with ThreadPoolExecutor(max_workers=max(1, args.num_workers)) as pool:
        futures = [
-            pool.submit(process_one,
+            pool.submit(process_one, base_dir, example_id, image_rel, gt_table_rel, Path(args.out_dir))
            for (example_id, image_rel, gt_table_rel) in rows
        ]

        try:
            for idx, fut in enumerate(as_completed(futures), 1):
                try:
-                    example_id, score, pred_path, gt_csv_path = fut.result()
+                    example_id, score, pred_path, gt_csv_path, cost_usd = fut.result()
                    scores.append(score)
+                    costs.append(cost_usd)
                    if visualize_dir and (visualize_max == 0 or saved_visualizations < visualize_max):
                        out_path = visualize_difference(
                            gt_csv_path,
@@ -346,7 +364,11 @@ def main() -> None:
                    if idx % 5 == 0 or idx == len(rows):
                        elapsed = time.time() - start
                        avg = sum(scores) / len(scores) if scores else 0.0
-
+                        avg_cost = sum(costs) / len(costs) if costs else 0.0
+                        print(
+                            f"[progress] {idx}/{len(rows)} done, avg score: {avg:.4f}, avg cost: ${avg_cost:.4f}, elapsed {elapsed:.1f}s",
+                            flush=True,
+                        )
                except Exception as e:
                    print(f"[error] failed on sample {idx}: {e}", file=sys.stderr)
        except KeyboardInterrupt:
@@ -356,7 +378,7 @@ def main() -> None:
    final_score = sum(scores) / len(scores) if scores else 0.0

    # Apply cost cap: accuracy is zeroed if average cost/query exceeds $0.02
-    avg_cost_per_query = (
+    avg_cost_per_query = (sum(costs) / len(costs)) if costs else 0.0
    if avg_cost_per_query > 0.02:
        print(f"[cost] avg ${avg_cost_per_query:.4f}/query exceeds $0.02 cap; accuracy set to 0.0", flush=True)
        final_score = 0.0
@@ -365,6 +387,20 @@ def main() -> None:
 
     print(f"accuracy: {final_score:.4f}")
 
+    if args.cost_metric:
+        if final_score > args.cost_accuracy_threshold:
+            reported_cost = avg_cost_per_query
+        else:
+            print(
+                (
+                    f"[constraint] accuracy {final_score:.4f} <= "
+                    f"threshold {args.cost_accuracy_threshold:.2f}; reporting penalty ${COST_CONSTRAINT_PENALTY:.1f}"
+                ),
+                flush=True,
+            )
+            reported_cost = COST_CONSTRAINT_PENALTY
+        print(f"cost: {reported_cost:.6f}")
+
 
 if __name__ == "__main__":
     main()
````
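After these changes the harness's stdout contract is a pair of plain `name: value` lines, which is the value the `--metric accuracy` or `--metric cost` flag selects between. With `--cost-metric` set and the constraint satisfied, the tail of a run would look like this (numbers illustrative):

```text
accuracy: 0.4821
cost: 0.003210
```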
weco-0.3.7/examples/extract-line-plot/optimize.py (new file)

````diff
@@ -0,0 +1,97 @@
+"""
+optimize.py
+
+Exposes a single public entry point `extract_csv` that turns a chart image into CSV text.
+All helper utilities remain private to this module.
+"""
+
+import base64
+from pathlib import Path
+from typing import Optional, Tuple
+
+from openai import OpenAI
+
+__all__ = ["extract_csv"]
+
+_DEFAULT_MODEL = "gpt-4o-mini"
+_CLIENT = OpenAI()
+
+
+def _build_prompt() -> str:
+    return (
+        "You are a precise data extraction model. Given a chart image, extract the underlying data table.\n"
+        "Return ONLY the CSV text with a header row and no markdown code fences.\n"
+        "Rules:\n"
+        "- The first column must be the x-axis values with its exact axis label as the header.\n"
+        "- Include one column per data series using the legend labels as headers.\n"
+        "- Preserve the original order of x-axis ticks as they appear.\n"
+        "- Use plain CSV (comma-separated), no explanations, no extra text.\n"
+    )
+
+
+def _image_to_data_uri(image_path: Path) -> str:
+    mime = "image/png" if image_path.suffix.lower() == ".png" else "image/jpeg"
+    data = image_path.read_bytes()
+    b64 = base64.b64encode(data).decode("ascii")
+    return f"data:{mime};base64,{b64}"
+
+
+def _clean_to_csv(text: str) -> str:
+    return text.strip()
+
+
+def _pricing_for_model(model_name: str) -> dict:
+    """Return pricing information for the given model in USD per token."""
+    name = (model_name or "").lower()
+    per_million = {
+        "gpt-5": {"in": 1.250, "in_cached": 0.125, "out": 10.000},
+        "gpt-5-mini": {"in": 0.250, "in_cached": 0.025, "out": 2.000},
+        "gpt-5-nano": {"in": 0.050, "in_cached": 0.005, "out": 0.400},
+    }
+    if name.startswith("gpt-5-nano"):
+        chosen = per_million["gpt-5-nano"]
+    elif name.startswith("gpt-5-mini"):
+        chosen = per_million["gpt-5-mini"]
+    elif name.startswith("gpt-5"):
+        chosen = per_million["gpt-5"]
+    else:
+        chosen = per_million["gpt-5-mini"]
+    return {k: v / 1_000_000.0 for k, v in chosen.items()}
+
+
+def extract_csv(image_path: Path, model: Optional[str] = None) -> Tuple[str, float]:
+    """
+    Extract CSV text from an image and return (csv_text, cost_usd).
+
+    The caller can optionally override the model name; otherwise the default is used.
+    """
+    effective_model = model or _DEFAULT_MODEL
+    prompt = _build_prompt()
+    image_uri = _image_to_data_uri(image_path)
+    response = _CLIENT.chat.completions.create(
+        model=effective_model,
+        messages=[
+            {
+                "role": "user",
+                "content": [{"type": "text", "text": prompt}, {"type": "image_url", "image_url": {"url": image_uri}}],
+            }
+        ],
+    )
+
+    usage = getattr(response, "usage", None)
+    cost_usd = 0.0
+    if usage is not None:
+        prompt_tokens = int(getattr(usage, "prompt_tokens", 0) or 0)
+        completion_tokens = int(getattr(usage, "completion_tokens", 0) or 0)
+        details = getattr(usage, "prompt_tokens_details", None)
+        cached_tokens = 0
+        if details is not None:
+            cached_tokens = int(getattr(details, "cached_tokens", 0) or 0)
+        non_cached_prompt_tokens = max(0, prompt_tokens - cached_tokens)
+        rates = _pricing_for_model(effective_model)
+        cost_usd = (
+            non_cached_prompt_tokens * rates["in"] + cached_tokens * rates["in_cached"] + completion_tokens * rates["out"]
+        )
+
+    text = response.choices[0].message.content or ""
+    return _clean_to_csv(text), cost_usd
````
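Two things are worth noting in the pricing helper above: rates are converted from USD per million tokens to USD per token, and any model outside the `gpt-5` family — including the `gpt-4o-mini` default — falls back to the `gpt-5-mini` rates, so the reported cost for the default model is an approximation. A worked example of the arithmetic in `extract_csv`:

```python
# Worked example of the cost formula in extract_csv, at gpt-5-mini rates
# (USD per million tokens: in=0.250, in_cached=0.025, out=2.000).
prompt_tokens, cached_tokens, completion_tokens = 1_200, 200, 300
non_cached = max(0, prompt_tokens - cached_tokens)  # 1,000 billable input tokens

cost_usd = (
    non_cached * (0.250 / 1e6)            # 1000 × $0.00000025  = $0.000250
    + cached_tokens * (0.025 / 1e6)       # 200  × $0.000000025 = $0.000005
    + completion_tokens * (2.000 / 1e6)   # 300  × $0.000002    = $0.000600
)
print(f"${cost_usd:.6f}")  # $0.000855
```

At these rates a single call stays well under the $0.02/query cap that `eval.py` enforces.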
{weco-0.3.5 → weco-0.3.7}/pyproject.toml

````diff
@@ -8,7 +8,7 @@ name = "weco"
 authors = [{ name = "Weco AI Team", email = "contact@weco.ai" }]
 description = "Documentation for `weco`, a CLI for using Weco AI's code optimizer."
 readme = "README.md"
-version = "0.3.5"
+version = "0.3.7"
 license = { file = "LICENSE" }
 requires-python = ">=3.8"
 dependencies = [
@@ -18,7 +18,7 @@ dependencies = [
     "gitingest",
     "fastapi",
     "slowapi",
-    "psutil"
+    "psutil"
 ]
 keywords = ["AI", "Code Optimization", "Code Generation"]
 classifiers = [
@@ -34,7 +34,7 @@ weco = "weco.cli:main"
 Homepage = "https://github.com/WecoAI/weco-cli"
 
 [project.optional-dependencies]
-dev = ["ruff", "build", "setuptools_scm"]
+dev = ["ruff", "build", "setuptools_scm", "pytest>=7.0.0"]
 
 [tool.setuptools]
 packages = ["weco"]
````
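With the new `dev` extra installed (for example via `pip install -e ".[dev]"`), the `tests/` package introduced below runs under plain `pytest`; the `pytest>=7.0.0` pin here matches the `Requires-Dist` line added to PKG-INFO above.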
weco-0.3.7/tests/__init__.py (new file)

````diff
@@ -0,0 +1 @@
+"""Tests for weco CLI."""
````
weco-0.3.7/tests/test_byok.py (new file)

````diff
@@ -0,0 +1,192 @@
+"""Tests to verify API keys are correctly passed through the system and sent to the API."""
+
+import pytest
+from unittest.mock import patch, MagicMock
+from rich.console import Console
+
+from weco.api import start_optimization_run, evaluate_feedback_then_suggest_next_solution
+
+
+class TestApiKeysInStartOptimizationRun:
+    """Test that api_keys are correctly included in start_optimization_run requests."""
+
+    @pytest.fixture
+    def mock_console(self):
+        """Create a mock console for testing."""
+        return MagicMock(spec=Console)
+
+    @pytest.fixture
+    def base_params(self, mock_console):
+        """Base parameters for start_optimization_run."""
+        return {
+            "console": mock_console,
+            "source_code": "print('hello')",
+            "source_path": "test.py",
+            "evaluation_command": "python test.py",
+            "metric_name": "accuracy",
+            "maximize": True,
+            "steps": 10,
+            "code_generator_config": {"model": "o4-mini"},
+            "evaluator_config": {"model": "o4-mini"},
+            "search_policy_config": {"num_drafts": 2},
+        }
+
+    @patch("weco.api.requests.post")
+    def test_api_keys_included_in_request(self, mock_post, base_params):
+        """Test that api_keys are included in the request JSON when provided."""
+        mock_response = MagicMock()
+        mock_response.json.return_value = {
+            "run_id": "test-run-id",
+            "solution_id": "test-solution-id",
+            "code": "print('hello')",
+            "plan": "test plan",
+        }
+        mock_response.raise_for_status = MagicMock()
+        mock_post.return_value = mock_response
+
+        api_keys = {"openai": "sk-test-key", "anthropic": "sk-ant-test"}
+        start_optimization_run(**base_params, api_keys=api_keys)
+
+        # Verify the request was made with api_keys in the JSON payload
+        mock_post.assert_called_once()
+        call_kwargs = mock_post.call_args
+        request_json = call_kwargs.kwargs["json"]
+        assert "api_keys" in request_json
+        assert request_json["api_keys"] == {"openai": "sk-test-key", "anthropic": "sk-ant-test"}
+
+    @patch("weco.api.requests.post")
+    def test_api_keys_not_included_when_none(self, mock_post, base_params):
+        """Test that api_keys field is not included when api_keys is None."""
+        mock_response = MagicMock()
+        mock_response.json.return_value = {
+            "run_id": "test-run-id",
+            "solution_id": "test-solution-id",
+            "code": "print('hello')",
+            "plan": "test plan",
+        }
+        mock_response.raise_for_status = MagicMock()
+        mock_post.return_value = mock_response
+
+        start_optimization_run(**base_params, api_keys=None)
+
+        # Verify the request was made without api_keys
+        mock_post.assert_called_once()
+        call_kwargs = mock_post.call_args
+        request_json = call_kwargs.kwargs["json"]
+        assert "api_keys" not in request_json
+
+    @patch("weco.api.requests.post")
+    def test_api_keys_not_included_when_empty_dict(self, mock_post, base_params):
+        """Test that api_keys field is not included when api_keys is an empty dict."""
+        mock_response = MagicMock()
+        mock_response.json.return_value = {
+            "run_id": "test-run-id",
+            "solution_id": "test-solution-id",
+            "code": "print('hello')",
+            "plan": "test plan",
+        }
+        mock_response.raise_for_status = MagicMock()
+        mock_post.return_value = mock_response
+
+        # Empty dict is falsy, so api_keys should not be included
+        start_optimization_run(**base_params, api_keys={})
+
+        mock_post.assert_called_once()
+        call_kwargs = mock_post.call_args
+        request_json = call_kwargs.kwargs["json"]
+        assert "api_keys" not in request_json
+
+
+class TestApiKeysInEvaluateFeedbackThenSuggest:
+    """Test that api_keys are correctly included in evaluate_feedback_then_suggest_next_solution requests."""
+
+    @pytest.fixture
+    def mock_console(self):
+        """Create a mock console for testing."""
+        return MagicMock(spec=Console)
+
+    @patch("weco.api.requests.post")
+    def test_api_keys_included_in_suggest_request(self, mock_post, mock_console):
+        """Test that api_keys are included in the suggest request JSON when provided."""
+        mock_response = MagicMock()
+        mock_response.json.return_value = {
+            "run_id": "test-run-id",
+            "solution_id": "new-solution-id",
+            "code": "print('improved')",
+            "plan": "improvement plan",
+            "is_done": False,
+        }
+        mock_response.raise_for_status = MagicMock()
+        mock_post.return_value = mock_response
+
+        api_keys = {"openai": "sk-test-key"}
+        evaluate_feedback_then_suggest_next_solution(
+            console=mock_console,
+            run_id="test-run-id",
+            step=1,
+            execution_output="accuracy: 0.95",
+            auth_headers={"Authorization": "Bearer test-token"},
+            api_keys=api_keys,
+        )
+
+        mock_post.assert_called_once()
+        call_kwargs = mock_post.call_args
+        request_json = call_kwargs.kwargs["json"]
+        assert "api_keys" in request_json
+        assert request_json["api_keys"] == {"openai": "sk-test-key"}
+
+    @patch("weco.api.requests.post")
+    def test_api_keys_not_included_in_suggest_when_none(self, mock_post, mock_console):
+        """Test that api_keys field is not included in suggest request when api_keys is None."""
+        mock_response = MagicMock()
+        mock_response.json.return_value = {
+            "run_id": "test-run-id",
+            "solution_id": "new-solution-id",
+            "code": "print('improved')",
+            "plan": "improvement plan",
+            "is_done": False,
+        }
+        mock_response.raise_for_status = MagicMock()
+        mock_post.return_value = mock_response
+
+        evaluate_feedback_then_suggest_next_solution(
+            console=mock_console,
+            run_id="test-run-id",
+            step=1,
+            execution_output="accuracy: 0.95",
+            auth_headers={"Authorization": "Bearer test-token"},
+            api_keys=None,
+        )
+
+        mock_post.assert_called_once()
+        call_kwargs = mock_post.call_args
+        request_json = call_kwargs.kwargs["json"]
+        assert "api_keys" not in request_json
+
+    @patch("weco.api.requests.post")
+    def test_api_keys_not_included_in_suggest_when_empty_dict(self, mock_post, mock_console):
+        """Test that api_keys field is not included in suggest request when api_keys is an empty dict."""
+        mock_response = MagicMock()
+        mock_response.json.return_value = {
+            "run_id": "test-run-id",
+            "solution_id": "new-solution-id",
+            "code": "print('improved')",
+            "plan": "improvement plan",
+            "is_done": False,
+        }
+        mock_response.raise_for_status = MagicMock()
+        mock_post.return_value = mock_response
+
+        evaluate_feedback_then_suggest_next_solution(
+            console=mock_console,
+            run_id="test-run-id",
+            step=1,
+            execution_output="accuracy: 0.95",
+            auth_headers={"Authorization": "Bearer test-token"},
+            api_keys={},
+        )
+
+        mock_post.assert_called_once()
+        call_kwargs = mock_post.call_args
+        request_json = call_kwargs.kwargs["json"]
+        assert "api_keys" not in request_json
````
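All six tests assert a single payload rule: `api_keys` appears in the request JSON only when a non-empty mapping is supplied; `None` and `{}` both leave the field out. A minimal client-side sketch that would satisfy them (the URL and surrounding payload fields here are illustrative, not the actual `weco.api` internals):

```python
import requests


def post_with_optional_keys(url, payload, auth_headers, api_keys=None):
    # BYOK keys are attached only when truthy; None and {} are both
    # omitted entirely, which is exactly what the tests above assert.
    if api_keys:
        payload["api_keys"] = api_keys
    response = requests.post(url, json=payload, headers=auth_headers)
    response.raise_for_status()
    return response.json()
```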
weco-0.3.7/tests/test_cli.py (new file)

````diff
@@ -0,0 +1,70 @@
+"""Tests for CLI functions, particularly parse_api_keys."""
+
+import pytest
+from weco.cli import parse_api_keys
+
+
+class TestParseApiKeys:
+    """Test cases for parse_api_keys function."""
+
+    def test_parse_api_keys_none(self):
+        """Test that None input returns empty dict."""
+        result = parse_api_keys(None)
+        assert result == {}
+        assert isinstance(result, dict)
+
+    def test_parse_api_keys_empty_list(self):
+        """Test that empty list returns empty dict."""
+        result = parse_api_keys([])
+        assert result == {}
+        assert isinstance(result, dict)
+
+    def test_parse_api_keys_single_key(self):
+        """Test parsing a single API key."""
+        result = parse_api_keys(["openai=sk-xxx"])
+        assert result == {"openai": "sk-xxx"}
+
+    def test_parse_api_keys_multiple_keys(self):
+        """Test parsing multiple API keys."""
+        result = parse_api_keys(["openai=sk-xxx", "anthropic=sk-ant-yyy"])
+        assert result == {"openai": "sk-xxx", "anthropic": "sk-ant-yyy"}
+
+    def test_parse_api_keys_whitespace_handling(self):
+        """Test that whitespace is stripped from provider and key."""
+        result = parse_api_keys([" openai = sk-xxx ", " anthropic = sk-ant-yyy "])
+        assert result == {"openai": "sk-xxx", "anthropic": "sk-ant-yyy"}
+
+    def test_parse_api_keys_key_contains_equals(self):
+        """Test that keys containing '=' are handled correctly (split on first '=' only)."""
+        result = parse_api_keys(["openai=sk-xxx=extra=more"])
+        assert result == {"openai": "sk-xxx=extra=more"}
+
+    def test_parse_api_keys_no_equals(self):
+        """Test that missing '=' raises ValueError."""
+        with pytest.raises(ValueError, match="Invalid API key format.*Expected format: 'provider=key'"):
+            parse_api_keys(["openai"])
+
+    def test_parse_api_keys_empty_provider(self):
+        """Test that empty provider raises ValueError."""
+        with pytest.raises(ValueError, match="Provider and key must be non-empty"):
+            parse_api_keys(["=sk-xxx"])
+
+    def test_parse_api_keys_empty_key(self):
+        """Test that empty key raises ValueError."""
+        with pytest.raises(ValueError, match="Provider and key must be non-empty"):
+            parse_api_keys(["openai="])
+
+    def test_parse_api_keys_both_empty(self):
+        """Test that both empty provider and key raises ValueError."""
+        with pytest.raises(ValueError, match="Provider and key must be non-empty"):
+            parse_api_keys(["="])
+
+    def test_parse_api_keys_duplicate_provider(self):
+        """Test that duplicate providers overwrite previous value."""
+        result = parse_api_keys(["openai=sk-xxx", "openai=sk-yyy"])
+        assert result == {"openai": "sk-yyy"}
+
+    def test_parse_api_keys_mixed_case_provider(self):
+        """Test that mixed case providers are normalized correctly."""
+        result = parse_api_keys(["OpenAI=sk-xxx", "ANTHROPIC=sk-ant-yyy"])
+        assert result == {"openai": "sk-xxx", "anthropic": "sk-ant-yyy"}
````