weco 0.2.28__tar.gz → 0.3.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (50)
  1. {weco-0.2.28 → weco-0.3.1}/.gitignore +6 -1
  2. {weco-0.2.28 → weco-0.3.1}/PKG-INFO +52 -55
  3. {weco-0.2.28 → weco-0.3.1}/README.md +51 -54
  4. {weco-0.2.28 → weco-0.3.1}/examples/README.md +8 -12
  5. {weco-0.2.28 → weco-0.3.1}/examples/cuda/README.md +7 -18
  6. {weco-0.2.28 → weco-0.3.1}/examples/cuda/evaluate.py +10 -28
  7. {weco-0.2.28 → weco-0.3.1}/examples/hello-kernel-world/README.md +2 -12
  8. {weco-0.2.28 → weco-0.3.1}/examples/hello-kernel-world/colab_notebook_walkthrough.ipynb +95 -135
  9. {weco-0.2.28 → weco-0.3.1}/examples/hello-kernel-world/evaluate.py +10 -1
  10. {weco-0.2.28 → weco-0.3.1}/examples/prompt/README.md +3 -5
  11. {weco-0.2.28 → weco-0.3.1}/examples/spaceship-titanic/README.md +2 -6
  12. {weco-0.2.28 → weco-0.3.1}/examples/triton/README.md +8 -21
  13. {weco-0.2.28 → weco-0.3.1}/examples/triton/evaluate.py +10 -27
  14. {weco-0.2.28 → weco-0.3.1}/pyproject.toml +1 -1
  15. {weco-0.2.28 → weco-0.3.1}/weco/api.py +164 -59
  16. {weco-0.2.28 → weco-0.3.1}/weco/auth.py +12 -14
  17. {weco-0.2.28 → weco-0.3.1}/weco/chatbot.py +18 -3
  18. {weco-0.2.28 → weco-0.3.1}/weco/cli.py +75 -1
  19. {weco-0.2.28 → weco-0.3.1}/weco/constants.py +6 -3
  20. weco-0.3.1/weco/credits.py +172 -0
  21. {weco-0.2.28 → weco-0.3.1}/weco/optimizer.py +415 -89
  22. {weco-0.2.28 → weco-0.3.1}/weco/panels.py +6 -18
  23. {weco-0.2.28 → weco-0.3.1}/weco/utils.py +9 -54
  24. {weco-0.2.28 → weco-0.3.1}/weco.egg-info/PKG-INFO +52 -55
  25. {weco-0.2.28 → weco-0.3.1}/weco.egg-info/SOURCES.txt +1 -1
  26. weco-0.2.28/examples/cuda/guide.md +0 -89
  27. {weco-0.2.28 → weco-0.3.1}/.github/workflows/lint.yml +0 -0
  28. {weco-0.2.28 → weco-0.3.1}/.github/workflows/release.yml +0 -0
  29. {weco-0.2.28 → weco-0.3.1}/LICENSE +0 -0
  30. {weco-0.2.28 → weco-0.3.1}/assets/example-optimization.gif +0 -0
  31. {weco-0.2.28 → weco-0.3.1}/assets/weco.svg +0 -0
  32. {weco-0.2.28 → weco-0.3.1}/contributing.md +0 -0
  33. {weco-0.2.28 → weco-0.3.1}/examples/cuda/optimize.py +0 -0
  34. {weco-0.2.28 → weco-0.3.1}/examples/hello-kernel-world/optimize.py +0 -0
  35. {weco-0.2.28 → weco-0.3.1}/examples/prompt/eval.py +0 -0
  36. {weco-0.2.28 → weco-0.3.1}/examples/prompt/optimize.py +0 -0
  37. {weco-0.2.28 → weco-0.3.1}/examples/prompt/prompt_guide.md +0 -0
  38. {weco-0.2.28 → weco-0.3.1}/examples/spaceship-titanic/competition_description.md +0 -0
  39. {weco-0.2.28 → weco-0.3.1}/examples/spaceship-titanic/data/sample_submission.csv +0 -0
  40. {weco-0.2.28 → weco-0.3.1}/examples/spaceship-titanic/data/test.csv +0 -0
  41. {weco-0.2.28 → weco-0.3.1}/examples/spaceship-titanic/data/train.csv +0 -0
  42. {weco-0.2.28 → weco-0.3.1}/examples/spaceship-titanic/evaluate.py +0 -0
  43. {weco-0.2.28 → weco-0.3.1}/examples/spaceship-titanic/train.py +0 -0
  44. {weco-0.2.28 → weco-0.3.1}/examples/triton/optimize.py +0 -0
  45. {weco-0.2.28 → weco-0.3.1}/setup.cfg +0 -0
  46. {weco-0.2.28 → weco-0.3.1}/weco/__init__.py +0 -0
  47. {weco-0.2.28 → weco-0.3.1}/weco.egg-info/dependency_links.txt +0 -0
  48. {weco-0.2.28 → weco-0.3.1}/weco.egg-info/entry_points.txt +0 -0
  49. {weco-0.2.28 → weco-0.3.1}/weco.egg-info/requires.txt +0 -0
  50. {weco-0.2.28 → weco-0.3.1}/weco.egg-info/top_level.txt +0 -0
--- weco-0.2.28/.gitignore
+++ weco-0.3.1/.gitignore
@@ -24,9 +24,11 @@ wheels/
 .coverage
 htmlcov/
 .env
+.env.*
 .venv
 venv/
 ENV/
+.envrc
 
 # VSCode Extension
 node_modules/
@@ -63,6 +65,9 @@ Thumbs.db
 # Linting
 .ruff_cache/
 
+# UV
+uv.lock
+
 # Miscellaneous
 etc/
 
@@ -78,4 +83,4 @@ CLAUDE.md
 repomix-output.*
 
 # Claude config
-.claude/
+.claude/
--- weco-0.2.28/PKG-INFO
+++ weco-0.3.1/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: weco
-Version: 0.2.28
+Version: 0.3.1
 Summary: Documentation for `weco`, a CLI for using Weco AI's code optimizer.
 Author-email: Weco AI Team <contact@weco.ai>
 License:
@@ -270,16 +270,17 @@ The `weco` CLI leverages a tree search approach guided by LLMs to iteratively ex
 1. **Install the Package:**
 
    ```bash
-   pip install weco>=0.2.18
+   pip install weco
    ```
 
-2. **Set Up LLM API Keys (Required):**
+2. **Authenticate (Required):**
 
-   `weco` requires API keys for the LLMs it uses internally. You **must** provide these keys via environment variables:
+   `weco` now uses a **credit-based billing system** with centralized LLM access. You need to authenticate to use the service:
 
-   - **OpenAI:** `export OPENAI_API_KEY="your_key_here"` (Create your OpenAI API key [here](https://platform.openai.com/api-keys))
-   - **Anthropic:** `export ANTHROPIC_API_KEY="your_key_here"` (Create your Anthropic API key [here](https://console.anthropic.com/settings/keys))
-   - **Google:** `export GEMINI_API_KEY="your_key_here"` (Google AI Studio has a free API usage quota. Create your Gemini API key [here](https://aistudio.google.com/apikey) to use `weco` for free.)
+   - **Run the CLI**: `weco` will prompt you to authenticate via your web browser
+   - **Free Credits**: New users receive **free credits** upon signup
+   - **Centralized Keys**: All LLM provider API keys are managed by Weco (no BYOK required)
+   - **Credit Top-ups**: Purchase additional credits through the dashboard at [dashboard.weco.ai](https://dashboard.weco.ai)
 
 ---
 
@@ -338,6 +339,8 @@ weco run --source optimize.py \
 
 For more advanced examples, including [Triton](/examples/triton/README.md), [CUDA kernel optimization](/examples/cuda/README.md), [ML model optimization](/examples/spaceship-titanic/README.md), and [prompt engineering for math problems](examples/prompt/README.md), please see the `README.md` files within the corresponding subdirectories under the [`examples/`](examples/) folder.
 
+> Note: We recommend removing any backticks from your code if any are present. We currently don't support backticks but will support this in the future.
+
 ---
 
 ### Arguments for `weco run`
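Taken together, the install and authentication hunks above collapse the quickstart to two commands. A hedged sketch (the exact authentication prompt text is not shown in this diff):

```bash
# Install the CLI, then invoke it once; per the updated README, the first
# run opens a browser for credit-based authentication instead of reading
# provider API keys from the environment.
pip install weco
weco
```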
@@ -358,8 +361,8 @@ For more advanced examples, including [Triton](/examples/triton/README.md), [CUD
 | Argument | Description | Default | Example |
 | :--- | :--- | :--- | :--- |
 | `-n, --steps` | Number of optimization steps (LLM iterations) to run. | 100 | `-n 50` |
-| `-M, --model` | Model identifier for the LLM to use (e.g., `o4-mini`, `claude-sonnet-4-0`). | `o4-mini` when `OPENAI_API_KEY` is set; `claude-sonnet-4-0` when `ANTHROPIC_API_KEY` is set; `gemini-2.5-pro` when `GEMINI_API_KEY` is set. | `-M o4-mini` |
-| `-i, --additional-instructions`| Natural language description of specific instructions **or** path to a file containing detailed instructions to guide the LLM. | `None` | `-i instructions.md` or `-i "Optimize the model for faster inference"`|
+| `-M, --model` | Model identifier for the LLM to use (e.g., `o4-mini`, `claude-sonnet-4-0`). | `o4-mini` | `-M o4-mini` |
+| `-i, --additional-instructions`| Natural language description of specific instructions **or** path to a file containing detailed instructions to guide the LLM. Supported file formats include `.txt`, `.md`, and `.rst`. | `None` | `-i instructions.md` or `-i "Optimize the model for faster inference"`|
 | `-l, --log-dir` | Path to the directory to log intermediate steps and final optimization result. | `.runs/` | `-l ./logs/` |
 | `--eval-timeout` | Timeout in seconds for each step in evaluation. | No timeout (unlimited) | `--eval-timeout 3600` |
 | `--save-logs` | Save execution output from each optimization step to disk. Creates timestamped directories with raw output files and a JSONL index for tracking execution history. | `False` | `--save-logs` |
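As a hedged illustration of the flags documented in this table combined into a single invocation (file names are placeholders, not from the diff):

```bash
# Illustrative `weco run` exercising the documented options
weco run --source optimize.py \
  --eval-command "python evaluate.py --solution-path optimize.py" \
  --metric speedup --goal maximize \
  -n 50 -M o4-mini -i instructions.md \
  -l ./logs/ --eval-timeout 3600 --save-logs
```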
@@ -368,24 +371,24 @@ For more advanced examples, including [Triton](/examples/triton/README.md), [CUD
 
 ### Authentication & Dashboard
 
-Weco offers both **anonymous** and **authenticated** usage:
-
-#### Anonymous Usage
-You can use Weco without creating an account by providing LLM API keys via environment variables. This is perfect for trying out Weco or for users who prefer not to create accounts.
+The CLI requires a Weco account for authentication and billing.
 
-#### Authenticated Usage (Recommended)
-To save your optimization runs and view them on the Weco dashboard, you can log in using Weco's secure device authentication flow:
+#### Credit-Based Authentication (Required)
+Weco now requires authentication for all operations. This enables our credit-based billing system and provides access to powerful optimizations:
 
-1. **During onboarding**: When you run `weco` for the first time, you'll be prompted to log in or skip
+1. **During onboarding**: When you run `weco` for the first time, you'll be prompted to log in
 2. **Manual login**: Use `weco logout` to clear credentials, then run `weco` again to re-authenticate
 3. **Device flow**: Weco will open your browser automatically and guide you through a secure OAuth-style authentication
 
 ![image (16)](https://github.com/user-attachments/assets/8a0a285b-4894-46fa-b6a2-4990017ca0c6)
 
-**Benefits of authenticated usage:**
-- **Run history**: View all your optimization runs on the Weco dashboard
-- **Progress tracking**: Monitor long-running optimizations remotely
-- **Enhanced support**: Get better assistance with your optimization challenges
+**Benefits:**
+- **No API Key Management**: All LLM provider keys are managed centrally
+- **Cost Transparency**: See exactly how many credits each optimization consumes
+- **Free Trial**: Free credits to get started with optimization projects
+- **Run History**: View all your optimization runs on the Weco dashboard
+- **Progress Tracking**: Monitor long-running optimizations remotely
+- **Budget Control**: Set spending limits and auto top-up preferences
 
 ---
 
@@ -398,6 +401,7 @@ To save your optimization runs and view them on the Weco dashboard, you can log
 | `weco` | Launch interactive onboarding | **Recommended for beginners** - Analyzes your codebase and guides you through setup |
 | `weco /path/to/project` | Launch onboarding for specific project | When working with a project in a different directory |
 | `weco run [options]` | Direct optimization execution | **For advanced users** - When you know exactly what to optimize and how |
+| `weco resume <run-id>` | Resume an interrupted run | Continue from the last completed step |
 | `weco logout` | Clear authentication credentials | To switch accounts or troubleshoot authentication issues |
 
 ### Model Selection
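The new `weco resume` row joins `weco logout` in this table; the account-switching path the table describes is, in sketch form:

```bash
# Clear stored credentials, then re-run; per the docs, the next invocation
# re-triggers the browser-based device authentication flow.
weco logout
weco
```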
@@ -413,14 +417,37 @@ weco run --model claude-3.5-sonnet --source optimize.py [other options...]
 ```
 
 **Available models:**
-- `gpt-4o`, `o4-mini` (requires `OPENAI_API_KEY`)
-- `claude-3.5-sonnet`, `claude-sonnet-4-20250514` (requires `ANTHROPIC_API_KEY`)
-- `gemini-2.5-pro` (requires `GEMINI_API_KEY`)
+- `o4-mini`, `o3-mini`, `gpt-4o` (OpenAI models)
+- `claude-sonnet-4-0`, `claude-opus-4-0` (Anthropic models)
+- `gemini-2.5-pro`, `gemini-2.5-flash` (Google models)
 
-If no model is specified, Weco automatically selects the best available model based on your API keys.
+All models are available through Weco's centralized system. If no model is specified, Weco automatically selects the best model for your optimization task.
 
 ---
 
+### Resuming Interrupted Runs
+
+If your optimization run is interrupted (network issues, restart, etc.), resume from the most recent node:
+
+```bash
+# Resume an interrupted run
+weco resume 0002e071-1b67-411f-a514-36947f0c4b31
+```
+
+Arguments for `weco resume`:
+
+| Argument | Description | Example |
+|----------|-------------|---------|
+| `run-id` | The UUID of the run to resume (shown at the start of each run) | `0002e071-1b67-411f-a514-36947f0c4b31` |
+
+Notes:
+- Works only for interrupted runs (status: `error`, `terminated`, etc.).
+- You'll be prompted to confirm that your evaluation environment (source file + evaluation command) hasn't changed.
+- The source file is restored to the most recent solution before continuing.
+- All progress and metrics from the original run are preserved.
+- Log directory, save-logs behavior, and evaluation timeout are reused from the original run.
+
 ### Performance & Expectations
 
 Weco, powered by the AIDE algorithm, optimizes code iteratively based on your evaluation results. Achieving significant improvements, especially on complex research-level tasks, often requires substantial exploration time.
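An end-to-end sketch of the interruption/resume cycle this hunk documents (the UUID is the diff's own illustrative example):

```bash
# Start a run; the run ID is printed at the start of the run.
weco run --source optimize.py \
  --eval-command "python evaluate.py --solution-path optimize.py" \
  --metric speedup --goal maximize -n 50
# ...run interrupted (Ctrl-C, network drop, reboot)...

# Resume from the most recent node; progress, metrics, log directory,
# and eval timeout are reused from the original run.
weco resume 0002e071-1b67-411f-a514-36947f0c4b31
```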
@@ -493,37 +520,7 @@ Weco will parse this output to extract the numerical value (1.5 in this case) as
 
 ## Supported Models
 
-Weco supports the following LLM models:
-
-### OpenAI Models
-- `gpt-5` (recommended)
-- `gpt-5-mini`
-- `gpt-5-nano`
-- `o3-pro` (recommended)
-- `o3` (recommended)
-- `o4-mini` (recommended)
-- `o3-mini`
-- `o1-pro`
-- `o1`
-- `gpt-4.1`
-- `gpt-4.1-mini`
-- `gpt-4.1-nano`
-- `gpt-4o`
-- `gpt-4o-mini`
-- `codex-mini-latest`
-
-### Anthropic Models
-- `claude-opus-4-1`
-- `claude-opus-4-0`
-- `claude-sonnet-4-0`
-- `claude-3-7-sonnet-latest`
-
-### Gemini Models
-- `gemini-2.5-pro`
-- `gemini-2.5-flash`
-- `gemini-2.5-flash-lite`
-
-You can specify any of these models using the `-M` or `--model` flag. Ensure you have the corresponding API key set as an environment variable for the model provider you wish to use.
+A list of models we support can be found in our documentation [here](https://docs.weco.ai/cli/supported-models).
 
 ---
 
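The hunk header above cites the README's metric-parsing rule ("Weco will parse this output to extract the numerical value (1.5 in this case) as the metric"). A minimal sketch of an evaluation script that satisfies that contract, with made-up timings:

```python
# Hypothetical minimal evaluate.py; Weco reads the metric from stdout,
# so the output must contain the metric name followed by a numeric value.
baseline_ms = 3.0    # illustrative timing, not from the diff
optimized_ms = 2.0   # illustrative timing, not from the diff
print(f"speedup: {baseline_ms / optimized_ms}")  # prints "speedup: 1.5"
```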
--- weco-0.2.28/README.md
+++ weco-0.3.1/README.md
@@ -43,16 +43,17 @@ The `weco` CLI leverages a tree search approach guided by LLMs to iteratively ex
 1. **Install the Package:**
 
    ```bash
-   pip install weco>=0.2.18
+   pip install weco
    ```
 
-2. **Set Up LLM API Keys (Required):**
+2. **Authenticate (Required):**
 
-   `weco` requires API keys for the LLMs it uses internally. You **must** provide these keys via environment variables:
+   `weco` now uses a **credit-based billing system** with centralized LLM access. You need to authenticate to use the service:
 
-   - **OpenAI:** `export OPENAI_API_KEY="your_key_here"` (Create your OpenAI API key [here](https://platform.openai.com/api-keys))
-   - **Anthropic:** `export ANTHROPIC_API_KEY="your_key_here"` (Create your Anthropic API key [here](https://console.anthropic.com/settings/keys))
-   - **Google:** `export GEMINI_API_KEY="your_key_here"` (Google AI Studio has a free API usage quota. Create your Gemini API key [here](https://aistudio.google.com/apikey) to use `weco` for free.)
+   - **Run the CLI**: `weco` will prompt you to authenticate via your web browser
+   - **Free Credits**: New users receive **free credits** upon signup
+   - **Centralized Keys**: All LLM provider API keys are managed by Weco (no BYOK required)
+   - **Credit Top-ups**: Purchase additional credits through the dashboard at [dashboard.weco.ai](https://dashboard.weco.ai)
 
 ---
 
@@ -111,6 +112,8 @@ weco run --source optimize.py \
 
 For more advanced examples, including [Triton](/examples/triton/README.md), [CUDA kernel optimization](/examples/cuda/README.md), [ML model optimization](/examples/spaceship-titanic/README.md), and [prompt engineering for math problems](examples/prompt/README.md), please see the `README.md` files within the corresponding subdirectories under the [`examples/`](examples/) folder.
 
+> Note: We recommend removing any backticks from your code if any are present. We currently don't support backticks but will support this in the future.
+
 ---
 
 ### Arguments for `weco run`
@@ -131,8 +134,8 @@ For more advanced examples, including [Triton](/examples/triton/README.md), [CUD
 | Argument | Description | Default | Example |
 | :--- | :--- | :--- | :--- |
 | `-n, --steps` | Number of optimization steps (LLM iterations) to run. | 100 | `-n 50` |
-| `-M, --model` | Model identifier for the LLM to use (e.g., `o4-mini`, `claude-sonnet-4-0`). | `o4-mini` when `OPENAI_API_KEY` is set; `claude-sonnet-4-0` when `ANTHROPIC_API_KEY` is set; `gemini-2.5-pro` when `GEMINI_API_KEY` is set. | `-M o4-mini` |
-| `-i, --additional-instructions`| Natural language description of specific instructions **or** path to a file containing detailed instructions to guide the LLM. | `None` | `-i instructions.md` or `-i "Optimize the model for faster inference"`|
+| `-M, --model` | Model identifier for the LLM to use (e.g., `o4-mini`, `claude-sonnet-4-0`). | `o4-mini` | `-M o4-mini` |
+| `-i, --additional-instructions`| Natural language description of specific instructions **or** path to a file containing detailed instructions to guide the LLM. Supported file formats include `.txt`, `.md`, and `.rst`. | `None` | `-i instructions.md` or `-i "Optimize the model for faster inference"`|
 | `-l, --log-dir` | Path to the directory to log intermediate steps and final optimization result. | `.runs/` | `-l ./logs/` |
 | `--eval-timeout` | Timeout in seconds for each step in evaluation. | No timeout (unlimited) | `--eval-timeout 3600` |
 | `--save-logs` | Save execution output from each optimization step to disk. Creates timestamped directories with raw output files and a JSONL index for tracking execution history. | `False` | `--save-logs` |
@@ -141,24 +144,24 @@ For more advanced examples, including [Triton](/examples/triton/README.md), [CUD
 
 ### Authentication & Dashboard
 
-Weco offers both **anonymous** and **authenticated** usage:
-
-#### Anonymous Usage
-You can use Weco without creating an account by providing LLM API keys via environment variables. This is perfect for trying out Weco or for users who prefer not to create accounts.
+The CLI requires a Weco account for authentication and billing.
 
-#### Authenticated Usage (Recommended)
-To save your optimization runs and view them on the Weco dashboard, you can log in using Weco's secure device authentication flow:
+#### Credit-Based Authentication (Required)
+Weco now requires authentication for all operations. This enables our credit-based billing system and provides access to powerful optimizations:
 
-1. **During onboarding**: When you run `weco` for the first time, you'll be prompted to log in or skip
+1. **During onboarding**: When you run `weco` for the first time, you'll be prompted to log in
 2. **Manual login**: Use `weco logout` to clear credentials, then run `weco` again to re-authenticate
 3. **Device flow**: Weco will open your browser automatically and guide you through a secure OAuth-style authentication
 
 ![image (16)](https://github.com/user-attachments/assets/8a0a285b-4894-46fa-b6a2-4990017ca0c6)
 
-**Benefits of authenticated usage:**
-- **Run history**: View all your optimization runs on the Weco dashboard
-- **Progress tracking**: Monitor long-running optimizations remotely
-- **Enhanced support**: Get better assistance with your optimization challenges
+**Benefits:**
+- **No API Key Management**: All LLM provider keys are managed centrally
+- **Cost Transparency**: See exactly how many credits each optimization consumes
+- **Free Trial**: Free credits to get started with optimization projects
+- **Run History**: View all your optimization runs on the Weco dashboard
+- **Progress Tracking**: Monitor long-running optimizations remotely
+- **Budget Control**: Set spending limits and auto top-up preferences
 
 ---
 
@@ -171,6 +174,7 @@ To save your optimization runs and view them on the Weco dashboard, you can log
 | `weco` | Launch interactive onboarding | **Recommended for beginners** - Analyzes your codebase and guides you through setup |
 | `weco /path/to/project` | Launch onboarding for specific project | When working with a project in a different directory |
 | `weco run [options]` | Direct optimization execution | **For advanced users** - When you know exactly what to optimize and how |
+| `weco resume <run-id>` | Resume an interrupted run | Continue from the last completed step |
 | `weco logout` | Clear authentication credentials | To switch accounts or troubleshoot authentication issues |
 
 ### Model Selection
@@ -186,14 +190,37 @@ weco run --model claude-3.5-sonnet --source optimize.py [other options...]
 ```
 
 **Available models:**
-- `gpt-4o`, `o4-mini` (requires `OPENAI_API_KEY`)
-- `claude-3.5-sonnet`, `claude-sonnet-4-20250514` (requires `ANTHROPIC_API_KEY`)
-- `gemini-2.5-pro` (requires `GEMINI_API_KEY`)
+- `o4-mini`, `o3-mini`, `gpt-4o` (OpenAI models)
+- `claude-sonnet-4-0`, `claude-opus-4-0` (Anthropic models)
+- `gemini-2.5-pro`, `gemini-2.5-flash` (Google models)
 
-If no model is specified, Weco automatically selects the best available model based on your API keys.
+All models are available through Weco's centralized system. If no model is specified, Weco automatically selects the best model for your optimization task.
 
 ---
 
+### Resuming Interrupted Runs
+
+If your optimization run is interrupted (network issues, restart, etc.), resume from the most recent node:
+
+```bash
+# Resume an interrupted run
+weco resume 0002e071-1b67-411f-a514-36947f0c4b31
+```
+
+Arguments for `weco resume`:
+
+| Argument | Description | Example |
+|----------|-------------|---------|
+| `run-id` | The UUID of the run to resume (shown at the start of each run) | `0002e071-1b67-411f-a514-36947f0c4b31` |
+
+Notes:
+- Works only for interrupted runs (status: `error`, `terminated`, etc.).
+- You'll be prompted to confirm that your evaluation environment (source file + evaluation command) hasn't changed.
+- The source file is restored to the most recent solution before continuing.
+- All progress and metrics from the original run are preserved.
+- Log directory, save-logs behavior, and evaluation timeout are reused from the original run.
+
 ### Performance & Expectations
 
 Weco, powered by the AIDE algorithm, optimizes code iteratively based on your evaluation results. Achieving significant improvements, especially on complex research-level tasks, often requires substantial exploration time.
@@ -266,37 +293,7 @@ Weco will parse this output to extract the numerical value (1.5 in this case) as
 
 ## Supported Models
 
-Weco supports the following LLM models:
-
-### OpenAI Models
-- `gpt-5` (recommended)
-- `gpt-5-mini`
-- `gpt-5-nano`
-- `o3-pro` (recommended)
-- `o3` (recommended)
-- `o4-mini` (recommended)
-- `o3-mini`
-- `o1-pro`
-- `o1`
-- `gpt-4.1`
-- `gpt-4.1-mini`
-- `gpt-4.1-nano`
-- `gpt-4o`
-- `gpt-4o-mini`
-- `codex-mini-latest`
-
-### Anthropic Models
-- `claude-opus-4-1`
-- `claude-opus-4-0`
-- `claude-sonnet-4-0`
-- `claude-3-7-sonnet-latest`
-
-### Gemini Models
-- `gemini-2.5-pro`
-- `gemini-2.5-flash`
-- `gemini-2.5-flash-lite`
-
-You can specify any of these models using the `-M` or `--model` flag. Ensure you have the corresponding API key set as an environment variable for the model provider you wish to use.
+A list of models we support can be found in our documentation [here](https://docs.weco.ai/cli/supported-models).
 
 ---
 
--- weco-0.2.28/examples/README.md
+++ weco-0.3.1/examples/README.md
@@ -19,12 +19,8 @@ Explore runnable examples that show how to use Weco to optimize kernels, prompts
 
 - **Install the CLI**
 ```bash
-pip install weco>=0.2.18
+pip install weco
 ```
-- **Set an API key** for at least one provider:
-  - OpenAI: `export OPENAI_API_KEY="your_key_here"`
-  - Anthropic: `export ANTHROPIC_API_KEY="your_key_here"`
-  - Google: `export GEMINI_API_KEY="your_key_here"`
 
 ### Examples at a glance
 
@@ -32,7 +28,7 @@ pip install weco>=0.2.18
 | :-- | :-- | :-- | :-- |
 | 🧭 Hello Kernel World | Learn the Weco workflow on a small PyTorch model | `torch` | [README](hello-kernel-world/README.md) • [Colab](hello-kernel-world/colab_notebook_walkthrough.ipynb) |
 | ⚡ Triton Optimization | Speed up attention with Triton kernels | `torch`, `triton` | [README](triton/README.md) |
-| 🚀 CUDA Optimization | Generate low-level CUDA kernels for max speed | `torch`, `ninja`, NVIDIA GPU + CUDA Toolkit | [README](cuda/README.md) |
+| 🚀 CUDA Optimization | Generate low-level CUDA kernels for max speed | `torch`, `ninja`, `triton`, NVIDIA GPU + CUDA Toolkit | [README](cuda/README.md) |
 | 🧠 Prompt Engineering | Iteratively refine LLM prompts to improve accuracy | `openai`, `datasets` | [README](prompt/README.md) |
 | 🛰️ Spaceship Titanic | Improve a Kaggle model training pipeline | `pandas`, `numpy`, `scikit-learn`, `torch`, `xgboost`, `lightgbm`, `catboost` | [README](spaceship-titanic/README.md) |
 
@@ -63,23 +59,23 @@ weco run --source optimize.py \
 cd examples/triton
 weco run --source optimize.py \
   --eval-command "python evaluate.py --solution-path optimize.py" \
-  --metric speedup --goal maximize --steps 30 \
+  --metric speedup --goal maximize --steps 50 \
   --model o4-mini \
-  --additional-instructions "Use triton to optimize while keeping numerical diff small."
+  --additional-instructions "Use triton to optimize the code while ensuring a small max float diff. Maintain the same code format. Do not use any fallbacks. Assume any required dependencies are installed and data is already on the gpu."
 ```
 
 ### 🚀 CUDA Optimization
 
-- **Install extra deps**: `pip install torch ninja`
+- **Install extra deps**: `pip install torch ninja triton`
 - **Requires**: NVIDIA GPU and CUDA Toolkit
 - **Run**:
 ```bash
 cd examples/cuda
 weco run --source optimize.py \
   --eval-command "python evaluate.py --solution-path optimize.py" \
-  --metric speedup --goal maximize --steps 15 \
+  --metric speedup --goal maximize --steps 50 \
   --model o4-mini \
-  --additional-instructions guide.md
+  --additional-instructions "Write in-line CUDA using pytorch's load_inline() to optimize the code while ensuring a small max float diff. Maintain the same code format. Do not use any fallbacks. Assume any required dependencies are installed and data is already on the gpu."
 ```
 
 ### 🧠 Prompt Engineering
@@ -90,7 +86,7 @@ weco run --source optimize.py \
 cd examples/prompt
 weco run --source optimize.py \
   --eval-command "python eval.py" \
-  --metric score --goal maximize --steps 15 \
+  --metric score --goal maximize --steps 20 \
   --model o4-mini
 ```
 
--- weco-0.2.28/examples/cuda/README.md
+++ weco-0.3.1/examples/cuda/README.md
@@ -2,23 +2,12 @@
 
 This example showcases using Weco to optimize a PyTorch causal multi-head self-attention implementation by generating custom [CUDA](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html) kernels.
 This approach aims for low-level optimization beyond standard PyTorch or even Triton for potentially higher performance on NVIDIA GPUs.
-This example uses a separate Markdown file (`guide.md`) to provide detailed instructions and context to the LLM.
 
 ## Setup
 
-Install the CLI using `pip`:
+Install the CLI and dependencies for the example:
 ```bash
-pip install weco>=0.2.18
-```
-
-Create your OpenAI API key [here](https://platform.openai.com/api-keys), then run:
-```bash
-export OPENAI_API_KEY="your_key_here"
-```
-
-Install the required dependencies:
-```bash
-pip install torch ninja
+pip install weco torch ninja triton
 ```
 > **Note:** This example requires a compatible NVIDIA GPU and the CUDA Toolkit installed on your system for compiling and running the generated CUDA code.
 
@@ -30,9 +19,9 @@ weco run --source optimize.py \
   --eval-command "python evaluate.py --solution-path optimize.py" \
   --metric speedup \
   --goal maximize \
-  --steps 15 \
+  --steps 50 \
   --model o4-mini \
-  --additional-instructions guide.md
+  --additional-instructions "Write in-line CUDA using pytorch's load_inline() to optimize the code while ensuring a small max float diff. Maintain the same code format. Do not use any fallbacks. Assume any required dependencies are installed and data is already on the gpu."
 ```
 
 ### Explanation
@@ -41,11 +30,11 @@ weco run --source optimize.py \
 * `--eval-command "python evaluate.py --solution-path optimize.py"`: Runs the evaluation script, which compiles (if necessary) and benchmarks the CUDA-enhanced code in `optimize.py` against a baseline, printing the `speedup`.
 * `--metric speedup`: The optimization target metric.
 * `--goal maximize`: Weco aims to increase the speedup.
-* `--steps 15`: The number of optimization iterations.
+* `--steps 50`: The number of optimization iterations.
 * `--model o4-mini`: The LLM used for code generation.
-* `--additional-instructions guide.md`: Provides guidance to the LLM on the optimization approach.
+* `--additional-instructions "..."`: Provides guidance to the LLM on the optimization approach.
 
-Weco will iteratively modify `optimize.py`, generating and integrating CUDA C++ code, guided by the evaluation results and the instructions in `guide.md`.
+Weco will iteratively modify `optimize.py`, generating and integrating CUDA C++ code, guided by the evaluation results and the additional instructions provided.
 
 ## Next Steps
 
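The new instructions steer the model toward PyTorch's inline-extension workflow. For orientation, a hedged sketch of the `load_inline()` pattern with a deliberately trivial kernel (not the attention kernel this example optimizes; requires an NVIDIA GPU, the CUDA Toolkit, and `ninja`):

```python
import torch
from torch.utils.cpp_extension import load_inline

# Illustrative elementwise-scale kernel, compiled at import time
cuda_src = r"""
#include <torch/extension.h>

__global__ void scale_kernel(const float* x, float* y, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = x[i] * s;
}

torch::Tensor scale(torch::Tensor x, double s) {
    auto y = torch::empty_like(x);
    int n = x.numel();
    scale_kernel<<<(n + 255) / 256, 256>>>(
        x.data_ptr<float>(), y.data_ptr<float>(), (float)s, n);
    return y;
}
"""

# cpp_sources declares the function; functions=[...] generates the bindings
mod = load_inline(
    name="scale_ext",
    cpp_sources="torch::Tensor scale(torch::Tensor x, double s);",
    cuda_sources=cuda_src,
    functions=["scale"],
)

x = torch.randn(1024, device="cuda")
print(torch.allclose(mod.scale(x, 2.0), x * 2.0))  # expect True
```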
--- weco-0.2.28/examples/cuda/evaluate.py
+++ weco-0.3.1/examples/cuda/evaluate.py
@@ -8,6 +8,7 @@ import torch
 import torch.nn as nn
 import torch.nn.functional as F
 import math
+from triton.testing import do_bench
 
 
 ########################################################
@@ -78,29 +79,6 @@ def get_inputs(batch_size, seq_len, n_embd, device):
     return torch.randn(batch_size, seq_len, n_embd, device=device, dtype=torch.float32)
 
 
-@torch.no_grad()
-def bench(f, inputs, n_warmup, n_rep):
-    start_event = torch.cuda.Event(enable_timing=True)
-    end_event = torch.cuda.Event(enable_timing=True)
-
-    # warmup
-    for _ in range(n_warmup):
-        f(inputs)  # noqa
-    torch.cuda.synchronize()
-
-    # benchmark
-    t_avg_ms = 0.0
-    for _ in range(n_rep):
-        # time the forward pass
-        start_event.record()
-        f(inputs)
-        end_event.record()
-        # wait for all computations to complete
-        torch.cuda.synchronize()
-        t_avg_ms += start_event.elapsed_time(end_event)
-    return t_avg_ms / n_rep
-
-
 if __name__ == "__main__":
     import argparse
 
@@ -111,8 +89,8 @@ if __name__ == "__main__":
     # benchmarking parameters
     n_correctness_trials = 10
     correctness_tolerance = 1e-5
-    n_warmup = 1000
-    n_rep = 5000
+    warmup_ms = 1e3
+    rep_ms = 5 * 1e3
 
     # init parameters
     max_seqlen = 512
@@ -148,8 +126,12 @@ if __name__ == "__main__":
     for _ in range(n_correctness_trials):
         inputs = get_inputs(batch_size=batch_size, seq_len=seq_len, n_embd=n_embd, device="cuda")
         with torch.no_grad():
-            baseline_output = baseline_model(inputs)
             optimized_output = solution_model(inputs)
+            if torch.isnan(optimized_output).any():
+                print("Incorrect solution: NaN detected in optimized model output")
+            if torch.isinf(optimized_output).any():
+                print("Incorrect solution: Inf detected in optimized model output")
+            baseline_output = baseline_model(inputs)
         max_diff_avg += torch.max(torch.abs(optimized_output - baseline_output))
     max_diff_avg /= n_correctness_trials
     print(f"max float diff between values of baseline and optimized model: {max_diff_avg}")
@@ -158,8 +140,8 @@ if __name__ == "__main__":
 
     # measure performance
     inputs = get_inputs(batch_size=batch_size, seq_len=seq_len, n_embd=n_embd, device="cuda")
-    t_avg_baseline = bench(baseline_model, inputs, n_warmup, n_rep)
+    t_avg_baseline = do_bench(lambda: baseline_model(inputs), warmup=warmup_ms, rep=rep_ms)
     print(f"baseline time: {t_avg_baseline:.2f}ms")
-    t_avg_optimized = bench(solution_model, inputs, n_warmup, n_rep)
+    t_avg_optimized = do_bench(lambda: solution_model(inputs), warmup=warmup_ms, rep=rep_ms)
     print(f"optimized time: {t_avg_optimized:.2f}ms")
     print(f"speedup: {t_avg_baseline / t_avg_optimized:.2f}x")
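Note the unit change in this file: the removed `bench()` helper took iteration counts, while `triton.testing.do_bench` takes warmup/rep as time budgets in milliseconds. A hedged standalone sketch of the new timing path (model and shapes are illustrative):

```python
import torch
from triton.testing import do_bench

model = torch.nn.Linear(512, 512).cuda()      # illustrative stand-in model
inputs = torch.randn(64, 512, device="cuda")  # illustrative input batch

with torch.no_grad():
    # do_bench records CUDA events and synchronizes internally, returning
    # an average runtime in milliseconds over the given time budgets
    t_ms = do_bench(lambda: model(inputs), warmup=1e3, rep=5e3)
print(f"avg forward time: {t_ms:.2f}ms")
```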
--- weco-0.2.28/examples/hello-kernel-world/README.md
+++ weco-0.3.1/examples/hello-kernel-world/README.md
@@ -4,19 +4,9 @@ This example demonstrates the basics of using Weco to optimize a simple PyTorch
 
 ## Setup
 
-Install the CLI using `pip`:
+Install the CLI and dependencies for the example:
 ```bash
-pip install weco>=0.2.18
-```
-
-Create your API key from one of the supported providers:
-- **OpenAI:** Create your API key [here](https://platform.openai.com/api-keys), then run: `export OPENAI_API_KEY="your_key_here"`
-- **Anthropic:** Create your API key [here](https://console.anthropic.com/settings/keys), then run: `export ANTHROPIC_API_KEY="your_key_here"`
-- **Google:** Create your API key [here](https://aistudio.google.com/apikey), then run: `export GEMINI_API_KEY="your_key_here"`
-
-Install the required dependencies:
-```bash
-pip install torch
+pip install weco torch
 ```
 
 ## Run Weco