weco 0.2.17.tar.gz → 0.2.19.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {weco-0.2.17 → weco-0.2.19}/PKG-INFO +30 -51
- {weco-0.2.17 → weco-0.2.19}/README.md +28 -50
- {weco-0.2.17 → weco-0.2.19}/examples/cuda/README.md +2 -2
- weco-0.2.19/examples/prompt/README.md +51 -0
- {weco-0.2.17 → weco-0.2.19}/examples/spaceship-titanic/README.md +9 -34
- weco-0.2.19/examples/spaceship-titanic/data/sample_submission.csv +4278 -0
- weco-0.2.19/examples/spaceship-titanic/data/test.csv +4278 -0
- weco-0.2.19/examples/spaceship-titanic/data/train.csv +8694 -0
- {weco-0.2.17 → weco-0.2.19}/examples/spaceship-titanic/requirements-test.txt +1 -2
- {weco-0.2.17 → weco-0.2.19}/examples/triton/README.md +2 -2
- {weco-0.2.17 → weco-0.2.19}/pyproject.toml +5 -7
- {weco-0.2.17 → weco-0.2.19}/weco/__init__.py +2 -1
- weco-0.2.19/weco/api.py +162 -0
- {weco-0.2.17 → weco-0.2.19}/weco/cli.py +211 -23
- {weco-0.2.17 → weco-0.2.19}/weco/panels.py +1 -1
- {weco-0.2.17 → weco-0.2.19}/weco/utils.py +32 -0
- {weco-0.2.17 → weco-0.2.19}/weco.egg-info/PKG-INFO +30 -51
- {weco-0.2.17 → weco-0.2.19}/weco.egg-info/SOURCES.txt +3 -2
- {weco-0.2.17 → weco-0.2.19}/weco.egg-info/requires.txt +1 -0
- weco-0.2.17/examples/prompt/README.md +0 -99
- weco-0.2.17/examples/spaceship-titanic/get_data.py +0 -16
- weco-0.2.17/examples/spaceship-titanic/submit.py +0 -14
- weco-0.2.17/weco/api.py +0 -86
- {weco-0.2.17 → weco-0.2.19}/.github/workflows/lint.yml +0 -0
- {weco-0.2.17 → weco-0.2.19}/.github/workflows/release.yml +0 -0
- {weco-0.2.17 → weco-0.2.19}/.gitignore +0 -0
- {weco-0.2.17 → weco-0.2.19}/.repomixignore +0 -0
- {weco-0.2.17 → weco-0.2.19}/LICENSE +0 -0
- {weco-0.2.17 → weco-0.2.19}/assets/example-optimization.gif +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/cuda/evaluate.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/cuda/guide.md +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/cuda/optimize.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/hello-kernel-world/evaluate.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/hello-kernel-world/optimize.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/prompt/eval.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/prompt/optimize.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/prompt/prompt_guide.md +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/spaceship-titanic/competition_description.md +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/spaceship-titanic/evaluate.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/triton/evaluate.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/examples/triton/optimize.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/setup.cfg +0 -0
- {weco-0.2.17 → weco-0.2.19}/weco/auth.py +0 -0
- {weco-0.2.17 → weco-0.2.19}/weco.egg-info/dependency_links.txt +0 -0
- {weco-0.2.17 → weco-0.2.19}/weco.egg-info/entry_points.txt +0 -0
- {weco-0.2.17 → weco-0.2.19}/weco.egg-info/top_level.txt +0 -0
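One notable change in this release: the metadata gains `Requires-Dist: packaging`, alongside 32 new lines in `weco/utils.py`. That combination is consistent with a PEP 440-aware version or update check, which plain string comparison cannot do correctly. A hedged sketch of why a CLI would need `packaging` for this — the function below is illustrative, not weco's actual code:

```python
# Version strings need PEP 440 parsing, because naive string
# comparison orders them incorrectly.
from packaging.version import Version

def is_outdated(installed: str, latest: str) -> bool:
    """Return True when `latest` is a strictly newer release than `installed`."""
    return Version(installed) < Version(latest)

# Lexically "0.2.9" > "0.2.19", even though 0.2.9 is the older release;
# PEP 440 comparison gets patch releases right.
print(is_outdated("0.2.17", "0.2.19"))  # True
print("0.2.9" < "0.2.19")               # False (the lexical trap)
print(is_outdated("0.2.9", "0.2.19"))   # True
```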
--- weco-0.2.17/PKG-INFO
+++ weco-0.2.19/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: weco
-Version: 0.2.17
+Version: 0.2.19
 Summary: Documentation for `weco`, a CLI for using Weco AI's code optimizer.
 Author-email: Weco AI Team <contact@weco.ai>
 License: MIT
@@ -14,6 +14,7 @@ Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: requests
 Requires-Dist: rich
+Requires-Dist: packaging
 Provides-Extra: dev
 Requires-Dist: ruff; extra == "dev"
 Requires-Dist: build; extra == "dev"
@@ -22,15 +23,13 @@ Dynamic: license-file
 
 <div align="center">
 
-# Weco: The
+# Weco: The Platform for Self-Improving Code
 
 [](https://www.python.org)
 [](https://docs.weco.ai/)
 [](https://badge.fury.io/py/weco)
 [](https://arxiv.org/abs/2502.13138)
 
-<code>pip install weco</code>
-
 </div>
 
 ---
@@ -39,9 +38,9 @@ Weco systematically optimizes your code, guided directly by your evaluation metr
 
 Example applications include:
 
-- **GPU Kernel Optimization**: Reimplement PyTorch functions using CUDA or Triton optimizing for `latency`, `throughput`, or `memory_bandwidth`.
-- **Model Development**: Tune feature transformations or
-- **Prompt Engineering**: Refine prompts for LLMs, optimizing for `win_rate`, `relevance`, or `format_adherence`
+- **GPU Kernel Optimization**: Reimplement PyTorch functions using [CUDA](/examples/cuda/README.md) or [Triton](/examples/triton/README.md), optimizing for `latency`, `throughput`, or `memory_bandwidth`.
+- **Model Development**: Tune feature transformations, architectures or [the whole training pipeline](/examples/spaceship-titanic/README.md), optimizing for `validation_accuracy`, `AUC`, or `Sharpe Ratio`.
+- **Prompt Engineering**: Refine prompts for LLMs (e.g., for [math problems](/examples/prompt/README.md)), optimizing for `win_rate`, `relevance`, or `format_adherence`
 
 
 
@@ -71,29 +70,9 @@ The `weco` CLI leverages a tree search approach guided by Large Language Models
 - **Anthropic:** `export ANTHROPIC_API_KEY="your_key_here"`
 - **Google DeepMind:** `export GEMINI_API_KEY="your_key_here"` (Google AI Studio has a free API usage quota. Create a key [here](https://aistudio.google.com/apikey) to use `weco` for free.)
 
-The optimization process will fail if the necessary keys for the chosen model are not found in your environment.
-
-3. **Log In to Weco (Optional):**
-
-   To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow:
-
-   - When you first run `weco run`, you'll be prompted if you want to log in or proceed anonymously.
-   - If you choose to log in (by pressing `l`), you'll be shown a URL and `weco` will attempt to open it in your default web browser.
-   - You then authenticate in the browser. Once authenticated, the CLI will detect this and complete the login.
-   - This saves a Weco-specific API key locally (typically at `~/.config/weco/credentials.json`).
-
-   If you choose to skip login (by pressing Enter or `s`), `weco` will still function using the environment variable LLM keys, but the run history will not be linked to a Weco account.
-
-   To log out and remove your saved Weco API key, use the `weco logout` command.
-
 ---
 
-##
-
-The CLI has two main commands:
-
-- `weco run`: Initiates the code optimization process.
-- `weco logout`: Logs you out of your Weco account.
+## Get Started
 
 <div style="background-color: #fff3cd; border: 1px solid #ffeeba; padding: 15px; border-radius: 4px; margin-bottom: 15px;">
 <strong>⚠️ Warning: Code Modification</strong><br>
@@ -102,10 +81,6 @@ The CLI has two main commands:
 
 ---
 
-### `weco run` Command
-
-This command starts the optimization process.
-
 **Example: Optimizing Simple PyTorch Operations**
 
 This basic example shows how to optimize a simple PyTorch function for speedup.
@@ -123,9 +98,8 @@ pip install torch
 weco run --source optimize.py \
      --eval-command "python evaluate.py --solution-path optimize.py --device cpu" \
      --metric speedup \
-     --maximize \
+     --goal maximize \
      --steps 15 \
-     --model gemini-2.5-pro-exp-03-25 \
      --additional-instructions "Fuse operations in the forward method while ensuring the max float deviation remains small. Maintain the same format of the code."
 ```
@@ -133,28 +107,33 @@ weco run --source optimize.py \
 
 ---
 
-
+### Arguments for `weco run`
 
-| Argument | Description | Required |
-| :--- | :--- | :--- |
-| `--source` | Path to the source code file that will be optimized (e.g., `optimize.py`). | Yes |
-| `--eval-command` | Command to run for evaluating the code in `--source`. This command should print the target `--metric` and its value to the terminal (stdout/stderr). See note below. | Yes |
-| `--metric` | The name of the metric you want to optimize (e.g., 'accuracy', 'speedup', 'loss'). This metric name should match what's printed by your `--eval-command`. | Yes |
-| `--maximize` | Whether to maximize (`true`) or minimize (`false`) the metric. | Yes |
-| `--steps` | Number of optimization steps (LLM iterations) to run. | Yes |
-| `--model` | Model identifier for the LLM to use (e.g., `gpt-4o`, `claude-3.5-sonnet`). Recommended models to try include `o3-mini`, `claude-3-haiku`, and `gemini-2.5-pro-exp-03-25`. | Yes |
-| `--additional-instructions` | (Optional) Natural language description of specific instructions OR path to a file containing detailed instructions to guide the LLM. | No |
-| `--log-dir` | (Optional) Path to the directory to log intermediate steps and final optimization result. Defaults to `.runs/`. | No |
+**Required:**
 
-
+| Argument | Description |
+| :--- | :--- |
+| `-s, --source` | Path to the source code file that will be optimized (e.g., `optimize.py`). |
+| `-c, --eval-command` | Command to run for evaluating the code in `--source`. This command should print the target `--metric` and its value to the terminal (stdout/stderr). See note below. |
+| `-m, --metric` | The name of the metric you want to optimize (e.g., 'accuracy', 'speedup', 'loss'). This metric name should match what's printed by your `--eval-command`. |
+| `-g, --goal` | `maximize`/`max` to maximize the `--metric` or `minimize`/`min` to minimize it. |
 
-
+<br>
 
-
+**Optional:**
 
-
-
-
+| Argument | Description | Default |
+| :--- | :--- | :--- |
+| `-n, --steps` | Number of optimization steps (LLM iterations) to run. | 100 |
+| `-M, --model` | Model identifier for the LLM to use (e.g., `gpt-4o`, `claude-3.5-sonnet`). | `o4-mini` when `OPENAI_API_KEY` is set; `claude-3-7-sonnet-20250219` when `ANTHROPIC_API_KEY` is set; `gemini-2.5-pro-exp-03-25` when `GEMINI_API_KEY` is set (priority: `OPENAI_API_KEY` > `ANTHROPIC_API_KEY` > `GEMINI_API_KEY`). |
+| `-i, --additional-instructions` | Natural language description of specific instructions **or** path to a file containing detailed instructions to guide the LLM. | `None` |
+| `-l, --log-dir` | Path to the directory to log intermediate steps and final optimization result. | `.runs/` |
+
+---
+
+### Weco Dashboard
+
+To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow
+
 
 ---
 
--- weco-0.2.17/README.md
+++ weco-0.2.19/README.md
@@ -1,14 +1,12 @@
 <div align="center">
 
-# Weco: The
+# Weco: The Platform for Self-Improving Code
 
 [](https://www.python.org)
 [](https://docs.weco.ai/)
 [](https://badge.fury.io/py/weco)
 [](https://arxiv.org/abs/2502.13138)
 
-<code>pip install weco</code>
-
 </div>
 
 ---
@@ -17,9 +15,9 @@ Weco systematically optimizes your code, guided directly by your evaluation metr
 
 Example applications include:
 
-- **GPU Kernel Optimization**: Reimplement PyTorch functions using CUDA or Triton optimizing for `latency`, `throughput`, or `memory_bandwidth`.
-- **Model Development**: Tune feature transformations or
-- **Prompt Engineering**: Refine prompts for LLMs, optimizing for `win_rate`, `relevance`, or `format_adherence`
+- **GPU Kernel Optimization**: Reimplement PyTorch functions using [CUDA](/examples/cuda/README.md) or [Triton](/examples/triton/README.md), optimizing for `latency`, `throughput`, or `memory_bandwidth`.
+- **Model Development**: Tune feature transformations, architectures or [the whole training pipeline](/examples/spaceship-titanic/README.md), optimizing for `validation_accuracy`, `AUC`, or `Sharpe Ratio`.
+- **Prompt Engineering**: Refine prompts for LLMs (e.g., for [math problems](/examples/prompt/README.md)), optimizing for `win_rate`, `relevance`, or `format_adherence`
 
 
 
@@ -49,29 +47,9 @@ The `weco` CLI leverages a tree search approach guided by Large Language Models
 - **Anthropic:** `export ANTHROPIC_API_KEY="your_key_here"`
 - **Google DeepMind:** `export GEMINI_API_KEY="your_key_here"` (Google AI Studio has a free API usage quota. Create a key [here](https://aistudio.google.com/apikey) to use `weco` for free.)
 
-The optimization process will fail if the necessary keys for the chosen model are not found in your environment.
-
-3. **Log In to Weco (Optional):**
-
-   To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow:
-
-   - When you first run `weco run`, you'll be prompted if you want to log in or proceed anonymously.
-   - If you choose to log in (by pressing `l`), you'll be shown a URL and `weco` will attempt to open it in your default web browser.
-   - You then authenticate in the browser. Once authenticated, the CLI will detect this and complete the login.
-   - This saves a Weco-specific API key locally (typically at `~/.config/weco/credentials.json`).
-
-   If you choose to skip login (by pressing Enter or `s`), `weco` will still function using the environment variable LLM keys, but the run history will not be linked to a Weco account.
-
-   To log out and remove your saved Weco API key, use the `weco logout` command.
-
 ---
 
-##
-
-The CLI has two main commands:
-
-- `weco run`: Initiates the code optimization process.
-- `weco logout`: Logs you out of your Weco account.
+## Get Started
 
 <div style="background-color: #fff3cd; border: 1px solid #ffeeba; padding: 15px; border-radius: 4px; margin-bottom: 15px;">
 <strong>⚠️ Warning: Code Modification</strong><br>
@@ -80,10 +58,6 @@ The CLI has two main commands:
 
 ---
 
-### `weco run` Command
-
-This command starts the optimization process.
-
 **Example: Optimizing Simple PyTorch Operations**
 
 This basic example shows how to optimize a simple PyTorch function for speedup.
@@ -101,9 +75,8 @@ pip install torch
 weco run --source optimize.py \
      --eval-command "python evaluate.py --solution-path optimize.py --device cpu" \
      --metric speedup \
-     --maximize \
+     --goal maximize \
      --steps 15 \
-     --model gemini-2.5-pro-exp-03-25 \
      --additional-instructions "Fuse operations in the forward method while ensuring the max float deviation remains small. Maintain the same format of the code."
 ```
@@ -111,28 +84,33 @@ weco run --source optimize.py \
 
 ---
 
-
+### Arguments for `weco run`
 
-| Argument | Description | Required |
-| :--- | :--- | :--- |
-| `--source` | Path to the source code file that will be optimized (e.g., `optimize.py`). | Yes |
-| `--eval-command` | Command to run for evaluating the code in `--source`. This command should print the target `--metric` and its value to the terminal (stdout/stderr). See note below. | Yes |
-| `--metric` | The name of the metric you want to optimize (e.g., 'accuracy', 'speedup', 'loss'). This metric name should match what's printed by your `--eval-command`. | Yes |
-| `--maximize` | Whether to maximize (`true`) or minimize (`false`) the metric. | Yes |
-| `--steps` | Number of optimization steps (LLM iterations) to run. | Yes |
-| `--model` | Model identifier for the LLM to use (e.g., `gpt-4o`, `claude-3.5-sonnet`). Recommended models to try include `o3-mini`, `claude-3-haiku`, and `gemini-2.5-pro-exp-03-25`. | Yes |
-| `--additional-instructions` | (Optional) Natural language description of specific instructions OR path to a file containing detailed instructions to guide the LLM. | No |
-| `--log-dir` | (Optional) Path to the directory to log intermediate steps and final optimization result. Defaults to `.runs/`. | No |
+**Required:**
 
-
+| Argument | Description |
+| :--- | :--- |
+| `-s, --source` | Path to the source code file that will be optimized (e.g., `optimize.py`). |
+| `-c, --eval-command` | Command to run for evaluating the code in `--source`. This command should print the target `--metric` and its value to the terminal (stdout/stderr). See note below. |
+| `-m, --metric` | The name of the metric you want to optimize (e.g., 'accuracy', 'speedup', 'loss'). This metric name should match what's printed by your `--eval-command`. |
+| `-g, --goal` | `maximize`/`max` to maximize the `--metric` or `minimize`/`min` to minimize it. |
 
-
+<br>
 
-
+**Optional:**
 
-
-
-
+| Argument | Description | Default |
+| :--- | :--- | :--- |
+| `-n, --steps` | Number of optimization steps (LLM iterations) to run. | 100 |
+| `-M, --model` | Model identifier for the LLM to use (e.g., `gpt-4o`, `claude-3.5-sonnet`). | `o4-mini` when `OPENAI_API_KEY` is set; `claude-3-7-sonnet-20250219` when `ANTHROPIC_API_KEY` is set; `gemini-2.5-pro-exp-03-25` when `GEMINI_API_KEY` is set (priority: `OPENAI_API_KEY` > `ANTHROPIC_API_KEY` > `GEMINI_API_KEY`). |
+| `-i, --additional-instructions` | Natural language description of specific instructions **or** path to a file containing detailed instructions to guide the LLM. | `None` |
+| `-l, --log-dir` | Path to the directory to log intermediate steps and final optimization result. | `.runs/` |
+
+---
+
+### Weco Dashboard
+
+To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow
+
 
 ---
 
--- weco-0.2.17/examples/cuda/README.md
+++ weco-0.2.19/examples/cuda/README.md
@@ -21,7 +21,7 @@ Run the following command to start the optimization process:
 weco run --source optimize.py \
      --eval-command "python evaluate.py --solution-path optimize.py" \
      --metric speedup \
-     --maximize \
+     --goal maximize \
      --steps 30 \
      --model gemini-2.5-pro-exp-03-25 \
      --additional-instructions guide.md
@@ -32,7 +32,7 @@ weco run --source optimize.py \
 * `--source optimize.py`: The initial PyTorch self-attention code to be optimized with CUDA.
 * `--eval-command "python evaluate.py --solution-path optimize.py"`: Runs the evaluation script, which compiles (if necessary) and benchmarks the CUDA-enhanced code in `optimize.py` against a baseline, printing the `speedup`.
 * `--metric speedup`: The optimization target metric.
-* `--maximize
+* `--goal maximize`: Weco aims to increase the speedup.
 * `--steps 30`: The number of optimization iterations.
 * `--model gemini-2.5-pro-exp-03-25`: The LLM used for code generation.
 * `--additional-instructions guide.md`: Points Weco to a file containing detailed instructions for the LLM on how to write the CUDA kernels, handle compilation (e.g., using `torch.utils.cpp_extension`), manage data types, and ensure correctness.
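The cuda example's `evaluate.py` benchmarks the candidate against a baseline and prints the `speedup` weco reads. Stripped of the torch/CUDA specifics (real kernel timing would use CUDA events, and the two functions here are stand-ins of our own), the measurement pattern is roughly:

```python
import timeit

def baseline(n: int) -> int:
    # reference implementation: stand-in for the unoptimized PyTorch path
    return sum(i * i for i in range(n))

def optimized(n: int) -> int:
    # closed-form sum of squares: stand-in for the fused kernel
    return n * (n - 1) * (2 * n - 1) // 6

n = 50_000
assert baseline(n) == optimized(n)  # verify correctness before timing
t_base = timeit.timeit(lambda: baseline(n), number=20)
t_opt = timeit.timeit(lambda: optimized(n), number=20)
print(f"speedup: {t_base / t_opt:.2f}")  # the line weco parses for --metric speedup
```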
--- /dev/null
+++ weco-0.2.19/examples/prompt/README.md
@@ -0,0 +1,51 @@
+# AIME Prompt Engineering Example with Weco
+
+This example shows how **Weco** can iteratively improve a prompt for solving American Invitational Mathematics Examination (AIME) problems. The experiment runs locally, requires only two short Python files, and aims to improve the accuracy metric.
+
+This example uses `gpt-4o-mini` via the OpenAI API by default. Ensure your `OPENAI_API_KEY` environment variable is set.
+
+## Files in this folder
+
+| File | Purpose |
+| :--- | :--- |
+| `optimize.py` | Holds the prompt template (instructing the LLM to reason step-by-step and use `\\boxed{}` for the final answer) and the mutable `EXTRA_INSTRUCTIONS` string. Weco edits **only** this file during the search. |
+| `eval.py` | Downloads a small slice of the 2024 AIME dataset, calls `optimize.solve` in parallel, parses the LLM output (looking for `\\boxed{}`), compares it to the ground truth, prints progress logs, and finally prints an `accuracy:` line that Weco reads. |
+
+
+## Quick start
+
+1. **Clone the repository and enter the folder.**
+   ```bash
+   git clone https://github.com/your-fork/weco-examples.git
+   cd weco-examples/aime-2024
+   ```
+2. **Run Weco.** The command below edits `EXTRA_INSTRUCTIONS` in `optimize.py`, invokes `eval.py` on every iteration, reads the printed accuracy, and keeps the best variants.
+   ```bash
+   weco --source optimize.py \
+        --eval-command "python eval.py" \
+        --metric accuracy \
+        --goal maximize \
+        --steps 40 \
+        --model gemini-2.5-flash-preview-04-17 \
+        --additional-instructions prompt_guide.md
+   ```
+
+During each evaluation round you will see log lines similar to the following.
+
+```text
+[setup] loading 20 problems from AIME 2024 …
+[progress] 5/20 completed, elapsed 7.3 s
+[progress] 10/20 completed, elapsed 14.6 s
+[progress] 15/20 completed, elapsed 21.8 s
+[progress] 20/20 completed, elapsed 28.9 s
+accuracy: 0.0500
+```
+
+Weco then mutates the config, tries again, and gradually pushes the accuracy higher. On a modern laptop you can usually double the baseline score within thirty to forty iterations.
+
+## How it works
+
+* `eval_aime.py` slices the **Maxwell-Jia/AIME_2024** dataset to twenty problems for fast feedback. You can change the slice in one line.
+* The script sends model calls in parallel via `ThreadPoolExecutor`, so network latency is hidden.
+* Every five completed items, the script logs progress and elapsed time.
+* The final line `accuracy: value` is the only part Weco needs for guidance.
--- weco-0.2.17/examples/spaceship-titanic/README.md
+++ weco-0.2.19/examples/spaceship-titanic/README.md
@@ -1,33 +1,16 @@
-# Example:
+# Example: Solving a Kaggle Competition (Spaceship Titanic)
 
 This example demonstrates using Weco to optimize a Python script designed for the [Spaceship Titanic Kaggle competition](https://www.kaggle.com/competitions/spaceship-titanic/overview). The goal is to improve the model's `accuracy` metric by directly optimizing the evaluate.py
 
 ## Setup
 
 1. Ensure you are in the `examples/spaceship-titanic` directory.
-2.
-3.
+2. `pip install weco`
+3. Set up LLM API Key, `export OPENAI_API_KEY="your_key_here"`
+4. **Install Dependencies:** Install the required Python packages:
 ```bash
 pip install -r requirements-test.txt
 ```
-4. **Prepare Data:** Run the utility script once to download the dataset from Kaggle and place it in the expected `./data/` subdirectories:
-   ```bash
-   python get_data.py
-   ```
-   After running `get_data.py`, your directory structure should look like this:
-   ```
-   .
-   ├── competition_description.md
-   ├── data
-   │   ├── sample_submission.csv
-   │   ├── test.csv
-   │   └── train.csv
-   ├── evaluate.py
-   ├── get_data.py
-   ├── README.md # This file
-   ├── requirements-test.txt
-   └── submit.py
-   ```
 
 ## Optimization Command
 
@@ -37,21 +20,13 @@ Run the following command to start optimizing the model:
 weco run --source evaluate.py \
      --eval-command "python evaluate.py --data-dir ./data" \
      --metric accuracy \
-     --maximize \
-     --steps
-     --model
+     --goal maximize \
+     --steps 20 \
+     --model o4-mini \
      --additional-instructions "Improve feature engineering, model choice and hyper-parameters."
      --log-dir .runs/spaceship-titanic
 ```
 
-## Submit the solution
-
-Once the optimization finished, you can submit your predictions to kaggle to see the results. Make sure `submission.csv` is present and then simply run the following command.
-
-```bash
-python submit.py
-```
-
 ### Explanation
 
 * `--source evaluate.py`: The script provides a baseline as root node and directly optimize the evaluate.py
@@ -59,9 +34,9 @@ python submit.py
 * [optional] `--data-dir`: path to the train and test data.
 * [optional] `--seed`: Seed for reproduce the experiment.
 * `--metric accuracy`: The target metric Weco should optimize.
-* `--maximize
+* `--goal maximize`: Weco aims to increase the accuracy.
 * `--steps 10`: The number of optimization iterations.
 * `--model gemini-2.5-pro-exp-03-25`: The LLM driving the optimization.
 * `--additional-instructions "Improve feature engineering, model choice and hyper-parameters."`: A simple instruction for model improvement or you can put the path to [`comptition_description.md`](./competition_description.md) within the repo to feed the agent more detailed information.
 
-Weco will iteratively modify the feature engineering or modeling code within `evaluate.py`, run the evaluation pipeline, and use the resulting `accuracy` to guide further improvements.
\ No newline at end of file
+Weco will iteratively modify the feature engineering or modeling code within `evaluate.py`, run the evaluation pipeline, and use the resulting `accuracy` to guide further improvements.