weco 0.2.16__tar.gz → 0.2.18__tar.gz
- {weco-0.2.16 → weco-0.2.18}/.github/workflows/release.yml +0 -2
- {weco-0.2.16 → weco-0.2.18}/PKG-INFO +19 -40
- {weco-0.2.16 → weco-0.2.18}/README.md +17 -39
- weco-0.2.18/examples/prompt/README.md +51 -0
- {weco-0.2.16 → weco-0.2.18}/examples/spaceship-titanic/README.md +7 -32
- weco-0.2.18/examples/spaceship-titanic/data/sample_submission.csv +4278 -0
- weco-0.2.18/examples/spaceship-titanic/data/test.csv +4278 -0
- weco-0.2.18/examples/spaceship-titanic/data/train.csv +8694 -0
- {weco-0.2.16 → weco-0.2.18}/examples/spaceship-titanic/requirements-test.txt +1 -2
- {weco-0.2.16 → weco-0.2.18}/pyproject.toml +5 -7
- {weco-0.2.16 → weco-0.2.18}/weco/__init__.py +2 -1
- weco-0.2.18/weco/api.py +162 -0
- {weco-0.2.16 → weco-0.2.18}/weco/cli.py +165 -23
- {weco-0.2.16 → weco-0.2.18}/weco/panels.py +1 -1
- {weco-0.2.16 → weco-0.2.18}/weco/utils.py +32 -0
- {weco-0.2.16 → weco-0.2.18}/weco.egg-info/PKG-INFO +19 -40
- {weco-0.2.16 → weco-0.2.18}/weco.egg-info/SOURCES.txt +3 -2
- {weco-0.2.16 → weco-0.2.18}/weco.egg-info/requires.txt +1 -0
- weco-0.2.16/examples/prompt/README.md +0 -100
- weco-0.2.16/examples/spaceship-titanic/get_data.py +0 -16
- weco-0.2.16/examples/spaceship-titanic/submit.py +0 -14
- weco-0.2.16/weco/api.py +0 -86
- {weco-0.2.16 → weco-0.2.18}/.github/workflows/lint.yml +0 -0
- {weco-0.2.16 → weco-0.2.18}/.gitignore +0 -0
- {weco-0.2.16 → weco-0.2.18}/.repomixignore +0 -0
- {weco-0.2.16 → weco-0.2.18}/LICENSE +0 -0
- {weco-0.2.16 → weco-0.2.18}/assets/example-optimization.gif +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/cuda/README.md +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/cuda/evaluate.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/cuda/guide.md +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/cuda/optimize.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/hello-kernel-world/evaluate.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/hello-kernel-world/optimize.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/prompt/eval.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/prompt/optimize.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/prompt/prompt_guide.md +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/spaceship-titanic/competition_description.md +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/spaceship-titanic/evaluate.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/triton/README.md +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/triton/evaluate.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/examples/triton/optimize.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/setup.cfg +0 -0
- {weco-0.2.16 → weco-0.2.18}/weco/auth.py +0 -0
- {weco-0.2.16 → weco-0.2.18}/weco.egg-info/dependency_links.txt +0 -0
- {weco-0.2.16 → weco-0.2.18}/weco.egg-info/entry_points.txt +0 -0
- {weco-0.2.16 → weco-0.2.18}/weco.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: weco
-Version: 0.2.
+Version: 0.2.18
 Summary: Documentation for `weco`, a CLI for using Weco AI's code optimizer.
 Author-email: Weco AI Team <contact@weco.ai>
 License: MIT
@@ -14,25 +14,33 @@ Description-Content-Type: text/markdown
 License-File: LICENSE
 Requires-Dist: requests
 Requires-Dist: rich
+Requires-Dist: packaging
 Provides-Extra: dev
 Requires-Dist: ruff; extra == "dev"
 Requires-Dist: build; extra == "dev"
 Requires-Dist: setuptools_scm; extra == "dev"
 Dynamic: license-file
 
-
+<div align="center">
 
-
+# Weco: The Platform for Self-Improving Code
+
+[](https://www.python.org)
+[](https://docs.weco.ai/)
 [](https://badge.fury.io/py/weco)
 [](https://arxiv.org/abs/2502.13138)
 
+</div>
+
+---
+
 Weco systematically optimizes your code, guided directly by your evaluation metrics.
 
 Example applications include:
 
-- **GPU Kernel Optimization**: Reimplement PyTorch functions using CUDA or Triton optimizing for `latency`, `throughput`, or `memory_bandwidth`.
-- **Model Development**: Tune feature transformations or
-- **Prompt Engineering**: Refine prompts for LLMs, optimizing for `win_rate`, `relevance`, or `format_adherence`
+- **GPU Kernel Optimization**: Reimplement PyTorch functions using [CUDA](/examples/cuda/README.md) or [Triton](/examples/triton/README.md), optimizing for `latency`, `throughput`, or `memory_bandwidth`.
+- **Model Development**: Tune feature transformations, architectures or [the whole training pipeline](/examples/spaceship-titanic/README.md), optimizing for `validation_accuracy`, `AUC`, or `Sharpe Ratio`.
+- **Prompt Engineering**: Refine prompts for LLMs (e.g., for [math problems](/examples/prompt/README.md)), optimizing for `win_rate`, `relevance`, or `format_adherence`
 
 
 
@@ -62,29 +70,9 @@ The `weco` CLI leverages a tree search approach guided by Large Language Models
 - **Anthropic:** `export ANTHROPIC_API_KEY="your_key_here"`
 - **Google DeepMind:** `export GEMINI_API_KEY="your_key_here"` (Google AI Studio has a free API usage quota. Create a key [here](https://aistudio.google.com/apikey) to use `weco` for free.)
 
-The optimization process will fail if the necessary keys for the chosen model are not found in your environment.
-
-3. **Log In to Weco (Optional):**
-
-To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow:
-
-- When you first run `weco run`, you'll be prompted if you want to log in or proceed anonymously.
-- If you choose to log in (by pressing `l`), you'll be shown a URL and `weco` will attempt to open it in your default web browser.
-- You then authenticate in the browser. Once authenticated, the CLI will detect this and complete the login.
-- This saves a Weco-specific API key locally (typically at `~/.config/weco/credentials.json`).
-
-If you choose to skip login (by pressing Enter or `s`), `weco` will still function using the environment variable LLM keys, but the run history will not be linked to a Weco account.
-
-To log out and remove your saved Weco API key, use the `weco logout` command.
-
 ---
 
-##
-
-The CLI has two main commands:
-
-- `weco run`: Initiates the code optimization process.
-- `weco logout`: Logs you out of your Weco account.
+## Get Started
 
 <div style="background-color: #fff3cd; border: 1px solid #ffeeba; padding: 15px; border-radius: 4px; margin-bottom: 15px;">
 <strong>⚠️ Warning: Code Modification</strong><br>
@@ -93,15 +81,11 @@ The CLI has two main commands:
 
 ---
 
-### `weco run` Command
-
-This command starts the optimization process.
-
 **Example: Optimizing Simple PyTorch Operations**
 
 This basic example shows how to optimize a simple PyTorch function for speedup.
 
-For more advanced examples, including [Triton](/examples/triton/README.md), [CUDA kernel optimization](/examples/cuda/README.md)
+For more advanced examples, including [Triton](/examples/triton/README.md), [CUDA kernel optimization](/examples/cuda/README.md), [ML model optimization](/examples/spaceship-titanic/README.md), and [prompt engineering for math problems](https://github.com/WecoAI/weco-cli/tree/main/examples/prompt), please see the `README.md` files within the corresponding subdirectories under the [`examples/`](./examples/) folder.
 
 ```bash
 # Navigate to the example directory
@@ -136,17 +120,12 @@ weco run --source optimize.py \
 | `--model` | Model identifier for the LLM to use (e.g., `gpt-4o`, `claude-3.5-sonnet`). Recommended models to try include `o3-mini`, `claude-3-haiku`, and `gemini-2.5-pro-exp-03-25`. | Yes |
 | `--additional-instructions` | (Optional) Natural language description of specific instructions OR path to a file containing detailed instructions to guide the LLM. | No |
 | `--log-dir` | (Optional) Path to the directory to log intermediate steps and final optimization result. Defaults to `.runs/`. | No |
-| `--preserve-source` | (Optional) If set, do not overwrite the original `--source` file. Modifications and the best solution will still be saved in the `--log-dir`. | No |
 
 ---
 
-###
-
-
-
-```bash
-weco logout
-```
+### Weco Dashboard
+To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow
+
 
 ---
 
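The README text in this hunk mentions a device authentication flow without showing one. As a rough illustration only, the sketch below simulates the general shape of such a flow: request a device code, show the user a URL, then poll until a key is issued. `FakeAuthServer`, `device_login`, the URL, and the returned key are all hypothetical names invented for this sketch; none of this is taken from `weco/auth.py` or from Weco's real API.

```python
import time
from typing import Callable, Optional


class FakeAuthServer:
    """Hypothetical in-memory stand-in for an auth service (not Weco's API)."""

    def __init__(self, approve_after_polls: int) -> None:
        self._polls_left = approve_after_polls

    def start(self) -> dict:
        # A real service would return a fresh device code and a URL to open.
        return {
            "device_code": "dev-123",
            "verification_uri": "https://example.com/device",
            "interval": 0.0,
        }

    def poll(self, device_code: str) -> Optional[str]:
        # Returns None while the user has not yet approved in the browser.
        if self._polls_left > 0:
            self._polls_left -= 1
            return None
        return "api-key-for-" + device_code


def device_login(
    server: FakeAuthServer,
    max_polls: int = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Start the flow, show the URL, then poll until a key is issued or polls run out."""
    session = server.start()
    print(f"Open {session['verification_uri']} in your browser to log in.")
    for _ in range(max_polls):
        key = server.poll(session["device_code"])
        if key is not None:
            return key  # A real CLI would persist this key locally.
        sleep(session["interval"])
    return None
```

Calling `device_login(FakeAuthServer(approve_after_polls=3))` returns the fake key once the simulated user has approved; returning `None` after `max_polls` attempts models the case where the user never completes the login.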
@@ -1,16 +1,23 @@
-
+<div align="center">
 
-
+# Weco: The Platform for Self-Improving Code
+
+[](https://www.python.org)
+[](https://docs.weco.ai/)
 [](https://badge.fury.io/py/weco)
 [](https://arxiv.org/abs/2502.13138)
 
+</div>
+
+---
+
 Weco systematically optimizes your code, guided directly by your evaluation metrics.
 
 Example applications include:
 
-- **GPU Kernel Optimization**: Reimplement PyTorch functions using CUDA or Triton optimizing for `latency`, `throughput`, or `memory_bandwidth`.
-- **Model Development**: Tune feature transformations or
-- **Prompt Engineering**: Refine prompts for LLMs, optimizing for `win_rate`, `relevance`, or `format_adherence`
+- **GPU Kernel Optimization**: Reimplement PyTorch functions using [CUDA](/examples/cuda/README.md) or [Triton](/examples/triton/README.md), optimizing for `latency`, `throughput`, or `memory_bandwidth`.
+- **Model Development**: Tune feature transformations, architectures or [the whole training pipeline](/examples/spaceship-titanic/README.md), optimizing for `validation_accuracy`, `AUC`, or `Sharpe Ratio`.
+- **Prompt Engineering**: Refine prompts for LLMs (e.g., for [math problems](/examples/prompt/README.md)), optimizing for `win_rate`, `relevance`, or `format_adherence`
 
 
 
@@ -40,29 +47,9 @@ The `weco` CLI leverages a tree search approach guided by Large Language Models
 - **Anthropic:** `export ANTHROPIC_API_KEY="your_key_here"`
 - **Google DeepMind:** `export GEMINI_API_KEY="your_key_here"` (Google AI Studio has a free API usage quota. Create a key [here](https://aistudio.google.com/apikey) to use `weco` for free.)
 
-The optimization process will fail if the necessary keys for the chosen model are not found in your environment.
-
-3. **Log In to Weco (Optional):**
-
-To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow:
-
-- When you first run `weco run`, you'll be prompted if you want to log in or proceed anonymously.
-- If you choose to log in (by pressing `l`), you'll be shown a URL and `weco` will attempt to open it in your default web browser.
-- You then authenticate in the browser. Once authenticated, the CLI will detect this and complete the login.
-- This saves a Weco-specific API key locally (typically at `~/.config/weco/credentials.json`).
-
-If you choose to skip login (by pressing Enter or `s`), `weco` will still function using the environment variable LLM keys, but the run history will not be linked to a Weco account.
-
-To log out and remove your saved Weco API key, use the `weco logout` command.
-
 ---
 
-##
-
-The CLI has two main commands:
-
-- `weco run`: Initiates the code optimization process.
-- `weco logout`: Logs you out of your Weco account.
+## Get Started
 
 <div style="background-color: #fff3cd; border: 1px solid #ffeeba; padding: 15px; border-radius: 4px; margin-bottom: 15px;">
 <strong>⚠️ Warning: Code Modification</strong><br>
@@ -71,15 +58,11 @@ The CLI has two main commands:
 
 ---
 
-### `weco run` Command
-
-This command starts the optimization process.
-
 **Example: Optimizing Simple PyTorch Operations**
 
 This basic example shows how to optimize a simple PyTorch function for speedup.
 
-For more advanced examples, including [Triton](/examples/triton/README.md), [CUDA kernel optimization](/examples/cuda/README.md)
+For more advanced examples, including [Triton](/examples/triton/README.md), [CUDA kernel optimization](/examples/cuda/README.md), [ML model optimization](/examples/spaceship-titanic/README.md), and [prompt engineering for math problems](https://github.com/WecoAI/weco-cli/tree/main/examples/prompt), please see the `README.md` files within the corresponding subdirectories under the [`examples/`](./examples/) folder.
 
 ```bash
 # Navigate to the example directory
@@ -114,17 +97,12 @@ weco run --source optimize.py \
 | `--model` | Model identifier for the LLM to use (e.g., `gpt-4o`, `claude-3.5-sonnet`). Recommended models to try include `o3-mini`, `claude-3-haiku`, and `gemini-2.5-pro-exp-03-25`. | Yes |
 | `--additional-instructions` | (Optional) Natural language description of specific instructions OR path to a file containing detailed instructions to guide the LLM. | No |
 | `--log-dir` | (Optional) Path to the directory to log intermediate steps and final optimization result. Defaults to `.runs/`. | No |
-| `--preserve-source` | (Optional) If set, do not overwrite the original `--source` file. Modifications and the best solution will still be saved in the `--log-dir`. | No |
 
 ---
 
-###
-
-
-
-```bash
-weco logout
-```
+### Weco Dashboard
+To associate your optimization runs with your Weco account and view them on the Weco dashboard, you can log in. `weco` uses a device authentication flow
+
 
 ---
 
@@ -0,0 +1,51 @@
+# AIME Prompt Engineering Example with Weco
+
+This example shows how **Weco** can iteratively improve a prompt for solving American Invitational Mathematics Examination (AIME) problems. The experiment runs locally, requires only two short Python files, and aims to improve the accuracy metric.
+
+This example uses `gpt-4o-mini` via the OpenAI API by default. Ensure your `OPENAI_API_KEY` environment variable is set.
+
+## Files in this folder
+
+| File | Purpose |
+| :------------ | :---------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `optimize.py` | Holds the prompt template (instructing the LLM to reason step-by-step and use `\\boxed{}` for the final answer) and the mutable `EXTRA_INSTRUCTIONS` string. Weco edits **only** this file during the search. |
+| `eval.py` | Downloads a small slice of the 2024 AIME dataset, calls `optimize.solve` in parallel, parses the LLM output (looking for `\\boxed{}`), compares it to the ground truth, prints progress logs, and finally prints an `accuracy:` line that Weco reads. |
+
+
+## Quick start
+
+1. **Clone the repository and enter the folder.**
+```bash
+git clone https://github.com/your‑fork/weco‑examples.git
+cd weco‑examples/aime‑2024
+```
+2. **Run Weco.** The command below edits `EXTRA_INSTRUCTIONS` in `optimize.py`, invokes `eval.py` on every iteration, reads the printed accuracy, and keeps the best variants.
+```bash
+weco --source optimize.py \
+--eval-command "python eval.py" \
+--metric accuracy \
+--maximize true \
+--steps 40 \
+--model gemini-2.5-flash-preview-04-17 \
+--addtional-instructions prompt_guide.md
+```
+
+During each evaluation round you will see log lines similar to the following.
+
+```text
+[setup] loading 20 problems from AIME 2024 …
+[progress] 5/20 completed, elapsed 7.3 s
+[progress] 10/20 completed, elapsed 14.6 s
+[progress] 15/20 completed, elapsed 21.8 s
+[progress] 20/20 completed, elapsed 28.9 s
+accuracy: 0.0500
+```
+
+Weco then mutates the config, tries again, and gradually pushes the accuracy higher. On a modern laptop you can usually double the baseline score within thirty to forty iterations.
+
+## How it works
+
+* `eval_aime.py` slices the **Maxwell‑Jia/AIME_2024** dataset to twenty problems for fast feedback. You can change the slice in one line.
+* The script sends model calls in parallel via `ThreadPoolExecutor`, so network latency is hidden.
+* Every five completed items, the script logs progress and elapsed time.
+* The final line `accuracy: value` is the only part Weco needs for guidance.
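The "How it works" bullets in the added README describe an evaluation harness whose source is not part of this diff (`eval.py` is unchanged). The following is a minimal sketch of that pattern, assuming a stub `solve` function in place of the real LLM call and a toy arithmetic dataset in place of AIME; the function names `extract_boxed` and `evaluate` are illustrative and may not match the actual script.

```python
import re
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Optional


def solve(problem: str) -> str:
    """Hypothetical stand-in for `optimize.solve`; the real one calls an LLM."""
    a, b = (int(tok) for tok in problem.split("+"))
    return f"Adding the two numbers gives \\boxed{{{a + b}}}"


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None if absent."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None


def evaluate(problems, answers, solver=solve, max_workers=8, log_every=5) -> float:
    """Run solver on all problems in parallel; return the fraction answered correctly."""
    start = time.time()
    correct = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future to its ground-truth answer so results can arrive in any order.
        futures = {pool.submit(solver, p): ans for p, ans in zip(problems, answers)}
        for done, fut in enumerate(as_completed(futures), start=1):
            if extract_boxed(fut.result()) == str(futures[fut]):
                correct += 1
            if done % log_every == 0:
                elapsed = time.time() - start
                print(f"[progress] {done}/{len(futures)} completed, elapsed {elapsed:.1f} s")
    return correct / len(problems)


if __name__ == "__main__":
    problems = [f"{i}+{i + 1}" for i in range(20)]
    answers = [2 * i + 1 for i in range(20)]
    # The final line is the only output the optimizer needs to read.
    print(f"accuracy: {evaluate(problems, answers):.4f}")
```

Keeping the metric on its own final line, separate from the progress logs, means the optimizer only has to match a single `accuracy: value` line regardless of how verbose the rest of the output is.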
@@ -1,33 +1,16 @@
-# Example:
+# Example: Solving a Kaggle Competition (Spaceship Titanic)
 
 This example demonstrates using Weco to optimize a Python script designed for the [Spaceship Titanic Kaggle competition](https://www.kaggle.com/competitions/spaceship-titanic/overview). The goal is to improve the model's `accuracy` metric by directly optimizing the evaluate.py
 
 ## Setup
 
 1. Ensure you are in the `examples/spaceship-titanic` directory.
-2.
-3.
+2. `pip install weco`
+3. Set up LLM API Key, `export OPENAI_API_KEY="your_key_here"`
+4. **Install Dependencies:** Install the required Python packages:
 ```bash
 pip install -r requirements-test.txt
 ```
-4. **Prepare Data:** Run the utility script once to download the dataset from Kaggle and place it in the expected `./data/` subdirectories:
-```bash
-python get_data.py
-```
-After running `get_data.py`, your directory structure should look like this:
-```
-.
-├── competition_description.md
-├── data
-│ ├── sample_submission.csv
-│ ├── test.csv
-│ └── train.csv
-├── evaluate.py
-├── get_data.py
-├── README.md # This file
-├── requirements-test.txt
-└── submit.py
-```
 
 ## Optimization Command
 
@@ -38,20 +21,12 @@ weco run --source evaluate.py \
 --eval-command "python evaluate.py --data-dir ./data" \
 --metric accuracy \
 --maximize true \
---steps
---model
+--steps 20 \
+--model o4-mini \
 --additional-instructions "Improve feature engineering, model choice and hyper-parameters."
 --log-dir .runs/spaceship-titanic
 ```
 
-## Submit the solution
-
-Once the optimization finished, you can submit your predictions to kaggle to see the results. Make sure `submission.csv` is present and then simply run the following command.
-
-```bash
-python submit.py
-```
-
 ### Explanation
 
 * `--source evaluate.py`: The script provides a baseline as root node and directly optimize the evaluate.py
@@ -64,4 +39,4 @@ python submit.py
 * `--model gemini-2.5-pro-exp-03-25`: The LLM driving the optimization.
 * `--additional-instructions "Improve feature engineering, model choice and hyper-parameters."`: A simple instruction for model improvement or you can put the path to [`comptition_description.md`](./competition_description.md) within the repo to feed the agent more detailed information.
 
-Weco will iteratively modify the feature engineering or modeling code within `evaluate.py`, run the evaluation pipeline, and use the resulting `accuracy` to guide further improvements.
+Weco will iteratively modify the feature engineering or modeling code within `evaluate.py`, run the evaluation pipeline, and use the resulting `accuracy` to guide further improvements.
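Both example READMEs rely on the evaluation command printing a `metric: value` line that the optimizer then reads. How Weco parses that output is not shown in this diff, so the sketch below is only one plausible way a harness could run an evaluation command and extract the named metric; `parse_metric` and `run_eval` are hypothetical helpers, not functions from the `weco` package.

```python
import re
import shlex
import subprocess
from typing import Optional


def parse_metric(output: str, metric: str) -> Optional[float]:
    """Return the last `<metric>: <number>` value printed in output, or None."""
    pattern = re.compile(
        rf"^\s*{re.escape(metric)}\s*:\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s*$",
        re.MULTILINE,
    )
    matches = pattern.findall(output)
    return float(matches[-1]) if matches else None


def run_eval(command: str, metric: str) -> Optional[float]:
    """Run an evaluation command and read the metric from its standard output."""
    result = subprocess.run(shlex.split(command), capture_output=True, text=True)
    return parse_metric(result.stdout, metric)
```

Under these assumptions, `run_eval("python evaluate.py --data-dir ./data", "accuracy")` would return the last accuracy the script printed, or `None` if no matching line appeared, which a caller could treat as a failed evaluation.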