codexapi-0.5.6.tar.gz → codexapi-0.5.8.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {codexapi-0.5.6/src/codexapi.egg-info → codexapi-0.5.8}/PKG-INFO +19 -9
- {codexapi-0.5.6 → codexapi-0.5.8}/README.md +18 -8
- {codexapi-0.5.6 → codexapi-0.5.8}/pyproject.toml +1 -1
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/__init__.py +1 -1
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/cli.py +65 -7
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/foreach.py +5 -5
- codexapi-0.5.8/src/codexapi/gh_integration.py +229 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/task.py +45 -38
- {codexapi-0.5.6 → codexapi-0.5.8/src/codexapi.egg-info}/PKG-INFO +19 -9
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi.egg-info/SOURCES.txt +1 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/LICENSE +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/setup.cfg +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/__main__.py +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/agent.py +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/ralph.py +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi/taskfile.py +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi.egg-info/dependency_links.txt +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi.egg-info/entry_points.txt +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi.egg-info/requires.txt +0 -0
- {codexapi-0.5.6 → codexapi-0.5.8}/src/codexapi.egg-info/top_level.txt +0 -0
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.1
|
|
2
2
|
Name: codexapi
|
|
3
|
-
Version: 0.5.
|
|
3
|
+
Version: 0.5.8
|
|
4
4
|
Summary: Minimal Python API for running the Codex CLI.
|
|
5
5
|
License: MIT
|
|
6
6
|
Keywords: codex,agent,cli,openai
|
|
@@ -79,9 +79,19 @@ Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
|
|
|
79
79
|
When using `--item`, the task file must include at least one `{{item}}` placeholder.
|
|
80
80
|
|
|
81
81
|
Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
|
|
82
|
-
Use `max_iterations` in the task file to override the default
|
|
82
|
+
Use `max_iterations` in the task file to override the default iteration cap (0 means unlimited).
|
|
83
83
|
Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
|
|
84
84
|
|
|
85
|
+
Take tasks from a GitHub Project (requires `gh-task`):
|
|
86
|
+
|
|
87
|
+
```bash
|
|
88
|
+
codexapi task -p owner/projects/3 -n "Your Name" -s Backlog task_a.yaml task_b.yaml
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
Task labels are derived from task filenames (basename without extension). The
|
|
92
|
+
issue title/body become `{{item}}` after removing any existing `## Progress`
|
|
93
|
+
section.
|
|
94
|
+
|
|
85
95
|
Example task progress run:
|
|
86
96
|
|
|
87
97
|
```bash
|
|
@@ -157,10 +167,10 @@ the same conversation and returns only the agent's message.
|
|
|
157
167
|
### `task(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> str`
|
|
158
168
|
|
|
159
169
|
Runs a task with checker-driven retries and returns the success summary.
|
|
160
|
-
Raises `TaskFailed` when the maximum
|
|
170
|
+
Raises `TaskFailed` when the maximum iterations are reached.
|
|
161
171
|
|
|
162
172
|
- `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
|
|
163
|
-
- `max_iterations` (int): maximum number of task
|
|
173
|
+
- `max_iterations` (int): maximum number of task iterations (0 means unlimited).
|
|
164
174
|
- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
|
|
165
175
|
- `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
|
|
166
176
|
|
|
@@ -170,7 +180,7 @@ Runs a task with checker-driven retries and returns a `TaskResult` without
|
|
|
170
180
|
raising `TaskFailed`.
|
|
171
181
|
Arguments mirror `task()` (including hooks).
|
|
172
182
|
|
|
173
|
-
### `Task(prompt,
|
|
183
|
+
### `Task(prompt, max_iterations=10, cwd=None, yolo=True, thread_id=None, flags=None)`
|
|
174
184
|
|
|
175
185
|
Runs a Codex task with checker-driven retries. Subclass it and implement
|
|
176
186
|
`check()` to return an error string when the task is incomplete, or return
|
|
@@ -185,22 +195,22 @@ default check prompt and includes the agent output.
|
|
|
185
195
|
- `on_success(result)`: optional success hook.
|
|
186
196
|
- `on_failure(result)`: optional failure hook.
|
|
187
197
|
|
|
188
|
-
### `TaskResult(success, summary,
|
|
198
|
+
### `TaskResult(success, summary, iterations, errors, thread_id)`
|
|
189
199
|
|
|
190
200
|
Simple result object returned by `Task.__call__`.
|
|
191
201
|
|
|
192
202
|
- `success` (bool): whether the task completed successfully.
|
|
193
203
|
- `summary` (str): agent summary of what happened.
|
|
194
|
-
- `
|
|
204
|
+
- `iterations` (int): how many iterations were used.
|
|
195
205
|
- `errors` (str | None): last checker error, if any.
|
|
196
206
|
- `thread_id` (str | None): Codex thread id for the session.
|
|
197
207
|
|
|
198
208
|
### `TaskFailed`
|
|
199
209
|
|
|
200
|
-
Exception raised by `task()` when
|
|
210
|
+
Exception raised by `task()` when iterations are exhausted.
|
|
201
211
|
|
|
202
212
|
- `summary` (str): failure summary text.
|
|
203
|
-
- `
|
|
213
|
+
- `iterations` (int | None): iterations made when the task failed.
|
|
204
214
|
- `errors` (str | None): last checker error, if any.
|
|
205
215
|
|
|
206
216
|
### `foreach(list_file, task_file, n=None, cwd=None, yolo=True, flags=None) -> ForeachResult`
|
|
@@ -65,9 +65,19 @@ Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
|
|
|
65
65
|
When using `--item`, the task file must include at least one `{{item}}` placeholder.
|
|
66
66
|
|
|
67
67
|
Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
|
|
68
|
-
Use `max_iterations` in the task file to override the default
|
|
68
|
+
Use `max_iterations` in the task file to override the default iteration cap (0 means unlimited).
|
|
69
69
|
Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
|
|
70
70
|
|
|
71
|
+
Take tasks from a GitHub Project (requires `gh-task`):
|
|
72
|
+
|
|
73
|
+
```bash
|
|
74
|
+
codexapi task -p owner/projects/3 -n "Your Name" -s Backlog task_a.yaml task_b.yaml
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
Task labels are derived from task filenames (basename without extension). The
|
|
78
|
+
issue title/body become `{{item}}` after removing any existing `## Progress`
|
|
79
|
+
section.
|
|
80
|
+
|
|
71
81
|
Example task progress run:
|
|
72
82
|
|
|
73
83
|
```bash
|
|
@@ -143,10 +153,10 @@ the same conversation and returns only the agent's message.
|
|
|
143
153
|
### `task(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> str`
|
|
144
154
|
|
|
145
155
|
Runs a task with checker-driven retries and returns the success summary.
|
|
146
|
-
Raises `TaskFailed` when the maximum
|
|
156
|
+
Raises `TaskFailed` when the maximum iterations are reached.
|
|
147
157
|
|
|
148
158
|
- `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
|
|
149
|
-
- `max_iterations` (int): maximum number of task
|
|
159
|
+
- `max_iterations` (int): maximum number of task iterations (0 means unlimited).
|
|
150
160
|
- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
|
|
151
161
|
- `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
|
|
152
162
|
|
|
@@ -156,7 +166,7 @@ Runs a task with checker-driven retries and returns a `TaskResult` without
|
|
|
156
166
|
raising `TaskFailed`.
|
|
157
167
|
Arguments mirror `task()` (including hooks).
|
|
158
168
|
|
|
159
|
-
### `Task(prompt,
|
|
169
|
+
### `Task(prompt, max_iterations=10, cwd=None, yolo=True, thread_id=None, flags=None)`
|
|
160
170
|
|
|
161
171
|
Runs a Codex task with checker-driven retries. Subclass it and implement
|
|
162
172
|
`check()` to return an error string when the task is incomplete, or return
|
|
@@ -171,22 +181,22 @@ default check prompt and includes the agent output.
|
|
|
171
181
|
- `on_success(result)`: optional success hook.
|
|
172
182
|
- `on_failure(result)`: optional failure hook.
|
|
173
183
|
|
|
174
|
-
### `TaskResult(success, summary,
|
|
184
|
+
### `TaskResult(success, summary, iterations, errors, thread_id)`
|
|
175
185
|
|
|
176
186
|
Simple result object returned by `Task.__call__`.
|
|
177
187
|
|
|
178
188
|
- `success` (bool): whether the task completed successfully.
|
|
179
189
|
- `summary` (str): agent summary of what happened.
|
|
180
|
-
- `
|
|
190
|
+
- `iterations` (int): how many iterations were used.
|
|
181
191
|
- `errors` (str | None): last checker error, if any.
|
|
182
192
|
- `thread_id` (str | None): Codex thread id for the session.
|
|
183
193
|
|
|
184
194
|
### `TaskFailed`
|
|
185
195
|
|
|
186
|
-
Exception raised by `task()` when
|
|
196
|
+
Exception raised by `task()` when iterations are exhausted.
|
|
187
197
|
|
|
188
198
|
- `summary` (str): failure summary text.
|
|
189
|
-
- `
|
|
199
|
+
- `iterations` (int | None): iterations made when the task failed.
|
|
190
200
|
- `errors` (str | None): last checker error, if any.
|
|
191
201
|
|
|
192
202
|
### `foreach(list_file, task_file, n=None, cwd=None, yolo=True, flags=None) -> ForeachResult`
|
|
@@ -1033,9 +1033,25 @@ def main(argv=None):
|
|
|
1033
1033
|
help="Item value for task files that use {{item}} placeholders.",
|
|
1034
1034
|
)
|
|
1035
1035
|
task_parser.add_argument(
|
|
1036
|
-
"
|
|
1037
|
-
|
|
1038
|
-
help="
|
|
1036
|
+
"-p",
|
|
1037
|
+
"--project",
|
|
1038
|
+
help="GitHub Project reference to pull tasks from.",
|
|
1039
|
+
)
|
|
1040
|
+
task_parser.add_argument(
|
|
1041
|
+
"-s",
|
|
1042
|
+
"--status",
|
|
1043
|
+
default="Backlog",
|
|
1044
|
+
help="Status name to take from when using --project (default: Backlog).",
|
|
1045
|
+
)
|
|
1046
|
+
task_parser.add_argument(
|
|
1047
|
+
"-n",
|
|
1048
|
+
"--name",
|
|
1049
|
+
help="Owner label name for gh-task when using --project.",
|
|
1050
|
+
)
|
|
1051
|
+
task_parser.add_argument(
|
|
1052
|
+
"task_args",
|
|
1053
|
+
nargs="*",
|
|
1054
|
+
help="Prompt to send (no --project) or task files (with --project).",
|
|
1039
1055
|
)
|
|
1040
1056
|
task_parser.add_argument(
|
|
1041
1057
|
"--check",
|
|
@@ -1046,7 +1062,7 @@ def main(argv=None):
|
|
|
1046
1062
|
type=int,
|
|
1047
1063
|
default=None,
|
|
1048
1064
|
help=(
|
|
1049
|
-
"Max agent
|
|
1065
|
+
"Max agent iterations (0 means unlimited). "
|
|
1050
1066
|
f"Defaults to {DEFAULT_MAX_ITERATIONS}."
|
|
1051
1067
|
),
|
|
1052
1068
|
)
|
|
@@ -1276,8 +1292,40 @@ def main(argv=None):
|
|
|
1276
1292
|
if args.ralph_fresh is None:
|
|
1277
1293
|
args.ralph_fresh = True
|
|
1278
1294
|
|
|
1295
|
+
if args.command == "task" and args.project:
|
|
1296
|
+
if args.task_file:
|
|
1297
|
+
raise SystemExit("task --project does not allow -f.")
|
|
1298
|
+
if args.item is not None:
|
|
1299
|
+
raise SystemExit("--item is only supported with -f.")
|
|
1300
|
+
if args.check is not None:
|
|
1301
|
+
raise SystemExit("--check is not allowed with --project.")
|
|
1302
|
+
if args.max_iterations is not None:
|
|
1303
|
+
raise SystemExit("--max-iterations is not allowed with --project.")
|
|
1304
|
+
if not args.name:
|
|
1305
|
+
raise SystemExit("--name is required with --project.")
|
|
1306
|
+
if not args.task_args:
|
|
1307
|
+
raise SystemExit("task --project requires one or more task files.")
|
|
1308
|
+
try:
|
|
1309
|
+
from .gh_integration import GhTaskRunner
|
|
1310
|
+
except ImportError as exc:
|
|
1311
|
+
raise SystemExit("gh-task is required for --project. Install it with pip.") from exc
|
|
1312
|
+
|
|
1313
|
+
task_runner = GhTaskRunner(
|
|
1314
|
+
args.project,
|
|
1315
|
+
args.name,
|
|
1316
|
+
args.task_args,
|
|
1317
|
+
args.status,
|
|
1318
|
+
args.cwd,
|
|
1319
|
+
args.yolo,
|
|
1320
|
+
args.flags,
|
|
1321
|
+
)
|
|
1322
|
+
result = task_runner(progress=not args.quiet)
|
|
1323
|
+
if not result.success:
|
|
1324
|
+
raise SystemExit(1)
|
|
1325
|
+
return
|
|
1326
|
+
|
|
1279
1327
|
if args.command == "task" and args.task_file:
|
|
1280
|
-
if args.
|
|
1328
|
+
if args.task_args:
|
|
1281
1329
|
raise SystemExit("task -f does not take a prompt.")
|
|
1282
1330
|
if args.item is not None:
|
|
1283
1331
|
task_def = load_task_file(args.task_file)
|
|
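The `--project` branch above can be exercised in isolation. The following is a minimal sketch, assuming a stand-alone parser rather than the real `task` subparser inside `main()`. The helper names `build_task_parser` and `validate_project_args` are hypothetical, and the `-f`, `--item`, `--check`, and `--max-iterations` declarations are inferred from the attribute names and error messages in the hunk, not copied from the source.

```python
import argparse


def build_task_parser():
    """Hypothetical stand-alone parser mirroring the flags shown in the diff."""
    parser = argparse.ArgumentParser(prog="codexapi task")
    parser.add_argument("-f", dest="task_file")  # assumed spelling of the task-file flag
    parser.add_argument("--item")
    parser.add_argument("-p", "--project",
                        help="GitHub Project reference to pull tasks from.")
    parser.add_argument("-s", "--status", default="Backlog",
                        help="Status name to take from when using --project.")
    parser.add_argument("-n", "--name",
                        help="Owner label name for gh-task when using --project.")
    parser.add_argument("task_args", nargs="*",
                        help="Prompt (no --project) or task files (with --project).")
    parser.add_argument("--check")
    parser.add_argument("--max-iterations", type=int, default=None)
    return parser


def validate_project_args(args):
    """Return the first error message for a --project invocation, or None if valid."""
    if args.task_file:
        return "task --project does not allow -f."
    if args.item is not None:
        return "--item is only supported with -f."
    if args.check is not None:
        return "--check is not allowed with --project."
    if args.max_iterations is not None:
        return "--max-iterations is not allowed with --project."
    if not args.name:
        return "--name is required with --project."
    if not args.task_args:
        return "task --project requires one or more task files."
    return None


if __name__ == "__main__":
    demo = build_task_parser().parse_args(
        ["-p", "owner/projects/3", "-n", "Your Name", "task_a.yaml", "task_b.yaml"]
    )
    print(demo.status, demo.task_args, validate_project_args(demo))
```

The checks run in the same order as in the hunk, so the first violated rule determines the message. In the real CLI each message is raised through `SystemExit` instead of being returned.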
@@ -1303,11 +1351,13 @@ def main(argv=None):
|
|
|
1303
1351
|
return
|
|
1304
1352
|
|
|
1305
1353
|
prompt_source = None
|
|
1306
|
-
|
|
1354
|
+
prompt = None
|
|
1355
|
+
if args.command in ("run", "ralph"):
|
|
1307
1356
|
prompt_source = args.prompt
|
|
1308
1357
|
elif args.command == "science":
|
|
1309
1358
|
prompt_source = args.task
|
|
1310
|
-
|
|
1359
|
+
if args.command != "task":
|
|
1360
|
+
prompt = _read_prompt(prompt_source)
|
|
1311
1361
|
exit_code = 0
|
|
1312
1362
|
message = None
|
|
1313
1363
|
|
|
@@ -1339,6 +1389,8 @@ def main(argv=None):
|
|
|
1339
1389
|
)
|
|
1340
1390
|
return
|
|
1341
1391
|
if args.command == "task":
|
|
1392
|
+
if args.project:
|
|
1393
|
+
raise SystemExit("task --project already handled earlier.")
|
|
1342
1394
|
if args.item is not None:
|
|
1343
1395
|
raise SystemExit("--item is only supported with -f.")
|
|
1344
1396
|
if args.max_iterations is None:
|
|
@@ -1347,6 +1399,12 @@ def main(argv=None):
|
|
|
1347
1399
|
raise SystemExit("--max-iterations must be >= 0.")
|
|
1348
1400
|
check = args.check
|
|
1349
1401
|
try:
|
|
1402
|
+
task_args = args.task_args or []
|
|
1403
|
+
if len(task_args) > 1:
|
|
1404
|
+
raise SystemExit("task takes a single prompt unless --project is used.")
|
|
1405
|
+
if task_args:
|
|
1406
|
+
prompt_source = task_args[0]
|
|
1407
|
+
prompt = _read_prompt(prompt_source)
|
|
1350
1408
|
task(
|
|
1351
1409
|
prompt,
|
|
1352
1410
|
check,
|
|
@@ -185,8 +185,8 @@ def _run_item(
|
|
|
185
185
|
|
|
186
186
|
summary = ""
|
|
187
187
|
success = False
|
|
188
|
-
|
|
189
|
-
|
|
188
|
+
iterations = None
|
|
189
|
+
max_iterations = None
|
|
190
190
|
try:
|
|
191
191
|
task = TaskFile(
|
|
192
192
|
task_file,
|
|
@@ -196,17 +196,17 @@ def _run_item(
|
|
|
196
196
|
thread_id=None,
|
|
197
197
|
flags=flags,
|
|
198
198
|
)
|
|
199
|
-
|
|
199
|
+
max_iterations = task.max_iterations
|
|
200
200
|
result = task()
|
|
201
201
|
success = result.success
|
|
202
|
-
|
|
202
|
+
iterations = result.iterations
|
|
203
203
|
summary = result.summary or ""
|
|
204
204
|
except Exception as exc:
|
|
205
205
|
summary = f"{type(exc).__name__}: {exc}"
|
|
206
206
|
success = False
|
|
207
207
|
|
|
208
208
|
summary = _single_line(summary)
|
|
209
|
-
turns = _format_turns(
|
|
209
|
+
turns = _format_turns(iterations, max_iterations)
|
|
210
210
|
if summary:
|
|
211
211
|
summary = f"{summary} {turns}"
|
|
212
212
|
else:
|
|
@@ -0,0 +1,229 @@
|
|
|
1
|
+
import logging
|
|
2
|
+
import re
|
|
3
|
+
import time
|
|
4
|
+
from pathlib import Path
|
|
5
|
+
|
|
6
|
+
from tqdm import tqdm
|
|
7
|
+
|
|
8
|
+
from gh_task.project import Project
|
|
9
|
+
|
|
10
|
+
from .taskfile import TaskFile
|
|
11
|
+
|
|
12
|
+
|
|
13
|
+
_logger = logging.getLogger(__name__)
|
|
14
|
+
|
|
15
|
+
_PROGRESS_HEADER = "## Progress"
|
|
16
|
+
_SUCCESS_LABEL = "✓"
|
|
17
|
+
_FAILURE_LABEL = "⨉"
|
|
18
|
+
_SUCCESS_COLOR = "2da44e"
|
|
19
|
+
_FAILURE_COLOR = "d73a4a"
|
|
20
|
+
|
|
21
|
+
|
|
22
|
+
def _canonical_task_name(path):
|
|
23
|
+
return Path(path).stem
|
|
24
|
+
|
|
25
|
+
|
|
26
|
+
def _task_file_map(task_files):
|
|
27
|
+
mapping = {}
|
|
28
|
+
for path in task_files:
|
|
29
|
+
name = _canonical_task_name(path)
|
|
30
|
+
if not name:
|
|
31
|
+
raise ValueError(f"Task file name is empty: {path}")
|
|
32
|
+
key = name.lower()
|
|
33
|
+
if key in mapping:
|
|
34
|
+
raise ValueError(f"Duplicate task name '{name}' for {path} and {mapping[key][1]}")
|
|
35
|
+
mapping[key] = (name, path)
|
|
36
|
+
if not mapping:
|
|
37
|
+
raise ValueError("At least one task file is required")
|
|
38
|
+
return mapping
|
|
39
|
+
|
|
40
|
+
|
|
41
|
+
def _issue_url(issue):
|
|
42
|
+
if issue.url:
|
|
43
|
+
return issue.url
|
|
44
|
+
return f"https://github.com/{issue.repo}/issues/{issue.number}"
|
|
45
|
+
|
|
46
|
+
|
|
47
|
+
def _match_task_file(issue, task_map):
|
|
48
|
+
labels = issue.labels or []
|
|
49
|
+
matches = []
|
|
50
|
+
for label in labels:
|
|
51
|
+
key = label.strip().lower()
|
|
52
|
+
if key in task_map:
|
|
53
|
+
matches.append((label, task_map[key][1]))
|
|
54
|
+
if not matches:
|
|
55
|
+
raise ValueError(f"Issue {_issue_url(issue)} has no matching task label")
|
|
56
|
+
if len(matches) > 1:
|
|
57
|
+
details = ", ".join(f"{label} -> {path}" for label, path in matches)
|
|
58
|
+
raise ValueError(
|
|
59
|
+
f"Issue {_issue_url(issue)} matches multiple task labels: {details}"
|
|
60
|
+
)
|
|
61
|
+
return matches[0][1]
|
|
62
|
+
|
|
63
|
+
|
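The label-to-task-file matching defined above can be tried without GitHub access. This is a simplified sketch: the issue is a stand-in `SimpleNamespace` that carries only a `labels` list (an assumption about what the matcher needs), and the error messages are shortened because the URL helper is left out.

```python
from pathlib import Path
from types import SimpleNamespace


def task_file_map(task_files):
    """Map lower-cased file stems to (display name, path), rejecting duplicates."""
    mapping = {}
    for path in task_files:
        name = Path(path).stem
        if not name:
            raise ValueError(f"Task file name is empty: {path}")
        key = name.lower()
        if key in mapping:
            raise ValueError(f"Duplicate task name '{name}'")
        mapping[key] = (name, path)
    if not mapping:
        raise ValueError("At least one task file is required")
    return mapping


def match_task_file(issue, task_map):
    """Return the single task file whose name matches one of the issue labels."""
    matches = []
    for label in issue.labels or []:
        key = label.strip().lower()
        if key in task_map:
            matches.append(task_map[key][1])
    if not matches:
        raise ValueError("Issue has no matching task label")
    if len(matches) > 1:
        raise ValueError("Issue matches multiple task labels")
    return matches[0]


task_map = task_file_map(["tasks/Bugfix.yaml", "docs.yaml"])
issue = SimpleNamespace(labels=[" BugFix ", "priority"])  # stand-in issue object
chosen = match_task_file(issue, task_map)  # matching ignores case and padding
```

Because keys are lower-cased stems, `fix.yaml` and `Fix.yml` in different directories collide and are rejected up front instead of silently shadowing each other.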
|
64
|
+
def _strip_progress_section(body):
|
|
65
|
+
if not body:
|
|
66
|
+
return ""
|
|
67
|
+
match = re.search(r"(?m)^## Progress\s*$", body)
|
|
68
|
+
if not match:
|
|
69
|
+
return body.strip()
|
|
70
|
+
return body[:match.start()].rstrip()
|
|
71
|
+
|
|
72
|
+
|
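A small stand-alone sketch of the progress-stripping step, assuming the intent is to cut the issue body at a `## Progress` heading that sits on its own line. The function name drops the leading underscore to mark it as an illustration, not the packaged helper.

```python
import re


def strip_progress_section(body):
    """Return the issue body without a trailing '## Progress' section."""
    if not body:
        return ""
    # (?m) makes ^ and $ match at line boundaries; \s* tolerates trailing whitespace.
    match = re.search(r"(?m)^## Progress\s*$", body)
    if not match:
        return body.strip()
    return body[:match.start()].rstrip()


body = "Fix the login bug.\n\n## Progress\n\n`[01/10 0h00m05s]` working\n"
description = strip_progress_section(body)
```

Inside a raw string the whitespace class is written `\s`. A doubled `\\s` would instead match a literal backslash followed by optional `s` characters, so a plain `## Progress` heading would never be found.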
|
73
|
+
def _format_item_text(issue, description):
|
|
74
|
+
title = issue.title or ""
|
|
75
|
+
url = _issue_url(issue)
|
|
76
|
+
description = description or ""
|
|
77
|
+
return f"Issue: {url}\nTitle: {title}\nDescription: {description}\n"
|
|
78
|
+
|
|
79
|
+
|
|
80
|
+
def _format_status_line(status_line):
|
|
81
|
+
match = re.match(r"^\[(?P<turns>[^ ]+) @ (?P<elapsed>[^\]]+)\]:\s*(?P<summary>.*)$", status_line)
|
|
82
|
+
if not match:
|
|
83
|
+
return status_line
|
|
84
|
+
summary = match.group("summary").strip()
|
|
85
|
+
prefix = f"`[{match.group('turns')} {match.group('elapsed')}]`"
|
|
86
|
+
if summary:
|
|
87
|
+
return f"{prefix} {summary}"
|
|
88
|
+
return prefix
|
|
89
|
+
|
|
90
|
+
|
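To illustrate the status-line rewrite, here is a minimal sketch that assumes the input has the `[turns @ elapsed]: summary` shape produced by the task loop. The pattern is written with single backslashes inside a raw string, and the sample strings are made up.

```python
import re

# Matches "[03/10 @ 0h01m05s]: summary" and captures the three parts.
STATUS_RE = re.compile(
    r"^\[(?P<turns>[^ ]+) @ (?P<elapsed>[^\]]+)\]:\s*(?P<summary>.*)$"
)


def format_status_line(status_line):
    """Rewrite '[turns @ elapsed]: summary' as '`[turns elapsed]` summary'."""
    match = STATUS_RE.match(status_line)
    if not match:
        return status_line  # leave unrecognized lines untouched
    summary = match.group("summary").strip()
    prefix = f"`[{match.group('turns')} {match.group('elapsed')}]`"
    return f"{prefix} {summary}" if summary else prefix


formatted = format_status_line("[03/10 @ 0h01m05s]: Fixed failing tests")
```

Wrapping the prefix in backticks makes it render as inline code in the GitHub issue body, while lines that do not follow the expected shape pass through unchanged.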
|
91
|
+
def _format_progress_bar(total, remaining, start_time):
|
|
92
|
+
if total is None:
|
|
93
|
+
total = 0
|
|
94
|
+
current = total - remaining
|
|
95
|
+
if current < 0:
|
|
96
|
+
current = 0
|
|
97
|
+
elapsed = 0.0
|
|
98
|
+
if start_time is not None:
|
|
99
|
+
elapsed = time.monotonic() - start_time
|
|
100
|
+
total_for_bar = total if total > 0 else 1
|
|
101
|
+
return tqdm.format_meter(current, total_for_bar, elapsed, ncols=80)
|
|
102
|
+
|
|
103
|
+
|
|
104
|
+
def _render_progress_section(base_body, status_line, bar_text):
|
|
105
|
+
parts = [
|
|
106
|
+
_PROGRESS_HEADER,
|
|
107
|
+
"",
|
|
108
|
+
status_line,
|
|
109
|
+
"",
|
|
110
|
+
"```",
|
|
111
|
+
bar_text,
|
|
112
|
+
"```",
|
|
113
|
+
]
|
|
114
|
+
section = "\n".join(parts).rstrip()
|
|
115
|
+
if base_body:
|
|
116
|
+
return f"{base_body.rstrip()}\n\n{section}\n"
|
|
117
|
+
return f"{section}\n"
|
|
118
|
+
|
|
119
|
+
|
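The rendered section is plain Markdown appended to the issue body. The sketch below mirrors the function above under one cosmetic assumption: the fence marker is built as a `FENCE` constant so that this example can itself sit inside a fenced block. The sample strings are invented.

```python
FENCE = "`" * 3  # three backticks, built indirectly to keep this example's own fence intact


def render_progress_section(base_body, status_line, bar_text):
    """Append a '## Progress' section (status line plus fenced bar) to the body."""
    parts = ["## Progress", "", status_line, "", FENCE, bar_text, FENCE]
    section = "\n".join(parts).rstrip()
    if base_body:
        return f"{base_body.rstrip()}\n\n{section}\n"
    return f"{section}\n"


updated = render_progress_section(
    "Fix the login bug.",
    "`[01/10 0h00m05s]` working",
    " 25%|##5       | 1/4",
)
```

Since the description is recovered by cutting the body at the `## Progress` heading, re-rendering on every progress update replaces the previous section instead of stacking new ones.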
|
120
|
+
class GhTaskFile(TaskFile):
|
|
121
|
+
def __init__(
|
|
122
|
+
self,
|
|
123
|
+
path,
|
|
124
|
+
issue,
|
|
125
|
+
project,
|
|
126
|
+
item_text,
|
|
127
|
+
cwd=None,
|
|
128
|
+
yolo=True,
|
|
129
|
+
thread_id=None,
|
|
130
|
+
flags=None,
|
|
131
|
+
):
|
|
132
|
+
super().__init__(path, item_text, None, cwd, yolo, thread_id, flags)
|
|
133
|
+
self.issue = issue
|
|
134
|
+
self.project = project
|
|
135
|
+
self._progress_updates = True
|
|
136
|
+
|
|
137
|
+
def on_progress(
|
|
138
|
+
self,
|
|
139
|
+
iterations,
|
|
140
|
+
max_iterations,
|
|
141
|
+
total_estimate,
|
|
142
|
+
remaining_estimate,
|
|
143
|
+
status_line,
|
|
144
|
+
):
|
|
145
|
+
super().on_progress(
|
|
146
|
+
iterations,
|
|
147
|
+
max_iterations,
|
|
148
|
+
total_estimate,
|
|
149
|
+
remaining_estimate,
|
|
150
|
+
status_line,
|
|
151
|
+
)
|
|
152
|
+
try:
|
|
153
|
+
self.project.set_estimate(self.issue, remaining_estimate)
|
|
154
|
+
except Exception as exc:
|
|
155
|
+
_logger.warning("Failed to update estimate for issue %s", _issue_url(self.issue), exc_info=exc)
|
|
156
|
+
if not status_line:
|
|
157
|
+
return
|
|
158
|
+
try:
|
|
159
|
+
body = self.project.get_issue_body(self.issue)
|
|
160
|
+
base = _strip_progress_section(body)
|
|
161
|
+
status = _format_status_line(status_line)
|
|
162
|
+
bar_text = _format_progress_bar(total_estimate, remaining_estimate, self._progress_start)
|
|
163
|
+
updated = _render_progress_section(base, status, bar_text)
|
|
164
|
+
self.project.set_issue_body(self.issue, updated)
|
|
165
|
+
except Exception as exc:
|
|
166
|
+
_logger.warning("Failed to update issue progress for %s", _issue_url(self.issue), exc_info=exc)
|
|
167
|
+
|
|
168
|
+
def on_success(self, result):
|
|
169
|
+
super().on_success(result)
|
|
170
|
+
self.project.ensure_label(
|
|
171
|
+
self.issue.repo,
|
|
172
|
+
_SUCCESS_LABEL,
|
|
173
|
+
color=_SUCCESS_COLOR,
|
|
174
|
+
description="Task succeeded",
|
|
175
|
+
)
|
|
176
|
+
self.project.add_label(self.issue, _SUCCESS_LABEL)
|
|
177
|
+
|
|
178
|
+
def on_failure(self, result):
|
|
179
|
+
super().on_failure(result)
|
|
180
|
+
self.project.ensure_label(
|
|
181
|
+
self.issue.repo,
|
|
182
|
+
_FAILURE_LABEL,
|
|
183
|
+
color=_FAILURE_COLOR,
|
|
184
|
+
description="Task failed",
|
|
185
|
+
)
|
|
186
|
+
self.project.add_label(self.issue, _FAILURE_LABEL)
|
|
187
|
+
|
|
188
|
+
def tear_down(self):
|
|
189
|
+
super().tear_down()
|
|
190
|
+
self.project.move(self.issue, "In review")
|
|
191
|
+
self.project.release(self.issue)
|
|
192
|
+
|
|
193
|
+
|
|
194
|
+
class GhTaskRunner:
|
|
195
|
+
def __init__(
|
|
196
|
+
self,
|
|
197
|
+
project,
|
|
198
|
+
name,
|
|
199
|
+
task_files,
|
|
200
|
+
status="Backlog",
|
|
201
|
+
cwd=None,
|
|
202
|
+
yolo=True,
|
|
203
|
+
flags=None,
|
|
204
|
+
):
|
|
205
|
+
task_map = _task_file_map(task_files)
|
|
206
|
+
self.project = Project(project, name, has_label=list(task_map))
|
|
207
|
+
self.issue = self.project.take(status=status, return_issue=True)
|
|
208
|
+
self.issue = self.project.get_issue(self.issue)
|
|
209
|
+
try:
|
|
210
|
+
task_path = _match_task_file(self.issue, task_map)
|
|
211
|
+
except Exception:
|
|
212
|
+
self.project.release(self.issue)
|
|
213
|
+
raise
|
|
214
|
+
body = self.project.get_issue_body(self.issue)
|
|
215
|
+
description = _strip_progress_section(body)
|
|
216
|
+
item_text = _format_item_text(self.issue, description)
|
|
217
|
+
self.task = GhTaskFile(
|
|
218
|
+
task_path,
|
|
219
|
+
self.issue,
|
|
220
|
+
self.project,
|
|
221
|
+
item_text,
|
|
222
|
+
cwd,
|
|
223
|
+
yolo,
|
|
224
|
+
None,
|
|
225
|
+
flags,
|
|
226
|
+
)
|
|
227
|
+
|
|
228
|
+
def __call__(self, progress=False):
|
|
229
|
+
return self.task(progress=progress)
|
|
@@ -152,15 +152,17 @@ def _format_elapsed(seconds):
|
|
|
152
152
|
return f"{hours}h{minutes:02d}m{seconds:02d}s"
|
|
153
153
|
|
|
154
154
|
|
|
155
|
-
def _format_turns(
|
|
155
|
+
def _format_turns(iteration, total):
|
|
156
156
|
if total:
|
|
157
|
-
width =
|
|
157
|
+
width = len(str(total))
|
|
158
158
|
total_text = str(total)
|
|
159
159
|
else:
|
|
160
|
-
width =
|
|
160
|
+
width = len(str(iteration))
|
|
161
161
|
total_text = "∞"
|
|
162
|
-
|
|
163
|
-
|
|
162
|
+
if width < 1:
|
|
163
|
+
width = 1
|
|
164
|
+
iteration_text = f"{iteration:0{width}d}"
|
|
165
|
+
return f"{iteration_text}/{total_text}"
|
|
164
166
|
|
|
165
167
|
|
|
166
168
|
def estimate(prompt, agent_output, check_output, cwd, yolo, flags, previous_total):
|
|
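As a quick check of the new `_format_turns(iteration, total)` behavior, the function body from this hunk is reproduced below under a public name so it can be run on its own. The counter is zero-padded to the width of the cap, and a falsy cap (0 or `None`) is shown as `∞`.

```python
def format_turns(iteration, total):
    """Format 'iteration/total', padding the counter to the width of the cap."""
    if total:
        width = len(str(total))
        total_text = str(total)
    else:
        # No cap: pad to the counter's own width and show infinity.
        width = len(str(iteration))
        total_text = "∞"
    if width < 1:
        width = 1
    iteration_text = f"{iteration:0{width}d}"
    return f"{iteration_text}/{total_text}"


capped = format_turns(3, 10)     # "03/10"
uncapped = format_turns(12, 0)   # "12/∞"
```

The `d` format code requires an integer counter, so a caller that may not have an iteration count (for example after an exception) would need to supply a number before calling this.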
@@ -190,21 +192,21 @@ def _success_prompt():
|
|
|
190
192
|
|
|
191
193
|
def _failure_prompt(error):
|
|
192
194
|
return (
|
|
193
|
-
"We ran out of
|
|
195
|
+
"We ran out of iterations. Summarize what you did and what is still failing.\n\n"
|
|
194
196
|
f"Outstanding issues:\n{error}"
|
|
195
197
|
)
|
|
196
198
|
|
|
197
199
|
|
|
198
200
|
class TaskFailed(RuntimeError):
|
|
199
|
-
"""Raised when a task hits the maximum
|
|
201
|
+
"""Raised when a task hits the maximum iterations without success."""
|
|
200
202
|
|
|
201
|
-
def __init__(self, summary,
|
|
202
|
-
message = "Task failed after maximum
|
|
203
|
+
def __init__(self, summary, iterations=None, errors=None):
|
|
204
|
+
message = "Task failed after maximum iterations."
|
|
203
205
|
if summary:
|
|
204
206
|
message = f"{message}\n{summary}"
|
|
205
207
|
super().__init__(message)
|
|
206
208
|
self.summary = summary
|
|
207
|
-
self.
|
|
209
|
+
self.iterations = iterations
|
|
208
210
|
self.errors = errors
|
|
209
211
|
|
|
210
212
|
|
|
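A short sketch of how calling code might use the exception's attributes. The class body is copied from this hunk so the example runs without `codexapi` installed, while `describe_failure` and the sample values are hypothetical.

```python
class TaskFailed(RuntimeError):
    """Raised when a task hits the maximum iterations without success."""

    def __init__(self, summary, iterations=None, errors=None):
        message = "Task failed after maximum iterations."
        if summary:
            message = f"{message}\n{summary}"
        super().__init__(message)
        self.summary = summary
        self.iterations = iterations
        self.errors = errors


def describe_failure(exc):
    """Build a one-line report from a TaskFailed instance (hypothetical helper)."""
    count = "unknown" if exc.iterations is None else str(exc.iterations)
    return f"failed after {count} iteration(s): {exc.errors or 'no checker error'}"


try:
    raise TaskFailed("Tests still fail.", iterations=10, errors="2 tests failing")
except TaskFailed as exc:
    report = describe_failure(exc)
```

`iterations` defaults to `None`, so handlers should not assume it is always an integer.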
@@ -235,7 +237,7 @@ def task(
|
|
|
235
237
|
prompt: The task prompt to run.
|
|
236
238
|
check: False to skip verification, None for the default check, or
|
|
237
239
|
a string check prompt. The string "None" skips verification.
|
|
238
|
-
max_iterations: Maximum number of task
|
|
240
|
+
max_iterations: Maximum number of task iterations (0 means unlimited).
|
|
239
241
|
cwd: Optional working directory for the Codex session.
|
|
240
242
|
yolo: Whether to pass --yolo to Codex.
|
|
241
243
|
flags: Additional raw CLI flags to pass to Codex.
|
|
@@ -249,7 +251,7 @@ def task(
|
|
|
249
251
|
The agent's response text when the task succeeds.
|
|
250
252
|
|
|
251
253
|
Raises:
|
|
252
|
-
TaskFailed: when the task reaches the maximum
|
|
254
|
+
TaskFailed: when the task reaches the maximum iterations without success.
|
|
253
255
|
"""
|
|
254
256
|
result = task_result(
|
|
255
257
|
prompt,
|
|
@@ -266,7 +268,7 @@ def task(
|
|
|
266
268
|
)
|
|
267
269
|
if result.success:
|
|
268
270
|
return result.summary
|
|
269
|
-
raise TaskFailed(result.summary, result.
|
|
271
|
+
raise TaskFailed(result.summary, result.iterations, result.errors)
|
|
270
272
|
|
|
271
273
|
|
|
272
274
|
def task_result(
|
|
@@ -284,7 +286,7 @@ def task_result(
|
|
|
284
286
|
):
|
|
285
287
|
"""Run a prompt with optional checker-driven retries and return TaskResult.
|
|
286
288
|
|
|
287
|
-
The runner keeps a single session. Each verification
|
|
289
|
+
The runner keeps a single session. Each verification iteration uses a fresh,
|
|
288
290
|
stateless agent call. When progress is True, show progress updates each round.
|
|
289
291
|
|
|
290
292
|
Hook strings mirror task file keys: set_up, tear_down, on_success, on_failure.
|
|
@@ -317,10 +319,10 @@ def task_result(
|
|
|
317
319
|
class TaskResult:
|
|
318
320
|
"""Outcome summary for a task run."""
|
|
319
321
|
|
|
320
|
-
def __init__(self, success, summary,
|
|
322
|
+
def __init__(self, success, summary, iterations, errors, thread_id):
|
|
321
323
|
self.success = success
|
|
322
324
|
self.summary = summary
|
|
323
|
-
self.
|
|
325
|
+
self.iterations = iterations
|
|
324
326
|
self.errors = errors
|
|
325
327
|
self.thread_id = thread_id
|
|
326
328
|
|
|
@@ -328,7 +330,7 @@ class TaskResult:
|
|
|
328
330
|
return (
|
|
329
331
|
"TaskResult("
|
|
330
332
|
f"success={self.success}, "
|
|
331
|
-
f"
|
|
333
|
+
f"iterations={self.iterations}, "
|
|
332
334
|
f"errors={self.errors!r}, "
|
|
333
335
|
f"thread_id={self.thread_id!r}, "
|
|
334
336
|
f"summary={self.summary!r}"
|
|
@@ -350,16 +352,16 @@ class Task:
|
|
|
350
352
|
def __init__(
|
|
351
353
|
self,
|
|
352
354
|
prompt,
|
|
353
|
-
|
|
355
|
+
max_iterations=DEFAULT_MAX_ITERATIONS,
|
|
354
356
|
cwd=None,
|
|
355
357
|
yolo=True,
|
|
356
358
|
thread_id=None,
|
|
357
359
|
flags=None,
|
|
358
360
|
):
|
|
359
|
-
if
|
|
360
|
-
raise ValueError("
|
|
361
|
+
if max_iterations < 0:
|
|
362
|
+
raise ValueError("max_iterations must be >= 0")
|
|
361
363
|
self.prompt = prompt
|
|
362
|
-
self.
|
|
364
|
+
self.max_iterations = max_iterations
|
|
363
365
|
self.cwd = cwd
|
|
364
366
|
self.last_output = None
|
|
365
367
|
self.last_check_output = None
|
|
@@ -368,8 +370,10 @@ class Task:
|
|
|
368
370
|
self._yolo = yolo
|
|
369
371
|
self._flags = flags
|
|
370
372
|
self._progress_enabled = False
|
|
373
|
+
self._progress_updates = False
|
|
371
374
|
self._progress_bar = None
|
|
372
375
|
self._progress_total = None
|
|
376
|
+
self._progress_start = None
|
|
373
377
|
self.agent = Agent(
|
|
374
378
|
cwd,
|
|
375
379
|
yolo,
|
|
@@ -463,8 +467,9 @@ class Task:
|
|
|
463
467
|
# If this fails in the middle we will still try to tear down
|
|
464
468
|
self.set_up()
|
|
465
469
|
|
|
470
|
+
progress_updates = progress or self._progress_updates
|
|
466
471
|
self._progress_enabled = progress
|
|
467
|
-
if
|
|
472
|
+
if progress_updates:
|
|
468
473
|
remaining, _summary = estimate(
|
|
469
474
|
self.prompt,
|
|
470
475
|
"",
|
|
@@ -476,15 +481,17 @@ class Task:
|
|
|
476
481
|
)
|
|
477
482
|
self._progress_total = remaining
|
|
478
483
|
start_time = time.monotonic()
|
|
484
|
+
self._progress_start = start_time
|
|
479
485
|
self.on_progress(
|
|
480
486
|
0,
|
|
481
|
-
self.
|
|
487
|
+
self.max_iterations,
|
|
482
488
|
self._progress_total,
|
|
483
489
|
remaining,
|
|
484
490
|
None,
|
|
485
491
|
)
|
|
486
492
|
else:
|
|
487
493
|
start_time = time.monotonic()
|
|
494
|
+
self._progress_start = start_time
|
|
488
495
|
|
|
489
496
|
# Start with the initial prompt
|
|
490
497
|
output = self.agent(self.prompt)
|
|
@@ -492,16 +499,16 @@ class Task:
|
|
|
492
499
|
if debug:
|
|
493
500
|
_logger.debug("Initial output: %s", output)
|
|
494
501
|
|
|
495
|
-
# Try correcting it up to
|
|
502
|
+
# Try correcting it up to max_iterations times
|
|
496
503
|
error = None
|
|
497
|
-
|
|
504
|
+
iteration = 0
|
|
498
505
|
while True:
|
|
499
|
-
|
|
506
|
+
iteration += 1
|
|
500
507
|
error = self.check(self.last_output)
|
|
501
508
|
if debug:
|
|
502
509
|
_logger.debug("Check error: %s", error)
|
|
503
510
|
|
|
504
|
-
if
|
|
511
|
+
if progress_updates:
|
|
505
512
|
check_output = self.last_check_output
|
|
506
513
|
if self.check_skipped:
|
|
507
514
|
check_output = "Verification skipped."
|
|
@@ -520,18 +527,18 @@ class Task:
|
|
|
520
527
|
self._progress_total = total_estimate
|
|
521
528
|
elapsed = _format_elapsed(time.monotonic() - start_time)
|
|
522
529
|
status_prefix = (
|
|
523
|
-
f"[{_format_turns(
|
|
530
|
+
f"[{_format_turns(iteration, self.max_iterations)} @ {elapsed}]"
|
|
524
531
|
)
|
|
525
532
|
is_final = not error or (
|
|
526
|
-
self.
|
|
533
|
+
self.max_iterations and iteration >= self.max_iterations
|
|
527
534
|
)
|
|
528
535
|
if is_final:
|
|
529
536
|
marker = "✅" if not error else "❌"
|
|
530
537
|
summary = f"{marker} {summary}".strip()
|
|
531
538
|
status_line = f"{status_prefix}: {summary}".rstrip()
|
|
532
539
|
self.on_progress(
|
|
533
|
-
|
|
534
|
-
self.
|
|
540
|
+
iteration,
|
|
541
|
+
self.max_iterations,
|
|
535
542
|
total_estimate,
|
|
536
543
|
remaining,
|
|
537
544
|
status_line,
|
|
@@ -543,20 +550,20 @@ class Task:
|
|
|
543
550
|
result = TaskResult(
|
|
544
551
|
True,
|
|
545
552
|
summary,
|
|
546
|
-
|
|
553
|
+
iteration,
|
|
547
554
|
None,
|
|
548
555
|
self.agent.thread_id,
|
|
549
556
|
)
|
|
550
557
|
self.on_success(result)
|
|
551
558
|
return result
|
|
552
|
-
if self.
|
|
559
|
+
if self.max_iterations and iteration >= self.max_iterations:
|
|
553
560
|
summary = self.agent(self.failure_prompt(error))
|
|
554
561
|
if debug:
|
|
555
562
|
_logger.debug("Failure summary: %s", summary)
|
|
556
563
|
result = TaskResult(
|
|
557
564
|
False,
|
|
558
565
|
summary,
|
|
559
|
-
|
|
566
|
+
iteration,
|
|
560
567
|
error,
|
|
561
568
|
self.agent.thread_id,
|
|
562
569
|
)
|
|
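The cap semantics used in these hunks (`self.max_iterations and iteration >= self.max_iterations`) can be sketched as a stand-alone loop. This is a simplification: `run_with_cap` is a hypothetical name, the `check` callable stands in for the checker, and the step that re-prompts the agent between iterations is omitted.

```python
def run_with_cap(check, max_iterations):
    """Call check(iteration) until it returns no error or the cap is reached.

    Returns (success, iterations). A cap of 0 means unlimited.
    """
    if max_iterations < 0:
        raise ValueError("max_iterations must be >= 0")
    iteration = 0
    while True:
        iteration += 1
        error = check(iteration)
        if not error:
            return True, iteration
        # A cap of 0 is falsy, so this condition never fires and the loop continues.
        if max_iterations and iteration >= max_iterations:
            return False, iteration


succeeded = run_with_cap(lambda i: None if i == 3 else "still failing", 10)
exhausted = run_with_cap(lambda i: "still failing", 2)
unlimited = run_with_cap(lambda i: None if i == 25 else "still failing", 0)
```

With a cap of 0 the loop ends only when the check passes, so an unlimited run depends on the checker eventually succeeding.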
@@ -580,7 +587,7 @@ class AutoTask(Task):
 self,
 prompt,
 check=None,
-
+max_iterations=DEFAULT_MAX_ITERATIONS,
 cwd=None,
 yolo=True,
 thread_id=None,
@@ -592,9 +599,9 @@ class AutoTask(Task):
 ):
 if not (check is None or check is False or isinstance(check, str)):
 raise TypeError("check must be a string or False")
-if
-raise ValueError("
-super().__init__(prompt,
+if max_iterations < 0:
+raise ValueError("max_iterations must be >= 0")
+super().__init__(prompt, max_iterations, cwd, yolo, thread_id, flags)
 self.check_text = check
 self._set_up = _validate_hook("set_up", set_up)
 self._tear_down = _validate_hook("tear_down", tear_down)
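The `AutoTask` constructor hunk above accepts `check` as `None`, `False`, or a string, and rejects a negative `max_iterations` while allowing 0. A small sketch that isolates those two guards so the accepted values are easy to see; `validate_auto_task_args` is a hypothetical helper name used only for illustration, and the real class does this inline in `__init__`.

```python
def validate_auto_task_args(check, max_iterations):
    """Mirror the two argument guards visible in the AutoTask hunk."""
    # None selects the default checker, False skips it, a str is a custom prompt.
    if not (check is None or check is False or isinstance(check, str)):
        raise TypeError("check must be a string or False")
    # 0 is allowed (it means unlimited); only negative caps are rejected.
    if max_iterations < 0:
        raise ValueError("max_iterations must be >= 0")


if __name__ == "__main__":
    candidates = [(None, 10), (False, 0), ("Run the tests", 3), (True, 10), (None, -1)]
    for check, cap in candidates:
        try:
            validate_auto_task_args(check, cap)
            print(f"check={check!r}, max_iterations={cap}: accepted")
        except (TypeError, ValueError) as exc:
            print(f"check={check!r}, max_iterations={cap}: {type(exc).__name__}: {exc}")
```

Note that `True` is rejected: it is neither `None`, `False`, nor a string.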
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: codexapi
-Version: 0.5.
+Version: 0.5.8
 Summary: Minimal Python API for running the Codex CLI.
 License: MIT
 Keywords: codex,agent,cli,openai
@@ -79,9 +79,19 @@ Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
 When using `--item`, the task file must include at least one `{{item}}` placeholder.

 Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
-Use `max_iterations` in the task file to override the default
+Use `max_iterations` in the task file to override the default iteration cap (0 means unlimited).
 Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.

+Take tasks from a GitHub Project (requires `gh-task`):
+
+```bash
+codexapi task -p owner/projects/3 -n "Your Name" -s Backlog task_a.yaml task_b.yaml
+```
+
+Task labels are derived from task filenames (basename without extension). The
+issue title/body become `{{item}}` after removing any existing `## Progress`
+section.
+
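The added README text states two rules: a task's label is its filename without directory or extension, and the issue title/body become `{{item}}` once any `## Progress` section is removed. A rough sketch of both rules follows. It does not reflect the new `gh_integration.py`, whose code is not shown in this part of the diff; the helper names are hypothetical, and two details are assumptions: that a `## Progress` section runs to the next `## ` heading or the end of the body, and that title and body are joined with a blank line.

```python
import re
from pathlib import Path

# Assumption: the section runs from a "## Progress" heading line up to the
# next level-2 heading, or to the end of the text if there is none.
_PROGRESS_SECTION = re.compile(
    r"^## Progress[ \t]*$.*?(?=^## |\Z)",
    re.MULTILINE | re.DOTALL,
)


def task_label(task_file: str) -> str:
    """Basename without extension, e.g. 'tasks/task_a.yaml' gives 'task_a'."""
    return Path(task_file).stem


def strip_progress(body: str) -> str:
    """Drop any '## Progress' section from an issue body."""
    return _PROGRESS_SECTION.sub("", body).strip()


def issue_to_item(title: str, body: str) -> str:
    """Assumed layout for the {{item}} text: title, blank line, cleaned body."""
    return f"{title}\n\n{strip_progress(body)}".strip()


if __name__ == "__main__":
    body = "Login fails on Safari.\n\n## Progress\n- tried clearing cookies\n"
    print(task_label("tasks/task_a.yaml"))
    print(issue_to_item("Fix login", body))
```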
 Example task progress run:

 ```bash
@@ -157,10 +167,10 @@ the same conversation and returns only the agent's message.
 ### `task(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> str`

 Runs a task with checker-driven retries and returns the success summary.
-Raises `TaskFailed` when the maximum
+Raises `TaskFailed` when the maximum iterations are reached.

 - `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
-- `max_iterations` (int): maximum number of task
+- `max_iterations` (int): maximum number of task iterations (0 means unlimited).
 - `progress` (bool): show a tqdm progress bar with a one-line status after each round.
 - `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.

@@ -170,7 +180,7 @@ Runs a task with checker-driven retries and returns a `TaskResult` without
 raising `TaskFailed`.
 Arguments mirror `task()` (including hooks).

-### `Task(prompt,
+### `Task(prompt, max_iterations=10, cwd=None, yolo=True, thread_id=None, flags=None)`

 Runs a Codex task with checker-driven retries. Subclass it and implement
 `check()` to return an error string when the task is incomplete, or return
@@ -185,22 +195,22 @@ default check prompt and includes the agent output.
 - `on_success(result)`: optional success hook.
 - `on_failure(result)`: optional failure hook.

-### `TaskResult(success, summary,
+### `TaskResult(success, summary, iterations, errors, thread_id)`

 Simple result object returned by `Task.__call__`.

 - `success` (bool): whether the task completed successfully.
 - `summary` (str): agent summary of what happened.
-- `
+- `iterations` (int): how many iterations were used.
 - `errors` (str | None): last checker error, if any.
 - `thread_id` (str | None): Codex thread id for the session.

 ### `TaskFailed`

-Exception raised by `task()` when
+Exception raised by `task()` when iterations are exhausted.

 - `summary` (str): failure summary text.
-- `
+- `iterations` (int | None): iterations made when the task failed.
 - `errors` (str | None): last checker error, if any.

 ### `foreach(list_file, task_file, n=None, cwd=None, yolo=True, flags=None) -> ForeachResult`
File without changes