codexapi 0.5.4__tar.gz → 0.5.6__tar.gz
- {codexapi-0.5.4/src/codexapi.egg-info → codexapi-0.5.6}/PKG-INFO +13 -3
- {codexapi-0.5.4 → codexapi-0.5.6}/README.md +12 -2
- {codexapi-0.5.4 → codexapi-0.5.6}/pyproject.toml +1 -1
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/__init__.py +1 -1
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/cli.py +68 -6
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/task.py +149 -72
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/taskfile.py +11 -0
- {codexapi-0.5.4 → codexapi-0.5.6/src/codexapi.egg-info}/PKG-INFO +13 -3
- {codexapi-0.5.4 → codexapi-0.5.6}/LICENSE +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/setup.cfg +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/__main__.py +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/agent.py +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/foreach.py +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/ralph.py +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi.egg-info/SOURCES.txt +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi.egg-info/dependency_links.txt +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi.egg-info/entry_points.txt +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi.egg-info/requires.txt +0 -0
- {codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: codexapi
-Version: 0.5.
+Version: 0.5.6
 Summary: Minimal Python API for running the Codex CLI.
 License: MIT
 Keywords: codex,agent,cli,openai
@@ -68,18 +68,26 @@ codexapi run --cwd /path/to/project "Fix the failing tests."
 echo "Say hello." | codexapi run
 ```
 
-`codexapi task` exits with code 0 on success and 1 on failure
+`codexapi task` exits with code 0 on success and 1 on failure.
 
 ```bash
 codexapi task "Fix the failing tests." --max-iterations 5
 codexapi task -f task.yaml
+codexapi task -f task.yaml -i README.md
 ```
 Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
+When using `--item`, the task file must include at least one `{{item}}` placeholder.
 
 Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
 Use `max_iterations` in the task file to override the default attempt cap (0 means unlimited).
 Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
 
+Example task progress run:
+
+```bash
+./examples/example_task_progress.sh
+```
+
 Show running sessions and their latest activity:
 
 ```bash
@@ -120,6 +128,8 @@ Run a task file across a list file:
 ```bash
 codexapi foreach list.txt task.yaml
 codexapi foreach list.txt task.yaml -n 4
+codexapi foreach list.txt task.yaml --retry-failed
+codexapi foreach list.txt task.yaml --retry-all
 ```
 
 ## API
@@ -151,7 +161,7 @@ Raises `TaskFailed` when the maximum attempts are reached.
 
 - `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
 - `max_iterations` (int): maximum number of task attempts (0 means unlimited).
-- `progress` (bool):
+- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
 - `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
 
 ### `task_result(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> TaskResult`
@@ -54,18 +54,26 @@ codexapi run --cwd /path/to/project "Fix the failing tests."
 echo "Say hello." | codexapi run
 ```
 
-`codexapi task` exits with code 0 on success and 1 on failure
+`codexapi task` exits with code 0 on success and 1 on failure.
 
 ```bash
 codexapi task "Fix the failing tests." --max-iterations 5
 codexapi task -f task.yaml
+codexapi task -f task.yaml -i README.md
 ```
 Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
+When using `--item`, the task file must include at least one `{{item}}` placeholder.
 
 Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
 Use `max_iterations` in the task file to override the default attempt cap (0 means unlimited).
 Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
 
+Example task progress run:
+
+```bash
+./examples/example_task_progress.sh
+```
+
 Show running sessions and their latest activity:
 
 ```bash
@@ -106,6 +114,8 @@ Run a task file across a list file:
 ```bash
 codexapi foreach list.txt task.yaml
 codexapi foreach list.txt task.yaml -n 4
+codexapi foreach list.txt task.yaml --retry-failed
+codexapi foreach list.txt task.yaml --retry-all
 ```
 
 ## API
@@ -137,7 +147,7 @@ Raises `TaskFailed` when the maximum attempts are reached.
 
 - `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
 - `max_iterations` (int): maximum number of task attempts (0 means unlimited).
-- `progress` (bool):
+- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
 - `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
 
 ### `task_result(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> TaskResult`
@@ -15,7 +15,7 @@ from .agent import Agent, agent
 from .foreach import foreach
 from .ralph import cancel_ralph_loop, run_ralph_loop
 from .task import DEFAULT_MAX_ITERATIONS, TaskFailed, task
-from .taskfile import TaskFile
+from .taskfile import TaskFile, load_task_file, task_def_uses_item
 
 _SESSION_ID_RE = re.compile(
     r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
@@ -62,6 +62,7 @@ _COLUMN_TITLES = {
     "perm": "PERM",
     "cwd": "CWD",
 }
+_FOREACH_STATUS_MARKERS = {"⏳", "✅", "❌"}
 
 
 def _read_prompt(prompt):
@@ -871,6 +872,37 @@ def _print_top_once(show):
         print(_format_session(session, layout))
 
 
+def _clean_foreach_list(path, retry_failed, retry_all):
+    with open(path, "r", encoding="utf-8") as handle:
+        data = handle.read()
+    ends_with_newline = data.endswith("\n")
+    lines = data.splitlines()
+
+    cleaned = []
+    changed = False
+    for line in lines:
+        new_line = line
+        if retry_all or (retry_failed and new_line.startswith("❌")):
+            if new_line and new_line[0] in _FOREACH_STATUS_MARKERS:
+                new_line = new_line[1:]
+                if new_line.startswith(" "):
+                    new_line = new_line[1:]
+            pipe = new_line.find("|")
+            if pipe != -1:
+                new_line = new_line[:pipe].rstrip()
+        if new_line != line:
+            changed = True
+        cleaned.append(new_line)
+
+    if not changed:
+        return
+    text = "\n".join(cleaned)
+    if ends_with_newline:
+        text += "\n"
+    with open(path, "w", encoding="utf-8") as handle:
+        handle.write(text)
+
+
 def _run_top(argv):
     if argv and argv[0] in ("-h", "--help"):
         print("usage: codexapi top")
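The list-cleaning logic added in this hunk may be easier to follow as a pure function over text. The following is a minimal sketch, not the package's code: `reset_markers` and the sample list format (`marker item | note`) are hypothetical, and the nesting of the conditions is an assumption because the extracted diff does not preserve indentation.

```python
# Status markers that may prefix a processed line (mirrors _FOREACH_STATUS_MARKERS).
STATUS_MARKERS = {"⏳", "✅", "❌"}


def reset_markers(text, retry_failed=False, retry_all=False):
    """Return text with markers and '| ...' notes stripped from the lines being reset."""
    cleaned = []
    for line in text.splitlines():
        new_line = line
        if retry_all or (retry_failed and new_line.startswith("❌")):
            if new_line and new_line[0] in STATUS_MARKERS:
                # Drop the marker and the single space after it (assumed nesting).
                new_line = new_line[1:]
                if new_line.startswith(" "):
                    new_line = new_line[1:]
            # Drop everything from the first pipe onward (assumed to be a status note).
            pipe = new_line.find("|")
            if pipe != -1:
                new_line = new_line[:pipe].rstrip()
        cleaned.append(new_line)
    result = "\n".join(cleaned)
    # Preserve the trailing newline of the input, as the diff does.
    if text.endswith("\n"):
        result += "\n"
    return result


sample = "✅ a.py | done\n❌ b.py | tests failed\nc.py\n"
print(reset_markers(sample, retry_failed=True))
print(reset_markers(sample, retry_all=True))
```

With `retry_failed=True` only the line starting with `❌` is reset; with `retry_all=True` every marked line is reset and unmarked lines pass through unchanged.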
@@ -995,6 +1027,11 @@ def main(argv=None):
         "--task-file",
         help="YAML task file to run.",
     )
+    task_parser.add_argument(
+        "-i",
+        "--item",
+        help="Item value for task files that use {{item}} placeholders.",
+    )
     task_parser.add_argument(
         "prompt",
         nargs="?",
@@ -1148,6 +1185,17 @@ def main(argv=None):
         "task_file",
         help="Path to the YAML task file.",
     )
+    foreach_retry_group = foreach_parser.add_mutually_exclusive_group()
+    foreach_retry_group.add_argument(
+        "--retry-failed",
+        action="store_true",
+        help="Reset failed (❌) items for re-run.",
+    )
+    foreach_retry_group.add_argument(
+        "--retry-all",
+        action="store_true",
+        help="Reset all items for re-run.",
+    )
     foreach_parser.add_argument(
         "-n",
         type=int,
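The two retry flags are registered in an `argparse` mutually exclusive group, so passing both is rejected at parse time. A small standalone sketch of that behavior, using a hypothetical parser named `foreach-sketch` rather than the real `codexapi` parser:

```python
import argparse

# Hypothetical parser; only the two retry flags from the diff are modeled.
parser = argparse.ArgumentParser(prog="foreach-sketch")
retry_group = parser.add_mutually_exclusive_group()
retry_group.add_argument("--retry-failed", action="store_true")
retry_group.add_argument("--retry-all", action="store_true")

args = parser.parse_args(["--retry-failed"])
print(args.retry_failed, args.retry_all)

# Passing both flags makes argparse report a usage error and exit with status 2.
try:
    parser.parse_args(["--retry-failed", "--retry-all"])
except SystemExit as exc:
    print("rejected with exit code", exc.code)
```

Because the group is not marked `required`, omitting both flags is also valid and leaves both attributes `False`.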
@@ -1181,6 +1229,12 @@ def main(argv=None):
     if args.command == "foreach":
         if args.n is not None and args.n < 1:
             raise SystemExit("-n must be >= 1.")
+        if args.retry_failed or args.retry_all:
+            _clean_foreach_list(
+                args.list_file,
+                args.retry_failed,
+                args.retry_all,
+            )
         result = foreach(
             args.list_file,
             args.task_file,
@@ -1225,20 +1279,25 @@ def main(argv=None):
     if args.command == "task" and args.task_file:
         if args.prompt:
             raise SystemExit("task -f does not take a prompt.")
+        if args.item is not None:
+            task_def = load_task_file(args.task_file)
+            if not task_def_uses_item(task_def):
+                raise SystemExit(
+                    "task -f --item requires {{item}} in the task file."
+                )
         if args.check is not None:
             raise SystemExit("--check is not allowed with -f.")
         if args.max_iterations is not None:
             raise SystemExit("--max-iterations is not allowed with -f.")
         task_runner = TaskFile(
             args.task_file,
-
+            args.item,
             cwd=args.cwd,
             yolo=args.yolo,
             thread_id=None,
             flags=args.flags,
         )
         result = task_runner(progress=not args.quiet)
-        print(result.summary)
         if not result.success:
             raise SystemExit(1)
         return
@@ -1250,6 +1309,7 @@ def main(argv=None):
         prompt_source = args.task
     prompt = _read_prompt(prompt_source)
     exit_code = 0
+    message = None
 
     if args.command == "ralph":
         if args.max_iterations < 0:
@@ -1279,13 +1339,15 @@ def main(argv=None):
         )
         return
     if args.command == "task":
+        if args.item is not None:
+            raise SystemExit("--item is only supported with -f.")
         if args.max_iterations is None:
             args.max_iterations = DEFAULT_MAX_ITERATIONS
         if args.max_iterations < 0:
             raise SystemExit("--max-iterations must be >= 0.")
         check = args.check
         try:
-
+            task(
                 prompt,
                 check,
                 args.max_iterations,
@@ -1295,7 +1357,6 @@ def main(argv=None):
                 not args.quiet,
             )
         except TaskFailed as exc:
-            message = exc.summary
             exit_code = 1
     else:
         use_session = args.thread_id or args.print_thread_id
@@ -1312,7 +1373,8 @@ def main(argv=None):
         else:
             message = agent(prompt, args.cwd, args.yolo, args.flags)
 
-
+    if message is not None:
+        print(message)
     if exit_code:
         raise SystemExit(exit_code)
 
@@ -5,6 +5,7 @@ import logging
 import time
 
 from .agent import Agent, agent
+from tqdm import tqdm
 
 _logger = logging.getLogger(__name__)
 
@@ -20,11 +21,13 @@ _CHECK_PREFIX = (
     "Set success to true only if everything matches the intent."
 )
 _CHECK_SUFFIX = "JSON only. No markdown or extra text."
-
-    "
-    "
-    "
-    "
+_ESTIMATE_PROMPT = (
+    "Estimate remaining work in story points for the task below.\n"
+    "You may inspect the repo (read files, git status/diff), but do not run tests.\n"
+    "Do not change any files.\n"
+    "Use the task prompt, current repo state, and latest agent/check outputs.\n"
+    "Return only JSON with keys: remaining (number) and summary (string).\n"
+    "summary must be a single line describing agent + verifier status."
 )
 DEFAULT_MAX_ITERATIONS = 10
 
@@ -62,14 +65,32 @@ def _resolve_check_text(prompt, check):
     return check, False
 
 
-def
-
-
-
-
-"
-
+def _build_estimate_prompt(prompt, agent_output, check_output, previous_total):
+    agent_text = agent_output.strip() or "(no agent output yet)"
+    check_text = check_output.strip() or "(no check output yet)"
+    lines = [
+        _ESTIMATE_PROMPT,
+        "",
+        "TASK:",
+        "```",
+        prompt,
+        "```",
+    ]
+    if previous_total is not None:
+        lines.append(
+            f"This task was previously estimated at about {previous_total} story points."
+        )
+    lines.extend(
+        [
+            "",
+            "AGENT OUTPUT:",
+            agent_text,
+            "",
+            "CHECK OUTPUT:",
+            check_text,
+        ]
     )
+    return "\n".join(lines)
 
 
 def _check_result(output):
@@ -91,25 +112,29 @@ def _check_result(output):
     return success, reason.strip()
 
 
-def
+def _estimate_result(output):
     try:
         data = json.loads(output)
     except json.JSONDecodeError as exc:
         raise RuntimeError(
-            f"
+            f"Estimate returned invalid JSON: {exc}"
         ) from exc
 
     if not isinstance(data, dict):
-        raise RuntimeError("
+        raise RuntimeError("Estimate JSON must be an object.")
+
+    remaining = data.get("remaining")
+    summary = data.get("summary")
+    if not isinstance(remaining, (int, float)):
+        raise RuntimeError("Estimate JSON missing numeric 'remaining'.")
+    if not isinstance(summary, str):
+        raise RuntimeError("Estimate JSON missing string 'summary'.")
 
-
-
-
-        raise RuntimeError("Progress summary JSON missing string 'agent'.")
-    if not isinstance(check_summary, str):
-        raise RuntimeError("Progress summary JSON missing string 'check'.")
+    remaining = int(round(remaining))
+    if remaining < 0:
+        remaining = 0
 
-    return
+    return remaining, _single_line(summary)
 
 
 def _single_line(text):
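The validation steps in `_estimate_result` can be exercised in isolation. The sketch below is a standalone approximation: `parse_estimate` is a hypothetical name, and the whitespace collapsing stands in for `_single_line`, whose full body is not visible in this diff.

```python
import json


def parse_estimate(output):
    """Validate estimator output and return (remaining, summary)."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Estimate returned invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise RuntimeError("Estimate JSON must be an object.")
    remaining = data.get("remaining")
    summary = data.get("summary")
    if not isinstance(remaining, (int, float)):
        raise RuntimeError("Estimate JSON missing numeric 'remaining'.")
    if not isinstance(summary, str):
        raise RuntimeError("Estimate JSON missing string 'summary'.")
    # Round to whole story points and clamp negatives to zero.
    remaining = max(0, int(round(remaining)))
    # Collapse all whitespace so the summary fits on one status line.
    return remaining, " ".join(summary.split())


print(parse_estimate('{"remaining": 2.6, "summary": "tests\\nstill failing"}'))
```

Fractional estimates are rounded rather than truncated, so `2.6` becomes `3`, and a multi-line summary is flattened to a single line.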
@@ -118,56 +143,36 @@ def _single_line(text):
     return " ".join(text.replace("\r", " ").split())
 
 
-def
+def _format_elapsed(seconds):
     if seconds < 0:
         seconds = 0
     seconds = int(round(seconds))
     hours, remainder = divmod(seconds, 3600)
     minutes, seconds = divmod(remainder, 60)
-
-    if hours:
-        parts.append(f"{hours}h")
-    if minutes or hours:
-        parts.append(f"{minutes}m")
-    if not hours:
-        parts.append(f"{seconds}s")
-    return " ".join(parts)
-
-
-def _print_progress(
-    attempt,
-    total,
-    start_time,
-    agent_output,
-    check_output,
-    cwd,
-    yolo,
-    flags,
-):
-    elapsed = time.monotonic() - start_time
-    remaining = 0
-    remaining_text = "unknown"
-    if total:
-        if attempt:
-            remaining = (elapsed / attempt) * (total - attempt)
-        remaining_text = _format_duration(remaining)
+    return f"{hours}h{minutes:02d}m{seconds:02d}s"
 
-    summary_prompt = _build_progress_prompt(agent_output, check_output)
-    summary = agent(summary_prompt, cwd, yolo, flags)
-    agent_summary, check_summary = _progress_result(summary)
 
-
-    if
-
+def _format_turns(attempt, total):
+    if total:
+        width = max(2, len(str(total)))
+        total_text = str(total)
     else:
-
-
-
-
+        width = 2
+        total_text = "∞"
+    attempt_text = f"{attempt:0{width}d}"
+    return f"{attempt_text}/{total_text}"
+
+
+def estimate(prompt, agent_output, check_output, cwd, yolo, flags, previous_total):
+    estimate_prompt = _build_estimate_prompt(
+        prompt,
+        agent_output or "",
+        check_output or "",
+        previous_total,
     )
-
-
-
+    output = agent(estimate_prompt, cwd, yolo, flags)
+    return _estimate_result(output)
+
 
 def _fix_prompt(error):
     return (
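The two new formatting helpers are small enough to try directly. This sketch restates them under hypothetical public names (`format_elapsed`, `format_turns`) so the formats are concrete; the logic follows the added lines above.

```python
def format_elapsed(seconds):
    """Format a duration as <hours>h<MM>m<SS>s, treating negatives as zero."""
    seconds = max(0, int(round(seconds)))
    hours, remainder = divmod(seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h{minutes:02d}m{seconds:02d}s"


def format_turns(attempt, total):
    """Format 'attempt/total', zero-padded; a falsy total means unlimited."""
    if total:
        width = max(2, len(str(total)))
        total_text = str(total)
    else:
        width = 2
        total_text = "∞"
    return f"{attempt:0{width}d}/{total_text}"


print(format_elapsed(3725.4))  # 1h02m05s
print(format_turns(3, 10))     # 03/10
print(format_turns(7, 0))      # 07/∞
```

Padding the attempt number to the width of the total keeps successive status lines aligned as the turn count grows.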
@@ -234,7 +239,7 @@ def task(
         cwd: Optional working directory for the Codex session.
         yolo: Whether to pass --yolo to Codex.
         flags: Additional raw CLI flags to pass to Codex.
-        progress: Whether to
+        progress: Whether to show a tqdm progress bar with status updates.
         set_up: Optional setup prompt to run before the task.
         tear_down: Optional cleanup prompt to run after the task.
         on_success: Optional prompt to run after a successful task.
@@ -280,7 +285,7 @@ def task_result(
     """Run a prompt with optional checker-driven retries and return TaskResult.
 
     The runner keeps a single session. Each verification attempt uses a fresh,
-    stateless agent call. When progress is True,
+    stateless agent call. When progress is True, show progress updates each round.
 
     Hook strings mirror task file keys: set_up, tear_down, on_success, on_failure.
     """
@@ -362,6 +367,9 @@ class Task:
         self.check_text = None
         self._yolo = yolo
         self._flags = flags
+        self._progress_enabled = False
+        self._progress_bar = None
+        self._progress_total = None
         self.agent = Agent(
             cwd,
             yolo,
@@ -403,6 +411,30 @@ class Task:
     def on_failure(self, result):
         """Hook called after a failed run, e.g. log the failure reason."""
 
+    def on_progress(
+        self,
+        turns,
+        max_turns,
+        total_estimate,
+        remaining_estimate,
+        status_line,
+    ):
+        """Hook called with progress updates."""
+        if not self._progress_enabled:
+            return
+        if self._progress_bar is None:
+            self._progress_bar = tqdm(total=total_estimate)
+        if total_estimate != self._progress_bar.total:
+            self._progress_bar.total = total_estimate
+        current = total_estimate - remaining_estimate
+        if current < 0:
+            current = 0
+        if self._progress_bar.n != current:
+            self._progress_bar.n = current
+        self._progress_bar.refresh()
+        if status_line:
+            tqdm.write(status_line, file=self._progress_bar.fp)
+
     def fix_prompt(self, error):
         """Build a prompt that asks the agent to fix checker failures."""
         return (
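The arithmetic inside `on_progress` (completed work is the total estimate minus the remaining estimate, floored at zero) can be illustrated without tqdm. In this sketch `FakeBar` and `apply_progress` are hypothetical stand-ins that only track the `total` and `n` attributes the hook assigns; no drawing is done.

```python
class FakeBar:
    """Hypothetical stand-in for a tqdm bar; it only tracks total and n."""

    def __init__(self, total):
        self.total = total
        self.n = 0


def apply_progress(bar, total_estimate, remaining_estimate):
    """Mirror the hook's arithmetic: n = total - remaining, never below zero."""
    if total_estimate != bar.total:
        bar.total = total_estimate
    bar.n = max(0, total_estimate - remaining_estimate)
    return bar.n, bar.total


bar = FakeBar(total=8)
print(apply_progress(bar, 8, 8))   # nothing done yet
print(apply_progress(bar, 8, 3))   # five points completed
print(apply_progress(bar, 12, 9))  # estimate grew, so the bar moves backwards
```

Because the position is recomputed from estimates each round instead of being incremented, the bar can move backwards when a later estimate is less optimistic.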
@@ -425,12 +457,35 @@ class Task:
     def __call__(self, debug=False, progress=False):
         """Run the task with checker-driven retries.
         If debug is True, log debug messages.
-        If progress is True,
+        If progress is True, show a tqdm progress bar with status updates.
         """
         try:
             # If this fails in the middle we will still try to tear down
             self.set_up()
 
+            self._progress_enabled = progress
+            if progress:
+                remaining, _summary = estimate(
+                    self.prompt,
+                    "",
+                    "",
+                    self.cwd,
+                    self._yolo,
+                    self._flags,
+                    None,
+                )
+                self._progress_total = remaining
+                start_time = time.monotonic()
+                self.on_progress(
+                    0,
+                    self.max_attempts,
+                    self._progress_total,
+                    remaining,
+                    None,
+                )
+            else:
+                start_time = time.monotonic()
+
             # Start with the initial prompt
             output = self.agent(self.prompt)
             self.last_output = output
@@ -438,7 +493,6 @@ class Task:
                 _logger.debug("Initial output: %s", output)
 
             # Try correcting it up to max_attempts times
-            start_time = time.monotonic()
             error = None
             attempt = 0
             while True:
@@ -451,15 +505,36 @@ class Task:
                     check_output = self.last_check_output
                     if self.check_skipped:
                         check_output = "Verification skipped."
-
-
-                    self.
-                        start_time,
-                        self.last_output,
+                    remaining, summary = estimate(
+                        self.prompt,
+                        self.last_output or "",
                         check_output or "",
                         self.cwd,
                         self._yolo,
                         self._flags,
+                        self._progress_total,
+                    )
+                    total_estimate = self._progress_total
+                    if total_estimate is None or remaining > total_estimate:
+                        total_estimate = remaining
+                        self._progress_total = total_estimate
+                    elapsed = _format_elapsed(time.monotonic() - start_time)
+                    status_prefix = (
+                        f"[{_format_turns(attempt, self.max_attempts)} @ {elapsed}]"
+                    )
+                    is_final = not error or (
+                        self.max_attempts and attempt >= self.max_attempts
+                    )
+                    if is_final:
+                        marker = "✅" if not error else "❌"
+                        summary = f"{marker} {summary}".strip()
+                    status_line = f"{status_prefix}: {summary}".rstrip()
+                    self.on_progress(
+                        attempt,
+                        self.max_attempts,
+                        total_estimate,
+                        remaining,
+                        status_line,
                     )
                 if not error:
                     summary = self.agent(self.success_prompt())
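Two pieces of the per-round logic above are easy to check in isolation: the total estimate only ever grows, and a status marker is added only on the final round. A rough sketch with hypothetical helper names (`update_total`, `status_line`); the turn and elapsed strings are passed in preformatted to keep it self-contained.

```python
def update_total(previous_total, remaining):
    """Grow the total when a new remaining estimate exceeds it."""
    if previous_total is None or remaining > previous_total:
        return remaining
    return previous_total


def status_line(turns, elapsed, summary, error, attempt, max_attempts):
    """Build '[turns @ elapsed]: summary', marking the final round."""
    # A round is final when the check passed or the attempt cap is reached.
    is_final = not error or bool(max_attempts and attempt >= max_attempts)
    if is_final:
        marker = "✅" if not error else "❌"
        summary = f"{marker} {summary}".strip()
    return f"[{turns} @ {elapsed}]: {summary}".rstrip()


print(update_total(8, 10))
print(status_line("02/10", "0h02m00s", "2 tests failing", "check failed", 2, 10))
print(status_line("03/10", "0h04m12s", "tests pass", None, 3, 10))
```

With `max_attempts` set to 0 (unlimited), a failing round is never treated as final, so the `❌` marker would only appear when an attempt cap is configured.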
@@ -494,6 +569,8 @@ class Task:
         finally:
             # No matter what, once we have set_up we will always tear_down
             self.tear_down()
+            if self._progress_bar is not None:
+                self._progress_bar.close()
 
 
 class AutoTask(Task):
@@ -54,6 +54,17 @@ def _render(text, item):
     return text.replace(_ITEM_TOKEN, item)
 
 
+def task_def_uses_item(task_def):
+    """Return True if a task definition includes the {{item}} placeholder."""
+    if not isinstance(task_def, dict):
+        raise TypeError("task definition must be a dict")
+    for key in ("prompt", "set_up", "tear_down", "check", "on_success", "on_failure"):
+        value = task_def.get(key)
+        if isinstance(value, str) and _ITEM_TOKEN in value:
+            return True
+    return False
+
+
 class TaskFile(AutoTask):
     """Task subclass that maps a YAML task file onto Task hooks."""
 
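The placeholder check behind `--item` scans only the string-valued prompt keys of the task definition. A minimal standalone sketch, assuming `_ITEM_TOKEN` is the literal `{{item}}` (suggested by the README text but not shown in this diff); `uses_item` is a hypothetical name.

```python
ITEM_TOKEN = "{{item}}"  # assumed value of _ITEM_TOKEN
TEXT_KEYS = ("prompt", "set_up", "tear_down", "check", "on_success", "on_failure")


def uses_item(task_def):
    """Return True if any prompt-like key contains the item placeholder."""
    if not isinstance(task_def, dict):
        raise TypeError("task definition must be a dict")
    return any(
        isinstance(task_def.get(key), str) and ITEM_TOKEN in task_def[key]
        for key in TEXT_KEYS
    )


print(uses_item({"prompt": "Summarize {{item}}."}))
print(uses_item({"prompt": "Fix the tests.", "check": None}))
```

Non-string values such as `None` or `False` are skipped, so a task file with `check` disabled does not trip the check.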
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: codexapi
-Version: 0.5.
+Version: 0.5.6
 Summary: Minimal Python API for running the Codex CLI.
 License: MIT
 Keywords: codex,agent,cli,openai
@@ -68,18 +68,26 @@ codexapi run --cwd /path/to/project "Fix the failing tests."
 echo "Say hello." | codexapi run
 ```
 
-`codexapi task` exits with code 0 on success and 1 on failure
+`codexapi task` exits with code 0 on success and 1 on failure.
 
 ```bash
 codexapi task "Fix the failing tests." --max-iterations 5
 codexapi task -f task.yaml
+codexapi task -f task.yaml -i README.md
 ```
 Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
+When using `--item`, the task file must include at least one `{{item}}` placeholder.
 
 Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
 Use `max_iterations` in the task file to override the default attempt cap (0 means unlimited).
 Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
 
+Example task progress run:
+
+```bash
+./examples/example_task_progress.sh
+```
+
 Show running sessions and their latest activity:
 
 ```bash
@@ -120,6 +128,8 @@ Run a task file across a list file:
 ```bash
 codexapi foreach list.txt task.yaml
 codexapi foreach list.txt task.yaml -n 4
+codexapi foreach list.txt task.yaml --retry-failed
+codexapi foreach list.txt task.yaml --retry-all
 ```
 
 ## API
@@ -151,7 +161,7 @@ Raises `TaskFailed` when the maximum attempts are reached.
 
 - `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
 - `max_iterations` (int): maximum number of task attempts (0 means unlimited).
-- `progress` (bool):
+- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
 - `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
 
 ### `task_result(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> TaskResult`
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes
File without changes