PyPI - codexapi - Versions diffs - 0.5.4__tar.gz → 0.5.6__tar.gz - Mend

codexapi 0.5.4.tar.gz → 0.5.6.tar.gz


Files changed (19)

{codexapi-0.5.4/src/codexapi.egg-info → codexapi-0.5.6}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: codexapi
-Version: 0.5.4
+Version: 0.5.6
 Summary: Minimal Python API for running the Codex CLI.
 License: MIT
 Keywords: codex,agent,cli,openai
@@ -68,18 +68,26 @@ codexapi run --cwd /path/to/project "Fix the failing tests."
 echo "Say hello." | codexapi run
 ```
-`codexapi task` exits with code 0 on success and 1 on failure, printing the summary.
+`codexapi task` exits with code 0 on success and 1 on failure.
 ```bash
 codexapi task "Fix the failing tests." --max-iterations 5
 codexapi task -f task.yaml
+codexapi task -f task.yaml -i README.md
 ```
 Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
+When using `--item`, the task file must include at least one `{{item}}` placeholder.
 Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
 Use `max_iterations` in the task file to override the default attempt cap (0 means unlimited).
 Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
+Example task progress run:
+```bash
+./examples/example_task_progress.sh
+```
 Show running sessions and their latest activity:
 ```bash
@@ -120,6 +128,8 @@ Run a task file across a list file:
 ```bash
 codexapi foreach list.txt task.yaml
 codexapi foreach list.txt task.yaml -n 4
+codexapi foreach list.txt task.yaml --retry-failed
+codexapi foreach list.txt task.yaml --retry-all
 ```
 ## API
@@ -151,7 +161,7 @@ Raises `TaskFailed` when the maximum attempts are reached.
 - `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
 - `max_iterations` (int): maximum number of task attempts (0 means unlimited).
-- `progress` (bool): print progress after each verification round.
+- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
 - `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
 ### `task_result(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> TaskResult`

{codexapi-0.5.4 → codexapi-0.5.6}/README.md RENAMED

@@ -54,18 +54,26 @@ codexapi run --cwd /path/to/project "Fix the failing tests."
 echo "Say hello." | codexapi run
 ```
-`codexapi task` exits with code 0 on success and 1 on failure, printing the summary.
+`codexapi task` exits with code 0 on success and 1 on failure.
 ```bash
 codexapi task "Fix the failing tests." --max-iterations 5
 codexapi task -f task.yaml
+codexapi task -f task.yaml -i README.md
 ```
 Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
+When using `--item`, the task file must include at least one `{{item}}` placeholder.
 Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
 Use `max_iterations` in the task file to override the default attempt cap (0 means unlimited).
 Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
+Example task progress run:
+```bash
+./examples/example_task_progress.sh
+```
 Show running sessions and their latest activity:
 ```bash
@@ -106,6 +114,8 @@ Run a task file across a list file:
 ```bash
 codexapi foreach list.txt task.yaml
 codexapi foreach list.txt task.yaml -n 4
+codexapi foreach list.txt task.yaml --retry-failed
+codexapi foreach list.txt task.yaml --retry-all
 ```
 ## API
@@ -137,7 +147,7 @@ Raises `TaskFailed` when the maximum attempts are reached.
 - `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
 - `max_iterations` (int): maximum number of task attempts (0 means unlimited).
-- `progress` (bool): print progress after each verification round.
+- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
 - `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
 ### `task_result(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> TaskResult`

{codexapi-0.5.4 → codexapi-0.5.6}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 [project]
 name = "codexapi"
-version = "0.5.4"
+version = "0.5.6"
 description = "Minimal Python API for running the Codex CLI."
 readme = "README.md"
 requires-python = ">=3.8"

{codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/__init__.py RENAMED

@@ -15,4 +15,4 @@ __all__ = [
     "task",
     "task_result",
 ]
-__version__ = "0.5.4"
+__version__ = "0.5.6"

{codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/cli.py RENAMED

@@ -15,7 +15,7 @@ from .agent import Agent, agent
 from .foreach import foreach
 from .ralph import cancel_ralph_loop, run_ralph_loop
 from .task import DEFAULT_MAX_ITERATIONS, TaskFailed, task
-from .taskfile import TaskFile
+from .taskfile import TaskFile, load_task_file, task_def_uses_item
 _SESSION_ID_RE = re.compile(
     r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
@@ -62,6 +62,7 @@ _COLUMN_TITLES = {
     "perm": "PERM",
     "cwd": "CWD",
 }
+_FOREACH_STATUS_MARKERS = {"⏳", "✅", "❌"}
 def _read_prompt(prompt):
@@ -871,6 +872,37 @@ def _print_top_once(show):
         print(_format_session(session, layout))
+def _clean_foreach_list(path, retry_failed, retry_all):
+    with open(path, "r", encoding="utf-8") as handle:
+        data = handle.read()
+    ends_with_newline = data.endswith("\n")
+    lines = data.splitlines()
+    cleaned = []
+    changed = False
+    for line in lines:
+        new_line = line
+        if retry_all or (retry_failed and new_line.startswith("❌")):
+            if new_line and new_line[0] in _FOREACH_STATUS_MARKERS:
+                new_line = new_line[1:]
+                if new_line.startswith(" "):
+                    new_line = new_line[1:]
+            pipe = new_line.find("|")
+            if pipe != -1:
+                new_line = new_line[:pipe].rstrip()
+        if new_line != line:
+            changed = True
+        cleaned.append(new_line)
+    if not changed:
+        return
+    text = "\n".join(cleaned)
+    if ends_with_newline:
+        text += "\n"
+    with open(path, "w", encoding="utf-8") as handle:
+        handle.write(text)
 def _run_top(argv):
     if argv and argv[0] in ("-h", "--help"):
         print("usage: codexapi top")
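Note on the hunk above: the per-line behavior of `_clean_foreach_list` can be tried out in isolation. The following is a minimal sketch that copies the loop body from the diff into a standalone function; the name `clean_line` and the sample list lines (in a `<marker> <item> | <note>` shape) are assumptions for illustration, not part of the package.

```python
# Standalone sketch of the per-line reset logic from _clean_foreach_list.
# Assumption: list lines look like "<marker> <item> | <note>".
STATUS_MARKERS = {"⏳", "✅", "❌"}


def clean_line(line, retry_failed=False, retry_all=False):
    """Return the line with its status marker and trailing note removed."""
    new_line = line
    if retry_all or (retry_failed and new_line.startswith("❌")):
        # Drop the leading marker and one following space, if present.
        if new_line and new_line[0] in STATUS_MARKERS:
            new_line = new_line[1:]
            if new_line.startswith(" "):
                new_line = new_line[1:]
        # Drop everything from the first "|" onward.
        pipe = new_line.find("|")
        if pipe != -1:
            new_line = new_line[:pipe].rstrip()
    return new_line


lines = ["✅ a.py | done", "❌ b.py | tests failed", "c.py"]
print([clean_line(l, retry_failed=True) for l in lines])
# ['✅ a.py | done', 'b.py', 'c.py']
print([clean_line(l, retry_all=True) for l in lines])
# ['a.py', 'b.py', 'c.py']
```

One consequence of the logic as written: with `--retry-all`, an unmarked line that happens to contain `|` is also truncated at the first pipe.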
@@ -995,6 +1027,11 @@ def main(argv=None):
         "--task-file",
         help="YAML task file to run.",
     )
+    task_parser.add_argument(
+        "-i",
+        "--item",
+        help="Item value for task files that use {{item}} placeholders.",
+    )
     task_parser.add_argument(
         "prompt",
         nargs="?",
@@ -1148,6 +1185,17 @@ def main(argv=None):
         "task_file",
         help="Path to the YAML task file.",
     )
+    foreach_retry_group = foreach_parser.add_mutually_exclusive_group()
+    foreach_retry_group.add_argument(
+        "--retry-failed",
+        action="store_true",
+        help="Reset failed (❌) items for re-run.",
+    )
+    foreach_retry_group.add_argument(
+        "--retry-all",
+        action="store_true",
+        help="Reset all items for re-run.",
+    )
     foreach_parser.add_argument(
         "-n",
         type=int,
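Note on the hunk above: `--retry-failed` and `--retry-all` are registered in an `argparse` mutually exclusive group, so passing both is rejected by the parser before `main` runs any further logic. A minimal sketch of that behavior, using a throwaway parser (the `prog` name and the `parse` helper are assumptions for illustration):

```python
import argparse

# Throwaway parser that mirrors only the two retry flags from the diff.
parser = argparse.ArgumentParser(prog="foreach-sketch")
group = parser.add_mutually_exclusive_group()
group.add_argument("--retry-failed", action="store_true")
group.add_argument("--retry-all", action="store_true")


def parse(argv):
    """Parse a list of arguments; argparse exits with status 2 on conflict."""
    return parser.parse_args(argv)


args = parse(["--retry-failed"])
print(args.retry_failed, args.retry_all)  # True False
```

Calling `parse(["--retry-failed", "--retry-all"])` raises `SystemExit` with code 2 after argparse prints a usage error to stderr.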
@@ -1181,6 +1229,12 @@ def main(argv=None):
     if args.command == "foreach":
         if args.n is not None and args.n < 1:
             raise SystemExit("-n must be >= 1.")
+        if args.retry_failed or args.retry_all:
+            _clean_foreach_list(
+                args.list_file,
+                args.retry_failed,
+                args.retry_all,
+            )
         result = foreach(
             args.list_file,
             args.task_file,
@@ -1225,20 +1279,25 @@ def main(argv=None):
     if args.command == "task" and args.task_file:
         if args.prompt:
             raise SystemExit("task -f does not take a prompt.")
+        if args.item is not None:
+            task_def = load_task_file(args.task_file)
+            if not task_def_uses_item(task_def):
+                raise SystemExit(
+                    "task -f --item requires {{item}} in the task file."
+                )
         if args.check is not None:
             raise SystemExit("--check is not allowed with -f.")
         if args.max_iterations is not None:
             raise SystemExit("--max-iterations is not allowed with -f.")
         task_runner = TaskFile(
             args.task_file,
-            None,
+            args.item,
             cwd=args.cwd,
             yolo=args.yolo,
             thread_id=None,
             flags=args.flags,
         )
         result = task_runner(progress=not args.quiet)
-        print(result.summary)
         if not result.success:
             raise SystemExit(1)
         return
@@ -1250,6 +1309,7 @@ def main(argv=None):
         prompt_source = args.task
     prompt = _read_prompt(prompt_source)
     exit_code = 0
+    message = None
     if args.command == "ralph":
         if args.max_iterations < 0:
@@ -1279,13 +1339,15 @@ def main(argv=None):
         )
         return
     if args.command == "task":
+        if args.item is not None:
+            raise SystemExit("--item is only supported with -f.")
         if args.max_iterations is None:
             args.max_iterations = DEFAULT_MAX_ITERATIONS
         if args.max_iterations < 0:
             raise SystemExit("--max-iterations must be >= 0.")
         check = args.check
         try:
-            message = task(
+            task(
                 prompt,
                 check,
                 args.max_iterations,
@@ -1295,7 +1357,6 @@ def main(argv=None):
                 not args.quiet,
             )
         except TaskFailed as exc:
-            message = exc.summary
             exit_code = 1
     else:
         use_session = args.thread_id or args.print_thread_id
@@ -1312,7 +1373,8 @@ def main(argv=None):
         else:
             message = agent(prompt, args.cwd, args.yolo, args.flags)
-    print(message)
+    if message is not None:
+        print(message)
     if exit_code:
         raise SystemExit(exit_code)

{codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/task.py RENAMED

@@ -5,6 +5,7 @@ import logging
 import time
 from .agent import Agent, agent
+from tqdm import tqdm
 _logger = logging.getLogger(__name__)
@@ -20,11 +21,13 @@ _CHECK_PREFIX = (
     "Set success to true only if everything matches the intent."
 )
 _CHECK_SUFFIX = "JSON only. No markdown or extra text."
-_PROGRESS_PROMPT = (
-    "Summarize the outputs below in one line each.\n"
-    "Return only JSON with keys: agent (string) and check (string).\n"
-    "Each value must be a single line with no newlines.\n"
-    "Do not run commands or change any files."
+_ESTIMATE_PROMPT = (
+    "Estimate remaining work in story points for the task below.\n"
+    "You may inspect the repo (read files, git status/diff), but do not run tests.\n"
+    "Do not change any files.\n"
+    "Use the task prompt, current repo state, and latest agent/check outputs.\n"
+    "Return only JSON with keys: remaining (number) and summary (string).\n"
+    "summary must be a single line describing agent + verifier status."
 )
 DEFAULT_MAX_ITERATIONS = 10
@@ -62,14 +65,32 @@ def _resolve_check_text(prompt, check):
     return check, False
-def _build_progress_prompt(agent_output, check_output):
-    return (
-        f"{_PROGRESS_PROMPT}\n\n"
-        "AGENT OUTPUT:\n"
-        f"{agent_output}\n\n"
-        "CHECK OUTPUT:\n"
-        f"{check_output}"
+def _build_estimate_prompt(prompt, agent_output, check_output, previous_total):
+    agent_text = agent_output.strip() or "(no agent output yet)"
+    check_text = check_output.strip() or "(no check output yet)"
+    lines = [
+        _ESTIMATE_PROMPT,
+        "",
+        "TASK:",
+        "```",
+        prompt,
+        "```",
+    ]
+    if previous_total is not None:
+        lines.append(
+            f"This task was previously estimated at about {previous_total} story points."
+        )
+    lines.extend(
+        [
+            "",
+            "AGENT OUTPUT:",
+            agent_text,
+            "",
+            "CHECK OUTPUT:",
+            check_text,
+        ]
     )
+    return "\n".join(lines)
 def _check_result(output):
@@ -91,25 +112,29 @@ def _check_result(output):
     return success, reason.strip()
-def _progress_result(output):
+def _estimate_result(output):
     try:
         data = json.loads(output)
     except json.JSONDecodeError as exc:
         raise RuntimeError(
-            f"Progress summary returned invalid JSON: {exc}"
+            f"Estimate returned invalid JSON: {exc}"
         ) from exc
     if not isinstance(data, dict):
-        raise RuntimeError("Progress summary JSON must be an object.")
+        raise RuntimeError("Estimate JSON must be an object.")
+    remaining = data.get("remaining")
+    summary = data.get("summary")
+    if not isinstance(remaining, (int, float)):
+        raise RuntimeError("Estimate JSON missing numeric 'remaining'.")
+    if not isinstance(summary, str):
+        raise RuntimeError("Estimate JSON missing string 'summary'.")
-    agent_summary = data.get("agent")
-    check_summary = data.get("check")
-    if not isinstance(agent_summary, str):
-        raise RuntimeError("Progress summary JSON missing string 'agent'.")
-    if not isinstance(check_summary, str):
-        raise RuntimeError("Progress summary JSON missing string 'check'.")
+    remaining = int(round(remaining))
+    if remaining < 0:
+        remaining = 0
-    return _single_line(agent_summary), _single_line(check_summary)
+    return remaining, _single_line(summary)
 def _single_line(text):
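Note on the hunk above: the new `_estimate_result` rounds `remaining` to an integer, clamps it at zero, and collapses `summary` to one line. The sketch below reproduces that validation in a standalone function for experimentation; `parse_estimate` is a hypothetical name, and the whitespace collapsing mirrors only the `return` line of `_single_line` that is visible in this diff.

```python
import json


def parse_estimate(output):
    """Validate estimate JSON and return (remaining, one-line summary)."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError as exc:
        raise RuntimeError(f"Estimate returned invalid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise RuntimeError("Estimate JSON must be an object.")
    remaining = data.get("remaining")
    summary = data.get("summary")
    if not isinstance(remaining, (int, float)):
        raise RuntimeError("Estimate JSON missing numeric 'remaining'.")
    if not isinstance(summary, str):
        raise RuntimeError("Estimate JSON missing string 'summary'.")
    # Round to whole story points and never report a negative value.
    remaining = max(0, int(round(remaining)))
    # Collapse newlines and runs of whitespace into single spaces.
    return remaining, " ".join(summary.replace("\r", " ").split())


reply = json.dumps({"remaining": 3.6, "summary": "agent fixed 2 tests;\n verifier pending"})
print(parse_estimate(reply))  # (4, 'agent fixed 2 tests; verifier pending')
print(parse_estimate('{"remaining": -2, "summary": "done"}'))  # (0, 'done')
```

Because `bool` is a subclass of `int` in Python, a reply such as `{"remaining": true, ...}` passes the numeric check and is treated as 1.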
@@ -118,56 +143,36 @@ def _single_line(text):
     return " ".join(text.replace("\r", " ").split())
-def _format_duration(seconds):
+def _format_elapsed(seconds):
     if seconds < 0:
         seconds = 0
     seconds = int(round(seconds))
     hours, remainder = divmod(seconds, 3600)
     minutes, seconds = divmod(remainder, 60)
-    parts = []
-    if hours:
-        parts.append(f"{hours}h")
-    if minutes or hours:
-        parts.append(f"{minutes}m")
-    if not hours:
-        parts.append(f"{seconds}s")
-    return " ".join(parts)
-def _print_progress(
-    attempt,
-    total,
-    start_time,
-    agent_output,
-    check_output,
-    cwd,
-    yolo,
-    flags,
-):
-    elapsed = time.monotonic() - start_time
-    remaining = 0
-    remaining_text = "unknown"
-    if total:
-        if attempt:
-            remaining = (elapsed / attempt) * (total - attempt)
-        remaining_text = _format_duration(remaining)
+    return f"{hours}h{minutes:02d}m{seconds:02d}s"
-    summary_prompt = _build_progress_prompt(agent_output, check_output)
-    summary = agent(summary_prompt, cwd, yolo, flags)
-    agent_summary, check_summary = _progress_result(summary)
-    elapsed_text = _format_duration(elapsed)
-    if not total:
-        round_text = f"Round {attempt}/unlimited"
+def _format_turns(attempt, total):
+    if total:
+        width = max(2, len(str(total)))
+        total_text = str(total)
     else:
-        round_text = f"Round {attempt}/{total}"
-    print(
-        f"{round_text} ({elapsed_text} elapsed, {remaining_text} remaining)",
-        flush=True,
+        width = 2
+        total_text = "∞"
+    attempt_text = f"{attempt:0{width}d}"
+    return f"{attempt_text}/{total_text}"
+def estimate(prompt, agent_output, check_output, cwd, yolo, flags, previous_total):
+    estimate_prompt = _build_estimate_prompt(
+        prompt,
+        agent_output or "",
+        check_output or "",
+        previous_total,
     )
-    print(f"Agent: {agent_summary}", flush=True)
-    print(f"Check: {check_summary}", flush=True)
-    print("", flush=True)
+    output = agent(estimate_prompt, cwd, yolo, flags)
+    return _estimate_result(output)
 def _fix_prompt(error):
     return (
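Note on the hunk above: `_format_elapsed` and `_format_turns` together produce the `[turns @ elapsed]` prefix that later appears in each status line. The sketch below copies both helpers (without the leading underscores) so the resulting strings can be inspected; the sample values are arbitrary.

```python
def format_elapsed(seconds):
    """Format a duration as e.g. 0h02m05s; negative input is treated as 0."""
    if seconds < 0:
        seconds = 0
    seconds = int(round(seconds))
    hours, remainder = divmod(seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}h{minutes:02d}m{seconds:02d}s"


def format_turns(attempt, total):
    """Format attempt/total, zero-padded; a falsy total means unlimited."""
    if total:
        width = max(2, len(str(total)))
        total_text = str(total)
    else:
        width = 2
        total_text = "∞"
    attempt_text = f"{attempt:0{width}d}"
    return f"{attempt_text}/{total_text}"


print(f"[{format_turns(3, 10)} @ {format_elapsed(125.4)}]")  # [03/10 @ 0h02m05s]
print(format_turns(3, 0))      # 03/∞
print(format_turns(12, 100))   # 012/100
print(format_elapsed(3725))    # 1h02m05s
```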
@@ -234,7 +239,7 @@ def task(
         cwd: Optional working directory for the Codex session.
         yolo: Whether to pass --yolo to Codex.
         flags: Additional raw CLI flags to pass to Codex.
-        progress: Whether to print progress after each verification round.
+        progress: Whether to show a tqdm progress bar with status updates.
         set_up: Optional setup prompt to run before the task.
         tear_down: Optional cleanup prompt to run after the task.
         on_success: Optional prompt to run after a successful task.
@@ -280,7 +285,7 @@ def task_result(
     """Run a prompt with optional checker-driven retries and return TaskResult.
     The runner keeps a single session. Each verification attempt uses a fresh,
-    stateless agent call. When progress is True, print a summary each round.
+    stateless agent call. When progress is True, show progress updates each round.
     Hook strings mirror task file keys: set_up, tear_down, on_success, on_failure.
     """
@@ -362,6 +367,9 @@ class Task:
         self.check_text = None
         self._yolo = yolo
         self._flags = flags
+        self._progress_enabled = False
+        self._progress_bar = None
+        self._progress_total = None
         self.agent = Agent(
             cwd,
             yolo,
@@ -403,6 +411,30 @@ class Task:
     def on_failure(self, result):
         """Hook called after a failed run, e.g. log the failure reason."""
+    def on_progress(
+        self,
+        turns,
+        max_turns,
+        total_estimate,
+        remaining_estimate,
+        status_line,
+    ):
+        """Hook called with progress updates."""
+        if not self._progress_enabled:
+            return
+        if self._progress_bar is None:
+            self._progress_bar = tqdm(total=total_estimate)
+        if total_estimate != self._progress_bar.total:
+            self._progress_bar.total = total_estimate
+        current = total_estimate - remaining_estimate
+        if current < 0:
+            current = 0
+        if self._progress_bar.n != current:
+            self._progress_bar.n = current
+        self._progress_bar.refresh()
+        if status_line:
+            tqdm.write(status_line, file=self._progress_bar.fp)
     def fix_prompt(self, error):
         """Build a prompt that asks the agent to fix checker failures."""
         return (
@@ -425,12 +457,35 @@ class Task:
     def __call__(self, debug=False, progress=False):
         """Run the task with checker-driven retries.
             If debug is True, log debug messages.
-            If progress is True, print progress after each verification round.
+            If progress is True, show a tqdm progress bar with status updates.
         """
         try:
             # If this fails in the middle we will still try to tear down
             self.set_up()
+            self._progress_enabled = progress
+            if progress:
+                remaining, _summary = estimate(
+                    self.prompt,
+                    "",
+                    "",
+                    self.cwd,
+                    self._yolo,
+                    self._flags,
+                    None,
+                )
+                self._progress_total = remaining
+                start_time = time.monotonic()
+                self.on_progress(
+                    0,
+                    self.max_attempts,
+                    self._progress_total,
+                    remaining,
+                    None,
+                )
+            else:
+                start_time = time.monotonic()
             # Start with the initial prompt
             output = self.agent(self.prompt)
             self.last_output = output
@@ -438,7 +493,6 @@ class Task:
                 _logger.debug("Initial output: %s", output)
             # Try correcting it up to max_attempts times
-            start_time = time.monotonic()
             error = None
             attempt = 0
             while True:
@@ -451,15 +505,36 @@ class Task:
                     check_output = self.last_check_output
                     if self.check_skipped:
                         check_output = "Verification skipped."
-                    _print_progress(
-                        attempt,
-                        self.max_attempts,
-                        start_time,
-                        self.last_output,
+                    remaining, summary = estimate(
+                        self.prompt,
+                        self.last_output or "",
                         check_output or "",
                         self.cwd,
                         self._yolo,
                         self._flags,
+                        self._progress_total,
+                    )
+                    total_estimate = self._progress_total
+                    if total_estimate is None or remaining > total_estimate:
+                        total_estimate = remaining
+                    self._progress_total = total_estimate
+                    elapsed = _format_elapsed(time.monotonic() - start_time)
+                    status_prefix = (
+                        f"[{_format_turns(attempt, self.max_attempts)} @ {elapsed}]"
+                    )
+                    is_final = not error or (
+                        self.max_attempts and attempt >= self.max_attempts
+                    )
+                    if is_final:
+                        marker = "✅" if not error else "❌"
+                        summary = f"{marker} {summary}".strip()
+                    status_line = f"{status_prefix}: {summary}".rstrip()
+                    self.on_progress(
+                        attempt,
+                        self.max_attempts,
+                        total_estimate,
+                        remaining,
+                        status_line,
                     )
                 if not error:
                     summary = self.agent(self.success_prompt())
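Note on the hunk above: the progress bar is driven by story-point estimates rather than by attempt count. The total only grows (when a new `remaining` exceeds it), and the bar position is `total - remaining`, clamped at zero. A minimal sketch of that arithmetic, with a hypothetical helper name `update_progress` and made-up estimates:

```python
def update_progress(previous_total, remaining):
    """Return (total, current) for the bar given a new remaining estimate."""
    total = previous_total
    # The total is raised whenever the new estimate exceeds it.
    if total is None or remaining > total:
        total = remaining
    # Bar position is work done so far, never below zero.
    current = max(0, total - remaining)
    return total, current


total = None
for remaining in (8, 5, 10, 0):  # made-up estimates across four rounds
    total, current = update_progress(total, remaining)
    print(f"remaining={remaining} -> {current}/{total}")
# remaining=8 -> 0/8
# remaining=5 -> 3/8
# remaining=10 -> 0/10
# remaining=0 -> 10/10
```

As the third round shows, an estimate that jumps above the previous total resets the bar position to zero, so the bar can move backward between rounds.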
@@ -494,6 +569,8 @@ class Task:
         finally:
             # No matter what, once we have set_up we will always tear_down
             self.tear_down()
+            if self._progress_bar is not None:
+                self._progress_bar.close()
 class AutoTask(Task):

{codexapi-0.5.4 → codexapi-0.5.6}/src/codexapi/taskfile.py RENAMED

@@ -54,6 +54,17 @@ def _render(text, item):
     return text.replace(_ITEM_TOKEN, item)
+def task_def_uses_item(task_def):
+    """Return True if a task definition includes the {{item}} placeholder."""
+    if not isinstance(task_def, dict):
+        raise TypeError("task definition must be a dict")
+    for key in ("prompt", "set_up", "tear_down", "check", "on_success", "on_failure"):
+        value = task_def.get(key)
+        if isinstance(value, str) and _ITEM_TOKEN in value:
+            return True
+    return False
 class TaskFile(AutoTask):
     """Task subclass that maps a YAML task file onto Task hooks."""

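Note on the taskfile.py hunk above: `task_def_uses_item` scans six string-valued keys for the placeholder, and the CLI uses it to reject `--item` when nothing would be substituted. The sketch below reimplements the check and the visible `_render` line in a standalone form. It assumes `_ITEM_TOKEN` is the literal `{{item}}` (as the README text suggests); the task definition dicts are made up.

```python
ITEM_TOKEN = "{{item}}"  # assumption: value of _ITEM_TOKEN in taskfile.py
ITEM_KEYS = ("prompt", "set_up", "tear_down", "check", "on_success", "on_failure")


def uses_item(task_def):
    """Return True if any checked key contains the {{item}} placeholder."""
    if not isinstance(task_def, dict):
        raise TypeError("task definition must be a dict")
    for key in ITEM_KEYS:
        value = task_def.get(key)
        if isinstance(value, str) and ITEM_TOKEN in value:
            return True
    return False


def render(text, item):
    """Substitute the item into the text, as the visible _render line does."""
    return text.replace(ITEM_TOKEN, item)


with_item = {"prompt": "Proofread {{item}} and fix typos.", "check": "None"}
without_item = {"prompt": "Fix the failing tests."}

print(uses_item(with_item))     # True
print(uses_item(without_item))  # False
print(render(with_item["prompt"], "README.md"))
# Proofread README.md and fix typos.
```

Keys outside the six listed are not inspected, so a placeholder that appears only in some other key would not satisfy the `--item` check.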
{codexapi-0.5.4 → codexapi-0.5.6/src/codexapi.egg-info}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: codexapi
-Version: 0.5.4
+Version: 0.5.6
 Summary: Minimal Python API for running the Codex CLI.
 License: MIT
 Keywords: codex,agent,cli,openai
@@ -68,18 +68,26 @@ codexapi run --cwd /path/to/project "Fix the failing tests."
 echo "Say hello." | codexapi run
 ```
-`codexapi task` exits with code 0 on success and 1 on failure, printing the summary.
+`codexapi task` exits with code 0 on success and 1 on failure.
 ```bash
 codexapi task "Fix the failing tests." --max-iterations 5
 codexapi task -f task.yaml
+codexapi task -f task.yaml -i README.md
 ```
 Progress is shown by default for `codexapi task`; use `--quiet` to suppress it.
+When using `--item`, the task file must include at least one `{{item}}` placeholder.
 Task files default to using the standard check prompt for the task. Set `check: "None"` to skip verification.
 Use `max_iterations` in the task file to override the default attempt cap (0 means unlimited).
 Checks are wrapped with the verifier prompt, include the agent output, and expect JSON with `success`/`reason`.
+Example task progress run:
+```bash
+./examples/example_task_progress.sh
+```
 Show running sessions and their latest activity:
 ```bash
@@ -120,6 +128,8 @@ Run a task file across a list file:
 ```bash
 codexapi foreach list.txt task.yaml
 codexapi foreach list.txt task.yaml -n 4
+codexapi foreach list.txt task.yaml --retry-failed
+codexapi foreach list.txt task.yaml --retry-all
 ```
 ## API
@@ -151,7 +161,7 @@ Raises `TaskFailed` when the maximum attempts are reached.
 - `check` (str | None | False): custom check prompt, default checker, or `False`/`"None"` to skip.
 - `max_iterations` (int): maximum number of task attempts (0 means unlimited).
-- `progress` (bool): print progress after each verification round.
+- `progress` (bool): show a tqdm progress bar with a one-line status after each round.
 - `set_up`/`tear_down`/`on_success`/`on_failure` (str | None): optional hook prompts.
 ### `task_result(prompt, check=None, max_iterations=10, cwd=None, yolo=True, flags=None, progress=False, set_up=None, tear_down=None, on_success=None, on_failure=None) -> TaskResult`