PyPI - python-codex - Versions diffs - 0.1.5__tar.gz → 0.1.6__tar.gz - Mend

python-codex 0.1.5__tar.gz → 0.1.6__tar.gz


Files changed (102)

{python_codex-0.1.5 → python_codex-0.1.6}/AGENTS.md RENAMED

@@ -13,6 +13,8 @@
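 - The `responses_server` compat layer should pass the request's `model` through unchanged; do not fall back to "take the first id from downstream /models and force-override the requested model" any more.
 - For `model_provider = "vllm"`, `responses_server` still uses the `/v1/chat/completions` compat path but must preserve reasoning: translate `reasoning` / `reasoning_content` in chat chunks back into Responses `reasoning` items, and replay Responses `reasoning` items from history as the `reasoning` field of downstream assistant messages.
 - Provider-specific chat payload customization in `responses_server` lives in `responses_server/payload_processors.py`: use `CompatServerConfig.model_provider` to select from a `provider_name -> proc_fn(outcomming_request)` mapping, and post-process only right before the downstream `/v1/chat/completions` request is actually sent; `StreamRouter` keeps the canonical payload internally so that tool hydration / mock web_search follow-up are not polluted by provider rewrites.
+- If `responses_server` needs to support a downstream `/v1/messages`, keep the same boundary: internally continue to use the canonical chat request / chat-like chunk stream, and adapt to messages only when actually sending the request and reading SSE, so that tool hydration, mock `web_search` follow-up, and provider payload post-processing can all be reused.
+- The `/v1/messages` endpoint of real vLLM `0.19.0` returns `400` directly when `max_tokens` is missing; the messages adapter layer must always supply this field. The current convention is to pass through the request's `max_output_tokens`/`max_tokens` when present, otherwise fall back to the default `32000`.
 - `pycodex` is a minimal interactive CLI by default; with no prompt it enters a REPL and runs the outer submission loop through `AgentRuntime`. It currently shows a minimal event stream, streamed assistant output, and simple title/history (`/title`, `/history`), and by default registers a subset of local tools that map one-to-one to the original.
 - The interactive CLI's event-stream display should prioritize stages the user can perceive (for example tool started/completed, the model reviewing tool results); do not expose the internal `iteration` count as the main status text; `iterations` should remain in programmatic results such as `TurnResult`.
 - Prompt/context logic lives in `pycodex/context.py`: `AgentLoop` maintains only the real conversation history; before each request, `ContextManager` injects base instructions, the developer message, `AGENTS.md` instructions, and `<environment_context>`, and these injected items are not written back to history.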
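The `max_tokens` convention in the note above (pass through the request's limit, otherwise default to `32000`) can be sketched as follows. This is a minimal illustration, not the package's code: `resolve_max_tokens` and `DEFAULT_MAX_TOKENS` are hypothetical names, and the precedence of `max_output_tokens` over `max_tokens` when both are present is an assumption.

```python
from typing import Any, Mapping

DEFAULT_MAX_TOKENS = 32000  # default value named in the note above


def resolve_max_tokens(request: Mapping[str, Any]) -> int:
    """Hypothetical helper: choose the max_tokens value for a /v1/messages call."""
    # Prefer an explicit limit from the request. Checking the Responses-style
    # name before the chat-style name is an assumption of this sketch.
    for key in ("max_output_tokens", "max_tokens"):
        value = request.get(key)
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    # Nothing usable was supplied, so fall back to the default.
    return DEFAULT_MAX_TOKENS
```

Because the helper always returns a positive integer, an adapter built on it would never send a messages request without `max_tokens`.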

{python_codex-0.1.5 → python_codex-0.1.6}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: python-codex
-Version: 0.1.5
+Version: 0.1.6
 Summary: A minimal Python extraction of Codex's main agent loop
 License-File: LICENSE
 Requires-Python: >=3.6.2
@@ -159,6 +159,7 @@ pycodex "Summarize this repo in one sentence."
 printf 'Reply with exactly OK.' | pycodex
 pycodex --json "Reply with exactly OK."
 pycodex --profile model_proxy "Reply with exactly OK."
+pycodex --profile opus --use-messages "Reply with exactly OK."
 pycodex --vllm-endpoint http://127.0.0.1:18000 "Reply with exactly OK."
 pycodex --put @127.0.0.1:5577
 pycodex --put /data/.codex/@127.0.0.1:5577
@@ -211,6 +212,9 @@ Current behavior:
   historical `reasoning` items are replayed into downstream assistant messages
   via the `reasoning` field. Streaming token usage is also requested from vLLM
   and forwarded to the final `response.completed.response.usage`
+- standalone `responses_server` now also supports downstream `/v1/messages`
+  backends via `--outcomming-api messages`, while keeping the internal
+  canonical request/route logic in chat-completions shape
 - `pycodex doctor` checks config, `.env`, API keys, DNS, TCP/TLS, and an
   optional live Responses API request

{python_codex-0.1.5 → python_codex-0.1.6}/README.md RENAMED

@@ -138,6 +138,7 @@ pycodex "Summarize this repo in one sentence."
 printf 'Reply with exactly OK.' | pycodex
 pycodex --json "Reply with exactly OK."
 pycodex --profile model_proxy "Reply with exactly OK."
+pycodex --profile opus --use-messages "Reply with exactly OK."
 pycodex --vllm-endpoint http://127.0.0.1:18000 "Reply with exactly OK."
 pycodex --put @127.0.0.1:5577
 pycodex --put /data/.codex/@127.0.0.1:5577
@@ -190,6 +191,9 @@ Current behavior:
   historical `reasoning` items are replayed into downstream assistant messages
   via the `reasoning` field. Streaming token usage is also requested from vLLM
   and forwarded to the final `response.completed.response.usage`
+- standalone `responses_server` now also supports downstream `/v1/messages`
+  backends via `--outcomming-api messages`, while keeping the internal
+  canonical request/route logic in chat-completions shape
 - `pycodex doctor` checks config, `.env`, API keys, DNS, TCP/TLS, and an
   optional live Responses API request

{python_codex-0.1.5 → python_codex-0.1.6}/docs/responses_server/README.md RENAMED

@@ -21,6 +21,7 @@
 - incomming `POST /v1/responses`
 - incomming `GET /v1/models`
 - outcomming `POST /v1/chat/completions`
+- outcomming `POST /v1/messages` (reuses the same canonical chat request / stream routing through boundary adaptation)
 - streaming assistant text
 - vLLM chat-completions `reasoning` / `reasoning_content` -> Responses `reasoning` item adaptation
 - vLLM historical `reasoning` item -> assistant message `reasoning` field replay
@@ -42,13 +43,13 @@
 ## Incomming / Outcomming layering
 - incomming: the Responses subset facing Codex
-- outcomming: the chat completions subset facing the backend
+- outcomming: the chat-completions / messages compat subset facing the backend
 Current division of responsibilities:
 - `responses_server/app.py`: FastAPI app and CLI entry point
 - `responses_server/server.py`: `ResponseServer`, which holds the `SessionStore` and `StreamRouter`
-- `responses_server/stream_router.py`: `StreamRouter`, responsible for incomming request translation, outcomming chat requests, and stream routing; additionally adapts chat-level reasoning for `model_provider = "vllm"`
+- `responses_server/stream_router.py`: `StreamRouter`, responsible for incomming request translation, outcomming requests, and stream routing; additionally adapts chat-level reasoning for `model_provider = "vllm"`
 - `responses_server/payload_processors.py`: selects the provider-specific payload `post_process` by `CompatServerConfig.model_provider`
 - `responses_server/tools/`: provider-side tool adaptation layer; currently holds the mock `web_search` and the custom-tool function wrapper
 - `responses_server/session_store.py`: minimal hidden-state store
@@ -63,6 +64,15 @@ uv run python -m responses_server \
   --model-provider vllm
 ```
+If the downstream is not `/v1/chat/completions` but an Anthropic/Claude-style
+`/v1/messages`, additionally pass:
+```bash
+uv run python -m responses_server \
+  --outcomming-base-url http://127.0.0.1:8000/v1 \
+  --outcomming-api messages
+```
 By default a local incomming Responses service is started; the actual listen address is controlled by
 `--host` and `--port`.
@@ -73,6 +83,15 @@ uv run python -m responses_server \
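 In the current built-in rules, `vllm` still uses the chat-completions compat path but additionally preserves
 reasoning; `stepfun` removes all `developer` roles.
+The `messages` compat deliberately leaves this canonical request layer unchanged: it still builds a chat-style
+`outcomming_request` first, and translates it into a messages request / event at the boundary
+only when actually sending the request and reading SSE. This way tool hydration, mock `web_search`
+follow-up, and provider payload post-processing still reuse the same main logic.
+The messages boundary currently also handles one compatibility detail: if the downstream requires
+`max_tokens`, as vLLM `0.19.0` does, the upstream request's `max_output_tokens` / `max_tokens` is passed through when present;
+if the upstream gives none, `32000` is supplied by default so the downstream does not reject the request with `400`.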
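The boundary adaptation described above (keep the canonical chat-style request, translate only at send time) might look roughly like the sketch below. It is an illustration under stated assumptions, not the project's adapter: `chat_to_messages_request` is a hypothetical name, message `content` is assumed to be a plain string, tool messages are omitted, and folding `system` / `developer` messages into a top-level `system` string follows the general shape of Anthropic-style messages requests.

```python
from typing import Any, Dict, List


def chat_to_messages_request(chat_request: Dict[str, Any],
                             default_max_tokens: int = 32000) -> Dict[str, Any]:
    """Hypothetical boundary adapter: canonical chat payload -> messages payload."""
    system_parts: List[str] = []
    messages: List[Dict[str, Any]] = []
    for message in chat_request.get("messages", []):
        role = message.get("role")
        content = message.get("content") or ""
        if role in ("system", "developer"):
            # Instructions travel as a top-level `system` string, not as a message.
            system_parts.append(content)
        elif role in ("user", "assistant"):
            messages.append({"role": role, "content": content})
        # Tool messages and other roles are left out of this sketch.
    payload: Dict[str, Any] = {
        "model": chat_request["model"],
        "messages": messages,
        # Always present, so a strict downstream does not answer with 400.
        "max_tokens": chat_request.get("max_tokens") or default_max_tokens,
        "stream": bool(chat_request.get("stream", False)),
    }
    if system_parts:
        payload["system"] = "\n\n".join(system_parts)
    return payload
```

The function builds a new dictionary and leaves its input untouched, which is the property the text relies on: everything upstream of the boundary keeps working on the canonical chat request.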
 ## Verification
 Current standalone tests:

{python_codex-0.1.5 → python_codex-0.1.6}/pycodex/cli.py RENAMED

@@ -42,7 +42,6 @@ CliSessionMode = Literal["exec", "tui"]
 LOCAL_RESPONSES_SERVER_API_KEY_ENV = "PYCODEX_LOCAL_RESPONSES_SERVER_KEY"
 CLI_ORIGINATOR = "codex-tui"
 def launch_chat_completion_compat_server(*args, **kwargs):
     from responses_server import (
         launch_chat_completion_compat_server as launch_compat_server,
@@ -123,6 +122,15 @@ def build_parser() -> 'argparse.ArgumentParser':
             "When set, pycodex starts a local responses compat server for this session."
         ),
     )
+    parser.add_argument(
+        "--use-messages",
+        default=False,
+        action="store_true",
+        help=(
+            "When set, pycodex starts a local responses compat server and routes "
+            "to a downstream /v1/messages backend for this session."
+        ),
+    )
     parser.add_argument(
         "--system-prompt",
         default=None,
@@ -373,12 +381,17 @@ def _build_model_client(
     managed_responses_base_url: 'typing.Union[str, None]' = None,
     vllm_endpoint: 'typing.Union[str, None]' = None,
     use_chat_completion: 'bool' = False,
+    use_messages: 'bool' = False,
 ):
     load_codex_dotenv(config_path)
     provider_config = ResponsesProviderConfig.from_codex_config(
         config_path,
         profile,
     )
+    if use_chat_completion and use_messages:
+        raise ValueError("--use-chat-completion and --use-messages cannot be combined")
+    if vllm_endpoint and use_messages:
+        raise ValueError("--vllm-endpoint and --use-messages cannot be combined")
     url, key_env = provider_config.base_url, provider_config.api_key_env
     if managed_responses_base_url is not None:
         url, key_env = (
@@ -386,7 +399,7 @@ def _build_model_client(
             LOCAL_RESPONSES_SERVER_API_KEY_ENV,
         )
         os.environ.setdefault(LOCAL_RESPONSES_SERVER_API_KEY_ENV, "dummy")
-    elif vllm_endpoint or use_chat_completion:
+    elif vllm_endpoint or use_chat_completion or use_messages:
         if vllm_endpoint:
             managed_server = launch_chat_completion_compat_server(
                 vllm_endpoint,
@@ -397,6 +410,9 @@ def _build_model_client(
                 provider_config.base_url,
                 provider_config.api_key_env,
                 model_provider=provider_config.provider_name,
+                outcomming_api=(
+                    "messages" if use_messages else "chat_completions"
+                ),
             )
         atexit.register(managed_server.stop)
         url, key_env = (
@@ -755,6 +771,7 @@ async def run_cli(args: 'argparse.Namespace') -> 'int':
             args.timeout_seconds,
             vllm_endpoint=args.vllm_endpoint,
             use_chat_completion=args.use_chat_completion,
+            use_messages=args.use_messages,
         )
         runtime = build_runtime(
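The flag handling added in this hunk can be exercised in isolation with a small sketch. The three flag names and the two error messages come from the diff; `build_sketch_parser` and `select_outcomming_api` are hypothetical names, and the real `_build_model_client` does considerably more (loading config, launching the compat server).

```python
import argparse


def build_sketch_parser() -> argparse.ArgumentParser:
    # Only the three flags relevant to the new guards are reproduced here.
    parser = argparse.ArgumentParser(prog="pycodex-sketch")
    parser.add_argument("--vllm-endpoint", default=None)
    parser.add_argument("--use-chat-completion", default=False, action="store_true")
    parser.add_argument("--use-messages", default=False, action="store_true")
    return parser


def select_outcomming_api(args: argparse.Namespace) -> str:
    """Hypothetical helper mirroring the two guards and the API selection."""
    if args.use_chat_completion and args.use_messages:
        raise ValueError("--use-chat-completion and --use-messages cannot be combined")
    if args.vllm_endpoint and args.use_messages:
        raise ValueError("--vllm-endpoint and --use-messages cannot be combined")
    return "messages" if args.use_messages else "chat_completions"
```

For example, `select_outcomming_api(build_sketch_parser().parse_args(["--use-messages"]))` returns `"messages"`, while combining `--use-messages` with either of the other two flags raises `ValueError` before any server is started.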

{python_codex-0.1.5 → python_codex-0.1.6}/pycodex/utils/visualize.py RENAMED

@@ -158,13 +158,29 @@ class Spinner:
             self._paused = False
     def clear(self) -> 'None':
-        if not self._enabled or not self._visible:
-            return
         with self._terminal_lock:
+            if not self._visible:
+                return
             self._raw_write("\r\x1b[2K")
             self._raw_flush()
             self._visible = False
+    def render_now(self) -> 'None':
+        if not self._turn_active or self._paused:
+            return
+        frame = colorize_cli_message(
+            build_cli_spinner_frame(self._index, self._label),
+            "status",
+            self._color_enabled,
+        )
+        self._index += 1
+        with self._terminal_lock:
+            if not self._turn_active or self._paused:
+                return
+            self._raw_write(f"\r\x1b[2K{frame}")
+            self._raw_flush()
+            self._visible = True
     def close(self) -> 'None':
         self.finish_turn()
         if self._thread is not None:
@@ -726,6 +742,7 @@ class CliSessionView:
                 else:
                     self._spinner.resume()
                     self._spinner.set_label("running provider tools")
+                    self._spinner.render_now()
             return
         if event.kind == "tool_started":
@@ -745,15 +762,11 @@ class CliSessionView:
                     self._spinner.set_label(f"running {tool_name}")
                 else:
                     self._spinner.set_label("running provider tools")
+                self._spinner.render_now()
             return
         if event.kind == "tool_completed":
             self._finish_stream()
-            if self._input_active:
-                self._spinner.pause()
-            else:
-                self._spinner.resume()
-                self._spinner.set_label("thinking")
             tool_name, summary, is_error = extract_tool_event_display(event.payload)
             summary = self._rewrite_agent_summary(tool_name, summary)
             if tool_name == "update_plan" and not is_error:
@@ -762,6 +775,12 @@ class CliSessionView:
                     self._print_line(
                         colorize_cli_message(line, "plan", self._color_enabled)
                     )
+                if self._input_active:
+                    self._spinner.pause()
+                else:
+                    self._spinner.resume()
+                    self._spinner.set_label("thinking")
+                    self._spinner.render_now()
                 return
             message = format_cli_tool_message(
                 tool_name,
@@ -770,6 +789,12 @@ class CliSessionView:
             )
             self._remember_agent_name(tool_name, summary)
             self._print_line(self._colorize_formatted_tool_message(message))
+            if self._input_active:
+                self._spinner.pause()
+            else:
+                self._spinner.resume()
+                self._spinner.set_label("thinking")
+                self._spinner.render_now()
             return
         if event.kind == "turn_completed":
@@ -830,6 +855,8 @@ class CliSessionView:
     def resume_spinner(self) -> 'None':
         self._spinner.resume()
+        if not self._input_active:
+            self._spinner.render_now()
     def set_input_active(self, active: 'bool', resume_spinner: 'bool' = True) -> 'None':
         self._input_active = active
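The spinner changes in this file follow a check, lock, re-check pattern: `render_now` tests the state once cheaply, then again under the terminal lock, and `clear` now reads `_visible` only while holding the lock. The sketch below reproduces that pattern with a simplified stand-in. `MiniSpinner` and its frame characters are assumptions for illustration; the real `Spinner` also has a background thread, colorized output, and `build_cli_spinner_frame`.

```python
import io
import threading


class MiniSpinner:
    """Simplified stand-in for the Spinner in the diff; not the real class."""

    FRAMES = "|/-\\"

    def __init__(self, stream, label="thinking"):
        self._stream = stream
        self._label = label
        self._index = 0
        self._turn_active = True
        self._paused = False
        self._visible = False
        self._terminal_lock = threading.Lock()

    def pause(self):
        self._paused = True

    def set_label(self, label):
        self._label = label

    def render_now(self):
        # Cheap early exit before building the frame.
        if not self._turn_active or self._paused:
            return
        frame = f"{self.FRAMES[self._index % len(self.FRAMES)]} {self._label}"
        self._index += 1
        with self._terminal_lock:
            # Re-check under the lock: the state may have changed meanwhile.
            if not self._turn_active or self._paused:
                return
            self._stream.write(f"\r\x1b[2K{frame}")
            self._stream.flush()
            self._visible = True

    def clear(self):
        with self._terminal_lock:
            # Visibility is read under the lock so it cannot race with a render.
            if not self._visible:
                return
            self._stream.write("\r\x1b[2K")
            self._stream.flush()
            self._visible = False


out = io.StringIO()
spinner = MiniSpinner(out)
spinner.set_label("running provider tools")
spinner.render_now()  # draws a frame immediately instead of waiting for a tick
spinner.pause()
spinner.render_now()  # writes nothing while paused
spinner.clear()       # erases the frame that is still visible
```

Calling `render_now()` right after `set_label()` is what the `CliSessionView` hunks add: the new label shows up at once rather than on the next animation tick.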

{python_codex-0.1.5 → python_codex-0.1.6}/pyproject.toml RENAMED

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 [project]
 name = "python-codex"
-version = "0.1.5"
+version = "0.1.6"
 description = "A minimal Python extraction of Codex's main agent loop"
 readme = "README.md"
 requires-python = ">=3.6.2"

{python_codex-0.1.5 → python_codex-0.1.6}/responses_server/app.py RENAMED

@@ -55,12 +55,18 @@ def build_parser() -> 'argparse.ArgumentParser':
         prog="python -m responses_server",
         description=(
             "Standalone localhost `/v1/responses` server that translates the "
-            "Codex/Responses subset onto an outcomming `/v1/chat/completions` backend."
+            "Codex/Responses subset onto an outcomming `/v1/chat/completions` "
+            "or `/v1/messages` backend."
         ),
     )
     parser.add_argument("--host", default="127.0.0.1")
     parser.add_argument("--port", type=int, default=8001)
     parser.add_argument("--outcomming-base-url", required=True)
+    parser.add_argument(
+        "--outcomming-api",
+        default="chat_completions",
+        choices=["chat_completions", "messages"],
+    )
     parser.add_argument("--outcomming-api-key-env", default=None)
     parser.add_argument("--model-provider", default=None)
     parser.add_argument("--timeout-seconds", type=float, default=120.0)
@@ -80,10 +86,12 @@ def launch_chat_completion_compat_server(
     base_url: 'str',
     api_key_env: 'typing.Union[str, None]' = None,
     model_provider: 'typing.Union[str, None]' = None,
+    outcomming_api: 'str' = "chat_completions",
 ):
     config = CompatServerConfig.from_base_url(
         base_url,
         api_key_env,
+        outcomming_api=outcomming_api,
         model_provider=model_provider,
     )
     server = ManagedResponseServer(config)
@@ -209,6 +217,7 @@ def main() -> 'None':
             host=args.host,
             port=args.port,
             outcomming_base_url=args.outcomming_base_url,
+            outcomming_api=args.outcomming_api,
             outcomming_api_key_env=args.outcomming_api_key_env,
             model_provider=args.model_provider,
             timeout_seconds=args.timeout_seconds,

{python_codex-0.1.5 → python_codex-0.1.6}/responses_server/config.py RENAMED

@@ -10,6 +10,7 @@ class CompatServerConfig:
     host: 'str' = "127.0.0.1"
     port: 'int' = 0
     outcomming_base_url: 'str' = "http://127.0.0.1:8000/v1"
+    outcomming_api: 'str' = "chat_completions"
     outcomming_api_key_env: 'typing.Union[str, None]' = None
     model_provider: 'typing.Union[str, None]' = None
     timeout_seconds: 'float' = 120.0
@@ -24,15 +25,24 @@ class CompatServerConfig:
         base = self.outcomming_base_url.rstrip("/")
         return f"{base}/chat/completions"
+    def outcomming_messages_url(self) -> 'str':
+        base = self.outcomming_base_url.rstrip("/")
+        return f"{base}/messages"
     def outcomming_models_url(self) -> 'str':
         base = self.outcomming_base_url.rstrip("/")
         return f"{base}/models"
+    def normalized_outcomming_api(self) -> 'str':
+        value = str(self.outcomming_api or "").strip().lower()
+        return value or "chat_completions"
     def with_ephemeral_port(self) -> 'CompatServerConfig':
         return CompatServerConfig(
             host=self.host,
             port=0,
             outcomming_base_url=self.outcomming_base_url,
+            outcomming_api=self.outcomming_api,
             outcomming_api_key_env=self.outcomming_api_key_env,
             model_provider=self.model_provider,
             timeout_seconds=self.timeout_seconds,
@@ -44,6 +54,7 @@ class CompatServerConfig:
         outcomming_base_url: 'str',
         api_key_env: 'typing.Union[str, None]' = None,
         model_provider: 'typing.Union[str, None]' = None,
+        outcomming_api: 'str' = "chat_completions",
     ) -> 'CompatServerConfig':
         parsed = urllib.parse.urlparse(outcomming_base_url)
         if not parsed.scheme or not parsed.netloc:
@@ -58,6 +69,7 @@ class CompatServerConfig:
             )
         return cls(
             outcomming_base_url=outcomming_base_url,
+            outcomming_api=outcomming_api,
             outcomming_api_key_env=api_key_env,
             model_provider=model_provider,
         )
