PyPI - glm-launch - Versions diffs - 2026.6.1__py3-none-any.whl - Mend

glm-launch 2026.6.1__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5)

glm_launch-2026.6.1.dist-info/METADATA +354 -0
glm_launch-2026.6.1.dist-info/RECORD +5 -0
glm_launch-2026.6.1.dist-info/WHEEL +4 -0
glm_launch-2026.6.1.dist-info/entry_points.txt +2 -0
main.py +668 -0

glm_launch-2026.6.1.dist-info/METADATA ADDED

@@ -0,0 +1,354 @@
+Metadata-Version: 2.4
+Name: glm-launch
+Version: 2026.6.1
+Summary: Wrap claude/codex/opencode with Z.ai GLM settings
+Requires-Python: >=3.13
+Requires-Dist: typer
+Description-Content-Type: text/markdown
+# glm-cli-git
+A Python CLI tool that wraps LLM coding tools (`claude`, `codex`, `opencode`) with [GLM](https://docs.z.ai/) settings. Instead of running a local proxy, it configures environment variables and config files, then exec's the underlying binary directly.
+Requires Python 3.13+.
+## Usage
+```bash
+# 1. Set your Z.AI auth token
+export GLM_AUTH_TOKEN="your-zai-api-key"
+# 2. Launch Claude Code routed through Z.AI (defaults to glm-5.2)
+uv run glm-launch              # bare command defaults to `claude`
+uv run glm-launch claude       # same thing, explicit
+# Pick a different model
+uv run glm-launch claude --model glm-5.1        # long-horizon flagship
+uv run glm-launch claude --model glm-5-turbo    # fast
+uv run glm-launch claude --model glm-4.5-air    # cheap
+# Bootstrap your current shell so a plain `claude` uses Z.AI
+eval "$(uv run glm-launch shell)"
+claude
+# See available models (built-in list, or --remote for the live API list)
+uv run glm-launch models
+uv run glm-launch models --remote
+# Sanity-check connectivity / latency
+uv run glm-launch bench
+```
+> Examples use the installed `glm-launch` entrypoint. Before `uv sync` you can run
+> the script directly with `uv run src/main.py …` — the two are interchangeable.
+## Installation
+```bash
+uv sync
+```
+This installs a `glm-launch` entrypoint. Run commands via `uv run glm-launch <command>`, or `uv tool install .` to get `glm-launch` on your PATH directly. You can also run the script without installing via `uv run src/main.py <command>`.
+### Run without cloning (`uvx`)
+You can run `glm-launch` directly with [`uvx`](https://docs.astral.sh/uv/guides/tools/) (`uv tool run`) — no clone or manual install needed.
+```bash
+# From GitHub (works today)
+uvx --from git+https://github.com/jefftriplett/glm-launch glm-launch launch claude
+# Pin to a tag/branch/commit
+uvx --from git+https://github.com/jefftriplett/glm-launch@main glm-launch models
+```
+Once published to PyPI, this simplifies to:
+```bash
+# Coming soon — not yet on PyPI
+uvx glm-launch launch claude
+```
+## Commands
+### `launch claude`
+Launch [Claude Code](https://docs.anthropic.com/en/docs/claude-code) with GLM environment settings. Sets Anthropic env vars to route requests through Z.AI's Anthropic-compatible endpoint, then exec's the `claude` binary.
+> The `launch` prefix is optional: `glm-launch claude` is equivalent to `glm-launch launch claude`, and a bare `glm-launch` defaults to `claude`. The same applies to `codex` and `opencode`.
+```bash
+uv run glm-launch launch claude
+```
+**Options:**
+| Flag | Env var | Default | Description |
+|------|---------|---------|-------------|
+| `--model` / `-m` | — | `glm-5.2` | Model name passed to `claude --model` |
+| `--base-url` | `GLM_BASE_URL` | `https://api.z.ai/api/anthropic` | API endpoint |
+| `--api-key` | `GLM_API_KEY` | `""` | API key |
+| `--auth-token` | `GLM_AUTH_TOKEN` | **(required)** | Z.AI auth token |
+| `--api-timeout-ms` | `API_TIMEOUT_MS` | `3000000` | Request timeout in milliseconds |
+| `--default-haiku-model` | `ANTHROPIC_DEFAULT_HAIKU_MODEL` | `glm-4.5-air` | Model for Haiku-tier requests |
+| `--default-sonnet-model` | `ANTHROPIC_DEFAULT_SONNET_MODEL` | `glm-5.2` | Model for Sonnet-tier requests |
+| `--default-opus-model` | `ANTHROPIC_DEFAULT_OPUS_MODEL` | `glm-5.2` | Model for Opus-tier requests |
+| `--subagent-model` | `CLAUDE_CODE_SUBAGENT_MODEL` | `glm-4.5-air` | Model used for spawned subagents |
+| `--effort-level` | `CLAUDE_CODE_EFFORT_LEVEL` | `max` | Effort level for the agent loop |
+| `--attribution-header` | `CLAUDE_CODE_ATTRIBUTION_HEADER` | `0` | Attribution header toggle (`0` disables it) |
+| `--auto-compact-window` | `CLAUDE_CODE_AUTO_COMPACT_WINDOW` | `200000` | Auto-compact context window in tokens (empty to leave unset) |
+The following env vars are set before exec'ing `claude` (plus `ANTHROPIC_MODEL` from `--model`, when non-empty):
+- `ANTHROPIC_BASE_URL` — from `--base-url` / `GLM_BASE_URL`
+- `ANTHROPIC_API_KEY` — from `--api-key` / `GLM_API_KEY`
+- `ANTHROPIC_AUTH_TOKEN` — from `--auth-token` / `GLM_AUTH_TOKEN`
+- `API_TIMEOUT_MS` — from `--api-timeout-ms` / `API_TIMEOUT_MS`
+- `ANTHROPIC_DEFAULT_HAIKU_MODEL` — from `--default-haiku-model`
+- `ANTHROPIC_DEFAULT_SONNET_MODEL` — from `--default-sonnet-model`
+- `ANTHROPIC_DEFAULT_OPUS_MODEL` — from `--default-opus-model`
+- `CLAUDE_CODE_SUBAGENT_MODEL` — from `--subagent-model`
+- `CLAUDE_CODE_EFFORT_LEVEL` — from `--effort-level`
+- `CLAUDE_CODE_ATTRIBUTION_HEADER` — from `--attribution-header`
+- `CLAUDE_CODE_AUTO_COMPACT_WINDOW` — from `--auto-compact-window` (only when non-empty)
+**Examples:**
+```bash
+# Use defaults (glm-5.2, Z.AI endpoint)
+uv run glm-launch launch claude
+# Flagship reasoning/coding model (the default)
+uv run glm-launch launch claude --model glm-5.2
+# Long-horizon agentic flagship
+uv run glm-launch launch claude --model glm-5.1
+# Fast, speed-optimized GLM-5 variant
+uv run glm-launch launch claude --model glm-5-turbo
+# Lightweight, low-cost model for cheaper runs
+uv run glm-launch launch claude --model glm-4.5-air
+# Tune the model tiers independently (e.g. cheap subagents, flagship main)
+uv run glm-launch launch claude \
+  --model glm-5.2 \
+  --subagent-model glm-4.5-air \
+  --default-haiku-model glm-4.5-air
+# Pass extra args through to claude
+uv run glm-launch launch claude -- --verbose
+# Override via env vars
+GLM_AUTH_TOKEN="my-token" uv run glm-launch launch claude
+```
+Run `uv run glm-launch models` to see all valid model names (or `--remote` for the live list).
+If `claude` is not on your PATH, the tool falls back to `~/.claude/local/claude`.
+### `launch codex`
+Launch [Codex](https://github.com/openai/codex) with the `--oss` flag for local Ollama usage.
+```bash
+uv run glm-launch launch codex
+```
+**Options:**
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--model` / `-m` | — | Model name passed to `codex -m` |
+**Examples:**
+```bash
+# Launch with default settings
+uv run glm-launch launch codex
+# Specify a model
+uv run glm-launch launch codex --model "some-model"
+# Pass extra args through to codex
+uv run glm-launch launch codex -- --some-flag
+```
+### `launch opencode`
+Launch [opencode](https://opencode.ai/) after writing provider config. Writes an Ollama-compatible provider to `~/.config/opencode/opencode.json` and updates the recent model state at `~/.local/state/opencode/model.json`, then exec's the `opencode` binary.
+```bash
+uv run glm-launch launch opencode --model "some-model"
+```
+**Options:**
+| Flag | Env var | Default | Description |
+|------|---------|---------|-------------|
+| `--model` / `-m` | — | — | Model name to configure in opencode |
+| `--base-url` | `GLM_BASE_URL` | **(required)** | Base URL for the API endpoint |
+**Examples:**
+```bash
+# Launch with a model
+GLM_BASE_URL="http://localhost:11434/v1" uv run glm-launch launch opencode --model "llama3"
+# Pass extra args through to opencode
+uv run glm-launch launch opencode --model "llama3" -- --some-flag
+```
+### `shell`
+Print `export` lines that bootstrap your current shell with the GLM env vars — without launching anything. Eval the output and a plain `claude` (or any Anthropic SDK tool) will talk to Z.AI.
+```bash
+eval "$(uv run glm-launch shell)"
+claude
+```
+Accepts the same model/auth options as `launch claude` (`--model`, `--auth-token`, `--default-*-model`, etc.). Every value is single-quoted for POSIX shells; empty values are skipped. Sets `ANTHROPIC_MODEL` plus all the env vars listed under `launch claude`.
+```bash
+# Inspect what would be exported
+uv run glm-launch shell
+# Bootstrap with a specific model
+eval "$(uv run glm-launch shell --model glm-5.1)"
+```
+### `models`
+List Z.AI GLM models. By default prints a built-in, annotated list; `--remote` fetches the live list from the Z.AI PaaS endpoint.
+```bash
+# Built-in list (no token needed)
+uv run glm-launch models
+# Live list from the API (needs GLM_AUTH_TOKEN)
+uv run glm-launch models --remote
+```
+**Options:**
+| Flag | Env var | Default | Description |
+|------|---------|---------|-------------|
+| `--remote` / `-r` | — | `false` | Fetch the live list from the Z.AI API |
+| `--models-url` | `GLM_MODELS_URL` | `https://api.z.ai/api/paas/v4/models` | PaaS models endpoint (used with `--remote`) |
+| `--auth-token` | `GLM_AUTH_TOKEN` | — | Auth token (required with `--remote`) |
+| `--timeout` | — | `30.0` | Request timeout in seconds |
+The live endpoint is the OpenAI-compatible PaaS base (`/api/paas/v4/models`) and uses `Authorization: Bearer <token>` — distinct from the Anthropic-style chat base (`/api/anthropic`) used by `launch claude` and `bench`.
+### `bench`
+Time a single `/v1/messages` round-trip against the configured GLM endpoint. Useful as a sanity check that your auth token, base URL, and chosen model are reachable.
+```bash
+uv run glm-launch bench
+```
+**Options:**
+| Flag | Env var | Default | Description |
+|------|---------|---------|-------------|
+| `--model` / `-m` | — | `glm-5.2` | Model to benchmark |
+| `--base-url` | `GLM_BASE_URL` | `https://api.z.ai/api/anthropic` | API endpoint |
+| `--auth-token` | `GLM_AUTH_TOKEN` | **(required)** | Auth token for the endpoint |
+| `--timeout` | — | `30.0` | Request timeout in seconds |
+Sends a minimal request capped at 32 output tokens (`max_tokens: 32`) and prints the round-trip time. Exits non-zero on HTTP error or timeout.
+**Example output:**
+```
+  glm-5.2 via https://api.z.ai/api/anthropic
+  OK (200) in 412ms
+```
+### `doctor`
+Check your environment for correct setup. Reports on environment variables, binary availability, and config files.
+```bash
+uv run glm-launch doctor
+```
+**Checks performed:**
+- **Environment variables** — Whether `GLM_BASE_URL`, `GLM_API_KEY`, `GLM_AUTH_TOKEN`, `API_TIMEOUT_MS`, and the `ANTHROPIC_DEFAULT_*_MODEL` vars are set. Secrets are masked in output.
+- **Binaries** — Whether `claude`, `codex`, and `opencode` are found on PATH (with fallback to `~/.claude/local/claude` for claude).
+- **Config files** — Whether `~/.config/opencode/opencode.json` and `~/.local/state/opencode/model.json` exist.
+Exits with code 1 if any binary is missing, 0 otherwise.
+**Example output:**
+```
+Environment variables:
+  GLM_BASE_URL: (not set)
+  GLM_API_KEY: (not set)
+  GLM_AUTH_TOKEN: (not set)
+  API_TIMEOUT_MS: (not set)
+  ANTHROPIC_DEFAULT_HAIKU_MODEL: (not set)
+  ANTHROPIC_DEFAULT_SONNET_MODEL: (not set)
+  ANTHROPIC_DEFAULT_OPUS_MODEL: (not set)
+Binaries:
+  claude: /usr/local/bin/claude
+  codex: /usr/local/bin/codex
+  opencode: /usr/local/bin/opencode
+Config files:
+  /home/user/.config/opencode/opencode.json: exists
+  /home/user/.local/state/opencode/model.json: not found
+All checks passed.
+```
+## Environment variables
+| Variable | Used by | Description |
+|----------|---------|-------------|
+| `GLM_BASE_URL` | `launch claude`, `launch opencode`, `shell`, `bench` | API base URL |
+| `GLM_API_KEY` | `launch claude`, `shell` | API key |
+| `GLM_AUTH_TOKEN` | `launch claude`, `shell`, `bench`, `models --remote` | Z.AI auth token (required) |
+| `GLM_MODELS_URL` | `models --remote` | PaaS models endpoint |
+| `API_TIMEOUT_MS` | `launch claude`, `shell` | Request timeout in milliseconds |
+| `ANTHROPIC_DEFAULT_HAIKU_MODEL` | `launch claude`, `shell` | Model for Haiku-tier requests |
+| `ANTHROPIC_DEFAULT_SONNET_MODEL` | `launch claude`, `shell` | Model for Sonnet-tier requests |
+| `ANTHROPIC_DEFAULT_OPUS_MODEL` | `launch claude`, `shell` | Model for Opus-tier requests |
+| `CLAUDE_CODE_SUBAGENT_MODEL` | `launch claude`, `shell` | Model used for spawned subagents |
+| `CLAUDE_CODE_EFFORT_LEVEL` | `launch claude`, `shell` | Effort level for the agent loop |
+| `CLAUDE_CODE_ATTRIBUTION_HEADER` | `launch claude`, `shell` | Attribution header toggle (`0` disables it) |
+| `CLAUDE_CODE_AUTO_COMPACT_WINDOW` | `launch claude`, `shell` | Auto-compact context window in tokens |
+## How it works
+Each provider follows the same pattern:
+1. Resolve the binary on PATH (with optional fallback path)
+2. Set up configuration (env vars for claude, config files for opencode, flags for codex)
+3. `os.execvpe()` the binary, fully replacing the `glm-launch` process with the underlying tool for direct stdio passthrough
+For Claude specifically, Z.AI exposes an Anthropic-compatible endpoint at `https://api.z.ai/api/anthropic`, so no local proxy is needed. The CLI sets the standard `ANTHROPIC_*` env vars and Claude Code talks directly to Z.AI.
+## Development
+Common tasks are wrapped in a [`justfile`](https://github.com/casey/just). Run `just` with no arguments to list them.
+| Recipe | Description |
+|--------|-------------|
+| `just bootstrap` | Upgrade `pip`/`uv`, then `uv sync` |
+| `just sync` | `uv sync` the project dependencies |
+| `just lock` | `uv lock` the dependency versions |
+| `just build` | `uv build` the wheel and sdist |
+| `just publish` | `uv publish` to PyPI |
+| `just bump *ARGS` | Bump the CalVer version with `bumpver` (e.g. `just bump`) |
+| `just bump-dry *ARGS` | Preview a version bump without writing changes |
+| `just lint *ARGS` | Run the [prek](https://github.com/j178/prek) hooks (defaults to `--all-files`) |
+| `just fmt` | Format the `justfile` itself |
+| `just demo` | Smoke-test the CLI by listing models |
+Versioning follows [CalVer](https://calver.org/) (`YYYY.MM.INC1`), and lint hooks (ruff, pyupgrade, validate-pyproject) are configured in `.pre-commit-config.yaml` and run with `prek`.
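
The three-step pattern described under "How it works" in the README above (resolve the binary, prepare the environment, then exec) can be sketched in Python roughly as follows. This is an illustrative sketch, not the package's code: `plan_launch` and `exec_tool` are hypothetical names, and planning is kept separate from the final `os.execvpe()` call so the plan can be inspected without replacing the current process.

```python
from __future__ import annotations

import os
import shutil


def plan_launch(
    name: str,
    overrides: dict[str, str],
    extra_args: list[str],
    base_env: dict[str, str] | None = None,
) -> tuple[list[str], dict[str, str]]:
    """Return (argv, env) for launching *name* (hypothetical helper)."""
    # Step 1: resolve the binary on PATH.
    binary = shutil.which(name)
    if binary is None:
        raise SystemExit(f"{name!r} not found on PATH")
    # Step 2: start from the caller's environment and layer overrides on top.
    env = dict(os.environ if base_env is None else base_env)
    env.update(overrides)
    return [binary, *extra_args], env


def exec_tool(argv: list[str], env: dict[str, str]) -> None:
    """Step 3: replace the current process; does not return on success."""
    os.execvpe(argv[0], argv, env)
```

Because `os.execvpe()` replaces the running Python process rather than spawning a child, the wrapped tool inherits stdin, stdout, and stderr directly, and no wrapper process stays alive.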

glm_launch-2026.6.1.dist-info/RECORD ADDED

@@ -0,0 +1,5 @@
+main.py,sha256=by1EUAZOJiJTaoIw7_b9ARYmUNmnu4U__emkqx_fym8,21489
+glm_launch-2026.6.1.dist-info/METADATA,sha256=QYF_fOa4nqx2qpmDy50RsQ3Ff8EBNt2pQZT_ZbQmBrY,13600
+glm_launch-2026.6.1.dist-info/WHEEL,sha256=mffPy8wBnZQn2VnJUU5jE99KsxaSfiyMHV9Yt0aLVxs,87
+glm_launch-2026.6.1.dist-info/entry_points.txt,sha256=wRXUbEpSSprL99WcU1Us89QOK7L21OjmrQBED3i3mag,40
+glm_launch-2026.6.1.dist-info/RECORD,,
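
Each RECORD row above has the form `path,sha256=<digest>,<size>`. Per the wheel format specification, the digest is the URL-safe Base64 encoding of the SHA-256 hash with trailing `=` padding removed. A minimal sketch for recomputing such a row from a file's bytes follows; `record_entry` is a hypothetical helper name, and it assumes the path contains no commas (which would need CSV quoting).

```python
import base64
import hashlib


def record_entry(path: str, data: bytes) -> str:
    """Format one RECORD row for *path* with contents *data*."""
    digest = hashlib.sha256(data).digest()
    # URL-safe Base64, with the trailing "=" padding stripped.
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"
```

Comparing the row computed for the unpacked `main.py` against the first row of the hunk is one way to check that the file matches what the wheel recorded. The final row lists RECORD itself with empty hash and size fields, since the file cannot contain its own hash.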

glm_launch-2026.6.1.dist-info/WHEEL ADDED

@@ -0,0 +1,4 @@
+Wheel-Version: 1.0
+Generator: hatchling 1.30.1
+Root-Is-Purelib: true
+Tag: py3-none-any

glm_launch-2026.6.1.dist-info/entry_points.txt ADDED

@@ -0,0 +1,2 @@
+[console_scripts]
+glm-launch = main:cli
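
The `console_scripts` entry above maps the `glm-launch` command to the `cli` object in the top-level `main` module. One way to see how such a spec is split is the standard library's `importlib.metadata.EntryPoint`. In this sketch the object is constructed by hand purely for illustration, rather than read from an installed distribution.

```python
from importlib.metadata import EntryPoint

# Hand-built entry point mirroring the line in entry_points.txt.
ep = EntryPoint(name="glm-launch", value="main:cli", group="console_scripts")

print(ep.module)  # the module that would be imported
print(ep.attr)    # the attribute looked up on that module
```

Calling `ep.load()` would import `main` and return its `cli` attribute, which only works once the wheel is installed in the current environment.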

main.py ADDED

@@ -0,0 +1,668 @@
+# /// script
+# requires-python = ">=3.13"
+# dependencies = [
+#     "typer",
+# ]
+# ///
+from __future__ import annotations
+import json
+import os
+import shutil
+import typer
+app = typer.Typer(invoke_without_command=True)
+launch_app = typer.Typer(
+    help="Launch an LLM coding tool with GLM settings.",
+    invoke_without_command=True,
+)
+app.add_typer(launch_app, name="launch")
+@app.callback(invoke_without_command=True)
+def main(ctx: typer.Context) -> None:
+    if ctx.invoked_subcommand is None:
+        print(ctx.get_help())
+        raise typer.Exit()
+@launch_app.callback(invoke_without_command=True)
+def launch_main(ctx: typer.Context) -> None:
+    if ctx.invoked_subcommand is None:
+        print(ctx.get_help())
+        raise typer.Exit()
+# ---------------------------------------------------------------------------
+# Z.ai model registry
+# ---------------------------------------------------------------------------
+# Current Z.ai GLM models (API IDs are lowercase). Kept here so `models` and
+# the help text stay in one place. See https://z.ai/model-api
+ZAI_MODELS: list[tuple[str, str]] = [
+    ("glm-5.2", "Flagship — frontier reasoning, coding, and agentic tasks"),
+    ("glm-5.1", "Long-horizon agentic flagship (200K context)"),
+    ("glm-5", "GLM-5 flagship"),
+    ("glm-5-turbo", "Speed-optimized GLM-5 variant"),
+    ("glm-4.7", "Balanced cost/performance coding model"),
+    ("glm-4.6", "Strong coding model, 200K context"),
+    ("glm-4.5", "Previous-gen general model"),
+    ("glm-4.5-air", "Lightweight, low-cost (good for subagents/haiku tier)"),
+]
+# ---------------------------------------------------------------------------
+# Binary resolution helpers
+# ---------------------------------------------------------------------------
+def _find_binary(name: str, fallback_path: str | None = None) -> str:
+    """Locate *name* on PATH, optionally falling back to *fallback_path*."""
+    found = shutil.which(name)
+    if found:
+        return found
+    if fallback_path:
+        expanded = os.path.expanduser(fallback_path)
+        if os.path.isfile(expanded) and os.access(expanded, os.X_OK):
+            return expanded
+    install_hint = "Install it or ensure it is on your PATH."
+    raise SystemExit(f"{name!r} not found. {install_hint}")
+# ---------------------------------------------------------------------------
+# Claude / GLM environment
+# ---------------------------------------------------------------------------
+def _build_claude_env(
+    *,
+    model: str,
+    base_url: str,
+    api_key: str,
+    auth_token: str,
+    api_timeout_ms: str,
+    default_haiku_model: str,
+    default_sonnet_model: str,
+    default_opus_model: str,
+    subagent_model: str,
+    effort_level: str,
+    attribution_header: str = "0",
+    auto_compact_window: str = "",
+) -> dict[str, str]:
+    """Build the GLM env vars claude needs to talk to Z.ai."""
+    env = {
+        "ANTHROPIC_BASE_URL": base_url,
+        "ANTHROPIC_API_KEY": api_key,
+        "ANTHROPIC_AUTH_TOKEN": auth_token,
+        "API_TIMEOUT_MS": api_timeout_ms,
+        "ANTHROPIC_DEFAULT_HAIKU_MODEL": default_haiku_model,
+        "ANTHROPIC_DEFAULT_SONNET_MODEL": default_sonnet_model,
+        "ANTHROPIC_DEFAULT_OPUS_MODEL": default_opus_model,
+        "CLAUDE_CODE_SUBAGENT_MODEL": subagent_model,
+        "CLAUDE_CODE_EFFORT_LEVEL": effort_level,
+        "CLAUDE_CODE_ATTRIBUTION_HEADER": attribution_header,
+    }
+    if model:
+        env["ANTHROPIC_MODEL"] = model
+    if auto_compact_window:
+        env["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = auto_compact_window
+    return env
+# ---------------------------------------------------------------------------
+# launch claude
+# ---------------------------------------------------------------------------
+@launch_app.command(
+    "claude",
+    context_settings={"allow_extra_args": True, "allow_interspersed_args": False},
+)
+def launch_claude(
+    ctx: typer.Context,
+    model: str = typer.Option(
+        "glm-5.2", "--model", "-m", help="Model name to pass to claude"
+    ),
+    base_url: str = typer.Option(
+        "https://api.z.ai/api/anthropic",
+        "--base-url",
+        envvar="GLM_BASE_URL",
+        help="Base URL for the API endpoint",
+    ),
+    api_key: str = typer.Option(
+        "",
+        "--api-key",
+        envvar="GLM_API_KEY",
+        help="API key",
+    ),
+    auth_token: str = typer.Option(
+        ...,
+        "--auth-token",
+        envvar="GLM_AUTH_TOKEN",
+        help="Auth token",
+    ),
+    api_timeout_ms: str = typer.Option(
+        "3000000",
+        "--api-timeout-ms",
+        envvar="API_TIMEOUT_MS",
+        help="API request timeout in milliseconds",
+    ),
+    default_haiku_model: str = typer.Option(
+        "glm-4.5-air",
+        "--default-haiku-model",
+        envvar="ANTHROPIC_DEFAULT_HAIKU_MODEL",
+        help="Default model for Haiku-tier requests",
+    ),
+    default_sonnet_model: str = typer.Option(
+        "glm-5.2",
+        "--default-sonnet-model",
+        envvar="ANTHROPIC_DEFAULT_SONNET_MODEL",
+        help="Default model for Sonnet-tier requests",
+    ),
+    default_opus_model: str = typer.Option(
+        "glm-5.2",
+        "--default-opus-model",
+        envvar="ANTHROPIC_DEFAULT_OPUS_MODEL",
+        help="Default model for Opus-tier requests",
+    ),
+    subagent_model: str = typer.Option(
+        "glm-4.5-air",
+        "--subagent-model",
+        envvar="CLAUDE_CODE_SUBAGENT_MODEL",
+        help="Model used for spawned subagents",
+    ),
+    effort_level: str = typer.Option(
+        "max",
+        "--effort-level",
+        envvar="CLAUDE_CODE_EFFORT_LEVEL",
+        help="Effort level (e.g. max)",
+    ),
+    attribution_header: str = typer.Option(
+        "0",
+        "--attribution-header",
+        envvar="CLAUDE_CODE_ATTRIBUTION_HEADER",
+        help="Attribution header toggle (0 disables it)",
+    ),
+    auto_compact_window: str = typer.Option(
+        "200000",
+        "--auto-compact-window",
+        envvar="CLAUDE_CODE_AUTO_COMPACT_WINDOW",
+        help="Auto-compact context window (token count); empty to leave unset",
+    ),
+) -> None:
+    """Launch claude with GLM environment settings."""
+    binary = _find_binary("claude", "~/.claude/local/claude")
+    env = os.environ.copy()
+    env.update(
+        _build_claude_env(
+            model=model,
+            base_url=base_url,
+            api_key=api_key,
+            auth_token=auth_token,
+            api_timeout_ms=api_timeout_ms,
+            default_haiku_model=default_haiku_model,
+            default_sonnet_model=default_sonnet_model,
+            default_opus_model=default_opus_model,
+            subagent_model=subagent_model,
+            effort_level=effort_level,
+            attribution_header=attribution_header,
+            auto_compact_window=auto_compact_window,
+        )
+    )
+    cmd_args = [binary]
+    if model:
+        cmd_args.extend(["--model", model])
+    cmd_args.extend(ctx.args)
+    os.execvpe(binary, cmd_args, env)
+# ---------------------------------------------------------------------------
+# launch codex
+# ---------------------------------------------------------------------------
+@launch_app.command(
+    "codex",
+    context_settings={"allow_extra_args": True, "allow_interspersed_args": False},
+)
+def launch_codex(
+    ctx: typer.Context,
+    model: str = typer.Option(
+        None, "--model", "-m", help="Model name to pass to codex"
+    ),
+) -> None:
+    """Launch codex with --oss flag for local Ollama usage."""
+    binary = _find_binary("codex")
+    cmd_args = [binary, "--oss"]
+    if model:
+        cmd_args.extend(["-m", model])
+    cmd_args.extend(ctx.args)
+    os.execvpe(binary, cmd_args, os.environ)
+# ---------------------------------------------------------------------------
+# launch opencode
+# ---------------------------------------------------------------------------
+def _write_opencode_config(model: str | None, base_url: str) -> None:
+    """Write (or update) ~/.config/opencode/opencode.json and state file."""
+    config_dir = os.path.expanduser("~/.config/opencode")
+    config_path = os.path.join(config_dir, "opencode.json")
+    # Load existing config or start fresh
+    config: dict = {}
+    if os.path.isfile(config_path):
+        try:
+            with open(config_path) as f:
+                config = json.load(f)
+        except (json.JSONDecodeError, OSError):
+            pass
+    config.setdefault("$schema", "https://opencode.ai/config.json")
+    providers = config.setdefault("provider", {})
+    ollama = providers.setdefault("ollama", {})
+    ollama["npm"] = "@ai-sdk/openai-compatible"
+    ollama["name"] = "Ollama (local)"
+    ollama.setdefault("options", {})["baseURL"] = base_url
+    models = ollama.setdefault("models", {})
+    if model:
+        models[model] = {"name": model, "_launch": True}
+    os.makedirs(config_dir, mode=0o755, exist_ok=True)
+    with open(config_path, "w") as f:
+        json.dump(config, f, indent=2)
+        f.write("\n")
+    # Update state/model.json with recent model
+    if model:
+        state_dir = os.path.expanduser("~/.local/state/opencode")
+        state_path = os.path.join(state_dir, "model.json")
+        state: dict = {}
+        if os.path.isfile(state_path):
+            try:
+                with open(state_path) as f:
+                    state = json.load(f)
+            except (json.JSONDecodeError, OSError):
+                pass
+        recent: list = state.setdefault("recent", [])
+        state.setdefault("favorite", [])
+        state.setdefault("variant", {})
+        entry = {"providerID": "ollama", "modelID": model}
+        recent = [r for r in recent if r.get("modelID") != model]
+        recent.insert(0, entry)
+        state["recent"] = recent[:10]
+        os.makedirs(state_dir, mode=0o755, exist_ok=True)
+        with open(state_path, "w") as f:
+            json.dump(state, f, indent=2)
+            f.write("\n")
+@launch_app.command(
+    "opencode",
+    context_settings={"allow_extra_args": True, "allow_interspersed_args": False},
+)
+def launch_opencode(
+    ctx: typer.Context,
+    model: str = typer.Option(None, "--model", "-m", help="Model name for opencode"),
+    base_url: str = typer.Option(
+        ...,
+        "--base-url",
+        envvar="GLM_BASE_URL",
+        help="Base URL for the API endpoint",
+    ),
+) -> None:
+    """Launch opencode after writing provider config."""
+    binary = _find_binary("opencode")
+    _write_opencode_config(model, base_url)
+    cmd_args = [binary]
+    cmd_args.extend(ctx.args)
+    os.execvpe(binary, cmd_args, os.environ)
+# ---------------------------------------------------------------------------
+# shell
+# ---------------------------------------------------------------------------
+def _shell_quote(value: str) -> str:
+    """Single-quote a value safely for POSIX shell eval."""
+    return "'" + value.replace("'", "'\"'\"'") + "'"
+@app.command()
+def shell(
+    model: str = typer.Option(
+        "glm-5.2", "--model", "-m", help="Top-level model (ANTHROPIC_MODEL)"
+    ),
+    base_url: str = typer.Option(
+        "https://api.z.ai/api/anthropic",
+        "--base-url",
+        envvar="GLM_BASE_URL",
+        help="Base URL for the API endpoint",
+    ),
+    api_key: str = typer.Option("", "--api-key", envvar="GLM_API_KEY", help="API key"),
+    auth_token: str = typer.Option(
+        ..., "--auth-token", envvar="GLM_AUTH_TOKEN", help="Auth token"
+    ),
+    api_timeout_ms: str = typer.Option(
+        "3000000", "--api-timeout-ms", envvar="API_TIMEOUT_MS"
+    ),
+    default_haiku_model: str = typer.Option(
+        "glm-4.5-air", "--default-haiku-model", envvar="ANTHROPIC_DEFAULT_HAIKU_MODEL"
+    ),
+    default_sonnet_model: str = typer.Option(
+        "glm-5.2", "--default-sonnet-model", envvar="ANTHROPIC_DEFAULT_SONNET_MODEL"
+    ),
+    default_opus_model: str = typer.Option(
+        "glm-5.2", "--default-opus-model", envvar="ANTHROPIC_DEFAULT_OPUS_MODEL"
+    ),
+    subagent_model: str = typer.Option(
+        "glm-4.5-air", "--subagent-model", envvar="CLAUDE_CODE_SUBAGENT_MODEL"
+    ),
+    effort_level: str = typer.Option(
+        "max", "--effort-level", envvar="CLAUDE_CODE_EFFORT_LEVEL"
+    ),
+    attribution_header: str = typer.Option(
+        "0", "--attribution-header", envvar="CLAUDE_CODE_ATTRIBUTION_HEADER"
+    ),
+    auto_compact_window: str = typer.Option(
+        "200000", "--auto-compact-window", envvar="CLAUDE_CODE_AUTO_COMPACT_WINDOW"
+    ),
+) -> None:
+    """Print `export` lines to bootstrap the current shell for Z.ai.
+    Eval the output to configure your shell so a plain `claude` uses Z.ai:
+        eval "$(uv run src/main.py shell)"
+    """
+    env = _build_claude_env(
+        model=model,
+        base_url=base_url,
+        api_key=api_key,
+        auth_token=auth_token,
+        api_timeout_ms=api_timeout_ms,
+        default_haiku_model=default_haiku_model,
+        default_sonnet_model=default_sonnet_model,
+        default_opus_model=default_opus_model,
+        subagent_model=subagent_model,
+        effort_level=effort_level,
+        attribution_header=attribution_header,
+        auto_compact_window=auto_compact_window,
+    )
+    for key, value in env.items():
+        if value:
+            print(f"export {key}={_shell_quote(value)}")
+# ---------------------------------------------------------------------------
+# models
+# ---------------------------------------------------------------------------
+def _fetch_remote_models(models_url: str, auth_token: str, timeout: float) -> list[str]:
+    """Fetch the live model ID list from the Z.ai PaaS /models endpoint."""
+    import urllib.error
+    import urllib.request
+    req = urllib.request.Request(
+        models_url,
+        headers={"Authorization": f"Bearer {auth_token}"},
+        method="GET",
+    )
+    try:
+        with urllib.request.urlopen(req, timeout=timeout) as resp:
+            payload = json.load(resp)
+    except urllib.error.HTTPError as e:
+        body = e.read().decode("utf-8", errors="replace")
+        msg = f"Failed to fetch models ({e.code})"
+        if body:
+            msg += f": {body[:200]}"
+        raise SystemExit(msg)
+    except urllib.error.URLError as e:
+        raise SystemExit(f"Failed to fetch models: {e.reason}")
+    data = payload.get("data", payload) if isinstance(payload, dict) else payload
+    ids = [m.get("id") for m in data if isinstance(m, dict) and m.get("id")]
+    return sorted(ids)
+@app.command()
+def models(
+    remote: bool = typer.Option(
+        False, "--remote", "-r", help="Fetch the live list from the Z.ai API"
+    ),
+    models_url: str = typer.Option(
+        "https://api.z.ai/api/paas/v4/models",
+        "--models-url",
+        envvar="GLM_MODELS_URL",
+        help="PaaS models endpoint (used with --remote)",
+    ),
+    auth_token: str = typer.Option(
+        "",
+        "--auth-token",
+        envvar="GLM_AUTH_TOKEN",
+        help="Auth token (required with --remote)",
+    ),
+    timeout: float = typer.Option(30.0, "--timeout", help="Request timeout in seconds"),
+) -> None:
+    """List Z.ai GLM models (built-in list, or --remote for the live API list)."""
+    if remote:
+        if not auth_token:
+            raise SystemExit(
+                "--remote requires an auth token (--auth-token or GLM_AUTH_TOKEN)."
+            )
+        known = dict(ZAI_MODELS)
+        ids = _fetch_remote_models(models_url, auth_token, timeout)
+        if not ids:
+            print(f"No models returned from {models_url}")
+            return
+        print(f"Z.ai models (live from {models_url}):")
+        width = max(len(model_id) for model_id in ids)
+        for model_id in ids:
+            desc = known.get(model_id, "")
+            print(f"  {model_id.ljust(width)}  {desc}".rstrip())
+        return
+    print("Z.ai GLM models (use the ID in --model):")
+    width = max(len(model_id) for model_id, _ in ZAI_MODELS)
+    for model_id, desc in ZAI_MODELS:
+        print(f"  {model_id.ljust(width)}  {desc}")
+# ---------------------------------------------------------------------------
+# bench
+# ---------------------------------------------------------------------------
+@app.command()
+def bench(
+    model: str = typer.Option("glm-5.2", "--model", "-m", help="Model to benchmark"),
+    base_url: str = typer.Option(
+        "https://api.z.ai/api/anthropic",
+        "--base-url",
+        envvar="GLM_BASE_URL",
+        help="Base URL for the API endpoint",
+    ),
+    auth_token: str = typer.Option(
+        ...,
+        "--auth-token",
+        envvar="GLM_AUTH_TOKEN",
+        help="Auth token for the endpoint",
+    ),
+    timeout: float = typer.Option(30.0, "--timeout", help="Request timeout in seconds"),
+) -> None:
+    """Time a single /v1/messages round-trip against the configured endpoint."""
+    import json
+    import time
+    import urllib.error
+    import urllib.request
+    url = f"{base_url.rstrip('/')}/v1/messages"
+    payload = json.dumps(
+        {
+            "model": model,
+            "max_tokens": 32,
+            "messages": [{"role": "user", "content": "Reply: ok"}],
+        }
+    ).encode()
+    req = urllib.request.Request(
+        url,
+        data=payload,
+        headers={
+            "x-api-key": auth_token,
+            "content-type": "application/json",
+            "anthropic-version": "2023-06-01",
+        },
+        method="POST",
+    )
+    print(f"  {model} via {base_url}")
+    start = time.monotonic()
+    try:
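+        with urllib.request.urlopen(req, timeout=timeout) as resp:
+            # urlopen returns once the status line and headers have arrived,
+            # so the elapsed time does not include reading the response body.
+            elapsed_ms = int((time.monotonic() - start) * 1000)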
+            print(f"  OK ({resp.status}) in {elapsed_ms}ms")
+    except urllib.error.HTTPError as e:
+        elapsed_ms = int((time.monotonic() - start) * 1000)
+        body = e.read().decode("utf-8", errors="replace")
+        print(f"  FAIL ({e.code}) in {elapsed_ms}ms")
+        if body:
+            print(f"  {body[:200]}")
+        raise typer.Exit(code=1)
+    except urllib.error.URLError as e:
+        elapsed_ms = int((time.monotonic() - start) * 1000)
+        print(f"  FAIL ({e.reason}) in {elapsed_ms}ms")
+        raise typer.Exit(code=1)
+# ---------------------------------------------------------------------------
+# doctor
+# ---------------------------------------------------------------------------
+_CLAUDE_ENV_VARS = [
+    "GLM_BASE_URL",
+    "GLM_API_KEY",
+    "GLM_AUTH_TOKEN",
+    "API_TIMEOUT_MS",
+    "ANTHROPIC_DEFAULT_HAIKU_MODEL",
+    "ANTHROPIC_DEFAULT_SONNET_MODEL",
+    "ANTHROPIC_DEFAULT_OPUS_MODEL",
+    "CLAUDE_CODE_SUBAGENT_MODEL",
+    "CLAUDE_CODE_EFFORT_LEVEL",
+]
+# (binary name, fallback path to try when the name is not on PATH, or None)
+_BINARIES = [
+    ("claude", "~/.claude/local/claude"),
+    ("codex", None),
+    ("opencode", None),
+]
+_SECRET_VARS = {"GLM_API_KEY", "GLM_AUTH_TOKEN"}
+def _mask(value: str) -> str:
+    """Mask a secret: keep the first and last 4 chars (first 2 only if len <= 10)."""
+    if len(value) <= 10:
+        return value[:2] + "***"
+    return value[:4] + "***" + value[-4:]
+@app.command()
+def doctor() -> None:
+    """Check environment variables and binary availability."""
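+    # Only a missing binary fails the check; unset environment variables and
+    # absent config files are reported but do not change the exit code.
+    ok = True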
+    print("Environment variables:")
+    for var in _CLAUDE_ENV_VARS:
+        value = os.environ.get(var)
+        if value:
+            display = _mask(value) if var in _SECRET_VARS else value
+            print(f"  {var}: {display}")
+        else:
+            print(f"  {var}: (not set)")
+    print()
+    print("Binaries:")
+    for name, fallback in _BINARIES:
+        found = shutil.which(name)
+        if found:
+            print(f"  {name}: {found}")
+        elif fallback:
+            expanded = os.path.expanduser(fallback)
+            if os.path.isfile(expanded) and os.access(expanded, os.X_OK):
+                print(f"  {name}: {expanded} (fallback)")
+            else:
+                print(f"  {name}: NOT FOUND")
+                ok = False
+        else:
+            print(f"  {name}: NOT FOUND")
+            ok = False
+    print()
+    print("Config files:")
+    for config_path in (
+        "~/.config/opencode/opencode.json",
+        "~/.local/state/opencode/model.json",
+    ):
+        resolved = os.path.expanduser(config_path)
+        status = "exists" if os.path.isfile(resolved) else "not found"
+        print(f"  {resolved}: {status}")
+    print()
+    if ok:
+        print("All checks passed.")
+    else:
+        print("Some checks failed. See above for details.")
+        raise typer.Exit(code=1)
+# ---------------------------------------------------------------------------
+# Top-level provider aliases
+# ---------------------------------------------------------------------------
+# Expose providers at the top level so `glm-launch claude` works the same as
+# `glm-launch launch claude`. The `launch` group is kept for backwards compat.
+_PROVIDER_CTX = {"allow_extra_args": True, "allow_interspersed_args": False}
+app.command("claude", context_settings=_PROVIDER_CTX)(launch_claude)
+app.command("codex", context_settings=_PROVIDER_CTX)(launch_codex)
+app.command("opencode", context_settings=_PROVIDER_CTX)(launch_opencode)
+# ---------------------------------------------------------------------------
+# entry point
+# ---------------------------------------------------------------------------
+def cli() -> None:
+    """Run the app, defaulting to the `claude` provider when no command is given."""
+    import sys
+    if len(sys.argv) == 1:
+        sys.argv.append("claude")
+    app()
+if __name__ == "__main__":
+    cli()