npm - @qwen-code/qwen-code - Versions diffs - 0.15.12-preview.3 → 0.16.0 - Mend

@qwen-code/qwen-code 0.15.12-preview.3 → 0.16.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (94)

package/bundled/qc-helper/docs/features/sub-agents.md CHANGED

@@ -134,7 +134,7 @@ Subagents are configured using Markdown files with YAML frontmatter. This format
 ---
 name: agent-name
 description: Brief description of when and how to use this agent
-model: inherit # Optional: inherit or model-id
+model: inherit # Optional: inherit, fast, modelId, or authType:modelId
 approvalMode: auto-edit # Optional: default, plan, auto-edit, yolo
 tools:         # Optional: allowlist of tools
   - tool1
@@ -151,10 +151,48 @@ Multiple paragraphs are supported.
 Use the optional `model` frontmatter field to control which model a subagent uses:
-- `inherit`: Use the same model as the main conversation
-- Omit the field: Same as `inherit`
-- `glm-5`: Use that model ID with the main conversation's auth type
-- `openai:gpt-4o`: Use a different provider (resolves credentials from env vars)
+- `inherit`: Use the same model as the main conversation.
+- Omit the field: Same as `inherit`.
+- `fast`: Use the configured `fastModel`. If no valid fast model is configured,
+  the subagent falls back to `inherit`.
+- `glm-5`: Use that model ID. Qwen Code first checks the main conversation's
+  auth type; if the model is not available there, it can resolve the model from
+  another configured provider.
+- `openai:gpt-4o`: Use an explicit provider and model ID. This is useful when a
+  subagent should run on a model registered under a different auth type from the
+  main conversation.
+For example:
+```
+---
+name: fast-reviewer
+description: Reviews small diffs with the configured fast model
+model: fast
+tools:
+  - read_file
+  - grep_search
+---
+```
+```
+---
+name: openai-researcher
+description: Uses an OpenAI-compatible provider for research tasks
+model: openai:gpt-4o
+tools:
+  - read_file
+  - grep_search
+  - glob
+---
+```
+The `fast` selector uses the same `fastModel` setting configured in
+`settings.json` or with `/model --fast`. That setting may itself refer to a
+model under another configured auth type, such as `openai:deepseek-v4-flash`.
+When the selector resolves to another auth type, Qwen Code creates a dedicated
+runtime provider for that subagent request and sends the provider only the bare
+model ID.
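The selector forms listed in this hunk (`inherit`, `fast`, a bare model ID, and `authType:modelId`) can be told apart with a small parser. The following is a minimal TypeScript sketch, not Qwen Code's actual resolver: `ModelSelector` and `parseModelSelector` are hypothetical names, and splitting `authType:modelId` at the first colon is an assumption.

```typescript
// Hypothetical types: these names do not come from the Qwen Code source.
type ModelSelector =
  | { kind: "inherit" }
  | { kind: "fast" }
  | { kind: "model"; modelId: string }
  | { kind: "provider"; authType: string; modelId: string };

// Classify the value of a subagent's `model` frontmatter field.
// Assumption: an `authType:modelId` value is split at the first colon.
function parseModelSelector(raw?: string): ModelSelector {
  const value = (raw ?? "").trim();
  // Omitting the field behaves the same as `inherit`.
  if (value === "" || value === "inherit") return { kind: "inherit" };
  if (value === "fast") return { kind: "fast" };
  const colon = value.indexOf(":");
  if (colon > 0 && colon < value.length - 1) {
    return {
      kind: "provider",
      authType: value.slice(0, colon),
      modelId: value.slice(colon + 1),
    };
  }
  return { kind: "model", modelId: value };
}

console.log(parseModelSelector("openai:gpt-4o"));
console.log(parseModelSelector("glm-5"));
```

The sketch only classifies the string. The fallbacks described above (for example `fast` falling back to `inherit` when no valid fast model is configured) would happen in a later resolution step that this example does not model.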
 #### Permission Mode
@@ -620,6 +658,10 @@ Always follow these standards:
 - **Tool Restrictions**: Use `tools` to limit which tools a subagent can access, or `disallowedTools` to block specific tools while inheriting everything else
 - **Permission Mode**: Subagents inherit their parent's permission mode by default. Plan-mode sessions cannot escalate to auto-edit through delegated agents. Privileged modes (auto-edit, yolo) are blocked in untrusted folders.
+- **Provider Selection**: A subagent with `model: authType:modelId`, or
+  `model: fast` where `fastModel` resolves to another auth type, sends that
+  subagent's model requests to the selected provider. Make sure that provider is
+  appropriate for the subagent's task and data.
 - **Sandboxing**: All tool execution follows the same security model as direct tool use
 - **Audit Trail**: All subagent actions are logged and visible in real time
 - **Access Control**: Project and user-level separation provides appropriate boundaries

package/bundled/qc-helper/docs/qwen-serve.md CHANGED

@@ -12,6 +12,7 @@ Run Qwen Code as a local HTTP daemon so multiple clients (IDE plugins, web UIs,
 - **Reconnect-safe streaming** — SSE with `Last-Event-ID` reconnect lets a client drop and pick up exactly where it left off (within the ring's replay window).
 - **First-responder permissions** — when the agent asks for permission to run a tool, every connected client sees the request; whichever client answers first wins.
 - **One daemon, one workspace** — each `qwen serve` process binds to exactly one workspace at boot (per [#3803](https://github.com/QwenLM/qwen-code/issues/3803) §02). Multi-workspace deployments run one daemon per workspace on separate ports (or behind an orchestrator).
+- **Remote runtime control** ([#4175](https://github.com/QwenLM/qwen-code/issues/4175) PR 17) — change a session's approval mode (`POST /session/:id/approval-mode`), toggle a tool per workspace (`POST /workspace/tools/:name/enable`), scaffold an empty `QWEN.md` (`POST /workspace/init`, mechanical only — does NOT call the model; for AI-fill, follow up with `POST /session/:id/prompt`), or restart a single MCP server with a budget pre-check (`POST /workspace/mcp/:server/restart`). All four are strict-gated — configure `--token` first.
 ## Quickstart
@@ -38,6 +39,53 @@ curl http://127.0.0.1:4170/capabilities
 The `workspaceCwd` field surfaces the bound workspace so clients can pre-flight check + omit `cwd` on `POST /session`.
+The daemon also exposes read-only runtime snapshots for client UIs:
+`GET /workspace/mcp`, `GET /workspace/skills`, `GET /workspace/providers`,
+`GET /workspace/env`, `GET /workspace/preflight`,
+`GET /session/:id/context`, and `GET /session/:id/supported-commands`.
+`GET /workspace/mcp`, `GET /workspace/skills`, and `GET /workspace/providers`
+report the live ACP runtime and do not start the ACP child when idle; an
+idle daemon returns `initialized: false` with an empty snapshot. Once a
+session is alive they switch to `initialized: true` and surface the real
+state.
+`GET /workspace/env` and `GET /workspace/preflight` always answer with
+`initialized: true` regardless of ACP state. `env` never consults ACP
+(daemon-process info only); `preflight` answers daemon-level cells from
+`process.*` and emits `status: 'not_started'` placeholders for ACP-level
+cells when the child is idle.
+`GET /workspace/env` reports the daemon process's runtime, platform, sandbox,
+proxy, and the **presence** (never the value) of whitelisted secret env vars
+such as `OPENAI_API_KEY`. Proxy URLs are stripped of credentials and reduced
+to `host:port` before they hit the wire. The route always answers from the
+daemon process directly and never spawns an ACP child.
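The two redaction rules described for `GET /workspace/env` (proxy URLs reduced to `host:port`, secrets reported by presence only) can be illustrated with a short TypeScript sketch. This is an assumption-laden illustration rather than the daemon's code: `sanitizeProxy`, `secretPresence`, and the `OTHER_KEY` variable are hypothetical, and the sketch assumes proxy values are full URLs with a scheme.

```typescript
// Reduce a proxy URL to `host:port`, dropping any embedded credentials.
// Returns undefined for values the WHATWG URL parser rejects.
function sanitizeProxy(proxyUrl: string): string | undefined {
  try {
    // `host` never includes the userinfo part. Note that the parser omits
    // the port when it equals the scheme default (for example 80 for http).
    return new URL(proxyUrl).host;
  } catch {
    return undefined;
  }
}

// Report only whether each whitelisted variable is set, never its value.
function secretPresence(
  env: Record<string, string | undefined>,
  whitelist: readonly string[],
): Record<string, boolean> {
  const presence: Record<string, boolean> = {};
  for (const name of whitelist) {
    const value = env[name];
    presence[name] = value !== undefined && value !== "";
  }
  return presence;
}

console.log(sanitizeProxy("http://user:secret@proxy.internal:3128"));
console.log(secretPresence({ OPENAI_API_KEY: "sk-example" }, ["OPENAI_API_KEY", "OTHER_KEY"]));
```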
+`GET /workspace/preflight` returns a list of readiness checks. **Daemon-level
+cells** (Node version, CLI entry, workspace directory, ripgrep, git, npm)
+always render. **ACP-level cells** (auth, MCP discovery, skills, providers,
+tool registry, egress) require a live ACP child — when the daemon is idle
+they emit `status: 'not_started'` placeholders rather than spawning ACP just
+to populate them. Failures map to a closed `errorKind` enum (`missing_binary`,
+`auth_env_error`, `init_timeout`, `protocol_error`, `missing_file`,
+`parse_error`, `blocked_egress`) so client UIs can render structured
+remediation.
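Because `errorKind` is a closed enum, a client can map every value to remediation text and let the compiler flag any value it forgot. A minimal TypeScript sketch of that idea follows; the union members are the values listed above, while `PreflightErrorKind`, `remediationFor`, and the message strings are hypothetical and not supplied by the daemon.

```typescript
// The closed set of error kinds named in the paragraph above.
type PreflightErrorKind =
  | "missing_binary"
  | "auth_env_error"
  | "init_timeout"
  | "protocol_error"
  | "missing_file"
  | "parse_error"
  | "blocked_egress";

// Hypothetical remediation text chosen for this example only.
function remediationFor(kind: PreflightErrorKind): string {
  switch (kind) {
    case "missing_binary":
      return "Install the missing tool and make sure it is on PATH.";
    case "auth_env_error":
      return "Check the credentials or environment variables used for auth.";
    case "init_timeout":
      return "Retry once the runtime has finished starting.";
    case "protocol_error":
      return "Check that client and daemon versions are compatible.";
    case "missing_file":
      return "Create or restore the expected file.";
    case "parse_error":
      return "Fix the syntax of the file that failed to parse.";
    case "blocked_egress":
      return "Allow outbound network access or configure a proxy.";
    default: {
      // Compile-time exhaustiveness check: this assignment stops
      // type-checking if a new kind is added to the union but not handled.
      const unreachable: never = kind;
      return unreachable;
    }
  }
}

console.log(remediationFor("blocked_egress"));
```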
+The daemon also exposes workspace file helpers:
+- `GET /file` reads text files and returns a raw-byte `sha256:<hex>` hash.
+- `GET /file/bytes` reads bounded raw byte windows and returns base64 content.
+- `POST /file/write` creates or replaces text files.
+- `POST /file/edit` applies one exact text replacement.
+Write/edit are **strict mutation routes**: even on loopback they require a
+configured bearer token, otherwise they return `token_required`. Replacements
+and edits require the latest `expectedHash` from `GET /file` (or a full-window
+`GET /file/bytes`). `create` never overwrites. Explicit writes to ignored paths
+are allowed but audited. Binary writes, delete/move/mkdir, and recursive parent
+creation are not part of this surface.
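The `expectedHash` requirement is an optimistic-concurrency check: a write or edit only proceeds if the file still has the content the client last read. The TypeScript sketch below shows how a `sha256:<hex>` value over raw bytes can be computed and compared with Node's `crypto` module. `hashBytes` and `isFresh` are hypothetical helper names, and the sketch deliberately does not guess at the request body of `POST /file/edit`.

```typescript
import { createHash } from "node:crypto";

// Format a hash as "sha256:" followed by the hex digest of the raw bytes.
function hashBytes(bytes: Uint8Array): string {
  return "sha256:" + createHash("sha256").update(bytes).digest("hex");
}

// Hypothetical guard: does the caller's expectedHash still match the
// file's current content?
function isFresh(expectedHash: string, currentBytes: Uint8Array): boolean {
  return expectedHash === hashBytes(currentBytes);
}

const encoder = new TextEncoder();
const originalBytes = encoder.encode("abc");
const expectedHash = hashBytes(originalBytes);

console.log(expectedHash);
console.log(isFresh(expectedHash, originalBytes));         // content unchanged
console.log(isFresh(expectedHash, encoder.encode("abd"))); // content changed since the read
```

Hashing the raw bytes rather than a decoded string means two clients agree on the hash regardless of how each one decodes the text.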
 ### 3. Open a session
 ```bash
@@ -97,6 +145,17 @@ qwen serve --hostname 0.0.0.0 --port 4170
 Clients then send `Authorization: Bearer $QWEN_SERVER_TOKEN` on every request. `/health` is exempted **only on loopback binds** so k8s/Compose liveness probes inside the pod (where the daemon listens on `127.0.0.1`) don't need credentials. On non-loopback binds (`--hostname 0.0.0.0` etc.) `/health` requires the token like every other route — otherwise an attacker can probe arbitrary addresses to confirm the daemon's existence. Use `/capabilities` to verify your token is correct end-to-end (it always requires auth):
+> **Hardened loopback (`--require-auth`).** The default loopback no-token behavior is fine for a single-user laptop but unsafe on shared dev hosts, CI runners, or multi-tenant workstations where any local user can `curl 127.0.0.1:4170`. Pass `--require-auth` to make the bearer token mandatory on every route — including `/health` and `/capabilities` — even when bound to `127.0.0.1`. Boot fails without a token. With the flag on, an **unauthenticated** client can't read `/capabilities` to discover that auth is required; the discovery surface is the 401 response body itself. Once authenticated, the `caps.features.require_auth` tag is a post-auth confirmation that the deployment is hardened (useful for audit / compliance UIs):
+>
+> ```bash
+> TOKEN="$(openssl rand -hex 32)"   # keep the token so the curl call below can reuse it
+> qwen serve --require-auth --token "$TOKEN"
+> # → /health, /capabilities, /session, … all require Authorization: Bearer …
+> curl http://127.0.0.1:4170/health
+> # → 401
+> curl -H "Authorization: Bearer $TOKEN" http://127.0.0.1:4170/capabilities | jq '.features | index("require_auth")'
+> # → 13   (or whatever index — non-null after authenticating means the tag is present)
+> ```
 ```bash
 curl -H "Authorization: Bearer $QWEN_SERVER_TOKEN" http://your-host:4170/capabilities
 # → {"v":1,"mode":"http-bridge","features":[...],"modelServices":[],"workspaceCwd":"/path/to/your-project"}
@@ -107,15 +166,19 @@ The token comparison is constant-time (SHA-256 + `crypto.timingSafeEqual`); 401
 ## CLI flags
-| Flag                    | Default         | Purpose                                                                                                                                                                                                                                                                                                                                             |
-| ----------------------- | --------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `--port <n>`            | `4170`          | TCP port. `0` = OS-assigned ephemeral port.                                                                                                                                                                                                                                                                                                         |
-| `--hostname <addr>`     | `127.0.0.1`     | Bind interface. Anything beyond loopback requires a token.                                                                                                                                                                                                                                                                                          |
-| `--token <str>`         | —               | Bearer token. Falls back to `QWEN_SERVER_TOKEN` env var (with leading/trailing whitespace stripped — handy for `$(cat token.txt)`).                                                                                                                                                                                                                 |
-| `--max-sessions <n>`    | `20`            | Cap on concurrent live sessions. New `POST /session` requests that would spawn a fresh child return `503` (with `Retry-After: 5`) when the cap is hit; attaches to existing sessions are NOT counted. Set to `0` to disable. Sized for single-user / small-team usage; raise it if your deployment has the RAM/FD headroom (~30–50 MB per session). |
-| `--workspace <path>`    | `process.cwd()` | Absolute workspace path this daemon binds to (per [#3803](https://github.com/QwenLM/qwen-code/issues/3803) §02 — 1 daemon = 1 workspace). `POST /session` requests with a mismatched `cwd` return `400 workspace_mismatch`. For multi-workspace deployments, run one `qwen serve` per workspace on separate ports.                                  |
-| `--max-connections <n>` | `256`           | Listener-level TCP connection cap (`server.maxConnections`). Bounds raw socket count irrespective of session count — slow / phantom SSE clients get rejected at accept time once full. Raise alongside `--max-sessions` if your deployment expects many SSE subscribers per session.                                                                |
-| `--http-bridge`         | `true`          | Stage 1 mode: one `qwen --acp` child per daemon (bound to one workspace at boot, per [#3803](https://github.com/QwenLM/qwen-code/issues/3803) §02); N sessions multiplex onto that child via ACP `newSession()`. Stage 2 native in-process becomes available later.                                                                                 |
+| Flag                      | Default         | Purpose                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
+| ------------------------- | --------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+| `--port <n>`              | `4170`          | TCP port. `0` = OS-assigned ephemeral port.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
+| `--hostname <addr>`       | `127.0.0.1`     | Bind interface. Anything beyond loopback requires a token.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
+| `--token <str>`           | —               | Bearer token. Falls back to `QWEN_SERVER_TOKEN` env var (with leading/trailing whitespace stripped — handy for `$(cat token.txt)`).                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
+| `--require-auth`          | `false`         | Refuse to start without a bearer token, even on loopback. Hardens the `127.0.0.1` developer default for shared dev hosts / CI runners / multi-tenant workstations where any local user can hit the listener. Boots only with `--token` or `QWEN_SERVER_TOKEN` set; gates `/health` behind the bearer too.                                                                                                                                                                                                                                                                            |
+| `--max-sessions <n>`      | `20`            | Cap on concurrent live sessions. New `POST /session` requests that would spawn a fresh child return `503` (with `Retry-After: 5`) when the cap is hit; attaches to existing sessions are NOT counted. Set to `0` to disable. Sized for single-user / small-team usage; raise it if your deployment has the RAM/FD headroom (~30–50 MB per session).                                                                                                                                                                                                                                  |
+| `--workspace <path>`      | `process.cwd()` | Absolute workspace path this daemon binds to (per [#3803](https://github.com/QwenLM/qwen-code/issues/3803) §02 — 1 daemon = 1 workspace). `POST /session` requests with a mismatched `cwd` return `400 workspace_mismatch`. For multi-workspace deployments, run one `qwen serve` per workspace on separate ports.                                                                                                                                                                                                                                                                   |
+| `--max-connections <n>`   | `256`           | Listener-level TCP connection cap (`server.maxConnections`). Bounds raw socket count irrespective of session count — slow / phantom SSE clients get rejected at accept time once full. Raise alongside `--max-sessions` if your deployment expects many SSE subscribers per session.                                                                                                                                                                                                                                                                                                 |
+| `--event-ring-size <n>`   | `8000`          | Per-session SSE replay ring depth (#3803 §02 target). Sets the backlog available to `GET /session/:id/events` with `Last-Event-ID: N`. Larger = more reconnect headroom at the cost of a few hundred KB extra RAM per session. SDK clients can additionally request a larger per-subscriber backlog cap on a specific subscription via `?maxQueued=N` (range `[16, 2048]`, default 256). Daemons also emit a non-terminal `slow_client_warning` SSE frame at 75% queue fill so clients can drain / reconnect before getting evicted. Pre-flight `caps.features.slow_client_warning`. |
+| `--mcp-client-budget <n>` | —               | Positive integer cap on live MCP clients **per ACP session** (issue [#4175](https://github.com/QwenLM/qwen-code/issues/4175) PR 14 v1; PR 23 graduates this to per-workspace via the shared MCP pool). Combine with `--mcp-budget-mode`. When unset, no accounting-driven enforcement (but `GET /workspace/mcp` still reports `clientCount`). Distinct from claude-code's `MCP_SERVER_CONNECTION_BATCH_SIZE` which gates startup concurrency, not the total client count. Pre-flight `caps.features.mcp_guardrails`.                                                                 |
+| `--mcp-budget-mode <m>`   | `warn` / `off`  | How `--mcp-client-budget` is enforced. `warn` (default when budget set): no refusal, snapshot's `budgets[0].status` flips to `warning` at ≥75% of budget. `enforce`: connects past the cap are refused, per-server cell shows `disabledReason: 'budget'`, deterministic by `mcpServers` declaration order. `off` (default when budget unset): pure observability. Boot rejects `enforce` without a budget.                                                                                                                                                                           |
+| `--http-bridge`           | `true`          | Stage 1 mode: one `qwen --acp` child per daemon (bound to one workspace at boot, per [#3803](https://github.com/QwenLM/qwen-code/issues/3803) §02); N sessions multiplex onto that child via ACP `newSession()`. Stage 2 native in-process becomes available later.                                                                                                                                                                                                                                                                                                                  |
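The interaction between `--mcp-client-budget` and `--mcp-budget-mode` in the table above can be summarized with a small TypeScript model. This is a sketch under stated assumptions, not the daemon's accounting code: `planMcpClients` and `McpPlan` are hypothetical, the `"ok"` status is invented for the example, and applying the 75% warning rule in `enforce` mode when nothing is refused is an assumption.

```typescript
type BudgetMode = "off" | "warn" | "enforce";

interface McpPlan {
  started: string[];
  refused: string[]; // servers that would show disabledReason: 'budget'
  status: "ok" | "warning" | "error"; // "ok" is a placeholder for this sketch
}

// `servers` is assumed to be in `mcpServers` declaration order.
function planMcpClients(
  servers: readonly string[],
  mode: BudgetMode,
  budget?: number,
): McpPlan {
  // No budget, or mode `off`: pure observability, nothing is refused.
  if (budget === undefined || mode === "off") {
    return { started: [...servers], refused: [], status: "ok" };
  }
  // Only `enforce` refuses; servers past the cap lose out in declaration order.
  const started = mode === "enforce" ? servers.slice(0, budget) : [...servers];
  const refused = mode === "enforce" ? servers.slice(budget) : [];

  let status: McpPlan["status"] = "ok";
  if (refused.length > 0) {
    status = "error";
  } else if (started.length >= 0.75 * budget) {
    status = "warning"; // at or above 75% of the budget
  }
  return { started, refused, status };
}

console.log(planMcpClients(["a", "b", "c", "d"], "warn", 4));
console.log(planMcpClients(["a", "b", "c", "d"], "enforce", 3));
```

Starting in `warn` mode and reading the resulting status before switching to `enforce` matches the rollout order the documentation recommends.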
 > **Sizing the load knobs.** `--max-sessions` is the **new-child** cap.
 > Three other layers also limit load — when sizing for a high-concurrency
@@ -135,6 +198,20 @@ The token comparison is constant-time (SHA-256 + `crypto.timingSafeEqual`); 401
 > sizing assumes single-user / small-team load; raise progressively
 > (and watch RSS) for multi-tenant deployments.
+> **MCP client guardrails (issue [#4175](https://github.com/QwenLM/qwen-code/issues/4175) PR 14).** A workspace declaring 30 MCP servers in `mcpServers` will start 30 clients with no upstream cap unless you set one. `--mcp-client-budget=N` caps the live MCP client count; `--mcp-budget-mode={enforce,warn,off}` chooses the behavior. Default is `warn` when a budget is set (snapshot surfaces the warning but no client is refused — useful for measuring real-world fanout before flipping on enforcement). Refused servers under `enforce` mode get `disabledReason: 'budget'` on their per-server cell, and the `budgets[0]` cell shows `status: 'error'` + `errorKind: 'budget_exhausted'`. Slot reservation is by server name and survives reconnects / discovery timeouts — a refused server can't take a slot from a healthy one.
+>
+> ⚠️ **v1 scope: per-session, not per-workspace.** Each ACP session inside the daemon has its own `Config`/`McpClientManager` (created via `newSessionConfig` per session). The budget caps live MCP clients **per session**, not aggregated across all sessions in the workspace. Snapshot at `GET /workspace/mcp` reflects the bootstrap session's view (the cell carries `scope: 'session'` for honesty). If you run 5 concurrent ACP sessions with `--mcp-client-budget=10`, you may have up to 50 live MCP clients across the daemon — the cap holds per session. **Wave 5 PR 23 (shared MCP pool)** introduces a workspace-scoped manager and graduates this to true per-workspace enforcement.
+>
+> ```sh
+> qwen serve --mcp-client-budget=10 --mcp-budget-mode=warn
+> # later, after telemetry shows your real-world distribution:
+> qwen serve --mcp-client-budget=10 --mcp-budget-mode=enforce
+> ```
+>
+> This is **not** the same as claude-code's `MCP_SERVER_CONNECTION_BATCH_SIZE` (which gates startup concurrency); they're orthogonal. PR 23 will add a real shared MCP pool (a `scope: 'workspace'` cell in `budgets[]` alongside the per-session cell); PR 14 v1 is the in-process counter + soft enforcement on the existing per-session manager.
+>
+> **Push events (issue [#4175](https://github.com/QwenLM/qwen-code/issues/4175) PR 14b).** SDK clients subscribed to `GET /session/:id/events` receive typed frames when budget thresholds cross — `mcp_budget_warning` (synthetic, fires once per upward 75% crossing with hysteresis re-arm at 37.5%, advertised via `mcp_guardrail_events`) and `mcp_child_refused_batch` (coalesced once per discovery pass under `enforce` mode; length-1 from `readResource` lazy-spawn refusal). The snapshot at `GET /workspace/mcp` is still the source-of-truth for state-after-reconnect; events are change-edges. Useful when dashboarding in real-time without polling.
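The "fires once per upward 75% crossing with hysteresis re-arm at 37.5%" behavior described for `mcp_budget_warning` is a standard two-threshold trigger. The TypeScript sketch below models it in isolation. `BudgetWarningTrigger` is a hypothetical class, and whether the real re-arm comparison is inclusive at exactly 37.5% is an assumption.

```typescript
// Emits at most one warning per upward crossing of the 75% threshold and
// re-arms only after usage falls back to 37.5% of the budget or lower.
class BudgetWarningTrigger {
  private armed = true;
  private readonly budget: number;

  constructor(budget: number) {
    this.budget = budget;
  }

  // Returns true when a warning event should be emitted for this observation.
  observe(liveClients: number): boolean {
    const ratio = liveClients / this.budget;
    if (this.armed && ratio >= 0.75) {
      this.armed = false; // suppress repeats while usage stays high
      return true;
    }
    if (!this.armed && ratio <= 0.375) {
      this.armed = true; // assumption: the re-arm bound is inclusive
    }
    return false;
  }
}

const trigger = new BudgetWarningTrigger(8);
for (const live of [5, 6, 7, 4, 6, 3, 6]) {
  console.log(live, trigger.observe(live));
}
```

The gap between the two thresholds keeps a client count that hovers around 75% from producing a stream of duplicate warnings.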
 ## Default deployment threat model
 - **127.0.0.1 only** — loopback bind, no auth needed.
@@ -227,7 +304,7 @@ Stage 1's contract is sized for prototyping. Per [#3889 chiga0 downstream-consum
 **Reliability baseline:**
-3. **Client-initiated heartbeat path** — distinguish "agent thinking" from "daemon dead" without waiting for the 15s server heartbeat.
+3. ~~**Client-initiated heartbeat path**~~ — shipped via [#4175](https://github.com/QwenLM/qwen-code/issues/4175) PR 9. `POST /session/:id/heartbeat` records last-seen timestamps on the daemon (capability tag `client_heartbeat`); SDK helpers are `DaemonClient.heartbeat()` / `DaemonSessionClient.heartbeat()`.
 4. **`permission_already_resolved` event** when a vote loses the first-responder race — currently UIs have to infer state from a `404`.
 5. **Larger / per-session-configurable replay ring** — default 4000 covers short drops; mobile / chatty-turn workloads need 8000+ or per-session config.
 6. **`slow_client_warning` event before `client_evicted`** — soft backpressure so well-behaved slow clients can self-throttle (trim render depth, drop chunks) before being terminated.
@@ -300,6 +377,53 @@ The bridge keeps **one channel per daemon** (one daemon per workspace, per §02)
 **Peer agents (Cursor / Continue / Claude Code / OpenCode / Gemini CLI) all do single-process multi-session.** qwen-code matches them at the agent layer; the Stage 1 bridge in this PR makes the same architecture visible over HTTP.
+## Logging in to a remote daemon (issue #4175 PR 21)
+When the daemon runs on a remote pod (no shared display with you), you can still log in to a Qwen account by triggering an OAuth device flow over HTTP. The daemon polls the IdP itself; your job is just to open a URL on whatever device has a browser.
+```bash
+# 1. Start a flow. The daemon contacts the IdP, returns a code + URL.
+curl -X POST http://127.0.0.1:4170/workspace/auth/device-flow \
+  -H "Authorization: Bearer $TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{"providerId":"qwen-oauth"}'
+# → 201 {
+#     "deviceFlowId": "fa07c61b-…",
+#     "userCode": "USER-1",
+#     "verificationUri": "https://chat.qwen.ai/api/v1/oauth2/device",
+#     "verificationUriComplete": "https://chat.qwen.ai/...?user_code=USER-1",
+#     "expiresAt": 1700000600000,
+#     "intervalMs": 5000,
+#     "attached": false
+#   }
+# 2. Visit the URL on your phone / laptop, enter the user code.
+# 3. Poll for completion (or subscribe to SSE for the auth_device_flow_authorized event):
+curl http://127.0.0.1:4170/workspace/auth/device-flow/fa07c61b-… \
+  -H "Authorization: Bearer $TOKEN"
+# → status transitions: pending → authorized
+```
+The TypeScript SDK wraps both steps into a single helper:
+```ts
+import { DaemonClient } from '@qwen-code/sdk';
+const client = new DaemonClient({ baseUrl, token });
+const flow = await client.auth.start({ providerId: 'qwen-oauth' });
+console.log(`Open ${flow.verificationUri}\nCode: ${flow.userCode}`);
+const result = await flow.awaitCompletion({ signal: abortCtrl.signal });
+// result.status === 'authorized'
+```
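For clients that talk to the HTTP routes directly instead of using the SDK, the polling in step 3 can be sketched as a loop that respects `intervalMs` and `expiresAt` from the start response. The TypeScript below is an illustrative sketch only: `pollUntilSettled` is a hypothetical helper, the status source is injected so the example runs without a daemon, and the `"timed_out"` value is a local label rather than a status defined by the protocol.

```typescript
// Poll until the flow leaves the "pending" state or the deadline passes.
// `getStatus` is injected; a real client might implement it with a GET to
// /workspace/auth/device-flow/<deviceFlowId> and read a status field from
// the JSON body (the exact response shape is an assumption here).
async function pollUntilSettled(
  getStatus: () => Promise<string>,
  intervalMs: number,
  expiresAt: number,
  now: () => number = Date.now,
): Promise<string> {
  while (now() < expiresAt) {
    const status = await getStatus();
    if (status !== "pending") return status;
    // Wait the advertised interval before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  return "timed_out"; // local label, not a protocol status
}

// Stand-in status source that reports "pending" twice, then "authorized".
const fakeStatuses = ["pending", "pending", "authorized"];
let fakeCalls = 0;
const fakeGetStatus = async (): Promise<string> =>
  fakeStatuses[fakeCalls++] ?? "authorized";

pollUntilSettled(fakeGetStatus, 10, Date.now() + 5_000).then((status) => {
  console.log(status);
});
```

Subscribing to the `auth_device_flow_authorized` SSE event avoids the polling loop entirely; the loop is mainly useful for simple scripts.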
+**The daemon never opens a browser on your behalf.** Even when running locally, the daemon stays passive — it returns the URL and lets the SDK / user choose where to open it. This is intentional: a daemon on a headless pod that called `xdg-open` would silently fail, masking the actual auth surface. Mirror `gh auth login`'s "Press Enter to open browser" UX in your client.
+**`--require-auth` and dev convenience.** The device-flow routes use the strict mutation gate (PR 15), which means a token-less loopback default returns `401 token_required`. Locally, the simplest way around this during development is `qwen serve --token=dev-token`; you don't need `--require-auth` unless you're hardening the loopback default.
+**Cross-daemon limitation.** `oauth_creds.json` is daemon-shared (`~/.qwen/oauth_creds.json`), so a successful login in daemon A is automatically picked up by daemon B's next token refresh — but daemon B's SDK clients won't receive the `auth_device_flow_authorized` event (events are per-daemon).
+**Cross-client take-over.** Two SDK clients on the same daemon that both `POST /workspace/auth/device-flow` for the same provider get the per-provider singleton: the first call starts a fresh IdP request and returns `attached: false`; the second call returns the EXISTING in-flight entry with `attached: true`. The take-over is recorded on the audit trail (under the second client's `X-Qwen-Client-Id`) but does NOT emit a separate event — both clients eventually observe the SAME `auth_device_flow_authorized` once the user finishes the IdP page. If your UI distinguishes "I started this" from "someone else's flow I joined", branch on the `attached` field returned by `start()`.
 ## What's next
 - **Build a client?** See the [DaemonClient TypeScript quickstart](../developers/examples/daemon-client-quickstart.md) and the [HTTP protocol reference](../developers/qwen-serve-protocol.md).

package/bundled/review/SKILL.md CHANGED

@@ -18,8 +18,10 @@ You are an expert code reviewer. Your job is to review code changes and provide
 **Critical rules (most commonly violated — read these first):**
-1. **Match the language of the PR.** If the PR is in English, ALL your output (terminal + PR comments) MUST be in English. If in Chinese, use Chinese. Do NOT switch languages. For **local reviews** (no PR), if the system prompt includes an output language preference, use that language; otherwise follow the user's input language.
-2. **Step 9: use Create Review API** with `comments` array for inline comments. Do NOT use `gh api .../pulls/.../comments` to post individual comments. See Step 9 for the JSON format.
+1. **For same-repo PR reviews (PR number, or URL whose owner/repo matches a local remote), the worktree is MANDATORY.** After argument parsing and remote detection (early in Step 1), the first command that touches code state MUST be `qwen review fetch-pr`. Do NOT use `gh pr checkout`, `git checkout <branch>`, `git switch`, `git pull`, `git reset --hard`, or any other command that modifies the user's current HEAD or working tree. After `fetch-pr` returns, ALL subsequent reads, linters, builds, tests, and edits MUST happen inside the `worktreePath` it created. Violating this contaminates the user's local branch state. (Cross-repo PRs with no matching remote use lightweight mode and do NOT create a worktree — see Step 1.)
+2. **If `--comment` was specified, Step 8 (Autofix) is SKIPPED entirely.** `--comment` means the user wants inline PR comments posted, not code mutations. Do not ask "Apply auto-fixes? (y/n)" — go straight from Step 7 to Step 9.
+3. **Match the language of the PR.** If the PR is in English, ALL your output (terminal + PR comments) MUST be in English. If in Chinese, use Chinese. Do NOT switch languages. For **local reviews** (no PR), if the system prompt includes an output language preference, use that language; otherwise follow the user's input language.
+4. **Step 9: use Create Review API** with `comments` array for inline comments. Do NOT use `gh api .../pulls/.../comments` to post individual comments. See Step 9 for the JSON format.
 **Design philosophy: Silence is better than noise.** Every comment you make should be worth the reader's time. If you're unsure whether something is a problem, DO NOT MENTION IT. Low-quality feedback causes "cry wolf" fatigue — developers stop reading all AI comments and miss real issues.
@@ -44,6 +46,8 @@ Based on the remaining arguments:
   - If both diffs are empty, inform the user there are no changes to review and stop here — do not proceed to the review agents
 - **PR number or same-repo URL** (e.g., `123` or a URL whose owner/repo matches the current repo — cross-repo URLs are handled by the lightweight mode above):
+  > ⚠️ **MANDATORY worktree flow.** Do NOT use `gh pr checkout`, `git checkout <branch>`, `git switch`, `git pull`, `git reset --hard`, or any other command that changes the user's current HEAD or working tree contents. The ONLY entry point is `qwen review fetch-pr` (below) — it isolates the PR into an ephemeral worktree so the user's local state is never touched. After it returns, every subsequent command in Steps 2-8 MUST operate inside the returned `worktreePath` (e.g. `cd <worktreePath>` first, or pass the path as a `--cwd` / explicit argument).
   - **Run `qwen review fetch-pr`** to set up the working state in one pass — it cleans any stale worktree, fetches the PR HEAD into `qwen-review/pr-<n>`, queries `gh pr view` for metadata, and creates an ephemeral worktree at `.qwen/tmp/review-pr-<n>`:
     ```bash
@@ -442,7 +446,12 @@ If the user responds with "post comments" (or similar intent like "yes post them
 ## Step 8: Autofix
-If there are **Critical** or **Suggestion** findings with clear, unambiguous fixes, offer to auto-apply them.
+**Skip this entire step (do not even ask) if EITHER of the following is true:**
+- `--comment` was specified in the arguments — the user explicitly asked for inline PR comments, not code edits. Go straight to Step 9.
+- The review target is a cross-repo PR running in lightweight mode (no local files to edit).
+Otherwise, if there are **Critical** or **Suggestion** findings with clear, unambiguous fixes, offer to auto-apply them. (If there are no such findings, this step is also a no-op — fall through to Step 9.)
 1. Count the number of auto-fixable findings (those with concrete suggested fixes that can be expressed as file edits).
 2. If there are fixable findings, ask the user:

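The Step 8 gating added in this hunk can be read as a small decision function. The sketch below is illustrative only: `autofixDecision`, its option names (`commentFlag`, `lightweightCrossRepo`), and the shape of a finding (`severity`, `hasConcreteFix`) are assumptions made for this example and do not appear in the package.

```javascript
// Hypothetical helper mirroring the Step 8 (Autofix) rules in the diff above.
// All names here are illustrative assumptions, not part of qwen-code.
function autofixDecision({ commentFlag, lightweightCrossRepo, findings }) {
  // Rule 1: --comment means inline PR comments only, so never offer fixes.
  if (commentFlag) return "skip";
  // Rule 2: lightweight cross-repo reviews have no local files to edit.
  if (lightweightCrossRepo) return "skip";
  // Otherwise only Critical/Suggestion findings with a concrete fix qualify.
  const fixable = findings.filter(
    (f) =>
      (f.severity === "Critical" || f.severity === "Suggestion") &&
      f.hasConcreteFix
  );
  // With nothing fixable the step is a no-op and falls through to Step 9.
  return fixable.length > 0 ? "ask-user" : "no-op";
}
```

Under these assumptions, a review run with `--comment` returns `"skip"` even when fixable Critical findings exist, which matches the "do not even ask" wording of the new text.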
package/chunks/{agent-LIAWUWAO.js → agent-K6OWOMBN.js} RENAMED

@@ -7,32 +7,30 @@ import {
   hasRebuiltToolRegistry,
   rebuildToolRegistryOnOverride,
   resolveSubagentApprovalMode
-} from "./chunk-AJSOD5IR.js";
-import "./chunk-5P5XGNYH.js";
+} from "./chunk-3T4ZT63H.js";
 import "./chunk-K5PGHDBN.js";
 import "./chunk-O4PICXES.js";
 import "./chunk-TW522KN6.js";
 import "./chunk-MLZQVCF3.js";
-import "./chunk-JMZQICAL.js";
-import "./chunk-5QQ5FGTU.js";
-import "./chunk-B7ZL7HUA.js";
+import "./chunk-CAVZVZX6.js";
+import "./chunk-G7YTSRES.js";
+import "./chunk-4AOCVI6J.js";
 import "./chunk-77WXWU44.js";
-import "./chunk-OCC4MZRS.js";
-import "./chunk-CAWKL3UC.js";
-import "./chunk-XLQ4E5PS.js";
-import "./chunk-SYCJMSIJ.js";
+import "./chunk-F23NCRJ2.js";
+import "./chunk-CSWBPY3P.js";
+import "./chunk-WCZWAKFG.js";
 import "./chunk-UWCTAVOD.js";
 import "./chunk-OFEVLU4C.js";
-import "./chunk-CM2IESUE.js";
-import "./chunk-UXW7MYAW.js";
-import "./chunk-G27O2LD2.js";
+import "./chunk-PR4T27R7.js";
+import "./chunk-MAY32HXD.js";
+import "./chunk-D5NTAHYL.js";
 import "./chunk-T4VD6OJ4.js";
 import "./chunk-RDYWTWEM.js";
-import "./chunk-TPGOGCWM.js";
-import "./chunk-FYMSCRHM.js";
-import "./chunk-SQNQIOD5.js";
-import "./chunk-FKVKVE6N.js";
-import "./chunk-GJXIKCKL.js";
+import "./chunk-YJLGXDQJ.js";
+import "./chunk-PVVL5Q3W.js";
+import "./chunk-GGNTZ2NH.js";
+import "./chunk-KXZ4TJB4.js";
+import "./chunk-XP27SJMH.js";
 import "./chunk-E7E2MFYM.js";
 import "./chunk-ZERZSAZL.js";
 import "./chunk-QN5NZ3UQ.js";

package/chunks/{anthropicContentGenerator-4QE6LTVV.js → anthropicContentGenerator-RQJNXJIY.js} RENAMED

@@ -6,7 +6,7 @@ import {
 } from "./chunk-KQIKOTQJ.js";
 import {
   RequestTokenizer
-} from "./chunk-BXNCPI75.js";
+} from "./chunk-DMIMF3CG.js";
 import {
   Blob,
   File,
@@ -16,15 +16,16 @@ import {
 import {
   buildRuntimeFetchOptions,
   redactProxyError
-} from "./chunk-CAWKL3UC.js";
+} from "./chunk-CSWBPY3P.js";
 import {
   CAPPED_DEFAULT_MAX_TOKENS,
   DEFAULT_TIMEOUT,
   convertSchema,
   hasExplicitOutputLimit,
+  runtimeDiagnostics,
   safeJsonParse,
   tokenLimit
-} from "./chunk-UXW7MYAW.js";
+} from "./chunk-MAY32HXD.js";
 import {
   FinishReason,
   GenerateContentResponse
@@ -32,7 +33,7 @@ import {
 import "./chunk-RDYWTWEM.js";
 import {
   createDebugLogger
-} from "./chunk-GJXIKCKL.js";
+} from "./chunk-XP27SJMH.js";
 import "./chunk-E7E2MFYM.js";
 import {
   require_ms
@@ -4810,6 +4811,7 @@ var AnthropicContentGenerator = class {
     let response;
     try {
       const anthropicRequest = await this.buildRequest(request);
+      runtimeDiagnostics.recordAnthropicWireRequest(anthropicRequest);
       const headers = this.buildPerRequestHeaders(anthropicRequest);
       response = await this.client.messages.create(anthropicRequest, {
         signal: request.config?.abortSignal,
@@ -4827,6 +4829,7 @@ var AnthropicContentGenerator = class {
       ...anthropicRequest,
       stream: true
     };
+    runtimeDiagnostics.recordAnthropicWireRequest(streamingRequest);
     let stream;
     try {
       stream = await this.client.messages.create(

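Both hunks above add a `runtimeDiagnostics.recordAnthropicWireRequest(...)` call just before the request is sent, once for the plain request and once for the streaming variant. The implementation of `runtimeDiagnostics` lives in a chunk this diff does not show, so the following is only a guess at the general pattern: a bounded recorder that keeps snapshots of recent wire requests. `createWireRequestRecorder`, `maxEntries`, `latest`, and `size` are hypothetical names.

```javascript
// Hypothetical sketch of a wire-request recorder. The real runtimeDiagnostics
// object is not visible in this diff; only the method name
// recordAnthropicWireRequest is taken from the source.
function createWireRequestRecorder(maxEntries = 5) {
  const entries = [];
  return {
    recordAnthropicWireRequest(request) {
      // Store a JSON snapshot so later mutation of the request object
      // (for example adding `stream: true`) does not change the record.
      entries.push({ at: Date.now(), request: JSON.parse(JSON.stringify(request)) });
      // Keep memory bounded by dropping the oldest entry.
      if (entries.length > maxEntries) entries.shift();
    },
    // Most recently recorded request, or undefined if none was recorded.
    latest() {
      return entries.length > 0 ? entries[entries.length - 1].request : undefined;
    },
    size() {
      return entries.length;
    },
  };
}
```

Recording before `client.messages.create` means the request is captured even if the call throws, which is presumably why the added lines sit ahead of the network call in both code paths.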
package/chunks/{askUserQuestion-QFSCBTUO.js → askUserQuestion-PQPMPNM3.js} RENAMED

@@ -6,10 +6,10 @@ import {
   BaseToolInvocation,
   ToolDisplayNames,
   ToolNames
-} from "./chunk-FYMSCRHM.js";
+} from "./chunk-PVVL5Q3W.js";
 import {
   createDebugLogger
-} from "./chunk-GJXIKCKL.js";
+} from "./chunk-XP27SJMH.js";
 import "./chunk-QWSRH265.js";
 import {
   init_esbuild_shims

package/chunks/{ca-S3XJMT6P.js → ca-UZ7BANMN.js} RENAMED

@@ -146,7 +146,7 @@ var ca_default = {
   "Compresses the context by replacing it with a summary.": "Comprimeix el context substituint-lo per un resum.",
   "open full Qwen Code documentation in your browser": "obrir la documentaci\xF3 completa de Qwen Code al navegador",
   "Configuration not available.": "Configuraci\xF3 no disponible.",
-  "Configure authentication information for login": "Configurar la informaci\xF3 d'autenticaci\xF3 per a iniciar sessi\xF3",
+  "Connect an LLM provider": "Connectar un prove\xEFdor LLM",
   "Copy the last result or code snippet to clipboard": "Copiar l'\xFAltim resultat o fragment de codi al porta-retalls",
   // ============================================================================
   // Ordres - Agents
@@ -826,8 +826,8 @@ var ca_default = {
   "Continue previous conversation": "Continuar la conversa anterior",
   "\u{1F44B} Welcome back! (Last updated: {{timeAgo}})": "\u{1F44B} Benvingut de nou! (Darrera actualitzaci\xF3: {{timeAgo}})",
   "\u{1F3AF} Overall Goal:": "\u{1F3AF} Objectiu general:",
-  "Select Authentication Method": "Seleccioneu el m\xE8tode d'autenticaci\xF3",
-  "You must select an auth method to proceed. Press Ctrl+C again to exit.": "Cal seleccionar un m\xE8tode d'autenticaci\xF3 per continuar. Premeu Ctrl+C de nou per sortir.",
+  "Connect a Provider": "Connectar un prove\xEFdor",
+  "You must connect a provider to proceed. Press Ctrl+C again to exit.": "Cal connectar un prove\xEFdor per continuar. Premeu Ctrl+C de nou per sortir.",
   "Terms of Services and Privacy Notice": "Termes de servei i av\xEDs de privacitat",
   "Qwen OAuth": "Qwen OAuth",
   "Discontinued \u2014 switch to Coding Plan or API Key": "Descontinuat \u2014 canvieu a Coding Plan o API Key",