@qwen-code/qwen-code 0.14.0-preview.2 → 0.14.0-preview.4

@@ -0,0 +1,61 @@
1
+ ---
2
+ name: loop
3
+ description: Create a recurring loop that runs a prompt on a schedule. Usage - /loop 5m check the build, /loop check the PR every 30m, /loop run tests (defaults to 10m). /loop list to show jobs, /loop clear to cancel all.
4
+ allowedTools:
5
+ - cron_create
6
+ - cron_list
7
+ - cron_delete
8
+ ---
9
+
10
+ # /loop — schedule a recurring prompt
11
+
12
+ ## Subcommands
13
+
14
+ If the input (after stripping the `/loop` prefix) is exactly one of these keywords, run the subcommand instead of scheduling:
15
+
16
+ - **`list`** — call CronList and display the results. Done.
17
+ - **`clear`** — call CronList, then call CronDelete for every job returned. Confirm how many were cancelled. Done.
18
+
19
+ Otherwise, parse the input below into `[interval] <prompt…>` and schedule it with CronCreate.
20
+
21
+ ## Parsing (in priority order)
22
+
23
+ 1. **Leading token**: if the first whitespace-delimited token matches `^\d+[smhd]$` (e.g. `5m`, `2h`), that's the interval; the rest is the prompt.
24
+ 2. **Trailing "every" clause**: otherwise, if the input ends with `every <N><unit>` or `every <N> <unit-word>` (e.g. `every 20m`, `every 5 minutes`, `every 2 hours`), extract that as the interval and strip it from the prompt. Only match when what follows "every" is a time expression — `check every PR` has no interval.
25
+ 3. **Default**: otherwise, interval is `10m` and the entire input is the prompt.
26
+
27
+ If the resulting prompt is empty, show usage `/loop [interval] <prompt>` and stop — do not call CronCreate.
28
+
29
+ Examples:
30
+
31
+ - `5m /babysit-prs` → interval `5m`, prompt `/babysit-prs` (rule 1)
32
+ - `check the deploy every 20m` → interval `20m`, prompt `check the deploy` (rule 2)
33
+ - `run tests every 5 minutes` → interval `5m`, prompt `run tests` (rule 2)
34
+ - `check the deploy` → interval `10m`, prompt `check the deploy` (rule 3)
35
+ - `check every PR` → interval `10m`, prompt `check every PR` (rule 3 — "every" not followed by time)
36
+ - `5m` → empty prompt → show usage
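The three parsing rules above can be sketched as a small function. This is only an illustration of the priority order, not the skill's actual implementation (the skill is carried out by the model from these instructions). The name `parse_loop_input` and the list of accepted unit words are assumptions.

```python
import re

# Unit words accepted after "every <N> "; this word list is an assumption.
UNIT_WORDS = {
    "s": "s", "sec": "s", "secs": "s", "second": "s", "seconds": "s",
    "m": "m", "min": "m", "mins": "m", "minute": "m", "minutes": "m",
    "h": "h", "hr": "h", "hrs": "h", "hour": "h", "hours": "h",
    "d": "d", "day": "d", "days": "d",
}

LEADING = re.compile(r"^\d+[smhd]$")
TRAILING = re.compile(
    r"^(?P<prompt>.*?)\s*\bevery\s+(?P<n>\d+)\s*(?P<unit>[a-z]+)$",
    re.IGNORECASE,
)
DEFAULT_INTERVAL = "10m"


def parse_loop_input(text):
    """Return (interval, prompt). An empty prompt means: show usage."""
    text = text.strip()
    tokens = text.split(None, 1)

    # Rule 1: leading token such as "5m" or "2h".
    if tokens and LEADING.match(tokens[0]):
        prompt = tokens[1].strip() if len(tokens) > 1 else ""
        return tokens[0], prompt

    # Rule 2: trailing "every <N><unit>" or "every <N> <unit-word>".
    match = TRAILING.match(text)
    if match:
        unit = UNIT_WORDS.get(match.group("unit").lower())
        if unit:
            return f"{int(match.group('n'))}{unit}", match.group("prompt").strip()

    # Rule 3: default interval; the entire input is the prompt.
    return DEFAULT_INTERVAL, text
```

Because rule 2 requires digits after "every", an input like `check every PR` falls through to rule 3, matching the example list.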
37
+
38
+ ## Interval → cron
39
+
40
+ Supported suffixes: `s` (seconds, rounded up to nearest minute, min 1), `m` (minutes), `h` (hours), `d` (days). Convert:
41
+
42
+ | Interval pattern | Cron expression | Notes |
43
+ | ----------------- | ---------------------- | ----------------------------------------- |
44
+ | `Nm` where N ≤ 59 | `*/N * * * *` | every N minutes |
45
+ | `Nm` where N ≥ 60 | `0 */H * * *` | round to hours (H = N/60, must divide 24) |
46
+ | `Nh` where N ≤ 23 | `0 */N * * *` | every N hours |
47
+ | `Nd` | `0 0 */N * *` | every N days at midnight local |
48
+ | `Ns` | treat as `ceil(N/60)m` | cron minimum granularity is 1 minute |
49
+
50
+ **If the interval doesn't cleanly divide its unit** (e.g. `7m` → `*/7 * * * *` gives uneven gaps at :56→:00; `90m` → 1.5h which cron can't express), pick the nearest clean interval and tell the user what you rounded to before scheduling.
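A rough sketch of the conversion table and the rounding rule follows. The function name `interval_to_cron`, the tie-breaking choice (prefer the shorter interval when two clean intervals are equally near), and the decision to reject `s`/`m`/`h` intervals of a day or longer are assumptions; the table above does not specify them.

```python
import math
import re

MINUTE_STEPS = [1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30]  # divisors of 60 below 60
HOUR_STEPS = [1, 2, 3, 4, 6, 8, 12]                    # divisors of 24 below 24
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def interval_to_cron(interval):
    """Return (cron_expression, effective_interval, was_rounded)."""
    match = re.fullmatch(r"(\d+)([smhd])", interval)
    if not match or int(match.group(1)) < 1:
        raise ValueError(f"unsupported interval: {interval!r}")
    n, unit = int(match.group(1)), match.group(2)

    if unit == "d":
        return f"0 0 */{n} * *", f"{n}d", False

    requested = n * UNIT_SECONDS[unit]
    # Seconds round up to whole minutes (minimum 1); m and h are already whole.
    minutes = math.ceil(requested / 60)
    if minutes >= 24 * 60:
        # Assumption: the table does not define this case, so refuse it here.
        raise ValueError("interval of a day or longer: use the d suffix")

    candidates = [(m, f"*/{m} * * * *", f"{m}m") for m in MINUTE_STEPS]
    candidates += [(h * 60, f"0 */{h} * * *", f"{h}h") for h in HOUR_STEPS]
    # Nearest clean interval; on a tie, prefer the shorter one (assumption).
    target, cron, effective = min(
        candidates, key=lambda c: (abs(c[0] - minutes), c[0])
    )
    return cron, effective, target * 60 != requested
```

When the third value is true, the caller would tell the user what the interval was rounded to before scheduling.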
51
+
52
+ ## Action
53
+
54
+ 1. Call CronCreate with:
55
+ - `cron`: the expression from the table above
56
+ - `prompt`: the parsed prompt from above, verbatim (slash commands are passed through unchanged)
57
+ - `recurring`: `true`
58
+ 2. Briefly confirm: what's scheduled, the cron expression, the human-readable cadence, that recurring tasks auto-expire after 3 days, and that the user can cancel sooner with CronDelete (include the job ID).
59
+ 3. **Then immediately execute the parsed prompt now** — don't wait for the first cron fire. If it's a slash command, invoke it via the Skill tool; otherwise act on it directly.
60
+
61
+ ## Input
@@ -1,11 +1,12 @@
1
1
  # Extension Releasing
2
2
 
3
- There are two primary ways of releasing extensions to users:
3
+ There are three primary ways of releasing extensions to users:
4
4
 
5
5
  - [Git repository](#releasing-through-a-git-repository)
6
6
  - [Github Releases](#releasing-through-github-releases)
7
+ - [npm Registry](#releasing-through-npm-registry)
7
8
 
8
- Git repository releases tend to be the simplest and most flexible approach, while GitHub releases can be more efficient on initial install as they are shipped as single archives instead of requiring a git clone which downloads each file individually. Github releases may also contain platform specific archives if you need to ship platform specific binary files.
9
+ Git repository releases tend to be the simplest and most flexible approach, while GitHub releases can be more efficient on initial install because they ship as single archives instead of requiring a git clone, which downloads each file individually. GitHub releases may also contain platform-specific archives if you need to ship platform-specific binary files. npm registry releases are ideal for teams that already use npm for package distribution, especially with private registries.
9
10
 
10
11
  ## Releasing through a git repository
11
12
 
@@ -119,3 +120,85 @@ jobs:
119
120
  release/linux.arm64.my-tool.tar.gz
120
121
  release/win32.arm64.my-tool.zip
121
122
  ```
123
+
124
+ ## Releasing through npm registry
125
+
126
+ You can publish Qwen Code extensions as scoped npm packages (e.g. `@your-org/my-extension`). This is a good fit when:
127
+
128
+ - Your team already uses npm for package distribution
129
+ - You need private registry support with existing auth infrastructure
130
+ - You want version resolution and access control handled by npm
131
+
132
+ ### Package requirements
133
+
134
+ Your npm package must include a `qwen-extension.json` file at the package root. This is the same config file used by all Qwen Code extensions — the npm tarball is simply another delivery mechanism.
135
+
136
+ A minimal package structure looks like:
137
+
138
+ ```
139
+ my-extension/
140
+ ├── package.json
141
+ ├── qwen-extension.json
142
+ ├── QWEN.md # optional context file
143
+ ├── commands/ # optional custom commands
144
+ ├── skills/ # optional custom skills
145
+ └── agents/ # optional custom subagents
146
+ ```
147
+
148
+ Make sure `qwen-extension.json` is included in your published package (i.e. not excluded by `.npmignore` or the `files` field in `package.json`).
149
+
150
+ ### Publishing
151
+
152
+ Use standard npm publishing tools:
153
+
154
+ ```bash
155
+ # Publish to the default registry
156
+ npm publish
157
+
158
+ # Publish to a private/custom registry
159
+ npm publish --registry https://your-registry.com
160
+ ```
161
+
162
+ ### Installation
163
+
164
+ Users install your extension using the scoped package name:
165
+
166
+ ```bash
167
+ # Install latest version
168
+ qwen extensions install @your-org/my-extension
169
+
170
+ # Install a specific version
171
+ qwen extensions install @your-org/my-extension@1.2.0
172
+
173
+ # Install from a custom registry
174
+ qwen extensions install @your-org/my-extension --registry https://your-registry.com
175
+ ```
176
+
177
+ ### Update behavior
178
+
179
+ - Extensions installed without a version pin (e.g. `@scope/pkg`) track the `latest` dist-tag.
180
+ - Extensions installed with a dist-tag (e.g. `@scope/pkg@beta`) track that specific tag.
181
+ - Extensions pinned to an exact version (e.g. `@scope/pkg@1.2.0`) are always considered up-to-date and will not prompt for updates.
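The three update behaviors can be told apart from the install spec alone. The sketch below is illustrative only: `update_policy` is a hypothetical helper, and treating anything that is not an exact semver string as a dist-tag is an assumption (semver ranges such as `^1.2.0` are not covered by the text above).

```python
import re

# Exact semver, optionally with prerelease/build metadata, e.g. 1.2.0 or 1.2.0-beta.1
EXACT_VERSION = re.compile(
    r"^\d+\.\d+\.\d+(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?$"
)


def update_policy(spec):
    """Return (package_name, policy, target) for a scoped install spec."""
    if not spec.startswith("@") or "/" not in spec:
        raise ValueError("only scoped packages (@scope/name) are supported")

    # Split on the first "@" after the leading scope marker.
    name, sep, suffix = spec[1:].partition("@")
    name = "@" + name

    if not sep or not suffix:
        return name, "track-tag", "latest"   # no pin: follow the latest dist-tag
    if EXACT_VERSION.match(suffix):
        return name, "pinned", suffix        # exact version: never prompts for updates
    return name, "track-tag", suffix         # anything else is treated as a dist-tag
```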
182
+
183
+ ### Authentication for private registries
184
+
185
+ Qwen Code reads npm auth credentials automatically:
186
+
187
+ 1. **`NPM_TOKEN` environment variable** — highest priority
188
+ 2. **`.npmrc` file** — supports both host-level and path-scoped `_authToken` entries (e.g. `//your-registry.com/:_authToken=TOKEN` or `//pkgs.dev.azure.com/org/_packaging/feed/npm/registry/:_authToken=TOKEN`)
189
+
190
+ `.npmrc` files are read from the current directory and the user's home directory.
191
+
192
+ ### Managing release channels
193
+
194
+ You can use npm dist-tags to manage release channels:
195
+
196
+ ```bash
197
+ # Publish a beta release
198
+ npm publish --tag beta
199
+
200
+ # Users install beta channel
201
+ qwen extensions install @your-org/my-extension@beta
202
+ ```
203
+
204
+ This works similarly to git branch-based release channels but uses npm's native dist-tag mechanism.
@@ -12,11 +12,11 @@ We offer a suite of extension management tools using both `qwen extensions` CLI
12
12
 
13
13
  You can manage extensions at runtime within the interactive CLI using `/extensions` slash commands. These commands support hot-reloading, meaning changes take effect immediately without restarting the application.
14
14
 
15
- | Command | Description |
16
- | ------------------------------------- | ----------------------------------------------------------------- |
17
- | `/extensions` or `/extensions manage` | Manage all installed extensions |
18
- | `/extensions install <source>` | Install an extension from a git URL, local path, or marketplace |
19
- | `/extensions explore [source]` | Open extensions source page(Gemini or ClaudeCode) in your browser |
15
+ | Command | Description |
16
+ | ------------------------------------- | ---------------------------------------------------------------------------- |
17
+ | `/extensions` or `/extensions manage` | Manage all installed extensions |
18
+ | `/extensions install <source>` | Install an extension from a git URL, local path, npm package, or marketplace |
19
+ | `/extensions explore [source]` | Open extensions source page (Gemini or ClaudeCode) in your browser |
20
20
 
21
21
  ### CLI Extension Management
22
22
 
@@ -89,6 +89,34 @@ Gemini extensions are automatically converted to Qwen Code format during install
89
89
  - TOML command files are automatically migrated to Markdown format
90
90
  - MCP servers, context files, and settings are preserved
91
91
 
92
+ #### From npm Registry
93
+
94
+ Qwen Code supports installing extensions from npm registries using scoped package names. This is ideal for teams with private registries that already have auth, versioning, and publishing infrastructure in place.
95
+
96
+ ```bash
97
+ # Install the latest version
98
+ qwen extensions install @scope/my-extension
99
+
100
+ # Install a specific version
101
+ qwen extensions install @scope/my-extension@1.2.0
102
+
103
+ # Install from a custom registry
104
+ qwen extensions install @scope/my-extension --registry https://your-registry.com
105
+ ```
106
+
107
+ Only scoped packages (`@scope/package-name`) are supported to avoid ambiguity with the `owner/repo` GitHub shorthand format.
108
+
109
+ **Registry resolution** follows this priority:
110
+
111
+ 1. `--registry` CLI flag (explicit override)
112
+ 2. Scoped registry from `.npmrc` (e.g. `@scope:registry=https://...`)
113
+ 3. Default registry from `.npmrc`
114
+ 4. Fallback: `https://registry.npmjs.org/`
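The priority order above can be illustrated with a short sketch. It takes the `.npmrc` contents as a string and ignores how multiple `.npmrc` files are merged, which the text does not describe; `resolve_registry` and its parameters are hypothetical names, not Qwen Code's API.

```python
DEFAULT_REGISTRY = "https://registry.npmjs.org/"


def resolve_registry(package, cli_registry=None, npmrc_text=""):
    """Pick a registry URL for a scoped package, following the stated priority."""
    # 1. An explicit --registry flag always wins.
    if cli_registry:
        return cli_registry

    scope = package.split("/", 1)[0]  # "@scope/name" -> "@scope"
    scoped = None
    default = None
    for raw in npmrc_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == f"{scope}:registry":
            scoped = value    # 2. scoped registry entry
        elif key == "registry":
            default = value   # 3. default registry entry

    # 4. Fall back to the public npm registry.
    return scoped or default or DEFAULT_REGISTRY
```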
115
+
116
+ **Authentication** is handled automatically via the `NPM_TOKEN` environment variable or registry-specific `_authToken` entries in your `.npmrc` file.
117
+
118
+ > **Note:** npm extensions must include a `qwen-extension.json` file at the package root, following the same format as any other Qwen Code extension. See [Extension Releasing](./extension-releasing.md#releasing-through-npm-registry) for packaging details.
119
+
92
120
  #### From Git Repository
93
121
 
94
122
  ```bash
@@ -127,7 +155,7 @@ This is useful if you have an extension disabled at the top-level and only enabl
127
155
 
128
156
  ### Updating an extension
129
157
 
130
- For extensions installed from a local path or a git repository, you can explicitly update to the latest version (as reflected in the `qwen-extension.json` `version` field) with `qwen extensions update extension-name`.
158
+ For extensions installed from a local path, a git repository, or an npm registry, you can explicitly update to the latest version with `qwen extensions update extension-name`. For npm extensions installed without a version pin (e.g. `@scope/pkg`), updates check the `latest` dist-tag. For those installed with a specific dist-tag (e.g. `@scope/pkg@beta`), updates track that tag. Extensions pinned to an exact version (e.g. `@scope/pkg@1.2.0`) are always considered up-to-date.
131
159
 
132
160
  You can update all extensions with:
133
161
 
@@ -15,4 +15,5 @@ export default {
15
15
  language: 'i18n',
16
16
  channels: 'Channels',
17
17
  hooks: 'Hooks',
18
+ 'scheduled-tasks': 'Scheduled Tasks',
18
19
  };
@@ -4,13 +4,18 @@
4
4
 
5
5
  Qwen Code hooks provide a powerful mechanism for extending and customizing the behavior of the Qwen Code application. Hooks allow users to execute custom scripts or programs at specific points in the application lifecycle, such as before tool execution, after tool execution, at session start/end, and during other key events.
6
6
 
7
- > **⚠️ EXPERIMENTAL FEATURE**
8
- >
9
- > Hooks are currently in an experimental stage. To enable hooks, start Qwen Code with the `--experimental-hooks` flag:
10
- >
11
- > ```bash
12
- > qwen --experimental-hooks
13
- > ```
7
+ Hooks are enabled by default. You can temporarily disable all hooks by setting `disableAllHooks` to `true` in your settings file (at the top level, alongside `hooks`):
8
+
9
+ ```json
10
+ {
11
+ "disableAllHooks": true,
12
+ "hooks": {
13
+ "PreToolUse": [...]
14
+ }
15
+ }
16
+ ```
17
+
18
+ This disables all hooks without deleting their configurations.
14
19
 
15
20
  ## What are Hooks?
16
21
 
@@ -0,0 +1,139 @@
1
+ # Run Prompts on a Schedule
2
+
3
+ > Use `/loop` and the cron scheduling tools to run prompts repeatedly, poll for status, or set one-time reminders within a Qwen Code session.
4
+
5
+ Scheduled tasks let Qwen Code re-run a prompt automatically on an interval. Use them to poll a deployment, babysit a PR, check back on a long-running build, or remind yourself to do something later in the session.
6
+
7
+ Tasks are session-scoped: they live in the current Qwen Code process and are gone when you exit. Nothing is written to disk.
8
+
9
+ > **Note:** Scheduled tasks are an experimental feature. Enable them with `experimental.cron: true` in your [settings](../configuration/settings.md), or set `QWEN_CODE_ENABLE_CRON=1` in your environment.
10
+
11
+ ## Schedule a recurring prompt with /loop
12
+
13
+ The `/loop` [bundled skill](skills.md) is the quickest way to schedule a recurring prompt. Pass an optional interval and a prompt, and Qwen Code sets up a cron job that fires in the background while the session stays open.
14
+
15
+ ```text
16
+ /loop 5m check if the deployment finished and tell me what happened
17
+ ```
18
+
19
+ Qwen Code parses the interval, converts it to a cron expression, schedules the job, and confirms the cadence and job ID. It then immediately executes the prompt once — you don't have to wait for the first cron fire.
20
+
21
+ ### Interval syntax
22
+
23
+ Intervals are optional. You can lead with them, trail with them, or leave them out entirely.
24
+
25
+ | Form | Example | Parsed interval |
26
+ | :---------------------- | :------------------------------------ | :--------------------------- |
27
+ | Leading token | `/loop 30m check the build` | every 30 minutes |
28
+ | Trailing `every` clause | `/loop check the build every 2 hours` | every 2 hours |
29
+ | No interval | `/loop check the build` | defaults to every 10 minutes |
30
+
31
+ Supported units are `s` for seconds, `m` for minutes, `h` for hours, and `d` for days. Seconds are rounded up to the nearest minute since cron has one-minute granularity. Intervals that don't divide evenly into their unit, such as `7m` or `90m`, are rounded to the nearest clean interval and Qwen Code tells you what it picked.
32
+
33
+ ### Loop over another command
34
+
35
+ The scheduled prompt can itself be a command or skill invocation. This is useful for re-running a workflow you've already packaged.
36
+
37
+ ```text
38
+ /loop 20m /review-pr 1234
39
+ ```
40
+
41
+ Each time the job fires, Qwen Code runs `/review-pr 1234` as if you had typed it.
42
+
43
+ ### Manage loops
44
+
45
+ `/loop` also supports two subcommands for managing existing jobs:
46
+
47
+ ```text
48
+ /loop list
49
+ ```
50
+
51
+ Lists all scheduled jobs with their IDs and cron expressions.
52
+
53
+ ```text
54
+ /loop clear
55
+ ```
56
+
57
+ Cancels all scheduled jobs at once.
58
+
59
+ ## Set a one-time reminder
60
+
61
+ For one-shot reminders, describe what you want in natural language instead of using `/loop`. Qwen Code schedules a single-fire task that deletes itself after running.
62
+
63
+ ```text
64
+ remind me at 3pm to push the release branch
65
+ ```
66
+
67
+ ```text
68
+ in 45 minutes, check whether the integration tests passed
69
+ ```
70
+
71
+ Qwen Code pins the fire time to a specific minute and hour using a cron expression and confirms when it will fire.
72
+
73
+ ## Manage scheduled tasks
74
+
75
+ Ask Qwen Code in natural language to list or cancel tasks, or reference the underlying tools directly.
76
+
77
+ ```text
78
+ what scheduled tasks do I have?
79
+ ```
80
+
81
+ ```text
82
+ cancel the deploy check job
83
+ ```
84
+
85
+ Under the hood, Qwen Code uses these tools:
86
+
87
+ | Tool | Purpose |
88
+ | :----------- | :-------------------------------------------------------------------------------------------------------------- |
89
+ | `CronCreate` | Schedule a new task. Accepts a 5-field cron expression, the prompt to run, and whether it recurs or fires once. |
90
+ | `CronList` | List all scheduled tasks with their IDs, schedules, and prompts. |
91
+ | `CronDelete` | Cancel a task by ID. |
92
+
93
+ Each scheduled task has an 8-character ID you can pass to `CronDelete`. A session can hold up to 50 scheduled tasks at once.
94
+
95
+ ## How scheduled tasks run
96
+
97
+ The scheduler checks every second for due tasks and enqueues them when the session is idle. A scheduled prompt fires between your turns, not while Qwen Code is mid-response. If Qwen Code is busy when a task comes due, the prompt waits until the current turn ends.
98
+
99
+ All times are interpreted in your local timezone. A cron expression like `0 9 * * *` means 9am wherever you're running Qwen Code, not UTC.
100
+
101
+ ### Jitter
102
+
103
+ To avoid every session hitting the API at the same wall-clock moment, the scheduler adds a small deterministic offset to fire times:
104
+
105
+ - **Recurring tasks** fire up to 10% of their period late, capped at 15 minutes. An hourly job might fire anywhere from `:00` to `:06`.
106
+ - **One-shot tasks** scheduled for the top or bottom of the hour (minute `:00` or `:30`) fire up to 90 seconds early.
107
+
108
+ The offset is derived from the task ID, so the same task always gets the same offset. If exact timing matters, pick a minute that is not `:00` or `:30`, for example `3 9 * * *` instead of `0 9 * * *`, and the one-shot jitter will not apply.
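One way to get a stable, ID-derived offset is to hash the task ID into a fraction and scale it by the allowed window. The sketch below only illustrates the bounds described above; the use of SHA-256 and all function names are assumptions, and the real scheduler may derive its offset differently.

```python
import hashlib


def id_fraction(task_id):
    """Map a task ID to a stable fraction in [0, 1). The hash choice is an assumption."""
    digest = hashlib.sha256(task_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def recurring_delay_seconds(task_id, period_seconds):
    """Late offset for a recurring task: up to 10% of the period, capped at 15 minutes."""
    cap = min(0.10 * period_seconds, 15 * 60)
    return id_fraction(task_id) * cap


def one_shot_lead_seconds(task_id, minute):
    """Early offset for a one-shot task; applies only at minute :00 or :30."""
    if minute not in (0, 30):
        return 0.0
    return id_fraction(task_id) * 90
```

For an hourly job (`period_seconds=3600`) the delay stays within 360 seconds, which corresponds to the `:00` to `:06` window mentioned above.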
109
+
110
+ ### Three-day expiry
111
+
112
+ Recurring tasks automatically expire 3 days after creation. The task fires one final time, then deletes itself. This bounds how long a forgotten loop can run. If you need a recurring task to last longer, cancel and recreate it before it expires.
113
+
114
+ One-shot tasks do not expire on a timer — they simply delete themselves after firing once.
115
+
116
+ ## Cron expression reference
117
+
118
+ `CronCreate` accepts standard 5-field cron expressions: `minute hour day-of-month month day-of-week`. All fields support wildcards (`*`), single values (`5`), steps (`*/15`), ranges (`1-5`), and comma-separated lists (`1,15,30`).
119
+
120
+ | Example | Meaning |
121
+ | :------------- | :--------------------------- |
122
+ | `*/5 * * * *` | Every 5 minutes |
123
+ | `0 * * * *` | Every hour on the hour |
124
+ | `7 * * * *` | Every hour at 7 minutes past |
125
+ | `0 9 * * *` | Every day at 9am local |
126
+ | `0 9 * * 1-5` | Weekdays at 9am local |
127
+ | `30 14 15 3 *` | March 15 at 2:30pm local |
128
+
129
+ Day-of-week uses `0` or `7` for Sunday through `6` for Saturday. When both day-of-month and day-of-week are constrained (neither is `*`), a date matches if either field matches — this follows standard vixie-cron semantics.
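The either-field rule is easy to get wrong, so here is a minimal matcher sketch for the supported syntax (wildcards, single values, steps, ranges, lists). It is not the scheduler's parser; `expand` and `matches` are hypothetical helpers, and "constrained" is taken literally as "the field is not exactly `*`".

```python
from datetime import datetime


def expand(field, lo, hi):
    """Expand one cron field into the set of values it allows."""
    values = set()
    for part in field.split(","):
        base, _, step = part.partition("/")
        step = int(step) if step else 1
        if base == "*":
            start, end = lo, hi
        elif "-" in base:
            a, b = base.split("-")
            start, end = int(a), int(b)
        else:
            start = end = int(base)
        values.update(range(start, end + 1, step))
    return values


def matches(expr, when):
    """Return True if the 5-field expression fires at the given local datetime."""
    minute, hour, dom, month, dow = expr.split()
    if when.minute not in expand(minute, 0, 59):
        return False
    if when.hour not in expand(hour, 0, 23):
        return False
    if when.month not in expand(month, 1, 12):
        return False

    dom_ok = when.day in expand(dom, 1, 31)
    # Both 0 and 7 mean Sunday; isoweekday() gives Monday=1 .. Sunday=7.
    dow_ok = (when.isoweekday() % 7) in {v % 7 for v in expand(dow, 0, 7)}

    if dom != "*" and dow != "*":
        return dom_ok or dow_ok   # both constrained: either field may match
    return dom_ok and dow_ok
```

With `0 9 13 * 5`, this fires at 9am on the 13th of any month and also at 9am on every Friday.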
130
+
131
+ Extended syntax like `L`, `W`, `?`, and name aliases such as `MON` or `JAN` is not supported.
132
+
133
+ ## Limitations
134
+
135
+ Session-scoped scheduling has inherent constraints:
136
+
137
+ - Tasks only fire while Qwen Code is running and idle. Closing the terminal or letting the session exit cancels everything.
138
+ - No catch-up for missed fires. If a task's scheduled time passes while Qwen Code is busy on a long-running request, it fires once when Qwen Code becomes idle, not once per missed interval.
139
+ - No persistence across restarts. Restarting Qwen Code clears all session-scoped tasks.
@@ -98,6 +98,7 @@ Subagents are configured using Markdown files with YAML frontmatter. This format
98
98
  ---
99
99
  name: agent-name
100
100
  description: Brief description of when and how to use this agent
101
+ model: inherit # Optional: inherit or model-id
101
102
  tools:
102
103
  - tool1
103
104
  - tool2
@@ -106,9 +107,17 @@ tools:
106
107
 
107
108
  System prompt content goes here.
108
109
  Multiple paragraphs are supported.
109
- You can use ${variable} templating for dynamic content.
110
110
  ```
111
111
 
112
+ #### Model Selection
113
+
114
+ Use the optional `model` frontmatter field to control which model a subagent uses:
115
+
116
+ - `inherit`: Use the same model as the main conversation
117
+ - Omit the field: Same as `inherit`
118
+ - `glm-5`: Use that model ID with the main conversation's auth type
119
+ - `openai:gpt-4o`: Use a different provider (resolves credentials from env vars)
120
+
112
121
  #### Example Usage
113
122
 
114
123
  ```
@@ -117,12 +126,7 @@ name: project-documenter
117
126
  description: Creates project documentation and README files
118
127
  ---
119
128
 
120
- You are a documentation specialist for the ${project_name} project.
121
-
122
- Your task: ${task_description}
123
-
124
- Working directory: ${current_directory}
125
- Generated on: ${timestamp}
129
+ You are a documentation specialist.
126
130
 
127
131
  Focus on creating clear, comprehensive documentation that helps both
128
132
  new contributors and end users understand the project.
@@ -213,7 +217,7 @@ tools:
213
217
  - web_search
214
218
  ---
215
219
 
216
- You are a technical documentation specialist for ${project_name}.
220
+ You are a technical documentation specialist.
217
221
 
218
222
  Your role is to create clear, comprehensive documentation that serves both
219
223
  developers and end users. Focus on:
@@ -1,11 +1,12 @@
1
1
  ---
2
2
  name: review
3
- description: Review changed code for correctness, security, code quality, and performance. Use when the user asks to review code changes, a PR, or specific files. Invoke with `/review`, `/review <pr-number>`, or `/review <file-path>`.
3
+ description: Review changed code for correctness, security, code quality, and performance. Use when the user asks to review code changes, a PR, or specific files. Invoke with `/review`, `/review <pr-number>`, `/review <file-path>`, or `/review <pr-number> --comment` to post inline comments on the PR.
4
4
  allowedTools:
5
5
  - task
6
6
  - run_shell_command
7
7
  - grep_search
8
8
  - read_file
9
+ - write_file
9
10
  - glob
10
11
  ---
11
12
 
@@ -15,7 +16,11 @@ You are an expert code reviewer. Your job is to review code changes and provide
15
16
 
16
17
  ## Step 1: Determine what to review
17
18
 
18
- Your goal here is to understand the scope of changes so you can dispatch agents effectively in Step 2. Based on the arguments provided:
19
+ Your goal here is to understand the scope of changes so you can dispatch agents effectively in Step 2.
20
+
21
+ First, parse the `--comment` flag: split the arguments by whitespace, and if any token is exactly `--comment` (not a substring match — ignore tokens like `--commentary`), set the comment flag and remove that token from the argument list. If `--comment` is set but the review target is not a PR, warn the user: "Warning: `--comment` flag is ignored because the review target is not a PR." and continue without it.
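The exact-token rule can be illustrated in a few lines. This is a sketch of the described behavior, not code the skill runs; `extract_comment_flag` is a hypothetical name.

```python
def extract_comment_flag(arguments):
    """Return (comment_flag, remaining_tokens) from the raw argument string."""
    tokens = arguments.split()
    # Exact token comparison: "--commentary" is kept as a normal argument.
    comment = "--comment" in tokens
    remaining = [token for token in tokens if token != "--comment"]
    return comment, remaining
```

If the flag is set but the remaining arguments do not identify a PR, the caller would emit the warning above and proceed as if the flag were absent.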
22
+
23
+ Based on the remaining arguments:
19
24
 
20
25
  - **No arguments**: Review local uncommitted changes
21
26
  - Run `git diff` and `git diff --staged` to get all changes
@@ -36,6 +41,20 @@ Launch **four parallel review agents** to analyze the changes from different ang
36
41
 
37
42
  **IMPORTANT**: Do NOT paste the full diff into each agent's prompt — this duplicates it 4x. Instead, give each agent the command to obtain the diff, a concise summary of what the changes are about, and its review focus. Each agent can read files and search the codebase on its own.
38
43
 
44
+ Apply the **Exclusion Criteria** (defined at the end of this document) — do NOT flag anything that matches those criteria.
45
+
46
+ Each agent must return findings in this structured format (one per issue):
47
+
48
+ ```
49
+ - **File:** <file path>:<line number or range>
50
+ - **Issue:** <clear description of the problem>
51
+ - **Impact:** <why it matters>
52
+ - **Suggested fix:** <concrete code suggestion when possible, or "N/A">
53
+ - **Severity:** Critical | Suggestion | Nice to have
54
+ ```
55
+
56
+ If an agent finds no issues in its dimension, it should explicitly return "No issues found."
57
+
39
58
  ### Agent 1: Correctness & Security
40
59
 
41
60
  Focus areas:
@@ -80,15 +99,42 @@ Focus areas:
80
99
  - Unexpected side effects or hidden coupling
81
100
  - Anything else that looks off — trust your instincts
82
101
 
83
- ## Step 3: Restore environment and present findings
102
+ ## Step 2.5: Deduplicate and verify
103
+
104
+ ### Deduplication
105
+
106
+ Before verification, merge findings that refer to the same issue (same file, same line range, same root cause) even if reported by different agents. Keep the most detailed description and note which agents flagged it.
107
+
108
+ ### Independent verification
109
+
110
+ For each **unique** finding, launch an **independent verification agent**. Run verification agents in parallel, but if there are more than 10 unique findings, batch them in groups of 10 to avoid resource exhaustion.
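The batching rule amounts to chunking the list of unique findings. A minimal sketch, assuming findings are held in a list (`batch_findings` is a hypothetical helper):

```python
def batch_findings(findings, size=10):
    """Split findings into consecutive groups of at most `size` items."""
    return [findings[i:i + size] for i in range(0, len(findings), size)]
```

Each group would be verified in parallel, with the next group starting only after the previous one completes.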
111
+
112
+ Each verification agent receives:
84
113
 
85
- If you checked out a PR branch in Step 1, restore the original state first: check out the original branch, `git stash pop` if changes were stashed, and remove the temp file.
114
+ - The finding description (what's wrong, file, line)
115
+ - The command to obtain the diff (as determined in Step 1)
116
+ - Access to read files and search the codebase
86
117
 
87
- Then combine results from all four agents into a single, well-organized review. Use this format:
118
+ Each verification agent must **independently** (without seeing other agents' findings):
119
+
120
+ 1. Read the actual code at the referenced file and line
121
+ 2. Check surrounding context — callers, type definitions, tests, related modules
122
+ 3. Verify the issue is not a false positive — reject if it matches any item in the **Exclusion Criteria**
123
+ 4. Return a verdict:
124
+ - **confirmed** — with severity: Critical, Suggestion, or Nice to have
125
+ - **rejected** — with a one-line reason why it's not a real issue
126
+
127
+ **When uncertain, lean toward rejecting.** The goal is high signal, low noise — it's better to miss a minor suggestion than to report a false positive.
128
+
129
+ **After all verification agents complete:** remove all rejected findings. Only confirmed findings proceed to Step 3.
130
+
131
+ ## Step 3: Present findings
132
+
133
+ Present the confirmed findings from Step 2.5 as a single, well-organized review. Use this format:
88
134
 
89
135
  ### Summary
90
136
 
91
- A 1-2 sentence overview of the changes and overall assessment.
137
+ A 1-2 sentence overview of the changes and overall assessment. Include verification stats: "X findings reported, Y confirmed after independent verification."
92
138
 
93
139
  ### Findings
94
140
 
@@ -113,6 +159,98 @@ One of:
113
159
  - **Request changes** — Has critical issues that need fixing
114
160
  - **Comment** — Has suggestions but no blockers
115
161
 
162
+ ## Step 4: Post PR inline comments (only if `--comment` flag was set)
163
+
164
+ Skip this step if `--comment` was not specified or the review target is not a PR.
165
+
166
+ First, get the repository owner/repo and the PR's HEAD commit SHA:
167
+
168
+ ```bash
169
+ gh repo view --json owner,name --jq '"\(.owner.login)/\(.name)"'
170
+ gh pr view {pr_number} --json headRefOid --jq '.headRefOid'
171
+ ```
172
+
173
+ **Important:** Use `gh pr view --json headRefOid` instead of `git rev-parse HEAD` — the local branch may be behind the remote, and the GitHub API requires the exact remote HEAD SHA. If either command fails, inform the user and skip Step 4.
174
+
175
+ Then, for each confirmed finding, post an **inline comment** on the specific file and line using `gh api`:
176
+
177
+ **Shell safety:** Review content may contain double quotes, `$VAR`, backticks, or other shell-sensitive characters. Do NOT interpolate review text directly into shell arguments. Instead, use a **two-step process**: write the body to a temp file using the `write_file` tool (which bypasses shell interpretation entirely), then reference the file with `-F body=@file` in the shell command.
178
+
179
+ ```
180
+ # Step A: Use write_file tool to create /tmp/pr-comment.txt with content:
181
+ **[{severity}]** {issue description}
182
+
183
+ {suggested fix}
184
+ ```
185
+
186
+ ```bash
187
+ # Step B: Post single-line comment referencing the file:
188
+ gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \
189
+ -F body=@/tmp/pr-comment.txt \
190
+ -f commit_id="{commit_sha}" \
191
+ -f path="{file_path}" \
192
+ -F line={line_number} \
193
+ -f side="RIGHT"
194
+
195
+ # For multi-line findings (e.g., line range 42-50), add start_line and start_side:
196
+ gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \
197
+ -F body=@/tmp/pr-comment.txt \
198
+ -f commit_id="{commit_sha}" \
199
+ -f path="{file_path}" \
200
+ -F start_line={start_line} \
201
+ -F line={end_line} \
202
+ -f start_side="RIGHT" \
203
+ -f side="RIGHT"
204
+ ```
205
+
206
+ Repeat Steps A-B for each finding, overwriting the temp file each time. Clean up the temp file in Step 5.
207
+
208
+ If posting an inline comment fails (e.g., line not part of the diff, auth error), include the finding in the overall review summary comment instead.
209
+
210
+ **Important rules:**
211
+
212
+ - Only post **ONE comment per unique issue** — do not duplicate across lines
213
+ - Keep each comment concise and actionable
214
+ - Include the severity tag (Critical/Suggestion/Nice to have) at the start of each comment
215
+ - Include the suggested fix in the comment body when available
216
+
217
+ After posting all inline comments, use `write_file` to create `/tmp/pr-review-summary.txt` with the summary text, then submit the review using the action that matches the verdict from Step 3:
218
+
219
+ ```bash
220
+ # Submit review with the matching action:
221
+ # If verdict is "Approve":
222
+ gh pr review {pr_number} --approve --body-file /tmp/pr-review-summary.txt
223
+
224
+ # If verdict is "Request changes":
225
+ gh pr review {pr_number} --request-changes --body-file /tmp/pr-review-summary.txt
226
+
227
+ # If verdict is "Comment":
228
+ gh pr review {pr_number} --comment --body-file /tmp/pr-review-summary.txt
229
+ ```
230
+
231
+ If there are **no confirmed findings**:
232
+
233
+ ```bash
234
+ gh pr review {pr_number} --approve --body "No issues found. LGTM! ✅"
235
+ ```
236
+
237
+ ## Step 5: Restore environment
238
+
239
+ If you checked out a PR branch in Step 1, restore the original state now: check out the original branch, `git stash pop` if changes were stashed, and remove all temp files (`/tmp/pr-review-context.md`, `/tmp/pr-comment.txt`, `/tmp/pr-review-summary.txt`).
240
+
241
+ This step runs **after** Step 4 to ensure the PR branch is still checked out when posting inline comments (Step 4 needs the correct commit SHA from the PR branch).
242
+
243
+ ## Exclusion Criteria
244
+
245
+ These criteria apply to both Step 2 (review agents) and Step 2.5 (verification agents). Do NOT flag or confirm any finding that matches:
246
+
247
+ - Pre-existing issues in unchanged code (focus on the diff only)
248
+ - Style, formatting, or naming that matches surrounding codebase conventions
249
+ - Pedantic nitpicks that a senior engineer would not flag
250
+ - Issues that a linter or type checker would catch automatically
251
+ - Subjective "consider doing X" suggestions that aren't real problems
252
+ - Anything you are not sure is a real problem (when in doubt, do NOT report it)
253
+
116
254
  ## Guidelines
117
255
 
118
256
  - Be specific and actionable. Avoid vague feedback like "could be improved."