@trashcodermaker/pi-pr-review-handler 1.1.4 → 1.2.1

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@trashcodermaker/pi-pr-review-handler",
3
- "version": "1.1.4",
3
+ "version": "1.2.1",
4
4
  "description": "Systematically process GitHub PR review comments: triage for validity, fix code, and post replies. Pi package — install with `pi install npm:@trashcodermaker/pi-pr-review-handler`.",
5
5
  "type": "module",
6
6
  "license": "MIT",
@@ -41,7 +41,7 @@ Agent specs live in `agents/` relative to this skill (`agents/triage-agent.md`,
41
41
 
42
42
  | Platform | Dispatch mechanism |
43
43
  |----------|-------------------|
44
- | Pi | inline fallback (no native subtask mechanism) |
44
+ | Pi | `subagent` tool if [pi-subagents](https://www.npmjs.com/package/pi-subagents) is installed, else inline fallback |
45
45
  | Claude Code | Task tool |
46
46
  | Cursor | background agent |
47
47
  | Gemini CLI / OpenCode / others | native subtask mechanism if available |
@@ -49,7 +49,9 @@ Agent specs live in `agents/` relative to this skill (`agents/triage-agent.md`,
49
49
 
50
50
  **Dispatch pattern**: read the relevant agent spec, embed its instructions into the task prompt along with the thread-specific input data (thread info for triage, verdict data for implementation), and launch one subtask per thread. Triage is read-only so subtasks run in parallel; implementation writes files so it runs serially.
51
51
 
52
- **Inline fallback**: if your platform has no subtask mechanism, you (the orchestrator) read each spec and perform its steps yourself, one thread at a time. The specs are written as direct instructions, so inline execution is straightforward.
52
+ **Pi dispatch capability**: Pi does not bundle a subtask mechanism; it depends on the optional `pi-subagents` package (recommended in the project README). To detect it, look in your available tools list for a tool **literally named `subagent`** (the exact tool name, not a description match). If a tool named `subagent` is present, use it (PARALLEL mode for triage, SINGLE mode serially for implementation). If no tool named `subagent` appears in your tool list, fall back to inline execution. Do not assume pi-subagents is installed, and do not error if it is missing; the skill degrades gracefully either way.
53
+
54
+ **Inline fallback**: if the `subagent` tool is not available (e.g. pi-subagents not installed) or the platform has no subtask mechanism, you (the orchestrator) read each spec and perform its steps yourself, one thread at a time. The specs are written as direct instructions, so inline execution is straightforward.
53
55
 
54
56
  ## Phase 0: Setup
55
57
 
@@ -96,11 +98,13 @@ gh api graphql -f query='
96
98
 
97
99
  Filter `isResolved: false`. Include full thread (not just top-level) — follow-up replies often contain the real concern.
98
100
 
101
+ **Deduplicate threads before triage.** Automated reviewers (React Doctor, claude[bot], etc.) often post the same concern multiple times across different reviews, and human reviewers may re-comment after a push. Before dispatching Triage Agents, cluster threads by `(path, line)` and collapse those whose top-level comment bodies share the same core concern (match on the first sentence or the rule identifier like `react-doctor/prefer-useReducer`). Keep one representative thread per cluster, but record all `databaseId`s — Phase 4 posts the reply to **every** original thread in the cluster so no reviewer comment is left unanswered. Report the dedup result to the user (e.g. "8 threads → 3 unique concerns") so the Checkpoint 1 table stays readable.
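The clustering step described above could be sketched roughly as follows. This is a minimal illustration, not the package's implementation: the thread dictionary shape (`databaseId`, `path`, `line`, `body`), the `RULE_ID` pattern, and the function names are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical pattern for rule identifiers such as "react-doctor/prefer-useReducer".
# It can also match unrelated "word/word" text, so treat it as a starting point only.
RULE_ID = re.compile(r"\b[a-z][\w-]*/[\w-]+\b")


def concern_key(body: str) -> str:
    """Return the rule identifier if one is present, else the normalized first sentence."""
    match = RULE_ID.search(body)
    if match:
        return match.group(0).lower()
    first_sentence = re.split(r"(?<=[.!?])\s+", body.strip(), maxsplit=1)[0]
    return " ".join(first_sentence.lower().split())


def dedupe_threads(threads):
    """Cluster threads by (path, line, concern key); keep every databaseId per cluster."""
    clusters = defaultdict(list)
    for thread in threads:
        key = (thread["path"], thread["line"], concern_key(thread["body"]))
        clusters[key].append(thread)
    return [
        {
            "representative": group[0],  # the thread that goes through triage
            "database_ids": [t["databaseId"] for t in group],  # all get a reply in Phase 4
        }
        for group in clusters.values()
    ]
```

A report line for the user can then be built from the two counts, for example `f"{len(threads)} threads → {len(clusters)} unique concerns"`. Matching on the first sentence is deliberately crude; two differently worded comments about the same concern will still land in separate clusters and need a manual merge.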
102
+
99
103
  **Fallback: REST** (when GraphQL unavailable):
100
104
 
101
105
  ```bash
102
106
  gh api repos/{owner}/{repo}/pulls/{pr_number}/comments \
103
- --jq 'group_by(.path, .original_commit_id) | map({
107
+ --jq 'group_by([.path, (.original_commit_id // "HEAD")]) | map({
104
108
  thread_id: .[0].id, path: .[0].path,
105
109
  comments: [.[] | {id, body, user: .user.login, line, in_reply_to_id}]
106
110
  })'
@@ -117,16 +121,44 @@ gh api repos/{owner}/{repo}/pulls/{pr_number}/reviews \
117
121
  --jq 'sort_by(.submitted_at) | reverse | .[:5] | .[] | {id, state, user: .user.login, body}'
118
122
  ```
119
123
 
120
- Include any non-empty review bodies alongside the thread data when presenting to the user in Phase 2.
124
+ Present any non-empty review bodies in Checkpoint 1 (Phase 1) as a
125
+ separate **Review-level feedback** section. These are summary comments
126
+ without line references, so they do not go through the Triage Agent
127
+ automatically — the user decides how to handle each one (ignore / reply
128
+ only / needs code change). If the user marks one as needing a code change,
129
+ convert it into an Implementation task with `path: <overall PR>` and no
130
+ specific line; the Implementation Agent then works from the review body
131
+ text and the PR diff.
132
+
133
+ ### Fetch PR diff
134
+
135
+ Triage needs to see what the PR actually changed — without it, the agent cannot
136
+ distinguish a problem the PR introduced from one that already existed in the
137
+ base branch.
138
+
139
+ ```bash
140
+ gh pr diff {pr_number} > /tmp/pr-{pr_number}.diff
141
+ ```
142
+
143
+ Cache the diff to a temp file. When dispatching each Triage Agent in Phase 1,
144
+ pass only the hunks relevant to that thread's `path` as `pr_diff_context`. If
145
+ the total diff is small (< 500 lines), pass the full diff to every agent for
146
+ broader context.
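Selecting `pr_diff_context` for one thread might look like the sketch below. It assumes the cached file is a git-style unified diff whose per-file sections start with `diff --git a/<old> b/<new>` headers, and that paths contain no characters git would quote; the function name and the `small_diff_lines` parameter are made up for the example.

```python
def diff_context_for(diff_text: str, path: str, small_diff_lines: int = 500) -> str:
    """Return the full diff when it is small, else only the sections for `path`."""
    lines = diff_text.splitlines()
    if len(lines) < small_diff_lines:
        return diff_text  # small PR: every agent gets the whole diff

    selected = []
    keep = False
    for line in lines:
        if line.startswith("diff --git "):
            # A new file section begins; keep it only if its new-side path matches.
            keep = line.endswith(" b/" + path)
        if keep:
            selected.append(line)
    return "\n".join(selected)
```

Typical use would be reading the temp file once and calling the function per thread, e.g. `diff_context_for(open("/tmp/pr-561.diff").read(), thread["path"])`, where `561` stands in for the PR number. An empty result means the PR did not touch that file.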
121
147
 
122
148
  ## Phase 1: Triage (parallel dispatch)
123
149
 
124
150
  ### Dispatch strategy
125
151
 
126
- Triage is read-only — safe to parallelize.
152
+ Triage is read-only — safe to parallelize. First detect dispatch capability:
153
+
154
+ - **Pi with `subagent` tool available** (pi-subagents installed): use PARALLEL mode — spawn one Triage Agent subtask per thread simultaneously. For implementation (Phase 2), use SINGLE mode serially — each fix is a separate subtask, one at a time, because fixes write files.
155
+ - **No `subagent` tool / no subtask mechanism**: run inline, same logic, one thread at a time.
156
+
157
+ Then choose a strategy based on thread count:
127
158
 
128
159
  - **≥3 threads + parallel capability**: spawn one Triage Agent per thread simultaneously
129
160
  - **≤2 threads or no parallel capability**: run inline, same logic
161
+ - **Large PR (> 15 threads)**: batch by file path to keep context manageable. Group threads sharing the same `path` into the same batch (they share `pr_diff_context`, saving tokens). Dispatch one batch at a time, 8–10 threads per batch. Collect verdicts across batches before presenting Checkpoint 1. This avoids spawning dozens of subagents at once, which can hit API rate limits and produce a verdict table too long for the user to review.
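The three strategy rules above can be expressed as a small decision function plus a batch planner. This is a sketch under assumptions: the function names and the thread shape (`path` key) are hypothetical, and the planner only enforces the upper bound of 10 threads per batch rather than the full 8–10 target.

```python
def choose_strategy(thread_count: int, can_parallelize: bool) -> str:
    """Pick a dispatch strategy from thread count and parallel capability."""
    if not can_parallelize or thread_count <= 2:
        return "inline"
    if thread_count > 15:
        return "batched"
    return "parallel"


def plan_batches(threads, max_batch: int = 10):
    """Pack threads into batches of at most max_batch, keeping same-path threads together."""
    by_path = {}
    for thread in threads:
        by_path.setdefault(thread["path"], []).append(thread)

    batches, current = [], []
    for group in by_path.values():
        # A single file with more than max_batch threads is split into chunks.
        for start in range(0, len(group), max_batch):
            chunk = group[start:start + max_batch]
            if current and len(current) + len(chunk) > max_batch:
                batches.append(current)
                current = []
            current.extend(chunk)
    if current:
        batches.append(current)
    return batches
```

Keeping a path's threads in one batch means the same diff hunks are reused across that batch. The greedy packing can leave a batch well under 8 threads when file groups do not fit together evenly, which seems an acceptable trade for simplicity.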
130
162
 
131
163
  If unsure about parallel capability, default to inline.
132
164
 
@@ -142,13 +174,14 @@ comments:
142
174
  - <top-level comment text>
143
175
  - <reply 1, if any>
144
176
  - <reply 2, if any>
177
+ pr_diff_context: <diff hunks for {path} from /tmp/pr-{pr_number}.diff, or full diff if PR is small>
145
178
  ```
146
179
 
147
180
  Embed the Triage Agent spec (`agents/triage-agent.md`) into the task prompt so the subtask has the full role instructions and output format, then append the thread-specific data above. Collect structured verdicts from all agents.
148
181
 
149
182
  ### Checkpoint 1: User confirmation
150
183
 
151
- Present verdicts as a summary table:
184
+ Present thread verdicts as a summary table:
152
185
 
153
186
  | # | File:Line | Reviewer | Summary | Verdict | Affected Files |
154
187
  |---|-----------|----------|---------|---------|----------------|
@@ -156,7 +189,21 @@ Present verdicts as a summary table:
156
189
  | 2 | src/ui/Button.tsx:18 | bob | Rename for clarity | valid-fix | Button.tsx |
157
190
  | 3 | src/api/handler.ts:99 | alice | Add rate limiting | invalid | — |
158
191
 
159
- User can adjust verdicts or skip threads. Proceed with confirmed plan.
192
+ Then present the **Review-level feedback** section (summary comments
193
+ without line references, fetched in Phase 0):
194
+
195
+ | # | Reviewer | State | Body (excerpt) |
196
+ |---|----------|-------|----------------|
197
+ | R1 | alice | CHANGES_REQUESTED | "Overall solid, but the auth module needs a refactor — see thread #1" |
198
+ | R2 | bob | COMMENTED | "Please add tests for the token expiry edge case" |
199
+
200
+ For each review-level item, ask the user to choose:
201
+
202
+ - **Ignore** — no action needed (e.g. summary of already-addressed threads)
203
+ - **Reply only** — draft a clarification in Phase 3, no code change
204
+ - **Needs code change** — convert to an Implementation task: `path: <overall PR>`, no line, `summary: <review body>`, `affected_files: <user-specified or all PR files>`, `suggested_fix: <user describes>`. Dispatch in Phase 2.
205
+
206
+ User can adjust thread verdicts or skip threads. Proceed with confirmed plan.
160
207
 
161
208
  ### Quick exits
162
209
 
@@ -191,6 +238,8 @@ prior_changes: <list of previous fixes in this PR, if any>
191
238
 
192
239
  Embed the Implementation Agent spec (`agents/implementation-agent.md`) into the task prompt so the subtask has the full role instructions, then append the verdict data above.
193
240
 
241
+ **Pi dispatch**: if the `subagent` tool is available, use SINGLE mode — one subtask per fix, awaited in turn (serial). Pass `prior_changes` by collecting each completed subtask's output and appending it to the next subtask's input. If `subagent` is unavailable, execute the Implementation Agent spec inline, one thread at a time.
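The serial chaining of `prior_changes` can be pictured as a simple loop. In this sketch `run_subtask` is a hypothetical callable standing in for whatever launches one Implementation Agent (a `subagent` call in SINGLE mode, or inline execution); its signature and return value are assumptions for illustration.

```python
def run_fixes_serially(verdicts, run_subtask):
    """Run one fix at a time, feeding each subtask the outputs of all earlier ones."""
    prior_changes = []
    results = []
    for verdict in verdicts:
        # Copy the list so a subtask cannot mutate the shared history.
        task_input = {**verdict, "prior_changes": list(prior_changes)}
        output = run_subtask(task_input)  # blocks until this fix is done
        results.append(output)
        prior_changes.append(output)
    return results
```

Because each call completes before the next starts, later fixes always see what earlier fixes changed, which is the reason implementation is not parallelized.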
242
+
194
243
  After each agent completes:
195
244
 
196
245
  ```bash
@@ -202,15 +251,23 @@ If commit is empty (agent made no changes), skip.
202
251
 
203
252
  ### After all fixes
204
253
 
205
- ```bash
206
- npx tsc --noEmit
207
- ```
254
+ Run the project's type checker or equivalent verification. Detect the
255
+ project type and run the matching command — do not assume TypeScript:
208
256
 
209
- If tsc fails:
257
+ | Project marker | Command |
258
+ | --- | --- |
259
+ | `tsconfig.json` | `npx tsc --noEmit` |
260
+ | `package.json` (no tsconfig) | `npm run lint --if-present` and `npm test --if-present` |
261
+ | `pyproject.toml` / `setup.py` | `ruff check .` then `mypy .` (if configured) |
262
+ | `go.mod` | `go build ./...` |
263
+ | `Cargo.toml` | `cargo check` |
264
+ | none recognized | skip; tell user to run the project's check manually |
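One way to read the table is as an ordered lookup, tried top to bottom. The sketch below assumes that ordering and takes the set of file names found in the project root; the function name is hypothetical.

```python
def verification_commands(marker_files):
    """Map project marker files to check commands, trying the table rows in order."""
    names = set(marker_files)
    if "tsconfig.json" in names:
        return ["npx tsc --noEmit"]
    if "package.json" in names:
        return ["npm run lint --if-present", "npm test --if-present"]
    if names & {"pyproject.toml", "setup.py"}:
        # Run mypy only if the project actually configures it.
        return ["ruff check .", "mypy ."]
    if "go.mod" in names:
        return ["go build ./..."]
    if "Cargo.toml" in names:
        return ["cargo check"]
    return []  # nothing recognized: ask the user to run the project's check manually
```

It could be fed with `{p.name for p in pathlib.Path(".").iterdir()}`. A repository that mixes languages (for example both `tsconfig.json` and `go.mod`) only gets the first matching row here; returning every match would be a reasonable variation.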
265
+
266
+ If the check fails:
210
267
 
211
268
  - Identify which commit introduced the error (`git bisect` or check error file paths)
212
269
  - `git revert --no-commit {commit}` → fix the error → recommit
213
- - Re-run tsc until clean
270
+ - Re-run the check until clean
214
271
 
215
272
  Also check:
216
273
 
@@ -236,12 +293,38 @@ Draft one reply per thread, using `git diff origin/{branch}...HEAD` as ground tr
236
293
 
237
294
  **valid-fix (succeeded)**: describe what was changed, referencing the reviewer's concern. If the fix differs from what the reviewer suggested, explain why.
238
295
 
296
+ Examples:
297
+
298
+ > Good (EN): "Added the null check at line 42 as you suggested — `user` can indeed be undefined when the session expires."
299
+ >
300
+ > Good (中文): "已在 42 行加了空值判断,session 过期时 `user` 确实可能为 undefined。"
301
+ >
302
+ > Good (EN, diverged from suggestion): "Your point about rate limiting is valid. Instead of a fixed window I used a token bucket in `rateLimit.ts` — it handles burst traffic better and the existing tests cover it."
303
+
239
304
  **valid-fix (failed)**: acknowledge the concern was valid. Explain why the fix couldn't be applied (type conflict, dependency issue). Suggest next steps if possible ("will address in a follow-up PR"). Don't be apologetic — just factual.
240
305
 
306
+ Examples:
307
+
308
+ > Good (EN): "You're right that this should be typed more strictly, but the `User` interface is shared with the legacy auth module which expects `any`. I'll extract a `StrictUser` type in a follow-up PR to avoid breaking that module here."
309
+ >
310
+ > Good (中文): "这里确实该用更严格的类型,但 `User` 接口被旧 auth 模块共用且依赖 `any`。我会在后续 PR 里拆出 `StrictUser` 类型,避免这里改动波及该模块。"
311
+
241
312
  **valid-nofix**: acknowledge the concern is valid. Explain why no code change is needed. Provide clarification if the reviewer misunderstood the code.
242
313
 
314
+ Examples:
315
+
316
+ > Good (EN): "Fair point on the naming — `handleX` is a bit vague. It's part of the public API documented in `docs/api.md`, so renaming would be a breaking change. I'll add a deprecation alias in the next major."
317
+ >
318
+ > Good (中文): "命名确实不够清晰,不过 `handleX` 是 `docs/api.md` 里记录的公开 API,重命名属于破坏性变更,下个大版本会加弃用别名。"
319
+
243
320
  **invalid**: explain clearly why the premise doesn't apply. Reference specific code that already handles the concern. Acknowledge the reviewer's intent ("I see why you'd think X, but..."). Be respectful but direct — don't hedge if the concern is genuinely wrong.
244
321
 
322
+ Examples:
323
+
324
+ > Good (EN): "I see why you'd think the count could overflow here, but `items` is already capped at `MAX_ITEMS` (line 15) before this loop runs, so `i` stays well within `Number.MAX_SAFE_INTEGER`."
325
+ >
326
+ > Good (中文): "能理解你担心这里 count 溢出,但进入循环前 `items` 已在 15 行被 `MAX_ITEMS` 截断,`i` 始终远小于 `Number.MAX_SAFE_INTEGER`。"
327
+
245
328
  ### Reply guidelines
246
329
 
247
330
  - Match the reviewer's language (Chinese reviewer → Chinese reply)
@@ -261,14 +344,14 @@ Present all reply drafts. User can:
261
344
 
262
345
  ## Phase 4: Post & Push
263
346
 
264
- Post approved replies:
347
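+ Post approved replies. The reply endpoint is a POST request and needs the PR number in the path (`gh api` already defaults to POST when `-f` fields are present, so `-X POST` only makes the method explicit):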
265
348
 
266
349
  ```bash
267
- gh api repos/{owner}/{repo}/pulls/comments/{comment_id}/replies \
350
+ gh api -X POST repos/{owner}/{repo}/pulls/{pr_number}/comments/{comment_id}/replies \
268
351
  -f body='The response text'
269
352
  ```
270
353
 
271
- Reply to the top-level comment of each thread (the one with `databaseId`, not a reply).
354
+ Reply to the top-level comment of each thread (the one with `databaseId`, not a reply). The `{pr_number}` is the PR number (e.g. `561`), and `{comment_id}` is the `databaseId` from Phase 0. Note: the path is `pulls/{pr_number}/comments/...`, **not** `pulls/comments/...` — the latter returns 404.
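When a deduplicated cluster holds several `databaseId`s, the same reply goes to each of them. A minimal sketch of building those commands follows; the function name and argument layout are assumptions, and the commands are only constructed here, not executed.

```python
def reply_commands(owner, repo, pr_number, database_ids, body):
    """Build one `gh api` argv per top-level comment that should receive the reply."""
    return [
        [
            "gh", "api", "-X", "POST",
            f"repos/{owner}/{repo}/pulls/{pr_number}/comments/{comment_id}/replies",
            "-f", f"body={body}",
        ]
        for comment_id in database_ids
    ]
```

Each argv list can be handed to `subprocess.run(cmd, check=True)`. Passing the body as a separate list element avoids shell quoting problems with apostrophes or backticks in the reply text.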
272
355
 
273
356
  Push all commits:
274
357
 
@@ -310,7 +393,7 @@ Output final summary:
310
393
  | API rate limit | Wait for `X-RateLimit-Reset`, retry |
311
394
  | PR closed/merged | Warn user — replies have no effect |
312
395
  | GraphQL unsupported | Fall back to REST with resolved-thread caveat |
313
- | tsc fails after all fixes | Fix before posting replies |
396
+ | Verification check fails after all fixes | Fix before posting replies |
314
397
 
315
398
  ## Key Principles
316
399
 
@@ -23,14 +23,17 @@ comments:
23
23
  - [top-level comment]
24
24
  - [reply 1, if any]
25
25
  - [reply 2, if any]
26
+ pr_diff_context: [diff hunks for this file from the PR, or full diff if PR is small]
26
27
  ```
27
28
 
28
29
  ## Steps
29
30
 
30
- ### 1. Read the referenced code
31
+ ### 1. Read the referenced code and PR diff context
31
32
 
32
33
  Open `{path}` and examine the code at `{line}` plus surrounding context (±20 lines minimum). Understand what the code does, what it depends on, and what depends on it.
33
34
 
35
+ Then read `pr_diff_context` — the diff hunks for this file from the PR. This tells you whether the code the reviewer commented on was introduced by this PR, modified by it, or already existed in the base branch. A concern about code the PR didn't touch is usually out of scope for this review.
36
+
34
37
  ### 2. Read the full thread carefully
35
38
 
36
39
  The top-level comment may be refined, overridden, or clarified by follow-up replies. The actual concern may be in a reply, not the original comment. Weight later comments appropriately — they often represent the reviewer's evolved thinking.