@clipboard-health/ai-rules 2.20.11 → 2.20.13

package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@clipboard-health/ai-rules",
3
- "version": "2.20.11",
3
+ "version": "2.20.13",
4
4
  "description": "Pre-built AI agent rules for consistent coding standards.",
5
5
  "keywords": [
6
6
  "ai",
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: babysit-pr
3
- description: "Watch a PR through CI and review feedback: commit/push, wait for CI, auto-fix high-confidence failures, reply to active review threads, and summarize parsed automated review-body comments with sentinel-tagged comments. Runs one pass against the current branch's PR; pass a PR number or URL to `gh pr checkout` that PR first. Use when the user says 'babysit my PR', 'babysit PR 482', 'watch my PR', 'keep my PR moving', or 'respond to comments'."
3
+ description: "Watch a PR through CI and review feedback: commit/push, wait for CI, auto-fix high-confidence failures, reply to active review threads, address top-level Conversation-tab comments, and summarize automated review-body content with sentinel-tagged comments. Runs one pass against the current branch's PR; pass a PR number or URL to `gh pr checkout` that PR first. Use when the user says 'babysit my PR', 'babysit PR 482', 'watch my PR', 'keep my PR moving', or 'respond to comments'."
4
4
  argument-hint: "[pr-number-or-url]"
5
5
  ---
6
6
 
@@ -25,17 +25,17 @@ This skill always runs exactly one pass. It never waits or repeats internally. F
25
25
 
26
26
  ## Sentinels
27
27
 
28
- The skill uses two HTML-comment sentinels.
28
+ The skill uses two sentinels. Each is a visible footer line wrapped in `<sub>` (a 🤖 mark plus the token in `<code>`).
29
29
 
30
- **Addressed sentinel**: `<!-- babysit-pr:addressed v1 core@3.4.1 -->`. The `core@<X.Y.Z>` suffix records which plugin version produced the reply. Appended on its own line at the end of every reply the skill posts (both thread replies and the review-body summary). This is how the skill knows, on re-runs, which threads and automated review-body comments it already handled. Dedupe matches by the version-agnostic prefix `<!-- babysit-pr:addressed v1` followed by a single space, so pre-versioning sentinels left by earlier plugin versions are still recognized. Grep `babysit-pr:addressed v1` (without `-->`) to find sentinels regardless of version; grep `babysit-pr:addressed v1 core@3.4.1` to find ones from a specific version.
30
+ **Addressed sentinel**: `<sub>🤖 <code>babysit-pr:addressed v1 core@3.4.1</code></sub>`. Appended on its own line at the end of every reply the skill posts (both thread replies and the review-body summary); this is how re-runs know which threads and review-body comments are already handled. Dedupe matches the version-agnostic substring `babysit-pr:addressed v1` followed by a space (also matches legacy `<!-- babysit-pr:addressed v1 ... -->` sentinels). Grep `babysit-pr:addressed v1` for any version; add `core@3.4.1` for a specific one.
31
31
 
32
- **Follow-up sentinel**: `<!-- babysit-pr:followup v1 core@3.4.1 -->`. Attached to replies that defer an out-of-scope comment as a tracked follow-up (see the Scope subsection and the Defer verdict in step 6). Grep `babysit-pr:followup` across PR conversation JSON to enumerate deferred items. This sentinel is additive — the post-reply scripts still append the `addressed` sentinel at the end, so a deferred thread is correctly machine-classified as addressed (the skill _has_ handled it — by deferring). Human reviewers and future sweeps distinguish deferred from resolved by looking for the follow-up sentinel.
32
+ **Follow-up sentinel**: `<sub>🤖 <code>babysit-pr:followup v1 core@3.4.1</code></sub>`. Attached to replies that defer an out-of-scope comment as a tracked follow-up (see the Scope subsection and the Defer verdict in step 6). Grep `babysit-pr:followup` across PR conversation JSON to enumerate deferred items. This sentinel is additive — the post-reply scripts still append the `addressed` sentinel at the end, so a deferred thread is correctly machine-classified as addressed (the skill _has_ handled it — by deferring). Human reviewers and future sweeps distinguish deferred from resolved by looking for the follow-up sentinel.
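The version-agnostic match described for the addressed sentinel can be sketched in shell. This is a minimal illustration rather than the package's own script: `has_addressed_sentinel` is a hypothetical helper name, and only the prefix string and the sample sentinels come from the text.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a comment body carries any version of the
# addressed sentinel. The prefix (token plus one trailing space) is taken from
# the text; the function name is an assumption made for illustration.
SENTINEL_PREFIX='babysit-pr:addressed v1 '

has_addressed_sentinel() {
  case "$1" in
    *"$SENTINEL_PREFIX"*) return 0 ;;
    *) return 1 ;;
  esac
}

# The visible footer and the legacy HTML comment both match the same prefix.
has_addressed_sentinel '<sub>🤖 <code>babysit-pr:addressed v1 core@3.4.1</code></sub>' \
  && echo 'footer: addressed'
has_addressed_sentinel '<!-- babysit-pr:addressed v1 -->' \
  && echo 'legacy: addressed'
# A follow-up sentinel on its own does not count as addressed.
has_addressed_sentinel '<sub>🤖 <code>babysit-pr:followup v1 core@3.4.1</code></sub>' \
  || echo 'follow-up only: not addressed'
```

Because the prefix ends in a space, a hypothetical future token such as `babysit-pr:addressed v10` would not be mistaken for `v1`.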
33
33
 
34
34
  **Sentinel recency rules.** The script emits a per-thread `activityState` with three values:
35
35
 
36
36
  - **`active`** — no sentinel yet, OR at least one human commented after the last sentinel. Always handle this thread.
37
37
  - **`uncertain`** — a sentinel exists AND one or more bot comments appeared after it. The thread carries a `postSentinelBotComments` array listing EVERY such comment. You MUST read every entry in that array (not just the most recent — a later ack must not hide an earlier actionable finding), then decide:
38
- - **Every** post-sentinel bot comment is a non-actionable acknowledgement (`"Thanks, resolved"`, `"LGTM"`, `"Learnings added"`, etc.) → mark the thread **Skip-reply**; do not post a new reply. (See step 6 — Skip-reply is a distinct classification from the `addressed` activityState value.)
38
+ - **Every** post-sentinel bot comment is a non-actionable acknowledgement (`"Thanks, resolved"`, `"LGTM"`, `"Learnings added"`, etc.) → mark the thread **Skip-reply**; do not post a new reply. (See step 6a — Skip-reply is a distinct classification from the `addressed` activityState value.)
39
39
  - **Any** post-sentinel bot comment carries new actionable content (new nit, new finding, corrected diagnosis) → treat as **active**; reply again AND mention in the final summary that you reactivated an "uncertain" thread and why.
40
40
  - If you cannot confidently classify every entry → default to **active** and flag it. Silence is the failure mode we are trying to avoid.
41
41
  - **`addressed`** — the sentinel is the newest relevant activity on the thread. Skip it.
@@ -44,7 +44,7 @@ The skill uses two HTML-comment sentinels.
44
44
 
45
45
  The bot detection exists ONLY to downgrade the default for post-sentinel bot activity from `"active"` to `"uncertain"`. It NEVER suppresses bot comments or marks a thread `"addressed"` on its own — review-bot content would be lost if it did.
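The three-way recency rule can be sketched as a small shell function. This is a simplification under stated assumptions: it takes only the latest sentinel, human, and bot timestamps (mirroring the `lastBabysitSentinelAt`, `lastHumanCommentAt`, and `lastBotCommentAt` fields), the function name `classify_activity` is hypothetical, and the real script may derive the state differently.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the active / uncertain / addressed classification.
# Arguments are ISO 8601 UTC timestamps in one consistent format (empty string
# when absent); such timestamps sort lexicographically, so string comparison
# is enough here.
classify_activity() {
  local sentinel_at="$1" human_at="$2" bot_at="$3"
  if [[ -z "$sentinel_at" ]]; then
    echo active        # no sentinel yet
  elif [[ -n "$human_at" && "$human_at" > "$sentinel_at" ]]; then
    echo active        # a human commented after the last sentinel
  elif [[ -n "$bot_at" && "$bot_at" > "$sentinel_at" ]]; then
    echo uncertain     # only bot activity after the sentinel
  else
    echo addressed     # the sentinel is the newest relevant activity
  fi
}

# Sample timestamps are made up for illustration.
classify_activity '' '' ''
classify_activity '2024-05-01T10:00:00Z' '' '2024-05-01T11:00:00Z'
classify_activity '2024-05-01T10:00:00Z' '2024-05-01T09:00:00Z' ''
```

Note that bot activity never yields `addressed` by itself in this sketch; it can only move a thread from `active` down to `uncertain`, which matches the rule that bot detection must not suppress comments.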
46
46
 
47
- For automated review-body comments, the script emits a stable `fingerprint` per comment (sha256 of file + line + title + body, no timestamp). This includes CodeRabbit's Nitpick comments, Minor comments, and Outside diff range comments sections, plus Mendral `Needs attention` review bodies that include a file/line anchor. Before posting a summary, search existing PR issue-comments for a prior babysit-pr sentinel comment that already contains those fingerprints; if every current fingerprint is already present in a prior sentinel comment, skip posting.
47
+ For automated review bodies, the script emits a stable `fingerprint` per review (sha256 of the whole normalized body: collapsed whitespace, no timestamp, no author). It covers every review from a known automated reviewer (CodeRabbit, Mendral, Dependabot, etc.); the agent reads each body directly and extracts findings as part of its scope/verdict assessment, instead of relying on a fragile pre-parser. For top-level Conversation-tab comments, the script emits the same kind of `fingerprint` per comment. Dedupe happens against the `priorBabysitSentinels` array returned in the same JSON document: if a current `reviewBodyComments[].fingerprint` or `activeIssueComments[].fingerprint` already appears in any prior sentinel body, skip posting / treat it as addressed.
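A whole-body fingerprint and the matching dedupe check might look like the sketch below. The package's exact normalization is not shown in this diff, so the sketch assumes the simplest reading: collapse every whitespace run to one space, trim the ends, then hash. The names `fingerprint` and `already_covered` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only. Assumption: normalization = collapse whitespace runs to a single
# space and trim both ends. Uses sha256sum from GNU coreutils; on macOS,
# `shasum -a 256` is the usual equivalent.
fingerprint() {
  local normalized
  normalized=$(printf '%s' "$1" | tr -s '[:space:]' ' ')
  normalized="${normalized# }"
  normalized="${normalized% }"
  printf '%s' "$normalized" | sha256sum | cut -d' ' -f1
}

# Succeeds when the fingerprint already appears in a prior sentinel body.
already_covered() {
  local fp="$1" prior_body="$2"
  [[ "$prior_body" == *"$fp"* ]]
}

# Sample body and prior summary are made up for illustration.
body=$'Nitpick:   use backticks\n\n  in the error message.'
fp=$(fingerprint "$body")
prior="earlier summary text ${fp} more text"
already_covered "$fp" "$prior" && echo 'skip: already covered'
```

Hashing the normalized body rather than the raw one is what keeps reflowed but otherwise identical review bodies from producing a new fingerprint.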
48
48
 
49
49
  ## One iteration
50
50
 
@@ -135,12 +135,16 @@ The output JSON has:
135
135
  - `threads`: every unresolved review thread, with `threadId`, `replyToCommentDatabaseId`, `comments[]`, `lastBabysitSentinelAt`, `lastHumanCommentAt`, `lastBotCommentAt`, `postSentinelBotComments[]`, `postSentinelHumanComments[]`, and `activityState` (`"active"` / `"uncertain"` / `"addressed"`).
136
136
  - `activeThreads`: threads where `activityState != "addressed"` — these need attention this iteration (active AND uncertain).
137
137
  - `uncertainThreads`: just the uncertain subset. For each, read EVERY entry in `postSentinelBotComments` before deciding.
138
- - `nitpickComments`: parsed automated review-body comments, each with a stable `fingerprint`. The field name is retained for compatibility; it includes CodeRabbit Nitpick, Minor, and Outside diff range comments, plus Mendral `Needs attention` review bodies that include a file/line anchor.
139
- - `totalActiveThreads`, `totalUncertainThreads`, `totalNitpicks`, `totalUnresolvedComments` for quick checks.
138
+ - `reviewBodyComments`: every review from a known automated reviewer (CodeRabbit, Mendral, Dependabot, etc.), with the raw body and a stable per-review `fingerprint`. The agent reads each body directly to extract findings.
139
+ - `issueComments`: every top-level Conversation-tab comment, each with `isBabysitSentinel`, `isKnownBot`, and a per-comment `fingerprint`.
140
+ - `activeIssueComments`: the subset of `issueComments` that are NOT babysit-pr sentinels, NOT from a known bot, and whose `fingerprint` is NOT already listed in any prior babysit-pr summary. These are the human Conversation-tab comments still needing a reply.
141
+ - `priorBabysitSentinels`: prior babysit-pr summary comments posted as PR issue-comments. The script does the dedupe lookup for `activeIssueComments` automatically; the agent uses this array for `reviewBodyComments` dedupe.
142
+ - `truncated`: array naming any GraphQL connection that hit GitHub's 100-item cap (`reviewThreads`, `thread-comments`, `reviews`, `issueComments`). Non-empty means some comments may not be in this JSON — surface this in the final summary.
143
+ - `totalActiveThreads`, `totalUncertainThreads`, `totalActiveIssueComments`, `totalReviewBodyComments`, `totalUnresolvedComments` for quick checks.
140
144
 
141
145
  ### Scope
142
146
 
143
- This PR's review-feedback scope is strict by default. Steps 6 (threads) and 7 (automated review-body comments) classify each comment as in-scope or out-of-scope using this rule before choosing a verdict. Step 5 (CI) uses the broader CI-scope rule in that step, not this one — CI can legitimately fail on unchanged lines because the PR changed a contract or dependency path.
147
+ This PR's review-feedback scope is strict by default. Steps 6a (threads), 6b (top-level conversation comments), and 7 (automated review bodies) classify each comment as in-scope or out-of-scope using this rule before choosing a verdict. Step 5 (CI) uses the broader CI-scope rule in that step, not this one — CI can legitimately fail on unchanged lines because the PR changed a contract or dependency path.
144
148
 
145
149
  Build the changed-line set from `gh pr diff` once per iteration. Count changed diff lines on both sides: added lines in the new version, removed lines in the old version, and modified code represented by adjacent remove/add pairs. Do not count diff context lines. A reviewer comment or automated review-body comment is **in scope** when its anchor falls on a changed diff line on either side of the hunk. Deleted-line comments like "why remove this?" or "please add this back" are in scope by definition. For a range like `12-14`, any overlap with a changed diff line is in scope.
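One way to build the changed-line set from a unified diff is sketched below. It is an illustration, not the skill's implementation: `changed_lines` and `range_in_scope` are hypothetical names, the `file:side:line` output format is an assumption, and paths containing spaces or colons are not handled.

```bash
#!/usr/bin/env bash
# Reads a unified diff on stdin and prints one "file:side:line" record per
# changed line: removed lines use old-side numbering, added lines use new-side
# numbering, and context lines are skipped.
changed_lines() {
  awk '
    /^diff --git /          { in_hunk = 0; next }
    !in_hunk && /^--- /     { oldf = $2; sub(/^a\//, "", oldf); next }
    !in_hunk && /^\+\+\+ /  { newf = $2; sub(/^b\//, "", newf); next }
    /^@@ / {
      # Header shape: "@@ -old_start[,count] +new_start[,count] @@"
      split($2, o, ","); split($3, n, ",")
      old = substr(o[1], 2) + 0; new = substr(n[1], 2) + 0
      in_hunk = 1; next
    }
    !in_hunk { next }
    /^\+/ { print newf ":new:" new; new++; next }
    /^-/  { print oldf ":old:" old; old++; next }
    /^ /  { old++; new++; next }
  '
}

# Succeeds when any changed line of FILE falls inside START..END on either
# side of the hunk. Reads changed_lines output on stdin.
range_in_scope() {
  awk -F: -v f="$1" -v s="$2" -v e="$3" \
    '$1 == f && $3 >= s && $3 <= e { hit = 1 } END { exit !hit }'
}
```

A typical call would pipe `gh pr diff` into `changed_lines` once per iteration, store the result, and then test each anchor with something like `range_in_scope src/orders.ts 45 47`.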
146
150
 
@@ -170,7 +174,7 @@ Default posture: focus on in-scope feedback. For out-of-scope feedback, apply th
170
174
 
171
175
  Run `bash scripts/fetchFailedLogs.sh` to stream failed output for every failing check on the PR. The first line is either:
172
176
 
173
- - `# babysit-pr: no failing checks` → skip to step 6.
177
+ - `# babysit-pr: no failing checks` → skip to step 6a.
174
178
  - `# babysit-pr: failing checks` → followed by one delimited block per failing job or external check:
175
179
  - `# --- run=<id> job=<id> ---` blocks carry the job's `--log-failed` output (GitHub Actions).
176
180
  - `# --- external check: <name> (<url>) ---` blocks carry no logs — the check isn't a GitHub Actions run (CircleCI, Nx Cloud, semgrep, CodeRabbit, Devin, etc.). Treat these like "External checks with no inspectable logs" in the diagnosis-only list below: stop and report, don't guess a fix.
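The output format described for `fetchFailedLogs.sh` lends itself to a simple line-oriented dispatcher. The sketch below is hypothetical (the function name and the labels it prints are assumptions); it only recognizes the header and delimiter lines quoted in the text.

```bash
#!/usr/bin/env bash
# Hypothetical dispatcher: reads the log stream on stdin, checks the first
# line, then reports which kind of block each delimiter opens.
summarize_failed_logs() {
  local first line
  IFS= read -r first || true
  case "$first" in
    '# babysit-pr: no failing checks') echo 'no-failures'; return 0 ;;
    '# babysit-pr: failing checks') ;;
    *) echo 'unrecognized-header'; return 1 ;;
  esac
  while IFS= read -r line; do
    case "$line" in
      '# --- external check: '*' ---') echo 'external (diagnosis only)' ;;
      '# --- run='*' job='*' ---')     echo 'actions-job (logs follow)' ;;
    esac
  done
}
```

It could be used as `bash scripts/fetchFailedLogs.sh | summarize_failed_logs` to get a quick overview before reading the logs themselves.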
@@ -195,7 +199,7 @@ Read the logs and diagnose: **build/type errors first** (they cause cascading te
195
199
 
196
200
  Scope check for CI: scope is the PR's changed files plus failures directly caused by those changes in the PR's execution path. Use `gh pr diff --name-only` as the first signal — this is PR-authoritative and works even if the local base ref is missing or stale (e.g., in fresh clones or CI sandboxes). Allow fixes outside changed files only when the logs and code make causality clear (e.g., the PR renamed a symbol that a sibling test references). CI failures outside that surface are out of scope — report the diagnosis, don't apply speculative fixes. CI fixes are never Deferred as follow-ups: CI needs to pass on this PR.
197
201
 
198
- ### 6. Assess active review threads
202
+ ### 6a. Assess active review threads
199
203
 
200
204
  For every thread in `activeThreads` (this includes both `"active"` and `"uncertain"`):
201
205
 
@@ -215,19 +219,29 @@ For every thread in `activeThreads` (this includes both `"active"` and `"uncerta
215
219
  - Does not meet the bar → **Defer** (new verdict). Record a one-line rationale and, if relevant, a pointer to where the concern lives.
216
220
  - Disagree and Already-fixed can still apply to out-of-scope comments (e.g., reviewer asks for a refactor that's already landed on main, or misreads the code).
217
221
 
218
- ### 7. Assess automated review-body comments
222
+ ### 6b. Assess top-level Conversation-tab comments
219
223
 
220
- For every parsed automated review-body comment in `nitpickComments`:
224
+ For every entry in `activeIssueComments` (humans commenting on the PR Conversation tab without anchoring to a file/line):
221
225
 
222
- - Check whether its `fingerprint` already appears in a prior babysit-pr sentinel comment on the PR. If yes, skip.
223
- - **Classify scope** (in / out) using the Scope subsection. For ranges like `12-14`, any overlap with changed diff lines on either side of the hunk is in scope; no overlap is out of scope unless one of the explicit escape-hatch signals applies.
224
- - Pick a verdict:
226
+ - Apply the **Scope** subsection's rules. A top-level comment is in scope when the reviewer explicitly ties it to a changed file/line, behavior the PR introduced, or a contract the PR altered. Otherwise out of scope by default.
227
+ - Pick a verdict the same way as a thread: Agree / Disagree / Already fixed (in-scope), or Agree-meets-bar / Defer (out-of-scope). Apply fixes for Agree verdicts.
228
+ - Replies are NOT posted as individual top-level comments — that would clutter the conversation. Instead, every issue-comment verdict goes into the **same step-9 PR-level summary** as the review-body findings, under its own `## Conversation-tab comments` heading. Per-comment fingerprints join the fenced fingerprint block so future runs dedupe.
229
+ - If `activeIssueComments` is empty AND `reviewBodyComments` is empty (or all dedupe), skip the PR-level summary comment entirely in step 9.
230
+
231
+ ### 7. Assess automated review bodies
232
+
233
+ For every entry in `reviewBodyComments`:
234
+
235
+ - Dedupe first: if its `fingerprint` already appears in any `priorBabysitSentinels[].body`, skip — already covered.
236
+ - Otherwise, READ THE BODY IN FULL. Automated reviewers (CodeRabbit, Mendral, etc.) pack findings into nested `<details>/<blockquote>` HTML with file paths, line ranges, and titles inline. Identify each individual finding the body contains.
237
+ - For each finding, **classify scope** (in / out) using the Scope subsection. For ranges like `12-14`, any overlap with changed diff lines on either side of the hunk is in scope; no overlap is out of scope unless one of the explicit escape-hatch signals applies.
238
+ - Pick a verdict per finding:
225
239
  - In-scope → Agree / Disagree / Already fixed (as with threads). If Agree, apply the fix.
226
- - Out-of-scope → apply the out-of-scope fix bar. Meets the bar → Agree and apply the fix, noting in the summary that it was fixed despite being out of scope. Does not meet the bar → **Defer**. A Deferred automated review-body comment does not get its own top-level comment; it goes into the summary under the **Deferred (out of scope)** heading (see step 9).
240
+ - Out-of-scope → apply the out-of-scope fix bar. Meets the bar → Agree and apply the fix, noting in the summary that it was fixed despite being out of scope. Does not meet the bar → **Defer**. A Deferred finding does not get its own top-level comment; it goes into the summary under the **Deferred (out of scope)** heading (see step 9).
227
241
 
228
- Deferred review-body fingerprints still go into the fenced fingerprint block at the end of the summary alongside addressed ones, so future runs dedupe correctly: the comment is handled, just handled by deferring.
242
+ The whole-body `fingerprint` (not per-finding) goes in the fenced fingerprint block at the end of the summary. If the review body later changes (new findings, edits), the fingerprint changes and the next pass will post the summary again; this is slightly noisier but never silently drops a new finding. Trivial whitespace/version-tag changes are absorbed by body normalization before hashing, so identical content doesn't churn.
229
243
 
230
- If no automated review-body comments remain after filtering, skip ONLY the top-level review-body summary comment in step 9. Still post thread replies for every non-Skip-reply thread from step 6.
244
+ If `reviewBodyComments` is empty (or all entries dedupe), skip ONLY the review-body section of the summary in step 9. Still post thread replies for every non-Skip-reply thread from step 6a and handle issue comments per step 6b.
231
245
 
232
246
  ### 8. Commit and push (if any edits)
233
247
 
@@ -253,7 +267,7 @@ Capture the `url=` line for the reply templates in step 9.
253
267
 
254
268
  ### 9. Post replies
255
269
 
256
- For every thread assessed in step 6 that was NOT marked **Skip-reply** (i.e., one of Agree / Disagree / Already fixed / Defer):
270
+ For every thread assessed in step 6a that was NOT marked **Skip-reply** (i.e., one of Agree / Disagree / Already fixed / Defer):
257
271
 
258
272
  ```bash
259
273
  bash scripts/postSentinelReply.sh "$THREAD_ID" "$BODY"
@@ -266,24 +280,25 @@ Body templates (the script appends the `addressed` sentinel if missing):
266
280
  - **Agree**: `Addressed in <commit-url>. <one-line what-changed>.`
267
281
  - **Disagree**: `Leaving current behavior. <reasoning>.`
268
282
  - **Already fixed**: `Already handled by <commit-url-or-file:line>. <brief pointer>.`
269
- - **Defer**: `Out of scope for this PR; this looks like follow-up work rather than something introduced or required by this change. <one-line rationale or pointer if useful>.\n\n<!-- babysit-pr:followup v1 core@3.4.1 -->`
283
+ - **Defer**: `Out of scope for this PR; this looks like follow-up work rather than something introduced or required by this change. <one-line rationale or pointer if useful>.\n\n<sub>🤖 <code>babysit-pr:followup v1 core@3.4.1</code></sub>`
270
284
 
271
285
  For Defer replies, include the follow-up sentinel on its own line as shown. The script will append the `addressed` sentinel after it on its own line, so the final body ends with the follow-up sentinel followed by a blank line followed by the `addressed` sentinel — `grep babysit-pr:followup` finds the deferral and `grep babysit-pr:addressed` still marks the thread handled for dedupe.
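The resulting shape of a Defer body can be sketched as follows. The two sentinel literals are copied from the text; `defer_body` and `append_addressed` are hypothetical stand-ins, and `append_addressed` only imitates the append step the post-reply script is described as performing.

```bash
#!/usr/bin/env bash
# Sketch of how a Defer reply ends up carrying both sentinels.
FOLLOWUP='<sub>🤖 <code>babysit-pr:followup v1 core@3.4.1</code></sub>'
ADDRESSED='<sub>🤖 <code>babysit-pr:addressed v1 core@3.4.1</code></sub>'

# Build the Defer text with the follow-up sentinel as its own paragraph.
defer_body() {
  printf 'Out of scope for this PR; %s\n\n%s' "$1" "$FOLLOWUP"
}

# Append the addressed sentinel as the last paragraph unless any version of it
# is already present (matched by the wrapper-free prefix).
append_addressed() {
  case "$1" in
    *'babysit-pr:addressed v1 '*) printf '%s\n' "$1" ;;
    *) printf '%s\n\n%s\n' "$1" "$ADDRESSED" ;;
  esac
}

append_addressed "$(defer_body 'tracked as follow-up work.')"
```

The printed body ends with the follow-up sentinel, a blank line, and then the `addressed` sentinel, so a grep for either token finds the reply.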
272
286
 
273
287
  The script uses the `addPullRequestReviewThreadReply` GraphQL mutation. It does NOT resolve the thread.
274
288
 
275
- If any automated review-body comments were assessed in step 7, post ONE top-level PR comment summarizing all of them:
289
+ If any automated review bodies were assessed in step 7 OR any active issue comments were assessed in step 6b, post ONE top-level PR comment summarizing all of them:
276
290
 
277
291
  ```bash
278
292
  bash scripts/postSentinelPrComment.sh "$PR_NUMBER" "$BODY"
279
293
  ```
280
294
 
281
- The review-body summary should:
295
+ The PR-level summary should:
282
296
 
283
- - Group verdicts under **Agree / Disagree / Already fixed / Deferred (out of scope)** headings. Omit a heading if its list is empty.
284
- - Under **Deferred (out of scope)**, list each deferred review-body comment as a bullet, followed on its own line by `<!-- babysit-pr:followup v1 core@3.4.1 -->` so grep catches them individually.
297
+ - Group by source. Use `## Review-body findings` for step-7 work and `## Conversation-tab comments` for step-6b work. Omit a section if its list is empty.
298
+ - Inside each section, group verdicts under **Agree / Disagree / Already fixed / Deferred (out of scope)** subheadings. Omit a subheading if its list is empty.
299
+ - Under **Deferred (out of scope)**, list each deferred item as a bullet, followed on its own line by `<sub>🤖 <code>babysit-pr:followup v1 core@3.4.1</code></sub>` so grep catches them individually.
285
300
  - Include the commit URL for fixes.
286
- - Include every current review-body comment's `fingerprint` — addressed and deferred — in a fenced block at the end (one per line, before the sentinel) so future runs can dedupe. Deferred comments count as handled for dedupe purposes.
301
+ - End with a fenced fingerprint block listing every current fingerprint — addressed and deferred — one per line. Include both `reviewBodyComments[].fingerprint` (whole-body, one per automated review) and `activeIssueComments[].fingerprint` (per Conversation-tab comment). Future runs dedupe by matching these against `priorBabysitSentinels`.
287
302
 
288
303
  ### 10. Summarize
289
304
 
@@ -293,8 +308,10 @@ Report:
293
308
  - Merge conflict status if relevant (resolved or aborted with reason).
294
309
  - CI checks fixed / still failing / skipped-with-diagnosis.
295
310
  - Review threads replied to, grouped by verdict (including any Defer count: "X threads deferred as follow-ups").
296
- - Review-body comments summarized (or skipped because already covered), including the Deferred count: "Y review-body comments deferred as follow-ups".
311
+ - Conversation-tab comments addressed, grouped by verdict (e.g. "Z conversation comments deferred as follow-ups").
312
+ - Review-body findings summarized (or skipped because already covered), including the Deferred count: "Y review-body findings deferred as follow-ups".
297
313
  - Threads left active because of bot-acknowledgement uncertainty (flag by thread URL).
314
+ - If `truncated` is non-empty: explicitly call out which connection hit GitHub's 100-item GraphQL cap (e.g. "`truncated: ['thread-comments']` — at least one review thread has more than 100 comments; this pass may have missed the tail. Investigate before relying on it for completeness.").
298
315
  - The stop condition triggered for this pass (clean / progressing / stuck).
299
316
 
300
317
  When the report mentions any deferrals, include a one-liner the user can run later to enumerate them, e.g.:
@@ -309,7 +326,7 @@ Do not rely only on `gh pr view --json comments,reviews` — that view can miss
309
326
 
310
327
  After the single pass completes, pick exactly one outcome:
311
328
 
312
- - **Exit clean** — all CI checks passed AND every thread in `activeThreads` was either marked Skip-reply during step 6's inspection or has already received a fresh sentinel reply in this pass (Agree / Disagree / Already-fixed / **Defer** all count — a Defer reply is a sentinel reply), AND every current review-body fingerprint is covered by an existing sentinel comment (deferred review-body comments count; they're in the summary's fingerprint block). Do not use raw `totalActiveThreads` from the script output — it is pre-inspection and will stay non-zero for Skip-reply cases. A PR with Deferred threads is still clean from babysit's perspective: the skill has done what it can without widening scope. Report success and stop.
329
+ - **Exit clean** — all CI checks passed AND every thread in `activeThreads` was either marked Skip-reply during step 6a's inspection or has already received a fresh sentinel reply in this pass (Agree / Disagree / Already-fixed / **Defer** all count — a Defer reply is a sentinel reply), AND every entry in `activeIssueComments` is covered by this pass's PR-level summary, AND every current review-body fingerprint is covered by an existing sentinel comment (deferred review-body and conversation-comment fingerprints count; they're in the summary's fenced block). Do not use raw `totalActiveThreads` / `totalActiveIssueComments` from the script output — they're pre-inspection and will stay non-zero for Skip-reply or post-summary cases. A PR with Deferred items is still clean from babysit's perspective: the skill has done what it can without widening scope. Report success and stop.
313
330
  - **Exit progressing** — pass made commits, posted new replies, or both, and the PR is not yet clean (CI is still pending, a new CI run was triggered by this pass's commits, or more work remains). There is real work still in flight that another run would pick up. Report what was done and what is pending, and tell the user to re-run `/babysit-pr` once CI settles, or to wrap the call with `/loop <cadence> /babysit-pr` (or a shell `while true; do ...; done`) for automatic re-runs.
314
331
  - **Exit stuck** — pass made no commits and posted no new replies, and the PR is still not clean. Nothing actionable happened this pass. Use this whenever progress is blocked on something outside the skill's scope, including:
315
332
  - Merge conflict in step 2 that exceeded the high-confidence resolution bar.
@@ -338,7 +355,7 @@ User: `babysit my PR`
338
355
  - No PR arg → operate on the current branch.
339
356
  - Preflight OK, PR #482 found.
340
357
  - `gh pr checks --watch` times out at 600s — two checks still pending.
341
- - `unresolvedPrComments.sh` returns 0 active threads, 0 review-body comments.
358
+ - `unresolvedPrComments.sh` returns 0 active threads, 0 review-body comments, 0 active issue comments.
342
359
  - No commits, no replies posted, CI state unchanged vs. start.
343
360
  - Outcome: **stuck**. Report: "CI still running after 10 min; no comments to address. Re-run `/babysit-pr` once CI settles, or wrap with `/loop 2m /babysit-pr`."
344
361
 
@@ -349,24 +366,25 @@ User: `babysit PR 482`
349
366
  - Preflight OK. Input parser matches the explicit-token rule and captures `482`.
350
367
  - `gh pr checkout 482` switches the worktree to PR #482's head branch (say, `feat/xyz`).
351
368
  - Step 2's `gh pr view` confirms PR #482 on the now-current branch; the new-PR fallback does not fire.
352
- - Remainder proceeds as a normal single pass (CI watch, thread / nitpick assessment, replies).
369
+ - Remainder proceeds as a normal single pass (CI watch, thread / conversation-comment / review-body assessment, replies).
353
370
  - Report final state on exit.
354
371
 
355
- ### Example 3: out-of-scope nitpick gets deferred
372
+ ### Example 3: out-of-scope review-body finding gets deferred
356
373
 
357
374
  User: `babysit my PR`
358
375
 
359
376
  - Preflight OK, PR #612 found, CI green.
360
- - `unresolvedPrComments.sh` returns 1 active thread and 2 review-body comments:
377
+ - `unresolvedPrComments.sh` returns 1 active thread, 1 active issue comment, and 1 CodeRabbit review body containing two findings:
361
378
  - Thread on `src/users.ts:82` (unchanged, not touched by diff) — reviewer: "while you're here, this helper could be memoized".
362
- - Nitpick on `src/orders.ts:45-47`: anchor overlaps a changed line; CodeRabbit says the error message should use backticks. In scope.
363
- - Nitpick on `src/unrelated.ts:10`: file not touched by the PR. Out of scope, no escape-hatch signal.
379
+ - Active issue comment from a teammate on the Conversation tab: "general nit: can you rename the new module to `payments-core`?". Touches a changed file (`src/payments/index.ts`).
380
+ - CodeRabbit review body — agent reads it and identifies two findings: (a) on `src/orders.ts:45-47`, anchor overlaps a changed line, error message should use backticks (in scope); (b) on `src/unrelated.ts:10`, file not touched by the PR (out of scope, no escape-hatch signal).
364
381
  - Scope classification:
365
- - Thread is on an unchanged line; reviewer doesn't tie it to this PR's changes; doesn't meet the fix bar (not a crash, not a bug, not trivial). → **Defer**.
366
- - First nitpick is in-scope → **Agree**, apply backtick fix.
367
- - Second nitpick is out-of-scope, not a correctness bug, not a one-liner → **Defer** (goes under the Deferred (out of scope) heading in the summary).
368
- - Commit `f00dbabe` for the in-scope review-body fix. Post Defer reply on the thread with the `babysit-pr:followup v1` sentinel above the `addressed` sentinel. Post the review-body summary with Agree (1) and Deferred (out of scope) (1) headings; both fingerprints listed in the fenced block.
369
- - Summary reports: "1 thread deferred as follow-up, 1 review-body comment deferred as follow-up" plus the `gh api graphql ... | grep babysit-pr:followup` one-liner.
382
+ - Thread is on an unchanged line; reviewer doesn't tie it to this PR's changes; doesn't meet the fix bar. → **Defer**.
383
+ - Conversation-tab comment ties to a changed file and is a trivial rename. → **Agree**, apply rename.
384
+ - Finding (a) is in-scope → **Agree**, apply backtick fix.
385
+ - Finding (b) is out-of-scope, not a correctness bug, not a one-liner → **Defer**.
386
+ - Commit `f00dbabe` covers the rename and the backtick fix. Post Defer reply on the thread with the `babysit-pr:followup v1` sentinel above the `addressed` sentinel. Post one PR-level summary with `## Review-body findings` (Agree 1, Deferred 1) and `## Conversation-tab comments` (Agree 1); the fenced block lists the CodeRabbit review body's whole-body fingerprint AND the conversation comment's per-comment fingerprint.
387
+ - Summary reports: "1 thread deferred as follow-up, 1 review-body finding deferred as follow-up, 0 conversation comments deferred" plus the `gh api graphql ... | grep babysit-pr:followup` one-liner.
370
388
  - **Exit clean** — Defer replies count as fresh sentinel replies; all fingerprints are covered.
371
389
 
372
390
  ## Input
@@ -2,13 +2,20 @@
2
2
  # _sentinel.sh — shared SENTINEL constants + append helper.
3
3
  # Sourced by unresolvedPrComments.sh, postSentinelReply.sh, postSentinelPrComment.sh.
4
4
  #
5
- # SENTINEL_PREFIX is the version-agnostic substring used for matching/dedupe so
6
- # pre-versioning sentinels (`<!-- babysit-pr:addressed v1 -->`) are still
7
- # recognized alongside versioned ones. SENTINEL is the literal emitted on new
8
- # replies; the `core@X.Y.Z` suffix records which plugin version produced it.
5
+ # SENTINEL is the literal emitted on new replies: a visible footer (robot mark +
6
+ # token in `<code>`, wrapped in `<sub>`). SENTINEL_PREFIX is the wrapper-free
7
+ # substring used for matching/dedupe, so it matches both this footer and legacy
8
+ # `<!-- babysit-pr:addressed v1 ... -->` sentinels. The `core@X.Y.Z` suffix is
9
+ # substituted at build time by embedPluginVersion.mts.
9
10
 
10
- SENTINEL_PREFIX='<!-- babysit-pr:addressed v1 '
11
- SENTINEL='<!-- babysit-pr:addressed v1 core@3.4.1 -->'
11
+ SENTINEL_PREFIX='babysit-pr:addressed v1 '
12
+ SENTINEL='<sub>🤖 <code>babysit-pr:addressed v1 core@3.4.1</code></sub>'
13
+
14
+ # Bot author allowlist (JSON array literal). Used by unresolvedPrComments.sh
15
+ # as a fallback when GraphQL's `author.__typename == "Bot"` misses a GitHub
16
+ # App that posts via a User-type service account. Single source of truth so
17
+ # adding a new bot is a one-line edit.
18
+ BOTS_JSON='["coderabbitai","coderabbitai[bot]","mendral-app","mendral-app[bot]","dependabot","dependabot[bot]","github-actions","github-actions[bot]","github-advanced-security","github-advanced-security[bot]","renovate","renovate[bot]","renovate-bot","pre-commit-ci","pre-commit-ci[bot]","codecov","codecov[bot]","sonarcloud","sonarcloud[bot]"]'
12
19
 
13
20
  # Echo $1 with SENTINEL appended on its own trailing paragraph, unless the
14
21
  # body already contains any version of the sentinel (matched via SENTINEL_PREFIX).
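The prefix-based matching described in the `_sentinel.sh` header can be sketched as below. This is a minimal illustration, not the packaged helper: the function names `has_sentinel` and `append_sentinel` are hypothetical, and only the two constants are copied from the diff. It shows why a wrapper-free prefix recognizes both the new visible footer and the legacy HTML-comment form.

```bash
#!/usr/bin/env bash
# Constants copied from the diff above.
SENTINEL_PREFIX='babysit-pr:addressed v1 '
SENTINEL='<sub>🤖 <code>babysit-pr:addressed v1 core@3.4.1</code></sub>'

# has_sentinel BODY (hypothetical name): succeed if BODY carries any version
# of the sentinel, whether wrapped in <sub>/<code> or in an HTML comment.
has_sentinel() {
  case "$1" in
    *"$SENTINEL_PREFIX"*) return 0 ;;
    *) return 1 ;;
  esac
}

# append_sentinel BODY (hypothetical name): echo BODY with SENTINEL on its own
# trailing paragraph, unless some version of the sentinel is already present.
append_sentinel() {
  if has_sentinel "$1"; then
    printf '%s' "$1"
  else
    printf '%s\n\n%s' "$1" "$SENTINEL"
  fi
}

append_sentinel 'Fixed in the latest commit.'
```

Because the check runs before the append, calling the helper twice on the same body should leave it unchanged the second time.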
@@ -1,16 +1,30 @@
1
1
  #!/usr/bin/env bash
2
- # unresolvedPrComments.sh — Fetch review threads + review-body comments for babysit-pr.
3
- # Extended from plugins/core/skills/unresolved-pr-comments/scripts/unresolvedPrComments.sh.
4
- # Adds: thread IDs, per-thread sentinel recency state, stable review-body fingerprints.
2
+ # unresolvedPrComments.sh — Fetch review data for babysit-pr.
3
+ #
4
+ # Returns one JSON document with:
5
+ # - threads / activeThreads / uncertainThreads — review threads with
6
+ # sentinel-recency state (active / uncertain / addressed).
7
+ # - reviewBodyComments — raw bodies of every review from known automated
8
+ # reviewers (CodeRabbit, Mendral, etc.), each with a stable fingerprint.
9
+ # The agent reads bodies directly; we no longer pre-parse findings.
10
+ # - issueComments — every top-level PR conversation comment, tagged with
11
+ # isBabysitSentinel and isKnownBot flags.
12
+ # - activeIssueComments — non-sentinel, non-bot issue comments whose
13
+ # per-comment fingerprint is NOT already listed in any prior babysit-pr
14
+ # summary. These are the human Conversation-tab comments needing a reply.
15
+ # - priorBabysitSentinels — issue comments whose body contains the
16
+ # babysit-pr sentinel prefix. Used for review-body + issue-comment dedupe.
17
+ # - truncated — array naming any GraphQL connection that hit GitHub's
18
+ # 100-item cap (reviewThreads, thread-comments, reviews, issueComments).
19
+ # Agent must surface this in the final summary.
5
20
  #
6
21
  # Usage: bash unresolvedPrComments.sh [pr-number]
7
- # Compatible with macOS bash 3.2. Requires: gh, jq (>= 1.5), perl with Digest::SHA.
22
+ # Compatible with macOS bash 3.2. Requires: gh, jq (>= 1.5),
23
+ # and one of shasum / sha256sum for fingerprinting.
8
24
 
9
25
  set -euo pipefail
10
26
 
11
27
  SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
12
- # shellcheck source=parseNitpicks.sh
13
- source "${SCRIPT_DIR}/parseNitpicks.sh"
14
28
  # shellcheck source=_sentinel.sh
15
29
  source "${SCRIPT_DIR}/_sentinel.sh"
16
30
 
@@ -21,6 +35,14 @@ output_error() {
21
35
  exit 1
22
36
  }
23
37
 
38
+ if command -v shasum >/dev/null 2>&1; then
39
+ SHA256_CMD="shasum -a 256"
40
+ elif command -v sha256sum >/dev/null 2>&1; then
41
+ SHA256_CMD="sha256sum"
42
+ else
43
+ SHA256_CMD=""
44
+ fi
45
+
24
46
  validate_prerequisites() {
25
47
  if ! command -v jq >/dev/null 2>&1; then
26
48
  printf '{"error":"jq not found. Install from https://stedolan.github.io/jq"}\n' >&3
@@ -29,11 +51,8 @@ validate_prerequisites() {
29
51
  if ! command -v gh >/dev/null 2>&1; then
30
52
  output_error "gh CLI not found. Install from https://cli.github.com"
31
53
  fi
32
- if ! command -v perl >/dev/null 2>&1; then
33
- output_error "perl not found."
34
- fi
35
- if ! perl -MDigest::SHA -e1 >/dev/null 2>&1; then
36
- output_error "Perl Digest::SHA module not found (should be in core Perl since 5.9.3)."
54
+ if [ -z "$SHA256_CMD" ]; then
55
+ output_error "Neither shasum nor sha256sum found on PATH."
37
56
  fi
38
57
  if ! gh api user --jq '.login' >/dev/null 2>&1; then
39
58
  output_error "Not authenticated with GitHub. Run: gh auth login"
@@ -77,7 +96,9 @@ get_repo_info() {
77
96
  fi
78
97
  }
79
98
 
80
- # Pagination limits: 100 review threads, 20 comments per thread, 100 reviews.
99
+ # Each connection caps at GitHub's 100-item maximum. hasNextPage is checked
100
+ # after the fetch and surfaced via the top-level `truncated` array — real
101
+ # cursor pagination is a follow-up if the warning ever fires in practice.
81
102
  GRAPHQL_QUERY='
82
103
  query($owner: String!, $repo: String!, $pr: Int!) {
83
104
  repository(owner: $owner, name: $repo) {
@@ -85,10 +106,12 @@ query($owner: String!, $repo: String!, $pr: Int!) {
85
106
  title
86
107
  url
87
108
  reviewThreads(first: 100) {
109
+ pageInfo { hasNextPage }
88
110
  nodes {
89
111
  id
90
112
  isResolved
91
- comments(first: 20) {
113
+ comments(first: 100) {
114
+ pageInfo { hasNextPage }
92
115
  nodes {
93
116
  id
94
117
  databaseId
@@ -106,12 +129,24 @@ query($owner: String!, $repo: String!, $pr: Int!) {
106
129
  }
107
130
  }
108
131
  reviews(first: 100) {
132
+ pageInfo { hasNextPage }
109
133
  nodes {
110
134
  body
111
- author { login }
135
+ author { login __typename }
112
136
  createdAt
113
137
  }
114
138
  }
139
+ comments(first: 100) {
140
+ pageInfo { hasNextPage }
141
+ nodes {
142
+ id
143
+ databaseId
144
+ body
145
+ createdAt
146
+ url
147
+ author { login __typename }
148
+ }
149
+ }
115
150
  }
116
151
  }
117
152
  }'
@@ -141,6 +176,49 @@ is_code_scanning_alert_fixed() {
141
176
  [ "$state" = "fixed" ]
142
177
  }
143
178
 
179
+ # Normalize a body for stable hashing: collapse all runs of whitespace
180
+ # (including newlines) to a single space, then trim. Trivial whitespace
181
+ # reshuffles by a bot do not churn the fingerprint.
182
+ normalize_body() {
183
+ printf '%s' "$1" | tr -s '[:space:]' ' ' | sed -E 's/^ //; s/ $//'
184
+ }
185
+
186
+ # Echo first 16 hex chars of sha256(normalize(body)).
187
+ fingerprint_body() {
188
+ local normalized
189
+ normalized="$(normalize_body "$1")"
190
+ printf '%s' "$normalized" | $SHA256_CMD | cut -c1-16
191
+ }
192
+
193
+ # Take a JSON array of {body, ...extra} and emit the same array with a
194
+ # `fingerprint` field added to each entry. After the up-front length check,
195
+ # three jq spawns regardless of N: one to stream bodies as base64, one to
196
+ # assemble the fingerprint array, one to zip them back onto the originals.
197
+ add_fingerprints() {
198
+ local input_json="$1"
199
+ local count
200
+ count="$(printf '%s' "$input_json" | jq 'length')"
201
+ if [ "$count" = "0" ]; then
202
+ printf '[]'
203
+ return
204
+ fi
205
+
206
+ local fps=()
207
+ local line
208
+ while IFS= read -r line; do
209
+ [ -z "$line" ] && continue
210
+ local body
211
+ body="$(printf '%s' "$line" | base64 -d)"
212
+ fps+=("$(fingerprint_body "$body")")
213
+ done < <(printf '%s' "$input_json" | jq -r '.[] | .body // "" | @base64')
214
+
215
+ local fps_json
216
+ fps_json="$(printf '%s\n' "${fps[@]}" | jq -Rs 'split("\n") | map(select(. != ""))')"
217
+ printf '%s' "$input_json" | jq --argjson fps "$fps_json" '
218
+ [., $fps] | transpose | map(.[0] + { fingerprint: (.[1] // "") })
219
+ '
220
+ }
221
+
144
222
  main() {
145
223
  validate_prerequisites
146
224
 
@@ -165,40 +243,23 @@ main() {
165
243
  title="$(printf '%s' "$response" | jq -r '.data.repository.pullRequest.title')"
166
244
  url="$(printf '%s' "$response" | jq -r '.data.repository.pullRequest.url')"
167
245
 
168
- # Build threads with sentinel recency state.
169
- #
170
246
  # Bot detection combines TWO signals (union, not intersection):
171
- # 1. GraphQL `author.__typename == "Bot"` — catches every bot GitHub marks as such,
172
- # including bots not on our allowlist. This is the primary signal.
173
- # 2. Login allowlist catches GitHub Apps/Actions that post via a User-type service
174
- # account rather than a Bot account.
175
- # An unknown bot whose login we don't recognize but which is type=Bot still gets
176
- # classified correctly; we never fall back to treating it as a human.
177
- #
247
+ # 1. GraphQL `author.__typename == "Bot"` — catches every bot GitHub marks
248
+ # as such. Primary signal.
249
+ # 2. Login allowlist BOTS_JSON (sourced from _sentinel.sh) catches
250
+ # GitHub Apps/Actions that post via a User-type service account.
251
+
178
252
  # Per-thread emitted fields:
179
253
  # - threadId, replyToCommentDatabaseId, comments[], isResolved, file, line
180
- # - lastBabysitSentinelAt: max createdAt of OUR sentinel replies (null if none)
254
+ # - lastBabysitSentinelAt: max createdAt of OUR sentinel replies
181
255
  # - lastHumanCommentAt: max createdAt of non-sentinel, non-bot comments
182
256
  # - lastBotCommentAt: max createdAt of non-sentinel bot comments
183
- # - postSentinelBotComments: ARRAY of every bot comment after lastBabysitSentinelAt
184
- # (the agent inspects ALL of them; a later ack must not hide
185
- # an earlier actionable bot comment)
186
- # - postSentinelHumanComments: ARRAY of every human comment after lastBabysitSentinelAt
187
- # - activityState: tri-state, one of:
188
- # "active" — needs a reply (no sentinel yet, OR a human commented after our sentinel)
189
- # "uncertain" — sentinel exists, but a bot posted after it; agent MUST inspect every
190
- # entry in postSentinelBotComments and treat as active unless EVERY one
191
- # is confidently a non-actionable acknowledgement
192
- # "addressed" — our sentinel is the newest relevant activity on this thread
193
- local bots_json='["coderabbitai","coderabbitai[bot]","mendral-app","mendral-app[bot]","dependabot","dependabot[bot]","github-actions","github-actions[bot]","github-advanced-security","github-advanced-security[bot]","renovate","renovate[bot]","renovate-bot","pre-commit-ci","pre-commit-ci[bot]","codecov","codecov[bot]","sonarcloud","sonarcloud[bot]"]'
257
+ # - postSentinelBotComments: ARRAY of every bot comment after the sentinel
258
+ # - postSentinelHumanComments: ARRAY of every human comment after the sentinel
259
+ # - activityState: "active" / "uncertain" / "addressed"
194
260
  local threads_json
195
- threads_json="$(printf '%s' "$response" | jq --arg sentinel_prefix "$SENTINEL_PREFIX" --argjson bots "$bots_json" '
196
- # Exact login equality via IN($bots[]) — do NOT use `inside($bots)`, which
197
- # does substring matching for strings and would classify login "code" as a
198
- # bot because it appears inside "codecov".
261
+ threads_json="$(printf '%s' "$response" | jq --arg sentinel_prefix "$SENTINEL_PREFIX" --argjson bots "$BOTS_JSON" '
199
262
  def is_bot: ((.author.__typename // "") == "Bot") or ((.author.login // "") | IN($bots[]));
200
- # Match by version-agnostic prefix so pre-versioning sentinels left on
201
- # older PRs (`<!-- babysit-pr:addressed v1 -->`) still dedupe correctly.
202
263
  def is_sentinel: ((.body // "") | contains($sentinel_prefix));
203
264
  [
204
265
  .data.repository.pullRequest.reviewThreads.nodes[]
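The two-signal bot check (`__typename == "Bot"` OR an exact login match against the allowlist) can be illustrated outside jq with a small sketch. The array below is a shortened, hypothetical subset of `BOTS_JSON`, and `is_bot` here is a plain shell function written for illustration, not the jq definition in the script.

```bash
#!/usr/bin/env bash
# Shortened allowlist for illustration; the package keeps the full list in BOTS_JSON.
BOT_LOGINS=(coderabbitai codecov 'dependabot[bot]')

# is_bot TYPENAME LOGIN: succeed when either signal fires (union, not intersection).
is_bot() {
  local typename="$1" login="$2" known
  # Signal 1: GitHub itself marks the author as a Bot.
  if [ "$typename" = "Bot" ]; then
    return 0
  fi
  # Signal 2: exact login equality against the allowlist (no substring matching).
  for known in "${BOT_LOGINS[@]}"; do
    if [ "$login" = "$known" ]; then
      return 0
    fi
  done
  return 1
}

for pair in 'Bot some-new-bot' 'User codecov' 'User code' 'User alice'; do
  set -- $pair
  if is_bot "$1" "$2"; then
    echo "$2 ($1): bot"
  else
    echo "$2 ($1): human"
  fi
done
```

Exact equality matters here: a login such as `code` is a substring of `codecov`, so substring matching would misclassify it as a bot.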
@@ -211,6 +272,7 @@ main() {
211
272
  replyToCommentDatabaseId: ($comments[0].databaseId // null),
212
273
  file: ($comments[0].path // null),
213
274
  line: ($comments[0].line // $comments[0].originalLine // null),
275
+ commentsTruncated: ($t.comments.pageInfo.hasNextPage // false),
214
276
  comments: [
215
277
  $comments[] | {
216
278
  id,
@@ -285,8 +347,7 @@ main() {
285
347
  ]
286
348
  ')"
287
349
 
288
- # Flattened unresolved_comments — retained for backward compat with the prose summary.
289
- # Includes comments from "active" AND "uncertain" threads so the agent never misses new feedback.
350
+ # Flattened unresolved_comments — retained for backward compat.
290
351
  local all_unresolved
291
352
  all_unresolved="$(printf '%s' "$threads_json" | jq '[
292
353
  .[]
@@ -303,11 +364,6 @@ main() {
303
364
  ]')"
304
365
 
305
366
  # Filter out fixed code-scanning alerts from github-advanced-security.
306
- # Two-pass: collect unique alert numbers, query each once, then drop matching
307
- # comments in a single jq pass. Avoids the O(N²) rebuild and duplicate gh api
308
- # calls the naive per-comment loop would incur.
309
- # github-advanced-security posts under either login depending on account type
310
- # (app vs direct) — both forms match below.
311
367
  local security_alerts
312
368
  security_alerts="$(printf '%s' "$all_unresolved" | jq -r '
313
369
  .[]
@@ -326,10 +382,6 @@ main() {
326
382
  if [ -z "$fixed_alerts" ]; then
327
383
  unresolved_comments="$all_unresolved"
328
384
  else
329
- # capture() on a non-matching string produces ZERO outputs (not null, not an
330
- # error). Without the `// null` guard below, `as $n` would bind to nothing
331
- # and the map entry would silently collapse to empty — dropping
332
- # github-advanced-security comments that reference no code-scanning URL.
333
385
  unresolved_comments="$(printf '%s' "$all_unresolved" | jq --arg fixed "$fixed_alerts" '
334
386
  ($fixed | split(" ") | map(select(length > 0))) as $fixedSet
335
387
  | map(
@@ -342,53 +394,135 @@ main() {
342
394
  ')"
343
395
  fi
344
396
 
345
- # Automated review-body comments. The legacy function/field names stay for
346
- # compatibility with callers that already consume nitpickComments.
347
- local reviews_json
348
- reviews_json="$(printf '%s' "$response" | jq '[.data.repository.pullRequest.reviews.nodes[]]')"
349
- local nitpick_comments
350
- nitpick_comments="$(extract_nitpick_comments "$reviews_json")"
397
+ # Raw review-body comments from known bots. The agent reads each body itself
398
+ # and extracts findings; no pre-parsing.
399
+ local raw_review_body_comments
400
+ raw_review_body_comments="$(printf '%s' "$response" | jq --argjson bots "$BOTS_JSON" '
401
+ def is_bot_author: ((.author.__typename // "") == "Bot") or ((.author.login // "") | IN($bots[]));
402
+ [
403
+ .data.repository.pullRequest.reviews.nodes[]
404
+ | select((.body // "") != "")
405
+ | select(is_bot_author)
406
+ | {
407
+ author: (.author.login // "deleted-user"),
408
+ authorType: (.author.__typename // null),
409
+ createdAt: .createdAt,
410
+ body: .body
411
+ }
412
+ ]
413
+ ')"
414
+ local review_body_comments
415
+ review_body_comments="$(add_fingerprints "$raw_review_body_comments")"
416
+
417
+ # All issue comments (top-level Conversation-tab comments).
418
+ local raw_issue_comments
419
+ raw_issue_comments="$(printf '%s' "$response" | jq --arg sentinel_prefix "$SENTINEL_PREFIX" --argjson bots "$BOTS_JSON" '
420
+ def is_sentinel_body: ((.body // "") | contains($sentinel_prefix));
421
+ def is_bot_author: ((.author.__typename // "") == "Bot") or ((.author.login // "") | IN($bots[]));
422
+ [
423
+ .data.repository.pullRequest.comments.nodes[]
424
+ | {
425
+ id,
426
+ databaseId,
427
+ author: (.author.login // "deleted-user"),
428
+ authorType: (.author.__typename // null),
429
+ body,
430
+ createdAt,
431
+ url,
432
+ isBabysitSentinel: is_sentinel_body,
433
+ isKnownBot: is_bot_author
434
+ }
435
+ ]
436
+ ')"
437
+ local issue_comments
438
+ issue_comments="$(add_fingerprints "$raw_issue_comments")"
351
439
 
352
- # Active threads: anything NOT yet addressed. Includes "uncertain" — agent must inspect.
440
+ # priorBabysitSentinels: issue comments containing the sentinel prefix.
441
+ local prior_sentinels
442
+ prior_sentinels="$(printf '%s' "$issue_comments" | jq '[.[] | select(.isBabysitSentinel)]')"
443
+
444
+ # Concatenate prior sentinel bodies into one blob — used as a haystack for
445
+ # fingerprint dedupe (both review-body and issue-comment fingerprints land
446
+ # in the fenced block at the end of a babysit-pr summary).
447
+ local prior_sentinel_blob
448
+ prior_sentinel_blob="$(printf '%s' "$prior_sentinels" | jq -r '[.[].body] | join("\n")')"
449
+
450
+ # activeIssueComments: non-sentinel, non-bot comments whose fingerprint is
451
+ # NOT already listed in any prior babysit-pr summary.
452
+ local active_issue_comments
453
+ active_issue_comments="$(printf '%s' "$issue_comments" | jq --arg blob "$prior_sentinel_blob" '
454
+ [.[]
455
+ | select(.isBabysitSentinel | not)
456
+ | select(.isKnownBot | not)
457
+ | select(.fingerprint as $fp | $blob | contains($fp) | not)
458
+ ]
459
+ ')"
460
+
461
+ # Active threads: anything NOT yet addressed.
353
462
  local active_threads total_active_threads uncertain_threads total_uncertain_threads
354
463
  active_threads="$(printf '%s' "$threads_json" | jq '[.[] | select(.activityState != "addressed")]')"
355
464
  total_active_threads="$(printf '%s' "$active_threads" | jq 'length')"
356
465
  uncertain_threads="$(printf '%s' "$threads_json" | jq '[.[] | select(.activityState == "uncertain")]')"
357
466
  total_uncertain_threads="$(printf '%s' "$uncertain_threads" | jq 'length')"
358
467
 
359
- local total_unresolved total_nitpicks
468
+ local total_unresolved total_review_body_comments total_active_issue_comments
360
469
  total_unresolved="$(printf '%s' "$unresolved_comments" | jq 'length')"
361
- total_nitpicks="$(printf '%s' "$nitpick_comments" | jq 'length')"
470
+ total_review_body_comments="$(printf '%s' "$review_body_comments" | jq 'length')"
471
+ total_active_issue_comments="$(printf '%s' "$active_issue_comments" | jq 'length')"
472
+
473
+ # Truncation: which connections hit GitHub's 100-item GraphQL cap?
474
+ local truncated
475
+ truncated="$(jq -n \
476
+ --argjson response "$response" \
477
+ --argjson threads "$threads_json" \
478
+ '
479
+ [
480
+ (if $response.data.repository.pullRequest.reviewThreads.pageInfo.hasNextPage then "reviewThreads" else empty end),
481
+ (if [$threads[] | select(.commentsTruncated)] | length > 0 then "thread-comments" else empty end),
482
+ (if $response.data.repository.pullRequest.reviews.pageInfo.hasNextPage then "reviews" else empty end),
483
+ (if $response.data.repository.pullRequest.comments.pageInfo.hasNextPage then "issueComments" else empty end)
484
+ ]
485
+ ')"
362
486
 
363
487
  jq -n \
488
+ --argjson activeIssueComments "$active_issue_comments" \
364
489
  --argjson activeThreads "$active_threads" \
365
- --argjson nitpickComments "$nitpick_comments" \
490
+ --argjson issueComments "$issue_comments" \
366
491
  --arg owner "$owner" \
367
492
  --argjson prNumber "$pr_number" \
493
+ --argjson priorBabysitSentinels "$prior_sentinels" \
368
494
  --arg repo "$repo" \
495
+ --argjson reviewBodyComments "$review_body_comments" \
369
496
  --arg sentinel "$SENTINEL" \
370
497
  --arg title "$title" \
371
498
  --argjson threads "$threads_json" \
499
+ --argjson totalActiveIssueComments "$total_active_issue_comments" \
372
500
  --argjson totalActiveThreads "$total_active_threads" \
373
- --argjson totalNitpicks "$total_nitpicks" \
501
+ --argjson totalReviewBodyComments "$total_review_body_comments" \
374
502
  --argjson totalUncertainThreads "$total_uncertain_threads" \
375
503
  --argjson totalUnresolvedComments "$total_unresolved" \
504
+ --argjson truncated "$truncated" \
376
505
  --argjson uncertainThreads "$uncertain_threads" \
377
506
  --argjson unresolvedComments "$unresolved_comments" \
378
507
  --arg url "$url" \
379
508
  '{
509
+ activeIssueComments: $activeIssueComments,
380
510
  activeThreads: $activeThreads,
381
- nitpickComments: $nitpickComments,
511
+ issueComments: $issueComments,
382
512
  owner: $owner,
383
513
  prNumber: $prNumber,
514
+ priorBabysitSentinels: $priorBabysitSentinels,
384
515
  repo: $repo,
516
+ reviewBodyComments: $reviewBodyComments,
385
517
  sentinel: $sentinel,
386
518
  threads: $threads,
387
519
  title: $title,
520
+ totalActiveIssueComments: $totalActiveIssueComments,
388
521
  totalActiveThreads: $totalActiveThreads,
389
- totalNitpicks: $totalNitpicks,
522
+ totalReviewBodyComments: $totalReviewBodyComments,
390
523
  totalUncertainThreads: $totalUncertainThreads,
391
524
  totalUnresolvedComments: $totalUnresolvedComments,
525
+ truncated: $truncated,
392
526
  uncertainThreads: $uncertainThreads,
393
527
  unresolvedComments: $unresolvedComments,
394
528
  url: $url
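The dedupe behind `activeIssueComments` treats the concatenated bodies of prior babysit-pr summaries as a haystack and looks for each comment's fingerprint in it. A minimal shell sketch of that idea follows; the summary text and both fingerprints are made-up sample values, and `is_listed` is a hypothetical helper name. Unlike a bare substring test, this sketch treats an empty fingerprint as "not listed" so such a comment stays active, which is a design choice of the sketch rather than something the diff states.

```bash
#!/usr/bin/env bash
# Hypothetical prior summary body; the fingerprint in it is a sample value.
prior_blob='## Conversation-tab comments
Agree 1
1a2b3c4d5e6f7081'

# is_listed FINGERPRINT BLOB: succeed if FINGERPRINT appears anywhere in BLOB.
is_listed() {
  # An empty fingerprint would match every blob, so never count it as listed.
  [ -n "$1" ] || return 1
  case "$2" in
    *"$1"*) return 0 ;;
    *) return 1 ;;
  esac
}

for fp in 1a2b3c4d5e6f7081 ffeeddccbbaa9988; do
  if is_listed "$fp" "$prior_blob"; then
    echo "$fp: already covered by a prior summary"
  else
    echo "$fp: still active"
  fi
done
```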
@@ -45,6 +45,6 @@ Script paths in this procedure are written as `scripts/...`, relative to this SK
45
45
  4. Push the branch to origin.
46
46
  5. Look up the current agent session ID by running this skill's bundled script: `bash scripts/find-session-id.sh '<phrase>'`. Pass a distinctive verbatim chunk (≥10 words) from the most recent user message; see the script header for quoting constraints. If the script prints `codex <id>`, use `Agent session: codex resume <id>`. If it prints `claude-code <id>`, use `Agent session: claude --resume <id>`. If empty, there is no session footer line.
47
47
  6. Check for an existing PR with `gh pr view`.
48
- - No PR: create with `gh pr create`. Title = commit subject. Description = the PR body shape above, followed by the session footer line if known and `<!-- commit-push-pr:created v1 core@3.4.1 -->`.
49
- - PR exists: refresh the body via `gh pr edit --body` so (a) the new commit's changes are reflected in the prose while existing `## Summary`, `## Validation`, and `## Notes` sections are preserved unless clearly stale, (b) any known session footer line is appended if missing, never removing or rewriting existing `Agent session: ...` or `Agent session ID: ...` lines, and (c) any existing `<!-- commit-push-pr:created v1 ... -->` line is preserved verbatim, appending `<!-- commit-push-pr:created v1 core@3.4.1 -->` if absent. Then report the URL.
48
+ - No PR: create with `gh pr create`. Title = commit subject. Description = the PR body shape above, followed by the session footer line if known and the agent footer `<sub>🤖 <code>commit-push-pr:created v1 core@3.4.1</code></sub>` on its own line.
49
+ - PR exists: refresh the body via `gh pr edit --body` so (a) the new commit's changes are reflected in the prose while existing `## Summary`, `## Validation`, and `## Notes` sections are preserved unless clearly stale, (b) any known session footer line is appended if missing, never removing or rewriting existing `Agent session: ...` or `Agent session ID: ...` lines, and (c) any existing footer carrying the substring `commit-push-pr:created v1` is preserved verbatim, appending `<sub>🤖 <code>commit-push-pr:created v1 core@3.4.1</code></sub>` only if absent. Then report the URL.
50
50
  7. End with one short text response: branch name and the full PR URL (e.g., `https://github.com/clipboardhealth/core-utils/pull/123`). Never use shorthand like `repo#123` — always output the complete URL.
@@ -1,272 +0,0 @@
1
- #!/usr/bin/env bash
2
- # parseNitpicks.sh — Parse bot review-body comments from PR review bodies.
3
- #
4
- # Each emitted comment includes a stable `fingerprint` field (sha256 of file +
5
- # normalized line range + title + body), so reposted reviews dedupe to the same
6
- # fingerprint. Source review timestamps are kept as `createdAt` metadata but
7
- # NOT included in the fingerprint.
8
- #
9
- # Sourced by unresolvedPrComments.sh. Requires: perl with Digest::SHA + Encode.
10
-
11
- extract_nitpick_comments() {
12
- local reviews_json="$1"
13
-
14
- printf '%s' "$reviews_json" | perl -e '
15
- use strict;
16
- use warnings;
17
- use JSON::PP;
18
- use Digest::SHA qw(sha256_hex);
19
- use Encode qw(encode_utf8);
20
-
21
- local $/;
22
- my $reviews_json = <STDIN>;
23
- my $reviews = decode_json($reviews_json);
24
-
25
- my @comments = (
26
- extract_coderabbit_comments($reviews),
27
- extract_mendral_comments($reviews),
28
- );
29
- print encode_json(\@comments);
30
-
31
- sub extract_coderabbit_comments {
32
- my ($reviews) = @_;
33
-
34
- my $latest_review;
35
- my $latest_time = "";
36
- for my $review (@$reviews) {
37
- my $author = $review->{author}{login} // "";
38
- my $body = $review->{body} // "";
39
- next unless $author eq "coderabbitai" && has_supported_sections($body);
40
- my $created = $review->{createdAt} // "";
41
- if ($created gt $latest_time) {
42
- $latest_time = $created;
43
- $latest_review = $review;
44
- }
45
- }
46
-
47
- return () unless $latest_review;
48
-
49
- my $body = $latest_review->{body};
50
- my $author = $latest_review->{author}{login} // "deleted-user";
51
- my $created_at = $latest_review->{createdAt} // "";
52
-
53
- my @sections = extract_review_body_comment_sections($body);
54
- return () unless @sections;
55
-
56
- my @comments;
57
- for my $section (@sections) {
58
- my $section_content = $section->{content};
59
- my $category = $section->{category};
60
-
61
- while ($section_content =~ /<details>\s*<summary>([^<]+?)\s+\(\d+\)<\/summary>\s*<blockquote>([\s\S]*?)<\/blockquote>\s*<\/details>/g) {
62
- my $raw_file_name = trim($1);
63
- my $file_content = $2;
64
-
65
- # Category prefix is optional. CodeRabbit emits 0–N `_…_` tags
66
- # separated by `|` (e.g. `_⚠️ Potential issue_ | _🟠 Major_ | _⚡ Quick win_`
67
- # or just `_💤 Low value_` on lower-confidence findings). The previous
68
- # regex required exactly two tags and silently dropped one-tag and
69
- # three-tag variants.
70
- while ($file_content =~ /`(\d+(?:-\d+)?)`:\s*(?:_[^_]+_(?:\s*\|\s*_[^_]+_)*\s*)?\*\*([^*]+)\*\*\s*([\s\S]*?)(?=---|\n`\d|<\/blockquote>|$)/g) {
71
- my $line_range = $1;
72
- my $title = trim($2);
73
- my $clean_body = clean_comment_body(trim($3));
74
- my $file_name = normalize_file_name($raw_file_name, $line_range);
75
-
76
- push @comments, review_body_comment(
77
- $author,
78
- $created_at,
79
- $file_name,
80
- $line_range,
81
- $title,
82
- $clean_body,
83
- $category,
84
- );
85
- }
86
- }
87
- }
88
-
89
- return @comments;
90
- }
91
-
92
- sub extract_mendral_comments {
93
- my ($reviews) = @_;
94
-
95
- my $latest_review;
96
- my $latest_time = "";
97
- for my $review (@$reviews) {
98
- my $author = $review->{author}{login} // "";
99
- my $body = $review->{body} // "";
100
- next unless ($author eq "mendral-app" || $author eq "mendral-app[bot]") && is_actionable_mendral_review($body);
101
- my $created = $review->{createdAt} // "";
102
- if ($created gt $latest_time) {
103
- $latest_time = $created;
104
- $latest_review = $review;
105
- }
106
- }
107
-
108
- return () unless $latest_review;
109
-
110
- my $body = $latest_review->{body} // "";
111
- my $title = mendral_title($body);
112
- return () unless $title;
113
-
114
- my $clean_body = clean_mendral_body($body);
115
- return () unless $clean_body ne "";
116
-
117
- my ($file_name, $line_range) = extract_first_file_line_reference($clean_body);
118
- return () unless $file_name && $line_range;
119
-
120
- return review_body_comment(
121
- $latest_review->{author}{login} // "deleted-user",
122
- $latest_review->{createdAt} // "",
123
- $file_name,
124
- $line_range,
125
- $title,
126
- $clean_body,
127
- "mendral",
128
- );
129
- }
130
-
131
- sub review_body_comment {
132
- my ($author, $created_at, $file_name, $line_range, $title, $clean_body, $category) = @_;
133
-
134
- # Fingerprint: file + normalized line + title + body (NO timestamp,
135
- # NO author, NO category — reposted reviews must dedupe to the same
136
- # fingerprint even if a review bot relabels the section).
137
- my $fingerprint_input = join("\n", $file_name, $line_range, $title, $clean_body);
138
- my $fingerprint = substr(sha256_hex(encode_utf8($fingerprint_input)), 0, 16);
139
-
140
- return {
141
- author => $author,
142
- body => "$title\n\n$clean_body",
143
- category => $category,
144
- createdAt => $created_at,
145
- file => $file_name,
146
- fingerprint => $fingerprint,
147
- line => $line_range,
148
- title => $title,
149
- };
150
- }
151
-
152
- sub has_supported_sections {
153
- my ($text) = @_;
154
- $text = strip_markdown_blockquote_prefixes($text);
155
- return $text =~ /<summary>\s*[^<]*(?:Nitpick comments|Minor comments|Outside diff range comments)\s*\(\d+\)<\/summary>\s*<blockquote>/i;
156
- }
157
-
158
- sub is_actionable_mendral_review {
159
- my ($text) = @_;
160
- my $title = mendral_title($text);
161
- return defined $title && $title =~ /^(?:needs attention|changes requested|needs changes)$/i;
162
- }
163
-
164
- sub mendral_title {
165
- my ($text) = @_;
166
- $text = strip_markdown_blockquote_prefixes($text);
167
- return $1 if $text =~ /^\s*\*\*([^*]+)\*\*/m;
168
- return undef;
169
- }
170
-
171
- sub clean_mendral_body {
172
- my ($text) = @_;
173
- $text = strip_markdown_blockquote_prefixes($text);
174
- $text =~ s/^\s*\*\*[^*]+\*\*\s*//;
175
- $text =~ s/<details>[\s\S]*$//;
176
- $text =~ s/<sub>[\s\S]*?<\/sub>//g;
177
- $text =~ s/<!--[\s\S]*?-->//g;
178
- return trim($text);
179
- }
180
-
181
- sub extract_first_file_line_reference {
182
- my ($text) = @_;
183
- $text =~ s/\x{2013}|\x{2014}/-/g;
184
-
185
- if ($text =~ /`([^`\n]+\/[^`\n]+\.[A-Za-z0-9]+)`[^\n]{0,120}?\blines?\s+(\d+(?:\s*(?:-|to)\s*\d+)?)/i) {
186
- return ($1, normalize_line_range($2));
187
- }
188
-
189
- return (undef, undef);
190
- }
191
-
192
- sub normalize_line_range {
193
- my ($line_range) = @_;
194
- $line_range = trim($line_range);
195
- return "$1-$2" if $line_range =~ /^(\d+)\s*(?:-|to)\s*(\d+)$/i;
196
- return $line_range;
197
- }
198
-
199
- sub extract_review_body_comment_sections {
200
- my ($text) = @_;
201
- $text = strip_markdown_blockquote_prefixes($text);
202
-
203
- my @sections;
204
- while ($text =~ /<summary>\s*[^<]*(Nitpick comments|Minor comments|Outside diff range comments)\s*\(\d+\)<\/summary>\s*<blockquote>/ig) {
205
- my $category = section_category($1);
206
- my $content_start = $+[0];
207
- my $after = substr($text, $content_start);
208
-
209
- my $depth = 1;
210
- my @tags;
211
- while ($after =~ /(<blockquote>|<\/blockquote>)/gi) {
212
- my $tag = $1;
213
- my $pos = $-[0];
214
- my $is_open = ($tag =~ /^<blockquote>/i) ? 1 : 0;
215
- push @tags, [$pos, $is_open];
216
- }
217
- for my $tag (@tags) {
218
- $depth += $tag->[1] ? 1 : -1;
219
- if ($depth == 0) {
220
- push @sections, {
221
- category => $category,
222
- content => substr($after, 0, $tag->[0]),
223
- };
224
- last;
225
- }
226
- }
227
- }
228
- return @sections;
229
- }
230
-
231
- sub section_category {
232
- my ($label) = @_;
233
- return "nitpick" if $label =~ /Nitpick comments/i;
234
- return "minor" if $label =~ /Minor comments/i;
235
- return "outside-diff" if $label =~ /Outside diff range comments/i;
236
- return "unknown";
237
- }
238
-
239
- sub normalize_file_name {
240
- my ($file_name, $line_range) = @_;
241
- my $suffix = "-" . $line_range;
242
- $file_name =~ s/\Q$suffix\E$//;
243
- return $file_name;
244
- }
245
-
246
- sub strip_markdown_blockquote_prefixes {
247
- my ($text) = @_;
248
- $text =~ s/^[ \t]*>[ \t]?//mg;
249
- return $text;
250
- }
251
-
252
- sub clean_comment_body {
253
- my ($text) = @_;
254
- my $prev = "";
255
- while ($text ne $prev) {
256
- $prev = $text;
257
- $text =~ s/<details>(?:(?!<details>)[\s\S])*?<\/details>//g;
258
- }
259
- # Do NOT HTML-escape angle brackets: the nitpick body is posted back to GitHub
260
- # as Markdown via `gh api`, where `&lt;`/`&gt;` would render literally and
261
- # corrupt generic-type expressions or HTML snippets from the original review.
262
- return trim($text);
263
- }
264
-
265
- sub trim {
266
- my ($s) = @_;
267
- $s =~ s/^\s+//;
268
- $s =~ s/\s+$//;
269
- return $s;
270
- }
271
- '
272
- }