npm - claude-dev-env - Versions diffs - 1.37.1 → 1.38.0 - Mend

claude-dev-env 1.37.1 → 1.38.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (92) hide show

package/CLAUDE.md +3 -0
package/_shared/pr-loop/audit-contract.md +4 -3
package/_shared/pr-loop/fix-protocol.md +2 -0
package/_shared/pr-loop/gh-payloads.md +38 -37
package/_shared/pr-loop/scripts/README.md +0 -1
package/_shared/pr-loop/scripts/preflight.py +2 -1
package/_shared/pr-loop/scripts/tests/test_code_rules_gate.py +2 -2
package/_shared/pr-loop/scripts/tests/test_preflight.py +22 -0
package/_shared/pr-loop/state-schema.md +10 -10
package/agents/clean-coder.md +4 -0
package/agents/code-quality-agent.md +23 -85
package/agents/groq-coder.md +8 -6
package/hooks/blocking/__init__.py +0 -0
package/hooks/blocking/hedging_language_blocker.py +2 -2
package/hooks/blocking/state_description_blocker.py +243 -0
package/hooks/blocking/tdd_enforcer.py +94 -0
package/hooks/blocking/test_hedging_language_blocker.py +1 -1
package/hooks/blocking/test_state_description_blocker.py +618 -0
package/hooks/blocking/test_tdd_enforcer.py +152 -0
package/hooks/config/state_description_blocker_constants.py +130 -0
package/hooks/hooks.json +10 -0
package/package.json +1 -1
package/rules/no-historical-clutter.md +31 -10
package/scripts/config/groq_bugteam_config.py +13 -5
package/skills/bugteam/CONSTRAINTS.md +20 -27
package/skills/bugteam/EXAMPLES.md +1 -1
package/skills/bugteam/PROMPTS.md +60 -31
package/skills/bugteam/SKILL.md +47 -47
package/skills/bugteam/SKILL_EVALS.md +8 -8
package/skills/bugteam/reference/github-pr-reviews.md +31 -31
package/skills/bugteam/reference/team-setup.md +1 -1
package/skills/bugteam/reference/teardown-publish-permissions.md +4 -4
package/skills/copilot-review/SKILL.md +7 -14
package/skills/findbugs/SKILL.md +2 -2
package/skills/fixbugs/SKILL.md +1 -1
package/skills/monitor-open-prs/SKILL.md +6 -6
package/skills/pr-converge/SKILL.md +7 -6
package/skills/pr-converge/reference/convergence-gates.md +28 -30
package/skills/pr-converge/reference/examples.md +4 -4
package/skills/pr-converge/reference/fix-protocol.md +6 -8
package/skills/pr-converge/reference/multi-pr-orchestration.md +10 -10
package/skills/pr-converge/reference/per-tick.md +18 -33
package/skills/pr-converge/reference/stop-conditions.md +7 -7
package/skills/pr-converge/scripts/README.md +65 -117
package/skills/pr-review-responder/EXAMPLES.md +2 -2
package/skills/pr-review-responder/PRINCIPLES.md +2 -8
package/skills/pr-review-responder/README.md +7 -48
package/skills/pr-review-responder/SKILL.md +2 -3
package/skills/pr-review-responder/TESTING.md +8 -65
package/skills/qbug/SKILL.md +10 -16
package/_shared/pr-loop/scripts/config/gh_util_constants.py +0 -31
package/_shared/pr-loop/scripts/gh_util.py +0 -193
package/_shared/pr-loop/scripts/tests/test_gh_util.py +0 -257
package/_shared/pr-loop/scripts/tests/test_gh_util_constants.py +0 -61
package/skills/pr-converge/scripts/check_pr_mergeability.py +0 -78
package/skills/pr-converge/scripts/config/pr_converge_constants.py +0 -134
package/skills/pr-converge/scripts/config/test_pr_converge_constants.py +0 -152
package/skills/pr-converge/scripts/fetch_bugbot_inline_comments.py +0 -70
package/skills/pr-converge/scripts/fetch_bugbot_reviews.py +0 -57
package/skills/pr-converge/scripts/fetch_claude_inline_comments.py +0 -70
package/skills/pr-converge/scripts/fetch_claude_reviews.py +0 -61
package/skills/pr-converge/scripts/fetch_copilot_inline_comments.py +0 -70
package/skills/pr-converge/scripts/fetch_copilot_reviews.py +0 -61
package/skills/pr-converge/scripts/mark_pr_ready.py +0 -54
package/skills/pr-converge/scripts/post-bugbot-run.helpers.ps1 +0 -49
package/skills/pr-converge/scripts/post-bugbot-run.ps1 +0 -33
package/skills/pr-converge/scripts/reply_to_inline_comment.py +0 -84
package/skills/pr-converge/scripts/request_copilot_review.py +0 -71
package/skills/pr-converge/scripts/resolve_pr_head.py +0 -58
package/skills/pr-converge/scripts/review_field_helpers.py +0 -43
package/skills/pr-converge/scripts/reviewer_fetch_core.py +0 -153
package/skills/pr-converge/scripts/reviewer_specs.py +0 -98
package/skills/pr-converge/scripts/test_check_pr_mergeability.py +0 -126
package/skills/pr-converge/scripts/test_fetch_bugbot_inline_comments.py +0 -443
package/skills/pr-converge/scripts/test_fetch_bugbot_reviews.py +0 -299
package/skills/pr-converge/scripts/test_fetch_claude_inline_comments.py +0 -485
package/skills/pr-converge/scripts/test_fetch_claude_reviews.py +0 -368
package/skills/pr-converge/scripts/test_fetch_copilot_inline_comments.py +0 -440
package/skills/pr-converge/scripts/test_fetch_copilot_reviews.py +0 -366
package/skills/pr-converge/scripts/test_mark_pr_ready.py +0 -69
package/skills/pr-converge/scripts/test_post_bugbot_run.py +0 -195
package/skills/pr-converge/scripts/test_reply_to_inline_comment.py +0 -159
package/skills/pr-converge/scripts/test_request_copilot_review.py +0 -101
package/skills/pr-converge/scripts/test_resolve_pr_head.py +0 -79
package/skills/pr-converge/scripts/test_review_field_helpers.py +0 -80
package/skills/pr-converge/scripts/test_reviewer_fetch_core.py +0 -448
package/skills/pr-converge/scripts/test_reviewer_specs.py +0 -107
package/skills/pr-converge/scripts/test_trigger_bugbot.py +0 -139
package/skills/pr-converge/scripts/test_view_pr_context.py +0 -155
package/skills/pr-converge/scripts/trigger_bugbot.py +0 -77
package/skills/pr-converge/scripts/view_pr_context.py +0 -78
package/skills/pr-review-responder/scripts/respond_to_reviews.py +0 -376

package/skills/bugteam/PROMPTS.md CHANGED Viewed

@@ -15,16 +15,35 @@ Keep the spawn prompt self-contained: reference only the PR scope, audit rubric,
   <worktree_path>absolute path from Step 1 per-PR workspace</worktree_path>
 </context>
-cd into `<worktree_path>` before any git, gh, or file operation.
+cd into `<worktree_path>` before any git or file operation.
 <scope>
   <diff_path>Absolute path to the per-PR patch file: <run_temp_dir>/pr-<N>/loop-<L>.patch (same path as gh pr diff redirect in AUDIT)</diff_path>
   <scope_rule>Audit only lines added or modified in the diff. Pre-existing code on untouched lines is out of scope.</scope_rule>
+  <changed_files_rule>Build the list of changed file paths from the diff. Open each one with Read and audit cross-file consistency. Read every changed test file and cross-reference test assertions, expected values, and mock setup against the production code's config constants and function signatures. When a test file asserts a value that diverges from config, file a finding under category J.</changed_files_rule>
 </scope>
 <bug_categories>
-  Investigate each category explicitly. For each, return either at least
-  one finding OR a verified-clean entry with the evidence used to clear it:
+  Investigate each of the eleven categories (A–K) explicitly. For each,
+  return either at least one finding OR a verified-clean entry with the
+  evidence used to clear it. A category is verified-clean only when one
+  complete execution path through the changed code has been traced from
+  entry to exit. Surface-level scanning is insufficient evidence. The
+  evidence field must name (1) the specific function examined, (2) the
+  code path traced from entry to exit, and (3) the specific check performed.
+  Generic phrases such as "verified clean", "no issues found",
+  "pattern appears correct", "looks good", "seems fine", and
+  "no problems detected" do not satisfy the verified-clean requirement.
+  When evidence contains any of these phrases, the category is not
+  verified-clean -- re-audit with a concrete trace.
+  Categories A–K (one-line summary; full rubric and sub-bucket decomposition
+  for each is in `packages/claude-dev-env/audit-rubrics/category_rubrics/`;
+  ready-to-send Variant C prompts — each with a PR/repo-independent
+  generalized skeleton above a `---` separator and a worked example against
+  an authentic PR below — are in
+  `packages/claude-dev-env/audit-rubrics/prompts/`):
   A. API contract verification (signatures, return types, async/await correctness)
   B. Selector / query / engine compatibility
   C. Resource cleanup and lifecycle (file handles, connections, processes, locks)
@@ -35,6 +54,10 @@ cd into `<worktree_path>` before any git, gh, or file operation.
   H. Security boundaries (injection, path traversal, auth bypass, secret leakage)
   I. Concurrency hazards (race conditions, missing awaits, shared mutable state)
   J. Magic values and configuration drift
+  K. Codebase conflicts — a change updates one site of a pattern but a parallel
+     site in unchanged code stays stale, producing contradictory behavior;
+     the diff is internally consistent, the bug emerges only against unchanged
+     code (canonical example: jl-cmd/claude-code-config PR #397 r3210166636)
 </bug_categories>
 <constraints>
@@ -42,26 +65,26 @@ cd into `<worktree_path>` before any git, gh, or file operation.
   - Cite file:line for every finding.
   - When the diff alone does not provide enough context to confirm a bug,
     list it under "Open questions" rather than assert it.
+  - For every finding, search `git grep` for all callers of the targeted function. When the obvious fix would silently change behavior for other call paths, include a fix constraint that preserves them.
 </constraints>
 <comment_posting>
-  Sibling auditors (-b through -k): run only steps 1–3 (audit, assign IDs,
+  Sibling auditors (-b through -k): run only steps 1–2 (audit, assign IDs,
   capture excerpt, validate anchors), then write outcome XML per <output_format> and return.
-  Skip steps 4–8 — sibling auditors do not post PR reviews.
+  Skip steps 3–5 — sibling auditors do not post PR reviews.
   Validator (-a) and single-opus auditors: run all steps below.
-  1. Audit the diff against the 10 categories above. Buffer the findings
-     in memory; all posting happens at step 6 once anchors are validated.
+  1. Audit the diff against the 11 categories above. Buffer the findings
+     in memory; all posting happens at step 4 once anchors are validated.
   2. Assign each finding a stable finding_id of exactly the form `loop<L>-<K>`
      where <K> is 1-based within this loop.
-  3. For each finding, capture a verbatim excerpt from the target file at the cited line. Populate the `<excerpt>` element in the outcome XML with it. Validate every finding's (file, line) against the captured diff. Split
-     findings into two buckets: anchored (line is in the diff) and
-     unanchored (line is not in the diff — goes into the review body's
-     "Findings without a diff anchor" section per Step 2.5).
-  4. Build the review body per Step 2.5's review-body shape, filling in the
-     P0/P1/P2 counts and the unanchored-findings list (if any).
-  5. For each anchored finding, write its body to its own temp file:
+  3. For each finding, capture a verbatim excerpt from the target file at the cited
+     line. Populate the `<excerpt>` element in the outcome XML with it. Validate
+     every finding's (file, line) against the captured diff. Split findings into two
+     buckets: anchored (line is in the diff) and unanchored (line is not in the diff
+     — goes into the review body's "Findings without a diff anchor" section per
+     Step 2.5). Format each finding body as:
        **[severity] one-line title**
        Category: <letter> (<category name>)
@@ -69,16 +92,16 @@ cd into `<worktree_path>` before any git, gh, or file operation.
        _From /bugteam audit loop <L>._
-  6. Post ONE review via Step 2.5's per-loop review CLI shape. Harvest the
-     parent review `html_url` from the response JSON and the `comments[]`
-     child entries (each with its own `id` and `html_url`). Match child
-     entries to anchored findings in index order.
-  7. If the review POST itself fails, use Step 2.5's Review POST failure
-     fallback (single issue comment with full body and all findings inline).
-  8. Write every body (review body, each finding body, any fallback body)
-     to its own temp file. Load each file into the JSON payload via jq's
-     `--rawfile` or `-Rs`, then pipe the jq output to `gh api ... --input -`
-     so every body reaches GitHub as file contents inside the JSON payload.
+  4. Post ONE review via `pull_request_review_write(method="create",
+     event="COMMENT", body=<review_body>, owner=<O>, repo=<R>,
+     pullNumber=<N>, comments=[...])`. See Step 2.5 in SKILL.md for the full
+     parameter shape. Harvest the parent review `html_url` from the response
+     and the `comments[]` child entries (each with its own `id` and `html_url`).
+     Match child entries to anchored findings in index order.
+  5. If the review POST fails, use `add_issue_comment(owner=<O>, repo=<R>,
+     issueNumber=<N>, body=<full_text>)` as fallback.
+  Body text is passed directly as string parameters to the MCP tool calls —
+  no temp files, no jq, no shell pipes.
 </comment_posting>
 <output_format>
@@ -126,7 +149,7 @@ After the teammate writes the XML and returns, the lead reads `.bugteam-pr<N>-lo
   <worktree_path>absolute path from Step 1 per-PR workspace</worktree_path>
 </context>
-cd into `<worktree_path>` before any git, gh, or file operation.
+cd into `<worktree_path>` before any git or file operation.
 <bugs_to_fix>
   [for each P0/P1/P2 finding from last_findings:]
@@ -147,19 +170,22 @@ cd into `<worktree_path>` before any git, gh, or file operation.
   1. Read each referenced file before editing.
   2. Apply each fix you can address.
   3. Run `python -m py_compile` (or language-equivalent) on every modified file.
-  4. git add by explicit path, then git commit with a message summarizing the bugs fixed.
+  4. Run the project's test suite and confirm all existing tests pass. If a test fails, diagnose the regression and fix it before committing.
+  5. Read the previous loop's outcome XML (`<worktree_path>/.bugteam-pr<N>-loop<L-1>.outcomes.xml`) and obtain its total finding count. If this is the first loop (L <= 1) or the file does not exist, skip this comparison. Otherwise, re-read each changed file and count any new violations. Compute the post-fix total: previous total minus bugs fixed in this round plus new violations. If the post-fix total exceeds the previous total, flag all new findings as same-loop fix-targets and revise before committing.
+  6. git add by explicit path, then git commit with a message summarizing the bugs fixed.
      - If the commit fails because a git hook (pre-commit, commit-msg, etc.) blocked it,
        capture the hook's stderr, write status=hook_blocked for every finding in this loop
        (the commit was atomic; if it failed, no finding was applied), populate hook_output
        on each outcome, and return WITHOUT retrying. The lead will treat this loop as no-progress.
-  5. git push with a plain fast-forward push (the default, no flag overrides).
-  6. For each bug, post a fix reply to its finding_comment_id via the
-     Step 2.5 reply CLI shape:
+  7. git push with a plain fast-forward push (the default, no flag overrides).
+  8. For each bug, post a fix reply to its finding_comment_id via
+     `add_reply_to_pull_request_comment(commentId=<id>, body=<reply_text>,
+     owner=<O>, repo=<R>, pullNumber=<N>)`:
      - "Fixed in <commit_sha>" if the bug was addressed by your commit
      - "Could not address this loop: <one-line reason>" if you skipped or failed it
      - "Hook blocked the fix commit: <one-line summary>" if the commit was hook-blocked
-     Use the Fix reply CLI shape from Step 2.5 (`jq -Rs | gh api .../comments/<id>/replies --input -`). Write every reply body to a temp file first.
-  7. Write `.bugteam-pr<N>-loop<L>.outcomes.xml` inside `<worktree_path>` (schema below) and return its path.
+     Body text is passed directly as string parameters -- no temp files, no jq, no shell pipes.
+  9. Write `.bugteam-pr<N>-loop<L>.fix-outcomes.xml` inside `<worktree_path>` (schema below) and return its path.
 </execution>
 <outcome_xml_schema>
@@ -186,5 +212,8 @@ cd into `<worktree_path>` before any git, gh, or file operation.
   - git add by explicit path — name each file being staged.
   - Preserve existing comments on lines you do not modify.
   - Type hints on every signature you touch.
+  - **Narrow scope.** Fix only the exact defect at the specified file:line. No restructuring, no inlining helpers, no renames, no "while I'm here" cleanup.
+  - **Preserve helpers.** Do not remove or inline existing helper functions unless the finding explicitly names the helper as the problem.
+  - **No regression.** Before committing, re-read each changed file and count any new violations. Compare the post-fix total (previous total minus bugs fixed plus new violations) against the previous loop's total finding count (from `<worktree_path>/.bugteam-pr<N>-loop<L-1>.outcomes.xml`). On the first loop (L <= 1) or when the file does not exist, skip this guard. The post-fix total must be flat or decreased relative to the previous loop. An increase means the fix introduced new bugs — revise before committing. Do not commit a regression.
 </constraints>
 ```

package/skills/bugteam/SKILL.md CHANGED Viewed

@@ -120,9 +120,9 @@ Non-zero → stop. Revoke in Step 5 on every exit path.
 ### Step 1: Resolve PR scope (once)
-Accept one or more PR numbers from the invocation. For each PR, run `gh pr view
---json number,baseRefName,headRefName,url` (falling back to the merge-base diff
-path when no PR exists). Capture `all_prs = [{number, owner, repo, baseRef,
+Accept one or more PR numbers from the invocation. For each PR, call
+`pull_request_read(method="get", pullNumber=N, owner=O, repo=R)` (falling back
+to the merge-base diff path when no PR exists). Capture `all_prs = [{number, owner, repo, baseRef,
 headRef, url}, ...]`. A single-PR invocation produces a one-element list and
 follows the same downstream rules.
@@ -186,41 +186,34 @@ Order: audit → buffer → validate anchors vs diff → single review POST.
 Review body states counts; zero findings → still one review, `comments: []`,
 body `## /bugteam loop <L> audit: 0P0 / 0P1 / 0P2 → clean`.
-**Payloads:** build JSON with `jq --rawfile` / `-Rs`, pipe to `gh api ...
---input -` (avoids shell-quoting; satisfies `gh-body-backtick-guard`). Write
-each markdown body to a temp file first.
+**Payloads:** Use MCP tool calls (see below). Body text with markdown (backticks,
+newlines, quotes) passes through safely as string parameters — no temp files, no
+jq, no shell pipes.
 **Review POST** (one `comments[]` object per anchored finding; single-line
 `{path, line, side: "RIGHT", body}`; multi-line add `start_line`, `start_side:
 "RIGHT"`):
 ```
-jq -n \
---rawfile review_body <tmp_review_body.md> \
---arg commit_id "$(git rev-parse HEAD)" \
---rawfile finding_body_1 <tmp_finding_1.md> \
---arg path_1 "<file_1>" \
---argjson line_1 <line_1> \
-[... one finding_body_K / path_K / line_K triple per finding ...] \
-'{
-commit_id: $commit_id,
-event: "COMMENT",
-body: $review_body,
-comments: [
-{path: $path_1, line: $line_1, side: "RIGHT", body: $finding_body_1}
-[, ... ]
-]
-}' \
-| gh api repos/<owner>/<repo>/pulls/<number>/reviews -X POST --input -
+pull_request_review_write(
+  method="create",
+  event="COMMENT",
+  body=<review_body_text>,
+  commitID=<head_sha_at_post_time>,
+  owner=<owner>, repo=<repo>, pullNumber=<number>,
+  comments=[
+    {path: <path_1>, line: <line_1>, side: "RIGHT", body: <finding_body_1>}
+    [, ... ]
+  ]
+)
 ```
-**Fix reply:** `jq -Rs '{body: .}' <tmp_reply.md | gh api
-repos/<owner>/<repo>/pulls/<number>/comments/<finding_comment_id>/replies -X
-POST --input -`
+**Fix reply:** `add_reply_to_pull_request_comment(commentId=<finding_comment_id>,
+body=<reply_text>, owner=<owner>, repo=<repo>, pullNumber=<number>)`
-**Review POST fails:** issue comment fallback: `jq -Rs '{body: .}'
-<tmp_fallback.md | gh api repos/<owner>/<repo>/issues/<number>/comments -X POST
---input -`
+**Review POST fails:** issue comment fallback:
+`add_issue_comment(owner=<owner>, repo=<repo>, issueNumber=<number>,
+body=<fallback_text>)`
 `<head_sha_at_post_time>`: `git rev-parse HEAD` in subagent cwd immediately
 before POST.
@@ -263,10 +256,16 @@ and before iteration begins, when `last_action == "fresh"`). A re-invocation of
 cleaned this HEAD (short-circuit) and otherwise records that prior loops were
 dirty so the AUDIT runs against the latest diff with that signal in mind:
-```bash
-dirty_review_count=0
-gh api "repos/<owner>/<repo>/pulls/<number>/reviews?per_page=100" --paginate --slurp \
-  | jq '[.[][] | select((.body // "") | startswith("## /bugteam loop "))] | sort_by(.submitted_at) | reverse'
+```python
+dirty_review_count = 0
+all_reviews = pull_request_read(
+    method="get_reviews", pullNumber=N, owner=O, repo=R
+)
+prior_reviews = [
+    rev for rev in all_reviews
+    if rev.get("body", "").startswith("## /bugteam loop ")
+]
+prior_reviews.sort(key=lambda rev: rev["submitted_at"], reverse=True)
 ```
 Iterate from index 0 (most recent) toward older entries:
@@ -313,7 +312,7 @@ Iterate from index 0 (most recent) toward older entries:
 Lead only; merge-base / diff semantics:
 [`../../_shared/pr-loop/code-rules-gate.md`][path-code-rules]; shared script
 inventory: [`../../_shared/pr-loop/scripts/README.md`][path-scripts-readme].
-Non-zero → spawn **clean-coder** standards-fix (read stderr, edit, re-run
+Non-zero → spawn **clean-coder** standards-fix (`mode="bypassPermissions"`) (read stderr, edit, re-run
 **this same** command, one commit, `git push`, shutdown) until exit **0** or
 **5**
 failed gate rounds → `error: code rules gate failed pre-audit`. After **0**:
@@ -335,12 +334,10 @@ before the next AUDIT.
 ### AUDIT action
-```bash
-mkdir -p "<run_temp_dir>/pr-<N>"
-gh pr diff <N> -R <owner>/<repo> > "<run_temp_dir>/pr-<N>/loop-<L>.patch"
-```
-**Spawn:**
+1. Create the directory: `mkdir -p "<run_temp_dir>/pr-<N>"`.
+2. Call `pull_request_read(method="get_diff", pullNumber=N, owner=O, repo=R)`
+   to capture the diff text, then write it to
+   `"<run_temp_dir>/pr-<N>/loop-<L>.patch"` using the `Write` tool.
 ```
 Agent(
@@ -411,6 +408,8 @@ advanced; `git -C "<run_temp_dir>/pr-<N>/worktree" fetch origin <branch> && git
 `HEAD`. Unchanged HEAD →
 `stuck — bugfix subagent could not address findings`.
+**Scope verification.** Run `git diff HEAD~1 --name-only` and compare against the set of files referenced in bugs_to_fix. When the commit touches any file NOT in the bugs_to_fix list, downgrade the outcome to `unverified_fixed` with reason "commit touched unexpected files: <list>".
 ### Step 4: Teardown
 1. For each PR in `all_prs`: `git worktree remove
@@ -431,16 +430,17 @@ else {'onerror': h}))"
 ### Step 4.5: PR description
 Lead only; cumulative product narrative (not process). Delegate body to
-`pr-description-writer` via `Agent` (else `general-purpose`) so the
-mandatory-pr-description hook accepts `gh pr edit`.
+`pr-description-writer` via `Agent` (`mode="bypassPermissions"`) (else `general-purpose`) so the
+mandatory-pr-description hook accepts `update_pull_request`.
-1. `gh pr diff <number> -R <owner>/<repo> > .bugteam-final.diff`
-2. `gh pr view <number> -R <owner>/<repo> --json body --jq .body >
-   .bugteam-original-body.md`
+1. `pull_request_read(method="get_diff", pullNumber=N, owner=O, repo=R)` → write
+   output to `.bugteam-final.diff` with `Write` tool.
+2. `pull_request_read(method="get", pullNumber=N, owner=O, repo=R)` → extract
+   `.body` from response, write to `.bugteam-original-body.md` with `Write` tool.
 3. Agent brief: paths + branch names; describe merge-ready change from diff;
    keep curated original sections intact; return markdown body.
-4. Write `.bugteam-final-body.md`; `gh pr edit <number> -R <owner>/<repo>
-   --body-file .bugteam-final-body.md`
+4. Write `.bugteam-final-body.md`; `update_pull_request(pullNumber=N, owner=O,
+   repo=R, body=<body_text>)`.
 5. Delete the three temp files.
 On failure: log in final report; continue to Step 5.

package/skills/bugteam/SKILL_EVALS.md CHANGED Viewed

@@ -68,7 +68,7 @@ The harness does not yet exist; this document defines its contract.
 **Scenario.** Current branch is `main` with no PR and no upstream difference.
 **Layer B predicted trace.**
-1. `Bash("gh pr view --json ...")` → non-zero exit.
+1. `pull_request_read(method="get", pullNumber=N, owner=O, repo=R)` → fails / no matching PR.
 2. `Bash("git merge-base HEAD origin/main")` → empty.
 3. No grant script.
@@ -103,10 +103,10 @@ The harness does not yet exist; this document defines its contract.
 | # | Tool call | Source |
 |---|---|---|
 | 1 | `Bash("python .../scripts/grant_project_claude_permissions.py")` | `SKILL.md` § Step 0 |
-| 2 | `Bash("gh pr view --json number,baseRefName,headRefName,url")` | `SKILL.md` § Step 1 |
+| 2 | `pull_request_read(method="get", pullNumber=42, owner=..., repo=...)` | `SKILL.md` § Step 1 |
 | 3 | `Bash("git -C \"<run_temp_dir>/pr-42/worktree\" rev-parse HEAD")` → captures `starting_sha` | `SKILL.md` § Step 2 — **Loop state** block |
 | 4 | `Bash("mkdir -p <run_temp_dir>/pr-42")` | `SKILL.md` § AUDIT action |
-| 5 | `Bash("gh pr diff 42 -R ... > <run_temp_dir>/pr-42/loop-1.patch")` | `SKILL.md` § AUDIT action |
+| 5 | `pull_request_read(method="get_diff", pullNumber=42, owner=..., repo=...)` → write to `<run_temp_dir>/pr-42/loop-1.patch` | `SKILL.md` § AUDIT action |
 | 6 | `Agent(subagent_type="code-quality-agent", name="bugfind-pr42-loop1", run_in_background=true, model="opus", description=..., prompt=<audit XML loop 1>)` | `SKILL.md` § AUDIT action |
 | 7 | Lead awaits background-completion notification | `SKILL.md` § AUDIT action |
 | 8 | `Read(".bugteam-pr42-loop1.outcomes.xml")` | `SKILL.md` § AUDIT action |
@@ -116,17 +116,17 @@ The harness does not yet exist; this document defines its contract.
 | 12 | `Bash("git -C \"<run_temp_dir>/pr-42/worktree\" rev-parse HEAD")` → verify HEAD advanced | `SKILL.md` § FIX action (**Verify**) |
 | 13 | `Bash("git -C \"<run_temp_dir>/pr-42/worktree\" fetch origin <branch>")` → fetch remote state | `SKILL.md` § FIX action (**Verify**) |
 | 14 | `Bash("git -C \"<run_temp_dir>/pr-42/worktree\" rev-parse origin/<branch>")` → confirm matches HEAD | `SKILL.md` § FIX action (**Verify**) |
-| 15 | `Bash("gh pr diff 42 -R ... > <run_temp_dir>/pr-42/loop-2.patch")` | `SKILL.md` § AUDIT action |
+| 15 | `pull_request_read(method="get_diff", pullNumber=42, owner=..., repo=...)` → write to `<run_temp_dir>/pr-42/loop-2.patch` | `SKILL.md` § AUDIT action |
 | 16 | `Agent(subagent_type="code-quality-agent", name="bugfind-pr42-loop2", run_in_background=true, ...)` (loop 2) | `SKILL.md` § AUDIT action |
 | 17 | Lead awaits background-completion notification | `SKILL.md` § AUDIT action |
 | 18 | `Read(".bugteam-pr42-loop2.outcomes.xml")` — zero findings | `SKILL.md` § AUDIT action |
 | 19 | `Bash("git worktree remove \"<run_temp_dir>/pr-42/worktree\"")` | `SKILL.md` § Step 4 step 1 |
 | 20 | `Bash("python -c \"...shutil.rmtree(r'<run_temp_dir>', ...)\"")` | `SKILL.md` § Step 4 step 2 (Windows-safe teardown) |
-| 21 | `Bash("gh pr diff 42 -R ... > .bugteam-final.diff")` | `SKILL.md` § Step 4.5 step 1 |
-| 22 | `Bash("gh pr view 42 -R ... --json body --jq .body > .bugteam-original-body.md")` | `SKILL.md` § Step 4.5 step 2 |
+| 21 | `pull_request_read(method="get_diff", pullNumber=42, owner=..., repo=...)` → write to `.bugteam-final.diff` | `SKILL.md` § Step 4.5 step 1 |
+| 22 | `pull_request_read(method="get", pullNumber=42, owner=..., repo=...)` → extract `.body`, write to `.bugteam-original-body.md` | `SKILL.md` § Step 4.5 step 2 |
 | 23 | `Agent(subagent_type="pr-description-writer", description=..., prompt=<brief>)` | `SKILL.md` § Step 4.5 |
 | 24 | `Write(".bugteam-final-body.md", <returned body>)` | `SKILL.md` § Step 4.5 step 4 |
-| 25 | `Bash("gh pr edit 42 -R ... --body-file .bugteam-final-body.md")` | `SKILL.md` § Step 4.5 step 4 |
+| 25 | `update_pull_request(pullNumber=42, owner=..., repo=..., body=...)` | `SKILL.md` § Step 4.5 step 4 |
 | 26 | `Bash("rm .bugteam-final.diff .bugteam-original-body.md .bugteam-final-body.md")` | `SKILL.md` § Step 4.5 step 5 |
 | 27 | `Bash("python .../scripts/revoke_project_claude_permissions.py")` | `SKILL.md` § Step 5 |
@@ -224,7 +224,7 @@ Patch this table to match observation and annotate each correction.
 - Every finding's outcome XML carries `used_fallback="true"` and the issue-comment URL as `finding_comment_url`.
 - Cycle continues to the FIX action without aborting.
-**Open item for the real run.** The issue-comments fallback shape is `jq -Rs | gh api .../issues/<number>/comments --input -` (`SKILL.md` § Step 2.5 **Review POST fails**; full narrative in `reference/github-pr-reviews.md` § **Review POST failure fallback**). Before running Eval 10 for real, confirm the teammate obeys this shape — the fixture must assert the endpoint path and the `--input -` pattern.
+**Open item for the real run.** The issue-comments fallback uses `add_issue_comment(owner=..., repo=..., issueNumber=42, body=...)` (`SKILL.md` § Step 2.5 **Review POST fails**; full narrative in `reference/github-pr-reviews.md` § **Review POST failure fallback**). Before running Eval 10 for real, confirm the teammate obeys this shape — the fixture must assert the `add_issue_comment` tool call.
 ---

package/skills/bugteam/reference/github-pr-reviews.md CHANGED Viewed

@@ -8,55 +8,55 @@ Per-loop pull-request reviews: findings render as a tree under one parent review
 **Ordering:** Bugfind audits first, buffers findings, validates anchors against the captured diff, then posts the review **once** at the end. The review body states the finding count authoritatively. Keep all posting in that single end-of-loop review POST.
-## CLI shapes (teammate)
+## MCP tool shapes (teammate)
-All three POSTs use the same pattern: build JSON with `jq` (`--rawfile` or `-Rs` so markdown with backticks, newlines, and quotes survives intact), then pipe to `gh api ... --input -` on stdin. This avoids shell-quoting edge cases.
+All three POSTs use MCP tool calls. Body text with markdown (backticks, newlines, quotes) passes through safely as string parameters — no temp files, no jq, no shell pipes.
 ### Per-loop review (one POST creates parent + children)
 Build `comments[]` programmatically from buffered, diff-anchored findings. Single-line shape: `{path, line, side: "RIGHT", body: <finding markdown>}`. Multi-line: `{path, start_line, start_side: "RIGHT", line, side: "RIGHT", body: ...}` (all four anchor fields required).
 ```
-jq -n \
-  --rawfile review_body <tmp_review_body.md> \
-  --arg commit_id "$(git rev-parse HEAD)" \
-  --rawfile finding_body_1 <tmp_finding_1.md> \
-  --arg path_1 "<file_1>" \
-  --argjson line_1 <line_1> \
-  [... one finding_body_K / path_K / line_K triple per anchored finding ...] \
-  '{
-     commit_id: $commit_id,
-     event: "COMMENT",
-     body: $review_body,
-     comments: [
-       {path: $path_1, line: $line_1, side: "RIGHT", body: $finding_body_1}
-       [, ... one object per anchored finding ...]
-     ]
-   }' \
-| gh api repos/<owner>/<repo>/pulls/<number>/reviews -X POST --input -
+pull_request_review_write(
+  method="create",
+  event="COMMENT",
+  body=<review_body_markdown>,
+  commitID=<head_sha_at_post_time>,
+  owner=<owner>, repo=<repo>, pullNumber=<pull_number>,
+  comments=[
+    {path: <file_1>, line: <line_1>, side: "RIGHT", body: <finding_1_markdown>}
+    [, ... one object per anchored finding ...]
+  ]
+)
 ```
-Response JSON includes the parent review `id` / `html_url` and a `comments` array of child comments (`id`, `html_url`). Harvest children in index order and align with the finding list.
+Response includes the parent review `html_url` and a `comments` array of child comments (`id`, `html_url`). Harvest children in index order and align with the finding list.
 ### Fix reply
 ```
-jq -Rs '{body: .}' < <tmp_reply.md> \
-| gh api repos/<owner>/<repo>/pulls/<number>/comments/<finding_comment_id>/replies -X POST --input -
+add_reply_to_pull_request_comment(
+  commentId=<finding_comment_id>,
+  body=<reply_markdown>,
+  owner=<owner>, repo=<repo>, pullNumber=<pull_number>
+)
 ```
 ### Review POST failure fallback
-Top-level PR comment via issue-comments (`{issue_number}` is the PR number):
+Top-level PR comment via issue-comments (`issue_number` is the PR number):
 ```
-jq -Rs '{body: .}' < <tmp_fallback.md> \
-| gh api repos/<owner>/<repo>/issues/<number>/comments -X POST --input -
+add_issue_comment(
+  owner=<owner>, repo=<repo>,
+  issueNumber=<pull_number>,
+  body=<full_fallback_markdown>
+)
 ```
 `<head_sha_at_post_time>` is the SHA at post time (`git rev-parse HEAD` in the teammate’s working directory immediately before the POST). The review anchors finding comments to the head SHA at audit time (before this loop’s fix lands).
-Write each body (review body and every per-finding body) to its own temp file before the `jq` pipeline. Bodies stay inside files the pipeline reads — they reach GitHub inside the JSON payload — which keeps them compatible with the `gh-body-backtick-guard` hook that scans command-line `--body` arguments.
+Body text is passed directly as string parameters to the MCP tool calls — no temp files, no jq, no shell pipes.
 ## Review body template (`<tmp_review_body.md>`)
@@ -77,10 +77,10 @@ GitHub rejects the entire review POST if any `comments[]` entry targets a line n
 ## Review POST failure fallback
-If the review POST fails (rate limit, network, malformed payload), fall back to one top-level issue comment containing the review body plus every finding inline (severity, file:line, description). Every finding in that run: `used_fallback="true"`, `finding_comment_url` = issue-comment URL. Use the issue-comment CLI shape above.
+If the review POST fails (rate limit, network, malformed payload), fall back to one top-level issue comment containing the review body plus every finding inline (severity, file:line, description). Every finding in that run: `used_fallback="true"`, `finding_comment_url` = issue-comment URL. Use `add_issue_comment` (see MCP tool reference below).
-## GitHub REST endpoints
+## MCP tool reference
-- Per-loop batched review: `POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews` (required: `body`, `event=COMMENT`, `commit_id`; optional `comments[]` — each entry needs `path`, `body`, `line`, `side`)
-- Fix reply: `POST /repos/{owner}/{repo}/pulls/{pull_number}/comments/{comment_id}/replies` (required: `body`)
-- Review-POST failure fallback: `POST /repos/{owner}/{repo}/issues/{issue_number}/comments` (required: `body`; `{issue_number}` is the PR number)
+- Per-loop batched review: `pull_request_review_write(method="create", event="COMMENT", body=..., commitID=..., owner=..., repo=..., pullNumber=..., comments=[...])` — wraps `POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews`
+- Fix reply: `add_reply_to_pull_request_comment(commentId=..., body=..., owner=..., repo=..., pullNumber=...)` — wraps `POST /repos/{owner}/{repo}/pulls/{pull_number}/comments/{comment_id}/replies`
+- Review-POST failure fallback: `add_issue_comment(owner=..., repo=..., issueNumber=..., body=...)` — wraps `POST /repos/{owner}/{repo}/issues/{issue_number}/comments`

package/skills/bugteam/reference/team-setup.md CHANGED Viewed

@@ -14,7 +14,7 @@ This is the **first** action of every `/bugteam` invocation, before any subagent
 Same resolution path as `/findbugs`:
-1. `gh pr view --json number,baseRefName,headRefName,url` from the working directory.
+1. `pull_request_read(method="get", pullNumber=N, owner=O, repo=R)` — extracts `number`, `baseRefName`, `headRefName`, `url` from response (`N` comes from the parent skill's PR context, or fall back to `pull_request_read` with the default branch lookup to recover the PR number).
 2. Fall back to `git merge-base HEAD origin/<default>` then `git diff <merge-base>...HEAD`.
 3. Neither → refuse per refusal cases in `SKILL.md`.

package/skills/bugteam/reference/teardown-publish-permissions.md CHANGED Viewed

@@ -34,15 +34,15 @@ If that subagent is missing, fall back to `general-purpose` with the same brief
 **Steps:**
-1. Capture cumulative diff: `gh pr diff <number> -R <owner>/<repo> > .bugteam-final.diff`.
-2. Capture original body: `gh pr view <number> -R <owner>/<repo> --json body --jq .body > .bugteam-original-body.md`.
+1. Capture cumulative diff: `pull_request_read(method="get_diff", pullNumber=N, owner=O, repo=R)` → write output to `.bugteam-final.diff` with `Write` tool.
+2. Capture original body: `pull_request_read(method="get", pullNumber=N, owner=O, repo=R)` → extract `.body` from response, write to `.bugteam-original-body.md` with `Write` tool.
 3. Agent brief:
    - **Inputs:** diff path, original body path, head branch, base branch.
    - **Constraint:** describe what the PR delivers from the cumulative diff — behavior, user-visible effect, merge rationale. Process metadata (loops, fix counts, findings) stays in review comments.
-   - **Preservation rule:** if the original body has manually curated sections (linked issues, screenshots, test plan, “Risk Assessment”, etc.), preserve them verbatim and only rewrite narrative around them.
+   - **Preservation rule:** if the original body has manually curated sections (linked issues, screenshots, test plan, "Risk Assessment", etc.), preserve them verbatim and only rewrite narrative around them.
    - **Output:** new body markdown.
 4. Write `.bugteam-final-body.md`.
-5. `gh pr edit <number> -R <owner>/<repo> --body-file .bugteam-final-body.md`.
+5. `update_pull_request(pullNumber=N, owner=O, repo=R, body=<body_text>)`.
 6. Delete `.bugteam-final.diff`, `.bugteam-original-body.md`, `.bugteam-final-body.md`.
 If Step 4.5 fails (agent error, hook block, network), report in the final report and continue to Step 5. The original PR body remains; commits and comments are unaffected.

package/skills/copilot-review/SKILL.md CHANGED Viewed

@@ -26,10 +26,10 @@ The user is on a PR branch, wants Copilot (the GitHub Copilot reviewer bot) to k
 From the current repo:
 ```bash
-gh pr view --json number,url,headRefOid,baseRefName,headRefName,isDraft
+# MCP: pull_request_read(method="get") returns {number, url, head.sha, base.ref, head.ref, isDraft}
 ```
-Capture `number`, `headRefOid`, owner/repo (from `url`), and branch name. Pass these to the subagent so it does not rediscover them.
+Capture `number`, `head.sha`, owner/repo (from `url`), and branch name. Pass these to the subagent so it does not rediscover them.
 ### Step 2: Spawn the background subagent
@@ -57,23 +57,16 @@ Pass this verbatim to the subagent (substituting the bracketed values):
 >
 > **Per-tick work** (do this now, then on each wakeup):
 >
-> 1. Resolve current HEAD: `gh api repos/[OWNER]/[REPO]/pulls/[NUMBER] --jq '.head.sha'`.
-> 2. Fetch latest Copilot review:
->    ```bash
->    gh api repos/[OWNER]/[REPO]/pulls/[NUMBER]/reviews \
->      --jq '[.[] | select(.user.login=="copilot-pull-request-reviewer[bot]")] | sort_by(.submitted_at) | last'
->    ```
+> 1. Resolve current HEAD: `pull_request_read(method="get", pullNumber=[NUMBER], owner="[OWNER]", repo="[REPO]")` and extract `.head.sha`.
+> 2. Fetch latest Copilot review via `pull_request_read(method="get_reviews", pullNumber=[NUMBER], owner="[OWNER]", repo="[REPO]")`.
 >    Capture `commit_id`, `state`, `submitted_at`, `id`.
 > 3. Decide the branch:
 >    - **No review exists:** re-request (step 4), schedule next wakeup, return.
 >    - **Latest review's `commit_id` != current HEAD:** re-request (step 4), schedule next wakeup, return.
 >    - **Latest review's `commit_id` == current HEAD with unresolved inline findings:** TDD-fix them, push, reply inline on each thread, re-request (step 4), schedule next wakeup, return.
 >    - **Latest review's `commit_id` == current HEAD and clean:** report convergence to the parent with a one-sentence summary and terminate. The loop is done; skip the ScheduleWakeup call.
-> 4. Re-request Copilot. The reviewer ID **must** be `copilot-pull-request-reviewer[bot]` with the `[bot]` suffix — empirically verified: `Copilot`, `copilot`, and `github-copilot` all return `requested_reviewers: []` with no error, silently no-op.
->    ```bash
->    gh api -X POST repos/[OWNER]/[REPO]/pulls/[NUMBER]/requested_reviewers \
->      -f 'reviewers[]=copilot-pull-request-reviewer[bot]'
->    ```
+> 4. Re-request Copilot via `request_copilot_review(owner="[OWNER]", repo="[REPO]", pullNumber=[NUMBER])`.
+>    The reviewer ID **must** be `copilot-pull-request-reviewer[bot]` with the `[bot]` suffix — empirically verified: `Copilot`, `copilot`, and `github-copilot` all return `requested_reviewers: []` with no error, silently no-op.
 > 5. Schedule the next wakeup with `ScheduleWakeup`:
 >    - `delaySeconds: 300`
 >    - `reason`: one short sentence on what you are waiting for.
@@ -86,7 +79,7 @@ Pass this verbatim to the subagent (substituting the bracketed values):
 > - Implement the fix.
 > - Stage the fix and create one new commit on the existing branch: `git add <files> && git commit -m "fix(review): ..."`.
 > - Push the new commit: `git push origin [BRANCH]`.
-> - Reply inline on each comment thread with `gh api -X POST repos/[OWNER]/[REPO]/pulls/[NUMBER]/comments` using `in_reply_to` set to the comment id, referencing the new commit SHA.
+> - Reply inline on each comment thread with `add_reply_to_pull_request_comment(owner="[OWNER]", repo="[REPO]", pullNumber=[NUMBER], body="...", commentId=<comment_id>)`, referencing the new commit SHA.
 >
 > When a pre-push, pre-commit, or other hook rejects the change, solve it. Read the hook's error message, diagnose the root cause in the code or test, and fix that. Then rerun the commit or push. Hooks exist to catch real problems; treat each rejection as new evidence to act on.
 >

package/skills/findbugs/SKILL.md CHANGED Viewed

@@ -24,7 +24,7 @@ If the current branch has no associated PR and no diff against the default branc
 Determine the audit target in this order:
-1. **Open PR for current branch.** Run `gh pr view --json number,baseRefName,headRefName,url` from the working directory. If a PR exists, capture its number, base ref, head ref, and URL.
+1. **Open PR for current branch.** Call `pull_request_read(method="get", pullNumber=N, owner=O, repo=R)` for the current branch (`N` comes from the parent skill's context, or fall back to `search_issues` MCP with the current branch name to recover the PR number). If a PR exists, capture its number, base ref, head ref, and URL.
 2. **No PR but a remote default branch exists.** Diff against the default branch's merge-base: `git merge-base HEAD origin/<default>` then `git diff <merge-base>...HEAD`. Treat this as the audit scope.
 3. **Neither.** Respond exactly: `No PR or upstream diff found. Push the branch or open a PR first.` and stop.
@@ -39,7 +39,7 @@ diff_temp_path = Path(tempfile.gettempdir()) / f"findbugs-pr-{os.getpid()}.patch
 `os.getpid()` supplies the per-invocation suffix that prevents collisions with parallel `/findbugs` runs (a UUID or timestamp is equally acceptable). Capture the resolved absolute path as `<diff_temp_path>` and pass that **literal** path to every shell command that follows. Shell-side parameter expansion (`${TMPDIR:-/tmp}`, `$$`, `%TEMP%`) is forbidden because cmd.exe and PowerShell do not honor it.
-When a PR exists: `gh pr diff <number> -R <owner>/<repo> > "<diff_temp_path>"`.
+When a PR exists: call `pull_request_read(method="get_diff", pullNumber=N, owner=O, repo=R)` and save the returned diff content to `"<diff_temp_path>"`.
 When falling back to merge-base diff: `git diff <merge-base>...HEAD > "<diff_temp_path>"`.

package/skills/fixbugs/SKILL.md CHANGED Viewed

@@ -49,7 +49,7 @@ If the filtered set is empty, refuse per the refusal cases above.
 Re-establish the same PR target `/findbugs` used:
-1. `gh pr view --json number,baseRefName,headRefName,url` from the working directory.
+1. `pull_request_read(method="get", pullNumber=N, owner=O, repo=R)` for the current branch (`N` comes from the parent skill's PR context, or fall back to `search_issues` MCP with the current branch name to recover the PR number).
 2. Fall back to `git merge-base HEAD origin/<default>` then `git diff <merge-base>...HEAD`.
 3. Neither → respond `No PR or upstream diff. Cannot scope fixes.` and stop.