npm - claude-dev-env - Versions diffs - 1.25.2 → 1.26.0 - Mend

claude-dev-env 1.25.2 → 1.26.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (105)

package/skills/bugteam/SKILL.md CHANGED

@@ -7,8 +7,8 @@ description: >-
   returns zero bugs or a 10-loop safety cap is reached. One up-front
   confirmation authorizes the entire cycle. Each audit teammate is spawned
   fresh per loop to prevent anchoring bias. Wraps the cycle with project
-  permission grant/revoke. Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
-  and Claude Code v2.1.32+. Triggers: '/bugteam', 'run the bug team',
+  permission grant/revoke. Requires CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1.
+  Triggers: '/bugteam', 'run the bug team',
   'auto-fix the PR until clean', 'loop audit and fix'.
 ---
@@ -22,9 +22,11 @@ description: >-
 ## Contents
-This file is 400+ lines. The list below is for the LLM reading this skill — partial reads (e.g., `head -100`) miss what comes later, so this section ensures the full scope is visible from the top. (Per Anthropic's [Skill authoring best practices — Structure longer reference files with table of contents](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices#structure-longer-reference-files-with-table-of-contents).)
+This file is the orchestration core. The list below is for the LLM reading this skill — partial reads (e.g., `head -100`) miss what comes later, so this section ensures the full scope is visible from the top. (Per Anthropic's [Skill authoring best practices — Structure longer reference files with table of contents](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices#structure-longer-reference-files-with-table-of-contents).)
 - When this skill applies — refusal cases (4) and trigger conditions
+- Utility scripts — pre-flight checks (`scripts/`, executed not loaded as context)
+- Pre-audit code rules gate — `validate_content` / hook parity before each AUDIT
 - The Process — Progress checklist + Steps 0–6
   - Step 0 — Grant project permissions
   - Step 1 — Resolve PR scope
@@ -35,9 +37,9 @@ This file is 400+ lines. The list below is for the LLM reading this skill — pa
   - Step 4.5 — Finalize the PR description (via pr-description-writer)
   - Step 5 — Revoke project permissions
   - Step 6 — Print the final report
-- Constraints — invariants the implementer must preserve
-- Examples — five end-to-end scenarios
-- Why this design — rationale for agent-teams + clean-room + grant/revoke
+- [`PROMPTS.md`](PROMPTS.md) — AUDIT spawn-prompt XML, FIX spawn-prompt XML, the 10 audit categories (A–J), and both outcome XML schemas. Load before spawning bugfind or bugfix, or when parsing teammate outcome XML.
+- [`EXAMPLES.md`](EXAMPLES.md) — six end-to-end scenarios (converged, cap reached, stuck, partial-fix, no-PR, dirty-tree). Load when an unfamiliar exit condition appears.
+- [`CONSTRAINTS.md`](CONSTRAINTS.md) — invariants plus "Why this design" rationale. Load when a constraint question arises.
 ## When this skill applies
@@ -46,11 +48,24 @@ User wants automated convergence on a clean PR without babysitting each step. Ty
 Refusal cases — check in order; first match short-circuits and stops:
 - **Agent teams not enabled.** Check `claude config get env.CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS` and `~/.claude/settings.json`. If neither sets it to `"1"`, respond: `CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 not set. /bugteam requires the agent teams feature. See https://code.claude.com/docs/en/agent-teams#enable-agent-teams.` and stop.
-- **Claude Code version too old.** Run `claude --version`. If older than v2.1.32, respond: `Claude Code v<version> is older than the v2.1.32 minimum for agent teams. Upgrade first.` and stop.
 - **Missing PR or upstream diff.** Respond exactly: `No PR or upstream diff. /bugteam needs a target.` and stop.
 - **Working tree dirty with uncommitted changes the user did not stage.** Respond: `Uncommitted changes detected. Stash, commit, or revert before /bugteam.` and stop. Reason: the fix teammate will commit the working tree, mixing user-uncommitted work into automated fixes.
 - **Required subagents not installed.** Before Step 0, verify `code-quality-agent` and `clean-coder` subagent types exist in the available agents list. If either is missing, respond: `Required subagent type <name> not installed. /bugteam needs both code-quality-agent and clean-coder available.` and stop.
+## Utility scripts
+Fragile or repeatable shell sequences belong in `scripts/` (see Anthropic [Skill authoring best practices — Progressive disclosure](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices#progressive-disclosure-patterns): utility scripts are **executed**, not loaded into context). Details: [`scripts/README.md`](scripts/README.md).
+### Pre-flight (recommended before Step 0)
+From the repository root, run:
+```bash
+python "${CLAUDE_SKILL_DIR}/scripts/bugteam_preflight.py"
+```
+If the exit code is non-zero, stop and fix failing checks before granting permissions. Optional: `BUGTEAM_PREFLIGHT_SKIP=1` skips pre-flight (emergency only). Optional: `--pre-commit` when `.pre-commit-config.yaml` exists.
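The stop-on-failure rule for pre-flight can be sketched as a small wrapper. This is a minimal sketch, not the package's implementation: `preflight_passes` is a hypothetical helper name, and it assumes the skip variable is checked by the caller (the real `bugteam_preflight.py` may handle `BUGTEAM_PREFLIGHT_SKIP` itself).

```python
import os
import subprocess
import sys
from typing import Mapping, Sequence


def preflight_passes(command: Sequence[str], env: Mapping[str, str]) -> bool:
    """Return True when it is safe to continue to Step 0 (hypothetical helper)."""
    # BUGTEAM_PREFLIGHT_SKIP=1 is the documented emergency bypass.
    if env.get("BUGTEAM_PREFLIGHT_SKIP") == "1":
        return True
    # A non-zero exit code means at least one pre-flight check failed.
    completed = subprocess.run(list(command), env=dict(env), check=False)
    return completed.returncode == 0


def preflight_command(skill_dir: str) -> list:
    """Build the documented command line for the pre-flight script."""
    script = os.path.join(skill_dir, "scripts", "bugteam_preflight.py")
    return [sys.executable, script]
```

A caller would pass `preflight_command(os.environ["CLAUDE_SKILL_DIR"])` and `os.environ`, and stop before granting permissions when the result is `False`.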
 ## The Process
 ### Progress checklist (copy at start, tick as you go)
@@ -92,7 +107,17 @@ Capture: `<owner>/<repo>`, head branch, base branch, PR number, PR URL. This sco
 ### Step 2: Create the agent team
-This session is the **team lead**. Create a team using the agent teams feature. Per the docs: *"After enabling agent teams, tell Claude to create an agent team and describe the task and the team structure you want in natural language. Claude creates the team, spawns teammates, and coordinates work based on your prompt."*
+This session is the **team lead**. Create the team by calling the `TeamCreate` tool with these exact arguments:
+```
+TeamCreate(
+  team_name="<team_name>",
+  description="Bugteam audit/fix loop for PR <number> (<owner>/<repo>)",
+  agent_type="team-lead"
+)
+```
+`<team_name>` is the value built below under **Team name** (sanitization + timestamp already applied). `TeamCreate` is the tool that resolves the docs' phrasing: *"tell Claude to create an agent team and describe the task and the team structure you want in natural language. Claude creates the team, spawns teammates, and coordinates work based on your prompt."*
 Team specification:
@@ -202,20 +227,39 @@ If the audit returns zero findings, the teammate still posts ONE review with `ev
 ### Step 3: The cycle
-Repeat until an exit condition fires:
+Repeat until an exit condition fires.
-1. Increment `loop_count`. If `loop_count > 10`, exit reason = `cap reached`.
-2. Decide the next action:
-   - `last_action in {"fresh", "fixed"}` → run **AUDIT**
-   - `last_action == "audited"` and `last_findings.total > 0` → run **FIX**
+**Ordering principle:** Mandatory **CODE_RULES** checks (`validate_content` from `hooks/blocking/code_rules_enforcer.py`) must pass on the PR-scoped file set **before** any **AUDIT** (bugfind) teammate runs. The **clean-coder** teammate clears gate failures; then the **code-quality-agent** teammate audits. This mirrors “CI green, then review,” without relying on GitHub Actions — the script is the gate.
+1. Decide the next action from `last_action` and `last_findings`:
    - `last_action == "audited"` and `last_findings.total == 0` → exit reason = `converged`
-   - `last_action == "fixed"` and `git rev-parse HEAD` did not change since pre-FIX → exit reason = `stuck` (see FIX action for detection)
-3. Execute the chosen action (see action specs below).
-4. Update `last_action`, `last_findings`, and append to `audit_log`.
-5. Print a one-line progress marker so the user can watch convergence:
-   - After audit: `Loop <N> audit: <P0>P0 / <P1>P1 / <P2>P2`
-   - After fix: `Loop <N> fix: commit <sha7> (<files_changed> files, +<add>/-<del>)`
-6. Loop.
+   - `last_action == "fixed"` and `git rev-parse HEAD` did not change since pre-FIX → exit reason = `stuck` (see FIX action)
+   - `last_action in {"fresh", "fixed"}` → go to **pre-audit path** (below), then **AUDIT**
+   - `last_action == "audited"` and `last_findings.total > 0` → go to **FIX** (below)
+2. **Pre-audit path** (only when the next step is **AUDIT**):
+   1. From the repository root, run the gate script (align `--base` with the PR base branch from Step 1, e.g. `origin/main` or `origin/develop`):
+      ```bash
+      python "${CLAUDE_SKILL_DIR}/scripts/bugteam_code_rules_gate.py" --base origin/<baseRefName>
+      ```
+      The script computes the changed-file list with `git merge-base` and `git diff --name-only`; see [`scripts/README.md`](scripts/README.md). The lead runs the gate command itself (not a teammate).
+   2. If exit code **0** → continue to step 3 (AUDIT spawn) below.
+   3. If exit code **non-zero** → spawn a NEW **clean-coder** teammate — **standards-fix pass** — with instructions: read the script’s stderr, edit the repo until a **re-run** of the **same** gate command exits **0**, then one commit, `git push`, shutdown. Repeat standards-fix spawns until the gate exits **0** or **5** failed gate rounds (each round = one teammate session after a non-zero gate). If still non-zero after 5 rounds → exit reason = `error: code rules gate failed pre-audit`.
+   4. After gate exit **0**, increment `loop_count`. If `loop_count > 10`, exit reason = `cap reached` (counts **audits**, not standards-only rounds).
+   5. Execute **AUDIT action** (spawn bugfind). Print progress: `Loop <N> audit: ...`
+3. **FIX path** (when `last_action == "audited"` and `last_findings.total > 0`):
+   1. Increment `loop_count`. If `loop_count > 10`, exit reason = `cap reached`.
+   2. Execute **FIX action** (spawn bugfix clean-coder for audit findings). Print: `Loop <N> fix: commit ...`
+   3. Set `last_action = "fixed"`, update `audit_log`, loop to step 1 (next iteration will hit **pre-audit path** before the next AUDIT).
+4. After **AUDIT**, update `last_action`, `last_findings`, `audit_log`; print the audit progress line if not already printed.
+5. Loop.
+**Note:** The first iteration uses **pre-audit path** then **AUDIT**. After a **FIX** for audit findings, the next iteration runs **pre-audit path** again (gate → then AUDIT), so `validate_content` stays green before semantic audit.
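The decision order in step 1 and the standards-fix retry rule in the pre-audit path can be sketched as two small functions. This is an illustrative sketch under assumptions: `decide_next_action`, `clear_gate`, and the callables passed to them are hypothetical names, and the round counting follows one reading of the text (one teammate session per round, then a gate re-run).

```python
from typing import Callable

MAX_GATE_ROUNDS = 5  # standards-fix rounds allowed before giving up


def decide_next_action(last_action: str, findings_total: int, head_changed: bool) -> str:
    """Return 'converged', 'stuck', 'audit', or 'fix', checked in the order of step 1."""
    if last_action == "audited" and findings_total == 0:
        return "converged"
    if last_action == "fixed" and not head_changed:
        return "stuck"
    if last_action in {"fresh", "fixed"}:
        return "audit"  # pre-audit path first, then AUDIT
    if last_action == "audited" and findings_total > 0:
        return "fix"
    raise ValueError(f"unexpected state: {last_action!r}")


def clear_gate(run_gate: Callable[[], int], run_standards_fix: Callable[[], None]) -> bool:
    """Re-run the gate after each standards-fix round; False means the gate never passed."""
    if run_gate() == 0:
        return True
    for _ in range(MAX_GATE_ROUNDS):
        run_standards_fix()  # one clean-coder teammate session
        if run_gate() == 0:
            return True
    return False  # exit reason: error: code rules gate failed pre-audit
```

Keeping the decision a pure function of `last_action`, the finding count, and whether HEAD moved makes the exit conditions easy to check in isolation from teammate spawning.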
 ### AUDIT action (clean-room teammate, fresh per loop)
@@ -228,192 +272,105 @@ gh pr diff <number> -R <owner>/<repo> > "<team_temp_dir>/loop-<N>.patch"
 `<team_temp_dir>` is the absolute path captured in Step 2 (already includes the sanitized team_name and timestamp suffix, and `team_name` itself is already prefixed with `bugteam-`). Claude resolves the portable temp root once via `Path(tempfile.gettempdir()) / team_name` (requires `import tempfile`) and passes the literal absolute path to every shell command. `tempfile.gettempdir()` honors `TMPDIR`, `TEMP`, and `TMP` in the platform-correct order and falls back to `C:\Users\<user>\AppData\Local\Temp` on Windows or `/tmp` on Unix, so this works identically on macOS, Linux, Windows cmd.exe, and PowerShell: Claude resolves the literal path once and every shell receives the same absolute value.
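The path resolution described above amounts to one expression. A minimal sketch, assuming `team_name` is already sanitized and prefixed as described; `resolve_team_temp_dir` is a hypothetical helper name:

```python
import tempfile
from pathlib import Path


def resolve_team_temp_dir(team_name: str) -> Path:
    """Join the platform temp root with the team name (no directory is created here)."""
    return Path(tempfile.gettempdir()) / team_name
```

The returned path is resolved once and then passed as a literal string to every shell command.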
-Spawn a NEW `bugfind` teammate for this loop using the `code-quality-agent` subagent type. The teammate is fresh: no prior loop's findings, no chat history, no inherited audit context. Per the docs: *"The lead's conversation history does not carry over."* — and we further guarantee independence by spawning a new teammate per loop rather than reusing one.
-The teammate's spawn prompt is the full XML below — copy it verbatim with the placeholders substituted. **Keep the spawn prompt context-free.** Reference only the PR scope, audit rubric, and this loop number. Write each instruction as a standalone statement so the teammate treats the prompt as a fresh brief — every audit starts from first principles.
-```xml
-<context>
-  <repo>owner/repo</repo>
-  <branch>head ref</branch>
-  <base_branch>base ref</base_branch>
-  <pr_url>full URL</pr_url>
-  <loop>N</loop>
-</context>
-<scope>
-  <diff_path>Absolute path to the loop-N patch file under team_temp_dir from Step 2 (same path as gh pr diff redirect in AUDIT)</diff_path>
-  <scope_rule>Audit only lines added or modified in the diff. Pre-existing code on untouched lines is out of scope.</scope_rule>
-</scope>
-<bug_categories>
-  Investigate each category explicitly. For each, return either at least
-  one finding OR a verified-clean entry with the evidence used to clear it:
-  A. API contract verification (signatures, return types, async/await correctness)
-  B. Selector / query / engine compatibility
-  C. Resource cleanup and lifecycle (file handles, connections, processes, locks)
-  D. Variable scoping, ordering, and unbound references
-  E. Dead code and unused imports
-  F. Silent failures (catch-all excepts, unconditional success returns, missing error propagation)
-  G. Off-by-one, bounds, and integer overflow
-  H. Security boundaries (injection, path traversal, auth bypass, secret leakage)
-  I. Concurrency hazards (race conditions, missing awaits, shared mutable state)
-  J. Magic values and configuration drift
-</bug_categories>
-<constraints>
-  - Read-only on source code: the audit does not modify any source file.
-  - Cite file:line for every finding.
-  - When the diff alone does not provide enough context to confirm a bug,
-    list it under "Open questions" rather than assert it.
-</constraints>
-<comment_posting>
-  1. Audit the diff against the 10 categories above. Buffer the findings
-     in memory; all posting happens at step 6 once anchors are validated.
-  2. Assign each finding a stable finding_id of exactly the form `loopN-K`
-     where K is 1-based within this loop.
-  3. Validate every finding's (file, line) against the captured diff. Split
-     findings into two buckets: anchored (line is in the diff) and
-     unanchored (line is not in the diff — goes into the review body's
-     "Findings without a diff anchor" section per Step 2.5).
-  4. Build the review body per Step 2.5's review-body shape, filling in the
-     P0/P1/P2 counts and the unanchored-findings list (if any).
-  5. For each anchored finding, write its body to its own temp file:
-       **[severity] one-line title**
-       Category: <letter> (<category name>)
-       <2-3 sentence description with concrete trace>
-       _From /bugteam audit loop N._
-  6. Post ONE review via Step 2.5's per-loop review CLI shape. Harvest the
-     parent review `html_url` from the response JSON and the `comments[]`
-     child entries (each with its own `id` and `html_url`). Match child
-     entries to anchored findings in index order.
-  7. If the review POST itself fails, use Step 2.5's Review POST failure
-     fallback (single issue comment with full body and all findings inline).
-  8. Write every body (review body, each finding body, any fallback body)
-     to its own temp file. Load each file into the JSON payload via jq's
-     `--rawfile` or `-Rs`, then pipe the jq output to `gh api ... --input -`
-     so every body reaches GitHub as file contents inside the JSON payload.
-</comment_posting>
-<output_format>
-  Write the outcome XML below to .bugteam-loop-N.outcomes.xml in the
-  working directory. Return only that path on stdout. The schema:
-</output_format>
-```
-Outcome XML schema (bugfind writes this):
-```xml
-<bugteam_audit loop="<N>" review_url="<url>">
-  <finding
-    finding_id="loop<N>-<index>"
-    severity="P0|P1|P2"
-    category="<letter>"
-    file="<path>"
-    line="<int>"
-    finding_comment_id="<gh child comment id, or empty if unanchored/review-fallback>"
-    finding_comment_url="<url of child comment, OR review_url if unanchored, OR fallback issue comment URL>"
-    used_fallback="true|false"
-  >
-    <title>one-line title</title>
-    <description>2-3 sentence description with concrete trace</description>
-  </finding>
-  <verified_clean>
-    <category letter="<letter>" name="<name>" evidence="brief evidence + cleared conclusion"/>
-  </verified_clean>
-</bugteam_audit>
-```
-After the teammate writes the XML and returns, the lead reads `.bugteam-loop-<N>.outcomes.xml`, parses it, and populates `loop_comment_index` from `<finding>` elements. Then **shut down the bugfind teammate**: `Ask the bugfind teammate to shut down`. Per the docs: *"The lead sends a shutdown request. The teammate can approve, exiting gracefully, or reject with an explanation."* If the teammate rejects shutdown, force-shut by failing the team and starting Step 5 cleanup with exit reason = `error: bugfind teammate refused shutdown`.
+Spawn a fresh `bugfind` teammate for this loop by calling the `Agent` tool with these exact arguments:
+```
+Agent(
+  subagent_type="code-quality-agent",
+  name="bugfind",
+  team_name="<team_name>",
+  model="sonnet",
+  description="Bugfind audit loop <N>",
+  prompt="<audit XML from PROMPTS.md, with placeholders substituted>"
+)
+```
+Each loop makes a fresh `Agent` call so the teammate starts with its own context window. The docs guarantee this: *"The lead's conversation history does not carry over."* Spawning per loop keeps every audit independent.
+See [`PROMPTS.md`](PROMPTS.md) for the AUDIT spawn-prompt XML and bugfind outcome schema. Substitute placeholders (repo, branch, base_branch, pr_url, loop, diff_path) and pass the result as the `prompt` argument.
+After the teammate writes the XML and returns, the lead reads `.bugteam-loop-<N>.outcomes.xml` with the `Read` tool, parses it, and populates `loop_comment_index` from `<finding>` elements.
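Populating `loop_comment_index` from the outcome file might look like the sketch below. It assumes the schema in `PROMPTS.md` keeps the `<finding>` attribute names used by the previous inline schema (`finding_id`, `finding_comment_id`, `finding_comment_url`, `severity`); the function name and the shape of the returned dictionary are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def build_loop_comment_index(outcome_xml: str) -> Dict[str, Dict[str, str]]:
    """Map each finding_id to its comment id, comment URL, and severity."""
    root = ET.fromstring(outcome_xml)
    index: Dict[str, Dict[str, str]] = {}
    # Only direct <finding> children are indexed; <verified_clean> is ignored.
    for finding in root.findall("finding"):
        index[finding.get("finding_id", "")] = {
            "comment_id": finding.get("finding_comment_id", ""),
            "comment_url": finding.get("finding_comment_url", ""),
            "severity": finding.get("severity", ""),
        }
    return index
```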
+**Expected path: self-termination.** In practice, teammates self-terminate when their task is complete — the `Agent` call returns and the teammate's session ends automatically. When that happens, no `SendMessage` shutdown is needed and the cycle proceeds directly to the next action.
+**Fallback path: lead-initiated shutdown.** If the teammate has not self-terminated after the `Agent` call returns (observable as the teammate still appearing in the active-teammates list), send a shutdown message:
+```
+SendMessage(
+  to="bugfind",
+  message={
+    "type": "shutdown_request",
+    "reason": "audit loop <N> complete; outcome XML captured"
+  }
+)
+```
+The teammate replies with `{type: "shutdown_response", approve: true}` and exits. If `approve` comes back `false`, treat this as a fatal error: set exit reason = `error: bugfind teammate refused shutdown` and jump to Step 4 teardown followed by Step 5 revoke.
 `last_action = "audited"`. `last_findings = parsed`. Append `(loop=N, action="audit", counts={P0,P1,P2}, sha=current_HEAD, review_url=<url>, finding_count=<n>, fallback_count=<n>)` to `audit_log`.
-**Parallel auditors from loop 4 onward (`loop_count >= 4`).** Once the cycle has made it through three full audit/fix rounds without converging, the next audit spawns THREE bugfind teammates in parallel — named `bugfind-loop-<N>-a`, `bugfind-loop-<N>-b`, `bugfind-loop-<N>-c` — each with an identical spawn prompt (same diff path, same rubric, same loop number). `a` is the post-owner; `b` and `c` write their outcome XML to `<team_temp_dir>/loop-<N>-b.outcomes.xml` and `...-c.outcomes.xml` respectively, then shut down. `a` reads all three outcome XML files, merges findings by `(file, line, category_letter)` (same tuple collapses to one finding, keeping the longest description and the highest severity of the group), re-assigns merged-finding IDs as `loopN-K`, and posts the single per-loop review per the standard posting protocol above. The lead shuts down `b` and `c` first, then `a` after its post completes.
+**Parallel auditors from loop 4 onward (`loop_count >= 4`).** The pre-audit code rules gate must still pass immediately before this step (Step 3). After three full audit/fix rounds without convergence, spawn three bugfind teammates concurrently by issuing three `Agent` calls in a single assistant message so they run in parallel:
+```
+Agent(subagent_type="code-quality-agent", name="bugfind-loop-<N>-a", team_name="<team_name>", model="sonnet", description="Bugfind audit loop <N> variant a", prompt="<audit XML; write outcome to .bugteam-loop-<N>.outcomes.xml; post the per-loop review; read and merge b/c outcomes from <team_temp_dir>/loop-<N>-b.outcomes.xml and <team_temp_dir>/loop-<N>-c.outcomes.xml>")
+Agent(subagent_type="code-quality-agent", name="bugfind-loop-<N>-b", team_name="<team_name>", model="sonnet", description="Bugfind audit loop <N> variant b", prompt="<audit XML; write outcome to <team_temp_dir>/loop-<N>-b.outcomes.xml; skip PR posting>")
+Agent(subagent_type="code-quality-agent", name="bugfind-loop-<N>-c", team_name="<team_name>", model="sonnet", description="Bugfind audit loop <N> variant c", prompt="<audit XML; write outcome to <team_temp_dir>/loop-<N>-c.outcomes.xml; skip PR posting>")
+```
+Teammate `-a` is the post-owner: it reads all three outcome XML files using their explicit absolute paths — its own outcome at `.bugteam-loop-<N>.outcomes.xml` (working directory), and the sibling outcomes at `<team_temp_dir>/loop-<N>-b.outcomes.xml` and `<team_temp_dir>/loop-<N>-c.outcomes.xml` — then merges findings by `(file, line, category_letter)` (same tuple collapses to one finding, keeping the longest description and the highest severity of the group), re-assigns merged-finding IDs as `loopN-K`, and posts the single per-loop review per the standard posting protocol above. The `-a` spawn prompt must include both sibling paths as literal absolute values so `-a` can read them with the `Read` tool by path without any discovery step.
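The merge rule for teammate `-a` can be illustrated with a short sketch. The dictionary keys, the `merge_findings` name, and the assumption that P0 is the most severe level are illustrative choices, not taken from the package; when findings collapse, the first finding's title is kept.

```python
from typing import Dict, List, Tuple

# Assumption: P0 outranks P1, which outranks P2.
SEVERITY_RANK = {"P0": 0, "P1": 1, "P2": 2}


def merge_findings(findings: List[dict], loop_number: int) -> List[dict]:
    """Collapse findings sharing (file, line, category) and re-assign loopN-K ids."""
    merged: Dict[Tuple[str, int, str], dict] = {}
    for finding in findings:
        key = (finding["file"], finding["line"], finding["category"])
        kept = merged.get(key)
        if kept is None:
            merged[key] = dict(finding)  # copy so the inputs stay untouched
            continue
        # Keep the longest description and the highest severity of the group.
        if len(finding["description"]) > len(kept["description"]):
            kept["description"] = finding["description"]
        if SEVERITY_RANK[finding["severity"]] < SEVERITY_RANK[kept["severity"]]:
            kept["severity"] = finding["severity"]
    result = []
    for position, finding in enumerate(merged.values(), start=1):
        finding["finding_id"] = f"loop{loop_number}-{position}"
        result.append(finding)
    return result
```

Merged findings keep first-seen order, so IDs are stable for a given input order across the three outcome files.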
+Shut down `-b` and `-c` first with two parallel `SendMessage` calls, then shut down `-a` after its post completes:
+```
+SendMessage(to="bugfind-loop-<N>-b", message={"type": "shutdown_request", "reason": "variant XML captured"})
+SendMessage(to="bugfind-loop-<N>-c", message={"type": "shutdown_request", "reason": "variant XML captured"})
+```
+then
+```
+SendMessage(to="bugfind-loop-<N>-a", message={"type": "shutdown_request", "reason": "merged review posted"})
+```
 ### FIX action (fresh teammate, only sees latest audit)
-Spawn a NEW `bugfix` teammate for this loop using the `clean-coder` teammate role, model sonnet. The teammate sees ONLY the most recent audit's findings — no prior-loop findings, no prior-loop fix history, no chat history.
-The teammate receives the **finding comment URL and id for each finding** (from `loop_comment_index`) and **owns the reply posting**. After committing fixes, the teammate posts one reply per finding: `Fixed in <commit_sha>` for addressed findings, `Could not address this loop: <one-line reason>` for skipped or failed findings. Same one-identity model as bugfind: teammate posts, lead does not.
-After all replies are posted, the teammate writes its own outcome XML (see schema below), returns, and the lead **shuts down the bugfix teammate** the same way as the bugfind shutdown.
-Prompt skeleton:
-```xml
-<context>
-  <repo>owner/repo</repo>
-  <branch>head</branch>
-  <base_branch>base</base_branch>
-  <pr_url>url</pr_url>
-  <loop>N</loop>
-</context>
-<bugs_to_fix>
-  [for each P0/P1/P2 finding from last_findings:]
-  <bug
-    finding_id="loop<N>-<index>"
-    severity="P0|P1|P2"
-    file="<path>"
-    line="<int>"
-    category="<letter>"
-    finding_comment_id="<id>"
-    finding_comment_url="<url>"
-  >
-    <description>...</description>
-  </bug>
-</bugs_to_fix>
-<execution>
-  1. Read each referenced file before editing.
-  2. Apply each fix you can address.
-  3. Run `python -m py_compile` (or language-equivalent) on every modified file.
-  4. git add by explicit path, then git commit with a message summarizing the bugs fixed.
-     - If the commit fails because a git hook (pre-commit, commit-msg, etc.) blocked it,
-       capture the hook's stderr, write status=hook_blocked for every finding in this loop
-       (the commit was atomic; if it failed, no finding was applied), populate hook_output
-       on each outcome, and return WITHOUT retrying. The lead will treat this loop as no-progress.
-  5. git push with a plain fast-forward push (the default, no flag overrides).
-  6. For each bug, post a fix reply to its finding_comment_id via the
-     Step 2.5 reply CLI shape:
-     - "Fixed in <commit_sha>" if the bug was addressed by your commit
-     - "Could not address this loop: <one-line reason>" if you skipped or failed it
-     - "Hook blocked the fix commit: <one-line summary>" if the commit was hook-blocked
-     Use the Fix reply CLI shape from Step 2.5 (`jq -Rs | gh api .../comments/<id>/replies --input -`). Write every reply body to a temp file first.
-  7. Write `.bugteam-loop-<N>.outcomes.xml` (schema below) and return its path.
-</execution>
-<outcome_xml_schema>
-  <bugteam_fix loop="<N>" commit_sha="<sha or empty if no commit>">
-    <outcome
-      finding_id="loop<N>-<index>"
-      status="fixed|could_not_address|hook_blocked"
-      commit_sha="<sha if fixed, empty otherwise>"
-      reply_comment_id="<id of the reply posted>"
-      reply_comment_url="<url of the reply posted>"
-    >
-      <reason>only present when status=could_not_address; one-line reason text</reason>
-      <hook_output>only present when status=hook_blocked; verbatim stderr from the blocked hook</hook_output>
-    </outcome>
-  </bugteam_fix>
-</outcome_xml_schema>
-<constraints>
-  - Modify only files referenced in bugs_to_fix.
-  - One commit on the existing branch, then push.
-  - Keep the branch linear and the PR base fixed; append one new commit per
-    loop and fast-forward push only.
-  - Let every git hook run on every commit.
-  - git add by explicit path — name each file being staged.
-  - Preserve existing comments on lines you do not modify.
-  - Type hints on every signature you touch.
-</constraints>
+Spawn a fresh `bugfix` teammate for this loop by calling the `Agent` tool with these exact arguments:
+```
+Agent(
+  subagent_type="clean-coder",
+  name="bugfix",
+  team_name="<team_name>",
+  model="sonnet",
+  description="Bugfix loop <N>",
+  prompt="<fix XML from PROMPTS.md, with placeholders substituted>"
+)
 ```
+The teammate sees only the most recent audit's findings — each `Agent` call starts with a fresh context window, so prior-loop findings, prior-loop fix history, and prior chat history stay inside the lead.
+Pass the **finding comment URL and id for each finding** (from `loop_comment_index`) inside the XML prompt so the teammate owns reply posting. After committing fixes, the teammate posts one reply per finding: `Fixed in <commit_sha>` for addressed findings, `Could not address this loop: <one-line reason>` for skipped or failed findings. Same one-identity model as bugfind: teammate posts, lead waits.
+After all replies are posted, the teammate writes its own outcome XML (see the bugfix outcome schema in [`PROMPTS.md`](PROMPTS.md)) and returns.
+**Expected path: self-termination.** In practice, teammates self-terminate when their task is complete — the `Agent` call returns and the teammate's session ends automatically. When that happens, no `SendMessage` shutdown is needed and the cycle proceeds directly to the next action.
+**Fallback path: lead-initiated shutdown.** If the teammate has not self-terminated after the `Agent` call returns, send a shutdown message:
+```
+SendMessage(
+  to="bugfix",
+  message={
+    "type": "shutdown_request",
+    "reason": "fix loop <N> complete; commit <sha7> pushed"
+  }
+)
+```
+If the shutdown response returns `approve: false`, treat it the same as the bugfind refusal case above: exit reason = `error: bugfix teammate refused shutdown`, jump to Step 4 teardown then Step 5 revoke.
+See [`PROMPTS.md`](PROMPTS.md) for the FIX spawn-prompt XML and bugfix outcome schema. Substitute placeholders (repo, branch, base_branch, pr_url, loop, and the per-finding bug entries built from `last_findings`) and pass the result as the `prompt` argument.
 Verify the fix actually committed and pushed:
 - `git rev-parse HEAD` after fix should differ from before
@@ -425,10 +382,33 @@ If `git rev-parse HEAD` did not change, exit reason = `stuck — bugfix teammate
 ### Step 4: Tear down the team and clean working tree
-When the cycle exits (any reason):
+When the cycle exits (any reason), run these steps in order from THIS session (the lead):
+1. **Confirm every teammate has shut down.** Any teammate still alive (for example, from an aborted shutdown mid-loop) must receive a shutdown message first. For each remaining teammate name:
+   ```
+   SendMessage(to="<teammate_name>", message={"type": "shutdown_request", "reason": "bugteam cycle ending"})
+   ```
-1. **Clean up the team as the lead.** Per the docs: *"When you're done, ask the lead to clean up: 'Clean up the team'. This removes the shared team resources. When the lead runs cleanup, it checks for active teammates and fails if any are still running, so shut them down first."* The lead is THIS session — call cleanup directly. If any teammate is still alive (e.g., from an aborted shutdown), shut it down first.
-2. Delete the per-team scoped temp directory using Python: `shutil.rmtree(team_temp_dir, ignore_errors=True)` (requires `import shutil`). This works on every platform without OS-detection branching. Pass the literal absolute path Claude resolved at Step 2; Claude performs the path resolution so every shell receives the same literal value at cleanup time.
+   The docs state: *"When the lead runs cleanup, it checks for active teammates and fails if any are still running, so shut them down first."*
+   If any teammate returns `approve: false` during this cleanup shutdown, log the refusing teammate name (e.g., `cleanup warning: <teammate_name> refused shutdown_request`) and force-proceed to step 2 (`TeamDelete`) anyway. `TeamDelete` may fail if active teammates remain; if it does, surface the error in the final report with the refusing teammate name so the user can manually clean up. Do not abort the cleanup sequence — continue through temp-dir deletion, Step 4.5, and Step 5 regardless.
+2. **Clean up the team** by calling `TeamDelete` with no arguments — it reads `<team_name>` from the current session's team context:
+   ```
+   TeamDelete()
+   ```
+   The docs state: *"When you're done, ask the lead to clean up: 'Clean up the team'."* `TeamDelete` is the tool that resolves that sentence.
+3. **Delete the per-team scoped temp directory** by running this Python one-liner through the `Bash` tool (same literal `<team_temp_dir>` path resolved at Step 2):
+   ```
+   python -c "import shutil; shutil.rmtree(r'<team_temp_dir>', ignore_errors=True)"
+   ```
+   `shutil.rmtree(..., ignore_errors=True)` works identically on Windows and Unix, so the lead uses one command regardless of platform.
 ### Step 4.5: Finalize the PR description (mandatory)
@@ -436,7 +416,27 @@ After teardown and before permission revoke, the lead rewrites the PR body to re
 The lead delegates the body authoring to the `pr-description-writer` agent so the global mandatory-pr-description-writer hook accepts the subsequent `gh pr edit`. The lead does NOT compose the body inline.
-`pr-description-writer` is provided by the global git-workflow rule in `claude-code-config` (as the `pr-description-writer` agent type). If that agent is not available in the current environment, fall back to spawning a `general-purpose` agent with the same brief — the global hook treats agent-authored bodies the same regardless of the specific agent type. If neither agent is available, log a warning in the final report and skip Step 4.5; the original PR body remains.
+`pr-description-writer` is provided by the global git-workflow rule in `claude-code-config`. Invoke it with the `Agent` tool:
+```
+Agent(
+  subagent_type="pr-description-writer",
+  description="Rewrite PR <number> body from cumulative diff",
+  prompt="<brief from step 3 below>"
+)
+```
+If `pr-description-writer` is not in the available agents list for the current environment, fall back to `general-purpose` with the same brief — the global hook treats agent-authored bodies the same regardless of the specific agent type:
+```
+Agent(
+  subagent_type="general-purpose",
+  description="Rewrite PR <number> body from cumulative diff",
+  prompt="<brief from step 3 below>"
+)
+```
+When neither agent is available, log a warning in the final report and skip Step 4.5 so the original PR body stays in place.
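The fallback order described above amounts to a first-match lookup over two agent types. The Python sketch below shows that selection under one assumption: the lead already has the available agent types as a collection of strings. How that list is obtained is not specified in the skill, and `choose_description_agent` is a hypothetical helper name.

```python
from typing import Iterable, Optional

# Preference order from the text: the dedicated agent first, then the generic one.
PREFERRED_AGENT_TYPES = ("pr-description-writer", "general-purpose")


def choose_description_agent(available_agent_types: Iterable[str]) -> Optional[str]:
    """Return the subagent_type to use for Step 4.5, or None to skip the step."""
    available = set(available_agent_types)
    for agent_type in PREFERRED_AGENT_TYPES:
        if agent_type in available:
            return agent_type
    # Neither agent exists: the caller logs a warning and keeps the original PR body.
    return None
```

A `None` result corresponds to the skip path: no `Agent` call is made and the final report carries the warning.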
 Steps:
@@ -484,99 +484,8 @@ If exit = `cap reached`, name the remaining bug count and recommend `/findbugs`
 ## Constraints
-- **Agent teams required, not parallel subagents.** The skill MUST use Claude Code's agent teams feature (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1`). Spawning `code-quality-agent` and `clean-coder` as parallel subagents from the lead's context = fail; the clean-room property requires independent teammate sessions.
-- **Grant before any spawn, revoke before any return.** Step 0 grants project `.claude/**` permissions; Step 5 revokes. Both are mandatory. Revoke runs on every exit path including error, cap-reached, and stuck.
-- **Fresh teammate per loop.** Both bugfind and bugfix are spawned new each loop and shut down after their action. Reusing a teammate across loops accumulates context inside that teammate's window — defeats clean-room.
-- **One up-front confirmation = whole cycle.** The `/bugteam` invocation authorizes the entire cycle; every subsequent decision runs on that single authorization.
-- **10-loop hard cap.** Counted as audits performed. Worst case = 10 audits + 10 fixes = 20 teammate spawns + 20 shutdowns.
-- **Clean-room audits, every loop.** Each bugfind teammate's spawn prompt contains only the PR scope, audit rubric, and the current loop number. Prior loop history stays in the lead.
-- **Targeted fixes.** Each fix teammate sees ONLY the most recent audit's findings. Prior loops are invisible to the fix teammate.
-- **Sonnet for both teammates.** Predictable cost, fits-purpose for code work.
-- **Fix teammate receives the latest audit as its input contract.** Passing the audit's findings to the fix teammate is the input contract — each loop's fix run operates on the current audit's output and only that.
-- **One commit per fix action.** Loops produce one commit per loop, not one per bug.
-- **Linear branch, fixed PR base.** Every loop appends one forward-only commit; existing commits and the PR base stay intact throughout the cycle.
-- **Lead-only cleanup.** Per the docs: *"Always use the lead to clean up. Teammates should not run cleanup because their team context may not resolve correctly, potentially leaving resources in an inconsistent state."* This session is the lead, and cleanup runs here only.
-- **Cleanup the per-team scoped temp directory on exit.** The resolved `<team_temp_dir>` (absolute literal captured in Step 2) is deleted entirely so no loop patches leak between runs.
-- **Cleanup all `.bugteam-*` files on exit.** `.bugteam-loop-*.patch`, `.bugteam-loop-*.outcomes.xml`, `.bugteam-final.diff`, `.bugteam-original-body.md`, `.bugteam-final-body.md`. Working directory ends clean.
-- **Teammates own audit/fix comment posting.** Bugfind posts ONE per-loop review (parent body + child finding comments in a single batched POST, with review-fallback to a top-level issue comment). Bugfix posts the fix replies after committing. All comment, review, and reply POSTs belong to the teammates; the lead's single PR-write action is the final description rewrite at Step 4.5.
-- **Lead owns the final PR description rewrite only** (Step 4.5), and only via the `pr-description-writer` agent. The lead does not compose the description inline.
-- **One review per loop, findings as child comments of that review.** Each loop posts a single pull-request review whose body is the loop header and whose `comments[]` are the anchored findings. Each loop's review stands alone — one review created per loop, fully self-contained on the PR conversation.
-- **PR description rewrite on every exit.** Step 4.5 runs on `converged`, `cap reached`, and `stuck`. On `error`, the rewrite is best-effort; if it fails, surface the error in the final report and continue to revoke.
-- **Outcome XML, not JSON.** Both teammates write structured outcome data (findings or fix outcomes) to `.bugteam-loop-<N>.outcomes.xml`. The lead reads these files between actions. XML chosen for parser robustness against multi-line, special-character, and quoted reason fields.
+See [`CONSTRAINTS.md`](CONSTRAINTS.md).
 ## Examples
-<example>
-User: `/bugteam`
-Claude: [resolves PR #42, runs loop]
-`Loop 1 audit: 1P0 / 2P1 / 0P2`
-`Loop 1 fix: commit a1b2c3d (3 files, +18/-7)`
-`Loop 2 audit: 0P0 / 1P1 / 0P2`
-`Loop 2 fix: commit e4f5g6h (1 file, +5/-2)`
-`Loop 3 audit: 0P0 / 0P1 / 0P2 → converged`
-`/bugteam exit: converged`
-`Loops: 3`
-`Starting commit: 9d8c7b6`
-`Final commit: e4f5g6h`
-`Net change: 4 files, +23/-9`
-</example>
-<example>
-User: `/bugteam`
-Claude: [runs 10 loops without convergence]
-`Loop 10 audit: 0P0 / 1P1 / 2P2`
-`/bugteam exit: cap reached`
-`Loops: 10`
-`Remaining: 0P0 / 1P1 / 2P2 — run /findbugs for human triage`
-</example>
-<example>
-User: `/bugteam`
-Claude: [loop 4 fix produces no commit]
-`Loop 4 fix: clean-coder reported no changes (could not address remaining bugs)`
-`/bugteam exit: stuck`
-`Unresolved findings (3): src/cache.py:88 (P0 race condition); ...`
-</example>
-<example>
-User: `/bugteam` (mixed-outcome path: some findings fixed, others skipped)
-Claude: [resolves PR #99, runs loop with partial-fix outcomes]
-`Loop 1 audit: 1P0 / 3P1 / 0P2`
-`Loop 1 fix: commit a1b2c3d (2 files, +8/-3) — 2 fixed, 2 could_not_address`
-`Loop 2 audit: 0P0 / 2P1 / 0P2`
-`Loop 2 fix: 0 fixed, 2 could_not_address (no commit)`
-`/bugteam exit: stuck`
-`Loops: 2`
-`Unresolved findings (2): src/auth.py:45 (P1: file is generated, cannot edit); src/legacy.py:200 (P1: rewrite scope exceeds the bug)`
-The bugfix teammate writes one outcome per finding to `.bugteam-loop-2.outcomes.xml`. Findings with `status=could_not_address` carry their `<reason>` text, and the teammate posts a matching reply to each finding comment so the reviewer sees why each bug stayed open.
-</example>
-<example>
-User: `/bugteam` (no PR or upstream diff)
-Claude: `No PR or upstream diff. /bugteam needs a target.`
-</example>
-<example>
-User: `/bugteam` (uncommitted changes in working tree)
-Claude: `Uncommitted changes detected. Stash, commit, or revert before /bugteam.`
-</example>
-## Why this design
-The three sibling skills compose, but `/bugteam` solves a problem they cannot solve in sequence:
-- `/findbugs` audits once and stops.
-- `/fixbugs` fixes the findings of one audit and stops.
-- A human-driven `/findbugs` → `/fixbugs` → `/findbugs` → `/fixbugs` cycle works but requires the user to drive it.
-`/bugteam` automates that cycle. The clean-room property is preserved by spawning a fresh audit agent each loop with no inherited context — every audit is independent of the prior loop's verdict. The 10-loop cap is the safety: pathological cases (audit agent oscillating, fix agent regressing) cannot run away.
-The single up-front confirmation is the explicit trade — `/bugteam` is more autonomous than `/findbugs`+`/fixbugs` chained manually. The user accepts that autonomy by typing the command. Stop conditions and the loop log give the user full visibility on exit.
+See [`EXAMPLES.md`](EXAMPLES.md).