claude-dev-env 1.26.0 → 1.26.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/hooks/blocking/code_rules_enforcer.py +5 -1
- package/hooks/blocking/test_code_rules_enforcer.py +61 -0
- package/hooks/blocking/test_code_rules_enforcer_file_global_constants.py +4 -2
- package/hooks/notification/subagent_complete_notify.py +5 -5
- package/hooks/notification/test_subagent_complete_notify.py +7 -0
- package/hooks/validators/git_checks.py +4 -1
- package/hooks/validators/test_output_formatter.py +7 -2
- package/package.json +1 -1
- package/skills/bugteam/SKILL.md +143 -309
- package/skills/bugteam/SKILL_EVALS.md +46 -46
- package/skills/bugteam/reference/README.md +13 -0
- package/skills/bugteam/reference/audit-and-teammates.md +127 -0
- package/skills/bugteam/reference/design-rationale.md +28 -0
- package/skills/bugteam/reference/github-pr-reviews.md +86 -0
- package/skills/bugteam/reference/team-setup.md +51 -0
- package/skills/bugteam/reference/teardown-publish-permissions.md +70 -0
- package/skills/bugteam/scripts/README.md +4 -0
- package/skills/bugteam/{_claude_permissions_common.py → scripts/_claude_permissions_common.py} +37 -33
- package/skills/bugteam/scripts/bugteam_code_rules_gate.py +55 -0
- package/skills/bugteam/{grant_project_claude_permissions.py → scripts/grant_project_claude_permissions.py} +10 -11
- package/skills/bugteam/{revoke_project_claude_permissions.py → scripts/revoke_project_claude_permissions.py} +13 -14
- package/skills/bugteam/sources.md +93 -0
- /package/hooks/validators/{config.py → validator_defaults.py} +0 -0
- /package/skills/bugteam/{test_claude_permissions_common.py → scripts/test_claude_permissions_common.py} +0 -0
|
@@ -6,7 +6,7 @@ Evaluation-driven iteration set for the `bugteam` skill, following [Anthropic
|
|
|
6
6
|
|
|
7
7
|
Evals are split into two layers. Both layers run against the same trace but carry different failure semantics.
|
|
8
8
|
|
|
9
|
-
**Layer A — Ironclad invariants.** Order-and-presence rules that MUST hold on every run regardless of fixture, regardless of model choice, regardless of the exact number of loops taken.
|
|
9
|
+
**Layer A — Ironclad invariants.** Order-and-presence rules that MUST hold on every run regardless of fixture, regardless of model choice, regardless of the exact number of loops taken. Citations use **section headings and companion files** (`SKILL.md`, `CONSTRAINTS.md`, `reference/*.md`) — not fragile line numbers — so layout edits to `SKILL.md` do not invalidate the contract. If an assertion fails, either the run diverged from the skill or the cited text is ambiguous and needs patching.
|
|
10
10
|
|
|
11
11
|
**Layer B — Fixture-dependent expectations.** The concrete tool trace predicted for a specific fixture (fixed PR state, canned audit XML, canned fix XML). Layer B is prediction — reality may diverge in small ways (extra `Bash("git rev-parse HEAD")` checkpoints the lead inserts for sanity; retry loops on transient failures; consolidated cleanup calls) without indicating a skill defect. Layer B failures trigger reconciliation, not auto-failure.
|
|
12
12
|
|
|
@@ -14,23 +14,23 @@ Evals are split into two layers. Both layers run against the same trace but carr
|
|
|
14
14
|
|
|
15
15
|
## Ironclad invariants (Layer A, apply to every eval)
|
|
16
16
|
|
|
17
|
-
Each invariant cites the
|
|
17
|
+
Each invariant cites the normative section or companion file it derives from.
|
|
18
18
|
|
|
19
19
|
| # | Invariant | Citation |
|
|
20
20
|
|---|---|---|
|
|
21
|
-
| I-1 | `Bash
|
|
22
|
-
| I-2 | `Bash
|
|
23
|
-
| I-3 | Exactly one `TeamCreate` and exactly one `TeamDelete` per invocation. | SKILL.md Step 2
|
|
24
|
-
| I-4 |
|
|
25
|
-
| I-5 | `Agent` calls are fresh per loop — the same `name` is never reused across loops without an intervening shutdown. |
|
|
26
|
-
| I-6 | Both audit and fix `Agent` calls pass `model="sonnet"`. | SKILL.md Step 2
|
|
21
|
+
| I-1 | `Bash` invoking `scripts/grant_project_claude_permissions.py` precedes every `TeamCreate`. | `SKILL.md` § Step 0 |
|
|
22
|
+
| I-2 | `Bash` invoking `scripts/revoke_project_claude_permissions.py` runs exactly once per invocation, after the last `TeamDelete`, on every exit path (converged, stuck, cap reached, error). | `SKILL.md` § Step 5 |
|
|
23
|
+
| I-3 | Exactly one `TeamCreate` and exactly one `TeamDelete` per invocation. | `SKILL.md` § Step 2; § Step 4 |
|
|
24
|
+
| I-4 | Before `TeamDelete`, no teammate remains active without cleanup: either the teammate self-terminated after `Agent` returned, or the lead sent a matching `SendMessage(..., shutdown_request)` (including parallel-auditor shutdowns). No orphaned teammates when `TeamDelete` runs. | `SKILL.md` § AUDIT action (**Shutdown**); § FIX action (**Shutdown**); § Step 4 |
|
|
25
|
+
| I-5 | `Agent` calls are fresh per loop — the same `name` is never reused across loops without an intervening shutdown. | `CONSTRAINTS.md` — **Fresh teammate per loop** |
|
|
26
|
+
| I-6 | Both audit and fix `Agent` calls pass `model="sonnet"`. | `SKILL.md` § Step 2 (**Roles**); `CONSTRAINTS.md` — **Sonnet for both teammates** |
|
|
27
27
|
| I-7 | `TeamDelete()` is called with no arguments. | TeamDelete schema: no required params, no properties |
|
|
28
|
-
| I-8 | Loop count ≤ 10 audits. 11th audit never fires. | SKILL.md
|
|
29
|
-
| I-9 | From loop 4 onward without convergence, the audit phase emits three parallel `Agent` calls in a single assistant message with names `bugfind-loop-<N>-a/b/c`. | SKILL.md AUDIT action
|
|
30
|
-
| I-10 | Lead reads `.bugteam-loop-<N>.outcomes.xml` with the `Read` tool after each audit, before the next action. | SKILL.md
|
|
31
|
-
| I-11 | On exit of any kind, ordering is: `TeamDelete` → temp-dir cleanup → Step 4.5 PR rewrite → revoke. | SKILL.md Step 4
|
|
32
|
-
| I-12 | Lead never posts PR review comments, finding comments, or fix replies. The only lead-side PR mutation is the final `gh pr edit --body-file` in Step 4.5. |
|
|
33
|
-
| I-13 | Only the orchestrator (lead session) invokes `TeamCreate`. Every teammate `Agent(...)` call passes `team_name=<lead_team_name>`. No teammate ever calls `TeamCreate`. When supplementary work arises mid-cycle (parallel auditors, adjacent-file audits, infrastructure fixes), the lead spawns additional teammates into the existing team rather than creating a second team. |
|
|
28
|
+
| I-8 | Loop count ≤ 10 audits. 11th audit never fires. | `SKILL.md` YAML `description` (10-loop cap); § Step 3 (**Pre-audit** / **FIX** increment rules) |
|
|
29
|
+
| I-9 | From loop 4 onward without convergence, the audit phase emits three parallel `Agent` calls in a single assistant message with names `bugfind-loop-<N>-a/b/c`. | `SKILL.md` § AUDIT action (**Parallel auditors**); `reference/audit-and-teammates.md` § **Parallel auditors** |
|
|
30
|
+
| I-10 | Lead reads `.bugteam-loop-<N>.outcomes.xml` with the `Read` tool after each audit, before the next action. | `SKILL.md` § AUDIT action |
|
|
31
|
+
| I-11 | On exit of any kind, ordering is: teammate shutdowns → `TeamDelete` → temp-dir cleanup → Step 4.5 PR rewrite → revoke. | `SKILL.md` § Step 4; § Step 4.5; § Step 5; `reference/teardown-publish-permissions.md` § **Step 4** / **Step 4.5** / **Step 5** |
|
|
32
|
+
| I-12 | Lead never posts PR review comments, finding comments, or fix replies. The only lead-side PR mutation is the final `gh pr edit --body-file` in Step 4.5. | `CONSTRAINTS.md` — **Teammates own audit/fix comment posting**; **Lead owns the final PR description rewrite only** |
|
|
33
|
+
| I-13 | Only the orchestrator (lead session) invokes `TeamCreate`. Every teammate `Agent(...)` call passes `team_name=<lead_team_name>`. No teammate ever calls `TeamCreate`. When supplementary work arises mid-cycle (parallel auditors, adjacent-file audits, infrastructure fixes), the lead spawns additional teammates into the existing team rather than creating a second team. | `CONSTRAINTS.md` — **Orchestrator-only `TeamCreate`** (runtime error quoted there) |
|
|
34
34
|
|
|
35
35
|
Any eval failing one or more Layer A invariants fails the run.
|
|
36
36
|
|
|
@@ -105,33 +105,33 @@ The harness does not yet exist; this document defines its contract.
|
|
|
105
105
|
|
|
106
106
|
| # | Tool call | Source |
|
|
107
107
|
|---|---|---|
|
|
108
|
-
| 1 | `Bash("python .../grant_project_claude_permissions.py")` | SKILL.md Step 0 |
|
|
109
|
-
| 2 | `Bash("gh pr view --json number,baseRefName,headRefName,url")` | SKILL.md Step 1 |
|
|
110
|
-
| 3 | `Bash("git rev-parse HEAD")` → captures `starting_sha` | SKILL.md
|
|
111
|
-
| 4 | `TeamCreate(team_name="bugteam-pr-42-<ts>", description=..., agent_type="team-lead")` | SKILL.md Step 2 |
|
|
112
|
-
| 5 | `Bash("mkdir -p <team_temp_dir>")` | SKILL.md AUDIT action |
|
|
113
|
-
| 6 | `Bash("gh pr diff 42 -R ... > <team_temp_dir>/loop-1.patch")` | SKILL.md AUDIT action |
|
|
114
|
-
| 7 | `Agent(subagent_type="code-quality-agent", name="bugfind", team_name=..., model="sonnet", description=..., prompt=<audit XML loop 1>)` | SKILL.md AUDIT action |
|
|
115
|
-
| 8 | `Read(".bugteam-loop-1.outcomes.xml")` | SKILL.md AUDIT action |
|
|
116
|
-
| 9 | `SendMessage(to="bugfind", message={type: "shutdown_request", reason: "audit loop 1 complete; outcome XML captured"})` | SKILL.md AUDIT
|
|
117
|
-
| 10 | `Agent(subagent_type="clean-coder", name="bugfix", team_name=..., model="sonnet", description=..., prompt=<fix XML loop 1>)` | SKILL.md FIX action |
|
|
118
|
-
| 11 | `Read(".bugteam-loop-1.outcomes.xml")` — bugfix outcome XML overwrites same filename | SKILL.md FIX action |
|
|
119
|
-
| 12 | `Bash("git rev-parse HEAD")` → verify HEAD advanced | SKILL.md
|
|
120
|
-
| 13 | `Bash("git fetch origin <branch> && git rev-parse origin/<branch>")` → verify push landed | SKILL.md
|
|
121
|
-
| 14 | `SendMessage(to="bugfix", message={type: "shutdown_request", reason: "fix loop 1 complete; commit <sha7> pushed"})` | SKILL.md FIX
|
|
122
|
-
| 15 | `Bash("gh pr diff 42 -R ... > <team_temp_dir>/loop-2.patch")` | SKILL.md AUDIT action |
|
|
123
|
-
| 16 | `Agent(subagent_type="code-quality-agent", name="bugfind", ...)` (loop 2) | SKILL.md AUDIT action |
|
|
124
|
-
| 17 | `Read(".bugteam-loop-2.outcomes.xml")` — zero findings | SKILL.md AUDIT action |
|
|
125
|
-
| 18 | `SendMessage(to="bugfind", message={type: "shutdown_request", reason: "audit loop 2 complete; zero findings"})` | SKILL.md AUDIT
|
|
126
|
-
| 19 | `TeamDelete()` | SKILL.md Step 4 |
|
|
127
|
-
| 20 | `Bash("python -c \"import shutil; shutil.rmtree(r'<team_temp_dir>', ignore_errors=True)\"")` | SKILL.md Step 4 |
|
|
128
|
-
| 21 | `Bash("gh pr diff 42 -R ... > .bugteam-final.diff")` | SKILL.md
|
|
129
|
-
| 22 | `Bash("gh pr view 42 -R ... --json body --jq .body > .bugteam-original-body.md")` | SKILL.md
|
|
130
|
-
| 23 | `Agent(subagent_type="pr-description-writer", description=..., prompt=<brief>)` | SKILL.md Step 4.5 |
|
|
131
|
-
| 24 | `Write(".bugteam-final-body.md", <returned body>)` | SKILL.md
|
|
132
|
-
| 25 | `Bash("gh pr edit 42 -R ... --body-file .bugteam-final-body.md")` | SKILL.md
|
|
133
|
-
| 26 | `Bash("rm .bugteam-final.diff .bugteam-original-body.md .bugteam-final-body.md .bugteam-loop-*.outcomes.xml
|
|
134
|
-
| 27 | `Bash("python .../revoke_project_claude_permissions.py")` | SKILL.md Step 5 |
|
|
108
|
+
| 1 | `Bash("python .../scripts/grant_project_claude_permissions.py")` | `SKILL.md` § Step 0 |
|
|
109
|
+
| 2 | `Bash("gh pr view --json number,baseRefName,headRefName,url")` | `SKILL.md` § Step 1 |
|
|
110
|
+
| 3 | `Bash("git rev-parse HEAD")` → captures `starting_sha` | `SKILL.md` § Step 2 — **Loop state** block |
|
|
111
|
+
| 4 | `TeamCreate(team_name="bugteam-pr-42-<ts>", description=..., agent_type="team-lead")` | `SKILL.md` § Step 2 |
|
|
112
|
+
| 5 | `Bash("mkdir -p <team_temp_dir>")` | `SKILL.md` § AUDIT action |
|
|
113
|
+
| 6 | `Bash("gh pr diff 42 -R ... > <team_temp_dir>/loop-1.patch")` | `SKILL.md` § AUDIT action |
|
|
114
|
+
| 7 | `Agent(subagent_type="code-quality-agent", name="bugfind", team_name=..., model="sonnet", description=..., prompt=<audit XML loop 1>)` | `SKILL.md` § AUDIT action |
|
|
115
|
+
| 8 | `Read(".bugteam-loop-1.outcomes.xml")` | `SKILL.md` § AUDIT action |
|
|
116
|
+
| 9 | `SendMessage(to="bugfind", message={type: "shutdown_request", reason: "audit loop 1 complete; outcome XML captured"})` | `SKILL.md` § AUDIT action (**Shutdown** fallback) |
|
|
117
|
+
| 10 | `Agent(subagent_type="clean-coder", name="bugfix", team_name=..., model="sonnet", description=..., prompt=<fix XML loop 1>)` | `SKILL.md` § FIX action |
|
|
118
|
+
| 11 | `Read(".bugteam-loop-1.outcomes.xml")` — bugfix outcome XML overwrites same filename | `SKILL.md` § FIX action |
|
|
119
|
+
| 12 | `Bash("git rev-parse HEAD")` → verify HEAD advanced | `SKILL.md` § FIX action (**Verify**) |
|
|
120
|
+
| 13 | `Bash("git fetch origin <branch> && git rev-parse origin/<branch>")` → verify push landed | `SKILL.md` § FIX action (**Verify**) |
|
|
121
|
+
| 14 | `SendMessage(to="bugfix", message={type: "shutdown_request", reason: "fix loop 1 complete; commit <sha7> pushed"})` | `SKILL.md` § FIX action (**Shutdown** fallback) |
|
|
122
|
+
| 15 | `Bash("gh pr diff 42 -R ... > <team_temp_dir>/loop-2.patch")` | `SKILL.md` § AUDIT action |
|
|
123
|
+
| 16 | `Agent(subagent_type="code-quality-agent", name="bugfind", ...)` (loop 2) | `SKILL.md` § AUDIT action |
|
|
124
|
+
| 17 | `Read(".bugteam-loop-2.outcomes.xml")` — zero findings | `SKILL.md` § AUDIT action |
|
|
125
|
+
| 18 | `SendMessage(to="bugfind", message={type: "shutdown_request", reason: "audit loop 2 complete; zero findings"})` | `SKILL.md` § AUDIT action (**Shutdown** fallback) |
|
|
126
|
+
| 19 | `TeamDelete()` | `SKILL.md` § Step 4 |
|
|
127
|
+
| 20 | `Bash("python -c \"import shutil; shutil.rmtree(r'<team_temp_dir>', ignore_errors=True)\"")` | `SKILL.md` § Step 4 |
|
|
128
|
+
| 21 | `Bash("gh pr diff 42 -R ... > .bugteam-final.diff")` | `SKILL.md` § Step 4.5 step 1 |
|
|
129
|
+
| 22 | `Bash("gh pr view 42 -R ... --json body --jq .body > .bugteam-original-body.md")` | `SKILL.md` § Step 4.5 step 2 |
|
|
130
|
+
| 23 | `Agent(subagent_type="pr-description-writer", description=..., prompt=<brief>)` | `SKILL.md` § Step 4.5 |
|
|
131
|
+
| 24 | `Write(".bugteam-final-body.md", <returned body>)` | `SKILL.md` § Step 4.5 step 4 |
|
|
132
|
+
| 25 | `Bash("gh pr edit 42 -R ... --body-file .bugteam-final-body.md")` | `SKILL.md` § Step 4.5 step 4 |
|
|
133
|
+
| 26 | `Bash("rm .bugteam-final.diff .bugteam-original-body.md .bugteam-final-body.md")` | `SKILL.md` § Step 4.5 step 5 (lead may add `.bugteam-loop-*.outcomes.xml` in the same or a separate `rm` — reconcile on first real run) |
|
|
134
|
+
| 27 | `Bash("python .../scripts/revoke_project_claude_permissions.py")` | `SKILL.md` § Step 5 |
|
|
135
135
|
|
|
136
136
|
**Pass criteria.**
|
|
137
137
|
- All Layer A invariants hold.
|
|
@@ -228,7 +228,7 @@ Patch this table to match observation and annotate each correction.
|
|
|
228
228
|
- Every finding's outcome XML carries `used_fallback="true"` and the issue-comment URL as `finding_comment_url`.
|
|
229
229
|
- Cycle continues to the FIX action without aborting.
|
|
230
230
|
|
|
231
|
-
**Open item for the real run.** The
|
|
231
|
+
**Open item for the real run.** The issue-comments fallback shape is `jq -Rs | gh api .../issues/<number>/comments --input -` (`SKILL.md` § Step 2.5 **Review POST fails**; full narrative in `reference/github-pr-reviews.md` § **Review POST failure fallback**). Before running Eval 10 for real, confirm the teammate obeys this shape — the fixture must assert the endpoint path and the `--input -` pattern.
|
|
232
232
|
|
|
233
233
|
---
|
|
234
234
|
|
|
@@ -327,8 +327,8 @@ The adjacent-audit teammate writes its own outcome XML, self-terminates. Lead re
|
|
|
327
327
|
1. **Cycle 0 — Reconcile predictions with reality.** On the first real run, diff every Layer B predicted trace against the observed trace. Patch this file to match reality and annotate each correction with a reason.
|
|
328
328
|
2. **Baseline.** Run every eval with the skill unloaded. Record which cases the base model handles from memory versus which it gets wrong.
|
|
329
329
|
3. **Treatment.** Run every eval with the skill loaded. Layer A invariants must pass on every case. Layer B mismatches trigger Cycle 0 reconciliation.
|
|
330
|
-
4. **Regress on change.** Every edit to `SKILL.md` re-runs the full suite. A passing→failing transition on any Layer A invariant blocks the change. A Layer B mismatch after
|
|
331
|
-
5. **Extend on gotcha.** When the skill misfires in real use, add a new eval that reproduces the miss before patching
|
|
330
|
+
4. **Regress on change.** Every edit to normative text in `SKILL.md`, `CONSTRAINTS.md`, `PROMPTS.md`, or `reference/*.md` sections that Layer A cites re-runs the full suite. A passing→failing transition on any Layer A invariant blocks the change. A Layer B mismatch after such an edit triggers a patch to the affected eval trace in the same commit.
|
|
331
|
+
5. **Extend on gotcha.** When the skill misfires in real use, add a new eval that reproduces the miss before patching the orchestration or companion files.
|
|
332
332
|
|
|
333
333
|
## Harness sketch (future work)
|
|
334
334
|
|
|
@@ -341,6 +341,6 @@ A minimal Python harness under `packages/claude-dev-env/skills/bugteam/evals/`:
|
|
|
341
341
|
|
|
342
342
|
## Open research items flagged during this pass
|
|
343
343
|
|
|
344
|
-
1. **GitHub REST review-POST payload shape.** Eval 9 and Eval 10 depend on the exact body shape of `POST /pulls/<number>/reviews`.
|
|
345
|
-
2. **`SendMessage` shutdown origination — RESOLVED.** `SendMessage` tool docs include the line "Don't originate `shutdown_request` unless asked." `TeamCreate` tool docs explicitly direct the lead to originate `{type: "shutdown_request"}` for teammate cleanup. Real-run observation (loop 1 of eval run 2026-04-18) resolved the contradiction: teammates self-terminate when their task is complete — the `Agent` call returns and the teammate's session ends without any `SendMessage`. The cycle proceeded correctly without the lead ever needing to originate a `shutdown_request`. `SKILL.md`
|
|
344
|
+
1. **GitHub REST review-POST payload shape.** Eval 9 and Eval 10 depend on the exact body shape of `POST /pulls/<number>/reviews`. The `jq -n --rawfile ... --argjson ... | gh api ... --input -` fence lives in `SKILL.md` § Step 2.5 (**Review POST**); expanded copy in `reference/github-pr-reviews.md` § **Per-loop review**. Before running Eval 9/10 for real, fetch the current GitHub REST reference to confirm the request schema (fields `commit_id`, `event`, `body`, `comments[]`) and the multi-line anchor `{path, start_line, start_side, line, side, body}` shape still apply. Record the confirmed version and URL here.
|
|
345
|
+
2. **`SendMessage` shutdown origination — RESOLVED.** `SendMessage` tool docs include the line "Don't originate `shutdown_request` unless asked." `TeamCreate` tool docs explicitly direct the lead to originate `{type: "shutdown_request"}` for teammate cleanup. Real-run observation (loop 1 of eval run 2026-04-18) resolved the contradiction: teammates self-terminate when their task is complete — the `Agent` call returns and the teammate's session ends without any `SendMessage`. The cycle proceeded correctly without the lead ever needing to originate a `shutdown_request`. `SKILL.md` § AUDIT / FIX actions document self-termination as the expected path and lead-originated `SendMessage(shutdown_request)` as a fallback; `reference/audit-and-teammates.md` carries the longer shutdown narrative. Layer A **I-4** encodes “no orphaned teammates,” not “always send SendMessage.”
|
|
346
346
|
3. **Model override redundancy.** `code-quality-agent` and `clean-coder` may already pin `model` in their agent definitions. The explicit `model="sonnet"` in every spawn is insurance, but on the first real run confirm no conflict between the lead-passed model and the agent-frontmatter model.
|
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
# Bugteam reference
|
|
2
|
+
|
|
3
|
+
Expanded material that used to live inline in `SKILL.md`. Load a file when the orchestration stub in `SKILL.md` is not enough — debugging GitHub review shape, gate semantics, teardown edge cases, or explaining the design to a human.
|
|
4
|
+
|
|
5
|
+
| File | Domain |
|
|
6
|
+
|------|--------|
|
|
7
|
+
| [`design-rationale.md`](design-rationale.md) | Why agent teams (clean-room), table-of-contents habit, when `/bugteam` applies, refusal reasons |
|
|
8
|
+
| [`team-setup.md`](team-setup.md) | Permissions grant (`CLAUDE_SKILL_DIR`), PR scope, `TeamCreate`, team name / sanitization / temp dir / roles / loop state |
|
|
9
|
+
| [`github-pr-reviews.md`](github-pr-reviews.md) | Per-loop reviews, `jq` + `gh api` payloads, anchors, fallbacks, REST endpoints |
|
|
10
|
+
| [`audit-and-teammates.md`](audit-and-teammates.md) | Pre-audit gate, full cycle numbering, AUDIT and FIX actions, parallel auditors |
|
|
11
|
+
| [`teardown-publish-permissions.md`](teardown-publish-permissions.md) | Utility scripts note, teardown, PR description rewrite, revoke, final report |
|
|
12
|
+
|
|
13
|
+
Canonical documentation quotes: [`../sources.md`](../sources.md).
|
|
@@ -0,0 +1,127 @@
|
|
|
1
|
+
# Gate, cycle, AUDIT, and FIX
|
|
2
|
+
|
|
3
|
+
## Step 3 — The cycle (full detail)
|
|
4
|
+
|
|
5
|
+
Repeat until an exit condition fires.
|
|
6
|
+
|
|
7
|
+
**Ordering principle:** Mandatory **CODE_RULES** checks (`validate_content` from `hooks/blocking/code_rules_enforcer.py`) must pass on the PR-scoped file set **before** any **AUDIT** (bugfind) teammate runs. The **clean-coder** teammate clears gate failures; then the **code-quality-agent** teammate audits. This mirrors “CI green, then review,” without relying on GitHub Actions — the script is the gate.
|
|
8
|
+
|
|
9
|
+
1. Decide the next action from `last_action` and `last_findings`:
|
|
10
|
+
- `last_action == "audited"` and `last_findings.total == 0` → exit reason = `converged`
|
|
11
|
+
- `last_action == "fixed"` and `git rev-parse HEAD` did not change since pre-FIX → exit reason = `stuck` (see FIX action)
|
|
12
|
+
- `last_action in {"fresh", "fixed"}` → go to **pre-audit path** (below), then **AUDIT**
|
|
13
|
+
- `last_action == "audited"` and `last_findings.total > 0` → go to **FIX** (below)
|
|
14
|
+
|
|
15
|
+
2. **Pre-audit path** (only when the next step is **AUDIT**):
|
|
16
|
+
1. From the repository root, run the gate script (align `--base` with the PR base branch from Step 1, e.g. `origin/main` or `origin/develop`):
|
|
17
|
+
|
|
18
|
+
```bash
|
|
19
|
+
python "${CLAUDE_SKILL_DIR}/scripts/bugteam_code_rules_gate.py" --base origin/<baseRefName>
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
`git merge-base` + `git diff --name-only` live inside the script; see [`../scripts/README.md`](../scripts/README.md). The lead runs this (not a teammate).
|
|
23
|
+
|
|
24
|
+
2. If exit code **0** → continue to step 2.5 (AUDIT spawn) below.
|
|
25
|
+
3. If exit code **non-zero** → spawn a new **clean-coder** teammate — **standards-fix pass** — with instructions: read the script’s stderr, edit the repo until a **re-run** of the **same** gate command exits **0**, then one commit, `git push`, shutdown. Repeat standards-fix spawns until the gate exits **0** or **5** failed gate rounds (each round = one teammate session after a non-zero gate). If still non-zero after 5 rounds → exit reason = `error: code rules gate failed pre-audit`.
|
|
26
|
+
4. After gate exit **0**, increment `loop_count`. If `loop_count > 10`, exit reason = `cap reached` (counts **audits**, not standards-only rounds).
|
|
27
|
+
5. Execute **AUDIT action** (spawn bugfind). Print progress: `Loop <N> audit: ...`
|
|
28
|
+
|
|
29
|
+
3. **FIX path** (when `last_action == "audited"` and `last_findings.total > 0`):
|
|
30
|
+
1. Increment `loop_count`. If `loop_count > 10`, exit reason = `cap reached`.
|
|
31
|
+
2. Execute **FIX action** (spawn bugfix clean-coder for audit findings). Print: `Loop <N> fix: commit ...`
|
|
32
|
+
3. Set `last_action = "fixed"`, update `audit_log`, loop to step 1 (next iteration hits **pre-audit path** before the next AUDIT).
|
|
33
|
+
|
|
34
|
+
4. After **AUDIT**, update `last_action`, `last_findings`, `audit_log`; print the audit progress line if not already printed.
|
|
35
|
+
|
|
36
|
+
5. Loop.
|
|
37
|
+
|
|
38
|
+
**Note:** The first iteration uses **pre-audit path** then **AUDIT**. After a **FIX**, the next iteration runs **pre-audit path** again (gate → then AUDIT), so `validate_content` stays green before semantic audit.
|
|
39
|
+
|
|
40
|
+
## AUDIT action (clean-room teammate, fresh per loop)
|
|
41
|
+
|
|
42
|
+
Capture a fresh PR diff for this loop into the per-team scoped directory so concurrent `/bugteam` runs keep patches isolated. Use the literal `<team_temp_dir>` resolved once in Step 2 — Claude resolves the absolute path; every shell receives the same literal value.
|
|
43
|
+
|
|
44
|
+
Commands and `Agent(...)` shape: `SKILL.md`.
|
|
45
|
+
|
|
46
|
+
`<team_temp_dir>` includes the sanitized `team_name` and timestamp; `team_name` is already prefixed with `bugteam-`. Claude resolves `Path(tempfile.gettempdir()) / team_name` once and passes that absolute path to every shell. `tempfile.gettempdir()` honors `TMPDIR`, `TEMP`, `TMP` and falls back to the OS temp directory, so the same approach works on macOS, Linux, Windows cmd.exe, and PowerShell.
|
|
47
|
+
|
|
48
|
+
Each loop calls `Agent` again with a fresh invocation so the teammate starts with its own context window. Doc line on lead history: [`../sources.md`](../sources.md).
|
|
49
|
+
|
|
50
|
+
See [`../PROMPTS.md`](../PROMPTS.md) for AUDIT spawn-prompt XML and bugfind outcome schema. Substitute placeholders (`repo`, `branch`, `base_branch`, `pr_url`, `loop`, `diff_path`) into the `prompt` argument.
|
|
51
|
+
|
|
52
|
+
After the teammate returns, the lead reads `.bugteam-loop-<N>.outcomes.xml` with the `Read` tool, parses it, and populates `loop_comment_index` from `<finding>` elements.
|
|
53
|
+
|
|
54
|
+
### Shutdown (bugfind)
|
|
55
|
+
|
|
56
|
+
**Expected path — self-termination:** Teammates often self-terminate when complete — the `Agent` call returns and the session ends. Then no `SendMessage` is needed.
|
|
57
|
+
|
|
58
|
+
**Fallback — lead-initiated shutdown:** If the teammate still appears active after `Agent` returns, send:
|
|
59
|
+
|
|
60
|
+
```
|
|
61
|
+
SendMessage(
|
|
62
|
+
to="bugfind",
|
|
63
|
+
message={
|
|
64
|
+
"type": "shutdown_request",
|
|
65
|
+
"reason": "audit loop <N> complete; outcome XML captured"
|
|
66
|
+
}
|
|
67
|
+
)
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
The teammate replies with `{type: "shutdown_response", approve: true}`. If `approve` is `false`, exit reason = `error: bugfind teammate refused shutdown` → Step 4 teardown then Step 5 revoke.
|
|
71
|
+
|
|
72
|
+
`last_action = "audited"`. Append audit metadata to `audit_log`.
|
|
73
|
+
|
|
74
|
+
### Parallel auditors (`loop_count >= 4`)
|
|
75
|
+
|
|
76
|
+
The pre-audit gate must pass immediately before this step. After three full audit/fix rounds without convergence, issue three `Agent` calls in **one** assistant message so they run in parallel:
|
|
77
|
+
|
|
78
|
+
```
|
|
79
|
+
Agent(subagent_type="code-quality-agent", name="bugfind-loop-<N>-a", team_name="<team_name>", model="sonnet", description="Bugfind audit loop <N> variant a", prompt="<audit XML; write outcome to .bugteam-loop-<N>.outcomes.xml; post the per-loop review; read and merge b/c outcomes from <team_temp_dir>/loop-<N>-b.outcomes.xml and <team_temp_dir>/loop-<N>-c.outcomes.xml>")
|
|
80
|
+
Agent(subagent_type="code-quality-agent", name="bugfind-loop-<N>-b", team_name="<team_name>", model="sonnet", description="Bugfind audit loop <N> variant b", prompt="<audit XML; write outcome to <team_temp_dir>/loop-<N>-b.outcomes.xml; skip PR posting>")
|
|
81
|
+
Agent(subagent_type="code-quality-agent", name="bugfind-loop-<N>-c", team_name="<team_name>", model="sonnet", description="Bugfind audit loop <N> variant c", prompt="<audit XML; write outcome to <team_temp_dir>/loop-<N>-c.outcomes.xml; skip PR posting>")
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
Teammate `-a` is the post-owner: read all three outcome XML files at explicit absolute paths (`.bugteam-loop-<N>.outcomes.xml` in cwd, plus sibling paths under `<team_temp_dir>`), merge findings by `(file, line, category_letter)` (collapse duplicates, keep longest description and highest severity), re-assign merged IDs as `loopN-K`, post the single per-loop review. The `-a` prompt must embed sibling paths as literal absolutes so `Read` works without discovery.
|
|
85
|
+
|
|
86
|
+
Shutdown order: parallel `SendMessage` to `b` and `c`, then `a`:
|
|
87
|
+
|
|
88
|
+
```
|
|
89
|
+
SendMessage(to="bugfind-loop-<N>-b", message={"type": "shutdown_request", "reason": "variant XML captured"})
|
|
90
|
+
SendMessage(to="bugfind-loop-<N>-c", message={"type": "shutdown_request", "reason": "variant XML captured"})
|
|
91
|
+
```
|
|
92
|
+
|
|
93
|
+
then
|
|
94
|
+
|
|
95
|
+
```
|
|
96
|
+
SendMessage(to="bugfind-loop-<N>-a", message={"type": "shutdown_request", "reason": "merged review posted"})
|
|
97
|
+
```
|
|
98
|
+
|
|
99
|
+
## FIX action (fresh teammate)
|
|
100
|
+
|
|
101
|
+
`Agent` shape in `SKILL.md`. The teammate sees only the latest audit’s findings — each `Agent` call starts with a fresh context window; prior-loop findings, fix history, and chat stay in the lead.
|
|
102
|
+
|
|
103
|
+
Pass finding comment URL and id for each finding (from `loop_comment_index`) in the XML prompt so the teammate owns replies. After commit: one reply per finding (`Fixed in <commit_sha>` or `Could not address this loop: <one-line reason>`). Same identity model as bugfind: teammate posts; lead waits.
|
|
104
|
+
|
|
105
|
+
After replies, the teammate writes outcome XML (schema in [`../PROMPTS.md`](../PROMPTS.md)).
|
|
106
|
+
|
|
107
|
+
### Shutdown (bugfix)
|
|
108
|
+
|
|
109
|
+
Same self-termination vs `SendMessage` split as bugfind. Fallback message:
|
|
110
|
+
|
|
111
|
+
```
|
|
112
|
+
SendMessage(
|
|
113
|
+
to="bugfix",
|
|
114
|
+
message={
|
|
115
|
+
"type": "shutdown_request",
|
|
116
|
+
"reason": "fix loop <N> complete; commit <sha7> pushed"
|
|
117
|
+
}
|
|
118
|
+
)
|
|
119
|
+
```
|
|
120
|
+
|
|
121
|
+
`approve: false` → `error: bugfix teammate refused shutdown` → Step 4 then 5.
|
|
122
|
+
|
|
123
|
+
Substitute placeholders from `last_findings` into the fix prompt per [`../PROMPTS.md`](../PROMPTS.md).
|
|
124
|
+
|
|
125
|
+
**Verify push:** `git rev-parse HEAD` after fix must differ from before; new HEAD must exist on `origin/<branch>` (`git fetch origin <branch> && git rev-parse origin/<branch>` matches `HEAD`). If HEAD did not change → `stuck — bugfix teammate could not address findings`.
|
|
126
|
+
|
|
127
|
+
`last_action = "fixed"`. Append fix line to `audit_log`.
|
|
@@ -0,0 +1,28 @@
|
|
|
1
|
+
# Design rationale
|
|
2
|
+
|
|
3
|
+
## Core principle (expanded)
|
|
4
|
+
|
|
5
|
+
A Claude Code **agent team** runs the audit-and-fix loop until convergence. The bugfind teammate audits clean-room (own context window, no chat history); the bugfix teammate addresses each audit's findings; both spawn fresh per loop. A 10-loop hard cap prevents runaway cost. Project permissions are granted at session start and revoked at session end.
|
|
6
|
+
|
|
7
|
+
Teammate isolation versus subagents returning into the lead’s context is the clean-room property. Verbatim Anthropic quotes and URLs: [`../sources.md`](../sources.md).
|
|
8
|
+
|
|
9
|
+
## Why not parallel subagents here
|
|
10
|
+
|
|
11
|
+
Subagents return their results into the lead’s context, which accumulates across loops. Agent-team teammates are independent sessions with their own context windows and do not pollute the lead. The lead can shut down and respawn each loop so every audit starts fresh. For `/bugteam`, the independent-context property is required; parallel subagents fail the clean-room requirement. Supporting quotes: [`../sources.md`](../sources.md) (subagents vs agent teams).
|
|
12
|
+
|
|
13
|
+
## Table of contents in `SKILL.md`
|
|
14
|
+
|
|
15
|
+
The top-of-file list exists so partial reads (for example `head -100`) still show scope. Anthropic guidance: [Structure longer reference files with table of contents](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices#structure-longer-reference-files-with-table-of-contents).
|
|
16
|
+
|
|
17
|
+
## When `/bugteam` applies (narrative)
|
|
18
|
+
|
|
19
|
+
The user wants automated convergence on a clean PR without babysitting each step. Typed `/bugteam` once means full authorization for up to ten audit cycles and the corresponding fix commits.
|
|
20
|
+
|
|
21
|
+
### Refusal reasons (detail)
|
|
22
|
+
|
|
23
|
+
- **Agent teams off:** Without the feature flag, the workflow cannot run.
|
|
24
|
+
- **No PR / diff:** There is nothing scoped to audit.
|
|
25
|
+
- **Dirty tree:** The fix teammate will commit; uncommitted local work would be mixed into automated commits.
|
|
26
|
+
- **Missing subagents:** Both `code-quality-agent` and `clean-coder` must exist in the environment before Step 0.
|
|
27
|
+
|
|
28
|
+
Exact refusal strings remain in `SKILL.md`.
|
|
@@ -0,0 +1,86 @@
|
|
|
1
|
+
# GitHub PR comments (Step 2.5)
|
|
2
|
+
|
|
3
|
+
Per-loop pull-request reviews: findings render as a tree under one parent review (similar to Cursor Bugbot). **Teammates own all PR comment posting** — bugfind posts the review (parent body plus child finding comments in one batched POST); bugfix posts fix replies. Comment, review, and reply POSTs belong to teammates. The lead’s single PR write is the final description rewrite at Step 4.5 (`pr-description-writer`).
|
|
4
|
+
|
|
5
|
+
- **Per-loop review** — One `POST /pulls/<number>/reviews` per loop, posted by the bugfind teammate **after** auditing. The review body is the loop header (audit counts); the review’s `comments[]` array holds one anchored finding per P0/P1/P2 finding. GitHub renders a single collapsible thread with each finding as a child comment.
|
|
6
|
+
|
|
7
|
+
- **Fix replies** — Reply to each child finding comment after the commit lands. Body: `Fixed in <commit_sha>` if addressed, or `Could not address this loop: <one-line reason>` if not. The `/pulls/<number>/comments/<id>/replies` endpoint works on any review comment, including those created as part of a review.
|
|
8
|
+
|
|
9
|
+
**Ordering:** Bugfind audits first, buffers findings, validates anchors against the captured diff, then posts the review **once** at the end. The review body states the finding count authoritatively. Keep all posting in that single end-of-loop review POST.
|
|
10
|
+
|
|
11
|
+
## CLI shapes (teammate)
|
|
12
|
+
|
|
13
|
+
All three POSTs use the same pattern: build JSON with `jq` (`--rawfile` or `-Rs` so markdown with backticks, newlines, and quotes survives intact), then pipe to `gh api ... --input -` on stdin. This avoids shell-quoting edge cases.
|
|
14
|
+
|
|
15
|
+
### Per-loop review (one POST creates parent + children)
|
|
16
|
+
|
|
17
|
+
Build `comments[]` programmatically from buffered, diff-anchored findings. Single-line shape: `{path, line, side: "RIGHT", body: <finding markdown>}`. Multi-line: `{path, start_line, start_side: "RIGHT", line, side: "RIGHT", body: ...}` (all four anchor fields required).
|
|
18
|
+
|
|
19
|
+
```
|
|
20
|
+
jq -n \
|
|
21
|
+
--rawfile review_body <tmp_review_body.md> \
|
|
22
|
+
--arg commit_id "$(git rev-parse HEAD)" \
|
|
23
|
+
--rawfile finding_body_1 <tmp_finding_1.md> \
|
|
24
|
+
--arg path_1 "<file_1>" \
|
|
25
|
+
--argjson line_1 <line_1> \
|
|
26
|
+
[... one finding_body_K / path_K / line_K triple per anchored finding ...] \
|
|
27
|
+
'{
|
|
28
|
+
commit_id: $commit_id,
|
|
29
|
+
event: "COMMENT",
|
|
30
|
+
body: $review_body,
|
|
31
|
+
comments: [
|
|
32
|
+
{path: $path_1, line: $line_1, side: "RIGHT", body: $finding_body_1}
|
|
33
|
+
[, ... one object per anchored finding ...]
|
|
34
|
+
]
|
|
35
|
+
}' \
|
|
36
|
+
| gh api repos/<owner>/<repo>/pulls/<number>/reviews -X POST --input -
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
Response JSON includes the parent review `id` / `html_url` and a `comments` array of child comments (`id`, `html_url`). Harvest children in index order and align with the finding list.
|
|
40
|
+
|
|
41
|
+
### Fix reply
|
|
42
|
+
|
|
43
|
+
```
|
|
44
|
+
jq -Rs '{body: .}' < <tmp_reply.md> \
|
|
45
|
+
| gh api repos/<owner>/<repo>/pulls/<number>/comments/<finding_comment_id>/replies -X POST --input -
|
|
46
|
+
```
|
|
47
|
+
|
|
48
|
+
### Review POST failure fallback
|
|
49
|
+
|
|
50
|
+
Top-level PR comment via issue-comments (`{issue_number}` is the PR number):
|
|
51
|
+
|
|
52
|
+
```
|
|
53
|
+
jq -Rs '{body: .}' < <tmp_fallback.md> \
|
|
54
|
+
| gh api repos/<owner>/<repo>/issues/<number>/comments -X POST --input -
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
`<head_sha_at_post_time>` is the SHA at post time (`git rev-parse HEAD` in the teammate’s working directory immediately before the POST). The review anchors finding comments to the head SHA at audit time (before this loop’s fix lands).
|
|
58
|
+
|
|
59
|
+
Write each body (review body and every per-finding body) to its own temp file before the `jq` pipeline. Bodies stay inside files the pipeline reads — they reach GitHub inside the JSON payload — which keeps them compatible with the `gh-body-backtick-guard` hook that scans command-line `--body` arguments.
|
|
60
|
+
|
|
61
|
+
## Review body template (`<tmp_review_body.md>`)
|
|
62
|
+
|
|
63
|
+
```
|
|
64
|
+
## /bugteam loop <N> audit: <P0>P0 / <P1>P1 / <P2>P2
|
|
65
|
+
|
|
66
|
+
<if any findings could not be anchored to a diff line, include this section:>
|
|
67
|
+
### Findings without a diff anchor
|
|
68
|
+
|
|
69
|
+
- **[severity] title** — <file>:<line> — <one-line description>
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
If the audit returns zero findings, still post **one** review with `event=COMMENT`, empty `comments[]`, and body `## /bugteam loop <N> audit: 0P0 / 0P1 / 0P2 → clean` so each loop’s section is self-contained on the PR.
|
|
73
|
+
|
|
74
|
+
## Anchor-validation fallback (teammate)
|
|
75
|
+
|
|
76
|
+
GitHub rejects the entire review POST if any `comments[]` entry targets a line not in the diff. Before posting, validate every finding’s `(file, line)` against the captured diff. Findings not in the diff are **not** added to `comments[]`; list them in the review body under `### Findings without a diff anchor`. Outcome XML: `used_fallback="true"`, `finding_comment_id=""`, `finding_comment_url=<review_url>` (parent URL when there is no child). Log fallback count in outcome XML for the lead’s final report. The loop continues; anchor mismatch does not abort.
|
|
77
|
+
|
|
78
|
+
## Review POST failure fallback
|
|
79
|
+
|
|
80
|
+
If the review POST fails (rate limit, network, malformed payload), fall back to one top-level issue comment containing the review body plus every finding inline (severity, file:line, description). Every finding in that run: `used_fallback="true"`, `finding_comment_url` = issue-comment URL. Use the issue-comment CLI shape above.
|
|
81
|
+
|
|
82
|
+
## GitHub REST endpoints
|
|
83
|
+
|
|
84
|
+
- Per-loop batched review: `POST /repos/{owner}/{repo}/pulls/{pull_number}/reviews` (required: `body`, `event=COMMENT`, `commit_id`; optional `comments[]` — each entry needs `path`, `body`, `line`, `side`)
|
|
85
|
+
- Fix reply: `POST /repos/{owner}/{repo}/pulls/{pull_number}/comments/{comment_id}/replies` (required: `body`)
|
|
86
|
+
- Review-POST failure fallback: `POST /repos/{owner}/{repo}/issues/{issue_number}/comments` (required: `body`; `{issue_number}` is the PR number)
|
|
@@ -0,0 +1,51 @@
|
|
|
1
|
+
# Team setup and loop state
|
|
2
|
+
|
|
3
|
+
## Step 0 — Grant project permissions (detail)
|
|
4
|
+
|
|
5
|
+
Before spawning any teammates, grant the team session write access to the project’s `.claude/**` tree. Command in `SKILL.md`.
|
|
6
|
+
|
|
7
|
+
`${CLAUDE_SKILL_DIR}` is a Claude Code host-managed token, pre-substituted by the runtime before any shell sees it. Unlike `${TMPDIR}` and similar shell parameter expansions, it does not depend on the shell’s expansion semantics, so it behaves the same on Unix and Windows shells.
|
|
8
|
+
|
|
9
|
+
The script reads `Path.cwd()` and writes idempotent allow rules into `~/.claude/settings.json`. Run from the project root. If it fails (non-zero exit), surface the error and stop — do not proceed without the grant.
|
|
10
|
+
|
|
11
|
+
This is the **first** action of every `/bugteam` invocation, before any team creation or agent spawn. The corresponding revoke runs at Step 5 regardless of how the cycle exits.
|
|
12
|
+
|
|
13
|
+
## Step 1 — Resolve PR scope (detail)
|
|
14
|
+
|
|
15
|
+
Same resolution path as `/findbugs`:
|
|
16
|
+
|
|
17
|
+
1. `gh pr view --json number,baseRefName,headRefName,url` from the working directory.
|
|
18
|
+
2. Fall back to `git merge-base HEAD origin/<default>` then `git diff <merge-base>...HEAD`.
|
|
19
|
+
3. Neither → refuse per refusal cases in `SKILL.md`.
|
|
20
|
+
|
|
21
|
+
Capture `<owner>/<repo>`, head branch, base branch, PR number, PR URL. This scope persists across every loop — `/bugteam` runs to completion from the single up-front confirmation.
|
|
22
|
+
|
|
23
|
+
## Step 2 — Create the agent team (detail)
|
|
24
|
+
|
|
25
|
+
This session is the **team lead**. Create the team with `TeamCreate` using the exact argument shape in `SKILL.md`.
|
|
26
|
+
|
|
27
|
+
`<team_name>` is built under **Team name** below (sanitization + timestamp already applied). `TeamCreate` matches natural-language team creation in the product docs; quote in [`../sources.md`](../sources.md).
|
|
28
|
+
|
|
29
|
+
### Team specification
|
|
30
|
+
|
|
31
|
+
- **Team name:** `bugteam-pr-<number>-<YYYYMMDDHHMMSS>` (or `bugteam-<sanitized-head-branch>-<YYYYMMDDHHMMSS>` if no PR). The timestamp is captured at team-creation time from the lead session and prevents two concurrent invocations on the same PR from colliding.
|
|
32
|
+
|
|
33
|
+
- **Branch-name sanitization (no-PR fallback only):** Before substituting `<head-branch>` into the `team_name` template, replace every character outside `[A-Za-z0-9._-]` with `-`. The whitelist keeps safe portable filename characters only; OS-reserved and shell-special characters (`/ \ : * ? < > | "` plus ASCII control characters `0x00`–`0x1F`) fall outside the whitelist and become `-`. Example: `feat/foo*bar` → `feat-foo-bar`; `team_name` becomes `bugteam-feat-foo-bar-<YYYYMMDDHHMMSS>`. Apply sanitization when `team_name` is first assembled so every downstream use (team creation, scoped temp dir, cleanup) sees the safe form.
|
|
34
|
+
|
|
35
|
+
- **Per-team temp directory (resolved once, reused everywhere):** After `team_name` is captured, resolve a portable absolute path with a Claude-side lookup using Python’s `tempfile.gettempdir()`, which honors `TMPDIR`, `TEMP`, and `TMP` in the platform-correct order and falls back to `C:\Users\<user>\AppData\Local\Temp` on Windows or `/tmp` on Unix: `Path(tempfile.gettempdir()) / team_name` (requires `import tempfile`). The `team_name` value already carries the `bugteam-` prefix; keep it as-is here. Let `tempfile.gettempdir()` perform the lookup; use its result directly. Capture the resolved absolute path as `<team_temp_dir>` and pass that literal path to every shell command that follows. Claude performs all temp-root resolution so every shell (bash, cmd.exe, PowerShell) receives the same literal absolute value.
|
|
36
|
+
|
|
37
|
+
- **Roles defined up front (spawned per loop, not at team creation):**
|
|
38
|
+
- `bugfind` — teammate role `code-quality-agent`, model sonnet
|
|
39
|
+
- `bugfix` — teammate role `clean-coder`, model sonnet
|
|
40
|
+
|
|
41
|
+
- **Display mode:** inherit the user’s default (`teammateMode` in `~/.claude.json`); do not override.
|
|
42
|
+
|
|
43
|
+
Reference teammate role definitions by name when spawning. Doc quote on subagent scopes: [`../sources.md`](../sources.md).
|
|
44
|
+
|
|
45
|
+
### Loop state block
|
|
46
|
+
|
|
47
|
+
The block in `SKILL.md` mixes lead-internal variables and one shell command (`starting_sha`). Read it as instructions, not a single runnable script.
|
|
48
|
+
|
|
49
|
+
**`loop_comment_index` scope (per-loop, not cross-loop):** Reset at the start of every AUDIT action, populated as finding comments are posted during AUDIT, consumed by the matching FIX action when it posts fix replies, and discarded after FIX completes. It does not persist across loops; each loop starts with an empty index and its own fresh set of comment URLs.
|
|
50
|
+
|
|
51
|
+
Each entry: `{loop, finding_id, finding_comment_id, finding_comment_url, used_fallback, fix_status}`. Populated by AUDIT, consumed by FIX.
|
|
@@ -0,0 +1,70 @@
|
|
|
1
|
+
# Utility scripts, teardown, PR body, permissions, report
|
|
2
|
+
|
|
3
|
+
## Utility scripts (progressive disclosure)
|
|
4
|
+
|
|
5
|
+
Fragile or repeatable shell sequences belong in `scripts/`. Anthropic: [Progressive disclosure patterns](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices#progressive-disclosure-patterns) — utility scripts are **executed**, not loaded into context as primary reading. Details: [`../scripts/README.md`](../scripts/README.md).
|
|
6
|
+
|
|
7
|
+
### Pre-flight (recommended before Step 0)
|
|
8
|
+
|
|
9
|
+
From the repository root, run the command in `SKILL.md`. If the exit code is non-zero, stop and fix failing checks before granting permissions. Optional: `BUGTEAM_PREFLIGHT_SKIP=1` skips pre-flight (emergency only). Optional: `--pre-commit` when `.pre-commit-config.yaml` exists.
|
|
10
|
+
|
|
11
|
+
## Step 4 — Tear down the team and clean working tree
|
|
12
|
+
|
|
13
|
+
When the cycle exits (any reason), run these steps in order from **this** session (the lead).
|
|
14
|
+
|
|
15
|
+
1. **Confirm every teammate has shut down.** Any teammate still alive (for example, from an aborted shutdown mid-loop) must receive a shutdown message first. For each remaining teammate name:
|
|
16
|
+
|
|
17
|
+
```
|
|
18
|
+
SendMessage(to="<teammate_name>", message={"type": "shutdown_request", "reason": "bugteam cycle ending"})
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
Product docs: the lead’s cleanup checks for active teammates — shut them down first. Verbatim quote: [`../sources.md`](../sources.md) (lead cleanup).
|
|
22
|
+
|
|
23
|
+
If any teammate returns `approve: false` during cleanup, log the name (for example `cleanup warning: <teammate_name> refused shutdown_request`) and **still** proceed to `TeamDelete`. `TeamDelete` may fail if teammates remain; if so, surface the error in the final report. Do not abort the sequence — continue through temp-dir deletion, Step 4.5, and Step 5.
|
|
24
|
+
|
|
25
|
+
2. **Clean up the team** with `TeamDelete()` (no arguments — reads `<team_name>` from session context). Maps to “clean up the team” in the docs; quote: [`../sources.md`](../sources.md).
|
|
26
|
+
|
|
27
|
+
3. **Delete the per-team temp directory** using the Python one-liner in `SKILL.md` with the same literal `<team_temp_dir>` from Step 2. `shutil.rmtree(..., ignore_errors=True)` is portable across Windows and Unix.
|
|
28
|
+
|
|
29
|
+
## Step 4.5 — Finalize the PR description (mandatory)
|
|
30
|
+
|
|
31
|
+
After teardown and before permission revoke, the lead rewrites the PR body to the PR’s **final cumulative state** — what the change delivers, not the bugteam process. This is the **only** PR-write the lead performs (audit and fix comments stay with teammates).
|
|
32
|
+
|
|
33
|
+
The lead delegates body text to the `pr-description-writer` agent so the global mandatory-pr-description-writer hook accepts the subsequent `gh pr edit`. The lead does **not** compose the body inline.
|
|
34
|
+
|
|
35
|
+
`pr-description-writer` comes from the global git-workflow rule in `claude-code-config`. Invoke with `Agent`:
|
|
36
|
+
|
|
37
|
+
```
|
|
38
|
+
Agent(
|
|
39
|
+
subagent_type="pr-description-writer",
|
|
40
|
+
description="Rewrite PR <number> body from cumulative diff",
|
|
41
|
+
prompt="<brief from steps below>"
|
|
42
|
+
)
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
If that subagent is missing, fall back to `general-purpose` with the same brief — the hook treats agent-authored bodies the same. If neither exists, log a warning and skip Step 4.5.
|
|
46
|
+
|
|
47
|
+
**Steps:**
|
|
48
|
+
|
|
49
|
+
1. Capture cumulative diff: `gh pr diff <number> -R <owner>/<repo> > .bugteam-final.diff`.
|
|
50
|
+
2. Capture original body: `gh pr view <number> -R <owner>/<repo> --json body --jq .body > .bugteam-original-body.md`.
|
|
51
|
+
3. Agent brief:
|
|
52
|
+
- **Inputs:** diff path, original body path, head branch, base branch.
|
|
53
|
+
- **Constraint:** describe what the PR delivers from the cumulative diff — behavior, user-visible effect, merge rationale. Process metadata (loops, fix counts, findings) stays in review comments.
|
|
54
|
+
- **Preservation rule:** if the original body has manually curated sections (linked issues, screenshots, test plan, “Risk Assessment”, etc.), preserve them verbatim and only rewrite narrative around them.
|
|
55
|
+
- **Output:** new body markdown.
|
|
56
|
+
4. Write `.bugteam-final-body.md`.
|
|
57
|
+
5. `gh pr edit <number> -R <owner>/<repo> --body-file .bugteam-final-body.md`.
|
|
58
|
+
6. Delete `.bugteam-final.diff`, `.bugteam-original-body.md`, `.bugteam-final-body.md`.
|
|
59
|
+
|
|
60
|
+
If Step 4.5 fails (agent error, hook block, network), report in the final report and continue to Step 5. The original PR body remains; commits and comments are unaffected.
|
|
61
|
+
|
|
62
|
+
## Step 5 — Revoke project permissions (mandatory, always)
|
|
63
|
+
|
|
64
|
+
After team cleanup — including on error, cap-reached, or stuck exits — run the revoke command from `SKILL.md`.
|
|
65
|
+
|
|
66
|
+
This removes allow rules and `additionalDirectories` added in Step 0. Revoke is non-negotiable: leaving the grant in place would let future sessions inherit elevated `.claude/**` access without an explicit opt-in. Run revoke even if Step 4 partially failed; log cleanup errors separately.
|
|
67
|
+
|
|
68
|
+
## Step 6 — Final report
|
|
69
|
+
|
|
70
|
+
Template and exit hints are in `SKILL.md`. If exit = `cap reached`, name remaining bug count and suggest `/findbugs` for human triage. If `stuck`, name which findings the fix agent could not resolve. If `error`, surface the error and loop number.
|
|
@@ -6,6 +6,10 @@ Scripts in this directory are **executed** by the lead or teammates. They are no
|
|
|
6
6
|
|--------|---------|
|
|
7
7
|
| `bugteam_preflight.py` | Run pytest (when configured) and optional `pre-commit` before `/bugteam`. |
|
|
8
8
|
| `bugteam_code_rules_gate.py` | Run `validate_content` from `code-rules-enforcer.py` on PR-scoped files (`git diff` vs merge-base). Exit `1` if any mandatory rule fails. Invoked **before each audit**; the fixer clears it before the auditor runs. |
|
|
9
|
+
| `grant_project_claude_permissions.py` | Idempotent grant of Edit/Write/Read on `cwd/.claude/**` into `~/.claude/settings.json`. |
|
|
10
|
+
| `revoke_project_claude_permissions.py` | Removes the matching grant entries from `~/.claude/settings.json`. |
|
|
11
|
+
| `test_claude_permissions_common.py` | Pytest module for path normalization and glob-metacharacter guards in `_claude_permissions_common.py`. |
|
|
12
|
+
| `_claude_permissions_common.py` | Shared helpers for the grant/revoke scripts (atomic JSON writes, settings sections). |
|
|
9
13
|
|
|
10
14
|
## `bugteam_preflight.py`
|
|
11
15
|
|