claude-dev-env 1.37.1 → 1.38.1

Files changed (94)
  1. package/CLAUDE.md +3 -0
  2. package/_shared/pr-loop/audit-contract.md +4 -3
  3. package/_shared/pr-loop/fix-protocol.md +2 -0
  4. package/_shared/pr-loop/gh-payloads.md +38 -37
  5. package/_shared/pr-loop/scripts/README.md +0 -1
  6. package/_shared/pr-loop/scripts/preflight.py +2 -1
  7. package/_shared/pr-loop/scripts/tests/test_code_rules_gate.py +2 -2
  8. package/_shared/pr-loop/scripts/tests/test_preflight.py +22 -0
  9. package/_shared/pr-loop/state-schema.md +10 -10
  10. package/agents/clean-coder.md +4 -0
  11. package/agents/code-quality-agent.md +23 -85
  12. package/agents/groq-coder.md +8 -6
  13. package/hooks/blocking/__init__.py +0 -0
  14. package/hooks/blocking/code_rules_enforcer.py +93 -32
  15. package/hooks/blocking/hedging_language_blocker.py +2 -2
  16. package/hooks/blocking/state_description_blocker.py +243 -0
  17. package/hooks/blocking/tdd_enforcer.py +94 -0
  18. package/hooks/blocking/test_code_rules_enforcer_unused_imports.py +158 -0
  19. package/hooks/blocking/test_hedging_language_blocker.py +1 -1
  20. package/hooks/blocking/test_state_description_blocker.py +618 -0
  21. package/hooks/blocking/test_tdd_enforcer.py +152 -0
  22. package/hooks/config/state_description_blocker_constants.py +130 -0
  23. package/hooks/hooks.json +10 -0
  24. package/package.json +1 -1
  25. package/rules/no-historical-clutter.md +31 -10
  26. package/scripts/config/groq_bugteam_config.py +13 -5
  27. package/skills/bugteam/CONSTRAINTS.md +20 -27
  28. package/skills/bugteam/EXAMPLES.md +1 -1
  29. package/skills/bugteam/PROMPTS.md +60 -31
  30. package/skills/bugteam/SKILL.md +47 -47
  31. package/skills/bugteam/SKILL_EVALS.md +8 -8
  32. package/skills/bugteam/reference/github-pr-reviews.md +31 -31
  33. package/skills/bugteam/reference/team-setup.md +1 -1
  34. package/skills/bugteam/reference/teardown-publish-permissions.md +4 -4
  35. package/skills/copilot-review/SKILL.md +7 -14
  36. package/skills/findbugs/SKILL.md +2 -2
  37. package/skills/fixbugs/SKILL.md +1 -1
  38. package/skills/monitor-open-prs/SKILL.md +6 -6
  39. package/skills/pr-converge/SKILL.md +7 -6
  40. package/skills/pr-converge/reference/convergence-gates.md +28 -30
  41. package/skills/pr-converge/reference/examples.md +4 -4
  42. package/skills/pr-converge/reference/fix-protocol.md +6 -8
  43. package/skills/pr-converge/reference/multi-pr-orchestration.md +10 -10
  44. package/skills/pr-converge/reference/per-tick.md +18 -33
  45. package/skills/pr-converge/reference/stop-conditions.md +7 -7
  46. package/skills/pr-converge/scripts/README.md +65 -117
  47. package/skills/pr-review-responder/EXAMPLES.md +2 -2
  48. package/skills/pr-review-responder/PRINCIPLES.md +2 -8
  49. package/skills/pr-review-responder/README.md +7 -48
  50. package/skills/pr-review-responder/SKILL.md +2 -3
  51. package/skills/pr-review-responder/TESTING.md +8 -65
  52. package/skills/qbug/SKILL.md +10 -16
  53. package/_shared/pr-loop/scripts/config/gh_util_constants.py +0 -31
  54. package/_shared/pr-loop/scripts/gh_util.py +0 -193
  55. package/_shared/pr-loop/scripts/tests/test_gh_util.py +0 -257
  56. package/_shared/pr-loop/scripts/tests/test_gh_util_constants.py +0 -61
  57. package/skills/pr-converge/scripts/check_pr_mergeability.py +0 -78
  58. package/skills/pr-converge/scripts/config/pr_converge_constants.py +0 -134
  59. package/skills/pr-converge/scripts/config/test_pr_converge_constants.py +0 -152
  60. package/skills/pr-converge/scripts/fetch_bugbot_inline_comments.py +0 -70
  61. package/skills/pr-converge/scripts/fetch_bugbot_reviews.py +0 -57
  62. package/skills/pr-converge/scripts/fetch_claude_inline_comments.py +0 -70
  63. package/skills/pr-converge/scripts/fetch_claude_reviews.py +0 -61
  64. package/skills/pr-converge/scripts/fetch_copilot_inline_comments.py +0 -70
  65. package/skills/pr-converge/scripts/fetch_copilot_reviews.py +0 -61
  66. package/skills/pr-converge/scripts/mark_pr_ready.py +0 -54
  67. package/skills/pr-converge/scripts/post-bugbot-run.helpers.ps1 +0 -49
  68. package/skills/pr-converge/scripts/post-bugbot-run.ps1 +0 -33
  69. package/skills/pr-converge/scripts/reply_to_inline_comment.py +0 -84
  70. package/skills/pr-converge/scripts/request_copilot_review.py +0 -71
  71. package/skills/pr-converge/scripts/resolve_pr_head.py +0 -58
  72. package/skills/pr-converge/scripts/review_field_helpers.py +0 -43
  73. package/skills/pr-converge/scripts/reviewer_fetch_core.py +0 -153
  74. package/skills/pr-converge/scripts/reviewer_specs.py +0 -98
  75. package/skills/pr-converge/scripts/test_check_pr_mergeability.py +0 -126
  76. package/skills/pr-converge/scripts/test_fetch_bugbot_inline_comments.py +0 -443
  77. package/skills/pr-converge/scripts/test_fetch_bugbot_reviews.py +0 -299
  78. package/skills/pr-converge/scripts/test_fetch_claude_inline_comments.py +0 -485
  79. package/skills/pr-converge/scripts/test_fetch_claude_reviews.py +0 -368
  80. package/skills/pr-converge/scripts/test_fetch_copilot_inline_comments.py +0 -440
  81. package/skills/pr-converge/scripts/test_fetch_copilot_reviews.py +0 -366
  82. package/skills/pr-converge/scripts/test_mark_pr_ready.py +0 -69
  83. package/skills/pr-converge/scripts/test_post_bugbot_run.py +0 -195
  84. package/skills/pr-converge/scripts/test_reply_to_inline_comment.py +0 -159
  85. package/skills/pr-converge/scripts/test_request_copilot_review.py +0 -101
  86. package/skills/pr-converge/scripts/test_resolve_pr_head.py +0 -79
  87. package/skills/pr-converge/scripts/test_review_field_helpers.py +0 -80
  88. package/skills/pr-converge/scripts/test_reviewer_fetch_core.py +0 -448
  89. package/skills/pr-converge/scripts/test_reviewer_specs.py +0 -107
  90. package/skills/pr-converge/scripts/test_trigger_bugbot.py +0 -139
  91. package/skills/pr-converge/scripts/test_view_pr_context.py +0 -155
  92. package/skills/pr-converge/scripts/trigger_bugbot.py +0 -77
  93. package/skills/pr-converge/scripts/view_pr_context.py +0 -78
  94. package/skills/pr-review-responder/scripts/respond_to_reviews.py +0 -376
@@ -190,6 +190,17 @@ def test_should_deny_edit_when_pragma_sentinel_present_in_new_string_without_tes
      assert _decision_from(completed) == "deny"


+ def _make_edit_payload(file_path: Path, old_string: str, new_string: str) -> dict:
+     return {
+         "tool_name": "Edit",
+         "tool_input": {
+             "file_path": str(file_path),
+             "old_string": old_string,
+             "new_string": new_string,
+         },
+     }
+
+
  def test_should_allow_python_file_with_only_module_level_constants(tmp_path: Path) -> None:
      sandbox = _sandbox(tmp_path)
      constants_file = sandbox / "constants.py"
@@ -209,6 +220,147 @@ def test_should_allow_python_file_with_only_module_level_constants(tmp_path: Pat
      assert _decision_from(completed) == "allow"


+ def test_should_allow_edit_to_change_constant_value_in_constants_only_file(
+     tmp_path: Path,
+ ) -> None:
+     sandbox = _sandbox(tmp_path)
+     constants_file = sandbox / "constants.py"
+     constants_file.write_text(
+         '"""Module-level constants."""\n'
+         "MAXIMUM_RETRIES: int = 3\n"
+         "DEFAULT_TIMEOUT_SECONDS: float = 30.0\n"
+     )
+
+     completed = _run_hook_with_payload(
+         _make_edit_payload(
+             constants_file,
+             old_string="MAXIMUM_RETRIES: int = 3",
+             new_string="MAXIMUM_RETRIES: int = 5",
+         )
+     )
+
+     assert _decision_from(completed) == "allow"
+
+
+ def _make_multiedit_payload(file_path: Path, edits: list[dict]) -> dict:
+     return {
+         "tool_name": "MultiEdit",
+         "tool_input": {
+             "file_path": str(file_path),
+             "edits": edits,
+         },
+     }
+
+
+ def test_should_allow_multiedit_to_change_constant_value_in_constants_only_file(
+     tmp_path: Path,
+ ) -> None:
+     sandbox = _sandbox(tmp_path)
+     constants_file = sandbox / "constants.py"
+     constants_file.write_text(
+         '"""Module-level constants."""\n'
+         "MAXIMUM_RETRIES: int = 3\n"
+         "DEFAULT_TIMEOUT_SECONDS: float = 30.0\n"
+     )
+
+     completed = _run_hook_with_payload(
+         _make_multiedit_payload(
+             constants_file,
+             edits=[
+                 {
+                     "old_string": "MAXIMUM_RETRIES: int = 3",
+                     "new_string": "MAXIMUM_RETRIES: int = 5",
+                 },
+             ],
+         )
+     )
+
+     assert _decision_from(completed) == "allow"
+
+
+ def test_should_deny_multiedit_that_adds_function_to_constants_only_file(
+     tmp_path: Path,
+ ) -> None:
+     sandbox = _sandbox(tmp_path)
+     constants_file = sandbox / "constants.py"
+     constants_file.write_text(
+         '"""Module-level constants."""\n'
+         "MAXIMUM_RETRIES: int = 3\n"
+     )
+
+     completed = _run_hook_with_payload(
+         _make_multiedit_payload(
+             constants_file,
+             edits=[
+                 {
+                     "old_string": "MAXIMUM_RETRIES: int = 3",
+                     "new_string": "MAXIMUM_RETRIES: int = 3\n\ndef reset() -> None:\n return None",
+                 },
+             ],
+         )
+     )
+
+     assert _decision_from(completed) == "deny"
+
+
+ def test_should_deny_edit_that_adds_function_to_constants_only_file(
+     tmp_path: Path,
+ ) -> None:
+     sandbox = _sandbox(tmp_path)
+     constants_file = sandbox / "constants.py"
+     constants_file.write_text(
+         '"""Module-level constants."""\n'
+         "MAXIMUM_RETRIES: int = 3\n"
+     )
+
+     completed = _run_hook_with_payload(
+         _make_edit_payload(
+             constants_file,
+             old_string="MAXIMUM_RETRIES: int = 3",
+             new_string="MAXIMUM_RETRIES: int = 3\n\ndef reset() -> None:\n return None",
+         )
+     )
+
+     assert _decision_from(completed) == "deny"
+
+
+ def test_should_deny_python_file_with_assignment_calling_undefined_function(
+     tmp_path: Path,
+ ) -> None:
+     sandbox = _sandbox(tmp_path)
+     unsafe_file = sandbox / "unsafe.py"
+     unsafe_content = (
+         '"""Config with unsafe call."""\n'
+         "VALUE: str = compute()\n"
+     )
+     unsafe_file.write_text(unsafe_content)
+
+     completed = _run_hook_with_payload(
+         _make_write_payload(unsafe_file, unsafe_content)
+     )
+
+     assert _decision_from(completed) == "deny"
+
+
+ def test_should_allow_python_file_with_assignment_calling_imported_function(
+     tmp_path: Path,
+ ) -> None:
+     sandbox = _sandbox(tmp_path)
+     safe_file = sandbox / "safe.py"
+     safe_content = (
+         '"""Config with imported call."""\n'
+         "from pathlib import Path\n"
+         "BASE_PATH = Path(r'C:\\\\data')\n"
+     )
+     safe_file.write_text(safe_content)
+
+     completed = _run_hook_with_payload(
+         _make_write_payload(safe_file, safe_content)
+     )
+
+     assert _decision_from(completed) == "allow"
+
+
  def test_should_deny_python_file_when_any_function_definition_is_present(tmp_path: Path) -> None:
      sandbox = _sandbox(tmp_path)
      mixed_file = sandbox / "mixed.py"
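The allow/deny split these tests exercise (literal constants and calls to imported names pass; function definitions and calls to undefined names fail) could be decided with a small `ast` walk. The sketch below is an assumption about one possible approach, not the hook's actual logic, which this diff does not show. `is_constants_only_module` and `_calls_only_imported_names` are hypothetical names that do not appear in the package.

```python
import ast


def _calls_only_imported_names(value_node: ast.expr, imported_names: set[str]) -> bool:
    """Return True when every call inside the expression targets an imported name."""
    for each_node in ast.walk(value_node):
        if isinstance(each_node, ast.Call):
            callee = each_node.func
            # Unwrap attribute chains such as os.path.join down to the root name.
            while isinstance(callee, ast.Attribute):
                callee = callee.value
            if not (isinstance(callee, ast.Name) and callee.id in imported_names):
                return False
    return True


def is_constants_only_module(source_text: str) -> bool:
    """Hypothetical check: imports, a docstring, and safe assignments only."""
    imported_names: set[str] = set()
    for each_statement in ast.parse(source_text).body:
        if isinstance(each_statement, (ast.Import, ast.ImportFrom)):
            for each_alias in each_statement.names:
                bound_name = each_alias.asname or each_alias.name
                imported_names.add(bound_name.split(".")[0])
            continue
        if isinstance(each_statement, ast.Expr) and isinstance(each_statement.value, ast.Constant):
            continue  # module docstring or a bare literal
        if isinstance(each_statement, (ast.Assign, ast.AnnAssign)):
            assigned_value = each_statement.value
            if assigned_value is not None and not _calls_only_imported_names(
                assigned_value, imported_names
            ):
                return False
            continue
        return False  # def, class, control flow, and so on
    return True
```

Under this sketch, an edit would be evaluated by applying the `old_string` to `new_string` substitution to the file text first and then checking the result, which matches how the Edit and MultiEdit tests above distinguish a changed value from an added function.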
package/hooks/config/state_description_blocker_constants.py ADDED
@@ -0,0 +1,130 @@
+ """Configuration constants for the state_description_blocker PreToolUse hook."""
+
+ from re import IGNORECASE, Pattern, compile
+
+ ALL_COMMENT_TRANSITION_PATTERNS: list[Pattern[str]] = [
+     compile(r"\binstead of\b", IGNORECASE),
+     compile(r"\bpreviously\b", IGNORECASE),
+     compile(r"\bnow uses\b", IGNORECASE),
+     compile(r"\bnow does\b", IGNORECASE),
+     compile(r"\bnow handles\b", IGNORECASE),
+     compile(r"\bnow supports\b", IGNORECASE),
+     compile(r"\bnow names\b", IGNORECASE),
+     compile(r"\bnow includes\b", IGNORECASE),
+     compile(r"\bwas previously\b", IGNORECASE),
+     compile(r"\bwere previously\b", IGNORECASE),
+     compile(r"\bwas formerly\b", IGNORECASE),
+     compile(r"\bwas added\b", IGNORECASE),
+     compile(r"\bused to\b", IGNORECASE),
+     compile(r"\bno longer\b", IGNORECASE),
+     compile(r"\bhas been updated\b", IGNORECASE),
+     compile(r"\bhave been updated\b", IGNORECASE),
+     compile(r"\bhas been changed\b", IGNORECASE),
+     compile(r"\bhave been changed\b", IGNORECASE),
+     compile(r"\breplaced by\b", IGNORECASE),
+     compile(r"\breplaces\b", IGNORECASE),
+     compile(r"\bsuperseded by\b", IGNORECASE),
+     compile(r"\bsupersedes\b", IGNORECASE),
+     compile(r"\bchanged from\b", IGNORECASE),
+     compile(r"\bchanges from\b", IGNORECASE),
+     compile(r"\bswitched from\b", IGNORECASE),
+     compile(r"\bswitched to\b", IGNORECASE),
+     compile(r"\bmigrated from\b", IGNORECASE),
+     compile(r"\bmigrated to\b", IGNORECASE),
+     compile(r"\bmoved to\b", IGNORECASE),
+     compile(r"\bmoved into\b", IGNORECASE),
+     compile(r"\bextracted as\b", IGNORECASE),
+     compile(r"\bupdated to\b", IGNORECASE),
+     compile(r"\boriginally\b", IGNORECASE),
+     compile(r"\bas of\b", IGNORECASE),
+ ]
+
+ CODE_FENCE_PATTERN: Pattern[str] = compile(r"```[\s\S]*?```")
+ INLINE_CODE_PATTERN: Pattern[str] = compile(r"``[^`]+``|`[^`]+`")
+
+ ALL_MARKDOWN_EXTENSIONS: frozenset[str] = frozenset(
+     {".md", ".mdx", ".markdown", ".rmd"}
+ )
+
+ ALL_HASH_ONLY_EXTENSIONS: frozenset[str] = frozenset(
+     {
+         ".py",
+         ".rb",
+         ".sh",
+         ".bash",
+         ".zsh",
+         ".ps1",
+         ".psm1",
+         ".yaml",
+         ".yml",
+         ".tf",
+     }
+ )
+
+ ALL_BLOCK_COMMENT_ONLY_EXTENSIONS: frozenset[str] = frozenset(
+     {
+         ".css",
+     }
+ )
+
+ ALL_HASH_AND_SLASH_EXTENSIONS: frozenset[str] = frozenset(
+     {
+         ".php",
+     }
+ )
+
+ ALL_BLOCK_COMMENT_EXTENSIONS: frozenset[str] = frozenset(
+     {
+         ".js",
+         ".jsx",
+         ".ts",
+         ".tsx",
+         ".java",
+         ".c",
+         ".cpp",
+         ".h",
+         ".hpp",
+         ".rs",
+         ".go",
+         ".swift",
+         ".kt",
+         ".scala",
+         ".php",
+         ".css",
+         ".scss",
+         ".less",
+     }
+ )
+
+ ALL_COMMENT_BEARING_EXTENSIONS: frozenset[str] = frozenset(
+     {
+         ".py",
+         ".js",
+         ".jsx",
+         ".ts",
+         ".tsx",
+         ".java",
+         ".c",
+         ".cpp",
+         ".h",
+         ".hpp",
+         ".rs",
+         ".go",
+         ".rb",
+         ".php",
+         ".swift",
+         ".kt",
+         ".scala",
+         ".sh",
+         ".bash",
+         ".zsh",
+         ".ps1",
+         ".psm1",
+         ".yaml",
+         ".yml",
+         ".tf",
+         ".css",
+         ".scss",
+         ".less",
+     }
+ )
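As a rough illustration of how constants like these might be combined, the sketch below strips fenced and inline code from Markdown text before searching for transition phrases. It is a guess at the general technique, not the body of `state_description_blocker.py`, which this section does not show. `find_transition_phrases`, `FENCE`, and `SAMPLE_TRANSITION_PATTERNS` are hypothetical names, and only three of the patterns are reproduced.

```python
from re import IGNORECASE, Pattern, compile

# Built at runtime so the example never embeds a literal fence marker.
FENCE = "`" * 3

# A three-entry sample; the list in the constants module is much longer.
SAMPLE_TRANSITION_PATTERNS: list[Pattern[str]] = [
    compile(r"\binstead of\b", IGNORECASE),
    compile(r"\bno longer\b", IGNORECASE),
    compile(r"\bswitched to\b", IGNORECASE),
]
CODE_FENCE_PATTERN: Pattern[str] = compile(FENCE + r"[\s\S]*?" + FENCE)
INLINE_CODE_PATTERN: Pattern[str] = compile(r"``[^`]+``|`[^`]+`")


def find_transition_phrases(markdown_text: str) -> list[str]:
    """Hypothetical helper: return flagged phrases found outside code spans."""
    without_fences = CODE_FENCE_PATTERN.sub("", markdown_text)
    prose_only = INLINE_CODE_PATTERN.sub("", without_fences)
    all_matches: list[str] = []
    for each_pattern in SAMPLE_TRANSITION_PATTERNS:
        all_matches.extend(found.group(0) for found in each_pattern.finditer(prose_only))
    return all_matches
```

If the hook does remove code spans in this way, a phrase quoted in backticks would not be flagged, which may be why the updated rule file wraps its own prohibited-pattern examples in inline code.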
package/hooks/hooks.json CHANGED
@@ -30,6 +30,11 @@
      "command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/validation/hook_format_validator.py",
      "timeout": 15
    },
+   {
+     "type": "command",
+     "command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/blocking/code_rules_enforcer.py",
+     "timeout": 30
+   },
    {
      "type": "command",
      "command": "python3 -c \"import sys; sys.path.insert(0, r'${CLAUDE_PLUGIN_ROOT}/hooks'); from validators.run_all_validators import main; sys.exit(main())\"",
@@ -44,6 +49,11 @@
      "type": "command",
      "command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/blocking/windows_rmtree_blocker.py",
      "timeout": 10
+   },
+   {
+     "type": "command",
+     "command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/blocking/state_description_blocker.py",
+     "timeout": 10
    }
  ]
  },
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "claude-dev-env",
-   "version": "1.37.1",
+   "version": "1.38.1",
    "description": "Claude Code development standards — rules, hooks, agents, commands, and skills",
    "type": "module",
    "bin": {
  "bin": {
@@ -1,25 +1,46 @@
1
1
  ---
2
- paths: **/*.md
2
+ paths: **/*
3
3
  ---
4
4
 
5
- # No Historical Clutter in Documentation
5
+ # No Historical Clutter in Documentation or Comments
6
6
 
7
- **When this applies:** Any Write or Edit to `.md` files.
7
+ **When this applies:** Any Write or Edit to files containing comments or documentation.
8
+
9
+ **Hook enforcement:** `state-description-blocker` (PreToolUse on Write|Edit) blocks historical/comparative language automatically. See `hooks.json` for registration.
8
10
 
9
11
  ## Rule
10
12
 
11
- Never reference removed implementations, old defaults, prior behaviors, or how something "used to be" when updating documentation. The current state is all that matters.
13
+ Never reference removed implementations, old defaults, prior behaviors, or how something `"used to be"` when updating documentation. The current state is all that matters.
12
14
 
13
15
  ## Examples of prohibited patterns
14
16
 
17
+ ### In documentation (.md files)
18
+
15
19
  | Pattern | Why it's clutter |
16
20
  |---------|-----------------|
17
- | "instead of 30" in a pagination rule | The old default no longer exists in code; the rule reader doesn't need to know what it was |
18
- | "previously this used X" | If X is gone, it's noise |
19
- | "before this rule, we did Y" | The rule exists now; the before-state is irrelevant |
20
- | "migrated from Z to W" | If Z is fully removed, the migration story is git history, not documentation |
21
- | "the old implementation did A" | If A is gone, the reader gains nothing from knowing it existed |
22
- | "originally" / "used to be" | Same — dead context |
21
+ | `` `"instead of 30"` `` in a pagination rule | The old default `no longer` exists in code; the rule reader doesn't need to know what it was |
22
+ | `` `"previously this used X"` `` | If X is gone, it's noise |
23
+ | `` `"before this rule, we did Y"` `` | The rule exists now; the before-state is irrelevant |
24
+ | `` `"migrated from Z to W"` `` | If Z is fully removed, the migration story is git history, not documentation |
25
+ | `` `"the old implementation did A"` `` | If A is gone, the reader gains nothing from knowing it existed |
26
+ | `` `"originally"` `` / `` `"used to be"` `` | Same — dead context |
27
+
28
+ ### In code comments
29
+
30
+ | Pattern | Good replacement |
31
+ |---------|-----------------|
32
+ | `# Uses X instead of Y` | `# Uses X` |
33
+ | `# Previously configured via Z` | `# Configured via Z` |
34
+ | `# Now uses the new API client` | `# Uses the new API client` |
35
+ | `# No longer supports legacy mode` | `# Supports modern mode only` |
36
+ | `// Switched to async processing` | `// Processes asynchronously` |
37
+ | `# Replaced by the cache layer` | `# Cache layer handles reads` |
38
+
39
+ ### Hook-detected patterns
40
+
41
+ The `state-description-blocker` hook (PreToolUse on Write\|Edit) enforces these patterns automatically:
42
+
43
+ `instead of`, `previously`, `now uses/does/handles/supports/names/includes`, `was previously`, `were previously`, `was formerly`, `was added`, `used to`, `no longer`, `has/have been updated/changed`, `replaced by`, `replaces`, `superseded by`, `supersedes`, `changed from`, `changes from`, `switched from/to`, `migrated from/to`, `moved to/into`, `extracted as`, `updated to`, `originally`, `as of`
23
44
 
24
45
  ## What IS allowed
25
46
 
package/scripts/config/groq_bugteam_config.py CHANGED
@@ -22,7 +22,7 @@ GROQ_RETRY_BACKOFF_SECONDS = (2, 4, 8)
  REVIEW_BODY_HEADER_TEMPLATE = "## groq-bugteam audit: {p0} P0 / {p1} P1 / {p2} P2"
  NO_FINDINGS_REVIEW_BODY = (
      "## groq-bugteam audit: clean\n\n"
-     "Groq ({model}) reviewed the diff against categories A-J and found no issues."
+     "Groq ({model}) reviewed the diff against categories A-K and found no issues."
  )

  AUDIT_SYSTEM_PROMPT = """You are an adversarial code reviewer auditing a pull request diff.
@@ -31,8 +31,13 @@ Inspect ONLY lines added or modified in the diff. Pre-existing code on
  untouched lines is out of scope. Cite file:line for every finding -- the line
  number MUST refer to the NEW side of the diff (post-change line number).

- Investigate these ten categories. Skip a category silently when you find
- nothing; do not emit verified-clean entries.
+ Investigate these eleven categories. Skip a category silently when you find
+ nothing; do not emit verified-clean entries. For the canonical rubric and
+ sub-bucket decomposition for each category, see
+ packages/claude-dev-env/audit-rubrics/category_rubrics/. For ready-to-send
+ Variant C audit prompts (each containing a PR/repo-independent generalized
+ skeleton above a `---` separator and a worked example against an authentic
+ PR below it), see packages/claude-dev-env/audit-rubrics/prompts/.

  A. API contract verification (signatures, return types, async/await)
  B. Selector / query / engine compatibility
@@ -44,6 +49,9 @@ G. Off-by-one, bounds, integer overflow
  H. Security boundaries (injection, path traversal, auth bypass, secret leakage)
  I. Concurrency hazards (race conditions, missing awaits, shared mutable state)
  J. Magic values and configuration drift
+ K. Codebase conflicts (a change updates one site of a pattern but a parallel
+ site in unchanged code stays stale, producing contradictory behavior;
+ diff is internally consistent, bug emerges only against unchanged code)

  Severity rubric:
  - P0: crashes, data loss, security breach, broken production invariant
@@ -56,7 +64,7 @@ Respond with JSON only -- no prose outside the JSON object. Shape:
    "findings": [
      {
        "severity": "P0" | "P1" | "P2",
-       "category": "A" | ... | "J",
+       "category": "A" | ... | "K",
        "file": "relative path from repo root",
        "line": int,
        "title": "one-line summary",
@@ -126,7 +134,7 @@ SPEC_IMPLEMENTER_SYSTEM_PROMPT = """<groq_spec_implementer>

  - finding_index (int, stable across audit and fix)
  - severity (P0 | P1 | P2)
- category (single letter A–J)
+ category (single letter A–K)
  - file (relative path, must match the file being patched)
  - target_line_start (int, 1-based, inclusive)
  - target_line_end (int, 1-based, inclusive; equals target_line_start for single-line edits)
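Since the prompt now asks for categories A through K, a consumer of the JSON response would need to accept the new letter. The sketch below shows one way such a check might look; it is an assumption for illustration, not code from `groq_bugteam_config.py`. `collect_invalid_finding_indexes` and both constant names are hypothetical, and only the `severity`, `category`, `file`, and `line` fields visible in the prompt are checked.

```python
import json
import string

ALL_SEVERITIES: frozenset[str] = frozenset({"P0", "P1", "P2"})
# The first eleven uppercase letters: "A" through "K".
ALL_CATEGORIES: frozenset[str] = frozenset(string.ascii_uppercase[:11])


def collect_invalid_finding_indexes(response_text: str) -> list[int]:
    """Hypothetical validator: return positions of findings that break the shape."""
    parsed_response = json.loads(response_text)
    all_invalid_indexes: list[int] = []
    for each_index, each_finding in enumerate(parsed_response.get("findings", [])):
        is_valid = (
            each_finding.get("severity") in ALL_SEVERITIES
            and each_finding.get("category") in ALL_CATEGORIES
            and isinstance(each_finding.get("file"), str)
            and isinstance(each_finding.get("line"), int)
        )
        if not is_valid:
            all_invalid_indexes.append(each_index)
    return all_invalid_indexes
```

Deriving the category set from a single count keeps the accepted range in one place, which is the kind of drift the new category K describes: a prompt updated to eleven categories while a validator elsewhere still stops at J.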
package/skills/bugteam/CONSTRAINTS.md CHANGED
@@ -1,35 +1,28 @@
- # Bugteam — invariants and design rationale
-
- ## Constraints
-
- - **One run per invocation, multi-PR supported.** All PRs in a single /bugteam invocation share one `run_temp_dir`. Per-PR identity lives in the subagent name prefix (`bugfind-pr<N>-loop<L>` / `bugfix-pr<N>-loop<L>`) and the `<run_temp_dir>/pr-<N>/` subfolder containing that PR's git worktree, diff patches, and outcome XML files.
- - **Grant before any spawn, revoke before any return.** Step 0 grants project `.claude/**` permissions; Step 5 revokes. Both are mandatory. Revoke runs on every exit path including error, cap-reached, and stuck.
- - **Fresh subagent per loop.** Both bugfind and bugfix are spawned new each loop. Reusing a subagent across loops accumulates context inside that subagent's window defeats clean-room.
- - **One up-front confirmation = whole cycle.** The `/bugteam` invocation authorizes the entire cycle; every subsequent decision runs on that single authorization.
- - **10-loop hard cap.** Counted as **AUDIT** completions (increment in Step 3). Standards-fix passes before an audit do not advance `loop_count`. Worst case includes extra clean-coder spawns for the code-rules gate.
- - **Code rules gate before every AUDIT.** Run `_shared/pr-loop/scripts/code_rules_gate.py` (resolved via `${CLAUDE_SKILL_DIR}/../../_shared/pr-loop/scripts/code_rules_gate.py`) until exit **0** before spawning **bugfind**. Same `validate_content` logic as `hooks/blocking/code_rules_enforcer.py`.
- - **Clean-room audits, every loop.** Each bugfind subagent's spawn prompt contains only the PR scope, audit rubric, and the current loop number. Prior loop history stays in the lead.
- - **Targeted fixes.** Each fix subagent sees ONLY the most recent audit's findings. Prior loops are invisible to the fix subagent.
- - **Opus 4.7 at xhigh effort for validator and fix subagents.** Single-auditor mode, validator, and fix spawns pass `model="opus"`; parallel-auditor siblings (`-b` through `-k`) pass `model="haiku"`. Opus 4.7's default effort level in Claude Code is `xhigh` (https://code.claude.com/docs/en/model-config — *"On Opus 4.7, the default effort is `xhigh` for all plans and providers."*), so no `effort` override is needed at spawn time. Effort is set per-subagent in YAML frontmatter, not via the `Agent` tool's parameters; `code-quality-agent` and `clean-coder` rely on the model default. The trade vs Sonnet is higher per-loop cost in exchange for deeper audit recall and stronger fix correctness on bug-hunting work, which the per-PR loop economics tolerate (10-loop hard cap bounds total spend).
- - **Fix subagent receives the latest audit as its input contract.** Passing the audit's findings to the fix subagent is the input contract — each loop's fix run operates on the current audit's output and only that.
- - **One commit per fix action.** Loops produce one commit per loop, not one per bug.
- - **Linear branch, fixed PR base.** Every loop appends one forward-only commit; existing commits and the PR base stay intact throughout the cycle.
- - **Lead-only cleanup.** Cleanup runs in the lead (this session) only. Step 4 removes the full `<run_temp_dir>` so no loop patches leak between runs.
- - **Cleanup all `.bugteam-*` files on exit.** The per-run `<run_temp_dir>` is removed entirely by Step 4, which covers `<run_temp_dir>/pr-<N>/loop-<L>.patch` and `<run_temp_dir>/pr-<N>/loop-<L>-<letter>.outcomes.xml`. The per-loop outcomes XML at `<worktree_path>/.bugteam-pr<N>-loop<L>.outcomes.xml` is removed with the worktree. Step 4.5 deletes `.bugteam-final.diff`, `.bugteam-original-body.md`, and `.bugteam-final-body.md`. Working directory ends clean.
- - **Audit/fix comment posting.** The bugfind subagent posts ONE per-loop review (parent body + child finding comments in a single batched POST, with review-fallback to a top-level issue comment). The bugfix subagent posts the fix replies after committing. All comment, review, and reply POSTs belong to the subagents; the lead's single PR-write action is the final description rewrite at Step 4.5.
- - **Lead owns the final PR description rewrite only** (Step 4.5), and only via the `pr-description-writer` agent. The lead does not compose the description inline.
+ # Bugteam constraints
+
+ ## Non-Negotiable
+
+ - **Pre-flight is mandatory.** `preflight.py` must exit 0 before Step 0. If it fails for `core.hooksPath`, auto-remediate with `fix_hookspath.py`. All other failures require manual fixes.
+ - **Looping against a fixed known count.** 10 audit loops hard cap. No exceptions. The cap is a safety value, set high enough to converge on most non-trivial PRs while preventing infinite loops.
+ - **`loop_count` is the iteration counter.** It increments before each AUDIT in Step 3. A FIX without a preceding AUDIT does not advance `loop_count`. The `loop_count > 10` check runs before each AUDIT. After 10 AUDITs, the cycle exits regardless of remaining FIX rounds. Standards-fix passes before an audit do not advance `loop_count`.
  - **One review per loop, findings as child comments of that review.** Each loop posts a single pull-request review whose body is the loop header and whose `comments[]` are the anchored findings. Each loop's review stands alone — one review created per loop, fully self-contained on the PR conversation.
  - **PR description rewrite on every exit.** Step 4.5 runs on `converged`, `cap reached`, and `stuck`. On `error`, the rewrite is best-effort; if it fails, surface the error in the final report and continue to revoke.
- - **Outcome XML, not JSON.** Both subagents write structured outcome data (findings or fix outcomes) to `.bugteam-pr<N>-loop<L>.outcomes.xml`. The lead reads these files between actions. XML chosen for parser robustness against multi-line, special-character, and quoted reason fields.
+ - **Outcome XML, not JSON.** The AUDIT subagent writes findings to `.bugteam-pr<N>-loop<L>.outcomes.xml` and the FIX subagent writes fix outcomes to `.bugteam-pr<N>-loop<L>.fix-outcomes.xml`. The lead reads these files between actions. Separate paths prevent the FIX output from overwriting the AUDIT's findings file. XML chosen for parser robustness against multi-line, special-character, and quoted reason fields.

  ## Why this design

- The three sibling skills compose, but `/bugteam` solves a problem they cannot solve in sequence:
+ ### Why retry with fix why not just reject and move on
+
+ Bugteam's purpose is to make real PRs better before they ship, not to just point out problems. A review that says "fix this bug" without giving the author&#60;subagent&#62; a chance to fix it in the same session would be a weaker intervention — the PR author still has to go back, figure out the fix, apply it, re-push, and re-trigger review. By bundling fix attempts into the same loop, bugteam reduces round-trips from N audits + N manual fix cycles to N audits + N automated fix attempts, with no human context-switching.
+
+ ### Why 10 loops — why not unlimited
+
+ A PR that needs more than 10 audit-fix rounds has deeper problems than bugteam can address. The 10-loop cap is a forcing function: after 10 rounds, escalate to `/findbugs` or human review rather than grinding on diminishing returns.
+
+ ### Why outcome XML — why not JSON

- - `/findbugs` audits once and stops.
- - `/fixbugs` fixes the findings of one audit and stops.
- - A human-driven `/findbugs` → `/fixbugs` → `/findbugs` → `/fixbugs` cycle works but requires the user to drive it.
+ JSON escapes `\n` inside `"reason": "could not address: some\nmulti-line\ntext"`, making the file hard to read and grep. XML preserves the raw text as element content, so `&#60;reason&#62;could not address: some&#10;multi-line&#10;text&#60;/reason&#62;` renders legibly in every markdown-capable viewer. The choice is ergonomic, not technical — both formats carry the same information.

- `/bugteam` automates that cycle. The clean-room property is preserved by spawning a fresh audit agent each loop with no inherited context — every audit is independent of the prior loop's verdict. The 10-loop cap is the safety: pathological cases (audit agent oscillating, fix agent regressing) cannot run away.
+ ### Why sibling auditor paths diverge (worktree vs temp)

- The single up-front confirmation is the explicit trade — `/bugteam` is more autonomous than `/findbugs`+`/fixbugs` chained manually. The user accepts that autonomy by typing the command. Stop conditions and the loop log give the user full visibility on exit.
+ Only the -a validator writes to the worktree `.bugteam-pr&#60;N&#62;-loop&#60;L&#62;.outcomes.xml` path, which the lead reads. Sibling auditors (-b through -k) write to unique paths under `&#60;run_temp_dir&#62;` to avoid collisions. Without this split, parallel haiku auditors writing to the same path would clobber each other's output, and the lead consuming one path would see only whichever writer finished last.
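The readability argument made for outcome XML in the hunk above can be checked with the standard library alone. The sketch below serializes the same multi-line reason both ways. The `reason` element and the `could_not_address` status come from the skill documents; the `outcome` wrapper element is a hypothetical name chosen for illustration, and the real outcome schema is not shown in this diff.

```python
import json
import xml.etree.ElementTree as ElementTree

reason_text = "could not address: some\nmulti-line\ntext"

# XML: the reason is stored as element content, so newlines stay literal.
outcome_element = ElementTree.Element("outcome", {"status": "could_not_address"})
ElementTree.SubElement(outcome_element, "reason").text = reason_text
xml_text = ElementTree.tostring(outcome_element, encoding="unicode")

# JSON: the same reason is stored as a string value, so newlines are escaped.
json_text = json.dumps({"reason": reason_text})

# Both formats return the original text when parsed back.
reason_from_xml = ElementTree.fromstring(xml_text).findtext("reason")
reason_from_json = json.loads(json_text)["reason"]
```

Both round trips return the original string, which fits the document's point that the choice is ergonomic: the difference is only in how the serialized file reads on disk.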
package/skills/bugteam/EXAMPLES.md CHANGED
@@ -50,7 +50,7 @@ Claude: [resolves PR #99, runs loop with partial-fix outcomes]
  `Loops: 2`
  `Unresolved findings (2): src/auth.py:45 (P1: file is generated, cannot edit); src/legacy.py:200 (P1: rewrite scope exceeds the bug)`

- The bugfix teammate writes one outcome per finding to `.bugteam-loop-2.outcomes.xml`. Findings with `status=could_not_address` carry their `<reason>` text, and the teammate posts a matching reply to each finding comment so the reviewer sees why each bug stayed open.
+ The bugfix teammate writes one outcome per finding to `.bugteam-pr99-loop2.fix-outcomes.xml`. Findings with `status=could_not_address` carry their `<reason>` text, and the teammate posts a matching reply to each finding comment so the reviewer sees why each bug stayed open.
  </example>

  <example>