claude-dev-env 1.44.0 → 1.45.0

Files changed (34)
  1. package/CLAUDE.md +9 -0
  2. package/_shared/pr-loop/scripts/code_rules_gate.py +426 -85
  3. package/_shared/pr-loop/scripts/pr_loop_shared_constants/code_rules_gate_constants.py +20 -0
  4. package/_shared/pr-loop/scripts/tests/test_code_rules_gate.py +625 -21
  5. package/_shared/pr-loop/scripts/tests/test_code_rules_gate_constants.py +15 -0
  6. package/agents/clean-coder.md +7 -1
  7. package/agents/code-quality-agent.md +8 -5
  8. package/hooks/blocking/code_rules_enforcer.py +1562 -37
  9. package/hooks/blocking/open_questions_in_plans_blocker.py +249 -0
  10. package/hooks/blocking/test_code_rules_enforcer.py +1389 -0
  11. package/hooks/blocking/test_code_rules_enforcer_banned_noun_word.py +292 -0
  12. package/hooks/blocking/test_code_rules_enforcer_cap_meta.py +46 -8
  13. package/hooks/blocking/test_code_rules_enforcer_exempt_marker_chained.py +189 -0
  14. package/hooks/blocking/test_code_rules_enforcer_function_length.py +210 -0
  15. package/hooks/blocking/test_code_rules_enforcer_tests_isolate_home_temp.py +1512 -0
  16. package/hooks/blocking/test_code_rules_enforcer_unused_imports.py +9 -5
  17. package/hooks/blocking/test_open_questions_in_plans_blocker.py +790 -0
  18. package/hooks/hooks.json +10 -0
  19. package/hooks/hooks_constants/banned_identifiers_constants.py +19 -0
  20. package/hooks/hooks_constants/code_rules_enforcer_constants.py +129 -2
  21. package/hooks/hooks_constants/open_questions_in_plans_blocker_constants.py +35 -0
  22. package/hooks/hooks_constants/test_open_questions_in_plans_blocker_constants.py +125 -0
  23. package/package.json +1 -1
  24. package/skills/_shared/pr-loop/scripts/_path_resolver.py +34 -13
  25. package/skills/_shared/pr-loop/scripts/init_loop_state.py +1 -2
  26. package/skills/_shared/pr-loop/scripts/teardown_worktrees.py +1 -4
  27. package/skills/_shared/pr-loop/scripts/test__path_resolver.py +57 -0
  28. package/skills/_shared/pr-loop/scripts/test_init_loop_state.py +48 -0
  29. package/skills/_shared/pr-loop/scripts/test_teardown_worktrees.py +59 -0
  30. package/skills/bugteam/PROMPTS.md +48 -12
  31. package/skills/bugteam/reference/team-setup.md +4 -2
  32. package/skills/bugteam/scripts/bugteam_code_rules_gate.py +487 -76
  33. package/skills/bugteam/scripts/bugteam_scripts_constants/bugteam_code_rules_gate_constants.py +22 -1
  34. package/skills/bugteam/scripts/test_bugteam_code_rules_gate.py +597 -12
@@ -102,3 +102,18 @@ def test_git_diff_name_only_null_terminated_command_prefix_includes_dash_z() ->
      )
      assert command_prefix == ("git", "diff", "--name-only", "-z")
 
+
+ def test_banned_noun_span_pattern_extracts_definition_line_and_span() -> None:
+     message = (
+         "Line 5: Identifier 'canned_results' contains banned noun word "
+         "(word: 'results') (binding span at line 1, spanning 3 lines)"
+     )
+     match = constants_module.BANNED_NOUN_VIOLATION_PATTERN.search(message)
+     assert match is not None
+     definition_line = int(
+         match.group(constants_module.BANNED_NOUN_DEFINITION_LINE_GROUP_INDEX)
+     )
+     line_span = int(match.group(constants_module.BANNED_NOUN_SPAN_GROUP_INDEX))
+     assert definition_line == 1
+     assert line_span == 3
+
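The test above pins the message shape the pattern has to parse, but the diff does not show the pattern itself. The following is a minimal sketch of a regular expression that would satisfy that test. The `SKETCH_*` names, the expression, and the group indices are assumptions for illustration only, not the package's actual `BANNED_NOUN_VIOLATION_PATTERN` or its group-index constants.

```python
from __future__ import annotations

import re

# Hypothetical stand-ins: the real pattern and group indices live in the
# package's constants module and are not visible in this diff.
SKETCH_BANNED_NOUN_VIOLATION_PATTERN = re.compile(
    r"contains banned noun word .*?"
    r"\(binding span at line (\d+), spanning (\d+) lines?\)"
)
SKETCH_DEFINITION_LINE_GROUP_INDEX = 1
SKETCH_SPAN_GROUP_INDEX = 2


def extract_binding_span(message: str) -> tuple[int, int] | None:
    """Return (definition_line, line_span), or None if the span suffix is absent."""
    match = SKETCH_BANNED_NOUN_VIOLATION_PATTERN.search(message)
    if match is None:
        return None
    return (
        int(match.group(SKETCH_DEFINITION_LINE_GROUP_INDEX)),
        int(match.group(SKETCH_SPAN_GROUP_INDEX)),
    )
```

Naming the group indices as constants, as the test does, keeps the consumer from hard-coding `1` and `2` next to a pattern that may later gain or reorder groups.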
@@ -438,7 +438,13 @@ Docstrings on functions, methods, classes, and modules are encouraged for public
 
  ## Audit Awareness
 
- Code clean-coder writes will be audited later against the A–K bug categories from `code-quality-agent`. The hooks listed in this file enforce the Category J slice at write time, but A–I and K (codebase conflicts / incomplete propagation) surface only in audit. For each category's full rubric, sub-bucket decomposition, and concrete checks, see `../audit-rubrics/category_rubrics/` (relative to this agent file). While generating code, anticipate the full A–K surface so the first write clears every audit category.
+ Code clean-coder writes will be audited later against the A–N bug categories from `code-quality-agent`. The hooks listed in this file enforce the Category J slice at write time, but A–I and K–N surface only in audit. For each category's full rubric, sub-bucket decomposition, and concrete checks, see `../audit-rubrics/category_rubrics/` (relative to this agent file). While generating code, anticipate the full A–N surface so the first write clears every audit category.
+
+ Three audit lanes deserve particular attention while generating new code:
+
+ - **Category L — Behavior-equivalence for refactors.** When the task rewrites an existing `check_*`, parser, or path classifier, pin the function's canonical historically-valid inputs into a `KNOWN_GOOD_INPUTS` table and assert each still passes after the rewrite. Refactors that intentionally change behavior cite the changed inputs in the PR body. New checks without prior behavior require no equivalence table.
+ - **Category M — Producer/consumer cardinality vs collection-type contract.** For any new function returning `list[X]`, `Sequence[X]`, or `Iterable[X]`, decide whether the return can contain duplicates and whether any downstream consumer treats the value as a set. Subprocess-stdout parsers must return `frozenset[Path]` or `dict.fromkeys`-deduplicated `list[Path]`. Functions whose only consumer is `extend(...)` into a list pass; functions with explicit "duplicates preserved" docstring text pass.
+ - **Category N — Test-name scenario verifier.** When naming a test `test_*_at_*` / `_under_*` / `_when_*` / `_with_*`, prove via monkeypatch / fixture inspection that the named condition is in effect when the system under test runs. For path-decision functions (anything registered in `*_path_exemptions.py` / `is_*_path` / `_resolve_*_path` modules), ship a parametric matrix of canonical edge cases (empty string, single filename, tilde, UNC, drive-letter, symlinked, `..`-containing, trailing-slash). Tests with neutral names (`test_returns_empty_list_on_x`) are unaffected.
 
  ## What You Produce
 
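The Category L and Category M guidance in the hunk above can be sketched together. This is a rough illustration under stated assumptions: `parse_null_terminated_paths` is a hypothetical parser for NUL-separated stdout (such as `git diff --name-only -z` output), and the input-to-expected-output shape of `KNOWN_GOOD_INPUTS` is one possible layout, not something the package prescribes.

```python
from __future__ import annotations

from pathlib import Path


def parse_null_terminated_paths(raw_stdout: str) -> list[Path]:
    """Parse NUL-separated paths; duplicates are dropped, first-seen order is kept."""
    all_path_texts = [each_text for each_text in raw_stdout.split("\0") if each_text]
    # Category M: dict.fromkeys preserves insertion order, so the first
    # occurrence of each path wins and later repeats are discarded.
    return [Path(each_text) for each_text in dict.fromkeys(all_path_texts)]


# Category L (assumed layout): historically valid inputs pinned next to the
# output each one must keep producing after any rewrite of the parser.
KNOWN_GOOD_INPUTS: dict[str, list[Path]] = {
    "": [],
    "a.py\0": [Path("a.py")],
    "a.py\0b.py\0": [Path("a.py"), Path("b.py")],
    "a.py\0b.py\0a.py\0": [Path("a.py"), Path("b.py")],
}


def check_known_good_inputs() -> None:
    """Assert that every pinned input still yields its pinned output."""
    for each_input, expected_paths in KNOWN_GOOD_INPUTS.items():
        assert parse_null_terminated_paths(each_input) == expected_paths
```

Returning a deduplicated `list[Path]` rather than a `frozenset[Path]` keeps the order stable for consumers that print or diff the result; either choice fits the rule as quoted.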
@@ -9,7 +9,7 @@ color: red
 
  You audit a pull request diff for bugs and CODE_RULES.md compliance issues. You return findings; the orchestrator handles fixes.
 
- **Announce at start:** "Using code-quality-agent — auditing diff against A–K categories with CODE_RULES.md awareness."
+ **Announce at start:** "Using code-quality-agent — auditing diff against A–N categories with CODE_RULES.md awareness."
 
  ## Scope
 
@@ -19,7 +19,7 @@ Audit only added or modified lines in the diff. Pre-existing code on untouched l
 
  This agent runs in one of two modes depending on the calling prompt:
 
- - **Unscoped (default):** the prompt names no categories. Walk all of A through K and produce Shape A/B for every category.
+ - **Unscoped (default):** the prompt names no categories. Walk all of A through N and produce Shape A/B for every category.
  - **Category-restricted:** the prompt names a subset of categories ("audit only category F" or "investigate only H, I, and K"). Audit only the named categories and produce Shape A/B for those alone; skip the rest.
 
  Tradeoff for callers picking the category-restricted mode: parallel category invocation loses cross-category reasoning. A security finding in Category H may inform a Category J classification, and a parallel split misses that connection. When categories need to inform each other, prefer the unscoped mode.
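The two modes described in the hunk above amount to a small selection rule plus a coverage check. The sketch below is illustrative only: the function names and the `shape_by_category` mapping are hypothetical, since the agent file specifies the behavior in prose and this diff shows no code for it.

```python
from __future__ import annotations

import string

# Categories A through N, as named in the updated agent text.
ALL_CATEGORY_LETTERS = tuple(string.ascii_uppercase[:14])


def categories_to_audit(requested_letters: list[str] | None) -> tuple[str, ...]:
    """Unscoped mode (nothing requested) walks A-N; otherwise only the named subset."""
    if not requested_letters:
        return ALL_CATEGORY_LETTERS
    unknown_letters = sorted(set(requested_letters) - set(ALL_CATEGORY_LETTERS))
    if unknown_letters:
        raise ValueError(f"unknown categories: {unknown_letters}")
    return tuple(
        each_letter
        for each_letter in ALL_CATEGORY_LETTERS
        if each_letter in requested_letters
    )


def categories_missing_output(
    audited_letters: tuple[str, ...], shape_by_category: dict[str, str]
) -> list[str]:
    """List audited categories that produced neither a Shape A nor a Shape B entry."""
    return [
        each_letter
        for each_letter in audited_letters
        if shape_by_category.get(each_letter) not in ("A", "B")
    ]
```

For example, `categories_to_audit(["K", "H", "I"])` yields the subset in canonical order, and `categories_missing_output` flags any audited category left without output.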
@@ -32,9 +32,9 @@ Preserve every existing comment. Findings on production code report only on new
 
  Report findings only. Author zero edits. Author zero diffs. Run zero commits or pushes. The orchestrator (and the calling skill) handles fix application, commit creation, and PR posting based on your finding list.
 
- ## Bug Categories A–K
+ ## Bug Categories A–N
 
- Every audit pass walks all eleven categories. Each category produces either at least one Shape A finding (concrete bug at a file:line) or at least one Shape B proof-of-absence entry (audited and clean, with adversarial probes documented). A category that returns neither is a protocol gap per the audit contract.
+ Every audit pass walks all fourteen categories. Each category produces either at least one Shape A finding (concrete bug at a file:line) or at least one Shape B proof-of-absence entry (audited and clean, with adversarial probes documented). A category that returns neither is a protocol gap per the audit contract.
 
  For each category's full description, examples, sub-bucket decomposition, and concrete checks, read the matching rubric in `../audit-rubrics/category_rubrics/`:
 
@@ -51,6 +51,9 @@ For each category's full description, examples, sub-bucket decomposition, and co
  | I | Concurrency hazards | `../audit-rubrics/category_rubrics/category-i-concurrency.md` |
  | J | CODE_RULES.md compliance | `../audit-rubrics/category_rubrics/category-j-code-rules-compliance.md` |
  | K | Codebase conflicts (incomplete propagation) | `../audit-rubrics/category_rubrics/category-k-codebase-conflicts.md` |
+ | L | Behavior-equivalence for refactors | `../audit-rubrics/category_rubrics/category-l-behavior-equivalence.md` |
+ | M | Producer/consumer cardinality vs collection-type contract | `../audit-rubrics/category_rubrics/category-m-producer-consumer-cardinality.md` |
+ | N | Test-name scenario verifier | `../audit-rubrics/category_rubrics/category-n-test-name-scenario-verifier.md` |
 
  Test files (`test_*.py`, `*_test.py`, `*.test.*`, `*.spec.*`, `conftest.py`, and any path under `/tests/`) are exempt from category J. The exempt path families documented in the J reference also opt out of the constants-location sub-item.
 
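The test-file exemption quoted in the hunk above reads naturally as a handful of filename globs plus a directory check. The sketch below is one way to express it, assuming case-sensitive matching on the file name and treating a leading `tests/` segment as "under `/tests/`". The function and constant names are hypothetical, and the package's own classifier may differ.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Globs copied from the exemption sentence; matched against the file name only.
TEST_FILE_NAME_GLOBS = ("test_*.py", "*_test.py", "*.test.*", "*.spec.*", "conftest.py")


def is_exempt_from_category_j(file_path: str) -> bool:
    """Return True when the path looks like a test file per the quoted rule."""
    normalized_path = file_path.replace("\\", "/")
    file_name = PurePosixPath(normalized_path).name
    if any(fnmatchcase(file_name, each_glob) for each_glob in TEST_FILE_NAME_GLOBS):
        return True
    # Prefix a slash so a relative path such as "tests/helper.py" also counts.
    return "/tests/" in f"/{normalized_path}"
```

Matching the globs against the file name rather than the whole path avoids accidental hits from directory names, while the separate `/tests/` check covers helper modules that carry no test-style name.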
@@ -110,7 +113,7 @@ A bare verified-clean label is inadequate: every Shape B entry lists the files o
 
  ## Per-Category Expectation
 
- Every category A through K is investigated. The output for each category is one of:
+ Every category A through N is investigated. The output for each category is one of:
  - one or more Shape A findings, or
  - one Shape B proof-of-absence entry with concrete files, quoted lines, and adversarial probes.