claude-dev-env 1.55.2 → 1.57.0
- package/CLAUDE.md +19 -0
- package/hooks/blocking/precommit_code_rules_gate.py +197 -0
- package/hooks/blocking/test_precommit_code_rules_gate.py +126 -0
- package/hooks/hooks.json +5 -0
- package/hooks/hooks_constants/precommit_code_rules_gate_constants.py +26 -0
- package/package.json +1 -1
- package/skills/_shared/pr-loop/prompts/pr-consistency-audit.xml +1 -0
- package/skills/_shared/pr-loop/scripts/_path_resolver.py +1 -1
- package/skills/_shared/pr-loop/scripts/build_audit_prompt.py +3 -3
- package/skills/_shared/pr-loop/scripts/build_fix_prompt.py +1 -1
- package/skills/_shared/pr-loop/scripts/init_loop_state.py +1 -1
- package/skills/_shared/pr-loop/scripts/skills_pr_loop_constants/path_resolver_constants.py +4 -0
- package/skills/_shared/pr-loop/scripts/teardown_worktrees.py +1 -1
- package/skills/_shared/pr-loop/scripts/test__path_resolver.py +11 -1
- package/skills/_shared/pr-loop/scripts/test_build_audit_prompt.py +24 -0
- package/skills/_shared/pr-loop/scripts/test_build_fix_prompt.py +49 -0
- package/skills/_shared/pr-loop/scripts/test_init_loop_state.py +23 -0
- package/skills/autoconverge/SKILL.md +20 -1
- package/skills/autoconverge/workflow/converge.mjs +79 -17
- package/skills/pr-converge/SKILL.md +22 -0
package/CLAUDE.md
CHANGED
@@ -50,6 +50,25 @@ Run every multi-step code task in two phases:
 
 Repair agents run only on reported findings; the verifier re-checks after each repair. Work lands (commit, push, draft PR) only on a clean verdict — enforced by the `verified_commit_gate` hook, which blocks `git commit`/`git push` unless a hook-minted verdict covers the current branch diff. The one exemption is mechanical, not discretionary: a diff whose every changed file is non-code or has an unchanged Python AST once docstrings are stripped (docs, docstrings, comments).
 
+## Converge & Review Loop Discipline
+
+- **Worktree isolation:** Run every PR convergence and review loop in an isolated worktree, never a shared checkout that concurrent processes may advance. Verify isolation (the working directory path includes `.claude/worktrees/`) before the first tick or round.
+- **No hedging in findings:** Findings and PR reports state verified facts only — never `likely`, `probably`, `should`, `appears to`. Verify each claim against the code before stating it; the anti-hallucination Stop hook rejects hedged responses.
+- **Tight edit scope:** Edit exactly what the task names — no whole-file rewrites, no renaming public method parameters, no changes beyond the stated task. When the user asks for a "lasting" or "reusable" fix, prefer the durable systemic fix over a one-off edit.
+- **GitHub MCP first:** The GitHub MCP (`mcp__plugin_github_github__*`) is the primary path for PR and review-thread inspection; raw `gh api` is the fallback, not the default — MCP calls work the same from any worktree.
+
+## Sub-agent Output Validation
+
+After any sub-agent returns a PR description, file list, or counts, verify each claim against the actual diff and repo state before using it. Flag and correct any invented paths, fabricated counts, or out-of-scope changes before they land in commits or PR bodies.
+
+## Git Sync Intent
+
+When asked to sync git ("get X onto origin main", "update main"), fast-forward local main to origin — do NOT commit untracked working-tree files unless explicitly told to.
+
+## Scheduled Task Cadence
+
+For scheduled/cron tasks, default to sub-hour intervals (30-minute); do not propose hourly cadences.
+
 ## Additional Non-overlapping Rules
 
 - **task_scope:** Match every action to what was explicitly requested. When intent is ambiguous, research official docs and present options via AskUserQuestion before making any changes. Proceed with edits only on explicit instruction.
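The worktree-isolation rule in this hunk lends itself to a mechanical check. The following is a minimal sketch, not code from the package: the helper name `is_isolated_worktree` and the constant `WORKTREE_MARKER` are hypothetical, and the only detail taken from the rule text is the `.claude/worktrees/` path segment.

```python
# Hypothetical helper; only the ".claude/worktrees/" segment comes from the rule text.
WORKTREE_MARKER = "/.claude/worktrees/"


def is_isolated_worktree(working_directory: str) -> bool:
    """Report whether a directory path lies under a .claude/worktrees/ folder."""
    # Normalize Windows separators, then wrap the path in slashes so the
    # marker only matches whole path segments (not e.g. "not.claude/").
    normalized_path = "/" + working_directory.replace("\\", "/").strip("/") + "/"
    return WORKTREE_MARKER in normalized_path
```

A loop driver could call this once with its current working directory and refuse to start the first round when it returns `False`.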
package/hooks/blocking/precommit_code_rules_gate.py
ADDED
@@ -0,0 +1,197 @@
+"""PreToolUse hook that runs the staged CODE_RULES gate before git commit.
+
+Intercepts Bash `git commit` invocations (including `git -C <path> commit`),
+resolves the repository root, and runs the shared code_rules_gate engine in
+``--staged`` mode over the staged files. A commit that would introduce
+CODE_RULES violations is denied with the gate's file:line report so the
+violations surface before the commit instead of stalling converge loops at
+commit time. Non-commit commands, repositories with no staged Python files,
+and clean staged changes pass through silently. A gate-engine failure denies
+the commit with the failure detail — the gate never fails open.
+"""
+
+import json
+import re
+import subprocess
+import sys
+from pathlib import Path
+
+_blocking_dir = str(Path(__file__).resolve().parent)
+if _blocking_dir not in sys.path:
+    sys.path.insert(0, _blocking_dir)
+_hooks_dir = str(Path(__file__).resolve().parent.parent)
+if _hooks_dir not in sys.path:
+    sys.path.insert(0, _hooks_dir)
+
+from block_main_commit import (  # noqa: E402
+    extract_git_working_directory,
+    is_commit_command,
+    parse_bash_command_from_stdin,
+    resolve_directory,
+)
+from hooks_constants.precommit_code_rules_gate_constants import (  # noqa: E402
+    ALL_GIT_REPOSITORY_ROOT_COMMAND,
+    ALL_STAGED_PYTHON_FILES_COMMAND,
+    GATE_RELATIVE_PATH,
+    GATE_TIMEOUT_SECONDS,
+    GIT_COMMAND_TIMEOUT_SECONDS,
+    GIT_DASH_C_COMMIT_PATTERN,
+)
+
+
+def is_git_commit_invocation(bash_command: str) -> bool:
+    """Report whether *bash_command* runs a git commit.
+
+    Matches both the plain ``git commit`` substring form and the
+    ``git -C <path> commit`` form, where the directory flag sits between
+    the two words.
+
+    Args:
+        bash_command: The Bash tool command string from the hook payload.
+
+    Returns:
+        True when the command invokes git commit; False otherwise.
+    """
+    if is_commit_command(bash_command):
+        return True
+    return re.search(GIT_DASH_C_COMMIT_PATTERN, bash_command) is not None
+
+
+def resolve_repository_root(working_directory: str | None) -> Path | None:
+    """Resolve the git repository root for the commit's working directory.
+
+    Args:
+        working_directory: Directory the commit runs in, or None for the
+            hook's current working directory.
+
+    Returns:
+        The repository root path, or None when the directory is not inside
+        a git repository or git is unavailable.
+    """
+    try:
+        completed_process = subprocess.run(
+            list(ALL_GIT_REPOSITORY_ROOT_COMMAND),
+            capture_output=True,
+            text=True,
+            timeout=GIT_COMMAND_TIMEOUT_SECONDS,
+            cwd=working_directory,
+        )
+    except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+        return None
+    if completed_process.returncode != 0:
+        return None
+    top_level_text = completed_process.stdout.strip()
+    if not top_level_text:
+        return None
+    return Path(top_level_text)
+
+
+def list_staged_python_files(repository_root: Path) -> list[str]:
+    """List repository-relative paths of staged Python files.
+
+    Args:
+        repository_root: Repository root used as the git working directory.
+
+    Returns:
+        Repository-relative paths of Python files staged for add, copy,
+        modify, or rename. Empty when the listing command fails — the
+        caller then allows the commit because git itself will surface the
+        repository problem.
+    """
+    try:
+        completed_process = subprocess.run(
+            list(ALL_STAGED_PYTHON_FILES_COMMAND),
+            capture_output=True,
+            text=True,
+            timeout=GIT_COMMAND_TIMEOUT_SECONDS,
+            cwd=str(repository_root),
+        )
+    except (subprocess.TimeoutExpired, FileNotFoundError, OSError):
+        return []
+    if completed_process.returncode != 0:
+        return []
+    return [
+        each_line.strip()
+        for each_line in completed_process.stdout.splitlines()
+        if each_line.strip()
+    ]
+
+
+def run_staged_gate(repository_root: Path) -> tuple[int, str]:
+    """Run the shared code_rules_gate engine in staged mode.
+
+    Args:
+        repository_root: Repository root passed to the gate's --repo-root.
+
+    Returns:
+        Tuple of the gate exit code and its stderr report. A missing gate
+        script or a gate timeout returns a non-zero code with an
+        explanatory message so the commit is denied rather than waved
+        through on infrastructure failure.
+    """
+    gate_path = Path(__file__).resolve().parents[2] / GATE_RELATIVE_PATH
+    if not gate_path.is_file():
+        return 1, f"precommit_code_rules_gate: gate engine missing at {gate_path}"
+    try:
+        completed_process = subprocess.run(
+            [
+                sys.executable,
+                str(gate_path),
+                "--repo-root",
+                str(repository_root),
+                "--staged",
+            ],
+            capture_output=True,
+            text=True,
+            encoding="utf-8",
+            errors="replace",
+            timeout=GATE_TIMEOUT_SECONDS,
+        )
+    except subprocess.TimeoutExpired:
+        return 1, (
+            f"precommit_code_rules_gate: gate engine timed out after {GATE_TIMEOUT_SECONDS}s"
+        )
+    return completed_process.returncode, completed_process.stderr
+
+
+def build_denial_response(gate_report: str) -> dict:
+    """Build the PreToolUse deny payload carrying the gate report.
+
+    Args:
+        gate_report: The gate's stderr report listing file:line violations.
+
+    Returns:
+        The hookSpecificOutput deny mapping for the PreToolUse protocol.
+    """
+    denial_reason = (
+        f"BLOCKED: staged files violate CODE_RULES; fix before committing.\n{gate_report.strip()}"
+    )
+    return {
+        "hookSpecificOutput": {
+            "hookEventName": "PreToolUse",
+            "permissionDecision": "deny",
+            "permissionDecisionReason": denial_reason,
+        }
+    }
+
+
+def main() -> None:
+    """Gate git commits on the staged CODE_RULES report."""
+    bash_command = parse_bash_command_from_stdin()
+    if not is_git_commit_invocation(bash_command):
+        sys.exit(0)
+    working_directory = resolve_directory(extract_git_working_directory(bash_command))
+    repository_root = resolve_repository_root(working_directory)
+    if repository_root is None:
+        sys.exit(0)
+    if not list_staged_python_files(repository_root):
+        sys.exit(0)
+    gate_exit_code, gate_report = run_staged_gate(repository_root)
+    if gate_exit_code == 0:
+        sys.exit(0)
+    print(json.dumps(build_denial_response(gate_report)))
+    sys.exit(0)
+
+
+if __name__ == "__main__":
+    main()
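The hook reports a denial through its stdout rather than its exit code: `main()` exits 0 on every path and prints a `hookSpecificOutput` mapping only when the gate fails. A caller-side reading of that contract might look like the sketch below. `interpret_hook_stdout` is a hypothetical helper, and treating empty stdout as "allow" is an assumption drawn from the pass-through branches of `main()`, not a description of the full PreToolUse protocol.

```python
import json


def interpret_hook_stdout(hook_stdout: str) -> tuple[bool, str]:
    """Return (allowed, reason) for the stdout of a PreToolUse hook run."""
    # Pass-through branches print nothing, so empty stdout is read as "allow".
    if not hook_stdout.strip():
        return True, ""
    hook_output = json.loads(hook_stdout)["hookSpecificOutput"]
    is_denied = hook_output.get("permissionDecision") == "deny"
    return (not is_denied), hook_output.get("permissionDecisionReason", "")
```

This mirrors how the package's own tests read the result: they assert a zero return code and then inspect `permissionDecision` and `permissionDecisionReason` in the parsed stdout.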
package/hooks/blocking/test_precommit_code_rules_gate.py
ADDED
@@ -0,0 +1,126 @@
+"""Behavior tests for the precommit_code_rules_gate PreToolUse hook.
+
+Each test builds a real git repository in a temporary directory, stages
+real files, and runs the hook script as a subprocess with a PreToolUse
+JSON payload on stdin — the exact production invocation path.
+"""
+
+import json
+import subprocess
+import sys
+from pathlib import Path
+
+HOOK_PATH = Path(__file__).resolve().parent / "precommit_code_rules_gate.py"
+
+CLEAN_MODULE_SOURCE = '''"""Increment helper used by the precommit gate tests."""
+
+
+def add_one(number: int) -> int:
+    """Return *number* plus one.
+
+    Args:
+        number: The integer to increment.
+
+    Returns:
+        The incremented integer.
+    """
+    return number + 1
+'''
+
+VIOLATING_MODULE_SOURCE = '''"""Module carrying a banned identifier for the precommit gate tests."""
+
+
+def compute_total() -> int:
+    """Return a fixed total.
+
+    Returns:
+        The fixed total.
+    """
+    result = 1
+    return result
+'''
+
+
+def run_git(repository_root: Path, *git_arguments: str) -> None:
+    subprocess.run(
+        ["git", "-C", str(repository_root), *git_arguments],
+        check=True,
+        capture_output=True,
+    )
+
+
+def initialize_repository(repository_root: Path) -> None:
+    run_git(repository_root, "init")
+    run_git(repository_root, "config", "user.email", "tests@example.com")
+    run_git(repository_root, "config", "user.name", "Gate Tests")
+    run_git(repository_root, "commit", "--allow-empty", "-m", "initial")
+
+
+def stage_file(repository_root: Path, relative_name: str, source_text: str) -> None:
+    (repository_root / relative_name).write_text(source_text, encoding="utf-8")
+    run_git(repository_root, "add", relative_name)
+
+
+def run_hook(bash_command: str, working_directory: Path) -> subprocess.CompletedProcess[str]:
+    payload = json.dumps({"tool_input": {"command": bash_command}})
+    return subprocess.run(
+        [sys.executable, str(HOOK_PATH)],
+        input=payload,
+        capture_output=True,
+        text=True,
+        cwd=str(working_directory),
+        timeout=120,
+    )
+
+
+def parse_denial(hook_stdout: str) -> dict:
+    return json.loads(hook_stdout)["hookSpecificOutput"]
+
+
+def test_non_commit_command_passes_through(tmp_path: Path) -> None:
+    initialize_repository(tmp_path)
+    completed_hook = run_hook("git status", tmp_path)
+    assert completed_hook.returncode == 0
+    assert completed_hook.stdout.strip() == ""
+
+
+def test_commit_with_clean_staged_python_file_is_allowed(tmp_path: Path) -> None:
+    initialize_repository(tmp_path)
+    stage_file(tmp_path, "incrementer.py", CLEAN_MODULE_SOURCE)
+    completed_hook = run_hook("git commit -m add", tmp_path)
+    assert completed_hook.returncode == 0
+    assert completed_hook.stdout.strip() == ""
+
+
+def test_commit_with_violating_staged_file_is_blocked(tmp_path: Path) -> None:
+    initialize_repository(tmp_path)
+    stage_file(tmp_path, "totals.py", VIOLATING_MODULE_SOURCE)
+    completed_hook = run_hook("git commit -m add", tmp_path)
+    assert completed_hook.returncode == 0
+    denial = parse_denial(completed_hook.stdout)
+    assert denial["permissionDecision"] == "deny"
+    assert "totals.py" in denial["permissionDecisionReason"]
+    assert "Line" in denial["permissionDecisionReason"]
+
+
+def test_git_dash_c_commit_form_is_blocked(tmp_path: Path) -> None:
+    repository_root = tmp_path / "repo"
+    repository_root.mkdir()
+    initialize_repository(repository_root)
+    stage_file(repository_root, "totals.py", VIOLATING_MODULE_SOURCE)
+    elsewhere = tmp_path / "elsewhere"
+    elsewhere.mkdir()
+    quoted_root = str(repository_root)
+    completed_hook = run_hook(f'git -C "{quoted_root}" commit -m add', elsewhere)
+    assert completed_hook.returncode == 0
+    denial = parse_denial(completed_hook.stdout)
+    assert denial["permissionDecision"] == "deny"
+    assert "totals.py" in denial["permissionDecisionReason"]
+
+
+def test_commit_with_no_staged_python_files_is_allowed(tmp_path: Path) -> None:
+    initialize_repository(tmp_path)
+    stage_file(tmp_path, "notes.md", "# Notes\n")
+    completed_hook = run_hook("git commit -m docs", tmp_path)
+    assert completed_hook.returncode == 0
+    assert completed_hook.stdout.strip() == ""
package/hooks/hooks.json
CHANGED
@@ -105,6 +105,11 @@
     "command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/blocking/block_main_commit.py",
     "timeout": 15
   },
+  {
+    "type": "command",
+    "command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/blocking/precommit_code_rules_gate.py",
+    "timeout": 30
+  },
   {
     "type": "command",
     "command": "python3 ${CLAUDE_PLUGIN_ROOT}/hooks/blocking/pr_description_enforcer.py",
package/hooks/hooks_constants/precommit_code_rules_gate_constants.py
ADDED
@@ -0,0 +1,26 @@
+"""Constants for the precommit_code_rules_gate PreToolUse hook.
+
+Command parsing, git timeouts, and the staged-gate invocation surface used
+to run the shared code_rules_gate engine before a git commit.
+"""
+
+from pathlib import Path
+
+GIT_DASH_C_COMMIT_PATTERN: str = r"git\s+-C\s+[\"']?[^\"';&|]+?[\"']?\s+commit\b"
+GIT_COMMAND_TIMEOUT_SECONDS: int = 5
+GATE_TIMEOUT_SECONDS: int = 25
+GATE_RELATIVE_PATH: Path = Path("_shared") / "pr-loop" / "scripts" / "code_rules_gate.py"
+ALL_STAGED_PYTHON_FILES_COMMAND: tuple[str, ...] = (
+    "git",
+    "diff",
+    "--cached",
+    "--name-only",
+    "--diff-filter=ACMR",
+    "--",
+    "*.py",
+)
+ALL_GIT_REPOSITORY_ROOT_COMMAND: tuple[str, ...] = (
+    "git",
+    "rev-parse",
+    "--show-toplevel",
+)
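To see which command strings `GIT_DASH_C_COMMIT_PATTERN` is meant to catch, the pattern can be exercised on its own. The sketch below copies the pattern from the constants hunk; the wrapper `matches_dash_c_commit` and the sample commands are illustrative assumptions. The real hook also consults `is_commit_command` from `block_main_commit`, which this diff does not show, so the plain `git commit` form is out of scope here.

```python
import re

# Pattern copied verbatim from the constants hunk.
GIT_DASH_C_COMMIT_PATTERN = r"git\s+-C\s+[\"']?[^\"';&|]+?[\"']?\s+commit\b"


def matches_dash_c_commit(bash_command: str) -> bool:
    """Report whether the command contains a `git -C <path> commit` form."""
    # Hypothetical wrapper: applies only the -C pattern, nothing else.
    return re.search(GIT_DASH_C_COMMIT_PATTERN, bash_command) is not None
```

The character class excludes quotes, `;`, `&`, and `|`, so the directory argument cannot run past a shell command separator, and the trailing `\b` keeps a word such as `commits` from matching.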
package/package.json
CHANGED
package/skills/_shared/pr-loop/prompts/pr-consistency-audit.xml
CHANGED
@@ -196,6 +196,7 @@
 <constraints>
   <constraint>Read every file completely. Do not skim. Do not skip any file or any line.</constraint>
   <constraint>Write findings to the temp file immediately. Do not accumulate them in memory and batch-write at the end. You will forget things.</constraint>
+  <constraint>Double-quote every path in shell commands and write paths with forward slashes (e.g. C:/Users/...), even on Windows.</constraint>
   <constraint>Every finding must cite the file and line of the problem AND the file and line of the evidence that proves it is a problem. No floating claims.</constraint>
   <constraint>When two files contradict each other, flag BOTH files. Do not guess which is correct unless a canonical source resolves it.</constraint>
   <constraint>If you cannot determine the correct value or form, flag the inconsistency and mark it "unresolvable — no canonical source found". Do not guess.</constraint>
package/skills/_shared/pr-loop/scripts/_path_resolver.py
CHANGED
@@ -124,7 +124,7 @@ def per_pr_workspace(
     slug = slugify_pr_identity(owner, repo, pr_number)
     return PerPrWorkspace(
         worktree=pr_workspace_dir / WORKTREE_DIRNAME,
-        diff_patch_template=
+        diff_patch_template=(pr_workspace_dir / slug / DIFF_PATCH_TEMPLATE).as_posix(),
         outcome_xml_template=OUTCOME_XML_TEMPLATE,
         fix_outcome_xml_template=FIX_OUTCOME_XML_TEMPLATE,
     )
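The effect of switching to `.as_posix()` is easiest to see with a Windows-flavoured path, which `pathlib.PureWindowsPath` can model on any host OS. This is a minimal sketch under stated assumptions: `build_patch_template` and `EXAMPLE_DIFF_PATCH_TEMPLATE` are stand-ins, and the `loop-{loop}.patch` file name is inferred from the test expectations in this diff rather than from the real `DIFF_PATCH_TEMPLATE` constant.

```python
from pathlib import PureWindowsPath

# Stand-in for DIFF_PATCH_TEMPLATE; the value is inferred from the tests.
EXAMPLE_DIFF_PATCH_TEMPLATE = "loop-{loop}.patch"


def build_patch_template(run_temp_dir: str, slug: str) -> str:
    """Join path segments Windows-style, then render with forward slashes."""
    workspace = PureWindowsPath(run_temp_dir) / slug
    # str(workspace) would use backslashes; as_posix() always uses "/".
    return (workspace / EXAMPLE_DIFF_PATCH_TEMPLATE).as_posix()
```

Because the rendered value is a plain string, the `{loop}` placeholder survives the join and can still be filled later with `str.format(loop=...)`.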
package/skills/_shared/pr-loop/scripts/build_audit_prompt.py
CHANGED
@@ -62,14 +62,14 @@ def build_audit_prompt_xml(
     SubElement(context, "pr_number").text = str(pr_number)
     SubElement(context, "head_ref").text = head_ref
     SubElement(context, "base_ref").text = base_ref
-    SubElement(context, "worktree_path").text =
-    SubElement(context, "run_temp_dir").text =
+    SubElement(context, "worktree_path").text = worktree_path.as_posix()
+    SubElement(context, "run_temp_dir").text = run_temp_dir.as_posix()
 
     scope = SubElement(root, "scope")
     scope.text = (
         f"Audit the full diff of {owner}/{repo}#{pr_number} "
         f"({head_ref} against {base_ref}) for CODE_RULES violations, "
-        f"bugs, and anti-patterns. Work in {worktree_path}."
+        f"bugs, and anti-patterns. Work in {worktree_path.as_posix()}."
     )
 
     bug_categories = SubElement(root, "bug_categories")
package/skills/_shared/pr-loop/scripts/build_fix_prompt.py
CHANGED
@@ -63,7 +63,7 @@ def build_fix_prompt_xml(
     SubElement(context, "pr_number").text = str(pr_number)
     SubElement(context, "head_ref").text = head_ref
     SubElement(context, "base_ref").text = base_ref
-    SubElement(context, "worktree_path").text =
+    SubElement(context, "worktree_path").text = worktree_path.as_posix()
 
     bugs_elem = SubElement(root, "bugs")
     if isinstance(findings_data, list):
package/skills/_shared/pr-loop/scripts/skills_pr_loop_constants/path_resolver_constants.py
CHANGED
@@ -26,6 +26,8 @@ ALL_AUDIT_CONSTRAINT_TEXTS = [
     "Every finding must cite file:line.",
     "Document each finding with severity, file, line, and suggested fix.",
     "Read each file in the diff before reporting on it.",
+    "Double-quote every path in shell commands and write paths with "
+    "forward slashes (e.g. C:/Users/...), even on Windows.",
 ]
 
 ALL_AUDIT_CATEGORY_ENTRIES = [
@@ -71,6 +73,8 @@ ALL_FIX_CONSTRAINT_TEXTS = [
     "Every fix must have a corresponding test.",
     "Remove deprecated code directly and update all call sites.",
     "Handle each error case with a named exception type.",
+    "Double-quote every path in shell commands and write paths with "
+    "forward slashes (e.g. C:/Users/...), even on Windows.",
 ]
 
 XML_PRETTY_INDENT = " "
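The new constraint text asks agents to double-quote paths and write them with forward slashes. One way to see why both halves matter is to build a command that follows the rule and parse it the way a POSIX-style shell would. The helpers below are hypothetical illustrations, not part of the package, and they assume the path itself contains no double-quote character.

```python
import shlex
from pathlib import PureWindowsPath


def quote_path_for_shell(raw_path: str) -> str:
    """Render a path with forward slashes and wrap it in double quotes."""
    # Assumption: raw_path contains no double-quote character.
    return '"' + PureWindowsPath(raw_path).as_posix() + '"'


def build_git_status_command(worktree_path: str) -> str:
    """Compose an example `git -C <path> status` command for a worktree."""
    return f"git -C {quote_path_for_shell(worktree_path)} status"
```

With `shlex.split` standing in for POSIX shell tokenization, the quoted forward-slash form keeps a path with spaces as a single argument, whereas an unquoted backslash path loses its separators because each backslash is consumed as an escape character.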
package/skills/_shared/pr-loop/scripts/teardown_worktrees.py
CHANGED
@@ -164,7 +164,7 @@ def main(all_arguments: list[str]) -> int:
         run_temp_dir=run_temp_dir,
         all_pr_entries=all_pr_entries,
     )
-    print(f"Removed {removed_count} worktree(s), cleaned {run_temp_dir}")
+    print(f"Removed {removed_count} worktree(s), cleaned {run_temp_dir.as_posix()}")
     return 0
 
 
package/skills/_shared/pr-loop/scripts/test__path_resolver.py
CHANGED
@@ -47,7 +47,17 @@ def test_per_pr_workspace_diff_patch_template_carries_loop_placeholder() -> None
     workspace = path_resolver.per_pr_workspace(run_temp_dir, "owner", "repo", 7)
     rendered = workspace.diff_patch_template.format(loop=3)
     assert rendered.endswith("loop-3.patch")
-    assert "owner-repo-pr-7" in rendered
+    assert "owner-repo-pr-7" in rendered
+
+
+def test_per_pr_workspace_diff_patch_template_uses_forward_slashes() -> None:
+    run_temp_dir = Path("C:/Users/jon/AppData/Local/Temp/bugteam-pr-376")
+    workspace = path_resolver.per_pr_workspace(run_temp_dir, "owner", "repo", 376)
+    assert "\\" not in workspace.diff_patch_template
+    assert workspace.diff_patch_template == (
+        "C:/Users/jon/AppData/Local/Temp/bugteam-pr-376/"
+        "pr-376/owner-repo-pr-376/loop-{loop}.patch"
+    )
 
 
 def test_per_pr_workspace_is_frozen() -> None:
package/skills/_shared/pr-loop/scripts/test_build_audit_prompt.py
CHANGED
@@ -60,6 +60,30 @@ def _build_audit_root() -> Element:
     )
 
 
+def test_context_and_scope_render_paths_with_forward_slashes() -> None:
+    root = build_audit_prompt.build_audit_prompt_xml(
+        owner="jl-cmd",
+        repo="claude-code-config",
+        pr_number=376,
+        loop=1,
+        head_ref="feat/branch",
+        base_ref="main",
+        worktree_path=Path("C:/Users/jon/AppData/Local/Temp/bugteam-pr-376/worktree"),
+        run_temp_dir=Path("C:/Users/jon/AppData/Local/Temp/bugteam-pr-376"),
+    )
+    context = root.find("context")
+    assert context is not None
+    worktree_text = context.findtext("worktree_path")
+    run_temp_text = context.findtext("run_temp_dir")
+    assert worktree_text == "C:/Users/jon/AppData/Local/Temp/bugteam-pr-376/worktree"
+    assert run_temp_text == "C:/Users/jon/AppData/Local/Temp/bugteam-pr-376"
+    scope = root.find("scope")
+    assert scope is not None
+    assert scope.text is not None
+    assert "\\" not in scope.text
+    assert "C:/Users/jon/AppData/Local/Temp/bugteam-pr-376/worktree" in scope.text
+
+
 def test_bug_categories_carry_ids_a_through_p_in_order() -> None:
     root = _build_audit_root()
     bug_categories = root.find("bug_categories")
package/skills/_shared/pr-loop/scripts/test_build_fix_prompt.py
ADDED
@@ -0,0 +1,49 @@
+"""Tests for build_fix_prompt's agent-facing path rendering."""
+
+from __future__ import annotations
+
+import importlib.util
+import json
+import sys
+from pathlib import Path
+from types import ModuleType
+
+_SCRIPTS_DIR = Path(__file__).resolve().parent
+if str(_SCRIPTS_DIR) not in sys.path:
+    sys.path.insert(0, str(_SCRIPTS_DIR))
+
+
+def _load_build_fix_prompt() -> ModuleType:
+    module_path = _SCRIPTS_DIR / "build_fix_prompt.py"
+    spec = importlib.util.spec_from_file_location("build_fix_prompt", module_path)
+    assert spec is not None
+    assert spec.loader is not None
+    module = importlib.util.module_from_spec(spec)
+    sys.modules["build_fix_prompt"] = module
+    spec.loader.exec_module(module)
+    return module
+
+
+build_fix_prompt = _load_build_fix_prompt()
+
+
+def test_context_worktree_path_renders_with_forward_slashes(tmp_path: Path) -> None:
+    findings_json_path = tmp_path / "findings.json"
+    findings_json_path.write_text(
+        json.dumps([{"severity": "P1", "file": "a.py", "line": 1}]),
+        encoding="utf-8",
+    )
+    root = build_fix_prompt.build_fix_prompt_xml(
+        owner="jl-cmd",
+        repo="claude-code-config",
+        pr_number=376,
+        loop=1,
+        head_ref="feat/branch",
+        base_ref="main",
+        worktree_path=Path("C:/Users/jon/AppData/Local/Temp/bugteam-pr-376/worktree"),
+        findings_json_path=findings_json_path,
+    )
+    context = root.find("context")
+    assert context is not None
+    worktree_text = context.findtext("worktree_path")
+    assert worktree_text == "C:/Users/jon/AppData/Local/Temp/bugteam-pr-376/worktree"
package/skills/_shared/pr-loop/scripts/test_init_loop_state.py
CHANGED
@@ -46,3 +46,26 @@ def test_create_loop_state_writes_state_under_typed_worktree(
     written_state = json.loads(state_path.read_text(encoding="utf-8"))
     assert written_state["starting_sha"] == "abc1234"
     assert written_state["loop_count"] == 0
+
+
+def test_main_prints_state_path_with_forward_slashes(
+    tmp_path: Path,
+    monkeypatch: pytest.MonkeyPatch,
+    capsys: pytest.CaptureFixture[str],
+) -> None:
+    path_resolver_module = init_loop_state.resolve_run_temp_dir.__globals__["tempfile"]
+    monkeypatch.setattr(path_resolver_module, "gettempdir", lambda: str(tmp_path))
+    exit_code = init_loop_state.main(
+        [
+            "--pr-number",
+            "422",
+            "--head-ref",
+            "feat/branch",
+            "--starting-sha",
+            "abc1234",
+        ]
+    )
+    assert exit_code == 0
+    printed_path = capsys.readouterr().out.strip()
+    assert "\\" not in printed_path
+    assert printed_path.endswith("worktree/loop-state.json")
package/skills/autoconverge/SKILL.md
CHANGED
@@ -66,7 +66,19 @@ own. The workflow runs in the background and notifies this session on
 completion. Watch live progress with `/workflows`.
 
 The workflow returns
-`{ converged, rounds, finalSha, blocker }`.
+`{ converged, rounds, finalSha, blocker, standardsNote }`. Every agent the
+workflow spawns runs on Fable 5 (`model: 'fable'`).
+
+## Budget-aware round boundaries
+
+The workflow's `budget` API is the pacing signal: when a usage target is
+set, `converge.mjs` checks `budget.remaining()` before each round and
+stops at the round boundary when one full round (three parallel lenses +
+one fix commit + re-verify) does not fit. On a budget stop the workflow
+returns `blocker: "budget"` with the run id; resume with
+`Workflow({scriptPath, resumeFromRunId})` — completed rounds replay from
+the journal. Never start a round the budget cannot finish: a half-run
+round records nothing resumable and replays dirty.
 
 ## Teardown (on workflow completion)
 
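The round-boundary rule in the "Budget-aware round boundaries" section can be sketched as a small loop. This is an illustration in Python rather than the workflow's own JavaScript, and every name and number in it is hypothetical: the diff does not show what units `budget.remaining()` reports or how `converge.mjs` estimates the cost of a round.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoundCost:
    """Hypothetical per-round cost estimate, in the same units as the budget."""

    lens_cost: int
    fix_cost: int
    reverify_cost: int

    def full_round(self) -> int:
        # One round = three parallel lenses + one fix commit + one re-verify.
        return 3 * self.lens_cost + self.fix_cost + self.reverify_cost


def run_rounds(remaining_budget: int, round_cost: RoundCost, max_rounds: int) -> dict:
    """Run whole rounds only; stop at a boundary when the next one does not fit."""
    rounds_completed = 0
    while rounds_completed < max_rounds:
        if remaining_budget < round_cost.full_round():
            # Stop before starting, so no half-run round is ever recorded.
            return {"rounds": rounds_completed, "blocker": "budget"}
        remaining_budget -= round_cost.full_round()
        rounds_completed += 1
    return {"rounds": rounds_completed, "blocker": None}
```

The point of checking before the round rather than during it is that the stop always lands between rounds, which is the property the section relies on for a clean resume.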
@@ -85,6 +97,7 @@ The workflow returns
 Rounds: <N>
 Final commit: <finalSha>
 Blocker: <blocker> # only when blocked
+Standards: <standardsNote> # only when a round deferred code-standard findings
 ```
 
 ## What the workflow does each round
@@ -100,6 +113,12 @@ run ends short of ready. Hard-won failure lessons live in
   `clean-coder` applies all fixes in a single commit, pushes, replies to and
   resolves any bot threads; re-verify next round on the new HEAD. When all
   three are clean on a stable HEAD, post the CLEAN bugteam audit artifact.
+  A round whose findings are ALL code-standard violations (pure CODE_RULES/style,
+  no behavioral impact) passes for convergence purposes: the workflow files a
+  follow-up issue listing the findings, opens a draft environment-hardening PR
+  (hooks/rules that block those violation classes at Write/Edit time), resolves
+  any bot threads with a deferral note, and reports the deferral in
+  `standardsNote`.
 - **Copilot gate:** request a Copilot review, poll up to three times; findings
   route back into Converge, a no-show after the cap is a blocker.
 - **Convergence check:** `check_convergence.py` is the authoritative gate; on a
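The deferral rule added in this hunk hinges on whether every finding in a round is a code-standard violation. A minimal Python sketch of that decision follows. The function name `classify_round` and its three return labels are hypothetical; the `category` key and its two values (`bug`, `code-standard`) follow the finding schema that `converge.mjs` declares in `LENS_SCHEMA`.

```python
def classify_round(all_findings: list[dict]) -> str:
    """Classify a review round as 'clean', 'deferred', or 'fix'."""
    if not all_findings:
        return "clean"
    # A round is deferred only when ALL findings are pure code-standard
    # violations; a single behavioral bug sends the whole round to the fixer.
    if all(each["category"] == "code-standard" for each in all_findings):
        return "deferred"
    return "fix"
```

Under this reading, a mixed round is still handled as a fix round, so code-standard findings are deferred only when nothing behavioral accompanies them.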
@@ -44,11 +44,12 @@ const LENS_SCHEMA = {
 file: { type: 'string' },
 line: { type: 'integer' },
 severity: { type: 'string', enum: ['P0', 'P1', 'P2'] },
+category: { type: 'string', enum: ['bug', 'code-standard'], description: 'code-standard for pure CODE_RULES/style violations with no behavioral impact; bug otherwise' },
 title: { type: 'string' },
 detail: { type: 'string' },
 replyToCommentId: { type: ['integer', 'null'], description: 'GitHub review comment id to reply to and resolve, or null when the finding has no thread' },
 },
-required: ['file', 'line', 'severity', 'title', 'detail', 'replyToCommentId'],
+required: ['file', 'line', 'severity', 'category', 'title', 'detail', 'replyToCommentId'],
 },
 },
 },
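To make the new required `category` field concrete, here is a minimal sketch of one finding that satisfies the item fields shown in this hunk, plus a hand-rolled check. `isValidFinding`, `SEVERITIES`, `CATEGORIES`, and the sample values are hypothetical names invented for illustration; the workflow presumably relies on its own schema handling, not on a helper like this.

```javascript
// Hypothetical helper (not part of converge.mjs): checks one finding against
// the item fields and enums visible in the LENS_SCHEMA hunk.
const SEVERITIES = ['P0', 'P1', 'P2']
const CATEGORIES = ['bug', 'code-standard']

function isValidFinding(finding) {
  return (
    typeof finding.file === 'string' &&
    Number.isInteger(finding.line) &&
    SEVERITIES.includes(finding.severity) &&
    CATEGORIES.includes(finding.category) &&
    typeof finding.title === 'string' &&
    typeof finding.detail === 'string' &&
    (finding.replyToCommentId === null || Number.isInteger(finding.replyToCommentId))
  )
}

// Sample values are invented for the example.
const sampleFinding = {
  file: 'src/example.js',
  line: 12,
  severity: 'P2',
  category: 'code-standard',
  title: 'Magic value in timeout',
  detail: 'Literal 8 should be a named constant.',
  replyToCommentId: null,
}

console.log(isValidFinding(sampleFinding)) // true

// A finding without category no longer passes, because category is now required.
const { category, ...withoutCategory } = sampleFinding
console.log(isValidFinding(withoutCategory)) // false
```

Under this reading, a lens that omits `category` would produce output that fails the schema, which is likely why each lens prompt in the later hunks spells out how to set it.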
@@ -415,7 +416,7 @@ async function resolveHead() {
 `Print the current HEAD SHA of ${prCoordinates}. Run exactly:\n` +
 `gh api repos/${input.owner}/${input.repo}/pulls/${input.prNumber} --jq .head.sha\n` +
 `Return the full 40-character SHA in the sha field. Do not modify any files.`,
-{ label: 'resolve-head', phase: 'Converge', schema: HEAD_SCHEMA, agentType: 'Explore' },
+{ model: 'fable', label: 'resolve-head', phase: 'Converge', schema: HEAD_SCHEMA, agentType: 'Explore' },
 )
 return head?.sha
 }
@@ -433,7 +434,7 @@ function prefetchMainForRound() {
 `Refresh the base ref for ${prCoordinates} so the parallel review lenses can diff against an up-to-date origin/main without each running its own fetch. Run exactly:\n` +
 `git fetch origin main\n` +
 `Do not edit, commit, push, rebase, or modify any files — fetch only.`,
-{ label: 'prefetch-main', phase: 'Converge', agentType: 'Explore' },
+{ model: 'fable', label: 'prefetch-main', phase: 'Converge', agentType: 'Explore' },
 )
 }
 
@@ -461,8 +462,8 @@ function runBugbotLens(head) {
 ` - If a clean review exists on HEAD -> return clean.\n` +
 `4. No review yet on HEAD: check_bugbot_ci.py --check-active. If active (exit 0), poll: repeat check_bugbot_ci.py --check-clean / --check-active every 60 seconds (delay each iteration with "sleep 60", or the PowerShell alternative "Start-Sleep -Seconds 60") for up to 25 iterations, then re-fetch the review. If not active (exit 1), post the literal comment "bugbot run" (no @mention, no other text) via python "${CONFIG.sharedScripts}/post_fix_reply.py" --owner ${input.owner} --repo ${input.repo} --pr-number ${input.prNumber} --body "bugbot run", delay 8 seconds with "sleep 8" (PowerShell alternative "Start-Sleep -Seconds 8"), then poll as above.\n` +
 `5. If after the full poll budget Bugbot has neither a check run nor a review on HEAD -> return {sha:${'`'}${head}${'`'}, clean:true, down:true, findings:[]} (treat as down).\n\n` +
-`Scope is the whole PR; you are only reading Bugbot's own output here. Return strictly the schema.`,
-{ label: 'lens:bugbot', phase: 'Converge', schema: LENS_SCHEMA },
+`Scope is the whole PR; you are only reading Bugbot's own output here. For each finding set category: 'code-standard' when it is a pure CODE_RULES/style violation (naming, comments, type hints, magic values, structure) with no behavioral impact; 'bug' otherwise. Return strictly the schema.`,
+{ model: 'fable', label: 'lens:bugbot', phase: 'Converge', schema: LENS_SCHEMA },
 )
 }
 
@@ -477,8 +478,8 @@ function runCodeReviewLens(head) {
 `You are the code-review lens for ${prCoordinates}, HEAD ${head}.\n\n` +
 `Review the FULL origin/main...HEAD diff — every file the PR touches. Do NOT delta-scope to recent commits or to a single file. The workflow already fetched origin/main this round, so do NOT run git fetch; run git diff --name-only origin/main...HEAD to enumerate the changed files, then review the complete diff of each.\n\n` +
 `Apply correctness-focused review: real bugs, broken logic, incorrect error handling, data-loss or security risks, contract mismatches, and reuse/simplification problems. Report only defensible findings with concrete file:line evidence.\n\n` +
-`Do NOT edit, commit, or push — reporting only. Return strictly the schema: clean=true with empty findings when the diff is sound, otherwise one entry per finding (severity P0/P1/P2, replyToCommentId=null since these are not yet GitHub threads). Set sha=${'`'}${head}${'`'}, down=false.`,
-{ label: 'lens:code-review', phase: 'Converge', schema: LENS_SCHEMA, agentType: 'code-quality-agent' },
+`Do NOT edit, commit, or push — reporting only. Return strictly the schema: clean=true with empty findings when the diff is sound, otherwise one entry per finding (severity P0/P1/P2; category 'code-standard' for pure CODE_RULES/style violations with no behavioral impact, 'bug' otherwise; replyToCommentId=null since these are not yet GitHub threads). Set sha=${'`'}${head}${'`'}, down=false.`,
+{ model: 'fable', label: 'lens:code-review', phase: 'Converge', schema: LENS_SCHEMA, agentType: 'code-quality-agent' },
 )
 }
 
@@ -493,8 +494,8 @@ function runAuditLens(head) {
 `You are the second-opinion bug-audit lens for ${prCoordinates}, HEAD ${head}.\n\n` +
 `Read the audit rubric at ${CONFIG.bugteamRubric} and apply its categories (A through P) against the FULL origin/main...HEAD diff — every file the PR touches, never a delta cut. The workflow already fetched origin/main this round, so do NOT run git fetch; run git diff --name-only origin/main...HEAD first to enumerate scope.\n\n` +
 `This is a clean-room audit: assume nothing from other lenses. Report only findings backed by concrete file:line evidence. Do NOT edit, commit, or push.\n\n` +
-`Return strictly the schema: clean=true with empty findings when the diff passes every category, otherwise one entry per finding (severity P0/P1/P2, replyToCommentId=null). Set sha=${'`'}${head}${'`'}, down=false.`,
-{ label: 'lens:bug-audit', phase: 'Converge', schema: LENS_SCHEMA, agentType: 'code-quality-agent' },
+`Return strictly the schema: clean=true with empty findings when the diff passes every category, otherwise one entry per finding (severity P0/P1/P2; category 'code-standard' for pure CODE_RULES/style violations with no behavioral impact, 'bug' otherwise; replyToCommentId=null). Set sha=${'`'}${head}${'`'}, down=false.`,
+{ model: 'fable', label: 'lens:bug-audit', phase: 'Converge', schema: LENS_SCHEMA, agentType: 'code-quality-agent' },
 )
 }
 
@@ -532,7 +533,7 @@ function applyFixes(head, findings, sourceLabel) {
 `- When you commit and push a fix: newSha=the new HEAD SHA after your push, pushed=true, resolvedWithoutCommit=false.\n` +
 `- When every finding was already addressed so no code change is needed — yet you still resolved each GitHub review thread above: newSha=${head} (the unchanged HEAD), pushed=false, resolvedWithoutCommit=true. Only set this when every thread that carries a comment id is resolved; otherwise the round is treated as stalled.\n` +
 `Always include a one-line summary.`,
-{ label: `fix:${sourceLabel}`, phase: 'Converge', schema: FIX_SCHEMA, agentType: 'clean-coder' },
+{ model: 'fable', label: `fix:${sourceLabel}`, phase: 'Converge', schema: FIX_SCHEMA, agentType: 'clean-coder' },
 )
 }
 
@@ -548,7 +549,7 @@ function postCleanAudit(head) {
 `Write an empty findings file: create a temp file containing exactly [] (an empty JSON array). Then run:\n` +
 `python "${CONFIG.prLoopScripts}/post_audit_thread.py" --skill bugteam --owner ${input.owner} --repo ${input.repo} --pr-number ${input.prNumber} --commit ${head} --state CLEAN --findings-json <temp-file>\n` +
 `Run the script with --help first if any flag name differs. This posts the APPROVE review body that check_convergence.py reads for the bugteam gate. Do not edit code, commit, or push.`,
-{ label: 'post-clean-audit', phase: 'Converge', agentType: 'general-purpose' },
+{ model: 'fable', label: 'post-clean-audit', phase: 'Converge', agentType: 'general-purpose' },
 )
 }
 
@@ -565,10 +566,10 @@ function runCopilotGate(head) {
 ` gh api --method POST repos/${input.owner}/${input.repo}/pulls/${input.prNumber}/requested_reviewers -f 'reviewers[]=copilot-pull-request-reviewer[bot]'\n` +
 `2. Poll for Copilot's review on HEAD ${head}: up to ${CONFIG.copilotMaxPolls} attempts, 360 seconds apart (delay each attempt with "sleep 360", or the PowerShell alternative "Start-Sleep -Seconds 360"). Each attempt: python "${CONFIG.sharedScripts}/fetch_copilot_reviews.py" --owner ${input.owner} --repo ${input.repo} --pr-number ${input.prNumber} for the top-level review state, plus gh api "repos/${input.owner}/${input.repo}/pulls/${input.prNumber}/comments" --paginate --slurp for inline comment ids (Copilot's login contains "copilot", case-insensitive). Only count entries whose commit_id starts with ${head}.\n` +
 ` - Copilot review present and clean/approved on HEAD -> return {sha:${'`'}${head}${'`'}, clean:true, findings:[], blocker:null}.\n` +
-` - Copilot findings on HEAD -> return them (each with its inline comment id in replyToCommentId), clean:false, blocker:null.\n` +
+` - Copilot findings on HEAD -> return them (each with its inline comment id in replyToCommentId; category 'code-standard' for pure CODE_RULES/style violations with no behavioral impact, 'bug' otherwise), clean:false, blocker:null.\n` +
 ` - No review after ${CONFIG.copilotMaxPolls} attempts -> return {sha:${'`'}${head}${'`'}, clean:false, findings:[], blocker:"Copilot did not surface a review on HEAD after ${CONFIG.copilotMaxPolls} polls"}.\n\n` +
 `Return strictly the schema.`,
-{ label: 'copilot-gate', phase: 'Copilot gate', schema: COPILOT_SCHEMA },
+{ model: 'fable', label: 'copilot-gate', phase: 'Copilot gate', schema: COPILOT_SCHEMA },
 )
 }
 
@@ -585,7 +586,7 @@ function checkConvergence(bugbotDown) {
 `Exit 0 -> every gate passed: return {pass:true, failures:[]}.\n` +
 `Exit 1 -> return {pass:false, failures:[<each printed FAIL line verbatim>]}.\n` +
 `Exit 2 -> retry once; if it still errors, return {pass:false, failures:["check_convergence gh error"]}.`,
-{ label: 'check-convergence', phase: 'Finalize', schema: CONVERGENCE_SCHEMA, agentType: 'Explore' },
+{ model: 'fable', label: 'check-convergence', phase: 'Finalize', schema: CONVERGENCE_SCHEMA, agentType: 'Explore' },
 )
 }
 
@@ -600,7 +601,7 @@ function markReady(head) {
 `1. Run: gh pr ready ${input.prNumber} --repo ${input.owner}/${input.repo}\n` +
 `2. Re-query the draft state: gh api repos/${input.owner}/${input.repo}/pulls/${input.prNumber} --jq .draft\n` +
 `Return {ready:true} only when step 2 prints false (the PR is no longer a draft). If step 1 errors or step 2 still prints true, return {ready:false}.`,
-{ label: 'mark-ready', phase: 'Finalize', schema: READY_SCHEMA, agentType: 'general-purpose' },
+{ model: 'fable', label: 'mark-ready', phase: 'Finalize', schema: READY_SCHEMA, agentType: 'general-purpose' },
 )
 }
 
@@ -623,7 +624,51 @@ function repairConvergence(head, failures) {
 `- PR not mergeable: rebase onto origin/main and force-push (git fetch origin main; git rebase origin/main; resolve conflicts; git push --force-with-lease).\n` +
 `- A dirty bot review or a still-pending requested reviewer: leave it; the next round re-runs that reviewer.\n` +
 `Make at most one commit for any code fix. Return the HEAD SHA after any push in newSha (the unchanged ${head} when nothing was pushed), pushed true/false, resolvedWithoutCommit=false (this gate already accepts an unchanged HEAD), and a one-line summary.`,
-{ label: 'repair-convergence', phase: 'Finalize', schema: FIX_SCHEMA, agentType: 'clean-coder' },
+{ model: 'fable', label: 'repair-convergence', phase: 'Finalize', schema: FIX_SCHEMA, agentType: 'clean-coder' },
+)
+}
+
+/**
+* Decide whether a review round surfaced ONLY code-standard violations — pure
+* CODE_RULES/style findings with no behavioral impact. Such a round passes for
+* convergence purposes: the violations are deferred to a follow-up fix issue
+* (plus an environment-hardening PR) rather than blocking this PR.
+* @param {Array<object>} findings deduped findings for the round
+* @returns {boolean} true when every finding is category code-standard
+*/
+function isStandardsOnlyRound(findings) {
+return findings.length > 0 && findings.every((each) => each.category === 'code-standard')
+}
+
+/**
+* Defer a standards-only round: one agent files a GitHub issue listing every
+* code-standard finding, opens a draft PR hardening the Claude environment
+* (hooks/rules) so those violation classes are blocked before code is written,
+* and replies to / resolves any GitHub threads the findings carry, noting the
+* deferral. This PR's branch is never touched.
+* @param {string} head PR HEAD SHA the findings were raised against
+* @param {Array<object>} findings deduped code-standard-only findings
+* @param {string} sourceLabel short description of where the findings came from
+* @returns {Promise<string>} agent transcript (unused)
+*/
+function spawnStandardsFollowUp(head, findings, sourceLabel) {
+const findingsBlock = findings
+.map((each, position) => {
+const eachThreadIds = collectFindingThreadIds(each)
+const threadNote = eachThreadIds.length
+? `\n (GitHub review comment ids: ${eachThreadIds.join(', ')})`
+: ''
+return `${position + 1}. [${each.severity}] ${each.file}:${each.line} — ${each.title}\n ${each.detail}${threadNote}`
+})
+.join('\n')
+return agent(
+`A review round on ${prCoordinates}, HEAD ${head}, surfaced ONLY code-standard violations (CODE_RULES/style, no behavioral impact). The convergence run treats the round as passed and defers these to follow-up work, which you now create. Do NOT commit or push to the PR's own branch.\n\n` +
+`Findings:\n${findingsBlock}\n\n` +
+`1. Follow-up fix issue: file a GitHub issue on ${input.owner}/${input.repo} (gh issue create --body-file with a temp file) titled "Deferred code-standard fixes from PR #${input.prNumber}". The body references the PR and lists each finding with its file:line, severity, and detail. The issue carries the fix work; do not open a fix PR.\n` +
+`2. Environment-hardening PR: in the Claude environment config repo (the repo owning ~/.claude hooks and rules — JonEcho/llm-settings for hooks, jl-cmd/claude-code-config for rules/skills; pick whichever owns the needed surface), create a branch and open a DRAFT PR that hardens hooks/rules so each violation class found here is blocked at Write/Edit time, before code is written or reviewed. Reference the issue from step 1 in the PR body.\n` +
+`3. For each finding that carries a GitHub review comment id: post an inline reply via python "${CONFIG.sharedScripts}/post_fix_reply.py" --owner ${input.owner} --repo ${input.repo} --pr-number ${input.prNumber} --in-reply-to <id> --body "Code-standard-only finding — deferred to follow-up issue <url>." Then resolve the thread by its PRRT_ node id (GraphQL lookup on comment databaseId, then resolveReviewThread or the github MCP pull_request_review_write method=resolve_thread).\n\n` +
+`Return a one-line summary naming the follow-up issue URL and the hardening PR URL.`,
+{ model: 'fable', label: `standards-followup:${sourceLabel}`, phase: 'Converge', agentType: 'clean-coder' },
 )
 }
 
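The `findingsBlock` mapping in `spawnStandardsFollowUp` can be tried in isolation. The sketch below is an approximation under stated assumptions: `collectFindingThreadIds` is defined outside this hunk, so the version here is a hypothetical stand-in that assumes each finding carries at most one id in `replyToCommentId`; `buildFindingsBlock` and the sample findings are invented names and data; and the separator is a plain hyphen, unlike the original template string.

```javascript
// Hypothetical stand-in: the real collectFindingThreadIds lives outside this hunk.
// Assumption: a finding carries at most one thread id, in replyToCommentId.
function collectFindingThreadIds(finding) {
  return finding.replyToCommentId === null ? [] : [finding.replyToCommentId]
}

// Mirrors the shape of the findingsBlock mapping: one numbered entry per
// finding, the detail on its own line, and an optional thread-id note.
function buildFindingsBlock(findings) {
  return findings
    .map((each, position) => {
      const eachThreadIds = collectFindingThreadIds(each)
      const threadNote = eachThreadIds.length
        ? `\n   (GitHub review comment ids: ${eachThreadIds.join(', ')})`
        : ''
      return `${position + 1}. [${each.severity}] ${each.file}:${each.line} - ${each.title}\n   ${each.detail}${threadNote}`
    })
    .join('\n')
}

// Sample findings are invented for the example.
const sampleFindings = [
  { severity: 'P2', file: 'src/a.js', line: 3, title: 'Magic value', detail: 'Use a named constant.', replyToCommentId: 101 },
  { severity: 'P2', file: 'src/b.js', line: 9, title: 'Missing type hint', detail: 'Annotate the return type.', replyToCommentId: null },
]

console.log(buildFindingsBlock(sampleFindings))
```

Only findings that carry a thread id get the extra note line, which is what lets step 3 of the agent prompt find the review comments it has to reply to and resolve.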
@@ -633,6 +678,7 @@ let rounds = 0
 let iterations = 0
 let blocker = null
 let bugbotDown = input.bugbotDisabled || false
+let standardsNote = null
 
 while (iterations < CONFIG.maxIterations) {
 iterations += 1
@@ -657,6 +703,14 @@ while (iterations < CONFIG.maxIterations) {
 continue
 }
 const findings = roundOutcome.findings
+if (isStandardsOnlyRound(findings)) {
+log(`Round ${rounds}: ${findings.length} code-standard-only finding(s) — deferring to follow-up PRs and treating the round as passed`)
+await spawnStandardsFollowUp(head, findings, 'converge-round')
+standardsNote = `${findings.length} code-standard finding(s) deferred to a follow-up fix issue plus an environment-hardening PR — verify both land`
+await postCleanAudit(head)
+phase = 'COPILOT'
+continue
+}
 if (findings.length > 0) {
 log(`Round ${rounds}: ${findings.length} finding(s) — applying fixes`)
 const fixResult = await applyFixes(head, findings, 'converge-round')
@@ -693,6 +747,13 @@ while (iterations < CONFIG.maxIterations) {
 break
 }
 if (copilotOutcome.kind === 'fix') {
+if (isStandardsOnlyRound(copilotOutcome.findings)) {
+log(`Copilot raised ${copilotOutcome.findings.length} code-standard-only finding(s) — deferring to follow-up PRs and treating the gate as passed`)
+await spawnStandardsFollowUp(head, copilotOutcome.findings, 'copilot')
+standardsNote = `${copilotOutcome.findings.length} code-standard finding(s) deferred to a follow-up fix issue plus an environment-hardening PR — verify both land`
+phase = 'FINALIZE'
+continue
+}
 log(`Copilot raised ${copilotOutcome.findings.length} finding(s) — fixing and re-converging`)
 const fixResult = await applyFixes(head, copilotOutcome.findings, 'copilot')
 const hadThreadBearingFinding = copilotOutcome.findings.some((each) => collectFindingThreadIds(each).length > 0)
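The two loop hunks above add the same branch in two places. A compact way to read them, sketched under assumptions: `classifyFindings` and `PHASE_AFTER_DEFERRAL` are hypothetical names that only summarize the inline branching, and the `'clean'` label is a simplification for an empty findings list. Only `isStandardsOnlyRound` is taken from the diff.

```javascript
// Predicate as added to converge.mjs in this diff.
function isStandardsOnlyRound(findings) {
  return findings.length > 0 && findings.every((each) => each.category === 'code-standard')
}

// Hypothetical labels for the three outcomes; the real loop branches inline.
function classifyFindings(findings) {
  if (findings.length === 0) return 'clean'
  return isStandardsOnlyRound(findings) ? 'defer' : 'fix'
}

// Phase assigned after a deferral, keyed by the sourceLabel passed to
// spawnStandardsFollowUp in each hunk.
const PHASE_AFTER_DEFERRAL = {
  'converge-round': 'COPILOT',
  copilot: 'FINALIZE',
}

const styleOnly = [{ category: 'code-standard' }, { category: 'code-standard' }]
const mixed = [{ category: 'code-standard' }, { category: 'bug' }]

console.log(classifyFindings([]))        // clean
console.log(classifyFindings(styleOnly)) // defer
console.log(classifyFindings(mixed))     // fix
console.log(PHASE_AFTER_DEFERRAL.copilot) // FINALIZE
```

The `findings.length > 0` guard matters here: an empty list would otherwise satisfy `every` and be treated as a deferral. A single `bug` finding also sends the whole round down the fix path, style findings included.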
@@ -722,7 +783,7 @@ while (iterations < CONFIG.maxIterations) {
 const readyResult = await markReady(head)
 const readyOutcome = classifyReadyOutcome(readyResult)
 if (readyOutcome.converged) {
-return { converged: true, rounds, finalSha: head, blocker: null }
+return { converged: true, rounds, finalSha: head, blocker: null, standardsNote }
 }
 blocker = readyOutcome.blocker
 break
@@ -739,4 +800,5 @@ return {
 rounds,
 finalSha: head,
 blocker: blocker || `iteration cap reached (${CONFIG.maxIterations})`,
+standardsNote,
 }
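Both return paths now carry `standardsNote`, which stays `null` unless a round or the Copilot gate deferred findings. As a rough sketch of how a caller might turn that object into the report lines listed in the autoconverge SKILL.md hunk: `formatReportLines` is a hypothetical helper, the sample objects are invented, and any report header line is left out because it is not visible in this diff.

```javascript
// Hypothetical renderer: emits only the report lines visible in the SKILL.md
// hunk. Blocker and Standards are conditional, matching the "only when" notes.
function formatReportLines(result) {
  const lines = [`Rounds: ${result.rounds}`, `Final commit: ${result.finalSha}`]
  if (result.blocker) lines.push(`Blocker: ${result.blocker}`)
  if (result.standardsNote) lines.push(`Standards: ${result.standardsNote}`)
  return lines.join('\n')
}

// Sample results are invented for the example.
const convergedWithDeferral = {
  converged: true,
  rounds: 2,
  finalSha: 'abc123',
  blocker: null,
  standardsNote: '3 code-standard finding(s) deferred',
}
const convergedClean = { converged: true, rounds: 1, finalSha: 'def456', blocker: null, standardsNote: null }

console.log(formatReportLines(convergedWithDeferral))
console.log(formatReportLines(convergedClean))
```

A run with no deferral prints just the two unconditional lines, so the report looks the same as before this change.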
@@ -43,6 +43,27 @@ working directory routes into the PR's repo for local work and returns to
 the session worktree before teardown. See
 [`reference/per-tick.md` § Step 1.5](reference/per-tick.md).
 
+## Budget-aware tick boundaries
+
+Before starting any tick, estimate whether the remaining session/usage
+budget covers one full clean tick (worst case: a BUGBOT fetch + a
+full-diff CODE_REVIEW + a fix commit + replies). If it does not, do not
+start the tick. Stop at the current tick boundary: write updated state to
+`$CLAUDE_JOB_DIR/pr-converge-state.json`, then report the exact resume
+command (`/pr-converge <PR URL>`) and the persisted `phase`/`tick_count`.
+A tick cut off mid-flight poisons the resume state — clean SHAs recorded
+against work that never landed — so an unstarted tick is always cheaper
+than a half-finished one.
+
+## Findings discipline
+
+Every finding, reply, and report states verified facts only — no hedging
+language (`likely`, `probably`, `should`, `appears to`). Verify each
+claim against the code on `current_head` before stating it; the
+anti-hallucination Stop hook rejects hedged output, forcing a rework
+pass. A claim that cannot be verified is reported as unverified, not
+softened.
+
 ## State persistence
 
 Single-PR mode persists loop state to `$CLAUDE_JOB_DIR/pr-converge-state.json`.
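The two rules added in this hunk, the budget check before a tick and the ban on hedging terms, can each be pictured as a small predicate. This is a sketch under assumptions, not how the skill or the Stop hook is implemented: `worstCaseTickCost`, `shouldStartTick`, `findHedging`, the cost field names, and the budget units are all hypothetical. Only the four hedging terms and the four worst-case components come from the text.

```javascript
// Hypothetical cost model: the four worst-case components named in the text.
// Units are whatever the session budget is measured in (an assumption).
function worstCaseTickCost(costs) {
  return costs.bugbotFetch + costs.fullDiffCodeReview + costs.fixCommit + costs.replies
}

// Start a tick only when the remaining budget covers a full worst-case tick.
function shouldStartTick(remainingBudget, costs) {
  return remainingBudget >= worstCaseTickCost(costs)
}

// Hedging terms listed in the Findings discipline section.
const HEDGING_TERMS = ['likely', 'probably', 'should', 'appears to']

// Returns the hedging terms found as whole words or phrases, case-insensitive.
function findHedging(text) {
  const lowered = text.toLowerCase()
  return HEDGING_TERMS.filter((term) => new RegExp(`\\b${term}\\b`).test(lowered))
}

// Sample numbers are invented for the example.
const sampleCosts = { bugbotFetch: 5, fullDiffCodeReview: 40, fixCommit: 20, replies: 5 }

console.log(shouldStartTick(100, sampleCosts)) // true
console.log(shouldStartTick(60, sampleCosts))  // false
console.log(findHedging('This is likely a bug and should be fixed.')) // [ 'likely', 'should' ]
console.log(findHedging('The guard on line 12 is unlikely to help.')) // []
```

The word-boundary match keeps `unlikely` from counting as `likely`. A real hook may use different matching rules, so treat the term handling here as illustrative.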
@@ -354,6 +375,7 @@ round as converged. This rule holds every tick, every loop, every PR.
|
|
|
354
375
|
`python "$HOME/.claude/skills/bugteam/scripts/revoke_project_claude_permissions.py"`
|
|
355
376
|
|
|
356
377
|
- [ ] **Step 11: Print final report**
|
|
378
|
+
Print this block verbatim — no paraphrase, no extra commentary:
|
|
357
379
|
```
|
|
358
380
|
/pr-converge exit: converged
|
|
359
381
|
Loops: <N>
|