claude-dev-env 1.38.0 → 1.39.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (271)
  1. package/CLAUDE.md +10 -36
  2. package/_shared/pr-loop/audit-reply-template.md +147 -0
  3. package/_shared/pr-loop/fix-protocol.md +25 -4
  4. package/_shared/pr-loop/gh-payloads.md +37 -50
  5. package/_shared/pr-loop/scripts/code_rules_gate.py +0 -60
  6. package/_shared/pr-loop/scripts/config/post_audit_thread_constants.py +189 -0
  7. package/_shared/pr-loop/scripts/post_audit_thread.py +947 -0
  8. package/_shared/pr-loop/scripts/tests/test_code_rules_gate.py +0 -19
  9. package/_shared/pr-loop/scripts/tests/test_post_audit_thread.py +923 -0
  10. package/_shared/pr-loop/scripts/tests/test_post_audit_thread_constants.py +127 -0
  11. package/_shared/pr-loop/state-schema.md +1 -1
  12. package/agents/clean-coder.md +2 -2
  13. package/bin/install.mjs +6 -7
  14. package/bin/install.test.mjs +8 -0
  15. package/commands/doc-gist.md +16 -0
  16. package/commands/plan.md +0 -2
  17. package/commands/review-plan.md +1 -1
  18. package/docs/CODE_RULES.md +122 -2
  19. package/hooks/blocking/bot_mention_comment_blocker.py +75 -0
  20. package/hooks/blocking/code_rules_enforcer.py +1236 -161
  21. package/hooks/blocking/convergence_gate_blocker.py +130 -0
  22. package/hooks/blocking/destructive_command_blocker.py +74 -0
  23. package/hooks/blocking/gh_body_arg_blocker.py +30 -0
  24. package/hooks/blocking/md_to_html_blocker.py +119 -0
  25. package/hooks/blocking/test_bot_mention_comment_blocker.py +131 -0
  26. package/hooks/blocking/test_code_rules_enforcer.py +21 -0
  27. package/hooks/blocking/test_code_rules_enforcer_any_exempt_files.py +70 -0
  28. package/hooks/blocking/test_code_rules_enforcer_any_imports_and_cast.py +92 -0
  29. package/hooks/blocking/test_code_rules_enforcer_banned_import_alias.py +143 -0
  30. package/hooks/blocking/test_code_rules_enforcer_banned_prefixes.py +152 -0
  31. package/hooks/blocking/test_code_rules_enforcer_bare_except.py +120 -0
  32. package/hooks/blocking/test_code_rules_enforcer_boundary_types.py +175 -0
  33. package/hooks/blocking/test_code_rules_enforcer_cap_meta.py +0 -1
  34. package/hooks/blocking/test_code_rules_enforcer_collection_prefix.py +50 -0
  35. package/hooks/blocking/test_code_rules_enforcer_docstring_format.py +255 -0
  36. package/hooks/blocking/test_code_rules_enforcer_inline_tuple_string_magic.py +130 -0
  37. package/hooks/blocking/test_code_rules_enforcer_stub_implementations.py +141 -0
  38. package/hooks/blocking/test_code_rules_enforcer_test_branching.py +143 -0
  39. package/hooks/blocking/test_code_rules_enforcer_thin_wrapper_files.py +169 -0
  40. package/hooks/blocking/test_code_rules_enforcer_todo_markers.py +99 -0
  41. package/hooks/blocking/test_code_rules_enforcer_typed_dict_pairs.py +141 -0
  42. package/hooks/blocking/test_code_rules_enforcer_unused_imports.py +158 -0
  43. package/hooks/blocking/test_convergence_gate_blocker.py +63 -0
  44. package/hooks/blocking/test_destructive_command_blocker.py +146 -0
  45. package/hooks/blocking/test_destructive_command_blocker_no_verify.py +102 -0
  46. package/hooks/blocking/test_gh_body_arg_blocker.py +45 -0
  47. package/hooks/blocking/test_md_to_html_blocker.py +317 -0
  48. package/hooks/config/any_type_config.py +7 -0
  49. package/hooks/config/banned_identifiers_constants.py +11 -0
  50. package/hooks/config/blocking_check_limits.py +38 -0
  51. package/hooks/config/bot_mention_comment_blocker_constants.py +20 -0
  52. package/hooks/config/code_rules_enforcer_constants.py +53 -0
  53. package/hooks/config/convergence_branch_constants.py +9 -0
  54. package/hooks/config/doc_gist_auto_publish_constants.py +18 -0
  55. package/hooks/config/html_companion_constants.py +20 -0
  56. package/hooks/config/inline_tuple_string_magic_constants.py +22 -0
  57. package/hooks/config/test_banned_identifiers_constants.py +17 -0
  58. package/hooks/hooks.json +28 -20
  59. package/hooks/pyproject.toml +69 -0
  60. package/hooks/validators/mypy_integration.py +47 -1
  61. package/hooks/validators/run_all_validators.py +3 -3
  62. package/hooks/validators/test_mypy_integration.py +50 -1
  63. package/hooks/workflow/doc_gist_auto_publish.py +144 -0
  64. package/hooks/workflow/md_to_html_companion.py +365 -0
  65. package/hooks/workflow/test_doc_gist_auto_publish.py +117 -0
  66. package/hooks/workflow/test_md_to_html_companion.py +452 -0
  67. package/package.json +1 -1
  68. package/rules/gh-body-file.md +2 -0
  69. package/scripts/Install-SweepEmptyDirs.ps1 +111 -0
  70. package/scripts/check.ps1 +106 -0
  71. package/scripts/config/timing.py +11 -0
  72. package/scripts/sweep_empty_dirs.py +138 -0
  73. package/scripts/sync_to_cursor/rules.py +1 -1
  74. package/scripts/test_sweep_empty_dirs.py +183 -0
  75. package/skills/_shared/pr-loop/prompts/pr-consistency-audit.xml +323 -0
  76. package/skills/_shared/pr-loop/scripts/_cli_utils.py +22 -0
  77. package/skills/_shared/pr-loop/scripts/_path_resolver.py +165 -0
  78. package/skills/_shared/pr-loop/scripts/_xml_utils.py +20 -0
  79. package/skills/_shared/pr-loop/scripts/build_audit_prompt.py +182 -0
  80. package/skills/_shared/pr-loop/scripts/build_fix_prompt.py +185 -0
  81. package/skills/_shared/pr-loop/scripts/config/__init__.py +0 -0
  82. package/skills/_shared/pr-loop/scripts/config/path_resolver_constants.py +78 -0
  83. package/skills/_shared/pr-loop/scripts/init_loop_state.py +135 -0
  84. package/skills/_shared/pr-loop/scripts/teardown_worktrees.py +175 -0
  85. package/skills/_shared/pr-loop/scripts/write_audit_outcomes.py +182 -0
  86. package/skills/_shared/pr-loop/scripts/write_fix_outcomes.py +206 -0
  87. package/skills/bugteam/CONSTRAINTS.md +21 -22
  88. package/skills/bugteam/EXAMPLES.md +3 -3
  89. package/skills/bugteam/PROMPTS.md +227 -67
  90. package/skills/bugteam/SKILL.md +114 -455
  91. package/skills/bugteam/reference/README.md +1 -1
  92. package/skills/bugteam/reference/audit-and-teammates.md +112 -39
  93. package/skills/bugteam/reference/audit-contract.md +4 -22
  94. package/skills/bugteam/reference/copilot-gap-analysis.md +8 -5
  95. package/skills/bugteam/reference/design-rationale.md +2 -2
  96. package/skills/bugteam/reference/github-pr-reviews.md +50 -57
  97. package/skills/bugteam/reference/obstacles/audit-assign-ids.md +13 -0
  98. package/skills/bugteam/reference/obstacles/audit-capture-excerpts.md +13 -0
  99. package/skills/bugteam/reference/obstacles/audit-walk-categories.md +13 -0
  100. package/skills/bugteam/reference/obstacles/audit-write-xml.md +13 -0
  101. package/skills/bugteam/reference/obstacles/fix-append-summary.md +13 -0
  102. package/skills/bugteam/reference/obstacles/fix-apply-fixes.md +13 -0
  103. package/skills/bugteam/reference/obstacles/fix-git-add-commit.md +13 -0
  104. package/skills/bugteam/reference/obstacles/fix-git-push.md +13 -0
  105. package/skills/bugteam/reference/obstacles/fix-post-reply.md +13 -0
  106. package/skills/bugteam/reference/obstacles/fix-publish-summary.md +13 -0
  107. package/skills/bugteam/reference/obstacles/fix-py-compile.md +13 -0
  108. package/skills/bugteam/reference/obstacles/fix-read-files.md +13 -0
  109. package/skills/bugteam/reference/obstacles/fix-resolve-thread.md +13 -0
  110. package/skills/bugteam/reference/obstacles/fix-test-suite.md +13 -0
  111. package/skills/bugteam/reference/obstacles/fix-violation-count.md +13 -0
  112. package/skills/bugteam/reference/obstacles/fix-write-xml.md +13 -0
  113. package/skills/bugteam/reference/team-setup.md +106 -9
  114. package/skills/bugteam/reference/teardown-publish-permissions.md +39 -8
  115. package/skills/bugteam/scripts/README.md +60 -0
  116. package/skills/bugteam/scripts/_claude_permissions_common.py +358 -0
  117. package/skills/bugteam/scripts/bugteam_code_rules_gate.py +976 -0
  118. package/skills/bugteam/scripts/bugteam_fix_hookspath.py +375 -0
  119. package/skills/bugteam/scripts/bugteam_preflight.py +294 -0
  120. package/skills/bugteam/scripts/config/bugteam_code_rules_gate_constants.py +25 -0
  121. package/skills/bugteam/scripts/config/bugteam_fix_hookspath_constants.py +26 -0
  122. package/skills/bugteam/scripts/config/bugteam_preflight_constants.py +35 -0
  123. package/skills/bugteam/scripts/config/claude_permissions_common_constants.py +20 -0
  124. package/skills/bugteam/scripts/config/probe_code_rules_enforcer_check_constants.py +12 -0
  125. package/skills/bugteam/scripts/config/windows_safe_rmtree_constants.py +7 -0
  126. package/skills/bugteam/scripts/grant_project_claude_permissions.py +175 -0
  127. package/skills/bugteam/scripts/probe_code_rules_enforcer_check.py +107 -0
  128. package/skills/bugteam/scripts/revoke_project_claude_permissions.py +220 -0
  129. package/skills/bugteam/scripts/test__claude_permissions_common.py +112 -0
  130. package/skills/bugteam/scripts/test_bugteam_code_rules_gate.py +400 -0
  131. package/skills/bugteam/scripts/test_bugteam_fix_hookspath.py +384 -0
  132. package/skills/bugteam/scripts/test_bugteam_preflight.py +268 -0
  133. package/skills/bugteam/scripts/test_claude_permissions_common.py +195 -0
  134. package/skills/bugteam/scripts/test_grant_project_claude_permissions.py +55 -0
  135. package/skills/bugteam/scripts/test_probe_code_rules_enforcer_check.py +76 -0
  136. package/skills/bugteam/scripts/test_revoke_project_claude_permissions.py +55 -0
  137. package/skills/bugteam/scripts/test_windows_safe_rmtree.py +108 -0
  138. package/skills/bugteam/scripts/windows_safe_rmtree.py +100 -0
  139. package/skills/bugteam/test_skill_additions.py +1 -11
  140. package/skills/code/SKILL.md +176 -0
  141. package/skills/doc-gist/SKILL.md +99 -0
  142. package/skills/doc-gist/references/examples/01-exploration-code-approaches.html +453 -0
  143. package/skills/doc-gist/references/examples/02-exploration-visual-designs.html +515 -0
  144. package/skills/doc-gist/references/examples/03-code-review-pr.html +638 -0
  145. package/skills/doc-gist/references/examples/04-code-understanding.html +491 -0
  146. package/skills/doc-gist/references/examples/05-design-system.html +629 -0
  147. package/skills/doc-gist/references/examples/06-component-variants.html +605 -0
  148. package/skills/doc-gist/references/examples/07-prototype-animation.html +455 -0
  149. package/skills/doc-gist/references/examples/08-prototype-interaction.html +396 -0
  150. package/skills/doc-gist/references/examples/09-slide-deck.html +592 -0
  151. package/skills/doc-gist/references/examples/10-svg-illustrations.html +492 -0
  152. package/skills/doc-gist/references/examples/11-status-report.html +528 -0
  153. package/skills/doc-gist/references/examples/12-incident-report.html +596 -0
  154. package/skills/doc-gist/references/examples/13-flowchart-diagram.html +395 -0
  155. package/skills/doc-gist/references/examples/14-research-feature-explainer.html +381 -0
  156. package/skills/doc-gist/references/examples/15-research-concept-explainer.html +368 -0
  157. package/skills/doc-gist/references/examples/16-implementation-plan.html +702 -0
  158. package/skills/doc-gist/references/examples/17-pr-writeup.html +595 -0
  159. package/skills/doc-gist/references/examples/18-editor-triage-board.html +573 -0
  160. package/skills/doc-gist/references/examples/19-editor-feature-flags.html +663 -0
  161. package/skills/doc-gist/references/examples/20-editor-prompt-tuner.html +722 -0
  162. package/skills/doc-gist/references/examples/README.md +5 -0
  163. package/skills/doc-gist/scripts/config/__init__.py +0 -0
  164. package/skills/doc-gist/scripts/config/gist_upload_constants.py +16 -0
  165. package/skills/doc-gist/scripts/gist_upload.py +177 -0
  166. package/skills/doc-gist/scripts/test_gist_upload.py +51 -0
  167. package/skills/findbugs/SKILL.md +68 -2
  168. package/skills/monitor-open-prs/SKILL.md +13 -32
  169. package/skills/monitor-open-prs/test_skill_contract.py +0 -11
  170. package/skills/pr-consistency-audit/SKILL.md +112 -0
  171. package/skills/pr-consistency-audit/reference/detection-rules.md +96 -0
  172. package/skills/pr-consistency-audit/reference/illustrations.md +78 -0
  173. package/skills/pr-converge/SKILL.md +227 -23
  174. package/skills/pr-converge/config/__init__.py +0 -0
  175. package/skills/pr-converge/config/constants.py +62 -0
  176. package/skills/pr-converge/reference/convergence-gates.md +138 -44
  177. package/skills/pr-converge/reference/examples.md +43 -11
  178. package/skills/pr-converge/reference/fix-protocol.md +6 -5
  179. package/skills/pr-converge/reference/ground-rules.md +5 -3
  180. package/skills/pr-converge/reference/multi-pr-orchestration.md +44 -19
  181. package/skills/pr-converge/reference/obstacles/fix-post-replies.md +13 -0
  182. package/skills/pr-converge/reference/obstacles/fix-publish-summary.md +13 -0
  183. package/skills/pr-converge/reference/obstacles/fix-push.md +13 -0
  184. package/skills/pr-converge/reference/obstacles/fix-read-filelines.md +13 -0
  185. package/skills/pr-converge/reference/obstacles/fix-reset-state.md +13 -0
  186. package/skills/pr-converge/reference/obstacles/fix-resolve-threads.md +13 -0
  187. package/skills/pr-converge/reference/obstacles/fix-spawn-clean-coder.md +13 -0
  188. package/skills/pr-converge/reference/obstacles/fix-stage-commit.md +13 -0
  189. package/skills/pr-converge/reference/obstacles/fix-trigger-bugbot.md +13 -0
  190. package/skills/pr-converge/reference/obstacles/fix-write-test.md +13 -0
  191. package/skills/pr-converge/reference/per-tick.md +90 -31
  192. package/skills/pr-converge/reference/state-schema.md +22 -1
  193. package/skills/pr-converge/reference/stop-conditions.md +9 -7
  194. package/skills/pr-converge/scripts/README.md +34 -46
  195. package/skills/pr-converge/scripts/check_bugbot_ci.py +174 -0
  196. package/skills/pr-converge/scripts/check_convergence.py +497 -0
  197. package/skills/pr-converge/scripts/check_pending_reviews.py +154 -0
  198. package/skills/pr-converge/scripts/config/pr_converge_constants.py +118 -0
  199. package/skills/pr-converge/scripts/fetch_copilot_reviews.py +134 -0
  200. package/skills/pr-converge/scripts/post_fix_reply.py +168 -0
  201. package/skills/pr-converge/workflows/schedule-wakeup-loop.md +5 -12
  202. package/skills/qbug/SKILL.md +132 -27
  203. package/skills/session-log/SKILL.md +216 -114
  204. package/skills/session-tidy/SKILL.md +1 -1
  205. package/skills/skill-builder/SKILL.md +138 -56
  206. package/skills/skill-builder/references/delegation-map.md +72 -113
  207. package/skills/skill-builder/references/progressive-disclosure.md +122 -0
  208. package/skills/skill-builder/references/self-audit-checklist.md +92 -0
  209. package/skills/skill-builder/references/skill-types.md +228 -0
  210. package/skills/skill-builder/references/thariq-x-post-skills.json +33 -0
  211. package/skills/skill-builder/templates/gap-analysis.md +15 -8
  212. package/skills/skill-builder/workflows/improve-skill.md +86 -57
  213. package/skills/skill-builder/workflows/new-skill.md +80 -168
  214. package/skills/skill-builder/workflows/polish-skill.md +78 -54
  215. package/skills/structure-prompt/SKILL.md +50 -0
  216. package/skills/structure-prompt/reference/adversarial-tuning.md +62 -0
  217. package/skills/structure-prompt/reference/block-classification.md +27 -0
  218. package/skills/structure-prompt/reference/canonical-case.md +48 -0
  219. package/skills/structure-prompt/reference/citation-depth.md +70 -0
  220. package/skills/structure-prompt/reference/cleanup.md +33 -0
  221. package/skills/structure-prompt/reference/constraints.md +33 -0
  222. package/skills/structure-prompt/reference/directives.md +37 -0
  223. package/skills/structure-prompt/reference/examples.md +72 -0
  224. package/skills/structure-prompt/reference/instantiation.md +51 -0
  225. package/skills/structure-prompt/reference/output-contract.md +72 -0
  226. package/skills/structure-prompt/reference/per-category.md +23 -0
  227. package/skills/structure-prompt/reference/persona.md +38 -0
  228. package/skills/structure-prompt/reference/research.md +33 -0
  229. package/skills/structure-prompt/reference/structure.md +28 -0
  230. package/agents/code-standards-agent.md +0 -93
  231. package/agents/groq-coder.md +0 -113
  232. package/agents/plan-executor.md +0 -226
  233. package/agents/project-docs-analyzer.md +0 -53
  234. package/agents/project-structure-organizer-agent.md +0 -72
  235. package/agents/skill-to-agent-converter.md +0 -370
  236. package/agents/skill-writer-agent.md +0 -470
  237. package/agents/user-docs-writer.md +0 -67
  238. package/agents/workflow-visual-documenter.md +0 -82
  239. package/commands/readability-review.md +0 -20
  240. package/hooks/mypy.ini +0 -2
  241. package/hooks/notification/attention_needed_notify.py +0 -71
  242. package/hooks/notification/claude_notification_handler.py +0 -67
  243. package/hooks/notification/notification_utils.py +0 -267
  244. package/hooks/notification/subagent_complete_notify.py +0 -381
  245. package/hooks/notification/test_attention_needed_notify.py +0 -47
  246. package/hooks/notification/test_claude_notification_handler.py +0 -54
  247. package/hooks/notification/test_notification_utils.py +0 -91
  248. package/hooks/notification/test_subagent_complete_notify.py +0 -79
  249. package/scripts/config/groq_bugteam_config.py +0 -230
  250. package/scripts/config/test_groq_bugteam_config.py +0 -83
  251. package/scripts/config/test_spec_implementer_prompt.py +0 -32
  252. package/scripts/groq_bugteam.README.md +0 -131
  253. package/scripts/groq_bugteam.py +0 -647
  254. package/scripts/groq_bugteam_dotenv.py +0 -40
  255. package/scripts/groq_bugteam_spec.py +0 -226
  256. package/scripts/test_groq_bugteam.py +0 -529
  257. package/scripts/test_groq_bugteam_apply_fix_from_spec.py +0 -426
  258. package/scripts/test_groq_bugteam_dotenv.py +0 -66
  259. package/scripts/test_groq_bugteam_spec.py +0 -338
  260. package/skills/bugteam/SKILL_EVALS.md +0 -309
  261. package/skills/dream/SKILL.md +0 -118
  262. package/skills/ingest/SKILL.md +0 -40
  263. package/skills/npm-creator/SKILL.md +0 -187
  264. package/skills/readability-review/SKILL.md +0 -127
  265. package/skills/resume-review/SKILL.md +0 -261
  266. package/skills/rule-audit/SKILL.md +0 -307
  267. package/skills/rule-creator/SKILL.md +0 -150
  268. package/skills/searching-obsidian-vault/SKILL.md +0 -131
  269. package/skills/skill-writer/REFERENCE.md +0 -284
  270. package/skills/skill-writer/SKILL.md +0 -222
  271. package/skills/tdd-team/SKILL.md +0 -128
@@ -0,0 +1,138 @@
+ #!/usr/bin/env python3
+ """Delete empty directories older than a configurable age under a given root.
+
+ Usage:
+     python sweep_empty_dirs.py /path/to/watch
+     python sweep_empty_dirs.py /path/to/watch --age 300
+     python sweep_empty_dirs.py /path/to/watch --once
+ """
+
+ from __future__ import annotations
+
+ import argparse
+ import errno
+ import logging
+ import os
+ import sys
+ import time
+
+ from config.timing import DEFAULT_AGE_SECONDS, DEFAULT_POLL_INTERVAL
+
+
+ def _positive_int(raw_argument: str) -> int:
+     """Argparse type: require value >= 1."""
+     try:
+         parsed = int(raw_argument)
+     except ValueError:
+         raise argparse.ArgumentTypeError(
+             f"invalid integer value: {raw_argument!r}"
+         )
+     if parsed < 1:
+         raise argparse.ArgumentTypeError(f"must be >= 1, got {raw_argument}")
+     return parsed
+
+
+ def _log_walk_error(os_error: OSError) -> None:
+     logging.warning("cannot scan %s -- %s", os_error.filename, os_error.strerror)
+
+
+ def sweep(root: str, min_age_seconds: int) -> list[str]:
+     """Remove empty directories under *root* older than *min_age_seconds*.
+
+     Walks bottom-up so nested empty directories are cleaned from the leaves
+     inward. Relies on os.rmdir to fail harmlessly for non-empty directories
+     instead of checking snapshotted subdirectory lists.
+     """
+
+     all_removed: list[str] = []
+
+     now = time.time()
+     for each_directory_path, _, _ in os.walk(
+         root, onerror=_log_walk_error, topdown=False
+     ):
+         if each_directory_path == root:
+             continue
+         try:
+             raw_ctime = os.path.getctime(each_directory_path)
+         except FileNotFoundError:
+             continue
+         except PermissionError:
+             logging.warning("permission denied -- %s", each_directory_path)
+             continue
+         except OSError:
+             continue
+         ctime = min(raw_ctime, now)
+         if now - ctime > min_age_seconds:
+             try:
+                 os.rmdir(each_directory_path)
+                 logging.info("deleted: %s", each_directory_path)
+                 all_removed.append(each_directory_path)
+             except FileNotFoundError:
+                 pass
+             except OSError as e:
+                 if e.errno not in (errno.ENOTEMPTY, errno.EEXIST):
+                     logging.warning(
+                         "could not remove %s -- %s",
+                         each_directory_path,
+                         e,
+                     )
+
+     return all_removed
+
+
+ def _build_parser() -> argparse.ArgumentParser:
+     default_age_seconds = DEFAULT_AGE_SECONDS
+     default_poll_interval = DEFAULT_POLL_INTERVAL
+
+     parser = argparse.ArgumentParser(
+         description="Delete empty directories older than a given age.",
+     )
+     parser.add_argument("root", help="Root directory to scan")
+     parser.add_argument(
+         "--age",
+         type=_positive_int,
+         default=default_age_seconds,
+         help=f"Minimum age in seconds (default: {default_age_seconds})",
+     )
+     parser.add_argument(
+         "--once", dest="is_once",
+         action="store_true",
+         help="Single pass and exit instead of watching in a loop",
+     )
+     parser.add_argument(
+         "--interval",
+         type=_positive_int,
+         default=default_poll_interval,
+         help=f"Poll interval in seconds when looping (default: {default_poll_interval})",
+     )
+     return parser
+
+
+ def main() -> None:
+     parser = _build_parser()
+     arguments = parser.parse_args()
+
+     logging.basicConfig(level=logging.INFO, format="%(message)s")
+
+     if not os.path.isdir(arguments.root):
+         print(f"error: not a directory: {arguments.root}", file=sys.stderr)
+         sys.exit(1)
+
+     if arguments.is_once:
+         sweep(arguments.root, arguments.age)
+         return
+
+     print(
+         f"watching {arguments.root} every {arguments.interval}s"
+         f" (age threshold: {arguments.age}s)"
+     )
+     try:
+         while True:
+             sweep(arguments.root, arguments.age)
+             time.sleep(arguments.interval)
+     except KeyboardInterrupt:
+         print("\nstopped.")
+
+
+ if __name__ == "__main__":
+     main()
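
The bottom-up removal that `sweep` relies on can be sketched in isolation. This is a minimal illustration under stated assumptions, not the packaged script: the names `remove_empty_dirs_bottom_up` and `demo` are hypothetical, and the age threshold and logging are left out.

```python
import os
import tempfile
from pathlib import Path


def remove_empty_dirs_bottom_up(root: str) -> list[str]:
    """Hypothetical helper: remove every empty directory below root.

    The age threshold and logging are omitted; only the walk order and the
    reliance on os.rmdir failing for non-empty directories are shown.
    """
    removed: list[str] = []
    # topdown=False yields leaves before their parents, so a parent that
    # becomes empty once its children are gone can be removed as well.
    for directory_path, _, _ in os.walk(root, topdown=False):
        if directory_path == root:
            continue  # never remove the root itself
        try:
            os.rmdir(directory_path)  # raises OSError when not empty
        except OSError:
            continue
        removed.append(directory_path)
    return removed


def demo() -> tuple[list[str], bool]:
    """Build a small tree, sweep it, and report what happened."""
    with tempfile.TemporaryDirectory() as temporary_root:
        (Path(temporary_root) / "a" / "b" / "c").mkdir(parents=True)
        kept = Path(temporary_root) / "kept"
        kept.mkdir()
        (kept / "file.txt").write_text("content")
        removed = remove_empty_dirs_bottom_up(temporary_root)
        removed_names = sorted(Path(each).name for each in removed)
        return removed_names, kept.exists()
```

Letting `os.rmdir` decide emptiness avoids acting on a stale directory listing taken earlier in the walk, which is the design choice the `sweep` docstring describes.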
@@ -22,7 +22,7 @@ def _parse_h2_sections(markdown: str) -> dict[str, str]:
  def _filter_core_principles(body: str) -> str:
      lines = []
      for line in body.splitlines():
-         if "readability-review" in line or "readability standard" in line:
+         if "readability standard" in line:
              continue
          lines.append(line)
      return "\n".join(lines).strip()
@@ -0,0 +1,183 @@
+ """Tests for sweep_empty_dirs script behaviors."""
+
+ import argparse
+ import errno
+ import os
+ import sys
+ import tempfile
+ import time
+ from pathlib import Path
+ from unittest.mock import patch
+
+ import pytest
+
+ _SCRIPTS_DIR = Path(os.path.abspath(__file__)).parent
+ if str(_SCRIPTS_DIR) not in sys.path:
+     sys.path.insert(0, str(_SCRIPTS_DIR))
+
+ for _cached in list(sys.modules):
+     if _cached == "config" or _cached.startswith("config."):
+         del sys.modules[_cached]
+
+ from sweep_empty_dirs import _build_parser, _positive_int, sweep # noqa: E402
+
+ _OLD_TIMESTAMP = time.time() - 300
+
+
+ def test_positive_int_accepts_valid_value() -> None:
+     """_positive_int accepts integers >= 1."""
+     assert _positive_int("5") == 5
+
+
+ def test_positive_int_accepts_minimum_value() -> None:
+     """_positive_int accepts exactly 1."""
+     assert _positive_int("1") == 1
+
+
+ def test_positive_int_rejects_zero() -> None:
+     """_positive_int raises for 0."""
+     with pytest.raises(argparse.ArgumentTypeError):
+         _positive_int("0")
+
+
+ def test_positive_int_rejects_negative() -> None:
+     """_positive_int raises for negative values."""
+     with pytest.raises(argparse.ArgumentTypeError):
+         _positive_int("-1")
+
+
+ def test_positive_int_rejects_non_integer() -> None:
+     """_positive_int raises for non-integer strings like 'abc'."""
+     with pytest.raises(argparse.ArgumentTypeError):
+         _positive_int("abc")
+
+
+ def test_build_parser_sets_age_default_from_timing_config() -> None:
+     """_build_parser uses DEFAULT_AGE_SECONDS from config.timing as --age default."""
+     parser = _build_parser()
+     default_age = parser.get_default("age")
+     assert isinstance(default_age, int)
+     assert default_age > 0
+
+
+ def test_build_parser_sets_interval_default_from_timing_config() -> None:
+     """_build_parser uses DEFAULT_POLL_INTERVAL from config.timing as --interval default."""
+     parser = _build_parser()
+     default_interval = parser.get_default("interval")
+     assert isinstance(default_interval, int)
+     assert default_interval > 0
+
+
+ def test_sweep_removes_empty_directory(tmp_path: Path) -> None:
+     """sweep removes an empty directory older than the age threshold."""
+     empty_dir = tmp_path / "empty_old"
+     empty_dir.mkdir()
+
+     sweep(str(tmp_path), min_age_seconds=0)
+
+     assert not empty_dir.exists()
+
+
+ def test_sweep_preserves_non_empty_directory(tmp_path: Path) -> None:
+     """sweep does not remove a directory containing files."""
+     non_empty_dir = tmp_path / "has_files"
+     non_empty_dir.mkdir()
+     (non_empty_dir / "some_file.txt").write_text("content")
+
+     sweep(str(tmp_path), min_age_seconds=0)
+
+     assert non_empty_dir.exists()
+
+
+ def test_sweep_preserves_root_directory(tmp_path: Path) -> None:
+     """sweep never removes the root directory itself."""
+     sub_dir = tmp_path / "subdir"
+     sub_dir.mkdir()
+
+     sweep(str(tmp_path), min_age_seconds=0)
+
+     assert tmp_path.exists()
+
+
+ def test_sweep_removes_nested_empty_dirs(tmp_path: Path) -> None:
+     """sweep removes nested empty directories bottom-up."""
+     nested = tmp_path / "level1" / "level2" / "level3"
+     nested.mkdir(parents=True)
+
+     sweep(str(tmp_path), min_age_seconds=0)
+
+     assert not nested.exists()
+     assert not (tmp_path / "level1" / "level2").exists()
+     assert not (tmp_path / "level1").exists()
+
+
+ def test_sweep_removes_only_old_enough_directories(tmp_path: Path) -> None:
+     """sweep does not remove directories newer than the age threshold."""
+     young_dir = tmp_path / "young"
+     young_dir.mkdir()
+
+     sweep(str(tmp_path), min_age_seconds=9999999)
+
+     assert young_dir.exists()
+
+
+ def test_sweep_returns_list_of_removed_directories(tmp_path: Path) -> None:
+     """sweep returns the paths of directories it removed."""
+     old_dir = tmp_path / "old_empty"
+     old_dir.mkdir()
+
+     removed = sweep(str(tmp_path), min_age_seconds=0)
+
+     assert old_dir.name in [Path(p).name for p in removed]
+
+
+ def test_skips_dir_when_getctime_raises_os_error() -> None:
+     with tempfile.TemporaryDirectory() as tmp:
+         problem_dir = os.path.join(tmp, "broken")
+         os.mkdir(problem_dir)
+
+         original_getctime = os.path.getctime
+
+         def _failing_getctime(path: str) -> float:
+             if "broken" in path:
+                 raise OSError("simulated broken junction")
+             return original_getctime(path)
+
+         with patch("os.path.getctime", side_effect=_failing_getctime):
+             removed = sweep(tmp, min_age_seconds=120)
+
+         assert problem_dir not in removed
+         assert os.path.isdir(problem_dir)
+
+
+ def test_suppresses_eexist_like_enotempty() -> None:
+     with tempfile.TemporaryDirectory() as tmp:
+         non_empty_dir = os.path.join(tmp, "occupied")
+         os.mkdir(non_empty_dir)
+         _touch(os.path.join(non_empty_dir, "a_file"))
+
+         target_dir = os.path.join(tmp, "empty_target")
+         os.mkdir(target_dir)
+
+         def _mock_getctime(directory_path: str) -> float:
+             return _OLD_TIMESTAMP
+
+         original_rmdir = os.rmdir
+
+         def _rmdir_raise_eexist(removal_path: str) -> None:
+             if "occupied" in removal_path:
+                 raise OSError(errno.EEXIST, "Directory not empty")
+             original_rmdir(removal_path)
+
+         with (
+             patch("os.path.getctime", side_effect=_mock_getctime),
+             patch("os.rmdir", side_effect=_rmdir_raise_eexist),
+         ):
+             removed = sweep(tmp, min_age_seconds=120)
+
+         assert target_dir in removed
+         assert os.path.isdir(non_empty_dir)
+
+
+ def _touch(file_path: str) -> None:
+     Path(file_path).write_text("")
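
The last test above depends on a two-argument `OSError` carrying an `errno` attribute. A small sketch of that behaviour, and of the suppression check it exercises, might look like the following; `is_not_empty_error` and `classify` are hypothetical names used only for illustration and are not part of the package.

```python
import errno


def is_not_empty_error(error: OSError) -> bool:
    """Hypothetical helper mirroring the errno check made inside sweep()."""
    return error.errno in (errno.ENOTEMPTY, errno.EEXIST)


def classify(error_number: int) -> bool:
    """Raise a simulated OSError and report whether it would be suppressed."""
    # A two-argument OSError sets .errno. Python may map it to a subclass
    # such as FileExistsError, which is still caught as OSError.
    try:
        raise OSError(error_number, "simulated failure")
    except OSError as error:
        return is_not_empty_error(error)
```

Under this sketch, `classify(errno.EEXIST)` and `classify(errno.ENOTEMPTY)` are treated as "directory not empty", while other error numbers such as `errno.EACCES` are not.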
@@ -0,0 +1,323 @@
1
+ <prompt_artifact name="pr-consistency-audit" version="1.0">
2
+ <role>
3
+ You are a diff auditor. Your job: read every changed file in a pull request. Find every inconsistency across files. Leave nothing unchecked. Leave nothing assumed. You get a diff and a task. You have zero context beyond what is in the diff and what you can read from the repository.
4
+ </role>
5
+
6
+ <scope_anchors>
7
+ <target_local_roots>Every file in the provided diff or changed file list.</target_local_roots>
8
+ <target_canonical_roots>Files outside the diff that serve as the authoritative reference for concepts used inside the diff. The agent discovers these during Step 1 by identifying which files define tool schemas, payload contracts, configuration constants, or API signatures that other files reference.</target_canonical_roots>
9
+ <target_file_globs>**/*.md **/*.py **/*.json **/*.yaml **/*.yml **/*.ts **/*.js **/*.toml</target_file_globs>
10
+ <comparison_basis>Every file compared against every other file. Every claim in a doc compared against the thing it claims about. Every script invocation compared against the script it invokes. Every tool call compared against the tool definition. Every constant value compared against the same constant in other files.</comparison_basis>
11
+ <completion_boundary>All ten detection rules have run against all files. Findings file is written. Summary is printed. Zero rules skipped. Zero files skipped.</completion_boundary>
12
+ </scope_anchors>
13
+
14
+ <execution_workflow>
15
+ <step id="1" name="build_the_manifest">
16
+ Read every file in the diff. Top to bottom. Do not skim. Build a manifest in a temp file. The manifest tracks:
17
+
18
+ - Every Python script file and its argparse arguments. For each script, extract every `add_argument("--name", ...)` call. Note which arguments have `required=True`. Note the script's path relative to the repo root.
19
+
20
+ - Every MCP tool call pattern found in documentation files. Extract the tool name and every parameter name used. Note the file and line. Group by tool name — you need this later for convention checking.
21
+
22
+ - Every shell command invocation (gh, git, python, bash). Note the full command string, file, and line.
23
+
24
+ - Every file path referenced in documentation. Note whether it points to a real file in the diff or in the repository.
25
+
26
+ - Every named concept, feature flag, phase name, or state value mentioned in prose. These are things like "inline_lag", "COPILOT_WAIT", "bugfind", "consolidator". Note the file and line where each appears.
27
+
28
+ - Every constant, threshold, timeout, or magic number with its surrounding context. Values like "360s", "270s", "90s", "3 wakeups", "60 seconds". Note the value, the concept it measures, the file, and the line.
29
+
30
+ - Every function or method whose docstring makes a claim about return values, error handling, or side effects. Note the function name, the claim, the file, and the line. You will verify each claim against the implementation.
31
+
32
+ Write the manifest to `<tmp>/audit-manifest-<timestamp>.json`. This is your working memory. Do not try to hold it in your head.
33
+ </step>
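The argparse portion of the manifest can be gathered without executing any script. The following is a minimal sketch, assuming the scripts call `add_argument` with literal string flags; `extract_argparse_flags` and the sample source are hypothetical names used only for illustration.

```python
import ast


def extract_argparse_flags(source: str) -> list[dict]:
    """Collect every add_argument(...) call found in Python source text."""
    all_arguments = []
    for node in ast.walk(ast.parse(source)):
        is_add_argument_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "add_argument"
        )
        if not is_add_argument_call:
            continue
        # Positional string literals are the flag spellings, e.g. "--owner".
        flag_names = [
            each_arg.value
            for each_arg in node.args
            if isinstance(each_arg, ast.Constant) and isinstance(each_arg.value, str)
        ]
        # Only a literal required=True counts; computed values are not resolved.
        is_required = any(
            keyword.arg == "required"
            and isinstance(keyword.value, ast.Constant)
            and keyword.value.value is True
            for keyword in node.keywords
        )
        all_arguments.append({"flags": flag_names, "required": is_required})
    return all_arguments


# Hypothetical script body used only to exercise the sketch.
sample_source = '''
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--owner", required=True)
parser.add_argument("--repo", required=True)
parser.add_argument("--check-active", action="store_true")
'''
manifest_entry = extract_argparse_flags(sample_source)
```

Parsing with `ast` rather than importing the script keeps the audit free of side effects, at the cost of missing flags that are built dynamically.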
34
+
35
+ <step id="2" name="find_canonical_sources">
36
+ Before running detection rules, identify the canonical source files. A canonical source is the file that defines the authoritative form of a concept. Signs that a file is canonical:
37
+
38
+ - It contains structured schemas, payload definitions, or API contracts (files named `gh-payloads.md`, `mcp_tool_signatures.json`, `config.py`).
39
+ - Its name includes "reference", "spec", "schema", "payload", "contract", or "canonical".
40
+ - Other files in the diff cite it as the source of truth ("see X for the full schema", "as defined in Y", "per Z").
41
+ - It defines the implementation that docstrings describe (the `.py` file that contains the function, not the `.md` file that talks about it).
42
+
43
+ For each canonical source, note in your manifest what concept it is canonical for. Example: `gh-payloads.md` is canonical for MCP tool parameter names. `check_bugbot_ci.py` is canonical for how to invoke check_bugbot_ci.
44
+
45
+ When a detection rule needs to decide what is "correct", the canonical source wins. Always. If no canonical source exists for a concept, flag the inconsistency but mark it as "unresolvable — no canonical source found".
46
+ </step>
47
+
48
+ <step id="3" name="run_detection_rules">
49
+ Run every rule. Do not skip any. Do not combine them. Each rule produces findings independently. Write findings to `<tmp>/inconsistency-audit-<timestamp>.csv` as you go: one pipe-delimited row per finding, written immediately after detection. Format:
50
+
51
+ ```
52
+ file_path | line_number | rule_id | severity | what_is_wrong | what_it_should_be | evidence_path | evidence_detail
53
+ ```
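Writing each row as soon as it is detected might look like the sketch below. It assumes a bare `|` delimiter (the spaces around the pipes in the format line are treated as presentation only), and `append_finding` is a hypothetical helper, not something the package is known to ship.

```python
import csv
import tempfile
from pathlib import Path

FINDING_COLUMNS = [
    "file_path", "line_number", "rule_id", "severity",
    "what_is_wrong", "what_it_should_be", "evidence_path", "evidence_detail",
]


def append_finding(findings_path: Path, finding: dict) -> None:
    """Append one finding row; the file is closed (and flushed) on every call."""
    should_write_header = not findings_path.exists()
    with findings_path.open("a", newline="", encoding="utf-8") as findings_file:
        writer = csv.DictWriter(findings_file, fieldnames=FINDING_COLUMNS, delimiter="|")
        if should_write_header:
            writer.writeheader()
        writer.writerow(finding)


# Usage with a throwaway directory standing in for <tmp>.
with tempfile.TemporaryDirectory() as temp_dir:
    findings_path = Path(temp_dir) / "inconsistency-audit-example.csv"
    append_finding(findings_path, {
        "file_path": "SKILL.md",
        "line_number": 100,
        "rule_id": 9,
        "severity": "P0",
        "what_is_wrong": "required flags missing",
        "what_it_should_be": "add --owner and --repo",
        "evidence_path": "scripts/check_bugbot_ci.py",
        "evidence_detail": "add_argument calls with required=True",
    })
    written_lines = findings_path.read_text(encoding="utf-8").splitlines()
```

Opening the file in append mode per finding is slower than one batched write, but a crash midway still leaves every earlier finding on disk.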
54
+
55
+ <rule id="1" name="canonical_source_cross_reference" severity="P0">
56
+ For every canonical source found in Step 2, scan every other file in the diff for usages of the concepts that canonical source defines.
57
+
58
+ How: Take the canonical MCP payload doc. Extract every tool name and its parameter names. Now scan every markdown file for calls to those tools. Compare each call's parameter names against the canonical form. Flag every mismatch.
59
+
60
+ How: Take the canonical config file. Extract every constant name and its value. Now scan every other file for references to that constant or its value. Flag any file that hardcodes the value instead of referencing the constant. Flag any file that uses a different value for the same concept.
61
+
62
+ This rule caught 20 of Copilot's 43 findings. It is the highest-signal rule. Run it first. Run it thoroughly.
63
+ </rule>
64
+
65
+ <rule id="2" name="parameter_naming_convention" severity="P0">
66
+ Group all MCP tool calls found in docs by tool name. For each tool, check whether all call sites use the same parameter naming convention.
67
+
68
+ How: Extract every parameter name from every tool call in every doc. Group by tool name. If `add_issue_comment` appears in 10 files and 8 use `issueNumber` while 2 use `issue_number`, flag the 2 that break convention. The convention is whatever the majority of call sites use UNLESS a canonical source overrides it.
69
+
70
+ Specific check: Some MCP tool families mix conventions. `issue_read` uses `issue_number` (snake_case). `add_issue_comment` uses `issueNumber` (camelCase). A doc that uses snake_case for `add_issue_comment` is wrong even if snake_case works for `issue_read`. The convention is per-tool, not per-file or per-project.
71
+
72
+ Also check: Are there two different parameter names that mean the same thing? Example: `--number` vs `--pr-number` for a script that expects `--pr-number`. This is a naming convention conflict, not just a missing argument.
73
+ </rule>
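The majority-with-canonical-override check described in this rule could be sketched as follows. The grouping key (tool name plus the parameter name with underscores and case removed) is an assumption made for this illustration, and `find_convention_breaks` is a hypothetical name.

```python
from collections import Counter, defaultdict


def normalize_parameter(parameter_name: str) -> str:
    """Map issueNumber and issue_number to the same logical key."""
    return parameter_name.replace("_", "").lower()


def find_convention_breaks(all_call_sites, canonical_spellings=None):
    """Return (location, found_spelling, expected_spelling) for each break.

    all_call_sites: iterable of (tool_name, parameter_name, location).
    canonical_spellings: optional {(tool_name, normalized_name): spelling}
    taken from a canonical source; it overrides the majority vote.
    """
    canonical_spellings = canonical_spellings or {}
    spellings_by_group = defaultdict(list)
    for tool_name, parameter_name, location in all_call_sites:
        group_key = (tool_name, normalize_parameter(parameter_name))
        spellings_by_group[group_key].append((parameter_name, location))

    all_breaks = []
    for group_key, each_group in spellings_by_group.items():
        spelling_counts = Counter(spelling for spelling, _ in each_group)
        majority_spelling = spelling_counts.most_common(1)[0][0]
        expected_spelling = canonical_spellings.get(group_key, majority_spelling)
        for spelling, location in each_group:
            if spelling != expected_spelling:
                all_breaks.append((location, spelling, expected_spelling))
    return all_breaks


# Eight camelCase call sites, two snake_case ones, plus an unrelated tool.
all_call_sites = (
    [("add_issue_comment", "issueNumber", f"doc_{index}.md:10") for index in range(8)]
    + [("add_issue_comment", "issue_number", f"doc_{index}.md:10") for index in (8, 9)]
    + [("issue_read", "issue_number", "doc_0.md:20")]
)
convention_breaks = find_convention_breaks(all_call_sites)
```

Because the grouping is per tool, the snake_case `issue_read` call is left alone while the two snake_case `add_issue_comment` calls are flagged.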
74
+
75
+ <rule id="3" name="code_vs_docstring_behavior" severity="P0">
76
+ For every function in Python files whose docstring makes a claim about what the function returns or how it handles errors, read the actual implementation. Compare.
77
+
78
+ How: Find docstrings that say things like "returns X when Y", "raises Z on error", "exits 0 on success". Trace the code path for the condition the docstring describes. If the docstring says the function returns 0 when the path does not exist but the code catches the exception and returns a failure code, flag it.
79
+
80
+ How: Find docstrings that claim a side effect ("writes the file", "removes the directory"). Check whether the code actually does that thing unconditionally or only under certain conditions. Flag any missing precondition checks.
81
+
82
+ How: Look for error handling claims. "Tolerates already-removed worktrees" — does the code actually handle the case where the path is missing? Or does it crash? Read the except blocks. Read the conditionals. Verify every error-handling claim.
83
+ </rule>
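As a concrete instance of a docstring and implementation that agree, here is a hedged sketch of a removal helper that really does tolerate an already-removed path. The function name, the exit-code constants, and their values are hypothetical and chosen only for illustration.

```python
import shutil
import tempfile
from pathlib import Path

EXIT_CODE_SUCCESS = 0
EXIT_CODE_REMOVE_TREE_FAILURE = 1  # hypothetical value


def remove_tree(target_path: str) -> int:
    """Remove a directory tree. Returns 0 when the path never existed."""
    # The precondition check is what makes the docstring claim true:
    # without it, rmtree raises FileNotFoundError (an OSError subclass).
    if not Path(target_path).exists():
        return EXIT_CODE_SUCCESS
    try:
        shutil.rmtree(target_path)
    except OSError:
        return EXIT_CODE_REMOVE_TREE_FAILURE
    return EXIT_CODE_SUCCESS


with tempfile.TemporaryDirectory() as parent_dir:
    missing_result = remove_tree(str(Path(parent_dir) / "never_created"))
    existing_dir = Path(parent_dir) / "worktree"
    existing_dir.mkdir()
    (existing_dir / "note.txt").write_text("scratch", encoding="utf-8")
    existing_result = remove_tree(str(existing_dir))
    still_exists = existing_dir.exists()
```

An auditor applying this rule traces exactly that path: the claim names a condition (path absent), so the code must have a branch for that condition that produces the claimed result.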
84
+
85
+ <rule id="4" name="nonexistent_reference" severity="P0">
86
+ Every file path, script name, function name, tool name, or import referenced in a documentation file must resolve to something real.
87
+
88
+ How: Extract every backtick-quoted identifier that looks like a file path, script invocation, function call, or tool name. Try to resolve it:
89
+ - File paths: does the file exist in the diff or in the repo?
90
+ - Script names: does `python scripts/foo.py` point to a real file?
91
+ - Tool names: does `mcp__plugin_github_github__some_tool` correspond to an MCP tool in the manifest?
92
+ - Function names: does `some_function()` exist in any Python file in the diff?
93
+
94
+ Flag anything that cannot be resolved. A reference to `fetch_copilot_inline_comments.py` when no such file exists is a P0 finding. A reference to `EnterWorktree(path=...)` when no MCP tool by that name exists is a P0 finding.
95
+ </rule>
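The file-path part of this check can be approximated with a regular expression over backtick spans. This is a rough sketch under stated assumptions: it only matches spans that contain nothing but a path with one of a few extensions, and `find_unresolved_paths` plus the sample paths are hypothetical.

```python
import re

# Matches a backtick span that is only a path ending in a known extension.
BACKTICK_PATH_PATTERN = re.compile(r"`([\w./-]+\.(?:py|md|json|ya?ml|toml|ts|js))`")


def find_unresolved_paths(doc_text: str, all_known_paths: set[str]) -> list[tuple[int, str]]:
    """Return (line_number, referenced_path) for paths no known file satisfies."""
    all_unresolved = []
    for line_number, each_line in enumerate(doc_text.splitlines(), start=1):
        for referenced_path in BACKTICK_PATH_PATTERN.findall(each_line):
            # A reference resolves if it equals a known path or is a
            # trailing path segment of one (docs often omit the repo prefix).
            is_resolved = any(
                known_path == referenced_path
                or known_path.endswith("/" + referenced_path)
                for known_path in all_known_paths
            )
            if not is_resolved:
                all_unresolved.append((line_number, referenced_path))
    return all_unresolved


sample_doc = (
    "Run `scripts/check_bugbot_ci.py` first.\n"
    "Then call `fetch_copilot_inline_comments.py` for inline comments.\n"
)
known_paths = {"package/scripts/check_bugbot_ci.py"}
unresolved_paths = find_unresolved_paths(sample_doc, known_paths)
```

Script invocations such as `python scripts/foo.py`, tool names, and function names need their own extractors; this sketch covers bare paths only.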
96
+
97
+ <rule id="5" name="placeholder_detection" severity="P1">
98
+ Scan for text that looks like a template that was never filled in. This is text that cannot be executed as written.
99
+
100
+ How: Search for these patterns:
101
+ - Double spaces where a value should be: "Spawn — brief it: check for"
102
+ - Angle-bracket placeholders that are not in the standard `<O>` `<R>` `<N>` `<SHA>` set
103
+ - Ellipsis used as placeholder: "...", "…"
104
+ - "TODO", "FIXME", "HACK", "XXX" in production documentation
105
+ - "TKTK", "TK", "placeholder"
106
+ - Missing arguments after a flag: `--flag` with no value following
107
+ - Sentences that trail off: text ending in ", or", "like", "such as" without completing
108
+
109
+ Flag every instance. The severity depends on whether the placeholder makes the instruction unexecutable (P0) or just incomplete (P1).
110
+ </rule>
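Several of the patterns listed in this rule are mechanical enough to express as regular expressions. The sketch below covers three of them (marker words, non-standard angle-bracket placeholders, and sentences that trail off); the pattern definitions and `detect_placeholders` are assumptions for illustration, and the angle-bracket check will also match XML-style tags, so its hits need review.

```python
import re

STANDARD_PLACEHOLDERS = {"<O>", "<R>", "<N>", "<SHA>"}
MARKER_PATTERN = re.compile(r"\b(TODO|FIXME|HACK|XXX|TKTK|TK)\b")
ANGLE_PATTERN = re.compile(r"<[A-Za-z_][\w-]*>")
TRAILING_PATTERN = re.compile(r"(, or|\blike|\bsuch as)\s*$")


def detect_placeholders(doc_text: str) -> list[tuple[int, str, str]]:
    """Return (line_number, kind, matched_text) for each suspected placeholder."""
    all_hits = []
    for line_number, each_line in enumerate(doc_text.splitlines(), start=1):
        for marker_match in MARKER_PATTERN.finditer(each_line):
            all_hits.append((line_number, "marker", marker_match.group(0)))
        for angle_match in ANGLE_PATTERN.finditer(each_line):
            if angle_match.group(0) not in STANDARD_PLACEHOLDERS:
                all_hits.append((line_number, "angle_bracket", angle_match.group(0)))
        if TRAILING_PATTERN.search(each_line):
            all_hits.append((line_number, "trails_off", each_line.strip()))
    return all_hits


sample_doc = (
    "Run with --sha <SHA> and --owner <OWNER_NAME>\n"
    "TODO: fill this in\n"
    "Use a tool such as\n"
)
placeholder_hits = detect_placeholders(sample_doc)
```

Deciding between P0 and P1 still requires reading the surrounding instruction, so the sketch reports candidates rather than assigning severity.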
111
+
112
+ <rule id="6" name="cross_file_contradiction" severity="P0">
113
+ When two files describe the same operation, workflow step, or API call, their descriptions must agree. If they contradict, at least one is wrong.
114
+
115
+ How: Group descriptions by topic. Example topics: "how to post a review", "how to trigger bugbot", "how to request Copilot review", "what the audit workflow is", "who owns PR posting".
116
+
117
+ For each topic, extract the description from every file that mentions it. Compare:
118
+ - Who performs the action? (bugfind? consolidator? orchestrator?)
119
+ - What API is used? (batched POST with comments[]? three-step pending review?)
120
+ - What parameters are passed?
121
+ - What is the workflow order?
122
+
123
+ Flag any contradiction. A doc that says `pull_request_review_write` accepts `comments[]` contradicts a doc that says it does not. Both cannot be right. Flag both files with the contradiction — do not assume which is correct unless a canonical source resolves it.
124
+
125
+ Specific check: Look for ownership contradictions. If PROMPTS.md says the consolidator posts reviews but CONSTRAINTS.md says bugfind posts reviews, flag it. Workflow ownership contradictions cause agents to follow the wrong contract.
126
+ </rule>
127
+
128
+ <rule id="7" name="stale_reference" severity="P1">
129
+ When a documentation file describes a feature, branch, phase, or pattern by name, verify that the companion implementation files still contain that named thing.
130
+
131
+ How: Extract named concepts from decision branches and gotcha sections in docs. Examples: "inline_lag", "BUGTEAM", "COPILOT_WAIT", "parallel auditor".
132
+
133
+ For each concept, search the implementation files (Python scripts, config files, workflow definitions) for that same name. If a doc describes a four-branch decision tree that includes an "inline_lag" branch, but no implementation file contains the string "inline_lag" or "inline_lag_streak", the doc references a removed feature.
134
+
135
+ Flag as P1: doc mentions something that no longer exists in implementation. Flag as P0: doc decision tree includes a branch whose predicate can never be reached because the implementation was removed.
136
+ </rule>
137
+
138
+ <rule id="8" name="cross_platform_assumption" severity="P0">
139
+ Code that handles platform-specific behavior must not break on other platforms.
140
+
141
+ How: When you see Windows-specific patterns (ReadOnly attribute handling, backslash paths, `os.chmod` with `stat.S_IWRITE`, `pywin32` imports), check:
142
+ - Does the code run on Linux/Mac without crashing?
143
+ - Does `os.chmod(path, stat.S_IWRITE)` produce the right permissions on POSIX? (It sets the mode to `0o200`, owner write only, stripping every read and execute bit. A directory without its execute bit cannot be traversed.)
144
+ - Are path separators hardcoded as backslashes?
145
+ - Are there `sys.platform` checks that handle all platforms?
146
+
147
+ Flag anything that would fail or produce wrong behavior on a non-Windows system. Also flag the reverse: POSIX-only code that would fail on Windows.
148
+ </rule>
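The `os.chmod` concern above can be made concrete. The sketch below shows one portable alternative: OR the write bit into the existing mode instead of replacing the mode outright. `make_writable` is a hypothetical helper name; the `os` and `stat` calls are standard library.

```python
import os
import stat
import tempfile


def make_writable(path: str) -> None:
    """Add the owner-write bit while keeping every existing permission bit."""
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, current_mode | stat.S_IWRITE)


# stat.S_IWRITE is a synonym for S_IWUSR, so using it as the whole mode
# leaves a POSIX file or directory with owner-write and nothing else.
write_only_mode = stat.S_IWRITE

with tempfile.TemporaryDirectory() as temp_dir:
    os.chmod(temp_dir, 0o555)  # read and traverse, but not writable
    make_writable(temp_dir)
    repaired_mode = stat.S_IMODE(os.stat(temp_dir).st_mode)
```

On Windows `os.chmod` only toggles the read-only attribute, so both forms behave the same there; the difference appears only on POSIX, which is why the bug tends to go unnoticed by Windows-first authors.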
149
+
150
+ <rule id="9" name="script_invocation_correctness" severity="P0">
151
+ Every `python scripts/foo.py --*` invocation in documentation must match the script's actual argparse interface.
152
+
153
+ How: From your manifest, you have every script's argparse arguments and every documented invocation. For each documented invocation:
154
+ - Check that every flag name matches the script's `add_argument("--name")` exactly.
155
+ - Check that every required argument (from the manifest's `required=True` list) is present in the invocation.
156
+ - Check that no argument appears that the script does not accept.
157
+ - Check that the invocation includes every positional argument the script expects, in the correct order.
158
+
159
+ Flag missing required args as P0. Flag wrong flag names as P0. Flag missing optional args as P2 (the script may have sensible defaults).
160
+
161
+ Important: argparse converts `--pr-number` to `pr_number` as the attribute name. The flag in the invocation must be `--pr-number`, not `--pr_number`, not `--number`, and not a look-alike typed with the wrong dash characters (for example, an en dash or a single hyphen). Check the exact flag string.
162
+ </rule>
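Given the flag lists from the manifest, the required-flag and unknown-flag checks reduce to set arithmetic. This is a minimal sketch: `check_invocation` is a hypothetical name, the accepted-flag set is assumed for the example, and positional arguments are not covered. It demands exact spellings, which is stricter than argparse itself, since argparse accepts unambiguous prefixes of long options by default.

```python
import shlex


def check_invocation(
    command: str,
    all_accepted_flags: set[str],
    all_required_flags: set[str],
) -> list[str]:
    """Compare one documented command line against a script's known flags."""
    all_tokens = shlex.split(command)
    # Support both "--flag value" and "--flag=value" spellings.
    used_flags = {
        each_token.split("=", 1)[0]
        for each_token in all_tokens
        if each_token.startswith("--")
    }
    all_problems = []
    for missing_flag in sorted(all_required_flags - used_flags):
        all_problems.append(f"P0 missing required flag {missing_flag}")
    for unknown_flag in sorted(used_flags - all_accepted_flags):
        all_problems.append(f"P0 unknown flag {unknown_flag}")
    return all_problems


invocation_problems = check_invocation(
    "python scripts/check_bugbot_ci.py --check-active --sha <SHA>",
    all_accepted_flags={"--owner", "--repo", "--check-active", "--sha"},
    all_required_flags={"--owner", "--repo"},
)
```

An underscore variant such as `--pr_number` falls out of the same check as an unknown flag, because the comparison is on the exact flag string.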
163
+
164
+ <rule id="10" name="value_consistency" severity="P1">
165
+ When the same named concept has a numeric value in multiple files, all files must use the same value. If they differ, at least one is wrong.
166
+
167
+ How: From your manifest, you have every constant, threshold, and timeout with its surrounding context. Group by semantic concept:
168
+ - "bugbot wakeup delay" — find every mention of "360s", "270s", "90s" tied to bugbot pacing
169
+ - "copilot wait count" — find every mention of "3 wakeups", "copilot_wait_count"
170
+ - "recent commit threshold" — find every mention of "60 seconds", "60s"
171
+
172
+ For each group, flag any file whose value differs from the majority. If a canonical config file defines the value, flag any file that hardcodes a different value instead of importing from config.
173
+
174
+ Note: Some values legitimately differ by context — a 360s wakeup for bugbot vs a 90s wakeup for inline-lag are different concepts, not a conflict. Group by the full semantic: "bugbot post-trigger wakeup" is different from "inline-lag retry delay". Only flag values that claim to represent the SAME thing.
175
+ </rule>
176
+ </step>
177
+
178
+ <step id="4" name="produce_summary">
179
+ After all rules have run against all files, produce the final report.
180
+
181
+ First, print a summary grouped by rule. For each rule:
182
+ ```
183
+ Rule N: <rule_name> — X findings (P0: A, P1: B, P2: C)
184
+ ```
185
+
186
+ Then, print the top 5 most impactful findings with their full detail.
187
+
188
+ Then, print the canonical sources discovered during Step 2. This tells the reader what you used as ground truth.
189
+
190
+ Finally, print the path to the CSV file with all findings.
191
+
192
+ The summary goes to stdout. The full CSV goes to the temp file. The manifest JSON stays on disk for the reader to inspect.
193
+ </step>
194
+ </execution_workflow>
195
+
196
+ <constraints>
197
+ <constraint>Read every file completely. Do not skim. Do not skip any file or any line.</constraint>
198
+ <constraint>Write findings to the temp file immediately. Do not accumulate them in memory and batch-write at the end. You will forget things.</constraint>
199
+ <constraint>Every finding must cite the file and line of the problem AND the file and line of the evidence that proves it is a problem. No floating claims.</constraint>
200
+ <constraint>When two files contradict each other, flag BOTH files. Do not guess which is correct unless a canonical source resolves it.</constraint>
201
+ <constraint>If you cannot determine the correct value or form, flag the inconsistency and mark it "unresolvable — no canonical source found". Do not guess.</constraint>
202
+ <constraint>Do not skip a rule because it seems unlikely to find anything. Run it anyway. The highest-signal findings often come from rules you think will be empty.</constraint>
203
+ <constraint>Every detection rule must run against every file. Rules are independent. A file that triggers rule 1 might also trigger rule 6. Run both.</constraint>
204
+ </constraints>
205
+
206
+ <output_format>
207
+ Findings CSV at `<tmp>/inconsistency-audit-<timestamp>.csv`:
208
+ ```
209
+ file_path | line_number | rule_id | severity | what_is_wrong | what_it_should_be | evidence_path | evidence_detail
210
+ ```
211
+
212
+ Summary to stdout:
213
+ ```
214
+ ====== DIFF INCONSISTENCY AUDIT ======
215
+ Files audited: <N>
216
+ Canonical sources identified: <list>
217
+ Total findings: <N>
218
+
219
+ By rule:
220
+ Rule 1 — canonical_source_cross_reference: X (P0: A, P1: B, P2: C)
221
+ Rule 2 — parameter_naming_convention: X (P0: A, P1: B, P2: C)
222
+ ...
223
+
224
+ By severity:
225
+ P0 (runtime failure): X
226
+ P1 (confusing or wrong): Y
227
+ P2 (cleanup): Z
228
+
229
+ Top findings:
230
+ 1. file:line — [P0] — what is wrong — what it should be
231
+ 2. ...
232
+
233
+ Full report: <tmp>/inconsistency-audit-<timestamp>.csv
234
+ Manifest: <tmp>/audit-manifest-<timestamp>.json
235
+ ====== END AUDIT ======
236
+ ```
237
+ </output_format>
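The "By severity" portion of the summary can be derived directly from the finding rows. A small sketch, assuming each finding is a dict with a `severity` key; `format_severity_summary` is a hypothetical helper and the labels are copied from the format above.

```python
from collections import Counter

SEVERITY_LABELS = {
    "P0": "runtime failure",
    "P1": "confusing or wrong",
    "P2": "cleanup",
}


def format_severity_summary(all_findings: list[dict]) -> list[str]:
    """Build the 'By severity' lines, including zero counts."""
    severity_counts = Counter(each_finding["severity"] for each_finding in all_findings)
    summary_lines = ["By severity:"]
    for severity, label in SEVERITY_LABELS.items():
        summary_lines.append(f"{severity} ({label}): {severity_counts[severity]}")
    return summary_lines


example_findings = [
    {"rule_id": 9, "severity": "P0"},
    {"rule_id": 2, "severity": "P0"},
    {"rule_id": 7, "severity": "P1"},
]
severity_summary = format_severity_summary(example_findings)
```

Printing a zero for an empty severity level, rather than omitting the line, lets the reader confirm the level was counted and not skipped.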
238
+
239
+ <illustrations>
240
+ <example>
241
+ <title>Wrong argument name in script invocation</title>
242
+ <finding>
243
+ file_path: SKILL.md
244
+ line_number: 100
245
+ rule_id: 9
246
+ severity: P0
247
+ what_is_wrong: check_bugbot_ci.py --check-active --sha <SHA>
248
+ what_it_should_be: check_bugbot_ci.py --owner <O> --repo <R> --check-active --sha <SHA>
249
+ evidence_path: scripts/check_bugbot_ci.py
250
+ evidence_detail: lines 134-136 define add_argument("--owner", required=True) and add_argument("--repo", required=True). Both required args are missing from the documented invocation.
251
+ </finding>
252
+ <why_this_matters>If an agent reads SKILL.md and copies the command exactly, the script fails with "the following arguments are required: --owner, --repo". The agent cannot trigger bugbot.</why_this_matters>
253
+ </example>
254
+
255
+ <example>
256
+ <title>Parameter naming convention mismatch across tool family</title>
257
+ <finding>
258
+ file_path: SKILL.md
259
+ line_number: 162
260
+ rule_id: 2
261
+ severity: P0
262
+ what_is_wrong: add_issue_comment(owner="OWNER", repo="REPO", issue_number=NUMBER, body="bugbot run")
263
+ what_it_should_be: add_issue_comment(owner="OWNER", repo="REPO", issueNumber=NUMBER, body="bugbot run")
264
+ evidence_path: _shared/pr-loop/gh-payloads.md
265
+ evidence_detail: lines 67-72 document add_issue_comment with issueNumber (camelCase). The tool uses camelCase for this parameter, not snake_case. issue_read uses issue_number (snake_case) — the two tools have different conventions.
266
+ </finding>
267
+ <why_this_matters>If the MCP tool expects issueNumber and the doc says issue_number, the tool call fails at runtime. The agent gets an error instead of posting the bugbot trigger. Twenty files had this same bug because they all copied from each other.</why_this_matters>
268
+ </example>
269
+
270
+ <example>
271
+ <title>Docstring claim contradicts implementation</title>
272
+ <finding>
273
+ file_path: scripts/remove_tree.py
274
+ line_number: 15
275
+ rule_id: 3
276
+ severity: P0
277
+ what_is_wrong: Docstring says "returns 0 when the path never existed". Implementation calls shutil.rmtree(target_path) which raises FileNotFoundError, caught as OSError, returning EXIT_CODE_REMOVE_TREE_FAILURE.
278
+ what_it_should_be: Either check Path(target_path).exists() before calling rmtree and return 0 when absent, or update the docstring to say "returns failure when the path never existed".
279
+ evidence_path: scripts/remove_tree.py
280
+ evidence_detail: line 15 docstring claim vs lines 22-28 implementation — the except OSError handler returns EXIT_CODE_REMOVE_TREE_FAILURE for FileNotFoundError.
281
+ </finding>
282
+ <why_this_matters>Teardown scripts that call remove_tree() on an already-cleaned temp directory will see a failure return code. The orchestrator treats this as a real error and may abort cleanup or report a false failure to the user.</why_this_matters>
283
+ </example>
284
+
285
+ <example>
286
+ <title>Stale feature reference in gotcha section</title>
287
+ <finding>
288
+ file_path: SKILL.md
289
+ line_number: 34
290
+ rule_id: 7
291
+ severity: P1
292
+ what_is_wrong: Gotcha describes "inline_lag" as a distinct case with its own streak counter (inline_lag_streak) and 90s wait, but no implementation file in the diff defines inline_lag_streak or has an inline_lag decision branch.
293
+ what_it_should_be: Remove the gotcha entry. The inline_lag feature was removed from per-tick.md decision branches but the gotcha was left behind.
294
+ evidence_path: reference/per-tick.md
295
+ evidence_detail: lines 83-106 list the BUGBOT decision branches. None of them is an inline_lag branch, and no implementation file in the diff contains inline_lag_streak, so the gotcha describes a feature that no longer exists.
296
+ </finding>
297
+ <why_this_matters>An agent reading the gotcha thinks inline_lag is a real condition it needs to handle. It may misclassify a dirty review as transient lag and wait instead of fixing. Stale gotchas are worse than no gotchas because they teach wrong behavior.</why_this_matters>
298
+ </example>
299
+
300
+ <example>
301
+ <title>Placeholder text in template file</title>
302
+ <finding>
303
+ file_path: bugteam/obstacles/self-population.md
304
+ line_number: 5
305
+ rule_id: 5
306
+ severity: P1
307
+ what_is_wrong: "Spawn — brief it: check for an open PR " contains double spaces where an agent invocation, repo identifier, and PR number should be. This text is not executable.
308
+ what_it_should_be: Fill in the concrete Agent call or remove the self-population section until populated.
309
+ evidence_path: bugteam/obstacles/self-population.md
310
+ evidence_detail: The template has placeholder spacing with no variable substitutions defined.
311
+ </finding>
312
+ <why_this_matters>Obstacle files are reference material agents read during audits. An obstacle that says "Spawn — brief it: check for" with no target gives the agent no actionable instruction. It wastes context and confuses the reader.</why_this_matters>
313
+ </example>
314
+ </illustrations>
315
+
316
+ <gotchas>
317
+ <gotcha>The highest-signal rule is canonical-source cross-reference. Do it first. Do it slowly. It caught 20 of Copilot's 43 findings. Missing it means missing nearly half the bugs.</gotcha>
318
+ <gotcha>Parameter naming conventions are per-tool, not per-project. Do not assume all GitHub MCP tools use the same convention. Some use camelCase, some use snake_case. Check each tool individually.</gotcha>
319
+ <gotcha>When a finding appears in many files with the same wrong pattern, flag every instance individually. Do not say "this is wrong in 20 files" and move on. List all 20 files with their lines. The reader needs to fix each one.</gotcha>
320
+ <gotcha>Template files and obstacle files are often skipped because they feel "generated" or "low priority". They are not. Copilot found 6 issues in template files that human auditors never opened.</gotcha>
321
+ <gotcha>A manifest written to a temp file is not optional. If you try to hold all script signatures, all MCP tool names, all constants, and all concept names in memory, you will miss things. Write the manifest. Read from it during detection.</gotcha>
322
+ </gotchas>
323
+ </prompt_artifact>