claude-dev-env 1.38.1 → 1.40.0

Files changed (282)
  1. package/CLAUDE.md +10 -36
  2. package/_shared/pr-loop/audit-reply-template.md +147 -0
  3. package/_shared/pr-loop/fix-protocol.md +25 -4
  4. package/_shared/pr-loop/gh-payloads.md +37 -50
  5. package/_shared/pr-loop/scripts/code_rules_gate.py +0 -60
  6. package/_shared/pr-loop/scripts/config/post_audit_thread_constants.py +199 -0
  7. package/_shared/pr-loop/scripts/config/reviews_disabled_constants.py +8 -0
  8. package/_shared/pr-loop/scripts/post_audit_thread.py +1242 -0
  9. package/_shared/pr-loop/scripts/preflight.py +129 -2
  10. package/_shared/pr-loop/scripts/reviews_disabled.py +59 -0
  11. package/_shared/pr-loop/scripts/tests/test_code_rules_gate.py +0 -19
  12. package/_shared/pr-loop/scripts/tests/test_post_audit_thread.py +1116 -0
  13. package/_shared/pr-loop/scripts/tests/test_post_audit_thread_constants.py +127 -0
  14. package/_shared/pr-loop/scripts/tests/test_preflight.py +41 -0
  15. package/_shared/pr-loop/scripts/tests/test_reviews_disabled.py +36 -0
  16. package/_shared/pr-loop/state-schema.md +1 -1
  17. package/agents/clean-coder.md +2 -2
  18. package/agents/pr-description-writer.md +150 -52
  19. package/bin/install.mjs +6 -7
  20. package/bin/install.test.mjs +8 -0
  21. package/commands/doc-gist.md +16 -0
  22. package/commands/plan.md +0 -2
  23. package/commands/review-plan.md +1 -1
  24. package/docs/CODE_RULES.md +122 -2
  25. package/docs/PR_DESCRIPTION_GUIDE.md +127 -64
  26. package/hooks/blocking/bot_mention_comment_blocker.py +75 -0
  27. package/hooks/blocking/code_rules_enforcer.py +1143 -129
  28. package/hooks/blocking/convergence_gate_blocker.py +130 -0
  29. package/hooks/blocking/destructive_command_blocker.py +74 -0
  30. package/hooks/blocking/gh_body_arg_blocker.py +30 -0
  31. package/hooks/blocking/md_to_html_blocker.py +119 -0
  32. package/hooks/blocking/pr_description_enforcer.py +57 -22
  33. package/hooks/blocking/test_bot_mention_comment_blocker.py +131 -0
  34. package/hooks/blocking/test_code_rules_enforcer.py +21 -0
  35. package/hooks/blocking/test_code_rules_enforcer_any_exempt_files.py +70 -0
  36. package/hooks/blocking/test_code_rules_enforcer_any_imports_and_cast.py +92 -0
  37. package/hooks/blocking/test_code_rules_enforcer_banned_import_alias.py +143 -0
  38. package/hooks/blocking/test_code_rules_enforcer_banned_prefixes.py +152 -0
  39. package/hooks/blocking/test_code_rules_enforcer_bare_except.py +120 -0
  40. package/hooks/blocking/test_code_rules_enforcer_boundary_types.py +175 -0
  41. package/hooks/blocking/test_code_rules_enforcer_cap_meta.py +0 -1
  42. package/hooks/blocking/test_code_rules_enforcer_collection_prefix.py +50 -0
  43. package/hooks/blocking/test_code_rules_enforcer_docstring_format.py +255 -0
  44. package/hooks/blocking/test_code_rules_enforcer_inline_tuple_string_magic.py +130 -0
  45. package/hooks/blocking/test_code_rules_enforcer_stub_implementations.py +141 -0
  46. package/hooks/blocking/test_code_rules_enforcer_test_branching.py +143 -0
  47. package/hooks/blocking/test_code_rules_enforcer_thin_wrapper_files.py +169 -0
  48. package/hooks/blocking/test_code_rules_enforcer_todo_markers.py +99 -0
  49. package/hooks/blocking/test_code_rules_enforcer_typed_dict_pairs.py +141 -0
  50. package/hooks/blocking/test_convergence_gate_blocker.py +63 -0
  51. package/hooks/blocking/test_destructive_command_blocker.py +146 -0
  52. package/hooks/blocking/test_destructive_command_blocker_no_verify.py +102 -0
  53. package/hooks/blocking/test_gh_body_arg_blocker.py +45 -0
  54. package/hooks/blocking/test_md_to_html_blocker.py +317 -0
  55. package/hooks/blocking/test_pr_description_enforcer.py +69 -8
  56. package/hooks/config/any_type_config.py +7 -0
  57. package/hooks/config/banned_identifiers_constants.py +11 -0
  58. package/hooks/config/blocking_check_limits.py +38 -0
  59. package/hooks/config/bot_mention_comment_blocker_constants.py +20 -0
  60. package/hooks/config/code_rules_enforcer_constants.py +53 -0
  61. package/hooks/config/convergence_branch_constants.py +9 -0
  62. package/hooks/config/doc_gist_auto_publish_constants.py +18 -0
  63. package/hooks/config/html_companion_constants.py +20 -0
  64. package/hooks/config/inline_tuple_string_magic_constants.py +22 -0
  65. package/hooks/config/pr_description_enforcer_constants.py +14 -0
  66. package/hooks/config/test_banned_identifiers_constants.py +17 -0
  67. package/hooks/hooks.json +28 -20
  68. package/hooks/pyproject.toml +69 -0
  69. package/hooks/validators/mypy_integration.py +47 -1
  70. package/hooks/validators/run_all_validators.py +3 -3
  71. package/hooks/validators/test_mypy_integration.py +50 -1
  72. package/hooks/workflow/doc_gist_auto_publish.py +144 -0
  73. package/hooks/workflow/md_to_html_companion.py +365 -0
  74. package/hooks/workflow/test_doc_gist_auto_publish.py +117 -0
  75. package/hooks/workflow/test_md_to_html_companion.py +452 -0
  76. package/package.json +1 -1
  77. package/rules/gh-body-file.md +2 -0
  78. package/scripts/Install-SweepEmptyDirs.ps1 +111 -0
  79. package/scripts/check.ps1 +106 -0
  80. package/scripts/config/timing.py +11 -0
  81. package/scripts/sweep_empty_dirs.py +138 -0
  82. package/scripts/sync_to_cursor/rules.py +1 -1
  83. package/scripts/test_sweep_empty_dirs.py +183 -0
  84. package/skills/_shared/pr-loop/prompts/pr-consistency-audit.xml +323 -0
  85. package/skills/_shared/pr-loop/scripts/_cli_utils.py +22 -0
  86. package/skills/_shared/pr-loop/scripts/_path_resolver.py +165 -0
  87. package/skills/_shared/pr-loop/scripts/_xml_utils.py +20 -0
  88. package/skills/_shared/pr-loop/scripts/build_audit_prompt.py +182 -0
  89. package/skills/_shared/pr-loop/scripts/build_fix_prompt.py +185 -0
  90. package/skills/_shared/pr-loop/scripts/config/__init__.py +0 -0
  91. package/skills/_shared/pr-loop/scripts/config/path_resolver_constants.py +78 -0
  92. package/skills/_shared/pr-loop/scripts/init_loop_state.py +135 -0
  93. package/skills/_shared/pr-loop/scripts/teardown_worktrees.py +175 -0
  94. package/skills/_shared/pr-loop/scripts/write_audit_outcomes.py +182 -0
  95. package/skills/_shared/pr-loop/scripts/write_fix_outcomes.py +206 -0
  96. package/skills/bugteam/CONSTRAINTS.md +21 -22
  97. package/skills/bugteam/EXAMPLES.md +3 -3
  98. package/skills/bugteam/PROMPTS.md +227 -67
  99. package/skills/bugteam/SKILL.md +132 -455
  100. package/skills/bugteam/reference/README.md +1 -1
  101. package/skills/bugteam/reference/audit-and-teammates.md +112 -39
  102. package/skills/bugteam/reference/audit-contract.md +4 -22
  103. package/skills/bugteam/reference/copilot-gap-analysis.md +8 -5
  104. package/skills/bugteam/reference/design-rationale.md +2 -2
  105. package/skills/bugteam/reference/github-pr-reviews.md +50 -57
  106. package/skills/bugteam/reference/obstacles/audit-assign-ids.md +13 -0
  107. package/skills/bugteam/reference/obstacles/audit-capture-excerpts.md +13 -0
  108. package/skills/bugteam/reference/obstacles/audit-walk-categories.md +13 -0
  109. package/skills/bugteam/reference/obstacles/audit-write-xml.md +13 -0
  110. package/skills/bugteam/reference/obstacles/fix-append-summary.md +13 -0
  111. package/skills/bugteam/reference/obstacles/fix-apply-fixes.md +13 -0
  112. package/skills/bugteam/reference/obstacles/fix-git-add-commit.md +13 -0
  113. package/skills/bugteam/reference/obstacles/fix-git-push.md +13 -0
  114. package/skills/bugteam/reference/obstacles/fix-post-reply.md +13 -0
  115. package/skills/bugteam/reference/obstacles/fix-publish-summary.md +13 -0
  116. package/skills/bugteam/reference/obstacles/fix-py-compile.md +13 -0
  117. package/skills/bugteam/reference/obstacles/fix-read-files.md +13 -0
  118. package/skills/bugteam/reference/obstacles/fix-resolve-thread.md +13 -0
  119. package/skills/bugteam/reference/obstacles/fix-test-suite.md +13 -0
  120. package/skills/bugteam/reference/obstacles/fix-violation-count.md +13 -0
  121. package/skills/bugteam/reference/obstacles/fix-write-xml.md +13 -0
  122. package/skills/bugteam/reference/team-setup.md +111 -9
  123. package/skills/bugteam/reference/teardown-publish-permissions.md +39 -8
  124. package/skills/bugteam/scripts/README.md +60 -0
  125. package/skills/bugteam/scripts/_claude_permissions_common.py +358 -0
  126. package/skills/bugteam/scripts/bugteam_code_rules_gate.py +976 -0
  127. package/skills/bugteam/scripts/bugteam_fix_hookspath.py +375 -0
  128. package/skills/bugteam/scripts/bugteam_preflight.py +328 -0
  129. package/skills/bugteam/scripts/config/bugteam_code_rules_gate_constants.py +25 -0
  130. package/skills/bugteam/scripts/config/bugteam_fix_hookspath_constants.py +26 -0
  131. package/skills/bugteam/scripts/config/bugteam_preflight_constants.py +35 -0
  132. package/skills/bugteam/scripts/config/claude_permissions_common_constants.py +20 -0
  133. package/skills/bugteam/scripts/config/probe_code_rules_enforcer_check_constants.py +12 -0
  134. package/skills/bugteam/scripts/config/windows_safe_rmtree_constants.py +7 -0
  135. package/skills/bugteam/scripts/grant_project_claude_permissions.py +175 -0
  136. package/skills/bugteam/scripts/probe_code_rules_enforcer_check.py +107 -0
  137. package/skills/bugteam/scripts/revoke_project_claude_permissions.py +220 -0
  138. package/skills/bugteam/scripts/test__claude_permissions_common.py +112 -0
  139. package/skills/bugteam/scripts/test_bugteam_code_rules_gate.py +400 -0
  140. package/skills/bugteam/scripts/test_bugteam_fix_hookspath.py +384 -0
  141. package/skills/bugteam/scripts/test_bugteam_preflight.py +309 -0
  142. package/skills/bugteam/scripts/test_claude_permissions_common.py +195 -0
  143. package/skills/bugteam/scripts/test_grant_project_claude_permissions.py +55 -0
  144. package/skills/bugteam/scripts/test_probe_code_rules_enforcer_check.py +76 -0
  145. package/skills/bugteam/scripts/test_revoke_project_claude_permissions.py +55 -0
  146. package/skills/bugteam/scripts/test_windows_safe_rmtree.py +108 -0
  147. package/skills/bugteam/scripts/windows_safe_rmtree.py +100 -0
  148. package/skills/bugteam/test_skill_additions.py +1 -11
  149. package/skills/code/SKILL.md +176 -0
  150. package/skills/copilot-review/SKILL.md +16 -0
  151. package/skills/doc-gist/SKILL.md +99 -0
  152. package/skills/doc-gist/references/examples/01-exploration-code-approaches.html +453 -0
  153. package/skills/doc-gist/references/examples/02-exploration-visual-designs.html +515 -0
  154. package/skills/doc-gist/references/examples/03-code-review-pr.html +638 -0
  155. package/skills/doc-gist/references/examples/04-code-understanding.html +491 -0
  156. package/skills/doc-gist/references/examples/05-design-system.html +629 -0
  157. package/skills/doc-gist/references/examples/06-component-variants.html +605 -0
  158. package/skills/doc-gist/references/examples/07-prototype-animation.html +455 -0
  159. package/skills/doc-gist/references/examples/08-prototype-interaction.html +396 -0
  160. package/skills/doc-gist/references/examples/09-slide-deck.html +592 -0
  161. package/skills/doc-gist/references/examples/10-svg-illustrations.html +492 -0
  162. package/skills/doc-gist/references/examples/11-status-report.html +528 -0
  163. package/skills/doc-gist/references/examples/12-incident-report.html +596 -0
  164. package/skills/doc-gist/references/examples/13-flowchart-diagram.html +395 -0
  165. package/skills/doc-gist/references/examples/14-research-feature-explainer.html +381 -0
  166. package/skills/doc-gist/references/examples/15-research-concept-explainer.html +368 -0
  167. package/skills/doc-gist/references/examples/16-implementation-plan.html +702 -0
  168. package/skills/doc-gist/references/examples/17-pr-writeup.html +595 -0
  169. package/skills/doc-gist/references/examples/18-editor-triage-board.html +573 -0
  170. package/skills/doc-gist/references/examples/19-editor-feature-flags.html +663 -0
  171. package/skills/doc-gist/references/examples/20-editor-prompt-tuner.html +722 -0
  172. package/skills/doc-gist/references/examples/README.md +5 -0
  173. package/skills/doc-gist/scripts/config/__init__.py +0 -0
  174. package/skills/doc-gist/scripts/config/gist_upload_constants.py +16 -0
  175. package/skills/doc-gist/scripts/gist_upload.py +177 -0
  176. package/skills/doc-gist/scripts/test_gist_upload.py +51 -0
  177. package/skills/findbugs/SKILL.md +96 -2
  178. package/skills/monitor-open-prs/SKILL.md +14 -32
  179. package/skills/monitor-open-prs/test_skill_contract.py +0 -11
  180. package/skills/pr-consistency-audit/SKILL.md +112 -0
  181. package/skills/pr-consistency-audit/reference/detection-rules.md +96 -0
  182. package/skills/pr-consistency-audit/reference/illustrations.md +78 -0
  183. package/skills/pr-converge/SKILL.md +229 -23
  184. package/skills/pr-converge/config/__init__.py +0 -0
  185. package/skills/pr-converge/config/constants.py +63 -0
  186. package/skills/pr-converge/reference/convergence-gates.md +138 -44
  187. package/skills/pr-converge/reference/examples.md +43 -11
  188. package/skills/pr-converge/reference/fix-protocol.md +6 -5
  189. package/skills/pr-converge/reference/ground-rules.md +5 -3
  190. package/skills/pr-converge/reference/multi-pr-orchestration.md +44 -19
  191. package/skills/pr-converge/reference/obstacles/fix-post-replies.md +13 -0
  192. package/skills/pr-converge/reference/obstacles/fix-publish-summary.md +13 -0
  193. package/skills/pr-converge/reference/obstacles/fix-push.md +13 -0
  194. package/skills/pr-converge/reference/obstacles/fix-read-filelines.md +13 -0
  195. package/skills/pr-converge/reference/obstacles/fix-reset-state.md +13 -0
  196. package/skills/pr-converge/reference/obstacles/fix-resolve-threads.md +13 -0
  197. package/skills/pr-converge/reference/obstacles/fix-spawn-clean-coder.md +13 -0
  198. package/skills/pr-converge/reference/obstacles/fix-stage-commit.md +13 -0
  199. package/skills/pr-converge/reference/obstacles/fix-trigger-bugbot.md +13 -0
  200. package/skills/pr-converge/reference/obstacles/fix-write-test.md +13 -0
  201. package/skills/pr-converge/reference/per-tick.md +107 -31
  202. package/skills/pr-converge/reference/state-schema.md +22 -1
  203. package/skills/pr-converge/reference/stop-conditions.md +9 -7
  204. package/skills/pr-converge/scripts/README.md +34 -46
  205. package/skills/pr-converge/scripts/check_bugbot_ci.py +279 -0
  206. package/skills/pr-converge/scripts/check_convergence.py +497 -0
  207. package/skills/pr-converge/scripts/check_pending_reviews.py +154 -0
  208. package/skills/pr-converge/scripts/config/pr_converge_constants.py +118 -0
  209. package/skills/pr-converge/scripts/fetch_copilot_reviews.py +134 -0
  210. package/skills/pr-converge/scripts/post_fix_reply.py +168 -0
  211. package/skills/pr-converge/scripts/test_check_bugbot_ci.py +312 -0
  212. package/skills/pr-converge/workflows/schedule-wakeup-loop.md +5 -12
  213. package/skills/qbug/SKILL.md +157 -27
  214. package/skills/session-log/SKILL.md +216 -114
  215. package/skills/session-tidy/SKILL.md +1 -1
  216. package/skills/skill-builder/SKILL.md +138 -56
  217. package/skills/skill-builder/references/delegation-map.md +72 -113
  218. package/skills/skill-builder/references/progressive-disclosure.md +122 -0
  219. package/skills/skill-builder/references/self-audit-checklist.md +92 -0
  220. package/skills/skill-builder/references/skill-types.md +228 -0
  221. package/skills/skill-builder/references/thariq-x-post-skills.json +33 -0
  222. package/skills/skill-builder/templates/gap-analysis.md +15 -8
  223. package/skills/skill-builder/workflows/improve-skill.md +86 -57
  224. package/skills/skill-builder/workflows/new-skill.md +80 -168
  225. package/skills/skill-builder/workflows/polish-skill.md +78 -54
  226. package/skills/structure-prompt/SKILL.md +50 -0
  227. package/skills/structure-prompt/reference/adversarial-tuning.md +62 -0
  228. package/skills/structure-prompt/reference/block-classification.md +27 -0
  229. package/skills/structure-prompt/reference/canonical-case.md +48 -0
  230. package/skills/structure-prompt/reference/citation-depth.md +70 -0
  231. package/skills/structure-prompt/reference/cleanup.md +33 -0
  232. package/skills/structure-prompt/reference/constraints.md +33 -0
  233. package/skills/structure-prompt/reference/directives.md +37 -0
  234. package/skills/structure-prompt/reference/examples.md +72 -0
  235. package/skills/structure-prompt/reference/instantiation.md +51 -0
  236. package/skills/structure-prompt/reference/output-contract.md +72 -0
  237. package/skills/structure-prompt/reference/per-category.md +23 -0
  238. package/skills/structure-prompt/reference/persona.md +38 -0
  239. package/skills/structure-prompt/reference/research.md +33 -0
  240. package/skills/structure-prompt/reference/structure.md +28 -0
  241. package/agents/code-standards-agent.md +0 -93
  242. package/agents/groq-coder.md +0 -113
  243. package/agents/plan-executor.md +0 -226
  244. package/agents/project-docs-analyzer.md +0 -53
  245. package/agents/project-structure-organizer-agent.md +0 -72
  246. package/agents/skill-to-agent-converter.md +0 -370
  247. package/agents/skill-writer-agent.md +0 -470
  248. package/agents/user-docs-writer.md +0 -67
  249. package/agents/workflow-visual-documenter.md +0 -82
  250. package/commands/readability-review.md +0 -20
  251. package/hooks/mypy.ini +0 -2
  252. package/hooks/notification/attention_needed_notify.py +0 -71
  253. package/hooks/notification/claude_notification_handler.py +0 -67
  254. package/hooks/notification/notification_utils.py +0 -267
  255. package/hooks/notification/subagent_complete_notify.py +0 -381
  256. package/hooks/notification/test_attention_needed_notify.py +0 -47
  257. package/hooks/notification/test_claude_notification_handler.py +0 -54
  258. package/hooks/notification/test_notification_utils.py +0 -91
  259. package/hooks/notification/test_subagent_complete_notify.py +0 -79
  260. package/scripts/config/groq_bugteam_config.py +0 -230
  261. package/scripts/config/test_groq_bugteam_config.py +0 -83
  262. package/scripts/config/test_spec_implementer_prompt.py +0 -32
  263. package/scripts/groq_bugteam.README.md +0 -131
  264. package/scripts/groq_bugteam.py +0 -647
  265. package/scripts/groq_bugteam_dotenv.py +0 -40
  266. package/scripts/groq_bugteam_spec.py +0 -226
  267. package/scripts/test_groq_bugteam.py +0 -529
  268. package/scripts/test_groq_bugteam_apply_fix_from_spec.py +0 -426
  269. package/scripts/test_groq_bugteam_dotenv.py +0 -66
  270. package/scripts/test_groq_bugteam_spec.py +0 -338
  271. package/skills/bugteam/SKILL_EVALS.md +0 -309
  272. package/skills/dream/SKILL.md +0 -118
  273. package/skills/ingest/SKILL.md +0 -40
  274. package/skills/npm-creator/SKILL.md +0 -187
  275. package/skills/readability-review/SKILL.md +0 -127
  276. package/skills/resume-review/SKILL.md +0 -261
  277. package/skills/rule-audit/SKILL.md +0 -307
  278. package/skills/rule-creator/SKILL.md +0 -150
  279. package/skills/searching-obsidian-vault/SKILL.md +0 -131
  280. package/skills/skill-writer/REFERENCE.md +0 -284
  281. package/skills/skill-writer/SKILL.md +0 -222
  282. package/skills/tdd-team/SKILL.md +0 -128
@@ -1,235 +1,147 @@
  # New Skill Workflow
 
- Full evaluation-driven lifecycle for building a new skill from scratch.
+ Best-practice-driven lifecycle for building a skill from scratch.
 
  ## Prerequisites
 
  - The user has a task or domain they want to capture as a skill
  - No existing skill for this capability (or intentionally starting fresh)
 
- ### Ground-up package layout (required before multi-file implementation)
-
- When the outcome includes **ARCHITECTURE.md**, **REFERENCE / EXAMPLES / WORKFLOWS**, and **`evals/*.json`** under a workspace (Anthropic-style progressive disclosure plus checkpointed rollout):
-
- 1. Read `prompt-generator/templates/skill-from-ground-up.md` from the installed `~/.claude/skills/` tree (provided by [@jl-cmd/prompt-generator](https://github.com/jl-cmd/prompt-generator)).
- 2. Run `/prompt-generator` using that template (substitute tokens per its table) **before** Phase 3 expands the repo; align the XML scope block with this workflow’s workspace and evidence rules.
- 3. Keep Phase 1–2 artifacts honest: eval prompts and expectations stay grounded in **real** user scenarios; the template reinforces eval rows that reference pasted or explicitly approved evidence only.
-
- Skip this block only when the user explicitly wants a **single-file** SKILL.md with no staged package plan.
-
- Refinements to an **existing** skill package use `prompt-generator/templates/skill-refinement-package.md` instead (see `improve-skill.md`).
-
- ---
-
- ## Phase 1: Identify Gaps
-
- **Goal:** Document what fails or requires repeated context when working without a skill.
-
- ### Process
-
- 1. Have a guided conversation to uncover gaps. Explore these areas:
- - "What task were you doing when you realized you needed a skill?"
- - "What context did you repeatedly provide to Claude?"
- - "Where did Claude fail or produce subpar results without guidance?"
- - "What domain knowledge was missing?"
- - "What specific format or structure did you need?"
- - "Were there tools or scripts that needed to be used in a particular way?"
- - "What rules or constraints did Claude violate?"
-
- 2. As patterns emerge, probe for eval-worthy scenarios:
- - "Can you give me a concrete example of a task where this failed?"
- - "What would success look like for that specific task?"
- - "Are there edge cases where the right approach changes?"
-
- 3. Generate `gap-analysis.md` from the conversation using the template at `${CLAUDE_SKILL_DIR}/templates/gap-analysis.md`. Fill in all sections from what was discussed.
-
- 4. Review the gap analysis with the user. Confirm completeness before moving to Phase 2.
-
- **Output:** `[skill-name]-workspace/gap-analysis.md`
-
  ---
 
- ## Phase 2: Build Evals
-
- **Goal:** Create 3+ evaluation scenarios that test the identified gaps. Establish a baseline.
+ ## Step 1: Classify
 
- ### Process
+ **Goal:** Determine the skill type. Type dictates folder structure.
 
- 1. Transform each gap into at least one eval scenario. Each scenario needs:
- - A realistic user prompt (detailed and specific, like a real request)
- - A description of what success looks like
- - Objectively verifiable expectations (assertions)
+ 1. Read `${CLAUDE_SKILL_DIR}/references/skill-types.md`.
 
- 2. Draft evals using the schema at `${CLAUDE_SKILL_DIR}/templates/eval-scenario.json`. Ensure:
- - Minimum 3 scenarios (official requirement)
- - Every identified gap has at least one scenario testing it
- - Expectations are objectively verifiable, not subjective
- - Prompts sound like things a real user would say
+ 2. Ask the user about the skill’s purpose:
 
- 3. Review eval scenarios with the user. Adjust until both sides are satisfied.
+ > "What will this skill help Claude do?"
 
- 4. Save to `[skill-name]-workspace/evals/evals.json`.
+ Match the answer against the 9 types. If ambiguous, present the top 2-3 matches and ask the user to choose.
 
- 5. **Establish baseline.** For each eval, spawn a subagent WITHOUT any skill:
+ 3. Record the classification: type number, type name, recommended folders.
 
- ```
- Execute this task with NO skill loaded:
- - Task: [eval prompt]
- - Input files: [eval files if any, or "none"]
- - Save all output files to: [workspace]/iteration-0/eval-[name]/without_skill/outputs/
- - Save a complete transcript of your work to: [workspace]/iteration-0/eval-[name]/without_skill/transcript.md
- ```
-
- Spawn all baseline runs in parallel. Capture timing data when each completes.
-
- 6. Grade baseline results using the skill-creator grading agent. See `${CLAUDE_SKILL_DIR}/references/delegation-map.md` for exact grading invocation.
-
- **Output:** `[skill-name]-workspace/evals/evals.json` and baseline results in `iteration-0/`
+ **Output:** Type classification with folder plan.
 
  ---
 
- ## Phase 3: Write Minimal Skill
+ ## Step 2: Scaffold
 
- **Goal:** Create just enough skill content to address the documented gaps and pass evaluations.
+ **Goal:** Create the folder structure. Every skill starts with the same skeleton plus type-specific additions.
 
- ### Process
+ 1. Create the skill directory if it doesn’t exist.
 
- 1. Invoke `/skill-writer` with this context:
+ 2. Create the minimum structure:
 
  ```
- Create a skill based on this gap analysis and eval scenarios.
-
- Gap analysis: [reference or paste gap-analysis.md]
- Eval scenarios: [reference or paste evals.json expected_output and expectations]
- Baseline failures: [summarize what Claude got wrong in iteration-0]
-
- Constraint: Write the minimum instructions needed to address these specific gaps.
- Every line must serve a documented gap. Do not over-document.
+ skill-name/
+ ├── SKILL.md # Hub — every skill has this
  ```
 
- 2. `/skill-writer` will run its workflow: classify type, set degree of freedom, ask clarifying questions, produce the SKILL.md artifact.
+ 3. Add type-specific directories based on Step 1 classification (see `${CLAUDE_SKILL_DIR}/references/skill-types.md` for the folder recommendations per type).
 
- 3. Review the draft with the user:
- - "Does this address all the gaps we identified?"
- - "Is anything here unnecessary or over-engineered?"
- - "Would this pass our eval scenarios?"
+ 4. Verify the scaffold matches the type recommendation.
 
- 4. Save the skill to its target directory.
+ > "As your Skill grows, you can bundle additional content that Claude loads only when needed."
 
- **Output:** The skill's SKILL.md (and optional reference files)
+ **Output:** Directory tree with SKILL.md stub.
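
Editorial sketch, not part of the diffed file: the added Step 2 could be automated roughly as follows. Python is used because the package's own scripts are Python. The helper name `scaffold_skill`, the type name `example-type`, and the folder mapping are all hypothetical; the real per-type folder recommendations live in `references/skill-types.md`, whose contents are not shown in this section.

```python
from pathlib import Path

# Hypothetical mapping for illustration only; the actual folder
# recommendations per type come from references/skill-types.md.
EXTRA_DIRS_BY_TYPE = {
    "example-type": ["references", "scripts"],
}


def scaffold_skill(root: Path, skill_name: str, skill_type: str) -> Path:
    """Create the skill directory, a SKILL.md stub, and type-specific folders."""
    skill_dir = root / skill_name
    skill_dir.mkdir(parents=True, exist_ok=True)  # no-op if it already exists

    hub = skill_dir / "SKILL.md"
    if not hub.exists():  # never overwrite an existing hub file
        hub.write_text(f"# {skill_name}\n", encoding="utf-8")

    # Unknown types get only the minimum structure (directory plus hub).
    for folder in EXTRA_DIRS_BY_TYPE.get(skill_type, []):
        (skill_dir / folder).mkdir(exist_ok=True)
    return skill_dir
```

Calling `scaffold_skill(Path("."), "my-skill", "example-type")` twice is safe under these assumptions: the second call leaves an edited `SKILL.md` untouched.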
 
  ---
 
- ## Phase 4: Test (Feedback Loop)
+ ## Step 3: Gather
 
- **Goal:** Run the skill on eval scenarios, compare against baseline, identify remaining gaps.
+ **Goal:** Collect domain knowledge, failure patterns, and gotchas from the user.
 
- ### Process
+ > "Build a Gotchas Section — these sections should be built up from common failure points that Claude runs into when using your skill."
 
- 1. **Spawn all runs in parallel.** For each eval scenario, launch a with-skill subagent:
+ ### Interview questions
 
- ```
- Execute this task:
- - Read the skill at [path-to-skill]/SKILL.md and follow its instructions
- - Task: [eval prompt from evals.json]
- - Input files: [eval files if any, or "none"]
- - Save all output files to: [workspace]/iteration-N/eval-[name]/with_skill/outputs/
- - Save a complete transcript of your work to: [workspace]/iteration-N/eval-[name]/with_skill/transcript.md
- ```
+ Ask the user:
 
- For iteration-1, the without-skill baseline already exists from Phase 2.
+ 1. "What task were you doing when you realized you needed a skill?"
+ 2. "What context did you repeatedly provide to Claude?"
+ 3. "Where did Claude fail or produce subpar results without guidance?"
+ 4. "What does Claude consistently get wrong about this domain?"
+ 5. "What specific format or structure do you need in the output?"
+ 6. "Are there rules or constraints Claude must never violate?"
+ 7. "What tools, scripts, or libraries does Claude need to use?"
+ 8. "Does this skill need to run differently for different models (Haiku vs Opus)?"
 
- 2. **While runs are in progress**, review and refine assertions if needed based on what was learned from the baseline.
+ ### Generate gap analysis
 
- 3. **When runs complete**, immediately capture timing data (`total_tokens`, `duration_ms`) to `timing.json` in each run directory. This data is only available in the task completion notification.
+ Use the template at `${CLAUDE_SKILL_DIR}/templates/gap-analysis.md`. Fill in:
 
- 4. **Grade each run** using the skill-creator grading agent. See `${CLAUDE_SKILL_DIR}/references/delegation-map.md` for the grading process.
+ - Skill type and degree of freedom
+ - Task description
+ - Gaps identified (what failed, what was needed)
+ - Recurring patterns across gaps
+ - Initial gotcha candidates
 
- 5. **Aggregate into benchmark** using skill-creator's aggregation script. See delegation-map.md for the exact command.
+ ### Assess degree of freedom
 
- 6. **Launch the eval viewer** using skill-creator's generate_review.py. See delegation-map.md for the exact command. For iteration 2+, include `--previous-workspace` to show diffs.
+ > "Match the level of specificity to the task’s fragility and variability."
 
- 7. Tell the user to review in the viewer:
- - "Outputs" tab: click through each test case, leave feedback
- - "Benchmark" tab: quantitative comparison (pass rates, timing, tokens)
+ | Degree | When | Example |
+ |---|---|---|
+ | High | Multiple valid approaches, context-dependent | Code review guidelines |
+ | Medium | Preferred pattern exists, some variation ok | Report generation with template |
+ | Low | Fragile operations, consistency critical | Database migration with exact script |
 
- 8. Wait for the user to complete their review.
+ Record the assessment with reasoning.
 
- **Output:** `grading.json`, `benchmark.json`, `feedback.json` in the iteration directory
+ **Output:** Completed gap analysis, initial gotchas list, degree-of-freedom assessment.
 
  ---
 
- ## Phase 5: Iterate
-
- **Goal:** Refine the skill based on observed Claude B behavior and user feedback.
+ ## Step 4: Write
 
- ### Process
+ **Goal:** Produce the skill package — SKILL.md and companion files.
 
- 1. Read `feedback.json` from the viewer. Empty feedback means the user was satisfied with that test case.
+ Delegate to `/skill-writer` using the structured handoff from `${CLAUDE_SKILL_DIR}/references/delegation-map.md`.
 
- 2. Read transcripts from Phase 4 runs. Watch for the signals the official docs highlight:
- - **Unexpected exploration paths** -- Claude B read files in an order you did not anticipate
- - **Missed connections** -- Claude B did not follow references to important files
- - **Overreliance on certain sections** -- content that should be promoted to SKILL.md
- - **Ignored content** -- files Claude B never accessed (may be unnecessary or poorly signaled)
- - **Repeated work across test cases** -- all subagents wrote similar helper scripts (bundle them into the skill)
+ The handoff must include: skill type, folder structure, gap analysis, initial gotchas, degree of freedom, constraints.
 
- 3. Synthesize observations into actionable improvements. For each piece of feedback, identify the specific skill change that would fix it.
+ After skill-writer produces the draft:
 
- 4. Apply improvements. For significant changes, re-invoke `/skill-writer` with:
-
- ```
- Refine this existing skill based on testing observations.
+ 1. Verify it follows the hub layout (principle → gotchas → when-applies → process → file index → folder map).
+ 2. Verify SKILL.md body is under 500 lines.
+ 3. Verify all references are one level deep.
+ 4. Verify files over 100 lines have a TOC.
 
- Current SKILL.md: [reference or paste]
- User feedback: [from feedback.json -- only non-empty entries]
- Behavioral observations: [from transcript analysis]
+ Fix structural issues before proceeding.
 
- Specific issues to address:
- 1. [Issue]
- 2. [Issue]
+ **Output:** Complete skill package at the target directory.
191
116
 
192
- Constraint: Only change what the feedback demands. Do not reorganize working content.
193
- ```
117
+ ---
194
118
 
195
- 5. Key principles for this phase (from the official docs):
196
- - **Generalize from feedback** -- the skill will be used across many different prompts, not just these test cases
197
- - **Keep the prompt lean** -- remove instructions that are not pulling their weight
198
- - **Explain the why** -- theory of mind beats rigid MUSTs
199
- - **Bundle repeated work** -- if subagents all wrote similar scripts, add them to the skill
119
+ ## Step 5: Self-Audit
200
120
 
201
- 6. Return to Phase 4 with the refined skill. Continue iterating until:
202
- - User feedback is all empty (satisfied with every test case)
203
- - Pass rates meet acceptable thresholds
204
- - No meaningful progress between iterations
121
+ **Goal:** Verify every best practice is satisfied before delivery.
205
122
 
206
- ---
123
+ 1. Read `${CLAUDE_SKILL_DIR}/references/self-audit-checklist.md`.
124
+ 2. Copy the checklist into your response.
125
+ 3. Check every item against the built skill. For each: PASS, FAIL with file:line evidence, or N/A with reason.
126
+ 4. Every FAIL must be fixed before proceeding. Apply fixes, then re-check that item.
127
+ 5. When all items are PASS or N/A, proceed to Step 6.
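The pass/fail bookkeeping in the steps above can be sketched in a few lines. This is a minimal illustration, not part of the package: `summarize_audit` and its input shape (item name mapped to status) are hypothetical, and only the three statuses and the summary wording come from the workflow text.

```python
from collections import Counter

# The workflow names exactly these three outcomes per checklist item.
VALID_STATUSES = {"PASS", "FAIL", "N/A"}


def summarize_audit(results: dict[str, str]) -> str:
    """Return the delivery summary line, or raise while any item still fails.

    `results` maps a checklist item (hypothetical naming) to its status.
    """
    unknown = {status for status in results.values() if status not in VALID_STATUSES}
    if unknown:
        raise ValueError(f"unknown statuses: {sorted(unknown)}")

    counts = Counter(results.values())
    if counts["FAIL"]:
        # Every FAIL must be fixed and re-checked before moving on.
        failing = [item for item, status in results.items() if status == "FAIL"]
        raise ValueError(f"fix before proceeding: {failing}")

    return f"All {len(results)} items: {counts['PASS']} passed, {counts['N/A']} N/A."
```

Raising on any remaining FAIL mirrors the rule that the audit gates progress to Step 6 rather than merely reporting.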
207
128
 
208
- ## Phase 6: Polish
129
+ For an independent check, spawn a subagent to run the audit (see delegation-map.md).
209
130
 
210
- **Goal:** Optimize the skill description for triggering accuracy and run final validation.
131
+ **Output:** Completed checklist with all items PASS or N/A.
211
132
 
212
- ### Process
133
+ ---
213
134
 
214
- 1. **Description optimization.** Follow the process in `${CLAUDE_SKILL_DIR}/workflows/polish-skill.md`.
135
+ ## Step 6: Deliver
215
136
 
216
- 2. **Final validation.** Run the skill-writer self-check rubric against the finished skill:
217
- - [ ] Description is third person with trigger phrases
218
- - [ ] Under 500 lines
219
- - [ ] States what to do in positive terms (not prohibition-heavy)
220
- - [ ] Degree of freedom matches task fragility
221
- - [ ] Progressive disclosure used (heavy content in separate files)
222
- - [ ] Examples are concrete, not abstract
223
- - [ ] Frontmatter fields are valid
224
- - [ ] One skill = one capability
137
+ **Goal:** Hand off the finished skill with full documentation.
225
138
 
226
- 3. **Final checklist** from the official Anthropic docs:
227
- - [ ] At least 3 evaluation scenarios created and passing
228
- - [ ] Tested with real usage scenarios
229
- - [ ] Skill solves documented gaps (not imagined requirements)
230
- - [ ] Iterative refinement based on observed behavior (not assumptions)
139
+ Present to the user:
231
140
 
232
- 4. Present the finished skill to the user with:
233
- - Final benchmark comparison (latest iteration vs baseline)
234
- - Summary of gaps addressed
235
- - Any remaining limitations or known edge cases
141
+ 1. **File map** — every file created, with its purpose.
142
+ 2. **Skill type** — classification and why it fits.
143
+ 3. **Degree of freedom** — assessment and reasoning.
144
+ 4. **Gotchas seeded** — initial gotchas captured.
145
+ 5. **Audit summary** — "All 38 items: N passed, M N/A."
146
+ 6. **Maintenance notes** — what to watch for in future usage that might warrant iteration.
147
+ 7. **Suggested first test** — a concrete task to try with Claude B.
@@ -4,89 +4,113 @@ Final optimization pass for a skill that is functionally complete.
4
4
 
5
5
  ## Prerequisites
6
6
 
7
- - The skill passes its evaluation scenarios
7
+ - The skill has been used and observed
8
8
  - The user is satisfied with output quality
9
9
  - This is the final step before the skill is considered done
10
10
 
11
- ### Package-aware polish (recommended)
11
+ ---
12
+
13
+ ## Step 1: Description Audit
14
+
15
+ **Goal:** Verify the description field is optimized for model discovery.
16
+
17
+ > "The description is critical for skill selection: Claude uses it to choose the right Skill from potentially 100+ available Skills."
18
+
19
+ > "The description field is not a summary — it's a description of when to trigger."
12
20
 
13
- When the polish pass will touch **more than frontmatter alone** (for example `REFERENCE.md`, `EXAMPLES.md`, `WORKFLOWS.md`, link structure, or eval JSON), or the user wants **checkpointed** multi-file updates alongside description work:
21
+ Check each requirement:
22
+
23
+ - [ ] **Third person.** "Processes Excel files" not "I can help you process Excel files."
24
+ - [ ] **Includes what AND when.** Both the capability and trigger contexts.
25
+ - [ ] **Specific trigger phrases.** Different phrasings of the same intent should all match.
26
+ - [ ] **Under 1024 characters.** Hard limit.
27
+ - [ ] **No XML tags.**
28
+ - [ ] **Distinguishable from similar skills.** If two skills overlap, the descriptions must make the boundary clear.
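The mechanical items in this checklist (length limit, XML tags, third person) lend themselves to a quick automated pre-check. The sketch below is an assumption-laden illustration: `audit_description` is a hypothetical helper, and the first-person regex is a rough heuristic rather than a real grammatical test. The "what AND when", trigger-phrase, and overlap checks still need human or model judgment.

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # hard limit stated in the checklist
XML_TAG = re.compile(r"</?[A-Za-z][^>]*>")
# Assumed heuristic: a first-person opener suggests the text is not third person.
FIRST_PERSON_OPENER = re.compile(r"^\s*(I|We|My)\b", re.IGNORECASE)


def audit_description(description: str) -> list[str]:
    """Return the mechanical checks that the description fails."""
    problems = []
    if len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("over 1024 characters")
    if XML_TAG.search(description):
        problems.append("contains XML tags")
    if FIRST_PERSON_OPENER.search(description):
        problems.append("not third person")
    return problems
```

An empty list means only that the mechanical checks passed; it says nothing about whether the description triggers on the right phrasings.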
29
+
30
+ ### Trigger phrase review
31
+
32
+ Generate 10 variations of the user's intent:
33
+ - Formal and casual phrasings
34
+ - Cases where the user doesn't explicitly name the skill but clearly needs it
35
+ - Cases where this skill competes with another but should win
14
36
 
15
- 1. Read `prompt-generator/templates/skill-refinement-package.md` (repository path: `skills/prompt-generator/templates/skill-refinement-package.md` in [jl-cmd/prompt-generator](https://github.com/jl-cmd/prompt-generator)).
16
- 2. Run `/prompt-generator` with tokens filled so `ARCHITECTURE.md` records baseline inventory, planned deltas for polish, and evidence rules for any new trigger or behavior evals.
37
+ For each, answer: would the current description cause Claude to select this skill?
17
38
 
18
- Purely **single-field** `description` edits with no structural package changes can skip this block.
39
+ Also check 5 near-miss phrasings from adjacent domains where this skill should NOT trigger. Verify the description doesn't cause false activation.
40
+
41
+ ### Fix issues
42
+
43
+ If the description fails any check, revise it. Show before/after with the specific change and why it improves discovery.
44
+
45
+ **Output:** Verified description (and revised version if changes were made).
19
46
 
20
47
  ---
21
48
 
22
- ## Step 1: Description Optimization
49
+ ## Step 2: Progressive Disclosure Audit
23
50
 
24
- Optimize the skill's description for triggering accuracy using the skill-creator's trigger eval system.
51
+ **Goal:** Verify the file structure follows all progressive disclosure rules.
25
52
 
26
- ### Generate trigger eval queries
53
+ > "Keep SKILL.md body under 500 lines."
27
54
 
28
- Create 20 eval queries: 10 should-trigger and 10 should-not-trigger.
55
+ Check:
29
56
 
30
- **Should-trigger queries (10):** Different phrasings of the same intent. Include:
31
- - Formal and casual variations
32
- - Cases where the user does not explicitly name the skill but clearly needs it
33
- - Uncommon use cases
34
- - Cases where this skill competes with another but should win
57
+ - [ ] SKILL.md body under 500 lines.
58
+ - [ ] All reference files link directly from SKILL.md (one level deep).
59
+ - [ ] Every file over 100 lines has a table of contents.
60
+ - [ ] File index in SKILL.md lists every companion file with its purpose.
61
+ - [ ] Forward slashes only in all paths.
62
+ - [ ] File names are descriptive (`form_validation_rules.md`, not `doc2.md`).
63
+ - [ ] Scripts clearly marked as execute vs read-as-reference.
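Several of these checks are countable and could be run as a script before the manual review. The following is a rough sketch under stated assumptions: `audit_structure` is hypothetical, it takes file contents in memory instead of reading the skill directory, it counts the whole of `SKILL.md` (frontmatter included) as a simplification of "body", and it treats any heading containing "contents" as a table of contents.

```python
SKILL_BODY_LIMIT = 500  # "Keep SKILL.md body under 500 lines."
TOC_THRESHOLD = 100     # files over this many lines need a table of contents


def audit_structure(files: dict[str, str]) -> list[str]:
    """`files` maps a relative path to its text. Returns the failed checks."""
    problems = []
    for path, text in files.items():
        lines = text.splitlines()
        if "\\" in path:
            problems.append(f"{path}: use forward slashes")
        if path == "SKILL.md" and len(lines) >= SKILL_BODY_LIMIT:
            problems.append(f"{path}: {len(lines)} lines, limit is under {SKILL_BODY_LIMIT}")
        # Assumed TOC marker: a markdown heading whose text mentions "contents".
        has_toc = any(
            line.lstrip().startswith("#") and "contents" in line.lower()
            for line in lines
        )
        if len(lines) > TOC_THRESHOLD and not has_toc:
            problems.append(f"{path}: over {TOC_THRESHOLD} lines without a TOC")
    return problems
```

Link depth, the file index, and script labelling are left out because they depend on reading the content, not counting it.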
35
64
 
36
- **Should-not-trigger queries (10):** Near-misses that share keywords but need something different. Include:
37
- - Adjacent domains with overlapping terminology
38
- - Ambiguous phrasing where naive keyword matching would falsely trigger
39
- - Tasks that touch the skill's domain but in a context where another tool is better
65
+ ### Fix structural issues
40
66
 
41
- All queries must be realistic -- detailed, specific, with file paths, personal context, casual speech. Not abstract one-liners.
67
+ If any check fails, restructure. Common fixes:
68
+ - SKILL.md too long → move sections to companion files, leave links.
69
+ - Nested references → surface all links to SKILL.md.
70
+ - Missing TOC → add to files over 100 lines.
42
71
 
43
- ### Review with user
72
+ **Output:** Verified file structure (and restructured files if changes were made).
44
73
 
45
- Present the eval set using the skill-creator's HTML review template. See `${CLAUDE_SKILL_DIR}/references/delegation-map.md` for the exact process.
74
+ ---
46
75
 
47
- The user can edit queries, toggle should-trigger, and add/remove entries.
76
+ ## Step 3: Gotcha Freshness
48
77
 
49
- ### Run optimization loop
78
+ **Goal:** Ensure gotchas reflect current observations.
50
79
 
51
- See `${CLAUDE_SKILL_DIR}/references/delegation-map.md` for the exact command. The loop:
52
- 1. Splits eval set into 60% train / 40% held-out test
53
- 2. Evaluates current description (3 runs per query for reliability)
54
- 3. Proposes improvements based on failures
55
- 4. Re-evaluates on both train and test
56
- 5. Iterates up to 5 times
57
- 6. Selects best description by test score (avoids overfitting)
80
+ > "Ideally, you will update your skill over time to capture these gotchas."
58
81
 
59
- ### Apply result
82
+ - Review the skill's Gotchas section.
83
+ - Check against recent usage: are there new failure modes not yet captured?
84
+ - Remove gotchas for issues that no longer occur (the skill fixed them).
85
+ - Verify each gotcha is actionable — a reader should know what to avoid and why.
60
86
 
61
- Update the skill's SKILL.md frontmatter with the optimized description. Show the user before/after with scores.
87
+ **Output:** Updated gotchas section (and any new gotchas for skill-builder itself).
62
88
 
63
89
  ---
64
90
 
65
- ## Step 2: Final Validation
91
+ ## Step 4: Full Self-Audit
92
+
93
+ **Goal:** Complete 38-point checklist pass.
66
94
 
67
- Run the skill-writer self-check rubric:
95
+ Same as new-skill Step 5 and improve-skill Step 5:
68
96
 
69
- - [ ] Description is third person with trigger phrases
70
- - [ ] SKILL.md body under 500 lines
71
- - [ ] States what to do in positive terms (not prohibition-heavy)
72
- - [ ] Degree of freedom matches task fragility
73
- - [ ] Progressive disclosure used (heavy content in separate files)
74
- - [ ] No time-sensitive claims unless clearly dated
75
- - [ ] Examples are concrete, not abstract
76
- - [ ] Frontmatter fields are valid per official docs
77
- - [ ] One skill = one capability
78
- - [ ] Consistent terminology throughout
79
- - [ ] File references are one level deep from SKILL.md
80
- - [ ] Files over 100 lines have a table of contents
97
+ 1. Read `${CLAUDE_SKILL_DIR}/references/self-audit-checklist.md`.
98
+ 2. Check every item. Fix failures. Re-check.
99
+ 3. All items must be PASS or N/A.
100
+
101
+ **Output:** Completed checklist.
81
102
 
82
103
  ---
83
104
 
84
- ## Step 3: Final Summary
105
+ ## Step 5: Deliver
106
+
107
+ **Goal:** Final summary of the polished skill.
85
108
 
86
- Present the finished skill to the user:
109
+ Present to the user:
87
110
 
88
- 1. **Benchmark summary:** Final pass rate vs baseline, with delta
89
- 2. **Gaps addressed:** Map each original gap to the skill content that addresses it
90
- 3. **Description optimization:** Before/after trigger accuracy scores
91
- 4. **Known limitations:** Anything the skill does not handle (scope boundaries)
92
- 5. **Maintenance notes:** What to watch for in future usage that might warrant re-iteration
111
+ 1. **Description** — final version, confirmed trigger phrases.
112
+ 2. **File structure** — folder map with line counts.
113
+ 3. **Gotchas** — current gotcha count and most recent additions.
114
+ 4. **Audit summary** — "All 38 items: N passed, M N/A."
115
+ 5. **Before/after** — description changes if any, structural changes if any.
116
+ 6. **Maintenance notes** — what to watch for, when to re-audit.
@@ -0,0 +1,50 @@
1
+ ---
2
+ name: structure-prompt
3
+ description: >-
4
+ Restructure any user-provided prompt — order blocks correctly, replace persona
5
+ framing with task constraints, enforce per-category dispositions, replace
6
+ ceremony directives with measurable constraints, expand placeholder tokens
7
+ into real values via the sibling rubric or AskUserQuestion, add file:line
8
+ citations for identifiers that appear in the data body, mark the canonical
9
+ sub-bucket with ⭐, and sharpen generic adversarial-pass phrasing into a
10
+ category-specific failure-mode noun. Trigger when the user invokes
11
+ /structure-prompt, pastes a prompt and asks to optimize it, asks for a
12
+ "minimally invasive edit" to a prompt artifact, or asks to "tighten this
13
+ prompt."
14
+ ---
15
+
16
+ # structure-prompt
17
+
18
+ One pass per invocation. Classify each block of the input prompt, apply the matching spoke rules, and emit the rewritten prompt as a single fenced block (paste mode) or rewrite the file in place (file-path mode).
19
+
20
+ ## Pre-flight
21
+
22
+ The input prompt arrives as the user's message body, as a fenced block within it, or as a file path argument. Treat the entire input as the artifact under optimization.
23
+
24
+ ## First invocation of a session
25
+
26
+ Read [`reference/block-classification.md`](reference/block-classification.md), then [`reference/research.md`](reference/research.md), then [`reference/output-contract.md`](reference/output-contract.md).
27
+
28
+ ## Match situation, read spoke
29
+
30
+ | Situation | Read |
31
+ |---|---|
32
+ | Starting any optimization | [`reference/block-classification.md`](reference/block-classification.md) |
33
+ | A spoke needs information that isn't in the input | [`reference/research.md`](reference/research.md) |
34
+ | Input contains a fenced code block, diff, dump, transcript, or single content region ≥ 500 characters, OR blocks appear out of canonical sequence (mission, metadata, framework, questions, output spec, data body) | [`reference/structure.md`](reference/structure.md) |
35
+ | Input opens with a role assignment ("You are…", "Act as…", "Imagine you are…", "As a…", "Pretend to be…", "Role:…") | [`reference/persona.md`](reference/persona.md) |
36
+ | Input names 2+ categories, surfaces, sub-buckets, items, checks, or criteria the agent processes | [`reference/per-category.md`](reference/per-category.md) |
37
+ | Input contains performance directives ("be thorough", "think step by step", "you are an expert", "please", "kindly") | [`reference/directives.md`](reference/directives.md) |
38
+ | Input contains narrative directives ("try to", "look at", "make sure", "consider", "be sure to", "think about") | [`reference/constraints.md`](reference/constraints.md) |
39
+ | Input contains placeholder tokens (`[REPO/ARTIFACT]`, `[INLINE THE FULL ARTIFACT HERE]`, `[N]`, etc.) | [`reference/instantiation.md`](reference/instantiation.md) |
40
+ | Sub-bucket bullets reference identifiers from the data body without `file:line` citations | [`reference/citation-depth.md`](reference/citation-depth.md) |
41
+ | Framework has 5+ sub-buckets and no ⭐ canonical-case marker | [`reference/canonical-case.md`](reference/canonical-case.md) |
42
+ | Output spec contains generic adversarial-pass phrasing ("missed at least N bugs/findings") | [`reference/adversarial-tuning.md`](reference/adversarial-tuning.md) |
43
+ | Input has typos, mixed bullet styles, untagged code blocks, trailing whitespace, blank-line runs, or non-sequential heading levels | [`reference/cleanup.md`](reference/cleanup.md) |
44
+ | Situation doesn't match any spoke above | [`reference/examples.md`](reference/examples.md) |
45
+ | Emitting the rewritten prompt | [`reference/output-contract.md`](reference/output-contract.md) |
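Three rows of the table above (persona openers, performance directives, narrative directives) are phrase-driven, so their matching can be approximated with plain string checks. This is only a sketch: `match_spokes` is a hypothetical name, the phrase lists are copied from the table, and substring matching is cruder than the classification the skill itself performs (for instance, "please" would also match inside "displeased").

```python
PERSONA_OPENERS = ("you are", "act as", "imagine you are", "as a ", "pretend to be", "role:")
PERFORMANCE_DIRECTIVES = ("be thorough", "think step by step", "you are an expert", "please", "kindly")
NARRATIVE_DIRECTIVES = ("try to", "look at", "make sure", "consider", "be sure to", "think about")


def match_spokes(prompt: str) -> list[str]:
    """Return the reference files suggested by simple phrase matching."""
    text = prompt.strip().lower()
    spokes = []
    # The persona row applies only when the prompt *opens* with a role assignment.
    if text.startswith(PERSONA_OPENERS):
        spokes.append("reference/persona.md")
    if any(phrase in text for phrase in PERFORMANCE_DIRECTIVES):
        spokes.append("reference/directives.md")
    if any(phrase in text for phrase in NARRATIVE_DIRECTIVES):
        spokes.append("reference/constraints.md")
    return spokes
```

A prompt can match several rows at once, which is why the function returns a list instead of a single spoke.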
46
+
47
+ ## Folder map
48
+
49
+ - `SKILL.md` — this hub.
50
+ - `reference/` — rule detail per situation.
@@ -0,0 +1,62 @@
1
+ # Sharpen the adversarial-pass phrasing
2
+
3
+ The output spec usually closes with an adversarial second-pass instruction like *assume your first pass missed at least 3 P1 bugs across these N sub-buckets — find them*. When that phrase uses a generic noun (`bugs`, `findings`, `issues`, `problems`), the skill replaces the noun with one that names the category's specific failure mode.
4
+
5
+ ## Detection
6
+
7
+ The fix fires when the output spec contains a phrase matching this shape, with a generic noun:
8
+
9
+ - "missed at least `<number>` [bugs / findings / issues / problems]", where the noun is optionally preceded by a severity tier (`P0` or `P1`) when the framework uses tiered findings.
10
+
11
+ A noun is "generic" when it could apply to any audit category. A noun is "specific" when it names the failure mode of the category.
12
+
13
+ ## How to derive the specific noun
14
+
15
+ Read the mission line and the framework header. Pull the category's domain from there. Match against this lookup:
16
+
17
+ | Category domain | Specific failure-mode noun |
18
+ |---|---|
19
+ | API contracts (signatures, return types, callback shape) | contract drifts |
20
+ | Selector / query / engine compatibility | engine-version incompatibilities |
21
+ | Resource cleanup (handles, locks, subscriptions) | leaked resources |
22
+ | Scoping and ordering | scope or ordering bugs |
23
+ | Dead code | dead code paths |
24
+ | Silent failures (swallowed exceptions, dropped errors) | silent failures |
25
+ | Bounds and overflow | bounds or overflow bugs |
26
+ | Security boundaries | trust-boundary violations |
27
+ | Concurrency | concurrency hazards |
28
+ | Code rules compliance | rule violations |
29
+ | Codebase conflicts (incomplete propagation) | parallel sites that should have been updated alongside the diff |
30
+
31
+ When the category sits outside this list, derive the noun from the framework's most prominent axis name (e.g., a framework whose axes all name "selectors" → "selector incompatibilities").
32
+
33
+ ## Procedure
34
+
35
+ 1. Find the adversarial-pass sentence in the output spec.
36
+ 2. Identify the generic noun in that sentence.
37
+ 3. Replace it with the specific noun from the table or framework.
38
+ 4. Keep the rest of the sentence intact: count (e.g., "3"), severity tier (e.g., "P1") when the original phrase carries one, and the closing "find them".
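The replacement in steps 1 to 4 amounts to a targeted substitution that leaves the count and tier alone. A minimal sketch, assuming the phrase shape given under Detection; `sharpen_adversarial_noun` is a hypothetical helper, and the caller is expected to have already picked the specific noun from the lookup table.

```python
import re

GENERIC_NOUNS = ("bugs", "findings", "issues", "problems")

# Shape from the Detection section: "missed at least <number> [P0|P1] <generic noun>".
ADVERSARIAL_PHRASE = re.compile(
    r"(missed at least \d+ (?:P[01] )?)(" + "|".join(GENERIC_NOUNS) + r")\b"
)


def sharpen_adversarial_noun(sentence: str, specific_noun: str) -> str:
    """Swap the generic noun for `specific_noun`, preserving count and tier."""
    # group(1) carries the count and any tier, so both are kept exactly as written.
    return ADVERSARIAL_PHRASE.sub(lambda match: match.group(1) + specific_noun, sentence)
```

A sentence that already names a specific failure mode does not match the pattern and is returned unchanged, which matches the preservation rule.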
39
+
40
+ ## Examples
41
+
42
+ Before (generic):
43
+ > "assume your first pass missed at least 3 P1 bugs across these 7 sub-buckets — find them"
44
+
45
+ After (Category B):
46
+ > "assume your first pass missed at least 3 P1 engine-version incompatibilities across these 7 sub-buckets — find them"
47
+
48
+ After (Category K):
49
+ > "assume your first pass missed at least 3 P1 parallel sites that should have been updated alongside the diff across these 7 sub-buckets — find them"
50
+
51
+ After (Category C):
52
+ > "assume your first pass missed at least 3 P1 leaked resources across these 7 sub-buckets — find them"
53
+
54
+ ## What stays put
55
+
56
+ When the adversarial phrase already names a specific failure mode, the noun stays. The skill changes only generic nouns.
57
+
58
+ The count (e.g., 3) and severity tier (e.g., P1) stay intact when the original phrase carries them. Some categories name a noun that doesn't fit the P-tier model — Codebase Conflicts ("parallel sites that should have been updated alongside the diff") is the canonical example — but preservation still applies: if the original phrase includes a tier, the rewritten phrase includes it too. The rule is preservation, not insertion or removal.
59
+
60
+ ## Disposition reporting
61
+
62
+ Every outcome emits an action note via the mechanism that [`output-contract.md`](output-contract.md) defines. When the noun was replaced: `> Gap: Adversarial-pass noun sharpened — "bugs" → "<specific noun>".` When the phrase already carries a specific noun: `> Gap: Adversarial-pass noun verified — "<specific noun>" already specific.` Silent pass is forbidden — see the [no silent action](output-contract.md#disposition-invariants) invariant.
@@ -0,0 +1,27 @@
1
+ # Block classification
2
+
3
+ Every input prompt decomposes into six block types. Tag each region of the input as exactly one type before applying any spoke rules.
4
+
5
+ ## Block types
6
+
7
+ **Mission block.** One sentence stating what the agent does. The opening directive of the prompt.
8
+
9
+ **Metadata block.** Identifiers, SHAs, PR numbers, target paths, ID prefixes, scope flags, mode toggles. Short atomic facts the agent uses as parameters.
10
+
11
+ **Framework block.** The checklist, sub-bucket list, surface list, category list, or step list the agent processes. Multi-item structures with named entries.
12
+
13
+ **Questions block.** Cross-cutting questions, synthesis questions, or open questions the agent answers after completing the framework.
14
+
15
+ **Output spec block.** The format the agent's output takes — totals header, per-item shape, ordering, severity tags, locator format, length cap, lead phrase, closing phrase.
16
+
17
+ **Data body block.** Any of:
18
+ - Fenced code block (triple backtick) that sits INSIDE the prompt content — not the outer paste-mode fence that wraps the entire prompt artifact
19
+ - Diff, file dump, transcript, log, table, or document inlined as content
20
+ - Any single content region of 500 characters or more that the agent inspects rather than acts on
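The two mechanical parts of the data-body definition (an inner fence, or a region of 500 characters or more) can be sketched as a predicate. This is an illustration with assumptions: `is_data_body` is hypothetical, it does not distinguish the outer paste-mode fence from inner fences, and it cannot judge whether the agent "inspects rather than acts on" a region, so the length test over-approximates.

```python
DATA_BODY_MIN_CHARS = 500  # threshold stated in the block-type definition
FENCE = "`" * 3            # triple backtick, built this way to keep the example fence-safe


def is_data_body(region: str) -> bool:
    """Rough check: a fenced block, or any region at or above the size threshold."""
    stripped = region.strip()
    if stripped.startswith(FENCE):
        return True
    # Simplification: length alone stands in for "inspected rather than acted on".
    return len(stripped) >= DATA_BODY_MIN_CHARS
```

Diffs, transcripts, logs, and tables shorter than the threshold would need their own detection, which this sketch leaves out.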
21
+
22
+ ## Tagging procedure
23
+
24
+ 1. Read the input prompt top to bottom.
25
+ 2. Annotate each region with exactly one tag.
26
+ 3. Confirm every content region is either tagged with one of the six block types or part of a gap-report block. Gap-note lines (`> Gap:`) and `<!-- gap-report:` comment blocks from a prior invocation form a passthrough region — preserved in place during classification and reordering, not re-tagged. During emission, the gap-report region is deterministically replaced by the current run's gap notes per [`output-contract.md`](output-contract.md). The gap-report region sits at the end of the prompt and carries no classification tag.
27
+ 4. Proceed to the matching spoke.