claude-dev-env 1.38.0 → 1.39.0
- package/CLAUDE.md +10 -36
- package/_shared/pr-loop/audit-reply-template.md +147 -0
- package/_shared/pr-loop/fix-protocol.md +25 -4
- package/_shared/pr-loop/gh-payloads.md +37 -50
- package/_shared/pr-loop/scripts/code_rules_gate.py +0 -60
- package/_shared/pr-loop/scripts/config/post_audit_thread_constants.py +189 -0
- package/_shared/pr-loop/scripts/post_audit_thread.py +947 -0
- package/_shared/pr-loop/scripts/tests/test_code_rules_gate.py +0 -19
- package/_shared/pr-loop/scripts/tests/test_post_audit_thread.py +923 -0
- package/_shared/pr-loop/scripts/tests/test_post_audit_thread_constants.py +127 -0
- package/_shared/pr-loop/state-schema.md +1 -1
- package/agents/clean-coder.md +2 -2
- package/bin/install.mjs +6 -7
- package/bin/install.test.mjs +8 -0
- package/commands/doc-gist.md +16 -0
- package/commands/plan.md +0 -2
- package/commands/review-plan.md +1 -1
- package/docs/CODE_RULES.md +122 -2
- package/hooks/blocking/bot_mention_comment_blocker.py +75 -0
- package/hooks/blocking/code_rules_enforcer.py +1236 -161
- package/hooks/blocking/convergence_gate_blocker.py +130 -0
- package/hooks/blocking/destructive_command_blocker.py +74 -0
- package/hooks/blocking/gh_body_arg_blocker.py +30 -0
- package/hooks/blocking/md_to_html_blocker.py +119 -0
- package/hooks/blocking/test_bot_mention_comment_blocker.py +131 -0
- package/hooks/blocking/test_code_rules_enforcer.py +21 -0
- package/hooks/blocking/test_code_rules_enforcer_any_exempt_files.py +70 -0
- package/hooks/blocking/test_code_rules_enforcer_any_imports_and_cast.py +92 -0
- package/hooks/blocking/test_code_rules_enforcer_banned_import_alias.py +143 -0
- package/hooks/blocking/test_code_rules_enforcer_banned_prefixes.py +152 -0
- package/hooks/blocking/test_code_rules_enforcer_bare_except.py +120 -0
- package/hooks/blocking/test_code_rules_enforcer_boundary_types.py +175 -0
- package/hooks/blocking/test_code_rules_enforcer_cap_meta.py +0 -1
- package/hooks/blocking/test_code_rules_enforcer_collection_prefix.py +50 -0
- package/hooks/blocking/test_code_rules_enforcer_docstring_format.py +255 -0
- package/hooks/blocking/test_code_rules_enforcer_inline_tuple_string_magic.py +130 -0
- package/hooks/blocking/test_code_rules_enforcer_stub_implementations.py +141 -0
- package/hooks/blocking/test_code_rules_enforcer_test_branching.py +143 -0
- package/hooks/blocking/test_code_rules_enforcer_thin_wrapper_files.py +169 -0
- package/hooks/blocking/test_code_rules_enforcer_todo_markers.py +99 -0
- package/hooks/blocking/test_code_rules_enforcer_typed_dict_pairs.py +141 -0
- package/hooks/blocking/test_code_rules_enforcer_unused_imports.py +158 -0
- package/hooks/blocking/test_convergence_gate_blocker.py +63 -0
- package/hooks/blocking/test_destructive_command_blocker.py +146 -0
- package/hooks/blocking/test_destructive_command_blocker_no_verify.py +102 -0
- package/hooks/blocking/test_gh_body_arg_blocker.py +45 -0
- package/hooks/blocking/test_md_to_html_blocker.py +317 -0
- package/hooks/config/any_type_config.py +7 -0
- package/hooks/config/banned_identifiers_constants.py +11 -0
- package/hooks/config/blocking_check_limits.py +38 -0
- package/hooks/config/bot_mention_comment_blocker_constants.py +20 -0
- package/hooks/config/code_rules_enforcer_constants.py +53 -0
- package/hooks/config/convergence_branch_constants.py +9 -0
- package/hooks/config/doc_gist_auto_publish_constants.py +18 -0
- package/hooks/config/html_companion_constants.py +20 -0
- package/hooks/config/inline_tuple_string_magic_constants.py +22 -0
- package/hooks/config/test_banned_identifiers_constants.py +17 -0
- package/hooks/hooks.json +28 -20
- package/hooks/pyproject.toml +69 -0
- package/hooks/validators/mypy_integration.py +47 -1
- package/hooks/validators/run_all_validators.py +3 -3
- package/hooks/validators/test_mypy_integration.py +50 -1
- package/hooks/workflow/doc_gist_auto_publish.py +144 -0
- package/hooks/workflow/md_to_html_companion.py +365 -0
- package/hooks/workflow/test_doc_gist_auto_publish.py +117 -0
- package/hooks/workflow/test_md_to_html_companion.py +452 -0
- package/package.json +1 -1
- package/rules/gh-body-file.md +2 -0
- package/scripts/Install-SweepEmptyDirs.ps1 +111 -0
- package/scripts/check.ps1 +106 -0
- package/scripts/config/timing.py +11 -0
- package/scripts/sweep_empty_dirs.py +138 -0
- package/scripts/sync_to_cursor/rules.py +1 -1
- package/scripts/test_sweep_empty_dirs.py +183 -0
- package/skills/_shared/pr-loop/prompts/pr-consistency-audit.xml +323 -0
- package/skills/_shared/pr-loop/scripts/_cli_utils.py +22 -0
- package/skills/_shared/pr-loop/scripts/_path_resolver.py +165 -0
- package/skills/_shared/pr-loop/scripts/_xml_utils.py +20 -0
- package/skills/_shared/pr-loop/scripts/build_audit_prompt.py +182 -0
- package/skills/_shared/pr-loop/scripts/build_fix_prompt.py +185 -0
- package/skills/_shared/pr-loop/scripts/config/__init__.py +0 -0
- package/skills/_shared/pr-loop/scripts/config/path_resolver_constants.py +78 -0
- package/skills/_shared/pr-loop/scripts/init_loop_state.py +135 -0
- package/skills/_shared/pr-loop/scripts/teardown_worktrees.py +175 -0
- package/skills/_shared/pr-loop/scripts/write_audit_outcomes.py +182 -0
- package/skills/_shared/pr-loop/scripts/write_fix_outcomes.py +206 -0
- package/skills/bugteam/CONSTRAINTS.md +21 -22
- package/skills/bugteam/EXAMPLES.md +3 -3
- package/skills/bugteam/PROMPTS.md +227 -67
- package/skills/bugteam/SKILL.md +114 -455
- package/skills/bugteam/reference/README.md +1 -1
- package/skills/bugteam/reference/audit-and-teammates.md +112 -39
- package/skills/bugteam/reference/audit-contract.md +4 -22
- package/skills/bugteam/reference/copilot-gap-analysis.md +8 -5
- package/skills/bugteam/reference/design-rationale.md +2 -2
- package/skills/bugteam/reference/github-pr-reviews.md +50 -57
- package/skills/bugteam/reference/obstacles/audit-assign-ids.md +13 -0
- package/skills/bugteam/reference/obstacles/audit-capture-excerpts.md +13 -0
- package/skills/bugteam/reference/obstacles/audit-walk-categories.md +13 -0
- package/skills/bugteam/reference/obstacles/audit-write-xml.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-append-summary.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-apply-fixes.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-git-add-commit.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-git-push.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-post-reply.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-publish-summary.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-py-compile.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-read-files.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-resolve-thread.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-test-suite.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-violation-count.md +13 -0
- package/skills/bugteam/reference/obstacles/fix-write-xml.md +13 -0
- package/skills/bugteam/reference/team-setup.md +106 -9
- package/skills/bugteam/reference/teardown-publish-permissions.md +39 -8
- package/skills/bugteam/scripts/README.md +60 -0
- package/skills/bugteam/scripts/_claude_permissions_common.py +358 -0
- package/skills/bugteam/scripts/bugteam_code_rules_gate.py +976 -0
- package/skills/bugteam/scripts/bugteam_fix_hookspath.py +375 -0
- package/skills/bugteam/scripts/bugteam_preflight.py +294 -0
- package/skills/bugteam/scripts/config/bugteam_code_rules_gate_constants.py +25 -0
- package/skills/bugteam/scripts/config/bugteam_fix_hookspath_constants.py +26 -0
- package/skills/bugteam/scripts/config/bugteam_preflight_constants.py +35 -0
- package/skills/bugteam/scripts/config/claude_permissions_common_constants.py +20 -0
- package/skills/bugteam/scripts/config/probe_code_rules_enforcer_check_constants.py +12 -0
- package/skills/bugteam/scripts/config/windows_safe_rmtree_constants.py +7 -0
- package/skills/bugteam/scripts/grant_project_claude_permissions.py +175 -0
- package/skills/bugteam/scripts/probe_code_rules_enforcer_check.py +107 -0
- package/skills/bugteam/scripts/revoke_project_claude_permissions.py +220 -0
- package/skills/bugteam/scripts/test__claude_permissions_common.py +112 -0
- package/skills/bugteam/scripts/test_bugteam_code_rules_gate.py +400 -0
- package/skills/bugteam/scripts/test_bugteam_fix_hookspath.py +384 -0
- package/skills/bugteam/scripts/test_bugteam_preflight.py +268 -0
- package/skills/bugteam/scripts/test_claude_permissions_common.py +195 -0
- package/skills/bugteam/scripts/test_grant_project_claude_permissions.py +55 -0
- package/skills/bugteam/scripts/test_probe_code_rules_enforcer_check.py +76 -0
- package/skills/bugteam/scripts/test_revoke_project_claude_permissions.py +55 -0
- package/skills/bugteam/scripts/test_windows_safe_rmtree.py +108 -0
- package/skills/bugteam/scripts/windows_safe_rmtree.py +100 -0
- package/skills/bugteam/test_skill_additions.py +1 -11
- package/skills/code/SKILL.md +176 -0
- package/skills/doc-gist/SKILL.md +99 -0
- package/skills/doc-gist/references/examples/01-exploration-code-approaches.html +453 -0
- package/skills/doc-gist/references/examples/02-exploration-visual-designs.html +515 -0
- package/skills/doc-gist/references/examples/03-code-review-pr.html +638 -0
- package/skills/doc-gist/references/examples/04-code-understanding.html +491 -0
- package/skills/doc-gist/references/examples/05-design-system.html +629 -0
- package/skills/doc-gist/references/examples/06-component-variants.html +605 -0
- package/skills/doc-gist/references/examples/07-prototype-animation.html +455 -0
- package/skills/doc-gist/references/examples/08-prototype-interaction.html +396 -0
- package/skills/doc-gist/references/examples/09-slide-deck.html +592 -0
- package/skills/doc-gist/references/examples/10-svg-illustrations.html +492 -0
- package/skills/doc-gist/references/examples/11-status-report.html +528 -0
- package/skills/doc-gist/references/examples/12-incident-report.html +596 -0
- package/skills/doc-gist/references/examples/13-flowchart-diagram.html +395 -0
- package/skills/doc-gist/references/examples/14-research-feature-explainer.html +381 -0
- package/skills/doc-gist/references/examples/15-research-concept-explainer.html +368 -0
- package/skills/doc-gist/references/examples/16-implementation-plan.html +702 -0
- package/skills/doc-gist/references/examples/17-pr-writeup.html +595 -0
- package/skills/doc-gist/references/examples/18-editor-triage-board.html +573 -0
- package/skills/doc-gist/references/examples/19-editor-feature-flags.html +663 -0
- package/skills/doc-gist/references/examples/20-editor-prompt-tuner.html +722 -0
- package/skills/doc-gist/references/examples/README.md +5 -0
- package/skills/doc-gist/scripts/config/__init__.py +0 -0
- package/skills/doc-gist/scripts/config/gist_upload_constants.py +16 -0
- package/skills/doc-gist/scripts/gist_upload.py +177 -0
- package/skills/doc-gist/scripts/test_gist_upload.py +51 -0
- package/skills/findbugs/SKILL.md +68 -2
- package/skills/monitor-open-prs/SKILL.md +13 -32
- package/skills/monitor-open-prs/test_skill_contract.py +0 -11
- package/skills/pr-consistency-audit/SKILL.md +112 -0
- package/skills/pr-consistency-audit/reference/detection-rules.md +96 -0
- package/skills/pr-consistency-audit/reference/illustrations.md +78 -0
- package/skills/pr-converge/SKILL.md +227 -23
- package/skills/pr-converge/config/__init__.py +0 -0
- package/skills/pr-converge/config/constants.py +62 -0
- package/skills/pr-converge/reference/convergence-gates.md +138 -44
- package/skills/pr-converge/reference/examples.md +43 -11
- package/skills/pr-converge/reference/fix-protocol.md +6 -5
- package/skills/pr-converge/reference/ground-rules.md +5 -3
- package/skills/pr-converge/reference/multi-pr-orchestration.md +44 -19
- package/skills/pr-converge/reference/obstacles/fix-post-replies.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-publish-summary.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-push.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-read-filelines.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-reset-state.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-resolve-threads.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-spawn-clean-coder.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-stage-commit.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-trigger-bugbot.md +13 -0
- package/skills/pr-converge/reference/obstacles/fix-write-test.md +13 -0
- package/skills/pr-converge/reference/per-tick.md +90 -31
- package/skills/pr-converge/reference/state-schema.md +22 -1
- package/skills/pr-converge/reference/stop-conditions.md +9 -7
- package/skills/pr-converge/scripts/README.md +34 -46
- package/skills/pr-converge/scripts/check_bugbot_ci.py +174 -0
- package/skills/pr-converge/scripts/check_convergence.py +497 -0
- package/skills/pr-converge/scripts/check_pending_reviews.py +154 -0
- package/skills/pr-converge/scripts/config/pr_converge_constants.py +118 -0
- package/skills/pr-converge/scripts/fetch_copilot_reviews.py +134 -0
- package/skills/pr-converge/scripts/post_fix_reply.py +168 -0
- package/skills/pr-converge/workflows/schedule-wakeup-loop.md +5 -12
- package/skills/qbug/SKILL.md +132 -27
- package/skills/session-log/SKILL.md +216 -114
- package/skills/session-tidy/SKILL.md +1 -1
- package/skills/skill-builder/SKILL.md +138 -56
- package/skills/skill-builder/references/delegation-map.md +72 -113
- package/skills/skill-builder/references/progressive-disclosure.md +122 -0
- package/skills/skill-builder/references/self-audit-checklist.md +92 -0
- package/skills/skill-builder/references/skill-types.md +228 -0
- package/skills/skill-builder/references/thariq-x-post-skills.json +33 -0
- package/skills/skill-builder/templates/gap-analysis.md +15 -8
- package/skills/skill-builder/workflows/improve-skill.md +86 -57
- package/skills/skill-builder/workflows/new-skill.md +80 -168
- package/skills/skill-builder/workflows/polish-skill.md +78 -54
- package/skills/structure-prompt/SKILL.md +50 -0
- package/skills/structure-prompt/reference/adversarial-tuning.md +62 -0
- package/skills/structure-prompt/reference/block-classification.md +27 -0
- package/skills/structure-prompt/reference/canonical-case.md +48 -0
- package/skills/structure-prompt/reference/citation-depth.md +70 -0
- package/skills/structure-prompt/reference/cleanup.md +33 -0
- package/skills/structure-prompt/reference/constraints.md +33 -0
- package/skills/structure-prompt/reference/directives.md +37 -0
- package/skills/structure-prompt/reference/examples.md +72 -0
- package/skills/structure-prompt/reference/instantiation.md +51 -0
- package/skills/structure-prompt/reference/output-contract.md +72 -0
- package/skills/structure-prompt/reference/per-category.md +23 -0
- package/skills/structure-prompt/reference/persona.md +38 -0
- package/skills/structure-prompt/reference/research.md +33 -0
- package/skills/structure-prompt/reference/structure.md +28 -0
- package/agents/code-standards-agent.md +0 -93
- package/agents/groq-coder.md +0 -113
- package/agents/plan-executor.md +0 -226
- package/agents/project-docs-analyzer.md +0 -53
- package/agents/project-structure-organizer-agent.md +0 -72
- package/agents/skill-to-agent-converter.md +0 -370
- package/agents/skill-writer-agent.md +0 -470
- package/agents/user-docs-writer.md +0 -67
- package/agents/workflow-visual-documenter.md +0 -82
- package/commands/readability-review.md +0 -20
- package/hooks/mypy.ini +0 -2
- package/hooks/notification/attention_needed_notify.py +0 -71
- package/hooks/notification/claude_notification_handler.py +0 -67
- package/hooks/notification/notification_utils.py +0 -267
- package/hooks/notification/subagent_complete_notify.py +0 -381
- package/hooks/notification/test_attention_needed_notify.py +0 -47
- package/hooks/notification/test_claude_notification_handler.py +0 -54
- package/hooks/notification/test_notification_utils.py +0 -91
- package/hooks/notification/test_subagent_complete_notify.py +0 -79
- package/scripts/config/groq_bugteam_config.py +0 -230
- package/scripts/config/test_groq_bugteam_config.py +0 -83
- package/scripts/config/test_spec_implementer_prompt.py +0 -32
- package/scripts/groq_bugteam.README.md +0 -131
- package/scripts/groq_bugteam.py +0 -647
- package/scripts/groq_bugteam_dotenv.py +0 -40
- package/scripts/groq_bugteam_spec.py +0 -226
- package/scripts/test_groq_bugteam.py +0 -529
- package/scripts/test_groq_bugteam_apply_fix_from_spec.py +0 -426
- package/scripts/test_groq_bugteam_dotenv.py +0 -66
- package/scripts/test_groq_bugteam_spec.py +0 -338
- package/skills/bugteam/SKILL_EVALS.md +0 -309
- package/skills/dream/SKILL.md +0 -118
- package/skills/ingest/SKILL.md +0 -40
- package/skills/npm-creator/SKILL.md +0 -187
- package/skills/readability-review/SKILL.md +0 -127
- package/skills/resume-review/SKILL.md +0 -261
- package/skills/rule-audit/SKILL.md +0 -307
- package/skills/rule-creator/SKILL.md +0 -150
- package/skills/searching-obsidian-vault/SKILL.md +0 -131
- package/skills/skill-writer/REFERENCE.md +0 -284
- package/skills/skill-writer/SKILL.md +0 -222
- package/skills/tdd-team/SKILL.md +0 -128
@@ -1,235 +1,147 @@
 # New Skill Workflow
 
-
+Best-practice-driven lifecycle for building a skill from scratch.
 
 ## Prerequisites
 
 - The user has a task or domain they want to capture as a skill
 - No existing skill for this capability (or intentionally starting fresh)
 
-### Ground-up package layout (required before multi-file implementation)
-
-When the outcome includes **ARCHITECTURE.md**, **REFERENCE / EXAMPLES / WORKFLOWS**, and **`evals/*.json`** under a workspace (Anthropic-style progressive disclosure plus checkpointed rollout):
-
-1. Read `prompt-generator/templates/skill-from-ground-up.md` from the installed `~/.claude/skills/` tree (provided by [@jl-cmd/prompt-generator](https://github.com/jl-cmd/prompt-generator)).
-2. Run `/prompt-generator` using that template (substitute tokens per its table) **before** Phase 3 expands the repo; align the XML scope block with this workflow’s workspace and evidence rules.
-3. Keep Phase 1–2 artifacts honest: eval prompts and expectations stay grounded in **real** user scenarios; the template reinforces eval rows that reference pasted or explicitly approved evidence only.
-
-Skip this block only when the user explicitly wants a **single-file** SKILL.md with no staged package plan.
-
-Refinements to an **existing** skill package use `prompt-generator/templates/skill-refinement-package.md` instead (see `improve-skill.md`).
-
----
-
-## Phase 1: Identify Gaps
-
-**Goal:** Document what fails or requires repeated context when working without a skill.
-
-### Process
-
-1. Have a guided conversation to uncover gaps. Explore these areas:
-- "What task were you doing when you realized you needed a skill?"
-- "What context did you repeatedly provide to Claude?"
-- "Where did Claude fail or produce subpar results without guidance?"
-- "What domain knowledge was missing?"
-- "What specific format or structure did you need?"
-- "Were there tools or scripts that needed to be used in a particular way?"
-- "What rules or constraints did Claude violate?"
-
-2. As patterns emerge, probe for eval-worthy scenarios:
-- "Can you give me a concrete example of a task where this failed?"
-- "What would success look like for that specific task?"
-- "Are there edge cases where the right approach changes?"
-
-3. Generate `gap-analysis.md` from the conversation using the template at `${CLAUDE_SKILL_DIR}/templates/gap-analysis.md`. Fill in all sections from what was discussed.
-
-4. Review the gap analysis with the user. Confirm completeness before moving to Phase 2.
-
-**Output:** `[skill-name]-workspace/gap-analysis.md`
-
 ---
 
-##
-
-**Goal:** Create 3+ evaluation scenarios that test the identified gaps. Establish a baseline.
+## Step 1: Classify
 
-
+**Goal:** Determine the skill type. Type dictates folder structure.
 
-1.
-- A realistic user prompt (detailed and specific, like a real request)
-- A description of what success looks like
-- Objectively verifiable expectations (assertions)
+1. Read `${CLAUDE_SKILL_DIR}/references/skill-types.md`.
 
-2.
-- Minimum 3 scenarios (official requirement)
-- Every identified gap has at least one scenario testing it
-- Expectations are objectively verifiable, not subjective
-- Prompts sound like things a real user would say
+2. Ask the user about the skill’s purpose:
 
-
+> "What will this skill help Claude do?"
 
-
+Match the answer against the 9 types. If ambiguous, present the top 2-3 matches and ask the user to choose.
 
-
+3. Record the classification: type number, type name, recommended folders.
 
-
-Execute this task with NO skill loaded:
-- Task: [eval prompt]
-- Input files: [eval files if any, or "none"]
-- Save all output files to: [workspace]/iteration-0/eval-[name]/without_skill/outputs/
-- Save a complete transcript of your work to: [workspace]/iteration-0/eval-[name]/without_skill/transcript.md
-```
-
-Spawn all baseline runs in parallel. Capture timing data when each completes.
-
-6. Grade baseline results using the skill-creator grading agent. See `${CLAUDE_SKILL_DIR}/references/delegation-map.md` for exact grading invocation.
-
-**Output:** `[skill-name]-workspace/evals/evals.json` and baseline results in `iteration-0/`
+**Output:** Type classification with folder plan.
 
 ---
 
-##
+## Step 2: Scaffold
 
-**Goal:** Create
+**Goal:** Create the folder structure. Every skill starts with the same skeleton plus type-specific additions.
 
-
+1. Create the skill directory if it doesn’t exist.
 
-
+2. Create the minimum structure:
 
 ```
-
-
-Gap analysis: [reference or paste gap-analysis.md]
-Eval scenarios: [reference or paste evals.json expected_output and expectations]
-Baseline failures: [summarize what Claude got wrong in iteration-0]
-
-Constraint: Write the minimum instructions needed to address these specific gaps.
-Every line must serve a documented gap. Do not over-document.
+skill-name/
+├── SKILL.md # Hub — every skill has this
 ```
 
-
+3. Add type-specific directories based on Step 1 classification (see `${CLAUDE_SKILL_DIR}/references/skill-types.md` for the folder recommendations per type).
 
-
-- "Does this address all the gaps we identified?"
-- "Is anything here unnecessary or over-engineered?"
-- "Would this pass our eval scenarios?"
+4. Verify the scaffold matches the type recommendation.
 
-
+> "As your Skill grows, you can bundle additional content that Claude loads only when needed."
 
-**Output:**
+**Output:** Directory tree with SKILL.md stub.
 
 ---
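Step 2 above describes a skeleton (a `skill-name/` directory holding a `SKILL.md` hub) plus type-specific folders. A minimal Python sketch of that scaffold follows; it is an illustration only, not part of the package. The function name `scaffold_skill`, its parameters, and the stub contents are assumptions, and the per-type folder list has to come from `references/skill-types.md`, which this diff does not show.

```python
from pathlib import Path


def scaffold_skill(root: Path, skill_name: str, extra_folders: list[str]) -> Path:
    """Create <root>/<skill_name>/ with a SKILL.md stub and optional folders.

    `extra_folders` stands in for the type-specific directories chosen in
    Step 1; the names passed here are hypothetical.
    """
    skill_dir = root / skill_name
    skill_dir.mkdir(parents=True, exist_ok=True)

    # Every skill gets the hub file; never overwrite one that already exists.
    hub = skill_dir / "SKILL.md"
    if not hub.exists():
        hub.write_text(f"# {skill_name}\n", encoding="utf-8")

    # Add whatever folders the chosen skill type recommends.
    for folder in extra_folders:
        (skill_dir / folder).mkdir(parents=True, exist_ok=True)
    return skill_dir
```

Calling it twice is harmless: existing directories and an existing `SKILL.md` are left untouched.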
 
-##
+## Step 3: Gather
 
-**Goal:**
+**Goal:** Collect domain knowledge, failure patterns, and gotchas from the user.
 
-
+> "Build a Gotchas Section — these sections should be built up from common failure points that Claude runs into when using your skill."
 
-
+### Interview questions
 
-
-Execute this task:
-- Read the skill at [path-to-skill]/SKILL.md and follow its instructions
-- Task: [eval prompt from evals.json]
-- Input files: [eval files if any, or "none"]
-- Save all output files to: [workspace]/iteration-N/eval-[name]/with_skill/outputs/
-- Save a complete transcript of your work to: [workspace]/iteration-N/eval-[name]/with_skill/transcript.md
-```
+Ask the user:
 
-
+1. "What task were you doing when you realized you needed a skill?"
+2. "What context did you repeatedly provide to Claude?"
+3. "Where did Claude fail or produce subpar results without guidance?"
+4. "What does Claude consistently get wrong about this domain?"
+5. "What specific format or structure do you need in the output?"
+6. "Are there rules or constraints Claude must never violate?"
+7. "What tools, scripts, or libraries does Claude need to use?"
+8. "Does this skill need to run differently for different models (Haiku vs Opus)?"
 
-
+### Generate gap analysis
 
-
+Use the template at `${CLAUDE_SKILL_DIR}/templates/gap-analysis.md`. Fill in:
 
-
+- Skill type and degree of freedom
+- Task description
+- Gaps identified (what failed, what was needed)
+- Recurring patterns across gaps
+- Initial gotcha candidates
 
-
+### Assess degree of freedom
 
-
+> "Match the level of specificity to the task’s fragility and variability."
 
-
-
-
+| Degree | When | Example |
+|---|---|---|
+| High | Multiple valid approaches, context-dependent | Code review guidelines |
+| Medium | Preferred pattern exists, some variation ok | Report generation with template |
+| Low | Fragile operations, consistency critical | Database migration with exact script |
 
-
+Record the assessment with reasoning.
 
-**Output:**
+**Output:** Completed gap analysis, initial gotchas list, degree-of-freedom assessment.
 
 ---
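The Step 3 outputs (gap analysis fields plus a degree-of-freedom assessment with reasoning) could be held in a small record so that empty sections are easy to spot before the handoff. The sketch below is hypothetical: the class and field names are invented to mirror the fill-in list above and do not reflect the actual `gap-analysis.md` template.

```python
from dataclasses import dataclass, field
from enum import Enum


class DegreeOfFreedom(Enum):
    HIGH = "high"      # multiple valid approaches, context-dependent
    MEDIUM = "medium"  # preferred pattern exists, some variation ok
    LOW = "low"        # fragile operations, consistency critical


@dataclass
class GapAnalysis:
    """Hypothetical container for the Step 3 outputs."""
    skill_type: str
    degree_of_freedom: DegreeOfFreedom
    degree_reasoning: str
    task_description: str
    gaps: list[str] = field(default_factory=list)
    recurring_patterns: list[str] = field(default_factory=list)
    gotcha_candidates: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        sections = {
            "skill_type": self.skill_type.strip(),
            "degree_reasoning": self.degree_reasoning.strip(),
            "task_description": self.task_description.strip(),
            "gaps": self.gaps,
            "recurring_patterns": self.recurring_patterns,
            "gotcha_candidates": self.gotcha_candidates,
        }
        return [name for name, value in sections.items() if not value]
```

An empty result from `missing_sections()` only means every section has some content; whether that content is complete is still a judgment for the user review.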
 
-##
-
-**Goal:** Refine the skill based on observed Claude B behavior and user feedback.
+## Step 4: Write
 
-
+**Goal:** Produce the skill package — SKILL.md and companion files.
 
-
+Delegate to `/skill-writer` using the structured handoff from `${CLAUDE_SKILL_DIR}/references/delegation-map.md`.
 
-
-- **Unexpected exploration paths** -- Claude B read files in an order you did not anticipate
-- **Missed connections** -- Claude B did not follow references to important files
-- **Overreliance on certain sections** -- content that should be promoted to SKILL.md
-- **Ignored content** -- files Claude B never accessed (may be unnecessary or poorly signaled)
-- **Repeated work across test cases** -- all subagents wrote similar helper scripts (bundle them into the skill)
+The handoff must include: skill type, folder structure, gap analysis, initial gotchas, degree of freedom, constraints.
 
-
+After skill-writer produces the draft:
 
-
-
-
-
+1. Verify it follows the hub layout (principle → gotchas → when-applies → process → file index → folder map).
+2. Verify SKILL.md body is under 500 lines.
+3. Verify all references are one level deep.
+4. Verify files over 100 lines have a TOC.
 
-
-User feedback: [from feedback.json -- only non-empty entries]
-Behavioral observations: [from transcript analysis]
+Fix structural issues before proceeding.
 
-
-1. [Issue]
-2. [Issue]
+**Output:** Complete skill package at the target directory.
 
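Two of the Step 4 checks are mechanical: the SKILL.md body stays under 500 lines, and files over 100 lines carry a TOC. A rough Python sketch of those two checks is shown here under stated assumptions: "body" is taken to mean everything after an optional leading `---` frontmatter block, and a TOC is detected by a heading that mentions "contents". Both are guesses at intent, and the function names are illustrative rather than anything shipped in the package.

```python
import re

MAX_BODY_LINES = 500  # from Step 4: SKILL.md body under 500 lines
TOC_THRESHOLD = 100   # from Step 4: files over 100 lines need a TOC


def body_lines(text: str) -> list[str]:
    """Return the lines after an optional leading '---' frontmatter block."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for index in range(1, len(lines)):
            if lines[index].strip() == "---":
                return lines[index + 1:]
    return lines


def has_toc(text: str) -> bool:
    """Assumed heuristic: any Markdown heading whose title mentions 'contents'."""
    pattern = r"^#+\s.*contents"
    return re.search(pattern, text, flags=re.IGNORECASE | re.MULTILINE) is not None


def structural_issues(files: dict[str, str]) -> list[str]:
    """Map of relative path -> file text in, list of human-readable issues out."""
    issues = []
    for name, text in files.items():
        if name == "SKILL.md" and len(body_lines(text)) >= MAX_BODY_LINES:
            issues.append(f"{name}: body is not under {MAX_BODY_LINES} lines")
        if len(text.splitlines()) > TOC_THRESHOLD and not has_toc(text):
            issues.append(f"{name}: over {TOC_THRESHOLD} lines without a TOC")
    return issues
```

The hub-layout and reference-depth checks are left out on purpose, since they depend on conventions this diff does not spell out.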
-
-```
+---
 
-
-- **Generalize from feedback** -- the skill will be used across many different prompts, not just these test cases
-- **Keep the prompt lean** -- remove instructions that are not pulling their weight
-- **Explain the why** -- theory of mind beats rigid MUSTs
-- **Bundle repeated work** -- if subagents all wrote similar scripts, add them to the skill
+## Step 5: Self-Audit
 
-
-- User feedback is all empty (satisfied with every test case)
-- Pass rates meet acceptable thresholds
-- No meaningful progress between iterations
+**Goal:** Verify every best practice is satisfied before delivery.
 
-
+1. Read `${CLAUDE_SKILL_DIR}/references/self-audit-checklist.md`.
+2. Copy the checklist into your response.
+3. Check every item against the built skill. For each: PASS, FAIL with file:line evidence, or N/A with reason.
+4. Every FAIL must be fixed before proceeding. Apply fixes, then re-check that item.
+5. When all items are PASS or N/A, proceed to Step 6.
 
-
+For an independent check, spawn a subagent to run the audit (see delegation-map.md).
 
-**
+**Output:** Completed checklist with all items PASS or N/A.
 
-
+---
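The Step 5 gate (every item is PASS, FAIL, or N/A, and any FAIL blocks delivery) can be sketched as a small tally. This is an illustration only: the function name, the item labels, and the wording of the blocked message are assumptions, while the success string follows the "All ... items: N passed, M N/A." shape used for the audit summary in Step 6.

```python
from collections import Counter

VALID_STATUSES = {"PASS", "FAIL", "N/A"}


def summarize_audit(results: dict[str, str]) -> str:
    """Summarize checklist results keyed by a (hypothetical) item label."""
    unknown = sorted(set(results.values()) - VALID_STATUSES)
    if unknown:
        raise ValueError(f"unexpected status values: {unknown}")

    counts = Counter(results.values())
    if counts["FAIL"]:
        # Any FAIL blocks delivery until it is fixed and re-checked.
        failing = sorted(item for item, status in results.items() if status == "FAIL")
        return f"Blocked: {counts['FAIL']} FAIL item(s): {', '.join(failing)}"
    return f"All {len(results)} items: {counts['PASS']} passed, {counts['N/A']} N/A."
```

The sketch does not capture the file:line evidence or N/A reasons that the workflow asks for; those would need a richer record per item.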
 
-
+## Step 6: Deliver
 
-
-- [ ] Description is third person with trigger phrases
-- [ ] Under 500 lines
-- [ ] States what to do in positive terms (not prohibition-heavy)
-- [ ] Degree of freedom matches task fragility
-- [ ] Progressive disclosure used (heavy content in separate files)
-- [ ] Examples are concrete, not abstract
-- [ ] Frontmatter fields are valid
-- [ ] One skill = one capability
+**Goal:** Hand off the finished skill with full documentation.
 
-
-- [ ] At least 3 evaluation scenarios created and passing
-- [ ] Tested with real usage scenarios
-- [ ] Skill solves documented gaps (not imagined requirements)
-- [ ] Iterative refinement based on observed behavior (not assumptions)
+Present to the user:
 
-
-
-
-
+1. **File map** — every file created, with its purpose.
+2. **Skill type** — classification and why it fits.
+3. **Degree of freedom** — assessment and reasoning.
+4. **Gotchas seeded** — initial gotchas captured.
+5. **Audit summary** — "All 38 items: N passed, M N/A."
+6. **Maintenance notes** — what to watch for in future usage that might warrant iteration.
+7. **Suggested first test** — a concrete task to try with Claude B.
@@ -4,89 +4,113 @@ Final optimization pass for a skill that is functionally complete.
|
|
|
4
4
|
|
|
5
5
|
## Prerequisites
|
|
6
6
|
|
|
7
|
-
- The skill
|
|
7
|
+
- The skill has been used and observed
|
|
8
8
|
- The user is satisfied with output quality
|
|
9
9
|
- This is the final step before the skill is considered done
|
|
10
10
|
|
|
11
|
-
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## Step 1: Description Audit
|
|
14
|
+
|
|
15
|
+
**Goal:** Verify the description field is optimized for model discovery.
|
|
16
|
+
|
|
17
|
+
> "The description is critical for skill selection: Claude uses it to choose the right Skill from potentially 100+ available Skills."
|
|
18
|
+
|
|
19
|
+
> "The description field is not a summary — it's a description of when to trigger."
|
|
12
20
|
|
|
13
|
-
|
|
21
|
+
Check each requirement:
|
|
22
|
+
|
|
23
|
+
- [ ] **Third person.** "Processes Excel files" not "I can help you process Excel files."
|
|
24
|
+
- [ ] **Includes what AND when.** Both the capability and trigger contexts.
|
|
25
|
+
- [ ] **Specific trigger phrases.** Different phrasings of the same intent should all match.
|
|
26
|
+
- [ ] **Under 1024 characters.** Hard limit.
|
|
27
|
+
- [ ] **No XML tags.**
|
|
28
|
+
- [ ] **Distinguishable from similar skills.** If two skills overlap, the descriptions must make the boundary clear.
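Three of the checks above are mechanical enough to sketch in code. The function name and regexes below are hypothetical, and the first-person test is a rough heuristic (it would flag "I/O", for example). "Under 1024 characters" is read literally as fewer than 1024. The what-and-when, trigger-phrase, and distinguishability checks still need human or model judgment.

```python
import re

MAX_DESCRIPTION_CHARS = 1024  # "Under 1024 characters", read as strictly fewer
XML_TAG = re.compile(r"</?[A-Za-z][^<>]*>")
# Heuristic only: common first-person words suggest the description is not third person.
FIRST_PERSON = re.compile(r"\b(?:I|[Mm]y|[Ww]e|[Oo]ur)\b")


def audit_description(description: str) -> dict[str, bool]:
    """Return True for each mechanical check that the description passes."""
    return {
        "third_person": FIRST_PERSON.search(description) is None,
        "under_limit": len(description) < MAX_DESCRIPTION_CHARS,
        "no_xml_tags": XML_TAG.search(description) is None,
    }
```

A description that fails any key here is a candidate for the before/after revision described under "Fix issues".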
|
|
29
|
+
|
|
30
|
+
### Trigger phrase review
|
|
31
|
+
|
|
32
|
+
Generate 10 variations of the user's intent:
|
|
33
|
+
- Formal and casual phrasings
|
|
34
|
+
- Cases where the user doesn't explicitly name the skill but clearly needs it
|
|
35
|
+
- Cases where this skill competes with another but should win
|
|
14
36
|
|
|
15
|
-
|
|
16
|
-
2. Run `/prompt-generator` with tokens filled so `ARCHITECTURE.md` records baseline inventory, planned deltas for polish, and evidence rules for any new trigger or behavior evals.
|
|
37
|
+
For each, answer: would the current description cause Claude to select this skill?
|
|
17
38
|
|
|
18
|
-
|
|
39
|
+
Also check 5 near-miss phrasings — adjacent domains where this skill should NOT trigger. Verify the description doesn't cause false activation.
|
|
40
|
+
|
|
41
|
+
### Fix issues
|
|
42
|
+
|
|
43
|
+
If the description fails any check, revise it. Show before/after with the specific change and why it improves discovery.
|
|
44
|
+
|
|
45
|
+
**Output:** Verified description (and revised version if changes were made).
|
|
19
46
|
|
|
20
47
|
---
|
|
21
48
|
|
|
22
|
-
## Step
|
|
49
|
+
## Step 2: Progressive Disclosure Audit
|
|
23
50
|
|
|
24
|
-
|
|
51
|
+
**Goal:** Verify the file structure follows all progressive disclosure rules.
|
|
25
52
|
|
|
26
|
-
|
|
53
|
+
> "Keep SKILL.md body under 500 lines."
|
|
27
54
|
|
|
28
|
-
|
|
55
|
+
Check:
|
|
29
56
|
|
|
30
|
-
|
|
31
|
-
-
|
|
32
|
-
-
|
|
33
|
-
-
|
|
34
|
-
-
|
|
57
|
+
- [ ] SKILL.md body under 500 lines.
|
|
58
|
+
- [ ] All reference files link directly from SKILL.md (one level deep).
|
|
59
|
+
- [ ] Every file over 100 lines has a table of contents.
|
|
60
|
+
- [ ] File index in SKILL.md lists every companion file with its purpose.
|
|
61
|
+
- [ ] Forward slashes only in all paths.
|
|
62
|
+
- [ ] File names are descriptive (`form_validation_rules.md`, not `doc2.md`).
|
|
63
|
+
- [ ] Scripts clearly marked as execute vs read-as-reference.
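A minimal sketch of the countable checks in this list, working on an in-memory map of path to file text so it stays self-contained. The `audit_structure` name, the finding messages, and the table-of-contents heuristic (looking for "table of contents" or a "## Contents" heading) are assumptions. It counts every line in SKILL.md, frontmatter included, which is stricter than the "body under 500 lines" rule.

```python
def audit_structure(files: dict[str, str]) -> list[str]:
    """Return findings for a skill folder given as {relative_path: text}."""
    findings = []
    for path, text in files.items():
        line_count = len(text.splitlines())
        lowered = text.lower()

        # Forward slashes only in all paths.
        if "\\" in path:
            findings.append(f"{path}: use forward slashes in paths")

        # SKILL.md under 500 lines (whole file counted, a conservative proxy for the body).
        if path == "SKILL.md" and line_count >= 500:
            findings.append(f"{path}: {line_count} lines, expected under 500")

        # Every file over 100 lines has a table of contents (heuristic match).
        has_toc = "table of contents" in lowered or "## contents" in lowered
        if line_count > 100 and not has_toc:
            findings.append(f"{path}: {line_count} lines and no table of contents")
    return findings
```

Link depth, file-index completeness, descriptive names, and script marking are left out because they depend on reading the content rather than counting it.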
|
|
35
64
|
|
|
36
|
-
|
|
37
|
-
- Adjacent domains with overlapping terminology
|
|
38
|
-
- Ambiguous phrasing where naive keyword matching would falsely trigger
|
|
39
|
-
- Tasks that touch the skill's domain but in a context where another tool is better
|
|
65
|
+
### Fix structural issues
|
|
40
66
|
|
|
41
|
-
|
|
67
|
+
If any check fails, restructure. Common fixes:
|
|
68
|
+
- SKILL.md too long → move sections to companion files, leave links.
|
|
69
|
+
- Nested references → link every reference file directly from SKILL.md.
|
|
70
|
+
- Missing TOC → add to files over 100 lines.
|
|
42
71
|
|
|
43
|
-
|
|
72
|
+
**Output:** Verified file structure (and restructured files if changes were made).
|
|
44
73
|
|
|
45
|
-
|
|
74
|
+
---
|
|
46
75
|
|
|
47
|
-
|
|
76
|
+
## Step 3: Gotcha Freshness
|
|
48
77
|
|
|
49
|
-
|
|
78
|
+
**Goal:** Ensure gotchas reflect current observations.
|
|
50
79
|
|
|
51
|
-
|
|
52
|
-
1. Splits eval set into 60% train / 40% held-out test
|
|
53
|
-
2. Evaluates current description (3 runs per query for reliability)
|
|
54
|
-
3. Proposes improvements based on failures
|
|
55
|
-
4. Re-evaluates on both train and test
|
|
56
|
-
5. Iterates up to 5 times
|
|
57
|
-
6. Selects best description by test score (avoids overfitting)
|
|
80
|
+
> "Ideally, you will update your skill over time to capture these gotchas."
|
|
58
81
|
|
|
59
|
-
|
|
82
|
+
- Review the skill's Gotchas section.
|
|
83
|
+
- Check against recent usage: are there new failure modes not yet captured?
|
|
84
|
+
- Remove gotchas for issues that no longer occur (the skill fixed them).
|
|
85
|
+
- Verify each gotcha is actionable — a reader should know what to avoid and why.
|
|
60
86
|
|
|
61
|
-
|
|
87
|
+
**Output:** Updated gotchas section (and any new gotchas for skill-builder itself).
|
|
62
88
|
|
|
63
89
|
---
|
|
64
90
|
|
|
65
|
-
## Step
|
|
91
|
+
## Step 4: Full Self-Audit
|
|
92
|
+
|
|
93
|
+
**Goal:** Complete a full pass of the 38-point checklist.
|
|
66
94
|
|
|
67
|
-
|
|
95
|
+
Same as new-skill Step 5 and improve-skill Step 5:
|
|
68
96
|
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
- [ ] No time-sensitive claims unless clearly dated
|
|
75
|
-
- [ ] Examples are concrete, not abstract
|
|
76
|
-
- [ ] Frontmatter fields are valid per official docs
|
|
77
|
-
- [ ] One skill = one capability
|
|
78
|
-
- [ ] Consistent terminology throughout
|
|
79
|
-
- [ ] File references are one level deep from SKILL.md
|
|
80
|
-
- [ ] Files over 100 lines have a table of contents
|
|
97
|
+
1. Read `${CLAUDE_SKILL_DIR}/references/self-audit-checklist.md`.
|
|
98
|
+
2. Check every item. Fix failures. Re-check.
|
|
99
|
+
3. All items must be PASS or N/A.
|
|
100
|
+
|
|
101
|
+
**Output:** Completed checklist.
|
|
81
102
|
|
|
82
103
|
---
|
|
83
104
|
|
|
84
|
-
## Step
|
|
105
|
+
## Step 5: Deliver
|
|
106
|
+
|
|
107
|
+
**Goal:** Final summary of the polished skill.
|
|
85
108
|
|
|
86
|
-
Present
|
|
109
|
+
Present to the user:
|
|
87
110
|
|
|
88
|
-
1. **
|
|
89
|
-
2. **
|
|
90
|
-
3. **
|
|
91
|
-
4. **
|
|
92
|
-
5. **
|
|
111
|
+
1. **Description** — final version, confirmed trigger phrases.
|
|
112
|
+
2. **File structure** — folder map with line counts.
|
|
113
|
+
3. **Gotchas** — current gotcha count and most recent additions.
|
|
114
|
+
4. **Audit summary** — "All 38 items: N passed, M N/A."
|
|
115
|
+
5. **Before/after** — description changes if any, structural changes if any.
|
|
116
|
+
6. **Maintenance notes** — what to watch for, when to re-audit.
|
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: structure-prompt
|
|
3
|
+
description: >-
|
|
4
|
+
Restructure any user-provided prompt — order blocks correctly, replace persona
|
|
5
|
+
framing with task constraints, enforce per-category dispositions, replace
|
|
6
|
+
ceremony directives with measurable constraints, expand placeholder tokens
|
|
7
|
+
into real values via the sibling rubric or AskUserQuestion, add file:line
|
|
8
|
+
citations for identifiers that appear in the data body, mark the canonical
|
|
9
|
+
sub-bucket with ⭐, and sharpen generic adversarial-pass phrasing into a
|
|
10
|
+
category-specific failure-mode noun. Trigger when the user invokes
|
|
11
|
+
/structure-prompt, pastes a prompt and asks to optimize it, asks for a
|
|
12
|
+
"minimally invasive edit" to a prompt artifact, or asks to "tighten this
|
|
13
|
+
prompt."
|
|
14
|
+
---
|
|
15
|
+
|
|
16
|
+
# structure-prompt
|
|
17
|
+
|
|
18
|
+
One pass per invocation. Classify each block of the input prompt, apply the matching spoke rules, and emit the rewritten prompt as a single fenced block (paste mode) or rewrite the file in place (file-path mode).
|
|
19
|
+
|
|
20
|
+
## Pre-flight
|
|
21
|
+
|
|
22
|
+
The input prompt arrives as the user's message body, as a fenced block within it, or as a file path argument. Treat the entire input as the artifact under optimization.
|
|
23
|
+
|
|
24
|
+
## First invocation of a session
|
|
25
|
+
|
|
26
|
+
Read [`reference/block-classification.md`](reference/block-classification.md), then [`reference/research.md`](reference/research.md), then [`reference/output-contract.md`](reference/output-contract.md).
|
|
27
|
+
|
|
28
|
+
## Match situation, read spoke
|
|
29
|
+
|
|
30
|
+
| Situation | Read |
|
|
31
|
+
|---|---|
|
|
32
|
+
| Starting any optimization | [`reference/block-classification.md`](reference/block-classification.md) |
|
|
33
|
+
| A spoke needs information that isn't in the input | [`reference/research.md`](reference/research.md) |
|
|
34
|
+
| Input contains a fenced code block, diff, dump, transcript, or single content region ≥ 500 characters, OR blocks appear out of canonical sequence (mission, metadata, framework, questions, output spec, data body) | [`reference/structure.md`](reference/structure.md) |
|
|
35
|
+
| Input opens with a role assignment ("You are…", "Act as…", "Imagine you are…", "As a…", "Pretend to be…", "Role:…") | [`reference/persona.md`](reference/persona.md) |
|
|
36
|
+
| Input names 2+ categories, surfaces, sub-buckets, items, checks, or criteria the agent processes | [`reference/per-category.md`](reference/per-category.md) |
|
|
37
|
+
| Input contains performance directives ("be thorough", "think step by step", "you are an expert", "please", "kindly") | [`reference/directives.md`](reference/directives.md) |
|
|
38
|
+
| Input contains narrative directives ("try to", "look at", "make sure", "consider", "be sure to", "think about") | [`reference/constraints.md`](reference/constraints.md) |
|
|
39
|
+
| Input contains placeholder tokens (`[REPO/ARTIFACT]`, `[INLINE THE FULL ARTIFACT HERE]`, `[N]`, etc.) | [`reference/instantiation.md`](reference/instantiation.md) |
|
|
40
|
+
| Sub-bucket bullets reference identifiers from the data body without `file:line` citations | [`reference/citation-depth.md`](reference/citation-depth.md) |
|
|
41
|
+
| Framework has 5+ sub-buckets and no ⭐ canonical-case marker | [`reference/canonical-case.md`](reference/canonical-case.md) |
|
|
42
|
+
| Output spec contains generic adversarial-pass phrasing ("missed at least N bugs/findings") | [`reference/adversarial-tuning.md`](reference/adversarial-tuning.md) |
|
|
43
|
+
| Input has typos, mixed bullet styles, untagged code blocks, trailing whitespace, blank-line runs, or non-sequential heading levels | [`reference/cleanup.md`](reference/cleanup.md) |
|
|
44
|
+
| Situation doesn't match any spoke above | [`reference/examples.md`](reference/examples.md) |
|
|
45
|
+
| Emitting the rewritten prompt | [`reference/output-contract.md`](reference/output-contract.md) |
|
|
46
|
+
|
|
47
|
+
## Folder map
|
|
48
|
+
|
|
49
|
+
- `SKILL.md` — this hub.
|
|
50
|
+
- `reference/` — rule detail per situation.
|
|
@@ -0,0 +1,62 @@
|
|
|
1
|
+
# Sharpen the adversarial-pass phrasing
|
|
2
|
+
|
|
3
|
+
The output spec usually closes with an adversarial second-pass instruction like *assume your first pass missed at least 3 P1 bugs across these N sub-buckets — find them*. When that phrase uses a generic noun (`bugs`, `findings`, `issues`, `problems`), the skill replaces the noun with one that names the category's specific failure mode.
|
|
4
|
+
|
|
5
|
+
## Detection
|
|
6
|
+
|
|
7
|
+
The fix fires when the output spec contains a phrase matching this shape, with a generic noun:
|
|
8
|
+
|
|
9
|
+
- "missed at least `<number>` [bugs / findings / issues / problems]" — optionally preceded by a severity tier (`P0` or `P1`) when the framework uses tiered findings.
|
|
10
|
+
|
|
11
|
+
A noun is "generic" when it could apply to any audit category. A noun is "specific" when it names the failure mode of the category.
|
|
12
|
+
|
|
13
|
+
## How to derive the specific noun
|
|
14
|
+
|
|
15
|
+
Read the mission line and the framework header. Pull the category's domain from there. Match against this lookup:
|
|
16
|
+
|
|
17
|
+
| Category domain | Specific failure-mode noun |
|
|
18
|
+
|---|---|
|
|
19
|
+
| API contracts (signatures, return types, callback shape) | contract drifts |
|
|
20
|
+
| Selector / query / engine compatibility | engine-version incompatibilities |
|
|
21
|
+
| Resource cleanup (handles, locks, subscriptions) | leaked resources |
|
|
22
|
+
| Scoping and ordering | scope or ordering bugs |
|
|
23
|
+
| Dead code | dead code paths |
|
|
24
|
+
| Silent failures (swallowed exceptions, dropped errors) | silent failures |
|
|
25
|
+
| Bounds and overflow | bounds or overflow bugs |
|
|
26
|
+
| Security boundaries | trust-boundary violations |
|
|
27
|
+
| Concurrency | concurrency hazards |
|
|
28
|
+
| Code rules compliance | rule violations |
|
|
29
|
+
| Codebase conflicts (incomplete propagation) | parallel sites that should have been updated alongside the diff |
|
|
30
|
+
|
|
31
|
+
When the category sits outside this list, derive the noun from the framework's most prominent axis name (e.g., a framework whose axes all name "selectors" → "selector incompatibilities").
|
|
32
|
+
|
|
33
|
+
## Procedure
|
|
34
|
+
|
|
35
|
+
1. Find the adversarial-pass sentence in the output spec.
|
|
36
|
+
2. Identify the generic noun in that sentence.
|
|
37
|
+
3. Replace it with the specific noun from the table or framework.
|
|
38
|
+
4. Keep the rest of the sentence intact: count (e.g., "3"), severity tier (e.g., "P1") when the original phrase carries one, and the closing "find them".
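The procedure above can be sketched as a single regex substitution. This is an illustration under stated assumptions: the `sharpen` function, the `SPECIFIC_NOUNS` keys, and the exact pattern are hypothetical, and only three rows of the lookup table are reproduced. The pattern keeps the count and an optional `P0`/`P1` tier and swaps only a generic noun.

```python
import re

# Subset of the lookup table; the keys are shorthand labels invented for this sketch.
SPECIFIC_NOUNS = {
    "api_contracts": "contract drifts",
    "engine_compatibility": "engine-version incompatibilities",
    "resource_cleanup": "leaked resources",
}

# Group 1 preserves "missed at least <number> [P0|P1] "; group 2 is the generic noun.
ADVERSARIAL = re.compile(
    r"(missed at least \d+ (?:P[01] )?)(bugs|findings|issues|problems)\b"
)


def sharpen(sentence: str, domain: str) -> str:
    """Replace a generic noun with the domain's failure-mode noun; otherwise leave the sentence alone."""
    noun = SPECIFIC_NOUNS[domain]
    return ADVERSARIAL.sub(lambda m: m.group(1) + noun, sentence, count=1)
```

A sentence that already names a specific failure mode does not match the pattern, so it comes back unchanged, which is the preservation behavior the procedure asks for.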
|
|
39
|
+
|
|
40
|
+
## Examples
|
|
41
|
+
|
|
42
|
+
Before (generic):
|
|
43
|
+
> "assume your first pass missed at least 3 P1 bugs across these 7 sub-buckets — find them"
|
|
44
|
+
|
|
45
|
+
After (Category B):
|
|
46
|
+
> "assume your first pass missed at least 3 P1 engine-version incompatibilities across these 7 sub-buckets — find them"
|
|
47
|
+
|
|
48
|
+
After (Category K):
|
|
49
|
+
> "assume your first pass missed at least 3 P1 parallel sites that should have been updated alongside the diff across these 7 sub-buckets — find them"
|
|
50
|
+
|
|
51
|
+
After (Category C):
|
|
52
|
+
> "assume your first pass missed at least 3 P1 leaked resources across these 7 sub-buckets — find them"
|
|
53
|
+
|
|
54
|
+
## What stays put
|
|
55
|
+
|
|
56
|
+
When the adversarial phrase already names a specific failure mode, the noun stays. The skill changes only generic nouns.
|
|
57
|
+
|
|
58
|
+
The count (e.g., 3) and severity tier (e.g., P1) stay intact when the original phrase carries them. Some categories name a noun that doesn't fit the P-tier model — Codebase Conflicts ("parallel sites that should have been updated alongside the diff") is the canonical example — but preservation still applies: if the original phrase includes a tier, the rewritten phrase includes it too. The rule is preservation, not insertion or removal.
|
|
59
|
+
|
|
60
|
+
## Disposition reporting
|
|
61
|
+
|
|
62
|
+
Every outcome emits an action note via the mechanism that [`output-contract.md`](output-contract.md) defines. When the noun was replaced: `> Gap: Adversarial-pass noun sharpened — "bugs" → "<specific noun>".` When the phrase already carries a specific noun: `> Gap: Adversarial-pass noun verified — "<specific noun>" already specific.` Silent pass is forbidden — see the [no silent action](output-contract.md#disposition-invariants) invariant.
|
|
@@ -0,0 +1,27 @@
|
|
|
1
|
+
# Block classification
|
|
2
|
+
|
|
3
|
+
Every input prompt decomposes into six block types. Tag each region of the input as exactly one type before applying any spoke rules.
|
|
4
|
+
|
|
5
|
+
## Block types
|
|
6
|
+
|
|
7
|
+
**Mission block.** One sentence stating what the agent does. The opening directive of the prompt.
|
|
8
|
+
|
|
9
|
+
**Metadata block.** Identifiers, SHAs, PR numbers, target paths, ID prefixes, scope flags, mode toggles. Short atomic facts the agent uses as parameters.
|
|
10
|
+
|
|
11
|
+
**Framework block.** The checklist, sub-bucket list, surface list, category list, or step list the agent processes. Multi-item structures with named entries.
|
|
12
|
+
|
|
13
|
+
**Questions block.** Cross-cutting questions, synthesis questions, or open questions the agent answers after completing the framework.
|
|
14
|
+
|
|
15
|
+
**Output spec block.** The format the agent's output takes — totals header, per-item shape, ordering, severity tags, locator format, length cap, lead phrase, closing phrase.
|
|
16
|
+
|
|
17
|
+
**Data body block.** Any of:
|
|
18
|
+
- Fenced code block (triple backtick) that sits INSIDE the prompt content — not the outer paste-mode fence that wraps the entire prompt artifact
|
|
19
|
+
- Diff, file dump, transcript, log, table, or document inlined as content
|
|
20
|
+
- Any single content region of 500 characters or more that the agent inspects rather than acts on
|
|
21
|
+
|
|
22
|
+
## Tagging procedure
|
|
23
|
+
|
|
24
|
+
1. Read the input prompt top to bottom.
|
|
25
|
+
2. Annotate each region with exactly one tag.
|
|
26
|
+
3. Confirm every content region is either tagged with one of the six block types or part of a gap-report block. Gap-note lines (`> Gap:`) and `<!-- gap-report:` comment blocks from a prior invocation form a passthrough region — preserved in place during classification and reordering, not re-tagged. During emission, the gap-report region is deterministically replaced by the current run's gap notes per [`output-contract.md`](output-contract.md). The gap-report region sits at the end of the prompt and carries no classification tag.
|
|
27
|
+
4. Proceed to the matching spoke.
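Step 3 of the tagging procedure amounts to a coverage check, sketched here under assumptions: the `Block` enum, the `GAP_REPORT` marker, and the list-of-tags representation are hypothetical stand-ins for however regions are actually annotated.

```python
from enum import Enum


class Block(Enum):
    """The six block types, listed in the canonical sequence."""
    MISSION = "mission"
    METADATA = "metadata"
    FRAMEWORK = "framework"
    QUESTIONS = "questions"
    OUTPUT_SPEC = "output spec"
    DATA_BODY = "data body"


# Passthrough marker for gap notes from a prior invocation; it is not a block type.
GAP_REPORT = "gap-report"


def untagged_regions(tags: list) -> list[int]:
    """Return indexes of regions that carry neither a block type nor the gap-report marker."""
    return [
        index
        for index, tag in enumerate(tags)
        if not isinstance(tag, Block) and tag != GAP_REPORT
    ]
```

An empty result means every region is accounted for and the run can proceed to the matching spoke.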
|