@intentsolutionsio/skill-creator 5.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39) hide show
  1. package/.claude-plugin/plugin.json +17 -0
  2. package/README.md +55 -0
  3. package/package.json +38 -0
  4. package/scripts/validate-skill.py +1132 -0
  5. package/skills/agent-creator/SKILL.md +305 -0
  6. package/skills/agent-creator/references/anthropic-agent-spec.md +89 -0
  7. package/skills/skill-creator/SKILL.md +267 -0
  8. package/skills/skill-creator/agents/analyzer.md +279 -0
  9. package/skills/skill-creator/agents/comparator.md +207 -0
  10. package/skills/skill-creator/agents/grader.md +228 -0
  11. package/skills/skill-creator/assets/eval_review.html +146 -0
  12. package/skills/skill-creator/eval-viewer/generate_review.py +471 -0
  13. package/skills/skill-creator/eval-viewer/viewer.html +1325 -0
  14. package/skills/skill-creator/references/advanced-eval-workflow.md +320 -0
  15. package/skills/skill-creator/references/anthropic-comparison.md +93 -0
  16. package/skills/skill-creator/references/ard-template.md +47 -0
  17. package/skills/skill-creator/references/creation-guide.md +305 -0
  18. package/skills/skill-creator/references/errors-template.md +27 -0
  19. package/skills/skill-creator/references/examples-template.md +40 -0
  20. package/skills/skill-creator/references/frontmatter-spec.md +531 -0
  21. package/skills/skill-creator/references/implementation-template.md +42 -0
  22. package/skills/skill-creator/references/output-patterns.md +193 -0
  23. package/skills/skill-creator/references/prd-template.md +55 -0
  24. package/skills/skill-creator/references/schemas.md +430 -0
  25. package/skills/skill-creator/references/source-of-truth.md +658 -0
  26. package/skills/skill-creator/references/validation-rules.md +528 -0
  27. package/skills/skill-creator/references/workflows.md +233 -0
  28. package/skills/skill-creator/scripts/__init__.py +0 -0
  29. package/skills/skill-creator/scripts/aggregate_benchmark.py +401 -0
  30. package/skills/skill-creator/scripts/generate_report.py +326 -0
  31. package/skills/skill-creator/scripts/improve_description.py +247 -0
  32. package/skills/skill-creator/scripts/package_skill.py +136 -0
  33. package/skills/skill-creator/scripts/quick_validate.py +103 -0
  34. package/skills/skill-creator/scripts/run_eval.py +344 -0
  35. package/skills/skill-creator/scripts/run_loop.py +329 -0
  36. package/skills/skill-creator/scripts/utils.py +47 -0
  37. package/skills/skill-creator/scripts/validate-skill.py +87 -0
  38. package/skills/skill-creator/templates/agent-template.md +99 -0
  39. package/skills/skill-creator/templates/skill-template.md +122 -0
@@ -0,0 +1,305 @@
1
+ ---
2
+ name: agent-creator
3
+ description: |
4
+ Create production-grade agent .md files aligned with the Anthropic 2026 spec (16-field schema).
5
+ Also validates existing agents against the marketplace compliance rules. Use when building custom
6
+ subagents, reviewing agent quality, or creating parallel agent architectures for orchestrator skills.
7
+ Trigger with "/agent-creator", "create an agent", "build a subagent", or "validate my agent".
8
+ Make sure to use this skill whenever creating agents/*.md files for plugins or standalone use.
9
+ allowed-tools: "Read,Write,Edit,Glob,Grep,Bash(python:*),AskUserQuestion"
10
+ version: 1.0.0
11
+ author: Jeremy Longshore <jeremy@intentsolutions.io>
12
+ license: MIT
13
+ compatible-with: claude-code, codex, openclaw
14
+ tags: [agent-creation, validation, meta-tooling, subagents]
15
+ model: inherit
16
+ ---
17
+
18
+ # Agent Creator
19
+
20
+ Creates spec-compliant agent .md files following the Anthropic 2026 16-field schema. Supports
21
+ both creation of new agents and validation of existing ones.
22
+
23
+ ## Overview
24
+
25
+ Agent Creator fills the gap between ad-hoc agent files and production-grade agents that pass
26
+ marketplace validation. It enforces the Anthropic agent schema (14 valid fields), prevents
27
+ common mistakes (using `allowed-tools` instead of `disallowedTools`, adding invalid fields like
28
+ `capabilities` or `expertise_level`), and produces agents with substantive body content that
29
+ actually guides Claude's behavior.
30
+
31
+ Key difference from skill-creator: **agents support both `tools` (allowlist) AND `disallowedTools`
32
+ (denylist)**, while skills only use `allowed-tools` (allowlist). Agents also support `effort`,
33
+ `maxTurns`, `skills`, `memory`, `isolation`, `permissionMode`, `background`, `color`, and
34
+ `initialPrompt` — fields that don't exist for skills. The agent body becomes the **system prompt**
35
+ that drives the subagent — it does NOT receive the full Claude Code system prompt.
36
+
37
+ ## Prerequisites
38
+
39
+ - Claude Code CLI with agent support
40
+ - Target directory writable (`agents/` within a plugin or `~/.claude/agents/` for standalone)
41
+ - Familiarity with what the agent should specialize in
42
+
43
+ ## Instructions
44
+
45
+ ### Mode Detection
46
+
47
+ Determine user intent from their prompt:
48
+ - **Create mode**: "create an agent", "build a subagent", "new agent" -> Step 1
49
+ - **Validate mode**: "validate agent", "check agent", "grade agent" -> Validation Workflow
50
+
51
+ ### Step 1: Understand Requirements
52
+
53
+ Ask the user with AskUserQuestion:
54
+
55
+ **Agent Identity:**
56
+ - Name (kebab-case, 1-64 chars, e.g., `risk-assessor`, `clause-analyzer`)
57
+ - Specialty description (20-200 chars — shown in agent selection UI)
58
+
59
+ **Execution Context:**
60
+ - Plugin agent (`plugins/*/agents/`) or standalone (`~/.claude/agents/`)?
61
+ - Will it be spawned by an orchestrator skill via `Task` tool?
62
+ - Does it need to preload specific skills? (`skills: [skill-name]`)
63
+
64
+ **Behavioral Controls:**
65
+ - Model override? (`sonnet` for speed, `opus` for quality, `inherit` for default)
66
+ - Reasoning effort? (`low` for simple, `medium` default, `high` for complex analysis)
67
+ - Max iterations? (`maxTurns` — how many tool-use loops before stopping)
68
+ - Tools to deny? (`disallowedTools` — denylist approach, opposite of skills)
69
+
70
+ **Plugin Restrictions (if plugin agent):**
71
+ - `hooks` — NOT supported in plugin agents (use plugin-level hooks)
72
+ - `mcpServers` — NOT supported in plugin agents
73
+ - `permissionMode` — standalone only, NOT plugin agents
74
+
75
+ ### Step 2: Plan the Agent
76
+
77
+ Before writing, determine:
78
+
79
+ **Agent Role Clarity:**
80
+ The agent body must make three things unambiguous:
81
+ 1. **What it IS responsible for** — its specific domain/methodology
82
+ 2. **What it is NOT responsible for** — boundaries with other agents
83
+ 3. **How it communicates results** — output format and structure
84
+
85
+ **Body Structure Pattern:**
86
+ All production agents should follow this body structure:
87
+
88
+ | Section | Purpose | Required? |
89
+ |---------|---------|-----------|
90
+ | `# Title` | Agent name as heading | Yes |
91
+ | `## Role` | 2-3 sentence domain description with boundaries | Yes |
92
+ | `## Inputs` | Parameters the agent receives when spawned | Yes (if spawned by orchestrator) |
93
+ | `## Process` | Step-by-step methodology (numbered steps with ### headings) | Yes |
94
+ | `## Output Format` | Structured output spec (JSON, markdown, or table) | Yes |
95
+ | `## Guidelines` | Do/don't behavioral rules | Yes |
96
+ | `## When Activated` | Trigger conditions (when spawned or auto-detected) | Recommended |
97
+ | `## Communication Style` | Tone and formatting preferences | Recommended |
98
+ | `## Success Criteria` | What good vs poor output looks like | Recommended |
99
+ | `## Examples` | Concrete interaction examples | For complex agents |
100
+
101
+ **Output Structure Decision:**
102
+ - If the agent feeds into an orchestrator: use **JSON output** (machine-parseable)
103
+ - If the agent is user-facing: use **markdown output** (human-readable)
104
+ - If the agent produces both: JSON primary with markdown summary
105
+
106
+ ### Step 3: Write the Agent File
107
+
108
+ Generate the agent .md using the template from
109
+ `${CLAUDE_SKILL_DIR}/../skill-creator/templates/agent-template.md`.
110
+
111
+ **Frontmatter Rules (Anthropic 16-field schema):**
112
+
113
+ See [Anthropic Agent Spec](references/anthropic-agent-spec.md) for the full official reference.
114
+
115
+ Required fields:
116
+ ```yaml
117
+ name: {agent-name} # Lowercase letters and hyphens, unique identifier
118
+ description: "{specialty}" # When Claude should delegate to this subagent
119
+ ```
120
+
121
+ Optional fields (include only what's needed):
122
+ ```yaml
123
+ tools: "Read, Glob, Grep" # Allowlist — inherits all tools if omitted
124
+ disallowedTools: "Write" # Denylist — removed from inherited/specified list
125
+ model: sonnet # sonnet|haiku|opus|inherit|full model ID
126
+ effort: medium # low|medium|high|max (max = Opus 4.6 only)
127
+ maxTurns: 15 # Max agentic turns before stopping
128
+ skills: [skill-name] # Skills to inject at startup (full content loaded)
129
+ memory: project # user|project|local — persistent cross-session
130
+ background: false # Always run as background task
131
+ isolation: worktree # Run in temporary git worktree
132
+ color: blue # Display: red|blue|green|yellow|purple|orange|pink|cyan
133
+ initialPrompt: "..." # Auto-submitted first turn (--agent mode only)
134
+ permissionMode: default # Standalone only, NOT plugin agents
135
+ hooks: {} # Standalone only, NOT plugin agents
136
+ mcpServers: {} # Standalone only, NOT plugin agents
137
+ ```
138
+
139
+ **Tool access:**
140
+ - `tools` = allowlist (like skills' `allowed-tools`)
141
+ - `disallowedTools` = denylist (remove specific tools)
142
+ - If both set: disallowed applied first, then tools resolved
143
+ - If neither set: inherits all tools from parent conversation
144
+
145
+ **Invalid fields (ERROR — never use these):**
146
+ - `capabilities` — looks valid but flagged by validator
147
+ - `expertise_level` — invented, not in Anthropic spec
148
+ - `activation_priority` — invented, not in Anthropic spec
149
+ - `activation_triggers`, `type`, `category` — not in spec
150
+ - `allowed-tools` — that's the skill-only syntax; agents use `tools` or `disallowedTools`
151
+
152
+ **Body Content Guidelines:**
153
+
154
+ 1. **Role section must set boundaries.** Don't just say what the agent does — say what it
155
+ does NOT do. Example: "You analyze contract clauses for risk. You do NOT provide legal
156
+ advice or make recommendations — that is the recommendations agent's responsibility."
157
+
158
+ 2. **Process steps must be concrete.** Each step should tell Claude exactly what to do,
159
+ not vaguely gesture at an activity. Bad: "Analyze the document." Good: "Read the full
160
+ contract. For each clause, extract: (a) the exact text, (b) the clause category from
161
+ the taxonomy below, (c) a plain English summary in one sentence."
162
+
163
+ 3. **Output format must be machine-parseable if feeding an orchestrator.** Use JSON with
164
+ a concrete schema example. Include field descriptions so Claude knows what each field means.
165
+
166
+ 4. **Guidelines should include both DO and DON'T rules.** Example:
167
+ - DO: "Be specific — quote exact clause text, don't paraphrase"
168
+ - DON'T: "Don't make legal recommendations — only identify and score risks"
169
+
170
+ 5. **Keep under 300 lines** (agent body limit — prevents context bloat in subagent window).
171
+ If the agent needs extensive reference material, create a companion skill with
172
+ `references/` directory and preload it via the `skills` field.
173
+
174
+ ### Step 4: Validate the Agent
175
+
176
+ Run validation against the Anthropic 16-field schema:
177
+
178
+ **Manual checklist:**
179
+
180
+ | Check | Rule |
181
+ |-------|------|
182
+ | `name` present | 1-64 chars, kebab-case |
183
+ | `description` present | 20-200 chars |
184
+ | No invalid fields | None of: capabilities, expertise_level, activation_priority, type, category |
185
+ | No skill-only fields | No `allowed-tools` (use `disallowedTools` instead) |
186
+ | Plugin restrictions | No hooks/mcpServers/permissionMode if plugin agent |
187
+ | Body has Role section | Clear domain + boundaries |
188
+ | Body has Process section | Numbered steps |
189
+ | Body has Output Format | Concrete schema example |
190
+ | Body has Guidelines | Do/don't rules |
191
+ | Body under 300 lines | Offload to references if longer (prevents context bloat) |
192
+
193
+ **Automated validation:**
194
+ ```bash
195
+ python3 ${CLAUDE_SKILL_DIR}/../skill-creator/scripts/validate-skill.py --agents-only {plugin-dir}/
196
+ ```
197
+
198
+ ### Step 5: Test the Agent
199
+
200
+ Test the agent by spawning it via the `Task` tool or the `Agent` tool:
201
+
202
+ 1. Write a test prompt that exercises the agent's core capability
203
+ 2. Spawn the agent with that prompt
204
+ 3. Check: Does the output match the declared Output Format?
205
+ 4. Check: Does the agent stay within its declared Role boundaries?
206
+ 5. Check: Does it follow the Process steps?
207
+ 6. Iterate on the body content if the agent strays
208
+
209
+ ### Step 6: Report
210
+
211
+ Provide a summary:
212
+ - Agent name and file path
213
+ - Frontmatter field count (of 14 possible)
214
+ - Body line count
215
+ - Sections present
216
+ - Validation result (pass/fail with specific issues)
217
+ - Test result summary
218
+
219
+ ## Validation Workflow
220
+
221
+ When the user wants to validate an existing agent:
222
+
223
+ 1. Locate the agent .md file
224
+ 2. Parse YAML frontmatter
225
+ 3. Check against the 16-field Anthropic schema:
226
+ - `name` present and valid (1-64 chars, kebab-case)?
227
+ - `description` present and valid (20-200 chars)?
228
+ - Any invalid fields? (capabilities, expertise_level, activation_priority, etc.)
229
+ - Any skill-only fields? (allowed-tools)
230
+ - Plugin restrictions respected?
231
+ 4. Check body content:
232
+ - Has `## Role` section?
233
+ - Has `## Process` section with numbered steps?
234
+ - Has `## Output Format` with concrete example?
235
+ - Has `## Guidelines`?
236
+ - Under 300 lines? (agent body limit)
237
+ 5. Report findings with severity (ERROR/WARNING/INFO)
238
+ 6. Suggest specific fixes for each issue
239
+
240
+ ## Output
241
+
242
+ - **Create mode**: A complete agent .md file with valid frontmatter and substantive body,
243
+ plus a creation report with validation status.
244
+ - **Validate mode**: A compliance report listing errors, warnings, and info items with
245
+ specific fix recommendations for each.
246
+
247
+ ## Examples
248
+
249
+ ### Subagent for Orchestrator Skill
250
+
251
+ **Input:** "Create a risk assessment agent that scores contract clauses"
252
+
253
+ **Output:** `agents/risk-assessor.md` with frontmatter:
254
+
255
+ ```yaml
256
+ name: risk-assessor
257
+ description: "Score contract clauses for legal and financial risk on a 1-10 scale"
258
+ model: sonnet
259
+ effort: high
260
+ maxTurns: 10
261
+ ```
262
+
263
+ Body sections: Role (risk scoring specialist, does NOT make recommendations), Inputs
264
+ (contract_text, contract_type, output_path), Process (4 steps: read, categorize, score,
265
+ aggregate), Output Format (JSON with clause scores and risk matrix), Guidelines (be specific,
266
+ cite clause text, use 4-factor scoring methodology).
267
+
268
+ ### Standalone User-Facing Agent
269
+
270
+ **Input:** "Create a code review agent"
271
+
272
+ **Output:** `~/.claude/agents/code-reviewer.md` with frontmatter:
273
+
274
+ ```yaml
275
+ name: code-reviewer
276
+ description: "Review code for bugs, performance issues, and security vulnerabilities"
277
+ effort: high
278
+ ```
279
+
280
+ Body sections: Role (code quality specialist), Process (read code, check patterns, identify
281
+ issues, suggest fixes), Output Format (markdown with severity-rated findings), Guidelines
282
+ (cite line numbers, explain why not just what), Communication Style (direct, educational,
283
+ actionable).
284
+
285
+ ## Error Handling
286
+
287
+ | Error | Cause | Resolution |
288
+ |-------|-------|------------|
289
+ | `allowed-tools` in agent | Used skill-only field | Replace with `disallowedTools` (denylist) or remove |
290
+ | `capabilities` field | Common mistake — looks valid but isn't in Anthropic spec | Remove field entirely |
291
+ | `expertise_level` field | Invented field from community templates | Remove — express expertise in body content |
292
+ | Description > 200 chars | Exceeds Anthropic limit | Shorten to 20-200 char range |
293
+ | Description < 20 chars | Below minimum | Expand to describe agent's specific specialty |
294
+ | `permissionMode` in plugin agent | Standalone-only field used in plugin context | Remove — only valid in `~/.claude/agents/` |
295
+ | `hooks` in plugin agent | Plugin agents can't have hooks | Move to plugin-level `hooks/hooks.json` |
296
+ | Body has no Process section | Agent lacks step-by-step methodology | Add numbered steps under `## Process` |
297
+ | Body over 300 lines | Too long for agent context | Extract reference material to companion skill |
298
+
299
+ ## Resources
300
+
301
+ - [Anthropic Agent Spec](references/anthropic-agent-spec.md) — Official 16-field schema from code.claude.com/docs/en/sub-agents
302
+ - [Agent template](${CLAUDE_SKILL_DIR}/../skill-creator/templates/agent-template.md) — Skeleton with placeholders
303
+ - [Frontmatter spec](${CLAUDE_SKILL_DIR}/../skill-creator/references/frontmatter-spec.md) — Field reference (internal)
304
+ - [Source of truth](${CLAUDE_SKILL_DIR}/../skill-creator/references/source-of-truth.md) — Canonical spec
305
+ - [Validation rules](${CLAUDE_SKILL_DIR}/../skill-creator/references/validation-rules.md) — Agent validation section
@@ -0,0 +1,89 @@
1
+ # Anthropic Agent Spec — Official Reference
2
+
3
+ Source: https://code.claude.com/docs/en/sub-agents (fetched 2026-04-05)
4
+
5
+ ## Supported Frontmatter Fields
6
+
7
+ Only `name` and `description` are required.
8
+
9
+ | Field | Required | Description |
10
+ |:------|:---------|:------------|
11
+ | `name` | Yes | Unique identifier using lowercase letters and hyphens |
12
+ | `description` | Yes | When Claude should delegate to this subagent |
13
+ | `tools` | No | Tools the subagent can use (allowlist). Inherits all tools if omitted |
14
+ | `disallowedTools` | No | Tools to deny, removed from inherited or specified list |
15
+ | `model` | No | `sonnet`, `opus`, `haiku`, a full model ID, or `inherit`. Defaults to `inherit` |
16
+ | `permissionMode` | No | `default`, `acceptEdits`, `auto`, `dontAsk`, `bypassPermissions`, or `plan` |
17
+ | `maxTurns` | No | Maximum number of agentic turns before the subagent stops |
18
+ | `skills` | No | Skills to load into the subagent's context at startup (full content injected) |
19
+ | `mcpServers` | No | MCP servers available to this subagent (inline definition or string reference) |
20
+ | `hooks` | No | Lifecycle hooks scoped to this subagent |
21
+ | `memory` | No | Persistent memory scope: `user`, `project`, or `local` |
22
+ | `background` | No | Set to `true` to always run as background task. Default: `false` |
23
+ | `effort` | No | `low`, `medium`, `high`, `max` (Opus 4.6 only). Default: inherits from session |
24
+ | `isolation` | No | `worktree` — run in temporary git worktree |
25
+ | `color` | No | Display color: `red`, `blue`, `green`, `yellow`, `purple`, `orange`, `pink`, `cyan` |
26
+ | `initialPrompt` | No | Auto-submitted as first user turn when running as main agent via `--agent` |
27
+
28
+ Total: 16 official fields.
29
+
30
+ ## Key Facts
31
+
32
+ - The **body** (markdown after frontmatter) becomes the **system prompt** that guides the subagent
33
+ - Subagents receive ONLY the system prompt + basic environment details, NOT the full Claude Code system prompt
34
+ - `tools` is an **allowlist** (like skills' `allowed-tools`)
35
+ - `disallowedTools` is a **denylist** — if both set, disallowed applied first, then tools resolved
36
+ - Subagents **cannot spawn other subagents** (no nesting)
37
+ - Subagents **don't inherit skills** from parent conversation — must list explicitly via `skills` field
38
+
39
+ ## Plugin Agent Restrictions
40
+
41
+ Plugin agents (`plugins/*/agents/*.md`) do NOT support:
42
+ - `hooks` — ignored when loading from plugin
43
+ - `mcpServers` — ignored when loading from plugin
44
+ - `permissionMode` — ignored when loading from plugin
45
+
46
+ These are standalone-only features. If needed, copy the agent to `.claude/agents/` or `~/.claude/agents/`.
47
+
48
+ ## Tool Scoping
49
+
50
+ Subagents can use `Agent(type)` syntax to restrict which subagents they can spawn (only for main-thread agents via `--agent`).
51
+
52
+ ## Skills Preloading
53
+
54
+ ```yaml
55
+ skills:
56
+ - api-conventions
57
+ - error-handling-patterns
58
+ ```
59
+
60
+ Full content of each skill is injected at startup. This is the inverse of `context: fork` in skills.
61
+
62
+ ## Model Resolution Order
63
+
64
+ 1. `CLAUDE_CODE_SUBAGENT_MODEL` env var
65
+ 2. Per-invocation `model` parameter
66
+ 3. Subagent definition's `model` frontmatter
67
+ 4. Main conversation's model
68
+
69
+ ## Example Agent File
70
+
71
+ ```markdown
72
+ ---
73
+ name: code-reviewer
74
+ description: Reviews code for quality and best practices
75
+ tools: Read, Glob, Grep
76
+ model: sonnet
77
+ ---
78
+
79
+ You are a code reviewer. When invoked, analyze the code and provide
80
+ specific, actionable feedback on quality, security, and best practices.
81
+ ```
82
+
83
+ ## Scope Priority (highest to lowest)
84
+
85
+ 1. Managed settings (organization-wide)
86
+ 2. `--agents` CLI flag (current session only)
87
+ 3. `.claude/agents/` (project)
88
+ 4. `~/.claude/agents/` (personal)
89
+ 5. Plugin `agents/` directory (where plugin enabled)
@@ -0,0 +1,267 @@
1
+ ---
2
+ name: skill-creator
3
+ description: |
4
+ Create production-grade agent skills aligned with the 2026 AgentSkills.io spec and Anthropic
5
+ best practices (2026). Also validates existing skills against the Intent Solutions 100-point rubric.
6
+ Use when building, testing, validating, or optimizing Claude Code skills.
7
+ Trigger with "/skill-creator", "create a skill", "validate my skill", or "check skill quality".
8
+ Make sure to use this skill whenever creating a new skill, slash command, or agent capability.
9
+ allowed-tools: "Read,Write,Edit,Glob,Grep,Bash(mkdir:*),Bash(chmod:*),Bash(python:*),Bash(claude:*),Task,AskUserQuestion"
10
+ version: 5.1.0
11
+ author: Jeremy Longshore <jeremy@intentsolutions.io>
12
+ license: MIT
13
+ compatible-with: claude-code, codex, openclaw
14
+ tags: [skill-creation, validation, meta-tooling]
15
+ model: inherit
16
+ ---
17
+
18
+ # Skill Creator
19
+
20
+ Creates complete, spec-compliant skill packages following AgentSkills.io and Anthropic standards.
21
+ Supports both creation and validation workflows with 100-point marketplace grading.
22
+
23
+ ## Overview
24
+
25
+ Skill Creator solves the gap between writing ad-hoc agent skills and producing marketplace-ready
26
+ packages that score well on the Intent Solutions 100-point rubric. It enforces the 2026 spec
27
+ (top-level identity fields, `${CLAUDE_SKILL_DIR}` paths, scored sections) and catches
28
+ contradictions that would cost marketplace points. Supports two modes: create new skills from
29
+ scratch with full validation, or grade/audit existing skills with actionable fix suggestions.
30
+
31
+ ## Prerequisites
32
+
33
+ - Claude Code CLI with skill support (v2.1.78+ for advanced features like `effort`, `maxTurns`)
34
+ - Python 3.10+ for validation scripts (`validate-skill.py`, `aggregate_benchmark.py`)
35
+ - Target skill directory writable (`~/.claude/skills/` or `.claude/skills/`)
36
+
37
+ ## Instructions
38
+
39
+ ### Mode Detection
40
+
41
+ Determine user intent from their prompt:
42
+ - **Create mode**: "create a skill", "build a skill", "new skill" -> proceed to Step 1
43
+ - **Validate mode**: "validate", "check", "grade", "score", "audit" -> jump to Validation Workflow
44
+
45
+ ### Communicating with the User
46
+
47
+ Pay attention to context cues to understand the user's technical level. Skill creator is used by people across a wide range of familiarity — from first-time coders to senior engineers. In the default case:
48
+ - "evaluation" and "benchmark" are borderline but OK
49
+ - For "JSON" and "assertion", check for cues the user knows these terms before using them without explanation
50
+ - Briefly explain terms if in doubt
51
+
52
+ ### Step 1: Understand Requirements
53
+
54
+ If the current conversation already contains a workflow the user wants to capture (e.g., "turn this into a skill"), extract answers from the conversation history first — the tools used, the sequence of steps, corrections the user made, input/output formats observed. Confirm with the user before proceeding.
55
+
56
+ Ask the user with AskUserQuestion:
57
+
58
+ **Skill Identity:**
59
+ - Name (kebab-case, gerund preferred: `processing-pdfs`, `analyzing-data`)
60
+ - Purpose (1-2 sentences: what it does + when to use it)
61
+
62
+ **Execution Model:**
63
+ - User-invocable via `/name`? Or background knowledge only?
64
+ - Accepts arguments? (`$ARGUMENTS` substitution)
65
+ - Needs isolated context? (`context: fork` for subagent execution)
66
+ - Explicit-only invocation? (`disable-model-invocation: true` — prevents auto-activation, requires `/name`)
67
+
68
+ **Required Tools:**
69
+ - Read, Write, Edit, Glob, Grep, WebFetch, WebSearch, Task, AskUserQuestion, Skill
70
+ - Bash must be scoped: `Bash(git:*)`, `Bash(npm:*)`, etc.
71
+ - MCP tools: `ServerName:tool_name`
72
+
73
+ **Complexity:**
74
+ - Simple (SKILL.md only)
75
+ - With scripts (automation code in `scripts/`)
76
+ - With references (documentation in `references/`)
77
+ - With templates (boilerplate in `templates/`)
78
+ - Full package (all directories)
79
+
80
+ **Location:**
81
+ - Global: `~/.claude/skills/<skill-name>/`
82
+ - Project: `.claude/skills/<skill-name>/`
83
+
84
+ ### Step 2: Plan the Skill
85
+
86
+ Before writing, determine:
87
+
88
+ **Degrees of Freedom:**
89
+ | Level | When to Use |
90
+ |-------|-------------|
91
+ | High | Creative/open-ended tasks (analysis, writing) |
92
+ | Medium | Defined workflow, flexible content (most skills) |
93
+ | Low | Strict output format (compliance, API calls, configs) |
94
+
95
+ Think of it as **narrow bridge vs open field**: a deployment skill is a narrow bridge (one safe path, guard rails everywhere), while a writing skill is an open field (Claude roams freely within broad boundaries). Match constraint level to the task.
96
+
97
+ **Workflow Pattern** (see `${CLAUDE_SKILL_DIR}/references/workflows.md`):
98
+ - Sequential: fixed steps in order
99
+ - Conditional: branch based on input
100
+ - Wizard: interactive multi-step gathering
101
+ - Plan-Validate-Execute: verifiable intermediates
102
+ - Feedback Loop: iterate until quality met
103
+ - Checklist Workflow: copy-pasteable progress tracking for complex multi-step processes
104
+ - Search-Analyze-Report: explore and summarize
105
+
106
+ **Output Pattern** (see `${CLAUDE_SKILL_DIR}/references/output-patterns.md`):
107
+ - Strict template (exact format)
108
+ - Flexible template (structure with creative content)
109
+ - Examples-driven (input/output pairs)
110
+ - Visual (HTML generation)
111
+ - Structured data (JSON/YAML)
112
+
113
+ ### Step 3: Initialize Structure
114
+
115
+ Create the skill directory and files:
116
+
117
+ ```bash
118
+ mkdir -p {location}/{skill-name}
119
+ mkdir -p {location}/{skill-name}/scripts # if needed
120
+ mkdir -p {location}/{skill-name}/references # if needed
121
+ mkdir -p {location}/{skill-name}/templates # if needed
122
+ mkdir -p {location}/{skill-name}/assets # if needed
123
+ mkdir -p {location}/{skill-name}/evals # for eval-driven development
124
+ ```
125
+
126
+ ### Steps 4-10: Write, Validate, Test, Iterate, Optimize, Report
127
+
128
+ For detailed guidance on writing SKILL.md (frontmatter rules, description scoring, body guidelines, string substitutions, DCI syntax), creating supporting files, validation, testing, iteration, description optimization, and final reporting, see [Creation Guide](references/creation-guide.md).
129
+
130
+ Key rules:
131
+ - `version`, `author`, `license`, `tags`, `compatible-with` are TOP-LEVEL fields (not nested under `metadata:`)
132
+ - Scope Bash: `Bash(git:*)` not bare `Bash`
133
+ - Keep under 500 lines; offload to `references/` if longer
134
+ - Include "Use when" and "Trigger with" in description for enterprise scoring
135
+ - No XML tags in name or description (Anthropic spec prohibition)
136
+ - No time-sensitive information; use 'old patterns' section for deprecated approaches
137
+ - Include feedback loops for quality-critical workflows
138
+ - Run `python3 ${CLAUDE_SKILL_DIR}/scripts/validate-skill.py --grade {skill-dir}/SKILL.md` to validate
139
+ - Create `evals/evals.json` with 3+ scenarios, iterate until all assertions pass
140
+
141
+ ## Validation Workflow
142
+
143
+ When the user wants to validate, grade, or audit an existing skill. For detailed steps (V1-V5), see [Creation Guide](references/creation-guide.md).
144
+
145
+ 1. Locate the SKILL.md (global `~/.claude/skills/` or project `.claude/skills/`)
146
+ 2. Run `python3 ${CLAUDE_SKILL_DIR}/scripts/validate-skill.py --grade {path}/SKILL.md`
147
+ 3. Review grade against the 100-point rubric (A: 90+, B: 80-89, C: 70-79, D: 60-69, F: <60)
148
+ 4. Report results with prioritized fix recommendations
149
+ 5. Auto-fix if requested: add missing sections, fix description patterns, move nested metadata to top-level
150
+
151
+ ## Output
152
+
153
+ The skill produces one of two outputs depending on mode:
154
+
155
+ - **Create mode**: A complete skill package directory containing SKILL.md, optional `scripts/`, `references/`, `templates/`, `assets/`, and `evals/` subdirectories, plus a creation summary report with validation grade and eval results.
156
+ - **Validate mode**: A grade report showing the 100-point rubric score across 5 pillars (Progressive Disclosure, Ease of Use, Utility, Spec Compliance, Writing Style), with prioritized fix recommendations sorted by point value.
157
+
158
+ ## Examples
159
+
160
+ ### Simple Skill (Create Mode)
161
+
162
+ ```
163
+ User: Create a skill called "code-review" that reviews code quality
164
+
165
+ Creates:
166
+ ~/.claude/skills/code-review/
167
+ ├── SKILL.md
168
+ └── evals/
169
+ └── evals.json
170
+
171
+ Frontmatter:
172
+ ---
173
+ name: code-review
174
+ description: |
175
+ Make sure to use this skill whenever reviewing code for quality, security
176
+ vulnerabilities, and best practices. Use when doing code reviews, PR analysis,
177
+ or checking code quality. Trigger with "/code-review" or "review this code".
178
+ allowed-tools: "Read,Glob,Grep"
179
+ version: 1.0.0
180
+ author: Jeremy Longshore <jeremy@intentsolutions.io>
181
+ license: MIT
182
+ model: inherit
183
+ ---
184
+ ```
185
+
186
+ ### Full Package with Arguments (Create Mode)
187
+
188
+ ```
189
+ User: Create a skill that generates release notes from git history
190
+
191
+ Creates:
192
+ ~/.claude/skills/generating-release-notes/
193
+ ├── SKILL.md (argument-hint: "[version-tag]")
194
+ ├── scripts/
195
+ │ └── parse-commits.py
196
+ ├── references/
197
+ │ └── commit-conventions.md
198
+ ├── templates/
199
+ │ └── release-template.md
200
+ └── evals/
201
+ └── evals.json
202
+
203
+ Uses $ARGUMENTS[0] for version tag.
204
+ Uses context: fork for isolated execution.
205
+ ```
206
+
207
+ ### Validate Mode
208
+
209
+ ```
210
+ User: Grade my skill at ~/.claude/skills/code-review/SKILL.md
211
+
212
+ Runs: python3 ${CLAUDE_SKILL_DIR}/scripts/validate-skill.py --grade ~/.claude/skills/code-review/SKILL.md
213
+
214
+ Output:
215
+ Grade: B (84/100)
216
+ Improvements:
217
+ - Add "Trigger with" to description (+3 pts)
218
+ - Add ## Output section (+2 pts)
219
+ - Add ## Prerequisites section (+2 pts)
220
+ ```
221
+
222
+ ## Edge Cases
223
+
224
+ - **Name conflicts**: Check if skill directory already exists before creating
225
+ - **Empty arguments**: If skill uses `$ARGUMENTS`, handle the empty case
226
+ - **Long content**: If SKILL.md exceeds 300 lines during writing, stop and split to references
227
+ - **Bash scoping**: If user requests raw `Bash`, always scope it
228
+ - **Model selection**: Default to `inherit`, only override with good reason
229
+ - **Undertriggering**: If skill isn't activating, make description more aggressive/pushy
230
+ - **Legacy metadata nesting**: If found, move author/version/license to top-level
231
+
232
+ ## Error Handling
233
+
234
+ | Error | Cause | Solution |
235
+ |-------|-------|----------|
236
+ | Name exists | Directory already present | Choose different name or confirm overwrite |
237
+ | Invalid name | Not kebab-case or >64 chars | Fix to lowercase-with-hyphens |
238
+ | Validation fails | Missing fields or anti-patterns | Run validator, fix reported issues |
239
+ | Resource missing | `${CLAUDE_SKILL_DIR}/` ref points to nonexistent file | Create the file or fix the reference |
240
+ | Undertriggering | Description too passive | Add "Make sure to use whenever..." phrasing |
241
+ | Eval failures | Skill not producing expected output | Iterate on instructions and re-test |
242
+ | Low grade | Missing scored sections or fields | Add Overview, Prerequisites, Output sections |
243
+
244
+ ## Resources
245
+
246
+ **References:** `${CLAUDE_SKILL_DIR}/references/`
247
+ - `creation-guide.md` — Detailed Steps 4-10 and Validation Workflow (V1-V5)
248
+ - `source-of-truth.md` — Canonical spec ([AgentSkills.io](https://agentskills.io/specification), [Anthropic docs](https://code.claude.com/docs/en/skills), [Lee Han Chung deep dive](https://leehanchung.github.io/blogs/2025/10/26/claude-skills-deep-dive/)) | `frontmatter-spec.md` — Field reference | `validation-rules.md` — 100-point rubric
249
+ - `workflows.md` — Workflow patterns | `output-patterns.md` — Output formats | `schemas.md` — JSON schemas (evals, grading, benchmarks)
250
+ - `anthropic-comparison.md` — Gap analysis | `advanced-eval-workflow.md` — Eval, iteration, optimization, platform notes
251
+
252
+ **Agents** (read when spawning subagents): `${CLAUDE_SKILL_DIR}/agents/`
253
+ - `grader.md` — Assertion evaluation | `comparator.md` — Blind A/B comparison | `analyzer.md` — Benchmark analysis
254
+
255
+ **Scripts:** `${CLAUDE_SKILL_DIR}/scripts/`
256
+ - `validate-skill.py` — 100-point rubric grading | `quick_validate.py` — Lightweight validation
257
+ - `aggregate_benchmark.py` — Benchmark stats | `run_eval.py` — Trigger accuracy testing
258
+ - `run_loop.py` — Description optimization loop | `improve_description.py` — LLM-powered rewriting
259
+ - `generate_report.py` — HTML reports | `package_skill.py` — .skill packaging | `utils.py` — Shared utilities
260
+
261
+ **Eval Viewer:** `${CLAUDE_SKILL_DIR}/eval-viewer/` — `generate_review.py` + `viewer.html` (interactive output comparison)
262
+ **Assets:** `${CLAUDE_SKILL_DIR}/assets/eval_review.html` (trigger eval set editor)
263
+ **Templates:** `${CLAUDE_SKILL_DIR}/templates/skill-template.md` (SKILL.md skeleton)
264
+
265
+ ---
266
+
267
+ For advanced workflows (empirical eval, description optimization, blind comparison, packaging, platform notes), see [Creation Guide](references/creation-guide.md) and `${CLAUDE_SKILL_DIR}/references/advanced-eval-workflow.md`.