claude-dev-env 1.4.0 → 1.7.0

package/README.md DELETED
@@ -1,247 +0,0 @@
- # claude-code-config
-
- Consistent development standards for Claude Code across every repo. Install once, get TDD enforcement, code quality hooks, specialized agents, and battle-tested rules everywhere.
-
- ## Quick Start
-
- ### Prerequisites
-
- - **Node.js 18+** (includes `npx`)
- - **Python 3.8+** (for hook scripts)
- - **Claude Code CLI** installed and working
-
- ### Install
-
- ```bash
- npx claude-dev-env
- ```
-
- That's it. The installer will:
-
- 1. Detect your Python 3 command (`python3`, `python`, or `py -3`)
- 2. Copy 13 rules, 4 docs, 34 agents, 11 commands, and 14 skills to `~/.claude/`
- 3. Copy 90+ hook scripts to `~/.claude/hooks/`
- 4. Merge 31 hook groups into `~/.claude/settings.json` (preserves your existing hooks)
- 5. Write a manifest to `~/.claude/.claude-dev-env-manifest.json` for clean uninstall
-
- ### Verify
-
- Start a new Claude Code session. You should see hook activity on your first prompt (code-rules-reminder, hook-structure-context). Run any slash command like `/commit` or `/readability-review` to confirm commands loaded.
-
- ### Update
-
- Run the same command again. It overwrites existing files and updates hook entries in place:
-
- ```bash
- npx claude-dev-env
- ```
-
- ### Uninstall
-
- Removes only the files this package installed (tracked via manifest) and cleans hook entries from `settings.json`:
-
- ```bash
- npx claude-dev-env --uninstall
- ```
-
- ## What This Solves
-
- Without shared config, every repo needs its own `.claude/rules/`, `.claude/hooks/`, `.claude/agents/`, etc. That means:
-
- - Duplicated config across 5+ repos
- - Drift when you update standards in one place but forget others
- - New repos start with zero guardrails
-
- This package centralizes all general-purpose Claude Code config. Project-specific rules still live in each repo's `.claude/` directory and merge with these.
-
- ## What's Included
-
- ### Rules (13)
-
- Behavioral rules loaded into every session. These shape how Claude approaches work before any code is written.
-
- | Rule | What it does |
- |------|-------------|
- | `tdd` | Red-green-refactor is non-negotiable |
- | `code-standards` | References CODE_RULES.md for all code generation |
- | `conservative-action` | Research first, act only when explicitly asked |
- | `right-sized-engineering` | Simple > clever, functions > classes, concrete > abstract |
- | `explore-thoroughly` | Read before proposing, map patterns before committing |
- | `research-mode` | Anti-hallucination: cite sources, say "I don't know", use direct quotes |
- | `parallel-tools` | Independent tool calls run simultaneously |
- | `agent-spawn-protocol` | Context sufficiency check before delegating to agents |
- | `git-workflow` | Draft PRs, one commit per review stage, stacked PR patterns |
- | `code-reviews` | Systematic PR review response protocol |
- | `testing` | Complete mocks, reference TEST_QUALITY.md |
- | `context7` | Fetch current docs via Context7 MCP instead of relying on training data |
- | `cleanup-temp-files` | Remove scratch files after tasks complete |
-
- ### Docs (4)
-
- Reference documents that rules and agents point to for detailed standards.
-
- | Document | Coverage |
- |----------|----------|
- | `CODE_RULES.md` | Hook-enforced rules, naming conventions, config patterns, type hints, readability rubric |
- | `TEST_QUALITY.md` | Test writing standards, mock completeness, assertion patterns |
- | `REACT_PATTERNS.md` | Component architecture, hooks, state management conventions |
- | `DJANGO_PATTERNS.md` | Model patterns, view architecture, ORM best practices |
-
- ### Agents (34)
-
- Specialized agent prompts for common development tasks. Claude Code automatically discovers these and makes them available for delegation.
-
- **Code Quality:** clean-coder, code-quality-agent, code-standards-agent, readability-review-agent, refactoring-specialist, right-sized-engineer
-
- **Testing:** tdd-test-writer, test-data-builder, validation-expert
-
- **Planning:** plan-executor, parallel-workflow-coordinator, mandatory-agent-workflow-agent, stub-detector-agent
-
- **Documentation:** docs-agent, doc-orchestrator, user-docs-writer, project-docs-analyzer
-
- **Configuration:** config-extraction-agent, config-centralizer, magic-value-eliminator-agent, project-structure-organizer-agent
-
- **Tooling:** agent-writer, skill-writer-agent, skill-to-agent-converter, tooling-builder
-
- **Git:** git-commit-crafter, pr-description-writer, session-continuity-manager
-
- **File Formats:** docx-agent, pdf-agent, xlsx-agent
-
- **Other:** clasp-deployment-orchestrator, workflow-visual-documenter, project-context-loader
-
- ### Commands (11)
-
- Slash commands for common workflows.
-
- | Command | Purpose |
- |---------|---------|
- | `/commit` | Structured git commit with conventional format |
- | `/plan` | Create implementation plans with config search |
- | `/implement` | Execute plans with TDD workflow |
- | `/review-plan` | Review and critique implementation plans |
- | `/readability-review` | 8-dimension readability scoring |
- | `/right-size` | Check for over/under-engineering |
- | `/stubcheck` | Find stubs, TODOs, and NotImplementedError |
- | `/pr-comments` | Process PR review comments systematically |
- | `/docupdate` | Update documentation after changes |
- | `/initialize` | Session initialization with protocol review |
- | `/sum` | Summarize current work context |
-
- ### Skills (14)
-
- | Skill | Purpose |
- |-------|---------|
- | `prompt-generator` | Write, refine, and structure prompts for Claude with emotion-informed framing |
- | `agent-prompt` | Craft structured agent prompts and spawn background agents after approval |
- | `tdd-team` | Orchestrate a 4-agent TDD team (planner, tester, implementer, validator) |
- | `pr-review-responder` | Systematic PR review response: fetch comments, checklist, fix, reply, commit |
- | `anthropic-plan` | Readonly codebase exploration before code changes, produces a plan file |
- | `readability-review` | 8-dimension readability scoring (160 pts) with automatic fixes |
- | `ingest` | Digest codebase into LLM-friendly text files via gitingest |
- | `npm-creator` | Scaffold npm installer packages for Claude Code plugin repos |
- | `rule-audit` | Full enforcement audit of rules, hooks, and docs across user and project layers |
- | `rule-creator` | Create and harden Claude Code rules with positive framing and rationale |
- | `skill-writer` | Guide for creating well-structured Agent Skills |
- | `everything-search` | Fast Windows file search via Everything (voidtools) es.exe |
- | `recall` | Retrieve prior session context and decisions from Obsidian vault |
- | `remember` | Save decisions, gotchas, and architectural choices to Obsidian vault |
-
- ### Hooks (31 registered, 70+ files)
-
- Automated enforcement that runs on Claude Code events. The installer detects your Python 3 command and rewrites hook paths to absolute `~/.claude/hooks/` paths in `settings.json`.
-
- #### PreToolUse (before tool execution)
-
- | Matcher | Hook | What it does |
- |---------|------|-------------|
- | Write\|Edit | `write-existing-file-blocker` | Warns before overwriting files that should be edited |
- | Write\|Edit | `sensitive-file-protector` | Blocks writes to .env, credentials, and sensitive files |
- | Write\|Edit | `pyautogui-scroll-blocker` | Prevents pyautogui scroll direction bugs |
- | Write\|Edit | `hook-format-validator` | Validates hook file format on write |
- | Write\|Edit | `run_all_validators` | Runs the full validation suite (30+ checks) |
- | Write\|Edit | `code-rules-enforcer` | Blocks CODE_RULES.md violations (comments, magic values, imports) |
- | Write\|Edit | `tdd-enforcer` | Prompts TDD confirmation when writing production code |
- | Write\|Edit | `code-style-validator` | Checks indentation and function spacing |
- | Write\|Edit | `docker-settings-guard` | Blocks direct edits to Docker settings files |
- | Edit | `refactor-guard` | Ensures refactoring happens only after green tests |
- | Edit | `migration-safety-advisor` | Warns about risky database migration patterns |
- | Bash | `destructive-command-blocker` | Blocks rm -rf, git reset --hard, and other destructive commands |
- | Bash | `block-main-commit` | Blocks direct commits to main/master branch |
- | Bash | `pr-description-enforcer` | Enforces PR description structure and style |
- | Bash | `test-preflight-check` | Validates server health and database before test runs |
- | Task\|Agent | `parallel-task-blocker` | Suggests team orchestration for parallel agent spawning |
- | AskUserQuestion | `attention-needed-notify` | Desktop notification when Claude needs your input |
-
- #### Other Events
-
- | Event | Hook | What it does |
- |-------|------|-------------|
- | UserPromptSubmit | `hook-structure-context` | Injects hook directory context into session |
- | UserPromptSubmit | `bulk-edit-reminder` | Suggests script-based approach for bulk updates |
- | UserPromptSubmit | `code-rules-reminder` | Injects CODE_RULES.md reminder on code-related prompts |
- | SessionStart (compact) | `compact-context-reinject` | Re-injects critical rules after context compaction |
- | SessionStart | `plugin-data-dir-cleanup` | Cleans stale plugin data on session start |
- | Stop | `attention-needed-notify` | Desktop notification when Claude stops |
- | Stop | `hedging-language-blocker` | Blocks responses with hedging language (anti-hallucination) |
- | SessionEnd | `session-end-cleanup` | Cleans temporary state on session end |
- | ConfigChange | `config-change-guard` | Guards against accidental settings changes |
- | PostToolUse (Write\|Edit) | `mypy_validator` | Runs mypy type checking after file writes |
- | PostToolUse (Write\|Edit) | `e2e-test-validator` | Validates e2e test conventions after writes |
- | PostToolUse (Write\|Edit) | `auto-formatter` | Auto-formats Python (ruff/black) and JS (prettier) on write |
- | PostToolUse (Agent\|Task) | `investigation-tracker-reset` | Resets investigation tracker after delegation |
- | Notification | `claude-notification-handler` | Routes Claude Code notifications to desktop |
-
- #### Validators Module
-
- The `hooks/validators/` directory contains 30+ individual check modules with a full test suite:
-
- Abbreviations, code quality, comments, file structure, git conventions, magic values, mypy integration, PR references, Python antipatterns, Python style, React patterns, ruff integration, security, TODO tracking, type safety, useless test detection, and more.
-
- ## Also Available as a Plugin
-
- If you prefer the Claude Code plugin system over npm:
-
- ```bash
- claude plugin install jl-cmd/claude-code-config
- ```
-
- ## Recommended Companion Plugins
-
- These plugins provide additional skills and capabilities that complement this config:
-
- ```bash
- claude plugin install anthropics/claude-code-plugins # Official: frontend-design, code-review, playwright, hookify, skill-creator, claude-md-management, serena, pyright-lsp, typescript-lsp, claude-code-setup
- claude plugin install anthropics/claude-code-workflows # Official: python-dev, ui-design, unit-testing, context-management, agent-teams, and more
- claude plugin install jl-cmd/claude-journal # Session logging to Obsidian vault (provides /session-log)
- claude plugin install jl-cmd/claude-deep-research # Deep multi-source research with citations
- claude plugin install jl-cmd/claude-workflow # Workflow definitions with YAML schemas
- ```
-
- GSD (project management) is available as an npm package:
- ```bash
- npx get-shit-done-cc
- ```
-
- ## Customization
-
- Installed rules merge with your project's `.claude/` config. To override a rule for a specific project, create a rule with the same filename in your project's `.claude/rules/` directory.
-
- Installed hooks run alongside any hooks already in your `settings.json` or `settings.local.json`. The installer preserves existing hook entries.
-
- ## Agent Gate
-
- For a prompt evaluation gate that reviews prompts before execution, see [agent-gate](https://github.com/jl-cmd/agent-gate):
-
- ```bash
- npx agent-gate-installer
- ```
-
- ## Requirements
-
- - Node.js 18+ (for the installer)
- - Python 3.8+ (for hooks)
- - Claude Code CLI
-
- ## License
-
- MIT
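
Annotation on the deleted README: it says the installer merges its hook groups into `~/.claude/settings.json` while preserving existing hooks, and that re-running it updates entries in place. The installer's source is not part of this diff, so the following is only a minimal sketch of how such a merge could behave. The function name `merge_hook_groups`, the settings layout (event name, then a list of groups with a `matcher` and a `hooks` list of `command` entries), and the `.py` file names are all assumptions made for illustration.

```python
import copy

def merge_hook_groups(settings, packaged_hooks):
    """Return a copy of `settings` with `packaged_hooks` merged in.

    Hypothetical helper: existing entries are never removed, and a command
    that is already registered under the same matcher is not added twice.
    """
    merged = copy.deepcopy(settings)
    hooks_by_event = merged.setdefault("hooks", {})
    for event, groups in packaged_hooks.items():
        existing_groups = hooks_by_event.setdefault(event, [])
        for group in groups:
            matcher = group.get("matcher", "")
            # Reuse the user's group for this matcher if one exists.
            target = next(
                (g for g in existing_groups if g.get("matcher", "") == matcher),
                None,
            )
            if target is None:
                existing_groups.append(copy.deepcopy(group))
                continue
            known = {h.get("command") for h in target.get("hooks", [])}
            for hook in group.get("hooks", []):
                if hook.get("command") not in known:
                    target.setdefault("hooks", []).append(copy.deepcopy(hook))
                    known.add(hook.get("command"))
    return merged

# Assumed example data; the hook script paths are illustrative only.
user_settings = {
    "hooks": {
        "PreToolUse": [
            {"matcher": "Bash",
             "hooks": [{"type": "command", "command": "my-own-check.sh"}]}
        ]
    }
}
packaged_hooks = {
    "PreToolUse": [
        {"matcher": "Bash",
         "hooks": [{"type": "command",
                    "command": "python3 ~/.claude/hooks/destructive-command-blocker.py"}]}
    ],
    "Stop": [
        {"matcher": "",
         "hooks": [{"type": "command",
                    "command": "python3 ~/.claude/hooks/attention-needed-notify.py"}]}
    ],
}

merged_settings = merge_hook_groups(user_settings, packaged_hooks)
```

Merging into a deep copy keeps the caller's settings untouched until the result is written back, and deduplicating by command is one way to make a repeated install idempotent.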
@@ -1,102 +0,0 @@
- ---
- name: agent-prompt
- description: >-
- Craft a structured prompt using prompt-generator's workflow, then spawn a
- background agent to execute it after user approval. Use instead of
- /prompt-generator when the user wants execution, not just the prompt.
- Triggers on /agent-prompt, "launch an agent for this", "spawn agent to do X",
- "delegate this", "run this in background", or any task that benefits from
- agent delegation with prompt quality.
- ---
-
- @~/.claude/skills/prompt-generator/SKILL.md
- @~/.claude/skills/prompt-generator/REFERENCE.md
-
- # Agent Prompt
-
- Craft a structured agent prompt, get approval, spawn a background agent.
-
- The prompt-generator skill above defines the prompt-crafting workflow. This skill extends it: instead of delivering the prompt as a fenced block, it presents the prompt for approval and spawns a background agent.
-
- ## When this skill applies
-
- Trigger when the user wants to delegate a task to an agent. The difference from /prompt-generator: this skill **executes**.
-
- When invoked with arguments (e.g. `/agent-prompt fix the auth bug via TDD`), treat the arguments as the task to build a prompt for and execute.
-
- ## Workflow
-
- ### Steps 1-8: Craft the prompt
-
- Follow the prompt-generator workflow steps 1 through 8 exactly as written. Classify the prompt type, set degree of freedom, collect missing facts, build the prompt with XML tags and role, control format and style, add examples if needed, and self-check against the rubric.
-
- Skip step 9 (Deliver). Continue below instead.
-
- ### Step 9: Gather context before crafting
-
- The agent starts with zero conversation history. Before building the prompt, use Read, Glob, Grep, and other research tools to gather the concrete values the agent will need -- file paths, function signatures, existing patterns, branch names. Embed these directly in the prompt instead of telling the agent to "find" them.
-
- The agent-spawn-protocol rule requires this: if any context question has the answer "I don't know", investigate first. Do not delegate the context-gathering.
-
- Proactive context gathering enables agents to plan effectively from the start. Anthropic's emotion concepts research (2026) found that agents produce higher-quality output when they understand constraints, available tools, and system boundaries upfront — they incorporate these into their approach naturally, leading to better first attempts and more accurate results.
-
- ### Step 10: Determine agent configuration
-
- Map the task to agent parameters:
-
- | Task type | subagent_type | mode |
- |---|---|---|
- | Codebase exploration, search, research | Explore | default |
- | Code implementation, bug fix, refactoring | general-purpose | auto |
- | Read-only audit, analysis, review | general-purpose | default |
- | Architecture, multi-step planning | Plan | plan |
-
- Always set `run_in_background: true`.
-
- Generate a descriptive `name` (3-5 words, kebab-case) so the user can track progress and send follow-up messages via `SendMessage({to: name})`.
-
- ### Step 11: Present for approval
-
- Use AskUserQuestion with one question. The question text should summarize the agent config (type, mode, name). Each option should use the `preview` field to show the full crafted prompt.
-
- Options:
- 1. "Launch it" (recommended) -- preview shows the crafted prompt
- 2. "Edit first" -- preview shows the prompt with a note that user can provide changes
- 3. "Cancel" -- no preview
-
- ### Step 12: Spawn
-
- On **"Launch it"**: spawn the Agent tool with the crafted prompt and configuration. Report the agent name so the user knows what's running.
-
- On **"Edit first"**: present the prompt in conversation text. After the user provides changes, return to step 11 with the updated prompt.
-
- On **"Cancel"**: acknowledge and stop.
-
- ## Prompt adjustments for agent execution
-
- When building the prompt in step 4, these adjustments ensure the agent can work independently:
-
- **Context completeness** -- include file paths, line numbers, function names, branch state, and anything you learned during step 9. The agent cannot see this conversation.
-
- **Acceptance criteria** -- state what "done" looks like. For code: include the test command. For research: specify the output format and save location.
-
- **Scope boundary** -- include "Only make changes directly requested; do not refactor surrounding code" or equivalent. Agents without scope constraints tend to over-engineer.
-
- **Constraints from this project** -- if the project has CODE_RULES.md, TDD requirements, or naming conventions, include the relevant subset in the prompt so the agent follows them.
-
- **Emotion-informed briefing** -- Anthropic's emotion concepts research (2026) found that briefing style causally affects output quality. Frame tasks collaboratively ("work on this together", "help figure out"). Include permission to express uncertainty ("flag anything you're unsure about", "use [PLACEHOLDER] for unverified specifics"). Provide motivation behind constraints ("this ordering ensures tests define behavior before implementation exists"). Share system context proactively (what hooks enforce, what tools are available, what the fallback is) so the agent can incorporate constraints into its plan from the start.
-
- **Anti-test-fixation** -- For code tasks, include guidance against test-specific solutions. Anthropic: "Implement a solution that works correctly for all valid inputs, not just the test cases. Tests are there to verify correctness, not to define the solution. If the task is unreasonable or infeasible, or if any of the tests are incorrect, please inform me rather than working around them."
-
- **Commit-and-execute** -- For multi-step agent work, include decision commitment guidance. Anthropic: "When deciding how to approach a problem, choose an approach and commit to it. Avoid revisiting decisions unless you encounter new information that directly contradicts your reasoning."
-
- **Temp file cleanup** -- If the agent may create scratch files during iteration, include cleanup instructions. Anthropic: "If you create any temporary new files, scripts, or helper files for iteration, clean up these files by removing them at the end of the task."
-
- ## Constraints
-
- - Always present for approval via AskUserQuestion -- never auto-spawn
- - Always run agents in background
- - Gather context before crafting -- do not send an agent in blind
- - If the task is too small for an agent (single file read, quick grep), say so and just do it directly
- - Include obstacle handling: "When encountering obstacles, do not use destructive actions as a shortcut (e.g. --no-verify, discarding unfamiliar files)" -- agents without this guidance may take irreversible shortcuts
- - Frame agent tasks with collaborative language and include permission to express uncertainty — agents produce higher-quality output with collaborative briefing (Anthropic emotion concepts research, 2026)
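
Annotation on the deleted agent-prompt skill: Step 10 maps a task type to `subagent_type` and `mode`, always sets `run_in_background: true`, and asks for a 3-5 word kebab-case `name`. The skill states this as instructions to the model, not as code, so the sketch below is just one way to picture the mapping. The short task-type keys and the helper names `make_agent_name` and `build_agent_config` are hypothetical.

```python
import re

# Values taken from the Step 10 table; the short keys are illustrative labels.
AGENT_CONFIG_BY_TASK = {
    "exploration": {"subagent_type": "Explore", "mode": "default"},
    "implementation": {"subagent_type": "general-purpose", "mode": "auto"},
    "audit": {"subagent_type": "general-purpose", "mode": "default"},
    "planning": {"subagent_type": "Plan", "mode": "plan"},
}

def make_agent_name(description):
    """Build a kebab-case name from the first 3-5 words of a description."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    if len(words) < 3:
        raise ValueError("need at least three words for a descriptive name")
    return "-".join(words[:5])

def build_agent_config(task_type, description):
    """Combine the table row with a generated name; always runs in background."""
    config = dict(AGENT_CONFIG_BY_TASK[task_type])
    config["name"] = make_agent_name(description)
    config["run_in_background"] = True
    return config

example_config = build_agent_config("implementation", "Fix the auth bug via TDD")
print(example_config["name"])  # fix-the-auth-bug-via
```

Truncating to the first five words is a simplification; in the skill itself the model is expected to pick a descriptive name rather than cut the task text mechanically.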
@@ -1,150 +0,0 @@
- # Prompt generator -- reference
-
- ## Canonical resources
-
- When authoring or refining prompts, ground decisions in these sources. If guidance conflicts, defer to the higher tier.
-
- ### Tier 1: Anthropic (primary authority for Claude)
-
- - https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/overview -- overview, links to all sub-guides
- - https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices -- the single living reference for Claude's latest models. Covers general principles, XML tags, prefill deprecation, tool use, thinking, agentic systems, overeagerness, anti-hallucination.
- - https://transformer-circuits.pub/2026/emotions/index.html -- emotion concepts research (April 2026): 171 internal activation patterns that causally influence behavior. Key prompt-engineering takeaways: clear criteria and escape routes improve output quality, collaborative framing activates engagement, positive task framing correlates with better results, inviting transparency produces more reliable output. Cross-model caveat: studied on Sonnet 4.5; patterns align with best practices independently.
- - https://www.anthropic.com/research/emotion-concepts-function -- blog summary of the above paper.
- - https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking -- adaptive thinking reference; replaces manual budget_tokens with effort-based control.
-
- ### Tier 2: Major labs (strong secondary, often transfers across models)
-
- - https://platform.openai.com/docs/guides/prompt-engineering -- six strategies: write clear instructions, provide reference text, split complex tasks, give models time to think, use external tools, test systematically.
- - https://deepmind.google/research/ -- learning resources and chain-of-thought research.
- - https://www.microsoft.com/en-us/research/blog/ -- publications and applied research.
-
- ### Tier 3: Courses, communities, individuals (supplementary)
-
- **Courses:**
-
- - https://www.deeplearning.ai/short-courses/ -- Andrew Ng's courses. "ChatGPT Prompt Engineering for Developers" (with OpenAI) is the foundational one.
- - https://course.fast.ai/ -- Jeremy Howard's top-down teaching style.
- - https://www.elementsofai.com/ -- University of Helsinki introductory course.
- - https://ocw.mit.edu/search/?t=Artificial%20Intelligence -- MIT OpenCourseWare AI curriculum.
-
- **Communities and individuals:**
-
- - https://discuss.huggingface.co/ -- open-source model community.
- - https://www.latent.space/ -- AI engineering perspective (Latent Space Podcast & Newsletter).
- - https://simonwillison.net/ -- practical LLM experiments. His "LLM" tag is especially valuable.
-
- ### Conflict resolution rule
-
- If sources disagree on a technique, apply in order: Anthropic documentation first (it describes the actual model behavior), then OpenAI/Google/Microsoft (large-scale research with cross-model relevance), then community sources (patterns and intuition, not authoritative on model internals). When Tier 3 contradicts Tier 1, Tier 1 wins without exception.
-
- ## NotebookLM Audio Overview customization (example)
-
- Adapt `[FOCUS AREA]` per notebook. Pair with Deep Dive + Longer in the product UI when that matches the user's plan.
-
- ```text
- Target audience: [Expert-level listener profile -- skip beginner padding.]
-
- Focus: [FOCUS AREA -- single notebook-specific paragraph.]
-
- Style: [Technical depth, anti-patterns, implications for builders.]
-
- Prioritize: [Technical depth and specific findings over marketing tone or generic summaries.]
- ```
-
- ## Agent checklist pattern
-
- For long tasks, optional checklist the model can mirror:
-
- ```text
- Copy this checklist and mark items as you go:
-
- Progress:
- - [ ] ...
- - [ ] ...
- ```
-
- ## Agentic state management
-
- For `agent-harness` prompts that span multiple context windows, include state persistence and multi-window patterns. Based on Anthropic's guidance:
-
- ### Context awareness
-
- Claude 4.6 tracks its remaining context window. Include harness capabilities so Claude can plan accordingly:
-
- ```text
- <context_management>
- Your context window will be automatically compacted as it approaches its limit, allowing you to continue working indefinitely from where you left off. Do not stop tasks early due to token budget concerns. As you approach the limit, save current progress and state before the context window refreshes. Always be as persistent and autonomous as possible and complete tasks fully.
- </context_management>
- ```
-
- ### Multi-window workflow
-
- Anthropic recommends differentiating the first context window from subsequent ones:
-
- **First window:** Set up the framework -- write tests, create setup scripts, establish the todo-list.
-
- **Subsequent windows:** Iterate on the todo-list, using state files to resume.
-
- Key patterns from Anthropic:
- - Have the model write tests in a **structured format** (e.g. `tests.json` with `{id, name, status}`) before starting work. Remind: "It is unacceptable to remove or edit tests because this could lead to missing or buggy functionality."
- - Encourage **setup scripts** (e.g. `init.sh`) to start servers, run test suites, and linters. This prevents repeated work across windows.
- - When starting fresh, be **prescriptive about resumption**: "Review progress.txt, tests.json, and the git logs."
- - Provide **verification tools** (Playwright, computer use) for autonomous UI testing.
-
- ### State tracking
-
- ```text
- <state_management>
- Track progress in structured + freeform files:
- - tests.json: structured test status {id, name, status}
- - progress.txt: freeform session notes and next steps
- - Use git commits as checkpoints for rollback
-
- When approaching context limits, save current state before the window refreshes.
- Do not stop tasks early due to token budget concerns.
- </state_management>
- ```
-
- ### Encouraging complete context usage
-
- ```text
- This is a very long task, so it may be beneficial to plan out your work clearly. It's encouraged to spend your entire output context working on the task - just make sure you don't run out of context with significant uncommitted work. Continue working systematically until you have completed this task.
- ```
-
- ## Research prompt pattern
-
- For `research` prompt types, include structured investigation with hypothesis tracking:
-
- ```text
- <research_approach>
- Search for this information in a structured way. As you gather data, develop several competing hypotheses. Track your confidence levels in your progress notes to improve calibration. Regularly self-critique your approach and plan. Update a hypothesis tree or research notes file to persist information and provide transparency. Break down this complex research task systematically.
- </research_approach>
- ```
-
- Key elements:
- - Define clear **success criteria** for the research question
- - Encourage **source verification** across multiple sources
- - Track **competing hypotheses** with confidence levels
- - **Self-critique** approach and plan regularly
-
- ## Evaluation loop
-
- For prompt drafts that must hold up over time:
-
- 1. Run the draft on 2-3 representative user utterances.
- 2. Note failure modes (skipped steps, wrong format, over-refusal).
- 3. Tighten **constraints** or add **examples** for the failure class only.
-
- Anthropic's **self-correction chaining** pattern extends this: generate a draft, have Claude review it against criteria, then have Claude refine based on the review. Each step can be a separate API call for inspection and branching.
-
- ## Anti-test-fixation pattern
-
- ```text
- Write general-purpose solutions using the standard tools available. Implement logic that works correctly for all valid inputs, not just the test cases. Tests verify correctness -- they do not define the solution. If a test seems incorrect or the task is unreasonable, flag it rather than working around it.
- ```
-
- ## Commit-and-execute pattern
-
- ```text
- When deciding how to approach a problem, choose an approach and commit to it. Avoid revisiting decisions unless you encounter new information that directly contradicts your reasoning. If you are weighing two approaches, pick one and see it through. You can always course-correct later if the chosen approach fails.
- ```
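
Annotation on the deleted reference file: its state-tracking section describes a `tests.json` file holding `{id, name, status}` records alongside a freeform `progress.txt`. The reference gives no implementation, so the following is a rough sketch of what reading and updating those two files might look like. The status vocabulary and the helper names `load_tests`, `set_test_status`, and `save_state` are assumptions, not part of the package.

```python
import json
import tempfile
from pathlib import Path

# Assumed status vocabulary; the reference only names the `status` field.
VALID_STATUSES = {"pending", "passing", "failing"}

def load_tests(state_dir):
    """Read tests.json from state_dir, or return an empty list if absent."""
    path = Path(state_dir) / "tests.json"
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))

def set_test_status(tests, test_id, status):
    """Update one record's status in place; records are never removed."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    for record in tests:
        if record["id"] == test_id:
            record["status"] = status
            return
    raise KeyError(test_id)

def save_state(state_dir, tests, note):
    """Rewrite tests.json and append one line of notes to progress.txt."""
    state_dir = Path(state_dir)
    (state_dir / "tests.json").write_text(json.dumps(tests, indent=2), encoding="utf-8")
    with (state_dir / "progress.txt").open("a", encoding="utf-8") as handle:
        handle.write(note.rstrip("\n") + "\n")

# First "window": record the tests. Later "window": reload and update.
with tempfile.TemporaryDirectory() as workdir:
    initial = [
        {"id": 1, "name": "login succeeds", "status": "pending"},
        {"id": 2, "name": "login rejects bad password", "status": "pending"},
    ]
    save_state(workdir, initial, "window 1: wrote tests, nothing implemented yet")

    resumed = load_tests(workdir)
    set_test_status(resumed, 1, "passing")
    save_state(workdir, resumed, "window 2: test 1 passing, next is test 2")

    final_tests = load_tests(workdir)
    progress_lines = (Path(workdir) / "progress.txt").read_text(encoding="utf-8").splitlines()
```

Only the `status` field changes between windows, which mirrors the reference's reminder that tests should not be removed or edited once written.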
@@ -1,154 +0,0 @@
- ---
- name: prompt-generator
- description: >-
- Write, generate, or improve prompts and system instructions for Claude.
- Covers system prompts, agent harness, tool-use, evaluation rubrics,
- NotebookLM audio, and MCP/browser automation prompts.
- ---
- @~/.claude/skills/prompt-generator/REFERENCE.md
-
- # Prompt generator
-
- **Core principle:** A good prompt is explicit, structured, and matched to task fragility -- high freedom for open-ended work, low freedom for fragile sequences.
-
- **Canonical source:** https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices -- the single reference for Claude's latest models. When sources conflict, defer to the authority tiers (Anthropic > major labs > community).
-
- ## When this skill applies
-
- Trigger for any request to **author** or **refine** text that steers Claude: system prompts, developer messages, agent harness instructions, evaluation rubrics, MCP/browser automation prompts, NotebookLM Audio Overview customization, etc.
-
- Do **not** use this skill when the user only wants a one-line reply with no template.
-
- When invoked with arguments (e.g. `/prompt-generator improve this: [paste]`), treat `$ARGUMENTS` as the prompt to refine.
-
- ## Workflow (run in order)
-
- ### 1. Classify the prompt type
-
- Pick one primary: `system` | `user-task` | `agent-harness` | `tool-use` | `audio-customization` | `evaluation` | `research` | `other`.
-
- ### 2. Set degree of freedom
-
- Match specificity to task fragility:
- - **High:** Multiple valid approaches; use numbered goals and acceptance criteria.
- - **Medium:** Preferred pattern exists; use pseudocode or a parameterised template.
- - **Low:** Fragile or safety-critical; use exact steps, exact labels, and "do not" boundaries.
-
- ### 3. Collect only missing facts
-
- Ask 1-3 short questions if needed: audience, output format, constraints, tools available, tone, length.
-
- ### 4. Build the prompt
-
- Apply these principles (source: https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices):
-
- **Structure with XML section tags** (`<role>`, `<context>`, `<instructions>`, `<constraints>`, `<examples>`, `<output_format>`) for prompts that mix instruction + context + examples. Skip XML for simple prompts under ~3 lines. Anthropic: "Use consistent, descriptive tag names across your prompts. Nest tags when content has a natural hierarchy."
-
- **Set a role** in the system prompt. Anthropic: "Setting a role in the system prompt focuses Claude's behavior and tone for your use case. Even a single sentence makes a difference."
-
- **Add motivation behind constraints** in `<context>`. Anthropic: "Providing context or motivation behind your instructions... can help Claude better understand your goals and deliver more targeted responses." Claude generalizes from the explanation.
-
- **Frame positively.** Anthropic: tell Claude what to DO, not only what to avoid. "Your response should be composed of smoothly flowing prose paragraphs" beats "Do not use markdown."
-
- **Emotion-informed framing.** Anthropic's emotion concepts research (2026) found that internal activation patterns causally influence output quality. Five patterns apply to prompt design: (1) provide clear criteria and escape routes — the model produces better results when success criteria are explicit and "say so if you're unsure" is an accepted response; (2) use collaborative framing — collaborative language ("help figure out", "work on this together") activates engagement states that correlate with higher quality; (3) frame tasks with positive engagement — presenting tasks as interesting problems activates curiosity states; (4) invite transparency — include "say so if you're unsure" or placeholder notation so the model expresses uncertainty directly; (5) use constructive, forward-looking tone — post-training RLHF creates a reflective default that benefits from energetic counterbalancing. Cross-model caveat: studied on Sonnet 4.5; the patterns align with Anthropic's best practices independently.
-
- **Golden rule check.** Anthropic: "Show your prompt to a colleague with minimal context on the task and ask them to follow it. If they'd be confused, Claude will be too."
-
- **Commit-and-execute pattern.** Anthropic: "When you're deciding how to approach a problem, choose an approach and commit to it. Avoid revisiting decisions unless you encounter new information that directly contradicts your reasoning." For prompts that guide agents through multi-step work, include this pattern so the agent doesn't spin revisiting decisions.
58
-
59
- **For long context** (20k+ tokens): put documents first, query/instructions last. Anthropic: "Queries at the end can improve response quality by up to 30% in tests." Ground responses in quotes from source material before analysis.
60
-
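A minimal sketch of the assembly described in step 4, assuming a plain-string prompt. The helper names `tag` and `build_prompt` and the `<documents>`/`<document>` tag names are illustrative assumptions, not part of the skill. It places long documents first, wraps each section in a descriptive XML tag, and leaves the query last:

```python
def tag(name, body):
    """Wrap body in a named XML section tag (the names are the caller's choice)."""
    return f"<{name}>\n{body.strip()}\n</{name}>"


def build_prompt(context, instructions, documents=(), query=""):
    """Assemble a prompt: documents first, tagged sections next, query last."""
    parts = []
    if documents:
        # Hypothetical nesting: one <document> per source inside <documents>.
        docs = "\n".join(tag("document", text) for text in documents)
        parts.append(tag("documents", docs))
    parts.append(tag("context", context))
    parts.append(tag("instructions", instructions))
    if query:
        # The query goes at the very end, after all long-form material.
        parts.append(query.strip())
    return "\n\n".join(parts)


prompt = build_prompt(
    context="Audience: on-call engineers. Motivation: they need a fast triage summary.",
    instructions="Quote the relevant log lines first, then summarize the incident.",
    documents=["log A ...", "log B ..."],
    query="What failed first?",
)
print(prompt)
```

The role is left out on purpose, because the guidance above sets it in the system prompt rather than in the assembled message.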
61
- ### 5. Control output format
62
-
63
- Apply these four techniques from the Anthropic guide:
64
-
65
- 1. **Tell Claude what to do, not what to avoid.** "Your response should be composed of smoothly flowing prose paragraphs" is more effective than "Do not use markdown."
66
- 2. **Use XML format indicators.** "Write the prose sections of your response in `<smoothly_flowing_prose_paragraphs>` tags."
67
- 3. **Match your prompt style to the desired output.** The formatting in your prompt influences the response. Removing markdown from the prompt reduces markdown in the output.
68
- 4. **Use detailed formatting preferences** when precision matters. Provide explicit guidance on markdown usage, list vs. prose preference, heading levels.
69
-
70
- For structured data output, prefer **structured outputs** (schema-constrained) or **tool calling** over prefill. Anthropic: "The Structured Outputs feature is designed specifically to constrain Claude's responses to follow a given schema."
71
-
72
- ### 6. Control communication style
73
-
74
- Anthropic notes Claude 4.6 is "more direct and grounded... less verbose: may skip detailed summaries for efficiency unless prompted otherwise."
75
-
76
- - If more visibility is wanted: "After completing a task that involves tool use, provide a quick summary of the work you've done."
77
- - If less verbosity is wanted: "Respond directly without preamble. Do not start with phrases like 'Here is...', 'Based on...'."
78
-
79
- ### 7. Add examples
80
-
81
- Include 3-5 concrete examples for prompts that are sensitive to structured output, format, or tone. Wrap each one in `<example>` tags and use diverse, representative inputs. Anthropic: "Include 3-5 examples for best results. You can also ask Claude to evaluate your examples for relevance and diversity."
82
-
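One way to sketch the example-wrapping step, assuming examples arrive as (input, output) pairs. The function name `format_examples` and the inner `<input>`/`<output>` tags are hypothetical; only the `<example>` tag and the 3-5 range come from the text above:

```python
def format_examples(pairs):
    """Wrap (input, output) pairs in <example> tags inside an <examples> block."""
    if not 3 <= len(pairs) <= 5:
        # Mirrors the 3-5 guidance; relax this check if your use case differs.
        raise ValueError("expected 3-5 examples")
    blocks = [
        # <input> and <output> are illustrative tag names, not a required schema.
        f"<example>\n<input>{inp}</input>\n<output>{out}</output>\n</example>"
        for inp, out in pairs
    ]
    return "<examples>\n" + "\n".join(blocks) + "\n</examples>"


print(format_examples([
    ("Refund request, angry tone", "category: billing"),
    ("Password reset link expired", "category: account"),
    ("Feature idea for dark mode", "category: feedback"),
]))
```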
83
- ### 8. Self-check
84
-
85
- Before delivering, verify against the rubric:
86
-
87
- - [ ] States what to do in positive terms (not only what to avoid)
88
- - [ ] Output shape is specified if it matters (prose vs JSON vs XML vs structured outputs)
89
- - [ ] Communication style addressed (verbosity, summaries, preamble)
90
- - [ ] If tools exist: instructions tell Claude **when** to call each tool -- use natural phrasing ("Use this tool when...") over forceful directives to avoid overtriggering
91
- - [ ] No time-sensitive claims unless user asked for a snapshot date
92
- - [ ] For agent/tool prompts: includes a scope boundary ("Only make changes directly requested; do not refactor surrounding code")
93
- - [ ] For agent/tool prompts: includes autonomy/safety guidance (see pattern below)
94
- - [ ] For code/research prompts: includes grounding ("Read files before answering; say 'I don't know' when uncertain")
95
- - [ ] For research prompts: anti-hallucination ("Never speculate about code you have not opened")
96
- - [ ] For research prompts: structured approach ("Develop competing hypotheses, track confidence, self-critique")
97
- - [ ] Self-correction chain considered: would a generate-review-refine loop improve output?
98
- - [ ] For agentic prompts: state management addressed (context awareness, multi-window workflow, state tracking patterns)
99
- - [ ] Emotion-informed: uses collaborative framing (roles, motivation, partnership language)
100
- - [ ] Emotion-informed: includes permission to express uncertainty ("say so if unsure", placeholder notation)
101
- - [ ] Emotion-informed: proactive constraint awareness (inform about constraints upfront so the model can incorporate them into its plan)
102
- - [ ] For code prompts: includes anti-test-fixation ("Write general solutions, not code that only passes specific test cases; if tests seem wrong, flag them")
103
- - [ ] For agent prompts: includes temp file cleanup ("Clean up temporary files, scripts, or helper files created during the task")
104
- - [ ] For agent prompts: includes commit-and-execute pattern ("Choose an approach and commit; avoid revisiting decisions without new contradicting information")
105
-
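Part of the rubric can be approximated mechanically. The sketch below covers only the overtriggering item: it flags lines that rely on all-caps directive words. The word list and the function name `find_forceful_directives` are assumptions for illustration, and a hit is a prompt to reconsider the phrasing, not proof of a problem:

```python
import re
from typing import List

# Hypothetical word list; adjust it for your own prompts.
FORCEFUL = re.compile(r"\b(CRITICAL|MUST|NEVER|ALWAYS|IMPORTANT)\b")


def find_forceful_directives(prompt: str) -> List[str]:
    """Return the lines of a prompt that contain all-caps directive words."""
    return [
        line.strip()
        for line in prompt.splitlines()
        if FORCEFUL.search(line)  # case-sensitive, so "must" is not flagged
    ]


draft = (
    "CRITICAL: You MUST use this tool.\n"
    "Use this tool when the user asks for search."
)
for line in find_forceful_directives(draft):
    print("consider softening:", line)
```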
106
- ### 9. Deliver
107
-
108
- Deliver the final artifact as **one or more fenced blocks** the user can paste as-is. If the prompt is long, add a **one-line "when to use"** summary.
109
-
110
- ## Claude 4.6 considerations
111
-
112
- When generating prompts for current Claude models, apply these patterns:
113
-
114
- - **Prefill deprecated:** Do not use prefilled assistant responses. Anthropic: "Model intelligence and instruction following has advanced such that most use cases of prefill no longer require it." Use structured outputs, direct instructions, or XML tags instead.
115
- - **Overtriggering:** Dial back aggressive language. Anthropic: "Where you might have said 'CRITICAL: You MUST use this tool when...', you can use more normal prompting like 'Use this tool when...'."
116
- - **Overeagerness:** Include scope constraints. Anthropic: "Claude Opus 4.5 and Claude Opus 4.6 have a tendency to overengineer by creating extra files, adding unnecessary abstractions, or building in flexibility that wasn't requested."
117
- - **Overthinking:** Anthropic: "Replace blanket defaults with more targeted instructions. Instead of 'Default to using [tool],' add guidance like 'Use [tool] when it would enhance your understanding of the problem.'"
118
- - **Adaptive thinking replaces `budget_tokens`:** Claude 4.6 uses adaptive thinking (`thinking: {type: "adaptive"}`), where the model dynamically decides when and how much to think. Use the `effort` parameter (`low` | `medium` | `high` | `max`) to control depth. Anthropic: "In internal evaluations, adaptive thinking reliably drives better performance than extended thinking." Manual `budget_tokens` is deprecated.
119
- - **Subagent orchestration:** Include guidance for when subagents ARE and ARE NOT warranted. Anthropic: "Use subagents when tasks can run in parallel, require isolated context, or involve independent workstreams that don't need to share state. For simple tasks, sequential operations, single-file edits, or tasks where you need to maintain context across steps, work directly rather than delegating."
120
- - **Conservative vs proactive action:** For tools that should act, use explicit language ("Change this function"). For tools that should advise, use: "Default to providing information... Only proceed with edits when the user explicitly requests them."
121
- - **Anti-hallucination:** Anthropic: "Never speculate about code you have not opened. If the user references a specific file, you MUST read the file before answering."
122
- - **Self-correction chaining:** Anthropic: "The most common chaining pattern is self-correction: generate a draft, have Claude review it against criteria, have Claude refine based on the review." Consider adding a generate-review-refine loop for prompts that must hold up over time.
123
-
124
- ## Autonomy and safety pattern
125
-
126
- For `agent-harness` and `tool-use` prompt types, include guidance on reversibility. Anthropic provides this pattern:
127
-
128
- ```text
129
- Consider the reversibility and potential impact of your actions. You are encouraged to take local, reversible actions like editing files or running tests, but for actions that are hard to reverse, affect shared systems, or could be destructive, ask the user before proceeding.
130
-
131
- Examples of actions that warrant confirmation:
132
- - Destructive operations: deleting files or branches, dropping database tables, rm -rf
133
- - Hard to reverse operations: git push --force, git reset --hard, amending published commits
134
- - Operations visible to others: pushing code, commenting on PRs/issues, sending messages
135
- When encountering obstacles, do not use destructive actions as a shortcut. For example, don't bypass safety checks (e.g. --no-verify) or discard unfamiliar files that may be in-progress work.
136
- ```
137
-
138
- ## Research prompt pattern
139
-
140
- For `research` prompt types, include structured investigation. Anthropic provides this pattern:
141
-
142
- ```text
143
- Search for this information in a structured way. As you gather data, develop several competing hypotheses. Track your confidence levels in your progress notes to improve calibration. Regularly self-critique your approach and plan. Update a hypothesis tree or research notes file to persist information and provide transparency.
144
- ```
145
-
146
- ## Conflict resolution
147
-
148
- When prompt engineering guidance conflicts across sources, defer to the authority tier:
149
-
150
- 1. **Tier 1 (primary):** Anthropic -- the model provider's own documentation is authoritative for Claude behavior
151
- 2. **Tier 2 (strong secondary):** OpenAI, Google DeepMind, Microsoft Research -- major lab guidance often transfers across models
152
- 3. **Tier 3 (supplementary):** Community resources, courses, individual blogs -- valuable for patterns and intuition, not authoritative on model specifics
153
-
154
- The full curated resource list with links is in the canonical resources section above.