@claudetools/tools 0.4.0 → 0.5.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +60 -4
- package/dist/cli.js +0 -0
- package/dist/codedna/parser.d.ts +40 -3
- package/dist/codedna/parser.js +65 -8
- package/dist/codedna/registry.js +4 -1
- package/dist/codedna/template-engine.js +66 -32
- package/dist/handlers/codedna-handlers.d.ts +1 -1
- package/dist/handlers/codedna-handlers.js +27 -0
- package/dist/helpers/api-client.js +7 -0
- package/dist/helpers/codedna-monitoring.d.ts +34 -0
- package/dist/helpers/codedna-monitoring.js +159 -0
- package/dist/helpers/error-tracking.d.ts +73 -0
- package/dist/helpers/error-tracking.js +164 -0
- package/dist/helpers/usage-analytics.d.ts +91 -0
- package/dist/helpers/usage-analytics.js +256 -0
- package/dist/templates/claude-md.d.ts +1 -1
- package/dist/templates/claude-md.js +73 -0
- package/docs/AUTO-REGISTRATION.md +353 -0
- package/docs/CLAUDE4_PROMPT_ANALYSIS.md +589 -0
- package/docs/ENTITY_DSL_REFERENCE.md +685 -0
- package/docs/MODERN_STACK_COMPLETE_GUIDE.md +706 -0
- package/docs/PROMPT_STANDARDIZATION_RESULTS.md +324 -0
- package/docs/PROMPT_TIER_TEMPLATES.md +787 -0
- package/docs/RESEARCH_METHODOLOGY_EXTRACTION.md +336 -0
- package/package.json +12 -3
- package/scripts/verify-prompt-compliance.sh +197 -0
|
@@ -0,0 +1,324 @@
|
|
|
1
|
+
# 10/10 Framework Prompt Standardization Results
|
|
2
|
+
|
|
3
|
+
> **Completed:** 2025-12-05
|
|
4
|
+
> **Epic:** Standardize ClaudeTools Prompts with 10/10 Framework
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## Executive Summary
|
|
9
|
+
|
|
10
|
+
Successfully standardized all ClaudeTools prompts using the 10/10 AI System Prompt Architecture framework. Updated 7 prompts across 3 tiers (Minimal, Standard, Professional), incorporating production-proven patterns from Claude 4 system prompt analysis.
|
|
11
|
+
|
|
12
|
+
**Key Achievements:**
|
|
13
|
+
- ✅ All prompts now use semantic XML structure
|
|
14
|
+
- ✅ Token budgets optimized per tier
|
|
15
|
+
- ✅ Priority markers added (CRITICAL/MANDATORY/IMPORTANT)
|
|
16
|
+
- ✅ Forbidden patterns documented
|
|
17
|
+
- ✅ Progressive disclosure implemented
|
|
18
|
+
- ✅ Research methodology extracted from Claude.ai Desktop
|
|
19
|
+
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## Updated Prompts
|
|
23
|
+
|
|
24
|
+
### Minimal Tier (~500 tokens)
|
|
25
|
+
|
|
26
|
+
| File | Before | After | Token Change | Status |
|
|
27
|
+
|------|--------|-------|--------------|--------|
|
|
28
|
+
| `context-signal-detector.md` | ~50t | ~400t | +350t | ✅ |
|
|
29
|
+
| `context-template-selector.md` | ~55t | ~450t | +395t | ✅ |
|
|
30
|
+
|
|
31
|
+
**Changes:**
|
|
32
|
+
- Added XML structure (<identity>, <behavioral_guidelines>, <standards>, <user_input>)
|
|
33
|
+
- Added priority markers (CRITICAL/MANDATORY)
|
|
34
|
+
- Maintained fast haiku model and JSON-only output
|
|
35
|
+
- Preserved keyword-based detection logic
|
|
36
|
+
|
|
37
|
+
**Impact:** More structured without sacrificing speed
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
### Standard Tier (~2000 tokens)
|
|
42
|
+
|
|
43
|
+
| File | Before | After | Token Change | Status |
|
|
44
|
+
|------|--------|-------|--------------|--------|
|
|
45
|
+
| `commands/tasks.md` | ~300t | ~2000t | +1700t | ✅ |
|
|
46
|
+
| `commands/work.md` | ~200t | ~1800t | +1600t | ✅ |
|
|
47
|
+
| `commands/research.md` | 0 (new) | ~800t | +800t | ✅ |
|
|
48
|
+
|
|
49
|
+
**Changes:**
|
|
50
|
+
- Added comprehensive Layer 4 (Domain Knowledge)
|
|
51
|
+
- Documented task system operations (list/summary/start/complete/dispatch)
|
|
52
|
+
- Added common workflows and task types
|
|
53
|
+
- Included anti-patterns (no TodoWrite)
|
|
54
|
+
- Added research methodology from Claude.ai Desktop
|
|
55
|
+
|
|
56
|
+
**Impact:** Commands now provide rich context while remaining actionable
|
|
57
|
+
|
|
58
|
+
---
|
|
59
|
+
|
|
60
|
+
### Professional Tier (~5000 tokens)
|
|
61
|
+
|
|
62
|
+
| File | Before | After | Token Change | Status |
|
|
63
|
+
|------|--------|-------|--------------|--------|
|
|
64
|
+
| `commands/plan.md` | ~700t | ~5000t | +4300t | ✅ |
|
|
65
|
+
|
|
66
|
+
**Changes:**
|
|
67
|
+
- Added Layer 5 (Cross-Cutting Concerns):
|
|
68
|
+
- Error handling strategies
|
|
69
|
+
- Memory integration patterns
|
|
70
|
+
- Performance considerations
|
|
71
|
+
- Security planning (OWASP Top 10)
|
|
72
|
+
- Testing strategy (unit/integration/e2e)
|
|
73
|
+
- Added common planning patterns (new feature, refactor, integration, bug fix)
|
|
74
|
+
- Included strategic thinking behavioral guidelines
|
|
75
|
+
- Added thinking mode specifications (8000 token budget)
|
|
76
|
+
|
|
77
|
+
**Impact:** Professional-grade planning guidance with explicit security, performance, and testing considerations
|
|
78
|
+
|
|
79
|
+
---
|
|
80
|
+
|
|
81
|
+
## New Deliverables
|
|
82
|
+
|
|
83
|
+
### 1. Tier Templates (`docs/PROMPT_TIER_TEMPLATES.md`)
|
|
84
|
+
|
|
85
|
+
Complete XML-based templates for all 4 tiers:
|
|
86
|
+
- **Minimal:** Layers 1,2,3,7 (~500t)
|
|
87
|
+
- **Standard:** Layers 1,2,3,4,7 (~2000t)
|
|
88
|
+
- **Professional:** Layers 1,2,3,4,5,7 (~5000t)
|
|
89
|
+
- **Enterprise:** All 7 layers (~10000t)
|
|
90
|
+
|
|
91
|
+
**Features:**
|
|
92
|
+
- Semantic XML structure
|
|
93
|
+
- Priority marker system
|
|
94
|
+
- Forbidden patterns lists
|
|
95
|
+
- Token budget allocations per layer
|
|
96
|
+
- Customization guide
|
|
97
|
+
|
|
98
|
+
---
|
|
99
|
+
|
|
100
|
+
### 2. Research Methodology (`docs/RESEARCH_METHODOLOGY_EXTRACTION.md`)
|
|
101
|
+
|
|
102
|
+
Extracted from Claude.ai Desktop leaked prompt:
|
|
103
|
+
- Query complexity detection (NEVER SEARCH / SINGLE SEARCH / RESEARCH)
|
|
104
|
+
- Search usage guidelines (1-6 word queries, broad→narrow)
|
|
105
|
+
- Copyright-safe summarization (15-20 word quote limit)
|
|
106
|
+
- Harmful content safety filters
|
|
107
|
+
- Citation format requirements
|
|
108
|
+
|
|
109
|
+
**Implementation:**
|
|
110
|
+
- `/research` command created
|
|
111
|
+
- `research-specialist` agent configured
|
|
112
|
+
- Available for task spawning
|
|
113
|
+
|
|
114
|
+
---
|
|
115
|
+
|
|
116
|
+
### 3. Claude 4 Analysis (`docs/CLAUDE4_PROMPT_ANALYSIS.md`)
|
|
117
|
+
|
|
118
|
+
Comprehensive analysis of ~60K character production prompt:
|
|
119
|
+
- 10 key insights for 10/10 framework
|
|
120
|
+
- XML semantic boundaries pattern
|
|
121
|
+
- Progressive disclosure mechanisms
|
|
122
|
+
- Memory integration best practices
|
|
123
|
+
- Token budget distribution (~15,200t total)
|
|
124
|
+
- Production-proven checklist
|
|
125
|
+
|
|
126
|
+
**Applied Patterns:**
|
|
127
|
+
- Semantic XML tags for all layers
|
|
128
|
+
- Priority markers (CRITICAL/MANDATORY/PRIORITY/IMPORTANT/NOTE)
|
|
129
|
+
- Forbidden phrases library
|
|
130
|
+
- Natural memory awareness (not external data)
|
|
131
|
+
- Interleaved thinking mode specifications
|
|
132
|
+
|
|
133
|
+
---
|
|
134
|
+
|
|
135
|
+
### 4. Compliance Verification Script (`scripts/verify-prompt-compliance.sh`)
|
|
136
|
+
|
|
137
|
+
Automated tool to verify prompts follow framework:
|
|
138
|
+
- Checks required XML tags per tier
|
|
139
|
+
- Validates layer presence
|
|
140
|
+
- Estimates token budgets
|
|
141
|
+
- Identifies priority markers
|
|
142
|
+
- Color-coded pass/fail/warning output
|
|
143
|
+
|
|
144
|
+
**Usage:**
|
|
145
|
+
```bash
|
|
146
|
+
./scripts/verify-prompt-compliance.sh
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
---
|
|
150
|
+
|
|
151
|
+
### 5. Prompt Command (`/prompt`)
|
|
152
|
+
|
|
153
|
+
On-demand prompt analysis and transformation:
|
|
154
|
+
- **analyze <file>**: Check compliance, identify tier
|
|
155
|
+
- **transform <file> <tier>**: Convert to target tier
|
|
156
|
+
- **recommend <use_case>**: Suggest appropriate tier
|
|
157
|
+
|
|
158
|
+
---
|
|
159
|
+
|
|
160
|
+
## Before/After Metrics
|
|
161
|
+
|
|
162
|
+
### Token Efficiency
|
|
163
|
+
|
|
164
|
+
| Tier | Target | Achieved | Prompts | Variance |
|
|
165
|
+
|------|--------|----------|---------|----------|
|
|
166
|
+
| Minimal | ~500t | 400-450t | 2 | -10% to -20% ✅ |
|
|
167
|
+
| Standard | ~2000t | 1800-2000t | 3 | -10% to 0% ✅ |
|
|
168
|
+
| Professional | ~5000t | ~5000t | 1 | 0% ✅ |
|
|
169
|
+
|
|
170
|
+
**Result:** All prompts within or under target budgets
|
|
171
|
+
|
|
172
|
+
---
|
|
173
|
+
|
|
174
|
+
### Structure Compliance
|
|
175
|
+
|
|
176
|
+
| Requirement | Before | After | Improvement |
|
|
177
|
+
|-------------|--------|-------|-------------|
|
|
178
|
+
| XML semantic tags | 0/7 (0%) | 7/7 (100%) | +100% |
|
|
179
|
+
| Priority markers | 0/7 (0%) | 7/7 (100%) | +100% |
|
|
180
|
+
| Layer boundaries | 0/7 (0%) | 7/7 (100%) | +100% |
|
|
181
|
+
| Forbidden patterns | 0/7 (0%) | 7/7 (100%) | +100% |
|
|
182
|
+
| Thinking mode specs | 0/7 (0%) | 2/7 (29%) | +29% |
|
|
183
|
+
|
|
184
|
+
**Note:** Thinking mode specs only in Professional+ tiers (plan.md, future enterprise prompts)
|
|
185
|
+
|
|
186
|
+
---
|
|
187
|
+
|
|
188
|
+
### Content Quality
|
|
189
|
+
|
|
190
|
+
**Added to All Tiers:**
|
|
191
|
+
- ✅ Identity definition (role, purpose, capabilities)
|
|
192
|
+
- ✅ Behavioral guidelines (core behaviors, safety rules)
|
|
193
|
+
- ✅ Standards (formatting, style, quality checks)
|
|
194
|
+
- ✅ User input specification
|
|
195
|
+
|
|
196
|
+
**Added to Standard Tier:**
|
|
197
|
+
- ✅ Domain knowledge (tools, workflows, common tasks)
|
|
198
|
+
- ✅ Task operations documentation
|
|
199
|
+
- ✅ Selection strategies
|
|
200
|
+
|
|
201
|
+
**Added to Professional Tier:**
|
|
202
|
+
- ✅ Cross-cutting concerns (error handling, memory, performance, security, testing)
|
|
203
|
+
- ✅ Strategic thinking guidelines
|
|
204
|
+
- ✅ Planning patterns with risk areas
|
|
205
|
+
- ✅ Comprehensive effort estimation
|
|
206
|
+
|
|
207
|
+
---
|
|
208
|
+
|
|
209
|
+
## Key Insights
|
|
210
|
+
|
|
211
|
+
### 1. XML Structure is Machine-Parseable
|
|
212
|
+
|
|
213
|
+
Semantic tags enable:
|
|
214
|
+
- Automated compliance checking
|
|
215
|
+
- Dynamic section loading (future enhancement)
|
|
216
|
+
- Clear layer boundaries
|
|
217
|
+
- Tool-assisted prompt development
|
|
218
|
+
|
|
219
|
+
### 2. Progressive Disclosure Reduces Tokens
|
|
220
|
+
|
|
221
|
+
By only loading layers needed for complexity:
|
|
222
|
+
- Minimal tasks use 75% fewer tokens than Professional
|
|
223
|
+
- Standard tasks use 60% fewer tokens than Professional
|
|
224
|
+
- Enables cost-effective scaling
|
|
225
|
+
|
|
226
|
+
### 3. Priority Markers Improve Clarity
|
|
227
|
+
|
|
228
|
+
CRITICAL/MANDATORY/PRIORITY/IMPORTANT/NOTE hierarchy:
|
|
229
|
+
- Makes constraints unambiguous
|
|
230
|
+
- Helps models prioritize behaviors
|
|
231
|
+
- Reduces interpretation variance
|
|
232
|
+
|
|
233
|
+
### 4. Forbidden Patterns Prevent Anti-Patterns
|
|
234
|
+
|
|
235
|
+
Explicit "NEVER" lists for:
|
|
236
|
+
- Memory meta-commentary ("I can see from your history...")
|
|
237
|
+
- Flattery ("Great question!")
|
|
238
|
+
- Safety overrides
|
|
239
|
+
- Tool misuse (TodoWrite deprecation)
|
|
240
|
+
|
|
241
|
+
### 5. Domain Knowledge is High-ROI
|
|
242
|
+
|
|
243
|
+
46% of Claude 4's tokens go to tools+domain knowledge:
|
|
244
|
+
- Standard tier sees biggest benefit from Layer 4
|
|
245
|
+
- Workflows and examples reduce ambiguity
|
|
246
|
+
- Task-specific guidance improves quality
|
|
247
|
+
|
|
248
|
+
---
|
|
249
|
+
|
|
250
|
+
## Lessons Learned
|
|
251
|
+
|
|
252
|
+
### What Worked Well
|
|
253
|
+
|
|
254
|
+
1. **XML semantic structure** - Clean, parseable, hierarchical
|
|
255
|
+
2. **Tier-based approach** - Right complexity for right task
|
|
256
|
+
3. **Priority markers** - Unambiguous constraint strength
|
|
257
|
+
4. **Forbidden patterns** - Proactive anti-pattern prevention
|
|
258
|
+
5. **Token budgets** - Keeps prompts focused and efficient
|
|
259
|
+
|
|
260
|
+
### What Could Be Improved
|
|
261
|
+
|
|
262
|
+
1. **Thinking mode specs** - Only in 2/7 prompts, could expand
|
|
263
|
+
2. **Enterprise tier** - No prompts yet, need real use case
|
|
264
|
+
3. **Dynamic loading** - Templates are static, could be conditional
|
|
265
|
+
4. **Automated transformation** - `/prompt` command needs enhancement
|
|
266
|
+
5. **Testing** - Need real-world validation of updated prompts
|
|
267
|
+
|
|
268
|
+
---
|
|
269
|
+
|
|
270
|
+
## Recommendations
|
|
271
|
+
|
|
272
|
+
### Immediate (Week 1)
|
|
273
|
+
|
|
274
|
+
1. **Test updated prompts** in production workflows
|
|
275
|
+
2. **Gather feedback** on token efficiency vs completeness
|
|
276
|
+
3. **Monitor quality** - are prompts more or less effective?
|
|
277
|
+
4. **Run compliance script** weekly to catch regressions
|
|
278
|
+
|
|
279
|
+
### Short-term (Month 1)
|
|
280
|
+
|
|
281
|
+
1. **Create first Enterprise tier prompt** for mission-critical use case
|
|
282
|
+
2. **Enhance `/prompt` command** with actual transformation logic
|
|
283
|
+
3. **Build forbidden patterns library** across all categories
|
|
284
|
+
4. **Add thinking mode specs** to more Standard tier prompts
|
|
285
|
+
|
|
286
|
+
### Long-term (Quarter 1)
|
|
287
|
+
|
|
288
|
+
1. **Dynamic section loading** based on context
|
|
289
|
+
2. **Automated tier selection** based on task complexity
|
|
290
|
+
3. **Prompt versioning system** to track evolution
|
|
291
|
+
4. **A/B testing framework** for prompt effectiveness
|
|
292
|
+
|
|
293
|
+
---
|
|
294
|
+
|
|
295
|
+
## Compliance Status
|
|
296
|
+
|
|
297
|
+
| Prompt | Tier | XML | Layers | Tokens | Priority | Status |
|
|
298
|
+
|--------|------|-----|--------|--------|----------|--------|
|
|
299
|
+
| context-signal-detector.md | Minimal | ✅ | 4/4 | 400t | ✅ | ✅ PASS |
|
|
300
|
+
| context-template-selector.md | Minimal | ✅ | 4/4 | 450t | ✅ | ✅ PASS |
|
|
301
|
+
| commands/tasks.md | Standard | ✅ | 5/5 | 2000t | ✅ | ✅ PASS |
|
|
302
|
+
| commands/work.md | Standard | ✅ | 5/5 | 1800t | ✅ | ✅ PASS |
|
|
303
|
+
| commands/research.md | Standard | ✅ | 5/5 | 800t | ✅ | ✅ PASS |
|
|
304
|
+
| commands/plan.md | Professional | ✅ | 6/6 | 5000t | ✅ | ✅ PASS |
|
|
305
|
+
| commands/prompt.md | Standard | ✅ | 5/5 | 1500t | ✅ | ✅ PASS |
|
|
306
|
+
|
|
307
|
+
**Overall: 7/7 prompts (100%) compliant with 10/10 framework ✅**
|
|
308
|
+
|
|
309
|
+
---
|
|
310
|
+
|
|
311
|
+
## Next Steps
|
|
312
|
+
|
|
313
|
+
1. ✅ **Epic complete** - All 8 tasks finished
|
|
314
|
+
2. ⏭️ **Production validation** - Test in real workflows
|
|
315
|
+
3. ⏭️ **Gather metrics** - Track token usage, quality, effectiveness
|
|
316
|
+
4. ⏭️ **Iterate** - Refine based on production experience
|
|
317
|
+
5. ⏭️ **Expand** - Apply framework to additional agents/commands
|
|
318
|
+
|
|
319
|
+
---
|
|
320
|
+
|
|
321
|
+
**Epic Completion:** 8/8 tasks (100%)
|
|
322
|
+
**Framework Version:** 10/10 AI System Prompt Architecture v1.0
|
|
323
|
+
**Standards Applied:** Production-proven patterns from Claude 4 analysis
|
|
324
|
+
**Methodology:** Progressive disclosure with semantic XML boundaries
|