@hailer/mcp 0.0.1
- package/.claude/commands/tool-builder.md +37 -0
- package/.claude/commands/ws-pull.md +44 -0
- package/.claude/settings.json +8 -0
- package/.claude/settings.local.json +49 -0
- package/.claude/skills/activity-api/SKILL.md +96 -0
- package/.claude/skills/activity-api/references/activity-endpoints.md +845 -0
- package/.claude/skills/add-app-member-skill/SKILL.md +977 -0
- package/.claude/skills/agent-building/SKILL.md +243 -0
- package/.claude/skills/agent-building/references/architecture-patterns.md +446 -0
- package/.claude/skills/agent-building/references/code-examples.md +587 -0
- package/.claude/skills/agent-building/references/implementation-guide.md +619 -0
- package/.claude/skills/app-api/SKILL.md +219 -0
- package/.claude/skills/app-api/references/app-endpoints.md +759 -0
- package/.claude/skills/building-hailer-apps-skill/SKILL.md +548 -0
- package/.claude/skills/create-app-skill/SKILL.md +1101 -0
- package/.claude/skills/create-insight-skill/SKILL.md +1317 -0
- package/.claude/skills/get-insight-data-skill/SKILL.md +1053 -0
- package/.claude/skills/hailer-api/SKILL.md +283 -0
- package/.claude/skills/hailer-api/references/activities.md +620 -0
- package/.claude/skills/hailer-api/references/authentication.md +216 -0
- package/.claude/skills/hailer-api/references/datasets.md +437 -0
- package/.claude/skills/hailer-api/references/files.md +301 -0
- package/.claude/skills/hailer-api/references/insights.md +469 -0
- package/.claude/skills/hailer-api/references/workflows.md +720 -0
- package/.claude/skills/hailer-api/references/workspaces-users.md +445 -0
- package/.claude/skills/insight-api/SKILL.md +185 -0
- package/.claude/skills/insight-api/references/insight-endpoints.md +514 -0
- package/.claude/skills/install-workflow-skill/SKILL.md +1056 -0
- package/.claude/skills/list-apps-skill/SKILL.md +1010 -0
- package/.claude/skills/list-workflows-minimal-skill/SKILL.md +992 -0
- package/.claude/skills/local-first-skill/SKILL.md +570 -0
- package/.claude/skills/mcp-tools/SKILL.md +419 -0
- package/.claude/skills/mcp-tools/references/api-endpoints.md +499 -0
- package/.claude/skills/mcp-tools/references/data-structures.md +554 -0
- package/.claude/skills/mcp-tools/references/implementation-patterns.md +717 -0
- package/.claude/skills/preview-insight-skill/SKILL.md +1290 -0
- package/.claude/skills/publish-hailer-app-skill/SKILL.md +453 -0
- package/.claude/skills/remove-app-member-skill/SKILL.md +671 -0
- package/.claude/skills/remove-app-skill/SKILL.md +985 -0
- package/.claude/skills/remove-insight-skill/SKILL.md +1011 -0
- package/.claude/skills/remove-workflow-skill/SKILL.md +920 -0
- package/.claude/skills/scaffold-hailer-app-skill/SKILL.md +1034 -0
- package/.claude/skills/skill-testing/README.md +137 -0
- package/.claude/skills/skill-testing/SKILL.md +348 -0
- package/.claude/skills/skill-testing/references/test-patterns.md +705 -0
- package/.claude/skills/skill-testing/references/testing-guide.md +603 -0
- package/.claude/skills/skill-testing/references/validation-checklist.md +537 -0
- package/.claude/skills/tool-builder/SKILL.md +328 -0
- package/.claude/skills/update-app-skill/SKILL.md +970 -0
- package/.claude/skills/update-workflow-field-skill/SKILL.md +1098 -0
- package/.env.example +81 -0
- package/.mcp.json +13 -0
- package/README.md +297 -0
- package/dist/app.d.ts +4 -0
- package/dist/app.js +74 -0
- package/dist/cli.d.ts +3 -0
- package/dist/cli.js +5 -0
- package/dist/client/adaptive-documentation-bot.d.ts +108 -0
- package/dist/client/adaptive-documentation-bot.js +475 -0
- package/dist/client/adaptive-documentation-types.d.ts +66 -0
- package/dist/client/adaptive-documentation-types.js +9 -0
- package/dist/client/agent-activity-bot.d.ts +51 -0
- package/dist/client/agent-activity-bot.js +166 -0
- package/dist/client/agent-tracker.d.ts +499 -0
- package/dist/client/agent-tracker.js +659 -0
- package/dist/client/description-updater.d.ts +56 -0
- package/dist/client/description-updater.js +259 -0
- package/dist/client/log-parser.d.ts +72 -0
- package/dist/client/log-parser.js +387 -0
- package/dist/client/mcp-client.d.ts +50 -0
- package/dist/client/mcp-client.js +532 -0
- package/dist/client/message-processor.d.ts +35 -0
- package/dist/client/message-processor.js +352 -0
- package/dist/client/multi-bot-manager.d.ts +24 -0
- package/dist/client/multi-bot-manager.js +74 -0
- package/dist/client/providers/anthropic-provider.d.ts +19 -0
- package/dist/client/providers/anthropic-provider.js +631 -0
- package/dist/client/providers/llm-provider.d.ts +47 -0
- package/dist/client/providers/llm-provider.js +367 -0
- package/dist/client/providers/openai-provider.d.ts +23 -0
- package/dist/client/providers/openai-provider.js +621 -0
- package/dist/client/simple-llm-caller.d.ts +19 -0
- package/dist/client/simple-llm-caller.js +100 -0
- package/dist/client/skill-generator.d.ts +81 -0
- package/dist/client/skill-generator.js +386 -0
- package/dist/client/test-adaptive-bot.d.ts +9 -0
- package/dist/client/test-adaptive-bot.js +82 -0
- package/dist/client/token-pricing.d.ts +38 -0
- package/dist/client/token-pricing.js +127 -0
- package/dist/client/token-tracker.d.ts +232 -0
- package/dist/client/token-tracker.js +457 -0
- package/dist/client/token-usage-bot.d.ts +53 -0
- package/dist/client/token-usage-bot.js +153 -0
- package/dist/client/tool-executor.d.ts +69 -0
- package/dist/client/tool-executor.js +159 -0
- package/dist/client/tool-schema-loader.d.ts +60 -0
- package/dist/client/tool-schema-loader.js +178 -0
- package/dist/client/types.d.ts +69 -0
- package/dist/client/types.js +7 -0
- package/dist/config.d.ts +162 -0
- package/dist/config.js +296 -0
- package/dist/core.d.ts +26 -0
- package/dist/core.js +147 -0
- package/dist/lib/context-manager.d.ts +111 -0
- package/dist/lib/context-manager.js +431 -0
- package/dist/lib/logger.d.ts +74 -0
- package/dist/lib/logger.js +277 -0
- package/dist/lib/materialize.d.ts +3 -0
- package/dist/lib/materialize.js +101 -0
- package/dist/lib/normalizedName.d.ts +7 -0
- package/dist/lib/normalizedName.js +48 -0
- package/dist/lib/prompt-length-manager.d.ts +81 -0
- package/dist/lib/prompt-length-manager.js +457 -0
- package/dist/lib/terminal-prompt.d.ts +9 -0
- package/dist/lib/terminal-prompt.js +108 -0
- package/dist/mcp/UserContextCache.d.ts +56 -0
- package/dist/mcp/UserContextCache.js +163 -0
- package/dist/mcp/auth.d.ts +2 -0
- package/dist/mcp/auth.js +29 -0
- package/dist/mcp/hailer-clients.d.ts +42 -0
- package/dist/mcp/hailer-clients.js +246 -0
- package/dist/mcp/signal-handler.d.ts +45 -0
- package/dist/mcp/signal-handler.js +317 -0
- package/dist/mcp/tool-registry.d.ts +100 -0
- package/dist/mcp/tool-registry.js +306 -0
- package/dist/mcp/tools/activity.d.ts +15 -0
- package/dist/mcp/tools/activity.js +955 -0
- package/dist/mcp/tools/app.d.ts +20 -0
- package/dist/mcp/tools/app.js +1488 -0
- package/dist/mcp/tools/discussion.d.ts +19 -0
- package/dist/mcp/tools/discussion.js +950 -0
- package/dist/mcp/tools/file.d.ts +15 -0
- package/dist/mcp/tools/file.js +119 -0
- package/dist/mcp/tools/insight.d.ts +17 -0
- package/dist/mcp/tools/insight.js +806 -0
- package/dist/mcp/tools/skill.d.ts +10 -0
- package/dist/mcp/tools/skill.js +279 -0
- package/dist/mcp/tools/user.d.ts +10 -0
- package/dist/mcp/tools/user.js +108 -0
- package/dist/mcp/tools/workflow-template.d.ts +19 -0
- package/dist/mcp/tools/workflow-template.js +822 -0
- package/dist/mcp/tools/workflow.d.ts +18 -0
- package/dist/mcp/tools/workflow.js +1362 -0
- package/dist/mcp/utils/api-errors.d.ts +45 -0
- package/dist/mcp/utils/api-errors.js +160 -0
- package/dist/mcp/utils/data-transformers.d.ts +102 -0
- package/dist/mcp/utils/data-transformers.js +194 -0
- package/dist/mcp/utils/file-upload.d.ts +33 -0
- package/dist/mcp/utils/file-upload.js +148 -0
- package/dist/mcp/utils/hailer-api-client.d.ts +120 -0
- package/dist/mcp/utils/hailer-api-client.js +323 -0
- package/dist/mcp/utils/index.d.ts +13 -0
- package/dist/mcp/utils/index.js +39 -0
- package/dist/mcp/utils/logger.d.ts +42 -0
- package/dist/mcp/utils/logger.js +103 -0
- package/dist/mcp/utils/types.d.ts +286 -0
- package/dist/mcp/utils/types.js +7 -0
- package/dist/mcp/workspace-cache.d.ts +42 -0
- package/dist/mcp/workspace-cache.js +97 -0
- package/dist/mcp-server.d.ts +42 -0
- package/dist/mcp-server.js +280 -0
- package/package.json +56 -0
- package/tsconfig.json +23 -0
@@ -0,0 +1,537 @@

# Skill Validation Checklist

Comprehensive quality checklist for validating skills before deployment.

## Pre-Deployment Checklist

Use this checklist for every new skill or major skill update:

### 📁 Structure & Files

```
□ SKILL.md exists in root of skill directory
□ SKILL.md contains valid markdown
□ SKILL.md has clear title (# Heading)
□ SKILL.md has overview/introduction section
□ references/ directory exists (if needed)
□ All reference files are .md format
□ No broken file references
□ File permissions are correct (readable)
□ Total size < 100KB (or justified if larger)
```
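
The "No broken file references" item can be checked mechanically. The following is a minimal sketch, assuming references are written as inline markdown links relative to the skill directory; `extractRelativeLinks` and `findBrokenReferences` are hypothetical helper names, not part of the package.

```typescript
// Hypothetical helpers for illustration; they are not part of @hailer/mcp.
import * as fs from 'node:fs/promises';
import * as path from 'node:path';

/** Collect relative link targets such as [text](references/foo.md). */
function extractRelativeLinks(markdown: string): string[] {
  const linkPattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  const targets: string[] = [];
  for (const match of markdown.matchAll(linkPattern)) {
    const target = match[1].split('#')[0]; // drop any in-page anchor
    const isExternal = /^[a-z][a-z0-9+.-]*:/i.test(target); // http:, mailto:, ...
    if (target !== '' && !isExternal && !path.isAbsolute(target)) {
      targets.push(target);
    }
  }
  return targets;
}

/** Return the relative links in SKILL.md that do not resolve to a readable file. */
async function findBrokenReferences(skillDir: string): Promise<string[]> {
  const markdown = await fs.readFile(path.join(skillDir, 'SKILL.md'), 'utf-8');
  const broken: string[] = [];
  for (const target of extractRelativeLinks(markdown)) {
    try {
      await fs.access(path.join(skillDir, target));
    } catch {
      broken.push(target);
    }
  }
  return broken;
}
```

Reference-style links and bare paths in prose are not detected by this sketch, so an empty result does not replace a manual read-through.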

### 📝 Content Quality

```
□ Title is clear and descriptive
□ Overview explains skill purpose
□ Content is technically accurate
□ Examples are working and tested
□ Code blocks use proper syntax highlighting
□ Instructions are clear and actionable
□ No typos or grammar errors
□ Consistent formatting throughout
□ Appropriate level of detail
□ No outdated information
```
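
The syntax-highlighting item is one of the few content checks that can be automated. As a rough sketch, assuming code blocks use triple-backtick fences only, the hypothetical helper below reports the line numbers of opening fences that carry no language tag.

```typescript
// Hypothetical helper for illustration. It ignores ~~~ fences and fences
// longer than three backticks, so treat its output as a hint only.
function findUntaggedFences(markdown: string): number[] {
  const untagged: number[] = [];
  let insideFence = false;
  markdown.split('\n').forEach((line, index) => {
    const match = /^\s*```(.*)$/.exec(line);
    if (!match) return;
    // Only an opening fence is expected to carry a language tag.
    if (!insideFence && match[1].trim() === '') {
      untagged.push(index + 1); // 1-based line number
    }
    insideFence = !insideFence;
  });
  return untagged;
}
```

Plain-text blocks such as checklists may be left untagged on purpose, so each reported line still needs a human decision.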

### 🔍 Keywords & Matching

```
□ Keywords added to SkillManager.buildSkillMappings()
□ Primary keywords are specific and unique
□ Secondary keywords provide coverage
□ Keywords match actual user queries
□ Tested with 5+ real user queries
□ Confidence scores are appropriate
□ No keyword conflicts with other skills
□ Edge cases handled (typos, variations)
```
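
Keyword conflicts are easy to miss by eye once many skills are registered. The sketch below assumes the keyword mappings can be reduced to a plain map from skill name to keyword list; that shape and the name `findKeywordConflicts` are assumptions for illustration, not the actual `SkillManager` API.

```typescript
// Hypothetical helper; the Map<skill, keywords> input shape is an assumption.
function findKeywordConflicts(
  skillKeywords: Map<string, string[]>
): Map<string, string[]> {
  // First pass: record which skills claim each normalized keyword.
  const owners = new Map<string, string[]>();
  for (const [skill, keywords] of skillKeywords) {
    const normalized = new Set(keywords.map(k => k.trim().toLowerCase()));
    for (const keyword of normalized) {
      const claimedBy = owners.get(keyword) ?? [];
      claimedBy.push(skill);
      owners.set(keyword, claimedBy);
    }
  }
  // Second pass: keep only keywords claimed by more than one skill.
  const conflicts = new Map<string, string[]>();
  for (const [keyword, skills] of owners) {
    if (skills.length > 1) conflicts.set(keyword, skills);
  }
  return conflicts;
}
```

A shared keyword is not always a defect, since two skills may legitimately cover the same query, but every entry in the result deserves a deliberate decision.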

### 🔧 Tools & Integration

```
□ Recommended tools list is accurate
□ All recommended tools actually exist
□ Tools are appropriate for skill purpose
□ Tool usage examples are provided
□ Integration with MCP system verified
□ Works with SkillLoader
□ Works with SkillManager
□ Error handling is graceful
```

### ⚡ Performance

```
□ First load time < 100ms
□ Cached load time < 10ms
□ Skill content is optimized
□ No unnecessary large assets
□ Caching works correctly
□ No memory leaks
□ Performance under load tested
```

### 🧪 Testing

```
□ Unit tests pass
□ Integration tests pass
□ Keyword matching tests pass
□ Performance benchmarks pass
□ Error handling tested
□ Edge cases covered
□ All test patterns applied
```

### 🤖 LLM Integration

```
□ Works with OpenAI provider
□ Works with Anthropic provider
□ System prompt enhancement works
□ Skill content is helpful to LLM
□ LLM can understand and use content
□ Examples are LLM-friendly
□ Guidance text is clear
```

### 📚 Documentation

```
□ Skill purpose is documented
□ Usage examples provided
□ Prerequisites listed (if any)
□ Troubleshooting tips included
□ References are cited
□ Related skills mentioned
□ Changelog/version info (if updated)
```

### 🔒 Security & Safety

```
□ No sensitive information in skill
□ No hardcoded credentials
□ No malicious code
□ Safe examples (no destructive operations)
□ Appropriate warnings for risky operations
```
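
A quick pattern scan can support the "No hardcoded credentials" item. The sketch below is a heuristic only: the three patterns are illustrative choices rather than a complete secret scanner, and `findPossibleSecrets` is a hypothetical name.

```typescript
// Heuristic patterns chosen for illustration; not part of @hailer/mcp.
const SECRET_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: 'private key block', pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  { label: 'AWS access key id', pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  {
    label: 'quoted credential assignment',
    pattern: /\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*['"][^'"\s]{8,}['"]/i,
  },
];

interface SecretFinding {
  line: number; // 1-based line number in the scanned text
  label: string;
}

/** Report lines that look like they contain a hardcoded credential. */
function findPossibleSecrets(text: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  text.split('\n').forEach((lineText, index) => {
    for (const { label, pattern } of SECRET_PATTERNS) {
      if (pattern.test(lineText)) {
        findings.push({ line: index + 1, label });
      }
    }
  });
  return findings;
}
```

Placeholder values in examples can trigger false positives and unusual secret formats will slip through, so this complements the manual check rather than replacing it.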

### 🎯 User Experience

```
□ Skill solves a real user need
□ Content is easy to understand
□ Examples are relevant and practical
□ Navigation is logical
□ Skill complements existing skills
□ No duplicate content with other skills
```

## Quality Scoring Matrix

Score each dimension from 1 to 5:

### Content Quality (Weight: 30%)
```
5 - Exceptional: Clear, comprehensive, accurate, well-organized
4 - Good: Clear and accurate with minor improvements needed
3 - Adequate: Usable but needs refinement
2 - Poor: Confusing or inaccurate in places
1 - Unacceptable: Major issues, not usable
```

### Technical Accuracy (Weight: 25%)
```
5 - Exceptional: 100% accurate, tested examples
4 - Good: Accurate with minor issues
3 - Adequate: Mostly accurate, some gaps
2 - Poor: Several inaccuracies
1 - Unacceptable: Fundamentally incorrect
```

### Usability (Weight: 20%)
```
5 - Exceptional: Intuitive, excellent examples, clear instructions
4 - Good: Easy to use with minor confusion points
3 - Adequate: Usable but requires effort
2 - Poor: Confusing, difficult to apply
1 - Unacceptable: Unusable
```

### Performance (Weight: 15%)
```
5 - Exceptional: Blazing fast, < 50ms first load
4 - Good: Fast, < 100ms first load
3 - Adequate: Acceptable, < 150ms first load
2 - Poor: Slow, 150-300ms first load
1 - Unacceptable: Very slow, > 300ms first load
```

### Coverage (Weight: 10%)
```
5 - Exceptional: Comprehensive, covers all scenarios
4 - Good: Covers main scenarios well
3 - Adequate: Basic coverage
2 - Poor: Significant gaps
1 - Unacceptable: Minimal coverage
```

**Minimum acceptable score: 3.5/5 weighted average**
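
The weighted average is the sum of each dimension's score multiplied by its weight (30%, 25%, 20%, 15%, 10%). The sketch below shows the arithmetic; the weights and the 3.5 threshold come from the matrix above, while the identifier names and the rounding to two decimals are assumptions made for illustration.

```typescript
// Weights copied from the scoring matrix; identifier names are illustrative.
const WEIGHTS = {
  contentQuality: 0.30,
  technicalAccuracy: 0.25,
  usability: 0.20,
  performance: 0.15,
  coverage: 0.10,
} as const;

type Dimension = keyof typeof WEIGHTS;
type Scores = Record<Dimension, number>;

const MINIMUM_SCORE = 3.5;

/** Weighted average of the five dimension scores, rounded to two decimals. */
function weightedScore(scores: Scores): number {
  let total = 0;
  for (const dimension of Object.keys(WEIGHTS) as Dimension[]) {
    const score = scores[dimension];
    if (!(score >= 1 && score <= 5)) {
      throw new RangeError(`${dimension} must be between 1 and 5, got ${score}`);
    }
    total += score * WEIGHTS[dimension];
  }
  // Rounding avoids floating-point noise right at the threshold.
  return Math.round(total * 100) / 100;
}

function meetsMinimum(scores: Scores): boolean {
  return weightedScore(scores) >= MINIMUM_SCORE;
}

const exampleScores: Scores = {
  contentQuality: 4,
  technicalAccuracy: 4,
  usability: 3,
  performance: 3,
  coverage: 3,
};
console.log(weightedScore(exampleScores), meetsMinimum(exampleScores));
```

For the example scores the sum is 1.2 + 1.0 + 0.6 + 0.45 + 0.3 = 3.55, which clears the 3.5 minimum. A skill scoring 3 in every dimension averages 3.0 and does not.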

## Automated Validation Script

```typescript
/**
 * Automated skill validation
 * Run: npx tsx validate-skill.ts <skill-name>
 */

import { SkillLoader } from './src/client/skill-loader';
import { SkillManager } from './src/client/skill-manager';
import * as path from 'path';
import * as fs from 'fs/promises';

interface ValidationResult {
  category: string;
  checks: Array<{ name: string; passed: boolean; message?: string }>;
}

async function validateSkill(skillName: string): Promise<boolean> {
  console.log(`\n🔍 Validating Skill: ${skillName}\n`);

  const skillsPath = path.join(process.cwd(), '..', '.claude', 'skills');
  const skillPath = path.join(skillsPath, skillName);
  const loader = new SkillLoader(skillsPath);
  const manager = new SkillManager(loader);

  const results: ValidationResult[] = [];
  let totalPassed = 0;
  let totalChecks = 0;

  // Category 1: Structure & Files
  const structureChecks: ValidationResult = {
    category: 'Structure & Files',
    checks: [],
  };

  try {
    await fs.stat(path.join(skillPath, 'SKILL.md'));
    structureChecks.checks.push({ name: 'SKILL.md exists', passed: true });
  } catch {
    structureChecks.checks.push({ name: 'SKILL.md exists', passed: false });
  }

  try {
    const content = await fs.readFile(path.join(skillPath, 'SKILL.md'), 'utf-8');
    structureChecks.checks.push({
      name: 'Has title',
      // The `m` flag lets the heading match on any line, e.g. after front matter
      passed: /^#\s+/m.test(content),
    });
    structureChecks.checks.push({
      name: 'Has overview',
      passed: /overview|introduction|about/i.test(content),
    });
    structureChecks.checks.push({
      name: 'Size reasonable',
      passed: content.length < 100000,
      message: `${(content.length / 1024).toFixed(1)} KB`,
    });
  } catch {
    structureChecks.checks.push({ name: 'Content readable', passed: false });
  }

  results.push(structureChecks);

  // Category 2: Loading & Performance
  const perfChecks: ValidationResult = {
    category: 'Performance',
    checks: [],
  };

  try {
    loader.clearCache(skillName);
    const start1 = Date.now();
    await loader.load(skillName);
    const loadTime = Date.now() - start1;

    perfChecks.checks.push({
      name: 'First load < 100ms',
      passed: loadTime < 100,
      message: `${loadTime}ms`,
    });

    const start2 = Date.now();
    await loader.load(skillName);
    const cacheTime = Date.now() - start2;

    perfChecks.checks.push({
      name: 'Cached load < 10ms',
      passed: cacheTime < 10,
      message: `${cacheTime}ms`,
    });
  } catch (error) {
    perfChecks.checks.push({
      name: 'Loads without errors',
      passed: false,
      message: String(error),
    });
  }

  results.push(perfChecks);

  // Category 3: Content Quality
  const contentChecks: ValidationResult = {
    category: 'Content Quality',
    checks: [],
  };

  try {
    const skill = await loader.load(skillName);
    const content = skill.content;

    contentChecks.checks.push({
      name: 'Has examples',
      passed: content.includes('```'),
    });

    contentChecks.checks.push({
      name: 'Has structure (headings)',
      passed: content.includes('##'),
    });

    contentChecks.checks.push({
      name: 'Not too short',
      passed: content.length > 500,
      message: `${content.length} chars`,
    });

    contentChecks.checks.push({
      name: 'Has actionable content',
      passed: /how to|step|example|guide|use/i.test(content),
    });
  } catch (error) {
    contentChecks.checks.push({
      name: 'Content validation',
      passed: false,
      message: String(error),
    });
  }

  results.push(contentChecks);

  // Category 4: Keyword Matching
  const keywordChecks: ValidationResult = {
    category: 'Keyword Matching',
    checks: [],
  };

  const mappings = manager.getSkillMappings();
  const mapping = mappings.get(skillName);

  if (mapping) {
    keywordChecks.checks.push({
      name: 'Keywords defined',
      passed: mapping.keywords.length > 0,
      message: `${mapping.keywords.length} keywords`,
    });

    keywordChecks.checks.push({
      name: 'Tools defined',
      passed: mapping.tools !== undefined,
      message: `${mapping.tools?.length || 0} tools`,
    });

    // Test primary keyword
    if (mapping.keywords.length > 0) {
      const guidance = await manager.analyzeRequest(mapping.keywords[0]);
      keywordChecks.checks.push({
        name: 'Primary keyword matches',
        passed: guidance.skills.includes(skillName),
        message: `"${mapping.keywords[0]}"`,
      });
    }
  } else {
    keywordChecks.checks.push({
      name: 'Registered in SkillManager',
      passed: false,
    });
  }

  results.push(keywordChecks);

  // Print results
  for (const result of results) {
    console.log(`\n${result.category}`);
    console.log('─'.repeat(50));

    for (const check of result.checks) {
      totalChecks++;
      const icon = check.passed ? '✅' : '❌';
      const msg = check.message ? ` (${check.message})` : '';
      console.log(`${icon} ${check.name}${msg}`);

      if (check.passed) totalPassed++;
    }
  }

  // Summary
  const passRate = (totalPassed / totalChecks * 100).toFixed(1);
  console.log('\n' + '='.repeat(50));
  console.log(`Results: ${totalPassed}/${totalChecks} checks passed (${passRate}%)`);
  console.log('='.repeat(50) + '\n');

  const passed = totalPassed === totalChecks;
  if (passed) {
    console.log(`✅ ${skillName} passed all validation checks!\n`);
  } else {
    console.log(`❌ ${skillName} has ${totalChecks - totalPassed} failing checks\n`);
  }

  return passed;
}

// CLI usage
const skillName = process.argv[2];
if (!skillName) {
  console.error('Usage: npx tsx validate-skill.ts <skill-name>');
  process.exit(1);
}

validateSkill(skillName)
  .then(passed => process.exit(passed ? 0 : 1))
  .catch(error => {
    console.error('Validation error:', error);
    process.exit(1);
  });
```

## Manual Review Checklist

After automated validation passes, perform manual review:

### Content Review (10-15 minutes)
```
□ Read through entire skill content
□ Verify examples are clear and useful
□ Check for typos and grammar
□ Ensure logical flow and organization
□ Validate technical accuracy
□ Confirm examples can be copied and used
```

### User Testing (15-20 minutes)
```
□ Test with 5 different user queries
□ Verify skill triggers appropriately
□ Check that content helps LLM respond well
□ Test with both OpenAI and Anthropic (if available)
□ Gather feedback from team members
□ Validate recommended tools are correct
```

### Integration Testing (10 minutes)
```
□ Test in development environment
□ Verify no errors in logs
□ Check performance is acceptable
□ Ensure graceful error handling
□ Validate caching works
□ Test with concurrent requests
```

## Post-Deployment Monitoring

After deploying a skill to production:

### Week 1 Monitoring
```
□ Monitor error rates
□ Check load times
□ Review skill matching accuracy
□ Collect user feedback
□ Track usage metrics
□ Identify any issues
```

### Weeks 2-4 Iteration
```
□ Address reported issues
□ Refine keywords based on usage
□ Update content based on feedback
□ Optimize performance if needed
□ Improve examples
□ Update documentation
```

### Monthly Review
```
□ Review usage statistics
□ Assess content relevance
□ Check for outdated information
□ Compare with similar skills
□ Identify improvement opportunities
□ Plan updates if needed
```

## Quality Gates

Define quality gates that must pass before deployment:

### Gate 1: Automated Tests
- All unit tests pass
- All integration tests pass
- Performance benchmarks met
- No critical errors

### Gate 2: Manual Review
- Content reviewed by at least one other person
- Examples tested manually
- Technical accuracy verified
- No major issues found

### Gate 3: Staging Testing
- Tested in staging environment
- Works with all providers
- No performance regressions
- Logs are clean

### Gate 4: Approval
- Team lead approves
- No blocking issues
- Documentation complete
- Rollback plan ready

## Failure Response

If validation fails:

1. **Document the Issues**: Record all failing checks
2. **Prioritize Fixes**: Address critical issues first
3. **Fix and Retest**: Make corrections and re-run validation
4. **Review Root Cause**: Understand why issues occurred
5. **Update Process**: Improve process to prevent recurrence

## Success Criteria

A skill is ready for production when:

- ✅ All automated checks pass
- ✅ Manual review complete and approved
- ✅ User testing shows positive results
- ✅ Performance meets targets
- ✅ No blocking issues identified
- ✅ Team approval obtained
- ✅ Monitoring plan in place

## Continuous Improvement

After deployment:

1. **Monitor Metrics**: Track usage, performance, errors
2. **Collect Feedback**: From users and LLM responses
3. **Identify Issues**: Proactively find problems
4. **Iterate Quickly**: Make small, frequent improvements
5. **Share Learnings**: Update process and best practices