claude-flow-novice 1.5.21 → 1.5.22

Files changed (25)
  1. package/.claude/agents/CLAUDE.md +186 -2386
  2. package/.claude/agents/agent-principles/agent-type-guidelines.md +328 -0
  3. package/.claude/agents/agent-principles/format-selection.md +204 -0
  4. package/.claude/agents/agent-principles/prompt-engineering.md +371 -0
  5. package/.claude/agents/agent-principles/quality-metrics.md +294 -0
  6. package/.claude/agents/frontend/README.md +574 -53
  7. package/.claude/agents/frontend/interaction-tester.md +850 -108
  8. package/.claude/agents/frontend/react-frontend-engineer.md +130 -0
  9. package/.claude/agents/frontend/state-architect.md +240 -152
  10. package/.claude/agents/frontend/ui-designer.md +292 -68
  11. package/.claude/agents/researcher.md +1 -1
  12. package/.claude/agents/swarm/test-coordinator.md +383 -0
  13. package/.claude/agents/task-coordinator.md +126 -0
  14. package/.claude/settings.json +7 -7
  15. package/.claude-flow-novice/dist/src/hooks/enhanced-hooks-cli.js +168 -167
  16. package/.claude-flow-novice/dist/src/providers/tiered-router.js +118 -0
  17. package/.claude-flow-novice/dist/src/providers/tiered-router.js.map +1 -0
  18. package/.claude-flow-novice/dist/src/providers/types.js.map +1 -1
  19. package/.claude-flow-novice/dist/src/providers/zai-provider.js +268 -0
  20. package/.claude-flow-novice/dist/src/providers/zai-provider.js.map +1 -0
  21. package/package.json +1 -1
  22. package/src/cli/simple-commands/init/templates/CLAUDE.md +25 -0
  23. package/src/hooks/enhanced-hooks-cli.js +23 -3
  24. package/src/hooks/enhanced-post-edit-pipeline.js +154 -75
  25. /package/.claude/agents/{CLAUDE_AGENT_DESIGN_PRINCIPLES.md → agent-principles/CLAUDE_AGENT_DESIGN_PRINCIPLES.md} +0 -0
@@ -0,0 +1,371 @@
1
+ # Prompt Engineering Best Practices
2
+
3
+ **Version:** 2.0.0
4
+ **Last Updated:** 2025-09-30
5
+
6
+ ## Core Principles
7
+
8
+ Effective agent prompts require careful attention to structure, clarity, and a level of detail matched to task complexity.
9
+
10
+ ---
11
+
12
+ ## 1. Clear Role Definition
13
+
14
+ ```yaml
15
+ GOOD:
16
+ "You are a senior Rust developer specializing in concurrent programming"
17
+
18
+ BAD:
19
+ "You write code"
20
+
21
+ WHY:
22
+ - Clear expertise domain
23
+ - Sets expectations for quality
24
+ - Activates relevant knowledge
25
+ ```
26
+
27
+ ---
28
+
29
+ ## 2. Specific Responsibilities
30
+
31
+ ```yaml
32
+ GOOD:
33
+ - Implement lock-free data structures using atomics
34
+ - Ensure memory safety with proper synchronization
35
+ - Write linearizability tests using loom
36
+
37
+ BAD:
38
+ - Write concurrent code
39
+ - Make it safe
40
+
41
+ WHY:
42
+ - Concrete and actionable
43
+ - Measurable outcomes
44
+ - Clear scope
45
+ ```
46
+
47
+ ---
48
+
49
+ ## 3. Appropriate Tool Selection
50
+
51
+ ```yaml
52
+ Essential Tools:
53
+ - Read: Required for all agents (must read before editing)
54
+ - Write: For creating new files
55
+ - Edit: For modifying existing files
56
+ - Bash: For running commands
57
+ - Grep: For searching code
58
+ - Glob: For finding files
59
+ - TodoWrite: For task tracking
60
+
61
+ Optional Tools:
62
+ - WebSearch: For research agents
63
+ - Task: For coordinator agents (spawning sub-agents)
64
+
65
+ AVOID:
66
+ - Giving unnecessary tools
67
+ - Restricting essential tools
68
+ ```
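As a rough illustration of the tool guidance above, the following sketch flags a tool list that lacks `Read` or that grants an optional tool to the wrong kind of agent. `reviewTools` and the role labels `research` and `coordinator` are hypothetical names chosen for this example; they are not part of any claude-flow API.

```javascript
// Tool names are taken from the list above; everything else is illustrative.
const ESSENTIAL_TOOLS = new Set(["Read", "Write", "Edit", "Bash", "Grep", "Glob", "TodoWrite"]);
// Optional tool -> the kind of agent it is meant for (role labels are assumed).
const OPTIONAL_TOOLS = new Map([["WebSearch", "research"], ["Task", "coordinator"]]);

// Returns a list of human-readable notes; an empty list means no concerns.
function reviewTools(tools, role) {
  const notes = [];
  if (!tools.includes("Read")) {
    notes.push("Read is required for all agents");
  }
  for (const tool of tools) {
    if (OPTIONAL_TOOLS.has(tool)) {
      if (OPTIONAL_TOOLS.get(tool) !== role) {
        notes.push(`${tool} is intended for ${OPTIONAL_TOOLS.get(tool)} agents`);
      }
    } else if (!ESSENTIAL_TOOLS.has(tool)) {
      notes.push(`${tool} is not in the documented tool list`);
    }
  }
  return notes;
}

console.log(reviewTools(["Read", "Edit", "Bash"], "specialist"));
console.log(reviewTools(["Write", "Task"], "specialist"));
```

The second call yields two notes: one for the missing `Read` tool and one for `Task` being granted to a non-coordinator agent.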
69
+
70
+ ---
71
+
72
+ ## 4. Integration Points
73
+
74
+ ```yaml
75
+ GOOD:
76
+ Collaboration:
77
+ - Architect: Provides design constraints
78
+ - Reviewer: Validates implementation
79
+ - Tester: Ensures correctness
80
+
81
+ BAD:
82
+ "Works with other agents"
83
+
84
+ WHY:
85
+ - Specific integration contracts
86
+ - Clear handoff points
87
+ - Defined outputs/inputs
88
+ ```
89
+
90
+ ---
91
+
92
+ ## 5. Validation and Hooks
93
+
94
+ ### Mandatory Post-Edit Validation
95
+
96
+ **CRITICAL**: After **EVERY** file edit operation, run:
97
+
98
+ ```bash
99
+ npx claude-flow@alpha hooks post-edit [FILE_PATH] --memory-key "agent/step" --structured
100
+ ```
101
+
102
+ **Benefits:**
103
+ - TDD compliance checking
104
+ - Security analysis (XSS, eval, credentials)
105
+ - Formatting validation
106
+ - Coverage analysis
107
+ - Actionable recommendations
108
+
109
+ **Rationale:**
110
+ - Ensures quality gates
111
+ - Provides immediate feedback
112
+ - Coordinates with other agents via memory
113
+ - Maintains system-wide standards
114
+
115
+ ---
116
+
117
+ ## 6. Anti-Patterns to Avoid
118
+
119
+ ### ❌ Over-Specification (Tunnel Vision)
120
+
121
+ ```markdown
122
+ BAD (for complex tasks):
123
+
124
+ ## Strict Algorithm
125
+
126
+ 1. ALWAYS use bubble sort for sorting
127
+ 2. NEVER use built-in sort functions
128
+ 3. MUST iterate exactly 10 times
129
+ 4. Check each element precisely in this order: [detailed steps]
130
+
131
+ WHY BAD:
132
+ - Prevents optimal solutions
133
+ - Ignores context-specific needs
134
+ - Reduces AI reasoning ability
135
+ - May enforce suboptimal patterns
136
+ ```
137
+
138
+ ### ❌ Under-Specification (Too Vague)
139
+
140
+ ```markdown
141
+ BAD (for basic tasks):
142
+
143
+ ## Implementation
144
+
145
+ Write some code that works.
146
+
147
+ WHY BAD:
148
+ - No guidance on patterns
149
+ - Unclear success criteria
150
+ - High iteration count
151
+ - Inconsistent quality
152
+ ```
153
+
154
+ ### ❌ Example Overload
155
+
156
+ ```markdown
157
+ BAD (for complex tasks):
158
+
159
+ [50 code examples of every possible pattern]
160
+
161
+ WHY BAD:
162
+ - Cognitive overload
163
+ - Priming bias
164
+ - Reduces creative problem-solving
165
+ - Makes prompt harder to maintain
166
+ ```
167
+
168
+ ### ❌ Rigid Checklists
169
+
170
+ ```markdown
171
+ BAD (for architecture):
172
+
173
+ You MUST:
174
+ [ ] Use exactly these 5 patterns
175
+ [ ] Never deviate from this structure
176
+ [ ] Follow these steps in exact order
177
+ [ ] Use only these technologies
178
+
179
+ WHY BAD:
180
+ - Context-insensitive
181
+ - Prevents trade-off analysis
182
+ - Enforces solutions before understanding problems
183
+ ```
184
+
185
+ ---
186
+
187
+ ## Agent Profile Structure
188
+
189
+ ### Required Frontmatter (YAML)
190
+
191
+ ```yaml
192
+ ---
193
+ name: agent-name # REQUIRED: Lowercase with hyphens
194
+ description: | # REQUIRED: Clear, keyword-rich description
195
+   MUST BE USED when [primary use case].
196
+   Use PROACTIVELY for [specific scenarios].
197
+   ALWAYS delegate when user asks [trigger phrases].
198
+   Keywords - [comma-separated keywords for search]
199
+ tools: [Read, Write, Edit, Bash, TodoWrite] # REQUIRED: Comma-separated list
200
+ model: sonnet # REQUIRED: sonnet | opus | haiku
201
+ color: seagreen # REQUIRED: Visual identifier
202
+ type: specialist # OPTIONAL: specialist | coordinator | swarm
203
+ capabilities: # OPTIONAL: Array of capability tags
204
+   - rust
205
+   - error-handling
206
+   - concurrent-programming
207
+ lifecycle: # OPTIONAL: Hooks for agent lifecycle
208
+   pre_task: "npx claude-flow@alpha hooks pre-task"
209
+   post_task: "npx claude-flow@alpha hooks post-task"
210
+ hooks: # OPTIONAL: Integration points
211
+   memory_key: "agent-name/context"
212
+   validation: "post-edit"
213
+ triggers: # OPTIONAL: Automatic activation patterns
214
+   - "build rust"
215
+   - "implement concurrent"
216
+ constraints: # OPTIONAL: Limitations and boundaries
217
+   - "Do not modify production database"
218
+   - "Require approval for breaking changes"
219
+ ---
220
+ ```
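A minimal sketch of how the required fields above could be checked before an agent profile is used. It assumes the frontmatter has already been parsed into a plain object (YAML parsing is out of scope here). `validateFrontmatter` is a hypothetical helper, and the exact pattern used for "lowercase with hyphens", including the choice to allow digits, is an assumption.

```javascript
// Required fields and allowed values as listed in the frontmatter example above.
const REQUIRED_FIELDS = ["name", "description", "tools", "model", "color"];
const MODELS = ["sonnet", "opus", "haiku"];
const TYPES = ["specialist", "coordinator", "swarm"];
// Assumed reading of "lowercase with hyphens": letters/digits joined by single hyphens.
const NAME_PATTERN = /^[a-z0-9]+(-[a-z0-9]+)*$/;

// Returns a list of error messages; an empty list means the profile looks valid.
function validateFrontmatter(fm) {
  const errors = [];
  for (const field of REQUIRED_FIELDS) {
    const value = fm[field];
    const isEmpty =
      value === undefined || value === null || value === "" ||
      (Array.isArray(value) && value.length === 0);
    if (isEmpty) errors.push(`missing required field: ${field}`);
  }
  if (typeof fm.name === "string" && fm.name !== "" && !NAME_PATTERN.test(fm.name)) {
    errors.push("name must be lowercase with hyphens");
  }
  if (fm.model !== undefined && !MODELS.includes(fm.model)) {
    errors.push(`model must be one of: ${MODELS.join(", ")}`);
  }
  if (fm.type !== undefined && !TYPES.includes(fm.type)) {
    errors.push(`type must be one of: ${TYPES.join(", ")}`);
  }
  return errors;
}

console.log(validateFrontmatter({ name: "Rust Coder", model: "gpt" }));
```

Returning a list of messages rather than throwing on the first problem lets a caller report every issue in a profile at once.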
221
+
222
+ ### Body Structure
223
+
224
+ ````markdown
225
+ # Agent Name
226
+
227
+ [Opening paragraph: WHO you are, WHAT you do]
228
+
229
+ ## 🚨 MANDATORY POST-EDIT VALIDATION
230
+
231
+ **CRITICAL**: After **EVERY** file edit operation, you **MUST** run:
232
+
233
+ ```bash
234
+ npx claude-flow@alpha hooks post-edit [FILE_PATH] --memory-key "agent/step" --structured
235
+ ```
236
+
237
+ [Why this matters and what it provides]
238
+
239
+ ## Core Responsibilities
240
+
241
+ [Primary duties in clear, actionable bullet points]
242
+
243
+ ## Approach & Methodology
244
+
245
+ [HOW the agent accomplishes tasks - frameworks, patterns, decision-making]
246
+
247
+ ## Integration & Collaboration
248
+
249
+ [How this agent works with other agents and the broader system]
250
+
251
+ ## Examples & Best Practices
252
+
253
+ [Concrete examples showing the agent in action]
254
+
255
+ ## Success Metrics
256
+
257
+ [How to measure agent effectiveness]
258
+ ````
259
+
260
+ ---
261
+
262
+ ## Integration with Claude Flow
263
+
264
+ ### Hook System Integration
265
+
266
+ Every agent should integrate with the Claude Flow hook system for coordination:
267
+
268
+ #### 1. Pre-Task Hook
269
+
270
+ ```bash
271
+ npx claude-flow@alpha hooks pre-task --description "Implementing authentication system"
272
+ ```
273
+
274
+ **Purpose:**
275
+ - Initialize task context
276
+ - Set up memory namespace
277
+ - Log task start
278
+ - Coordinate with other agents
279
+
280
+ #### 2. Post-Edit Hook (MANDATORY)
281
+
282
+ ```bash
283
+ npx claude-flow@alpha hooks post-edit src/auth/login.rs \
284
+   --memory-key "coder/auth/login" \
285
+   --structured
286
+ ```
287
+
288
+ **Purpose:**
289
+ - Validate TDD compliance
290
+ - Run security analysis
291
+ - Check code formatting
292
+ - Analyze test coverage
293
+ - Store results in shared memory
294
+ - Provide actionable recommendations
295
+
296
+ **Output Includes:**
297
+ - ✅/❌ Compliance status
298
+ - 🔒 Security findings
299
+ - 🎨 Formatting issues
300
+ - 📊 Coverage metrics
301
+ - 🤖 Improvement suggestions
302
+
303
+ #### 3. Post-Task Hook
304
+
305
+ ```bash
306
+ npx claude-flow@alpha hooks post-task --task-id "auth-implementation"
307
+ ```
308
+
309
+ **Purpose:**
310
+ - Finalize task
311
+ - Export metrics
312
+ - Update coordination state
313
+ - Trigger downstream agents
314
+
315
+ #### 4. Session Management
316
+
317
+ ```bash
318
+ # Restore session context
319
+ npx claude-flow@alpha hooks session-restore --session-id "swarm-auth-2025-09-30"
320
+
321
+ # End session and export metrics
322
+ npx claude-flow@alpha hooks session-end --export-metrics true
323
+ ```
324
+
325
+ ---
326
+
327
+ ## Memory Coordination
328
+
329
+ Agents share context through the memory system:
330
+
331
+ ```bash
332
+ # Store context for other agents
333
+ npx claude-flow@alpha memory store \
334
+   --key "architect/design/decision" \
335
+   --value '{"pattern": "microservices", "rationale": "..."}'
336
+
337
+ # Retrieve context from other agents
338
+ npx claude-flow@alpha memory retrieve \
339
+   --key "architect/design/decision"
340
+ ```
341
+
342
+ **Memory Key Patterns:**
343
+ ```
344
+ {agent-type}/{domain}/{aspect}
345
+
346
+ Examples:
347
+ - architect/auth/design
348
+ - coder/auth/implementation
349
+ - reviewer/auth/feedback
350
+ - tester/auth/coverage
351
+ ```
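One way to keep keys consistent with the `{agent-type}/{domain}/{aspect}` pattern is to build and parse them through small helpers. The sketch below is illustrative only: `buildMemoryKey` and `parseMemoryKey` are hypothetical names, and the allowed character set for a segment is an assumption, since this document does not define one.

```javascript
// Assumed segment format: lowercase letters, digits, and hyphens.
const SEGMENT = /^[a-z0-9][a-z0-9-]*$/;

// Joins three validated segments into a key such as "architect/auth/design".
function buildMemoryKey(agentType, domain, aspect) {
  const parts = [agentType, domain, aspect];
  for (const part of parts) {
    if (typeof part !== "string" || !SEGMENT.test(part)) {
      throw new Error(`invalid memory key segment: ${JSON.stringify(part)}`);
    }
  }
  return parts.join("/");
}

// Splits a key back into its parts; returns null if it does not match the pattern.
function parseMemoryKey(key) {
  const parts = key.split("/");
  if (parts.length !== 3 || !parts.every((p) => SEGMENT.test(p))) {
    return null;
  }
  const [agentType, domain, aspect] = parts;
  return { agentType, domain, aspect };
}

console.log(buildMemoryKey("reviewer", "auth", "feedback"));
console.log(parseMemoryKey("tester/auth/coverage"));
```

Note that this sketch is strict about the three-part form: shorter keys such as `agent/step`, which appear in the hook examples in this document, would be rejected by `parseMemoryKey`.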
352
+
353
+ ---
354
+
355
+ ## Swarm Coordination
356
+
357
+ When spawning multiple agents concurrently:
358
+
359
+ ```javascript
360
+ // Coordinator spawns specialist agents
361
+ Task("Rust Coder", "Implement auth with proper error handling", "coder")
362
+ Task("Unit Tester", "Write comprehensive tests for auth", "tester")
363
+ Task("Code Reviewer", "Review auth implementation", "reviewer")
364
+
365
+ // Each agent MUST:
366
+ // 1. Run pre-task hook
367
+ // 2. Execute work
368
+ // 3. Run post-edit hook for each file
369
+ // 4. Store results in memory
370
+ // 5. Run post-task hook
371
+ ```
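The five steps listed above can be sketched as an ordered plan of CLI calls for one agent. This is an assumption-laden illustration, not the coordinator's actual implementation: `planAgentCommands` is a hypothetical helper, and it uses only the subcommands and flags that appear elsewhere in this document. Step 2 (the work itself) happens between the pre-task call and the post-edit calls, so it does not appear in the plan.

```javascript
// Builds the argument lists an agent would pass to `npx`, in the required order.
function planAgentCommands({ description, taskId, memoryKey, editedFiles, result }) {
  const cli = "claude-flow@alpha";
  return [
    // 1. Pre-task hook
    [cli, "hooks", "pre-task", "--description", description],
    // 3. Post-edit hook, once per edited file
    ...editedFiles.map((file) => [
      cli, "hooks", "post-edit", file, "--memory-key", memoryKey, "--structured",
    ]),
    // 4. Store results in shared memory as a JSON string
    [cli, "memory", "store", "--key", memoryKey, "--value", JSON.stringify(result)],
    // 5. Post-task hook
    [cli, "hooks", "post-task", "--task-id", taskId],
  ];
}

const plan = planAgentCommands({
  description: "Implementing authentication system",
  taskId: "auth-implementation",
  memoryKey: "coder/auth/implementation",
  editedFiles: ["src/auth/login.rs"],
  result: { status: "done" },
});
for (const args of plan) {
  console.log(["npx", ...args].join(" "));
}
```

Keeping each command as an argument array rather than a single string avoids shell-quoting problems if the plan is later handed to a process spawner.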
@@ -0,0 +1,294 @@
1
+ # Quality Metrics & Validation
2
+
3
+ **Version:** 2.0.0
4
+ **Last Updated:** 2025-09-30
5
+
6
+ ## Measuring Agent Effectiveness
7
+
8
+ ### 1. Quantitative Metrics
9
+
10
+ ```yaml
11
+ Code Quality:
12
+   compilation_success_rate: "First-time compile success"
13
+   test_pass_rate: "Tests passing on first run"
14
+   coverage: "Code coverage percentage"
15
+   performance: "Execution time vs baseline"
16
+   idiomaticity_score: "Language-specific best practices"
17
+
18
+ Process Metrics:
19
+   iteration_count: "Revisions needed to complete task"
20
+   time_to_completion: "Duration from start to finish"
21
+   error_rate: "Errors encountered during execution"
22
+
23
+ Agent-Specific:
24
+   architect_score: "Design quality assessment"
25
+   reviewer_score: "Issues found / total issues"
26
+   tester_score: "Bug catch rate"
27
+ ```
28
+
29
+ ### 2. Qualitative Metrics
30
+
31
+ ```yaml
32
+ Code Review Criteria:
33
+ - Readability: Easy to understand
34
+ - Maintainability: Easy to modify
35
+ - Correctness: Works as intended
36
+ - Safety: No security vulnerabilities
37
+ - Performance: Meets efficiency requirements
38
+
39
+ Architecture Criteria:
40
+ - Scalability: Can grow with demand
41
+ - Flexibility: Adapts to changing requirements
42
+ - Simplicity: No unnecessary complexity
43
+ - Documentation: Well-explained decisions
44
+ ```
45
+
46
+ ---
47
+
48
+ ## Validation Checklist
49
+
50
+ Use this checklist before deploying an agent:
51
+
52
+ ### Pre-Deployment Validation
53
+
54
+ ```markdown
55
+ ## Agent Profile Validation
56
+
57
+ ### Structure ✓
58
+ - [ ] Valid YAML frontmatter
59
+ - [ ] All required fields present (name, description, tools, model, color)
60
+ - [ ] Clear role definition in opening paragraph
61
+ - [ ] Appropriate section structure
62
+
63
+ ### Format Selection ✓
64
+ - [ ] Format matches task complexity (Basic→Code-Heavy, Medium→Metadata, Complex→Minimal)
65
+ - [ ] Length appropriate (Minimal: 200-400, Metadata: 400-700, Code-Heavy: 700-1200)
66
+ - [ ] Examples present and relevant (for Code-Heavy)
67
+ - [ ] Structure/metadata present (for Metadata)
68
+
69
+ ### Content Quality ✓
70
+ - [ ] Clear responsibilities defined
71
+ - [ ] Approach/methodology explained
72
+ - [ ] Integration points specified
73
+ - [ ] Success metrics defined
74
+ - [ ] Post-edit validation hook included
75
+
76
+ ### Language-Specific ✓
77
+ - [ ] If Rust: Format validated against benchmark findings
78
+ - [ ] If other language: Format choice documented as hypothesis
79
+ - [ ] Language-specific patterns included (for Code-Heavy)
80
+ - [ ] Idiomatic code examples (for Code-Heavy)
81
+
82
+ ### Testing ✓
83
+ - [ ] Agent tested on representative tasks
84
+ - [ ] Quality metrics meet targets
85
+ - [ ] Integration with hooks verified
86
+ - [ ] Collaboration with other agents confirmed
87
+ ```
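The two Format Selection checks in the checklist above (format matches complexity, length within range) could be automated along these lines. This is a sketch under assumptions: `checkFormat` is a hypothetical helper, the lowercase identifiers are invented for the example, and the checklist does not state the unit for the length ranges, so the function simply compares whatever number it is given.

```javascript
// Complexity -> expected format, as given in the checklist.
const FORMAT_FOR_COMPLEXITY = new Map([
  ["basic", "code-heavy"],
  ["medium", "metadata"],
  ["complex", "minimal"],
]);
// Format -> [min, max] length, as given in the checklist (unit not specified there).
const LENGTH_RANGE = new Map([
  ["minimal", [200, 400]],
  ["metadata", [400, 700]],
  ["code-heavy", [700, 1200]],
]);

// Returns a list of problems; an empty list means both checks pass.
function checkFormat(complexity, format, length) {
  const problems = [];
  const expected = FORMAT_FOR_COMPLEXITY.get(complexity);
  if (expected === undefined) {
    problems.push(`unknown complexity: ${complexity}`);
  } else if (expected !== format) {
    problems.push(`expected ${expected} format for ${complexity} tasks`);
  }
  const range = LENGTH_RANGE.get(format);
  if (range === undefined) {
    problems.push(`unknown format: ${format}`);
  } else if (length < range[0] || length > range[1]) {
    problems.push(`length ${length} outside ${range[0]}-${range[1]}`);
  }
  return problems;
}

console.log(checkFormat("basic", "code-heavy", 900));
console.log(checkFormat("complex", "code-heavy", 300));
```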
88
+
89
+ ### Post-Deployment Monitoring
90
+
91
+ ```markdown
92
+ ## Ongoing Validation
93
+
94
+ ### Performance Tracking
95
+ - [ ] Monitor iteration counts
96
+ - [ ] Track first-time success rate
97
+ - [ ] Measure time to completion
98
+ - [ ] Collect user feedback
99
+
100
+ ### Quality Assurance
101
+ - [ ] Review output quality regularly
102
+ - [ ] Check adherence to format guidelines
103
+ - [ ] Validate tool usage patterns
104
+ - [ ] Assess collaboration effectiveness
105
+
106
+ ### Continuous Improvement
107
+ - [ ] Document failure modes
108
+ - [ ] Refine based on metrics
109
+ - [ ] Update with new patterns
110
+ - [ ] Validate format choice periodically
111
+ ```
112
+
113
+ ---
114
+
115
+ ## Benchmark System
116
+
117
+ ### Running Agent Benchmarks
118
+
119
+ ```bash
120
+ cd benchmark/agent-benchmarking
121
+
122
+ # Run Rust benchmarks (VALIDATED)
123
+ node index.js run 5 --rust --verbose
124
+
125
+ # Run JavaScript benchmarks (HYPOTHESIS)
126
+ node index.js run 5 --verbose
127
+
128
+ # Run specific scenario
129
+ node index.js run 3 --rust --scenario=rust-01-basic
130
+
131
+ # List available scenarios
132
+ node index.js list --scenarios --rust
133
+
134
+ # Analyze results
135
+ node index.js analyze
136
+ ```
137
+
138
+ ### Interpreting Results
139
+
140
+ ```yaml
141
+ Quality Score Breakdown:
142
+   Correctness (30%):
143
+     - Basic functionality works
144
+     - Edge cases handled
145
+     - Error conditions managed
146
+
147
+   Idiomaticity (25%):
148
+     - Language best practices
149
+     - Proper pattern usage
150
+     - Efficient algorithms
151
+
152
+   Code Quality (20%):
153
+     - Readability
154
+     - Documentation
155
+     - Naming conventions
156
+
157
+   Testing (15%):
158
+     - Test coverage
159
+     - Assertion quality
160
+     - Edge case tests
161
+
162
+   Performance (10%):
163
+     - Execution efficiency
164
+     - Memory usage
165
+     - Optimization
166
+ ```
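To make the weighting concrete, here is a small sketch that combines per-category scores into a single quality score using the percentages above. It assumes each category is scored on a 0-100 scale; that scale, the function name `qualityScore`, and the camelCase category keys are assumptions for illustration, not details of the benchmark tool.

```javascript
// Weights in percent, taken from the breakdown above (they sum to 100).
const WEIGHTS = {
  correctness: 30,
  idiomaticity: 25,
  codeQuality: 20,
  testing: 15,
  performance: 10,
};

// Weighted average of category scores, each assumed to lie in [0, 100].
function qualityScore(scores) {
  let weightedSum = 0;
  for (const [category, weight] of Object.entries(WEIGHTS)) {
    const score = scores[category];
    if (typeof score !== "number" || score < 0 || score > 100) {
      throw new RangeError(`score for ${category} must be a number from 0 to 100`);
    }
    weightedSum += weight * score;
  }
  return weightedSum / 100;
}

console.log(
  qualityScore({ correctness: 90, idiomaticity: 80, codeQuality: 70, testing: 60, performance: 50 })
);
```

For the sample input the result is 75: (30·90 + 25·80 + 20·70 + 15·60 + 10·50) / 100 = 7500 / 100.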
167
+
168
+ ### Statistical Significance
169
+
170
+ ```yaml
171
+ ANOVA Analysis:
172
+   f_statistic: "Ratio of between-group variance to within-group variance"
173
+   p_value: "Probability of results at least this extreme if the groups do not truly differ"
174
+   significant_if: "p < 0.05"
175
+
176
+ Effect Size (Cohen's d):
177
+   negligible: "d < 0.2"
178
+   small: "0.2 ≤ d < 0.5"
179
+   medium: "0.5 ≤ d < 0.8"
180
+   large: "d ≥ 0.8"
181
+ ```
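For readers who want to see how these statistics are computed, the sketch below implements the textbook one-way ANOVA F statistic and Cohen's d with a pooled standard deviation, then labels the effect size using the thresholds above. It is not the benchmark system's own code, all function names are hypothetical, and the thresholds are applied to the absolute value of d (an assumption). Turning F into a p-value requires the F distribution and is left out.

```javascript
const mean = (xs) => xs.reduce((a, b) => a + b, 0) / xs.length;

// Sum of squared deviations of a group from its own mean.
const sumSq = (xs) => {
  const m = mean(xs);
  return xs.reduce((acc, x) => acc + (x - m) ** 2, 0);
};

// One-way ANOVA: between-group mean square divided by within-group mean square.
function fStatistic(groups) {
  const all = groups.flat();
  const grandMean = mean(all);
  const ssBetween = groups.reduce((acc, g) => acc + g.length * (mean(g) - grandMean) ** 2, 0);
  const ssWithin = groups.reduce((acc, g) => acc + sumSq(g), 0);
  const dfBetween = groups.length - 1;
  const dfWithin = all.length - groups.length;
  return (ssBetween / dfBetween) / (ssWithin / dfWithin);
}

// Cohen's d for two groups: mean difference over the pooled standard deviation.
function cohensD(a, b) {
  const pooledVariance = (sumSq(a) + sumSq(b)) / (a.length + b.length - 2);
  return (mean(a) - mean(b)) / Math.sqrt(pooledVariance);
}

// Maps |d| onto the labels from the table above.
function effectSizeLabel(d) {
  const size = Math.abs(d);
  if (size < 0.2) return "negligible";
  if (size < 0.5) return "small";
  if (size < 0.8) return "medium";
  return "large";
}

const formatA = [2, 4, 6];
const formatB = [1, 3, 5];
console.log(fStatistic([formatA, formatB]));
console.log(cohensD(formatA, formatB), effectSizeLabel(cohensD(formatA, formatB)));
```

With the two sample groups, the means are 4 and 3 and the pooled standard deviation is 2, so d = 0.5 ("medium"), while F = 1.5 / 4 = 0.375.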
182
+
183
+ ---
184
+
185
+ ## Continuous Improvement
186
+
187
+ ### Metrics to Track
188
+
189
+ ```yaml
190
+ Agent Performance Metrics:
191
+   first_time_success_rate:
192
+     target: ">80%"
193
+     measure: "Compiles/runs on first attempt"
194
+
195
+   iteration_count:
196
+     target: "<3"
197
+     measure: "Revisions needed to complete"
198
+
199
+   quality_score:
200
+     target: ">85%"
201
+     measure: "Benchmark quality assessment"
202
+
203
+   user_satisfaction:
204
+     target: ">4.5/5"
205
+     measure: "Feedback from users"
206
+ ```
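A minimal sketch of checking collected metrics against the targets above. The metric names and thresholds come from the table; the helper `missedTargets`, the choice to express percentages as plain numbers (82 for 82%), and the choice to treat a missing metric as a missed target are assumptions.

```javascript
// One predicate per metric, encoding the targets listed above.
const TARGETS = {
  first_time_success_rate: (v) => v > 80, // percent
  iteration_count: (v) => v < 3,
  quality_score: (v) => v > 85, // percent
  user_satisfaction: (v) => v > 4.5, // out of 5
};

// Returns the names of metrics that are missing or do not meet their target.
function missedTargets(metrics) {
  return Object.keys(TARGETS).filter(
    (name) => !(name in metrics) || !TARGETS[name](metrics[name])
  );
}

console.log(
  missedTargets({
    first_time_success_rate: 82,
    iteration_count: 3,
    quality_score: 90,
    user_satisfaction: 4.6,
  })
);
```

In the sample, only `iteration_count` is reported, because the target is strictly fewer than 3 revisions.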
207
+
208
+ ### Feedback Loop
209
+
210
+ 1. **Collect Data**: Track metrics for each agent usage
211
+ 2. **Analyze**: Identify patterns in failures or low quality
212
+ 3. **Hypothesize**: Determine likely causes
213
+ 4. **Experiment**: Adjust agent format or content
214
+ 5. **Validate**: Test changes with benchmark system
215
+ 6. **Deploy**: Update agent if improvements confirmed
216
+ 7. **Monitor**: Continue tracking metrics
217
+
218
+ ---
219
+
220
+ ## Success Criteria by Agent Type
221
+
222
+ ### Coder Agents
223
+
224
+ - [ ] Code compiles without warnings
225
+ - [ ] All functions have documentation
226
+ - [ ] Error handling uses proper patterns (no .unwrap() in Rust)
227
+ - [ ] Tests cover >85% of code
228
+ - [ ] Idiomatic language usage
229
+ - [ ] Proper resource management
230
+
231
+ ### Reviewer Agents
232
+
233
+ - [ ] Issues identified before production
234
+ - [ ] Suggestions are actionable and specific
235
+ - [ ] Feedback explains "why" not just "what"
236
+ - [ ] Team learns from feedback
237
+ - [ ] Security vulnerabilities caught
238
+ - [ ] Performance issues identified
239
+
240
+ ### Architect Agents
241
+
242
+ - [ ] Architecture meets quality attributes
243
+ - [ ] Team can implement the design
244
+ - [ ] Documentation is clear and comprehensive
245
+ - [ ] Trade-offs are explicitly documented
246
+ - [ ] ADRs (Architecture Decision Records) created
247
+ - [ ] Stakeholder requirements satisfied
248
+
249
+ ### Tester Agents
250
+
251
+ - [ ] Test coverage meets targets (85% unit, 70% integration)
252
+ - [ ] Tests are comprehensive (happy path, error cases, edge cases)
253
+ - [ ] Test code is maintainable
254
+ - [ ] Assertions are meaningful
255
+ - [ ] Performance tests where applicable
256
+ - [ ] Integration tests validate contracts
257
+
258
+ ### DevOps Agents
259
+
260
+ - [ ] Pipelines execute successfully
261
+ - [ ] Deployment process is automated
262
+ - [ ] Rollback strategy is in place
263
+ - [ ] Monitoring and alerting configured
264
+ - [ ] Security scans integrated
265
+ - [ ] Documentation updated
266
+
267
+ ---
268
+
269
+ ## Quality Gates
270
+
271
+ ### Blocking Issues (Must Fix)
272
+
273
+ - Compilation errors
274
+ - Test failures
275
+ - Security vulnerabilities (high/critical)
276
+ - Missing required documentation
277
+ - Code coverage below threshold
278
+ - Lint/format errors
279
+
280
+ ### Non-Blocking Issues (Should Fix)
281
+
282
+ - Performance warnings
283
+ - Code style inconsistencies
284
+ - Missing optional documentation
285
+ - Low test coverage (but above minimum)
286
+ - Minor security issues
287
+
288
+ ### Advisory (Nice to Have)
289
+
290
+ - Optimization opportunities
291
+ - Refactoring suggestions
292
+ - Additional test cases
293
+ - Enhanced documentation
294
+ - Improved naming
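The three gate levels above can be sketched as a lookup from issue type to level, with a pass/fail decision based on whether any blocking issue is present. The issue type identifiers and the `evaluateGates` helper are hypothetical labels for the items listed above; a real pipeline would define its own. Unknown issue types throw, so that nothing slips through unclassified.

```javascript
// Issue type -> gate level (identifiers are illustrative, see lead-in).
const GATE_LEVEL = new Map([
  ["compilation-error", "blocking"],
  ["test-failure", "blocking"],
  ["security-high", "blocking"],
  ["security-critical", "blocking"],
  ["missing-required-docs", "blocking"],
  ["coverage-below-threshold", "blocking"],
  ["lint-error", "blocking"],
  ["performance-warning", "non-blocking"],
  ["style-inconsistency", "non-blocking"],
  ["missing-optional-docs", "non-blocking"],
  ["low-coverage", "non-blocking"],
  ["security-minor", "non-blocking"],
  ["optimization", "advisory"],
  ["refactoring", "advisory"],
  ["additional-tests", "advisory"],
  ["enhanced-docs", "advisory"],
  ["naming", "advisory"],
]);

// Groups issues by gate level; the gate passes only if nothing is blocking.
function evaluateGates(issueTypes) {
  const report = { blocking: [], "non-blocking": [], advisory: [] };
  for (const type of issueTypes) {
    const level = GATE_LEVEL.get(type);
    if (level === undefined) {
      throw new Error(`unknown issue type: ${type}`);
    }
    report[level].push(type);
  }
  return { passed: report.blocking.length === 0, ...report };
}

console.log(evaluateGates(["naming", "performance-warning"]));
console.log(evaluateGates(["test-failure", "naming"]));
```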