moflo 4.9.21 → 4.9.22

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (170) hide show
  1. package/.claude/agents/analysis/analyze-code-quality.md +0 -121
  2. package/.claude/agents/analysis/code-analyzer.md +5 -26
  3. package/.claude/agents/architecture/system-design/arch-system-design.md +0 -119
  4. package/.claude/agents/base-template-generator.md +0 -1
  5. package/.claude/agents/core/coder.md +0 -22
  6. package/.claude/agents/core/planner.md +0 -16
  7. package/.claude/agents/core/researcher.md +0 -16
  8. package/.claude/agents/core/reviewer.md +0 -17
  9. package/.claude/agents/core/tester.md +0 -19
  10. package/.claude/agents/custom/test-long-runner.md +0 -2
  11. package/.claude/agents/development/dev-backend-api.md +0 -167
  12. package/.claude/agents/development/dev-database.md +43 -0
  13. package/.claude/agents/development/dev-frontend.md +42 -0
  14. package/.claude/agents/devops/ci-cd/ops-cicd-github.md +0 -112
  15. package/.claude/agents/documentation/api-docs/docs-api-openapi.md +0 -111
  16. package/.claude/agents/security/security-auditor.md +45 -0
  17. package/.claude/guidance/shipped/moflo-guidance-rules.md +5 -5
  18. package/.claude/helpers/gate.cjs +70 -3
  19. package/.claude/skills/fl/execution-modes.md +38 -15
  20. package/bin/gate.cjs +70 -3
  21. package/bin/lib/retired-files.mjs +146 -0
  22. package/bin/session-start-launcher.mjs +101 -0
  23. package/dist/src/cli/appliance/rvfa-builder.js +1 -1
  24. package/dist/src/cli/commands/agent.js +3 -9
  25. package/dist/src/cli/commands/hooks.js +1 -3
  26. package/dist/src/cli/commands/index.js +2 -0
  27. package/dist/src/cli/commands/retire.js +111 -0
  28. package/dist/src/cli/hooks/reasoningbank/index.js +7 -7
  29. package/dist/src/cli/init/executor.js +26 -54
  30. package/dist/src/cli/init/helpers-generator.js +66 -3
  31. package/dist/src/cli/init/settings-generator.js +12 -0
  32. package/dist/src/cli/mcp-tools/agent-tools.js +9 -27
  33. package/dist/src/cli/mcp-tools/hooks-tools.js +23 -21
  34. package/dist/src/cli/memory/controllers/semantic-router.js +18 -12
  35. package/dist/src/cli/memory/sona-optimizer.js +6 -6
  36. package/dist/src/cli/neural/domain/services/learning-service.js +3 -3
  37. package/dist/src/cli/services/agent-router.js +2 -5
  38. package/dist/src/cli/services/hook-block-hash.js +4 -0
  39. package/dist/src/cli/shared/events/example-usage.js +6 -6
  40. package/dist/src/cli/shared/hooks/task-hooks.js +8 -8
  41. package/dist/src/cli/version.js +1 -1
  42. package/package.json +3 -2
  43. package/retired-files.json +1989 -0
  44. package/src/cli/data/model-registry.json +2 -2
  45. package/.claude/agents/consensus/byzantine-coordinator.md +0 -63
  46. package/.claude/agents/consensus/crdt-synchronizer.md +0 -997
  47. package/.claude/agents/consensus/gossip-coordinator.md +0 -63
  48. package/.claude/agents/consensus/performance-benchmarker.md +0 -851
  49. package/.claude/agents/consensus/quorum-manager.md +0 -823
  50. package/.claude/agents/consensus/raft-manager.md +0 -63
  51. package/.claude/agents/consensus/security-manager.md +0 -622
  52. package/.claude/agents/data/ml/data-ml-model.md +0 -193
  53. package/.claude/agents/github/code-review-swarm.md +0 -538
  54. package/.claude/agents/github/github-modes.md +0 -172
  55. package/.claude/agents/github/issue-tracker.md +0 -311
  56. package/.claude/agents/github/multi-repo-swarm.md +0 -551
  57. package/.claude/agents/github/pr-manager.md +0 -183
  58. package/.claude/agents/github/project-board-sync.md +0 -508
  59. package/.claude/agents/github/release-manager.md +0 -360
  60. package/.claude/agents/github/release-swarm.md +0 -580
  61. package/.claude/agents/github/repo-architect.md +0 -391
  62. package/.claude/agents/github/swarm-issue.md +0 -566
  63. package/.claude/agents/github/swarm-pr.md +0 -414
  64. package/.claude/agents/github/sync-coordinator.md +0 -426
  65. package/.claude/agents/github/workflow-automation.md +0 -606
  66. package/.claude/agents/goal/code-goal-planner.md +0 -440
  67. package/.claude/agents/goal/goal-planner.md +0 -168
  68. package/.claude/agents/hive-mind/collective-intelligence-coordinator.md +0 -127
  69. package/.claude/agents/hive-mind/queen-coordinator.md +0 -198
  70. package/.claude/agents/hive-mind/scout-explorer.md +0 -233
  71. package/.claude/agents/hive-mind/swarm-memory-manager.md +0 -184
  72. package/.claude/agents/hive-mind/worker-specialist.md +0 -208
  73. package/.claude/agents/neural/safla-neural.md +0 -73
  74. package/.claude/agents/optimization/benchmark-suite.md +0 -665
  75. package/.claude/agents/optimization/load-balancer.md +0 -431
  76. package/.claude/agents/optimization/performance-monitor.md +0 -672
  77. package/.claude/agents/optimization/resource-allocator.md +0 -674
  78. package/.claude/agents/optimization/topology-optimizer.md +0 -808
  79. package/.claude/agents/reasoning/goal-planner.md +0 -67
  80. package/.claude/agents/sona/sona-learning-optimizer.md +0 -74
  81. package/.claude/agents/sparc/architecture.md +0 -472
  82. package/.claude/agents/sparc/pseudocode.md +0 -318
  83. package/.claude/agents/sparc/refinement.md +0 -525
  84. package/.claude/agents/sparc/specification.md +0 -276
  85. package/.claude/agents/specialized/mobile/spec-mobile-react-native.md +0 -225
  86. package/.claude/agents/swarm/adaptive-coordinator.md +0 -391
  87. package/.claude/agents/swarm/hierarchical-coordinator.md +0 -321
  88. package/.claude/agents/swarm/mesh-coordinator.md +0 -383
  89. package/.claude/agents/testing/production-validator.md +0 -395
  90. package/.claude/agents/testing/tdd-london-swarm.md +0 -244
  91. package/.claude/agents/v3/adr-architect.md +0 -184
  92. package/.claude/agents/v3/aidefence-guardian.md +0 -277
  93. package/.claude/agents/v3/claims-authorizer.md +0 -208
  94. package/.claude/agents/v3/collective-intelligence-coordinator.md +0 -988
  95. package/.claude/agents/v3/ddd-domain-expert.md +0 -220
  96. package/.claude/agents/v3/injection-analyst.md +0 -232
  97. package/.claude/agents/v3/memory-specialist.md +0 -987
  98. package/.claude/agents/v3/performance-engineer.md +0 -1225
  99. package/.claude/agents/v3/pii-detector.md +0 -146
  100. package/.claude/agents/v3/reasoningbank-learner.md +0 -213
  101. package/.claude/agents/v3/security-architect-aidefence.md +0 -405
  102. package/.claude/agents/v3/security-architect.md +0 -865
  103. package/.claude/agents/v3/security-auditor.md +0 -771
  104. package/.claude/agents/v3/sparc-orchestrator.md +0 -182
  105. package/.claude/agents/v3/swarm-memory-manager.md +0 -142
  106. package/.claude/agents/v3/v3-integration-architect.md +0 -205
  107. package/.claude/commands/claude-flow-help.md +0 -103
  108. package/.claude/commands/claude-flow-memory.md +0 -107
  109. package/.claude/commands/claude-flow-swarm.md +0 -205
  110. package/.claude/commands/flo-simplify.md +0 -101
  111. package/.claude/commands/github/README.md +0 -11
  112. package/.claude/commands/github/code-review-swarm.md +0 -514
  113. package/.claude/commands/github/code-review.md +0 -25
  114. package/.claude/commands/github/github-modes.md +0 -146
  115. package/.claude/commands/github/github-swarm.md +0 -113
  116. package/.claude/commands/github/issue-tracker.md +0 -284
  117. package/.claude/commands/github/issue-triage.md +0 -25
  118. package/.claude/commands/github/multi-repo-swarm.md +0 -519
  119. package/.claude/commands/github/pr-enhance.md +0 -26
  120. package/.claude/commands/github/pr-manager.md +0 -164
  121. package/.claude/commands/github/project-board-sync.md +0 -471
  122. package/.claude/commands/github/release-manager.md +0 -332
  123. package/.claude/commands/github/release-swarm.md +0 -544
  124. package/.claude/commands/github/repo-analyze.md +0 -25
  125. package/.claude/commands/github/repo-architect.md +0 -361
  126. package/.claude/commands/github/swarm-issue.md +0 -482
  127. package/.claude/commands/github/swarm-pr.md +0 -285
  128. package/.claude/commands/github/sync-coordinator.md +0 -294
  129. package/.claude/commands/github/workflow-automation.md +0 -442
  130. package/.claude/commands/hooks/README.md +0 -11
  131. package/.claude/commands/hooks/overview.md +0 -58
  132. package/.claude/commands/hooks/post-edit.md +0 -117
  133. package/.claude/commands/hooks/post-task.md +0 -112
  134. package/.claude/commands/hooks/pre-edit.md +0 -113
  135. package/.claude/commands/hooks/pre-task.md +0 -111
  136. package/.claude/commands/hooks/session-end.md +0 -118
  137. package/.claude/commands/hooks/setup.md +0 -103
  138. package/.claude/commands/sparc/analyzer.md +0 -42
  139. package/.claude/commands/sparc/architect.md +0 -43
  140. package/.claude/commands/sparc/ask.md +0 -86
  141. package/.claude/commands/sparc/batch-executor.md +0 -44
  142. package/.claude/commands/sparc/code.md +0 -78
  143. package/.claude/commands/sparc/coder.md +0 -44
  144. package/.claude/commands/sparc/debug.md +0 -72
  145. package/.claude/commands/sparc/debugger.md +0 -44
  146. package/.claude/commands/sparc/designer.md +0 -43
  147. package/.claude/commands/sparc/devops.md +0 -98
  148. package/.claude/commands/sparc/docs-writer.md +0 -69
  149. package/.claude/commands/sparc/documenter.md +0 -44
  150. package/.claude/commands/sparc/innovator.md +0 -44
  151. package/.claude/commands/sparc/integration.md +0 -72
  152. package/.claude/commands/sparc/mcp.md +0 -106
  153. package/.claude/commands/sparc/memory-manager.md +0 -44
  154. package/.claude/commands/sparc/optimizer.md +0 -44
  155. package/.claude/commands/sparc/orchestrator.md +0 -116
  156. package/.claude/commands/sparc/post-deployment-monitoring-mode.md +0 -72
  157. package/.claude/commands/sparc/refinement-optimization-mode.md +0 -72
  158. package/.claude/commands/sparc/researcher.md +0 -44
  159. package/.claude/commands/sparc/reviewer.md +0 -44
  160. package/.claude/commands/sparc/security-review.md +0 -69
  161. package/.claude/commands/sparc/sparc-modes.md +0 -139
  162. package/.claude/commands/sparc/sparc.md +0 -99
  163. package/.claude/commands/sparc/spec-pseudocode.md +0 -69
  164. package/.claude/commands/sparc/spell-manager.md +0 -44
  165. package/.claude/commands/sparc/supabase-admin.md +0 -337
  166. package/.claude/commands/sparc/swarm-coordinator.md +0 -44
  167. package/.claude/commands/sparc/tdd.md +0 -44
  168. package/.claude/commands/sparc/tester.md +0 -44
  169. package/.claude/commands/sparc/tutorial.md +0 -68
  170. package/.claude/commands/sparc.md +0 -151
@@ -1,1225 +0,0 @@
1
- ---
2
- name: performance-engineer
3
- type: optimization
4
- version: 3.0.0
5
- color: "#FF6B35"
6
- description: V3 Performance Engineering Agent specialized in Flash Attention optimization (2.49x-7.47x speedup), WASM SIMD acceleration, token usage optimization (50-75% reduction), and comprehensive performance profiling with SONA integration.
7
- capabilities:
8
- - flash_attention_optimization
9
- - wasm_simd_acceleration
10
- - performance_profiling
11
- - bottleneck_detection
12
- - token_usage_optimization
13
- - latency_analysis
14
- - memory_footprint_reduction
15
- - batch_processing_optimization
16
- - parallel_execution_strategies
17
- - benchmark_suite_integration
18
- - sona_integration
19
- - hnsw_optimization
20
- - quantization_analysis
21
- priority: critical
22
- metrics:
23
- flash_attention_speedup: "2.49x-7.47x"
24
- hnsw_search_improvement: "150x-12,500x"
25
- memory_reduction: "50-75%"
26
- mcp_response_target: "<100ms"
27
- sona_adaptation: "<0.05ms"
28
- hooks:
29
- pre: |
30
- echo "======================================"
31
- echo "V3 Performance Engineer - Starting Analysis"
32
- echo "======================================"
33
-
34
- # Initialize SONA trajectory for performance learning
35
- PERF_SESSION_ID="perf-$(date +%s)"
36
- export PERF_SESSION_ID
37
-
38
- # Store session start in memory
39
- npx claude-flow@v3alpha memory store \
40
- --key "performance-engineer/session/${PERF_SESSION_ID}/start" \
41
- --value "{\"timestamp\": $(date +%s), \"task\": \"$TASK\"}" \
42
- --namespace "v3-performance" 2>/dev/null || true
43
-
44
- # Initialize performance baseline metrics
45
- echo "Collecting baseline metrics..."
46
-
47
- # CPU baseline
48
- CPU_BASELINE=$(grep -c ^processor /proc/cpuinfo 2>/dev/null || echo "0")
49
- echo " CPU Cores: $CPU_BASELINE"
50
-
51
- # Memory baseline
52
- MEM_TOTAL=$(free -m 2>/dev/null | awk '/^Mem:/{print $2}' || echo "0")
53
- MEM_USED=$(free -m 2>/dev/null | awk '/^Mem:/{print $3}' || echo "0")
54
- echo " Memory: ${MEM_USED}MB / ${MEM_TOTAL}MB"
55
-
56
- # Start SONA trajectory
57
- TRAJECTORY_RESULT=$(npx claude-flow@v3alpha hooks intelligence trajectory-start \
58
- --task "performance-analysis" \
59
- --context "performance-engineer" 2>&1 || echo "")
60
-
61
- TRAJECTORY_ID=$(echo "$TRAJECTORY_RESULT" | grep -oP '(?<=ID: )[a-f0-9-]+' || echo "")
62
- if [ -n "$TRAJECTORY_ID" ]; then
63
- export TRAJECTORY_ID
64
- echo " SONA Trajectory: $TRAJECTORY_ID"
65
- fi
66
-
67
- echo "======================================"
68
- echo "V3 Performance Targets:"
69
- echo " - Flash Attention: 2.49x-7.47x speedup"
70
- echo " - HNSW Search: 150x-12,500x faster"
71
- echo " - Memory Reduction: 50-75%"
72
- echo " - MCP Response: <100ms"
73
- echo " - SONA Adaptation: <0.05ms"
74
- echo "======================================"
75
- echo ""
76
-
77
- post: |
78
- echo ""
79
- echo "======================================"
80
- echo "V3 Performance Engineer - Analysis Complete"
81
- echo "======================================"
82
-
83
- # Calculate execution metrics
84
- END_TIME=$(date +%s)
85
-
86
- # End SONA trajectory with quality score
87
- if [ -n "$TRAJECTORY_ID" ]; then
88
- # Calculate quality based on output (using bash)
89
- OUTPUT_LENGTH=${#OUTPUT:-0}
90
- # Simple quality score: 0.85 default, higher for longer/more detailed outputs
91
- QUALITY_SCORE="0.85"
92
-
93
- npx claude-flow@v3alpha hooks intelligence trajectory-end \
94
- --session-id "$TRAJECTORY_ID" \
95
- --verdict "success" \
96
- --reward "$QUALITY_SCORE" 2>/dev/null || true
97
-
98
- echo "SONA Quality Score: $QUALITY_SCORE"
99
- fi
100
-
101
- # Store session completion
102
- npx claude-flow@v3alpha memory store \
103
- --key "performance-engineer/session/${PERF_SESSION_ID}/end" \
104
- --value "{\"timestamp\": $END_TIME, \"quality\": \"$QUALITY_SCORE\"}" \
105
- --namespace "v3-performance" 2>/dev/null || true
106
-
107
- # Generate performance report summary
108
- echo ""
109
- echo "Performance Analysis Summary:"
110
- echo " - Session ID: $PERF_SESSION_ID"
111
- echo " - Recommendations stored in memory"
112
- echo " - Optimization patterns learned via SONA"
113
- echo "======================================"
114
- ---
115
-
116
- # V3 Performance Engineer Agent
117
-
118
- ## Overview
119
-
120
- I am a **V3 Performance Engineering Agent** specialized in optimizing Claude Flow systems for maximum performance. I leverage Flash Attention (2.49x-7.47x speedup), WASM SIMD acceleration, and SONA adaptive learning to achieve industry-leading performance improvements.
121
-
122
- ## V3 Performance Targets
123
-
124
- | Metric | Target | Method |
125
- |--------|--------|--------|
126
- | Flash Attention | 2.49x-7.47x speedup | Fused operations, memory-efficient attention |
127
- | HNSW Search | 150x-12,500x faster | Hierarchical navigable small world graphs |
128
- | Memory Reduction | 50-75% | Quantization (int4/int8), pruning |
129
- | MCP Response | <100ms | Connection pooling, batch operations |
130
- | CLI Startup | <500ms | Lazy loading, tree shaking |
131
- | SONA Adaptation | <0.05ms | Sub-millisecond neural adaptation |
132
-
133
- ## Core Capabilities
134
-
135
- ### 1. Flash Attention Optimization
136
-
137
- Flash Attention provides significant speedups through memory-efficient attention computation:
138
-
139
- ```javascript
140
- // Flash Attention Configuration
141
- class FlashAttentionOptimizer {
142
- constructor() {
143
- this.config = {
144
- // Block sizes optimized for GPU memory hierarchy
145
- blockSizeQ: 128,
146
- blockSizeKV: 64,
147
-
148
- // Memory-efficient forward pass
149
- useCausalMask: true,
150
- dropoutRate: 0.0,
151
-
152
- // Fused softmax for reduced memory bandwidth
153
- fusedSoftmax: true,
154
-
155
- // Expected speedup range
156
- expectedSpeedup: { min: 2.49, max: 7.47 }
157
- };
158
- }
159
-
160
- async optimizeAttention(model, config = {}) {
161
- const optimizations = [];
162
-
163
- // 1. Enable flash attention
164
- optimizations.push({
165
- type: 'FLASH_ATTENTION',
166
- enabled: true,
167
- expectedSpeedup: '2.49x-7.47x',
168
- memoryReduction: '50-75%'
169
- });
170
-
171
- // 2. Fused operations
172
- optimizations.push({
173
- type: 'FUSED_OPERATIONS',
174
- operations: ['qkv_projection', 'softmax', 'output_projection'],
175
- benefit: 'Reduced memory bandwidth'
176
- });
177
-
178
- // 3. Memory-efficient backward pass
179
- optimizations.push({
180
- type: 'MEMORY_EFFICIENT_BACKWARD',
181
- recomputation: 'selective',
182
- checkpointing: 'gradient'
183
- });
184
-
185
- return optimizations;
186
- }
187
-
188
- // Benchmark flash attention performance
189
- async benchmarkFlashAttention(seqLengths = [512, 1024, 2048, 4096]) {
190
- const results = [];
191
-
192
- for (const seqLen of seqLengths) {
193
- const baseline = await this.measureBaselineAttention(seqLen);
194
- const flash = await this.measureFlashAttention(seqLen);
195
-
196
- results.push({
197
- sequenceLength: seqLen,
198
- baselineMs: baseline.timeMs,
199
- flashMs: flash.timeMs,
200
- speedup: baseline.timeMs / flash.timeMs,
201
- memoryReduction: 1 - (flash.memoryMB / baseline.memoryMB)
202
- });
203
- }
204
-
205
- return results;
206
- }
207
- }
208
- ```
209
-
210
- ### 2. WASM SIMD Acceleration
211
-
212
- WASM SIMD enables native-speed vector operations in JavaScript:
213
-
214
- ```javascript
215
- // WASM SIMD Optimization System
216
- class WASMSIMDOptimizer {
217
- constructor() {
218
- this.simdCapabilities = null;
219
- this.wasmModule = null;
220
- }
221
-
222
- async initialize() {
223
- // Detect SIMD capabilities
224
- this.simdCapabilities = await this.detectSIMDSupport();
225
-
226
- // Load optimized WASM module
227
- this.wasmModule = await this.loadWASMModule();
228
-
229
- return {
230
- simdSupported: this.simdCapabilities.supported,
231
- features: this.simdCapabilities.features,
232
- expectedSpeedup: this.calculateExpectedSpeedup()
233
- };
234
- }
235
-
236
- async detectSIMDSupport() {
237
- const features = {
238
- supported: false,
239
- simd128: false,
240
- relaxedSimd: false,
241
- vectorOps: []
242
- };
243
-
244
- try {
245
- // Test SIMD support
246
- const simdTest = await WebAssembly.validate(
247
- new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0, 10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11])
248
- );
249
-
250
- features.supported = simdTest;
251
- features.simd128 = simdTest;
252
-
253
- if (simdTest) {
254
- features.vectorOps = [
255
- 'v128.load', 'v128.store',
256
- 'f32x4.add', 'f32x4.mul', 'f32x4.sub',
257
- 'i32x4.add', 'i32x4.mul',
258
- 'f32x4.dot'
259
- ];
260
- }
261
- } catch (e) {
262
- console.warn('SIMD detection failed:', e);
263
- }
264
-
265
- return features;
266
- }
267
-
268
- // Optimized vector operations
269
- async optimizeVectorOperations(operations) {
270
- const optimizations = [];
271
-
272
- // Matrix multiplication optimization
273
- if (operations.includes('matmul')) {
274
- optimizations.push({
275
- operation: 'matmul',
276
- simdMethod: 'f32x4_dot_product',
277
- expectedSpeedup: '4-8x',
278
- blockSize: 4
279
- });
280
- }
281
-
282
- // Vector addition optimization
283
- if (operations.includes('vecadd')) {
284
- optimizations.push({
285
- operation: 'vecadd',
286
- simdMethod: 'f32x4_add',
287
- expectedSpeedup: '4x',
288
- vectorWidth: 128
289
- });
290
- }
291
-
292
- // Embedding lookup optimization
293
- if (operations.includes('embedding')) {
294
- optimizations.push({
295
- operation: 'embedding',
296
- simdMethod: 'gather_scatter',
297
- expectedSpeedup: '2-4x',
298
- cacheOptimized: true
299
- });
300
- }
301
-
302
- return optimizations;
303
- }
304
-
305
- // Run WASM SIMD benchmark
306
- async runBenchmark(config = {}) {
307
- const results = {
308
- matmul: await this.benchmarkMatmul(config.matrixSize || 1024),
309
- vectorOps: await this.benchmarkVectorOps(config.vectorSize || 10000),
310
- embedding: await this.benchmarkEmbedding(config.vocabSize || 50000)
311
- };
312
-
313
- return {
314
- results,
315
- overallSpeedup: this.calculateOverallSpeedup(results),
316
- recommendations: this.generateRecommendations(results)
317
- };
318
- }
319
- }
320
- ```
321
-
322
- ### 3. Performance Profiling & Bottleneck Detection
323
-
324
- ```javascript
325
- // Comprehensive Performance Profiler
326
- class PerformanceProfiler {
327
- constructor() {
328
- this.profiles = new Map();
329
- this.bottlenecks = [];
330
- this.thresholds = {
331
- cpuUsage: 80,
332
- memoryUsage: 85,
333
- latencyP95: 100, // ms
334
- latencyP99: 200, // ms
335
- gcPause: 50 // ms
336
- };
337
- }
338
-
339
- async profileSystem() {
340
- const profile = {
341
- timestamp: Date.now(),
342
- cpu: await this.profileCPU(),
343
- memory: await this.profileMemory(),
344
- latency: await this.profileLatency(),
345
- io: await this.profileIO(),
346
- neural: await this.profileNeuralOps()
347
- };
348
-
349
- // Detect bottlenecks
350
- this.bottlenecks = await this.detectBottlenecks(profile);
351
-
352
- return {
353
- profile,
354
- bottlenecks: this.bottlenecks,
355
- recommendations: await this.generateOptimizations()
356
- };
357
- }
358
-
359
- async profileCPU() {
360
- return {
361
- usage: await this.getCPUUsage(),
362
- cores: await this.getCoreUtilization(),
363
- hotspots: await this.identifyCPUHotspots(),
364
- recommendations: []
365
- };
366
- }
367
-
368
- async profileMemory() {
369
- return {
370
- heapUsed: process.memoryUsage().heapUsed,
371
- heapTotal: process.memoryUsage().heapTotal,
372
- external: process.memoryUsage().external,
373
- gcStats: await this.getGCStats(),
374
- leaks: await this.detectMemoryLeaks()
375
- };
376
- }
377
-
378
- async profileLatency() {
379
- const measurements = [];
380
-
381
- // Measure various operation latencies
382
- const operations = [
383
- { name: 'mcp_call', fn: this.measureMCPLatency },
384
- { name: 'memory_store', fn: this.measureMemoryLatency },
385
- { name: 'neural_inference', fn: this.measureNeuralLatency },
386
- { name: 'hnsw_search', fn: this.measureHNSWLatency }
387
- ];
388
-
389
- for (const op of operations) {
390
- const latencies = await op.fn.call(this, 100); // 100 samples
391
- measurements.push({
392
- operation: op.name,
393
- p50: this.percentile(latencies, 50),
394
- p95: this.percentile(latencies, 95),
395
- p99: this.percentile(latencies, 99),
396
- max: Math.max(...latencies),
397
- mean: latencies.reduce((a, b) => a + b, 0) / latencies.length
398
- });
399
- }
400
-
401
- return measurements;
402
- }
403
-
404
- async detectBottlenecks(profile) {
405
- const bottlenecks = [];
406
-
407
- // CPU bottleneck
408
- if (profile.cpu.usage > this.thresholds.cpuUsage) {
409
- bottlenecks.push({
410
- type: 'CPU',
411
- severity: 'HIGH',
412
- current: profile.cpu.usage,
413
- threshold: this.thresholds.cpuUsage,
414
- recommendation: 'Enable batch processing or parallelize operations'
415
- });
416
- }
417
-
418
- // Memory bottleneck
419
- const memUsagePercent = (profile.memory.heapUsed / profile.memory.heapTotal) * 100;
420
- if (memUsagePercent > this.thresholds.memoryUsage) {
421
- bottlenecks.push({
422
- type: 'MEMORY',
423
- severity: 'HIGH',
424
- current: memUsagePercent,
425
- threshold: this.thresholds.memoryUsage,
426
- recommendation: 'Apply quantization (50-75% reduction) or increase heap size'
427
- });
428
- }
429
-
430
- // Latency bottleneck
431
- for (const measurement of profile.latency) {
432
- if (measurement.p95 > this.thresholds.latencyP95) {
433
- bottlenecks.push({
434
- type: 'LATENCY',
435
- severity: 'MEDIUM',
436
- operation: measurement.operation,
437
- current: measurement.p95,
438
- threshold: this.thresholds.latencyP95,
439
- recommendation: `Optimize ${measurement.operation} - consider caching or batching`
440
- });
441
- }
442
- }
443
-
444
- return bottlenecks;
445
- }
446
- }
447
- ```
448
-
449
- ### 4. Token Usage Optimization (50-75% Reduction)
450
-
451
- ```javascript
452
- // Token Usage Optimizer
453
- class TokenOptimizer {
454
- constructor() {
455
- this.strategies = {
456
- quantization: { reduction: '50-75%', methods: ['int8', 'int4', 'mixed'] },
457
- pruning: { reduction: '20-40%', methods: ['magnitude', 'structured'] },
458
- distillation: { reduction: '60-80%', methods: ['student-teacher'] },
459
- caching: { reduction: '30-50%', methods: ['kv-cache', 'prompt-cache'] }
460
- };
461
- }
462
-
463
- async optimizeTokenUsage(model, config = {}) {
464
- const optimizations = [];
465
-
466
- // 1. Quantization
467
- if (config.enableQuantization !== false) {
468
- optimizations.push(await this.applyQuantization(model, config.quantization));
469
- }
470
-
471
- // 2. KV-Cache optimization
472
- if (config.enableKVCache !== false) {
473
- optimizations.push(await this.optimizeKVCache(model, config.kvCache));
474
- }
475
-
476
- // 3. Prompt caching
477
- if (config.enablePromptCache !== false) {
478
- optimizations.push(await this.enablePromptCaching(model, config.promptCache));
479
- }
480
-
481
- // 4. Attention pruning
482
- if (config.enablePruning !== false) {
483
- optimizations.push(await this.pruneAttention(model, config.pruning));
484
- }
485
-
486
- return {
487
- optimizations,
488
- expectedReduction: this.calculateTotalReduction(optimizations),
489
- memoryImpact: this.estimateMemoryImpact(optimizations)
490
- };
491
- }
492
-
493
- async applyQuantization(model, config = {}) {
494
- const method = config.method || 'int8';
495
-
496
- return {
497
- type: 'QUANTIZATION',
498
- method: method,
499
- reduction: method === 'int4' ? '75%' : '50%',
500
- precision: {
501
- int4: { bits: 4, reduction: 0.75 },
502
- int8: { bits: 8, reduction: 0.50 },
503
- mixed: { bits: 'variable', reduction: 0.60 }
504
- }[method],
505
- layers: config.layers || 'all',
506
- skipLayers: config.skipLayers || ['embedding', 'lm_head']
507
- };
508
- }
509
-
510
- async optimizeKVCache(model, config = {}) {
511
- return {
512
- type: 'KV_CACHE',
513
- strategy: config.strategy || 'sliding_window',
514
- windowSize: config.windowSize || 4096,
515
- reduction: '30-40%',
516
- implementations: {
517
- sliding_window: 'Fixed-size attention window',
518
- paged_attention: 'Memory-efficient paged KV storage',
519
- grouped_query: 'Grouped query attention (GQA)'
520
- }
521
- };
522
- }
523
-
524
- // Analyze current token usage
525
- async analyzeTokenUsage(operations) {
526
- const analysis = {
527
- totalTokens: 0,
528
- breakdown: [],
529
- inefficiencies: [],
530
- recommendations: []
531
- };
532
-
533
- for (const op of operations) {
534
- const tokens = await this.countTokens(op);
535
- analysis.totalTokens += tokens.total;
536
- analysis.breakdown.push({
537
- operation: op.name,
538
- inputTokens: tokens.input,
539
- outputTokens: tokens.output,
540
- cacheHits: tokens.cached || 0
541
- });
542
-
543
- // Detect inefficiencies
544
- if (tokens.input > 1000 && tokens.cached === 0) {
545
- analysis.inefficiencies.push({
546
- operation: op.name,
547
- issue: 'Large uncached input',
548
- suggestion: 'Enable prompt caching for repeated patterns'
549
- });
550
- }
551
- }
552
-
553
- return analysis;
554
- }
555
- }
556
- ```
557
-
558
- ### 5. Latency Analysis & Optimization
559
-
560
- ```javascript
561
- // Latency Analyzer and Optimizer
562
- class LatencyOptimizer {
563
- constructor() {
564
- this.targets = {
565
- mcp_response: 100, // ms - V3 target
566
- neural_inference: 50, // ms
567
- memory_search: 10, // ms - HNSW target
568
- sona_adaptation: 0.05 // ms - V3 target
569
- };
570
- }
571
-
572
- async analyzeLatency(component) {
573
- const measurements = await this.collectLatencyMeasurements(component, 1000);
574
-
575
- return {
576
- component,
577
- statistics: {
578
- mean: this.mean(measurements),
579
- median: this.percentile(measurements, 50),
580
- p90: this.percentile(measurements, 90),
581
- p95: this.percentile(measurements, 95),
582
- p99: this.percentile(measurements, 99),
583
- max: Math.max(...measurements),
584
- min: Math.min(...measurements),
585
- stdDev: this.standardDeviation(measurements)
586
- },
587
- distribution: this.createHistogram(measurements),
588
- meetsTarget: this.checkTarget(component, measurements),
589
- optimizations: await this.suggestOptimizations(component, measurements)
590
- };
591
- }
592
-
593
- async suggestOptimizations(component, measurements) {
594
- const optimizations = [];
595
- const p99 = this.percentile(measurements, 99);
596
- const target = this.targets[component];
597
-
598
- if (p99 > target) {
599
- // Tail latency is too high
600
- optimizations.push({
601
- type: 'TAIL_LATENCY',
602
- current: p99,
603
- target: target,
604
- suggestions: [
605
- 'Enable request hedging for p99 reduction',
606
- 'Implement circuit breaker for slow requests',
607
- 'Add adaptive timeout based on historical latency'
608
- ]
609
- });
610
- }
611
-
612
- // Component-specific optimizations
613
- switch (component) {
614
- case 'mcp_response':
615
- optimizations.push({
616
- type: 'MCP_OPTIMIZATION',
617
- suggestions: [
618
- 'Enable connection pooling',
619
- 'Batch multiple tool calls',
620
- 'Use stdio transport for lower latency',
621
- 'Implement request pipelining'
622
- ]
623
- });
624
- break;
625
-
626
- case 'memory_search':
627
- optimizations.push({
628
- type: 'HNSW_OPTIMIZATION',
629
- suggestions: [
630
- 'Increase ef_construction for better graph quality',
631
- 'Tune M parameter for memory/speed tradeoff',
632
- 'Enable SIMD distance calculations',
633
- 'Use product quantization for large datasets'
634
- ],
635
- expectedImprovement: '150x-12,500x with HNSW'
636
- });
637
- break;
638
-
639
- case 'sona_adaptation':
640
- optimizations.push({
641
- type: 'SONA_OPTIMIZATION',
642
- suggestions: [
643
- 'Use Micro-LoRA (rank-2) for fastest adaptation',
644
- 'Pre-compute pattern embeddings',
645
- 'Enable SIMD for vector operations',
646
- 'Cache frequently used patterns'
647
- ],
648
- target: '<0.05ms'
649
- });
650
- break;
651
- }
652
-
653
- return optimizations;
654
- }
655
- }
656
- ```
657
-
658
- ### 6. Memory Footprint Reduction
659
-
660
- ```javascript
661
- // Memory Footprint Optimizer
662
- class MemoryOptimizer {
663
- constructor() {
664
- this.reductionTargets = {
665
- quantization: 0.50, // 50% reduction with int8
666
- pruning: 0.30, // 30% reduction
667
- sharing: 0.20, // 20% reduction with weight sharing
668
- compression: 0.40 // 40% reduction with compression
669
- };
670
- }
671
-
672
- async optimizeMemory(model, constraints = {}) {
673
- const currentUsage = await this.measureMemoryUsage(model);
674
- const optimizations = [];
675
-
676
- // 1. Weight quantization
677
- if (!constraints.skipQuantization) {
678
- optimizations.push(await this.quantizeWeights(model, {
679
- precision: constraints.precision || 'int8',
680
- calibrationSamples: 100
681
- }));
682
- }
683
-
684
- // 2. Activation checkpointing
685
- if (!constraints.skipCheckpointing) {
686
- optimizations.push(await this.enableCheckpointing(model, {
687
- strategy: 'selective', // Only checkpoint large activations
688
- threshold: 1024 * 1024 // 1MB
689
- }));
690
- }
691
-
692
- // 3. Memory pooling
693
- optimizations.push(await this.enableMemoryPooling({
694
- poolSize: constraints.poolSize || 100 * 1024 * 1024, // 100MB
695
- blockSize: 4096
696
- }));
697
-
698
- // 4. Garbage collection optimization
699
- optimizations.push(await this.optimizeGC({
700
- maxPauseMs: 10,
701
- idleTime: 5000
702
- }));
703
-
704
- const newUsage = await this.measureMemoryUsage(model);
705
-
706
- return {
707
- before: currentUsage,
708
- after: newUsage,
709
- reduction: 1 - (newUsage.total / currentUsage.total),
710
- optimizations,
711
- meetsTarget: (1 - (newUsage.total / currentUsage.total)) >= 0.50
712
- };
713
- }
714
-
715
- async quantizeWeights(model, config) {
716
- const precision = config.precision;
717
- const reductionMap = {
718
- 'int4': 0.75,
719
- 'int8': 0.50,
720
- 'fp16': 0.50,
721
- 'bf16': 0.50
722
- };
723
-
724
- return {
725
- type: 'WEIGHT_QUANTIZATION',
726
- precision: precision,
727
- expectedReduction: reductionMap[precision] || 0.50,
728
- calibration: config.calibrationSamples > 0,
729
- recommendation: precision === 'int4' ?
730
- 'Best memory reduction but may impact quality' :
731
- 'Balanced memory/quality tradeoff'
732
- };
733
- }
734
- }
735
- ```
736
-
737
- ### 7. Batch Processing Optimization
738
-
739
- ```javascript
740
- // Batch Processing Optimizer
741
- class BatchOptimizer {
742
- constructor() {
743
- this.optimalBatchSizes = {
744
- embedding: 64,
745
- inference: 32,
746
- training: 16,
747
- search: 100
748
- };
749
- }
750
-
751
- async optimizeBatchProcessing(operations, constraints = {}) {
752
- const optimizations = [];
753
-
754
- for (const op of operations) {
755
- const optimalBatch = await this.findOptimalBatchSize(op, constraints);
756
-
757
- optimizations.push({
758
- operation: op.name,
759
- currentBatchSize: op.batchSize || 1,
760
- optimalBatchSize: optimalBatch.size,
761
- expectedSpeedup: optimalBatch.speedup,
762
- memoryIncrease: optimalBatch.memoryIncrease,
763
- configuration: {
764
- size: optimalBatch.size,
765
- dynamicBatching: optimalBatch.dynamic,
766
- maxWaitMs: optimalBatch.maxWait
767
- }
768
- });
769
- }
770
-
771
- return {
772
- optimizations,
773
- totalSpeedup: this.calculateTotalSpeedup(optimizations),
774
- recommendations: this.generateBatchRecommendations(optimizations)
775
- };
776
- }
777
-
778
- async findOptimalBatchSize(operation, constraints) {
779
- const baseSize = this.optimalBatchSizes[operation.type] || 32;
780
- const maxMemory = constraints.maxMemory || Infinity;
781
-
782
- let optimalSize = baseSize;
783
- let bestThroughput = 0;
784
-
785
- // Binary search for optimal batch size
786
- let low = 1, high = baseSize * 4;
787
-
788
- while (low <= high) {
789
- const mid = Math.floor((low + high) / 2);
790
- const metrics = await this.benchmarkBatchSize(operation, mid);
791
-
792
- if (metrics.memory <= maxMemory && metrics.throughput > bestThroughput) {
793
- bestThroughput = metrics.throughput;
794
- optimalSize = mid;
795
- low = mid + 1;
796
- } else {
797
- high = mid - 1;
798
- }
799
- }
800
-
801
- return {
802
- size: optimalSize,
803
- speedup: bestThroughput / (await this.benchmarkBatchSize(operation, 1)).throughput,
804
- memoryIncrease: await this.estimateMemoryIncrease(operation, optimalSize),
805
- dynamic: operation.variableLoad,
806
- maxWait: operation.latencySensitive ? 10 : 100
807
- };
808
- }
809
- }
810
- ```
811
-
812
- ### 8. Parallel Execution Strategies
813
-
814
- ```javascript
815
- // Parallel Execution Optimizer
816
- class ParallelExecutionOptimizer {
817
- constructor() {
818
- this.strategies = {
819
- dataParallel: { overhead: 'low', scaling: 'linear' },
820
- modelParallel: { overhead: 'medium', scaling: 'sub-linear' },
821
- pipelineParallel: { overhead: 'high', scaling: 'good' },
822
- tensorParallel: { overhead: 'medium', scaling: 'good' }
823
- };
824
- }
825
-
826
- async optimizeParallelization(task, resources) {
827
- const analysis = await this.analyzeParallelizationOpportunities(task);
828
-
829
- return {
830
- strategy: await this.selectOptimalStrategy(analysis, resources),
831
- partitioning: await this.createPartitioningPlan(analysis, resources),
832
- synchronization: await this.planSynchronization(analysis),
833
- expectedSpeedup: await this.estimateSpeedup(analysis, resources)
834
- };
835
- }
836
-
837
- async analyzeParallelizationOpportunities(task) {
838
- return {
839
- independentOperations: await this.findIndependentOps(task),
840
- dependencyGraph: await this.buildDependencyGraph(task),
841
- criticalPath: await this.findCriticalPath(task),
842
- parallelizableRatio: await this.calculateParallelRatio(task)
843
- };
844
- }
845
-
846
- async selectOptimalStrategy(analysis, resources) {
847
- const cpuCores = resources.cpuCores || 8;
848
- const memoryGB = resources.memoryGB || 16;
849
- const gpuCount = resources.gpuCount || 0;
850
-
851
- if (gpuCount > 1 && analysis.parallelizableRatio > 0.8) {
852
- return {
853
- type: 'DATA_PARALLEL',
854
- workers: gpuCount,
855
- reason: 'High parallelizable ratio with multiple GPUs',
856
- expectedEfficiency: 0.85
857
- };
858
- }
859
-
860
- if (analysis.criticalPath.length > 10 && cpuCores > 4) {
861
- return {
862
- type: 'PIPELINE_PARALLEL',
863
- stages: Math.min(cpuCores, analysis.criticalPath.length),
864
- reason: 'Long critical path benefits from pipelining',
865
- expectedEfficiency: 0.75
866
- };
867
- }
868
-
869
- return {
870
- type: 'TASK_PARALLEL',
871
- workers: cpuCores,
872
- reason: 'General task parallelization',
873
- expectedEfficiency: 0.70
874
- };
875
- }
876
-
877
- // Amdahl's Law calculation
878
- calculateTheoreticalSpeedup(parallelRatio, workers) {
879
- // S = 1 / ((1 - P) + P/N)
880
- const serialPortion = 1 - parallelRatio;
881
- return 1 / (serialPortion + parallelRatio / workers);
882
- }
883
- }
884
- ```
885
-
886
- ### 9. Benchmark Suite Integration
887
-
888
- ```javascript
889
- // V3 Performance Benchmark Suite
890
- class V3BenchmarkSuite {
891
- constructor() {
892
- this.benchmarks = {
893
- flash_attention: new FlashAttentionBenchmark(),
894
- hnsw_search: new HNSWSearchBenchmark(),
895
- wasm_simd: new WASMSIMDBenchmark(),
896
- memory_ops: new MemoryOperationsBenchmark(),
897
- mcp_latency: new MCPLatencyBenchmark(),
898
- sona_adaptation: new SONAAdaptationBenchmark()
899
- };
900
-
901
- this.targets = {
902
- flash_attention_speedup: { min: 2.49, max: 7.47 },
903
- hnsw_improvement: { min: 150, max: 12500 },
904
- memory_reduction: { min: 0.50, max: 0.75 },
905
- mcp_response_ms: { max: 100 },
906
- sona_adaptation_ms: { max: 0.05 }
907
- };
908
- }
909
-
910
- async runFullSuite(config = {}) {
911
- const results = {
912
- timestamp: Date.now(),
913
- config: config,
914
- benchmarks: {},
915
- summary: {}
916
- };
917
-
918
- // Run all benchmarks in parallel
919
- const benchmarkPromises = Object.entries(this.benchmarks).map(
920
- async ([name, benchmark]) => {
921
- const result = await benchmark.run(config);
922
- return [name, result];
923
- }
924
- );
925
-
926
- const benchmarkResults = await Promise.all(benchmarkPromises);
927
-
928
- for (const [name, result] of benchmarkResults) {
929
- results.benchmarks[name] = result;
930
- }
931
-
932
- // Generate summary
933
- results.summary = this.generateSummary(results.benchmarks);
934
-
935
- // Store results in memory
936
- await this.storeResults(results);
937
-
938
- return results;
939
- }
940
-
941
- generateSummary(benchmarks) {
942
- const summary = {
943
- passing: 0,
944
- failing: 0,
945
- warnings: 0,
946
- details: []
947
- };
948
-
949
- // Check flash attention
950
- if (benchmarks.flash_attention) {
951
- const speedup = benchmarks.flash_attention.speedup;
952
- if (speedup >= this.targets.flash_attention_speedup.min) {
953
- summary.passing++;
954
- summary.details.push({
955
- benchmark: 'Flash Attention',
956
- status: 'PASS',
957
- value: `${speedup.toFixed(2)}x speedup`,
958
- target: `${this.targets.flash_attention_speedup.min}x-${this.targets.flash_attention_speedup.max}x`
959
- });
960
- } else {
961
- summary.failing++;
962
- summary.details.push({
963
- benchmark: 'Flash Attention',
964
- status: 'FAIL',
965
- value: `${speedup.toFixed(2)}x speedup`,
966
- target: `${this.targets.flash_attention_speedup.min}x minimum`
967
- });
968
- }
969
- }
970
-
971
- // Check HNSW search
972
- if (benchmarks.hnsw_search) {
973
- const improvement = benchmarks.hnsw_search.improvement;
974
- if (improvement >= this.targets.hnsw_improvement.min) {
975
- summary.passing++;
976
- summary.details.push({
977
- benchmark: 'HNSW Search',
978
- status: 'PASS',
979
- value: `${improvement}x faster`,
980
- target: `${this.targets.hnsw_improvement.min}x-${this.targets.hnsw_improvement.max}x`
981
- });
982
- }
983
- }
984
-
985
- // Check MCP latency
986
- if (benchmarks.mcp_latency) {
987
- const p95 = benchmarks.mcp_latency.p95;
988
- if (p95 <= this.targets.mcp_response_ms.max) {
989
- summary.passing++;
990
- summary.details.push({
991
- benchmark: 'MCP Response',
992
- status: 'PASS',
993
- value: `${p95.toFixed(1)}ms p95`,
994
- target: `<${this.targets.mcp_response_ms.max}ms`
995
- });
996
- }
997
- }
998
-
999
- // Check SONA adaptation
1000
- if (benchmarks.sona_adaptation) {
1001
- const latency = benchmarks.sona_adaptation.latency;
1002
- if (latency <= this.targets.sona_adaptation_ms.max) {
1003
- summary.passing++;
1004
- summary.details.push({
1005
- benchmark: 'SONA Adaptation',
1006
- status: 'PASS',
1007
- value: `${latency.toFixed(3)}ms`,
1008
- target: `<${this.targets.sona_adaptation_ms.max}ms`
1009
- });
1010
- }
1011
- }
1012
-
1013
- summary.overallStatus = summary.failing === 0 ? 'PASS' : 'FAIL';
1014
-
1015
- return summary;
1016
- }
1017
- }
1018
- ```
1019
-
1020
- ## MCP Integration
1021
-
1022
- ### Performance Monitoring via MCP
1023
-
1024
- ```javascript
1025
- // V3 Performance MCP Integration
1026
- const performanceMCP = {
1027
- // Run benchmark suite
1028
- async runBenchmarks(suite = 'all') {
1029
- return await mcp__moflo__performance_benchmark({ suite });
1030
- },
1031
-
1032
- // Analyze bottlenecks
1033
- async analyzeBottlenecks(component) {
1034
- return await mcp__moflo__performance_report({
1035
- component: component,
1036
- metrics: ['latency', 'throughput', 'memory', 'cpu']
1037
- });
1038
- },
1039
-
1040
- // Get performance report
1041
- async getPerformanceReport(timeframe = '24h') {
1042
- return await mcp__moflo__performance_report({
1043
- format: 'detailed',
1044
- timeframe: timeframe
1045
- });
1046
- },
1047
-
1048
- // Token usage analysis
1049
- async analyzeTokenUsage(operation) {
1050
- },
1051
-
1052
- // WASM optimization
1053
- async optimizeWASM(operation) {
1054
- },
1055
-
1056
- // Neural pattern optimization
1057
- async optimizeNeuralPatterns() {
1058
- return await mcp__moflo__neural_patterns({
1059
- action: 'analyze',
1060
- metadata: { focus: 'performance' }
1061
- });
1062
- },
1063
-
1064
- // Store performance metrics
1065
- async storeMetrics(key, value) {
1066
- return await mcp__moflo__memory_store({
1067
- key: `performance/${key}`,
1068
- value: JSON.stringify(value),
1069
- namespace: 'v3-performance',
1070
- ttl: 604800000 // 7 days
1071
- });
1072
- }
1073
- };
1074
- ```
1075
-
1076
- ## CLI Integration
1077
-
1078
- ### Performance Commands
1079
-
1080
- ```bash
1081
- # Run full benchmark suite
1082
- npx claude-flow@v3alpha performance benchmark --suite all
1083
-
1084
- # Profile specific component
1085
- npx claude-flow@v3alpha performance profile --component mcp-server
1086
-
1087
- # Analyze bottlenecks
1088
- npx claude-flow@v3alpha performance analyze --target latency
1089
-
1090
- # Generate performance report
1091
- npx claude-flow@v3alpha performance report --format detailed
1092
-
1093
- # Optimize specific area
1094
- npx claude-flow@v3alpha performance optimize --focus memory
1095
-
1096
- # Real-time metrics
1097
- npx claude-flow@v3alpha status --metrics --watch
1098
-
1099
- # WASM SIMD benchmark
1100
- npx claude-flow@v3alpha performance benchmark --suite wasm-simd
1101
-
1102
- # Flash attention benchmark
1103
- npx claude-flow@v3alpha performance benchmark --suite flash-attention
1104
-
1105
- # Memory reduction analysis
1106
- npx claude-flow@v3alpha performance analyze --target memory --quantization int8
1107
- ```
1108
-
1109
- ## SONA Integration
1110
-
1111
- ### Adaptive Learning for Performance Optimization
1112
-
1113
- ```javascript
1114
- // SONA-powered Performance Learning
1115
- class SONAPerformanceOptimizer {
1116
- constructor() {
1117
- this.trajectories = [];
1118
- this.learnedPatterns = new Map();
1119
- }
1120
-
1121
- async learnFromOptimization(optimization, result) {
1122
- // Record trajectory
1123
- const trajectory = {
1124
- optimization: optimization,
1125
- result: result,
1126
- qualityScore: this.calculateQualityScore(result)
1127
- };
1128
-
1129
- this.trajectories.push(trajectory);
1130
-
1131
- // Trigger SONA learning if threshold reached
1132
- if (this.trajectories.length >= 10) {
1133
- await this.triggerSONALearning();
1134
- }
1135
- }
1136
-
1137
- async triggerSONALearning() {
1138
- // Use SONA to learn optimization patterns
1139
- await mcp__moflo__neural_train({
1140
- pattern_type: 'optimization',
1141
- training_data: JSON.stringify(this.trajectories),
1142
- epochs: 10
1143
- });
1144
-
1145
- // Extract learned patterns
1146
- const patterns = await mcp__moflo__neural_patterns({
1147
- action: 'analyze',
1148
- metadata: { domain: 'performance' }
1149
- });
1150
-
1151
- // Store patterns for future use
1152
- for (const pattern of patterns) {
1153
- this.learnedPatterns.set(pattern.signature, pattern);
1154
- }
1155
-
1156
- // Clear processed trajectories
1157
- this.trajectories = [];
1158
- }
1159
-
1160
- async predictOptimalSettings(context) {
1161
- // Use SONA to predict optimal configuration
1162
- const prediction = await mcp__moflo__neural_predict({
1163
- modelId: 'performance-optimizer',
1164
- input: JSON.stringify(context)
1165
- });
1166
-
1167
- return {
1168
- batchSize: prediction.batch_size,
1169
- parallelism: prediction.parallelism,
1170
- caching: prediction.caching_strategy,
1171
- quantization: prediction.quantization_level,
1172
- confidence: prediction.confidence
1173
- };
1174
- }
1175
- }
1176
- ```
1177
-
1178
- ## Best Practices
1179
-
1180
- ### Performance Optimization Checklist
1181
-
1182
- 1. **Flash Attention**
1183
- - Enable for all transformer-based models
1184
- - Use fused operations where possible
1185
- - Target 2.49x-7.47x speedup
1186
-
1187
- 2. **WASM SIMD**
1188
- - Enable SIMD for vector operations
1189
- - Use aligned memory access
1190
- - Batch operations for SIMD efficiency
1191
-
1192
- 3. **Memory Optimization**
1193
- - Apply int8/int4 quantization (50-75% reduction)
1194
- - Enable gradient checkpointing
1195
- - Use memory pooling for allocations
1196
-
1197
- 4. **Latency Reduction**
1198
- - Keep MCP response <100ms
1199
- - Use connection pooling
1200
- - Batch tool calls when possible
1201
-
1202
- 5. **SONA Integration**
1203
- - Track all optimization trajectories
1204
- - Learn from successful patterns
1205
- - Target <0.05ms adaptation time
1206
-
1207
- ## Integration Points
1208
-
1209
- ### With Other V3 Agents
1210
-
1211
- - **Memory Specialist**: Coordinate memory optimization strategies
1212
- - **Security Architect**: Ensure performance changes maintain security
1213
- - **SONA Learning Optimizer**: Share learned optimization patterns
1214
-
1215
- ### With Swarm Coordination
1216
-
1217
- - Provide performance metrics to coordinators
1218
- - Optimize agent communication patterns
1219
- - Balance load across swarm agents
1220
-
1221
- ---
1222
-
1223
- **V3 Performance Engineer** - Optimizing Claude Flow for maximum performance
1224
-
1225
- Targets: Flash Attention 2.49x-7.47x | HNSW 150x-12,500x | Memory -50-75% | MCP <100ms | SONA <0.05ms