specweave 0.3.13 → 0.4.0

Files changed (112)
  1. package/CLAUDE.md +17 -1
  2. package/README.md +1 -1
  3. package/bin/install-all.sh +9 -2
  4. package/bin/install-hooks.sh +57 -0
  5. package/dist/cli/commands/init.d.ts.map +1 -1
  6. package/dist/cli/commands/init.js +55 -0
  7. package/dist/cli/commands/init.js.map +1 -1
  8. package/dist/core/agent-model-manager.d.ts +52 -0
  9. package/dist/core/agent-model-manager.d.ts.map +1 -0
  10. package/dist/core/agent-model-manager.js +120 -0
  11. package/dist/core/agent-model-manager.js.map +1 -0
  12. package/dist/core/cost-tracker.d.ts +108 -0
  13. package/dist/core/cost-tracker.d.ts.map +1 -0
  14. package/dist/core/cost-tracker.js +281 -0
  15. package/dist/core/cost-tracker.js.map +1 -0
  16. package/dist/core/model-selector.d.ts +57 -0
  17. package/dist/core/model-selector.d.ts.map +1 -0
  18. package/dist/core/model-selector.js +115 -0
  19. package/dist/core/model-selector.js.map +1 -0
  20. package/dist/core/phase-detector.d.ts +62 -0
  21. package/dist/core/phase-detector.d.ts.map +1 -0
  22. package/dist/core/phase-detector.js +229 -0
  23. package/dist/core/phase-detector.js.map +1 -0
  24. package/dist/types/cost-tracking.d.ts +43 -0
  25. package/dist/types/cost-tracking.d.ts.map +1 -0
  26. package/dist/types/cost-tracking.js +8 -0
  27. package/dist/types/cost-tracking.js.map +1 -0
  28. package/dist/types/model-selection.d.ts +53 -0
  29. package/dist/types/model-selection.d.ts.map +1 -0
  30. package/dist/types/model-selection.js +12 -0
  31. package/dist/types/model-selection.js.map +1 -0
  32. package/dist/utils/cost-reporter.d.ts +58 -0
  33. package/dist/utils/cost-reporter.d.ts.map +1 -0
  34. package/dist/utils/cost-reporter.js +224 -0
  35. package/dist/utils/cost-reporter.js.map +1 -0
  36. package/dist/utils/pricing-constants.d.ts +70 -0
  37. package/dist/utils/pricing-constants.d.ts.map +1 -0
  38. package/dist/utils/pricing-constants.js +71 -0
  39. package/dist/utils/pricing-constants.js.map +1 -0
  40. package/package.json +1 -1
  41. package/src/agents/architect/AGENT.md +3 -0
  42. package/src/agents/code-reviewer.md +156 -0
  43. package/src/agents/data-scientist/AGENT.md +181 -0
  44. package/src/agents/database-optimizer/AGENT.md +147 -0
  45. package/src/agents/devops/AGENT.md +3 -0
  46. package/src/agents/diagrams-architect/AGENT.md +3 -0
  47. package/src/agents/docs-writer/AGENT.md +3 -0
  48. package/src/agents/kubernetes-architect/AGENT.md +142 -0
  49. package/src/agents/ml-engineer/AGENT.md +150 -0
  50. package/src/agents/mlops-engineer/AGENT.md +201 -0
  51. package/src/agents/network-engineer/AGENT.md +149 -0
  52. package/src/agents/observability-engineer/AGENT.md +213 -0
  53. package/src/agents/payment-integration/AGENT.md +35 -0
  54. package/src/agents/performance/AGENT.md +3 -0
  55. package/src/agents/performance-engineer/AGENT.md +153 -0
  56. package/src/agents/pm/AGENT.md +3 -0
  57. package/src/agents/qa-lead/AGENT.md +3 -0
  58. package/src/agents/security/AGENT.md +3 -0
  59. package/src/agents/sre/AGENT.md +3 -0
  60. package/src/agents/tdd-orchestrator/AGENT.md +169 -0
  61. package/src/agents/tech-lead/AGENT.md +3 -0
  62. package/src/commands/specweave.costs.md +261 -0
  63. package/src/commands/specweave.ml-pipeline.md +292 -0
  64. package/src/commands/specweave.monitor-setup.md +501 -0
  65. package/src/commands/specweave.slo-implement.md +1055 -0
  66. package/src/commands/specweave.sync-github.md +1 -1
  67. package/src/commands/specweave.tdd-cycle.md +199 -0
  68. package/src/commands/specweave.tdd-green.md +842 -0
  69. package/src/commands/specweave.tdd-red.md +135 -0
  70. package/src/commands/specweave.tdd-refactor.md +165 -0
  71. package/src/skills/SKILLS-INDEX.md +18 -10
  72. package/src/skills/billing-automation/SKILL.md +559 -0
  73. package/src/skills/distributed-tracing/SKILL.md +438 -0
  74. package/src/skills/e2e-playwright/README.md +1 -1
  75. package/src/skills/e2e-playwright/package.json +1 -1
  76. package/src/skills/gitops-workflow/SKILL.md +285 -0
  77. package/src/skills/gitops-workflow/references/argocd-setup.md +134 -0
  78. package/src/skills/gitops-workflow/references/sync-policies.md +131 -0
  79. package/src/skills/grafana-dashboards/SKILL.md +369 -0
  80. package/src/skills/helm-chart-scaffolding/SKILL.md +544 -0
  81. package/src/skills/helm-chart-scaffolding/assets/Chart.yaml.template +42 -0
  82. package/src/skills/helm-chart-scaffolding/assets/values.yaml.template +185 -0
  83. package/src/skills/helm-chart-scaffolding/references/chart-structure.md +500 -0
  84. package/src/skills/helm-chart-scaffolding/scripts/validate-chart.sh +244 -0
  85. package/src/skills/k8s-manifest-generator/SKILL.md +511 -0
  86. package/src/skills/k8s-manifest-generator/assets/configmap-template.yaml +296 -0
  87. package/src/skills/k8s-manifest-generator/assets/deployment-template.yaml +203 -0
  88. package/src/skills/k8s-manifest-generator/assets/service-template.yaml +171 -0
  89. package/src/skills/k8s-manifest-generator/references/deployment-spec.md +753 -0
  90. package/src/skills/k8s-manifest-generator/references/service-spec.md +724 -0
  91. package/src/skills/k8s-security-policies/SKILL.md +334 -0
  92. package/src/skills/k8s-security-policies/assets/network-policy-template.yaml +177 -0
  93. package/src/skills/k8s-security-policies/references/rbac-patterns.md +187 -0
  94. package/src/skills/ml-pipeline-workflow/SKILL.md +245 -0
  95. package/src/skills/paypal-integration/SKILL.md +467 -0
  96. package/src/skills/pci-compliance/SKILL.md +466 -0
  97. package/src/skills/prometheus-configuration/SKILL.md +392 -0
  98. package/src/skills/slo-implementation/SKILL.md +329 -0
  99. package/src/skills/stripe-integration/SKILL.md +442 -0
  100. package/src/skills/tdd-workflow/SKILL.md +378 -0
  101. package/src/templates/README.md.template +1 -1
  102. package/src/skills/bmad-method-expert/SKILL.md +0 -626
  103. package/src/skills/bmad-method-expert/scripts/analyze-project.js +0 -318
  104. package/src/skills/bmad-method-expert/scripts/check-setup.js +0 -208
  105. package/src/skills/bmad-method-expert/scripts/generate-template.js +0 -1149
  106. package/src/skills/bmad-method-expert/scripts/validate-documents.js +0 -340
  107. package/src/skills/context-optimizer/SKILL.md +0 -588
  108. package/src/skills/figma-designer/SKILL.md +0 -149
  109. package/src/skills/figma-implementer/SKILL.md +0 -148
  110. package/src/skills/figma-mcp-connector/SKILL.md +0 -136
  111. package/src/skills/figma-to-code/SKILL.md +0 -128
  112. package/src/skills/spec-kit-expert/SKILL.md +0 -1010
@@ -0,0 +1,292 @@
1
+ # Machine Learning Pipeline - Multi-Agent MLOps Orchestration
2
+
3
+ Design and implement a complete ML pipeline for: $ARGUMENTS
4
+
5
+ ## Thinking
6
+
7
+ This workflow orchestrates multiple specialized agents to build a production-ready ML pipeline following modern MLOps best practices. The approach emphasizes:
8
+
9
+ - **Phase-based coordination**: Each phase builds upon previous outputs, with clear handoffs between agents
10
+ - **Modern tooling integration**: MLflow/W&B for experiments, Feast/Tecton for features, KServe/Seldon for serving
11
+ - **Production-first mindset**: Every component designed for scale, monitoring, and reliability
12
+ - **Reproducibility**: Version control for data, models, and infrastructure
13
+ - **Continuous improvement**: Automated retraining, A/B testing, and drift detection
14
+
15
+ The multi-agent approach ensures each aspect is handled by domain experts:
16
+ - Data engineers handle ingestion and quality
17
+ - Data scientists design features and experiments
18
+ - ML engineers implement training pipelines
19
+ - MLOps engineers handle production deployment
20
+ - Observability engineers ensure monitoring
21
+
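The phase-based handoff listed above appears later in this file as placeholders such as `{phase1.data-engineer.output}`. The diff does not show how those placeholders are resolved, so the following is only a minimal sketch that assumes plain string substitution. `PLACEHOLDER`, `resolve_prompt`, and the `outputs` mapping are hypothetical names, not part of SpecWeave.

```python
from __future__ import annotations

import re

# Hypothetical pattern for placeholders of the form {phaseN.agent-name.output}.
PLACEHOLDER = re.compile(r"\{(phase\d+)\.([a-z0-9-]+)\.output\}")


def resolve_prompt(prompt: str, outputs: dict[tuple[str, str], str]) -> str:
    """Replace each {phaseN.agent.output} with that agent's recorded output."""

    def substitute(match: re.Match) -> str:
        key = (match.group(1), match.group(2))
        if key not in outputs:
            # Fail loudly instead of sending an unresolved placeholder downstream.
            raise KeyError(f"no output recorded for {key[0]}.{key[1]}")
        return outputs[key]

    return PLACEHOLDER.sub(substitute, prompt)


# Illustrative data only; a real run would store each agent's actual output.
outputs = {("phase1", "data-engineer"): "<data architecture summary>"}
resolved = resolve_prompt(
    "Using data architecture from: {phase1.data-engineer.output}", outputs
)
```

Raising on a missing key makes an out-of-order phase visible immediately, which seems to fit the strict phase ordering the file describes.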
22
+ ## Phase 1: Data & Requirements Analysis
23
+
24
+ <Task>
25
+ subagent_type: data-engineer
26
+ prompt: |
27
+ Analyze and design data pipeline for ML system with requirements: $ARGUMENTS
28
+
29
+ Deliverables:
30
+ 1. Data source audit and ingestion strategy:
31
+ - Source systems and connection patterns
32
+ - Schema validation using Pydantic/Great Expectations
33
+ - Data versioning with DVC or lakeFS
34
+ - Incremental loading and CDC strategies
35
+
36
+ 2. Data quality framework:
37
+ - Profiling and statistics generation
38
+ - Anomaly detection rules
39
+ - Data lineage tracking
40
+ - Quality gates and SLAs
41
+
42
+ 3. Storage architecture:
43
+ - Raw/processed/feature layers
44
+ - Partitioning strategy
45
+ - Retention policies
46
+ - Cost optimization
47
+
48
+ Provide implementation code for critical components and integration patterns.
49
+ </Task>
50
+
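The "Quality gates and SLAs" deliverable above can be illustrated with a small row-level gate. This is a sketch under stated assumptions: the rules, the row shape, and the names `GateResult` and `run_quality_gate` are hypothetical, and the default threshold simply mirrors the "< 0.1% data quality issues" target stated under Success Criteria in this file. A real pipeline would more likely use Pydantic or Great Expectations, as the prompt suggests.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

Row = dict
Rule = Callable[[Row], bool]


@dataclass
class GateResult:
    total: int
    failed: int
    threshold: float

    @property
    def failure_rate(self) -> float:
        # An empty batch is treated as having no failures.
        return self.failed / self.total if self.total else 0.0

    @property
    def passed(self) -> bool:
        return self.failure_rate < self.threshold


def run_quality_gate(
    rows: Iterable[Row], rules: Iterable[Rule], threshold: float = 0.001
) -> GateResult:
    """Count rows that break at least one rule and compare against the threshold."""
    rules = list(rules)
    total = failed = 0
    for row in rows:
        total += 1
        if not all(rule(row) for rule in rules):
            failed += 1
    return GateResult(total=total, failed=failed, threshold=threshold)


# Hypothetical rules and rows, for illustration only.
rules = [
    lambda r: r.get("user_id") is not None,
    lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
]
rows = [
    {"user_id": 1, "amount": 9.5},
    {"user_id": None, "amount": 3.0},
    {"user_id": 2, "amount": -1},
    {"user_id": 3, "amount": 0},
]
result = run_quality_gate(rows, rules)
```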
51
+ <Task>
52
+ subagent_type: data-scientist
53
+ prompt: |
54
+ Design feature engineering and model requirements for: $ARGUMENTS
55
+ Using data architecture from: {phase1.data-engineer.output}
56
+
57
+ Deliverables:
58
+ 1. Feature engineering pipeline:
59
+ - Transformation specifications
60
+ - Feature store schema (Feast/Tecton)
61
+ - Statistical validation rules
62
+ - Handling strategies for missing data/outliers
63
+
64
+ 2. Model requirements:
65
+ - Algorithm selection rationale
66
+ - Performance metrics and baselines
67
+ - Training data requirements
68
+ - Evaluation criteria and thresholds
69
+
70
+ 3. Experiment design:
71
+ - Hypothesis and success metrics
72
+ - A/B testing methodology
73
+ - Sample size calculations
74
+ - Bias detection approach
75
+
76
+ Include feature transformation code and statistical validation logic.
77
+ </Task>
78
+
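For the "Sample size calculations" item above, one common choice is the normal-approximation formula for comparing two proportions. The file does not say which method the agent should use, so treat this as one possible sketch; `sample_size_per_group` and its default `alpha` and `power` values are assumptions.

```python
import math
from statistics import NormalDist


def sample_size_per_group(
    p_baseline: float, p_variant: float, alpha: float = 0.05, power: float = 0.8
) -> int:
    """Per-group sample size for a two-sided test of two proportions."""
    if p_baseline == p_variant:
        raise ValueError("the two rates must differ to size a test")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_baseline + p_variant) / 2
    spread_null = math.sqrt(2 * p_bar * (1 - p_bar))
    spread_alt = math.sqrt(
        p_baseline * (1 - p_baseline) + p_variant * (1 - p_variant)
    )
    numerator = (z_alpha * spread_null + z_beta * spread_alt) ** 2
    return math.ceil(numerator / (p_baseline - p_variant) ** 2)


# Example: detecting a lift from a 10% to a 12% conversion rate.
n_per_group = sample_size_per_group(0.10, 0.12)
```

Smaller expected effects drive the required sample size up quickly, which is worth checking before committing to an A/B test window.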
79
+ ## Phase 2: Model Development & Training
80
+
81
+ <Task>
82
+ subagent_type: ml-engineer
83
+ prompt: |
84
+ Implement training pipeline based on requirements: {phase1.data-scientist.output}
85
+ Using data pipeline: {phase1.data-engineer.output}
86
+
87
+ Build comprehensive training system:
88
+ 1. Training pipeline implementation:
89
+ - Modular training code with clear interfaces
90
+ - Hyperparameter optimization (Optuna/Ray Tune)
91
+ - Distributed training support (Horovod/PyTorch DDP)
92
+ - Cross-validation and ensemble strategies
93
+
94
+ 2. Experiment tracking setup:
95
+ - MLflow/Weights & Biases integration
96
+ - Metric logging and visualization
97
+ - Artifact management (models, plots, data samples)
98
+ - Experiment comparison and analysis tools
99
+
100
+ 3. Model registry integration:
101
+ - Version control and tagging strategy
102
+ - Model metadata and lineage
103
+ - Promotion workflows (dev -> staging -> prod)
104
+ - Rollback procedures
105
+
106
+ Provide complete training code with configuration management.
107
+ </Task>
108
+
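The promotion workflow (dev -> staging -> prod) and rollback procedure named above can be pictured as a small state machine. This in-memory sketch is purely illustrative: `Stage`, `ModelVersion`, and their methods are hypothetical and do not correspond to the API of MLflow or any other registry.

```python
from enum import Enum


class Stage(Enum):
    DEV = "dev"
    STAGING = "staging"
    PROD = "prod"


PROMOTION_ORDER = [Stage.DEV, Stage.STAGING, Stage.PROD]


class ModelVersion:
    """Tracks the current stage of one model version plus its stage history."""

    def __init__(self, name: str, version: int) -> None:
        self.name = name
        self.version = version
        self.stage = Stage.DEV
        self.history = [Stage.DEV]

    def promote(self) -> Stage:
        index = PROMOTION_ORDER.index(self.stage)
        if index == len(PROMOTION_ORDER) - 1:
            raise ValueError(f"{self.name} v{self.version} is already in prod")
        self.stage = PROMOTION_ORDER[index + 1]
        self.history.append(self.stage)
        return self.stage

    def rollback(self) -> Stage:
        if len(self.history) < 2:
            raise ValueError("no earlier stage to roll back to")
        self.history.pop()  # drop the current stage
        self.stage = self.history[-1]
        return self.stage


# Hypothetical model name, for illustration only.
model = ModelVersion("churn-classifier", version=3)
model.promote()  # dev -> staging
model.promote()  # staging -> prod
```

Only single-step promotions are allowed here, so a model cannot skip staging; whether a real setup should permit that is a policy decision the file leaves open.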
109
+ <Task>
110
+ subagent_type: python-pro
111
+ prompt: |
112
+ Optimize and productionize ML code from: {phase2.ml-engineer.output}
113
+
114
+ Focus areas:
115
+ 1. Code quality and structure:
116
+ - Refactor for production standards
117
+ - Add comprehensive error handling
118
+ - Implement proper logging with structured formats
119
+ - Create reusable components and utilities
120
+
121
+ 2. Performance optimization:
122
+ - Profile and optimize bottlenecks
123
+ - Implement caching strategies
124
+ - Optimize data loading and preprocessing
125
+ - Memory management for large-scale training
126
+
127
+ 3. Testing framework:
128
+ - Unit tests for data transformations
129
+ - Integration tests for pipeline components
130
+ - Model quality tests (invariance, directional)
131
+ - Performance regression tests
132
+
133
+ Deliver production-ready, maintainable code with full test coverage.
134
+ </Task>
135
+
136
+ ## Phase 3: Production Deployment & Serving
137
+
138
+ <Task>
139
+ subagent_type: mlops-engineer
140
+ prompt: |
141
+ Design production deployment for models from: {phase2.ml-engineer.output}
142
+ With optimized code from: {phase2.python-pro.output}
143
+
144
+ Implementation requirements:
145
+ 1. Model serving infrastructure:
146
+ - REST/gRPC APIs with FastAPI/TorchServe
147
+ - Batch prediction pipelines (Airflow/Kubeflow)
148
+ - Stream processing (Kafka/Kinesis integration)
149
+ - Model serving platforms (KServe/Seldon Core)
150
+
151
+ 2. Deployment strategies:
152
+ - Blue-green deployments for zero downtime
153
+ - Canary releases with traffic splitting
154
+ - Shadow deployments for validation
155
+ - A/B testing infrastructure
156
+
157
+ 3. CI/CD pipeline:
158
+ - GitHub Actions/GitLab CI workflows
159
+ - Automated testing gates
160
+ - Model validation before deployment
161
+ - ArgoCD for GitOps deployment
162
+
163
+ 4. Infrastructure as Code:
164
+ - Terraform modules for cloud resources
165
+ - Helm charts for Kubernetes deployments
166
+ - Docker multi-stage builds for optimization
167
+ - Secret management with Vault/Secrets Manager
168
+
169
+ Provide complete deployment configuration and automation scripts.
170
+ </Task>
171
+
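"Canary releases with traffic splitting" usually happens in the serving layer or service mesh, but the core idea fits in a few lines. The sketch below assumes deterministic hash-based bucketing so that a given request ID always lands on the same variant; `route` and its parameters are hypothetical names.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of request IDs to the canary, the rest to stable."""
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash onto a bucket in [0, 100).
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "canary" if bucket < canary_percent else "stable"


# Illustrative request IDs only.
request_ids = [f"req-{i}" for i in range(10_000)]
canary_share = sum(route(r, 10) == "canary" for r in request_ids) / len(request_ids)
```

Hashing rather than random sampling keeps routing sticky per ID, which makes it easier to compare canary and stable behavior for the same caller.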
172
+ <Task>
173
+ subagent_type: kubernetes-architect
174
+ prompt: |
175
+ Design Kubernetes infrastructure for ML workloads from: {phase3.mlops-engineer.output}
176
+
177
+ Kubernetes-specific requirements:
178
+ 1. Workload orchestration:
179
+ - Training job scheduling with Kubeflow
180
+ - GPU resource allocation and sharing
181
+ - Spot/preemptible instance integration
182
+ - Priority classes and resource quotas
183
+
184
+ 2. Serving infrastructure:
185
+ - HPA/VPA for autoscaling
186
+ - KEDA for event-driven scaling
187
+ - Istio service mesh for traffic management
188
+ - Model caching and warm-up strategies
189
+
190
+ 3. Storage and data access:
191
+ - PVC strategies for training data
192
+ - Model artifact storage with CSI drivers
193
+ - Distributed storage for feature stores
194
+ - Cache layers for inference optimization
195
+
196
+ Provide Kubernetes manifests and Helm charts for entire ML platform.
197
+ </Task>
198
+
199
+ ## Phase 4: Monitoring & Continuous Improvement
200
+
201
+ <Task>
202
+ subagent_type: observability-engineer
203
+ prompt: |
204
+ Implement comprehensive monitoring for ML system deployed in: {phase3.mlops-engineer.output}
205
+ Using Kubernetes infrastructure: {phase3.kubernetes-architect.output}
206
+
207
+ Monitoring framework:
208
+ 1. Model performance monitoring:
209
+ - Prediction accuracy tracking
210
+ - Latency and throughput metrics
211
+ - Feature importance shifts
212
+ - Business KPI correlation
213
+
214
+ 2. Data and model drift detection:
215
+ - Statistical drift detection (KS test, PSI)
216
+ - Concept drift monitoring
217
+ - Feature distribution tracking
218
+ - Automated drift alerts and reports
219
+
220
+ 3. System observability:
221
+ - Prometheus metrics for all components
222
+ - Grafana dashboards for visualization
223
+ - Distributed tracing with Jaeger/Zipkin
224
+ - Log aggregation with ELK/Loki
225
+
226
+ 4. Alerting and automation:
227
+ - PagerDuty/Opsgenie integration
228
+ - Automated retraining triggers
229
+ - Performance degradation workflows
230
+ - Incident response runbooks
231
+
232
+ 5. Cost tracking:
233
+ - Resource utilization metrics
234
+ - Cost allocation by model/experiment
235
+ - Optimization recommendations
236
+ - Budget alerts and controls
237
+
238
+ Deliver monitoring configuration, dashboards, and alert rules.
239
+ </Task>
240
+
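Of the drift statistics named above, the Population Stability Index (PSI) is simple enough to sketch with the standard library. The assumptions here are equal-width bins derived from the baseline sample and a small floor `eps` to avoid taking the logarithm of zero; `psi` is a hypothetical helper, and production code would more likely rely on NumPy or a monitoring library.

```python
import math
from bisect import bisect_right


def psi(expected, actual, bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index of `actual` against the `expected` baseline."""
    lo, hi = min(expected), max(expected)
    if lo == hi:
        raise ValueError("baseline has no spread; cannot build bins")
    width = (hi - lo) / bins
    # Interior bin edges; values outside the baseline range fall in the end bins.
    edges = [lo + i * width for i in range(1, bins)]

    def proportions(values):
        counts = [0] * bins
        for value in values:
            counts[bisect_right(edges, value)] += 1
        # Floor empty bins at eps so the log ratio stays finite.
        return [max(count / len(values), eps) for count in counts]

    expected_share = proportions(expected)
    actual_share = proportions(actual)
    return sum(
        (a - e) * math.log(a / e) for e, a in zip(expected_share, actual_share)
    )


# Synthetic data for illustration: the "live" feature is shifted upward.
baseline = list(range(1000))
shifted = [value + 300 for value in baseline]
drift_score = psi(baseline, shifted)
```

A commonly quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but those cut-offs are conventions and any alert threshold should be tuned per feature.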
241
+ ## Configuration Options
242
+
243
+ - **experiment_tracking**: mlflow | wandb | neptune | clearml
244
+ - **feature_store**: feast | tecton | databricks | custom
245
+ - **serving_platform**: kserve | seldon | torchserve | triton
246
+ - **orchestration**: kubeflow | airflow | prefect | dagster
247
+ - **cloud_provider**: aws | azure | gcp | multi-cloud
248
+ - **deployment_mode**: realtime | batch | streaming | hybrid
249
+ - **monitoring_stack**: prometheus | datadog | newrelic | custom
250
+
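The diff does not show how these options are passed to the command or checked. As a sketch only, a caller could validate a selection against the values listed above before starting the workflow; `ALLOWED_OPTIONS` and `validate_config` are hypothetical names.

```python
from __future__ import annotations

# Allowed values copied from the Configuration Options list above.
ALLOWED_OPTIONS = {
    "experiment_tracking": {"mlflow", "wandb", "neptune", "clearml"},
    "feature_store": {"feast", "tecton", "databricks", "custom"},
    "serving_platform": {"kserve", "seldon", "torchserve", "triton"},
    "orchestration": {"kubeflow", "airflow", "prefect", "dagster"},
    "cloud_provider": {"aws", "azure", "gcp", "multi-cloud"},
    "deployment_mode": {"realtime", "batch", "streaming", "hybrid"},
    "monitoring_stack": {"prometheus", "datadog", "newrelic", "custom"},
}


def validate_config(config: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for key, value in config.items():
        if key not in ALLOWED_OPTIONS:
            errors.append(f"unknown option: {key}")
        elif value not in ALLOWED_OPTIONS[key]:
            allowed = " | ".join(sorted(ALLOWED_OPTIONS[key]))
            errors.append(f"{key}: {value!r} is not one of {allowed}")
    return errors


# Illustrative selection with one invalid value.
problems = validate_config(
    {"experiment_tracking": "mlflow", "serving_platform": "bentoml"}
)
```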
251
+ ## Success Criteria
252
+
253
+ 1. **Data Pipeline Success**:
254
+ - < 0.1% data quality issues in production
255
+ - Automated data validation passing 99.9% of the time
256
+ - Complete data lineage tracking
257
+ - Sub-second feature serving latency
258
+
259
+ 2. **Model Performance**:
260
+ - Meeting or exceeding baseline metrics
261
+ - < 5% performance degradation before retraining
262
+ - Successful A/B tests with statistical significance
263
+ - No model drift left undetected for more than 24 hours
264
+
265
+ 3. **Operational Excellence**:
266
+ - 99.9% uptime for model serving
267
+ - < 200ms p99 inference latency
268
+ - Automated rollback within 5 minutes
269
+ - Complete observability with < 1 minute alert time
270
+
271
+ 4. **Development Velocity**:
272
+ - < 1 hour from commit to production
273
+ - Parallel experiment execution
274
+ - Reproducible training runs
275
+ - Self-service model deployment
276
+
277
+ 5. **Cost Efficiency**:
278
+ - < 20% infrastructure waste
279
+ - Optimized resource allocation
280
+ - Automatic scaling based on load
281
+ - Spot instance utilization > 60%
282
+
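Targets such as "< 200ms p99 inference latency" only help if they are checked against measurements. The sketch below assumes the nearest-rank percentile definition and a plain list of latency samples in milliseconds; `percentile` and `meets_latency_target` are hypothetical helpers, and a real setup would normally compute this from Prometheus histograms instead.

```python
import math


def percentile(samples, q: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least q% at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def meets_latency_target(latencies_ms, target_ms: float = 200.0) -> bool:
    """True when the p99 latency is strictly below the target."""
    return percentile(latencies_ms, 99) < target_ms


# Synthetic samples for illustration: one slow request out of 100.
one_slow = [10.0] * 99 + [500.0]
two_slow = [10.0] * 98 + [500.0, 500.0]
ok_with_one_outlier = meets_latency_target(one_slow)
ok_with_two_outliers = meets_latency_target(two_slow)
```

With 100 samples, a single outlier sits above the 99th percentile and is ignored, while two outliers push p99 over the target; small sample windows make this criterion noisy.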
283
+ ## Final Deliverables
284
+
285
+ Upon completion, the orchestrated pipeline will provide:
286
+ - End-to-end ML pipeline with full automation
287
+ - Comprehensive documentation and runbooks
288
+ - Production-ready infrastructure as code
289
+ - Complete monitoring and alerting system
290
+ - CI/CD pipelines for continuous improvement
291
+ - Cost optimization and scaling strategies
292
+ - Disaster recovery and rollback procedures