omgkit 2.13.0 → 2.16.0

Files changed (138)
  1. package/README.md +129 -10
  2. package/package.json +2 -2
  3. package/plugin/agents/api-designer.md +5 -0
  4. package/plugin/agents/architect.md +8 -0
  5. package/plugin/agents/brainstormer.md +4 -0
  6. package/plugin/agents/cicd-manager.md +6 -0
  7. package/plugin/agents/code-reviewer.md +6 -0
  8. package/plugin/agents/copywriter.md +2 -0
  9. package/plugin/agents/data-engineer.md +255 -0
  10. package/plugin/agents/database-admin.md +10 -0
  11. package/plugin/agents/debugger.md +10 -0
  12. package/plugin/agents/devsecops.md +314 -0
  13. package/plugin/agents/docs-manager.md +4 -0
  14. package/plugin/agents/domain-decomposer.md +181 -0
  15. package/plugin/agents/embedded-systems.md +397 -0
  16. package/plugin/agents/fullstack-developer.md +12 -0
  17. package/plugin/agents/game-systems-designer.md +375 -0
  18. package/plugin/agents/git-manager.md +10 -0
  19. package/plugin/agents/journal-writer.md +2 -0
  20. package/plugin/agents/ml-engineer.md +284 -0
  21. package/plugin/agents/observability-engineer.md +353 -0
  22. package/plugin/agents/oracle.md +9 -0
  23. package/plugin/agents/performance-engineer.md +290 -0
  24. package/plugin/agents/pipeline-architect.md +6 -0
  25. package/plugin/agents/planner.md +12 -0
  26. package/plugin/agents/platform-engineer.md +325 -0
  27. package/plugin/agents/project-manager.md +3 -0
  28. package/plugin/agents/researcher.md +5 -0
  29. package/plugin/agents/scientific-computing.md +426 -0
  30. package/plugin/agents/scout.md +3 -0
  31. package/plugin/agents/security-auditor.md +7 -0
  32. package/plugin/agents/sprint-master.md +17 -0
  33. package/plugin/agents/tester.md +10 -0
  34. package/plugin/agents/ui-ux-designer.md +12 -0
  35. package/plugin/agents/vulnerability-scanner.md +6 -0
  36. package/plugin/commands/data/pipeline.md +47 -0
  37. package/plugin/commands/data/quality.md +49 -0
  38. package/plugin/commands/domain/analyze.md +34 -0
  39. package/plugin/commands/domain/map.md +41 -0
  40. package/plugin/commands/game/balance.md +56 -0
  41. package/plugin/commands/game/optimize.md +62 -0
  42. package/plugin/commands/iot/provision.md +58 -0
  43. package/plugin/commands/ml/evaluate.md +47 -0
  44. package/plugin/commands/ml/train.md +48 -0
  45. package/plugin/commands/perf/benchmark.md +54 -0
  46. package/plugin/commands/perf/profile.md +49 -0
  47. package/plugin/commands/platform/blueprint.md +56 -0
  48. package/plugin/commands/security/audit.md +54 -0
  49. package/plugin/commands/security/scan.md +55 -0
  50. package/plugin/commands/sre/dashboard.md +53 -0
  51. package/plugin/registry.yaml +787 -0
  52. package/plugin/skills/ai-ml/experiment-tracking/SKILL.md +338 -0
  53. package/plugin/skills/ai-ml/feature-stores/SKILL.md +340 -0
  54. package/plugin/skills/ai-ml/llm-ops/SKILL.md +454 -0
  55. package/plugin/skills/ai-ml/ml-pipelines/SKILL.md +390 -0
  56. package/plugin/skills/ai-ml/model-monitoring/SKILL.md +398 -0
  57. package/plugin/skills/ai-ml/model-serving/SKILL.md +386 -0
  58. package/plugin/skills/event-driven/cqrs-patterns/SKILL.md +348 -0
  59. package/plugin/skills/event-driven/event-sourcing/SKILL.md +334 -0
  60. package/plugin/skills/event-driven/kafka-deep/SKILL.md +252 -0
  61. package/plugin/skills/event-driven/saga-orchestration/SKILL.md +335 -0
  62. package/plugin/skills/event-driven/schema-registry/SKILL.md +328 -0
  63. package/plugin/skills/event-driven/stream-processing/SKILL.md +313 -0
  64. package/plugin/skills/game/game-audio/SKILL.md +446 -0
  65. package/plugin/skills/game/game-networking/SKILL.md +490 -0
  66. package/plugin/skills/game/godot-patterns/SKILL.md +413 -0
  67. package/plugin/skills/game/shader-programming/SKILL.md +492 -0
  68. package/plugin/skills/game/unity-patterns/SKILL.md +488 -0
  69. package/plugin/skills/iot/device-provisioning/SKILL.md +405 -0
  70. package/plugin/skills/iot/edge-computing/SKILL.md +369 -0
  71. package/plugin/skills/iot/industrial-protocols/SKILL.md +438 -0
  72. package/plugin/skills/iot/mqtt-deep/SKILL.md +418 -0
  73. package/plugin/skills/iot/ota-updates/SKILL.md +426 -0
  74. package/plugin/skills/microservices/api-gateway-patterns/SKILL.md +201 -0
  75. package/plugin/skills/microservices/circuit-breaker-patterns/SKILL.md +246 -0
  76. package/plugin/skills/microservices/contract-testing/SKILL.md +284 -0
  77. package/plugin/skills/microservices/distributed-tracing/SKILL.md +246 -0
  78. package/plugin/skills/microservices/service-discovery/SKILL.md +304 -0
  79. package/plugin/skills/microservices/service-mesh/SKILL.md +181 -0
  80. package/plugin/skills/mobile-advanced/mobile-ci-cd/SKILL.md +407 -0
  81. package/plugin/skills/mobile-advanced/mobile-security/SKILL.md +403 -0
  82. package/plugin/skills/mobile-advanced/offline-first/SKILL.md +473 -0
  83. package/plugin/skills/mobile-advanced/push-notifications/SKILL.md +494 -0
  84. package/plugin/skills/mobile-advanced/react-native-deep/SKILL.md +374 -0
  85. package/plugin/skills/simulation/numerical-methods/SKILL.md +434 -0
  86. package/plugin/skills/simulation/parallel-computing/SKILL.md +382 -0
  87. package/plugin/skills/simulation/physics-engines/SKILL.md +377 -0
  88. package/plugin/skills/simulation/validation-verification/SKILL.md +479 -0
  89. package/plugin/skills/simulation/visualization-scientific/SKILL.md +365 -0
  90. package/plugin/stdrules/ALIGNMENT_PRINCIPLE.md +240 -0
  91. package/plugin/workflows/ai-engineering/agent-development.md +3 -3
  92. package/plugin/workflows/ai-engineering/fine-tuning.md +3 -3
  93. package/plugin/workflows/ai-engineering/model-evaluation.md +3 -3
  94. package/plugin/workflows/ai-engineering/prompt-engineering.md +2 -2
  95. package/plugin/workflows/ai-engineering/rag-development.md +4 -4
  96. package/plugin/workflows/ai-ml/data-pipeline.md +188 -0
  97. package/plugin/workflows/ai-ml/experiment-cycle.md +203 -0
  98. package/plugin/workflows/ai-ml/feature-engineering.md +208 -0
  99. package/plugin/workflows/ai-ml/model-deployment.md +199 -0
  100. package/plugin/workflows/ai-ml/monitoring-setup.md +227 -0
  101. package/plugin/workflows/api/api-design.md +1 -1
  102. package/plugin/workflows/api/api-testing.md +2 -2
  103. package/plugin/workflows/content/technical-docs.md +1 -1
  104. package/plugin/workflows/database/migration.md +1 -1
  105. package/plugin/workflows/database/optimization.md +1 -1
  106. package/plugin/workflows/database/schema-design.md +3 -3
  107. package/plugin/workflows/development/bug-fix.md +3 -3
  108. package/plugin/workflows/development/code-review.md +2 -1
  109. package/plugin/workflows/development/feature.md +3 -3
  110. package/plugin/workflows/development/refactor.md +2 -2
  111. package/plugin/workflows/event-driven/consumer-groups.md +190 -0
  112. package/plugin/workflows/event-driven/event-storming.md +172 -0
  113. package/plugin/workflows/event-driven/replay-testing.md +186 -0
  114. package/plugin/workflows/event-driven/saga-implementation.md +206 -0
  115. package/plugin/workflows/event-driven/schema-evolution.md +173 -0
  116. package/plugin/workflows/fullstack/authentication.md +4 -4
  117. package/plugin/workflows/fullstack/full-feature.md +4 -4
  118. package/plugin/workflows/game-dev/content-pipeline.md +218 -0
  119. package/plugin/workflows/game-dev/platform-submission.md +263 -0
  120. package/plugin/workflows/game-dev/playtesting.md +237 -0
  121. package/plugin/workflows/game-dev/prototype-to-production.md +205 -0
  122. package/plugin/workflows/microservices/contract-first.md +151 -0
  123. package/plugin/workflows/microservices/distributed-tracing.md +166 -0
  124. package/plugin/workflows/microservices/domain-decomposition.md +123 -0
  125. package/plugin/workflows/microservices/integration-testing.md +149 -0
  126. package/plugin/workflows/microservices/service-mesh-setup.md +153 -0
  127. package/plugin/workflows/microservices/service-scaffolding.md +151 -0
  128. package/plugin/workflows/omega/1000x-innovation.md +2 -2
  129. package/plugin/workflows/omega/100x-architecture.md +2 -2
  130. package/plugin/workflows/omega/10x-improvement.md +2 -2
  131. package/plugin/workflows/quality/performance-optimization.md +2 -2
  132. package/plugin/workflows/research/best-practices.md +1 -1
  133. package/plugin/workflows/research/technology-research.md +1 -1
  134. package/plugin/workflows/security/penetration-testing.md +3 -3
  135. package/plugin/workflows/security/security-audit.md +3 -3
  136. package/plugin/workflows/sprint/sprint-execution.md +2 -2
  137. package/plugin/workflows/sprint/sprint-retrospective.md +1 -1
  138. package/plugin/workflows/sprint/sprint-setup.md +1 -1
@@ -0,0 +1,398 @@
+ # Model Monitoring
+
+ Data drift detection, model performance monitoring, explainability dashboards, and alerting systems.
+
+ ## Overview
+
+ Model monitoring helps keep ML models performing correctly in production by detecting drift, tracking performance, and alerting on anomalies.
+
+ ## Core Concepts
+
+ ### Types of Drift
+ - **Data Drift**: Input distribution changes
+ - **Concept Drift**: Relationship between inputs (X) and target (Y) changes
+ - **Prediction Drift**: Output distribution changes
+ - **Label Drift**: Ground truth distribution changes
+
+ ### Monitoring Dimensions
+ - **Data Quality**: Missing values, outliers, schema
+ - **Model Performance**: Accuracy, latency, throughput
+ - **Feature Health**: Statistical properties over time
+ - **Business Metrics**: Revenue impact, user engagement
+
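The data quality dimension above (missing values, outliers, schema) could be checked per batch along the following lines. This is a minimal sketch using only the standard library; the function name `check_batch`, the schema format, and the range bounds are illustrative assumptions and do not come from the package.

```python
from typing import Any, Dict, List, Tuple

def check_batch(
    rows: List[Dict[str, Any]],
    schema: Dict[str, type],
    ranges: Dict[str, Tuple[float, float]],
) -> Dict[str, Dict[str, int]]:
    """Count missing, wrongly typed, and out-of-range values per column."""
    report = {col: {"missing": 0, "wrong_type": 0, "out_of_range": 0} for col in schema}
    for row in rows:
        for col, expected_type in schema.items():
            value = row.get(col)
            if value is None:
                report[col]["missing"] += 1
            elif not isinstance(value, expected_type):
                report[col]["wrong_type"] += 1
            elif col in ranges:
                low, high = ranges[col]
                if not (low <= value <= high):
                    report[col]["out_of_range"] += 1
    return report

# Hypothetical batch: one null age, one implausible age, one mistyped country
rows = [
    {"age": 34, "country": "DE"},
    {"age": None, "country": "FR"},
    {"age": 230, "country": 7},
]
report = check_batch(rows, schema={"age": int, "country": str}, ranges={"age": (0, 120)})
print(report)
```

The per-column counts can be exported as metrics or compared against a tolerated failure rate before a batch reaches the model.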
+ ## Data Drift Detection
+
+ ### Statistical Tests
+ ```python
+ from scipy import stats
+ import numpy as np
+ from typing import Any, Dict
+
+ class DriftDetector:
+     def __init__(self, reference_data: np.ndarray):
+         self.reference = reference_data
+
+     def detect_drift(
+         self,
+         current_data: np.ndarray,
+         method: str = "ks"
+     ) -> Dict[str, Any]:
+         if method == "ks":
+             # Kolmogorov-Smirnov test on raw numerical samples
+             statistic, p_value = stats.ks_2samp(self.reference, current_data)
+         elif method == "chi2":
+             # Chi-squared test for categorical data: both arrays hold
+             # per-category counts in the same category order.
+             # Rescale the reference counts so observed and expected totals match.
+             expected = self.reference * current_data.sum() / self.reference.sum()
+             statistic, p_value = stats.chisquare(current_data, expected)
+         elif method == "psi":
+             # Population Stability Index (has no p-value)
+             statistic = self._calculate_psi(current_data)
+             p_value = None
+         else:
+             raise ValueError(f"Unknown drift method: {method}")
+
+         # Compare against None explicitly: a p-value of 0.0 is falsy
+         drift_detected = p_value < 0.05 if p_value is not None else statistic > 0.25
+
+         return {
+             "statistic": float(statistic),
+             "p_value": None if p_value is None else float(p_value),
+             "drift_detected": bool(drift_detected)
+         }
+
+     def _calculate_psi(self, current_data: np.ndarray) -> float:
+         # Bin both samples on edges derived from the reference data
+         bins = np.histogram_bin_edges(self.reference, bins=10)
+         ref_counts = np.histogram(self.reference, bins=bins)[0]
+         cur_counts = np.histogram(current_data, bins=bins)[0]
+
+         # Convert to proportions; the small constant avoids log(0) for empty bins
+         ref_pct = ref_counts / len(self.reference) + 0.0001
+         cur_pct = cur_counts / len(current_data) + 0.0001
+
+         psi = np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct))
+         return float(psi)
+ ```
+
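The PSI branch above sums `(cur_pct - ref_pct) * ln(cur_pct / ref_pct)` over bins. The same arithmetic can be followed on pre-binned counts without NumPy. The sketch below is a standalone illustration: the function name `psi`, the `eps` parameter, and the sample counts are assumptions made for this example.

```python
import math
from typing import Sequence

def psi(ref_counts: Sequence[int], cur_counts: Sequence[int], eps: float = 1e-4) -> float:
    """Population Stability Index over pre-binned counts (same bins for both samples)."""
    if len(ref_counts) != len(cur_counts):
        raise ValueError("both samples must use the same bins")
    ref_total = sum(ref_counts)
    cur_total = sum(cur_counts)
    total = 0.0
    for r, c in zip(ref_counts, cur_counts):
        r_pct = r / ref_total + eps  # eps keeps log() defined for empty bins
        c_pct = c / cur_total + eps
        total += (c_pct - r_pct) * math.log(c_pct / r_pct)
    return total

# Hypothetical bin counts: the second sample moves mass toward the lower bins
reference = [100, 200, 400, 200, 100]
same = psi(reference, reference)
shifted = psi(reference, [300, 300, 200, 100, 100])
print(f"identical: {same:.3f}, shifted: {shifted:.3f}")
```

Identical distributions give a PSI of zero, while the shifted sample lands above the 0.25 cutoff that `detect_drift` applies when no p-value is available.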
+ ### Evidently AI Integration
+ ```python
+ from evidently import ColumnMapping
+ from evidently.report import Report
+ from evidently.metric_preset import DataDriftPreset, DataQualityPreset
+ from evidently.metrics import (
+     DataDriftTable,
+     DatasetDriftMetric,
+     ColumnDriftMetric
+ )
+
+ # Define column mapping
+ column_mapping = ColumnMapping(
+     target="label",
+     prediction="prediction",
+     numerical_features=["feature1", "feature2", "feature3"],
+     categorical_features=["category1", "category2"]
+ )
+
+ # Create drift report
+ report = Report(metrics=[
+     DatasetDriftMetric(),
+     DataDriftTable(),
+     ColumnDriftMetric(column_name="feature1"),
+     ColumnDriftMetric(column_name="feature2")
+ ])
+
+ # reference_df and current_df are pandas DataFrames containing the columns
+ # named in the mapping above; they are assumed to be loaded elsewhere
+ report.run(
+     reference_data=reference_df,
+     current_data=current_df,
+     column_mapping=column_mapping
+ )
+
+ # Export results
+ report.save_html("drift_report.html")
+ drift_metrics = report.as_dict()
+ ```
+
+ ## Performance Monitoring
+
+ ### Metrics Tracking
+ ```python
+ from dataclasses import dataclass
+ from typing import Any, List, Optional
+ from datetime import datetime
+ import prometheus_client as prom
+
+ # Define metrics
+ PREDICTION_LATENCY = prom.Histogram(
+     "model_prediction_latency_seconds",
+     "Time spent processing prediction",
+     ["model_name", "model_version"],
+     buckets=[0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0]
+ )
+
+ PREDICTION_COUNT = prom.Counter(
+     "model_predictions_total",
+     "Total number of predictions",
+     ["model_name", "model_version", "prediction_class"]
+ )
+
+ PREDICTION_ERROR = prom.Counter(
+     "model_prediction_errors_total",
+     "Total prediction errors",
+     ["model_name", "model_version", "error_type"]
+ )
+
+ @dataclass
+ class PredictionLog:
+     request_id: str
+     model_name: str
+     model_version: str
+     features: dict
+     prediction: Any
+     probability: Optional[float]
+     latency_ms: float
+     timestamp: datetime
+
+ class ModelMonitor:
+     def __init__(self, model_name: str, model_version: str):
+         self.model_name = model_name
+         self.model_version = model_version
+
+     def log_prediction(self, log: PredictionLog):
+         # Record latency (the histogram is in seconds, the log in milliseconds)
+         PREDICTION_LATENCY.labels(
+             model_name=self.model_name,
+             model_version=self.model_version
+         ).observe(log.latency_ms / 1000)
+
+         # Record prediction class
+         PREDICTION_COUNT.labels(
+             model_name=self.model_name,
+             model_version=self.model_version,
+             prediction_class=str(log.prediction)
+         ).inc()
+
+         # Store for offline analysis
+         self._store_log(log)
+
+     def log_error(self, error_type: str):
+         PREDICTION_ERROR.labels(
+             model_name=self.model_name,
+             model_version=self.model_version,
+             error_type=error_type
+         ).inc()
+
+     def _store_log(self, log: PredictionLog):
+         # Placeholder: persist the log to a database or object store here
+         pass
+ ```
+
+ ### Ground Truth Comparison
+ ```python
+ import pandas as pd
+ from sklearn.metrics import (
+     accuracy_score, precision_score, recall_score,
+     f1_score, roc_auc_score, confusion_matrix
+ )
+ from datetime import datetime, timedelta
+
+ class PerformanceMonitor:
+     def __init__(self, db_connection):
+         self.db = db_connection
+
+     def calculate_metrics(
+         self,
+         start_date: datetime,
+         end_date: datetime
+     ) -> dict:
+         # Join predictions with ground truth
+         # (%s placeholders assume a driver that uses that parameter style)
+         query = """
+             SELECT p.prediction, p.probability, g.actual
+             FROM predictions p
+             JOIN ground_truth g ON p.request_id = g.request_id
+             WHERE p.timestamp BETWEEN %s AND %s
+         """
+
+         df = pd.read_sql(query, self.db, params=[start_date, end_date])
+
+         if len(df) == 0:
+             return {}
+
+         # AUC as computed here only applies to binary labels scored with a
+         # positive-class probability, and it needs every probability present
+         auc = None
+         if df["actual"].nunique() == 2 and df["probability"].notna().all():
+             auc = roc_auc_score(df["actual"], df["probability"])
+
+         return {
+             "accuracy": accuracy_score(df["actual"], df["prediction"]),
+             "precision": precision_score(df["actual"], df["prediction"], average="weighted"),
+             "recall": recall_score(df["actual"], df["prediction"], average="weighted"),
+             "f1": f1_score(df["actual"], df["prediction"], average="weighted"),
+             "auc": auc,
+             "sample_count": len(df),
+             "confusion_matrix": confusion_matrix(df["actual"], df["prediction"]).tolist()
+         }
+
+     def detect_performance_degradation(
+         self,
+         current_metrics: dict,
+         baseline_metrics: dict,
+         threshold: float = 0.05
+     ) -> bool:
+         # Flag degradation when any metric falls more than `threshold` below baseline
+         for metric in ["accuracy", "precision", "recall", "f1"]:
+             if current_metrics[metric] < baseline_metrics[metric] - threshold:
+                 return True
+         return False
+ ```
+
+ ## Alerting System
+
+ ### Alert Configuration
+ ```python
+ from dataclasses import dataclass
+ from enum import Enum
+ from typing import Callable, List
+ import smtplib
+ from email.mime.text import MIMEText
+
+ class AlertSeverity(Enum):
+     INFO = "info"
+     WARNING = "warning"
+     CRITICAL = "critical"
+
+ @dataclass
+ class AlertRule:
+     name: str
+     condition: Callable[[dict], bool]
+     severity: AlertSeverity
+     message_template: str
+
+ class AlertManager:
+     def __init__(self, rules: List[AlertRule]):
+         self.rules = rules
+         self.channels = []
+
+     def add_channel(self, channel):
+         self.channels.append(channel)
+
+     def evaluate(self, metrics: dict):
+         for rule in self.rules:
+             if rule.condition(metrics):
+                 alert = {
+                     "name": rule.name,
+                     "severity": rule.severity,
+                     "message": rule.message_template.format(**metrics),
+                     "metrics": metrics
+                 }
+                 self._send_alert(alert)
+
+     def _send_alert(self, alert: dict):
+         for channel in self.channels:
+             channel.send(alert)
+
+ # Define rules (the keyword must match the dataclass field: message_template)
+ rules = [
+     AlertRule(
+         name="accuracy_drop",
+         condition=lambda m: m.get("accuracy", 1.0) < 0.80,
+         severity=AlertSeverity.CRITICAL,
+         message_template="Model accuracy dropped to {accuracy:.2%}"
+     ),
+     AlertRule(
+         name="high_latency",
+         condition=lambda m: m.get("p99_latency_ms", 0) > 500,
+         severity=AlertSeverity.WARNING,
+         message_template="P99 latency is {p99_latency_ms}ms"
+     ),
+     AlertRule(
+         name="data_drift",
+         condition=lambda m: m.get("drift_score", 0) > 0.25,
+         severity=AlertSeverity.WARNING,
+         message_template="Data drift detected: PSI={drift_score:.3f}"
+     )
+ ]
+ ```
+
+ ### Slack Integration
+ ```python
+ import requests
+ from typing import Dict
+
+ # AlertSeverity is the enum defined in the Alert Configuration block
+
+ class SlackChannel:
+     def __init__(self, webhook_url: str):
+         self.webhook_url = webhook_url
+
+     def send(self, alert: Dict):
+         color = {
+             AlertSeverity.INFO: "#36a64f",
+             AlertSeverity.WARNING: "#ff9800",
+             AlertSeverity.CRITICAL: "#f44336"
+         }[alert["severity"]]
+
+         payload = {
+             "attachments": [{
+                 "color": color,
+                 "title": f":warning: {alert['name']}",
+                 "text": alert["message"],
+                 "fields": [
+                     {"title": k, "value": str(v), "short": True}
+                     for k, v in alert["metrics"].items()
+                 ],
+                 "footer": "ML Monitoring System"
+             }]
+         }
+
+         # Use a timeout so a slow webhook cannot block the caller,
+         # and surface HTTP errors instead of dropping alerts silently
+         response = requests.post(self.webhook_url, json=payload, timeout=10)
+         response.raise_for_status()
+ ```
+
+ ## Monitoring Dashboard
+
+ ### Grafana Queries
+ ```promql
+ # Prediction latency percentiles
+ histogram_quantile(0.99,
+   sum(rate(model_prediction_latency_seconds_bucket[5m])) by (le, model_name)
+ )
+
+ # Predictions per second
+ sum(rate(model_predictions_total[1m])) by (model_name, prediction_class)
+
+ # Error rate
+ sum(rate(model_prediction_errors_total[5m]))
+ /
+ sum(rate(model_predictions_total[5m]))
+
+ # Feature drift over time (custom metric)
+ model_feature_drift_psi{feature_name=~"feature.*"}
+ ```
+
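The first query relies on `histogram_quantile`, which estimates a quantile from cumulative bucket counts by interpolating linearly inside the bucket that contains the target rank. The Python sketch below illustrates that idea in simplified form. It is not Prometheus's actual implementation, and the function name `estimate_quantile` and the sample bucket counts are made up for this example.

```python
import math
from typing import List, Tuple

def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate quantile q from (upper_bound, cumulative_count) pairs sorted by bound.

    The last bucket is expected to be the +Inf bucket holding the total count.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for upper, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper):
                # Rank falls in the +Inf bucket: fall back to the highest finite bound
                return prev_bound
            in_bucket = cumulative - prev_count
            if in_bucket == 0:
                return upper
            # Linear interpolation between the bucket's lower and upper bound
            return prev_bound + (upper - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = upper, cumulative
    return prev_bound

# Hypothetical latency histogram: 100 observations in total
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p50 = estimate_quantile(0.50, buckets)
p95 = estimate_quantile(0.95, buckets)
print(f"p50={p50:.3f}s p95={p95:.3f}s")
```

Because the estimate is interpolated, its resolution depends on the bucket boundaries chosen for `model_prediction_latency_seconds`; a p99 that falls beyond the largest finite bucket cannot be resolved any further.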
+ ## Best Practices
+
+ 1. **Baseline Metrics**: Establish clear baselines
+ 2. **Granular Monitoring**: Per-segment analysis
+ 3. **Alert Fatigue**: Tune thresholds carefully
+ 4. **Root Cause Analysis**: Correlate metrics
+ 5. **Automated Remediation**: Retrain triggers
+
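Two of the practices above, avoiding alert fatigue and triggering retraining automatically, can be combined by requiring several consecutive threshold breaches before acting. The sketch below shows one way to do that. The class name `RetrainTrigger`, the 0.80 accuracy threshold, and the window of three evaluations are assumptions chosen for illustration.

```python
from collections import deque
from typing import Deque

class RetrainTrigger:
    """Fire only after `required` consecutive threshold breaches (hypothetical helper)."""

    def __init__(self, threshold: float, required: int = 3):
        self.threshold = threshold
        self.required = required
        self._recent: Deque[bool] = deque(maxlen=required)

    def update(self, accuracy: float) -> bool:
        """Record one evaluation window and report whether retraining should start."""
        self._recent.append(accuracy < self.threshold)
        fire = len(self._recent) == self.required and all(self._recent)
        if fire:
            self._recent.clear()  # start counting again after firing
        return fire

# Hypothetical accuracy readings from successive evaluation windows
trigger = RetrainTrigger(threshold=0.80, required=3)
decisions = [trigger.update(acc) for acc in [0.85, 0.78, 0.79, 0.82, 0.77, 0.76, 0.75]]
print(decisions)
```

A single noisy window does not fire the trigger; only the run of three low readings at the end does. The same debounce could sit in front of any alert rule.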
+ ## Monitoring Checklist
+
+ ```
+ □ Data quality checks (schema, nulls, ranges)
+ □ Feature distribution monitoring
+ □ Prediction distribution tracking
+ □ Latency monitoring (p50, p95, p99)
+ □ Error rate tracking
+ □ Ground truth collection pipeline
+ □ Performance metrics computation
+ □ Drift detection (statistical tests)
+ □ Alert rules configured
+ □ Dashboard created
+ □ Runbook documented
+ ```
+
+ ## Anti-Patterns
+
+ - Monitoring only accuracy
+ - Ignoring data quality
+ - Setting alert thresholds too sensitively
+ - Missing ground truth pipeline
+ - No historical comparison
+
+ ## When to Use
+
+ - Production ML models
+ - High-stakes predictions
+ - Regulatory requirements
+ - Continuous learning systems
+ - Multi-model environments
+
+ ## When NOT to Use
+
+ - Development/testing only
+ - Batch jobs with manual review
+ - Very stable domains
+ - Low-volume predictions