maestro-bundle 1.3.1 → 1.4.0
- package/package.json +1 -1
- package/templates/bundle-ai-agents/skills/agent-orchestration/SKILL.md +107 -41
- package/templates/bundle-ai-agents/skills/agent-orchestration/references/graph-patterns.md +50 -0
- package/templates/bundle-ai-agents/skills/agent-orchestration/references/routing-strategies.md +47 -0
- package/templates/bundle-ai-agents/skills/api-design/SKILL.md +125 -16
- package/templates/bundle-ai-agents/skills/api-design/references/pydantic-patterns.md +72 -0
- package/templates/bundle-ai-agents/skills/api-design/references/rest-conventions.md +51 -0
- package/templates/bundle-ai-agents/skills/clean-architecture/SKILL.md +113 -21
- package/templates/bundle-ai-agents/skills/clean-architecture/references/dependency-injection.md +60 -0
- package/templates/bundle-ai-agents/skills/clean-architecture/references/layer-rules.md +56 -0
- package/templates/bundle-ai-agents/skills/context-engineering/SKILL.md +104 -36
- package/templates/bundle-ai-agents/skills/context-engineering/references/compression-techniques.md +76 -0
- package/templates/bundle-ai-agents/skills/context-engineering/references/context-budget-calculator.md +45 -0
- package/templates/bundle-ai-agents/skills/database-modeling/SKILL.md +146 -19
- package/templates/bundle-ai-agents/skills/database-modeling/references/index-strategies.md +48 -0
- package/templates/bundle-ai-agents/skills/database-modeling/references/naming-conventions.md +27 -0
- package/templates/bundle-ai-agents/skills/docker-containerization/SKILL.md +124 -15
- package/templates/bundle-ai-agents/skills/docker-containerization/references/compose-patterns.md +97 -0
- package/templates/bundle-ai-agents/skills/docker-containerization/references/dockerfile-checklist.md +37 -0
- package/templates/bundle-ai-agents/skills/eval-testing/SKILL.md +113 -25
- package/templates/bundle-ai-agents/skills/eval-testing/references/eval-types.md +52 -0
- package/templates/bundle-ai-agents/skills/eval-testing/references/golden-dataset-template.md +59 -0
- package/templates/bundle-ai-agents/skills/memory-management/SKILL.md +112 -28
- package/templates/bundle-ai-agents/skills/memory-management/references/memory-tiers.md +41 -0
- package/templates/bundle-ai-agents/skills/memory-management/references/namespace-conventions.md +41 -0
- package/templates/bundle-ai-agents/skills/prompt-engineering/SKILL.md +139 -47
- package/templates/bundle-ai-agents/skills/prompt-engineering/references/anti-patterns.md +59 -0
- package/templates/bundle-ai-agents/skills/prompt-engineering/references/prompt-templates.md +75 -0
- package/templates/bundle-ai-agents/skills/rag-pipeline/SKILL.md +104 -27
- package/templates/bundle-ai-agents/skills/rag-pipeline/references/chunking-strategies.md +27 -0
- package/templates/bundle-ai-agents/skills/rag-pipeline/references/embedding-models.md +31 -0
- package/templates/bundle-ai-agents/skills/rag-pipeline/references/rag-evaluation.md +39 -0
- package/templates/bundle-ai-agents/skills/testing-strategy/SKILL.md +127 -18
- package/templates/bundle-ai-agents/skills/testing-strategy/references/fixture-patterns.md +81 -0
- package/templates/bundle-ai-agents/skills/testing-strategy/references/naming-conventions.md +69 -0
- package/templates/bundle-base/skills/branch-strategy/SKILL.md +134 -21
- package/templates/bundle-base/skills/branch-strategy/references/branch-rules.md +40 -0
- package/templates/bundle-base/skills/code-review/SKILL.md +123 -38
- package/templates/bundle-base/skills/code-review/references/review-checklist.md +45 -0
- package/templates/bundle-base/skills/commit-pattern/SKILL.md +98 -39
- package/templates/bundle-base/skills/commit-pattern/references/conventional-commits.md +40 -0
- package/templates/bundle-data-pipeline/skills/data-preprocessing/SKILL.md +110 -19
- package/templates/bundle-data-pipeline/skills/data-preprocessing/references/pandas-cheatsheet.md +63 -0
- package/templates/bundle-data-pipeline/skills/data-preprocessing/references/pandera-schemas.md +44 -0
- package/templates/bundle-data-pipeline/skills/docker-containerization/SKILL.md +132 -16
- package/templates/bundle-data-pipeline/skills/docker-containerization/references/compose-patterns.md +82 -0
- package/templates/bundle-data-pipeline/skills/docker-containerization/references/dockerfile-best-practices.md +57 -0
- package/templates/bundle-data-pipeline/skills/feature-engineering/SKILL.md +143 -45
- package/templates/bundle-data-pipeline/skills/feature-engineering/references/encoding-guide.md +41 -0
- package/templates/bundle-data-pipeline/skills/feature-engineering/references/scaling-guide.md +38 -0
- package/templates/bundle-data-pipeline/skills/mlops-pipeline/SKILL.md +156 -37
- package/templates/bundle-data-pipeline/skills/mlops-pipeline/references/mlflow-commands.md +69 -0
- package/templates/bundle-data-pipeline/skills/model-training/SKILL.md +152 -33
- package/templates/bundle-data-pipeline/skills/model-training/references/evaluation-metrics.md +52 -0
- package/templates/bundle-data-pipeline/skills/model-training/references/model-selection-guide.md +41 -0
- package/templates/bundle-data-pipeline/skills/rag-pipeline/SKILL.md +127 -39
- package/templates/bundle-data-pipeline/skills/rag-pipeline/references/chunking-strategies.md +51 -0
- package/templates/bundle-data-pipeline/skills/rag-pipeline/references/embedding-models.md +49 -0
- package/templates/bundle-frontend-spa/skills/authentication/SKILL.md +196 -13
- package/templates/bundle-frontend-spa/skills/authentication/references/jwt-security.md +41 -0
- package/templates/bundle-frontend-spa/skills/component-design/SKILL.md +191 -41
- package/templates/bundle-frontend-spa/skills/component-design/references/accessibility-checklist.md +41 -0
- package/templates/bundle-frontend-spa/skills/component-design/references/tailwind-patterns.md +65 -0
- package/templates/bundle-frontend-spa/skills/e2e-testing/SKILL.md +241 -79
- package/templates/bundle-frontend-spa/skills/e2e-testing/references/playwright-selectors.md +66 -0
- package/templates/bundle-frontend-spa/skills/e2e-testing/references/test-patterns.md +82 -0
- package/templates/bundle-frontend-spa/skills/integration-api/SKILL.md +221 -31
- package/templates/bundle-frontend-spa/skills/integration-api/references/api-patterns.md +81 -0
- package/templates/bundle-frontend-spa/skills/react-patterns/SKILL.md +195 -70
- package/templates/bundle-frontend-spa/skills/react-patterns/references/component-checklist.md +22 -0
- package/templates/bundle-frontend-spa/skills/react-patterns/references/hook-patterns.md +63 -0
- package/templates/bundle-frontend-spa/skills/responsive-layout/SKILL.md +162 -22
- package/templates/bundle-frontend-spa/skills/responsive-layout/references/breakpoint-guide.md +63 -0
- package/templates/bundle-frontend-spa/skills/state-management/SKILL.md +158 -30
- package/templates/bundle-frontend-spa/skills/state-management/references/react-query-config.md +64 -0
- package/templates/bundle-frontend-spa/skills/state-management/references/state-patterns.md +78 -0
- package/templates/bundle-jhipster-microservices/skills/ci-cd-pipeline/SKILL.md +135 -45
- package/templates/bundle-jhipster-microservices/skills/ci-cd-pipeline/references/gitlab-ci-templates.md +93 -0
- package/templates/bundle-jhipster-microservices/skills/clean-architecture/SKILL.md +87 -21
- package/templates/bundle-jhipster-microservices/skills/clean-architecture/references/layer-rules.md +78 -0
- package/templates/bundle-jhipster-microservices/skills/ddd-tactical/SKILL.md +94 -25
- package/templates/bundle-jhipster-microservices/skills/ddd-tactical/references/ddd-patterns.md +48 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-angular/SKILL.md +63 -21
- package/templates/bundle-jhipster-microservices/skills/jhipster-angular/references/angular-microservices.md +40 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-angular/references/angular-structure.md +59 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-docker-k8s/SKILL.md +125 -91
- package/templates/bundle-jhipster-microservices/skills/jhipster-docker-k8s/references/docker-k8s-commands.md +68 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-entities/SKILL.md +72 -20
- package/templates/bundle-jhipster-microservices/skills/jhipster-entities/references/cross-service-entities.md +36 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-entities/references/jdl-types.md +56 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-gateway/SKILL.md +80 -8
- package/templates/bundle-jhipster-microservices/skills/jhipster-gateway/references/gateway-config.md +43 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-kafka/SKILL.md +115 -22
- package/templates/bundle-jhipster-microservices/skills/jhipster-kafka/references/kafka-events.md +39 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-registry/SKILL.md +92 -23
- package/templates/bundle-jhipster-microservices/skills/jhipster-registry/references/consul-config.md +61 -0
- package/templates/bundle-jhipster-microservices/skills/jhipster-service/SKILL.md +81 -18
- package/templates/bundle-jhipster-microservices/skills/jhipster-service/references/service-patterns.md +40 -0
- package/templates/bundle-jhipster-microservices/skills/testing-strategy/SKILL.md +101 -20
- package/templates/bundle-jhipster-microservices/skills/testing-strategy/references/test-naming.md +55 -0
- package/templates/bundle-jhipster-monorepo/skills/clean-architecture/SKILL.md +87 -21
- package/templates/bundle-jhipster-monorepo/skills/clean-architecture/references/layer-rules.md +78 -0
- package/templates/bundle-jhipster-monorepo/skills/ddd-tactical/SKILL.md +94 -25
- package/templates/bundle-jhipster-monorepo/skills/ddd-tactical/references/ddd-patterns.md +48 -0
- package/templates/bundle-jhipster-monorepo/skills/jhipster-angular/SKILL.md +99 -52
- package/templates/bundle-jhipster-monorepo/skills/jhipster-angular/references/angular-structure.md +59 -0
- package/templates/bundle-jhipster-monorepo/skills/jhipster-entities/SKILL.md +89 -36
- package/templates/bundle-jhipster-monorepo/skills/jhipster-entities/references/jdl-types.md +56 -0
- package/templates/bundle-jhipster-monorepo/skills/jhipster-liquibase/SKILL.md +123 -23
- package/templates/bundle-jhipster-monorepo/skills/jhipster-liquibase/references/liquibase-operations.md +95 -0
- package/templates/bundle-jhipster-monorepo/skills/jhipster-security/SKILL.md +106 -19
- package/templates/bundle-jhipster-monorepo/skills/jhipster-security/references/security-checklist.md +47 -0
- package/templates/bundle-jhipster-monorepo/skills/jhipster-spring/SKILL.md +84 -16
- package/templates/bundle-jhipster-monorepo/skills/jhipster-spring/references/spring-layers.md +41 -0
- package/templates/bundle-jhipster-monorepo/skills/testing-strategy/SKILL.md +101 -20
- package/templates/bundle-jhipster-monorepo/skills/testing-strategy/references/test-naming.md +55 -0
|
@@ -1,77 +1,196 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: mlops-pipeline
|
|
3
|
-
description:
|
|
3
|
+
description: Build MLOps pipelines with MLflow for experiment tracking, model registry, and automated deployment. Use when you need to version models, track experiments, automate training pipelines, or configure a model registry.
|
|
4
|
+
version: 1.0.0
|
|
5
|
+
author: Maestro
|
|
4
6
|
---
|
|
5
7
|
|
|
6
8
|
# MLOps Pipeline
|
|
7
9
|
|
|
8
|
-
|
|
10
|
+
Set up end-to-end MLOps workflows using MLflow for experiment tracking, model versioning, and automated training pipelines.
|
|
9
11
|
|
|
12
|
+
## When to Use
|
|
13
|
+
- User needs to track experiments (parameters, metrics, artifacts)
|
|
14
|
+
- User wants to version and register models
|
|
15
|
+
- User needs to compare runs and select the best model
|
|
16
|
+
- User wants to automate a training pipeline with promotion logic
|
|
17
|
+
- User needs to serve a model via MLflow
|
|
18
|
+
|
|
19
|
+
## Available Operations
|
|
20
|
+
1. Set up MLflow tracking server
|
|
21
|
+
2. Log experiments (params, metrics, artifacts, models)
|
|
22
|
+
3. Register models in the Model Registry
|
|
23
|
+
4. Promote models through stages (Staging -> Production)
|
|
24
|
+
5. Build an automated training pipeline with comparison logic
|
|
25
|
+
6. Serve a model via MLflow REST API
|
|
26
|
+
|
|
27
|
+
## Multi-Step Workflow
|
|
28
|
+
|
|
29
|
+
### Step 1: Install Dependencies
|
|
30
|
+
```bash
|
|
31
|
+
pip install mlflow scikit-learn pandas boto3
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
### Step 2: Start MLflow Tracking Server (Local Development)
|
|
35
|
+
```bash
|
|
36
|
+
mlflow server --host 0.0.0.0 --port 5000 --backend-store-uri sqlite:///mlflow.db --default-artifact-root ./mlflow-artifacts
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
For production, use a remote tracking URI:
|
|
40
|
+
```bash
|
|
41
|
+
export MLFLOW_TRACKING_URI=http://mlflow.your-domain.com
|
|
42
|
+
```
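One way to keep the local and production setups interchangeable is to resolve the tracking URI from the environment and fall back to the local server from Step 2. The following is a minimal sketch; the helper name `resolve_tracking_uri` is an assumption for illustration, and its result would be passed to `mlflow.set_tracking_uri`.

```python
import os

# Local server address from Step 2; used when the environment variable is unset.
DEFAULT_TRACKING_URI = "http://localhost:5000"


def resolve_tracking_uri(env=None):
    """Return MLFLOW_TRACKING_URI if set and non-empty, else the local default."""
    env = os.environ if env is None else env
    uri = env.get("MLFLOW_TRACKING_URI", "").strip()
    return uri or DEFAULT_TRACKING_URI
```

Passing the environment as an argument keeps the helper easy to test without touching the real process environment.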
|
|
43
|
+
|
|
44
|
+
### Step 3: Create an Experiment and Log a Run
|
|
10
45
|
```python
|
|
11
46
|
import mlflow
|
|
47
|
+
from sklearn.ensemble import RandomForestClassifier
|
|
48
|
+
from sklearn.metrics import accuracy_score, f1_score, precision_score
|
|
12
49
|
|
|
13
|
-
mlflow.set_tracking_uri("http://
|
|
14
|
-
mlflow.set_experiment("
|
|
50
|
+
mlflow.set_tracking_uri("http://localhost:5000")
|
|
51
|
+
mlflow.set_experiment("my-classifier")
|
|
15
52
|
|
|
16
|
-
with mlflow.start_run(run_name="rf-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
"cv_folds": 5
|
|
21
|
-
})
|
|
53
|
+
with mlflow.start_run(run_name="rf-baseline"):
|
|
54
|
+
    # Log parameters
|
|
55
|
+
    params = {"n_estimators": 200, "max_depth": 20, "cv_folds": 5}
|
|
56
|
+
    mlflow.log_params(params)
|
|
22
57
|
|
|
58
|
+
    # Train model
|
|
59
|
+
    model = RandomForestClassifier(**{k: v for k, v in params.items() if k != "cv_folds"}, random_state=42)
|
|
23
60
|
    model.fit(X_train, y_train)
|
|
24
61
|
    y_pred = model.predict(X_test)
|
|
25
62
|
|
|
26
|
-
|
|
63
|
+
    # Log metrics
|
|
64
|
+
    metrics = {
|
|
27
65
|
        "accuracy": accuracy_score(y_test, y_pred),
|
|
28
|
-
"f1": f1_score(y_test, y_pred, average=
|
|
29
|
-
"precision": precision_score(y_test, y_pred, average=
|
|
30
|
-
}
|
|
66
|
+
        "f1": f1_score(y_test, y_pred, average="weighted"),
|
|
67
|
+
        "precision": precision_score(y_test, y_pred, average="weighted"),
|
|
68
|
+
    }
|
|
69
|
+
    mlflow.log_metrics(metrics)
|
|
70
|
+
    print(f"Metrics: {metrics}")
|
|
31
71
|
|
|
72
|
+
    # Log model
|
|
32
73
|
    mlflow.sklearn.log_model(model, "model")
|
|
74
|
+
    print(f"Run ID: {mlflow.active_run().info.run_id}")
|
|
75
|
+
```
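The `average="weighted"` argument used for F1 and precision computes the metric per class and then averages the per-class values, weighting each by the number of true samples of that class. A dependency-free sketch of that calculation follows; the function name `weighted_f1` is illustrative and not part of scikit-learn.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        # True positives and false positives for this label.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += f1 * count / total
    return score
```

Because the weights follow class frequency, a model that does well on the majority class can still score high here, so it is worth reading the per-class numbers too.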
|
|
76
|
+
|
|
77
|
+
### Step 4: Compare Runs
|
|
78
|
+
```bash
|
|
79
|
+
mlflow runs list --experiment-id 1
|
|
33
80
|
```
|
|
34
81
|
|
|
35
|
-
|
|
82
|
+
Or programmatically:
|
|
83
|
+
```python
|
|
84
|
+
import mlflow
|
|
85
|
+
|
|
86
|
+
experiment = mlflow.get_experiment_by_name("my-classifier")
|
|
87
|
+
runs = mlflow.search_runs(experiment_ids=[experiment.experiment_id], order_by=["metrics.f1 DESC"])
|
|
88
|
+
print(runs[["run_id", "params.n_estimators", "metrics.f1", "metrics.accuracy"]].head(10))
|
|
89
|
+
```
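Ordering by `metrics.f1 DESC` places the highest-F1 run first, which is why the next step can take row 0 as the best run. The same selection can be sketched on plain records; the run IDs and metric values below are hypothetical, and the helper `order_runs_by_metric` is an illustration rather than an MLflow function.

```python
def order_runs_by_metric(runs, metric, descending=True):
    """Sort run records by one metric; runs missing the metric go last."""
    with_metric = [r for r in runs if metric in r["metrics"]]
    without_metric = [r for r in runs if metric not in r["metrics"]]
    with_metric.sort(key=lambda r: r["metrics"][metric], reverse=descending)
    return with_metric + without_metric


# Hypothetical run records, for illustration only.
example_runs = [
    {"run_id": "a1", "metrics": {"f1": 0.81}},
    {"run_id": "b2", "metrics": {"f1": 0.88}},
    {"run_id": "c3", "metrics": {}},
]
ordered = order_runs_by_metric(example_runs, "f1")
best_run_id = ordered[0]["run_id"]
```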
|
|
36
90
|
|
|
91
|
+
### Step 5: Register the Best Model
|
|
37
92
|
```python
|
|
38
|
-
#
|
|
93
|
+
run_id = runs.iloc[0]["run_id"] # best run by F1
|
|
39
94
|
model_uri = f"runs:/{run_id}/model"
|
|
40
|
-
mlflow.register_model(model_uri, "compliance-classifier")
|
|
41
95
|
|
|
42
|
-
|
|
96
|
+
result = mlflow.register_model(model_uri, "my-classifier")
|
|
97
|
+
print(f"Registered model version: {result.version}")
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
### Step 6: Promote to Production
|
|
101
|
+
```python
|
|
43
102
|
client = mlflow.MlflowClient()
|
|
103
|
+
|
|
104
|
+
# Move to staging first
|
|
44
105
|
client.transition_model_version_stage(
|
|
45
|
-
name="
|
|
46
|
-
version=2,
|
|
47
|
-
stage="Production"
|
|
106
|
+
    name="my-classifier", version=result.version, stage="Staging"
|
|
48
107
|
)
|
|
49
|
-
|
|
108
|
+
print(f"Model v{result.version} moved to Staging")
|
|
50
109
|
|
|
51
|
-
|
|
110
|
+
# After validation, promote to production
|
|
111
|
+
client.transition_model_version_stage(
|
|
112
|
+
    name="my-classifier", version=result.version, stage="Production"
|
|
113
|
+
)
|
|
114
|
+
print(f"Model v{result.version} promoted to Production")
|
|
115
|
+
```
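The two transitions above encode a simple rule: a newly registered version moves to Staging first, and only a validated version moves on to Production. A minimal sketch of that rule as a pure function follows; `next_stage` and the `validated` flag are assumptions for illustration, not MLflow API.

```python
# Stage order used in the workflow above: a new version starts at "None".
STAGE_ORDER = ["None", "Staging", "Production"]


def next_stage(current_stage, validated):
    """Return the stage a model version may move to, or the current one if blocked."""
    index = STAGE_ORDER.index(current_stage)
    if index == len(STAGE_ORDER) - 1:
        return current_stage  # already in Production
    target = STAGE_ORDER[index + 1]
    # Only the Staging -> Production step requires validation in this sketch.
    if target == "Production" and not validated:
        return current_stage
    return target
```

The returned stage name could then be passed as the `stage` argument of `transition_model_version_stage`.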
|
|
52
116
|
|
|
117
|
+
### Step 7: Build Automated Training Pipeline
|
|
53
118
|
```python
|
|
54
119
|
# pipelines/training.py
|
|
55
|
-
|
|
56
|
-
|
|
120
|
+
import mlflow
import pandas as pd
|
|
121
|
+
from sklearn.model_selection import train_test_split
|
|
57
122
|
|
|
58
|
-
|
|
59
|
-
|
|
123
|
+
def training_pipeline(data_path: str, experiment_name: str, model_name: str):
|
|
124
|
+
    """End-to-end pipeline: load -> train -> evaluate -> register if better."""
|
|
125
|
+
    mlflow.set_experiment(experiment_name)
|
|
60
126
|
|
|
61
|
-
#
|
|
62
|
-
|
|
63
|
-
|
|
127
|
+
    # 1. Load and split
|
|
128
|
+
    df = pd.read_parquet(data_path)
|
|
129
|
+
    X = df.drop(columns=["target"])
|
|
130
|
+
    y = df["target"]
|
|
131
|
+
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
|
|
64
132
|
|
|
65
|
-
#
|
|
133
|
+
    # 2. Train with tracking
|
|
66
134
|
    with mlflow.start_run():
|
|
67
135
|
        model = train_model(X_train, y_train)
|
|
68
136
|
        metrics = evaluate_model(model, X_test, y_test)
|
|
69
137
|
        mlflow.log_metrics(metrics)
|
|
138
|
+
        mlflow.sklearn.log_model(model, "model")
|
|
139
|
+
|
|
140
|
+
        # 3. Compare with current production
|
|
141
|
+
        client = mlflow.MlflowClient()
|
|
142
|
+
        try:
|
|
143
|
+
            prod_versions = client.get_latest_versions(model_name, stages=["Production"])
|
|
144
|
+
            prod_run = client.get_run(prod_versions[0].run_id)
|
|
145
|
+
            prod_f1 = float(prod_run.data.metrics.get("f1", 0))
|
|
146
|
+
        except Exception:
|
|
147
|
+
            prod_f1 = 0.0
|
|
148
|
+
|
|
149
|
+
        # 4. Register if better
|
|
150
|
+
        if metrics["f1"] > prod_f1:
|
|
151
|
+
            result = mlflow.register_model(f"runs:/{mlflow.active_run().info.run_id}/model", model_name)
|
|
152
|
+
            client.transition_model_version_stage(name=model_name, version=result.version, stage="Staging")
|
|
153
|
+
            print(f"New candidate v{result.version}: F1={metrics['f1']:.3f} > Production F1={prod_f1:.3f}")
|
|
154
|
+
        else:
|
|
155
|
+
            print(f"Model not better: F1={metrics['f1']:.3f} <= Production F1={prod_f1:.3f}")
|
|
156
|
+
|
|
157
|
+
if __name__ == "__main__":
|
|
158
|
+
    training_pipeline("data/processed/dataset.parquet", "my-classifier", "my-classifier")
|
|
159
|
+
```
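The promotion decision in the pipeline can be isolated into a small pure function, which makes the comparison rule easy to unit-test without an MLflow server. This is a sketch under that assumption; `should_register` is a hypothetical name, and the `0.0` fallback mirrors the `except` branch in the pipeline.

```python
def should_register(candidate_f1, production_f1=None):
    """Register only when the candidate strictly beats production on F1.

    production_f1 is None when no Production version exists; it is treated
    as 0.0, mirroring the fallback in the pipeline.
    """
    baseline = 0.0 if production_f1 is None else float(production_f1)
    return candidate_f1 > baseline
```

The strict `>` means a tie keeps the current production model, which avoids churn when a retrain gives an identical score.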
|
|
70
160
|
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
mlflow.sklearn.log_model(model, "model")
|
|
75
|
-
register_as_candidate(model)
|
|
76
|
-
notify_team("New candidate model available")
|
|
161
|
+
### Step 8: Serve Model via REST API
|
|
162
|
+
```bash
|
|
163
|
+
mlflow models serve -m "models:/my-classifier/Production" --port 5001 --no-conda
|
|
77
164
|
```
|
|
165
|
+
|
|
166
|
+
Test the endpoint:
|
|
167
|
+
```bash
|
|
168
|
+
curl -X POST http://localhost:5001/invocations -H "Content-Type: application/json" -d '{"inputs": [{"age": 30, "salary": 50000}]}'
|
|
169
|
+
```
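The same request can be assembled from Python with the standard library, which is convenient in smoke tests. The sketch below only builds the request; the helper name `build_invocation_request` is an assumption, and the URL and payload are the ones from the curl command.

```python
import json
import urllib.request


def build_invocation_request(rows, url="http://localhost:5001/invocations"):
    """Build (but do not send) a POST request carrying rows under the "inputs" key."""
    body = json.dumps({"inputs": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invocation_request([{"age": 30, "salary": 50000}])
# To send it against a running server: urllib.request.urlopen(request).read()
```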
|
|
170
|
+
|
|
171
|
+
## Resources
|
|
172
|
+
- `references/mlflow-commands.md` - MLflow CLI and Python API quick reference
|
|
173
|
+
|
|
174
|
+
## Examples
|
|
175
|
+
### Example 1: Track a New Experiment
|
|
176
|
+
User asks: "Set up experiment tracking for our fraud detection model"
|
|
177
|
+
Response approach:
|
|
178
|
+
1. Install mlflow and set tracking URI
|
|
179
|
+
2. Create experiment with `mlflow.set_experiment("fraud-detection")`
|
|
180
|
+
3. Log params, metrics, and model inside `mlflow.start_run()`
|
|
181
|
+
4. Show how to view results in MLflow UI at http://localhost:5000
|
|
182
|
+
|
|
183
|
+
### Example 2: Promote a Model
|
|
184
|
+
User asks: "Our latest model passed validation, deploy it to production"
|
|
185
|
+
Response approach:
|
|
186
|
+
1. Find the latest Staging model version with `get_latest_versions`
|
|
187
|
+
2. Transition to Production with `transition_model_version_stage`
|
|
188
|
+
3. Verify with `mlflow models serve`
|
|
189
|
+
4. Test the endpoint with curl
|
|
190
|
+
|
|
191
|
+
## Notes
|
|
192
|
+
- Always log both parameters and metrics for every run
|
|
193
|
+
- Use `mlflow.autolog()` for automatic logging with sklearn, pytorch, etc.
|
|
194
|
+
- Set meaningful run names for easier comparison in the UI
|
|
195
|
+
- Never skip the baseline comparison step when training new models
|
|
196
|
+
- For team setups, use a shared PostgreSQL backend instead of SQLite
|
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
# MLflow Quick Reference
|
|
2
|
+
|
|
3
|
+
## CLI Commands
|
|
4
|
+
```bash
|
|
5
|
+
# Start tracking server
|
|
6
|
+
mlflow server --host 0.0.0.0 --port 5000 --backend-store-uri sqlite:///mlflow.db
|
|
7
|
+
|
|
8
|
+
# List experiments
|
|
9
|
+
mlflow experiments list
|
|
10
|
+
|
|
11
|
+
# List runs in an experiment
|
|
12
|
+
mlflow runs list --experiment-id 1
|
|
13
|
+
|
|
14
|
+
# Serve a model
|
|
15
|
+
mlflow models serve -m "models:/model-name/Production" --port 5001 --no-conda
|
|
16
|
+
|
|
17
|
+
# Download artifacts
|
|
18
|
+
mlflow artifacts download -r <run-id> -d ./downloaded-artifacts
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
## Python API - Tracking
|
|
22
|
+
```python
|
|
23
|
+
import mlflow
|
|
24
|
+
|
|
25
|
+
mlflow.set_tracking_uri("http://localhost:5000")
|
|
26
|
+
mlflow.set_experiment("experiment-name")
|
|
27
|
+
|
|
28
|
+
# Auto-log everything (sklearn, pytorch, etc.)
|
|
29
|
+
mlflow.autolog()
|
|
30
|
+
|
|
31
|
+
# Manual logging
|
|
32
|
+
with mlflow.start_run(run_name="descriptive-name"):
|
|
33
|
+
    mlflow.log_param("learning_rate", 0.01)
|
|
34
|
+
    mlflow.log_params({"epochs": 100, "batch_size": 32})
|
|
35
|
+
    mlflow.log_metric("loss", 0.5)
|
|
36
|
+
    mlflow.log_metrics({"accuracy": 0.95, "f1": 0.92})
|
|
37
|
+
    mlflow.log_artifact("path/to/file.csv")
|
|
38
|
+
    mlflow.sklearn.log_model(model, "model")
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
## Python API - Model Registry
|
|
42
|
+
```python
|
|
43
|
+
client = mlflow.MlflowClient()
|
|
44
|
+
|
|
45
|
+
# Register model
|
|
46
|
+
mlflow.register_model("runs:/<run-id>/model", "model-name")
|
|
47
|
+
|
|
48
|
+
# List versions
|
|
49
|
+
versions = client.search_model_versions("name='model-name'")
|
|
50
|
+
|
|
51
|
+
# Get latest production version
|
|
52
|
+
prod = client.get_latest_versions("model-name", stages=["Production"])
|
|
53
|
+
|
|
54
|
+
# Transition stage
|
|
55
|
+
client.transition_model_version_stage("model-name", version=1, stage="Production")
|
|
56
|
+
|
|
57
|
+
# Load production model
|
|
58
|
+
model = mlflow.sklearn.load_model("models:/model-name/Production")
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
## Python API - Search Runs
|
|
62
|
+
```python
|
|
63
|
+
runs = mlflow.search_runs(
|
|
64
|
+
experiment_ids=["1"],
|
|
65
|
+
filter_string="metrics.f1 > 0.9",
|
|
66
|
+
order_by=["metrics.f1 DESC"],
|
|
67
|
+
max_results=10
|
|
68
|
+
)
|
|
69
|
+
```
|
|
@@ -1,68 +1,187 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: model-training
|
|
3
|
-
description:
|
|
3
|
+
description: Train ML models with scikit-learn including preprocessing pipelines, cross-validation, hyperparameter tuning, and evaluation. Use when you need to train a classifier or regressor, run cross-validation, tune hyperparameters, or compare models against baselines.
|
|
4
|
+
version: 1.0.0
|
|
5
|
+
author: Maestro
|
|
4
6
|
---
|
|
5
7
|
|
|
6
8
|
# Model Training
|
|
7
9
|
|
|
8
|
-
|
|
10
|
+
Train, evaluate, and export ML models using scikit-learn pipelines with proper cross-validation and hyperparameter tuning.
|
|
9
11
|
|
|
12
|
+
## When to Use
|
|
13
|
+
- User wants to train a classification or regression model
|
|
14
|
+
- User needs cross-validation scores for model selection
|
|
15
|
+
- User wants to tune hyperparameters with GridSearch or RandomizedSearch
|
|
16
|
+
- User needs to compare a model against a baseline
|
|
17
|
+
- User wants to save a trained model for deployment
|
|
18
|
+
|
|
19
|
+
## Available Operations
|
|
20
|
+
1. Build a full sklearn Pipeline (preprocessing + model)
|
|
21
|
+
2. Run cross-validation with multiple scoring metrics
|
|
22
|
+
3. Tune hyperparameters with GridSearchCV or RandomizedSearchCV
|
|
23
|
+
4. Evaluate on held-out test set with classification_report / regression metrics
|
|
24
|
+
5. Compare against baseline (DummyClassifier/DummyRegressor)
|
|
25
|
+
6. Save the best model with joblib
|
|
26
|
+
|
|
27
|
+
## Multi-Step Workflow
|
|
28
|
+
|
|
29
|
+
### Step 1: Install Dependencies
|
|
30
|
+
```bash
|
|
31
|
+
pip install scikit-learn pandas numpy joblib
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
### Step 2: Load Prepared Data
|
|
35
|
+
```python
|
|
36
|
+
import pandas as pd
|
|
37
|
+
from sklearn.model_selection import train_test_split
|
|
38
|
+
|
|
39
|
+
df = pd.read_parquet("data/processed/dataset_clean.parquet")
|
|
40
|
+
X = df.drop(columns=["target"])
|
|
41
|
+
y = df["target"]
|
|
42
|
+
|
|
43
|
+
X_train, X_test, y_train, y_test = train_test_split(
|
|
44
|
+
X, y, test_size=0.2, random_state=42, stratify=y # stratify for classification
|
|
45
|
+
)
|
|
46
|
+
print(f"Train: {X_train.shape}, Test: {X_test.shape}")
|
|
47
|
+
print(f"Class distribution:\n{y_train.value_counts(normalize=True)}")
|
|
48
|
+
```
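Stratification keeps the class proportions roughly equal in the train and test sets, which matters when one class is rare. The idea can be sketched without scikit-learn by splitting each class separately. This is an illustration of the principle, not scikit-learn's exact algorithm, and `stratified_split` is a hypothetical helper.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with each class split at roughly test_size."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle within the class only
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)
```

With 80 samples of one class and 20 of another, the test set receives 16 and 4 of them respectively, preserving the 80/20 ratio.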
|
|
49
|
+
|
|
50
|
+
### Step 3: Build Preprocessing + Model Pipeline
|
|
10
51
|
```python
|
|
11
52
|
from sklearn.pipeline import Pipeline
|
|
12
53
|
from sklearn.preprocessing import StandardScaler, OneHotEncoder
|
|
13
54
|
from sklearn.compose import ColumnTransformer
|
|
14
|
-
from sklearn.model_selection import cross_val_score, GridSearchCV
|
|
15
55
|
from sklearn.ensemble import RandomForestClassifier
|
|
16
|
-
from sklearn.metrics import classification_report
|
|
17
|
-
import joblib
|
|
18
56
|
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
categorical_features = ['department', 'role']
|
|
57
|
+
numeric_features = ["age", "salary", "experience"]
|
|
58
|
+
categorical_features = ["department", "role"]
|
|
22
59
|
|
|
23
60
|
preprocessor = ColumnTransformer(
|
|
24
61
|
transformers=[
|
|
25
|
-
(
|
|
26
|
-
(
|
|
62
|
+
("num", StandardScaler(), numeric_features),
|
|
63
|
+
("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
|
|
27
64
|
]
|
|
28
65
|
)
|
|
29
66
|
|
|
30
|
-
# 2. Pipeline
|
|
31
67
|
pipeline = Pipeline([
|
|
32
|
-
(
|
|
33
|
-
(
|
|
68
|
+
("preprocessor", preprocessor),
|
|
69
|
+
("classifier", RandomForestClassifier(random_state=42)),
|
|
34
70
|
])
|
|
71
|
+
```
|
|
35
72
|
|
|
36
|
-
|
|
37
|
-
|
|
38
|
-
|
|
73
|
+
### Step 4: Run Cross-Validation
|
|
74
|
+
```python
|
|
75
|
+
from sklearn.model_selection import cross_val_score
|
|
76
|
+
|
|
77
|
+
scores = cross_val_score(pipeline, X_train, y_train, cv=5, scoring="f1_weighted")
|
|
78
|
+
print(f"F1 Score (5-fold CV): {scores.mean():.3f} (+/- {scores.std():.3f})")
|
|
79
|
+
```
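With `cv=5`, the training data is partitioned into five folds, and each fold serves once as the validation set while the other four are used for fitting. The sketch below shows the plain contiguous split and the mean and standard deviation summary; for classifiers scikit-learn uses a stratified variant by default, so treat this as an illustration of the mechanics. The helper names are assumptions.

```python
import statistics


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    # The first n_samples % k folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def summarize(scores):
    """Mean and population standard deviation of the per-fold scores."""
    return statistics.mean(scores), statistics.pstdev(scores)
```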
|
|
80
|
+
|
|
81
|
+
### Step 5: Compare Against Baseline
|
|
82
|
+
```python
|
|
83
|
+
from sklearn.dummy import DummyClassifier
|
|
84
|
+
|
|
85
|
+
baseline = DummyClassifier(strategy="most_frequent")
|
|
86
|
+
baseline.fit(X_train.select_dtypes(include="number"), y_train)
|
|
87
|
+
baseline_score = baseline.score(X_test.select_dtypes(include="number"), y_test)
|
|
88
|
+
|
|
89
|
+
print(f"Baseline accuracy: {baseline_score:.3f}")
|
|
90
|
+
```
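The `most_frequent` strategy ignores the features entirely and always predicts the most common training label, so its accuracy equals the share of that label in the test set. A dependency-free sketch of that calculation, with a hypothetical helper name:

```python
from collections import Counter


def most_frequent_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most common training label."""
    majority_label, _ = Counter(y_train).most_common(1)[0]
    hits = sum(1 for label in y_test if label == majority_label)
    return hits / len(y_test)
```

On a dataset where 90% of samples share one label, this baseline already reaches about 0.9 accuracy, which is why a trained model's accuracy should always be read against it.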
|
|
91
|
+
|
|
92
|
+
### Step 6: Hyperparameter Tuning
|
|
93
|
+
```python
|
|
94
|
+
from sklearn.model_selection import GridSearchCV
|
|
39
95
|
|
|
40
|
-
# 4. Hyperparameter tuning
|
|
41
96
|
param_grid = {
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
97
|
+
"classifier__n_estimators": [100, 200, 500],
|
|
98
|
+
"classifier__max_depth": [10, 20, None],
|
|
99
|
+
"classifier__min_samples_split": [2, 5, 10],
|
|
45
100
|
}
|
|
46
101
|
|
|
47
|
-
grid_search = GridSearchCV(
|
|
102
|
+
grid_search = GridSearchCV(
|
|
103
|
+
pipeline, param_grid, cv=5, scoring="f1_weighted", n_jobs=-1, verbose=1
|
|
104
|
+
)
|
|
48
105
|
grid_search.fit(X_train, y_train)
|
|
49
106
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
print(classification_report(y_test, y_pred))
|
|
53
|
-
|
|
54
|
-
# 6. Save model
|
|
55
|
-
joblib.dump(grid_search.best_estimator_, 'models/model_v1.pkl')
|
|
107
|
+
print(f"Best params: {grid_search.best_params_}")
|
|
108
|
+
print(f"Best CV score: {grid_search.best_score_:.3f}")
|
|
56
109
|
```
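Before launching a grid search it helps to estimate its cost: the number of candidates is the product of the list lengths, and each candidate is fitted once per fold. A short sketch using the same grid; `count_fits` is a hypothetical helper.

```python
from itertools import product

# Same grid as in the step above.
param_grid = {
    "classifier__n_estimators": [100, 200, 500],
    "classifier__max_depth": [10, 20, None],
    "classifier__min_samples_split": [2, 5, 10],
}


def count_fits(grid, cv):
    """Number of candidate settings and total cross-validation fits."""
    n_candidates = len(list(product(*grid.values())))
    return n_candidates, n_candidates * cv


n_candidates, n_fits = count_fits(param_grid, cv=5)
```

Here that gives 27 candidates and 135 cross-validation fits, plus one final refit of the best candidate on the full training set.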
|
|
57
110
|
|
|
58
|
-
|
|
111
|
+
For larger search spaces, use RandomizedSearchCV:
|
|
112
|
+
```python
|
|
113
|
+
from sklearn.model_selection import RandomizedSearchCV
|
|
114
|
+
from scipy.stats import randint, uniform
|
|
115
|
+
|
|
116
|
+
param_distributions = {
|
|
117
|
+
"classifier__n_estimators": randint(50, 500),
|
|
118
|
+
"classifier__max_depth": [5, 10, 20, None],
|
|
119
|
+
"classifier__min_samples_split": randint(2, 20),
|
|
120
|
+
}
|
|
59
121
|
|
|
122
|
+
random_search = RandomizedSearchCV(
|
|
123
|
+
pipeline, param_distributions, n_iter=50, cv=5,
|
|
124
|
+
scoring="f1_weighted", n_jobs=-1, random_state=42
|
|
125
|
+
)
|
|
126
|
+
random_search.fit(X_train, y_train)
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
### Step 7: Final Evaluation on Test Set
|
|
60
130
|
```python
|
|
61
|
-
from sklearn.
|
|
131
|
+
from sklearn.metrics import classification_report, confusion_matrix
|
|
62
132
|
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
print(f"Baseline accuracy: {baseline_score:.3f}")
|
|
133
|
+
y_pred = grid_search.predict(X_test)
|
|
134
|
+
print(classification_report(y_test, y_pred))
|
|
135
|
+
print(f"\nConfusion Matrix:\n{confusion_matrix(y_test, y_pred)}")
|
|
67
136
|
print(f"Model accuracy: {grid_search.score(X_test, y_test):.3f}")
|
|
68
137
|
```
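A confusion matrix counts how often each true label was predicted as each label: rows are true labels and columns are predictions. A minimal pure-Python sketch of that layout follows; `confusion_counts` is an illustrative name, not a scikit-learn function.

```python
def confusion_counts(y_true, y_pred, labels):
    """Rows are true labels, columns are predicted labels, in `labels` order."""
    position = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, predicted in zip(y_true, y_pred):
        matrix[position[truth]][position[predicted]] += 1
    return matrix
```

The diagonal holds the correct predictions, so off-diagonal cells show which classes the model confuses with each other.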
|
|
138
|
+
|
|
139
|
+
### Step 8: Save the Best Model
|
|
140
|
+
```bash
|
|
141
|
+
mkdir -p models
|
|
142
|
+
```
|
|
143
|
+
```python
|
|
144
|
+
import joblib
|
|
145
|
+
|
|
146
|
+
best_model = grid_search.best_estimator_
|
|
147
|
+
joblib.dump(best_model, "models/model_v1.pkl")
|
|
148
|
+
print("Saved best model to models/model_v1.pkl")
|
|
149
|
+
|
|
150
|
+
# Verify the saved model works
|
|
151
|
+
loaded_model = joblib.load("models/model_v1.pkl")
|
|
152
|
+
assert (loaded_model.predict(X_test) == y_pred).all()
|
|
153
|
+
print("Model verification passed!")
|
|
154
|
+
```
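The save, reload, and compare pattern above can be wrapped in a helper so every export is verified the same way. The sketch below uses the standard library's `pickle` in place of `joblib`, and a plain dictionary as a hypothetical stand-in for a fitted estimator, so it runs without scikit-learn.

```python
import os
import pickle
import tempfile


def predict(model, rows):
    """Hypothetical stand-in for estimator.predict: always returns the stored label."""
    return [model["majority_label"] for _ in rows]


def save_and_verify(model, rows, path):
    """Persist the model, reload it, and check predictions are unchanged."""
    expected = predict(model, rows)
    with open(path, "wb") as handle:
        pickle.dump(model, handle)
    with open(path, "rb") as handle:
        reloaded = pickle.load(handle)
    return predict(reloaded, rows) == expected


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "model_v1.pkl")
    round_trip_ok = save_and_verify({"majority_label": 1}, [[0], [1], [2]], model_path)
```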
|
|
155
|
+
|
|
156
|
+
## Resources
|
|
157
|
+
- `references/model-selection-guide.md` - Which model to use for which problem
|
|
158
|
+
- `references/evaluation-metrics.md` - Metrics reference for classification and regression
|
|
159
|
+
|
|
160
|
+
## Examples
|
|
161
|
+
### Example 1: Train a Classifier
|
|
162
|
+
User asks: "Train a model to predict customer churn"
|
|
163
|
+
Response approach:
|
|
164
|
+
1. Load and split data with stratification
|
|
165
|
+
2. Build ColumnTransformer for numeric + categorical features
|
|
166
|
+
3. Create Pipeline with RandomForestClassifier
|
|
167
|
+
4. Run 5-fold cross-validation to get baseline performance
|
|
168
|
+
5. Compare against DummyClassifier
|
|
169
|
+
6. Tune with GridSearchCV
|
|
170
|
+
7. Evaluate on test set with classification_report
|
|
171
|
+
8. Save best model with joblib
|
|
172
|
+
|
|
### Example 2: Quick Model Comparison

User asks: "Which algorithm works best for this dataset?"

Response approach:

1. Build pipelines for multiple models (RF, LogisticRegression, GradientBoosting)
2. Run `cross_val_score` on each
3. Print a comparison table of mean and std scores
4. Pick the best and run hyperparameter tuning
5. Report final test set performance

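A rough sketch of this comparison loop is shown below. It assumes a synthetic dataset from `make_classification` and small ensemble sizes chosen only to keep the run short; the tuning step is omitted, so the selected pipeline is fitted with its default settings before the single test-set evaluation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for "this dataset" (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# 1. One pipeline per candidate model
candidates = {
    "LogisticRegression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "RandomForest": make_pipeline(
        StandardScaler(), RandomForestClassifier(n_estimators=50, random_state=42)
    ),
    "GradientBoosting": make_pipeline(
        StandardScaler(), GradientBoostingClassifier(n_estimators=50, random_state=42)
    ),
}

# 2. Cross-validate each candidate on the training split
results = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X_train, y_train, cv=5)
    results[name] = (scores.mean(), scores.std())

# 3. Comparison table, best mean score first
print(f"{'Model':<20} {'Mean':>6} {'Std':>6}")
for name, (mean, std) in sorted(
    results.items(), key=lambda item: item[1][0], reverse=True
):
    print(f"{name:<20} {mean:>6.3f} {std:>6.3f}")

# 4. Pick the best candidate (hyperparameter tuning would go here)
best_name = max(results, key=lambda name: results[name][0])
best_model = candidates[best_name].fit(X_train, y_train)

# 5. Report final test set performance once
test_accuracy = best_model.score(X_test, y_test)
print(f"Best model: {best_name}, test accuracy: {test_accuracy:.3f}")
```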
## Notes

- Always compare against a baseline (for example `DummyClassifier`) before claiming good performance
- Use `stratify=y` in `train_test_split` for imbalanced classification
- Use `GridSearchCV` for small parameter spaces (<100 combinations) and `RandomizedSearchCV` for larger ones
- Never look at test set metrics until final evaluation
- Save the model and its preprocessing together; persisting the fitted `Pipeline` does this automatically

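The search-strategy rule of thumb can be expressed as a small helper. This is a sketch under assumptions: `make_search` is a hypothetical function name, and the threshold of 100 combinations and the `n_iter=20` budget are illustrative defaults taken from the note above rather than library recommendations.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, ParameterGrid, RandomizedSearchCV


def make_search(estimator, param_grid, max_grid_combos=100, n_iter=20, cv=5):
    """Return an exhaustive search for small grids, a randomized one otherwise.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    n_combos = len(ParameterGrid(param_grid))
    if n_combos < max_grid_combos:
        return GridSearchCV(estimator, param_grid, cv=cv)
    # Sample only n_iter combinations from the larger space.
    return RandomizedSearchCV(
        estimator, param_grid, n_iter=n_iter, cv=cv, random_state=42
    )


small_grid = {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5]}  # 6 combos
large_grid = {
    "n_estimators": list(range(50, 550, 50)),  # 10 values
    "max_depth": list(range(2, 12)),           # 10 values
    "min_samples_leaf": [1, 2, 5],             # 3 values -> 300 combos
}

estimator = RandomForestClassifier(random_state=42)
small_search = make_search(estimator, small_grid)
large_search = make_search(estimator, large_grid)
print(type(small_search).__name__)
print(type(large_search).__name__)
```

Both returned objects expose the same `fit`, `best_params_`, and `best_estimator_` interface, so the rest of the workflow does not change with the choice.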
@@ -0,0 +1,52 @@
# Evaluation Metrics Reference

## Classification Metrics

```python
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score,
    classification_report, confusion_matrix, roc_auc_score
)

# All-in-one report
print(classification_report(y_test, y_pred))

# Individual metrics
accuracy_score(y_test, y_pred)
precision_score(y_test, y_pred, average='weighted')
recall_score(y_test, y_pred, average='weighted')
f1_score(y_test, y_pred, average='weighted')
# y_pred_proba holds class probabilities, e.g. model.predict_proba(X_test);
# for binary targets pass only the positive-class column, predict_proba(X_test)[:, 1]
roc_auc_score(y_test, y_pred_proba, multi_class='ovr')
```

### When to Use Which

| Metric | Use When |
|---|---|
| Accuracy | Balanced classes |
| Precision | False positives are costly (spam detection) |
| Recall | False negatives are costly (disease detection) |
| F1 | Imbalanced classes, need balance of precision/recall |
| ROC AUC | Need threshold-independent evaluation |

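To make the table concrete, the sketch below computes the four threshold-dependent metrics by hand on a made-up imbalanced example (5 positives, 95 negatives). The `binary_metrics` helper is hypothetical and reports positive-class scores, which differ from the `average='weighted'` variants used in the snippet above.

```python
def binary_metrics(y_true, y_pred):
    """Positive-class metrics from raw counts (illustrative helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95
# The model finds only one of the five positives and raises no false alarms.
y_pred = [1] + [0] * 4 + [0] * 95

metrics = binary_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name:<10} {value:.3f}")
# accuracy   0.960
# precision  1.000
# recall     0.200
# f1         0.333
```

Accuracy looks strong at 0.96 even though four of five positives are missed, which is why recall or F1 is the more informative choice when classes are imbalanced.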
## Regression Metrics

```python
from sklearn.metrics import (
    mean_squared_error, mean_absolute_error, r2_score,
    mean_absolute_percentage_error
)

mean_squared_error(y_test, y_pred)              # MSE
# RMSE: squared=False is deprecated since scikit-learn 1.4,
# which added root_mean_squared_error; the square root works on any version
mean_squared_error(y_test, y_pred) ** 0.5       # RMSE
mean_absolute_error(y_test, y_pred)             # MAE
r2_score(y_test, y_pred)                        # R-squared
mean_absolute_percentage_error(y_test, y_pred)  # MAPE
```

### When to Use Which

| Metric | Use When |
|---|---|
| RMSE | Penalize large errors more |
| MAE | Less sensitive to outliers than RMSE |
| R-squared | Compare to baseline (0 = same as predicting the mean, 1 = perfect fit) |
| MAPE | Need percentage-based interpretation |

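The differences in the table can be checked with a few hand-computed numbers. The sketch below uses made-up targets and a hypothetical `regression_metrics` helper written in plain Python, so the formulas are visible rather than hidden behind library calls.

```python
import math


def regression_metrics(y_true, y_pred):
    """RMSE, MAE, and R-squared from their definitions (illustrative helper)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }


y_true = [10.0, 12.0, 14.0, 16.0]

# Four errors of magnitude 1 versus a single error of magnitude 4:
# the MAE is identical, but the RMSE doubles.
spread_errors = regression_metrics(y_true, [11.0, 13.0, 13.0, 15.0])
single_outlier = regression_metrics(y_true, [10.0, 12.0, 14.0, 12.0])
print(spread_errors["mae"], spread_errors["rmse"])    # 1.0 1.0
print(single_outlier["mae"], single_outlier["rmse"])  # 1.0 2.0

# Predicting the mean of y_true everywhere gives R-squared of exactly 0.
mean_only = regression_metrics(y_true, [13.0] * 4)
print(mean_only["r2"])  # 0.0
```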
package/templates/bundle-data-pipeline/skills/model-training/references/model-selection-guide.md
ADDED
@@ -0,0 +1,41 @@
# Model Selection Guide

## Classification

| Model | Best For | Pros | Cons |
|---|---|---|---|
| LogisticRegression | Binary/multi-class, linear boundaries | Fast, interpretable | Limited to linear decision boundaries |
| RandomForestClassifier | General purpose, mixed features | Robust, feature importance | Slower, less interpretable |
| GradientBoostingClassifier | High accuracy needed | Often the highest accuracy | Slow to train, overfitting risk |
| XGBClassifier (from the `xgboost` package) | Competitions, large data | Fast, regularization | Needs tuning |
| SVC | Small-medium data, non-linear | Flexible kernels | Slow on large data |

## Regression

| Model | Best For | Pros | Cons |
|---|---|---|---|
| LinearRegression | Linear relationships | Fast, interpretable | Limited to linear relationships |
| Ridge/Lasso | Regularized linear | Handles multicollinearity | Still linear |
| RandomForestRegressor | Non-linear, mixed features | Robust | Slower |
| GradientBoostingRegressor | High accuracy | Often the highest accuracy | Needs tuning |
| XGBRegressor (from the `xgboost` package) | Large datasets | Fast, scalable | Complex tuning |

## Quick Start Recipes

### Binary Classification

```python
from sklearn.ensemble import GradientBoostingClassifier
model = GradientBoostingClassifier(n_estimators=200, max_depth=5, random_state=42)
```

### Multi-class Classification

```python
from sklearn.ensemble import RandomForestClassifier
model = RandomForestClassifier(n_estimators=200, random_state=42)
```

### Regression

```python
from sklearn.ensemble import GradientBoostingRegressor
model = GradientBoostingRegressor(n_estimators=200, max_depth=5, random_state=42)
```
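A recipe is only a starting point, so it helps to check it against a trivial baseline before tuning. The sketch below does this for the regression recipe, assuming synthetic data from `make_regression` and using `DummyRegressor` as the baseline; the dataset parameters are illustrative choices.

```python
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic regression problem (an assumption for illustration).
X, y = make_regression(
    n_samples=300, n_features=8, n_informative=5, noise=10.0, random_state=42
)

# The regression recipe from above and a predict-the-mean baseline.
model = GradientBoostingRegressor(n_estimators=200, max_depth=5, random_state=42)
baseline = DummyRegressor(strategy="mean")

# 5-fold cross-validated R-squared for both.
model_r2 = cross_val_score(model, X, y, cv=5, scoring="r2").mean()
baseline_r2 = cross_val_score(baseline, X, y, cv=5, scoring="r2").mean()

print(f"Recipe R-squared:   {model_r2:.3f}")
print(f"Baseline R-squared: {baseline_r2:.3f}")
```

The baseline scores at or slightly below 0 on held-out folds, so any recipe worth keeping should land clearly above that.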