kmds-modeling 0.1.0__tar.gz

@@ -0,0 +1,10 @@
+ recursive-exclude models *
+ recursive-exclude notebooks *
+ recursive-exclude output *
+ recursive-exclude .venv *
+ recursive-exclude __pycache__ *
+ prune models
+ prune notebooks
+ prune output
+ prune .venv
+ prune __pycache__
@@ -0,0 +1,54 @@
+ Metadata-Version: 2.4
+ Name: kmds-modeling
+ Version: 0.1.0
+ Summary: KMDS modeling pipeline package for KMDS lifecycle and model selection.
+ Requires-Python: >=3.13
+ Description-Content-Type: text/markdown
+ Requires-Dist: kmds-featurization>=0.1.5
+ Requires-Dist: click>=8.0
+ Requires-Dist: pandas>=2.0
+ Requires-Dist: numpy>=1.26
+ Requires-Dist: scikit-learn>=1.4
+ Requires-Dist: PyYAML>=6.0
+ Requires-Dist: joblib>=1.3
+
+ # KMDS Modeling
+
+ `kmds-modeling` is a lightweight modeling package designed to work inside the KMDS ecosystem. It provides generic modeling infrastructure and pipeline utilities for KMDS-style workflows, while leaving domain-specific examples and workspace-specific implementations separate.
+
+ ## What this package provides
+ - `src/kmds_modeling/core` — generic modeling package infrastructure
+ - `src/kmds_modeling/core/path_coordinator.py` — workspace-rooted path resolution for KMDS modeling
+ - `src/kmds_modeling/core/notebook_utils.py` — notebook-friendly workspace resolver
+ - `src/kmds_modeling/cli.py` — installable CLI glue for evaluation and export
+ - `models/sba_example` — an example SBA-specific modeling workflow kept outside the installed package
+
+ ## Intended usage
+ This package is meant to be installed into a KMDS workspace and used against modeling artifacts generated by KMDS tools such as `kmds-featurization`. The package does not embed any domain-specific SBA implementation in the installable distribution.
+
+ ## Installation
+ ```bash
+ pip install kmds-modeling
+ ```
+
+ ## CLI commands
+ After installing, the package exposes the `kmds-modeling` CLI:
+
+ ```bash
+ kmds-modeling evaluate --config /path/to/modeling_config.yaml
+ kmds-modeling export --config /path/to/modeling_config.yaml
+ ```
+
+ ## Working directory and configuration
+ KMDS modeling expects a `working_dir` and a `modeling_config.yaml` that defines the workspace layout. The package resolves paths using the `PathCoordinator` and writes modeling outputs into the workspace `models/` directory by default.
+
+ ## Example workflow
+ 1. Use KMDS featurization to generate `model_ready_numeric_data.csv` under `data/featurization/`.
+ 2. Create `modeling_config.yaml` with a `working_dir` pointing to your KMDS workspace.
+ 3. Run the package CLI to evaluate and export model artifacts.
+
+ ## Packaging note
+ The published PyPI package should only contain the generic package code under `src/kmds_modeling/`. Workspace-specific examples such as `models/sba_example/` are intentionally kept outside the installable package source tree.
+
+ ## Contributing
+ If you want to add another KMDS modeling example, put it under `models/<example_name>/` and leave the core package unchanged.
@@ -0,0 +1,40 @@
+ # KMDS Modeling
+
+ `kmds-modeling` is a lightweight modeling package designed to work inside the KMDS ecosystem. It provides generic modeling infrastructure and pipeline utilities for KMDS-style workflows, while leaving domain-specific examples and workspace-specific implementations separate.
+
+ ## What this package provides
+ - `src/kmds_modeling/core` — generic modeling package infrastructure
+ - `src/kmds_modeling/core/path_coordinator.py` — workspace-rooted path resolution for KMDS modeling
+ - `src/kmds_modeling/core/notebook_utils.py` — notebook-friendly workspace resolver
+ - `src/kmds_modeling/cli.py` — installable CLI glue for evaluation and export
+ - `models/sba_example` — an example SBA-specific modeling workflow kept outside the installed package
+
+ ## Intended usage
+ This package is meant to be installed into a KMDS workspace and used against modeling artifacts generated by KMDS tools such as `kmds-featurization`. The package does not embed any domain-specific SBA implementation in the installable distribution.
+
+ ## Installation
+ ```bash
+ pip install kmds-modeling
+ ```
+
+ ## CLI commands
+ After installing, the package exposes the `kmds-modeling` CLI:
+
+ ```bash
+ kmds-modeling evaluate --config /path/to/modeling_config.yaml
+ kmds-modeling export --config /path/to/modeling_config.yaml
+ ```
+
+ ## Working directory and configuration
+ KMDS modeling expects a `working_dir` and a `modeling_config.yaml` that defines the workspace layout. The package resolves paths using the `PathCoordinator` and writes modeling outputs into the workspace `models/` directory by default.
+
+ ## Example workflow
+ 1. Use KMDS featurization to generate `model_ready_numeric_data.csv` under `data/featurization/`.
+ 2. Create `modeling_config.yaml` with a `working_dir` pointing to your KMDS workspace.
+ 3. Run the package CLI to evaluate and export model artifacts.
+
+ ## Packaging note
+ The published PyPI package should only contain the generic package code under `src/kmds_modeling/`. Workspace-specific examples such as `models/sba_example/` are intentionally kept outside the installable package source tree.
+
+ ## Contributing
+ If you want to add another KMDS modeling example, put it under `models/<example_name>/` and leave the core package unchanged.
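
The README names `modeling_config.yaml` but never shows its layout. The sketch below is inferred only from the keys that the runner and path-coordinator hunks in this diff read, so treat it as a reading aid rather than a documented schema. Every value is a placeholder, and `EXAMPLE_CONFIG`, `REQUIRED_PATHS`, and `missing_keys` are hypothetical names that do not exist in the package. Note that the runner looks for `working_dir` under the `data` section, while `PathCoordinator` looks for optional top-level keys such as `modeling_output_dir`.

```python
# The dict below is what yaml.safe_load would return for a hypothetical
# modeling_config.yaml. All values are placeholders.
EXAMPLE_CONFIG = {
    "project": {
        "name": "demo_project",
        "experiment_version": "v1",
        "target_variable": "target",
    },
    "data": {
        "working_dir": "/path/to/workspace",  # optional: falls back to the config's directory
        "index_column": "record_id",          # optional
    },
    "experiment_settings": {
        "primary_metric": "roc_auc",
        "cross_validation": {"splits": 5, "random_state": 42},
    },
    "candidates": [
        {
            "name": "baseline",
            "class_path": "kmds_modeling.examples.example_candidate.ExampleCandidate",
            "hyperparameters": {"strategy": "prior"},
        }
    ],
    "production_target": {
        "champion_candidate_name": "baseline",
        "export_directory": "/path/to/workspace/models",
    },
    # Optional top-level overrides read by PathCoordinator (defaults shown).
    "featurization_output_dir": "featurization",
    "modeling_output_dir": "models",
    "model_ready_data_file": "model_ready_numeric_data.csv",
}

# Keys the runner indexes with [] (a missing one would raise KeyError).
REQUIRED_PATHS = [
    ("data",),
    ("project", "name"),
    ("project", "experiment_version"),
    ("project", "target_variable"),
    ("experiment_settings", "primary_metric"),
    ("experiment_settings", "cross_validation", "splits"),
    ("experiment_settings", "cross_validation", "random_state"),
    ("candidates",),
    ("production_target", "champion_candidate_name"),
    ("production_target", "export_directory"),
]


def missing_keys(config):
    """Return the dotted paths of required keys that are absent from config."""
    missing = []
    for path in REQUIRED_PATHS:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                missing.append(".".join(path))
                break
            node = node[key]
    return missing


print(missing_keys(EXAMPLE_CONFIG))
```

Running a check like this before invoking the CLI would surface a missing section as a readable list instead of a `KeyError` partway through evaluation.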
@@ -0,0 +1,26 @@
+ [project]
+ name = "kmds-modeling"
+ version = "0.1.0"
+ description = "KMDS modeling pipeline package for KMDS lifecycle and model selection."
+ readme = "README.md"
+ requires-python = ">=3.13"
+ dependencies = [
+     "kmds-featurization>=0.1.5",
+     "click>=8.0",
+     "pandas>=2.0",
+     "numpy>=1.26",
+     "scikit-learn>=1.4",
+     "PyYAML>=6.0",
+     "joblib>=1.3",
+ ]
+
+ [project.scripts]
+ kmds-modeling = "kmds_modeling.cli:cli"
+
+ [build-system]
+ requires = ["setuptools>=65.0", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [tool.setuptools]
+ package-dir = {"" = "src"}
+ packages = { find = { where = ["src"] } }
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
@@ -0,0 +1,10 @@
+ """kmds_modeling package."""
+
+ from .core.runner import ExperimentRunner
+ from .core.base import BaseFeatureTransformer, BaseModelCandidate
+
+ __all__ = [
+     "ExperimentRunner",
+     "BaseFeatureTransformer",
+     "BaseModelCandidate",
+ ]
@@ -0,0 +1,32 @@
+ import click
+ from .core.runner import ExperimentRunner
+
+
+ @click.group()
+ def cli():
+     """KMDS Modeling CLI."""
+     pass
+
+
+ @cli.command()
+ @click.option("--config", required=True, type=click.Path(exists=True), help="Path to modeling_config.yaml")
+ def evaluate(config):
+     """Run model evaluation for configured candidates."""
+     runner = ExperimentRunner(config)
+     click.echo("Starting candidate model evaluation...")
+     df = runner.run_evaluation()
+     click.echo("\n--- EXPERIMENT RESULTS LEADERBOARD ---")
+     click.echo(df.to_string(index=False))
+
+
+ @cli.command()
+ @click.option("--config", required=True, type=click.Path(exists=True), help="Path to modeling_config.yaml")
+ def export(config):
+     """Export the selected champion model using the configured production target."""
+     runner = ExperimentRunner(config)
+     click.echo("Exporting the champion model artifacts...")
+     runner.export_champion()
+
+
+ if __name__ == '__main__':
+     cli()
@@ -0,0 +1,17 @@
+ from .path_coordinator import PathCoordinator
+ from .notebook_utils import (
+     build_notebook_resolver,
+     get_modeling_artifact_paths,
+     load_model_ready_dataset,
+     load_workspace_config,
+     resolve_notebook_workspace_root,
+ )
+
+ __all__ = [
+     "PathCoordinator",
+     "build_notebook_resolver",
+     "get_modeling_artifact_paths",
+     "load_model_ready_dataset",
+     "load_workspace_config",
+     "resolve_notebook_workspace_root",
+ ]
@@ -0,0 +1,36 @@
+ from abc import ABC, abstractmethod
+ import pandas as pd
+ import numpy as np
+
+ class BaseFeatureTransformer(ABC):
+     """Interface for ad-hoc dataset-level transformations."""
+
+     @abstractmethod
+     def fit(self, X: pd.DataFrame, y: pd.Series = None):
+         """Fit internal parameters based on training data."""
+         pass
+
+     @abstractmethod
+     def transform(self, X: pd.DataFrame) -> pd.DataFrame:
+         """Transforms dataset features while maintaining the exact input index."""
+         pass
+
+     def fit_transform(self, X: pd.DataFrame, y: pd.Series = None) -> pd.DataFrame:
+         return self.fit(X, y).transform(X)
+
+
+ class BaseModelCandidate(ABC):
+     """Interface for uniform model orchestration."""
+
+     def __init__(self, hyperparameters: dict):
+         self.hyperparameters = hyperparameters
+
+     @abstractmethod
+     def fit(self, X_train: pd.DataFrame, y_train: pd.Series):
+         """Train the underlying model."""
+         pass
+
+     @abstractmethod
+     def predict_proba(self, X: pd.DataFrame) -> np.ndarray:
+         """Return class probabilities. Must return a 2D array [prob_class_0, prob_class_1]."""
+         pass
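
One contract in the hunk above is easy to miss: `fit_transform` chains `self.fit(X, y).transform(X)`, so a concrete `fit` has to return `self` even though the abstract method's docstring does not say so. The sketch below illustrates that with a dependency-free stand-in. `SketchTransformer`, `MeanCenter`, and `ForgetsToReturnSelf` are hypothetical names, and plain lists of floats replace the `pd.DataFrame` inputs the real interface expects.

```python
from abc import ABC, abstractmethod


class SketchTransformer(ABC):
    """Stand-in with the same method shape as BaseFeatureTransformer."""

    @abstractmethod
    def fit(self, X, y=None):
        ...

    @abstractmethod
    def transform(self, X):
        ...

    def fit_transform(self, X, y=None):
        # Same chaining as the package code: relies on fit() returning self.
        return self.fit(X, y).transform(X)


class MeanCenter(SketchTransformer):
    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self  # required for fit_transform to work

    def transform(self, X):
        return [x - self.mean_ for x in X]


class ForgetsToReturnSelf(SketchTransformer):
    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)  # implicitly returns None

    def transform(self, X):
        return [x - self.mean_ for x in X]


centered = MeanCenter().fit_transform([1.0, 2.0, 3.0])

try:
    ForgetsToReturnSelf().fit_transform([1.0, 2.0, 3.0])
    chained_ok = True
except AttributeError:
    # None has no .transform, so the chained call fails.
    chained_ok = False

print(centered, chained_ok)
```

The `ExampleTransformer` shipped later in this diff does return `self` from `fit`, which is consistent with this reading.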
@@ -0,0 +1,48 @@
+ import os
+ from typing import Dict
+
+ import pandas as pd
+ import yaml
+
+ from .path_coordinator import PathCoordinator
+
+
+ def resolve_notebook_workspace_root(working_dir: str, config_name: str = "modeling_config.yaml") -> str:
+     working_dir = os.path.abspath(working_dir)
+     if not os.path.isdir(working_dir):
+         raise FileNotFoundError(f"Notebook directory does not exist: {working_dir}")
+
+     return working_dir
+
+
+ def load_workspace_config(working_dir: str, config_name: str = "modeling_config.yaml") -> Dict:
+     workspace_root = resolve_notebook_workspace_root(working_dir, config_name=config_name)
+     config_path = os.path.join(workspace_root, config_name)
+     if not os.path.isfile(config_path):
+         raise FileNotFoundError(f"Modeling config not found at: {config_path}")
+
+     with open(config_path, "r", encoding="utf-8") as f:
+         return yaml.safe_load(f) or {}
+
+
+ def build_notebook_resolver(working_dir: str, config_name: str = "modeling_config.yaml") -> PathCoordinator:
+     config = load_workspace_config(working_dir, config_name=config_name)
+     return PathCoordinator(working_dir=working_dir, config=config)
+
+
+ def get_modeling_artifact_paths(resolver: PathCoordinator) -> Dict[str, str]:
+     return {
+         "model_ready_dataset_path": resolver.model_ready_dataset_path,
+         "model_weights_path": resolver.model_weights_path,
+         "feature_pipeline_path": resolver.feature_pipeline_path,
+         "calibrator_path": resolver.calibrator_path,
+         "metadata_path": resolver.metadata_path,
+         "active_scores_path": resolver.active_scores_path,
+     }
+
+
+ def load_model_ready_dataset(resolver: PathCoordinator, **read_csv_kwargs) -> pd.DataFrame:
+     path = resolver.model_ready_dataset_path
+     if not os.path.isfile(path):
+         raise FileNotFoundError(f"Model-ready dataset not found at: {path}")
+     return pd.read_csv(path, **read_csv_kwargs)
@@ -0,0 +1,70 @@
+ import os
+ from typing import Any, Dict
+
+
+ class PathCoordinator:
+     """Resolves KMDS modeling package paths from the workspace working directory."""
+
+     def __init__(self, working_dir: str, config: Dict[str, Any]):
+         self.working_dir = os.path.abspath(working_dir)
+         self.config = config or {}
+
+     def _remove_anchor_prefix(self, config_value: str, anchor: str) -> str:
+         if config_value.startswith(anchor + os.sep):
+             return config_value.replace(anchor + os.sep, "", 1)
+         return config_value
+
+     @property
+     def model_ready_data_file(self) -> str:
+         return self.config.get("model_ready_data_file", "model_ready_numeric_data.csv")
+
+     @property
+     def featurization_output_dir(self) -> str:
+         return self.config.get("featurization_output_dir", "featurization")
+
+     @property
+     def modeling_output_dir(self) -> str:
+         return self.config.get("modeling_output_dir", "models")
+
+     def _resolve_data_dir(self, config_value: str) -> str:
+         if os.path.isabs(config_value):
+             return config_value
+         config_value = self._remove_anchor_prefix(config_value, "data")
+         return os.path.join(self.working_dir, "data", config_value)
+
+     def _resolve_workspace_dir(self, config_value: str) -> str:
+         if os.path.isabs(config_value):
+             return config_value
+         return os.path.join(self.working_dir, config_value)
+
+     @property
+     def featurization_output_path(self) -> str:
+         return self._resolve_data_dir(self.featurization_output_dir)
+
+     @property
+     def modeling_output_path(self) -> str:
+         return self._resolve_workspace_dir(self.modeling_output_dir)
+
+     @property
+     def model_ready_dataset_path(self) -> str:
+         return os.path.join(self.featurization_output_path, self.model_ready_data_file)
+
+     @property
+     def model_weights_path(self) -> str:
+         return os.path.join(self.modeling_output_path, "model_weights.pkl")
+
+     @property
+     def feature_pipeline_path(self) -> str:
+         return os.path.join(self.modeling_output_path, "feature_pipeline.pkl")
+
+     @property
+     def calibrator_path(self) -> str:
+         return os.path.join(self.modeling_output_path, "calibrator.pkl")
+
+     @property
+     def metadata_path(self) -> str:
+         return os.path.join(self.modeling_output_path, "metadata.json")
+
+     @property
+     def active_scores_path(self) -> str:
+         return os.path.join(self.modeling_output_path, "active_set_scores.csv")
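
The resolution rules in the hunk above can be summarized with a small standalone sketch. It re-implements the two private helpers as free functions so it runs without the package installed. `resolve_data_dir` and `resolve_workspace_dir` are hypothetical names, and `"ws"` is a placeholder workspace directory. The rules as read from the diff: absolute paths pass through unchanged, data directories are anchored under `<working_dir>/data` with a leading `data/` prefix removed first, and other directories are anchored directly under `<working_dir>`.

```python
import os


def resolve_data_dir(working_dir, config_value):
    """Anchor a data directory under <working_dir>/data, as the diff appears to do."""
    if os.path.isabs(config_value):
        return config_value  # absolute paths are used as given
    prefix = "data" + os.sep
    if config_value.startswith(prefix):
        # Avoid producing <working_dir>/data/data/... when the prefix is already there.
        config_value = config_value[len(prefix):]
    return os.path.join(os.path.abspath(working_dir), "data", config_value)


def resolve_workspace_dir(working_dir, config_value):
    """Anchor any other directory directly under <working_dir>."""
    if os.path.isabs(config_value):
        return config_value
    return os.path.join(os.path.abspath(working_dir), config_value)


# Both spellings resolve to the same featurization directory.
plain = resolve_data_dir("ws", "featurization")
prefixed = resolve_data_dir("ws", os.path.join("data", "featurization"))

# Default locations of the model-ready dataset and the modeling output directory.
dataset_path = os.path.join(plain, "model_ready_numeric_data.csv")
models_dir = resolve_workspace_dir("ws", "models")

print(plain == prefixed)
print(dataset_path)
print(models_dir)
```

Because the prefix check uses `os.sep`, a config value written as `data/featurization` would only be recognized on platforms whose separator is `/`. That is worth keeping in mind for configs shared between operating systems.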
@@ -0,0 +1,123 @@
+ import os
+ import json
+ import importlib
+ from typing import Optional
+
+ import joblib
+ import numpy as np
+ import pandas as pd
+ import yaml
+ from sklearn.metrics import f1_score, roc_auc_score
+ from sklearn.model_selection import StratifiedKFold
+
+ from .path_coordinator import PathCoordinator
+
+
+ class ExperimentRunner:
+     def __init__(self, config_path: str):
+         self.config_path = os.path.abspath(config_path)
+         with open(self.config_path, "r") as f:
+             self.config = yaml.safe_load(f)
+
+         self.custom_transformers = []
+         self._load_data()
+
+     def _load_data(self):
+         data_cfg = self.config["data"]
+         working_dir = data_cfg.get("working_dir") or os.path.dirname(self.config_path)
+         self.path_coordinator = PathCoordinator(working_dir=working_dir, config=self.config)
+
+         df = pd.read_csv(self.path_coordinator.model_ready_dataset_path)
+
+         index_column = data_cfg.get("index_column")
+         if index_column:
+             if index_column in df.columns:
+                 df.set_index(index_column, inplace=True)
+             else:
+                 df.index.name = index_column
+
+         target = self.config["project"]["target_variable"]
+         self.y = df[target]
+         self.X = df.drop(columns=[target])
+
+     def register_transformer(self, transformer):
+         self.custom_transformers.append(transformer)
+
+     def _apply_transformers(self, X_train: pd.DataFrame, X_val: pd.DataFrame, y_train: pd.Series):
+         X_tr_fe = X_train.copy()
+         X_val_fe = X_val.copy()
+         for trans in self.custom_transformers:
+             X_tr_fe = trans.fit_transform(X_tr_fe, y_train)
+             X_val_fe = trans.transform(X_val_fe)
+         return X_tr_fe, X_val_fe
+
+     def _get_candidate_class(self, class_path: str):
+         module_name, class_name = class_path.rsplit(".", 1)
+         module = importlib.import_module(module_name)
+         return getattr(module, class_name)
+
+     def run_evaluation(self) -> pd.DataFrame:
+         cv_cfg = self.config["experiment_settings"]["cross_validation"]
+         skf = StratifiedKFold(
+             n_splits=cv_cfg["splits"],
+             shuffle=True,
+             random_state=cv_cfg["random_state"],
+         )
+
+         leaderboard = []
+
+         for model_cfg in self.config["candidates"]:
+             candidate_class = self._get_candidate_class(model_cfg["class_path"])
+             fold_auc, fold_f1 = [], []
+
+             for train_idx, val_idx in skf.split(self.X, self.y):
+                 X_train, X_val = self.X.iloc[train_idx], self.X.iloc[val_idx]
+                 y_train, y_val = self.y.iloc[train_idx], self.y.iloc[val_idx]
+
+                 X_tr_fe, X_val_fe = self._apply_transformers(X_train, X_val, y_train)
+
+                 model = candidate_class(model_cfg["hyperparameters"])
+                 model.fit(X_tr_fe, y_train)
+
+                 preds = model.predict_proba(X_val_fe)[:, 1]
+                 fold_auc.append(roc_auc_score(y_val, preds))
+                 fold_f1.append(f1_score(y_val, (preds >= 0.5).astype(int)))
+
+             leaderboard.append(
+                 {
+                     "candidate_name": model_cfg["name"],
+                     "mean_roc_auc": float(np.mean(fold_auc)),
+                     "mean_f1": float(np.mean(fold_f1)),
+                 }
+             )
+
+         return pd.DataFrame(leaderboard)
+
+     def export_champion(self):
+         prod_cfg = self.config["production_target"]
+         champ_name = prod_cfg["champion_candidate_name"]
+         model_cfg = next(c for c in self.config["candidates"] if c["name"] == champ_name)
+
+         X_final = self.X.copy()
+         for trans in self.custom_transformers:
+             X_final = trans.fit_transform(X_final, self.y)
+
+         candidate_class = self._get_candidate_class(model_cfg["class_path"])
+         model = candidate_class(model_cfg["hyperparameters"])
+         model.fit(X_final, self.y)
+
+         out_dir = prod_cfg["export_directory"]
+         os.makedirs(out_dir, exist_ok=True)
+
+         joblib.dump(model, os.path.join(out_dir, "model_weights.pkl"))
+         joblib.dump(self.custom_transformers, os.path.join(out_dir, "feature_pipeline.pkl"))
+
+         metadata = {
+             "model_name": self.config["project"]["name"],
+             "version": self.config["project"]["experiment_version"],
+             "features": list(X_final.columns),
+             "target": self.config["project"]["target_variable"],
+             "metrics": {"primary_metric": self.config["experiment_settings"]["primary_metric"]},
+         }
+         with open(os.path.join(out_dir, "metadata.json"), "w") as f:
+             json.dump(metadata, f, indent=4)
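
The runner above turns each candidate's `class_path` string into a class by splitting on the last dot and importing the module. A minimal standalone sketch of that pattern follows. `load_class` is a hypothetical free-function version of `_get_candidate_class`, and `collections.OrderedDict` is used only because it is always importable, not because it is a valid model candidate.

```python
import importlib


def load_class(class_path):
    """Resolve a dotted 'package.module.ClassName' string to the class object."""
    # rsplit on the last dot: everything before it is the module path.
    module_name, class_name = class_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


loaded = load_class("collections.OrderedDict")
instance = loaded(a=1)

# A path with no dot cannot be unpacked into (module, class) and raises ValueError.
try:
    load_class("OrderedDict")
    undotted_error = None
except ValueError as exc:
    undotted_error = type(exc).__name__

print(loaded.__name__, dict(instance), undotted_error)
```

In practice this means a `class_path` in the config must be importable from the environment where the CLI runs, for example a module on `sys.path` inside the workspace. A misspelled module surfaces as `ModuleNotFoundError` and a misspelled class as `AttributeError`.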
@@ -0,0 +1,5 @@
+ """Example adapters for the kmds_modeling framework."""
+ from .example_candidate import ExampleCandidate
+ from .example_transformer import ExampleTransformer
+
+ __all__ = ["ExampleCandidate", "ExampleTransformer"]
@@ -0,0 +1,21 @@
+ import numpy as np
+ import pandas as pd
+ from pandas import Series
+ from sklearn.dummy import DummyClassifier
+ from ..core.base import BaseModelCandidate
+
+
+ class ExampleCandidate(BaseModelCandidate):
+     """A minimal candidate that uses a dummy classifier."""
+
+     def __init__(self, hyperparameters: dict):
+         super().__init__(hyperparameters)
+         strategy = hyperparameters.get("strategy", "prior")
+         self.model = DummyClassifier(strategy=strategy)
+
+     def fit(self, X_train: pd.DataFrame, y_train: Series):
+         self.model.fit(X_train, y_train)
+         return self
+
+     def predict_proba(self, X: pd.DataFrame) -> np.ndarray:
+         return self.model.predict_proba(X)
@@ -0,0 +1,16 @@
+ import pandas as pd
+ from pandas import Series
+ from ..core.base import BaseFeatureTransformer
+
+
+ class ExampleTransformer(BaseFeatureTransformer):
+     """A minimal sample transformer that preserves the input index."""
+
+     def fit(self, X: pd.DataFrame, y: Series = None):
+         self.feature_names_ = list(X.columns)
+         return self
+
+     def transform(self, X: pd.DataFrame) -> pd.DataFrame:
+         transformed = X.copy()
+         transformed.index = X.index
+         return transformed
@@ -0,0 +1,54 @@
+ Metadata-Version: 2.4
+ Name: kmds-modeling
+ Version: 0.1.0
+ Summary: KMDS modeling pipeline package for KMDS lifecycle and model selection.
+ Requires-Python: >=3.13
+ Description-Content-Type: text/markdown
+ Requires-Dist: kmds-featurization>=0.1.5
+ Requires-Dist: click>=8.0
+ Requires-Dist: pandas>=2.0
+ Requires-Dist: numpy>=1.26
+ Requires-Dist: scikit-learn>=1.4
+ Requires-Dist: PyYAML>=6.0
+ Requires-Dist: joblib>=1.3
+
+ # KMDS Modeling
+
+ `kmds-modeling` is a lightweight modeling package designed to work inside the KMDS ecosystem. It provides generic modeling infrastructure and pipeline utilities for KMDS-style workflows, while leaving domain-specific examples and workspace-specific implementations separate.
+
+ ## What this package provides
+ - `src/kmds_modeling/core` — generic modeling package infrastructure
+ - `src/kmds_modeling/core/path_coordinator.py` — workspace-rooted path resolution for KMDS modeling
+ - `src/kmds_modeling/core/notebook_utils.py` — notebook-friendly workspace resolver
+ - `src/kmds_modeling/cli.py` — installable CLI glue for evaluation and export
+ - `models/sba_example` — an example SBA-specific modeling workflow kept outside the installed package
+
+ ## Intended usage
+ This package is meant to be installed into a KMDS workspace and used against modeling artifacts generated by KMDS tools such as `kmds-featurization`. The package does not embed any domain-specific SBA implementation in the installable distribution.
+
+ ## Installation
+ ```bash
+ pip install kmds-modeling
+ ```
+
+ ## CLI commands
+ After installing, the package exposes the `kmds-modeling` CLI:
+
+ ```bash
+ kmds-modeling evaluate --config /path/to/modeling_config.yaml
+ kmds-modeling export --config /path/to/modeling_config.yaml
+ ```
+
+ ## Working directory and configuration
+ KMDS modeling expects a `working_dir` and a `modeling_config.yaml` that defines the workspace layout. The package resolves paths using the `PathCoordinator` and writes modeling outputs into the workspace `models/` directory by default.
+
+ ## Example workflow
+ 1. Use KMDS featurization to generate `model_ready_numeric_data.csv` under `data/featurization/`.
+ 2. Create `modeling_config.yaml` with a `working_dir` pointing to your KMDS workspace.
+ 3. Run the package CLI to evaluate and export model artifacts.
+
+ ## Packaging note
+ The published PyPI package should only contain the generic package code under `src/kmds_modeling/`. Workspace-specific examples such as `models/sba_example/` are intentionally kept outside the installable package source tree.
+
+ ## Contributing
+ If you want to add another KMDS modeling example, put it under `models/<example_name>/` and leave the core package unchanged.
@@ -0,0 +1,19 @@
+ MANIFEST.in
+ README.md
+ pyproject.toml
+ src/kmds_modeling/__init__.py
+ src/kmds_modeling/cli.py
+ src/kmds_modeling.egg-info/PKG-INFO
+ src/kmds_modeling.egg-info/SOURCES.txt
+ src/kmds_modeling.egg-info/dependency_links.txt
+ src/kmds_modeling.egg-info/entry_points.txt
+ src/kmds_modeling.egg-info/requires.txt
+ src/kmds_modeling.egg-info/top_level.txt
+ src/kmds_modeling/core/__init__.py
+ src/kmds_modeling/core/base.py
+ src/kmds_modeling/core/notebook_utils.py
+ src/kmds_modeling/core/path_coordinator.py
+ src/kmds_modeling/core/runner.py
+ src/kmds_modeling/examples/__init__.py
+ src/kmds_modeling/examples/example_candidate.py
+ src/kmds_modeling/examples/example_transformer.py
@@ -0,0 +1,2 @@
+ [console_scripts]
+ kmds-modeling = kmds_modeling.cli:cli
@@ -0,0 +1,7 @@
+ kmds-featurization>=0.1.5
+ click>=8.0
+ pandas>=2.0
+ numpy>=1.26
+ scikit-learn>=1.4
+ PyYAML>=6.0
+ joblib>=1.3
@@ -0,0 +1 @@
+ kmds_modeling