phoenix-ml-workflow 1.0.0__tar.gz

Files changed (26)
  1. phoenix_ml_workflow-1.0.0/LICENSE +21 -0
  2. phoenix_ml_workflow-1.0.0/MANIFEST.in +3 -0
  3. phoenix_ml_workflow-1.0.0/PKG-INFO +84 -0
  4. phoenix_ml_workflow-1.0.0/README.md +49 -0
  5. phoenix_ml_workflow-1.0.0/examples/DC_Motor_Dataset.csv +1001 -0
  6. phoenix_ml_workflow-1.0.0/examples/README.md +18 -0
  7. phoenix_ml_workflow-1.0.0/phoenix_ml/__init__.py +6 -0
  8. phoenix_ml_workflow-1.0.0/phoenix_ml/data_preprocessing.py +335 -0
  9. phoenix_ml_workflow-1.0.0/phoenix_ml/hyperparameter_optimisation.py +969 -0
  10. phoenix_ml_workflow-1.0.0/phoenix_ml/interpretability.py +230 -0
  11. phoenix_ml_workflow-1.0.0/phoenix_ml/model_training.py +178 -0
  12. phoenix_ml_workflow-1.0.0/phoenix_ml/models.py +93 -0
  13. phoenix_ml_workflow-1.0.0/phoenix_ml/persistence.py +122 -0
  14. phoenix_ml_workflow-1.0.0/phoenix_ml/physics_model.py +109 -0
  15. phoenix_ml_workflow-1.0.0/phoenix_ml/postprocessing.py +400 -0
  16. phoenix_ml_workflow-1.0.0/phoenix_ml/report_generation.py +812 -0
  17. phoenix_ml_workflow-1.0.0/phoenix_ml/system_info.py +97 -0
  18. phoenix_ml_workflow-1.0.0/phoenix_ml/uncertainty_quantification.py +238 -0
  19. phoenix_ml_workflow-1.0.0/phoenix_ml/workflow.py +273 -0
  20. phoenix_ml_workflow-1.0.0/phoenix_ml_workflow.egg-info/PKG-INFO +84 -0
  21. phoenix_ml_workflow-1.0.0/phoenix_ml_workflow.egg-info/SOURCES.txt +24 -0
  22. phoenix_ml_workflow-1.0.0/phoenix_ml_workflow.egg-info/dependency_links.txt +1 -0
  23. phoenix_ml_workflow-1.0.0/phoenix_ml_workflow.egg-info/requires.txt +22 -0
  24. phoenix_ml_workflow-1.0.0/phoenix_ml_workflow.egg-info/top_level.txt +1 -0
  25. phoenix_ml_workflow-1.0.0/pyproject.toml +51 -0
  26. phoenix_ml_workflow-1.0.0/setup.cfg +4 -0
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Ryan Cheung
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,3 @@
1
+ include README.md
2
+ include LICENSE
3
+ recursive-include examples *.csv *.md
@@ -0,0 +1,84 @@
1
+ Metadata-Version: 2.4
2
+ Name: phoenix_ml_workflow
3
+ Version: 1.0.0
4
+ Summary: A workflow designed for Physics-Enhanced Machine Learning in engineering applications.
5
+ Author-email: "[Ryan Cheung]" <cheungkh@lancaster.ac.uk>
6
+ License-Expression: MIT
7
+ Project-URL: Homepage, https://github.com/Ryan-907316/phoenix_ml
8
+ Classifier: Programming Language :: Python :: 3
9
+ Classifier: Programming Language :: Python :: 3 :: Only
10
+ Classifier: Operating System :: OS Independent
11
+ Classifier: Intended Audience :: Science/Research
12
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
13
+ Requires-Python: >=3.12.6
14
+ Description-Content-Type: text/markdown
15
+ Requires-Dist: numpy>=1.22
16
+ Requires-Dist: pandas>=1.5
17
+ Requires-Dist: scipy>=1.9
18
+ Requires-Dist: scikit-learn>=1.2
19
+ Requires-Dist: matplotlib>=3.6
20
+ Requires-Dist: seaborn>=0.12
21
+ Requires-Dist: joblib>=1.2
22
+ Requires-Dist: tqdm>=4.65
23
+ Requires-Dist: tabulate>=0.9
24
+ Requires-Dist: statsmodels>=0.13
25
+ Requires-Dist: scikit-optimize>=0.9
26
+ Requires-Dist: hyperopt>=0.2
27
+ Requires-Dist: dcor>=0.6
28
+ Requires-Dist: shap>=0.41
29
+ Requires-Dist: reportlab>=3.6
30
+ Requires-Dist: psutil>=5.9
31
+ Requires-Dist: py-cpuinfo>=9.0
32
+ Requires-Dist: xgboost>=1.7
33
+ Requires-Dist: lightgbm>=3.3
34
+ Requires-Dist: pywin32>=305; platform_system == "Windows"
35
+
36
+ # phoenix_ml
37
+ A Physics and Hybrid Optimised ENgine for Interpretability and eXplainability for Machine Learning. It is intended to make the full machine learning workflow easier, from dataset to report.
38
+
39
+ # Overview
40
+
41
+ This package takes you through the entire machine learning workflow with all the tools that you need in a single package. Preprocessing, model evaluation, interpretability, hyperparameter optimisation, postprocessing, uncertainty quantification, modelling with first-principles equations, residual learning, and report generation are all included, with no need to import other packages separately. The package is designed to be easy to use and highly customisable at every step of the workflow.
42
+
43
+ This package is intended for clean regression datasets. Future versions may add support for classification problems and a built-in dataset cleaner.
44
+
45
+ # Features
46
+
47
+ This package contains the following:
48
+ - **Physics modelling**: Physics-Enhanced Machine Learning (PEML) based methods such as residual learning, easy input of first-principles equations, and automatic generation of residual datasets.
49
+ - **Preprocessing**: Customisable train/test split options, scatter plots of features and target variables, highly detailed boxplots of features, and a distance correlation matrix with a toggleable dummy variable.
50
+ - **Model training and evaluation**: The ability to add, change, or remove regression models when training, compatibility with single and multivariable optimisation, and evaluation summarised as neat tables with different evaluation metrics.
51
+ - **Interpretability**: Partial Dependence Plots (PDPs) with Individual Conditional Expectation (ICE) curves, plus SHapley Additive exPlanations (SHAP) summaries and plots. The model used for interpretability is selected automatically based on model training performance.
52
+ - **Hyperparameter Optimisation (HPO)**: Random search (with a choice of sampling scheme: Monte Carlo, Sobol, Halton, or Latin Hypercube), Hyperopt (Adaptive Tree-structured Parzen Estimators), and scikit-optimize (Gaussian Process minimisation). Customise which method(s) to use and compare, the number of iterations for all methods, and the number of CPU cores used for HPO. Displays the best-performing model for each target variable according to the user-defined metric (MSE, R^2, Adjusted R^2, or Q^2), along with the time elapsed for each.
53
+ - **Postprocessing**: A variety of cross-validation methods with full customisation of arguments and scoring metrics. Determination of influential points using Cook's Distance, residual analysis, automatic determination of transformed residuals using the Anderson-Darling normality test, and Q-Q plots.
54
+ - **Uncertainty Quantification (UQ)**: Bootstrapping and conformal prediction, with user-customisable confidence and prediction intervals. UQ can be performed before and/or after HPO for comparison.
55
+ - **Report generation**: Summarise all findings in a single .pdf file with the above features, with high quality images included in the report and additionally in a separate images folder for further analysis. Useful information is summarised in neat tables and .csv files. Models are saved as .pkl files overall and for each target variable. .json files included for full reproducibility.
56
+
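The residual learning named under "Physics modelling" in the list above can be sketched as follows. This is a minimal illustration of the general idea only, not the code in `phoenix_ml/physics_model.py` (which is not shown in this part of the diff). The function names, the linear "physics" law, and the synthetic data are all hypothetical, and a one-coefficient least-squares fit stands in for whatever regression model the package would actually train on the residuals.

```python
def physics_model(x):
    """Hypothetical first-principles estimate: a purely linear law."""
    return 2.0 * x


def fit_residual_model(xs, residuals):
    """Least-squares fit of residual ~ c * x**2 (one coefficient, no intercept)."""
    numerator = sum((x ** 2) * r for x, r in zip(xs, residuals))
    denominator = sum((x ** 2) ** 2 for x in xs)
    return numerator / denominator


# Synthetic "measurements": the linear law plus a quadratic effect it misses.
xs = [0.5 * i for i in range(1, 11)]
ys = [2.0 * x + 0.5 * x ** 2 for x in xs]

# Residual dataset: what the physics model fails to explain.
residuals = [y - physics_model(x) for x, y in zip(xs, ys)]

# The data-driven part learns only the residual.
coef = fit_residual_model(xs, residuals)


def hybrid_predict(x):
    """Physics estimate plus the learned residual correction."""
    return physics_model(x) + coef * x ** 2
```

The design point is that the learned component only has to capture the gap between the first-principles equation and the data, which is usually a simpler target than the raw output.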
57
+ ## Installation
58
+
59
+ TODO: Figure out how to give instructions on how to install phoenix_ml and put them here.
60
+
61
+ ## Quick Start
62
+
63
+ Included in this repository is a DC motor dataset for demonstration.
64
+
65
+ Clone the repository and run the workflow:
66
+
67
+ ```bash
68
+ git clone https://github.com/Ryan-907316/phoenix_ml.git
69
+ cd phoenix_ml
70
+ python _runner.py
71
+ ```
72
+
73
+ ## License
74
+
75
+ This project is licensed under the MIT License - see the [license](https://github.com/Ryan-907316/phoenix_ml/blob/main/LICENSE) file for details.
76
+
77
+ ## Credits
78
+
79
+ Package created by Ryan Cheung, extending work done previously for an individual undergraduate project.
80
+
81
+ ## Contact
82
+
83
+ For queries, instructions, or more information, email cheungkh@lancaster.ac.uk (university address).
84
+
@@ -0,0 +1,49 @@
1
+ # phoenix_ml
2
+ A Physics and Hybrid Optimised ENgine for Interpretability and eXplainability for Machine Learning. It is intended to make the full machine learning workflow easier, from dataset to report.
3
+
4
+ # Overview
5
+
6
+ This package takes you through the entire machine learning workflow with all the tools that you need in a single package. Preprocessing, model evaluation, interpretability, hyperparameter optimisation, postprocessing, uncertainty quantification, modelling with first-principles equations, residual learning, and report generation are all included, with no need to import other packages separately. The package is designed to be easy to use and highly customisable at every step of the workflow.
7
+
8
+ This package is intended for clean regression datasets. Future versions may add support for classification problems and a built-in dataset cleaner.
9
+
10
+ # Features
11
+
12
+ This package contains the following:
13
+ - **Physics modelling**: Physics-Enhanced Machine Learning (PEML) based methods such as residual learning, easy input of first-principles equations, and automatic generation of residual datasets.
14
+ - **Preprocessing**: Customisable train/test split options, scatter plots of features and target variables, highly detailed boxplots of features, and a distance correlation matrix with a toggleable dummy variable.
15
+ - **Model training and evaluation**: The ability to add, change, or remove regression models when training, compatibility with single and multivariable optimisation, and evaluation summarised as neat tables with different evaluation metrics.
16
+ - **Interpretability**: Partial Dependence Plots (PDPs) with Individual Conditional Expectation (ICE) curves, plus SHapley Additive exPlanations (SHAP) summaries and plots. The model used for interpretability is selected automatically based on model training performance.
17
+ - **Hyperparameter Optimisation (HPO)**: Random search (with a choice of sampling scheme: Monte Carlo, Sobol, Halton, or Latin Hypercube), Hyperopt (Adaptive Tree-structured Parzen Estimators), and scikit-optimize (Gaussian Process minimisation). Customise which method(s) to use and compare, the number of iterations for all methods, and the number of CPU cores used for HPO. Displays the best-performing model for each target variable according to the user-defined metric (MSE, R^2, Adjusted R^2, or Q^2), along with the time elapsed for each.
18
+ - **Postprocessing**: A variety of cross-validation methods with full customisation of arguments and scoring metrics. Determination of influential points using Cook's Distance, residual analysis, automatic determination of transformed residuals using the Anderson-Darling normality test, and Q-Q plots.
19
+ - **Uncertainty Quantification (UQ)**: Bootstrapping and conformal prediction, with user-customisable confidence and prediction intervals. UQ can be performed before and/or after HPO for comparison.
20
+ - **Report generation**: Summarise all findings in a single .pdf file with the above features, with high quality images included in the report and additionally in a separate images folder for further analysis. Useful information is summarised in neat tables and .csv files. Models are saved as .pkl files overall and for each target variable. .json files included for full reproducibility.
21
+
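The conformal prediction listed under "Uncertainty Quantification" above can be illustrated with a small split-conformal sketch. This assumes the split (hold-out calibration) variant with absolute residuals as the nonconformity score. The package's `uncertainty_quantification.py` is not shown in this part of the diff and may use a different variant, and every name and number below is hypothetical.

```python
import math


def conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Split-conformal interval half-width from a held-out calibration set."""
    # Nonconformity score: absolute residual on each calibration point.
    scores = sorted(abs(t - p) for t, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return math.inf
    return scores[k - 1]


def predict_interval(point_prediction, halfwidth):
    """Symmetric prediction interval around a point prediction."""
    return point_prediction - halfwidth, point_prediction + halfwidth


# Toy calibration set: true values are 0, predictions are off by 1..10.
cal_true = [0.0] * 10
cal_pred = [float(i) for i in range(1, 11)]

q = conformal_halfwidth(cal_true, cal_pred, alpha=0.2)
interval = predict_interval(5.0, q)
```

With ten calibration points and `alpha=0.2`, the rank is `ceil(11 * 0.8) = 9`, so the half-width is the ninth smallest score. Asking for 95% coverage from the same ten points returns an infinite half-width, which is the method reporting that the calibration set is too small for that level.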
22
+ ## Installation
23
+
24
+ TODO: Figure out how to give instructions on how to install phoenix_ml and put them here.
25
+
26
+ ## Quick Start
27
+
28
+ Included in this repository is a DC motor dataset for demonstration.
29
+
30
+ Clone the repository and run the workflow:
31
+
32
+ ```bash
33
+ git clone https://github.com/Ryan-907316/phoenix_ml.git
34
+ cd phoenix_ml
35
+ python _runner.py
36
+ ```
37
+
38
+ ## License
39
+
40
+ This project is licensed under the MIT License - see the [license](https://github.com/Ryan-907316/phoenix_ml/blob/main/LICENSE) file for details.
41
+
42
+ ## Credits
43
+
44
+ Package created by Ryan Cheung, extending work done previously for an individual undergraduate project.
45
+
46
+ ## Contact
47
+
48
+ For queries, instructions, or more information, email cheungkh@lancaster.ac.uk (university address).
49
+