aiqclib 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- aiqclib-0.1.0/.github/workflows/check_package.yml +31 -0
- aiqclib-0.1.0/.gitignore +35 -0
- aiqclib-0.1.0/.python-version +1 -0
- aiqclib-0.1.0/.readthedocs.yml +13 -0
- aiqclib-0.1.0/CHANGELOG.md +11 -0
- aiqclib-0.1.0/LICENSE +21 -0
- aiqclib-0.1.0/PKG-INFO +411 -0
- aiqclib-0.1.0/README.md +391 -0
- aiqclib-0.1.0/codecov.yaml +2 -0
- aiqclib-0.1.0/docs/Makefile +20 -0
- aiqclib-0.1.0/docs/_templates/apidoc/package.rst.jinja +61 -0
- aiqclib-0.1.0/docs/make.bat +35 -0
- aiqclib-0.1.0/docs/requirements.txt +5 -0
- aiqclib-0.1.0/docs/scripts/build_docs.sh +25 -0
- aiqclib-0.1.0/docs/scripts/clean_api_rst.py +36 -0
- aiqclib-0.1.0/docs/scripts/prompt_main.txt +2 -0
- aiqclib-0.1.0/docs/scripts/prompt_unittest.txt +2 -0
- aiqclib-0.1.0/docs/scripts/update_docstrings.py +57 -0
- aiqclib-0.1.0/docs/source/_static/_empty +0 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.rst +21 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.step1_read_input.rst +18 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.step2_calc_stats.rst +18 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.step3_select_profiles.rst +18 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.step4_select_rows.rst +18 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.step5_extract_features.rst +18 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.step6_classify_dataset.rst +26 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.classify.step7_concat_datasets.rst +34 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.common.base.rst +50 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.common.config.rst +50 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.common.loader.rst +106 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.common.rst +18 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.common.utils.rst +34 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.interface.rst +50 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.features.rst +58 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.rst +21 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.step1_read_input.rst +26 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.step2_calc_stats.rst +26 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.step3_select_profiles.rst +34 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.step4_select_rows.rst +34 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.step5_extract_features.rst +26 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.prepare.step6_split_dataset.rst +34 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.rst +19 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.train.models.rst +90 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.train.rst +19 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.train.step1_read_input.rst +26 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.train.step2_validate_model.rst +34 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.train.step3_optimise_model.rst +7 -0
- aiqclib-0.1.0/docs/source/api/aiqclib.train.step4_build_model.rst +34 -0
- aiqclib-0.1.0/docs/source/api/modules.rst +7 -0
- aiqclib-0.1.0/docs/source/conf.py +37 -0
- aiqclib-0.1.0/docs/source/configuration/classification.rst +346 -0
- aiqclib-0.1.0/docs/source/configuration/preparation.rst +333 -0
- aiqclib-0.1.0/docs/source/configuration/training.rst +175 -0
- aiqclib-0.1.0/docs/source/features/basic_values.rst +52 -0
- aiqclib-0.1.0/docs/source/features/day_of_year.rst +38 -0
- aiqclib-0.1.0/docs/source/features/location.rst +51 -0
- aiqclib-0.1.0/docs/source/features/neigbouring_values.rst +59 -0
- aiqclib-0.1.0/docs/source/features/profile_summary_stats.rst +89 -0
- aiqclib-0.1.0/docs/source/how-to/algorithm_selection.rst +448 -0
- aiqclib-0.1.0/docs/source/how-to/data_preprocessing_utilities.rst +155 -0
- aiqclib-0.1.0/docs/source/how-to/down_sampling_negative.rst +118 -0
- aiqclib-0.1.0/docs/source/how-to/feature_normalization.rst +125 -0
- aiqclib-0.1.0/docs/source/how-to/quick_start.rst +298 -0
- aiqclib-0.1.0/docs/source/how-to/selecting_specific_configurations.rst +44 -0
- aiqclib-0.1.0/docs/source/how-to/shap_values.rst +67 -0
- aiqclib-0.1.0/docs/source/index.rst +108 -0
- aiqclib-0.1.0/docs/source/tutorial/classification.rst +133 -0
- aiqclib-0.1.0/docs/source/tutorial/installation.rst +112 -0
- aiqclib-0.1.0/docs/source/tutorial/overview.rst +47 -0
- aiqclib-0.1.0/docs/source/tutorial/preparation.rst +190 -0
- aiqclib-0.1.0/docs/source/tutorial/training.rst +133 -0
- aiqclib-0.1.0/pyproject.toml +45 -0
- aiqclib-0.1.0/pytest.ini +5 -0
- aiqclib-0.1.0/src/aiqclib/__init__.py +36 -0
- aiqclib-0.1.0/src/aiqclib/classify/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step1_read_input/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step1_read_input/dataset_all.py +37 -0
- aiqclib-0.1.0/src/aiqclib/classify/step2_calc_stats/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step2_calc_stats/dataset_all.py +56 -0
- aiqclib-0.1.0/src/aiqclib/classify/step3_select_profiles/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step3_select_profiles/dataset_all.py +108 -0
- aiqclib-0.1.0/src/aiqclib/classify/step4_select_rows/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step4_select_rows/dataset_all.py +132 -0
- aiqclib-0.1.0/src/aiqclib/classify/step5_extract_features/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step5_extract_features/dataset_all.py +75 -0
- aiqclib-0.1.0/src/aiqclib/classify/step6_classify_dataset/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step6_classify_dataset/dataset_all.py +158 -0
- aiqclib-0.1.0/src/aiqclib/classify/step6_classify_dataset/dataset_all_suite.py +277 -0
- aiqclib-0.1.0/src/aiqclib/classify/step7_concat_datasets/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/classify/step7_concat_datasets/concat_base.py +131 -0
- aiqclib-0.1.0/src/aiqclib/classify/step7_concat_datasets/dataset_all.py +55 -0
- aiqclib-0.1.0/src/aiqclib/classify/step7_concat_datasets/dataset_suite.py +114 -0
- aiqclib-0.1.0/src/aiqclib/common/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/common/base/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/common/base/config_base.py +462 -0
- aiqclib-0.1.0/src/aiqclib/common/base/dataset_base.py +78 -0
- aiqclib-0.1.0/src/aiqclib/common/base/feature_base.py +108 -0
- aiqclib-0.1.0/src/aiqclib/common/base/model_base.py +208 -0
- aiqclib-0.1.0/src/aiqclib/common/base/scikit_learn_model_base.py +326 -0
- aiqclib-0.1.0/src/aiqclib/common/config/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/common/config/classify_config.py +99 -0
- aiqclib-0.1.0/src/aiqclib/common/config/dataset_config.py +87 -0
- aiqclib-0.1.0/src/aiqclib/common/config/training_config.py +80 -0
- aiqclib-0.1.0/src/aiqclib/common/config/yaml_schema.py +973 -0
- aiqclib-0.1.0/src/aiqclib/common/config/yaml_templates.py +740 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/classify_loader.py +249 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/classify_registry.py +89 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/dataset_loader.py +223 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/dataset_registry.py +74 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/feature_loader.py +82 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/feature_registry.py +34 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/model_loader.py +63 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/model_registry.py +29 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/single_model_loader.py +67 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/single_model_registry.py +46 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/training_loader.py +130 -0
- aiqclib-0.1.0/src/aiqclib/common/loader/training_registry.py +51 -0
- aiqclib-0.1.0/src/aiqclib/common/utils/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/common/utils/config.py +83 -0
- aiqclib-0.1.0/src/aiqclib/common/utils/file.py +88 -0
- aiqclib-0.1.0/src/aiqclib/common/utils/metric_plots.py +276 -0
- aiqclib-0.1.0/src/aiqclib/interface/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/interface/classify.py +87 -0
- aiqclib-0.1.0/src/aiqclib/interface/config.py +111 -0
- aiqclib-0.1.0/src/aiqclib/interface/prepare.py +89 -0
- aiqclib-0.1.0/src/aiqclib/interface/stats.py +188 -0
- aiqclib-0.1.0/src/aiqclib/interface/train.py +57 -0
- aiqclib-0.1.0/src/aiqclib/prepare/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/features/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/features/basic_values.py +147 -0
- aiqclib-0.1.0/src/aiqclib/prepare/features/day_of_year.py +170 -0
- aiqclib-0.1.0/src/aiqclib/prepare/features/flank_down.py +228 -0
- aiqclib-0.1.0/src/aiqclib/prepare/features/flank_up.py +200 -0
- aiqclib-0.1.0/src/aiqclib/prepare/features/location.py +104 -0
- aiqclib-0.1.0/src/aiqclib/prepare/features/profile_summary.py +150 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step1_read_input/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step1_read_input/dataset_a.py +38 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step1_read_input/input_base.py +163 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step2_calc_stats/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step2_calc_stats/dataset_a.py +44 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step2_calc_stats/summary_base.py +252 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step3_select_profiles/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step3_select_profiles/dataset_a.py +202 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step3_select_profiles/dataset_all.py +129 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step3_select_profiles/select_base.py +106 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step4_select_rows/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step4_select_rows/dataset_a.py +306 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step4_select_rows/dataset_all.py +101 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step4_select_rows/locate_base.py +139 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step5_extract_features/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step5_extract_features/dataset_a.py +64 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step5_extract_features/extract_base.py +202 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step6_split_dataset/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step6_split_dataset/dataset_a.py +174 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step6_split_dataset/dataset_all.py +183 -0
- aiqclib-0.1.0/src/aiqclib/prepare/step6_split_dataset/split_base.py +201 -0
- aiqclib-0.1.0/src/aiqclib/py.typed +0 -0
- aiqclib-0.1.0/src/aiqclib/train/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/train/models/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/train/models/decision_tree.py +71 -0
- aiqclib-0.1.0/src/aiqclib/train/models/gaussian_naive_bayes.py +65 -0
- aiqclib-0.1.0/src/aiqclib/train/models/k_nearest_neighbors.py +70 -0
- aiqclib-0.1.0/src/aiqclib/train/models/linear_discriminant_analysis.py +69 -0
- aiqclib-0.1.0/src/aiqclib/train/models/logistic_regression.py +66 -0
- aiqclib-0.1.0/src/aiqclib/train/models/model_suite.py +173 -0
- aiqclib-0.1.0/src/aiqclib/train/models/multilayer_perceptron.py +77 -0
- aiqclib-0.1.0/src/aiqclib/train/models/random_forest.py +71 -0
- aiqclib-0.1.0/src/aiqclib/train/models/support_vector_machine.py +78 -0
- aiqclib-0.1.0/src/aiqclib/train/models/xgboost.py +68 -0
- aiqclib-0.1.0/src/aiqclib/train/step1_read_input/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/train/step1_read_input/dataset_a.py +37 -0
- aiqclib-0.1.0/src/aiqclib/train/step1_read_input/input_base.py +119 -0
- aiqclib-0.1.0/src/aiqclib/train/step2_validate_model/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/train/step2_validate_model/kfold_validation.py +123 -0
- aiqclib-0.1.0/src/aiqclib/train/step2_validate_model/kfold_validation_suite.py +176 -0
- aiqclib-0.1.0/src/aiqclib/train/step2_validate_model/validate_base.py +178 -0
- aiqclib-0.1.0/src/aiqclib/train/step3_optimise_model/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/train/step4_build_model/__init__.py +0 -0
- aiqclib-0.1.0/src/aiqclib/train/step4_build_model/build_model.py +175 -0
- aiqclib-0.1.0/src/aiqclib/train/step4_build_model/build_model_base.py +313 -0
- aiqclib-0.1.0/src/aiqclib/train/step4_build_model/build_model_suite.py +315 -0
- aiqclib-0.1.0/tests/data/classify/classify_prediction_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/classify/classify_prediction_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/classify/classify_prediction_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/classify/classify_report_pres.tsv +7 -0
- aiqclib-0.1.0/tests/data/classify/classify_report_psal.tsv +7 -0
- aiqclib-0.1.0/tests/data/classify/classify_report_temp.tsv +7 -0
- aiqclib-0.1.0/tests/data/classify/predictions.parquet +0 -0
- aiqclib-0.1.0/tests/data/config/config_classify_set_full_template.yaml +142 -0
- aiqclib-0.1.0/tests/data/config/config_classify_set_template.yaml +118 -0
- aiqclib-0.1.0/tests/data/config/config_data_set_full_template.yaml +136 -0
- aiqclib-0.1.0/tests/data/config/config_data_set_reduced_template.yaml +112 -0
- aiqclib-0.1.0/tests/data/config/config_data_set_template.yaml +112 -0
- aiqclib-0.1.0/tests/data/config/config_train_set_template.yaml +48 -0
- aiqclib-0.1.0/tests/data/config/test_classify_001.yaml +172 -0
- aiqclib-0.1.0/tests/data/config/test_classify_002.yaml +148 -0
- aiqclib-0.1.0/tests/data/config/test_classify_003.yaml +148 -0
- aiqclib-0.1.0/tests/data/config/test_dataset_001.yaml +161 -0
- aiqclib-0.1.0/tests/data/config/test_dataset_002.yaml +143 -0
- aiqclib-0.1.0/tests/data/config/test_dataset_003.yaml +161 -0
- aiqclib-0.1.0/tests/data/config/test_dataset_004.yaml +110 -0
- aiqclib-0.1.0/tests/data/config/test_dataset_005.yaml +110 -0
- aiqclib-0.1.0/tests/data/config/test_dataset_invalid.yaml +2 -0
- aiqclib-0.1.0/tests/data/config/test_training_001.yaml +73 -0
- aiqclib-0.1.0/tests/data/config/test_training_002.yaml +55 -0
- aiqclib-0.1.0/tests/data/config/test_training_003.yaml +55 -0
- aiqclib-0.1.0/tests/data/extract/extracted_features_classify_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/extract/extracted_features_classify_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/extract/extracted_features_classify_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/extract/extracted_features_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/extract/extracted_features_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/extract/extracted_features_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/input/empty_text_file.txt +0 -0
- aiqclib-0.1.0/tests/data/input/nrt_cora_bo_test.parquet +0 -0
- aiqclib-0.1.0/tests/data/input/nrt_cora_bo_test_2023_row1.csv +2 -0
- aiqclib-0.1.0/tests/data/input/nrt_cora_bo_test_2023_row1.csv.gz +0 -0
- aiqclib-0.1.0/tests/data/input/nrt_cora_bo_test_2023_row1.tsv +2 -0
- aiqclib-0.1.0/tests/data/input/nrt_cora_bo_test_2023_row1.tsv.gz +0 -0
- aiqclib-0.1.0/tests/data/negx5_model/model_pres.joblib +0 -0
- aiqclib-0.1.0/tests/data/negx5_model/model_psal.joblib +0 -0
- aiqclib-0.1.0/tests/data/negx5_model/model_temp.joblib +0 -0
- aiqclib-0.1.0/tests/data/negx5_training/test_set_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/negx5_training/test_set_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/negx5_training/test_set_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/negx5_training/train_set_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/negx5_training/train_set_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/negx5_training/train_set_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_profiles.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_profiles_classify.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_rows_classify_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_rows_classify_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_rows_classify_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_rows_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_rows_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/select/selected_rows_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/summary/summary_stats.tsv +3529 -0
- aiqclib-0.1.0/tests/data/summary/summary_stats_classify.tsv +596 -0
- aiqclib-0.1.0/tests/data/training/model_pres.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_dt.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_gnb.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_knn.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_lda.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_logit.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_mlp.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_rf.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_svm.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_pres_xgb.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_dt.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_gnb.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_knn.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_lda.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_logit.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_mlp.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_rf.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_svm.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_psal_xgb.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_dt.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_gnb.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_knn.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_lda.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_logit.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_mlp.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_rf.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_svm.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/model_temp_xgb.joblib +0 -0
- aiqclib-0.1.0/tests/data/training/test_prediction_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/test_prediction_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/test_prediction_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/test_report_pres.tsv +7 -0
- aiqclib-0.1.0/tests/data/training/test_report_psal.tsv +7 -0
- aiqclib-0.1.0/tests/data/training/test_report_temp.tsv +7 -0
- aiqclib-0.1.0/tests/data/training/test_set_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/test_set_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/test_set_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/train_set_pres.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/train_set_psal.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/train_set_temp.parquet +0 -0
- aiqclib-0.1.0/tests/data/training/validation_report_pres.tsv +19 -0
- aiqclib-0.1.0/tests/data/training/validation_report_psal.tsv +19 -0
- aiqclib-0.1.0/tests/data/training/validation_report_temp.tsv +19 -0
- aiqclib-0.1.0/tests/test_classify_step1_input_all.py +143 -0
- aiqclib-0.1.0/tests/test_classify_step2_summary_all.py +171 -0
- aiqclib-0.1.0/tests/test_classify_step3_select_all.py +147 -0
- aiqclib-0.1.0/tests/test_classify_step4_locate_all.py +180 -0
- aiqclib-0.1.0/tests/test_classify_step5_extract_all.py +252 -0
- aiqclib-0.1.0/tests/test_classify_step6_classify_all.py +1152 -0
- aiqclib-0.1.0/tests/test_classify_step6_classify_suite.py +421 -0
- aiqclib-0.1.0/tests/test_classify_step7_concat_all.py +216 -0
- aiqclib-0.1.0/tests/test_classify_step7_concat_suite.py +248 -0
- aiqclib-0.1.0/tests/test_common_base_config.py +140 -0
- aiqclib-0.1.0/tests/test_common_base_dataset.py +65 -0
- aiqclib-0.1.0/tests/test_common_base_model.py +249 -0
- aiqclib-0.1.0/tests/test_common_base_sklearn_model.py +412 -0
- aiqclib-0.1.0/tests/test_common_config_base_methods.py +362 -0
- aiqclib-0.1.0/tests/test_common_config_classification.py +143 -0
- aiqclib-0.1.0/tests/test_common_config_dataset.py +138 -0
- aiqclib-0.1.0/tests/test_common_config_training.py +122 -0
- aiqclib-0.1.0/tests/test_common_loaders_classification.py +639 -0
- aiqclib-0.1.0/tests/test_common_loaders_datasets.py +461 -0
- aiqclib-0.1.0/tests/test_common_loaders_feature.py +54 -0
- aiqclib-0.1.0/tests/test_common_loaders_model.py +106 -0
- aiqclib-0.1.0/tests/test_common_loaders_single_model.py +206 -0
- aiqclib-0.1.0/tests/test_common_loaders_training.py +287 -0
- aiqclib-0.1.0/tests/test_common_metric_plots.py +127 -0
- aiqclib-0.1.0/tests/test_common_utils_config.py +61 -0
- aiqclib-0.1.0/tests/test_common_utils_file.py +120 -0
- aiqclib-0.1.0/tests/test_dmqclib.py +325 -0
- aiqclib-0.1.0/tests/test_interface_classification.py +287 -0
- aiqclib-0.1.0/tests/test_interface_config.py +144 -0
- aiqclib-0.1.0/tests/test_interface_prepare.py +198 -0
- aiqclib-0.1.0/tests/test_interface_stats.py +94 -0
- aiqclib-0.1.0/tests/test_interface_train.py +246 -0
- aiqclib-0.1.0/tests/test_prepare_features.py +404 -0
- aiqclib-0.1.0/tests/test_prepare_step1_input_a.py +339 -0
- aiqclib-0.1.0/tests/test_prepare_step2_summary_a.py +203 -0
- aiqclib-0.1.0/tests/test_prepare_step3_select_a.py +208 -0
- aiqclib-0.1.0/tests/test_prepare_step3_select_all.py +104 -0
- aiqclib-0.1.0/tests/test_prepare_step4_locate_a.py +443 -0
- aiqclib-0.1.0/tests/test_prepare_step4_locate_all.py +178 -0
- aiqclib-0.1.0/tests/test_prepare_step5_extract_a.py +565 -0
- aiqclib-0.1.0/tests/test_prepare_step6_split_a.py +498 -0
- aiqclib-0.1.0/tests/test_prepare_step6_split_all.py +219 -0
- aiqclib-0.1.0/tests/test_training_model_suite.py +183 -0
- aiqclib-0.1.0/tests/test_training_models.py +371 -0
- aiqclib-0.1.0/tests/test_training_step1_input_a.py +140 -0
- aiqclib-0.1.0/tests/test_training_step2_validate_a.py +553 -0
- aiqclib-0.1.0/tests/test_training_step2_validate_suite.py +364 -0
- aiqclib-0.1.0/tests/test_training_step4_build_a.py +1199 -0
- aiqclib-0.1.0/tests/test_training_step4_build_suite.py +423 -0
- aiqclib-0.1.0/uv.lock +1693 -0
aiqclib-0.1.0/.github/workflows/check_package.yml
ADDED
@@ -0,0 +1,31 @@
+name: Check Package
+
+on:
+  push:
+    branches: [ "main" ]
+  pull_request:
+    branches: [ "main", "develop" ]
+
+jobs:
+  build:
+    name: python
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version-file: ".python-version"
+
+      - name: Install the project
+        run: uv sync --locked --all-extras --dev
+
+      - name: Run tests
+        run: uv run pytest -v tests
aiqclib-0.1.0/.gitignore
ADDED
@@ -0,0 +1,35 @@
+# Python-generated files
+__pycache__/
+*.py[oc]
+build/
+dist/
+wheels/
+*.egg-info
+
+# Virtual environments
+.venv
+/.project
+
+#idea folder
+.idea/
+
+# Conda recipe build scripts (no need to track generated files)
+conda/build.sh
+conda/bld.bat
+
+# Local build and test artifacts
+conda/work/
+conda/__pycache__/
+conda/*.py[cod]
+conda/*.log
+
+# Conda package tarballs or .conda outputs
+conda/*.tar.bz2
+conda/*.conda
+
+# Conda metadata directories created by conda-build
+conda/.built
+conda/.cache
+
+# Ignore Sphinx build output
+docs/build/
aiqclib-0.1.0/.python-version
ADDED
@@ -0,0 +1 @@
+3.12
aiqclib-0.1.0/CHANGELOG.md
ADDED
@@ -0,0 +1,11 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).
+As this project is still in active development, it does not yet strictly adhere to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [Unreleased]
+
+## [0.1.0] - 2026-05-08
+### Added
+- Port from dmqclib
aiqclib-0.1.0/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Takaya Saito
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
aiqclib-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,411 @@
+Metadata-Version: 2.4
+Name: aiqclib
+Version: 0.1.0
+Summary: This package aims to offer helper functions that simplify model building and evaluation
+Author-email: Takaya Saito <takaya.saito@outlook.com>
+License-Expression: MIT
+Requires-Python: >=3.12
+Requires-Dist: joblib>=1.4.2
+Requires-Dist: jsonschema>=4.23.0
+Requires-Dist: matplotlib>=3.10.8
+Requires-Dist: numpy>=2.2
+Requires-Dist: pandas>=2.2
+Requires-Dist: polars>=1.30.0
+Requires-Dist: pyarrow>=19.0.0
+Requires-Dist: pyyaml>=6.0.2
+Requires-Dist: scikit-learn>=1.6.1
+Requires-Dist: shap>=0.51.0
+Requires-Dist: xgboost>=3.0.2
+Description-Content-Type: text/markdown
+
+# aiqclib
+
+[](https://pypi.org/project/aiqclib/)
+[](https://anaconda.org/conda-forge/aiqclib)
+[](https://github.com/AIQC-Hub/aiqclib/actions/workflows/check_package.yml)
+[](https://codecov.io/gh/AIQC-Hub/aiqclib)
+[](https://www.codefactor.io/repository/github/aiqc-hub/aiqclib)
+[](https://doi.org/10.5281/zenodo.16055323)
+
+**aiqclib** is a Python library that provides a configuration-driven workflow for machine learning, simplifying dataset preparation, model training, and data classification. It is a core component of the AIQC project that aims to enhance anomaly detection in CTD (Conductivity, Temperature, Depth) data.
+
+## ML Algorithms Supported by **aiqclib**
+
+| Category | Algorithm | Short Name | Method |
+| :--- | :--- | :--- | :--- |
+| Tree-Based & Ensemble | **XGBoost** | XGB | Ensemble (Boosting) |
+| | **Random Forest** | RF | Ensemble (Bagging) |
+| | **Decision Tree** | DT | Tree |
+| Linear & Geometric | **Logistic Regression** | Logit | Linear |
+| | **Linear Discriminant Analysis** | LDA | Linear / Statistical |
+| | **Support Vector Machine** | SVM | Geometric |
+| Instance-Based (Distance-Based) | **K-Nearest Neighbors** | KNN | Distance-based |
+| Probabilistic | **Gaussian Naive Bayes** | GNB | Probabilistic |
+| Neural Network | **Multilayer Perceptron** | MLP | Neural Network |
+
+## Installation
+
+The package is available on PyPI and conda-forge.
+
+**Using pip:**
+```bash
+pip install aiqclib
+```
+
+**Using conda:**
+```bash
+conda install -c conda-forge aiqclib
+```
+
+## Documentation
+
+Project documentation is hosted on [Read the Docs](https://aiqclib.readthedocs.io/en/latest/index.html).
+
+## Core Concepts
+
+The library is designed around a three-stage workflow:
+
+1. **Dataset Preparation:** Prepare feature datasets from raw data and generate training, validation, and test data sets.
+2. **Training & Evaluation:** Train machine learning models and evaluate their performance using cross-validation.
+3. **Classification:** Apply a trained model to classify new, unseen data.
+
+Each stage is controlled by a YAML configuration file, allowing you to define and reproduce your entire workflow with ease.
+
+## Usage
+
+The general workflow for any task in `aiqclib` follows these steps:
+
+1. **Generate a Configuration Template:** Create a starter YAML file for the task (e.g., `prepare`, `train`, `classify`).
+2. **Customize the Configuration:** Edit the YAML file to specify paths, dataset names, and other parameters.
+3. **Run the Task:** Load the configuration and execute the main function for the task.
+
+### 1. Dataset Preparation
+
+This workflow processes your input data and creates training, validation, and test sets.
+
+**Step 1: Generate a configuration template.**
+
+```python
+import aiqclib as aq
+
+aq.write_config_template(file_name="/path/to/prepare_config.yaml", stage="prepare")
+```
+
+**Step 2: Customize `prepare_config.yaml`.**
+You must edit the file to set the correct input/output paths and define your dataset. See the [Configuration](#configuration) section for details.
+
+**Step 3: Run the preparation process.**
+```python
+import aiqclib as aq
+
+config = aq.read_config("/path/to/prepare_config.yaml")
+aq.create_training_dataset(config)
+```
+
+This generates the following output folders:
+- **summary**: Statistics of input data used for normalization.
+- **select**: Profiles with bad observation flags (positive samples) and good profiles (negative samples).
+- **locate**: Observation records for both positive and negative profiles.
+- **extract**: Features extracted from the observation records.
+- **training**: The final training, validation, and test datasets.
+
+### 2. Model Training and Evaluation
+
+This workflow uses the prepared dataset to train a model and evaluate its performance.
+
+**Step 1: Generate a training configuration template.**
+
+```python
+import aiqclib as aq
+
+aq.write_config_template(file_name="/path/to/training_config.yaml", stage="train")
+```
+
+**Step 2: Customize `training_config.yaml`.**
+Edit the file to point to your prepared dataset and define training parameters.
+
+**Step 3: Train and evaluate the model.**
+```python
+import aiqclib as aq
+
+config = aq.read_config("/path/to/training_config.yaml")
+aq.train_and_evaluate(config)
+```
+
+This generates the following output folders:
+- **validate**: Results from the cross-validation process.
+- **build**: The final trained models and their evaluation results on the test dataset.
+
|
|
139
|
+
### 3. Data Classification
|
|
140
|
+
|
|
141
|
+
This workflow applies a trained model to classify all observations in a dataset.
|
|
142
|
+
|
|
143
|
+
**Step 1: Generate a classification configuration template.**
|
|
144
|
+
|
|
145
|
+
```python
|
|
146
|
+
import aiqclib as aq
|
|
147
|
+
|
|
148
|
+
aq.write_config_template(file_name="/path/to/classification_config.yaml", stage="classify")
|
|
149
|
+
```
|
|
150
|
+
|
|
151
|
+
**Step 2: Customize `classification_config.yaml`.**
|
|
152
|
+
Edit the file to point to the input data and the trained model.
|
|
153
|
+
|
|
154
|
+
**Step 3: Run classification.**
|
|
155
|
+
```python
|
|
156
|
+
import aiqclib as aq
|
|
157
|
+
|
|
158
|
+
config = aq.read_config("/path/to/classification_config.yaml")
|
|
159
|
+
aq.classify_dataset(config)
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
This workflow processes a dataset using a trained model and generates:
|
|
163
|
+
- **classify**: The final classification results and a summary report.
|
|
164
|
+
|
|
165
|
+
## Configuration
|
|
166
|
+
|
|
167
|
+
Configuration is managed via YAML files. The `write_config_template` function provides a starting point that you must customize for each stage.
|
|
168
|
+
|
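Because every template must be edited before use, a quick scan for leftover placeholders can catch a forgotten path. The sketch below assumes that generated templates use the same `/path/to/...` placeholders as the excerpts in this README; `unedited_lines` is a hypothetical helper, and `/srv/project/input` is an invented example path.

```python
def unedited_lines(config_text, placeholder="/path/to/"):
    """Return (line_number, line) pairs that still contain the placeholder."""
    return [
        (number, line.strip())
        for number, line in enumerate(config_text.splitlines(), start=1)
        if placeholder in line
    ]


# A partly edited config: the input path was changed, the common path was not.
example = """\
path_info_sets:
  - name: data_set_1
    common:
      base_path: /path/to/data # EDIT: Root output directory
    input:
      base_path: /srv/project/input
"""

for number, line in unedited_lines(example):
    print(f"line {number}: {line}")
```

In practice you would pass the contents of your own config file, for example `Path("/path/to/prepare_config.yaml").read_text()`.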
|
169
|
+
### 1. Dataset Preparation (`stage="prepare"`)
|
|
170
|
+
|
|
171
|
+
The preparation config requires you to modify two key sections:
|
|
172
|
+
|
|
173
|
+
- **`path_info_sets`**: Defines the location of input and output data.
|
|
174
|
+
```yaml
path_info_sets:
  - name: data_set_1
    common:
      base_path: /path/to/data # EDIT: Root output directory
    input:
      base_path: /path/to/input # EDIT: Directory with input files
      step_folder_name: ""
    split:
      step_folder_name: training
```
|
|
185
|
+
|
|
186
|
+
- **`data_sets`**: Defines a specific dataset to be processed.
|
|
187
|
+
```yaml
data_sets:
  - name: dataset_0001 # EDIT: Your data set name
    dataset_folder_name: dataset_0001 # EDIT: Your output folder
    input_file_name: nrt_cora_bo_4.parquet # EDIT: Your input filename
```
|
|
193
|
+
|
|
194
|
+
### 2. Training and Evaluation (`stage="train"`)
|
|
195
|
+
|
|
196
|
+
The training config links the prepared data to the model training process.
|
|
197
|
+
|
|
198
|
+
- **`path_info_sets`**: Defines where to find the prepared dataset and where to save model artifacts.
|
|
199
|
+
```yaml
path_info_sets:
  - name: data_set_1
    common:
      base_path: /path/to/data # EDIT: Root output directory
    input:
      step_folder_name: training
```
|
|
207
|
+
|
|
208
|
+
- **`training_sets`**: Links to a dataset prepared in the previous workflow.
|
|
209
|
+
```yaml
training_sets:
  - name: training_0001 # EDIT: Your training name
    dataset_folder_name: dataset_0001 # EDIT: Your output folder
```
|
|
214
|
+
|
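The training config refers back to a dataset from the preparation workflow, and the examples in this README use the same `dataset_folder_name` (`dataset_0001`) in both files. The sketch below checks that correspondence on plain dictionaries standing in for the parsed YAML. It assumes the link is made purely by matching `dataset_folder_name` values, which the README suggests but does not state; `unlinked_training_sets` and the `training_0002` entry are hypothetical.

```python
def unlinked_training_sets(prepare_config, training_config):
    """Return names of training sets whose dataset folder was never prepared."""
    prepared = {
        entry["dataset_folder_name"]
        for entry in prepare_config.get("data_sets", [])
    }
    return [
        entry["name"]
        for entry in training_config.get("training_sets", [])
        if entry["dataset_folder_name"] not in prepared
    ]


# Dictionaries mirroring the YAML excerpts above.
prepare_config = {
    "data_sets": [
        {
            "name": "dataset_0001",
            "dataset_folder_name": "dataset_0001",
            "input_file_name": "nrt_cora_bo_4.parquet",
        }
    ]
}
training_config = {
    "training_sets": [
        {"name": "training_0001", "dataset_folder_name": "dataset_0001"},
        # Hypothetical entry pointing at a folder that was never prepared.
        {"name": "training_0002", "dataset_folder_name": "dataset_0002"},
    ]
}

print(unlinked_training_sets(prepare_config, training_config))
```

Only `training_0002` is reported here, because its folder does not appear in the preparation config.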
|
215
|
+
### 3. Classification (`stage="classify"`)
|
|
216
|
+
|
|
217
|
+
The classification config uses a trained model to classify new data.
|
|
218
|
+
|
|
219
|
+
- **`path_info_sets`**: Defines paths for raw data, models, and classification results.
|
|
220
|
+
```yaml
path_info_sets:
  - name: data_set_1
    common:
      base_path: /path/to/data # EDIT: Root output directory
    input:
      base_path: /path/to/input # EDIT: Directory with input files
      step_folder_name: ""
    model:
      base_path: /path/to/model # EDIT: Directory with model files
      step_folder_name: model
    concat:
      step_folder_name: classification # EDIT: Directory with classification results
```
|
|
234
|
+
|
|
235
|
+
- **`classification_sets`**: Defines a specific dataset to be classified.
|
|
236
|
+
```yaml
classification_sets:
  - name: classification_0001 # EDIT: Your classification name
    dataset_folder_name: dataset_0001 # EDIT: Your output folder
    input_file_name: nrt_cora_bo_4.parquet # EDIT: Your input filename
```
|
|
242
|
+
|
|
243
|
+
## Contributing & Development
|
|
244
|
+
|
|
245
|
+
We welcome contributions! Please use the following guidelines for development.
|
|
246
|
+
|
|
247
|
+
### Environment Setup
|
|
248
|
+
|
|
249
|
+
We recommend using **uv** for managing the development environment.
|
|
250
|
+
|
|
251
|
+
1. **Install `uv` into your base conda/mamba environment.**
|
|
252
|
+
This makes the `uv` command available whenever `base` is active, while project dependencies stay in the project's own virtual environment (created in the next step).
|
|
253
|
+
|
|
254
|
+
```bash
|
|
255
|
+
# Using mamba (recommended)
|
|
256
|
+
mamba activate base
|
|
257
|
+
mamba install -n base -c conda-forge uv
|
|
258
|
+
|
|
259
|
+
# Or using conda
|
|
260
|
+
conda activate base
|
|
261
|
+
conda install -n base -c conda-forge uv
|
|
262
|
+
```
|
|
263
|
+
|
|
264
|
+
2. **Create and activate the project's virtual environment.**
|
|
265
|
+
From the project's root directory, run the following:
|
|
266
|
+
|
|
267
|
+
```bash
|
|
268
|
+
# Create the virtual environment in a .venv folder
|
|
269
|
+
uv venv
|
|
270
|
+
|
|
271
|
+
# Activate the virtual environment
|
|
272
|
+
source .venv/bin/activate
|
|
273
|
+
```
|
|
274
|
+
|
|
275
|
+
3. **Install the project and its dependencies.**
|
|
276
|
+
These commands install all dependencies declared in `pyproject.toml` and then install the library itself in "editable" mode (`-e`).
|
|
277
|
+
|
|
278
|
+
```bash
|
|
279
|
+
uv sync
|
|
280
|
+
uv pip install -e .
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
### Running Tests
|
|
284
|
+
|
|
285
|
+
With your environment activated, you can run the test suite using `pytest`.
|
|
286
|
+
|
|
287
|
+
```bash
|
|
288
|
+
uv run pytest -v
|
|
289
|
+
```
|
|
290
|
+
|
|
291
|
+
### Code Style (Linting & Formatting)
|
|
292
|
+
|
|
293
|
+
We use **Ruff** for linting and formatting.
|
|
294
|
+
|
|
295
|
+
**Linting:**
|
|
296
|
+
Check the library and test code for style issues.
|
|
297
|
+
```bash
|
|
298
|
+
# Lint the library source code
|
|
299
|
+
uv run ruff check src
|
|
300
|
+
|
|
301
|
+
# Lint the test code
|
|
302
|
+
ruff check tests
|
|
303
|
+
```
|
|
304
|
+
|
|
305
|
+
**Formatting:**
|
|
306
|
+
Automatically format the code to match the project's style.
|
|
307
|
+
```bash
|
|
308
|
+
# Format the library source code
|
|
309
|
+
ruff format src
|
|
310
|
+
|
|
311
|
+
# Format the test code
|
|
312
|
+
ruff format tests
|
|
313
|
+
```
|
|
314
|
+
|
|
315
|
+
## Documentation (for Maintainers)
|
|
316
|
+
|
|
317
|
+
### Building Docs Locally
|
|
318
|
+
|
|
319
|
+
1. **Update Docstrings (Requires Google Gemini API Key):**
|
|
320
|
+
```bash
|
|
321
|
+
# Update docstrings for source files
|
|
322
|
+
python ./docs/scripts/update_docstrings.py src docs/scripts/prompt_main.txt
|
|
323
|
+
|
|
324
|
+
# Update docstrings for test files
|
|
325
|
+
python ./docs/scripts/update_docstrings.py tests docs/scripts/prompt_unittest.txt
|
|
326
|
+
```
|
|
327
|
+
|
|
328
|
+
2. **Review Docstrings:**
|
|
329
|
+
Manually review all modified files. Remove generated headers/footers and correct any sections marked with "Issues:".
|
|
330
|
+
|
|
331
|
+
3. **Update API Documents:**
|
|
332
|
+
From the project root, run:
|
|
333
|
+
```bash
|
|
334
|
+
uv run sphinx-apidoc -f --remove-old --module-first -o docs/source/api src/aiqclib
|
|
335
|
+
```
|
|
336
|
+
|
|
337
|
+
4. **Build HTML:**
|
|
338
|
+
From the project root, run:
|
|
339
|
+
```bash
|
|
340
|
+
cd docs; uv run make html; cd ..
|
|
341
|
+
```
|
|
342
|
+
You can view the generated site by opening `docs/build/html/index.html` in a browser.
|
|
343
|
+
|
|
344
|
+
## Deployment (for Maintainers)
|
|
345
|
+
|
|
346
|
+
### PyPI
|
|
347
|
+
|
|
348
|
+
The package is published to [PyPI](https://pypi.org/project/aiqclib/) automatically via a GitHub Action whenever a new release is created on GitHub.
|
|
349
|
+
|
|
350
|
+
### conda-forge (Automatic)
|
|
351
|
+
|
|
352
|
+
The conda-forge bot automatically creates a pull request and merges it into the main branch when a new version of the package is published on PyPI.
|
|
353
|
+
|
|
354
|
+
### conda-forge (Manual)
|
|
355
|
+
|
|
356
|
+
#### Bump version with new dependencies
|
|
357
|
+
|
|
358
|
+
When runtime dependencies change, the automated PR from the conda-forge bot may fail. In that case, you must update the feedstock manually by opening a pull request against the `conda-forge/aiqclib-feedstock` repository.
|
|
359
|
+
|
|
360
|
+
1. **Install build tools:**
|
|
361
|
+
```bash
|
|
362
|
+
mamba install -c conda-forge conda-build conda-smithy grayskull
|
|
363
|
+
```
|
|
364
|
+
2. **Fork and clone** the `aiqclib-feedstock` repository.
|
|
365
|
+
3. **Sync with upstream** (e.g., add `conda-forge/aiqclib-feedstock` as a remote named `upstream` and `git rebase upstream/main`).
|
|
366
|
+
4. **Update the forked repo:**
|
|
367
|
+
```bash
|
|
368
|
+
git checkout main # Go to your local main branch
|
|
369
|
+
git fetch upstream # Get latest changes from original repo
|
|
370
|
+
git rebase upstream/main # Make your local main perfectly linear with original
|
|
371
|
+
git push origin main --force # Update your GitHub fork's main (optional but good practice)
|
|
372
|
+
```
|
|
373
|
+
5. **Create a new branch** (e.g., `git checkout -b update_vX.Y.Z`).
|
|
374
|
+
6. **Generate a strict recipe** (e.g., `grayskull pypi aiqclib --strict-conda-forge`).
|
|
375
|
+
7. **Review `recipe/meta.yaml`** and ensure it meets `conda-forge` standards.
|
|
376
|
+
8. **Rerender the feedstock** (e.g., `conda smithy rerender -c auto`).
|
|
377
|
+
9. **Commit, push, and open a pull request** to the `conda-forge/aiqclib-feedstock` repository.
|
|
378
|
+
10. **Merge it** after passing CI.
|
|
379
|
+
|
|
380
|
+
#### Initial upload
|
|
381
|
+
Submitting the package to `conda-forge` for the first time involves creating a pull request to the `conda-forge/staged-recipes` repository.
|
|
382
|
+
|
|
383
|
+
1. **Fork and clone** the `staged-recipes` repository.
|
|
384
|
+
2. **Configure upstream:** `git remote add upstream https://github.com/conda-forge/staged-recipes.git`.
|
|
385
|
+
3. **Create a new branch** (e.g., `git checkout -b aiqclib-recipe`).
|
|
386
|
+
4. **Generate a strict recipe:** `grayskull pypi aiqclib --strict-conda-forge`.
|
|
387
|
+
5. **Review `recipes/aiqclib/meta.yaml`** and ensure it meets `conda-forge` standards.
|
|
388
|
+
6. **Commit, push, and open a pull request** to the `staged-recipes` repository.
|
|
389
|
+
|
|
390
|
+
### Anaconda.org (Manual)
|
|
391
|
+
|
|
392
|
+
Publishing to the `<username>` channel on [Anaconda.org](https://anaconda.org/takayasaito/aiqclib) is a manual process.
|
|
393
|
+
|
|
394
|
+
1. **Install build tools:**
|
|
395
|
+
```bash
|
|
396
|
+
mamba install -c conda-forge conda-build anaconda-client grayskull
|
|
397
|
+
```
|
|
398
|
+
|
|
399
|
+
2. **Generate Recipe:**
|
|
400
|
+
From the project root, run `grayskull pypi aiqclib`. This creates `aiqclib/meta.yaml`.
|
|
401
|
+
|
|
402
|
+
3. **Build Package:**
|
|
403
|
+
`conda build aiqclib`
|
|
404
|
+
|
|
405
|
+
4. **Upload Package:**
|
|
406
|
+
```bash
|
|
407
|
+
anaconda login
|
|
408
|
+
anaconda upload /path/to/your/conda-bld/noarch/aiqclib-*.conda
|
|
409
|
+
```
|
|
410
|
+
5. **Cleanup:**
|
|
411
|
+
Copy `aiqclib/meta.yaml` to `conda/meta.yaml` for version control and remove the temporary `aiqclib` directory.
|
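For the upload step above, the exact file name under `conda-bld/noarch/` depends on the version and build string. The sketch below locates the most recently built artifact so that its path can be passed to `anaconda upload`. It assumes the build output is a `.conda` file in the `noarch` subdirectory, as in the upload command above; `latest_artifact` is a hypothetical helper, not part of any of the tools mentioned.

```python
from pathlib import Path


def latest_artifact(bld_dir, package="aiqclib", subdir="noarch"):
    """Return the most recently modified .conda file for package, or None."""
    candidates = sorted(
        (Path(bld_dir) / subdir).glob(f"{package}-*.conda"),
        key=lambda path: path.stat().st_mtime,
    )
    return candidates[-1] if candidates else None
```

For example, `latest_artifact("/path/to/your/conda-bld")` returns a `Path` to the newest matching file, or `None` if nothing has been built yet.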