hofvarpnir-hcon 3.20.0__tar.gz

@@ -0,0 +1,28 @@
1
+ BSD 3-Clause License
2
+
3
+ Copyright 2026 Leonard F Haasbroek
4
+
5
+ Redistribution and use in source and binary forms, with or without modification,
6
+ are permitted provided that the following conditions are met:
7
+
8
+ 1. Redistributions of source code must retain the above copyright notice,
9
+ this list of conditions and the following disclaimer.
10
+
11
+ 2. Redistributions in binary form must reproduce the above copyright notice,
12
+ this list of conditions and the following disclaimer in the documentation
13
+ and/or other materials provided with the distribution.
14
+
15
+ 3. Neither the name of the copyright holder nor the names of its contributors
16
+ may be used to endorse or promote products derived from this software without
17
+ specific prior written permission.
18
+
19
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY
20
+ EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
21
+ OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
22
+
23
+ IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
24
+ INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
25
+ PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
26
+ HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
27
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
28
+ EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
@@ -0,0 +1,6 @@
1
+ # MANIFEST.in
2
+ include README.md
3
+ include LICENSE
4
+ include NOTICE
5
+ include THIRD-PARTY-LICENSES.txt
6
+ recursive-include hofvarpnirhcon/docs *.md
@@ -0,0 +1,11 @@
1
+ THIRD-PARTY NOTICE
2
+
3
+ This software incorporates third-party open-source components.
4
+
5
+ A list of third-party dependencies and their associated licenses is provided
6
+ in THIRD-PARTY-LICENSES.txt.
7
+
8
+ These components are used under their respective permissive or weak-copyleft
9
+ licenses (including BSD, MIT, and MPL-2.0).
10
+
11
+ No modification has been made to third-party license terms.
@@ -0,0 +1,275 @@
1
+ Metadata-Version: 2.4
2
+ Name: hofvarpnir-hcon
3
+ Version: 3.20.0
4
+ Summary: HófvarpnirHCON - Fast dictionary-based crystal density prediction from SMILES
5
+ Author: Leonard Haasbroek
6
+ Author-email: leonardfhaasbroek@gmail.com
7
+ License: BSD-3-Clause
8
+ Project-URL: Source, https://github.com/LeonardFH/hofvarpnir-hcon
9
+ Project-URL: Bug Reports, https://github.com/LeonardFH/hofvarpnir-hcon/issues
10
+ Classifier: Development Status :: 4 - Beta
11
+ Classifier: Intended Audience :: Science/Research
12
+ Classifier: Topic :: Scientific/Engineering :: Chemistry
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Programming Language :: Python :: 3.8
15
+ Classifier: Programming Language :: Python :: 3.9
16
+ Classifier: Programming Language :: Python :: 3.10
17
+ Classifier: Programming Language :: Python :: 3.11
18
+ Classifier: Programming Language :: Python :: 3.12
19
+ Classifier: License :: OSI Approved :: BSD License
20
+ Requires-Python: >=3.8
21
+ Description-Content-Type: text/markdown
22
+ License-File: LICENSE
23
+ Requires-Dist: rdkit>=2023.03.1
24
+ Requires-Dist: numpy>=1.21.0
25
+ Requires-Dist: pandas>=1.3.0
26
+ Requires-Dist: tqdm>=4.62.0
27
+ Requires-Dist: scipy>=1.8.0
28
+ Dynamic: author
29
+ Dynamic: author-email
30
+ Dynamic: classifier
31
+ Dynamic: description
32
+ Dynamic: description-content-type
33
+ Dynamic: license
34
+ Dynamic: license-file
35
+ Dynamic: project-url
36
+ Dynamic: requires-dist
37
+ Dynamic: requires-python
38
+ Dynamic: summary
39
+
40
+ # HófvarpnirHCON
41
+
42
+ GitHub: [github.com/LeonardFH/hofvarpnir-hcon](https://github.com/LeonardFH/hofvarpnir-hcon)
43
+ PyPI: [pypi.org/project/hofvarpnir-hcon](https://pypi.org/project/hofvarpnir-hcon/)
44
+
45
+ A modular Python framework for molecular property prediction from SMILES strings.
46
+
47
+ HófvarpnirHCON (pronounced "HOFF-varp-neer-HCON") is designed as a fast and extensible framework for predicting molecular properties of organic compounds containing C, H, O, and N.
48
+
49
+ The name comes from Hófvarpnir, the flying horse of the Norse goddess Gná, and reflects the software's intended speed and range across molecular property spaces.
50
+
51
+ ## Current Status
52
+
53
+ At present, the package implements:
54
+
55
+ - Crystal density prediction for organic molecules
56
+
57
+ The framework is designed with extensibility in mind, allowing additional molecular property predictors to be added in future versions.
58
+
59
+ ## A Friendly Note
60
+
61
+ Hi there,
62
+
63
+ I built HófvarpnirHCON because crystal density prediction should be fast, transparent, and accessible. I'm glad you found it.
64
+
65
+ If you need to get in touch: leonardfhaasbroek@gmail.com
66
+
67
+ ## License
68
+
69
+ This project is distributed under the BSD 3-Clause License.
70
+
71
+ ## Data Sources
72
+
73
+ The training data may be obtained from:
74
+
75
+ - Davis, J. V.; Marrs, F. W.; Cawkwell, M. J.; Manner, V. W. Machine Learning Models for High Explosive Crystal Density and Performance. Chem. Mater. 2024, 36, 11109–11118. DOI: 10.1021/acs.chemmater.4c01978
76
+
77
+ - Mathieu, D. Sensitivity of Energetic Materials: Theoretical Relationships to Detonation Performance and Molecular Structure. Ind. Eng. Chem. Res. 2017, 56, 8191–8201. DOI: 10.1021/acs.iecr.7b02021
78
+
79
+ These datasets are available as Supporting Information with their respective papers.
80
+
81
+ ## Community Benchmarks
82
+
83
+ If you use HófvarpnirHCON on your own dataset, I invite you to share your results.
84
+
85
+ Email: **leonardfhaasbroek@gmail.com**
86
+
87
+ Please include:
88
+ - MAE, RMSE, R²
89
+ - Number of molecules
90
+ - Number of cocrystals
91
+ - Dataset description and source (if public)
92
+
93
+ Results will be posted here (with your permission).
94
+
95
+ ## Documentation
96
+
97
+ For a detailed explanation of the method, see [hofvarpnirhcon/docs/METHODS.md](hofvarpnirhcon/docs/METHODS.md).
98
+
99
+ ## Installation
100
+
101
+     pip install hofvarpnir-hcon
102
+
103
+ ## Quick Start: Train and Predict in Thonny
104
+
105
+ ```python
106
+ # Copy and paste this entire script into Thonny and run it:
107
+
108
+ from hofvarpnirhcon import train_density, predict_density, predict_density_batch
109
+ import pandas as pd
110
+ import numpy as np
111
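+ from sklearn.metrics import mean_absolute_error  # needs scikit-learn, which is not listed in this package's Requires-Dist (pip install scikit-learn)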
112
+
113
+ # ============================================================
114
+ # STEP 1: Download a dataset from one of the papers above
115
+ # Save it as "trainingdata.csv" with columns: SMILES, Density
116
+ # ============================================================
117
+
118
+ # ============================================================
119
+ # STEP 2: Train your own weights
120
+ # ============================================================
121
+
122
+ print("Training model...")
123
+ weights = train_density(
124
+     data_path="trainingdata.csv",
125
+     output_path="my_weights.pkl",
126
+     filter_cocrystals=True,   # Train on pure crystals only (recommended)
127
+     filter_hcon=True,         # Train on H,C,O,N atoms only (recommended)
128
+     verbose=True
129
+ )
130
+ print("Training complete! Weights saved to my_weights.pkl")
131
+
132
+ # ============================================================
133
+ # STEP 3: Load the dataset for predictions
134
+ # ============================================================
135
+
136
+ df = pd.read_csv("trainingdata.csv")
137
+ smiles_list = df["SMILES"].tolist()
138
+ actuals = df["Density"].values
139
+
140
+ # ============================================================
141
+ # STEP 4: Single molecule prediction
142
+ # ============================================================
143
+
144
+ print("\n" + "=" * 60)
145
+ print("SINGLE MOLECULE PREDICTION")
146
+ print("=" * 60)
147
+
148
+ test_smiles = smiles_list[0]
149
+ test_actual = actuals[0]
150
+ pred = predict_density(test_smiles, weights_path="my_weights.pkl")
151
+ print(f"SMILES: {test_smiles}")
152
+ print(f"Actual density: {test_actual:.4f} g/cm³")
153
+ print(f"Predicted density: {pred:.4f} g/cm³")
154
+ print(f"Error: {abs(pred - test_actual):.4f} g/cm³")
155
+
156
+ # ============================================================
157
+ # STEP 5: Batch prediction on entire dataset
158
+ # ============================================================
159
+
160
+ print("\n" + "=" * 60)
161
+ print("BATCH PREDICTION")
162
+ print("=" * 60)
163
+
164
+ print(f"Predicting {len(smiles_list)} molecules...")
165
+ predictions = predict_density_batch(
166
+     smiles_list=smiles_list,
167
+     weights_path="my_weights.pkl",
168
+     verbose=True
169
+ )
170
+
171
+ # ============================================================
172
+ # STEP 6: Calculate MAE and show results (FILTER NONE VALUES)
173
+ # ============================================================
174
+
175
+ # Filter out None values (failed predictions)
176
+ valid_mask = [p is not None for p in predictions]
177
+ valid_actuals = np.array(actuals)[valid_mask]
178
+ valid_predictions = [p for p in predictions if p is not None]
179
+
180
+ print(f"\n✅ Valid predictions: {len(valid_predictions):,} / {len(smiles_list):,}")
181
+
182
+ if len(valid_predictions) == 0:
183
+     print("❌ No valid predictions. Check your SMILES strings.")
184
+     raise SystemExit(1)
185
+
186
+ mae = mean_absolute_error(valid_actuals, valid_predictions)
187
+ rmse = np.sqrt(np.mean((np.array(valid_predictions) - valid_actuals) ** 2))
188
+ r2 = np.corrcoef(valid_predictions, valid_actuals)[0, 1] ** 2  # squared Pearson correlation, not the coefficient of determination
189
+
190
+ print(f"\nModel Performance:")
191
+ print(f" MAE: {mae:.4f} g/cm³")
192
+ print(f" RMSE: {rmse:.4f} g/cm³")
193
+ print(f" R²: {r2:.4f}")
194
+
195
+ print("\nFirst 10 predictions:")
196
+ print("-" * 70)
197
+ print(f"{'SMILES':<35} {'Actual':>10} {'Predicted':>10} {'Error':>10}")
198
+ print("-" * 70)
199
+
200
+ for i in range(min(10, len(valid_predictions))):
201
+     smiles = np.array(smiles_list)[valid_mask][i][:35]  # same mask as valid_actuals, so rows stay aligned
202
+     actual = valid_actuals[i]
203
+     pred = valid_predictions[i]
204
+     error = abs(pred - actual)
205
+     print(f"{smiles:<35} {actual:>10.4f} {pred:>10.4f} {error:>10.4f}")
206
+
207
+ print("-" * 70)
208
+ print(f"MAE: {mae:.4f} g/cm³")
209
+ print("\n✅ All done! Weights saved to my_weights.pkl")
210
+
211
+ # ============================================================
212
+ # STEP 7: Save results to CSV
213
+ # ============================================================
214
+
215
+ results_df = pd.DataFrame({
216
+     'SMILES': np.array(smiles_list)[valid_mask],  # drop the same failed rows as the other columns
217
+     'Actual_Density': valid_actuals,
218
+     'Predicted_Density': valid_predictions,
219
+     'Error': np.array(valid_predictions) - valid_actuals,
220
+     'Abs_Error': np.abs(np.array(valid_predictions) - valid_actuals),
221
+ })
222
+
223
+ results_df.to_csv('prediction_results.csv', index=False)
224
+ print("\n💾 Results saved to: prediction_results.csv")
225
+
226
+ # ============================================================
227
+ # USAGE EXAMPLES
228
+ # ============================================================
229
+
230
+ # Single molecule prediction:
231
+ from hofvarpnirhcon import predict_density
232
+
233
+ density = predict_density("CCO", weights_path="my_weights.pkl")
234
+ print(f"{density:.3f} g/cm³")
235
+
236
+ # Batch prediction:
237
+ from hofvarpnirhcon import predict_density_batch
238
+
239
+ smiles_list = ["CCO", "CC", "c1ccccc1", "O"]
240
+ results = predict_density_batch(smiles_list, weights_path="my_weights.pkl")
241
+
242
+ for smiles, density in zip(smiles_list, results):
243
+     print(f"{smiles}: {density:.3f} g/cm³" if density is not None else f"{smiles}: prediction failed")
244
+ ```
245
+
246
+ ## Performance
247
+
248
+ - MAE: ~0.0300 g/cm³ on HCON molecules
249
+ - Speed: ~1,800 molecules/second (1 core/thread)
250
+ - Speed: ~2,700 molecules/second (2 cores/threads)
251
+ - Speed: ~3,500 molecules/second (4 cores/threads, the maximum achieved)
252
+
253
+ ## Tips for Best Performance
254
+
255
+ For optimal accuracy, we recommend training separate dictionaries for each chemical family:
256
+
257
+ - **HCON only** (C, H, N, O) — best overall performance
258
+ - **HCON + F** — fluorine-containing molecules
259
+ - **HCON + Cl** — chlorine-containing molecules
260
+ - **HCON + S** — sulfur-containing molecules
261
+ - **HCON + P** — phosphorus-containing molecules
262
+
263
+ **Avoid mixing different heteroatom types** (e.g., S and Cl together) in a single training run, as this can degrade prediction accuracy.
264
+
265
+ For molecules containing rare halogens (Br, I), we recommend using the HCON-only dictionaries, as there is insufficient data to train reliable halogen-specific overlaps.
266
+
267
+ ## Important Note on Polymorphs
268
+
269
+ The model predicts a single crystal density per SMILES string. For molecules with multiple known polymorphs (e.g., ROY, carbamazepine), the prediction corresponds to a **centroid** density within the experimental range. It does **not** predict individual polymorph forms.
270
+
271
+ ## Citation
272
+
273
+ If you use this software in your research, please cite:
274
+
275
+ Haasbroek, L. F. (2026). HófvarpnirHCON: Fast dictionary-based crystal density prediction.
@@ -0,0 +1,236 @@
1
+ # HófvarpnirHCON
2
+
3
+ GitHub: [github.com/LeonardFH/hofvarpnir-hcon](https://github.com/LeonardFH/hofvarpnir-hcon)
4
+ PyPI: [pypi.org/project/hofvarpnir-hcon](https://pypi.org/project/hofvarpnir-hcon/)
5
+
6
+ A modular Python framework for molecular property prediction from SMILES strings.
7
+
8
+ HófvarpnirHCON (pronounced "HOFF-varp-neer-HCON") is designed as a fast and extensible framework for predicting molecular properties of organic compounds containing C, H, O, and N.
9
+
10
+ The name comes from Hófvarpnir, the flying horse of the Norse goddess Gná, and reflects the software's intended speed and range across molecular property spaces.
11
+
12
+ ## Current Status
13
+
14
+ At present, the package implements:
15
+
16
+ - Crystal density prediction for organic molecules
17
+
18
+ The framework is designed with extensibility in mind, allowing additional molecular property predictors to be added in future versions.
19
+
20
+ ## A Friendly Note
21
+
22
+ Hi there,
23
+
24
+ I built HófvarpnirHCON because crystal density prediction should be fast, transparent, and accessible. I'm glad you found it.
25
+
26
+ If you need to get in touch: leonardfhaasbroek@gmail.com
27
+
28
+ ## License
29
+
30
+ This project is distributed under the BSD 3-Clause License.
31
+
32
+ ## Data Sources
33
+
34
+ The training data may be obtained from:
35
+
36
+ - Davis, J. V.; Marrs, F. W.; Cawkwell, M. J.; Manner, V. W. Machine Learning Models for High Explosive Crystal Density and Performance. Chem. Mater. 2024, 36, 11109–11118. DOI: 10.1021/acs.chemmater.4c01978
37
+
38
+ - Mathieu, D. Sensitivity of Energetic Materials: Theoretical Relationships to Detonation Performance and Molecular Structure. Ind. Eng. Chem. Res. 2017, 56, 8191–8201. DOI: 10.1021/acs.iecr.7b02021
39
+
40
+ These datasets are available as Supporting Information with their respective papers.
41
+
42
+ ## Community Benchmarks
43
+
44
+ If you use HófvarpnirHCON on your own dataset, I invite you to share your results.
45
+
46
+ Email: **leonardfhaasbroek@gmail.com**
47
+
48
+ Please include:
49
+ - MAE, RMSE, R²
50
+ - Number of molecules
51
+ - Number of cocrystals
52
+ - Dataset description and source (if public)
53
+
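The three metrics requested above can be computed without extra dependencies. The sketch below is illustrative only: `regression_metrics` is a hypothetical helper, not part of hofvarpnirhcon, and the sample densities are made up. It reports R² as the coefficient of determination and also returns the squared Pearson correlation, because the two differ whenever predictions carry a systematic bias.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, RMSE, R2 (coefficient of determination) and squared Pearson r.

    Hypothetical helper for illustration; assumes two equal-length sequences
    of floats with at least two non-identical values in each.
    """
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(actual)
    residuals = [p - a for a, p in zip(actual, predicted)]
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    ss_res = sum(r * r for r in residuals)                # residual sum of squares
    ss_tot = sum((a - mean_a) ** 2 for a in actual)       # spread of the actual values
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)   # spread of the predictions
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))

    return {
        "MAE": sum(abs(r) for r in residuals) / n,
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
        "pearson_r2": cov * cov / (ss_tot * ss_pred),
    }


# Made-up densities in g/cm³: every prediction is 0.05 too high.
metrics = regression_metrics([1.60, 1.70, 1.80], [1.65, 1.75, 1.85])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case the constant offset leaves the squared Pearson correlation at 1.0 while the coefficient of determination falls to about 0.625, so stating which definition was used makes shared results easier to compare.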
54
+ Results will be posted here (with your permission).
55
+
56
+ ## Documentation
57
+
58
+ For a detailed explanation of the method, see [hofvarpnirhcon/docs/METHODS.md](hofvarpnirhcon/docs/METHODS.md).
59
+
60
+ ## Installation
61
+
62
+     pip install hofvarpnir-hcon
63
+
64
+ ## Quick Start: Train and Predict in Thonny
65
+
66
+ ```python
67
+ # Copy and paste this entire script into Thonny and run it:
68
+
69
+ from hofvarpnirhcon import train_density, predict_density, predict_density_batch
70
+ import pandas as pd
71
+ import numpy as np
72
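+ from sklearn.metrics import mean_absolute_error  # needs scikit-learn, which is not listed in this package's Requires-Dist (pip install scikit-learn)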
73
+
74
+ # ============================================================
75
+ # STEP 1: Download a dataset from one of the papers above
76
+ # Save it as "trainingdata.csv" with columns: SMILES, Density
77
+ # ============================================================
78
+
79
+ # ============================================================
80
+ # STEP 2: Train your own weights
81
+ # ============================================================
82
+
83
+ print("Training model...")
84
+ weights = train_density(
85
+     data_path="trainingdata.csv",
86
+     output_path="my_weights.pkl",
87
+     filter_cocrystals=True,   # Train on pure crystals only (recommended)
88
+     filter_hcon=True,         # Train on H,C,O,N atoms only (recommended)
89
+     verbose=True
90
+ )
91
+ print("Training complete! Weights saved to my_weights.pkl")
92
+
93
+ # ============================================================
94
+ # STEP 3: Load the dataset for predictions
95
+ # ============================================================
96
+
97
+ df = pd.read_csv("trainingdata.csv")
98
+ smiles_list = df["SMILES"].tolist()
99
+ actuals = df["Density"].values
100
+
101
+ # ============================================================
102
+ # STEP 4: Single molecule prediction
103
+ # ============================================================
104
+
105
+ print("\n" + "=" * 60)
106
+ print("SINGLE MOLECULE PREDICTION")
107
+ print("=" * 60)
108
+
109
+ test_smiles = smiles_list[0]
110
+ test_actual = actuals[0]
111
+ pred = predict_density(test_smiles, weights_path="my_weights.pkl")
112
+ print(f"SMILES: {test_smiles}")
113
+ print(f"Actual density: {test_actual:.4f} g/cm³")
114
+ print(f"Predicted density: {pred:.4f} g/cm³")
115
+ print(f"Error: {abs(pred - test_actual):.4f} g/cm³")
116
+
117
+ # ============================================================
118
+ # STEP 5: Batch prediction on entire dataset
119
+ # ============================================================
120
+
121
+ print("\n" + "=" * 60)
122
+ print("BATCH PREDICTION")
123
+ print("=" * 60)
124
+
125
+ print(f"Predicting {len(smiles_list)} molecules...")
126
+ predictions = predict_density_batch(
127
+     smiles_list=smiles_list,
128
+     weights_path="my_weights.pkl",
129
+     verbose=True
130
+ )
131
+
132
+ # ============================================================
133
+ # STEP 6: Calculate MAE and show results (FILTER NONE VALUES)
134
+ # ============================================================
135
+
136
+ # Filter out None values (failed predictions)
137
+ valid_mask = [p is not None for p in predictions]
138
+ valid_actuals = np.array(actuals)[valid_mask]
139
+ valid_predictions = [p for p in predictions if p is not None]
140
+
141
+ print(f"\n✅ Valid predictions: {len(valid_predictions):,} / {len(smiles_list):,}")
142
+
143
+ if len(valid_predictions) == 0:
144
+     print("❌ No valid predictions. Check your SMILES strings.")
145
+     raise SystemExit(1)
146
+
147
+ mae = mean_absolute_error(valid_actuals, valid_predictions)
148
+ rmse = np.sqrt(np.mean((np.array(valid_predictions) - valid_actuals) ** 2))
149
+ r2 = np.corrcoef(valid_predictions, valid_actuals)[0, 1] ** 2  # squared Pearson correlation, not the coefficient of determination
150
+
151
+ print(f"\nModel Performance:")
152
+ print(f" MAE: {mae:.4f} g/cm³")
153
+ print(f" RMSE: {rmse:.4f} g/cm³")
154
+ print(f" R²: {r2:.4f}")
155
+
156
+ print("\nFirst 10 predictions:")
157
+ print("-" * 70)
158
+ print(f"{'SMILES':<35} {'Actual':>10} {'Predicted':>10} {'Error':>10}")
159
+ print("-" * 70)
160
+
161
+ for i in range(min(10, len(valid_predictions))):
162
+     smiles = np.array(smiles_list)[valid_mask][i][:35]  # same mask as valid_actuals, so rows stay aligned
163
+     actual = valid_actuals[i]
164
+     pred = valid_predictions[i]
165
+     error = abs(pred - actual)
166
+     print(f"{smiles:<35} {actual:>10.4f} {pred:>10.4f} {error:>10.4f}")
167
+
168
+ print("-" * 70)
169
+ print(f"MAE: {mae:.4f} g/cm³")
170
+ print("\n✅ All done! Weights saved to my_weights.pkl")
171
+
172
+ # ============================================================
173
+ # STEP 7: Save results to CSV
174
+ # ============================================================
175
+
176
+ results_df = pd.DataFrame({
177
+     'SMILES': np.array(smiles_list)[valid_mask],  # drop the same failed rows as the other columns
178
+     'Actual_Density': valid_actuals,
179
+     'Predicted_Density': valid_predictions,
180
+     'Error': np.array(valid_predictions) - valid_actuals,
181
+     'Abs_Error': np.abs(np.array(valid_predictions) - valid_actuals),
182
+ })
183
+
184
+ results_df.to_csv('prediction_results.csv', index=False)
185
+ print("\n💾 Results saved to: prediction_results.csv")
186
+
187
+ # ============================================================
188
+ # USAGE EXAMPLES
189
+ # ============================================================
190
+
191
+ # Single molecule prediction:
192
+ from hofvarpnirhcon import predict_density
193
+
194
+ density = predict_density("CCO", weights_path="my_weights.pkl")
195
+ print(f"{density:.3f} g/cm³")
196
+
197
+ # Batch prediction:
198
+ from hofvarpnirhcon import predict_density_batch
199
+
200
+ smiles_list = ["CCO", "CC", "c1ccccc1", "O"]
201
+ results = predict_density_batch(smiles_list, weights_path="my_weights.pkl")
202
+
203
+ for smiles, density in zip(smiles_list, results):
204
+     print(f"{smiles}: {density:.3f} g/cm³" if density is not None else f"{smiles}: prediction failed")
205
+ ```
206
+
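Step 6 of the script drops failed predictions (`None`), so any per-molecule report has to drop the same rows from the SMILES and the measured densities. A minimal sketch of one way to keep the three columns aligned follows; `align_valid` is a hypothetical helper, not part of hofvarpnirhcon, and the sample values are invented.

```python
def align_valid(smiles_list, actuals, predictions):
    """Keep only rows whose prediction is not None, preserving row alignment.

    Hypothetical helper: assumes the three sequences have equal length and
    that a failed prediction is represented by None, as in the script above.
    """
    rows = [
        (smiles, actual, pred)
        for smiles, actual, pred in zip(smiles_list, actuals, predictions)
        if pred is not None
    ]
    if not rows:
        return [], [], []
    # Transpose the surviving rows back into three parallel lists.
    smiles, acts, preds = (list(column) for column in zip(*rows))
    return smiles, acts, preds


# Invented example: the second entry stands in for a SMILES that failed.
smiles, acts, preds = align_valid(
    ["CCO", "not-a-smiles", "CC"],
    [1.00, 2.00, 3.00],
    [1.10, None, 2.90],
)
for s, a, p in zip(smiles, acts, preds):
    print(f"{s}: actual={a:.2f} predicted={p:.2f}")
```

Filtering rows as tuples, rather than slicing each list separately, means a failure in the middle of the dataset cannot shift later SMILES onto the wrong densities.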
207
+ ## Performance
208
+
209
+ - MAE: ~0.0300 g/cm³ on HCON molecules
210
+ - Speed: ~1,800 molecules/second (1 core/thread)
211
+ - Speed: ~2,700 molecules/second (2 cores/threads)
212
+ - Speed: ~3,500 molecules/second (4 cores/threads, the maximum achieved)
213
+
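Throughput figures like those above depend on hardware and dataset, so it can help to measure them locally. The sketch below is an assumption-laden illustration, not the author's benchmark: `measure_throughput` and `fake_predict_batch` are hypothetical names, and the stand-in predictor only exists so the example runs without trained weights.

```python
import time


def measure_throughput(predict_batch, smiles_list, repeats=3):
    """Return the best molecules-per-second rate over `repeats` timed runs."""
    best_seconds = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        predict_batch(smiles_list)
        best_seconds = min(best_seconds, time.perf_counter() - start)
    # Guard against a timer resolution too coarse to register the run.
    return len(smiles_list) / best_seconds if best_seconds > 0 else float("inf")


def fake_predict_batch(smiles_list):
    """Hypothetical stand-in predictor; returns meaningless numbers."""
    return [1.0 + 0.01 * len(smiles) for smiles in smiles_list]


rate = measure_throughput(fake_predict_batch, ["CCO", "CC", "c1ccccc1", "O"] * 1000)
print(f"{rate:,.0f} molecules/second")
```

With the real package, the callable could be `lambda batch: predict_density_batch(batch, weights_path="my_weights.pkl")`, using the call shown in the Quick Start. Taking the best of several runs reduces the influence of one-off costs such as loading the weights file.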
214
+ ## Tips for Best Performance
215
+
216
+ For optimal accuracy, we recommend training separate dictionaries for each chemical family:
217
+
218
+ - **HCON only** (C, H, N, O) — best overall performance
219
+ - **HCON + F** — fluorine-containing molecules
220
+ - **HCON + Cl** — chlorine-containing molecules
221
+ - **HCON + S** — sulfur-containing molecules
222
+ - **HCON + P** — phosphorus-containing molecules
223
+
224
+ **Avoid mixing different heteroatom types** (e.g., S and Cl together) in a single training run, as this can degrade prediction accuracy.
225
+
226
+ For molecules containing rare halogens (Br, I), we recommend using the HCON-only dictionaries, as there is insufficient data to train reliable halogen-specific overlaps.
227
+
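One way to prepare per-family training files is to group the dataset by the heteroatoms each SMILES contains. The sketch below is a rough heuristic under stated assumptions: it scans the SMILES text with regular expressions instead of parsing it with RDKit, all function names and the "mixed/other" label are hypothetical, and Br and I are routed to the HCON group to follow the recommendation above.

```python
import re
from collections import defaultdict

HCON = {"H", "C", "O", "N"}
FAMILY_HETEROATOMS = {"F", "Cl", "S", "P"}  # families listed above
RARE_HALOGENS = {"Br", "I"}                 # handled by the HCON-only dictionary

# Element symbol at the start of a bracket atom, e.g. [N+], [nH], [13CH4].
_BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})")
# Atoms that SMILES allows outside brackets (aliphatic and aromatic forms).
_ORGANIC_ATOM = re.compile(r"Cl|Br|[BCNOPSFI]|[bcnops]")


def elements_in_smiles(smiles):
    """Heuristic element set for a SMILES string (no chemical validation)."""
    found = {match.group(1) for match in _BRACKET_ATOM.finditer(smiles)}
    outside_brackets = re.sub(r"\[[^\]]*\]", "", smiles)
    found.update(_ORGANIC_ATOM.findall(outside_brackets))
    # Aromatic atoms are lowercase in SMILES; normalise to element symbols.
    return {symbol.capitalize() for symbol in found}


def chemical_family(smiles):
    """Return 'HCON', 'HCON+X' for one listed heteroatom, or 'mixed/other'."""
    extras = elements_in_smiles(smiles) - HCON - RARE_HALOGENS
    if not extras:
        return "HCON"
    if len(extras) == 1 and extras <= FAMILY_HETEROATOMS:
        return "HCON+" + next(iter(extras))
    return "mixed/other"


def split_by_family(smiles_list):
    """Group SMILES strings by the family label assigned above."""
    groups = defaultdict(list)
    for smiles in smiles_list:
        groups[chemical_family(smiles)].append(smiles)
    return dict(groups)


for family, members in split_by_family(
    ["C[N+](=O)[O-]", "FC(F)(F)c1ccccc1", "CS(=O)C", "ClCCS", "BrCC"]
).items():
    print(family, members)
```

Each group could then be written to its own CSV and passed to `train_density` separately. For production use, parsing with RDKit (already a dependency) would be more reliable than this text scan.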
228
+ ## Important Note on Polymorphs
229
+
230
+ The model predicts a single crystal density per SMILES string. For molecules with multiple known polymorphs (e.g., ROY, carbamazepine), the prediction corresponds to a **centroid** density within the experimental range. It does **not** predict individual polymorph forms.
231
+
232
+ ## Citation
233
+
234
+ If you use this software in your research, please cite:
235
+
236
+ Haasbroek, L. F. (2026). HófvarpnirHCON: Fast dictionary-based crystal density prediction.
@@ -0,0 +1,39 @@
1
+ THIRD-PARTY LICENSES
2
+
3
+ This file lists third-party dependencies used by this software.
4
+
5
+ ============================================================
6
+
7
+ Package: numpy 2.2.6
8
+ License: BSD License
9
+
10
+ ============================================================
11
+
12
+ Package: pandas 2.3.3
13
+ License: BSD License
14
+
15
+ ============================================================
16
+
17
+ Package: scipy 1.15.3
18
+ License: BSD License
19
+
20
+ ============================================================
21
+
22
+ Package: scikit-learn 1.7.1
23
+ License: BSD-3-Clause
24
+
25
+ ============================================================
26
+
27
+ Package: rdkit 2025.3.6
28
+ License: BSD-3-Clause
29
+
30
+ ============================================================
31
+
32
+ Package: tqdm 4.68.1
33
+ License: MPL-2.0 AND MIT
34
+
35
+ ============================================================
36
+
37
+ NOTE:
38
+ Full license texts are available in the upstream package distributions
39
+ and are included when these dependencies are installed via pip.