bblean 0.6.0b1__cp312-cp312-macosx_10_13_universal2.whl
- bblean/__init__.py +22 -0
- bblean/_config.py +61 -0
- bblean/_console.py +187 -0
- bblean/_cpp_similarity.cpython-312-darwin.so +0 -0
- bblean/_legacy/__init__.py +0 -0
- bblean/_legacy/bb_int64.py +1252 -0
- bblean/_legacy/bb_uint8.py +1144 -0
- bblean/_memory.py +198 -0
- bblean/_merges.py +212 -0
- bblean/_py_similarity.py +278 -0
- bblean/_timer.py +42 -0
- bblean/_version.py +34 -0
- bblean/analysis.py +258 -0
- bblean/bitbirch.py +1437 -0
- bblean/cli.py +1854 -0
- bblean/csrc/README.md +1 -0
- bblean/csrc/similarity.cpp +521 -0
- bblean/fingerprints.py +424 -0
- bblean/metrics.py +199 -0
- bblean/multiround.py +489 -0
- bblean/plotting.py +479 -0
- bblean/similarity.py +304 -0
- bblean/sklearn.py +203 -0
- bblean/smiles.py +61 -0
- bblean/utils.py +130 -0
- bblean-0.6.0b1.dist-info/METADATA +283 -0
- bblean-0.6.0b1.dist-info/RECORD +31 -0
- bblean-0.6.0b1.dist-info/WHEEL +6 -0
- bblean-0.6.0b1.dist-info/entry_points.txt +2 -0
- bblean-0.6.0b1.dist-info/licenses/LICENSE +48 -0
- bblean-0.6.0b1.dist-info/top_level.txt +1 -0
|
@@ -0,0 +1,283 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: bblean
|
|
3
|
+
Version: 0.6.0b1
|
|
4
|
+
Summary: BitBirch-Lean Python package
|
|
5
|
+
Author: The Miranda-Quintana Lab and other BitBirch developers
|
|
6
|
+
Author-email: Ramon Alain Miranda Quintana <quintana@chem.ufl.edu>, Krisztina Zsigmond <kzsigmond@ufl.edu>, Ignacio Pickering <ipickering@ufl.edu>, Kenneth Lopez Perez <klopezperez@chem.ufl.edu>, Miroslav Lzicar <miroslav.lzicar@deepmedchem.com>
|
|
7
|
+
License-Expression: GPL-3.0-only
|
|
8
|
+
Project-URL: homepage, https://github.com/mqcomplab/bblean.git
|
|
9
|
+
Project-URL: repository, https://github.com/mqcomplab/bblean.git
|
|
10
|
+
Project-URL: documentation, https://github.com/mqcomplab/bblean.git
|
|
11
|
+
Classifier: Intended Audience :: Science/Research
|
|
12
|
+
Classifier: Intended Audience :: Developers
|
|
13
|
+
Classifier: Programming Language :: Python
|
|
14
|
+
Classifier: Topic :: Software Development
|
|
15
|
+
Classifier: Topic :: Scientific/Engineering
|
|
16
|
+
Classifier: Development Status :: 4 - Beta
|
|
17
|
+
Classifier: Operating System :: POSIX
|
|
18
|
+
Classifier: Operating System :: Unix
|
|
19
|
+
Classifier: Operating System :: MacOS
|
|
20
|
+
Classifier: Programming Language :: Python :: 3
|
|
21
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
22
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
23
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
24
|
+
Classifier: Programming Language :: Python :: Implementation :: CPython
|
|
25
|
+
Requires-Python: >=3.11
|
|
26
|
+
Description-Content-Type: text/markdown
|
|
27
|
+
License-File: LICENSE
|
|
28
|
+
Requires-Dist: scipy
|
|
29
|
+
Requires-Dist: rdkit
|
|
30
|
+
Requires-Dist: numpy>=2.0
|
|
31
|
+
Requires-Dist: pandas
|
|
32
|
+
Requires-Dist: psutil
|
|
33
|
+
Requires-Dist: matplotlib
|
|
34
|
+
Requires-Dist: colorcet
|
|
35
|
+
Requires-Dist: seaborn
|
|
36
|
+
Requires-Dist: scikit-learn
|
|
37
|
+
Requires-Dist: rich
|
|
38
|
+
Requires-Dist: typer
|
|
39
|
+
Requires-Dist: opentsne
|
|
40
|
+
Requires-Dist: umap-learn
|
|
41
|
+
Requires-Dist: pynndescent
|
|
42
|
+
Dynamic: license-file
|
|
43
|
+
|
|
44
|
+
<picture>
|
|
45
|
+
<source media="(prefers-color-scheme: dark)" srcset="https://raw.githubusercontent.com/mqcomplab/bblean/main/docs/src/_static/logo-dark-bw.svg">
|
|
46
|
+
<source media="(prefers-color-scheme: light)" srcset="https://raw.githubusercontent.com/mqcomplab/bblean/main/docs/src/_static/logo-light-bw.svg">
|
|
47
|
+
<img alt="BitBIRCH-Lean logo" src="https://raw.githubusercontent.com/mqcomplab/bblean/main/docs/src/_static/logo-light-bw.svg">
|
|
48
|
+
</picture>
|
|
49
|
+
<br>
|
|
50
|
+
<br>
|
|
51
|
+
|
|
52
|
+
[](https://doi.org/10.5281/zenodo.17139445)
|
|
53
|
+
[](https://www.gnu.org/licenses/gpl-3.0)
|
|
54
|
+
[](https://github.com/mqcomplab/bblean/actions/workflows/ci.yaml)
|
|
55
|
+
[](https://github.com/psf/black)
|
|
56
|
+

|
|
57
|
+
|
|
58
|
+
## Overview
|
|
59
|
+
|
|
60
|
+
BitBIRCH-Lean is a high-throughput implementation of the BitBIRCH clustering
|
|
61
|
+
algorithm designed for very large molecular libraries.
|
|
62
|
+
|
|
63
|
+
If you find this software useful please cite the following articles:
|
|
64
|
+
|
|
65
|
+
- *BitBIRCH: efficient clustering of large molecular libraries*:
|
|
66
|
+
https://doi.org/10.1039/D5DD00030K
|
|
67
|
+
- *BitBIRCH Clustering Refinement Strategies*:
|
|
68
|
+
https://doi.org/10.1021/acs.jcim.5c00627
|
|
69
|
+
- *BitBIRCH-Lean*:
|
|
70
|
+
(preprint) https://www.biorxiv.org/content/10.1101/2025.10.22.684015v1
|
|
71
|
+
|
|
72
|
+
**NOTE**: BitBIRCH-Lean is currently beta software; expect minor breaking changes until
|
|
73
|
+
we hit version 1.0.
|
|
74
|
+
|
|
75
|
+
The [documentation](https://mqcomplab.github.io/bblean/devdocs) of the developer version is a work in progress. Please let us know if you find any issues.
|
|
76
|
+
|
|
77
|
+
⚠️ **Important**: The default `threshold` is 0.3 and the default fingerprint kind is
|
|
78
|
+
*ecfp4*. We recommend setting `threshold` to 0.5-0.65 for *rdkit* fingerprints and
|
|
79
|
+
0.3-0.4 for *ecfp4* or *ecfp6* fingerprints (although you may need further tuning for
|
|
80
|
+
your specific library / fingerprint set). For more information on tuning these
|
|
81
|
+
parameters see [the best
|
|
82
|
+
practices](https://mqcomplab.github.io/bblean/devdocs/user-guide/notebooks/bitbirch_best_practices.html)
|
|
83
|
+
and [parameter
|
|
84
|
+
tuning](https://mqcomplab.github.io/bblean/devdocs/user-guide/parameters.html) guides.
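The recommended ranges above can be kept in a small lookup table so that scripts pick a starting `threshold` consistently. This is only a sketch: `RECOMMENDED_THRESHOLDS` and `suggest_threshold` are hypothetical names that are not part of `bblean`, and using the midpoint of each range is an arbitrary starting point, not a tuned value.

```python
# Recommended `threshold` ranges as quoted in the README text above.
# The names below are illustrative only; they are not part of bblean.
RECOMMENDED_THRESHOLDS = {
    "rdkit": (0.5, 0.65),
    "ecfp4": (0.3, 0.4),
    "ecfp6": (0.3, 0.4),
}


def suggest_threshold(kind: str) -> float:
    """Return the midpoint of the recommended range for a fingerprint kind."""
    if kind not in RECOMMENDED_THRESHOLDS:
        raise ValueError(f"no recommendation for fingerprint kind {kind!r}")
    low, high = RECOMMENDED_THRESHOLDS[kind]
    # Midpoint is an arbitrary starting value; tune it for your own library.
    return round((low + high) / 2, 3)


for kind in RECOMMENDED_THRESHOLDS:
    print(kind, suggest_threshold(kind))
```

The value returned is meant as a first guess; the best-practices and parameter-tuning guides linked above remain the reference for choosing a final threshold.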
|
|
85
|
+
|
|
86
|
+
## Installation
|
|
87
|
+
|
|
88
|
+
BitBIRCH-Lean requires Python 3.11 or higher and can be installed on Linux or macOS.
|
|
89
|
+
Install it via pip, which automatically includes the C++ extensions:
|
|
90
|
+
|
|
91
|
+
```bash
|
|
92
|
+
pip install bblean
|
|
93
|
+
```
|
|
94
|
+
We recommend installing `bblean` in a conda environment or a `venv`.
|
|
95
|
+
|
|
96
|
+
### From source
|
|
97
|
+
|
|
98
|
+
To build from source instead (editable mode):
|
|
99
|
+
|
|
100
|
+
```bash
|
|
101
|
+
git clone git@github.com:mqcomplab/bblean
|
|
102
|
+
cd bblean
|
|
103
|
+
|
|
104
|
+
conda env create --file ./environment.yaml
|
|
105
|
+
conda activate bblean
|
|
106
|
+
|
|
107
|
+
BITBIRCH_BUILD_CPP=1 pip install -e .
|
|
108
|
+
|
|
109
|
+
# If you want to build without the C++ extensions run this instead:
|
|
110
|
+
pip install -e .
|
|
111
|
+
|
|
112
|
+
bb --help
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
If the extensions install successfully, they will be automatically used each time
|
|
116
|
+
BitBIRCH-Lean or its classes are used; no further setup is needed.
|
|
117
|
+
|
|
118
|
+
If you run into any issues when installing the extensions, please open a GitHub issue
|
|
119
|
+
and tag it with `C++`.
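To check whether the compiled extension was actually installed, one option is to ask the import system whether it can locate the module. The sketch below assumes the extension is importable as `bblean._cpp_similarity`, a name inferred from the wheel's file listing (`bblean/_cpp_similarity.cpython-312-darwin.so`). That module is private, so this is a diagnostic only, and `module_available` is a hypothetical helper.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the import system can locate the module `name`."""
    try:
        # For dotted names, find_spec imports the parent package first.
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package is missing or fails to import.
        return False


# Module name taken from the wheel's file listing (assumption, private module).
print("C++ extension found:", module_available("bblean._cpp_similarity"))
```

A `False` result here is consistent with a pure-Python install (for example, `pip install -e .` without `BITBIRCH_BUILD_CPP=1`).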
|
|
120
|
+
|
|
121
|
+
## CLI Quickstart
|
|
122
|
+
|
|
123
|
+
<div align="center">
|
|
124
|
+
<img src="bblean-demo-v2.gif" width="600" />
|
|
125
|
+
</div>
|
|
126
|
+
|
|
127
|
+
BitBIRCH-Lean provides a convenient CLI, `bb`. The CLI can be used to convert
|
|
128
|
+
SMILES files into compact fingerprint arrays, and cluster them in parallel or serial
|
|
129
|
+
mode with a single command, making it straightforward to triage collections with
|
|
130
|
+
millions of molecules. The CLI prints a run banner with the parameters used, memory
|
|
131
|
+
usage (when available), and elapsed timings so you can track each job at a glance.
|
|
132
|
+
|
|
133
|
+
The most important commands you need are:
|
|
134
|
+
|
|
135
|
+
- `bb fps-from-smiles`: Generate fingerprints from a `*.smi` file.
|
|
136
|
+
- `bb run` or `bb multiround`: Cluster the fingerprints
|
|
137
|
+
- `bb plot-summary` or `bb plot-tsne`: Analyze the clusters
|
|
138
|
+
|
|
139
|
+
A typical workflow is as follows:
|
|
140
|
+
|
|
141
|
+
1. **Generate fingerprints from SMILES**: The repository ships with a ChEMBL
|
|
142
|
+
sample that you can use right away for testing:
|
|
143
|
+
|
|
144
|
+
```bash
|
|
145
|
+
bb fps-from-smiles examples/chembl-33-natural-products-sample.smi
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
This writes a packed fingerprint array to the current working directory (use
|
|
149
|
+
`--out-dir <dir>` for a different location). The naming convention is
|
|
150
|
+
`packed-fps-uint8-508e53ef.npy`, where `508e53ef` is a unique identifier (use `--name
|
|
151
|
+
<name>` if you prefer a different name). The packed `uint8` format is required for
|
|
152
|
+
maximum memory efficiency, so keep the default
|
|
153
|
+
`--pack` and `--dtype` values unless you have a very good reason to change them.
|
|
154
|
+
You can optionally split the output over multiple files for parallel processing with `--num-parts <num>`.
|
|
155
|
+
|
|
156
|
+
2. **Cluster the fingerprints**: To cluster in serial mode, point `bb run` at the
|
|
157
|
+
generated array (or a directory with multiple `*.npy` files):
|
|
158
|
+
|
|
159
|
+
```bash
|
|
160
|
+
bb run ./packed-fps-uint8-508e53ef.npy
|
|
161
|
+
```
|
|
162
|
+
|
|
163
|
+
The outputs are stored in a directory such as `bb_run_outputs/504e40ef/`, where
|
|
164
|
+
`504e40ef` is a unique identifier (use `--out-dir <dir>` for a different location).
|
|
165
|
+
Additional flags can be set to control the BitBIRCH `--branching`, `--threshold`,
|
|
166
|
+
and merge criterion. Optionally, cluster refinement can be performed with `--refine-num 1`.
|
|
167
|
+
See `bb run --help` for details.
|
|
168
|
+
|
|
169
|
+
To cluster in parallel mode, use `bb multiround ./file-or-dir` instead. If pointed to
|
|
170
|
+
a directory with multiple `*.npy` files, files will be clustered in parallel and
|
|
171
|
+
sub-trees will be merged iteratively in intermediate rounds. For more information:
|
|
172
|
+
`bb multiround --help`. Outputs are written by default to
|
|
173
|
+
`bb_multiround_outputs/<unique-id>/`.
|
|
174
|
+
|
|
175
|
+
3. **Visualize the results**: You can plot a summary of the largest clusters with
|
|
176
|
+
`bb plot-summary <output-path> --top 20` (largest 20 clusters). Passing the optional `--smiles <path-to-file.smi>` argument
|
|
177
|
+
additionally generates Murcko scaffold analysis. For a t-SNE
|
|
178
|
+
visualization try `bb plot-tsne <output-path> --top 20`.
|
|
179
|
+
t-SNE plots use [openTSNE](https://opentsne.readthedocs.io/en/latest/) as a backend,
|
|
180
|
+
which is a parallel, extremely fast implementation. We recommend you consult the corresponding
|
|
181
|
+
documentation for info on the available parameters.
|
|
182
|
+
Still, expect t-SNE plots to be slow for very large datasets (more than 1M molecules).
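As a rough picture of what a "largest 20 clusters" selection means, the sketch below ranks clusters by size, assuming cluster memberships are available as a list of index lists (the shape of the `clusters.pkl` file described in the next section). `largest_clusters` is a hypothetical helper and is not how the `bb` CLI is implemented.

```python
def largest_clusters(clusters, top=20):
    """Return (cluster_position, size) pairs for the `top` largest clusters."""
    sizes = [(pos, len(members)) for pos, members in enumerate(clusters)]
    # Sort by size, largest first; ties keep their original order.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]


# Toy data standing in for real cluster memberships.
toy_clusters = [[0, 4], [1, 2, 3, 7, 8], [5], [6, 9, 10]]
print(largest_clusters(toy_clusters, top=2))  # [(1, 5), (3, 3)]
```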
|
|
183
|
+
|
|
184
|
+
### Manually exploring clustering results
|
|
185
|
+
|
|
186
|
+
Every run directory contains a raw `clusters.pkl` file with the molecule indices for each
|
|
187
|
+
cluster, plus metadata in `*.json` files that captures the exact settings and
|
|
188
|
+
performance characteristics. A quick Python session is all you need to get started:
|
|
189
|
+
|
|
190
|
+
```python
|
|
191
|
+
import pickle
|
|
192
|
+
|
|
193
|
+
with open("bb_run_outputs/504e40ef/clusters.pkl", "rb") as f:
    clusters = pickle.load(f)
|
|
194
|
+
clusters[:2]
|
|
195
|
+
# [[321, 323, 326, 328, 337, ..., 9988, 9989],
|
|
196
|
+
# [5914, 5915, 5916, 5917, 5918, ..., 9990, 9991, 9992, 9993]]
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
The indices refer to the position of each molecule in the order they were read from the
|
|
200
|
+
fingerprint files, making it easy to link back to your original SMILES records.
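A minimal sketch of that lookup, assuming the SMILES list is in the same order as the fingerprints and that no record was dropped during fingerprint generation. `smiles_for_cluster` is a hypothetical helper, and the toy data stands in for a real `*.smi` file and `clusters.pkl`.

```python
def smiles_for_cluster(cluster, smiles):
    """Map the molecule indices of one cluster back to their SMILES strings."""
    return [smiles[idx] for idx in cluster]


# Toy stand-ins: five SMILES records and three clusters of indices into them.
smiles = ["CCO", "CCN", "c1ccccc1", "CC(=O)O", "CCCl"]
clusters = [[0, 1, 4], [2], [3]]

print(smiles_for_cluster(clusters[0], smiles))  # ['CCO', 'CCN', 'CCCl']
```

If fingerprints were split over several files, the assumption above only holds when the files are read back in the same order as the original SMILES records.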
|
|
201
|
+
|
|
202
|
+
## Python Quickstart and Examples
|
|
203
|
+
|
|
204
|
+
For an example of how to use the main `bblean` classes and functions consult
|
|
205
|
+
`examples/bitbirch_quickstart.ipynb`. The `examples/dataset_splitting.ipynb` notebook
|
|
206
|
+
contains an adapted notebook by Pat Walters ([Some Thoughts on Splitting Chemical
|
|
207
|
+
Datasets](https://practicalcheminformatics.blogspot.com/2024/11/some-thoughts-on-splitting-chemical.html)).
|
|
208
|
+
More examples will be added soon!
|
|
209
|
+
|
|
210
|
+
A quick summary:
|
|
211
|
+
|
|
212
|
+
```python
|
|
213
|
+
import pickle
|
|
214
|
+
|
|
215
|
+
import matplotlib.pyplot as plt
|
|
216
|
+
import numpy as np
|
|
217
|
+
|
|
218
|
+
import bblean
|
|
219
|
+
import bblean.plotting as plotting
|
|
220
|
+
import bblean.analysis as analysis
|
|
221
|
+
|
|
222
|
+
# Create the fingerprints and pack them into a numpy array, starting from a *.smi file
|
|
223
|
+
smiles = bblean.load_smiles("./examples/chembl-33-natural-products-sample.smi")
|
|
224
|
+
fps = bblean.fps_from_smiles(smiles, pack=True, n_features=2048, kind="rdkit")
|
|
225
|
+
|
|
226
|
+
# Fit the fingerprints (by default all bblean functions take *packed* fingerprints)
|
|
227
|
+
# A threshold of 0.5-0.65 is good for rdkit fingerprints, a threshold of 0.3-0.4
|
|
228
|
+
# is better for ECFPs
|
|
229
|
+
tree = bblean.BitBirch(branching_factor=50, threshold=0.65, merge_criterion="diameter")
|
|
230
|
+
tree.fit(fps)
|
|
231
|
+
|
|
232
|
+
# Refine the tree (if needed)
|
|
233
|
+
tree.set_merge(merge_criterion="tolerance-diameter", tolerance=0.0)
|
|
234
|
+
tree.refine_inplace(fps)
|
|
235
|
+
|
|
236
|
+
# Visualize the results
|
|
237
|
+
clusters = tree.get_cluster_mol_ids()
|
|
238
|
+
ca = analysis.cluster_analysis(clusters, fps, smiles)
|
|
239
|
+
plotting.summary_plot(ca, title="ChEMBL Sample")
|
|
240
|
+
plt.show()
|
|
241
|
+
|
|
242
|
+
# Save the resulting clusters, metrics, and fps
|
|
243
|
+
with open("./clusters.pkl", "wb") as f:
|
|
244
|
+
    pickle.dump(clusters, f)
|
|
245
|
+
ca.dump_metrics("./metrics.csv")
|
|
246
|
+
np.save("./fps-packed-2048.npy", fps)
|
|
247
|
+
```
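The files written at the end of the summary can be read back with the same libraries. The sketch below uses small stand-in data and a temporary directory so it runs without `bblean`; the file names mirror the ones above, and the 256-byte row width is simply 2048 bits divided by 8.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np

# Stand-ins for the real outputs: two clusters and three packed fingerprints.
clusters = [[0, 2], [1]]
fps = np.zeros((3, 256), dtype=np.uint8)  # 2048 bits / 8 = 256 bytes per row

with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    # Write the artifacts the same way the summary above does.
    with open(out / "clusters.pkl", "wb") as f:
        pickle.dump(clusters, f)
    np.save(out / "fps-packed-2048.npy", fps)

    # Read them back.
    with open(out / "clusters.pkl", "rb") as f:
        loaded_clusters = pickle.load(f)
    loaded_fps = np.load(out / "fps-packed-2048.npy")

print(loaded_clusters, loaded_fps.shape, loaded_fps.dtype)
```

As with any pickle file, only load `clusters.pkl` files that come from a source you trust.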
|
|
248
|
+
|
|
249
|
+
## Public Python API and Documentation
|
|
250
|
+
|
|
251
|
+
By default all functions take *packed* fingerprints of dtype `uint8`. Many functions
|
|
252
|
+
support an `input_is_packed: bool` flag, which you can set to `False` if for
|
|
253
|
+
some reason you want to pass unpacked fingerprints (not recommended).
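To illustrate what "packed" means, the sketch below packs a 0/1 array into `uint8` bytes with NumPy, so each byte holds 8 fingerprint bits. It assumes the packed layout is equivalent to `np.packbits` along the last axis; the source does not confirm this, so treat it as an illustration of the idea and not a description of `bblean` internals.

```python
import numpy as np

# One unpacked fingerprint with 16 features, one 0/1 value per element.
unpacked = np.array(
    [[1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1]], dtype=np.uint8
)

# Pack 8 bits into each byte (most significant bit first, NumPy's default).
packed = np.packbits(unpacked, axis=-1)
print(packed)  # [[129   3]]

# Unpacking restores the original bits.
restored = np.unpackbits(packed, axis=-1)
print(np.array_equal(restored, unpacked))  # True

# Packed storage uses 8x fewer bytes than one uint8 per bit.
print(unpacked.nbytes, packed.nbytes)  # 16 2
```

At 2048 features this comes to 256 bytes per molecule instead of 2048.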
|
|
254
|
+
|
|
255
|
+
- Functions and classes that *start with an underscore* are considered private (such as
|
|
256
|
+
`_private_function(...)`) and should not be used, since they can be removed or
|
|
257
|
+
modified without warning.
|
|
258
|
+
- All functions and classes that are in *modules that start with an underscore* are also
|
|
259
|
+
considered private (such as `bblean._private_module.private_function(...)`) and should
|
|
260
|
+
not be used, since they can be removed or modified without warning.
|
|
261
|
+
- All other functions and classes are part of the stable public API and can be used.
|
|
262
|
+
However, expect minor breaking changes before we hit version 1.0.
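Going by the examples in the list above, where private names carry a leading underscore, the convention can be checked mechanically. `is_private` is a hypothetical helper, `bblean._merges` is taken from the wheel's file listing, and `some_function` and `_helper` are made-up names.

```python
def is_private(dotted_name: str) -> bool:
    """Return True if any component of a dotted name has a leading underscore.

    Dunder names such as `__version__` are not treated as private here.
    """
    for part in dotted_name.split("."):
        is_dunder = part.startswith("__") and part.endswith("__")
        if part.startswith("_") and not is_dunder:
            return True
    return False


print(is_private("bblean.BitBirch"))                # False
print(is_private("bblean._merges.some_function"))   # True (private module)
print(is_private("bblean.utils._helper"))           # True (private function)
```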
|
|
263
|
+
|
|
264
|
+
## Contributing
|
|
265
|
+
|
|
266
|
+
If you find a bug in BitBIRCH-Lean or have an issue with the usage
|
|
267
|
+
or documentation please open an issue in the GitHub issue tracker.
|
|
268
|
+
|
|
269
|
+
If you want to contribute to BitBIRCH-Lean with a bug fix, improving the documentation,
|
|
270
|
+
usability, maintainability, or performance, please open an issue with your
|
|
271
|
+
idea/request (or directly open a PR from a fork if you prefer).
|
|
272
|
+
|
|
273
|
+
Currently we don't directly accept PRs with new features that have not been extensively
|
|
274
|
+
validated, but if you have an idea to improve the BitBIRCH algorithm you may want to
|
|
275
|
+
contact the Miranda-Quintana Lab; we are open to collaborations.
|
|
276
|
+
|
|
277
|
+
To contribute, first create a
|
|
278
|
+
[fork](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/fork-a-repo),
|
|
279
|
+
then clone your fork (`git clone git@github.com:<user>/bblean`). We recommend you install
|
|
280
|
+
`pre-commit` (`pre-commit install --hook-type pre-push`), which will run some checks
|
|
281
|
+
before you push to your branch. After you have finished work on your branch, [open a
|
|
282
|
+
pull
|
|
283
|
+
request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request).
|
|
@@ -0,0 +1,31 @@
|
|
|
1
|
+
bblean-0.6.0b1.dist-info/RECORD,,
|
|
2
|
+
bblean-0.6.0b1.dist-info/WHEEL,sha256=3DzZhK-rNIkkEIppB3qbu6syT4kayeV5HkusnWIqMIg,142
|
|
3
|
+
bblean-0.6.0b1.dist-info/entry_points.txt,sha256=a0jb2L5JFKioMD6CqbvJiI2unaArGzi-AMZsyY-uyGg,38
|
|
4
|
+
bblean-0.6.0b1.dist-info/top_level.txt,sha256=ybxTonvTC9zR25yR5B27aEDLl6CiwID093ZyS_--Cq4,7
|
|
5
|
+
bblean-0.6.0b1.dist-info/METADATA,sha256=KZDkJKONgnTbqwaXEK6a023k1hqcdjQBSXGemqAzx4Y,12558
|
|
6
|
+
bblean-0.6.0b1.dist-info/licenses/LICENSE,sha256=-LLlY16Y8cwDvYq3OsbnPMiG1LZnISvS1vWpKf0tFMk,2550
|
|
7
|
+
bblean/_console.py,sha256=vT217TqL_SPBGk23GnW8F8sKKDY5Uq-8LdsxutqxClE,7834
|
|
8
|
+
bblean/metrics.py,sha256=j5_ho52zBddTINpkKqBmJ7oJ_XzeH4UyTI73scNLGlI,7081
|
|
9
|
+
bblean/_timer.py,sha256=ozAtbXcCw5btCxTYgsBWSiVcZdJVzOpAD03wd4VQfwg,1360
|
|
10
|
+
bblean/analysis.py,sha256=JHoUZZfR8NpvjU-UXr8MlXIczQrIMhVHUqIdLbGt2N0,7860
|
|
11
|
+
bblean/_merges.py,sha256=tQLnviVkjnHa5sgN-steyi6Tb7ZbNpfWkDQpiPVEriA,6220
|
|
12
|
+
bblean/similarity.py,sha256=ZieFoeV5sFkNGOBn1VfO3sg1B3QrxuW7Q3W-mUSANs0,9899
|
|
13
|
+
bblean/_version.py,sha256=qOyKFSwx2M995XMIWlR9oHifiOE8nDtfvsOvfcxYC1E,712
|
|
14
|
+
bblean/bitbirch.py,sha256=6C14DKJhepqsKV_Lii9yfg5JivJ55TbEXZqYNN5eDUc,57355
|
|
15
|
+
bblean/_cpp_similarity.cpython-312-darwin.so,sha256=q5DU8xMweZMA585-IP1Kn4liysjBGDA4DoeI9YypW1s,534968
|
|
16
|
+
bblean/plotting.py,sha256=dAOGqo4r_FtQJEKibU_tadM7bs9-uVJg0TMYAiESVjE,15197
|
|
17
|
+
bblean/_py_similarity.py,sha256=TWc3MVM1Qyf2uNcXr0aRZez4D42FJeAOL4_OhUGuXTg,9581
|
|
18
|
+
bblean/__init__.py,sha256=PI-W0P_HskNz_15jySNAOsHCAwWJgO9daUr2-furDzc,664
|
|
19
|
+
bblean/multiround.py,sha256=_-pr5LG_GLSBNZ60uLcy8XZ-qo7lr0Y048Kp041_ug8,19980
|
|
20
|
+
bblean/cli.py,sha256=2LDzZBXPC1P51LFYZOVOKzZ2OA13t3CdFdRSSfZpOB0,60788
|
|
21
|
+
bblean/utils.py,sha256=34Az3-hQoY5IVd1-28xkrQOyjETpyvi3LaoG8-W6EHU,3706
|
|
22
|
+
bblean/smiles.py,sha256=O-uZ0xyWThGT06rYzxYTMlH4Wkld8fnuXMxk1QByg7o,1794
|
|
23
|
+
bblean/_memory.py,sha256=433F0G86SVMK62OrC0YEkUwM0asnS_tiaLdp_HJwFK8,6694
|
|
24
|
+
bblean/_config.py,sha256=Fe5drpl4zvaDpp0V9n9bguh8bh-akQXXM7NkNeOOk1Y,1752
|
|
25
|
+
bblean/fingerprints.py,sha256=QoHda68n5qb9Q6CCDkyhJnOLZImi8c5P8trkOhQdjCA,14891
|
|
26
|
+
bblean/sklearn.py,sha256=baHX1hBS7e-Dnw1lSTsnqNkiYd1RkCv3KHZN41OAeGo,7309
|
|
27
|
+
bblean/csrc/README.md,sha256=Gat24x1okyEdRI-FvQFK-WV_dm7IMcBZNvVA-05ZZeQ,58
|
|
28
|
+
bblean/csrc/similarity.cpp,sha256=dDvnpRtKZTnQS6ttzW2bnbeslebohpViF0cLFhT2mrQ,20717
|
|
29
|
+
bblean/_legacy/bb_int64.py,sha256=r6tIS-9jZGRLpvNA3AFfUTKkqTs_NUkf81wFnJy4bWs,44518
|
|
30
|
+
bblean/_legacy/bb_uint8.py,sha256=a0RTc6OFzQ-4E43sKjJfs79D_zwrgSp1iEFFcYA0Nsw,40429
|
|
31
|
+
bblean/_legacy/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
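Each RECORD entry above has the form `path,sha256=<digest>,size`, where the digest is the file's SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding removed (the convention used by wheel RECORD files). The sketch below recomputes such a digest; `record_digest` and `parse_record_line` are hypothetical helpers. The empty `bblean/_legacy/__init__.py` entry is a convenient check because its content is known to be zero bytes.

```python
import base64
import csv
import hashlib


def record_digest(data: bytes) -> str:
    """Return the RECORD-style digest: urlsafe base64 of SHA-256, no padding."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def parse_record_line(line: str):
    """Split a RECORD line into (path, algorithm, digest, size)."""
    path, hash_field, size = next(csv.reader([line]))
    # The RECORD file's own entry has empty hash and size fields.
    algorithm, _, digest = hash_field.partition("=")
    return path, algorithm, digest, int(size) if size else None


line = "bblean/_legacy/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0"
path, algorithm, digest, size = parse_record_line(line)

# The file is empty (size 0), so hashing zero bytes should reproduce the digest.
print(path, algorithm, size, record_digest(b"") == digest)
```

To verify any other entry, pass the file's bytes (for example `Path(path).read_bytes()` from an unpacked wheel) to `record_digest` and compare against the digest and size in the RECORD line.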
|
|
@@ -0,0 +1,48 @@
|
|
|
1
|
+
BitBIRCH-Lean Python Package: An open-source clustering module based on iSIM.
|
|
2
|
+
|
|
3
|
+
If you find this software useful please cite the following articles:
|
|
4
|
+
- BitBIRCH: efficient clustering of large molecular libraries:
|
|
5
|
+
https://doi.org/10.1039/D5DD00030K
|
|
6
|
+
- BitBIRCH Clustering Refinement Strategies:
|
|
7
|
+
https://doi.org/10.1021/acs.jcim.5c00627
|
|
8
|
+
- BitBIRCH-Lean:
|
|
9
|
+
(preprint) https://www.biorxiv.org/content/10.1101/2025.10.22.684015v1
|
|
10
|
+
|
|
11
|
+
Copyright (C) 2025 The Miranda-Quintana Lab and other BitBirch developers, comprised
|
|
12
|
+
exclusively by:
|
|
13
|
+
- Ramon Alain Miranda Quintana <ramirandaq@gmail.com>, <quintana@chem.ufl.edu>
|
|
14
|
+
- Krisztina Zsigmond <kzsigmond@ufl.edu>
|
|
15
|
+
- Ignacio Pickering <ipickering@chem.ufl.edu>
|
|
16
|
+
- Kenneth Lopez Perez <klopezperez@chem.ufl.edu>
|
|
17
|
+
- Miroslav Lzicar <miroslav.lzicar@deepmedchem.com>
|
|
18
|
+
|
|
19
|
+
Authors of ./bblean/multiround.py are:
|
|
20
|
+
- Ramon Alain Miranda Quintana <ramirandaq@gmail.com>, <quintana@chem.ufl.edu>
|
|
21
|
+
- Ignacio Pickering <ipickering@chem.ufl.edu>
|
|
22
|
+
|
|
23
|
+
This program is free software: you can redistribute it and/or modify it under the
|
|
24
|
+
terms of the GNU General Public License as published by the Free Software Foundation,
|
|
25
|
+
version 3 (SPDX-License-Identifier: GPL-3.0-only).
|
|
26
|
+
|
|
27
|
+
Portions of the file ./bblean/bitbirch.py are licensed under the BSD 3-Clause License
|
|
28
|
+
Copyright (c) 2007-2024 The scikit-learn developers. All rights reserved.
|
|
29
|
+
(SPDX-License-Identifier: BSD-3-Clause). Copies or reproductions of code in the
|
|
30
|
+
./bblean/bitbirch.py file must in addition adhere to the BSD-3-Clause license terms. A
|
|
31
|
+
copy of the BSD-3-Clause license can be located at the root of this repository, under
|
|
32
|
+
./LICENSES/BSD-3-Clause.txt.
|
|
33
|
+
|
|
34
|
+
Portions of the file ./bblean/bitbirch.py were previously licensed under the LGPL 3.0
|
|
35
|
+
license (SPDX-License-Identifier: LGPL-3.0-only), they are relicensed in this program
|
|
36
|
+
as GPL-3.0, with permission of all original copyright holders:
|
|
37
|
+
- Ramon Alain Miranda Quintana <ramirandaq@gmail.com>, <quintana@chem.ufl.edu>
|
|
38
|
+
- Vicky (Vic) Jung <jungvicky@ufl.edu>
|
|
39
|
+
- Kenneth Lopez Perez <klopezperez@chem.ufl.edu>
|
|
40
|
+
- Kate Huddleston <kdavis2@chem.ufl.edu>
|
|
41
|
+
|
|
42
|
+
This program is distributed in the hope that it will be useful, but WITHOUT ANY
|
|
43
|
+
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
|
|
44
|
+
PARTICULAR PURPOSE. See the GNU General Public License for more details.
|
|
45
|
+
|
|
46
|
+
You should have received a copy of the GNU General Public License along with this
|
|
47
|
+
program. This copy can be located at the root of this repository, under
|
|
48
|
+
./LICENSES/GPL-3.0-only.txt. If not, see <http://www.gnu.org/licenses/gpl-3.0.html>.
|
|
@@ -0,0 +1 @@
|
|
|
1
|
+
bblean
|