Microns-DataCleaner 0.1.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Milano Microns Colab
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,140 @@
+ Metadata-Version: 2.3
+ Name: Microns-DataCleaner
+ Version: 0.1.5
+ Summary: Provides common functions to download and process data from the MICrONS mm3 dataset.
+ Keywords: neuroscience
+ Author: Victor Buendia
+ Author-email: victor.buendia@unibocconi.it
+ Maintainer: All contributors
+ Requires-Python: >=3.9
+ Classifier: Development Status :: 2 - Pre-Alpha
+ Classifier: Intended Audience :: Science/Research
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Requires-Dist: caveclient
+ Requires-Dist: numpy
+ Requires-Dist: pandas
+ Requires-Dist: scipy
+ Requires-Dist: standard_transform
+ Requires-Dist: tqdm
+ Description-Content-Type: text/markdown
+
+ # MICrONS-datacleaner
+
+ [![License](https://badgen.net/github/license/MICrONS-Milano-CoLab/MICrONS-datacleaner)](https://opensource.org/licenses/MIT)
+
+ This project contains tools to work with the [MICrONS Cortical MM3 dataset](https://www.microns-explorer.org/cortical-mm3), providing a **robust interface** to interact with the nucleus data.
+
+ ## Key features
+
+ - **Simple interface** to download and organize anatomical data via CAVEclient.
+ - **Query the synapse table in chunks**, avoiding common pitfalls.
+ - **Easily process nucleus annotation tables**.
+ - **Automatically segment** the brain volume into cortical **layers**.
+ - **Tools for filtering** and constructing connectome subsets.
+ - Basic interface to add functional properties, including **tuning curves and selectivity**.
+
+ ## Install 📥
+
+ ```bash
+ pip install microns-datacleaner
+ ```
+
+ ## Using the package ⏩
+
+ - **A few lines of code** get you a full table with each neuron's brain area, layer, cell_type, proofreading information, and nucleus position:
+
+ ```python
+ # Import the library
+ import microns_datacleaner as mic
+
+ # Set the target version and download folder
+ cleaner = mic.MicronsDataCleaner(datadir="data", version=1300)
+
+ # Download the data
+ cleaner.download_nucleus_data()
+
+ # Process the downloaded data and segment into layers
+ units, segments = cleaner.process_nucleus_data()
+ ```
+
+ - **Filter easily!** How can we get all neurons in V1, layers L2/3 and L4, with proofread axons?
+
+ ```python
+ import microns_datacleaner.filters as fl
+
+ units_filter = fl.filter_neurons(units, layer=['L2/3', 'L4'], proofread='ax_clean', brain_area='V1')
+ ```
+
+ - **Robustly download synapses** between a subset of pre- and postsynaptic neurons in chunks.
+
+ ```python
+ preids = units_filter['pt_root_id']
+ postids = units_filter['pt_root_id']
+ cleaner.download_synapse_data(preids, postids)
+
+ # Connection problems at chunk number 23? Just restart from there
+ cleaner.download_synapse_data(preids, postids, start_index=23)
+ ```
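Once all chunks are on disk, they can be combined into a single connections table. Below is a minimal sketch using plain pandas, assuming the `connections_table_<n>.csv` naming shown in the source; the toy frames exist only to make the sketch self-contained and are not real data:

```python
import glob
import os
import tempfile

import pandas as pd

# Build two toy chunk files so the sketch is self-contained;
# real chunks are written by the synapse downloader.
tmpdir = tempfile.mkdtemp()
pd.DataFrame(
    {"pre_pt_root_id": [1, 2], "post_pt_root_id": [3, 4], "size": [10, 20]}
).to_csv(os.path.join(tmpdir, "connections_table_0.csv"), index=False)
pd.DataFrame(
    {"pre_pt_root_id": [5], "post_pt_root_id": [6], "size": [30]}
).to_csv(os.path.join(tmpdir, "connections_table_1.csv"), index=False)

# Concatenate every chunk into one connections table
parts = sorted(glob.glob(os.path.join(tmpdir, "connections_table_*.csv")))
connections = pd.concat((pd.read_csv(p) for p in parts), ignore_index=True)
```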
+
+ Check the docs and our tutorial notebook below to get started!
+
+
+ ## Docs & Tutorials 📜
+
+ If this is your first time working with the MICrONS data, we recommend reading our basic tutorial (also available as a Python notebook), as well as the official documentation of the MICrONS project.
+
+ If you want to contribute, please read our guidelines first. Feel free to open an issue if you find any problem.
+
+ You can find full documentation of the API and functions in the [docs](https://margheritapremi.github.io/MICrONS-datacleaner).
+
+
+ ## Requirements
+
+ ### Dependencies
+
+ - caveclient
+ - pandas
+ - numpy
+ - scipy
+ - tqdm
+ - standard_transform, for coordinate changes (MICrONS ecosystem)
+
+ ### Dev dependencies
+
+ - pdoc (to generate the docs)
+ - ruff (to keep contributions in a consistent format)
+
+
+ ## Citation Policy 📚
+
+ If you use our code, **please consider citing the associated repository,** as well as the [IARPA MICrONS Minnie Project](https://doi.org/10.60533/BOSS-2021-T0SY) and the [Microns Phase 3 NDA](https://github.com/cajal/microns_phase3_nda) repository.
+
+ Our code serves as an interface for the MICrONS data. Please cite the appropriate literature for the data you use, following their [Citation Policy](https://www.microns-explorer.org/citation-policy). Which papers apply may depend on the annotation tables used.
+
+ Our unit table is constructed by integrating information from the following papers:
+
+ 1. [Functional connectomics spanning multiple areas of mouse visual cortex](https://doi.org/10.1038/s41586-025-08790-w). The MICrONS Consortium. 2025
+ 2. [Foundation model of neural activity predicts response to new stimulus types](https://doi.org/10.1038/s41586-025-08829-y)
+ 3. [Perisomatic ultrastructure efficiently classifies cells in mouse cortex](http://doi.org/10.1038/s41586-024-07765-7)
+ 4. [NEURD offers automated proofreading and feature extraction for connectomics](https://doi.org/)
+ 5. [CAVE: Connectome Annotation Versioning Engine](https://doi.org/10.1038/s41592-024-02426-z)
+
+ ## Acknowledgements
+
+ We acknowledge funding by NextGenerationEU, in the framework of the FAIR—Future Artificial Intelligence Research project (FAIR PE00000013—CUP B43C22000800006).
+
+
+ ## Generating the docs
+
+ Go to the main folder of the repository and run
+
+ ```
+ pdoc -t docs/template source/mic_datacleaner.py -o docs/html
+ ```
+
+ The docs will be generated in HTML format in the `docs/html` folder, where they can be viewed in a browser. If you need the docs for all the files, and not only the main class, use `source/*.py` instead of `source/mic_datacleaner.py` above.
+
+
@@ -0,0 +1,118 @@
+ # MICrONS-datacleaner
+
+ [![License](https://badgen.net/github/license/MICrONS-Milano-CoLab/MICrONS-datacleaner)](https://opensource.org/licenses/MIT)
+
+ This project contains tools to work with the [MICrONS Cortical MM3 dataset](https://www.microns-explorer.org/cortical-mm3), providing a **robust interface** to interact with the nucleus data.
+
+ ## Key features
+
+ - **Simple interface** to download and organize anatomical data via CAVEclient.
+ - **Query the synapse table in chunks**, avoiding common pitfalls.
+ - **Easily process nucleus annotation tables**.
+ - **Automatically segment** the brain volume into cortical **layers**.
+ - **Tools for filtering** and constructing connectome subsets.
+ - Basic interface to add functional properties, including **tuning curves and selectivity**.
+
+ ## Install 📥
+
+ ```bash
+ pip install microns-datacleaner
+ ```
+
+ ## Using the package ⏩
+
+ - **A few lines of code** get you a full table with each neuron's brain area, layer, cell_type, proofreading information, and nucleus position:
+
+ ```python
+ # Import the library
+ import microns_datacleaner as mic
+
+ # Set the target version and download folder
+ cleaner = mic.MicronsDataCleaner(datadir="data", version=1300)
+
+ # Download the data
+ cleaner.download_nucleus_data()
+
+ # Process the downloaded data and segment into layers
+ units, segments = cleaner.process_nucleus_data()
+ ```
+
+ - **Filter easily!** How can we get all neurons in V1, layers L2/3 and L4, with proofread axons?
+
+ ```python
+ import microns_datacleaner.filters as fl
+
+ units_filter = fl.filter_neurons(units, layer=['L2/3', 'L4'], proofread='ax_clean', brain_area='V1')
+ ```
+
+ - **Robustly download synapses** between a subset of pre- and postsynaptic neurons in chunks.
+
+ ```python
+ preids = units_filter['pt_root_id']
+ postids = units_filter['pt_root_id']
+ cleaner.download_synapse_data(preids, postids)
+
+ # Connection problems at chunk number 23? Just restart from there
+ cleaner.download_synapse_data(preids, postids, start_index=23)
+ ```
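Once all chunks are on disk, they can be combined into a single connections table. Below is a minimal sketch using plain pandas, assuming the `connections_table_<n>.csv` naming shown in the source; the toy frames exist only to make the sketch self-contained and are not real data:

```python
import glob
import os
import tempfile

import pandas as pd

# Build two toy chunk files so the sketch is self-contained;
# real chunks are written by the synapse downloader.
tmpdir = tempfile.mkdtemp()
pd.DataFrame(
    {"pre_pt_root_id": [1, 2], "post_pt_root_id": [3, 4], "size": [10, 20]}
).to_csv(os.path.join(tmpdir, "connections_table_0.csv"), index=False)
pd.DataFrame(
    {"pre_pt_root_id": [5], "post_pt_root_id": [6], "size": [30]}
).to_csv(os.path.join(tmpdir, "connections_table_1.csv"), index=False)

# Concatenate every chunk into one connections table
parts = sorted(glob.glob(os.path.join(tmpdir, "connections_table_*.csv")))
connections = pd.concat((pd.read_csv(p) for p in parts), ignore_index=True)
```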
+
+ Check the docs and our tutorial notebook below to get started!
+
+
+ ## Docs & Tutorials 📜
+
+ If this is your first time working with the MICrONS data, we recommend reading our basic tutorial (also available as a Python notebook), as well as the official documentation of the MICrONS project.
+
+ If you want to contribute, please read our guidelines first. Feel free to open an issue if you find any problem.
+
+ You can find full documentation of the API and functions in the [docs](https://margheritapremi.github.io/MICrONS-datacleaner).
+
+
+ ## Requirements
+
+ ### Dependencies
+
+ - caveclient
+ - pandas
+ - numpy
+ - scipy
+ - tqdm
+ - standard_transform, for coordinate changes (MICrONS ecosystem)
+
+ ### Dev dependencies
+
+ - pdoc (to generate the docs)
+ - ruff (to keep contributions in a consistent format)
+
+
+ ## Citation Policy 📚
+
+ If you use our code, **please consider citing the associated repository,** as well as the [IARPA MICrONS Minnie Project](https://doi.org/10.60533/BOSS-2021-T0SY) and the [Microns Phase 3 NDA](https://github.com/cajal/microns_phase3_nda) repository.
+
+ Our code serves as an interface for the MICrONS data. Please cite the appropriate literature for the data you use, following their [Citation Policy](https://www.microns-explorer.org/citation-policy). Which papers apply may depend on the annotation tables used.
+
+ Our unit table is constructed by integrating information from the following papers:
+
+ 1. [Functional connectomics spanning multiple areas of mouse visual cortex](https://doi.org/10.1038/s41586-025-08790-w). The MICrONS Consortium. 2025
+ 2. [Foundation model of neural activity predicts response to new stimulus types](https://doi.org/10.1038/s41586-025-08829-y)
+ 3. [Perisomatic ultrastructure efficiently classifies cells in mouse cortex](http://doi.org/10.1038/s41586-024-07765-7)
+ 4. [NEURD offers automated proofreading and feature extraction for connectomics](https://doi.org/)
+ 5. [CAVE: Connectome Annotation Versioning Engine](https://doi.org/10.1038/s41592-024-02426-z)
+
+ ## Acknowledgements
+
+ We acknowledge funding by NextGenerationEU, in the framework of the FAIR—Future Artificial Intelligence Research project (FAIR PE00000013—CUP B43C22000800006).
+
+
+ ## Generating the docs
+
+ Go to the main folder of the repository and run
+
+ ```
+ pdoc -t docs/template source/mic_datacleaner.py -o docs/html
+ ```
+
+ The docs will be generated in HTML format in the `docs/html` folder, where they can be viewed in a browser. If you need the docs for all the files, and not only the main class, use `source/*.py` instead of `source/mic_datacleaner.py` above.
+
+
@@ -0,0 +1,30 @@
+ [build-system]
+ requires = ["poetry-core>=1.0.0"]
+ build-backend = "poetry.core.masonry.api"
+
+
+ [project]
+ name = "Microns-DataCleaner"
+ version = "0.1.5"
+ description = "Provides common functions to download and process data from the MICrONS mm3 dataset."
+ keywords = ["neuroscience"]
+ classifiers = ["Development Status :: 2 - Pre-Alpha",
+     "Intended Audience :: Science/Research",
+     "Operating System :: OS Independent",
+     "Programming Language :: Python :: 3",
+ ]
+ requires-python = ">=3.9"
+ packages = [{include = ""}]
+ authors = [{name = "Victor Buendia", email = "victor.buendia@unibocconi.it"}, {name = "Margherita Premi", email = "margherita.premi@gmail.com"}]
+ maintainers = [{name = "All contributors"}]
+ readme = "README.md"
+
+ dependencies = [
+     "numpy",
+     "standard_transform",
+     "pandas",
+     "caveclient",
+     "tqdm",
+     "scipy"
+ ]
@@ -0,0 +1,9 @@
+ """
+ .. include:: ../../docs-src/docs_main.md
+ """
+
+ __version__ = "0.1.5"
+
+ from .mic_datacleaner import MicronsDataCleaner
+
+ __all__ = ["MicronsDataCleaner", "filters", "remapper"]
@@ -0,0 +1,278 @@
+ import logging
+ import os
+ import time
+
+ import pandas as pd
+ import requests
+ from tqdm import tqdm
+
+
+ def download_tables(client, path2download, tables2download):
+     """
+     Download all the indicated tables for further processing.
+
+     Parameters:
+     -----------
+     client: caveclient.CAVEclient
+         The CAVEclient instance used to connect to and download from the data service.
+     path2download: str
+         The local file path to the directory where the downloaded tables will be saved as CSV files.
+     tables2download: list[str]
+         A list containing the names of the tables to be downloaded.
+
+     Returns:
+     --------
+     None.
+         This function does not return any value. It saves the downloaded tables as files in the
+         specified directory.
+     """
+
+     logging.info(f"Starting download of nucleus data to {path2download}.")
+
+     # Ensure directory exists
+     os.makedirs(path2download, exist_ok=True)
+
+     # Download all the tables in the list
+     for table in tqdm(tables2download, "Downloading nucleus tables..."):
+         try:
+             auxtable = client.materialize.query_table(table, split_positions=True)
+             auxtable = pd.DataFrame(auxtable)
+             auxtable.to_csv(f'{path2download}/{table}.csv', index=False)
+         except Exception as e:
+             raise RuntimeError(f'Error downloading table {table}: {e}')
+
+
+ def connectome_constructor(
+     client, presynaptic_set, postsynaptic_set, savefolder, neurs_per_steps=500, start_index=0, max_retries=10, delay=5, drop_synapses_duplicates=True
+ ):
+     """
+     Constructs a connectome subset for specified pre- and postsynaptic neurons.
+     This function queries the MICrONS connectomics database to extract synaptic
+     connections between a defined set of presynaptic and postsynaptic neurons.
+
+     Parameters:
+     -----------
+     client: caveclient.CAVEclient
+         The CAVEclient instance used to access MICrONS connectomics data.
+     presynaptic_set: numpy.ndarray
+         A 1D NumPy array of unique `root_ids` for the presynaptic neurons.
+     postsynaptic_set: numpy.ndarray
+         A 1D NumPy array of unique `root_ids` for the postsynaptic neurons.
+     savefolder: str
+         The path to the directory where the output files will be saved.
+     neurs_per_steps: int, optional
+         Number of postsynaptic neurons to query per batch, by default 500.
+         This parameter enables querying the database in iterative batches to
+         work around API query size limits. A value of 500 is a reliable
+         choice for a presynaptic set of approximately 8000 neurons.
+     start_index: int, optional
+         The starting batch index for the download, by default 0. If a previous
+         download was interrupted, this can be set to the index of the last
+         successfully downloaded file to resume the process.
+     max_retries: int, optional
+         The maximum number of times to retry a query if the server fails to
+         respond, by default 10.
+     delay: int, optional
+         The number of seconds to wait between retries, by default 5.
+     drop_synapses_duplicates: bool, optional
+         If True (default), all synapses between a given pair of neurons (i, j)
+         are merged into a single entry. The `synapse_size` of this entry will be
+         the sum of all individual synapse sizes. If False, each synapse is
+         saved as a separate entry.
+
+     Returns:
+     --------
+     None.
+         This function does not return any value. The resulting connection tables
+         are saved as individual CSV files in the specified `savefolder`.
+     """
+
+     # Ensure directory exists
+     os.makedirs(savefolder, exist_ok=True)
+
+     # We process the neurons in batches of neurs_per_steps. If neurs_per_steps is not
+     # a divisor of the postsynaptic_set size, the last iteration has fewer neurons.
+     n_before_last = (postsynaptic_set.size // neurs_per_steps) * neurs_per_steps
+     n_chunks = -(-postsynaptic_set.size // neurs_per_steps)  # ceiling division
+
+     # Time before starting the party
+     time_0 = time.time()
+
+     synapse_table = client.info.get_datastack_info()['synapse_table']
+
+     # Preset the dictionary so we do not build a large object every time
+     neurons_to_download = {'pre_pt_root_id': presynaptic_set}
+
+     # If we are not keeping individual synapses, the best thing we can do is to not ask for positions, which are very heavy
+     if drop_synapses_duplicates:
+         cols_2_download = ['pre_pt_root_id', 'post_pt_root_id', 'size']
+         logging.info("Dropping synapse duplicates and excluding position data for lighter queries.")
+     else:
+         cols_2_download = ['pre_pt_root_id', 'post_pt_root_id', 'size', 'ctr_pt_position']
+
+     part = start_index
+
+     # Progress bar over the number of chunks to download
+     with tqdm(total=n_chunks) as progress_bar:
+         # Main loop over chunks
+         for i in range(start_index * neurs_per_steps, postsynaptic_set.size, neurs_per_steps):
+             # Inform about our progress
+             logging.debug(f'Postsynaptic neurons queried so far: {i}...')
+
+             # Try to query the API several times
+             success = False  # Flag to track if the current batch succeeded
+             retry = 0
+             while retry < max_retries and not success:
+                 try:
+                     # Get the postids that we will be grabbing in this query. We will get neurs_per_steps of them
+                     post_ids = postsynaptic_set[i : i + neurs_per_steps] if i < n_before_last else postsynaptic_set[i:]
+                     neurons_to_download['post_pt_root_id'] = post_ids
+                     logging.debug(f"Querying batch starting at index {i} with {len(post_ids)} neurons.")
+                     # Query the table
+                     sub_syn_df = client.materialize.query_table(
+                         synapse_table, filter_in_dict=neurons_to_download, select_columns=cols_2_download, split_positions=True
+                     )
+
+                     # Sum all repeated synapses. The final reset_index is needed because groupby would otherwise create a
+                     # multi-index dataframe and we want to keep pre_root and post_root as columns
+                     if drop_synapses_duplicates:
+                         sub_syn_df = sub_syn_df.groupby(['pre_pt_root_id', 'post_pt_root_id']).sum().reset_index()
+
+                     sub_syn_df.to_csv(f'{savefolder}/connections_table_{part}.csv', index=False)
+                     logging.info(f"Successfully saved connections_table_{part}.csv")
+                     part += 1
+
+                     # Measure the total elapsed time and estimate what remains
+                     elapsed_time = time.time() - time_0
+                     neurons_done = min(i + neurs_per_steps, postsynaptic_set.size)
+                     time_per_neuron = elapsed_time / neurons_done
+                     neurons_2_do = postsynaptic_set.size - neurons_done
+                     remaining_time = time_format(neurons_2_do * time_per_neuron)
+                     logging.debug(f'Estimated remaining time: {remaining_time}')
+                     success = True
+
+                     # Mark that another chunk was downloaded
+                     progress_bar.update(1)
+
+                 except requests.HTTPError as excep:
+                     logging.warning(f'API error on trial {retry + 1}. Retrying in {delay} seconds... Details: {excep}')
+                     time.sleep(delay)
+                     retry += 1
+
+                 except Exception as excep:
+                     logging.error(f"An unexpected error occurred: {excep}")
+                     raise excep
+
+             if not success:
+                 logging.error('Exceeded the max retries when trying to get synaptic connectivity. Aborting.')
+                 raise TimeoutError('Exceeded max_retries when trying to get synaptic connectivity')
+
+
+ def time_format(seconds):
+     """
+     Formats a duration in seconds into a human-readable string.
+
+     Parameters:
+     -----------
+     seconds: float
+         The total duration in seconds to be formatted.
+
+     Returns:
+     --------
+     str
+         A string representing the formatted duration.
+     """
+
+     if seconds > 3600 * 24:
+         days = int(seconds // (24 * 3600))
+         hours = int((seconds - days * 24 * 3600) // 3600)
+         return f'{days} days, {hours}h'
+     elif seconds > 3600:
+         hours = int(seconds // 3600)
+         minutes = int((seconds - hours * 3600) // 60)
+         return f'{hours}h, {minutes}min'
+     elif seconds > 60:
+         minutes = int(seconds // 60)
+         rem_sec = int(seconds - 60 * minutes)
+         return f'{minutes}min {rem_sec}s'
+     else:
+         return f'{seconds:.0f}s'
+
+
+ def merge_connection_tables(savefolder, filename):
+     """
+     Merges individual connection table files into a single master file.
+     This function scans a specified directory for connection table files
+     (identified by the prefix 'connections_table_'), which are typically
+     generated by the `connectome_constructor` function. It then concatenates
+     them into a single pandas DataFrame and saves the result as a new CSV file.
+
+     Parameters:
+     -----------
+     savefolder: str
+         The path to the directory containing the connection table files to be merged.
+     filename: str
+         The base name for the output file. The merged table will be saved as
+         '{filename}.csv' in the `savefolder`.
+
+     Returns:
+     --------
+     None.
+         This function does not return a value. It saves the merged table to a CSV file.
+     """
+
+     # Check if the synapses folder exists
+     logging.info(f"Starting to merge connection tables into {filename}.csv")
+     synapses_path = f'{savefolder}/synapses/'
+     if not os.path.exists(synapses_path):
+         if os.path.exists(savefolder) and any('connections_table_' in f for f in os.listdir(savefolder)):
+             synapses_path = savefolder
+         else:
+             raise FileNotFoundError(f'Could not find synapses directory at {synapses_path}')
+
+     # Collect the tables to merge, by checking all files in the correct folder
+     connection_files = []
+     for file in os.listdir(synapses_path):
+         file_path = os.path.join(synapses_path, file)
+         if os.path.isfile(file_path) and 'connections_table_' in file:
+             connection_files.append(file_path)
+
+     if not connection_files:
+         logging.warning('No connection tables found to merge.')
+         return
+
+     logging.info(f"Found {len(connection_files)} connection tables to merge.")
+
+     # Merge all of them into a single dataframe
+     table = pd.concat([pd.read_csv(file_path) for file_path in connection_files])
+
+     output_path = f'{savefolder}/{filename}.csv'
+     table.to_csv(output_path, index=False)
+     logging.info(f'Merged {len(connection_files)} tables into {output_path}')
+     return
+
+
+ def download_functional_fits(filepath):
+     """
+     Downloads functional fit data from a static Zenodo repository.
+     This function retrieves a CSV file containing functional fitting data from a
+     pre-defined Zenodo URL and saves it to the specified local path.
+
+     Parameters:
+     -----------
+     filepath: str
+         The full path, including the desired filename, where the downloaded file will be stored.
+
+     Returns:
+     --------
+     None.
+         This function does not return a value. It saves the content directly to a file.
+     """
+
+     # TO DO
+     response = requests.get("URL TO OUR FILE IN ZENODO")
+
+     with open(f"{filepath}.csv", mode="wb") as file:
+         file.write(response.content)
@@ -0,0 +1,17 @@
+ """
+ # Filters subpackage
+
+ The filters subpackage helps to query the units and synapse tables. The tables are plain pandas DataFrames, so it is always possible to `.query` them.
+ However, it is often necessary to filter on several aspects at once, which is inconvenient, especially for the synapses.
+ The `filters` package reduces the effort for the most common operations.
+ As stated in the Quick Start, the three most important functions are `filter_neurons`, `filter_connections` and `synapses_by_id`.
+ There are several examples in the `basic_notebook`. The API reference below contains detailed information about the arguments of these functions.
+
+ > Note that the filters act only on the predefined columns of the unit table, not on custom columns added from other tables. In those cases, your best bet is to `.query` directly.
+
+ Please read the API reference below for more information on individual functions.
+ """
+
+ from .filters import *
+
+ __all__ = ["filter_neurons", "filter_connections", "synapses_by_id", "remove_autapses", "connections_to", "connections_from"]
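The `.query` fallback mentioned in the docstring above can be sketched as follows; the `units` frame and its columns here are toy stand-ins for illustration, not the package's real schema:

```python
import pandas as pd

# Toy stand-in for the unit table; columns are illustrative only.
units = pd.DataFrame({
    "pt_root_id": [101, 102, 103],
    "cell_type": ["23P", "4P", "5P-IT"],
    "my_custom_score": [0.9, 0.2, 0.7],  # a user-added custom column
})

# The filters only cover predefined columns, so a custom column
# is filtered with pandas directly:
high_score = units.query("my_custom_score > 0.5")
```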