fruitloops 0.1.1__tar.gz

@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Gustavo Madeira Santana
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
@@ -0,0 +1,407 @@
+ Metadata-Version: 2.4
+ Name: fruitloops
+ Version: 0.1.1
+ Summary: Agent-friendly CLI for querying local connectome analysis tables.
+ Author: Gustavo Madeira Santana
+ License-Expression: MIT
+ Project-URL: Homepage, https://github.com/gumadeiras/fruitloops
+ Project-URL: Repository, https://github.com/gumadeiras/fruitloops
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
+ Requires-Python: >=3.10
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Provides-Extra: bulk
+ Requires-Dist: duckdb>=1.1; extra == "bulk"
+ Requires-Dist: pyarrow>=14; extra == "bulk"
+ Provides-Extra: live
+ Requires-Dist: caveclient; extra == "live"
+ Requires-Dist: neuprint-python; extra == "live"
+ Provides-Extra: plot
+ Requires-Dist: matplotlib>=3.7; extra == "plot"
+ Dynamic: license-file
+
+ # fruitloops
+
+ Agent-friendly CLI for querying connectome analysis tables from hemibrain and
+ FlyWire.
+
+ The repository keeps generated CSV products in a predictable layout:
+
+ ```text
+ data/
+   manifest.csv
+   hemibrain/
+   flywire/
+   comparison/
+ ```
+
+ ## Quick Use
+
+ Install with Homebrew:
+
+ ```bash
+ brew tap gumadeiras/tap
+ brew install fruitloops
+ ```
+
+ The Homebrew install is lightweight by default. When you need them, add the
+ bulk, live, and plot Python dependencies to fruitloops' Homebrew virtualenv:
+
+ ```bash
+ fruitloops-install-extras
+ ```
+
+ Install from GitHub with the extras you need:
+
+ ```bash
+ python -m pip install "fruitloops @ git+https://github.com/gumadeiras/fruitloops.git"
+ python -m pip install "fruitloops[bulk,live,plot] @ git+https://github.com/gumadeiras/fruitloops.git"
+ ```
+
+ Run directly from the repository:
+
+ ```bash
+ python -m fruitloops datasets
+ python -m fruitloops files --dataset flywire --contains summary
+ python -m fruitloops head --table flywire:analysis_outputs/full_summary
+ python -m fruitloops query --table comparison:matched_ln_class_similarity --contains LN_class=il3LN6 --format json
+ python -m fruitloops ln il3LN6 --dataset flywire --format json
+ python -m fruitloops partners il3LN6 --dataset flywire --kind orn --format csv
+ python -m fruitloops compare il3LN6 --format json
+ ```
+
+ For editable installation:
+
+ ```bash
+ python -m pip install -e '.[bulk,live,plot]'
+ fruitloops datasets
+ ```
+
+ ## Table References
+
+ Tables can be referenced as:
+
+ - `dataset:relative/path/without_csv`
+ - `dataset:collection/file_stem`
+ - `file_id` from `data/manifest.csv`
+
+ Examples:
+
+ ```bash
+ fruitloops schema --table flywire:analysis_outputs/full_summary
+ fruitloops query --table hemibrain:analysis_outputs/full_summary --select bodyId,LN_type,input_preference
+ fruitloops query --table flywire:source_audit/ln_observations_by_hemisphere --where LN_type=il3LN6
+ fruitloops query --table flywire:source_audit/orn_partner_counts_by_hemisphere --where LN_type=il3LN6 --format csv
+ fruitloops path --table comparison:matched_ln_class_similarity
+ ```
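The `dataset:relative/path` form above can be illustrated with a small parser. This is only a sketch of how such a reference might be split; `TableRef`, `parse_table_ref`, and the `data/<dataset>/<path>.csv` mapping are assumptions made for illustration, not fruitloops' own code, and the `file_id` form is not handled.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class TableRef:
    dataset: str        # e.g. "flywire", "hemibrain", "comparison"
    relative_path: str  # path below the dataset directory, without ".csv"

    def csv_path(self, data_root: str = "data") -> PurePosixPath:
        # Assumption: tables live at <data_root>/<dataset>/<relative_path>.csv
        return PurePosixPath(data_root) / self.dataset / f"{self.relative_path}.csv"


def parse_table_ref(ref: str) -> TableRef:
    """Split 'dataset:relative/path' into its dataset and path parts."""
    dataset, sep, relative_path = ref.partition(":")
    if not sep or not dataset or not relative_path:
        raise ValueError(f"expected 'dataset:relative/path', got {ref!r}")
    return TableRef(dataset, relative_path)


ref = parse_table_ref("flywire:analysis_outputs/full_summary")
print(ref.dataset, ref.csv_path())
```

In practice `fruitloops path --table ...` is the authoritative way to resolve a reference; the sketch only shows the shape of the string.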
+
+ ## Common Agent Queries
+
+ Aggregate any table without pandas:
+
+ ```bash
+ fruitloops aggregate \
+   --table flywire:source_audit/orn_partner_counts_by_hemisphere \
+   --where LN_type=il3LN6 \
+   --by LN_type,analysis_hemisphere,input_relation \
+   --sum n_synapses \
+   --format csv
+ ```
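To make the semantics of `--where`, `--by`, and `--sum` concrete, here is a rough pure-Python equivalent using only the standard library. It is a sketch of the filter, group, and sum idea, not the CLI's implementation; `aggregate_sum` is a hypothetical helper and the sample rows are invented for illustration.

```python
import csv
import io
from collections import defaultdict


def aggregate_sum(rows, by, sum_column, where=None):
    """Filter rows by equality, group by the `by` columns, and sum one column."""
    where = where or {}
    totals = defaultdict(float)
    for row in rows:
        # Keep only rows matching every column=value filter.
        if any(row.get(col) != value for col, value in where.items()):
            continue
        key = tuple(row[col] for col in by)
        totals[key] += float(row[sum_column])
    return dict(totals)


# Invented sample data; real tables come from the fruitloops data snapshot.
sample = io.StringIO(
    "LN_type,analysis_hemisphere,input_relation,n_synapses\n"
    "il3LN6,left,ipsi,10\n"
    "il3LN6,left,ipsi,5\n"
    "il3LN6,left,contra,3\n"
    "other_LN,left,ipsi,99\n"
)
rows = list(csv.DictReader(sample))
totals = aggregate_sum(
    rows,
    by=["LN_type", "analysis_hemisphere", "input_relation"],
    sum_column="n_synapses",
    where={"LN_type": "il3LN6"},
)
for key, total in totals.items():
    print(*key, total)
```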
+
+ Summarize ORN or PN partners for one LN:
+
+ ```bash
+ fruitloops partners il3LN6 --dataset flywire --kind orn --format csv
+ fruitloops partners il3LN6 --dataset flywire --kind pn --format csv
+ fruitloops partners il3LN6 --dataset hemibrain --kind orn --format csv
+ fruitloops partners il3LN6 --dataset hemibrain --kind pn --format csv
+ ```
+
+ Pull the reconciled hemibrain/FlyWire comparison:
+
+ ```bash
+ fruitloops compare il3LN6 --format json
+ ```
+
+ Useful LN workflow:
+
+ ```bash
+ fruitloops ln il3LN6 --format csv
+ fruitloops query --table flywire:source_audit/ln_observations_by_hemisphere --where LN_type=il3LN6 --format csv
+ fruitloops aggregate \
+   --table flywire:source_audit/orn_partner_counts_by_hemisphere \
+   --where LN_type=il3LN6 \
+   --by analysis_hemisphere,input_relation \
+   --sum n_synapses \
+   --format csv
+ fruitloops compare il3LN6 --format jsonl
+ ```
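An agent that issues these commands programmatically may prefer to assemble the argument list rather than a shell string. The helper below is a hypothetical sketch (`fruitloops_args` is not part of the package); it only builds the argv list from the flags shown above and does not run anything.

```python
import shlex


def fruitloops_args(*command, **options):
    """Build an argv list such as ['fruitloops', 'query', '--table', ...]."""
    argv = ["fruitloops", *command]
    for name, value in options.items():
        flag = "--" + name.replace("_", "-")
        if value is True:
            argv.append(flag)  # boolean switch with no value
        elif value is not None and value is not False:
            argv.extend([flag, str(value)])
    return argv


argv = fruitloops_args(
    "query",
    table="flywire:source_audit/ln_observations_by_hemisphere",
    where="LN_type=il3LN6",
    format="csv",
)
print(shlex.join(argv))
```

With fruitloops installed, the list could be passed to `subprocess.run(argv, capture_output=True, text=True, check=True)`, which avoids shell quoting problems.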
+
+ ## Olfaction Offline Cache
+
+ Build derived AL/LH/MB tables after importing bulk connectivity:
+
+ ```bash
+ python -m pip install -e '.[bulk]'
+ fruitloops olfaction build
+ fruitloops olfaction tables
+ ```
+
+ For complete names/classes/glomeruli, cache annotations once from live APIs and
+ rebuild:
+
+ ```bash
+ python -m pip install -e '.[bulk,live]'
+ fruitloops olfaction cache-annotations --dataset hemibrain
+ fruitloops olfaction cache-annotations --dataset flywire
+ ```
+
+ The builder creates `olf_edges_by_neuropil`, `olf_edges_total` aggregated over
+ AL/LH/MB, `olf_neuropil_membership`, `olf_neurons`, and `olf_provenance` in the
+ DuckDB store. It uses imported annotation tables when available:
+
+ - `hemibrain_olfaction_neuron_annotations` or `hemibrain_traced_neurons`
+ - `flywire_hierarchical_neuron_annotations`
+ - `flywire_neuron_information_v2`
+
+ Example olfaction queries:
+
+ ```bash
+ fruitloops olfaction neurons --dataset flywire --region AL --class ORN --format csv
+ fruitloops olfaction pns --dataset hemibrain --glomerulus DM1 --format csv
+ fruitloops olfaction orn-inputs --dataset hemibrain --glomerulus DM1 --by-side --format csv
+ fruitloops olfaction edges --dataset flywire --region LH --min-synapses 5 --format csv
+ ```
+
+ ## Generic Plotting
+
+ Plotting is reusable and table-agnostic. Install the plotting extra when needed:
+
+ ```bash
+ python -m pip install -e '.[plot]'
+ ```
+
+ Render from any `fruitloops` table reference:
+
+ ```bash
+ fruitloops plot \
+   --table comparison:matched_ln_class_similarity \
+   --kind scatter \
+   --x hemibrain_mean_contra_preference \
+   --y flywire_mean_contra_preference \
+   --label LN_class \
+   --top-labels 8 \
+   --output outputs/contra_preference_scatter \
+   --formats png,svg
+ ```
+
+ Or render from any CSV path:
+
+ ```bash
+ fruitloops plot \
+   --csv path/to/table.csv \
+   --kind scatter \
+   --x x_column \
+   --y y_column \
+   --output outputs/my_scatter
+ ```
+
+ Other generic plot kinds:
+
+ ```bash
+ fruitloops plot --table comparison:matched_ln_class_similarity --kind bar --x LN_class --y orn_input_distribution_correlation --output outputs/orn_corr_bar
+ fruitloops plot --table flywire:source_audit/orn_partner_counts_by_hemisphere --kind violin --x input_relation --value n_synapses --where LN_type=il3LN6 --output outputs/il3ln6_orn_violin
+ fruitloops plot --table flywire:source_audit/orn_partner_counts_by_hemisphere --kind heatmap --x glomerulus --y input_relation --value n_synapses --where LN_type=il3LN6 --output outputs/il3ln6_orn_heatmap
+ fruitloops plot --table comparison:matched_ln_class_similarity --kind bubble --x orn_input_distribution_correlation --y pn_output_distribution_correlation --size flywire_orn_input_total --color flywire_contra_fraction --label LN_class --output outputs/similarity_bubble
+ ```
+
+ The wrapper script is equivalent:
+
+ ```bash
+ python scripts/plot_csv.py --csv path/to/table.csv --kind hist --value score --output outputs/score_hist
+ ```
+
+ ## Live Connectome Access
+
+ Live database access is optional. Credentials come from environment variables or
+ from a local `.env` file. `.env` is ignored by git; start from `.env.example`.
+
+ ```bash
+ python -m pip install -e '.[live]'
+ cp .env.example .env
+ ```
+
+ Use a different env file with `--env-file path/to/file.env`.
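As a rough picture of how credentials from the environment and from an env file can be combined, here is a minimal sketch. The helper names are hypothetical, and the precedence shown (an exported variable wins over the file) is an assumption; the README does not state which source fruitloops prefers.

```python
import os


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_setting(name, env_file_values, environ=None):
    environ = os.environ if environ is None else environ
    # Assumption: a variable already set in the process environment wins over the file.
    return environ.get(name, env_file_values.get(name))


file_values = parse_env_file(
    "# example env file\n"
    "NEUPRINT_SERVER=neuprint.janelia.org\n"
    "NEUPRINT_DATASET='hemibrain:v1.2.1'\n"
)
print(resolve_setting("NEUPRINT_DATASET", file_values, environ={}))
```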
+
+ Hemibrain uses `neuprint-python`:
+
+ ```bash
+ export NEUPRINT_SERVER=neuprint.janelia.org
+ export NEUPRINT_DATASET=hemibrain:v1.2.1
+ export NEUPRINT_APPLICATION_CREDENTIALS=<neuprint-token>
+
+ fruitloops live hemibrain neurons --type-contains il3LN6 --limit 5 --format csv
+ fruitloops live hemibrain connections --upstream-body-id 5813018460 --limit 20 --format json
+ fruitloops live hemibrain cypher --query 'MATCH (n:Neuron) RETURN n.bodyId AS bodyId, n.type AS type LIMIT 5'
+ ```
+
+ FlyWire uses `caveclient`:
+
+ ```bash
+ export FLYWIRE_DATASTACK=flywire_fafb_public
+ export CAVE_AUTH_TOKEN=<cave-token>
+
+ fruitloops live flywire tables --format csv
+ fruitloops live flywire table --table synapses_nt_v1 --in pre_pt_root_id=720575940623636701 --limit 10 --format csv
+ fruitloops live flywire synapses --pre-root-id 720575940623636701 --limit 10 --format json
+ ```
+
+ Script shortcuts are equivalent:
+
+ ```bash
+ python scripts/live_hemibrain.py neurons --type-contains il3LN6 --limit 5
+ python scripts/live_flywire.py tables
+ ```
+
+ ## Offline-First Live Cache
+
+ Use `offline fetch` when you want local data first and live APIs only on cache
+ miss. Results are saved under `cache/live/`, which is ignored by git.
+
+ ```bash
+ fruitloops offline fetch \
+   --dataset flywire \
+   --action synapses \
+   --pre-root-id 720575940623636701 \
+   --limit 10 \
+   --format csv
+ ```
+
+ Repeat the same command to read the cached CSV. Use `--offline-only` to fail
+ instead of hitting the network, or `--refresh` to force a live re-fetch.
+
+ ```bash
+ fruitloops offline list
+ fruitloops offline fetch --dataset flywire --action tables --offline-only
+ fruitloops offline fetch --dataset hemibrain --action neurons --type-contains il3LN6 --limit 5
+ ```
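The cache-first behaviour described above (read locally, go live only on a miss, with `--offline-only` and `--refresh` as overrides) can be sketched as follows. This is an illustration of the pattern under stated assumptions, not fruitloops' code: `fetch_cached`, `cache_key`, the file naming scheme, and the `fetch_live` callback are all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cache_key(dataset, action, params):
    """Derive a stable short key from the request (hypothetical naming scheme)."""
    payload = json.dumps({"dataset": dataset, "action": action, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def fetch_cached(cache_dir, dataset, action, params, fetch_live, offline_only=False, refresh=False):
    path = Path(cache_dir) / dataset / f"{action}-{cache_key(dataset, action, params)}.csv"
    if path.exists() and not refresh:
        return path.read_text(encoding="utf-8")  # cache hit: no network access
    if offline_only:
        raise FileNotFoundError(f"no cached result for {dataset}/{action}")
    text = fetch_live(dataset, action, params)  # cache miss or forced refresh
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return text


calls = []


def fake_live(dataset, action, params):
    # Stand-in for a live API call; records that the "network" was used.
    calls.append((dataset, action))
    return "pre_pt_root_id,n_synapses\n1,2\n"


with tempfile.TemporaryDirectory() as tmp:
    fetch_cached(tmp, "flywire", "synapses", {"limit": 10}, fake_live)
    fetch_cached(tmp, "flywire", "synapses", {"limit": 10}, fake_live)
    print(len(calls))  # the second call is served from the cache
```

How the real CLI treats `--offline-only` combined with `--refresh` is not stated in this README; the sketch simply raises in that case.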
+
+ ## Bulk Offline Releases
+
+ Bulk releases should be the primary offline source when you need broad
+ connectivity, with live/cache queries only filling gaps.
+
+ List known public release files:
+
+ ```bash
+ fruitloops bulk sources
+ ```
+
+ Download the FlyWire proofread connection table first; it is much smaller than the full synapse table:
+
+ ```bash
+ fruitloops bulk download --dataset flywire --kind proofread-connections
+ ```
+
+ Optional larger downloads:
+
+ ```bash
+ fruitloops bulk download --dataset hemibrain --kind compact-adjacencies
+ fruitloops bulk download --dataset flywire --kind synapses
+ fruitloops bulk download --dataset hemibrain --kind neo4j-inputs
+ ```
+
+ Import CSV/Parquet/Feather into local DuckDB:
+
+ ```bash
+ python -m pip install -e '.[bulk]'
+ fruitloops bulk import \
+   --path bulk/raw/flywire/proofread_connections_783.feather \
+   --table flywire_proofread_connections \
+   --replace
+ fruitloops bulk tables
+ fruitloops bulk query --table flywire_proofread_connections --limit 10 --format csv
+ ```
+
+ Optimize imported connection tables before repeated partner queries:
+
+ ```bash
+ fruitloops bulk optimize --table flywire_proofread_connections --prefix flywire
+ fruitloops bulk optimize --table hemibrain_traced_roi_connections --prefix hemibrain
+ ```
+
+ Agent-facing wrappers infer common pre/post/weight/ROI column names:
+
+ ```bash
+ fruitloops bulk schema --table flywire_proofread_connections
+ fruitloops bulk connections --table flywire_proofread_connections --pre-id ROOT --limit 20 --format csv
+ fruitloops bulk inputs --table flywire_proofread_connections --body-id ROOT --format csv
+ fruitloops bulk outputs --table flywire_proofread_connections --body-id ROOT --format csv
+ fruitloops bulk partners --table flywire_proofread_connections --body-id ROOT --format json
+ fruitloops bulk views --table flywire_proofread_connections --prefix flywire
+ fruitloops bulk optimize --table flywire_proofread_connections --prefix flywire
+ ```
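Column inference of this kind can be pictured as a lookup over candidate names per role. The sketch below is purely illustrative: `infer_columns` and the candidate lists are assumptions chosen to resemble common FlyWire and hemibrain column names, and the names fruitloops actually recognizes may differ.

```python
# Hypothetical candidate names per role, checked in order of preference.
CANDIDATES = {
    "pre": ("pre_pt_root_id", "pre_root_id", "bodyId_pre", "pre"),
    "post": ("post_pt_root_id", "post_root_id", "bodyId_post", "post"),
    "weight": ("syn_count", "n_synapses", "weight"),
    "roi": ("neuropil", "roi"),
}


def infer_columns(columns):
    """Map each role to the first matching column name (case-insensitive)."""
    lowered = {column.lower(): column for column in columns}
    roles = {}
    for role, names in CANDIDATES.items():
        for name in names:
            if name.lower() in lowered:
                roles[role] = lowered[name.lower()]
                break
    return roles


print(infer_columns(["pre_pt_root_id", "post_pt_root_id", "neuropil", "syn_count"]))
print(infer_columns(["bodyId_pre", "bodyId_post", "roi", "weight"]))
```

Running `fruitloops bulk schema --table ...` first shows which columns a table really has.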
+
+ Hemibrain's compact adjacency and Neo4j bundles are CSV archives; extract first,
+ then import the CSVs you need:
+
+ ```bash
+ fruitloops bulk extract --path bulk/raw/hemibrain/exported-traced-adjacencies-v1.2.tar.gz
+ fruitloops bulk import \
+   --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-roi-connections.csv \
+   --table hemibrain_traced_roi_connections \
+   --replace
+ fruitloops bulk import \
+   --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-total-connections.csv \
+   --table hemibrain_traced_total_connections \
+   --replace
+ fruitloops bulk import \
+   --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-neurons.csv \
+   --table hemibrain_traced_neurons \
+   --replace
+ fruitloops bulk extract --path bulk/raw/hemibrain/hemibrain_v1.2_neo4j_inputs.zip
+ fruitloops bulk import --path bulk/extracted/hemibrain_v1.2_neo4j_inputs/<file>.csv --table hemibrain_<name>
+ ```
+
+ End-to-end offline setup:
+
+ ```bash
+ python -m pip install -e '.[bulk]'
+ fruitloops bulk download --dataset flywire --kind proofread-connections
+ fruitloops bulk import --path bulk/raw/flywire/proofread_connections_783.feather --table flywire_proofread_connections --replace
+ fruitloops bulk optimize --table flywire_proofread_connections --prefix flywire
+ fruitloops bulk download --dataset hemibrain --kind compact-adjacencies
+ fruitloops bulk extract --path bulk/raw/hemibrain/exported-traced-adjacencies-v1.2.tar.gz
+ fruitloops bulk import --path bulk/extracted/exported-traced-adjacencies-v1.2/traced-roi-connections.csv --table hemibrain_traced_roi_connections --replace
+ fruitloops bulk optimize --table hemibrain_traced_roi_connections --prefix hemibrain
+ fruitloops bulk tables
+ ```
+
+ `flywire_synapses_783.feather` is much larger than the proofread connection
+ table. Fruitloops streams Feather imports through Arrow record batches, but the
+ resulting DuckDB database still needs enough local disk for the imported table
+ and indexes.
+
+ ## Output Formats
+
+ Most commands support `--format table`, `--format csv`, `--format json`, or
+ `--format jsonl`. CSV and JSONL are intended for downstream agent pipelines.
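A downstream pipeline might decode captured command output along these lines. This is a minimal sketch using only the standard library; `parse_output` is a hypothetical helper, the sample strings are invented, and the exact shape of the `json` output (list or object) is not specified here, so it is returned as decoded.

```python
import csv
import io
import json


def parse_output(text, fmt):
    """Decode captured stdout according to the --format that was requested."""
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    if fmt == "jsonl":
        # One JSON document per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if fmt == "json":
        return json.loads(text)
    raise ValueError(f"unsupported format for parsing: {fmt!r}")


# Invented sample output for illustration only.
csv_rows = parse_output("LN_type,n_synapses\nil3LN6,15\n", "csv")
jsonl_rows = parse_output('{"LN_type": "il3LN6", "n_synapses": 15}\n', "jsonl")
print(csv_rows[0]["n_synapses"], jsonl_rows[0]["n_synapses"])
```

Note that CSV values arrive as strings while JSONL preserves numeric types, which is one reason to prefer JSONL when types matter.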
+
+ ## Rebuilding the Data Snapshot
+
+ From the paper repository root:
+
+ ```bash
+ python scripts/build_data_snapshot.py \
+   --source "/path/to/widespread-direction-selectivity" \
+   --dest data
+ ```
+
+ The script copies generated CSVs and rewrites `data/manifest.csv`.
+
+ ## Test
+
+ ```bash
+ python -m unittest discover -s tests
+ ```