Rhapso 0.1.92__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (107)
  1. rhapso-0.1.92/LICENSE +21 -0
  2. rhapso-0.1.92/PKG-INFO +39 -0
  3. rhapso-0.1.92/README.md +361 -0
  4. rhapso-0.1.92/Rhapso/__init__.py +1 -0
  5. rhapso-0.1.92/Rhapso/data_prep/__init__.py +2 -0
  6. rhapso-0.1.92/Rhapso/data_prep/n5_reader.py +188 -0
  7. rhapso-0.1.92/Rhapso/data_prep/s3_big_stitcher_reader.py +55 -0
  8. rhapso-0.1.92/Rhapso/data_prep/xml_to_dataframe.py +215 -0
  9. rhapso-0.1.92/Rhapso/detection/__init__.py +5 -0
  10. rhapso-0.1.92/Rhapso/detection/advanced_refinement.py +203 -0
  11. rhapso-0.1.92/Rhapso/detection/difference_of_gaussian.py +324 -0
  12. rhapso-0.1.92/Rhapso/detection/image_reader.py +117 -0
  13. rhapso-0.1.92/Rhapso/detection/metadata_builder.py +130 -0
  14. rhapso-0.1.92/Rhapso/detection/overlap_detection.py +327 -0
  15. rhapso-0.1.92/Rhapso/detection/points_validation.py +49 -0
  16. rhapso-0.1.92/Rhapso/detection/save_interest_points.py +265 -0
  17. rhapso-0.1.92/Rhapso/detection/view_transform_models.py +67 -0
  18. rhapso-0.1.92/Rhapso/fusion/__init__.py +0 -0
  19. rhapso-0.1.92/Rhapso/fusion/affine_fusion/__init__.py +2 -0
  20. rhapso-0.1.92/Rhapso/fusion/affine_fusion/blend.py +289 -0
  21. rhapso-0.1.92/Rhapso/fusion/affine_fusion/fusion.py +601 -0
  22. rhapso-0.1.92/Rhapso/fusion/affine_fusion/geometry.py +159 -0
  23. rhapso-0.1.92/Rhapso/fusion/affine_fusion/io.py +546 -0
  24. rhapso-0.1.92/Rhapso/fusion/affine_fusion/script_utils.py +111 -0
  25. rhapso-0.1.92/Rhapso/fusion/affine_fusion/setup.py +4 -0
  26. rhapso-0.1.92/Rhapso/fusion/affine_fusion_worker.py +234 -0
  27. rhapso-0.1.92/Rhapso/fusion/multiscale/__init__.py +0 -0
  28. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/__init__.py +19 -0
  29. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/compress/__init__.py +3 -0
  30. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/compress/czi_to_zarr.py +698 -0
  31. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/compress/zarr_writer.py +265 -0
  32. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/models.py +81 -0
  33. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/utils/__init__.py +3 -0
  34. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/utils/utils.py +526 -0
  35. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_hcr_data_transformation/zeiss_job.py +249 -0
  36. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_z1_radial_correction/__init__.py +21 -0
  37. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_z1_radial_correction/array_to_zarr.py +257 -0
  38. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_z1_radial_correction/radial_correction.py +557 -0
  39. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_z1_radial_correction/run_capsule.py +98 -0
  40. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_z1_radial_correction/utils/__init__.py +3 -0
  41. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_z1_radial_correction/utils/utils.py +266 -0
  42. rhapso-0.1.92/Rhapso/fusion/multiscale/aind_z1_radial_correction/worker.py +89 -0
  43. rhapso-0.1.92/Rhapso/fusion/multiscale_worker.py +113 -0
  44. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/__init__.py +8 -0
  45. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/dispim_link.py +235 -0
  46. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/exaspim_link.py +127 -0
  47. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/hcr_link.py +368 -0
  48. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/iSPIM_top.py +47 -0
  49. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/link_utils.py +239 -0
  50. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/main.py +299 -0
  51. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/ng_layer.py +1434 -0
  52. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/ng_state.py +1123 -0
  53. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/parsers.py +336 -0
  54. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/raw_link.py +116 -0
  55. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/utils/__init__.py +4 -0
  56. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/utils/shader_utils.py +85 -0
  57. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/utils/transfer.py +43 -0
  58. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen/utils/utils.py +303 -0
  59. rhapso-0.1.92/Rhapso/fusion/neuroglancer_link_gen_worker.py +30 -0
  60. rhapso-0.1.92/Rhapso/matching/__init__.py +0 -0
  61. rhapso-0.1.92/Rhapso/matching/load_and_transform_points.py +458 -0
  62. rhapso-0.1.92/Rhapso/matching/ransac_matching.py +544 -0
  63. rhapso-0.1.92/Rhapso/matching/save_matches.py +120 -0
  64. rhapso-0.1.92/Rhapso/matching/xml_parser.py +302 -0
  65. rhapso-0.1.92/Rhapso/pipelines/__init__.py +0 -0
  66. rhapso-0.1.92/Rhapso/pipelines/ray/__init__.py +0 -0
  67. rhapso-0.1.92/Rhapso/pipelines/ray/aws/__init__.py +0 -0
  68. rhapso-0.1.92/Rhapso/pipelines/ray/aws/alignment_pipeline.py +227 -0
  69. rhapso-0.1.92/Rhapso/pipelines/ray/aws/config/__init__.py +0 -0
  70. rhapso-0.1.92/Rhapso/pipelines/ray/evaluation.py +71 -0
  71. rhapso-0.1.92/Rhapso/pipelines/ray/interest_point_detection.py +137 -0
  72. rhapso-0.1.92/Rhapso/pipelines/ray/interest_point_matching.py +110 -0
  73. rhapso-0.1.92/Rhapso/pipelines/ray/local/__init__.py +0 -0
  74. rhapso-0.1.92/Rhapso/pipelines/ray/local/alignment_pipeline.py +167 -0
  75. rhapso-0.1.92/Rhapso/pipelines/ray/matching_stats.py +104 -0
  76. rhapso-0.1.92/Rhapso/pipelines/ray/param/__init__.py +0 -0
  77. rhapso-0.1.92/Rhapso/pipelines/ray/solver.py +120 -0
  78. rhapso-0.1.92/Rhapso/pipelines/ray/split_dataset.py +78 -0
  79. rhapso-0.1.92/Rhapso/solver/__init__.py +0 -0
  80. rhapso-0.1.92/Rhapso/solver/compute_tiles.py +562 -0
  81. rhapso-0.1.92/Rhapso/solver/concatenate_models.py +116 -0
  82. rhapso-0.1.92/Rhapso/solver/connected_graphs.py +111 -0
  83. rhapso-0.1.92/Rhapso/solver/data_prep.py +181 -0
  84. rhapso-0.1.92/Rhapso/solver/global_optimization.py +410 -0
  85. rhapso-0.1.92/Rhapso/solver/model_and_tile_setup.py +109 -0
  86. rhapso-0.1.92/Rhapso/solver/pre_align_tiles.py +323 -0
  87. rhapso-0.1.92/Rhapso/solver/save_results.py +97 -0
  88. rhapso-0.1.92/Rhapso/solver/view_transforms.py +75 -0
  89. rhapso-0.1.92/Rhapso/solver/xml_to_dataframe_solver.py +213 -0
  90. rhapso-0.1.92/Rhapso/split_dataset/__init__.py +0 -0
  91. rhapso-0.1.92/Rhapso/split_dataset/compute_grid_rules.py +78 -0
  92. rhapso-0.1.92/Rhapso/split_dataset/save_points.py +101 -0
  93. rhapso-0.1.92/Rhapso/split_dataset/save_xml.py +377 -0
  94. rhapso-0.1.92/Rhapso/split_dataset/split_images.py +537 -0
  95. rhapso-0.1.92/Rhapso/split_dataset/xml_to_dataframe_split.py +219 -0
  96. rhapso-0.1.92/Rhapso.egg-info/PKG-INFO +39 -0
  97. rhapso-0.1.92/Rhapso.egg-info/SOURCES.txt +105 -0
  98. rhapso-0.1.92/Rhapso.egg-info/dependency_links.txt +1 -0
  99. rhapso-0.1.92/Rhapso.egg-info/requires.txt +15 -0
  100. rhapso-0.1.92/Rhapso.egg-info/top_level.txt +2 -0
  101. rhapso-0.1.92/pyproject.toml +3 -0
  102. rhapso-0.1.92/setup.cfg +4 -0
  103. rhapso-0.1.92/setup.py +53 -0
  104. rhapso-0.1.92/tests/__init__.py +1 -0
  105. rhapso-0.1.92/tests/test_detection.py +17 -0
  106. rhapso-0.1.92/tests/test_matching.py +21 -0
  107. rhapso-0.1.92/tests/test_solving.py +21 -0
rhapso-0.1.92/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2022 Allen Institute for Neural Dynamics
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
rhapso-0.1.92/PKG-INFO ADDED
@@ -0,0 +1,39 @@
+ Metadata-Version: 2.4
+ Name: Rhapso
+ Version: 0.1.92
+ Summary: A python package for aligning and stitching light sheet fluorescence microscopy images together
+ Author: ND
+ Author-email: sean.fite@alleninstitute.org
+ Classifier: Development Status :: 3 - Alpha
+ Classifier: Intended Audience :: Developers
+ Classifier: Natural Language :: English
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.7
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Operating System :: OS Independent
+ Requires-Python: >=3.7
+ License-File: LICENSE
+ Requires-Dist: pandas
+ Requires-Dist: dask[array]==2024.12.1
+ Requires-Dist: zarr==2.18.3
+ Requires-Dist: scipy==1.13.1
+ Requires-Dist: scikit-image
+ Requires-Dist: bioio==1.3.0
+ Requires-Dist: bioio-tifffile==1.0.0
+ Requires-Dist: tifffile==2025.1.10
+ Requires-Dist: dask-image==2024.5.3
+ Requires-Dist: boto3==1.35.92
+ Requires-Dist: numcodecs==0.13.1
+ Requires-Dist: matplotlib==3.10.0
+ Requires-Dist: memory-profiler==0.61.0
+ Requires-Dist: s3fs==2024.12.0
+ Requires-Dist: scikit-learn
+ Dynamic: author
+ Dynamic: author-email
+ Dynamic: classifier
+ Dynamic: license-file
+ Dynamic: requires-dist
+ Dynamic: requires-python
+ Dynamic: summary
rhapso-0.1.92/README.md ADDED
@@ -0,0 +1,361 @@
+ # Rhapso
+
+ **Rhapso** is a modular Python toolkit for interest-point-based registration, alignment, and fusion of large-scale microscopy datasets.
+
+ [![License](https://img.shields.io/badge/license-MIT-brightgreen)](LICENSE)
+ [![Python Version](https://img.shields.io/badge/python-3.11-blue.svg)](https://www.python.org/downloads/release/python-3110/)
+ [![Documentation](https://img.shields.io/badge/docs-wiki-blue)](https://github.com/AllenNeuralDynamics/Rhapso/wiki)
+
+ <!-- ## Example Usage Media Content Coming Soon.... -->
+
+ <br>
+
+ ## Table of Contents
+ - [Summary](#summary)
+ - [Contact](#contact)
+ - [Features](#features)
+ - [Performance](#performance)
+ - [Layout](#layout)
+ - [Installation](#installation)
+ - [Ray](#ray)
+ - [Run Locally w/ Ray](#run-locally-with-ray)
+ - [Run on AWS Cluster w/ Ray](#run-on-aws-cluster-with-ray)
+ - [Access Ray Dashboard](#access-ray-dashboard)
+ - [Parameters](#parameters)
+ - [Tuning Guide](#tuning-guide)
+ - [Build Package](#build-package)
+ - [Using the Built `.whl` File](#using-the-built-whl-file)
+
+ ---
+
+ <br>
+
+ **Update 11/26/25**
+ --------
+ Rhapso is still loading... and while we wrap up development, a couple of things to know if you are outside the Allen Institute:
+ - This process requires a very specific XML structure to work.
+ - Fusion/Multiscale is included but still under testing and development.
+
+ <br>
+
+ ## Summary
+ Rhapso is a set of Python components for registration, alignment, and stitching of large-scale, 3D, overlapping tile-based, multiscale microscopy datasets.
+
+ Rhapso was developed by the Allen Institute for Neural Dynamics. It is composed of stateless components that you call from a pipeline script, with the option to run on a single machine or scale out with Ray to cloud-based clusters (currently AWS only).
+
+ Current data loaders support Zarr and TIFF.
+
+ <br>
+
+ ## Contact
+ Questions or want to contribute? Please open an issue.
+
+ <br>
+
+ ## Features
+ - **Interest Point Detection** - DoG-based feature detection
+ - **Interest Point Matching** - descriptor-based RANSAC matching of feature points
+ - **Global Optimization** - globally consistent alignment of matched features across tiles
+ - **Validation and Visualization Tools** - validate component-specific results for the best output
+
+ ---
+
+ <br>
+
+ ## High Level Approach to Registration, Alignment, and Fusion
+
+ We first run **interest point detection** to capture feature points in the dataset, focusing on overlapping regions between tiles. These points drive all downstream alignment.
+
+ Next, we perform **alignment** in two to three stages, with regularized models:
+
+ 1. **Rigid matching + solver** – Match interest points with a rigid model and solve for globally consistent rigid transforms between all tiles.
+ 2. **Affine matching + solver** – Starting from the rigid solution, repeat matching with an affine model to recover more precise tile transforms.
+ 3. **Split affine matching + solver** – For very large z-stacks, we recommend first running the split dataset component to chunk tiles into smaller Z-bounds, then repeating affine matching and solving in “split affine” mode to refine local alignment.
+
+ All resulting transforms are written back into the input XML.
+
+ Whether you split or not, once the XML contains your final transforms, you are ready for **fusion**. We recommend viewing the aligned XML in FIJI/BDV to visually confirm alignment quality before running fusion.
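+
+ The stages above chain the detection, matching, and solver components named in [Layout](#layout). The sketch below is illustrative pseudocode only: the function names and signatures are hypothetical, not Rhapso's API; the real entry points are the pipeline scripts under `Rhapso/pipelines/ray/`.
+
+ ```python
+ # Hypothetical sketch of the stage order; not Rhapso's actual API.
+ detect_interest_points(xml)                 # DoG detection in tile overlap regions
+ match_and_solve(xml, model="rigid")         # stage 1: rigid matching + global solve
+ match_and_solve(xml, model="affine")        # stage 2: affine matching + global solve
+ split_dataset(xml)                          # optional, for very large z-stacks
+ match_and_solve(xml, model="split_affine")  # stage 3: refine local alignment
+ # each stage writes its transforms back into the input XML
+ ```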
+
+
+ ---
+
+ <br>
+
+ ## Performance
+
+ **Interest Point Detection Performance Example (130 TB Zarr dataset)**
+
+ | Environment | Resources | Avg runtime |
+ |:----------------------|:---------------------|:-----------:|
+ | Local single machine | 10 CPU, 10 GB RAM | ~120 min |
+ | AWS Ray cluster | 560 CPU, 4.4 TB RAM | ~30 min |
+
+ <br>
+ *Actual times vary by pipeline components, dataset size, tiling, and parameter choices.*
+
+ ---
+
+ <br>
+
+ ## Layout
+
+ ```
+ Rhapso/
+ └── Rhapso/
+     ├── data_prep/                            # Custom data loaders
+     ├── detection/
+     ├── evaluation/
+     ├── fusion/
+     ├── image_split/
+     ├── matching/
+     ├── pipelines/
+     │   └── ray/
+     │       ├── aws/
+     │       │   ├── config/                   # Cluster templates (edit for your account)
+     │       │   └── alignment_pipeline.py     # AWS Ray pipeline entry point
+     │       ├── local/
+     │       │   └── alignment_pipeline.py     # Local Ray pipeline entry point
+     │       ├── param/                        # Run parameter files (customize per run)
+     │       ├── interest_point_detection.py   # Detection pipeline script
+     │       ├── interest_point_matching.py    # Matching pipeline script
+     │       └── solver.py                     # Global solver script
+     ├── solver/
+     └── visualization/                        # Validation tools
+ ```
+
+ ---
+
+ <br>
+
+
+ ## Installation
+
+ ```sh
+ # clone the repo
+ git clone https://github.com/AllenNeuralDynamics/Rhapso.git
+
+ # create and activate a virtual environment
+ python -m venv .venv && source .venv/bin/activate
+ # or: conda create -n rhapso python=3.11 && conda activate rhapso
+
+ # install deps
+ pip install -r requirements.txt
+ ```
+ ---
+
+ <br>
+
+ ## Ray
+
+ **Ray** is a Python framework for parallel and distributed computing. It lets you run regular Python functions in parallel on a single machine **or** scale them out to a cluster (e.g., AWS) with minimal code changes. In Rhapso, we use Ray to process large-scale datasets.
+
+ - Convert a function into a distributed task with `@ray.remote`
+ - Control scheduling with resource hints (CPUs, memory)
+
+ <br>
+
+ > [!TIP]
+ > Ray schedules **greedily** by default and each task reserves **1 CPU**, so if you fire off many tasks, Ray will try to run as many as your machine advertises, which is often too much for a laptop. Throttle concurrency explicitly so you don't overload your system; track it with your machine's activity monitor locally, or with the Ray dashboard on a cluster:
+ >
+ > - **Cap by CPUs**:
+ > ```python
+ > @ray.remote(num_cpus=3)  # a task runs whenever 3 CPUs are free
+ > ```
+ > - **Cap by Memory and CPU** if tasks are RAM-heavy (`memory` is in bytes):
+ > ```python
+ > @ray.remote(num_cpus=2, memory=4 * 1024**3)  # 4 GiB and 2 CPUs per task
+ > ```
+ > - **No Cap** on resources:
+ > ```python
+ > @ray.remote
+ > ```
+ > - **Good Local Default:**
+ > ```python
+ > @ray.remote(num_cpus=2)
+ > ```
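+
+ Putting the pieces together, here is a minimal, runnable sketch of the throttling pattern above (the task body is a stand-in for real Rhapso work):
+
+ ```python
+ import ray
+
+ ray.init()  # on a cluster: ray.init(address="auto")
+
+ @ray.remote(num_cpus=2)  # each task reserves 2 CPUs, capping concurrency
+ def process_tile(tile_id):
+     # placeholder for per-tile work (e.g., detection on one tile)
+     return tile_id
+
+ futures = [process_tile.remote(i) for i in range(100)]
+ results = ray.get(futures)  # blocks until all tasks finish
+ ```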
+
+ ---
+
+ <br>
+
+
+ ## Run Locally with Ray
+
+ ### 1. Edit or create a param file (templates in codebase)
+ ```
+ Rhapso/pipelines/ray/param/
+ ```
+
+ ### 2. Update alignment pipeline script to point to param file
+ ```python
+ with open("Rhapso/pipelines/ray/param/your_param_file.yml", "r") as file:
+     config = yaml.safe_load(file)
+ ```
+
+ ### 3. Run local alignment pipeline script
+ ```sh
+ python Rhapso/pipelines/ray/local/alignment_pipeline.py
+ ```
+
+ ---
+
+ <br>
+
+
+ ## Run on AWS Cluster with Ray
+
+ ### 1. Edit/create a param file (templates in codebase)
+ ```
+ Rhapso/pipelines/ray/param/
+ ```
+
+ ### 2. Update alignment pipeline script to point to param file
+ ```python
+ with open("Rhapso/pipelines/ray/param/your_param_file.yml", "r") as file:
+     config = yaml.safe_load(file)
+ ```
+
+ ### 3. Edit/create a config file (templates in codebase)
+ ```
+ Rhapso/pipelines/ray/aws/config/
+ ```
+
+ ### 4. Update config file to point to the whl location in setup_commands
+ ```yaml
+ - aws s3 cp s3://rhapso-whl-v2/Rhapso-0.1.8-py3-none-any.whl /tmp/Rhapso-0.1.8-py3-none-any.whl
+ ```
+
+ ### 5. Update alignment pipeline script to point to config file
+ ```python
+ unified_yml = "your_cluster_config_file_name.yml"
+ ```
+
+ ### 6. Create whl file and upload to S3
+ ```sh
+ python setup.py sdist bdist_wheel
+ ```
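+
+ The build command above produces the wheel; the upload itself is not shown here, but assuming the bucket from step 4 and whatever version you just built, it would look something like:
+
+ ```sh
+ aws s3 cp dist/Rhapso-<version>-py3-none-any.whl s3://rhapso-whl-v2/
+ ```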
+
+ ### 7. Run AWS alignment pipeline script
+ ```sh
+ python Rhapso/pipelines/ray/aws/alignment_pipeline.py
+ ```
+
+ > [!TIP]
+ > - The pipeline script is set to always spin the cluster down; it is good practice to double-check in AWS.
+ > - If you experience a sticky cache on run params, you may have forgotten to spin your old cluster down.
+
+ <br>
+
+ ## Access Ray Dashboard
+
+ **This is a great place to tune your cluster's performance.**
+ 1. Find the public IP of the head node.
+ 2. Replace the IP address and PEM file location below to SSH into the head node (the dashboard listens on port 8265).
+ ```
+ ssh -i /your/path/to/ssh/key.pem -L 8265:localhost:8265 ubuntu@public.ip.address
+ ```
+ 3. Go to the dashboard.
+ ```
+ http://localhost:8265
+ ```
+
+ ---
+
+ <br>
+
+ ## Parameters
+
+ ### Detection
+
+ | Parameter | Feature / step | What it does | Typical range |
+ | :--- | :--- | :--- | :--- |
+ | `dsxy` | Downsampling (XY) | Reduces XY resolution before detection; speeds up & denoises, but raises minimum feature size | 16 |
+ | `dsz` | Downsampling (Z) | Reduces Z resolution; often lower than XY due to anisotropy | 16 |
+ | `min_intensity` | Normalization | Lower bound for intensity normalization prior to DoG | 1 |
+ | `max_intensity` | Normalization | Upper bound for intensity normalization prior to DoG | 5 |
+ | `sigma` | DoG blur | Gaussian blur scale (sets feature size); higher = smoother, fewer peaks | 1.5 – 2.5 |
+ | `threshold` | Peak detection (DoG) | Peak threshold (initial min peak ≈ `threshold / 3`); higher = fewer, stronger peaks | 0.0008 – 0.05 |
+ | `median_filter` | Pre-filter (XY) | Median filter size to suppress speckle/isolated noise before DoG | 1 – 10 |
+ | `combine_distance` | Post-merge (DoG peaks) | Merge radius (voxels) to de-duplicate nearby detections | 0.5 |
+ | `chunks_per_bound` | Tiling/parallelism | Sub-partitions per tile/bound; higher improves parallelism but adds overhead | 12 – 18 |
+ | `max_spots` | Post-cap | Maximum detections per bound to prevent domination by dense regions | 8,000 – 10,000 |
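+
+ For reference, these keys would appear in your param file roughly like the hypothetical snippet below (values picked from the typical ranges above; the exact layout depends on the templates in `Rhapso/pipelines/ray/param/`):
+
+ ```yaml
+ # Illustrative detection block; structure may differ from the shipped templates
+ detection:
+   dsxy: 16
+   dsz: 16
+   min_intensity: 1
+   max_intensity: 5
+   sigma: 2.0
+   threshold: 0.001
+   median_filter: 3
+   combine_distance: 0.5
+   chunks_per_bound: 16
+   max_spots: 10000
+ ```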
+ <br>
+
+ ### Matching
+
+ **Candidate Selection**
+
+ | Parameter | Feature / step | What it does | Typical range |
+ | :--- | :--- | :--- | :--- |
+ | `num_neighbors` | Candidate search | Number of nearest neighbors to consider per point | 3 |
+ | `redundancy` | Candidate search | Extra neighbors added for robustness beyond `num_neighbors` | 0 – 1 |
+ | `significance` | Ratio test | Strictness of descriptor ratio test; larger = stricter acceptance | 3 |
+ | `search_radius` | Spatial gating | Max spatial distance for candidate matches (in downsampled units) | 100 – 300 |
+ | `num_required_neighbors` | Candidate filtering | Minimum neighbors required to keep a candidate point | 3 |
+
+ **RANSAC**
+
+ | Parameter | Feature / step | What it does | Typical range |
+ | :--- | :--- | :--- | :--- |
+ | `model_min_matches` | RANSAC | Minimum correspondences to estimate a rigid transform | 18 – 32 |
+ | `inlier_factor` | RANSAC | Inlier tolerance scaling; larger = looser inlier threshold | 30 – 100 |
+ | `lambda_value` | RANSAC | Regularization strength during model fitting | 0.05 – 0.1 |
+ | `num_iterations` | RANSAC | Number of RANSAC trials; higher = more robust, slower | 10,000 |
+ | `regularization_weight` | RANSAC | Weight applied to the regularization term | 1.0 |
+ <br>
+
+ ### Solver
+
+ | Parameter | Feature / step | What it does | Typical range |
+ | :--- | :--- | :--- | :--- |
+ | `relative_threshold` | Graph pruning | Reject edges with residuals above dataset-relative cutoff | 3.5 |
+ | `absolute_threshold` | Graph pruning | Reject edges above an absolute error bound (detection-space units) | 7.0 |
+ | `min_matches` | Graph pruning | Minimum matches required to retain an edge between tiles | 3 |
+ | `damp` | Optimization | Damping for iterative solver; higher can stabilize tough cases | 1.0 |
+ | `max_iterations` | Optimization | Upper bound on solver iterations | 10,000 |
+ | `max_allowed_error` | Optimization | Overall error cap; `inf` disables hard stop by error | `inf` |
+ | `max_plateauwidth` | Early stopping | Stagnation window before stopping on no improvement | 200 |
+
+ ---
+
+ <br>
+
+ ## Tuning Guide
+
+ - **Start with Detection.** The quality and density of interest points strongly determine alignment outcomes.
+
+ - **Target Counts (exaSPIM):** ~25–35k points per tile in dense regions; ~10k for sparser tiles. Going much higher usually increases runtime without meaningful accuracy gains.
+
+ - **Inspect Early.** After detection, run the visualization script and verify that peaks form **clustered shapes/lines** with a **good spatial spread**, a good sign for robust rigid matches.
+
+ - **Rigid → Affine Dependency.** Weak rigid matches produce poor rigid transforms, which then degrade affine matching (points don't land close enough). If tiles fail to align:
+   - Check **match counts** for the problem tile and its neighbors.
+   - Adjust high-impact detection knobs (`sigma`, `threshold`, and `median_filter`) within sensible ranges.
+   - Revisit `max_spots` and `combine_distance` to balance density vs. duplicate detections.
+
+ ---
+
+ <br>
+
+ ## Build Package
+
+ ### Using the Built `.whl` File
+
+ 1. **Build the `.whl` file in the root of this repo:**
+ ```sh
+ cd /path/to/Rhapso
+ pip install setuptools wheel
+ python setup.py sdist bdist_wheel
+ ```
+ The `.whl` file will appear in the `dist` directory. To ensure compatibility, do not rename it (e.g., `rhapso-0.1-py3-none-any.whl`).
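+
+ 2. **Install the wheel** (standard pip install; the exact filename depends on the version you built):
+ ```sh
+ pip install dist/rhapso-<version>-py3-none-any.whl
+ ```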
+
+ ---
+
+ <br>
+ <br>
+ <br>
rhapso-0.1.92/Rhapso/__init__.py ADDED
@@ -0,0 +1 @@
+
rhapso-0.1.92/Rhapso/data_prep/__init__.py ADDED
@@ -0,0 +1,2 @@
+
+ # This file can be empty or contain initialization code for the data_preparation module
rhapso-0.1.92/Rhapso/data_prep/n5_reader.py ADDED
@@ -0,0 +1,188 @@
+ # -----------------------------------------------------------------------------
+ # n5_reader.py – N5 data verification script
+ #
+ # Setup & run:
+ #   1. python -m venv n5Venv              # create a fresh Python virtual environment
+ #   2. source n5Venv/bin/activate         # activate the virtual environment
+ #   3. pip install .[n5_reader]           # install n5_reader dependencies from setup.py
+ #   4. python Rhapso/data_prep/n5_reader.py
+ #                                         # run the N5 reader for inspecting datasets
+ # -----------------------------------------------------------------------------
+
+ import zarr
+ import s3fs
+ import os
+ import numpy as np
+ import matplotlib.pyplot as plt
+ import json
+ from zarr.storage import FSStore
+
+ def list_files_under_prefix(node, path):
+     """Recursively print the groups and datasets below a path in an open store."""
+     try:
+         for item in node[path]:
+             new_path = f"{path}/{item}"
+             if isinstance(node[new_path], zarr.hierarchy.Group):
+                 print(f"Group: {new_path}")
+                 list_files_under_prefix(node, new_path)
+             else:
+                 print(f"Dataset: {new_path} - {node[new_path].shape}")
+     except KeyError:
+         print(f"No items found under the path {path}")
+
+ # Number of interest points in view 18,0 is 1061
+ # Max value for view 18,0 in corr ip index is 1017
+
+ def read_n5_data(n5_path):
+     # guard missing local path
+     if not n5_path.startswith("s3://") and not os.path.isdir(n5_path):
+         print(f"❌ Local N5 path not found: {n5_path}")
+         return
+
+     # open the store (S3 or local N5)
+     if n5_path.startswith("s3://"):
+         s3 = s3fs.S3FileSystem(anon=False)
+         store = s3fs.S3Map(root=n5_path, s3=s3)
+     else:
+         store = zarr.N5Store(n5_path)
+
+     print(f"\n🔍 Reading N5 data at: {n5_path}")
+     root = zarr.open(store, mode='r')
+
+     def visit_fn(path, node):
+         if isinstance(node, zarr.Array):
+             print(f"\n📂 Dataset: {path}")
+             print(f"   🔢 dtype: {node.dtype}")
+             shape = node.shape
+             print(f"   📏 shape: {shape}")
+             if len(shape) > 1:
+                 print(f"   📊 count: {shape[0]} arrays of shape {shape[1:]}")
+             else:
+                 print(f"   📊 count: {shape[0]} elements")
+             print(f"   🗂 chunks: {node.chunks}")
+             print(f"   🛠 compressor: {node.compressor}")
+
+             print("   🔎 first 5 entries:")
+             sample = node[:5]
+             for i, entry in enumerate(sample, start=1):
+                 # ensure nested array is printed clearly
+                 val = entry.tolist() if hasattr(entry, "tolist") else entry
+                 print(f"      {i}. {val}")
+
+     root.visititems(visit_fn)
+
+ # read_n5_data('/home/martin/Documents/Allen/Data/IP_TIFF_XML_2/interestpoints.n5')
+
+ def read_correspondences(dataset_path):
+     if dataset_path.startswith("s3://"):
+         store = zarr.storage.FSStore(dataset_path, mode="r")
+     else:
+         store = zarr.N5Store(dataset_path)
+     root = zarr.open(store, mode="r")
+
+     if "data" not in root:
+         print("Key 'data' not found in root.")
+         return
+
+     group = root["data"]
+     data = group[:]
+     print(f"Loaded {len(data)} entries.")
+
+     # for i, entry in enumerate(data):
+     #     print(f"{i}: {entry}")
+
+ # Big Stitcher Output
+ # # base_path = "/Users/seanfite/Desktop/interest_point_detection/interestpoints.n5"
+ # base_path = "/Users/seanfite/Desktop/ip_rigid_alignment/interestpoints.n5"
+ # # base_path = "/Users/seanfite/Desktop/ip_affine_alignment/interestpoints.n5"
+ # # base_path = "s3://rhapso-matching-test/output/interestpoints.n5"
+ # for tp_id in [0]:
+ #     for setup_id in range(20):
+ #         path = f"{base_path}/tpId_{tp_id}_viewSetupId_{setup_id}/beads/correspondences"
+ #         print(f"Reading: {path}")
+ #         read_correspondences(path)
+
+ def read_interest_points(full_path):
+     if full_path.startswith("s3://"):
+         # s3 = s3fs.S3FileSystem(anon=False)
+         # store = s3fs.S3Map(root=full_path, s3=s3)
+         # zarray = zarr.open_array(store, mode='r')
+         # data = zarray[:]
+
+         path = full_path.replace("s3://", "", 1)
+         bucket = path.split("/")[0]
+         prefix = "/".join(path.split("/")[1:])
+
+         s3 = s3fs.S3FileSystem()
+         store = FSStore(f"{bucket}/{prefix}", fs=s3, mode='r')
+         root = zarr.open(store, mode="r")
+
+         group = root["data"]
+         data = group[:]
+         count = len(data)
+         print(count)
+         print("")
+
+     else:
+         full_path = full_path.rstrip("/")  # remove trailing slash if any
+         components = full_path.split("/")
+
+         # Find index of the N5 root (assumes .n5 marks the root)
+         try:
+             n5_index = next(i for i, c in enumerate(components) if c.endswith(".n5"))
+         except StopIteration:
+             raise ValueError("No .n5 directory found in path")
+
+         dataset_path = "/".join(components[:n5_index + 1])      # the store root
+         dataset_rel_path = "/".join(components[n5_index + 1:])  # relative dataset path
+
+         # Open N5 store and dataset
+         store = zarr.N5Store(dataset_path)
+         root = zarr.open(store, mode='r')
+
+         if dataset_rel_path not in root:
+             print(f"Skipping: {dataset_rel_path} (not found)")
+             return
+
+         zarray = root[dataset_rel_path + "/loc"]
+         data = zarray[:]
+
+         print("\n--- Detection Stats (Raw Rhapso Output) ---")
+         print(f"Total Points: {len(data)}")
+
+         # for dim, name in zip(range(3), ['X', 'Y', 'Z']):
+         #     values = data[:, dim]
+         #     print(f"{name} Range: {values.min():.2f} – {values.max():.2f} | Spread (std): {values.std():.2f}")
+
+         # volume = np.ptp(data[:, 0]) * np.ptp(data[:, 1]) * np.ptp(data[:, 2])
+         # density = len(data) / (volume / 1e9) if volume > 0 else 0
+         # print(f"Estimated Density: {density:.2f} points per 1000³ volume")
+         # print("-----------------------")
+
+         # # --- 3D Plot ---
+         # max_points = 1000000000000
+         # sample = data if len(data) <= max_points else data[np.random.choice(len(data), max_points, replace=False)]
+
+         # fig = plt.figure(figsize=(10, 8))
+         # ax = fig.add_subplot(111, projection='3d')
+         # ax.scatter(sample[:, 0], sample[:, 1], sample[:, 2], c='blue', alpha=0.5, s=1)
+         # ax.set_xlabel('X')
+         # ax.set_ylabel('Y')
+         # ax.set_zlabel('Z')
+         # ax.set_title(f"Interest Points in 3D (showing {len(sample)} points)")
+         # plt.tight_layout()
+         # plt.show()
+
+ # base_path = "s3://rhapso-matching-test/output/interestpoints.n5"
+ # base_path = "/Users/seanfite/Desktop/IP_TIFF_XML/interestpoints.n5"
+ base_path = "/Users/seanfite/Desktop/interestpoints.n5"
+ for tp_id in [0]:
+     for setup_id in range(20):
+         path = f"{base_path}/tpId_{tp_id}_viewSetupId_{setup_id}/beads/interestpoints"
+         print(f"For view: {setup_id}")
+         read_interest_points(path)
rhapso-0.1.92/Rhapso/data_prep/s3_big_stitcher_reader.py ADDED
@@ -0,0 +1,55 @@
+ """
+ Utility class for downloading BigStitcher outputs from S3 to local storage for N5 reader compatibility
+ """
+
+ import json
+ import os
+ import s3fs
+
+ class S3BigStitcherReader:
+     def __init__(self, s3_uri, local_directory):
+         self.s3_uri = s3_uri
+         self.local_directory = local_directory
+
+     def download_n5_from_s3_to_local(self):
+         """
+         Recursively download an N5 dataset from S3 to a local directory.
+         """
+         s3 = s3fs.S3FileSystem(anon=False)
+         s3_path = self.s3_uri.replace("s3://", "")
+         all_keys = s3.find(s3_path, detail=True)
+
+         for key, obj in all_keys.items():
+             if obj["type"] == "file":
+                 rel_path = key.replace(s3_path + "/", "")
+                 local_file_path = os.path.join(self.local_directory, rel_path)
+                 os.makedirs(os.path.dirname(local_file_path), exist_ok=True)
+                 s3.get(key, local_file_path)
+
+                 # Check for the specific interestpoints path
+                 if rel_path.endswith("beads/interestpoints/attributes.json") and "interestpoints.n5" in rel_path:
+                     # Overwrite the downloaded attributes.json with the metadata the local N5 reader expects
+                     attributes_path = os.path.join(os.path.dirname(local_file_path), "attributes.json")
+                     attributes_data = {
+                         "pointcloud": "1.0.0",
+                         "type": "list",
+                         "list version": "1.0.0"
+                     }
+
+                     with open(attributes_path, "w") as f:
+                         json.dump(attributes_data, f, indent=2)
+
+     def run(self):
+         self.download_n5_from_s3_to_local()
+
+         s3_path = self.s3_uri.replace("s3://", "")
+         full_local_path = os.path.join(self.local_directory, s3_path)
+
+         # Final paths
+         xml_input_path = os.path.join(full_local_path, "bigstitcher_ip.xml")
+         n5_output_path = os.path.join(full_local_path, "interestpoints.n5")
+
+         print("XML Input Path:", xml_input_path)
+         print("N5 Output Path:", n5_output_path)
+
+         return xml_input_path, n5_output_path