CATSort 0.1.3 (tar.gz)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
catsort-0.1.3/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 lucasbeziers
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
catsort-0.1.3/PKG-INFO ADDED
@@ -0,0 +1,87 @@
+ Metadata-Version: 2.4
+ Name: CATSort
+ Version: 0.1.3
+ Summary: A collision-aware template matching spike sorter.
+ Author-email: Lucas Beziers <lucas.beziers.pro@gmail.com>
+ License: MIT
+ Project-URL: Homepage, https://github.com/lucasbeziers/CATSort
+ Project-URL: Repository, https://github.com/lucasbeziers/CATSort
+ Project-URL: Issues, https://github.com/lucasbeziers/CATSort/issues
+ Keywords: spike sorting,neuroscience,electrophysiology,collisions,template matching
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Science/Research
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Topic :: Scientific/Engineering
+ Requires-Python: >=3.9
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: isosplit6
+ Requires-Dist: numba
+ Requires-Dist: numpy
+ Requires-Dist: scikit-learn
+ Requires-Dist: scipy
+ Requires-Dist: spikeinterface>=0.103.1
+ Dynamic: license-file
+
+ # CATSort
+
+ **CATSort** (Collision-Aware Template-matching Sort) is a robust spike sorter designed to handle overlapping spikes (collisions) with high precision: a dedicated collision-handling stage runs before clustering, and spikes are then extracted by template matching.
+
+ ## Key Features
+
+ - **Collision Handling**: Automatically identifies and flags collided spikes using multi-criterion feature analysis (amplitude, width, energy).
+ - **Template Matching**: Robust spike extraction using template-based matching (currently the 'wobble' method).
+ - **Flexible Schemes**: Choose between an `adaptive` threshold optimization and an `original` fixed MAD (Median Absolute Deviation) multiplier scheme.
+ - **SpikeInterface Integration**: Fully compatible with the [SpikeInterface](https://github.com/SpikeInterface/spikeinterface) ecosystem.
+
+ ## Installation
+
+ You can install `catsort` via pip:
+
+ ```bash
+ pip install catsort
+ ```
+
+ Or from source:
+
+ ```bash
+ git clone https://github.com/lucasbeziers/CATSort.git
+ cd CATSort
+ pip install -e .
+ ```
+
+ ## Quick Start
+
+ ```python
+ import spikeinterface.extractors as se
+ from catsort import run_catsort
+
+ # Load your recording
+ recording = se.read_binary("path_to_data.dat", sampling_frequency=30000, num_channels=384, dtype="int16")
+
+ # Run CATSort
+ sorting = run_catsort(recording)
+
+ # The result is a SpikeInterface Sorting object
+ print(sorting)
+ ```
+
+ ## Parameters
+
+ CATSort offers several parameters to fine-tune its behavior:
+
+ - `scheme`: `'original'` or `'adaptive'`.
+   - `'adaptive'` uses temporally detected collisions to optimize the feature thresholds.
+   - `'original'` uses fixed MAD multipliers.
+ - `mad_multiplier_amplitude`, `mad_multiplier_width`, `mad_multiplier_energy`: (Defaults: `7.0`, `10.0`, `15.0`) Used when `scheme='original'`.
+ - `detect_threshold`: Spike detection threshold in standard deviations (Default: `5`).
+
+ ## License
+
+ MIT License. See [LICENSE](LICENSE) for details.
catsort-0.1.3/README.md ADDED
@@ -0,0 +1,56 @@
+ # CATSort
+
+ **CATSort** (Collision-Aware Template-matching Sort) is a robust spike sorter designed to handle overlapping spikes (collisions) with high precision: a dedicated collision-handling stage runs before clustering, and spikes are then extracted by template matching.
+
+ ## Key Features
+
+ - **Collision Handling**: Automatically identifies and flags collided spikes using multi-criterion feature analysis (amplitude, width, energy).
+ - **Template Matching**: Robust spike extraction using template-based matching (currently the 'wobble' method).
+ - **Flexible Schemes**: Choose between an `adaptive` threshold optimization and an `original` fixed MAD (Median Absolute Deviation) multiplier scheme.
+ - **SpikeInterface Integration**: Fully compatible with the [SpikeInterface](https://github.com/SpikeInterface/spikeinterface) ecosystem.
+
+ ## Installation
+
+ You can install `catsort` via pip:
+
+ ```bash
+ pip install catsort
+ ```
+
+ Or from source:
+
+ ```bash
+ git clone https://github.com/lucasbeziers/CATSort.git
+ cd CATSort
+ pip install -e .
+ ```
+
+ ## Quick Start
+
+ ```python
+ import spikeinterface.extractors as se
+ from catsort import run_catsort
+
+ # Load your recording
+ recording = se.read_binary("path_to_data.dat", sampling_frequency=30000, num_channels=384, dtype="int16")
+
+ # Run CATSort
+ sorting = run_catsort(recording)
+
+ # The result is a SpikeInterface Sorting object
+ print(sorting)
+ ```
+
+ ## Parameters
+
+ CATSort offers several parameters to fine-tune its behavior:
+
+ - `scheme`: `'original'` or `'adaptive'`.
+   - `'adaptive'` uses temporally detected collisions to optimize the feature thresholds.
+   - `'original'` uses fixed MAD multipliers.
+ - `mad_multiplier_amplitude`, `mad_multiplier_width`, `mad_multiplier_energy`: (Defaults: `7.0`, `10.0`, `15.0`) Used when `scheme='original'`.
+ - `detect_threshold`: Spike detection threshold in standard deviations (Default: `5`).
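+
+ For example, to tweak the fixed-MAD scheme, pass a partial `params` dict to `run_catsort`; missing keys fall back to `DEFAULT_PARAMS`. The values below are illustrative, not tuned recommendations:
+
+ ```python
+ from catsort import run_catsort
+
+ sorting = run_catsort(recording, params={
+     "scheme": "original",
+     "mad_multiplier_amplitude": 6.0,  # stricter amplitude flagging (illustrative)
+     "detect_threshold": 6,
+ })
+ ```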
+
+ ## License
+
+ MIT License. See [LICENSE](LICENSE) for details.
catsort-0.1.3/pyproject.toml ADDED
@@ -0,0 +1,52 @@
+ [project]
+ name = "CATSort"
+ version = "0.1.3"
+ description = "A collision-aware template matching spike sorter."
+ readme = "README.md"
+ requires-python = ">=3.9"
+ license = { text = "MIT" }
+ authors = [
+     { name="Lucas Beziers", email="lucas.beziers.pro@gmail.com" },
+ ]
+ keywords = ["spike sorting", "neuroscience", "electrophysiology", "collisions", "template matching"]
+ classifiers = [
+     "Development Status :: 4 - Beta",
+     "Intended Audience :: Science/Research",
+     "License :: OSI Approved :: MIT License",
+     "Programming Language :: Python :: 3",
+     "Programming Language :: Python :: 3.9",
+     "Programming Language :: Python :: 3.10",
+     "Programming Language :: Python :: 3.11",
+     "Programming Language :: Python :: 3.12",
+     "Programming Language :: Python :: 3.13",
+     "Topic :: Scientific/Engineering",
+ ]
+ dependencies = [
+     "isosplit6",
+     "numba",
+     "numpy",
+     "scikit-learn",
+     "scipy",
+     "spikeinterface>=0.103.1",
+ ]
+
+ [project.urls]
+ Homepage = "https://github.com/lucasbeziers/CATSort"
+ Repository = "https://github.com/lucasbeziers/CATSort"
+ Issues = "https://github.com/lucasbeziers/CATSort/issues"
+
+ [build-system]
+ requires = ["setuptools>=61.0"]
+ build-backend = "setuptools.build_meta"
+
+ [tool.setuptools.packages.find]
+ where = ["src"]
+
+ [dependency-groups]
+ dev = [
+     "ipykernel>=6.31.0",
+     "mearec",
+     "pandas",
+     "pytest",
+     "twine",
+ ]
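Note that the dev tools above are declared as a PEP 735 `[dependency-groups]` table rather than an optional-dependencies extra, so installing them requires a PEP 735-aware installer. A minimal sketch, assuming a recent `uv`:

```bash
# Install the project together with the 'dev' dependency group (PEP 735)
uv sync --group dev
```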
catsort-0.1.3/setup.cfg ADDED
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
catsort-0.1.3/src/CATSort.egg-info/PKG-INFO ADDED
@@ -0,0 +1,87 @@
+ Metadata-Version: 2.4
+ Name: CATSort
+ Version: 0.1.3
+ Summary: A collision-aware template matching spike sorter.
+ Author-email: Lucas Beziers <lucas.beziers.pro@gmail.com>
+ License: MIT
+ Project-URL: Homepage, https://github.com/lucasbeziers/CATSort
+ Project-URL: Repository, https://github.com/lucasbeziers/CATSort
+ Project-URL: Issues, https://github.com/lucasbeziers/CATSort/issues
+ Keywords: spike sorting,neuroscience,electrophysiology,collisions,template matching
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Science/Research
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Programming Language :: Python :: 3.13
+ Classifier: Topic :: Scientific/Engineering
+ Requires-Python: >=3.9
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: isosplit6
+ Requires-Dist: numba
+ Requires-Dist: numpy
+ Requires-Dist: scikit-learn
+ Requires-Dist: scipy
+ Requires-Dist: spikeinterface>=0.103.1
+ Dynamic: license-file
+
+ # CATSort
+
+ **CATSort** (Collision-Aware Template-matching Sort) is a robust spike sorter designed to handle overlapping spikes (collisions) with high precision: a dedicated collision-handling stage runs before clustering, and spikes are then extracted by template matching.
+
+ ## Key Features
+
+ - **Collision Handling**: Automatically identifies and flags collided spikes using multi-criterion feature analysis (amplitude, width, energy).
+ - **Template Matching**: Robust spike extraction using template-based matching (currently the 'wobble' method).
+ - **Flexible Schemes**: Choose between an `adaptive` threshold optimization and an `original` fixed MAD (Median Absolute Deviation) multiplier scheme.
+ - **SpikeInterface Integration**: Fully compatible with the [SpikeInterface](https://github.com/SpikeInterface/spikeinterface) ecosystem.
+
+ ## Installation
+
+ You can install `catsort` via pip:
+
+ ```bash
+ pip install catsort
+ ```
+
+ Or from source:
+
+ ```bash
+ git clone https://github.com/lucasbeziers/CATSort.git
+ cd CATSort
+ pip install -e .
+ ```
+
+ ## Quick Start
+
+ ```python
+ import spikeinterface.extractors as se
+ from catsort import run_catsort
+
+ # Load your recording
+ recording = se.read_binary("path_to_data.dat", sampling_frequency=30000, num_channels=384, dtype="int16")
+
+ # Run CATSort
+ sorting = run_catsort(recording)
+
+ # The result is a SpikeInterface Sorting object
+ print(sorting)
+ ```
+
+ ## Parameters
+
+ CATSort offers several parameters to fine-tune its behavior:
+
+ - `scheme`: `'original'` or `'adaptive'`.
+   - `'adaptive'` uses temporally detected collisions to optimize the feature thresholds.
+   - `'original'` uses fixed MAD multipliers.
+ - `mad_multiplier_amplitude`, `mad_multiplier_width`, `mad_multiplier_energy`: (Defaults: `7.0`, `10.0`, `15.0`) Used when `scheme='original'`.
+ - `detect_threshold`: Spike detection threshold in standard deviations (Default: `5`).
+
+ ## License
+
+ MIT License. See [LICENSE](LICENSE) for details.
catsort-0.1.3/src/CATSort.egg-info/SOURCES.txt ADDED
@@ -0,0 +1,14 @@
+ LICENSE
+ README.md
+ pyproject.toml
+ src/CATSort.egg-info/PKG-INFO
+ src/CATSort.egg-info/SOURCES.txt
+ src/CATSort.egg-info/dependency_links.txt
+ src/CATSort.egg-info/requires.txt
+ src/CATSort.egg-info/top_level.txt
+ src/catsort/__init__.py
+ src/catsort/sorter.py
+ src/catsort/core/clustering.py
+ src/catsort/core/collision.py
+ src/catsort/core/utils.py
+ tests/test_sorter.py
catsort-0.1.3/src/CATSort.egg-info/requires.txt ADDED
@@ -0,0 +1,6 @@
+ isosplit6
+ numba
+ numpy
+ scikit-learn
+ scipy
+ spikeinterface>=0.103.1
catsort-0.1.3/src/CATSort.egg-info/top_level.txt ADDED
@@ -0,0 +1 @@
+ catsort
catsort-0.1.3/src/catsort/__init__.py ADDED
@@ -0,0 +1,5 @@
+ # CATSort
+
+ __version__ = "0.1.3"
+
+ from .sorter import run_catsort, DEFAULT_PARAMS
catsort-0.1.3/src/catsort/core/clustering.py ADDED
@@ -0,0 +1,71 @@
+ # Adapted from MountainSort5 (https://github.com/flatironinstitute/mountainsort5)
+ # Licensed under Apache-2.0
+
+ import numpy as np
+ from sklearn import decomposition
+ from isosplit6 import isosplit6
+ from scipy.spatial.distance import squareform
+ from scipy.cluster.hierarchy import linkage, cut_tree
+ from numpy.typing import NDArray
+ from typing import Optional
+
+ def compute_pca_features(X: NDArray, npca: int) -> NDArray:
+     # Cap the number of components at both the sample count and the dimensionality
+     L = X.shape[0]
+     D = X.shape[1]
+     npca_2 = np.minimum(np.minimum(npca, L), D)
+     if L == 0 or D == 0:
+         return np.zeros((0, npca_2), dtype=np.float32)
+     pca = decomposition.PCA(n_components=npca_2)
+     return pca.fit_transform(X)
+
+ def isosplit6_subdivision_method(X: NDArray, npca_per_subdivision: int, inds: Optional[NDArray] = None) -> NDArray:
+     # Recursively cluster X (optionally restricted to rows `inds`) with isosplit6
+     # on a local PCA projection, then bisect the clusters and recurse on each half.
+     if inds is not None:
+         X_sub = X[inds]
+     else:
+         X_sub = X
+
+     L = X_sub.shape[0]
+     if L == 0:
+         return np.zeros((0,), dtype=np.int32)
+
+     features = compute_pca_features(X_sub, npca=npca_per_subdivision)
+     labels = isosplit6(features)
+
+     K = int(np.max(labels)) if len(labels) > 0 else 0
+
+     if K <= 1:
+         return labels
+
+     # Median centroid per cluster (isosplit6 labels are 1-based)
+     centroids = np.zeros((K, X.shape[1]), dtype=np.float32)
+     for k in range(1, K + 1):
+         centroids[k - 1] = np.median(X_sub[labels == k], axis=0)
+
+     dists = np.sqrt(np.sum((centroids[:, None, :] - centroids[None, :, :]) ** 2, axis=2))
+     dists_condensed = squareform(dists)
+
+     # Bisect the clusters via single-linkage hierarchical clustering on centroid distances
+     Z = linkage(dists_condensed, method='single', metric='euclidean')
+     clusters0 = cut_tree(Z, n_clusters=2)
+
+     cluster_inds_1 = np.where(clusters0 == 0)[0] + 1
+     cluster_inds_2 = np.where(clusters0 == 1)[0] + 1
+
+     inds1 = np.where(np.isin(labels, cluster_inds_1))[0]
+     inds2 = np.where(np.isin(labels, cluster_inds_2))[0]
+
+     if inds is not None:
+         inds1_b = inds[inds1]
+         inds2_b = inds[inds2]
+     else:
+         inds1_b = inds1
+         inds2_b = inds2
+
+     labels1 = isosplit6_subdivision_method(X, npca_per_subdivision=npca_per_subdivision, inds=inds1_b)
+     labels2 = isosplit6_subdivision_method(X, npca_per_subdivision=npca_per_subdivision, inds=inds2_b)
+
+     # Merge the two recursive labelings, offsetting the labels from the second half
+     K1 = int(np.max(labels1))
+     ret_labels = np.zeros(L, dtype=np.int32)
+     ret_labels[inds1] = labels1
+     ret_labels[inds2] = labels2 + K1
+     return ret_labels
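A quick way to see `isosplit6_subdivision_method` in action is to feed it well-separated synthetic blobs. This is a standalone sketch, assuming `isosplit6` is installed; the data is made up, not CATSort output:

```python
import numpy as np
from catsort.core.clustering import isosplit6_subdivision_method

# Three well-separated synthetic 2-D clusters (hypothetical data)
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0, 0), scale=0.1, size=(200, 2)),
    rng.normal(loc=(5, 0), scale=0.1, size=(200, 2)),
    rng.normal(loc=(0, 5), scale=0.1, size=(200, 2)),
]).astype(np.float32)

labels = isosplit6_subdivision_method(X, npca_per_subdivision=2)
print(np.unique(labels))  # expect roughly [1 2 3]; labels are 1-based
```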
catsort-0.1.3/src/catsort/core/collision.py ADDED
@@ -0,0 +1,151 @@
+ import numpy as np
+ from scipy.signal import resample
+ from scipy.stats import median_abs_deviation
+
+
+ def compute_fixed_thresholds(
+     features: dict,
+     mad_multipliers: dict
+ ) -> dict:
+     """
+     Compute thresholds for collision features using fixed MAD multipliers.
+     Threshold = median + MAD_multiplier * MAD
+
+     Args:
+         features: Dictionary with feature values (amplitude, width, energy)
+         mad_multipliers: Dictionary with MAD multipliers for each feature
+
+     Returns:
+         Dictionary with thresholds for each feature
+     """
+     thresholds = {}
+     for criterion, values in features.items():
+         if criterion in mad_multipliers:
+             median_val = np.median(values)
+             mad_val = median_abs_deviation(values)
+             thresholds[criterion] = median_val + mad_multipliers[criterion] * mad_val
+         else:
+             # Default to max value (no flagging) if multiplier not specified
+             thresholds[criterion] = np.max(values)
+     return thresholds
+
+ def longest_true_runs(arr: np.ndarray) -> np.ndarray:
+     """
+     For each row in a 2D boolean array, find the longest consecutive run of True values
+     and return a boolean array of the same shape with only that run set to True.
+     """
+     out = np.zeros_like(arr, dtype=bool)
+     for i, row in enumerate(arr):
+         # Find start/end indices of True runs
+         padded = np.r_[False, row, False]
+         edges = np.flatnonzero(padded[1:] != padded[:-1])
+         starts, ends = edges[::2], edges[1::2]
+         if len(starts) == 0:
+             continue
+         lengths = ends - starts
+         j = np.argmax(lengths)
+         out[i, starts[j]:ends[j]] = True
+     return out
+
+ def compute_collision_features(
+     peaks_traces: np.ndarray,
+     sampling_frequency: float,
+     width_threshold_amplitude: float = 0.5
+ ) -> dict:
+     """
+     Compute collision features (amplitude, width, energy) for each peak.
+     """
+     amplitudes = np.abs(peaks_traces).max(axis=1)
+     energies = np.sum(peaks_traces**2, axis=1)
+
+     # Upsample for sub-sample width precision (each row is one peak's 1D trace)
+     if peaks_traces.ndim == 2:
+         num_samples = peaks_traces.shape[1]
+         resample_factor = 8
+         peaks_traces_resampled = resample(peaks_traces, num=num_samples * resample_factor, axis=1)
+         # Compute widths on resampled traces
+         under_width_threshold = peaks_traces_resampled < -width_threshold_amplitude * amplitudes[:, np.newaxis]
+         longest_under = longest_true_runs(under_width_threshold)
+         widths_samples = np.sum(longest_under, axis=1) / resample_factor
+     else:
+         under_width_threshold = peaks_traces < -width_threshold_amplitude * amplitudes[:, np.newaxis]
+         widths_samples = np.sum(longest_true_runs(under_width_threshold), axis=1)
+
+     widths_ms = widths_samples * (1000 / sampling_frequency)
+
+     return {
+         "amplitude": amplitudes,
+         "width": widths_ms,
+         "energy": energies
+     }
+
+ def detect_temporal_collisions(
+     sample_indices: np.ndarray,
+     channel_indices: np.ndarray,
+     sampling_frequency: float,
+     refractory_period_ms: float = 2.0
+ ) -> np.ndarray:
+     """
+     Identify spikes that are too close to each other on the same channel.
+     """
+     prev_diffs = np.full_like(sample_indices, np.inf, dtype=float)
+     next_diffs = np.full_like(sample_indices, np.inf, dtype=float)
+
+     for ch in np.unique(channel_indices):
+         mask = channel_indices == ch
+         idx = np.where(mask)[0]
+         samples = sample_indices[mask]
+
+         # Sort by time within this channel
+         order = np.argsort(samples)
+         sorted_idx = idx[order]
+         sorted_samples = samples[order]
+
+         # Distance (in samples) to the previous/next spike on the same channel
+         if len(sorted_samples) > 1:
+             diffs = np.diff(sorted_samples)
+             prev_diffs[sorted_idx[1:]] = diffs
+             next_diffs[sorted_idx[:-1]] = diffs
+
+     closest_sample_diff = np.minimum(np.abs(prev_diffs), np.abs(next_diffs))
+     closest_ms = closest_sample_diff * 1000 / sampling_frequency
+     return closest_ms < refractory_period_ms
+
+ def optimize_collision_thresholds(
+     features: dict,
+     temporal_collisions: np.ndarray,
+     false_positive_tolerance: float = 0.05
+ ) -> dict:
+     """
+     Find optimal thresholds for collision features based on temporal collisions.
+     """
+     non_collision_count = (~temporal_collisions).sum()
+     max_fp = false_positive_tolerance * non_collision_count
+
+     optimized_thresholds = {}
+
+     for criterion, values in features.items():
+         unique_vals = np.sort(np.unique(values))
+         best_thr = unique_vals[-1]  # Default to max (nothing flagged)
+         best_tp = -1
+
+         # Scan candidate thresholds; subsample if there are too many unique values
+         if len(unique_vals) > 1000:
+             candidates = unique_vals[::len(unique_vals)//1000]
+         else:
+             candidates = unique_vals
+
+         # Keep the threshold flagging the most temporal collisions while staying
+         # within the false-positive budget
+         for thr in candidates:
+             flagged = values > thr
+             tp = np.count_nonzero(flagged & temporal_collisions)
+             fp = np.count_nonzero(flagged & ~temporal_collisions)
+
+             if fp <= max_fp and tp > best_tp:
+                 best_tp = tp
+                 best_thr = thr
+
+         optimized_thresholds[criterion] = best_thr
+
+     return optimized_thresholds
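To make the two thresholding building blocks concrete, here is a small synthetic run of `compute_fixed_thresholds` and `detect_temporal_collisions`; the numbers are invented for illustration:

```python
import numpy as np
from catsort.core.collision import compute_fixed_thresholds, detect_temporal_collisions

# Fixed scheme: threshold = median + multiplier * MAD, per feature
rng = np.random.default_rng(1)
features = {"amplitude": rng.normal(50, 5, size=1000)}
thr = compute_fixed_thresholds(features, {"amplitude": 7.0})
print(thr["amplitude"])  # ~ 50 + 7 * MAD, i.e. well above the bulk of values

# Temporal collisions: two spikes on channel 0 only 30 samples (1 ms) apart
samples = np.array([1000, 1030, 9000])
channels = np.array([0, 0, 0])
flags = detect_temporal_collisions(samples, channels, sampling_frequency=30000, refractory_period_ms=2.0)
print(flags)  # [ True  True False]
```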
catsort-0.1.3/src/catsort/core/utils.py ADDED
@@ -0,0 +1,70 @@
+ import numpy as np
+
+ def get_snippet(
+     traces: np.ndarray,
+     index: int,
+     n_before: int, n_after: int
+ ) -> np.ndarray:
+     """
+     Get a snippet of the traces around a specific index.
+     Zero-pad where the window extends past the recording edges.
+     """
+     n_channels = traces.shape[1]
+     snippet = np.zeros((n_before + n_after, n_channels))
+
+     start = index - n_before
+     end = index + n_after
+
+     # If the snippet is fully within bounds, extract it directly
+     if start >= 0 and end <= traces.shape[0]:
+         snippet = traces[start:end, :]
+
+     # If the snippet is partially out of bounds, fill with zeros where necessary
+     else:
+         valid_start = max(start, 0)
+         valid_end = min(end, traces.shape[0])
+         insert_start = valid_start - start
+         insert_end = insert_start + (valid_end - valid_start)
+         snippet[insert_start:insert_end, :] = traces[valid_start:valid_end, :]
+
+     return snippet  # shape (n_before+n_after, n_channels)
+
+ def get_peaks_traces_all_channels(
+     peaks: np.ndarray,
+     traces: np.ndarray,
+     n_before: int, n_after: int
+ ) -> np.ndarray:
+     """
+     Extract snippets of traces around detected peaks.
+
+     Output shape: (n_peaks, n_before + n_after, n_channels)
+     """
+     n_channels = traces.shape[1]
+     complete_peaks = np.zeros((len(peaks), n_before+n_after, n_channels))
+
+     for i, peak in enumerate(peaks):
+         sample_index = peak['sample_index']
+         snippet = get_snippet(traces, sample_index, n_before, n_after)
+         complete_peaks[i] = snippet
+     return complete_peaks
+
+
+ def get_peaks_traces_best_channel(
+     peaks: np.ndarray,
+     traces: np.ndarray,
+     n_before: int, n_after: int
+ ) -> np.ndarray:
+     """
+     Extract snippets of traces around detected peaks.
+     Keep only the channel with the highest amplitude.
+
+     Output shape: (n_peaks, n_before + n_after)
+     """
+     complete_peaks = np.zeros((len(peaks), n_before+n_after))
+
+     for i, peak in enumerate(peaks):
+         sample_index = peak['sample_index']
+         snippet = get_snippet(traces, sample_index, n_before, n_after)  # shape (n_before+n_after, n_channels)
+         best_channel = np.argmax(np.abs(snippet).max(axis=0))
+         complete_peaks[i] = snippet[:, best_channel]  # shape (n_before+n_after)
+     return complete_peaks
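The zero-padding behaviour of `get_snippet` is easiest to see at a recording edge. A minimal check with a toy trace array:

```python
import numpy as np
from catsort.core.utils import get_snippet

# Toy "recording": 10 samples, 2 channels, values equal to the sample index
traces = np.tile(np.arange(10, dtype=float)[:, None], (1, 2))

# A peak at index 1 with n_before=3 reaches before the start of the recording,
# so the first two rows of the snippet are zero-filled
snippet = get_snippet(traces, index=1, n_before=3, n_after=2)
print(snippet[:, 0])  # [0. 0. 0. 1. 2.]
```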
catsort-0.1.3/src/catsort/sorter.py ADDED
@@ -0,0 +1,212 @@
+ import numpy as np
+ from typing import Optional
+ from spikeinterface.core import BaseRecording, NumpySorting, SortingAnalyzer, Templates, ChannelSparsity, create_sorting_analyzer
+ from spikeinterface.sortingcomponents.peak_detection import detect_peaks
+ import spikeinterface.sortingcomponents.matching as sm
+ from sklearn.decomposition import PCA
+
+ from catsort.core.utils import get_peaks_traces_best_channel, get_peaks_traces_all_channels
+ from catsort.core.collision import (
+     compute_collision_features,
+     detect_temporal_collisions,
+     optimize_collision_thresholds,
+     compute_fixed_thresholds
+ )
+ from catsort.core.clustering import isosplit6_subdivision_method
+
+ DEFAULT_PARAMS = {
+     # Detection
+     'detect_threshold': 5,
+     'exclude_sweep_ms': 0.2,
+     'radius_um': 100,
+
+     # Collision analysis
+     'ms_before_spike_detected': 1.0,
+     'ms_after_spike_detected': 1.0,
+     'refractory_period': 2.0,
+     'scheme': 'original',  # 'original' or 'adaptive'
+
+     # Original scheme parameters
+     'mad_multiplier_amplitude': 7.0,
+     'mad_multiplier_width': 10.0,
+     'mad_multiplier_energy': 15.0,
+     # Adaptive scheme parameters
+     'false_positive_tolerance': 0.05,  # Used when scheme='adaptive'
+
+     # Clustering
+     'n_pca_components': 10,
+     'npca_per_subdivision': 10,
+
+     # Template matching
+     'tm_method': 'wobble',  # 'wobble' only for now
+
+     # Template matching (Wobble)
+     'threshold_wobble': 5000,
+     'jitter_factor_wobble': 24,
+     'refractory_period_ms_wobble': 2.0,
+ }
+
+ def get_sorting_analyzer_with_computations(
+     sorting: NumpySorting,
+     recording: BaseRecording,
+     ms_before: float, ms_after: float
+ ) -> SortingAnalyzer:
+     sorting_analyzer = create_sorting_analyzer(sorting, recording, return_in_uV=True)
+     sorting_analyzer.compute("random_spikes", method="uniform", max_spikes_per_unit=np.inf)
+     sorting_analyzer.compute("waveforms", ms_before=ms_before, ms_after=ms_after)
+     sorting_analyzer.compute("templates", operators=["average", "median", "std"])
+     return sorting_analyzer
+
+ def run_catsort(recording: BaseRecording, params: Optional[dict] = None) -> NumpySorting:
+     """
+     Main entry point for CATSort (Collision-Aware Template-matching Sort).
+
+     Args:
+         recording: spikeinterface recording object
+         params: dictionary of parameters (optional); missing keys fall back to DEFAULT_PARAMS
+
+     Returns:
+         sorting: spikeinterface sorting object
+     """
+     if params is None:
+         params = DEFAULT_PARAMS
+     else:
+         # Merge with defaults
+         full_params = DEFAULT_PARAMS.copy()
+         full_params.update(params)
+         params = full_params
+
+     print("Step 1: Detecting spikes...")
+     peaks_detected = detect_peaks(
+         recording=recording,
+         method="locally_exclusive",
+         method_kwargs={
+             "peak_sign": "neg",
+             "detect_threshold": params['detect_threshold'],
+             "exclude_sweep_ms": params['exclude_sweep_ms'],
+             "radius_um": params['radius_um'],
+         },
+     )
+
+     sampling_freq = recording.get_sampling_frequency()
+     traces = recording.get_traces()
+
+     n_before = int(sampling_freq * params['ms_before_spike_detected'] * 0.001)
+     n_after = int(sampling_freq * params['ms_after_spike_detected'] * 0.001)
+
+     print("Step 2: Collision handling...")
+     # Extract best channel traces for collision feature computation
+     traces_best = get_peaks_traces_best_channel(peaks_detected, traces, n_before, n_after)
+
+     # Temporal collisions
+     too_close = detect_temporal_collisions(
+         peaks_detected['sample_index'],
+         peaks_detected['channel_index'],
+         sampling_freq,
+         params['refractory_period']
+     )
+
+     # Compute features and thresholds based on scheme
+     collision_features = compute_collision_features(traces_best, sampling_freq)
+
+     if params['scheme'] == 'adaptive':
+         thresholds = optimize_collision_thresholds(
+             collision_features,
+             too_close,
+             params['false_positive_tolerance']
+         )
+     elif params['scheme'] == 'original':
+         mad_multipliers = {
+             'amplitude': params['mad_multiplier_amplitude'],
+             'width': params['mad_multiplier_width'],
+             'energy': params['mad_multiplier_energy']
+         }
+         thresholds = compute_fixed_thresholds(collision_features, mad_multipliers)
+     else:
+         raise ValueError(f"Unknown scheme: {params['scheme']}. Must be 'adaptive' or 'original'.")
+
+     print(f" Scheme: {params['scheme']}")
+
+     # Flag collisions: temporally too close OR any feature above its threshold
+     is_collision = too_close.copy()
+     print(f" Temporal collisions: {np.sum(too_close)}")
+     for crit, val in collision_features.items():
+         flagged_by_crit = val > thresholds[crit]
+         flagged_by_crit_not_too_close = flagged_by_crit & ~too_close
+         print(f" {crit}: {np.sum(flagged_by_crit_not_too_close)} additional collisions")
+         is_collision |= flagged_by_crit
+
+     print(f" Total flagged: {np.sum(is_collision)} collisions out of {len(peaks_detected)} spikes")
+
+     print("Step 3: Clustering non-collided spikes...")
+     mask_not_collided = ~is_collision
+     traces_all_not_collided = get_peaks_traces_all_channels(peaks_detected[mask_not_collided], traces, n_before, n_after)
+
+     # PCA and Clustering
+     num_spikes, num_samples, num_channels = traces_all_not_collided.shape
+     concatenated = traces_all_not_collided.reshape(num_spikes, -1)
+     pca = PCA(n_components=params['n_pca_components'])
+     features_not_collided = pca.fit_transform(concatenated)
+
+     labels_not_collided = isosplit6_subdivision_method(
+         features_not_collided,
+         npca_per_subdivision=params['npca_per_subdivision']
+     )
+
+     # Create sorting with clusters for template computation
+     samples_clean = peaks_detected[mask_not_collided]['sample_index']
+     labels_clean = labels_not_collided
+
+     sorting_clean = NumpySorting.from_samples_and_labels(
+         samples_list=[samples_clean],
+         labels_list=[labels_clean],
+         sampling_frequency=sampling_freq
+     )
+
+     print("Step 4: Template Matching...")
+     # Compute templates from clean clusters
+     analyzer = get_sorting_analyzer_with_computations(
+         sorting_clean, recording,
+         params['ms_before_spike_detected'],
+         params['ms_after_spike_detected']
+     )
+
+     templates_ext = analyzer.get_extension('templates')
+     sparsity = ChannelSparsity.create_dense(analyzer)
+
+     templates = Templates(
+         templates_array=templates_ext.data['average'],
+         sampling_frequency=sampling_freq,
+         nbefore=templates_ext.nbefore,
+         is_in_uV=True,
+         sparsity_mask=sparsity.mask,
+         channel_ids=analyzer.channel_ids,
+         unit_ids=analyzer.unit_ids,
+         probe=analyzer.get_probe()
+     )
+
+     if params['tm_method'] == 'wobble':
+         spikes_tm = sm.find_spikes_from_templates(
+             recording=recording,
+             templates=templates,
+             method='wobble',
+             method_kwargs={
+                 "parameters": {
+                     "threshold": params['threshold_wobble'],
+                     "jitter_factor": params['jitter_factor_wobble'],
+                     "refractory_period_frames": int(sampling_freq * params['refractory_period_ms_wobble'] * 0.001),
+                     "scale_amplitudes": True
+                 },
+             }
+         )
+     else:
+         raise ValueError(f"Unknown template matching method: {params['tm_method']}")
+
+     final_sorting = NumpySorting.from_samples_and_labels(
+         samples_list=[spikes_tm['sample_index']],
+         labels_list=[spikes_tm['cluster_index']],
+         sampling_frequency=sampling_freq
+     )
+
+     return final_sorting
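The flagging rule in Step 2 is a simple union: a spike is a collision if it is temporally too close to a neighbour, or if any feature exceeds its threshold. A standalone sketch of the same combination logic, with invented arrays:

```python
import numpy as np

# Hypothetical flags and features for four spikes
too_close = np.array([True, False, False, False])
features = {
    "amplitude": np.array([40.0, 95.0, 42.0, 41.0]),
    "width":     np.array([0.30, 0.35, 1.80, 0.33]),
}
thresholds = {"amplitude": 80.0, "width": 1.0}

# Same OR-combination as in run_catsort's Step 2
is_collision = too_close.copy()
for crit, values in features.items():
    is_collision |= values > thresholds[crit]
print(is_collision)  # [ True  True  True False]
```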
catsort-0.1.3/tests/test_sorter.py ADDED
@@ -0,0 +1,82 @@
+ from spikeinterface.extractors import toy_example
+ from spikeinterface.comparison import compare_sorter_to_ground_truth
+
+ from catsort import sorter
+
+
+ def test_tetrode():
+     recording, sorting_gt = toy_example(
+         duration=10,
+         num_channels=4,
+         num_units=5,
+         sampling_frequency=30000,
+         num_segments=1
+     )
+
+     default_parameters = sorter.DEFAULT_PARAMS
+     sorting = sorter.run_catsort(recording, params=default_parameters)
+     assert True  # smoke test: passes if run_catsort completes without raising
+
+ def test_performance_tetrode():
+     recording, sorting_gt = toy_example(
+         duration=10,
+         num_channels=4,
+         num_units=5,
+         sampling_frequency=30000,
+         num_segments=1
+     )
+
+     default_parameters = sorter.DEFAULT_PARAMS
+     sorting = sorter.run_catsort(recording, params=default_parameters)
+     comparison = compare_sorter_to_ground_truth(sorting, sorting_gt, match_score=0.01)
+     perf = comparison.get_performance()
+
+     assert perf['accuracy'].mean() > 0.5
+     assert perf['recall'].mean() > 0.5
+     assert perf['precision'].mean() > 0.5
+
+ def test_monotrode():
+     recording, sorting_gt = toy_example(
+         duration=10,
+         num_channels=1,
+         num_units=5,
+         sampling_frequency=30000,
+         num_segments=1
+     )
+
+     default_parameters = sorter.DEFAULT_PARAMS
+     sorting = sorter.run_catsort(recording, params=default_parameters)
+     assert True  # smoke test: passes if run_catsort completes without raising
+
+ def test_performance_monotrode():
+     recording, sorting_gt = toy_example(
+         duration=10,
+         num_channels=1,
+         num_units=5,
+         sampling_frequency=30000,
+         num_segments=1
+     )
+
+     default_parameters = sorter.DEFAULT_PARAMS
+     sorting = sorter.run_catsort(recording, params=default_parameters)
+     comparison = compare_sorter_to_ground_truth(sorting, sorting_gt, match_score=0.01)
+     perf = comparison.get_performance()
+
+     assert perf['accuracy'].mean() > 0.25
+     assert perf['recall'].mean() > 0.25
+     assert perf['precision'].mean() > 0.25
+
+ def test_scheme_adaptive():
+     recording, sorting_gt = toy_example(
+         duration=10,
+         num_channels=4,
+         num_units=5,
+         sampling_frequency=30000,
+         num_segments=1
+     )
+
+     adaptive_parameters = sorter.DEFAULT_PARAMS.copy()
+     adaptive_parameters['scheme'] = 'adaptive'
+     sorting = sorter.run_catsort(recording, params=adaptive_parameters)
+     assert True  # smoke test: passes if run_catsort completes without raising
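With `pytest` available (it is listed in the dev dependency group above), the suite runs directly from the repository root:

```bash
pytest tests/test_sorter.py -v
```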