reverse-pred 0.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 vital-kolab
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,108 @@
1
+ Metadata-Version: 2.4
2
+ Name: reverse_pred
3
+ Version: 0.0.1
4
+ Summary: Library to run Reverse Predictivity
5
+ Project-URL: Homepage, https://github.com/vital-kolab/reverse_pred
6
+ Project-URL: Issues, https://github.com/vital-kolab/reverse_pred/issues
7
+ Author-email: Sabine Muzellec and Kohitij Kar <sabinem@yorku.ca>
8
+ License-Expression: MIT
9
+ License-File: LICENSE
10
+ Classifier: Operating System :: OS Independent
11
+ Classifier: Programming Language :: Python :: 3
12
+ Requires-Python: >=3.9
13
+ Description-Content-Type: text/markdown
14
+
15
+ # Reverse Predictivity
16
+
17
+ A research codebase accompanying the preprint:
18
+
19
+ **Reverse Predictivity: Going Beyond One-Way Mapping to Compare Artificial Neural Network Models and Brains**, Muzellec & Kar, bioRxiv (posted August 8, 2025) ([biorxiv.org](https://www.biorxiv.org/content/10.1101/2025.08.08.669382v1))
20
+
21
+ This repository supports analyses comparing macaque inferior temporal (IT) cortex responses with artificial neural network (ANN) units—specifically using a *reverse predictivity* metric that assesses how well neural responses predict ANN activations ([biorxiv.org](https://www.biorxiv.org/content/10.1101/2025.08.08.669382v1)).
22
+
23
+ ### Compare brains and models in both directions.
24
+
25
+ This repository implements **reverse predictivity**: a complementary evaluation to forward neural predictivity that asks *how well do neural responses predict ANN activations?* It provides utilities to map macaque IT population responses to model units, quantify bidirectional alignment, and reproduce manuscript figures.
26
+
27
+ ## 🧠 What is reverse predictivity?
28
+ Traditional *forward* neural predictivity evaluates how well a model’s features linearly predict neural responses. **Reverse predictivity** inverts that lens: using neural responses to predict model units. Agreement across both directions strengthens claims that a model and a brain area **share representations**. Practically, this repo includes:
29
+
30
+ - Regression utilities to decode IT neurons / ANN unit activations from ANN units / IT population responses
31
+ - Image‑level metrics and correlation suites to compare human/ANN/neural behaviors
32
+ - End‑to‑end notebooks to reproduce figures
33
+
34
+ ## 🗂️ Repository layout
35
+
36
+ - `demo_forward_predictivity.ipynb` – quick demo of *forward* mapping model units -> neurons
37
+ - `demo_reverse_predictivity.ipynb` – quick demo of *reverse* mapping model units <- neurons
38
+ - `demo_generate_neurons_i1.ipynb` – compute image‑level neural metrics
39
+ - `demo_generate_model_i1.ipynb` – compute image‑level model metrics
40
+ - `figure[1-6].ipynb` – figure reproduction notebooks
41
+ - `model_to_monkey.py` – utilities for model -> neural regression and evaluation
42
+ - `monkey_to_model.py` – utilities for model <- neural regression and evaluation
43
+ - `correlation_metrics.py` – Spearman/Pearson, reliability‑aware correlations, confidence intervals
44
+ - `regression_metrics.py` – regression helpers
45
+ - `prediction_utils.py` – shared helpers for prediction/decoding
46
+ - `decode_utils.py` – train/test splits, cross‑validation, split‑half routines
47
+ - `figure_utils.py` – journal‑style plotting helpers
48
+ - `h5_utils.py` – helpers to read/write HDF5 feature and metadata files
49
+
50
+ 📦 *Large data files (IT features, image sets) are not stored in the repo.* They can be downloaded from: [here](https://osf.io/y3qmk/?view_only=6dfe548c7ba24238932d247e65523053)
51
+
52
+ ## 🛠️ Installation
53
+
54
+ We recommend Python ≥3.10 with a fresh environment (Conda or venv).
55
+
56
+ ```bash
57
+ # Using conda
58
+ conda create -n reverse_pred python=3.10 -y
59
+ conda activate reverse_pred
60
+
61
+ # Install core dependencies
62
+ pip install numpy scipy scikit-learn matplotlib h5py
63
+ ```
64
+
65
+ ## 📥 Data & preparation
66
+
67
+ This project assumes access to:
68
+
69
+ 1. **Macaque IT responses**: population responses for N images.
70
+ - `/neural_data` shape `(n_images, n_neurons, n_reps)`
71
+ 2. **Model features**: precomputed ANN activations for the same images
72
+ - `/model_features` shape `(n_images, n_units)`
73
+ 3. **Humans / Primates behavior**: image‑level accuracies
74
+ - `/behavior` shape `(n_images)`
75
+
76
+ ## 🚀 Quickstart
77
+ - `demo_forward_predictivity.ipynb` – step‑by‑step guide to fitting a model to neuron regression, evaluating correlations.
78
+ - `demo_reverse_predictivity.ipynb` – end‑to‑end demonstration of neuron to model regression, computing EV/correlation metrics.
79
+ - `demo_generate_neurons_i1.ipynb` – generates image‑level accuracies from neural decoders.
80
+ - `demo_generate_model_i1.ipynb` – extracts image‑level model metrics from ANN activations.
81
+
82
+ ## 🔁 Reproducing manuscript figures
83
+ Each `figureX.ipynb` notebook reproduces the corresponding figure from the preprint. Notebooks expect the data assets described above. If paths differ, change the config cell at the top of each notebook.
84
+
85
+ - **Figure 1:** Forward Predictivity
86
+ - **Figure 2:** Reverse vs forward predictivity examples
87
+ - **Figure 3:** Reverse vs forward predictivity across monkeys and models
88
+ - **Figure 4:** Influencing factors
89
+ - **Figure 5:** Analysis of unique units
90
+ - **Figure 6:** Link with behavior
91
+
92
+ ## 📌 Status & citation
93
+ This codebase accompanies the preprint:
94
+
95
+ **Muzellec, S. & Kar, K. (2025). _Reverse Predictivity: Going Beyond One‑Way Mapping to Compare Artificial Neural Network Models and Brains_. bioRxiv.**
96
+
97
+ If you use this repository or ideas from it, please cite the preprint and link to this repo.
98
+
99
+ ```
100
+ @article{muzellec_kar_2025_reversepredictivity,
101
+ title = {Reverse Predictivity: Going Beyond One-Way Mapping to Compare Artificial Neural Network Models and Brains},
102
+ author = {Muzellec, Sabine and Kar, Kohitij},
103
+ year = {2025},
104
+ journal= {bioRxiv}
105
+ }
106
+ ```
107
+
108
+ License: **MIT** (see `LICENSE`).
@@ -0,0 +1,94 @@
1
+ # Reverse Predictivity
2
+
3
+ A research codebase accompanying the preprint:
4
+
5
+ **Reverse Predictivity: Going Beyond One-Way Mapping to Compare Artificial Neural Network Models and Brains**, Muzellec & Kar, bioRxiv (posted August 8, 2025) ([biorxiv.org](https://www.biorxiv.org/content/10.1101/2025.08.08.669382v1))
6
+
7
+ This repository supports analyses comparing macaque inferior temporal (IT) cortex responses with artificial neural network (ANN) units—specifically using a *reverse predictivity* metric that assesses how well neural responses predict ANN activations ([biorxiv.org](https://www.biorxiv.org/content/10.1101/2025.08.08.669382v1)).
8
+
9
+ ### Compare brains and models in both directions.
10
+
11
+ This repository implements **reverse predictivity**: a complementary evaluation to forward neural predictivity that asks *how well do neural responses predict ANN activations?* It provides utilities to map macaque IT population responses to model units, quantify bidirectional alignment, and reproduce manuscript figures.
12
+
13
+ ## 🧠 What is reverse predictivity?
14
+ Traditional *forward* neural predictivity evaluates how well a model’s features linearly predict neural responses. **Reverse predictivity** inverts that lens: using neural responses to predict model units. Agreement across both directions strengthens claims that a model and a brain area **share representations**. Practically, this repo includes:
15
+
16
+ - Regression utilities to decode IT neurons / ANN unit activations from ANN units / IT population responses
17
+ - Image‑level metrics and correlation suites to compare human/ANN/neural behaviors
18
+ - End‑to‑end notebooks to reproduce figures
19
+
20
+ ## 🗂️ Repository layout
21
+
22
+ - `demo_forward_predictivity.ipynb` – quick demo of *forward* mapping model units -> neurons
23
+ - `demo_reverse_predictivity.ipynb` – quick demo of *reverse* mapping model units <- neurons
24
+ - `demo_generate_neurons_i1.ipynb` – compute image‑level neural metrics
25
+ - `demo_generate_model_i1.ipynb` – compute image‑level model metrics
26
+ - `figure[1-6].ipynb` – figure reproduction notebooks
27
+ - `model_to_monkey.py` – utilities for model -> neural regression and evaluation
28
+ - `monkey_to_model.py` – utilities for model <- neural regression and evaluation
29
+ - `correlation_metrics.py` – Spearman/Pearson, reliability‑aware correlations, confidence intervals
30
+ - `regression_metrics.py` – regression helpers
31
+ - `prediction_utils.py` – shared helpers for prediction/decoding
32
+ - `decode_utils.py` – train/test splits, cross‑validation, split‑half routines
33
+ - `figure_utils.py` – journal‑style plotting helpers
34
+ - `h5_utils.py` – helpers to read/write HDF5 feature and metadata files
35
+
36
+ 📦 *Large data files (IT features, image sets) are not stored in the repo.* They can be downloaded from: [here](https://osf.io/y3qmk/?view_only=6dfe548c7ba24238932d247e65523053)
37
+
38
+ ## 🛠️ Installation
39
+
40
+ We recommend Python ≥3.10 with a fresh environment (Conda or venv).
41
+
42
+ ```bash
43
+ # Using conda
44
+ conda create -n reverse_pred python=3.10 -y
45
+ conda activate reverse_pred
46
+
47
+ # Install core dependencies
48
+ pip install numpy scipy scikit-learn matplotlib h5py
49
+ ```
50
+
51
+ ## 📥 Data & preparation
52
+
53
+ This project assumes access to:
54
+
55
+ 1. **Macaque IT responses**: population responses for N images.
56
+ - `/neural_data` shape `(n_images, n_neurons, n_reps)`
57
+ 2. **Model features**: precomputed ANN activations for the same images
58
+ - `/model_features` shape `(n_images, n_units)`
59
+ 3. **Humans / Primates behavior**: image‑level accuracies
60
+ - `/behavior` shape `(n_images)`
61
+
62
+ ## 🚀 Quickstart
63
+ - `demo_forward_predictivity.ipynb` – step‑by‑step guide to fitting a model to neuron regression, evaluating correlations.
64
+ - `demo_reverse_predictivity.ipynb` – end‑to‑end demonstration of neuron to model regression, computing EV/correlation metrics.
65
+ - `demo_generate_neurons_i1.ipynb` – generates image‑level accuracies from neural decoders.
66
+ - `demo_generate_model_i1.ipynb` – extracts image‑level model metrics from ANN activations.
67
+
68
+ ## 🔁 Reproducing manuscript figures
69
+ Each `figureX.ipynb` notebook reproduces the corresponding figure from the preprint. Notebooks expect the data assets described above. If paths differ, change the config cell at the top of each notebook.
70
+
71
+ - **Figure 1:** Forward Predictivity
72
+ - **Figure 2:** Reverse vs forward predictivity examples
73
+ - **Figure 3:** Reverse vs forward predictivity across monkeys and models
74
+ - **Figure 4:** Influencing factors
75
+ - **Figure 5:** Analysis of unique units
76
+ - **Figure 6:** Link with behavior
77
+
78
+ ## 📌 Status & citation
79
+ This codebase accompanies the preprint:
80
+
81
+ **Muzellec, S. & Kar, K. (2025). _Reverse Predictivity: Going Beyond One‑Way Mapping to Compare Artificial Neural Network Models and Brains_. bioRxiv.**
82
+
83
+ If you use this repository or ideas from it, please cite the preprint and link to this repo.
84
+
85
+ ```
86
+ @article{muzellec_kar_2025_reversepredictivity,
87
+ title = {Reverse Predictivity: Going Beyond One-Way Mapping to Compare Artificial Neural Network Models and Brains},
88
+ author = {Muzellec, Sabine and Kar, Kohitij},
89
+ year = {2025},
90
+ journal= {bioRxiv}
91
+ }
92
+ ```
93
+
94
+ License: **MIT** (see `LICENSE`).
@@ -0,0 +1,23 @@
1
+ [build-system]
2
+ requires = ["hatchling >= 1.26"]
3
+ build-backend = "hatchling.build"
4
+
5
+ [project]
6
+ name = "reverse_pred"
7
+ version = "0.0.1"
8
+ authors = [
9
+ { name="Sabine Muzellec and Kohitij Kar", email="sabinem@yorku.ca" },
10
+ ]
11
+ description = "Library to run Reverse Predictivity"
12
+ readme = "README.md"
13
+ requires-python = ">=3.9"
14
+ classifiers = [
15
+ "Programming Language :: Python :: 3",
16
+ "Operating System :: OS Independent",
17
+ ]
18
+ license = "MIT"
19
+ license-files = ["LICEN[CS]E*"]
20
+
21
+ [project.urls]
22
+ Homepage = "https://github.com/vital-kolab/reverse_pred"
23
+ Issues = "https://github.com/vital-kolab/reverse_pred/issues"
File without changes
@@ -0,0 +1,115 @@
1
+ import numpy as np
2
+ from scipy import stats
3
+ import random
4
+
5
def get_split_half_correlation(averaged_data):
    """Per-site split-half reliability of trial-averaged responses.

    Parameters
    ----------
    averaged_data : ndarray
        Array indexed as averaged_data[:, site]; each site slice is passed
        to get_splithalf_corr with ax=1 (split over its second axis).

    Returns
    -------
    tuple of (list, list)
        (explained variance per site, Spearman-Brown-corrected split-half
        correlation per site); EV is simply the squared corrected r.
    """
    corrected_per_site = []
    for site_idx in range(averaged_data.shape[1]):
        site_rates = averaged_data[:, site_idx]
        raw = get_splithalf_corr(site_rates, ax=1)
        corrected_per_site.append(spearmanbrown_correction(raw['split_half_corr']))
    explained_variance = [r ** 2 for r in corrected_per_site]
    return explained_variance, corrected_per_site
15
+
16
def get_splithalf_corr(var, ax=1, type='spearman'):
    """Correlate the means of two random halves of `var`.

    Parameters
    ----------
    var : ndarray
        Data to split (forwarded to get_splithalves).
    ax : int, optional
        Axis along which the random split is taken (default 1).
    type : str, optional
        'spearman' (default) for stats.spearmanr, anything else for
        stats.pearsonr.

    Returns
    -------
    dict
        {'split_half_corr': r, 'p-value': p, 'type': type}.
    """
    _, _, half_mean_a, half_mean_b = get_splithalves(var, ax=ax)
    if type == 'spearman':
        result = stats.spearmanr(half_mean_a, half_mean_b)
    else:
        result = stats.pearsonr(half_mean_a, half_mean_b)
    return {'split_half_corr': result[0],
            'p-value': result[1],
            'type': type}
31
+
32
def get_splithalves(var, ax=1, rng=None):
    """Randomly split `var` in two along `ax`; return halves and their means.

    Parameters
    ----------
    var : ndarray
        Input array to split.
    ax : int, optional
        Axis along which to split (default 1).
    rng : np.random.Generator, optional
        Random generator for reproducibility; np.random.default_rng() when
        omitted.

    Returns
    -------
    split1, split2 : ndarray
        The two halves, axes restored to the input layout.
    split_mean1, split_mean2 : ndarray
        nan-means of each half taken over the split axis.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Bring the split axis to the front so a single axis-0 shuffle permutes it.
    front = np.swapaxes(var, 0, ax)
    permuted = front.copy()
    rng.shuffle(permuted, axis=0)

    half_a, half_b = np.array_split(permuted, 2, axis=0)
    mean_a = np.nanmean(half_a, axis=0)
    mean_b = np.nanmean(half_b, axis=0)

    # The means have one fewer dimension, so their restore axis shifts by one.
    mean_axis = ax - 1 if ax > 0 else 0
    return (np.swapaxes(half_a, 0, ax),
            np.swapaxes(half_b, 0, ax),
            np.swapaxes(mean_a, 0, mean_axis),
            np.swapaxes(mean_b, 0, mean_axis))
72
+
73
def spearmanbrown_correction(var):
    """Spearman-Brown prophecy correction for a split-half correlation.

    Predicts the full-length reliability from a half-length correlation:
    r_full = 2 * r_half / (1 + r_half). Works elementwise on arrays.
    """
    doubled = 2 * var
    return doubled / (1 + var)
76
+
77
+
78
def get_correlation_noise_corrected(var1, var2, nrbs=50, correction_method='spearmanBrown'):
    """
    Noise-corrected correlation between two trial-resolved variables.

    Each bootstrap iteration divides the raw Pearson correlation of the
    trial means by the geometric mean of the two variables' split-half
    correlations (the denominator acts as a reliability ceiling).

    Parameters
    ----------
    var1 : ndarray
        Variable 1 (2d array); 2nd dimension has to be trials (repetitions).
    var2 : ndarray
        Variable 2 (2d array); 2nd dimension has to be trials (repetitions).
    nrbs : int, optional
        Number of bootstrap repeats. The default is 50.
    correction_method : str, optional
        Split correction applied. The default is 'spearmanBrown'.

    Returns
    -------
    corrected_corr : ndarray
        (nrbs, 1) array of corrected pearson correlation values.

    Notes
    -----
    NOTE(review): the split-half denominators use get_splithalf_corr's
    default Spearman correlation while the numerator is Pearson — confirm
    this mix is intended.
    NOTE(review): the non-spearmanBrown branch uses the global `random`
    module without a seed, so results are not reproducible run-to-run.
    """
    corrected_corr = np.empty([nrbs, 1], dtype=float)
    for i in range(nrbs):
        # Fresh random split per bootstrap iteration for both variables.
        sh_corr_var1 = get_splithalf_corr(var1)
        sh_corr_var2 = get_splithalf_corr(var2)
        # Reliability ceiling; NaN if the product is negative (sqrt of < 0).
        den = np.sqrt(sh_corr_var1['split_half_corr'] * sh_corr_var2['split_half_corr'])
        if (correction_method == 'spearmanBrown'):
            num = stats.pearsonr(np.nanmean(var1, axis=1), np.nanmean(var2, axis=1))
        else:
            # Alternative: correlate means of random half-samples of trials.
            var1_split = var1[:, random.sample(list(np.arange(0, np.size(var1, axis=1), 1)),
                int(np.round(np.size(var1, axis=1) / 2)))]
            var2_split = var2[:, random.sample(list(np.arange(0, np.size(var2, axis=1), 1)),
                int(np.round(np.size(var2, axis=1) / 2)))]
            num = stats.pearsonr(np.nanmean(var1_split, axis=1), np.nanmean(var2_split, axis=1))
        corrected_corr[i] = num[0] / den
    return corrected_corr
107
+
108
+
109
def main():
    """Entry point placeholder; this module exposes only library functions.

    The original source had `def main():` with no body immediately followed
    by the __main__ guard, which is a SyntaxError — a valid no-op body is
    required.
    """
    pass


if __name__ == "__main__":
    main()
112
+
113
+
114
+
115
+
@@ -0,0 +1,106 @@
1
+ #!/usr/bin/env python3
2
+ # -*- coding: utf-8 -*-
3
+ """
4
+ Created on Sat Apr 4 01:17:31 2020
5
+
6
+ @author: kohitij
7
+ """
8
+
9
+ import numpy as np
10
+ #import h5py
11
+ #from sklearn.multiclass import OneVsRestClassifier
12
+ from sklearn.linear_model import LogisticRegression
13
+ from sklearn.multiclass import OneVsRestClassifier
14
+ from sklearn.svm import SVC
15
+ from scipy.stats import zscore, norm
16
+ from sklearn import preprocessing
17
+
18
def decode(features,labels,nrfolds=2,seed=0):
    """Cross-validated multi-class decoding from feature columns.

    Parameters
    ----------
    features : ndarray
        2-D array indexed as features[:, image] — rows are feature
        dimensions, columns are images (nrImages = features.shape[1] and
        the classifier is fit on XTrain.T).
    labels : ndarray
        Per-image class label, length features.shape[1].
    nrfolds : int, optional
        Number of cross-validation folds (default 2).
    seed : int, optional
        Seed forwarded to get_train_test_indices (default 0).

    Returns
    -------
    ndarray
        (nrImages, n_classes) class-probability matrix; each row is filled
        in the fold where that image was in the test split.
    """

    classes=np.unique(labels)
    nrImages = features.shape[1]
    # NOTE(review): `ind` is computed but never used below.
    _,ind = np.unique(classes, return_inverse=True)
    # Standardize along axis 0 (across feature dimensions within each column).
    features = zscore(features,axis=0)
    num_classes = len(classes)
    prob = np.zeros((nrImages,len(classes)))
    prob[:]=np.nan

    for i in range(nrfolds):
        train, test = get_train_test_indices(nrImages,nrfolds=nrfolds, foldnumber=i, seed=seed)
        XTrain = features[:,train]
        XTest = features[:,test]
        YTrain = labels[train]

        # C=5*10e4 == 5e5, i.e. very weak L2 regularization; one-vs-rest scheme.
        clf = LogisticRegression(penalty='l2',C=5*10e4,multi_class='ovr', max_iter=1000, class_weight='balanced').fit(XTrain.T, YTrain)
        pred=clf.predict_proba(XTest.T)
        prob[test,0:num_classes]=pred
    return prob
38
+
39
+
40
def get_percent_correct_from_proba(prob, labels,class_order, eps=1e-3):
    """Pairwise (target vs. distractor) percent correct for every image.

    For image i with true class t, pc[i, d] = p_t / (p_d + p_t): the
    probability mass on the target within each target/distractor pair.
    The target's own column is set to NaN.

    Parameters
    ----------
    prob : ndarray
        (nrImages, n_classes) class-probability matrix (e.g. from decode()).
    labels : ndarray
        True class label per image.
    class_order : array-like
        NOTE(review): immediately overwritten by np.unique(labels) below,
        so the passed value is ignored — confirm whether callers rely on a
        custom column ordering.
    eps : float, optional
        NOTE(review): unused; the commented-out "+eps" suggests it once
        guarded the denominator against zero.

    Returns
    -------
    ndarray
        (nrImages, n_classes) pairwise percent correct, NaN in each image's
        target column.
    """
    nrImages = prob.shape[0]
    class_order=np.unique(labels)
    pc = np.zeros((nrImages,len(class_order)))
    pc[:]=np.nan
    # NOTE(review): `ind` is computed but never used.
    _,ind = np.unique(labels, return_inverse=True)
    for i in range(nrImages):
        # Boolean mask selecting the target-class column for this image.
        loc_target = labels[i]==class_order
        pc[i,:] = np.divide(prob[i,labels[i]==class_order],prob[i,:]+prob[i,loc_target]) #+eps
        pc[i,loc_target]=np.nan
    return pc
51
+
52
def get_fa(pc, labels):
    """False-alarm rates from a pairwise percent-correct matrix.

    Parameters
    ----------
    pc : ndarray
        (n_images, n_classes) percent-correct matrix (NaN at target columns).
    labels : ndarray
        True class label per image; maps each image to its class's mean FA.

    Returns
    -------
    tuple
        (fa, full_fa): per-image false-alarm rate (its class's column mean
        of 1 - pc) and the full (n_images, n_classes) 1 - pc matrix.
    """
    full_fa = 1 - pc
    per_class_fa = np.nanmean(full_fa, axis=0)
    _, class_index = np.unique(labels, return_inverse=True)
    return per_class_fa[class_index], full_fa
58
+
59
def get_dprime(pc, fa):
    """Signal-detection d' from hit and false-alarm rates.

    Parameters
    ----------
    pc : ndarray
        Hit rates in [0, 1].
    fa : ndarray
        False-alarm rates in [0, 1].

    Returns
    -------
    ndarray
        d' = z(hit) - z(fa), with infinities from rates of exactly 0/1
        bounded and the final value clipped to [-5, 5].
    """
    z_hit = norm.ppf(pc)
    z_fa = norm.ppf(fa)
    # Control for infinite z-values: a hit rate of exactly 1 (or FA of 0)
    # maps to +/-inf; substitute the same +/-5 bound used for d' itself.
    z_hit = np.where(np.isposinf(z_hit), 5, z_hit)
    z_fa = np.where(np.isneginf(z_fa), -5, z_fa)
    return np.clip(z_hit - z_fa, -5, 5)
71
+
72
+
73
def get_train_test_indices(totalIndices, nrfolds=10, foldnumber=0, seed=1):
    """Deterministic shuffled train/test split for one cross-validation fold.

    Parameters
    ----------
    totalIndices : int
        Number of items; indices 0..totalIndices-1 are partitioned.
    nrfolds : int, optional
        Number of folds (default 10).
    foldnumber : int, optional
        Which fold serves as the test set (default 0).
    seed : int, optional
        Seed for the legacy global NumPy RNG, kept for reproducibility
        with existing results (default 1).

    Returns
    -------
    train_indices, test_indices : ndarray
        Disjoint index arrays covering range(totalIndices); test_indices
        holds the members of fold `foldnumber`.
    """
    np.random.seed(seed)
    order = np.arange(totalIndices)
    np.random.shuffle(order)
    folds = np.array_split(order, nrfolds)
    in_test = np.isin(order, folds[foldnumber])
    test_indices = order[in_test]
    train_indices = order[np.logical_not(in_test)]
    return train_indices, test_indices
104
+
105
+
106
+
@@ -0,0 +1,72 @@
1
+ # function to look at .h5 file contents
2
+
3
+ import h5py
4
+
5
def h5disp(filename):
    """
    Display the structure and contents of an HDF5 file.

    Recursively prints every group and dataset (name, shape, dtype), and
    inlines the values of small datasets (< 20 elements).

    Parameters:
        filename (str): Path to the HDF5 file.
    """
    def display_item(name, obj, indent=0):
        """Recursively display information about HDF5 groups and datasets."""
        spacing = ' ' * indent
        if isinstance(obj, h5py.Group):
            print(f"{spacing}Group: {name}")
            for key, item in obj.items():
                display_item(key, item, indent + 1)
        elif isinstance(obj, h5py.Dataset):
            print(f"{spacing}Dataset: {name}")
            print(f"{spacing}  Shape: {obj.shape}")
            print(f"{spacing}  Data type: {obj.dtype}")
            if obj.size < 20:  # Display small datasets inline
                print(f"{spacing}  Data: {obj[()]}")

    try:
        with h5py.File(filename, 'r') as h5file:
            # Fixed: the header previously printed the literal "(unknown)"
            # instead of interpolating the actual file path.
            print(f"HDF5 file: {filename}")
            for name, item in h5file.items():
                display_item(name, item)
    except Exception as e:
        # Best-effort display helper: report the failure rather than raise.
        print(f"Error reading HDF5 file: {e}")

# Example usage:
# h5disp('example.h5')
33
+
34
+ # Example usage:
35
+ # h5disp('example.h5')
36
+
37
+
38
def h5read(filename, dataset_path):
    """
    Read data from a specified dataset in an HDF5 file.

    Parameters:
        filename (str): Path to the HDF5 file.
        dataset_path (str): Path to the dataset within the HDF5 file.

    Returns:
        numpy.ndarray: The data from the specified dataset, or None if the
        file/dataset could not be read (the error is printed, not raised).
    """
    try:
        with h5py.File(filename, 'r') as h5file:
            if dataset_path in h5file:
                data = h5file[dataset_path][()]
                return data
            else:
                # Fixed: both messages previously contained the literal
                # "(unknown)" instead of interpolating the file path.
                raise KeyError(f"Dataset '{dataset_path}' not found in file '{filename}'.")
    except Exception as e:
        print(f"Error reading dataset '{dataset_path}' from file '{filename}': {e}")
        return None
59
+
60
+
61
+ # # Import the function
62
+ # from h5_utils import h5read
63
+
64
+ # # Path to the HDF5 file
65
+ # filename = 'example.h5'
66
+
67
+ # # Path to the dataset within the file
68
+ # dataset_path = '/group1/dataset1'
69
+
70
+ # # Read and print the dataset
71
+ # data = h5read(filename, dataset_path)
72
+ # print(f"Dataset data: {data}")
@@ -0,0 +1,33 @@
1
+ import os
2
+ import sys
3
+ import numpy as np
4
+ import h5py
5
+ from h5_utils import h5read
6
+ import prediction_utils as pu
7
+
8
def load_model_features(model_name, n_images, data_dir):
    """Load precomputed ANN activations for `model_name`.

    Reads <data_dir>/<model_name>_features.npy and flattens everything but
    the image axis, returning an (n_images, n_units) array.
    """
    feature_path = os.path.join(data_dir, f"{model_name}_features.npy")
    return np.load(feature_path).reshape(n_images, -1)
11
+
12
def main(model1, model2, out_dir, n_images, data_dir, reps=10):
    """Model-to-model forward predictivity.

    Regresses `model1` features onto `model2` features, computes explained
    variance via prediction_utils, and caches the result at
    <out_dir>/forward_<model2>_ev.npy. Skips all work if that file exists.
    """
    os.makedirs(out_dir, exist_ok=True)
    ev_path = os.path.join(out_dir, f'forward_{model2}_ev.npy')
    if os.path.exists(ev_path):
        return  # result already cached on disk

    predictor_feats = load_model_features(model1, n_images, data_dir)
    predicted_feats = load_model_features(model2, n_images, data_dir)

    # Predict model2's units from model1's units.
    prediction = pu.get_all_preds(predicted_feats, predictor_feats, ncomp=20)
    # Explained variance of those predictions.
    ev = pu.get_all_stats(prediction, predicted_feats, predictor_feats, ncomp=20)
    print(np.nanmean(ev))
    np.save(ev_path, ev)
25
+
26
if __name__ == "__main__":
    # CLI: python <script> <predictor_model> <predicted_model>
    model1 = sys.argv[1]
    model2 = sys.argv[2]
    out_dir = f'/scratch/smuzelle/results_predictions/model2model/{model1}'
    # Fixed: was an f-string with no placeholders.
    data_dir = '/scratch/smuzelle/model_features/'
    n_images = 1320
    main(model1, model2, out_dir, n_images, data_dir)
@@ -0,0 +1,57 @@
1
+ import os
2
+ import sys
3
+ import numpy as np
4
+ import h5py
5
+ from h5_utils import h5read
6
+ import prediction_utils as pu
7
+
8
def load_selected_rates(monkey, data_dir):
    """Load one monkey's active-session rates plus its selected-site subset.

    Parameters
    ----------
    monkey : str
        "m1" or "m2"; anything else raises ValueError.
    data_dir : str
        Root directory containing neural_data/.

    Returns
    -------
    tuple
        (pu.average_data(full rates), selected-site rates array).
    """
    if monkey not in ("m1", "m2"):
        raise ValueError("Monkey not found")
    data = h5read(os.path.join(data_dir, f'neural_data/rates_{monkey}_active.h5'),
                  f'/{monkey}/active')
    selected = np.load(os.path.join(data_dir, f'neural_data/selected_rates_{monkey}.npy'))
    return pu.average_data(data), selected
18
+
19
+
20
def load_features(model_name, n_images, data_dir):
    """Load saved model activations, reshaped to (n_images, n_units).

    Reads <data_dir>/model_features/<model_name>_features.npy and flattens
    all non-image axes.
    """
    feature_file = os.path.join(data_dir, f'model_features/{model_name}_features.npy')
    return np.load(feature_file).reshape(n_images, -1)
23
+
24
def load_model_features(model, n_images, data_dir):
    """Thin alias over load_features, kept for interface symmetry with the
    companion scripts."""
    return load_features(model, n_images, data_dir)
27
+
28
def main(model, monkey, out_dir, n_images, data_dir, reps=20):
    """Forward predictivity: predict selected IT site responses from model
    features and save the across-repeat mean explained variance to
    <out_dir>/forward_<monkey>_ev.npy.

    Parameters
    ----------
    model : str
        Model name; selects <data_dir>/model_features/<model>_features.npy.
    monkey : str
        "m1" or "m2"; selects which neural recordings to load.
    out_dir : str
        Output directory (created if missing).
    n_images : int
        Number of images; used to reshape the feature array.
    data_dir : str
        Root directory containing neural_data/ and model_features/.
    reps : int, optional
        Number of repeats averaged over (default 20).
        NOTE(review): nothing in the loop body changes across iterations
        here, so repeats only differ if pu.get_all_preds / pu.get_all_stats
        are internally stochastic — confirm.
    """
    os.makedirs(out_dir, exist_ok=True)

    # Load model features and data
    # NOTE(review): `rates` (the full averaged data) is unused in this
    # forward direction; only the pre-selected sites are predicted.
    rates, selected_rates = load_selected_rates(monkey, data_dir)
    model_features = load_model_features(model, n_images, data_dir)
    # Average over axis 2 — presumably the repetition axis; TODO confirm.
    responses = np.nanmean(selected_rates, axis=2)
    print(responses.shape)

    ev_path = os.path.join(out_dir, f'forward_{monkey}_ev.npy')

    all_evs = []
    for r in range(reps):
        # Compute predictions from model
        prediction = pu.get_all_preds(responses, model_features, ncomp=20)
        # Compute EV
        ev = pu.get_all_stats(prediction, selected_rates, model_features, ncomp=20)
        all_evs.append(ev)

    all_evs = np.array(all_evs)
    np.save(ev_path, np.nanmean(all_evs, axis=0))
49
+
50
if __name__ == "__main__":
    # CLI: python <script> <model_name> <monkey_id>
    model = sys.argv[1]
    monkey = sys.argv[2]
    out_dir = f'./results_for_figures/model2monkey/{model}'
    # Fixed: was an f-string with no placeholders.
    data_dir = './'
    n_images = 1320
    main(model, monkey, out_dir, n_images, data_dir)
@@ -0,0 +1,57 @@
1
+ import os
2
+ import sys
3
+ import numpy as np
4
+ from h5_utils import h5read
5
+ import prediction_utils as pu
6
+ import h5py
7
+
8
def load_selected_rates(monkey, data_dir):
    """Load a monkey's active-session rates and its selected-site subset.

    Parameters
    ----------
    monkey : str
        Recording identifier, "m1" or "m2".
    data_dir : str
        Root directory containing neural_data/.

    Returns
    -------
    tuple
        (pu.average_data(full rates), selected-site rates array).

    Raises
    ------
    ValueError
        If `monkey` is neither "m1" nor "m2".
    """
    if monkey not in {"m1", "m2"}:
        raise ValueError("Monkey not found")
    rates_file = os.path.join(data_dir, f'neural_data/rates_{monkey}_active.h5')
    selected_file = os.path.join(data_dir, f'neural_data/selected_rates_{monkey}.npy')
    full_rates = h5read(rates_file, f'/{monkey}/active')
    return pu.average_data(full_rates), np.load(selected_file)
18
+
19
+
20
def load_features(model_name, n_images, data_dir):
    """Load a model's saved activations as an (n_images, n_units) array."""
    path = os.path.join(data_dir, f'model_features/{model_name}_features.npy')
    raw = np.load(path)
    return raw.reshape(n_images, -1)
23
+
24
def load_model_features(model, n_images, data_dir):
    """Alias of load_features, retained so this script mirrors the forward
    (model_to_monkey) script's interface."""
    return load_features(model, n_images, data_dir)
27
+
28
def main(model, monkey, out_dir, n_images, data_dir, reps=20):
    """Reverse predictivity: predict model units from neural responses and
    save the across-repeat mean explained variance to
    <out_dir>/reverse_<monkey>_ev.npy.

    Parameters
    ----------
    model : str
        Model name; selects <data_dir>/model_features/<model>_features.npy.
    monkey : str
        "m1" or "m2"; selects which neural recordings to load.
    out_dir : str
        Output directory (created if missing).
    n_images : int
        Number of images; used to reshape the feature array.
    data_dir : str
        Root directory containing neural_data/ and model_features/.
    reps : int, optional
        Number of repeats averaged over (default 20).
        NOTE(review): the loop body is identical across iterations, so
        repeats only differ if pu.get_all_preds / pu.get_all_stats are
        internally stochastic — confirm.
    """
    os.makedirs(out_dir, exist_ok=True)

    # Load model features and data
    # Unlike the forward script, the full averaged rates are the predictor
    # here and the selected-site subset is discarded.
    rates, _ = load_selected_rates(monkey, data_dir)
    model_features = load_model_features(model, n_images, data_dir)
    # Average over axis 2 — presumably the repetition axis; TODO confirm.
    responses = np.nanmean(rates, axis=2)
    print(responses.shape)

    ev_path = os.path.join(out_dir, f'reverse_{monkey}_ev.npy')

    all_evs = []
    for r in range(reps):
        # Compute predictions from model
        prediction = pu.get_all_preds(model_features, responses, ncomp=20)
        # Compute EV
        ev = pu.get_all_stats(prediction, model_features, rates, ncomp=20)
        all_evs.append(ev)

    all_evs = np.array(all_evs)
    np.save(ev_path, np.nanmean(all_evs, axis=0))
49
+
50
if __name__ == "__main__":
    # CLI: python <script> <model_name> <monkey_id>
    model = sys.argv[1]
    monkey = sys.argv[2]
    out_dir = f'./results_for_figures/monkey2model/{model}'
    # Fixed: was an f-string with no placeholders.
    data_dir = './'
    n_images = 1320
    main(model, monkey, out_dir, n_images, data_dir)
@@ -0,0 +1,44 @@
1
+ import os
2
+ import sys
3
+ import numpy as np
4
+ from h5_utils import h5read
5
+ import prediction_utils as pu
6
+ import h5py
7
+
8
def main(start, end, out_dir, data_dir, reps=10, max_n=None):
    """Monkey-to-monkey predictivity from precomputed ./temp/ rate arrays.

    Predicts one population's responses from the other's, `reps` times, and
    saves the across-repeat mean explained variance.

    Parameters
    ----------
    start, end :
        NOTE(review): accepted but never used in this body.
    out_dir : str
        Output directory (created if missing).
    data_dir : str
        NOTE(review): accepted but never used; inputs are hard-coded to
        ./temp/predictor.npy and ./temp/predicted.npy.
    reps : int, optional
        Number of repeats averaged over (default 10).
    max_n : int, optional
        If set, subsample each population to at most max_n sites per repeat.

    Notes
    -----
    NOTE(review): ev_path interpolates `monkey1` and `monkey2`, which are
    module-level globals set only under the __main__ guard — calling this
    function from an import raises NameError.
    NOTE(review): the max_n subsampling reassigns the arrays inside the rep
    loop, so after the first repeat the already-subsampled arrays are reused
    (no further shrinking, but also no fresh resample from the full set).
    """
    os.makedirs(out_dir, exist_ok=True)

    rates_predictor = np.load("./temp/predictor.npy")
    rates_predicted = np.load("./temp/predicted.npy")
    # Average over axis 2 — presumably the repetition axis; TODO confirm.
    responses_predicted = np.nanmean(rates_predicted, axis=2)
    responses_predictor = np.nanmean(rates_predictor, axis=2)

    ev_path = os.path.join(out_dir, f'{monkey1}_to_{monkey2}_ev.npy')

    all_evs = []
    for r in range(reps):
        if max_n is not None and responses_predicted.shape[1] > max_n:
            indices = np.random.choice(responses_predicted.shape[1], max_n, replace=False)
            responses_predicted = responses_predicted[:, indices]
            rates_predicted = rates_predicted[:, indices]
        if max_n is not None and responses_predictor.shape[1] > max_n:
            indices = np.random.choice(responses_predictor.shape[1], max_n, replace=False)
            responses_predictor = responses_predictor[:, indices]
            rates_predictor = rates_predictor[:, indices]
        print(responses_predicted.shape, responses_predictor.shape)
        # Compute predictions from model
        prediction = pu.get_all_preds(responses_predicted, responses_predictor, ncomp=20)
        # Compute EV
        ev = pu.get_all_stats(prediction, rates_predicted, rates_predictor, ncomp=20) #, rhoxx, rhoyy
        all_evs.append(ev)

    all_evs = np.array(all_evs)
    np.save(ev_path, np.nanmean(all_evs, axis=0))
37
+
38
if __name__ == "__main__":
    # CLI: python <script> <predictor_monkey> <predicted_monkey>
    monkey1 = sys.argv[1]
    monkey2 = sys.argv[2]
    out_dir = '/scratch/smuzelle/results_predictions/monkey2model'
    data_dir = './'
    # Fixed: the original called main(start, end, ...) with names that were
    # never defined, raising NameError. main() does not use its first two
    # positional parameters, so the monkey ids are passed through instead.
    main(monkey1, monkey2, out_dir, data_dir)
@@ -0,0 +1,165 @@
1
+ from scipy import stats
2
+ from regression_metrics import get_train_test_indices, ridge_regress
3
+ import numpy as np
4
+ from correlation_metrics import get_splithalves, spearmanbrown_correction
5
+
6
def average_data(f):
    """Average firing rates over the 70-170 ms window (inclusive).

    Parameters
    ----------
    f : ndarray
        Rates with time as the first axis, sampled every 10 ms from 0 to
        250 ms (26 bins), i.e. shape (26, ...).

    Returns
    -------
    ndarray
        `f` nan-averaged over the time bins covering 70-170 ms.
    """
    # 10-ms bins from 0 to 250 ms (equivalent to MATLAB's 0:10:250)
    time_bins = np.arange(0, 260, 10)

    # Window boundaries: first bin at 70 ms, last bin at 170 ms (inclusive)
    lo = int(np.where(time_bins == 70)[0][0])
    hi = int(np.where(time_bins == 170)[0][0]) + 1

    return np.nanmean(f[lo:hi, :, :, :], axis=0)
19
+
20
def get_predictions_multioutput(responses, predictor, ncomp=10, nrfolds=10, seed=0, model=None, monkey=None):
    """Cross-validated multi-output ridge predictions.

    Every target column of `responses` is predicted from `predictor` with
    `nrfolds`-fold cross-validation; each row's prediction comes from the
    fold in which it was held out.

    NOTE(review): `ncomp` is accepted but not used in this function.
    """
    n_images, n_targets = responses.shape
    predictions = np.full((n_images, n_targets), np.nan)

    for fold in range(nrfolds):
        train_idx, test_idx = get_train_test_indices(n_images, nrfolds=nrfolds, foldnumber=fold, seed=seed)
        predictions[test_idx, :] = ridge_regress(
            predictor[train_idx, :], responses[train_idx, :], predictor[test_idx, :],
            model=model, monkey=monkey, fold=fold,
        )

    return predictions
30
+
31
def get_all_preds(neurons_predicted, neurons_predictor, ncomp, model=None, monkey=None):
    """Predict the target population from the source population.

    Handles multi-target prediction: 3-D inputs (images, neurons, repeats)
    are first nan-averaged over the repeat axis; 2-D inputs are used as-is.
    """
    mean_target = (np.nanmean(neurons_predicted, axis=2)
                   if neurons_predicted.ndim == 3 else neurons_predicted)
    mean_source = (np.nanmean(neurons_predictor, axis=2)
                   if neurons_predictor.ndim == 3 else neurons_predictor)
    return get_predictions_multioutput(mean_target, mean_source, ncomp=ncomp, model=model, monkey=monkey)
44
+
45
def get_splithalf_corr(var, ax=1, type='spearman'):
    """Per-neuron split-half correlation.

    Splits `var` along axis `ax` via `get_splithalves`, then correlates the
    two half-means column by column (one value per neuron).

    Returns a dict with 'split_half_corr' (array of r values) and 'type'.
    """
    _, _, half1, half2 = get_splithalves(var, ax=ax)

    # Both half-means are expected as (samples, neurons)
    assert half1.ndim == 2 and half2.ndim == 2, "Split halves must be 2D"

    corr_fn = stats.spearmanr if type == 'spearman' else stats.pearsonr
    rs = [corr_fn(half1[:, n], half2[:, n])[0] for n in range(half1.shape[1])]

    return {
        'split_half_corr': np.array(rs),
        'type': type
    }
64
+
65
def predictivity(x, y, rho_xx, rho_yy):
    """Noise-corrected explained variance (percent) per neuron.

    Pearson-correlates each column of `x` with the matching column of `y`,
    divides by sqrt(rho_xx * rho_yy) to correct for measurement noise, and
    returns the squared result scaled to percent.
    """
    assert x.shape == y.shape, "Input and prediction shapes must match"

    raw = np.array([stats.pearsonr(x[:, col], y[:, col])[0]
                    for col in range(x.shape[1])])
    corrected = raw / np.sqrt(rho_xx * rho_yy)
    return (corrected ** 2) * 100
74
+
75
+
76
def get_neural_neural_splithalfcorr(rate_predicted, rate_predictor, ncomp=10, nrfolds=10, seed=0):
    """Split-half reliabilities for neural-to-neural prediction.

    Returns
    -------
    prediction_shc : ndarray
        Spearman-Brown-corrected split-half correlation of the predictions
        made from each half of the predictor population's repetitions.
    neuron_shc : ndarray
        Spearman-Brown-corrected split-half correlation of each predicted
        neuron's own responses.
    """
    # Reliability of each predicted neuron
    shc_predicted = get_splithalf_corr(rate_predicted, ax=2)

    # Predict the trial-averaged target from each half of the predictor reps
    half1, half2, _, _ = get_splithalves(rate_predictor, ax=2)
    target_mean = np.nanmean(rate_predicted, axis=2)
    pred_a = get_predictions_multioutput(target_mean, np.nanmean(half1, axis=2),
                                         nrfolds=nrfolds, ncomp=ncomp, seed=seed)
    pred_b = get_predictions_multioutput(target_mean, np.nanmean(half2, axis=2),
                                         nrfolds=nrfolds, ncomp=ncomp, seed=seed)

    rs = np.array([stats.pearsonr(pred_a[:, n], pred_b[:, n])[0]
                   for n in range(pred_a.shape[1])])
    prediction_shc = spearmanbrown_correction(rs)

    # Reduce a square (neurons, neurons) matrix to its diagonal if needed
    mat = shc_predicted['split_half_corr']
    diag_vals = np.diag(mat) if mat.ndim == 2 and mat.shape[0] == mat.shape[1] else mat
    neuron_shc = spearmanbrown_correction(diag_vals)

    return prediction_shc, neuron_shc
100
+
101
def get_neural_model_splithalfcorr(model_features, rate, ncomp=10, nrfolds=10, seed=0):
    """Split-half reliability involving deterministic model features.

    Parameters
    ----------
    model_features : ndarray, (n_images, n_model_units)
        Noiseless model responses.
    rate : ndarray, (n_images, n_neurons, n_repeats)
        Noisy neural data, split along the repeat axis.

    Returns the Spearman-Brown-corrected split-half correlation of the
    predictions (model features regressed on each neural half), and 1.0 as
    the reliability of the noise-free model features.
    """
    half1, half2, _, _ = get_splithalves(rate, ax=2)  # split over repetitions

    # Regress the model features on each half of the trial-averaged neural
    # data (model_features are the regression targets here).
    pa = get_predictions_multioutput(model_features, np.nanmean(half1, axis=2), nrfolds=nrfolds, ncomp=ncomp, seed=seed)
    pb = get_predictions_multioutput(model_features, np.nanmean(half2, axis=2), nrfolds=nrfolds, ncomp=ncomp, seed=seed)

    # Per-unit split-half correlation, Spearman-Brown corrected
    rs = np.array([stats.pearsonr(pa[:, n], pb[:, n])[0] for n in range(pa.shape[1])])
    model_shc = spearmanbrown_correction(rs)

    return model_shc, 1.0
117
+
118
def get_model_neural_splithalfcorr(rate, model_features, ncomp=10, nrfolds=10, seed=0):
    """Split-half reliabilities when predicting noisy neural data from a model.

    Parameters
    ----------
    rate : ndarray, (images, neurons, repeats)
    model_features : ndarray, (images, model_units)

    Returns the Spearman-Brown-corrected split-half correlation of the
    model-based predictions, and of the neural responses themselves.
    """
    # Split the repetitions in two
    half1, half2, _, _ = get_splithalves(rate, ax=2)

    # Per-neuron reliability of the raw rates
    shc = get_splithalf_corr(rate, ax=2)

    # Trial-averaged targets, one per half: (images, neurons)
    target_a = np.nanmean(half1, axis=2)
    target_b = np.nanmean(half2, axis=2)

    # Predict each half-target from the (fixed) model features
    pa = get_predictions_multioutput(target_a, model_features, nrfolds=nrfolds, ncomp=ncomp, seed=seed)
    pb = get_predictions_multioutput(target_b, model_features, nrfolds=nrfolds, ncomp=ncomp, seed=seed)

    # Split-half correlation of the model predictions, per neuron
    rs = np.array([stats.pearsonr(pa[:, n], pb[:, n])[0] for n in range(pa.shape[1])])
    model_shc = spearmanbrown_correction(rs)

    neural_shc = spearmanbrown_correction(shc['split_half_corr'])

    return model_shc, neural_shc
145
+
146
def get_all_stats(p, neurons_predicted, neurons_predictor, ncomp):
    """Noise-corrected explained variance of predictions `p`.

    Chooses the split-half correction appropriate to whether each input is
    noisy neural data (3-D: images x neurons x repeats) or deterministic
    features (2-D: images x units), then scores `p` against the trial-averaged
    target with `predictivity`.

    Raises
    ------
    ValueError
        If either input is not 2-D or 3-D. (The original fell through its
        four non-exclusive `if`s and died with UnboundLocalError instead.)
    """
    if neurons_predicted.ndim not in (2, 3) or neurons_predictor.ndim not in (2, 3):
        raise ValueError("neurons_predicted and neurons_predictor must be 2-D or 3-D")

    pred_3d = neurons_predicted.ndim == 3
    src_3d = neurons_predictor.ndim == 3

    # Trial-average the target if it carries a repeat axis
    mean_target = np.nanmean(neurons_predicted, axis=2) if pred_3d else neurons_predicted

    # Exclusive dispatch on input dimensionality (original used four
    # independent `if`s, which also re-tested conditions unnecessarily)
    if pred_3d and src_3d:
        mshc, nshc = get_neural_neural_splithalfcorr(neurons_predicted, neurons_predictor, ncomp=ncomp)
    elif not pred_3d and src_3d:
        mshc, nshc = get_neural_model_splithalfcorr(neurons_predicted, neurons_predictor, ncomp=ncomp)
    elif pred_3d and not src_3d:
        mshc, nshc = get_model_neural_splithalfcorr(neurons_predicted, neurons_predictor, ncomp=ncomp)
    else:
        # Both deterministic: no noise correction needed
        mshc, nshc = 1.0, 1.0

    # predictivity(x, y, rho_xx, rho_yy); p and mean_target are both 2-D here
    ev = predictivity(mean_target, p, nshc, mshc)
    return ev
@@ -0,0 +1,79 @@
1
+ import numpy as np
2
+ from sklearn import linear_model
3
+
4
def ridge_regress(X_train, Y_train, X_test, model=None, monkey=None, fold=None, alpha=0.1):
    """Fit a ridge regression on the training set and predict the test set.

    Parameters
    ----------
    X_train : ndarray, shape (n_train, n_features)
        Training predictors.
    Y_train : ndarray, shape (n_train, n_targets)
        Training targets (multi-output supported by sklearn's Ridge).
    X_test : ndarray, shape (n_test, n_features)
        Held-out predictors.
    model, monkey, fold : optional
        When `model` is given, the fitted coefficients are saved to
        ./results_for_figures/model2monkey/ under a name built from all three.
    alpha : float, optional
        Ridge regularization strength. Default 0.1 preserves the
        historical hard-coded value.

    Returns
    -------
    ndarray, shape (n_test, n_targets)
        Predictions for `X_test`.
    """
    clf = linear_model.Ridge(alpha=alpha)
    clf.fit(X_train, Y_train)
    Y_test_pred = clf.predict(X_test)

    if model is not None:
        # Persist the fitted weights for later figure-making
        np.save(f'./results_for_figures/model2monkey/{model}_to_{monkey}_ridge_weights_{fold}.npy', clf.coef_)

    return Y_test_pred
31
+
32
def get_train_test_indices(totalIndices, nrfolds=10, foldnumber=0, seed=1):
    """Deterministically split `totalIndices` items into train/test folds.

    Shuffles 0..totalIndices-1 with NumPy's global RNG seeded by `seed`,
    partitions the shuffled order into `nrfolds` chunks, and holds out chunk
    `foldnumber` as the test set.

    Returns
    -------
    train_indices, test_indices : ndarray
        Disjoint index arrays whose union covers range(totalIndices).
    """
    np.random.seed(seed)
    order = np.arange(totalIndices)
    np.random.shuffle(order)

    folds = np.array_split(order, nrfolds)
    held_out = folds[foldnumber]

    # Membership mask over the shuffled order: held-out chunk -> test,
    # everything else -> train (both keep the shuffled order).
    in_test = np.isin(order, held_out)
    train_indices = order[~in_test]
    test_indices = order[in_test]
    return train_indices, test_indices
63
+
64
+
65
def main():
    """No-op entry point: this module is meant to be imported, not run."""
    # Bug fix: the original `def main():` had no indented body (the guard
    # below was at top level), which is a SyntaxError; the docstring above
    # now serves as the function body.


if __name__ == "__main__":
    main()
68
+
69
+
70
+
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+
79
+