PyPI - gsMap - Versions diffs - 1.73.0__tar.gz → 1.73.2__tar.gz - Mend

Files changed (83)

gsmap-1.73.2/.github/workflows/test_linux.yml ADDED

@@ -0,0 +1,100 @@
+name: test
+on:
+  push:
+    branches: [main, "[0-9]+.[0-9]+.x"]
+  pull_request:
+  schedule:
+    - cron: "0 0 * * *"
+  workflow_dispatch:
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}
+  cancel-in-progress: true
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    defaults:
+      run:
+        shell: bash -e {0} # -e to fail on error
+    strategy:
+      fail-fast: false
+      matrix:
+        python: ["3.10", "3.13"]
+    name: Python ${{ matrix.python }} integration
+    env:
+      PYTHON: ${{ matrix.python }}
+      TEST_DATA_URL: https://yanglab.westlake.edu.cn/data/gsMap/gsMap_test_data.tar.gz
+      TEST_DATA_DIR: ${{ github.workspace }}/test_data
+      WORK_DIR: ${{ github.workspace }}/gsmap_workdir
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
+        with:
+          python-version: ${{ matrix.python }}
+      - name: Install dependencies
+        run: |
+          uv pip install --system -e ".[tests]"
+      - name: Create workdir
+        run: |
+          mkdir -p $WORK_DIR
+          echo "Created workdir: $WORK_DIR"
+      - name: Cache test data
+        uses: actions/cache@v3
+        id: cache-test-data
+        with:
+          path: ${{ env.TEST_DATA_DIR }}
+          key: test-data-v1
+      - name: Download and extract test data
+        if: steps.cache-test-data.outputs.cache-hit != 'true'
+        run: |
+          echo "Downloading test data from $TEST_DATA_URL"
+          curl -L $TEST_DATA_URL -o gsMap_test_data.tar.gz
+          tar -xzf gsMap_test_data.tar.gz -C ${{ github.workspace }}
+          rm gsMap_test_data.tar.gz
+          echo "Test data extracted to ${{ github.workspace }}"
+          ls -la $TEST_DATA_DIR
+      - name: Run pytest
+        env:
+          MPLBACKEND: agg
+          DISPLAY: :0
+          COLUMNS: 120
+        run: |
+          python -m pytest --cov=src \
+              --junitxml=junit.xml -o junit_family=legacy \
+              --cov-report=term-missing \
+              --cov-report=xml \
+              --cov-config=.coveragerc \
+              -v -s --color=yes \
+              --run-real-data \
+              --work-dir=$WORK_DIR \
+              --test-data=$TEST_DATA_DIR
+      - uses: codecov/codecov-action@v4
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}
+          files: ./coverage.xml
+          fail_ci_if_error: false
+      - name: Upload test results to Codecov
+        if: ${{ !cancelled() }}
+        uses: codecov/test-results-action@v1
+        with:
+          token: ${{ secrets.CODECOV_TOKEN }}
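
The `Run pytest` step passes `--run-real-data`, `--work-dir`, and `--test-data`, none of which are built-in pytest flags, so the test suite has to register them. One common way is a `pytest_addoption` hook in `conftest.py`. The package's own `conftest.py` is not shown in this section, so the following is only a hypothetical sketch: the defaults and help strings are assumptions, and only the three flag names come from the workflow.

```python
# conftest.py (hypothetical sketch; defaults and help texts are assumptions)


def pytest_addoption(parser):
    """Register the custom flags used by the workflow's pytest call."""
    parser.addoption(
        "--run-real-data",
        action="store_true",
        default=False,
        help="Run tests that need the downloaded real test data.",
    )
    parser.addoption(
        "--work-dir",
        action="store",
        default=None,
        help="Directory where test outputs are written.",
    )
    parser.addoption(
        "--test-data",
        action="store",
        default=None,
        help="Directory containing the extracted test data.",
    )
```

Tests or fixtures could then read the values with `request.config.getoption("--work-dir")` and skip the real-data tests when `--run-real-data` is absent.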

{gsmap-1.73.0 → gsmap-1.73.2}/.pre-commit-config.yaml RENAMED

@@ -18,7 +18,7 @@ repos:
         types: [yaml]
   - repo: https://github.com/executablebooks/mdformat
-    rev: 0.7.21
+    rev: 0.7.22
     hooks:
       - id: mdformat
         additional_dependencies:
@@ -29,7 +29,7 @@ repos:
           )$
   - repo: https://github.com/igorshubovych/markdownlint-cli
-    rev: v0.43.0
+    rev: v0.44.0
     hooks:
       - id: markdownlint-fix
         exclude: |
@@ -38,7 +38,7 @@ repos:
           )$
   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.9.2
+    rev: v0.11.5
     hooks:
       - id: ruff
         args: [--fix, --exit-non-zero-on-fix]

{gsmap-1.73.0 → gsmap-1.73.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gsMap
-Version: 1.73.0
+Version: 1.73.2
 Summary: Genetics-informed pathogenic spatial mapping
 Author-email: liyang <songliyang@westlake.edu.cn>, wenhao <chenwenhao@westlake.edu.cn>
 Requires-Python: >=3.10
@@ -97,6 +97,14 @@ conda activate gsMap
 pip install gsMap
 ```
+Install using conda:
+```bash
+conda create -n gsMap "python>=3.10"
+conda activate gsMap
+conda install bioconda::gsmap
+```
 Install from source:
 ```bash

{gsmap-1.73.0 → gsmap-1.73.2}/README.md RENAMED

@@ -33,6 +33,14 @@ conda activate gsMap
 pip install gsMap
 ```
+Install using conda:
+```bash
+conda create -n gsMap "python>=3.10"
+conda activate gsMap
+conda install bioconda::gsmap
+```
 Install from source:
 ```bash

gsmap-1.73.2/docs/source/10x.md ADDED

@@ -0,0 +1,137 @@
+# Cases on 10x Visium Data
+Here we provide case applications based on 10x Visium data (which are not at single-cell resolution). For convenience, we used the `Quick Mode` here, but you can also follow the {doc}`Step by Step <step_by_step>` Guide to analyze 10x Visium data—the steps are the same.
+A frequently asked question is how to provide annotations for 10x Visium data. Note that gsMap can run without annotations. The most convenient approaches are to either leave the `annotation` parameter unset (in {doc}`Step by Step <step_by_step>`) or provide annotations from spatial clustering methods, such as [SpaGCN](https://github.com/jianhuupenn/SpaGCN).
+## Preparation
+Make sure you have {doc}`installed <install>` the `gsMap` package before proceeding.
+### 1. Download Dependencies
+The `gsMap` package in quick mode requires the following resources:
+- **Gene transfer format (GTF) file**, for gene coordinates on the genome.
+- **LD reference panel**, in quick mode, we provide a pre-built LD score snp-by-gene matrix based on 1000G_EUR_Phase3.
+- **SNP weight file**, to adjust correlations between SNP-trait association statistics.
+- **Homologous gene transformations file** (optional), to map genes between species.
+To download all the required files:
+```bash
+wget https://yanglab.westlake.edu.cn/data/gsMap/gsMap_resource.tar.gz
+tar -xvzf gsMap_resource.tar.gz
+```
+Directory structure:
+```bash
+tree -L 2
+gsMap_resource
+    ├── genome_annotation
+    │   ├── enhancer
+    │   └── gtf
+    ├── homologs
+    │   ├── macaque_human_homologs.txt
+    │   └── mouse_human_homologs.txt
+    ├── LD_Reference_Panel
+    │   └── 1000G_EUR_Phase3_plink
+    ├── LDSC_resource
+    │   ├── hapmap3_snps
+    │   └── weights_hm3_no_hla
+    └── quick_mode
+        ├── baseline
+        ├── SNP_gene_pair
+        └── snp_gene_weight_matrix.h5ad
+```
+### 2. Download Example Data
+You can download the example 10x Visium data as follows:
+```bash
+wget https://yanglab.westlake.edu.cn/data/gsMap/Visium_example_data.tar.gz
+tar -xvzf Visium_example_data.tar.gz
+```
+Directory structure:
+```bash
+tree -L 2
+Visium_example_data/
+├── GWAS
+│   ├── IQ_NG_2018.sumstats.gz
+│   └── Serum_creatinine.sumstats.gz
+└── ST
+    ├── V1_Adult_Mouse_Brain_Coronal_Section.h5ad
+    ├── V1_Mouse_Brain_Sagittal_Posterior_Section.h5ad
+    └── V1_Mouse_Kidney.h5ad
+```
+## Case1
+Data: Visium data of adult mouse coronal section
+Trait: IQ
+<span style="color:#31a354"> Required memory: 11G (2902 cells) </span>
+```bash
+gsmap quick_mode \
+    --workdir './example_quick_mode/Visium' \
+    --homolog_file 'gsMap_resource/homologs/mouse_human_homologs.txt' \
+    --sample_name 'V1_Adult_Mouse_Brain_Coronal_Section' \
+    --gsMap_resource_dir 'gsMap_resource' \
+    --hdf5_path 'Visium_example_data/ST/V1_Adult_Mouse_Brain_Coronal_Section.h5ad' \
+    --annotation 'domain' \
+    --data_layer 'count' \
+    --sumstats_file 'Visium_example_data/GWAS/IQ_NG_2018.sumstats.gz' \
+    --trait_name 'IQ'
+```
+[gsMap report](https://yanglab.westlake.edu.cn/data/gsMap/Visium_report/coronal/V1_Adult_Mouse_Brain_Coronal_Section_IQ_gsMap_Report.html) for the `IQ` on the adult mouse coronal section Visium data.
+## Case2
+Data: Visium data of adult mouse sagittal section
+Trait: IQ
+<span style="color:#31a354"> Required memory: 12G (3289 cells) </span>
+```bash
+gsmap quick_mode \
+    --workdir './example_quick_mode/Visium' \
+    --homolog_file 'gsMap_resource/homologs/mouse_human_homologs.txt' \
+    --sample_name 'V1_Mouse_Brain_Sagittal_Posterior_Section' \
+    --gsMap_resource_dir 'gsMap_resource' \
+    --hdf5_path 'Visium_example_data/ST/V1_Mouse_Brain_Sagittal_Posterior_Section.h5ad' \
+    --annotation 'domain' \
+    --data_layer 'count' \
+    --sumstats_file 'Visium_example_data/GWAS/IQ_NG_2018.sumstats.gz' \
+    --trait_name 'IQ'
+```
+[gsMap report](https://yanglab.westlake.edu.cn/data/gsMap/Visium_report/saggital/V1_Mouse_Brain_Sagittal_Posterior_Section_IQ_gsMap_Report.html) for the `IQ` on the adult mouse sagittal section Visium data.
+## Case3
+Data: Visium data of adult mouse kidney
+Trait: Serum creatinine
+<span style="color:#31a354"> Required memory: 8G (1437 cells) </span>
+```bash
+gsmap quick_mode \
+    --workdir './example_quick_mode/Visium' \
+    --homolog_file 'gsMap_resource/homologs/mouse_human_homologs.txt' \
+    --sample_name 'V1_Mouse_Kidney' \
+    --gsMap_resource_dir 'gsMap_resource' \
+    --hdf5_path 'Visium_example_data/ST/V1_Mouse_Kidney.h5ad' \
+    --annotation 'domain' \
+    --data_layer 'count' \
+    --sumstats_file 'Visium_example_data/GWAS/Serum_creatinine.sumstats.gz' \
+    --trait_name 'Serum_creatinine'
+```
+[gsMap report](https://yanglab.westlake.edu.cn/data/gsMap/Visium/V1_Mouse_Kidney_Serum_creatinine_gsMap_Report.html) for the `Serum creatinine` on the adult mouse kidney Visium data.

{gsmap-1.73.0 → gsmap-1.73.2}/docs/source/advanced_usage.md RENAMED

@@ -44,7 +44,7 @@ do
         --chrom $CHROM \
         --bfile_root 'gsMap_resource/LD_Reference_Panel/1000G_EUR_Phase3_plink/1000G.EUR.QC' \
         --keep_snp_root 'gsMap_resource/LDSC_resource/hapmap3_snps/hm' \
-        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v39lift37.annotation.gtf' \
+        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v46lift37.basic.annotation.gtf' \
         --gene_window_size 50000 \
         --additional_baseline_annotation 'gsMap_additional_annotation'
 done

{gsmap-1.73.0 → gsmap-1.73.2}/docs/source/step_by_step.md RENAMED

@@ -132,7 +132,7 @@ do
         --chrom $CHROM \
         --bfile_root 'gsMap_resource/LD_Reference_Panel/1000G_EUR_Phase3_plink/1000G.EUR.QC' \
         --keep_snp_root 'gsMap_resource/LDSC_resource/hapmap3_snps/hm' \
-        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v39lift37.annotation.gtf' \
+        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v46lift37.basic.annotation.gtf' \
         --gene_window_size 50000
 done
 ```
@@ -150,7 +150,7 @@ do
         --chrom $CHROM \
         --bfile_root 'gsMap_resource/LD_Reference_Panel/1000G_EUR_Phase3_plink/1000G.EUR.QC' \
         --keep_snp_root 'gsMap_resource/LDSC_resource/hapmap3_snps/hm' \
-        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v39lift37.annotation.gtf' \
+        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v46lift37.basic.annotation.gtf' \
         --enhancer_annotation_file 'gsMap_resource/genome_annotation/enhancer/by_tissue/ALL/ABC_roadmap_merged.bed' \
         --snp_multiple_enhancer_strategy 'max_mkscore' \
         --gene_window_enhancer_priority 'enhancer_only'
@@ -170,7 +170,7 @@ do
         --chrom $CHROM \
         --bfile_root 'gsMap_resource/LD_Reference_Panel/1000G_EUR_Phase3_plink/1000G.EUR.QC' \
         --keep_snp_root 'gsMap_resource/LDSC_resource/hapmap3_snps/hm' \
-        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v39lift37.annotation.gtf' \
+        --gtf_annotation_file 'gsMap_resource/genome_annotation/gtf/gencode.v46lift37.basic.annotation.gtf' \
         --gene_window_size 50000 \
         --enhancer_annotation_file 'gsMap_resource/genome_annotation/enhancer/by_tissue/ALL/ABC_roadmap_merged.bed' \
         --snp_multiple_enhancer_strategy 'max_mkscore' \

{gsmap-1.73.0 → gsmap-1.73.2}/docs/source/tutorials.rst RENAMED

@@ -27,4 +27,5 @@ The tutorials are organized as follows:
     quick_mode.md
     step_by_step.md
     advanced_usage.md
+    10x.md
     data_format.md

{gsmap-1.73.0 → gsmap-1.73.2}/src/gsMap/GNN/train.py RENAMED

@@ -17,7 +17,7 @@ def reconstruction_loss(decoded, x):
 def label_loss(pred_label, true_label):
     """Compute the cross-entropy loss."""
-    return F.cross_entropy(pred_label, true_label)
+    return F.cross_entropy(pred_label, true_label.long())
 class ModelTrainer:

{gsmap-1.73.0 → gsmap-1.73.2}/src/gsMap/__init__.py RENAMED

@@ -2,4 +2,4 @@
 Genetics-informed pathogenic spatial mapping
 """
-__version__ = "1.73.0"
+__version__ = "1.73.2"

{gsmap-1.73.0 → gsmap-1.73.2}/src/gsMap/config.py RENAMED

@@ -232,6 +232,9 @@ def add_find_latent_representations_args(parser):
         action="store_true",
         help="Enable hierarchical latent representation finding.",
     )
+    parser.add_argument(
+        "--pearson_residuals", action="store_true", help="Using the pearson residuals."
+    )
 def chrom_choice(value):
@@ -308,7 +311,7 @@ def add_generate_ldscore_args(parser):
         help="Root path for genotype plink bfiles (.bim, .bed, .fam).",
     )
     parser.add_argument(
-        "--keep_snp_root", type=str, required=True, help="Root path for SNP files."
+        "--keep_snp_root", type=str, required=False, help="Root path for SNP files"
     )
     parser.add_argument(
         "--gtf_annotation_file", type=str, required=True, help="Path to GTF annotation file."
@@ -357,7 +360,11 @@ def add_spatial_ldsc_args(parser):
         "--sumstats_file", type=str, required=True, help="Path to GWAS summary statistics file."
     )
     parser.add_argument(
-        "--w_file", type=str, required=True, help="Path to regression weight file."
+        "--w_file",
+        type=str,
+        required=False,
+        default=None,
+        help="Path to regression weight file. If not provided, will use weights generated in the generate_ldscore step.",
     )
     parser.add_argument(
         "--trait_name", type=str, required=True, help="Name of the trait being analyzed."
@@ -678,6 +685,9 @@ def add_run_all_mode_args(parser):
     parser.add_argument(
         "--gM_slices", type=str, default=None, help="Path to the slice mean file (optional)."
     )
+    parser.add_argument(
+        "--pearson_residuals", action="store_true", help="Using the pearson residuals."
+    )
 def ensure_path_exists(func):
@@ -854,6 +864,7 @@ class FindLatentRepresentationsConfig(ConfigWithAutoPaths):
     var: bool = False
     convergence_threshold: float = 1e-4
     hierarchically: bool = False
+    pearson_residuals: bool = False
     def __post_init__(self):
         # self.output_hdf5_path = self.hdf5_with_latent_path
@@ -942,11 +953,11 @@ class GenerateLDScoreConfig(ConfigWithAutoPaths):
     chrom: int | str
     bfile_root: str
-    keep_snp_root: str | None
     # annotation by gene distance
     gtf_annotation_file: str
     gene_window_size: int = 50000
+    keep_snp_root: str | None = None
     # annotation by enhancer
     enhancer_annotation_file: str = None
@@ -1055,7 +1066,7 @@ class GenerateLDScoreConfig(ConfigWithAutoPaths):
 @dataclass
 class SpatialLDSCConfig(ConfigWithAutoPaths):
-    w_file: str
+    w_file: str | None = None
     # ldscore_save_dir: str
     use_additional_baseline_annotation: bool = True
     trait_name: str | None = None
@@ -1105,8 +1116,19 @@ class SpatialLDSCConfig(ConfigWithAutoPaths):
         for sumstats_file in self.sumstats_config_dict.values():
             assert Path(sumstats_file).exists(), f"{sumstats_file} does not exist."
-        # check if additional baseline annotation is exist
-        # self.use_additional_baseline_annotation = False
+        # Handle w_file
+        if self.w_file is None:
+            w_ld_dir = Path(self.ldscore_save_dir) / "w_ld"
+            if w_ld_dir.exists():
+                self.w_file = str(w_ld_dir / "weights.")
+                logger.info(f"Using weights generated in the generate_ldscore step: {self.w_file}")
+            else:
+                raise ValueError(
+                    "No w_file provided and no weights found in generate_ldscore output. "
+                    "Either provide --w_file or run generate_ldscore first."
+                )
+        else:
+            logger.info(f"Using provided weights file: {self.w_file}")
         if self.use_additional_baseline_annotation:
             self.process_additional_baseline_annotation()
@@ -1117,16 +1139,6 @@ class SpatialLDSCConfig(ConfigWithAutoPaths):
         if not dir_exists:
             self.use_additional_baseline_annotation = False
-            # if self.use_additional_baseline_annotation:
-            #     logger.warning(f"additional_baseline directory is not found in {self.ldscore_save_dir}.")
-            #     print('''\
-            #         if you want to use additional baseline annotation,
-            #         please provide additional baseline annotation when calculating ld score.
-            #         ''')
-            #     raise FileNotFoundError(
-            #         f'additional_baseline directory is not found.')
-            # return
-            # self.use_additional_baseline_annotation = self.use_additional_baseline_annotation or True
         else:
             logger.info(
                 "------Additional baseline annotation is provided. It will be used with the default baseline annotation."
@@ -1227,6 +1239,7 @@ class RunAllModeConfig(ConfigWithAutoPaths):
     # == Find Latent Representation PARAMETERS ==
     n_comps: int = 300
+    pearson_residuals: bool = False
     # == latent 2 Gene PARAMETERS ==
     gM_slices: str | None = None
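
The `w_file` hunks above make the regression weights optional: when `--w_file` is omitted, `SpatialLDSCConfig` looks for a `w_ld` directory under `ldscore_save_dir` and falls back to the `weights.` prefix inside it, raising a `ValueError` otherwise. The same resolution logic can be sketched standalone. `resolve_w_file` is a hypothetical helper for illustration and is not part of gsMap; a temporary directory stands in for the real LD score output.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_w_file(w_file: Optional[str], ldscore_save_dir: str) -> str:
    """Return the weight-file prefix, mirroring the fallback shown in the diff.

    Hypothetical helper for illustration; not part of gsMap.
    """
    if w_file is not None:
        # An explicitly provided --w_file always takes precedence.
        return w_file
    w_ld_dir = Path(ldscore_save_dir) / "w_ld"
    if w_ld_dir.exists():
        # Same "weights." prefix the diff assigns to self.w_file.
        return str(w_ld_dir / "weights.")
    raise ValueError(
        "No w_file provided and no weights found in generate_ldscore output."
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate a finished generate_ldscore run that wrote a w_ld directory.
        (Path(tmp) / "w_ld").mkdir()
        print(resolve_w_file(None, tmp))
        print(resolve_w_file("custom/weights.", tmp))
```

Note that the check only tests whether the directory exists, not whether it contains any weight files, so an empty `w_ld` directory would still pass.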

{gsmap-1.73.0 → gsmap-1.73.2}/src/gsMap/create_slice_mean.py RENAMED

@@ -23,6 +23,7 @@ def get_common_genes(h5ad_files, config: CreateSliceMeanConfig):
     common_genes = None
     for file in tqdm(h5ad_files, desc="Finding common genes"):
         adata = sc.read_h5ad(file)
+        sc.pp.filter_genes(adata, min_cells=1)
         adata.var_names_make_unique()
         if common_genes is None:
             common_genes = adata.var_names

{gsmap-1.73.0 → gsmap-1.73.2}/src/gsMap/diagnosis.py RENAMED

@@ -49,7 +49,10 @@ def compute_gene_diagnostic_info(config: DiagnosisConfig):
     # Align marker scores with trait LDSC results
     mk_score = mk_score.loc[trait_ldsc_result.index]
-    mk_score = mk_score.loc[:, mk_score.sum(axis=0) != 0]
+    # Filter out genes with no variation
+    has_variation = (~mk_score.eq(mk_score.iloc[0], axis=1)).any()
+    mk_score = mk_score.loc[:, has_variation]
     logger.info("Calculating correlation between gene marker scores and trait logp-values...")
     corr = mk_score.corrwith(trait_ldsc_result["logp"])
@@ -66,10 +69,6 @@ def compute_gene_diagnostic_info(config: DiagnosisConfig):
         }
     )
-    # Filter based on median GSS score
-    high_GSS_Gene_annotation_pair = high_GSS_Gene_annotation_pair[
-        high_GSS_Gene_annotation_pair["Median_GSS"] >= 1.0
-    ]
     high_GSS_Gene_annotation_pair = high_GSS_Gene_annotation_pair.merge(
         corr, left_on="Gene", right_index=True
     )
@@ -88,19 +87,6 @@ def compute_gene_diagnostic_info(config: DiagnosisConfig):
     gene_diagnostic_info.to_csv(gene_diagnostic_info_save_path, index=False)
     logger.info(f"Gene diagnostic information saved to {gene_diagnostic_info_save_path}.")
-    # TODO: A new script is needed to save the gene diagnostic info to adata.var and trait_ldsc_result to adata.obs when running multiple traits
-    # # Save to adata.var with the trait_name prefix
-    # logger.info('Saving gene diagnostic info to adata.var...')
-    # gene_diagnostic_info.set_index('Gene', inplace=True)  # Use 'Gene' as the index to align with adata.var
-    # adata.var[f'{config.trait_name}_Annotation'] = gene_diagnostic_info['Annotation']
-    # adata.var[f'{config.trait_name}_Median_GSS'] = gene_diagnostic_info['Median_GSS']
-    # adata.var[f'{config.trait_name}_PCC'] = gene_diagnostic_info['PCC']
-    #
-    # # Save trait_ldsc_result to adata.obs
-    # logger.info(f'Saving trait LDSC results to adata.obs as gsMap_{config.trait_name}_p_value...')
-    # adata.obs[f'gsMap_{config.trait_name}_p_value'] = trait_ldsc_result['p']
-    # adata.write(config.hdf5_with_latent_path, )
     return gene_diagnostic_info.reset_index()
@@ -171,6 +157,20 @@ def generate_manhattan_plot(config: DiagnosisConfig):
         + gwas_data_to_plot["Annotation"].astype(str)
     )
+    # Verify data integrity
+    if gwas_data_with_gene_annotation_sort.empty:
+        logger.error("Filtered GWAS data is empty, cannot create Manhattan plot")
+        return
+    if len(gwas_data_to_plot) == 0:
+        logger.error("No SNPs passed filtering criteria for Manhattan plot")
+        return
+    # Log some diagnostic information
+    logger.info(f"Creating Manhattan plot with {len(gwas_data_to_plot)} SNPs")
+    logger.info(f"Columns available: {list(gwas_data_to_plot.columns)}")
+    logger.info(f"Chromosome column values: {gwas_data_to_plot['CHR'].unique()}")
     fig = ManhattanPlot(
         dataframe=gwas_data_to_plot,
         title="gsMap Diagnosis Manhattan Plot",
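
The first hunk in this file replaces the column filter `mk_score.sum(axis=0) != 0` with `has_variation`, which keeps a gene only if at least one of its scores differs from the value in the first row. The difference matters for constant, nonzero columns: they pass the old sum test but have zero variance, so a Pearson correlation against them is undefined. The following plain-Python restatement of the two filters uses made-up gene names and scores, and is only meant to show where they disagree.

```python
def nonzero_sum(column):
    """Old filter: keep the column if its total is not zero."""
    return sum(column) != 0


def has_variation(column):
    """New filter: keep the column if any value differs from the first one."""
    return any(value != column[0] for value in column)


# Hypothetical marker scores: one list of per-spot scores per gene.
mk_score = {
    "gene_const": [1.0, 1.0, 1.0],   # constant and nonzero
    "gene_zero": [0.0, 0.0, 0.0],    # constant and zero
    "gene_varied": [0.5, 2.0, 1.0],  # varies across spots
}

kept_old = [gene for gene, col in mk_score.items() if nonzero_sum(col)]
kept_new = [gene for gene, col in mk_score.items() if has_variation(col)]

print("old filter keeps:", kept_old)
print("new filter keeps:", kept_new)
```

Here the old filter keeps `gene_const` while the new one drops it; both drop `gene_zero` and keep `gene_varied`.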

{gsmap-1.73.0 → gsmap-1.73.2}/src/gsMap/find_latent_representation.py RENAMED

@@ -50,6 +50,15 @@ def preprocess_data(adata, params):
         # HVGs based on count
         logger.info("Dealing with count data...")
         sc.pp.highly_variable_genes(adata, flavor="seurat_v3", n_top_genes=params.feat_cell)
+        # Get the pearson residuals
+        if params.pearson_residuals:
+            sc.experimental.pp.normalize_pearson_residuals(adata, inplace=False)
+            pearson_residuals = sc.experimental.pp.normalize_pearson_residuals(
+                adata, inplace=False, clip=10
+            )
+            adata.layers["pearson_residuals"] = pearson_residuals["X"]
         # Normalize the data
         sc.pp.normalize_total(adata, target_sum=1e4)
         sc.pp.log1p(adata)
@@ -64,8 +73,13 @@ class LatentRepresentationFinder:
     def __init__(self, adata, args: FindLatentRepresentationsConfig):
         self.params = args
-        self.expression_array = adata[:, adata.var.highly_variable].X.copy()
-        self.expression_array = sc.pp.scale(self.expression_array, max_value=10)
+        if "pearson_residuals" in adata.layers:
+            self.expression_array = (
+                adata[:, adata.var.highly_variable].layers["pearson_residuals"].copy()
+            )
+        else:
+            self.expression_array = adata[:, adata.var.highly_variable].X.copy()
+            self.expression_array = sc.pp.scale(self.expression_array, max_value=10)
         # Construct the neighboring graph
         self.graph_dict = construct_adjacency_matrix(adata, self.params)
@@ -103,6 +117,8 @@ def run_find_latent_representation(args: FindLatentRepresentationsConfig):
     # Load the ST data
     logger.info(f"Loading ST data of {args.sample_name}...")
     adata = sc.read_h5ad(args.input_hdf5_path)
+    sc.pp.filter_genes(adata, min_cells=1)
     logger.info(f"The ST data contains {adata.shape[0]} cells, {adata.shape[1]} genes.")
     # Load the cell type annotation
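
The new `--pearson_residuals` path stores the output of `sc.experimental.pp.normalize_pearson_residuals(..., clip=10)` in a layer and uses it instead of the scaled log-normalized matrix. For readers unfamiliar with the transform, the sketch below computes analytic Pearson residuals under a negative binomial null model in plain Python. It is an illustration of the general technique, not scanpy's implementation; the overdispersion `theta=100.0` is assumed here, and the function name is hypothetical. The sketch also assumes every cell and every gene has at least one count, since an all-zero row or column would give an expected count of zero and a division by zero.

```python
import math


def pearson_residuals(counts, theta=100.0, clip=10.0):
    """Analytic Pearson residuals for a cells-by-genes count matrix.

    Illustrative sketch only: `theta` is an assumed overdispersion value,
    and every row and column of `counts` must have a nonzero sum.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]

    residuals = []
    for i, row in enumerate(counts):
        out = []
        for j, x in enumerate(row):
            # Expected count if cell depth and gene abundance were independent.
            mu = row_sums[i] * col_sums[j] / total
            # Negative binomial variance: mu + mu^2 / theta.
            r = (x - mu) / math.sqrt(mu + mu * mu / theta)
            # Clip extreme residuals to [-clip, clip].
            out.append(max(-clip, min(clip, r)))
        residuals.append(out)
    return residuals


if __name__ == "__main__":
    toy_counts = [[5, 0, 1], [0, 3, 2], [1, 1, 4]]
    for row in pearson_residuals(toy_counts):
        print([round(value, 3) for value in row])
```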
