stgp-0.1.0.tar.gz

stgp-0.1.0/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Baichen Yu and contributors
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
stgp-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,53 @@
+ Metadata-Version: 2.4
+ Name: stgp
+ Version: 0.1.0
+ Summary: Spatiotemporal gene program discovery with Gaussian process priors
+ Author-email: Baichen Yu <mabyu@ust.hk>
+ Maintainer-email: Baichen Yu <mabyu@ust.hk>
+ License-Expression: MIT
+ Project-URL: Homepage, https://github.com/YangLabHKUST/stGP
+ Project-URL: Repository, https://github.com/YangLabHKUST/stGP
+ Keywords: spatial transcriptomics,gene programs,gaussian process,matrix factorization
+ Classifier: Development Status :: 3 - Alpha
+ Classifier: Intended Audience :: Science/Research
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: Programming Language :: Python :: 3.12
+ Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
+ Requires-Python: >=3.9
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: numpy<2.0,>=1.22
+ Requires-Dist: scipy>=1.10
+ Dynamic: license-file
+
+ # stGP: Characterizing dynamic tissue architectures by identifying cell-type-specific spatiotemporal gene programs
+
+ ## Introduction
+
+ stGP is a statistical framework for identifying interpretable cell-type-specific spatiotemporal gene programs (stGPs) from multi-sample spatiotemporal transcriptomic data measured across biological time by deciphering temporal trajectories and dynamic spatial patterns.
+
+ stGP's effectiveness relies on our innovations in the integration of Gaussian process priors and interpretable matrix factorization:
+
+ - stGP represents gene expression within each cell type as a small set of latent programs with non-negative gene loadings, making each program interpretable as a weighted gene set shared across samples. Variance components quantify the relative contributions of time and space to each program.
+ - stGP decomposes per-cell program activity into a sample-level temporal component that captures coordinated responses over biological time (e.g., age or stage), and a within-section spatial component that characterizes dynamic program deployment across tissue coordinates without requiring cross-section registration.
+ - For multi-program inference, stGP adopts a blockwise backfitting scheme that sequentially extracts rank-1 components from residuals, with automatic model selection to determine the number of programs.
+
+
+ ## Reference
+
+ If you find `stGP` useful for your work, please cite:
+ > Characterizing dynamic tissue architectures by identifying cell-type-specific spatiotemporal gene programs with stGP.
+ > Baichen Yu, Ziyue Tan, Xiaomeng Wan, Hansheng Wang, and Can Yang.
+ > Working paper, 2026.
+
+ ## Development
+
+ The software is developed and maintained by [Baichen Yu](mailto:mabyu@ust.hk).
+
+ ## Contact
+
+ Please feel free to contact [Baichen Yu](mailto:mabyu@ust.hk) or [Prof. Can Yang](mailto:macyang@ust.hk) with any inquiries.
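Note on the description above: it states that per-cell program activity splits into a sample-level temporal component and a within-section spatial component, and that variance components quantify the relative contributions of time and space. As a rough illustration of that idea only, the sketch below uses a per-sample mean as a stand-in for the temporal component and the within-sample residual as a stand-in for the spatial component, then reports the share of variance carried by the per-sample part. This is a crude moment-based analogue, not stGP's Gaussian process estimator; the names `decompose_activity` and `temporal_share` are hypothetical and do not come from the package.

```python
from collections import defaultdict


def decompose_activity(activity, sample_ids):
    """Split per-cell activity into a per-sample mean and a within-sample residual."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for a, s in zip(activity, sample_ids):
        sums[s] += a
        counts[s] += 1
    means = {s: sums[s] / counts[s] for s in sums}
    temporal = [means[s] for s in sample_ids]            # constant within a sample
    spatial = [a - m for a, m in zip(activity, temporal)]  # varies within a sample
    return temporal, spatial


def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def temporal_share(activity, sample_ids):
    """Fraction of total variance explained by the per-sample (temporal) part."""
    temporal, spatial = decompose_activity(activity, sample_ids)
    v_time, v_space = variance(temporal), variance(spatial)
    total = v_time + v_space
    return v_time / total if total > 0 else 0.0


# Toy example: two cells per sample, two samples at different ages.
activity = [1.0, 3.0, 5.0, 7.0]
sample_ids = ["young", "young", "old", "old"]
print(temporal_share(activity, sample_ids))
```

In this toy input most of the variation sits between samples, so the reported share is close to 1; a program driven by within-section structure would give a value close to 0.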
stgp-0.1.0/README.md ADDED
@@ -0,0 +1,82 @@
+ # Characterizing dynamic tissue architectures by identifying cell-type-specific spatiotemporal gene programs with stGP
+
+ This repository contains the code for **stGP** (spatiotemporal Gene Programs for spatial transcriptomics by Gaussian Process), a statistical framework for discovering spatiotemporally variable gene programs from aging spatial transcriptomic data.
+
+ ## Introduction
+
+ stGP is a statistical framework for identifying interpretable cell-type-specific spatiotemporal gene programs (stGPs) from multi-sample spatiotemporal transcriptomic data measured across biological time.
+
+ stGP's effectiveness relies on our innovations in the integration of Gaussian process priors and interpretable matrix factorization:
+
+ - stGP represents gene expression within each cell type as a small set of latent programs with non-negative gene loadings constrained to the simplex, making each program interpretable as a weighted gene set shared across samples.
+ - stGP decomposes per-cell program activity into a sample-level temporal component that captures coordinated responses over biological time (e.g., age or stage), and a within-section spatial component that characterizes local program deployment across tissue coordinates without requiring cross-section registration.
+ - Gaussian process priors smooth temporal trajectories along the biological-time covariate and spatial embeddings within each tissue section, while variance components quantify the relative contributions of time and space to each program.
+ - For multi-program inference, stGP adopts a blockwise backfitting scheme that sequentially extracts rank-1 components from residuals, with automatic model selection to determine the number of programs.
+
+ <figure>
+
+ <img src="FigureReproducing/Fig1_overview.png" style="width:85.0%"
+ alt="Overview" />
+ </figure>
+
+
+ ## Installation
+
+ ```bash
+ # Clone or copy the repository
+ cd stGP
+ conda env create -f stGP.yml
+ conda activate stGP
+ ```
+
+ ### Popari environment (separate)
+
+ Popari requires an isolated conda environment due to PyTorch/SIGFPE conflicts.
+
+ ```bash
+ git clone https://github.com/alam-shahul/popari.git
+ cd popari
+ conda create -n Popari python=3.12
+ conda activate Popari
+ pip install uv
+ uv pip install ".[mlflow,simulation]"
+ pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu126
+ cd ..
+ ```
+
+ ## Quick start: simulation
+
+ ```bash
+ cd Simulation
+
+ # Count-regime multi-program recovery
+ python simu_gaussian.py --stage all --methods auto --reps 50
+ ```
+
+ ## Quick start: real data
+
+ ```bash
+ cd RealData_MouseBrainMERFISH
+
+ # Run steps individually for one cell type
+ python3 01_preprocess_qc.py --output data/qc/aging_coronal_qc.h5ad
+ python3 02_run_stgp.py --celltypes Microglia
+ python3 04_make_figures.py --celltype Microglia
+
+ # Baselines and downstream analyses are documented in RealData_MouseBrainMERFISH/README.md
+ ```
+
+ ## Reference
+
+ If you find `stGP` or any of the source code in this repository useful for your work, please cite:
+ > Characterizing dynamic tissue architectures by identifying cell-type-specific spatiotemporal gene programs with stGP.
+ > Baichen Yu, Ziyue Tan, Xiaomeng Wan, Hansheng Wang, and Can Yang.
+ > Working paper, 2026.
+
+ ## Development
+
+ The software is developed and maintained by [Baichen Yu](mailto:mabyu@ust.hk).
+
+ ## Contact
+
+ Please feel free to contact [Baichen Yu](mailto:mabyu@ust.hk) or [Prof. Can Yang](mailto:macyang@ust.hk) with any inquiries.
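Note on the README above: it says gene loadings are non-negative and constrained to the simplex. One common way to enforce such a constraint is Euclidean projection onto the probability simplex with the standard sort-and-threshold algorithm, sketched below using only the standard library. This is an assumption about how the constraint could be enforced, not a copy of the package's `project_simplex`, whose body and signature are not part of this diff; the name `project_onto_simplex` is hypothetical.

```python
def project_onto_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) == 1}.

    Sort-based algorithm: find the largest k such that the k-th largest
    entry stays positive after subtracting a common shift theta, then
    shift every entry by theta and clip at zero.
    """
    if not v:
        raise ValueError("v must be non-empty")
    u = sorted(v, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        running_sum += u_k
        candidate = (running_sum - 1.0) / k
        if u_k - candidate > 0:
            theta = candidate  # condition holds on a prefix; keep the last hit
    return [max(x - theta, 0.0) for x in v]


# An unconstrained loading vector becomes a weighted gene set summing to 1.
loadings = project_onto_simplex([0.2, -0.4, 1.0])
print(loadings, sum(loadings))
```

After projection, each entry can be read as the weight of a gene within the program, which is what makes the loading vector interpretable as a weighted gene set.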
stgp-0.1.0/pyproject.toml ADDED
@@ -0,0 +1,45 @@
+ [build-system]
+ requires = ["setuptools>=77", "wheel"]
+ build-backend = "setuptools.build_meta"
+
+ [project]
+ name = "stgp"
+ version = "0.1.0"
+ description = "Spatiotemporal gene program discovery with Gaussian process priors"
+ readme = "stgp/README.md"
+ requires-python = ">=3.9"
+ license = "MIT"
+ authors = [
+     { name = "Baichen Yu", email = "mabyu@ust.hk" },
+ ]
+ maintainers = [
+     { name = "Baichen Yu", email = "mabyu@ust.hk" },
+ ]
+ dependencies = [
+     "numpy>=1.22,<2.0",
+     "scipy>=1.10",
+ ]
+ keywords = [
+     "spatial transcriptomics",
+     "gene programs",
+     "gaussian process",
+     "matrix factorization",
+ ]
+ classifiers = [
+     "Development Status :: 3 - Alpha",
+     "Intended Audience :: Science/Research",
+     "Operating System :: OS Independent",
+     "Programming Language :: Python :: 3",
+     "Programming Language :: Python :: 3.9",
+     "Programming Language :: Python :: 3.10",
+     "Programming Language :: Python :: 3.11",
+     "Programming Language :: Python :: 3.12",
+     "Topic :: Scientific/Engineering :: Bio-Informatics",
+ ]
+
+ [project.urls]
+ Homepage = "https://github.com/YangLabHKUST/stGP"
+ Repository = "https://github.com/YangLabHKUST/stGP"
+
+ [tool.setuptools.packages.find]
+ include = ["stgp*"]
stgp-0.1.0/setup.cfg ADDED
@@ -0,0 +1,4 @@
+ [egg_info]
+ tag_build =
+ tag_date = 0
+
stgp-0.1.0/stgp/README.md ADDED
@@ -0,0 +1,27 @@
+ # stGP: Characterizing dynamic tissue architectures by identifying cell-type-specific spatiotemporal gene programs
+
+ ## Introduction
+
+ stGP is a statistical framework for identifying interpretable cell-type-specific spatiotemporal gene programs (stGPs) from multi-sample spatiotemporal transcriptomic data measured across biological time by deciphering temporal trajectories and dynamic spatial patterns.
+
+ stGP's effectiveness relies on our innovations in the integration of Gaussian process priors and interpretable matrix factorization:
+
+ - stGP represents gene expression within each cell type as a small set of latent programs with non-negative gene loadings, making each program interpretable as a weighted gene set shared across samples. Variance components quantify the relative contributions of time and space to each program.
+ - stGP decomposes per-cell program activity into a sample-level temporal component that captures coordinated responses over biological time (e.g., age or stage), and a within-section spatial component that characterizes dynamic program deployment across tissue coordinates without requiring cross-section registration.
+ - For multi-program inference, stGP adopts a blockwise backfitting scheme that sequentially extracts rank-1 components from residuals, with automatic model selection to determine the number of programs.
+
+
+ ## Reference
+
+ If you find `stGP` useful for your work, please cite:
+ > Characterizing dynamic tissue architectures by identifying cell-type-specific spatiotemporal gene programs with stGP.
+ > Baichen Yu, Ziyue Tan, Xiaomeng Wan, Hansheng Wang, and Can Yang.
+ > Working paper, 2026.
+
+ ## Development
+
+ The software is developed and maintained by [Baichen Yu](mailto:mabyu@ust.hk).
+
+ ## Contact
+
+ Please feel free to contact [Baichen Yu](mailto:mabyu@ust.hk) or [Prof. Can Yang](mailto:macyang@ust.hk) with any inquiries.
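Note on the README above: it describes multi-program inference as sequentially extracting rank-1 components from residuals. The sketch below shows that residual-deflation pattern in its plainest form: find a rank-1 fit to the current residual by power iteration, subtract it, and repeat. It is a generic stand-in written with the standard library only. It omits the Gaussian process priors, the non-negativity constraints, the repeated backfitting sweeps, and the automatic choice of the number of programs; `leading_singular_triplet`, `extract_rank1_programs`, and `frobenius_norm` are hypothetical names, not functions from the package.

```python
import math
import random


def leading_singular_triplet(R, n_iter=200, seed=0):
    """Approximate the top singular value and vectors of R by power iteration."""
    n_rows, n_cols = len(R), len(R[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]  # random start avoids bad symmetry
    norm_v = math.sqrt(sum(x * x for x in v))
    v = [x / norm_v for x in v]
    u, sigma = [0.0] * n_rows, 0.0
    for _ in range(n_iter):
        u = [sum(R[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm_u = math.sqrt(sum(x * x for x in u))
        if norm_u == 0.0:  # residual has no signal along v
            return 0.0, [0.0] * n_rows, v
        u = [x / norm_u for x in u]
        w = [sum(R[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        sigma = math.sqrt(sum(x * x for x in w))
        if sigma == 0.0:
            return 0.0, u, v
        v = [x / sigma for x in w]
    return sigma, u, v


def extract_rank1_programs(Y, n_programs):
    """Greedily peel rank-1 components off a copy of Y; return them and the residual."""
    residual = [row[:] for row in Y]
    programs = []
    for _ in range(n_programs):
        sigma, u, v = leading_singular_triplet(residual)
        programs.append((sigma, u, v))
        for i, row in enumerate(residual):
            for j in range(len(row)):
                row[j] -= sigma * u[i] * v[j]  # deflate the fitted component
    return programs, residual


def frobenius_norm(M):
    """Square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in M for x in row))


Y = [[3.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
programs, residual = extract_rank1_programs(Y, n_programs=2)
print([round(sigma, 6) for sigma, _, _ in programs], frobenius_norm(residual))
```

A stopping rule based on how much each new component reduces the residual norm would be one simple way to choose the number of programs, though the diff does not show which criterion the package uses.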
stgp-0.1.0/stgp/__init__.py ADDED
@@ -0,0 +1,48 @@
+ from stgp.estimation import (
+     fit_rank1,
+     fit_pfactor,
+     fit_pfactor_auto,
+     project_simplex,
+     project_simplex_topk,
+     recover_low_rank_signal,
+     align_programs_and_mse,
+ )
+
+ from stgp.kernels import (
+     build_K_age,
+     build_K_spa,
+     build_K_spa_list,
+     build_K_spa_list_from_stacked,
+     rbf_kernel_1d,
+     ar1_kernel_1d,
+ )
+
+ from stgp.preprocessing import (
+     log1p_normalize,
+     log1p_norm_centered_list,
+     demean_genes,
+     library_normalize,
+ )
+
+ __version__ = "0.1.0"
+
+ __all__ = [
+     "__version__",
+     "fit_rank1",
+     "fit_pfactor",
+     "fit_pfactor_auto",
+     "project_simplex",
+     "project_simplex_topk",
+     "recover_low_rank_signal",
+     "align_programs_and_mse",
+     "build_K_age",
+     "build_K_spa",
+     "build_K_spa_list",
+     "build_K_spa_list_from_stacked",
+     "rbf_kernel_1d",
+     "ar1_kernel_1d",
+     "log1p_normalize",
+     "log1p_norm_centered_list",
+     "demean_genes",
+     "library_normalize",
+ ]
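Note on the exports above: the names `rbf_kernel_1d` and `ar1_kernel_1d` suggest one-dimensional covariance kernels over a time-like covariate, but their bodies and signatures are not part of this diff. The sketch below gives the textbook forms of the two kernels so the names can be read in context. The function names `rbf_kernel_sketch` and `ar1_kernel_sketch`, and the parameters `lengthscale` and `rho`, are assumptions made for illustration and may not match the package.

```python
import math


def rbf_kernel_sketch(t, lengthscale=1.0):
    """Squared-exponential kernel: K[i][j] = exp(-(t_i - t_j)^2 / (2 * lengthscale^2))."""
    if lengthscale <= 0:
        raise ValueError("lengthscale must be positive")
    scale = 2.0 * lengthscale ** 2
    return [[math.exp(-((a - b) ** 2) / scale) for b in t] for a in t]


def ar1_kernel_sketch(t, rho=0.5):
    """AR(1)-style kernel: K[i][j] = rho ** |t_i - t_j|, with 0 < rho < 1."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    return [[rho ** abs(a - b) for b in t] for a in t]


# Ages (in arbitrary units) of three samples; nearby ages are more correlated.
ages = [0, 1, 3]
print(ar1_kernel_sketch(ages, rho=0.5)[0])  # → [1.0, 0.5, 0.125]
print(rbf_kernel_sketch(ages, lengthscale=1.0)[0])
```

Both kernels equal 1 on the diagonal and decay with the distance between time points; the RBF form yields smoother trajectories, while the AR(1) form decays geometrically and allows rougher ones.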