pystaar-1.0.0.tar.gz
- pystaar-1.0.0/LICENSE +6 -0
- pystaar-1.0.0/PKG-INFO +104 -0
- pystaar-1.0.0/README.md +71 -0
- pystaar-1.0.0/pyproject.toml +61 -0
- pystaar-1.0.0/setup.cfg +4 -0
- pystaar-1.0.0/src/pystaar/__init__.py +104 -0
- pystaar-1.0.0/src/pystaar/_data/DATA_SOURCE.md +144 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_dense_s1_b1.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_dense_s1_b2.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_dense_s2_b1.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_dense_s2_b2.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_sparse_s1_b1.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_sparse_s1_b2.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_sparse_s2_b1.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_cov_sparse_s2_b2.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_pop_groups.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_pop_weights_1_1.csv +4 -0
- pystaar-1.0.0/src/pystaar/_data/example_ai_pop_weights_1_25.csv +4 -0
- pystaar-1.0.0/src/pystaar/_data/example_geno.mtx +7488 -0
- pystaar-1.0.0/src/pystaar/_data/example_glm_binary_spa_cov_filter.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_dense_XW.csv +4 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_dense_XXWX_inv.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_dense_cov_filter.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_dense_fitted.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_dense_scaled_residuals.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_sparse_XW.csv +4 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_sparse_XXWX_inv.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_sparse_cov_filter.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_sparse_fitted.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_binary_spa_sparse_scaled_residuals.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_cov.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_cov_cond_dense.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_cov_cond_sparse.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_cov_rare_maf_0_01.csv +154 -0
- pystaar-1.0.0/src/pystaar/_data/example_glmmkin_scaled_residuals.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_kins_dense.mtx +22502 -0
- pystaar-1.0.0/src/pystaar/_data/example_kins_sparse.mtx +22502 -0
- pystaar-1.0.0/src/pystaar/_data/example_pheno_related.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_pheno_unrelated.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/example_phred.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample601_geno.mtx +7638 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample601_kins_dense.mtx +22502 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample601_kins_sparse.mtx +22502 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample601_pheno_related.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample601_pheno_unrelated.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample601_phred.csv +145 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample602_geno.mtx +6176 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample602_kins_dense.mtx +22502 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample602_kins_sparse.mtx +22502 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample602_pheno_related.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample602_pheno_unrelated.csv +10001 -0
- pystaar-1.0.0/src/pystaar/_data/nonexample602_phred.csv +148 -0
- pystaar-1.0.0/src/pystaar/_data/r_cov.csv +164 -0
- pystaar-1.0.0/src/pystaar/_data/r_scaled_residuals.csv +10001 -0
- pystaar-1.0.0/src/pystaar/data.py +202 -0
- pystaar-1.0.0/src/pystaar/models.py +541 -0
- pystaar-1.0.0/src/pystaar/staar_core.py +1873 -0
- pystaar-1.0.0/src/pystaar/staar_stats.py +147 -0
- pystaar-1.0.0/src/pystaar/workflows.py +1454 -0
- pystaar-1.0.0/src/pystaar.egg-info/PKG-INFO +104 -0
- pystaar-1.0.0/src/pystaar.egg-info/SOURCES.txt +68 -0
- pystaar-1.0.0/src/pystaar.egg-info/dependency_links.txt +1 -0
- pystaar-1.0.0/src/pystaar.egg-info/requires.txt +7 -0
- pystaar-1.0.0/src/pystaar.egg-info/top_level.txt +1 -0
- pystaar-1.0.0/tests/test_api_contract.py +53 -0
- pystaar-1.0.0/tests/test_data.py +24 -0
- pystaar-1.0.0/tests/test_models.py +102 -0
- pystaar-1.0.0/tests/test_staar_core.py +73 -0
- pystaar-1.0.0/tests/test_staar_stats.py +33 -0
- pystaar-1.0.0/tests/test_workflows.py +1121 -0
pystaar-1.0.0/LICENSE
ADDED
|
@@ -0,0 +1,6 @@
|
|
|
1
|
+
Copyright (c) 2026 pySTAAR contributors.
|
|
2
|
+
All rights reserved.
|
|
3
|
+
|
|
4
|
+
This software is proprietary and confidential. No permission is granted to use,
|
|
5
|
+
copy, modify, merge, publish, distribute, sublicense, or sell this software
|
|
6
|
+
without prior written authorization from the copyright holder.
|
pystaar-1.0.0/PKG-INFO
ADDED
|
@@ -0,0 +1,104 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: pystaar
|
|
3
|
+
Version: 1.0.0
|
|
4
|
+
Summary: Python migration of the STAAR R package
|
|
5
|
+
Author: STAAR migration
|
|
6
|
+
License-Expression: LicenseRef-Proprietary
|
|
7
|
+
Project-URL: Homepage, https://github.com/xiaozhouwang/pySTAAR
|
|
8
|
+
Project-URL: Repository, https://github.com/xiaozhouwang/pySTAAR
|
|
9
|
+
Project-URL: Documentation, https://github.com/xiaozhouwang/pySTAAR/tree/main/docs
|
|
10
|
+
Project-URL: Issues, https://github.com/xiaozhouwang/pySTAAR/issues
|
|
11
|
+
Keywords: staar,genomics,genetics,rare-variant,association-testing
|
|
12
|
+
Classifier: Development Status :: 5 - Production/Stable
|
|
13
|
+
Classifier: Intended Audience :: Science/Research
|
|
14
|
+
Classifier: Operating System :: OS Independent
|
|
15
|
+
Classifier: Programming Language :: Python :: 3
|
|
16
|
+
Classifier: Programming Language :: Python :: 3 :: Only
|
|
17
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
18
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
19
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
20
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
21
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
22
|
+
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
|
|
23
|
+
Requires-Python: >=3.9
|
|
24
|
+
Description-Content-Type: text/markdown
|
|
25
|
+
License-File: LICENSE
|
|
26
|
+
Requires-Dist: numpy>=1.24
|
|
27
|
+
Requires-Dist: scipy>=1.10
|
|
28
|
+
Requires-Dist: pandas>=2.0
|
|
29
|
+
Requires-Dist: pyyaml>=6.0
|
|
30
|
+
Provides-Extra: dev
|
|
31
|
+
Requires-Dist: pytest>=7.0; extra == "dev"
|
|
32
|
+
Dynamic: license-file
|
|
33
|
+
|
|
34
|
+
# pySTAAR
|
|
35
|
+
|
|
36
|
+
A Python migration of the STAAR R package, aimed at Chinese-speaking users in statistical genetics and genomic analysis.
|
|
37
|
+
|
|
38
|
+
For English docs, see [`docs/README.md`](docs/README.md).
|
|
39
|
+
|
|
40
|
+
## Project Scope
|
|
41
|
+
|
|
42
|
+
- Migration of all planned features is complete (through `STAAR-56`).
|
|
43
|
+
- Default workflow entry points cover STAAR, conditional analysis, Binary SPA, single-variant score tests, and AI-STAAR.
|
|
44
|
+
- The current parity baseline is the pure-Python path (related-sample workflows do not depend on precomputed R covariance files).
|
|
45
|
+
|
|
46
|
+
## Quick Install
|
|
47
|
+
|
|
48
|
+
For regular users (released version):
|
|
49
|
+
|
|
50
|
+
```bash
|
|
51
|
+
pip install pystaar
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
Development mode from this repository:
|
|
55
|
+
|
|
56
|
+
```bash
|
|
57
|
+
pip install -e '.[dev]'
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
## Quick Start
|
|
61
|
+
|
|
62
|
+
```python
|
|
63
|
+
from pystaar import staar_unrelated_glm
|
|
64
|
+
|
|
65
|
+
res = staar_unrelated_glm(
|
|
66
|
+
    dataset="example",
|
|
67
|
+
    seed=600,
|
|
68
|
+
    rare_maf_cutoff=0.05,
|
|
69
|
+
)
|
|
70
|
+
print("STAAR-O:", res["results_STAAR_O"])
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
## Migration Entry Points for R Users
|
|
74
|
+
|
|
75
|
+
- Full migration guide: [`docs/migration_from_r.md`](docs/migration_from_r.md)
|
|
76
|
+
- 15-minute migration checklist: [`docs/migration_r_quickstart_cn.md`](docs/migration_r_quickstart_cn.md)
|
|
77
|
+
- Data directory template: [`docs/data_directory_template_cn.md`](docs/data_directory_template_cn.md)
|
|
78
|
+
|
|
79
|
+
## Documentation Index
|
|
80
|
+
|
|
81
|
+
- Quick start (Chinese): [`docs/README_CN.md`](docs/README_CN.md)
|
|
82
|
+
- Quick start (English): [`docs/README.md`](docs/README.md)
|
|
83
|
+
- Installation and environment: [`docs/installation.md`](docs/installation.md)
|
|
84
|
+
- Performance comparison overview (Python vs R): [`docs/performance_comparison.md`](docs/performance_comparison.md)
|
|
85
|
+
- Benchmark basis: official cross-platform conclusions are based on the OpenBLAS backend; for local macOS Accelerate reference numbers, see `examples/1kg_parity/README.md`.
|
|
86
|
+
- Local 1KG comparison example (data-level plus simulated full workflow): [`examples/1kg_parity/README.md`](examples/1kg_parity/README.md)
|
|
87
|
+
- Tutorials:
|
|
88
|
+
- [`docs/tutorials/01_basic_staar.md`](docs/tutorials/01_basic_staar.md)
|
|
89
|
+
- [`docs/tutorials/02_binary_spa.md`](docs/tutorials/02_binary_spa.md)
|
|
90
|
+
- [`docs/tutorials/03_related_samples.md`](docs/tutorials/03_related_samples.md)
|
|
91
|
+
- [`docs/tutorials/04_conditional.md`](docs/tutorials/04_conditional.md)
|
|
92
|
+
- [`docs/tutorials/05_ai_staar.md`](docs/tutorials/05_ai_staar.md)
|
|
93
|
+
- API documentation:
|
|
94
|
+
- [`docs/api/null_models.md`](docs/api/null_models.md)
|
|
95
|
+
- [`docs/api/staar_functions.md`](docs/api/staar_functions.md)
|
|
96
|
+
- [`docs/api/output_fields.md`](docs/api/output_fields.md)
|
|
97
|
+
- [`docs/api/utilities.md`](docs/api/utilities.md)
|
|
98
|
+
- [`docs/api/stability.md`](docs/api/stability.md)
|
|
99
|
+
- Changelog: [`CHANGELOG.md`](CHANGELOG.md)
|
|
100
|
+
|
|
101
|
+
## Parity Notes
|
|
102
|
+
|
|
103
|
+
- The historical deviation record `DEV-001` is closed and is kept only for historical context.
|
|
104
|
+
- For current status and release criteria, [`reports/summary.md`](reports/summary.md) and [`reports/deviations.md`](reports/deviations.md) are authoritative.
|
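The header fields at the top of the PKG-INFO hunk above (`Name`, `Version`, `Requires-Dist`, and so on) use an RFC 822-style layout, so they can be read with Python's standard `email` parser. The sketch below is illustrative only: the excerpt is copied from the fields shown in this diff, and `split_requirements` is a hypothetical helper name, not part of pystaar.

```python
from email import message_from_string

# Excerpt of the PKG-INFO header shown in this diff (fields copied verbatim).
PKG_INFO_EXCERPT = """\
Metadata-Version: 2.4
Name: pystaar
Version: 1.0.0
Requires-Python: >=3.9
Requires-Dist: numpy>=1.24
Requires-Dist: scipy>=1.10
Requires-Dist: pandas>=2.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"

# pySTAAR
"""


def split_requirements(pkg_info_text):
    """Hypothetical helper: return (name, version, runtime, extras)."""
    msg = message_from_string(pkg_info_text)
    runtime, extras = [], []
    for req in msg.get_all("Requires-Dist", []):
        # Text after ';' is an environment marker; a marker that mentions
        # 'extra' ties the requirement to an optional extra such as "dev".
        marker = req.partition(";")[2]
        (extras if "extra" in marker else runtime).append(req.strip())
    return msg["Name"], msg["Version"], runtime, extras


name, version, runtime, extras = split_requirements(PKG_INFO_EXCERPT)
print(name, version)
print("runtime:", runtime)
print("extras:", extras)
```

Separating marker-gated requirements from unconditional ones makes it easy to see that a plain `pip install pystaar` pulls in only the four runtime dependencies, while `pytest` is installed only with the `dev` extra.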
pystaar-1.0.0/README.md
ADDED
|
@@ -0,0 +1,71 @@
|
|
|
1
|
+
# pySTAAR
|
|
2
|
+
|
|
3
|
+
A Python migration of the STAAR R package, aimed at Chinese-speaking users in statistical genetics and genomic analysis.
|
|
4
|
+
|
|
5
|
+
For English docs, see [`docs/README.md`](docs/README.md).
|
|
6
|
+
|
|
7
|
+
## Project Scope
|
|
8
|
+
|
|
9
|
+
- Migration of all planned features is complete (through `STAAR-56`).
|
|
10
|
+
- Default workflow entry points cover STAAR, conditional analysis, Binary SPA, single-variant score tests, and AI-STAAR.
|
|
11
|
+
- The current parity baseline is the pure-Python path (related-sample workflows do not depend on precomputed R covariance files).
|
|
12
|
+
|
|
13
|
+
## Quick Install
|
|
14
|
+
|
|
15
|
+
For regular users (released version):
|
|
16
|
+
|
|
17
|
+
```bash
|
|
18
|
+
pip install pystaar
|
|
19
|
+
```
|
|
20
|
+
|
|
21
|
+
Development mode from this repository:
|
|
22
|
+
|
|
23
|
+
```bash
|
|
24
|
+
pip install -e '.[dev]'
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
## Quick Start
|
|
28
|
+
|
|
29
|
+
```python
|
|
30
|
+
from pystaar import staar_unrelated_glm
|
|
31
|
+
|
|
32
|
+
res = staar_unrelated_glm(
|
|
33
|
+
    dataset="example",
|
|
34
|
+
    seed=600,
|
|
35
|
+
    rare_maf_cutoff=0.05,
|
|
36
|
+
)
|
|
37
|
+
print("STAAR-O:", res["results_STAAR_O"])
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
## Migration Entry Points for R Users
|
|
41
|
+
|
|
42
|
+
- Full migration guide: [`docs/migration_from_r.md`](docs/migration_from_r.md)
|
|
43
|
+
- 15-minute migration checklist: [`docs/migration_r_quickstart_cn.md`](docs/migration_r_quickstart_cn.md)
|
|
44
|
+
- Data directory template: [`docs/data_directory_template_cn.md`](docs/data_directory_template_cn.md)
|
|
45
|
+
|
|
46
|
+
## Documentation Index
|
|
47
|
+
|
|
48
|
+
- Quick start (Chinese): [`docs/README_CN.md`](docs/README_CN.md)
|
|
49
|
+
- Quick start (English): [`docs/README.md`](docs/README.md)
|
|
50
|
+
- Installation and environment: [`docs/installation.md`](docs/installation.md)
|
|
51
|
+
- Performance comparison overview (Python vs R): [`docs/performance_comparison.md`](docs/performance_comparison.md)
|
|
52
|
+
- Benchmark basis: official cross-platform conclusions are based on the OpenBLAS backend; for local macOS Accelerate reference numbers, see `examples/1kg_parity/README.md`.
|
|
53
|
+
- Local 1KG comparison example (data-level plus simulated full workflow): [`examples/1kg_parity/README.md`](examples/1kg_parity/README.md)
|
|
54
|
+
- Tutorials:
|
|
55
|
+
- [`docs/tutorials/01_basic_staar.md`](docs/tutorials/01_basic_staar.md)
|
|
56
|
+
- [`docs/tutorials/02_binary_spa.md`](docs/tutorials/02_binary_spa.md)
|
|
57
|
+
- [`docs/tutorials/03_related_samples.md`](docs/tutorials/03_related_samples.md)
|
|
58
|
+
- [`docs/tutorials/04_conditional.md`](docs/tutorials/04_conditional.md)
|
|
59
|
+
- [`docs/tutorials/05_ai_staar.md`](docs/tutorials/05_ai_staar.md)
|
|
60
|
+
- API documentation:
|
|
61
|
+
- [`docs/api/null_models.md`](docs/api/null_models.md)
|
|
62
|
+
- [`docs/api/staar_functions.md`](docs/api/staar_functions.md)
|
|
63
|
+
- [`docs/api/output_fields.md`](docs/api/output_fields.md)
|
|
64
|
+
- [`docs/api/utilities.md`](docs/api/utilities.md)
|
|
65
|
+
- [`docs/api/stability.md`](docs/api/stability.md)
|
|
66
|
+
- Changelog: [`CHANGELOG.md`](CHANGELOG.md)
|
|
67
|
+
|
|
68
|
+
## Parity Notes
|
|
69
|
+
|
|
70
|
+
- The historical deviation record `DEV-001` is closed and is kept only for historical context.
|
|
71
|
+
- For current status and release criteria, [`reports/summary.md`](reports/summary.md) and [`reports/deviations.md`](reports/deviations.md) are authoritative.
|
pystaar-1.0.0/pyproject.toml
ADDED
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
[build-system]
|
|
2
|
+
requires = ["setuptools>=77"]
|
|
3
|
+
build-backend = "setuptools.build_meta"
|
|
4
|
+
|
|
5
|
+
[project]
|
|
6
|
+
name = "pystaar"
|
|
7
|
+
version = "1.0.0"
|
|
8
|
+
description = "Python migration of the STAAR R package"
|
|
9
|
+
readme = "README.md"
|
|
10
|
+
requires-python = ">=3.9"
|
|
11
|
+
license = "LicenseRef-Proprietary"
|
|
12
|
+
license-files = ["LICENSE"]
|
|
13
|
+
authors = [
|
|
14
|
+
{ name = "STAAR migration" }
|
|
15
|
+
]
|
|
16
|
+
keywords = [
|
|
17
|
+
"staar",
|
|
18
|
+
"genomics",
|
|
19
|
+
"genetics",
|
|
20
|
+
"rare-variant",
|
|
21
|
+
"association-testing",
|
|
22
|
+
]
|
|
23
|
+
classifiers = [
|
|
24
|
+
"Development Status :: 5 - Production/Stable",
|
|
25
|
+
"Intended Audience :: Science/Research",
|
|
26
|
+
"Operating System :: OS Independent",
|
|
27
|
+
"Programming Language :: Python :: 3",
|
|
28
|
+
"Programming Language :: Python :: 3 :: Only",
|
|
29
|
+
"Programming Language :: Python :: 3.9",
|
|
30
|
+
"Programming Language :: Python :: 3.10",
|
|
31
|
+
"Programming Language :: Python :: 3.11",
|
|
32
|
+
"Programming Language :: Python :: 3.12",
|
|
33
|
+
"Programming Language :: Python :: 3.13",
|
|
34
|
+
"Topic :: Scientific/Engineering :: Bio-Informatics",
|
|
35
|
+
]
|
|
36
|
+
dependencies = [
|
|
37
|
+
"numpy>=1.24",
|
|
38
|
+
"scipy>=1.10",
|
|
39
|
+
"pandas>=2.0",
|
|
40
|
+
"pyyaml>=6.0",
|
|
41
|
+
]
|
|
42
|
+
|
|
43
|
+
[project.urls]
|
|
44
|
+
Homepage = "https://github.com/xiaozhouwang/pySTAAR"
|
|
45
|
+
Repository = "https://github.com/xiaozhouwang/pySTAAR"
|
|
46
|
+
Documentation = "https://github.com/xiaozhouwang/pySTAAR/tree/main/docs"
|
|
47
|
+
Issues = "https://github.com/xiaozhouwang/pySTAAR/issues"
|
|
48
|
+
|
|
49
|
+
[project.optional-dependencies]
|
|
50
|
+
dev = [
|
|
51
|
+
"pytest>=7.0",
|
|
52
|
+
]
|
|
53
|
+
|
|
54
|
+
[tool.setuptools]
|
|
55
|
+
package-dir = {"" = "src"}
|
|
56
|
+
|
|
57
|
+
[tool.setuptools.packages.find]
|
|
58
|
+
where = ["src"]
|
|
59
|
+
|
|
60
|
+
[tool.setuptools.package-data]
|
|
61
|
+
pystaar = ["_data/*"]
|
pystaar-1.0.0/setup.cfg
ADDED
|
pystaar-1.0.0/src/pystaar/__init__.py
ADDED
|
@@ -0,0 +1,104 @@
|
|
|
1
|
+
"""Python migration of the STAAR R package.
|
|
2
|
+
|
|
3
|
+
Public API includes:
|
|
4
|
+
- R-compatible names (e.g., ``STAAR`` / ``AI_STAAR`` / ``CCT``).
|
|
5
|
+
- Python workflow entry points used by parity scenarios.
|
|
6
|
+
"""
|
|
7
|
+
|
|
8
|
+
from .models import (
|
|
9
|
+
fit_null_glm,
|
|
10
|
+
fit_null_glm_binary_spa,
|
|
11
|
+
fit_null_glmmkin,
|
|
12
|
+
fit_null_glmmkin_binary_spa,
|
|
13
|
+
)
|
|
14
|
+
from .staar_core import (
|
|
15
|
+
_get_eigensolver_runtime_info,
|
|
16
|
+
ai_staar,
|
|
17
|
+
indiv_score_test_region,
|
|
18
|
+
indiv_score_test_region_cond,
|
|
19
|
+
matrix_flip,
|
|
20
|
+
staar,
|
|
21
|
+
staar_binary_spa,
|
|
22
|
+
staar_cond,
|
|
23
|
+
)
|
|
24
|
+
from .staar_stats import cct
|
|
25
|
+
from .workflows import (
|
|
26
|
+
ai_staar_related_dense_glmmkin,
|
|
27
|
+
ai_staar_related_dense_glmmkin_find_weight,
|
|
28
|
+
ai_staar_related_sparse_glmmkin,
|
|
29
|
+
ai_staar_related_sparse_glmmkin_find_weight,
|
|
30
|
+
ai_staar_unrelated_glm,
|
|
31
|
+
ai_staar_unrelated_glm_find_weight,
|
|
32
|
+
clear_runtime_caches,
|
|
33
|
+
get_runtime_cache_info,
|
|
34
|
+
indiv_score_related_dense_glmmkin,
|
|
35
|
+
indiv_score_related_dense_glmmkin_cond,
|
|
36
|
+
indiv_score_related_sparse_glmmkin,
|
|
37
|
+
indiv_score_related_sparse_glmmkin_cond,
|
|
38
|
+
indiv_score_unrelated_glm,
|
|
39
|
+
indiv_score_unrelated_glm_cond,
|
|
40
|
+
staar_related_dense_binary_spa,
|
|
41
|
+
staar_related_dense_glmmkin,
|
|
42
|
+
staar_related_dense_glmmkin_cond,
|
|
43
|
+
staar_related_sparse_binary_spa,
|
|
44
|
+
staar_related_sparse_glmmkin,
|
|
45
|
+
staar_related_sparse_glmmkin_cond,
|
|
46
|
+
staar_unrelated_binary_spa,
|
|
47
|
+
staar_unrelated_glm,
|
|
48
|
+
staar_unrelated_glm_cond,
|
|
49
|
+
)
|
|
50
|
+
|
|
51
|
+
# R-compatible aliases from STAAR NAMESPACE exports.
|
|
52
|
+
CCT = cct
|
|
53
|
+
fit_null_glm_Binary_SPA = fit_null_glm_binary_spa
|
|
54
|
+
fit_null_glmmkin_Binary_SPA = fit_null_glmmkin_binary_spa
|
|
55
|
+
STAAR = staar
|
|
56
|
+
STAAR_cond = staar_cond
|
|
57
|
+
Indiv_Score_Test_Region = indiv_score_test_region
|
|
58
|
+
Indiv_Score_Test_Region_cond = indiv_score_test_region_cond
|
|
59
|
+
STAAR_Binary_SPA = staar_binary_spa
|
|
60
|
+
AI_STAAR = ai_staar
|
|
61
|
+
|
|
62
|
+
__all__ = [
|
|
63
|
+
"CCT",
|
|
64
|
+
"fit_null_glm",
|
|
65
|
+
"fit_null_glmmkin",
|
|
66
|
+
"fit_null_glm_Binary_SPA",
|
|
67
|
+
"fit_null_glmmkin_Binary_SPA",
|
|
68
|
+
"matrix_flip",
|
|
69
|
+
"STAAR",
|
|
70
|
+
"STAAR_cond",
|
|
71
|
+
"STAAR_Binary_SPA",
|
|
72
|
+
"Indiv_Score_Test_Region",
|
|
73
|
+
"Indiv_Score_Test_Region_cond",
|
|
74
|
+
"AI_STAAR",
|
|
75
|
+
"staar_unrelated_glm",
|
|
76
|
+
"staar_related_sparse_glmmkin",
|
|
77
|
+
"staar_related_dense_glmmkin",
|
|
78
|
+
"staar_unrelated_glm_cond",
|
|
79
|
+
"staar_related_sparse_glmmkin_cond",
|
|
80
|
+
"staar_related_dense_glmmkin_cond",
|
|
81
|
+
"staar_unrelated_binary_spa",
|
|
82
|
+
"staar_related_sparse_binary_spa",
|
|
83
|
+
"staar_related_dense_binary_spa",
|
|
84
|
+
"indiv_score_unrelated_glm",
|
|
85
|
+
"indiv_score_related_sparse_glmmkin",
|
|
86
|
+
"indiv_score_related_dense_glmmkin",
|
|
87
|
+
"indiv_score_unrelated_glm_cond",
|
|
88
|
+
"indiv_score_related_sparse_glmmkin_cond",
|
|
89
|
+
"indiv_score_related_dense_glmmkin_cond",
|
|
90
|
+
"ai_staar_unrelated_glm",
|
|
91
|
+
"ai_staar_related_sparse_glmmkin",
|
|
92
|
+
"ai_staar_related_dense_glmmkin",
|
|
93
|
+
"ai_staar_unrelated_glm_find_weight",
|
|
94
|
+
"ai_staar_related_sparse_glmmkin_find_weight",
|
|
95
|
+
"ai_staar_related_dense_glmmkin_find_weight",
|
|
96
|
+
"get_runtime_cache_info",
|
|
97
|
+
"clear_runtime_caches",
|
|
98
|
+
"get_eigensolver_runtime_info",
|
|
99
|
+
]
|
|
100
|
+
|
|
101
|
+
|
|
102
|
+
def get_eigensolver_runtime_info():
|
|
103
|
+
    """Return backend-aware eigensolver runtime selection metadata."""
|
|
104
|
+
    return _get_eigensolver_runtime_info()
|
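The `__init__.py` above exposes R-style names by binding them to the snake_case functions (for example `CCT = cct`). The sketch below illustrates that alias pattern with a minimal Cauchy combination test. It is an assumption-laden stand-in, not the code in `staar_stats.py`: the signature, the equal-weight default, and the strict `0 < p < 1` check are all choices made for this example.

```python
import math


def cct(pvals, weights=None):
    """Combine p-values with the Cauchy combination test (illustrative sketch)."""
    pvals = list(pvals)
    if not pvals:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in pvals):
        raise ValueError("this sketch requires p-values strictly between 0 and 1")
    if weights is None:
        # Assumption for this sketch: equal weights that sum to 1.
        weights = [1.0 / len(pvals)] * len(pvals)
    # Each p-value maps to a standard Cauchy variate. With independent
    # p-values under the null and non-negative weights summing to 1, the
    # weighted sum is again standard Cauchy.
    stat = sum(w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, pvals))
    # Upper-tail probability of the standard Cauchy distribution at `stat`.
    return 0.5 - math.atan(stat) / math.pi


# R-compatible alias, mirroring the `CCT = cct` binding in `__init__.py`.
CCT = cct

print(CCT([0.01, 0.2, 0.5]))
```

Because the alias is a plain name binding, `CCT` and `cct` refer to the same function object, so R users can keep familiar names without any wrapper overhead.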
|
pystaar-1.0.0/src/pystaar/_data/DATA_SOURCE.md
ADDED
|
@@ -0,0 +1,144 @@
|
|
|
1
|
+
# Data Source and Fingerprints
|
|
2
|
+
|
|
3
|
+
- Document date: 2026-02-07
|
|
4
|
+
- Scope: files under `data/` used by Python parity workflows.
|
|
5
|
+
|
|
6
|
+
## Source
|
|
7
|
+
|
|
8
|
+
- Upstream project: `https://github.com/xihaoli/STAAR`
|
|
9
|
+
- Upstream commit for baseline extraction: `9db9dd504905b9f469146f670e5f6dbe3e08d01a`
|
|
10
|
+
- Baseline source record: `baselines/SOURCE.md`
|
|
11
|
+
- Raw input used to generate `data/`: `baselines/example_sim_data.rds`
|
|
12
|
+
- Dataset type: simulated example dataset generated by baseline extraction scripts
|
|
13
|
+
- License context:
|
|
14
|
+
- Upstream package license: `GPL-3` (`../STAAR/DESCRIPTION`)
|
|
15
|
+
- License text available at `../STAAR/LICENSE.md`
|
|
16
|
+
|
|
17
|
+
## Raw Input Checksum
|
|
18
|
+
|
|
19
|
+
| file | size_bytes | sha256 |
|
|
20
|
+
|---|---:|---|
|
|
21
|
+
| `baselines/example_sim_data.rds` | 1312814 | `89498efaa1140720d512e92b4fb80d5141dfbec174dca54df29d553ef7b92fe6` |
|
|
22
|
+
| `baselines/nonexample601_sim_data.rds` | 1307967 | `3104c42d8fff5cd75f824bb10e282366b593c6f22ad924885263bdee63513241` |
|
|
23
|
+
| `baselines/nonexample602_sim_data.rds` | 1304535 | `97f2757233fd25d4fcdb3171cdcb8f341aa330f65078ad4753137bf1de2d77fb` |
|
|
24
|
+
|
|
25
|
+
## Exported Runtime Files
|
|
26
|
+
|
|
27
|
+
| file | size_bytes | sha256 |
|
|
28
|
+
|---|---:|---|
|
|
29
|
+
| `data/example_geno.mtx` | 76674 | `c7fe50fdb539ff71601b4542c78d5d829d67ec5f9932f1dde5830b6ad3692569` |
|
|
30
|
+
| `data/example_kins_sparse.mtx` | 300091 | `e6dbc9c0597345876e29317b22404d69f4b64915ecce9a02299439c399cd56c2` |
|
|
31
|
+
| `data/example_kins_dense.mtx` | 300091 | `e6dbc9c0597345876e29317b22404d69f4b64915ecce9a02299439c399cd56c2` |
|
|
32
|
+
| `data/example_phred.csv` | 27924 | `272e43cfe806a2a980de6293fb353ee4cbb5d66722d8ef765b59c4eecad6d0bb` |
|
|
33
|
+
| `data/example_pheno_unrelated.csv` | 381409 | `e0f4caf7398060761040d4f709b546e6531a6808b713ea5681e8921b5becd8f3` |
|
|
34
|
+
| `data/example_pheno_related.csv` | 429554 | `3f0a8ba32fe7bcf1c8eafc6666b64b3bbbf0b9a145f58b5a065442ff2f3c6396` |
|
|
35
|
+
| `data/nonexample601_geno.mtx` | 77624 | `2b9de0a4d95f22a9b7d7912e3189ccfbc170efd80566aac34b53f09a6182ec5d` |
|
|
36
|
+
| `data/nonexample601_kins_sparse.mtx` | 300091 | `5783413a57c288fac6964946e1c4bd5d9453ca3c87a24847fd4619da5e037243` |
|
|
37
|
+
| `data/nonexample601_kins_dense.mtx` | 300091 | `5783413a57c288fac6964946e1c4bd5d9453ca3c87a24847fd4619da5e037243` |
|
|
38
|
+
| `data/nonexample601_phred.csv` | 24686 | `576e4f60f5d42653c18eb157e2c3847710f97538cf81e5fa44320adbabb16741` |
|
|
39
|
+
| `data/nonexample601_pheno_unrelated.csv` | 381071 | `c0fb8111ed7128fddc2b13e2ed6986d4566a4732e43960a82f925942d72f0358` |
|
|
40
|
+
| `data/nonexample601_pheno_related.csv` | 429309 | `26a2cc47bfb5f902a37495bad541be760a81cf8972ed306f4c82f8c67b6665c4` |
|
|
41
|
+
| `data/nonexample602_geno.mtx` | 62711 | `1ca6579addfa90c0cca6cbd77b4d272f2bf1b9ed36a4c881111e191783f3af15` |
|
|
42
|
+
| `data/nonexample602_kins_sparse.mtx` | 300091 | `5783413a57c288fac6964946e1c4bd5d9453ca3c87a24847fd4619da5e037243` |
|
|
43
|
+
| `data/nonexample602_kins_dense.mtx` | 300091 | `5783413a57c288fac6964946e1c4bd5d9453ca3c87a24847fd4619da5e037243` |
|
|
44
|
+
| `data/nonexample602_phred.csv` | 25214 | `808a8ddd4780b409ed32cd62f5a1dafb6452db83494e96e3358d8953fc6675a1` |
|
|
45
|
+
| `data/nonexample602_pheno_unrelated.csv` | 381102 | `4a6a588cabee87e60064354b9811cdf9181a0f5c47e27418edb49b284091c965` |
|
|
46
|
+
| `data/nonexample602_pheno_related.csv` | 429283 | `8692f9b8051d11dbc4d6b7173fe890c395705c2e637fc22b0fb549ff0edad750` |
|
|
47
|
+
| `data/example_glmmkin_cov.csv` | 555749 | `f6893ba9f034143c0b8eab7fda2fe020e8e43ac0280fbd1bd3824fa20c3311e6` |
|
|
48
|
+
| `data/example_glmmkin_cov_rare_maf_0_01.csv` | 495166 | `c08479c923d9207da12c4e4d8faa64cf80390291223700b423eb15a1cf649ebc` |
|
|
49
|
+
| `data/example_glmmkin_scaled_residuals.csv` | 182493 | `55ebe906e9c63fb25e8bbc4bf4f4ac05966401af8a66bbba972cd48a4a3c29ca` |
|
|
50
|
+
| `data/example_glmmkin_cov_cond_sparse.csv` | 554222 | `a816f345142065cd3a95c3616f9c00e64f433e26ea57704b9dd1670b604130b2` |
|
|
51
|
+
| `data/example_glmmkin_cov_cond_dense.csv` | 555130 | `05ad3b8d587404a850182310edac55fdc11e48c3092ddb954653145250c43a55` |
|
|
52
|
+
| `data/example_glmmkin_binary_spa_sparse_fitted.csv` | 188525 | `8c827840b987e39d853de2ef505f5d730c6085e52339d8d2f549a136e12ffa93` |
|
|
53
|
+
| `data/example_glmmkin_binary_spa_sparse_scaled_residuals.csv` | 197780 | `5613078915493f8152ba5b10e0bd1b66aaa218166d912122500d867e008144cd` |
|
|
54
|
+
| `data/example_glmmkin_binary_spa_sparse_XW.csv` | 643616 | `d89f82fa11888f012abf4806651a2f4df1f0f2fccc5cb397e3879e39945c3ea6` |
|
|
55
|
+
| `data/example_glmmkin_binary_spa_sparse_XXWX_inv.csv` | 614222 | `cdf1b7cf565f5ce68f00f8b77bfc37d8151a5689dc233eee1957d09656e95ae1` |
|
|
56
|
+
| `data/example_glmmkin_binary_spa_dense_fitted.csv` | 188469 | `82e3baef157b2d90d14aa042f117e4d2871a9041079a0eb6390951f160f5429f` |
|
|
57
|
+
| `data/example_glmmkin_binary_spa_dense_scaled_residuals.csv` | 197716 | `084a4bbfa11a8827af81d2625d6e8eedfbee53654a843543679992bb662b56c5` |
|
|
58
|
+
| `data/example_glmmkin_binary_spa_dense_XW.csv` | 557284 | `846283deb00c944da332684ccddf38c36598fe21cddce5929cce908cfd2a2f3d` |
|
|
59
|
+
| `data/example_glmmkin_binary_spa_dense_XXWX_inv.csv` | 614480 | `3bb01dabd6258d4c56e75b275f4ca95faa4503e89d2c258e6e68a460c5e5a36e` |
|
|
60
|
+
| `data/example_glm_binary_spa_cov_filter.csv` | 571828 | `5ec312432b33f86ce4774053d70e40dea641a012699298c9411b4fe541909711` |
|
|
61
|
+
| `data/example_glmmkin_binary_spa_sparse_cov_filter.csv` | 571750 | `30d089be8804cedb8789e51bf4572ccbfcaf36dd4734d1bbc50ad71b9232271a` |
|
|
62
|
+
| `data/example_glmmkin_binary_spa_dense_cov_filter.csv` | 571635 | `2349ab58cab8516b45612d09fc46554e29f33d46f6550ba798051b136f4164d1` |
|
|
63
|
+
| `data/example_ai_pop_groups.csv` | 60012 | `8563a26d5075f4664322997f703b40e6314666744fd3051ea94a3354d17de17c` |
|
|
64
|
+
| `data/example_ai_pop_weights_1_1.csv` | 148 | `e710f254e379c81ec26bcb543d4c88048ae12df46855cfcac8ebbe4192dca2a0` |
|
|
65
|
+
| `data/example_ai_pop_weights_1_25.csv` | 148 | `454abfff9449d29ecd6ee26a9b18c66fa1da8c698c5e94ca6481497e1b92ec51` |
|
|
66
|
+
| `data/example_ai_cov_sparse_s1_b1.csv` | 567327 | `518a448dbf542b663161482a2bcfc1d583651ef59d53ac4088cce7b4357529e8` |
|
|
67
|
+
| `data/example_ai_cov_sparse_s1_b2.csv` | 567952 | `193b1d6d8e3947c30128878d1a7f624bd30e793fc18d2bb581bcc851f5c477dd` |
|
|
68
|
+
| `data/example_ai_cov_sparse_s2_b1.csv` | 515145 | `1a1b0f072cbfc51d172b50b0ecf402d67ac8f9663b27e872b6eed83a24a23c62` |
|
|
69
|
+
| `data/example_ai_cov_sparse_s2_b2.csv` | 517885 | `0a76a6891c6e911bfa4be1ce1d6165d5f306f5ff35fbd1d4b2ec14dddafe9895` |
|
|
70
|
+
| `data/example_ai_cov_dense_s1_b1.csv` | 567248 | `9db30d895f8cf289eb32b8ffe7fea715fb1fea7cb8574c139add0b3c29b7e470` |
|
|
71
|
+
| `data/example_ai_cov_dense_s1_b2.csv` | 568040 | `2801f7b4be47f16875ba5eb5175ab84f01dc728c99b0fc9d56bfd5fa5d2242fd` |
|
|
72
|
+
| `data/example_ai_cov_dense_s2_b1.csv` | 515109 | `5efca5a228e054ec6c887301c545e5e832098ab3c3e42cac75e156f52c3cc9ae` |
|
|
73
|
+
| `data/example_ai_cov_dense_s2_b2.csv` | 517701 | `d9ccdd07c0d7a7278f45bffa96f4d15f879a676da44d806247b6d33e708504df` |
|
|
74
|
+
|
|
75
|
+
## Structured Fingerprints
|
|
76
|
+
|
|
77
|
+
- `data/example_pheno_unrelated.csv`: rows=10000, cols=3, columns=`Y,X1,X2`
|
|
78
|
+
- `data/example_pheno_related.csv`: rows=10000, cols=4, columns=`Y,X1,X2,id`
|
|
79
|
+
- `data/example_phred.csv`: rows=163, cols=10, columns=`Z1,Z2,Z3,Z4,Z5,Z6,Z7,Z8,Z9,Z10`
|
|
80
|
+
- `data/example_geno.mtx`: shape=`10000x163`, nnz=7486
|
|
81
|
+
- `data/example_kins_sparse.mtx`: shape=`10000x10000`, nnz=35000
|
|
82
|
+
- `data/example_kins_dense.mtx`: shape=`10000x10000`, nnz=35000
|
|
83
|
+
- `data/nonexample601_pheno_unrelated.csv`: rows=10000, cols=3, columns=`Y,X1,X2`
|
|
84
|
+
- `data/nonexample601_pheno_related.csv`: rows=10000, cols=4, columns=`Y,X1,X2,id`
|
|
85
|
+
- `data/nonexample601_phred.csv`: rows=144, cols=10, columns=`Z1,Z2,Z3,Z4,Z5,Z6,Z7,Z8,Z9,Z10`
|
|
86
|
+
- `data/nonexample601_geno.mtx`: shape=`10000x144`, nnz=7636
|
|
87
|
+
- `data/nonexample601_kins_sparse.mtx`: shape=`10000x10000`, nnz=35000
|
|
88
|
+
- `data/nonexample601_kins_dense.mtx`: shape=`10000x10000`, nnz=35000
|
|
89
|
+
- `data/nonexample602_pheno_unrelated.csv`: rows=10000, cols=3, columns=`Y,X1,X2`
|
|
90
|
+
- `data/nonexample602_pheno_related.csv`: rows=10000, cols=4, columns=`Y,X1,X2,id`
|
|
91
|
+
- `data/nonexample602_phred.csv`: rows=147, cols=10, columns=`Z1,Z2,Z3,Z4,Z5,Z6,Z7,Z8,Z9,Z10`
|
|
92
|
+
- `data/nonexample602_geno.mtx`: shape=`10000x147`, nnz=6174
|
|
93
|
+
- `data/nonexample602_kins_sparse.mtx`: shape=`10000x10000`, nnz=35000
|
|
94
|
+
- `data/nonexample602_kins_dense.mtx`: shape=`10000x10000`, nnz=35000
|
|
95
|
+
- `data/example_glmmkin_cov.csv`: shape=`163x163`
|
|
96
|
+
- `data/example_glmmkin_cov_rare_maf_0_01.csv`: shape=`153x153`
|
|
97
|
+
- `data/example_glmmkin_cov_cond_sparse.csv`: shape=`163x163`
|
|
98
|
+
- `data/example_glmmkin_cov_cond_dense.csv`: shape=`163x163`
|
|
99
|
+
- `data/example_glmmkin_binary_spa_sparse_fitted.csv`: shape=`10000x1`
|
|
100
|
+
- `data/example_glmmkin_binary_spa_sparse_scaled_residuals.csv`: shape=`10000x1`
|
|
101
|
+
- `data/example_glmmkin_binary_spa_sparse_XW.csv`: shape=`3x10000`
|
|
102
|
+
- `data/example_glmmkin_binary_spa_sparse_XXWX_inv.csv`: shape=`10000x3`
|
|
103
|
+
- `data/example_glmmkin_binary_spa_dense_fitted.csv`: shape=`10000x1`
|
|
104
|
+
- `data/example_glmmkin_binary_spa_dense_scaled_residuals.csv`: shape=`10000x1`
|
|
105
|
+
- `data/example_glmmkin_binary_spa_dense_XW.csv`: shape=`3x10000`
|
|
106
|
+
- `data/example_glmmkin_binary_spa_dense_XXWX_inv.csv`: shape=`10000x3`
|
|
107
|
+
- `data/example_glm_binary_spa_cov_filter.csv`: shape=`163x163`
|
|
108
|
+
- `data/example_glmmkin_binary_spa_sparse_cov_filter.csv`: shape=`163x163`
|
|
109
|
+
- `data/example_glmmkin_binary_spa_dense_cov_filter.csv`: shape=`163x163`
|
|
110
|
+
- `data/example_ai_pop_groups.csv`: rows=10000, cols=1, columns=`pop_group`
|
|
111
|
+
- `data/example_ai_pop_weights_1_1.csv`: rows=3, cols=3, columns=`population,B1,B2`
|
|
112
|
+
- `data/example_ai_pop_weights_1_25.csv`: rows=3, cols=3, columns=`population,B1,B2`
|
|
113
|
+
- `data/example_ai_cov_sparse_s1_b1.csv`: shape=`163x163`
|
|
114
|
+
- `data/example_ai_cov_sparse_s1_b2.csv`: shape=`163x163`
|
|
115
|
+
- `data/example_ai_cov_sparse_s2_b1.csv`: shape=`163x163`
|
|
116
|
+
- `data/example_ai_cov_sparse_s2_b2.csv`: shape=`163x163`
|
|
117
|
+
- `data/example_ai_cov_dense_s1_b1.csv`: shape=`163x163`
|
|
118
|
+
- `data/example_ai_cov_dense_s1_b2.csv`: shape=`163x163`
|
|
119
|
+
- `data/example_ai_cov_dense_s2_b1.csv`: shape=`163x163`
|
|
120
|
+
- `data/example_ai_cov_dense_s2_b2.csv`: shape=`163x163`
|
|
121
|
+
|
|
122
|
+
## Generation Path
|
|
123
|
+
|
|
124
|
+
`data/` exports are generated by:
|
|
125
|
+
|
|
126
|
+
- `scripts/export_example_data.R`
|
|
127
|
+
- `baselines/scripts/extract_related_sparse_glmmkin_cond.R` (conditional covariance export)
|
|
128
|
+
- `baselines/scripts/extract_related_dense_glmmkin_cond.R` (conditional covariance export)
|
|
129
|
+
- `baselines/scripts/extract_related_sparse_binary_spa.R` (related binary SPA null-model components)
|
|
130
|
+
- `baselines/scripts/extract_related_dense_binary_spa.R` (related binary SPA null-model components)
|
|
131
|
+
- `baselines/scripts/extract_unrelated_binary_spa_filter.R` (unrelated binary SPA prefilter covariance export)
|
|
132
|
+
- `baselines/scripts/extract_related_sparse_binary_spa_filter.R` (related sparse binary SPA prefilter covariance export)
|
|
133
|
+
- `baselines/scripts/extract_related_dense_binary_spa_filter.R` (related dense binary SPA prefilter covariance export)
|
|
134
|
+
- `baselines/scripts/extract_ai_staar_unrelated.R` (AI-STAAR ancestry metadata export)
|
|
135
|
+
- `baselines/scripts/extract_ai_staar_related_sparse.R` (AI-STAAR sparse related covariance export)
|
|
136
|
+
- `baselines/scripts/extract_ai_staar_related_dense.R` (AI-STAAR dense related covariance export)
|
|
137
|
+
- `../STAAR` one-shot extraction script (`/tmp/extract_nonexample601.R`) for non-clone `nonexample601` dataset + sentinels (`STAAR-47`)
|
|
138
|
+
- `../STAAR` one-shot extraction script (`/tmp/extract_nonexample602.R`) for second-seed `nonexample602` dataset (`STAAR-54`)
|
|
139
|
+
|
|
140
|
+
Command:
|
|
141
|
+
|
|
142
|
+
- `Rscript scripts/export_example_data.R`
|
|
143
|
+
- `Rscript /tmp/extract_nonexample601.R` (executed in `../STAAR`, then copied `nonexample601_*` outputs)
|
|
144
|
+
- `Rscript /tmp/extract_nonexample602.R` (executed in `../STAAR`, then copied `nonexample602_*` outputs)
|
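The checksum tables and structured fingerprints in `DATA_SOURCE.md` above can be re-derived locally. The following is a minimal sketch using only the standard library; `file_fingerprint` and `mtx_size_line` are hypothetical helper names, and the demo file is a tiny stand-in rather than one of the packaged datasets. Note that the document records paths under `data/`, while the sdist ships the files under `src/pystaar/_data/`.

```python
import hashlib
import tempfile
from pathlib import Path


def file_fingerprint(path, chunk_size=1 << 20):
    """Return (size_bytes, sha256_hex) for a file, reading it in chunks."""
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size, digest.hexdigest()


def mtx_size_line(path):
    """Return (rows, cols, stored_entries) from a Matrix Market coordinate file."""
    with open(path, "r", encoding="ascii") as handle:
        for line in handle:
            # Lines starting with '%' are the banner and comments; the first
            # other non-blank line holds the dimensions.
            if line.startswith("%") or not line.strip():
                continue
            rows, cols, entries = (int(tok) for tok in line.split())
            return rows, cols, entries
    raise ValueError(f"no size line found in {path}")


# Demo on a small stand-in file (not one of the packaged datasets).
demo = Path(tempfile.mkdtemp()) / "demo.mtx"
demo.write_text(
    "%%MatrixMarket matrix coordinate real general\n3 2 2\n1 1 1.0\n3 2 2.0\n"
)
size_bytes, sha256_hex = file_fingerprint(demo)
print(size_bytes, sha256_hex)
print(mtx_size_line(demo))
```

To check a packaged file, compare the returned pair against the matching `size_bytes` and `sha256` row. One caveat: if a `.mtx` file is stored with the `symmetric` qualifier, the third number on its size line counts only the stored triangle, so it may be smaller than the expanded `nnz` listed under Structured Fingerprints.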