@bgicli/bgicli 2.1.1 → 2.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (1267)
  1. package/README.md +152 -74
  2. package/data/skills/aav-vector-design-agent/SKILL.md +198 -0
  3. package/data/skills/adaptyv/SKILL.md +112 -0
  4. package/data/skills/adhd-daily-planner/SKILL.md +271 -0
  5. package/data/skills/aeon/SKILL.md +372 -0
  6. package/data/skills/agent-browser/SKILL.md +159 -0
  7. package/data/skills/agentd-drug-discovery/SKILL.md +52 -0
  8. package/data/skills/ai-analyzer/SKILL.md +218 -0
  9. package/data/skills/alphafold/SKILL.md +183 -0
  10. package/data/skills/alphafold-database/SKILL.md +500 -0
  11. package/data/skills/anndata/SKILL.md +394 -0
  12. package/data/skills/antibody-design-agent/SKILL.md +64 -0
  13. package/data/skills/arboreto/SKILL.md +237 -0
  14. package/data/skills/armored-cart-design-agent/SKILL.md +225 -0
  15. package/data/skills/arxiv-search/SKILL.md +224 -0
  16. package/data/skills/autonomous-oncology-agent/SKILL.md +77 -0
  17. package/data/skills/bayesian-optimizer/SKILL.md +60 -0
  18. package/data/skills/benchling-integration/SKILL.md +473 -0
  19. package/data/skills/bgpt-paper-search/SKILL.md +81 -0
  20. package/data/skills/bindcraft/SKILL.md +198 -0
  21. package/data/skills/binder-design/SKILL.md +182 -0
  22. package/data/skills/binding-characterization/SKILL.md +234 -0
  23. package/data/skills/bindingdb-database/SKILL.md +332 -0
  24. package/data/skills/bio-admet-prediction/SKILL.md +224 -0
  25. package/data/skills/bio-alignment-files-bam-statistics/SKILL.md +340 -0
  26. package/data/skills/bio-alignment-filtering/SKILL.md +322 -0
  27. package/data/skills/bio-alignment-indexing/SKILL.md +249 -0
  28. package/data/skills/bio-alignment-io/SKILL.md +301 -0
  29. package/data/skills/bio-alignment-msa-parsing/SKILL.md +366 -0
  30. package/data/skills/bio-alignment-msa-statistics/SKILL.md +375 -0
  31. package/data/skills/bio-alignment-pairwise/SKILL.md +277 -0
  32. package/data/skills/bio-alignment-sorting/SKILL.md +296 -0
  33. package/data/skills/bio-alignment-validation/SKILL.md +374 -0
  34. package/data/skills/bio-atac-seq-atac-peak-calling/SKILL.md +221 -0
  35. package/data/skills/bio-atac-seq-atac-qc/SKILL.md +292 -0
  36. package/data/skills/bio-atac-seq-differential-accessibility/SKILL.md +268 -0
  37. package/data/skills/bio-atac-seq-footprinting/SKILL.md +256 -0
  38. package/data/skills/bio-atac-seq-motif-deviation/SKILL.md +319 -0
  39. package/data/skills/bio-atac-seq-nucleosome-positioning/SKILL.md +321 -0
  40. package/data/skills/bio-basecalling/SKILL.md +368 -0
  41. package/data/skills/bio-batch-downloads/SKILL.md +384 -0
  42. package/data/skills/bio-batch-processing/SKILL.md +303 -0
  43. package/data/skills/bio-bedgraph-handling/SKILL.md +336 -0
  44. package/data/skills/bio-blast-searches/SKILL.md +354 -0
  45. package/data/skills/bio-causal-genomics-colocalization-analysis/SKILL.md +264 -0
  46. package/data/skills/bio-causal-genomics-fine-mapping/SKILL.md +267 -0
  47. package/data/skills/bio-causal-genomics-mediation-analysis/SKILL.md +264 -0
  48. package/data/skills/bio-causal-genomics-mendelian-randomization/SKILL.md +221 -0
  49. package/data/skills/bio-causal-genomics-pleiotropy-detection/SKILL.md +292 -0
  50. package/data/skills/bio-cfdna-preprocessing/SKILL.md +200 -0
  51. package/data/skills/bio-chipseq-differential-binding/SKILL.md +262 -0
  52. package/data/skills/bio-chipseq-motif-analysis/SKILL.md +387 -0
  53. package/data/skills/bio-chipseq-peak-annotation/SKILL.md +239 -0
  54. package/data/skills/bio-chipseq-peak-calling/SKILL.md +277 -0
  55. package/data/skills/bio-chipseq-qc/SKILL.md +391 -0
  56. package/data/skills/bio-chipseq-super-enhancers/SKILL.md +288 -0
  57. package/data/skills/bio-chipseq-visualization/SKILL.md +289 -0
  58. package/data/skills/bio-clinical-databases-clinvar-lookup/SKILL.md +188 -0
  59. package/data/skills/bio-clinical-databases-dbsnp-queries/SKILL.md +171 -0
  60. package/data/skills/bio-clinical-databases-gnomad-frequencies/SKILL.md +205 -0
  61. package/data/skills/bio-clinical-databases-hla-typing/SKILL.md +248 -0
  62. package/data/skills/bio-clinical-databases-myvariant-queries/SKILL.md +174 -0
  63. package/data/skills/bio-clinical-databases-pharmacogenomics/SKILL.md +232 -0
  64. package/data/skills/bio-clinical-databases-polygenic-risk/SKILL.md +276 -0
  65. package/data/skills/bio-clinical-databases-somatic-signatures/SKILL.md +261 -0
  66. package/data/skills/bio-clinical-databases-tumor-mutational-burden/SKILL.md +301 -0
  67. package/data/skills/bio-clinical-databases-variant-prioritization/SKILL.md +225 -0
  68. package/data/skills/bio-clip-seq-binding-site-annotation/SKILL.md +66 -0
  69. package/data/skills/bio-clip-seq-clip-alignment/SKILL.md +70 -0
  70. package/data/skills/bio-clip-seq-clip-motif-analysis/SKILL.md +62 -0
  71. package/data/skills/bio-clip-seq-clip-peak-calling/SKILL.md +282 -0
  72. package/data/skills/bio-clip-seq-clip-preprocessing/SKILL.md +142 -0
  73. package/data/skills/bio-codon-usage/SKILL.md +353 -0
  74. package/data/skills/bio-comparative-genomics-ancestral-reconstruction/SKILL.md +312 -0
  75. package/data/skills/bio-comparative-genomics-hgt-detection/SKILL.md +341 -0
  76. package/data/skills/bio-comparative-genomics-ortholog-inference/SKILL.md +308 -0
  77. package/data/skills/bio-comparative-genomics-positive-selection/SKILL.md +354 -0
  78. package/data/skills/bio-comparative-genomics-synteny-analysis/SKILL.md +315 -0
  79. package/data/skills/bio-compressed-files/SKILL.md +263 -0
  80. package/data/skills/bio-consensus-sequences/SKILL.md +340 -0
  81. package/data/skills/bio-copy-number-cnv-annotation/SKILL.md +307 -0
  82. package/data/skills/bio-copy-number-cnv-visualization/SKILL.md +294 -0
  83. package/data/skills/bio-copy-number-cnvkit-analysis/SKILL.md +290 -0
  84. package/data/skills/bio-copy-number-gatk-cnv/SKILL.md +270 -0
  85. package/data/skills/bio-crispr-screens-base-editing-analysis/SKILL.md +110 -0
  86. package/data/skills/bio-crispr-screens-batch-correction/SKILL.md +316 -0
  87. package/data/skills/bio-crispr-screens-crispresso-editing/SKILL.md +205 -0
  88. package/data/skills/bio-crispr-screens-hit-calling/SKILL.md +264 -0
  89. package/data/skills/bio-crispr-screens-jacks-analysis/SKILL.md +313 -0
  90. package/data/skills/bio-crispr-screens-library-design/SKILL.md +417 -0
  91. package/data/skills/bio-crispr-screens-mageck-analysis/SKILL.md +222 -0
  92. package/data/skills/bio-crispr-screens-screen-qc/SKILL.md +243 -0
  93. package/data/skills/bio-ctdna-mutation-detection/SKILL.md +234 -0
  94. package/data/skills/bio-data-visualization-circos-plots/SKILL.md +405 -0
  95. package/data/skills/bio-data-visualization-color-palettes/SKILL.md +244 -0
  96. package/data/skills/bio-data-visualization-genome-browser-tracks/SKILL.md +328 -0
  97. package/data/skills/bio-data-visualization-genome-tracks/SKILL.md +249 -0
  98. package/data/skills/bio-data-visualization-ggplot2-fundamentals/SKILL.md +313 -0
  99. package/data/skills/bio-data-visualization-heatmaps-clustering/SKILL.md +227 -0
  100. package/data/skills/bio-data-visualization-interactive-visualization/SKILL.md +210 -0
  101. package/data/skills/bio-data-visualization-multipanel-figures/SKILL.md +274 -0
  102. package/data/skills/bio-data-visualization-specialized-omics-plots/SKILL.md +251 -0
  103. package/data/skills/bio-data-visualization-upset-plots/SKILL.md +228 -0
  104. package/data/skills/bio-data-visualization-volcano-customization/SKILL.md +233 -0
  105. package/data/skills/bio-de-deseq2-basics/SKILL.md +376 -0
  106. package/data/skills/bio-de-edger-basics/SKILL.md +418 -0
  107. package/data/skills/bio-de-results/SKILL.md +378 -0
  108. package/data/skills/bio-de-visualization/SKILL.md +408 -0
  109. package/data/skills/bio-differential-expression-batch-correction/SKILL.md +253 -0
  110. package/data/skills/bio-differential-expression-timeseries-de/SKILL.md +370 -0
  111. package/data/skills/bio-differential-splicing/SKILL.md +177 -0
  112. package/data/skills/bio-duplicate-handling/SKILL.md +292 -0
  113. package/data/skills/bio-entrez-fetch/SKILL.md +334 -0
  114. package/data/skills/bio-entrez-link/SKILL.md +325 -0
  115. package/data/skills/bio-entrez-search/SKILL.md +311 -0
  116. package/data/skills/bio-epidemiological-genomics-amr-surveillance/SKILL.md +233 -0
  117. package/data/skills/bio-epidemiological-genomics-pathogen-typing/SKILL.md +202 -0
  118. package/data/skills/bio-epidemiological-genomics-phylodynamics/SKILL.md +207 -0
  119. package/data/skills/bio-epidemiological-genomics-transmission-inference/SKILL.md +237 -0
  120. package/data/skills/bio-epidemiological-genomics-variant-surveillance/SKILL.md +237 -0
  121. package/data/skills/bio-epitranscriptomics-m6a-differential/SKILL.md +88 -0
  122. package/data/skills/bio-epitranscriptomics-m6a-peak-calling/SKILL.md +89 -0
  123. package/data/skills/bio-epitranscriptomics-m6anet-analysis/SKILL.md +101 -0
  124. package/data/skills/bio-epitranscriptomics-merip-preprocessing/SKILL.md +81 -0
  125. package/data/skills/bio-epitranscriptomics-modification-visualization/SKILL.md +98 -0
  126. package/data/skills/bio-experimental-design-batch-design/SKILL.md +110 -0
  127. package/data/skills/bio-experimental-design-multiple-testing/SKILL.md +98 -0
  128. package/data/skills/bio-experimental-design-power-analysis/SKILL.md +84 -0
  129. package/data/skills/bio-experimental-design-sample-size/SKILL.md +93 -0
  130. package/data/skills/bio-expression-matrix-counts-ingest/SKILL.md +220 -0
  131. package/data/skills/bio-expression-matrix-gene-id-mapping/SKILL.md +256 -0
  132. package/data/skills/bio-expression-matrix-metadata-joins/SKILL.md +271 -0
  133. package/data/skills/bio-expression-matrix-sparse-handling/SKILL.md +247 -0
  134. package/data/skills/bio-fastq-quality/SKILL.md +279 -0
  135. package/data/skills/bio-filter-sequences/SKILL.md +265 -0
  136. package/data/skills/bio-flow-cytometry-bead-normalization/SKILL.md +315 -0
  137. package/data/skills/bio-flow-cytometry-clustering-phenotyping/SKILL.md +237 -0
  138. package/data/skills/bio-flow-cytometry-compensation-transformation/SKILL.md +196 -0
  139. package/data/skills/bio-flow-cytometry-cytometry-qc/SKILL.md +382 -0
  140. package/data/skills/bio-flow-cytometry-differential-analysis/SKILL.md +217 -0
  141. package/data/skills/bio-flow-cytometry-doublet-detection/SKILL.md +288 -0
  142. package/data/skills/bio-flow-cytometry-fcs-handling/SKILL.md +221 -0
  143. package/data/skills/bio-flow-cytometry-gating-analysis/SKILL.md +193 -0
  144. package/data/skills/bio-format-conversion/SKILL.md +193 -0
  145. package/data/skills/bio-fragment-analysis/SKILL.md +214 -0
  146. package/data/skills/bio-gatk-variant-calling/SKILL.md +422 -0
  147. package/data/skills/bio-genome-assembly-assembly-polishing/SKILL.md +333 -0
  148. package/data/skills/bio-genome-assembly-assembly-qc/SKILL.md +344 -0
  149. package/data/skills/bio-genome-assembly-contamination-detection/SKILL.md +235 -0
  150. package/data/skills/bio-genome-assembly-hifi-assembly/SKILL.md +178 -0
  151. package/data/skills/bio-genome-assembly-long-read-assembly/SKILL.md +307 -0
  152. package/data/skills/bio-genome-assembly-metagenome-assembly/SKILL.md +227 -0
  153. package/data/skills/bio-genome-assembly-scaffolding/SKILL.md +204 -0
  154. package/data/skills/bio-genome-assembly-short-read-assembly/SKILL.md +319 -0
  155. package/data/skills/bio-genome-engineering-base-editing-design/SKILL.md +277 -0
  156. package/data/skills/bio-genome-engineering-grna-design/SKILL.md +221 -0
  157. package/data/skills/bio-genome-engineering-hdr-template-design/SKILL.md +264 -0
  158. package/data/skills/bio-genome-engineering-off-target-prediction/SKILL.md +232 -0
  159. package/data/skills/bio-genome-engineering-prime-editing-design/SKILL.md +275 -0
  160. package/data/skills/bio-genome-intervals-bed-file-basics/SKILL.md +357 -0
  161. package/data/skills/bio-genome-intervals-bigwig-tracks/SKILL.md +351 -0
  162. package/data/skills/bio-genome-intervals-coverage-analysis/SKILL.md +300 -0
  163. package/data/skills/bio-genome-intervals-gtf-gff-handling/SKILL.md +345 -0
  164. package/data/skills/bio-genome-intervals-interval-arithmetic/SKILL.md +485 -0
  165. package/data/skills/bio-genome-intervals-proximity-operations/SKILL.md +337 -0
  166. package/data/skills/bio-geo-data/SKILL.md +380 -0
  167. package/data/skills/bio-hi-c-analysis-compartment-analysis/SKILL.md +261 -0
  168. package/data/skills/bio-hi-c-analysis-contact-pairs/SKILL.md +278 -0
  169. package/data/skills/bio-hi-c-analysis-hic-data-io/SKILL.md +260 -0
  170. package/data/skills/bio-hi-c-analysis-hic-differential/SKILL.md +328 -0
  171. package/data/skills/bio-hi-c-analysis-hic-visualization/SKILL.md +297 -0
  172. package/data/skills/bio-hi-c-analysis-loop-calling/SKILL.md +284 -0
  173. package/data/skills/bio-hi-c-analysis-matrix-operations/SKILL.md +274 -0
  174. package/data/skills/bio-hi-c-analysis-tad-detection/SKILL.md +239 -0
  175. package/data/skills/bio-imaging-mass-cytometry-cell-segmentation/SKILL.md +241 -0
  176. package/data/skills/bio-imaging-mass-cytometry-data-preprocessing/SKILL.md +279 -0
  177. package/data/skills/bio-imaging-mass-cytometry-interactive-annotation/SKILL.md +304 -0
  178. package/data/skills/bio-imaging-mass-cytometry-phenotyping/SKILL.md +231 -0
  179. package/data/skills/bio-imaging-mass-cytometry-quality-metrics/SKILL.md +316 -0
  180. package/data/skills/bio-imaging-mass-cytometry-spatial-analysis/SKILL.md +246 -0
  181. package/data/skills/bio-immunoinformatics-epitope-prediction/SKILL.md +259 -0
  182. package/data/skills/bio-immunoinformatics-immunogenicity-scoring/SKILL.md +275 -0
  183. package/data/skills/bio-immunoinformatics-mhc-binding-prediction/SKILL.md +260 -0
  184. package/data/skills/bio-immunoinformatics-neoantigen-prediction/SKILL.md +277 -0
  185. package/data/skills/bio-immunoinformatics-tcr-epitope-binding/SKILL.md +257 -0
  186. package/data/skills/bio-isoform-switching/SKILL.md +192 -0
  187. package/data/skills/bio-liquid-biopsy-pipeline/SKILL.md +311 -0
  188. package/data/skills/bio-local-blast/SKILL.md +350 -0
  189. package/data/skills/bio-long-read-sequencing-clair3-variants/SKILL.md +252 -0
  190. package/data/skills/bio-long-read-sequencing-isoseq-analysis/SKILL.md +334 -0
  191. package/data/skills/bio-long-read-sequencing-nanopore-methylation/SKILL.md +110 -0
  192. package/data/skills/bio-longitudinal-monitoring/SKILL.md +271 -0
  193. package/data/skills/bio-longread-alignment/SKILL.md +193 -0
  194. package/data/skills/bio-longread-medaka/SKILL.md +176 -0
  195. package/data/skills/bio-longread-qc/SKILL.md +224 -0
  196. package/data/skills/bio-longread-structural-variants/SKILL.md +201 -0
  197. package/data/skills/bio-machine-learning-atlas-mapping/SKILL.md +139 -0
  198. package/data/skills/bio-machine-learning-biomarker-discovery/SKILL.md +157 -0
  199. package/data/skills/bio-machine-learning-model-validation/SKILL.md +148 -0
  200. package/data/skills/bio-machine-learning-omics-classifiers/SKILL.md +146 -0
  201. package/data/skills/bio-machine-learning-prediction-explanation/SKILL.md +162 -0
  202. package/data/skills/bio-machine-learning-survival-analysis/SKILL.md +176 -0
  203. package/data/skills/bio-metabolomics-lipidomics/SKILL.md +265 -0
  204. package/data/skills/bio-metabolomics-metabolite-annotation/SKILL.md +241 -0
  205. package/data/skills/bio-metabolomics-msdial-preprocessing/SKILL.md +308 -0
  206. package/data/skills/bio-metabolomics-normalization-qc/SKILL.md +283 -0
  207. package/data/skills/bio-metabolomics-pathway-mapping/SKILL.md +237 -0
  208. package/data/skills/bio-metabolomics-statistical-analysis/SKILL.md +276 -0
  209. package/data/skills/bio-metabolomics-targeted-analysis/SKILL.md +314 -0
  210. package/data/skills/bio-metabolomics-xcms-preprocessing/SKILL.md +268 -0
  211. package/data/skills/bio-metagenomics-abundance/SKILL.md +203 -0
  212. package/data/skills/bio-metagenomics-amr-detection/SKILL.md +293 -0
  213. package/data/skills/bio-metagenomics-functional-profiling/SKILL.md +252 -0
  214. package/data/skills/bio-metagenomics-kraken/SKILL.md +204 -0
  215. package/data/skills/bio-metagenomics-metaphlan/SKILL.md +214 -0
  216. package/data/skills/bio-metagenomics-strain-tracking/SKILL.md +292 -0
  217. package/data/skills/bio-metagenomics-visualization/SKILL.md +240 -0
  218. package/data/skills/bio-methylation-based-detection/SKILL.md +223 -0
  219. package/data/skills/bio-methylation-bismark-alignment/SKILL.md +195 -0
  220. package/data/skills/bio-methylation-calling/SKILL.md +200 -0
  221. package/data/skills/bio-methylation-dmr-detection/SKILL.md +211 -0
  222. package/data/skills/bio-methylation-methylkit/SKILL.md +219 -0
  223. package/data/skills/bio-microbiome-amplicon-processing/SKILL.md +137 -0
  224. package/data/skills/bio-microbiome-differential-abundance/SKILL.md +147 -0
  225. package/data/skills/bio-microbiome-diversity-analysis/SKILL.md +188 -0
  226. package/data/skills/bio-microbiome-functional-prediction/SKILL.md +153 -0
  227. package/data/skills/bio-microbiome-qiime2-workflow/SKILL.md +219 -0
  228. package/data/skills/bio-microbiome-taxonomy-assignment/SKILL.md +168 -0
  229. package/data/skills/bio-molecular-descriptors/SKILL.md +200 -0
  230. package/data/skills/bio-molecular-io/SKILL.md +188 -0
  231. package/data/skills/bio-motif-search/SKILL.md +354 -0
  232. package/data/skills/bio-multi-omics-data-harmonization/SKILL.md +228 -0
  233. package/data/skills/bio-multi-omics-mixomics-analysis/SKILL.md +221 -0
  234. package/data/skills/bio-multi-omics-mofa-integration/SKILL.md +225 -0
  235. package/data/skills/bio-multi-omics-similarity-network/SKILL.md +235 -0
  236. package/data/skills/bio-orchestrator/SKILL.md +133 -0
  237. package/data/skills/bio-paired-end-fastq/SKILL.md +334 -0
  238. package/data/skills/bio-pathway-enrichment-visualization/SKILL.md +278 -0
  239. package/data/skills/bio-pathway-go-enrichment/SKILL.md +218 -0
  240. package/data/skills/bio-pathway-gsea/SKILL.md +227 -0
  241. package/data/skills/bio-pathway-kegg-pathways/SKILL.md +234 -0
  242. package/data/skills/bio-pathway-reactome/SKILL.md +215 -0
  243. package/data/skills/bio-pathway-wikipathways/SKILL.md +255 -0
  244. package/data/skills/bio-pdb-geometric-analysis/SKILL.md +475 -0
  245. package/data/skills/bio-pdb-structure-io/SKILL.md +296 -0
  246. package/data/skills/bio-pdb-structure-modification/SKILL.md +448 -0
  247. package/data/skills/bio-pdb-structure-navigation/SKILL.md +335 -0
  248. package/data/skills/bio-phasing-imputation-genotype-imputation/SKILL.md +201 -0
  249. package/data/skills/bio-phasing-imputation-haplotype-phasing/SKILL.md +190 -0
  250. package/data/skills/bio-phasing-imputation-imputation-qc/SKILL.md +265 -0
  251. package/data/skills/bio-phasing-imputation-reference-panels/SKILL.md +203 -0
  252. package/data/skills/bio-phylo-distance-calculations/SKILL.md +307 -0
  253. package/data/skills/bio-phylo-modern-tree-inference/SKILL.md +274 -0
  254. package/data/skills/bio-phylo-tree-io/SKILL.md +252 -0
  255. package/data/skills/bio-phylo-tree-manipulation/SKILL.md +375 -0
  256. package/data/skills/bio-phylo-tree-visualization/SKILL.md +275 -0
  257. package/data/skills/bio-pileup-generation/SKILL.md +314 -0
  258. package/data/skills/bio-population-genetics-association-testing/SKILL.md +293 -0
  259. package/data/skills/bio-population-genetics-linkage-disequilibrium/SKILL.md +260 -0
  260. package/data/skills/bio-population-genetics-plink-basics/SKILL.md +338 -0
  261. package/data/skills/bio-population-genetics-population-structure/SKILL.md +352 -0
  262. package/data/skills/bio-population-genetics-scikit-allel-analysis/SKILL.md +306 -0
  263. package/data/skills/bio-population-genetics-selection-statistics/SKILL.md +251 -0
  264. package/data/skills/bio-primer-design-primer-basics/SKILL.md +289 -0
  265. package/data/skills/bio-primer-design-primer-validation/SKILL.md +344 -0
  266. package/data/skills/bio-primer-design-qpcr-primers/SKILL.md +273 -0
  267. package/data/skills/bio-proteomics-data-import/SKILL.md +122 -0
  268. package/data/skills/bio-proteomics-dia-analysis/SKILL.md +246 -0
  269. package/data/skills/bio-proteomics-differential-abundance/SKILL.md +129 -0
  270. package/data/skills/bio-proteomics-peptide-identification/SKILL.md +122 -0
  271. package/data/skills/bio-proteomics-protein-inference/SKILL.md +174 -0
  272. package/data/skills/bio-proteomics-proteomics-qc/SKILL.md +208 -0
  273. package/data/skills/bio-proteomics-ptm-analysis/SKILL.md +139 -0
  274. package/data/skills/bio-proteomics-quantification/SKILL.md +141 -0
  275. package/data/skills/bio-proteomics-spectral-libraries/SKILL.md +270 -0
  276. package/data/skills/bio-reaction-enumeration/SKILL.md +251 -0
  277. package/data/skills/bio-read-alignment-bowtie2-alignment/SKILL.md +189 -0
  278. package/data/skills/bio-read-alignment-bwa-alignment/SKILL.md +166 -0
  279. package/data/skills/bio-read-alignment-hisat2-alignment/SKILL.md +205 -0
  280. package/data/skills/bio-read-alignment-star-alignment/SKILL.md +204 -0
  281. package/data/skills/bio-read-qc-adapter-trimming/SKILL.md +222 -0
  282. package/data/skills/bio-read-qc-contamination-screening/SKILL.md +252 -0
  283. package/data/skills/bio-read-qc-fastp-workflow/SKILL.md +278 -0
  284. package/data/skills/bio-read-qc-quality-filtering/SKILL.md +231 -0
  285. package/data/skills/bio-read-qc-quality-reports/SKILL.md +204 -0
  286. package/data/skills/bio-read-qc-umi-processing/SKILL.md +391 -0
  287. package/data/skills/bio-read-sequences/SKILL.md +319 -0
  288. package/data/skills/bio-reference-operations/SKILL.md +302 -0
  289. package/data/skills/bio-reporting-automated-qc-reports/SKILL.md +103 -0
  290. package/data/skills/bio-reporting-figure-export/SKILL.md +112 -0
  291. package/data/skills/bio-reporting-jupyter-reports/SKILL.md +98 -0
  292. package/data/skills/bio-reporting-quarto-reports/SKILL.md +295 -0
  293. package/data/skills/bio-reporting-rmarkdown-reports/SKILL.md +276 -0
  294. package/data/skills/bio-research-tools-biomarker-signature-studio/SKILL.md +99 -0
  295. package/data/skills/bio-restriction-enzyme-selection/SKILL.md +342 -0
  296. package/data/skills/bio-restriction-fragment-analysis/SKILL.md +259 -0
  297. package/data/skills/bio-restriction-mapping/SKILL.md +239 -0
  298. package/data/skills/bio-restriction-sites/SKILL.md +222 -0
  299. package/data/skills/bio-reverse-complement/SKILL.md +250 -0
  300. package/data/skills/bio-ribo-seq-orf-detection/SKILL.md +303 -0
  301. package/data/skills/bio-ribo-seq-riboseq-preprocessing/SKILL.md +176 -0
  302. package/data/skills/bio-ribo-seq-ribosome-periodicity/SKILL.md +182 -0
  303. package/data/skills/bio-ribo-seq-ribosome-stalling/SKILL.md +217 -0
  304. package/data/skills/bio-ribo-seq-translation-efficiency/SKILL.md +183 -0
  305. package/data/skills/bio-rna-quantification-alignment-free-quant/SKILL.md +226 -0
  306. package/data/skills/bio-rna-quantification-count-matrix-qc/SKILL.md +310 -0
  307. package/data/skills/bio-rna-quantification-featurecounts-counting/SKILL.md +190 -0
  308. package/data/skills/bio-rna-quantification-tximport-workflow/SKILL.md +240 -0
  309. package/data/skills/bio-rnaseq-qc/SKILL.md +320 -0
  310. package/data/skills/bio-sam-bam-basics/SKILL.md +248 -0
  311. package/data/skills/bio-sashimi-plots/SKILL.md +175 -0
  312. package/data/skills/bio-seq-objects/SKILL.md +240 -0
  313. package/data/skills/bio-sequence-properties/SKILL.md +397 -0
  314. package/data/skills/bio-sequence-similarity/SKILL.md +335 -0
  315. package/data/skills/bio-sequence-slicing/SKILL.md +232 -0
  316. package/data/skills/bio-sequence-statistics/SKILL.md +318 -0
  317. package/data/skills/bio-similarity-searching/SKILL.md +200 -0
  318. package/data/skills/bio-single-cell-batch-integration/SKILL.md +317 -0
  319. package/data/skills/bio-single-cell-cell-annotation/SKILL.md +259 -0
  320. package/data/skills/bio-single-cell-cell-communication/SKILL.md +257 -0
  321. package/data/skills/bio-single-cell-clustering/SKILL.md +330 -0
  322. package/data/skills/bio-single-cell-data-io/SKILL.md +315 -0
  323. package/data/skills/bio-single-cell-doublet-detection/SKILL.md +362 -0
  324. package/data/skills/bio-single-cell-lineage-tracing/SKILL.md +319 -0
  325. package/data/skills/bio-single-cell-markers-annotation/SKILL.md +317 -0
  326. package/data/skills/bio-single-cell-metabolite-communication/SKILL.md +258 -0
  327. package/data/skills/bio-single-cell-multimodal-integration/SKILL.md +242 -0
  328. package/data/skills/bio-single-cell-perturb-seq/SKILL.md +258 -0
  329. package/data/skills/bio-single-cell-preprocessing/SKILL.md +338 -0
  330. package/data/skills/bio-single-cell-scatac-analysis/SKILL.md +326 -0
  331. package/data/skills/bio-single-cell-splicing/SKILL.md +199 -0
  332. package/data/skills/bio-single-cell-trajectory-inference/SKILL.md +225 -0
  333. package/data/skills/bio-small-rna-seq-differential-mirna/SKILL.md +194 -0
  334. package/data/skills/bio-small-rna-seq-mirdeep2-analysis/SKILL.md +180 -0
  335. package/data/skills/bio-small-rna-seq-mirge3-analysis/SKILL.md +178 -0
  336. package/data/skills/bio-small-rna-seq-smrna-preprocessing/SKILL.md +174 -0
  337. package/data/skills/bio-small-rna-seq-target-prediction/SKILL.md +202 -0
  338. package/data/skills/bio-spatial-transcriptomics-image-analysis/SKILL.md +283 -0
  339. package/data/skills/bio-spatial-transcriptomics-spatial-communication/SKILL.md +299 -0
  340. package/data/skills/bio-spatial-transcriptomics-spatial-data-io/SKILL.md +272 -0
  341. package/data/skills/bio-spatial-transcriptomics-spatial-deconvolution/SKILL.md +314 -0
  342. package/data/skills/bio-spatial-transcriptomics-spatial-domains/SKILL.md +254 -0
  343. package/data/skills/bio-spatial-transcriptomics-spatial-multiomics/SKILL.md +181 -0
  344. package/data/skills/bio-spatial-transcriptomics-spatial-neighbors/SKILL.md +198 -0
  345. package/data/skills/bio-spatial-transcriptomics-spatial-preprocessing/SKILL.md +269 -0
  346. package/data/skills/bio-spatial-transcriptomics-spatial-proteomics/SKILL.md +124 -0
  347. package/data/skills/bio-spatial-transcriptomics-spatial-statistics/SKILL.md +237 -0
  348. package/data/skills/bio-spatial-transcriptomics-spatial-visualization/SKILL.md +287 -0
  349. package/data/skills/bio-splicing-pipeline/SKILL.md +253 -0
  350. package/data/skills/bio-splicing-qc/SKILL.md +190 -0
  351. package/data/skills/bio-splicing-quantification/SKILL.md +145 -0
  352. package/data/skills/bio-sra-data/SKILL.md +363 -0
  353. package/data/skills/bio-structural-biology-alphafold-predictions/SKILL.md +258 -0
  354. package/data/skills/bio-structural-biology-modern-structure-prediction/SKILL.md +346 -0
  355. package/data/skills/bio-substructure-search/SKILL.md +206 -0
  356. package/data/skills/bio-systems-biology-context-specific-models/SKILL.md +241 -0
  357. package/data/skills/bio-systems-biology-flux-balance-analysis/SKILL.md +206 -0
  358. package/data/skills/bio-systems-biology-gene-essentiality/SKILL.md +235 -0
  359. package/data/skills/bio-systems-biology-metabolic-reconstruction/SKILL.md +215 -0
  360. package/data/skills/bio-systems-biology-model-curation/SKILL.md +243 -0
  361. package/data/skills/bio-tcr-bcr-analysis-immcantation-analysis/SKILL.md +195 -0
  362. package/data/skills/bio-tcr-bcr-analysis-mixcr-analysis/SKILL.md +167 -0
  363. package/data/skills/bio-tcr-bcr-analysis-repertoire-visualization/SKILL.md +224 -0
  364. package/data/skills/bio-tcr-bcr-analysis-scirpy-analysis/SKILL.md +168 -0
  365. package/data/skills/bio-tcr-bcr-analysis-vdjtools-analysis/SKILL.md +188 -0
  366. package/data/skills/bio-transcription-translation/SKILL.md +237 -0
  367. package/data/skills/bio-tumor-fraction-estimation/SKILL.md +211 -0
  368. package/data/skills/bio-uniprot-access/SKILL.md +239 -0
  369. package/data/skills/bio-variant-annotation/SKILL.md +410 -0
  370. package/data/skills/bio-variant-calling/SKILL.md +266 -0
  371. package/data/skills/bio-variant-calling-clinical-interpretation/SKILL.md +355 -0
  372. package/data/skills/bio-variant-calling-deepvariant/SKILL.md +315 -0
  373. package/data/skills/bio-variant-calling-filtering-best-practices/SKILL.md +403 -0
  374. package/data/skills/bio-variant-calling-joint-calling/SKILL.md +338 -0
  375. package/data/skills/bio-variant-calling-structural-variant-calling/SKILL.md +253 -0
  376. package/data/skills/bio-variant-normalization/SKILL.md +325 -0
  377. package/data/skills/bio-vcf-basics/SKILL.md +342 -0
  378. package/data/skills/bio-vcf-manipulation/SKILL.md +429 -0
  379. package/data/skills/bio-vcf-statistics/SKILL.md +445 -0
  380. package/data/skills/bio-virtual-screening/SKILL.md +263 -0
  381. package/data/skills/bio-workflow-management-cwl-workflows/SKILL.md +433 -0
  382. package/data/skills/bio-workflow-management-nextflow-pipelines/SKILL.md +386 -0
  383. package/data/skills/bio-workflow-management-snakemake-workflows/SKILL.md +383 -0
  384. package/data/skills/bio-workflow-management-wdl-workflows/SKILL.md +500 -0
  385. package/data/skills/bio-workflows-atacseq-pipeline/SKILL.md +362 -0
  386. package/data/skills/bio-workflows-biomarker-pipeline/SKILL.md +272 -0
  387. package/data/skills/bio-workflows-chipseq-pipeline/SKILL.md +282 -0
  388. package/data/skills/bio-workflows-clip-pipeline/SKILL.md +268 -0
  389. package/data/skills/bio-workflows-cnv-pipeline/SKILL.md +324 -0
  390. package/data/skills/bio-workflows-crispr-editing-pipeline/SKILL.md +455 -0
  391. package/data/skills/bio-workflows-crispr-screen-pipeline/SKILL.md +278 -0
  392. package/data/skills/bio-workflows-cytometry-pipeline/SKILL.md +328 -0
  393. package/data/skills/bio-workflows-expression-to-pathways/SKILL.md +329 -0
  394. package/data/skills/bio-workflows-fastq-to-variants/SKILL.md +374 -0
  395. package/data/skills/bio-workflows-genome-assembly-pipeline/SKILL.md +290 -0
  396. package/data/skills/bio-workflows-gwas-pipeline/SKILL.md +323 -0
  397. package/data/skills/bio-workflows-hic-pipeline/SKILL.md +304 -0
  398. package/data/skills/bio-workflows-imc-pipeline/SKILL.md +304 -0
  399. package/data/skills/bio-workflows-longread-sv-pipeline/SKILL.md +281 -0
  400. package/data/skills/bio-workflows-merip-pipeline/SKILL.md +222 -0
  401. package/data/skills/bio-workflows-metabolic-modeling-pipeline/SKILL.md +408 -0
  402. package/data/skills/bio-workflows-metabolomics-pipeline/SKILL.md +297 -0
  403. package/data/skills/bio-workflows-metagenomics-pipeline/SKILL.md +283 -0
  404. package/data/skills/bio-workflows-methylation-pipeline/SKILL.md +274 -0
  405. package/data/skills/bio-workflows-microbiome-pipeline/SKILL.md +221 -0
  406. package/data/skills/bio-workflows-multi-omics-pipeline/SKILL.md +362 -0
  407. package/data/skills/bio-workflows-multiome-pipeline/SKILL.md +298 -0
  408. package/data/skills/bio-workflows-neoantigen-pipeline/SKILL.md +325 -0
  409. package/data/skills/bio-workflows-outbreak-pipeline/SKILL.md +341 -0
  410. package/data/skills/bio-workflows-proteomics-pipeline/SKILL.md +226 -0
  411. package/data/skills/bio-workflows-riboseq-pipeline/SKILL.md +94 -0
  412. package/data/skills/bio-workflows-rnaseq-to-de/SKILL.md +345 -0
  413. package/data/skills/bio-workflows-scrnaseq-pipeline/SKILL.md +354 -0
  414. package/data/skills/bio-workflows-smrna-pipeline/SKILL.md +86 -0
  415. package/data/skills/bio-workflows-somatic-variant-pipeline/SKILL.md +313 -0
  416. package/data/skills/bio-workflows-spatial-pipeline/SKILL.md +267 -0
  417. package/data/skills/bio-workflows-tcr-pipeline/SKILL.md +84 -0
  418. package/data/skills/bio-write-sequences/SKILL.md +205 -0
  419. package/data/skills/bioinformatics-singlecell/SKILL.md +143 -0
  420. package/data/skills/biokernel/SKILL.md +61 -0
  421. package/data/skills/biologist-analyst/SKILL.md +799 -0
  422. package/data/skills/biomaster-workflows/SKILL.md +55 -0
  423. package/data/skills/biomcp-server/SKILL.md +65 -0
  424. package/data/skills/biomedical-data-analysis/SKILL.md +56 -0
  425. package/data/skills/biomedical-search/SKILL.md +214 -0
  426. package/data/skills/biomni/SKILL.md +309 -0
  427. package/data/skills/biomni-general-agent/SKILL.md +43 -0
  428. package/data/skills/biomni-research-agent/SKILL.md +76 -0
  429. package/data/skills/biopython/SKILL.md +437 -0
  430. package/data/skills/biorxiv-database/SKILL.md +477 -0
  431. package/data/skills/bioservices/SKILL.md +355 -0
  432. package/data/skills/boltz/SKILL.md +188 -0
  433. package/data/skills/boltzgen/SKILL.md +287 -0
  434. package/data/skills/bone-marrow-ai-agent/SKILL.md +163 -0
  435. package/data/skills/brainstorming/SKILL.md +96 -0
  436. package/data/skills/brenda-database/SKILL.md +714 -0
  437. package/data/skills/bulk-combat-correction/SKILL.md +54 -0
  438. package/data/skills/bulk-deg-analysis/SKILL.md +61 -0
  439. package/data/skills/bulk-deseq2-analysis/SKILL.md +50 -0
  440. package/data/skills/bulk-stringdb-ppi/SKILL.md +49 -0
  441. package/data/skills/bulk-to-single-deconvolution/SKILL.md +50 -0
  442. package/data/skills/bulk-trajblend-interpolation/SKILL.md +52 -0
  443. package/data/skills/bulk-wgcna-analysis/SKILL.md +56 -0
  444. package/data/skills/cancer-metabolism-agent/SKILL.md +180 -0
  445. package/data/skills/care-coordination/SKILL.md +35 -0
  446. package/data/skills/cart-design-optimizer-agent/SKILL.md +162 -0
  447. package/data/skills/cbioportal-database/SKILL.md +367 -0
  448. package/data/skills/cell-free-expression/SKILL.md +291 -0
  449. package/data/skills/cellagent-annotation/SKILL.md +69 -0
  450. package/data/skills/cellfree-rna-agent/SKILL.md +182 -0
  451. package/data/skills/cellular-senescence-agent/SKILL.md +183 -0
  452. package/data/skills/cellxgene-census/SKILL.md +505 -0
  453. package/data/skills/chai/SKILL.md +272 -0
  454. package/data/skills/chatehr-clinician-assistant/SKILL.md +67 -0
  455. package/data/skills/chematagent-drug-discovery/SKILL.md +68 -0
  456. package/data/skills/chembl-database/SKILL.md +383 -0
  457. package/data/skills/chembl-search/SKILL.md +211 -0
  458. package/data/skills/chemcrow-drug-discovery/SKILL.md +61 -0
  459. package/data/skills/chemical-property-lookup/SKILL.md +42 -0
  460. package/data/skills/chemist-analyst/SKILL.md +1603 -0
  461. package/data/skills/chemistry-agent/SKILL.md +62 -0
  462. package/data/skills/chip-clonal-hematopoiesis-agent/SKILL.md +224 -0
  463. package/data/skills/chromosomal-instability-agent/SKILL.md +187 -0
  464. package/data/skills/citation-management/SKILL.md +1081 -0
  465. package/data/skills/claims-appeals/SKILL.md +35 -0
  466. package/data/skills/claw-ancestry-pca/SKILL.md +145 -0
  467. package/data/skills/claw-metagenomics/SKILL.md +238 -0
  468. package/data/skills/claw-semantic-sim/SKILL.md +151 -0
  469. package/data/skills/clinical-decision-support/SKILL.md +504 -0
  470. package/data/skills/clinical-diagnostic-reasoning/SKILL.md +222 -0
  471. package/data/skills/clinical-nlp-extractor/SKILL.md +59 -0
  472. package/data/skills/clinical-note-summarization/SKILL.md +52 -0
  473. package/data/skills/clinical-reports/SKILL.md +1127 -0
  474. package/data/skills/clinical-trial-protocol-skill/SKILL.md +508 -0
  475. package/data/skills/clinical-trials-search/SKILL.md +211 -0
  476. package/data/skills/clinicaltrials-database/SKILL.md +501 -0
  477. package/data/skills/clinpgx/SKILL.md +96 -0
  478. package/data/skills/clinpgx-database/SKILL.md +632 -0
  479. package/data/skills/clinvar-database/SKILL.md +356 -0
  480. package/data/skills/cnv-caller-agent/SKILL.md +171 -0
  481. package/data/skills/coagulation-thrombosis-agent/SKILL.md +141 -0
  482. package/data/skills/cobrapy/SKILL.md +457 -0
  483. package/data/skills/compbioagent-explorer/SKILL.md +67 -0
  484. package/data/skills/computational-pathology-agent/SKILL.md +72 -0
  485. package/data/skills/convergence-study/SKILL.md +98 -0
  486. package/data/skills/cosmic-database/SKILL.md +330 -0
  487. package/data/skills/crisis-detection-intervention-ai/SKILL.md +569 -0
  488. package/data/skills/crisis-response-protocol/SKILL.md +456 -0
  489. package/data/skills/crispr-guide-design/SKILL.md +72 -0
  490. package/data/skills/crispr-offtarget-predictor/SKILL.md +56 -0
  491. package/data/skills/cryoem-ai-drug-design-agent/SKILL.md +216 -0
  492. package/data/skills/ctdna-dynamics-mrd-agent/SKILL.md +206 -0
  493. package/data/skills/cytokine-storm-analysis-agent/SKILL.md +180 -0
  494. package/data/skills/dask/SKILL.md +454 -0
  495. package/data/skills/data-stats-analysis/SKILL.md +477 -0
  496. package/data/skills/data-transform/SKILL.md +576 -0
  497. package/data/skills/data-visualization-biomedical/SKILL.md +252 -0
  498. package/data/skills/data-visualization-expert/SKILL.md +72 -0
  499. package/data/skills/data-viz-plots/SKILL.md +461 -0
  500. package/data/skills/datacommons-client/SKILL.md +253 -0
  501. package/data/skills/datamol/SKILL.md +700 -0
  502. package/data/skills/deep-research/SKILL.md +111 -0
  503. package/data/skills/deep-research-swarm/SKILL.md +62 -0
  504. package/data/skills/deep-visual-proteomics-agent/SKILL.md +149 -0
  505. package/data/skills/deepchem/SKILL.md +591 -0
  506. package/data/skills/deeptools/SKILL.md +525 -0
  507. package/data/skills/depmap/SKILL.md +300 -0
  508. package/data/skills/diffdock/SKILL.md +477 -0
  509. package/data/skills/differentiation-schemes/SKILL.md +159 -0
  510. package/data/skills/digital-twin-clinical-agent/SKILL.md +228 -0
  511. package/data/skills/dispatching-parallel-agents/SKILL.md +180 -0
  512. package/data/skills/dnanexus-integration/SKILL.md +376 -0
  513. package/data/skills/doc-coauthoring/SKILL.md +375 -0
  514. package/data/skills/docx/SKILL.md +590 -0
  515. package/data/skills/docx-official/SKILL.md +197 -0
  516. package/data/skills/drug-discovery-search/SKILL.md +214 -0
  517. package/data/skills/drug-interaction-checker/SKILL.md +56 -0
  518. package/data/skills/drug-labels-search/SKILL.md +211 -0
  519. package/data/skills/drug-photo/SKILL.md +149 -0
  520. package/data/skills/drugbank-database/SKILL.md +184 -0
  521. package/data/skills/drugbank-search/SKILL.md +211 -0
  522. package/data/skills/ehr-fhir-integration/SKILL.md +60 -0
  523. package/data/skills/emergency-card/SKILL.md +426 -0
  524. package/data/skills/ena-database/SKILL.md +198 -0
  525. package/data/skills/ensembl-database/SKILL.md +305 -0
  526. package/data/skills/epidemiologist-analyst/SKILL.md +1844 -0
  527. package/data/skills/epigenomics-methylgpt-agent/SKILL.md +111 -0
  528. package/data/skills/equity-scorer/SKILL.md +182 -0
  529. package/data/skills/esm/SKILL.md +300 -0
  530. package/data/skills/etetoolkit/SKILL.md +617 -0
  531. package/data/skills/executing-plans/SKILL.md +84 -0
  532. package/data/skills/exosome-ev-analysis-agent/SKILL.md +171 -0
  533. package/data/skills/exploratory-data-analysis/SKILL.md +440 -0
  534. package/data/skills/family-health-analyzer/SKILL.md +137 -0
  535. package/data/skills/fastq-analysis/SKILL.md +191 -0
  536. package/data/skills/fda-database/SKILL.md +512 -0
  537. package/data/skills/fhir-developer-skill/SKILL.md +294 -0
  538. package/data/skills/fhir-development/SKILL.md +35 -0
  539. package/data/skills/find-skills/SKILL.md +133 -0
  540. package/data/skills/finishing-a-development-branch/SKILL.md +200 -0
  541. package/data/skills/fitness-analyzer/SKILL.md +431 -0
  542. package/data/skills/flowio/SKILL.md +602 -0
  543. package/data/skills/foldseek/SKILL.md +179 -0
  544. package/data/skills/galaxy-bridge/SKILL.md +215 -0
  545. package/data/skills/gene-database/SKILL.md +173 -0
  546. package/data/skills/gene-panel-design-agent/SKILL.md +192 -0
  547. package/data/skills/geniml/SKILL.md +312 -0
  548. package/data/skills/genome-compare/SKILL.md +127 -0
  549. package/data/skills/geo-database/SKILL.md +809 -0
  550. package/data/skills/geopandas/SKILL.md +245 -0
  551. package/data/skills/gget/SKILL.md +865 -0
  552. package/data/skills/ginkgo-cloud-lab/SKILL.md +56 -0
  553. package/data/skills/glycoengineering/SKILL.md +338 -0
  554. package/data/skills/gnomad-database/SKILL.md +395 -0
  555. package/data/skills/goal-analyzer/SKILL.md +605 -0
  556. package/data/skills/grief-companion/SKILL.md +250 -0
  557. package/data/skills/gsea-enrichment/SKILL.md +151 -0
  558. package/data/skills/gtars/SKILL.md +279 -0
  559. package/data/skills/gtex-database/SKILL.md +315 -0
  560. package/data/skills/gwas-database/SKILL.md +602 -0
  561. package/data/skills/gwas-lookup/SKILL.md +122 -0
  562. package/data/skills/gwas-prs/SKILL.md +178 -0
  563. package/data/skills/health-trend-analyzer/SKILL.md +451 -0
  564. package/data/skills/hemoglobinopathy-analysis-agent/SKILL.md +167 -0
  565. package/data/skills/hipaa-compliance/SKILL.md +230 -0
  566. package/data/skills/histolab/SKILL.md +672 -0
  567. package/data/skills/hmdb-database/SKILL.md +190 -0
  568. package/data/skills/hrd-analysis-agent/SKILL.md +184 -0
  569. package/data/skills/hrv-alexithymia-expert/SKILL.md +151 -0
  570. package/data/skills/hypogenic/SKILL.md +649 -0
  571. package/data/skills/hypothesis-generation/SKILL.md +286 -0
  572. package/data/skills/imaging-data-commons/SKILL.md +843 -0
  573. package/data/skills/immune-checkpoint-combination-agent/SKILL.md +170 -0
  574. package/data/skills/infographics/SKILL.md +563 -0
  575. package/data/skills/instrument-data-to-allotrope/SKILL.md +280 -0
  576. package/data/skills/interpro-database/SKILL.md +305 -0
  577. package/data/skills/ipsae/SKILL.md +190 -0
  578. package/data/skills/iso-13485-certification/SKILL.md +678 -0
  579. package/data/skills/jaspar-database/SKILL.md +351 -0
  580. package/data/skills/jungian-psychologist/SKILL.md +191 -0
  581. package/data/skills/kegg-database/SKILL.md +371 -0
  582. package/data/skills/knowledge-synthesis/SKILL.md +283 -0
  583. package/data/skills/kragen-knowledge-graph/SKILL.md +68 -0
  584. package/data/skills/lab-results/SKILL.md +35 -0
  585. package/data/skills/labarchive-integration/SKILL.md +262 -0
  586. package/data/skills/labstep/SKILL.md +208 -0
  587. package/data/skills/lamindb/SKILL.md +384 -0
  588. package/data/skills/latchbio-integration/SKILL.md +347 -0
  589. package/data/skills/latex-posters/SKILL.md +1602 -0
  590. package/data/skills/leads-literature-mining/SKILL.md +68 -0
  591. package/data/skills/ligandmpnn/SKILL.md +170 -0
  592. package/data/skills/linear-solvers/SKILL.md +165 -0
  593. package/data/skills/liquid-biopsy-analytics-agent/SKILL.md +171 -0
  594. package/data/skills/lit-synthesizer/SKILL.md +53 -0
  595. package/data/skills/literature-review/SKILL.md +584 -0
  596. package/data/skills/literature-search/SKILL.md +214 -0
  597. package/data/skills/lobster-bioinformatics/SKILL.md +305 -0
  598. package/data/skills/long-read-sequencing-agent/SKILL.md +181 -0
  599. package/data/skills/mage-antibody-generator/SKILL.md +54 -0
  600. package/data/skills/markdown-mermaid-writing/SKILL.md +327 -0
  601. package/data/skills/markitdown/SKILL.md +486 -0
  602. package/data/skills/matchms/SKILL.md +197 -0
  603. package/data/skills/matplotlib/SKILL.md +359 -0
  604. package/data/skills/mcpmed-bioinformatics-server/SKILL.md +42 -0
  605. package/data/skills/medchem/SKILL.md +400 -0
  606. package/data/skills/medea-therapeutic-discovery/SKILL.md +45 -0
  607. package/data/skills/medical-entity-extractor/SKILL.md +144 -0
  608. package/data/skills/medical-imaging-review/SKILL.md +170 -0
  609. package/data/skills/medical-research-toolkit/SKILL.md +273 -0
  610. package/data/skills/medrxiv-search/SKILL.md +211 -0
  611. package/data/skills/mental-health-analyzer/SKILL.md +981 -0
  612. package/data/skills/mesh-generation/SKILL.md +149 -0
  613. package/data/skills/metabolomics-workbench-database/SKILL.md +253 -0
  614. package/data/skills/microbiome-cancer-agent/SKILL.md +180 -0
  615. package/data/skills/modern-drug-rehab-computer/SKILL.md +392 -0
  616. package/data/skills/molecular-dynamics/SKILL.md +457 -0
  617. package/data/skills/molecular-glue-discovery-agent/SKILL.md +224 -0
  618. package/data/skills/molecule-evolution-agent/SKILL.md +62 -0
  619. package/data/skills/molfeat/SKILL.md +505 -0
  620. package/data/skills/monarch-database/SKILL.md +372 -0
  621. package/data/skills/mpn-progression-monitor-agent/SKILL.md +228 -0
  622. package/data/skills/mpn-research-assistant/SKILL.md +197 -0
  623. package/data/skills/mrd-edge-detection-agent/SKILL.md +213 -0
  624. package/data/skills/multi-ancestry-prs-agent/SKILL.md +224 -0
  625. package/data/skills/multi-search-engine/SKILL.md +110 -0
  626. package/data/skills/multimodal-medical-imaging/SKILL.md +59 -0
  627. package/data/skills/multimodal-radpath-fusion-agent/SKILL.md +213 -0
  628. package/data/skills/myeloma-mrd-agent/SKILL.md +184 -0
  629. package/data/skills/networkx/SKILL.md +435 -0
  630. package/data/skills/neurokit2/SKILL.md +350 -0
  631. package/data/skills/neuropixels-analysis/SKILL.md +344 -0
  632. package/data/skills/nextflow-development/SKILL.md +290 -0
  633. package/data/skills/ngs-analysis/SKILL.md +183 -0
  634. package/data/skills/nicheformer-spatial-agent/SKILL.md +197 -0
  635. package/data/skills/nk-cell-therapy-agent/SKILL.md +186 -0
  636. package/data/skills/nonlinear-solvers/SKILL.md +180 -0
  637. package/data/skills/numerical-integration/SKILL.md +166 -0
  638. package/data/skills/numerical-stability/SKILL.md +149 -0
  639. package/data/skills/nutrition-analyzer/SKILL.md +775 -0
  640. package/data/skills/occupational-health-analyzer/SKILL.md +386 -0
  641. package/data/skills/omero-integration/SKILL.md +245 -0
  642. package/data/skills/ontology-explorer/SKILL.md +168 -0
  643. package/data/skills/ontology-mapper/SKILL.md +171 -0
  644. package/data/skills/ontology-validator/SKILL.md +136 -0
  645. package/data/skills/open-notebook/SKILL.md +289 -0
  646. package/data/skills/open-targets-search/SKILL.md +211 -0
  647. package/data/skills/openalex-database/SKILL.md +488 -0
  648. package/data/skills/opentargets-database/SKILL.md +367 -0
  649. package/data/skills/opentrons-integration/SKILL.md +567 -0
  650. package/data/skills/opentrons-protocol-agent/SKILL.md +58 -0
  651. package/data/skills/organoid-drug-response-agent/SKILL.md +189 -0
  652. package/data/skills/pan-cancer-multiomics-agent/SKILL.md +159 -0
  653. package/data/skills/paper-2-web/SKILL.md +495 -0
  654. package/data/skills/parameter-optimization/SKILL.md +141 -0
  655. package/data/skills/patents-search/SKILL.md +211 -0
  656. package/data/skills/pathml/SKILL.md +160 -0
  657. package/data/skills/patiently-ai/SKILL.md +103 -0
  658. package/data/skills/pdb/SKILL.md +217 -0
  659. package/data/skills/pdb-database/SKILL.md +303 -0
  660. package/data/skills/pdf/SKILL.md +314 -0
  661. package/data/skills/pdf-anthropic/SKILL.md +294 -0
  662. package/data/skills/pdf-processing/SKILL.md +149 -0
  663. package/data/skills/pdf-processing-pro/SKILL.md +296 -0
  664. package/data/skills/pdx-model-analysis-agent/SKILL.md +169 -0
  665. package/data/skills/peer-review/SKILL.md +565 -0
  666. package/data/skills/performance-profiling/SKILL.md +255 -0
  667. package/data/skills/perplexity-search/SKILL.md +441 -0
  668. package/data/skills/pharmacogenomics-agent/SKILL.md +143 -0
  669. package/data/skills/pharmgx-reporter/SKILL.md +134 -0
  670. package/data/skills/phylogenetics/SKILL.md +404 -0
  671. package/data/skills/plotly/SKILL.md +265 -0
  672. package/data/skills/polars/SKILL.md +385 -0
  673. package/data/skills/popeve-variant-predictor-agent/SKILL.md +213 -0
  674. package/data/skills/post-processing/SKILL.md +338 -0
  675. package/data/skills/pptx/SKILL.md +232 -0
  676. package/data/skills/pptx-official/SKILL.md +484 -0
  677. package/data/skills/pptx-posters/SKILL.md +414 -0
  678. package/data/skills/precision-oncology-agent/SKILL.md +53 -0
  679. package/data/skills/prior-auth-coworker/SKILL.md +60 -0
  680. package/data/skills/prior-auth-review-skill/SKILL.md +360 -0
  681. package/data/skills/profile-report/SKILL.md +120 -0
  682. package/data/skills/protac-design-agent/SKILL.md +220 -0
  683. package/data/skills/protein-design-workflow/SKILL.md +199 -0
  684. package/data/skills/protein-qc/SKILL.md +300 -0
  685. package/data/skills/protein-structure-prediction/SKILL.md +59 -0
  686. package/data/skills/proteinmpnn/SKILL.md +279 -0
  687. package/data/skills/protocolsio-integration/SKILL.md +415 -0
  688. package/data/skills/prs-net-deep-learning-agent/SKILL.md +232 -0
  689. package/data/skills/psychologist-analyst/SKILL.md +1888 -0
  690. package/data/skills/pubchem-database/SKILL.md +568 -0
  691. package/data/skills/pubmed-database/SKILL.md +454 -0
  692. package/data/skills/pubmed-search/SKILL.md +103 -0
  693. package/data/skills/pydeseq2/SKILL.md +553 -0
  694. package/data/skills/pydicom/SKILL.md +428 -0
  695. package/data/skills/pyhealth/SKILL.md +485 -0
  696. package/data/skills/pylabrobot/SKILL.md +179 -0
  697. package/data/skills/pymc/SKILL.md +566 -0
  698. package/data/skills/pymoo/SKILL.md +565 -0
  699. package/data/skills/pyopenms/SKILL.md +211 -0
  700. package/data/skills/pysam/SKILL.md +259 -0
  701. package/data/skills/pytdc/SKILL.md +454 -0
  702. package/data/skills/pytorch-lightning/SKILL.md +172 -0
  703. package/data/skills/pyzotero/SKILL.md +111 -0
  704. package/data/skills/radgpt-radiology-reporter/SKILL.md +67 -0
  705. package/data/skills/radiomics-pathomics-fusion-agent/SKILL.md +221 -0
  706. package/data/skills/rdkit/SKILL.md +763 -0
  707. package/data/skills/reactome-database/SKILL.md +272 -0
  708. package/data/skills/receiving-code-review/SKILL.md +213 -0
  709. package/data/skills/recovery-community-moderator/SKILL.md +175 -0
  710. package/data/skills/regulatory-drafter/SKILL.md +56 -0
  711. package/data/skills/regulatory-drafting/SKILL.md +35 -0
  712. package/data/skills/rehabilitation-analyzer/SKILL.md +636 -0
  713. package/data/skills/repro-enforcer/SKILL.md +50 -0
  714. package/data/skills/requesting-code-review/SKILL.md +105 -0
  715. package/data/skills/research-grants/SKILL.md +935 -0
  716. package/data/skills/research-literature/SKILL.md +35 -0
  717. package/data/skills/research-lookup/SKILL.md +502 -0
  718. package/data/skills/rfdiffusion/SKILL.md +306 -0
  719. package/data/skills/rna-velocity-agent/SKILL.md +174 -0
  720. package/data/skills/scanpy/SKILL.md +380 -0
  721. package/data/skills/scfoundation-model-agent/SKILL.md +210 -0
  722. package/data/skills/scientific-brainstorming/SKILL.md +185 -0
  723. package/data/skills/scientific-critical-thinking/SKILL.md +566 -0
  724. package/data/skills/scientific-manuscript/SKILL.md +181 -0
  725. package/data/skills/scientific-problem-selection/SKILL.md +269 -0
  726. package/data/skills/scientific-schematics/SKILL.md +619 -0
  727. package/data/skills/scientific-slides/SKILL.md +1154 -0
  728. package/data/skills/scientific-visualization/SKILL.md +773 -0
  729. package/data/skills/scientific-writing/SKILL.md +483 -0
  730. package/data/skills/scikit-bio/SKILL.md +431 -0
  731. package/data/skills/scikit-learn/SKILL.md +515 -0
  732. package/data/skills/scikit-survival/SKILL.md +393 -0
  733. package/data/skills/scrna-orchestrator/SKILL.md +204 -0
  734. package/data/skills/scrna-qc/SKILL.md +43 -0
  735. package/data/skills/scvelo/SKILL.md +321 -0
  736. package/data/skills/scvi-tools/SKILL.md +184 -0
  737. package/data/skills/seaborn/SKILL.md +671 -0
  738. package/data/skills/search-strategy/SKILL.md +247 -0
  739. package/data/skills/seq-wrangler/SKILL.md +58 -0
  740. package/data/skills/shap/SKILL.md +560 -0
  741. package/data/skills/simo-multiomics-integration-agent/SKILL.md +178 -0
  742. package/data/skills/simpy/SKILL.md +423 -0
  743. package/data/skills/simulation-orchestrator/SKILL.md +230 -0
  744. package/data/skills/simulation-validator/SKILL.md +195 -0
  745. package/data/skills/single-annotation/SKILL.md +129 -0
  746. package/data/skills/single-cell-rna-qc/SKILL.md +175 -0
  747. package/data/skills/single-cellphone-db/SKILL.md +68 -0
  748. package/data/skills/single-clustering/SKILL.md +75 -0
  749. package/data/skills/single-downstream-analysis/SKILL.md +150 -0
  750. package/data/skills/single-multiomics/SKILL.md +44 -0
  751. package/data/skills/single-preprocessing/SKILL.md +184 -0
  752. package/data/skills/single-to-spatial-mapping/SKILL.md +48 -0
  753. package/data/skills/single-trajectory/SKILL.md +62 -0
  754. package/data/skills/sleep-analyzer/SKILL.md +773 -0
  755. package/data/skills/slurm-job-script-generator/SKILL.md +135 -0
  756. package/data/skills/solublempnn/SKILL.md +165 -0
  757. package/data/skills/spatial-agent/SKILL.md +56 -0
  758. package/data/skills/spatial-epigenomics-agent/SKILL.md +163 -0
  759. package/data/skills/spatial-transcriptomics-agent/SKILL.md +75 -0
  760. package/data/skills/spatial-transcriptomics-analysis/SKILL.md +72 -0
  761. package/data/skills/spatial-transcriptomics-analysis/STAgent/SKILL.md +75 -0
  762. package/data/skills/spatial-transcriptomics-analysis/SpatialAgent/SKILL.md +56 -0
  763. package/data/skills/spatial-transcriptomics-analysis/bioSkills/image-analysis/SKILL.md +266 -0
  764. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-communication/SKILL.md +287 -0
  765. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-data-io/SKILL.md +243 -0
  766. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-deconvolution/SKILL.md +298 -0
  767. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-domains/SKILL.md +229 -0
  768. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-multiomics/SKILL.md +172 -0
  769. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-neighbors/SKILL.md +189 -0
  770. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-preprocessing/SKILL.md +232 -0
  771. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-proteomics/SKILL.md +127 -0
  772. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-statistics/SKILL.md +225 -0
  773. package/data/skills/spatial-transcriptomics-analysis/bioSkills/spatial-visualization/SKILL.md +270 -0
  774. package/data/skills/spatial-tutorials/SKILL.md +87 -0
  775. package/data/skills/speech-pathology-ai/SKILL.md +184 -0
  776. package/data/skills/statistical-analysis/SKILL.md +626 -0
  777. package/data/skills/statsmodels/SKILL.md +608 -0
  778. package/data/skills/string-database/SKILL.md +528 -0
  779. package/data/skills/struct-predictor/SKILL.md +52 -0
  780. package/data/skills/subagent-driven-development/SKILL.md +242 -0
  781. package/data/skills/systematic-debugging/SKILL.md +296 -0
  782. package/data/skills/tcell-exhaustion-analysis-agent/SKILL.md +139 -0
  783. package/data/skills/tcga-preprocessing/SKILL.md +49 -0
  784. package/data/skills/tcm-constitution-analyzer/SKILL.md +664 -0
  785. package/data/skills/tcr-pmhc-prediction-agent/SKILL.md +226 -0
  786. package/data/skills/tcr-repertoire-analysis-agent/SKILL.md +218 -0
  787. package/data/skills/test-driven-development/SKILL.md +371 -0
  788. package/data/skills/tiledbvcf/SKILL.md +459 -0
  789. package/data/skills/time-resolved-cryoem-agent/SKILL.md +223 -0
  790. package/data/skills/time-stepping/SKILL.md +140 -0
  791. package/data/skills/timesfm-forecasting/SKILL.md +785 -0
  792. package/data/skills/tme-immune-profiling-agent/SKILL.md +220 -0
  793. package/data/skills/tooluniverse-adverse-event-detection/SKILL.md +1115 -0
  794. package/data/skills/tooluniverse-antibody-engineering/SKILL.md +1581 -0
  795. package/data/skills/tooluniverse-binder-discovery/SKILL.md +1459 -0
  796. package/data/skills/tooluniverse-cancer-variant-interpretation/SKILL.md +971 -0
  797. package/data/skills/tooluniverse-chemical-compound-retrieval/SKILL.md +322 -0
  798. package/data/skills/tooluniverse-chemical-safety/SKILL.md +733 -0
  799. package/data/skills/tooluniverse-clinical-guidelines/SKILL.md +399 -0
  800. package/data/skills/tooluniverse-clinical-trial-design/SKILL.md +1195 -0
  801. package/data/skills/tooluniverse-clinical-trial-matching/SKILL.md +1333 -0
  802. package/data/skills/tooluniverse-crispr-screen-analysis/SKILL.md +900 -0
  803. package/data/skills/tooluniverse-disease-research/SKILL.md +630 -0
  804. package/data/skills/tooluniverse-drug-drug-interaction/SKILL.md +73 -0
  805. package/data/skills/tooluniverse-drug-repurposing/SKILL.md +595 -0
  806. package/data/skills/tooluniverse-drug-research/SKILL.md +1642 -0
  807. package/data/skills/tooluniverse-drug-target-validation/SKILL.md +1206 -0
  808. package/data/skills/tooluniverse-epigenomics/SKILL.md +1489 -0
  809. package/data/skills/tooluniverse-expression-data-retrieval/SKILL.md +389 -0
  810. package/data/skills/tooluniverse-gene-enrichment/SKILL.md +402 -0
  811. package/data/skills/tooluniverse-gwas-drug-discovery/SKILL.md +576 -0
  812. package/data/skills/tooluniverse-gwas-finemapping/SKILL.md +309 -0
  813. package/data/skills/tooluniverse-gwas-snp-interpretation/SKILL.md +223 -0
  814. package/data/skills/tooluniverse-gwas-study-explorer/SKILL.md +342 -0
  815. package/data/skills/tooluniverse-gwas-trait-to-gene/SKILL.md +236 -0
  816. package/data/skills/tooluniverse-image-analysis/SKILL.md +439 -0
  817. package/data/skills/tooluniverse-immune-repertoire-analysis/SKILL.md +949 -0
  818. package/data/skills/tooluniverse-immunotherapy-response-prediction/SKILL.md +865 -0
  819. package/data/skills/tooluniverse-infectious-disease/SKILL.md +749 -0
  820. package/data/skills/tooluniverse-literature-deep-research/SKILL.md +1050 -0
  821. package/data/skills/tooluniverse-metabolomics/SKILL.md +298 -0
  822. package/data/skills/tooluniverse-metabolomics-analysis/SKILL.md +764 -0
  823. package/data/skills/tooluniverse-multi-omics-integration/SKILL.md +703 -0
  824. package/data/skills/tooluniverse-multiomic-disease-characterization/SKILL.md +1138 -0
  825. package/data/skills/tooluniverse-network-pharmacology/SKILL.md +1312 -0
  826. package/data/skills/tooluniverse-pharmacovigilance/SKILL.md +807 -0
  827. package/data/skills/tooluniverse-phylogenetics/SKILL.md +461 -0
  828. package/data/skills/tooluniverse-polygenic-risk-score/SKILL.md +397 -0
  829. package/data/skills/tooluniverse-precision-medicine-stratification/SKILL.md +1143 -0
  830. package/data/skills/tooluniverse-precision-oncology/SKILL.md +1091 -0
  831. package/data/skills/tooluniverse-protein-interactions/SKILL.md +446 -0
  832. package/data/skills/tooluniverse-protein-structure-retrieval/SKILL.md +416 -0
  833. package/data/skills/tooluniverse-protein-therapeutic-design/SKILL.md +637 -0
  834. package/data/skills/tooluniverse-proteomics-analysis/SKILL.md +843 -0
  835. package/data/skills/tooluniverse-rare-disease-diagnosis/SKILL.md +1257 -0
  836. package/data/skills/tooluniverse-rnaseq-deseq2/SKILL.md +536 -0
  837. package/data/skills/tooluniverse-sequence-retrieval/SKILL.md +419 -0
  838. package/data/skills/tooluniverse-single-cell/SKILL.md +719 -0
  839. package/data/skills/tooluniverse-spatial-omics-analysis/SKILL.md +1102 -0
  840. package/data/skills/tooluniverse-spatial-transcriptomics/SKILL.md +788 -0
  841. package/data/skills/tooluniverse-statistical-modeling/SKILL.md +557 -0
  842. package/data/skills/tooluniverse-structural-variant-analysis/SKILL.md +1356 -0
  843. package/data/skills/tooluniverse-systems-biology/SKILL.md +374 -0
  844. package/data/skills/tooluniverse-target-research/SKILL.md +1510 -0
  845. package/data/skills/tooluniverse-variant-analysis/SKILL.md +448 -0
  846. package/data/skills/tooluniverse-variant-interpretation/SKILL.md +1118 -0
  847. package/data/skills/torch-geometric/SKILL.md +674 -0
  848. package/data/skills/torch_geometric/SKILL.md +670 -0
  849. package/data/skills/torchdrug/SKILL.md +444 -0
  850. package/data/skills/tpd-ternary-complex-agent/SKILL.md +226 -0
  851. package/data/skills/transformers/SKILL.md +157 -0
  852. package/data/skills/travel-health-analyzer/SKILL.md +421 -0
  853. package/data/skills/treatment-plans/SKILL.md +1576 -0
  854. package/data/skills/trial-eligibility-agent/SKILL.md +54 -0
  855. package/data/skills/trialgpt-matching/SKILL.md +66 -0
  856. package/data/skills/tumor-clonal-evolution-agent/SKILL.md +134 -0
  857. package/data/skills/tumor-heterogeneity-agent/SKILL.md +216 -0
  858. package/data/skills/tumor-mutational-burden-agent/SKILL.md +188 -0
  859. package/data/skills/ukb-navigator/SKILL.md +113 -0
  860. package/data/skills/umap-learn/SKILL.md +473 -0
  861. package/data/skills/uniprot-database/SKILL.md +189 -0
  862. package/data/skills/universal-single-cell-annotator/SKILL.md +72 -0
  863. package/data/skills/using-git-worktrees/SKILL.md +218 -0
  864. package/data/skills/using-superpowers/SKILL.md +95 -0
  865. package/data/skills/usmle/SKILL.md +62 -0
  866. package/data/skills/uspto-database/SKILL.md +597 -0
  867. package/data/skills/vaex/SKILL.md +180 -0
  868. package/data/skills/varcadd-pathogenicity/SKILL.md +68 -0
  869. package/data/skills/variant-interpretation-acmg/SKILL.md +58 -0
  870. package/data/skills/variant-interpretation-acmg/bioSkills/clinical-interpretation/SKILL.md +334 -0
  871. package/data/skills/variant-interpretation-acmg/bioSkills/consensus-sequences/SKILL.md +343 -0
  872. package/data/skills/variant-interpretation-acmg/bioSkills/deepvariant/SKILL.md +279 -0
  873. package/data/skills/variant-interpretation-acmg/bioSkills/filtering-best-practices/SKILL.md +362 -0
  874. package/data/skills/variant-interpretation-acmg/bioSkills/gatk-variant-calling/SKILL.md +398 -0
  875. package/data/skills/variant-interpretation-acmg/bioSkills/joint-calling/SKILL.md +343 -0
  876. package/data/skills/variant-interpretation-acmg/bioSkills/structural-variant-calling/SKILL.md +256 -0
  877. package/data/skills/variant-interpretation-acmg/bioSkills/variant-annotation/SKILL.md +387 -0
  878. package/data/skills/variant-interpretation-acmg/bioSkills/variant-calling/SKILL.md +258 -0
  879. package/data/skills/variant-interpretation-acmg/bioSkills/variant-normalization/SKILL.md +304 -0
  880. package/data/skills/variant-interpretation-acmg/bioSkills/vcf-basics/SKILL.md +329 -0
  881. package/data/skills/variant-interpretation-acmg/bioSkills/vcf-manipulation/SKILL.md +398 -0
  882. package/data/skills/variant-interpretation-acmg/bioSkills/vcf-statistics/SKILL.md +424 -0
  883. package/data/skills/variant-interpretation-acmg/varCADD/SKILL.md +68 -0
  884. package/data/skills/vcf-annotator/SKILL.md +55 -0
  885. package/data/skills/verification-before-completion/SKILL.md +139 -0
  886. package/data/skills/virtual-lab-agent/SKILL.md +240 -0
  887. package/data/skills/wearable-analysis-agent/SKILL.md +70 -0
  888. package/data/skills/weightloss-analyzer/SKILL.md +320 -0
  889. package/data/skills/wellally-tech/SKILL.md +685 -0
  890. package/data/skills/wikipedia-search/SKILL.md +481 -0
  891. package/data/skills/writing-plans/SKILL.md +116 -0
  892. package/data/skills/writing-skills/SKILL.md +655 -0
  893. package/data/skills/xlsx/SKILL.md +292 -0
  894. package/data/skills/xlsx-official/SKILL.md +289 -0
  895. package/data/skills/zarr-python/SKILL.md +777 -0
  896. package/data/skills/zinc-database/SKILL.md +398 -0
  897. package/data/tools/__init__.py +8 -0
  898. package/data/tools/hpc.py +71 -0
  899. package/data/tools/hpc_client/__init__.py +8 -0
  900. package/data/tools/hpc_client/builders/__init__.py +12 -0
  901. package/data/tools/hpc_client/builders/alphafold.py +36 -0
  902. package/data/tools/hpc_client/builders/boltz.py +33 -0
  903. package/data/tools/hpc_client/builders/chai.py +30 -0
  904. package/data/tools/hpc_client/builders/immunebuilder.py +31 -0
  905. package/data/tools/hpc_client/builders/rfantibody.py +58 -0
  906. package/data/tools/hpc_client/builders/thermompnn.py +16 -0
  907. package/data/tools/hpc_client/hpc_api.py +41 -0
  908. package/data/tools/hpc_client/hpc_tools.py +218 -0
  909. package/data/tools/hpc_dynamic.py +71 -0
  910. package/data/tools/integrations/__init__.py +14 -0
  911. package/data/tools/integrations/adaptyv.py +107 -0
  912. package/data/tools/integrations/addgene.py +52 -0
  913. package/data/tools/integrations/api_internal.py +33 -0
  914. package/data/tools/molecular_biology.py +688 -0
  915. package/data/tools/pharmacology.py +67 -0
  916. package/data/workflows/bulk-omics-clustering/SKILL.md +501 -0
  917. package/data/workflows/bulk-omics-clustering/references/best_practices.md +395 -0
  918. package/data/workflows/bulk-omics-clustering/references/clustering_methods_comparison.md +288 -0
  919. package/data/workflows/bulk-omics-clustering/references/common-patterns.md +1136 -0
  920. package/data/workflows/bulk-omics-clustering/references/decision-guide.md +819 -0
  921. package/data/workflows/bulk-omics-clustering/references/distance_metrics_guide.md +388 -0
  922. package/data/workflows/bulk-omics-clustering/references/parameter_guide.md +396 -0
  923. package/data/workflows/bulk-omics-clustering/references/r-quick-start.md +105 -0
  924. package/data/workflows/bulk-omics-clustering/references/validation_metrics_guide.md +315 -0
  925. package/data/workflows/bulk-omics-clustering/scripts/characterize_clusters.py +255 -0
  926. package/data/workflows/bulk-omics-clustering/scripts/cluster_validation.py +449 -0
  927. package/data/workflows/bulk-omics-clustering/scripts/density_clustering.py +321 -0
  928. package/data/workflows/bulk-omics-clustering/scripts/dimensionality_reduction.py +328 -0
  929. package/data/workflows/bulk-omics-clustering/scripts/distance_metrics.py +251 -0
  930. package/data/workflows/bulk-omics-clustering/scripts/export_results.py +456 -0
  931. package/data/workflows/bulk-omics-clustering/scripts/hierarchical_clustering.R +229 -0
  932. package/data/workflows/bulk-omics-clustering/scripts/hierarchical_clustering.py +269 -0
  933. package/data/workflows/bulk-omics-clustering/scripts/kmeans_clustering.py +346 -0
  934. package/data/workflows/bulk-omics-clustering/scripts/load_example_data.R +171 -0
  935. package/data/workflows/bulk-omics-clustering/scripts/load_example_data.py +171 -0
  936. package/data/workflows/bulk-omics-clustering/scripts/model_based_clustering.py +370 -0
  937. package/data/workflows/bulk-omics-clustering/scripts/optimal_clusters.py +381 -0
  938. package/data/workflows/bulk-omics-clustering/scripts/plot_cluster_heatmap.R +141 -0
  939. package/data/workflows/bulk-omics-clustering/scripts/plot_clustering_results.py +452 -0
  940. package/data/workflows/bulk-omics-clustering/scripts/prepare_data.py +250 -0
  941. package/data/workflows/bulk-omics-clustering/scripts/stability_analysis.py +434 -0
  942. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/SKILL.md +505 -0
  943. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/references/comprehensive-reference.md +440 -0
  944. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/references/decision-guide.md +327 -0
  945. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/references/troubleshooting.md +456 -0
  946. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/references/usage-guide.md +75 -0
  947. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/basic_workflow.R +149 -0
  948. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/batch_correction.R +44 -0
  949. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/export_results.R +190 -0
  950. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/extract_results.R +242 -0
  951. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/load_example_data.R +250 -0
  952. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/multi_condition.R +50 -0
  953. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/qc_plots.R +410 -0
  954. package/data/workflows/bulk-rnaseq-counts-to-de-deseq2/scripts/transformations.R +218 -0
  955. package/data/workflows/chip-atlas-diff-analysis/SKILL.md +222 -0
  956. package/data/workflows/chip-atlas-diff-analysis/references/chipatlas_diff_api_format.md +106 -0
  957. package/data/workflows/chip-atlas-diff-analysis/references/diff_analysis_methods.md +89 -0
  958. package/data/workflows/chip-atlas-diff-analysis/references/output_format.md +78 -0
  959. package/data/workflows/chip-atlas-diff-analysis/scripts/__init__.py +1 -0
  960. package/data/workflows/chip-atlas-diff-analysis/scripts/annotate_genes.py +144 -0
  961. package/data/workflows/chip-atlas-diff-analysis/scripts/export_all.py +498 -0
  962. package/data/workflows/chip-atlas-diff-analysis/scripts/filter_regions.py +176 -0
  963. package/data/workflows/chip-atlas-diff-analysis/scripts/generate_all_plots.py +321 -0
  964. package/data/workflows/chip-atlas-diff-analysis/scripts/load_example_data.py +149 -0
  965. package/data/workflows/chip-atlas-diff-analysis/scripts/load_user_data.py +211 -0
  966. package/data/workflows/chip-atlas-diff-analysis/scripts/parse_bed_results.py +240 -0
  967. package/data/workflows/chip-atlas-diff-analysis/scripts/qc_checks.py +621 -0
  968. package/data/workflows/chip-atlas-diff-analysis/scripts/query_chipatlas_api.py +329 -0
  969. package/data/workflows/chip-atlas-diff-analysis/scripts/run_diff_workflow.py +256 -0
  970. package/data/workflows/chip-atlas-peak-enrichment/SKILL.md +212 -0
  971. package/data/workflows/chip-atlas-peak-enrichment/references/chipatlas_metadata_format.md +115 -0
  972. package/data/workflows/chip-atlas-peak-enrichment/references/enrichment_statistics.md +145 -0
  973. package/data/workflows/chip-atlas-peak-enrichment/references/peak_thresholds.md +63 -0
  974. package/data/workflows/chip-atlas-peak-enrichment/references/promoter_definitions.md +69 -0
  975. package/data/workflows/chip-atlas-peak-enrichment/scripts/__init__.py +1 -0
  976. package/data/workflows/chip-atlas-peak-enrichment/scripts/convert_genes_to_regions.py +271 -0
  977. package/data/workflows/chip-atlas-peak-enrichment/scripts/export_all.py +456 -0
  978. package/data/workflows/chip-atlas-peak-enrichment/scripts/filter_experiments.py +116 -0
  979. package/data/workflows/chip-atlas-peak-enrichment/scripts/generate_all_plots.py +280 -0
  980. package/data/workflows/chip-atlas-peak-enrichment/scripts/load_example_data.py +96 -0
  981. package/data/workflows/chip-atlas-peak-enrichment/scripts/load_user_data.py +183 -0
  982. package/data/workflows/chip-atlas-peak-enrichment/scripts/query_chipatlas_api.py +349 -0
  983. package/data/workflows/chip-atlas-peak-enrichment/scripts/run_enrichment_workflow.py +271 -0
  984. package/data/workflows/chip-atlas-target-genes/SKILL.md +230 -0
  985. package/data/workflows/chip-atlas-target-genes/references/macs2_binding_scores.md +89 -0
  986. package/data/workflows/chip-atlas-target-genes/references/string_scores.md +58 -0
  987. package/data/workflows/chip-atlas-target-genes/references/target_genes_data_format.md +73 -0
  988. package/data/workflows/chip-atlas-target-genes/scripts/__init__.py +0 -0
  989. package/data/workflows/chip-atlas-target-genes/scripts/download_target_genes.py +200 -0
  990. package/data/workflows/chip-atlas-target-genes/scripts/export_all.py +340 -0
  991. package/data/workflows/chip-atlas-target-genes/scripts/filter_targets.py +205 -0
  992. package/data/workflows/chip-atlas-target-genes/scripts/generate_all_plots.py +330 -0
  993. package/data/workflows/chip-atlas-target-genes/scripts/load_example_query.py +61 -0
  994. package/data/workflows/chip-atlas-target-genes/scripts/load_user_query.py +47 -0
  995. package/data/workflows/chip-atlas-target-genes/scripts/run_target_genes_workflow.py +141 -0
  996. package/data/workflows/clinicaltrials-landscape/SKILL.md +257 -0
  997. package/data/workflows/clinicaltrials-landscape/references/api-parameters.md +181 -0
  998. package/data/workflows/clinicaltrials-landscape/references/mechanisms.md +141 -0
  999. package/data/workflows/clinicaltrials-landscape/references/output-schema.md +184 -0
  1000. package/data/workflows/clinicaltrials-landscape/scripts/__init__.py +1 -0
  1001. package/data/workflows/clinicaltrials-landscape/scripts/classify_mechanisms.py +359 -0
  1002. package/data/workflows/clinicaltrials-landscape/scripts/compile_trials.py +579 -0
  1003. package/data/workflows/clinicaltrials-landscape/scripts/disease_config.py +161 -0
  1004. package/data/workflows/clinicaltrials-landscape/scripts/export_all.py +242 -0
  1005. package/data/workflows/clinicaltrials-landscape/scripts/generate_landscape_plots.py +761 -0
  1006. package/data/workflows/clinicaltrials-landscape/scripts/generate_pdf_report.py +1465 -0
  1007. package/data/workflows/clinicaltrials-landscape/scripts/generate_report.py +1813 -0
  1008. package/data/workflows/clinicaltrials-landscape/scripts/query_clinicaltrials.py +307 -0
  1009. package/data/workflows/coexpression-network/SKILL.md +344 -0
  1010. package/data/workflows/coexpression-network/references/parameter-tuning-guide.md +591 -0
  1011. package/data/workflows/coexpression-network/references/troubleshooting.md +483 -0
  1012. package/data/workflows/coexpression-network/references/wgcna-best-practices.md +563 -0
  1013. package/data/workflows/coexpression-network/references/wgcna-reference.md +538 -0
  1014. package/data/workflows/coexpression-network/scripts/build_network.R +43 -0
  1015. package/data/workflows/coexpression-network/scripts/correlate_modules_traits.R +92 -0
  1016. package/data/workflows/coexpression-network/scripts/export_wgcna_results.R +117 -0
  1017. package/data/workflows/coexpression-network/scripts/identify_hub_genes.R +63 -0
  1018. package/data/workflows/coexpression-network/scripts/load_example_data.R +214 -0
  1019. package/data/workflows/coexpression-network/scripts/module_enrichment.R +159 -0
  1020. package/data/workflows/coexpression-network/scripts/pick_soft_power.R +70 -0
  1021. package/data/workflows/coexpression-network/scripts/plot_all_wgcna.R +104 -0
  1022. package/data/workflows/coexpression-network/scripts/plot_eigengene_heatmap.R +65 -0
  1023. package/data/workflows/coexpression-network/scripts/plot_hub_genes.R +70 -0
  1024. package/data/workflows/coexpression-network/scripts/plot_module_dendrogram.R +50 -0
  1025. package/data/workflows/coexpression-network/scripts/plotting_helpers.R +87 -0
  1026. package/data/workflows/coexpression-network/scripts/prepare_wgcna_data.R +73 -0
  1027. package/data/workflows/coexpression-network/scripts/wgcna_workflow.R +93 -0
  1028. package/data/workflows/experimental-design-statistics/SKILL.md +408 -0
  1029. package/data/workflows/experimental-design-statistics/references/batch_effect_mitigation.md +756 -0
  1030. package/data/workflows/experimental-design-statistics/references/cv_tissue_database.csv +30 -0
  1031. package/data/workflows/experimental-design-statistics/references/experimental_design_best_practices.md +515 -0
  1032. package/data/workflows/experimental-design-statistics/references/multiple_testing_guide.md +730 -0
  1033. package/data/workflows/experimental-design-statistics/references/power_analysis_guidelines.md +635 -0
  1034. package/data/workflows/experimental-design-statistics/references/qc_guidelines.md +310 -0
  1035. package/data/workflows/experimental-design-statistics/references/software_requirements.md +328 -0
  1036. package/data/workflows/experimental-design-statistics/references/troubleshooting_guide.md +510 -0
  1037. package/data/workflows/experimental-design-statistics/scripts/batch_assignment.R +302 -0
  1038. package/data/workflows/experimental-design-statistics/scripts/batch_validation.R +342 -0
  1039. package/data/workflows/experimental-design-statistics/scripts/export_design.R +352 -0
  1040. package/data/workflows/experimental-design-statistics/scripts/load_example_data.R +204 -0
  1041. package/data/workflows/experimental-design-statistics/scripts/multiple_testing.R +417 -0
  1042. package/data/workflows/experimental-design-statistics/scripts/plot_power_curves.R +317 -0
  1043. package/data/workflows/experimental-design-statistics/scripts/power_atacseq.R +229 -0
  1044. package/data/workflows/experimental-design-statistics/scripts/power_pilot_based.R +289 -0
  1045. package/data/workflows/experimental-design-statistics/scripts/power_rnaseq.R +247 -0
  1046. package/data/workflows/experimental-design-statistics/scripts/sample_size_de.R +327 -0
  1047. package/data/workflows/experimental-design-statistics/scripts/sample_size_scrna.R +304 -0
  1048. package/data/workflows/functional-enrichment-from-degs/SKILL.md +387 -0
  1049. package/data/workflows/functional-enrichment-from-degs/references/database_guide.md +354 -0
  1050. package/data/workflows/functional-enrichment-from-degs/references/decision-guide.md +546 -0
  1051. package/data/workflows/functional-enrichment-from-degs/references/gsea_ora_comparison.md +213 -0
  1052. package/data/workflows/functional-enrichment-from-degs/references/gsea_ora_validation_framework.md +483 -0
  1053. package/data/workflows/functional-enrichment-from-degs/references/interpretation_guidelines.md +374 -0
  1054. package/data/workflows/functional-enrichment-from-degs/references/method-reference.md +742 -0
  1055. package/data/workflows/functional-enrichment-from-degs/scripts/export_results.R +190 -0
  1056. package/data/workflows/functional-enrichment-from-degs/scripts/generate_plots.R +240 -0
  1057. package/data/workflows/functional-enrichment-from-degs/scripts/get_msigdb_genesets.R +75 -0
  1058. package/data/workflows/functional-enrichment-from-degs/scripts/load_de_results.R +60 -0
  1059. package/data/workflows/functional-enrichment-from-degs/scripts/load_example_data.R +212 -0
  1060. package/data/workflows/functional-enrichment-from-degs/scripts/prepare_gene_lists.R +92 -0
  1061. package/data/workflows/functional-enrichment-from-degs/scripts/run_gsea.R +44 -0
  1062. package/data/workflows/functional-enrichment-from-degs/scripts/run_ora.R +53 -0
  1063. package/data/workflows/genetic-variant-annotation/SKILL.md +440 -0
  1064. package/data/workflows/genetic-variant-annotation/references/auto_installation_implementation.md +274 -0
  1065. package/data/workflows/genetic-variant-annotation/references/consequence_terms.md +392 -0
  1066. package/data/workflows/genetic-variant-annotation/references/filtering_strategies.md +808 -0
  1067. package/data/workflows/genetic-variant-annotation/references/installation_guide.md +557 -0
  1068. package/data/workflows/genetic-variant-annotation/references/pathogenicity_interpretation.md +473 -0
  1069. package/data/workflows/genetic-variant-annotation/references/qc_guidelines.md +524 -0
  1070. package/data/workflows/genetic-variant-annotation/references/snpeff_best_practices.md +481 -0
  1071. package/data/workflows/genetic-variant-annotation/references/tool_selection_guide.md +433 -0
  1072. package/data/workflows/genetic-variant-annotation/references/troubleshooting_guide.md +678 -0
  1073. package/data/workflows/genetic-variant-annotation/references/vep_best_practices.md +450 -0
  1074. package/data/workflows/genetic-variant-annotation/scripts/annotate_genes.py +243 -0
  1075. package/data/workflows/genetic-variant-annotation/scripts/export_results.py +450 -0
  1076. package/data/workflows/genetic-variant-annotation/scripts/filter_variants.py +365 -0
  1077. package/data/workflows/genetic-variant-annotation/scripts/install_tools.py +246 -0
  1078. package/data/workflows/genetic-variant-annotation/scripts/load_example_data.py +166 -0
  1079. package/data/workflows/genetic-variant-annotation/scripts/parse_snpeff_output.py +283 -0
  1080. package/data/workflows/genetic-variant-annotation/scripts/parse_vep_output.py +257 -0
  1081. package/data/workflows/genetic-variant-annotation/scripts/plot_variant_distribution.py +372 -0
  1082. package/data/workflows/genetic-variant-annotation/scripts/prioritize_variants.py +287 -0
  1083. package/data/workflows/genetic-variant-annotation/scripts/run_snpeff.py +418 -0
  1084. package/data/workflows/genetic-variant-annotation/scripts/run_vep.py +358 -0
  1085. package/data/workflows/genetic-variant-annotation/scripts/select_tool.py +203 -0
  1086. package/data/workflows/genetic-variant-annotation/scripts/test_complete_workflow.py +312 -0
  1087. package/data/workflows/genetic-variant-annotation/scripts/test_pickle_load.py +118 -0
  1088. package/data/workflows/genetic-variant-annotation/scripts/validate_vcf.py +351 -0
  1089. package/data/workflows/genetic-variant-annotation/scripts/verify_changes.py +212 -0
  1090. package/data/workflows/grn-pyscenic/SKILL.md +331 -0
  1091. package/data/workflows/grn-pyscenic/references/cli_interface.md +222 -0
  1092. package/data/workflows/grn-pyscenic/references/database_downloads.md +245 -0
  1093. package/data/workflows/grn-pyscenic/scripts/export_all.py +192 -0
  1094. package/data/workflows/grn-pyscenic/scripts/generate_report.py +512 -0
  1095. package/data/workflows/grn-pyscenic/scripts/integrate_with_adata.py +54 -0
  1096. package/data/workflows/grn-pyscenic/scripts/load_example_data.py +200 -0
  1097. package/data/workflows/grn-pyscenic/scripts/load_expression_data.py +61 -0
  1098. package/data/workflows/grn-pyscenic/scripts/plot_regulon_visualizations.py +263 -0
  1099. package/data/workflows/grn-pyscenic/scripts/run_grn_workflow.py +184 -0
  1100. package/data/workflows/gwas-to-function-twas/SKILL.md +394 -0
  1101. package/data/workflows/gwas-to-function-twas/references/fusion_best_practices.md +120 -0
  1102. package/data/workflows/gwas-to-function-twas/references/installation-guide.md +414 -0
  1103. package/data/workflows/gwas-to-function-twas/references/ldsc_qc_guidelines.md +287 -0
  1104. package/data/workflows/gwas-to-function-twas/references/spredixxcan_best_practices.md +166 -0
  1105. package/data/workflows/gwas-to-function-twas/references/therapeutic_interpretation_guide.md +717 -0
  1106. package/data/workflows/gwas-to-function-twas/references/tissue_reference_guide.md +182 -0
  1107. package/data/workflows/gwas-to-function-twas/references/troubleshooting_guide.md +317 -0
  1108. package/data/workflows/gwas-to-function-twas/references/twas_hub_validation_guide.md +88 -0
  1109. package/data/workflows/gwas-to-function-twas/scripts/colocalization_analysis.py +187 -0
  1110. package/data/workflows/gwas-to-function-twas/scripts/druggability_scoring.py +199 -0
  1111. package/data/workflows/gwas-to-function-twas/scripts/export_results.py +220 -0
  1112. package/data/workflows/gwas-to-function-twas/scripts/integrate_variant_annotation.py +194 -0
  1113. package/data/workflows/gwas-to-function-twas/scripts/interpret_therapeutic_direction.py +418 -0
  1114. package/data/workflows/gwas-to-function-twas/scripts/mendelian_randomization.py +749 -0
  1115. package/data/workflows/gwas-to-function-twas/scripts/multilayer_direction_analysis.py +471 -0
  1116. package/data/workflows/gwas-to-function-twas/scripts/plot_twas_results.py +252 -0
  1117. package/data/workflows/gwas-to-function-twas/scripts/run_fusion.py +155 -0
  1118. package/data/workflows/gwas-to-function-twas/scripts/run_smultixcan.py +102 -0
  1119. package/data/workflows/gwas-to-function-twas/scripts/run_spredixxcan.py +138 -0
  1120. package/data/workflows/gwas-to-function-twas/scripts/select_reference_panel.py +253 -0
  1121. package/data/workflows/gwas-to-function-twas/scripts/validate_gwas_sumstats.py +214 -0
  1122. package/data/workflows/gwas-to-function-twas/scripts/validate_with_twas_hub.py +439 -0
  1123. package/data/workflows/lasso-biomarker-panel/SKILL.md +322 -0
  1124. package/data/workflows/lasso-biomarker-panel/references/decision-guide.md +64 -0
  1125. package/data/workflows/lasso-biomarker-panel/references/lasso-reference.md +110 -0
  1126. package/data/workflows/lasso-biomarker-panel/references/validation-guide.md +105 -0
  1127. package/data/workflows/lasso-biomarker-panel/scripts/biological_interpretation.R +1560 -0
  1128. package/data/workflows/lasso-biomarker-panel/scripts/biomarker_plots.R +350 -0
  1129. package/data/workflows/lasso-biomarker-panel/scripts/export_results.R +1492 -0
  1130. package/data/workflows/lasso-biomarker-panel/scripts/lasso_workflow.R +328 -0
  1131. package/data/workflows/lasso-biomarker-panel/scripts/load_example_data.R +1903 -0
  1132. package/data/workflows/lasso-biomarker-panel/scripts/plotting_helpers.R +78 -0
  1133. package/data/workflows/lasso-biomarker-panel/scripts/prepare_features.R +225 -0
  1134. package/data/workflows/lasso-biomarker-panel/scripts/query_cellxgene.py +107 -0
  1135. package/data/workflows/lasso-biomarker-panel/scripts/validate_external.R +174 -0
  1136. package/data/workflows/literature-preclinical/SKILL.md +276 -0
  1137. package/data/workflows/literature-preclinical/assets/eval/simple_test.py +386 -0
  1138. package/data/workflows/literature-preclinical/references/experiment-extraction-guide.md +147 -0
  1139. package/data/workflows/literature-preclinical/references/full-text-enrichment-guide.md +121 -0
  1140. package/data/workflows/literature-preclinical/references/preclinical-search-guide.md +117 -0
  1141. package/data/workflows/literature-preclinical/scripts/extract_experiments.py +401 -0
  1142. package/data/workflows/literature-preclinical/scripts/generate_plots.R +303 -0
  1143. package/data/workflows/literature-preclinical/scripts/narrative_synthesis.py +653 -0
  1144. package/data/workflows/literature-preclinical/scripts/preclinical_search.py +332 -0
  1145. package/data/workflows/literature-preclinical/scripts/preclinical_synthesis.py +237 -0
  1146. package/data/workflows/literature-preclinical/scripts/report_generation.py +326 -0
  1147. package/data/workflows/mendelian-randomization-twosamplemr/SKILL.md +210 -0
  1148. package/data/workflows/mendelian-randomization-twosamplemr/references/interpretation-guide.md +239 -0
  1149. package/data/workflows/mendelian-randomization-twosamplemr/references/method-reference.md +190 -0
  1150. package/data/workflows/mendelian-randomization-twosamplemr/scripts/export_results.R +123 -0
  1151. package/data/workflows/mendelian-randomization-twosamplemr/scripts/generate_report.R +411 -0
  1152. package/data/workflows/mendelian-randomization-twosamplemr/scripts/load_data.R +281 -0
  1153. package/data/workflows/mendelian-randomization-twosamplemr/scripts/mr_plots.R +163 -0
  1154. package/data/workflows/mendelian-randomization-twosamplemr/scripts/run_mr_analysis.R +322 -0
  1155. package/data/workflows/pcr-primer-design/SKILL.md +397 -0
  1156. package/data/workflows/pcr-primer-design/references/code_examples.md +594 -0
  1157. package/data/workflows/pcr-primer-design/references/miqe_guidelines.md +453 -0
  1158. package/data/workflows/pcr-primer-design/references/parameter_ranges.md +356 -0
  1159. package/data/workflows/pcr-primer-design/references/primer_design_best_practices.md +451 -0
  1160. package/data/workflows/pcr-primer-design/references/troubleshooting_guide.md +477 -0
  1161. package/data/workflows/pcr-primer-design/scripts/__init__.py +2 -0
  1162. package/data/workflows/pcr-primer-design/scripts/calculate_tm.py +306 -0
  1163. package/data/workflows/pcr-primer-design/scripts/check_dimers.py +298 -0
  1164. package/data/workflows/pcr-primer-design/scripts/check_secondary_structures.py +343 -0
  1165. package/data/workflows/pcr-primer-design/scripts/design_qpcr_primers.py +233 -0
  1166. package/data/workflows/pcr-primer-design/scripts/design_standard_primers.py +197 -0
  1167. package/data/workflows/pcr-primer-design/scripts/design_taqman_probes.py +226 -0
  1168. package/data/workflows/pcr-primer-design/scripts/export_results.py +382 -0
  1169. package/data/workflows/pcr-primer-design/scripts/generate_reports.py +379 -0
  1170. package/data/workflows/pcr-primer-design/scripts/validate_specificity.py +311 -0
  1171. package/data/workflows/pcr-primer-design/scripts/visualize_primers.py +379 -0
  1172. package/data/workflows/polygenic-risk-score-prs-catalog/SKILL.md +195 -0
  1173. package/data/workflows/polygenic-risk-score-prs-catalog/references/interpretation-guide.md +80 -0
  1174. package/data/workflows/polygenic-risk-score-prs-catalog/references/pgs-catalog-guide.md +109 -0
  1175. package/data/workflows/polygenic-risk-score-prs-catalog/scripts/export_results.R +186 -0
  1176. package/data/workflows/polygenic-risk-score-prs-catalog/scripts/generate_plots.R +283 -0
  1177. package/data/workflows/polygenic-risk-score-prs-catalog/scripts/load_pgs_weights.R +228 -0
  1178. package/data/workflows/polygenic-risk-score-prs-catalog/scripts/load_reference_data.R +191 -0
  1179. package/data/workflows/polygenic-risk-score-prs-catalog/scripts/score_traits.R +216 -0
  1180. package/data/workflows/pooled-crispr-screens/SKILL.md +362 -0
  1181. package/data/workflows/pooled-crispr-screens/references/crispr_screen_best_practices.md +349 -0
  1182. package/data/workflows/pooled-crispr-screens/references/qc_guidelines.md +722 -0
  1183. package/data/workflows/pooled-crispr-screens/references/statistical_methods.md +644 -0
  1184. package/data/workflows/pooled-crispr-screens/references/troubleshooting_guide.md +684 -0
  1185. package/data/workflows/pooled-crispr-screens/references/umi_optimization.md +297 -0
  1186. package/data/workflows/pooled-crispr-screens/scripts/concatenate_libraries.py +132 -0
  1187. package/data/workflows/pooled-crispr-screens/scripts/detect_perturbed_cells.py +255 -0
  1188. package/data/workflows/pooled-crispr-screens/scripts/differential_expression.py +202 -0
  1189. package/data/workflows/pooled-crispr-screens/scripts/differential_expression_glmgampoi.py +320 -0
  1190. package/data/workflows/pooled-crispr-screens/scripts/export_results.py +261 -0
  1191. package/data/workflows/pooled-crispr-screens/scripts/expression_filtering.py +159 -0
  1192. package/data/workflows/pooled-crispr-screens/scripts/gene_name_corrections.py +188 -0
  1193. package/data/workflows/pooled-crispr-screens/scripts/generate_report.py +485 -0
  1194. package/data/workflows/pooled-crispr-screens/scripts/load_10x_libraries.py +69 -0
  1195. package/data/workflows/pooled-crispr-screens/scripts/load_example_data.py +257 -0
  1196. package/data/workflows/pooled-crispr-screens/scripts/map_sgrna_to_cells.py +119 -0
  1197. package/data/workflows/pooled-crispr-screens/scripts/normalize_and_scale.py +140 -0
  1198. package/data/workflows/pooled-crispr-screens/scripts/qc_filtering.py +185 -0
  1199. package/data/workflows/pooled-crispr-screens/scripts/run_glmgampoi.R +181 -0
  1200. package/data/workflows/pooled-crispr-screens/scripts/screen_all_perturbations.py +306 -0
  1201. package/data/workflows/pooled-crispr-screens/scripts/validate_perturbations.py +314 -0
  1202. package/data/workflows/pooled-crispr-screens/scripts/visualize_perturbations.py +314 -0
  1203. package/data/workflows/scrnaseq-scanpy-core-analysis/SKILL.md +425 -0
  1204. package/data/workflows/scrnaseq-scanpy-core-analysis/references/ambient_rna_correction.md +422 -0
  1205. package/data/workflows/scrnaseq-scanpy-core-analysis/references/common-patterns.md +533 -0
  1206. package/data/workflows/scrnaseq-scanpy-core-analysis/references/integration_methods.md +820 -0
  1207. package/data/workflows/scrnaseq-scanpy-core-analysis/references/marker_gene_database.md +471 -0
  1208. package/data/workflows/scrnaseq-scanpy-core-analysis/references/pseudobulk_de_guide.md +408 -0
  1209. package/data/workflows/scrnaseq-scanpy-core-analysis/references/qc_guidelines.md +535 -0
  1210. package/data/workflows/scrnaseq-scanpy-core-analysis/references/scanpy_best_practices.md +496 -0
  1211. package/data/workflows/scrnaseq-scanpy-core-analysis/references/troubleshooting_guide.md +668 -0
  1212. package/data/workflows/scrnaseq-scanpy-core-analysis/references/workflow-details.md +727 -0
  1213. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/annotate_celltypes.py +431 -0
  1214. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/cluster_cells.py +293 -0
  1215. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/export_results.py +423 -0
  1216. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/filter_cells.py +531 -0
  1217. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/find_markers.py +391 -0
  1218. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/find_variable_genes.py +222 -0
  1219. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/integrate_scvi.py +665 -0
  1220. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/integration_diagnostics.py +678 -0
  1221. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/load_example_data.py +68 -0
  1222. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/normalize_data.py +325 -0
  1223. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/plot_dimreduction.py +389 -0
  1224. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/plot_qc.py +320 -0
  1225. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/pseudobulk_de.py +553 -0
  1226. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/qc_metrics.py +477 -0
  1227. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/remove_ambient_rna.py +347 -0
  1228. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/run_umap.py +188 -0
  1229. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/scale_and_pca.py +365 -0
  1230. package/data/workflows/scrnaseq-scanpy-core-analysis/scripts/setup_and_import.py +334 -0
  1231. package/data/workflows/scrnaseq-seurat-core-analysis/SKILL.md +585 -0
  1232. package/data/workflows/scrnaseq-seurat-core-analysis/references/ambient_rna_correction.md +422 -0
  1233. package/data/workflows/scrnaseq-seurat-core-analysis/references/common-patterns.md +667 -0
  1234. package/data/workflows/scrnaseq-seurat-core-analysis/references/decision-guide.md +456 -0
  1235. package/data/workflows/scrnaseq-seurat-core-analysis/references/integration_methods.md +864 -0
  1236. package/data/workflows/scrnaseq-seurat-core-analysis/references/marker_gene_database.md +471 -0
  1237. package/data/workflows/scrnaseq-seurat-core-analysis/references/pseudobulk_de_guide.md +408 -0
  1238. package/data/workflows/scrnaseq-seurat-core-analysis/references/qc_guidelines.md +452 -0
  1239. package/data/workflows/scrnaseq-seurat-core-analysis/references/seurat_best_practices.md +417 -0
  1240. package/data/workflows/scrnaseq-seurat-core-analysis/references/troubleshooting_guide.md +566 -0
  1241. package/data/workflows/scrnaseq-seurat-core-analysis/references/workflow-details.md +801 -0
  1242. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/annotate_celltypes.R +306 -0
  1243. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/cluster_cells.R +223 -0
  1244. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/export_results.R +292 -0
  1245. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/filter_cells.R +576 -0
  1246. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/find_markers.R +325 -0
  1247. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/find_variable_features.R +106 -0
  1248. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/integrate_batches.R +504 -0
  1249. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/integration_diagnostics.R +596 -0
  1250. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/load_example_data.R +89 -0
  1251. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/normalize_data.R +184 -0
  1252. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/plot_dimreduction.R +273 -0
  1253. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/plot_qc.R +250 -0
  1254. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/pseudobulk_de.R +324 -0
  1255. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/qc_metrics.R +358 -0
  1256. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/remove_ambient_rna.R +281 -0
  1257. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/run_umap.R +116 -0
  1258. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/scale_and_pca.R +243 -0
  1259. package/data/workflows/scrnaseq-seurat-core-analysis/scripts/setup_and_import.R +193 -0
  1260. package/data/workflows/spatial-transcriptomics/SKILL.md +256 -0
  1261. package/data/workflows/spatial-transcriptomics/references/spatial-analysis-guide.md +216 -0
  1262. package/data/workflows/spatial-transcriptomics/scripts/export_results.py +214 -0
  1263. package/data/workflows/spatial-transcriptomics/scripts/generate_all_plots.py +397 -0
  1264. package/data/workflows/spatial-transcriptomics/scripts/load_example_data.py +175 -0
  1265. package/data/workflows/spatial-transcriptomics/scripts/spatial_workflow.py +206 -0
  1266. package/dist/bgi.js +128 -2
  1267. package/package.json +2 -1
@@ -0,0 +1,1136 @@
1
+ # Common Clustering Patterns
2
+
3
+ Detailed code examples and variations for common clustering workflows with
4
+ complete implementation guidance.
5
+
6
+ ---
7
+
8
+ ## Pattern 1: Sample Subtype Discovery
9
+
10
+ **Use case:** Group patients/samples by gene expression profiles to discover
11
+ disease subtypes or treatment response groups
12
+
13
+ **When to use:**
14
+
15
+ - You have bulk RNA-seq, proteomics, or metabolomics data
16
+ - Samples are patients, cell lines, or biological replicates
17
+ - Goal is to find molecular subtypes
18
+ - Most common clustering application in biology
19
+
20
+ ### Complete Working Example
21
+
22
+ ```python
23
+ import numpy as np
24
+ import pandas as pd
25
+ from scripts.prepare_data import load_and_prepare_data
26
+ from scripts.dimensionality_reduction import apply_pca, apply_umap
27
+ from scripts.hierarchical_clustering import hierarchical_clustering
28
+ from scripts.optimal_clusters import find_optimal_clusters
29
+ from scripts.cluster_validation import validate_clustering
30
+ from scripts.plot_clustering_results import plot_all_results
31
+ from scripts.characterize_clusters import characterize_clusters
32
+ from scripts.export_results import export_clustering_results
33
+
34
+ # 1. Load normalized expression data (samples × genes)
35
+ data, metadata, genes, samples = load_and_prepare_data(
36
+ data_path="expression_tpm.csv",
37
+ metadata_path="sample_metadata.csv",
38
+ transpose=False, # Samples in rows, genes in columns
39
+ normalize_method="zscore", # Z-score normalization
40
+ filter_low_variance=True,
41
+ variance_threshold=0.1, # Remove bottom 10% variance genes
42
+ handle_missing="drop" # Remove samples/genes with missing values
43
+ )
44
+
45
+ print(f"Loaded {data.shape[0]} samples × {data.shape[1]} genes")
46
+
47
+ # 2. Reduce dimensions (1000s of genes → 50 PCs)
48
+ # Keeps major variation, reduces noise and computational cost
49
+ pca_data, pca_model, explained_variance = apply_pca(
50
+ data,
51
+ n_components=50, # Or use variance_threshold=0.90
52
+ plot_variance=True # Creates scree plot
53
+ )
54
+
55
+ print(f"PCA: {explained_variance.sum():.1%} variance explained by {pca_data.shape[1]} PCs")
56
+
57
+ # 3. Explore clustering structure with dendrogram
58
+ # Don't specify n_clusters to see full tree
59
+ linkage_matrix, _ = hierarchical_clustering(
60
+ pca_data,
61
+ n_clusters=None, # Build full tree
62
+ linkage_method="ward", # Ward minimizes variance
63
+ plot_dendrogram=True, # Visual exploration
64
+ save_path="dendrogram_exploration"
65
+ )
66
+
67
+ # 4. Determine optimal k using multiple metrics
68
+ results = find_optimal_clusters(
69
+ pca_data,
70
+ method="hierarchical", # Test hierarchical
71
+ k_range=range(2, 11), # Test k=2 to k=10
72
+ metrics=["elbow", "silhouette", "gap", "calinski"],
73
+ plot_results=True,
74
+ output_path="optimal_k_analysis"
75
+ )
76
+
77
+ print(f"Suggested optimal k: {results['optimal_k']}")
78
+ print(f"Silhouette scores: {results['silhouette_scores']}")
79
+
80
+ # 5. Apply final clustering with chosen k
81
+ optimal_k = results['optimal_k'] # Or manually choose based on biology
82
+ cluster_labels, _ = hierarchical_clustering(
83
+ pca_data,
84
+ n_clusters=optimal_k,
85
+ linkage_method="ward"
86
+ )
87
+
88
+ # 6. Validate clustering quality
89
+ validation = validate_clustering(
90
+ pca_data,
91
+ cluster_labels,
92
+ metrics="all",
93
+ true_labels=metadata.get('known_subtype') if 'known_subtype' in metadata else None,
94
+ plot_silhouette=True,
95
+ output_path="validation_plots"
96
+ )
97
+
98
+ print(f"Silhouette score: {validation['silhouette']:.3f}")
99
+ print(f"Davies-Bouldin index: {validation['davies_bouldin']:.3f}")
100
+ print(f"Calinski-Harabasz score: {validation['calinski_harabasz']:.1f}")
101
+
102
+ # 7. Test clustering stability
103
+ from scripts.stability_analysis import stability_analysis
104
+ stability = stability_analysis(
105
+ pca_data,
106
+ cluster_labels,
107
+ clustering_method="hierarchical",
108
+ n_bootstrap=100,
109
+ sample_fraction=0.8,
110
+ plot_consensus=True,
111
+ output_path="stability_analysis"
112
+ )
113
+
114
+ print(f"Mean stability: {stability['mean_stability']:.3f}")
115
+
116
+ # 8. Characterize clusters (find distinguishing features)
117
+ cluster_features = characterize_clusters(
118
+ data, # Use original data, not PCA
119
+ cluster_labels,
120
+ genes,
121
+ method="anova", # ANOVA for multi-cluster comparison
122
+ top_n=50, # Top 50 genes per cluster
123
+ fdr_threshold=0.05,
124
+ plot_heatmap=True,
125
+ output_path="cluster_features_heatmap"
126
+ )
127
+
128
+ # 9. Comprehensive visualization
129
+ umap_embedding = apply_umap(pca_data, n_neighbors=15, min_dist=0.1)
130
+
131
+ plot_all_results(
132
+ data,
133
+ cluster_labels,
134
+ samples,
135
+ genes,
136
+ pca_data=pca_data,
137
+ umap_embedding=umap_embedding,
138
+ linkage_matrix=linkage_matrix,
139
+ metadata=metadata,
140
+ output_dir="clustering_visualizations/"
141
+ )
142
+
143
+ # 10. Export all results
144
+ export_clustering_results(
145
+ cluster_labels,
146
+ samples,
147
+ validation,
148
+ cluster_features,
149
+ output_dir="clustering_results/",
150
+ prefix="sample_subtyping"
151
+ )
152
+
153
+ print("✓ Sample subtype discovery complete!")
154
+ ```
155
+
156
+ ### Variations
157
+
158
+ **With batch effect correction:**
159
+
160
+ ```python
161
+ # If samples cluster by batch instead of biology
162
+ from scripts.prepare_data import regress_out_batch
163
+
164
+ data_corrected = regress_out_batch(
165
+ data,
166
+ batch_labels=metadata['batch'],
167
+ preserve_design=metadata['condition'] # Preserve biological signal
168
+ )
169
+
170
+ # Then proceed with clustering on data_corrected
171
+ ```
172
+
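The internals of `regress_out_batch` are not shown in this file. As a rough sketch of the idea, assuming a linear model with reference-coded batch and condition factors, one can fit both terms jointly and subtract only the fitted batch component. The function name `remove_batch_effect` and the toy data below are illustrative assumptions, not part of the package.

```python
import numpy as np


def remove_batch_effect(X, batch, condition):
    """Subtract fitted batch terms from X (samples x genes), keeping condition terms.

    Hypothetical helper: ordinary least squares with an intercept,
    condition indicators, and batch indicators (first level = reference).
    """
    batch = np.asarray(batch)
    condition = np.asarray(condition)
    batch_levels = np.unique(batch)
    cond_levels = np.unique(condition)

    # Indicator columns for every level except the reference level
    B = (batch[:, None] == batch_levels[None, 1:]).astype(float)
    C = (condition[:, None] == cond_levels[None, 1:]).astype(float)
    design = np.hstack([np.ones((len(X), 1)), C, B])

    # One least-squares fit per gene (columns of X are fit jointly)
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)
    batch_coef = coef[1 + C.shape[1]:]

    # Remove only the part of the signal explained by batch
    return X - B @ batch_coef


# Toy data: a condition effect of +2 and a batch shift of +5 on three genes
condition_demo = np.array([0, 0, 1, 1, 0, 0, 1, 1])
batch_demo = np.array([0, 0, 0, 0, 1, 1, 1, 1])
X_demo = (2.0 * condition_demo + 5.0 * batch_demo)[:, None] * np.ones((1, 3))

corrected_demo = remove_batch_effect(X_demo, batch_demo, condition_demo)
```

Fitting condition and batch together matters when the two are partly confounded: centering each batch on its own mean would also remove condition signal that happens to be unevenly spread across batches.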
173
+ **With bootstrap validation:**
174
+
175
+ ```python
176
+ # For publication: test multiple k values with stability
177
+ k_candidates = [3, 4, 5]
178
+
179
+ for k in k_candidates:
180
+     labels, _ = hierarchical_clustering(pca_data, n_clusters=k)
181
+     stability = stability_analysis(pca_data, labels, n_bootstrap=100)
182
+     validation = validate_clustering(pca_data, labels)
183
+
184
+     print(f"k={k}: Silhouette={validation['silhouette']:.3f}, Stability={stability['mean_stability']:.3f}")
185
+
186
+ # Choose k with best silhouette + stability trade-off
187
+ ```
188
+
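What `stability_analysis` computes internally is not documented in this file. One common way to score stability, sketched here with scikit-learn under that caveat, is to re-cluster random subsamples and compare each result to the full-data labels with the adjusted Rand index. The helper name `subsample_stability` and the synthetic blobs are assumptions for illustration only.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score


def subsample_stability(X, n_clusters, n_rounds=20, sample_fraction=0.8, seed=0):
    """Mean ARI between full-data labels and labels refit on random subsamples."""
    rng = np.random.default_rng(seed)
    full_labels = AgglomerativeClustering(
        n_clusters=n_clusters, linkage="ward"
    ).fit_predict(X)

    n_keep = int(sample_fraction * len(X))
    scores = []
    for _ in range(n_rounds):
        # Draw a subsample without replacement and cluster it from scratch
        idx = rng.choice(len(X), size=n_keep, replace=False)
        sub_labels = AgglomerativeClustering(
            n_clusters=n_clusters, linkage="ward"
        ).fit_predict(X[idx])
        # ARI is invariant to label permutations, so no label matching is needed
        scores.append(adjusted_rand_score(full_labels[idx], sub_labels))
    return float(np.mean(scores))


# Three well-separated synthetic groups stand in for real PCA scores
X_demo, _ = make_blobs(
    n_samples=150, centers=[[0, 0], [10, 0], [0, 10]], cluster_std=0.5, random_state=0
)
stability_by_k = {k: subsample_stability(X_demo, k) for k in (2, 3, 4, 5)}
```

A k that matches real structure tends to score close to 1, while a k that splits a true group usually scores lower because the split lands in a different place on each subsample.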
189
+ **With feature filtering:**
190
+
191
+ ```python
192
+ # Focus on most variable genes (faster, less noise)
193
+ from sklearn.feature_selection import VarianceThreshold
194
+
195
+ selector = VarianceThreshold(threshold=1.0) # Keep genes with variance > 1; run on data before z-scoring, since z-scored genes all have variance 1
196
+ data_filtered = selector.fit_transform(data)
197
+
198
+ # Proceed with clustering
199
+ ```
200
+
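A fixed variance cutoff depends on the scale of the data, so a rank-based filter is often easier to reason about. The sketch below keeps the `n_top` most variable columns and only then z-scores them. The helper `select_top_variable` and the random matrix are hypothetical and stand in for an unscaled expression matrix.

```python
import numpy as np


def select_top_variable(X, n_top):
    """Return the n_top highest-variance columns of X and their column indices."""
    variances = X.var(axis=0)
    # Take the n_top largest variances, then restore the original column order
    top_idx = np.sort(np.argsort(variances)[::-1][:n_top])
    return X[:, top_idx], top_idx


# Toy matrix: 200 samples x 4 "genes" with very different spreads
rng = np.random.default_rng(0)
column_scales = np.array([0.1, 5.0, 0.2, 3.0])
raw_demo = rng.normal(size=(200, 4)) * column_scales

filtered_demo, kept_idx = select_top_variable(raw_demo, n_top=2)

# Scale only after filtering; z-scoring first would make all variances equal
zscored_demo = (filtered_demo - filtered_demo.mean(axis=0)) / filtered_demo.std(axis=0)
```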
201
+ ---
202
+
203
+ ## Pattern 2: Gene Co-clustering
204
+
205
+ **Use case:** Group genes/proteins by similar expression patterns across samples
206
+ to find co-regulated modules
207
+
208
+ **When to use:**
209
+
210
+ - You want to find co-expressed gene modules
211
+ - Goal is to understand gene relationships, not sample relationships
212
+ - Transpose your data: genes as rows, samples as columns
213
+ - Often followed by pathway enrichment analysis
214
+
215
+ ### Complete Working Example
216
+
217
+ ```python
218
+ import numpy as np
219
+ import pandas as pd
220
+ from scripts.prepare_data import load_and_prepare_data
221
+ from scripts.distance_metrics import calculate_distance_matrix
222
+ from scripts.hierarchical_clustering import hierarchical_clustering
223
+ from scripts.characterize_clusters import characterize_clusters
224
+ from scripts.plot_clustering_results import plot_cluster_heatmap
225
+
226
+ # 1. Load and transpose (genes × samples)
227
+ data, metadata, samples, genes = load_and_prepare_data(
228
+ data_path="expression_tpm.csv",
229
+ transpose=True, # CRITICAL: genes as rows, samples as columns
230
+ normalize_method="zscore", # Normalize across samples for each gene
231
+ filter_low_variance=True,
232
+ variance_threshold=0.2 # Remove low-variance genes
233
+ )
234
+
235
+ print(f"Loaded {data.shape[0]} genes × {data.shape[1]} samples")
236
+
237
+ # 2. Use correlation distance (pattern similarity)
238
+ # Correlation focuses on pattern, not magnitude
239
+ distance_matrix = calculate_distance_matrix(
240
+ data,
241
+ metric="correlation", # 1 - Pearson correlation
242
+ show_distribution=True
243
+ )
244
+
245
+ # 3. Hierarchical clustering with average linkage
246
+ # Average linkage works well with correlation distance
247
+ linkage_matrix, cluster_labels = hierarchical_clustering(
248
+ data,
249
+ n_clusters=15, # 10-20 modules is typical for gene clustering
250
+ linkage_method="average", # Average linkage for correlation
251
+ metric="precomputed", # Already computed distance matrix
252
+ distance_matrix=distance_matrix,
253
+ plot_dendrogram=True,
254
+ save_path="gene_dendrogram"
255
+ )
256
+
257
+ # 4. Characterize gene clusters
258
+ # Find which conditions/samples show high/low expression for each module
259
+ cluster_features = characterize_clusters(
260
+ data.T, # Transpose back: samples × genes
261
+ cluster_labels,
262
+ genes,
263
+ method="anova",
264
+ plot_heatmap=True,
265
+ output_path="gene_module_heatmap"
266
+ )
267
+
268
+ # 5. Export gene modules
269
+ gene_clusters_df = pd.DataFrame({
270
+ 'Gene': genes,
271
+ 'Module': cluster_labels
272
+ })
273
+
274
+ for module in np.unique(cluster_labels):
275
+     module_genes = gene_clusters_df[gene_clusters_df['Module'] == module]['Gene'].tolist()
276
+     print(f"\nModule {module}: {len(module_genes)} genes")
277
+
278
+     # Save module genes for pathway enrichment (the gene_modules/ directory must already exist)
279
+     with open(f"gene_modules/module_{module}_genes.txt", 'w') as f:
280
+         f.write('\n'.join(module_genes))
281
+
282
+ # 6. Visualize module expression patterns
283
+ plot_cluster_heatmap(
284
+ data.T, # samples × genes
285
+ cluster_labels,
286
+ genes,
287
+ samples,
288
+ top_n_features=None, # Show all genes (or top 100)
289
+ output_path="gene_modules_heatmap"
290
+ )
291
+
292
+ print("✓ Gene co-clustering complete!")
293
+ ```
294
+
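To see why correlation distance suits co-expression modules, the following sketch uses SciPy directly rather than the package wrappers. Genes that share a temporal shape but differ 20-fold in magnitude end up in the same module, while genes with the opposite shape end up in another. All variable names and the sine-wave profiles are illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 12)
pattern = np.sin(t)

# Six toy genes x 12 samples: three follow the pattern at different magnitudes,
# three follow the mirrored pattern
profiles = np.vstack(
    [scale * pattern + rng.normal(0, 0.05, t.size) for scale in (1, 5, 20)]
    + [-scale * pattern + rng.normal(0, 0.05, t.size) for scale in (1, 5, 20)]
)

# Correlation distance = 1 - Pearson correlation between rows
condensed = pdist(profiles, metric="correlation")

# Average linkage on the condensed distance vector, cut into two modules
Z = linkage(condensed, method="average")
modules = fcluster(Z, t=2, criterion="maxclust")
```

With Euclidean distance the high-magnitude genes would sit far from the low-magnitude ones even though their shapes match, which is usually not what a module analysis is after.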
295
+ ### Variations
296
+
297
+ **Time-series gene clustering:**
298
+
299
+ ```python
300
+ # For developmental or time-course data
301
+ # Order samples by time before clustering
302
+
303
+ time_order = np.argsort(metadata['timepoint'].to_numpy(), kind='stable')  # Column positions sorted by time; assumes metadata rows align with data columns
304
+ data_ordered = data[:, time_order]
305
+
306
+ # Cluster genes, then plot modules showing temporal patterns
307
+ from scripts.plot_clustering_results import plot_temporal_patterns
308
+ plot_temporal_patterns(
309
+ data_ordered,
310
+ cluster_labels,
311
+ timepoints=metadata['timepoint'].to_numpy()[time_order],
312
+ output_path="temporal_gene_modules"
313
+ )
314
+ ```
315
+
316
+ **Condition-specific patterns:**
317
+
318
+ ```python
319
+ # Find genes that cluster differently across conditions
320
+
321
+ conditions = metadata['condition'].unique()
322
+
323
+ for condition in conditions:
324
+     condition_mask = (metadata['condition'] == condition).to_numpy()  # Boolean mask over columns; assumes metadata rows align with data columns
325
+     data_condition = data[:, condition_mask]
326
+
327
+     # Cluster within condition
328
+     linkage, labels = hierarchical_clustering(
329
+         data_condition, n_clusters=10,
330
+         linkage_method="average"
331
+     )
332
+
333
+     # Compare to overall clustering
334
+ ```
335
+
336
+ **Multi-omics gene integration:**
337
+
338
+ ```python
339
+ # Cluster genes using both transcriptomics and proteomics
340
+
341
+ # Concatenate standardized features
342
+ rna_data_zscore = (rna_data - rna_data.mean(axis=1, keepdims=True)) / rna_data.std(axis=1, keepdims=True)
343
+ protein_data_zscore = (protein_data - protein_data.mean(axis=1, keepdims=True)) / protein_data.std(axis=1, keepdims=True)
344
+
345
+ multi_omics_data = np.concatenate([rna_data_zscore, protein_data_zscore], axis=1)
346
+
347
+ # Cluster on integrated data
348
+ ```
349
+
350
+ ---
351
+
352
+ ## Pattern 3: Method Comparison
353
+
354
+ **Use case:** Systematically compare multiple clustering algorithms to find the most
355
+ robust solution
356
+
357
+ **When to use:**
358
+
359
+ - Exploratory analysis where you don't know the best approach
360
+ - Publication-quality analysis requiring method justification
361
+ - Data with unclear structure
362
+ - You want to show that results are robust across methods
363
+
364
+ ### Complete Working Example
365
+
366
+ ```python
367
+ import numpy as np
368
+ import pandas as pd
369
+ from scripts.prepare_data import load_and_prepare_data
370
+ from scripts.dimensionality_reduction import apply_pca
371
+ from scripts.hierarchical_clustering import hierarchical_clustering
372
+ from scripts.kmeans_clustering import kmeans_clustering
373
+ from scripts.density_clustering import hdbscan_clustering
374
+ from scripts.model_based_clustering import gmm_clustering
375
+ from scripts.cluster_validation import validate_clustering
376
+ from scripts.plot_clustering_results import plot_comparison
377
+
378
+ # 1. Prepare data
379
+ data, metadata, genes, samples = load_and_prepare_data(
380
+ "expression_matrix.csv",
381
+ normalize_method="zscore"
382
+ )
383
+
384
+ pca_data, _, _ = apply_pca(data, n_components=50)
385
+
386
+ # 2. Apply multiple methods with same k (where applicable)
387
+ k = 5 # Or test multiple k values
388
+
389
+ methods = {}
390
+
391
+ # Hierarchical
392
+ linkage, labels_hier = hierarchical_clustering(
393
+ pca_data, n_clusters=k, linkage_method="ward"
394
+ )
395
+ methods['Hierarchical'] = labels_hier
396
+
397
+ # K-means
398
+ labels_kmeans, _, _ = kmeans_clustering(
399
+ pca_data, n_clusters=k, method="kmeans", n_init=50
400
+ )
401
+ methods['K-means'] = labels_kmeans
402
+
403
+ # HDBSCAN (finds k automatically)
404
+ labels_hdbscan, _, n_clusters_hdbscan = hdbscan_clustering(
405
+ pca_data, min_cluster_size=10, min_samples=5
406
+ )
407
+ methods['HDBSCAN'] = labels_hdbscan
408
+ print(f"HDBSCAN found {n_clusters_hdbscan} clusters")
409
+
410
+ # GMM
411
+ labels_gmm, _, _ = gmm_clustering(
412
+ pca_data, n_components=k, covariance_type="full"
413
+ )
414
+ methods['GMM'] = labels_gmm
415
+
416
+ # 3. Validate each method
417
+ validation_results = {}
418
+
419
+ for name, labels in methods.items():
420
+     validation = validate_clustering(
420
+         pca_data, labels, metrics="all"
421
+     )
422
+     validation_results[name] = validation
423
+
424
+     print(f"\n{name}:")
425
+     print(f" Silhouette: {validation['silhouette']:.3f}")
426
+     print(f" Davies-Bouldin: {validation['davies_bouldin']:.3f}")
427
+     print(f" Calinski-Harabasz: {validation['calinski_harabasz']:.1f}")
429
+
430
+ # 4. Create comparison table
431
+ comparison_df = pd.DataFrame({
432
+ 'Method': list(validation_results.keys()),
433
+ 'Silhouette': [v['silhouette'] for v in validation_results.values()],
434
+ 'Davies-Bouldin': [v['davies_bouldin'] for v in validation_results.values()],
435
+ 'Calinski-Harabasz': [v['calinski_harabasz'] for v in validation_results.values()],
436
+ 'N_clusters': [len(np.unique(labels[labels >= 0])) for labels in methods.values()]
437
+ })
438
+
439
+ comparison_df = comparison_df.sort_values('Silhouette', ascending=False)
440
+ print("\n=== Method Comparison ===")
441
+ print(comparison_df.to_string(index=False))
442
+
443
+ # Save comparison
444
+ comparison_df.to_csv("method_comparison.csv", index=False)
445
+
446
+ # 5. Test agreement between methods
447
+ from sklearn.metrics import adjusted_rand_score
448
+
449
+ print("\n=== Method Agreement (Adjusted Rand Index) ===")
450
+ method_names = list(methods.keys())
451
+
452
+ for i, name1 in enumerate(method_names):
453
+     for name2 in method_names[i+1:]:
454
+         ari = adjusted_rand_score(methods[name1], methods[name2])
455
+         print(f"{name1} vs {name2}: ARI = {ari:.3f}")
456
+
457
+ # 6. Visual comparison
458
+ from scripts.plot_clustering_results import plot_pca_scatter
459
+
460
+ for name, labels in methods.items():
461
+     plot_pca_scatter(
462
+         pca_data, labels, samples,
463
+         output_path=f"clustering_comparison/{name}_pca"
464
+     )
465
+
466
+ # 7. Choose best method
467
+ best_method = comparison_df.iloc[0]['Method']
468
+ best_labels = methods[best_method]
469
+
470
+ print(f"\n✓ Best method: {best_method}")
471
+ print("✓ Method comparison complete!")
472
+ ```
473
+
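When a density-based method such as HDBSCAN is part of the comparison, its noise points (label `-1`) count as one extra cluster in the adjusted Rand index, which can lower the score. One option, sketched here with a hypothetical helper `ari_ignoring_noise`, is to compare only the samples that both methods actually assigned to a cluster. The label vectors are made up for illustration.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score


def ari_ignoring_noise(labels_a, labels_b, noise_label=-1):
    """ARI computed only on samples that neither labeling marks as noise."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    keep = (labels_a != noise_label) & (labels_b != noise_label)
    return adjusted_rand_score(labels_a[keep], labels_b[keep])


# Same grouping under different label names, except one sample flagged as noise
kmeans_like = np.array([1, 1, 0, 0, 0, 2, 2])
density_like = np.array([0, 0, 1, 1, -1, 2, 2])

ari_raw = adjusted_rand_score(kmeans_like, density_like)   # noise treated as a cluster
ari_clean = ari_ignoring_noise(kmeans_like, density_like)  # noise sample dropped
```

Reporting both numbers, together with the fraction of samples dropped, keeps the comparison honest when a method labels many points as noise.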
474
+ ### Variations
475
+
476
+ **Systematic parameter sweep:**
477
+
478
+ ```python
479
+ # Test multiple k values for each method
480
+
481
+ k_range = range(2, 11)
482
+ results = []
483
+
484
+ for k in k_range:
485
+     # Hierarchical
486
+     labels_h, _ = hierarchical_clustering(pca_data, n_clusters=k)
487
+     val_h = validate_clustering(pca_data, labels_h)
488
+
489
+     # K-means
490
+     labels_k, _, _ = kmeans_clustering(pca_data, n_clusters=k)
491
+     val_k = validate_clustering(pca_data, labels_k)
492
+
493
+     # GMM
494
+     labels_g, _, _ = gmm_clustering(pca_data, n_components=k)
495
+     val_g = validate_clustering(pca_data, labels_g)
496
+
497
+     results.append({
498
+         'k': k,
499
+         'Hierarchical_silhouette': val_h['silhouette'],
500
+         'K-means_silhouette': val_k['silhouette'],
501
+         'GMM_silhouette': val_g['silhouette']
502
+     })
503
+
504
+ results_df = pd.DataFrame(results)
505
+ results_df.to_csv("parameter_sweep.csv", index=False)
506
+
507
+ # Plot silhouette vs k for each method
508
+ import matplotlib.pyplot as plt
509
+ plt.figure(figsize=(10, 6))
510
+ for method in ['Hierarchical', 'K-means', 'GMM']:
511
+     plt.plot(results_df['k'], results_df[f'{method}_silhouette'], marker='o', label=method)
512
+ plt.xlabel('Number of clusters (k)')
513
+ plt.ylabel('Silhouette score')
514
+ plt.legend()
515
+ plt.title('Method comparison across k values')
516
+ plt.savefig("method_comparison_silhouette.png", dpi=300)
517
+ ```
518
+
519
+ **Consensus clustering:**
520
+
521
+ ```python
522
+ # Combine results from multiple methods
523
+
524
+ # Get cluster assignments from each method
525
+ all_labels = np.column_stack([
526
+ methods['Hierarchical'],
527
+ methods['K-means'],
528
+ methods['GMM']
529
+ ])
530
+
531
+ # Create consensus matrix (how often samples cluster together)
532
+ n_samples = len(all_labels)
533
+ consensus_matrix = np.zeros((n_samples, n_samples))
534
+
535
+ for labels in all_labels.T:
536
+     for i in range(n_samples):
537
+         for j in range(i+1, n_samples):
538
+             if labels[i] == labels[j]:
539
+                 consensus_matrix[i, j] += 1
540
+                 consensus_matrix[j, i] += 1
541
+
542
+ consensus_matrix /= all_labels.shape[1] # Normalize by number of methods
543
+
544
+ # Cluster on consensus matrix
545
+ from scipy.cluster.hierarchy import linkage, fcluster
546
+ from scipy.spatial.distance import squareform
547
+
548
+ consensus_dist = 1 - consensus_matrix
549
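+ linkage_consensus = linkage(squareform(consensus_dist, checks=False), method='average')  # checks=False: the diagonal of consensus_dist is 1, which squareform's validation would reject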
550
+ consensus_labels = fcluster(linkage_consensus, k, criterion='maxclust')
551
+
552
+ print("✓ Consensus clustering complete!")
553
+ ```
554
+
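The pairwise loop for the consensus matrix is quadratic in Python and becomes slow beyond a few thousand samples. A vectorized equivalent using NumPy broadcasting is sketched below. The helper `coassociation_matrix` and the three toy label vectors are assumptions for illustration. Unlike the loop version, this one also fills the diagonal with 1, so the derived distance matrix has a zero diagonal.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def coassociation_matrix(label_sets):
    """Fraction of labelings in which each pair of samples shares a cluster.

    label_sets has shape (n_methods, n_samples).
    """
    label_sets = np.asarray(label_sets)
    # same[m, i, j] is True when method m puts samples i and j together
    same = label_sets[:, :, None] == label_sets[:, None, :]
    return same.mean(axis=0)


# Three toy labelings of six samples; the third disagrees about sample 2
label_sets_demo = np.array([
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
])

consensus_demo = coassociation_matrix(label_sets_demo)
distance_demo = 1.0 - consensus_demo  # diagonal is exactly 0 here

# Average-linkage clustering on the consensus distances, cut into two groups
Z_demo = linkage(squareform(distance_demo), method="average")
consensus_labels_demo = fcluster(Z_demo, t=2, criterion="maxclust")
```

The broadcasted comparison allocates an `n_methods × n_samples × n_samples` boolean array, so for very large sample counts it may be preferable to accumulate one method at a time.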
555
+ ---
556
+
557
+ ## Pattern 4: QC and Outlier Detection
558
+
559
+ **Use case:** Identify batch effects, outlier samples, or data quality issues
560
+ before main analysis
561
+
562
+ **When to use:**
563
+
564
+ - Before performing differential expression or other analyses
565
+ - When you suspect batch effects or technical artifacts
566
+ - To identify mislabeled or contaminated samples
567
+ - As part of a quality control pipeline
568
+
569
+ ### Complete Working Example
570
+
571
+ ```python
572
+ import numpy as np
573
+ import pandas as pd
574
+ from scripts.prepare_data import load_and_prepare_data
575
+ from scripts.dimensionality_reduction import apply_pca, apply_umap
576
+ from scripts.density_clustering import hdbscan_clustering
577
+ from scripts.plot_clustering_results import plot_pca_scatter
578
+
579
+ # 1. Load data without aggressive filtering
580
+ data, metadata, genes, samples = load_and_prepare_data(
581
+ "expression_matrix.csv",
582
+ metadata_path="sample_metadata.csv",
583
+ normalize_method="zscore",
584
+ filter_low_variance=False, # Keep all data for QC
585
+ handle_missing="drop"
586
+ )
587
+
588
+ # 2. PCA for visualization
589
+ pca_data, pca_model, explained_var = apply_pca(
590
+ data, n_components=10, plot_variance=True
591
+ )
592
+
593
+ # 3. Quick clustering without knowing k (HDBSCAN)
594
+ cluster_labels, probabilities, n_clusters = hdbscan_clustering(
595
+ pca_data[:, :5], # Use first 5 PCs
596
+ min_cluster_size=5, # Minimum 5 samples per cluster
597
+ min_samples=3 # Core samples threshold
598
+ )
599
+
600
+ print(f"HDBSCAN detected {n_clusters} clusters")
601
+
602
+ # 4. Identify outliers (label = -1 in HDBSCAN)
603
+ outlier_mask = cluster_labels == -1
604
+ outlier_indices = np.where(outlier_mask)[0]
605
+ outlier_samples = [samples[i] for i in outlier_indices]
606
+
607
+ print(f"\nDetected {len(outlier_samples)} outlier samples:")
608
+ print(outlier_samples)
609
+
610
+ # 5. Check if outliers correspond to known issues
611
+ if 'batch' in metadata.columns:
612
+     outlier_batches = metadata.iloc[outlier_indices]['batch'].value_counts()
613
+     print("\nOutlier distribution by batch:")
614
+     print(outlier_batches)
615
+
616
+ if 'qc_metrics' in metadata.columns:
617
+     outlier_qc = metadata.iloc[outlier_indices]['qc_metrics'].describe()
618
+     normal_qc = metadata.iloc[~outlier_mask]['qc_metrics'].describe()
619
+     print("\nQC metrics comparison:")
620
+     print(f"Outliers: {outlier_qc['mean']:.2f} ± {outlier_qc['std']:.2f}")
621
+     print(f"Normal: {normal_qc['mean']:.2f} ± {normal_qc['std']:.2f}")
622
+
623
+ # 6. Visualize with PCA colored by cluster/batch
624
+ plot_pca_scatter(
625
+ pca_data,
626
+ cluster_labels,
627
+ samples,
628
+ output_path="qc_pca_clusters"
629
+ )
630
+
631
+ # Color by batch if available
632
+ if 'batch' in metadata.columns:
633
+     batch_labels = pd.Categorical(metadata['batch']).codes
634
+     plot_pca_scatter(
635
+         pca_data,
636
+         batch_labels,
637
+         samples,
638
+         output_path="qc_pca_batch"
639
+     )
640
+
641
+     # Check if clustering separates by batch (bad!)
642
+     from sklearn.metrics import adjusted_rand_score
643
+     ari_batch = adjusted_rand_score(cluster_labels[~outlier_mask],
644
+                                     batch_labels[~outlier_mask])
645
+     print(f"\nClustering vs Batch ARI: {ari_batch:.3f}")
646
+     if ari_batch > 0.5:
647
+         print("⚠️ WARNING: Samples cluster by batch! Consider batch correction.")
648
+
649
+ # 7. UMAP for detailed visualization
650
+ umap_embedding = apply_umap(pca_data, n_neighbors=15, min_dist=0.1)
651
+
652
+ from scripts.plot_clustering_results import plot_umap_scatter
653
+ plot_umap_scatter(
654
+ umap_embedding,
655
+ cluster_labels,
656
+ samples,
657
+ output_path="qc_umap_clusters"
658
+ )
659
+
660
+ # 8. Save outlier report
661
+ outlier_report = metadata.iloc[outlier_indices].copy()
662
+ outlier_report['hdbscan_probability'] = probabilities[outlier_indices]
663
+ outlier_report.to_csv("outlier_samples_report.csv")
664
+
665
+ # 9. Decision: remove outliers or investigate
666
+ print("\n=== Outlier Decision Guide ===")
667
+ print("If outliers are:")
668
+ print(" - Technical artifacts → Remove from downstream analysis")
669
+ print(" - Biological variation → Keep (may be interesting subtypes)")
670
+ print(" - Batch effects → Apply batch correction instead of removal")
671
+ print(" - Mislabeled → Investigate and correct metadata")
672
+
673
+ # 10. Re-cluster after removing outliers (if decided)
674
+ data_clean = data[~outlier_mask]
675
+ samples_clean = [s for i, s in enumerate(samples) if not outlier_mask[i]]
676
+
677
+ print(f"\n✓ QC complete! {len(samples_clean)} samples retained ({len(outlier_samples)} outliers)")
678
+ ```
679
+
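The outlier step relies on the convention that density-based clusterers label noise points `-1`. scikit-learn's `DBSCAN` follows the same convention, so it can serve as a minimal stand-in to illustrate the masking logic without the package's `hdbscan_clustering` wrapper. The synthetic points and the `eps` value below are assumptions chosen for the toy data, not recommended defaults.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# Two tight groups of 30 samples plus one stray sample far from both
group_a = rng.normal([0.0, 0.0], 0.3, size=(30, 2))
group_b = rng.normal([8.0, 8.0], 0.3, size=(30, 2))
stray = np.array([[4.0, -6.0]])
points_demo = np.vstack([group_a, group_b, stray])
sample_names_demo = [f"S{i:02d}" for i in range(len(points_demo))]

# Samples without enough neighbors within eps are labeled -1 (noise)
labels_demo = DBSCAN(eps=1.0, min_samples=5).fit_predict(points_demo)

outlier_mask_demo = labels_demo == -1
outlier_names_demo = [
    name for name, is_outlier in zip(sample_names_demo, outlier_mask_demo) if is_outlier
]
points_clean_demo = points_demo[~outlier_mask_demo]
```

As in the decision guide above, a flagged sample is a prompt to inspect it, not proof that it should be removed.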
680
+ ### Variations
681
+
682
+ **Multi-step outlier removal:**
683
+
684
+ ```python
685
+ # Iteratively remove outliers until none remain
686
+
687
+ max_iterations = 5
688
+ current_data = data.copy()
689
+ current_samples = samples.copy()
690
+ all_outliers = []
691
+
692
+ for iteration in range(max_iterations):
693
+     # Cluster
694
+     pca_data, _, _ = apply_pca(current_data, n_components=10)
695
+     labels, probs, _ = hdbscan_clustering(pca_data[:, :5], min_cluster_size=5)
696
+
697
+     # Find outliers
698
+     outlier_mask = labels == -1
699
+
700
+     if not outlier_mask.any():
701
+         print(f"No more outliers found after {iteration+1} iterations")
702
+         break
703
+
704
+     # Remove outliers
705
+     outlier_samples = [current_samples[i] for i, is_outlier in enumerate(outlier_mask) if is_outlier]
706
+     all_outliers.extend(outlier_samples)
707
+
708
+     current_data = current_data[~outlier_mask]
709
+     current_samples = [s for s, is_outlier in zip(current_samples, outlier_mask) if not is_outlier]
710
+
711
+     print(f"Iteration {iteration+1}: Removed {outlier_mask.sum()} outliers")
712
+
713
+ print(f"Total outliers removed: {len(all_outliers)}")
714
+ ```
715
+
716
+ **Batch-aware outlier detection:**
717
+
718
+ ```python
719
+ # Detect outliers within each batch separately
720
+
721
+ all_outliers = []
722
+
723
+ for batch in metadata['batch'].unique():
724
+     batch_mask = metadata['batch'] == batch
725
+     batch_data = data[batch_mask]
726
+     batch_samples = [samples[i] for i, is_batch in enumerate(batch_mask) if is_batch]
727
+
728
+     # Cluster within batch
729
+     pca_data, _, _ = apply_pca(batch_data, n_components=10)
730
+     labels, _, _ = hdbscan_clustering(pca_data[:, :5])
731
+
732
+     # Find batch-specific outliers
733
+     outlier_mask = labels == -1
734
+     batch_outliers = [batch_samples[i] for i, is_outlier in enumerate(outlier_mask) if is_outlier]
735
+
736
+     all_outliers.extend(batch_outliers)
737
+     print(f"Batch {batch}: {len(batch_outliers)} outliers")
738
+
739
+ print(f"Total outliers across all batches: {len(all_outliers)}")
740
+ ```
741
+
742
+ ---
743
+
744
+ ## Pattern 5: Robust Clustering with Stability Testing
745
+
746
+ **Use case:** Ensure clustering results are reproducible and not artifacts of
747
+ random initialization or sampling
748
+
749
+ **When to use:**
750
+
751
+ - Publication-quality analysis requiring robust results
752
+ - When k-means gives variable results across runs
753
+ - To justify cluster number choice
754
+ - When you need confidence in cluster assignments
755
+
756
+ ### Complete Working Example
757
+
758
+ ```python
759
+ import numpy as np
760
+ import pandas as pd
761
+ from scripts.prepare_data import load_and_prepare_data
762
+ from scripts.dimensionality_reduction import apply_pca
763
+ from scripts.kmeans_clustering import kmeans_clustering
764
+ from scripts.cluster_validation import validate_clustering
765
+ from scripts.stability_analysis import stability_analysis
766
+
767
+ # 1. Prepare data
768
+ data, metadata, genes, samples = load_and_prepare_data(
769
+ "expression_matrix.csv",
770
+ normalize_method="zscore"
771
+ )
772
+
773
+ pca_data, _, _ = apply_pca(data, n_components=50)
774
+
775
+ # 2. Test multiple k values with stability
776
+ k_range = [3, 4, 5, 6]
777
+ stability_results = []
778
+
779
+ for k in k_range:
780
+     print(f"\n=== Testing k={k} ===")
781
+
782
+     # Perform clustering with high n_init for reproducibility
783
+     cluster_labels, centroids, inertia = kmeans_clustering(
784
+         pca_data,
785
+         n_clusters=k,
786
+         method="kmeans",
787
+         n_init=100, # Run 100 times, take best
788
+         random_state=42 # For reproducibility
789
+     )
790
+
791
+     # Compute validation metrics
792
+     validation = validate_clustering(pca_data, cluster_labels, metrics="all")
793
+
794
+     # Test stability via bootstrap
795
+     stability = stability_analysis(
796
+         pca_data,
797
+         cluster_labels,
798
+         clustering_method="kmeans",
799
+         n_bootstrap=100, # 100 bootstrap samples
800
+         sample_fraction=0.8, # 80% of samples per bootstrap
801
+         plot_consensus=True,
802
+         output_path=f"stability_k{k}"
803
+     )
804
+
805
+     # Store results
806
+     stability_results.append({
807
+         'k': k,
808
+         'silhouette': validation['silhouette'],
809
+         'davies_bouldin': validation['davies_bouldin'],
810
+         'stability_mean': stability['mean_stability'],
811
+         'stability_std': stability['std_stability'],
812
+         'stable_samples_pct': stability['stable_samples_pct']
813
+     })
814
+
815
+     print(f"k={k}: Silhouette={validation['silhouette']:.3f}, Stability={stability['mean_stability']:.3f}")
816
+
817
+ # 3. Create comparison table
818
+ stability_df = pd.DataFrame(stability_results)
819
+ stability_df.to_csv("stability_comparison.csv", index=False)
820
+
821
+ print("\n=== Stability Comparison ===")
822
+ print(stability_df.to_string(index=False))
823
+
824
+ # 4. Choose k based on silhouette + stability trade-off
825
+ # Prioritize stability >0.85, then optimize silhouette
826
+
827
+ stable_options = stability_df[stability_df['stability_mean'] > 0.85]
828
+
829
+ if len(stable_options) > 0:
830
+     best_k = stable_options.loc[stable_options['silhouette'].idxmax(), 'k']
831
+     print(f"\n✓ Best k: {int(best_k)} (stable + highest silhouette)")
832
+ else:
833
+     print("\n⚠️ No k value achieved stability >0.85")
834
+     best_k = stability_df.loc[stability_df['stability_mean'].idxmax(), 'k']
835
+     print(f" Choosing k={int(best_k)} with highest stability ({stability_df['stability_mean'].max():.3f})")
836
+
837
+ # 5. Final clustering with chosen k
838
+ final_labels, _, _ = kmeans_clustering(
839
+ pca_data,
840
+ n_clusters=int(best_k),
841
+ n_init=100,
842
+ random_state=42
843
+ )
844
+
845
+ # 6. Test reproducibility across multiple runs
846
+ print("\n=== Testing Reproducibility ===")
847
+
848
+ from sklearn.metrics import adjusted_rand_score
849
+
850
+ run_labels = []
851
+ for run in range(10):
852
+     labels, _, _ = kmeans_clustering(
853
+         pca_data,
854
+         n_clusters=int(best_k),
855
+         n_init=50,
856
+         random_state=run # Different seed
857
+     )
858
+     run_labels.append(labels)
859
+
860
+ # Compute ARI between all pairs of runs
861
+ ari_scores = []
862
+ for i in range(len(run_labels)):
863
+     for j in range(i+1, len(run_labels)):
864
+         ari = adjusted_rand_score(run_labels[i], run_labels[j])
865
+         ari_scores.append(ari)
866
+
867
+ mean_ari = np.mean(ari_scores)
868
+ print(f"Mean ARI across 10 runs: {mean_ari:.3f}")
869
+
870
+ if mean_ari > 0.95:
871
+     print("✓ Clustering is highly reproducible")
872
+ elif mean_ari > 0.85:
873
+     print("✓ Clustering is reasonably reproducible")
874
+ else:
875
+     print("⚠️ Clustering has low reproducibility - consider hierarchical instead")
876
+
877
+ # 7. Identify unstable samples
878
+ final_stability = stability_analysis(
879
+ pca_data, final_labels, clustering_method="kmeans", n_bootstrap=100
880
+ )
881
+
882
+ unstable_samples = [
883
+ samples[i] for i, prob in enumerate(final_stability['sample_stability'])
884
+ if prob < 0.7
885
+ ]
886
+
887
+ print(f"\nIdentified {len(unstable_samples)} unstable samples (<70% stability)")
888
+ if len(unstable_samples) > 0:
889
+     print("Unstable samples:", unstable_samples[:10], "..." if len(unstable_samples) > 10 else "")
890
+
891
+ # 8. Export results with stability scores
892
+ results_df = pd.DataFrame({
893
+ 'Sample': samples,
894
+ 'Cluster': final_labels,
895
+ 'Stability': final_stability['sample_stability']
896
+ })
897
+
898
+ results_df.to_csv("clustering_with_stability.csv", index=False)
899
+
900
+ print("\n✓ Robust clustering with stability testing complete!")
901
+ ```
902
+
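The reproducibility check in step 6 can also be tried without the package helpers. The following is a minimal sketch, assuming plain scikit-learn `KMeans` stands in for `kmeans_clustering`; the function name `pairwise_ari_across_seeds` and the synthetic blobs are illustrative assumptions, not part of this skill's scripts.

```python
from itertools import combinations

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score


def pairwise_ari_across_seeds(X, n_clusters, n_runs=10, n_init=10):
    """Return the ARI for every pair of k-means runs that differ only in seed.

    Hypothetical helper: it mirrors step 6 above but calls scikit-learn directly.
    """
    run_labels = [
        KMeans(n_clusters=n_clusters, n_init=n_init, random_state=seed).fit_predict(X)
        for seed in range(n_runs)
    ]
    # ARI is invariant to label permutation, so runs can be compared directly
    return np.array([
        adjusted_rand_score(a, b) for a, b in combinations(run_labels, 2)
    ])


# Synthetic stand-in for pca_data: three well-separated blobs
X_demo, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 10], [-10, 10]],
    cluster_std=0.5,
    random_state=0,
)
ari_demo = pairwise_ari_across_seeds(X_demo, n_clusters=3, n_runs=5)
print(f"{len(ari_demo)} pairs, mean ARI = {ari_demo.mean():.3f}")
```

Reporting the minimum pairwise ARI alongside the mean can be useful, since a single divergent run is easy to miss in an average.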
903
+ ### Variations
904
+
905
+ **Consensus matrix visualization:**
906
+
907
+ ```python
908
+ # Create and visualize consensus matrix from bootstrap
909
+
910
+ import matplotlib.pyplot as plt
911
+ import seaborn as sns
912
+
913
+ # Get consensus matrix from stability analysis
914
+ stability = stability_analysis(
915
+ pca_data, cluster_labels, clustering_method="kmeans",
916
+ n_bootstrap=100, return_consensus=True
917
+ )
918
+
919
+ consensus_matrix = stability['consensus_matrix']
920
+
921
+ # Plot consensus matrix (sorted by cluster)
922
+ sorted_idx = np.argsort(cluster_labels)
923
+
924
+ plt.figure(figsize=(10, 8))
925
+ sns.heatmap(
926
+ consensus_matrix[sorted_idx][:, sorted_idx],
927
+ cmap='RdBu_r',
928
+ vmin=0, vmax=1,
929
+ xticklabels=False,
930
+ yticklabels=False,
931
+ cbar_kws={'label': 'Co-clustering frequency'}
932
+ )
933
+ plt.title('Consensus Matrix (sorted by cluster)')
934
+ plt.savefig("consensus_matrix.png", dpi=300, bbox_inches='tight')
935
+
936
+ # Clear block structure = stable clustering
937
+ ```
938
+
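How `stability_analysis` builds its consensus matrix is not shown here. As a rough sketch of the general idea, and not necessarily what the package does, a co-clustering frequency matrix can be accumulated over resamples as below. The function name `bootstrap_consensus_matrix` and the toy data are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans


def bootstrap_consensus_matrix(X, n_clusters, n_bootstrap=50, seed=0):
    """Fraction of resamples in which each pair of samples shares a cluster.

    Hypothetical helper. Duplicate draws are collapsed with np.unique, so each
    resample is clustered on its distinct points only.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    together = np.zeros((n, n))  # times a pair landed in the same cluster
    sampled = np.zeros((n, n))   # times a pair appeared in the same resample
    for b in range(n_bootstrap):
        idx = np.unique(rng.choice(n, size=n, replace=True))
        labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=b).fit_predict(X[idx])
        same = (labels[:, None] == labels[None, :]).astype(float)
        together[np.ix_(idx, idx)] += same
        sampled[np.ix_(idx, idx)] += 1.0
    # Pairs that were never drawn together keep a consensus of 0
    return np.divide(together, sampled, out=np.zeros_like(together), where=sampled > 0)


# Toy data: two tight, well-separated groups of 30 points each
rng_demo = np.random.default_rng(1)
X_demo = np.vstack([
    rng_demo.normal(0.0, 0.3, size=(30, 2)),
    rng_demo.normal(5.0, 0.3, size=(30, 2)),
])
consensus_demo = bootstrap_consensus_matrix(X_demo, n_clusters=2, n_bootstrap=30)
```

Dividing by the co-sampling count rather than by `n_bootstrap` keeps pairs that were rarely drawn together from looking artificially unstable.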
939
+ **Perturbation analysis:**
940
+
941
+ ```python
942
+ # Test sensitivity to feature noise
943
+
944
+ noise_levels = [0.0, 0.05, 0.1, 0.15, 0.2]
945
+ perturbation_results = []
946
+
947
+ # Original clustering
948
+ labels_original, _, _ = kmeans_clustering(pca_data, n_clusters=5)
949
+
950
+ for noise_level in noise_levels:
951
+     # Add Gaussian noise
952
+     pca_noisy = pca_data + np.random.normal(0, noise_level, pca_data.shape)
953
+
954
+     # Re-cluster
955
+     labels_noisy, _, _ = kmeans_clustering(pca_noisy, n_clusters=5, n_init=50)
956
+
957
+     # Compare to original
958
+     ari = adjusted_rand_score(labels_original, labels_noisy)
959
+     perturbation_results.append({'noise_level': noise_level, 'ARI': ari})
960
+
961
+     print(f"Noise level {noise_level:.2f}: ARI = {ari:.3f}")
962
+
963
+ # Robust clustering should maintain high ARI even with noise
964
+ ```
965
+
966
+ ---
967
+
968
+ ## Pattern 6: Hierarchical Exploration Then Efficient Partitioning
969
+
970
+ **Use case:** Use dendrogram to guide k selection, then apply scalable method
971
+ for final clustering
972
+
973
+ **When to use:**
974
+
975
+ - Large datasets where hierarchical is too slow for final clustering
976
+ - Want dendrogram visualization benefits
977
+ - Need efficient final clustering (k-means)
978
+ - Want both hierarchical insight into structure and k-means scalability
979
+
980
+ ### Complete Working Example
981
+
982
+ ```python
983
+ # 1. Subsample for hierarchical exploration
984
+ n_samples = len(data)
985
+ subsample_size = min(2000, n_samples) # Max 2000 for hierarchical
986
+ subsample_indices = np.random.choice(n_samples, subsample_size, replace=False)
987
+
988
+ data_subsample = data[subsample_indices]
989
+ pca_subsample, _, _ = apply_pca(data_subsample, n_components=50)
990
+
991
+ # 2. Hierarchical on subsample to explore structure
992
+ from scripts.hierarchical_clustering import hierarchical_clustering
993
+
994
+ linkage_matrix, labels_subsample = hierarchical_clustering(
995
+ pca_subsample,
996
+ n_clusters=None, # Full tree
997
+ linkage_method="ward",
998
+ plot_dendrogram=True,
999
+ save_path="exploration_dendrogram"
1000
+ )
1001
+
1002
+ # 3. Examine dendrogram, choose k
1003
+ # Let's say dendrogram suggests k=5
1004
+
1005
+ # 4. Apply k-means on full dataset
1006
+ pca_full, _, _ = apply_pca(data, n_components=50)
1007
+
1008
+ labels_full, centroids, _ = kmeans_clustering(
1009
+ pca_full,
1010
+ n_clusters=5, # Based on dendrogram
1011
+ n_init=100
1012
+ )
1013
+
1014
+ # 5. Validate on full data
1015
+ validation = validate_clustering(pca_full, labels_full)
1016
+ print(f"Full dataset clustering: Silhouette = {validation['silhouette']:.3f}")
1017
+
1018
+ # This approach combines hierarchical exploration with k-means efficiency
1019
+ ```
1020
+
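Step 3 above leaves the choice of k to visual inspection of the dendrogram. One common heuristic, sketched here under the assumption that a SciPy-style linkage matrix is available, is to cut where the jump between consecutive merge heights is largest. The helper name `suggest_k_from_linkage` is hypothetical, and the suggestion should still be checked against the plot.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def suggest_k_from_linkage(Z, max_k=10):
    """Suggest k at the largest gap between consecutive merge heights.

    Hypothetical helper: only the last `max_k` merges are considered, so the
    suggestion is at most `max_k`.
    """
    last = Z[-max_k:, 2]  # heights of the final merges, in ascending order
    if len(last) < 2:
        return 1
    gaps = np.diff(last)
    # Cutting just above merge i of `last` leaves len(last) - i clusters
    return int(len(last) - np.argmax(gaps))


# Toy data: three blobs at the corners of an equilateral triangle
rng_demo = np.random.default_rng(0)
centers_demo = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
X_demo = np.vstack([c + rng_demo.normal(0.0, 0.5, size=(40, 2)) for c in centers_demo])

Z_demo = linkage(X_demo, method="ward")
k_demo = suggest_k_from_linkage(Z_demo)
labels_demo = fcluster(Z_demo, t=k_demo, criterion="maxclust")
print(f"Suggested k = {k_demo}")
```

The heuristic tends to favor small k when one split dominates the tree, so it works best as a starting point for the k-means step rather than a final answer.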
1021
+ ---
1022
+
1023
+ ## Pattern 7: High-Dimensional Feature Clustering
1024
+
1025
+ **Use case:** Cluster 10,000+ features (genes, proteins, metabolites)
1026
+ efficiently
1027
+
1028
+ **When to use:**
1029
+
1030
+ - Clustering features, not samples
1031
+ - Very high-dimensional data
1032
+ - Need to reduce computation time
1033
+ - Want to find feature modules
1034
+
1035
+ ### Complete Working Example
1036
+
1037
+ ```python
1038
+ # 1. Start with data matrix (samples × features; `genes` names the columns)
1039
+ # For 10,000+ features, direct clustering is slow
1040
+
1041
+ # 2. Pre-filter to most variable features
1042
+ from sklearn.feature_selection import VarianceThreshold
1043
+
1044
+ selector = VarianceThreshold(threshold=1.0)  # Keep variance > 1; apply before z-scoring (z-scored features all have variance 1)
1045
+ data_filtered = selector.fit_transform(data)
1046
+ genes_filtered = [genes[i] for i in selector.get_support(indices=True)]
1047
+
1048
+ print(f"Filtered to {len(genes_filtered)} high-variance features")
1049
+
1050
+ # 3. Use correlation distance + average linkage (efficient for features)
1051
+ distance_matrix = calculate_distance_matrix(
1052
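+ data_filtered.T,  # Transpose: features as rows, so distances are feature-to-feature
1053
+ metric="correlation"
1054
+ )
1055
+
1056
+ # 4. Hierarchical clustering
1057
+ linkage_matrix, cluster_labels = hierarchical_clustering(
1058
+ data_filtered.T,  # Same orientation as the distance matrix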
1059
+ n_clusters=20, # 10-30 modules typical
1060
+ linkage_method="average",
1061
+ metric="precomputed",
1062
+ distance_matrix=distance_matrix
1063
+ )
1064
+
1065
+ # 5. For even larger feature sets (>50k), use approximate methods
1066
+ # Mini-batch K-means on feature space
1067
+ from sklearn.cluster import MiniBatchKMeans
1068
+
1069
+ mbk = MiniBatchKMeans(n_clusters=20, batch_size=1000, n_init=10)
1070
+ cluster_labels_approx = mbk.fit_predict(data.T) # Transpose: features as rows
1071
+
1072
+ print("✓ High-dimensional feature clustering complete!")
1073
+ ```
1074
+
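For readers without the package helpers at hand, the correlation-distance plus average-linkage step can be sketched with SciPy alone. This is an illustration under assumptions, not the implementation behind `calculate_distance_matrix` or `hierarchical_clustering`: the function name `cluster_features_by_correlation` is hypothetical, and the input is assumed to be samples × features.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def cluster_features_by_correlation(X, n_modules):
    """Group the columns of X (samples x features) into correlation modules.

    Hypothetical helper. Returns one module label per feature.
    """
    # pdist compares rows, so transpose to put features on rows.
    # The "correlation" metric is 1 - Pearson r.
    condensed = pdist(X.T, metric="correlation")
    Z = linkage(condensed, method="average")
    return fcluster(Z, t=n_modules, criterion="maxclust")


# Toy data: features 0-4 track one latent signal, features 5-9 another
rng_demo = np.random.default_rng(0)
signal_a = rng_demo.normal(size=200)
signal_b = rng_demo.normal(size=200)
X_demo = np.column_stack(
    [signal_a + 0.1 * rng_demo.normal(size=200) for _ in range(5)]
    + [signal_b + 0.1 * rng_demo.normal(size=200) for _ in range(5)]
)
modules_demo = cluster_features_by_correlation(X_demo, n_modules=2)
print(modules_demo)
```

Constant features have undefined correlation and produce NaN distances, which is one more reason to apply the variance filter before this step.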
1075
+ ---
1076
+
1077
+ ## Troubleshooting Pattern Selection
1078
+
1079
+ **Problem → Recommended Pattern:**
1080
+
1081
+ - **"My clusters are unstable"** → Pattern 5 (Robust Clustering with Stability)
1082
+ - **"I need to compare methods"** → Pattern 3 (Method Comparison)
1083
+ - **"I have outliers/batch effects"** → Pattern 4 (QC and Outlier Detection)
1084
+ - **"I want to find disease subtypes"** → Pattern 1 (Sample Subtype Discovery)
1085
+ - **"I want co-expressed gene modules"** → Pattern 2 (Gene Co-clustering)
1086
+ - **"I have >10k samples"** → Pattern 6 (Hierarchical exploration + k-means)
1087
+ - **"I have >50k features"** → Pattern 7 (High-dimensional feature clustering)
1088
+ - **"Results don't match biology"** → Start with Pattern 4 (QC), then Pattern 1
1089
+ - **"I need publication-quality analysis"** → Pattern 5 (Stability) + Pattern 3
1090
+ (Comparison)
1091
+ - **"I don't know where to start"** → Pattern 1 (most common starting point)
1092
+
1093
+ ---
1094
+
1095
+ ## Combining Patterns
1096
+
1097
+ **Example: Comprehensive Publication-Ready Analysis**
1098
+
1099
+ ```python
1100
+ # Assumes each pattern has been wrapped in a function with the name used below
+ # 1. QC and outlier detection (Pattern 4)
1101
+ data_clean, outliers = qc_and_outlier_detection(data)
1102
+
1103
+ # 2. Sample subtype discovery (Pattern 1)
1104
+ cluster_labels = sample_subtype_discovery(data_clean)
1105
+
1106
+ # 3. Stability testing (Pattern 5)
1107
+ stability = test_clustering_stability(data_clean, cluster_labels)
1108
+
1109
+ # 4. Method comparison (Pattern 3)
1110
+ method_comparison = compare_clustering_methods(data_clean, cluster_labels)
1111
+
1112
+ # 5. Gene module discovery (Pattern 2)
1113
+ gene_modules = gene_coclustering(data_clean, cluster_labels)
1114
+
1115
+ # This provides: clean data + robust clusters + method validation + gene signatures
1116
+ ```
1117
+
1118
+ ---
1119
+
1120
+ ## Summary
1121
+
1122
+ **Most Common Patterns:**
1123
+
1124
+ 1. **Pattern 1** (Sample Subtype Discovery): ~60% of use cases
1125
+ 2. **Pattern 2** (Gene Co-clustering): ~20% of use cases
1126
+ 3. **Pattern 4** (QC/Outliers): ~15% of use cases (often before Pattern 1)
1127
+
1128
+ **For Robust Analysis:**
1129
+
1130
+ - Combine Pattern 1 + Pattern 5 (discovery + stability)
1131
+ - Or Pattern 1 + Pattern 3 (discovery + method comparison)
1132
+
1133
+ **For Publications:**
1134
+
1135
+ - Pattern 4 → Pattern 1 → Pattern 5 → Pattern 3
1136
+ - (QC → Discovery → Stability → Comparison)