@synsci/cli-darwin-x64-baseline 1.1.76 → 1.1.78

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (830) hide show
  1. package/bin/skills/adaptyv/SKILL.md +114 -0
  2. package/bin/skills/adaptyv/reference/api_reference.md +308 -0
  3. package/bin/skills/adaptyv/reference/examples.md +913 -0
  4. package/bin/skills/adaptyv/reference/experiments.md +360 -0
  5. package/bin/skills/adaptyv/reference/protein_optimization.md +637 -0
  6. package/bin/skills/aeon/SKILL.md +374 -0
  7. package/bin/skills/aeon/references/anomaly_detection.md +154 -0
  8. package/bin/skills/aeon/references/classification.md +144 -0
  9. package/bin/skills/aeon/references/clustering.md +123 -0
  10. package/bin/skills/aeon/references/datasets_benchmarking.md +387 -0
  11. package/bin/skills/aeon/references/distances.md +256 -0
  12. package/bin/skills/aeon/references/forecasting.md +140 -0
  13. package/bin/skills/aeon/references/networks.md +289 -0
  14. package/bin/skills/aeon/references/regression.md +118 -0
  15. package/bin/skills/aeon/references/segmentation.md +163 -0
  16. package/bin/skills/aeon/references/similarity_search.md +187 -0
  17. package/bin/skills/aeon/references/transformations.md +246 -0
  18. package/bin/skills/alphafold-database/SKILL.md +513 -0
  19. package/bin/skills/alphafold-database/references/api_reference.md +423 -0
  20. package/bin/skills/anndata/SKILL.md +400 -0
  21. package/bin/skills/anndata/references/best_practices.md +525 -0
  22. package/bin/skills/anndata/references/concatenation.md +396 -0
  23. package/bin/skills/anndata/references/data_structure.md +314 -0
  24. package/bin/skills/anndata/references/io_operations.md +404 -0
  25. package/bin/skills/anndata/references/manipulation.md +516 -0
  26. package/bin/skills/arboreto/SKILL.md +243 -0
  27. package/bin/skills/arboreto/references/algorithms.md +138 -0
  28. package/bin/skills/arboreto/references/basic_inference.md +151 -0
  29. package/bin/skills/arboreto/references/distributed_computing.md +242 -0
  30. package/bin/skills/arboreto/scripts/basic_grn_inference.py +97 -0
  31. package/bin/skills/astropy/SKILL.md +331 -0
  32. package/bin/skills/astropy/references/coordinates.md +273 -0
  33. package/bin/skills/astropy/references/cosmology.md +307 -0
  34. package/bin/skills/astropy/references/fits.md +396 -0
  35. package/bin/skills/astropy/references/tables.md +489 -0
  36. package/bin/skills/astropy/references/time.md +404 -0
  37. package/bin/skills/astropy/references/units.md +178 -0
  38. package/bin/skills/astropy/references/wcs_and_other_modules.md +373 -0
  39. package/bin/skills/benchling-integration/SKILL.md +480 -0
  40. package/bin/skills/benchling-integration/references/api_endpoints.md +883 -0
  41. package/bin/skills/benchling-integration/references/authentication.md +379 -0
  42. package/bin/skills/benchling-integration/references/sdk_reference.md +774 -0
  43. package/bin/skills/biopython/SKILL.md +443 -0
  44. package/bin/skills/biopython/references/advanced.md +577 -0
  45. package/bin/skills/biopython/references/alignment.md +362 -0
  46. package/bin/skills/biopython/references/blast.md +455 -0
  47. package/bin/skills/biopython/references/databases.md +484 -0
  48. package/bin/skills/biopython/references/phylogenetics.md +566 -0
  49. package/bin/skills/biopython/references/sequence_io.md +285 -0
  50. package/bin/skills/biopython/references/structure.md +564 -0
  51. package/bin/skills/biorxiv-database/SKILL.md +483 -0
  52. package/bin/skills/biorxiv-database/references/api_reference.md +280 -0
  53. package/bin/skills/biorxiv-database/scripts/biorxiv_search.py +445 -0
  54. package/bin/skills/bioservices/SKILL.md +361 -0
  55. package/bin/skills/bioservices/references/identifier_mapping.md +685 -0
  56. package/bin/skills/bioservices/references/services_reference.md +636 -0
  57. package/bin/skills/bioservices/references/workflow_patterns.md +811 -0
  58. package/bin/skills/bioservices/scripts/batch_id_converter.py +347 -0
  59. package/bin/skills/bioservices/scripts/compound_cross_reference.py +378 -0
  60. package/bin/skills/bioservices/scripts/pathway_analysis.py +309 -0
  61. package/bin/skills/bioservices/scripts/protein_analysis_workflow.py +408 -0
  62. package/bin/skills/brenda-database/SKILL.md +719 -0
  63. package/bin/skills/brenda-database/references/api_reference.md +537 -0
  64. package/bin/skills/brenda-database/scripts/brenda_queries.py +844 -0
  65. package/bin/skills/brenda-database/scripts/brenda_visualization.py +772 -0
  66. package/bin/skills/brenda-database/scripts/enzyme_pathway_builder.py +1053 -0
  67. package/bin/skills/cellxgene-census/SKILL.md +511 -0
  68. package/bin/skills/cellxgene-census/references/census_schema.md +182 -0
  69. package/bin/skills/cellxgene-census/references/common_patterns.md +351 -0
  70. package/bin/skills/chembl-database/SKILL.md +389 -0
  71. package/bin/skills/chembl-database/references/api_reference.md +272 -0
  72. package/bin/skills/chembl-database/scripts/example_queries.py +278 -0
  73. package/bin/skills/cirq/SKILL.md +346 -0
  74. package/bin/skills/cirq/references/building.md +307 -0
  75. package/bin/skills/cirq/references/experiments.md +572 -0
  76. package/bin/skills/cirq/references/hardware.md +515 -0
  77. package/bin/skills/cirq/references/noise.md +515 -0
  78. package/bin/skills/cirq/references/simulation.md +350 -0
  79. package/bin/skills/cirq/references/transformation.md +416 -0
  80. package/bin/skills/clinicaltrials-database/SKILL.md +507 -0
  81. package/bin/skills/clinicaltrials-database/references/api_reference.md +358 -0
  82. package/bin/skills/clinicaltrials-database/scripts/query_clinicaltrials.py +215 -0
  83. package/bin/skills/clinpgx-database/SKILL.md +638 -0
  84. package/bin/skills/clinpgx-database/references/api_reference.md +757 -0
  85. package/bin/skills/clinpgx-database/scripts/query_clinpgx.py +518 -0
  86. package/bin/skills/clinvar-database/SKILL.md +362 -0
  87. package/bin/skills/clinvar-database/references/api_reference.md +227 -0
  88. package/bin/skills/clinvar-database/references/clinical_significance.md +218 -0
  89. package/bin/skills/clinvar-database/references/data_formats.md +358 -0
  90. package/bin/skills/cobrapy/SKILL.md +463 -0
  91. package/bin/skills/cobrapy/references/api_quick_reference.md +655 -0
  92. package/bin/skills/cobrapy/references/workflows.md +593 -0
  93. package/bin/skills/cosmic-database/SKILL.md +336 -0
  94. package/bin/skills/cosmic-database/references/cosmic_data_reference.md +220 -0
  95. package/bin/skills/cosmic-database/scripts/download_cosmic.py +231 -0
  96. package/bin/skills/dask/SKILL.md +456 -0
  97. package/bin/skills/dask/references/arrays.md +497 -0
  98. package/bin/skills/dask/references/bags.md +468 -0
  99. package/bin/skills/dask/references/best-practices.md +277 -0
  100. package/bin/skills/dask/references/dataframes.md +368 -0
  101. package/bin/skills/dask/references/futures.md +541 -0
  102. package/bin/skills/dask/references/schedulers.md +504 -0
  103. package/bin/skills/datacommons-client/SKILL.md +255 -0
  104. package/bin/skills/datacommons-client/references/getting_started.md +417 -0
  105. package/bin/skills/datacommons-client/references/node.md +250 -0
  106. package/bin/skills/datacommons-client/references/observation.md +185 -0
  107. package/bin/skills/datacommons-client/references/resolve.md +246 -0
  108. package/bin/skills/datamol/SKILL.md +706 -0
  109. package/bin/skills/datamol/references/conformers_module.md +131 -0
  110. package/bin/skills/datamol/references/core_api.md +130 -0
  111. package/bin/skills/datamol/references/descriptors_viz.md +195 -0
  112. package/bin/skills/datamol/references/fragments_scaffolds.md +174 -0
  113. package/bin/skills/datamol/references/io_module.md +109 -0
  114. package/bin/skills/datamol/references/reactions_data.md +218 -0
  115. package/bin/skills/deepchem/SKILL.md +597 -0
  116. package/bin/skills/deepchem/references/api_reference.md +303 -0
  117. package/bin/skills/deepchem/references/workflows.md +491 -0
  118. package/bin/skills/deepchem/scripts/graph_neural_network.py +338 -0
  119. package/bin/skills/deepchem/scripts/predict_solubility.py +224 -0
  120. package/bin/skills/deepchem/scripts/transfer_learning.py +375 -0
  121. package/bin/skills/deeptools/SKILL.md +531 -0
  122. package/bin/skills/deeptools/assets/quick_reference.md +58 -0
  123. package/bin/skills/deeptools/references/effective_genome_sizes.md +116 -0
  124. package/bin/skills/deeptools/references/normalization_methods.md +410 -0
  125. package/bin/skills/deeptools/references/tools_reference.md +533 -0
  126. package/bin/skills/deeptools/references/workflows.md +474 -0
  127. package/bin/skills/deeptools/scripts/validate_files.py +195 -0
  128. package/bin/skills/deeptools/scripts/workflow_generator.py +454 -0
  129. package/bin/skills/denario/SKILL.md +215 -0
  130. package/bin/skills/denario/references/examples.md +494 -0
  131. package/bin/skills/denario/references/installation.md +213 -0
  132. package/bin/skills/denario/references/llm_configuration.md +265 -0
  133. package/bin/skills/denario/references/research_pipeline.md +471 -0
  134. package/bin/skills/diffdock/SKILL.md +483 -0
  135. package/bin/skills/diffdock/assets/batch_template.csv +4 -0
  136. package/bin/skills/diffdock/assets/custom_inference_config.yaml +90 -0
  137. package/bin/skills/diffdock/references/confidence_and_limitations.md +182 -0
  138. package/bin/skills/diffdock/references/parameters_reference.md +163 -0
  139. package/bin/skills/diffdock/references/workflows_examples.md +392 -0
  140. package/bin/skills/diffdock/scripts/analyze_results.py +334 -0
  141. package/bin/skills/diffdock/scripts/prepare_batch_csv.py +254 -0
  142. package/bin/skills/diffdock/scripts/setup_check.py +278 -0
  143. package/bin/skills/dnanexus-integration/SKILL.md +383 -0
  144. package/bin/skills/dnanexus-integration/references/app-development.md +247 -0
  145. package/bin/skills/dnanexus-integration/references/configuration.md +646 -0
  146. package/bin/skills/dnanexus-integration/references/data-operations.md +400 -0
  147. package/bin/skills/dnanexus-integration/references/job-execution.md +412 -0
  148. package/bin/skills/dnanexus-integration/references/python-sdk.md +523 -0
  149. package/bin/skills/document-skills/docx/LICENSE.txt +30 -0
  150. package/bin/skills/document-skills/docx/SKILL.md +233 -0
  151. package/bin/skills/document-skills/docx/docx-js.md +350 -0
  152. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chart.xsd +1499 -0
  153. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd +146 -0
  154. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd +1085 -0
  155. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd +11 -0
  156. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-main.xsd +3081 -0
  157. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-picture.xsd +23 -0
  158. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd +185 -0
  159. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd +287 -0
  160. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/pml.xsd +1676 -0
  161. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd +28 -0
  162. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd +144 -0
  163. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd +174 -0
  164. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd +25 -0
  165. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd +18 -0
  166. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd +59 -0
  167. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd +56 -0
  168. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd +195 -0
  169. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-math.xsd +582 -0
  170. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd +25 -0
  171. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/sml.xsd +4439 -0
  172. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-main.xsd +570 -0
  173. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd +509 -0
  174. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd +12 -0
  175. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd +108 -0
  176. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd +96 -0
  177. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/wml.xsd +3646 -0
  178. package/bin/skills/document-skills/docx/ooxml/schemas/ISO-IEC29500-4_2016/xml.xsd +116 -0
  179. package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-contentTypes.xsd +42 -0
  180. package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-coreProperties.xsd +50 -0
  181. package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-digSig.xsd +49 -0
  182. package/bin/skills/document-skills/docx/ooxml/schemas/ecma/fouth-edition/opc-relationships.xsd +33 -0
  183. package/bin/skills/document-skills/docx/ooxml/schemas/mce/mc.xsd +75 -0
  184. package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2010.xsd +560 -0
  185. package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2012.xsd +67 -0
  186. package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-2018.xsd +14 -0
  187. package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-cex-2018.xsd +20 -0
  188. package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-cid-2016.xsd +13 -0
  189. package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-sdtdatahash-2020.xsd +4 -0
  190. package/bin/skills/document-skills/docx/ooxml/schemas/microsoft/wml-symex-2015.xsd +8 -0
  191. package/bin/skills/document-skills/docx/ooxml/scripts/pack.py +159 -0
  192. package/bin/skills/document-skills/docx/ooxml/scripts/unpack.py +29 -0
  193. package/bin/skills/document-skills/docx/ooxml/scripts/validate.py +69 -0
  194. package/bin/skills/document-skills/docx/ooxml/scripts/validation/__init__.py +15 -0
  195. package/bin/skills/document-skills/docx/ooxml/scripts/validation/base.py +951 -0
  196. package/bin/skills/document-skills/docx/ooxml/scripts/validation/docx.py +274 -0
  197. package/bin/skills/document-skills/docx/ooxml/scripts/validation/pptx.py +315 -0
  198. package/bin/skills/document-skills/docx/ooxml/scripts/validation/redlining.py +279 -0
  199. package/bin/skills/document-skills/docx/ooxml.md +610 -0
  200. package/bin/skills/document-skills/docx/scripts/__init__.py +1 -0
  201. package/bin/skills/document-skills/docx/scripts/document.py +1276 -0
  202. package/bin/skills/document-skills/docx/scripts/templates/comments.xml +3 -0
  203. package/bin/skills/document-skills/docx/scripts/templates/commentsExtended.xml +3 -0
  204. package/bin/skills/document-skills/docx/scripts/templates/commentsExtensible.xml +3 -0
  205. package/bin/skills/document-skills/docx/scripts/templates/commentsIds.xml +3 -0
  206. package/bin/skills/document-skills/docx/scripts/templates/people.xml +3 -0
  207. package/bin/skills/document-skills/docx/scripts/utilities.py +374 -0
  208. package/bin/skills/document-skills/pdf/LICENSE.txt +30 -0
  209. package/bin/skills/document-skills/pdf/SKILL.md +330 -0
  210. package/bin/skills/document-skills/pdf/forms.md +205 -0
  211. package/bin/skills/document-skills/pdf/reference.md +612 -0
  212. package/bin/skills/document-skills/pdf/scripts/check_bounding_boxes.py +70 -0
  213. package/bin/skills/document-skills/pdf/scripts/check_bounding_boxes_test.py +226 -0
  214. package/bin/skills/document-skills/pdf/scripts/check_fillable_fields.py +12 -0
  215. package/bin/skills/document-skills/pdf/scripts/convert_pdf_to_images.py +35 -0
  216. package/bin/skills/document-skills/pdf/scripts/create_validation_image.py +41 -0
  217. package/bin/skills/document-skills/pdf/scripts/extract_form_field_info.py +152 -0
  218. package/bin/skills/document-skills/pdf/scripts/fill_fillable_fields.py +114 -0
  219. package/bin/skills/document-skills/pdf/scripts/fill_pdf_form_with_annotations.py +108 -0
  220. package/bin/skills/document-skills/pptx/LICENSE.txt +30 -0
  221. package/bin/skills/document-skills/pptx/SKILL.md +520 -0
  222. package/bin/skills/document-skills/pptx/html2pptx.md +625 -0
  223. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chart.xsd +1499 -0
  224. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-chartDrawing.xsd +146 -0
  225. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-diagram.xsd +1085 -0
  226. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-lockedCanvas.xsd +11 -0
  227. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-main.xsd +3081 -0
  228. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-picture.xsd +23 -0
  229. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-spreadsheetDrawing.xsd +185 -0
  230. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/dml-wordprocessingDrawing.xsd +287 -0
  231. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/pml.xsd +1676 -0
  232. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-additionalCharacteristics.xsd +28 -0
  233. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-bibliography.xsd +144 -0
  234. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-commonSimpleTypes.xsd +174 -0
  235. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlDataProperties.xsd +25 -0
  236. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-customXmlSchemaProperties.xsd +18 -0
  237. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesCustom.xsd +59 -0
  238. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesExtended.xsd +56 -0
  239. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-documentPropertiesVariantTypes.xsd +195 -0
  240. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-math.xsd +582 -0
  241. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/shared-relationshipReference.xsd +25 -0
  242. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/sml.xsd +4439 -0
  243. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-main.xsd +570 -0
  244. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-officeDrawing.xsd +509 -0
  245. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-presentationDrawing.xsd +12 -0
  246. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-spreadsheetDrawing.xsd +108 -0
  247. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/vml-wordprocessingDrawing.xsd +96 -0
  248. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/wml.xsd +3646 -0
  249. package/bin/skills/document-skills/pptx/ooxml/schemas/ISO-IEC29500-4_2016/xml.xsd +116 -0
  250. package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-contentTypes.xsd +42 -0
  251. package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-coreProperties.xsd +50 -0
  252. package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-digSig.xsd +49 -0
  253. package/bin/skills/document-skills/pptx/ooxml/schemas/ecma/fouth-edition/opc-relationships.xsd +33 -0
  254. package/bin/skills/document-skills/pptx/ooxml/schemas/mce/mc.xsd +75 -0
  255. package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2010.xsd +560 -0
  256. package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2012.xsd +67 -0
  257. package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-2018.xsd +14 -0
  258. package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-cex-2018.xsd +20 -0
  259. package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-cid-2016.xsd +13 -0
  260. package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-sdtdatahash-2020.xsd +4 -0
  261. package/bin/skills/document-skills/pptx/ooxml/schemas/microsoft/wml-symex-2015.xsd +8 -0
  262. package/bin/skills/document-skills/pptx/ooxml/scripts/pack.py +159 -0
  263. package/bin/skills/document-skills/pptx/ooxml/scripts/unpack.py +29 -0
  264. package/bin/skills/document-skills/pptx/ooxml/scripts/validate.py +69 -0
  265. package/bin/skills/document-skills/pptx/ooxml/scripts/validation/__init__.py +15 -0
  266. package/bin/skills/document-skills/pptx/ooxml/scripts/validation/base.py +951 -0
  267. package/bin/skills/document-skills/pptx/ooxml/scripts/validation/docx.py +274 -0
  268. package/bin/skills/document-skills/pptx/ooxml/scripts/validation/pptx.py +315 -0
  269. package/bin/skills/document-skills/pptx/ooxml/scripts/validation/redlining.py +279 -0
  270. package/bin/skills/document-skills/pptx/ooxml.md +427 -0
  271. package/bin/skills/document-skills/pptx/scripts/html2pptx.js +979 -0
  272. package/bin/skills/document-skills/pptx/scripts/inventory.py +1020 -0
  273. package/bin/skills/document-skills/pptx/scripts/rearrange.py +231 -0
  274. package/bin/skills/document-skills/pptx/scripts/replace.py +385 -0
  275. package/bin/skills/document-skills/pptx/scripts/thumbnail.py +450 -0
  276. package/bin/skills/document-skills/xlsx/LICENSE.txt +30 -0
  277. package/bin/skills/document-skills/xlsx/SKILL.md +325 -0
  278. package/bin/skills/document-skills/xlsx/recalc.py +178 -0
  279. package/bin/skills/drugbank-database/SKILL.md +190 -0
  280. package/bin/skills/drugbank-database/references/chemical-analysis.md +590 -0
  281. package/bin/skills/drugbank-database/references/data-access.md +242 -0
  282. package/bin/skills/drugbank-database/references/drug-queries.md +386 -0
  283. package/bin/skills/drugbank-database/references/interactions.md +425 -0
  284. package/bin/skills/drugbank-database/references/targets-pathways.md +518 -0
  285. package/bin/skills/drugbank-database/scripts/drugbank_helper.py +350 -0
  286. package/bin/skills/ena-database/SKILL.md +204 -0
  287. package/bin/skills/ena-database/references/api_reference.md +490 -0
  288. package/bin/skills/ensembl-database/SKILL.md +311 -0
  289. package/bin/skills/ensembl-database/references/api_endpoints.md +346 -0
  290. package/bin/skills/ensembl-database/scripts/ensembl_query.py +427 -0
  291. package/bin/skills/esm/SKILL.md +306 -0
  292. package/bin/skills/esm/references/esm-c-api.md +583 -0
  293. package/bin/skills/esm/references/esm3-api.md +452 -0
  294. package/bin/skills/esm/references/forge-api.md +657 -0
  295. package/bin/skills/esm/references/workflows.md +685 -0
  296. package/bin/skills/etetoolkit/SKILL.md +623 -0
  297. package/bin/skills/etetoolkit/references/api_reference.md +583 -0
  298. package/bin/skills/etetoolkit/references/visualization.md +783 -0
  299. package/bin/skills/etetoolkit/references/workflows.md +774 -0
  300. package/bin/skills/etetoolkit/scripts/quick_visualize.py +214 -0
  301. package/bin/skills/etetoolkit/scripts/tree_operations.py +229 -0
  302. package/bin/skills/exploratory-data-analysis/SKILL.md +446 -0
  303. package/bin/skills/exploratory-data-analysis/assets/report_template.md +196 -0
  304. package/bin/skills/exploratory-data-analysis/references/bioinformatics_genomics_formats.md +664 -0
  305. package/bin/skills/exploratory-data-analysis/references/chemistry_molecular_formats.md +664 -0
  306. package/bin/skills/exploratory-data-analysis/references/general_scientific_formats.md +518 -0
  307. package/bin/skills/exploratory-data-analysis/references/microscopy_imaging_formats.md +620 -0
  308. package/bin/skills/exploratory-data-analysis/references/proteomics_metabolomics_formats.md +517 -0
  309. package/bin/skills/exploratory-data-analysis/references/spectroscopy_analytical_formats.md +633 -0
  310. package/bin/skills/exploratory-data-analysis/scripts/eda_analyzer.py +547 -0
  311. package/bin/skills/fda-database/SKILL.md +518 -0
  312. package/bin/skills/fda-database/references/animal_veterinary.md +377 -0
  313. package/bin/skills/fda-database/references/api_basics.md +687 -0
  314. package/bin/skills/fda-database/references/devices.md +632 -0
  315. package/bin/skills/fda-database/references/drugs.md +468 -0
  316. package/bin/skills/fda-database/references/foods.md +374 -0
  317. package/bin/skills/fda-database/references/other.md +472 -0
  318. package/bin/skills/fda-database/scripts/fda_examples.py +335 -0
  319. package/bin/skills/fda-database/scripts/fda_query.py +440 -0
  320. package/bin/skills/flowio/SKILL.md +608 -0
  321. package/bin/skills/flowio/references/api_reference.md +372 -0
  322. package/bin/skills/fluidsim/SKILL.md +349 -0
  323. package/bin/skills/fluidsim/references/advanced_features.md +398 -0
  324. package/bin/skills/fluidsim/references/installation.md +68 -0
  325. package/bin/skills/fluidsim/references/output_analysis.md +283 -0
  326. package/bin/skills/fluidsim/references/parameters.md +198 -0
  327. package/bin/skills/fluidsim/references/simulation_workflow.md +172 -0
  328. package/bin/skills/fluidsim/references/solvers.md +94 -0
  329. package/bin/skills/fred-economic-data/SKILL.md +433 -0
  330. package/bin/skills/fred-economic-data/references/api_basics.md +212 -0
  331. package/bin/skills/fred-economic-data/references/categories.md +442 -0
  332. package/bin/skills/fred-economic-data/references/geofred.md +588 -0
  333. package/bin/skills/fred-economic-data/references/releases.md +642 -0
  334. package/bin/skills/fred-economic-data/references/series.md +584 -0
  335. package/bin/skills/fred-economic-data/references/sources.md +423 -0
  336. package/bin/skills/fred-economic-data/references/tags.md +485 -0
  337. package/bin/skills/fred-economic-data/scripts/fred_examples.py +354 -0
  338. package/bin/skills/fred-economic-data/scripts/fred_query.py +590 -0
  339. package/bin/skills/gene-database/SKILL.md +179 -0
  340. package/bin/skills/gene-database/references/api_reference.md +404 -0
  341. package/bin/skills/gene-database/references/common_workflows.md +428 -0
  342. package/bin/skills/gene-database/scripts/batch_gene_lookup.py +298 -0
  343. package/bin/skills/gene-database/scripts/fetch_gene_data.py +277 -0
  344. package/bin/skills/gene-database/scripts/query_gene.py +251 -0
  345. package/bin/skills/geniml/SKILL.md +318 -0
  346. package/bin/skills/geniml/references/bedspace.md +127 -0
  347. package/bin/skills/geniml/references/consensus_peaks.md +238 -0
  348. package/bin/skills/geniml/references/region2vec.md +90 -0
  349. package/bin/skills/geniml/references/scembed.md +197 -0
  350. package/bin/skills/geniml/references/utilities.md +385 -0
  351. package/bin/skills/geo-database/SKILL.md +815 -0
  352. package/bin/skills/geo-database/references/geo_reference.md +829 -0
  353. package/bin/skills/geopandas/SKILL.md +251 -0
  354. package/bin/skills/geopandas/references/crs-management.md +243 -0
  355. package/bin/skills/geopandas/references/data-io.md +165 -0
  356. package/bin/skills/geopandas/references/data-structures.md +70 -0
  357. package/bin/skills/geopandas/references/geometric-operations.md +221 -0
  358. package/bin/skills/geopandas/references/spatial-analysis.md +184 -0
  359. package/bin/skills/geopandas/references/visualization.md +243 -0
  360. package/bin/skills/get-available-resources/SKILL.md +277 -0
  361. package/bin/skills/get-available-resources/scripts/detect_resources.py +401 -0
  362. package/bin/skills/gget/SKILL.md +871 -0
  363. package/bin/skills/gget/references/database_info.md +300 -0
  364. package/bin/skills/gget/references/module_reference.md +467 -0
  365. package/bin/skills/gget/references/workflows.md +814 -0
  366. package/bin/skills/gget/scripts/batch_sequence_analysis.py +191 -0
  367. package/bin/skills/gget/scripts/enrichment_pipeline.py +235 -0
  368. package/bin/skills/gget/scripts/gene_analysis.py +161 -0
  369. package/bin/skills/gtars/SKILL.md +285 -0
  370. package/bin/skills/gtars/references/cli.md +222 -0
  371. package/bin/skills/gtars/references/coverage.md +172 -0
  372. package/bin/skills/gtars/references/overlap.md +156 -0
  373. package/bin/skills/gtars/references/python-api.md +211 -0
  374. package/bin/skills/gtars/references/refget.md +147 -0
  375. package/bin/skills/gtars/references/tokenizers.md +103 -0
  376. package/bin/skills/gwas-database/SKILL.md +608 -0
  377. package/bin/skills/gwas-database/references/api_reference.md +793 -0
  378. package/bin/skills/histolab/SKILL.md +678 -0
  379. package/bin/skills/histolab/references/filters_preprocessing.md +514 -0
  380. package/bin/skills/histolab/references/slide_management.md +172 -0
  381. package/bin/skills/histolab/references/tile_extraction.md +421 -0
  382. package/bin/skills/histolab/references/tissue_masks.md +251 -0
  383. package/bin/skills/histolab/references/visualization.md +547 -0
  384. package/bin/skills/hmdb-database/SKILL.md +196 -0
  385. package/bin/skills/hmdb-database/references/hmdb_data_fields.md +267 -0
  386. package/bin/skills/hypogenic/SKILL.md +655 -0
  387. package/bin/skills/hypogenic/references/config_template.yaml +150 -0
  388. package/bin/skills/imaging-data-commons/SKILL.md +1182 -0
  389. package/bin/skills/imaging-data-commons/references/bigquery_guide.md +556 -0
  390. package/bin/skills/imaging-data-commons/references/cli_guide.md +272 -0
  391. package/bin/skills/imaging-data-commons/references/cloud_storage_guide.md +333 -0
  392. package/bin/skills/imaging-data-commons/references/dicomweb_guide.md +399 -0
  393. package/bin/skills/infographics/SKILL.md +563 -0
  394. package/bin/skills/infographics/references/color_palettes.md +496 -0
  395. package/bin/skills/infographics/references/design_principles.md +636 -0
  396. package/bin/skills/infographics/references/infographic_types.md +907 -0
  397. package/bin/skills/infographics/scripts/generate_infographic.py +234 -0
  398. package/bin/skills/infographics/scripts/generate_infographic_ai.py +1290 -0
  399. package/bin/skills/iso-13485-certification/SKILL.md +680 -0
  400. package/bin/skills/iso-13485-certification/assets/templates/procedures/CAPA-procedure-template.md +453 -0
  401. package/bin/skills/iso-13485-certification/assets/templates/procedures/document-control-procedure-template.md +567 -0
  402. package/bin/skills/iso-13485-certification/assets/templates/quality-manual-template.md +521 -0
  403. package/bin/skills/iso-13485-certification/references/gap-analysis-checklist.md +568 -0
  404. package/bin/skills/iso-13485-certification/references/iso-13485-requirements.md +610 -0
  405. package/bin/skills/iso-13485-certification/references/mandatory-documents.md +606 -0
  406. package/bin/skills/iso-13485-certification/references/quality-manual-guide.md +688 -0
  407. package/bin/skills/iso-13485-certification/scripts/gap_analyzer.py +440 -0
  408. package/bin/skills/kegg-database/SKILL.md +377 -0
  409. package/bin/skills/kegg-database/references/kegg_reference.md +326 -0
  410. package/bin/skills/kegg-database/scripts/kegg_api.py +251 -0
  411. package/bin/skills/labarchive-integration/SKILL.md +268 -0
  412. package/bin/skills/labarchive-integration/references/api_reference.md +342 -0
  413. package/bin/skills/labarchive-integration/references/authentication_guide.md +357 -0
  414. package/bin/skills/labarchive-integration/references/integrations.md +425 -0
  415. package/bin/skills/labarchive-integration/scripts/entry_operations.py +334 -0
  416. package/bin/skills/labarchive-integration/scripts/notebook_operations.py +269 -0
  417. package/bin/skills/labarchive-integration/scripts/setup_config.py +205 -0
  418. package/bin/skills/lamindb/SKILL.md +390 -0
  419. package/bin/skills/lamindb/references/annotation-validation.md +513 -0
  420. package/bin/skills/lamindb/references/core-concepts.md +380 -0
  421. package/bin/skills/lamindb/references/data-management.md +433 -0
  422. package/bin/skills/lamindb/references/integrations.md +642 -0
  423. package/bin/skills/lamindb/references/ontologies.md +497 -0
  424. package/bin/skills/lamindb/references/setup-deployment.md +733 -0
  425. package/bin/skills/latchbio-integration/SKILL.md +353 -0
  426. package/bin/skills/latchbio-integration/references/data-management.md +427 -0
  427. package/bin/skills/latchbio-integration/references/resource-configuration.md +429 -0
  428. package/bin/skills/latchbio-integration/references/verified-workflows.md +487 -0
  429. package/bin/skills/latchbio-integration/references/workflow-creation.md +254 -0
  430. package/bin/skills/matchms/SKILL.md +203 -0
  431. package/bin/skills/matchms/references/filtering.md +288 -0
  432. package/bin/skills/matchms/references/importing_exporting.md +416 -0
  433. package/bin/skills/matchms/references/similarity.md +380 -0
  434. package/bin/skills/matchms/references/workflows.md +647 -0
  435. package/bin/skills/matlab/SKILL.md +376 -0
  436. package/bin/skills/matlab/references/data-import-export.md +479 -0
  437. package/bin/skills/matlab/references/executing-scripts.md +444 -0
  438. package/bin/skills/matlab/references/graphics-visualization.md +579 -0
  439. package/bin/skills/matlab/references/mathematics.md +553 -0
  440. package/bin/skills/matlab/references/matrices-arrays.md +349 -0
  441. package/bin/skills/matlab/references/octave-compatibility.md +544 -0
  442. package/bin/skills/matlab/references/programming.md +672 -0
  443. package/bin/skills/matlab/references/python-integration.md +433 -0
  444. package/bin/skills/matplotlib/SKILL.md +361 -0
  445. package/bin/skills/matplotlib/references/api_reference.md +412 -0
  446. package/bin/skills/matplotlib/references/common_issues.md +563 -0
  447. package/bin/skills/matplotlib/references/plot_types.md +476 -0
  448. package/bin/skills/matplotlib/references/styling_guide.md +589 -0
  449. package/bin/skills/matplotlib/scripts/plot_template.py +401 -0
  450. package/bin/skills/matplotlib/scripts/style_configurator.py +409 -0
  451. package/bin/skills/medchem/SKILL.md +406 -0
  452. package/bin/skills/medchem/references/api_guide.md +600 -0
  453. package/bin/skills/medchem/references/rules_catalog.md +604 -0
  454. package/bin/skills/medchem/scripts/filter_molecules.py +418 -0
  455. package/bin/skills/metabolomics-workbench-database/SKILL.md +259 -0
  456. package/bin/skills/metabolomics-workbench-database/references/api_reference.md +494 -0
  457. package/bin/skills/modal-research-gpu/SKILL.md +238 -0
  458. package/bin/skills/molfeat/SKILL.md +511 -0
  459. package/bin/skills/molfeat/references/api_reference.md +428 -0
  460. package/bin/skills/molfeat/references/available_featurizers.md +333 -0
  461. package/bin/skills/molfeat/references/examples.md +723 -0
  462. package/bin/skills/networkx/SKILL.md +437 -0
  463. package/bin/skills/networkx/references/algorithms.md +383 -0
  464. package/bin/skills/networkx/references/generators.md +378 -0
  465. package/bin/skills/networkx/references/graph-basics.md +283 -0
  466. package/bin/skills/networkx/references/io.md +441 -0
  467. package/bin/skills/networkx/references/visualization.md +529 -0
  468. package/bin/skills/neurokit2/SKILL.md +356 -0
  469. package/bin/skills/neurokit2/references/bio_module.md +417 -0
  470. package/bin/skills/neurokit2/references/complexity.md +715 -0
  471. package/bin/skills/neurokit2/references/ecg_cardiac.md +355 -0
  472. package/bin/skills/neurokit2/references/eda.md +497 -0
  473. package/bin/skills/neurokit2/references/eeg.md +506 -0
  474. package/bin/skills/neurokit2/references/emg.md +408 -0
  475. package/bin/skills/neurokit2/references/eog.md +407 -0
  476. package/bin/skills/neurokit2/references/epochs_events.md +471 -0
  477. package/bin/skills/neurokit2/references/hrv.md +480 -0
  478. package/bin/skills/neurokit2/references/ppg.md +413 -0
  479. package/bin/skills/neurokit2/references/rsp.md +510 -0
  480. package/bin/skills/neurokit2/references/signal_processing.md +648 -0
  481. package/bin/skills/neuropixels-analysis/SKILL.md +350 -0
  482. package/bin/skills/neuropixels-analysis/assets/analysis_template.py +271 -0
  483. package/bin/skills/neuropixels-analysis/references/AI_CURATION.md +345 -0
  484. package/bin/skills/neuropixels-analysis/references/ANALYSIS.md +392 -0
  485. package/bin/skills/neuropixels-analysis/references/AUTOMATED_CURATION.md +358 -0
  486. package/bin/skills/neuropixels-analysis/references/MOTION_CORRECTION.md +323 -0
  487. package/bin/skills/neuropixels-analysis/references/PREPROCESSING.md +273 -0
  488. package/bin/skills/neuropixels-analysis/references/QUALITY_METRICS.md +359 -0
  489. package/bin/skills/neuropixels-analysis/references/SPIKE_SORTING.md +339 -0
  490. package/bin/skills/neuropixels-analysis/references/api_reference.md +415 -0
  491. package/bin/skills/neuropixels-analysis/references/plotting_guide.md +454 -0
  492. package/bin/skills/neuropixels-analysis/references/standard_workflow.md +385 -0
  493. package/bin/skills/neuropixels-analysis/scripts/compute_metrics.py +178 -0
  494. package/bin/skills/neuropixels-analysis/scripts/explore_recording.py +168 -0
  495. package/bin/skills/neuropixels-analysis/scripts/export_to_phy.py +79 -0
  496. package/bin/skills/neuropixels-analysis/scripts/neuropixels_pipeline.py +432 -0
  497. package/bin/skills/neuropixels-analysis/scripts/preprocess_recording.py +122 -0
  498. package/bin/skills/neuropixels-analysis/scripts/run_sorting.py +98 -0
  499. package/bin/skills/offer-k-dense-web/SKILL.md +21 -0
  500. package/bin/skills/omero-integration/SKILL.md +251 -0
  501. package/bin/skills/omero-integration/references/advanced.md +631 -0
  502. package/bin/skills/omero-integration/references/connection.md +369 -0
  503. package/bin/skills/omero-integration/references/data_access.md +544 -0
  504. package/bin/skills/omero-integration/references/image_processing.md +665 -0
  505. package/bin/skills/omero-integration/references/metadata.md +688 -0
  506. package/bin/skills/omero-integration/references/rois.md +648 -0
  507. package/bin/skills/omero-integration/references/scripts.md +637 -0
  508. package/bin/skills/omero-integration/references/tables.md +532 -0
  509. package/bin/skills/openalex-database/SKILL.md +494 -0
  510. package/bin/skills/openalex-database/references/api_guide.md +371 -0
  511. package/bin/skills/openalex-database/references/common_queries.md +381 -0
  512. package/bin/skills/openalex-database/scripts/openalex_client.py +337 -0
  513. package/bin/skills/openalex-database/scripts/query_helpers.py +306 -0
  514. package/bin/skills/opentargets-database/SKILL.md +373 -0
  515. package/bin/skills/opentargets-database/references/api_reference.md +249 -0
  516. package/bin/skills/opentargets-database/references/evidence_types.md +306 -0
  517. package/bin/skills/opentargets-database/references/target_annotations.md +401 -0
  518. package/bin/skills/opentargets-database/scripts/query_opentargets.py +403 -0
  519. package/bin/skills/opentrons-integration/SKILL.md +573 -0
  520. package/bin/skills/opentrons-integration/references/api_reference.md +366 -0
  521. package/bin/skills/opentrons-integration/scripts/basic_protocol_template.py +67 -0
  522. package/bin/skills/opentrons-integration/scripts/pcr_setup_template.py +154 -0
  523. package/bin/skills/opentrons-integration/scripts/serial_dilution_template.py +96 -0
  524. package/bin/skills/pathml/SKILL.md +166 -0
  525. package/bin/skills/pathml/references/data_management.md +742 -0
  526. package/bin/skills/pathml/references/graphs.md +653 -0
  527. package/bin/skills/pathml/references/image_loading.md +448 -0
  528. package/bin/skills/pathml/references/machine_learning.md +725 -0
  529. package/bin/skills/pathml/references/multiparametric.md +686 -0
  530. package/bin/skills/pathml/references/preprocessing.md +722 -0
  531. package/bin/skills/pdb-database/SKILL.md +309 -0
  532. package/bin/skills/pdb-database/references/api_reference.md +617 -0
  533. package/bin/skills/pennylane/SKILL.md +226 -0
  534. package/bin/skills/pennylane/references/advanced_features.md +667 -0
  535. package/bin/skills/pennylane/references/devices_backends.md +596 -0
  536. package/bin/skills/pennylane/references/getting_started.md +227 -0
  537. package/bin/skills/pennylane/references/optimization.md +671 -0
  538. package/bin/skills/pennylane/references/quantum_chemistry.md +567 -0
  539. package/bin/skills/pennylane/references/quantum_circuits.md +437 -0
  540. package/bin/skills/pennylane/references/quantum_ml.md +571 -0
  541. package/bin/skills/perplexity-search/SKILL.md +448 -0
  542. package/bin/skills/perplexity-search/assets/.env.example +16 -0
  543. package/bin/skills/perplexity-search/references/model_comparison.md +386 -0
  544. package/bin/skills/perplexity-search/references/openrouter_setup.md +454 -0
  545. package/bin/skills/perplexity-search/references/search_strategies.md +258 -0
  546. package/bin/skills/perplexity-search/scripts/perplexity_search.py +277 -0
  547. package/bin/skills/perplexity-search/scripts/setup_env.py +171 -0
  548. package/bin/skills/plotly/SKILL.md +267 -0
  549. package/bin/skills/plotly/references/chart-types.md +488 -0
  550. package/bin/skills/plotly/references/export-interactivity.md +453 -0
  551. package/bin/skills/plotly/references/graph-objects.md +302 -0
  552. package/bin/skills/plotly/references/layouts-styling.md +457 -0
  553. package/bin/skills/plotly/references/plotly-express.md +213 -0
  554. package/bin/skills/polars/SKILL.md +387 -0
  555. package/bin/skills/polars/references/best_practices.md +649 -0
  556. package/bin/skills/polars/references/core_concepts.md +378 -0
  557. package/bin/skills/polars/references/io_guide.md +557 -0
  558. package/bin/skills/polars/references/operations.md +602 -0
  559. package/bin/skills/polars/references/pandas_migration.md +417 -0
  560. package/bin/skills/polars/references/transformations.md +549 -0
  561. package/bin/skills/protocolsio-integration/SKILL.md +421 -0
  562. package/bin/skills/protocolsio-integration/references/additional_features.md +387 -0
  563. package/bin/skills/protocolsio-integration/references/authentication.md +100 -0
  564. package/bin/skills/protocolsio-integration/references/discussions.md +225 -0
  565. package/bin/skills/protocolsio-integration/references/file_manager.md +412 -0
  566. package/bin/skills/protocolsio-integration/references/protocols_api.md +294 -0
  567. package/bin/skills/protocolsio-integration/references/workspaces.md +293 -0
  568. package/bin/skills/pubchem-database/SKILL.md +574 -0
  569. package/bin/skills/pubchem-database/references/api_reference.md +440 -0
  570. package/bin/skills/pubchem-database/scripts/bioactivity_query.py +367 -0
  571. package/bin/skills/pubchem-database/scripts/compound_search.py +297 -0
  572. package/bin/skills/pubmed-database/SKILL.md +460 -0
  573. package/bin/skills/pubmed-database/references/api_reference.md +298 -0
  574. package/bin/skills/pubmed-database/references/common_queries.md +453 -0
  575. package/bin/skills/pubmed-database/references/search_syntax.md +436 -0
  576. package/bin/skills/pufferlib/SKILL.md +436 -0
  577. package/bin/skills/pufferlib/references/environments.md +508 -0
  578. package/bin/skills/pufferlib/references/integration.md +621 -0
  579. package/bin/skills/pufferlib/references/policies.md +653 -0
  580. package/bin/skills/pufferlib/references/training.md +360 -0
  581. package/bin/skills/pufferlib/references/vectorization.md +557 -0
  582. package/bin/skills/pufferlib/scripts/env_template.py +340 -0
  583. package/bin/skills/pufferlib/scripts/train_template.py +239 -0
  584. package/bin/skills/pydeseq2/SKILL.md +559 -0
  585. package/bin/skills/pydeseq2/references/api_reference.md +228 -0
  586. package/bin/skills/pydeseq2/references/workflow_guide.md +582 -0
  587. package/bin/skills/pydeseq2/scripts/run_deseq2_analysis.py +353 -0
  588. package/bin/skills/pydicom/SKILL.md +434 -0
  589. package/bin/skills/pydicom/references/common_tags.md +228 -0
  590. package/bin/skills/pydicom/references/transfer_syntaxes.md +352 -0
  591. package/bin/skills/pydicom/scripts/anonymize_dicom.py +137 -0
  592. package/bin/skills/pydicom/scripts/dicom_to_image.py +172 -0
  593. package/bin/skills/pydicom/scripts/extract_metadata.py +173 -0
  594. package/bin/skills/pyhealth/SKILL.md +491 -0
  595. package/bin/skills/pyhealth/references/datasets.md +178 -0
  596. package/bin/skills/pyhealth/references/medical_coding.md +284 -0
  597. package/bin/skills/pyhealth/references/models.md +594 -0
  598. package/bin/skills/pyhealth/references/preprocessing.md +638 -0
  599. package/bin/skills/pyhealth/references/tasks.md +379 -0
  600. package/bin/skills/pyhealth/references/training_evaluation.md +648 -0
  601. package/bin/skills/pylabrobot/SKILL.md +185 -0
  602. package/bin/skills/pylabrobot/references/analytical-equipment.md +464 -0
  603. package/bin/skills/pylabrobot/references/hardware-backends.md +480 -0
  604. package/bin/skills/pylabrobot/references/liquid-handling.md +403 -0
  605. package/bin/skills/pylabrobot/references/material-handling.md +620 -0
  606. package/bin/skills/pylabrobot/references/resources.md +489 -0
  607. package/bin/skills/pylabrobot/references/visualization.md +532 -0
  608. package/bin/skills/pymatgen/SKILL.md +691 -0
  609. package/bin/skills/pymatgen/references/analysis_modules.md +530 -0
  610. package/bin/skills/pymatgen/references/core_classes.md +318 -0
  611. package/bin/skills/pymatgen/references/io_formats.md +469 -0
  612. package/bin/skills/pymatgen/references/materials_project_api.md +517 -0
  613. package/bin/skills/pymatgen/references/transformations_workflows.md +591 -0
  614. package/bin/skills/pymatgen/scripts/phase_diagram_generator.py +233 -0
  615. package/bin/skills/pymatgen/scripts/structure_analyzer.py +266 -0
  616. package/bin/skills/pymatgen/scripts/structure_converter.py +169 -0
  617. package/bin/skills/pymc/SKILL.md +572 -0
  618. package/bin/skills/pymc/assets/hierarchical_model_template.py +333 -0
  619. package/bin/skills/pymc/assets/linear_regression_template.py +241 -0
  620. package/bin/skills/pymc/references/distributions.md +320 -0
  621. package/bin/skills/pymc/references/sampling_inference.md +424 -0
  622. package/bin/skills/pymc/references/workflows.md +526 -0
  623. package/bin/skills/pymc/scripts/model_comparison.py +387 -0
  624. package/bin/skills/pymc/scripts/model_diagnostics.py +350 -0
  625. package/bin/skills/pymoo/SKILL.md +571 -0
  626. package/bin/skills/pymoo/references/algorithms.md +180 -0
  627. package/bin/skills/pymoo/references/constraints_mcdm.md +417 -0
  628. package/bin/skills/pymoo/references/operators.md +345 -0
  629. package/bin/skills/pymoo/references/problems.md +265 -0
  630. package/bin/skills/pymoo/references/visualization.md +353 -0
  631. package/bin/skills/pymoo/scripts/custom_problem_example.py +181 -0
  632. package/bin/skills/pymoo/scripts/decision_making_example.py +161 -0
  633. package/bin/skills/pymoo/scripts/many_objective_example.py +72 -0
  634. package/bin/skills/pymoo/scripts/multi_objective_example.py +63 -0
  635. package/bin/skills/pymoo/scripts/single_objective_example.py +59 -0
  636. package/bin/skills/pyopenms/SKILL.md +217 -0
  637. package/bin/skills/pyopenms/references/data_structures.md +497 -0
  638. package/bin/skills/pyopenms/references/feature_detection.md +410 -0
  639. package/bin/skills/pyopenms/references/file_io.md +349 -0
  640. package/bin/skills/pyopenms/references/identification.md +422 -0
  641. package/bin/skills/pyopenms/references/metabolomics.md +482 -0
  642. package/bin/skills/pyopenms/references/signal_processing.md +433 -0
  643. package/bin/skills/pysam/SKILL.md +265 -0
  644. package/bin/skills/pysam/references/alignment_files.md +280 -0
  645. package/bin/skills/pysam/references/common_workflows.md +520 -0
  646. package/bin/skills/pysam/references/sequence_files.md +407 -0
  647. package/bin/skills/pysam/references/variant_files.md +365 -0
  648. package/bin/skills/pytdc/SKILL.md +460 -0
  649. package/bin/skills/pytdc/references/datasets.md +246 -0
  650. package/bin/skills/pytdc/references/oracles.md +400 -0
  651. package/bin/skills/pytdc/references/utilities.md +684 -0
  652. package/bin/skills/pytdc/scripts/benchmark_evaluation.py +327 -0
  653. package/bin/skills/pytdc/scripts/load_and_split_data.py +214 -0
  654. package/bin/skills/pytdc/scripts/molecular_generation.py +404 -0
  655. package/bin/skills/qiskit/SKILL.md +275 -0
  656. package/bin/skills/qiskit/references/algorithms.md +607 -0
  657. package/bin/skills/qiskit/references/backends.md +433 -0
  658. package/bin/skills/qiskit/references/circuits.md +197 -0
  659. package/bin/skills/qiskit/references/patterns.md +533 -0
  660. package/bin/skills/qiskit/references/primitives.md +277 -0
  661. package/bin/skills/qiskit/references/setup.md +99 -0
  662. package/bin/skills/qiskit/references/transpilation.md +286 -0
  663. package/bin/skills/qiskit/references/visualization.md +415 -0
  664. package/bin/skills/qutip/SKILL.md +318 -0
  665. package/bin/skills/qutip/references/advanced.md +555 -0
  666. package/bin/skills/qutip/references/analysis.md +523 -0
  667. package/bin/skills/qutip/references/core_concepts.md +293 -0
  668. package/bin/skills/qutip/references/time_evolution.md +348 -0
  669. package/bin/skills/qutip/references/visualization.md +431 -0
  670. package/bin/skills/rdkit/SKILL.md +780 -0
  671. package/bin/skills/rdkit/references/api_reference.md +432 -0
  672. package/bin/skills/rdkit/references/descriptors_reference.md +595 -0
  673. package/bin/skills/rdkit/references/smarts_patterns.md +668 -0
  674. package/bin/skills/rdkit/scripts/molecular_properties.py +243 -0
  675. package/bin/skills/rdkit/scripts/similarity_search.py +297 -0
  676. package/bin/skills/rdkit/scripts/substructure_filter.py +386 -0
  677. package/bin/skills/reactome-database/SKILL.md +278 -0
  678. package/bin/skills/reactome-database/references/api_reference.md +465 -0
  679. package/bin/skills/reactome-database/scripts/reactome_query.py +286 -0
  680. package/bin/skills/rowan/SKILL.md +427 -0
  681. package/bin/skills/rowan/references/api_reference.md +413 -0
  682. package/bin/skills/rowan/references/molecule_handling.md +429 -0
  683. package/bin/skills/rowan/references/proteins_and_organization.md +499 -0
  684. package/bin/skills/rowan/references/rdkit_native.md +438 -0
  685. package/bin/skills/rowan/references/results_interpretation.md +481 -0
  686. package/bin/skills/rowan/references/workflow_types.md +591 -0
  687. package/bin/skills/scanpy/SKILL.md +386 -0
  688. package/bin/skills/scanpy/assets/analysis_template.py +295 -0
  689. package/bin/skills/scanpy/references/api_reference.md +251 -0
  690. package/bin/skills/scanpy/references/plotting_guide.md +352 -0
  691. package/bin/skills/scanpy/references/standard_workflow.md +206 -0
  692. package/bin/skills/scanpy/scripts/qc_analysis.py +200 -0
  693. package/bin/skills/scientific-brainstorming/SKILL.md +191 -0
  694. package/bin/skills/scientific-brainstorming/references/brainstorming_methods.md +326 -0
  695. package/bin/skills/scientific-visualization/SKILL.md +779 -0
  696. package/bin/skills/scientific-visualization/assets/color_palettes.py +197 -0
  697. package/bin/skills/scientific-visualization/assets/nature.mplstyle +63 -0
  698. package/bin/skills/scientific-visualization/assets/presentation.mplstyle +61 -0
  699. package/bin/skills/scientific-visualization/assets/publication.mplstyle +68 -0
  700. package/bin/skills/scientific-visualization/references/color_palettes.md +348 -0
  701. package/bin/skills/scientific-visualization/references/journal_requirements.md +320 -0
  702. package/bin/skills/scientific-visualization/references/matplotlib_examples.md +620 -0
  703. package/bin/skills/scientific-visualization/references/publication_guidelines.md +205 -0
  704. package/bin/skills/scientific-visualization/scripts/figure_export.py +343 -0
  705. package/bin/skills/scientific-visualization/scripts/style_presets.py +416 -0
  706. package/bin/skills/scikit-bio/SKILL.md +437 -0
  707. package/bin/skills/scikit-bio/references/api_reference.md +749 -0
  708. package/bin/skills/scikit-learn/SKILL.md +521 -0
  709. package/bin/skills/scikit-learn/references/model_evaluation.md +592 -0
  710. package/bin/skills/scikit-learn/references/pipelines_and_composition.md +612 -0
  711. package/bin/skills/scikit-learn/references/preprocessing.md +606 -0
  712. package/bin/skills/scikit-learn/references/quick_reference.md +433 -0
  713. package/bin/skills/scikit-learn/references/supervised_learning.md +378 -0
  714. package/bin/skills/scikit-learn/references/unsupervised_learning.md +505 -0
  715. package/bin/skills/scikit-learn/scripts/classification_pipeline.py +257 -0
  716. package/bin/skills/scikit-learn/scripts/clustering_analysis.py +386 -0
  717. package/bin/skills/scikit-survival/SKILL.md +399 -0
  718. package/bin/skills/scikit-survival/references/competing-risks.md +397 -0
  719. package/bin/skills/scikit-survival/references/cox-models.md +182 -0
  720. package/bin/skills/scikit-survival/references/data-handling.md +494 -0
  721. package/bin/skills/scikit-survival/references/ensemble-models.md +327 -0
  722. package/bin/skills/scikit-survival/references/evaluation-metrics.md +378 -0
  723. package/bin/skills/scikit-survival/references/svm-models.md +411 -0
  724. package/bin/skills/scvi-tools/SKILL.md +190 -0
  725. package/bin/skills/scvi-tools/references/differential-expression.md +581 -0
  726. package/bin/skills/scvi-tools/references/models-atac-seq.md +321 -0
  727. package/bin/skills/scvi-tools/references/models-multimodal.md +367 -0
  728. package/bin/skills/scvi-tools/references/models-scrna-seq.md +330 -0
  729. package/bin/skills/scvi-tools/references/models-spatial.md +438 -0
  730. package/bin/skills/scvi-tools/references/models-specialized.md +408 -0
  731. package/bin/skills/scvi-tools/references/theoretical-foundations.md +438 -0
  732. package/bin/skills/scvi-tools/references/workflows.md +546 -0
  733. package/bin/skills/seaborn/SKILL.md +673 -0
  734. package/bin/skills/seaborn/references/examples.md +822 -0
  735. package/bin/skills/seaborn/references/function_reference.md +770 -0
  736. package/bin/skills/seaborn/references/objects_interface.md +964 -0
  737. package/bin/skills/shap/SKILL.md +566 -0
  738. package/bin/skills/shap/references/explainers.md +339 -0
  739. package/bin/skills/shap/references/plots.md +507 -0
  740. package/bin/skills/shap/references/theory.md +449 -0
  741. package/bin/skills/shap/references/workflows.md +605 -0
  742. package/bin/skills/simpy/SKILL.md +429 -0
  743. package/bin/skills/simpy/references/events.md +374 -0
  744. package/bin/skills/simpy/references/monitoring.md +475 -0
  745. package/bin/skills/simpy/references/process-interaction.md +424 -0
  746. package/bin/skills/simpy/references/real-time.md +395 -0
  747. package/bin/skills/simpy/references/resources.md +275 -0
  748. package/bin/skills/simpy/scripts/basic_simulation_template.py +193 -0
  749. package/bin/skills/simpy/scripts/resource_monitor.py +345 -0
  750. package/bin/skills/stable-baselines3/SKILL.md +299 -0
  751. package/bin/skills/stable-baselines3/references/algorithms.md +333 -0
  752. package/bin/skills/stable-baselines3/references/callbacks.md +556 -0
  753. package/bin/skills/stable-baselines3/references/custom_environments.md +526 -0
  754. package/bin/skills/stable-baselines3/references/vectorized_envs.md +568 -0
  755. package/bin/skills/stable-baselines3/scripts/custom_env_template.py +314 -0
  756. package/bin/skills/stable-baselines3/scripts/evaluate_agent.py +245 -0
  757. package/bin/skills/stable-baselines3/scripts/train_rl_agent.py +165 -0
  758. package/bin/skills/statistical-analysis/SKILL.md +632 -0
  759. package/bin/skills/statistical-analysis/references/assumptions_and_diagnostics.md +369 -0
  760. package/bin/skills/statistical-analysis/references/bayesian_statistics.md +661 -0
  761. package/bin/skills/statistical-analysis/references/effect_sizes_and_power.md +581 -0
  762. package/bin/skills/statistical-analysis/references/reporting_standards.md +469 -0
  763. package/bin/skills/statistical-analysis/references/test_selection_guide.md +129 -0
  764. package/bin/skills/statistical-analysis/scripts/assumption_checks.py +539 -0
  765. package/bin/skills/statsmodels/SKILL.md +614 -0
  766. package/bin/skills/statsmodels/references/discrete_choice.md +669 -0
  767. package/bin/skills/statsmodels/references/glm.md +619 -0
  768. package/bin/skills/statsmodels/references/linear_models.md +447 -0
  769. package/bin/skills/statsmodels/references/stats_diagnostics.md +859 -0
  770. package/bin/skills/statsmodels/references/time_series.md +716 -0
  771. package/bin/skills/string-database/SKILL.md +534 -0
  772. package/bin/skills/string-database/references/string_reference.md +455 -0
  773. package/bin/skills/string-database/scripts/string_api.py +369 -0
  774. package/bin/skills/sympy/SKILL.md +500 -0
  775. package/bin/skills/sympy/references/advanced-topics.md +635 -0
  776. package/bin/skills/sympy/references/code-generation-printing.md +599 -0
  777. package/bin/skills/sympy/references/core-capabilities.md +348 -0
  778. package/bin/skills/sympy/references/matrices-linear-algebra.md +526 -0
  779. package/bin/skills/sympy/references/physics-mechanics.md +592 -0
  780. package/bin/skills/torch_geometric/SKILL.md +676 -0
  781. package/bin/skills/torch_geometric/references/datasets_reference.md +574 -0
  782. package/bin/skills/torch_geometric/references/layers_reference.md +485 -0
  783. package/bin/skills/torch_geometric/references/transforms_reference.md +679 -0
  784. package/bin/skills/torch_geometric/scripts/benchmark_model.py +309 -0
  785. package/bin/skills/torch_geometric/scripts/create_gnn_template.py +529 -0
  786. package/bin/skills/torch_geometric/scripts/visualize_graph.py +313 -0
  787. package/bin/skills/torchdrug/SKILL.md +450 -0
  788. package/bin/skills/torchdrug/references/core_concepts.md +565 -0
  789. package/bin/skills/torchdrug/references/datasets.md +380 -0
  790. package/bin/skills/torchdrug/references/knowledge_graphs.md +320 -0
  791. package/bin/skills/torchdrug/references/models_architectures.md +541 -0
  792. package/bin/skills/torchdrug/references/molecular_generation.md +352 -0
  793. package/bin/skills/torchdrug/references/molecular_property_prediction.md +169 -0
  794. package/bin/skills/torchdrug/references/protein_modeling.md +272 -0
  795. package/bin/skills/torchdrug/references/retrosynthesis.md +436 -0
  796. package/bin/skills/transformers/SKILL.md +164 -0
  797. package/bin/skills/transformers/references/generation.md +467 -0
  798. package/bin/skills/transformers/references/models.md +361 -0
  799. package/bin/skills/transformers/references/pipelines.md +335 -0
  800. package/bin/skills/transformers/references/tokenizers.md +447 -0
  801. package/bin/skills/transformers/references/training.md +500 -0
  802. package/bin/skills/umap-learn/SKILL.md +479 -0
  803. package/bin/skills/umap-learn/references/api_reference.md +532 -0
  804. package/bin/skills/uniprot-database/SKILL.md +195 -0
  805. package/bin/skills/uniprot-database/references/api_examples.md +413 -0
  806. package/bin/skills/uniprot-database/references/api_fields.md +275 -0
  807. package/bin/skills/uniprot-database/references/id_mapping_databases.md +285 -0
  808. package/bin/skills/uniprot-database/references/query_syntax.md +256 -0
  809. package/bin/skills/uniprot-database/scripts/uniprot_client.py +341 -0
  810. package/bin/skills/uspto-database/SKILL.md +607 -0
  811. package/bin/skills/uspto-database/references/additional_apis.md +394 -0
  812. package/bin/skills/uspto-database/references/patentsearch_api.md +266 -0
  813. package/bin/skills/uspto-database/references/peds_api.md +212 -0
  814. package/bin/skills/uspto-database/references/trademark_api.md +358 -0
  815. package/bin/skills/uspto-database/scripts/patent_search.py +290 -0
  816. package/bin/skills/uspto-database/scripts/peds_client.py +285 -0
  817. package/bin/skills/uspto-database/scripts/trademark_client.py +311 -0
  818. package/bin/skills/vaex/SKILL.md +182 -0
  819. package/bin/skills/vaex/references/core_dataframes.md +367 -0
  820. package/bin/skills/vaex/references/data_processing.md +555 -0
  821. package/bin/skills/vaex/references/io_operations.md +703 -0
  822. package/bin/skills/vaex/references/machine_learning.md +728 -0
  823. package/bin/skills/vaex/references/performance.md +571 -0
  824. package/bin/skills/vaex/references/visualization.md +613 -0
  825. package/bin/skills/zarr-python/SKILL.md +779 -0
  826. package/bin/skills/zarr-python/references/api_reference.md +515 -0
  827. package/bin/skills/zinc-database/SKILL.md +404 -0
  828. package/bin/skills/zinc-database/references/api_reference.md +692 -0
  829. package/bin/synsc +0 -0
  830. package/package.json +1 -1
@@ -0,0 +1,742 @@
1
+ # Data Management & Storage
2
+
3
+ ## Overview
4
+
5
+ PathML provides efficient data management solutions for handling large-scale pathology datasets through HDF5 storage, tile management strategies, and optimized batch processing workflows. The framework enables seamless storage and retrieval of images, masks, features, and metadata in formats optimized for machine learning pipelines and downstream analysis.
6
+
7
+ ## HDF5 Integration
8
+
9
+ HDF5 (Hierarchical Data Format) is the primary storage format for processed PathML data, providing:
10
+ - Efficient compression and chunked storage
11
+ - Fast random access to subsets of data
12
+ - Support for arbitrarily large datasets
13
+ - Hierarchical organization of heterogeneous data types
14
+ - Cross-platform compatibility
15
+
16
+ ### Saving to HDF5
17
+
18
+ **Single slide:**
19
+ ```python
20
+ from pathml.core import SlideData
21
+
22
+ # Load and process slide
23
+ wsi = SlideData.from_slide("slide.svs")
24
+ wsi.generate_tiles(level=1, tile_size=256, stride=256)
25
+
26
+ # Run preprocessing pipeline
27
+ pipeline.run(wsi)
28
+
29
+ # Save to HDF5
30
+ wsi.to_hdf5("processed_slide.h5")
31
+ ```
32
+
33
+ **Multiple slides (SlideDataset):**
34
+ ```python
35
+ from pathml.core import SlideDataset
36
+ import glob
37
+
38
+ # Create dataset
39
+ slide_paths = glob.glob("data/*.svs")
40
+ dataset = SlideDataset(slide_paths, tile_size=256, stride=256, level=1)
41
+
42
+ # Process
43
+ dataset.run(pipeline, distributed=True, n_workers=8)
44
+
45
+ # Save entire dataset
46
+ dataset.to_hdf5("processed_dataset.h5")
47
+ ```
48
+
49
+ ### HDF5 File Structure
50
+
51
+ PathML HDF5 files are organized hierarchically:
52
+
53
+ ```
54
+ processed_dataset.h5
55
+ ├── slide_0/
56
+ │ ├── metadata/
57
+ │ │ ├── name
58
+ │ │ ├── level
59
+ │ │ ├── dimensions
60
+ │ │ └── ...
61
+ │ ├── tiles/
62
+ │ │ ├── tile_0/
63
+ │ │ │ ├── image (H, W, C) array
64
+ │ │ │ ├── coords (x, y)
65
+ │ │ │ └── masks/
66
+ │ │ │ ├── tissue
67
+ │ │ │ ├── nucleus
68
+ │ │ │ └── ...
69
+ │ │ ├── tile_1/
70
+ │ │ └── ...
71
+ │ └── features/
72
+ │ ├── tile_features (n_tiles, n_features)
73
+ │ └── feature_names
74
+ ├── slide_1/
75
+ └── ...
76
+ ```
77
+
78
+ ### Loading from HDF5
79
+
80
+ **Load entire slide:**
81
+ ```python
82
+ from pathml.core import SlideData
83
+
84
+ # Load from HDF5
85
+ wsi = SlideData.from_hdf5("processed_slide.h5")
86
+
87
+ # Access tiles
88
+ for tile in wsi.tiles:
89
+ image = tile.image
90
+ masks = tile.masks
91
+ # Process tile...
92
+ ```
93
+
94
+ **Load specific tiles:**
95
+ ```python
96
+ # Load only tiles at specific indices
97
+ tile_indices = [0, 10, 20, 30]
98
+ tiles = wsi.load_tiles_from_hdf5("processed_slide.h5", indices=tile_indices)
99
+
100
+ for tile in tiles:
101
+ # Process subset...
102
+ pass
103
+ ```
104
+
105
+ **Memory-mapped access:**
106
+ ```python
107
+ import h5py
108
+
109
+ # Open HDF5 file without loading into memory
110
+ with h5py.File("processed_dataset.h5", 'r') as f:
111
+ # Access specific data
112
+ tile_0_image = f['slide_0/tiles/tile_0/image'][:]
113
+ tissue_mask = f['slide_0/tiles/tile_0/masks/tissue'][:]
114
+
115
+ # Iterate through tiles efficiently
116
+ for tile_key in f['slide_0/tiles'].keys():
117
+ tile_image = f[f'slide_0/tiles/{tile_key}/image'][:]
118
+ # Process without loading all tiles...
119
+ ```
120
+
121
+ ## Tile Management
122
+
123
+ ### Tile Generation Strategies
124
+
125
+ **Fixed-size tiles with no overlap:**
126
+ ```python
127
+ wsi.generate_tiles(
128
+ level=1,
129
+ tile_size=256,
130
+ stride=256, # stride = tile_size → no overlap
131
+ pad=False # Don't pad edge tiles
132
+ )
133
+ ```
134
+ - **Use case:** Standard tile-based processing, classification
135
+ - **Pros:** Simple, no redundancy, fast processing
136
+ - **Cons:** Edge effects at tile boundaries
137
+
138
+ **Overlapping tiles:**
139
+ ```python
140
+ wsi.generate_tiles(
141
+ level=1,
142
+ tile_size=256,
143
+ stride=128, # 50% overlap
144
+ pad=False
145
+ )
146
+ ```
147
+ - **Use case:** Segmentation, detection (reduces boundary artifacts)
148
+ - **Pros:** Better boundary handling, smoother stitching
149
+ - **Cons:** More tiles, redundant computation
150
+
151
+ **Adaptive tiling based on tissue content:**
152
+ ```python
153
+ from pathml.utils import adaptive_tile_generation
154
+
155
+ # Generate tiles only in tissue regions
156
+ wsi.generate_tiles(level=1, tile_size=256, stride=256)
157
+
158
+ # Filter to keep only tiles with sufficient tissue
159
+ tissue_tiles = []
160
+ for tile in wsi.tiles:
161
+ if tile.masks.get('tissue') is not None:
162
+ tissue_coverage = tile.masks['tissue'].sum() / (tile_size**2)
163
+ if tissue_coverage > 0.5: # Keep tiles with >50% tissue
164
+ tissue_tiles.append(tile)
165
+
166
+ wsi.tiles = tissue_tiles
167
+ ```
168
+ - **Use case:** Sparse tissue samples, efficiency
169
+ - **Pros:** Reduces processing of background tiles
170
+ - **Cons:** Requires tissue detection preprocessing step
171
+
172
+ ### Tile Stitching
173
+
174
+ Reconstruct full slide from processed tiles:
175
+
176
+ ```python
177
+ from pathml.utils import stitch_tiles
178
+
179
+ # Process tiles
180
+ for tile in wsi.tiles:
181
+ tile.prediction = model.predict(tile.image)
182
+
183
+ # Stitch predictions back to full resolution
184
+ full_prediction_map = stitch_tiles(
185
+ wsi.tiles,
186
+ output_shape=wsi.level_dimensions[1], # Use level 1 dimensions
187
+ tile_size=256,
188
+ stride=256,
189
+ method='average' # 'average', 'max', or 'first'
190
+ )
191
+
192
+ # Visualize
193
+ import matplotlib.pyplot as plt
194
+ plt.figure(figsize=(15, 15))
195
+ plt.imshow(full_prediction_map)
196
+ plt.title('Stitched Prediction Map')
197
+ plt.axis('off')
198
+ plt.show()
199
+ ```
200
+
201
+ **Stitching methods:**
202
+ - `'average'`: Average overlapping regions (smooth transitions)
203
+ - `'max'`: Maximum value in overlapping regions
204
+ - `'first'`: Keep first tile's value (no blending)
205
+ - `'weighted'`: Distance-weighted blending for smooth boundaries
206
+
207
+ ### Tile Caching
208
+
209
+ Cache frequently accessed tiles for faster iteration:
210
+
211
+ ```python
212
+ from pathml.utils import TileCache
213
+
214
+ # Create cache
215
+ cache = TileCache(max_size_gb=10)
216
+
217
+ # Cache tiles during first iteration
218
+ for i, tile in enumerate(wsi.tiles):
219
+ cache.add(f'tile_{i}', tile.image)
220
+ # Process tile...
221
+
222
+ # Subsequent iterations use cached data
223
+ for i in range(len(wsi.tiles)):
224
+ cached_image = cache.get(f'tile_{i}')
225
+ # Fast access...
226
+ ```
227
+
228
+ ## Dataset Organization
229
+
230
+ ### Directory Structure for Large Projects
231
+
232
+ Organize pathology projects with consistent structure:
233
+
234
+ ```
235
+ project/
236
+ ├── raw_slides/
237
+ │ ├── cohort1/
238
+ │ │ ├── slide001.svs
239
+ │ │ ├── slide002.svs
240
+ │ │ └── ...
241
+ │ └── cohort2/
242
+ │ └── ...
243
+ ├── processed/
244
+ │ ├── cohort1/
245
+ │ │ ├── slide001.h5
246
+ │ │ ├── slide002.h5
247
+ │ │ └── ...
248
+ │ └── cohort2/
249
+ │ └── ...
250
+ ├── features/
251
+ │ ├── cohort1_features.h5
252
+ │ └── cohort2_features.h5
253
+ ├── models/
254
+ │ ├── hovernet_checkpoint.pth
255
+ │ └── classifier.onnx
256
+ ├── results/
257
+ │ ├── predictions/
258
+ │ ├── visualizations/
259
+ │ └── metrics.csv
260
+ └── metadata/
261
+ ├── clinical_data.csv
262
+ └── slide_manifest.csv
263
+ ```
264
+
265
+ ### Metadata Management
266
+
267
+ Store slide-level and cohort-level metadata:
268
+
269
+ ```python
270
+ import pandas as pd
271
+
272
+ # Slide manifest
273
+ manifest = pd.DataFrame({
274
+ 'slide_id': ['slide001', 'slide002', 'slide003'],
275
+ 'path': ['raw_slides/cohort1/slide001.svs', ...],
276
+ 'cohort': ['cohort1', 'cohort1', 'cohort2'],
277
+ 'tissue_type': ['breast', 'breast', 'lung'],
278
+ 'scanner': ['Aperio', 'Hamamatsu', 'Aperio'],
279
+ 'magnification': [40, 40, 20],
280
+ 'staining': ['H&E', 'H&E', 'H&E']
281
+ })
282
+
283
+ manifest.to_csv('metadata/slide_manifest.csv', index=False)
284
+
285
+ # Clinical data
286
+ clinical = pd.DataFrame({
287
+ 'slide_id': ['slide001', 'slide002', 'slide003'],
288
+ 'patient_id': ['P001', 'P002', 'P003'],
289
+ 'age': [55, 62, 48],
290
+ 'diagnosis': ['invasive', 'in_situ', 'invasive'],
291
+ 'stage': ['II', 'I', 'III'],
292
+ 'outcome': ['favorable', 'favorable', 'poor']
293
+ })
294
+
295
+ clinical.to_csv('metadata/clinical_data.csv', index=False)
296
+
297
+ # Load and merge
298
+ manifest = pd.read_csv('metadata/slide_manifest.csv')
299
+ clinical = pd.read_csv('metadata/clinical_data.csv')
300
+ data = manifest.merge(clinical, on='slide_id')
301
+ ```
302
+
303
+ ## Batch Processing Strategies
304
+
305
+ ### Sequential Processing
306
+
307
+ Process slides one at a time (memory-efficient):
308
+
309
+ ```python
310
+ import glob
311
+ from pathml.core import SlideData
312
+ from pathml.preprocessing import Pipeline
313
+
314
+ slide_paths = glob.glob('raw_slides/**/*.svs', recursive=True)
315
+
316
+ for slide_path in slide_paths:
317
+ # Load slide
318
+ wsi = SlideData.from_slide(slide_path)
319
+ wsi.generate_tiles(level=1, tile_size=256, stride=256)
320
+
321
+ # Process
322
+ pipeline.run(wsi)
323
+
324
+ # Save
325
+ output_path = slide_path.replace('raw_slides', 'processed').replace('.svs', '.h5')
326
+ wsi.to_hdf5(output_path)
327
+
328
+ print(f"Processed: {slide_path}")
329
+ ```
330
+
331
+ ### Parallel Processing with Dask
332
+
333
+ Process multiple slides in parallel:
334
+
335
+ ```python
336
+ from pathml.core import SlideDataset
337
+ from dask.distributed import Client, LocalCluster
338
+ from pathml.preprocessing import Pipeline
339
+
340
+ # Start Dask cluster
341
+ cluster = LocalCluster(
342
+ n_workers=8,
343
+ threads_per_worker=2,
344
+ memory_limit='8GB',
345
+ dashboard_address=':8787' # View progress at localhost:8787
346
+ )
347
+ client = Client(cluster)
348
+
349
+ # Create dataset
350
+ slide_paths = glob.glob('raw_slides/**/*.svs', recursive=True)
351
+ dataset = SlideDataset(slide_paths, tile_size=256, stride=256, level=1)
352
+
353
+ # Distribute processing
354
+ dataset.run(
355
+ pipeline,
356
+ distributed=True,
357
+ client=client,
358
+ scheduler='distributed'
359
+ )
360
+
361
+ # Save results
362
+ for i, slide in enumerate(dataset):
363
+ output_path = slide_paths[i].replace('raw_slides', 'processed').replace('.svs', '.h5')
364
+ slide.to_hdf5(output_path)
365
+
366
+ client.close()
367
+ cluster.close()
368
+ ```
369
+
370
+ ### Batch Processing with Job Arrays
371
+
372
+ For HPC clusters (SLURM, PBS):
373
+
374
+ ```python
375
+ # submit_jobs.py
376
+ import os
377
+ import glob
378
+
379
+ slide_paths = glob.glob('raw_slides/**/*.svs', recursive=True)
380
+
381
+ # Write slide list
382
+ with open('slide_list.txt', 'w') as f:
383
+ for path in slide_paths:
384
+ f.write(path + '\n')
385
+
386
+ # Create SLURM job script
387
+ slurm_script = """#!/bin/bash
388
+ #SBATCH --array=1-{n_slides}
389
+ #SBATCH --cpus-per-task=4
390
+ #SBATCH --mem=16G
391
+ #SBATCH --time=4:00:00
392
+ #SBATCH --output=logs/slide_%A_%a.out
393
+
394
+ # Get slide path for this array task
395
+ SLIDE_PATH=$(sed -n "${{SLURM_ARRAY_TASK_ID}}p" slide_list.txt)
396
+
397
+ # Run processing
398
+ python process_slide.py --slide_path $SLIDE_PATH
399
+ """.format(n_slides=len(slide_paths))
400
+
401
+ with open('submit_jobs.sh', 'w') as f:
402
+ f.write(slurm_script)
403
+
404
+ # Submit: sbatch submit_jobs.sh
405
+ ```
406
+
407
+ ```python
408
+ # process_slide.py
409
+ import argparse
410
+ from pathml.core import SlideData
411
+ from pathml.preprocessing import Pipeline
412
+
413
+ parser = argparse.ArgumentParser()
414
+ parser.add_argument('--slide_path', type=str, required=True)
415
+ args = parser.parse_args()
416
+
417
+ # Load and process
418
+ wsi = SlideData.from_slide(args.slide_path)
419
+ wsi.generate_tiles(level=1, tile_size=256, stride=256)
420
+
421
+ pipeline = Pipeline([...])
422
+ pipeline.run(wsi)
423
+
424
+ # Save
425
+ output_path = args.slide_path.replace('raw_slides', 'processed').replace('.svs', '.h5')
426
+ wsi.to_hdf5(output_path)
427
+
428
+ print(f"Processed: {args.slide_path}")
429
+ ```
430
+
431
+ ## Feature Extraction and Storage
432
+
433
+ ### Extracting Features
434
+
435
+ ```python
436
+ from pathml.core import SlideData
437
+ import torch
438
+ import numpy as np
439
+
440
+ # Load pre-trained model for feature extraction
441
+ model = torch.load('models/feature_extractor.pth')
442
+ model.eval()
443
+ device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
444
+ model = model.to(device)
445
+
446
+ # Load processed slide
447
+ wsi = SlideData.from_hdf5('processed/slide001.h5')
448
+
449
+ # Extract features for each tile
450
+ features = []
451
+ coords = []
452
+
453
+ for tile in wsi.tiles:
454
+ # Preprocess tile
455
+ tile_tensor = torch.from_numpy(tile.image).permute(2, 0, 1).unsqueeze(0).float()
456
+ tile_tensor = tile_tensor.to(device)
457
+
458
+ # Extract features
459
+ with torch.no_grad():
460
+ feature_vec = model(tile_tensor).cpu().numpy().flatten()
461
+
462
+ features.append(feature_vec)
463
+ coords.append(tile.coords)
464
+
465
+ features = np.array(features) # Shape: (n_tiles, feature_dim)
466
+ coords = np.array(coords) # Shape: (n_tiles, 2)
467
+ ```
468
+
469
+ ### Storing Features in HDF5
470
+
471
+ ```python
472
+ import h5py
473
+
474
+ # Save features
475
+ with h5py.File('features/slide001_features.h5', 'w') as f:
476
+ f.create_dataset('features', data=features, compression='gzip')
477
+ f.create_dataset('coords', data=coords)
478
+ f.attrs['feature_dim'] = features.shape[1]
479
+ f.attrs['num_tiles'] = features.shape[0]
480
+ f.attrs['model'] = 'resnet50'
481
+
482
+ # Load features
483
+ with h5py.File('features/slide001_features.h5', 'r') as f:
484
+ features = f['features'][:]
485
+ coords = f['coords'][:]
486
+ feature_dim = f.attrs['feature_dim']
487
+ ```
488
+
489
+ ### Feature Database for Multiple Slides
490
+
491
+ ```python
492
+ # Create consolidated feature database
493
+ import h5py
494
+ import glob
495
+
496
+ feature_files = glob.glob('features/*_features.h5')
497
+
498
+ with h5py.File('features/all_features.h5', 'w') as out_f:
499
+ for i, feature_file in enumerate(feature_files):
500
+ slide_name = feature_file.split('/')[-1].replace('_features.h5', '')
501
+
502
+ with h5py.File(feature_file, 'r') as in_f:
503
+ features = in_f['features'][:]
504
+ coords = in_f['coords'][:]
505
+
506
+ # Store in consolidated file
507
+ grp = out_f.create_group(f'slide_{i}')
508
+ grp.create_dataset('features', data=features, compression='gzip')
509
+ grp.create_dataset('coords', data=coords)
510
+ grp.attrs['slide_name'] = slide_name
511
+
512
+ # Query features from all slides
513
+ with h5py.File('features/all_features.h5', 'r') as f:
514
+ for slide_key in f.keys():
515
+ slide_name = f[slide_key].attrs['slide_name']
516
+ features = f[f'{slide_key}/features'][:]
517
+ # Process...
518
+ ```
519
+
520
+ ## Data Versioning
521
+
522
+ ### Version Control with DVC
523
+
524
+ Use Data Version Control (DVC) for large dataset management:
525
+
526
+ ```bash
527
+ # Initialize DVC
528
+ dvc init
529
+
530
+ # Add data directory
531
+ dvc add raw_slides/
532
+ dvc add processed/
533
+
534
+ # Commit to git
535
+ git add raw_slides.dvc processed.dvc .gitignore
536
+ git commit -m "Add raw and processed slides"
537
+
538
+ # Push data to remote storage (S3, GCS, etc.)
539
+ dvc remote add -d storage s3://my-bucket/pathml-data
540
+ dvc push
541
+
542
+ # Pull data on another machine
543
+ git pull
544
+ dvc pull
545
+ ```
546
+
547
+ ### Checksums and Validation
548
+
549
+ Validate data integrity:
550
+
551
+ ```python
552
+ import hashlib
553
+ import pandas as pd
554
+
555
+ def compute_checksum(file_path):
556
+ """Compute MD5 checksum of file."""
557
+ hash_md5 = hashlib.md5()
558
+ with open(file_path, 'rb') as f:
559
+ for chunk in iter(lambda: f.read(4096), b""):
560
+ hash_md5.update(chunk)
561
+ return hash_md5.hexdigest()
562
+
563
+ # Create checksum manifest
564
+ slide_paths = glob.glob('raw_slides/**/*.svs', recursive=True)
565
+ checksums = []
566
+
567
+ for slide_path in slide_paths:
568
+ checksum = compute_checksum(slide_path)
569
+ checksums.append({
570
+ 'path': slide_path,
571
+ 'checksum': checksum,
572
+ 'size_mb': os.path.getsize(slide_path) / 1e6
573
+ })
574
+
575
+ checksum_df = pd.DataFrame(checksums)
576
+ checksum_df.to_csv('metadata/checksums.csv', index=False)
577
+
578
+ # Validate files
579
+ def validate_files(manifest_path):
580
+ manifest = pd.read_csv(manifest_path)
581
+ for _, row in manifest.iterrows():
582
+ current_checksum = compute_checksum(row['path'])
583
+ if current_checksum != row['checksum']:
584
+ print(f"ERROR: Checksum mismatch for {row['path']}")
585
+ else:
586
+ print(f"OK: {row['path']}")
587
+
588
+ validate_files('metadata/checksums.csv')
589
+ ```
590
+
591
+ ## Performance Optimization
592
+
593
+ ### Compression Settings
594
+
595
+ Optimize HDF5 compression for speed vs. size:
596
+
597
+ ```python
598
+ import h5py
599
+
600
+ # Fast compression (less CPU, larger files)
601
+ with h5py.File('output.h5', 'w') as f:
602
+ f.create_dataset(
603
+ 'images',
604
+ data=images,
605
+ compression='gzip',
606
+ compression_opts=1 # Level 1-9, lower = faster
607
+ )
608
+
609
+ # Maximum compression (more CPU, smaller files)
610
+ with h5py.File('output.h5', 'w') as f:
611
+ f.create_dataset(
612
+ 'images',
613
+ data=images,
614
+ compression='gzip',
615
+ compression_opts=9
616
+ )
617
+
618
+ # Balanced (recommended)
619
+ with h5py.File('output.h5', 'w') as f:
620
+ f.create_dataset(
621
+ 'images',
622
+ data=images,
623
+ compression='gzip',
624
+ compression_opts=4,
625
+ chunks=True # Enable chunking for better I/O
626
+ )
627
+ ```
628
+
629
+ ### Chunking Strategy
630
+
631
+ Optimize chunked storage for access patterns:
632
+
633
+ ```python
634
+ # For tile-based access (access one tile at a time)
635
+ with h5py.File('tiles.h5', 'w') as f:
636
+ f.create_dataset(
637
+ 'tiles',
638
+ shape=(n_tiles, 256, 256, 3),
639
+ dtype='uint8',
640
+ chunks=(1, 256, 256, 3), # One tile per chunk
641
+ compression='gzip'
642
+ )
643
+
644
+ # For channel-based access (access all tiles for one channel)
645
+ with h5py.File('tiles.h5', 'w') as f:
646
+ f.create_dataset(
647
+ 'tiles',
648
+ shape=(n_tiles, 256, 256, 3),
649
+ dtype='uint8',
650
+ chunks=(n_tiles, 256, 256, 1), # All tiles for one channel
651
+ compression='gzip'
652
+ )
653
+ ```
654
+
655
+ ### Memory-Mapped Arrays
656
+
657
+ Use memory mapping for large arrays:
658
+
659
+ ```python
660
+ import numpy as np
661
+
662
+ # Save as memory-mapped file
663
+ features_mmap = np.memmap(
664
+ 'features/features.mmap',
665
+ dtype='float32',
666
+ mode='w+',
667
+ shape=(n_tiles, feature_dim)
668
+ )
669
+
670
+ # Populate
671
+ for i, tile in enumerate(wsi.tiles):
672
+ features_mmap[i] = extract_features(tile)
673
+
674
+ # Flush to disk
675
+ features_mmap.flush()
676
+
677
+ # Load without reading into memory
678
+ features_mmap = np.memmap(
679
+ 'features/features.mmap',
680
+ dtype='float32',
681
+ mode='r',
682
+ shape=(n_tiles, feature_dim)
683
+ )
684
+
685
+ # Access subset efficiently
686
+ subset = features_mmap[1000:2000] # Only loads requested rows
687
+ ```
688
+
689
+ ## Best Practices
690
+
691
+ 1. **Use HDF5 for processed data:** Save preprocessed tiles and features to HDF5 for fast access
692
+
693
+ 2. **Separate raw and processed data:** Keep original slides separate from processed outputs
694
+
695
+ 3. **Maintain metadata:** Track slide provenance, processing parameters, and clinical annotations
696
+
697
+ 4. **Implement checksums:** Validate data integrity, especially after transfers
698
+
699
+ 5. **Version datasets:** Use DVC or similar tools to version large datasets
700
+
701
+ 6. **Optimize storage:** Balance compression level with I/O performance
702
+
703
+ 7. **Organize by cohort:** Structure directories by study cohort for clarity
704
+
705
+ 8. **Regular backups:** Back up both data and metadata to remote storage
706
+
707
+ 9. **Document processing:** Keep logs of processing steps, parameters, and versions
708
+
709
+ 10. **Monitor disk usage:** Track storage consumption as datasets grow
710
+
711
+ ## Common Issues and Solutions
712
+
713
+ **Issue: HDF5 files very large**
714
+ - Increase compression level: `compression_opts=9`
715
+ - Store only necessary data (avoid redundant copies)
716
+ - Use appropriate data types (uint8 for images vs. float64)
717
+
718
+ **Issue: Slow HDF5 read/write**
719
+ - Optimize chunk size for access pattern
720
+ - Reduce compression level for faster I/O
721
+ - Use SSD storage instead of HDD
722
+ - Enable parallel HDF5 with MPI
723
+
724
+ **Issue: Running out of disk space**
725
+ - Delete intermediate files after processing
726
+ - Compress inactive datasets
727
+ - Move old data to archival storage
728
+ - Use cloud storage for less-accessed data
729
+
730
+ **Issue: Data corruption or loss**
731
+ - Implement regular backups
732
+ - Use RAID for redundancy
733
+ - Validate checksums after transfers
734
+ - Use version control (DVC)
735
+
736
+ ## Additional Resources
737
+
738
+ - **HDF5 Documentation:** https://www.hdfgroup.org/solutions/hdf5/
739
+ - **h5py:** https://docs.h5py.org/
740
+ - **DVC (Data Version Control):** https://dvc.org/
741
+ - **Dask:** https://docs.dask.org/
742
+ - **PathML Data Management API:** https://pathml.readthedocs.io/en/latest/api_data_reference.html