pi-skill-search 0.1.0

Files changed (299)
  1. package/CHANGELOG.md +20 -0
  2. package/LICENSE +21 -0
  3. package/README.md +97 -0
  4. package/index.ts +163 -0
  5. package/package.json +48 -0
  6. package/skills/adaptyv/SKILL.md +92 -0
  7. package/skills/add-community-extension/SKILL.md +85 -0
  8. package/skills/aeon/SKILL.md +111 -0
  9. package/skills/ai-slop-cleaner/SKILL.md +118 -0
  10. package/skills/anndata/SKILL.md +83 -0
  11. package/skills/arboreto/SKILL.md +107 -0
  12. package/skills/ask/SKILL.md +55 -0
  13. package/skills/astropy/SKILL.md +30 -0
  14. package/skills/async-worker-recovery/SKILL.md +44 -0
  15. package/skills/autopilot/SKILL.md +63 -0
  16. package/skills/autoresearch/SKILL.md +64 -0
  17. package/skills/autoskill/SKILL.md +116 -0
  18. package/skills/babysit/SKILL.md +43 -0
  19. package/skills/benchling-integration/SKILL.md +106 -0
  20. package/skills/bgpt-paper-search/SKILL.md +67 -0
  21. package/skills/biopython/SKILL.md +29 -0
  22. package/skills/bioservices/SKILL.md +96 -0
  23. package/skills/brainstorming/SKILL.md +104 -0
  24. package/skills/cancel/SKILL.md +85 -0
  25. package/skills/ccg/SKILL.md +87 -0
  26. package/skills/celery-pipeline/SKILL.md +30 -0
  27. package/skills/cellxgene-census/SKILL.md +104 -0
  28. package/skills/child-pi-spawning/SKILL.md +85 -0
  29. package/skills/cirq/SKILL.md +113 -0
  30. package/skills/citation-management/SKILL.md +91 -0
  31. package/skills/clinical-decision-support/SKILL.md +117 -0
  32. package/skills/clinical-reports/SKILL.md +118 -0
  33. package/skills/clinical-trial/SKILL.md +28 -0
  34. package/skills/cobrapy/SKILL.md +116 -0
  35. package/skills/configure-notifications/SKILL.md +85 -0
  36. package/skills/consciousness-council/SKILL.md +120 -0
  37. package/skills/context-artifact-hygiene/SKILL.md +85 -0
  38. package/skills/context-mode-ops/SKILL.md +87 -0
  39. package/skills/dask/SKILL.md +85 -0
  40. package/skills/database-lookup/SKILL.md +118 -0
  41. package/skills/datamol/SKILL.md +108 -0
  42. package/skills/debug/SKILL.md +32 -0
  43. package/skills/deep-dive/SKILL.md +114 -0
  44. package/skills/deep-interview/SKILL.md +90 -0
  45. package/skills/deepchem/SKILL.md +117 -0
  46. package/skills/deepinit/SKILL.md +100 -0
  47. package/skills/deeptools/SKILL.md +118 -0
  48. package/skills/delegation-patterns/SKILL.md +56 -0
  49. package/skills/depmap/SKILL.md +94 -0
  50. package/skills/dhdna-profiler/SKILL.md +86 -0
  51. package/skills/diffdock/SKILL.md +101 -0
  52. package/skills/dispatching-parallel-agents/SKILL.md +119 -0
  53. package/skills/dnanexus-integration/SKILL.md +118 -0
  54. package/skills/do/SKILL.md +48 -0
  55. package/skills/docker-sandbox/SKILL.md +29 -0
  56. package/skills/docx/SKILL.md +119 -0
  57. package/skills/esm/SKILL.md +116 -0
  58. package/skills/etetoolkit/SKILL.md +103 -0
  59. package/skills/event-log-tracing/SKILL.md +85 -0
  60. package/skills/exa-search/SKILL.md +72 -0
  61. package/skills/executing-plans/SKILL.md +69 -0
  62. package/skills/exploratory-data-analysis/SKILL.md +118 -0
  63. package/skills/external-context/SKILL.md +80 -0
  64. package/skills/fastapi/SKILL.md +30 -0
  65. package/skills/finishing-a-development-branch/SKILL.md +106 -0
  66. package/skills/flowio/SKILL.md +114 -0
  67. package/skills/fluidsim/SKILL.md +108 -0
  68. package/skills/generate-image/SKILL.md +108 -0
  69. package/skills/geniml/SKILL.md +117 -0
  70. package/skills/geomaster/SKILL.md +109 -0
  71. package/skills/geopandas/SKILL.md +114 -0
  72. package/skills/get-available-resources/SKILL.md +100 -0
  73. package/skills/gget/SKILL.md +111 -0
  74. package/skills/ginkgo-cloud-lab/SKILL.md +52 -0
  75. package/skills/git-master/SKILL.md +85 -0
  76. package/skills/glycoengineering/SKILL.md +104 -0
  77. package/skills/gtars/SKILL.md +104 -0
  78. package/skills/hackernews-frontpage/SKILL.md +46 -0
  79. package/skills/histolab/SKILL.md +98 -0
  80. package/skills/how-it-works/SKILL.md +25 -0
  81. package/skills/hud/SKILL.md +86 -0
  82. package/skills/hugging-science/SKILL.md +93 -0
  83. package/skills/huggingface/SKILL.md +30 -0
  84. package/skills/hypogenic/SKILL.md +107 -0
  85. package/skills/hypothesis-generation/SKILL.md +118 -0
  86. package/skills/imaging-data-commons/SKILL.md +119 -0
  87. package/skills/infographics/SKILL.md +102 -0
  88. package/skills/iso-13485-certification/SKILL.md +114 -0
  89. package/skills/knowledge-agent/SKILL.md +83 -0
  90. package/skills/labarchive-integration/SKILL.md +98 -0
  91. package/skills/lamindb/SKILL.md +119 -0
  92. package/skills/landsat/SKILL.md +29 -0
  93. package/skills/latchbio-integration/SKILL.md +118 -0
  94. package/skills/latex-posters/SKILL.md +112 -0
  95. package/skills/learn-codebase/SKILL.md +24 -0
  96. package/skills/learner/SKILL.md +118 -0
  97. package/skills/literature-review/SKILL.md +118 -0
  98. package/skills/live-agent-lifecycle/SKILL.md +85 -0
  99. package/skills/mailbox-interactive/SKILL.md +85 -0
  100. package/skills/make-plan/SKILL.md +59 -0
  101. package/skills/markdown-mermaid-writing/SKILL.md +118 -0
  102. package/skills/market-research-reports/SKILL.md +119 -0
  103. package/skills/markitdown/SKILL.md +111 -0
  104. package/skills/markitdown-docs/SKILL.md +28 -0
  105. package/skills/matchms/SKILL.md +91 -0
  106. package/skills/matlab/SKILL.md +118 -0
  107. package/skills/matplotlib/SKILL.md +30 -0
  108. package/skills/mcp-setup/SKILL.md +84 -0
  109. package/skills/medchem/SKILL.md +109 -0
  110. package/skills/mem-search/SKILL.md +96 -0
  111. package/skills/modal/SKILL.md +104 -0
  112. package/skills/model-routing-context/SKILL.md +85 -0
  113. package/skills/molecular-dynamics/SKILL.md +116 -0
  114. package/skills/molfeat/SKILL.md +110 -0
  115. package/skills/multi-perspective-review/SKILL.md +85 -0
  116. package/skills/networkx/SKILL.md +111 -0
  117. package/skills/neurokit2/SKILL.md +114 -0
  118. package/skills/neuropixels-analysis/SKILL.md +112 -0
  119. package/skills/nilearn/SKILL.md +29 -0
  120. package/skills/observability-reliability/SKILL.md +43 -0
  121. package/skills/omc-doctor/SKILL.md +86 -0
  122. package/skills/omc-reference/SKILL.md +119 -0
  123. package/skills/omc-setup/SKILL.md +82 -0
  124. package/skills/omc-teams/SKILL.md +81 -0
  125. package/skills/omero-integration/SKILL.md +111 -0
  126. package/skills/open-notebook/SKILL.md +100 -0
  127. package/skills/openephys/SKILL.md +28 -0
  128. package/skills/opentrons-integration/SKILL.md +110 -0
  129. package/skills/optimize-for-gpu/SKILL.md +119 -0
  130. package/skills/orchestration/SKILL.md +85 -0
  131. package/skills/ownership-session-security/SKILL.md +43 -0
  132. package/skills/paper-lookup/SKILL.md +119 -0
  133. package/skills/paperzilla/SKILL.md +114 -0
  134. package/skills/parallel-web/SKILL.md +64 -0
  135. package/skills/pathfinder/SKILL.md +114 -0
  136. package/skills/pathml/SKILL.md +98 -0
  137. package/skills/pdf/SKILL.md +113 -0
  138. package/skills/peer-review/SKILL.md +119 -0
  139. package/skills/pennylane/SKILL.md +119 -0
  140. package/skills/phylogenetics/SKILL.md +102 -0
  141. package/skills/pi-extension-lifecycle/SKILL.md +41 -0
  142. package/skills/plan/SKILL.md +66 -0
  143. package/skills/polars/SKILL.md +114 -0
  144. package/skills/polars-bio/SKILL.md +84 -0
  145. package/skills/pptx/SKILL.md +118 -0
  146. package/skills/pptx-posters/SKILL.md +112 -0
  147. package/skills/primekg/SKILL.md +97 -0
  148. package/skills/project-session-manager/SKILL.md +85 -0
  149. package/skills/protocolsio-integration/SKILL.md +119 -0
  150. package/skills/pubmed-search/SKILL.md +29 -0
  151. package/skills/pufferlib/SKILL.md +103 -0
  152. package/skills/pydeseq2/SKILL.md +106 -0
  153. package/skills/pydicom/SKILL.md +115 -0
  154. package/skills/pyhealth/SKILL.md +117 -0
  155. package/skills/pylabrobot/SKILL.md +100 -0
  156. package/skills/pymatgen/SKILL.md +28 -0
  157. package/skills/pymc/SKILL.md +108 -0
  158. package/skills/pymoo/SKILL.md +90 -0
  159. package/skills/pyopenms/SKILL.md +119 -0
  160. package/skills/pysam/SKILL.md +118 -0
  161. package/skills/pyspark/SKILL.md +30 -0
  162. package/skills/pytdc/SKILL.md +102 -0
  163. package/skills/pytorch/SKILL.md +31 -0
  164. package/skills/pytorch-lightning/SKILL.md +119 -0
  165. package/skills/pyzotero/SKILL.md +104 -0
  166. package/skills/qiskit/SKILL.md +119 -0
  167. package/skills/qutip/SKILL.md +111 -0
  168. package/skills/ralph/SKILL.md +23 -0
  169. package/skills/ralplan/SKILL.md +105 -0
  170. package/skills/rdflib/SKILL.md +29 -0
  171. package/skills/rdkit/SKILL.md +30 -0
  172. package/skills/read-only-explorer/SKILL.md +85 -0
  173. package/skills/receiving-code-review/SKILL.md +103 -0
  174. package/skills/release/SKILL.md +117 -0
  175. package/skills/remember/SKILL.md +39 -0
  176. package/skills/requesting-code-review/SKILL.md +85 -0
  177. package/skills/requirements-to-task-packet/SKILL.md +65 -0
  178. package/skills/research-grants/SKILL.md +118 -0
  179. package/skills/research-lookup/SKILL.md +117 -0
  180. package/skills/research-reproducibility/SKILL.md +28 -0
  181. package/skills/resource-discovery-config/SKILL.md +43 -0
  182. package/skills/rowan/SKILL.md +100 -0
  183. package/skills/runtime-state-reader/SKILL.md +46 -0
  184. package/skills/safe-bash/SKILL.md +85 -0
  185. package/skills/scanpy/SKILL.md +32 -0
  186. package/skills/scholar-evaluation/SKILL.md +115 -0
  187. package/skills/scientific-brainstorming/SKILL.md +118 -0
  188. package/skills/scientific-critical-thinking/SKILL.md +119 -0
  189. package/skills/scientific-schematics/SKILL.md +116 -0
  190. package/skills/scientific-slides/SKILL.md +117 -0
  191. package/skills/scientific-visualization/SKILL.md +109 -0
  192. package/skills/scientific-writing/SKILL.md +119 -0
  193. package/skills/scikit-bio/SKILL.md +92 -0
  194. package/skills/scikit-learn/SKILL.md +99 -0
  195. package/skills/scikit-survival/SKILL.md +110 -0
  196. package/skills/sciomc/SKILL.md +86 -0
  197. package/skills/scvelo/SKILL.md +106 -0
  198. package/skills/scvi-tools/SKILL.md +114 -0
  199. package/skills/seaborn/SKILL.md +97 -0
  200. package/skills/secure-agent-orchestration-review/SKILL.md +47 -0
  201. package/skills/self-improve/SKILL.md +119 -0
  202. package/skills/semantic-compression/SKILL.md +62 -0
  203. package/skills/setup/SKILL.md +42 -0
  204. package/skills/shap/SKILL.md +103 -0
  205. package/skills/simpy/SKILL.md +116 -0
  206. package/skills/skill/SKILL.md +117 -0
  207. package/skills/skill-search/SKILL.md +67 -0
  208. package/skills/skillify/SKILL.md +46 -0
  209. package/skills/smart-explore/SKILL.md +94 -0
  210. package/skills/sqlite-pandas/SKILL.md +30 -0
  211. package/skills/stable-baselines3/SKILL.md +86 -0
  212. package/skills/state-mutation-locking/SKILL.md +44 -0
  213. package/skills/statistical-analysis/SKILL.md +108 -0
  214. package/skills/statsmodels/SKILL.md +29 -0
  215. package/skills/subagent-driven-development/SKILL.md +89 -0
  216. package/skills/sympy/SKILL.md +115 -0
  217. package/skills/system-prompts/SKILL.md +116 -0
  218. package/skills/systematic-debugging/SKILL.md +119 -0
  219. package/skills/team/SKILL.md +85 -0
  220. package/skills/test-driven-development/SKILL.md +84 -0
  221. package/skills/tiledbvcf/SKILL.md +119 -0
  222. package/skills/timeline-report/SKILL.md +85 -0
  223. package/skills/timesfm-forecasting/SKILL.md +112 -0
  224. package/skills/torch-geometric/SKILL.md +118 -0
  225. package/skills/torchdrug/SKILL.md +118 -0
  226. package/skills/trace/SKILL.md +118 -0
  227. package/skills/transformers/SKILL.md +110 -0
  228. package/skills/treatment-plans/SKILL.md +119 -0
  229. package/skills/ui-render-performance/SKILL.md +41 -0
  230. package/skills/ultragoal/SKILL.md +63 -0
  231. package/skills/ultraqa/SKILL.md +85 -0
  232. package/skills/ultrawork/SKILL.md +20 -0
  233. package/skills/umap-learn/SKILL.md +119 -0
  234. package/skills/usfiscaldata/SKILL.md +118 -0
  235. package/skills/using-git-worktrees/SKILL.md +112 -0
  236. package/skills/using-superpowers/SKILL.md +85 -0
  237. package/skills/using-vetc/SKILL.md +92 -0
  238. package/skills/vaex/SKILL.md +111 -0
  239. package/skills/venue-templates/SKILL.md +113 -0
  240. package/skills/verification-before-completion/SKILL.md +88 -0
  241. package/skills/verification-before-done/SKILL.md +68 -0
  242. package/skills/verify/SKILL.md +33 -0
  243. package/skills/version-bump/SKILL.md +54 -0
  244. package/skills/vetc-analyze-ba/SKILL.md +117 -0
  245. package/skills/vetc-analyze-codebase/SKILL.md +118 -0
  246. package/skills/vetc-api-design/SKILL.md +103 -0
  247. package/skills/vetc-brainstorming/SKILL.md +116 -0
  248. package/skills/vetc-change-proposal/SKILL.md +111 -0
  249. package/skills/vetc-cicd/SKILL.md +113 -0
  250. package/skills/vetc-continuous-learning/SKILL.md +115 -0
  251. package/skills/vetc-deep-interview/SKILL.md +103 -0
  252. package/skills/vetc-docgen/SKILL.md +108 -0
  253. package/skills/vetc-frontend-patterns/SKILL.md +99 -0
  254. package/skills/vetc-iterative-retrieval/SKILL.md +110 -0
  255. package/skills/vetc-java-patterns/SKILL.md +113 -0
  256. package/skills/vetc-meta-skill-creator/SKILL.md +99 -0
  257. package/skills/vetc-oracle-patterns/SKILL.md +109 -0
  258. package/skills/vetc-performance-testing/SKILL.md +104 -0
  259. package/skills/vetc-pr-response/SKILL.md +106 -0
  260. package/skills/vetc-ralph/SKILL.md +108 -0
  261. package/skills/vetc-ralplan/SKILL.md +116 -0
  262. package/skills/vetc-receiving-review/SKILL.md +106 -0
  263. package/skills/vetc-reconcile-patterns/SKILL.md +117 -0
  264. package/skills/vetc-refactoring/SKILL.md +96 -0
  265. package/skills/vetc-runbook/SKILL.md +118 -0
  266. package/skills/vetc-sast/SKILL.md +118 -0
  267. package/skills/vetc-sdlc/SKILL.md +97 -0
  268. package/skills/vetc-security/SKILL.md +117 -0
  269. package/skills/vetc-spec-driven/SKILL.md +111 -0
  270. package/skills/vetc-spec-quality/SKILL.md +117 -0
  271. package/skills/vetc-systematic-debugging/SKILL.md +74 -0
  272. package/skills/vetc-tdd/SKILL.md +96 -0
  273. package/skills/vetc-thinking-pm/SKILL.md +110 -0
  274. package/skills/vetc-ui-visual-qa/SKILL.md +117 -0
  275. package/skills/vetc-verify/SKILL.md +101 -0
  276. package/skills/visual-verdict/SKILL.md +59 -0
  277. package/skills/what-if-oracle/SKILL.md +87 -0
  278. package/skills/widget-rendering/SKILL.md +85 -0
  279. package/skills/wiki/SKILL.md +69 -0
  280. package/skills/workspace-isolation/SKILL.md +85 -0
  281. package/skills/worktree-isolation/SKILL.md +85 -0
  282. package/skills/wowerpoint/SKILL.md +101 -0
  283. package/skills/writer-memory/SKILL.md +82 -0
  284. package/skills/writing-plans/SKILL.md +115 -0
  285. package/skills/writing-skills/SKILL.md +115 -0
  286. package/skills/xgboost/SKILL.md +29 -0
  287. package/skills/xgboost-ts/SKILL.md +28 -0
  288. package/skills/xlsx/SKILL.md +111 -0
  289. package/skills/zarr-python/SKILL.md +101 -0
  290. package/src/categories.ts +383 -0
  291. package/src/format.ts +104 -0
  292. package/src/indexer.ts +101 -0
  293. package/src/proactive.ts +51 -0
  294. package/src/scanner.ts +85 -0
  295. package/src/search.ts +89 -0
  296. package/src/strip.ts +29 -0
  297. package/src/synonyms.ts +83 -0
  298. package/src/text.ts +118 -0
  299. package/src/types.ts +64 -0
@@ -0,0 +1,90 @@
1
+ ---
2
+ name: pymoo
3
+ description: Multi-objective optimization framework. NSGA-II, NSGA-III, MOEA/D, Pareto fronts, constraint handling, benchmarks (ZDT, DTLZ), for engineering design and optimization problems.
4
+ ---
5
+
6
+ # Pymoo - Multi-Objective Optimization in Python
7
+
8
+ ## Overview
9
+
10
+ Pymoo is a comprehensive Python framework for optimization with emphasis on multi-objective problems. Solve single and multi-objective optimization using state-of-the-art algorithms (NSGA-II/III, MOEA/D), benchmark problems (ZDT, DTLZ), customizable genetic operators, and multi-criteria decision making methods. Excels at finding trade-off solutions (Pareto fronts) for problems with conflicting objectives.
11
+
12
+ ## When to Use This Skill
13
+
14
+ This skill should be used when:
15
+ - Solving optimization problems with one or multiple objectives
16
+ - Finding Pareto-optimal solutions and analyzing trade-offs
17
+ - Implementing evolutionary algorithms (GA, DE, PSO, NSGA-II/III)
18
+ - Working with constrained optimization problems
19
+ - Benchmarking algorithms on standard test problems (ZDT, DTLZ, WFG)
20
+ - Customizing genetic operators (crossover, mutation, selection)
21
+ - Visualizing high-dimensional optimization results
22
+ - Making decisions from multiple competing solutions
23
+ - Handling binary, discrete, continuous, or mixed-variable problems
24
+
25
+ ## Core Concepts
26
+
27
+ ### The Unified Interface
28
+
29
+ Pymoo uses a consistent `minimize()` function for all optimization tasks:
30
+
31
+ ```python
32
+ from pymoo.optimize import minimize
33
+
34
+ result = minimize(
35
+     problem,      # What to optimize
36
+     algorithm,    # How to optimize
37
+     termination,  # When to stop
38
+     seed=1,
39
+     verbose=True
40
+ )
41
+ ```
42
+
43
+ ### Problem Types
44
+
45
+ - **Single-objective:** One objective to minimize/maximize
46
+ - **Multi-objective:** 2-3 conflicting objectives → Pareto front
47
+ - **Many-objective:** 4+ objectives → High-dimensional Pareto front
48
+ - **Constrained:** Objectives + inequality/equality constraints
49
+ - **Dynamic:** Time-varying objectives or constraints
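
The Pareto front mentioned above is the set of solutions that no other solution dominates. As a rough illustration, independent of pymoo's own implementation, the sketch below filters non-dominated points in pure Python, assuming every objective is minimized. The helper names `dominates` and `pareto_front` are hypothetical and are not part of pymoo.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b when every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates (simple O(n^2) scan)."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Two objectives, both minimized (illustrative values only)
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 5.0)]
front = pareto_front(candidates)
print(front)
```

Here `(3.0, 4.0)` is dropped because `(2.0, 3.0)` is better in both objectives, while the three remaining points each win on at least one objective.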
50
+
51
+ ## Quick Start Workflows
52
+
53
+ ### Workflow 1: Single-Objective Optimization
54
+
55
+ **When:** Optimizing one objective function
56
+
57
+ **Steps:**
58
+ 1. Define or select problem
59
+ 2. Choose single-objective algorithm (GA, DE, PSO, CMA-ES)
60
+ 3. Configure termination criteria
61
+ 4. Run optimization
62
+ 5. Extract best solution
63
+
64
+ **Example:**
65
+ ```python
66
+ from pymoo.algorithms.soo.nonconvex.ga import GA
67
+ from pymoo.problems import get_problem
68
+ from pymoo.optimize import minimize
69
+
70
+ # Built-in problem
71
+ problem = get_problem("rastrigin", n_var=10)
72
+
73
+ # Configure Genetic Algorithm
74
+ algorithm = GA(
75
+     pop_size=100,
76
+     eliminate_duplicates=True
77
+ )
78
+
79
+ # Optimize
80
+ result = minimize(
81
+     problem,
81
+     algorithm,
82
+     ('n_gen', 200),
83
+     seed=1,
84
+     verbose=True
86
+ )
87
+
88
+ print(f"Best solution: {result.X}")
89
+ print(f"Best objective: {result.F[0]}")
90
+ ```
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: pyopenms
3
+ description: Complete mass spectrometry analysis platform. Use for proteomics workflows such as feature detection, peptide identification, protein quantification, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. Best for proteomics and comprehensive MS data processing. For simple spectral comparison and metabolite ID, use matchms.
4
+ ---
5
+
6
+ # PyOpenMS
7
+
8
+ ## Overview
9
+
10
+ PyOpenMS provides Python bindings to the OpenMS library for computational mass spectrometry, enabling analysis of proteomics and metabolomics data. Use for handling mass spectrometry file formats, processing spectral data, detecting features, identifying peptides/proteins, and performing quantitative analysis.
11
+
12
+ ## Core Capabilities
13
+
14
+ PyOpenMS organizes functionality into these domains:
15
+
16
+ ```python
+ import pyopenms as ms
+
+ # Read mzML file
17
+ exp = ms.MSExperiment()
18
+ ms.MzMLFile().load("data.mzML", exp)
19
+
20
+ # Access spectra
21
+ for spectrum in exp:
22
+     mz, intensity = spectrum.get_peaks()
23
+     print(f"Spectrum: {len(mz)} peaks")
24
+ ```
25
+
26
+ **For detailed file handling**: See `(see docs)`
27
+
28
+ ### 2. Signal Processing
29
+
30
+ Process raw spectral data with smoothing, filtering, centroiding, and normalization.
31
+
32
+ Basic spectrum processing:
33
+
34
+ ```python
35
+ # Smooth spectrum with Gaussian filter
36
+ gaussian = ms.GaussFilter()
37
+ params = gaussian.getParameters()
38
+ params.setValue("gaussian_width", 0.1)
39
+ gaussian.setParameters(params)
40
+ gaussian.filterExperiment(exp)
41
+ ```
42
+
43
+ **For algorithm details**: See `(see docs)`
44
+
45
+ ### 3. Feature Detection
46
+
47
+ Detect and link features across spectra and samples for quantitative analysis.
48
+
49
+ ```python
50
+ # Detect features
51
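+ ff = ms.FeatureFinder()
+ features = ms.FeatureMap()  # Output container for detected features
+ params = ff.getParameters("centroided")  # Default parameters for this algorithm
52
+ ff.run("centroided", exp, features, params, ms.FeatureMap())  # Last argument: seeds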
53
+ ```
54
+
55
+ **For complete workflows**: See `(see docs)`
56
+
57
+ ### 4. Peptide and Protein Identification
58
+
59
+ Integrate with search engines and process identification results.
60
+
61
+ **Supported engines**: Comet, Mascot, MSGFPlus, XTandem, OMSSA, Myrimatch
62
+
63
+ Basic identification workflow:
64
+
65
+ ```python
66
+ # Load identification data
67
+ protein_ids = []
68
+ peptide_ids = []
69
+ ms.IdXMLFile().load("identifications.idXML", protein_ids, peptide_ids)
70
+
71
+ # Apply FDR filtering
72
+ fdr = ms.FalseDiscoveryRate()
73
+ fdr.apply(peptide_ids)
74
+ ```
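
To make the FDR step more concrete, the following is a minimal pure-Python sketch of target-decoy q-value estimation, a common way to estimate FDR in proteomics. It is not the OpenMS implementation. The function name `target_decoy_qvalues` and the `(score, is_decoy)` input format are assumptions made for illustration, and higher scores are assumed to be better.

```python
from typing import List, Tuple


def target_decoy_qvalues(hits: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Attach a q-value to each (score, is_decoy) hit.

    The FDR at a score threshold is estimated as decoys / targets among
    all hits scoring at or above it. The q-value of a hit is the lowest
    FDR at which that hit would still be accepted.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)

    # Running FDR estimate from the best score downwards
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # Make q-values monotone by walking from the worst score to the best
    qvalues = []
    running_min = float("inf")
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        qvalues.append(running_min)
    qvalues.reverse()

    return [(score, is_decoy, q) for (score, is_decoy), q in zip(ranked, qvalues)]


# Illustrative scores only; True marks a decoy hit
hits = [(9.1, False), (8.7, False), (7.9, True), (7.5, False), (6.2, True)]
scored = target_decoy_qvalues(hits)
accepted = [score for score, is_decoy, q in scored if not is_decoy and q <= 0.34]
print(accepted)
```

Ties between equal scores are not treated specially here, which a production implementation would need to handle.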
75
+
76
+ **For detailed workflows**: See `(see docs)`
77
+
78
+ ### 5. Metabolomics Analysis
79
+
80
+ Perform untargeted metabolomics preprocessing and analysis.
81
+
82
+ Typical workflow:
83
+ 1. Load and process raw data
84
+ 2. Detect features
85
+ 3. Align retention times across samples
86
+ 4. Link features to consensus map
87
+ 5. Annotate with compound databases
88
+
89
+ **For complete metabolomics workflows**: See `(see docs)`
90
+
91
+ ## Data Structures
92
+
93
+ PyOpenMS uses these primary objects:
94
+
95
+ - **MSExperiment**: Collection of spectra and chromatograms
96
+ - **MSSpectrum**: Single mass spectrum with m/z and intensity pairs
97
+ - **MSChromatogram**: Chromatographic trace
98
+ - **Feature**: Detected chromatographic peak with quality metrics
99
+ - **FeatureMap**: Collection of features
100
+ - **PeptideIdentification**: Search results for peptides
101
+ - **ProteinIdentification**: Search results for proteins
102
+
103
+ **For detailed documentation**: See `(see docs)`
104
+
105
+ ## Common Workflows
106
+
107
+ ### Quick Start: Load and Explore Data
108
+
109
+ ```python
110
+ import pyopenms as ms
111
+
112
+ # Load mzML file
113
+ exp = ms.MSExperiment()
114
+ ms.MzMLFile().load("sample.mzML", exp)
115
+
116
+ # Get basic statistics
117
+ print(f"Number of spectra: {exp.getNrSpectra()}")
118
+
119
+ ```
@@ -0,0 +1,118 @@
1
+ ---
2
+ name: pysam
3
+ description: Genomic file toolkit. Read/write SAM/BAM/CRAM alignments, VCF/BCF variants, FASTA/FASTQ sequences, extract regions, calculate coverage, for NGS data processing pipelines.
4
+ ---
5
+
6
+ # Pysam
7
+
8
+ ## Overview
9
+
10
+ Pysam is a Python module for reading, manipulating, and writing genomic datasets. Read/write SAM/BAM/CRAM alignment files, VCF/BCF variant files, and FASTA/FASTQ sequences with a Pythonic interface to htslib. Query tabix-indexed files, perform pileup analysis for coverage, and execute samtools/bcftools commands.
11
+
12
+ ## When to Use This Skill
13
+
14
+ This skill should be used when:
15
+ - Working with sequencing alignment files (BAM/CRAM)
16
+ - Analyzing genetic variants (VCF/BCF)
17
+ - Extracting reference sequences or gene regions
18
+ - Processing raw sequencing data (FASTQ)
19
+ - Calculating coverage or read depth
20
+ - Implementing bioinformatics analysis pipelines
21
+ - Quality control of sequencing data
22
+ - Variant calling and annotation workflows
23
+
24
+ ## Quick Start
25
+
26
+ ### Basic Examples
27
+
28
+ **Read alignment file:**
29
+ ```python
30
+ import pysam
31
+
32
+ # Open BAM file and fetch reads in region
33
+ samfile = pysam.AlignmentFile("example.bam", "rb")
34
+ for read in samfile.fetch("chr1", 1000, 2000):
35
+     print(f"{read.query_name}: {read.reference_start}")
36
+ samfile.close()
37
+ ```
38
+
39
+ **Read variant file:**
40
+ ```python
41
+ # Open VCF file and iterate variants
42
+ vcf = pysam.VariantFile("variants.vcf")
43
+ for variant in vcf:
44
+     print(f"{variant.chrom}:{variant.pos} {variant.ref}>{variant.alts}")
45
+ vcf.close()
46
+ ```
47
+
48
+ **Query reference sequence:**
49
+ ```python
50
+ # Open FASTA and extract sequence
51
+ fasta = pysam.FastaFile("reference.fasta")
52
+ sequence = fasta.fetch("chr1", 1000, 2000)
53
+ print(sequence)
54
+ fasta.close()
55
+ ```
56
+ ## Core Capabilities
57
+
58
+ ### 1. Alignment File Operations (SAM/BAM/CRAM)
59
+
60
+ Use the `AlignmentFile` class to work with aligned sequencing reads. This is appropriate for analyzing mapping results, calculating coverage, extracting reads, or quality control.
61
+
62
+ **Common operations:**
63
+ - Open and read BAM/SAM/CRAM files
64
+ - Fetch reads from specific genomic regions
65
+ - Filter reads by mapping quality, flags, or other criteria
66
+ - Write filtered or modified alignments
67
+ - Calculate coverage statistics
68
+ - Perform pileup analysis (base-by-base coverage)
69
+ - Access read sequences, quality scores, and alignment information
70
+
71
+ **Reference:** See `(see docs)` for detailed documentation on:
72
+ - Opening and reading alignment files
73
+
74
+ ### 2. Variant File Operations (VCF/BCF)
75
+
76
+ Use the `VariantFile` class to work with genetic variants from variant calling pipelines. This is appropriate for variant analysis, filtering, annotation, or population genetics.
77
+
78
+ **Common operations:**
79
+ - Read and write VCF/BCF files
80
+ - Query variants in specific regions
81
+ - Access variant information (position, alleles, quality)
82
+ - Extract genotype data for samples
83
+ - Filter variants by quality, allele frequency, or other criteria
84
+ - Annotate variants with additional information
85
+ - Subset samples or regions
86
+
87
+ **Reference:** See `(see docs)` for detailed documentation on:
88
+ - Opening and reading variant files
89
+
90
+ ### 3. Sequence File Operations (FASTA/FASTQ)
91
+
92
+ Use `FastaFile` for random access to reference sequences and `FastxFile` for reading raw sequencing data. This is appropriate for extracting gene sequences, validating variants against reference, or processing raw reads.
93
+
94
+ **Common operations:**
95
+ - Query reference sequences by genomic coordinates
96
+ - Extract sequences for genes or regions of interest
97
+ - Read FASTQ files with quality scores
98
+ - Validate variant reference alleles
99
+ - Calculate sequence statistics
100
+ - Filter reads by quality or length
101
+ - Convert between FASTA and FASTQ formats
102
+
103
+ **Reference:** See `(see docs)` for detailed documentation on:
104
+ - FASTA file access and indexing
105
+
106
+ ## Key Concepts
107
+
108
+ ### Coordinate Systems
109
+
110
+ **Critical:** Pysam uses **0-based, half-open** coordinates (Python convention):
111
+ - Start positions are 0-based (first base is position 0)
112
+ - End positions are exclusive (not included in the range)
113
+ - Region 1000-2000 includes bases 1000-1999 (1000 bases total)
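
The arithmetic behind the half-open convention can be sketched in plain Python. The two helpers below, `to_region_string` and `from_region_string`, are hypothetical names used only for illustration and are not part of pysam; they convert between a 0-based, half-open interval and a 1-based, inclusive `contig:start-end` string.

```python
from typing import Tuple


def to_region_string(contig: str, start: int, end: int) -> str:
    """Convert a 0-based, half-open interval to a 1-based, inclusive region string."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    # Only the start shifts by one: the exclusive 0-based end equals
    # the inclusive 1-based end.
    return f"{contig}:{start + 1}-{end}"


def from_region_string(region: str) -> Tuple[str, int, int]:
    """Convert a 1-based, inclusive 'contig:start-end' string to 0-based, half-open."""
    contig, _, span = region.rpartition(":")
    first, _, last = span.partition("-")
    return contig, int(first) - 1, int(last)


interval = ("chr1", 1000, 2000)
region = to_region_string(*interval)
length = interval[2] - interval[1]  # half-open length is simply end - start
print(region, length)
```

Because the end is exclusive, the interval length needs no `+ 1` correction, which is the main practical benefit of the half-open form.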
114
+
115
+ **Exception:** Region strings in `fetch()` follow samtools convention (1-based):
116
+ ```python
117
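+ samfile.fetch("chr1", 1000, 2000)       # 0-based, half-open: bases 1000-1999
118
+ samfile.fetch(region="chr1:1001-2000")  # 1-based, inclusive: the same bases
+ ```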
@@ -0,0 +1,30 @@
1
+ ---
2
+ name: pyspark
3
+ description: Distributed data processing with Apache Spark. Use when working with large-scale datasets, distributed SQL, DataFrame operations, ETL pipelines, or cluster computing. Trigger on imports of pyspark, SparkSession, or mentions of big data, distributed, cluster, ETL, Spark, DataFrame at scale.
4
+ ---
5
+ # pyspark
6
+
7
+ Use this skill for large-scale distributed data processing.
8
+
9
+ ## Core patterns
10
+
11
+ - **Session**: `SparkSession.builder.appName('analysis').getOrCreate()`.
12
+ - **Read**: `spark.read.parquet('data/')` or `spark.read.csv('data.csv', header=True, inferSchema=True)`.
13
+ - **Transform**: `df.filter()`, `df.select()`, `df.groupBy().agg()`, `df.join(other, on='key')`.
14
+ - **SQL**: `df.createOrReplaceTempView('table')` → `spark.sql('SELECT * FROM table')`.
15
+ - **Write**: `df.write.parquet('output/', mode='overwrite')`.
16
+
17
+ ## Rules
18
+
19
+ - Always use `coalesce(1)` or `repartition()` before writing small outputs.
20
+ - Persist intermediate DataFrames used multiple times: `df.persist(StorageLevel.MEMORY_AND_DISK)`.
21
+ - Use a broadcast join when joining a small table to a large one: `broadcast(small_df)`.
22
+ - Avoid `collect()` on large DataFrames; use `take(n)`, and call `toPandas()` only on data that fits in driver memory.
23
+
24
+ ## Anti-patterns
25
+
26
+ - Don't call `toPandas()` on large DataFrames — it collects all data to driver.
27
+ - Don't use Python UDFs when built-in Spark SQL functions suffice.
28
+ - Don't create `SparkSession` per operation — reuse across the application.
29
+
30
+
@@ -0,0 +1,102 @@
1
+ ---
2
+ name: pytdc
3
+ description: Therapeutics Data Commons. AI-ready drug discovery datasets (ADME, toxicity, DTI), benchmarks, scaffold splits, molecular oracles, for therapeutic ML and pharmacological prediction.
4
+ ---
5
+
6
+ # PyTDC (Therapeutics Data Commons)
7
+
8
+ ## Overview
9
+
10
+ PyTDC is an open-science platform providing AI-ready datasets and benchmarks for drug discovery and development. Access curated datasets spanning the entire therapeutics pipeline with standardized evaluation metrics and meaningful data splits, organized into three categories: single-instance prediction (molecular/protein properties), multi-instance prediction (drug-target interactions, DDI), and generation (molecule generation, retrosynthesis).
11
+
12
+ ## When to Use This Skill
13
+
14
+ This skill should be used when:
15
+ - Working with drug discovery or therapeutic ML datasets
16
+ - Benchmarking machine learning models on standardized pharmaceutical tasks
17
+ - Predicting molecular properties (ADME, toxicity, bioactivity)
18
+ - Predicting drug-target or drug-drug interactions
19
+ - Generating novel molecules with desired properties
20
+ - Accessing curated datasets with proper train/test splits (scaffold, cold-split)
21
+ - Using molecular oracles for property optimization
22
+
23
+ ## Quick Start
24
+
25
+ The basic pattern for accessing any TDC dataset follows this structure:
26
+
27
+ ```python
28
+ from tdc.<problem> import <Task>
29
+ data = <Task>(name='<Dataset>')
30
+ split = data.get_split(method='scaffold', seed=1, frac=[0.7, 0.1, 0.2])
31
+ df = data.get_data(format='df')
32
+ ```
33
+
34
+ Where:
35
+ - `<problem>`: One of `single_pred`, `multi_pred`, or `generation`
36
+ - `<Task>`: Specific task category (e.g., ADME, DTI, MolGen)
37
+ - `<Dataset>`: Dataset name within that task
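
To illustrate what the `frac=[0.7, 0.1, 0.2]` argument expresses, here is a small stdlib-only sketch of a seeded three-way random split. It is not TDC's implementation, and a scaffold split additionally groups molecules by scaffold rather than shuffling them freely. `split_by_frac` is a hypothetical helper introduced only for this example.

```python
import random
from typing import Dict, List, Sequence


def split_by_frac(items: Sequence, frac: Sequence[float], seed: int = 1) -> Dict[str, List]:
    """Shuffle items with a fixed seed and cut them into train/valid/test parts."""
    if len(frac) != 3 or abs(sum(frac) - 1.0) > 1e-9:
        raise ValueError("frac must hold three fractions that sum to 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_train = round(frac[0] * len(shuffled))
    n_valid = round(frac[1] * len(shuffled))
    return {
        "train": shuffled[:n_train],
        "valid": shuffled[n_train:n_train + n_valid],
        "test": shuffled[n_train + n_valid:],
    }


split = split_by_frac(range(100), frac=[0.7, 0.1, 0.2], seed=1)
sizes = {name: len(part) for name, part in split.items()}
print(sizes)
```

Fixing the seed keeps the partition stable across runs, which is what makes results comparable between models.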
38
+
39
+ `get_split()` returns a dict with `'train'`, `'valid'`, and `'test'` DataFrames.
40
+
41
+
42
+ ## Single-Instance Prediction Tasks
43
+
44
+ Single-instance prediction involves forecasting properties of individual biomedical entities (molecules, proteins, etc.).
45
+
46
+ ### Available Task Categories
47
+
48
+ #### 1. ADME (Absorption, Distribution, Metabolism, Excretion)
49
+
50
+ Predict pharmacokinetic properties of drug molecules.
51
+
52
+ ```python
53
+ from tdc.single_pred import ADME
54
+ data = ADME(name='Caco2_Wang') # Intestinal permeability
55
+ # Other datasets: HIA_Hou, Bioavailability_Ma, Lipophilicity_AstraZeneca, etc.
56
+ ```
57
+
58
+ **Common ADME datasets:**
59
+ - Caco2 - Intestinal permeability
60
+ - HIA - Human intestinal absorption
61
+ - Bioavailability - Oral bioavailability
62
+ - Lipophilicity - Octanol-water partition coefficient
63
+ - Solubility - Aqueous solubility
64
+ - BBB - Blood-brain barrier penetration
65
+ - CYP - Cytochrome P450 metabolism
66
+
67
+ #### 2. Toxicity (Tox)
68
+
69
+ Predict toxicity and adverse effects of compounds.
70
+
71
+ ```python
+ from tdc.single_pred import Tox
+ data = Tox(name='hERG')  # Cardiac toxicity
+ # Other datasets: AMES, DILI, Carcinogens_Lagunin, etc.
72
+ ```
73
+
74
+ **Common toxicity datasets:**
75
+ - hERG - Cardiac toxicity
76
+ - AMES - Mutagenicity
77
+ - DILI - Drug-induced liver injury
78
+ - Carcinogens - Carcinogenicity
79
+ - ClinTox - Clinical trial toxicity
80
+
81
+ #### 3. HTS (High-Throughput Screening)
82
+
83
+ Bioactivity predictions from screening data.
84
+
85
+
86
+
87
+ ## Multi-Instance Prediction Tasks
88
+
89
+ Multi-instance prediction involves forecasting properties of interactions between multiple biomedical entities.
90
+
91
+ ### Available Task Categories
92
+
93
+ #### 1. DTI (Drug-Target Interaction)
94
+
95
+ Predict binding affinity between drugs and protein targets.
96
+
97
+ ```python
98
+ from tdc.multi_pred import DTI
99
+ data = DTI(name='BindingDB_Kd')
100
+ split = data.get_split()
+ ```
101
+
102
+
@@ -0,0 +1,31 @@
1
+ ---
2
+ name: pytorch
3
+ description: Deep learning framework for building and training neural networks. Use when creating CNN, RNN, Transformer, or custom architectures, training models with GPU acceleration, implementing custom loss functions, or optimizing with autograd. Trigger on imports of torch, torchvision, torchaudio, nn.Module, or mentions of neural network training, GPU, CUDA, tensor operations.
4
+ ---
5
+ # pytorch
6
+
7
+ Use this skill for deep learning model development.
8
+
9
+ ## Core patterns
10
+
11
+ - **Model**: Subclass `nn.Module`, define `__init__` and `forward()`.
12
+ - **Training loop**: Forward → loss → `loss.backward()` → `optimizer.step()` → `optimizer.zero_grad()`.
13
+ - **Data**: `Dataset` + `DataLoader(shuffle=True, num_workers=4, pin_memory=True)`.
14
+ - **GPU**: `tensor.to(device)`, `model.to(device)`. Check `torch.cuda.is_available()`.
15
+ - **Saving**: `torch.save(model.state_dict(), path)` / `model.load_state_dict(torch.load(path))`.
16
+
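The core patterns above can be put together in one small script. This is a minimal sketch on synthetic data, not a recommended architecture: `TinyRegressor`, the data shapes, the learning rate, and the in-memory buffer used in place of a file path are all assumptions made for illustration.

```python
import io

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"


class TinyRegressor(nn.Module):
    """Hypothetical model used only for illustration."""

    def __init__(self, in_features: int = 4, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# Synthetic regression data: y is a fixed linear function of x.
x = torch.randn(256, 4)
y = x @ torch.tensor([[1.0], [-2.0], [0.5], [3.0]])
# num_workers and pin_memory stay at their defaults for this CPU-sized toy.
loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

model = TinyRegressor().to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.02)
loss_fn = nn.MSELoss()

epoch_losses = []
for epoch in range(5):
    total = torch.zeros((), device=device)
    for xb, yb in loader:
        xb, yb = xb.to(device), yb.to(device)
        loss = loss_fn(model(xb), yb)  # forward pass and loss
        loss.backward()                # compute gradients
        optimizer.step()               # update parameters
        optimizer.zero_grad()          # clear gradients for the next batch
        total += loss.detach()         # accumulate on-device, no per-step sync
    epoch_losses.append((total / len(loader)).item())

# Round-trip the weights through an in-memory buffer instead of a file path.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)
restored = TinyRegressor()
restored.load_state_dict(torch.load(buffer, map_location="cpu"))
```

For real training runs, a file path would replace the buffer, and `num_workers`/`pin_memory` would be tuned to the machine.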
17
+ ## Rules
18
+
19
+ - Use `torch.no_grad()` context during inference and evaluation.
20
+ - Set `model.eval()` before validation; `model.train()` before training.
21
+ - Use `nn.Sequential` for simple stacks; custom `forward()` for complex architectures.
22
+ - Learning rate scheduling: call `scheduler.step()` after `optimizer.step()`.
23
+ - Mixed precision: `torch.amp.autocast('cuda')` + `GradScaler` for faster training.
24
+
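A short sketch of how the mode switches, `torch.no_grad()`, and the scheduler ordering fit together. The model, the `evaluate` helper, and the `StepLR` settings are hypothetical choices for illustration; mixed precision is left out to keep the example CPU-friendly.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Sequential(
    nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.5), nn.Linear(16, 1)
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Halve the learning rate after every epoch (illustrative schedule).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)
loss_fn = nn.MSELoss()
x, y = torch.randn(64, 8), torch.randn(64, 1)


def evaluate(net: nn.Module, inputs: torch.Tensor, targets: torch.Tensor):
    """Hypothetical helper: eval mode disables dropout, no_grad skips autograd."""
    net.eval()
    with torch.no_grad():
        return loss_fn(net(inputs), targets)


lrs = []
for epoch in range(3):
    model.train()                 # re-enable dropout for training
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()              # after optimizer.step(), once per epoch here
    lrs.append(scheduler.get_last_lr()[0])
    val_loss = evaluate(model, x, y)
```

Because `evaluate` leaves the model in eval mode, the loop calls `model.train()` at the start of each epoch.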
25
+ ## Anti-patterns
26
+
27
+ - Don't forget `optimizer.zero_grad()` — gradients accumulate by default.
28
+ - Don't call `.item()` on every training step; each call forces a GPU-to-CPU sync. Reserve it for scalar metrics you actually log.
29
+ - Don't hardcode device — always use `device = 'cuda' if torch.cuda.is_available() else 'cpu'`.
30
+
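The gradient-accumulation point can be checked directly with a single parameter tensor. This is a small illustration with made-up values; `w.grad = None` stands in for what `optimizer.zero_grad()` does to the parameters it manages in recent PyTorch versions.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
x = torch.tensor([3.0, 4.0])

(w * x).sum().backward()
first = w.grad.clone()        # gradient of sum(w * x) w.r.t. w is x

(w * x).sum().backward()      # no reset in between: gradients add up
accumulated = w.grad.clone()

w.grad = None                 # reset, as optimizer.zero_grad() would
(w * x).sum().backward()
after_reset = w.grad.clone()

print(first)        # tensor([3., 4.])
print(accumulated)  # tensor([6., 8.])
print(after_reset)  # tensor([3., 4.])
```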
31
+
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: pytorch-lightning
3
+ description: Deep learning framework (PyTorch Lightning). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.
4
+ ---
5
+
6
+ # PyTorch Lightning
7
+
8
+ ## Overview
9
+
10
+ PyTorch Lightning is a deep learning framework that organizes PyTorch code to eliminate boilerplate while maintaining full flexibility. It automates training workflows and multi-device orchestration, and it encodes best practices for training and scaling neural networks across multiple GPUs/TPUs.
11
+
12
+ ## When to Use This Skill
13
+
14
+ This skill should be used when:
15
+ - Building, training, or deploying neural networks using PyTorch Lightning
16
+ - Organizing PyTorch code into LightningModules
17
+ - Configuring Trainers for multi-GPU/TPU training
18
+ - Implementing data pipelines with LightningDataModules
19
+ - Working with callbacks, logging, and distributed training strategies (DDP, FSDP, DeepSpeed)
20
+ - Structuring deep learning projects professionally
21
+
22
+ ## Core Capabilities
23
+
24
+ ### 1. LightningModule - Model Definition
25
+
26
+ Organize PyTorch models into six logical sections:
27
+
28
+ 1. **Initialization** - `__init__()` and `setup()`
29
+ 2. **Training Loop** - `training_step(batch, batch_idx)`
30
+ 3. **Validation Loop** - `validation_step(batch, batch_idx)`
31
+ 4. **Test Loop** - `test_step(batch, batch_idx)`
32
+ 5. **Prediction** - `predict_step(batch, batch_idx)`
33
+ 6. **Optimizer Configuration** - `configure_optimizers()`
34
+
35
+ **Quick template reference:** See `scripts/template_lightning_module.py` for a complete boilerplate.
36
+
37
+ **Detailed documentation:** See the docs for comprehensive method documentation, hooks, properties, and best practices.
38
+
39
+ ### 2. Trainer - Training Automation
40
+
41
+ The Trainer automates the training loop, device management, gradient operations, and callbacks. Key features:
42
+
43
+ - Multi-GPU/TPU support with strategy selection (DDP, FSDP, DeepSpeed)
44
+ - Automatic mixed precision training
45
+ - Gradient accumulation and clipping
46
+ - Checkpointing and early stopping
47
+ - Progress bars and logging
48
+
49
+ **Quick setup reference:** See `scripts/quick_trainer_setup.py` for common Trainer configurations.
50
+
51
+ **Detailed documentation:** See the docs for all parameters, methods, and configuration options.
52
+
53
+ ### 3. LightningDataModule - Data Pipeline Organization
54
+
55
+ Encapsulate all data processing steps in a reusable class:
56
+
57
+ 1. `prepare_data()` - Download and process data (single-process)
58
+ 2. `setup()` - Create datasets and apply transforms (per-GPU)
59
+ 3. `train_dataloader()` - Return training DataLoader
60
+ 4. `val_dataloader()` - Return validation DataLoader
61
+ 5. `test_dataloader()` - Return test DataLoader
62
+
63
+ **Quick template reference:** See `scripts/template_datamodule.py` for a complete boilerplate.
64
+
65
+ **Detailed documentation:** See the docs for method details and usage patterns.
66
+
67
+ ### 4. Callbacks - Extensible Training Logic
68
+
69
+ Add custom functionality at specific training hooks without modifying your LightningModule. Built-in callbacks include:
70
+
71
+ - **ModelCheckpoint** - Save best/latest models
72
+ - **EarlyStopping** - Stop when metrics plateau
73
+ - **LearningRateMonitor** - Track LR scheduler changes
74
+ - **BatchSizeFinder** - Auto-determine optimal batch size
75
+
76
+ **Detailed documentation:** See the docs for built-in callbacks and custom callback creation.
77
+
78
+ ### 5. Logging - Experiment Tracking
79
+
80
+ Integrate with multiple logging platforms:
81
+
82
+ - TensorBoard (default)
83
+ - Weights & Biases (WandbLogger)
84
+ - MLflow (MLFlowLogger)
85
+ - Neptune (NeptuneLogger)
86
+ - Comet (CometLogger)
87
+ - CSV (CSVLogger)
88
+
89
+ Log metrics using `self.log("metric_name", value)` in any LightningModule method.
90
+
91
+ **Detailed documentation:** See the docs for logger setup and configuration.
92
+
93
+ ### 6. Distributed Training - Scale to Multiple Devices
94
+
95
+ Choose the right strategy based on model size:
96
+
97
+ - **DDP** - For models <500M parameters (ResNet, smaller transformers)
98
+ - **FSDP** - For models 500M+ parameters (large transformers, recommended for Lightning users)
99
+ - **DeepSpeed** - For cutting-edge features and fine-grained control
100
+
101
+ Configure with: `Trainer(strategy="ddp", accelerator="gpu", devices=4)`
102
+
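The size-based rule of thumb above can be expressed as a small helper. `pick_strategy` is a hypothetical function, not a Lightning API; it only counts parameters and returns the strategy string to pass to `Trainer(strategy=...)`, with the 500M cutoff taken from the list above.

```python
from torch import nn


def pick_strategy(model: nn.Module, threshold: int = 500_000_000) -> str:
    """Hypothetical helper: choose 'ddp' or 'fsdp' from the parameter count."""
    n_params = sum(p.numel() for p in model.parameters())
    return "fsdp" if n_params >= threshold else "ddp"


small_model = nn.Linear(10, 10)  # 110 parameters (100 weights + 10 biases)
strategy = pick_strategy(small_model)
print(strategy)  # ddp
```

The result would then be used as `Trainer(strategy=strategy, accelerator="gpu", devices=4)` on a multi-GPU machine. DeepSpeed is left out because the choice there depends on features rather than size.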
103
+ **Detailed documentation:** See the docs for strategy comparison and configuration.
104
+
105
+ ### 7. Best Practices
106
+
107
+ - Device-agnostic code - Use `self.device` instead of `.cuda()`
108
+ - Hyperparameter saving - Use `self.save_hyperparameters()` in `__init__()`
109
+ - Metric logging - Use `self.log()` for automatic aggregation across devices
110
+ - Reproducibility - Use `seed_everything()` and `Trainer(deterministic=True)`
111
+ - Debugging - Use `Trainer(fast_dev_run=True)` to test with 1 batch
112
+
113
+ **Detailed documentation:** See the docs for common patterns and pitfalls.
114
+
115
+ ## Quick Workflow
116
+
117
+ 1. **Define model:**
118
+
119
+