pi-skill-search 0.1.0

Files changed (299)
  1. package/CHANGELOG.md +20 -0
  2. package/LICENSE +21 -0
  3. package/README.md +97 -0
  4. package/index.ts +163 -0
  5. package/package.json +48 -0
  6. package/skills/adaptyv/SKILL.md +92 -0
  7. package/skills/add-community-extension/SKILL.md +85 -0
  8. package/skills/aeon/SKILL.md +111 -0
  9. package/skills/ai-slop-cleaner/SKILL.md +118 -0
  10. package/skills/anndata/SKILL.md +83 -0
  11. package/skills/arboreto/SKILL.md +107 -0
  12. package/skills/ask/SKILL.md +55 -0
  13. package/skills/astropy/SKILL.md +30 -0
  14. package/skills/async-worker-recovery/SKILL.md +44 -0
  15. package/skills/autopilot/SKILL.md +63 -0
  16. package/skills/autoresearch/SKILL.md +64 -0
  17. package/skills/autoskill/SKILL.md +116 -0
  18. package/skills/babysit/SKILL.md +43 -0
  19. package/skills/benchling-integration/SKILL.md +106 -0
  20. package/skills/bgpt-paper-search/SKILL.md +67 -0
  21. package/skills/biopython/SKILL.md +29 -0
  22. package/skills/bioservices/SKILL.md +96 -0
  23. package/skills/brainstorming/SKILL.md +104 -0
  24. package/skills/cancel/SKILL.md +85 -0
  25. package/skills/ccg/SKILL.md +87 -0
  26. package/skills/celery-pipeline/SKILL.md +30 -0
  27. package/skills/cellxgene-census/SKILL.md +104 -0
  28. package/skills/child-pi-spawning/SKILL.md +85 -0
  29. package/skills/cirq/SKILL.md +113 -0
  30. package/skills/citation-management/SKILL.md +91 -0
  31. package/skills/clinical-decision-support/SKILL.md +117 -0
  32. package/skills/clinical-reports/SKILL.md +118 -0
  33. package/skills/clinical-trial/SKILL.md +28 -0
  34. package/skills/cobrapy/SKILL.md +116 -0
  35. package/skills/configure-notifications/SKILL.md +85 -0
  36. package/skills/consciousness-council/SKILL.md +120 -0
  37. package/skills/context-artifact-hygiene/SKILL.md +85 -0
  38. package/skills/context-mode-ops/SKILL.md +87 -0
  39. package/skills/dask/SKILL.md +85 -0
  40. package/skills/database-lookup/SKILL.md +118 -0
  41. package/skills/datamol/SKILL.md +108 -0
  42. package/skills/debug/SKILL.md +32 -0
  43. package/skills/deep-dive/SKILL.md +114 -0
  44. package/skills/deep-interview/SKILL.md +90 -0
  45. package/skills/deepchem/SKILL.md +117 -0
  46. package/skills/deepinit/SKILL.md +100 -0
  47. package/skills/deeptools/SKILL.md +118 -0
  48. package/skills/delegation-patterns/SKILL.md +56 -0
  49. package/skills/depmap/SKILL.md +94 -0
  50. package/skills/dhdna-profiler/SKILL.md +86 -0
  51. package/skills/diffdock/SKILL.md +101 -0
  52. package/skills/dispatching-parallel-agents/SKILL.md +119 -0
  53. package/skills/dnanexus-integration/SKILL.md +118 -0
  54. package/skills/do/SKILL.md +48 -0
  55. package/skills/docker-sandbox/SKILL.md +29 -0
  56. package/skills/docx/SKILL.md +119 -0
  57. package/skills/esm/SKILL.md +116 -0
  58. package/skills/etetoolkit/SKILL.md +103 -0
  59. package/skills/event-log-tracing/SKILL.md +85 -0
  60. package/skills/exa-search/SKILL.md +72 -0
  61. package/skills/executing-plans/SKILL.md +69 -0
  62. package/skills/exploratory-data-analysis/SKILL.md +118 -0
  63. package/skills/external-context/SKILL.md +80 -0
  64. package/skills/fastapi/SKILL.md +30 -0
  65. package/skills/finishing-a-development-branch/SKILL.md +106 -0
  66. package/skills/flowio/SKILL.md +114 -0
  67. package/skills/fluidsim/SKILL.md +108 -0
  68. package/skills/generate-image/SKILL.md +108 -0
  69. package/skills/geniml/SKILL.md +117 -0
  70. package/skills/geomaster/SKILL.md +109 -0
  71. package/skills/geopandas/SKILL.md +114 -0
  72. package/skills/get-available-resources/SKILL.md +100 -0
  73. package/skills/gget/SKILL.md +111 -0
  74. package/skills/ginkgo-cloud-lab/SKILL.md +52 -0
  75. package/skills/git-master/SKILL.md +85 -0
  76. package/skills/glycoengineering/SKILL.md +104 -0
  77. package/skills/gtars/SKILL.md +104 -0
  78. package/skills/hackernews-frontpage/SKILL.md +46 -0
  79. package/skills/histolab/SKILL.md +98 -0
  80. package/skills/how-it-works/SKILL.md +25 -0
  81. package/skills/hud/SKILL.md +86 -0
  82. package/skills/hugging-science/SKILL.md +93 -0
  83. package/skills/huggingface/SKILL.md +30 -0
  84. package/skills/hypogenic/SKILL.md +107 -0
  85. package/skills/hypothesis-generation/SKILL.md +118 -0
  86. package/skills/imaging-data-commons/SKILL.md +119 -0
  87. package/skills/infographics/SKILL.md +102 -0
  88. package/skills/iso-13485-certification/SKILL.md +114 -0
  89. package/skills/knowledge-agent/SKILL.md +83 -0
  90. package/skills/labarchive-integration/SKILL.md +98 -0
  91. package/skills/lamindb/SKILL.md +119 -0
  92. package/skills/landsat/SKILL.md +29 -0
  93. package/skills/latchbio-integration/SKILL.md +118 -0
  94. package/skills/latex-posters/SKILL.md +112 -0
  95. package/skills/learn-codebase/SKILL.md +24 -0
  96. package/skills/learner/SKILL.md +118 -0
  97. package/skills/literature-review/SKILL.md +118 -0
  98. package/skills/live-agent-lifecycle/SKILL.md +85 -0
  99. package/skills/mailbox-interactive/SKILL.md +85 -0
  100. package/skills/make-plan/SKILL.md +59 -0
  101. package/skills/markdown-mermaid-writing/SKILL.md +118 -0
  102. package/skills/market-research-reports/SKILL.md +119 -0
  103. package/skills/markitdown/SKILL.md +111 -0
  104. package/skills/markitdown-docs/SKILL.md +28 -0
  105. package/skills/matchms/SKILL.md +91 -0
  106. package/skills/matlab/SKILL.md +118 -0
  107. package/skills/matplotlib/SKILL.md +30 -0
  108. package/skills/mcp-setup/SKILL.md +84 -0
  109. package/skills/medchem/SKILL.md +109 -0
  110. package/skills/mem-search/SKILL.md +96 -0
  111. package/skills/modal/SKILL.md +104 -0
  112. package/skills/model-routing-context/SKILL.md +85 -0
  113. package/skills/molecular-dynamics/SKILL.md +116 -0
  114. package/skills/molfeat/SKILL.md +110 -0
  115. package/skills/multi-perspective-review/SKILL.md +85 -0
  116. package/skills/networkx/SKILL.md +111 -0
  117. package/skills/neurokit2/SKILL.md +114 -0
  118. package/skills/neuropixels-analysis/SKILL.md +112 -0
  119. package/skills/nilearn/SKILL.md +29 -0
  120. package/skills/observability-reliability/SKILL.md +43 -0
  121. package/skills/omc-doctor/SKILL.md +86 -0
  122. package/skills/omc-reference/SKILL.md +119 -0
  123. package/skills/omc-setup/SKILL.md +82 -0
  124. package/skills/omc-teams/SKILL.md +81 -0
  125. package/skills/omero-integration/SKILL.md +111 -0
  126. package/skills/open-notebook/SKILL.md +100 -0
  127. package/skills/openephys/SKILL.md +28 -0
  128. package/skills/opentrons-integration/SKILL.md +110 -0
  129. package/skills/optimize-for-gpu/SKILL.md +119 -0
  130. package/skills/orchestration/SKILL.md +85 -0
  131. package/skills/ownership-session-security/SKILL.md +43 -0
  132. package/skills/paper-lookup/SKILL.md +119 -0
  133. package/skills/paperzilla/SKILL.md +114 -0
  134. package/skills/parallel-web/SKILL.md +64 -0
  135. package/skills/pathfinder/SKILL.md +114 -0
  136. package/skills/pathml/SKILL.md +98 -0
  137. package/skills/pdf/SKILL.md +113 -0
  138. package/skills/peer-review/SKILL.md +119 -0
  139. package/skills/pennylane/SKILL.md +119 -0
  140. package/skills/phylogenetics/SKILL.md +102 -0
  141. package/skills/pi-extension-lifecycle/SKILL.md +41 -0
  142. package/skills/plan/SKILL.md +66 -0
  143. package/skills/polars/SKILL.md +114 -0
  144. package/skills/polars-bio/SKILL.md +84 -0
  145. package/skills/pptx/SKILL.md +118 -0
  146. package/skills/pptx-posters/SKILL.md +112 -0
  147. package/skills/primekg/SKILL.md +97 -0
  148. package/skills/project-session-manager/SKILL.md +85 -0
  149. package/skills/protocolsio-integration/SKILL.md +119 -0
  150. package/skills/pubmed-search/SKILL.md +29 -0
  151. package/skills/pufferlib/SKILL.md +103 -0
  152. package/skills/pydeseq2/SKILL.md +106 -0
  153. package/skills/pydicom/SKILL.md +115 -0
  154. package/skills/pyhealth/SKILL.md +117 -0
  155. package/skills/pylabrobot/SKILL.md +100 -0
  156. package/skills/pymatgen/SKILL.md +28 -0
  157. package/skills/pymc/SKILL.md +108 -0
  158. package/skills/pymoo/SKILL.md +90 -0
  159. package/skills/pyopenms/SKILL.md +119 -0
  160. package/skills/pysam/SKILL.md +118 -0
  161. package/skills/pyspark/SKILL.md +30 -0
  162. package/skills/pytdc/SKILL.md +102 -0
  163. package/skills/pytorch/SKILL.md +31 -0
  164. package/skills/pytorch-lightning/SKILL.md +119 -0
  165. package/skills/pyzotero/SKILL.md +104 -0
  166. package/skills/qiskit/SKILL.md +119 -0
  167. package/skills/qutip/SKILL.md +111 -0
  168. package/skills/ralph/SKILL.md +23 -0
  169. package/skills/ralplan/SKILL.md +105 -0
  170. package/skills/rdflib/SKILL.md +29 -0
  171. package/skills/rdkit/SKILL.md +30 -0
  172. package/skills/read-only-explorer/SKILL.md +85 -0
  173. package/skills/receiving-code-review/SKILL.md +103 -0
  174. package/skills/release/SKILL.md +117 -0
  175. package/skills/remember/SKILL.md +39 -0
  176. package/skills/requesting-code-review/SKILL.md +85 -0
  177. package/skills/requirements-to-task-packet/SKILL.md +65 -0
  178. package/skills/research-grants/SKILL.md +118 -0
  179. package/skills/research-lookup/SKILL.md +117 -0
  180. package/skills/research-reproducibility/SKILL.md +28 -0
  181. package/skills/resource-discovery-config/SKILL.md +43 -0
  182. package/skills/rowan/SKILL.md +100 -0
  183. package/skills/runtime-state-reader/SKILL.md +46 -0
  184. package/skills/safe-bash/SKILL.md +85 -0
  185. package/skills/scanpy/SKILL.md +32 -0
  186. package/skills/scholar-evaluation/SKILL.md +115 -0
  187. package/skills/scientific-brainstorming/SKILL.md +118 -0
  188. package/skills/scientific-critical-thinking/SKILL.md +119 -0
  189. package/skills/scientific-schematics/SKILL.md +116 -0
  190. package/skills/scientific-slides/SKILL.md +117 -0
  191. package/skills/scientific-visualization/SKILL.md +109 -0
  192. package/skills/scientific-writing/SKILL.md +119 -0
  193. package/skills/scikit-bio/SKILL.md +92 -0
  194. package/skills/scikit-learn/SKILL.md +99 -0
  195. package/skills/scikit-survival/SKILL.md +110 -0
  196. package/skills/sciomc/SKILL.md +86 -0
  197. package/skills/scvelo/SKILL.md +106 -0
  198. package/skills/scvi-tools/SKILL.md +114 -0
  199. package/skills/seaborn/SKILL.md +97 -0
  200. package/skills/secure-agent-orchestration-review/SKILL.md +47 -0
  201. package/skills/self-improve/SKILL.md +119 -0
  202. package/skills/semantic-compression/SKILL.md +62 -0
  203. package/skills/setup/SKILL.md +42 -0
  204. package/skills/shap/SKILL.md +103 -0
  205. package/skills/simpy/SKILL.md +116 -0
  206. package/skills/skill/SKILL.md +117 -0
  207. package/skills/skill-search/SKILL.md +67 -0
  208. package/skills/skillify/SKILL.md +46 -0
  209. package/skills/smart-explore/SKILL.md +94 -0
  210. package/skills/sqlite-pandas/SKILL.md +30 -0
  211. package/skills/stable-baselines3/SKILL.md +86 -0
  212. package/skills/state-mutation-locking/SKILL.md +44 -0
  213. package/skills/statistical-analysis/SKILL.md +108 -0
  214. package/skills/statsmodels/SKILL.md +29 -0
  215. package/skills/subagent-driven-development/SKILL.md +89 -0
  216. package/skills/sympy/SKILL.md +115 -0
  217. package/skills/system-prompts/SKILL.md +116 -0
  218. package/skills/systematic-debugging/SKILL.md +119 -0
  219. package/skills/team/SKILL.md +85 -0
  220. package/skills/test-driven-development/SKILL.md +84 -0
  221. package/skills/tiledbvcf/SKILL.md +119 -0
  222. package/skills/timeline-report/SKILL.md +85 -0
  223. package/skills/timesfm-forecasting/SKILL.md +112 -0
  224. package/skills/torch-geometric/SKILL.md +118 -0
  225. package/skills/torchdrug/SKILL.md +118 -0
  226. package/skills/trace/SKILL.md +118 -0
  227. package/skills/transformers/SKILL.md +110 -0
  228. package/skills/treatment-plans/SKILL.md +119 -0
  229. package/skills/ui-render-performance/SKILL.md +41 -0
  230. package/skills/ultragoal/SKILL.md +63 -0
  231. package/skills/ultraqa/SKILL.md +85 -0
  232. package/skills/ultrawork/SKILL.md +20 -0
  233. package/skills/umap-learn/SKILL.md +119 -0
  234. package/skills/usfiscaldata/SKILL.md +118 -0
  235. package/skills/using-git-worktrees/SKILL.md +112 -0
  236. package/skills/using-superpowers/SKILL.md +85 -0
  237. package/skills/using-vetc/SKILL.md +92 -0
  238. package/skills/vaex/SKILL.md +111 -0
  239. package/skills/venue-templates/SKILL.md +113 -0
  240. package/skills/verification-before-completion/SKILL.md +88 -0
  241. package/skills/verification-before-done/SKILL.md +68 -0
  242. package/skills/verify/SKILL.md +33 -0
  243. package/skills/version-bump/SKILL.md +54 -0
  244. package/skills/vetc-analyze-ba/SKILL.md +117 -0
  245. package/skills/vetc-analyze-codebase/SKILL.md +118 -0
  246. package/skills/vetc-api-design/SKILL.md +103 -0
  247. package/skills/vetc-brainstorming/SKILL.md +116 -0
  248. package/skills/vetc-change-proposal/SKILL.md +111 -0
  249. package/skills/vetc-cicd/SKILL.md +113 -0
  250. package/skills/vetc-continuous-learning/SKILL.md +115 -0
  251. package/skills/vetc-deep-interview/SKILL.md +103 -0
  252. package/skills/vetc-docgen/SKILL.md +108 -0
  253. package/skills/vetc-frontend-patterns/SKILL.md +99 -0
  254. package/skills/vetc-iterative-retrieval/SKILL.md +110 -0
  255. package/skills/vetc-java-patterns/SKILL.md +113 -0
  256. package/skills/vetc-meta-skill-creator/SKILL.md +99 -0
  257. package/skills/vetc-oracle-patterns/SKILL.md +109 -0
  258. package/skills/vetc-performance-testing/SKILL.md +104 -0
  259. package/skills/vetc-pr-response/SKILL.md +106 -0
  260. package/skills/vetc-ralph/SKILL.md +108 -0
  261. package/skills/vetc-ralplan/SKILL.md +116 -0
  262. package/skills/vetc-receiving-review/SKILL.md +106 -0
  263. package/skills/vetc-reconcile-patterns/SKILL.md +117 -0
  264. package/skills/vetc-refactoring/SKILL.md +96 -0
  265. package/skills/vetc-runbook/SKILL.md +118 -0
  266. package/skills/vetc-sast/SKILL.md +118 -0
  267. package/skills/vetc-sdlc/SKILL.md +97 -0
  268. package/skills/vetc-security/SKILL.md +117 -0
  269. package/skills/vetc-spec-driven/SKILL.md +111 -0
  270. package/skills/vetc-spec-quality/SKILL.md +117 -0
  271. package/skills/vetc-systematic-debugging/SKILL.md +74 -0
  272. package/skills/vetc-tdd/SKILL.md +96 -0
  273. package/skills/vetc-thinking-pm/SKILL.md +110 -0
  274. package/skills/vetc-ui-visual-qa/SKILL.md +117 -0
  275. package/skills/vetc-verify/SKILL.md +101 -0
  276. package/skills/visual-verdict/SKILL.md +59 -0
  277. package/skills/what-if-oracle/SKILL.md +87 -0
  278. package/skills/widget-rendering/SKILL.md +85 -0
  279. package/skills/wiki/SKILL.md +69 -0
  280. package/skills/workspace-isolation/SKILL.md +85 -0
  281. package/skills/worktree-isolation/SKILL.md +85 -0
  282. package/skills/wowerpoint/SKILL.md +101 -0
  283. package/skills/writer-memory/SKILL.md +82 -0
  284. package/skills/writing-plans/SKILL.md +115 -0
  285. package/skills/writing-skills/SKILL.md +115 -0
  286. package/skills/xgboost/SKILL.md +29 -0
  287. package/skills/xgboost-ts/SKILL.md +28 -0
  288. package/skills/xlsx/SKILL.md +111 -0
  289. package/skills/zarr-python/SKILL.md +101 -0
  290. package/src/categories.ts +383 -0
  291. package/src/format.ts +104 -0
  292. package/src/indexer.ts +101 -0
  293. package/src/proactive.ts +51 -0
  294. package/src/scanner.ts +85 -0
  295. package/src/search.ts +89 -0
  296. package/src/strip.ts +29 -0
  297. package/src/synonyms.ts +83 -0
  298. package/src/text.ts +118 -0
  299. package/src/types.ts +64 -0
@@ -0,0 +1,85 @@
1
+ ---
2
+ name: orchestration
3
+ description: Multi-phase orchestration skill for pi-crew planners and executors. Use when decomposing complex tasks into parallel phases, dispatching workers, verifying gates, and iterating until closure.
4
+ ---
5
+
6
+
7
+ # orchestration
8
+
9
+ Use this skill when orchestrating multi-phase tasks across pi-crew teams and workers.
10
+
11
+ ## Role definition
12
+
13
+ You are the orchestrator: you coordinate, you do not execute.
14
+
15
+ You decompose, dispatch, verify, and iterate. You do NOT edit code directly. If you find yourself opening a file to fix a typo "real quick," stop — spawn a worker instead.
16
+
17
+ ## Rules (8 orchestration rules)
18
+
19
+ Adapted from oh-my-pi's orchestrate command pattern for pi-crew context.
20
+
21
+ ### 1. Do not yield until everything is closed
22
+
23
+ Do not hand back control while work remains unfinished. Run every phase to completion. The orchestrator owns the full lifecycle, from first dispatch to final green gate.
24
+
25
+ ### 2. Enumerate the full surface before dispatching
26
+
27
+ Before writing any task packet, read every referenced file and understand the complete work surface. List the entire surface before assigning work; do not assign anything until you understand the full scope.
28
+
29
+ ### 3. Parallelize maximally
30
+
31
+ Every set of edits with disjoint file scope MUST ship as one batch. If 5 tasks edit 5 different files and do not depend on each other, dispatch them all at once. Never serialize what can be parallelized.
32
+
33
+ ### 4. Each task assignment is self-contained
34
+
35
+ Subagents have no shared context. Each worker knows only what you write in its task packet. Include all necessary context, file paths, constraints, and acceptance criteria in every task.
36
+
37
+ ### 5. Verify after every phase before launching the next
38
+
39
+ Run appropriate gates between phases: typecheck, tests, lint. Do not skip verification: a red phase must not advance to the next phase.
40
+
41
+ ### 6. Commit policy — green only
42
+
43
+ Commit after each green phase. Never commit a red tree. Commit only when all gates pass. If the phase fails, fix it first.
44
+
45
+ ### 7. Respawn, do not absorb
46
+
47
+ If a subagent returns incomplete or broken work, spawn a corrective subagent with a focused fix-up task packet. Do not fix a worker's errors yourself; respawn a new worker to fix them.
48
+
49
+ ### 8. No scope creep, no scope shrink
50
+
51
+ Maintain the original scope exactly. Do not expand scope because you "noticed more work," and do not shrink it because the result is "good enough for now." If scope needs to change, escalate to the requester.
52
+
53
+ ## Workflow (7 steps)
54
+
55
+ ### Step 1 — Ingest
56
+
57
+ - Read every referenced file in the goal/task description.
58
+ - Run `git status` and `git diff` to understand current tree state.
59
+ - Identify all files, symbols, and subsystems in scope.
60
+ - Check workspace tree for project context and existing patterns.
61
+
62
+ ### Step 2 — Plan
63
+
64
+ - Materialize the full work surface as ordered phases.
65
+ - For each phase, enumerate: files to touch, workers needed, dependencies on other phases.
66
+ - Phases must be ordered by dependency; tasks within a phase must be independent (disjoint file scope).
67
+ - Write the plan down; do not keep it in your head.
68
+
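The ordering rule in Step 2 (phases ordered by dependency, tasks inside a phase with disjoint file scope) can be sketched as a small greedy planner. This is a minimal illustration in Python, not pi-crew's actual planner: the `Task` record, its field names, and `plan_phases` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record; field names are illustrative only."""
    name: str
    files: frozenset[str]
    depends_on: frozenset[str] = frozenset()


def plan_phases(tasks: list[Task]) -> list[list[str]]:
    """Group tasks into ordered phases.

    A task joins the current phase only if every dependency finished in an
    earlier phase and it shares no file with a task already in the phase.
    """
    remaining = {t.name: t for t in tasks}
    done: set[str] = set()
    phases: list[list[str]] = []
    while remaining:
        phase: list[str] = []
        claimed: set[str] = set()  # files already owned by a task in this phase
        for task in sorted(remaining.values(), key=lambda t: t.name):
            ready = task.depends_on <= done
            disjoint = claimed.isdisjoint(task.files)
            if ready and disjoint:
                phase.append(task.name)
                claimed |= task.files
        if not phase:
            # Nothing could be scheduled: a cycle or a dependency that is not a known task.
            raise ValueError("dependency cycle or unknown dependency")
        for name in phase:
            del remaining[name]
        done.update(phase)
        phases.append(phase)
    return phases


if __name__ == "__main__":
    demo = [
        Task("api", frozenset({"api.ts"})),
        Task("ui", frozenset({"ui.ts"})),
        Task("docs", frozenset({"api.ts"})),  # collides with "api" on api.ts
        Task("tests", frozenset({"api.test.ts"}), frozenset({"api", "ui"})),
    ]
    print(plan_phases(demo))  # [['api', 'ui'], ['docs', 'tests']]
```

The planner is greedy and alphabetical, so it does not guarantee the fewest possible phases; it only guarantees that no phase violates a dependency or has two tasks touching the same file.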
69
+ ### Step 3 — Dispatch phase
70
+
71
+ - Launch all parallel subagents in one `team` call.
72
+ - Each subagent receives a complete task packet (see `task-packet` skill).
73
+ - Set explicit file ownership per worker — no two workers touch the same file.
74
+ - Use `workspaceMode: 'worktree'` when parallel edits risk conflict.
75
+
76
+ ### Step 4 — Verify phase
77
+
78
+ - Run verification gates: typecheck, tests, lint as appropriate.
79
+ - If green → proceed to commit.
80
+ - If red → dispatch fix-up subagents with precise failure context (error output, file, line). Do NOT fix it yourself.
81
+
82
+ ### Step 5 — Commit phase (if applicable)
83
+
84
+ - Only when all gates are green.
85
+ - Commit message should reference the phase and what was accomplished.
@@ -0,0 +1,43 @@
1
+ ---
2
+ name: ownership-session-security
3
+ description: Session ownership and authorization workflow. Use when implementing cancel, respond, steer, run ownership, cwd overrides, imported runs, or cross-session actions.
4
+ ---
5
+
6
+
7
+ # ownership-session-security
8
+
9
+ Use this skill for cross-session safety and trust-boundary work.
10
+
11
+ ## Source patterns distilled
12
+
13
+ - Pi session IDs: `ctx.sessionManager.getSessionId()` from Pi core `ExtensionContext`
14
+ - pi-crew ownership: `TeamRunManifest.ownerSessionId`, `src/extension/team-tool/run.ts`, `cancel.ts`, `respond.ts`
15
+ - Path safety: `src/utils/safe-paths.ts`, `src/state/state-store.ts`, `src/state/mailbox.ts`
16
+ - Destructive actions: `src/extension/team-tool/lifecycle-actions.ts`, `src/worktree/cleanup.ts`
17
+
18
+ ## Rules
19
+
20
+ - Propagate the active Pi session ID into `TeamContext` for every production tool/command path.
21
+ - New runs should record `ownerSessionId` when available.
22
+ - For owned runs, cross-session actions that mutate state must be rejected unless explicit force/admin semantics are designed and tested.
23
+ - Legacy runs without `ownerSessionId` may remain permissive for backward compatibility, but document this behavior.
24
+ - User/LLM-controlled path fields (`cwd`, import paths, artifact paths, task IDs) must be normalized and contained under an allowed base.
25
+ - Use `resolveContainedPath`, `resolveRealContainedPath`, `assertSafePathId`, and symlink checks rather than ad-hoc `startsWith` checks.
26
+ - Destructive management actions must require `confirm: true`; referenced resource deletes must require `force: true` where applicable.
27
+
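The containment rule above can be illustrated with a short Python sketch. The pi-crew helpers named in the rules (`resolveContainedPath` and the others) live in the project's TypeScript sources and are not reproduced here; `resolve_contained_path` and `naive_is_contained` are hypothetical stand-ins written only to show why comparing resolved paths is safer than a string-prefix check.

```python
import os
from pathlib import Path


def resolve_contained_path(base: str, candidate: str) -> Path:
    """Resolve candidate against base and reject anything that escapes base.

    os.path.realpath collapses ".." segments and follows symlinks, so the
    comparison is made on the real location rather than on the raw string.
    """
    base_real = Path(os.path.realpath(base))
    target = Path(os.path.realpath(base_real / candidate))
    if target != base_real and base_real not in target.parents:
        raise ValueError(f"path escapes allowed base: {candidate!r}")
    return target


def naive_is_contained(base: str, candidate: str) -> bool:
    """The ad-hoc prefix check the rule warns against (shown for contrast)."""
    return os.path.normpath(os.path.join(base, candidate)).startswith(base)
```

A sibling directory whose name merely begins with the base name (for example `project-evil` next to `project`) passes the prefix check but is rejected by the resolved-path check, which is the failure mode behind the `cwd: ../other-project` anti-pattern.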
28
+ ## Anti-patterns
29
+
30
+ - Assuming `ctx.sessionId` exists directly on Pi context.
31
+ - Letting `cwd: ../other-project` move run state into another project.
32
+ - Letting `respond`/`cancel` mutate a foreign owned run.
33
+ - Trusting task IDs, run IDs, or artifact paths from tool params without validation.
34
+
35
+ ## Verification
36
+
37
+ ```bash
38
+ cd pi-crew
39
+ npx tsc --noEmit
40
+ node --experimental-strip-types --test test/unit/cancel-ownership.test.ts test/unit/respond-tool.test.ts test/unit/cwd-override-security.test.ts test/unit/api-artifact-security.test.ts
41
+ npm test
42
+ ```
43
+
@@ -0,0 +1,119 @@
1
+ ---
2
+ name: paper-lookup
3
+ description: Search 10 academic paper databases via REST APIs for research papers, preprints, and scholarly articles. Covers PubMed, PMC (full text), bioRxiv, medRxiv, arXiv, OpenAlex, Crossref, Semantic Scholar, CORE, Unpaywall. Use when searching for papers, citations, DOI/PMID lookups, abstracts, full text, open access, preprints, citation graphs, author search, or any scholarly literature query. Triggers on mentions of any supported database or requests like "find papers on X" or "look up this DOI".
4
+ ---
5
+
6
+ # Paper Lookup
7
+
8
+ You have access to 10 academic paper databases through their REST APIs. Your job is to figure out which database(s) best serve the user's query, call them, and return the results.
9
+
10
+ ## Core Workflow
11
+
12
+ 1. **Understand the query** -- What is the user looking for? A specific paper by DOI? Papers on a topic? An author's publications? Open access PDFs? Full text? This determines which database(s) to hit.
13
+
14
+ 2. **Select database(s)** -- Use the database selection guide below. Many queries benefit from hitting multiple databases -- for example, searching PubMed for papers and then checking Unpaywall for open access copies.
15
+
16
+ 3. **Read the reference file** -- Each database has a reference file in `references/` with endpoint details, query formats, and example calls. Read the relevant file(s) before making API calls.
17
+
18
+ 4. **Make the API call(s)** -- See the **Making API Calls** section below for which HTTP fetch tool to use on your platform.
19
+
20
+ 5. **Return results** -- Always return:
21
+ - The **raw JSON** (or parsed XML for arXiv) response from each database
22
+ - A **list of databases queried** with the specific endpoints used
23
+ - If a query returned no results, say so explicitly rather than omitting it
24
+
25
+ ## Database Selection Guide
26
+
27
+ Match the user's intent to the right database(s).
28
+
29
+ ### By Use Case
30
+
31
+ | User is asking about... | Primary database(s) | Also consider |
32
+ |---|---|---|
33
+ | Papers on a biomedical topic | PubMed | Semantic Scholar, OpenAlex |
34
+ | Full text of a biomedical article | PMC | CORE |
35
+ | Biology preprints | bioRxiv | Semantic Scholar, OpenAlex |
36
+ | Health/medical preprints | medRxiv | Semantic Scholar, OpenAlex |
37
+ | Physics, math, or CS preprints | arXiv | Semantic Scholar, OpenAlex |
38
+ | Papers across all fields | OpenAlex | Semantic Scholar, Crossref |
39
+ | A specific paper by DOI | Crossref | Unpaywall, Semantic Scholar |
40
+ | Open access PDF for a paper | Unpaywall | CORE, PMC |
41
+ | Citation graph (who cites whom) | Semantic Scholar | OpenAlex |
42
+ | Author's publications | Semantic Scholar | OpenAlex |
43
+ | Paper recommendations | Semantic Scholar | -- |
44
+
45
+ ### Cross-Database Queries
46
+
47
+ | User is asking about... | Databases to query |
48
+ |---|---|
49
+ | Everything about a paper (metadata + citations + OA) | Crossref + Semantic Scholar + Unpaywall |
50
+ | Comprehensive literature search | PubMed + OpenAlex + Semantic Scholar |
51
+ | Find and read a paper | PubMed (find) + Unpaywall (OA link) + PMC or CORE (full text) |
52
+ | Preprint and its published version | bioRxiv/medRxiv + Crossref |
53
+ | Author overview with citation metrics | Semantic Scholar + OpenAlex |
54
+
55
+ When a query spans multiple needs (e.g., "find papers about CRISPR and get me the PDFs"), query the relevant databases in parallel.
56
+
57
+ ## Common Identifier Formats
58
+
59
+ Different databases use different identifier systems. If a query fails, the identifier format may be wrong.
60
+
61
+ | Identifier | Format | Example | Used by |
62
+ |---|---|---|---|
63
+ | DOI | `10.xxxx/xxxxx` | `10.1038/nature12373` | All databases |
64
+ | PMID | Integer | `34567890` | PubMed, PMC, Semantic Scholar |
65
+ | PMCID | `PMC` + digits | `PMC7029759` | PMC, Europe PMC |
66
+ | arXiv ID | `YYMM.NNNNN` | `2103.15348` | arXiv, Semantic Scholar |
67
+ | OpenAlex ID | `W` + digits | `W2741809807` | OpenAlex |
68
+ | Semantic Scholar ID | 40-char hex | `649def34f8be...` | Semantic Scholar |
69
+ | ORCID | `0000-XXXX-XXXX-XXXX` | `0000-0001-6187-6610` | OpenAlex, Crossref |
70
+ | ISSN | `XXXX-XXXX` | `0028-0836` | Crossref, OpenAlex |
71
+
72
+ **Cross-referencing IDs:** Semantic Scholar accepts DOI, PMID, PMCID, and arXiv ID via prefixes (e.g., `DOI:10.1038/nature12373`, `PMID:34567890`, `ARXIV:2103.15348`). OpenAlex accepts DOI and PMID via prefixes (`doi:10.1038/...`, `pmid:34567890`). Use the PMC ID Converter to translate between PMID, PMCID, and DOI.
73
+
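As a rough aid for the "identifier format may be wrong" case, the formats in the table can be turned into a small classifier. This is a sketch under stated assumptions: the regular expressions are approximations of the "Format" column, the function names are hypothetical, and only the three Semantic Scholar prefixes shown above (`DOI:`, `PMID:`, `ARXIV:`) are mapped.

```python
import re
from typing import Optional

# Approximate patterns derived from the "Format" column of the table above.
# Order matters: the bare-integer PMID pattern is checked last.
_PATTERNS = [
    ("doi", re.compile(r"^10\.\d{4,9}/\S+$")),
    ("pmcid", re.compile(r"^PMC\d+$")),
    ("arxiv", re.compile(r"^\d{4}\.\d{4,5}$")),
    ("openalex", re.compile(r"^W\d+$")),
    ("orcid", re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")),
    ("issn", re.compile(r"^\d{4}-\d{3}[\dX]$")),
    ("s2", re.compile(r"^[0-9a-f]{40}$")),
    ("pmid", re.compile(r"^\d+$")),
]

# Only the prefixes that appear in the text above are mapped here.
_S2_PREFIX = {"doi": "DOI", "pmid": "PMID", "arxiv": "ARXIV"}


def classify_identifier(identifier: str) -> Optional[str]:
    """Return a short label for the identifier kind, or None if unrecognized."""
    value = identifier.strip()
    for kind, pattern in _PATTERNS:
        if pattern.match(value):
            return kind
    return None


def semantic_scholar_ref(identifier: str) -> str:
    """Build a prefixed Semantic Scholar paper reference from a raw identifier."""
    value = identifier.strip()
    kind = classify_identifier(value)
    if kind == "s2":
        return value  # native 40-character IDs need no prefix
    if kind in _S2_PREFIX:
        return f"{_S2_PREFIX[kind]}:{value}"
    raise ValueError(f"no Semantic Scholar prefix known for {identifier!r}")
```

Classifying first makes a failed lookup easier to diagnose: an identifier that maps to `None` is more likely malformed than missing from the database.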
74
+ ## API Keys and Access
75
+
76
+ Most of these databases are fully open. A few benefit from API keys for higher rate limits.
77
+
78
+ ### Databases requiring or benefiting from API keys
79
+
80
+ | Database | Env Variable | Required? | Registration |
81
+ |---|---|---|---|
82
+ | NCBI (PubMed, PMC) | `NCBI_API_KEY` | No (3 req/s without, 10 with) | https://www.ncbi.nlm.nih.gov/account/settings/ |
83
+ | CORE | `CORE_API_KEY` | Yes for full text | https://core.ac.uk/services/api |
84
+ | Semantic Scholar | `S2_API_KEY` | No (shared pool without) | https://www.semanticscholar.org/product/api#api-key-form |
85
+ | OpenAlex | `OPENALEX_API_KEY` | Recommended | https://openalex.org/settings/api |
86
+
87
+ ### Fully open databases (no key needed)
88
+
89
+ | Database | Notes |
90
+ |---|---|
91
+ | bioRxiv / medRxiv | No auth, no documented rate limits |
92
+ | arXiv | No auth, max 1 request per 3 seconds |
93
+ | Crossref | No auth; add `mailto` param for polite pool (2x rate limit) |
94
+ | Unpaywall | No auth; requires `email` parameter |
95
+
96
+ ### Loading API keys
97
+
98
+ 1. **Check the environment first** -- the key may already be exported (e.g., `$NCBI_API_KEY`).
99
+ 2. **Fall back to `.env`** -- check `.env` in the current working directory.
100
+ 3. **Proceed without** -- most APIs still work at lower rate limits. Tell the user which key is missing and how to get one.
101
+
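The three-step lookup order above (environment, then `.env`, then proceed without) might look like the following in Python. This is a minimal sketch, not part of the skill: `load_api_key` is a hypothetical helper, and the `.env` parsing handles only simple `KEY=value` lines with optional `export` and quotes.

```python
import os
from pathlib import Path
from typing import Optional


def load_api_key(name: str, env_file: str = ".env") -> Optional[str]:
    """Return the key from the environment, else from env_file, else None."""
    # Step 1: an exported variable wins.
    value = os.environ.get(name)
    if value:
        return value

    # Step 2: fall back to a .env file in the working directory.
    path = Path(env_file)
    if path.is_file():
        for line in path.read_text().splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, raw = line.partition("=")
            key = key.strip().removeprefix("export ").strip()
            if key == name:
                return raw.strip().strip("'\"") or None

    # Step 3: no key found; the caller proceeds at the lower rate limit
    # and tells the user which key is missing.
    return None
```

Returning `None` rather than raising keeps the third step explicit at the call site, where the caller can decide whether the database is still usable without a key.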
102
+ ## Making API Calls
103
+
104
+ Use your environment's HTTP fetch tool to call REST endpoints:
105
+
106
+ | Platform | HTTP Fetch Tool | Fallback |
107
+ |---|---|---|
108
+ | Claude Code | `WebFetch` | `curl` via Bash |
109
+ | Gemini CLI | `web_fetch` | `curl` via shell |
110
+ | Windsurf | `read_url_content` | `curl` via terminal |
111
+ | Cursor | No dedicated fetch tool | `curl` via `run_terminal_cmd` |
112
+ | Codex CLI | No dedicated fetch tool | `curl` via `shell` |
113
+ | Cline | No dedicated fetch tool | `curl` via `execute_command` |
114
+
115
+ If the fetch tool fails, fall back to `curl` via whatever shell tool is available.
116
+
117
+ ### Special cases
118
+
119
+
@@ -0,0 +1,114 @@
1
+ ---
2
+ name: paperzilla
3
+ description: Chat with your agent about projects, recommendations, and canonical papers in Paperzilla. Use when users ask for recent project recommendations, canonical paper details, markdown-based summaries, recommendation feedback, feed export, or Atom feed URLs.
4
+ ---
5
+
6
+ # Paperzilla
7
+
8
+ Use this skill when you want to chat with your agent about projects, recommendations, and canonical papers in Paperzilla.
9
+
10
+ ## What you can ask
11
+
12
+ - "Give me the latest recommendations from project X."
13
+ - "Open recommendation Y and explain why it matters."
14
+ - "Fetch canonical paper Z as markdown and summarize it."
15
+ - "Tell me how this paper is relevant to my research."
16
+ - "Show me the feed for project X."
17
+ - "Leave feedback on a recommendation."
18
+ - "Export this paper, recommendation, or feed as JSON."
19
+
20
+ This is the core Paperzilla skill. It gives your agent direct access to Paperzilla data, but it does not impose a workflow or external delivery integration.
21
+
22
+ ## Access method
23
+
24
+ Most current profiles in this repo use the `pz` CLI.
25
+
26
+ If the current profile ships extra agent-specific instructions, follow those as well.
27
+
28
+ ### macOS
29
+ ```bash
30
+ brew install paperzilla-ai/tap/pz
31
+ ```
32
+
33
+ ### Windows (Scoop)
34
+ ```bash
35
+ scoop bucket add paperzilla-ai https://github.com/paperzilla-ai/scoop-bucket
36
+ scoop install pz
37
+ ```
38
+
39
+ ### Linux
40
+ Use the official Linux install guide:
41
+
42
+ - https://docs.paperzilla.ai/guides/cli-getting-started
43
+
44
+ ### Build from source (Go 1.23+)
45
+ See the CLI repository for source builds:
46
+
47
+ - https://github.com/paperzilla-ai/pz
48
+
49
+ ## Update
50
+
51
+ Check whether your CLI is up to date and get install-specific upgrade steps:
52
+
53
+ ```bash
54
+ pz update
55
+ ```
56
+
57
+ If detection is ambiguous, override it explicitly:
58
+
59
+ ```bash
60
+ pz update --install-method homebrew
61
+ pz update --install-method scoop
62
+ pz update --install-method release
63
+ pz update --install-method source
64
+ ```
65
+ ## Authentication
66
+
67
+ ```bash
68
+ pz login
69
+ ```
70
+
71
+ ## CLI reference
72
+
73
+ If the current profile uses `pz`, these are the core commands.
74
+
75
+ ### List projects
76
+ ```bash
77
+ pz project list
78
+ ```
79
+
80
+ ### Show one project
81
+ ```bash
82
+ pz project <project-id>
83
+ ```
84
+
85
+ ### Browse project feed
86
+ ```bash
87
+ pz feed <project-id>
88
+ ```
89
+
90
+ Useful flags:
91
+ - `--must-read`
92
+ - `--since YYYY-MM-DD`
93
+ - `--limit N`
94
+ - `--json`
95
+ - `--atom`
96
+
97
+ Examples:
98
+ ```bash
99
+ pz feed <project-id> --must-read --since 2026-03-01 --limit 5
100
+
101
+ ### Read a canonical paper
102
+ ```bash
103
+ pz paper <paper-id>
104
+ pz paper <paper-id> --json
105
+ pz paper <paper-id> --markdown
106
+ pz paper <paper-id> --project <project-id>
107
+ ```
108
+
109
+ ### Open a recommendation from one of your projects
110
+ ```bash
111
+ pz rec <project-paper-id>
112
+ pz rec <project-paper-id> --json
113
+ pz rec <project-paper-id> --markdown
114
+ ```
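When the feed command is driven from a script, the flags listed under "Browse project feed" can be assembled into an argument list before it is handed to a process runner. The following is a minimal sketch that assumes only the flags documented above; the helper name `build_feed_command` is hypothetical and not part of `pz`.

```python
from typing import List, Optional


def build_feed_command(
    project_id: str,
    must_read: bool = False,
    since: Optional[str] = None,   # expected as YYYY-MM-DD
    limit: Optional[int] = None,
    output: Optional[str] = None,  # "json" or "atom"
) -> List[str]:
    """Return an argv list for `pz feed` built from the documented flags only."""
    argv = ["pz", "feed", project_id]
    if must_read:
        argv.append("--must-read")
    if since is not None:
        argv += ["--since", since]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if output is not None:
        if output not in ("json", "atom"):
            raise ValueError("output must be 'json' or 'atom'")
        argv.append(f"--{output}")
    return argv


# Reproduces the example invocation shown above.
print(" ".join(build_feed_command("<project-id>", must_read=True, since="2026-03-01", limit=5)))
```

Returning a list rather than a single string lets the result be passed to `subprocess.run` without shell quoting concerns.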
@@ -0,0 +1,64 @@
1
+ ---
2
+ name: parallel-web
3
+ description: "All-in-one web toolkit powered by parallel-cli, with a strong emphasis on academic and scientific sources. Use this skill whenever the user needs to search the web, fetch/extract URL content, enrich data with web-sourced fields, or run deep research reports. Covers: web search (fast lookups, research, current info — prioritizing peer-reviewed papers, preprints, and scholarly databases), URL extraction (fetching pages, articles, academic PDFs), bulk data enrichment (adding fields to CSV/lists from the web), and deep research (exhaustive multi-source reports grounded in academic literature). Also handles setup, status checks, and result retrieval. Use this skill for ANY web-related task — even if the user doesn't mention 'parallel' or 'web' explicitly. If they want to look something up, fetch a page, enrich a dataset, investigate a topic, find academic papers, check citations, or review scientific literature, this is the skill to use."
4
+ ---
5
+
6
+ # Parallel Web Toolkit
7
+
8
+ A unified skill for all web-powered tasks: searching, extracting, enriching, and researching — with academic and scientific sources as the default priority.
9
+
10
+ ## Routing — pick the right capability
11
+
12
+ Read the user's request and match it to one of the capabilities below. For web search, extract, enrichment, and deep research, read the corresponding reference file for detailed instructions.
13
+
14
+ | User wants to... | Capability | Where |
15
+ |---|---|---|
16
+ | Look something up, research a topic, find current info | **Web Search** | `(see docs)` |
17
+ | Fetch content from a specific URL (webpage, article, PDF) | **Web Extract** | `(see docs)` |
18
+ | Add web-sourced fields to a list of companies/people/products | **Data Enrichment** | `(see docs)` |
19
+ | Get an exhaustive, multi-source report (user says "deep research", "exhaustive", "comprehensive") | **Deep Research** | `(see docs)` |
20
+ | Install or authenticate parallel-cli | **Setup** | Below |
21
+ | Check status of a running research/enrichment task | **Status** | Below |
22
+ | Retrieve completed research results by run ID | **Result** | Below |
23
+
24
+ ### Decision guide
25
+
26
+ - **Default to Web Search** for a single lookup, research question, or "what is X?" query. It's fast and cost-effective. When the query touches a scientific or technical topic, include academic domains (see docs) to surface peer-reviewed and preprint sources alongside general results.
27
+ - **Use Web Extract** when the user provides a URL or asks you to read/fetch a specific page. Prefer this over the built-in WebFetch tool. Particularly useful for extracting full text from academic PDFs, preprint servers, and journal articles.
28
+ - **Use Data Enrichment** when the user has **multiple entities** (a CSV, a list of companies/people/products, or even a short inline list) and wants to find or add the same kind of information for each one. The key signal is a repeated lookup across a set of items — e.g., "find the CEO for each of these companies" or "get the founding year for Apple, Stripe, and Anthropic." Even if the user doesn't say "enrich," use `parallel-cli enrich` whenever the task is the same query applied to multiple entities. Do NOT use Web Search in a loop for this — the enrichment pipeline handles batching, parallelism, and structured output automatically.
29
+ - **Use Deep Research only** when the user explicitly asks for deep, exhaustive, or comprehensive research. It is 10-100x slower and more expensive than Web Search — never default to it. Deep research is especially valuable for literature reviews and multi-paper synthesis.
30
+ - If `parallel-cli` is not found when running any command, follow the Setup section below.
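The decision guide above can be read as a small priority-ordered rule set. The sketch below is one possible encoding, not part of `parallel-cli`: the `Request` fields are hypothetical signals an agent might extract from the user's message, and the precedence between overlapping signals (for example, several entities plus a URL) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical summary of a user request; not a parallel-cli type."""
    has_url: bool = False              # user supplied a specific URL to read
    entity_count: int = 1              # number of items the same lookup applies to
    wants_deep_research: bool = False  # user explicitly said deep/exhaustive/comprehensive


def pick_capability(req: Request) -> str:
    """Map request signals to a capability name from the routing table."""
    if req.wants_deep_research:
        # Only chosen on an explicit ask; never the default.
        return "Deep Research"
    if req.entity_count > 1:
        # Same query repeated over a set of items.
        return "Data Enrichment"
    if req.has_url:
        return "Web Extract"
    # Single lookup or "what is X?" question.
    return "Web Search"


print(pick_capability(Request(entity_count=3)))
```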
31
+
32
+ ### Academic source priority
33
+
34
+ Across all capabilities, prefer academic and scientific sources when the query is technical or scientific in nature. This means:
35
+ - Peer-reviewed journal articles and conference proceedings over blog posts or news articles
36
+ - Preprints (arXiv, bioRxiv, medRxiv) when peer-reviewed versions aren't available
37
+ - Institutional and government sources (NIH, WHO, NASA, NIST) over commercial sites
38
+ - Primary research over secondary summaries
39
+
40
+ When citing academic sources, include author names and publication year where available (e.g., [Smith et al., 2025](url)) in addition to the standard citation format. If a DOI is present, prefer the DOI link.
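As an illustration of the citation rule above, the following sketch formats an author-year markdown link and prefers the DOI link when one is present. The function name and argument shapes are hypothetical; it assumes authors are given as surnames and that a DOI resolves through `https://doi.org/`.

```python
from typing import List, Optional


def format_citation(
    authors: List[str],
    year: Optional[int],
    url: str,
    doi: Optional[str] = None,
) -> str:
    """Build a markdown citation such as [Smith et al., 2025](link)."""
    # Prefer the DOI link when a DOI is available.
    link = f"https://doi.org/{doi}" if doi else url

    if len(authors) >= 3:
        names = f"{authors[0]} et al."
    elif len(authors) == 2:
        names = f"{authors[0]} and {authors[1]}"
    elif len(authors) == 1:
        names = authors[0]
    else:
        names = ""

    # Author names and year are included only where available.
    parts = [p for p in (names, str(year) if year is not None else "") if p]
    label = ", ".join(parts) or "source"
    return f"[{label}]({link})"


print(format_citation(["Smith", "Jones", "Lee"], 2025, "https://example.org/paper", doi="10.1000/xyz"))
```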
41
+
42
+ ## Context chaining
43
+
44
+ Several capabilities support multi-turn context via `interaction_id`. When a research or enrichment task completes, it returns an `interaction_id`. If the user asks a follow-up question related to that task, pass `--previous-interaction-id` to carry context forward automatically. This avoids restating what was already found.
45
+
46
+ ---
47
+
48
+ ## Check task status
49
+
50
+ ```bash
51
+ parallel-cli research status "$RUN_ID" --json
52
+ ```
53
+
54
+ Report the current status to the user (running, completed, failed, etc.).
55
+
56
+ ## Get completed result
57
+
58
+ ```bash
59
+ parallel-cli research poll "$RUN_ID" --json
60
+ ```
61
+
62
+ Present results in a clear, organized format.
63
+
64
+
@@ -0,0 +1,114 @@
1
+ ---
2
+ name: pathfinder
3
+ description: Map a codebase into feature-grouped flowcharts, identify duplicated concerns across features, and propose a unified architecture. Use when asked to "find the ideal path," unify duplicated systems, or audit architecture before a refactor. Emits a proposed unified flowchart plus per-system /make-plan prompts.
4
+ ---
5
+
6
+ # Pathfinder
7
+
8
+ You are an ORCHESTRATOR. Map the codebase into feature-grouped flowcharts, identify duplicated concerns, propose the simplest unified architecture, and hand off per-system plans to `/make-plan`.
9
+
10
+ You do not write implementation code. You produce diagrams, a duplication report, a proposed unified flowchart, and handoff prompts.
11
+
12
+ ## Delegation Model
13
+
14
+ Use subagents for *discovery and extraction* (file reading, flow tracing, grep, diagramming). Keep *synthesis* (deciding feature boundaries, picking unification strategies, final flowchart) with the orchestrator. Reject subagent reports that lack source citations and redeploy.
15
+
16
+ ### Subagent Reporting Contract (MANDATORY)
17
+
18
+ Each subagent response must include:
19
+ 1. Sources consulted — exact file paths and line ranges read
20
+ 2. Concrete findings — exact function names, call sites, data flow
21
+ 3. Mermaid diagram(s) with nodes labeled by `file:line`
22
+ 4. Confidence note + known gaps
23
+
24
+ ## Output Artifacts
25
+
26
+ All artifacts go in `PATHFINDER-<YYYY-MM-DD>/` at repo root:
27
+ - `00-features.md` — feature inventory with boundaries
28
+ - `01-flowcharts/<feature>.md` — one Mermaid flowchart per feature
29
+ - `02-duplication-report.md` — cross-cutting duplicated concerns with evidence
30
+ - `03-unified-proposal.md` — proposed unified architecture + Mermaid
31
+ - `04-handoff-prompts.md` — copy-pasteable `/make-plan` prompts per unified system
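The artifact layout above is fixed apart from the date and the per-feature flowchart names, so it can be computed up front. A minimal sketch, assuming the date is rendered in ISO `YYYY-MM-DD` form; the helper name `artifact_paths` and the dictionary keys are hypothetical.

```python
from datetime import date
from pathlib import Path
from typing import Dict, Optional


def artifact_paths(repo_root: Path, today: Optional[date] = None) -> Dict[str, Path]:
    """Return the Pathfinder artifact locations under repo_root."""
    day = today or date.today()
    base = repo_root / f"PATHFINDER-{day.isoformat()}"
    return {
        "features": base / "00-features.md",
        "flowcharts": base / "01-flowcharts",  # holds one <feature>.md per feature
        "duplication": base / "02-duplication-report.md",
        "proposal": base / "03-unified-proposal.md",
        "handoff": base / "04-handoff-prompts.md",
    }


paths = artifact_paths(Path("."), date(2025, 1, 2))
print(paths["features"])
```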
32
+
33
+ ## Phases
34
+
35
+ ### Phase 0: Feature Discovery (ALWAYS FIRST)
36
+
37
+ Deploy ONE "Feature Discovery" subagent to:
38
+ 1. Walk the source tree (not built artifacts) and read top-level README / CLAUDE.md
39
+ 2. Propose feature boundaries based on directory structure, import graph, and naming
40
+ 3. Return a flat list of features with: name, entry points (file:line), core files, brief purpose
41
+
42
+ Orchestrator reviews the proposal, adjusts boundaries if needed, writes `00-features.md`. Do NOT fan out until feature boundaries are approved.
43
+
44
+ ### Phase 1: Per-Feature Flowcharts (FAN OUT)
45
+
46
+ Deploy ONE "Flowchart" subagent per feature in parallel. Each receives only its feature's scope. Each must:
47
+ 1. Trace the feature's primary happy path from entry point to terminal state
48
+ 2. Identify side effects (DB writes, HTTP calls, file I/O, process spawns)
49
+ 3. Note error and fallback branches but do not let them dominate the diagram
50
+ 4. Produce a Mermaid `flowchart TD` with every node labeled `Name<br/>file:line`
51
+ 5. List external dependencies (other features it calls into) at the bottom
52
+
53
+ Orchestrator writes each flowchart to `01-flowcharts/<feature>.md`. Reject any diagram missing `file:line` labels.
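The rejection rule above can be checked mechanically before a diagram is accepted. The sketch below is a rough approximation, not a Mermaid parser: it assumes nodes are written in square-bracket form (`A["Name<br/>file:line"]`) and that a valid label ends with `<br/>` followed by a path and a line number or range. The function name is hypothetical.

```python
import re
from typing import List

# Matches node definitions such as A["label"] or A[label]; other Mermaid shapes are ignored.
NODE = re.compile(r'\b(\w+)\[(?:"([^"]*)"|([^\]"]*))\]')
# A label must end with <br/> followed by path:line or path:start-end.
FILE_LINE = re.compile(r'<br/>\s*[\w./-]+:\d+(?:-\d+)?\s*$')


def unlabeled_nodes(mermaid: str) -> List[str]:
    """Return ids of nodes whose label lacks a trailing file:line reference."""
    missing: List[str] = []
    for match in NODE.finditer(mermaid):
        node_id = match.group(1)
        label = match.group(2) if match.group(2) is not None else match.group(3)
        if not FILE_LINE.search(label) and node_id not in missing:
            missing.append(node_id)
    return missing


diagram = '''flowchart TD
    A["parseArgs<br/>src/cli.ts:12"] --> B["loadConfig<br/>src/config.ts:40"]
    B --> C[writeOutput]
'''
print(unlabeled_nodes(diagram))
```

A non-empty result is a reason to reject the diagram and redeploy the subagent.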
54
+
55
+ ### Phase 2: Duplication Hunt
56
+
57
+ Deploy TWO subagents in parallel:
58
+
59
+ **"Within-Feature Duplication"** subagent:
60
+ - For each feature, find repeated code/logic patterns inside the feature only
61
+ - Report only duplications worth consolidating (ignore trivial repetition)
62
+
63
+ **"Cross-Feature Duplication"** subagent:
64
+ - Compare flowcharts across features for concerns that appear in multiple places
65
+ - Examples of what to look for: multiple capture paths, parallel queue implementations, duplicated storage/migration code, repeated agent scaffolding, parallel parsing layers
66
+ - For each duplication, report: (a) the concern, (b) every location with `file:line`, (c) why they diverged, (d) whether the divergence is legitimate specialization or accidental
67
+
68
+ Orchestrator synthesizes both into `02-duplication-report.md`. Every duplication claim must cite ≥2 `file:line` locations.
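The evidence requirement for duplication claims can be expressed as a small validation step. This is a sketch under assumptions: the `DuplicationClaim` type is hypothetical, and a location is treated as valid when it looks like `path:line` or `path:start-end`.

```python
import re
from dataclasses import dataclass, field
from typing import List

LOCATION = re.compile(r'^[\w./-]+:\d+(?:-\d+)?$')


@dataclass
class DuplicationClaim:
    """Hypothetical record for one entry in the duplication report."""
    concern: str
    locations: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return reasons the claim fails the evidence rule; empty if it passes."""
        issues: List[str] = []
        bad = [loc for loc in self.locations if not LOCATION.match(loc)]
        if bad:
            issues.append(f"not file:line: {', '.join(bad)}")
        # Count only well-formed, distinct locations toward the minimum of two.
        distinct = set(self.locations) - set(bad)
        if len(distinct) < 2:
            issues.append("needs at least 2 distinct file:line locations")
        return issues


claim = DuplicationClaim("queue implementation", ["a/queue.py:10", "b/queue.py:22"])
print(claim.problems())
```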
69
+
70
+ ### Phase 3: Unified Proposal (ORCHESTRATOR)
71
+
72
+ The orchestrator writes `03-unified-proposal.md` itself — do not delegate synthesis.
73
+
74
+ For each duplicated concern from Phase 2 that is NOT legitimate specialization:
75
+ 1. Propose the simplest unified design (one path, one store, one handler — whatever applies)
76
+ 2. Name the consolidated component and its single entry point
77
+ 3. Show what each old call site becomes
78
+ 4. Call out any loss of capability and whether it's acceptable
79
+
80
+ End the document with ONE combined Mermaid flowchart showing the proposed unified system. Label each node with its target `file:line` (new or existing) where knowable.
81
+
82
+ **Anti-patterns to reject in your own proposal:**
83
+ - Adding a new abstraction layer "for flexibility"
84
+ - Keeping both old paths behind a feature flag
85
+ - Introducing a registry/factory when a switch statement suffices
86
+ - Preserving divergent behavior "just in case"
87
+
88
+ ### Phase 4: Per-System Handoff Prompts
89
+
90
+ For each unified system in the proposal, write a ready-to-run `/make-plan` prompt to `04-handoff-prompts.md`. Each prompt must:
91
+ 1. State the target unified component and its single entry point
92
+ 2. List the exact call sites to rewrite (from Phase 2 evidence)
93
+ 3. Cite the relevant flowchart file from `01-flowcharts/`
94
+ 4. Include anti-pattern guards specific to this system
95
+
96
+ Format each as a fenced code block the user can copy directly into `/make-plan`.
97
+
98
+ ## Key Principles
99
+
100
+ - **Evidence over intuition** — every diagram node and duplication claim cites `file:line`
101
+ - **Current state before ideal state** — Phases 0–2 describe what IS; Phase 3 describes what SHOULD BE
102
+ - **Simplest unification wins** — prefer deletion over abstraction; prefer one path over configurable paths
103
+ - **Specialization is not duplication** — two components serving different trust models or data sources are legitimate even if their code looks similar
104
+ - **Handoff, don't implement** — Pathfinder ends at plan prompts; `/make-plan` and `/do` take it from there
105
+
106
+ ## Failure Modes to Prevent
107
+
108
+ - Drawing flowcharts from memory instead of source — redeploy subagent with grep evidence requirement
109
+ - Proposing unification of legitimately specialized components — re-examine trust/data-source divergence
110
+ - Handoff prompts that lack concrete call sites — rewrite with Phase 2 evidence
111
+ - Skipping Phase 0 boundary review — fanning out on bad feature boundaries wastes all of Phase 1
112
+
113
+
114
+
@@ -0,0 +1,98 @@
1
+ ---
2
+ name: pathml
3
+ description: Full-featured computational pathology toolkit. Use for advanced WSI analysis including multiplexed immunofluorescence (CODEX, Vectra), nucleus segmentation, tissue graph construction, and ML model training on pathology data. Supports 160+ slide formats. For simple tile extraction from H&E slides, histolab may be simpler.
4
+ ---
5
+
6
+ # PathML
7
+
8
+ ## Overview
9
+
10
+ PathML is a comprehensive Python toolkit for computational pathology workflows, designed to facilitate machine learning and image analysis for whole-slide pathology images. The framework provides modular, composable tools for loading diverse slide formats, preprocessing images, constructing spatial graphs, training deep learning models, and analyzing multiparametric imaging data from technologies like CODEX and multiplex immunofluorescence.
11
+
12
+ ## When to Use This Skill
13
+
14
+ Apply this skill for:
15
+ - Loading and processing whole-slide images (WSI) in various proprietary formats
16
+ - Preprocessing H&E stained tissue images with stain normalization
17
+ - Nucleus detection, segmentation, and classification workflows
18
+ - Building cell and tissue graphs for spatial analysis
19
+ - Training or deploying machine learning models (HoVer-Net, HACTNet) on pathology data
20
+ - Analyzing multiparametric imaging (CODEX, Vectra, MERFISH) for spatial proteomics
21
+ - Quantifying marker expression from multiplex immunofluorescence
22
+ - Managing large-scale pathology datasets with HDF5 storage
23
+ - Tile-based analysis and stitching operations
24
+
25
+ ## Core Capabilities
26
+
27
+ PathML provides six major capability areas documented in detail within reference files:
28
+
29
+ ### 2. Preprocessing Pipelines
30
+
31
+ Build modular preprocessing pipelines by composing transforms for image manipulation, quality control, stain normalization, tissue detection, and mask operations. PathML's Pipeline architecture enables reproducible, scalable preprocessing across large datasets.
32
+
33
+ **Key transforms:**
34
+ - `StainNormalizationHE` - Macenko/Vahadane stain normalization
35
+ - `TissueDetectionHE`, `NucleusDetectionHE` - Tissue/nucleus segmentation
36
+ - `MedianBlur`, `GaussianBlur` - Noise reduction
37
+ - `LabelArtifactTileHE` - Quality control for artifacts
38
+
39
+ **See:** `(see docs)` for complete transform catalog, pipeline construction, and preprocessing workflows.
40
+
41
+ ### 3. Graph Construction
42
+
43
+ Construct spatial graphs representing cellular and tissue-level relationships. Extract features from segmented objects to create graph-based representations suitable for graph neural networks and spatial analysis.
44
+
45
+ **See:** `(see docs)` for graph construction methods, feature extraction, and spatial analysis workflows.
46
+
47
+ ### 4. Machine Learning
48
+
49
+ Train and deploy deep learning models for nucleus detection, segmentation, and classification. PathML integrates PyTorch with pre-built models (HoVer-Net, HACTNet), custom DataLoaders, and ONNX support for inference.
50
+
51
+ **Key models:**
52
+ - **HoVer-Net** - Simultaneous nucleus segmentation and classification
53
+ - **HACTNet** - Hierarchical cell-type classification
54
+
55
+ **See:** `(see docs)` for model training, evaluation, inference workflows, and working with public datasets.
56
+
57
+ ### 5. Multiparametric Imaging
58
+
59
+ Analyze spatial proteomics and gene expression data from CODEX, Vectra, MERFISH, and other multiplex imaging platforms. PathML provides specialized slide classes and transforms for processing multiparametric data, cell segmentation with Mesmer, and quantification workflows.
60
+
61
+ **See:** `(see docs)` for CODEX/Vectra workflows, cell segmentation, marker quantification, and integration with AnnData.
62
+
63
+ ### 6. Data Management
64
+
65
+ Efficiently store and manage large pathology datasets using HDF5 format. PathML handles tiles, masks, metadata, and extracted features in unified storage structures optimized for machine learning workflows.
66
+
67
+ **See:** `(see docs)` for HDF5 integration, tile management, dataset organization, and batch processing strategies.
68
+
69
+ ## Quick Start
70
+
71
+ ```bash
+ # With optional dependencies for all features
72
+ uv pip install pathml[all]
73
+ ```
74
+
75
+ ### Basic Workflow Example
76
+
77
+ ```python
78
+ from pathml.core import SlideData
79
+ from pathml.preprocessing import Pipeline, StainNormalizationHE, TissueDetectionHE
80
+
81
+ # Load a whole-slide image
82
+ wsi = SlideData("path/to/slide.svs")
83
+
84
+ # Create preprocessing pipeline
85
+ pipeline = Pipeline([
86
+     TissueDetectionHE(mask_name='tissue'),  # name under which the tissue mask is stored on each tile
87
+     StainNormalizationHE(target='normalize', stain_estimation_method='macenko')
88
+ ])
89
+
90
+ # Run pipeline
91
+ wsi.run(pipeline)
92
+
93
+ # Access processed tiles
94
+ for tile in wsi.tiles:
95
+     processed_image = tile.image
96
+     tissue_mask = tile.masks['tissue']
+ ```
97
+
98
+