@wentorai/research-plugins 1.0.0

Files changed (252)
  1. package/LICENSE +21 -0
  2. package/README.md +204 -0
  3. package/curated/analysis/README.md +64 -0
  4. package/curated/domains/README.md +104 -0
  5. package/curated/literature/README.md +53 -0
  6. package/curated/research/README.md +62 -0
  7. package/curated/tools/README.md +87 -0
  8. package/curated/writing/README.md +61 -0
  9. package/index.ts +39 -0
  10. package/mcp-configs/academic-db/ChatSpatial.json +17 -0
  11. package/mcp-configs/academic-db/academia-mcp.json +17 -0
  12. package/mcp-configs/academic-db/academic-paper-explorer.json +17 -0
  13. package/mcp-configs/academic-db/academic-search-mcp-server.json +17 -0
  14. package/mcp-configs/academic-db/agentinterviews-mcp.json +17 -0
  15. package/mcp-configs/academic-db/all-in-mcp.json +17 -0
  16. package/mcp-configs/academic-db/apple-health-mcp.json +17 -0
  17. package/mcp-configs/academic-db/arxiv-latex-mcp.json +17 -0
  18. package/mcp-configs/academic-db/arxiv-mcp-server.json +17 -0
  19. package/mcp-configs/academic-db/bgpt-mcp.json +17 -0
  20. package/mcp-configs/academic-db/biomcp.json +17 -0
  21. package/mcp-configs/academic-db/biothings-mcp.json +17 -0
  22. package/mcp-configs/academic-db/catalysishub-mcp-server.json +17 -0
  23. package/mcp-configs/academic-db/clinicaltrialsgov-mcp-server.json +17 -0
  24. package/mcp-configs/academic-db/deep-research-mcp.json +17 -0
  25. package/mcp-configs/academic-db/dicom-mcp.json +17 -0
  26. package/mcp-configs/academic-db/enrichr-mcp-server.json +17 -0
  27. package/mcp-configs/academic-db/fec-mcp-server.json +17 -0
  28. package/mcp-configs/academic-db/fhir-mcp-server-themomentum.json +17 -0
  29. package/mcp-configs/academic-db/fhir-mcp.json +19 -0
  30. package/mcp-configs/academic-db/gget-mcp.json +17 -0
  31. package/mcp-configs/academic-db/google-researcher-mcp.json +17 -0
  32. package/mcp-configs/academic-db/idea-reality-mcp.json +17 -0
  33. package/mcp-configs/academic-db/legiscan-mcp.json +19 -0
  34. package/mcp-configs/academic-db/lex.json +17 -0
  35. package/mcp-configs/ai-platform/Adaptive-Graph-of-Thoughts-MCP-server.json +17 -0
  36. package/mcp-configs/ai-platform/ai-counsel.json +17 -0
  37. package/mcp-configs/ai-platform/atlas-mcp-server.json +17 -0
  38. package/mcp-configs/ai-platform/counsel-mcp.json +17 -0
  39. package/mcp-configs/ai-platform/cross-llm-mcp.json +17 -0
  40. package/mcp-configs/ai-platform/gptr-mcp.json +17 -0
  41. package/mcp-configs/browser/decipher-research-agent.json +17 -0
  42. package/mcp-configs/browser/deep-research.json +17 -0
  43. package/mcp-configs/browser/everything-claude-code.json +17 -0
  44. package/mcp-configs/browser/gpt-researcher.json +17 -0
  45. package/mcp-configs/browser/heurist-agent-framework.json +17 -0
  46. package/mcp-configs/data-platform/4everland-hosting-mcp.json +17 -0
  47. package/mcp-configs/data-platform/context-keeper.json +17 -0
  48. package/mcp-configs/data-platform/context7.json +19 -0
  49. package/mcp-configs/data-platform/contextstream-mcp.json +17 -0
  50. package/mcp-configs/data-platform/email-mcp.json +17 -0
  51. package/mcp-configs/note-knowledge/ApeRAG.json +17 -0
  52. package/mcp-configs/note-knowledge/In-Memoria.json +17 -0
  53. package/mcp-configs/note-knowledge/agent-memory.json +17 -0
  54. package/mcp-configs/note-knowledge/aimemo.json +17 -0
  55. package/mcp-configs/note-knowledge/biel-mcp.json +19 -0
  56. package/mcp-configs/note-knowledge/cognee.json +17 -0
  57. package/mcp-configs/note-knowledge/context-awesome.json +17 -0
  58. package/mcp-configs/note-knowledge/context-mcp.json +17 -0
  59. package/mcp-configs/note-knowledge/conversation-handoff-mcp.json +17 -0
  60. package/mcp-configs/note-knowledge/cortex.json +17 -0
  61. package/mcp-configs/note-knowledge/devrag.json +17 -0
  62. package/mcp-configs/note-knowledge/easy-obsidian-mcp.json +17 -0
  63. package/mcp-configs/note-knowledge/engram.json +17 -0
  64. package/mcp-configs/note-knowledge/gnosis-mcp.json +17 -0
  65. package/mcp-configs/note-knowledge/graphlit-mcp-server.json +19 -0
  66. package/mcp-configs/reference-mgr/arxiv-cli.json +17 -0
  67. package/mcp-configs/reference-mgr/arxiv-search-mcp.json +17 -0
  68. package/mcp-configs/reference-mgr/chiken.json +17 -0
  69. package/mcp-configs/reference-mgr/claude-scholar.json +17 -0
  70. package/mcp-configs/reference-mgr/devonthink-mcp.json +17 -0
  71. package/mcp-configs/registry.json +447 -0
  72. package/openclaw.plugin.json +21 -0
  73. package/package.json +61 -0
  74. package/skills/analysis/dataviz/color-accessibility-guide/SKILL.md +230 -0
  75. package/skills/analysis/dataviz/geospatial-viz-guide/SKILL.md +218 -0
  76. package/skills/analysis/dataviz/interactive-viz-guide/SKILL.md +287 -0
  77. package/skills/analysis/dataviz/network-visualization-guide/SKILL.md +195 -0
  78. package/skills/analysis/dataviz/publication-figures-guide/SKILL.md +238 -0
  79. package/skills/analysis/dataviz/python-dataviz-guide/SKILL.md +195 -0
  80. package/skills/analysis/econometrics/causal-inference-guide/SKILL.md +197 -0
  81. package/skills/analysis/econometrics/iv-regression-guide/SKILL.md +198 -0
  82. package/skills/analysis/econometrics/panel-data-guide/SKILL.md +274 -0
  83. package/skills/analysis/econometrics/robustness-checks/SKILL.md +250 -0
  84. package/skills/analysis/econometrics/stata-regression/SKILL.md +117 -0
  85. package/skills/analysis/econometrics/time-series-guide/SKILL.md +235 -0
  86. package/skills/analysis/statistics/bayesian-statistics-guide/SKILL.md +221 -0
  87. package/skills/analysis/statistics/hypothesis-testing-guide/SKILL.md +210 -0
  88. package/skills/analysis/statistics/meta-analysis-guide/SKILL.md +206 -0
  89. package/skills/analysis/statistics/nonparametric-tests-guide/SKILL.md +221 -0
  90. package/skills/analysis/statistics/power-analysis-guide/SKILL.md +240 -0
  91. package/skills/analysis/statistics/sem-guide/SKILL.md +231 -0
  92. package/skills/analysis/statistics/survival-analysis-guide/SKILL.md +195 -0
  93. package/skills/analysis/wrangling/missing-data-handling/SKILL.md +224 -0
  94. package/skills/analysis/wrangling/pandas-data-wrangling/SKILL.md +242 -0
  95. package/skills/analysis/wrangling/questionnaire-design-guide/SKILL.md +234 -0
  96. package/skills/analysis/wrangling/text-mining-guide/SKILL.md +225 -0
  97. package/skills/domains/ai-ml/computer-vision-guide/SKILL.md +213 -0
  98. package/skills/domains/ai-ml/deep-learning-papers-guide/SKILL.md +200 -0
  99. package/skills/domains/ai-ml/llm-evaluation-guide/SKILL.md +194 -0
  100. package/skills/domains/ai-ml/prompt-engineering-research/SKILL.md +233 -0
  101. package/skills/domains/ai-ml/reinforcement-learning-guide/SKILL.md +254 -0
  102. package/skills/domains/ai-ml/transformer-architecture-guide/SKILL.md +233 -0
  103. package/skills/domains/biomedical/clinical-research-guide/SKILL.md +232 -0
  104. package/skills/domains/biomedical/clinicaltrials-api/SKILL.md +177 -0
  105. package/skills/domains/biomedical/epidemiology-guide/SKILL.md +200 -0
  106. package/skills/domains/biomedical/genomics-analysis-guide/SKILL.md +270 -0
  107. package/skills/domains/business/market-analysis-guide/SKILL.md +112 -0
  108. package/skills/domains/business/strategic-management-guide/SKILL.md +154 -0
  109. package/skills/domains/chemistry/computational-chemistry-guide/SKILL.md +266 -0
  110. package/skills/domains/chemistry/retrosynthesis-guide/SKILL.md +215 -0
  111. package/skills/domains/cs/algorithms-complexity-guide/SKILL.md +194 -0
  112. package/skills/domains/cs/dblp-api/SKILL.md +129 -0
  113. package/skills/domains/cs/software-engineering-research/SKILL.md +218 -0
  114. package/skills/domains/ecology/biodiversity-data-guide/SKILL.md +296 -0
  115. package/skills/domains/ecology/conservation-biology-guide/SKILL.md +198 -0
  116. package/skills/domains/ecology/gbif-api/SKILL.md +158 -0
  117. package/skills/domains/ecology/inaturalist-api/SKILL.md +173 -0
  118. package/skills/domains/economics/behavioral-economics-guide/SKILL.md +239 -0
  119. package/skills/domains/economics/development-economics-guide/SKILL.md +181 -0
  120. package/skills/domains/economics/fred-api/SKILL.md +189 -0
  121. package/skills/domains/education/curriculum-design-guide/SKILL.md +144 -0
  122. package/skills/domains/education/learning-science-guide/SKILL.md +150 -0
  123. package/skills/domains/finance/financial-data-analysis/SKILL.md +152 -0
  124. package/skills/domains/finance/quantitative-finance-guide/SKILL.md +151 -0
  125. package/skills/domains/geoscience/climate-science-guide/SKILL.md +158 -0
  126. package/skills/domains/geoscience/gis-remote-sensing-guide/SKILL.md +129 -0
  127. package/skills/domains/humanities/digital-humanities-guide/SKILL.md +181 -0
  128. package/skills/domains/humanities/philosophy-research-guide/SKILL.md +148 -0
  129. package/skills/domains/law/courtlistener-api/SKILL.md +213 -0
  130. package/skills/domains/law/legal-research-guide/SKILL.md +250 -0
  131. package/skills/domains/math/linear-algebra-applications/SKILL.md +227 -0
  132. package/skills/domains/math/numerical-methods-guide/SKILL.md +236 -0
  133. package/skills/domains/math/oeis-api/SKILL.md +158 -0
  134. package/skills/domains/pharma/clinical-pharmacology-guide/SKILL.md +165 -0
  135. package/skills/domains/pharma/drug-development-guide/SKILL.md +177 -0
  136. package/skills/domains/physics/computational-physics-guide/SKILL.md +300 -0
  137. package/skills/domains/physics/nasa-ads-api/SKILL.md +150 -0
  138. package/skills/domains/physics/quantum-computing-guide/SKILL.md +234 -0
  139. package/skills/domains/social-science/social-research-methods/SKILL.md +194 -0
  140. package/skills/domains/social-science/survey-research-guide/SKILL.md +182 -0
  141. package/skills/literature/discovery/citation-alert-guide/SKILL.md +154 -0
  142. package/skills/literature/discovery/conference-proceedings-guide/SKILL.md +142 -0
  143. package/skills/literature/discovery/literature-mapping-guide/SKILL.md +175 -0
  144. package/skills/literature/discovery/paper-tracking-guide/SKILL.md +211 -0
  145. package/skills/literature/discovery/rss-paper-feeds/SKILL.md +214 -0
  146. package/skills/literature/discovery/semantic-scholar-recs-guide/SKILL.md +164 -0
  147. package/skills/literature/fulltext/doaj-api/SKILL.md +120 -0
  148. package/skills/literature/fulltext/interlibrary-loan-guide/SKILL.md +163 -0
  149. package/skills/literature/fulltext/open-access-guide/SKILL.md +183 -0
  150. package/skills/literature/fulltext/pmc-oai-api/SKILL.md +184 -0
  151. package/skills/literature/fulltext/preprint-servers-guide/SKILL.md +128 -0
  152. package/skills/literature/fulltext/repository-harvesting-guide/SKILL.md +207 -0
  153. package/skills/literature/fulltext/unpaywall-api/SKILL.md +113 -0
  154. package/skills/literature/metadata/altmetrics-guide/SKILL.md +132 -0
  155. package/skills/literature/metadata/citation-network-guide/SKILL.md +236 -0
  156. package/skills/literature/metadata/crossref-api/SKILL.md +133 -0
  157. package/skills/literature/metadata/datacite-api/SKILL.md +126 -0
  158. package/skills/literature/metadata/doi-resolution-guide/SKILL.md +168 -0
  159. package/skills/literature/metadata/h-index-guide/SKILL.md +183 -0
  160. package/skills/literature/metadata/journal-metrics-guide/SKILL.md +188 -0
  161. package/skills/literature/metadata/opencitations-api/SKILL.md +128 -0
  162. package/skills/literature/metadata/orcid-api/SKILL.md +136 -0
  163. package/skills/literature/metadata/orcid-integration-guide/SKILL.md +178 -0
  164. package/skills/literature/search/arxiv-api/SKILL.md +95 -0
  165. package/skills/literature/search/biorxiv-api/SKILL.md +123 -0
  166. package/skills/literature/search/boolean-search-guide/SKILL.md +199 -0
  167. package/skills/literature/search/citation-chaining-guide/SKILL.md +148 -0
  168. package/skills/literature/search/database-comparison-guide/SKILL.md +100 -0
  169. package/skills/literature/search/europe-pmc-api/SKILL.md +120 -0
  170. package/skills/literature/search/google-scholar-guide/SKILL.md +182 -0
  171. package/skills/literature/search/mesh-terms-guide/SKILL.md +164 -0
  172. package/skills/literature/search/openalex-api/SKILL.md +134 -0
  173. package/skills/literature/search/pubmed-api/SKILL.md +130 -0
  174. package/skills/literature/search/scientify-literature-survey/SKILL.md +203 -0
  175. package/skills/literature/search/semantic-scholar-api/SKILL.md +134 -0
  176. package/skills/literature/search/systematic-search-strategy/SKILL.md +214 -0
  177. package/skills/research/automation/ai-scientist-guide/SKILL.md +228 -0
  178. package/skills/research/automation/data-collection-automation/SKILL.md +248 -0
  179. package/skills/research/automation/research-workflow-automation/SKILL.md +266 -0
  180. package/skills/research/deep-research/meta-synthesis-guide/SKILL.md +174 -0
  181. package/skills/research/deep-research/research-cog/SKILL.md +153 -0
  182. package/skills/research/deep-research/scoping-review-guide/SKILL.md +217 -0
  183. package/skills/research/deep-research/systematic-review-guide/SKILL.md +250 -0
  184. package/skills/research/funding/figshare-api/SKILL.md +163 -0
  185. package/skills/research/funding/grant-writing-guide/SKILL.md +233 -0
  186. package/skills/research/funding/nsf-grant-guide/SKILL.md +206 -0
  187. package/skills/research/funding/open-science-guide/SKILL.md +255 -0
  188. package/skills/research/funding/zenodo-api/SKILL.md +174 -0
  189. package/skills/research/methodology/action-research-guide/SKILL.md +201 -0
  190. package/skills/research/methodology/experimental-design-guide/SKILL.md +236 -0
  191. package/skills/research/methodology/grad-school-guide/SKILL.md +182 -0
  192. package/skills/research/methodology/grounded-theory-guide/SKILL.md +171 -0
  193. package/skills/research/methodology/mixed-methods-guide/SKILL.md +208 -0
  194. package/skills/research/methodology/qualitative-research-guide/SKILL.md +234 -0
  195. package/skills/research/methodology/scientify-idea-generation/SKILL.md +222 -0
  196. package/skills/research/paper-review/paper-reading-assistant/SKILL.md +266 -0
  197. package/skills/research/paper-review/peer-review-guide/SKILL.md +227 -0
  198. package/skills/research/paper-review/rebuttal-writing-guide/SKILL.md +185 -0
  199. package/skills/research/paper-review/scientify-write-review-paper/SKILL.md +209 -0
  200. package/skills/tools/code-exec/jupyter-notebook-guide/SKILL.md +178 -0
  201. package/skills/tools/code-exec/python-reproducibility-guide/SKILL.md +341 -0
  202. package/skills/tools/code-exec/r-reproducibility-guide/SKILL.md +236 -0
  203. package/skills/tools/code-exec/sandbox-execution-guide/SKILL.md +221 -0
  204. package/skills/tools/diagram/mermaid-diagram-guide/SKILL.md +269 -0
  205. package/skills/tools/diagram/plantuml-guide/SKILL.md +397 -0
  206. package/skills/tools/diagram/scientific-illustration-guide/SKILL.md +225 -0
  207. package/skills/tools/document/anystyle-api/SKILL.md +199 -0
  208. package/skills/tools/document/grobid-pdf-parsing/SKILL.md +294 -0
  209. package/skills/tools/document/markdown-academic-guide/SKILL.md +217 -0
  210. package/skills/tools/document/pdf-extraction-guide/SKILL.md +321 -0
  211. package/skills/tools/knowledge-graph/knowledge-graph-construction/SKILL.md +306 -0
  212. package/skills/tools/knowledge-graph/ontology-design-guide/SKILL.md +214 -0
  213. package/skills/tools/knowledge-graph/rag-methodology-guide/SKILL.md +325 -0
  214. package/skills/tools/ocr-translate/formula-recognition-guide/SKILL.md +367 -0
  215. package/skills/tools/ocr-translate/handwriting-recognition-guide/SKILL.md +211 -0
  216. package/skills/tools/ocr-translate/latex-ocr-guide/SKILL.md +204 -0
  217. package/skills/tools/ocr-translate/multilingual-research-guide/SKILL.md +234 -0
  218. package/skills/tools/scraping/academic-web-scraping/SKILL.md +326 -0
  219. package/skills/tools/scraping/api-data-collection-guide/SKILL.md +301 -0
  220. package/skills/tools/scraping/web-scraping-ethics-guide/SKILL.md +250 -0
  221. package/skills/writing/citation/bibtex-management-guide/SKILL.md +246 -0
  222. package/skills/writing/citation/citation-style-guide/SKILL.md +248 -0
  223. package/skills/writing/citation/reference-manager-comparison/SKILL.md +208 -0
  224. package/skills/writing/citation/zotero-api/SKILL.md +188 -0
  225. package/skills/writing/composition/abstract-writing-guide/SKILL.md +188 -0
  226. package/skills/writing/composition/discussion-writing-guide/SKILL.md +194 -0
  227. package/skills/writing/composition/introduction-writing-guide/SKILL.md +194 -0
  228. package/skills/writing/composition/literature-review-writing/SKILL.md +196 -0
  229. package/skills/writing/composition/methods-section-guide/SKILL.md +185 -0
  230. package/skills/writing/composition/response-to-reviewers/SKILL.md +215 -0
  231. package/skills/writing/composition/scientific-writing-guide/SKILL.md +152 -0
  232. package/skills/writing/latex/bibliography-management-guide/SKILL.md +206 -0
  233. package/skills/writing/latex/latex-drawing-guide/SKILL.md +234 -0
  234. package/skills/writing/latex/latex-ecosystem-guide/SKILL.md +240 -0
  235. package/skills/writing/latex/math-typesetting-guide/SKILL.md +231 -0
  236. package/skills/writing/latex/overleaf-collaboration-guide/SKILL.md +211 -0
  237. package/skills/writing/latex/tikz-diagrams-guide/SKILL.md +211 -0
  238. package/skills/writing/polish/academic-translation-guide/SKILL.md +175 -0
  239. package/skills/writing/polish/academic-writing-refiner/SKILL.md +143 -0
  240. package/skills/writing/polish/ai-writing-humanizer/SKILL.md +178 -0
  241. package/skills/writing/polish/grammar-checker-guide/SKILL.md +184 -0
  242. package/skills/writing/polish/plagiarism-detection-guide/SKILL.md +167 -0
  243. package/skills/writing/templates/beamer-presentation-guide/SKILL.md +263 -0
  244. package/skills/writing/templates/conference-paper-template/SKILL.md +219 -0
  245. package/skills/writing/templates/thesis-template-guide/SKILL.md +200 -0
  246. package/skills/writing/templates/thesis-writing-guide/SKILL.md +220 -0
  247. package/src/tools/arxiv.ts +131 -0
  248. package/src/tools/crossref.ts +112 -0
  249. package/src/tools/openalex.ts +174 -0
  250. package/src/tools/pubmed.ts +166 -0
  251. package/src/tools/semantic-scholar.ts +108 -0
  252. package/src/tools/unpaywall.ts +58 -0
@@ -0,0 +1,240 @@
1
+ ---
2
+ name: power-analysis-guide
3
+ description: "Sample size calculation and statistical power analysis guide"
4
+ metadata:
5
+ openclaw:
6
+ emoji: "target"
7
+ category: "analysis"
8
+ subcategory: "statistics"
9
+ keywords: ["sample size calculation", "power analysis", "effect size", "significance testing"]
10
+ source: "wentor-research-plugins"
11
+ ---
12
+
13
+ # Power Analysis Guide
14
+
15
+ Calculate appropriate sample sizes for your study using power analysis, understand effect sizes, and avoid underpowered or wastefully overpowered designs.
16
+
17
+ ## Core Concepts
18
+
19
+ ### The Four Parameters of Power Analysis
20
+
21
+ Every power analysis involves four interrelated quantities. Fix any three to solve for the fourth:
22
+
23
+ | Parameter | Symbol | Definition | Typical Value |
24
+ |-----------|--------|-----------|---------------|
25
+ | **Effect size** | d, r, f, etc. | Magnitude of the phenomenon you expect to detect | Varies by field |
26
+ | **Significance level** (alpha) | alpha | Probability of Type I error (false positive) | 0.05 |
27
+ | **Statistical power** (1 - beta) | 1 - beta | Probability of detecting a true effect | 0.80 or 0.90 |
28
+ | **Sample size** | N | Number of observations needed | Solve for this |
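
The "fix three, solve for the fourth" relationship can be sketched with a normal approximation for a two-sided, two-sample comparison of means. This is an illustrative sketch only: the function names `approx_n_per_group` and `approx_power` are hypothetical, and the approximation slightly underestimates the N that an exact t-test calculation would give.

```python
import math
from statistics import NormalDist


def approx_n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate per-group N via n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2."""
    if d == 0:
        raise ValueError("effect size must be non-zero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power for the same design (ignores the negligible opposite tail)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    return z.cdf(abs(d) * math.sqrt(n_per_group / 2) - z_alpha)


n = approx_n_per_group(0.5)
print(n, round(approx_power(0.5, n), 3))
```

Halving the effect size roughly quadruples the required N, which is why the choice of effect size dominates the calculation.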
29
+
30
+ ### Error Types
31
+
32
+ | | H0 is true (no effect) | H0 is false (effect exists) |
33
+ |---|---|---|
34
+ | **Reject H0** | Type I error (alpha) | Correct (power = 1 - beta) |
35
+ | **Fail to reject H0** | Correct (1 - alpha) | Type II error (beta) |
36
+
37
+ ## Effect Size Conventions
38
+
39
+ ### Cohen's d (Two-Group Comparison)
40
+
41
+ ```
42
+ d = (M1 - M2) / SD_pooled
43
+ ```
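
As a minimal sketch of the formula above, the pooled standard deviation can be computed from the two sample variances weighted by their degrees of freedom. The helper name `cohens_d` and the toy data are assumptions for illustration.

```python
import math
from statistics import mean, variance


def cohens_d(group1, group2):
    """Cohen's d using the pooled standard deviation of two independent samples."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = variance(group1), variance(group2)  # sample variances (n - 1 denominator)
    sd_pooled = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (mean(group1) - mean(group2)) / sd_pooled


# Toy data: means 4 and 3, both sample variances equal to 4, so SD_pooled = 2
print(cohens_d([2, 4, 6], [1, 3, 5]))
```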
44
+
45
+ | Size | Cohen's d | Interpretation |
46
+ |------|-----------|---------------|
47
+ | Small | 0.2 | Subtle, may need large N to detect |
48
+ | Medium | 0.5 | Noticeable, typical in social sciences |
49
+ | Large | 0.8 | Obvious, often visible without statistics |
50
+
51
+ ### Correlation (r)
52
+
53
+ | Size | r | r-squared |
54
+ |------|---|-----------|
55
+ | Small | 0.1 | 1% variance explained |
56
+ | Medium | 0.3 | 9% variance explained |
57
+ | Large | 0.5 | 25% variance explained |
58
+
59
+ ### Cohen's f (ANOVA)
60
+
61
+ | Size | f | Equivalent eta-squared |
62
+ |------|---|----------------------|
63
+ | Small | 0.10 | 0.01 |
64
+ | Medium | 0.25 | 0.06 |
65
+ | Large | 0.40 | 0.14 |
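
The "equivalent eta-squared" column follows from the standard conversion eta-squared = f^2 / (1 + f^2). A small sketch, with hypothetical helper names, that reproduces the rounded table values:

```python
import math


def f_to_eta_squared(f: float) -> float:
    """Convert Cohen's f to eta-squared."""
    return f ** 2 / (1 + f ** 2)


def eta_squared_to_f(eta_sq: float) -> float:
    """Convert eta-squared (0 <= eta_sq < 1) back to Cohen's f."""
    return math.sqrt(eta_sq / (1 - eta_sq))


for f in (0.10, 0.25, 0.40):
    print(f, round(f_to_eta_squared(f), 2))
```

The reverse conversion is useful when a prior study reports eta-squared but the power routine expects Cohen's f.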
66
+
67
+ ### Odds Ratio (Logistic Regression)
68
+
69
+ | Size | OR |
70
+ |------|-----|
71
+ | Small | 1.5 |
72
+ | Medium | 2.5 |
73
+ | Large | 4.0 |
74
+
75
+ ## Power Analysis in Python (statsmodels)
76
+
77
+ ### Two-Sample t-Test
78
+
79
+ ```python
80
+ from statsmodels.stats.power import TTestIndPower
81
+
82
+ analysis = TTestIndPower()
83
+
84
+ # Solve for sample size
85
+ n = analysis.solve_power(
86
+ effect_size=0.5, # Cohen's d = medium
87
+ alpha=0.05, # Significance level
88
+ power=0.80, # 80% power
89
+ ratio=1.0, # Equal group sizes
90
+ alternative='two-sided'
91
+ )
92
+ print(f"Required N per group: {int(n) + 1}") # Output: 64
93
+
94
+ # Solve for power (given N)
95
+ power = analysis.solve_power(
96
+ effect_size=0.5,
97
+ alpha=0.05,
98
+ nobs1=50,
99
+ ratio=1.0,
100
+ alternative='two-sided'
101
+ )
102
+ print(f"Power with N=50 per group: {power:.3f}") # Output: 0.697
103
+ ```
104
+
105
+ ### Paired t-Test
106
+
107
+ ```python
108
+ from statsmodels.stats.power import TTestPower
109
+
110
+ analysis = TTestPower()
111
+ n = analysis.solve_power(
112
+ effect_size=0.3, # Small-medium effect
113
+ alpha=0.05,
114
+ power=0.80,
115
+ alternative='two-sided'
116
+ )
117
+ print(f"Required N (paired): {int(n) + 1}") # Output: 90
118
+ ```
119
+
120
+ ### One-Way ANOVA
121
+
122
+ ```python
123
+ from statsmodels.stats.power import FTestAnovaPower
124
+
125
+ analysis = FTestAnovaPower()
126
+ n = analysis.solve_power(
127
+ effect_size=0.25, # Cohen's f = medium
128
+ alpha=0.05,
129
+ power=0.80,
130
+ k_groups=4 # Number of groups
131
+ )
132
+ print(f"Required N per group: {int(n / 4) + 1}") # solve_power returns the total N across all 4 groups; Output: 45
133
+ ```
134
+
135
+ ### Chi-Square Test
136
+
137
+ ```python
138
+ from statsmodels.stats.power import GofChisquarePower
139
+
140
+ analysis = GofChisquarePower()
141
+ n = analysis.solve_power(
142
+ effect_size=0.3, # Cohen's w = medium
143
+ alpha=0.05,
144
+ power=0.80,
145
+ n_bins=4 # Degrees of freedom + 1
146
+ )
147
+ print(f"Required total N: {int(n) + 1}")
148
+ ```
149
+
150
+ ### Multiple Regression
151
+
152
+ ```python
153
+ from statsmodels.stats.power import FTestPower
154
+
155
+ analysis = FTestPower()
156
+ # For R-squared: convert to f2 = R2 / (1 - R2)
157
+ r_squared = 0.10 # Expected R-squared for the model
158
+ f2 = r_squared / (1 - r_squared) # f2 = 0.111
159
+
160
+ n = analysis.solve_power(
161
+ effect_size=f2 ** 0.5, # FTestPower expects Cohen's f, the square root of f2
162
+ alpha=0.05,
163
+ power=0.80,
164
+ df_num=5 # Number of predictors
165
+ )
166
+ # n returned is df_denom; total N = n + df_num + 1
167
+ total_n = int(n) + 5 + 1
168
+ print(f"Required total N: {total_n}")
169
+ ```
170
+
171
+ ## Power Analysis in R (pwr Package)
172
+
173
+ ```r
174
+ library(pwr)
175
+
176
+ # Two-sample t-test
177
+ result <- pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80,
178
+ type = "two.sample", alternative = "two.sided")
179
+ cat("N per group:", ceiling(result$n), "\n")
180
+
181
+ # Correlation test
182
+ result <- pwr.r.test(r = 0.3, sig.level = 0.05, power = 0.80,
183
+ alternative = "two.sided")
184
+ cat("Total N:", ceiling(result$n), "\n")
185
+
186
+ # One-way ANOVA (4 groups)
187
+ result <- pwr.anova.test(k = 4, f = 0.25, sig.level = 0.05, power = 0.80)
188
+ cat("N per group:", ceiling(result$n), "\n")
189
+
190
+ # Chi-square test
191
+ result <- pwr.chisq.test(w = 0.3, df = 3, sig.level = 0.05, power = 0.80)
192
+ cat("Total N:", ceiling(result$N), "\n")
193
+
194
+ # Plot power curve
195
+ result <- pwr.t.test(d = 0.5, sig.level = 0.05, power = 0.80,
196
+ type = "two.sample", alternative = "two.sided")
197
+ plot(result) # Draws power as a function of n around the solved sample size
198
+ ```
199
+
200
+ ## Using G*Power (Desktop Application)
201
+
202
+ G*Power (gpower.hhu.de) is a free, widely used GUI application for power analysis:
203
+
204
+ 1. **Select test family**: t-tests, F-tests, chi-square, z-tests, exact tests
205
+ 2. **Select statistical test**: e.g., "Means: Difference between two independent means (two groups)"
206
+ 3. **Select type of analysis**: A priori (compute N), Post hoc (compute power), Sensitivity (compute detectable effect)
207
+ 4. **Input parameters**: Effect size, alpha, power, allocation ratio
208
+ 5. **Calculate**: Click "Calculate" to get the result
209
+ 6. **Plot**: Use "X-Y plot for a range of values" to visualize power curves
210
+
211
+ ## Practical Recommendations
212
+
213
+ ### Choosing Effect Sizes
214
+
215
+ Do NOT blindly use Cohen's conventions. Instead:
216
+
217
+ 1. **Literature review**: Find effect sizes reported in similar studies
218
+ 2. **Pilot data**: Run a small pilot study to estimate the effect
219
+ 3. **Smallest effect of interest (SESOI)**: What is the smallest effect that would be practically meaningful?
220
+ 4. **Meta-analyses**: Use pooled effect sizes from meta-analyses in your area
221
+
222
+ ### Common Mistakes
223
+
224
+ | Mistake | Problem | Solution |
225
+ |---------|---------|----------|
226
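+ | Post hoc power analysis | Power computed from the observed effect is a direct function of the p-value, so it adds no information | Run an a priori power analysis before data collection; afterwards, report a sensitivity analysis instead |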
227
+ | Using Cohen's "medium" by default | May be unrealistic for your field | Base on literature or SESOI |
228
+ | Ignoring attrition | Actual N may be lower than planned | Inflate N by 10-20% for expected dropout |
229
+ | Forgetting multiple comparisons | Corrections such as Bonferroni lower the per-test alpha, which reduces power | Use the corrected alpha in the power calculation |
230
+ | Not reporting power analysis | Reviewers cannot evaluate adequacy | Always report in Methods section |
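
The attrition and multiple-comparison rows above reduce to simple arithmetic. The sketch below is one way to do it, with hypothetical helper names. It divides by the expected retention rate rather than multiplying by (1 + dropout), which is an assumption on our part: it is slightly more conservative and keeps the expected number of completers at or above the required N.

```python
import math


def inflate_for_attrition(n_required: int, attrition_rate: float) -> int:
    """Number to recruit so that the expected completers still reach n_required."""
    if not 0 <= attrition_rate < 1:
        raise ValueError("attrition_rate must be in [0, 1)")
    return math.ceil(n_required / (1 - attrition_rate))


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test alpha to plug into the power calculation under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Example: 128 participants required in total, 15% expected dropout, 5 planned tests
print(inflate_for_attrition(128, 0.15))
print(bonferroni_alpha(0.05, 5))
```

Because the corrected alpha is smaller, rerun the sample size calculation with it before applying the attrition adjustment.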
231
+
232
+ ### Reporting Template
233
+
234
+ ```
235
+ A priori power analysis was conducted using [G*Power 3.1 / statsmodels / R pwr].
236
+ For a [test name] with an expected effect size of [d/r/f = X] (based on
237
+ [source: previous study / meta-analysis / pilot data]), alpha = .05, and
238
+ power = .80, the required sample size was [N per group / total N]. To account
239
+ for an estimated [X]% attrition rate, we recruited [final N] participants.
240
+ ```
@@ -0,0 +1,231 @@
1
+ ---
2
+ name: sem-guide
3
+ description: "Structural equation modeling with latent variables guide"
4
+ metadata:
5
+ openclaw:
6
+ emoji: "network"
7
+ category: "analysis"
8
+ subcategory: "statistics"
9
+ keywords: ["structural equation modeling", "SEM", "latent variable model", "multilevel model"]
10
+ source: "wentor-research-plugins"
11
+ ---
12
+
13
+ # Structural Equation Modeling Guide
14
+
15
+ Build, estimate, and evaluate structural equation models (SEM) with latent variables using Python (semopy) and R (lavaan), including confirmatory factor analysis and path analysis.
16
+
17
+ ## What Is SEM?
18
+
19
+ Structural Equation Modeling is a multivariate statistical framework that combines factor analysis and path analysis to test complex theoretical models involving:
20
+
21
+ - **Observed (manifest) variables**: Directly measured (e.g., survey items, test scores)
22
+ - **Latent (unobserved) variables**: Theoretical constructs measured indirectly through observed indicators (e.g., "motivation," "intelligence")
23
+ - **Structural paths**: Directional relationships between variables (regression-like)
24
+ - **Measurement model**: How latent variables relate to their indicators (CFA)
25
+ - **Structural model**: How latent variables relate to each other (path analysis)
26
+
27
+ ## SEM Components
28
+
29
+ | Component | Description | Diagram Symbol |
30
+ |-----------|-------------|---------------|
31
+ | Observed variable | Measured directly | Rectangle |
32
+ | Latent variable | Inferred from indicators | Oval/circle |
33
+ | Regression path | Directional relationship | Single-headed arrow |
34
+ | Covariance | Non-directional association | Double-headed arrow |
35
+ | Error/residual | Unexplained variance | Small circle with arrow |
36
+
37
+ ## Step 1: Confirmatory Factor Analysis (CFA)
38
+
39
+ CFA tests whether observed variables load onto hypothesized latent factors.
40
+
41
+ ### In R (lavaan)
42
+
43
+ ```r
44
+ library(lavaan)
45
+
46
+ # Define the measurement model
47
+ # =~ means "is measured by"
48
+ cfa_model <- '
49
+ # Latent variable definitions
50
+ Motivation =~ mot1 + mot2 + mot3 + mot4
51
+ SelfEfficacy =~ se1 + se2 + se3
52
+ Performance =~ perf1 + perf2 + perf3 + perf4
53
+
54
+ # Covariances between latent variables (estimated by default in CFA)
55
+ '
56
+
57
+ # Fit the model
58
+ fit <- cfa(cfa_model, data = mydata, estimator = "MLR")
59
+
60
+ # View results
61
+ summary(fit, fit.measures = TRUE, standardized = TRUE)
62
+
63
+ # Key output to examine:
64
+ # - Factor loadings (standardized > 0.5 is desirable)
65
+ # - Model fit indices (see table below)
66
+ # - Modification indices (for model improvement)
67
+ modindices(fit, sort = TRUE, minimum.value = 10)
68
+ ```
69
+
70
+ ### In Python (semopy)
71
+
72
+ ```python
73
+ import semopy
74
+ import pandas as pd
75
+
76
+ # Define model in lavaan-like syntax
77
+ model_spec = """
78
+ Motivation =~ mot1 + mot2 + mot3 + mot4
79
+ SelfEfficacy =~ se1 + se2 + se3
80
+ Performance =~ perf1 + perf2 + perf3 + perf4
81
+ """
82
+
83
+ # Fit the model
84
+ model = semopy.Model(model_spec)
85
+ result = model.fit(data) # data: a pandas DataFrame whose column names match the indicators above
86
+
87
+ # View parameter estimates
88
+ print(model.inspect())
89
+
90
+ # Get fit statistics
91
+ stats = semopy.calc_stats(model)
92
+ print(stats.T)
93
+ ```
94
+
95
+ ## Step 2: Full Structural Model
96
+
97
+ After confirming the measurement model, add structural (regression) paths.
98
+
99
+ ### In R (lavaan)
100
+
101
+ ```r
102
+ sem_model <- '
103
+ # Measurement model
104
+ Motivation =~ mot1 + mot2 + mot3 + mot4
105
+ SelfEfficacy =~ se1 + se2 + se3
106
+ Performance =~ perf1 + perf2 + perf3 + perf4
107
+
108
+ # Structural model (regressions)
109
+ # ~ means "is regressed on"
110
+ Performance ~ Motivation + SelfEfficacy
111
+ SelfEfficacy ~ Motivation
112
+
113
+ # Optional: define an indirect effect (requires labeled paths, e.g. a*Motivation and b*SelfEfficacy)
114
+ # indirect := a * b
115
+ '
116
+
117
+ fit <- sem(sem_model, data = mydata, estimator = "MLR")
118
+ summary(fit, fit.measures = TRUE, standardized = TRUE, rsquare = TRUE)
119
+ ```
120
+
121
+ ### Mediation Analysis
122
+
123
+ ```r
124
+ mediation_model <- '
125
+ # Measurement model
126
+ X =~ x1 + x2 + x3
127
+ M =~ m1 + m2 + m3
128
+ Y =~ y1 + y2 + y3
129
+
130
+ # Structural model
131
+ M ~ a*X # a path
132
+ Y ~ b*M + c*X # b path + direct effect c
133
+
134
+ # Define indirect and total effects
135
+ indirect := a * b
136
+ total := c + a * b
137
+ '
138
+
139
+ fit <- sem(mediation_model, data = mydata, se = "bootstrap", bootstrap = 1000)
140
+ summary(fit, standardized = TRUE)
141
+
142
+ # Bootstrap confidence intervals for indirect effect
143
+ parameterEstimates(fit, boot.ci.type = "bca.simple", standardized = TRUE)
144
+ ```
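
The `:=` definitions in the mediation model are plain arithmetic on the path coefficients. A minimal sketch of that arithmetic, assuming made-up coefficient values (they are not estimates from any dataset) and a hypothetical helper name:

```python
def mediation_effects(a: float, b: float, c: float) -> dict:
    """Indirect, direct, and total effects from the a, b, and c path coefficients."""
    indirect = a * b
    total = c + indirect
    return {
        "indirect": indirect,
        "direct": c,
        "total": total,
        # Proportion mediated is undefined when the total effect is zero
        "proportion_mediated": indirect / total if total != 0 else float("nan"),
    }


# Hypothetical coefficients, for illustration only
print(mediation_effects(a=0.5, b=0.4, c=0.3))
```

Point estimates alone are not enough for inference on the indirect effect, which is why the lavaan call above requests bootstrap standard errors.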
145
+
146
+ ## Model Fit Assessment
147
+
148
+ ### Fit Index Reference Table
149
+
150
+ | Index | Good Fit | Acceptable | What It Measures |
151
+ |-------|----------|------------|-----------------|
152
+ | Chi-square (p) | p > 0.05 | -- (sensitive to N; interpret alongside other indices) | Exact fit test |
153
+ | Chi-square/df | < 2 | < 3 | Parsimony-adjusted exact fit |
154
+ | CFI | > 0.95 | > 0.90 | Comparative fit vs. null model |
155
+ | TLI | > 0.95 | > 0.90 | Comparative fit vs. null model, penalized for model complexity |
156
+ | RMSEA | < 0.06 | < 0.08 | Approximate fit per df |
157
+ | SRMR | < 0.08 | < 0.10 | Average standardized residual between observed and model-implied correlations |
158
+ | AIC/BIC | Lower = better | -- | Model comparison (not absolute) |
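
As a rough illustration of how the cutoffs in this table can be applied, the following Python sketch labels each index as good, acceptable, or poor. The function name `classify_fit` and the example values are hypothetical, and the cutoffs are the conventional rules of thumb from the table, not strict tests.

```python
def classify_fit(cfi: float, tli: float, rmsea: float, srmr: float) -> dict[str, str]:
    """Label each index 'good', 'acceptable', or 'poor' using the table's cutoffs."""
    # (good cutoff, acceptable cutoff, True if higher values mean better fit)
    cutoffs = {
        "cfi": (0.95, 0.90, True),
        "tli": (0.95, 0.90, True),
        "rmsea": (0.06, 0.08, False),
        "srmr": (0.08, 0.10, False),
    }
    values = {"cfi": cfi, "tli": tli, "rmsea": rmsea, "srmr": srmr}
    labels = {}
    for name, (good, acceptable, higher_is_better) in cutoffs.items():
        v = values[name]
        if higher_is_better:
            labels[name] = "good" if v > good else "acceptable" if v > acceptable else "poor"
        else:
            labels[name] = "good" if v < good else "acceptable" if v < acceptable else "poor"
    return labels


# Hypothetical fit values for illustration
example = classify_fit(cfi=0.96, tli=0.93, rmsea=0.07, srmr=0.11)
print(example)
```

Mixed results like these are common, which is why the indices are usually reported together rather than judged one at a time.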
159
+
160
+ ### Interpreting Fit
161
+
162
+ ```r
163
+ # Extract fit measures in lavaan
164
+ fitMeasures(fit, c("chisq", "df", "pvalue", "cfi", "tli", "rmsea",
165
+ "rmsea.ci.lower", "rmsea.ci.upper", "srmr"))
166
+ ```
167
+
168
+ **Reporting template:**
169
+ ```
170
+ The structural equation model demonstrated adequate fit to the data:
171
+ chi-square(df) = X.XX, p = .XXX; CFI = .XX; TLI = .XX; RMSEA = .XXX
172
+ [90% CI: .XXX, .XXX]; SRMR = .XXX.
173
+ ```
174
+
175
+ ## Model Modification and Comparison
176
+
177
+ ### Modification Indices
178
+
179
+ ```r
180
+ # Show top modification indices
181
+ mi <- modindices(fit, sort = TRUE)
182
+ head(mi, 10)
183
+
184
+ # Common modifications:
185
+ # - Allow error covariances between similarly-worded items
186
+ # - Add cross-loadings (if theoretically justified)
187
+ # - Remove non-significant paths (judged from Wald tests, not from modification indices)
188
+ ```
189
+
190
+ ### Model Comparison
191
+
192
+ ```r
193
+ # Compare nested models using chi-square difference test
194
+ fit1 <- sem(model1, data = mydata) # More constrained
195
+ fit2 <- sem(model2, data = mydata) # Less constrained
196
+
197
+ anova(fit1, fit2) # Chi-square difference test
198
+
199
+ # For non-nested models, compare AIC/BIC
200
+ fitMeasures(fit1, c("aic", "bic"))
201
+ fitMeasures(fit2, c("aic", "bic"))
202
+ ```
203
+
204
+ ## Common Pitfalls
205
+
206
+ | Issue | Problem | Solution |
207
+ |-------|---------|----------|
208
+ | Small sample size | Unstable estimates, poor fit | Aim for N >= 200, or 10-20 observations per estimated parameter (rules of thumb) |
209
+ | Too many parameters | Overfitting, non-convergence | Simplify model, use parceling |
210
+ | Non-normal data | Biased standard errors | Use MLR estimator or bootstrapping |
211
+ | Ignoring missing data | Biased results | Use FIML (full information maximum likelihood) |
212
+ | Data-driven respecification | Capitalizing on chance | Cross-validate with holdout sample |
213
+ | Conflating fit with truth | Good fit does not mean correct model | Consider equivalent/alternative models |
214
+
215
+ ## Assumptions and Diagnostics
216
+
217
+ 1. **Multivariate normality**: Check with Mardia's test; use robust estimators (MLR) if violated
218
+ 2. **Linearity**: SEM assumes linear relationships between variables
219
+ 3. **No extreme multicollinearity**: Correlations between latent variables above about 0.85 suggest the constructs are not empirically distinct
220
+ 4. **Sufficient sample size**: Rule of thumb: N >= 200 or 10-20 observations per estimated parameter
221
+ 5. **Correct model specification**: Omitted variables can bias all estimates
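
The sample-size rule of thumb depends on the number of estimated parameters, which can be counted by hand. The sketch below does this for the three-factor measurement model used earlier (4, 3, and 4 indicators). It assumes a simple-structure CFA with correlated factors, one loading per factor fixed to 1, and no mean structure; the function names are hypothetical.

```python
def cfa_free_parameters(indicators_per_factor: list[int]) -> int:
    """Count free parameters in a simple-structure CFA with correlated factors."""
    k = len(indicators_per_factor)          # number of latent factors
    p = sum(indicators_per_factor)          # number of observed indicators
    loadings = p - k                        # one loading per factor is fixed to 1
    residual_variances = p
    factor_variances = k
    factor_covariances = k * (k - 1) // 2
    return loadings + residual_variances + factor_variances + factor_covariances


def model_df(n_indicators: int, n_free: int) -> int:
    """Degrees of freedom: unique variances/covariances minus free parameters."""
    return n_indicators * (n_indicators + 1) // 2 - n_free


# Motivation (4 items), SelfEfficacy (3 items), Performance (4 items)
n_free = cfa_free_parameters([4, 3, 4])
df = model_df(11, n_free)
n_range = (10 * n_free, 20 * n_free)        # 10-20 observations per parameter
print(f"free parameters = {n_free}, df = {df}, suggested N = {n_range[0]}-{n_range[1]}")
```

Under these assumptions the model has 25 free parameters and 41 degrees of freedom, so the per-parameter rule suggests roughly 250 to 500 observations.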
222
+
223
+ ```r
224
+ # Check multivariate normality
225
+ library(MVN)
226
+ mvn(mydata[, c("mot1", "mot2", "mot3", "se1", "se2", "se3")],
227
+ mvnTest = "mardia")
228
+
229
+ # Use robust estimation if non-normal
230
+ fit_robust <- sem(sem_model, data = mydata, estimator = "MLR")
231
+ ```
@@ -0,0 +1,195 @@
1
+ ---
2
+ name: survival-analysis-guide
3
+ description: "Conduct Kaplan-Meier, Cox regression, and time-to-event analyses"
4
+ metadata:
5
+ openclaw:
6
+ emoji: "hourglass_flowing_sand"
7
+ category: "analysis"
8
+ subcategory: "statistics"
9
+ keywords: ["survival analysis", "Kaplan-Meier", "Cox regression", "time-to-event", "hazard ratio", "censoring"]
10
+ source: "wentor-research-plugins"
11
+ ---
12
+
13
+ # Survival Analysis Guide
14
+
15
+ A skill for conducting time-to-event analyses including Kaplan-Meier estimation, log-rank tests, and Cox proportional hazards regression. Covers censoring concepts, assumption checking, and reporting standards for clinical and social science research.
16
+
17
+ ## Core Concepts
18
+
19
+ ### What Is Survival Analysis?
20
+
21
+ Survival analysis studies the time until an event of interest occurs. Despite the name, the "event" need not be death -- it can be any well-defined transition:
22
+
23
+ ```
24
+ Medical: Time to disease recurrence, death, or recovery
25
+ Engineering: Time to equipment failure
26
+ Social: Time to job termination, divorce, or graduation
27
+ Business: Time to customer churn or first purchase
28
+ Ecology: Time to species extinction in a habitat
29
+ ```
30
+
31
+ ### Censoring
32
+
33
+ ```
34
+ Right censoring (most common):
35
+ The event has not occurred by the end of the study period.
36
+ Example: Patient is still alive at study end.
37
+ The survival time is "at least T" -- we know T but not the true event time.
38
+
39
+ Left censoring:
40
+ The event occurred before the observation period began, so its exact time is unknown.
41
+ Example: HIV infection detected, but seroconversion happened before testing.
42
+
43
+ Interval censoring:
44
+ The event occurred between two observation times.
45
+ Example: A patient tests negative at visit 3 and positive at visit 4.
46
+ ```
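
In practice, right censoring is encoded as a pair of values per subject: the observed duration and an event indicator. The sketch below shows one way to build that pair from dates. The function name `to_time_and_event` and the dates are hypothetical, and it assumes every subject is followed until the event or the study end (no loss to follow-up).

```python
from datetime import date
from typing import Optional


def to_time_and_event(start: date, event_date: Optional[date], study_end: date) -> tuple[int, int]:
    """Return (days observed, event indicator) under right censoring at study end."""
    if event_date is not None and event_date <= study_end:
        return (event_date - start).days, 1   # event observed
    # Right-censored: the true event time is at least this long
    return (study_end - start).days, 0


study_end = date(2024, 12, 31)
records = [
    to_time_and_event(date(2024, 1, 1), date(2024, 3, 1), study_end),  # event on 1 March
    to_time_and_event(date(2024, 1, 1), None, study_end),              # still event-free
]
print(records)
```

The resulting durations and indicators are the same `times` and `events` inputs that the estimators in the following sections expect.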
47
+
48
+ ## Kaplan-Meier Estimation
49
+
50
+ ### Computing the Survival Curve
51
+
52
+ ```python
53
+ import numpy as np
54
+
55
+
56
+ def kaplan_meier(times: list[float], events: list[int]) -> dict:
57
+ """
58
+ Compute Kaplan-Meier survival estimates.
59
+
60
+ Args:
61
+ times: Observed times (event or censoring time)
62
+ events: Event indicator (1 = event occurred, 0 = censored)
63
+
64
+ Returns:
65
+ Dict with time points and survival probabilities
66
+ """
67
+ data = sorted(zip(times, events), key=lambda x: x[0])
68
+ n = len(data)
69
+
70
+ unique_event_times = sorted(set(t for t, e in data if e == 1))
71
+ survival = 1.0
72
+ results = {"time": [0], "survival": [1.0]}
73
+
74
+ at_risk = n
75
+ idx = 0
76
+
77
+ for t_event in unique_event_times:
78
+ # Count censored before this event time
79
+ while idx < n and data[idx][0] < t_event:
80
+ if data[idx][1] == 0:
81
+ at_risk -= 1
82
+ idx += 1
83
+
84
+ # Count events at this time
85
+ d = sum(1 for t, e in data if t == t_event and e == 1)
86
+ c = sum(1 for t, e in data if t == t_event and e == 0)
87
+
88
+ survival *= (at_risk - d) / at_risk
89
+ results["time"].append(t_event)
90
+ results["survival"].append(survival)
91
+
92
+ at_risk -= (d + c)
93
+ idx = max(idx, sum(1 for t, _ in data if t <= t_event))
94
+
95
+ return results
96
+ ```
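
The function above implements the product-limit estimator. Written out, with $t_i$ the distinct event times, $d_i$ the number of events at $t_i$, and $n_i$ the number still at risk just before $t_i$ (this notation is introduced here for illustration):

$$
\hat{S}(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right)
$$

As a small worked example, take five observations with times 2, 3+, 4, 4, 6+ (where + marks a censored time). At $t = 2$, five subjects are at risk and one event occurs; at $t = 4$, three remain at risk (one event and one censoring have been removed) and two events occur:

$$
\hat{S}(2) = 1 - \frac{1}{5} = 0.8, \qquad \hat{S}(4) = 0.8 \times \left(1 - \frac{2}{3}\right) \approx 0.267
$$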
97
+
98
+ ### Using lifelines in Python
99
+
100
+ ```python
101
+ from lifelines import KaplanMeierFitter
102
+
103
+ kmf = KaplanMeierFitter()
104
+ kmf.fit(durations=time_column, event_observed=event_column, label="Overall")
105
+
106
+ # Plot the survival curve
107
+ kmf.plot_survival_function()
108
+
109
+ # Median survival time
110
+ print(f"Median survival: {kmf.median_survival_time_}")
111
+
112
+ # Survival probability at a specific time (t = 5; assumes durations are measured in years)
113
+ print(f"5-year survival: {kmf.predict(5.0):.3f}")
114
+ ```
115
+
116
+ ## Log-Rank Test
117
+
118
+ ### Comparing Survival Between Groups
119
+
120
+ ```python
121
+ from lifelines.statistics import logrank_test
122
+
123
+ results = logrank_test(
124
+ durations_A=group_a_times,
125
+ durations_B=group_b_times,
126
+ event_observed_A=group_a_events,
127
+ event_observed_B=group_b_events
128
+ )
129
+
130
+ print(f"Test statistic: {results.test_statistic:.3f}")
131
+ print(f"p-value: {results.p_value:.4f}")
132
+ ```
133
+
134
+ The log-rank test is the standard method for comparing two or more survival curves. It tests the null hypothesis that the survival functions are identical. It is most powerful when hazards are proportional (consistent relative risk over time) and loses power when the survival curves cross.
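
For intuition about what the test computes, here is a minimal pure-Python sketch of the two-group log-rank statistic. It is not the lifelines implementation: it assumes exactly two groups, no weighting, and at least one event time where both groups contribute to the variance. The function name `logrank_two_groups` and the data are hypothetical.

```python
import math


def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-sample log-rank test; returns (chi-square statistic, p-value) with 1 df."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e == 1})

    observed_minus_expected = 0.0
    variance = 0.0
    for t_j in event_times:
        n_a = sum(1 for t, _, g in pooled if t >= t_j and g == 0)  # at risk in group A
        n_b = sum(1 for t, _, g in pooled if t >= t_j and g == 1)  # at risk in group B
        d_a = sum(1 for t, e, g in pooled if t == t_j and e == 1 and g == 0)
        d_b = sum(1 for t, e, g in pooled if t == t_j and e == 1 and g == 1)
        n, d = n_a + n_b, d_a + d_b
        # Observed events in A minus the number expected if both groups shared one hazard
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of d_a at this event time
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    statistic = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(statistic / 2))  # chi-square upper tail, 1 df
    return statistic, p_value


# Hypothetical data: group B fails earlier than group A
stat, p = logrank_two_groups([6, 7, 10, 15], [1, 1, 0, 1], [1, 2, 3, 4], [1, 1, 1, 1])
print(f"chi-square = {stat:.3f}, p = {p:.4f}")
```

At each event time the statistic compares the events observed in one group with the number expected under a common hazard, which is why it is most sensitive to differences that persist in the same direction over time.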
135
+
136
+ ## Cox Proportional Hazards Regression
137
+
138
+ ### Model Fitting
139
+
140
+ ```python
141
+ from lifelines import CoxPHFitter
142
+ import pandas as pd
143
+
144
+ cph = CoxPHFitter()
145
+ cph.fit(
146
+ df,
147
+ duration_col="time",
148
+ event_col="event",
149
+ formula="age + treatment + stage"
150
+ )
151
+
152
+ cph.print_summary()
153
+
154
+ # Hazard ratios
155
+ print(cph.summary[["exp(coef)", "exp(coef) lower 95%", "exp(coef) upper 95%", "p"]])
156
+ ```
157
+
158
+ ### Interpreting Hazard Ratios
159
+
160
+ ```
161
+ Hazard Ratio (HR) = exp(coefficient)
162
+
163
+ HR = 1.0 No effect
164
+ HR > 1.0 Increased hazard (worse survival)
165
+ HR < 1.0 Decreased hazard (better survival)
166
+
167
+ Example output:
168
+ treatment: HR = 0.65, 95% CI [0.48, 0.88], p = 0.005
169
+ Interpretation: At any given time, the treatment group has a 35% lower hazard of the
170
+ event than the control group, holding the other covariates constant.
171
+ ```
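
The conversion from a Cox coefficient to a hazard ratio and its confidence interval can be sketched in a few lines of Python. The function name `hazard_ratio_summary` is hypothetical, and the standard error of 0.155 is an assumed value chosen so that the result roughly matches the example output above.

```python
import math


def hazard_ratio_summary(coef: float, se: float, z: float = 1.96) -> dict[str, float]:
    """Convert a log-hazard coefficient and its standard error to HR and a Wald CI."""
    hr = math.exp(coef)
    return {
        "hr": hr,
        "ci_lower": math.exp(coef - z * se),   # CI is built on the log scale
        "ci_upper": math.exp(coef + z * se),
        "percent_change_in_hazard": (hr - 1.0) * 100.0,
    }


# Assumed inputs for illustration: coefficient = log(0.65), standard error = 0.155
hr_summary = hazard_ratio_summary(coef=math.log(0.65), se=0.155)
print({name: round(value, 2) for name, value in hr_summary.items()})
```

Because the interval is computed on the log scale and then exponentiated, it is not symmetric around the hazard ratio.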
172
+
173
+ ### Checking the Proportional Hazards Assumption
174
+
175
+ ```python
176
+ # Schoenfeld residuals test
177
+ cph.check_assumptions(df, p_value_threshold=0.05, show_plots=True)
178
+ ```
179
+
180
+ If the proportional hazards assumption is violated, consider stratified Cox models, time-varying covariates, or accelerated failure time (AFT) models.
181
+
182
+ ## Reporting Standards
183
+
184
+ ### STROBE-style Reporting for Survival Analyses
185
+
186
+ ```
187
+ 1. Report number of events and total person-time at risk
188
+ 2. Present Kaplan-Meier curves with number-at-risk tables
189
+ 3. Report median survival with 95% confidence intervals
190
+ 4. Report hazard ratios with 95% CIs and p-values
191
+ 5. State which covariates were included in adjusted models
192
+ 6. Report proportional hazards assumption test results
193
+ 7. Specify the handling of tied event times (e.g., Efron or Breslow)
194
+ 8. Note any competing risks and how they were handled
195
+ ```
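
The first reporting item (number of events and total person-time at risk) can be computed directly from the duration and event columns. A minimal sketch, assuming one row per subject and a hypothetical helper named `person_time_summary`:

```python
def person_time_summary(times: list[float], events: list[int]) -> dict[str, float]:
    """Summarize events, person-time at risk, and the crude event rate."""
    n_events = sum(events)
    person_time = sum(times)                 # each subject contributes observed time
    return {
        "n_subjects": len(times),
        "n_events": n_events,
        "person_time": person_time,
        "event_rate": n_events / person_time,  # events per unit of person-time
    }


# Hypothetical follow-up times (years) and event indicators
report = person_time_summary([2.0, 3.5, 1.0, 5.0], [1, 0, 1, 0])
print(report)
```

The time unit of the rate follows the unit of the durations, so it is worth stating explicitly in the report.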