@wentorai/research-plugins 1.0.0 → 1.1.0

Files changed (203)
  1. package/README.md +22 -22
  2. package/curated/analysis/README.md +71 -56
  3. package/curated/domains/README.md +176 -67
  4. package/curated/literature/README.md +71 -47
  5. package/curated/research/README.md +91 -58
  6. package/curated/tools/README.md +88 -87
  7. package/curated/writing/README.md +80 -45
  8. package/mcp-configs/cloud-docs/confluence-mcp.json +37 -0
  9. package/mcp-configs/cloud-docs/google-drive-mcp.json +35 -0
  10. package/mcp-configs/cloud-docs/notion-mcp.json +29 -0
  11. package/mcp-configs/communication/discord-mcp.json +29 -0
  12. package/mcp-configs/communication/slack-mcp.json +29 -0
  13. package/mcp-configs/communication/telegram-mcp.json +28 -0
  14. package/mcp-configs/database/neo4j-mcp.json +37 -0
  15. package/mcp-configs/database/postgres-mcp.json +28 -0
  16. package/mcp-configs/database/sqlite-mcp.json +29 -0
  17. package/mcp-configs/dev-platform/github-mcp.json +31 -0
  18. package/mcp-configs/dev-platform/gitlab-mcp.json +34 -0
  19. package/mcp-configs/email/email-mcp.json +40 -0
  20. package/mcp-configs/email/gmail-mcp.json +37 -0
  21. package/mcp-configs/registry.json +178 -149
  22. package/mcp-configs/repository/dataverse-mcp.json +33 -0
  23. package/mcp-configs/repository/huggingface-mcp.json +29 -0
  24. package/openclaw.plugin.json +2 -2
  25. package/package.json +2 -2
  26. package/skills/analysis/dataviz/algorithm-visualizer-guide/SKILL.md +259 -0
  27. package/skills/analysis/dataviz/bokeh-visualization-guide/SKILL.md +270 -0
  28. package/skills/analysis/dataviz/chart-image-generator/SKILL.md +229 -0
  29. package/skills/analysis/dataviz/d3-visualization-guide/SKILL.md +281 -0
  30. package/skills/analysis/dataviz/echarts-visualization-guide/SKILL.md +250 -0
  31. package/skills/analysis/dataviz/metabase-analytics-guide/SKILL.md +242 -0
  32. package/skills/analysis/dataviz/plotly-interactive-guide/SKILL.md +266 -0
  33. package/skills/analysis/dataviz/redash-analytics-guide/SKILL.md +284 -0
  34. package/skills/analysis/econometrics/econml-causal-guide/SKILL.md +163 -0
  35. package/skills/analysis/econometrics/mostly-harmless-guide/SKILL.md +139 -0
  36. package/skills/analysis/econometrics/panel-data-analyst/SKILL.md +259 -0
  37. package/skills/analysis/econometrics/python-causality-guide/SKILL.md +134 -0
  38. package/skills/analysis/econometrics/stata-accounting-guide/SKILL.md +269 -0
  39. package/skills/analysis/econometrics/stata-analyst-guide/SKILL.md +245 -0
  40. package/skills/analysis/statistics/data-anomaly-detection/SKILL.md +157 -0
  41. package/skills/analysis/statistics/ml-experiment-tracker/SKILL.md +212 -0
  42. package/skills/analysis/statistics/pywayne-statistics-guide/SKILL.md +192 -0
  43. package/skills/analysis/statistics/quantitative-methods-guide/SKILL.md +193 -0
  44. package/skills/analysis/statistics/senior-data-scientist-guide/SKILL.md +223 -0
  45. package/skills/analysis/wrangling/csv-data-analyzer/SKILL.md +170 -0
  46. package/skills/analysis/wrangling/data-cleaning-pipeline/SKILL.md +266 -0
  47. package/skills/analysis/wrangling/data-cog-guide/SKILL.md +178 -0
  48. package/skills/analysis/wrangling/stata-data-cleaning/SKILL.md +276 -0
  49. package/skills/analysis/wrangling/survey-data-processing/SKILL.md +298 -0
  50. package/skills/domains/ai-ml/ai-model-benchmarking/SKILL.md +209 -0
  51. package/skills/domains/ai-ml/annotated-dl-papers-guide/SKILL.md +159 -0
  52. package/skills/domains/ai-ml/dl-transformer-finetune/SKILL.md +239 -0
  53. package/skills/domains/ai-ml/generative-ai-guide/SKILL.md +146 -0
  54. package/skills/domains/ai-ml/huggingface-inference-guide/SKILL.md +196 -0
  55. package/skills/domains/ai-ml/keras-deep-learning/SKILL.md +210 -0
  56. package/skills/domains/ai-ml/llm-from-scratch-guide/SKILL.md +124 -0
  57. package/skills/domains/ai-ml/ml-pipeline-guide/SKILL.md +295 -0
  58. package/skills/domains/ai-ml/nlp-toolkit-guide/SKILL.md +247 -0
  59. package/skills/domains/ai-ml/pytorch-guide/SKILL.md +281 -0
  60. package/skills/domains/ai-ml/pytorch-lightning-guide/SKILL.md +244 -0
  61. package/skills/domains/ai-ml/tensorflow-guide/SKILL.md +241 -0
  62. package/skills/domains/biomedical/bioagents-guide/SKILL.md +308 -0
  63. package/skills/domains/biomedical/medgeclaw-guide/SKILL.md +345 -0
  64. package/skills/domains/biomedical/medical-imaging-guide/SKILL.md +305 -0
  65. package/skills/domains/business/architecture-design-guide/SKILL.md +279 -0
  66. package/skills/domains/business/innovation-management-guide/SKILL.md +257 -0
  67. package/skills/domains/business/operations-research-guide/SKILL.md +258 -0
  68. package/skills/domains/chemistry/molecular-dynamics-guide/SKILL.md +237 -0
  69. package/skills/domains/chemistry/pubchem-api-guide/SKILL.md +180 -0
  70. package/skills/domains/chemistry/spectroscopy-analysis-guide/SKILL.md +290 -0
  71. package/skills/domains/cs/distributed-systems-guide/SKILL.md +268 -0
  72. package/skills/domains/cs/formal-verification-guide/SKILL.md +298 -0
  73. package/skills/domains/ecology/species-distribution-guide/SKILL.md +343 -0
  74. package/skills/domains/economics/imf-data-api-guide/SKILL.md +174 -0
  75. package/skills/domains/economics/post-labor-economics/SKILL.md +254 -0
  76. package/skills/domains/economics/pricing-psychology-guide/SKILL.md +273 -0
  77. package/skills/domains/economics/world-bank-data-guide/SKILL.md +179 -0
  78. package/skills/domains/education/assessment-design-guide/SKILL.md +213 -0
  79. package/skills/domains/education/educational-research-methods/SKILL.md +179 -0
  80. package/skills/domains/education/mooc-analytics-guide/SKILL.md +206 -0
  81. package/skills/domains/finance/portfolio-optimization-guide/SKILL.md +279 -0
  82. package/skills/domains/finance/risk-modeling-guide/SKILL.md +260 -0
  83. package/skills/domains/finance/stata-accounting-research/SKILL.md +372 -0
  84. package/skills/domains/geoscience/climate-modeling-guide/SKILL.md +215 -0
  85. package/skills/domains/geoscience/satellite-remote-sensing/SKILL.md +193 -0
  86. package/skills/domains/geoscience/seismology-data-guide/SKILL.md +208 -0
  87. package/skills/domains/humanities/ethical-philosophy-guide/SKILL.md +244 -0
  88. package/skills/domains/humanities/history-research-guide/SKILL.md +260 -0
  89. package/skills/domains/humanities/political-history-guide/SKILL.md +241 -0
  90. package/skills/domains/law/legal-nlp-guide/SKILL.md +236 -0
  91. package/skills/domains/law/patent-analysis-guide/SKILL.md +257 -0
  92. package/skills/domains/law/regulatory-compliance-guide/SKILL.md +267 -0
  93. package/skills/domains/math/symbolic-computation-guide/SKILL.md +263 -0
  94. package/skills/domains/math/topology-data-analysis/SKILL.md +305 -0
  95. package/skills/domains/pharma/clinical-trial-design-guide/SKILL.md +271 -0
  96. package/skills/domains/pharma/drug-target-interaction/SKILL.md +242 -0
  97. package/skills/domains/pharma/pharmacovigilance-guide/SKILL.md +216 -0
  98. package/skills/domains/physics/astrophysics-data-guide/SKILL.md +305 -0
  99. package/skills/domains/physics/particle-physics-guide/SKILL.md +287 -0
  100. package/skills/domains/social-science/network-analysis-guide/SKILL.md +310 -0
  101. package/skills/domains/social-science/psychology-research-guide/SKILL.md +270 -0
  102. package/skills/domains/social-science/sociology-research-guide/SKILL.md +238 -0
  103. package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +120 -0
  104. package/skills/literature/discovery/semantic-paper-radar/SKILL.md +144 -0
  105. package/skills/literature/discovery/zotero-arxiv-daily-guide/SKILL.md +94 -0
  106. package/skills/literature/fulltext/core-api-guide/SKILL.md +144 -0
  107. package/skills/literature/fulltext/institutional-repository-guide/SKILL.md +212 -0
  108. package/skills/literature/fulltext/open-access-mining-guide/SKILL.md +341 -0
  109. package/skills/literature/metadata/academic-paper-summarizer/SKILL.md +101 -0
  110. package/skills/literature/metadata/wikidata-api-guide/SKILL.md +156 -0
  111. package/skills/literature/search/arxiv-batch-reporting/SKILL.md +133 -0
  112. package/skills/literature/search/arxiv-paper-processor/SKILL.md +141 -0
  113. package/skills/literature/search/baidu-scholar-guide/SKILL.md +110 -0
  114. package/skills/literature/search/chatpaper-guide/SKILL.md +122 -0
  115. package/skills/literature/search/deep-literature-search/SKILL.md +149 -0
  116. package/skills/literature/search/deepgit-search-guide/SKILL.md +147 -0
  117. package/skills/literature/search/pasa-paper-search-guide/SKILL.md +138 -0
  118. package/skills/research/automation/ai-scientist-v2-guide/SKILL.md +284 -0
  119. package/skills/research/automation/aim-experiment-guide/SKILL.md +234 -0
  120. package/skills/research/automation/datagen-research-guide/SKILL.md +131 -0
  121. package/skills/research/automation/kedro-pipeline-guide/SKILL.md +216 -0
  122. package/skills/research/automation/mle-agent-guide/SKILL.md +139 -0
  123. package/skills/research/automation/paper-to-agent-guide/SKILL.md +116 -0
  124. package/skills/research/automation/rd-agent-guide/SKILL.md +246 -0
  125. package/skills/research/automation/research-paper-orchestrator/SKILL.md +254 -0
  126. package/skills/research/deep-research/academic-deep-research/SKILL.md +190 -0
  127. package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +141 -0
  128. package/skills/research/deep-research/deep-research-pro/SKILL.md +213 -0
  129. package/skills/research/deep-research/deep-research-work/SKILL.md +204 -0
  130. package/skills/research/deep-research/deep-searcher-guide/SKILL.md +253 -0
  131. package/skills/research/deep-research/gpt-researcher-guide/SKILL.md +191 -0
  132. package/skills/research/deep-research/khoj-research-guide/SKILL.md +200 -0
  133. package/skills/research/deep-research/local-deep-research-guide/SKILL.md +253 -0
  134. package/skills/research/deep-research/tongyi-deep-research-guide/SKILL.md +217 -0
  135. package/skills/research/funding/eu-horizon-guide/SKILL.md +244 -0
  136. package/skills/research/funding/grant-budget-guide/SKILL.md +284 -0
  137. package/skills/research/funding/nih-reporter-api-guide/SKILL.md +166 -0
  138. package/skills/research/funding/nsf-award-api-guide/SKILL.md +133 -0
  139. package/skills/research/methodology/academic-mentor-guide/SKILL.md +169 -0
  140. package/skills/research/methodology/claude-scientific-guide/SKILL.md +122 -0
  141. package/skills/research/methodology/deep-innovator-guide/SKILL.md +242 -0
  142. package/skills/research/methodology/osf-api-guide/SKILL.md +165 -0
  143. package/skills/research/methodology/research-paper-kb/SKILL.md +263 -0
  144. package/skills/research/methodology/research-town-guide/SKILL.md +263 -0
  145. package/skills/research/paper-review/automated-review-guide/SKILL.md +281 -0
  146. package/skills/research/paper-review/paper-compare-guide/SKILL.md +238 -0
  147. package/skills/research/paper-review/paper-digest-guide/SKILL.md +240 -0
  148. package/skills/research/paper-review/paper-research-assistant/SKILL.md +231 -0
  149. package/skills/research/paper-review/research-quality-filter/SKILL.md +261 -0
  150. package/skills/research/paper-review/review-response-guide/SKILL.md +275 -0
  151. package/skills/tools/code-exec/google-colab-guide/SKILL.md +276 -0
  152. package/skills/tools/code-exec/kaggle-api-guide/SKILL.md +216 -0
  153. package/skills/tools/code-exec/overleaf-cli-guide/SKILL.md +279 -0
  154. package/skills/tools/diagram/code-flow-visualizer/SKILL.md +197 -0
  155. package/skills/tools/diagram/excalidraw-diagram-guide/SKILL.md +170 -0
  156. package/skills/tools/diagram/json-data-visualizer/SKILL.md +270 -0
  157. package/skills/tools/diagram/mermaid-architect-guide/SKILL.md +219 -0
  158. package/skills/tools/diagram/tldraw-whiteboard-guide/SKILL.md +397 -0
  159. package/skills/tools/document/docsgpt-guide/SKILL.md +130 -0
  160. package/skills/tools/document/large-document-reader/SKILL.md +202 -0
  161. package/skills/tools/document/paper-parse-guide/SKILL.md +243 -0
  162. package/skills/tools/knowledge-graph/citation-network-builder/SKILL.md +244 -0
  163. package/skills/tools/knowledge-graph/concept-map-generator/SKILL.md +284 -0
  164. package/skills/tools/knowledge-graph/graphiti-guide/SKILL.md +219 -0
  165. package/skills/tools/ocr-translate/pdf-math-translate-guide/SKILL.md +141 -0
  166. package/skills/tools/ocr-translate/zotero-pdf-translate-guide/SKILL.md +95 -0
  167. package/skills/tools/ocr-translate/zotero-pdf2zh-guide/SKILL.md +143 -0
  168. package/skills/tools/scraping/dataset-finder-guide/SKILL.md +253 -0
  169. package/skills/tools/scraping/easy-spider-guide/SKILL.md +250 -0
  170. package/skills/tools/scraping/google-scholar-scraper/SKILL.md +255 -0
  171. package/skills/tools/scraping/repository-harvesting-guide/SKILL.md +310 -0
  172. package/skills/writing/citation/academic-citation-manager/SKILL.md +314 -0
  173. package/skills/writing/citation/jabref-reference-guide/SKILL.md +127 -0
  174. package/skills/writing/citation/jasminum-zotero-guide/SKILL.md +103 -0
  175. package/skills/writing/citation/obsidian-citation-guide/SKILL.md +164 -0
  176. package/skills/writing/citation/obsidian-zotero-guide/SKILL.md +137 -0
  177. package/skills/writing/citation/papersgpt-zotero-guide/SKILL.md +132 -0
  178. package/skills/writing/citation/papis-cli-guide/SKILL.md +213 -0
  179. package/skills/writing/citation/zotero-better-bibtex-guide/SKILL.md +107 -0
  180. package/skills/writing/citation/zotero-better-notes-guide/SKILL.md +121 -0
  181. package/skills/writing/citation/zotero-gpt-guide/SKILL.md +111 -0
  182. package/skills/writing/citation/zotero-mcp-guide/SKILL.md +164 -0
  183. package/skills/writing/citation/zotero-mdnotes-guide/SKILL.md +162 -0
  184. package/skills/writing/citation/zotero-reference-guide/SKILL.md +139 -0
  185. package/skills/writing/citation/zotero-scholar-guide/SKILL.md +294 -0
  186. package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +140 -0
  187. package/skills/writing/composition/ml-paper-writing/SKILL.md +163 -0
  188. package/skills/writing/composition/paper-debugger-guide/SKILL.md +143 -0
  189. package/skills/writing/composition/scientific-writing-resources/SKILL.md +151 -0
  190. package/skills/writing/composition/scientific-writing-wrapper/SKILL.md +153 -0
  191. package/skills/writing/latex/latex-drawing-collection/SKILL.md +154 -0
  192. package/skills/writing/latex/latex-templates-collection/SKILL.md +159 -0
  193. package/skills/writing/latex/md-to-pdf-academic/SKILL.md +230 -0
  194. package/skills/writing/latex/tex-render-guide/SKILL.md +243 -0
  195. package/skills/writing/polish/academic-tone-guide/SKILL.md +209 -0
  196. package/skills/writing/polish/conciseness-editing-guide/SKILL.md +225 -0
  197. package/skills/writing/polish/paper-polish-guide/SKILL.md +160 -0
  198. package/skills/writing/templates/graphical-abstract-guide/SKILL.md +183 -0
  199. package/skills/writing/templates/novathesis-guide/SKILL.md +152 -0
  200. package/skills/writing/templates/scientific-article-pdf/SKILL.md +261 -0
  201. package/skills/writing/templates/sjtuthesis-guide/SKILL.md +197 -0
  202. package/skills/writing/templates/thuthesis-guide/SKILL.md +181 -0
  203. package/skills/literature/fulltext/repository-harvesting-guide/SKILL.md +0 -207
@@ -0,0 +1,290 @@
1
+ ---
2
+ name: spectroscopy-analysis-guide
3
+ description: "Spectral data analysis for NMR, IR, mass spectrometry, and UV-Vis"
4
+ metadata:
5
+ openclaw:
6
+ emoji: "microscope"
7
+ category: "domains"
8
+ subcategory: "chemistry"
9
+ keywords: ["spectroscopy", "nmr", "mass-spectrometry", "infrared", "uv-vis", "analytical-chemistry"]
10
+ source: "wentor"
11
+ ---
12
+
13
+ # Spectroscopy Analysis Guide
14
+
15
+ A skill for processing and interpreting spectroscopic data in chemistry research. Covers NMR, IR, mass spectrometry, and UV-Vis spectroscopy including data formats, baseline correction, peak detection, spectral matching, and structure elucidation workflows.
16
+
17
+ ## Spectral Data Formats
18
+
19
+ ### Common File Formats
20
+
21
+ | Format | Spectroscopy | Description |
22
+ |--------|-------------|-------------|
23
+ | JCAMP-DX (.jdx, .dx) | All types | IUPAC standard exchange format |
24
+ | Bruker (1r, fid, acqu) | NMR | Raw and processed Bruker data |
25
+ | mzML / mzXML | MS | Open mass spectrometry format |
26
+ | SPC (.spc) | IR, UV-Vis | Galactic/Thermo spectral format |
27
+ | CSV / TXT | All | Simple x,y pairs (wavelength/wavenumber, intensity) |
28
+
29
+ ### Reading Spectral Data
30
+
+ ```python
+ import numpy as np
+ from scipy.signal import find_peaks, savgol_filter
+
+ def read_jcamp(filepath: str) -> dict:
+     """
+     Read a JCAMP-DX spectral file stored as one "x y" pair per line.
+     Returns x (wavenumber/chemical shift/m/z) and y (intensity) arrays.
+     XFACTOR/YFACTOR scaling and (X++(Y..Y)) rows are not expanded here.
+     """
+     x_data, y_data = [], []
+     metadata = {}
+
+     with open(filepath, "r") as f:
+         for line in f:
+             line = line.strip()
+             if line.startswith("##"):
+                 # Labelled data record of the form "##KEY=value"
+                 key_val = line[2:].split("=", 1)
+                 if len(key_val) == 2:
+                     metadata[key_val[0].strip()] = key_val[1].strip()
+             elif line and not line.startswith("$$"):
+                 # Accept both "x y" and "x, y" separators
+                 parts = line.replace(",", " ").split()
+                 try:
+                     values = [float(v) for v in parts]
+                     if len(values) >= 2:
+                         # Keep one y per x so the two arrays stay aligned
+                         x_data.append(values[0])
+                         y_data.append(values[1])
+                 except ValueError:
+                     continue
+
+     return {
+         "x": np.array(x_data),
+         "y": np.array(y_data),
+         "metadata": metadata,
+     }
+ ```
66
+
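JCAMP-DX files often store ordinates in the `(X++(Y..Y))` layout, where each row carries one abscissa checkpoint followed by several equally spaced Y values. The sketch below shows one way such rows could be expanded, assuming plain whitespace- or comma-separated numbers (no ASDF compression) and that `FIRSTX`, `LASTX`, and `NPOINTS` have already been read from the header. The name `expand_xydata` and its parameters are illustrative and not part of any library.

```python
def expand_xydata(lines, first_x, last_x, n_points):
    """Expand (X++(Y..Y)) rows into aligned x and y lists.

    Assumes uncompressed numeric values. The leading value on each row is
    the X checkpoint; it is skipped because x is rebuilt from the header.
    """
    y = []
    for line in lines:
        values = [float(v) for v in line.replace(",", " ").split()]
        y.extend(values[1:])  # drop the per-row X checkpoint
    if len(y) != n_points:
        raise ValueError(f"expected {n_points} points, found {len(y)}")
    if n_points == 1:
        return [first_x], y
    # X values are equally spaced between FIRSTX and LASTX
    step = (last_x - first_x) / (n_points - 1)
    x = [first_x + i * step for i in range(n_points)]
    return x, y


x, y = expand_xydata(["100 1 2 3", "103 4 5 6"], 100.0, 105.0, 6)
print(x)
print(y)
```

Rebuilding x from the header rather than from the row checkpoints keeps the axis evenly spaced even when the checkpoints were rounded on export.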
67
+ ## NMR Spectroscopy
68
+
69
+ ### 1H NMR Processing
70
+
+ ```python
+ import nmrglue as ng
+ import numpy as np
+ from scipy.signal import find_peaks
+
+ def process_1h_nmr(bruker_dir: str) -> dict:
+     """
+     Process 1H NMR data from Bruker format using nmrglue.
+     bruker_dir: path to Bruker experiment directory
+     """
+     # Read raw data
+     dic, data = ng.bruker.read(bruker_dir)
+
+     # Apply processing
+     data = ng.bruker.remove_digital_filter(dic, data)
+     data = ng.proc_base.zf_size(data, 65536)        # zero-fill
+     data = ng.proc_base.fft(data)                   # Fourier transform
+     data = ng.proc_autophase.autops(data, "acme")   # automatic phasing
+     data = ng.proc_base.rev(data)                   # reverse spectrum
+     data = ng.proc_base.di(data)                    # discard imaginary
+
+     # Generate chemical shift axis (ppm)
+     udic = ng.bruker.guess_udic(dic, data)
+     uc = ng.fileiobase.uc_from_udic(udic)
+     ppm = uc.ppm_scale()
+
+     return {
+         "ppm": ppm,
+         "spectrum": data.real,
+         "sf": dic["acqus"]["SFO1"],    # spectrometer frequency (MHz)
+         "sw_ppm": dic["acqus"]["SW"],  # sweep width (ppm)
+     }
+
+ def pick_nmr_peaks(ppm: np.ndarray, spectrum: np.ndarray,
+                    threshold: float = 0.05) -> list[dict]:
+     """
+     Automatic peak picking for 1H NMR.
+     threshold: minimum peak height as fraction of max intensity.
+     """
+     min_height = threshold * np.max(spectrum)
+     indices, properties = find_peaks(
+         spectrum, height=min_height, distance=10, prominence=min_height * 0.5
+     )
+
+     peaks = []
+     for idx in indices:
+         peaks.append({
+             "ppm": round(float(ppm[idx]), 3),
+             "intensity": float(spectrum[idx]),
+         })
+
+     # Sort by chemical shift (high to low, NMR convention)
+     peaks.sort(key=lambda p: p["ppm"], reverse=True)
+     return peaks
+ ```
124
+
125
+ ### Common 1H NMR Chemical Shift Ranges
126
+
127
+ | Chemical Shift (ppm) | Functional Group |
128
+ |----------------------|-----------------|
129
+ | 0.8-1.0 | CH3 (methyl, alkyl) |
130
+ | 1.2-1.4 | CH2 (methylene, alkyl chain) |
131
+ | 2.0-2.5 | CH next to C=O |
132
+ | 3.3-3.9 | CH next to O or N (ethers, amines) |
133
+ | 4.5-5.5 | Vinyl C=CH2, OCH |
134
+ | 6.5-8.5 | Aromatic H |
135
+ | 9.0-10.0 | Aldehyde CHO |
136
+ | 10.0-12.0 | Carboxylic acid OH |
137
+
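The shift table can also be used programmatically, for example to annotate picked peaks with candidate functional groups. The following is a minimal sketch that copies the ranges from the table; `SHIFT_RANGES` and `suggest_groups` are hypothetical names, and a shift that falls between two ranges simply returns no suggestion.

```python
# Ranges copied from the table above: (low ppm, high ppm, label)
SHIFT_RANGES = [
    (0.8, 1.0, "CH3 (methyl, alkyl)"),
    (1.2, 1.4, "CH2 (methylene, alkyl chain)"),
    (2.0, 2.5, "CH next to C=O"),
    (3.3, 3.9, "CH next to O or N (ethers, amines)"),
    (4.5, 5.5, "Vinyl C=CH2, OCH"),
    (6.5, 8.5, "Aromatic H"),
    (9.0, 10.0, "Aldehyde CHO"),
    (10.0, 12.0, "Carboxylic acid OH"),
]


def suggest_groups(ppm: float) -> list[str]:
    """Return every table entry whose range contains the given shift."""
    return [label for low, high, label in SHIFT_RANGES if low <= ppm <= high]


print(suggest_groups(7.26))  # falls in the aromatic range
print(suggest_groups(10.0))  # boundary shared by two ranges
print(suggest_groups(1.1))   # gap between ranges: no suggestion
```

Returning a list rather than a single label keeps boundary cases visible instead of silently picking one group.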
138
+ ## Mass Spectrometry
139
+
140
+ ### Processing MS Data
141
+
+ ```python
+ from typing import Optional
+
+ from pyteomics import mzml
+ import numpy as np
+
+ def read_mzml_spectra(filepath: str, ms_level: int = 1) -> list[dict]:
+     """
+     Read mass spectra from an mzML file.
+     ms_level: 1 for MS1 (survey scans), 2 for MS/MS
+     """
+     spectra = []
+     with mzml.read(filepath) as reader:
+         for spectrum in reader:
+             if spectrum.get("ms level") == ms_level:
+                 spectra.append({
+                     "scan": spectrum["index"],
+                     "rt": spectrum["scanList"]["scan"][0].get(
+                         "scan start time", 0
+                     ),
+                     "mz": spectrum["m/z array"],
+                     "intensity": spectrum["intensity array"],
+                     "tic": np.sum(spectrum["intensity array"]),
+                 })
+     return spectra
+
+ def find_molecular_ion(mz: np.ndarray, intensity: np.ndarray,
+                        expected_mw: Optional[float] = None,
+                        tolerance_da: float = 0.5) -> list[dict]:
+     """
+     Identify molecular ion peaks ([M+H]+, [M+Na]+, [M-H]-).
+     Assumes singly charged ions. With expected_mw set, only peaks that
+     match an adduct within tolerance_da are returned.
+     """
+     # Find top peaks
+     top_indices = np.argsort(intensity)[::-1][:20]
+     candidates = []
+
+     # Mass shifts in Da relative to the neutral monoisotopic mass
+     # (electron mass accounted for)
+     adducts = {
+         "[M+H]+": 1.00728,
+         "[M+Na]+": 22.98922,
+         "[M+K]+": 38.96316,
+         "[M-H]-": -1.00728,
+         "[M+NH4]+": 18.03383,
+     }
+
+     for idx in top_indices:
+         peak_mz = mz[idx]
+         peak_int = intensity[idx]
+
+         if expected_mw is not None:
+             for adduct_name, adduct_mass in adducts.items():
+                 calc_mw = float(peak_mz - adduct_mass)
+                 if abs(calc_mw - expected_mw) < tolerance_da:
+                     candidates.append({
+                         "mz": round(float(peak_mz), 4),
+                         "intensity": float(peak_int),
+                         "adduct": adduct_name,
+                         "calc_mw": round(calc_mw, 4),
+                         "error_da": round(abs(calc_mw - expected_mw), 4),
+                     })
+         else:
+             candidates.append({
+                 "mz": round(float(peak_mz), 4),
+                 "intensity": float(peak_int),
+             })
+
+     return candidates
+ ```
207
+
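The adduct logic can also be run in the forward direction: given a neutral monoisotopic mass, list the m/z values to look for and express a match error in ppm. This is a small sketch under the assumption of singly charged ions; the shifts are copied from `find_molecular_ion`, while `expected_mz` and `ppm_error` are illustrative helper names. The caffeine mass (194.0804 Da) is used only as a familiar example value.

```python
# Mass shifts in Da for singly charged adducts (copied from the code above)
ADDUCT_SHIFTS = {
    "[M+H]+": 1.00728,
    "[M+Na]+": 22.98922,
    "[M+K]+": 38.96316,
    "[M-H]-": -1.00728,
}


def expected_mz(neutral_mass: float) -> dict[str, float]:
    """Expected m/z of each singly charged adduct of a neutral molecule."""
    return {name: round(neutral_mass + shift, 4)
            for name, shift in ADDUCT_SHIFTS.items()}


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


targets = expected_mz(194.0804)  # caffeine, monoisotopic mass
for adduct, value in targets.items():
    print(f"{adduct:8s} {value:.4f}")
print(f"{ppm_error(195.0880, targets['[M+H]+']):.2f} ppm")
```

A ppm tolerance scales with m/z, which is usually a better fit for high-resolution instruments than the fixed `tolerance_da` window used above.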
208
+ ## Infrared Spectroscopy
209
+
210
+ ### IR Peak Assignment
211
+
+ ```python
+ import numpy as np
+ from scipy.signal import find_peaks
+
+ # Standard IR functional group frequency table (cm^-1; some ranges overlap)
+ IR_ASSIGNMENTS = {
+     (3200, 3600): "O-H stretch (broad: alcohol, acid; sharp: free OH)",
+     (3300, 3500): "N-H stretch (primary amine: 2 bands; secondary: 1 band)",
+     (2850, 3100): "C-H stretch (sp3: 2850-2960; sp2: 3000-3100)",
+     (2100, 2260): "Triple bond stretch (C-triple-N: 2210-2260; C-triple-C: 2100-2150)",
+     (1680, 1750): "C=O stretch (ketone ~1715; ester ~1735; acid ~1710)",
+     (1600, 1680): "C=C stretch (alkene ~1640; aromatic ~1600, 1500) or amide C=O (~1650)",
+     (1000, 1300): "C-O stretch (ether, ester, alcohol)",
+ }
+
+ def assign_ir_peaks(wavenumber: np.ndarray, absorbance: np.ndarray,
+                     threshold: float = 0.1) -> list[dict]:
+     """Detect and assign IR absorption peaks to functional groups."""
+     # Absorbance peaks point upward, so no inversion is needed here
+     # (convert transmittance spectra to absorbance first)
+     peaks, properties = find_peaks(absorbance, height=threshold, prominence=0.05)
+
+     assignments = []
+     for idx in peaks:
+         wn = float(wavenumber[idx])
+         # Ranges overlap (e.g. O-H and N-H), so report every matching group
+         matches = [group for (low, high), group in IR_ASSIGNMENTS.items()
+                    if low <= wn <= high]
+         assignments.append({
+             "wavenumber_cm-1": round(wn, 1),
+             "absorbance": round(float(absorbance[idx]), 4),
+             "assignment": "; ".join(matches) if matches else "unassigned",
+         })
+
+     return sorted(assignments, key=lambda x: x["wavenumber_cm-1"], reverse=True)
+ ```
246
+
247
+ ## Spectral Processing Utilities
248
+
249
+ ### Baseline Correction and Smoothing
250
+
+ ```python
+ import numpy as np
+ from scipy.signal import savgol_filter
+
+ def baseline_correction(y: np.ndarray, lam: float = 1e6,
+                         p: float = 0.001, n_iter: int = 10) -> np.ndarray:
+     """
+     Asymmetric least squares baseline correction (Eilers and Boelens, 2005).
+     lam: smoothness parameter (larger = smoother baseline)
+     p: asymmetry parameter (smaller = more emphasis on fitting below peaks)
+     """
+     from scipy.sparse import diags, csc_matrix
+     from scipy.sparse.linalg import spsolve
+
+     L = len(y)
+     # Second-difference operator, kept sparse so long spectra stay tractable
+     D = diags([1, -2, 1], [0, -1, -2], shape=(L, L - 2))
+     H = lam * D.dot(D.T)
+     w = np.ones(L)
+
+     for _ in range(n_iter):
+         W = diags(w, 0, shape=(L, L))
+         Z = csc_matrix(W + H)
+         baseline = spsolve(Z, w * y)
+         # Points above the baseline (peaks) get the small weight p
+         w = p * (y > baseline) + (1 - p) * (y < baseline)
+
+     return y - baseline
+
+ def smooth_spectrum(y: np.ndarray, window: int = 11,
+                     polyorder: int = 3) -> np.ndarray:
+     """Apply Savitzky-Golay smoothing (window must exceed polyorder)."""
+     return savgol_filter(y, window, polyorder)
+ ```
280
+
281
+ ## Tools and Software
282
+
283
+ - **nmrglue**: Python NMR data processing (Bruker, Varian, Agilent)
284
+ - **pyOpenMS / pyteomics**: Mass spectrometry data processing
285
+ - **RDKit**: Cheminformatics toolkit (molecular structure handling, formula and exact-mass calculation)
286
+ - **MestReNova**: Commercial NMR processing (widely used in chemistry labs)
287
+ - **TopSpin (Bruker)**: NMR acquisition and processing
288
+ - **SDBS (AIST)**: Free spectral database (IR, NMR, MS)
289
+ - **MassBank**: Open mass spectral database
290
+ - **NIST Chemistry WebBook**: Reference spectra for IR and MS
@@ -0,0 +1,268 @@
1
+ ---
2
+ name: distributed-systems-guide
3
+ description: "Distributed systems design patterns and analysis for CS research"
4
+ metadata:
5
+ openclaw:
6
+ emoji: "globe-with-meridians"
7
+ category: "domains"
8
+ subcategory: "cs"
9
+ keywords: ["distributed-systems", "consensus", "replication", "fault-tolerance", "scalability", "cap-theorem"]
10
+ source: "wentor"
11
+ ---
12
+
13
+ # Distributed Systems Guide
14
+
15
+ A skill for researching and designing distributed systems, covering consensus algorithms, replication strategies, consistency models, fault tolerance, and performance analysis. Provides theoretical foundations and practical implementations relevant to systems research.
16
+
17
+ ## Consistency Models
18
+
19
+ ### Consistency Hierarchy
20
+
+ ```
+ Strongest
+   | Linearizability (atomic, real-time ordering)
+   | Sequential consistency (program order respected)
+   | Causal consistency (causally related ops ordered)
+   | PRAM / FIFO consistency (per-process order)
+   | Eventual consistency (converges if updates stop)
+ Weakest
+ ```
30
+
31
+ ### CAP Theorem and PACELC
32
+
33
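+ The CAP theorem states that during a network partition, a distributed system must choose between consistency and availability. PACELC extends this to normal operation: else (E), when there is no partition, the system trades latency (L) against consistency (C). The Classification column lists both labels (CAP / PACELC):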
34
+
35
+ | System | Partition Behavior | Normal Behavior | Classification |
36
+ |--------|-------------------|----------------|----------------|
37
+ | ZooKeeper | Consistent (sacrifice A) | Low latency, consistent | CP / PC/EC |
38
+ | Cassandra | Available (sacrifice C) | Low latency, eventual | AP / PA/EL |
39
+ | Spanner | Consistent (sacrifice A) | Higher latency, consistent | CP / PC/EC |
40
+ | DynamoDB | Configurable per-read | Tunable consistency | AP or CP |
41
+ | CockroachDB | Consistent (sacrifice A) | Serializable | CP / PC/EC |
42
+
43
+ ## Consensus Algorithms
44
+
45
+ ### Raft Implementation Sketch
46
+
+ ```python
+ from enum import Enum
+ from dataclasses import dataclass, field
+ from typing import Optional
+ import random
+
+ class NodeState(Enum):
+     FOLLOWER = "follower"
+     CANDIDATE = "candidate"
+     LEADER = "leader"
+
+ @dataclass
+ class LogEntry:
+     term: int
+     index: int
+     command: str
+
+ @dataclass
+ class RaftNode:
+     """
+     Simplified Raft consensus node for educational purposes.
+     Sketches leader election (RequestVote) and the leader-side log append;
+     AppendEntries replication and timers are omitted.
+     """
+     node_id: str
+     state: NodeState = NodeState.FOLLOWER
+     current_term: int = 0
+     voted_for: Optional[str] = None
+     log: list = field(default_factory=list)
+     # Log indices are 0-based here, so -1 means "nothing yet"
+     commit_index: int = -1
+     last_applied: int = -1
+
+     # Leader state
+     next_index: dict = field(default_factory=dict)
+     match_index: dict = field(default_factory=dict)
+
+     def start_election(self, peers: list[str]) -> dict:
+         """Transition to candidate and build the RequestVote message."""
+         self.state = NodeState.CANDIDATE
+         self.current_term += 1
+         self.voted_for = self.node_id
+
+         last_log_index = len(self.log) - 1 if self.log else -1
+         last_log_term = self.log[-1].term if self.log else 0
+
+         return {
+             "type": "RequestVote",
+             "term": self.current_term,
+             "candidate_id": self.node_id,
+             "last_log_index": last_log_index,
+             "last_log_term": last_log_term,
+         }
+
+     def handle_vote_request(self, term: int, candidate_id: str,
+                             last_log_index: int,
+                             last_log_term: int) -> dict:
+         """Process a RequestVote RPC."""
+         if term < self.current_term:
+             return {"term": self.current_term, "vote_granted": False}
+
+         if term > self.current_term:
+             # Newer term seen: step down and clear the old vote
+             self.current_term = term
+             self.state = NodeState.FOLLOWER
+             self.voted_for = None
+
+         # Check if candidate's log is at least as up-to-date
+         my_last_term = self.log[-1].term if self.log else 0
+         my_last_index = len(self.log) - 1 if self.log else -1
+
+         log_ok = (last_log_term > my_last_term or
+                   (last_log_term == my_last_term and
+                    last_log_index >= my_last_index))
+
+         vote_granted = (
+             (self.voted_for is None or self.voted_for == candidate_id)
+             and log_ok
+         )
+
+         if vote_granted:
+             self.voted_for = candidate_id
+
+         return {"term": self.current_term, "vote_granted": vote_granted}
+
+     def append_entry(self, command: str) -> LogEntry:
+         """Leader appends a new entry to its log."""
+         entry = LogEntry(
+             term=self.current_term,
+             index=len(self.log),
+             command=command,
+         )
+         self.log.append(entry)
+         return entry
+ ```
138
+
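The class above tracks `match_index` but does not show how a leader advances `commit_index`. In the Raft paper, the leader commits the highest index N that is stored on a majority of servers and whose entry carries the leader's current term. A minimal sketch of that rule follows, assuming 0-based log indices as in `RaftNode`; `advance_commit_index` and its parameters are illustrative names rather than part of the original sketch.

```python
def advance_commit_index(match_index, leader_last_index, log_terms,
                         current_term, commit_index):
    """Return the new commit index for a leader.

    match_index: follower id -> highest log index known to be replicated there
    leader_last_index: index of the leader's own last entry
    log_terms: log_terms[i] is the term of entry i (0-based)
    """
    cluster_size = len(match_index) + 1  # followers plus the leader itself
    majority = cluster_size // 2 + 1
    # Try candidates from the newest entry down to just above commit_index
    for n in range(leader_last_index, commit_index, -1):
        replicated = 1 + sum(1 for m in match_index.values() if m >= n)
        # Only entries from the current term are committed by counting replicas
        if replicated >= majority and log_terms[n] == current_term:
            return n
    return commit_index


followers = {"b": 4, "c": 2, "d": 2, "e": 0}
terms = [1, 1, 2, 2, 2]
print(advance_commit_index(followers, 4, terms, current_term=2, commit_index=0))
print(advance_commit_index(followers, 4, terms, current_term=3, commit_index=0))
```

In the second call no entry belongs to term 3, so the commit index does not move even though older entries sit on a majority; the current-term check is what prevents a new leader from committing an old-term entry by replica count alone.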
139
+ ### Paxos vs Raft vs PBFT Comparison
140
+
141
+ | Algorithm | Fault Model | Tolerance | Rounds | Complexity |
142
+ |-----------|-------------|-----------|--------|------------|
143
+ | Paxos | Crash faults | f < n/2 | 2 (normal) | Difficult to implement correctly |
144
+ | Raft | Crash faults | f < n/2 | 2 (normal) | Designed for understandability |
145
+ | PBFT | Byzantine faults | f < n/3 | 3 | O(n^2) message complexity |
146
+ | HotStuff | Byzantine faults | f < n/3 | 3 | O(n) messages per phase (votes aggregated by the leader) |
147
+
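The Tolerance column translates directly into cluster sizing. The sketch below turns the two bounds from the table (f < n/2 for crash faults, f < n/3 for Byzantine faults) into integer formulas; the function names are illustrative.

```python
def max_crash_faults(n: int) -> int:
    """Largest f with f < n/2 (Paxos, Raft)."""
    return (n - 1) // 2


def max_byzantine_faults(n: int) -> int:
    """Largest f with f < n/3 (PBFT, HotStuff)."""
    return (n - 1) // 3


def min_nodes_for_byzantine(f: int) -> int:
    """Smallest n that tolerates f Byzantine faults."""
    return 3 * f + 1


for n in range(3, 8):
    print(n, max_crash_faults(n), max_byzantine_faults(n))
```

Note that growing a crash-tolerant cluster from 3 to 4 nodes adds no extra fault tolerance, which is why odd cluster sizes are the usual choice.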
148
+ ## Replication Strategies
149
+
150
+ ### State Machine Replication
151
+
+ ```python
+ import random
+
+ class ReplicatedStateMachine:
+     """
+     State machine replication with configurable consistency.
+     Demonstrates read/write quorum intersection for correctness.
+     """
+
+     def __init__(self, n_replicas: int, read_quorum: int = None,
+                  write_quorum: int = None):
+         self.n = n_replicas
+         self.R = read_quorum or (n_replicas // 2 + 1)
+         self.W = write_quorum or (n_replicas // 2 + 1)
+
+         # Quorum intersection guarantees: R + W > N
+         assert self.R + self.W > self.n, (
+             f"Quorum intersection violated: R({self.R}) + W({self.W}) "
+             f"must be > N({self.n})"
+         )
+
+         self.replicas = [{} for _ in range(n_replicas)]
+         self.version_clock = 0
+
+     def write(self, key: str, value: str) -> dict:
+         """Write to W replicas."""
+         self.version_clock += 1
+         # Select W replicas (in practice, based on availability)
+         targets = random.sample(range(self.n), self.W)
+         for i in targets:
+             self.replicas[i][key] = (value, self.version_clock)
+
+         return {
+             "key": key,
+             "version": self.version_clock,
+             "acked_by": len(targets),
+             "quorum_met": True,
+         }
+
+     def read(self, key: str) -> dict:
+         """Read from R replicas, return latest version."""
+         targets = random.sample(range(self.n), self.R)
+         responses = []
+         for i in targets:
+             if key in self.replicas[i]:
+                 responses.append(self.replicas[i][key])
+
+         if not responses:
+             return {"key": key, "value": None, "found": False}
+
+         # Return the value with the highest version
+         latest = max(responses, key=lambda x: x[1])
+         return {
+             "key": key,
+             "value": latest[0],
+             "version": latest[1],
+             "found": True,
+         }
+ ```
209
+
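The `R + W > N` assertion in the class can be checked by brute force for small clusters: if the condition holds, every possible read quorum shares at least one replica with every possible write quorum, so a read always sees the latest acknowledged write. The helper below is an illustrative sketch (the name `quorums_always_intersect` is not from the original code) and is only practical for small N.

```python
from itertools import combinations


def quorums_always_intersect(n: int, r: int, w: int) -> bool:
    """True if every size-r read set overlaps every size-w write set."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


print(quorums_always_intersect(5, 3, 3))  # R + W = 6 > 5
print(quorums_always_intersect(5, 2, 3))  # R + W = 5, not > 5
print(quorums_always_intersect(5, 1, 5))  # read-one / write-all
```

The last configuration shows the trade-off: reads touch a single replica, but a write needs all five replicas to be reachable.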
210
+ ## Clock Synchronization and Ordering
211
+
212
+ ### Vector Clocks
213
+
+ ```python
+ class VectorClock:
+     """Vector clock for tracking causality in distributed systems."""
+
+     def __init__(self, process_id: str, processes: list[str]):
+         self.pid = process_id
+         self.clock = {p: 0 for p in processes}
+
+     def increment(self):
+         """Local event: increment own counter."""
+         self.clock[self.pid] += 1
+
+     def send(self) -> dict:
+         """Prepare clock for sending with a message."""
+         self.increment()
+         return dict(self.clock)
+
+     def receive(self, other_clock: dict):
+         """Merge received clock: element-wise max, then increment."""
+         for p in self.clock:
+             self.clock[p] = max(self.clock[p], other_clock.get(p, 0))
+         self.increment()
+
+     def happened_before(self, other: dict) -> bool:
+         """Check if this clock happened-before other (causal ordering)."""
+         return (all(self.clock[p] <= other.get(p, 0) for p in self.clock) and
+                 any(self.clock[p] < other.get(p, 0) for p in self.clock))
+ ```
242
+
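Vector clocks give only a partial order: two events can be concurrent, meaning neither happened before the other. The sketch below classifies a pair of clock snapshots (plain dicts, as returned by `send()`), treating missing entries as 0. The function name `compare_clocks` is illustrative.

```python
def compare_clocks(a: dict, b: dict) -> str:
    """Classify two vector clock snapshots.

    Returns "before" (a -> b), "after" (b -> a), "equal", or "concurrent".
    """
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


print(compare_clocks({"p1": 2, "p2": 0}, {"p1": 2, "p2": 1}))
print(compare_clocks({"p1": 2, "p2": 0}, {"p1": 1, "p2": 1}))
```

Concurrent snapshots are the case where a replicated store has to keep both versions or apply a merge rule, since causality alone does not pick a winner.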
243
+ ## Performance Analysis
244
+
245
+ ### Latency and Throughput Modeling
246
+
247
+ Key metrics for evaluating distributed systems:
248
+
249
+ - **Tail latency (p99, p999)**: Critical for real-world SLAs; often dominated by slow replicas
250
+ - **Throughput under contention**: How performance degrades with conflict rate
251
+ - **Scalability**: Linear vs sub-linear throughput increase with added nodes
252
+ - **Recovery time**: Time to restore consistency after node failure
253
+
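Tail latency is easy to hide behind an average. As a rough illustration, the sketch below computes percentiles with the nearest-rank method on a made-up sample in which 2 of 100 requests hit a slow replica; the data and the name `percentile` are assumptions for the example, not measurements.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile: smallest sample with at least q% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in ms: 98 fast requests, 2 slow ones
latencies = [10] * 98 + [200, 900]
mean = sum(latencies) / len(latencies)
print(f"mean={mean:.1f} p50={percentile(latencies, 50)} "
      f"p99={percentile(latencies, 99)} max={percentile(latencies, 100)}")
```

The mean and median stay close to the fast path while p99 exposes the slow replica, which is why the bullet above singles out p99 and p999.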
254
+ ## Key Research Papers
255
+
256
+ - Lamport, L. (1998). The Part-Time Parliament (Paxos). *ACM TOCS*.
257
+ - Ongaro, D. and Ousterhout, J. (2014). In Search of an Understandable Consensus Algorithm (Raft). *USENIX ATC*.
258
+ - Corbett, J. et al. (2013). Spanner: Google's Globally-Distributed Database. *ACM TOCS*.
259
+ - DeCandia, G. et al. (2007). Dynamo: Amazon's Highly Available Key-value Store. *SOSP*.
260
+
261
+ ## Tools and Frameworks
262
+
263
+ - **etcd / ZooKeeper**: Production consensus stores for coordination
264
+ - **Jepsen**: Distributed systems correctness testing framework
265
+ - **TLA+ / PlusCal**: Formal specification and model checking
266
+ - **ns-3 / OMNeT++**: Network simulation for distributed protocols
267
+ - **gRPC / Cap'n Proto**: High-performance RPC frameworks
268
+ - **FoundationDB**: Multi-model distributed database with strong consistency