@wentorai/research-plugins 1.0.0 → 1.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (203)
  1. package/README.md +22 -22
  2. package/curated/analysis/README.md +71 -56
  3. package/curated/domains/README.md +176 -67
  4. package/curated/literature/README.md +71 -47
  5. package/curated/research/README.md +91 -58
  6. package/curated/tools/README.md +88 -87
  7. package/curated/writing/README.md +80 -45
  8. package/mcp-configs/cloud-docs/confluence-mcp.json +37 -0
  9. package/mcp-configs/cloud-docs/google-drive-mcp.json +35 -0
  10. package/mcp-configs/cloud-docs/notion-mcp.json +29 -0
  11. package/mcp-configs/communication/discord-mcp.json +29 -0
  12. package/mcp-configs/communication/slack-mcp.json +29 -0
  13. package/mcp-configs/communication/telegram-mcp.json +28 -0
  14. package/mcp-configs/database/neo4j-mcp.json +37 -0
  15. package/mcp-configs/database/postgres-mcp.json +28 -0
  16. package/mcp-configs/database/sqlite-mcp.json +29 -0
  17. package/mcp-configs/dev-platform/github-mcp.json +31 -0
  18. package/mcp-configs/dev-platform/gitlab-mcp.json +34 -0
  19. package/mcp-configs/email/email-mcp.json +40 -0
  20. package/mcp-configs/email/gmail-mcp.json +37 -0
  21. package/mcp-configs/registry.json +178 -149
  22. package/mcp-configs/repository/dataverse-mcp.json +33 -0
  23. package/mcp-configs/repository/huggingface-mcp.json +29 -0
  24. package/openclaw.plugin.json +2 -2
  25. package/package.json +2 -2
  26. package/skills/analysis/dataviz/algorithm-visualizer-guide/SKILL.md +259 -0
  27. package/skills/analysis/dataviz/bokeh-visualization-guide/SKILL.md +270 -0
  28. package/skills/analysis/dataviz/chart-image-generator/SKILL.md +229 -0
  29. package/skills/analysis/dataviz/d3-visualization-guide/SKILL.md +281 -0
  30. package/skills/analysis/dataviz/echarts-visualization-guide/SKILL.md +250 -0
  31. package/skills/analysis/dataviz/metabase-analytics-guide/SKILL.md +242 -0
  32. package/skills/analysis/dataviz/plotly-interactive-guide/SKILL.md +266 -0
  33. package/skills/analysis/dataviz/redash-analytics-guide/SKILL.md +284 -0
  34. package/skills/analysis/econometrics/econml-causal-guide/SKILL.md +163 -0
  35. package/skills/analysis/econometrics/mostly-harmless-guide/SKILL.md +139 -0
  36. package/skills/analysis/econometrics/panel-data-analyst/SKILL.md +259 -0
  37. package/skills/analysis/econometrics/python-causality-guide/SKILL.md +134 -0
  38. package/skills/analysis/econometrics/stata-accounting-guide/SKILL.md +269 -0
  39. package/skills/analysis/econometrics/stata-analyst-guide/SKILL.md +245 -0
  40. package/skills/analysis/statistics/data-anomaly-detection/SKILL.md +157 -0
  41. package/skills/analysis/statistics/ml-experiment-tracker/SKILL.md +212 -0
  42. package/skills/analysis/statistics/pywayne-statistics-guide/SKILL.md +192 -0
  43. package/skills/analysis/statistics/quantitative-methods-guide/SKILL.md +193 -0
  44. package/skills/analysis/statistics/senior-data-scientist-guide/SKILL.md +223 -0
  45. package/skills/analysis/wrangling/csv-data-analyzer/SKILL.md +170 -0
  46. package/skills/analysis/wrangling/data-cleaning-pipeline/SKILL.md +266 -0
  47. package/skills/analysis/wrangling/data-cog-guide/SKILL.md +178 -0
  48. package/skills/analysis/wrangling/stata-data-cleaning/SKILL.md +276 -0
  49. package/skills/analysis/wrangling/survey-data-processing/SKILL.md +298 -0
  50. package/skills/domains/ai-ml/ai-model-benchmarking/SKILL.md +209 -0
  51. package/skills/domains/ai-ml/annotated-dl-papers-guide/SKILL.md +159 -0
  52. package/skills/domains/ai-ml/dl-transformer-finetune/SKILL.md +239 -0
  53. package/skills/domains/ai-ml/generative-ai-guide/SKILL.md +146 -0
  54. package/skills/domains/ai-ml/huggingface-inference-guide/SKILL.md +196 -0
  55. package/skills/domains/ai-ml/keras-deep-learning/SKILL.md +210 -0
  56. package/skills/domains/ai-ml/llm-from-scratch-guide/SKILL.md +124 -0
  57. package/skills/domains/ai-ml/ml-pipeline-guide/SKILL.md +295 -0
  58. package/skills/domains/ai-ml/nlp-toolkit-guide/SKILL.md +247 -0
  59. package/skills/domains/ai-ml/pytorch-guide/SKILL.md +281 -0
  60. package/skills/domains/ai-ml/pytorch-lightning-guide/SKILL.md +244 -0
  61. package/skills/domains/ai-ml/tensorflow-guide/SKILL.md +241 -0
  62. package/skills/domains/biomedical/bioagents-guide/SKILL.md +308 -0
  63. package/skills/domains/biomedical/medgeclaw-guide/SKILL.md +345 -0
  64. package/skills/domains/biomedical/medical-imaging-guide/SKILL.md +305 -0
  65. package/skills/domains/business/architecture-design-guide/SKILL.md +279 -0
  66. package/skills/domains/business/innovation-management-guide/SKILL.md +257 -0
  67. package/skills/domains/business/operations-research-guide/SKILL.md +258 -0
  68. package/skills/domains/chemistry/molecular-dynamics-guide/SKILL.md +237 -0
  69. package/skills/domains/chemistry/pubchem-api-guide/SKILL.md +180 -0
  70. package/skills/domains/chemistry/spectroscopy-analysis-guide/SKILL.md +290 -0
  71. package/skills/domains/cs/distributed-systems-guide/SKILL.md +268 -0
  72. package/skills/domains/cs/formal-verification-guide/SKILL.md +298 -0
  73. package/skills/domains/ecology/species-distribution-guide/SKILL.md +343 -0
  74. package/skills/domains/economics/imf-data-api-guide/SKILL.md +174 -0
  75. package/skills/domains/economics/post-labor-economics/SKILL.md +254 -0
  76. package/skills/domains/economics/pricing-psychology-guide/SKILL.md +273 -0
  77. package/skills/domains/economics/world-bank-data-guide/SKILL.md +179 -0
  78. package/skills/domains/education/assessment-design-guide/SKILL.md +213 -0
  79. package/skills/domains/education/educational-research-methods/SKILL.md +179 -0
  80. package/skills/domains/education/mooc-analytics-guide/SKILL.md +206 -0
  81. package/skills/domains/finance/portfolio-optimization-guide/SKILL.md +279 -0
  82. package/skills/domains/finance/risk-modeling-guide/SKILL.md +260 -0
  83. package/skills/domains/finance/stata-accounting-research/SKILL.md +372 -0
  84. package/skills/domains/geoscience/climate-modeling-guide/SKILL.md +215 -0
  85. package/skills/domains/geoscience/satellite-remote-sensing/SKILL.md +193 -0
  86. package/skills/domains/geoscience/seismology-data-guide/SKILL.md +208 -0
  87. package/skills/domains/humanities/ethical-philosophy-guide/SKILL.md +244 -0
  88. package/skills/domains/humanities/history-research-guide/SKILL.md +260 -0
  89. package/skills/domains/humanities/political-history-guide/SKILL.md +241 -0
  90. package/skills/domains/law/legal-nlp-guide/SKILL.md +236 -0
  91. package/skills/domains/law/patent-analysis-guide/SKILL.md +257 -0
  92. package/skills/domains/law/regulatory-compliance-guide/SKILL.md +267 -0
  93. package/skills/domains/math/symbolic-computation-guide/SKILL.md +263 -0
  94. package/skills/domains/math/topology-data-analysis/SKILL.md +305 -0
  95. package/skills/domains/pharma/clinical-trial-design-guide/SKILL.md +271 -0
  96. package/skills/domains/pharma/drug-target-interaction/SKILL.md +242 -0
  97. package/skills/domains/pharma/pharmacovigilance-guide/SKILL.md +216 -0
  98. package/skills/domains/physics/astrophysics-data-guide/SKILL.md +305 -0
  99. package/skills/domains/physics/particle-physics-guide/SKILL.md +287 -0
  100. package/skills/domains/social-science/network-analysis-guide/SKILL.md +310 -0
  101. package/skills/domains/social-science/psychology-research-guide/SKILL.md +270 -0
  102. package/skills/domains/social-science/sociology-research-guide/SKILL.md +238 -0
  103. package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +120 -0
  104. package/skills/literature/discovery/semantic-paper-radar/SKILL.md +144 -0
  105. package/skills/literature/discovery/zotero-arxiv-daily-guide/SKILL.md +94 -0
  106. package/skills/literature/fulltext/core-api-guide/SKILL.md +144 -0
  107. package/skills/literature/fulltext/institutional-repository-guide/SKILL.md +212 -0
  108. package/skills/literature/fulltext/open-access-mining-guide/SKILL.md +341 -0
  109. package/skills/literature/metadata/academic-paper-summarizer/SKILL.md +101 -0
  110. package/skills/literature/metadata/wikidata-api-guide/SKILL.md +156 -0
  111. package/skills/literature/search/arxiv-batch-reporting/SKILL.md +133 -0
  112. package/skills/literature/search/arxiv-paper-processor/SKILL.md +141 -0
  113. package/skills/literature/search/baidu-scholar-guide/SKILL.md +110 -0
  114. package/skills/literature/search/chatpaper-guide/SKILL.md +122 -0
  115. package/skills/literature/search/deep-literature-search/SKILL.md +149 -0
  116. package/skills/literature/search/deepgit-search-guide/SKILL.md +147 -0
  117. package/skills/literature/search/pasa-paper-search-guide/SKILL.md +138 -0
  118. package/skills/research/automation/ai-scientist-v2-guide/SKILL.md +284 -0
  119. package/skills/research/automation/aim-experiment-guide/SKILL.md +234 -0
  120. package/skills/research/automation/datagen-research-guide/SKILL.md +131 -0
  121. package/skills/research/automation/kedro-pipeline-guide/SKILL.md +216 -0
  122. package/skills/research/automation/mle-agent-guide/SKILL.md +139 -0
  123. package/skills/research/automation/paper-to-agent-guide/SKILL.md +116 -0
  124. package/skills/research/automation/rd-agent-guide/SKILL.md +246 -0
  125. package/skills/research/automation/research-paper-orchestrator/SKILL.md +254 -0
  126. package/skills/research/deep-research/academic-deep-research/SKILL.md +190 -0
  127. package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +141 -0
  128. package/skills/research/deep-research/deep-research-pro/SKILL.md +213 -0
  129. package/skills/research/deep-research/deep-research-work/SKILL.md +204 -0
  130. package/skills/research/deep-research/deep-searcher-guide/SKILL.md +253 -0
  131. package/skills/research/deep-research/gpt-researcher-guide/SKILL.md +191 -0
  132. package/skills/research/deep-research/khoj-research-guide/SKILL.md +200 -0
  133. package/skills/research/deep-research/local-deep-research-guide/SKILL.md +253 -0
  134. package/skills/research/deep-research/tongyi-deep-research-guide/SKILL.md +217 -0
  135. package/skills/research/funding/eu-horizon-guide/SKILL.md +244 -0
  136. package/skills/research/funding/grant-budget-guide/SKILL.md +284 -0
  137. package/skills/research/funding/nih-reporter-api-guide/SKILL.md +166 -0
  138. package/skills/research/funding/nsf-award-api-guide/SKILL.md +133 -0
  139. package/skills/research/methodology/academic-mentor-guide/SKILL.md +169 -0
  140. package/skills/research/methodology/claude-scientific-guide/SKILL.md +122 -0
  141. package/skills/research/methodology/deep-innovator-guide/SKILL.md +242 -0
  142. package/skills/research/methodology/osf-api-guide/SKILL.md +165 -0
  143. package/skills/research/methodology/research-paper-kb/SKILL.md +263 -0
  144. package/skills/research/methodology/research-town-guide/SKILL.md +263 -0
  145. package/skills/research/paper-review/automated-review-guide/SKILL.md +281 -0
  146. package/skills/research/paper-review/paper-compare-guide/SKILL.md +238 -0
  147. package/skills/research/paper-review/paper-digest-guide/SKILL.md +240 -0
  148. package/skills/research/paper-review/paper-research-assistant/SKILL.md +231 -0
  149. package/skills/research/paper-review/research-quality-filter/SKILL.md +261 -0
  150. package/skills/research/paper-review/review-response-guide/SKILL.md +275 -0
  151. package/skills/tools/code-exec/google-colab-guide/SKILL.md +276 -0
  152. package/skills/tools/code-exec/kaggle-api-guide/SKILL.md +216 -0
  153. package/skills/tools/code-exec/overleaf-cli-guide/SKILL.md +279 -0
  154. package/skills/tools/diagram/code-flow-visualizer/SKILL.md +197 -0
  155. package/skills/tools/diagram/excalidraw-diagram-guide/SKILL.md +170 -0
  156. package/skills/tools/diagram/json-data-visualizer/SKILL.md +270 -0
  157. package/skills/tools/diagram/mermaid-architect-guide/SKILL.md +219 -0
  158. package/skills/tools/diagram/tldraw-whiteboard-guide/SKILL.md +397 -0
  159. package/skills/tools/document/docsgpt-guide/SKILL.md +130 -0
  160. package/skills/tools/document/large-document-reader/SKILL.md +202 -0
  161. package/skills/tools/document/paper-parse-guide/SKILL.md +243 -0
  162. package/skills/tools/knowledge-graph/citation-network-builder/SKILL.md +244 -0
  163. package/skills/tools/knowledge-graph/concept-map-generator/SKILL.md +284 -0
  164. package/skills/tools/knowledge-graph/graphiti-guide/SKILL.md +219 -0
  165. package/skills/tools/ocr-translate/pdf-math-translate-guide/SKILL.md +141 -0
  166. package/skills/tools/ocr-translate/zotero-pdf-translate-guide/SKILL.md +95 -0
  167. package/skills/tools/ocr-translate/zotero-pdf2zh-guide/SKILL.md +143 -0
  168. package/skills/tools/scraping/dataset-finder-guide/SKILL.md +253 -0
  169. package/skills/tools/scraping/easy-spider-guide/SKILL.md +250 -0
  170. package/skills/tools/scraping/google-scholar-scraper/SKILL.md +255 -0
  171. package/skills/tools/scraping/repository-harvesting-guide/SKILL.md +310 -0
  172. package/skills/writing/citation/academic-citation-manager/SKILL.md +314 -0
  173. package/skills/writing/citation/jabref-reference-guide/SKILL.md +127 -0
  174. package/skills/writing/citation/jasminum-zotero-guide/SKILL.md +103 -0
  175. package/skills/writing/citation/obsidian-citation-guide/SKILL.md +164 -0
  176. package/skills/writing/citation/obsidian-zotero-guide/SKILL.md +137 -0
  177. package/skills/writing/citation/papersgpt-zotero-guide/SKILL.md +132 -0
  178. package/skills/writing/citation/papis-cli-guide/SKILL.md +213 -0
  179. package/skills/writing/citation/zotero-better-bibtex-guide/SKILL.md +107 -0
  180. package/skills/writing/citation/zotero-better-notes-guide/SKILL.md +121 -0
  181. package/skills/writing/citation/zotero-gpt-guide/SKILL.md +111 -0
  182. package/skills/writing/citation/zotero-mcp-guide/SKILL.md +164 -0
  183. package/skills/writing/citation/zotero-mdnotes-guide/SKILL.md +162 -0
  184. package/skills/writing/citation/zotero-reference-guide/SKILL.md +139 -0
  185. package/skills/writing/citation/zotero-scholar-guide/SKILL.md +294 -0
  186. package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +140 -0
  187. package/skills/writing/composition/ml-paper-writing/SKILL.md +163 -0
  188. package/skills/writing/composition/paper-debugger-guide/SKILL.md +143 -0
  189. package/skills/writing/composition/scientific-writing-resources/SKILL.md +151 -0
  190. package/skills/writing/composition/scientific-writing-wrapper/SKILL.md +153 -0
  191. package/skills/writing/latex/latex-drawing-collection/SKILL.md +154 -0
  192. package/skills/writing/latex/latex-templates-collection/SKILL.md +159 -0
  193. package/skills/writing/latex/md-to-pdf-academic/SKILL.md +230 -0
  194. package/skills/writing/latex/tex-render-guide/SKILL.md +243 -0
  195. package/skills/writing/polish/academic-tone-guide/SKILL.md +209 -0
  196. package/skills/writing/polish/conciseness-editing-guide/SKILL.md +225 -0
  197. package/skills/writing/polish/paper-polish-guide/SKILL.md +160 -0
  198. package/skills/writing/templates/graphical-abstract-guide/SKILL.md +183 -0
  199. package/skills/writing/templates/novathesis-guide/SKILL.md +152 -0
  200. package/skills/writing/templates/scientific-article-pdf/SKILL.md +261 -0
  201. package/skills/writing/templates/sjtuthesis-guide/SKILL.md +197 -0
  202. package/skills/writing/templates/thuthesis-guide/SKILL.md +181 -0
  203. package/skills/literature/fulltext/repository-harvesting-guide/SKILL.md +0 -207
@@ -0,0 +1,144 @@
1
+ ---
2
+ name: semantic-paper-radar
3
+ description: "Semantic literature discovery and synthesis using embeddings"
4
+ metadata:
5
+   openclaw:
6
+     emoji: "📡"
7
+     category: "literature"
8
+     subcategory: "discovery"
9
+     keywords: ["semantic search", "embeddings", "literature synthesis", "paper discovery", "vector search", "knowledge mapping"]
10
+     source: "https://github.com/mukulpatnaik/researchgpt"
11
+ ---
12
+
13
+ # Semantic Paper Radar
14
+
15
+ ## Overview
16
+
17
+ Traditional literature search relies on keyword matching—you find papers that contain the exact terms you search for. Semantic paper discovery goes further by understanding the meaning of research content and finding papers that are conceptually related, even when they use different terminology. This is especially powerful for interdisciplinary research, where the same idea may be expressed in completely different vocabularies across fields.
18
+
19
+ The Semantic Paper Radar skill provides methods for using embedding-based semantic search, vector databases, and AI-powered synthesis to build a comprehensive, continuously updated view of the literature relevant to your research. It enables you to discover papers you would never find through keyword search alone and to synthesize findings across large bodies of work.
20
+
21
+ This skill covers setting up a personal semantic search index over your paper collection, querying public semantic search APIs, and using LLM-powered analysis to extract themes and connections from clusters of related papers.
22
+
23
+ ## Semantic Search Fundamentals
24
+
25
+ ### How Embedding-Based Search Works
26
+
27
+ Semantic search represents both your query and each paper as dense numerical vectors (embeddings) in a high-dimensional space. Papers whose embeddings are close to your query's embedding are semantically similar, regardless of the specific words used.
28
+
29
+ Key components:
30
+ - **Embedding model**: Converts text to vectors. Models like SPECTER2, SciBERT, or general-purpose models like `text-embedding-3-small` work well for academic text.
31
+ - **Vector database**: Stores and indexes embeddings for fast similarity search. Options include ChromaDB (local), Qdrant, Pinecone, or Weaviate.
32
+ - **Similarity metric**: Cosine similarity is standard for comparing text embeddings.
33
+
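To make the similarity metric concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors and the paper IDs are toy values invented for illustration; real embedding models produce vectors with hundreds of dimensions, and a vector database would normally do this computation for you.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values, not output of a real model).
query = [0.9, 0.1, 0.0]
papers = {
    "paper_a": [0.8, 0.2, 0.1],  # points in nearly the same direction as the query
    "paper_b": [0.0, 0.1, 0.9],  # points in a very different direction
}

# Rank papers from most to least similar to the query.
ranked = sorted(
    papers,
    key=lambda pid: cosine_similarity(query, papers[pid]),
    reverse=True,
)
print(ranked)
```

Because cosine similarity depends only on direction, not vector length, it is insensitive to how long the embedded text was.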
34
+ ### Using Semantic Scholar's Embedding Search
35
+
36
+ Semantic Scholar provides pre-computed SPECTER embeddings for millions of papers. You can use their search API for semantic queries:
37
+
38
+ ```bash
39
+ # Semantic search via the Semantic Scholar API
40
+ curl "https://api.semanticscholar.org/graph/v1/paper/search?query=attention+mechanisms+for+graph+neural+networks&fields=title,abstract,year,citationCount&limit=20"
41
+ ```
42
+
43
+ The search endpoint ranks results by relevance rather than requiring an exact phrase match, so a query like "methods for handling missing values in longitudinal studies" can surface papers about imputation techniques, dropout analysis, and panel data methods even if they do not use the phrase "missing values." For retrieval driven strictly by embeddings, use a local index as described in the next section.
44
+
45
+ ### Building a Personal Semantic Index
46
+
47
+ For deeper control, build a local semantic search index over your own paper collection:
48
+
49
+ ```python
50
+ import chromadb
51
+ from sentence_transformers import SentenceTransformer
52
+
53
+ # Initialize
54
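+ model = SentenceTransformer("sentence-transformers/allenai-specter")  # SPECTER v1; "allenai/specter2" is an adapter that must be loaded with the adapters library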
55
+ client = chromadb.PersistentClient(path="./paper_index")
56
+ collection = client.get_or_create_collection(
57
+     name="my_papers",
58
+     metadata={"hnsw:space": "cosine"}
59
+ )
60
+
61
+ # Index a paper
62
+ abstract = "We propose a novel attention mechanism for graph neural networks..."
63
+ embedding = model.encode(abstract).tolist()
64
+ collection.add(
65
+     documents=[abstract],
66
+     embeddings=[embedding],
67
+     metadatas=[{"title": "Graph Attention v2", "year": 2025, "arxiv_id": "2501.xxxxx"}],
68
+     ids=["paper_001"]
69
+ )
70
+
71
+ # Query
72
+ results = collection.query(
73
+     query_embeddings=[model.encode("message passing in GNNs").tolist()],
74
+     n_results=10
75
+ )
76
+ ```
77
+
78
+ This local index lets you search across all papers you have collected using natural language queries. As you add more papers, the index becomes a personalized discovery tool tuned to your specific research interests.
79
+
80
+ ## Discovery Workflows
81
+
82
+ ### Concept Expansion Radar
83
+
84
+ Use semantic search to expand your awareness beyond your current reading:
85
+
86
+ 1. **Seed**: Take the abstract of your current paper (or a paragraph describing your research question).
87
+ 2. **Search**: Run it as a semantic query against a large corpus (Semantic Scholar, OpenAlex, or your local index).
88
+ 3. **Filter**: Remove papers you have already read. Sort by a combination of semantic similarity and recency.
89
+ 4. **Cluster**: Group the top 50 results into thematic clusters using k-means or HDBSCAN on their embeddings.
90
+ 5. **Explore clusters**: Each cluster represents a related subtopic. Read the most-cited paper in each cluster to understand the connection to your work.
91
+
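The filter step above can be sketched as follows. The scoring formula (a weighted sum of similarity and an exponentially decaying recency term), the weight `alpha`, the one-year half-life, and the candidate records are all assumptions made for illustration; the skill itself does not prescribe a particular way to combine similarity and recency.

```python
from datetime import date


def combined_score(similarity, published, today, alpha=0.7, half_life_days=365.0):
    """Blend semantic similarity with recency (hypothetical weighting scheme)."""
    age_days = max((today - published).days, 0)
    recency = 0.5 ** (age_days / half_life_days)  # 1.0 for today, 0.5 after one half-life
    return alpha * similarity + (1.0 - alpha) * recency


def rank_candidates(candidates, already_read, today):
    """Drop papers that were already read, then sort the rest by combined score."""
    unread = [c for c in candidates if c["id"] not in already_read]
    return sorted(
        unread,
        key=lambda c: combined_score(c["similarity"], c["published"], today),
        reverse=True,
    )


# Made-up search results; "similarity" would come from the semantic search step.
today = date(2025, 6, 1)
candidates = [
    {"id": "p1", "similarity": 0.90, "published": date(2019, 6, 1)},
    {"id": "p2", "similarity": 0.85, "published": date(2025, 5, 1)},
    {"id": "p3", "similarity": 0.95, "published": date(2024, 6, 1)},
]

ranked = rank_candidates(candidates, already_read={"p3"}, today=today)
print([c["id"] for c in ranked])
```

Raising `alpha` favors conceptual closeness; lowering it favors newer work, which may suit fast-moving fields.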
92
+ ### Cross-Disciplinary Bridge Detection
93
+
94
+ Semantic search excels at finding papers from other fields that address similar problems:
95
+
96
+ 1. Describe your research problem in plain, non-technical language.
97
+ 2. Run this as a semantic query without restricting to your field's journals or categories.
98
+ 3. Review results from unexpected fields—these are potential interdisciplinary connections.
99
+ 4. For each bridge paper, check its reference list for more domain-specific work in that field.
100
+
101
+ ### Novelty Radar
102
+
103
+ Set up periodic semantic searches to detect new papers in your area:
104
+
105
+ 1. Define 3-5 "concept vectors" by encoding descriptions of your core research interests.
106
+ 2. Weekly, search against newly published papers (last 7 days) from arXiv or Semantic Scholar.
107
+ 3. Rank new papers by maximum similarity to any of your concept vectors.
108
+ 4. Papers above your similarity threshold enter your reading queue automatically.
109
+
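Steps 3 and 4 above can be sketched as follows. The concept vectors, paper embeddings, and the 0.8 threshold are toy values chosen for illustration; in practice the vectors would come from your embedding model and the threshold would be tuned against papers you already know are relevant.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def novelty_queue(new_papers, concept_vectors, threshold=0.8):
    """Score each new paper by its best match to any concept vector.

    Returns (paper_id, score) pairs at or above the threshold, best first.
    """
    scored = []
    for paper_id, embedding in new_papers.items():
        best = max(cosine_similarity(embedding, c) for c in concept_vectors)
        if best >= threshold:
            scored.append((paper_id, best))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical concept vectors for two research interests.
concept_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]

# Hypothetical embeddings of this week's new papers.
new_papers = {
    "n1": [0.9, 0.1, 0.0],
    "n2": [0.1, 0.1, 0.9],
    "n3": [0.2, 0.95, 0.1],
}

queue = novelty_queue(new_papers, concept_vectors)
print([paper_id for paper_id, _ in queue])
```

Taking the maximum over concept vectors, rather than the average, lets a paper qualify by matching one interest strongly even if it is unrelated to the others.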
110
+ ## Semantic Synthesis
111
+
112
+ Once you have discovered a cluster of related papers, use AI-assisted synthesis to extract insights across the collection:
113
+
114
+ ### Theme Extraction
115
+
116
+ Feed the abstracts of a cluster of papers to an LLM and ask for:
117
+ - Common themes and findings across the papers
118
+ - Points of disagreement or contradiction
119
+ - Methodological trends (what approaches are gaining vs. losing popularity)
120
+ - Open questions that none of the papers fully address
121
+
122
+ ### Evidence Mapping
123
+
124
+ Create a structured evidence map from your semantic cluster:
125
+
126
+ | Theme | Supporting Papers | Contradicting Papers | Strength of Evidence |
127
+ |-------|-------------------|----------------------|---------------------|
128
+ | Theme A | [1], [3], [7] | [5] | Strong |
129
+ | Theme B | [2], [4] | None | Moderate |
130
+ | Theme C | [6] | [1], [8] | Contested |
131
+
132
+ This provides a bird's-eye view of where consensus exists and where debates remain open.
133
+
134
+ ### Gap Identification
135
+
136
+ Compare your research question against the semantic landscape of existing work. Regions of embedding space where your query falls but few papers exist represent potential research gaps—areas where your contribution would be most novel.
137
+
138
+ ## References
139
+
140
+ - Semantic Scholar API: https://api.semanticscholar.org
141
+ - SPECTER2 model: https://huggingface.co/allenai/specter2
142
+ - ChromaDB: https://www.trychroma.com
143
+ - ResearchGPT: https://github.com/mukulpatnaik/researchgpt
144
+ - OpenAlex: https://openalex.org
@@ -0,0 +1,94 @@
1
+ ---
2
+ name: zotero-arxiv-daily-guide
3
+ description: "Guide to Zotero arXiv Daily for personalized daily paper recommendations"
4
+ metadata:
5
+   openclaw:
6
+     emoji: "📰"
7
+     category: literature
8
+     subcategory: discovery
9
+     keywords: ["zotero", "arxiv", "daily-papers", "recommendations", "preprint", "discovery"]
10
+     source: "https://github.com/TechPenguineer/zotero-arxiv-daily"
11
+ ---
12
+
13
+ # Zotero arXiv Daily Guide
14
+
15
+ ## Overview
16
+
17
+ Zotero arXiv Daily is a popular Zotero plugin with over 5,000 GitHub stars that delivers personalized daily paper recommendations from arXiv directly into your Zotero library. By analyzing the papers you already have in your collections, the plugin identifies your research interests and surfaces new preprints that are most relevant to your work.
18
+
19
+ The challenge of staying current with preprint literature is well known to researchers. arXiv publishes hundreds to thousands of new papers daily across dozens of categories, and manually scanning listings or relying solely on keyword alerts often results in information overload or missed relevant work. Zotero arXiv Daily addresses this by using your existing library as a profile of your interests, producing recommendations that improve as your library grows.
20
+
21
+ The plugin integrates naturally into the Zotero workflow. Recommended papers appear in a dedicated collection where you can review titles and abstracts, save promising papers to your working collections, and dismiss irrelevant suggestions. Over time the recommendation engine learns from your accept and dismiss decisions, refining its model of your interests.
22
+
23
+ ## Installation and Setup
24
+
25
+ Install Zotero arXiv Daily through the standard Zotero plugin process:
26
+
27
+ 1. Download the latest `.xpi` release from https://github.com/TechPenguineer/zotero-arxiv-daily/releases
28
+ 2. In Zotero, go to Tools > Add-ons > gear icon > Install Add-on From File (in Zotero 7: Tools > Plugins > gear icon > Install Plugin From File)
29
+ 3. Select the `.xpi` file and restart Zotero
30
+
31
+ Configure the plugin after installation:
32
+
33
+ - Open Zotero Preferences > arXiv Daily
34
+ - Select the arXiv categories relevant to your research (e.g., cs.AI, cs.CL, stat.ML, physics.comp-ph)
35
+ - Choose which Zotero collections to use as the basis for recommendations (your core research collections work best)
36
+ - Set the number of daily recommendations (10-30 is typical)
37
+ - Configure the schedule for fetching new recommendations (daily at a specific time or on-demand)
38
+ - Set up a dedicated Zotero collection where recommendations will appear
39
+
40
+ For enhanced recommendation quality, ensure your library collections are well-organized. The algorithm performs better when it can distinguish between your core research interests and peripheral references. Consider creating a dedicated collection of your most representative papers to serve as the recommendation seed.
41
+
42
+ ## Core Features
43
+
44
+ **Personalized Recommendations**: The plugin analyzes titles, abstracts, authors, and citation patterns in your Zotero library to build a profile of your research interests. New arXiv submissions are scored against this profile and the top matches are presented as daily recommendations.
45
+
46
+ **Category Filtering**: Select specific arXiv categories to narrow the recommendation scope. This prevents the system from suggesting papers in completely unrelated fields while still allowing cross-disciplinary discoveries within your selected categories.
47
+
48
+ **Daily Digest View**: Recommendations appear in a dedicated Zotero collection organized by date. Each entry includes the paper title, authors, abstract, arXiv identifier, and a relevance score indicating how closely it matches your library profile.
49
+
50
+ **Quick Actions**: For each recommended paper, you can:
51
+ - Save to a working collection with one click
52
+ - Open the full paper on arXiv
53
+ - Download the PDF directly to Zotero
54
+ - Dismiss the recommendation (improves future suggestions)
55
+ - Add tags for later organization
56
+
57
+ **Trend Detection**: The plugin can highlight papers that are receiving unusual attention in your field based on early citation velocity and social media mentions. This helps you identify potentially important work before it becomes widely known.
58
+
59
+ **Author Tracking**: When the plugin detects papers by authors who are frequently cited in your library, it flags these with higher priority. This ensures you never miss new work from the researchers most relevant to your field.
60
+
61
+ ## Research Workflow Integration
62
+
63
+ **Morning Review Routine**: Start your research day by spending 10-15 minutes reviewing the daily arXiv recommendations. Scan titles and abstracts, save promising papers to a "To Read" collection, and dismiss irrelevant ones. This disciplined approach keeps you current without consuming excessive time.
64
+
65
+ **Literature Review Enhancement**: During active literature review phases, increase the number of daily recommendations and expand the arXiv categories. The plugin helps identify relevant preprints that may not yet appear in traditional databases, giving your review a more comprehensive and timely scope.
66
+
67
+ **Collaborative Discovery**: Share your recommended papers collection with lab members through a Zotero group library. This creates a collective discovery mechanism where the entire group benefits from each member's library-driven recommendations.
68
+
69
+ **Research Trend Monitoring**: Track which topics appear frequently in your recommendations over weeks and months. Shifts in the recommendation patterns can signal emerging trends in your field, helping you anticipate where the research community is heading.
70
+
71
+ **Optimizing Recommendation Quality**:
72
+ - Maintain a well-curated "seed" collection of your most important papers
73
+ - Regularly dismiss irrelevant recommendations to refine the algorithm
74
+ - Update your arXiv category selections as your interests evolve
75
+ - Add newly published papers from your own group to keep the profile current
76
+ - Review recommendations from adjacent categories periodically for cross-disciplinary insights
77
+
78
+ ## Configuring Notification Preferences
79
+
80
+ Control how and when you receive recommendation alerts:
81
+
82
+ - **Desktop Notifications**: Enable system notifications when new recommendations arrive
83
+ - **Batch Mode**: Accumulate recommendations and review them at a scheduled time
84
+ - **Threshold Filtering**: Only show recommendations above a configurable relevance score
85
+ - **Keyword Highlighting**: Specify key terms to highlight in recommended paper titles and abstracts
86
+
87
+ For researchers who find the default recommendation volume too high, set a higher relevance threshold to receive only the most closely matched papers. Conversely, those in rapidly moving fields may want to lower the threshold and increase the daily count to ensure broad coverage.
88
+
89
+ ## References
90
+
91
+ - GitHub Repository: https://github.com/TechPenguineer/zotero-arxiv-daily
92
+ - arXiv API Documentation: https://info.arxiv.org/help/api
93
+ - Zotero Plugin Directory: https://www.zotero.org/support/plugins
94
+ - arXiv Category Taxonomy: https://arxiv.org/category_taxonomy
@@ -0,0 +1,144 @@
1
+ ---
2
+ name: core-api-guide
3
+ description: "Search and retrieve open access research papers via CORE aggregator"
4
+ metadata:
5
+   openclaw:
6
+     emoji: "🔬"
7
+     category: "literature"
8
+     subcategory: "fulltext"
9
+     keywords: ["open-access", "fulltext", "research-papers", "aggregator", "CORE"]
10
+     source: "https://core.ac.uk/documentation/api"
11
+ ---
12
+
13
+ # CORE API Guide
14
+
15
+ ## Overview
16
+
17
+ CORE (COnnecting REpositories) is the world's largest aggregator of open access research papers, providing access to over 130 million articles harvested from thousands of data providers worldwide. The CORE API enables programmatic search, retrieval, and analysis of scholarly full-text content across repositories, journals, and preprint servers.
18
+
19
+ The API is particularly valuable for researchers conducting systematic reviews, bibliometric analyses, and literature mining tasks. Unlike many scholarly APIs that only provide metadata, CORE specializes in delivering full-text content, making it essential for text mining and natural language processing workflows in academic research.
20
+
21
+ CORE's v3 API provides a RESTful interface with JSON responses, supporting complex search queries with Boolean operators, field-specific filtering, and batch operations. It is free for non-commercial academic use, though an API key is required to access the service.
22
+
23
+ ## Authentication
24
+
25
+ CORE requires a free API key for all requests. Register at https://core.ac.uk/services/api to obtain one.
26
+
27
+ Always store your API key in an environment variable and reference it in requests:
28
+
29
+ ```bash
30
+ export CORE_API_KEY="your-api-key"
31
+ ```
32
+
33
+ Pass the key via the `Authorization` header:
34
+
35
+ ```bash
36
+ curl -H "Authorization: Bearer $CORE_API_KEY" \
37
+ "https://api.core.ac.uk/v3/search/works?q=machine+learning"
38
+ ```
39
+
40
+ ## Core Endpoints
41
+
42
+ ### Search Works
43
+
44
+ Search across the entire CORE corpus with full-text and metadata queries.
45
+
46
+ ```
47
+ GET https://api.core.ac.uk/v3/search/works?q={query}&limit={n}&offset={n}
48
+ ```
49
+
50
+ **Parameters:**
51
+ - `q` (required): Search query string, supports Boolean operators (AND, OR, NOT)
52
+ - `limit`: Number of results (default 10, max 100)
53
+ - `offset`: Pagination offset
54
+ - `entity_type`: Filter by type (e.g., `journal-article`, `preprint`)
55
+
56
+ **Example: Search for papers on climate change adaptation:**
57
+
58
+ ```bash
59
+ curl -s -H "Authorization: Bearer $CORE_API_KEY" \
60
+ "https://api.core.ac.uk/v3/search/works?q=climate+change+adaptation&limit=5" \
61
+ | python3 -m json.tool
62
+ ```
63
+
64
+ **Python example:**
65
+
66
+ ```python
67
+ import os
68
+
69
+ import requests
70
+
71
+ headers = {"Authorization": f"Bearer {os.environ['CORE_API_KEY']}"}
72
+ params = {"q": "deep learning AND medical imaging", "limit": 20, "offset": 0}
73
+ resp = requests.get(
74
+     "https://api.core.ac.uk/v3/search/works", headers=headers, params=params, timeout=30
75
+ )
76
+ resp.raise_for_status()  # stop on auth or rate-limit errors instead of parsing an error body
77
+ data = resp.json()
78
+
79
+ for result in data.get("results", []):
80
+     print(f"Title: {result.get('title')}")
81
+     print(f"DOI: {result.get('doi')}")
82
+     print(f"Year: {result.get('yearPublished')}")
83
+     print(f"Full text length: {len(result.get('fullText') or '')}")  # fullText can be null
84
+ print("---")
85
+ ```
86
+
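The example above fetches a single page. A minimal sketch of offset-based pagination follows, assuming a caller-supplied `fetch_page(offset, limit)` function (a hypothetical helper, not part of the CORE API) that wraps the request and returns the `results` list. A fake in-memory corpus stands in for the network call so the loop can be run as-is.

```python
from typing import Callable, Dict, Iterator, List


def paginate(fetch_page: Callable[[int, int], List[Dict]],
             limit: int = 100, max_results: int = 500) -> Iterator[Dict]:
    """Yield results page by page until a short page or the max_results cap."""
    offset = 0
    while offset < max_results:
        page = fetch_page(offset, limit)
        for item in page:
            yield item
        if len(page) < limit:  # a short page means the result set is exhausted
            break
        offset += limit


# Stand-in for a real request: slices a fake corpus of 250 records.
fake_corpus = [{"id": i} for i in range(250)]


def fake_fetch(offset: int, limit: int) -> List[Dict]:
    return fake_corpus[offset:offset + limit]


results = list(paginate(fake_fetch, limit=100))
print(len(results))
```

In real use, `fetch_page` would call `requests.get` with `offset` and `limit` in `params` and return `resp.json().get("results", [])`.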
87
+ ### Get Work by ID
88
+
89
+ Retrieve a specific paper by its CORE ID or DOI.
90
+
91
+ ```
92
+ GET https://api.core.ac.uk/v3/works/{core_id}
93
+ ```
94
+
95
+ ```bash
96
+ curl -s -H "Authorization: Bearer $CORE_API_KEY" \
97
+ "https://api.core.ac.uk/v3/works/doi:10.1234/example.doi" \
98
+ | python3 -m json.tool
99
+ ```
100
+
101
+ ### Batch Retrieval
102
+
103
+ Retrieve multiple works in a single request using POST with a list of IDs.
104
+
105
+ ```bash
106
+ curl -s -X POST -H "Authorization: Bearer $CORE_API_KEY" \
107
+ -H "Content-Type: application/json" \
108
+ -d '[12345, 67890, 11111]' \
109
+ "https://api.core.ac.uk/v3/works"
110
+ ```
111
+
112
+ ### Search Data Providers
113
+
114
+ List or search CORE's data providers (repositories, journals).
115
+
116
+ ```
117
+ GET https://api.core.ac.uk/v3/data-providers?q={query}
118
+ ```
119
+
120
+ ## Common Research Patterns
121
+
122
+ **Systematic Literature Review:** Use Boolean queries to replicate a search strategy across the full-text corpus. Combine with date filters to identify papers within a specific time window, then export results for screening in tools like Rayyan or Covidence.
123
+
124
+ **Full-Text Mining:** Retrieve full-text content programmatically for NLP pipelines. Extract named entities, key phrases, or citation contexts at scale across thousands of papers.
125
+
126
+ **Repository Coverage Analysis:** Query data providers to understand which institutional repositories contribute to a specific field, useful for bibliometric and open-access policy research.
127
+
128
+ **Trend Detection:** Run time-series queries for specific terms and track publication volume over years to identify emerging research fronts.
129
+
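As a rough illustration of the trend-detection pattern, the sketch below tallies records per year. It assumes each record is a dict carrying the `yearPublished` field used in the Python example earlier; the sample records are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def publications_per_year(results: Iterable[Dict]) -> Dict[int, int]:
    """Count records per publication year, skipping records without a year."""
    counts = Counter(r["yearPublished"] for r in results if r.get("yearPublished"))
    return dict(sorted(counts.items()))


# Made-up sample records, shaped like entries in the "results" list.
sample = [
    {"title": "A", "yearPublished": 2021},
    {"title": "B", "yearPublished": 2022},
    {"title": "C", "yearPublished": 2022},
    {"title": "D", "yearPublished": None},
]
yearly = publications_per_year(sample)
print(yearly)
```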
130
+ ## Rate Limits and Best Practices
131
+
132
+ - **Free tier:** 150 requests per 15-minute window (10 req/min effective)
133
+ - **Batch endpoints:** Use batch retrieval for multiple IDs to minimize request count
134
+ - **Pagination:** Always use `offset` and `limit` for large result sets; do not fetch all results in one call
135
+ - **Caching:** Cache responses locally for repeat queries, especially for static metadata
136
+ - **Respect robots.txt:** When downloading full texts, add delays between requests
137
+ - **Error handling:** The API returns standard HTTP status codes; implement exponential backoff for 429 (rate limit) responses
138
+
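The backoff advice above can be sketched as follows. This is an illustrative helper, not CORE client code: `send` is a hypothetical callable that performs one request and returns `(status_code, parsed_body)`, and the `sleep` parameter exists only so the delays can be observed without waiting.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, dict]],
                      max_retries: int = 5, base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call send() and retry with exponentially growing delays on HTTP 429."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            if status >= 400:
                raise RuntimeError(f"request failed with HTTP {status}")
            return body
        if attempt == max_retries:
            break
        # Wait 1s, 2s, 4s, ... plus a little jitter so clients do not retry in lockstep
        sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    raise RuntimeError("still rate limited after retries")


# Simulated endpoint: rate limited twice, then succeeds.
responses = iter([(429, {}), (429, {}), (200, {"results": []})])
delays = []
body = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(body, len(delays))
```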
139
+ ## References
140
+
141
+ - CORE API v3 Documentation: https://core.ac.uk/documentation/api
142
+ - CORE Dashboard and Key Registration: https://core.ac.uk/services/api
143
+ - CORE Data Dumps (for bulk access): https://core.ac.uk/documentation/dataset
144
+ - CORE GitHub: https://github.com/oacore
@@ -0,0 +1,212 @@
1
+ ---
2
+ name: institutional-repository-guide
3
+ description: "Access papers from institutional and subject repositories at scale"
4
+ metadata:
5
+ openclaw:
6
+ emoji: "🏛️"
7
+ category: "literature"
8
+ subcategory: "fulltext"
9
+ keywords: ["institutional repository", "DSpace", "EPrints", "open access archive", "subject repository", "OpenDOAR"]
10
+ source: "wentor-research-plugins"
11
+ ---
12
+
13
+ # Institutional Repository Guide
14
+
15
+ Institutional repositories (IRs) are university-run digital archives that store and provide open access to their researchers' scholarly output — dissertations, journal articles, conference papers, datasets, and technical reports. Subject repositories like arXiv, bioRxiv, SSRN, and RePEc serve similar functions for specific disciplines. Together, they form a distributed network of open scholarship that complements commercial databases.
16
+
17
+ This guide covers how to discover, access, and systematically harvest content from institutional and subject repositories for literature reviews, meta-analyses, and research data collection.
18
+
19
+ ## Repository Landscape
20
+
21
+ ### Types of Repositories
22
+
23
+ ```
24
+ Institutional Repositories (IR):
25
+ - Run by universities to archive their researchers' output
26
+ - Examples: DSpace, EPrints, Fedora-based systems
27
+ - Discovery: OpenDOAR directory (v2.sherpa.ac.uk/opendoar)
28
+
29
+ Subject Repositories:
30
+ - Discipline-specific archives
31
+ - arXiv (physics, CS, math), bioRxiv, SSRN, RePEc, EarthArXiv
32
+
33
+ Aggregators:
34
+ - Harvest from many repositories into a single search interface
35
+ - BASE (Bielefeld Academic Search Engine)
36
+ - CORE (core.ac.uk, 200M+ open access articles)
37
+ - OpenAIRE (European research output)
38
+ ```
39
+
40
+ ### Discovering Repositories
41
+
42
+ OpenDOAR (Directory of Open Access Repositories) is the primary registry for finding institutional repositories:
43
+
44
+ ```python
45
+ import json
46
+ import urllib.parse
47
+ import urllib.request
48
+
49
+ def search_opendoar(subject: str = None, country: str = None) -> list:
50
+     """
51
+     Search the OpenDOAR registry for institutional repositories.
52
+
53
+     Args:
54
+         subject: Filter by subject area (e.g., "Biology", "Computer Science")
55
+         country: ISO country code (e.g., "US", "GB", "CN")
56
+     """
57
+     base_url = "https://v2.sherpa.ac.uk/cgi/retrieve"
58
+     query = {"item-type": "repository", "format": "Json"}
59
+     # One URL-encoded JSON filter list; verify field names and any API key requirement in the Sherpa docs
60
+     filters = [[value, field] for value, field in ((subject, "subject"), (country, "country")) if value]
61
+     if filters:
62
+         query["filter"] = json.dumps(filters)
63
+     req = urllib.request.Request(f"{base_url}?{urllib.parse.urlencode(query)}")
64
+     response = urllib.request.urlopen(req)
65
+     data = json.loads(response.read())
66
+
67
+     repositories = []
68
+     for item in data.get("items", []):
69
+         meta = item.get("repository_metadata", {})
70
+         repositories.append({
71
+             "name": (meta.get("name") or [{}])[0].get("name", ""),
72
+             "url": meta.get("url", ""),
73
+             "oai_url": meta.get("oai_url", ""),
74
+             "software": meta.get("software", {}).get("name", ""),
75
+             "type": meta.get("type", ""),
76
+         })
77
+
78
+     return repositories
79
+ ```
80
+
81
+ ## OAI-PMH Harvesting from Repositories
82
+
83
+ Most institutional repositories support OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting), the standard protocol for metadata exchange:
84
+
85
+ ```python
86
+ import urllib.parse
87
+ import urllib.request
88
+ import xml.etree.ElementTree as ET
89
+
90
+ def harvest_repository(base_url: str, metadata_prefix: str = "oai_dc",
91
+                        set_spec: str = None, from_date: str = None) -> list:
92
+     """
93
+     Harvest metadata records from a repository's OAI-PMH endpoint.
94
+
95
+     Args:
96
+         base_url: The OAI-PMH base URL
97
+         metadata_prefix: Metadata format (oai_dc, datacite, mets)
98
+         set_spec: Optional set/collection to restrict harvesting
99
+         from_date: Harvest only records whose datestamp is on or after this date (YYYY-MM-DD)
100
+     """
101
+     query = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
102
+     if set_spec:
103
+         query["set"] = set_spec
104
+     if from_date:
105
+         query["from"] = from_date
106
+
107
+     url = f"{base_url}?{urllib.parse.urlencode(query)}"
108
+     records = []
109
+
110
+     while url:
111
+         response = urllib.request.urlopen(url)
112
+         root = ET.parse(response).getroot()
113
+         ns = {"oai": "http://www.openarchives.org/OAI/2.0/"}
114
+
115
+         for record in root.findall(".//oai:record", ns):
116
+             header = record.find("oai:header", ns)
117
+             identifier = header.find("oai:identifier", ns).text
118
+             datestamp = header.find("oai:datestamp", ns).text
119
+             records.append({"identifier": identifier, "datestamp": datestamp})
120
+
121
+         token_elem = root.find(".//oai:resumptionToken", ns)
122
+         if token_elem is not None and token_elem.text:
123
+             url = f"{base_url}?verb=ListRecords&resumptionToken={urllib.parse.quote(token_elem.text, safe='')}"
124
+         else:
125
+             url = None
126
+
127
+     return records
128
+ ```
129
+
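The harvester above keeps only header fields. As a sketch of reading the descriptive metadata too, the helper below pulls titles and creators out of one `oai_dc` record. The `parse_dc_record` name and the sample record are illustrative; the namespaces are the standard OAI-PMH and Dublin Core element URIs.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def parse_dc_record(record: ET.Element) -> dict:
    """Extract the identifier, titles, and creators from one oai_dc record."""
    return {
        "identifier": record.findtext("oai:header/oai:identifier", default="", namespaces=NS),
        "titles": [e.text for e in record.findall(".//dc:title", NS) if e.text],
        "creators": [e.text for e in record.findall(".//dc:creator", NS) if e.text],
    }


# Made-up record in the shape a ListRecords response would contain.
sample = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header><identifier>oai:example.edu:123</identifier></header>
  <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An Example Thesis</dc:title>
      <dc:creator>Doe, Jane</dc:creator>
    </oai_dc:dc>
  </metadata>
</record>"""
parsed = parse_dc_record(ET.fromstring(sample))
print(parsed)
```

Deleted records carry a header but no metadata, so they come back with empty `titles` and `creators` lists rather than raising.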
130
+ ### Key OAI-PMH Verbs
131
+
132
+ | Verb | Purpose |
133
+ |------|---------|
134
+ | `Identify` | Get repository name, admin email, policies |
135
+ | `ListSets` | List available collections/sets |
136
+ | `ListMetadataFormats` | List supported metadata schemas |
137
+ | `ListIdentifiers` | Lightweight listing of record headers |
138
+ | `ListRecords` | Full metadata records with pagination |
139
+ | `GetRecord` | Retrieve a single record by identifier |
140
+
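Each verb in the table is sent as a query parameter on the same base URL. A small sketch of building such request URLs follows; `oai_request_url` is a hypothetical helper and the endpoint is a placeholder, while the argument names (`metadataPrefix`, `from`) come from the OAI-PMH protocol.

```python
from urllib.parse import urlencode


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    for key, value in arguments.items():
        # Python reserves "from", so callers pass from_ and the underscore is dropped here
        query[key.rstrip("_")] = value
    return f"{base_url}?{urlencode(query)}"


base = "https://repository.example.edu/oai/request"  # placeholder endpoint
identify_url = oai_request_url(base, "Identify")
list_url = oai_request_url(base, "ListIdentifiers", metadataPrefix="oai_dc", from_="2024-01-01")
print(identify_url)
print(list_url)
```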
141
+ ## Major Repository Platforms
142
+
143
+ ### DSpace
144
+
145
+ The most widely deployed open-source repository platform (used by ~40% of repositories worldwide):
146
+
147
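+ - OAI-PMH endpoint: `{base-url}/oai/request` (DSpace 6 and earlier) or `{base-url}/server/oai/request` (DSpace 7 and later)
148
+ - REST API: `{base-url}/server/api` (DSpace 7 and later)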
149
+ - Supports Dublin Core, METS, and custom metadata schemas
150
+ - Examples: MIT DSpace, University of Cambridge Repository
151
+
152
+ ### EPrints
153
+
154
+ Popular in the UK and Europe:
155
+
156
+ - OAI-PMH endpoint: `{base-url}/cgi/oai2`
157
+ - REST API: `{base-url}/cgi/export/{id}/{format}`
158
+ - Strong support for research output types (articles, theses, conference items)
159
+ - Examples: University of Southampton EPrints
160
+
161
+ ### Fedora / Islandora
162
+
163
+ Used by larger institutions with complex digital collections:
164
+
165
+ - Typically paired with a discovery layer (Solr/Blacklight)
166
+ - Strong support for digital preservation workflows
167
+ - Examples: University of Toronto, Smithsonian Institution
168
+
169
+ ## Building a Harvesting Pipeline
170
+
171
+ ### Systematic Collection Workflow
172
+
173
+ ```
174
+ 1. Identify target repositories
175
+ - Use OpenDOAR to find IRs by subject or country
176
+ - List subject repositories relevant to your discipline
177
+
178
+ 2. Test endpoints
179
+ - Send Identify request to verify the endpoint is active
180
+ - Check ListMetadataFormats for available schemas
181
+
182
+ 3. Harvest incrementally
183
+ - Use "from" parameter to harvest only new records
184
+ - Store last harvest date for each repository
185
+ - Respect rate limits (typically 1 request per second)
186
+
187
+ 4. Deduplicate
188
+ - Match records by DOI when available
189
+ - Use title + author fuzzy matching for records without DOIs
190
+ - Flag duplicates rather than deleting (keep provenance)
191
+
192
+ 5. Store and index
193
+ - Save metadata in structured format (JSON, SQLite, CSV)
194
+ - Build a local search index for efficient retrieval
195
+ ```
196
+
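The deduplication step in the workflow above can be sketched with the standard library. This is one possible approach under stated assumptions: records are dicts with `id`, `doi`, `title`, and `author` keys (illustrative field names), and the 0.9 similarity threshold is an arbitrary starting point to tune on real data.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def is_duplicate(a: Dict, b: Dict, threshold: float = 0.9) -> bool:
    """Compare by DOI when both records have one, otherwise by fuzzy title + author."""
    doi_a, doi_b = (a.get("doi") or "").lower(), (b.get("doi") or "").lower()
    if doi_a and doi_b:
        return doi_a == doi_b
    key_a = normalize(f"{a.get('title', '')} {a.get('author', '')}")
    key_b = normalize(f"{b.get('title', '')} {b.get('author', '')}")
    return SequenceMatcher(None, key_a, key_b).ratio() >= threshold


def flag_duplicates(records: List[Dict]) -> List[Dict]:
    """Set duplicate_of on later copies instead of deleting them, keeping provenance."""
    kept: List[Dict] = []
    for rec in records:
        match = next((k for k in kept if is_duplicate(rec, k)), None)
        rec["duplicate_of"] = match["id"] if match else None
        if match is None:
            kept.append(rec)
    return records


# Made-up records harvested from different repositories.
records = [
    {"id": "r1", "doi": "10.1000/xyz", "title": "Open Access in Practice", "author": "Doe"},
    {"id": "r2", "doi": "10.1000/XYZ", "title": "Open access in practice.", "author": "Doe"},
    {"id": "r3", "doi": None, "title": "Open  Access in Practice", "author": "Doe"},
    {"id": "r4", "doi": None, "title": "Harvesting Metadata at Scale", "author": "Roe"},
]
flags = [r["duplicate_of"] for r in flag_duplicates(records)]
print(flags)
```

The pairwise comparison is quadratic in the number of kept records, which is fine for a few thousand items; larger harvests would need blocking on a cheaper key first.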
197
+ ## Ethical Considerations
198
+
199
+ - Always respect `robots.txt` and repository rate limits
200
+ - Metadata harvesting is generally permitted; bulk full-text download may require permission
201
+ - Check each repository's terms of use before harvesting
202
+ - Use harvested data for research purposes, not commercial redistribution
203
+ - Attribute the source repository in publications using harvested data
204
+ - Consider reaching out to repository administrators for large-scale harvesting projects
205
+
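A minimal sketch of the `robots.txt` check using Python's standard `urllib.robotparser`. The rules, URLs, and user agent name below are invented for illustration; against a live repository you would call `set_url()` and `read()` on the parser instead of parsing an inline string.

```python
import urllib.robotparser


def build_robot_rules(robots_txt: str) -> urllib.robotparser.RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Invented rules for illustration only.
sample_robots = """User-agent: *
Disallow: /bitstream/
Crawl-delay: 2
"""
rules = build_robot_rules(sample_robots)
agent = "my-harvester"  # hypothetical user agent name

oai_allowed = rules.can_fetch(agent, "https://repository.example.edu/oai/request")
pdf_allowed = rules.can_fetch(agent, "https://repository.example.edu/bitstream/123/thesis.pdf")
delay = rules.crawl_delay(agent)
print(oai_allowed, pdf_allowed, delay)
```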
206
+ ## References
207
+
208
+ - OpenDOAR: https://v2.sherpa.ac.uk/opendoar/
209
+ - OAI-PMH specification: http://www.openarchives.org/OAI/openarchivesprotocol.html
210
+ - CORE: https://core.ac.uk
211
+ - BASE: https://www.base-search.net
212
+ - DSpace documentation: https://wiki.lyrasis.org/display/DSPACE