@wentorai/research-plugins 1.2.2 → 1.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (141)
  1. package/README.md +16 -8
  2. package/openclaw.plugin.json +10 -3
  3. package/package.json +2 -5
  4. package/skills/analysis/dataviz/SKILL.md +25 -0
  5. package/skills/analysis/dataviz/chart-image-generator/SKILL.md +1 -1
  6. package/skills/analysis/econometrics/SKILL.md +23 -0
  7. package/skills/analysis/econometrics/robustness-checks/SKILL.md +1 -1
  8. package/skills/analysis/statistics/SKILL.md +21 -0
  9. package/skills/analysis/statistics/data-anomaly-detection/SKILL.md +1 -1
  10. package/skills/analysis/statistics/ml-experiment-tracker/SKILL.md +1 -1
  11. package/skills/analysis/statistics/{senior-data-scientist-guide → modeling-strategy-guide}/SKILL.md +5 -5
  12. package/skills/analysis/wrangling/SKILL.md +21 -0
  13. package/skills/analysis/wrangling/csv-data-analyzer/SKILL.md +1 -1
  14. package/skills/analysis/wrangling/data-cog-guide/SKILL.md +1 -1
  15. package/skills/domains/ai-ml/SKILL.md +37 -0
  16. package/skills/domains/biomedical/SKILL.md +28 -0
  17. package/skills/domains/biomedical/genomas-guide/SKILL.md +1 -1
  18. package/skills/domains/biomedical/med-researcher-guide/SKILL.md +1 -1
  19. package/skills/domains/biomedical/medgeclaw-guide/SKILL.md +1 -1
  20. package/skills/domains/business/SKILL.md +17 -0
  21. package/skills/domains/business/architecture-design-guide/SKILL.md +1 -1
  22. package/skills/domains/chemistry/SKILL.md +19 -0
  23. package/skills/domains/chemistry/computational-chemistry-guide/SKILL.md +1 -1
  24. package/skills/domains/cs/SKILL.md +21 -0
  25. package/skills/domains/ecology/SKILL.md +16 -0
  26. package/skills/domains/economics/SKILL.md +20 -0
  27. package/skills/domains/economics/post-labor-economics/SKILL.md +1 -1
  28. package/skills/domains/economics/pricing-psychology-guide/SKILL.md +1 -1
  29. package/skills/domains/education/SKILL.md +19 -0
  30. package/skills/domains/education/academic-study-methods/SKILL.md +1 -1
  31. package/skills/domains/education/edumcp-guide/SKILL.md +1 -1
  32. package/skills/domains/finance/SKILL.md +19 -0
  33. package/skills/domains/finance/akshare-finance-data/SKILL.md +1 -1
  34. package/skills/domains/finance/options-analytics-agent-guide/SKILL.md +1 -1
  35. package/skills/domains/finance/stata-accounting-research/SKILL.md +1 -1
  36. package/skills/domains/geoscience/SKILL.md +17 -0
  37. package/skills/domains/humanities/SKILL.md +16 -0
  38. package/skills/domains/humanities/history-research-guide/SKILL.md +1 -1
  39. package/skills/domains/humanities/political-history-guide/SKILL.md +1 -1
  40. package/skills/domains/law/SKILL.md +19 -0
  41. package/skills/domains/math/SKILL.md +17 -0
  42. package/skills/domains/pharma/SKILL.md +17 -0
  43. package/skills/domains/physics/SKILL.md +16 -0
  44. package/skills/domains/social-science/SKILL.md +17 -0
  45. package/skills/domains/social-science/sociology-research-methods/SKILL.md +1 -1
  46. package/skills/literature/discovery/SKILL.md +20 -0
  47. package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +1 -1
  48. package/skills/literature/discovery/semantic-paper-radar/SKILL.md +1 -1
  49. package/skills/literature/fulltext/SKILL.md +26 -0
  50. package/skills/literature/metadata/SKILL.md +35 -0
  51. package/skills/literature/metadata/doi-content-negotiation/SKILL.md +4 -0
  52. package/skills/literature/metadata/doi-resolution-guide/SKILL.md +4 -0
  53. package/skills/literature/metadata/orcid-api/SKILL.md +4 -0
  54. package/skills/literature/metadata/orcid-integration-guide/SKILL.md +4 -0
  55. package/skills/literature/search/SKILL.md +43 -0
  56. package/skills/literature/search/paper-search-mcp-guide/SKILL.md +1 -1
  57. package/skills/research/automation/SKILL.md +21 -0
  58. package/skills/research/deep-research/SKILL.md +24 -0
  59. package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +1 -1
  60. package/skills/research/deep-research/in-depth-research-guide/SKILL.md +1 -1
  61. package/skills/research/funding/SKILL.md +20 -0
  62. package/skills/research/methodology/SKILL.md +24 -0
  63. package/skills/research/paper-review/SKILL.md +19 -0
  64. package/skills/research/paper-review/paper-critique-framework/SKILL.md +1 -1
  65. package/skills/tools/code-exec/SKILL.md +18 -0
  66. package/skills/tools/diagram/SKILL.md +20 -0
  67. package/skills/tools/document/SKILL.md +21 -0
  68. package/skills/tools/knowledge-graph/SKILL.md +21 -0
  69. package/skills/tools/ocr-translate/SKILL.md +18 -0
  70. package/skills/tools/ocr-translate/handwriting-recognition-guide/SKILL.md +2 -0
  71. package/skills/tools/ocr-translate/latex-ocr-guide/SKILL.md +2 -0
  72. package/skills/tools/scraping/SKILL.md +17 -0
  73. package/skills/writing/citation/SKILL.md +33 -0
  74. package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +2 -0
  75. package/skills/writing/composition/SKILL.md +22 -0
  76. package/skills/writing/composition/research-paper-writer/SKILL.md +1 -1
  77. package/skills/writing/composition/scientific-writing-wrapper/SKILL.md +1 -1
  78. package/skills/writing/latex/SKILL.md +22 -0
  79. package/skills/writing/latex/academic-writing-latex/SKILL.md +1 -1
  80. package/skills/writing/latex/latex-drawing-guide/SKILL.md +1 -1
  81. package/skills/writing/polish/SKILL.md +20 -0
  82. package/skills/writing/polish/chinese-text-humanizer/SKILL.md +1 -1
  83. package/skills/writing/templates/SKILL.md +22 -0
  84. package/skills/writing/templates/beamer-presentation-guide/SKILL.md +1 -1
  85. package/skills/writing/templates/scientific-article-pdf/SKILL.md +1 -1
  86. package/skills/analysis/dataviz/citation-map-guide/SKILL.md +0 -184
  87. package/skills/analysis/dataviz/data-visualization-principles/SKILL.md +0 -171
  88. package/skills/analysis/econometrics/empirical-paper-analysis/SKILL.md +0 -192
  89. package/skills/analysis/econometrics/panel-data-regression-workflow/SKILL.md +0 -267
  90. package/skills/analysis/econometrics/stata-regression/SKILL.md +0 -117
  91. package/skills/analysis/statistics/general-statistics-guide/SKILL.md +0 -226
  92. package/skills/analysis/statistics/infiagent-benchmark-guide/SKILL.md +0 -106
  93. package/skills/analysis/statistics/pywayne-statistics-guide/SKILL.md +0 -192
  94. package/skills/analysis/statistics/quantitative-methods-guide/SKILL.md +0 -193
  95. package/skills/analysis/wrangling/claude-data-analysis-guide/SKILL.md +0 -100
  96. package/skills/analysis/wrangling/open-data-scientist-guide/SKILL.md +0 -197
  97. package/skills/domains/ai-ml/annotated-dl-papers-guide/SKILL.md +0 -159
  98. package/skills/domains/humanities/digital-humanities-methods/SKILL.md +0 -232
  99. package/skills/domains/law/legal-research-methods/SKILL.md +0 -190
  100. package/skills/domains/social-science/sociology-research-guide/SKILL.md +0 -238
  101. package/skills/literature/discovery/arxiv-paper-monitoring/SKILL.md +0 -233
  102. package/skills/literature/discovery/paper-tracking-guide/SKILL.md +0 -211
  103. package/skills/literature/fulltext/zotero-scihub-guide/SKILL.md +0 -168
  104. package/skills/literature/search/arxiv-osiris/SKILL.md +0 -199
  105. package/skills/literature/search/deepgit-search-guide/SKILL.md +0 -147
  106. package/skills/literature/search/multi-database-literature-search/SKILL.md +0 -198
  107. package/skills/literature/search/papers-chat-guide/SKILL.md +0 -194
  108. package/skills/literature/search/pasa-paper-search-guide/SKILL.md +0 -138
  109. package/skills/literature/search/scientify-literature-survey/SKILL.md +0 -203
  110. package/skills/research/automation/ai-scientist-guide/SKILL.md +0 -228
  111. package/skills/research/automation/coexist-ai-guide/SKILL.md +0 -149
  112. package/skills/research/automation/foam-agent-guide/SKILL.md +0 -203
  113. package/skills/research/automation/research-paper-orchestrator/SKILL.md +0 -254
  114. package/skills/research/deep-research/academic-deep-research/SKILL.md +0 -190
  115. package/skills/research/deep-research/cognitive-kernel-guide/SKILL.md +0 -200
  116. package/skills/research/deep-research/corvus-research-guide/SKILL.md +0 -132
  117. package/skills/research/deep-research/deep-research-pro/SKILL.md +0 -213
  118. package/skills/research/deep-research/deep-research-work/SKILL.md +0 -204
  119. package/skills/research/deep-research/research-cog/SKILL.md +0 -153
  120. package/skills/research/methodology/academic-mentor-guide/SKILL.md +0 -169
  121. package/skills/research/methodology/deep-innovator-guide/SKILL.md +0 -242
  122. package/skills/research/methodology/research-pipeline-units-guide/SKILL.md +0 -169
  123. package/skills/research/paper-review/paper-compare-guide/SKILL.md +0 -238
  124. package/skills/research/paper-review/paper-digest-guide/SKILL.md +0 -240
  125. package/skills/research/paper-review/paper-research-assistant/SKILL.md +0 -231
  126. package/skills/research/paper-review/research-quality-filter/SKILL.md +0 -261
  127. package/skills/tools/code-exec/contextplus-mcp-guide/SKILL.md +0 -110
  128. package/skills/tools/diagram/clawphd-guide/SKILL.md +0 -149
  129. package/skills/tools/diagram/scientific-graphical-abstract/SKILL.md +0 -201
  130. package/skills/tools/document/md2pdf-xelatex/SKILL.md +0 -212
  131. package/skills/tools/document/openpaper-guide/SKILL.md +0 -232
  132. package/skills/tools/document/weknora-guide/SKILL.md +0 -216
  133. package/skills/tools/knowledge-graph/mimir-memory-guide/SKILL.md +0 -135
  134. package/skills/tools/knowledge-graph/open-webui-tools-guide/SKILL.md +0 -156
  135. package/skills/tools/ocr-translate/formula-recognition-guide/SKILL.md +0 -367
  136. package/skills/tools/ocr-translate/math-equation-renderer/SKILL.md +0 -198
  137. package/skills/tools/scraping/api-data-collection-guide/SKILL.md +0 -301
  138. package/skills/writing/citation/academic-citation-manager-guide/SKILL.md +0 -182
  139. package/skills/writing/composition/opendraft-thesis-guide/SKILL.md +0 -200
  140. package/skills/writing/composition/paper-debugger-guide/SKILL.md +0 -143
  141. package/skills/writing/composition/paperforge-guide/SKILL.md +0 -205
@@ -1,198 +0,0 @@
1
- ---
2
- name: math-equation-renderer
3
- description: "Render LaTeX math equations as publication-ready PNG and SVG images"
4
- metadata:
5
-   openclaw:
6
-     emoji: "🔢"
7
-     category: "tools"
8
-     subcategory: "ocr-translate"
9
-     keywords: ["latex math", "equation rendering", "math images", "formula to png", "math typesetting", "tex rendering"]
10
-     source: "https://clawhub.ai/huaruoji/math-images"
11
- ---
12
-
13
- # Math Equation Renderer — LaTeX to Image
14
-
15
- ## Overview
16
-
17
- Rendering LaTeX math equations as standalone images (PNG, SVG, PDF) is essential for presentations, social media, documentation, and any context where native LaTeX rendering is unavailable. This guide covers multiple methods: command-line tools, Python libraries, and web APIs. Choose based on your workflow: batch processing favors CLI tools, while one-off equations work well with web APIs.
18
-
19
- ## Method 1: TeX + dvipng (Command Line)
20
-
21
- The classic approach using a minimal LaTeX document:
22
-
23
- ```bash
24
- # Install prerequisites
25
- # macOS: brew install --cask mactex
26
- # Ubuntu: sudo apt install texlive-latex-base dvipng
27
-
28
- # Create a minimal .tex file
29
- cat > equation.tex << 'EOF'
30
- \documentclass[border=2pt]{standalone}
31
- \usepackage{amsmath,amssymb}
32
- \begin{document}
33
- $\displaystyle E = mc^2$
34
- \end{document}
35
- EOF
36
-
37
- # Compile to DVI then PNG
38
- latex equation.tex
39
- dvipng -D 300 -bg Transparent -T tight equation.dvi -o equation.png
40
-
41
- # Or compile directly to PDF
42
- pdflatex equation.tex
43
- ```
44
-
45
- ### Batch Rendering Script
46
-
47
- ```bash
48
- #!/bin/bash
49
- # Render multiple equations from a text file (one per line)
50
- INPUT="equations.txt"
51
- OUTPUT_DIR="./rendered"
52
- DPI=300
53
- mkdir -p "$OUTPUT_DIR"
54
-
55
- i=1
56
- while IFS= read -r eq; do
57
- cat > "/tmp/eq_${i}.tex" << EOF
58
- \documentclass[border=2pt]{standalone}
59
- \usepackage{amsmath,amssymb}
60
- \begin{document}
61
- \$\displaystyle ${eq}\$
62
- \end{document}
63
- EOF
64
- (cd /tmp && latex -interaction=batchmode "eq_${i}.tex" && \
65
- dvipng -D "$DPI" -bg Transparent -T tight "eq_${i}.dvi" \
66
- -o "${OLDPWD}/${OUTPUT_DIR}/eq_${i}.png") 2>/dev/null
67
- echo "Rendered equation $i: $eq"
68
- ((i++))
69
- done < "$INPUT"
70
- echo "Done. $((i-1)) equations rendered to $OUTPUT_DIR/"
71
- ```
72
-
73
- ## Method 2: matplotlib (Python)
74
-
75
- ```python
76
- import matplotlib.pyplot as plt
77
- import matplotlib
78
- matplotlib.use('Agg') # Non-interactive backend
79
-
80
- def render_equation(latex_str: str, output_path: str, dpi: int = 300,
81
-                     fontsize: int = 20, color: str = "black"):
82
-     """Render a LaTeX equation to PNG."""
83
-     fig, ax = plt.subplots(figsize=(0.1, 0.1))
84
-     ax.axis('off')
85
-     text = ax.text(0.5, 0.5, f"${latex_str}$",
86
-                    fontsize=fontsize, color=color,
87
-                    ha='center', va='center',
88
-                    transform=ax.transAxes)
89
-
90
-     # Auto-size the figure to fit the equation
91
-     fig.savefig(output_path, dpi=dpi, bbox_inches='tight',
92
-                 pad_inches=0.1, transparent=True)
93
-     plt.close(fig)
94
-     print(f"Saved: {output_path}")
95
-
96
- # Usage
97
- render_equation(r"\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}",
98
- "maxwell.png")
99
- render_equation(r"\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}",
100
- "gaussian.png")
101
- ```
102
-
103
- ### Batch Rendering with matplotlib
104
-
105
- ```python
106
- equations = {
107
- "schrodinger": r"i\hbar\frac{\partial}{\partial t}\Psi = \hat{H}\Psi",
108
- "einstein": r"R_{\mu\nu} - \frac{1}{2}Rg_{\mu\nu} = \frac{8\pi G}{c^4}T_{\mu\nu}",
109
- "euler": r"e^{i\pi} + 1 = 0",
110
- "bayes": r"P(A|B) = \frac{P(B|A)\,P(A)}{P(B)}",
111
- "fourier": r"\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\,e^{-2\pi ix\xi}\,dx",
112
- }
113
-
114
- for name, eq in equations.items():
115
-     render_equation(eq, f"equations/{name}.png", dpi=300, fontsize=24)  # the equations/ directory must already exist
116
- ```
117
-
118
- ## Method 3: sympy (Python — Symbolic Math)
119
-
120
- ```python
121
- from sympy import Function, Integral, Matrix, exp, oo, symbols
122
- from sympy.printing.preview import preview
123
-
124
- x, y, z = symbols('x y z')
125
- f = Function('f')
126
-
127
- # Render symbolic expression
128
- expr = Integral(exp(-x**2), (x, -oo, oo))
129
- preview(expr, viewer='file', filename='integral.png',
130
- dvioptions=['-D', '300', '-bg', 'Transparent'])
131
-
132
- # Render matrix
133
- M = Matrix([[1, x], [y, x*y]])
134
- preview(M, viewer='file', filename='matrix.png',
135
- dvioptions=['-D', '300', '-bg', 'Transparent'])
136
- ```
137
-
138
- ## Method 4: Web APIs (No Local TeX Required)
139
-
140
- ### Codecogs API
141
-
142
- ```bash
143
- # LaTeX passed in the query string → PNG (-g stops curl from globbing the braces)
144
- curl -g -o equation.png \
145
-   "https://latex.codecogs.com/png.image?\dpi{300}\bg{transparent}E=mc^2"
146
-
147
- # SVG output
148
- curl -o equation.svg \
149
- "https://latex.codecogs.com/svg.image?E=mc^2"
150
- ```
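The curl commands above put raw LaTeX in the query string, so backslashes, braces, `=` and `^` reach the server unescaped. A minimal Python sketch of building the same URLs with percent-encoding follows. The helper name `codecogs_url` is hypothetical, and it assumes the service decodes a percent-encoded query the way a standard HTTP server would.

```python
from typing import Optional
from urllib.parse import quote

CODECOGS_BASE = "https://latex.codecogs.com"  # host used in the curl commands above


def codecogs_url(latex: str, fmt: str = "png", dpi: Optional[int] = None) -> str:
    """Build a percent-encoded image URL for a LaTeX expression (hypothetical helper)."""
    if fmt not in ("png", "svg"):
        raise ValueError(f"unsupported format: {fmt}")
    # \dpi{...} is the prefix the PNG command above uses; it only matters for raster output.
    prefix = f"\\dpi{{{dpi}}}" if (dpi and fmt == "png") else ""
    # safe="" encodes every reserved character, including backslashes, braces, '=' and '^'.
    return f"{CODECOGS_BASE}/{fmt}.image?{quote(prefix + latex, safe='')}"


print(codecogs_url("E=mc^2", fmt="svg"))
print(codecogs_url("E=mc^2", dpi=300))
```

The resulting string can be handed to curl or `requests.get` unchanged, which avoids shell quoting and URL globbing problems.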
151
-
152
- ### MathJax (HTML to Image)
153
-
154
- ```python
155
- # Using Playwright to render MathJax equations
156
- from playwright.sync_api import sync_playwright
157
-
158
- def mathjax_to_png(latex: str, output: str):
159
-     html = f"""<!DOCTYPE html>
160
- <html><head>
161
- <script src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-svg.js"></script>
162
- </head><body>
163
- <div id="eq">$$\\displaystyle {latex}$$</div>
164
- </body></html>"""
165
-
166
-     with sync_playwright() as p:
167
-         browser = p.chromium.launch()
168
-         page = browser.new_page()
169
-         page.set_content(html)
170
-         page.wait_for_function("() => window.MathJax && MathJax.startup.promise")
171
-         element = page.query_selector("#eq")
172
-         element.screenshot(path=output)
173
-         browser.close()
174
- ```
175
-
176
- ## Output Format Comparison
177
-
178
- | Format | Best For | Scalable? | Transparent BG? |
179
- |--------|----------|-----------|-----------------|
180
- | PNG | Presentations, docs, web | No (raster) | Yes (with `-bg Transparent`) |
181
- | SVG | Web, scalable contexts | Yes (vector) | Yes |
182
- | PDF | LaTeX inclusion, print | Yes (vector) | Yes |
183
- | EPS | Legacy LaTeX workflows | Yes (vector) | No |
184
-
185
- ## Tips
186
-
187
- - **DPI**: Use 300 for print, 150 for screen, 96 for web thumbnails
188
- - **Font consistency**: Match the equation font to your document's body font
189
- - **Dark mode**: Render with white text color for dark backgrounds: `color="white"`
190
- - **Accessibility**: Always provide alt-text describing the equation when embedding images
191
- - **Version control**: Store the LaTeX source alongside rendered images for reproducibility
192
-
193
- ## References
194
-
195
- - [dvipng Documentation](https://ctan.org/pkg/dvipng)
196
- - [matplotlib mathtext](https://matplotlib.org/stable/gallery/text_labels_and_annotations/mathtext_examples.html)
197
- - [CodeCogs Equation Editor](https://latex.codecogs.com/)
198
- - [MathJax Documentation](https://docs.mathjax.org/)
@@ -1,301 +0,0 @@
1
- ---
2
- name: api-data-collection-guide
3
- description: "API-based data collection and web scraping for research"
4
- metadata:
5
-   openclaw:
6
-     emoji: "spider"
7
-     category: "tools"
8
-     subcategory: "scraping"
9
-     keywords: ["API data collection", "web search strategies", "data extraction", "web scraping"]
10
-     source: "wentor-research-plugins"
11
- ---
12
-
13
- # API Data Collection Guide
14
-
15
- Collect research data from web APIs and structured sources using Python, with proper rate limiting, error handling, pagination, and ethical considerations.
16
-
17
- ## API vs. Web Scraping
18
-
19
- | Approach | When to Use | Reliability | Legal Risk |
20
- |----------|------------|-------------|------------|
21
- | Official API | API exists and provides needed data | High | Low (within TOS) |
22
- | Unofficial API | Browser dev tools reveal JSON endpoints | Medium | Medium |
23
- | Web scraping | No API available, data is publicly accessible | Low (pages change) | Medium-High |
24
- | Bulk data download | Provider offers data dumps | High | Low |
25
-
26
- **Always prefer official APIs over scraping**. Check for APIs first at: ProgrammableWeb, RapidAPI, or the data provider's developer documentation.
27
-
28
- ## RESTful API Fundamentals
29
-
30
- ### HTTP Methods
31
-
32
- | Method | Purpose | Example |
33
- |--------|---------|---------|
34
- | GET | Retrieve data | `GET /api/papers?q=machine+learning` |
35
- | POST | Create or submit data | `POST /api/annotations` |
36
- | PUT | Update existing data | `PUT /api/papers/123` |
37
- | DELETE | Remove data | `DELETE /api/papers/123` |
38
-
39
- ### Common Response Codes
40
-
41
- | Code | Meaning | Action |
42
- |------|---------|--------|
43
- | 200 | Success | Process response |
44
- | 201 | Created | Resource created successfully |
45
- | 400 | Bad request | Fix query parameters |
46
- | 401 | Unauthorized | Check API key |
47
- | 403 | Forbidden | Access denied; check permissions |
48
- | 404 | Not found | Resource does not exist |
49
- | 429 | Rate limited | Wait and retry with backoff |
50
- | 500 | Server error | Retry later |
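The Action column of the table above can be read as a small decision rule. The sketch below is one possible encoding of it. The function name and the returned labels are illustrative assumptions, not part of any library.

```python
def action_for_status(code: int) -> str:
    """Map an HTTP status code to the action suggested in the table above."""
    if 200 <= code < 300:
        return "process"          # 200 / 201: use the response
    if code == 429:
        return "retry_backoff"    # rate limited: wait, then retry with backoff
    if code >= 500:
        return "retry_later"      # server error: transient, retry later
    if code in (401, 403):
        return "check_auth"       # API key or permissions problem
    return "fix_request"          # 400, 404, other client errors: retrying will not help


for code in (200, 404, 429, 503):
    print(code, action_for_status(code))
```

Separating retryable codes (429, 5xx) from non-retryable ones keeps a client from repeating requests that cannot succeed.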
51
-
52
- ## Python API Client Template
53
-
54
- ```python
55
- import requests
56
- import time
57
- import json
58
- import logging
59
- from pathlib import Path
60
- from datetime import datetime
61
-
62
- logging.basicConfig(level=logging.INFO)
63
- logger = logging.getLogger(__name__)
64
-
65
- class APIClient:
66
-     """Reusable API client with rate limiting, retries, and caching."""
67
-
68
-     def __init__(self, base_url, api_key=None, rate_limit=1.0, max_retries=3):
69
-         self.base_url = base_url.rstrip("/")
70
-         self.session = requests.Session()
71
-         if api_key:
72
-             self.session.headers["Authorization"] = f"Bearer {api_key}"
73
-         self.session.headers["User-Agent"] = "ResearchCollector/1.0 (academic research)"
74
-         self.rate_limit = rate_limit # seconds between requests
75
-         self.max_retries = max_retries
76
-         self.last_request_time = 0
77
-         self.cache_dir = Path("./cache")
78
-         self.cache_dir.mkdir(exist_ok=True)
79
-
80
-     def _rate_limit_wait(self):
81
-         """Enforce minimum time between requests."""
82
-         elapsed = time.time() - self.last_request_time
83
-         if elapsed < self.rate_limit:
84
-             time.sleep(self.rate_limit - elapsed)
85
-         self.last_request_time = time.time()
86
-
87
-     def _get_cache_key(self, endpoint, params):
88
-         """Generate a cache key from the request."""
89
-         import hashlib
90
-         key_string = f"{endpoint}_{json.dumps(params, sort_keys=True)}"
91
-         return hashlib.md5(key_string.encode()).hexdigest()
92
-
93
-     def get(self, endpoint, params=None, use_cache=True):
94
-         """Make a GET request with rate limiting, retries, and caching."""
95
-         cache_key = self._get_cache_key(endpoint, params or {})
96
-         cache_file = self.cache_dir / f"{cache_key}.json"
97
-
98
-         # Check cache
99
-         if use_cache and cache_file.exists():
100
-             logger.debug(f"Cache hit: {endpoint}")
101
-             return json.loads(cache_file.read_text())
102
-
103
-         url = f"{self.base_url}/{endpoint.lstrip('/')}"
104
-
105
-         for attempt in range(self.max_retries):
106
-             self._rate_limit_wait()
107
-             try:
108
-                 response = self.session.get(url, params=params, timeout=30)
109
-
110
-                 if response.status_code == 200:
111
-                     data = response.json()
112
-                     # Save to cache
113
-                     cache_file.write_text(json.dumps(data))
114
-                     return data
115
-
116
-                 elif response.status_code == 429:
117
-                     retry_after = int(response.headers.get("Retry-After", 60))
118
-                     logger.warning(f"Rate limited. Waiting {retry_after}s...")
119
-                     time.sleep(retry_after)
120
-
121
-                 elif response.status_code >= 500:
122
-                     logger.warning(f"Server error {response.status_code}. Retry {attempt+1}/{self.max_retries}")
123
-                     time.sleep(2 ** attempt) # Exponential backoff
124
-
125
-                 else:
126
-                     logger.error(f"Request failed: {response.status_code} {response.text[:200]}")
127
-                     return None
128
-
129
-             except requests.exceptions.RequestException as e:
130
-                 logger.error(f"Request exception: {e}")
131
-                 time.sleep(2 ** attempt)
132
-
133
-         logger.error(f"Max retries exceeded for {endpoint}")
134
-         return None
135
-
136
-     def paginate(self, endpoint, params=None, page_key="page",
137
-                  results_key="results", max_pages=100):
138
-         """Automatically paginate through all results."""
139
-         params = dict(params or {})  # copy so the caller's dict is not mutated
140
-         all_results = []
141
-         page = 1
142
-
143
-         while page <= max_pages:
144
-             params[page_key] = page
145
-             data = self.get(endpoint, params)
146
-
147
-             if not data or not data.get(results_key):
148
-                 break
149
-
150
-             results = data[results_key]
151
-             all_results.extend(results)
152
-             logger.info(f"Page {page}: {len(results)} results (total: {len(all_results)})")
153
-
154
-             # Check if more pages exist
155
-             if len(results) < params.get("per_page", params.get("limit", 20)):
156
-                 break
157
-
158
-             page += 1
159
-
160
-         return all_results
161
- ```
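The 429 branch of the client above passes the `Retry-After` header straight to `int(...)`. HTTP also allows that header to carry a date instead of a number of seconds, in which case `int(...)` raises `ValueError`. The sketch below shows one way to accept both forms. `parse_retry_after` is a hypothetical helper, not part of `requests`, and the 60-second default mirrors the fallback used in the client.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, default=60.0, now=None):
    """Return the number of seconds to wait for a Retry-After header value."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():                      # delay-seconds form, e.g. "120"
        return float(value)
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):          # unparseable header: fall back to the default
        return default
    if when.tzinfo is None:                  # treat naive dates as UTC
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


fixed_now = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
print(parse_retry_after("120"))
print(parse_retry_after("Wed, 21 Oct 2015 07:28:30 GMT", now=fixed_now))
```

In the client, this would replace the `int(...)` call, for example `time.sleep(parse_retry_after(response.headers.get("Retry-After")))`.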
162
-
163
- ## Academic API Examples
164
-
165
- ### OpenAlex (Open Scholarly Metadata)
166
-
167
- ```python
168
- # OpenAlex: free, comprehensive, no authentication required
169
- client = APIClient("https://api.openalex.org", rate_limit=0.1)
170
-
171
- # Search for works
172
- results = client.get("works", params={
173
- "filter": "title.search:transformer attention mechanism",
174
- "sort": "cited_by_count:desc",
175
- "per_page": 25
176
- })
177
-
178
- for work in (results or {}).get("results", []):  # client.get returns None on failure
179
-     print(f"[{work.get('publication_year')}] {work.get('title')}")
180
-     print(f" Citations: {work.get('cited_by_count')}")
181
-     print(f" DOI: {work.get('doi')}")
182
- ```
183
-
184
- ### CrossRef (DOI Metadata)
185
-
186
- ```python
187
- client = APIClient("https://api.crossref.org", rate_limit=0.05)
188
- client.session.headers["User-Agent"] = "ResearchClaw/1.0 (mailto:researcher@university.edu)"
189
-
190
- # Search for works
191
- results = client.get("works", params={
192
- "query": "machine learning drug discovery",
193
- "rows": 20,
194
- "sort": "relevance",
195
- "order": "desc"
196
- })
197
-
198
- for item in (results or {}).get("message", {}).get("items", []):
199
-     title = (item.get("title") or ["N/A"])[0]  # title can be missing or an empty list
200
-     doi = item.get("DOI", "N/A")
201
-     cited = item.get("is-referenced-by-count", 0)
202
-     print(f" {title} | DOI: {doi} | Cited: {cited}")
203
- ```
204
-
205
- ### GitHub API (Code and Repositories)
206
-
207
- ```python
208
- # GitHub API for finding research code repositories (requires `import os` for the token lookup)
209
- client = APIClient("https://api.github.com", api_key=os.environ["GITHUB_TOKEN"], rate_limit=0.75)
210
-
211
- # Search repositories
212
- results = client.get("search/repositories", params={
213
- "q": "topic:machine-learning language:python stars:>100",
214
- "sort": "stars",
215
- "order": "desc",
216
- "per_page": 30
217
- })
218
-
219
- for repo in (results or {}).get("items", []):
220
-     print(f"{repo['full_name']} ({repo['stargazers_count']} stars)")
221
-     print(f" {(repo.get('description') or 'No description')[:80]}")  # description is null for some repos
222
- ```
223
-
224
- ## Web Scraping (When APIs Are Unavailable)
225
-
226
- ```python
227
- import requests
228
- from bs4 import BeautifulSoup
229
- import time
230
-
231
- def scrape_conference_proceedings(url, delay=2.0):
232
-     """Scrape paper titles and authors from a conference page."""
233
-     headers = {
234
-         "User-Agent": "Mozilla/5.0 (Research Bot; academic research only)"
235
-     }
236
-
237
-     response = requests.get(url, headers=headers, timeout=30)  # timeout avoids hanging on a stalled server
238
-     response.raise_for_status()
239
-
240
-     soup = BeautifulSoup(response.text, "html.parser")
241
-
242
-     papers = []
243
-     for article in soup.find_all("div", class_="paper-entry"):
244
-         title = article.find("h3")
245
-         authors = article.find("span", class_="authors")
246
-         abstract = article.find("p", class_="abstract")
247
-
248
-         papers.append({
249
-             "title": title.text.strip() if title else "N/A",
250
-             "authors": authors.text.strip() if authors else "N/A",
251
-             "abstract": abstract.text.strip() if abstract else "N/A"
252
-         })
253
-
254
-     time.sleep(delay) # Be polite
255
-     return papers
256
- ```
257
-
258
- ## Data Storage and Management
259
-
260
- ```python
261
- import pandas as pd
262
- import sqlite3
263
-
264
- def save_to_sqlite(data, db_path="research_data.db", table_name="papers"):
265
-     """Save collected data to SQLite database."""
266
-     df = pd.DataFrame(data)
267
-     conn = sqlite3.connect(db_path)
268
-     df.to_sql(table_name, conn, if_exists="append", index=False)
269
-     conn.close()
270
-     logger.info(f"Saved {len(df)} records to {db_path}:{table_name}")
271
-
272
- def save_incremental_json(data, output_file="collected_data.jsonl"):
273
-     """Append data as JSON Lines (one JSON object per line)."""
274
-     with open(output_file, "a") as f:
275
-         for record in data:
276
-             f.write(json.dumps(record) + "\n")
277
- ```
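The JSON Lines writer above appends one object per line, so a collection run can be resumed by reading the file back. A minimal reader sketch follows. `load_jsonl` is a hypothetical counterpart to `save_incremental_json`, and skipping unparseable lines is an assumption about how a run interrupted mid-write should be handled.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file into a list of records, skipping blank or truncated lines."""
    records = []
    file_path = Path(path)
    if not file_path.exists():
        return records
    with file_path.open("r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                # A partially written last line is expected after a crash; ignore it.
                continue
    return records
```

Calling `load_jsonl("collected_data.jsonl")` before a new run gives the records already collected, which can then be used to skip requests that were completed earlier.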
278
-
279
- ## Ethical and Legal Considerations
280
-
281
- | Principle | Description |
282
- |-----------|-------------|
283
- | **Respect robots.txt** | Check `robots.txt` before scraping any site |
284
- | **Rate limiting** | Never exceed 1 request/second unless the API permits more |
285
- | **Identify yourself** | Use a descriptive User-Agent with contact email |
286
- | **Terms of Service** | Read and follow the API/website TOS |
287
- | **Data minimization** | Only collect data you actually need |
288
- | **Privacy** | Do not scrape personal data without consent |
289
- | **Acknowledge sources** | Cite data sources in publications |
290
- | **IRB review** | Consult your IRB if collecting human-related data |
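The "Respect robots.txt" row above can be checked programmatically with the standard library's `urllib.robotparser`. The sketch below parses robots.txt text that has already been downloaded. The wrapper `allowed_by_robots` and the sample rules are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text instead of fetching them
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(allowed_by_robots(SAMPLE_ROBOTS, "ResearchCollector/1.0", "https://example.org/papers/2024"))
print(allowed_by_robots(SAMPLE_ROBOTS, "ResearchCollector/1.0", "https://example.org/private/drafts"))
```

For a live site, `RobotFileParser.set_url(...)` followed by `read()` fetches the file directly. Parsing from text keeps the check testable offline.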
291
-
292
- ## Troubleshooting Common Issues
293
-
294
- | Problem | Cause | Solution |
295
- |---------|-------|----------|
296
- | 403 Forbidden | Missing or incorrect authentication | Check API key, update User-Agent |
297
- | Timeout errors | Slow server or large response | Increase timeout, reduce page size |
298
- | Inconsistent data | API schema changed | Version-lock API endpoints, validate schema |
299
- | Missing fields | Optional fields are null | Use `.get()` with defaults, handle None |
300
- | Encoding errors | Non-UTF8 characters | Set `response.encoding = "utf-8"`, use `errors="replace"` |
301
- | IP blocking | Too many requests | Use exponential backoff, rotate IPs (with caution) |
@@ -1,182 +0,0 @@
1
- ---
2
- name: academic-citation-manager-guide
3
- description: "Comparison and workflow guide for academic citation management tools"
4
- metadata:
5
-   openclaw:
6
-     emoji: "📎"
7
-     category: "writing"
8
-     subcategory: "citation"
9
-     keywords: ["citation management", "reference manager", "zotero", "mendeley", "bibliography", "bibtex"]
10
-     source: "https://clawhub.com/YouStudyeveryday/academic-citation-manager"
11
- ---
12
-
13
- # Academic Citation Management Guide
14
-
15
- ## Overview
16
-
17
- Citation managers are essential tools for collecting, organizing, annotating, and citing academic references. This guide compares the major options (Zotero, Mendeley, EndNote, JabRef, Paperpile), covers core workflows, and provides best practices for building and maintaining a well-organized reference library.
18
-
19
- ## Tool Comparison
20
-
21
- | Feature | Zotero | Mendeley | EndNote | JabRef | Paperpile |
22
- |---------|--------|----------|---------|--------|-----------|
23
- | **Price** | Free (open source) | Free (Elsevier) | $250+ | Free (open source) | $3/mo (students) |
24
- | **Storage** | 300 MB free, $20/yr for 2GB | 2 GB free | Unlimited (desktop) | Local files | 10 GB |
25
- | **PDF Management** | Yes + annotations | Yes + annotations | Yes | Basic | Yes |
26
- | **Browser Extension** | Excellent (Connector) | Good (Importer) | Average | Manual | Google Docs native |
27
- | **Word Plugin** | Word + LibreOffice | Word | Word | — | Google Docs + Word |
28
- | **LaTeX/BibTeX** | Export + Better BibTeX plugin | Export | Export | Native BibTeX editor | Export |
29
- | **Collaboration** | Group libraries (free) | Teams | Share libraries | Git-friendly files | Shared folders |
30
- | **Open Source** | Yes (AGPL) | No | No | Yes (MIT) | No |
31
- | **Offline Use** | Full offline | Full offline | Full offline | Full offline | Limited |
32
-
33
- ### Recommendation by Workflow
34
-
35
- | If You... | Use |
36
- |-----------|-----|
37
- | Write in LaTeX primarily | **JabRef** (native BibTeX) or **Zotero + Better BibTeX** |
38
- | Write in Word/LibreOffice | **Zotero** (best free plugin) |
39
- | Write in Google Docs | **Paperpile** (native integration) |
40
- | Need Elsevier integration | **Mendeley** (same company) |
41
- | Need institutional license | **EndNote** (common in universities) |
42
- | Want maximum flexibility | **Zotero** (open source, 600+ plugins) |
43
-
44
- ## Core Workflow
45
-
46
- ### 1. Collecting References
47
-
48
- **Browser Extension** (Zotero Connector example):
49
- 1. Install Zotero Connector for Chrome/Firefox
50
- 2. Browse to any paper page (journal, arXiv, Google Scholar)
51
- 3. Click the Zotero icon → metadata + PDF saved automatically
52
- 4. Works on: journal websites, arXiv, PubMed, Google Scholar, Amazon (books)
53
-
54
- **Import from DOI**:
55
- ```
56
- Zotero: "Add Item(s) by Identifier" → paste DOI → auto-import
57
- Mendeley: "Add" → "Import DOI" → paste
58
- JabRef: "New Entry" → "ID-based entry generator" → paste DOI
59
- ```
60
-
61
- **Import from BibTeX**:
62
- ```bibtex
63
- % Save as references.bib, then import
64
- @article{vaswani2017attention,
65
- title={Attention is all you need},
66
- author={Vaswani, Ashish and others},
67
- journal={NeurIPS},
68
- year={2017}
69
- }
70
- ```
71
-
72
- **Batch Import from Google Scholar**:
73
- ```
74
- 1. Search Google Scholar for your topic
75
- 2. Save papers to My library (star icon) → open My library → select papers (checkbox) → Export → BibTeX
76
- 3. Import .bib file into your citation manager
77
- 4. Merge duplicates
78
- ```
79
-
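The "merge duplicates" step is usually done in the citation manager's own interface, but the idea can be illustrated in a few lines. This is a rough sketch, assuming entries have already been loaded into plain dictionaries with hypothetical `title` and `doi` keys. It matches on DOI when one is present and on a normalized title otherwise, so a copy with a DOI and a copy without one will not be merged.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase a title and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", title.lower())


def merge_duplicates(entries: list[dict]) -> list[dict]:
    """Keep one record per DOI (or per normalized title when no DOI exists).

    Fields missing from the first-seen record are filled in from later copies.
    """
    merged: dict[str, dict] = {}
    for entry in entries:
        doi = entry.get("doi", "").lower()
        key = f"doi:{doi}" if doi else f"title:{normalize_title(entry.get('title', ''))}"
        if key in merged:
            # Only add fields the kept record does not have yet.
            for field, value in entry.items():
                merged[key].setdefault(field, value)
        else:
            merged[key] = dict(entry)
    return list(merged.values())
```

Keeping the first-seen record and only filling gaps is a conservative choice: it never overwrites metadata that may already have been corrected by hand.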
80
- ### 2. Organizing References
81
-
82
- **Folder/Collection Structure** (recommended):
83
-
84
- ```
85
- My Library/
86
- ├── By Project/
87
- │ ├── Dissertation/
88
- │ │ ├── Chapter 1 - Introduction/
89
- │ │ ├── Chapter 2 - Literature Review/
90
- │ │ └── Chapter 3 - Methods/
91
- │ └── ICML 2026 Paper/
92
- ├── By Topic/
93
- │ ├── Attention Mechanisms/
94
- │ ├── Retrieval-Augmented Generation/
95
- │ └── Code Generation/
96
- └── Reading Queue/
97
- ├── To Read/
98
- ├── In Progress/
99
- └── Read + Annotated/
100
- ```
101
-
102
- **Tagging Strategy**:
103
- ```
104
- Tags by reading status: #to-read, #reading, #done
105
- Tags by relevance: #core, #supporting, #background
106
- Tags by content: #methodology, #dataset, #benchmark, #survey
107
- Tags by quality: #seminal, #highly-cited, #controversial
108
- ```
109
-
110
- ### 3. Annotating and Note-Taking
111
-
112
- ```markdown
113
- ## Per-Paper Note Template
114
-
115
- ### Quick Reference
116
- - **One-line summary**: [What this paper does in one sentence]
117
- - **Key contribution**: [The novel idea]
118
- - **Method**: [Technique used]
119
- - **Dataset**: [What data they use]
120
- - **Result**: [Main quantitative finding]
121
-
122
- ### Detailed Notes
123
- - Strengths: [What's convincing]
124
- - Weaknesses: [What's questionable]
125
- - Relevance to my work: [How does this connect?]
126
- - Follow-up: [What to read next based on this paper]
127
- ```
128
-
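If notes are kept as Markdown files, the Quick Reference part of the template above can be generated rather than retyped. A minimal sketch follows; the `render_note` helper and the `TODO` placeholder are illustrative assumptions, not a feature of any tool named here.

```python
NOTE_FIELDS = ["One-line summary", "Key contribution", "Method", "Dataset", "Result"]


def render_note(title: str, fields: dict[str, str]) -> str:
    """Render the Quick Reference section; unfilled fields become TODO."""
    lines = [f"## {title}", "", "### Quick Reference"]
    for name in NOTE_FIELDS:
        lines.append(f"- **{name}**: {fields.get(name, 'TODO')}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_note("Attention is all you need", {"Method": "Transformer"}))
```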
129
- ### 4. Citing in Documents
130
-
131
- **In Word/LibreOffice** (Zotero):
132
- 1. Place cursor where citation goes
133
- 2. Zotero toolbar → "Add Citation"
134
- 3. Search by author, title, or year
135
- 4. Select citation → insert
136
- 5. At end: "Add Bibliography" to generate reference list
137
-
138
- **In LaTeX** (Better BibTeX for Zotero):
139
- ```latex
140
- % Install the Better BibTeX plugin for Zotero
141
- % Set up auto-export:
142
- % Zotero → File → Export Library → Better BibTeX → Keep updated
143
-
144
- % In your .tex file (\citet and \citep require \usepackage{natbib}):
145
- \bibliography{exported_library}
146
- \bibliographystyle{apalike}
147
-
148
- % Cite: \cite{vaswani2017attention}
149
- % Textual: \citet{vaswani2017attention} → Vaswani et al. (2017)
150
- % Parenthetical: \citep{vaswani2017attention} → (Vaswani et al., 2017)
151
- ```
152
-
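With an auto-exported `.bib` file, a common failure is citing a key that is not in the export. The sketch below shows one way to check for that before compiling. It relies on simple regular expressions, which is an assumption that holds for typical files but not for every BibTeX or LaTeX construct (for example, it does not skip `\cite` commands inside commented-out lines). All function names are hypothetical.

```python
import re

# Matches \cite, \citet, \citep, starred variants, and optional [..] arguments.
CITE_RE = re.compile(r"\\cite[a-zA-Z]*\*?(?:\[[^\]]*\])*\{([^}]*)\}")
# Matches the entry type and citation key at the start of a BibTeX entry.
BIB_KEY_RE = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
NON_ENTRY_TYPES = {"comment", "string", "preamble"}


def cited_keys(tex: str) -> set[str]:
    """Collect every key used in a \\cite-style command."""
    keys: set[str] = set()
    for group in CITE_RE.findall(tex):
        keys.update(k.strip() for k in group.split(",") if k.strip())
    return keys


def bib_keys(bib: str) -> set[str]:
    """Collect the citation keys defined in a .bib file."""
    return {
        key
        for kind, key in BIB_KEY_RE.findall(bib)
        if kind.lower() not in NON_ENTRY_TYPES
    }


def missing_keys(tex: str, bib: str) -> list[str]:
    """Return cited keys that have no matching .bib entry, sorted."""
    return sorted(cited_keys(tex) - bib_keys(bib))
```

Reading the `.tex` and `.bib` files with `pathlib.Path(...).read_text()` and passing the strings to `missing_keys` is enough for a quick pre-compile check.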
153
- ### 5. Maintaining Your Library
154
-
155
- ```markdown
156
- ## Monthly Maintenance Checklist
157
-
158
- □ Merge duplicate entries (Zotero: right-click → Merge)
159
- □ Fix incomplete metadata (missing year, venue, DOI)
160
- □ Update "In Press" papers to final published versions
161
- □ Clean up tags (remove unused, consolidate synonyms)
162
- □ Back up library (export to BibTeX or copy the Zotero data directory)
163
- □ Check for retracted papers (Retraction Watch database)
164
- ```
165
-
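The "fix incomplete metadata" item in the checklist is easy to automate in part. As a small sketch, assuming entries are available as dictionaries with hypothetical `key`, `year`, `venue`, and `doi` fields, the following reports which of the three fields named in the checklist are missing or blank for each entry.

```python
REQUIRED_FIELDS = ("year", "venue", "doi")


def incomplete_entries(entries: list[dict]) -> dict[str, list[str]]:
    """Map each entry key to the required fields that are missing or blank."""
    report: dict[str, list[str]] = {}
    for entry in entries:
        missing = [f for f in REQUIRED_FIELDS if not str(entry.get(f, "")).strip()]
        if missing:
            report[entry.get("key", "<no key>")] = missing
    return report
```

The report only says what is absent; the values themselves still need to be looked up and confirmed by hand.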
166
- ## Citation Style Quick Reference
167
-
168
- | Style | Disciplines | In-Text | Bibliography |
169
- |-------|------------|---------|-------------|
170
- | **APA 7th** | Psychology, Social Sciences | (Author, Year) | Author, A. A. (Year). Title. *Journal*, *Vol*(Issue), pages. DOI |
171
- | **IEEE** | Engineering, CS | [1] | [1] A. Author, "Title," *Journal*, vol. X, pp. Y-Z, Year. |
172
- | **Chicago Author-Date** | Sciences, Social Sciences | (Author Year) | Author, First. Year. *Title*. Place: Publisher. |
173
- | **Harvard** | Business, Social Sciences | (Author Year) | Author, F. (Year) 'Title', *Journal*, Vol(Issue), pp. X-Y. |
174
- | **Vancouver** | Biomedical, Medicine | (1) | 1. Author AB. Title. Journal. Year;Vol(Issue):pages. |
175
- | **MLA 9th** | Humanities, Literature | (Author Page) | Author. "Title." *Journal*, vol. X, no. Y, Year, pp. Z. |
176
-
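The In-Text column of the table can be expressed as a small lookup. This is an illustrative sketch only: it covers the basic single-citation patterns listed above and ignores each style's rules for multiple authors, repeated citations, and so on. In practice the citation manager's CSL styles handle these. The `in_text` function and its parameters are hypothetical.

```python
from typing import Optional


def in_text(style: str, author: str, year: int, number: int = 1,
            page: Optional[int] = None) -> str:
    """Format a single in-text citation following the table's In-Text column."""
    if style == "APA 7th":
        return f"({author}, {year})"
    if style in ("Chicago Author-Date", "Harvard"):
        return f"({author} {year})"
    if style == "IEEE":
        return f"[{number}]"
    if style == "Vancouver":
        return f"({number})"
    if style == "MLA 9th":
        # MLA cites the page; fall back to the author alone if none is given.
        return f"({author} {page})" if page is not None else f"({author})"
    raise ValueError(f"Unknown style: {style}")


if __name__ == "__main__":
    for name in ("APA 7th", "IEEE", "Chicago Author-Date", "Vancouver"):
        print(name, "->", in_text(name, "Vaswani et al.", 2017))
```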
177
- ## References
178
-
179
- - [Zotero Documentation](https://www.zotero.org/support/)
180
- - [Better BibTeX for Zotero](https://retorque.re/zotero-better-bibtex/)
181
- - [JabRef Documentation](https://docs.jabref.org/)
182
- - [Citation Style Language](https://citationstyles.org/)