@wentorai/research-plugins 1.1.0 → 1.2.1
- package/README.md +18 -18
- package/curated/analysis/README.md +12 -1
- package/curated/domains/README.md +48 -1
- package/curated/literature/README.md +46 -1
- package/curated/research/README.md +16 -1
- package/curated/tools/README.md +20 -1
- package/curated/writing/README.md +13 -1
- package/mcp-configs/academic-db/alphafold-mcp.json +20 -0
- package/mcp-configs/academic-db/brightspace-mcp.json +21 -0
- package/mcp-configs/academic-db/climatiq-mcp.json +20 -0
- package/mcp-configs/academic-db/gibs-mcp.json +20 -0
- package/mcp-configs/academic-db/gis-mcp-server.json +22 -0
- package/mcp-configs/academic-db/google-earth-engine-mcp.json +21 -0
- package/mcp-configs/academic-db/m4-clinical-mcp.json +21 -0
- package/mcp-configs/academic-db/medical-mcp.json +21 -0
- package/mcp-configs/academic-db/nexonco-mcp.json +20 -0
- package/mcp-configs/academic-db/omop-mcp.json +20 -0
- package/mcp-configs/academic-db/onekgpd-mcp.json +20 -0
- package/mcp-configs/academic-db/openedu-mcp.json +20 -0
- package/mcp-configs/academic-db/opengenes-mcp.json +20 -0
- package/mcp-configs/academic-db/openstax-mcp.json +21 -0
- package/mcp-configs/academic-db/openstreetmap-mcp.json +21 -0
- package/mcp-configs/academic-db/opentargets-mcp.json +21 -0
- package/mcp-configs/academic-db/pdb-mcp.json +21 -0
- package/mcp-configs/academic-db/smithsonian-mcp.json +20 -0
- package/mcp-configs/ai-platform/magi-researchers.json +21 -0
- package/mcp-configs/ai-platform/mcp-academic-researcher.json +22 -0
- package/mcp-configs/ai-platform/open-paper-machine.json +21 -0
- package/mcp-configs/ai-platform/paper-intelligence.json +21 -0
- package/mcp-configs/ai-platform/paper-reader.json +21 -0
- package/mcp-configs/ai-platform/paperdebugger.json +21 -0
- package/mcp-configs/browser/exa-mcp.json +20 -0
- package/mcp-configs/browser/mcp-searxng.json +21 -0
- package/mcp-configs/browser/mcp-webresearch.json +20 -0
- package/mcp-configs/communication/discourse-mcp.json +21 -0
- package/mcp-configs/data-platform/automl-stat-mcp.json +21 -0
- package/mcp-configs/data-platform/jefferson-stats-mcp.json +22 -0
- package/mcp-configs/data-platform/mcp-excel-server.json +21 -0
- package/mcp-configs/data-platform/mcp-stata.json +21 -0
- package/mcp-configs/data-platform/mcpstack-jupyter.json +21 -0
- package/mcp-configs/data-platform/ml-mcp.json +21 -0
- package/mcp-configs/data-platform/nasdaq-data-link-mcp.json +20 -0
- package/mcp-configs/data-platform/numpy-mcp.json +21 -0
- package/mcp-configs/dev-platform/geogebra-mcp.json +21 -0
- package/mcp-configs/dev-platform/latex-mcp-server.json +21 -0
- package/mcp-configs/dev-platform/manim-mcp.json +20 -0
- package/mcp-configs/dev-platform/mcp-echarts.json +20 -0
- package/mcp-configs/dev-platform/panel-viz-mcp.json +20 -0
- package/mcp-configs/dev-platform/paperbanana.json +20 -0
- package/mcp-configs/dev-platform/texflow-mcp.json +20 -0
- package/mcp-configs/dev-platform/texmcp.json +20 -0
- package/mcp-configs/dev-platform/typst-mcp.json +21 -0
- package/mcp-configs/dev-platform/vizro-mcp.json +20 -0
- package/mcp-configs/note-knowledge/local-faiss-mcp.json +21 -0
- package/mcp-configs/note-knowledge/mcp-memory-service.json +21 -0
- package/mcp-configs/note-knowledge/mcp-obsidian.json +23 -0
- package/mcp-configs/note-knowledge/mcp-ragdocs.json +20 -0
- package/mcp-configs/note-knowledge/mcp-summarizer.json +21 -0
- package/mcp-configs/note-knowledge/mediawiki-mcp.json +21 -0
- package/mcp-configs/note-knowledge/openzim-mcp.json +20 -0
- package/mcp-configs/note-knowledge/zettelkasten-mcp.json +21 -0
- package/mcp-configs/reference-mgr/academic-paper-mcp-http.json +20 -0
- package/mcp-configs/reference-mgr/academix.json +20 -0
- package/mcp-configs/reference-mgr/arxiv-research-mcp.json +21 -0
- package/mcp-configs/reference-mgr/google-scholar-abstract-mcp.json +19 -0
- package/mcp-configs/reference-mgr/google-scholar-mcp.json +20 -0
- package/mcp-configs/reference-mgr/mcp-paperswithcode.json +21 -0
- package/mcp-configs/reference-mgr/mcp-scholarly.json +20 -0
- package/mcp-configs/reference-mgr/mcp-simple-arxiv.json +20 -0
- package/mcp-configs/reference-mgr/mcp-simple-pubmed.json +20 -0
- package/mcp-configs/reference-mgr/mcp-zotero.json +21 -0
- package/mcp-configs/reference-mgr/mendeley-mcp.json +20 -0
- package/mcp-configs/reference-mgr/ncbi-mcp-server.json +22 -0
- package/mcp-configs/reference-mgr/onecite.json +21 -0
- package/mcp-configs/reference-mgr/paper-search-mcp.json +21 -0
- package/mcp-configs/reference-mgr/pubmed-search-mcp.json +21 -0
- package/mcp-configs/reference-mgr/scholar-mcp.json +21 -0
- package/mcp-configs/reference-mgr/scholar-multi-mcp.json +21 -0
- package/mcp-configs/reference-mgr/seerai.json +21 -0
- package/mcp-configs/reference-mgr/semantic-scholar-fastmcp.json +21 -0
- package/mcp-configs/reference-mgr/sourcelibrary.json +20 -0
- package/openclaw.plugin.json +2 -2
- package/package.json +2 -2
- package/skills/analysis/dataviz/citation-map-guide/SKILL.md +184 -0
- package/skills/analysis/dataviz/data-visualization-principles/SKILL.md +171 -0
- package/skills/analysis/econometrics/econml-causal-guide/SKILL.md +2 -2
- package/skills/analysis/econometrics/empirical-paper-analysis/SKILL.md +192 -0
- package/skills/analysis/econometrics/mostly-harmless-guide/SKILL.md +2 -2
- package/skills/analysis/econometrics/panel-data-regression-workflow/SKILL.md +267 -0
- package/skills/analysis/econometrics/python-causality-guide/SKILL.md +2 -2
- package/skills/analysis/econometrics/stata-reference-guide/SKILL.md +293 -0
- package/skills/analysis/statistics/general-statistics-guide/SKILL.md +226 -0
- package/skills/analysis/statistics/infiagent-benchmark-guide/SKILL.md +106 -0
- package/skills/analysis/wrangling/claude-data-analysis-guide/SKILL.md +100 -0
- package/skills/analysis/wrangling/open-data-scientist-guide/SKILL.md +197 -0
- package/skills/analysis/wrangling/streamline-analyst-guide/SKILL.md +119 -0
- package/skills/domains/ai-ml/ai-agent-papers-guide/SKILL.md +146 -0
- package/skills/domains/ai-ml/anomaly-detection-papers-guide/SKILL.md +167 -0
- package/skills/domains/ai-ml/autonomous-agents-papers-guide/SKILL.md +178 -0
- package/skills/domains/ai-ml/domain-adaptation-papers-guide/SKILL.md +173 -0
- package/skills/domains/ai-ml/generative-ai-guide/SKILL.md +2 -2
- package/skills/domains/ai-ml/graph-learning-papers-guide/SKILL.md +125 -0
- package/skills/domains/ai-ml/kolmogorov-arnold-networks-guide/SKILL.md +185 -0
- package/skills/domains/ai-ml/npcpy-research-guide/SKILL.md +137 -0
- package/skills/domains/ai-ml/responsible-ai-guide/SKILL.md +126 -0
- package/skills/domains/ai-ml/vmas-simulator-guide/SKILL.md +129 -0
- package/skills/domains/biomedical/clawbio-guide/SKILL.md +167 -0
- package/skills/domains/biomedical/clinical-dialogue-agents-guide/SKILL.md +145 -0
- package/skills/domains/biomedical/ena-sequence-api/SKILL.md +175 -0
- package/skills/domains/biomedical/genomas-guide/SKILL.md +126 -0
- package/skills/domains/biomedical/genotex-benchmark-guide/SKILL.md +125 -0
- package/skills/domains/biomedical/med-researcher-guide/SKILL.md +161 -0
- package/skills/domains/biomedical/med-researcher-r1-guide/SKILL.md +146 -0
- package/skills/domains/biomedical/ncbi-blast-api/SKILL.md +195 -0
- package/skills/domains/biomedical/ncbi-datasets-api/SKILL.md +220 -0
- package/skills/domains/biomedical/quickgo-api/SKILL.md +181 -0
- package/skills/domains/business/xpert-bi-guide/SKILL.md +84 -0
- package/skills/domains/chemistry/cactus-cheminformatics-guide/SKILL.md +89 -0
- package/skills/domains/chemistry/chemeagle-guide/SKILL.md +147 -0
- package/skills/domains/chemistry/chemgraph-agent-guide/SKILL.md +120 -0
- package/skills/domains/cs/ai-security-papers-guide/SKILL.md +103 -0
- package/skills/domains/cs/code-llm-papers-guide/SKILL.md +131 -0
- package/skills/domains/cs/gaussian-splatting-papers-guide/SKILL.md +158 -0
- package/skills/domains/cs/llm-aiops-guide/SKILL.md +70 -0
- package/skills/domains/cs/software-heritage-api/SKILL.md +200 -0
- package/skills/domains/economics/nber-working-papers-api/SKILL.md +177 -0
- package/skills/domains/economics/repec-economics-api/SKILL.md +188 -0
- package/skills/domains/education/academic-study-methods/SKILL.md +228 -0
- package/skills/domains/education/edumcp-guide/SKILL.md +74 -0
- package/skills/domains/education/open-syllabus-api/SKILL.md +171 -0
- package/skills/domains/finance/akshare-finance-data/SKILL.md +207 -0
- package/skills/domains/finance/finsight-research-guide/SKILL.md +113 -0
- package/skills/domains/finance/options-analytics-agent-guide/SKILL.md +117 -0
- package/skills/domains/geoscience/pangaea-data-api/SKILL.md +197 -0
- package/skills/domains/humanities/digital-humanities-methods/SKILL.md +232 -0
- package/skills/domains/law/caselaw-access-api/SKILL.md +149 -0
- package/skills/domains/law/legal-agent-skills-guide/SKILL.md +132 -0
- package/skills/domains/law/legal-research-methods/SKILL.md +190 -0
- package/skills/domains/law/opencontracts-guide/SKILL.md +168 -0
- package/skills/domains/math/lean-theorem-proving-guide/SKILL.md +140 -0
- package/skills/domains/pharma/madd-drug-discovery-guide/SKILL.md +153 -0
- package/skills/domains/social-science/ipums-microdata-api/SKILL.md +211 -0
- package/skills/domains/social-science/sociology-research-methods/SKILL.md +181 -0
- package/skills/literature/discovery/arxiv-paper-monitoring/SKILL.md +233 -0
- package/skills/literature/discovery/papers-we-love-guide/SKILL.md +169 -0
- package/skills/literature/discovery/zotero-arxiv-daily-guide/SKILL.md +2 -2
- package/skills/literature/fulltext/bioc-pmc-api/SKILL.md +146 -0
- package/skills/literature/fulltext/dataverse-api/SKILL.md +215 -0
- package/skills/literature/fulltext/hal-archive-api/SKILL.md +218 -0
- package/skills/literature/fulltext/osf-api/SKILL.md +212 -0
- package/skills/literature/fulltext/pmc-ftp-bulk-download/SKILL.md +182 -0
- package/skills/literature/fulltext/zotero-ai-butler-guide/SKILL.md +166 -0
- package/skills/literature/fulltext/zotero-scihub-guide/SKILL.md +168 -0
- package/skills/literature/metadata/bibliometrix-guide/SKILL.md +164 -0
- package/skills/literature/metadata/crossref-event-data-api/SKILL.md +183 -0
- package/skills/literature/metadata/doi-content-negotiation/SKILL.md +202 -0
- package/skills/literature/metadata/orkg-api/SKILL.md +153 -0
- package/skills/literature/metadata/plumx-metrics-api/SKILL.md +188 -0
- package/skills/literature/metadata/ror-organization-api/SKILL.md +208 -0
- package/skills/literature/metadata/sophosia-reference-guide/SKILL.md +110 -0
- package/skills/literature/metadata/viaf-authority-api/SKILL.md +209 -0
- package/skills/literature/metadata/zoplicate-dedup-guide/SKILL.md +147 -0
- package/skills/literature/metadata/zotero-actions-tags-guide/SKILL.md +212 -0
- package/skills/literature/metadata/zotmoov-guide/SKILL.md +120 -0
- package/skills/literature/metadata/zutilo-guide/SKILL.md +140 -0
- package/skills/literature/search/arxiv-cli-tools/SKILL.md +172 -0
- package/skills/literature/search/arxiv-osiris/SKILL.md +199 -0
- package/skills/literature/search/base-academic-search/SKILL.md +196 -0
- package/skills/literature/search/chatpaper-guide/SKILL.md +2 -2
- package/skills/literature/search/citeseerx-api/SKILL.md +183 -0
- package/skills/literature/search/deepgit-search-guide/SKILL.md +2 -2
- package/skills/literature/search/eric-education-api/SKILL.md +199 -0
- package/skills/literature/search/findpapers-guide/SKILL.md +177 -0
- package/skills/literature/search/ieee-xplore-api/SKILL.md +177 -0
- package/skills/literature/search/lens-scholarly-api/SKILL.md +211 -0
- package/skills/literature/search/multi-database-literature-search/SKILL.md +198 -0
- package/skills/literature/search/open-library-api/SKILL.md +196 -0
- package/skills/literature/search/open-semantic-search-guide/SKILL.md +190 -0
- package/skills/literature/search/openaire-api/SKILL.md +141 -0
- package/skills/literature/search/paper-search-mcp-guide/SKILL.md +107 -0
- package/skills/literature/search/papers-chat-guide/SKILL.md +194 -0
- package/skills/literature/search/pasa-paper-search-guide/SKILL.md +2 -2
- package/skills/literature/search/plos-open-access-api/SKILL.md +203 -0
- package/skills/literature/search/scielo-api/SKILL.md +182 -0
- package/skills/literature/search/share-research-api/SKILL.md +129 -0
- package/skills/literature/search/worldcat-search-api/SKILL.md +224 -0
- package/skills/research/automation/aim-experiment-guide/SKILL.md +2 -2
- package/skills/research/automation/claude-academic-workflow-guide/SKILL.md +202 -0
- package/skills/research/automation/coexist-ai-guide/SKILL.md +149 -0
- package/skills/research/automation/datagen-research-guide/SKILL.md +2 -2
- package/skills/research/automation/foam-agent-guide/SKILL.md +203 -0
- package/skills/research/automation/kedro-pipeline-guide/SKILL.md +2 -2
- package/skills/research/automation/mle-agent-guide/SKILL.md +2 -2
- package/skills/research/automation/paper-to-agent-guide/SKILL.md +2 -2
- package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +2 -2
- package/skills/research/deep-research/cognitive-kernel-guide/SKILL.md +200 -0
- package/skills/research/deep-research/corvus-research-guide/SKILL.md +132 -0
- package/skills/research/deep-research/in-depth-research-guide/SKILL.md +205 -0
- package/skills/research/deep-research/kosmos-scientist-guide/SKILL.md +185 -0
- package/skills/research/deep-research/llm-scientific-discovery-guide/SKILL.md +178 -0
- package/skills/research/deep-research/open-researcher-guide/SKILL.md +138 -0
- package/skills/research/methodology/claude-scientific-guide/SKILL.md +2 -2
- package/skills/research/methodology/parsifal-slr-guide/SKILL.md +154 -0
- package/skills/research/methodology/research-pipeline-units-guide/SKILL.md +169 -0
- package/skills/research/methodology/slr-automation-guide/SKILL.md +235 -0
- package/skills/research/paper-review/latte-review-guide/SKILL.md +175 -0
- package/skills/research/paper-review/paper-critique-framework/SKILL.md +181 -0
- package/skills/tools/code-exec/contextplus-mcp-guide/SKILL.md +110 -0
- package/skills/tools/diagram/clawphd-guide/SKILL.md +149 -0
- package/skills/tools/diagram/kroki-diagram-api/SKILL.md +198 -0
- package/skills/tools/diagram/scientific-graphical-abstract/SKILL.md +201 -0
- package/skills/tools/document/docsgpt-guide/SKILL.md +2 -2
- package/skills/tools/document/md2pdf-xelatex/SKILL.md +212 -0
- package/skills/tools/document/openpaper-guide/SKILL.md +232 -0
- package/skills/tools/document/weknora-guide/SKILL.md +216 -0
- package/skills/tools/document/zotero-addon-market-guide/SKILL.md +108 -0
- package/skills/tools/document/zotero-night-theme-guide/SKILL.md +142 -0
- package/skills/tools/document/zotero-style-guide/SKILL.md +217 -0
- package/skills/tools/knowledge-graph/graphiti-guide/SKILL.md +2 -2
- package/skills/tools/knowledge-graph/mimir-memory-guide/SKILL.md +135 -0
- package/skills/tools/knowledge-graph/notero-zotero-notion-guide/SKILL.md +187 -0
- package/skills/tools/knowledge-graph/open-webui-tools-guide/SKILL.md +156 -0
- package/skills/tools/knowledge-graph/openspg-guide/SKILL.md +210 -0
- package/skills/tools/knowledge-graph/paperpile-notion-guide/SKILL.md +84 -0
- package/skills/tools/knowledge-graph/zotero-markdb-connect-guide/SKILL.md +162 -0
- package/skills/tools/ocr-translate/latex-translation-guide/SKILL.md +176 -0
- package/skills/tools/ocr-translate/math-equation-renderer/SKILL.md +198 -0
- package/skills/tools/ocr-translate/pdf-math-translate-guide/SKILL.md +2 -2
- package/skills/tools/ocr-translate/zotero-pdf-translate-guide/SKILL.md +2 -2
- package/skills/tools/ocr-translate/zotero-pdf2zh-guide/SKILL.md +2 -2
- package/skills/writing/citation/academic-citation-manager-guide/SKILL.md +182 -0
- package/skills/writing/citation/citation-assistant-skill/SKILL.md +192 -0
- package/skills/writing/citation/jabref-reference-guide/SKILL.md +2 -2
- package/skills/writing/citation/jasminum-zotero-guide/SKILL.md +2 -2
- package/skills/writing/citation/mendeley-api/SKILL.md +231 -0
- package/skills/writing/citation/obsidian-citation-guide/SKILL.md +2 -2
- package/skills/writing/citation/obsidian-zotero-guide/SKILL.md +2 -2
- package/skills/writing/citation/onecite-reference-guide/SKILL.md +168 -0
- package/skills/writing/citation/papersgpt-zotero-guide/SKILL.md +2 -2
- package/skills/writing/citation/papis-cli-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-better-bibtex-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-better-notes-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-gpt-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-mcp-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-mdnotes-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotero-reference-guide/SKILL.md +2 -2
- package/skills/writing/citation/zotfile-attachment-guide/SKILL.md +2 -2
- package/skills/writing/composition/opendraft-thesis-guide/SKILL.md +200 -0
- package/skills/writing/composition/paper-debugger-guide/SKILL.md +2 -2
- package/skills/writing/composition/paperforge-guide/SKILL.md +205 -0
- package/skills/writing/composition/research-paper-writer/SKILL.md +226 -0
- package/skills/writing/composition/scientific-writing-resources/SKILL.md +2 -2
- package/skills/writing/latex/academic-writing-latex/SKILL.md +285 -0
- package/skills/writing/latex/latex-drawing-collection/SKILL.md +2 -2
- package/skills/writing/latex/latex-templates-collection/SKILL.md +2 -2
- package/skills/writing/polish/chinese-text-humanizer/SKILL.md +140 -0
- package/skills/writing/templates/arxiv-preprint-template/SKILL.md +184 -0
- package/skills/writing/templates/elegant-paper-template/SKILL.md +141 -0
- package/skills/writing/templates/novathesis-guide/SKILL.md +2 -2
- package/skills/writing/templates/sjtuthesis-guide/SKILL.md +2 -2
- package/skills/writing/templates/thuthesis-guide/SKILL.md +2 -2
- package/src/tools/arxiv.ts +17 -10
- package/src/tools/crossref.ts +17 -10
- package/src/tools/openalex.ts +25 -17
- package/src/tools/pubmed.ts +19 -12
- package/src/tools/semantic-scholar.ts +20 -12
- package/src/tools/unpaywall.ts +12 -6
@@ -0,0 +1,182 @@
---
name: pmc-ftp-bulk-download
description: "Bulk download PMC Open Access articles via FTP for large-scale mining"
metadata:
  openclaw:
    emoji: "📦"
    category: "literature"
    subcategory: "fulltext"
    keywords: ["pmc", "bulk download", "ftp", "text mining", "open access", "pubmed central"]
    source: "https://www.ncbi.nlm.nih.gov/pmc/tools/ftp/"
---

# PMC FTP Bulk Download

## Overview

The PMC FTP Service provides bulk download access to millions of full-text articles from PubMed Central's Open Access Subset. Unlike the single-article APIs (E-utilities, BioC), the FTP service is designed for large-scale corpus construction: downloading entire collections for text mining, NLP training, systematic reviews, and bibliometric analysis. The service is free and requires no authentication.

**Note**: PMC is migrating to an AWS-based Cloud Service in August 2026. FTP paths may change; check the official docs for updates.

## FTP Access Points

### Connection

```bash
# FTP (classic)
ftp ftp.ncbi.nlm.nih.gov
# Navigate to: /pub/pmc

# HTTPS alternative (recommended)
# Base: https://ftp.ncbi.nlm.nih.gov/pub/pmc/
```

### Available Datasets

| Dataset | Path | Content | Format |
|---------|------|---------|--------|
| **OA Commercial** | `/pub/pmc/oa_comm/` | CC BY/CC0 articles (commercial use OK) | .tar.gz packages |
| **OA Non-Commercial** | `/pub/pmc/oa_noncomm/` | CC BY-NC articles | .tar.gz packages |
| **OA Other** | `/pub/pmc/oa_other/` | Other open licenses | .tar.gz packages |
| **Author Manuscripts** | `/pub/pmc/manuscript/` | NIH-funded manuscripts | .tar.gz packages |
| **Historical OCR** | `/pub/pmc/historical_ocr/` | Pre-digital scanned articles | .tar.gz |
| **File lists** | `/pub/pmc/oa_file_list.csv` | Index of all OA articles | CSV |

### File List Index

Download the master index to plan your downloads:

```bash
# Download the OA file list (CSV, ~200MB)
wget https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_file_list.csv

# CSV columns:
# File, Article Citation, AccessionID, LastUpdated, PMID, License
```
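
Once the index is on disk, loading it once and keying it by accession ID makes later lookups cheap. The following is a minimal sketch, assuming the column names listed above match the CSV header exactly (worth checking against the real file); `load_file_index` is a hypothetical helper and the sample rows are illustrative, not real entries.

```python
import csv
import io


def load_file_index(csv_file, id_column="AccessionID"):
    """Map accession ID -> row dict, reading from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return {row[id_column]: row for row in reader if row.get(id_column)}


# Tiny in-memory stand-in for oa_file_list.csv (illustrative rows only)
sample = io.StringIO(
    "File,Article Citation,AccessionID,LastUpdated,PMID,License\n"
    "oa_package/00/00/PMC0000001.tar.gz,Example J. 2020,PMC0000001,2020-01-01,1,CC BY\n"
    "oa_package/00/01/PMC0000002.tar.gz,Example J. 2021,PMC0000002,2021-01-01,2,CC BY-NC\n"
)
index = load_file_index(sample)
print(index["PMC0000001"]["File"])
```

For the real file, pass `open("oa_file_list.csv", newline="", encoding="utf-8")` instead of the in-memory sample.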

## Download Strategies

### Strategy 1: Download Specific Articles

```python
import csv
import io
import tarfile

import requests


def find_package_path(pmcid: str, file_list_path: str = "oa_file_list.csv") -> str:
    """Look up an article's package path in a local copy of the file list."""
    # (In practice, you'd load this once and index by PMCID)
    with open(file_list_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Column names follow the list above; check them against the CSV header
            if row.get("AccessionID") == pmcid:
                return row["File"]
    raise KeyError(f"{pmcid} not found in {file_list_path}")


def download_article_package(pmcid: str,
                             base_url: str = "https://ftp.ncbi.nlm.nih.gov/pub/pmc",
                             file_list_path: str = "oa_file_list.csv"):
    """Download and extract a specific PMC article package."""
    # First, look up the file path from the file list
    file_path = find_package_path(pmcid, file_list_path)

    # Download the tar.gz package
    resp = requests.get(f"{base_url}/{file_path}", timeout=60)
    resp.raise_for_status()

    # Extract
    with tarfile.open(fileobj=io.BytesIO(resp.content), mode="r:gz") as tar:
        tar.extractall(path=f"./articles/{pmcid}")
    print(f"Extracted {pmcid}")
```

### Strategy 2: Bulk Download by License

```bash
#!/bin/bash
# Download all commercial-use articles (CC BY / CC0)
# WARNING: This is ~100GB+ compressed

mkdir -p pmc_corpus/commercial
cd pmc_corpus/commercial

# Download the baseline (all current articles)
wget -r -np -nH --cut-dirs=3 \
  https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_comm/xml/

# Incremental updates (run periodically)
wget -r -np -nH --cut-dirs=3 -N \
  https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_comm/xml/
```

### Strategy 3: Filtered Download via File List

```python
import csv
from pathlib import Path

import requests


def download_filtered_corpus(file_list_path: str, output_dir: str,
                             license_filter: str = "CC BY",
                             max_articles: int = 1000):
    """Download articles matching a license filter."""
    output = Path(output_dir)
    output.mkdir(parents=True, exist_ok=True)
    base = "https://ftp.ncbi.nlm.nih.gov/pub/pmc"
    downloaded = 0

    with open(file_list_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            # Substring match: "CC BY" also matches "CC BY-NC" and similar
            if license_filter and license_filter not in row.get("License", ""):
                continue
            if downloaded >= max_articles:
                break

            file_path = row["File"]
            url = f"{base}/{file_path}"
            local_path = output / Path(file_path).name

            # Skip packages that are already on disk
            if local_path.exists():
                continue

            # Stream to disk in chunks instead of holding the package in memory
            with requests.get(url, stream=True, timeout=60) as resp:
                if resp.status_code != 200:
                    continue
                with open(local_path, "wb") as out:
                    for chunk in resp.iter_content(chunk_size=1 << 20):
                        out.write(chunk)
            downloaded += 1
            if downloaded % 100 == 0:
                print(f"Downloaded {downloaded} articles...")

    print(f"Total downloaded: {downloaded}")
```

## PMC ID Cross-Referencing

Convert between different article identifiers:

```bash
# PMID → PMCID → DOI conversion
curl "https://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/?ids=29346600&format=json"

# Batch conversion (up to 200 IDs)
curl "https://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/?ids=29346600,30266829,31048553&format=json"
```
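
For longer ID lists, the batch limit means splitting the input into several requests. The sketch below only builds the request URLs, using the endpoint and the 200-ID limit stated above; `build_idconv_urls` is a hypothetical helper name, and the small `batch_size` in the demo is only there to show the splitting.

```python
from urllib.parse import urlencode

IDCONV_URL = "https://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/"


def build_idconv_urls(ids, batch_size=200):
    """Yield one ID Converter URL per batch of at most batch_size IDs."""
    ids = [str(i) for i in ids]
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        # safe="," keeps the comma-separated list readable, as in the curl examples
        query = urlencode({"ids": ",".join(batch), "format": "json"}, safe=",")
        yield f"{IDCONV_URL}?{query}"


urls = list(build_idconv_urls(["29346600", "30266829", "31048553"], batch_size=2))
for url in urls:
    print(url)
```

Each URL can then be fetched with `curl` or `requests.get`, one batch at a time.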

## Package Contents

Each article package (.tar.gz) typically contains:

```
PMC1234567/
├── PMC1234567.xml      # Full text in JATS XML
├── PMC1234567.pdf      # PDF (if available)
├── figure1.jpg         # Figures
├── figure2.jpg
├── table1.html         # Tables (sometimes)
└── supplement1.pdf     # Supplementary materials
```
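
After extraction, the JATS XML is usually the file worth parsing first. The following is a rough sketch using only the standard library: `find_article_xml` and `article_title` are hypothetical helpers, the `.nxml` suffix is an assumption about how some packages name the XML file, and the sample document is a hand-written fragment rather than a real article.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_article_xml(package_dir):
    """Return the first .xml or .nxml file under an extracted package, or None."""
    root = Path(package_dir)
    candidates = sorted(p for p in root.rglob("*") if p.suffix in {".xml", ".nxml"})
    return candidates[0] if candidates else None


def article_title(xml_text):
    """Extract the article title text from a JATS XML string, or None."""
    root = ET.fromstring(xml_text)
    title = root.find(".//article-title")
    if title is None:
        return None
    # itertext() also collects text inside inline markup such as <italic>
    return "".join(title.itertext()).strip()


sample = """<article><front><article-meta><title-group>
<article-title>An <italic>example</italic> title</article-title>
</title-group></article-meta></front></article>"""
print(article_title(sample))
```

Real files may declare a DTD or use named entities that `ElementTree` does not resolve, so a more tolerant parser may be needed for a full corpus.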

## Best Practices

- **Start with the file list**: Download `oa_file_list.csv` first and filter locally
- **Respect rate limits**: Space requests 0.3s apart for individual downloads
- **Use incremental updates**: After the initial download, use wget's `-N` flag to fetch only new or updated files
- **Check licenses**: OA Commercial (CC BY / CC0) permits commercial use; Non-Commercial (CC BY-NC) restricts commercial applications
- **Storage planning**: Full OA Subset is ~500GB+ uncompressed
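
The request spacing suggested above can be enforced with a small generator wrapped around any list of URLs. This is one possible sketch, not an NCBI-provided utility; `throttled` is a hypothetical name, and the `sleep` and `clock` parameters exist only so the behaviour can be checked without real waiting.

```python
import time


def throttled(items, min_interval=0.3, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than one per min_interval seconds."""
    last = None
    for item in items:
        if last is not None:
            # Wait only for the part of the interval that has not yet elapsed
            wait = min_interval - (clock() - last)
            if wait > 0:
                sleep(wait)
        last = clock()
        yield item


# Short interval here so the demo finishes quickly
for n in throttled(range(3), min_interval=0.01):
    print(n)
```

In a download loop this becomes `for url in throttled(urls): ...`, leaving the default 0.3s spacing in place.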

## References

- [PMC FTP Documentation](https://www.ncbi.nlm.nih.gov/pmc/tools/ftp/)
- [PMC Open Access Subset](https://www.ncbi.nlm.nih.gov/pmc/tools/openftlist/)
- [PMC ID Converter API](https://www.ncbi.nlm.nih.gov/pmc/tools/id-converter-api/)
@@ -0,0 +1,166 @@
---
name: zotero-ai-butler-guide
description: "AI-powered paper summarization plugin for Zotero"
metadata:
  openclaw:
    emoji: "🤵"
    category: "literature"
    subcategory: "fulltext"
    keywords: ["Zotero", "AI summary", "paper summarization", "LLM", "abstract generation", "reading assistant"]
    source: "https://github.com/steven-jianhao-li/zotero-AI-Butler"
---

# Zotero AI Butler Guide

## Overview

Zotero AI Butler is a Zotero plugin that uses LLMs to summarize, analyze, and annotate academic papers directly within Zotero. It can generate structured summaries, extract key findings, compare papers, and answer questions about documents, all without leaving the reference manager. It supports multiple LLM backends (OpenAI, Claude, local models).

## Installation

```bash
# Download .xpi from GitHub releases
# Zotero 7: Tools → Add-ons → Install Add-on From File
```

## Configuration

```markdown
### LLM Backend Setup (Preferences → AI Butler)

**Option 1: OpenAI**
- Provider: OpenAI
- Model: gpt-4o
- Set environment variable for credentials

**Option 2: Anthropic**
- Provider: Anthropic
- Model: claude-sonnet-4-20250514

**Option 3: Local (Ollama)**
- Provider: Ollama
- Endpoint: http://localhost:11434
- Model: llama3.1

**Option 4: Custom API**
- Provider: Custom
- Endpoint: your-api-url
- Compatible with OpenAI API format
```
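
Before pointing the plugin at a local backend, it can be worth confirming that Ollama is running and has the model pulled. The sketch below is independent of the plugin: it queries Ollama's model-listing endpoint (`/api/tags`) at the endpoint given above, and `list_ollama_models` and `has_model` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def list_ollama_models(endpoint="http://localhost:11434"):
    """Return the model names reported by a running Ollama server."""
    with urlopen(f"{endpoint}/api/tags", timeout=5) as resp:
        return [m["name"] for m in json.load(resp).get("models", [])]


def has_model(names, model):
    """True if `model` matches a listed name, with or without a tag suffix."""
    return any(n == model or n.split(":", 1)[0] == model for n in names)


# Offline check against an example name list (no server needed)
print(has_model(["llama3.1:latest", "mistral:7b"], "llama3.1"))
```

With a server running, `has_model(list_ollama_models(), "llama3.1")` performs the same check against the live model list.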

## Features

### Paper Summarization

```markdown
### Usage
1. Select paper in Zotero
2. Right-click → AI Butler → Summarize
3. Summary added as Zotero note

### Summary Templates
- **Quick Summary** (1 paragraph): Core contribution + method + result
- **Structured Summary**: Background / Method / Results / Limitations
- **Executive Brief**: Who should read this and why
- **Technical Deep-Dive**: Detailed methodology and math
```

### Key Finding Extraction

```markdown
### Extract structured information:
- **Research question**: What problem does this paper address?
- **Methodology**: What approach do the authors use?
- **Key results**: What are the main findings?
- **Contributions**: What is novel about this work?
- **Limitations**: What are the acknowledged limitations?
- **Future work**: What directions do the authors suggest?
```

### Paper Comparison

```markdown
### Compare multiple papers:
1. Select 2+ papers in Zotero
2. Right-click → AI Butler → Compare Papers
3. Generates comparison table:
   - Shared and unique contributions
   - Methodological differences
   - Performance comparison (if applicable)
   - Complementary insights
```

### Q&A Mode

```markdown
### Ask questions about papers:
1. Open paper in Zotero reader
2. AI Butler sidebar → Ask a question
3. Answers grounded in paper content with page references

Example questions:
- "What loss function do they use?"
- "How does this compare to prior work?"
- "What are the hyperparameters?"
- "Explain equation 3 in simpler terms"
```

## Batch Processing

```markdown
### Summarize multiple papers:
1. Select papers (or entire collection)
2. Right-click → AI Butler → Batch Summarize
3. Progress bar shows completion
4. Each paper gets a summary note attached

### Reading List Generation:
1. Select collection
2. AI Butler → Generate Reading Order
3. Suggests optimal reading sequence based on:
   - Citation relationships
   - Conceptual dependencies
   - Publication chronology
```
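
How the plugin actually computes a reading order is not documented here. As an illustration of the general idea only, ordering by citation relationships with publication year as a tie-breaker can be sketched as a topological sort; `reading_order` and the paper keys below are hypothetical.

```python
import heapq


def reading_order(papers, cites):
    """Order papers so cited works come before the papers that cite them.

    papers: dict of paper key -> publication year
    cites:  dict of paper key -> set of keys it cites
    Ties are broken by publication year, then by key.
    """
    # Keep only citations that point inside the collection
    pending = {p: {c for c in cites.get(p, ()) if c in papers and c != p}
               for p in papers}
    ready = [(papers[p], p) for p, deps in pending.items() if not deps]
    heapq.heapify(ready)
    order = []
    while ready:
        _, paper = heapq.heappop(ready)
        order.append(paper)
        for other, deps in pending.items():
            if paper in deps:
                deps.remove(paper)
                if not deps:
                    heapq.heappush(ready, (papers[other], other))
    if len(order) != len(papers):
        raise ValueError("citation cycle detected")
    return order


papers = {"survey2023": 2023, "method2021": 2021, "base2019": 2019}
cites = {"survey2023": {"method2021", "base2019"}, "method2021": {"base2019"}}
print(reading_order(papers, cites))
```

Conceptual dependencies are not modelled in this sketch; they would need an additional signal, such as LLM-generated prerequisite links.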

## Custom Prompts

```markdown
### Create custom analysis prompts:
# In AI Butler preferences → Custom Prompts

Prompt: "Systematic Review Extraction"
Template: |
  Extract the following from this paper:
  1. Study design (RCT, cohort, etc.)
  2. Sample size
  3. Primary outcome
  4. Effect size with CI
  5. Risk of bias indicators
  Format as structured JSON.
```
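
Because the template asks for JSON, the reply can be checked mechanically before it goes into an extraction sheet. The sketch below assumes the model returns a single JSON object; the field names in `REQUIRED_FIELDS` are hypothetical, since the prompt does not fix key names, and the sample reply is made up for illustration.

```python
import json

# Hypothetical key names mirroring the five items in the template
REQUIRED_FIELDS = ("study_design", "sample_size", "primary_outcome",
                   "effect_size", "risk_of_bias")


def parse_extraction(raw):
    """Parse a model reply as JSON and list any expected fields that are missing."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    return data, missing


# Made-up reply used only to exercise the check
reply = '{"study_design": "RCT", "sample_size": 120, "primary_outcome": "pain score"}'
data, missing = parse_extraction(reply)
print(missing)
```

Replies with missing fields can then be flagged for manual review or re-prompting.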

## Integration with Zotero Workflow

```markdown
### Combined Plugin Workflow
1. **Zotero Connector** → Import paper
2. **Zotero Sci-Hub** → Fetch PDF
3. **AI Butler** → Generate summary note
4. **Zotero Actions Tags** → Auto-tag based on summary
5. **Notero** → Sync to Notion with summary
6. **Better BibTeX** → Export citations for writing
```

## Use Cases

1. **Rapid screening**: Quick summaries for literature triage
2. **Paper comprehension**: Ask clarifying questions
3. **Comparison studies**: Side-by-side paper analysis
4. **Data extraction**: Structured information for systematic reviews
5. **Reading preparation**: Generate briefings before journal club

## References

- [Zotero AI Butler GitHub](https://github.com/steven-jianhao-li/zotero-AI-Butler)
- [Zotero Plugin Development](https://www.zotero.org/support/dev/client_coding/plugin_development)
|
|
@@ -0,0 +1,168 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: zotero-scihub-guide
|
|
3
|
+
description: "Zotero plugin for automatic PDF retrieval from Sci-Hub"
|
|
4
|
+
metadata:
|
|
5
|
+
openclaw:
|
|
6
|
+
emoji: "🔓"
|
|
7
|
+
category: "literature"
|
|
8
|
+
subcategory: "fulltext"
|
|
9
|
+
keywords: ["Zotero", "Sci-Hub", "PDF download", "open access", "full text", "paper retrieval"]
|
|
10
|
+
source: "https://github.com/ethanwillis/zotero-scihub"
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Zotero Sci-Hub Guide
|
|
14
|
+
|
|
15
|
+
## Overview
|
|
16
|
+
|
|
17
|
+
Zotero Sci-Hub is a Zotero plugin that fetches PDFs from Sci-Hub when a paper cannot be found through standard open-access channels. It hooks into Zotero's existing PDF retrieval workflow: when the built-in retriever fails, the plugin tries Sci-Hub as a fallback. It is aimed at researchers at institutions with limited journal subscriptions.
|
|
18
|
+
|
|
19
|
+
## Installation
|
|
20
|
+
|
|
21
|
+
```bash
|
|
22
|
+
# Download the .xpi file from GitHub releases
|
|
23
|
+
# In Zotero 7: Tools → Add-ons → Install Add-on From File
|
|
24
|
+
|
|
25
|
+
# Manual installation
|
|
26
|
+
# 1. Go to https://github.com/ethanwillis/zotero-scihub/releases
|
|
27
|
+
# 2. Download zotero-scihub-*.xpi
|
|
28
|
+
# 3. In Zotero: Tools → Add-ons → gear icon → Install from file
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
## Configuration
|
|
32
|
+
|
|
33
|
+
```
|
|
34
|
+
# In Zotero: Edit → Preferences → Zotero Sci-Hub
|
|
35
|
+
|
|
36
|
+
# Settings:
|
|
37
|
+
# 1. Sci-Hub URL: Set current working mirror
|
|
38
|
+
# - The plugin ships with default URLs
|
|
39
|
+
# - Update if mirrors change
|
|
40
|
+
|
|
41
|
+
# 2. Automatic mode:
|
|
42
|
+
# - ON: Try Sci-Hub automatically after Zotero fails
|
|
43
|
+
# - OFF: Only fetch via right-click menu
|
|
44
|
+
|
|
45
|
+
# 3. DOI sources: Where to look for DOIs
|
|
46
|
+
# - Item DOI field
|
|
47
|
+
# - Item URL field
|
|
48
|
+
# - Item Extra field
|
|
49
|
+
```
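The DOI source order above amounts to a simple lookup: check each field in turn and take the first string that looks like a DOI. The following is a minimal sketch, not the plugin's own code. The dictionary keys and the regular expression are assumptions chosen for illustration.

```python
import re

# Loose DOI pattern: "10.", a 4-9 digit registrant code, "/", then a suffix.
DOI_PATTERN = re.compile(r'10\.\d{4,9}/[^\s"<>]+', re.IGNORECASE)


def find_doi(item):
    """Return the first DOI found in the item's DOI, URL, or Extra field.

    item: dict with optional string values under the (hypothetical) keys
    'DOI', 'url', and 'extra'.
    """
    for field in ("DOI", "url", "extra"):
        match = DOI_PATTERN.search(item.get(field) or "")
        if match:
            # Trailing punctuation is usually sentence debris, not part of the DOI.
            return match.group(0).rstrip(".,;)")
    return None


doi_from_url = find_doi({"url": "https://doi.org/10.1016/j.joi.2017.08.007"})
doi_from_extra = find_doi({"extra": "DOI: 10.1000/xyz123."})
```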
|
|
50
|
+
|
|
51
|
+
## Usage Workflow
|
|
52
|
+
|
|
53
|
+
```markdown
|
|
54
|
+
### Automatic PDF Retrieval
|
|
55
|
+
1. Add item to Zotero (via browser connector, DOI, or manual)
|
|
56
|
+
2. Zotero tries built-in PDF retrieval (Open Access, institutional)
|
|
57
|
+
3. If no PDF found → plugin automatically queries Sci-Hub
|
|
58
|
+
4. PDF attached to Zotero item
|
|
59
|
+
|
|
60
|
+
### Manual Retrieval
|
|
61
|
+
1. Right-click item(s) in Zotero
|
|
62
|
+
2. Select "Fetch PDF from Sci-Hub"
|
|
63
|
+
3. Works for single items or batch selection
|
|
64
|
+
|
|
65
|
+
### Bulk Retrieval
|
|
66
|
+
1. Select multiple items (Ctrl+A for all)
|
|
67
|
+
2. Right-click → "Fetch PDF from Sci-Hub"
|
|
68
|
+
3. Plugin processes items sequentially with rate limiting
|
|
69
|
+
```
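Sequential processing with rate limiting can be sketched as a loop that enforces a minimum gap between request starts. The interval value and all names below are illustrative assumptions. The source does not state the plugin's actual delay.

```python
import time


def process_sequentially(items, fetch, min_interval=2.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Call fetch(item) one item at a time, starting calls at least
    min_interval seconds apart. Items must be hashable.

    Failures are recorded instead of aborting the batch.
    """
    results = {}
    last_start = None
    for item in items:
        if last_start is not None:
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        try:
            results[item] = fetch(item)
        except Exception as exc:  # keep going; store the error for this item
            results[item] = exc
    return results


# Demonstration with a fake clock so nothing actually sleeps.
now = [0.0]
sleeps = []


def fake_sleep(seconds):
    sleeps.append(seconds)
    now[0] += seconds


out = process_sequentially(
    ["a", "b", "c"], str.upper,
    min_interval=2.0, sleep=fake_sleep, clock=lambda: now[0],
)
```

Injecting `sleep` and `clock` keeps the function testable without real delays.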
|
|
70
|
+
|
|
71
|
+
## Integration with Other Plugins
|
|
72
|
+
|
|
73
|
+
```markdown
|
|
74
|
+
### Recommended Plugin Stack
|
|
75
|
+
1. **Zotero Connector** — Browser extension for importing items
|
|
76
|
+
2. **Zotero Sci-Hub** — PDF fallback retrieval
|
|
77
|
+
3. **ZotFile/ZotMoov** — Organize downloaded PDFs
|
|
78
|
+
4. **Zotero Better BibTeX** — Citation key management
|
|
79
|
+
5. **Zotero PDF Translate** — Translate retrieved papers
|
|
80
|
+
|
|
81
|
+
### Workflow
|
|
82
|
+
Import item → Auto-fetch PDF → Organize files → Read & annotate
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
## Troubleshooting
|
|
86
|
+
|
|
87
|
+
```markdown
|
|
88
|
+
### Common Issues
|
|
89
|
+
|
|
90
|
+
**PDF not found:**
|
|
91
|
+
- Check if DOI is present in item metadata
|
|
92
|
+
- Try updating the Sci-Hub mirror URL
|
|
93
|
+
- Some very recent papers may not be available yet
|
|
94
|
+
|
|
95
|
+
**Connection errors:**
|
|
96
|
+
- Current mirror may be down; try alternate URL
|
|
97
|
+
- Check network/proxy settings
|
|
98
|
+
- Some institutions block Sci-Hub domains
|
|
99
|
+
|
|
100
|
+
**Duplicate PDFs:**
|
|
101
|
+
- Disable automatic mode if using other PDF fetchers
|
|
102
|
+
- Check Zotero's duplicate detection settings
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
## Programmatic Alternative
|
|
106
|
+
|
|
107
|
+
```python
# For building custom PDF retrieval pipelines
import requests

TIMEOUT = 30  # seconds; requests applies no timeout unless one is given


def _get(url, **kwargs):
    """GET with a timeout; return None on network errors or non-2xx status."""
    try:
        resp = requests.get(url, timeout=TIMEOUT, **kwargs)
    except requests.RequestException:
        return None
    return resp if resp.ok else None


def _save_pdf(url, output_path):
    """Download url and write it to output_path only if it looks like a PDF."""
    pdf = _get(url)
    # Landing pages can return HTML with status 200, so check the PDF header.
    if pdf is None or b"%PDF-" not in pdf.content[:1024]:
        return False
    with open(output_path, "wb") as f:
        f.write(pdf.content)
    return True


def fetch_paper_by_doi(doi, output_path, email="your@email.com"):
    """Attempt to fetch paper PDF via DOI resolution."""
    # Try Unpaywall first (legal open access)
    resp = _get(f"https://api.unpaywall.org/v2/{doi}", params={"email": email})
    if resp is not None:
        data = resp.json()
        if data.get("is_oa") and data.get("best_oa_location"):
            pdf_url = data["best_oa_location"].get("url_for_pdf")
            if pdf_url and _save_pdf(pdf_url, output_path):
                return True

    # Try CORE API
    resp = _get("https://api.core.ac.uk/v3/search/works",
                params={"q": f"doi:{doi}"})
    if resp is not None:
        results = resp.json().get("results", [])
        if results and results[0].get("downloadUrl"):
            if _save_pdf(results[0]["downloadUrl"], output_path):
                return True

    return False
```
|
|
142
|
+
|
|
143
|
+
## Legal Considerations
|
|
144
|
+
|
|
145
|
+
```markdown
|
|
146
|
+
### Open Access Alternatives
|
|
147
|
+
Before using Sci-Hub, check these legal sources:
|
|
148
|
+
1. **Unpaywall** — Browser extension for legal OA versions
|
|
149
|
+
2. **CORE** — Aggregator of OA research papers
|
|
150
|
+
3. **PubMed Central** — Free biomedical literature archive
|
|
151
|
+
4. **arXiv/bioRxiv** — Preprint servers
|
|
152
|
+
5. **Author websites** — Many post preprints freely
|
|
153
|
+
6. **Interlibrary Loan** — Request through your library
|
|
154
|
+
7. **Email the author** — Most researchers share on request
|
|
155
|
+
```
|
|
156
|
+
|
|
157
|
+
## Use Cases
|
|
158
|
+
|
|
159
|
+
1. **PDF retrieval**: Automatic paper downloading for Zotero
|
|
160
|
+
2. **Literature collection**: Build reading libraries efficiently
|
|
161
|
+
3. **Systematic reviews**: Bulk-fetch papers for review pipelines
|
|
162
|
+
4. **Research onboarding**: Quickly gather papers for new topics
|
|
163
|
+
|
|
164
|
+
## References
|
|
165
|
+
|
|
166
|
+
- [Zotero Sci-Hub GitHub](https://github.com/ethanwillis/zotero-scihub)
|
|
167
|
+
- [Unpaywall](https://unpaywall.org/) — Legal OA alternative
|
|
168
|
+
- [CORE](https://core.ac.uk/) — OA aggregator
|
|
@@ -0,0 +1,164 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: bibliometrix-guide
|
|
3
|
+
description: "Perform science mapping and bibliometric analysis with R bibliometrix"
|
|
4
|
+
metadata:
|
|
5
|
+
openclaw:
|
|
6
|
+
emoji: "📉"
|
|
7
|
+
category: "literature"
|
|
8
|
+
subcategory: "metadata"
|
|
9
|
+
keywords: ["bibliometrix", "bibliometrics", "science mapping", "R", "citation analysis", "research trends"]
|
|
10
|
+
source: "https://github.com/massimoaria/bibliometrix"
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Bibliometrix Guide
|
|
14
|
+
|
|
15
|
+
## Overview
|
|
16
|
+
|
|
17
|
+
Bibliometrix is an R package for science mapping and bibliometric analysis. It imports data from Scopus, Web of Science, PubMed, and other databases, then performs co-citation analysis, keyword co-occurrence mapping, collaboration network analysis, and thematic evolution tracking, among others. It also includes Biblioshiny, a Shiny-based web interface for no-code analysis.
|
|
18
|
+
|
|
19
|
+
## Installation
|
|
20
|
+
|
|
21
|
+
```r
|
|
22
|
+
install.packages("bibliometrix")
|
|
23
|
+
|
|
24
|
+
# Or development version
|
|
25
|
+
devtools::install_github("massimoaria/bibliometrix")
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
## Quick Start
|
|
29
|
+
|
|
30
|
+
### Import Data
|
|
31
|
+
|
|
32
|
+
```r
|
|
33
|
+
library(bibliometrix)
|
|
34
|
+
|
|
35
|
+
# From Scopus CSV export
|
|
36
|
+
M <- convert2df("scopus_export.csv", dbsource = "scopus", format = "csv")
|
|
37
|
+
|
|
38
|
+
# From Web of Science
|
|
39
|
+
M <- convert2df("wos_export.txt", dbsource = "wos", format = "plaintext")
|
|
40
|
+
|
|
41
|
+
# From PubMed
|
|
42
|
+
M <- convert2df("pubmed_export.txt", dbsource = "pubmed", format = "pubmed")
|
|
43
|
+
|
|
44
|
+
# From multiple files
|
|
45
|
+
file_list <- c("data1.csv", "data2.csv")
|
|
46
|
+
M <- convert2df(file_list, dbsource = "scopus", format = "csv")
|
|
47
|
+
```
|
|
48
|
+
|
|
49
|
+
### Descriptive Analysis
|
|
50
|
+
|
|
51
|
+
```r
|
|
52
|
+
# Basic bibliometric summary
|
|
53
|
+
results <- biblioAnalysis(M)
|
|
54
|
+
summary(results, k = 10) # Top 10 in each category
|
|
55
|
+
|
|
56
|
+
# Key metrics produced:
|
|
57
|
+
# - Publication trends over time
|
|
58
|
+
# - Most productive authors
|
|
59
|
+
# - Most cited papers
|
|
60
|
+
# - Top journals/sources
|
|
61
|
+
# - Country/affiliation rankings
|
|
62
|
+
# - Keyword frequency
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
### Citation Analysis
|
|
66
|
+
|
|
67
|
+
```r
|
|
68
|
+
# Most cited documents
|
|
69
|
+
CR <- citations(M, field = "article", sep = ";")
|
|
70
|
+
head(CR$Cited, 20)
|
|
71
|
+
|
|
72
|
+
# Most cited first authors
|
|
73
|
+
CR_auth <- citations(M, field = "author", sep = ";")
|
|
74
|
+
|
|
75
|
+
# Local citations (within the dataset)
|
|
76
|
+
LC <- localCitations(M)
|
|
77
|
+
head(LC$Papers, 10)
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
### Network Analysis
|
|
81
|
+
|
|
82
|
+
```r
|
|
83
|
+
# Co-citation network
|
|
84
|
+
NetMatrix <- biblioNetwork(M, analysis = "co-citation",
|
|
85
|
+
network = "references", sep = ";")
|
|
86
|
+
net <- networkPlot(NetMatrix, n = 30, type = "fruchterman",
|
|
87
|
+
Title = "Co-citation Network")
|
|
88
|
+
|
|
89
|
+
# Author collaboration network
|
|
90
|
+
NetMatrix <- biblioNetwork(M, analysis = "collaboration",
|
|
91
|
+
network = "authors", sep = ";")
|
|
92
|
+
net <- networkPlot(NetMatrix, n = 50, type = "kamada",
|
|
93
|
+
Title = "Collaboration Network")
|
|
94
|
+
|
|
95
|
+
# Keyword co-occurrence
|
|
96
|
+
NetMatrix <- biblioNetwork(M, analysis = "co-occurrences",
|
|
97
|
+
network = "keywords", sep = ";")
|
|
98
|
+
net <- networkPlot(NetMatrix, n = 40, type = "fruchterman",
|
|
99
|
+
Title = "Keyword Co-occurrence")
|
|
100
|
+
```
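To make the idea behind a keyword co-occurrence network concrete, the sketch below counts how often two keywords appear on the same record, using the same `;` separator as the calls above. It is a language-neutral illustration written in Python, not bibliometrix's implementation. The upper-casing step is an assumption made for case-insensitive matching.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records, sep=";"):
    """Count unordered keyword pairs that appear together on a record.

    records: iterable of keyword strings such as "A; B; C".
    Returns a Counter keyed by (keyword_1, keyword_2) in sorted order.
    """
    pairs = Counter()
    for raw in records:
        # Deduplicate within a record so each pair counts once per paper.
        keywords = sorted({k.strip().upper() for k in raw.split(sep) if k.strip()})
        pairs.update(combinations(keywords, 2))
    return pairs


records = [
    "deep learning; NLP; transformers",
    "NLP; transformers",
    "deep learning; vision",
]
pairs = keyword_cooccurrence(records)
```

The pair counts are the edge weights of the network. Keywords are the nodes.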
|
|
101
|
+
|
|
102
|
+
### Thematic Analysis
|
|
103
|
+
|
|
104
|
+
```r
|
|
105
|
+
# Thematic map (strategic diagram)
|
|
106
|
+
Map <- thematicMap(M, field = "DE", n = 250, minfreq = 5)
|
|
107
|
+
plot(Map$map)
|
|
108
|
+
|
|
109
|
+
# Quadrants:
|
|
110
|
+
# Motor themes (high centrality, high density)
|
|
111
|
+
# Basic themes (high centrality, low density)
|
|
112
|
+
# Niche themes (low centrality, high density)
|
|
113
|
+
# Emerging/declining themes (low centrality, low density)
|
|
114
|
+
|
|
115
|
+
# Thematic evolution across time slices (years are the cut points between slices)
|
|
116
|
+
nexus <- thematicEvolution(M,
|
|
117
|
+
field = "DE",
|
|
118
|
+
years = c(2015, 2019, 2023),
|
|
119
|
+
n = 100, minFreq = 3)
|
|
120
|
+
plotThematicEvolution(nexus$Nodes, nexus$Edges)
|
|
121
|
+
```
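The quadrant labels listed above follow directly from two scores per theme. The sketch below assigns labels by splitting each axis at its median. That threshold is an assumption for illustration and may differ from how bibliometrix positions the axes. The theme names and scores are invented.

```python
from statistics import median

QUADRANTS = {
    (True, True): "motor",
    (True, False): "basic",
    (False, True): "niche",
    (False, False): "emerging/declining",
}


def classify_themes(themes):
    """Label each theme by its position relative to the median of each axis.

    themes: {name: (centrality, density)}
    """
    c_cut = median(c for c, _ in themes.values())
    d_cut = median(d for _, d in themes.values())
    return {
        name: QUADRANTS[(c >= c_cut, d >= d_cut)]
        for name, (c, d) in themes.items()
    }


labels = classify_themes({
    "A": (0.9, 0.8),
    "B": (0.8, 0.1),
    "C": (0.1, 0.9),
    "D": (0.2, 0.2),
})
```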
|
|
122
|
+
|
|
123
|
+
### Biblioshiny (Web Interface)
|
|
124
|
+
|
|
125
|
+
```r
|
|
126
|
+
# Launch interactive web dashboard
|
|
127
|
+
biblioshiny()
|
|
128
|
+
|
|
129
|
+
# Opens browser with GUI for:
|
|
130
|
+
# - Data import from multiple sources
|
|
131
|
+
# - Descriptive analysis
|
|
132
|
+
# - Network visualization
|
|
133
|
+
# - Thematic mapping
|
|
134
|
+
# - All plots exportable
|
|
135
|
+
```
|
|
136
|
+
|
|
137
|
+
## Supported Data Sources
|
|
138
|
+
|
|
139
|
+
| Source | Format | Import function |
|
|
140
|
+
|--------|--------|----------------|
|
|
141
|
+
| Scopus | CSV/BibTeX | `convert2df(..., dbsource="scopus")` |
|
|
142
|
+
| Web of Science | Plain text/BibTeX | `convert2df(..., dbsource="wos")` |
|
|
143
|
+
| PubMed | PubMed format | `convert2df(..., dbsource="pubmed")` |
|
|
144
|
+
| Dimensions | CSV | `convert2df(..., dbsource="dimensions")` |
|
|
145
|
+
| Cochrane | Plain text | `convert2df(..., dbsource="cochrane")` |
|
|
146
|
+
| OpenAlex | JSON | Via API integration |
|
|
147
|
+
|
|
148
|
+
## Key Analysis Types
|
|
149
|
+
|
|
150
|
+
| Analysis | Function | Output |
|
|
151
|
+
|----------|----------|--------|
|
|
152
|
+
| Descriptive | `biblioAnalysis()` | Summary statistics |
|
|
153
|
+
| Co-citation | `biblioNetwork(analysis="co-citation")` | Citation clusters |
|
|
154
|
+
| Collaboration | `biblioNetwork(analysis="collaboration")` | Author networks |
|
|
155
|
+
| Co-occurrence | `biblioNetwork(analysis="co-occurrences")` | Keyword maps |
|
|
156
|
+
| Thematic map | `thematicMap()` | Strategic quadrant diagram |
|
|
157
|
+
| Trend analysis | `fieldByYear()` | Topic evolution |
|
|
158
|
+
| Country collab | `metaTagExtraction() + biblioNetwork()` | Geo collaboration |
|
|
159
|
+
|
|
160
|
+
## References
|
|
161
|
+
|
|
162
|
+
- [Bibliometrix](https://www.bibliometrix.org/)
|
|
163
|
+
- [Bibliometrix GitHub](https://github.com/massimoaria/bibliometrix)
|
|
164
|
+
- Aria, M. & Cuccurullo, C. (2017). "bibliometrix: An R-tool for comprehensive science mapping analysis." *Journal of Informetrics* 11(4): 959-975.
|