paper-search-cli 0.1.1 → 0.1.3

Files changed (82)
  1. package/.env.example +18 -3
  2. package/README-sc.md +110 -25
  3. package/README.md +109 -24
  4. package/dist/cli.js +27 -3
  5. package/dist/cli.js.map +1 -1
  6. package/dist/config/ConfigService.d.ts +1 -1
  7. package/dist/config/ConfigService.d.ts.map +1 -1
  8. package/dist/config/ConfigService.js +4 -1
  9. package/dist/config/ConfigService.js.map +1 -1
  10. package/dist/config/constants.d.ts +6 -0
  11. package/dist/config/constants.d.ts.map +1 -1
  12. package/dist/config/constants.js +4 -1
  13. package/dist/config/constants.js.map +1 -1
  14. package/dist/core/diagnostics.d.ts.map +1 -1
  15. package/dist/core/diagnostics.js +34 -12
  16. package/dist/core/diagnostics.js.map +1 -1
  17. package/dist/core/handleToolCall.d.ts.map +1 -1
  18. package/dist/core/handleToolCall.js +50 -7
  19. package/dist/core/handleToolCall.js.map +1 -1
  20. package/dist/core/platformMetadata.d.ts +28 -0
  21. package/dist/core/platformMetadata.d.ts.map +1 -0
  22. package/dist/core/platformMetadata.js +258 -0
  23. package/dist/core/platformMetadata.js.map +1 -0
  24. package/dist/core/schemas.d.ts +108 -74
  25. package/dist/core/schemas.d.ts.map +1 -1
  26. package/dist/core/schemas.js +29 -38
  27. package/dist/core/schemas.js.map +1 -1
  28. package/dist/core/searchers.d.ts +11 -0
  29. package/dist/core/searchers.d.ts.map +1 -1
  30. package/dist/core/searchers.js +17 -1
  31. package/dist/core/searchers.js.map +1 -1
  32. package/dist/core/tools.d.ts.map +1 -1
  33. package/dist/core/tools.js +57 -30
  34. package/dist/core/tools.js.map +1 -1
  35. package/dist/platforms/ACMSearcher.d.ts +16 -0
  36. package/dist/platforms/ACMSearcher.d.ts.map +1 -0
  37. package/dist/platforms/ACMSearcher.js +125 -0
  38. package/dist/platforms/ACMSearcher.js.map +1 -0
  39. package/dist/platforms/ArxivSearcher.d.ts.map +1 -1
  40. package/dist/platforms/ArxivSearcher.js +10 -2
  41. package/dist/platforms/ArxivSearcher.js.map +1 -1
  42. package/dist/platforms/DBLPSearcher.d.ts +16 -0
  43. package/dist/platforms/DBLPSearcher.d.ts.map +1 -0
  44. package/dist/platforms/DBLPSearcher.js +116 -0
  45. package/dist/platforms/DBLPSearcher.js.map +1 -0
  46. package/dist/platforms/IEEESearcher.d.ts +15 -0
  47. package/dist/platforms/IEEESearcher.d.ts.map +1 -0
  48. package/dist/platforms/IEEESearcher.js +123 -0
  49. package/dist/platforms/IEEESearcher.js.map +1 -0
  50. package/dist/platforms/OpenAIRESearcher.d.ts.map +1 -1
  51. package/dist/platforms/OpenAIRESearcher.js +9 -1
  52. package/dist/platforms/OpenAIRESearcher.js.map +1 -1
  53. package/dist/platforms/OpenReviewSearcher.d.ts +19 -0
  54. package/dist/platforms/OpenReviewSearcher.d.ts.map +1 -0
  55. package/dist/platforms/OpenReviewSearcher.js +141 -0
  56. package/dist/platforms/OpenReviewSearcher.js.map +1 -0
  57. package/dist/platforms/PaperSource.d.ts +6 -0
  58. package/dist/platforms/PaperSource.d.ts.map +1 -1
  59. package/dist/platforms/PaperSource.js.map +1 -1
  60. package/dist/platforms/PubMedSearcher.d.ts.map +1 -1
  61. package/dist/platforms/PubMedSearcher.js +8 -0
  62. package/dist/platforms/PubMedSearcher.js.map +1 -1
  63. package/dist/platforms/USENIXSearcher.d.ts +14 -0
  64. package/dist/platforms/USENIXSearcher.d.ts.map +1 -0
  65. package/dist/platforms/USENIXSearcher.js +75 -0
  66. package/dist/platforms/USENIXSearcher.js.map +1 -0
  67. package/dist/services/MultiSourceSearchService.d.ts.map +1 -1
  68. package/dist/services/MultiSourceSearchService.js +8 -34
  69. package/dist/services/MultiSourceSearchService.js.map +1 -1
  70. package/dist/services/OpenAccessFallbackService.d.ts +1 -0
  71. package/dist/services/OpenAccessFallbackService.d.ts.map +1 -1
  72. package/dist/services/OpenAccessFallbackService.js +2 -2
  73. package/dist/services/OpenAccessFallbackService.js.map +1 -1
  74. package/dist/utils/HttpClient.d.ts +6 -0
  75. package/dist/utils/HttpClient.d.ts.map +1 -0
  76. package/dist/utils/HttpClient.js +30 -0
  77. package/dist/utils/HttpClient.js.map +1 -0
  78. package/dist/utils/PdfDownload.d.ts.map +1 -1
  79. package/dist/utils/PdfDownload.js +106 -16
  80. package/dist/utils/PdfDownload.js.map +1 -1
  81. package/package.json +3 -1
  82. package/skills/paper-search/SKILL.md +16 -5
package/.env.example CHANGED
@@ -1,11 +1,12 @@
1
1
  # ==============================================================================
2
2
  # Paper Search CLI - Environment Variables
3
3
  # ==============================================================================
4
- # Supports 20 academic sources/platforms with a unified CLI interface
4
+ # Supports 25 academic sources/platforms with a unified CLI interface
5
5
  # Platforms: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv,
6
6
  # bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, Web of Science,
7
- # Google Scholar, IACR ePrint, Sci-Hub, ScienceDirect, Springer, Wiley,
8
- # Scopus, Unpaywall
7
+ # Google Scholar, DBLP, ACM metadata, USENIX metadata, OpenReview,
8
+ # IACR ePrint, Sci-Hub, IEEE Xplore, ScienceDirect, Springer/SpringerLink,
9
+ # Wiley, Scopus, Unpaywall
9
10
  #
10
11
  # For global installs, prefer:
11
12
  # paper-search config set KEY VALUE
@@ -26,6 +27,15 @@
26
27
  WOS_API_KEY=your_web_of_science_api_key_here
27
28
  WOS_API_VERSION=v1
28
29
 
30
+ # ------------------------------------------------------------------------------
31
+ # IEEE Xplore Metadata API - REQUIRED for IEEE search
32
+ # ------------------------------------------------------------------------------
33
+ # Official metadata search API for IEEE Xplore
34
+ # Get API key: https://developer.ieee.org/
35
+ # Documentation: https://developer.ieee.org/docs/read/Searching_the_IEEE_Xplore_Metadata_API
36
+ # Pricing/access: Requires IEEE API access; entitlements may vary
37
+ IEEE_API_KEY=your_ieee_api_key_here
38
+
29
39
  # ------------------------------------------------------------------------------
30
40
  # PubMed/NCBI E-utilities - OPTIONAL (but recommended)
31
41
  # ------------------------------------------------------------------------------
@@ -140,9 +150,14 @@ WILEY_TDM_TOKEN=your_wiley_tdm_token_here
140
150
  # Semantic Scholar | 🟡 Optional| AI-powered search, citations | CS, AI research
141
151
  # CORE | 🟡 Optional| Repository PDFs | Open repositories
142
152
  # OpenAIRE | 🟡 Optional| Repository discovery | Open repositories
153
+ # DBLP | ✅ Free | CS bibliography metadata | CS literature
154
+ # ACM metadata | ✅ Free | Crossref DOI-prefix metadata | ACM papers
155
+ # USENIX metadata | ✅ Free | DBLP-backed proceedings data | Systems/security
156
+ # OpenReview | ✅ Free | Notes/search metadata | ML conferences
143
157
  # Unpaywall | 📧 Email | DOI OA resolution | Fallback downloads
144
158
  # IACR ePrint | ✅ Free | Cryptography papers, PDF | Cryptography
145
159
  # Sci-Hub | ✅ Free | Universal DOI access | Any paper by DOI
160
+ # IEEE Xplore | 🔴 Required| IEEE metadata | Engineering
146
161
  # ScienceDirect | 🔴 Required| Elsevier journals | Full-text search
147
162
  # Springer | 🔴 Required| Dual API (Meta+OpenAccess) | Books & journals
148
163
  # Wiley | 🔴 Required| TDM API, full text mining | Data mining
package/README-sc.md CHANGED
@@ -9,8 +9,8 @@ Paper Search CLI 是一个独立的 Node.js 命令行工具,用于跨多个学
9
9
  ![Node.js](https://img.shields.io/badge/node.js->=18.0.0-green.svg)
10
10
  ![TypeScript](https://img.shields.io/badge/typescript-^5.5.3-blue.svg)
11
11
  ![License](https://img.shields.io/badge/license-MIT-blue.svg)
12
- ![Platforms](https://img.shields.io/badge/platforms-20-brightgreen.svg)
13
- ![Version](https://img.shields.io/badge/version-0.1.1-blue.svg)
12
+ ![Platforms](https://img.shields.io/badge/platforms-25-brightgreen.svg)
13
+ ![Version](https://img.shields.io/badge/version-0.1.2-blue.svg)
14
14
  [![LinuxDo](https://img.shields.io/badge/LinuxDo-community-1f6feb)](https://linux.do)
15
15
 
16
16
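  Thanks to the sincere, friendly, united, and professional [LinuxDo](https://linux.do) community. This project's CLI + Skill direction and its paper-search workflow improvements were inspired by community discussion and open-source sharing.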
@@ -27,13 +27,13 @@ Paper Search CLI 是一个独立的 Node.js 命令行工具,用于跨多个学
27
27
 
28
28
  ## Key Features
29
29
 
30
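- - **20 academic sources/platforms**: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv, bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, Web of Science, Google Scholar, IACR ePrint, Sci-Hub, ScienceDirect, Springer Nature, Wiley, Scopus, Unpaywall.
30
+ - **25 academic sources/platforms**: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv, bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, DBLP, ACM Digital Library metadata, USENIX metadata, OpenReview, Web of Science, Google Scholar, IACR ePrint, Sci-Hub, IEEE Xplore, ScienceDirect, Springer Nature/SpringerLink, Wiley, Scopus, Unpaywall.
31
31
  - **Single command entry point**: after installation, invoke it as `paper-search`; suitable for terminals, scripts, and agents.
32
32
  - **JSON-first output**: stdout emits JSON by default; stderr is reserved for human-readable logs and errors.
33
33
  - **Unified paper data model**: normalizes title, authors, DOI, source, date, abstract, PDF link, citation count, and platform-specific extension fields.
34
34
  - **Multi-source search and deduplication**: select sources with `--sources crossref,openalex,pmc`, or use `platform=all` to try every registered search source, then merge duplicates by DOI and title+author.
35
35
  - **Semantic Scholar body-snippet search**: `search_semantic_snippets` searches body-text snippets in the Semantic Scholar Open Access snippet index, which is useful for finding methodological details in papers. This feature requires `SEMANTIC_SCHOLAR_API_KEY`.
36
- - **Open-access-first download chain**: `download_with_fallback` first tries native download, PDF URLs in the results, PMC/Europe PMC/CORE/OpenAIRE, and Unpaywall DOI resolution; Sci-Hub is used as the last resort only when explicitly enabled.
36
+ - **Funnel-style fallback download chain**: `download_with_fallback` first tries native download, PDF URLs in the results, PMC/Europe PMC/CORE/OpenAIRE, and Unpaywall DOI resolution, and finally falls back to Sci-Hub by default unless `useSciHub=false` is passed.
37
37
  - **Rate limiting and retries**: built-in per-platform rate limiting and handling of retryable API errors.
38
38
  - **PDF download support**: supports paths such as arXiv, bioRxiv, medRxiv, Semantic Scholar, IACR, Sci-Hub, Springer open access, and Wiley DOI download.
39
39
  - **Agent-friendly**: `tools`, `status`, `search`, `download`, and `run` cover both simple searches and precise tool calls.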
@@ -51,6 +51,7 @@ paper-search search "machine learning" --platform crossref --max-results 3 --pre
51
51
  ```
52
52
 
53
53
  After installation, run `paper-search setup` to write optional API keys and emails into the user-level config.
54
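+ For the Unpaywall and Crossref email prompts, you can press Enter to skip; the CLI automatically writes a Gmail-format address with a random prefix. If you want to use your own email, override it later with `paper-search config set`.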
54
55
 
55
56
  If you need a local development build, or want to verify changes that have not been released yet, install from source:
56
57
 
@@ -72,34 +73,90 @@ paper-search config doctor --pretty
72
73
 
73
74
  ## Supported Platforms
74
75
 
76
+ ### Platform Families
77
+
78
+ The capability tables below remain the source of truth for platform capabilities. To choose a search source quickly, start from these families:
79
+
80
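+ | Family | Platforms | Best For |
81
+ | --- | --- | --- |
82
+ | General search | Crossref, OpenAlex, Semantic Scholar, Google Scholar | Broad discovery, DOI metadata, citation clues, first-pass literature screening |
83
+ | Medicine / life sciences | PubMed, PubMed Central, Europe PMC | Clinical, biomedical, public health, biomedical metadata, and open full text |
84
+ | Preprints / conference papers | arXiv, bioRxiv, medRxiv, OpenReview, IACR ePrint | Cross-disciplinary preprints, life-science/medical preprints, AI/ML submissions, and cryptography ePrints |
85
+ | Computer science / engineering | DBLP, ACM Digital Library metadata, IEEE Xplore, USENIX | CS bibliography, engineering databases, systems/security conference papers |
86
+ | Open full text / repositories | CORE, OpenAIRE, Unpaywall | Cross-disciplinary repository discovery and open-access PDF fallback routes |
87
+ | Citation indexes / publishers | Web of Science, Scopus, ScienceDirect, Springer Nature/SpringerLink, Wiley | Institution-entitled metadata, citation databases, publisher records and downloads |
88
+ | DOI-targeted retrieval | Sci-Hub | DOI-targeted retrieval, and the final automatic fallback of the PDF download funnel unless `useSciHub=false` is passed |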
89
+
90
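+ Some platforms span several practical workflows. For example, Semantic Scholar suits broad search and is also commonly used for CS/AI; arXiv covers computer science, mathematics, physics, and some quantitative disciplines. Platforms are grouped here by primary usage; CS searches usually combine the "computer science / engineering" and "preprints / conference papers" groups.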
91
+
92
+ ### Capability Matrix
93
+
94
+ #### General Search
95
+
75
96
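  | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
76
97
  | --- | --- | --- | --- | --- | --- | --- |
77
98
  | Crossref | ✅ | ❌ | ❌ | ✅ | ❌ | Default search platform, broad metadata coverage |
78
99
  | OpenAlex | ✅ | 🟡 Conditional | ❌ | ✅ | ❌ | Broad free metadata; usable for fallback downloads when records include open links |
79
- | arXiv | ✅ | ✅ | ✅ | ❌ | ❌ | Physics, CS, math, and related preprints |
80
- | Web of Science | ✅ | ❌ | ❌ | ✅ | ✅ Required | Citation database, date sorting, year ranges |
100
+ | Semantic Scholar | ✅ | ✅ | ✅ Body snippets | | 🟡 Optional* | AI semantic search + OA body snippets |
101
+ | Google Scholar | ✅ | ❌ | ❌ | ✅ | ❌ | Broad academic discovery, based on page parsing |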
102
+
103
+ #### Medicine / Life Sciences
104
+
105
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
106
+ | --- | --- | --- | --- | --- | --- | --- |
81
107
  | PubMed | ✅ | ❌ | ❌ | ❌ | 🟡 Optional | Biomedical literature through NCBI E-utilities |
82
108
  | PubMed Central | ✅ | ✅ | ✅ | ❌ | ❌ | Open biomedical full text and PMC PDFs |
83
109
  | Europe PMC | ✅ | ✅ | ✅ | ❌ | ❌ | Biomedical metadata and open full-text links |
84
- | Google Scholar | ✅ | ❌ | ❌ | ✅ | ❌ | Broad academic discovery, based on page parsing |
85
- | bioRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Biology preprints |
86
- | medRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Medical preprints |
87
- | Semantic Scholar | ✅ | ✅ | ✅ Body snippets | | 🟡 Optional* | AI semantic search + OA body snippets |
110
+
111
+ #### Computer Science / Engineering
112
+
113
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
114
+ | --- | --- | --- | --- | --- | --- | --- |
115
+ | DBLP | ✅ | ❌ | ❌ | ❌ | ❌ | Computer science bibliography search through the official DBLP search API |
116
+ | ACM Digital Library | ✅ | ❌ | ❌ | ✅ | ❌ | Metadata search through Crossref's ACM DOI prefix; does not scrape ACM pages |
117
+ | USENIX | ✅ | ❌ | ❌ | ❌ | ❌ | DBLP-backed USENIX conference metadata; does not scrape the USENIX search page |
118
+ | IEEE Xplore | ✅ | ❌ | ❌ | ✅ | ✅ Required | IEEE metadata search through the official IEEE Xplore Metadata API |
119
+
120
+ #### Open Full Text / Repositories
121
+
122
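+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
123
+ | --- | --- | --- | --- | --- | --- | --- |
88
124
  | CORE | ✅ | 🟡 Conditional | 🟡 Conditional | ❌ | 🟡 Optional | Downloadable when records include PDF or full-text links |
89
125
  | OpenAIRE | ✅ | 🟡 Conditional | ❌ | ❌ | 🟡 Optional | Usable for fallback downloads when records include open links |
90
126
  | Unpaywall | 🟡 Conditional | 🟡 Conditional | ❌ | ❌ | ✅ Required | DOI lookup only; requires an email; downloadable when an OA PDF is found |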
127
+
128
+ #### Preprints / Conference Papers
129
+
130
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
131
+ | --- | --- | --- | --- | --- | --- | --- |
132
+ | arXiv | ✅ | ✅ | ✅ | ❌ | ❌ | Physics, CS, math, and related preprints |
133
+ | bioRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Biology preprints |
134
+ | medRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Medical preprints |
135
+ | OpenReview | ✅ | ❌ | ❌ | ❌ | ❌ | Searches conference submissions, reviews, and preprints through public OpenReview notes search |
91
136
  | IACR ePrint | ✅ | ✅ | ✅ | ❌ | ❌ | Cryptography papers |
92
- | Sci-Hub | ✅ | ✅ | ❌ | ❌ | ❌ | DOI-based lookup and download |
137
+
138
+ #### Citation Indexes / Publishers
139
+
140
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
141
+ | --- | --- | --- | --- | --- | --- | --- |
142
+ | Web of Science | ✅ | ❌ | ❌ | ✅ | ✅ Required | Citation database, date sorting, year ranges |
93
143
  | ScienceDirect | ✅ | ❌ | ❌ | ✅ | ✅ Required | Elsevier metadata and abstracts |
94
- | Springer Nature | ✅ | 🟡 Conditional | ❌ | ❌ | ✅ Required | Open-access records can be downloaded; the metadata API requires a key |
144
+ | Springer Nature / SpringerLink | ✅ | 🟡 Conditional | ❌ | ❌ | ✅ Required | `springerlink` is an alias for the existing Springer Nature integration |
95
145
  | Wiley | ❌ Keyword search | ✅ | ✅ | ❌ | ✅ Required | TDM API; DOI-based PDF download only |
96
146
  | Scopus | ✅ | ❌ | ❌ | ✅ | ✅ Required | Abstract and citation database |
97
147
 
148
+ #### DOI-Targeted Retrieval
149
+
150
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
151
+ | --- | --- | --- | --- | --- | --- | --- |
152
+ | Sci-Hub | ✅ | ✅ | ❌ | ❌ | ❌ | DOI-based lookup and download |
153
+
98
154
  Notes:
99
155
 
100
156
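  - In capability columns, `✅` means directly supported, `❌` means unsupported, and `🟡 Conditional` means available only when conditions are met, for example when the record includes a PDF/open-access link, when lookup is DOI-only, or when only open-access records can be downloaded.
101
157
  - In the API Key column, `❌` means no configuration is needed, `🟡 Optional` means the platform works without it but with weaker quotas or stability, and `✅ Required` means it must be configured only when that platform is enabled; it does not mean every new user has to configure it by default. Unpaywall requires an email, not a traditional API key.
102
158
  - The Wiley TDM API does not support keyword search. First use `search_crossref` to find the DOI of a Wiley article, then download it with `download_paper` and `platform=wiley`.
159
+ - ACM and USENIX search deliberately go through metadata backends and do not scrape the platforms' search pages, in order to respect robots.txt and reduce the risk of IP bans.
103
160
  - `platform=all` tries every registered search source, but excludes platforms such as Wiley that only support DOI download and cannot do keyword search. Sources that have no key configured, time out, or fail are recorded in `failed_sources` / `errors`, and the remaining sources still return results.
104
161
  - `--sources` accepts comma-separated sources, for example `--sources crossref,openalex,pmc`.
105
162
  - For Semantic Scholar, `🟡 Optional*` means: optional for regular search; `search_semantic_snippets` body-snippet search requires `SEMANTIC_SCHOLAR_API_KEY`.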
@@ -111,7 +168,7 @@ paper-search config doctor --pretty
111
168
  ```bash
112
169
  paper-search setup
113
170
  paper-search config set SEMANTIC_SCHOLAR_API_KEY your_semantic_scholar_api_key_here
114
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com
171
+ paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com # Optional: manually override the email auto-generated by setup
115
172
  paper-search config list --pretty
116
173
  paper-search config doctor --pretty
117
174
  paper-search diagnostics --pretty
@@ -127,6 +184,8 @@ paper-search diagnostics --pretty
127
184
 
128
185
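  `paper-search setup` is a guided configuration command. By default it only asks for the recommended settings: Semantic Scholar, Unpaywall email, Crossref email, and CORE. Use `paper-search setup --all` to walk through every supported setting; use `paper-search setup --keys SEMANTIC_SCHOLAR_API_KEY,CORE_API_KEY` to configure only specific ones.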
129
186
 
187
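+ To lower the cost of first-time setup, if `PAPER_SEARCH_UNPAYWALL_EMAIL` / `UNPAYWALL_EMAIL` / `CROSSREF_MAILTO` is not yet configured, pressing Enter during setup automatically writes a Gmail-format address with a random prefix, for example `paper.search.xxxxxx@gmail.com`, so that basic Unpaywall and Crossref requests work right away.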
188
+
130
189
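  `paper-search diagnostics --pretty` lists every capability that depends on an API key or email, the related config keys, whether each is currently configured, common failure causes, and suggested troubleshooting steps. When a search command gets 0 results from a key-backed platform, or hits a 401, 403, 400, or 429, the JSON output also includes a `diagnostic` field.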
131
190
 
132
191
  ### Recommended API Key Strategy
@@ -136,11 +195,12 @@ paper-search diagnostics --pretty
136
195
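  | Tier | Config key | Recommended for new users? | Notes |
137
196
  | --- | --- | --- | --- |
138
197
  | Recommended by default | `SEMANTIC_SCHOLAR_API_KEY` | Recommended | Enables Semantic Scholar body-snippet search, which suits searches for methodological details, and also improves request stability. |
139
- | Recommended by default | `PAPER_SEARCH_UNPAYWALL_EMAIL` or `UNPAYWALL_EMAIL` | Recommended | Finds open-access PDFs by DOI; only an email is needed, no API key application |
140
- | Recommended by default | `CROSSREF_MAILTO` | Recommended | Puts Crossref requests into the polite pool; suited to long-running or high-frequency searching. |
198
+ | Recommended by default | `PAPER_SEARCH_UNPAYWALL_EMAIL` or `UNPAYWALL_EMAIL` | Recommended | Finds open-access PDFs by DOI; only an email is needed, no API key application. Pressing Enter in `setup` auto-generates a random Gmail-format address, which you can also replace manually with your own email. |
199
+ | Recommended by default | `CROSSREF_MAILTO` | Recommended | Puts Crossref requests into the polite pool; suited to long-running or high-frequency searching. Pressing Enter in `setup` reuses the auto-generated address, which you can also replace manually with your own email. |
141
200
  | Recommended by default | `CORE_API_KEY` or `PAPER_SEARCH_CORE_API_KEY` | Recommended | Anonymous CORE access is easily rate-limited; with a key configured it is better suited to open-repository search. |
142
201
  | High-volume biomedical | `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | Recommended if you use PubMed often | Raises NCBI E-utilities limits and attaches explicit client information to requests. |
143
202
  | Institutional entitlement | `WOS_API_KEY` | Configure only if you have Web of Science API access | Used for Web of Science search and citation data; requires Clarivate API access. |
203
+ | Institutional entitlement | `IEEE_API_KEY` | Configure only if you have IEEE Xplore API access | Used for IEEE Xplore metadata search; IEEE may require API access registration and product entitlements. |
144
204
  | Institutional entitlement | `ELSEVIER_API_KEY` | Configure only if you have Scopus or ScienceDirect API access | A single Elsevier key does not automatically grant access to both products; Scopus and ScienceDirect must be enabled separately. |
145
205
  | Institutional entitlement | `SPRINGER_API_KEY`, `SPRINGER_OPENACCESS_API_KEY` | Configure only if you need the Springer platform | Used for Springer metadata and open-access records; a 401 usually means the key is invalid or the product entitlement is not enabled. |
146
206
  | Institutional entitlement | `WILEY_TDM_TOKEN` | Configure only if you have Wiley TDM/institutional full-text access | DOI download only; whether a download succeeds depends on the token and the institutional subscription. |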
@@ -172,6 +232,9 @@ cp .env.example .env
172
232
  WOS_API_KEY=your_web_of_science_api_key_here
173
233
  WOS_API_VERSION=v1
174
234
 
235
+ # IEEE Xplore, required for IEEE metadata search
236
+ IEEE_API_KEY=your_ieee_api_key_here
237
+
175
238
  # PubMed, optional; raises the limit from 3 requests/sec to 10 requests/sec
176
239
  PUBMED_API_KEY=your_ncbi_api_key_here
177
240
  NCBI_EMAIL=you@example.com
@@ -190,10 +253,10 @@ SPRINGER_OPENACCESS_API_KEY=your_openaccess_api_key_here
190
253
  # Wiley TDM, required for Wiley DOI downloads
191
254
  WILEY_TDM_TOKEN=your_wiley_tdm_token_here
192
255
 
193
- # Crossref polite pool, optional but recommended
256
+ # Crossref polite pool, optional but recommended; pressing Enter in setup auto-generates/reuses a random Gmail-format address
194
257
  CROSSREF_MAILTO=you@example.com
195
258
 
196
- # Unpaywall, required for DOI open-access resolution
259
+ # Unpaywall, required for DOI open-access resolution; pressing Enter in setup auto-generates a random Gmail-format address
197
260
  PAPER_SEARCH_UNPAYWALL_EMAIL=you@example.com
198
261
  UNPAYWALL_EMAIL=you@example.com
199
262
 
@@ -209,6 +272,7 @@ OPENAIRE_API_KEY=your_openaire_api_key_here
209
272
  ### Where to Get API Keys
210
273
 
211
274
  - Web of Science: [Clarivate Developer Portal](https://developer.clarivate.com/apis)
275
+ - IEEE Xplore: [IEEE Xplore Metadata API](https://developer.ieee.org/docs/read/Searching_the_IEEE_Xplore_Metadata_API)
212
276
  - PubMed: [NCBI API Keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/)
213
277
  - Semantic Scholar: [Semantic Scholar API](https://www.semanticscholar.org/product/api)
214
278
  - Elsevier: [Elsevier Developer Portal](https://dev.elsevier.com/apikey/manage)
@@ -355,7 +419,7 @@ paper-search diagnostics --pretty
355
419
  ```bash
356
420
  paper-search config init --pretty
357
421
  paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key --pretty
358
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com --pretty
422
+ paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com --pretty # Optional: manually override the email auto-generated by setup
359
423
  paper-search config import-env .env --pretty
360
424
  paper-search config list --pretty
361
425
  paper-search config doctor --pretty
@@ -396,8 +460,9 @@ paper-search run search_papers --json-args '{"query":"machine learning","platfor
396
460
 
397
461
  ```text
398
462
  crossref, arxiv, webofscience, wos, pubmed, biorxiv, medrxiv, semantic,
399
- iacr, googlescholar, scholar, scihub, sciencedirect, springer, scopus,
400
- openalex, unpaywall, pmc, europepmc, core, openaire, all
463
+ iacr, googlescholar, scholar, scihub, ieee, sciencedirect, springer,
464
+ springerlink, scopus, openalex, unpaywall, pmc, europepmc, core,
465
+ openaire, dblp, acm, usenix, openreview, all
401
466
  ```
402
467
 
403
468
  Multi-source search uses `sources`:
@@ -445,6 +510,24 @@ paper-search run search_openaire --arg query="machine learning" --arg maxResults
445
510
 
446
511
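  Unpaywall supports DOI lookup only and requires a configured email. Anonymous CORE access may quickly return empty results or get rate-limited; for long-term use, configure an API key.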
447
512
 
513
+ ### Registry-Driven Platform Search
514
+
515
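+ These metadata-oriented search tools are generated from the platform registry; adding a new platform later only requires a new searcher and its platform registration entry: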
516
+
517
+ ```bash
518
+ paper-search run search_dblp --arg query="graph neural networks" --arg maxResults=5 --pretty
519
+ paper-search run search_acm --arg query="software testing" --arg maxResults=5 --pretty
520
+ paper-search run search_usenix --arg query="file systems" --arg maxResults=5 --pretty
521
+ paper-search run search_openreview --arg query="large language models" --arg maxResults=5 --pretty
522
+ paper-search run search_springerlink --arg query="machine learning" --arg maxResults=5 --pretty
523
+ ```
524
+
525
+ `search_ieee` uses the same generic parameters but requires `IEEE_API_KEY` to be configured:
526
+
527
+ ```bash
528
+ paper-search run search_ieee --arg query="wireless networks" --arg maxResults=5 --arg articleTitle="wireless" --pretty
529
+ ```
530
+
448
531
  ### `search_webofscience`
449
532
 
450
533
  Searches Web of Science. Requires `WOS_API_KEY`.
@@ -547,29 +630,31 @@ paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform
547
630
 
548
631
  ### `download_paper`
549
632
 
550
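- Downloads a PDF from a supported platform
633
+ Downloads a PDF from the specified platform. If that platform has no native downloader, or the native download fails, the call enters the same download funnel as `download_with_fallback`.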
551
634
 
552
635
  ```bash
553
636
  paper-search run download_paper --arg paperId="2301.00001" --arg platform=arxiv --arg savePath=./downloads --pretty
554
637
  ```
555
638
 
556
- Platforms that support download:
639
+ Platforms with native download:
557
640
 
558
641
  ```text
559
642
  arxiv, biorxiv, medrxiv, semantic, iacr, scihub, springer, wiley,
560
643
  pmc, europepmc, core
561
644
  ```
562
645
 
646
+ Other registered sources, such as `crossref`, `openalex`, `dblp`, `acm`, `usenix`, and `openreview`, can also be passed to `download_paper`; they go straight into the metadata/repository/Unpaywall/Sci-Hub fallback funnel.
647
+
563
648
  ### `download_with_fallback`
564
649
 
565
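- Tries downloads in open-access-first order:
650
+ Tries downloads through the full download funnel. The order is native download, metadata PDF URL, repository discovery, Unpaywall DOI resolution, and finally Sci-Hub as the default last resort: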
566
651
 
567
652
  ```bash
568
653
  paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
569
- paper-search run download_with_fallback --arg source=crossref --arg paperId="10.1038/nature12373" --arg doi="10.1038/nature12373" --arg savePath=./downloads --arg useSciHub=false --pretty
654
+ paper-search run download_with_fallback --arg source=crossref --arg paperId="10.1038/nature12373" --arg doi="10.1038/nature12373" --arg savePath=./downloads --pretty
570
655
  ```
571
656
 
572
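- `useSciHub` defaults to `false`; set it to `true` only when you explicitly choose this last-resort path.
657
+ `useSciHub` defaults to `true`; set it to `false` only when you need to disable this last-resort path. `download_paper` also enters the same funnel when the download from the specified platform fails or the platform does not support direct download.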
573
658
 
574
659
  ### `search_wiley`
575
660
 
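Note on the `download_with_fallback` hunks above: the README change flips the `useSciHub` default from `false` to `true`, so callers that relied on the old default now reach Sci-Hub unless they opt out. The sketch below only illustrates that ordering and default. It is not taken from the package source: `runFunnel`, `Stage`, `FunnelOptions`, and the stage names are hypothetical, and the stages are synchronous stubs rather than real downloaders.

```typescript
// Hypothetical sketch of the documented funnel order; no name here comes from the package.
type Attempt = { stage: string; ok: boolean };

interface FunnelOptions {
  // Mirrors the documented new default: Sci-Hub is tried unless explicitly disabled.
  useSciHub?: boolean;
}

// A stage returns a local file path on success, or null when it cannot help.
type Stage = { name: string; run: () => string | null };

function runFunnel(
  stages: Stage[],
  sciHub: Stage,
  options: FunnelOptions = {},
): { path: string | null; attempts: Attempt[] } {
  const useSciHub = options.useSciHub ?? true;
  const ordered = useSciHub ? [...stages, sciHub] : stages;
  const attempts: Attempt[] = [];
  for (const stage of ordered) {
    const path = stage.run();
    attempts.push({ stage: stage.name, ok: path !== null });
    if (path !== null) return { path, attempts }; // stop at the first success
  }
  return { path: null, attempts };
}

// Stub stages in the documented order; every open-access route fails in this demo.
const openStages: Stage[] = [
  { name: "native", run: () => null },
  { name: "metadata-pdf-url", run: () => null },
  { name: "repositories", run: () => null },
  { name: "unpaywall", run: () => null },
];
const sciHubStage: Stage = { name: "scihub", run: () => "./downloads/paper.pdf" };

const withDefault = runFunnel(openStages, sciHubStage);
const optedOut = runFunnel(openStages, sciHubStage, { useSciHub: false });
console.log(withDefault.attempts.map((a) => a.stage).join(" > "));
console.log(optedOut.path);
```

With the stubbed stages, the default call walks all five stages and ends at the Sci-Hub stub, while the opted-out call stops after the Unpaywall stage and returns no path.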
package/README.md CHANGED
@@ -9,8 +9,8 @@ It keeps the broad platform coverage, unified paper model, and detailed capabili
9
9
  ![Node.js](https://img.shields.io/badge/node.js->=18.0.0-green.svg)
10
10
  ![TypeScript](https://img.shields.io/badge/typescript-^5.5.3-blue.svg)
11
11
  ![License](https://img.shields.io/badge/license-MIT-blue.svg)
12
- ![Platforms](https://img.shields.io/badge/platforms-20-brightgreen.svg)
13
- ![Version](https://img.shields.io/badge/version-0.1.1-blue.svg)
12
+ ![Platforms](https://img.shields.io/badge/platforms-25-brightgreen.svg)
13
+ ![Version](https://img.shields.io/badge/version-0.1.2-blue.svg)
14
14
  [![LinuxDo](https://img.shields.io/badge/LinuxDo-community-1f6feb)](https://linux.do)
15
15
 
16
16
  Thanks to the sincere, friendly, collaborative, and professional [LinuxDo](https://linux.do) community. The CLI + Skill direction and the paper-search workflow refinements in this project were shaped by LinuxDo discussions and open-source sharing.
@@ -27,13 +27,13 @@ Thanks to the sincere, friendly, collaborative, and professional [LinuxDo](https
27
27
 
28
28
  ## Key Features
29
29
 
30
- - **20 academic sources/platforms**: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv, bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, Web of Science, Google Scholar, IACR ePrint, Sci-Hub, ScienceDirect, Springer Nature, Wiley, Scopus, and Unpaywall.
30
+ - **25 academic sources/platforms**: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv, bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, DBLP, ACM Digital Library metadata, USENIX metadata, OpenReview, Web of Science, Google Scholar, IACR ePrint, Sci-Hub, IEEE Xplore, ScienceDirect, Springer Nature/SpringerLink, Wiley, Scopus, and Unpaywall.
31
31
  - **Single command interface**: install once, then call `paper-search` from terminal, scripts, or agents.
32
32
  - **JSON-first output**: stdout is machine-readable JSON by default; stderr is reserved for human-readable diagnostics.
33
33
  - **Unified paper model**: normalized title, authors, DOI, source, dates, abstract, PDF URL, citation count, and provider-specific metadata where available.
34
34
  - **Multi-source search with dedupe**: query selected sources with `--sources crossref,openalex,pmc`, or use `platform=all` to try every registered search source, then merge duplicates by DOI and title/author keys.
35
35
  - **Semantic Scholar body-snippet search**: `search_semantic_snippets` searches Semantic Scholar's Open Access snippet index for body-text snippets, which is useful for finding methodological details. It requires `SEMANTIC_SCHOLAR_API_KEY`.
36
- - **Open-access-first fallback download**: `download_with_fallback` tries native source download, discovered PDF URLs, PMC/Europe PMC/CORE/OpenAIRE, Unpaywall DOI resolution, then optional Sci-Hub only when explicitly enabled.
36
+ - **Funnel-style fallback download**: `download_with_fallback` tries native source download, discovered PDF URLs, PMC/Europe PMC/CORE/OpenAIRE, Unpaywall DOI resolution, then Sci-Hub as the final fallback unless `useSciHub=false`.
37
37
  - **Rate limits and retry logic**: platform-specific rate limiting and retryable API error handling.
38
38
  - **PDF download support**: download from supported sources such as arXiv, bioRxiv, medRxiv, Semantic Scholar, IACR, Sci-Hub, Springer open access, and Wiley DOI-based access.
39
39
  - **Agent-friendly commands**: `tools`, `status`, `search`, `download`, and `run` cover both simple use and precise advanced calls.
@@ -51,6 +51,7 @@ paper-search search "machine learning" --platform crossref --max-results 3 --pre
51
51
  ```
52
52
 
53
53
  Run `paper-search setup` after installation to write optional API keys and emails into the user config.
54
+ For the Unpaywall and Crossref email prompts, you can press Enter and the CLI will write a random Gmail-format address automatically; use `paper-search config set` later if you want to replace it with your own email.
54
55
 
55
56
  For local development, or to test changes that have not been released yet, install from source:
56
57
 
@@ -72,34 +73,90 @@ paper-search config doctor --pretty
72
73
 
73
74
  ## Supported Platforms
74
75
 
76
+ ### Platform Families
77
+
78
+ The table below remains the source-of-truth for capabilities. For choosing a source quickly, use these broad families:
79
+
80
+ | Family | Platforms | Best For |
81
+ | --- | --- | --- |
82
+ | General scholarly metadata | Crossref, OpenAlex, Semantic Scholar, Google Scholar | Broad discovery, DOI metadata, citation clues, first-pass literature search |
83
+ | Medicine / life sciences | PubMed, PubMed Central, Europe PMC | Clinical, biomedical, public health, biomedical metadata, and open full text |
84
+ | Preprints / conference papers | arXiv, bioRxiv, medRxiv, OpenReview, IACR ePrint | Cross-disciplinary preprints, life-science/medical preprints, AI/ML submissions, and cryptography ePrints |
85
+ | Computer science / engineering | DBLP, ACM Digital Library metadata, IEEE Xplore, USENIX | CS bibliography, engineering databases, systems/security proceedings |
86
+ | Open full text / repositories | CORE, OpenAIRE, Unpaywall | Cross-disciplinary repository discovery and open-access PDF fallback routes |
87
+ | Citation indexes / publishers | Web of Science, Scopus, ScienceDirect, Springer Nature/SpringerLink, Wiley | Institution-backed metadata, citation databases, publisher-specific records and downloads |
88
+ | DOI-targeted lookup | Sci-Hub | DOI-based retrieval and the final automatic PDF fallback unless `useSciHub=false` |
89
+
90
+ Some platforms belong to more than one practical workflow. For example, Semantic Scholar is useful for broad discovery and CS/AI, while arXiv covers CS, math, physics, and quantitative fields. These groups reflect the primary way a platform is used; CS searches often combine "computer science / engineering" with "preprints / conference papers."
91
+
92
+ ### Capability Matrix
93
+
94
+ #### General Scholarly Metadata
95
+
75
96
  | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
76
97
  | --- | --- | --- | --- | --- | --- | --- |
77
98
  | Crossref | ✅ | ❌ | ❌ | ✅ | ❌ | Default search platform, broad metadata coverage |
78
99
  | OpenAlex | ✅ | 🟡 Conditional | ❌ | ✅ | ❌ | Broad free metadata; can feed fallback downloads when records include OA links |
79
- | arXiv | ✅ | ✅ | ✅ | ❌ | ❌ | Physics, CS, math, and related preprints |
80
- | Web of Science | ✅ | ❌ | ❌ | ✅ | ✅ Required | Citation database, date sorting, year ranges |
100
+ | Semantic Scholar | ✅ | ✅ | ✅ Body snippets | | 🟡 Optional* | AI semantic search + OA body snippets |
101
+ | Google Scholar | ✅ | ❌ | ❌ | ✅ | ❌ | Broad academic discovery, scrape-based |
102
+
103
+ #### Medicine / Life Sciences
104
+
105
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
106
+ | --- | --- | --- | --- | --- | --- | --- |
81
107
  | PubMed | ✅ | ❌ | ❌ | ❌ | 🟡 Optional | Biomedical literature through NCBI E-utilities |
82
108
  | PubMed Central | ✅ | ✅ | ✅ | ❌ | ❌ | Open biomedical full text and PMC PDFs |
83
109
  | Europe PMC | ✅ | ✅ | ✅ | ❌ | ❌ | Biomedical metadata plus open full-text links |
84
- | Google Scholar | ✅ | ❌ | ❌ | ✅ | ❌ | Broad academic discovery, scrape-based |
110
+
111
+ #### Computer Science / Engineering
112
+
113
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
114
+ | --- | --- | --- | --- | --- | --- | --- |
115
+ | DBLP | ✅ | ❌ | ❌ | ❌ | ❌ | Computer science bibliography through the official DBLP search API |
116
+ | ACM Digital Library | ✅ | ❌ | ❌ | ✅ | ❌ | ACM DOI-prefix metadata through Crossref; no ACM scraping |
117
+ | USENIX | ✅ | ❌ | ❌ | ❌ | ❌ | DBLP-backed USENIX proceedings metadata; no USENIX search-page scraping |
118
+ | IEEE Xplore | ✅ | ❌ | ❌ | ✅ | ✅ Required | IEEE metadata through the official IEEE Xplore Metadata API |
119
+
120
+ #### Preprints / Conference Papers
121
+
122
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
123
+ | --- | --- | --- | --- | --- | --- | --- |
124
+ | arXiv | ✅ | ✅ | ✅ | ❌ | ❌ | Physics, CS, math, and related preprints |
85
125
  | bioRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Biology preprints |
86
126
  | medRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Medical preprints |
87
- | Semantic Scholar | ✅ | ✅ | ✅ Body snippets | | 🟡 Optional* | AI semantic search + OA body snippets |
127
+ | OpenReview | ✅ | ❌ | ❌ | ❌ | ❌ | Conference submissions, reviews, and preprints through public OpenReview notes search |
128
+ | IACR ePrint | ✅ | ✅ | ✅ | ❌ | ❌ | Cryptography papers |
129
+
130
+ #### Open Full Text / Repositories
131
+
132
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
133
+ | --- | --- | --- | --- | --- | --- | --- |
88
134
  | CORE | ✅ | 🟡 Conditional | 🟡 Conditional | ❌ | 🟡 Optional | Downloads work when records include PDF or full-text links |
89
135
  | OpenAIRE | ✅ | 🟡 Conditional | ❌ | ❌ | 🟡 Optional | Can feed fallback downloads when records include open links |
90
136
  | Unpaywall | 🟡 Conditional | 🟡 Conditional | ❌ | ❌ | ✅ Required | DOI-only lookup; requires an email; downloads work when an OA PDF is found |
91
- | IACR ePrint | ✅ | ✅ | ✅ | ❌ | ❌ | Cryptography papers |
92
- | Sci-Hub | | ✅ | ❌ | ❌ | ❌ | DOI-based paper lookup and PDF retrieval |
137
+
138
+ #### Citation Indexes / Publishers
139
+
140
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
141
+ | --- | --- | --- | --- | --- | --- | --- |
142
+ | Web of Science | ✅ | ❌ | ❌ | ✅ | ✅ Required | Citation database, date sorting, year ranges |
93
143
  | ScienceDirect | ✅ | ❌ | ❌ | ✅ | ✅ Required | Elsevier metadata and abstracts |
94
- | Springer Nature | ✅ | 🟡 Conditional | ❌ | ❌ | ✅ Required | Open-access records can be downloaded; metadata API requires a key |
144
+ | Springer Nature / SpringerLink | ✅ | 🟡 Conditional | ❌ | ❌ | ✅ Required | `springerlink` is an alias for the existing Springer Nature integration |
95
145
  | Wiley | ❌ Keyword search | ✅ | ✅ | ❌ | ✅ Required | TDM API, DOI-based PDF download only |
96
146
  | Scopus | ✅ | ❌ | ❌ | ✅ | ✅ Required | Abstract and citation database |
97
147
 
148
+ #### DOI-Targeted Lookup
149
+
150
+ | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
151
+ | --- | --- | --- | --- | --- | --- | --- |
152
+ | Sci-Hub | ✅ | ✅ | ❌ | ❌ | ❌ | DOI-based paper lookup and PDF retrieval |
153
+
98
154
  Notes:
99
155
 
100
156
  - In capability columns, `✅` means directly supported, `❌` means unsupported, and `🟡 Conditional` means support depends on record content or provider constraints, such as DOI-only lookup, available PDF/OA links, or open-access-only downloads.
101
157
  - In the API Key column, `❌` means no configuration is needed, `🟡 Optional` means configuration improves limits or stability, and `✅ Required` means the key is required only when you use that platform, not that every new installation should configure it. Unpaywall requires an email rather than a traditional API key.
102
158
  - Wiley does not support keyword search through the Wiley TDM API. Use `search_crossref` to find Wiley articles and then use `download_paper` with `platform=wiley` and the DOI.
159
+ - ACM and USENIX search intentionally use metadata-backed routes rather than crawling provider search pages, which keeps the integration compatible with robots.txt and reduces IP-blocking risk.
103
160
  - `platform=all` tries every registered search source except DOI-download-only providers such as Wiley. Sources without configured credentials, sources that time out, and sources that fail are recorded in `failed_sources` / `errors` while the remaining sources continue.
104
161
  - `--sources` accepts a comma-separated source list, for example `--sources crossref,openalex,pmc`.
105
162
  - `🟡 Optional*` for Semantic Scholar means optional for regular search; `search_semantic_snippets` body-snippet search requires `SEMANTIC_SCHOLAR_API_KEY`.
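
As a rough illustration of the `platform=all` note above, the sketch below shows one way a multi-source search could record failures while the remaining sources continue. It is not the package's implementation: `NamedSearcher`, `searchAll`, and the result shape are hypothetical, and only the `failed_sources` / `errors` field names come from the note. A timeout would surface here as a rejected promise and be recorded in the same way.

```typescript
// Hypothetical types for illustration; not taken from paper-search-cli's source.
interface NamedSearcher {
  name: string;
  search: (query: string) => Promise<string[]>;
}

interface AggregateResult {
  results: string[];
  failed_sources: string[];
  errors: Record<string, string>;
}

// Query every source concurrently. A source that throws (missing credentials,
// timeout, provider error) is recorded instead of aborting the whole search.
async function searchAll(
  sources: NamedSearcher[],
  query: string,
): Promise<AggregateResult> {
  const out: AggregateResult = { results: [], failed_sources: [], errors: {} };
  await Promise.all(
    sources.map(async (source) => {
      try {
        const hits = await source.search(query);
        out.results.push(...hits);
      } catch (err) {
        out.failed_sources.push(source.name);
        out.errors[source.name] = err instanceof Error ? err.message : String(err);
      }
    }),
  );
  return out;
}
```

Catching per source, rather than around the whole batch, is what lets one failing provider leave the other results intact.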
@@ -111,7 +168,7 @@ Most free metadata sources work without configuration. For API keys and emails,
111
168
  ```bash
112
169
  paper-search setup
113
170
  paper-search config set SEMANTIC_SCHOLAR_API_KEY your_semantic_scholar_api_key_here
114
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com
171
+ paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com # optional: replace the setup-generated email
115
172
  paper-search config list --pretty
116
173
  paper-search config doctor --pretty
117
174
  paper-search diagnostics --pretty
@@ -127,6 +184,8 @@ The file is written with `0600` permissions. `config list` and `config doctor` m
127
184
 
128
185
  `paper-search setup` is the guided setup command. By default it asks for the recommended credentials only: Semantic Scholar, Unpaywall email, Crossref email, and CORE. Use `paper-search setup --all` to walk through every supported configuration key, or `paper-search setup --keys SEMANTIC_SCHOLAR_API_KEY,CORE_API_KEY` to configure a specific subset.
129
186
 
187
+ To reduce first-run friction, if `PAPER_SEARCH_UNPAYWALL_EMAIL` / `UNPAYWALL_EMAIL` / `CROSSREF_MAILTO` are not configured, pressing Enter during setup writes a random Gmail-format address such as `paper.search.xxxxxx@gmail.com`, so basic Unpaywall and Crossref requests can run immediately.
188
+
130
189
  `paper-search diagnostics --pretty` lists every API-key or email-backed capability, the related config keys, whether the required keys are configured, common failure modes, and suggested next checks. Search commands also add a `diagnostic` field when a key-backed platform returns zero results or an auth/permission/rate-limit error.
131
190
 
132
191
  ### API Key Recommendation
@@ -136,11 +195,12 @@ The file is written with `0600` permissions. `config list` and `config doctor` m
136
195
  | Level | Config keys | Recommended for new users | Notes |
137
196
  | --- | --- | --- | --- |
138
197
  | Default recommended | `SEMANTIC_SCHOLAR_API_KEY` | Yes | Enables Semantic Scholar body-snippet search for methodology details and improves request stability. |
139
- | Default recommended | `PAPER_SEARCH_UNPAYWALL_EMAIL` or `UNPAYWALL_EMAIL` | Yes | Finds open-access PDFs from DOI records; this only needs an email, not an API key. |
140
- | Default recommended | `CROSSREF_MAILTO` | Yes | Puts Crossref requests in the polite pool, which is better for long-running or frequent searches. |
198
+ | Default recommended | `PAPER_SEARCH_UNPAYWALL_EMAIL` or `UNPAYWALL_EMAIL` | Yes | Finds open-access PDFs from DOI records; this only needs an email, not an API key. Press Enter in `setup` to generate a random Gmail-format email, or replace it manually. |
199
+ | Default recommended | `CROSSREF_MAILTO` | Yes | Puts Crossref requests in the polite pool, which is better for long-running or frequent searches. Press Enter in `setup` to reuse the generated email, or replace it manually. |
141
200
  | Default recommended | `CORE_API_KEY` or `PAPER_SEARCH_CORE_API_KEY` | Yes | CORE anonymous access is often rate-limited; a key makes open repository search more reliable. |
142
201
  | Biomedical-heavy use | `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | Recommended if you use PubMed heavily | Raises NCBI E-utilities limits and identifies the client. |
143
202
  | Institution entitlement | `WOS_API_KEY` | Configure only with Web of Science API access | Enables Web of Science search and citation data; requires Clarivate API entitlement. |
203
+ | Institution entitlement | `IEEE_API_KEY` | Configure only with IEEE Xplore API access | Enables IEEE Xplore metadata search; IEEE may require registered API access and product entitlement. |
144
204
  | Institution entitlement | `ELSEVIER_API_KEY` | Configure only with Scopus or ScienceDirect API access | One Elsevier key does not automatically grant both products; Scopus and ScienceDirect need separate entitlements. |
145
205
  | Institution entitlement | `SPRINGER_API_KEY`, `SPRINGER_OPENACCESS_API_KEY` | Configure only when you need Springer | Used for Springer metadata and open-access records; 401 usually means an invalid key or missing product access. |
146
206
  | Institution entitlement | `WILEY_TDM_TOKEN` | Configure only with Wiley TDM/institutional full-text access | DOI-based download only; availability depends on the token and institutional subscription. |
@@ -172,6 +232,9 @@ cp .env.example .env
172
232
  WOS_API_KEY=your_web_of_science_api_key_here
173
233
  WOS_API_VERSION=v1
174
234
 
235
+ # IEEE Xplore, required for IEEE metadata search
236
+ IEEE_API_KEY=your_ieee_api_key_here
237
+
175
238
  # PubMed, optional; increases rate limit from 3 requests/sec to 10 requests/sec
176
239
  PUBMED_API_KEY=your_ncbi_api_key_here
177
240
  NCBI_EMAIL=you@example.com
@@ -190,10 +253,10 @@ SPRINGER_OPENACCESS_API_KEY=your_openaccess_api_key_here
190
253
  # Wiley TDM, required for Wiley DOI-based PDF download
191
254
  WILEY_TDM_TOKEN=your_wiley_tdm_token_here
192
255
 
193
- # Crossref polite pool, optional but recommended
256
+ # Crossref polite pool, optional but recommended; setup can auto-generate/reuse a random Gmail-format email
194
257
  CROSSREF_MAILTO=you@example.com
195
258
 
196
- # Unpaywall, required for DOI-based OA resolution
259
+ # Unpaywall, required for DOI-based OA resolution; setup can auto-generate a random Gmail-format email
197
260
  PAPER_SEARCH_UNPAYWALL_EMAIL=you@example.com
198
261
  UNPAYWALL_EMAIL=you@example.com
199
262
 
@@ -209,6 +272,7 @@ OPENAIRE_API_KEY=your_openaire_api_key_here
209
272
  ### API Key Sources
210
273
 
211
274
  - Web of Science: [Clarivate Developer Portal](https://developer.clarivate.com/apis)
275
+ - IEEE Xplore: [IEEE Xplore Metadata API](https://developer.ieee.org/docs/read/Searching_the_IEEE_Xplore_Metadata_API)
212
276
  - PubMed: [NCBI API Keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/)
213
277
  - Semantic Scholar: [Semantic Scholar API](https://www.semanticscholar.org/product/api)
214
278
  - Elsevier: [Elsevier Developer Portal](https://dev.elsevier.com/apikey/manage)
@@ -355,7 +419,7 @@ Manage the user-level config file.
355
419
  ```bash
356
420
  paper-search config init --pretty
357
421
  paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key --pretty
358
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com --pretty
422
+ paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com --pretty # optional: replace the setup-generated email
359
423
  paper-search config import-env .env --pretty
360
424
  paper-search config list --pretty
361
425
  paper-search config doctor --pretty
@@ -396,8 +460,9 @@ Supported platforms:
396
460
 
397
461
  ```text
398
462
  crossref, arxiv, webofscience, wos, pubmed, biorxiv, medrxiv, semantic,
399
- iacr, googlescholar, scholar, scihub, sciencedirect, springer, scopus,
400
- openalex, unpaywall, pmc, europepmc, core, openaire, all
463
+ iacr, googlescholar, scholar, scihub, ieee, sciencedirect, springer,
464
+ springerlink, scopus, openalex, unpaywall, pmc, europepmc, core,
465
+ openaire, dblp, acm, usenix, openreview, all
401
466
  ```
402
467
 
403
468
  For multi-source search, pass `sources`:
@@ -445,6 +510,24 @@ paper-search run search_openaire --arg query="machine learning" --arg maxResults
445
510
 
446
511
  Unpaywall is DOI-only and requires an email. CORE public access may return zero results or rate-limit quickly without an API key.
447
512
 
513
+ ### Registry-Backed Platform Search
514
+
515
+ These metadata-oriented tools are generated from the platform registry, so adding a platform later requires only a new searcher plus registry metadata:
516
+
517
+ ```bash
518
+ paper-search run search_dblp --arg query="graph neural networks" --arg maxResults=5 --pretty
519
+ paper-search run search_acm --arg query="software testing" --arg maxResults=5 --pretty
520
+ paper-search run search_usenix --arg query="file systems" --arg maxResults=5 --pretty
521
+ paper-search run search_openreview --arg query="large language models" --arg maxResults=5 --pretty
522
+ paper-search run search_springerlink --arg query="machine learning" --arg maxResults=5 --pretty
523
+ ```
524
+
525
+ `search_ieee` uses the same generic schema but requires `IEEE_API_KEY`:
526
+
527
+ ```bash
528
+ paper-search run search_ieee --arg query="wireless networks" --arg maxResults=5 --arg articleTitle="wireless" --pretty
529
+ ```
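
One way to picture tools being "generated from the platform registry" is a list of platform entries from which the `search_<id>` tool names are derived. The sketch below is an illustration under assumptions: `PlatformEntry`, `registry`, and `toolsFromRegistry` are hypothetical names and do not describe the package's actual `platformMetadata` module. The platform ids and the `IEEE_API_KEY` requirement are taken from this README.

```typescript
// Hypothetical registry shape; the real platform registry may look different.
interface PlatformEntry {
  id: string;             // platform id used in tool names, e.g. "dblp"
  requiredKeys: string[]; // config keys that must be set before searching
}

const registry: PlatformEntry[] = [
  { id: "dblp", requiredKeys: [] },
  { id: "acm", requiredKeys: [] },
  { id: "usenix", requiredKeys: [] },
  { id: "ieee", requiredKeys: ["IEEE_API_KEY"] },
];

interface ToolDefinition {
  name: string;
  requiredKeys: string[];
}

// Derive one search tool per registry entry, so a new platform only needs
// a registry entry (plus its searcher) to get a matching tool.
function toolsFromRegistry(entries: PlatformEntry[]): ToolDefinition[] {
  return entries.map((entry) => ({
    name: `search_${entry.id}`,
    requiredKeys: entry.requiredKeys,
  }));
}

const tools = toolsFromRegistry(registry);
```

Under this design, the tool list never has to be edited by hand; it always mirrors whatever the registry contains.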
530
+
448
531
  ### `search_webofscience`
449
532
 
450
533
  Search Web of Science. Requires `WOS_API_KEY`.
@@ -547,29 +630,31 @@ paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform
547
630
 
548
631
  ### `download_paper`
549
632
 
550
- Download PDF files from supported platforms.
633
+ Download PDF files from a platform. If the selected platform has no native downloader, or if native download fails, the command enters the same fallback funnel used by `download_with_fallback`.
551
634
 
552
635
  ```bash
553
636
  paper-search run download_paper --arg paperId="2301.00001" --arg platform=arxiv --arg savePath=./downloads --pretty
554
637
  ```
555
638
 
556
- Supported download platforms:
639
+ Native download platforms:
557
640
 
558
641
  ```text
559
642
  arxiv, biorxiv, medrxiv, semantic, iacr, scihub, springer, wiley,
560
643
  pmc, europepmc, core
561
644
  ```
562
645
 
646
+ Other registered sources, such as `crossref`, `openalex`, `dblp`, `acm`, `usenix`, or `openreview`, can still be passed to `download_paper`; they start directly at the metadata/repository/Unpaywall/Sci-Hub fallback funnel.
647
+
563
648
  ### `download_with_fallback`
564
649
 
565
- Try open-access routes before optional last-resort sources:
650
+ Try the full download funnel. The order is source-native download, metadata PDF URL, repository discovery, Unpaywall DOI resolution, then Sci-Hub as the final fallback:
566
651
 
567
652
  ```bash
568
653
  paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
569
- paper-search run download_with_fallback --arg source=crossref --arg paperId="10.1038/nature12373" --arg doi="10.1038/nature12373" --arg savePath=./downloads --arg useSciHub=false --pretty
654
+ paper-search run download_with_fallback --arg source=crossref --arg paperId="10.1038/nature12373" --arg doi="10.1038/nature12373" --arg savePath=./downloads --pretty
570
655
  ```
571
656
 
572
- `useSciHub` defaults to `false`; set it to `true` only when you explicitly choose that final fallback.
657
+ `useSciHub` defaults to `true`; set it to `false` only when you need to suppress that final fallback. `download_paper` also routes failed or unsupported platform downloads through the same funnel.
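
The funnel order described above can be sketched as an ordered list of attempts in which the first success wins. This is an illustration under assumptions, not the package's code: `DownloadStep`, `runFunnel`, and the step names are hypothetical, and the `useSciHub` parameter only mirrors the documented default of `true`.

```typescript
// Hypothetical step type: resolves to a saved file path, or null when the
// route has nothing to offer for this paper.
type DownloadStep = { name: string; run: () => Promise<string | null> };

interface FunnelResult {
  path: string | null; // saved file path, or null if every route missed
  via: string | null;  // name of the step that succeeded
  tried: string[];     // steps attempted, in order
}

// Try steps in order and stop at the first one that yields a file.
// A step that throws is treated as a miss so later routes still run.
async function runFunnel(
  steps: DownloadStep[],
  useSciHub = true,
): Promise<FunnelResult> {
  const tried: string[] = [];
  for (const step of steps) {
    if (step.name === "scihub" && !useSciHub) continue; // suppress final fallback
    tried.push(step.name);
    try {
      const path = await step.run();
      if (path !== null) return { path, via: step.name, tried };
    } catch {
      // fall through to the next route
    }
  }
  return { path: null, via: null, tried };
}
```

A caller would pass the steps in the documented order (source-native download, metadata PDF URL, repository discovery, Unpaywall, Sci-Hub) and inspect `via` to see which route produced the file.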
573
658
 
574
659
  ### `search_wiley`
575
660