paper-search-cli 0.2.0 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (72)
  1. package/.env.example +2 -6
  2. package/README.md +149 -653
  3. package/README.zh.md +270 -0
  4. package/dist/cli.js +184 -21
  5. package/dist/cli.js.map +1 -1
  6. package/dist/config/ConfigService.d.ts +1 -1
  7. package/dist/config/ConfigService.d.ts.map +1 -1
  8. package/dist/config/ConfigService.js +1 -3
  9. package/dist/config/ConfigService.js.map +1 -1
  10. package/dist/config/ResultCaps.d.ts +4 -0
  11. package/dist/config/ResultCaps.d.ts.map +1 -0
  12. package/dist/config/ResultCaps.js +10 -0
  13. package/dist/config/ResultCaps.js.map +1 -0
  14. package/dist/core/capabilityProfile.d.ts +18 -0
  15. package/dist/core/capabilityProfile.d.ts.map +1 -0
  16. package/dist/core/capabilityProfile.js +167 -0
  17. package/dist/core/capabilityProfile.js.map +1 -0
  18. package/dist/core/diagnostics.js +16 -16
  19. package/dist/core/diagnostics.js.map +1 -1
  20. package/dist/core/handleToolCall.d.ts.map +1 -1
  21. package/dist/core/handleToolCall.js +33 -0
  22. package/dist/core/handleToolCall.js.map +1 -1
  23. package/dist/core/liveSmoke.d.ts +42 -0
  24. package/dist/core/liveSmoke.d.ts.map +1 -0
  25. package/dist/core/liveSmoke.js +226 -0
  26. package/dist/core/liveSmoke.js.map +1 -0
  27. package/dist/core/platformMetadata.js +2 -2
  28. package/dist/core/platformMetadata.js.map +1 -1
  29. package/dist/core/schemas.d.ts +77 -2
  30. package/dist/core/schemas.d.ts.map +1 -1
  31. package/dist/core/schemas.js +58 -3
  32. package/dist/core/schemas.js.map +1 -1
  33. package/dist/core/textReports.d.ts +21 -0
  34. package/dist/core/textReports.d.ts.map +1 -0
  35. package/dist/core/textReports.js +85 -0
  36. package/dist/core/textReports.js.map +1 -0
  37. package/dist/core/tools.d.ts.map +1 -1
  38. package/dist/core/tools.js +60 -1
  39. package/dist/core/tools.js.map +1 -1
  40. package/dist/platforms/BioRxivSearcher.d.ts.map +1 -1
  41. package/dist/platforms/BioRxivSearcher.js +40 -21
  42. package/dist/platforms/BioRxivSearcher.js.map +1 -1
  43. package/dist/platforms/CORESearcher.d.ts.map +1 -1
  44. package/dist/platforms/CORESearcher.js +39 -9
  45. package/dist/platforms/CORESearcher.js.map +1 -1
  46. package/dist/platforms/GoogleScholarSearcher.d.ts.map +1 -1
  47. package/dist/platforms/GoogleScholarSearcher.js +3 -2
  48. package/dist/platforms/GoogleScholarSearcher.js.map +1 -1
  49. package/dist/platforms/OpenAIRESearcher.js +1 -1
  50. package/dist/platforms/OpenAIRESearcher.js.map +1 -1
  51. package/dist/services/CitationService.d.ts.map +1 -1
  52. package/dist/services/CitationService.js +8 -2
  53. package/dist/services/CitationService.js.map +1 -1
  54. package/dist/services/JournalMetricsService.js +1 -1
  55. package/dist/services/JournalMetricsService.js.map +1 -1
  56. package/dist/services/OpenAccessFallbackService.d.ts +20 -0
  57. package/dist/services/OpenAccessFallbackService.d.ts.map +1 -1
  58. package/dist/services/OpenAccessFallbackService.js +95 -72
  59. package/dist/services/OpenAccessFallbackService.js.map +1 -1
  60. package/dist/skills/SkillInstaller.d.ts +108 -0
  61. package/dist/skills/SkillInstaller.d.ts.map +1 -0
  62. package/dist/skills/SkillInstaller.js +389 -0
  63. package/dist/skills/SkillInstaller.js.map +1 -0
  64. package/dist/utils/RateLimiter.d.ts.map +1 -1
  65. package/dist/utils/RateLimiter.js +7 -0
  66. package/dist/utils/RateLimiter.js.map +1 -1
  67. package/package.json +2 -2
  68. package/skills/paper-search/SKILL.md +52 -143
  69. package/skills/paper-search/references/capability-routing.md +147 -0
  70. package/skills/paper-search/references/cli-contract.md +152 -0
  71. package/skills/paper-search/references/management-layer.md +140 -0
  72. package/README-sc.md +0 -766
package/README.md CHANGED
@@ -1,761 +1,259 @@
1
1
  # Paper Search CLI
2
2
 
3
- [中文](README-sc.md)
3
+ [简体中文](README.zh.md) | English
4
4
 
5
- Paper Search CLI is a standalone Node.js command line tool for searching, validating, and downloading academic papers from multiple scholarly sources, plus querying journal metrics through EasyScholar. It is designed for direct terminal use, automation scripts, and agent workflows that need a stable command surface with predictable JSON output.
6
-
7
- It keeps the broad platform coverage, unified paper model, and detailed capability descriptions of the earlier Paper Search implementation, but runs as a normal CLI process. There is no long-running background service to configure, start, or keep alive.
5
+ Paper Search CLI is an agent-facing Skill + CLI package built on a standalone Node.js command line tool for academic literature work. It gives AI agents, terminal users, and scripts one reproducible command layer with agent-friendly JSON output for literature metadata search, citation expansion, journal metrics lookup, PDF discovery/download, and paper body snippet search.
8
6
 
9
7
  ![Node.js](https://img.shields.io/badge/node.js->=18.0.0-green.svg)
10
8
  ![TypeScript](https://img.shields.io/badge/typescript-^5.5.3-blue.svg)
11
9
  ![License](https://img.shields.io/badge/license-MIT-blue.svg)
12
10
  ![Platforms](https://img.shields.io/badge/platforms-25-brightgreen.svg)
13
- ![Version](https://img.shields.io/badge/version-0.2.0-blue.svg)
11
+ ![Version](https://img.shields.io/badge/version-0.3.0-blue.svg)
14
12
  [![LinuxDo](https://img.shields.io/badge/LinuxDo-community-1f6feb)](https://linux.do)
15
13
 
16
- Thanks to the sincere, friendly, collaborative, and professional [LinuxDo](https://linux.do) community. The CLI + Skill direction and the paper-search workflow refinements in this project were shaped by LinuxDo discussions and open-source sharing.
14
+ [Quick Start](#quick-start) | [Architecture](#architecture) | [Configuration](#configuration) | [Agent Skill](#agent-skill) | [Supported Platforms](#supported-platforms) | [Commands](#commands) | [Troubleshooting](#troubleshooting)
17
15
 
18
- [Quick Start](#quick-start) · [Configuration](#configuration) · [Agent Skill](#agent-skill) · [Supported Platforms](#supported-platforms) · [Commands](#commands) · [Tool Reference](#tool-reference) · [Troubleshooting](#troubleshooting)
16
+ ## Core Workflows
19
17
 
20
- ## Design Goals
18
+ | Workflow | Primary commands | What it returns |
19
+ | --- | --- | --- |
20
+ | Literature metadata search | `paper-search search`, `paper-search run search_*` | Paper title, authors, year, journal, DOI, PMID/PMCID, arXiv ID, URL, abstract, and source metadata |
21
+ | Citation expansion | `paper-search run get_paper_citations`, `paper-search run get_paper_references` | Citing papers and cited references for a known Semantic Scholar paper ID, DOI, or arXiv ID |
22
+ | Journal metrics lookup | `paper-search journal-metrics`, `paper-search run query_journal_metrics` | Impact factor, 5-year IF, JCR/SSCI quartile, CAS zone, JCI, ESI, warning flags, and rank fields |
23
+ | PDF discovery and download | `paper-search download`, `paper-search run download_with_fallback` | Verified PDF download paths through native sources, open access, configured entitlement, and Sci-Hub fallback when enabled |
24
+ | Body snippet search | `paper-search run search_semantic_snippets` | Semantic Scholar Open Access body snippets for methods, parameters, and wording clues |
21
25
 
22
- - **Free-first retrieval**: prefer public metadata and open-access full-text routes before restricted or fragile sources.
23
- - **One command surface**: keep search, status, download, and precise tool calls behind the same executable.
24
- - **Agent-safe output**: produce predictable JSON that can be parsed without scraping terminal text.
25
- - **Transparent source behavior**: document which platforms provide metadata only, which can download PDFs, and which need API keys.
26
- - **No hidden background process**: each command starts, returns a result, and exits.
26
+ ## Architecture
27
27
 
28
- ## Key Features
28
+ `paper-search` is not an MCP server. It is a normal CLI that AI agents can call through the bundled Skill, while terminal users and scripts can call the same `paper-search` command directly.
29
29
 
30
- - **25 academic sources/platforms**: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv, bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, DBLP, ACM Digital Library metadata, USENIX metadata, OpenReview, Web of Science, Google Scholar, IACR ePrint, Sci-Hub, IEEE Xplore, ScienceDirect, Springer Nature/SpringerLink, Wiley, Scopus, and Unpaywall.
31
- - **EasyScholar journal metrics**: query impact factor, 5-year impact factor, JCR/SSCI quartiles, CAS zones, JCI, ESI, warning flags, and optional raw official/custom rank fields.
32
- - **Single command interface**: install once, then call `paper-search` from terminal, scripts, or agents.
33
- - **JSON-first output**: stdout is machine-readable JSON by default; stderr is reserved for human-readable diagnostics.
34
- - **Unified paper model**: normalized title, authors, DOI, source, dates, abstract, PDF URL, citation count, and provider-specific metadata where available.
35
- - **Multi-source search with dedupe**: query selected sources with `--sources crossref,openalex,pmc`, or use `platform=all` to try every registered search source, then merge duplicates by DOI and title/author keys.
36
- - **Semantic Scholar body-snippet search**: `search_semantic_snippets` searches Semantic Scholar's Open Access snippet index for body-text snippets, which is useful for finding methodological details. It requires `SEMANTIC_SCHOLAR_API_KEY`.
37
- - **Funnel-style fallback download**: `download_with_fallback` tries native source download, discovered PDF URLs, PMC/Europe PMC/CORE/OpenAIRE, Unpaywall DOI resolution, then Sci-Hub as the final fallback unless `useSciHub=false`.
38
- - **Rate limits and retry logic**: platform-specific rate limiting and retryable API error handling.
39
- - **PDF download support**: download from supported sources such as arXiv, bioRxiv, medRxiv, Semantic Scholar, IACR, Sci-Hub, Springer open access, and Wiley DOI-based access.
40
- - **Agent-friendly commands**: `tools`, `status`, `search`, `journal-metrics`, `download`, and `run` cover both simple use and precise advanced calls.
30
+ | Layer | Responsibility |
31
+ | --- | --- |
32
+ | CLI package body | Executes literature search, citation expansion, journal metrics lookup, PDF discovery/download, body snippet search, and stable JSON output |
33
+ | Bundled Skill | Ships `skills/paper-search` with agent routing rules and focused references; it does not store API keys, cookies, or account credentials |
34
+ | Friendly Management Layer | Provides `doctor`, `smoke`, `skills`, `config`, and `tools` around the five main capabilities: `metadata_search`, `citation_expansion`, `journal_metrics`, `pdf_discovery`, and `body_snippet_search`. `doctor` health reports include masked configuration, Capability Profile, platform/source status, and missing or degraded items; `smoke` checks command wiring and live readiness; `skills` syncs the bundled Skill |
41
35
 
42
- ## Quick Start
36
+ The five main capabilities are executed by the CLI package body and reported by the management layer. The Capability Profile also reports `entitled_access` so users can see whether publisher API keys, database keys, TDM tokens, or institutional entitlements are configured. Missing or degraded configuration for one workflow does not make unrelated workflows unavailable.
43
37
 
44
- ### Install
38
+ ## Quick Start
45
39
 
46
40
  Requires Node.js >= 18.0.0 and npm.
47
41
 
48
42
  ```bash
49
43
  npm install -g paper-search-cli
50
44
  paper-search setup
51
- paper-search search "machine learning" --platform crossref --max-results 3 --pretty
45
+ paper-search doctor --pretty
52
46
  ```
53
47
 
54
- Run `paper-search setup` after installation to write optional API keys and emails into the user config.
55
- For the Unpaywall and Crossref email prompts, you can press Enter and the CLI will write a random Gmail-format address automatically; use `paper-search config set` later if you want to replace it with your own email.
56
-
57
- For local development, or to test changes that have not been released yet, install from source:
48
+ Try the five main workflows:
58
49
 
59
50
  ```bash
60
- git clone git@github.com:dr-dumpling/paper-search-cli.git
61
- cd paper-search-cli
62
- npm install
63
- npm run build
64
- npm install -g .
51
+ paper-search search "machine learning clinical prediction" --platform crossref --max-results 3 --pretty
52
+ paper-search run get_paper_citations --arg doi="10.1038/nature12373" --arg limit=5 --pretty
53
+ paper-search journal-metrics "Nature" "BMJ" --pretty
54
+ paper-search download 10.48550/arxiv.1201.0490 --platform arxiv --save-path ./downloads
55
+ paper-search run search_semantic_snippets --arg query="propensity score matching" --arg limit=3 --pretty
65
56
  ```
66
57
 
67
- ### Common Checks
58
+ Useful checks:
68
59
 
69
60
  ```bash
70
- paper-search status --pretty
71
61
  paper-search tools --pretty
72
- paper-search config doctor --pretty
62
+ paper-search doctor --format text
63
+ paper-search smoke --mock --pretty
64
+ paper-search skills status --pretty
73
65
  ```
74
66
 
75
67
  ## Supported Platforms
76
68
 
77
- ### Platform Families
69
+ The quick group table helps choose a source family. The capability matrix below is the clearer source of truth for what each platform can actually do.
78
70
 
79
- The table below remains the source-of-truth for capabilities. In addition to the 25 paper search/retrieval sources, the CLI also provides EasyScholar journal metrics. EasyScholar does not participate in `platform=all` or `--sources`; call it with `journal-metrics` / `query_journal_metrics`.
71
+ ### Platform Groups
80
72
 
81
- For choosing a source or lookup tool quickly, use these broad families:
82
-
83
- | Family | Platforms | Best For |
73
+ | Family | Platforms | Main use |
84
74
  | --- | --- | --- |
85
- | General scholarly metadata | Crossref, OpenAlex, Semantic Scholar, Google Scholar | Broad discovery, DOI metadata, citation clues, first-pass literature search |
86
- | Journal metrics | EasyScholar | Impact factor, 5-year impact factor, JCR/SSCI quartiles, CAS zones, JCI, ESI, warning flags, and rank data |
87
- | Medicine / life sciences | PubMed, PubMed Central, Europe PMC | Clinical, biomedical, public health, biomedical metadata, and open full text |
88
- | Preprints / conference papers | arXiv, bioRxiv, medRxiv, OpenReview, IACR ePrint | Cross-disciplinary preprints, life-science/medical preprints, AI/ML submissions, and cryptography ePrints |
89
- | Computer science / engineering | DBLP, ACM Digital Library metadata, IEEE Xplore, USENIX | CS bibliography, engineering databases, systems/security proceedings |
90
- | Open full text / repositories | CORE, OpenAIRE, Unpaywall | Cross-disciplinary repository discovery and open-access PDF fallback routes |
91
- | Citation indexes / publishers | Web of Science, Scopus, ScienceDirect, Springer Nature/SpringerLink, Wiley | Institution-backed metadata, citation databases, publisher-specific records and downloads |
92
- | DOI-targeted lookup | Sci-Hub | DOI-based retrieval and the final automatic PDF fallback unless `useSciHub=false` |
93
-
94
- Some platforms belong to more than one practical workflow. For example, Semantic Scholar is useful for broad discovery and CS/AI, while arXiv covers CS, math, physics, and quantitative fields. These groups reflect the primary way a platform is used; CS searches often combine "computer science / engineering" with "preprints / conference papers."
75
+ | General scholarly metadata | Crossref, OpenAlex, Semantic Scholar, Google Scholar | Broad discovery, DOI metadata, citation clues, first-pass screening |
76
+ | Journal metrics | EasyScholar | Impact factor, JCR/SSCI quartile, CAS zone, JCI, ESI, warning flags |
77
+ | Biomedical and life sciences | PubMed, PubMed Central, Europe PMC | Biomedical metadata, PMID/PMCID verification, open full text |
78
+ | Preprints and conference papers | arXiv, bioRxiv, medRxiv, OpenReview, IACR ePrint | Preprints, AI/ML submissions, cryptography ePrints |
79
+ | Computer science and engineering | DBLP, ACM metadata, IEEE Xplore, USENIX | CS bibliography, engineering metadata, conference proceedings |
80
+ | Open access and repositories | CORE, OpenAIRE, Unpaywall | Repository discovery and open-access PDF fallback |
81
+ | Citation indexes and publishers | Web of Science, Scopus, ScienceDirect, Springer Nature/SpringerLink, Wiley | Institution-backed metadata, publisher records, entitled access |
82
+ | DOI-targeted fallback | Sci-Hub | DOI-based PDF fallback when enabled |
95
83
 
96
84
  ### Capability Matrix
97
85
 
98
86
  #### General Scholarly Metadata
99
87
 
100
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
88
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
101
89
  | --- | --- | --- | --- | --- | --- | --- |
102
- | Crossref | ✅ | ❌ | ❌ | ✅ | ❌ | Default search platform, broad metadata coverage |
103
- | OpenAlex | ✅ | 🟡 Conditional | ❌ | ✅ | ❌ | Broad free metadata; can feed fallback downloads when records include OA links |
104
- | Semantic Scholar | ✅ | | ✅ Body snippets | ✅ | 🟡 Optional* | AI semantic search + OA body snippets |
105
- | Google Scholar | ✅ | ❌ | ❌ | ✅ | ❌ | Broad academic discovery, scrape-based |
90
+ | Crossref | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ❌ None | Default broad metadata source |
91
+ | OpenAlex | ✅ Yes | 🟡 Conditional | ❌ No | ✅ Yes | ❌ None | Free metadata; OA links can help PDF fallback |
92
+ | Semantic Scholar | ✅ Yes | 🟡 Conditional | ✅ Body snippets | ✅ Yes | 🟡 Optional; snippets require `SEMANTIC_SCHOLAR_API_KEY` | Good for AI/CS, citation expansion, and body snippet clues |
93
+ | Google Scholar | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ❌ None | Broad discovery through page parsing |
106
94
 
107
95
  #### Journal Metrics
108
96
 
109
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
97
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
110
98
  | --- | --- | --- | --- | --- | --- | --- |
111
- | EasyScholar | Journal lookup | ❌ | ❌ | ❌ | ✅ Required | Impact factor, 5-year impact factor, JCR/SSCI quartiles, CAS zones, JCI, ESI, warning flags, and optional raw official/custom rank fields |
99
+ | EasyScholar | 🟡 Journal metrics only | ❌ No | ❌ No | ❌ No | ✅ Required `EASYSCHOLAR_KEY` | Impact factor, JCR/SSCI quartile, CAS zone, JCI, ESI, warnings, rank fields |
112
100
 
113
- #### Medicine / Life Sciences
101
+ #### Biomedical and Life Sciences
114
102
 
115
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
103
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
116
104
  | --- | --- | --- | --- | --- | --- | --- |
117
- | PubMed | ✅ | ❌ | ❌ | ❌ | 🟡 Optional | Biomedical literature through NCBI E-utilities |
118
- | PubMed Central | ✅ | ✅ | ✅ | ❌ | ❌ | Open biomedical full text and PMC PDFs |
119
- | Europe PMC | ✅ | | | ❌ | ❌ | Biomedical metadata plus open full-text links |
105
+ | PubMed | ✅ Yes | ❌ No | ❌ No | ❌ No | 🟡 Optional `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | NCBI E-utilities biomedical metadata |
106
+ | PubMed Central | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Open biomedical full text and PMC PDFs |
107
+ | Europe PMC | ✅ Yes | 🟡 Conditional | 🟡 Conditional | ❌ No | ❌ None | Biomedical metadata and open full-text links |
120
108
 
121
- #### Computer Science / Engineering
109
+ #### Preprints and Conference Papers
122
110
 
123
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
111
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
124
112
  | --- | --- | --- | --- | --- | --- | --- |
125
- | DBLP | ✅ | | | ❌ | ❌ | Computer science bibliography through the official DBLP search API |
126
- | ACM Digital Library | ✅ | | | ✅ | ❌ | ACM DOI-prefix metadata through Crossref; no ACM scraping |
127
- | USENIX | ✅ | | | ❌ | ❌ | DBLP-backed USENIX proceedings metadata; no USENIX search-page scraping |
128
- | IEEE Xplore | ✅ | ❌ | ❌ | | Required | IEEE metadata through the official IEEE Xplore Metadata API |
113
+ | arXiv | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Physics, CS, math, quantitative preprints |
114
+ | bioRxiv | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Biology preprints |
115
+ | medRxiv | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Medical preprints |
116
+ | OpenReview | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ None | Public OpenReview notes, reviews, and submissions |
117
+ | IACR ePrint | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Cryptography ePrint papers |
129
118
 
130
- #### Preprints / Conference Papers
119
+ #### Computer Science and Engineering
131
120
 
132
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
121
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
133
122
  | --- | --- | --- | --- | --- | --- | --- |
134
- | arXiv | ✅ | | | ❌ | ❌ | Physics, CS, math, and related preprints |
135
- | bioRxiv | ✅ | | | | ❌ | Biology preprints |
136
- | medRxiv | ✅ | | | ❌ | ❌ | Medical preprints |
137
- | OpenReview | ✅ | ❌ | ❌ | | | Conference submissions, reviews, and preprints through public OpenReview notes search |
138
- | IACR ePrint | ✅ | ✅ | ✅ | ❌ | ❌ | Cryptography papers |
123
+ | DBLP | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ None | Official DBLP computer-science bibliography |
124
+ | ACM metadata | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ❌ None | Uses Crossref ACM DOI-prefix metadata; does not scrape ACM pages |
125
+ | USENIX | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ None | Uses DBLP-backed USENIX metadata; does not scrape USENIX search pages |
126
+ | IEEE Xplore | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ✅ Required `IEEE_API_KEY` | Official IEEE Xplore Metadata API |
139
127
 
140
- #### Open Full Text / Repositories
128
+ #### Open Access and Repositories
141
129
 
142
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
130
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
143
131
  | --- | --- | --- | --- | --- | --- | --- |
144
- | CORE | ✅ | 🟡 Conditional | 🟡 Conditional | ❌ | 🟡 Optional | Downloads work when records include PDF or full-text links |
145
- | OpenAIRE | ✅ | 🟡 Conditional | ❌ | ❌ | 🟡 Optional | Can feed fallback downloads when records include open links |
146
- | Unpaywall | 🟡 Conditional | 🟡 Conditional | ❌ | ❌ | ✅ Required | DOI-only lookup; requires an email; downloads work when an OA PDF is found |
132
+ | CORE | ✅ Yes | 🟡 Conditional | 🟡 Conditional | ❌ No | 🟡 Optional `CORE_API_KEY` | Repository records may expose PDF or full-text links |
133
+ | OpenAIRE | ✅ Yes | 🟡 Conditional | ❌ No | ❌ No | 🟡 Optional `OPENAIRE_API_KEY` | Public search usually works without a key |
134
+ | Unpaywall | 🟡 DOI lookup only | 🟡 Conditional | ❌ No | ❌ No | ✅ Required `UNPAYWALL_EMAIL` or `PAPER_SEARCH_UNPAYWALL_EMAIL` | Finds DOI-based open-access PDF locations; email, not API key |
147
135
 
148
- #### Citation Indexes / Publishers
136
+ #### Citation Indexes and Publishers
149
137
 
150
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
138
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
151
139
  | --- | --- | --- | --- | --- | --- | --- |
152
- | Web of Science | ✅ | ❌ | ❌ | ✅ | ✅ Required | Citation database, date sorting, year ranges |
153
- | ScienceDirect | ✅ | | ❌ | ✅ | ✅ Required | Elsevier metadata and abstracts |
154
- | Springer Nature / SpringerLink | ✅ | 🟡 Conditional | ❌ | ❌ | ✅ Required | `springerlink` is an alias for the existing Springer Nature integration |
155
- | Wiley | ❌ Keyword search | ✅ | ✅ | ❌ | ✅ Required | TDM API, DOI-based PDF download only |
156
- | Scopus | ✅ | | ❌ | ✅ | ✅ Required | Abstract and citation database |
140
+ | Web of Science | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ✅ Required `WOS_API_KEY` | Citation database metadata, date sorting, year ranges |
141
+ | ScienceDirect | ✅ Yes | 🟡 Conditional | ❌ No | ✅ Yes | ✅ Required `ELSEVIER_API_KEY` | Elsevier metadata; product permission is separate from Scopus |
142
+ | Springer Nature / SpringerLink | ✅ Yes | 🟡 Conditional | ❌ No | ❌ No | ✅ Required `SPRINGER_API_KEY`; 🟡 Optional `SPRINGER_OPENACCESS_API_KEY` | `springerlink` is an alias for the Springer integration |
143
+ | Wiley | ❌ No keyword search | ✅ DOI download | ✅ Yes | ❌ No | ✅ Required `WILEY_TDM_TOKEN` | TDM API; use a DOI found from another metadata source |
144
+ | Scopus | ✅ Yes | 🟡 Conditional metadata | ❌ No | ✅ Yes | ✅ Required `ELSEVIER_API_KEY` | Abstract and citation database; product permission is separate from ScienceDirect |
157
145
 
158
- #### DOI-Targeted Lookup
146
+ #### DOI-Targeted Fallback
159
147
 
160
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
148
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
161
149
  | --- | --- | --- | --- | --- | --- | --- |
162
- | Sci-Hub | | ✅ | ❌ | ❌ | ❌ | DOI-based paper lookup and PDF retrieval |
150
+ | Sci-Hub | ❌ No | ✅ Yes | ❌ No | ❌ No | ❌ None | DOI/URL-targeted lookup and final PDF fallback when enabled |
163
151
 
164
152
  Notes:
165
153
 
166
- - In capability columns, `✅` means directly supported, `❌` means unsupported, and `🟡 Conditional` means support depends on record content or provider constraints, such as DOI-only lookup, available PDF/OA links, or open-access-only downloads.
167
- - In the API Key column, `❌` means no configuration is needed, `🟡 Optional` means configuration improves limits or stability, and `✅ Required` means the key is required only when you use that platform, not that every new installation should configure it. Unpaywall requires an email rather than a traditional API key.
168
- - Wiley does not support keyword search through the Wiley TDM API. Use `search_crossref` to find Wiley articles and then use `download_paper` with `platform=wiley` and the DOI.
169
- - ACM and USENIX search intentionally use metadata-backed routes rather than crawling provider search pages, which keeps the integration compatible with robots.txt and reduces IP-blocking risk.
170
- - `platform=all` tries every registered search source except DOI-download-only providers such as Wiley. Sources without configured credentials, sources that time out, and sources that fail are recorded in `failed_sources` / `errors` while the remaining sources continue.
171
- - `--sources` accepts a comma-separated source list, for example `--sources crossref,openalex,pmc`.
172
- - `🟡 Optional*` for Semantic Scholar means optional for regular search; `search_semantic_snippets` body-snippet search requires `SEMANTIC_SCHOLAR_API_KEY`.
173
- - EasyScholar is a journal metrics lookup tool, not a paper search source. Use `paper-search journal-metrics "Nature"` or `paper-search run query_journal_metrics`.
154
+ - `Metadata search` means finding and screening papers; it is not the same as PDF download or body evidence.
155
+ - `pdf_discovery` separates open-access sources, configured entitlement sources, and Sci-Hub as a separately identified final fallback.
156
+ - EasyScholar is a journal metrics source, not a paper search source.
157
+ - Sci-Hub is not part of `metadata_search`; it is DOI/URL-targeted PDF fallback.
158
+ - `🟡 Conditional` means the platform can help only when the record exposes a DOI, open-access link, PDF URL, or configured entitlement.
159
+ - API keys are only required when you use the corresponding key-backed source or workflow.
174
160
 
175
161
  ## Configuration
176
162
 
177
- Most free metadata sources work without configuration. For API keys and emails, prefer the user-level config file so the CLI works from any directory:
163
+ Most free metadata sources work without configuration. For stable agent workflows, run setup once and store credentials in the user config file:
178
164
 
179
165
  ```bash
180
166
  paper-search setup
181
- paper-search config set SEMANTIC_SCHOLAR_API_KEY your_semantic_scholar_api_key_here
182
- paper-search setup EASYSCHOLAR_KEY # hidden prompt; safer for EasyScholar SecretKey
183
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com # optional: replace the setup-generated email
184
167
  paper-search config list --pretty
185
- paper-search config doctor --pretty
186
- paper-search diagnostics --pretty
168
+ paper-search doctor --pretty
187
169
  ```
188
170
 
189
- The default config path is:
171
+ Default config path:
190
172
 
191
173
  ```text
192
174
  ~/.config/paper-search-cli/config.json
193
175
  ```
194
176
 
195
- The file is written with `0600` permissions. `config list` and `config doctor` mask secrets.
177
+ The config file is written with `0600` permissions. `config list`, `doctor`, and related commands mask secrets.
196
178
 
197
- `paper-search setup` is the guided setup command. By default it asks for the recommended credentials only: Semantic Scholar, Unpaywall email, Crossref email, CORE, and EasyScholar. Use `paper-search setup --all` to walk through every supported configuration key, or `paper-search setup --keys SEMANTIC_SCHOLAR_API_KEY,CORE_API_KEY` to configure a specific subset.
179
+ ### API Key Tiers
198
180
 
199
- To reduce first-run friction, if `PAPER_SEARCH_UNPAYWALL_EMAIL` / `UNPAYWALL_EMAIL` / `CROSSREF_MAILTO` are not configured, pressing Enter during setup writes a random Gmail-format address such as `paper.search.xxxxxx@gmail.com`, so basic Unpaywall and Crossref requests can run immediately.
200
-
201
- `paper-search diagnostics --pretty` lists every API-key or email-backed capability, the related config keys, whether the required keys are configured, common failure modes, and suggested next checks. Search commands also add a `diagnostic` field when a key-backed platform returns zero results or an auth/permission/rate-limit error.
202
-
203
- ### API Key Recommendation
204
-
205
- `paper-search setup` asks only for the credentials that are most useful for ordinary new users. `✅ Required` in the platform table means "required for that platform", not "recommended for every installation".
206
-
207
- | Level | Config keys | Recommended for new users | Notes |
181
+ | Tier | Keys | Used for | When to configure |
208
182
  | --- | --- | --- | --- |
209
- | Default recommended | `SEMANTIC_SCHOLAR_API_KEY` | Yes | Enables Semantic Scholar body-snippet search for methodology details and improves request stability. |
210
- | Default recommended | `PAPER_SEARCH_UNPAYWALL_EMAIL` or `UNPAYWALL_EMAIL` | Yes | Finds open-access PDFs from DOI records; this only needs an email, not an API key. Press Enter in `setup` to generate a random Gmail-format email, or replace it manually. |
211
- | Default recommended | `CROSSREF_MAILTO` | Yes | Puts Crossref requests in the polite pool, which is better for long-running or frequent searches. Press Enter in `setup` to reuse the generated email, or replace it manually. |
212
- | Default recommended | `CORE_API_KEY` or `PAPER_SEARCH_CORE_API_KEY` | Yes | CORE anonymous access is often rate-limited; a key makes open repository search more reliable. |
213
- | Default recommended | `EASYSCHOLAR_KEY` or `PAPER_SEARCH_EASYSCHOLAR_KEY` | Yes, if you need journal metrics | Enables EasyScholar journal metrics such as impact factor, JCR quartile, CAS zones, JCI, ESI, and warning flags. Use `paper-search setup EASYSCHOLAR_KEY` so the SecretKey is entered through a hidden prompt. |
214
- | Biomedical-heavy use | `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | Recommended if you use PubMed heavily | Raises NCBI E-utilities limits and identifies the client. |
215
- | Institution entitlement | `WOS_API_KEY` | Configure only with Web of Science API access | Enables Web of Science search and citation data; requires Clarivate API entitlement. |
216
- | Institution entitlement | `IEEE_API_KEY` | Configure only with IEEE Xplore API access | Enables IEEE Xplore metadata search; IEEE may require registered API access and product entitlement. |
217
- | Institution entitlement | `ELSEVIER_API_KEY` | Configure only with Scopus or ScienceDirect API access | One Elsevier key does not automatically grant both products; Scopus and ScienceDirect need separate entitlements. |
218
- | Institution entitlement | `SPRINGER_API_KEY`, `SPRINGER_OPENACCESS_API_KEY` | Configure only when you need Springer | Used for Springer metadata and open-access records; 401 usually means an invalid key or missing product access. |
219
- | Institution entitlement | `WILEY_TDM_TOKEN` | Configure only with Wiley TDM/institutional full-text access | DOI-based download only; availability depends on the token and institutional subscription. |
220
- | Usually unnecessary | `PAPER_SEARCH_OPENAIRE_API_KEY` or `OPENAIRE_API_KEY` | Not recommended by default | OpenAIRE public search usually works without a key; configure only for account or quota requirements. |
221
-
222
- You can also import an existing `.env`:
223
-
224
- ```bash
225
- paper-search config import-env .env --pretty
226
- ```
227
-
228
- Config priority is:
229
-
230
- 1. Shell environment variables.
231
- 2. Current working directory `.env`.
232
- 3. User config file.
233
- 4. Built-in defaults for free sources.
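The four-level priority above can be read as a first-match lookup over ordered layers. The sketch below is only an illustration of that reading, not the package's `ConfigService`: `resolveConfig`, `ConfigLayer`, and every value in the sample layers are hypothetical, and treating an empty string as "not set" is an assumption.

```typescript
// Hypothetical helper illustrating the documented priority order.
// It is not the implementation shipped in paper-search-cli.
type ConfigLayer = Record<string, string | undefined>;

// Return the first non-empty value, scanning layers from highest to lowest priority.
// Assumption: an empty string counts as "not configured".
function resolveConfig(key: string, layers: ConfigLayer[]): string | undefined {
  for (const layer of layers) {
    const value = layer[key];
    if (value !== undefined && value !== "") {
      return value;
    }
  }
  return undefined;
}

// Illustrative layers only; all values are made up for the example.
const shellEnv: ConfigLayer = { CROSSREF_MAILTO: "shell@example.com" };
const cwdDotEnv: ConfigLayer = {
  CROSSREF_MAILTO: "dotenv@example.com",
  CORE_API_KEY: "core-from-dotenv",
};
const userConfig: ConfigLayer = { CORE_API_KEY: "core-from-user", NCBI_TOOL: "my-tool" };
const builtInDefaults: ConfigLayer = { WOS_API_VERSION: "v1" };

// Order matters: shell, then working-directory .env, then user config, then defaults.
const layers: ConfigLayer[] = [shellEnv, cwdDotEnv, userConfig, builtInDefaults];

console.log(resolveConfig("CROSSREF_MAILTO", layers));
console.log(resolveConfig("CORE_API_KEY", layers));
```

Under these assumptions, a key set in the shell shadows the same key in `.env`, and a key set only in the user config is used when neither higher layer defines it.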
234
-
235
- For repo-local development, copying `.env.example` still works:
236
-
237
- ```bash
238
- cp .env.example .env
239
- ```
240
-
241
- ### Environment Variables
242
-
243
- ```bash
244
- # Web of Science, required for Web of Science search
245
- WOS_API_KEY=your_web_of_science_api_key_here
246
- WOS_API_VERSION=v1
247
-
248
- # IEEE Xplore, required for IEEE metadata search
249
- IEEE_API_KEY=your_ieee_api_key_here
250
-
251
- # PubMed, optional; increases rate limit from 3 requests/sec to 10 requests/sec
252
- PUBMED_API_KEY=your_ncbi_api_key_here
253
- NCBI_EMAIL=you@example.com
254
- NCBI_TOOL=paper-search-cli
255
-
256
- # Semantic Scholar, required for body-snippet search and useful for higher request limits
257
- SEMANTIC_SCHOLAR_API_KEY=your_semantic_scholar_api_key_here
258
-
259
- # EasyScholar, required for journal metrics such as IF, JCR quartile, and CAS zones
260
- EASYSCHOLAR_KEY=your_easyscholar_secret_key_here
261
-
262
- # Elsevier, required for Scopus and ScienceDirect; each product still needs separate entitlement
263
- ELSEVIER_API_KEY=your_elsevier_api_key_here
264
-
265
- # Springer Nature, required for Springer search and open access download
266
- SPRINGER_API_KEY=your_springer_api_key_here
267
- SPRINGER_OPENACCESS_API_KEY=your_openaccess_api_key_here
268
-
269
- # Wiley TDM, required for Wiley DOI-based PDF download
270
- WILEY_TDM_TOKEN=your_wiley_tdm_token_here
271
-
272
- # Crossref polite pool, optional but recommended; setup can auto-generate/reuse a random Gmail-format email
273
- CROSSREF_MAILTO=you@example.com
274
-
275
- # Unpaywall, required for DOI-based OA resolution; setup can auto-generate a random Gmail-format email
276
- PAPER_SEARCH_UNPAYWALL_EMAIL=you@example.com
277
- UNPAYWALL_EMAIL=you@example.com
278
-
279
- # CORE, optional but recommended; anonymous access is often heavily rate-limited
280
- PAPER_SEARCH_CORE_API_KEY=your_core_api_key_here
281
- CORE_API_KEY=your_core_api_key_here
282
-
283
- # OpenAIRE, optional; public search works without a key
284
- PAPER_SEARCH_OPENAIRE_API_KEY=your_openaire_api_key_here
285
- OPENAIRE_API_KEY=your_openaire_api_key_here
286
- ```
287
-
288
- ### API Key Sources
289
-
290
- - Web of Science: [Clarivate Developer Portal](https://developer.clarivate.com/apis)
291
- - IEEE Xplore: [IEEE Xplore Metadata API](https://developer.ieee.org/docs/read/Searching_the_IEEE_Xplore_Metadata_API)
292
- - PubMed: [NCBI API Keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/)
293
- - Semantic Scholar: [Semantic Scholar API](https://www.semanticscholar.org/product/api)
294
- - EasyScholar: [EasyScholar Open API](https://www.easyscholar.cc/console/user/open)
295
- - Elsevier: [Elsevier Developer Portal](https://dev.elsevier.com/apikey/manage)
296
- - Springer Nature: [Springer Nature Developers](https://dev.springernature.com/)
297
- - Wiley TDM: [Wiley Text and Data Mining](https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining)
298
- - Unpaywall: [Unpaywall Data Format and API](https://unpaywall.org/products/api)
299
- - CORE: [CORE API](https://core.ac.uk/services/api)
300
- - OpenAIRE: [OpenAIRE APIs](https://develop.openaire.eu/)
301
-
302
- `.env` is ignored by git. Do not commit API keys or tokens.
183
+ | Recommended for most users | `SEMANTIC_SCHOLAR_API_KEY` | Body snippet search and more stable Semantic Scholar requests | Configure if you use methods/detail searches or high-frequency Semantic Scholar lookup |
184
+ | Recommended for most users | `UNPAYWALL_EMAIL` or `PAPER_SEARCH_UNPAYWALL_EMAIL` | DOI-based open-access PDF resolution | Configure during setup; an email is required, not an API key |
185
+ | Recommended for most users | `CROSSREF_MAILTO` | Crossref polite pool | Configure for long-running or frequent metadata search |
186
+ | Recommended for most users | `CORE_API_KEY` | CORE repository search | Configure if you rely on CORE or hit anonymous rate limits |
187
+ | Journal metrics | `EASYSCHOLAR_KEY` | EasyScholar impact factor, JCR/SSCI, CAS, JCI, ESI, warning flags | Configure if you need journal metrics; use `paper-search setup EASYSCHOLAR_KEY` for hidden input |
188
+ | Biomedical-heavy use | `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | NCBI E-utilities stability and higher limits | Configure if PubMed is a frequent source |
189
+ | Institutional or publisher access | `WOS_API_KEY`, `IEEE_API_KEY`, `ELSEVIER_API_KEY`, `SPRINGER_API_KEY`, `SPRINGER_OPENACCESS_API_KEY`, `WILEY_TDM_TOKEN` | Web of Science, IEEE, Scopus, ScienceDirect, Springer, Wiley metadata or entitled access | Configure only when you have the relevant API or institutional permission |
190
+ | Usually optional | `OPENAIRE_API_KEY` | OpenAIRE account/quota use | Usually unnecessary for public search |
191
+
192
+ Useful key dashboards:
193
+
194
+ | Service | Link |
195
+ | --- | --- |
196
+ | EasyScholar | [EasyScholar Open API](https://www.easyscholar.cc/console/user/open) |
197
+ | Semantic Scholar | [Semantic Scholar API](https://www.semanticscholar.org/product/api) |
198
+ | Unpaywall | [Unpaywall API](https://unpaywall.org/products/api) |
199
+ | CORE | [CORE API](https://core.ac.uk/services/api) |
200
+ | PubMed | [NCBI API Keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/) |
201
+ | Web of Science | [Clarivate Developer Portal](https://developer.clarivate.com/apis) |
202
+ | IEEE Xplore | [IEEE Xplore Metadata API](https://developer.ieee.org/docs/read/Searching_the_IEEE_Xplore_Metadata_API) |
203
+ | Elsevier | [Elsevier Developer Portal](https://dev.elsevier.com/apikey/manage) |
204
+ | Springer Nature | [Springer Nature Developers](https://dev.springernature.com/) |
205
+ | Wiley TDM | [Wiley Text and Data Mining](https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining) |
206
+ | OpenAIRE | [OpenAIRE APIs](https://develop.openaire.eu/) |
303
207
 
304
208
  ## Agent Skill
305
209
 
306
- This repository includes an optional agent skill at `skills/paper-search/SKILL.md`. Install it into your agent's skill directory if your agent supports skills.
307
-
308
- For example:
309
-
310
- ```bash
311
- mkdir -p ~/.agents/skills/paper-search
312
- cp skills/paper-search/SKILL.md ~/.agents/skills/paper-search/SKILL.md
313
- ```
314
-
315
- The skill only teaches the agent how to call the `paper-search` CLI. API keys are still configured through `paper-search setup`, `paper-search config`, `.env`, or shell environment variables. Do not store secrets in the skill file.
316
-
317
- ## Output Contract
318
-
319
- By default, every command writes JSON to stdout.
320
-
321
- ```json
322
- {
323
- "ok": true,
324
- "tool": "search_papers",
325
- "message": "Found 1 papers.",
326
- "data": []
327
- }
328
- ```
329
-
330
- Use `--pretty` for formatted JSON:
331
-
332
- ```bash
333
- paper-search search "machine learning" --platform crossref --max-results 1 --pretty
334
- ```
335
-
336
- Use `--format text` if you need the raw text response:
210
+ The npm package ships a bundled agent Skill at `skills/paper-search/SKILL.md`. Terminal users can use the CLI directly; AI agent workflows should install or sync the Skill so the agent can route the five main workflows correctly.
337
211
 
338
212
  ```bash
339
- paper-search search "machine learning" --platform crossref --max-results 1 --format text
213
+ paper-search setup --install-skills agents
214
+ paper-search skills status --pretty
215
+ paper-search skills diff --targets agents --format text
216
+ paper-search skills update --targets agents --pretty
340
217
  ```
341
218
 
342
- Use `--include-text` to keep the raw response text alongside parsed JSON:
219
+ Supported targets include `agents`, `codex`, `claude`, `cursor`, `gemini`, `antigravity`, and `all`. Skill updates overwrite package-managed Skill files while preserving extra files in the installed Skill directory.
343
220
 
344
- ```bash
345
- paper-search run search_crossref --arg query="machine learning" --arg maxResults=3 --include-text --pretty
346
- ```
221
+ The Skill only teaches agents how to call the `paper-search` CLI. API keys still belong in `paper-search setup`, `paper-search config`, `.env`, or shell environment variables.
347
222
 
348
223
  ## Commands
349
224
 
350
- ### `paper-search search`
351
-
352
- Unified search entrypoint.
353
-
354
- ```bash
355
- paper-search search <query> [options]
356
- ```
357
-
358
- Examples:
359
-
360
- ```bash
361
- paper-search search "machine learning" --platform crossref --max-results 10 --pretty
362
- paper-search search "machine learning" --sources crossref,openalex --max-results 2 --pretty
363
- paper-search search "cancer immunotherapy" --platform all --max-results 2 --pretty
364
- paper-search search "transformer neural networks" --platform arxiv --category cs.AI --year 2023 --pretty
365
- paper-search search "COVID-19 vaccine efficacy" --platform pubmed --max-results 20 --year 2023 --pretty
366
- paper-search search "CRISPR gene editing" --platform webofscience --journal Nature --max-results 15 --pretty
367
- ```
368
-
369
- Common options:
370
-
371
- | Option | Description |
225
+ | Command | Purpose |
372
226
  | --- | --- |
373
- | `--platform` | Source platform. Default: `crossref` |
374
- | `--sources` | Comma-separated source list for multi-source search, e.g. `crossref,openalex,pmc` |
375
- | `--max-results` | Maximum result count |
376
- | `--year` | Year filter, e.g. `2024`, `2020-2024`, `2020-` |
377
- | `--author` | Author name filter |
378
- | `--journal` | Journal name filter |
379
- | `--category` | Category filter, mainly arXiv/bioRxiv/medRxiv |
380
- | `--days` | Days back for bioRxiv/medRxiv |
381
- | `--sort-by` | `relevance`, `date`, or `citations` |
382
- | `--sort-order` | `asc` or `desc` |
383
-
384
- ### `paper-search run`
385
-
386
- Run a specific internal tool by name. This is the most precise command for agent workflows.
387
-
388
- ```bash
389
- paper-search run <tool-name> --arg key=value --arg key=value
390
- paper-search run <tool-name> --json-args '{"key":"value"}'
391
- paper-search run <tool-name> --json-args @args.json
392
- ```
393
-
394
- Examples:
395
-
396
- ```bash
397
- paper-search run search_crossref --arg query="machine learning" --arg maxResults=5 --pretty
398
- paper-search run search_papers --json-args '{"query":"machine learning","sources":"crossref,openalex","maxResults":2}' --pretty
399
- paper-search run search_pubmed --json-args '{"query":"osteoarthritis","maxResults":5,"sortBy":"date"}' --pretty
400
- paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --pretty
401
- paper-search run query_journal_metrics --json-args '{"journals":["Nature","BMJ"],"includeRaw":true}' --pretty
402
- ```
403
-
404
- ### `paper-search journal-metrics`
405
-
406
- Query journal-level metrics through EasyScholar. Requires `EASYSCHOLAR_KEY` or `PAPER_SEARCH_EASYSCHOLAR_KEY`.
407
-
408
- ```bash
409
- paper-search journal-metrics "Nature" "BMJ" --pretty
410
- paper-search journal-metrics --file journals.txt --include-raw --pretty
411
- ```
412
-
413
- Returned normalized fields include `impact_factor`, `impact_factor_5y`, `jcr_quartile`, `ssci_quartile`, `jci`, `cas_base`, `cas_upgraded`, `cas_small`, `cas_top`, `cas_zone`, `esi`, `warning`, `pku`, `cssci`, `cscd`, `ahci`, `ccf`, `ei`, and `china_st_core` when EasyScholar returns them. `--include-raw` also keeps `official_all`, `official_select`, and `custom_rank`.
414
-
415
- ### `paper-search tools`
416
-
417
- List all available tool names, descriptions, and input schemas.
418
-
419
- ```bash
420
- paper-search tools --pretty
421
- ```
422
-
423
- ### `paper-search status`
424
-
425
- Show platform capabilities and API key status. Secrets are never printed.
426
-
427
- ```bash
428
- paper-search status --pretty
429
- paper-search status --validate --pretty
430
- ```
431
-
432
- `--validate` may make live provider requests. Use it when you intentionally want credential validation.
433
-
434
- ### `paper-search diagnostics`
435
-
436
- Show API-key-backed capabilities and troubleshooting guidance. This does not print secrets.
437
-
438
- ```bash
439
- paper-search diagnostics --pretty
440
- ```
441
-
442
- When a command returns zero results from a configured key-backed source, or fails with 401, 403, 400, or 429, JSON output includes a `diagnostic` field with likely causes and next actions.
443
-
444
- ### `paper-search config`
445
-
446
- Manage the user-level config file.
447
-
448
- ```bash
449
- paper-search config init --pretty
450
- paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key --pretty
451
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com --pretty # optional: replace the setup-generated email
452
- paper-search config import-env .env --pretty
453
- paper-search config list --pretty
454
- paper-search config doctor --pretty
455
- paper-search config path --pretty
456
- paper-search config keys --pretty
457
- ```
458
-
459
- ### `paper-search download`
460
-
461
- Download a paper PDF through a platform that supports downloads.
462
-
463
- ```bash
464
- paper-search download <paper-id-or-doi> --platform <platform> [--save-path ./downloads]
465
- ```
466
-
467
- Examples:
468
-
469
- ```bash
470
- paper-search download 2301.00001 --platform arxiv --save-path ./downloads
471
- paper-search download 10.1000/example --platform scihub --save-path ./downloads
472
- paper-search download 10.1111/jtsb.12390 --platform wiley --save-path ./downloads
473
- paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
474
- ```
475
-
476
- ## Tool Reference
477
-
478
- These names can be used with `paper-search run`.
479
-
480
- ### `search_papers`
481
-
482
- Search across the unified dispatcher.
483
-
484
- ```bash
485
- paper-search run search_papers --json-args '{"query":"machine learning","platform":"crossref","maxResults":10,"year":"2023","sortBy":"date"}' --pretty
486
- ```
487
-
488
- Supported platforms:
489
-
490
- ```text
491
- crossref, arxiv, webofscience, wos, pubmed, biorxiv, medrxiv, semantic,
492
- iacr, googlescholar, scholar, scihub, ieee, sciencedirect, springer,
493
- springerlink, scopus, openalex, unpaywall, pmc, europepmc, core,
494
- openaire, dblp, acm, usenix, openreview, all
495
- ```
496
-
497
- For multi-source search, pass `sources`:
498
-
499
- ```bash
500
- paper-search run search_papers --json-args '{"query":"machine learning","sources":"crossref,openalex,pmc","maxResults":2}' --pretty
501
- ```
502
-
503
- ### `search_crossref`
504
-
505
- Search Crossref, the default free metadata source.
506
-
507
- ```bash
508
- paper-search run search_crossref --arg query="machine learning" --arg maxResults=10 --arg year=2023 --arg sortBy=relevance --arg sortOrder=desc --pretty
509
- ```
227
+ | `paper-search search` | Integrated metadata search |
228
+ | `paper-search journal-metrics` | EasyScholar journal metrics lookup |
229
+ | `paper-search download` | Direct PDF download for a verified paper ID or DOI |
230
+ | `paper-search run` | Precise tool invocation with `--arg` or `--json-args` |
231
+ | `paper-search tools` | Runtime tool names and schemas |
232
+ | `paper-search doctor` | Masked config, Capability Profile, and platform status |
233
+ | `paper-search smoke` | Mock or live self-checks |
234
+ | `paper-search skills` | Bundled Skill status, diff, and update |
235
+ | `paper-search config` | User-level configuration management |
510
236
 
511
- ### `search_arxiv`
237
+ Full command and tool schema: run `paper-search tools --pretty` or see [`skills/paper-search/references/cli-contract.md`](skills/paper-search/references/cli-contract.md).
512
238
 
513
- Search arXiv preprints.
239
+ ## Output
514
240
 
515
- ```bash
516
- paper-search run search_arxiv --arg query="transformer neural networks" --arg maxResults=10 --arg category=cs.AI --arg year=2023 --arg sortBy=date --arg sortOrder=desc --pretty
517
- ```
518
-
519
- ### `search_pubmed`
520
-
521
- Search PubMed/MEDLINE biomedical literature.
522
-
523
- ```bash
524
- paper-search run search_pubmed --json-args '{"query":"COVID-19 vaccine efficacy","maxResults":20,"year":"2023","journal":"New England Journal of Medicine","publicationType":["Journal Article","Clinical Trial"],"sortBy":"date"}' --pretty
525
- ```
526
-
527
- ### Open Metadata And Full-Text Sources
528
-
529
- Use these commands for open metadata search, open full-text discovery, and fallback PDF lookup:
241
+ Commands return JSON by default. Use `--pretty` for formatted JSON and `--format text` only when you need a human-readable report.
530
242
 
531
243
  ```bash
532
- paper-search run search_openalex --arg query="machine learning" --arg maxResults=3 --pretty
533
- paper-search run search_unpaywall --arg query="10.48550/arxiv.1201.0490" --pretty
534
- paper-search run search_pmc --arg query="cancer immunotherapy" --arg maxResults=3 --pretty
535
- paper-search run search_europepmc --arg query="cancer genomics" --arg maxResults=3 --pretty
536
- paper-search run search_core --arg query="machine learning" --arg maxResults=3 --pretty
537
- paper-search run search_openaire --arg query="machine learning" --arg maxResults=3 --pretty
538
- ```
539
-
540
- Unpaywall is DOI-only and requires an email. CORE public access may return zero results or rate-limit quickly without an API key.
541
-
542
- ### Registry-Backed Platform Search
543
-
544
- These metadata-oriented tools are generated from the platform registry, so adding later platforms only needs a new searcher plus registry metadata:
545
-
546
- ```bash
547
- paper-search run search_dblp --arg query="graph neural networks" --arg maxResults=5 --pretty
548
- paper-search run search_acm --arg query="software testing" --arg maxResults=5 --pretty
549
- paper-search run search_usenix --arg query="file systems" --arg maxResults=5 --pretty
550
- paper-search run search_openreview --arg query="large language models" --arg maxResults=5 --pretty
551
- paper-search run search_springerlink --arg query="machine learning" --arg maxResults=5 --pretty
552
- ```
553
-
554
- `search_ieee` uses the same generic schema but requires `IEEE_API_KEY`:
555
-
556
- ```bash
557
- paper-search run search_ieee --arg query="wireless networks" --arg maxResults=5 --arg articleTitle="wireless" --pretty
558
- ```
559
-
560
- ### `search_webofscience`
561
-
562
- Search Web of Science. Requires `WOS_API_KEY`.
563
-
564
- ```bash
565
- paper-search run search_webofscience --arg query="CRISPR gene editing" --arg maxResults=15 --arg year=2022 --arg journal=Nature --pretty
566
- ```
567
-
568
- ### `search_google_scholar`
569
-
570
- Search Google Scholar.
571
-
572
- ```bash
573
- paper-search run search_google_scholar --arg query="deep learning" --arg maxResults=10 --arg yearLow=2020 --arg yearHigh=2024 --pretty
574
- ```
575
-
576
- ### `search_biorxiv` and `search_medrxiv`
577
-
578
- Search preprint servers by recent day window and optional category.
579
-
580
- ```bash
581
- paper-search run search_biorxiv --arg query="genomics" --arg maxResults=10 --arg days=30 --pretty
582
- paper-search run search_medrxiv --arg query="epidemiology" --arg maxResults=10 --arg days=60 --pretty
583
- ```
584
-
585
- ### `search_semantic_scholar`
586
-
587
- Search Semantic Scholar with optional field filters.
588
-
589
- ```bash
590
- paper-search run search_semantic_scholar --json-args '{"query":"graph neural networks","maxResults":10,"fieldsOfStudy":["Computer Science"]}' --pretty
591
- ```
592
-
593
- ### `search_semantic_snippets`
594
-
595
- Search Semantic Scholar's Open Access snippet index for body-text snippets that can help locate methodological details. Requires `SEMANTIC_SCHOLAR_API_KEY`.
596
-
597
- ```bash
598
- paper-search run search_semantic_snippets --arg query="CMAverse mediation bootstrap confidence interval" --arg limit=5 --arg fieldsOfStudy=Medicine --pretty
599
- ```
600
-
601
- ### `query_journal_metrics`
602
-
603
- Query EasyScholar journal metrics. This is not a paper search source; it is a journal-level lookup for publication planning, target-journal screening, and submission checks. Requires `EASYSCHOLAR_KEY` or `PAPER_SEARCH_EASYSCHOLAR_KEY`.
604
-
605
- ```bash
606
- paper-search run query_journal_metrics --json-args '{"journals":["Nature","BMJ"]}' --pretty
607
- paper-search run query_journal_metrics --json-args '{"journal":"Journal of Medical Internet Research","includeRaw":true}' --pretty
608
- ```
609
-
610
- The normalized `core` object returns only fields present in EasyScholar for that journal, such as impact factor, JCR/SSCI quartiles, CAS zones, JCI, ESI, warning flags, and Chinese/discipline ranking indicators. Add `includeRaw=true` when you need the complete `officialRank.all`, `officialRank.select`, and `customRank` payloads.
611
-
612
- ### `search_iacr`
613
-
614
- Search IACR ePrint Archive.
615
-
616
- ```bash
617
- paper-search run search_iacr --arg query="zero knowledge proof" --arg maxResults=10 --arg fetchDetails=true --pretty
618
- ```
619
-
620
- ### `search_sciencedirect`
621
-
622
- Search ScienceDirect. Requires `ELSEVIER_API_KEY`.
623
-
624
- ```bash
625
- paper-search run search_sciencedirect --arg query="materials science" --arg maxResults=10 --arg openAccess=true --pretty
626
- ```
627
-
628
- ### `search_scopus`
629
-
630
- Search Scopus. Requires `ELSEVIER_API_KEY`.
631
-
632
- ```bash
633
- paper-search run search_scopus --arg query="citation analysis" --arg maxResults=10 --arg documentType=ar --pretty
634
- ```
635
-
636
- ### `search_springer`
637
-
638
- Search Springer Nature. Requires `SPRINGER_API_KEY`.
639
-
640
- ```bash
641
- paper-search run search_springer --arg query="machine learning" --arg maxResults=10 --arg type=Journal --arg openAccess=true --pretty
642
- ```
643
-
644
- ### `search_scihub`
645
-
646
- Look up a DOI or article URL through Sci-Hub and optionally download a PDF.
647
-
648
- ```bash
649
- paper-search run search_scihub --arg doiOrUrl="10.1038/nature12373" --arg downloadPdf=false --pretty
650
- paper-search run search_scihub --arg doiOrUrl="10.1038/nature12373" --arg downloadPdf=true --arg savePath=./downloads --pretty
651
- ```
652
-
653
- ### `check_scihub_mirrors`
654
-
655
- Show Sci-Hub mirror health.
656
-
657
- ```bash
658
- paper-search run check_scihub_mirrors --pretty
659
- paper-search run check_scihub_mirrors --arg forceCheck=true --pretty
660
- ```
661
-
662
- ### `get_paper_by_doi`
663
-
664
- Look up metadata by DOI.
665
-
666
- ```bash
667
- paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform=all --pretty
668
- paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform=arxiv --pretty
669
- ```
670
-
671
- ### `download_paper`
672
-
673
- Download PDF files from a platform. If the selected platform has no native downloader, or if native download fails, the command enters the same fallback funnel used by `download_with_fallback`.
674
-
675
- ```bash
676
- paper-search run download_paper --arg paperId="2301.00001" --arg platform=arxiv --arg savePath=./downloads --pretty
677
- ```
678
-
679
- Native download platforms:
680
-
681
- ```text
682
- arxiv, biorxiv, medrxiv, semantic, iacr, scihub, springer, wiley,
683
- pmc, europepmc, core
684
- ```
685
-
686
- Other registered sources, such as `crossref`, `openalex`, `dblp`, `acm`, `usenix`, or `openreview`, can still be passed to `download_paper`; they start directly at the metadata/repository/Unpaywall/Sci-Hub fallback funnel.
687
-
688
- ### `download_with_fallback`
689
-
690
- Try the full download funnel. The order is source-native download, metadata PDF URL, repository discovery, Unpaywall DOI resolution, then Sci-Hub as the final fallback:
691
-
692
- ```bash
693
- paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
694
- paper-search run download_with_fallback --arg source=crossref --arg paperId="10.1038/nature12373" --arg doi="10.1038/nature12373" --arg savePath=./downloads --pretty
695
- ```
696
-
697
- `useSciHub` defaults to `true`; set it to `false` only when you need to suppress that final fallback. `download_paper` also routes failed or unsupported platform downloads through the same funnel.
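The fallback order described above amounts to trying each stage in turn and stopping at the first one that yields a file. The sketch below only illustrates that control flow: `runFunnel`, `Step`, the step names, and the stubbed results are hypothetical, and the real tool's internals may differ.

```typescript
// Hypothetical sketch of a "try each stage until one succeeds" funnel.
// Step names mirror the documented order; the attempts are stubs, not real downloads.
type Step = { name: string; attempt: () => string | null };

function runFunnel(
  steps: Step[],
  useSciHub: boolean = true
): { via: string; path: string } | null {
  for (const step of steps) {
    // Skip the final fallback when the caller disables it.
    if (step.name === "scihub" && !useSciHub) {
      continue;
    }
    const path = step.attempt();
    if (path !== null) {
      return { via: step.name, path };
    }
  }
  return null;
}

// Stubbed stages in the documented order; only Unpaywall and Sci-Hub "succeed" here.
const steps: Step[] = [
  { name: "native", attempt: () => null },
  { name: "metadata-pdf-url", attempt: () => null },
  { name: "repository", attempt: () => null },
  { name: "unpaywall", attempt: () => "./downloads/paper.pdf" },
  { name: "scihub", attempt: () => "./downloads/paper-scihub.pdf" },
];

console.log(runFunnel(steps));
```

In this stubbed run the funnel stops at the Unpaywall stage, so the Sci-Hub stage is never attempted regardless of `useSciHub`.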
698
-
699
- ### `search_wiley`
700
-
701
- Wiley keyword search is not supported by the Wiley TDM API. Use Crossref first, then download by DOI:
702
-
703
- ```bash
704
- paper-search run search_crossref --arg query="site:wiley.com machine learning" --arg maxResults=10 --pretty
705
- paper-search run download_paper --arg paperId="10.1111/example" --arg platform=wiley --pretty
706
- ```
707
-
708
- ### `get_platform_status`
709
-
710
- Same as `paper-search status`.
711
-
712
- ```bash
713
- paper-search run get_platform_status --pretty
714
- paper-search run get_platform_status --arg validate=true --pretty
244
+ paper-search search "machine learning" --platform crossref --max-results 1 --pretty
245
+ paper-search doctor --format text
715
246
  ```
716
247
 
717
248
  ## Troubleshooting
718
249
 
719
- ### Command Not Found
720
-
721
- Run from the project:
722
-
723
- ```bash
724
- node dist/cli.js status --pretty
725
- ```
726
-
727
- Or register the local command:
728
-
729
- ```bash
730
- npm link
731
- paper-search status --pretty
732
- ```
733
-
734
- ### Missing API Key
735
-
736
- Run:
737
-
738
- ```bash
739
- paper-search status --pretty
740
- ```
741
-
742
- If a provider shows `missing`, add the relevant key through `paper-search setup`, user config, or `.env`, then rerun the command.
743
-
744
- For global installs, prefer user config:
745
-
746
- ```bash
747
- paper-search setup
748
- paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key
749
- paper-search config doctor --pretty
750
- ```
751
-
752
- ### Provider Rate Limits
753
-
754
- Reduce `--max-results`, avoid repeated live validation, and prefer sources with official APIs. PubMed, Semantic Scholar, and CORE support optional keys for better limits. CORE anonymous access can return HTTP 429; configure `PAPER_SEARCH_CORE_API_KEY` when you rely on it.
755
-
756
- ### JSON Parsing In Scripts
757
-
758
- Use default JSON output and parse stdout. Human diagnostics are written to stderr.
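As a rough illustration of parsing stdout in a script, the sketch below validates the envelope fields shown in the Output Contract section (`ok`, `tool`, `message`, `data`) plus the optional `diagnostic` field. `parseEnvelope` and `CliEnvelope` are hypothetical names, and the sample string stands in for captured stdout; how the CLI is spawned is left out on purpose.

```typescript
// Hypothetical envelope type based on the JSON example in the Output Contract section.
interface CliEnvelope {
  ok: boolean;
  tool: string;
  message: string;
  data: unknown;
  diagnostic?: unknown;
}

// Parse captured stdout and check the fields a script is likely to depend on.
function parseEnvelope(stdout: string): CliEnvelope {
  const parsed: unknown = JSON.parse(stdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("Expected a JSON object on stdout");
  }
  const record = parsed as Record<string, unknown>;
  const ok = record["ok"];
  const tool = record["tool"];
  const message = record["message"];
  if (typeof ok !== "boolean" || typeof tool !== "string") {
    throw new Error("Missing `ok` or `tool` in CLI output");
  }
  return {
    ok,
    tool,
    // Assumption: fall back to an empty message if the field is absent.
    message: typeof message === "string" ? message : "",
    data: record["data"],
    diagnostic: record["diagnostic"],
  };
}

// Sample stdout copied from the documented envelope shape.
const sampleStdout =
  '{"ok":true,"tool":"search_papers","message":"Found 1 papers.","data":[]}';
const envelope = parseEnvelope(sampleStdout);
console.log(envelope.ok, envelope.tool);
```

Keeping the parser strict about `ok` and `tool` lets a script fail early if stderr text or a non-JSON format is piped in by mistake.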
250
+ | Problem | First check |
251
+ | --- | --- |
252
+ | Command not found | Reinstall globally with `npm install -g paper-search-cli` |
253
+ | Missing capability | Run `paper-search doctor --pretty` and configure the missing key with `paper-search setup` |
254
+ | Provider rate limits | Lower `--max-results`, configure the relevant key, or switch sources |
255
+ | Skill looks stale | Run `paper-search skills status --pretty`, then `paper-search skills update --targets agents --pretty` |
256
+ | Need complete CLI details | Run `paper-search tools --pretty` |
759
257
 
760
258
  ## Usage Boundaries
761
259
 
@@ -763,9 +261,7 @@ Some sources may be subject to platform terms, institutional subscriptions, or l
763
261
 
764
262
  ## Project Origin
765
263
 
766
- This project acknowledges and thanks the [LinuxDo](https://linux.do) community.
767
-
768
- The CLI + Skill direction and paper-search workflow refinements were shaped by community discussions and open-source sharing. This repository keeps the workflow focused on a one-command terminal tool and does not require an MCP runtime.
264
+ This project acknowledges and thanks the [LinuxDo](https://linux.do) community. The CLI + Skill direction and paper-search workflow refinements were shaped by community discussions and open-source sharing.
769
265
 
770
266
  It also references ideas from [openags/paper-search-mcp](https://github.com/openags/paper-search-mcp) while adapting the workflow to a standalone CLI.
771
267