paper-search-cli 0.1.3 → 0.3.0

Files changed (59)
  1. package/.env.example +14 -5
  2. package/README.md +151 -617
  3. package/README.zh.md +268 -0
  4. package/dist/cli.js +199 -21
  5. package/dist/cli.js.map +1 -1
  6. package/dist/config/ConfigService.d.ts +1 -1
  7. package/dist/config/ConfigService.d.ts.map +1 -1
  8. package/dist/config/ConfigService.js +2 -2
  9. package/dist/config/ConfigService.js.map +1 -1
  10. package/dist/config/ResultCaps.d.ts +4 -0
  11. package/dist/config/ResultCaps.d.ts.map +1 -0
  12. package/dist/config/ResultCaps.js +10 -0
  13. package/dist/config/ResultCaps.js.map +1 -0
  14. package/dist/core/capabilityProfile.d.ts +18 -0
  15. package/dist/core/capabilityProfile.d.ts.map +1 -0
  16. package/dist/core/capabilityProfile.js +153 -0
  17. package/dist/core/capabilityProfile.js.map +1 -0
  18. package/dist/core/diagnostics.d.ts.map +1 -1
  19. package/dist/core/diagnostics.js +35 -15
  20. package/dist/core/diagnostics.js.map +1 -1
  21. package/dist/core/handleToolCall.d.ts.map +1 -1
  22. package/dist/core/handleToolCall.js +27 -0
  23. package/dist/core/handleToolCall.js.map +1 -1
  24. package/dist/core/liveSmoke.d.ts +42 -0
  25. package/dist/core/liveSmoke.d.ts.map +1 -0
  26. package/dist/core/liveSmoke.js +226 -0
  27. package/dist/core/liveSmoke.js.map +1 -0
  28. package/dist/core/platformMetadata.js +2 -2
  29. package/dist/core/platformMetadata.js.map +1 -1
  30. package/dist/core/schemas.d.ts +77 -2
  31. package/dist/core/schemas.d.ts.map +1 -1
  32. package/dist/core/schemas.js +57 -3
  33. package/dist/core/schemas.js.map +1 -1
  34. package/dist/core/textReports.d.ts +21 -0
  35. package/dist/core/textReports.d.ts.map +1 -0
  36. package/dist/core/textReports.js +85 -0
  37. package/dist/core/textReports.js.map +1 -0
  38. package/dist/core/tools.d.ts.map +1 -1
  39. package/dist/core/tools.js +31 -1
  40. package/dist/core/tools.js.map +1 -1
  41. package/dist/platforms/CORESearcher.d.ts.map +1 -1
  42. package/dist/platforms/CORESearcher.js +39 -9
  43. package/dist/platforms/CORESearcher.js.map +1 -1
  44. package/dist/platforms/OpenAIRESearcher.js +1 -1
  45. package/dist/platforms/OpenAIRESearcher.js.map +1 -1
  46. package/dist/services/JournalMetricsService.d.ts +38 -0
  47. package/dist/services/JournalMetricsService.d.ts.map +1 -0
  48. package/dist/services/JournalMetricsService.js +142 -0
  49. package/dist/services/JournalMetricsService.js.map +1 -0
  50. package/dist/skills/SkillInstaller.d.ts +108 -0
  51. package/dist/skills/SkillInstaller.d.ts.map +1 -0
  52. package/dist/skills/SkillInstaller.js +389 -0
  53. package/dist/skills/SkillInstaller.js.map +1 -0
  54. package/package.json +2 -2
  55. package/skills/paper-search/SKILL.md +53 -127
  56. package/skills/paper-search/references/capability-routing.md +134 -0
  57. package/skills/paper-search/references/cli-contract.md +133 -0
  58. package/skills/paper-search/references/management-layer.md +139 -0
  59. package/README-sc.md +0 -734
package/README.md CHANGED
@@ -1,721 +1,257 @@
1
1
  # Paper Search CLI
2
2
 
3
- [中文](README-sc.md)
3
+ [简体中文](README.zh.md) | English
4
4
 
5
- Paper Search CLI is a standalone Node.js command line tool for searching, validating, and downloading academic papers from multiple scholarly sources. It is designed for direct terminal use, automation scripts, and agent workflows that need a stable command surface with predictable JSON output.
6
-
7
- It keeps the broad platform coverage, unified paper model, and detailed capability descriptions of the earlier Paper Search implementation, but runs as a normal CLI process. There is no long-running background service to configure, start, or keep alive.
5
+ Paper Search CLI is an agent-facing Skill + CLI package built on a standalone Node.js command line tool for academic literature work. It gives AI agents, terminal users, and scripts one reproducible command layer with agent-friendly JSON output for literature metadata search, journal metrics lookup, PDF discovery/download, and paper body snippet search.
8
6
 
9
7
  ![Node.js](https://img.shields.io/badge/node.js->=18.0.0-green.svg)
10
8
  ![TypeScript](https://img.shields.io/badge/typescript-^5.5.3-blue.svg)
11
9
  ![License](https://img.shields.io/badge/license-MIT-blue.svg)
12
10
  ![Platforms](https://img.shields.io/badge/platforms-25-brightgreen.svg)
13
- ![Version](https://img.shields.io/badge/version-0.1.2-blue.svg)
11
+ ![Version](https://img.shields.io/badge/version-0.3.0-blue.svg)
14
12
  [![LinuxDo](https://img.shields.io/badge/LinuxDo-community-1f6feb)](https://linux.do)
15
13
 
16
- Thanks to the sincere, friendly, collaborative, and professional [LinuxDo](https://linux.do) community. The CLI + Skill direction and the paper-search workflow refinements in this project were shaped by LinuxDo discussions and open-source sharing.
14
+ [Quick Start](#quick-start) | [Architecture](#architecture) | [Configuration](#configuration) | [Agent Skill](#agent-skill) | [Supported Platforms](#supported-platforms) | [Commands](#commands) | [Troubleshooting](#troubleshooting)
17
15
 
18
- [Quick Start](#quick-start) · [Configuration](#configuration) · [Agent Skill](#agent-skill) · [Supported Platforms](#supported-platforms) · [Commands](#commands) · [Tool Reference](#tool-reference) · [Troubleshooting](#troubleshooting)
16
+ ## Core Workflows
19
17
 
20
- ## Design Goals
18
+ | Workflow | Primary commands | What it returns |
19
+ | --- | --- | --- |
20
+ | Literature metadata search | `paper-search search`, `paper-search run search_*` | Paper title, authors, year, journal, DOI, PMID/PMCID, arXiv ID, URL, abstract, and source metadata |
21
+ | Journal metrics lookup | `paper-search journal-metrics`, `paper-search run query_journal_metrics` | Impact factor, 5-year IF, JCR/SSCI quartile, CAS zone, JCI, ESI, warning flags, and rank fields |
22
+ | PDF discovery and download | `paper-search download`, `paper-search run download_with_fallback` | Verified PDF download paths through native sources, open access, configured entitlement, and Sci-Hub fallback when enabled |
23
+ | Body snippet search | `paper-search run search_semantic_snippets` | Semantic Scholar Open Access body snippets for methods, parameters, and wording clues |
21
24
 
22
- - **Free-first retrieval**: prefer public metadata and open-access full-text routes before restricted or fragile sources.
23
- - **One command surface**: keep search, status, download, and precise tool calls behind the same executable.
24
- - **Agent-safe output**: produce predictable JSON that can be parsed without scraping terminal text.
25
- - **Transparent source behavior**: document which platforms provide metadata only, which can download PDFs, and which need API keys.
26
- - **No hidden background process**: each command starts, returns a result, and exits.
25
+ ## Architecture
27
26
 
28
- ## Key Features
27
+ `paper-search` is not an MCP server. It is a normal CLI that AI agents can call through the bundled Skill, while terminal users and scripts can call the same `paper-search` command directly.
29
28
 
30
- - **25 academic sources/platforms**: Crossref, OpenAlex, PubMed, PubMed Central, Europe PMC, arXiv, bioRxiv, medRxiv, Semantic Scholar, CORE, OpenAIRE, DBLP, ACM Digital Library metadata, USENIX metadata, OpenReview, Web of Science, Google Scholar, IACR ePrint, Sci-Hub, IEEE Xplore, ScienceDirect, Springer Nature/SpringerLink, Wiley, Scopus, and Unpaywall.
31
- - **Single command interface**: install once, then call `paper-search` from terminal, scripts, or agents.
32
- - **JSON-first output**: stdout is machine-readable JSON by default; stderr is reserved for human-readable diagnostics.
33
- - **Unified paper model**: normalized title, authors, DOI, source, dates, abstract, PDF URL, citation count, and provider-specific metadata where available.
34
- - **Multi-source search with dedupe**: query selected sources with `--sources crossref,openalex,pmc`, or use `platform=all` to try every registered search source, then merge duplicates by DOI and title/author keys.
35
- - **Semantic Scholar body-snippet search**: `search_semantic_snippets` searches Semantic Scholar's Open Access snippet index for body-text snippets, which is useful for finding methodological details. It requires `SEMANTIC_SCHOLAR_API_KEY`.
36
- - **Funnel-style fallback download**: `download_with_fallback` tries native source download, discovered PDF URLs, PMC/Europe PMC/CORE/OpenAIRE, Unpaywall DOI resolution, then Sci-Hub as the final fallback unless `useSciHub=false`.
37
- - **Rate limits and retry logic**: platform-specific rate limiting and retryable API error handling.
38
- - **PDF download support**: download from supported sources such as arXiv, bioRxiv, medRxiv, Semantic Scholar, IACR, Sci-Hub, Springer open access, and Wiley DOI-based access.
39
- - **Agent-friendly commands**: `tools`, `status`, `search`, `download`, and `run` cover both simple use and precise advanced calls.
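The "merge duplicates by DOI and title/author keys" behavior in the feature list above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the package's implementation: the `Paper` shape, `doiKey`, `titleAuthorKey`, and `dedupePapers` are hypothetical names, and the sketch keeps the first record it sees instead of merging fields.

```typescript
// Hypothetical paper shape for illustration; the real unified model has more fields.
interface Paper {
  title: string;
  authors: string[];
  doi?: string;
  source: string;
}

// Normalize a DOI: trim, lowercase, and drop a leading doi.org resolver prefix.
function doiKey(doi: string): string {
  return doi.trim().toLowerCase().replace(/^https?:\/\/(dx\.)?doi\.org\//, "");
}

// Build a loose key from the normalized title and the first author.
function titleAuthorKey(paper: Paper): string {
  const title = paper.title.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
  const firstAuthor = (paper.authors[0] ?? "").toLowerCase().trim();
  return `${title}|${firstAuthor}`;
}

// Keep the first record for each DOI or title/author key (assumption: no field merging).
function dedupePapers(papers: Paper[]): Paper[] {
  const seen = new Set<string>();
  const unique: Paper[] = [];
  for (const paper of papers) {
    const keys = [`ta:${titleAuthorKey(paper)}`];
    if (paper.doi) keys.push(`doi:${doiKey(paper.doi)}`);
    if (keys.some((key) => seen.has(key))) continue; // duplicate under either key
    keys.forEach((key) => seen.add(key));
    unique.push(paper);
  }
  return unique;
}

const dedupedPapers = dedupePapers([
  { title: "Deep Learning", authors: ["Ada Lovelace"], doi: "10.1000/ABC", source: "crossref" },
  { title: "Deep learning.", authors: ["A. Lovelace"], doi: "https://doi.org/10.1000/abc", source: "openalex" },
  { title: "Deep  Learning", authors: ["Ada Lovelace"], source: "pmc" },
  { title: "Graph Kernels", authors: ["Alan Turing"], source: "arxiv" },
]);
console.log(dedupedPapers.map((paper) => paper.source));
```

The second record is dropped because its DOI matches after normalization, and the third because its title/author key matches even though it has no DOI.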
29
+ | Layer | Responsibility |
30
+ | --- | --- |
31
+ | CLI package body | Executes literature search, journal metrics lookup, PDF discovery/download, body snippet search, and stable JSON output |
32
+ | Bundled Skill | Ships `skills/paper-search` with agent routing rules and focused references; it does not store API keys, cookies, or account credentials |
33
+ | Friendly Management Layer | Provides `doctor`, `smoke`, `skills`, `config`, and `tools` around the four main capabilities: `metadata_search`, `journal_metrics`, `pdf_discovery`, and `body_snippet_search`. `doctor` health reports include masked configuration, Capability Profile, platform/source status, and missing or degraded items; `smoke` checks command wiring and live readiness; `skills` syncs the bundled Skill |
40
34
 
41
- ## Quick Start
35
+ The four main capabilities are executed by the CLI package body and reported by the management layer. The Capability Profile also reports `entitled_access` so users can see whether publisher API keys, database keys, TDM tokens, or institutional entitlements are configured. Missing or degraded configuration for one workflow does not make unrelated workflows unavailable.
42
36
 
43
- ### Install
37
+ ## Quick Start
44
38
 
45
39
  Requires Node.js >= 18.0.0 and npm.
46
40
 
47
41
  ```bash
48
42
  npm install -g paper-search-cli
49
43
  paper-search setup
50
- paper-search search "machine learning" --platform crossref --max-results 3 --pretty
44
+ paper-search doctor --pretty
51
45
  ```
52
46
 
53
- Run `paper-search setup` after installation to write optional API keys and emails into the user config.
54
- For the Unpaywall and Crossref email prompts, you can press Enter and the CLI will write a random Gmail-format address automatically; use `paper-search config set` later if you want to replace it with your own email.
55
-
56
- For local development, or to test changes that have not been released yet, install from source:
47
+ Try the four main workflows:
57
48
 
58
49
  ```bash
59
- git clone git@github.com:dr-dumpling/paper-search-cli.git
60
- cd paper-search-cli
61
- npm install
62
- npm run build
63
- npm install -g .
50
+ paper-search search "machine learning clinical prediction" --platform crossref --max-results 3 --pretty
51
+ paper-search journal-metrics "Nature" "BMJ" --pretty
52
+ paper-search download 10.48550/arxiv.1201.0490 --platform arxiv --save-path ./downloads
53
+ paper-search run search_semantic_snippets --arg query="propensity score matching" --arg maxResults=3 --pretty
64
54
  ```
65
55
 
66
- ### Common Checks
56
+ Useful checks:
67
57
 
68
58
  ```bash
69
- paper-search status --pretty
70
59
  paper-search tools --pretty
71
- paper-search config doctor --pretty
60
+ paper-search doctor --format text
61
+ paper-search smoke --mock --pretty
62
+ paper-search skills status --pretty
72
63
  ```
73
64
 
74
65
  ## Supported Platforms
75
66
 
76
- ### Platform Families
67
+ The quick group table helps choose a source family. The capability matrix below is the clearer source of truth for what each platform can actually do.
77
68
 
78
- The table below remains the source-of-truth for capabilities. For choosing a source quickly, use these broad families:
69
+ ### Platform Groups
79
70
 
80
- | Family | Platforms | Best For |
71
+ | Family | Platforms | Main use |
81
72
  | --- | --- | --- |
82
- | General scholarly metadata | Crossref, OpenAlex, Semantic Scholar, Google Scholar | Broad discovery, DOI metadata, citation clues, first-pass literature search |
83
- | Medicine / life sciences | PubMed, PubMed Central, Europe PMC | Clinical, biomedical, public health, biomedical metadata, and open full text |
84
- | Preprints / conference papers | arXiv, bioRxiv, medRxiv, OpenReview, IACR ePrint | Cross-disciplinary preprints, life-science/medical preprints, AI/ML submissions, and cryptography ePrints |
85
- | Computer science / engineering | DBLP, ACM Digital Library metadata, IEEE Xplore, USENIX | CS bibliography, engineering databases, systems/security proceedings |
86
- | Open full text / repositories | CORE, OpenAIRE, Unpaywall | Cross-disciplinary repository discovery and open-access PDF fallback routes |
87
- | Citation indexes / publishers | Web of Science, Scopus, ScienceDirect, Springer Nature/SpringerLink, Wiley | Institution-backed metadata, citation databases, publisher-specific records and downloads |
88
- | DOI-targeted lookup | Sci-Hub | DOI-based retrieval and the final automatic PDF fallback unless `useSciHub=false` |
89
-
90
- Some platforms belong to more than one practical workflow. For example, Semantic Scholar is useful for broad discovery and CS/AI, while arXiv covers CS, math, physics, and quantitative fields. These groups reflect the primary way a platform is used; CS searches often combine "computer science / engineering" with "preprints / conference papers."
73
+ | General scholarly metadata | Crossref, OpenAlex, Semantic Scholar, Google Scholar | Broad discovery, DOI metadata, citation clues, first-pass screening |
74
+ | Journal metrics | EasyScholar | Impact factor, JCR/SSCI quartile, CAS zone, JCI, ESI, warning flags |
75
+ | Biomedical and life sciences | PubMed, PubMed Central, Europe PMC | Biomedical metadata, PMID/PMCID verification, open full text |
76
+ | Preprints and conference papers | arXiv, bioRxiv, medRxiv, OpenReview, IACR ePrint | Preprints, AI/ML submissions, cryptography ePrints |
77
+ | Computer science and engineering | DBLP, ACM metadata, IEEE Xplore, USENIX | CS bibliography, engineering metadata, conference proceedings |
78
+ | Open access and repositories | CORE, OpenAIRE, Unpaywall | Repository discovery and open-access PDF fallback |
79
+ | Citation indexes and publishers | Web of Science, Scopus, ScienceDirect, Springer Nature/SpringerLink, Wiley | Institution-backed metadata, publisher records, entitled access |
80
+ | DOI-targeted fallback | Sci-Hub | DOI-based PDF fallback when enabled |
91
81
 
92
82
  ### Capability Matrix
93
83
 
94
84
  #### General Scholarly Metadata
95
85
 
96
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
86
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
97
87
  | --- | --- | --- | --- | --- | --- | --- |
98
- | Crossref | ✅ | ❌ | ❌ | ✅ | ❌ | Default search platform, broad metadata coverage |
99
- | OpenAlex | ✅ | 🟡 Conditional | ❌ | ✅ | ❌ | Broad free metadata; can feed fallback downloads when records include OA links |
100
- | Semantic Scholar | ✅ | | ✅ Body snippets | ✅ | 🟡 Optional* | AI semantic search + OA body snippets |
101
- | Google Scholar | ✅ | ❌ | ❌ | ✅ | ❌ | Broad academic discovery, scrape-based |
88
+ | Crossref | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ❌ None | Default broad metadata source |
89
+ | OpenAlex | ✅ Yes | 🟡 Conditional | ❌ No | ✅ Yes | ❌ None | Free metadata; OA links can help PDF fallback |
90
+ | Semantic Scholar | ✅ Yes | 🟡 Conditional | ✅ Body snippets | ✅ Yes | 🟡 Optional; snippets require `SEMANTIC_SCHOLAR_API_KEY` | Good for AI/CS and body snippet clues |
91
+ | Google Scholar | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ❌ None | Broad discovery through page parsing |
102
92
 
103
- #### Medicine / Life Sciences
93
+ #### Journal Metrics
104
94
 
105
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
95
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
106
96
  | --- | --- | --- | --- | --- | --- | --- |
107
- | PubMed | ✅ | ❌ | ❌ | ❌ | 🟡 Optional | Biomedical literature through NCBI E-utilities |
108
- | PubMed Central | ✅ | ✅ | ✅ | ❌ | ❌ | Open biomedical full text and PMC PDFs |
109
- | Europe PMC | ✅ | ✅ | ✅ | ❌ | ❌ | Biomedical metadata plus open full-text links |
97
+ | EasyScholar | 🟡 Journal metrics only | ❌ No | ❌ No | ❌ No | ✅ Required `EASYSCHOLAR_KEY` | Impact factor, JCR/SSCI quartile, CAS zone, JCI, ESI, warnings, rank fields |
110
98
 
111
- #### Computer Science / Engineering
99
+ #### Biomedical and Life Sciences
112
100
 
113
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
101
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
114
102
  | --- | --- | --- | --- | --- | --- | --- |
115
- | DBLP | ✅ | ❌ | ❌ | ❌ | ❌ | Computer science bibliography through the official DBLP search API |
116
- | ACM Digital Library | ✅ | ❌ | ❌ | ✅ | ❌ | ACM DOI-prefix metadata through Crossref; no ACM scraping |
117
- | USENIX | ✅ | ❌ | ❌ | ❌ | ❌ | DBLP-backed USENIX proceedings metadata; no USENIX search-page scraping |
118
- | IEEE Xplore | ✅ | ❌ | ❌ | ✅ | ✅ Required | IEEE metadata through the official IEEE Xplore Metadata API |
103
+ | PubMed | ✅ Yes | ❌ No | ❌ No | ❌ No | 🟡 Optional `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | NCBI E-utilities biomedical metadata |
104
+ | PubMed Central | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Open biomedical full text and PMC PDFs |
105
+ | Europe PMC | ✅ Yes | 🟡 Conditional | 🟡 Conditional | ❌ No | ❌ None | Biomedical metadata and open full-text links |
119
106
 
120
- #### Preprints / Conference Papers
107
+ #### Preprints and Conference Papers
121
108
 
122
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
109
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
123
110
  | --- | --- | --- | --- | --- | --- | --- |
124
- | arXiv | ✅ | ✅ | ✅ | ❌ | ❌ | Physics, CS, math, and related preprints |
125
- | bioRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Biology preprints |
126
- | medRxiv | ✅ | ✅ | ✅ | ❌ | ❌ | Medical preprints |
127
- | OpenReview | ✅ | ❌ | ❌ | ❌ | ❌ | Conference submissions, reviews, and preprints through public OpenReview notes search |
128
- | IACR ePrint | ✅ | ✅ | ✅ | ❌ | ❌ | Cryptography papers |
111
+ | arXiv | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Physics, CS, math, quantitative preprints |
112
+ | bioRxiv | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Biology preprints |
113
+ | medRxiv | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Medical preprints |
114
+ | OpenReview | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ None | Public OpenReview notes, reviews, and submissions |
115
+ | IACR ePrint | ✅ Yes | ✅ Yes | ✅ Yes | ❌ No | ❌ None | Cryptography ePrint papers |
129
116
 
130
- #### Open Full Text / Repositories
117
+ #### Computer Science and Engineering
131
118
 
132
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
119
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
133
120
  | --- | --- | --- | --- | --- | --- | --- |
134
- | CORE | ✅ | 🟡 Conditional | 🟡 Conditional | ❌ | 🟡 Optional | Downloads work when records include PDF or full-text links |
135
- | OpenAIRE | ✅ | 🟡 Conditional | ❌ | ❌ | 🟡 Optional | Can feed fallback downloads when records include open links |
136
- | Unpaywall | 🟡 Conditional | 🟡 Conditional | ❌ | ❌ | ✅ Required | DOI-only lookup; requires an email; downloads work when an OA PDF is found |
121
+ | DBLP | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ None | Official DBLP computer-science bibliography |
122
+ | ACM metadata | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ❌ None | Uses Crossref ACM DOI-prefix metadata; does not scrape ACM pages |
123
+ | USENIX | ✅ Yes | ❌ No | ❌ No | ❌ No | ❌ None | Uses DBLP-backed USENIX metadata; does not scrape USENIX search pages |
124
+ | IEEE Xplore | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ✅ Required `IEEE_API_KEY` | Official IEEE Xplore Metadata API |
137
125
 
138
- #### Citation Indexes / Publishers
126
+ #### Open Access and Repositories
139
127
 
140
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
128
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
141
129
  | --- | --- | --- | --- | --- | --- | --- |
142
- | Web of Science | ✅ | ❌ | ❌ | ✅ | ✅ Required | Citation database, date sorting, year ranges |
143
- | ScienceDirect | ✅ | | ❌ | | Required | Elsevier metadata and abstracts |
144
- | Springer Nature / SpringerLink | ✅ | 🟡 Conditional | ❌ | ❌ | ✅ Required | `springerlink` is an alias for the existing Springer Nature integration |
145
- | Wiley | ❌ Keyword search | ✅ | ✅ | ❌ | ✅ Required | TDM API, DOI-based PDF download only |
146
- | Scopus | ✅ | ❌ | ❌ | ✅ | ✅ Required | Abstract and citation database |
130
+ | CORE | ✅ Yes | 🟡 Conditional | 🟡 Conditional | ❌ No | 🟡 Optional `CORE_API_KEY` | Repository records may expose PDF or full-text links |
131
+ | OpenAIRE | ✅ Yes | 🟡 Conditional | ❌ No | ❌ No | 🟡 Optional `OPENAIRE_API_KEY` | Public search usually works without a key |
132
+ | Unpaywall | 🟡 DOI lookup only | 🟡 Conditional | ❌ No | ❌ No | ✅ Required `UNPAYWALL_EMAIL` or `PAPER_SEARCH_UNPAYWALL_EMAIL` | Finds DOI-based open-access PDF locations; email, not API key |
147
133
 
148
- #### DOI-Targeted Lookup
134
+ #### Citation Indexes and Publishers
149
135
 
150
- | Platform | Search | Download | Full Text | Citations | API Key | Special Features |
136
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
151
137
  | --- | --- | --- | --- | --- | --- | --- |
152
- | Sci-Hub | | ✅ | ❌ | ❌ | | DOI-based paper lookup and PDF retrieval |
138
+ | Web of Science | ✅ Yes | ❌ No | ❌ No | ✅ Yes | ✅ Required `WOS_API_KEY` | Citation database metadata, date sorting, year ranges |
139
+ | ScienceDirect | ✅ Yes | 🟡 Conditional | ❌ No | ✅ Yes | ✅ Required `ELSEVIER_API_KEY` | Elsevier metadata; product permission is separate from Scopus |
140
+ | Springer Nature / SpringerLink | ✅ Yes | 🟡 Conditional | ❌ No | ❌ No | ✅ Required `SPRINGER_API_KEY`; 🟡 Optional `SPRINGER_OPENACCESS_API_KEY` | `springerlink` is an alias for the Springer integration |
141
+ | Wiley | ❌ No keyword search | ✅ DOI download | ✅ Yes | ❌ No | ✅ Required `WILEY_TDM_TOKEN` | TDM API; use a DOI found from another metadata source |
142
+ | Scopus | ✅ Yes | 🟡 Conditional metadata | ❌ No | ✅ Yes | ✅ Required `ELSEVIER_API_KEY` | Abstract and citation database; product permission is separate from ScienceDirect |
143
+
144
+ #### DOI-Targeted Fallback
145
+
146
+ | Platform | Metadata search | PDF path | Body/full text | Citation signal | Config | Notes |
147
+ | --- | --- | --- | --- | --- | --- | --- |
148
+ | Sci-Hub | ❌ No | ✅ Yes | ❌ No | ❌ No | ❌ None | DOI/URL-targeted lookup and final PDF fallback when enabled |
153
149
 
154
150
  Notes:
155
151
 
156
- - In capability columns, `✅` means directly supported, `❌` means unsupported, and `🟡 Conditional` means support depends on record content or provider constraints, such as DOI-only lookup, available PDF/OA links, or open-access-only downloads.
157
- - In the API Key column, `❌` means no configuration is needed, `🟡 Optional` means configuration improves limits or stability, and `✅ Required` means the key is required only when you use that platform, not that every new installation should configure it. Unpaywall requires an email rather than a traditional API key.
158
- - Wiley does not support keyword search through the Wiley TDM API. Use `search_crossref` to find Wiley articles and then use `download_paper` with `platform=wiley` and the DOI.
159
- - ACM and USENIX search intentionally use metadata-backed routes rather than crawling provider search pages, which keeps the integration compatible with robots.txt and reduces IP-blocking risk.
160
- - `platform=all` tries every registered search source except DOI-download-only providers such as Wiley. Sources without configured credentials, sources that time out, and sources that fail are recorded in `failed_sources` / `errors` while the remaining sources continue.
161
- - `--sources` accepts a comma-separated source list, for example `--sources crossref,openalex,pmc`.
162
- - `🟡 Optional*` for Semantic Scholar means optional for regular search; `search_semantic_snippets` body-snippet search requires `SEMANTIC_SCHOLAR_API_KEY`.
152
+ - `Metadata search` means finding and screening papers; it is not the same as PDF download or body evidence.
153
+ - `pdf_discovery` separates open-access sources, configured entitlement sources, and Sci-Hub as a separately identified final fallback.
154
+ - EasyScholar is a journal metrics source, not a paper search source.
155
+ - Sci-Hub is not part of `metadata_search`; it is DOI/URL-targeted PDF fallback.
156
+ - `🟡 Conditional` means the platform can help only when the record exposes a DOI, open-access link, PDF URL, or configured entitlement.
157
+ - API keys are only required when you use the corresponding key-backed source or workflow.
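The route separation described in the notes above (open-access sources, configured entitlement sources, and Sci-Hub as a final fallback only when enabled) can be sketched as an ordered walk over route kinds. Everything named here is a hypothetical illustration, not the package's API: `RouteKind`, `Route`, and `downloadWithFallbackSketch` are invented for the example, and only the `useSciHub` flag name comes from the README text.

```typescript
// Hypothetical route kinds, ordered from most preferred to final fallback.
type RouteKind = "native" | "open_access" | "entitled" | "sci_hub";

interface Route {
  kind: RouteKind;
  // Returns a saved file path on success, or null when this route has no PDF.
  attempt: (doi: string) => string | null;
}

interface FallbackResult {
  path: string | null;
  usedRoute: RouteKind | null;
  tried: RouteKind[];
}

// Walk the route kinds in order and stop at the first route that yields a PDF.
function downloadWithFallbackSketch(doi: string, routes: Route[], useSciHub: boolean): FallbackResult {
  const order: RouteKind[] = ["native", "open_access", "entitled", "sci_hub"];
  const tried: RouteKind[] = [];
  for (const kind of order) {
    if (kind === "sci_hub" && !useSciHub) continue; // fallback disabled by the caller
    for (const route of routes.filter((candidate) => candidate.kind === kind)) {
      tried.push(kind);
      const path = route.attempt(doi);
      if (path !== null) return { path, usedRoute: kind, tried };
    }
  }
  return { path: null, usedRoute: null, tried };
}

// Stub routes: only the Sci-Hub stub "finds" a PDF in this example.
const exampleRoutes: Route[] = [
  { kind: "native", attempt: () => null },
  { kind: "open_access", attempt: () => null },
  { kind: "sci_hub", attempt: () => "./downloads/paper.pdf" },
];

const withSciHub = downloadWithFallbackSketch("10.1000/example", exampleRoutes, true);
const withoutSciHub = downloadWithFallbackSketch("10.1000/example", exampleRoutes, false);
console.log(withSciHub.usedRoute, withoutSciHub.path);
```

Recording `tried` alongside the result keeps the final fallback separately identifiable in the output, which is the property the notes emphasize.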
163
158
 
164
159
  ## Configuration
165
160
 
166
- Most free metadata sources work without configuration. For API keys and emails, prefer the user-level config file so the CLI works from any directory:
161
+ Most free metadata sources work without configuration. For stable agent workflows, run setup once and store credentials in the user config file:
167
162
 
168
163
  ```bash
169
164
  paper-search setup
170
- paper-search config set SEMANTIC_SCHOLAR_API_KEY your_semantic_scholar_api_key_here
171
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com # optional: replace the setup-generated email
172
165
  paper-search config list --pretty
173
- paper-search config doctor --pretty
174
- paper-search diagnostics --pretty
166
+ paper-search doctor --pretty
175
167
  ```
176
168
 
177
- The default config path is:
169
+ Default config path:
178
170
 
179
171
  ```text
180
172
  ~/.config/paper-search-cli/config.json
181
173
  ```
182
174
 
183
- The file is written with `0600` permissions. `config list` and `config doctor` mask secrets.
184
-
185
- `paper-search setup` is the guided setup command. By default it asks for the recommended credentials only: Semantic Scholar, Unpaywall email, Crossref email, and CORE. Use `paper-search setup --all` to walk through every supported configuration key, or `paper-search setup --keys SEMANTIC_SCHOLAR_API_KEY,CORE_API_KEY` to configure a specific subset.
186
-
187
- To reduce first-run friction, if `PAPER_SEARCH_UNPAYWALL_EMAIL` / `UNPAYWALL_EMAIL` / `CROSSREF_MAILTO` are not configured, pressing Enter during setup writes a random Gmail-format address such as `paper.search.xxxxxx@gmail.com`, so basic Unpaywall and Crossref requests can run immediately.
188
-
189
- `paper-search diagnostics --pretty` lists every API-key or email-backed capability, the related config keys, whether the required keys are configured, common failure modes, and suggested next checks. Search commands also add a `diagnostic` field when a key-backed platform returns zero results or an auth/permission/rate-limit error.
175
+ The config file is written with `0600` permissions. `config list`, `doctor`, and related commands mask secrets.
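As a rough illustration of what masking secrets in command output might look like, the sketch below hides all but the last four characters of any value whose key name contains `KEY` or `TOKEN`. The masking format, the key-name heuristic, and the helper names `maskSecret` and `maskConfig` are assumptions made for this example; the README does not specify how the CLI masks values.

```typescript
// Hypothetical masking rule: keep only the last few characters visible.
function maskSecret(value: string, visibleTail = 4): string {
  if (value.length <= visibleTail) return "*".repeat(value.length); // too short to reveal anything
  return "*".repeat(value.length - visibleTail) + value.slice(-visibleTail);
}

// Mask every entry whose key name looks like a credential (assumed heuristic).
function maskConfig(config: Record<string, string>): Record<string, string> {
  const masked: Record<string, string> = {};
  for (const [key, value] of Object.entries(config)) {
    masked[key] = /KEY|TOKEN/.test(key) ? maskSecret(value) : value;
  }
  return masked;
}

const maskedExample = maskConfig({
  SEMANTIC_SCHOLAR_API_KEY: "s2-abcdef123456", // placeholder value, not a real key
  NCBI_TOOL: "paper-search-cli",
});
console.log(maskedExample);
```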
190
176
 
191
- ### API Key Recommendation
177
+ ### API Key Tiers
192
178
 
193
- `paper-search setup` asks only for the credentials that are most useful for ordinary new users. `✅ Required` in the platform table means "required for that platform", not "recommended for every installation".
194
-
195
- | Level | Config keys | Recommended for new users | Notes |
179
+ | Tier | Keys | Used for | When to configure |
196
180
  | --- | --- | --- | --- |
197
- | Default recommended | `SEMANTIC_SCHOLAR_API_KEY` | Yes | Enables Semantic Scholar body-snippet search for methodology details and improves request stability. |
198
- | Default recommended | `PAPER_SEARCH_UNPAYWALL_EMAIL` or `UNPAYWALL_EMAIL` | Yes | Finds open-access PDFs from DOI records; this only needs an email, not an API key. Press Enter in `setup` to generate a random Gmail-format email, or replace it manually. |
199
- | Default recommended | `CROSSREF_MAILTO` | Yes | Puts Crossref requests in the polite pool, which is better for long-running or frequent searches. Press Enter in `setup` to reuse the generated email, or replace it manually. |
200
- | Default recommended | `CORE_API_KEY` or `PAPER_SEARCH_CORE_API_KEY` | Yes | CORE anonymous access is often rate-limited; a key makes open repository search more reliable. |
201
- | Biomedical-heavy use | `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | Recommended if you use PubMed heavily | Raises NCBI E-utilities limits and identifies the client. |
202
- | Institution entitlement | `WOS_API_KEY` | Configure only with Web of Science API access | Enables Web of Science search and citation data; requires Clarivate API entitlement. |
203
- | Institution entitlement | `IEEE_API_KEY` | Configure only with IEEE Xplore API access | Enables IEEE Xplore metadata search; IEEE may require registered API access and product entitlement. |
204
- | Institution entitlement | `ELSEVIER_API_KEY` | Configure only with Scopus or ScienceDirect API access | One Elsevier key does not automatically grant both products; Scopus and ScienceDirect need separate entitlements. |
205
- | Institution entitlement | `SPRINGER_API_KEY`, `SPRINGER_OPENACCESS_API_KEY` | Configure only when you need Springer | Used for Springer metadata and open-access records; 401 usually means an invalid key or missing product access. |
206
- | Institution entitlement | `WILEY_TDM_TOKEN` | Configure only with Wiley TDM/institutional full-text access | DOI-based download only; availability depends on the token and institutional subscription. |
207
- | Usually unnecessary | `PAPER_SEARCH_OPENAIRE_API_KEY` or `OPENAIRE_API_KEY` | Not recommended by default | OpenAIRE public search usually works without a key; configure only for account or quota requirements. |
208
-
209
- You can also import an existing `.env`:
210
-
211
- ```bash
212
- paper-search config import-env .env --pretty
213
- ```
214
-
215
- Config priority is:
216
-
217
- 1. Shell environment variables.
218
- 2. Current working directory `.env`.
219
- 3. User config file.
220
- 4. Built-in defaults for free sources.
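The priority order listed above can be read as a layered merge in which higher-priority layers overwrite lower ones. The following is a minimal sketch assuming each layer is a flat key/value map; `ConfigLayer` and `resolveConfig` are hypothetical names, and treating empty strings as unset is an assumption of this example, not documented behavior.

```typescript
// Hypothetical flat config layer; undefined or empty values count as "not set".
type ConfigLayer = Record<string, string | undefined>;

function resolveConfig(
  shellEnv: ConfigLayer,
  cwdDotEnv: ConfigLayer,
  userConfig: ConfigLayer,
  defaults: ConfigLayer,
): Record<string, string> {
  // Apply the lowest priority first so that later layers overwrite earlier ones.
  const layers = [defaults, userConfig, cwdDotEnv, shellEnv];
  const resolved: Record<string, string> = {};
  for (const layer of layers) {
    for (const [key, value] of Object.entries(layer)) {
      if (value !== undefined && value !== "") {
        resolved[key] = value;
      }
    }
  }
  return resolved;
}

const resolvedExample = resolveConfig(
  { CORE_API_KEY: "from-shell" },
  { CORE_API_KEY: "from-dotenv", CROSSREF_MAILTO: "dotenv@example.com" },
  { CROSSREF_MAILTO: "user@example.com", NCBI_TOOL: "paper-search-cli" },
  {},
);
console.log(resolvedExample);
```

In this example the shell value wins for `CORE_API_KEY`, the working-directory `.env` wins over the user config for `CROSSREF_MAILTO`, and `NCBI_TOOL` falls through to the user config.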
221
-
222
- For repo-local development, copying `.env.example` still works:
223
-
224
- ```bash
225
- cp .env.example .env
226
- ```
227
-
228
- ### Environment Variables
229
-
230
- ```bash
231
- # Web of Science, required for Web of Science search
232
- WOS_API_KEY=your_web_of_science_api_key_here
233
- WOS_API_VERSION=v1
234
-
235
- # IEEE Xplore, required for IEEE metadata search
236
- IEEE_API_KEY=your_ieee_api_key_here
237
-
238
- # PubMed, optional; increases rate limit from 3 requests/sec to 10 requests/sec
239
- PUBMED_API_KEY=your_ncbi_api_key_here
240
- NCBI_EMAIL=you@example.com
241
- NCBI_TOOL=paper-search-cli
242
-
243
- # Semantic Scholar, required for body-snippet search and useful for higher request limits
244
- SEMANTIC_SCHOLAR_API_KEY=your_semantic_scholar_api_key_here
245
-
246
- # Elsevier, required for Scopus and ScienceDirect; each product still needs separate entitlement
247
- ELSEVIER_API_KEY=your_elsevier_api_key_here
248
-
249
- # Springer Nature, required for Springer search and open access download
250
- SPRINGER_API_KEY=your_springer_api_key_here
251
- SPRINGER_OPENACCESS_API_KEY=your_openaccess_api_key_here
252
-
253
- # Wiley TDM, required for Wiley DOI-based PDF download
254
- WILEY_TDM_TOKEN=your_wiley_tdm_token_here
255
-
256
- # Crossref polite pool, optional but recommended; setup can auto-generate/reuse a random Gmail-format email
257
- CROSSREF_MAILTO=you@example.com
258
-
259
- # Unpaywall, required for DOI-based OA resolution; setup can auto-generate a random Gmail-format email
260
- PAPER_SEARCH_UNPAYWALL_EMAIL=you@example.com
261
- UNPAYWALL_EMAIL=you@example.com
262
-
263
- # CORE, optional but recommended; anonymous access is often heavily rate-limited
264
- PAPER_SEARCH_CORE_API_KEY=your_core_api_key_here
265
- CORE_API_KEY=your_core_api_key_here
266
-
267
- # OpenAIRE, optional; public search works without a key
268
- PAPER_SEARCH_OPENAIRE_API_KEY=your_openaire_api_key_here
269
- OPENAIRE_API_KEY=your_openaire_api_key_here
270
- ```
271
-
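The PubMed comment in the block above states the documented effect of a key: 3 requests per second without one, 10 with one. A client-side throttle that respects those two rates could look roughly like the sketch below. The names `pubmed_min_interval` and `Throttle` are hypothetical, and nothing here is taken from the package's own request code.

```python
from typing import Optional


def pubmed_min_interval(api_key: Optional[str]) -> float:
    """Minimum seconds between requests: 10 req/s with a key, 3 req/s without."""
    requests_per_second = 10 if api_key else 3
    return 1.0 / requests_per_second


class Throttle:
    """Tracks when the next request may be sent; the caller supplies the clock."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._next_allowed = 0.0

    def wait_time(self, now: float) -> float:
        """Return how long to sleep before sending, and reserve the next slot."""
        delay = max(0.0, self._next_allowed - now)
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


anonymous = Throttle(pubmed_min_interval(None))
for attempt in range(3):
    # Three requests issued at the same instant get spaced out by the throttle.
    print(attempt, round(anonymous.wait_time(now=0.0), 3))
```

Passing `now` in explicitly keeps the sketch deterministic; a real caller would pass `time.monotonic()` and sleep for the returned delay.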
272
- ### API Key Sources
273
-
274
- - Web of Science: [Clarivate Developer Portal](https://developer.clarivate.com/apis)
275
- - IEEE Xplore: [IEEE Xplore Metadata API](https://developer.ieee.org/docs/read/Searching_the_IEEE_Xplore_Metadata_API)
276
- - PubMed: [NCBI API Keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/)
277
- - Semantic Scholar: [Semantic Scholar API](https://www.semanticscholar.org/product/api)
278
- - Elsevier: [Elsevier Developer Portal](https://dev.elsevier.com/apikey/manage)
279
- - Springer Nature: [Springer Nature Developers](https://dev.springernature.com/)
280
- - Wiley TDM: [Wiley Text and Data Mining](https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining)
281
- - Unpaywall: [Unpaywall Data Format and API](https://unpaywall.org/products/api)
282
- - CORE: [CORE API](https://core.ac.uk/services/api)
283
- - OpenAIRE: [OpenAIRE APIs](https://develop.openaire.eu/)
284
-
285
- `.env` is ignored by git. Do not commit API keys or tokens.
181
+ | Recommended for most users | `SEMANTIC_SCHOLAR_API_KEY` | Body snippet search and more stable Semantic Scholar requests | Configure if you use methods/detail searches or high-frequency Semantic Scholar lookup |
182
+ | Recommended for most users | `UNPAYWALL_EMAIL` or `PAPER_SEARCH_UNPAYWALL_EMAIL` | DOI-based open-access PDF resolution | Configure during setup; an email is required, not an API key |
183
+ | Recommended for most users | `CROSSREF_MAILTO` | Crossref polite pool | Configure for long-running or frequent metadata search |
184
+ | Recommended for most users | `CORE_API_KEY` | CORE repository search | Configure if you rely on CORE or hit anonymous rate limits |
185
+ | Journal metrics | `EASYSCHOLAR_KEY` | EasyScholar impact factor, JCR/SSCI, CAS, JCI, ESI, warning flags | Configure if you need journal metrics; use `paper-search setup EASYSCHOLAR_KEY` for hidden input |
186
+ | Biomedical-heavy use | `PUBMED_API_KEY`, `NCBI_EMAIL`, `NCBI_TOOL` | NCBI E-utilities stability and higher limits | Configure if PubMed is a frequent source |
187
+ | Institutional or publisher access | `WOS_API_KEY`, `IEEE_API_KEY`, `ELSEVIER_API_KEY`, `SPRINGER_API_KEY`, `SPRINGER_OPENACCESS_API_KEY`, `WILEY_TDM_TOKEN` | Web of Science, IEEE, Scopus, ScienceDirect, Springer, Wiley metadata or entitled access | Configure only when you have the relevant API or institutional permission |
188
+ | Usually optional | `OPENAIRE_API_KEY` | OpenAIRE account/quota use | Usually unnecessary for public search |
189
+
190
+ Useful key dashboards:
191
+
192
+ | Service | Link |
193
+ | --- | --- |
194
+ | EasyScholar | [EasyScholar Open API](https://www.easyscholar.cc/console/user/open) |
195
+ | Semantic Scholar | [Semantic Scholar API](https://www.semanticscholar.org/product/api) |
196
+ | Unpaywall | [Unpaywall API](https://unpaywall.org/products/api) |
197
+ | CORE | [CORE API](https://core.ac.uk/services/api) |
198
+ | PubMed | [NCBI API Keys](https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/) |
199
+ | Web of Science | [Clarivate Developer Portal](https://developer.clarivate.com/apis) |
200
+ | IEEE Xplore | [IEEE Xplore Metadata API](https://developer.ieee.org/docs/read/Searching_the_IEEE_Xplore_Metadata_API) |
201
+ | Elsevier | [Elsevier Developer Portal](https://dev.elsevier.com/apikey/manage) |
202
+ | Springer Nature | [Springer Nature Developers](https://dev.springernature.com/) |
203
+ | Wiley TDM | [Wiley Text and Data Mining](https://onlinelibrary.wiley.com/library-info/resources/text-and-datamining) |
204
+ | OpenAIRE | [OpenAIRE APIs](https://develop.openaire.eu/) |
286
205
 
287
206
  ## Agent Skill
288
207
 
289
- This repository includes an optional agent skill at `skills/paper-search/SKILL.md`. Install it into your agent's skill directory if your agent supports skills.
290
-
291
- For example:
208
+ The npm package ships a bundled agent Skill at `skills/paper-search/SKILL.md`. Terminal users can use the CLI directly; AI agent workflows should install or sync the Skill so the agent can route the four main workflows correctly.
292
209
 
293
210
  ```bash
294
- mkdir -p ~/.agents/skills/paper-search
295
- cp skills/paper-search/SKILL.md ~/.agents/skills/paper-search/SKILL.md
211
+ paper-search setup --install-skills agents
212
+ paper-search skills status --pretty
213
+ paper-search skills diff --targets agents --format text
214
+ paper-search skills update --targets agents --pretty
296
215
  ```
297
216
 
298
- The skill only teaches the agent how to call the `paper-search` CLI. API keys are still configured through `paper-search setup`, `paper-search config`, `.env`, or shell environment variables. Do not store secrets in the skill file.
217
+ Supported targets include `agents`, `codex`, `claude`, `cursor`, `gemini`, `antigravity`, and `all`. Skill updates overwrite package-managed Skill files while preserving extra files in the installed Skill directory.
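The update rule described above (overwrite package-managed files, keep anything else in the installed directory) can be illustrated with a small sketch. This is an assumption about the general shape of such a sync, not the CLI's implementation; `sync_skill` and the extra file `my-notes.md` are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Dict


def sync_skill(managed_files: Dict[str, str], target_dir: Path) -> None:
    """Overwrite package-managed files; leave any other files untouched."""
    target_dir.mkdir(parents=True, exist_ok=True)
    for relative_name, content in managed_files.items():
        destination = target_dir / relative_name
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_text(content, encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "paper-search"
    skill_dir.mkdir()
    # An outdated managed file plus a user-added file (hypothetical).
    (skill_dir / "SKILL.md").write_text("old skill text", encoding="utf-8")
    (skill_dir / "my-notes.md").write_text("local notes", encoding="utf-8")

    sync_skill({"SKILL.md": "new skill text"}, skill_dir)

    updated = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    preserved = (skill_dir / "my-notes.md").read_text(encoding="utf-8")
    print(updated, "|", preserved)
```

Because the sync only writes the files it knows about and never clears the directory, extra files survive without any special handling.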
299
218
 
300
- ## Output Contract
301
-
302
- By default, every command writes JSON to stdout.
303
-
304
- ```json
305
- {
306
-   "ok": true,
307
-   "tool": "search_papers",
308
-   "message": "Found 1 papers.",
309
-   "data": []
310
- }
311
- ```
312
-
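A script consuming this output might first load the envelope into a typed structure and check `ok` before touching `data`. The sketch below assumes only the four fields shown in the 0.1.3 example above; the `Envelope` class and `parse_envelope` function are hypothetical, and later versions may add or rename fields.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class Envelope:
    ok: bool
    tool: str
    message: str
    data: Any


def parse_envelope(stdout_text: str) -> Envelope:
    """Parse one JSON document from stdout into an Envelope."""
    payload = json.loads(stdout_text)
    return Envelope(
        ok=bool(payload.get("ok", False)),
        tool=str(payload.get("tool", "")),
        message=str(payload.get("message", "")),
        data=payload.get("data"),
    )


# Sample text copied from the documented example, not captured from a real run.
sample = '{"ok": true, "tool": "search_papers", "message": "Found 1 papers.", "data": []}'
result = parse_envelope(sample)
if result.ok:
    print(result.tool, "->", result.message)
```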
313
- Use `--pretty` for formatted JSON:
314
-
315
- ```bash
316
- paper-search search "machine learning" --platform crossref --max-results 1 --pretty
317
- ```
318
-
319
- Use `--format text` if you need the raw text response:
320
-
321
- ```bash
322
- paper-search search "machine learning" --platform crossref --max-results 1 --format text
323
- ```
324
-
325
- Use `--include-text` to keep the raw response text alongside parsed JSON:
326
-
327
- ```bash
328
- paper-search run search_crossref --arg query="machine learning" --arg maxResults=3 --include-text --pretty
329
- ```
219
+ The Skill only teaches agents how to call the `paper-search` CLI. API keys still belong in `paper-search setup`, `paper-search config`, `.env`, or shell environment variables.
330
220
 
331
221
  ## Commands
332
222
 
333
- ### `paper-search search`
334
-
335
- Unified search entrypoint.
336
-
337
- ```bash
338
- paper-search search <query> [options]
339
- ```
340
-
341
- Examples:
342
-
343
- ```bash
344
- paper-search search "machine learning" --platform crossref --max-results 10 --pretty
345
- paper-search search "machine learning" --sources crossref,openalex --max-results 2 --pretty
346
- paper-search search "cancer immunotherapy" --platform all --max-results 2 --pretty
347
- paper-search search "transformer neural networks" --platform arxiv --category cs.AI --year 2023 --pretty
348
- paper-search search "COVID-19 vaccine efficacy" --platform pubmed --max-results 20 --year 2023 --pretty
349
- paper-search search "CRISPR gene editing" --platform webofscience --journal Nature --max-results 15 --pretty
350
- ```
351
-
352
- Common options:
353
-
354
- | Option | Description |
223
+ | Command | Purpose |
355
224
  | --- | --- |
356
- | `--platform` | Source platform. Default: `crossref` |
357
- | `--sources` | Comma-separated source list for multi-source search, e.g. `crossref,openalex,pmc` |
358
- | `--max-results` | Maximum result count |
359
- | `--year` | Year filter, e.g. `2024`, `2020-2024`, `2020-` |
360
- | `--author` | Author name filter |
361
- | `--journal` | Journal name filter |
362
- | `--category` | Category filter, mainly arXiv/bioRxiv/medRxiv |
363
- | `--days` | Days back for bioRxiv/medRxiv |
364
- | `--sort-by` | `relevance`, `date`, or `citations` |
365
- | `--sort-order` | `asc` or `desc` |
366
-
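The `--year` row lists three accepted shapes: a single year, a closed range, and an open-ended range. One plausible reading of those shapes is sketched below. Interpreting `2020-` as "2020 onward" is an assumption, as is the function name `parse_year_filter`; the CLI's own parser may differ, and inputs outside the three documented shapes are not handled.

```python
from typing import Optional, Tuple


def parse_year_filter(text: str) -> Tuple[Optional[int], Optional[int]]:
    """Parse '2024', '2020-2024', or '2020-' into (start, end); None means open."""
    text = text.strip()
    if "-" not in text:
        year = int(text)
        return year, year
    start_text, end_text = text.split("-", 1)
    start = int(start_text) if start_text else None
    end = int(end_text) if end_text else None
    return start, end


for example in ("2024", "2020-2024", "2020-"):
    print(example, "->", parse_year_filter(example))
```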
367
- ### `paper-search run`
225
+ | `paper-search search` | Integrated metadata search |
226
+ | `paper-search journal-metrics` | EasyScholar journal metrics lookup |
227
+ | `paper-search download` | Direct PDF download for a verified paper ID or DOI |
228
+ | `paper-search run` | Precise tool invocation with `--arg` or `--json-args` |
229
+ | `paper-search tools` | Runtime tool names and schemas |
230
+ | `paper-search doctor` | Masked config, Capability Profile, and platform status |
231
+ | `paper-search smoke` | Mock or live self-checks |
232
+ | `paper-search skills` | Bundled Skill status, diff, and update |
233
+ | `paper-search config` | User-level configuration management |
368
234
 
369
- Run a specific internal tool by name. This is the most precise command for agent workflows.
235
+ Full command and tool schema: run `paper-search tools --pretty` or see [`skills/paper-search/references/cli-contract.md`](skills/paper-search/references/cli-contract.md).
370
236
 
371
- ```bash
372
- paper-search run <tool-name> --arg key=value --arg key=value
373
- paper-search run <tool-name> --json-args '{"key":"value"}'
374
- paper-search run <tool-name> --json-args @args.json
375
- ```
237
+ ## Output
376
238
 
377
- Examples:
239
+ Commands return JSON by default. Use `--pretty` for formatted JSON and `--format text` only when you need a human-readable report.
378
240
 
379
241
  ```bash
380
- paper-search run search_crossref --arg query="machine learning" --arg maxResults=5 --pretty
381
- paper-search run search_papers --json-args '{"query":"machine learning","sources":"crossref,openalex","maxResults":2}' --pretty
382
- paper-search run search_pubmed --json-args '{"query":"osteoarthritis","maxResults":5,"sortBy":"date"}' --pretty
383
- paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --pretty
384
- ```
385
-
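When arguments include nested values or awkward quoting, the `--json-args @args.json` form avoids shell escaping altogether. The sketch below writes the arguments to a file and builds the matching argument vector. The helper `build_run_command` is hypothetical, and it assumes the `@` prefix accepts an arbitrary file path, which the examples above do not confirm.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def build_run_command(tool_name: str, args: Dict[str, Any], args_file: Path) -> List[str]:
    """Write `args` as JSON to `args_file` and return the CLI argument vector."""
    args_file.write_text(json.dumps(args), encoding="utf-8")
    return ["paper-search", "run", tool_name, "--json-args", f"@{args_file}", "--pretty"]


with tempfile.TemporaryDirectory() as tmp:
    args_path = Path(tmp) / "args.json"
    tool_args = {"query": "osteoarthritis", "maxResults": 5, "sortBy": "date"}
    command = build_run_command("search_pubmed", tool_args, args_path)
    written = json.loads(args_path.read_text(encoding="utf-8"))
    print(command)
```

Building the command as a list rather than a single string means it can be handed to `subprocess.run` without going through a shell.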
386
- ### `paper-search tools`
387
-
388
- List all available tool names, descriptions, and input schemas.
389
-
390
- ```bash
391
- paper-search tools --pretty
392
- ```
393
-
394
- ### `paper-search status`
395
-
396
- Show platform capabilities and API key status. Secrets are never printed.
397
-
398
- ```bash
399
- paper-search status --pretty
400
- paper-search status --validate --pretty
401
- ```
402
-
403
- `--validate` may make live provider requests. Use it when you intentionally want credential validation.
404
-
405
- ### `paper-search diagnostics`
406
-
407
- Show API-key-backed capabilities and troubleshooting guidance. This does not print secrets.
408
-
409
- ```bash
410
- paper-search diagnostics --pretty
411
- ```
412
-
413
- When a command returns zero results from a configured key-backed source, or fails with 401, 403, 400, or 429, JSON output includes a `diagnostic` field with likely causes and next actions.
414
-
415
- ### `paper-search config`
416
-
417
- Manage the user-level config file.
418
-
419
- ```bash
420
- paper-search config init --pretty
421
- paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key --pretty
422
- paper-search config set PAPER_SEARCH_UNPAYWALL_EMAIL you@example.com --pretty # optional: replace the setup-generated email
423
- paper-search config import-env .env --pretty
424
- paper-search config list --pretty
425
- paper-search config doctor --pretty
426
- paper-search config path --pretty
427
- paper-search config keys --pretty
428
- ```
429
-
430
- ### `paper-search download`
431
-
432
- Download a paper PDF through a platform that supports downloads.
433
-
434
- ```bash
435
- paper-search download <paper-id-or-doi> --platform <platform> [--save-path ./downloads]
436
- ```
437
-
438
- Examples:
439
-
440
- ```bash
441
- paper-search download 2301.00001 --platform arxiv --save-path ./downloads
442
- paper-search download 10.1000/example --platform scihub --save-path ./downloads
443
- paper-search download 10.1111/jtsb.12390 --platform wiley --save-path ./downloads
444
- paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
445
- ```
446
-
447
- ## Tool Reference
448
-
449
- These names can be used with `paper-search run`.
450
-
451
- ### `search_papers`
452
-
453
- Search across the unified dispatcher.
454
-
455
- ```bash
456
- paper-search run search_papers --json-args '{"query":"machine learning","platform":"crossref","maxResults":10,"year":"2023","sortBy":"date"}' --pretty
457
- ```
458
-
459
- Supported platforms:
460
-
461
- ```text
462
- crossref, arxiv, webofscience, wos, pubmed, biorxiv, medrxiv, semantic,
463
- iacr, googlescholar, scholar, scihub, ieee, sciencedirect, springer,
464
- springerlink, scopus, openalex, unpaywall, pmc, europepmc, core,
465
- openaire, dblp, acm, usenix, openreview, all
466
- ```
467
-
468
- For multi-source search, pass `sources`:
469
-
470
- ```bash
471
- paper-search run search_papers --json-args '{"query":"machine learning","sources":"crossref,openalex,pmc","maxResults":2}' --pretty
472
- ```
473
-
474
- ### `search_crossref`
475
-
476
- Search Crossref, the default free metadata source.
477
-
478
- ```bash
479
- paper-search run search_crossref --arg query="machine learning" --arg maxResults=10 --arg year=2023 --arg sortBy=relevance --arg sortOrder=desc --pretty
480
- ```
481
-
482
- ### `search_arxiv`
483
-
484
- Search arXiv preprints.
485
-
486
- ```bash
487
- paper-search run search_arxiv --arg query="transformer neural networks" --arg maxResults=10 --arg category=cs.AI --arg year=2023 --arg sortBy=date --arg sortOrder=desc --pretty
488
- ```
489
-
490
- ### `search_pubmed`
491
-
492
- Search PubMed/MEDLINE biomedical literature.
493
-
494
- ```bash
495
- paper-search run search_pubmed --json-args '{"query":"COVID-19 vaccine efficacy","maxResults":20,"year":"2023","journal":"New England Journal of Medicine","publicationType":["Journal Article","Clinical Trial"],"sortBy":"date"}' --pretty
496
- ```
497
-
498
- ### Open Metadata And Full-Text Sources
499
-
500
- Use these commands for open metadata search, open full-text discovery, and fallback PDF lookup:
501
-
502
- ```bash
503
- paper-search run search_openalex --arg query="machine learning" --arg maxResults=3 --pretty
504
- paper-search run search_unpaywall --arg query="10.48550/arxiv.1201.0490" --pretty
505
- paper-search run search_pmc --arg query="cancer immunotherapy" --arg maxResults=3 --pretty
506
- paper-search run search_europepmc --arg query="cancer genomics" --arg maxResults=3 --pretty
507
- paper-search run search_core --arg query="machine learning" --arg maxResults=3 --pretty
508
- paper-search run search_openaire --arg query="machine learning" --arg maxResults=3 --pretty
509
- ```
510
-
511
- Unpaywall is DOI-only and requires an email. CORE public access may return zero results or rate-limit quickly without an API key.
512
-
513
- ### Registry-Backed Platform Search
514
-
515
- These metadata-oriented tools are generated from the platform registry, so adding later platforms only needs a new searcher plus registry metadata:
516
-
517
- ```bash
518
- paper-search run search_dblp --arg query="graph neural networks" --arg maxResults=5 --pretty
519
- paper-search run search_acm --arg query="software testing" --arg maxResults=5 --pretty
520
- paper-search run search_usenix --arg query="file systems" --arg maxResults=5 --pretty
521
- paper-search run search_openreview --arg query="large language models" --arg maxResults=5 --pretty
522
- paper-search run search_springerlink --arg query="machine learning" --arg maxResults=5 --pretty
523
- ```
524
-
525
- `search_ieee` uses the same generic schema but requires `IEEE_API_KEY`:
526
-
527
- ```bash
528
- paper-search run search_ieee --arg query="wireless networks" --arg maxResults=5 --arg articleTitle="wireless" --pretty
529
- ```
530
-
531
- ### `search_webofscience`
532
-
533
- Search Web of Science. Requires `WOS_API_KEY`.
534
-
535
- ```bash
536
- paper-search run search_webofscience --arg query="CRISPR gene editing" --arg maxResults=15 --arg year=2022 --arg journal=Nature --pretty
537
- ```
538
-
539
- ### `search_google_scholar`
540
-
541
- Search Google Scholar.
542
-
543
- ```bash
544
- paper-search run search_google_scholar --arg query="deep learning" --arg maxResults=10 --arg yearLow=2020 --arg yearHigh=2024 --pretty
545
- ```
546
-
547
- ### `search_biorxiv` and `search_medrxiv`
548
-
549
- Search preprint servers by recent day window and optional category.
550
-
551
- ```bash
552
- paper-search run search_biorxiv --arg query="genomics" --arg maxResults=10 --arg days=30 --pretty
553
- paper-search run search_medrxiv --arg query="epidemiology" --arg maxResults=10 --arg days=60 --pretty
554
- ```
555
-
556
- ### `search_semantic_scholar`
557
-
558
- Search Semantic Scholar with optional field filters.
559
-
560
- ```bash
561
- paper-search run search_semantic_scholar --json-args '{"query":"graph neural networks","maxResults":10,"fieldsOfStudy":["Computer Science"]}' --pretty
562
- ```
563
-
564
- ### `search_semantic_snippets`
565
-
566
- Search Semantic Scholar's Open Access snippet index for body-text snippets that can help locate methodological details. Requires `SEMANTIC_SCHOLAR_API_KEY`.
567
-
568
- ```bash
569
- paper-search run search_semantic_snippets --arg query="CMAverse mediation bootstrap confidence interval" --arg limit=5 --arg fieldsOfStudy=Medicine --pretty
570
- ```
571
-
572
- ### `search_iacr`
573
-
574
- Search IACR ePrint Archive.
575
-
576
- ```bash
577
- paper-search run search_iacr --arg query="zero knowledge proof" --arg maxResults=10 --arg fetchDetails=true --pretty
578
- ```
579
-
580
- ### `search_sciencedirect`
581
-
582
- Search ScienceDirect. Requires `ELSEVIER_API_KEY`.
583
-
584
- ```bash
585
- paper-search run search_sciencedirect --arg query="materials science" --arg maxResults=10 --arg openAccess=true --pretty
586
- ```
587
-
588
- ### `search_scopus`
589
-
590
- Search Scopus. Requires `ELSEVIER_API_KEY`.
591
-
592
- ```bash
593
- paper-search run search_scopus --arg query="citation analysis" --arg maxResults=10 --arg documentType=ar --pretty
594
- ```
595
-
596
- ### `search_springer`
597
-
598
- Search Springer Nature. Requires `SPRINGER_API_KEY`.
599
-
600
- ```bash
601
- paper-search run search_springer --arg query="machine learning" --arg maxResults=10 --arg type=Journal --arg openAccess=true --pretty
602
- ```
603
-
604
- ### `search_scihub`
605
-
606
- Lookup a DOI or article URL through Sci-Hub and optionally download a PDF.
607
-
608
- ```bash
609
- paper-search run search_scihub --arg doiOrUrl="10.1038/nature12373" --arg downloadPdf=false --pretty
610
- paper-search run search_scihub --arg doiOrUrl="10.1038/nature12373" --arg downloadPdf=true --arg savePath=./downloads --pretty
611
- ```
612
-
613
- ### `check_scihub_mirrors`
614
-
615
- Show Sci-Hub mirror health.
616
-
617
- ```bash
618
- paper-search run check_scihub_mirrors --pretty
619
- paper-search run check_scihub_mirrors --arg forceCheck=true --pretty
620
- ```
621
-
622
- ### `get_paper_by_doi`
623
-
624
- Lookup metadata by DOI.
625
-
626
- ```bash
627
- paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform=all --pretty
628
- paper-search run get_paper_by_doi --arg doi="10.1038/nature12373" --arg platform=arxiv --pretty
629
- ```
630
-
631
- ### `download_paper`
632
-
633
- Download PDF files from a platform. If the selected platform has no native downloader, or if native download fails, the command enters the same fallback funnel used by `download_with_fallback`.
634
-
635
- ```bash
636
- paper-search run download_paper --arg paperId="2301.00001" --arg platform=arxiv --arg savePath=./downloads --pretty
637
- ```
638
-
639
- Native download platforms:
640
-
641
- ```text
642
- arxiv, biorxiv, medrxiv, semantic, iacr, scihub, springer, wiley,
643
- pmc, europepmc, core
644
- ```
645
-
646
- Other registered sources, such as `crossref`, `openalex`, `dblp`, `acm`, `usenix`, or `openreview`, can still be passed to `download_paper`; they start directly at the metadata/repository/Unpaywall/Sci-Hub fallback funnel.
647
-
648
- ### `download_with_fallback`
649
-
650
- Try the full download funnel. The order is source-native download, metadata PDF URL, repository discovery, Unpaywall DOI resolution, then Sci-Hub as the final fallback:
651
-
652
- ```bash
653
- paper-search run download_with_fallback --arg source=arxiv --arg paperId=1201.0490 --arg doi=10.48550/arxiv.1201.0490 --arg savePath=./downloads --pretty
654
- paper-search run download_with_fallback --arg source=crossref --arg paperId="10.1038/nature12373" --arg doi="10.1038/nature12373" --arg savePath=./downloads --pretty
655
- ```
656
-
657
- `useSciHub` defaults to `true`; set it to `false` only when you need to suppress that final fallback. `download_paper` also routes failed or unsupported platform downloads through the same funnel.
658
-
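The funnel described above is an ordered chain of attempts that stops at the first success. A rough sketch of that control flow follows. The step names, the `run_funnel` function, and the stand-in callables are hypothetical; only the ordering and the `useSciHub` default come from the text above.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], Optional[str]]]


def run_funnel(steps: List[Step], use_scihub: bool = True) -> Optional[Tuple[str, str]]:
    """Try each step in order; return (step_name, saved_path) for the first success."""
    for name, attempt in steps:
        if name == "scihub" and not use_scihub:
            continue  # caller asked to suppress the final fallback
        try:
            path = attempt()
        except Exception:
            path = None  # a failing step just hands over to the next one
        if path:
            return name, path
    return None


def native_download() -> Optional[str]:
    raise RuntimeError("no native downloader for this source")  # simulated failure


# Stand-in steps in the documented order; real steps would perform network requests.
steps: List[Step] = [
    ("native", native_download),
    ("metadata_pdf_url", lambda: None),
    ("repository", lambda: None),
    ("unpaywall", lambda: "./downloads/paper.pdf"),
    ("scihub", lambda: "./downloads/paper-scihub.pdf"),
]

print(run_funnel(steps))
```

Keeping each step as a zero-argument callable makes the order itself the only policy, so reordering or skipping a step does not touch the loop.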
659
- ### `search_wiley`
660
-
661
- Wiley keyword search is not supported by the Wiley TDM API. Use Crossref first, then download by DOI:
662
-
663
- ```bash
664
- paper-search run search_crossref --arg query="site:wiley.com machine learning" --arg maxResults=10 --pretty
665
- paper-search run download_paper --arg paperId="10.1111/example" --arg platform=wiley --pretty
666
- ```
667
-
668
- ### `get_platform_status`
669
-
670
- Same as `paper-search status`.
671
-
672
- ```bash
673
- paper-search run get_platform_status --pretty
674
- paper-search run get_platform_status --arg validate=true --pretty
242
+ paper-search search "machine learning" --platform crossref --max-results 1 --pretty
243
+ paper-search doctor --format text
675
244
  ```
676
245
 
677
246
  ## Troubleshooting
678
247
 
679
- ### Command Not Found
680
-
681
- Run from the project:
682
-
683
- ```bash
684
- node dist/cli.js status --pretty
685
- ```
686
-
687
- Or register the local command:
688
-
689
- ```bash
690
- npm link
691
- paper-search status --pretty
692
- ```
693
-
694
- ### Missing API Key
695
-
696
- Run:
697
-
698
- ```bash
699
- paper-search status --pretty
700
- ```
701
-
702
- If a provider shows `missing`, add the relevant key through `paper-search setup`, user config, or `.env`, then rerun the command.
703
-
704
- For global installs, prefer user config:
705
-
706
- ```bash
707
- paper-search setup
708
- paper-search config set SEMANTIC_SCHOLAR_API_KEY your_key
709
- paper-search config doctor --pretty
710
- ```
711
-
712
- ### Provider Rate Limits
713
-
714
- Reduce `--max-results`, avoid repeated live validation, and prefer sources with official APIs. PubMed, Semantic Scholar, and CORE support optional keys for better limits. CORE anonymous access can return HTTP 429; configure `PAPER_SEARCH_CORE_API_KEY` when you rely on it.
715
-
716
- ### JSON Parsing In Scripts
717
-
718
- Use default JSON output and parse stdout. Human diagnostics are written to stderr.
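Since JSON goes to stdout and human diagnostics go to stderr, a script can capture the two streams separately and parse only stdout. The sketch below shows the pattern with a stand-in child process instead of the real CLI so that it runs anywhere; `run_json` is a hypothetical helper, and the stand-in's output is made up.

```python
import json
import subprocess
import sys
from typing import Any, List


def run_json(command: List[str]) -> Any:
    """Run a command, ignore stderr, and parse stdout as a single JSON document."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return json.loads(completed.stdout)


# Stand-in for `paper-search ...`: JSON on stdout, a human-readable note on stderr.
fake_cli = [
    sys.executable,
    "-c",
    "import sys, json; "
    "print(json.dumps({'ok': True, 'data': []})); "
    "print('diagnostic note', file=sys.stderr)",
]

payload = run_json(fake_cli)
print(payload)
```

In real use the list would start with `paper-search` and should omit `--format text`, since that switches the output away from JSON.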
248
+ | Problem | First check |
249
+ | --- | --- |
250
+ | Command not found | Reinstall globally with `npm install -g paper-search-cli` |
251
+ | Missing capability | Run `paper-search doctor --pretty` and configure the missing key with `paper-search setup` |
252
+ | Provider rate limits | Lower `--max-results`, configure the relevant key, or switch sources |
253
+ | Skill looks stale | Run `paper-search skills status --pretty`, then `paper-search skills update --targets agents --pretty` |
254
+ | Need complete CLI details | Run `paper-search tools --pretty` |
719
255
 
720
256
  ## Usage Boundaries
721
257
 
@@ -723,9 +259,7 @@ Some sources may be subject to platform terms, institutional subscriptions, or l
723
259
 
724
260
  ## Project Origin
725
261
 
726
- This project acknowledges and thanks the [LinuxDo](https://linux.do) community.
727
-
728
- The CLI + Skill direction and paper-search workflow refinements were shaped by community discussions and open-source sharing. This repository keeps the workflow focused on a one-command terminal tool and does not require an MCP runtime.
262
+ This project acknowledges and thanks the [LinuxDo](https://linux.do) community. The CLI + Skill direction and paper-search workflow refinements were shaped by community discussions and open-source sharing.
729
263
 
730
264
  It also references ideas from [openags/paper-search-mcp](https://github.com/openags/paper-search-mcp) while adapting the workflow to a standalone CLI.
731
265