@wentorai/research-plugins 1.4.0 → 1.4.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (63)
  1. package/README.en.md +143 -0
  2. package/README.md +98 -131
  3. package/curated/literature/README.md +2 -2
  4. package/curated/writing/README.md +1 -1
  5. package/openclaw.plugin.json +1 -1
  6. package/package.json +1 -1
  7. package/skills/literature/discovery/SKILL.md +1 -1
  8. package/skills/literature/discovery/citation-alert-guide/SKILL.md +2 -2
  9. package/skills/literature/discovery/conference-proceedings-guide/SKILL.md +2 -2
  10. package/skills/literature/discovery/literature-mapping-guide/SKILL.md +1 -1
  11. package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +8 -14
  12. package/skills/literature/discovery/rss-paper-feeds/SKILL.md +20 -14
  13. package/skills/literature/discovery/semantic-paper-radar/SKILL.md +8 -8
  14. package/skills/literature/discovery/semantic-scholar-recs-guide/SKILL.md +103 -86
  15. package/skills/literature/fulltext/open-access-guide/SKILL.md +1 -1
  16. package/skills/literature/fulltext/open-access-mining-guide/SKILL.md +5 -5
  17. package/skills/literature/metadata/citation-network-guide/SKILL.md +3 -3
  18. package/skills/literature/metadata/h-index-guide/SKILL.md +0 -27
  19. package/skills/literature/search/SKILL.md +1 -1
  20. package/skills/literature/search/citation-chaining-guide/SKILL.md +42 -32
  21. package/skills/literature/search/database-comparison-guide/SKILL.md +1 -1
  22. package/skills/literature/search/semantic-scholar-api/SKILL.md +56 -53
  23. package/skills/research/automation/paper-to-agent-guide/SKILL.md +1 -1
  24. package/skills/research/deep-research/in-depth-research-guide/SKILL.md +1 -1
  25. package/skills/research/deep-research/kosmos-scientist-guide/SKILL.md +3 -3
  26. package/skills/research/deep-research/llm-scientific-discovery-guide/SKILL.md +1 -1
  27. package/skills/research/deep-research/local-deep-research-guide/SKILL.md +6 -6
  28. package/skills/research/deep-research/open-researcher-guide/SKILL.md +3 -3
  29. package/skills/research/deep-research/tongyi-deep-research-guide/SKILL.md +4 -4
  30. package/skills/research/methodology/grad-school-guide/SKILL.md +1 -1
  31. package/skills/research/paper-review/automated-review-guide/SKILL.md +1 -1
  32. package/skills/tools/diagram/excalidraw-diagram-guide/SKILL.md +1 -1
  33. package/skills/tools/diagram/mermaid-architect-guide/SKILL.md +1 -1
  34. package/skills/tools/diagram/plantuml-guide/SKILL.md +1 -1
  35. package/skills/tools/document/grobid-pdf-parsing/SKILL.md +1 -1
  36. package/skills/tools/document/paper-parse-guide/SKILL.md +2 -2
  37. package/skills/tools/knowledge-graph/citation-network-builder/SKILL.md +5 -5
  38. package/skills/tools/knowledge-graph/knowledge-graph-construction/SKILL.md +1 -1
  39. package/skills/tools/scraping/academic-web-scraping/SKILL.md +1 -2
  40. package/skills/tools/scraping/google-scholar-scraper/SKILL.md +7 -7
  41. package/skills/writing/citation/SKILL.md +1 -1
  42. package/skills/writing/citation/academic-citation-manager/SKILL.md +20 -17
  43. package/skills/writing/citation/citation-assistant-skill/SKILL.md +72 -58
  44. package/skills/writing/citation/onecite-reference-guide/SKILL.md +1 -1
  45. package/skills/writing/citation/zotero-reference-guide/SKILL.md +1 -1
  46. package/skills/writing/citation/zotero-scholar-guide/SKILL.md +1 -1
  47. package/src/tools/arxiv.ts +13 -3
  48. package/src/tools/biorxiv.ts +21 -5
  49. package/src/tools/crossref.ts +13 -6
  50. package/src/tools/datacite.ts +7 -3
  51. package/src/tools/doaj.ts +3 -2
  52. package/src/tools/europe-pmc.ts +4 -3
  53. package/src/tools/hal.ts +6 -4
  54. package/src/tools/inspire-hep.ts +3 -2
  55. package/src/tools/openaire.ts +11 -6
  56. package/src/tools/openalex.ts +17 -2
  57. package/src/tools/opencitations.ts +9 -0
  58. package/src/tools/orcid.ts +3 -0
  59. package/src/tools/osf-preprints.ts +3 -2
  60. package/src/tools/pubmed.ts +12 -5
  61. package/src/tools/unpaywall.ts +3 -0
  62. package/src/tools/util.ts +33 -0
  63. package/src/tools/zenodo.ts +10 -4
@@ -31,16 +31,16 @@ Key components:
  - **Vector database**: Stores and indexes embeddings for fast similarity search. Options include ChromaDB (local), Qdrant, Pinecone, or Weaviate.
  - **Similarity metric**: Cosine similarity is standard for comparing text embeddings.
 
- ### Using Semantic Scholar's Embedding Search
+ ### Using OpenAlex's Search API
 
- Semantic Scholar provides pre-computed SPECTER embeddings for millions of papers. You can use their search API for semantic queries:
+ OpenAlex indexes 250M+ works and supports search queries across all disciplines:
 
  ```bash
- # Semantic search via the Semantic Scholar API
- curl "https://api.semanticscholar.org/graph/v1/paper/search?query=attention+mechanisms+for+graph+neural+networks&fields=title,abstract,year,citationCount&limit=20"
+ # Search works via the OpenAlex API
+ curl "https://api.openalex.org/works?search=attention+mechanisms+for+graph+neural+networks&per_page=20"
  ```
 
- The search endpoint uses semantic matching, not just keyword matching. A query like "methods for handling missing values in longitudinal studies" will find papers about imputation techniques, dropout analysis, and panel data methods even if they do not use the phrase "missing values."
+ The search endpoint uses relevance-ranked matching. Combine with concept filters and citation data for more targeted discovery. For true semantic matching, build a local embedding index (see below).
 
  ### Building a Personal Semantic Index
 
@@ -84,7 +84,7 @@ This local index lets you search across all papers you have collected using natu
  Use semantic search to expand your awareness beyond your current reading:
 
  1. **Seed**: Take the abstract of your current paper (or a paragraph describing your research question).
- 2. **Search**: Run it as a semantic query against a large corpus (Semantic Scholar, OpenAlex, or your local index).
+ 2. **Search**: Run it as a semantic query against a large corpus (OpenAlex, CrossRef, or your local index).
  3. **Filter**: Remove papers you have already read. Sort by a combination of semantic similarity and recency.
  4. **Cluster**: Group the top 50 results into thematic clusters using k-means or HDBSCAN on their embeddings.
  5. **Explore clusters**: Each cluster represents a related subtopic. Read the most-cited paper in each cluster to understand the connection to your work.
@@ -103,7 +103,7 @@ Semantic search excels at finding papers from other fields that address similar
  Set up periodic semantic searches to detect new papers in your area:
 
  1. Define 3-5 "concept vectors" by encoding descriptions of your core research interests.
- 2. Weekly, search against newly published papers (last 7 days) from arXiv or Semantic Scholar.
+ 2. Weekly, search against newly published papers (last 7 days) from arXiv or OpenAlex.
  3. Rank new papers by maximum similarity to any of your concept vectors.
  4. Papers above your similarity threshold enter your reading queue automatically.
 
@@ -137,7 +137,7 @@ Compare your research question against the semantic landscape of existing work.
 
  ## References
 
- - Semantic Scholar API: https://api.semanticscholar.org
+ - OpenAlex API: https://api.openalex.org
  - SPECTER2 model: https://huggingface.co/allenai/specter2
  - ChromaDB: https://www.trychroma.com
  - ResearchGPT: https://github.com/mukulpatnaik/researchgpt
@@ -1,6 +1,6 @@
  ---
  name: semantic-scholar-recs-guide
- description: "Using Semantic Scholar recommendations API for paper discovery"
+ description: "Paper discovery via recommendation APIs (OpenAlex, CrossRef citation networks)"
  metadata:
    openclaw:
      emoji: "🤖"
@@ -10,70 +10,72 @@ metadata:
      source: "wentor-research-plugins"
  ---
 
- # Semantic Scholar Recommendations Guide
+ # Paper Discovery via OpenAlex & CrossRef
 
- Leverage the Semantic Scholar (S2) API to discover related papers, traverse citation networks, and build comprehensive reading lists programmatically.
+ Leverage the OpenAlex and CrossRef APIs to discover related papers, traverse citation networks, and build comprehensive reading lists programmatically.
 
  ## Overview
 
- Semantic Scholar indexes over 200 million academic papers and provides a free, rate-limited API that supports:
+ OpenAlex indexes over 250 million academic works and provides a free, no-key-required API that supports:
 
- - Paper search by title, keyword, or DOI
- - Recommendations based on positive and negative seed papers
+ - Work search by title, keyword, or DOI
  - Citation and reference graph traversal
  - Author profiles and publication histories
- - Bulk data access for large-scale analyses
+ - Concept-based discovery across disciplines
+ - Institutional and venue filtering
 
- Base URL: `https://api.semanticscholar.org/graph/v1`
- Recommendations endpoint: `https://api.semanticscholar.org/recommendations/v1`
+ Base URL: `https://api.openalex.org`
+ CrossRef URL: `https://api.crossref.org`
 
- ## Getting Recommendations from Seed Papers
+ ## Finding Related Papers
 
- The recommendations endpoint accepts a list of positive (and optionally negative) paper IDs and returns related papers ranked by relevance.
+ Use OpenAlex's concept graph and citation data to discover related work from seed papers.
 
- ### Single-Paper Recommendations
+ ### Concept-Based Discovery
 
  ```python
  import requests
 
- PAPER_ID = "649def34f8be52c8b66281af98ae884c09aef38b"  # SHA or S2 ID
+ HEADERS = {"User-Agent": "ResearchPlugins/1.0 (https://wentor.ai)"}
+ WORK_ID = "W2741809807"  # OpenAlex work ID
 
+ # Get the seed paper's concepts
  response = requests.get(
-     f"https://api.semanticscholar.org/recommendations/v1/papers/forpaper/{PAPER_ID}",
-     params={
-         "fields": "title,authors,year,citationCount,abstract,externalIds",
-         "limit": 20
-     },
-     headers={"x-api-key": "YOUR_API_KEY"}  # optional, increases rate limit
+     f"https://api.openalex.org/works/{WORK_ID}",
+     headers=HEADERS
  )
-
- for paper in response.json()["recommendedPapers"]:
-     print(f"[{paper['year']}] {paper['title']} (citations: {paper['citationCount']})")
+ paper = response.json()
+ concepts = [c["id"] for c in paper.get("concepts", [])[:3]]
+
+ # Find works sharing the same concepts, sorted by citations
+ for concept_id in concepts:
+     related = requests.get(
+         "https://api.openalex.org/works",
+         params={"filter": f"concepts.id:{concept_id}", "sort": "cited_by_count:desc", "per_page": 10},
+         headers=HEADERS
+     )
+     for w in related.json().get("results", []):
+         print(f"[{w.get('publication_year')}] {w.get('title')} (citations: {w.get('cited_by_count')})")
  ```
 
- ### Multi-Paper Recommendations (Positive + Negative Seeds)
+ ### CrossRef Subject-Based Discovery
 
  ```python
  import requests
 
- payload = {
-     "positivePaperIds": [
-         "649def34f8be52c8b66281af98ae884c09aef38b",
-         "ARXIV:2005.14165"  # can use arXiv ID prefix
-     ],
-     "negativePaperIds": [
-         "ArXiv:1706.03762"  # exclude attention-is-all-you-need style papers
-     ]
- }
-
- response = requests.post(
-     "https://api.semanticscholar.org/recommendations/v1/papers/",
-     json=payload,
-     params={"fields": "title,year,citationCount,url,abstract", "limit": 30}
- )
-
- results = response.json()["recommendedPapers"]
- print(f"Found {len(results)} recommended papers")
+ def search_crossref(query, limit=10, sort="is-referenced-by-count"):
+     """Search CrossRef for papers sorted by citation count."""
+     resp = requests.get(
+         "https://api.crossref.org/works",
+         params={"query": query, "rows": limit, "sort": sort, "order": "desc"},
+         headers={"User-Agent": "ResearchPlugins/1.0 (https://wentor.ai; mailto:dev@wentor.ai)"}
+     )
+     return resp.json().get("message", {}).get("items", [])
+
+ results = search_crossref("transformer attention mechanism")
+ for w in results:
+     title = w.get("title", [""])[0] if w.get("title") else ""
+     print(f"  {title} — Cited by: {w.get('is-referenced-by-count', 0)}")
  ```
 
  ## Citation Network Traversal
@@ -83,48 +85,49 @@ Walk the citation graph to discover foundational and derivative works.
  ### Forward Citations (Who Cited This Paper?)
 
  ```python
- paper_id = "649def34f8be52c8b66281af98ae884c09aef38b"
+ work_id = "W2741809807"
 
  response = requests.get(
-     f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}/citations",
+     "https://api.openalex.org/works",
      params={
-         "fields": "title,year,citationCount,authors",
-         "limit": 100,
-         "offset": 0
-     }
+         "filter": f"cites:{work_id}",
+         "sort": "cited_by_count:desc",
+         "per_page": 20
+     },
+     headers=HEADERS
  )
 
- citations = response.json()["data"]
- # Sort by citation count to find most influential derivative works
- citations.sort(key=lambda x: x["citingPaper"]["citationCount"], reverse=True)
- for c in citations[:10]:
-     p = c["citingPaper"]
-     print(f"  [{p['year']}] {p['title']} ({p['citationCount']} cites)")
+ for w in response.json().get("results", []):
+     print(f"  [{w.get('publication_year')}] {w.get('title')} ({w.get('cited_by_count')} cites)")
  ```
 
  ### Backward References (What Did This Paper Cite?)
 
  ```python
  response = requests.get(
-     f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}/references",
-     params={"fields": "title,year,citationCount,authors", "limit": 100}
+     f"https://api.openalex.org/works/{work_id}",
+     headers=HEADERS
  )
+ paper = response.json()
+ ref_ids = paper.get("referenced_works", [])
 
- refs = response.json()["data"]
- refs.sort(key=lambda x: x["citedPaper"]["citationCount"], reverse=True)
+ # Fetch details for referenced works
+ for ref_id in ref_ids[:20]:
+     ref = requests.get(f"https://api.openalex.org/works/{ref_id.split('/')[-1]}", headers=HEADERS).json()
+     print(f"  [{ref.get('publication_year')}] {ref.get('title')} ({ref.get('cited_by_count')} cites)")
  ```
 
  ## Building a Reading List Pipeline
 
- Combine search, recommendations, and citation traversal into a discovery pipeline:
+ Combine search, concept discovery, and citation traversal into a discovery pipeline:
 
  | Step | Method | Purpose |
  |------|--------|---------|
  | 1. Seed selection | Manual or keyword search | Identify 3-5 highly relevant papers |
- | 2. Expand via recs | Multi-paper recommendations | Find thematically related work |
- | 3. Forward citation | Citations endpoint | Find recent derivative works |
- | 4. Backward citation | References endpoint | Find foundational papers |
- | 5. Deduplicate | S2 paper ID matching | Remove duplicates across steps |
+ | 2. Expand via concepts | OpenAlex concept graph | Find thematically related work |
+ | 3. Forward citation | OpenAlex cites filter | Find recent derivative works |
+ | 4. Backward citation | referenced_works field | Find foundational papers |
+ | 5. Deduplicate | OpenAlex work ID matching | Remove duplicates across steps |
  | 6. Rank & filter | Sort by year, citations, relevance | Prioritize reading order |
 
  ```python
@@ -133,32 +136,46 @@ def build_reading_list(seed_ids, max_papers=50):
      seen = set()
      candidates = []
 
-     # Step 1: Get recommendations
-     recs = get_recommendations(seed_ids)
-     for paper in recs:
-         if paper["paperId"] not in seen:
-             seen.add(paper["paperId"])
-             candidates.append(paper)
-
-     # Step 2: Get citations of seed papers
-     for sid in seed_ids:
-         cites = get_citations(sid, limit=50)
-         for c in cites:
-             pid = c["citingPaper"]["paperId"]
-             if pid not in seen:
-                 seen.add(pid)
-                 candidates.append(c["citingPaper"])
-
-     # Step 3: Rank by citation count and recency
-     candidates.sort(key=lambda p: (p.get("year", 0), p.get("citationCount", 0)), reverse=True)
+     for seed_id in seed_ids:
+         # Get concepts from seed paper
+         paper = requests.get(f"https://api.openalex.org/works/{seed_id}", headers=HEADERS).json()
+         concept_ids = [c["id"] for c in paper.get("concepts", [])[:2]]
+
+         # Find related works via concepts
+         for cid in concept_ids:
+             related = requests.get(
+                 "https://api.openalex.org/works",
+                 params={"filter": f"concepts.id:{cid}", "sort": "cited_by_count:desc", "per_page": 20},
+                 headers=HEADERS
+             ).json().get("results", [])
+             for w in related:
+                 wid = w.get("id", "").split("/")[-1]
+                 if wid not in seen:
+                     seen.add(wid)
+                     candidates.append(w)
+
+         # Get citing works
+         citing = requests.get(
+             "https://api.openalex.org/works",
+             params={"filter": f"cites:{seed_id}", "sort": "cited_by_count:desc", "per_page": 20},
+             headers=HEADERS
+         ).json().get("results", [])
+         for w in citing:
+             wid = w.get("id", "").split("/")[-1]
+             if wid not in seen:
+                 seen.add(wid)
+                 candidates.append(w)
+
+     # Rank by citation count and recency
+     candidates.sort(key=lambda p: (p.get("publication_year", 0), p.get("cited_by_count", 0)), reverse=True)
      return candidates[:max_papers]
  ```
 
- ## Rate Limits and Best Practices
+ ## Best Practices
 
- - **Without API key**: 100 requests per 5 minutes
- - **With API key**: 1 request/second sustained (request a key at semanticscholar.org/product/api)
- - Always include only the fields you need to reduce payload size
- - Use `offset` and `limit` for pagination on large result sets
+ - OpenAlex is free with no API key required; use a polite `User-Agent` header
+ - CrossRef requires a polite pool user agent with contact info for higher rate limits
+ - Always include only the fields you need via `select` parameter to reduce payload size
+ - Use `page` and `per_page` for pagination on large result sets
  - Cache responses locally to avoid redundant requests
- - Use DOI, arXiv ID, or PubMed ID as paper identifiers for cross-system compatibility (prefix with `DOI:`, `ARXIV:`, or `PMID:`)
+ - Use DOI as the universal identifier for cross-system compatibility
@@ -84,7 +84,7 @@ else:
  | SSRN | Preprint server | Social sciences, law, economics | ssrn.com |
  | Zenodo | Repository | All disciplines | zenodo.org |
  | CORE | Aggregator | 300M+ papers from repositories | core.ac.uk |
- | Semantic Scholar | Search + OA links | Cross-disciplinary | semanticscholar.org |
+ | OpenAlex | Search + OA links | Cross-disciplinary | openalex.org |
  | BASE (Bielefeld) | Aggregator | 400M+ documents | base-search.net |
 
  ### Batch OA Lookup
@@ -93,11 +93,11 @@ Unpaywall / OpenAlex:
  - Use: Find OA versions of any DOI
  - Best for: Locating freely available versions of papers
 
- Semantic Scholar:
- - Coverage: 200M+ papers, abstracts + some full text
- - Access: Free API, bulk datasets
- - Features: TLDR summaries, citation intents, S2ORC corpus
- - Best for: NLP research on scientific text
+ OpenAlex:
+ - Coverage: 250M+ works, all disciplines
+ - Access: Free API, no key required
+ - Features: Concepts, citation counts, author profiles, institution data
+ - Best for: Cross-disciplinary metadata and OA discovery
  ```
 
  ## Full-Text Retrieval and Parsing
@@ -49,7 +49,7 @@ Whether you are conducting a systematic literature review, mapping a new researc
 
  | Source | Coverage | API | Cost |
  |--------|----------|-----|------|
- | Semantic Scholar | 200M+ papers, CS/biomed focus | REST API, free | Free (rate limited) |
+ | OpenAlex | 250M+ works, all disciplines | REST API, free | Free (no key required) |
  | OpenAlex | 250M+ works, all disciplines | REST API, free | Free |
  | Crossref | 140M+ DOIs | REST API | Free |
  | Web of Science | Curated, multi-disciplinary | Institutional | Licensed |
@@ -219,7 +219,7 @@ Traditional citations take years to accumulate. Altmetrics capture immediate att
 
  ## Best Practices
 
- - **Combine multiple data sources.** No single database has complete coverage. Merge OpenAlex and Semantic Scholar for best results.
+ - **Combine multiple data sources.** No single database has complete coverage. Merge OpenAlex and CrossRef for best results.
  - **Normalize by field and age.** A 2024 paper in biology and a 2024 paper in mathematics have very different citation rate baselines.
  - **Use relative indicators.** Field-Weighted Citation Impact (FWCI) accounts for disciplinary differences.
  - **Do not equate citations with quality.** Retracted papers sometimes have high citation counts. Controversial papers accumulate criticism citations.
@@ -229,7 +229,7 @@ Traditional citations take years to accumulate. Altmetrics capture immediate att
  ## References
 
  - [OpenAlex API](https://docs.openalex.org/) -- Free, open bibliographic data
- - [Semantic Scholar API](https://api.semanticscholar.org/) -- AI-powered paper data
+ - [CrossRef API](https://api.crossref.org/) -- DOI resolution and metadata
  - [VOSviewer](https://www.vosviewer.com/) -- Bibliometric visualization tool
  - [bibliometrix R package](https://www.bibliometrix.org/) -- Comprehensive bibliometric analysis
  - [Altmetric](https://www.altmetric.com/) -- Alternative impact metrics
@@ -115,33 +115,6 @@ for source in results:
 
  Google Scholar profiles automatically display h-index and i10-index. No calculation needed, but coverage is the broadest (includes non-peer-reviewed sources).
 
- ### From Semantic Scholar API
-
- ```python
- def get_author_h_index(author_name):
-     """Calculate h-index for an author using Semantic Scholar."""
-     # Search for author
-     search_resp = requests.get(
-         "https://api.semanticscholar.org/graph/v1/author/search",
-         params={"query": author_name, "limit": 1}
-     )
-     authors = search_resp.json().get("data", [])
-     if not authors:
-         return None
-
-     author_id = authors[0]["authorId"]
-
-     # Get all papers with citation counts
-     papers_resp = requests.get(
-         f"https://api.semanticscholar.org/graph/v1/author/{author_id}/papers",
-         params={"fields": "citationCount", "limit": 1000}
-     )
-     papers = papers_resp.json().get("data", [])
-     citation_counts = [p.get("citationCount", 0) for p in papers]
-
-     return calculate_h_index(citation_counts)
- ```
-
  ### From OpenAlex
 
  ```python
@@ -36,7 +36,7 @@ Select the skill matching the user's need, then `read` its SKILL.md.
  | [plos-open-access-api](./plos-open-access-api/SKILL.md) | Search PLOS open access journals with full-text Solr-powered API |
  | [pubmed-api](./pubmed-api/SKILL.md) | Search biomedical literature and retrieve records via PubMed E-utilities |
  | [scielo-api](./scielo-api/SKILL.md) | Access Latin American and developing world research via SciELO API |
- | [semantic-scholar-api](./semantic-scholar-api/SKILL.md) | Search papers and analyze citation graphs via Semantic Scholar |
+ | [semantic-scholar-api](./semantic-scholar-api/SKILL.md) | Search papers and analyze citation graphs via OpenAlex and CrossRef APIs |
  | [share-research-api](./share-research-api/SKILL.md) | Discover open access research outputs via the SHARE notification API |
  | [systematic-search-strategy](./systematic-search-strategy/SKILL.md) | Construct rigorous systematic search strategies for literature reviews |
  | [worldcat-search-api](./worldcat-search-api/SKILL.md) | Search the world's largest library catalog via OCLC WorldCat API |
@@ -40,24 +40,30 @@ Examine the reference list of each seed paper and identify which cited works are
  ```python
  import requests
 
- def get_references(paper_id, limit=100):
-     """Get all references of a paper via Semantic Scholar."""
-     url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}/references"
-     response = requests.get(url, params={
-         "fields": "title,year,citationCount,externalIds,abstract",
-         "limit": limit
-     })
-     refs = response.json().get("data", [])
-     return [r["citedPaper"] for r in refs if r["citedPaper"].get("title")]
+ HEADERS = {"User-Agent": "ResearchPlugins/1.0 (https://wentor.ai)"}
+
+ def get_references(work_id):
+     """Get all references of a paper via OpenAlex."""
+     url = f"https://api.openalex.org/works/{work_id}"
+     response = requests.get(url, headers=HEADERS)
+     paper = response.json()
+     ref_ids = paper.get("referenced_works", [])
+
+     references = []
+     for ref_id in ref_ids:
+         ref = requests.get(f"https://api.openalex.org/works/{ref_id.split('/')[-1]}", headers=HEADERS).json()
+         if ref.get("title"):
+             references.append(ref)
+     return references
 
  # Get references of a seed paper
- seed_doi = "DOI:10.1038/s41586-021-03819-2"
- references = get_references(seed_doi)
+ seed_id = "W2741809807"
+ references = get_references(seed_id)
 
  # Sort by citation count to find the most influential foundations
- references.sort(key=lambda p: p.get("citationCount", 0), reverse=True)
+ references.sort(key=lambda p: p.get("cited_by_count", 0), reverse=True)
  for ref in references[:15]:
-     print(f"[{ref.get('year', '?')}] {ref['title']} ({ref.get('citationCount', 0)} citations)")
+     print(f"[{ref.get('publication_year', '?')}] {ref['title']} ({ref.get('cited_by_count', 0)} citations)")
  ```
 
  ### Step 3: Forward Chaining (Citation Tracking)
@@ -65,28 +71,32 @@ for ref in references[:15]:
  Find all papers that have cited your seed paper.
 
  ```python
- def get_citations(paper_id, limit=200):
-     """Get papers citing a given paper via Semantic Scholar."""
-     url = f"https://api.semanticscholar.org/graph/v1/paper/{paper_id}/citations"
+ def get_citations(work_id, limit=200):
+     """Get papers citing a given paper via OpenAlex."""
      all_citations = []
-     offset = 0
-     while offset < limit:
-         response = requests.get(url, params={
-             "fields": "title,year,citationCount,externalIds,abstract",
-             "limit": min(100, limit - offset),
-             "offset": offset
-         })
-         data = response.json().get("data", [])
-         if not data:
+     page = 1
+     while len(all_citations) < limit:
+         response = requests.get(
+             "https://api.openalex.org/works",
+             params={
+                 "filter": f"cites:{work_id}",
+                 "sort": "cited_by_count:desc",
+                 "per_page": min(200, limit - len(all_citations)),
+                 "page": page
+             },
+             headers=HEADERS
+         )
+         results = response.json().get("results", [])
+         if not results:
              break
-         all_citations.extend([c["citingPaper"] for c in data if c["citingPaper"].get("title")])
-         offset += len(data)
+         all_citations.extend(results)
+         page += 1
      return all_citations
 
- citations = get_citations(seed_doi)
+ citations = get_citations(seed_id)
  # Filter for recent, well-cited papers
- recent_impactful = [c for c in citations if c.get("year", 0) >= 2022 and c.get("citationCount", 0) >= 5]
- recent_impactful.sort(key=lambda p: p.get("citationCount", 0), reverse=True)
+ recent_impactful = [c for c in citations if c.get("publication_year", 0) >= 2022 and c.get("cited_by_count", 0) >= 5]
+ recent_impactful.sort(key=lambda p: p.get("cited_by_count", 0), reverse=True)
  ```
 
  ### Step 4: Co-Citation and Bibliographic Coupling
@@ -134,7 +144,7 @@ Repeat the process with the most relevant papers discovered in each round:
  | Google Scholar "Cited by" | Forward chaining | Free |
  | Web of Science "Cited References" / "Times Cited" | Both directions | Subscription |
  | Scopus "References" / "Cited by" | Both directions | Subscription |
- | Semantic Scholar API | Programmatic, both directions | Free |
+ | OpenAlex API | Programmatic, both directions | Free |
  | Connected Papers (connectedpapers.com) | Visual co-citation graph | Free (limited) |
  | Litmaps (litmaps.com) | Visual citation network | Free tier |
  | CoCites (cocites.com) | Co-citation analysis | Free |
@@ -145,4 +155,4 @@ Repeat the process with the most relevant papers discovered in each round:
  - **Citation bias**: Highly cited papers are not always the best or most relevant. Pay attention to less-cited but methodologically sound papers.
  - **Recency bias**: Forward chaining favors recent papers with fewer citations. Allow time for citation accumulation or use Mendeley readership as a proxy.
  - **Field boundaries**: Citation chains tend to stay within disciplinary silos. Combine with keyword searches in adjacent-field databases to break out.
- - **Incomplete coverage**: No single database indexes all citations. Cross-check with at least two sources (e.g., Semantic Scholar + Google Scholar).
+ - **Incomplete coverage**: No single database indexes all citations. Cross-check with at least two sources (e.g., OpenAlex + Google Scholar).
@@ -96,5 +96,5 @@ A robust literature search should query multiple databases to maximize recall:
 
  - **Scopus vs. Web of Science**: Scopus has broader coverage (especially post-2000 and non-English journals); WoS has deeper historical archives and the Journal Impact Factor.
  - **Google Scholar** finds the most results but lacks advanced filtering. Use it for snowball searches and finding grey literature, not as your primary systematic search tool.
- - **API access**: PubMed (E-utilities), Semantic Scholar, OpenAlex, and Crossref all offer free APIs for programmatic searching. Scopus and WoS require institutional API keys.
+ - **API access**: PubMed (E-utilities), OpenAlex, and Crossref all offer free APIs for programmatic searching. Scopus and WoS require institutional API keys.
  - **Alert services**: Set up saved search alerts on PubMed, Scopus, and Google Scholar to stay current in fast-moving fields.