@wentorai/research-plugins 1.3.2 → 1.4.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +32 -56
- package/curated/analysis/README.md +1 -13
- package/curated/domains/README.md +1 -5
- package/curated/literature/README.md +3 -12
- package/curated/research/README.md +1 -18
- package/curated/tools/README.md +1 -12
- package/curated/writing/README.md +2 -6
- package/index.ts +88 -5
- package/openclaw.plugin.json +3 -12
- package/package.json +3 -5
- package/skills/analysis/statistics/SKILL.md +1 -1
- package/skills/analysis/statistics/meta-analysis-guide/SKILL.md +1 -1
- package/skills/domains/ai-ml/SKILL.md +3 -2
- package/skills/domains/ai-ml/generative-ai-guide/SKILL.md +1 -0
- package/skills/domains/ai-ml/huggingface-api/SKILL.md +251 -0
- package/skills/domains/biomedical/SKILL.md +9 -2
- package/skills/domains/biomedical/alphafold-api/SKILL.md +227 -0
- package/skills/domains/biomedical/biothings-api/SKILL.md +296 -0
- package/skills/domains/biomedical/clinicaltrials-api-v2/SKILL.md +216 -0
- package/skills/domains/biomedical/enrichr-api/SKILL.md +264 -0
- package/skills/domains/biomedical/ensembl-rest-api/SKILL.md +204 -0
- package/skills/domains/biomedical/medical-data-api/SKILL.md +197 -0
- package/skills/domains/biomedical/pdb-structure-api/SKILL.md +219 -0
- package/skills/domains/business/SKILL.md +2 -3
- package/skills/domains/chemistry/SKILL.md +3 -2
- package/skills/domains/chemistry/catalysis-hub-api/SKILL.md +171 -0
- package/skills/domains/education/SKILL.md +2 -3
- package/skills/domains/law/SKILL.md +3 -2
- package/skills/domains/law/uk-legislation-api/SKILL.md +179 -0
- package/skills/literature/discovery/SKILL.md +1 -1
- package/skills/literature/discovery/citation-alert-guide/SKILL.md +2 -2
- package/skills/literature/discovery/conference-proceedings-guide/SKILL.md +2 -2
- package/skills/literature/discovery/literature-mapping-guide/SKILL.md +1 -1
- package/skills/literature/discovery/paper-recommendation-guide/SKILL.md +8 -14
- package/skills/literature/discovery/rss-paper-feeds/SKILL.md +20 -14
- package/skills/literature/discovery/semantic-paper-radar/SKILL.md +8 -8
- package/skills/literature/discovery/semantic-scholar-recs-guide/SKILL.md +103 -86
- package/skills/literature/fulltext/SKILL.md +3 -2
- package/skills/literature/fulltext/arxiv-latex-source/SKILL.md +195 -0
- package/skills/literature/fulltext/open-access-guide/SKILL.md +1 -1
- package/skills/literature/fulltext/open-access-mining-guide/SKILL.md +5 -5
- package/skills/literature/metadata/citation-network-guide/SKILL.md +3 -3
- package/skills/literature/metadata/h-index-guide/SKILL.md +0 -27
- package/skills/literature/search/SKILL.md +3 -4
- package/skills/literature/search/citation-chaining-guide/SKILL.md +42 -32
- package/skills/literature/search/database-comparison-guide/SKILL.md +1 -1
- package/skills/literature/search/semantic-scholar-api/SKILL.md +56 -53
- package/skills/research/automation/SKILL.md +2 -3
- package/skills/research/automation/datagen-research-guide/SKILL.md +1 -0
- package/skills/research/automation/mle-agent-guide/SKILL.md +1 -0
- package/skills/research/automation/paper-to-agent-guide/SKILL.md +2 -1
- package/skills/research/deep-research/auto-deep-research-guide/SKILL.md +1 -0
- package/skills/research/deep-research/in-depth-research-guide/SKILL.md +1 -1
- package/skills/research/deep-research/kosmos-scientist-guide/SKILL.md +3 -3
- package/skills/research/deep-research/llm-scientific-discovery-guide/SKILL.md +1 -1
- package/skills/research/deep-research/local-deep-research-guide/SKILL.md +6 -6
- package/skills/research/deep-research/open-researcher-guide/SKILL.md +3 -3
- package/skills/research/deep-research/tongyi-deep-research-guide/SKILL.md +4 -4
- package/skills/research/methodology/SKILL.md +1 -1
- package/skills/research/methodology/claude-scientific-guide/SKILL.md +1 -0
- package/skills/research/methodology/grad-school-guide/SKILL.md +1 -1
- package/skills/research/methodology/qualitative-research-guide/SKILL.md +1 -1
- package/skills/research/paper-review/SKILL.md +1 -1
- package/skills/research/paper-review/automated-review-guide/SKILL.md +1 -1
- package/skills/research/paper-review/peer-review-guide/SKILL.md +1 -1
- package/skills/tools/diagram/excalidraw-diagram-guide/SKILL.md +1 -1
- package/skills/tools/diagram/mermaid-architect-guide/SKILL.md +1 -1
- package/skills/tools/diagram/plantuml-guide/SKILL.md +1 -1
- package/skills/tools/document/grobid-pdf-parsing/SKILL.md +1 -1
- package/skills/tools/document/paper-parse-guide/SKILL.md +2 -2
- package/skills/tools/knowledge-graph/SKILL.md +2 -3
- package/skills/tools/knowledge-graph/citation-network-builder/SKILL.md +5 -5
- package/skills/tools/knowledge-graph/knowledge-graph-construction/SKILL.md +1 -1
- package/skills/tools/ocr-translate/zotero-pdf2zh-guide/SKILL.md +1 -0
- package/skills/tools/scraping/academic-web-scraping/SKILL.md +1 -2
- package/skills/tools/scraping/google-scholar-scraper/SKILL.md +7 -7
- package/skills/writing/citation/SKILL.md +1 -1
- package/skills/writing/citation/academic-citation-manager/SKILL.md +20 -17
- package/skills/writing/citation/citation-assistant-skill/SKILL.md +72 -58
- package/skills/writing/citation/obsidian-citation-guide/SKILL.md +1 -0
- package/skills/writing/citation/obsidian-zotero-guide/SKILL.md +1 -0
- package/skills/writing/citation/onecite-reference-guide/SKILL.md +1 -1
- package/skills/writing/citation/papersgpt-zotero-guide/SKILL.md +1 -0
- package/skills/writing/citation/zotero-mdnotes-guide/SKILL.md +1 -0
- package/skills/writing/citation/zotero-reference-guide/SKILL.md +2 -1
- package/skills/writing/citation/zotero-scholar-guide/SKILL.md +1 -1
- package/skills/writing/composition/scientific-writing-resources/SKILL.md +1 -0
- package/skills/writing/latex/latex-drawing-collection/SKILL.md +1 -0
- package/skills/writing/latex/latex-templates-collection/SKILL.md +1 -0
- package/skills/writing/templates/novathesis-guide/SKILL.md +1 -0
- package/src/tools/arxiv.ts +81 -30
- package/src/tools/biorxiv.ts +158 -0
- package/src/tools/crossref.ts +63 -22
- package/src/tools/datacite.ts +191 -0
- package/src/tools/dblp.ts +125 -0
- package/src/tools/doaj.ts +82 -0
- package/src/tools/europe-pmc.ts +159 -0
- package/src/tools/hal.ts +118 -0
- package/src/tools/inspire-hep.ts +165 -0
- package/src/tools/openaire.ts +158 -0
- package/src/tools/openalex.ts +26 -15
- package/src/tools/opencitations.ts +112 -0
- package/src/tools/orcid.ts +139 -0
- package/src/tools/osf-preprints.ts +104 -0
- package/src/tools/pubmed.ts +22 -13
- package/src/tools/ror.ts +118 -0
- package/src/tools/unpaywall.ts +15 -6
- package/src/tools/util.ts +141 -0
- package/src/tools/zenodo.ts +157 -0
- package/mcp-configs/academic-db/ChatSpatial.json +0 -17
- package/mcp-configs/academic-db/academia-mcp.json +0 -17
- package/mcp-configs/academic-db/academic-paper-explorer.json +0 -17
- package/mcp-configs/academic-db/academic-search-mcp-server.json +0 -17
- package/mcp-configs/academic-db/agentinterviews-mcp.json +0 -17
- package/mcp-configs/academic-db/all-in-mcp.json +0 -17
- package/mcp-configs/academic-db/alphafold-mcp.json +0 -20
- package/mcp-configs/academic-db/apple-health-mcp.json +0 -17
- package/mcp-configs/academic-db/arxiv-latex-mcp.json +0 -17
- package/mcp-configs/academic-db/arxiv-mcp-server.json +0 -17
- package/mcp-configs/academic-db/bgpt-mcp.json +0 -17
- package/mcp-configs/academic-db/biomcp.json +0 -17
- package/mcp-configs/academic-db/biothings-mcp.json +0 -17
- package/mcp-configs/academic-db/brightspace-mcp.json +0 -21
- package/mcp-configs/academic-db/catalysishub-mcp-server.json +0 -17
- package/mcp-configs/academic-db/climatiq-mcp.json +0 -20
- package/mcp-configs/academic-db/clinicaltrialsgov-mcp-server.json +0 -17
- package/mcp-configs/academic-db/deep-research-mcp.json +0 -17
- package/mcp-configs/academic-db/dicom-mcp.json +0 -17
- package/mcp-configs/academic-db/enrichr-mcp-server.json +0 -17
- package/mcp-configs/academic-db/fec-mcp-server.json +0 -17
- package/mcp-configs/academic-db/fhir-mcp-server-themomentum.json +0 -17
- package/mcp-configs/academic-db/fhir-mcp.json +0 -19
- package/mcp-configs/academic-db/gget-mcp.json +0 -17
- package/mcp-configs/academic-db/gibs-mcp.json +0 -20
- package/mcp-configs/academic-db/gis-mcp-server.json +0 -22
- package/mcp-configs/academic-db/google-earth-engine-mcp.json +0 -21
- package/mcp-configs/academic-db/google-researcher-mcp.json +0 -17
- package/mcp-configs/academic-db/idea-reality-mcp.json +0 -17
- package/mcp-configs/academic-db/legiscan-mcp.json +0 -19
- package/mcp-configs/academic-db/lex.json +0 -17
- package/mcp-configs/academic-db/m4-clinical-mcp.json +0 -21
- package/mcp-configs/academic-db/medical-mcp.json +0 -21
- package/mcp-configs/academic-db/nexonco-mcp.json +0 -20
- package/mcp-configs/academic-db/omop-mcp.json +0 -20
- package/mcp-configs/academic-db/onekgpd-mcp.json +0 -20
- package/mcp-configs/academic-db/openedu-mcp.json +0 -20
- package/mcp-configs/academic-db/opengenes-mcp.json +0 -20
- package/mcp-configs/academic-db/openstax-mcp.json +0 -21
- package/mcp-configs/academic-db/openstreetmap-mcp.json +0 -21
- package/mcp-configs/academic-db/opentargets-mcp.json +0 -21
- package/mcp-configs/academic-db/pdb-mcp.json +0 -21
- package/mcp-configs/academic-db/smithsonian-mcp.json +0 -20
- package/mcp-configs/ai-platform/Adaptive-Graph-of-Thoughts-MCP-server.json +0 -17
- package/mcp-configs/ai-platform/ai-counsel.json +0 -17
- package/mcp-configs/ai-platform/atlas-mcp-server.json +0 -17
- package/mcp-configs/ai-platform/counsel-mcp.json +0 -17
- package/mcp-configs/ai-platform/cross-llm-mcp.json +0 -17
- package/mcp-configs/ai-platform/gptr-mcp.json +0 -17
- package/mcp-configs/ai-platform/magi-researchers.json +0 -21
- package/mcp-configs/ai-platform/mcp-academic-researcher.json +0 -22
- package/mcp-configs/ai-platform/open-paper-machine.json +0 -21
- package/mcp-configs/ai-platform/paper-intelligence.json +0 -21
- package/mcp-configs/ai-platform/paper-reader.json +0 -21
- package/mcp-configs/ai-platform/paperdebugger.json +0 -21
- package/mcp-configs/browser/decipher-research-agent.json +0 -17
- package/mcp-configs/browser/deep-research.json +0 -17
- package/mcp-configs/browser/everything-claude-code.json +0 -17
- package/mcp-configs/browser/exa-mcp.json +0 -20
- package/mcp-configs/browser/gpt-researcher.json +0 -17
- package/mcp-configs/browser/heurist-agent-framework.json +0 -17
- package/mcp-configs/browser/mcp-searxng.json +0 -21
- package/mcp-configs/browser/mcp-webresearch.json +0 -20
- package/mcp-configs/cloud-docs/confluence-mcp.json +0 -37
- package/mcp-configs/cloud-docs/google-drive-mcp.json +0 -35
- package/mcp-configs/cloud-docs/notion-mcp.json +0 -29
- package/mcp-configs/communication/discord-mcp.json +0 -29
- package/mcp-configs/communication/discourse-mcp.json +0 -21
- package/mcp-configs/communication/slack-mcp.json +0 -29
- package/mcp-configs/communication/telegram-mcp.json +0 -28
- package/mcp-configs/data-platform/4everland-hosting-mcp.json +0 -17
- package/mcp-configs/data-platform/automl-stat-mcp.json +0 -21
- package/mcp-configs/data-platform/context-keeper.json +0 -17
- package/mcp-configs/data-platform/context7.json +0 -19
- package/mcp-configs/data-platform/contextstream-mcp.json +0 -17
- package/mcp-configs/data-platform/email-mcp.json +0 -17
- package/mcp-configs/data-platform/jefferson-stats-mcp.json +0 -22
- package/mcp-configs/data-platform/mcp-excel-server.json +0 -21
- package/mcp-configs/data-platform/mcp-stata.json +0 -21
- package/mcp-configs/data-platform/mcpstack-jupyter.json +0 -21
- package/mcp-configs/data-platform/ml-mcp.json +0 -21
- package/mcp-configs/data-platform/nasdaq-data-link-mcp.json +0 -20
- package/mcp-configs/data-platform/numpy-mcp.json +0 -21
- package/mcp-configs/database/neo4j-mcp.json +0 -37
- package/mcp-configs/database/postgres-mcp.json +0 -28
- package/mcp-configs/database/sqlite-mcp.json +0 -29
- package/mcp-configs/dev-platform/geogebra-mcp.json +0 -21
- package/mcp-configs/dev-platform/github-mcp.json +0 -31
- package/mcp-configs/dev-platform/gitlab-mcp.json +0 -34
- package/mcp-configs/dev-platform/latex-mcp-server.json +0 -21
- package/mcp-configs/dev-platform/manim-mcp.json +0 -20
- package/mcp-configs/dev-platform/mcp-echarts.json +0 -20
- package/mcp-configs/dev-platform/panel-viz-mcp.json +0 -20
- package/mcp-configs/dev-platform/paperbanana.json +0 -20
- package/mcp-configs/dev-platform/texflow-mcp.json +0 -20
- package/mcp-configs/dev-platform/texmcp.json +0 -20
- package/mcp-configs/dev-platform/typst-mcp.json +0 -21
- package/mcp-configs/dev-platform/vizro-mcp.json +0 -20
- package/mcp-configs/email/email-mcp.json +0 -40
- package/mcp-configs/email/gmail-mcp.json +0 -37
- package/mcp-configs/note-knowledge/ApeRAG.json +0 -17
- package/mcp-configs/note-knowledge/In-Memoria.json +0 -17
- package/mcp-configs/note-knowledge/agent-memory.json +0 -17
- package/mcp-configs/note-knowledge/aimemo.json +0 -17
- package/mcp-configs/note-knowledge/biel-mcp.json +0 -19
- package/mcp-configs/note-knowledge/cognee.json +0 -17
- package/mcp-configs/note-knowledge/context-awesome.json +0 -17
- package/mcp-configs/note-knowledge/context-mcp.json +0 -17
- package/mcp-configs/note-knowledge/conversation-handoff-mcp.json +0 -17
- package/mcp-configs/note-knowledge/cortex.json +0 -17
- package/mcp-configs/note-knowledge/devrag.json +0 -17
- package/mcp-configs/note-knowledge/easy-obsidian-mcp.json +0 -17
- package/mcp-configs/note-knowledge/engram.json +0 -17
- package/mcp-configs/note-knowledge/gnosis-mcp.json +0 -17
- package/mcp-configs/note-knowledge/graphlit-mcp-server.json +0 -19
- package/mcp-configs/note-knowledge/local-faiss-mcp.json +0 -21
- package/mcp-configs/note-knowledge/mcp-memory-service.json +0 -21
- package/mcp-configs/note-knowledge/mcp-obsidian.json +0 -23
- package/mcp-configs/note-knowledge/mcp-ragdocs.json +0 -20
- package/mcp-configs/note-knowledge/mcp-summarizer.json +0 -21
- package/mcp-configs/note-knowledge/mediawiki-mcp.json +0 -21
- package/mcp-configs/note-knowledge/openzim-mcp.json +0 -20
- package/mcp-configs/note-knowledge/zettelkasten-mcp.json +0 -21
- package/mcp-configs/reference-mgr/academic-paper-mcp-http.json +0 -20
- package/mcp-configs/reference-mgr/academix.json +0 -20
- package/mcp-configs/reference-mgr/arxiv-cli.json +0 -17
- package/mcp-configs/reference-mgr/arxiv-research-mcp.json +0 -21
- package/mcp-configs/reference-mgr/arxiv-search-mcp.json +0 -17
- package/mcp-configs/reference-mgr/chiken.json +0 -17
- package/mcp-configs/reference-mgr/claude-scholar.json +0 -17
- package/mcp-configs/reference-mgr/devonthink-mcp.json +0 -17
- package/mcp-configs/reference-mgr/google-scholar-abstract-mcp.json +0 -19
- package/mcp-configs/reference-mgr/google-scholar-mcp.json +0 -20
- package/mcp-configs/reference-mgr/mcp-paperswithcode.json +0 -21
- package/mcp-configs/reference-mgr/mcp-scholarly.json +0 -20
- package/mcp-configs/reference-mgr/mcp-simple-arxiv.json +0 -20
- package/mcp-configs/reference-mgr/mcp-simple-pubmed.json +0 -20
- package/mcp-configs/reference-mgr/mcp-zotero.json +0 -21
- package/mcp-configs/reference-mgr/mendeley-mcp.json +0 -20
- package/mcp-configs/reference-mgr/ncbi-mcp-server.json +0 -22
- package/mcp-configs/reference-mgr/onecite.json +0 -21
- package/mcp-configs/reference-mgr/paper-search-mcp.json +0 -21
- package/mcp-configs/reference-mgr/pubmed-search-mcp.json +0 -21
- package/mcp-configs/reference-mgr/scholar-mcp.json +0 -21
- package/mcp-configs/reference-mgr/scholar-multi-mcp.json +0 -21
- package/mcp-configs/reference-mgr/seerai.json +0 -21
- package/mcp-configs/reference-mgr/semantic-scholar-fastmcp.json +0 -21
- package/mcp-configs/reference-mgr/sourcelibrary.json +0 -20
- package/mcp-configs/registry.json +0 -476
- package/mcp-configs/repository/dataverse-mcp.json +0 -33
- package/mcp-configs/repository/huggingface-mcp.json +0 -29
- package/skills/domains/business/xpert-bi-guide/SKILL.md +0 -84
- package/skills/domains/education/edumcp-guide/SKILL.md +0 -74
- package/skills/literature/search/paper-search-mcp-guide/SKILL.md +0 -107
- package/skills/research/automation/mcp-server-guide/SKILL.md +0 -211
- package/skills/tools/knowledge-graph/paperpile-notion-guide/SKILL.md +0 -84
- package/src/tools/semantic-scholar.ts +0 -66
package/skills/literature/discovery/semantic-scholar-recs-guide/SKILL.md

@@ -1,6 +1,6 @@
 ---
 name: semantic-scholar-recs-guide
-description: "
+description: "Paper discovery via recommendation APIs (OpenAlex, CrossRef citation networks)"
 metadata:
   openclaw:
     emoji: "🤖"
@@ -10,70 +10,72 @@ metadata:
     source: "wentor-research-plugins"
 ---
 
-# 
+# Paper Discovery via OpenAlex & CrossRef
 
-Leverage the 
+Leverage the OpenAlex and CrossRef APIs to discover related papers, traverse citation networks, and build comprehensive reading lists programmatically.
 
 ## Overview
 
-
+OpenAlex indexes over 250 million academic works and provides a free, no-key-required API that supports:
 
--
-- Recommendations based on positive and negative seed papers
+- Work search by title, keyword, or DOI
 - Citation and reference graph traversal
 - Author profiles and publication histories
--
+- Concept-based discovery across disciplines
+- Institutional and venue filtering
 
-Base URL: `https://api.
-
+Base URL: `https://api.openalex.org`
+CrossRef URL: `https://api.crossref.org`
 
-## 
+## Finding Related Papers
 
-
+Use OpenAlex's concept graph and citation data to discover related work from seed papers.
 
-### 
+### Concept-Based Discovery
 
 ```python
 import requests
 
-
+HEADERS = {"User-Agent": "ResearchPlugins/1.0 (https://wentor.ai)"}
+WORK_ID = "W2741809807"  # OpenAlex work ID
 
+# Get the seed paper's concepts
 response = requests.get(
-    f"https://api.
-
-        "fields": "title,authors,year,citationCount,abstract,externalIds",
-        "limit": 20
-    },
-    headers={"x-api-key": "YOUR_API_KEY"}  # optional, increases rate limit
+    f"https://api.openalex.org/works/{WORK_ID}",
+    headers=HEADERS
 )
-
-for 
-
+paper = response.json()
+concepts = [c["id"] for c in paper.get("concepts", [])[:3]]
+
+# Find works sharing the same concepts, sorted by citations
+for concept_id in concepts:
+    related = requests.get(
+        "https://api.openalex.org/works",
+        params={"filter": f"concepts.id:{concept_id}", "sort": "cited_by_count:desc", "per_page": 10},
+        headers=HEADERS
+    )
+    for w in related.json().get("results", []):
+        print(f"[{w.get('publication_year')}] {w.get('title')} (citations: {w.get('cited_by_count')})")
 ```
 
-### 
+### CrossRef Subject-Based Discovery
 
 ```python
 import requests
 
-
-    "
-
-    "
-
-
-
-]
-
-
-
-    "
-
-    params={"fields": "title,year,citationCount,url,abstract", "limit": 30}
-)
-
-results = response.json()["recommendedPapers"]
-print(f"Found {len(results)} recommended papers")
+def search_crossref(query, limit=10, sort="is-referenced-by-count"):
+    """Search CrossRef for papers sorted by citation count."""
+    resp = requests.get(
+        "https://api.crossref.org/works",
+        params={"query": query, "rows": limit, "sort": sort, "order": "desc"},
+        headers={"User-Agent": "ResearchPlugins/1.0 (https://wentor.ai; mailto:dev@wentor.ai)"}
+    )
+    return resp.json().get("message", {}).get("items", [])
+
+results = search_crossref("transformer attention mechanism")
+for w in results:
+    title = w.get("title", [""])[0] if w.get("title") else ""
+    print(f"  {title} — Cited by: {w.get('is-referenced-by-count', 0)}")
 ```
 
 ## Citation Network Traversal
@@ -83,48 +85,49 @@ Walk the citation graph to discover foundational and derivative works.
 ### Forward Citations (Who Cited This Paper?)
 
 ```python
-
+work_id = "W2741809807"
 
 response = requests.get(
-
+    "https://api.openalex.org/works",
     params={
-        "
-        "
-        "
-    }
+        "filter": f"cites:{work_id}",
+        "sort": "cited_by_count:desc",
+        "per_page": 20
+    },
+    headers=HEADERS
 )
 
-
-
-citations.sort(key=lambda x: x["citingPaper"]["citationCount"], reverse=True)
-for c in citations[:10]:
-    p = c["citingPaper"]
-    print(f"  [{p['year']}] {p['title']} ({p['citationCount']} cites)")
+for w in response.json().get("results", []):
+    print(f"  [{w.get('publication_year')}] {w.get('title')} ({w.get('cited_by_count')} cites)")
 ```
 
 ### Backward References (What Did This Paper Cite?)
 
 ```python
 response = requests.get(
-    f"https://api.
-
+    f"https://api.openalex.org/works/{work_id}",
+    headers=HEADERS
 )
+paper = response.json()
+ref_ids = paper.get("referenced_works", [])
 
-
-
+# Fetch details for referenced works
+for ref_id in ref_ids[:20]:
+    ref = requests.get(f"https://api.openalex.org/works/{ref_id.split('/')[-1]}", headers=HEADERS).json()
+    print(f"  [{ref.get('publication_year')}] {ref.get('title')} ({ref.get('cited_by_count')} cites)")
 ```
 
 ## Building a Reading List Pipeline
 
-Combine search, 
+Combine search, concept discovery, and citation traversal into a discovery pipeline:
 
 | Step | Method | Purpose |
 |------|--------|---------|
 | 1. Seed selection | Manual or keyword search | Identify 3-5 highly relevant papers |
-| 2. Expand via 
-| 3. Forward citation | 
-| 4. Backward citation | 
-| 5. Deduplicate | 
+| 2. Expand via concepts | OpenAlex concept graph | Find thematically related work |
+| 3. Forward citation | OpenAlex cites filter | Find recent derivative works |
+| 4. Backward citation | referenced_works field | Find foundational papers |
+| 5. Deduplicate | OpenAlex work ID matching | Remove duplicates across steps |
 | 6. Rank & filter | Sort by year, citations, relevance | Prioritize reading order |
 
 ```python
@@ -133,32 +136,46 @@ def build_reading_list(seed_ids, max_papers=50):
     seen = set()
     candidates = []
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+    for seed_id in seed_ids:
+        # Get concepts from seed paper
+        paper = requests.get(f"https://api.openalex.org/works/{seed_id}", headers=HEADERS).json()
+        concept_ids = [c["id"] for c in paper.get("concepts", [])[:2]]
+
+        # Find related works via concepts
+        for cid in concept_ids:
+            related = requests.get(
+                "https://api.openalex.org/works",
+                params={"filter": f"concepts.id:{cid}", "sort": "cited_by_count:desc", "per_page": 20},
+                headers=HEADERS
+            ).json().get("results", [])
+            for w in related:
+                wid = w.get("id", "").split("/")[-1]
+                if wid not in seen:
+                    seen.add(wid)
+                    candidates.append(w)
+
+        # Get citing works
+        citing = requests.get(
+            "https://api.openalex.org/works",
+            params={"filter": f"cites:{seed_id}", "sort": "cited_by_count:desc", "per_page": 20},
+            headers=HEADERS
+        ).json().get("results", [])
+        for w in citing:
+            wid = w.get("id", "").split("/")[-1]
+            if wid not in seen:
+                seen.add(wid)
+                candidates.append(w)
+
+    # Rank by citation count and recency
+    candidates.sort(key=lambda p: (p.get("publication_year", 0), p.get("cited_by_count", 0)), reverse=True)
     return candidates[:max_papers]
 ```
 
-## 
+## Best Practices
 
--
--
-- Always include only the fields you need to reduce payload size
-- Use `
+- OpenAlex is free with no API key required; use a polite `User-Agent` header
+- CrossRef requires a polite pool user agent with contact info for higher rate limits
+- Always include only the fields you need via `select` parameter to reduce payload size
+- Use `page` and `per_page` for pagination on large result sets
 - Cache responses locally to avoid redundant requests
-- Use DOI
+- Use DOI as the universal identifier for cross-system compatibility
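The `select` and pagination advice added to the Best Practices list can be sketched as a small helper. This is an illustration only: `paged_params`, `fetch_all`, and the injected `fetch` callable are hypothetical names introduced here, though `select`, `page`, and `per_page` are real OpenAlex query parameters.

```python
def paged_params(query_filter, select_fields, per_page=25):
    """Yield params dicts for successive OpenAlex result pages."""
    page = 1
    while True:
        yield {
            "filter": query_filter,
            "select": ",".join(select_fields),  # request only the fields you need
            "per_page": per_page,
            "page": page,
        }
        page += 1

def fetch_all(fetch, query_filter, select_fields, max_pages=3):
    """Collect results across pages until a page comes back empty."""
    works = []
    for params in paged_params(query_filter, select_fields):
        if params["page"] > max_pages:
            break
        batch = fetch(params).get("results", [])
        if not batch:
            break
        works.extend(batch)
    return works

# Demo with a stubbed fetch (no network): two pages of results, then an empty page.
_pages = {
    1: {"results": [{"id": "W1"}, {"id": "W2"}]},
    2: {"results": [{"id": "W3"}]},
    3: {"results": []},
}
works = fetch_all(lambda p: _pages[p["page"]], "concepts.id:C41008148", ["id", "display_name"])
print([w["id"] for w in works])  # ['W1', 'W2', 'W3']
```

In real use, `fetch` would wrap `requests.get("https://api.openalex.org/works", params=...).json()`; the stub keeps the pagination logic testable offline.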
package/skills/literature/fulltext/SKILL.md

@@ -1,14 +1,15 @@
 ---
 name: fulltext-skills
-description: "
+description: "16 full-text access skills. Trigger: accessing paper PDFs, bulk downloading, open access, text mining. Design: legal full-text retrieval from open repositories, archives, and preprint servers."
 ---
 
-# Full-Text Access — 
+# Full-Text Access — 16 Skills
 
 Select the skill matching the user's need, then `read` its SKILL.md.
 
 | Skill | Description |
 |-------|-------------|
+| [arxiv-latex-source](./arxiv-latex-source/SKILL.md) | Download and parse LaTeX source files from arXiv preprints |
 | [bioc-pmc-api](./bioc-pmc-api/SKILL.md) | Access PMC Open Access articles in BioC format for text mining |
 | [core-api-guide](./core-api-guide/SKILL.md) | Search and retrieve open access research papers via CORE aggregator |
 | [dataverse-api](./dataverse-api/SKILL.md) | Deposit and discover research datasets via Harvard Dataverse API |
@@ -0,0 +1,195 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: arxiv-latex-source
|
|
3
|
+
description: "Download and parse LaTeX source files from arXiv preprints"
|
|
4
|
+
metadata:
|
|
5
|
+
openclaw:
|
|
6
|
+
emoji: "📜"
|
|
7
|
+
category: "literature"
|
|
8
|
+
subcategory: "fulltext"
|
|
9
|
+
keywords: ["arXiv", "LaTeX source", "paper parsing", "formula extraction", "full text", "preprint"]
|
|
10
|
+
source: "https://info.arxiv.org/help/bulk_data_s3.html"
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# arXiv LaTeX Source Access Guide
|
|
14
|
+
|
|
15
|
+
## Overview
|
|
16
|
+
|
|
17
|
+
arXiv stores the original LaTeX source files for the vast majority of its 2.4 million+ preprints. Accessing LaTeX source provides major advantages over PDF parsing: exact mathematical notation as written by the author, structured sections and labels, machine-readable bibliography entries, and intact figure captions, table data, and cross-references.
|
|
18
|
+
|
|
19
|
+
For formula extraction, citation graph construction, section-level text analysis, or training data curation for scientific language models, LaTeX source is the gold standard. PDF parsing introduces OCR errors in equations, loses structural hierarchy, and mangles complex tables.
|
|
20
|
+
|
|
21
|
+
The e-print endpoint serves source bundles as gzip-compressed tarballs (`.tar.gz`) containing `.tex` files, figures, `.bib`/`.bbl` bibliography files, style files, and supplementary materials. No authentication is required.
|
|
22
|
+
|
|
23
|
+
## Authentication
|
|
24
|
+
|
|
25
|
+
No authentication or API key is required. The e-print endpoint is publicly accessible. However, arXiv asks that automated tools set a descriptive `User-Agent` header and comply with rate limits.
|
|
26
|
+
|
|
27
|
+
## Core Endpoints
|
|
28
|
+
|
|
29
|
+
### Download LaTeX Source
|
|
30
|
+
|
|
31
|
+
- **URL**: `GET https://arxiv.org/e-print/{arxiv_id}`
|
|
32
|
+
- **Response**: `application/gzip` — a `.tar.gz` archive containing the source files
|
|
33
|
+
- **Parameters**:
|
|
34
|
+
| Param | Type | Required | Description |
|
|
35
|
+
|-------|------|----------|-------------|
|
|
36
|
+
| arxiv_id | string | Yes | arXiv identifier, e.g. `2301.00001` or `2301.00001v2` for a specific version |
|
|
37
|
+
|
|
38
|
+
- **Example**:
|
|
39
|
+
```bash
|
|
40
|
+
# Download source archive (response: 200, application/gzip, ~1.3 MB)
|
|
41
|
+
curl -sL -o source.tar.gz "https://arxiv.org/e-print/2301.00001"
|
|
42
|
+
|
|
43
|
+
# List archive contents
|
|
44
|
+
tar tz -f source.tar.gz | head -10
|
|
45
|
+
# ACM-Reference-Format.bbx
|
|
46
|
+
# ACM-Reference-Format.bst
|
|
47
|
+
# Image_1.jpg
|
|
48
|
+
# README.txt
|
|
49
|
+
# acmart.cls
|
|
50
|
+
```
|
|
51
|
+
|
|
52
|
+
- **Content-Disposition header**: `attachment; filename="arXiv-2301.00001v1.tar.gz"`
|
|
53
|
+
- **ETag**: SHA-256 hash provided for caching: `sha256:f1ffe8ec...`
|
|
54
|
+
|
|
55
|
+
### Format Detection
|
|
56
|
+
|
|
57
|
+
The endpoint almost always returns a gzip-compressed tar archive. Rare cases (very old or single-file submissions) may return a single gzip-compressed `.tex` file without tar wrapper. Always verify format before extracting:
|
|
58
|
+
|
|
59
|
+
```bash
|
|
60
|
+
curl -sL "https://arxiv.org/e-print/{arxiv_id}" -o source.gz
|
|
61
|
+
file source.gz # "gzip compressed data, was 'XXXX.tar', ..."
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
### Metadata API (Companion)
|
|
65
|
+
|
|
66
|
+
Pair source downloads with the arXiv Atom API for structured metadata:
|
|
67
|
+
|
|
68
|
+
- **URL**: `GET https://export.arxiv.org/api/query?id_list={arxiv_id}`
|
|
69
|
+
- **Response**: Atom XML with `<title>`, `<author>`, `<summary>`, `<category>`, `<published>`
|
|
70
|
+
- **Example**: `curl -s "https://export.arxiv.org/api/query?id_list=2301.00001"`
|
|
71
|
+
|
|
72
|
+
## LaTeX Source Parsing Guide
|
|
73
|
+
|
|
74
|
+
### Locating the Main .tex File
|
|
75
|
+
|
|
76
|
+
A source archive typically contains multiple files. To find the main document:
|
|
77
|
+
|
|
78
|
+
1. Look for `\documentclass` in `.tex` files — this marks the root document
|
|
79
|
+
2. Check for a `README.txt` that may specify the main file
|
|
80
|
+
3. If multiple `.tex` files contain `\documentclass`, prefer the one with `\begin{document}`
|
|
81
|
+
|
|
82
|
+
```python
import tarfile, re

def find_main_tex(tar_path):
    with tarfile.open(tar_path, 'r:gz') as tar:
        tex_files = [m for m in tar.getmembers() if m.name.endswith('.tex')]
        for member in tex_files:
            content = tar.extractfile(member).read().decode('utf-8', errors='ignore')
            if r'\documentclass' in content and r'\begin{document}' in content:
                return member.name, content
    return None, None
```

### Extracting Sections

LaTeX sections follow a predictable hierarchy:

```python
import re

def extract_sections(tex_content):
    pattern = r'\\(section|subsection|subsubsection)\{([^}]+)\}'
    sections = re.findall(pattern, tex_content)
    return [(level, title) for level, title in sections]

# [('section', 'Introduction'), ('section', 'Related Work'), ...]
```

### Extracting Equations

```python
def extract_equations(tex_content):
    patterns = [
        r'\\\[(.+?)\\\]',
        r'\\begin\{equation\}(.+?)\\end\{equation\}',
        r'\\begin\{align\*?\}(.+?)\\end\{align\*?\}',
    ]
    equations = []
    for pat in patterns:
        equations.extend(re.findall(pat, tex_content, re.DOTALL))
    return equations
```

### Extracting Bibliography

Parse `.bib` files (BibTeX entries) or `.bbl` files (compiled `\bibitem` commands):

```python
def extract_bibliography(tar_path):
    refs = []
    with tarfile.open(tar_path, 'r:gz') as tar:
        for member in tar.getmembers():
            if member.name.endswith('.bib'):
                content = tar.extractfile(member).read().decode('utf-8', errors='ignore')
                refs.extend(re.findall(r'@\w+\{([^,]+),(.+?)\n\}', content, re.DOTALL))
            elif member.name.endswith('.bbl'):
                content = tar.extractfile(member).read().decode('utf-8', errors='ignore')
                refs.extend(re.findall(r'\\bibitem.*?\{(.+?)\}', content))
    return refs
```

## Rate Limits

- **Maximum**: 4 requests per second for automated access
- **Recommended**: 1 request/second with delays between sequential downloads
- **Bulk access**: For 1000+ papers, use the arXiv S3 bulk data mirror instead
- **HTTP 429**: Rate limit exceeded; implement exponential backoff
- **User-Agent**: Required — set a descriptive string: `MyTool/1.0 (mailto:user@university.edu)`
- Persistent abuse may result in IP-level blocks

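The backoff rule above can be sketched with the standard library. The `polite_get` name, retry count, and delay values are illustrative choices, not arXiv-mandated parameters:

```python
import time
import urllib.request
from urllib.error import HTTPError

def polite_get(url, max_retries=5, base_delay=1.0, opener=urllib.request.urlopen):
    """Fetch with a fixed inter-request delay and exponential backoff on HTTP 429."""
    req = urllib.request.Request(
        url, headers={"User-Agent": "ResearchTool/1.0 (mailto:user@example.com)"})
    for attempt in range(max_retries):
        try:
            data = opener(req).read()
            time.sleep(base_delay)                 # stay near 1 request/second
            return data
        except HTTPError as e:
            if e.code != 429:
                raise
            time.sleep(base_delay * 2 ** attempt)  # back off: base_delay * 1, 2, 4, ...
    raise RuntimeError(f"rate-limited after {max_retries} retries: {url}")
```

The injectable `opener` keeps the retry logic testable without touching the network.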
## Academic Use Cases

- **Formula extraction for ML training** — Build equation datasets with ground-truth LaTeX notation, free of OCR noise from PDF parsing
- **Citation network analysis** — Parse `.bib`/`.bbl` files for exact reference keys to construct citation graphs
- **Section-level text analysis** — Extract specific sections (e.g., all "Related Work" across a subfield) for systematic reviews
- **Reproducibility auditing** — Examine algorithm environments, hyperparameter tables, and methodology sections
- **Cross-paper notation alignment** — Compare and normalize equation environments across papers in a subfield

## Complete Python Example

```python
import requests, tarfile, io, re, time, gzip

def download_arxiv_source(arxiv_id, delay=1.0):
    """Download and extract all .tex files from an arXiv paper's source."""
    url = f"https://arxiv.org/e-print/{arxiv_id}"
    headers = {"User-Agent": "ResearchTool/1.0 (mailto:user@example.com)"}
    resp = requests.get(url, headers=headers)
    resp.raise_for_status()
    time.sleep(delay)

    buf = io.BytesIO(resp.content)
    try:
        with tarfile.open(fileobj=buf, mode='r:gz') as tar:
            return {m.name: tar.extractfile(m).read().decode('utf-8', errors='ignore')
                    for m in tar.getmembers() if m.name.endswith('.tex') and m.isfile()}
    except tarfile.ReadError:
        buf.seek(0)
        return {"main.tex": gzip.decompress(buf.read()).decode('utf-8', errors='ignore')}

# Usage
sources = download_arxiv_source("2301.00001")
for fname, content in sources.items():
    if r'\documentclass' in content:
        sections = re.findall(r'\\section\{([^}]+)\}', content)
        equations = re.findall(r'\\begin\{equation\}(.+?)\\end\{equation\}', content, re.DOTALL)
        print(f"{fname}: {len(sections)} sections, {len(equations)} equations")
```

## References

- arXiv e-print access: https://info.arxiv.org/help/bulk_data_s3.html
- arXiv API documentation: https://info.arxiv.org/help/api/index.html
- arXiv terms of use: https://info.arxiv.org/help/api/tou.html
- arXiv S3 bulk data: https://info.arxiv.org/help/bulk_data_s3.html

@@ -84,7 +84,7 @@ else:
 | SSRN | Preprint server | Social sciences, law, economics | ssrn.com |
 | Zenodo | Repository | All disciplines | zenodo.org |
 | CORE | Aggregator | 300M+ papers from repositories | core.ac.uk |
-
+| OpenAlex | Search + OA links | Cross-disciplinary | openalex.org |
 | BASE (Bielefeld) | Aggregator | 400M+ documents | base-search.net |
 
 ### Batch OA Lookup

@@ -93,11 +93,11 @@ Unpaywall / OpenAlex:
 - Use: Find OA versions of any DOI
 - Best for: Locating freely available versions of papers
 
-
-- Coverage:
-- Access: Free API,
-- Features:
-- Best for:
+OpenAlex:
+- Coverage: 250M+ works, all disciplines
+- Access: Free API, no key required
+- Features: Concepts, citation counts, author profiles, institution data
+- Best for: Cross-disciplinary metadata and OA discovery
 ```
 
 ## Full-Text Retrieval and Parsing

@@ -49,7 +49,7 @@ Whether you are conducting a systematic literature review, mapping a new researc
 
 | Source | Coverage | API | Cost |
 |--------|----------|-----|------|
-
+| OpenAlex | 250M+ works, all disciplines | REST API, free | Free (no key required) |
 | OpenAlex | 250M+ works, all disciplines | REST API, free | Free |
 | Crossref | 140M+ DOIs | REST API | Free |
 | Web of Science | Curated, multi-disciplinary | Institutional | Licensed |

@@ -219,7 +219,7 @@ Traditional citations take years to accumulate. Altmetrics capture immediate att
 
 ## Best Practices
 
-- **Combine multiple data sources.** No single database has complete coverage. Merge OpenAlex and
+- **Combine multiple data sources.** No single database has complete coverage. Merge OpenAlex and CrossRef for best results.
 - **Normalize by field and age.** A 2024 paper in biology and a 2024 paper in mathematics have very different citation rate baselines.
 - **Use relative indicators.** Field-Weighted Citation Impact (FWCI) accounts for disciplinary differences.
 - **Do not equate citations with quality.** Retracted papers sometimes have high citation counts. Controversial papers accumulate criticism citations.

@@ -229,7 +229,7 @@ Traditional citations take years to accumulate. Altmetrics capture immediate att
 ## References
 
 - [OpenAlex API](https://docs.openalex.org/) -- Free, open bibliographic data
-- [
+- [CrossRef API](https://api.crossref.org/) -- DOI resolution and metadata
 - [VOSviewer](https://www.vosviewer.com/) -- Bibliometric visualization tool
 - [bibliometrix R package](https://www.bibliometrix.org/) -- Comprehensive bibliometric analysis
 - [Altmetric](https://www.altmetric.com/) -- Alternative impact metrics

@@ -115,33 +115,6 @@ for source in results:
 
 Google Scholar profiles automatically display h-index and i10-index. No calculation needed, but coverage is the broadest (includes non-peer-reviewed sources).
 
-### From Semantic Scholar API
-
-```python
-def get_author_h_index(author_name):
-    """Calculate h-index for an author using Semantic Scholar."""
-    # Search for author
-    search_resp = requests.get(
-        "https://api.semanticscholar.org/graph/v1/author/search",
-        params={"query": author_name, "limit": 1}
-    )
-    authors = search_resp.json().get("data", [])
-    if not authors:
-        return None
-
-    author_id = authors[0]["authorId"]
-
-    # Get all papers with citation counts
-    papers_resp = requests.get(
-        f"https://api.semanticscholar.org/graph/v1/author/{author_id}/papers",
-        params={"fields": "citationCount", "limit": 1000}
-    )
-    papers = papers_resp.json().get("data", [])
-    citation_counts = [p.get("citationCount", 0) for p in papers]
-
-    return calculate_h_index(citation_counts)
-```
-
 ### From OpenAlex
 
 ```python

@@ -1,9 +1,9 @@
 ---
 name: search-skills
-description: "
+description: "31 database search skills. Trigger: finding papers, search strategies, querying academic databases. Design: one skill per database/tool with API details, query syntax, and rate limits."
 ---
 
-# Database Search —
+# Database Search — 31 Skills
 
 Select the skill matching the user's need, then `read` its SKILL.md.
 

@@ -33,11 +33,10 @@ Select the skill matching the user's need, then `read` its SKILL.md.
 | [open-semantic-search-guide](./open-semantic-search-guide/SKILL.md) | Self-hosted semantic search and text mining platform |
 | [openaire-api](./openaire-api/SKILL.md) | Search EU-funded research outputs via the OpenAIRE Graph API |
 | [openalex-api](./openalex-api/SKILL.md) | Query the OpenAlex catalog of scholarly works, authors, and institutions |
-| [paper-search-mcp-guide](./paper-search-mcp-guide/SKILL.md) | MCP server for searching papers across arXiv, PubMed, bioRxiv |
 | [plos-open-access-api](./plos-open-access-api/SKILL.md) | Search PLOS open access journals with full-text Solr-powered API |
 | [pubmed-api](./pubmed-api/SKILL.md) | Search biomedical literature and retrieve records via PubMed E-utilities |
 | [scielo-api](./scielo-api/SKILL.md) | Access Latin American and developing world research via SciELO API |
-| [semantic-scholar-api](./semantic-scholar-api/SKILL.md) | Search papers and analyze citation graphs via
+| [semantic-scholar-api](./semantic-scholar-api/SKILL.md) | Search papers and analyze citation graphs via OpenAlex and CrossRef APIs |
 | [share-research-api](./share-research-api/SKILL.md) | Discover open access research outputs via the SHARE notification API |
 | [systematic-search-strategy](./systematic-search-strategy/SKILL.md) | Construct rigorous systematic search strategies for literature reviews |
 | [worldcat-search-api](./worldcat-search-api/SKILL.md) | Search the world's largest library catalog via OCLC WorldCat API |