@fbraza/pi-cite 0.2.0 → 0.3.0

package/README.md CHANGED
@@ -1,24 +1,22 @@
1
1
  # @fbraza/pi-cite
2
2
 
3
3
  A standalone [Pi](https://pi.dev) extension providing literature-research tools for
4
- academic workflows. Registers four tools callable by the agent:
4
+ academic workflows. Registers two tools callable by the agent:
5
5
 
6
- - **`literature_search`** — PubMed-first search with optional Semantic Scholar
7
- supplementary metadata.
6
+ - **`literature_search`** — literature workflow search against PubMed using a
7
+ PubMed-ready query (MeSH `[mh]`, `[tiab]`, `[pt]`, substance `[nm]`, and Boolean
8
+ logic), with streaming progress and deduplicated results.
8
9
  - **`pubmed_search`** — direct PubMed query (MeSH, `[tiab]`, `[pt]`, etc.).
9
- - **`fetch_fulltext`** — retrieve a paper PDF via PMC → publisher OA → fallback.
10
- - (`semantic_scholar` helper used internally by the search tools.)
11
10
 
12
11
  ## Bundled skill
13
12
 
14
13
  Ships with the **`literature`** skill (`skills/literature/`), which turns these
15
- tools into an end-to-end review workflow: verified-citation search, full-text
16
- retrieval, per-paper experiment extraction, and a structured hypothesis
17
- synthesis. Its frontmatter declares `allowed-tools` covering the extension's
18
- tools above, so the skill and extension are paired on purpose.
14
+ tools into an end-to-end review workflow: verified-citation search, per-paper
15
+ experiment extraction, and a structured hypothesis synthesis. Its frontmatter
16
+ declares `allowed-tools` covering the extension's tools above, so the skill and
17
+ extension are paired on purpose.
19
18
 
20
- - `references/` — PubMed/Semantic Scholar query syntax, API reference, and
21
- full-text access routines.
19
+ - `references/` — PubMed query syntax, API reference, and common queries.
22
20
  - `scripts/` — Python helpers (`extract_experiments.py`, `synthesis.py`,
23
21
  `generate_table.py`, `export_all.py`) invoked by the skill.
24
22
 
@@ -54,4 +52,3 @@ npm run pack:check # preview the published tarball contents
54
52
  | Variable | Purpose |
55
53
  |---|---|
56
54
  | `NCBI_API_KEY` / `api_key` env | PubMed rate limit + E-utilities auth |
57
- | `SEMANTIC_SCHOLAR_API_KEY` | Enables Semantic Scholar supplementary search |
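After this change `NCBI_API_KEY` is the only documented credential. The sketch below shows one common way such a key is attached to an NCBI E-utilities `esearch` request. It is not the package's own request code, which this diff does not show; `build_esearch_url` is a hypothetical helper name, and reading only `NCBI_API_KEY` (not the `api_key` variable also listed in the table) is a simplifying assumption.

```python
import os
from typing import Optional
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for searching a database.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, max_results: int = 20, api_key: Optional[str] = None) -> str:
    """Hypothetical helper: build a PubMed esearch URL, adding api_key only when set."""
    params = {
        "db": "pubmed",
        "term": term,
        "retmax": str(max_results),
        "retmode": "json",
    }
    if api_key:
        params["api_key"] = api_key
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Treat an empty or whitespace-only variable the same as an unset one.
    key = os.environ.get("NCBI_API_KEY", "").strip() or None
    print(build_esearch_url('"Lung Transplantation"[mh] AND "Review"[pt]', api_key=key))
```

Without a key the request still works; NCBI documents a lower per-second request allowance for unauthenticated callers than for callers that send `api_key`.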
package/package.json CHANGED
@@ -1,7 +1,7 @@
1
1
  {
2
2
  "name": "@fbraza/pi-cite",
3
- "version": "0.2.0",
4
- "description": "Pi extension with PubMed, Semantic Scholar, literature search, and full-text retrieval tools.",
3
+ "version": "0.3.0",
4
+ "description": "Pi extension with PubMed and literature search tools.",
5
5
  "license": "MIT",
6
6
  "type": "module",
7
7
  "files": [
@@ -13,8 +13,7 @@
13
13
  "pi-package",
14
14
  "pi-extension",
15
15
  "literature",
16
- "pubmed",
17
- "semantic-scholar"
16
+ "pubmed"
18
17
  ],
19
18
  "pi": {
20
19
  "extensions": [
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  name: literature
3
- description: Unified literature search, verification, full-text retrieval, and synthesis workflow for scientific questions. Use when any biological claim needs a verified citation, when reviewing a gene/pathway/disease/drug/target, when surveying preclinical evidence for a target in a disease, when checking novelty, when retrieving full text for specific papers, or when turning a paper set into a structured hypothesis synthesis.
4
- allowed-tools: Read, Write, WebFetch, WebSearch, literature_search, pubmed_search, semantic_scholar_search, fetch_fulltext
3
+ description: Unified literature search, verification, and synthesis workflow for scientific questions. Use when any biological claim needs a verified citation, when reviewing a gene/pathway/disease/drug/target, when surveying preclinical evidence for a target in a disease, when checking novelty, or when turning a paper set into a structured hypothesis synthesis.
4
+ allowed-tools: Read, Write, WebFetch, WebSearch, literature_search, pubmed_search
5
5
  starting-prompt: Conduct a literature review on my research topic with verified citations, structured synthesis, and a per-paper summary table.
6
6
  ---
7
7
 
@@ -16,7 +16,6 @@ Use this skill when you need to:
16
16
  - review literature on a gene, pathway, disease, drug, or molecular target
17
17
  - survey preclinical evidence for a target in a disease context
18
18
  - check whether a finding appears novel or already published
19
- - retrieve full text or PDFs for key papers
20
19
  - synthesize a paper set into hypotheses, contradictions, and evidence-weighted conclusions
21
20
 
22
21
  Do not use this skill for:
@@ -30,7 +29,6 @@ Do not use this skill for:
30
29
  - Never fabricate PMIDs, DOIs, titles, journals, years, or author lists.
31
30
  - Distinguish human, animal, and in vitro evidence.
32
31
  - Weight evidence quality by study design and replication.
33
- - Record how full text was obtained for each paper.
34
32
  - Use inline numbered citations like `[1]` or `[1, 2]` in narrative synthesis.
35
33
  - Never overwrite outputs from a previous literature search.
36
34
  - Never write literature-review outputs directly to generic shared paths under `results/`.
@@ -48,26 +46,29 @@ Always clarify:
48
46
 
49
47
  ### Step 2 — Create a dedicated output folder
50
48
 
51
- For every new literature review or literature research task, create a new dedicated folder inside `results/` before generating files.
49
+ For every new literature review or literature research task, create a new dedicated folder under `results/literature_review/` before generating files.
52
50
 
53
- The folder name must describe the search session/topic clearly, for example:
54
- - `results/literature_multiomics_ML_biomarkers_PGD/`
55
- - `results/literature_siRNA_lung_transplant_new_treatments/`
51
+ Use the path `results/literature_review/<subject_of_study>/`, where `<subject_of_study>` is a short **snake_case title summary of the theme** of the literature search. Derive it from the scope clarified in Step 1: lower case, words separated by single underscores, no spaces, hyphens, or punctuation. For example, a review on **trained immunity in transplantation** becomes:
56
52
 
57
- All generated files for that search session must be saved inside this dedicated folder, including:
53
+ - `results/literature_review/trained_immunity_in_transplantation/`
54
+
55
+ Other examples:
56
+ - `results/literature_review/sirna_lung_transplant_new_treatments/`
57
+ - `results/literature_review/multiomics_ml_biomarkers_in_pgd/`
58
+
59
+ All generated files for that search session must be saved inside this dedicated subject folder, including:
58
60
  - `literature_report.md`
59
61
  - `paper_summary_table.csv`
60
62
  - `search_log.md`
61
- - `pdfs/`
62
63
  - any optional analysis/export artifacts such as `analysis_object.pkl`
63
64
 
64
- Never write directly to generic shared paths such as:
65
+ Never write outputs directly to the parent folder or to the `results/` root, for example:
66
+ - `results/literature_review/literature_report.md`
67
+ - `results/literature_review/paper_summary_table.csv`
68
+ - `results/literature_review/analysis_object.pkl`
65
69
  - `results/literature_report.md`
66
- - `results/paper_summary_table.csv`
67
- - `results/analysis_object.pkl`
68
- - `results/literature_pdfs/`
69
70
 
70
- If a folder for a previous search already exists, create a new folder with a distinct descriptive search-session title rather than using versioned filenames.
71
+ If a folder for a previous search on the same subject already exists, create a new folder with a distinct descriptive `<subject_of_study>` title rather than using versioned filenames.
71
72
 
72
73
  At the end of the task, clearly report the exact output folder and generated file paths to the user.
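The naming rule in Step 2 can be sketched in a few lines of Python. This is an illustration of the rule as written, not code shipped in the package; `subject_slug` and `review_folder` are hypothetical names, and dropping every character outside `a-z0-9` (including accented letters) is an assumption.

```python
import re
from pathlib import Path


def subject_slug(title: str) -> str:
    """Lower-case the title and join its alphanumeric runs with single underscores."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    if not words:
        raise ValueError("title must contain at least one letter or digit")
    return "_".join(words)


def review_folder(title: str, root: Path = Path("results/literature_review")) -> Path:
    """Return the subject folder path, refusing to reuse an existing folder."""
    candidate = root / subject_slug(title)
    if candidate.exists():
        # The skill asks for a distinct descriptive title, not a versioned name.
        raise FileExistsError(f"{candidate} already exists; pick a more specific subject title")
    return candidate


if __name__ == "__main__":
    print(review_folder("Trained immunity in transplantation"))
```

Raising on an existing folder, instead of appending a suffix such as `_v2`, follows the instruction to choose a new descriptive `<subject_of_study>` title when a previous search on the same subject already exists.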
73
74
 
@@ -79,7 +80,6 @@ Use the custom literature tool as the primary search path:
79
80
  When calling `literature_search`:
80
81
  - Always construct `pubmed_query` using PubMed-specific syntax from the references below.
81
82
  - Use MeSH terms (`[mh]` / `[majr]`), title/abstract terms (`[tiab]`), publication types (`[pt]`), substance names (`[nm]`), date filters, and Boolean logic as appropriate.
82
- - Construct `semantic_scholar_query` separately as broader natural-language search terms when useful. Semantic Scholar is used automatically as supplementary search only when `SEMANTIC_SCHOLAR_API_KEY` is configured.
83
83
  - Do not pass a generic natural-language query as `pubmed_query` when a PubMed/MeSH query can be constructed.
84
84
 
85
85
  These extension tools are the preferred search path for this skill. Do not fall back to generic `WebFetch` / `WebSearch` first when one of these typed tools fits the task.
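As an illustration of the `pubmed_query` guidance above, the following sketch assembles a query from field-tagged terms and Boolean operators. The helper names (`tagged`, `any_of`, `all_of`, `without`) are hypothetical and not part of the extension; the example topic is only for demonstration.

```python
def tagged(term: str, tag: str) -> str:
    """Quote a term and attach a PubMed field tag such as mh, tiab, pt, or nm."""
    return f'"{term}"[{tag}]'


def any_of(*clauses: str) -> str:
    """Join clauses with OR inside parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    """Join clauses with AND."""
    return " AND ".join(clauses)


def without(base: str, excluded: str) -> str:
    """Exclude records matching one clause from a base query."""
    return f"({base}) NOT {excluded}"


# Pair each MeSH heading with a title/abstract synonym so that records
# not yet indexed with MeSH terms can still match.
query = without(
    all_of(
        any_of(tagged("Lung Transplantation", "mh"), tagged("lung transplant", "tiab")),
        any_of(tagged("RNA, Small Interfering", "mh"), tagged("siRNA", "tiab")),
    ),
    tagged("Review", "pt"),
)

if __name__ == "__main__":
    print(query)
```

The resulting string is what would be passed as `pubmed_query`; a plain phrase such as "siRNA for lung transplant" is the kind of generic input the guidance advises against.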
@@ -88,29 +88,15 @@ Read these references before constructing queries:
88
88
  - `references/pubmed_routine.md`
89
89
  - `references/pubmed_search_syntax.md`
90
90
  - `references/pubmed_common_queries.md`
91
- - `references/semanticscholar_routine.md`
92
91
 
93
92
  ### Step 4 — Screen and prioritise
94
93
 
95
- - Deduplicate across PubMed and Semantic Scholar sources.
96
- - Prioritise by relevance, recency, citation count, and study type.
94
+ - Deduplicate PubMed results.
95
+ - Prioritise by relevance, recency, and study type.
97
96
  - Default to deep reading of the top 20 papers unless the user asks otherwise.
98
97
  - For preclinical requests, keep studies with experimental target perturbation evidence.
99
98
 
100
- ### Step 5 — Retrieve full text
101
-
102
- Use `fetch_fulltext` for top papers. Prefer it over ad-hoc `WebFetch` PDF retrieval because it applies the defined PMC → publisher OA → Sci-Hub chain.
103
-
104
- Access chain:
105
- 1. PMC
106
- 2. publisher open-access page
107
- 3. Sci-Hub fallback
108
-
109
- Read:
110
- - `references/full-text-access-guide.md`
111
- - `references/scihub_routine.md`
112
-
113
- ### Step 6 — Synthesis
99
+ ### Step 5 — Synthesis
114
100
 
115
101
  Always produce:
116
102
  1. a narrative synthesis with inline numbered citations
@@ -179,14 +165,13 @@ After reviewing the core paper set, optionally produce:
179
165
 
180
166
  ## Expected files
181
167
 
182
- Typical outputs must be placed in a dedicated search-session folder under `./results/`, for example `./results/literature_<descriptive_topic>/`:
168
+ Typical outputs must be placed in a dedicated subject folder under `./results/literature_review/`, for example `./results/literature_review/<subject_of_study>/`:
183
169
  - `literature_report.md`
184
170
  - `paper_summary_table.csv`
185
171
  - `search_log.md`
186
- - `pdfs/`
187
172
  - optional `analysis_object.pkl` or other export artifacts when produced
188
173
 
189
- Do not write these outputs directly to `./results/` or reuse a previous search folder.
174
+ Do not write these outputs directly to `./results/literature_review/` or to `./results/`, and do not reuse a previous subject folder.
190
175
 
191
176
  ## Companion references
192
177
 
@@ -194,10 +179,7 @@ Do not write these outputs directly to `./results/` or reuse a previous search f
194
179
  - `references/pubmed_routine.md`
195
180
  - `references/pubmed_search_syntax.md`
196
181
  - `references/pubmed_common_queries.md`
197
- - `references/semanticscholar_routine.md`
198
182
  - `references/preclinical-extraction-guide.md`
199
- - `references/full-text-access-guide.md`
200
- - `references/scihub_routine.md`
201
183
 
202
184
  ## Companion scripts
203
185
 
@@ -205,4 +187,3 @@ Do not write these outputs directly to `./results/` or reuse a previous search f
205
187
  - `scripts/synthesis.py`
206
188
  - `scripts/generate_table.py`
207
189
  - `scripts/export_all.py`
208
- - `scripts/scihub_pdf_resolver.py`
@@ -173,7 +173,7 @@ The `experiment_extraction.csv` file contains one row per paper with these colum
173
173
  ### 1. Abstract-only extraction
174
174
  The script only reads abstracts, not full text. Papers that describe experiments only in the methods/results sections (not the abstract) will be misclassified as "unclassified".
175
175
 
176
- **Mitigation:** Step 5 of the workflow (full-text enrichment) addresses this for top papers.
176
+ **Mitigation:** No full-text enrichment step is currently available; papers whose experiments appear only in the methods/results sections may be misclassified as "unclassified".
177
177
 
178
178
  ### 2. Keyword sensitivity
179
179
  - **False positives:** A paper mentioning "mouse model" in the introduction (not as an experiment performed) may be classified as in_vivo.
@@ -24,8 +24,6 @@ def _identifier(paper: Dict) -> str:
24
24
  return f"PMID:{paper['pmid']}"
25
25
  if paper.get("doi"):
26
26
  return paper["doi"]
27
- if paper.get("s2_id"):
28
- return paper["s2_id"]
29
27
  return "NA"
30
28
 
31
29
 
@@ -51,7 +49,7 @@ def build_table_rows(papers: List[Dict], experiments: List[Dict] | None = None,
51
49
  "#": idx,
52
50
  "PMID/DOI": _identifier(paper),
53
51
  "Authors (year)": _authors_year(paper),
54
- "Key Message": _truncate(paper.get("tldr") or paper.get("title") or ""),
52
+ "Key Message": _truncate(paper.get("title") or ""),
55
53
  "Key Results": _truncate(paper.get("abstract") or exp.get("key_findings") or ""),
56
54
  "Key Methods": _truncate(
57
55
  "; ".join(filter(None, [
@@ -35,15 +35,16 @@ def classify_study_type(paper: Dict) -> str:
35
35
 
36
36
  def classify_evidence_quality(paper: Dict) -> str:
37
37
  study_type = classify_study_type(paper)
38
- citation_count = int(paper.get("citation_count") or 0)
39
38
  if study_type in {"Systematic review / meta-analysis", "Randomized controlled trial"}:
40
39
  return "High"
41
40
  if study_type in {"Clinical study", "In vitro + in vivo"}:
42
41
  return "Moderate"
43
42
  if paper.get("is_preprint"):
44
43
  return "Preliminary (preprint)"
45
- if study_type in {"In vivo", "In vitro"}:
46
- return "Moderate" if citation_count >= 20 else "Low to moderate"
44
+ if study_type == "In vivo":
45
+ return "Moderate"
46
+ if study_type == "In vitro":
47
+ return "Preliminary"
47
48
  return "Preliminary"
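The post-change grading no longer depends on `citation_count`. The standalone sketch below restates the new branch order so it can be exercised directly. It assumes the study-type label is already known; the real function derives it through `classify_study_type`, whose body is not part of this hunk. `evidence_quality` is a hypothetical name.

```python
HIGH = {"Systematic review / meta-analysis", "Randomized controlled trial"}
MODERATE = {"Clinical study", "In vitro + in vivo"}


def evidence_quality(study_type: str, is_preprint: bool = False) -> str:
    """Grade evidence using the same branch order as the updated function."""
    if study_type in HIGH:
        return "High"
    if study_type in MODERATE:
        return "Moderate"
    # The preprint check comes before the in vivo branch, so a preprint
    # in vivo study is graded as a preprint.
    if is_preprint:
        return "Preliminary (preprint)"
    if study_type == "In vivo":
        return "Moderate"
    # In vitro work and anything unrecognised fall through to the lowest grade.
    return "Preliminary"


if __name__ == "__main__":
    for label in ("Randomized controlled trial", "In vivo", "In vitro"):
        print(label, "->", evidence_quality(label))
```

In the updated source the explicit `"In vitro"` branch returns the same value as the final fall-through, so the sketch folds the two together. The former `"Low to moderate"` grade is no longer returned by this function.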
48
49
 
49
50
 
package/src/index.ts CHANGED
@@ -1,12 +1,8 @@
1
1
  import type { ExtensionAPI } from "@earendil-works/pi-coding-agent";
2
- import { registerFetchFulltextTool } from "./fulltext.ts";
3
2
  import { registerLiteratureSearchTool } from "./literature-search.ts";
4
3
  import { registerPubmedSearchTool } from "./pubmed.ts";
5
- import { registerSemanticScholarSearchTool } from "./semantic-scholar.ts";
6
4
 
7
5
  export default function literatureToolsExtension(pi: ExtensionAPI) {
8
6
  registerLiteratureSearchTool(pi);
9
7
  registerPubmedSearchTool(pi);
10
- registerSemanticScholarSearchTool(pi);
11
- registerFetchFulltextTool(pi);
12
8
  }
@@ -7,7 +7,6 @@ import {
7
7
  type LiteratureSearchDisplayEvent,
8
8
  type LiteratureSearchDisplaySearch,
9
9
  } from "./rendering.ts";
10
- import { searchSemanticScholar } from "./semantic-scholar.ts";
11
10
  import { formatPaperText, normalizeDoi, unique } from "./shared.ts";
12
11
  import { emitProgress, textResult, type TextToolUpdate } from "./tool-output.ts";
13
12
  import type { PaperRecord } from "./types.ts";
@@ -17,12 +16,6 @@ export const LITERATURE_SEARCH_PARAMS = Type.Object({
17
16
  description:
18
17
  "PubMed-ready query using PubMed syntax such as MeSH [mh], title/abstract [tiab], publication type [pt], substance [nm], and Boolean logic.",
19
18
  }),
20
- semantic_scholar_query: Type.Optional(
21
- Type.String({
22
- description:
23
- "Optional natural-language Semantic Scholar query for supplementary search. If omitted and Semantic Scholar is configured, a simplified query is derived from pubmed_query.",
24
- }),
25
- ),
26
19
  max_results: Type.Optional(
27
20
  Type.Number({ description: "Maximum results per provider (default 20)" }),
28
21
  ),
@@ -51,27 +44,11 @@ export type LiteratureSearchResult = {
51
44
  papers: PaperRecord[];
52
45
  providers: {
53
46
  pubmed: ProviderExecution;
54
- semantic_scholar: ProviderExecution;
55
47
  };
56
48
  searches: LiteratureSearchDisplaySearch[];
57
49
  events: LiteratureSearchDisplayEvent[];
58
50
  };
59
51
 
60
- function firstYear(value?: string): number | undefined {
61
- const match = value?.match(/^(\d{4})/);
62
- return match?.[1] ? Number(match[1]) : undefined;
63
- }
64
-
65
- export function simplifyPubmedQueryForSemanticScholar(query: string): string {
66
- const simplified = query
67
- .replace(/\[[^\]]+\]/g, " ")
68
- .replace(/\b(?:AND|OR|NOT)\b/gi, " ")
69
- .replace(/[()"']/g, " ")
70
- .replace(/\s+/g, " ")
71
- .trim();
72
- return simplified || query.trim();
73
- }
74
-
75
52
  function sourceList(paper: PaperRecord): string[] {
76
53
  return unique([
77
54
  ...(paper.sources ?? []),
@@ -92,7 +69,6 @@ function dedupeKeys(paper: PaperRecord): string[] {
92
69
  const keys = [
93
70
  doi ? `doi:${doi}` : undefined,
94
71
  paper.pmid ? `pmid:${paper.pmid}` : undefined,
95
- paper.s2_id ? `s2:${paper.s2_id}` : undefined,
96
72
  ];
97
73
  const title = normalizedTitle(paper.title);
98
74
  if (title && paper.year) keys.push(`title-year:${title}:${paper.year}`);
@@ -106,7 +82,6 @@ function mergePapers(existing: PaperRecord, incoming: PaperRecord): PaperRecord
106
82
  ...existing,
107
83
  doi: normalizeDoi(existing.doi) ?? normalizeDoi(incoming.doi),
108
84
  pmid: existing.pmid ?? incoming.pmid,
109
- s2_id: existing.s2_id ?? incoming.s2_id,
110
85
  title: existing.title !== "Untitled" ? existing.title : incoming.title,
111
86
  abstract: existing.abstract ?? incoming.abstract,
112
87
  authors: unique([...(existing.authors ?? []), ...(incoming.authors ?? [])]),
@@ -117,10 +92,6 @@ function mergePapers(existing: PaperRecord, incoming: PaperRecord): PaperRecord
117
92
  ...(incoming.publication_types ?? []),
118
93
  ]),
119
94
  mesh_terms: unique([...(existing.mesh_terms ?? []), ...(incoming.mesh_terms ?? [])]),
120
- citation_count: existing.citation_count ?? incoming.citation_count,
121
- tldr: existing.tldr ?? incoming.tldr,
122
- open_access_pdf: existing.open_access_pdf ?? incoming.open_access_pdf,
123
- external_ids: { ...(incoming.external_ids ?? {}), ...(existing.external_ids ?? {}) },
124
95
  source: sources.join(";"),
125
96
  sources,
126
97
  };
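With the `s2:` key removed, deduplication relies on DOI, PMID, and a title-plus-year key. The Python sketch below transliterates that idea for illustration. The bodies of `normalizeDoi`, `normalizedTitle`, and `dedupeLiteraturePapers` are not shown in this diff, so `normalize_doi`, `normalized_title`, and the `dedupe` loop here are assumptions, and only a subset of the merged fields is modelled.

```python
import re
from typing import Dict, List, Optional


def normalize_doi(doi: Optional[str]) -> Optional[str]:
    """Assumed normalisation: trim, lower-case, and drop a doi.org URL prefix."""
    if not doi:
        return None
    cleaned = re.sub(r"^https?://(dx\.)?doi\.org/", "", doi.strip().lower())
    return cleaned or None


def normalized_title(title: Optional[str]) -> str:
    """Assumed normalisation: lower-case and collapse non-alphanumerics to spaces."""
    return re.sub(r"[^a-z0-9]+", " ", (title or "").lower()).strip()


def dedupe_keys(paper: Dict) -> List[str]:
    """Mirror of the post-change key list: DOI, PMID, then title plus year."""
    keys: List[str] = []
    doi = normalize_doi(paper.get("doi"))
    if doi:
        keys.append(f"doi:{doi}")
    if paper.get("pmid"):
        keys.append(f"pmid:{paper['pmid']}")
    title = normalized_title(paper.get("title"))
    if title and paper.get("year"):
        keys.append(f"title-year:{title}:{paper['year']}")
    return keys


def merge_papers(existing: Dict, incoming: Dict) -> Dict:
    """Keep existing values and fill gaps from the incoming record."""
    merged = dict(existing)
    merged["doi"] = normalize_doi(existing.get("doi")) or normalize_doi(incoming.get("doi"))
    for field in ("pmid", "abstract"):
        if merged.get(field) is None:
            merged[field] = incoming.get(field)
    if existing.get("title") == "Untitled":
        merged["title"] = incoming.get("title")
    # dict.fromkeys keeps first-seen order while dropping duplicate authors.
    authors = [*(existing.get("authors") or []), *(incoming.get("authors") or [])]
    merged["authors"] = list(dict.fromkeys(authors))
    return merged


def dedupe(papers: List[Dict]) -> List[Dict]:
    """Assumed driver loop: merge any paper that shares a key with an earlier one."""
    merged: List[Dict] = []
    index: Dict[str, int] = {}
    for paper in papers:
        hit = next((index[k] for k in dedupe_keys(paper) if k in index), None)
        if hit is None:
            merged.append(dict(paper))
            hit = len(merged) - 1
        else:
            merged[hit] = merge_papers(merged[hit], paper)
        # Re-index after merging so newly learned identifiers also match later papers.
        for key in dedupe_keys(merged[hit]):
            index[key] = hit
    return merged
```

Because PubMed is now the only provider, most duplicates would be expected to collide on `pmid:`; the title-plus-year key remains as a fallback for records missing both identifiers.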
@@ -208,88 +179,10 @@ export async function searchLiterature(
208
179
  });
209
180
  emitEvent(`PubMed q1 found ${pubmed.count} candidate papers.`);
210
181
 
211
- const semanticScholarApiKey = process.env.SEMANTIC_SCHOLAR_API_KEY?.trim();
212
- let semanticScholar: ProviderExecution = {
213
- searched: false,
214
- reason: "SEMANTIC_SCHOLAR_API_KEY not configured",
215
- };
216
- let semanticScholarPapers: PaperRecord[] = [];
217
-
218
- if (semanticScholarApiKey) {
219
- const semanticScholarQuery =
220
- params.semantic_scholar_query?.trim() ||
221
- simplifyPubmedQueryForSemanticScholar(params.pubmed_query);
222
-
223
- events.push({
224
- phase: "query_start",
225
- provider: "semantic_scholar",
226
- query_index: 1,
227
- query: semanticScholarQuery,
228
- });
229
- emitEvent(`Searching Semantic Scholar q1: ${semanticScholarQuery}`);
230
-
231
- try {
232
- const semanticScholarResult = await searchSemanticScholar(
233
- {
234
- query: semanticScholarQuery,
235
- max_results: Math.min(100, maxResults),
236
- year_from: firstYear(params.date_from),
237
- year_to: firstYear(params.date_to),
238
- },
239
- signal,
240
- undefined,
241
- );
242
- semanticScholarPapers = semanticScholarResult.papers;
243
- const semanticScholarDisplayPapers = compactPapersForDisplay(
244
- semanticScholarResult.papers,
245
- );
246
- searches.push({
247
- provider: "semantic_scholar",
248
- query_index: 1,
249
- query: semanticScholarQuery,
250
- count: semanticScholarResult.count,
251
- papers: semanticScholarDisplayPapers,
252
- });
253
- events.push({
254
- phase: "query_results",
255
- provider: "semantic_scholar",
256
- query_index: 1,
257
- query: semanticScholarQuery,
258
- count: semanticScholarResult.count,
259
- papers: semanticScholarDisplayPapers,
260
- });
261
- emitEvent(
262
- `Semantic Scholar q1 found ${semanticScholarResult.count} candidate papers.`,
263
- );
264
- semanticScholar = {
265
- searched: true,
266
- count: semanticScholarResult.count,
267
- query: semanticScholarQuery,
268
- };
269
- } catch (err) {
270
- const message = err instanceof Error ? err.message : String(err);
271
- events.push({
272
- phase: "query_error",
273
- provider: "semantic_scholar",
274
- query_index: 1,
275
- query: semanticScholarQuery,
276
- error: message,
277
- });
278
- semanticScholar = {
279
- searched: false,
280
- reason: `Semantic Scholar search failed: ${message}`,
281
- };
282
- emitEvent(`Semantic Scholar q1 failed: ${message}`);
283
- }
284
- }
285
-
286
182
  events.push({ phase: "dedupe" });
287
183
  emitEvent("Deduplicating literature results...");
288
184
 
289
- const papers = dedupeLiteraturePapers([
290
- ...pubmed.papers,
291
- ...semanticScholarPapers,
292
- ]);
185
+ const papers = dedupeLiteraturePapers(pubmed.papers);
293
186
  events.push({
294
187
  phase: "complete",
295
188
  count: papers.length,
@@ -307,7 +200,6 @@ export async function searchLiterature(
307
200
  query: pubmed.query ?? params.pubmed_query,
308
201
  total: pubmed.total,
309
202
  },
310
- semantic_scholar: semanticScholar,
311
203
  },
312
204
  searches,
313
205
  events,
@@ -319,7 +211,7 @@ export function createLiteratureSearchTool() {
319
211
  name: "literature_search",
320
212
  label: "Literature Search",
321
213
  description:
322
- "Run the literature workflow search: PubMed is always searched first with a PubMed-ready query; Semantic Scholar is searched as supplementary metadata when SEMANTIC_SCHOLAR_API_KEY is configured.",
214
+ "Run the literature workflow search against PubMed using a PubMed-ready query (MeSH [mh], title/abstract [tiab], publication type [pt], substance [nm], and Boolean logic).",
323
215
  parameters: LITERATURE_SEARCH_PARAMS,
324
216
  async execute(
325
217
  _toolCallId: string,
package/src/rendering.ts CHANGED
@@ -16,20 +16,19 @@ export type CompactPaperForDisplay = {
16
16
  source: string;
17
17
  year?: number;
18
18
  journal?: string;
19
- citation_count?: number;
20
19
  };
21
20
 
22
21
  export type LiteratureSearchDisplayEvent =
23
22
  | { phase: "start" }
24
23
  | {
25
24
  phase: "query_start";
26
- provider: "pubmed" | "semantic_scholar";
25
+ provider: "pubmed";
27
26
  query_index: number;
28
27
  query: string;
29
28
  }
30
29
  | {
31
30
  phase: "query_results";
32
- provider: "pubmed" | "semantic_scholar";
31
+ provider: "pubmed";
33
32
  query_index: number;
34
33
  query: string;
35
34
  count: number;
@@ -37,7 +36,7 @@ export type LiteratureSearchDisplayEvent =
37
36
  }
38
37
  | {
39
38
  phase: "query_error";
40
- provider: "pubmed" | "semantic_scholar";
39
+ provider: "pubmed";
41
40
  query_index: number;
42
41
  query: string;
43
42
  error: string;
@@ -46,7 +45,7 @@ export type LiteratureSearchDisplayEvent =
46
45
  | { phase: "complete"; count: number; papers: CompactPaperForDisplay[] };
47
46
 
48
47
  export type LiteratureSearchDisplaySearch = {
49
- provider: "pubmed" | "semantic_scholar";
48
+ provider: "pubmed";
50
49
  query_index: number;
51
50
  query: string;
52
51
  count: number;
@@ -107,7 +106,6 @@ export function authorRange(paper: PaperRecord): string {
107
106
  export function paperIdentifier(paper: PaperRecord): string {
108
107
  if (paper.doi) return `DOI:${paper.doi}`;
109
108
  if (paper.pmid) return `PMID:${paper.pmid}`;
110
- if (paper.s2_id) return `S2:${paper.s2_id}`;
111
109
  return "—";
112
110
  }
113
111
 
@@ -120,11 +118,7 @@ export function sourceLabel(paper: PaperRecord): string {
120
118
  .map((source) => source.trim())
121
119
  .filter(Boolean),
122
120
  );
123
- const hasPubmed = sources.has("pubmed");
124
- const hasS2 = sources.has("semantic_scholar");
125
- if (hasPubmed && hasS2) return "PM+S2";
126
- if (hasPubmed) return "PM";
127
- if (hasS2) return "S2";
121
+ if (sources.has("pubmed")) return "PM";
128
122
  return paper.source ?? "—";
129
123
  }
130
124
 
@@ -136,7 +130,6 @@ export function compactPaperForDisplay(paper: PaperRecord): CompactPaperForDispl
136
130
  source: sourceLabel(paper),
137
131
  year: paper.year,
138
132
  journal: paper.journal,
139
- citation_count: paper.citation_count,
140
133
  };
141
134
  }
142
135
 
@@ -144,12 +137,12 @@ export function compactPapersForDisplay(papers: PaperRecord[]): CompactPaperForD
144
137
  return papers.map(compactPaperForDisplay);
145
138
  }
146
139
 
147
- function providerLabel(provider: "pubmed" | "semantic_scholar"): string {
148
- return provider === "pubmed" ? "PubMed" : "Semantic Scholar";
140
+ function providerLabel(provider: "pubmed"): string {
141
+ return "PubMed";
149
142
  }
150
143
 
151
- function providerColor(provider: "pubmed" | "semantic_scholar"): string {
152
- return provider === "pubmed" ? "success" : "accent";
144
+ function providerColor(provider: "pubmed"): string {
145
+ return "success";
153
146
  }
154
147
 
155
148
  export function formatFoundLine(
@@ -168,7 +161,7 @@ export function formatMergedLine(
168
161
  theme?: ThemeLike,
169
162
  ): string {
170
163
  const title = truncateText(paper.title, 72);
171
- const source = color(theme, paper.source.includes("S2") ? "accent" : "success", `(${paper.source})`);
164
+ const source = color(theme, "success", `(${paper.source})`);
172
165
  return ` ${color(theme, "success", "+")} ${index + 1}. ${title} ${source}`;
173
166
  }
174
167
 
@@ -237,7 +230,6 @@ type LiteratureResultDetails = {
237
230
  papers?: PaperRecord[];
238
231
  providers?: {
239
232
  pubmed?: ProviderSearchSummary;
240
- semantic_scholar?: ProviderSearchSummary;
241
233
  };
242
234
  events?: LiteratureSearchDisplayEvent[];
243
235
  };
@@ -250,11 +242,9 @@ type ProviderResultDetails = {
250
242
 
251
243
  function renderCollapsedLiteratureResult(details: LiteratureResultDetails, theme?: ThemeLike): string {
252
244
  const pubmed = details?.providers?.pubmed;
253
- const s2 = details?.providers?.semantic_scholar;
254
245
  const pubmedText = pubmed?.searched ? `PubMed: ${pubmed.count}` : "PubMed: —";
255
- const s2Text = s2?.searched ? `S2: ${s2.count}` : "S2: skipped";
256
246
  const count = details?.count ?? details?.papers?.length ?? 0;
257
- return `${color(theme, "success", "✓")} ${color(theme, "toolTitle", "literature_search")} ${color(theme, "success", pubmedText)} | ${color(theme, "accent", s2Text)} | merged: ${count}`;
247
+ return `${color(theme, "success", "✓")} ${color(theme, "toolTitle", "literature_search")} ${color(theme, "success", pubmedText)} | merged: ${count}`;
258
248
  }
259
249
 
260
250
  export function renderLiteratureSearchResult(
@@ -284,7 +274,7 @@ export function renderLiteratureSearchResult(
284
274
  }
285
275
 
286
276
  export function renderProviderSearchResult(
287
- provider: "pubmed" | "semantic_scholar",
277
+ provider: "pubmed",
288
278
  result: ToolRenderResult<ProviderResultDetails>,
289
279
  options: RenderOptions,
290
280
  theme?: ThemeLike,
@@ -298,7 +288,7 @@ export function renderProviderSearchResult(
298
288
  return terminalText(color(theme, "warning", text));
299
289
  }
300
290
  if (!options.expanded) {
301
- return terminalText(`${color(theme, "success", "✓")} ${color(theme, "toolTitle", provider === "pubmed" ? "pubmed_search" : "semantic_scholar_search")} ${papers.length} papers`);
291
+ return terminalText(`${color(theme, "success", "✓")} ${color(theme, "toolTitle", "pubmed_search")} ${papers.length} papers`);
302
292
  }
303
293
  const lines = [
304
294
  `${color(theme, providerColor(provider), "→")} ${color(theme, providerColor(provider), providerName)} q1: ${query}`,
package/src/shared.ts CHANGED
@@ -1,5 +1,3 @@
1
- import { mkdir, writeFile } from "node:fs/promises";
2
- import path from "node:path";
3
1
  import type { PaperRecord } from "./types.ts";
4
2
 
5
3
  export const USER_AGENT = "research-skills-literature-tools/0.1 (+https://github.com/fbraza/research-skills)";
@@ -82,22 +80,3 @@ export async function fetchJson<T>(url: string, signal?: AbortSignal, headers?:
82
80
  export function formatPaperText(papers: PaperRecord[]): string {
83
81
  return JSON.stringify(papers, null, 2);
84
82
  }
85
-
86
- export function sanitizeFilename(value: string): string {
87
- return value.replace(/[^a-z0-9._-]+/gi, "_").replace(/^_+|_+$/g, "") || "paper";
88
- }
89
-
90
- export async function savePdf(pdfUrl: string, outputDir: string, preferredId: string, signal?: AbortSignal): Promise<string> {
91
- await mkdir(outputDir, { recursive: true });
92
- const response = await fetch(pdfUrl, {
93
- method: "GET",
94
- signal,
95
- headers: { "user-agent": USER_AGENT, accept: "application/pdf,*/*" },
96
- redirect: "follow",
97
- });
98
- if (!response.ok) throw new Error(`Failed to download PDF (${response.status})`);
99
- const bytes = Buffer.from(await response.arrayBuffer());
100
- const filePath = path.resolve(outputDir, `${sanitizeFilename(preferredId)}.pdf`);
101
- await writeFile(filePath, bytes);
102
- return filePath;
103
- }
package/src/types.ts CHANGED
@@ -1,7 +1,6 @@
1
1
  export type PaperRecord = {
2
2
  pmid?: string;
3
3
  doi?: string;
4
- s2_id?: string;
5
4
  title: string;
6
5
  abstract?: string;
7
6
  authors?: string[];
@@ -9,22 +8,10 @@ export type PaperRecord = {
9
8
  year?: number;
10
9
  publication_types?: string[];
11
10
  mesh_terms?: string[];
12
- citation_count?: number;
13
- tldr?: string;
14
- open_access_pdf?: string;
15
- external_ids?: Record<string, string>;
16
11
  source?: string;
17
12
  sources?: string[];
18
13
  date?: string;
19
14
  category?: string;
20
15
  version?: string;
21
16
  license?: string;
22
- pdf_url?: string;
23
- };
24
-
25
- export type FullTextRouteResult = {
26
- source: string;
27
- pdf_url?: string;
28
- access_note: string;
29
- is_preprint?: boolean;
30
17
  };
@@ -1,34 +0,0 @@
1
- # Full-Text Access Guide
2
-
3
- **Workflow:** literature
4
- **Purpose:** Retrieve PDFs for prioritised papers using a consistent fallback chain.
5
-
6
- ## Access order
7
-
8
- 1. **PubMed Central (PMC)**
9
- - Preferred for PubMed-indexed papers with open full text.
10
- - Use PubMed/PMC linking first when a PMID is available.
11
-
12
- 2. **Publisher open-access page**
13
- - Resolve DOI at `https://doi.org/<doi>`.
14
- - Look for `citation_pdf_url`, explicit PDF links, or embedded PDF viewers.
15
-
16
- 3. **Sci-Hub fallback**
17
- - Use only as the final fallback after OA routes are exhausted.
18
- - Record that Sci-Hub was used.
19
-
20
- ## Per-paper logging
21
-
22
- For each paper, record:
23
- - PMID
24
- - DOI
25
- - source used: `pmc`, `publisher_oa`, `scihub`, or `not_found`
26
- - direct PDF URL if found
27
- - local saved path if downloaded
28
- - access note
29
-
30
- ## Notes
31
-
32
- - PMC and publisher OA should always be attempted before Sci-Hub.
33
- - If no DOI is known but PMID exists, try resolving identifiers from PubMed metadata first.
34
- - If no PDF is found, keep the paper in the synthesis and note `not_found`.