@fbraza/pi-cite 0.1.0 → 0.2.0
- package/README.md +13 -0
- package/package.json +5 -1
- package/skills/literature/SKILL.md +208 -0
- package/skills/literature/references/full-text-access-guide.md +34 -0
- package/skills/literature/references/preclinical-extraction-guide.md +215 -0
- package/skills/literature/references/pubmed_api_reference.md +298 -0
- package/skills/literature/references/pubmed_common_queries.md +453 -0
- package/skills/literature/references/pubmed_routine.md +93 -0
- package/skills/literature/references/pubmed_search_syntax.md +436 -0
- package/skills/literature/references/scihub_routine.md +40 -0
- package/skills/literature/references/semanticscholar_routine.md +50 -0
- package/skills/literature/scripts/export_all.py +53 -0
- package/skills/literature/scripts/extract_experiments.py +401 -0
- package/skills/literature/scripts/generate_table.py +96 -0
- package/skills/literature/scripts/scihub_pdf_resolver.py +289 -0
- package/skills/literature/scripts/synthesis.py +93 -0
@@ -0,0 +1,298 @@
# PubMed E-utilities API Reference

## Overview

The NCBI E-utilities provide programmatic access to PubMed and other Entrez databases through a REST API. The base URL for all E-utilities is:

```
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/
```

## API Key Requirements

Since December 1, 2018, NCBI has enforced API key usage for E-utility calls: requests without a key are limited to 3 requests/second, while a key raises the limit to 10 requests/second. To obtain an API key, register for an NCBI account and generate a key from your account settings.

Include the API key in requests using the `api_key` parameter:

```
esearch.fcgi?db=pubmed&term=cancer&api_key=YOUR_API_KEY
```

## Rate Limits

- **Without API key**: 3 requests per second
- **With API key**: 10 requests per second
- Always include a User-Agent header in requests
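
The limits above can be respected client-side with a small throttle. The following is a minimal sketch, not an official client: `build_url`, `make_request`, `Throttle`, and `throttle_for` are hypothetical helper names, and the default User-Agent string is a placeholder you would replace with your own tool name and contact address.

```python
import time
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_url(endpoint, params, api_key=None):
    """Return a full E-utilities URL; api_key is appended only when given."""
    query = dict(params)
    if api_key:
        query["api_key"] = api_key
    return BASE_URL + endpoint + "?" + urlencode(query)


def make_request(url, user_agent="example-tool/0.1 (you@example.org)"):
    """Build (but do not send) a request that carries a User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


class Throttle:
    """Space calls so that consecutive calls are at least 1/per_second apart."""

    def __init__(self, per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self.clock = clock
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_interval - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


def throttle_for(api_key=None):
    """Pick 10 requests/second with a key and 3 requests/second without."""
    return Throttle(10 if api_key else 3)
```

Call `throttle.wait()` immediately before each request; the injectable `clock` and `sleep` arguments exist only to make the class easy to test.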

## Core E-utility Tools

### 1. ESearch - Query Databases

**Endpoint**: `esearch.fcgi`

**Purpose**: Search an Entrez database and retrieve a list of UIDs (e.g., PMIDs for PubMed)

**Required Parameters**:
- `db` - Database to search (e.g., pubmed, gene, protein)
- `term` - Search query

**Optional Parameters**:
- `retmax` - Maximum records to return (default: 20, max: 10000)
- `retstart` - Index of first record to return (default: 0)
- `usehistory=y` - Store results on history server for large result sets
- `retmode` - Return format (xml, json)
- `sort` - Sort order (for PubMed: relevance, pub_date, Author, JournalName)
- `field` - Limit search to specific field
- `datetype` - Type of date to use for filtering (e.g., pdat for publication date, edat for Entrez date)
- `mindate` - Minimum date (YYYY/MM/DD, YYYY/MM, or YYYY); must be used together with `maxdate`
- `maxdate` - Maximum date (YYYY/MM/DD, YYYY/MM, or YYYY); must be used together with `mindate`

**Example Request**:
```
esearch.fcgi?db=pubmed&term=breast+cancer&retmax=100&retmode=json&api_key=YOUR_API_KEY
```

**Response Elements**:
- `Count` - Total number of records matching query
- `RetMax` - Number of records returned in this response
- `RetStart` - Index of first returned record
- `IdList` - List of UIDs (PMIDs)
- `WebEnv` - History server environment string (when usehistory=y)
- `QueryKey` - Query key for history server (when usehistory=y)
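
To illustrate how these response elements might be read with `retmode=json`, here is a small sketch. It assumes the JSON payload nests the fields in lowercase under an `esearchresult` key and encodes counts as strings; `parse_esearch` and the hand-written `SAMPLE` payload are illustrative, not real API output.

```python
import json

# Hand-written sample that mimics the assumed JSON layout (not real output).
SAMPLE = json.dumps({
    "esearchresult": {
        "count": "3",
        "retmax": "2",
        "retstart": "0",
        "idlist": ["123456", "234567"],
        "webenv": "MCID_12345",
        "querykey": "1",
    }
})


def parse_esearch(payload):
    """Extract the main ESearch response elements from a JSON string."""
    result = json.loads(payload)["esearchresult"]
    return {
        "count": int(result["count"]),        # total matches for the query
        "retmax": int(result["retmax"]),      # records in this response
        "retstart": int(result["retstart"]),  # index of the first record
        "ids": list(result["idlist"]),        # PMIDs, kept as strings
        # The next two are present only when usehistory=y was requested.
        "webenv": result.get("webenv"),
        "query_key": result.get("querykey"),
    }


parsed = parse_esearch(SAMPLE)
print(parsed["count"], parsed["ids"], parsed["webenv"])
```

Comparing `count` with the number of IDs returned shows whether further pages (via `retstart`) or the history server are needed.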

### 2. EFetch - Download Records

**Endpoint**: `efetch.fcgi`

**Purpose**: Retrieve full records from a database in various formats

**Required Parameters**:
- `db` - Database name
- `id` - Comma-separated list of UIDs, or use WebEnv/query_key from ESearch

**Optional Parameters**:
- `rettype` - Record type (abstract, medline, xml, uilist)
- `retmode` - Return mode (text, xml)
- `retstart` - Starting record index
- `retmax` - Maximum records per request

**Example Request**:
```
efetch.fcgi?db=pubmed&id=123456,234567&rettype=abstract&retmode=text&api_key=YOUR_API_KEY
```

**Common rettype Values for PubMed**:
- `abstract` - Abstract text
- `medline` - Full MEDLINE format
- `xml` - PubMed XML format
- `uilist` - List of UIDs only
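
When a PMID list is too long for one URL, one option is to split it into several EFetch requests. The sketch below only builds the URLs; `efetch_urls` is a hypothetical helper, and the chunk size of 200 is an arbitrary assumption rather than a documented limit.

```python
from urllib.parse import urlencode


def efetch_urls(pmids, chunk_size=200, rettype="abstract", retmode="text",
                api_key=None):
    """Build one relative EFetch URL per chunk of PMIDs."""
    urls = []
    for start in range(0, len(pmids), chunk_size):
        chunk = pmids[start:start + chunk_size]
        params = {
            "db": "pubmed",
            "id": ",".join(str(pmid) for pmid in chunk),
            "rettype": rettype,
            "retmode": retmode,
        }
        if api_key:
            params["api_key"] = api_key
        # safe="," keeps the comma-separated ID list readable in the URL.
        urls.append("efetch.fcgi?" + urlencode(params, safe=","))
    return urls


for url in efetch_urls([123456, 234567, 345678], chunk_size=2):
    print(url)
```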

### 3. ESummary - Retrieve Document Summaries

**Endpoint**: `esummary.fcgi`

**Purpose**: Get document summaries (DocSum) for a list of UIDs

**Required Parameters**:
- `db` - Database name
- `id` - Comma-separated UIDs or WebEnv/query_key

**Optional Parameters**:
- `retmode` - Return format (xml, json)
- `version` - DocSum version (1.0 or 2.0, default is 1.0)

**Example Request**:
```
esummary.fcgi?db=pubmed&id=123456,234567&retmode=json&version=2.0&api_key=YOUR_API_KEY
```

**DocSum Fields** (vary by database, common PubMed fields):
- Title
- Authors
- Source (journal)
- PubDate
- Volume, Issue, Pages
- DOI
- PmcRefCount (citations in PMC)
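
As a rough illustration of reading these fields from a `retmode=json` response, the sketch below assumes the payload holds a `result` object with a `uids` list plus one entry per UID, using lowercase keys such as `title`, `source`, `pubdate`, and `authors`. Both `parse_esummary` and the sample payload are hypothetical; check the layout against a real response before relying on it.

```python
import json

# Hand-written sample in the assumed layout (placeholder values only).
SAMPLE = json.dumps({
    "result": {
        "uids": ["123456"],
        "123456": {
            "uid": "123456",
            "title": "Example title",
            "source": "Example Journal",
            "pubdate": "2020 Jan",
            "authors": [{"name": "Doe J"}, {"name": "Roe R"}],
        },
    }
})


def parse_esummary(payload):
    """Return one flat dict per document summary, in the order of `uids`."""
    result = json.loads(payload)["result"]
    rows = []
    for uid in result.get("uids", []):
        doc = result[uid]
        rows.append({
            "pmid": uid,
            "title": doc.get("title", ""),
            "journal": doc.get("source", ""),
            "pubdate": doc.get("pubdate", ""),
            "authors": [a.get("name", "") for a in doc.get("authors", [])],
        })
    return rows


for row in parse_esummary(SAMPLE):
    print(row["pmid"], row["journal"], row["authors"])
```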

### 4. EPost - Upload UIDs

**Endpoint**: `epost.fcgi`

**Purpose**: Upload a list of UIDs to the history server for use in subsequent requests

**Required Parameters**:
- `db` - Database name
- `id` - Comma-separated list of UIDs

**Example Request**:
```
epost.fcgi?db=pubmed&id=123456,234567,345678&api_key=YOUR_API_KEY
```

**Response**:
Returns WebEnv and QueryKey for use in subsequent requests

### 5. ELink - Find Related Data

**Endpoint**: `elink.fcgi`

**Purpose**: Find related records within the same database or in different databases

**Required Parameters**:
- `dbfrom` - Source database
- `db` - Target database (can be same as dbfrom)
- `id` - UID(s) from source database

**Optional Parameters**:
- `cmd` - Link command (neighbor, neighbor_history, prlinks, llinks, etc.)
- `linkname` - Specific link type to retrieve
- `term` - Filter results with search query
- `holding` - Filter by library holdings

**Example Request**:
```
elink.fcgi?dbfrom=pubmed&db=pubmed&id=123456&cmd=neighbor&api_key=YOUR_API_KEY
```

**Common Link Commands**:
- `neighbor` - Return related records
- `neighbor_history` - Post related records to history server
- `prlinks` - Return provider URLs
- `llinks` - Return LinkOut URLs

### 6. EInfo - Database Information

**Endpoint**: `einfo.fcgi`

**Purpose**: Get information about available Entrez databases or specific database fields

**Parameters**:
- `db` - Database name (optional; omit to list all databases)
- `retmode` - Return format (xml, json)

**Example Request**:
```
einfo.fcgi?db=pubmed&retmode=json&api_key=YOUR_API_KEY
```

**Returns**:
- Database description
- Record count
- Last update date
- Available search fields with descriptions

### 7. EGQuery - Global Query

**Endpoint**: `egquery.fcgi`

**Purpose**: Search term counts across all Entrez databases

**Required Parameters**:
- `term` - Search query

**Example Request**:
```
egquery.fcgi?term=cancer&api_key=YOUR_API_KEY
```

### 8. ESpell - Spelling Suggestions

**Endpoint**: `espell.fcgi`

**Purpose**: Get spelling suggestions for queries

**Required Parameters**:
- `db` - Database name
- `term` - Search term with potential misspelling

**Example Request**:
```
espell.fcgi?db=pubmed&term=cancre&api_key=YOUR_API_KEY
```

### 9. ECitMatch - Citation Matching

**Endpoint**: `ecitmatch.cgi`

**Purpose**: Search PubMed citations using journal, year, volume, page, author information

**Request Format**: Citation strings are passed in the `bdata` parameter, together with `db=pubmed` and `retmode=xml`. Separate multiple citations with a carriage return (`%0D` when URL-encoded), and use POST for long citation lists.

**Citation String Format**:
```
journal|year|volume|page|author|key|
```

**Example** (issue numbers omitted, author field left empty):
```
Science|2008|320|1185||key1|
Nature|2010|463|318||key2|
```

**Rate Limit**: 3 requests per second with User-Agent header required
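
The citation string format can be assembled programmatically. The sketch below is an assumption-laden illustration: `citation_string` and `ecitmatch_query` are hypothetical helpers, and the use of the `bdata`, `db=pubmed`, and `retmode=xml` parameters with carriage-return separators follows the NCBI documentation as understood here, so verify against the official reference before use.

```python
from urllib.parse import quote


def citation_string(journal, year, volume, first_page, author, key):
    """Join the six fields as journal|year|volume|page|author|key|."""
    fields = [journal, str(year), str(volume), str(first_page), author, key]
    return "|".join(fields) + "|"


def ecitmatch_query(citations):
    """Build a relative ECitMatch URL for a list of 6-field citation tuples."""
    # Citations are separated by a carriage return, which quote() turns
    # into %0D; the pipe delimiters are left unescaped for readability.
    bdata = "\r".join(citation_string(*c) for c in citations)
    return "ecitmatch.cgi?db=pubmed&retmode=xml&bdata=" + quote(bdata, safe="|")


print(ecitmatch_query([
    ("Science", 2008, 320, 1185, "", "key1"),
    ("Nature", 2010, 463, 318, "", "key2"),
]))
```

The `key` field is a caller-chosen label that lets you pair each returned PMID with the citation that produced it.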

## Best Practices

### Use History Server for Large Result Sets

For queries returning more than 500 records, use the history server:

1. **Initial Search with History**:
```
esearch.fcgi?db=pubmed&term=cancer&usehistory=y&retmode=json&api_key=YOUR_API_KEY
```

2. **Retrieve Records in Batches**:
```
efetch.fcgi?db=pubmed&query_key=1&WebEnv=MCID_12345&retstart=0&retmax=500&rettype=xml&api_key=YOUR_API_KEY
efetch.fcgi?db=pubmed&query_key=1&WebEnv=MCID_12345&retstart=500&retmax=500&rettype=xml&api_key=YOUR_API_KEY
```
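
The two batch requests above follow a simple pattern: `retstart` advances by `retmax` until the total `Count` from ESearch is covered. A minimal sketch of that loop, where `history_batch_urls` is a hypothetical helper and the `WebEnv` and `query_key` values are the placeholders used in this document:

```python
from urllib.parse import urlencode


def history_batch_urls(count, webenv, query_key, batch_size=500, api_key=None):
    """Build one relative EFetch URL per batch of a history-server result."""
    urls = []
    for retstart in range(0, count, batch_size):
        params = {
            "db": "pubmed",
            "query_key": query_key,
            "WebEnv": webenv,
            "retstart": retstart,
            "retmax": batch_size,
            "rettype": "xml",
        }
        if api_key:
            params["api_key"] = api_key
        urls.append("efetch.fcgi?" + urlencode(params))
    return urls


# A result set of 1200 records needs three batches of up to 500 each.
for url in history_batch_urls(1200, "MCID_12345", 1, api_key="YOUR_API_KEY"):
    print(url)
```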

### Batch Operations

Use EPost to upload large lists of UIDs before fetching:

```
# Step 1: Post UIDs
epost.fcgi?db=pubmed&id=123,456,789,...&api_key=YOUR_API_KEY

# Step 2: Fetch using WebEnv/query_key
efetch.fcgi?db=pubmed&query_key=1&WebEnv=MCID_12345&rettype=xml&api_key=YOUR_API_KEY
```

### Error Handling

Common HTTP status codes:
- `200` - Success
- `400` - Bad request (check parameters)
- `414` - URI too long (use POST or history server)
- `429` - Rate limit exceeded
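
One way to act on these codes is to retry only on `429` and fail fast on everything else. The following sketch assumes exponential backoff is acceptable for your workload; `fetch_with_retry` and its `opener` and `sleep` arguments are hypothetical names, injected mainly so the logic can be tested without network access.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, opener=urllib.request.urlopen, max_attempts=4,
                     base_delay=1.0, sleep=time.sleep):
    """Fetch a URL, retrying with exponential backoff on HTTP 429 only."""
    for attempt in range(max_attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # 400 and 414 indicate a problem with the request itself, so
            # retrying cannot help; only rate limiting (429) is retried.
            last_attempt = attempt == max_attempts - 1
            if err.code != 429 or last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

A `414` response is better handled by switching the request to POST or to the history server than by retrying.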

### Caching

Implement local caching to:
- Reduce redundant API calls
- Stay within rate limits
- Improve response times
- Respect NCBI resources
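
A very small in-memory cache is often enough to get these benefits within a single run. The sketch below is one possible shape, not a prescribed design: `TTLCache` is a hypothetical class, the one-hour default lifetime is an arbitrary assumption, and `fetch` stands for whatever function actually performs the request.

```python
import time


class TTLCache:
    """Cache response bodies per URL for a limited time."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # url -> (time stored, body)

    def get_or_fetch(self, url, fetch):
        """Return a fresh cached body, or call fetch(url) and store it."""
        now = self.clock()
        hit = self._store.get(url)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        body = fetch(url)
        self._store[url] = (now, body)
        return body


cache = TTLCache(ttl_seconds=600)
body = cache.get_or_fetch("einfo.fcgi?db=pubmed", lambda url: "stub response")
print(body)
```

For a cache that survives across runs, the same interface could be backed by files or a small database; in that case consider leaving the `api_key` parameter out of the cache key so the key is not written to disk.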

## Response Formats

### XML (Default)

Most detailed format with full structured data. Each database has its own DTD (Document Type Definition).

### JSON

Available for most utilities with `retmode=json`. Easier to parse in modern applications.

### Text

Plain text format, useful for abstracts and simple data retrieval.

## Support and Resources

- **API Documentation**: https://www.ncbi.nlm.nih.gov/books/NBK25501/
- **Mailing List**: utilities-announce@ncbi.nlm.nih.gov
- **Support**: eutilities@ncbi.nlm.nih.gov
- **NLM Help Desk**: 1-888-FIND-NLM (1-888-346-3656)