@fbraza/pi-cite 0.1.0 → 0.3.0

@@ -0,0 +1,298 @@
1
+ # PubMed E-utilities API Reference
2
+
3
+ ## Overview
4
+
5
+ The NCBI E-utilities provide programmatic access to PubMed and other Entrez databases through a REST API. The base URL for all E-utilities is:
6
+
7
+ ```
8
+ https://eutils.ncbi.nlm.nih.gov/entrez/eutils/
9
+ ```
10
+
11
+ ## API Key Requirements
12
+
13
+ Since December 1, 2018, NCBI has enforced rate limits on E-utility calls based on API keys. A key is optional: requests without one are limited to 3 per second, and a key raises the limit to 10 requests per second. To obtain an API key, register for an NCBI account and generate a key from your account settings.
14
+
15
+ Include the API key in requests using the `api_key` parameter:
16
+ ```
17
+ esearch.fcgi?db=pubmed&term=cancer&api_key=YOUR_API_KEY
18
+ ```
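
As a rough illustration of how the base URL, an endpoint, and the `api_key` parameter fit together, here is a minimal Python sketch. The helper name `build_url` is hypothetical (it is not part of any NCBI library), and `YOUR_API_KEY` is the same placeholder used above.

```python
from urllib.parse import urlencode

BASE_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"


def build_url(endpoint, params, api_key=None):
    """Return a full E-utilities URL; api_key is appended only when given."""
    query = dict(params)  # copy so the caller's dict is not modified
    if api_key:
        query["api_key"] = api_key
    # urlencode escapes values, e.g. spaces become "+"
    return BASE_URL + endpoint + "?" + urlencode(query)


print(build_url("esearch.fcgi", {"db": "pubmed", "term": "cancer"},
                api_key="YOUR_API_KEY"))
```

Building the query string with `urlencode` rather than string concatenation keeps search terms that contain spaces or special characters valid.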
19
+
20
+ ## Rate Limits
21
+
22
+ - **Without API key**: 3 requests per second
23
+ - **With API key**: 10 requests per second
24
+ - Always include a User-Agent header in requests
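
One way to stay under these limits on the client side is to enforce a minimum interval between requests. The sketch below is an assumption about how a client might do this, not an NCBI-provided mechanism; `RateLimiter` and `limiter_for` are hypothetical names.

```python
import time


class RateLimiter:
    """Block so that successive calls to wait() are at least 1/per_second apart."""

    def __init__(self, per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / per_second
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def limiter_for(api_key=None):
    """Pick the limit listed above: 10/s with a key, 3/s without."""
    return RateLimiter(10 if api_key else 3)
```

Call `wait()` immediately before each request. A single shared limiter per process is the simplest arrangement; several processes sharing one key would need coordination that this sketch does not cover.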
25
+
26
+ ## Core E-utility Tools
27
+
28
+ ### 1. ESearch - Query Databases
29
+
30
+ **Endpoint**: `esearch.fcgi`
31
+
32
+ **Purpose**: Search an Entrez database and retrieve a list of UIDs (e.g., PMIDs for PubMed)
33
+
34
+ **Required Parameters**:
35
+ - `db` - Database to search (e.g., pubmed, gene, protein)
36
+ - `term` - Search query
37
+
38
+ **Optional Parameters**:
39
+ - `retmax` - Maximum records to return (default: 20, max: 10000)
40
+ - `retstart` - Index of first record to return (default: 0)
41
+ - `usehistory=y` - Store results on history server for large result sets
42
+ - `retmode` - Return format (xml, json)
43
+ - `sort` - Sort order (relevance, pub_date, first_author, last_author, journal)
44
+ - `field` - Limit search to specific field
45
+ - `datetype` - Type of date to use for filtering (pdat for publication date)
46
+ - `mindate` - Minimum date (YYYY/MM/DD format)
47
+ - `maxdate` - Maximum date (YYYY/MM/DD format)
48
+
49
+ **Example Request**:
50
+ ```
51
+ esearch.fcgi?db=pubmed&term=breast+cancer&retmax=100&retmode=json&api_key=YOUR_API_KEY
52
+ ```
53
+
54
+ **Response Elements**:
55
+ - `Count` - Total number of records matching query
56
+ - `RetMax` - Number of records returned in this response
57
+ - `RetStart` - Index of first returned record
58
+ - `IdList` - List of UIDs (PMIDs)
59
+ - `WebEnv` - History server environment string (when usehistory=y)
60
+ - `QueryKey` - Query key for history server (when usehistory=y)
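
With `retmode=json`, these elements can be pulled out of the response body with the standard `json` module. The sketch below assumes the JSON payload nests them under an `esearchresult` object with lowercase keys and string-valued counts; the sample payload is hand-written for illustration, not captured from a live call.

```python
import json


def parse_esearch(payload):
    """Extract the main ESearch response elements from a JSON payload."""
    result = json.loads(payload)["esearchresult"]  # assumed top-level key
    return {
        "count": int(result["count"]),        # counts arrive as strings
        "retmax": int(result["retmax"]),
        "retstart": int(result["retstart"]),
        "ids": list(result["idlist"]),
        # Only present when usehistory=y was requested
        "webenv": result.get("webenv"),
        "query_key": result.get("querykey"),
    }


# Hand-written sample payload (illustrative values only)
sample = ('{"esearchresult": {"count": "2", "retmax": "2", "retstart": "0", '
          '"idlist": ["123456", "234567"]}}')
print(parse_esearch(sample))
```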
61
+
62
+ ### 2. EFetch - Download Records
63
+
64
+ **Endpoint**: `efetch.fcgi`
65
+
66
+ **Purpose**: Retrieve full records from a database in various formats
67
+
68
+ **Required Parameters**:
69
+ - `db` - Database name
70
+ - `id` - Comma-separated list of UIDs, or use WebEnv/query_key from ESearch
71
+
72
+ **Optional Parameters**:
73
+ - `rettype` - Record type (abstract, medline, xml, uilist)
74
+ - `retmode` - Return mode (text, xml)
75
+ - `retstart` - Starting record index
76
+ - `retmax` - Maximum records per request
77
+
78
+ **Example Request**:
79
+ ```
80
+ efetch.fcgi?db=pubmed&id=123456,234567&rettype=abstract&retmode=text&api_key=YOUR_API_KEY
81
+ ```
82
+
83
+ **Common rettype Values for PubMed**:
84
+ - `abstract` - Abstract text
85
+ - `medline` - Full MEDLINE format
86
+ - `xml` - PubMed XML format
87
+ - `uilist` - List of UIDs only
88
+
89
+ ### 3. ESummary - Retrieve Document Summaries
90
+
91
+ **Endpoint**: `esummary.fcgi`
92
+
93
+ **Purpose**: Get document summaries (DocSum) for a list of UIDs
94
+
95
+ **Required Parameters**:
96
+ - `db` - Database name
97
+ - `id` - Comma-separated UIDs or WebEnv/query_key
98
+
99
+ **Optional Parameters**:
100
+ - `retmode` - Return format (xml, json)
101
+ - `version` - DocSum version (1.0 or 2.0, default is 1.0)
102
+
103
+ **Example Request**:
104
+ ```
105
+ esummary.fcgi?db=pubmed&id=123456,234567&retmode=json&version=2.0&api_key=YOUR_API_KEY
106
+ ```
107
+
108
+ **DocSum Fields** (vary by database, common PubMed fields):
109
+ - Title
110
+ - Authors
111
+ - Source (journal)
112
+ - PubDate
113
+ - Volume, Issue, Pages
114
+ - DOI
115
+ - PmcRefCount (citations in PMC)
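
A JSON DocSum response can be flattened into one record per PMID. This sketch assumes the payload has a `result` object containing a `uids` list plus one entry per UID, and that the fields appear under the lowercase keys `title`, `source`, and `pubdate`; check these names against a real response before relying on them. The sample payload is invented for illustration.

```python
import json


def parse_esummary(payload):
    """Return a list of small dicts, one per UID in the DocSum response."""
    result = json.loads(payload)["result"]  # assumed top-level key
    summaries = []
    for uid in result.get("uids", []):
        doc = result[uid]
        summaries.append({
            "pmid": uid,
            "title": doc.get("title"),
            "journal": doc.get("source"),   # "Source" holds the journal
            "pubdate": doc.get("pubdate"),
        })
    return summaries


# Invented sample payload (not real bibliographic data)
sample = json.dumps({"result": {
    "uids": ["123456"],
    "123456": {"title": "Example title", "source": "Example J",
               "pubdate": "2000 Jan"},
}})
print(parse_esummary(sample))
```

Using `dict.get` for each field means a missing field yields `None` instead of raising, which suits the note above that DocSum fields vary by database.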
116
+
117
+ ### 4. EPost - Upload UIDs
118
+
119
+ **Endpoint**: `epost.fcgi`
120
+
121
+ **Purpose**: Upload a list of UIDs to the history server for use in subsequent requests
122
+
123
+ **Required Parameters**:
124
+ - `db` - Database name
125
+ - `id` - Comma-separated list of UIDs
126
+
127
+ **Example Request**:
128
+ ```
129
+ epost.fcgi?db=pubmed&id=123456,234567,345678&api_key=YOUR_API_KEY
130
+ ```
131
+
132
+ **Response**:
133
+ Returns WebEnv and QueryKey for use in subsequent requests
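
The two values can be read from the XML response with the standard library. This sketch assumes the response root directly contains `WebEnv` and `QueryKey` child elements; the sample document is hand-written and reuses the `MCID_12345` placeholder that appears elsewhere in this reference.

```python
import xml.etree.ElementTree as ET


def parse_epost(xml_text):
    """Return (WebEnv, QueryKey) from an EPost XML response, or None for each if absent."""
    root = ET.fromstring(xml_text)
    return root.findtext("WebEnv"), root.findtext("QueryKey")


# Hand-written sample response (placeholder values)
sample = ("<ePostResult><QueryKey>1</QueryKey>"
          "<WebEnv>MCID_12345</WebEnv></ePostResult>")
web_env, query_key = parse_epost(sample)
print(web_env, query_key)
```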
134
+
135
+ ### 5. ELink - Find Related Data
136
+
137
+ **Endpoint**: `elink.fcgi`
138
+
139
+ **Purpose**: Find related records within the same database or in different databases
140
+
141
+ **Required Parameters**:
142
+ - `dbfrom` - Source database
143
+ - `db` - Target database (can be same as dbfrom)
144
+ - `id` - UID(s) from source database
145
+
146
+ **Optional Parameters**:
147
+ - `cmd` - Link command (neighbor, neighbor_history, prlinks, llinks, etc.)
148
+ - `linkname` - Specific link type to retrieve
149
+ - `term` - Filter results with search query
150
+ - `holding` - Filter by library holdings
151
+
152
+ **Example Request**:
153
+ ```
154
+ elink.fcgi?dbfrom=pubmed&db=pubmed&id=123456&cmd=neighbor&api_key=YOUR_API_KEY
155
+ ```
156
+
157
+ **Common Link Commands**:
158
+ - `neighbor` - Return related records
159
+ - `neighbor_history` - Post related records to history server
160
+ - `prlinks` - Return provider URLs
161
+ - `llinks` - Return LinkOut URLs
162
+
163
+ ### 6. EInfo - Database Information
164
+
165
+ **Endpoint**: `einfo.fcgi`
166
+
167
+ **Purpose**: Get information about available Entrez databases or specific database fields
168
+
169
+ **Parameters**:
170
+ - `db` - Database name (optional; omit to list all databases)
171
+ - `retmode` - Return format (xml, json)
172
+
173
+ **Example Request**:
174
+ ```
175
+ einfo.fcgi?db=pubmed&retmode=json&api_key=YOUR_API_KEY
176
+ ```
177
+
178
+ **Returns**:
179
+ - Database description
180
+ - Record count
181
+ - Last update date
182
+ - Available search fields with descriptions
183
+
184
+ ### 7. EGQuery - Global Query
185
+
186
+ **Endpoint**: `egquery.fcgi`
187
+
188
+ **Purpose**: Search term counts across all Entrez databases
189
+
190
+ **Required Parameters**:
191
+ - `term` - Search query
192
+
193
+ **Example Request**:
194
+ ```
195
+ egquery.fcgi?term=cancer&api_key=YOUR_API_KEY
196
+ ```
197
+
198
+ ### 8. ESpell - Spelling Suggestions
199
+
200
+ **Endpoint**: `espell.fcgi`
201
+
202
+ **Purpose**: Get spelling suggestions for queries
203
+
204
+ **Required Parameters**:
205
+ - `db` - Database name
206
+ - `term` - Search term with potential misspelling
207
+
208
+ **Example Request**:
209
+ ```
210
+ espell.fcgi?db=pubmed&term=cancre&api_key=YOUR_API_KEY
211
+ ```
212
+
213
+ ### 9. ECitMatch - Citation Matching
214
+
215
+ **Endpoint**: `ecitmatch.cgi`
216
+
217
+ **Purpose**: Look up PMIDs for citations given journal, year, volume, first page, and author name
218
+
219
+ **Request Format**: GET or POST request with `db=pubmed` and the citation strings in the `bdata` parameter
220
+
221
+ **Citation String Format**:
222
+ ```
223
+ journal_title|year|volume|first_page|author_name|your_key|
224
+ ```
225
+
226
+ **Example**:
227
+ ```
228
+ proc natl acad sci u s a|1991|88|3248|mann bj|Art1|
228
+ science|1987|235|182|palmiter rd|Art2|
230
+ ```
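
The pipe-delimited format above is easy to get wrong by hand, so a small formatter and parser can help. This is a sketch under two assumptions: that each response line echoes the input string with the matched PMID appended after the final pipe, and that a non-numeric trailing value means no unique match. The function names and all sample values are hypothetical placeholders.

```python
def format_citation(journal, year, volume, first_page, author, key):
    """Build one citation string: six pipe-separated fields plus a trailing pipe."""
    fields = [journal, str(year), str(volume), str(first_page), author, key]
    return "|".join(fields) + "|"


def parse_match(line):
    """Return (key, pmid) from a response line; pmid is None if not numeric."""
    parts = line.strip().split("|")
    key = parts[5]
    pmid = parts[6].strip() if len(parts) > 6 else ""
    return key, (pmid if pmid.isdigit() else None)


# Placeholder citation and response line (not real bibliographic data)
query = format_citation("example journal", 2000, 1, 10, "doe j", "ref1")
print(query)
print(parse_match(query + "12345678"))
```

The `key` field is returned unchanged, so it can be used to map each result back to the record that produced it.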
231
+
232
+ **Rate Limit**: 3 requests per second with User-Agent header required
233
+
234
+ ## Best Practices
235
+
236
+ ### Use History Server for Large Result Sets
237
+
238
+ For queries returning more than 500 records, use the history server:
239
+
240
+ 1. **Initial Search with History**:
241
+ ```
242
+ esearch.fcgi?db=pubmed&term=cancer&usehistory=y&retmode=json&api_key=YOUR_API_KEY
243
+ ```
244
+
245
+ 2. **Retrieve Records in Batches**:
246
+ ```
247
+ efetch.fcgi?db=pubmed&query_key=1&WebEnv=MCID_12345&retstart=0&retmax=500&rettype=xml&api_key=YOUR_API_KEY
248
+ efetch.fcgi?db=pubmed&query_key=1&WebEnv=MCID_12345&retstart=500&retmax=500&rettype=xml&api_key=YOUR_API_KEY
249
+ ```
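
The batch requests above follow a simple pattern: `retstart` advances by `retmax` until the total `Count` is covered. A minimal sketch of generating those relative URLs, where `batch_urls` is a hypothetical helper and `MCID_12345` is the same placeholder `WebEnv` value used above:

```python
from urllib.parse import urlencode


def batch_urls(web_env, query_key, total, batch_size=500, api_key=None):
    """Yield one relative efetch URL per batch of `batch_size` records."""
    for start in range(0, total, batch_size):
        params = {
            "db": "pubmed",
            "query_key": query_key,
            "WebEnv": web_env,
            "retstart": start,
            "retmax": batch_size,
            "rettype": "xml",
        }
        if api_key:
            params["api_key"] = api_key
        yield "efetch.fcgi?" + urlencode(params)


# total would normally come from the Count element of the ESearch response
for url in batch_urls("MCID_12345", 1, total=1200):
    print(url)
```

For a `total` of 1200 this produces three URLs with `retstart` values 0, 500, and 1000; the last batch simply returns fewer records than `retmax`.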
250
+
251
+ ### Batch Operations
252
+
253
+ Use EPost to upload large lists of UIDs before fetching:
254
+
255
+ ```
256
+ # Step 1: Post UIDs
257
+ epost.fcgi?db=pubmed&id=123,456,789,...&api_key=YOUR_API_KEY
258
+
259
+ # Step 2: Fetch using WebEnv/query_key
260
+ efetch.fcgi?db=pubmed&query_key=1&WebEnv=MCID_12345&rettype=xml&api_key=YOUR_API_KEY
261
+ ```
262
+
263
+ ### Error Handling
264
+
265
+ Common HTTP status codes:
266
+ - `200` - Success
267
+ - `400` - Bad request (check parameters)
268
+ - `414` - URI too long (use POST or history server)
269
+ - `429` - Rate limit exceeded
270
+
271
+ ### Caching
272
+
273
+ Implement local caching to:
274
+ - Reduce redundant API calls
275
+ - Stay within rate limits
276
+ - Improve response times
277
+ - Respect NCBI resources
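
A minimal in-memory cache keyed by request URL might look like the sketch below. `TTLCache` is a hypothetical class, and the one-hour expiry in the usage lines is an arbitrary illustrative choice; a persistent cache would need a file or database backend that is out of scope here.

```python
import time


class TTLCache:
    """Cache fetched responses by URL and reuse them until they expire."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable for testing
        self._entries = {}       # url -> (stored_at, value)

    def get_or_fetch(self, url, fetch):
        now = self._clock()
        hit = self._entries.get(url)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # fresh entry: no API call made
        value = fetch(url)
        self._entries[url] = (now, value)
        return value


cache = TTLCache(ttl_seconds=3600)
body = cache.get_or_fetch("esummary.fcgi?db=pubmed&id=123456",
                          lambda url: "stub response for " + url)
print(body)
```

Because the full URL is the key, build URLs with a consistent parameter order so that equivalent requests hit the same cache entry.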
278
+
279
+ ## Response Formats
280
+
281
+ ### XML (Default)
282
+
283
+ Most detailed format with full structured data. Each database has its own DTD (Document Type Definition).
284
+
285
+ ### JSON
286
+
287
+ Available for most utilities with `retmode=json`. Easier to parse in modern applications.
288
+
289
+ ### Text
290
+
291
+ Plain text format, useful for abstracts and simple data retrieval.
292
+
293
+ ## Support and Resources
294
+
295
+ - **API Documentation**: https://www.ncbi.nlm.nih.gov/books/NBK25501/
296
+ - **Mailing List**: utilities-announce@ncbi.nlm.nih.gov
297
+ - **Support**: eutilities@ncbi.nlm.nih.gov
298
+ - **NLM Help Desk**: 1-888-FIND-NLM (1-888-346-3656)