agr-mcp-server 4.0.1 → 4.0.2

Files changed (2)
  1. package/README.md +81 -475
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -1,531 +1,137 @@
1
- # Enhanced AGR MCP Server - JavaScript Implementation
1
+ # AGR MCP Server
2
2
 
3
- **A high-performance, modern JavaScript implementation of the Alliance of Genome Resources MCP server with advanced natural language query capabilities and cross-entity search.**
3
+ MCP server for querying [Alliance of Genome Resources](https://www.alliancegenome.org) - genomics data across model organisms.
4
4
 
5
- ## NEW: Complex Query Engine
5
+ ## Installation
6
6
 
7
- This server now features a sophisticated natural language processing engine that understands:
8
- - **Boolean Logic**: `"breast cancer genes AND DNA repair NOT p53"`
9
- - **Multi-Entity Search**: Simultaneously search genes, diseases, phenotypes, and alleles
10
- - **Smart Filtering**: Automatic detection of species, processes, functions, and locations
11
- - **Relationship Discovery**: Find connections between genes, diseases, and orthologs
12
- - **Faceted Search**: Multi-dimensional filtering with real-time aggregations
7
+ ### Claude Desktop
13
8
 
14
- ## Why This JavaScript Version is Better
9
+ Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):
15
10
 
16
- This JavaScript implementation offers significant improvements over the Python version:
17
-
18
- ### Performance Enhancements
19
- - **25-40% faster API responses** due to Node.js async I/O optimization
20
- - **Intelligent caching system** with configurable TTL and automatic cleanup
21
- - **Connection pooling** with optimized HTTP client settings
22
- - **Exponential backoff retry logic** for robust error recovery
23
- - **Rate limiting** to prevent API overwhelm
24
-
25
- ### Advanced Features
26
- - **🧠 Complex Natural Language Queries** with Boolean operators (AND, OR, NOT)
27
- - **🎯 Multi-Entity Cross-Search** (genes + diseases + phenotypes + alleles)
28
- - **🔍 Advanced Query Parsing** with automatic species/process/function detection
29
- - **📊 Intelligent Aggregations** across multiple data types
30
- - **🔗 Relationship Discovery** between genes, diseases, and orthologs
31
- - **🎛️ Faceted Search** with multiple simultaneous filters
32
- - **📈 Real-time Query Analytics** and performance insights
33
- - **🏷️ Automatic Entity Classification** and metadata extraction
34
-
35
- ### Reliability & Security
36
- - **Robust error boundaries** with detailed error reporting
37
- - **Input sanitization** to prevent injection attacks
38
- - **Request timeout handling** with configurable limits
39
- - **Process monitoring** with health check capabilities
40
- - **Memory leak prevention** with automated cache management
41
-
42
- ### Monitoring & Observability
43
- - **Real-time performance metrics**
44
- - **Cache hit/miss ratio tracking**
45
- - **API response time monitoring**
46
- - **Structured JSON logging**
47
- - **Health check endpoints**
48
-
49
- ## Architecture
50
-
51
- ```
52
- Enhanced AGR MCP Server (JavaScript)
53
- ├── High-Performance HTTP Client (Axios)
54
- │ ├── Connection Pooling
55
- │ ├── Request/Response Interceptors
56
- │ └── Automatic Retry Logic
57
-
58
- ├── Intelligent Caching Layer (NodeCache)
59
- │ ├── Configurable TTL per endpoint
60
- │ ├── Memory-efficient storage
61
- │ └── Automatic cleanup
62
-
63
- ├── Rate Limiting System
64
- │ ├── Per-endpoint rate tracking
65
- │ ├── Sliding window algorithm
66
- │ └── Automatic throttling
67
-
68
- ├── Enhanced Logging (Pino)
69
- │ ├── Structured JSON output
70
- │ ├── Pretty console formatting
71
- │ └── Performance tracking
72
-
73
- └── Advanced Validation
74
- ├── Gene ID format validation
75
- ├── Sequence validation
76
- └── Input sanitization
77
- ```
78
-
79
- ## Quick Start
80
-
81
- ### Prerequisites
82
- - Node.js 18+
83
- - npm 8+
84
-
85
- ### Installation
86
-
87
- #### Option 1: npm Package (Recommended)
88
- ```bash
89
- # Install globally from npm
90
- npm install -g agr-mcp-server-enhanced
91
-
92
- # Start the server
93
- agr-mcp-server
94
-
95
- # Or use the natural language server
96
- agr-mcp-natural
97
-
98
- # Or start interactive chat
99
- agr-chat
100
- ```
101
-
102
- #### Option 2: From Source
103
- ```bash
104
- # Clone the repository
105
- git clone https://github.com/nuin/agr-mcp-server-js.git
106
- cd agr-mcp-server-js
107
-
108
- # Install dependencies and validate setup
109
- npm run setup
110
-
111
- # Start the server
112
- npm start
113
-
114
- # Or start with development logging
115
- npm run dev
116
- ```
117
-
118
- ### Development Setup
119
- ```bash
120
- # Complete development setup
121
- npm run setup
122
-
123
- # Run with hot reload and debugging
124
- npm run dev
125
-
126
- # Run tests
127
- npm test
128
-
129
- # Run with coverage
130
- npm run test:coverage
131
-
132
- # Lint and format code
133
- npm run lint:fix
134
- npm run format
135
- ```
136
-
137
- ## Available Tools (12 Advanced Tools)
138
-
139
- ### Core Genomics Tools
140
- 1. **`search_genes`** - Advanced gene search with natural language support
141
- 2. **`get_gene_info`** - Comprehensive gene information
142
- 3. **`get_gene_diseases`** - Disease associations and models
143
- 4. **`search_diseases`** - Disease search with filtered results
144
- 5. **`get_gene_expression`** - Expression data across tissues
145
- 6. **`find_orthologs`** - Cross-species orthology analysis
146
- 7. **`blast_sequence`** - BLAST search with auto-detection
147
- 8. **`get_species_list`** - Supported model organisms
148
-
149
- ### Advanced Query Tools
150
- 9. **`complex_search`** - Natural language cross-entity search with relationships
151
- 10. **`faceted_search`** - Multi-filter advanced search with aggregations
152
-
153
- ### Performance & Monitoring Tools
154
- 11. **`get_cache_stats`** - Real-time performance metrics
155
- 12. **`clear_cache`** - Cache management (dev/testing)
156
-
157
- ## Usage Examples
158
-
159
- ### Complex Natural Language Queries (NEW!)
160
-
161
- The Enhanced AGR MCP Server now supports advanced Boolean queries with natural language processing:
162
-
163
- #### Working Complex Query Examples
164
-
165
- ##### 1. Boolean NOT - Exclude specific genes
166
- ```bash
167
- # Find DNA repair genes in breast cancer, excluding p53
168
- npm run query complex "breast cancer genes in human AND DNA repair NOT p53"
169
- # Returns: 6,021 genes (XRCC3, XRCC1, RAD50, ERCC1, etc.)
170
- ```
171
-
172
- ##### 2. Boolean OR - Multiple terms
173
- ```bash
174
- # Find genes related to insulin OR glucose in mouse
175
- npm run query complex "insulin OR glucose in mouse"
176
- # Returns: 28 genes (Insl5, Igfbp7, Irs3, Ide, etc.)
177
- ```
178
-
179
- ##### 3. Species-specific search
180
- ```bash
181
- # Find BRCA1 genes specifically in humans
182
- npm run query complex "BRCA1 in human"
183
- # Returns: 29 human-specific BRCA1-related genes
184
- ```
185
-
186
- #### Advanced Query Features
187
- - **Boolean Operators**: AND, OR, NOT for precise filtering
188
- - **Species Filters**: "in human", "in mouse", "in zebrafish", etc.
189
- - **Disease Context**: Automatically recognizes disease terms
190
- - **Process Filters**: Detects biological processes (apoptosis, DNA repair, etc.)
191
- - **Cross-Entity Search**: Searches genes, diseases, phenotypes simultaneously
192
-
193
- #### JavaScript/Node.js Examples
194
- ```javascript
195
- // Using complex_search tool with MCP
196
- {
197
- "tool": "complex_search",
198
- "arguments": {
199
- "query": "breast cancer genes in human AND DNA repair NOT p53",
200
- "limit": 5
201
- }
202
- }
203
-
204
- // Species and process filtering
11
+ ```json
205
12
  {
206
- "tool": "search_genes",
207
- "arguments": {
208
- "query": "tumor suppressor genes in mouse involved in apoptosis",
209
- "limit": 10
13
+ "mcpServers": {
14
+ "agr-genomics": {
15
+ "command": "npx",
16
+ "args": ["-y", "agr-mcp-server"]
17
+ }
210
18
  }
211
19
  }
212
20
  ```
213
21
 
214
- #### Cross-Entity Search with Relationships
215
- ```javascript
216
- // Search across genes, diseases, and phenotypes simultaneously
217
- {
218
- "tool": "complex_search",
219
- "arguments": {
220
- "query": "insulin resistance genes and diabetes diseases in human",
221
- "limit": 10
222
- }
223
- }
224
- ```
22
+ ### Claude Code (CLI)
23
+
24
+ Add to `~/.claude/settings.json`:
225
25
 
226
- ### Advanced Faceted Search
227
- ```javascript
228
- // Multi-dimensional filtering
26
+ ```json
229
27
  {
230
- "tool": "faceted_search",
231
- "arguments": {
232
- "genes": ["BRCA1", "BRCA2", "TP53"],
233
- "diseases": ["breast cancer", "ovarian cancer"],
234
- "processes": ["DNA repair", "apoptosis"],
235
- "species": "Homo sapiens",
236
- "chromosome": "17",
237
- "limit": 20
28
+ "mcpServers": {
29
+ "agr-genomics": {
30
+ "command": "npx",
31
+ "args": ["-y", "agr-mcp-server"]
32
+ }
238
33
  }
239
34
  }
240
35
  ```
241
36
 
242
- ### Tested & Verified Query Examples
243
-
244
- #### Natural Language Queries That Work
245
- - `"breast cancer genes in human AND DNA repair NOT p53"` - 6,021 results
246
- - `"insulin OR glucose in mouse"` - 28 results
247
- - `"BRCA1 in human"` - 29 results
248
- - `"kinase genes in mouse involved in signaling"` - Species + process filtering
249
- - `"tumor suppressor NOT p53 in zebrafish"` - Exclusion queries
250
- - `"transcription factors NOT zinc finger in fly"` ✅
251
- - `"diabetes genes on chromosome 11 in human"` ✅
252
- - `"tumor suppressor genes involved in apoptosis NOT p53"` ✅
253
-
254
- #### Multi-Entity Discovery
255
- - `"insulin genes and diabetes diseases"` → Returns genes + related diseases
256
- - `"BRCA1 orthologs and cancer associations"` → Cross-species + disease links
257
- - `"DNA repair genes and associated phenotypes"` → Genes + phenotype relationships
37
+ ### Cursor
258
38
 
259
- ### Basic Tool Usage
39
+ Add to Cursor settings (Settings > MCP Servers):
260
40
 
261
- #### Gene Information
262
- ```javascript
263
- {
264
- "tool": "get_gene_info",
265
- "arguments": {
266
- "gene_id": "HGNC:1100"
267
- }
268
- }
269
- ```
270
-
271
- #### BLAST Search
272
- ```javascript
41
+ ```json
273
42
  {
274
- "tool": "blast_sequence",
275
- "arguments": {
276
- "sequence": "ATCGATCGATCGATCG",
277
- "max_target_seqs": 20
43
+ "agr-genomics": {
44
+ "command": "npx",
45
+ "args": ["-y", "agr-mcp-server"]
278
46
  }
279
47
  }
280
48
  ```
281
49
 
282
- #### Performance Monitoring
283
- ```javascript
284
- {
285
- "tool": "get_cache_stats",
286
- "arguments": {}
287
- }
288
- ```
289
-
290
- ## Configuration
291
-
292
- ### Environment Variables
293
- ```bash
294
- # Logging level
295
- export LOG_LEVEL=debug
296
-
297
- # Custom timeouts
298
- export API_TIMEOUT=30000
299
-
300
- # Cache settings
301
- export CACHE_TTL=300
302
- export CACHE_MAX_KEYS=1000
303
- ```
304
-
305
- ### Advanced Configuration
306
- The server automatically configures itself with optimal settings:
307
-
308
- - **Cache TTL**: 5 minutes (gene info cached 10 minutes)
309
- - **Rate Limiting**: 100 requests/minute per endpoint
310
- - **Retry Logic**: 3 attempts with exponential backoff
311
- - **Connection Pooling**: Optimized for genomics API patterns
312
-
313
- ## Docker Support
314
-
315
- ```bash
316
- # Build Docker image
317
- npm run docker:build
318
-
319
- # Run in container
320
- npm run docker:run
321
-
322
- # Or use docker-compose
323
- docker-compose up -d
324
- ```
325
-
326
- ## Performance Comparison
327
-
328
- | Metric | Python Version | **JavaScript Version** | Improvement |
329
- |--------|---------------|----------------------|-------------|
330
- | Cold Start | ~800ms | **~450ms** | **44% faster** |
331
- | API Response | ~200ms | **~120ms** | **40% faster** |
332
- | Memory Usage | ~45MB | **~28MB** | **38% less** |
333
- | Cache Hit Rate | ~65% | **~89%** | **37% better** |
334
- | Error Recovery | Basic | **Advanced** | Exponential backoff |
335
- | Input Validation | Limited | **Comprehensive** | Type safety |
336
-
337
- ## Testing & Quality
338
-
339
- ```bash
340
- # Run comprehensive tests
341
- npm test
342
-
343
- # Run with coverage reporting
344
- npm run test:coverage
345
-
346
- # Performance benchmarking
347
- npm run benchmark
348
-
349
- # Code quality checks
350
- npm run lint
351
- npm run validate
352
-
353
- # Health check
354
- npm run health-check
355
- ```
356
-
357
- ## Advanced Features
358
-
359
- ### Intelligent Caching
360
- - **Per-endpoint TTL optimization**
361
- - **Memory-efficient storage**
362
- - **Automatic cache warming**
363
- - **Cache hit/miss analytics**
50
+ ### Windsurf
364
51
 
365
- ### Enhanced Error Handling
366
- - **Detailed error classification**
367
- - **Automatic retry with backoff**
368
- - **Graceful degradation**
369
- - **Structured error reporting**
370
-
371
- ### Performance Monitoring
372
- - **Real-time metrics collection**
373
- - **Cache performance tracking**
374
- - **API response time analysis**
375
- - **Memory usage monitoring**
376
-
377
- ### Input Validation
378
- - **Gene ID format validation** (HGNC, MGI, RGD, etc.)
379
- - **Sequence validation** (DNA/RNA/Protein)
380
- - **Query sanitization**
381
- - **Parameter bounds checking**
382
-
383
- ## Claude Integration
384
-
385
- ### Claude Desktop Configuration
386
-
387
- #### Option 1: Global Installation (Recommended)
388
- ```bash
389
- # Install globally for easy setup
390
- npm install -g .
391
- ```
392
-
393
- Then configure Claude Desktop:
394
-
395
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
396
- **Windows**: `%APPDATA%/Claude/claude_desktop_config.json`
52
+ Add to `~/.codeium/windsurf/mcp_config.json`:
397
53
 
398
54
  ```json
399
55
  {
400
56
  "mcpServers": {
401
57
  "agr-genomics": {
402
- "command": "agr-mcp-server",
403
- "env": {
404
- "LOG_LEVEL": "info"
405
- }
58
+ "command": "npx",
59
+ "args": ["-y", "agr-mcp-server"]
406
60
  }
407
61
  }
408
62
  }
409
63
  ```
410
64
 
411
- #### Option 2: Local Development Setup
65
+ ### From source
66
+
67
+ ```bash
68
+ git clone https://github.com/nuin/agr-mcp-server-js.git
69
+ cd agr-mcp-server-js
70
+ npm install && npm run build
71
+ ```
72
+
73
+ Then use the local path in your config:
74
+
412
75
  ```json
413
76
  {
414
77
  "mcpServers": {
415
78
  "agr-genomics": {
416
79
  "command": "node",
417
- "args": ["<PROJECT_PATH>/src/agr-server-enhanced.js"],
418
- "cwd": "<PROJECT_PATH>",
419
- "env": {
420
- "LOG_LEVEL": "info"
421
- }
80
+ "args": ["/path/to/agr-mcp-server-js/dist/index.js"]
422
81
  }
423
82
  }
424
83
  }
425
84
  ```
426
85
 
427
- Replace `<PROJECT_PATH>` with the absolute path to your cloned repository.
428
-
429
- ### Advanced Natural Language Queries
430
-
431
- With the enhanced complex query system, Claude can now handle sophisticated genomic questions:
432
-
433
- #### Boolean Logic & Multi-Species Queries
434
- - "Find breast cancer genes in human AND DNA repair NOT p53"
435
- - "Search for kinase genes in mouse OR rat involved in signaling"
436
- - "Get tumor suppressor genes involved in apoptosis NOT p53"
437
-
438
- #### Cross-Entity Discovery
439
- - "Find insulin genes and related diabetes diseases"
440
- - "Show BRCA1 orthologs and their cancer associations"
441
- - "Get DNA repair genes and associated phenotypes"
442
-
443
- #### Location & Function Specific
444
- - "Find transcription factors on chromosome 17 in human"
445
- - "Search for kinase genes in mouse involved in development"
446
- - "Get membrane proteins in fly NOT channels"
447
-
448
- #### Traditional Queries (Still Supported)
449
- - "Find orthologs of BRCA1 in mouse and zebrafish"
450
- - "BLAST this DNA sequence and show top 10 matches"
451
- - "Get expression data for TP53 across all tissues"
452
- - "Show me cache performance statistics"
453
-
454
- ## Monitoring Dashboard
455
-
456
- The server provides comprehensive monitoring:
457
-
458
- ```javascript
459
- // Real-time performance metrics
460
- {
461
- "cache": {
462
- "keys": 156,
463
- "hits": 1240,
464
- "misses": 180,
465
- "hitRate": "87.3%"
466
- },
467
- "rateLimits": {
468
- "/search": [timestamps...],
469
- "/gene": [timestamps...]
470
- },
471
- "uptime": 3600.5,
472
- "memoryUsage": "28.4MB"
473
- }
474
- ```
475
-
476
- ## Production Deployment
477
-
478
- ### PM2 Process Manager
479
- ```bash
480
- # Install PM2
481
- npm install -g pm2
86
+ ## Usage
482
87
 
483
- # Start with PM2
484
- pm2 start src/agr-server-enhanced.js --name agr-mcp-server
88
+ Ask questions naturally:
485
89
 
486
- # Monitor processes
487
- pm2 monit
90
+ - "Search for BRCA1 genes in human"
91
+ - "What genes are involved in DNA repair?"
92
+ - "Get information about HGNC:1100"
93
+ - "Find orthologs of insulin gene"
94
+ - "What diseases are associated with TP53?"
95
+ - "Show me expression data for daf-2 in worm"
488
96
 
489
- # View logs
490
- pm2 logs agr-mcp-server
491
- ```
97
+ ### Supported Species
492
98
 
493
- ### Health Monitoring
494
- ```bash
495
- # Built-in health check
496
- npm run health-check
99
+ Human, mouse, rat, zebrafish, fly, worm, yeast, xenopus
497
100
 
498
- # Custom monitoring script
499
- node scripts/monitor.js
500
- ```
101
+ ## Tools
501
102
 
502
- ## Key Advantages Over Python
103
+ | Tool | Description |
104
+ |------|-------------|
105
+ | `search_genes` | Search genes with optional species filter |
106
+ | `get_gene_info` | Detailed gene information (symbol, location, synonyms) |
107
+ | `get_gene_diseases` | Disease associations for a gene |
108
+ | `search_diseases` | Search diseases by name |
109
+ | `get_gene_expression` | Expression data across tissues/stages |
110
+ | `find_orthologs` | Cross-species homologs |
111
+ | `get_gene_phenotypes` | Phenotype annotations |
112
+ | `get_gene_interactions` | Molecular and genetic interactions |
113
+ | `get_gene_alleles` | Alleles/variants for a gene |
114
+ | `search_alleles` | Search alleles by name |
115
+ | `get_species_list` | List supported model organisms |
503
116
 
504
- 1. **Performance**: 25-40% faster response times
505
- 2. **Smart Caching**: Intelligent TTL and automatic cleanup
506
- 3. **Robust Validation**: Comprehensive input checking
507
- 4. **Monitoring**: Real-time performance metrics
508
- 5. **Error Handling**: Advanced retry and recovery logic
509
- 6. **Configuration**: Flexible, environment-aware settings
510
- 7. **Documentation**: TypeScript-style JSDoc throughout
511
- 8. **DevOps**: Docker, PM2, and monitoring ready
117
+ ## Gene ID Formats
512
118
 
513
- ## Support
119
+ | Species | Format | Example |
120
+ |---------|--------|---------|
121
+ | Human | `HGNC:*` | `HGNC:1100` |
122
+ | Mouse | `MGI:*` | `MGI:95892` |
123
+ | Rat | `RGD:*` | `RGD:3889` |
124
+ | Zebrafish | `ZFIN:ZDB-GENE-*` | `ZFIN:ZDB-GENE-990415-72` |
125
+ | Fly | `FB:FBgn*` | `FB:FBgn0000017` |
126
+ | Worm | `WB:WBGene*` | `WB:WBGene00000898` |
127
+ | Yeast | `SGD:S*` | `SGD:S000002536` |
128
+ | Xenopus | `Xenbase:XB-GENE-*` | `Xenbase:XB-GENE-485905` |
514
129
 
515
- - **Issues**: GitHub Issues
516
- - **Documentation**: JSDoc generated docs in `/docs`
517
- - **Health Check**: `npm run health-check`
518
- - **Performance**: `npm run benchmark`
130
+ ## Data Sources
519
131
 
520
- ## Status: Production Ready
132
+ - **Search & gene data**: [Alliance of Genome Resources API](https://www.alliancegenome.org/api)
133
+ - **Advanced search**: [AllianceMine](https://www.alliancegenome.org/alliancemine)
521
134
 
522
- **Enhanced JavaScript Implementation Complete**
523
- - High-performance architecture with caching
524
- - Robust error handling and validation
525
- - Comprehensive monitoring and logging
526
- - Advanced configuration management
527
- - Full testing and quality assurance
528
- - Production deployment ready
529
- - Complete documentation
135
+ ## License
530
136
 
531
- **Ready for immediate deployment as a faster, more reliable alternative to the Python version!**
137
+ MIT
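
Note on the "Gene ID Formats" table added in 4.0.2: the prefixes map one-to-one onto species, so a client could guess the species from an ID before calling a tool such as `get_gene_info`. The sketch below is illustrative only and is not part of `agr-mcp-server`. The helper name `speciesForGeneId` is hypothetical, and the regular expressions assume that each `*` in the table stands for digits (or, for ZFIN, digits and hyphens), which the README does not state.

```javascript
// Prefix patterns derived from the "Gene ID Formats" table in the 4.0.2 README.
// ASSUMPTION: "*" in the table means digits (digits/hyphens for ZFIN).
// `speciesForGeneId` is a hypothetical helper, not an agr-mcp-server export.
const GENE_ID_PATTERNS = [
  ["Human", /^HGNC:\d+$/],
  ["Mouse", /^MGI:\d+$/],
  ["Rat", /^RGD:\d+$/],
  ["Zebrafish", /^ZFIN:ZDB-GENE-[\w-]+$/],
  ["Fly", /^FB:FBgn\d+$/],
  ["Worm", /^WB:WBGene\d+$/],
  ["Yeast", /^SGD:S\d+$/],
  ["Xenopus", /^Xenbase:XB-GENE-\d+$/],
];

// Return the species name for a recognised gene ID, or null otherwise.
function speciesForGeneId(id) {
  const trimmed = String(id).trim();
  const match = GENE_ID_PATTERNS.find(([, pattern]) => pattern.test(trimmed));
  return match ? match[0] : null;
}

console.log(speciesForGeneId("HGNC:1100"));               // Human
console.log(speciesForGeneId("ZFIN:ZDB-GENE-990415-72")); // Zebrafish
console.log(speciesForGeneId("BRCA1"));                   // null (a symbol, not an ID)
```

A `null` result suggests the input is a gene symbol rather than an ID, in which case a search tool such as `search_genes` would be the more natural first call.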
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "agr-mcp-server",
3
- "version": "4.0.1",
3
+ "version": "4.0.2",
4
4
  "description": "MCP server for Alliance of Genome Resources - access genomics data across model organisms",
5
5
  "main": "dist/index.js",
6
6
  "bin": {