@yeyuan98/opencode-bioresearcher-plugin 1.5.2 → 1.5.3
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +1 -0
- package/dist/agents/bioresearcher/prompt.js +235 -235
- package/dist/skills/bioresearcher-core/patterns/bioresearcher/analysis-methods.md +551 -551
- package/dist/skills/bioresearcher-core/patterns/bioresearcher/best-practices.md +647 -647
- package/dist/skills/bioresearcher-core/patterns/bioresearcher/python-standards.md +944 -944
- package/dist/skills/bioresearcher-core/patterns/bioresearcher/report-template.md +613 -613
- package/dist/skills/bioresearcher-core/patterns/bioresearcher/tool-selection.md +481 -481
- package/dist/skills/bioresearcher-core/patterns/citations.md +234 -234
- package/dist/skills/bioresearcher-core/patterns/rate-limiting.md +167 -167
- package/dist/skills/gromacs-guides/SKILL.md +48 -0
- package/dist/skills/gromacs-guides/guides/create_index.md +96 -0
- package/dist/skills/gromacs-guides/guides/inspect_tpr.md +93 -0
- package/package.json +1 -1
@@ -1,481 +1,481 @@

# Tool Selection Decision Framework

Comprehensive decision trees for choosing the correct tool based on user intent and data source.

## Overview

This pattern guides intelligent tool selection across 7 tool categories:

1. Database Tools (db*)
2. Table Tools (table*)
3. Web Tools (web*)
4. BioMCP Tools (biomcp*)
5. Parser Tools (parse*)
6. Miscellaneous Tools
7. Core File Tools

---

## 1. Data Source Identification

```
USER INTENT ANALYSIS:
├─ "Query database" / "SQL" / "table in database"
│   → DATABASE TOOLS (db*)
│
├─ "Excel file" / "CSV file" / "table file" / "xlsx" / "spreadsheet"
│   → TABLE TOOLS (table*)
│
├─ "website" / "web page" / "URL" / "fetch from"
│   → WEB TOOLS (web*)
│
├─ "PubMed" / "articles" / "literature" / "papers" / "publications"
│   → BIOMCP ARTICLE TOOLS
│
├─ "clinical trial" / "NCT" / "trial data"
│   → BIOMCP TRIAL TOOLS
│
├─ "gene" / "variant" / "mutation" / "genomic"
│   → BIOMCP GENE/VARIANT TOOLS
│
├─ "drug" / "compound" / "FDA" / "medication"
│   → BIOMCP DRUG/FDA TOOLS
│
├─ "parse" / "convert" / "XML" / "OBO" / "ontology"
│   → PARSER TOOLS (parse*)
│
└─ Unclear intent
    → ASK user for clarification via question tool
```
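The routing table above amounts to a first-match keyword lookup. A minimal Python sketch follows; this is illustrative only, not the plugin's actual implementation, and the regex patterns are assumptions distilled from the tree:

```python
import re

# First-match routing table; category strings mirror this document,
# keyword patterns are illustrative assumptions.
ROUTES = [
    (r"\b(database|sql)\b",                        "DATABASE TOOLS (db*)"),
    (r"\b(excel|xlsx|csv|spreadsheet)\b",          "TABLE TOOLS (table*)"),
    (r"\b(website|web page|url|fetch)\b",          "WEB TOOLS (web*)"),
    (r"\b(pubmed|articles?|literature|papers?)\b", "BIOMCP ARTICLE TOOLS"),
    (r"\b(clinical trial|nct|trial data)\b",       "BIOMCP TRIAL TOOLS"),
    (r"\b(gene|variant|mutation|genomic)\b",       "BIOMCP GENE/VARIANT TOOLS"),
    (r"\b(drug|compound|fda|medication)\b",        "BIOMCP DRUG/FDA TOOLS"),
    (r"\b(parse|convert|xml|obo|ontology)\b",      "PARSER TOOLS (parse*)"),
]

def route_intent(query: str) -> str:
    """Return the first matching tool category, else ask for clarification."""
    q = query.lower()
    for pattern, category in ROUTES:
        if re.search(pattern, q):
            return category
    return "ASK user for clarification"
```

A real agent would weigh more context than keywords, but the first-match fallthrough to a clarification question matches the shape of the tree above.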

---

## 2. Database Tools Decision Tree

### When to Use

- User mentions database, SQL, or structured query
- Need to query a relational or document database
- Data exists in MySQL or MongoDB

### Workflow

```
IF user mentions database/SQL:

  Step 1: Check Configuration
    IF env.jsonc does NOT exist:
      → Load skill 'env-jsonc-setup'
      → Follow skill workflow to configure database

  Step 2: Discover Available Data
    dbListTables() → Show all tables to user
    → Identify relevant tables

  Step 3: Understand Schema
    dbDescribeTable(table_name) → Get column structure
    → Identify available fields, data types, keys

  Step 4: Query with Filters (BEST PRACTICE)
    ✅ DO: dbQuery(
      "SELECT * FROM trials WHERE phase = :phase AND status = :status",
      {phase: "Phase 3", status: "Recruiting"}
    )

    ❌ DON'T: dbQuery("SELECT * FROM trials")
      → then filter in Python

  Step 5: Use Named Parameters (SAFETY)
    - NEVER concatenate SQL strings
    - ALWAYS use :paramName syntax
    - Pass parameters as the second argument
```
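Steps 4 and 5 can be demonstrated with Python's built-in `sqlite3` module, which uses the same `:name` placeholder style. This is a standalone sketch; `dbQuery` is the plugin's tool and may differ in details, but the shape — SQL with named placeholders plus a parameter mapping, never string concatenation — is the same:

```python
import sqlite3

# In-memory table standing in for a real trials database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trials (name TEXT, phase TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO trials VALUES (:name, :phase, :status)",
    [
        {"name": "T1", "phase": "Phase 3", "status": "Recruiting"},
        {"name": "T2", "phase": "Phase 2", "status": "Recruiting"},
    ],
)

# ✅ Named parameters: the driver escapes values, so malicious input
# like "Phase 3'; DROP TABLE trials; --" stays inert data.
rows = conn.execute(
    "SELECT name FROM trials WHERE phase = :phase AND status = :status",
    {"phase": "Phase 3", "status": "Recruiting"},
).fetchall()
print(rows)  # [('T1',)]
```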

### Example: Database Research Workflow

```markdown
User: "Find all Phase 3 melanoma trials from the database"

Agent actions:
1. dbListTables() → See available tables
2. dbDescribeTable("clinical_trials") → Understand schema
3. dbQuery(
     "SELECT * FROM clinical_trials
      WHERE phase = :phase
        AND condition LIKE :condition
        AND status = :status",
     {
       phase: "Phase 3",
       condition: "%melanoma%",
       status: "Recruiting"
     }
   )
4. Return filtered results with proper citations
```

### Tool Reference

| Tool | Purpose | When to Use |
|------|---------|-------------|
| `dbListTables` | List all tables/collections | Initial discovery |
| `dbDescribeTable` | Get column schema | Before querying |
| `dbQuery` | Execute SELECT query | Data retrieval |

---

## 3. Table Tools Decision Tree

### When to Use

- User provides an Excel (.xlsx), CSV, or ODS file
- Need to analyze, filter, or transform tabular data
- Data exists on the local file system

### Row Count Decision Matrix

```
IF rows < 30:
  → Use table tools directly (fastest)
    - tableFilterRows() for filtering
    - tableGroupBy() for aggregation
    - tableSummarize() for statistics
    - tablePivotSummary() for cross-tabs

IF rows >= 30 AND rows < 1000:
  → Decision based on complexity:
    IF need structured summarization with JSON schema:
      → Load skill 'long-table-summary'
    ELSE:
      → Use table tools directly

IF rows >= 1000:
  → Decision based on complexity:
    IF need structured summarization:
      → Load skill 'long-table-summary'
    ELSE IF need complex transformations:
      → Write custom Python script
    ELSE:
      → Use table tools with targeted operations
```
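The matrix above reduces to a small decision function. A minimal sketch — the thresholds come directly from this document, while the function and flag names are illustrative:

```python
def choose_table_approach(rows: int,
                          needs_structured_summary: bool = False,
                          needs_complex_transform: bool = False) -> str:
    """Map row count and task complexity to an analysis approach."""
    if rows < 30:
        return "table tools directly"
    if rows < 1000:
        # Mid-size: only structured summarization justifies the skill.
        if needs_structured_summary:
            return "skill 'long-table-summary'"
        return "table tools directly"
    # Large tables: escalate by complexity.
    if needs_structured_summary:
        return "skill 'long-table-summary'"
    if needs_complex_transform:
        return "custom Python script"
    return "table tools with targeted operations"
```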

### Workflow

```
IF user provides Excel/CSV file:

  Step 1: Preview Structure
    tableListSheets(file_path) → See available sheets
    tableGetSheetPreview(file_path) → First 6 rows
    tableGetHeaders(file_path) → Column names

  Step 2: Determine Row Count
    tableGetRange(file_path, range="A1:A10000")
    → Count non-empty rows

  Step 3: Choose Analysis Approach (see decision matrix above)

  Step 4: Apply Upfront Filtering (BEST PRACTICE)
    ✅ DO: tableFilterRows(
      file_path,
      column="Status",
      operator="=",
      value="Active"
    )

    ❌ DON'T: Load the entire table, then filter in Python
```

### Upfront Filtering Examples

```
# Filter by single condition
tableFilterRows(file, column="Phase", operator="=", value="Phase 3")

# Filter by numeric range
tableFilterRows(file, column="Age", operator=">=", value=18)

# Filter by text contains
tableFilterRows(file, column="Condition", operator="contains", value="melanoma")
```

### Tool Reference

| Tool | Purpose | When to Use |
|------|---------|-------------|
| `tableListSheets` | List worksheet names | Multi-sheet workbooks |
| `tableGetSheetPreview` | Preview first 6 rows | Understand structure |
| `tableGetHeaders` | Get column names | Identify available fields |
| `tableFilterRows` | Filter by condition | Extract subsets |
| `tableSearch` | Search across all cells | Find specific values |
| `tableSummarize` | Statistical summary | Numeric analysis |
| `tableGroupBy` | Group and aggregate | Categorical analysis |
| `tablePivotSummary` | Cross-tabulation | Multi-dimensional analysis |
| `tableGetRange` | Extract data range | Bulk data retrieval |
| `tableCreateFile` | Create new table | Output generation |
| `tableAppendRows` | Append rows | Data combination |

---

## 4. BioMCP Tools Decision Tree

### When to Use

- Biomedical research queries
- Literature search
- Clinical trial data
- Gene/variant/drug information
- FDA regulatory data

### Domain Selection

```
IF biomedical research query:

  Determine Domain:

  ├─ Literature/Papers
  │   → biomcp_article_searcher(keywords, genes, diseases)
  │   → biomcp_article_getter(pmid)
  │
  ├─ Clinical Trials
  │   → biomcp_trial_searcher(conditions, interventions)
  │   → biomcp_trial_getter(nct_id)
  │   → biomcp_trial_protocol_getter(nct_id)
  │   → biomcp_trial_outcomes_getter(nct_id)
  │
  ├─ Genes
  │   → biomcp_gene_getter(gene_id_or_symbol)
  │
  ├─ Variants/Mutations
  │   → biomcp_variant_searcher(gene, significance)
  │   → biomcp_variant_getter(variant_id)
  │
  ├─ Drugs/Compounds
  │   → biomcp_drug_getter(drug_id_or_name)
  │
  ├─ FDA Data
  │   → biomcp_openfda_adverse_searcher(drug, reaction)
  │   → biomcp_openfda_label_searcher(name, indication)
  │   → biomcp_openfda_approval_searcher(drug)
  │
  └─ Cross-Domain Search
      → biomcp_search(query="gene:BRAF AND disease:melanoma")
```

### Targeted Queries (BEST PRACTICE)

```
RULE: Use specific filters to narrow results upfront

✅ DO:
  biomcp_article_searcher(
    genes=["BRAF", "NRAS"],
    diseases=["melanoma"],
    keywords=["treatment resistance"],
    page_size=50
  )

❌ DON'T:
  biomcp_search(query="BRAF")
  → then manually filter through thousands of results
```

### Rate Limiting (MANDATORY)

```
ALWAYS use blockingTimer(0.3) between consecutive biomcp calls

FOR each biomcp query:
  result = biomcp_tool(...)

  IF more queries remain:
    blockingTimer(0.3)  # 300 ms delay

NEVER run concurrent biomcp calls (sequential only)
```
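The loop above can be sketched in Python, with `time.sleep` standing in for the plugin's `blockingTimer` tool (assumed here to take a delay in seconds). Note the delay is skipped after the final call, matching the "IF more queries remain" guard:

```python
import time

def blocking_timer(seconds: float) -> None:
    # Stand-in for the plugin's blockingTimer tool.
    time.sleep(seconds)

def run_sequential(queries, call, delay=0.3):
    """Run API calls one at a time, pausing between consecutive calls."""
    results = []
    for i, q in enumerate(queries):
        results.append(call(q))
        if i < len(queries) - 1:  # no delay after the final call
            blocking_timer(delay)
    return results
```

Keeping the calls sequential (one loop, no thread pool or async gather) is itself the point: concurrency is what trips the upstream rate limits.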

### Example: Literature Research Workflow

```markdown
User: "Find recent papers on BRAF V600E treatment resistance in melanoma"

Agent actions:
1. biomcp_article_searcher(
     genes=["BRAF"],
     diseases=["melanoma"],
     variants=["V600E"],
     keywords=["treatment resistance"],
     page_size=20
   )
2. blockingTimer(0.3)
3. FOR each relevant article:
     biomcp_article_getter(pmid=article.id)
     blockingTimer(0.3)
4. Synthesize findings with citations
```

---

## 5. Web Tools Decision Tree

### When to Use

- User provides a specific URL
- Need current information from the web
- Official biotech/pharma company data
- Information not available in databases

### Search vs Fetch

```
IF user provides a specific URL:
  → web-reader_webReader(url, return_format="markdown")
  → Extract content

IF user needs to find information:
  → web-search-prime_web_search_prime(search_query)
  → Identify relevant URLs
  → web-reader_webReader(url) for promising results
```

### Source Quality Verification

```
Source Priority:
✅ PREFER (High Quality):
  - .gov domains (FDA, NIH, NCI)
  - Official biotech/pharma company websites
  - Peer-reviewed journal websites
  - ClinicalTrials.gov

⚠️ CAUTION (Medium Quality):
  - News websites (verify with primary sources)
  - Industry publications
  - Conference abstracts

❌ AVOID (Low Quality):
  - Blogs and forums
  - User-generated content
  - Promotional materials
  - Unverified claims
```

### Rate Limiting

```
FOR each web request:
  result = webfetch(...)

  IF more requests remain:
    blockingTimer(0.5)  # 500 ms delay (more conservative)
```
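The source-priority list can be approximated by a small domain classifier. A hedged sketch — the host lists below are illustrative examples, not the plugin's policy, and a real agent would still read the page before trusting it:

```python
from urllib.parse import urlparse

# Illustrative tiers; extend these lists per project policy.
HIGH_HOSTS = {"clinicaltrials.gov", "www.fda.gov", "www.ncbi.nlm.nih.gov"}
LOW_HINTS = ("blog.", "forum.", "reddit.com")

def source_tier(url: str) -> str:
    """Classify a URL as high / medium / low priority per the list above."""
    host = urlparse(url).netloc.lower()
    if host in HIGH_HOSTS or host.endswith(".gov"):
        return "high"
    if any(hint in host for hint in LOW_HINTS):
        return "low"
    return "medium"  # e.g. news sites: verify against primary sources
```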

---

## 6. Parser Tools Decision Tree

### When to Use

- Need to parse structured file formats
- Convert between formats
- Process ontology files

### Tool Selection

```
IF file is PubMed XML (.xml or .xml.gz):
  → parse_pubmed_articleSet(
      filePath,
      outputMode="excel",  # or "single" or "individual"
      outputFileName="pubmed_results.xlsx"
    )

IF file is OBO ontology (.obo):
  → parse_obo_file(
      filePath,
      outputFileName="ontology.csv"
    )

IF file is JSON:
  → jsonExtract(file_path)
  → jsonValidate(data, schema) if needed
```
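The extension-based dispatch above, as a sketch. The tool names are the plugin's, returned here as strings purely for illustration:

```python
from pathlib import Path

def pick_parser(file_path: str) -> str:
    """Choose a parser tool from the file extension (compound .xml.gz included)."""
    name = Path(file_path).name.lower()
    if name.endswith((".xml", ".xml.gz")):
        return "parse_pubmed_articleSet"
    if name.endswith(".obo"):
        return "parse_obo_file"
    if name.endswith(".json"):
        return "jsonExtract"
    return "ask user / inspect file"
```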

---

## 7. Decision Flowchart Summary

```
START: User Query
│
├─ Analyze user intent
│
├─ Identify data source
│   ├─ Database → db* tools
│   ├─ Table file → table* tools
│   ├─ Web → web* tools
│   ├─ Biomedical → biomcp* tools
│   └─ Parse needed → parse* tools
│
├─ Apply upfront filtering
│   └─ Use targeted queries/filters
│
├─ Execute with rate limiting
│   └─ blockingTimer between API calls
│
├─ Validate results
│   └─ Check data quality
│
└─ Synthesize with citations
    └─ Reference-based report
```

---

## Best Practices Summary

### ✅ DO

- Use targeted queries with specific filters
- Apply upfront filtering at the data source
- Use named parameters in SQL queries
- Implement rate limiting between API calls
- Validate data before processing
- Use the decision trees to select tools

### ❌ DON'T

- Run bulk queries, then filter in Python
- Concatenate SQL strings
- Run concurrent API calls
- Skip rate limiting
- Load entire datasets without filtering
- Guess tool choices without analysis

---

## Common Patterns

### Pattern 1: Database → Analysis

```
dbListTables()
→ dbDescribeTable()
→ dbQuery(with filters)
→ analysis
```

### Pattern 2: Literature Search

```
biomcp_article_searcher(filters)
→ blockingTimer(0.3)
→ biomcp_article_getter(pmid)
→ repeat
```

### Pattern 3: Table Analysis

```
tableGetSheetPreview()
→ determine row count
→ choose approach
→ execute
```

### Pattern 4: Web Research

```
web-search-prime_web_search_prime()
→ identify URLs
→ web-reader_webReader()
→ extract data
```