mcp-sequential-research 1.0.0 → 1.1.1
- package/README.md +7 -6
- package/docs/MCP_GUIDANCE.md +81 -59
- package/examples/example_workflow.md +6 -3
- package/package.json +1 -1
package/README.md
CHANGED
@@ -229,12 +229,13 @@ This format is designed for downstream claim-mining tools.

 Works with other MCP servers:

-| Source Type | Recommended MCP |
-
-| Patents | Google Patents MCP
-| Web | Google Search MCP
-|
-|
+| Source Type | Recommended MCP | Tool |
+|-------------|-----------------|------|
+| Patents | Google Patents MCP | `search_patents` |
+| Web Search | Google Search MCP | `google_search` |
+| Web Scraping | Google Search MCP | `read_webpage` |
+| Memory | Memory MCP | `search_nodes` |
+| Academic | Semantic Scholar API | — |

 ## License

package/docs/MCP_GUIDANCE.md
CHANGED
@@ -32,14 +32,13 @@ This is the exact **operator loop** Claude Code follows for comprehensive resear
 │ └─→ sequential-research:sequential_research_plan(prompt, constraints) │
 │ ↓ │
 │ 2. WEB QUERIES (for each plan.queries where query_family == "web") │
-│ └─→ google-search:
+│ └─→ google-search:google_search({query, num: 10}) │
 │ Returns: {title, link, snippet}[] │
 │ ↓ │
 │ 3. SCRAPE WEB CONTENT │
 │ ├─→ Collect top URLs (dedupe) │
-│
-│
-│ Returns: {markdown, metadata} │
+│ └─→ For each URL: google-search:read_webpage({url}) │
+│ Returns: {title, text, url} │
 │ ↓ │
 │ 4. PATENT QUERIES (for each plan.queries where query_family == "patent") │
 │ └─→ google-patents:search_patents({query, num_results}) │
@@ -89,7 +88,7 @@ For each query where `query_family == "web"`:

 ```json
 {
-  "tool": "google-search:
+  "tool": "google-search:google_search",
   "arguments": {
     "query": "photonic computing silicon photonics site:.edu OR filetype:pdf",
     "num": 10
@@ -110,66 +109,69 @@ For each query where `query_family == "web"`:
 }
 ```

-### Step 3: Scrape Web Content via
+### Step 3: Scrape Web Content via Google Search MCP `read_webpage`

-After collecting search results, extract full content using
+After collecting search results, extract full content using `read_webpage`.

-**For
+**For each URL:**
 ```json
 {
-  "tool": "
+  "tool": "google-search:read_webpage",
   "arguments": {
-    "
-      "https://example.mit.edu/photonics.pdf",
-      "https://lightmatter.co/technology",
-      "https://ieee.org/article/photonic-computing"
-    ],
-    "options": {
-      "formats": ["markdown"],
-      "onlyMainContent": true
-    }
+    "url": "https://example.mit.edu/photonics.pdf"
   }
 }
 ```

-**
+**Response format:**
 ```json
 {
-  "
-  "
-
-    "formats": ["markdown"],
-    "onlyMainContent": true
-  }
+  "title": "Silicon Photonics for AI - MIT",
+  "text": "# Silicon Photonics for AI\n\nRecent advances in silicon photonics have enabled...",
+  "url": "https://example.mit.edu/photonics.pdf"
 }
 ```

-**
+**Note:** Call `read_webpage` for each URL sequentially or in parallel. The tool automatically converts HTML to readable text and handles most page types.
+
+### Step 4: Execute Patent Queries via Google Patents MCP
+
+For each query where `query_family == "patent"`:
+
 ```json
 {
-  "
-  "
-  "
-  "
-
-
-
+  "tool": "google-patents:search_patents",
+  "arguments": {
+    "q": "photonic neural network accelerator",
+    "num": 10,
+    "country": "US",
+    "after": "publication:20200101",
+    "sort": "new"
   }
 }
 ```

-
+**IMPORTANT:** Always specify `sort: "new"` or `sort: "old"`. The default `sort: "relevance"` is NOT supported by SerpApi and will cause an error.

-
+**Parameter notes:**
+- `q` - Search query (required). Use semicolons to separate terms: `"(photonic) OR (optical);neural network"`
+- `num` - Results per page (10-100, default: 10)
+- `country` - Filter by country codes: `"US"`, `"US,WO,EP"`
+- `after` - Date filter format: `"publication:YYYYMMDD"` or `"filing:YYYYMMDD"`
+- `before` - Date filter format: `"publication:YYYYMMDD"` or `"filing:YYYYMMDD"`
+- `status` - Filter: `"GRANT"` or `"APPLICATION"`
+- `sort` - **Must be `"new"` or `"old"`** (NOT `"relevance"`)

 ```json
+// CORRECT - explicit sort parameter
 {
   "tool": "google-patents:search_patents",
   "arguments": {
-    "
-    "
-    "country": "US",
-    "after": "
+    "q": "(optical computing) AND (neural network)",
+    "num": 10,
+    "country": "US,WO",
+    "after": "publication:20180101",
+    "sort": "new"
   }
 }
 ```
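The parameter notes above translate directly into a pre-flight check. The following is a minimal sketch and not code from the package: `buildPatentArgs` is a hypothetical helper, and its checks only restate the documented constraints (required `q`, `num` between 10 and 100, `sort` limited to `"new"` or `"old"`, and the `publication:` / `filing:` date prefixes).

```javascript
// Hypothetical helper (not part of mcp-sequential-research): builds the
// `arguments` object for google-patents:search_patents and rejects values
// that the parameter notes rule out.
const ALLOWED_SORT = ["new", "old"]; // "relevance" is documented as unsupported
const DATE_FILTER = /^(publication|filing):\d{8}$/; // e.g. "publication:20200101"

function buildPatentArgs({ q, num = 10, country, after, before, status, sort }) {
  if (typeof q !== "string" || q.trim() === "") {
    throw new Error("q is required");
  }
  if (!Number.isInteger(num) || num < 10 || num > 100) {
    throw new Error("num must be an integer between 10 and 100");
  }
  if (!ALLOWED_SORT.includes(sort)) {
    throw new Error('sort must be "new" or "old"');
  }
  for (const [name, value] of [["after", after], ["before", before]]) {
    if (value !== undefined && !DATE_FILTER.test(value)) {
      throw new Error(`${name} must look like "publication:YYYYMMDD" or "filing:YYYYMMDD"`);
    }
  }
  if (status !== undefined && status !== "GRANT" && status !== "APPLICATION") {
    throw new Error('status must be "GRANT" or "APPLICATION"');
  }
  // Drop unset optional fields so the payload only carries what was asked for.
  const args = { q, num, country, after, before, status, sort };
  return Object.fromEntries(Object.entries(args).filter(([, v]) => v !== undefined));
}
```

Passing the Step 4 values (`q`, `country: "US"`, `after: "publication:20200101"`, `sort: "new"`) yields the same payload as the JSON example with `num` defaulted to 10, while omitting `sort` or passing `"relevance"` fails locally before any request is made.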
@@ -234,7 +236,7 @@ Transform all responses into the standard schema with sequential source IDs:
 2. Deduplicate URLs before assigning IDs
 3. Patents get `source_type: "patent"` with extra fields
 4. Web content gets `source_type: "web"`
-5.
+5. The `text` from `read_webpage` goes in `excerpt` (truncated if needed)

 ### Step 6: Call `sequential_research_compile`

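The normalization rules in this hunk (dedupe first, then sequential IDs, with `text` mapped to `excerpt`) can be sketched as follows. This is illustrative only: the `source_id` field name, the `S1`, `S2`, ... ID format, and the 500-character cap are assumptions, since the diff does not show them.

```javascript
// Sketch only: the "S1", "S2", ... ID format, the `source_id` field name and
// the 500-character limit are assumptions, not taken from the package.
const MAX_EXCERPT_CHARS = 500;

function normalizeWebPages(pages) {
  const seen = new Set();
  const results = [];
  for (const page of pages) {
    if (seen.has(page.url)) continue; // dedupe before assigning IDs
    seen.add(page.url);
    const text = page.text ?? "";
    results.push({
      source_id: `S${results.length + 1}`, // sequential, assigned after dedupe
      source_type: "web",
      title: page.title,
      url: page.url,
      // `text` from read_webpage becomes `excerpt`, truncated if needed
      excerpt: text.length > MAX_EXCERPT_CHARS ? text.slice(0, MAX_EXCERPT_CHARS) : text,
    });
  }
  return results;
}
```

Because IDs are derived from the length of the deduplicated list, a repeated URL never consumes an ID, which keeps the numbering gap-free.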
@@ -280,8 +282,8 @@ await fs.writeFile(`research/${slug}/raw_results.json`, JSON.stringify(rawResult
 | Step | MCP Server | Tool | Purpose |
 |------|-----------|------|---------|
 | 1 | sequential-research | `sequential_research_plan` | Generate structured query plan |
-| 2 | google-search | `
-| 3 |
+| 2 | google-search | `google_search` | Get web search results |
+| 3 | google-search | `read_webpage` | Extract full page content |
 | 4 | google-patents | `search_patents` | Search patent database |
 | 5 | — | — | Normalize to raw_results[] |
 | 6 | sequential-research | `sequential_research_compile` | Generate report with citations |
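One way to read the step table is as a dispatch on `query_family`. The following is a minimal sketch under stated assumptions: `toToolCall` is hypothetical, plan entries are assumed to carry a `query` string, and `num: 10` mirrors the examples in this guide.

```javascript
// Sketch only: maps a plan query to the MCP tool call the step table assigns
// to its family. The `query` field on plan entries is an assumption.
const TOOL_BY_FAMILY = new Map([
  ["web", "google-search:google_search"],
  ["patent", "google-patents:search_patents"],
]);

function toToolCall(planQuery) {
  const tool = TOOL_BY_FAMILY.get(planQuery.query_family);
  if (tool === undefined) {
    throw new Error(`no MCP tool mapped for query_family "${planQuery.query_family}"`);
  }
  if (planQuery.query_family === "patent") {
    // search_patents takes `q` and needs an explicit sort ("new" or "old").
    return { tool, arguments: { q: planQuery.query, num: 10, sort: "new" } };
  }
  return { tool, arguments: { query: planQuery.query, num: 10 } };
}
```

Unknown families throw instead of silently falling back to web search, so a plan containing a family with no mapped tool is surfaced early.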
@@ -289,31 +291,40 @@ await fs.writeFile(`research/${slug}/raw_results.json`, JSON.stringify(rawResult

 ---

-##
+## Web Scraping with `read_webpage`
+
+The `google-search:read_webpage` tool provides a simple, reliable way to fetch web content without additional MCP server dependencies.

-###
+### Usage

-
-
-
-
-
-
+```json
+{
+  "tool": "google-search:read_webpage",
+  "arguments": {
+    "url": "https://example.com/article"
+  }
+}
+```

-###
+### Response Format

 ```json
 {
-  "
-  "
-  "
-  "excludeTags": ["nav", "footer", "aside"], // Skip navigation
-  "waitFor": 2000, // Wait for JS rendering (ms)
-  "timeout": 30000 // Request timeout (ms)
+  "title": "Article Title",
+  "text": "The full text content of the page...",
+  "url": "https://example.com/article"
 }
 ```

-###
+### Features
+
+- **Automatic HTML to text conversion** — Clean, readable output
+- **No additional setup** — Uses the same google-search MCP server
+- **Handles most page types** — HTML, some PDFs, etc.
+
+### Handling Scrape Errors
+
+If a URL fails to scrape, include it in results with a note:

 ```json
 {
@@ -326,13 +337,24 @@ await fs.writeFile(`research/${slug}/raw_results.json`, JSON.stringify(rawResult
       "source_type": "web",
       "title": "Page Title (scrape failed)",
       "url": "https://example.com/blocked",
-      "excerpt": "[Content unavailable -
+      "excerpt": "[Content unavailable - page could not be fetched]"
     }
   ],
   "execution_notes": "1 of 5 URLs failed to scrape"
 }
 ```

+### Parallel Execution
+
+You can call `read_webpage` for multiple URLs in parallel to improve throughput:
+
+```
+// Execute these concurrently:
+google-search:read_webpage({url: "https://site1.com/page"})
+google-search:read_webpage({url: "https://site2.com/page"})
+google-search:read_webpage({url: "https://site3.com/page"})
+```
+
 ## Citation Format Requirement

 Citations must be **stable** and **machine-parseable** for downstream claim-mining.
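The scrape-error and parallel-execution guidance added to `MCP_GUIDANCE.md` can be combined in one routine. This is a sketch, not the package's implementation: `readWebpage` is a caller-supplied stand-in for invoking `google-search:read_webpage`, the `raw_results` key is assumed from the step table, and using the URL as the placeholder title is a choice made here because the real title is unknown when the fetch fails.

```javascript
// Sketch only: `readWebpage` is a hypothetical function supplied by the
// caller. It is assumed to resolve to {title, text, url} and to reject
// (or throw) when the page cannot be fetched.
async function scrapeAll(urls, readWebpage) {
  // The async wrapper turns synchronous throws into rejections, so one bad
  // URL can never abort the whole batch.
  const settled = await Promise.allSettled(urls.map(async (url) => readWebpage(url)));
  let failed = 0;
  const raw_results = settled.map((outcome, i) => {
    if (outcome.status === "fulfilled") {
      const { title, text, url } = outcome.value;
      return { source_type: "web", title, url, excerpt: text };
    }
    failed += 1;
    // Keep the failed URL in the results, flagged with the documented note.
    return {
      source_type: "web",
      title: `${urls[i]} (scrape failed)`,
      url: urls[i],
      excerpt: "[Content unavailable - page could not be fetched]",
    };
  });
  const execution_notes = failed > 0 ? `${failed} of ${urls.length} URLs failed to scrape` : "";
  return { raw_results, execution_notes };
}
```

`Promise.allSettled` is used rather than `Promise.all` so that results stay in input order and a single rejection does not discard the pages that did load.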
package/examples/example_workflow.md
CHANGED
@@ -119,15 +119,18 @@ Execute queries using appropriate MCP tools. Here's how each query maps to data
   "params": {
     "name": "search_patents",
     "arguments": {
-      "
-      "
+      "q": "photonic computing neural network inference",
+      "num": 10,
       "country": "US",
-      "after": "
+      "after": "publication:20200101",
+      "sort": "new"
     }
   }
 }
 ```

+**IMPORTANT:** Always use `sort: "new"` or `sort: "old"`. The default `sort: "relevance"` is NOT supported by SerpApi and will cause an error.
+
 ### Example: Query q1 via Google Search MCP

 ```json