pse-mcp 0.1.0 → 0.1.1

This diff shows the content of publicly available package versions that have been released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,220 +1,104 @@
1
- # Version 2.0 is here
2
-
3
- # Google Search MCP Server
4
- An MCP (Model Context Protocol) server that provides Google search capabilities and webpage content analysis tools. This server enables AI models to perform Google searches and analyze webpage content programmatically.
5
-
6
- ## Features
7
-
8
- - Google Custom Search integration
9
- - Advanced search features (filters, sorting, pagination, categorization)
10
- - Webpage content analysis in multiple formats (markdown, HTML, plain text)
11
- - Batch webpage analysis
12
- - Result categorization and classification
13
- - Content summarization
14
- - Optimized, human-readable responses
15
- - MCP-compliant interface
16
-
17
- ## Prerequisites
18
-
19
- - Node.js (v16 or higher)
20
- - Google Cloud Platform account
21
- - Custom Search Engine ID
22
- - Google API Key
23
-
24
- ## Installation
25
-
26
- 1. Clone the repository
27
- 2. Install Node.js dependencies:
28
- ```bash
29
- npm install
30
- ```
31
- 3. Build the TypeScript code:
32
- ```bash
33
- npm run build
34
- ```
35
-
36
- ## Configuration
37
-
38
- 1. Set up environment variables for your Google API credentials:
39
-
40
- You can either set these as system environment variables or configure them in your MCP settings file.
41
-
42
- Required environment variables:
43
- - `GOOGLE_API_KEY`: Your Google API key
44
- - `GOOGLE_SEARCH_ENGINE_ID`: Your Custom Search Engine ID
45
-
46
- 2. Add the server configuration to your MCP settings file (typically located at `%APPDATA%/Code/User/globalStorage/saoudrizwan.claude-dev/settings/cline_mcp_settings.json`):
47
- ```json
48
- {
49
- "mcpServers": {
50
- "google-search": {
51
- "autoApprove": [
52
- "google_search",
53
- "extract_webpage_content",
54
- "extract_multiple_webpages"
55
- ],
56
- "disabled": false,
57
- "timeout": 60,
58
- "command": "node",
59
- "args": [
60
- "/path/to/google-search-mcp-server/dist/google-search.js"
61
- ],
62
- "env": {
63
- "GOOGLE_API_KEY": "your-google-api-key",
64
- "GOOGLE_SEARCH_ENGINE_ID": "your-custom-search-engine-id"
65
- },
66
- "transportType": "stdio"
67
- }
68
- }
69
- }
70
- ```
71
-
72
- ## Running
73
-
74
- Start the MCP server:
75
- ```bash
76
- npm run start
77
- ```
78
-
79
- ## Available Tools
80
-
81
- ### 1. google_search
82
- Search Google and return relevant results from the web. This tool finds web pages, articles, and information on specific topics using Google's search engine.
83
-
84
- ```typescript
85
- {
86
- "name": "google_search",
87
- "arguments": {
88
- "query": "your search query",
89
- "num_results": 5, // optional, default: 5
90
- "site": "example.com", // optional, limit results to specific website
91
- "language": "en", // optional, filter by language (ISO 639-1 code)
92
- "dateRestrict": "m6", // optional, filter by date (e.g., "m6" for last 6 months)
93
- "exactTerms": "exact phrase", // optional, search for exact phrase
94
- "resultType": "news", // optional, specify type (news, images, videos)
95
- "page": 2, // optional, page number for pagination (starts at 1)
96
- "resultsPerPage": 10, // optional, results per page (max: 10)
97
- "sort": "date" // optional, sort by "date" or "relevance" (default)
98
- }
99
- }
100
- ```
101
-
102
- Response includes:
103
- - Search results with title, link, snippet in a readable format
104
- - Pagination information (current page, total results, etc.)
105
- - Categories of results (automatically detected)
106
- - Navigation hints for pagination
107
-
108
- ### 2. extract_webpage_content
109
- Extract and analyze content from a webpage, converting it to readable text. This tool fetches the main content while removing ads, navigation elements, and other clutter.
110
-
111
- ```typescript
112
- {
113
- "name": "extract_webpage_content",
114
- "arguments": {
115
- "url": "https://example.com",
116
- "format": "markdown" // optional, format options: "markdown" (default), "html", or "text"
117
- }
118
- }
119
- ```
120
-
121
- Response includes:
122
- - Title and description of the webpage
123
- - Content statistics (word count, character count)
124
- - Content summary
125
- - Content preview (first 500 characters)
126
-
127
- ### 3. extract_multiple_webpages
128
- Extract and analyze content from multiple webpages in a single request. Ideal for comparing information across different sources or gathering comprehensive information on a topic.
129
-
130
- ```typescript
131
- {
132
- "name": "extract_multiple_webpages",
133
- "arguments": {
134
- "urls": [
135
- "https://example1.com",
136
- "https://example2.com"
137
- ],
138
- "format": "html" // optional, format options: "markdown" (default), "html", or "text"
139
- }
140
- }
141
- ```
142
-
143
- Response includes:
144
- - Title and description of each webpage
145
- - Content statistics for each webpage
146
- - Content summary for each webpage
147
- - Content preview for each webpage (first 150 characters)
148
-
149
- ## Getting Google API Credentials
150
-
151
- 1. Go to the [Google Cloud Console](https://console.cloud.google.com/)
152
- 2. Create a new project or select an existing one
153
- 3. Enable the Custom Search API
154
- 4. Create API credentials (API Key)
155
- 5. Go to the [Custom Search Engine](https://programmablesearchengine.google.com/about/) page
156
- 6. Create a new search engine and get your Search Engine ID
157
- 7. Add these credentials to your MCP settings file or set them as environment variables
158
-
159
- ## Error Handling
160
-
161
- The server provides detailed error messages for:
162
- - Missing or invalid API credentials
163
- - Failed search requests
164
- - Invalid webpage URLs
165
- - Network connectivity issues
166
-
167
- ## Architecture
168
-
169
- The server is built with TypeScript and uses the MCP SDK to provide a standardized interface for AI models to interact with Google Search and webpage content analysis tools. It consists of two main services:
170
-
171
- 1. **GoogleSearchService**: Handles Google API interactions for search functionality
172
- 2. **ContentExtractor**: Manages webpage content analysis and extraction
173
-
174
- The server uses caching mechanisms to improve performance and reduce API calls.
175
-
176
- ## Distributing the Built Version
177
-
178
- If you prefer to distribute only the built version of this tool rather than the source code, you can follow these steps:
179
-
180
- 1. Build the TypeScript code:
181
- ```bash
182
- npm run build
183
- ```
184
-
185
- 2. Create a distribution package with only the necessary files:
186
- ```bash
187
- # Create a distribution directory
188
- mkdir -p dist-package
189
-
190
- # Copy the compiled JavaScript files
191
- cp -r dist dist-package/
192
-
193
- # Copy package files (without dev dependencies)
194
- cp package.json dist-package/
195
- cp README.md dist-package/
196
-
197
- # Create a simplified package.json for distribution
198
- node -e "const pkg = require('./package.json'); delete pkg.devDependencies; delete pkg.scripts.build; delete pkg.scripts.dev; pkg.scripts.start = 'node dist/google-search.js'; require('fs').writeFileSync('dist-package/package.json', JSON.stringify(pkg, null, 2));"
199
- ```
200
-
201
- 3. Users can then install and run the built version:
202
- ```bash
203
- # Install production dependencies only
204
- npm install --production
205
-
206
- # Start the server
207
- npm start
208
- ```
209
-
210
- This approach allows you to distribute the compiled JavaScript files without exposing the TypeScript source code. Users will still need to:
211
-
212
- 1. Configure their Google API credentials as environment variables
213
- 2. Add the server configuration to their MCP settings file
214
- 3. Install the production dependencies
215
-
216
- Note that the package.json in the distribution will only include production dependencies and a simplified set of scripts.
217
-
218
- ## License
219
-
220
- MIT
1
+ # Version 2.0 is here
2
+
3
+ # Google Search MCP Server
4
+ An MCP (Model Context Protocol) server that provides Google search capabilities. This server enables AI models to perform Google searches programmatically.
5
+
6
+ ## Features
7
+
8
+ - Google Custom Search integration
9
+ - Advanced search features (filters, sorting, pagination, categorization)
10
+ - Optimized, human-readable responses
11
+ - MCP-compliant interface
12
+
13
+ ## Prerequisites
14
+
15
+ - Node.js (v16 or higher)
16
+ - Google Cloud Platform account
17
+ - Custom Search Engine ID
18
+ - Google API Key
19
+
20
+ ## Configuration
21
+
22
+ 1. Set up environment variables for your Google API credentials:
23
+
24
+ You can either set these as system environment variables or configure them in your MCP settings file.
25
+
26
+ Required environment variables:
27
+ - `GOOGLE_API_KEY`: Your Google API key
28
+ - `GOOGLE_SEARCH_ENGINE_ID`: Your Custom Search Engine ID
29
+
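The two required variables above are read at startup. As an illustrative sketch only (not the package's actual code), a server like this would typically fail fast when either credential is missing; `loadConfig` and its return shape are hypothetical names for this example:

```typescript
// Sketch: validate required Google credentials at startup and fail fast.
// `loadConfig` is a hypothetical helper, not part of pse-mcp's API.
function loadConfig(env: Record<string, string | undefined> = process.env) {
  const apiKey = env.GOOGLE_API_KEY;
  const searchEngineId = env.GOOGLE_SEARCH_ENGINE_ID;
  if (!apiKey || !searchEngineId) {
    throw new Error(
      "Set GOOGLE_API_KEY and GOOGLE_SEARCH_ENGINE_ID before starting the server"
    );
  }
  return { apiKey, searchEngineId };
}
```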
30
+ 2. Add the server configuration to your MCP settings file:
31
+ ```json
32
+ {
33
+ "mcpServers": {
34
+ "google-search": {
35
+ "command": "npx",
36
+ "args": [
37
+ "-y",
38
+ "pse-mcp"
39
+ ],
40
+ "env": {
41
+ "GOOGLE_API_KEY": "your-google-api-key",
42
+ "GOOGLE_SEARCH_ENGINE_ID": "your-custom-search-engine-id"
43
+ }
44
+ }
45
+ }
46
+ }
47
+ ```
48
+
49
+ ## Available Tools
50
+
51
+ ### 1. google_search
52
+ Search Google and return relevant results from the web. This tool finds web pages, articles, and information on specific topics using Google's search engine.
53
+
54
+ ```typescript
55
+ {
56
+ "name": "google_search",
57
+ "arguments": {
58
+ "query": "your search query",
59
+ "num_results": 10, // optional, default: 10
60
+ "site": "example.com", // optional, limit results to specific website
61
+ "language": "en", // optional, filter by language (ISO 639-1 code)
62
+ "dateRestrict": "m6", // optional, filter by date (e.g., "m6" for last 6 months)
63
+ "exactTerms": "exact phrase", // optional, search for exact phrase
64
+ "resultType": "news", // optional, specify type (news, images, videos)
65
+ "page": 2, // optional, page number for pagination (starts at 1)
66
+ "resultsPerPage": 10, // optional, default: 10, max: 10
67
+ "sort": "date" // optional, sort by "date" or "relevance" (default)
68
+ }
69
+ }
70
+ ```
71
+
72
+ Response includes:
73
+ - Search results with title, link, snippet in a readable format
74
+ - Pagination information (current page, total results, etc.)
75
+ - Categories of results (automatically detected)
76
+ - Navigation hints for pagination
77
+
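The `page` and `resultsPerPage` arguments correspond to the Custom Search JSON API's windowing model: the API returns at most 10 results per request, selected by a 1-indexed `start` parameter. A minimal sketch of that mapping (an assumption about the implementation, not code from the package):

```typescript
// Sketch: map page/resultsPerPage onto the Custom Search API's 1-indexed
// `start` parameter. The API caps results per request at 10.
function toStartIndex(page: number, resultsPerPage: number): number {
  if (page < 1) throw new Error("page starts at 1");
  const perPage = Math.min(resultsPerPage, 10); // API maximum per request
  return (page - 1) * perPage + 1;
}

console.log(toStartIndex(1, 10)); // 1  (first page)
console.log(toStartIndex(2, 10)); // 11 (second page)
```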
78
+ ## Getting Google API Credentials
79
+
80
+ 1. Go to the [Google Cloud Console](https://console.cloud.google.com/)
81
+ 2. Create a new project or select an existing one
82
+ 3. Enable the Custom Search API
83
+ 4. Create API credentials (API Key)
84
+ 5. Go to the [Custom Search Engine](https://programmablesearchengine.google.com/about/) page
85
+ 6. Create a new search engine and get your Search Engine ID
86
+ 7. Add these credentials to your MCP settings file or set them as environment variables
87
+
88
+ ## Error Handling
89
+
90
+ The server provides detailed error messages for:
91
+ - Missing or invalid API credentials
92
+ - Failed search requests
93
+ - Invalid search parameters
94
+ - Network connectivity issues
95
+
96
+ ## Architecture
97
+
98
+ The server is built with TypeScript and uses the MCP SDK to provide a standardized interface for AI models to interact with Google Search. It consists of the **GoogleSearchService**, which handles Google API interactions for search functionality.
99
+
100
+ The server uses caching mechanisms to improve performance and reduce API calls.
101
+
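The compiled source in this diff hints at how that cache works: entries are evicted by deleting the oldest key once a size limit is reached. A hedged sketch of the idea (names and the size limit are assumptions, not the package's actual implementation):

```typescript
// Sketch of a bounded search cache that evicts its oldest entry.
// JavaScript's Map iterates keys in insertion order, so the first key
// returned by keys() is the oldest one inserted.
class BoundedCache<V> {
  private map = new Map<string, V>();
  constructor(private maxEntries = 100) {}

  get(key: string): V | undefined {
    return this.map.get(key);
  }

  set(key: string, value: V): void {
    if (this.map.size >= this.maxEntries) {
      const oldestKey = this.map.keys().next().value;
      if (oldestKey !== undefined) this.map.delete(oldestKey);
    }
    this.map.set(key, value);
  }
}
```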
102
+ ## License
103
+
104
+ MIT
@@ -3,11 +3,9 @@ import { Server } from '@modelcontextprotocol/sdk/server/index.js';
3
3
  import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
4
4
  import { CallToolRequestSchema, ListToolsRequestSchema } from '@modelcontextprotocol/sdk/types.js';
5
5
  import { GoogleSearchService } from './services/google-search.service.js';
6
- import { ContentExtractor } from './services/content-extractor.service.js';
7
6
  class GoogleSearchServer {
8
7
  constructor() {
9
8
  this.searchService = new GoogleSearchService();
10
- this.contentExtractor = new ContentExtractor();
11
9
  this.server = new Server({
12
10
  name: 'google-search',
13
11
  version: '1.0.0'
@@ -25,7 +23,7 @@ class GoogleSearchServer {
25
23
  },
26
24
  num_results: {
27
25
  type: 'number',
28
- description: 'Number of results to return (default: 5, max: 10). Increase for broader coverage, decrease for faster response.'
26
+ description: 'Number of results to return (default: 10, max: 10). Increase for broader coverage, decrease for faster response.'
29
27
  },
30
28
  site: {
31
29
  type: 'string',
@@ -53,7 +51,7 @@ class GoogleSearchServer {
53
51
  },
54
52
  resultsPerPage: {
55
53
  type: 'number',
56
- description: 'Number of results to show per page (default: 5, max: 10). Controls how many results are returned for each page.'
54
+ description: 'Number of results to show per page (default: 10, max: 10). Controls how many results are returned for each page.'
57
55
  },
58
56
  sort: {
59
57
  type: 'string',
@@ -62,41 +60,6 @@ class GoogleSearchServer {
62
60
  },
63
61
  required: ['query']
64
62
  }
65
- },
66
- extract_webpage_content: {
67
- description: 'Extract and analyze content from a webpage, converting it to readable text. This tool fetches the main content while removing ads, navigation elements, and other clutter. Use it to get detailed information from specific pages found via google_search. Works with most common webpage formats including articles, blogs, and documentation.',
68
- inputSchema: {
69
- type: 'object',
70
- properties: {
71
- url: {
72
- type: 'string',
73
- description: 'Full URL of the webpage to extract content from (must start with http:// or https://). Ensure the URL is from a public webpage and not behind authentication.'
74
- },
75
- format: {
76
- type: 'string',
77
- description: 'Output format for the extracted content. Options: "markdown" (default), "html", or "text".'
78
- }
79
- },
80
- required: ['url']
81
- }
82
- },
83
- extract_multiple_webpages: {
84
- description: 'Extract and analyze content from multiple webpages in a single request. This tool is ideal for comparing information across different sources or gathering comprehensive information on a topic. Limited to 5 URLs per request to maintain performance.',
85
- inputSchema: {
86
- type: 'object',
87
- properties: {
88
- urls: {
89
- type: 'array',
90
- items: { type: 'string' },
91
- description: 'Array of webpage URLs to extract content from. Each URL must be public and start with http:// or https://. Maximum 5 URLs per request.'
92
- },
93
- format: {
94
- type: 'string',
95
- description: 'Output format for the extracted content. Options: "markdown" (default), "html", or "text".'
96
- }
97
- },
98
- required: ['urls']
99
- }
100
63
  }
101
64
  }
102
65
  }
@@ -116,7 +79,7 @@ class GoogleSearchServer {
116
79
  },
117
80
  num_results: {
118
81
  type: 'number',
119
- description: 'Number of results to return (default: 5, max: 10). Increase for broader coverage, decrease for faster response.'
82
+ description: 'Number of results to return (default: 10, max: 10). Increase for broader coverage, decrease for faster response.'
120
83
  },
121
84
  site: {
122
85
  type: 'string',
@@ -144,7 +107,7 @@ class GoogleSearchServer {
144
107
  },
145
108
  resultsPerPage: {
146
109
  type: 'number',
147
- description: 'Number of results to show per page (default: 5, max: 10). Controls how many results are returned for each page.'
110
+ description: 'Number of results to show per page (default: 10, max: 10). Controls how many results are returned for each page.'
148
111
  },
149
112
  sort: {
150
113
  type: 'string',
@@ -153,43 +116,6 @@ class GoogleSearchServer {
153
116
  },
154
117
  required: ['query']
155
118
  }
156
- },
157
- {
158
- name: 'extract_webpage_content',
159
- description: 'Extract and analyze content from a webpage, converting it to readable text. This tool fetches the main content while removing ads, navigation elements, and other clutter. Use it to get detailed information from specific pages found via google_search. Works with most common webpage formats including articles, blogs, and documentation.',
160
- inputSchema: {
161
- type: 'object',
162
- properties: {
163
- url: {
164
- type: 'string',
165
- description: 'Full URL of the webpage to extract content from (must start with http:// or https://). Ensure the URL is from a public webpage and not behind authentication.'
166
- },
167
- format: {
168
- type: 'string',
169
- description: 'Output format for the extracted content. Options: "markdown" (default), "html", or "text".'
170
- }
171
- },
172
- required: ['url']
173
- }
174
- },
175
- {
176
- name: 'extract_multiple_webpages',
177
- description: 'Extract and analyze content from multiple webpages in a single request. This tool is ideal for comparing information across different sources or gathering comprehensive information on a topic. Limited to 5 URLs per request to maintain performance.',
178
- inputSchema: {
179
- type: 'object',
180
- properties: {
181
- urls: {
182
- type: 'array',
183
- items: { type: 'string' },
184
- description: 'Array of webpage URLs to extract content from. Each URL must be public and start with http:// or https://. Maximum 5 URLs per request.'
185
- },
186
- format: {
187
- type: 'string',
188
- description: 'Output format for the extracted content. Options: "markdown" (default), "html", or "text".'
189
- }
190
- },
191
- required: ['urls']
192
- }
193
119
  }
194
120
  ]
195
121
  }));
@@ -214,22 +140,6 @@ class GoogleSearchServer {
214
140
  });
215
141
  }
216
142
  throw new Error('Invalid arguments for google_search tool');
217
- case 'extract_webpage_content':
218
- if (typeof request.params.arguments === 'object' && request.params.arguments !== null && 'url' in request.params.arguments) {
219
- return this.handleAnalyzeWebpage({
220
- url: String(request.params.arguments.url),
221
- format: request.params.arguments.format ? String(request.params.arguments.format) : 'markdown'
222
- });
223
- }
224
- throw new Error('Invalid arguments for extract_webpage_content tool');
225
- case 'extract_multiple_webpages':
226
- if (typeof request.params.arguments === 'object' && request.params.arguments !== null && 'urls' in request.params.arguments && Array.isArray(request.params.arguments.urls)) {
227
- return this.handleBatchAnalyzeWebpages({
228
- urls: request.params.arguments.urls.map(String),
229
- format: request.params.arguments.format ? String(request.params.arguments.format) : 'markdown'
230
- });
231
- }
232
- throw new Error('Invalid arguments for extract_multiple_webpages tool');
233
143
  default:
234
144
  throw new Error(`Unknown tool: ${request.params.name}`);
235
145
  }
@@ -298,102 +208,6 @@ class GoogleSearchServer {
298
208
  };
299
209
  }
300
210
  }
301
- async handleAnalyzeWebpage(args) {
302
- try {
303
- const content = await this.contentExtractor.extractContent(args.url, args.format);
304
- // Format the response in a more readable, concise way
305
- let responseText = `Content from: ${content.url}\n\n`;
306
- responseText += `Title: ${content.title}\n`;
307
- if (content.description) {
308
- responseText += `Description: ${content.description}\n`;
309
- }
310
- responseText += `\nStats: ${content.stats.word_count} words, ${content.stats.approximate_chars} characters\n\n`;
311
- // Add the summary if available
312
- if (content.summary) {
313
- responseText += `Summary: ${content.summary}\n\n`;
314
- }
315
- // Add a preview of the content
316
- responseText += `Content Preview:\n${content.content_preview.first_500_chars}\n\n`;
317
- // Add a note about requesting specific information
318
- responseText += `Note: This is a preview of the content. For specific information, please ask about particular aspects of this webpage.`;
319
- return {
320
- content: [
321
- {
322
- type: 'text',
323
- text: responseText,
324
- },
325
- ],
326
- };
327
- }
328
- catch (error) {
329
- const errorMessage = error instanceof Error ? error.message : 'Unknown error occurred';
330
- const helpText = 'Common issues:\n- Check if the URL is accessible in a browser\n- Ensure the webpage is public\n- Try again if it\'s a temporary network issue';
331
- return {
332
- content: [
333
- {
334
- type: 'text',
335
- text: `${errorMessage}\n\n${helpText}`,
336
- },
337
- ],
338
- isError: true,
339
- };
340
- }
341
- }
342
- async handleBatchAnalyzeWebpages(args) {
343
- if (args.urls.length > 5) {
344
- return {
345
- content: [{
346
- type: 'text',
347
- text: 'Maximum 5 URLs allowed per request to maintain performance. Please reduce the number of URLs.'
348
- }],
349
- isError: true
350
- };
351
- }
352
- try {
353
- const results = await this.contentExtractor.batchExtractContent(args.urls, args.format);
354
- // Format the response in a more readable, concise way
355
- let responseText = `Content from ${args.urls.length} webpages:\n\n`;
356
- for (const [url, result] of Object.entries(results)) {
357
- responseText += `URL: ${url}\n`;
358
- if ('error' in result) {
359
- responseText += `Error: ${result.error}\n\n`;
360
- continue;
361
- }
362
- responseText += `Title: ${result.title}\n`;
363
- if (result.description) {
364
- responseText += `Description: ${result.description}\n`;
365
- }
366
- responseText += `Stats: ${result.stats.word_count} words\n`;
367
- // Add summary if available
368
- if (result.summary) {
369
- responseText += `Summary: ${result.summary}\n`;
370
- }
371
- responseText += `Preview: ${result.content_preview.first_500_chars.substring(0, 150)}...\n\n`;
372
- }
373
- responseText += `Note: These are previews of the content. To analyze the full content of a specific URL, use the extract_webpage_content tool with that URL.`;
374
- return {
375
- content: [
376
- {
377
- type: 'text',
378
- text: responseText,
379
- },
380
- ],
381
- };
382
- }
383
- catch (error) {
384
- const errorMessage = error instanceof Error ? error.message : 'Unknown error occurred';
385
- const helpText = 'Common issues:\n- Check if all URLs are accessible in a browser\n- Ensure all webpages are public\n- Try again if it\'s a temporary network issue\n- Consider reducing the number of URLs';
386
- return {
387
- content: [
388
- {
389
- type: 'text',
390
- text: `${errorMessage}\n\n${helpText}`,
391
- },
392
- ],
393
- isError: true,
394
- };
395
- }
396
- }
397
211
  async start() {
398
212
  try {
399
213
  const transport = new StdioServerTransport();
@@ -52,7 +52,7 @@ export class GoogleSearchService {
52
52
  this.searchCache.delete(oldestKey);
53
53
  }
54
54
  }
55
- async search(query, numResults = 5, filters) {
55
+ async search(query, numResults = 10, filters) {
56
56
  try {
57
57
  // Generate cache key
58
58
  const cacheKey = this.generateCacheKey(query, numResults, filters);
@@ -1,23 +1,23 @@
1
- {
2
- "name": "google-search-mcp",
3
- "version": "0.1.0",
4
- "description": "MCP server for Google search and webpage analysis",
5
- "type": "module",
6
- "scripts": {
7
- "start": "node dist/google-search.js"
8
- },
9
- "dependencies": {
10
- "@modelcontextprotocol/sdk": "^1.0.1",
11
- "@mozilla/readability": "^0.6.0",
12
- "@types/turndown": "^5.0.5",
13
- "axios": "^1.7.9",
14
- "cheerio": "^1.0.0",
15
- "dompurify": "^3.2.3",
16
- "express": "^4.21.2",
17
- "googleapis": "^144.0.0",
18
- "jsdom": "^25.0.1",
19
- "markdown-it": "^14.1.0",
20
- "readability": "^0.1.0",
21
- "turndown": "^7.2.0"
22
- }
1
+ {
2
+ "name": "google-search-mcp",
3
+ "version": "0.1.1",
4
+ "description": "MCP server for Google search and webpage analysis",
5
+ "type": "module",
6
+ "scripts": {
7
+ "start": "node dist/google-search.js"
8
+ },
9
+ "dependencies": {
10
+ "@modelcontextprotocol/sdk": "^1.0.1",
11
+ "@mozilla/readability": "^0.6.0",
12
+ "@types/turndown": "^5.0.5",
13
+ "axios": "^1.7.9",
14
+ "cheerio": "^1.0.0",
15
+ "dompurify": "^3.2.3",
16
+ "express": "^4.21.2",
17
+ "googleapis": "^144.0.0",
18
+ "jsdom": "^25.0.1",
19
+ "markdown-it": "^14.1.0",
20
+ "readability": "^0.1.0",
21
+ "turndown": "^7.2.0"
22
+ }
23
23
  }