@endday/search-mcp 1.0.0

Files changed (86)
  1. package/LICENSE +674 -0
  2. package/README.md +117 -0
  3. package/README.zh.md +116 -0
  4. package/data/blocklist.generated.js +2 -0
  5. package/envs.js +129 -0
  6. package/index.d.ts +191 -0
  7. package/index.js +6 -0
  8. package/mcp/search-mcp.js +8 -0
  9. package/package.json +71 -0
  10. package/src/content/extract.impl.js +228 -0
  11. package/src/content/extract.js +1 -0
  12. package/src/content/fetch.impl.js +400 -0
  13. package/src/content/fetch.js +1 -0
  14. package/src/core/crypto.js +7 -0
  15. package/src/core/errors.impl.js +52 -0
  16. package/src/core/errors.js +1 -0
  17. package/src/core/html.impl.js +69 -0
  18. package/src/core/html.js +1 -0
  19. package/src/mcp/config.js +75 -0
  20. package/src/mcp/format.js +44 -0
  21. package/src/mcp/index.js +10 -0
  22. package/src/mcp/local/content.js +26 -0
  23. package/src/mcp/local/search.js +233 -0
  24. package/src/mcp/schemas.js +132 -0
  25. package/src/mcp/server.js +97 -0
  26. package/src/mcp/tools/content.js +31 -0
  27. package/src/mcp/tools/jinaContent.js +38 -0
  28. package/src/mcp/tools/newsSearch.js +22 -0
  29. package/src/mcp/tools/webSearch.js +57 -0
  30. package/src/platform/auth.impl.js +166 -0
  31. package/src/platform/auth.js +1 -0
  32. package/src/platform/cache.impl.js +166 -0
  33. package/src/platform/cache.js +1 -0
  34. package/src/platform/health.impl.js +133 -0
  35. package/src/platform/health.js +1 -0
  36. package/src/platform/http.impl.js +108 -0
  37. package/src/platform/http.js +1 -0
  38. package/src/platform/logger.impl.js +51 -0
  39. package/src/platform/logger.js +1 -0
  40. package/src/platform/metrics.impl.js +43 -0
  41. package/src/platform/metrics.js +1 -0
  42. package/src/platform/nodeHttpClient.js +104 -0
  43. package/src/platform/rateLimit.impl.js +141 -0
  44. package/src/platform/rateLimit.js +1 -0
  45. package/src/platform/requestContext.impl.js +10 -0
  46. package/src/platform/requestContext.js +1 -0
  47. package/src/platform/session.impl.js +198 -0
  48. package/src/platform/session.js +1 -0
  49. package/src/platform/stateKv.impl.js +18 -0
  50. package/src/platform/stateKv.js +1 -0
  51. package/src/platform/tasks.impl.js +17 -0
  52. package/src/platform/tasks.js +1 -0
  53. package/src/routes/requestParams.impl.js +12 -0
  54. package/src/routes/requestParams.js +1 -0
  55. package/src/search/engineRegistry.impl.js +117 -0
  56. package/src/search/engineRegistry.js +1 -0
  57. package/src/search/engineRequest.impl.js +377 -0
  58. package/src/search/engineRequest.js +1 -0
  59. package/src/search/engineUtils.impl.js +227 -0
  60. package/src/search/engineUtils.js +1 -0
  61. package/src/search/engines/baidu.impl.js +145 -0
  62. package/src/search/engines/baidu.js +2 -0
  63. package/src/search/engines/bing.impl.js +509 -0
  64. package/src/search/engines/bing.js +2 -0
  65. package/src/search/engines/brave.impl.js +223 -0
  66. package/src/search/engines/brave.js +2 -0
  67. package/src/search/engines/duckduckgo.impl.js +164 -0
  68. package/src/search/engines/duckduckgo.js +2 -0
  69. package/src/search/engines/mojeek.impl.js +115 -0
  70. package/src/search/engines/mojeek.js +2 -0
  71. package/src/search/engines/qwant.impl.js +188 -0
  72. package/src/search/engines/qwant.js +2 -0
  73. package/src/search/engines/startpage.impl.js +237 -0
  74. package/src/search/engines/startpage.js +2 -0
  75. package/src/search/engines/toutiao.impl.js +265 -0
  76. package/src/search/engines/toutiao.js +2 -0
  77. package/src/search/engines/yahoo.impl.js +379 -0
  78. package/src/search/engines/yahoo.js +2 -0
  79. package/src/search/gateway.impl.js +423 -0
  80. package/src/search/gateway.js +1 -0
  81. package/src/search/ranking.impl.js +381 -0
  82. package/src/search/ranking.js +1 -0
  83. package/src/search/requestPolicy.impl.js +137 -0
  84. package/src/search/requestPolicy.js +1 -0
  85. package/src/search/upstreamSession.impl.js +148 -0
  86. package/src/search/upstreamSession.js +1 -0
package/README.md ADDED
@@ -0,0 +1,117 @@
1
+ # Search MCP
2
+
3
+ English | [中文](./README.zh.md)
4
+
5
+ > A local MCP server for aggregated web search and readable content extraction
6
+
7
+ ## What It Is
8
+
9
+ `@endday/search-mcp` is now a **local-only** MCP package.
10
+
11
+ It does not ship a remote Worker service anymore. The project is focused on:
12
+
13
+ - local `web_search`
14
+ - local `content`
15
+ - `jina_content` as an optional reader fallback
16
+
17
+ ## Install
18
+
19
+ Requirements:
20
+
21
+ - Node.js 20+
22
+ - an MCP client such as Claude Code, Claude Desktop, OpenClaw, or Codex
23
+
24
+ Run directly with:
25
+
26
+ ```bash
27
+ npx -y @endday/search-mcp
28
+ ```
29
+
30
+ ## MCP Config
31
+
32
+ Add this to your MCP client config:
33
+
34
+ ```json
35
+ {
36
+ "mcpServers": {
37
+ "search-mcp": {
38
+ "command": "npx",
39
+ "args": ["-y", "@endday/search-mcp"],
40
+ "env": {
41
+ "SEARCH_MCP_CLIENT_ID": "search-mcp-local"
42
+ }
43
+ }
44
+ }
45
+ }
46
+ ```
47
+
48
+ Optional environment variables:
49
+
50
+ - `SEARCH_MCP_CLIENT_ID`
51
+ - `SEARCH_MCP_UPSTREAM_CLIENT` (`auto`, `impit`, `fetch`)
52
+ - `SEARCH_MCP_PROXY_URL`
53
+ - `SEARCH_MCP_IGNORE_TLS_ERRORS`
54
+ - `JINA_API_KEY`
55
+ - `JINA_BASE_URL`
56
+ - `SUPPORTED_ENGINES`
57
+ - `DEFAULT_ENGINES`
58
+ - `DEFAULT_ENGINES_ZH`
59
+ - `DEFAULT_ENGINES_NON_ZH`
60
+
61
+ `SEARCH_MCP_UPSTREAM_CLIENT=auto` is the default. In local Node mode it prefers `impit` for upstream requests and falls back to built-in `fetch` if needed. Set it to `impit` to force the impersonated client, or `fetch` to disable it.
62
+
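The `auto` / `impit` / `fetch` behaviour described above can be sketched as a small resolver. This is a minimal illustration, not the package's code: the function name `resolveUpstreamClient`, the `impitAvailable` flag, and the choice to throw when `impit` is forced but unavailable are all assumptions. The README's "falls back if needed" may also cover failures at request time, which this sketch does not model.

```javascript
// Hypothetical helper; names and error behaviour are assumptions for illustration.
// mode: the value of SEARCH_MCP_UPSTREAM_CLIENT (may be undefined or empty).
// impitAvailable: whether the impit client could be loaded in this runtime.
function resolveUpstreamClient(mode, impitAvailable) {
  // Treat a missing or blank value as the documented default, "auto".
  const normalized = String(mode ?? "auto").trim().toLowerCase() || "auto";

  switch (normalized) {
    case "fetch":
      // Explicitly disables the impersonated client.
      return "fetch";
    case "impit":
      // Forced mode: assumed here to fail loudly rather than silently downgrade.
      if (!impitAvailable) {
        throw new Error("SEARCH_MCP_UPSTREAM_CLIENT=impit, but impit is not available");
      }
      return "impit";
    case "auto":
      // Prefer impit, fall back to built-in fetch.
      return impitAvailable ? "impit" : "fetch";
    default:
      throw new Error(`Unsupported SEARCH_MCP_UPSTREAM_CLIENT value: ${mode}`);
  }
}
```

For example, `resolveUpstreamClient(process.env.SEARCH_MCP_UPSTREAM_CLIENT, false)` would select `fetch` when the variable is unset and `impit` cannot be loaded.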
63
+ ## Exposed Tools
64
+
65
+ - `web_search`: runs local engine fetch + parse and returns ranked results
66
+ - `news_search`: runs explicit news search across supported news-capable engines
67
+ - `content`: fetches a URL and extracts readable text locally
68
+ - `jina_content`: reads a URL through Jina AI reader
69
+
70
+ ## Ranking And Filtering
71
+
72
+ - domains on the locally generated blocklist (built from denylist subscriptions) are filtered out of results
73
+ - ordinary domains are treated roughly equally by default
74
+ - only a small set of known high-value sources gets deterministic positive boosts
75
+
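The three rules above (drop blocked domains, treat ordinary domains equally, boost a few known sources) could look roughly like the sketch below. The domain lists, the boost value, the `score` field, and every function name are placeholders invented for illustration. The package's real ranking lives in `src/search/ranking.impl.js` and is not reproduced here.

```javascript
// Illustrative data only; the real blocklist and boost list are not shown in this diff.
const BLOCKED_DOMAINS = new Set(["spam.example"]);
const BOOSTED_DOMAINS = new Map([["docs.example", 0.2]]);

// Returns the lowercased hostname, or "" when the URL cannot be parsed.
function hostOf(url) {
  try {
    return new URL(url).hostname.toLowerCase();
  } catch {
    return "";
  }
}

// True when host equals domain or is a subdomain of it.
function matchesDomain(host, domain) {
  return host === domain || host.endsWith("." + domain);
}

function rankResults(results) {
  return results
    .map((r) => ({ ...r, host: hostOf(r.url) }))
    // Drop unparseable URLs and anything on the blocklist.
    .filter((r) => r.host && ![...BLOCKED_DOMAINS].some((d) => matchesDomain(r.host, d)))
    // Ordinary domains keep their score; listed sources get a fixed boost.
    .map((r) => {
      let boost = 0;
      for (const [domain, value] of BOOSTED_DOMAINS) {
        if (matchesDomain(r.host, domain)) boost = Math.max(boost, value);
      }
      return { url: r.url, title: r.title, score: r.score + boost };
    })
    // Highest score first.
    .sort((a, b) => b.score - a.score);
}
```

Because the boost is a fixed additive value rather than a learned weight, the same input always produces the same order, which is one way to read "deterministic" in the list above.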
76
+ ## Supported Engines
77
+
78
+ - `baidu`
79
+ - `bing`
80
+ - `startpage`
81
+ - `duckduckgo`
82
+ - `brave`
83
+ - `qwant`
84
+ - `yahoo`
85
+ - `mojeek`
86
+ - `toutiao`
87
+
88
+ Recommended defaults in the current local setup:
89
+
90
+ - Chinese queries: `baidu`, `bing`
91
+ - Non-Chinese queries: `bing`, `brave`, `yahoo`, `mojeek`
92
+
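A sketch of how the recommended defaults might be applied per query follows. The engine lists come from the recommendation above; everything else is an assumption: the comma-separated format of `DEFAULT_ENGINES_ZH` and `DEFAULT_ENGINES_NON_ZH`, the Han-script check used to decide whether a query counts as Chinese, and the function names.

```javascript
// Recommended defaults as listed in the README.
const RECOMMENDED_ZH = ["baidu", "bing"];
const RECOMMENDED_NON_ZH = ["bing", "brave", "yahoo", "mojeek"];

// Assumed format: a comma-separated list such as "bing, brave".
function parseEngineList(value, fallback) {
  const list = String(value ?? "")
    .split(",")
    .map((s) => s.trim().toLowerCase())
    .filter(Boolean);
  return list.length > 0 ? list : fallback;
}

// Rough heuristic: any Han character marks the query as Chinese.
// Note that this also matches Japanese kanji.
function looksChinese(query) {
  return /\p{Script=Han}/u.test(query);
}

// env is a plain object shaped like process.env.
function pickDefaultEngines(query, env = {}) {
  return looksChinese(query)
    ? parseEngineList(env.DEFAULT_ENGINES_ZH, RECOMMENDED_ZH)
    : parseEngineList(env.DEFAULT_ENGINES_NON_ZH, RECOMMENDED_NON_ZH);
}
```

Calling `pickDefaultEngines(query, process.env)` would then honour an override when one is set and use the recommended list otherwise.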
93
+ ## Dev
94
+
95
+ Useful commands:
96
+
97
+ ```bash
98
+ npm test
99
+ npm run smoke
100
+ npm run update:blocklist
101
+ npm run docs:dev
102
+ ```
103
+
104
+ ## Docs
105
+
106
+ Long-form docs now live in the VitePress site under [docs-site](./docs-site).
107
+ The docs site is intended to be published with GitHub Pages, so a Cloudflare Worker homepage is no longer needed.
108
+
109
+ For GitHub Pages:
110
+
111
+ 1. Enable `GitHub Pages` in the repository settings.
112
+ 2. Set the source to `GitHub Actions`.
113
+ 3. The workflow at `.github/workflows/docs-pages.yml` will build and deploy the VitePress site.
114
+
115
+ ## License
116
+
117
+ GPL-3.0
package/README.zh.md ADDED
@@ -0,0 +1,116 @@
1
+ # Search MCP
2
+
3
+ [English](./README.md) | 中文
4
+
5
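+ > A package for aggregated search and readable content extraction that runs only as a local MCP server
6
+
7
+ ## What It Is Now
8
+
9
+ `@endday/search-mcp` is now a **local-only** MCP package.
10
+
11
+ The repository no longer provides a remote Worker service. The main line keeps only:
12
+
13
+ - local `web_search`
14
+ - local `news_search`
15
+ - local `content`
16
+ - optional `jina_content`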
17
+
18
+ ## Install
19
+
20
+ Requirements:
21
+
22
+ - Node.js 20+
23
+ - an MCP client such as Claude Code, Claude Desktop, OpenClaw, or Codex
24
+
25
+ Run directly with:
26
+
27
+ ```bash
28
+ npx -y @endday/search-mcp
29
+ ```
30
+
31
+ ## MCP Config
32
+
33
+ Add the following to your MCP client config:
34
+
35
+ ```json
36
+ {
37
+ "mcpServers": {
38
+ "search-mcp": {
39
+ "command": "npx",
40
+ "args": ["-y", "@endday/search-mcp"],
41
+ "env": {
42
+ "SEARCH_MCP_CLIENT_ID": "search-mcp-local"
43
+ }
44
+ }
45
+ }
46
+ }
47
+ ```
48
+
49
+ Optional environment variables:
50
+
51
+ - `SEARCH_MCP_CLIENT_ID`
52
+ - `SEARCH_MCP_UPSTREAM_CLIENT` (`auto`, `impit`, `fetch`)
53
+ - `SEARCH_MCP_PROXY_URL`
54
+ - `SEARCH_MCP_IGNORE_TLS_ERRORS`
55
+ - `JINA_API_KEY`
56
+ - `JINA_BASE_URL`
57
+ - `SUPPORTED_ENGINES`
58
+ - `DEFAULT_ENGINES`
59
+ - `DEFAULT_ENGINES_ZH`
60
+ - `DEFAULT_ENGINES_NON_ZH`
61
+
62
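+ The default is `SEARCH_MCP_UPSTREAM_CLIENT=auto`. In local Node mode, upstream requests prefer `impit` and fall back to the built-in `fetch` when necessary. Set it to `impit` to force it on, or to `fetch` to disable it.
63
+
64
+ ## Exposed Tools
65
+
66
+ - `web_search`: fetches search results locally and returns them ranked
67
+ - `news_search`: explicit news search, routed through engines that support a news vertical
68
+ - `content`: fetches a web page locally and extracts its main text
69
+ - `jina_content`: reads a web page through the Jina AI reader
70
+
71
+ ## Ranking And Filtering
72
+
73
+ - domains that match the locally generated blocklist are filtered out directly
74
+ - ordinary domains are weighted roughly equally by default
75
+ - only a small number of known high-value sources receive a deterministic positive boost
76
+
77
+ ## Currently Supported Engines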
78
+
79
+ - `baidu`
80
+ - `bing`
81
+ - `startpage`
82
+ - `duckduckgo`
83
+ - `brave`
84
+ - `qwant`
85
+ - `yahoo`
86
+ - `mojeek`
87
+ - `toutiao`
88
+
89
+ Recommended default combinations in the current local setup:
90
+
91
+ - Chinese queries: `baidu`, `bing`
92
+ - Non-Chinese queries: `bing`, `brave`, `yahoo`, `mojeek`
93
+
94
+ ## Dev Commands
95
+
96
+ ```bash
97
+ npm test
98
+ npm run smoke
99
+ npm run update:blocklist
100
+ npm run docs:dev
101
+ ```
102
+
103
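+ ## Docs
104
+
105
+ Long-form docs have moved to the VitePress site under [docs-site](./docs-site).
106
+ Publishing the docs site directly to GitHub Pages is recommended, so a separate Cloudflare Worker homepage is no longer needed.
107
+
108
+ To configure GitHub Pages:
109
+
110
+ 1. Enable `GitHub Pages` in the repository settings
111
+ 2. Set Source to `GitHub Actions`
112
+ 3. Use `.github/workflows/docs-pages.yml` to build and publish the VitePress site automatically
113
+
114
+ ## License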
115
+
116
+ GPL-3.0