@yandy0725/pi-web-tools 0.1.0
- package/README.md +144 -0
- package/index.ts +308 -0
- package/package.json +43 -0
- package/src/config.ts +39 -0
- package/src/deep_search/aliyun.ts +48 -0
- package/src/deep_search/index.ts +1 -0
- package/src/deep_search/types.ts +16 -0
- package/src/image_search/aliyun.ts +87 -0
- package/src/image_search/index.ts +1 -0
- package/src/image_search/types.ts +15 -0
- package/src/openai_client.ts +9 -0
- package/src/provider.ts +69 -0
- package/src/web_fetch.ts +110 -0
- package/src/web_search/exa.ts +163 -0
- package/src/web_search/index.ts +45 -0
- package/src/web_search/types.ts +11 -0
package/README.md
ADDED
@@ -0,0 +1,144 @@
# pi-web-tools

A [pi](https://pi.dev/docs/latest/packages) package providing web and image search tools for coding agents.

## Tools

| Tool | Description | Source |
|------|-------------|--------|
| `web_search` | Pure web search, returns raw results (titles, URLs, snippets) | Exa (REST + MCP free tier) |
| `deep_search` | Deep research with LLM-synthesized answers | Aliyun (Bailian) Chat Completions API |
| `image_search` | Search images by text or find similar images by URL | Aliyun (Bailian) Responses API |
| `web_fetch` | Fetch and convert web pages to text, markdown, or raw HTML | — |

## Quick Start

```bash
# Install from a local checkout
pi install ./path/to/pi-web-tools

# Or test with -e flag
pi -e ./index.ts
```

### Prerequisites

- `web_search`: No config needed — Exa MCP free tier (150 calls/day). Set `EXA_API_KEY` for higher limits.
- `deep_search` / `image_search`: Set `ALIYUN_API_KEY` or use `/login` in pi to authenticate with Aliyun.

## Configuration

Configuration uses two layers: environment variables for API keys, and a project config file for other settings.

### API Keys (environment variables only)

| Variable | Description | Default |
|----------|-------------|---------|
| `EXA_API_KEY` | Exa API key. If not set, uses MCP free tier (150 calls/day) | — |
| `ALIYUN_API_KEY` | Aliyun (Bailian) API key | — |
| `ALIYUN_BASE_URL` | Aliyun API base URL | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `ALIYUN_DEEP_SEARCH_MODEL` | Model for deep_search | `deepseek-v4-flash` |
| `ALIYUN_IMAGE_SEARCH_MODEL` | Model for image_search | `qwen3.7-plus` |

Aliyun also supports key resolution via pi's `/login` — if you've logged into Aliyun through pi, no env var needed.

### Project Config (`.pi/agent/web-tools.json`)

Create `.pi/agent/web-tools.json` in your project root for per-project settings:

```json
{
  "aliyun": {
    "baseUrl": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "aliyunProviderKey": "aliyun",
    "deepSearchModel": "deepseek-v4-flash",
    "imageSearchModel": "qwen3.7-plus"
  }
}
```

Environment variables take precedence over the config file.

| Config Key | Env Variable (overrides) | Default | Description |
|------------|--------------------------|---------|-------------|
| `aliyun.baseUrl` | `ALIYUN_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` | Aliyun API base URL |
| `aliyun.aliyunProviderKey` | — | `aliyun` | Pi provider name to extract apiKey/baseUrl from |
| `aliyun.deepSearchModel` | `ALIYUN_DEEP_SEARCH_MODEL` | `deepseek-v4-flash` | Model for deep_search |
| `aliyun.imageSearchModel` | `ALIYUN_IMAGE_SEARCH_MODEL` | `qwen3.7-plus` | Model for image_search |

**aliyunProviderKey:** deep_search and image_search will extract apiKey and baseUrl from the corresponding pi provider (via `modelRegistry`). Defaults to `"aliyun"`. Environment variables take precedence over provider values. If the provider is not found, falls back to `aliyun.baseUrl` config or default.

> **Note:** deep_search uses Chat Completions API and does not return structured sources. image_search uses Responses API.

> **Security:** API keys are NEVER read from config files — only from environment variables or pi's built-in credential store (`/login`).
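
The precedence described in this Configuration section (environment variable first, then the config file, then the built-in default) can be sketched as follows. This is a minimal illustration modeled on the `resolveSetting` helper in `src/config.ts`; the `env` and `fileConfig` objects are hypothetical stand-ins for `process.env` and the parsed `.pi/agent/web-tools.json`, and the model names in them are placeholders.

```typescript
// First non-empty value wins: environment, then config file, then default.
function resolveSetting(
  envValue: string | undefined,
  configValue: string | undefined,
  defaultValue: string,
): string {
  return envValue || configValue || defaultValue;
}

// Hypothetical inputs standing in for process.env and the parsed config file.
const env: Record<string, string | undefined> = {
  ALIYUN_DEEP_SEARCH_MODEL: "model-from-env",
};
const fileConfig = {
  aliyun: { deepSearchModel: "model-from-file", imageSearchModel: "image-model-from-file" },
};

const deepSearchModel = resolveSetting(
  env.ALIYUN_DEEP_SEARCH_MODEL,
  fileConfig.aliyun.deepSearchModel,
  "deepseek-v4-flash",
);
const imageSearchModel = resolveSetting(
  env.ALIYUN_IMAGE_SEARCH_MODEL,
  fileConfig.aliyun.imageSearchModel,
  "qwen3.7-plus",
);

console.log(deepSearchModel); // the environment value is used
console.log(imageSearchModel); // no environment value, so the config file value is used
```

Because the helper chains values with `||`, an environment variable set to an empty string is treated as unset and falls through to the config file.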

## Tools Reference

### web_search

Search the web with automatic source fallback.

**Parameters:**

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `query` | string | yes | — | Search query |
| `numResults` | number | no | 10 | Number of results (1-20) |
| `source` | `"exa"` | no | — | Search source |

**Source:** **Exa** — AI-native search API. With `EXA_API_KEY`: full REST API. Without: MCP free tier (150 calls/day, 3 QPS). Always available, no key needed for basic usage.
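
As a rough illustration of the `web_search` parameter table, a caller might normalize arguments like this before invoking the tool. `WebSearchArgs` and `normalizeWebSearchArgs` are hypothetical names, not part of the package. The two-character minimum for `query` comes from the tool schema in `index.ts`; clamping `numResults` into range is an assumption of this sketch, since the package only declares the range in its schema.

```typescript
interface WebSearchArgs {
  query: string;
  numResults?: number;
  source?: "exa";
}

function normalizeWebSearchArgs(args: WebSearchArgs): { query: string; numResults: number; source?: "exa" } {
  const query = args.query.trim();
  if (query.length < 2) {
    throw new Error("query must be at least 2 characters");
  }
  // Default to 10 results, then clamp into the documented 1-20 range.
  const requested = args.numResults ?? 10;
  const numResults = Math.min(20, Math.max(1, Math.round(requested)));
  return { query, numResults, source: args.source };
}

console.log(normalizeWebSearchArgs({ query: "  pi coding agent packages  " }));
console.log(normalizeWebSearchArgs({ query: "exa search api", numResults: 50 }));
```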

### deep_search

Deep research using Aliyun's LLM-powered search with web content extraction. The model searches the web, extracts page content, and synthesizes a comprehensive answer.

**Parameters:**

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `query` | string | yes | — | Research question |
| `enableSearchExtension` | boolean | no | false | Enable vertical domain search |
| `freshness` | number | no | — | Time range: 7/30/180/365 days |
| `assignedSiteList` | string[] | no | — | Restrict search to specific sites |
| `enableImageOutput` | boolean | no | false | Enable mixed text-image output |

> Requires `ALIYUN_API_KEY` or `aliyunProviderKey` config. Uses Chat Completions API with forced search (turbo strategy). Sources are not returned.
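
How the `deep_search` parameters map onto the Chat Completions request can be sketched as below. The field names (`enable_search`, `search_options`, `forced_search`, and so on) are taken from `src/deep_search/aliyun.ts` in this package; `buildDeepSearchBody` is a hypothetical helper, and the sketch only builds the request body without sending it.

```typescript
interface DeepSearchOptions {
  enableSearchExtension?: boolean;
  freshness?: number;
  assignedSiteList?: string[];
  enableImageOutput?: boolean;
}

function buildDeepSearchBody(
  model: string,
  query: string,
  opts: DeepSearchOptions = {},
): Record<string, unknown> {
  // Search is always forced and always uses the "turbo" strategy.
  const searchOptions: Record<string, unknown> = {
    search_strategy: "turbo",
    forced_search: true,
  };
  // Optional filters are only added when the caller supplied them.
  if (opts.enableSearchExtension) searchOptions.enable_search_extension = true;
  if (opts.freshness) searchOptions.freshness = opts.freshness;
  if (opts.assignedSiteList && opts.assignedSiteList.length > 0) {
    searchOptions.assigned_site_list = opts.assignedSiteList;
  }

  const requestBody: Record<string, unknown> = {
    model,
    messages: [{ role: "user", content: query }],
    stream: false,
    enable_search: true,
    search_options: searchOptions,
  };
  if (opts.enableImageOutput) requestBody.enable_text_image_mixed = true;
  return requestBody;
}

const requestBody = buildDeepSearchBody("deepseek-v4-flash", "What changed in TypeScript 5.7?", {
  freshness: 30,
  assignedSiteList: ["devblogs.microsoft.com"],
});
console.log(JSON.stringify(requestBody, null, 2));
```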

### image_search

Search images by text description or find visually similar images by URL.

**Parameters:**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | no | Text description for text-to-image search |
| `imageUrl` | string | no | Public image URL for image-to-image search |

> At least one of `query` or `imageUrl` must be provided. Both can be combined.
> The image URL must be publicly accessible. Requires `ALIYUN_API_KEY`.
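
The way `query` and `imageUrl` select the request shape can be sketched as follows, modeled on `src/image_search/aliyun.ts`. The tool type strings and content part types come from that file; `buildImageSearchRequest` and the `ImageSearchRequest` type are hypothetical names used only for this illustration, and nothing is sent over the network.

```typescript
interface ImageSearchParams {
  query?: string;
  imageUrl?: string;
}

type ContentPart =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url: string };

interface ImageSearchRequest {
  tools: Array<{ type: string }>;
  input: string | Array<{ role: "user"; content: ContentPart[] }>;
}

function buildImageSearchRequest(params: ImageSearchParams): ImageSearchRequest {
  const { query, imageUrl } = params;
  if (!query && !imageUrl) {
    throw new Error("At least one of query or imageUrl must be provided");
  }
  if (imageUrl) {
    // Image-to-image search: optional text first, then the image URL.
    const content: ContentPart[] = [];
    if (query) content.push({ type: "input_text", text: query });
    content.push({ type: "input_image", image_url: imageUrl });
    return { tools: [{ type: "image_search" }], input: [{ role: "user", content }] };
  }
  // Text-only search: the query is passed as a plain string.
  return { tools: [{ type: "web_search_image" }], input: query as string };
}

console.log(JSON.stringify(buildImageSearchRequest({ query: "red panda" })));
console.log(
  JSON.stringify(buildImageSearchRequest({ query: "similar logo", imageUrl: "https://example.com/a.png" })),
);
```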

### web_fetch

Fetch content from a URL and return as text, markdown, or raw HTML.

**Parameters:**

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | string | yes | — | URL to fetch |
| `format` | `"text"` \| `"markdown"` \| `"html"` | no | `"markdown"` | Output format |
| `timeout` | number | no | 30 | Timeout in seconds (1-120) |

## Development

```bash
npm install # Install dependencies
npm run typecheck # tsc --noEmit
npm run lint # biome lint
npm test # vitest run
```

## License

MIT
package/index.ts
ADDED
@@ -0,0 +1,308 @@
import type { ExtensionAPI, ExtensionContext } from "@earendil-works/pi-coding-agent";
import { Text } from "@earendil-works/pi-tui";
import { Type } from "typebox";
import { loadConfig } from "./src/config";
import { deepSearch } from "./src/deep_search/index";
import { imageSearch } from "./src/image_search/index";
import { webFetch } from "./src/web_fetch";
import { search } from "./src/web_search/index";

export default function (pi: ExtensionAPI) {
  // -------------------------------------------------------------------
  // web_search
  // -------------------------------------------------------------------
  pi.registerTool({
    name: "web_search",
    label: "Web Search",
    description:
      `Search the web via Exa and return raw results (titles, URLs, snippets). With EXA_API_KEY: full REST API. Without: MCP free tier (150 calls/day). ` +
      `The current year is ${new Date().getFullYear()}.`,
    promptSnippet:
      "web_search: search the web via Exa. Returns raw results with titles, URLs, snippets. LLM synthesizes the answer.",
    promptGuidelines: [
      "Use web_search when you need current information outside your training data.",
      "Synthesize a clear answer from the search results and cite sources with markdown hyperlinks.",
    ],
    parameters: Type.Object({
      query: Type.String({ minLength: 2, description: "The search query." }),
      numResults: Type.Optional(
        Type.Number({ minimum: 1, maximum: 20, default: 10, description: "Number of results (1-20)." }),
      ),
      source: Type.Optional(Type.String({ enum: ["exa"], description: "Search source. Default: exa." })),
    }),
    renderCall(args, theme) {
      const p = args as { query: string };
      return new Text(
        theme.fg("toolTitle", theme.bold("web_search ")) + theme.fg("accent", `"${p.query || "..."}"`),
        0,
        0,
      );
    },
    renderResult(result, { expanded }, theme) {
      const text = result.content?.[0];
      const body = text?.type === "text" ? text.text : "";
      const lines = body.split("\n");
      if (!expanded) {
        const preview = lines.slice(0, 6);
        if (lines.length > 6) preview.push(theme.fg("dim", `... ${lines.length - 6} more lines · ctrl+o to expand`));
        return new Text(preview.join("\n"), 0, 0);
      }
      return new Text(body, 0, 0);
    },
    async execute(_toolCallId, params, signal, onUpdate, _ctx) {
      const p = params as { query: string; numResults?: number; source?: string };
      const query = p.query?.trim();
      if (!query) {
        return { content: [{ type: "text", text: "Error: query is required." }], details: {}, isError: true };
      }

      onUpdate?.({ content: [{ type: "text", text: "Searching..." }], details: {} });

      let firstProgress = true;
      const onProgress = (msg: string) => {
        if (firstProgress) {
          onUpdate?.({ content: [{ type: "text", text: msg }], details: {} });
          firstProgress = false;
        }
      };

      try {
        const result = await search(query, p.numResults ?? 10, signal, onProgress, p.source);
        const sourceLabel = `\n\n*Source: ${result.sourceLabel}*`;
        return {
          content: [{ type: "text", text: result.answer + sourceLabel }],
          details: { source: result.sourceLabel, sources: result.sources },
        };
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        return { content: [{ type: "text", text: `Search failed: ${message}` }], details: {}, isError: true };
      }
    },
  });

  // -------------------------------------------------------------------
  // deep_search
  // -------------------------------------------------------------------
  pi.registerTool({
    name: "deep_search",
    label: "Deep Search",
    description:
      "Deep search powered by Aliyun (Bailian) using Chat Completions API with web search. The model searches the web and synthesizes a comprehensive answer. Supports vertical domain search, time range filtering, site restriction, and mixed image output.",
    promptSnippet:
      "deep_search: Aliyun-powered deep search that synthesizes web results into a comprehensive answer. Supports vertical domain search, time range filtering, site restriction, and mixed image output.",
    promptGuidelines: [
      "Use deep_search for complex research questions that benefit from web search synthesis.",
      "deep_search is powered by Aliyun Chat Completions API. Configure ALIYUN_API_KEY or use aliyunProviderKey in config.",
    ],
    parameters: Type.Object({
      query: Type.String({ minLength: 2, description: "The search query." }),
      enableSearchExtension: Type.Optional(
        Type.Boolean({ description: "Enable vertical domain search for more precise results." }),
      ),
      freshness: Type.Optional(
        Type.Number({
          enum: [7, 30, 180, 365],
          description: "Time range filter: 7/30/180/365 days. Only effective with turbo strategy.",
        }),
      ),
      assignedSiteList: Type.Optional(
        Type.Array(Type.String(), {
          description: 'Restrict search to specific sites (e.g. ["baidu.com", "sina.cn"]).',
        }),
      ),
      enableImageOutput: Type.Optional(Type.Boolean({ description: "Enable mixed text-image output in the response." })),
    }),
    renderCall(args, theme) {
      const p = args as { query: string };
      return new Text(
        theme.fg("toolTitle", theme.bold("deep_search ")) + theme.fg("accent", `"${p.query || "..."}"`),
        0,
        0,
      );
    },
    renderResult(result, { expanded }, theme) {
      const text = result.content?.[0];
      const body = text?.type === "text" ? text.text : "";
      const lines = body.split("\n");
      if (!expanded) {
        const preview = lines.slice(0, 6);
        if (lines.length > 6) preview.push(theme.fg("dim", `... ${lines.length - 6} more lines · ctrl+o to expand`));
        return new Text(preview.join("\n"), 0, 0);
      }
      return new Text(body, 0, 0);
    },
    async execute(_toolCallId, params, signal, onUpdate, ctx) {
      const p = params as {
        query: string;
        enableSearchExtension?: boolean;
        freshness?: number;
        assignedSiteList?: string[];
        enableImageOutput?: boolean;
      };
      const query = p.query?.trim();
      if (!query) {
        return { content: [{ type: "text", text: "Error: query is required." }], details: {}, isError: true };
      }

      onUpdate?.({ content: [{ type: "text", text: "Deep searching..." }], details: {} });

      try {
        const cfg = loadConfig(ctx.cwd);
        const result = await deepSearch(query, signal, cfg.aliyun, ctx, {
          enableSearchExtension: p.enableSearchExtension,
          freshness: p.freshness,
          assignedSiteList: p.assignedSiteList,
          enableImageOutput: p.enableImageOutput,
        });
        const sourcesText = result.sources.length
          ? `\n\nSources:\n${result.sources.map((s, i) => `${i + 1}. [${s.title}](${s.url})`).join("\n")}`
          : "";
        return {
          content: [{ type: "text", text: result.answer + sourcesText }],
          details: { sources: result.sources },
        };
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        return { content: [{ type: "text", text: `Deep search failed: ${message}` }], details: {}, isError: true };
      }
    },
  });

  // -------------------------------------------------------------------
  // image_search
  // -------------------------------------------------------------------
  pi.registerTool({
    name: "image_search",
    label: "Image Search",
    description:
      "Search for images by text description or find similar images by URL. Powered by Aliyun (Bailian). Returns image results and model analysis.",
    promptSnippet: "image_search: search images by text or find similar images by URL. Powered by Aliyun (Bailian).",
    promptGuidelines: [
      "Use image_search to find images matching a text description (provide query).",
      "Use image_search to find visually similar images (provide imageUrl, the image must be a publicly accessible URL).",
      "Both query and imageUrl can be provided together for combined search.",
    ],
    parameters: Type.Object({
      query: Type.Optional(Type.String({ minLength: 2, description: "Text description of the image to search for." })),
      imageUrl: Type.Optional(Type.String({ description: "Public URL of the image to find similar images." })),
    }),
    renderCall(args, theme) {
      const p = args as { query?: string; imageUrl?: string };
      const label = theme.fg("toolTitle", theme.bold("image_search "));
      if (p.imageUrl) return new Text(label + theme.fg("accent", `[image: ${p.imageUrl}]`), 0, 0);
      return new Text(label + theme.fg("accent", `"${p.query || "..."}"`), 0, 0);
    },
    renderResult(result, { expanded }, theme) {
      const text = result.content?.[0];
      const body = text?.type === "text" ? text.text : "";
      const lines = body.split("\n");
      if (!expanded) {
        const preview = lines.slice(0, 6);
        if (lines.length > 6) preview.push(theme.fg("dim", `... ${lines.length - 6} more lines · ctrl+o to expand`));
        return new Text(preview.join("\n"), 0, 0);
      }
      return new Text(body, 0, 0);
    },
    async execute(_toolCallId, params, signal, onUpdate, ctx: ExtensionContext) {
      const p = params as { query?: string; imageUrl?: string };
      if (!p.query && !p.imageUrl) {
        return {
          content: [{ type: "text", text: "Error: at least one of query or imageUrl is required." }],
          details: {},
          isError: true,
        };
      }

      onUpdate?.({ content: [{ type: "text", text: "Searching images..." }], details: {} });

      try {
        const cfg = loadConfig(ctx.cwd);
        const result = await imageSearch({ query: p.query, imageUrl: p.imageUrl }, signal, cfg.aliyun, ctx);
        const imagesText = result.images.length
          ? "\n\nImages:\n" + result.images.map((img) => `${img.index}. [${img.title}](${img.url})`).join("\n")
          : "";
        return {
          content: [{ type: "text", text: result.answer + imagesText }],
          details: { images: result.images },
        };
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        return {
          content: [{ type: "text", text: `Image search failed: ${message}` }],
          details: {},
          isError: true,
        };
      }
    },
  });

  // -------------------------------------------------------------------
  // web_fetch
  // -------------------------------------------------------------------
  pi.registerTool({
    name: "web_fetch",
    label: "Web Fetch",
    description: "Fetch content from a URL and return as text, markdown, or raw HTML.",
    promptSnippet: "web_fetch: fetch content from a URL as text, markdown, or raw HTML.",
    promptGuidelines: [
      "Use web_fetch to retrieve full page content from a URL.",
      "Prefer fetching specific pages rather than homepages for more targeted information.",
    ],
    parameters: Type.Object({
      url: Type.String({ minLength: 5, description: "The URL to fetch content from." }),
      format: Type.Optional(
        Type.String({
          enum: ["text", "markdown", "html"],
          default: "markdown",
          description: "Output format. Default: markdown.",
        }),
      ),
      timeout: Type.Optional(
        Type.Number({
          minimum: 1,
          maximum: 120,
          default: 30,
          description: "Timeout in seconds (1-120). Default: 30.",
        }),
      ),
    }),
    renderCall(args, theme) {
      const p = args as { url: string };
      return new Text(theme.fg("toolTitle", theme.bold("web_fetch ")) + theme.fg("accent", p.url || "..."), 0, 0);
    },
    renderResult(result, { expanded }, theme) {
      const text = result.content?.[0];
      const body = text?.type === "text" ? text.text : "";
      const lines = body.split("\n");
      if (!expanded) {
        const preview = lines.slice(0, 6);
        if (lines.length > 6) preview.push(theme.fg("dim", `... ${lines.length - 6} more lines · ctrl+o to expand`));
        return new Text(preview.join("\n"), 0, 0);
      }
      return new Text(body, 0, 0);
    },
    async execute(_toolCallId, params, signal, onUpdate, _ctx) {
      const p = params as { url: string; format?: "text" | "markdown" | "html"; timeout?: number };
      const url = p.url?.trim();
      if (!url) {
        return { content: [{ type: "text", text: "Error: url is required." }], details: {}, isError: true };
      }
      const format = p.format || "markdown";
      const timeout = p.timeout ?? 30;

      onUpdate?.({ content: [{ type: "text", text: `Fetching ${url}...` }], details: {} });

      try {
        const result = await webFetch(url, format, timeout, signal);
        const header = `URL: ${result.url}\nContent-Type: ${result.contentType}\n\n`;
        return {
          content: [{ type: "text", text: header + result.content }],
          details: { url: result.url, contentType: result.contentType, status: result.status },
        };
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        return { content: [{ type: "text", text: `Fetch failed: ${message}` }], details: {}, isError: true };
      }
    },
  });
}
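
The four `renderResult` implementations in `package/index.ts` share one collapse rule: when the result is not expanded, show the first six lines and summarize the rest. A minimal sketch of that rule as a standalone function, without the `Text` component or theme styling, is shown here. The `previewLines` name and its `limit` parameter are hypothetical and introduced only for illustration.

```typescript
function previewLines(body: string, expanded: boolean, limit = 6): string {
  if (expanded) return body;
  const lines = body.split("\n");
  const preview = lines.slice(0, limit);
  // Only add the summary line when something was actually cut off.
  if (lines.length > limit) {
    preview.push(`... ${lines.length - limit} more lines · ctrl+o to expand`);
  }
  return preview.join("\n");
}

// Build a nine-line sample and collapse it.
const sampleLines: string[] = [];
for (let i = 1; i <= 9; i++) sampleLines.push(`line ${i}`);
const collapsed = previewLines(sampleLines.join("\n"), false);
console.log(collapsed);
```

Extracting the rule like this would remove the four identical copies, at the cost of one more helper in the extension entry point.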
package/package.json
ADDED
@@ -0,0 +1,43 @@
{
  "name": "@yandy0725/pi-web-tools",
  "publishConfig": {
    "access": "public"
  },
  "version": "0.1.0",
  "description": "pi package providing web_search, deep_search, image_search and web_fetch tools",
  "license": "MIT",
  "type": "module",
  "keywords": [
    "pi-package"
  ],
  "files": [
    "index.ts",
    "src/"
  ],
  "scripts": {
    "test": "vitest run",
    "test:watch": "vitest",
    "typecheck": "tsc --noEmit",
    "lint": "biome lint .",
    "format": "biome format --write .",
    "check": "biome check ."
  },
  "pi": {
    "extensions": [
      "./index.ts"
    ]
  },
  "dependencies": {
    "openai": "^6.26.0"
  },
  "peerDependencies": {
    "@earendil-works/pi-coding-agent": ">=0.74.0"
  },
  "devDependencies": {
    "@biomejs/biome": "^2.5.0",
    "@earendil-works/pi-coding-agent": "^0.74.0",
    "@types/node": "^22.0.0",
    "typescript": "~5.7.0",
    "vitest": "^3.0.0"
  }
}
package/src/config.ts
ADDED
@@ -0,0 +1,39 @@
import { readFileSync } from "node:fs";
import { resolve } from "node:path";

interface WebToolsConfig {
  aliyun?: {
    baseUrl?: string;
    aliyunProviderKey?: string;
    deepSearchModel?: string;
    imageSearchModel?: string;
  };
}

let cachedConfig: WebToolsConfig | null = null;
let cachedCwd: string | null = null;

export function loadConfig(cwd?: string): WebToolsConfig {
  const dir = cwd || process.cwd();
  if (cachedConfig && cachedCwd === dir) return cachedConfig;

  try {
    const path = resolve(dir, ".pi/agent/web-tools.json");
    const raw = readFileSync(path, "utf-8");
    cachedConfig = JSON.parse(raw) as WebToolsConfig;
    cachedCwd = dir;
    return cachedConfig;
  } catch {
    cachedConfig = {};
    cachedCwd = dir;
    return cachedConfig;
  }
}

export function resolveSetting(
  value: string | undefined,
  configValue: string | undefined,
  defaultValue: string,
): string {
  return value || configValue || defaultValue;
}
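
`loadConfig` in `package/src/config.ts` caches the parsed file for the most recently used working directory and falls back to an empty object when the file is missing or malformed. The sketch below reproduces that behavior with an injected reader so it can run without touching the filesystem. `makeConfigLoader` and `readFile` are hypothetical names, not part of the package, and the directory names are invented.

```typescript
interface WebToolsConfig {
  aliyun?: {
    baseUrl?: string;
    aliyunProviderKey?: string;
    deepSearchModel?: string;
    imageSearchModel?: string;
  };
}

function makeConfigLoader(readFile: (dir: string) => string) {
  let cachedConfig: WebToolsConfig | null = null;
  let cachedDir: string | null = null;

  return function loadConfig(dir: string): WebToolsConfig {
    // Reuse the cached value only for the same directory.
    if (cachedConfig && cachedDir === dir) return cachedConfig;
    let parsed: WebToolsConfig;
    try {
      parsed = JSON.parse(readFile(dir)) as WebToolsConfig;
    } catch {
      // Missing or malformed file: behave as if no config was provided.
      parsed = {};
    }
    cachedConfig = parsed;
    cachedDir = dir;
    return parsed;
  };
}

// Fake reader that counts how often the "file" is read.
let reads = 0;
const loadConfig = makeConfigLoader((dir) => {
  reads += 1;
  if (dir === "/with-config") return '{"aliyun":{"deepSearchModel":"model-from-file"}}';
  throw new Error("ENOENT");
});

console.log(loadConfig("/with-config").aliyun?.deepSearchModel);
console.log(loadConfig("/with-config") === loadConfig("/with-config"));
console.log(loadConfig("/no-config"));
console.log(reads);
```

Only one directory is cached at a time, and a cached entry is never invalidated, so edits to the config file are not picked up until a different directory is loaded or the process restarts.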
package/src/deep_search/aliyun.ts
ADDED
@@ -0,0 +1,48 @@
import type { ExtensionContext } from "@earendil-works/pi-coding-agent";
import type OpenAI from "openai";
import { resolveSetting } from "../config";
import { createAliyunClient } from "../openai_client";
import { resolveAliyunProvider } from "../provider";
import type { DeepSearchOptions, DeepSearchResponse } from "./types";

const DEFAULT_DEEP_SEARCH_MODEL = "deepseek-v4-flash";
const TIMEOUT_MS = 120_000;

export async function aliyunDeepSearch(
  query: string,
  signal?: AbortSignal,
  config?: { baseUrl?: string; aliyunProviderKey?: string; deepSearchModel?: string },
  ctx?: ExtensionContext,
  searchOpts?: DeepSearchOptions,
): Promise<DeepSearchResponse> {
  const { apiKey, baseUrl } = await resolveAliyunProvider({ ctx, config });
  const model = resolveSetting(process.env.ALIYUN_DEEP_SEARCH_MODEL, config?.deepSearchModel, DEFAULT_DEEP_SEARCH_MODEL);
  const client = createAliyunClient({ apiKey, baseUrl });

  const s = signal ? AbortSignal.any([signal, AbortSignal.timeout(TIMEOUT_MS)]) : AbortSignal.timeout(TIMEOUT_MS);

  const { enableSearchExtension, freshness, assignedSiteList, enableImageOutput } = searchOpts ?? {};

  const searchOptions: Record<string, unknown> = {
    search_strategy: "turbo",
    forced_search: true,
    ...(enableSearchExtension && { enable_search_extension: true }),
    ...(freshness && { freshness }),
    ...(assignedSiteList?.length && { assigned_site_list: assignedSiteList }),
  };

  const completion = await client.chat.completions.create(
    {
      model,
      messages: [{ role: "user", content: query }],
      stream: false,
      enable_search: true,
      search_options: searchOptions,
      ...(enableImageOutput && { enable_text_image_mixed: true }),
    } as unknown as OpenAI.Chat.Completions.ChatCompletionCreateParamsNonStreaming,
    { signal: s },
  );

  const answer = completion.choices[0]?.message?.content || "No results";
  return { answer, sources: [] };
}
package/src/deep_search/index.ts
ADDED
@@ -0,0 +1 @@
export { aliyunDeepSearch as deepSearch } from "./aliyun";
package/src/deep_search/types.ts
ADDED
@@ -0,0 +1,16 @@
export interface DeepSearchSource {
  title: string;
  url: string;
}

export interface DeepSearchResponse {
  answer: string;
  sources: DeepSearchSource[];
}

export interface DeepSearchOptions {
  enableSearchExtension?: boolean;
  freshness?: number;
  assignedSiteList?: string[];
  enableImageOutput?: boolean;
}
@@ -0,0 +1,87 @@
|
|
|
1
|
+
import type { ExtensionContext } from "@earendil-works/pi-coding-agent";
|
|
2
|
+
import type OpenAI from "openai";
|
|
3
|
+
import { resolveSetting } from "../config";
|
|
4
|
+
import { createAliyunClient } from "../openai_client";
|
|
5
|
+
import { resolveAliyunProvider } from "../provider";
|
|
6
|
+
import type { ImageResult, ImageSearchParams, ImageSearchResponse } from "./types";
|
|
7
|
+
|
|
8
|
+
const DEFAULT_IMAGE_SEARCH_MODEL = "qwen3.7-plus";
|
|
9
|
+
const TIMEOUT_MS = 120_000;
|
|
10
|
+
|
|
11
|
+
export async function aliyunImageSearch(
|
|
12
|
+
params: ImageSearchParams,
|
|
13
|
+
signal?: AbortSignal,
|
|
14
|
+
config?: { baseUrl?: string; aliyunProviderKey?: string; imageSearchModel?: string },
|
|
15
|
+
ctx?: ExtensionContext,
|
|
16
|
+
): Promise<ImageSearchResponse> {
|
|
17
|
+
const { query, imageUrl } = params;
|
|
18
|
+
|
|
19
|
+
if (!query && !imageUrl) {
|
|
20
|
+
throw new Error("At least one of query or imageUrl must be provided");
|
|
21
|
+
}
|
|
22
|
+
|
|
23
|
+
const { apiKey, baseUrl } = await resolveAliyunProvider({ ctx, config });
|
|
24
|
+
const model = resolveSetting(
|
|
25
|
+
process.env.ALIYUN_IMAGE_SEARCH_MODEL,
|
|
26
|
+
config?.imageSearchModel,
|
|
27
|
+
DEFAULT_IMAGE_SEARCH_MODEL,
|
|
28
|
+
);
|
|
29
|
+
const client = createAliyunClient({ apiKey, baseUrl });
|
|
30
|
+
|
|
31
|
+
const s = signal ? AbortSignal.any([signal, AbortSignal.timeout(TIMEOUT_MS)]) : AbortSignal.timeout(TIMEOUT_MS);
|
|
32
|
+
|
|
33
|
+
let input: unknown;
|
|
34
|
+
let tools: Array<{ type: string }>;
|
|
35
|
+
|
|
36
|
+
if (imageUrl) {
|
|
37
|
+
tools = [{ type: "image_search" }];
|
|
38
|
+
const content: Array<{ type: string; [key: string]: unknown }> = [];
|
|
39
|
+
if (query) {
|
|
40
|
+
content.push({ type: "input_text", text: query });
|
|
41
|
+
}
|
|
42
|
+
content.push({ type: "input_image", image_url: imageUrl });
|
|
43
|
+
input = [{ role: "user", content }];
|
|
44
|
+
} else {
|
|
45
|
+
tools = [{ type: "web_search_image" }];
|
|
46
|
+
input = query;
|
|
47
|
+
}
|
|
48
|
+
|
|
49
|
+
const response = (await client.responses.create(
|
|
50
|
+
{ model, input, tools } as unknown as OpenAI.Responses.ResponseCreateParams,
|
|
51
|
+
{ signal: s },
|
|
52
|
+
)) as unknown as AliyunImageResponse;
|
|
53
|
+
|
|
54
|
+
const images = parseImages(response.output);
|
|
55
|
+
const answer = parseAnswer(response.output);
|
|
56
|
+
|
|
57
|
+
return { answer, images };
|
|
58
|
+
}
|
|
59
|
+
|
|
60
|
+
interface AliyunImageResponse {
|
|
61
|
+
output?: Array<{
|
|
62
|
+
type: string;
|
|
63
|
+
output?: string;
|
|
64
|
+
content?: Array<{ type?: string; text?: string }>;
|
|
65
|
+
}>;
|
|
66
|
+
}
|
|
67
|
+
|
|
68
|
+
function parseImages(output: AliyunImageResponse["output"] = []): ImageResult[] {
|
|
69
|
+
for (const item of output) {
|
|
70
|
+
if (item.type === "web_search_image_call" || item.type === "image_search_call") {
|
|
71
|
+
try {
|
|
72
|
+
return JSON.parse(item.output || "[]") as ImageResult[];
|
|
73
|
+
} catch {
|
|
74
|
+
return [];
|
|
75
|
+
}
|
|
76
|
+
}
|
|
77
|
+
}
|
|
78
|
+
return [];
|
|
79
|
+
}
|
|
80
|
+
|
|
81
|
+
function parseAnswer(output: AliyunImageResponse["output"] = []): string {
|
|
82
|
+
const messages = output.filter((item) => item.type === "message");
|
|
83
|
+
const texts = messages.flatMap((m) =>
|
|
84
|
+
(m.content || []).filter((c) => c.type === "output_text").map((c) => c.text || ""),
|
|
85
|
+
);
|
|
86
|
+
return texts.join("\n") || "No results";
|
|
87
|
+
}
|
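Note on `parseAnswer` above: a minimal, self-contained sketch of how it flattens a Responses-style `output` array into a single string. The type name `OutputItem`, the function name `collectAnswer`, and the sample payload are hypothetical and invented for illustration; the payload is not a captured Aliyun response.

```typescript
// Hypothetical, trimmed-down shape mirroring AliyunImageResponse["output"] in the diff.
type OutputItem = {
  type: string;
  output?: string;
  content?: Array<{ type?: string; text?: string }>;
};

// Same logic as parseAnswer in the diff: keep only "message" items,
// collect their "output_text" parts, and join them with newlines.
function collectAnswer(output: OutputItem[] = []): string {
  const texts = output
    .filter((item) => item.type === "message")
    .flatMap((m) => (m.content || []).filter((c) => c.type === "output_text").map((c) => c.text || ""));
  return texts.join("\n") || "No results";
}

// Invented sample payload (assumption, for illustration only).
const sample: OutputItem[] = [
  { type: "web_search_image_call", output: '[{"index":1,"title":"cat","url":"https://example.com/cat.png"}]' },
  { type: "message", content: [{ type: "output_text", text: "Found one image." }] },
];

console.log(collectAnswer(sample));
console.log(collectAnswer([]));
```

Tool-call items are ignored here; in the diff they are handled separately by `parseImages`.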
package/src/image_search/index.ts
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
export { aliyunImageSearch as imageSearch } from "./aliyun";
|
package/src/image_search/types.ts
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
export interface ImageResult {
|
|
2
|
+
index: number;
|
|
3
|
+
title: string;
|
|
4
|
+
url: string;
|
|
5
|
+
}
|
|
6
|
+
|
|
7
|
+
export interface ImageSearchResponse {
|
|
8
|
+
answer: string;
|
|
9
|
+
images: ImageResult[];
|
|
10
|
+
}
|
|
11
|
+
|
|
12
|
+
export interface ImageSearchParams {
|
|
13
|
+
query?: string;
|
|
14
|
+
imageUrl?: string;
|
|
15
|
+
}
|
package/src/provider.ts
ADDED
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
import type { ExtensionContext } from "@earendil-works/pi-coding-agent";
|
|
2
|
+
|
|
3
|
+
const DEFAULT_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1";
|
|
4
|
+
const DEFAULT_PROVIDER_KEY = "aliyun";
|
|
5
|
+
|
|
6
|
+
interface ProviderConfig {
|
|
7
|
+
baseUrl?: string;
|
|
8
|
+
aliyunProviderKey?: string;
|
|
9
|
+
}
|
|
10
|
+
|
|
11
|
+
interface ResolvedProvider {
|
|
12
|
+
apiKey: string;
|
|
13
|
+
baseUrl: string;
|
|
14
|
+
}
|
|
15
|
+
|
|
16
|
+
export async function resolveAliyunProvider(opts: {
|
|
17
|
+
ctx?: ExtensionContext;
|
|
18
|
+
config?: ProviderConfig;
|
|
19
|
+
}): Promise<ResolvedProvider> {
|
|
20
|
+
const { ctx, config } = opts;
|
|
21
|
+
const providerKey = config?.aliyunProviderKey ?? DEFAULT_PROVIDER_KEY;
|
|
22
|
+
|
|
23
|
+
// --- apiKey ---
|
|
24
|
+
let apiKey: string | undefined;
|
|
25
|
+
|
|
26
|
+
const envApiKey = process.env.ALIYUN_API_KEY;
|
|
27
|
+
if (envApiKey) {
|
|
28
|
+
apiKey = envApiKey;
|
|
29
|
+
}
|
|
30
|
+
|
|
31
|
+
if (!apiKey && providerKey && ctx) {
|
|
32
|
+
const providerKeyResult = await ctx.modelRegistry.getApiKeyForProvider(providerKey);
|
|
33
|
+
if (providerKeyResult) {
|
|
34
|
+
apiKey = providerKeyResult;
|
|
35
|
+
}
|
|
36
|
+
}
|
|
37
|
+
|
|
38
|
+
if (!apiKey) {
|
|
39
|
+
throw new Error(
|
|
40
|
+
"ALIYUN_API_KEY not configured. Set ALIYUN_API_KEY or configure aliyunProviderKey with a valid pi provider.",
|
|
41
|
+
);
|
|
42
|
+
}
|
|
43
|
+
|
|
44
|
+
// --- baseUrl ---
|
|
45
|
+
let baseUrl: string | undefined;
|
|
46
|
+
|
|
47
|
+
const envBaseUrl = process.env.ALIYUN_BASE_URL;
|
|
48
|
+
if (envBaseUrl) {
|
|
49
|
+
baseUrl = envBaseUrl;
|
|
50
|
+
}
|
|
51
|
+
|
|
52
|
+
if (!baseUrl && providerKey && ctx) {
|
|
53
|
+
const allModels = ctx.modelRegistry.getAll();
|
|
54
|
+
const matchingModel = allModels.find((m) => m.provider === providerKey);
|
|
55
|
+
if (matchingModel?.baseUrl) {
|
|
56
|
+
baseUrl = matchingModel.baseUrl;
|
|
57
|
+
}
|
|
58
|
+
}
|
|
59
|
+
|
|
60
|
+
if (!baseUrl && config?.baseUrl) {
|
|
61
|
+
baseUrl = config.baseUrl;
|
|
62
|
+
}
|
|
63
|
+
|
|
64
|
+
if (!baseUrl) {
|
|
65
|
+
baseUrl = DEFAULT_BASE_URL;
|
|
66
|
+
}
|
|
67
|
+
|
|
68
|
+
return { apiKey, baseUrl };
|
|
69
|
+
}
|
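Note on `resolveAliyunProvider` above: the base URL lookup amounts to a first-non-empty chain (environment variable, then pi's model registry, then extension config, then the built-in default). A minimal sketch of that ordering, assuming plain string inputs; `pickBaseUrl` and its parameter names are hypothetical and not part of the package.

```typescript
const DEFAULT_BASE_URL = "https://dashscope.aliyuncs.com/compatible-mode/v1";

// Hypothetical helper: takes the candidates in the same order the diff checks them.
function pickBaseUrl(fromEnv?: string, fromRegistry?: string, fromConfig?: string): string {
  // `||` rather than `??`, so empty strings fall through,
  // matching the truthiness checks used in the diff.
  return fromEnv || fromRegistry || fromConfig || DEFAULT_BASE_URL;
}

console.log(pickBaseUrl(undefined, "https://registry.example/v1", "https://config.example/v1"));
console.log(pickBaseUrl());
```

The API key follows a shorter chain in the diff (environment variable, then the registry) and throws instead of falling back to a default.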
package/src/web_fetch.ts
ADDED
|
@@ -0,0 +1,110 @@
|
|
|
1
|
+
const MAX_CONTENT_CHARS = 100_000;
|
|
2
|
+
|
|
3
|
+
export interface FetchResult {
|
|
4
|
+
content: string;
|
|
5
|
+
contentType: string;
|
|
6
|
+
url: string;
|
|
7
|
+
status: number;
|
|
8
|
+
}
|
|
9
|
+
|
|
10
|
+
export async function webFetch(
|
|
11
|
+
url: string,
|
|
12
|
+
format: "text" | "markdown" | "html",
|
|
13
|
+
timeout: number,
|
|
14
|
+
signal?: AbortSignal,
|
|
15
|
+
): Promise<FetchResult> {
|
|
16
|
+
let parsedUrl: URL;
|
|
17
|
+
try {
|
|
18
|
+
parsedUrl = new URL(url);
|
|
19
|
+
} catch {
|
|
20
|
+
throw new Error(`Invalid URL: ${url}`);
|
|
21
|
+
}
|
|
22
|
+
|
|
23
|
+
if (parsedUrl.protocol !== "http:" && parsedUrl.protocol !== "https:") {
|
|
24
|
+
throw new Error(`Unsupported protocol: ${parsedUrl.protocol}`);
|
|
25
|
+
}
|
|
26
|
+
|
|
27
|
+
const timeoutMs = Math.min(timeout * 1000, 120_000);
|
|
28
|
+
const timeoutSignal = AbortSignal.timeout(timeoutMs);
|
|
29
|
+
const s = signal ? AbortSignal.any([signal, timeoutSignal]) : timeoutSignal;
|
|
30
|
+
|
|
31
|
+
const response = await fetch(url, {
|
|
32
|
+
method: "GET",
|
|
33
|
+
headers: {
|
|
34
|
+
"User-Agent": "pi-web-tools/1.0",
|
|
35
|
+
Accept: format === "html" ? "text/html" : "text/html, text/plain, application/json",
|
|
36
|
+
},
|
|
37
|
+
redirect: "follow",
|
|
38
|
+
signal: s,
|
|
39
|
+
});
|
|
40
|
+
|
|
41
|
+
const contentType = response.headers.get("content-type") || "text/plain";
|
|
42
|
+
const body = await response.text();
|
|
43
|
+
|
|
44
|
+
if (!response.ok) {
|
|
45
|
+
const truncated = body.slice(0, 500);
|
|
46
|
+
throw new Error(`HTTP ${response.status} from ${url}${truncated ? `: ${truncated}` : ""}`);
|
|
47
|
+
}
|
|
48
|
+
|
|
49
|
+
let content: string;
|
|
50
|
+
|
|
51
|
+
if (contentType.includes("application/json")) {
|
|
52
|
+
try {
|
|
53
|
+
content = JSON.stringify(JSON.parse(body), null, 2);
|
|
54
|
+
} catch {
|
|
55
|
+
content = body;
|
|
56
|
+
}
|
|
57
|
+
} else if (format === "html") {
|
|
58
|
+
content = body;
|
|
59
|
+
} else {
|
|
60
|
+
content = htmlToText(body, format);
|
|
61
|
+
}
|
|
62
|
+
|
|
63
|
+
if (content.length > MAX_CONTENT_CHARS) {
|
|
64
|
+
const truncated = content.slice(0, MAX_CONTENT_CHARS);
|
|
65
|
+
content = `${truncated}\n\n... [truncated ${content.length - MAX_CONTENT_CHARS} characters]`;
|
|
66
|
+
}
|
|
67
|
+
|
|
68
|
+
return { content, contentType, url: response.url, status: response.status };
|
|
69
|
+
}
|
|
70
|
+
|
|
71
|
+
function htmlToText(html: string, format: "text" | "markdown"): string {
|
|
72
|
+
let text = html;
|
|
73
|
+
|
|
74
|
+
text = text.replace(/<head[\s\S]*?<\/head>/gi, "");
|
|
75
|
+
text = text.replace(/<style[\s\S]*?<\/style>/gi, "");
|
|
76
|
+
text = text.replace(/<script[\s\S]*?<\/script>/gi, "");
|
|
77
|
+
text = text.replace(/<noscript[\s\S]*?<\/noscript>/gi, "");
|
|
78
|
+
|
|
79
|
+
if (format === "markdown") {
|
|
80
|
+
text = text.replace(/<h1[^>]*>([\s\S]*?)<\/h1>/gi, "\n\n# $1\n\n");
|
|
81
|
+
text = text.replace(/<h2[^>]*>([\s\S]*?)<\/h2>/gi, "\n\n## $1\n\n");
|
|
82
|
+
text = text.replace(/<h3[^>]*>([\s\S]*?)<\/h3>/gi, "\n\n### $1\n\n");
|
|
83
|
+
text = text.replace(/<h4[^>]*>([\s\S]*?)<\/h4>/gi, "\n\n#### $1\n\n");
|
|
84
|
+
text = text.replace(/<h5[^>]*>([\s\S]*?)<\/h5>/gi, "\n\n##### $1\n\n");
|
|
85
|
+
text = text.replace(/<h6[^>]*>([\s\S]*?)<\/h6>/gi, "\n\n###### $1\n\n");
|
|
86
|
+
text = text.replace(/<strong[^>]*>([\s\S]*?)<\/strong>/gi, "**$1**");
|
|
87
|
+
text = text.replace(/<b[^>]*>([\s\S]*?)<\/b>/gi, "**$1**");
|
|
88
|
+
text = text.replace(/<em[^>]*>([\s\S]*?)<\/em>/gi, "*$1*");
|
|
89
|
+
text = text.replace(/<i[^>]*>([\s\S]*?)<\/i>/gi, "*$1*");
|
|
90
|
+
text = text.replace(/<code[^>]*>([\s\S]*?)<\/code>/gi, "`$1`");
|
|
91
|
+
text = text.replace(/<pre[^>]*>([\s\S]*?)<\/pre>/gi, "\n\n```\n$1\n```\n\n");
|
|
92
|
+
text = text.replace(/<a[^>]*href="([^"]*)"[^>]*>([\s\S]*?)<\/a>/gi, "[$2]($1)");
|
|
93
|
+
text = text.replace(/<li[^>]*>([\s\S]*?)<\/li>/gi, "- $1\n");
|
|
94
|
+
text = text.replace(/<p[^>]*>/gi, "\n\n");
|
|
95
|
+
}
|
|
96
|
+
|
|
97
|
+
text = text.replace(/<br\s*\/?>/gi, "\n");
|
|
98
|
+
text = text.replace(/<[^>]+>/g, "");
|
|
99
|
+
text = text.replace(/&amp;/g, "&");
|
|
100
|
+
text = text.replace(/&lt;/g, "<");
|
|
101
|
+
text = text.replace(/&gt;/g, ">");
|
|
102
|
+
text = text.replace(/&quot;/g, '"');
|
|
103
|
+
text = text.replace(/&#39;/g, "'");
|
|
104
|
+
text = text.replace(/&nbsp;/g, " ");
|
|
105
|
+
text = text.replace(/\n{3,}/g, "\n\n");
|
|
106
|
+
text = text.replace(/[ \t]+/g, " ");
|
|
107
|
+
text = text.trim();
|
|
108
|
+
|
|
109
|
+
return text;
|
|
110
|
+
}
|
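Note on the entity-replacement chain at the end of `htmlToText` above: the chain is order-sensitive. If `&amp;` is decoded first, doubly-escaped input such as `&amp;lt;` ends up as `<` instead of the literal text `&lt;`. The sketch below shows an alternative ordering that decodes `&amp;` last. It illustrates the ordering issue and is not the package's code; `decodeEntities` is a hypothetical name, and the entity set is an assumption limited to six common entities.

```typescript
// Hypothetical helper (not in the package): decode a handful of HTML entities,
// handling &amp; last so that "&amp;lt;" becomes the literal text "&lt;" rather than "<".
function decodeEntities(text: string): string {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&");
}

console.log(decodeEntities("a &lt; b &amp;&amp; c &gt; d")); // a < b && c > d
console.log(decodeEntities("&amp;lt;")); // &lt;
```

A full decoder would also need numeric and less common named entities; this sketch deliberately leaves them untouched.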
package/src/web_search/exa.ts
ADDED
|
@@ -0,0 +1,163 @@
|
|
|
1
|
+
import type { SearchResponse } from "./types";
|
|
2
|
+
|
|
3
|
+
const EXA_REST_URL = "https://api.exa.ai/search";
|
|
4
|
+
const EXA_MCP_URL = "https://mcp.exa.ai/mcp";
|
|
5
|
+
const MCP_TOOL_NAME = "web_search_exa";
|
|
6
|
+
const TIMEOUT_MS = 60_000;
|
|
7
|
+
|
|
8
|
+
export async function exaSearch(query: string, numResults: number, signal?: AbortSignal): Promise<SearchResponse> {
|
|
9
|
+
const apiKey = process.env.EXA_API_KEY;
|
|
10
|
+
|
|
11
|
+
const timeoutSignal = AbortSignal.timeout(TIMEOUT_MS);
|
|
12
|
+
const s = signal ? AbortSignal.any([signal, timeoutSignal]) : timeoutSignal;
|
|
13
|
+
|
|
14
|
+
if (!apiKey) {
|
|
15
|
+
return exaMcpSearch(query, numResults, s);
|
|
16
|
+
}
|
|
17
|
+
return exaRestSearch(query, numResults, apiKey, s);
|
|
18
|
+
}
|
|
19
|
+
|
|
20
|
+
async function exaRestSearch(
|
|
21
|
+
query: string,
|
|
22
|
+
numResults: number,
|
|
23
|
+
apiKey: string,
|
|
24
|
+
signal: AbortSignal,
|
|
25
|
+
): Promise<SearchResponse> {
|
|
26
|
+
const resp = await fetch(EXA_REST_URL, {
|
|
27
|
+
method: "POST",
|
|
28
|
+
headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
|
|
29
|
+
body: JSON.stringify({
|
|
30
|
+
query,
|
|
31
|
+
numResults,
|
|
32
|
+
type: "auto",
|
|
33
|
+
contents: { text: { maxCharacters: 3000 } },
|
|
34
|
+
}),
|
|
35
|
+
signal,
|
|
36
|
+
});
|
|
37
|
+
|
|
38
|
+
if (!resp.ok) {
|
|
39
|
+
const detail = await resp.text().catch(() => resp.statusText);
|
|
40
|
+
throw new Error(`Exa API ${resp.status}: ${detail}`);
|
|
41
|
+
}
|
|
42
|
+
|
|
43
|
+
const data = (await resp.json()) as {
|
|
44
|
+
results?: Array<{ title?: string; url?: string; text?: string }>;
|
|
45
|
+
};
|
|
46
|
+
const results = data.results || [];
|
|
47
|
+
|
|
48
|
+
const sources = results.map((r) => ({
|
|
49
|
+
title: r.title || "Untitled",
|
|
50
|
+
url: r.url || "",
|
|
51
|
+
snippet: (r.text || "").slice(0, 500),
|
|
52
|
+
}));
|
|
53
|
+
|
|
54
|
+
const answer = formatAnswer(sources, query);
|
|
55
|
+
return { answer, sources, sourceLabel: "exa" };
|
|
56
|
+
}
|
|
57
|
+
|
|
58
|
+
async function exaMcpSearch(query: string, numResults: number, signal: AbortSignal): Promise<SearchResponse> {
|
|
59
|
+
// 1. initialize
|
|
60
|
+
const initResult = await mcpCall(
|
|
61
|
+
"initialize",
|
|
62
|
+
{
|
|
63
|
+
protocolVersion: "2024-11-05",
|
|
64
|
+
capabilities: {},
|
|
65
|
+
clientInfo: { name: "pi-web-tools", version: "1.0" },
|
|
66
|
+
},
|
|
67
|
+
signal,
|
|
68
|
+
);
|
|
69
|
+
|
|
70
|
+
if (!initResult) {
|
|
71
|
+
throw new Error("Exa MCP initialize failed: no result");
|
|
72
|
+
}
|
|
73
|
+
|
|
74
|
+
// 2. search
|
|
75
|
+
const searchResult = (await mcpCall(
|
|
76
|
+
MCP_TOOL_NAME,
|
|
77
|
+
{
|
|
78
|
+
query,
|
|
79
|
+
numResults,
|
|
80
|
+
type: "auto",
|
|
81
|
+
contents: { text: { maxCharacters: 3000 } },
|
|
82
|
+
},
|
|
83
|
+
signal,
|
|
84
|
+
)) as { content?: Array<{ text?: string }> } | null;
|
|
85
|
+
|
|
86
|
+
const text = searchResult?.content?.map((c) => c.text || "").join("\n\n") || "";
|
|
87
|
+
const sources = parseMcpResults(text);
|
|
88
|
+
const answer = formatAnswer(sources, query);
|
|
89
|
+
|
|
90
|
+
return { answer, sources, sourceLabel: "exa" };
|
|
91
|
+
}
|
|
92
|
+
|
|
93
|
+
async function mcpCall(method: string, params: unknown, signal: AbortSignal): Promise<Record<string, unknown> | null> {
|
|
94
|
+
const resp = await fetch(EXA_MCP_URL, {
|
|
95
|
+
method: "POST",
|
|
96
|
+
headers: {
|
|
97
|
+
"Content-Type": "application/json",
|
|
98
|
+
Accept: "application/json, text/event-stream",
|
|
99
|
+
},
|
|
100
|
+
body: JSON.stringify({
|
|
101
|
+
jsonrpc: "2.0",
|
|
102
|
+
method: method === "initialize" ? "initialize" : "tools/call",
|
|
103
|
+
params: method === "initialize" ? params : { name: method, arguments: params },
|
|
104
|
+
id: 1,
|
|
105
|
+
}),
|
|
106
|
+
signal,
|
|
107
|
+
});
|
|
108
|
+
|
|
109
|
+
if (!resp.ok) {
|
|
110
|
+
throw new Error(`Exa MCP ${method} failed: ${resp.status}`);
|
|
111
|
+
}
|
|
112
|
+
|
|
113
|
+
const contentType = resp.headers?.get("content-type") || "";
|
|
114
|
+
|
|
115
|
+
if (contentType.includes("text/event-stream")) {
|
|
116
|
+
const text = await resp.text();
|
|
117
|
+
for (const line of text.split("\n")) {
|
|
118
|
+
if (line.startsWith("data: ")) {
|
|
119
|
+
try {
|
|
120
|
+
const parsed = JSON.parse(line.slice(6));
|
|
121
|
+
if (parsed.result) return parsed.result as Record<string, unknown>;
|
|
122
|
+
if (parsed.error) throw new Error(`Exa MCP error: ${parsed.error.message || JSON.stringify(parsed.error)}`);
|
|
123
|
+
} catch (e) {
|
|
124
|
+
if (e instanceof SyntaxError) continue;
|
|
125
|
+
throw e;
|
|
126
|
+
}
|
|
127
|
+
}
|
|
128
|
+
}
|
|
129
|
+
return null;
|
|
130
|
+
}
|
|
131
|
+
|
|
132
|
+
const data = (await resp.json()) as { result?: Record<string, unknown>; error?: unknown };
|
|
133
|
+
if (data.error) throw new Error(`Exa MCP error: ${JSON.stringify(data.error)}`);
|
|
134
|
+
return data.result ?? null;
|
|
135
|
+
}
|
|
136
|
+
|
|
137
|
+
function formatAnswer(sources: Array<{ title: string; url: string; snippet: string }>, query: string): string {
|
|
138
|
+
return (
|
|
139
|
+
sources.map((s, i) => `${i + 1}. [${s.title}](${s.url})\n ${s.snippet}`).join("\n\n") ||
|
|
140
|
+
`No results found for: ${query}`
|
|
141
|
+
);
|
|
142
|
+
}
|
|
143
|
+
|
|
144
|
+
function parseMcpResults(text: string): Array<{ title: string; url: string; snippet: string }> {
|
|
145
|
+
const results: Array<{ title: string; url: string; snippet: string }> = [];
|
|
146
|
+
const blocks = text.split(/\n\n---\n\n/);
|
|
147
|
+
|
|
148
|
+
for (const block of blocks) {
|
|
149
|
+
const titleMatch = block.match(/^Title:\s*(.+)$/m);
|
|
150
|
+
const urlMatch = block.match(/^URL:\s*(.+)$/m);
|
|
151
|
+
const highlightsMatch = block.match(/^Highlights:\s*\n([\s\S]*?)$/m);
|
|
152
|
+
|
|
153
|
+
if (urlMatch) {
|
|
154
|
+
results.push({
|
|
155
|
+
title: titleMatch?.[1]?.trim() || "Untitled",
|
|
156
|
+
url: urlMatch[1].trim(),
|
|
157
|
+
snippet: highlightsMatch?.[1]?.trim().slice(0, 500) || "",
|
|
158
|
+
});
|
|
159
|
+
}
|
|
160
|
+
}
|
|
161
|
+
|
|
162
|
+
return results;
|
|
163
|
+
}
|
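Note on `mcpCall` above: the function accepts either a plain JSON body or a `text/event-stream` body, and in the latter case scans for the first `data: ` line that carries a JSON-RPC `result`. A synchronous sketch of that scan follows. The name `firstSseResult`, the `RpcMessage` type, and the sample stream are hypothetical; the stream is invented and is not a captured Exa response.

```typescript
// Hypothetical, minimal JSON-RPC message shape used only by this sketch.
type RpcMessage = { result?: Record<string, unknown>; error?: { message?: string } };

// Hypothetical helper mirroring the event-stream branch of mcpCall in the diff.
function firstSseResult(body: string): Record<string, unknown> | null {
  for (const line of body.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    let parsed: RpcMessage | undefined;
    try {
      parsed = JSON.parse(line.slice(6)) as RpcMessage;
    } catch {
      continue; // skip data lines that are not valid JSON
    }
    if (parsed?.result) return parsed.result;
    if (parsed?.error) {
      throw new Error(`MCP error: ${parsed.error.message || JSON.stringify(parsed.error)}`);
    }
  }
  return null; // no data line carried a result
}

// Invented sample stream (assumption, for illustration only).
const stream = [
  "event: message",
  'data: {"jsonrpc":"2.0","id":1,"result":{"content":[{"type":"text","text":"Title: Example"}]}}',
  "",
].join("\n");

console.log(firstSseResult(stream));
```

Like the code in the diff, this treats each `data: ` line as a complete JSON document, so an event whose payload is split across several `data:` lines would be skipped.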
package/src/web_search/index.ts
ADDED
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
import { exaSearch } from "./exa";
|
|
2
|
+
import type { SearchResponse } from "./types";
|
|
3
|
+
|
|
4
|
+
type SearchFn = (query: string, numResults: number, signal?: AbortSignal) => Promise<SearchResponse>;
|
|
5
|
+
|
|
6
|
+
interface SourceEntry {
|
|
7
|
+
name: string;
|
|
8
|
+
fn: SearchFn;
|
|
9
|
+
}
|
|
10
|
+
|
|
11
|
+
const SOURCES: SourceEntry[] = [
|
|
12
|
+
{
|
|
13
|
+
name: "exa",
|
|
14
|
+
fn: exaSearch,
|
|
15
|
+
},
|
|
16
|
+
];
|
|
17
|
+
|
|
18
|
+
export async function search(
|
|
19
|
+
query: string,
|
|
20
|
+
numResults: number,
|
|
21
|
+
signal?: AbortSignal,
|
|
22
|
+
onProgress?: (msg: string) => void,
|
|
23
|
+
specifiedSource?: string,
|
|
24
|
+
): Promise<SearchResponse> {
|
|
25
|
+
const errors: string[] = [];
|
|
26
|
+
|
|
27
|
+
const sources = specifiedSource ? SOURCES.filter((s) => s.name === specifiedSource) : SOURCES;
|
|
28
|
+
|
|
29
|
+
if (specifiedSource && sources.length === 0) {
|
|
30
|
+
throw new Error(`Unknown source: ${specifiedSource}. Available: ${SOURCES.map((s) => s.name).join(", ")}`);
|
|
31
|
+
}
|
|
32
|
+
|
|
33
|
+
for (const source of sources) {
|
|
34
|
+
try {
|
|
35
|
+
onProgress?.(`Trying ${source.name}...`);
|
|
36
|
+
const resp = await source.fn(query, numResults, signal);
|
|
37
|
+
return resp;
|
|
38
|
+
} catch (err) {
|
|
39
|
+
const msg = err instanceof Error ? err.message : String(err);
|
|
40
|
+
errors.push(`${source.name}: ${msg}`);
|
|
41
|
+
}
|
|
42
|
+
}
|
|
43
|
+
|
|
44
|
+
throw new Error(`All search sources failed:\n${errors.map((e) => ` - ${e}`).join("\n")}`);
|
|
45
|
+
}
|