webveil 0.0.0 → 0.1.0
- package/LICENSE +661 -0
- package/README.md +101 -0
- package/dist/cli.d.ts +58 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +91 -0
- package/dist/cli.js.map +1 -0
- package/dist/core/backends/custom.d.ts +15 -0
- package/dist/core/backends/custom.d.ts.map +1 -0
- package/dist/core/backends/custom.js +106 -0
- package/dist/core/backends/custom.js.map +1 -0
- package/dist/core/backends/registry.d.ts +13 -0
- package/dist/core/backends/registry.d.ts.map +1 -0
- package/dist/core/backends/registry.js +31 -0
- package/dist/core/backends/registry.js.map +1 -0
- package/dist/core/backends/searxng.d.ts +8 -0
- package/dist/core/backends/searxng.d.ts.map +1 -0
- package/dist/core/backends/searxng.js +43 -0
- package/dist/core/backends/searxng.js.map +1 -0
- package/dist/core/backends/tavily-compat.d.ts +10 -0
- package/dist/core/backends/tavily-compat.d.ts.map +1 -0
- package/dist/core/backends/tavily-compat.js +85 -0
- package/dist/core/backends/tavily-compat.js.map +1 -0
- package/dist/core/backends/types.d.ts +48 -0
- package/dist/core/backends/types.d.ts.map +1 -0
- package/dist/core/backends/types.js +5 -0
- package/dist/core/backends/types.js.map +1 -0
- package/dist/core/config.d.ts +39 -0
- package/dist/core/config.d.ts.map +1 -0
- package/dist/core/config.js +72 -0
- package/dist/core/config.js.map +1 -0
- package/dist/core/egress.d.ts +30 -0
- package/dist/core/egress.d.ts.map +1 -0
- package/dist/core/egress.js +87 -0
- package/dist/core/egress.js.map +1 -0
- package/dist/core/extract.d.ts +45 -0
- package/dist/core/extract.d.ts.map +1 -0
- package/dist/core/extract.js +36 -0
- package/dist/core/extract.js.map +1 -0
- package/dist/core/fetch.d.ts +42 -0
- package/dist/core/fetch.d.ts.map +1 -0
- package/dist/core/fetch.js +76 -0
- package/dist/core/fetch.js.map +1 -0
- package/dist/core/http.d.ts +8 -0
- package/dist/core/http.d.ts.map +1 -0
- package/dist/core/http.js +49 -0
- package/dist/core/http.js.map +1 -0
- package/dist/core/search.d.ts +31 -0
- package/dist/core/search.d.ts.map +1 -0
- package/dist/core/search.js +65 -0
- package/dist/core/search.js.map +1 -0
- package/dist/core/security.d.ts +35 -0
- package/dist/core/security.d.ts.map +1 -0
- package/dist/core/security.js +141 -0
- package/dist/core/security.js.map +1 -0
- package/dist/index.d.ts +22 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +40 -0
- package/dist/index.js.map +1 -0
- package/package.json +62 -2
- package/src/cli.ts +106 -0
- package/src/core/backends/custom.ts +159 -0
- package/src/core/backends/registry.ts +41 -0
- package/src/core/backends/searxng.ts +70 -0
- package/src/core/backends/tavily-compat.ts +156 -0
- package/src/core/backends/types.ts +61 -0
- package/src/core/config.ts +106 -0
- package/src/core/egress.ts +106 -0
- package/src/core/extract.ts +82 -0
- package/src/core/fetch.ts +132 -0
- package/src/core/http.ts +62 -0
- package/src/core/search.ts +104 -0
- package/src/core/security.ts +141 -0
- package/src/index.ts +82 -0
package/README.md
ADDED
|
@@ -0,0 +1,101 @@
|
|
|
1
|
+
# webveil
|
|
2
|
+
|
|
3
|
+
**Anonymous-capable, self-hosted, account-free** web **search + fetch** for AI agents.
|
|
4
|
+
|
|
5
|
+
webveil replaces account-bound tools (notably Ollama's `web_search` / `web_fetch`, which
|
|
6
|
+
proxy a hosted service and sign every request with your account identity) with a
|
|
7
|
+
self-hosted path that has **no account, no API key**, and an **egress you control**
|
|
8
|
+
(direct, HTTP proxy, or SOCKS5/Tor) so searches and fetches can be anonymous. It also
|
|
9
|
+
works perfectly well non-anonymously (direct egress).
|
|
10
|
+
|
|
11
|
+
## Packages
|
|
12
|
+
|
|
13
|
+
webveil is a pnpm workspace monorepo. The **core** (`search()` / `fetch()`) is plain,
|
|
14
|
+
framework-agnostic. Two thin frontends wrap that same core:
|
|
15
|
+
|
|
16
|
+
- **[`webveil`](packages/webveil)** — an [incur](https://github.com/wevm/incur)-based
|
|
17
|
+
**CLI + MCP server** (`--mcp`, skills, `--llms`, TOON output). Pi-agnostic; usable by any
|
|
18
|
+
agent (pi via pi-mcp-adapter, Claude Code, Cursor, Codex, bash). Has a `webveil` bin.
|
|
19
|
+
- **[`pi-webveil`](packages/pi-webveil)** — a **pi extension** registering `web_search` and
|
|
20
|
+
`web_fetch` tools that call the core in-process. A drop-in replacement for Ollama's tools
|
|
21
|
+
(same names), which is the original motivation. Depends on `webveil` via `workspace:*`.
|
|
22
|
+
|
|
23
|
+
## How it works (seams)
|
|
24
|
+
|
|
25
|
+
- **core** — the framework-agnostic `search(query, opts)` and `fetch(url, opts)` functions.
|
|
26
|
+
Both frontends call the same core.
|
|
27
|
+
- **backend seam** — where results/content come from: `searxng` (keyless self-hosted
|
|
28
|
+
metasearch), `tavily-compat` (a generic Tavily-shaped `/search` + `/extract`), and
|
|
29
|
+
`custom` (a local command via a JSON stdin/stdout contract). The backend is handed a
|
|
30
|
+
proxied `http` helper so it cannot bypass egress.
|
|
31
|
+
- **egress seam** — how outbound HTTP leaves the machine: `direct`, `http` (undici
|
|
32
|
+
`ProxyAgent`), or `socks5` (Tor `127.0.0.1:9050`, Mullvad `10.64.0.1:1080`). SOCKS5 is
|
|
33
|
+
the mode that matters for anonymity. Egress fails loudly if a configured proxy cannot be built.
|
|
34
|
+
- **config seam** — per-folder resolution: env > nearest `.pi/webveil.json` walking up from
|
|
35
|
+
cwd > global `~/.pi/agent/webveil.json` > defaults. Per folder = per account/egress.
|
|
36
|
+
- **extractor seam** — `urlToMarkdown` via `distilly/fetch` by default, injected with
|
|
37
|
+
webveil's egress-bound `fetch`; a backend's own `/extract` (Tavily-compat) may override
|
|
38
|
+
it. Owns the context-friendly markdown + size presets (`s`/`m`/`l`/`f`). See
|
|
39
|
+
[`docs/adr/0001`](docs/adr/0001-extractor-uses-distilly-fetch-with-injected-egress.md).
|
|
40
|
+
- **security** — an SSRF guard lives in the egress fetch, so it covers distilly's
|
|
41
|
+
rule-rewritten requests too.
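
The config seam's precedence (env > nearest `.pi/webveil.json` > global file > defaults) can be read as a layered merge. The following is a minimal sketch of that idea, not webveil's actual `config.ts`: `ConfigLayer`, `resolveLayers`, the `egress` key, and the sample values are hypothetical; only `backend`, `baseUrl`, and `apiKey` are names that appear in the package.

```typescript
// Hypothetical config shape. `egress` and its values are illustrative only.
type ConfigKey = 'backend' | 'baseUrl' | 'apiKey' | 'egress';
type ConfigLayer = Partial<Record<ConfigKey, string>>;

const KEYS: readonly ConfigKey[] = ['backend', 'baseUrl', 'apiKey', 'egress'];

// Merge layers passed in ascending priority (defaults first, env last).
// A later layer overrides an earlier one only for the keys it actually sets.
function resolveLayers(...layers: ConfigLayer[]): ConfigLayer {
  const resolved: ConfigLayer = {};
  for (const layer of layers) {
    for (const key of KEYS) {
      const value = layer[key];
      if (value !== undefined) resolved[key] = value;
    }
  }
  return resolved;
}

// Assumed sample layers, lowest priority first.
const defaults: ConfigLayer = { backend: 'searxng', egress: 'direct' };
const globalFile: ConfigLayer = { baseUrl: 'http://localhost:8080' };
const nearestFile: ConfigLayer = { egress: 'socks5' };
const env: ConfigLayer = { backend: 'custom' };

const config = resolveLayers(defaults, globalFile, nearestFile, env);
console.log(config);
```

Under these assumptions the env layer decides `backend`, the nearest folder file decides `egress`, and the global file supplies `baseUrl`, which is the "per folder" behaviour the bullet describes.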
|
|
42
|
+
|
|
43
|
+
## License
|
|
44
|
+
|
|
45
|
+
AGPL-3.0-or-later. webveil depends on `distilly` (MIT, the local HTML-to-markdown
|
|
46
|
+
extractor; webveil uses its networked `distilly/fetch` entrypoint with an injected egress
|
|
47
|
+
fetch) and `incur` (MIT). MIT code may be used by AGPL software; `distilly` stays
|
|
48
|
+
GPL/AGPL-free so it remains cleanly reusable under MIT. See [`LICENSE`](LICENSE) and
|
|
49
|
+
[`COPYRIGHT`](COPYRIGHT).
|
|
50
|
+
|
|
51
|
+
## Size discipline (per-module LOC)
|
|
52
|
+
|
|
53
|
+
Every module stays small with one responsibility. Per-module LOC is tracked here as a
|
|
54
|
+
first-class quality signal. `target` is the rough ceiling from `CONTEXT.md` (a ceiling, not
|
|
55
|
+
a promise); `LOC` is the actual line count of each `src/` file as shipped.
|
|
56
|
+
|
|
57
|
+
### `packages/webveil` (core + CLI/MCP frontend)
|
|
58
|
+
|
|
59
|
+
| module | LOC | target |
|
|
60
|
+
| ---------------------------------- | ---: | -----: |
|
|
61
|
+
| src/index.ts (barrel) | 82 | - |
|
|
62
|
+
| src/cli.ts (incur frontend) | 106 | ~80 |
|
|
63
|
+
| src/core/search.ts | 104 | ~90 |
|
|
64
|
+
| src/core/fetch.ts | 132 | ~90 |
|
|
65
|
+
| src/core/config.ts | 106 | ~80 |
|
|
66
|
+
| src/core/egress.ts | 106 | ~70 |
|
|
67
|
+
| src/core/http.ts | 62 | ~60 |
|
|
68
|
+
| src/core/extract.ts | 82 | ~60 |
|
|
69
|
+
| src/core/security.ts (SSRF guard) | 141 | - |
|
|
70
|
+
| src/core/backends/types.ts | 61 | ~40 |
|
|
71
|
+
| src/core/backends/registry.ts | 41 | ~60 |
|
|
72
|
+
| src/core/backends/searxng.ts | 70 | ~90 |
|
|
73
|
+
| src/core/backends/tavily-compat.ts | 156 | ~90 |
|
|
74
|
+
| src/core/backends/custom.ts | 159 | ~70 |
|
|
75
|
+
| **subtotal** | 1408 | |
|
|
76
|
+
|
|
77
|
+
### `packages/pi-webveil` (pi extension frontend)
|
|
78
|
+
|
|
79
|
+
| module | LOC | target |
|
|
80
|
+
| ------------ | --: | -----: |
|
|
81
|
+
| src/index.ts | 168 | ~90 |
|
|
82
|
+
|
|
83
|
+
**Total own source: 1576 LOC** (excluding deps).
|
|
84
|
+
|
|
85
|
+
> Reality vs. target: several modules currently exceed their `CONTEXT.md` ceilings (notably
|
|
86
|
+
> `tavily-compat.ts`, `custom.ts`, `pi-webveil/src/index.ts`), and two built modules
|
|
87
|
+
> (`index.ts` barrel and `security.ts` SSRF guard) were not in the original target list. The
|
|
88
|
+
> table above reflects the modules as actually built. For calibration, comparable pi
|
|
89
|
+
> web-search extensions: `pi-searxng-search` 350 LOC (1 backend, no egress, no fetch),
|
|
90
|
+
> `leing2021/pi-search` 1714, `pi-search-hub` 9047, `pi-web-providers` 18961. webveil
|
|
91
|
+
> delivers a 3-backend + egress + fetch + per-folder-config tool by leaning on `incur`
|
|
92
|
+
> (CLI/MCP/skills) and `distilly` (extraction).
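
The arithmetic in the tables above can be re-checked mechanically. This is a small sketch that copies the `packages/webveil` rows as data (the tuple layout and variable names are illustrative, not part of the package) and recomputes the subtotal and the list of modules above their ceiling.

```typescript
// [module, LOC, target ceiling]; null means no ceiling is listed in the table.
const modules: Array<[string, number, number | null]> = [
  ['src/index.ts', 82, null],
  ['src/cli.ts', 106, 80],
  ['src/core/search.ts', 104, 90],
  ['src/core/fetch.ts', 132, 90],
  ['src/core/config.ts', 106, 80],
  ['src/core/egress.ts', 106, 70],
  ['src/core/http.ts', 62, 60],
  ['src/core/extract.ts', 82, 60],
  ['src/core/security.ts', 141, null],
  ['src/core/backends/types.ts', 61, 40],
  ['src/core/backends/registry.ts', 41, 60],
  ['src/core/backends/searxng.ts', 70, 90],
  ['src/core/backends/tavily-compat.ts', 156, 90],
  ['src/core/backends/custom.ts', 159, 70],
];

// LOC of packages/pi-webveil/src/index.ts, taken from the second table.
const piWebveilLoc = 168;

const subtotal = modules.reduce((sum, [, loc]) => sum + loc, 0);
const total = subtotal + piWebveilLoc;

// Modules whose LOC exceeds a listed ceiling.
const overCeiling = modules
  .filter(([, loc, target]) => target !== null && loc > target)
  .map(([name]) => name);

console.log({ subtotal, total, overCeiling });
```

With the figures as listed, the subtotal and total agree with the README (1408 and 1576), and ten of the twelve `packages/webveil` modules that have a ceiling are above it; only `registry.ts` and `searxng.ts` are under.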
|
|
93
|
+
|
|
94
|
+
## Develop
|
|
95
|
+
|
|
96
|
+
```sh
|
|
97
|
+
pnpm install
|
|
98
|
+
pnpm build
|
|
99
|
+
pnpm test
|
|
100
|
+
pnpm format:check
|
|
101
|
+
```
|
package/dist/cli.d.ts
ADDED
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
#!/usr/bin/env node
|
|
2
|
+
import { Cli } from 'incur';
|
|
3
|
+
import { search as coreSearch } from './core/search.js';
|
|
4
|
+
import { fetch as coreFetch } from './core/fetch.js';
|
|
5
|
+
/**
|
|
6
|
+
* The two core functions the frontend wraps, seamed so tests can inject fakes.
|
|
7
|
+
* Defaults are the real core; a test passes spies to assert the wiring.
|
|
8
|
+
*/
|
|
9
|
+
export interface CliDeps {
|
|
10
|
+
search?: typeof coreSearch;
|
|
11
|
+
fetch?: typeof coreFetch;
|
|
12
|
+
}
|
|
13
|
+
/**
|
|
14
|
+
* Build the webveil CLI. Returns the incur `Cli` so a caller (the bin below, or
|
|
15
|
+
* a test) decides how to serve it. The `search`/`fetch` commands forward to the
|
|
16
|
+
* injected core, normalizing nothing themselves — the core already deduped,
|
|
17
|
+
* clamped, and size-bounded.
|
|
18
|
+
*/
|
|
19
|
+
export declare function createCli(deps?: CliDeps): Cli.Cli<{
|
|
20
|
+
search: {
|
|
21
|
+
args: {
|
|
22
|
+
query: string;
|
|
23
|
+
};
|
|
24
|
+
options: {
|
|
25
|
+
maxResults?: number | undefined;
|
|
26
|
+
};
|
|
27
|
+
};
|
|
28
|
+
} & {
|
|
29
|
+
fetch: {
|
|
30
|
+
args: {
|
|
31
|
+
url: string;
|
|
32
|
+
};
|
|
33
|
+
options: {
|
|
34
|
+
size?: "s" | "m" | "l" | "f" | undefined;
|
|
35
|
+
};
|
|
36
|
+
};
|
|
37
|
+
}, undefined, undefined, undefined>;
|
|
38
|
+
declare const cli: Cli.Cli<{
|
|
39
|
+
search: {
|
|
40
|
+
args: {
|
|
41
|
+
query: string;
|
|
42
|
+
};
|
|
43
|
+
options: {
|
|
44
|
+
maxResults?: number | undefined;
|
|
45
|
+
};
|
|
46
|
+
};
|
|
47
|
+
} & {
|
|
48
|
+
fetch: {
|
|
49
|
+
args: {
|
|
50
|
+
url: string;
|
|
51
|
+
};
|
|
52
|
+
options: {
|
|
53
|
+
size?: "s" | "m" | "l" | "f" | undefined;
|
|
54
|
+
};
|
|
55
|
+
};
|
|
56
|
+
}, undefined, undefined, undefined>;
|
|
57
|
+
export default cli;
|
|
58
|
+
//# sourceMappingURL=cli.d.ts.map
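
The `CliDeps` seam declared above follows a "default to the real core, accept a fake" pattern. The sketch below shows that pattern in isolation, without `incur`: `makeCommands`, `SearchFn`, the stand-in `coreSearch`, and the spy are all hypothetical, and the `SearchResult` shape is assumed from the `{title, url, snippet?}` contract used elsewhere in the package.

```typescript
interface SearchResult {
  title: string;
  url: string;
  snippet?: string;
}

type SearchFn = (
  query: string,
  opts?: { maxResults?: number },
) => Promise<SearchResult[]>;

// Stand-in for the real core `search`; here it never touches the network.
const coreSearch: SearchFn = async () => [];

interface Deps {
  search?: SearchFn;
}

// Build the command handlers, falling back to the real core when no fake is given.
function makeCommands(deps: Deps = {}) {
  const search = deps.search ?? coreSearch;
  return {
    async search(query: string, maxResults?: number) {
      const results = await search(query, { maxResults });
      return { results };
    },
  };
}

// A test injects a spy and asserts that the command forwards its inputs.
const calls: Array<{ query: string; maxResults?: number }> = [];
const spy: SearchFn = async (query, opts) => {
  calls.push({ query, maxResults: opts?.maxResults });
  return [{ title: 'Example', url: 'https://example.com' }];
};

const commands = makeCommands({ search: spy });
const pending = commands.search('hello', 3);
pending.then((out) => console.log(out.results));
```

The point of the design is that the frontend holds no network logic of its own, so wiring can be verified with a spy alone.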
|
package/dist/cli.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"cli.d.ts","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AAmBA,OAAO,EAAC,GAAG,EAAI,MAAM,OAAO,CAAC;AAC7B,OAAO,EAAC,MAAM,IAAI,UAAU,EAAC,MAAM,kBAAkB,CAAC;AACtD,OAAO,EAAC,KAAK,IAAI,SAAS,EAAC,MAAM,iBAAiB,CAAC;AAEnD;;;GAGG;AACH,MAAM,WAAW,OAAO;IACvB,MAAM,CAAC,EAAE,OAAO,UAAU,CAAC;IAC3B,KAAK,CAAC,EAAE,OAAO,SAAS,CAAC;CACzB;AAKD;;;;;GAKG;AACH,wBAAgB,SAAS,CAAC,IAAI,GAAE,OAAY;;;;;;;;;;;;;;;;;;oCA4C3C;AAKD,QAAA,MAAM,GAAG;;;;;;;;;;;;;;;;;;mCAAc,CAAC;AAexB,eAAe,GAAG,CAAC"}
|
package/dist/cli.js
ADDED
|
@@ -0,0 +1,91 @@
|
|
|
1
|
+
#!/usr/bin/env node
|
|
2
|
+
// webveil — the incur-based CLI + MCP frontend. ONE `Cli.create()` definition
|
|
3
|
+
// yields the CLI, an MCP server (`--mcp`), skills (`skills add`), a `--llms`
|
|
4
|
+
// manifest, TOON output, and token pagination for free (incur). Pi-agnostic:
|
|
5
|
+
// any agent (pi via pi-mcp-adapter, Claude Code, Cursor, Codex, bash) consumes
|
|
6
|
+
// it the same way. The `webveil` bin points at the built `dist/cli.js`.
|
|
7
|
+
//
|
|
8
|
+
// This is the THIN frontend: each command only parses argv/options and calls
|
|
9
|
+
// the SAME framework-agnostic core (`search()` / `fetch()`) the pi extension
|
|
10
|
+
// calls. The core owns config/egress/backend/extraction; this file owns no
|
|
11
|
+
// network logic of its own.
|
|
12
|
+
//
|
|
13
|
+
// Testability: `createCli(deps)` takes the core functions as injectable deps so
|
|
14
|
+
// a test wires fakes and asserts the commands call the core (via `cli.serve`
|
|
15
|
+
// with custom argv/stdout) WITHOUT touching the network. The bottom of the file
|
|
16
|
+
// builds the real CLI and serves it when run as the bin.
|
|
17
|
+
import { argv } from 'node:process';
|
|
18
|
+
import { fileURLToPath } from 'node:url';
|
|
19
|
+
import { Cli, z } from 'incur';
|
|
20
|
+
import { search as coreSearch } from './core/search.js';
|
|
21
|
+
import { fetch as coreFetch } from './core/fetch.js';
|
|
22
|
+
/** The size presets `fetch` accepts, mirroring the core's `FetchSize`. */
|
|
23
|
+
const SIZES = ['s', 'm', 'l', 'f'];
|
|
24
|
+
/**
|
|
25
|
+
* Build the webveil CLI. Returns the incur `Cli` so a caller (the bin below, or
|
|
26
|
+
* a test) decides how to serve it. The `search`/`fetch` commands forward to the
|
|
27
|
+
* injected core, normalizing nothing themselves — the core already deduped,
|
|
28
|
+
* clamped, and size-bounded.
|
|
29
|
+
*/
|
|
30
|
+
export function createCli(deps = {}) {
|
|
31
|
+
const search = deps.search ?? coreSearch;
|
|
32
|
+
const fetch = deps.fetch ?? coreFetch;
|
|
33
|
+
return Cli.create('webveil', {
|
|
34
|
+
description: 'Anonymous-capable, self-hosted, account-free web search + fetch for agents.',
|
|
35
|
+
})
|
|
36
|
+
.command('search', {
|
|
37
|
+
description: 'Search the web via the configured backend and egress.',
|
|
38
|
+
args: z.object({
|
|
39
|
+
query: z.string().describe('The search query'),
|
|
40
|
+
}),
|
|
41
|
+
options: z.object({
|
|
42
|
+
maxResults: z.coerce
|
|
43
|
+
.number()
|
|
44
|
+
.optional()
|
|
45
|
+
.describe('Maximum number of results to return'),
|
|
46
|
+
}),
|
|
47
|
+
alias: { maxResults: 'n' },
|
|
48
|
+
async run(c) {
|
|
49
|
+
const results = await search(c.args.query, {
|
|
50
|
+
maxResults: c.options.maxResults,
|
|
51
|
+
});
|
|
52
|
+
return { results };
|
|
53
|
+
},
|
|
54
|
+
})
|
|
55
|
+
.command('fetch', {
|
|
56
|
+
description: 'Fetch a URL as clean, size-bounded markdown via the configured egress.',
|
|
57
|
+
args: z.object({
|
|
58
|
+
url: z.string().describe('The URL to fetch'),
|
|
59
|
+
}),
|
|
60
|
+
options: z.object({
|
|
61
|
+
size: z
|
|
62
|
+
.enum(SIZES)
|
|
63
|
+
.optional()
|
|
64
|
+
.describe('Page-size budget preset: s | m | l | f'),
|
|
65
|
+
}),
|
|
66
|
+
alias: { size: 's' },
|
|
67
|
+
async run(c) {
|
|
68
|
+
return fetch(c.args.url, { size: c.options.size });
|
|
69
|
+
},
|
|
70
|
+
});
|
|
71
|
+
}
|
|
72
|
+
// The real CLI (also `export default` so `incur gen` can import it for typed
|
|
73
|
+
// CTAs). Serving is GUARDED to the bin entry below, so importing this module in
|
|
74
|
+
// a test never consumes `process.argv` or exits the process.
|
|
75
|
+
const cli = createCli();
|
|
76
|
+
/** True when this module is the process entry (the `webveil` bin), not imported. */
|
|
77
|
+
function isMain() {
|
|
78
|
+
const entry = argv[1];
|
|
79
|
+
if (!entry)
|
|
80
|
+
return false;
|
|
81
|
+
try {
|
|
82
|
+
return fileURLToPath(import.meta.url) === entry;
|
|
83
|
+
}
|
|
84
|
+
catch {
|
|
85
|
+
return false;
|
|
86
|
+
}
|
|
87
|
+
}
|
|
88
|
+
if (isMain())
|
|
89
|
+
cli.serve();
|
|
90
|
+
export default cli;
|
|
91
|
+
//# sourceMappingURL=cli.js.map
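
One caveat about the `isMain()` check above: it compares `fileURLToPath(import.meta.url)` with `argv[1]` as plain strings. When the bin is launched through a symlink (as package-manager bin shims commonly are on POSIX systems), `argv[1]` may hold the symlink path while the module URL refers to the resolved file, so the comparison may come out false. The following is a hedged sketch of a symlink-tolerant variant; `isEntryModule` is a hypothetical helper, not part of webveil, and it takes the module URL as a parameter so it can be exercised outside an ES module entry.

```typescript
import { realpathSync } from 'node:fs';
import { execPath } from 'node:process';
import { fileURLToPath, pathToFileURL } from 'node:url';

// True when `moduleUrl` and `entry` point at the same file once symlinks are
// resolved. Any resolution failure (missing file, malformed URL) yields false.
function isEntryModule(moduleUrl: string, entry: string | undefined): boolean {
  if (!entry) return false;
  try {
    return realpathSync(fileURLToPath(moduleUrl)) === realpathSync(entry);
  } catch {
    return false;
  }
}

// Demonstration with a file known to exist: the running Node binary itself.
const selfUrl = pathToFileURL(execPath).href;
console.log(isEntryModule(selfUrl, execPath));
console.log(isEntryModule(selfUrl, undefined));
```

In the CLI module the call would presumably be `isEntryModule(import.meta.url, argv[1])`; whether the shipped check needs this depends on how the bin is actually installed and invoked.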
|
package/dist/cli.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"cli.js","sourceRoot":"","sources":["../src/cli.ts"],"names":[],"mappings":";AACA,8EAA8E;AAC9E,6EAA6E;AAC7E,6EAA6E;AAC7E,+EAA+E;AAC/E,wEAAwE;AACxE,EAAE;AACF,6EAA6E;AAC7E,6EAA6E;AAC7E,2EAA2E;AAC3E,4BAA4B;AAC5B,EAAE;AACF,gFAAgF;AAChF,6EAA6E;AAC7E,gFAAgF;AAChF,yDAAyD;AAEzD,OAAO,EAAC,IAAI,EAAC,MAAM,cAAc,CAAC;AAClC,OAAO,EAAC,aAAa,EAAC,MAAM,UAAU,CAAC;AACvC,OAAO,EAAC,GAAG,EAAE,CAAC,EAAC,MAAM,OAAO,CAAC;AAC7B,OAAO,EAAC,MAAM,IAAI,UAAU,EAAC,MAAM,kBAAkB,CAAC;AACtD,OAAO,EAAC,KAAK,IAAI,SAAS,EAAC,MAAM,iBAAiB,CAAC;AAWnD,0EAA0E;AAC1E,MAAM,KAAK,GAAG,CAAC,GAAG,EAAE,GAAG,EAAE,GAAG,EAAE,GAAG,CAAU,CAAC;AAE5C;;;;;GAKG;AACH,MAAM,UAAU,SAAS,CAAC,OAAgB,EAAE;IAC3C,MAAM,MAAM,GAAG,IAAI,CAAC,MAAM,IAAI,UAAU,CAAC;IACzC,MAAM,KAAK,GAAG,IAAI,CAAC,KAAK,IAAI,SAAS,CAAC;IAEtC,OAAO,GAAG,CAAC,MAAM,CAAC,SAAS,EAAE;QAC5B,WAAW,EACV,6EAA6E;KAC9E,CAAC;SACA,OAAO,CAAC,QAAQ,EAAE;QAClB,WAAW,EAAE,uDAAuD;QACpE,IAAI,EAAE,CAAC,CAAC,MAAM,CAAC;YACd,KAAK,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,CAAC,kBAAkB,CAAC;SAC9C,CAAC;QACF,OAAO,EAAE,CAAC,CAAC,MAAM,CAAC;YACjB,UAAU,EAAE,CAAC,CAAC,MAAM;iBAClB,MAAM,EAAE;iBACR,QAAQ,EAAE;iBACV,QAAQ,CAAC,qCAAqC,CAAC;SACjD,CAAC;QACF,KAAK,EAAE,EAAC,UAAU,EAAE,GAAG,EAAC;QACxB,KAAK,CAAC,GAAG,CAAC,CAAC;YACV,MAAM,OAAO,GAAG,MAAM,MAAM,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,EAAE;gBAC1C,UAAU,EAAE,CAAC,CAAC,OAAO,CAAC,UAAU;aAChC,CAAC,CAAC;YACH,OAAO,EAAC,OAAO,EAAC,CAAC;QAClB,CAAC;KACD,CAAC;SACD,OAAO,CAAC,OAAO,EAAE;QACjB,WAAW,EACV,wEAAwE;QACzE,IAAI,EAAE,CAAC,CAAC,MAAM,CAAC;YACd,GAAG,EAAE,CAAC,CAAC,MAAM,EAAE,CAAC,QAAQ,CAAC,kBAAkB,CAAC;SAC5C,CAAC;QACF,OAAO,EAAE,CAAC,CAAC,MAAM,CAAC;YACjB,IAAI,EAAE,CAAC;iBACL,IAAI,CAAC,KAAK,CAAC;iBACX,QAAQ,EAAE;iBACV,QAAQ,CAAC,wCAAwC,CAAC;SACpD,CAAC;QACF,KAAK,EAAE,EAAC,IAAI,EAAE,GAAG,EAAC;QAClB,KAAK,CAAC,GAAG,CAAC,CAAC;YACV,OAAO,KAAK,CAAC,CAAC,CAAC,IAAI,CAAC,GAAG,EAAE,EAAC,IAAI,EAAE,CAAC,CAAC,OAAO,CAAC,IAAI,EAAC,CAAC,CAAC;QAClD,CAAC;KACD,CAAC,CAAC;AACL,CAAC;AAED,6EAA6E;AAC7E,gFAAgF;AAChF,6DAA6D;AAC7D,MAAM,GAAG,GAAG,SAAS,EAAE,CAAC;AAExB,oFAAoF;AACpF,SAAS,MAAM;IACd,MAAM,KAAK,
GAAG,IAAI,CAAC,CAAC,CAAC,CAAC;IACtB,IAAI,CAAC,KAAK;QAAE,OAAO,KAAK,CAAC;IACzB,IAAI,CAAC;QACJ,OAAO,aAAa,CAAC,MAAM,CAAC,IAAI,CAAC,GAAG,CAAC,KAAK,KAAK,CAAC;IACjD,CAAC;IAAC,MAAM,CAAC;QACR,OAAO,KAAK,CAAC;IACd,CAAC;AACF,CAAC;AAED,IAAI,MAAM,EAAE;IAAE,GAAG,CAAC,KAAK,EAAE,CAAC;AAE1B,eAAe,GAAG,CAAC"}
|
package/dist/core/backends/custom.d.ts
ADDED
|
@@ -0,0 +1,15 @@
|
|
|
1
|
+
import { spawn as defaultSpawn } from 'node:child_process';
|
|
2
|
+
import type { Config } from '../config.js';
|
|
3
|
+
import type { Backend } from './types.js';
|
|
4
|
+
/**
|
|
5
|
+
* Minimal `spawn` shape this backend needs, seamed so a test can inject a fake
|
|
6
|
+
* without a real subprocess. Defaults to `node:child_process` `spawn`.
|
|
7
|
+
*/
|
|
8
|
+
export type SpawnFn = typeof defaultSpawn;
|
|
9
|
+
/**
|
|
10
|
+
* Build a custom backend bound to the configured command. The command owns its
|
|
11
|
+
* own I/O; webveil hands it the request as JSON on stdin and parses
|
|
12
|
+
* SearchResult[] from stdout, failing clearly on malformed output.
|
|
13
|
+
*/
|
|
14
|
+
export declare function createCustomBackend(config: Config, spawn?: SpawnFn): Backend;
|
|
15
|
+
//# sourceMappingURL=custom.d.ts.map
|
package/dist/core/backends/custom.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"custom.d.ts","sourceRoot":"","sources":["../../../src/core/backends/custom.ts"],"names":[],"mappings":"AAoBA,OAAO,EAAC,KAAK,IAAI,YAAY,EAAC,MAAM,oBAAoB,CAAC;AACzD,OAAO,KAAK,EAAC,MAAM,EAAC,MAAM,cAAc,CAAC;AACzC,OAAO,KAAK,EAAC,OAAO,EAAoC,MAAM,YAAY,CAAC;AAe3E;;;GAGG;AACH,MAAM,MAAM,OAAO,GAAG,OAAO,YAAY,CAAC;AAsF1C;;;;GAIG;AACH,wBAAgB,mBAAmB,CAClC,MAAM,EAAE,MAAM,EACd,KAAK,GAAE,OAAsB,GAC3B,OAAO,CAuBT"}
|
package/dist/core/backends/custom.js
ADDED
|
@@ -0,0 +1,106 @@
|
|
|
1
|
+
// custom backend — the local-command escape hatch (contract lifted from
|
|
2
|
+
// pi-web-providers' custom-wrapper). Instead of an HTTP source, it spawns a
|
|
3
|
+
// configured local command, writes the request as JSON to its stdin, and parses
|
|
4
|
+
// `SearchResult[]` from its stdout. This lets any local script be a backend.
|
|
5
|
+
//
|
|
6
|
+
// Egress note: this backend owns its own I/O (the spawned command does whatever
|
|
7
|
+
// it wants), so the handed `http` helper is unused here — there is no outbound
|
|
8
|
+
// HTTP for webveil to proxy. It still returns the normalized SearchResult shape.
|
|
9
|
+
//
|
|
10
|
+
// Command source: the configured `baseUrl` carries the command line, parsed as a
|
|
11
|
+
// whitespace-separated argv (first token = executable, rest = args), matching how
|
|
12
|
+
// the other backends read `baseUrl` as "where results come from". (Recorded
|
|
13
|
+
// decision; see the task's Decisions block.)
|
|
14
|
+
//
|
|
15
|
+
// Contract:
|
|
16
|
+
// stdin <- JSON: {"query": string, "maxResults"?: number}
|
|
17
|
+
// stdout -> JSON: SearchResult[] (each {title, url, snippet?})
|
|
18
|
+
// Malformed stdout (non-JSON, not an array, or entries missing url/title) FAILS
|
|
19
|
+
// CLEARLY — it never silently returns an empty list.
|
|
20
|
+
import { spawn as defaultSpawn } from 'node:child_process';
|
|
21
|
+
function str(value) {
|
|
22
|
+
return typeof value === 'string' && value.length > 0 ? value : undefined;
|
|
23
|
+
}
|
|
24
|
+
/** Parse the configured command line into [executable, ...args]. */
|
|
25
|
+
function parseCommand(baseUrl) {
|
|
26
|
+
const parts = baseUrl.trim().split(/\s+/).filter(Boolean);
|
|
27
|
+
if (parts.length === 0)
|
|
28
|
+
throw new Error('custom: no command configured (set baseUrl to the command to run)');
|
|
29
|
+
return [parts[0], parts.slice(1)];
|
|
30
|
+
}
|
|
31
|
+
/**
|
|
32
|
+
* Normalize one stdout entry into a SearchResult, FAILING CLEARLY on a malformed
|
|
33
|
+
* entry rather than dropping it — the custom contract is explicit, so a missing
|
|
34
|
+
* url/title is a contract violation the user should see, not a silent skip.
|
|
35
|
+
*/
|
|
36
|
+
function toResult(entry, index) {
|
|
37
|
+
if (typeof entry !== 'object' || entry === null)
|
|
38
|
+
throw new Error(`custom: malformed output — result[${index}] is not an object`);
|
|
39
|
+
const hit = entry;
|
|
40
|
+
const url = str(hit.url);
|
|
41
|
+
const title = str(hit.title);
|
|
42
|
+
if (!url || !title)
|
|
43
|
+
throw new Error(`custom: malformed output — result[${index}] is missing a url or title`);
|
|
44
|
+
const snippet = str(hit.snippet);
|
|
45
|
+
return snippet ? { title, url, snippet } : { title, url };
|
|
46
|
+
}
|
|
47
|
+
/** Parse the command's stdout into SearchResult[], failing clearly on garbage. */
|
|
48
|
+
function parseOutput(stdout) {
|
|
49
|
+
const trimmed = stdout.trim();
|
|
50
|
+
if (trimmed.length === 0)
|
|
51
|
+
throw new Error('custom: command produced no output');
|
|
52
|
+
let parsed;
|
|
53
|
+
try {
|
|
54
|
+
parsed = JSON.parse(trimmed);
|
|
55
|
+
}
|
|
56
|
+
catch (cause) {
|
|
57
|
+
throw new Error(`custom: malformed output — stdout is not valid JSON: ${cause.message}`);
|
|
58
|
+
}
|
|
59
|
+
if (!Array.isArray(parsed))
|
|
60
|
+
throw new Error('custom: malformed output — expected a JSON array of results');
|
|
61
|
+
return parsed.map(toResult);
|
|
62
|
+
}
|
|
63
|
+
/** Spawn the command, write the request to stdin, and collect stdout/stderr. */
|
|
64
|
+
function runCommand(spawn, exe, args, request, signal) {
|
|
65
|
+
return new Promise((resolve, reject) => {
|
|
66
|
+
const child = spawn(exe, args, {
|
|
67
|
+
stdio: ['pipe', 'pipe', 'pipe'],
|
|
68
|
+
signal,
|
|
69
|
+
});
|
|
70
|
+
let stdout = '';
|
|
71
|
+
let stderr = '';
|
|
72
|
+
child.stdout?.on('data', (chunk) => (stdout += String(chunk)));
|
|
73
|
+
child.stderr?.on('data', (chunk) => (stderr += String(chunk)));
|
|
74
|
+
child.on('error', (err) => reject(new Error(`custom: failed to spawn '${exe}': ${err.message}`)));
|
|
75
|
+
child.on('close', (code) => resolve({ stdout, stderr, code }));
|
|
76
|
+
child.stdin?.on('error', () => {
|
|
77
|
+
// A command that exits before reading stdin closes the pipe; ignore the
|
|
78
|
+
// EPIPE here and let the close handler report via exit code/stderr.
|
|
79
|
+
});
|
|
80
|
+
child.stdin?.end(JSON.stringify(request));
|
|
81
|
+
});
|
|
82
|
+
}
|
|
83
|
+
/**
|
|
84
|
+
* Build a custom backend bound to the configured command. The command owns its
|
|
85
|
+
* own I/O; webveil hands it the request as JSON on stdin and parses
|
|
86
|
+
* SearchResult[] from stdout, failing clearly on malformed output.
|
|
87
|
+
*/
|
|
88
|
+
export function createCustomBackend(config, spawn = defaultSpawn) {
|
|
89
|
+
const [exe, args] = parseCommand(config.baseUrl);
|
|
90
|
+
return {
|
|
91
|
+
async search(query, _http, options = {}) {
|
|
92
|
+
const request = { query };
|
|
93
|
+
if (options.maxResults !== undefined)
|
|
94
|
+
request.maxResults = options.maxResults;
|
|
95
|
+
const run = await runCommand(spawn, exe, args, request, options.signal);
|
|
96
|
+
if (run.code !== 0)
|
|
97
|
+
throw new Error(`custom: command '${exe}' exited with code ${run.code}` +
|
|
98
|
+
(run.stderr.trim() ? `: ${run.stderr.trim()}` : ''));
|
|
99
|
+
const results = parseOutput(run.stdout);
|
|
100
|
+
return options.maxResults !== undefined
|
|
101
|
+
? results.slice(0, options.maxResults)
|
|
102
|
+
: results;
|
|
103
|
+
},
|
|
104
|
+
};
|
|
105
|
+
}
|
|
106
|
+
//# sourceMappingURL=custom.js.map
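
The stdin/stdout contract described at the top of `custom.js` can be illustrated from the command's side. This is a minimal sketch of the logic such a command might contain; `handleRequest` and the in-memory `PAGES` list are hypothetical stand-ins, and only the request shape (`{query, maxResults?}`) and result shape (`{title, url, snippet?}`) come from the contract itself.

```typescript
interface SearchRequest {
  query: string;
  maxResults?: number;
}

interface SearchResult {
  title: string;
  url: string;
  snippet?: string;
}

// Hypothetical in-memory "index" standing in for whatever the script searches.
const PAGES: SearchResult[] = [
  { title: 'Tor Project', url: 'https://www.torproject.org/', snippet: 'Anonymity network' },
  { title: 'SearXNG', url: 'https://docs.searxng.org/' },
];

// Turn the JSON text received on stdin into the JSON text to write to stdout.
function handleRequest(stdinText: string): string {
  const request = JSON.parse(stdinText) as SearchRequest;
  const needle = request.query.toLowerCase();
  const hits = PAGES.filter((page) => page.title.toLowerCase().includes(needle));
  const limited =
    request.maxResults !== undefined ? hits.slice(0, request.maxResults) : hits;
  return JSON.stringify(limited);
}

// What webveil would send, and what the command would answer.
const out = handleRequest(JSON.stringify({ query: 'tor', maxResults: 5 }));
console.log(out);
```

A real command would read all of stdin, write this string to stdout, and exit with code 0; per the code above, a non-zero exit is surfaced as an error together with any stderr text. Note also that `parseCommand` splits the configured command line on whitespace, so an executable path or argument containing spaces cannot be expressed in `baseUrl`.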
|
package/dist/core/backends/custom.js.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"custom.js","sourceRoot":"","sources":["../../../src/core/backends/custom.ts"],"names":[],"mappings":"AAAA,wEAAwE;AACxE,4EAA4E;AAC5E,gFAAgF;AAChF,6EAA6E;AAC7E,EAAE;AACF,gFAAgF;AAChF,+EAA+E;AAC/E,iFAAiF;AACjF,EAAE;AACF,iFAAiF;AACjF,kFAAkF;AAClF,4EAA4E;AAC5E,6CAA6C;AAC7C,EAAE;AACF,YAAY;AACZ,6DAA6D;AAC7D,kEAAkE;AAClE,gFAAgF;AAChF,qDAAqD;AAErD,OAAO,EAAC,KAAK,IAAI,YAAY,EAAC,MAAM,oBAAoB,CAAC;AAuBzD,SAAS,GAAG,CAAC,KAAc;IAC1B,OAAO,OAAO,KAAK,KAAK,QAAQ,IAAI,KAAK,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,SAAS,CAAC;AAC1E,CAAC;AAED,oEAAoE;AACpE,SAAS,YAAY,CAAC,OAAe;IACpC,MAAM,KAAK,GAAG,OAAO,CAAC,IAAI,EAAE,CAAC,KAAK,CAAC,KAAK,CAAC,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;IAC1D,IAAI,KAAK,CAAC,MAAM,KAAK,CAAC;QACrB,MAAM,IAAI,KAAK,CACd,mEAAmE,CACnE,CAAC;IACH,OAAO,CAAC,KAAK,CAAC,CAAC,CAAE,EAAE,KAAK,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC;AACpC,CAAC;AAED;;;;GAIG;AACH,SAAS,QAAQ,CAAC,KAAc,EAAE,KAAa;IAC9C,IAAI,OAAO,KAAK,KAAK,QAAQ,IAAI,KAAK,KAAK,IAAI;QAC9C,MAAM,IAAI,KAAK,CACd,qCAAqC,KAAK,oBAAoB,CAC9D,CAAC;IACH,MAAM,GAAG,GAAG,KAAgC,CAAC;IAC7C,MAAM,GAAG,GAAG,GAAG,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC;IACzB,MAAM,KAAK,GAAG,GAAG,CAAC,GAAG,CAAC,KAAK,CAAC,CAAC;IAC7B,IAAI,CAAC,GAAG,IAAI,CAAC,KAAK;QACjB,MAAM,IAAI,KAAK,CACd,qCAAqC,KAAK,6BAA6B,CACvE,CAAC;IACH,MAAM,OAAO,GAAG,GAAG,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;IACjC,OAAO,OAAO,CAAC,CAAC,CAAC,EAAC,KAAK,EAAE,GAAG,EAAE,OAAO,EAAC,CAAC,CAAC,CAAC,EAAC,KAAK,EAAE,GAAG,EAAC,CAAC;AACvD,CAAC;AAED,kFAAkF;AAClF,SAAS,WAAW,CAAC,MAAc;IAClC,MAAM,OAAO,GAAG,MAAM,CAAC,IAAI,EAAE,CAAC;IAC9B,IAAI,OAAO,CAAC,MAAM,KAAK,CAAC;QACvB,MAAM,IAAI,KAAK,CAAC,oCAAoC,CAAC,CAAC;IACvD,IAAI,MAAe,CAAC;IACpB,IAAI,CAAC;QACJ,MAAM,GAAG,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,CAAC;IAC9B,CAAC;IAAC,OAAO,KAAK,EAAE,CAAC;QAChB,MAAM,IAAI,KAAK,CACd,wDAAyD,KAAe,CAAC,OAAO,EAAE,CAClF,CAAC;IACH,CAAC;IACD,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,MAAM,CAAC;QACzB,MAAM,IAAI,KAAK,CACd,6DAA6D,CAC7D,CAAC;IACH,OAAO,MAAM,CAAC,GAAG,CAAC,QAAQ,CAAC,CAAC;AAC7B,CAAC;AAED,gFAAgF;AAChF,SAAS,UAAU,CAClB,KAAc,EACd,GAAW,EACX,IAAc,EACd,OAAsB,
EACtB,MAAoB;IAEpB,OAAO,IAAI,OAAO,CAAa,CAAC,OAAO,EAAE,MAAM,EAAE,EAAE;QAClD,MAAM,KAAK,GAAG,KAAK,CAAC,GAAG,EAAE,IAAI,EAAE;YAC9B,KAAK,EAAE,CAAC,MAAM,EAAE,MAAM,EAAE,MAAM,CAAC;YAC/B,MAAM;SACN,CAAC,CAAC;QACH,IAAI,MAAM,GAAG,EAAE,CAAC;QAChB,IAAI,MAAM,GAAG,EAAE,CAAC;QAChB,KAAK,CAAC,MAAM,EAAE,EAAE,CAAC,MAAM,EAAE,CAAC,KAAK,EAAE,EAAE,CAAC,CAAC,MAAM,IAAI,MAAM,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;QAC/D,KAAK,CAAC,MAAM,EAAE,EAAE,CAAC,MAAM,EAAE,CAAC,KAAK,EAAE,EAAE,CAAC,CAAC,MAAM,IAAI,MAAM,CAAC,KAAK,CAAC,CAAC,CAAC,CAAC;QAC/D,KAAK,CAAC,EAAE,CAAC,OAAO,EAAE,CAAC,GAAG,EAAE,EAAE,CACzB,MAAM,CAAC,IAAI,KAAK,CAAC,4BAA4B,GAAG,MAAM,GAAG,CAAC,OAAO,EAAE,CAAC,CAAC,CACrE,CAAC;QACF,KAAK,CAAC,EAAE,CAAC,OAAO,EAAE,CAAC,IAAI,EAAE,EAAE,CAAC,OAAO,CAAC,EAAC,MAAM,EAAE,MAAM,EAAE,IAAI,EAAC,CAAC,CAAC,CAAC;QAC7D,KAAK,CAAC,KAAK,EAAE,EAAE,CAAC,OAAO,EAAE,GAAG,EAAE;YAC7B,wEAAwE;YACxE,oEAAoE;QACrE,CAAC,CAAC,CAAC;QACH,KAAK,CAAC,KAAK,EAAE,GAAG,CAAC,IAAI,CAAC,SAAS,CAAC,OAAO,CAAC,CAAC,CAAC;IAC3C,CAAC,CAAC,CAAC;AACJ,CAAC;AAED;;;;GAIG;AACH,MAAM,UAAU,mBAAmB,CAClC,MAAc,EACd,QAAiB,YAAY;IAE7B,MAAM,CAAC,GAAG,EAAE,IAAI,CAAC,GAAG,YAAY,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;IACjD,OAAO;QACN,KAAK,CAAC,MAAM,CACX,KAAa,EACb,KAAW,EACX,UAAyB,EAAE;YAE3B,MAAM,OAAO,GAAkB,EAAC,KAAK,EAAC,CAAC;YACvC,IAAI,OAAO,CAAC,UAAU,KAAK,SAAS;gBACnC,OAAO,CAAC,UAAU,GAAG,OAAO,CAAC,UAAU,CAAC;YACzC,MAAM,GAAG,GAAG,MAAM,UAAU,CAAC,KAAK,EAAE,GAAG,EAAE,IAAI,EAAE,OAAO,EAAE,OAAO,CAAC,MAAM,CAAC,CAAC;YACxE,IAAI,GAAG,CAAC,IAAI,KAAK,CAAC;gBACjB,MAAM,IAAI,KAAK,CACd,oBAAoB,GAAG,sBAAsB,GAAG,CAAC,IAAI,EAAE;oBACtD,CAAC,GAAG,CAAC,MAAM,CAAC,IAAI,EAAE,CAAC,CAAC,CAAC,KAAK,GAAG,CAAC,MAAM,CAAC,IAAI,EAAE,EAAE,CAAC,CAAC,CAAC,EAAE,CAAC,CACpD,CAAC;YACH,MAAM,OAAO,GAAG,WAAW,CAAC,GAAG,CAAC,MAAM,CAAC,CAAC;YACxC,OAAO,OAAO,CAAC,UAAU,KAAK,SAAS;gBACtC,CAAC,CAAC,OAAO,CAAC,KAAK,CAAC,CAAC,EAAE,OAAO,CAAC,UAAU,CAAC;gBACtC,CAAC,CAAC,OAAO,CAAC;QACZ,CAAC;KACD,CAAC;AACH,CAAC"}
|
package/dist/core/backends/registry.d.ts
ADDED
|
@@ -0,0 +1,13 @@
|
|
|
1
|
+
import type { Config } from '../config.js';
|
|
2
|
+
import type { Backend } from './types.js';
|
|
3
|
+
/** Builds a Backend from the resolved config (knows its baseUrl / apiKey). */
|
|
4
|
+
export type BackendFactory = (config: Config) => Backend;
|
|
5
|
+
/** The backend names the registry can resolve. */
|
|
6
|
+
export declare function backendNames(): string[];
|
|
7
|
+
/**
|
|
8
|
+
* Resolve a backend name to a constructed Backend. Throws clearly on an unknown
|
|
9
|
+
* name (listing the known ones) so a misconfigured `backend` fails loud, never
|
|
10
|
+
* silently no-ops.
|
|
11
|
+
*/
|
|
12
|
+
export declare function getBackend(name: string, config: Config): Backend;
|
|
13
|
+
//# sourceMappingURL=registry.d.ts.map
|
package/dist/core/backends/registry.d.ts.map
ADDED
|
@@ -0,0 +1 @@
|
|
|
1
|
+
{"version":3,"file":"registry.d.ts","sourceRoot":"","sources":["../../../src/core/backends/registry.ts"],"names":[],"mappings":"AAOA,OAAO,KAAK,EAAC,MAAM,EAAC,MAAM,cAAc,CAAC;AACzC,OAAO,KAAK,EAAC,OAAO,EAAC,MAAM,YAAY,CAAC;AAKxC,8EAA8E;AAC9E,MAAM,MAAM,cAAc,GAAG,CAAC,MAAM,EAAE,MAAM,KAAK,OAAO,CAAC;AASzD,kDAAkD;AAClD,wBAAgB,YAAY,IAAI,MAAM,EAAE,CAEvC;AAED;;;;GAIG;AACH,wBAAgB,UAAU,CAAC,IAAI,EAAE,MAAM,EAAE,MAAM,EAAE,MAAM,GAAG,OAAO,CAOhE"}
|
package/dist/core/backends/registry.js
ADDED
|
@@ -0,0 +1,31 @@
|
|
|
1
|
+
// backend registry — a tiny `name -> Backend` dispatcher (concept trimmed from
|
|
2
|
+
// pi-search-hub's registry). Each backend registers a factory keyed by its config
|
|
3
|
+
// `backend` name; `getBackend` resolves the name to a constructed Backend (handed
|
|
4
|
+
// the resolved config so it knows its instance baseUrl / apiKey) and fails clearly
|
|
5
|
+
// on an unknown name. Later backend tasks (tavily-compat, custom) append their own
|
|
6
|
+
// registrations to FACTORIES below.
|
|
7
|
+
import { createSearxngBackend } from './searxng.js';
|
|
8
|
+
import { createTavilyCompatBackend } from './tavily-compat.js';
|
|
9
|
+
import { createCustomBackend } from './custom.js';
|
|
10
|
+
/** name -> factory. New backends add an entry here. */
|
|
11
|
+
const FACTORIES = {
|
|
12
|
+
searxng: createSearxngBackend,
|
|
13
|
+
'tavily-compat': createTavilyCompatBackend,
|
|
14
|
+
custom: createCustomBackend,
|
|
15
|
+
};
|
|
16
|
+
/** The backend names the registry can resolve. */
|
|
17
|
+
export function backendNames() {
|
|
18
|
+
return Object.keys(FACTORIES);
|
|
19
|
+
}
|
|
20
|
+
/**
|
|
21
|
+
* Resolve a backend name to a constructed Backend. Throws clearly on an unknown
|
|
22
|
+
* name (listing the known ones) so a misconfigured `backend` fails loud, never
|
|
23
|
+
* silently no-ops.
|
|
24
|
+
*/
|
|
25
|
+
export function getBackend(name, config) {
|
|
26
|
+
const factory = FACTORIES[name];
|
|
27
|
+
if (!factory)
|
|
28
|
+
throw new Error(`webveil: unknown backend '${name}' (known: ${backendNames().join(', ')})`);
|
|
29
|
+
return factory(config);
|
|
30
|
+
}
|
|
31
|
+
//# sourceMappingURL=registry.js.map
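Reviewer note: the dispatcher in registry.js can be exercised in isolation with stub factories. The sketch below is not the package's code. `Config`, `Backend`, and the stub factories are hypothetical stand-ins, and the own-property check is an added assumption: because `FACTORIES` is a plain object literal, a name such as `constructor` would find an inherited property on a bare `FACTORIES[name]` lookup instead of reaching the unknown-name branch.

```typescript
// Hypothetical stand-ins: the real Config and Backend types live in
// ../config.js and ./types.js and are not part of this hunk.
type Config = { baseUrl: string; apiKey?: string };
type Backend = { kind: string; baseUrl: string };
type BackendFactory = (config: Config) => Backend;

// Stub factories standing in for createSearxngBackend and friends.
const FACTORIES: Record<string, BackendFactory> = {
    searxng: (config) => ({ kind: 'searxng', baseUrl: config.baseUrl }),
    'tavily-compat': (config) => ({ kind: 'tavily-compat', baseUrl: config.baseUrl }),
};

function backendNames(): string[] {
    return Object.keys(FACTORIES);
}

function getBackend(name: string, config: Config): Backend {
    // Assumption, not in the package: only own keys count, so inherited
    // names like 'constructor' are rejected like any other unknown name.
    const factory = Object.prototype.hasOwnProperty.call(FACTORIES, name)
        ? FACTORIES[name]
        : undefined;
    if (!factory)
        throw new Error(`unknown backend '${name}' (known: ${backendNames().join(', ')})`);
    return factory(config);
}

const backend = getBackend('searxng', { baseUrl: 'https://searx.example/' });
console.log(backend.kind);
```

Whether the shipped registry needs this guard depends on where `name` comes from; if it is only ever read from a validated config, the plain lookup may be sufficient.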
@@ -0,0 +1 @@
{"version":3,"file":"registry.js","sourceRoot":"","sources":["../../../src/core/backends/registry.ts"],"names":[],"mappings":"AAAA,+EAA+E;AAC/E,kFAAkF;AAClF,kFAAkF;AAClF,mFAAmF;AACnF,mFAAmF;AACnF,oCAAoC;AAIpC,OAAO,EAAC,oBAAoB,EAAC,MAAM,cAAc,CAAC;AAClD,OAAO,EAAC,yBAAyB,EAAC,MAAM,oBAAoB,CAAC;AAC7D,OAAO,EAAC,mBAAmB,EAAC,MAAM,aAAa,CAAC;AAKhD,uDAAuD;AACvD,MAAM,SAAS,GAAmC;IACjD,OAAO,EAAE,oBAAoB;IAC7B,eAAe,EAAE,yBAAyB;IAC1C,MAAM,EAAE,mBAAmB;CAC3B,CAAC;AAEF,kDAAkD;AAClD,MAAM,UAAU,YAAY;IAC3B,OAAO,MAAM,CAAC,IAAI,CAAC,SAAS,CAAC,CAAC;AAC/B,CAAC;AAED;;;;GAIG;AACH,MAAM,UAAU,UAAU,CAAC,IAAY,EAAE,MAAc;IACtD,MAAM,OAAO,GAAG,SAAS,CAAC,IAAI,CAAC,CAAC;IAChC,IAAI,CAAC,OAAO;QACX,MAAM,IAAI,KAAK,CACd,6BAA6B,IAAI,aAAa,YAAY,EAAE,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,CAC1E,CAAC;IACH,OAAO,OAAO,CAAC,MAAM,CAAC,CAAC;AACxB,CAAC"}
@@ -0,0 +1,8 @@
import type { Config } from '../config.js';
import type { Backend } from './types.js';
/**
 * Build a SearXNG backend bound to the configured instance. The returned backend
 * only ever touches the network via the injected `http` helper.
 */
export declare function createSearxngBackend(config: Config): Backend;
//# sourceMappingURL=searxng.d.ts.map
@@ -0,0 +1 @@
{"version":3,"file":"searxng.d.ts","sourceRoot":"","sources":["../../../src/core/backends/searxng.ts"],"names":[],"mappings":"AAKA,OAAO,KAAK,EAAC,MAAM,EAAC,MAAM,cAAc,CAAC;AACzC,OAAO,KAAK,EAAC,OAAO,EAAoC,MAAM,YAAY,CAAC;AAsC3E;;;GAGG;AACH,wBAAgB,oBAAoB,CAAC,MAAM,EAAE,MAAM,GAAG,OAAO,CAqB5D"}
@@ -0,0 +1,43 @@
// searxng backend — the keyless, self-hosted metasearch default. Queries a
// SearXNG instance's JSON API (`/search?format=json`) THROUGH the handed `http`
// helper (never a direct fetch, so egress is not bypassable) and normalizes the
// response into SearchResult[].
function str(value) {
    return typeof value === 'string' && value.length > 0 ? value : undefined;
}
/** Normalize one SearXNG hit; drop entries without a usable url + title. */
function toResult(hit) {
    const url = str(hit.url);
    const title = str(hit.title);
    if (!url || !title)
        return undefined;
    const snippet = str(hit.content);
    return snippet ? { title, url, snippet } : { title, url };
}
/** Build the SearXNG JSON search URL for a query against the instance baseUrl. */
function buildUrl(baseUrl, query) {
    const url = new URL('search', baseUrl.endsWith('/') ? baseUrl : baseUrl + '/');
    url.searchParams.set('q', query);
    url.searchParams.set('format', 'json');
    return url.toString();
}
/**
 * Build a SearXNG backend bound to the configured instance. The returned backend
 * only ever touches the network via the injected `http` helper.
 */
export function createSearxngBackend(config) {
    const baseUrl = config.baseUrl;
    return {
        async search(query, http, options = {}) {
            const body = await http.fetchJson(buildUrl(baseUrl, query), { headers: { accept: 'application/json' }, signal: options.signal });
            const results = Array.isArray(body.results) ? body.results : [];
            const normalized = results
                .map(toResult)
                .filter((r) => r !== undefined);
            return options.maxResults !== undefined
                ? normalized.slice(0, options.maxResults)
                : normalized;
        },
    };
}
//# sourceMappingURL=searxng.js.map
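Reviewer note: two details in searxng.js are easier to see with concrete inputs. The sketch below re-types `buildUrl`, `str`, and `toResult` from the hunk as standalone TypeScript. The `SearchResult` type and the sample hits are assumptions, since types.d.ts is not shown here. It illustrates why the trailing slash is appended (without it, a base URL with a sub-path would lose its last segment during relative resolution) and which hits survive normalization.

```typescript
// Assumed shape; the package's real SearchResult is declared in ./types.js.
type SearchResult = { title: string; url: string; snippet?: string };
type RawHit = { url?: unknown; title?: unknown; content?: unknown };

function str(value: unknown): string | undefined {
    return typeof value === 'string' && value.length > 0 ? value : undefined;
}

function toResult(hit: RawHit): SearchResult | undefined {
    const url = str(hit.url);
    const title = str(hit.title);
    if (!url || !title) return undefined;
    const snippet = str(hit.content);
    return snippet ? { title, url, snippet } : { title, url };
}

function buildUrl(baseUrl: string, query: string): string {
    // The appended '/' makes the base a directory, so 'search' is added
    // below it instead of replacing its last path segment.
    const url = new URL('search', baseUrl.endsWith('/') ? baseUrl : baseUrl + '/');
    url.searchParams.set('q', query);
    url.searchParams.set('format', 'json');
    return url.toString();
}

const atRoot = buildUrl('https://searx.example', 'hello world');
const underPath = buildUrl('https://example.org/searxng', 'hello world');
console.log(atRoot);    // https://searx.example/search?q=hello+world&format=json
console.log(underPath); // https://example.org/searxng/search?q=hello+world&format=json

// Made-up hits: the second has an empty title and is dropped.
const hits: RawHit[] = [
    { title: 'Example', url: 'https://example.org/', content: 'A page.' },
    { title: '', url: 'https://no-title.example/' },
    { title: 'No snippet', url: 'https://example.net/' },
];
const normalized = hits
    .map(toResult)
    .filter((r): r is SearchResult => r !== undefined);
console.log(normalized.map((r) => r.title)); // [ 'Example', 'No snippet' ]
```

Note that `maxResults` is applied after normalization in the hunk, so dropped hits do not count toward the limit.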
@@ -0,0 +1 @@
{"version":3,"file":"searxng.js","sourceRoot":"","sources":["../../../src/core/backends/searxng.ts"],"names":[],"mappings":"AAAA,2EAA2E;AAC3E,gFAAgF;AAChF,gFAAgF;AAChF,gCAAgC;AAiBhC,SAAS,GAAG,CAAC,KAAc;IAC1B,OAAO,OAAO,KAAK,KAAK,QAAQ,IAAI,KAAK,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,SAAS,CAAC;AAC1E,CAAC;AAED,4EAA4E;AAC5E,SAAS,QAAQ,CAAC,GAAkB;IACnC,MAAM,GAAG,GAAG,GAAG,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC;IACzB,MAAM,KAAK,GAAG,GAAG,CAAC,GAAG,CAAC,KAAK,CAAC,CAAC;IAC7B,IAAI,CAAC,GAAG,IAAI,CAAC,KAAK;QAAE,OAAO,SAAS,CAAC;IACrC,MAAM,OAAO,GAAG,GAAG,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;IACjC,OAAO,OAAO,CAAC,CAAC,CAAC,EAAC,KAAK,EAAE,GAAG,EAAE,OAAO,EAAC,CAAC,CAAC,CAAC,EAAC,KAAK,EAAE,GAAG,EAAC,CAAC;AACvD,CAAC;AAED,kFAAkF;AAClF,SAAS,QAAQ,CAAC,OAAe,EAAE,KAAa;IAC/C,MAAM,GAAG,GAAG,IAAI,GAAG,CAClB,QAAQ,EACR,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,OAAO,GAAG,GAAG,CAC/C,CAAC;IACF,GAAG,CAAC,YAAY,CAAC,GAAG,CAAC,GAAG,EAAE,KAAK,CAAC,CAAC;IACjC,GAAG,CAAC,YAAY,CAAC,GAAG,CAAC,QAAQ,EAAE,MAAM,CAAC,CAAC;IACvC,OAAO,GAAG,CAAC,QAAQ,EAAE,CAAC;AACvB,CAAC;AAED;;;GAGG;AACH,MAAM,UAAU,oBAAoB,CAAC,MAAc;IAClD,MAAM,OAAO,GAAG,MAAM,CAAC,OAAO,CAAC;IAC/B,OAAO;QACN,KAAK,CAAC,MAAM,CACX,KAAa,EACb,IAAU,EACV,UAAyB,EAAE;YAE3B,MAAM,IAAI,GAAG,MAAM,IAAI,CAAC,SAAS,CAChC,QAAQ,CAAC,OAAO,EAAE,KAAK,CAAC,EACxB,EAAC,OAAO,EAAE,EAAC,MAAM,EAAE,kBAAkB,EAAC,EAAE,MAAM,EAAE,OAAO,CAAC,MAAM,EAAC,CAC/D,CAAC;YACF,MAAM,OAAO,GAAG,KAAK,CAAC,OAAO,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,CAAC;YAChE,MAAM,UAAU,GAAG,OAAO;iBACxB,GAAG,CAAC,QAAQ,CAAC;iBACb,MAAM,CAAC,CAAC,CAAC,EAAqB,EAAE,CAAC,CAAC,KAAK,SAAS,CAAC,CAAC;YACpD,OAAO,OAAO,CAAC,UAAU,KAAK,SAAS;gBACtC,CAAC,CAAC,UAAU,CAAC,KAAK,CAAC,CAAC,EAAE,OAAO,CAAC,UAAU,CAAC;gBACzC,CAAC,CAAC,UAAU,CAAC;QACf,CAAC;KACD,CAAC;AACH,CAAC"}
@@ -0,0 +1,10 @@
import type { Config } from '../config.js';
import type { Backend } from './types.js';
/**
 * Build a Tavily-compat backend bound to the configured instance. The returned
 * backend only ever touches the network via the injected `http` helper. A Bearer
 * header is added only when an apiKey is set (the covered instances are usually
 * keyless).
 */
export declare function createTavilyCompatBackend(config: Config): Backend;
//# sourceMappingURL=tavily-compat.d.ts.map
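Reviewer note: the conditional Bearer header described in the doc comment can be sketched on its own. The function below is a standalone illustration. Taking `apiKey` as a parameter is an assumption made for the example (the package closes over `config.apiKey` instead), and `example-key` is a placeholder value.

```typescript
// Builds request headers; `authorization` is present only when a
// non-empty apiKey is supplied, so keyless instances get no auth header.
function headers(apiKey?: string): Record<string, string> {
    const h: Record<string, string> = {
        'content-type': 'application/json',
        accept: 'application/json',
    };
    if (apiKey) h['authorization'] = `Bearer ${apiKey}`;
    return h;
}

const keyless = headers();
const keyed = headers('example-key');
console.log(Object.keys(keyless)); // [ 'content-type', 'accept' ]
console.log(keyed['authorization']); // Bearer example-key
```

An empty-string key is treated the same as a missing one, because the check is a truthiness test.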
@@ -0,0 +1 @@
{"version":3,"file":"tavily-compat.d.ts","sourceRoot":"","sources":["../../../src/core/backends/tavily-compat.ts"],"names":[],"mappings":"AAYA,OAAO,KAAK,EAAC,MAAM,EAAC,MAAM,cAAc,CAAC;AACzC,OAAO,KAAK,EACX,OAAO,EAOP,MAAM,YAAY,CAAC;AAqDpB;;;;;GAKG;AACH,wBAAgB,yBAAyB,CAAC,MAAM,EAAE,MAAM,GAAG,OAAO,CA2EjE"}
@@ -0,0 +1,85 @@
// tavily-compat backend — a generic Tavily-shaped client (POST `/search` and an
// optional POST `/extract`) selected purely by `baseUrl`, so it covers
// orio-search / searcharvester / agent-search and any other Tavily-API-shaped
// instance. Both endpoints go THROUGH the handed `http` helper (never a direct
// fetch, so egress is not bypassable). `/search` normalizes to SearchResult[];
// `/extract` is exposed as the optional `Backend.fetch` a later task uses to
// override the distilly Extractor.
//
// Auth: a Bearer header is sent only when an apiKey is configured. The covered
// self-hosted instances are typically keyless, so a missing key is normal, not
// an error.
function str(value) {
    return typeof value === 'string' && value.length > 0 ? value : undefined;
}
/** Normalize one Tavily search hit; drop entries without a usable url + title. */
function toResult(hit) {
    const url = str(hit.url);
    const title = str(hit.title);
    if (!url || !title)
        return undefined;
    const snippet = str(hit.content);
    return snippet ? { title, url, snippet } : { title, url };
}
/** Resolve an endpoint path against the instance baseUrl. */
function endpoint(baseUrl, path) {
    return new URL(path, baseUrl.endsWith('/') ? baseUrl : baseUrl + '/').toString();
}
/**
 * Build a Tavily-compat backend bound to the configured instance. The returned
 * backend only ever touches the network via the injected `http` helper. A Bearer
 * header is added only when an apiKey is set (the covered instances are usually
 * keyless).
 */
export function createTavilyCompatBackend(config) {
    const baseUrl = config.baseUrl;
    const apiKey = config.apiKey;
    function headers() {
        const h = {
            'content-type': 'application/json',
            accept: 'application/json',
        };
        if (apiKey)
            h.authorization = `Bearer ${apiKey}`;
        return h;
    }
    function post(path, payload, signal) {
        return {
            method: 'POST',
            headers: headers(),
            body: JSON.stringify(payload),
            signal,
        };
    }
    return {
        async search(query, http, options = {}) {
            const payload = { query };
            if (options.maxResults !== undefined)
                payload.max_results = options.maxResults;
            const body = await http.fetchJson(endpoint(baseUrl, 'search'), post('search', payload, options.signal));
            const results = Array.isArray(body.results) ? body.results : [];
            const normalized = results
                .map(toResult)
                .filter((r) => r !== undefined);
            return options.maxResults !== undefined
                ? normalized.slice(0, options.maxResults)
                : normalized;
        },
        async fetch(url, http, options = {}) {
            // Tavily `/extract` has no `s/m/l/f` size knob (it has `format` /
            // `extract_depth` instead); always request markdown. The default
            // distilly Extractor owns webveil's size presets.
            const body = await http.fetchJson(endpoint(baseUrl, 'extract'), post('extract', { urls: url, format: 'markdown' }, options.signal));
            const failure = (body.failed_results ?? []).find((f) => str(f.url) === url);
            if (failure)
                throw new Error(`tavily-compat: /extract failed for ${url}: ${str(failure.error) ?? 'unknown error'}`);
            const hit = (body.results ?? [])[0];
            const markdown = hit ? str(hit.raw_content) : undefined;
            if (markdown === undefined)
                throw new Error(`tavily-compat: no extract result for ${url}`);
            // Tavily `/extract` returns no `truncated` flag and no page title.
            return { url, markdown, truncated: false };
        },
    };
}
//# sourceMappingURL=tavily-compat.js.map
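Reviewer note: the response handling in `fetch` checks for a reported failure before it looks for content. The sketch below isolates that order as a pure function so it can be tried without a network helper. `ExtractBody` is a hypothetical type inferred only from the fields the hunk reads (`results`, `failed_results`, `raw_content`, `error`), `readExtract` is a name invented for this example, and the sample bodies are made up.

```typescript
// Hypothetical response shape, inferred from the fields read in the hunk.
type ExtractBody = {
    results?: Array<{ url?: unknown; raw_content?: unknown }>;
    failed_results?: Array<{ url?: unknown; error?: unknown }>;
};
type Extracted = { url: string; markdown: string; truncated: boolean };

function str(value: unknown): string | undefined {
    return typeof value === 'string' && value.length > 0 ? value : undefined;
}

// Same order of checks as fetch(): a failure reported for this URL wins,
// then a missing or empty raw_content on the first result is an error.
function readExtract(body: ExtractBody, url: string): Extracted {
    const failure = (body.failed_results ?? []).find((f) => str(f.url) === url);
    if (failure)
        throw new Error(`/extract failed for ${url}: ${str(failure.error) ?? 'unknown error'}`);
    const hit = (body.results ?? [])[0];
    const markdown = hit ? str(hit.raw_content) : undefined;
    if (markdown === undefined) throw new Error(`no extract result for ${url}`);
    return { url, markdown, truncated: false };
}

const page = readExtract(
    { results: [{ url: 'https://example.org/', raw_content: '# Title' }] },
    'https://example.org/',
);
console.log(page.markdown); // # Title
```

As in the hunk, only the first entry of `results` is read and its `url` is not compared with the requested one, and `truncated` is always `false` because the response carries no such flag.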
@@ -0,0 +1 @@
{"version":3,"file":"tavily-compat.js","sourceRoot":"","sources":["../../../src/core/backends/tavily-compat.ts"],"names":[],"mappings":"AAAA,gFAAgF;AAChF,uEAAuE;AACvE,8EAA8E;AAC9E,+EAA+E;AAC/E,+EAA+E;AAC/E,6EAA6E;AAC7E,mCAAmC;AACnC,EAAE;AACF,+EAA+E;AAC/E,+EAA+E;AAC/E,YAAY;AA2CZ,SAAS,GAAG,CAAC,KAAc;IAC1B,OAAO,OAAO,KAAK,KAAK,QAAQ,IAAI,KAAK,CAAC,MAAM,GAAG,CAAC,CAAC,CAAC,CAAC,KAAK,CAAC,CAAC,CAAC,SAAS,CAAC;AAC1E,CAAC;AAED,kFAAkF;AAClF,SAAS,QAAQ,CAAC,GAAoB;IACrC,MAAM,GAAG,GAAG,GAAG,CAAC,GAAG,CAAC,GAAG,CAAC,CAAC;IACzB,MAAM,KAAK,GAAG,GAAG,CAAC,GAAG,CAAC,KAAK,CAAC,CAAC;IAC7B,IAAI,CAAC,GAAG,IAAI,CAAC,KAAK;QAAE,OAAO,SAAS,CAAC;IACrC,MAAM,OAAO,GAAG,GAAG,CAAC,GAAG,CAAC,OAAO,CAAC,CAAC;IACjC,OAAO,OAAO,CAAC,CAAC,CAAC,EAAC,KAAK,EAAE,GAAG,EAAE,OAAO,EAAC,CAAC,CAAC,CAAC,EAAC,KAAK,EAAE,GAAG,EAAC,CAAC;AACvD,CAAC;AAED,6DAA6D;AAC7D,SAAS,QAAQ,CAAC,OAAe,EAAE,IAAY;IAC9C,OAAO,IAAI,GAAG,CACb,IAAI,EACJ,OAAO,CAAC,QAAQ,CAAC,GAAG,CAAC,CAAC,CAAC,CAAC,OAAO,CAAC,CAAC,CAAC,OAAO,GAAG,GAAG,CAC/C,CAAC,QAAQ,EAAE,CAAC;AACd,CAAC;AAED;;;;;GAKG;AACH,MAAM,UAAU,yBAAyB,CAAC,MAAc;IACvD,MAAM,OAAO,GAAG,MAAM,CAAC,OAAO,CAAC;IAC/B,MAAM,MAAM,GAAG,MAAM,CAAC,MAAM,CAAC;IAE7B,SAAS,OAAO;QACf,MAAM,CAAC,GAA2B;YACjC,cAAc,EAAE,kBAAkB;YAClC,MAAM,EAAE,kBAAkB;SAC1B,CAAC;QACF,IAAI,MAAM;YAAE,CAAC,CAAC,aAAa,GAAG,UAAU,MAAM,EAAE,CAAC;QACjD,OAAO,CAAC,CAAC;IACV,CAAC;IAED,SAAS,IAAI,CACZ,IAAY,EACZ,OAAgB,EAChB,MAAoB;QAEpB,OAAO;YACN,MAAM,EAAE,MAAM;YACd,OAAO,EAAE,OAAO,EAAE;YAClB,IAAI,EAAE,IAAI,CAAC,SAAS,CAAC,OAAO,CAAC;YAC7B,MAAM;SACN,CAAC;IACH,CAAC;IAED,OAAO;QACN,KAAK,CAAC,MAAM,CACX,KAAa,EACb,IAAU,EACV,UAAyB,EAAE;YAE3B,MAAM,OAAO,GAA4B,EAAC,KAAK,EAAC,CAAC;YACjD,IAAI,OAAO,CAAC,UAAU,KAAK,SAAS;gBACnC,OAAO,CAAC,WAAW,GAAG,OAAO,CAAC,UAAU,CAAC;YAC1C,MAAM,IAAI,GAAG,MAAM,IAAI,CAAC,SAAS,CAChC,QAAQ,CAAC,OAAO,EAAE,QAAQ,CAAC,EAC3B,IAAI,CAAC,QAAQ,EAAE,OAAO,EAAE,OAAO,CAAC,MAAM,CAAC,CACvC,CAAC;YACF,MAAM,OAAO,GAAG,KAAK,CAAC,OAAO,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,OAAO,CAAC,CAAC,CAAC,EAAE,CAAC;YAChE,MAAM,UAAU,GAAG,OAAO;iBACxB,GAAG,CAAC,QAAQ,CAAC;iB
ACb,MAAM,CAAC,CAAC,CAAC,EAAqB,EAAE,CAAC,CAAC,KAAK,SAAS,CAAC,CAAC;YACpD,OAAO,OAAO,CAAC,UAAU,KAAK,SAAS;gBACtC,CAAC,CAAC,UAAU,CAAC,KAAK,CAAC,CAAC,EAAE,OAAO,CAAC,UAAU,CAAC;gBACzC,CAAC,CAAC,UAAU,CAAC;QACf,CAAC;QAED,KAAK,CAAC,KAAK,CACV,GAAW,EACX,IAAU,EACV,UAAwB,EAAE;YAE1B,kEAAkE;YAClE,iEAAiE;YACjE,kDAAkD;YAClD,MAAM,IAAI,GAAG,MAAM,IAAI,CAAC,SAAS,CAChC,QAAQ,CAAC,OAAO,EAAE,SAAS,CAAC,EAC5B,IAAI,CAAC,SAAS,EAAE,EAAC,IAAI,EAAE,GAAG,EAAE,MAAM,EAAE,UAAU,EAAC,EAAE,OAAO,CAAC,MAAM,CAAC,CAChE,CAAC;YACF,MAAM,OAAO,GAAG,CAAC,IAAI,CAAC,cAAc,IAAI,EAAE,CAAC,CAAC,IAAI,CAC/C,CAAC,CAAC,EAAE,EAAE,CAAC,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC,KAAK,GAAG,CACzB,CAAC;YACF,IAAI,OAAO;gBACV,MAAM,IAAI,KAAK,CACd,sCAAsC,GAAG,KAAK,GAAG,CAAC,OAAO,CAAC,KAAK,CAAC,IAAI,eAAe,EAAE,CACrF,CAAC;YACH,MAAM,GAAG,GAAG,CAAC,IAAI,CAAC,OAAO,IAAI,EAAE,CAAC,CAAC,CAAC,CAAC,CAAC;YACpC,MAAM,QAAQ,GAAG,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC,GAAG,CAAC,WAAW,CAAC,CAAC,CAAC,CAAC,SAAS,CAAC;YACxD,IAAI,QAAQ,KAAK,SAAS;gBACzB,MAAM,IAAI,KAAK,CAAC,wCAAwC,GAAG,EAAE,CAAC,CAAC;YAChE,mEAAmE;YACnE,OAAO,EAAC,GAAG,EAAE,QAAQ,EAAE,SAAS,EAAE,KAAK,EAAC,CAAC;QAC1C,CAAC;KACD,CAAC;AACH,CAAC"}