webveil 0.1.0 → 0.2.0

package/README.md CHANGED
@@ -13,33 +13,261 @@ works perfectly well non-anonymously (direct egress).
13
13
  webveil is a pnpm workspace monorepo. The **core** (`search()` / `fetch()`) is plain,
14
14
  framework-agnostic. Two thin frontends wrap that same core:
15
15
 
16
- - **[`webveil`](packages/webveil)** an [incur](https://github.com/wevm/incur)-based
16
+ - **[`webveil`](packages/webveil)**, an [incur](https://github.com/wevm/incur)-based
17
17
  **CLI + MCP server** (`--mcp`, skills, `--llms`, TOON output). Pi-agnostic; usable by any
18
18
  agent (pi via pi-mcp-adapter, Claude Code, Cursor, Codex, bash). Has a `webveil` bin.
19
- - **[`pi-webveil`](packages/pi-webveil)** a **pi extension** registering `web_search` and
19
+ - **[`pi-webveil`](packages/pi-webveil)**, a **pi extension** registering `web_search` and
20
20
  `web_fetch` tools that call the core in-process. A drop-in replacement for Ollama's tools
21
21
  (same names), which is the original motivation. Depends on `webveil` via `workspace:*`.
22
22
 
23
+ ## Quick start
24
+
25
+ webveil needs a **backend** to get results from. The zero-config default is a local
26
+ **SearXNG** at `http://127.0.0.1:8080` on `direct` egress (non-anonymous). There is
27
+ **no** zero-setup + anonymous + real-web-results option in the ecosystem (see
28
+ [`work/notes/ideas/default-backend-policy-account-vs-origin.md`](work/notes/ideas/default-backend-policy-account-vs-origin.md)).
29
+ SearXNG (you run it) is the closest; `tavily-compat` (needs an account/key) is the other.
30
+
31
+ ### Run SearXNG (matches the default with no config)
32
+
33
+ ```sh
34
+ # Docker: the container binds 8080 internally; map host 8080 -> container 8080
35
+ # so it matches webveil's default baseUrl exactly.
36
+ docker run -d --name searxng -p 8080:8080 searxng/searxng
37
+ ```
38
+
39
+ Then `webveil search "…"` / `web_fetch` work with no config.
40
+
41
+ > **Port gotcha (you WILL hit this):** SearXNG's default port depends on how you install
42
+ > it. A bare-metal / pip / source install defaults to **8888** (`settings.yml`
43
+ > `server.port: 8888`). The Docker image binds **8080** internally regardless (its
44
+ > entrypoint forces `0.0.0.0:8080`). SearXNG's own docs suggest `docker run … -p 8888:8080`
45
+ > (host 8888 → container 8080). webveil's default expects **8080**. If your instance is on
46
+ > any other port, point webveil at it:
47
+ >
48
+ > ```sh
49
+ > export WEBVEIL_BASE_URL=http://127.0.0.1:8888 # or wherever your instance listens
50
+ > ```
51
+ >
52
+ > or set `baseUrl` in `webveil.json` (see config seam below).
53
+
54
+ ### Other SearXNG install options
55
+
56
+ webveil needs something to point `baseUrl` at: an **HTTP `host:port`**, or (script install)
57
+ the **Unix socket** itself. How you get one:
58
+
59
+ - **Docker (above)**, binds a real TCP port directly; simplest if you only need webveil.
60
+ - **Install script as a background service** (`sudo -H ./utils/searxng.sh install all`,
61
+ see <https://docs.searxng.org/admin/installation-scripts.html>), sets SearXNG up as a
62
+ systemd/uWSGI service. **Gotcha:** by default this listens on a **Unix socket**
63
+ (`socket = /usr/local/searxng/run/socket`), NOT a TCP port. And, crucially, that default
64
+ socket speaks the **native uwsgi protocol, NOT HTTP** (`socket = …`, not `http-socket =
65
+ …`), so even a `curl --unix-socket … http://localhost/` returns HTTP 000. webveil's
66
+ `unix:` baseUrl speaks **HTTP over a unix socket** via undici, so it CANNOT reach that
67
+ default uwsgi socket directly. Three ways to reach the install-script instance:
68
+ - **Point webveil straight at an HTTP unix socket** (no proxy, no extra process), once the
69
+ socket actually speaks HTTP. The install-script default does NOT, so first make uWSGI
70
+ serve HTTP on the socket: in the generated `.ini`, replace
71
+ `socket = /usr/local/searxng/run/socket` with
72
+ `http-socket = /usr/local/searxng/run/socket` (HTTP over the socket instead of the
73
+ uwsgi protocol). THEN point webveil at it with a `unix:` URL naming the socket file:
74
+ ```sh
75
+ export WEBVEIL_BASE_URL=unix:/usr/local/searxng/run/socket
76
+ ```
77
+ webveil dials the socket directly over undici (`Agent({connect:{socketPath}})`, no
78
+ extra dependency) and issues its normal `/search?...&format=json` request. The grammar
79
+ is `unix:<socketPath>[:<httpPath>]`: the socket file path, then an OPTIONAL `:` +
80
+ base path (mount point) the SearXNG app lives under (defaults to `/`, so the example
81
+ above requests `/search`; a non-root mount is `unix:/usr/local/searxng/run/socket:/searxng`).
82
+ (`unix:` works against ANY HTTP-on-a-unix-socket server, e.g. a Caddy/nginx upstream
83
+ bound to a socket; the uwsgi-vs-`http-socket` distinction above is the SearXNG-specific
84
+ catch.)
85
+ **Egress must be `direct`** for this: a Unix socket is inherently local, so combining a
86
+ `unix:` baseUrl with `egress=http`/`socks5` fails loud (proxying a local hop is fake
87
+ anonymity, see "Where does anonymity live?" below; proxy SearXNG's `outgoing.proxies`
88
+ instead and keep webveil `direct`).
89
+ - **Front it with a reverse proxy** (this is what the SearXNG docs' nginx/apache step is
90
+ for: it bridges HTTP-on-a-port to the uWSGI socket, serving BOTH the browser UI and
91
+ webveil). **Any HTTP server works** (the docs say so explicitly); **Caddy is fine** and
92
+ a good pick if you already run it. Plain Caddy `reverse_proxy` speaks **HTTP** to its
93
+ upstream, so point it at a Unix `http-socket` (see above) or a TCP `http-socket` (see below):
94
+ ```caddy
95
+ searxng.example.com {
96
+ reverse_proxy unix//usr/local/searxng/run/socket # plain reverse_proxy = HTTP, so the socket must be http-socket = (not the uwsgi socket =)
97
+ }
98
+ ```
99
+ Then point webveil at the Caddy address. (Set SearXNG's `server.base_url` in
100
+ `settings.yml` to match, and keep the limiter in mind, see below.) If you want a Caddy
101
+ frontend AND webveil-direct, the simplest path is ONE `http-socket` that both consume
102
+ (Caddy's HTTP `reverse_proxy` and webveil's `unix:` both speak HTTP to it); you only
103
+ need the uwsgi `socket = ` form if Caddy uses an explicit uwsgi transport.
104
+ - **Or make uWSGI listen on a TCP port** instead of the socket: in the generated
105
+ `.ini`, replace `socket = …/run/socket` with `http-socket = 127.0.0.1:8888`, then point
106
+ webveil at `http://127.0.0.1:8888`. Good when you want ONLY webveil (no public web UI /
107
+ TLS).
108
+
109
+ > **You will also need to enable the JSON API and (for a local instance) disable the
110
+ > limiter.** A fresh script install ships with `server.limiter: true` and often no `json`
111
+ > output format, so webveil gets `429 TOO MANY REQUESTS` or an HTML page. In SearXNG's
112
+ > `settings.yml` set `server.limiter: false` + `server.public_instance: false` (safe for a
113
+ > LOCAL, socket-only instance, NOT internet-exposed) and add `json` under `search.formats:`
114
+ > (`[html, json]`), then restart uWSGI. This applies to EVERY option above; it is a
115
+ > SearXNG-side requirement, not a webveil one.
116
+
117
+ Full SearXNG install options (Docker, Compose, script, bare-metal): the official docs at
118
+ <https://docs.searxng.org/admin/installation.html>. Install topology + the
119
+ uwsgi-vs-`http-socket`, limiter, and reverse-proxy details are captured in
120
+ [`work/notes/findings/searxng-install-topology.md`](work/notes/findings/searxng-install-topology.md)
121
+ and
122
+ [`work/notes/findings/searxng-script-socket-is-uwsgi-not-http.md`](work/notes/findings/searxng-script-socket-is-uwsgi-not-http.md).
123
+
124
+ ### Where does anonymity live? (read before turning on egress)
125
+
126
+ **webveil's egress only anonymizes webveil's OWN outbound hop** (webveil → backend, and
127
+ `web_fetch` → the target URL). It does NOT anonymize what a backend does next. This has a
128
+ load-bearing consequence for SearXNG:
129
+
130
+ - A **local** SearXNG makes its actual search-engine requests (→ Google/Bing/…) from
131
+ **its own process, on your machine, with your real IP**. That hop is OUTSIDE webveil's
132
+ egress. So setting `WEBVEIL_EGRESS=socks5` while `baseUrl` is `127.0.0.1` does **NOT**
133
+ make your searches anonymous: webveil would just be proxying a pointless localhost call,
134
+ while SearXNG crawls the web from your real IP. That is **false confidence**, the worst
135
+ outcome.
136
+ - **webveil refuses this combo (fail-loud):** a non-`direct` egress (`http`/`socks5`) with
137
+ a **loopback `baseUrl`** is rejected with an error, rather than silently giving you fake
138
+ anonymity. (A *remote* SearXNG over SOCKS is legitimate and allowed; the guard keys on
139
+ loopback specifically.)
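
Reviewer sketch (not part of the package): the fail-loud rule described in the two bullets above can be pictured roughly as follows. This is a minimal illustration under stated assumptions, not webveil's actual guard: the names `isLocalBaseUrl` and `assertEgressMakesSense` are hypothetical, and the loopback test (`localhost`, `127.x.x.x`, `::1`, or a `unix:` baseUrl) is an assumption about what "loopback" covers.

```typescript
// Hypothetical sketch of the loopback guard; names and loopback rules are assumptions.
type EgressMode = 'direct' | 'http' | 'socks5';

/** True when the baseUrl points at this machine (loopback host or Unix socket). */
function isLocalBaseUrl(baseUrl: string): boolean {
  if (baseUrl.startsWith('unix:')) return true; // a Unix socket is inherently local
  const host = new URL(baseUrl).hostname.toLowerCase();
  return (
    host === 'localhost' ||
    host === '[::1]' || // WHATWG URL keeps the brackets on IPv6 hostnames
    /^127\.\d+\.\d+\.\d+$/.test(host)
  );
}

/** Throw instead of silently proxying a hop that never leaves the machine. */
function assertEgressMakesSense(mode: EgressMode, baseUrl: string): void {
  if (mode !== 'direct' && isLocalBaseUrl(baseUrl)) {
    throw new Error(
      `egress=${mode} with a local baseUrl (${baseUrl}) would only proxy a localhost hop`,
    );
  }
}

assertEgressMakesSense('direct', 'http://127.0.0.1:8080'); // allowed: honest, non-anonymous
assertEgressMakesSense('socks5', 'https://searx.example.org'); // allowed: remote backend
```

The point of throwing rather than warning is the one the README makes: a proxied localhost call looks like anonymity while SearXNG still reaches the web from the real IP.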
140
+
141
+ So the correct setups:
142
+
143
+ | Goal | webveil egress | backend | Who anonymizes the web hop |
144
+ | --- | --- | --- | --- |
145
+ | Local SearXNG, anonymous searches | `direct` | local SearXNG | **SearXNG itself**, set its `outgoing.proxies` (Tor/SOCKS) in `settings.yml` |
146
+ | Remote SearXNG, hide your IP from it | `socks5` | the **remote** SearXNG url | webveil's hop (Mullvad/Tor) |
147
+ | Anonymous `web_fetch` of arbitrary URLs | `socks5` | (any) | webveil's hop |
148
+ | Non-anonymous everyday use | `direct` | local SearXNG | nobody (honest) |
149
+
150
+ Rule of thumb: **proxy the hop that actually reaches the public internet.** For a
151
+ self-hosted SearXNG that hop is SearXNG's, so the proxy goes on SearXNG
152
+ (`outgoing.proxies`), and webveil stays `direct`. webveil's `socks5` mode is for *remote*
153
+ backends and for `web_fetch`. See
154
+ [`work/notes/findings/webveil-anonymity-boundary.md`](work/notes/findings/webveil-anonymity-boundary.md).
155
+
23
156
  ## How it works (seams)
24
157
 
25
- - **core** the framework-agnostic `search(query, opts)` and `fetch(url, opts)` functions.
158
+ - **core**, the framework-agnostic `search(query, opts)` and `fetch(url, opts)` functions.
26
159
  Both frontends call the same core.
27
- - **backend seam** where results/content come from: `searxng` (keyless self-hosted
160
+ - **backend seam**, where results/content come from: `searxng` (keyless self-hosted
28
161
  metasearch), `tavily-compat` (a generic Tavily-shaped `/search` + `/extract`), and
29
162
  `custom` (a local command via a JSON stdin/stdout contract). The backend is handed a
30
163
  proxied `http` helper so it cannot bypass egress.
31
- - **egress seam** how outbound HTTP leaves the machine: `direct`, `http` (undici
164
+ - **egress seam**, how outbound HTTP leaves the machine: `direct`, `http` (undici
32
165
  `ProxyAgent`), or `socks5` (Tor `127.0.0.1:9050`, Mullvad `10.64.0.1:1080`). SOCKS5 is
33
166
  the mode that matters for anonymity. Fail-loud if a configured proxy cannot be built.
34
- - **config seam** per-folder resolution: env > nearest `.pi/webveil.json` walking up from
35
- cwd > global `~/.pi/agent/webveil.json` > defaults. Per folder = per account/egress.
36
- - **extractor seam** `urlToMarkdown` via `distilly/fetch` by default, injected with
167
+ **Egress is per-request and scoped to webveil ONLY**; it is NOT a system-wide proxy. It
168
+ governs webveil's own search/fetch traffic (and the `fetch` it injects into distilly),
169
+ and nothing else: your shell, `git push`, the browser, and the OS are untouched. So
170
+ webveil on `socks5` does NOT route your `git push` through the proxy. See
171
+ [Anonymous egress](#anonymous-egress-mullvad--tor) and
172
+ [`work/notes/findings/mullvad-socks5-egress-mechanics.md`](work/notes/findings/mullvad-socks5-egress-mechanics.md).
173
+ - **config seam**, per-folder resolution: env > nearest `webveil.json` walking up from
174
+ cwd > global `$XDG_CONFIG_HOME/webveil/config.json` (default
175
+ `~/.config/webveil/config.json`) > defaults. Per folder = per account/egress. The
176
+ project file is a frontend-neutral `webveil.json` read identically by the CLI and the
177
+ pi extension. See [`docs/adr/0002`](docs/adr/0002-config-file-location-neutral-webveil-json.md).
178
+ - **extractor seam**, `urlToMarkdown` via `distilly/fetch` by default, injected with
37
179
  webveil's egress-bound `fetch`; a backend's own `/extract` (Tavily-compat) may override
38
180
  it. Owns the context-friendly markdown + size presets (`s`/`m`/`l`/`f`). See
39
181
  [`docs/adr/0001`](docs/adr/0001-extractor-uses-distilly-fetch-with-injected-egress.md).
40
- - **security** an SSRF guard lives in the egress fetch, so it covers distilly's
182
+ - **security**, an SSRF guard lives in the egress fetch, so it covers distilly's
41
183
  rule-rewritten requests too.
42
184
 
185
+ ## Anonymous egress (Mullvad / Tor)
186
+
187
+ By default webveil uses `direct` egress (your real IP, non-anonymous). Anonymity is
188
+ **opt-in**: it is enabled ONLY when you set it in config/env. webveil never auto-enables a
189
+ proxy (silent anonymity would be a footgun in the other direction).
190
+
191
+ Enable SOCKS5 egress for webveil:
192
+
193
+ ```sh
194
+ export WEBVEIL_EGRESS=socks5
195
+ export WEBVEIL_EGRESS_URL=socks5://10.64.0.1:1080 # Mullvad
196
+ # or socks5://127.0.0.1:9050 # Tor
197
+ ```
198
+
199
+ or per folder in `webveil.json`:
200
+
201
+ ```json
202
+ { "egress": { "mode": "socks5", "url": "socks5://10.64.0.1:1080" } }
203
+ ```
204
+
205
+ ### Two layers keep your `git push` (and everything else) off the proxy
206
+
207
+ A common worry: "if I route through Mullvad, will my `git push` to GitHub leak under the
208
+ VPN exit IP?" With webveil, **no**, for two independent reasons:
209
+
210
+ 1. **webveil's egress is per-request and webveil-only.** It applies the SOCKS5 dispatcher
211
+ inside its own search/fetch code; it does not install a system proxy. `git`, your shell,
212
+ and the OS are never touched. webveil on `socks5` proxies webveil's traffic and nothing
213
+ else.
214
+ 2. **You configure split routing** (below) so that even at the OS level, only the proxy IP
215
+ goes through the tunnel.
216
+
217
+ ### Mullvad: use the SOCKS5 proxy WITHOUT tunnelling all your traffic
218
+
219
+ Mullvad's SOCKS5 proxy at `10.64.0.1:1080` **only exists while a Mullvad WireGuard tunnel
220
+ is up** (it is reachable only through the tunnel). The trick is to keep the tunnel up but
221
+ tell WireGuard NOT to route your normal traffic through it, only the proxy IP. Add this to
222
+ your Mullvad WireGuard `.conf` (`[Interface]` section):
223
+
224
+ ```ini
225
+ Table = off
226
+ PostUp = ip -4 route add 10.64.0.1/32 dev %i; ip -4 route add 10.124.0.0/22 dev %i
227
+ PreDown = ip -4 route delete 10.64.0.1/32 dev %i; ip -4 route delete 10.124.0.0/22 dev %i
228
+ ```
229
+
230
+ `Table = off` stops WireGuard from grabbing the default route; the manual routes send ONLY
231
+ Mullvad's SOCKS5 proxy IPs through the tunnel (`10.124.0.0/22` is the multihop range).
232
+ Result: webveil's SOCKS5 requests exit via Mullvad; all other traffic (git, browser, OS)
233
+ uses your normal ISP connection. (Simpler alternative: leave WireGuard's routing alone and
234
+ rely on layer 1, but split routing is the belt-and-braces version.)
235
+
236
+ Verify the proxy works: `curl https://ipv4.am.i.mullvad.net --socks5-hostname 10.64.0.1`
237
+ should return a Mullvad exit IP; a plain `curl https://am.i.mullvad.net` should return your
238
+ real IP (proving only the proxy is tunnelled).
239
+
240
+ ### "Different exit identity for webveil than for the rest of the machine"
241
+
242
+ If you want webveil to exit somewhere different from your system, you have options, but be
243
+ clear on what is and isn't possible (see
244
+ [`work/notes/findings/mullvad-socks5-egress-mechanics.md`](work/notes/findings/mullvad-socks5-egress-mechanics.md)):
245
+
246
+ - **Different exit LOCATION, same account (easy).** Point webveil at a specific multihop
247
+ SOCKS5 host so it exits elsewhere than your tunnel's entry:
248
+ `WEBVEIL_EGRESS_URL=socks5://us-nyc-wg-socks5-001.relays.mullvad.net:1080`. Your tunnel
249
+ enters where your Mullvad app is connected; webveil's traffic exits in NYC. Same Mullvad
250
+ account, unlinkable-by-location.
251
+ - **Two DIFFERENT Mullvad ACCOUNTS at once (hard, not a webveil feature).** Mullvad's
252
+ SOCKS5 proxy is a property of the ONE active WireGuard tunnel, which is tied to ONE
253
+ account's key. SOCKS5 multihop changes exit location, NOT account. To run account A
254
+ system-wide AND account B for webveil simultaneously, you must isolate them at the OS
255
+ level: run webveil inside its own network namespace / VM / container that has its own
256
+ WireGuard tunnel on account B, while the host runs account A. That is infrastructure work
257
+ outside webveil. For most people, "don't link my searches to my git" is already solved by
258
+ split routing above (searches exit via Mullvad, git stays on your real IP, not correlated
259
+ by exit IP), without needing a second account.
260
+
261
+ ### Tor
262
+
263
+ `WEBVEIL_EGRESS_URL=socks5://127.0.0.1:9050` with the Tor daemon running. Same per-request,
264
+ webveil-only scoping applies.
265
+
266
+ > **Caveat:** webveil's `socks5` mode is NOT a whole-machine VPN. Do not assume enabling it
267
+ > anonymizes anything other than webveil. Conversely, a system-wide full-tunnel VPN under
268
+ > your logged-in identity is the thing that CAN deanonymize a `git push`; webveil's scoped
269
+ > egress deliberately avoids that.
270
+
43
271
  ## License
44
272
 
45
273
  AGPL-3.0-or-later. webveil depends on `distilly` (MIT, the local HTML-to-markdown
@@ -62,7 +290,7 @@ a promise); `LOC` is the actual line count of the built file.
62
290
  | src/cli.ts (incur frontend) | 106 | ~80 |
63
291
  | src/core/search.ts | 104 | ~90 |
64
292
  | src/core/fetch.ts | 132 | ~90 |
65
- | src/core/config.ts | 106 | ~80 |
293
+ | src/core/config.ts | 128 | ~80 |
66
294
  | src/core/egress.ts | 106 | ~70 |
67
295
  | src/core/http.ts | 62 | ~60 |
68
296
  | src/core/extract.ts | 82 | ~60 |
@@ -72,7 +300,7 @@ a promise); `LOC` is the actual line count of the built file.
72
300
  | src/core/backends/searxng.ts | 70 | ~90 |
73
301
  | src/core/backends/tavily-compat.ts | 156 | ~90 |
74
302
  | src/core/backends/custom.ts | 159 | ~70 |
75
- | **subtotal** | 1408 | |
303
+ | **subtotal** | 1430 | |
76
304
 
77
305
  ### `packages/pi-webveil` (pi extension frontend)
78
306
 
@@ -80,7 +308,7 @@ a promise); `LOC` is the actual line count of the built file.
80
308
  | ------------ | --: | -----: |
81
309
  | src/index.ts | 168 | ~90 |
82
310
 
83
- **Total own source: 1576 LOC** (excluding deps).
311
+ **Total own source: 1598 LOC** (excluding deps).
84
312
 
85
313
  > Reality vs. target: several modules currently exceed their `CONTEXT.md` ceilings (notably
86
314
  > `tavily-compat.ts`, `custom.ts`, `pi-webveil/src/index.ts`), and two built modules
@@ -0,0 +1,42 @@
1
+ import { type Dispatcher } from 'undici';
2
+ /** A parsed Unix-socket baseUrl: the socket file path + the app's base path. */
3
+ export interface UnixBaseUrl {
4
+ socketPath: string;
5
+ /** The app's base path (mount point); the backend appends `search` to it. */
6
+ httpPath: string;
7
+ }
8
+ /** Is this resolved `baseUrl` a Unix-domain-socket form (`unix:…`)? */
9
+ export declare function isUnixBaseUrl(baseUrl: string): boolean;
10
+ /**
11
+ * Parse a `unix:<socketPath>[:<httpPath>]` baseUrl into `{socketPath, httpPath}`.
12
+ * Splits on the FIRST `:` after the `unix:` scheme (socket paths conventionally
13
+ * carry no colon); an absent/empty `<httpPath>` defaults to `/`.
14
+ *
15
+ * Throws if the socket path is empty (there is nothing to connect to).
16
+ */
17
+ export declare function parseUnixBaseUrl(baseUrl: string): UnixBaseUrl;
18
+ /**
19
+ * The result of resolving a `baseUrl` for the BACKEND hop: the (possibly
20
+ * rewritten) HTTP base the backend builds its request URL on, plus an OPTIONAL
21
+ * undici `Dispatcher` to carry that hop. For a normal TCP baseUrl the dispatcher
22
+ * is `undefined` (the caller uses the config-wide egress dispatcher); for a
23
+ * `unix:` baseUrl it is a socket-bound `Agent` and the base is a synthetic
24
+ * `http://localhost<httpPath>`.
25
+ */
26
+ export interface BackendTransport {
27
+ baseUrl: string;
28
+ dispatcher?: Dispatcher;
29
+ }
30
+ /**
31
+ * Resolve a `baseUrl` into a backend-hop transport. For a `unix:` baseUrl this
32
+ * builds a socket-bound `Agent({connect:{socketPath}})` and a synthetic
33
+ * `http://localhost<httpPath>` base (the URL host is irrelevant to routing — the
34
+ * socket decides — and only becomes the `Host` header). For any other baseUrl it
35
+ * returns the baseUrl unchanged with NO dispatcher (the caller keeps using the
36
+ * shared config-wide egress dispatcher).
37
+ *
38
+ * NOTE: this is the BACKEND-hop transport only. It is never bound into the
39
+ * shared egress dispatcher, so `web_fetch`/SSRF egress is unaffected.
40
+ */
41
+ export declare function resolveBackendTransport(baseUrl: string): BackendTransport;
42
+ //# sourceMappingURL=baseurl.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"baseurl.d.ts","sourceRoot":"","sources":["../../src/core/baseurl.ts"],"names":[],"mappings":"AA6BA,OAAO,EAAQ,KAAK,UAAU,EAAC,MAAM,QAAQ,CAAC;AAK9C,gFAAgF;AAChF,MAAM,WAAW,WAAW;IAC3B,UAAU,EAAE,MAAM,CAAC;IACnB,6EAA6E;IAC7E,QAAQ,EAAE,MAAM,CAAC;CACjB;AAED,uEAAuE;AACvE,wBAAgB,aAAa,CAAC,OAAO,EAAE,MAAM,GAAG,OAAO,CAEtD;AAED;;;;;;GAMG;AACH,wBAAgB,gBAAgB,CAAC,OAAO,EAAE,MAAM,GAAG,WAAW,CAgB7D;AAED;;;;;;;GAOG;AACH,MAAM,WAAW,gBAAgB;IAChC,OAAO,EAAE,MAAM,CAAC;IAChB,UAAU,CAAC,EAAE,UAAU,CAAC;CACxB;AAED;;;;;;;;;;GAUG;AACH,wBAAgB,uBAAuB,CAAC,OAAO,EAAE,MAAM,GAAG,gBAAgB,CAQzE"}
@@ -0,0 +1,79 @@
1
+ // baseUrl transport parsing — the small, transport-AWARE helper that classifies
2
+ // a resolved `baseUrl` into either a normal TCP HTTP base or a Unix-domain-socket
3
+ // form, and (for the socket form) rewrites it into a synthetic `http://localhost`
4
+ // base that the transport-UNAWARE backends can build their request URL on top of.
5
+ //
6
+ // WHY THIS LIVES HERE (and not in egress.ts): the egress dispatcher
7
+ // (`buildDispatcher`) is built ONCE from config and SHARED by both the backend
8
+ // hop (search.ts) AND the arbitrary-public-URL fetch (fetch.ts). Binding a socket
9
+ // `Agent` into that shared dispatcher would route every `web_fetch` into the
10
+ // SearXNG socket. So the socket transport is scoped to the BACKEND `baseUrl` hop
11
+ // only: search.ts asks this helper to translate the baseUrl, gets back a real
12
+ // `http://localhost…` base (which the searxng backend's `new URL('search', base)`
13
+ // still works on, unchanged) plus the per-hop socket `Agent`, and leaves the
14
+ // fetch/SSRF egress path untouched.
15
+ //
16
+ // GRAMMAR (recorded decision, see the task's done record):
17
+ // unix:<socketPath>[:<httpPath>]
18
+ // - `<socketPath>`: the absolute path to the uWSGI Unix domain socket
19
+ // (e.g. /usr/local/searxng/run/socket). Must not contain a `:` (the parse
20
+ // splits on the FIRST `:` after the `unix:` scheme; conventional socket
21
+ // paths never contain a colon).
22
+ // - `<httpPath>`: OPTIONAL base path the SearXNG app is mounted under,
23
+ // defaulting to `/`. It is the SAME thing the TCP `baseUrl` encodes as its
24
+ // path: the backend appends `search` to it (`new URL('search', base + '/')`),
25
+ // so the install default is just `unix:/usr/local/searxng/run/socket` (the
26
+ // backend then requests `/search`). A non-root mount is `…/socket:/searxng`.
27
+ // A raw `unix:…` string is NOT a valid base for `new URL('search', …)`, so this
28
+ // translation MUST run BEFORE the backend builds its URL.
29
+ import { Agent } from 'undici';
30
+ /** The `unix:` scheme prefix this helper recognizes on a `baseUrl`. */
31
+ const UNIX_PREFIX = 'unix:';
32
+ /** Is this resolved `baseUrl` a Unix-domain-socket form (`unix:…`)? */
33
+ export function isUnixBaseUrl(baseUrl) {
34
+ return baseUrl.startsWith(UNIX_PREFIX);
35
+ }
36
+ /**
37
+ * Parse a `unix:<socketPath>[:<httpPath>]` baseUrl into `{socketPath, httpPath}`.
38
+ * Splits on the FIRST `:` after the `unix:` scheme (socket paths conventionally
39
+ * carry no colon); an absent/empty `<httpPath>` defaults to `/`.
40
+ *
41
+ * Throws if the socket path is empty (there is nothing to connect to).
42
+ */
43
+ export function parseUnixBaseUrl(baseUrl) {
44
+ const rest = baseUrl.slice(UNIX_PREFIX.length);
45
+ const sep = rest.indexOf(':');
46
+ const socketPath = sep === -1 ? rest : rest.slice(0, sep);
47
+ const rawHttpPath = sep === -1 ? '' : rest.slice(sep + 1);
48
+ if (!socketPath)
49
+ throw new Error(`webveil: malformed unix baseUrl ${JSON.stringify(baseUrl)} — ` +
50
+ `expected unix:<socketPath>[:<httpPath>] with a non-empty socket path`);
51
+ const httpPath = rawHttpPath
52
+ ? rawHttpPath.startsWith('/')
53
+ ? rawHttpPath
54
+ : '/' + rawHttpPath
55
+ : '/';
56
+ return { socketPath, httpPath };
57
+ }
58
+ /**
59
+ * Resolve a `baseUrl` into a backend-hop transport. For a `unix:` baseUrl this
60
+ * builds a socket-bound `Agent({connect:{socketPath}})` and a synthetic
61
+ * `http://localhost<httpPath>` base (the URL host is irrelevant to routing — the
62
+ * socket decides — and only becomes the `Host` header). For any other baseUrl it
63
+ * returns the baseUrl unchanged with NO dispatcher (the caller keeps using the
64
+ * shared config-wide egress dispatcher).
65
+ *
66
+ * NOTE: this is the BACKEND-hop transport only. It is never bound into the
67
+ * shared egress dispatcher, so `web_fetch`/SSRF egress is unaffected.
68
+ */
69
+ export function resolveBackendTransport(baseUrl) {
70
+ if (!isUnixBaseUrl(baseUrl))
71
+ return { baseUrl };
72
+ const { socketPath, httpPath } = parseUnixBaseUrl(baseUrl);
73
+ const dispatcher = new Agent({ connect: { socketPath } });
74
+ return {
75
+ baseUrl: `http://localhost${httpPath === '/' ? '' : httpPath}`,
76
+ dispatcher,
77
+ };
78
+ }
79
+ //# sourceMappingURL=baseurl.js.map
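
Reviewer sketch (not part of the package): to make the `unix:<socketPath>[:<httpPath>]` grammar concrete, the example below follows the same splitting rules as `parseUnixBaseUrl` above but leaves out the undici `Agent`, so it runs without dependencies. The helper names `toSyntheticBase` and `searchUrl` are hypothetical, and the `new URL('search', base + '/')` step is assumed from the header comment rather than taken from the searxng backend's code.

```typescript
// Sketch only: mirrors the documented grammar, without the socket-bound Agent.
function toSyntheticBase(baseUrl: string): { socketPath: string; base: string } {
  const rest = baseUrl.slice('unix:'.length);
  const sep = rest.indexOf(':'); // split on the FIRST ':' after the scheme
  const socketPath = sep === -1 ? rest : rest.slice(0, sep);
  const raw = sep === -1 ? '' : rest.slice(sep + 1);
  if (!socketPath) throw new Error('empty socket path');
  const httpPath = raw ? (raw.startsWith('/') ? raw : '/' + raw) : '/';
  return { socketPath, base: `http://localhost${httpPath === '/' ? '' : httpPath}` };
}

// Assumed backend-side step: append `search` to the (possibly mounted) base.
function searchUrl(base: string): string {
  return new URL('search', base + '/').href;
}

const rootMount = toSyntheticBase('unix:/usr/local/searxng/run/socket');
console.log(rootMount.socketPath, searchUrl(rootMount.base));
// /usr/local/searxng/run/socket http://localhost/search

const subMount = toSyntheticBase('unix:/usr/local/searxng/run/socket:/searxng');
console.log(subMount.socketPath, searchUrl(subMount.base));
// /usr/local/searxng/run/socket http://localhost/searxng/search
```

The `localhost` host in the synthetic base never drives routing; only the socket path does, which is why the same base works for any HTTP server bound to a Unix socket.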
@@ -0,0 +1 @@
1
+ {"version":3,"file":"baseurl.js","sourceRoot":"","sources":["../../src/core/baseurl.ts"],"names":[],"mappings":"AAAA,gFAAgF;AAChF,kFAAkF;AAClF,kFAAkF;AAClF,kFAAkF;AAClF,EAAE;AACF,oEAAoE;AACpE,+EAA+E;AAC/E,kFAAkF;AAClF,6EAA6E;AAC7E,iFAAiF;AACjF,8EAA8E;AAC9E,kFAAkF;AAClF,6EAA6E;AAC7E,oCAAoC;AACpC,EAAE;AACF,2DAA2D;AAC3D,mCAAmC;AACnC,0EAA0E;AAC1E,gFAAgF;AAChF,8EAA8E;AAC9E,sCAAsC;AACtC,2EAA2E;AAC3E,iFAAiF;AACjF,oFAAoF;AACpF,iFAAiF;AACjF,mFAAmF;AACnF,kFAAkF;AAClF,4DAA4D;AAE5D,OAAO,EAAC,KAAK,EAAkB,MAAM,QAAQ,CAAC;AAE9C,uEAAuE;AACvE,MAAM,WAAW,GAAG,OAAO,CAAC;AAS5B,uEAAuE;AACvE,MAAM,UAAU,aAAa,CAAC,OAAe;IAC5C,OAAO,OAAO,CAAC,UAAU,CAAC,WAAW,CAAC,CAAC;AACxC,CAAC;AAED;;;;;;GAMG;AACH,MAAM,UAAU,gBAAgB,CAAC,OAAe;IAC/C,MAAM,IAAI,GAAG,OAAO,CAAC,KAAK,CAAC,WAAW,CAAC,MAAM,CAAC,CAAC;IAC/C,MAAM,GAAG,GAAG,IAAI,CAAC,OAAO,CAAC,GAAG,CAAC,CAAC;IAC9B,MAAM,UAAU,GAAG,GAAG,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,IAAI,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,CAAC,EAAE,GAAG,CAAC,CAAC;IAC1D,MAAM,WAAW,GAAG,GAAG,KAAK,CAAC,CAAC,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,IAAI,CAAC,KAAK,CAAC,GAAG,GAAG,CAAC,CAAC,CAAC;IAC1D,IAAI,CAAC,UAAU;QACd,MAAM,IAAI,KAAK,CACd,mCAAmC,IAAI,CAAC,SAAS,CAAC,OAAO,CAAC,KAAK;YAC9D,sEAAsE,CACvE,CAAC;IACH,MAAM,QAAQ,GAAG,WAAW;QAC3B,CAAC,CAAC,WAAW,CAAC,UAAU,CAAC,GAAG,CAAC;YAC5B,CAAC,CAAC,WAAW;YACb,CAAC,CAAC,GAAG,GAAG,WAAW;QACpB,CAAC,CAAC,GAAG,CAAC;IACP,OAAO,EAAC,UAAU,EAAE,QAAQ,EAAC,CAAC;AAC/B,CAAC;AAeD;;;;;;;;;;GAUG;AACH,MAAM,UAAU,uBAAuB,CAAC,OAAe;IACtD,IAAI,CAAC,aAAa,CAAC,OAAO,CAAC;QAAE,OAAO,EAAC,OAAO,EAAC,CAAC;IAC9C,MAAM,EAAC,UAAU,EAAE,QAAQ,EAAC,GAAG,gBAAgB,CAAC,OAAO,CAAC,CAAC;IACzD,MAAM,UAAU,GAAG,IAAI,KAAK,CAAC,EAAC,OAAO,EAAE,EAAC,UAAU,EAAC,EAAC,CAAC,CAAC;IACtD,OAAO;QACN,OAAO,EAAE,mBAAmB,QAAQ,KAAK,GAAG,CAAC,CAAC,CAAC,EAAE,CAAC,CAAC,CAAC,QAAQ,EAAE;QAC9D,UAAU;KACV,CAAC;AACH,CAAC"}
@@ -25,9 +25,14 @@ export interface ResolveOptions {
25
25
  cwd?: string;
26
26
  /** Environment to read overrides from. Defaults to process.env. */
27
27
  env?: Record<string, string | undefined>;
28
+ /** Home directory for the XDG fallback. Defaults to os.homedir(). */
29
+ homeDir?: string;
28
30
  /**
29
- * Path to the global config file. Defaults to ~/.pi/agent/webveil.json.
30
- * Tests point this at a temp dir to isolate the real home directory.
31
+ * Path to the global config file. When given it WINS outright and the XDG
32
+ * resolution is skipped. Tests point this at a temp dir to isolate the real
33
+ * home directory. When absent, the global file resolves to
34
+ * $XDG_CONFIG_HOME/webveil/config.json, falling back to
35
+ * <homeDir>/.config/webveil/config.json.
31
36
  */
32
37
  globalPath?: string;
33
38
  }
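
Reviewer sketch (not part of the package): the global-file precedence described in the `globalPath` doc comment can be worked through on concrete inputs. The function names below are hypothetical, and `posix.join` is used only so the printed paths are the same on every platform; the package itself uses the platform `join`.

```typescript
import { posix } from 'node:path';

type Env = Record<string, string | undefined>;

// XDG-style default: $XDG_CONFIG_HOME/webveil/config.json, else <homeDir>/.config/...
function xdgGlobalPath(env: Env, homeDir: string): string {
  const base = env.XDG_CONFIG_HOME || posix.join(homeDir, '.config');
  return posix.join(base, 'webveil', 'config.json');
}

// An explicit globalPath wins outright and skips the XDG resolution.
function pickGlobalPath(opts: { globalPath?: string; env: Env; homeDir: string }): string {
  return opts.globalPath ?? xdgGlobalPath(opts.env, opts.homeDir);
}

console.log(pickGlobalPath({ env: {}, homeDir: '/home/alice' }));
// /home/alice/.config/webveil/config.json

console.log(pickGlobalPath({ env: { XDG_CONFIG_HOME: '/tmp/xdg' }, homeDir: '/home/alice' }));
// /tmp/xdg/webveil/config.json

console.log(pickGlobalPath({ globalPath: '/tmp/test/config.json', env: {}, homeDir: '/home/alice' }));
// /tmp/test/config.json
```

Because the fallback uses `||` rather than `??`, an empty `XDG_CONFIG_HOME` is treated the same as an unset one.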
@@ -1 +1 @@
1
- {"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../../src/core/config.ts"],"names":[],"mappings":"AAUA,2DAA2D;AAC3D,MAAM,MAAM,MAAM,GACf;IAAC,IAAI,EAAE,QAAQ,CAAA;CAAC,GAChB;IAAC,IAAI,EAAE,MAAM,CAAC;IAAC,GAAG,EAAE,MAAM,CAAA;CAAC,GAC3B;IAAC,IAAI,EAAE,QAAQ,CAAC;IAAC,GAAG,EAAE,MAAM,CAAA;CAAC,CAAC;AAEjC,sEAAsE;AACtE,MAAM,MAAM,SAAS,GAAG,GAAG,GAAG,GAAG,GAAG,GAAG,GAAG,GAAG,CAAC;AAE9C,+DAA+D;AAC/D,MAAM,WAAW,MAAM;IACtB,OAAO,EAAE,MAAM,CAAC;IAChB,OAAO,EAAE,MAAM,CAAC;IAChB,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,MAAM,EAAE,MAAM,CAAC;IACf,SAAS,EAAE,SAAS,CAAC;CACrB;AAED,mEAAmE;AACnE,MAAM,MAAM,aAAa,GAAG,OAAO,CAAC,MAAM,CAAC,CAAC;AAE5C,MAAM,WAAW,cAAc;IAC9B,4EAA4E;IAC5E,GAAG,CAAC,EAAE,MAAM,CAAC;IACb,mEAAmE;IACnE,GAAG,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,GAAG,SAAS,CAAC,CAAC;IACzC;;;OAGG;IACH,UAAU,CAAC,EAAE,MAAM,CAAC;CACpB;AA+CD;;;GAGG;AACH,wBAAgB,aAAa,CAAC,OAAO,GAAE,cAAmB,GAAG,MAAM,CAalE"}
1
+ {"version":3,"file":"config.d.ts","sourceRoot":"","sources":["../../src/core/config.ts"],"names":[],"mappings":"AAcA,2DAA2D;AAC3D,MAAM,MAAM,MAAM,GACf;IAAC,IAAI,EAAE,QAAQ,CAAA;CAAC,GAChB;IAAC,IAAI,EAAE,MAAM,CAAC;IAAC,GAAG,EAAE,MAAM,CAAA;CAAC,GAC3B;IAAC,IAAI,EAAE,QAAQ,CAAC;IAAC,GAAG,EAAE,MAAM,CAAA;CAAC,CAAC;AAEjC,sEAAsE;AACtE,MAAM,MAAM,SAAS,GAAG,GAAG,GAAG,GAAG,GAAG,GAAG,GAAG,GAAG,CAAC;AAE9C,+DAA+D;AAC/D,MAAM,WAAW,MAAM;IACtB,OAAO,EAAE,MAAM,CAAC;IAChB,OAAO,EAAE,MAAM,CAAC;IAChB,MAAM,CAAC,EAAE,MAAM,CAAC;IAChB,MAAM,EAAE,MAAM,CAAC;IACf,SAAS,EAAE,SAAS,CAAC;CACrB;AAED,mEAAmE;AACnE,MAAM,MAAM,aAAa,GAAG,OAAO,CAAC,MAAM,CAAC,CAAC;AAE5C,MAAM,WAAW,cAAc;IAC9B,4EAA4E;IAC5E,GAAG,CAAC,EAAE,MAAM,CAAC;IACb,mEAAmE;IACnE,GAAG,CAAC,EAAE,MAAM,CAAC,MAAM,EAAE,MAAM,GAAG,SAAS,CAAC,CAAC;IACzC,qEAAqE;IACrE,OAAO,CAAC,EAAE,MAAM,CAAC;IACjB;;;;;;OAMG;IACH,UAAU,CAAC,EAAE,MAAM,CAAC;CACpB;AA4DD;;;GAGG;AACH,wBAAgB,aAAa,CAAC,OAAO,GAAE,cAAmB,GAAG,MAAM,CAalE"}
@@ -1,8 +1,12 @@
1
1
  // config seam — per-folder resolution. Precedence (highest wins):
2
- // env > nearest .pi/webveil.json (walking up from cwd) > global
3
- // ~/.pi/agent/webveil.json > defaults.
2
+ // env > nearest webveil.json (walking up from cwd) > global
3
+ // $XDG_CONFIG_HOME/webveil/config.json (~/.config/webveil/config.json) >
4
+ // defaults.
4
5
  // "Per folder = per account/egress." Each layer is a partial; later (lower)
5
- // layers fill gaps the higher layers leave.
6
+ // layers fill gaps the higher layers leave. The project file is a
7
+ // frontend-neutral `webveil.json` (no `.pi/`): both the pi-agnostic CLI and the
8
+ // pi extension resolve the same name, so a project is configured the same way
9
+ // regardless of which frontend reads it. See docs/adr/0002.
6
10
  import { readFileSync } from 'node:fs';
7
11
  import { homedir } from 'node:os';
8
12
  import { dirname, join, parse } from 'node:path';
@@ -12,7 +16,7 @@ const DEFAULTS = {
12
16
  egress: { mode: 'direct' },
13
17
  fetchSize: 'm',
14
18
  };
15
- const PROJECT_FILE = join('.pi', 'webveil.json');
19
+ const PROJECT_FILE = 'webveil.json';
16
20
  function readJson(path) {
17
21
  let text;
18
22
  try {
@@ -23,7 +27,7 @@ function readJson(path) {
23
27
  }
24
28
  return JSON.parse(text);
25
29
  }
26
- /** The nearest `.pi/webveil.json` walking up from `cwd` (first found wins). */
30
+ /** The nearest `webveil.json` walking up from `cwd` (first found wins). */
27
31
  function readProjectChain(cwd) {
28
32
  let dir = cwd;
29
33
  const { root } = parse(dir);
@@ -53,6 +57,15 @@ function readEnv(env) {
53
57
  layer.egress = { mode, url: env.WEBVEIL_EGRESS_URL ?? '' };
54
58
  return layer;
55
59
  }
60
+ /**
61
+ * The global config path, XDG-style: `$XDG_CONFIG_HOME/webveil/config.json`,
62
+ * falling back to `<homeDir>/.config/webveil/config.json` when XDG_CONFIG_HOME
63
+ * is unset. (`options.globalPath`, when given, bypasses this entirely.)
64
+ */
65
+ function resolveGlobalPath(env, homeDir = homedir()) {
66
+ const base = env.XDG_CONFIG_HOME || join(homeDir, '.config');
67
+ return join(base, 'webveil', 'config.json');
68
+ }
56
69
  /**
57
70
  * Resolve the effective config. Higher-precedence layers override lower ones,
58
71
  * key by key: env > project chain > global file > defaults.
@@ -60,7 +73,7 @@ function readEnv(env) {
60
73
  export function resolveConfig(options = {}) {
61
74
  const cwd = options.cwd ?? process.cwd();
62
75
  const env = options.env ?? process.env;
63
- const globalPath = options.globalPath ?? join(homedir(), '.pi', 'agent', 'webveil.json');
76
+ const globalPath = options.globalPath ?? resolveGlobalPath(env, options.homeDir);
64
77
  const layers = [
65
78
  DEFAULTS,
66
79
  readJson(globalPath) ?? {},
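Review note: the precedence described in this hunk (env > project `webveil.json` > global XDG file > defaults) can be sketched in isolation. This is a minimal sketch, not the package's code: `globalConfigPath` and `mergeLayers` are hypothetical names, the layer values are made up for illustration, and the shallow `Object.assign` merge is an assumption about how the layers are combined.

```typescript
import {join} from 'node:path';

// Illustrative subset of the config keys; the real Config type has more fields.
interface Layer {
	backend?: string;
	baseUrl?: string;
	fetchSize?: string;
}

// Hypothetical stand-in for resolveGlobalPath: XDG_CONFIG_HOME wins when set
// and non-empty, otherwise fall back to <homeDir>/.config.
function globalConfigPath(
	env: Record<string, string | undefined>,
	homeDir: string,
): string {
	const base = env.XDG_CONFIG_HOME || join(homeDir, '.config');
	return join(base, 'webveil', 'config.json');
}

// Layers are passed lowest precedence first, so later layers win key by key.
// ASSUMPTION: a shallow merge; nested objects are replaced, not deep-merged.
function mergeLayers(...layers: Layer[]): Layer {
	return Object.assign({}, ...layers);
}

// Made-up layer contents, for illustration only.
const defaults: Layer = {backend: 'searxng', baseUrl: 'http://127.0.0.1:8080', fetchSize: 'm'};
const globalFile: Layer = {fetchSize: 'l'};
const projectFile: Layer = {baseUrl: 'http://searx.internal:8888'};
const envLayer: Layer = {fetchSize: 's'};

const merged = mergeLayers(defaults, globalFile, projectFile, envLayer);
console.log(merged);
console.log(globalConfigPath({XDG_CONFIG_HOME: '/tmp/xdg'}, '/home/u'));
console.log(globalConfigPath({}, '/home/u'));
```

Because `||` is used rather than `??`, an empty `XDG_CONFIG_HOME` also falls back to `<homeDir>/.config`, matching the `env.XDG_CONFIG_HOME || join(homeDir, '.config')` line in the hunk.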
@@ -1 +1 @@
1
- {"version":3,"file":"config.js","sourceRoot":"","sources":["../../src/core/config.ts"],"names":[],"mappings":"AAAA,kEAAkE;AAClE,kEAAkE;AAClE,yCAAyC;AACzC,4EAA4E;AAC5E,4CAA4C;AAE5C,OAAO,EAAC,YAAY,EAAC,MAAM,SAAS,CAAC;AACrC,OAAO,EAAC,OAAO,EAAC,MAAM,SAAS,CAAC;AAChC,OAAO,EAAC,OAAO,EAAE,IAAI,EAAE,KAAK,EAAC,MAAM,WAAW,CAAC;AAmC/C,MAAM,QAAQ,GAAW;IACxB,OAAO,EAAE,SAAS;IAClB,OAAO,EAAE,uBAAuB;IAChC,MAAM,EAAE,EAAC,IAAI,EAAE,QAAQ,EAAC;IACxB,SAAS,EAAE,GAAG;CACd,CAAC;AAEF,MAAM,YAAY,GAAG,IAAI,CAAC,KAAK,EAAE,cAAc,CAAC,CAAC;AAEjD,SAAS,QAAQ,CAAC,IAAY;IAC7B,IAAI,IAAY,CAAC;IACjB,IAAI,CAAC;QACJ,IAAI,GAAG,YAAY,CAAC,IAAI,EAAE,MAAM,CAAC,CAAC;IACnC,CAAC;IAAC,MAAM,CAAC;QACR,OAAO,SAAS,CAAC,CAAC,mDAAmD;IACtE,CAAC;IACD,OAAO,IAAI,CAAC,KAAK,CAAC,IAAI,CAAkB,CAAC;AAC1C,CAAC;AAED,+EAA+E;AAC/E,SAAS,gBAAgB,CAAC,GAAW;IACpC,IAAI,GAAG,GAAG,GAAG,CAAC;IACd,MAAM,EAAC,IAAI,EAAC,GAAG,KAAK,CAAC,GAAG,CAAC,CAAC;IAC1B,SAAS,CAAC;QACT,MAAM,KAAK,GAAG,QAAQ,CAAC,IAAI,CAAC,GAAG,EAAE,YAAY,CAAC,CAAC,CAAC;QAChD,IAAI,KAAK;YAAE,OAAO,KAAK,CAAC;QACxB,IAAI,GAAG,KAAK,IAAI;YAAE,OAAO,SAAS,CAAC;QACnC,GAAG,GAAG,OAAO,CAAC,GAAG,CAAC,CAAC;IACpB,CAAC;AACF,CAAC;AAED,SAAS,OAAO,CAAC,GAAuC;IACvD,MAAM,KAAK,GAAkB,EAAE,CAAC;IAChC,IAAI,GAAG,CAAC,eAAe;QAAE,KAAK,CAAC,OAAO,GAAG,GAAG,CAAC,eAAe,CAAC;IAC7D,IAAI,GAAG,CAAC,gBAAgB;QAAE,KAAK,CAAC,OAAO,GAAG,GAAG,CAAC,gBAAgB,CAAC;IAC/D,IAAI,GAAG,CAAC,eAAe;QAAE,KAAK,CAAC,MAAM,GAAG,GAAG,CAAC,eAAe,CAAC;IAC5D,IAAI,GAAG,CAAC,kBAAkB;QACzB,KAAK,CAAC,SAAS,GAAG,GAAG,CAAC,kBAA+B,CAAC;IACvD,MAAM,IAAI,GAAG,GAAG,CAAC,cAAc,CAAC;IAChC,IAAI,IAAI,KAAK,QAAQ;QAAE,KAAK,CAAC,MAAM,GAAG,EAAC,IAAI,EAAE,QAAQ,EAAC,CAAC;SAClD,IAAI,IAAI,KAAK,MAAM,IAAI,IAAI,KAAK,QAAQ;QAC5C,KAAK,CAAC,MAAM,GAAG,EAAC,IAAI,EAAE,GAAG,EAAE,GAAG,CAAC,kBAAkB,IAAI,EAAE,EAAC,CAAC;IAC1D,OAAO,KAAK,CAAC;AACd,CAAC;AAED;;;GAGG;AACH,MAAM,UAAU,aAAa,CAAC,UAA0B,EAAE;IACzD,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,IAAI,OAAO,CAAC,GAAG,EAAE,CAAC;IACzC,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,IAAI,OAAO,CAAC,GAAG,CAAC;IACvC,MAAM,UAAU,GACf,OAAO,CAAC,UAAU,IAAI,IAAI,CAAC,OAAO,EAAE,EAAE,KAAK,EAAE,OAAO
,EAAE,cAAc,CAAC,CAAC;IAEvE,MAAM,MAAM,GAAoB;QAC/B,QAAQ;QACR,QAAQ,CAAC,UAAU,CAAC,IAAI,EAAE;QAC1B,gBAAgB,CAAC,GAAG,CAAC,IAAI,EAAE;QAC3B,OAAO,CAAC,GAAG,CAAC;KACZ,CAAC;IACF,OAAO,MAAM,CAAC,MAAM,CAAC,EAAE,EAAE,GAAG,MAAM,CAAW,CAAC;AAC/C,CAAC"}
1
+ {"version":3,"file":"config.js","sourceRoot":"","sources":["../../src/core/config.ts"],"names":[],"mappings":"AAAA,kEAAkE;AAClE,8DAA8D;AAC9D,2EAA2E;AAC3E,cAAc;AACd,4EAA4E;AAC5E,kEAAkE;AAClE,gFAAgF;AAChF,8EAA8E;AAC9E,4DAA4D;AAE5D,OAAO,EAAC,YAAY,EAAC,MAAM,SAAS,CAAC;AACrC,OAAO,EAAC,OAAO,EAAC,MAAM,SAAS,CAAC;AAChC,OAAO,EAAC,OAAO,EAAE,IAAI,EAAE,KAAK,EAAC,MAAM,WAAW,CAAC;AAwC/C,MAAM,QAAQ,GAAW;IACxB,OAAO,EAAE,SAAS;IAClB,OAAO,EAAE,uBAAuB;IAChC,MAAM,EAAE,EAAC,IAAI,EAAE,QAAQ,EAAC;IACxB,SAAS,EAAE,GAAG;CACd,CAAC;AAEF,MAAM,YAAY,GAAG,cAAc,CAAC;AAEpC,SAAS,QAAQ,CAAC,IAAY;IAC7B,IAAI,IAAY,CAAC;IACjB,IAAI,CAAC;QACJ,IAAI,GAAG,YAAY,CAAC,IAAI,EAAE,MAAM,CAAC,CAAC;IACnC,CAAC;IAAC,MAAM,CAAC;QACR,OAAO,SAAS,CAAC,CAAC,mDAAmD;IACtE,CAAC;IACD,OAAO,IAAI,CAAC,KAAK,CAAC,IAAI,CAAkB,CAAC;AAC1C,CAAC;AAED,2EAA2E;AAC3E,SAAS,gBAAgB,CAAC,GAAW;IACpC,IAAI,GAAG,GAAG,GAAG,CAAC;IACd,MAAM,EAAC,IAAI,EAAC,GAAG,KAAK,CAAC,GAAG,CAAC,CAAC;IAC1B,SAAS,CAAC;QACT,MAAM,KAAK,GAAG,QAAQ,CAAC,IAAI,CAAC,GAAG,EAAE,YAAY,CAAC,CAAC,CAAC;QAChD,IAAI,KAAK;YAAE,OAAO,KAAK,CAAC;QACxB,IAAI,GAAG,KAAK,IAAI;YAAE,OAAO,SAAS,CAAC;QACnC,GAAG,GAAG,OAAO,CAAC,GAAG,CAAC,CAAC;IACpB,CAAC;AACF,CAAC;AAED,SAAS,OAAO,CAAC,GAAuC;IACvD,MAAM,KAAK,GAAkB,EAAE,CAAC;IAChC,IAAI,GAAG,CAAC,eAAe;QAAE,KAAK,CAAC,OAAO,GAAG,GAAG,CAAC,eAAe,CAAC;IAC7D,IAAI,GAAG,CAAC,gBAAgB;QAAE,KAAK,CAAC,OAAO,GAAG,GAAG,CAAC,gBAAgB,CAAC;IAC/D,IAAI,GAAG,CAAC,eAAe;QAAE,KAAK,CAAC,MAAM,GAAG,GAAG,CAAC,eAAe,CAAC;IAC5D,IAAI,GAAG,CAAC,kBAAkB;QACzB,KAAK,CAAC,SAAS,GAAG,GAAG,CAAC,kBAA+B,CAAC;IACvD,MAAM,IAAI,GAAG,GAAG,CAAC,cAAc,CAAC;IAChC,IAAI,IAAI,KAAK,QAAQ;QAAE,KAAK,CAAC,MAAM,GAAG,EAAC,IAAI,EAAE,QAAQ,EAAC,CAAC;SAClD,IAAI,IAAI,KAAK,MAAM,IAAI,IAAI,KAAK,QAAQ;QAC5C,KAAK,CAAC,MAAM,GAAG,EAAC,IAAI,EAAE,GAAG,EAAE,GAAG,CAAC,kBAAkB,IAAI,EAAE,EAAC,CAAC;IAC1D,OAAO,KAAK,CAAC;AACd,CAAC;AAED;;;;GAIG;AACH,SAAS,iBAAiB,CACzB,GAAuC,EACvC,OAAO,GAAG,OAAO,EAAE;IAEnB,MAAM,IAAI,GAAG,GAAG,CAAC,eAAe,IAAI,IAAI,CAAC,OAAO,EAAE,SAAS,CAAC,CAAC;IAC7D,OAAO,IAAI,CAAC,IAAI,EAAE,SAAS,EAAE,aAAa,CAAC,CAAC;AAC7C,CAAC;AAED;;;GAGG;AACH,MAA
M,UAAU,aAAa,CAAC,UAA0B,EAAE;IACzD,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,IAAI,OAAO,CAAC,GAAG,EAAE,CAAC;IACzC,MAAM,GAAG,GAAG,OAAO,CAAC,GAAG,IAAI,OAAO,CAAC,GAAG,CAAC;IACvC,MAAM,UAAU,GACf,OAAO,CAAC,UAAU,IAAI,iBAAiB,CAAC,GAAG,EAAE,OAAO,CAAC,OAAO,CAAC,CAAC;IAE/D,MAAM,MAAM,GAAoB;QAC/B,QAAQ;QACR,QAAQ,CAAC,UAAU,CAAC,IAAI,EAAE;QAC1B,gBAAgB,CAAC,GAAG,CAAC,IAAI,EAAE;QAC3B,OAAO,CAAC,GAAG,CAAC;KACZ,CAAC;IACF,OAAO,MAAM,CAAC,MAAM,CAAC,EAAE,EAAE,GAAG,MAAM,CAAW,CAAC;AAC/C,CAAC"}
@@ -6,6 +6,22 @@ export declare class EgressError extends Error {
6
6
  cause?: unknown;
7
7
  });
8
8
  }
9
+ /**
10
+ * Fail-loud guard for the false-confidence combo: a `unix:` (local-socket)
11
+ * backend `baseUrl` configured with a NON-direct egress (`http`/`socks5`). A
12
+ * Unix socket is inherently local, so proxying that hop is the same fake-
13
+ * anonymity footgun as proxying a loopback TCP baseUrl: webveil would route a
14
+ * pointless local call through the proxy while the backend (SearXNG) crawls the
15
+ * public web from the real IP, OUTSIDE webveil's egress. Refuse it and point at
16
+ * the real fix.
17
+ *
18
+ * OVERLAP SEAM (recorded): this is the loopback false-confidence family. The
19
+ * sibling task `fail-loud-on-proxied-loopback-backend` adds the broader guard
20
+ * for loopback TCP baseUrls (127.0.0.0/8, ::1, localhost). When it lands, fold
21
+ * THIS `unix:`-is-loopback-equivalent case into that single guard instead of
22
+ * keeping a parallel check here.
23
+ */
24
+ export declare function assertEgressAllowsBaseUrl(cfg: Config): void;
9
25
  /**
10
26
  * Build the undici Dispatcher for the config's egress mode:
11
27
  * - direct → undefined (undici uses its default, un-proxied transport)
@@ -1 +1 @@
1
- {"version":3,"file":"egress.d.ts","sourceRoot":"","sources":["../../src/core/egress.ts"],"names":[],"mappings":"AAQA,OAAO,EAAC,KAAK,EAAE,KAAK,UAAU,EAAE,UAAU,EAAuB,MAAM,QAAQ,CAAC;AAEhF,OAAO,KAAK,EAAC,MAAM,EAAS,MAAM,aAAa,CAAC;AAEhD,8EAA8E;AAC9E,qBAAa,WAAY,SAAQ,KAAK;gBACzB,OAAO,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE;QAAC,KAAK,CAAC,EAAE,OAAO,CAAA;KAAC;CAIxD;AAqBD;;;;;;;;GAQG;AACH,wBAAgB,eAAe,CAAC,GAAG,EAAE,MAAM,GAAG,UAAU,GAAG,SAAS,CAiCnE;AAED,uEAAuE;AACvE,MAAM,MAAM,WAAW,GAAG,OAAO,UAAU,CAAC,KAAK,CAAC;AAElD;;;;;GAKG;AACH,wBAAgB,iBAAiB,CAAC,GAAG,EAAE,MAAM,GAAG,WAAW,CAU1D;AAED,YAAY,EAAC,UAAU,EAAC,CAAC;AACzB,OAAO,EAAC,KAAK,EAAE,UAAU,EAAC,CAAC"}
1
+ {"version":3,"file":"egress.d.ts","sourceRoot":"","sources":["../../src/core/egress.ts"],"names":[],"mappings":"AAQA,OAAO,EAAC,KAAK,EAAE,KAAK,UAAU,EAAE,UAAU,EAAuB,MAAM,QAAQ,CAAC;AAEhF,OAAO,KAAK,EAAC,MAAM,EAAS,MAAM,aAAa,CAAC;AAGhD,8EAA8E;AAC9E,qBAAa,WAAY,SAAQ,KAAK;gBACzB,OAAO,EAAE,MAAM,EAAE,OAAO,CAAC,EAAE;QAAC,KAAK,CAAC,EAAE,OAAO,CAAA;KAAC;CAIxD;AAED;;;;;;;;;;;;;;GAcG;AACH,wBAAgB,yBAAyB,CAAC,GAAG,EAAE,MAAM,GAAG,IAAI,CAU3D;AAqBD;;;;;;;;GAQG;AACH,wBAAgB,eAAe,CAAC,GAAG,EAAE,MAAM,GAAG,UAAU,GAAG,SAAS,CAiCnE;AAED,uEAAuE;AACvE,MAAM,MAAM,WAAW,GAAG,OAAO,UAAU,CAAC,KAAK,CAAC;AAElD;;;;;GAKG;AACH,wBAAgB,iBAAiB,CAAC,GAAG,EAAE,MAAM,GAAG,WAAW,CAU1D;AAED,YAAY,EAAC,UAAU,EAAC,CAAC;AACzB,OAAO,EAAC,KAAK,EAAE,UAAU,EAAC,CAAC"}
@@ -7,6 +7,7 @@
7
7
  // fall back to un-proxied (direct) transport.
8
8
  import { Agent, ProxyAgent, fetch as undiciFetch } from 'undici';
9
9
  import { socksDispatcher } from 'fetch-socks';
10
+ import { isUnixBaseUrl } from './baseurl.js';
10
11
  /** Thrown when a configured egress proxy cannot be built. Never swallowed. */
11
12
  export class EgressError extends Error {
12
13
  constructor(message, options) {
@@ -14,6 +15,31 @@ export class EgressError extends Error {
14
15
  this.name = 'EgressError';
15
16
  }
16
17
  }
18
+ /**
19
+ * Fail-loud guard for the false-confidence combo: a `unix:` (local-socket)
20
+ * backend `baseUrl` configured with a NON-direct egress (`http`/`socks5`). A
21
+ * Unix socket is inherently local, so proxying that hop is the same fake-
22
+ * anonymity footgun as proxying a loopback TCP baseUrl: webveil would route a
23
+ * pointless local call through the proxy while the backend (SearXNG) crawls the
24
+ * public web from the real IP, OUTSIDE webveil's egress. Refuse it and point at
25
+ * the real fix.
26
+ *
27
+ * OVERLAP SEAM (recorded): this is the loopback false-confidence family. The
28
+ * sibling task `fail-loud-on-proxied-loopback-backend` adds the broader guard
29
+ * for loopback TCP baseUrls (127.0.0.0/8, ::1, localhost). When it lands, fold
30
+ * THIS `unix:`-is-loopback-equivalent case into that single guard instead of
31
+ * keeping a parallel check here.
32
+ */
33
+ export function assertEgressAllowsBaseUrl(cfg) {
34
+ if (cfg.egress.mode === 'direct')
35
+ return;
36
+ if (isUnixBaseUrl(cfg.baseUrl))
37
+ throw new EgressError(`egress ${cfg.egress.mode}: a unix: (local socket) baseUrl cannot be ` +
38
+ `proxied — it is inherently local, so proxying it gives fake ` +
39
+ `anonymity (SearXNG still crawls the web from your real IP). Set ` +
40
+ `egress=direct and proxy the backend itself (SearXNG's ` +
41
+ `outgoing.proxies), or use a remote backend.`);
42
+ }
17
43
  function socksFromUrl(raw) {
18
44
  const url = new URL(raw); // throws on a malformed proxy URL → fail loud
19
45
  const protocol = url.protocol.replace(':', '');
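Review note: the guard added in this hunk refuses exactly one combination, a `unix:` baseUrl with a non-direct egress. The sketch below restates it without undici so the accepted and refused cases are easy to see. `GuardConfig` and `outcome` are hypothetical names introduced here, and the proxy URL is an illustrative value.

```typescript
// Reduced config shape: only the two fields the guard reads.
type Egress = {mode: 'direct'} | {mode: 'http' | 'socks5'; url: string};

interface GuardConfig {
	baseUrl: string;
	egress: Egress;
}

class EgressError extends Error {
	constructor(message: string) {
		super(message);
		this.name = 'EgressError';
	}
}

// Same decision as the guard in the diff, with a shortened message.
function assertEgressAllowsBaseUrl(cfg: GuardConfig): void {
	if (cfg.egress.mode === 'direct') return;
	if (cfg.baseUrl.startsWith('unix:'))
		throw new EgressError(
			`egress ${cfg.egress.mode}: a unix: (local socket) baseUrl cannot be proxied`,
		);
}

// Hypothetical helper: report the guard's decision as a string.
function outcome(cfg: GuardConfig): string {
	try {
		assertEgressAllowsBaseUrl(cfg);
		return 'allowed';
	} catch (error) {
		return error instanceof Error && error.name === 'EgressError'
			? 'refused'
			: 'unexpected';
	}
}

const proxy: Egress = {mode: 'socks5', url: 'socks5://127.0.0.1:9050'};

console.log(outcome({baseUrl: 'unix:/usr/local/searxng/run/socket', egress: {mode: 'direct'}}));
console.log(outcome({baseUrl: 'unix:/usr/local/searxng/run/socket', egress: proxy}));
console.log(outcome({baseUrl: 'http://127.0.0.1:8080', egress: proxy}));
```

The third case is still allowed: as the OVERLAP SEAM comment records, the broader guard for loopback TCP baseUrls belongs to a sibling task and is not part of this change.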
@@ -1 +1 @@
1
- {"version":3,"file":"egress.js","sourceRoot":"","sources":["../../src/core/egress.ts"],"names":[],"mappings":"AAAA,+EAA+E;AAC/E,gFAAgF;AAChF,+EAA+E;AAC/E,EAAE;AACF,uEAAuE;AACvE,4EAA4E;AAC5E,8CAA8C;AAE9C,OAAO,EAAC,KAAK,EAAmB,UAAU,EAAE,KAAK,IAAI,WAAW,EAAC,MAAM,QAAQ,CAAC;AAChF,OAAO,EAAC,eAAe,EAAC,MAAM,aAAa,CAAC;AAG5C,8EAA8E;AAC9E,MAAM,OAAO,WAAY,SAAQ,KAAK;IACrC,YAAY,OAAe,EAAE,OAA2B;QACvD,KAAK,CAAC,OAAO,EAAE,OAAO,CAAC,CAAC;QACxB,IAAI,CAAC,IAAI,GAAG,aAAa,CAAC;IAC3B,CAAC;CACD;AAED,SAAS,YAAY,CAAC,GAAW;IAChC,MAAM,GAAG,GAAG,IAAI,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,8CAA8C;IACxE,MAAM,QAAQ,GAAG,GAAG,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,EAAE,EAAE,CAAC,CAAC;IAC/C,IAAI,QAAQ,KAAK,QAAQ,IAAI,QAAQ,KAAK,OAAO,IAAI,QAAQ,KAAK,SAAS;QAC1E,MAAM,IAAI,WAAW,CACpB,sDAAsD,GAAG,EAAE,CAC3D,CAAC;IACH,MAAM,IAAI,GAAG,MAAM,CAAC,GAAG,CAAC,IAAI,CAAC,CAAC;IAC9B,IAAI,CAAC,GAAG,CAAC,QAAQ,IAAI,CAAC,MAAM,CAAC,SAAS,CAAC,IAAI,CAAC,IAAI,IAAI,IAAI,CAAC;QACxD,MAAM,IAAI,WAAW,CAAC,uCAAuC,GAAG,EAAE,CAAC,CAAC;IACrE,OAAO,eAAe,CAAC;QACtB,IAAI,EAAE,CAAC;QACP,IAAI,EAAE,GAAG,CAAC,QAAQ;QAClB,IAAI;QACJ,MAAM,EAAE,GAAG,CAAC,QAAQ,IAAI,SAAS;QACjC,QAAQ,EAAE,GAAG,CAAC,QAAQ,IAAI,SAAS;KACnC,CAAC,CAAC;AACJ,CAAC;AAED;;;;;;;;GAQG;AACH,MAAM,UAAU,eAAe,CAAC,GAAW;IAC1C,MAAM,MAAM,GAAW,GAAG,CAAC,MAAM,CAAC;IAClC,QAAQ,MAAM,CAAC,IAAI,EAAE,CAAC;QACrB,KAAK,QAAQ;YACZ,OAAO,SAAS,CAAC;QAClB,KAAK,MAAM;YACV,IAAI,CAAC;gBACJ,IAAI,CAAC,MAAM,CAAC,GAAG;oBAAE,MAAM,IAAI,KAAK,CAAC,mBAAmB,CAAC,CAAC;gBACtD,OAAO,IAAI,UAAU,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;YACnC,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBAChB,MAAM,IAAI,WAAW,CACpB,0CAA0C,MAAM,CAAC,GAAG,EAAE,EACtD,EAAC,KAAK,EAAC,CACP,CAAC;YACH,CAAC;QACF,KAAK,QAAQ;YACZ,IAAI,CAAC;gBACJ,IAAI,CAAC,MAAM,CAAC,GAAG;oBAAE,MAAM,IAAI,KAAK,CAAC,mBAAmB,CAAC,CAAC;gBACtD,OAAO,YAAY,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;YACjC,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBAChB,IAAI,KAAK,YAAY,WAAW;oBAAE,MAAM,KAAK,CAAC;gBAC9C,MAAM,IAAI,WAAW,CACpB,4CAA4C,MAAM,CAAC,GAAG,EAAE,EACxD,EAAC,KAAK,EAAC,CACP,CAAC;YACH,CAAC;QACF,OAAO,CAAC,CAAC,CAAC;YACT,MAAM,UAAU,GAAU,MAAM,CAAC;YACjC,MAAM,IAAI
,WAAW,CACpB,wBAAwB,IAAI,CAAC,SAAS,CAAC,UAAU,CAAC,EAAE,CACpD,CAAC;QACH,CAAC;IACF,CAAC;AACF,CAAC;AAKD;;;;;GAKG;AACH,MAAM,UAAU,iBAAiB,CAAC,GAAW;IAC5C,MAAM,UAAU,GAAG,eAAe,CAAC,GAAG,CAAC,CAAC;IACxC,OAAO,CAAC,CAAC,KAAwB,EAAE,IAAkB,EAAE,EAAE,CACxD,WAAW,CACV,KAAc,EACd;QACC,GAAI,CAAC,IAAI,IAAI,EAAE,CAA6B;QAC5C,UAAU;KACD,CACV,CAAgB,CAAC;AACpB,CAAC;AAGD,OAAO,EAAC,KAAK,EAAE,UAAU,EAAC,CAAC"}
1
+ {"version":3,"file":"egress.js","sourceRoot":"","sources":["../../src/core/egress.ts"],"names":[],"mappings":"AAAA,+EAA+E;AAC/E,gFAAgF;AAChF,+EAA+E;AAC/E,EAAE;AACF,uEAAuE;AACvE,4EAA4E;AAC5E,8CAA8C;AAE9C,OAAO,EAAC,KAAK,EAAmB,UAAU,EAAE,KAAK,IAAI,WAAW,EAAC,MAAM,QAAQ,CAAC;AAChF,OAAO,EAAC,eAAe,EAAC,MAAM,aAAa,CAAC;AAE5C,OAAO,EAAC,aAAa,EAAC,MAAM,cAAc,CAAC;AAE3C,8EAA8E;AAC9E,MAAM,OAAO,WAAY,SAAQ,KAAK;IACrC,YAAY,OAAe,EAAE,OAA2B;QACvD,KAAK,CAAC,OAAO,EAAE,OAAO,CAAC,CAAC;QACxB,IAAI,CAAC,IAAI,GAAG,aAAa,CAAC;IAC3B,CAAC;CACD;AAED;;;;;;;;;;;;;;GAcG;AACH,MAAM,UAAU,yBAAyB,CAAC,GAAW;IACpD,IAAI,GAAG,CAAC,MAAM,CAAC,IAAI,KAAK,QAAQ;QAAE,OAAO;IACzC,IAAI,aAAa,CAAC,GAAG,CAAC,OAAO,CAAC;QAC7B,MAAM,IAAI,WAAW,CACpB,UAAU,GAAG,CAAC,MAAM,CAAC,IAAI,6CAA6C;YACrE,8DAA8D;YAC9D,kEAAkE;YAClE,wDAAwD;YACxD,6CAA6C,CAC9C,CAAC;AACJ,CAAC;AAED,SAAS,YAAY,CAAC,GAAW;IAChC,MAAM,GAAG,GAAG,IAAI,GAAG,CAAC,GAAG,CAAC,CAAC,CAAC,8CAA8C;IACxE,MAAM,QAAQ,GAAG,GAAG,CAAC,QAAQ,CAAC,OAAO,CAAC,GAAG,EAAE,EAAE,CAAC,CAAC;IAC/C,IAAI,QAAQ,KAAK,QAAQ,IAAI,QAAQ,KAAK,OAAO,IAAI,QAAQ,KAAK,SAAS;QAC1E,MAAM,IAAI,WAAW,CACpB,sDAAsD,GAAG,EAAE,CAC3D,CAAC;IACH,MAAM,IAAI,GAAG,MAAM,CAAC,GAAG,CAAC,IAAI,CAAC,CAAC;IAC9B,IAAI,CAAC,GAAG,CAAC,QAAQ,IAAI,CAAC,MAAM,CAAC,SAAS,CAAC,IAAI,CAAC,IAAI,IAAI,IAAI,CAAC;QACxD,MAAM,IAAI,WAAW,CAAC,uCAAuC,GAAG,EAAE,CAAC,CAAC;IACrE,OAAO,eAAe,CAAC;QACtB,IAAI,EAAE,CAAC;QACP,IAAI,EAAE,GAAG,CAAC,QAAQ;QAClB,IAAI;QACJ,MAAM,EAAE,GAAG,CAAC,QAAQ,IAAI,SAAS;QACjC,QAAQ,EAAE,GAAG,CAAC,QAAQ,IAAI,SAAS;KACnC,CAAC,CAAC;AACJ,CAAC;AAED;;;;;;;;GAQG;AACH,MAAM,UAAU,eAAe,CAAC,GAAW;IAC1C,MAAM,MAAM,GAAW,GAAG,CAAC,MAAM,CAAC;IAClC,QAAQ,MAAM,CAAC,IAAI,EAAE,CAAC;QACrB,KAAK,QAAQ;YACZ,OAAO,SAAS,CAAC;QAClB,KAAK,MAAM;YACV,IAAI,CAAC;gBACJ,IAAI,CAAC,MAAM,CAAC,GAAG;oBAAE,MAAM,IAAI,KAAK,CAAC,mBAAmB,CAAC,CAAC;gBACtD,OAAO,IAAI,UAAU,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;YACnC,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBAChB,MAAM,IAAI,WAAW,CACpB,0CAA0C,MAAM,CAAC,GAAG,EAAE,EACtD,EAAC,KAAK,EAAC,CACP,CAAC;YACH,CAAC;QACF,KAAK,QAAQ;YACZ,IAAI,CAAC;gBACJ,IAAI,CAAC,MAAM,CAAC,GAAG;oBAAE,MAAM
,IAAI,KAAK,CAAC,mBAAmB,CAAC,CAAC;gBACtD,OAAO,YAAY,CAAC,MAAM,CAAC,GAAG,CAAC,CAAC;YACjC,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBAChB,IAAI,KAAK,YAAY,WAAW;oBAAE,MAAM,KAAK,CAAC;gBAC9C,MAAM,IAAI,WAAW,CACpB,4CAA4C,MAAM,CAAC,GAAG,EAAE,EACxD,EAAC,KAAK,EAAC,CACP,CAAC;YACH,CAAC;QACF,OAAO,CAAC,CAAC,CAAC;YACT,MAAM,UAAU,GAAU,MAAM,CAAC;YACjC,MAAM,IAAI,WAAW,CACpB,wBAAwB,IAAI,CAAC,SAAS,CAAC,UAAU,CAAC,EAAE,CACpD,CAAC;QACH,CAAC;IACF,CAAC;AACF,CAAC;AAKD;;;;;GAKG;AACH,MAAM,UAAU,iBAAiB,CAAC,GAAW;IAC5C,MAAM,UAAU,GAAG,eAAe,CAAC,GAAG,CAAC,CAAC;IACxC,OAAO,CAAC,CAAC,KAAwB,EAAE,IAAkB,EAAE,EAAE,CACxD,WAAW,CACV,KAAc,EACd;QACC,GAAI,CAAC,IAAI,IAAI,EAAE,CAA6B;QAC5C,UAAU;KACD,CACV,CAAgB,CAAC;AACpB,CAAC;AAGD,OAAO,EAAC,KAAK,EAAE,UAAU,EAAC,CAAC"}
@@ -1,5 +1,6 @@
1
1
  import type { Config, ResolveOptions } from './config.js';
2
2
  import type { Dispatcher } from './egress.js';
3
+ import type { BackendTransport } from './baseurl.js';
3
4
  import type { Http, SearchOptions, SearchResult } from './backends/types.js';
4
5
  /**
5
6
  * Collaborators, seamed so the core is testable WITHOUT real config files,
@@ -11,6 +12,8 @@ import type { Http, SearchOptions, SearchResult } from './backends/types.js';
11
12
  export interface SearchDeps {
12
13
  resolveConfig?: (options?: ResolveOptions) => Config;
13
14
  buildDispatcher?: (config: Config) => Dispatcher | undefined;
15
+ assertEgressAllowsBaseUrl?: (config: Config) => void;
16
+ resolveBackendTransport?: (baseUrl: string) => BackendTransport;
14
17
  createHttp?: (dispatcher: Dispatcher | undefined) => Http;
15
18
  getBackend?: (name: string, config: Config) => {
16
19
  search: (query: string, http: Http, options?: SearchOptions) => Promise<SearchResult[]>;
@@ -1 +1 @@
1
- {"version":3,"file":"search.d.ts","sourceRoot":"","sources":["../../src/core/search.ts"],"names":[],"mappings":"AAcA,OAAO,KAAK,EAAC,MAAM,EAAE,cAAc,EAAC,MAAM,aAAa,CAAC;AAExD,OAAO,KAAK,EAAC,UAAU,EAAC,MAAM,aAAa,CAAC;AAG5C,OAAO,KAAK,EAAC,IAAI,EAAE,aAAa,EAAE,YAAY,EAAC,MAAM,qBAAqB,CAAC;AAU3E;;;;;;GAMG;AACH,MAAM,WAAW,UAAU;IAC1B,aAAa,CAAC,EAAE,CAAC,OAAO,CAAC,EAAE,cAAc,KAAK,MAAM,CAAC;IACrD,eAAe,CAAC,EAAE,CAAC,MAAM,EAAE,MAAM,KAAK,UAAU,GAAG,SAAS,CAAC;IAC7D,UAAU,CAAC,EAAE,CAAC,UAAU,EAAE,UAAU,GAAG,SAAS,KAAK,IAAI,CAAC;IAC1D,UAAU,CAAC,EAAE,CACZ,IAAI,EAAE,MAAM,EACZ,MAAM,EAAE,MAAM,KACV;QACJ,MAAM,EAAE,CACP,KAAK,EAAE,MAAM,EACb,IAAI,EAAE,IAAI,EACV,OAAO,CAAC,EAAE,aAAa,KACnB,OAAO,CAAC,YAAY,EAAE,CAAC,CAAC;KAC7B,CAAC;CACF;AAED,iFAAiF;AACjF,MAAM,WAAW,iBAAkB,SAAQ,aAAa,EAAE,cAAc;CAAG;AAc3E;;;;;;;GAOG;AACH,wBAAsB,MAAM,CAC3B,KAAK,EAAE,MAAM,EACb,OAAO,GAAE,iBAAsB,EAC/B,IAAI,GAAE,UAAe,GACnB,OAAO,CAAC,YAAY,EAAE,CAAC,CAwBzB"}
1
+ {"version":3,"file":"search.d.ts","sourceRoot":"","sources":["../../src/core/search.ts"],"names":[],"mappings":"AAcA,OAAO,KAAK,EAAC,MAAM,EAAE,cAAc,EAAC,MAAM,aAAa,CAAC;AAKxD,OAAO,KAAK,EAAC,UAAU,EAAC,MAAM,aAAa,CAAC;AAE5C,OAAO,KAAK,EAAC,gBAAgB,EAAC,MAAM,cAAc,CAAC;AAGnD,OAAO,KAAK,EAAC,IAAI,EAAE,aAAa,EAAE,YAAY,EAAC,MAAM,qBAAqB,CAAC;AAU3E;;;;;;GAMG;AACH,MAAM,WAAW,UAAU;IAC1B,aAAa,CAAC,EAAE,CAAC,OAAO,CAAC,EAAE,cAAc,KAAK,MAAM,CAAC;IACrD,eAAe,CAAC,EAAE,CAAC,MAAM,EAAE,MAAM,KAAK,UAAU,GAAG,SAAS,CAAC;IAC7D,yBAAyB,CAAC,EAAE,CAAC,MAAM,EAAE,MAAM,KAAK,IAAI,CAAC;IACrD,uBAAuB,CAAC,EAAE,CAAC,OAAO,EAAE,MAAM,KAAK,gBAAgB,CAAC;IAChE,UAAU,CAAC,EAAE,CAAC,UAAU,EAAE,UAAU,GAAG,SAAS,KAAK,IAAI,CAAC;IAC1D,UAAU,CAAC,EAAE,CACZ,IAAI,EAAE,MAAM,EACZ,MAAM,EAAE,MAAM,KACV;QACJ,MAAM,EAAE,CACP,KAAK,EAAE,MAAM,EACb,IAAI,EAAE,IAAI,EACV,OAAO,CAAC,EAAE,aAAa,KACnB,OAAO,CAAC,YAAY,EAAE,CAAC,CAAC;KAC7B,CAAC;CACF;AAED,iFAAiF;AACjF,MAAM,WAAW,iBAAkB,SAAQ,aAAa,EAAE,cAAc;CAAG;AAc3E;;;;;;;GAOG;AACH,wBAAsB,MAAM,CAC3B,KAAK,EAAE,MAAM,EACb,OAAO,GAAE,iBAAsB,EAC/B,IAAI,GAAE,UAAe,GACnB,OAAO,CAAC,YAAY,EAAE,CAAC,CAqDzB"}
@@ -11,7 +11,8 @@
11
11
  // and bypass the configured egress. A configured-but-unbuildable proxy throws at
12
12
  // buildDispatcher (fail-loud), never silently un-proxied.
13
13
  import { resolveConfig as defaultResolveConfig } from './config.js';
14
- import { buildDispatcher as defaultBuildDispatcher } from './egress.js';
14
+ import { buildDispatcher as defaultBuildDispatcher, assertEgressAllowsBaseUrl as defaultAssertEgressAllowsBaseUrl, } from './egress.js';
15
+ import { resolveBackendTransport as defaultResolveBackendTransport } from './baseurl.js';
15
16
  import { createHttp as defaultCreateHttp } from './http.js';
16
17
  import { getBackend as defaultGetBackend } from './backends/registry.js';
17
18
  /**
@@ -44,6 +45,8 @@ function dedup(results) {
44
45
  export async function search(query, options = {}, deps = {}) {
45
46
  const resolveConfig = deps.resolveConfig ?? defaultResolveConfig;
46
47
  const buildDispatcher = deps.buildDispatcher ?? defaultBuildDispatcher;
48
+ const assertEgressAllowsBaseUrl = deps.assertEgressAllowsBaseUrl ?? defaultAssertEgressAllowsBaseUrl;
49
+ const resolveBackendTransport = deps.resolveBackendTransport ?? defaultResolveBackendTransport;
47
50
  const createHttp = deps.createHttp ?? defaultCreateHttp;
48
51
  const getBackend = deps.getBackend ?? defaultGetBackend;
49
52
  const config = resolveConfig({
@@ -51,14 +54,38 @@ export async function search(query, options = {}, deps = {}) {
51
54
  env: options.env,
52
55
  globalPath: options.globalPath,
53
56
  });
54
- // Build the dispatcher FIRST: a configured-but-unbuildable proxy throws here,
55
- // before any network access (never an un-proxied request).
56
- const dispatcher = buildDispatcher(config);
57
+ // Fail loud on the false-confidence combo (a local `unix:` socket baseUrl
58
+ // behind a proxy egress) BEFORE any transport is built.
59
+ assertEgressAllowsBaseUrl(config);
60
+ // Resolve the BACKEND-hop transport. For a normal TCP baseUrl this is a no-op
61
+ // (no per-hop dispatcher); for a `unix:` baseUrl it yields a socket-bound
62
+ // `Agent` and a synthetic `http://localhost…` base the backend builds on. The
63
+ // socket transport is scoped to THIS hop only and is NEVER bound into the
64
+ // shared config-wide egress dispatcher, so `web_fetch` egress is unaffected.
65
+ const transport = resolveBackendTransport(config.baseUrl);
66
+ // Choose the dispatcher BEFORE any network access. For a `unix:` baseUrl the
67
+ // per-hop socket Agent is used and buildDispatcher is skipped; otherwise a
68
+ // configured-but-unbuildable proxy throws here (never an un-proxied request).
69
+ const dispatcher = transport.dispatcher ?? buildDispatcher(config);
57
70
  const http = createHttp(dispatcher);
58
- const backend = getBackend(config.backend, config);
71
+ // The backend stays transport-unaware: it receives a config whose baseUrl is
72
+ // always a real `http(s):` base (the `unix:` form is rewritten away here).
73
+ const backendConfig = transport.baseUrl === config.baseUrl
74
+ ? config
75
+ : { ...config, baseUrl: transport.baseUrl };
76
+ const backend = getBackend(backendConfig.backend, backendConfig);
59
77
  // Hand the backend ONLY the proxied helper (no maxResults: dedup happens
60
78
  // here, over the full set, so the clamp below is over UNIQUE results).
61
- const raw = await backend.search(query, http, { signal: options.signal });
79
+ let raw;
80
+ try {
81
+ raw = await backend.search(query, http, { signal: options.signal });
82
+ }
83
+ finally {
84
+ // Best-effort close of the per-hop socket Agent (the shared egress
85
+ // dispatcher, owned by config, is NOT touched here).
86
+ if (transport.dispatcher)
87
+ void transport.dispatcher.close();
88
+ }
62
89
  const maxResults = options.maxResults ?? DEFAULT_MAX_RESULTS;
63
90
  return dedup(raw).slice(0, maxResults);
64
91
  }
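Review note: the dispatcher selection and baseUrl rewrite in this hunk can be checked without any network code. The sketch below uses plain strings in place of undici dispatchers. `planBackendHop`, `HopConfig` and `Transport` are hypothetical names introduced for illustration only.

```typescript
// Stand-ins: D is whatever dispatcher type the caller uses (strings here).
interface Transport<D> {
	baseUrl: string;
	dispatcher?: D;
}

interface HopConfig {
	backend: string;
	baseUrl: string;
}

// Mirrors the two decisions in search(): a per-hop dispatcher takes priority
// over the shared egress one, and the backend only ever sees an http(s) base.
function planBackendHop<D>(
	config: HopConfig,
	transport: Transport<D>,
	buildDispatcher: () => D | undefined,
): {dispatcher: D | undefined; backendConfig: HopConfig} {
	const dispatcher = transport.dispatcher ?? buildDispatcher();
	const backendConfig =
		transport.baseUrl === config.baseUrl
			? config
			: {...config, baseUrl: transport.baseUrl};
	return {dispatcher, backendConfig};
}

let builds = 0;
const build = (): string => {
	builds += 1;
	return 'egress-dispatcher';
};

// TCP baseUrl: transport is a no-op, so the shared egress dispatcher is built.
const tcpConfig: HopConfig = {backend: 'searxng', baseUrl: 'http://127.0.0.1:8080'};
const tcp = planBackendHop(tcpConfig, {baseUrl: tcpConfig.baseUrl}, build);

// Socket baseUrl: the per-hop agent is used and build() is never called.
const sockConfig: HopConfig = {backend: 'searxng', baseUrl: 'unix:/usr/local/searxng/run/socket'};
const sock = planBackendHop(
	sockConfig,
	{baseUrl: 'http://localhost', dispatcher: 'socket-agent'},
	build,
);

console.log(tcp.dispatcher, sock.dispatcher, sock.backendConfig.baseUrl, builds);
```

One consequence worth noting: with a socket baseUrl, `buildDispatcher` is not invoked at all, which is safe only because the guard has already required `direct` egress for that case.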
@@ -1 +1 @@
1
- {"version":3,"file":"search.js","sourceRoot":"","sources":["../../src/core/search.ts"],"names":[],"mappings":"AAAA,6EAA6E;AAC7E,uEAAuE;AACvE,8EAA8E;AAC9E,EAAE;AACF,+EAA+E;AAC/E,8EAA8E;AAC9E,2EAA2E;AAC3E,EAAE;AACF,uEAAuE;AACvE,+EAA+E;AAC/E,iFAAiF;AACjF,0DAA0D;AAE1D,OAAO,EAAC,aAAa,IAAI,oBAAoB,EAAC,MAAM,aAAa,CAAC;AAElE,OAAO,EAAC,eAAe,IAAI,sBAAsB,EAAC,MAAM,aAAa,CAAC;AAEtE,OAAO,EAAC,UAAU,IAAI,iBAAiB,EAAC,MAAM,WAAW,CAAC;AAC1D,OAAO,EAAC,UAAU,IAAI,iBAAiB,EAAC,MAAM,wBAAwB,CAAC;AAGvE;;;;;GAKG;AACH,MAAM,mBAAmB,GAAG,EAAE,CAAC;AA4B/B,sEAAsE;AACtE,SAAS,KAAK,CAAC,OAAuB;IACrC,MAAM,IAAI,GAAG,IAAI,GAAG,EAAU,CAAC;IAC/B,MAAM,GAAG,GAAmB,EAAE,CAAC;IAC/B,KAAK,MAAM,CAAC,IAAI,OAAO,EAAE,CAAC;QACzB,IAAI,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC;YAAE,SAAS;QAC9B,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC;QAChB,GAAG,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IACb,CAAC;IACD,OAAO,GAAG,CAAC;AACZ,CAAC;AAED;;;;;;;GAOG;AACH,MAAM,CAAC,KAAK,UAAU,MAAM,CAC3B,KAAa,EACb,UAA6B,EAAE,EAC/B,OAAmB,EAAE;IAErB,MAAM,aAAa,GAAG,IAAI,CAAC,aAAa,IAAI,oBAAoB,CAAC;IACjE,MAAM,eAAe,GAAG,IAAI,CAAC,eAAe,IAAI,sBAAsB,CAAC;IACvE,MAAM,UAAU,GAAG,IAAI,CAAC,UAAU,IAAI,iBAAiB,CAAC;IACxD,MAAM,UAAU,GAAG,IAAI,CAAC,UAAU,IAAI,iBAAiB,CAAC;IAExD,MAAM,MAAM,GAAG,aAAa,CAAC;QAC5B,GAAG,EAAE,OAAO,CAAC,GAAG;QAChB,GAAG,EAAE,OAAO,CAAC,GAAG;QAChB,UAAU,EAAE,OAAO,CAAC,UAAU;KAC9B,CAAC,CAAC;IAEH,8EAA8E;IAC9E,2DAA2D;IAC3D,MAAM,UAAU,GAAG,eAAe,CAAC,MAAM,CAAC,CAAC;IAC3C,MAAM,IAAI,GAAG,UAAU,CAAC,UAAU,CAAC,CAAC;IAEpC,MAAM,OAAO,GAAG,UAAU,CAAC,MAAM,CAAC,OAAO,EAAE,MAAM,CAAC,CAAC;IACnD,yEAAyE;IACzE,uEAAuE;IACvE,MAAM,GAAG,GAAG,MAAM,OAAO,CAAC,MAAM,CAAC,KAAK,EAAE,IAAI,EAAE,EAAC,MAAM,EAAE,OAAO,CAAC,MAAM,EAAC,CAAC,CAAC;IAExE,MAAM,UAAU,GAAG,OAAO,CAAC,UAAU,IAAI,mBAAmB,CAAC;IAC7D,OAAO,KAAK,CAAC,GAAG,CAAC,CAAC,KAAK,CAAC,CAAC,EAAE,UAAU,CAAC,CAAC;AACxC,CAAC"}
1
+ {"version":3,"file":"search.js","sourceRoot":"","sources":["../../src/core/search.ts"],"names":[],"mappings":"AAAA,6EAA6E;AAC7E,uEAAuE;AACvE,8EAA8E;AAC9E,EAAE;AACF,+EAA+E;AAC/E,8EAA8E;AAC9E,2EAA2E;AAC3E,EAAE;AACF,uEAAuE;AACvE,+EAA+E;AAC/E,iFAAiF;AACjF,0DAA0D;AAE1D,OAAO,EAAC,aAAa,IAAI,oBAAoB,EAAC,MAAM,aAAa,CAAC;AAElE,OAAO,EACN,eAAe,IAAI,sBAAsB,EACzC,yBAAyB,IAAI,gCAAgC,GAC7D,MAAM,aAAa,CAAC;AAErB,OAAO,EAAC,uBAAuB,IAAI,8BAA8B,EAAC,MAAM,cAAc,CAAC;AAEvF,OAAO,EAAC,UAAU,IAAI,iBAAiB,EAAC,MAAM,WAAW,CAAC;AAC1D,OAAO,EAAC,UAAU,IAAI,iBAAiB,EAAC,MAAM,wBAAwB,CAAC;AAGvE;;;;;GAKG;AACH,MAAM,mBAAmB,GAAG,EAAE,CAAC;AA8B/B,sEAAsE;AACtE,SAAS,KAAK,CAAC,OAAuB;IACrC,MAAM,IAAI,GAAG,IAAI,GAAG,EAAU,CAAC;IAC/B,MAAM,GAAG,GAAmB,EAAE,CAAC;IAC/B,KAAK,MAAM,CAAC,IAAI,OAAO,EAAE,CAAC;QACzB,IAAI,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC;YAAE,SAAS;QAC9B,IAAI,CAAC,GAAG,CAAC,CAAC,CAAC,GAAG,CAAC,CAAC;QAChB,GAAG,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IACb,CAAC;IACD,OAAO,GAAG,CAAC;AACZ,CAAC;AAED;;;;;;;GAOG;AACH,MAAM,CAAC,KAAK,UAAU,MAAM,CAC3B,KAAa,EACb,UAA6B,EAAE,EAC/B,OAAmB,EAAE;IAErB,MAAM,aAAa,GAAG,IAAI,CAAC,aAAa,IAAI,oBAAoB,CAAC;IACjE,MAAM,eAAe,GAAG,IAAI,CAAC,eAAe,IAAI,sBAAsB,CAAC;IACvE,MAAM,yBAAyB,GAC9B,IAAI,CAAC,yBAAyB,IAAI,gCAAgC,CAAC;IACpE,MAAM,uBAAuB,GAC5B,IAAI,CAAC,uBAAuB,IAAI,8BAA8B,CAAC;IAChE,MAAM,UAAU,GAAG,IAAI,CAAC,UAAU,IAAI,iBAAiB,CAAC;IACxD,MAAM,UAAU,GAAG,IAAI,CAAC,UAAU,IAAI,iBAAiB,CAAC;IAExD,MAAM,MAAM,GAAG,aAAa,CAAC;QAC5B,GAAG,EAAE,OAAO,CAAC,GAAG;QAChB,GAAG,EAAE,OAAO,CAAC,GAAG;QAChB,UAAU,EAAE,OAAO,CAAC,UAAU;KAC9B,CAAC,CAAC;IAEH,0EAA0E;IAC1E,wDAAwD;IACxD,yBAAyB,CAAC,MAAM,CAAC,CAAC;IAElC,8EAA8E;IAC9E,0EAA0E;IAC1E,8EAA8E;IAC9E,0EAA0E;IAC1E,6EAA6E;IAC7E,MAAM,SAAS,GAAG,uBAAuB,CAAC,MAAM,CAAC,OAAO,CAAC,CAAC;IAE1D,+EAA+E;IAC/E,8EAA8E;IAC9E,8EAA8E;IAC9E,MAAM,UAAU,GAAG,SAAS,CAAC,UAAU,IAAI,eAAe,CAAC,MAAM,CAAC,CAAC;IACnE,MAAM,IAAI,GAAG,UAAU,CAAC,UAAU,CAAC,CAAC;IAEpC,6EAA6E;IAC7E,2EAA2E;IAC3E,MAAM,aAAa,GAClB,SAAS,CAAC,OAAO,KAAK,MAAM,CAAC,OAAO;QACnC,CAAC,CAAC,MAAM;QACR,CAAC,CAAC,EAAC,GAAG,MAAM,EAAE,OAAO,EAAE,SAAS,CAAC,
OAAO,EAAC,CAAC;IAC5C,MAAM,OAAO,GAAG,UAAU,CAAC,aAAa,CAAC,OAAO,EAAE,aAAa,CAAC,CAAC;IACjE,yEAAyE;IACzE,uEAAuE;IACvE,IAAI,GAAmB,CAAC;IACxB,IAAI,CAAC;QACJ,GAAG,GAAG,MAAM,OAAO,CAAC,MAAM,CAAC,KAAK,EAAE,IAAI,EAAE,EAAC,MAAM,EAAE,OAAO,CAAC,MAAM,EAAC,CAAC,CAAC;IACnE,CAAC;YAAS,CAAC;QACV,mEAAmE;QACnE,qDAAqD;QACrD,IAAI,SAAS,CAAC,UAAU;YAAE,KAAK,SAAS,CAAC,UAAU,CAAC,KAAK,EAAE,CAAC;IAC7D,CAAC;IAED,MAAM,UAAU,GAAG,OAAO,CAAC,UAAU,IAAI,mBAAmB,CAAC;IAC7D,OAAO,KAAK,CAAC,GAAG,CAAC,CAAC,KAAK,CAAC,CAAC,EAAE,UAAU,CAAC,CAAC;AACxC,CAAC"}
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "webveil",
3
- "version": "0.1.0",
3
+ "version": "0.2.0",
4
4
  "description": "Anonymous-capable, self-hosted, account-free web search and fetch for agents. CLI + MCP (built on incur), pi-agnostic. Swappable backend and egress (direct, http proxy, socks5/Tor).",
5
5
  "license": "AGPL-3.0-or-later",
6
6
  "keywords": [
@@ -0,0 +1,104 @@
1
+ // baseUrl transport parsing — the small, transport-AWARE helper that classifies
2
+ // a resolved `baseUrl` into either a normal TCP HTTP base or a Unix-domain-socket
3
+ // form, and (for the socket form) rewrites it into a synthetic `http://localhost`
4
+ // base that the transport-UNAWARE backends can build their request URL on top of.
5
+ //
6
+ // WHY THIS LIVES HERE (and not in egress.ts): the egress dispatcher
7
+ // (`buildDispatcher`) is built ONCE from config and SHARED by both the backend
8
+ // hop (search.ts) AND the arbitrary-public-URL fetch (fetch.ts). Binding a socket
9
+ // `Agent` into that shared dispatcher would route every `web_fetch` into the
10
+ // SearXNG socket. So the socket transport is scoped to the BACKEND `baseUrl` hop
11
+ // only: search.ts asks this helper to translate the baseUrl, gets back a real
12
+ // `http://localhost…` base (which the searxng backend's `new URL('search', base)`
13
+ // still works on, unchanged) plus the per-hop socket `Agent`, and leaves the
14
+ // fetch/SSRF egress path untouched.
15
+ //
16
+ // GRAMMAR (recorded decision, see the task's done record):
17
+ // unix:<socketPath>[:<httpPath>]
18
+ // - `<socketPath>`: the absolute path to the uWSGI Unix domain socket
19
+ // (e.g. /usr/local/searxng/run/socket). Must not contain a `:` (the parse
20
+ // splits on the FIRST `:` after the `unix:` scheme; conventional socket
21
+ // paths never contain a colon).
22
+ // - `<httpPath>`: OPTIONAL base path the SearXNG app is mounted under,
23
+ // defaulting to `/`. It is the SAME thing the TCP `baseUrl` encodes as its
24
+ // path: the backend appends `search` to it (`new URL('search', base + '/')`),
25
+ // so the install default is just `unix:/usr/local/searxng/run/socket` (the
26
+ // backend then requests `/search`). A non-root mount is `…/socket:/searxng`.
27
+ // A raw `unix:…` string is NOT a valid base for `new URL('search', …)`, so this
28
+ // translation MUST run BEFORE the backend builds its URL.
29
+
30
+ import {Agent, type Dispatcher} from 'undici';
31
+
32
+ /** The `unix:` scheme prefix this helper recognizes on a `baseUrl`. */
33
+ const UNIX_PREFIX = 'unix:';
34
+
35
+ /** A parsed Unix-socket baseUrl: the socket file path + the app's base path. */
36
+ export interface UnixBaseUrl {
37
+ socketPath: string;
38
+ /** The app's base path (mount point); the backend appends `search` to it. */
39
+ httpPath: string;
40
+ }
41
+
42
+ /** Is this resolved `baseUrl` a Unix-domain-socket form (`unix:…`)? */
43
+ export function isUnixBaseUrl(baseUrl: string): boolean {
44
+ return baseUrl.startsWith(UNIX_PREFIX);
45
+ }
46
+
47
+ /**
48
+ * Parse a `unix:<socketPath>[:<httpPath>]` baseUrl into `{socketPath, httpPath}`.
49
+ * Splits on the FIRST `:` after the `unix:` scheme (socket paths conventionally
50
+ * carry no colon); an absent/empty `<httpPath>` defaults to `/`.
51
+ *
52
+ * Throws if the socket path is empty (there is nothing to connect to).
53
+ */
54
+ export function parseUnixBaseUrl(baseUrl: string): UnixBaseUrl {
55
+ const rest = baseUrl.slice(UNIX_PREFIX.length);
56
+ const sep = rest.indexOf(':');
57
+ const socketPath = sep === -1 ? rest : rest.slice(0, sep);
58
+ const rawHttpPath = sep === -1 ? '' : rest.slice(sep + 1);
59
+ if (!socketPath)
60
+ throw new Error(
61
+ `webveil: malformed unix baseUrl ${JSON.stringify(baseUrl)} — ` +
62
+ `expected unix:<socketPath>[:<httpPath>] with a non-empty socket path`,
63
+ );
64
+ const httpPath = rawHttpPath
65
+ ? rawHttpPath.startsWith('/')
66
+ ? rawHttpPath
67
+ : '/' + rawHttpPath
68
+ : '/';
69
+ return {socketPath, httpPath};
70
+ }
71
+
72
+ /**
73
+ * The result of resolving a `baseUrl` for the BACKEND hop: the (possibly
74
+ * rewritten) HTTP base the backend builds its request URL on, plus an OPTIONAL
75
+ * undici `Dispatcher` to carry that hop. For a normal TCP baseUrl the dispatcher
76
+ * is `undefined` (the caller uses the config-wide egress dispatcher); for a
77
+ * `unix:` baseUrl it is a socket-bound `Agent` and the base is a synthetic
78
+ * `http://localhost<httpPath>`.
79
+ */
80
+ export interface BackendTransport {
81
+ baseUrl: string;
82
+ dispatcher?: Dispatcher;
83
+ }
84
+
85
+ /**
86
+ * Resolve a `baseUrl` into a backend-hop transport. For a `unix:` baseUrl this
87
+ * builds a socket-bound `Agent({connect:{socketPath}})` and a synthetic
88
+ * `http://localhost<httpPath>` base (the URL host is irrelevant to routing — the
89
+ * socket decides — and only becomes the `Host` header). For any other baseUrl it
90
+ * returns the baseUrl unchanged with NO dispatcher (the caller keeps using the
91
+ * shared config-wide egress dispatcher).
92
+ *
93
+ * NOTE: this is the BACKEND-hop transport only. It is never bound into the
94
+ * shared egress dispatcher, so `web_fetch`/SSRF egress is unaffected.
95
+ */
96
+ export function resolveBackendTransport(baseUrl: string): BackendTransport {
97
+ if (!isUnixBaseUrl(baseUrl)) return {baseUrl};
98
+ const {socketPath, httpPath} = parseUnixBaseUrl(baseUrl);
99
+ const dispatcher = new Agent({connect: {socketPath}});
100
+ return {
101
+ baseUrl: `http://localhost${httpPath === '/' ? '' : httpPath}`,
102
+ dispatcher,
103
+ };
104
+ }
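The `unix:<socketPath>[:<httpPath>]` parsing added above can be exercised in isolation. The following is a minimal sketch, not the package's module: it assumes the `UNIX_PREFIX` constant (not shown in this hunk) has the value `'unix:'`, the `*Sketch` names and `syntheticBase` are hypothetical, and the socket path `/run/searxng/socket` is an illustrative value. It leaves out the undici `Agent` so it runs without dependencies.

```typescript
// Assumption: stand-in for the module's UNIX_PREFIX constant, whose value is
// not visible in the diff; 'unix:' is inferred from the doc comments.
const UNIX_PREFIX_SKETCH = 'unix:';

interface UnixBaseUrlSketch {
  socketPath: string;
  httpPath: string;
}

// Mirrors the parsing rules in the diff: split on the first ':' after the
// scheme, default an absent/empty httpPath to '/', add a leading '/' if missing.
function parseUnixBaseUrlSketch(baseUrl: string): UnixBaseUrlSketch {
  const rest = baseUrl.slice(UNIX_PREFIX_SKETCH.length);
  const sep = rest.indexOf(':');
  const socketPath = sep === -1 ? rest : rest.slice(0, sep);
  const rawHttpPath = sep === -1 ? '' : rest.slice(sep + 1);
  if (!socketPath)
    throw new Error(`malformed unix baseUrl ${JSON.stringify(baseUrl)}`);
  const httpPath = rawHttpPath
    ? rawHttpPath.startsWith('/')
      ? rawHttpPath
      : '/' + rawHttpPath
    : '/';
  return {socketPath, httpPath};
}

// Hypothetical helper: the synthetic HTTP base the backend would build on.
// A bare '/' mount yields no trailing slash, as in resolveBackendTransport.
function syntheticBase(httpPath: string): string {
  return `http://localhost${httpPath === '/' ? '' : httpPath}`;
}

const bare = parseUnixBaseUrlSketch('unix:/run/searxng/socket');
const mounted = parseUnixBaseUrlSketch('unix:/run/searxng/socket:searx');
const bareBase = syntheticBase(bare.httpPath);
const mountedBase = syntheticBase(mounted.httpPath);
```

Because the split is on the first `:`, a socket path that itself contains a colon would be cut short; the diff's doc comment acknowledges this by noting that socket paths conventionally carry no colon.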
@@ -1,8 +1,12 @@
1
1
  // config seam — per-folder resolution. Precedence (highest wins):
2
- // env > nearest .pi/webveil.json (walking up from cwd) > global
3
- // ~/.pi/agent/webveil.json > defaults.
2
+ // env > nearest webveil.json (walking up from cwd) > global
3
+ // $XDG_CONFIG_HOME/webveil/config.json (~/.config/webveil/config.json) >
4
+ // defaults.
4
5
  // "Per folder = per account/egress." Each layer is a partial; later (lower)
5
- // layers fill gaps the higher layers leave.
6
+ // layers fill gaps the higher layers leave. The project file is a
7
+ // frontend-neutral `webveil.json` (no `.pi/`): both the pi-agnostic CLI and the
8
+ // pi extension resolve the same name, so a project is configured the same way
9
+ // regardless of which frontend reads it. See docs/adr/0002.
6
10
 
7
11
  import {readFileSync} from 'node:fs';
8
12
  import {homedir} from 'node:os';
@@ -34,9 +38,14 @@ export interface ResolveOptions {
34
38
  cwd?: string;
35
39
  /** Environment to read overrides from. Defaults to process.env. */
36
40
  env?: Record<string, string | undefined>;
41
+ /** Home directory for the XDG fallback. Defaults to os.homedir(). */
42
+ homeDir?: string;
37
43
  /**
38
- * Path to the global config file. Defaults to ~/.pi/agent/webveil.json.
39
- * Tests point this at a temp dir to isolate the real home directory.
44
+ * Path to the global config file. When given it WINS outright and the XDG
45
+ * resolution is skipped. Tests point this at a temp dir to isolate the real
46
+ * home directory. When absent, the global file resolves to
47
+ * $XDG_CONFIG_HOME/webveil/config.json, falling back to
48
+ * <homeDir>/.config/webveil/config.json.
40
49
  */
41
50
  globalPath?: string;
42
51
  }
@@ -48,7 +57,7 @@ const DEFAULTS: Config = {
48
57
  fetchSize: 'm',
49
58
  };
50
59
 
51
- const PROJECT_FILE = join('.pi', 'webveil.json');
60
+ const PROJECT_FILE = 'webveil.json';
52
61
 
53
62
  function readJson(path: string): PartialConfig | undefined {
54
63
  let text: string;
@@ -60,7 +69,7 @@ function readJson(path: string): PartialConfig | undefined {
60
69
  return JSON.parse(text) as PartialConfig;
61
70
  }
62
71
 
63
- /** The nearest `.pi/webveil.json` walking up from `cwd` (first found wins). */
72
+ /** The nearest `webveil.json` walking up from `cwd` (first found wins). */
64
73
  function readProjectChain(cwd: string): PartialConfig | undefined {
65
74
  let dir = cwd;
66
75
  const {root} = parse(dir);
@@ -86,6 +95,19 @@ function readEnv(env: Record<string, string | undefined>): PartialConfig {
86
95
  return layer;
87
96
  }
88
97
 
98
+ /**
99
+ * The global config path, XDG-style: `$XDG_CONFIG_HOME/webveil/config.json`,
100
+ * falling back to `<homeDir>/.config/webveil/config.json` when XDG_CONFIG_HOME
101
+ * is unset. (`options.globalPath`, when given, bypasses this entirely.)
102
+ */
103
+ function resolveGlobalPath(
104
+ env: Record<string, string | undefined>,
105
+ homeDir = homedir(),
106
+ ): string {
107
+ const base = env.XDG_CONFIG_HOME || join(homeDir, '.config');
108
+ return join(base, 'webveil', 'config.json');
109
+ }
110
+
89
111
  /**
90
112
  * Resolve the effective config. Higher-precedence layers override lower ones,
91
113
  * key by key: env > project chain > global file > defaults.
@@ -94,7 +116,7 @@ export function resolveConfig(options: ResolveOptions = {}): Config {
94
116
  const cwd = options.cwd ?? process.cwd();
95
117
  const env = options.env ?? process.env;
96
118
  const globalPath =
97
- options.globalPath ?? join(homedir(), '.pi', 'agent', 'webveil.json');
119
+ options.globalPath ?? resolveGlobalPath(env, options.homeDir);
98
120
 
99
121
  const layers: PartialConfig[] = [
100
122
  DEFAULTS,
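The global-path precedence introduced in these hunks (explicit `globalPath`, then `$XDG_CONFIG_HOME`, then `<homeDir>/.config`) can be sketched as below. This is an illustration under assumptions, not the package's code: `resolveGlobalPathSketch` and `pickGlobalPath` are hypothetical names, and the directories are made-up values.

```typescript
import {join} from 'node:path';

type Env = Record<string, string | undefined>;

// Same rule as resolveGlobalPath in the diff. Note the `||`: an EMPTY
// XDG_CONFIG_HOME is treated like an unset one and falls back to ~/.config.
function resolveGlobalPathSketch(env: Env, homeDir: string): string {
  const base = env.XDG_CONFIG_HOME || join(homeDir, '.config');
  return join(base, 'webveil', 'config.json');
}

// Hypothetical wrapper for the `options.globalPath ?? ...` line: an explicit
// path wins outright and the XDG resolution is skipped.
function pickGlobalPath(
  explicit: string | undefined,
  env: Env,
  homeDir: string,
): string {
  return explicit ?? resolveGlobalPathSketch(env, homeDir);
}

const home = '/home/alice'; // illustrative home directory
const fromXdg = pickGlobalPath(undefined, {XDG_CONFIG_HOME: '/tmp/xdg'}, home);
const fromHome = pickGlobalPath(undefined, {}, home);
const emptyXdg = pickGlobalPath(undefined, {XDG_CONFIG_HOME: ''}, home);
const explicitPath = pickGlobalPath(
  '/tmp/test/config.json',
  {XDG_CONFIG_HOME: '/tmp/xdg'},
  home,
);
```

Passing `homeDir` explicitly is what lets tests avoid touching the real home directory, which appears to be the purpose of the new `homeDir` option.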
@@ -9,6 +9,7 @@
9
9
  import {Agent, type Dispatcher, ProxyAgent, fetch as undiciFetch} from 'undici';
10
10
  import {socksDispatcher} from 'fetch-socks';
11
11
  import type {Config, Egress} from './config.js';
12
+ import {isUnixBaseUrl} from './baseurl.js';
12
13
 
13
14
  /** Thrown when a configured egress proxy cannot be built. Never swallowed. */
14
15
  export class EgressError extends Error {
@@ -18,6 +19,33 @@ export class EgressError extends Error {
18
19
  }
19
20
  }
20
21
 
22
+ /**
23
+ * Fail-loud guard for the false-confidence combo: a `unix:` (local-socket)
24
+ * backend `baseUrl` configured with a NON-direct egress (`http`/`socks5`). A
25
+ * Unix socket is inherently local, so proxying that hop is the same fake-
26
+ * anonymity footgun as proxying a loopback TCP baseUrl: webveil would route a
27
+ * pointless local call through the proxy while the backend (SearXNG) crawls the
28
+ * public web from the real IP, OUTSIDE webveil's egress. Refuse it and point at
29
+ * the real fix.
30
+ *
31
+ * OVERLAP SEAM (recorded): this is the loopback false-confidence family. The
32
+ * sibling task `fail-loud-on-proxied-loopback-backend` adds the broader guard
33
+ * for loopback TCP baseUrls (127.0.0.0/8, ::1, localhost). When it lands, fold
34
+ * THIS `unix:`-is-loopback-equivalent case into that single guard instead of
35
+ * keeping a parallel check here.
36
+ */
37
+ export function assertEgressAllowsBaseUrl(cfg: Config): void {
38
+ if (cfg.egress.mode === 'direct') return;
39
+ if (isUnixBaseUrl(cfg.baseUrl))
40
+ throw new EgressError(
41
+ `egress ${cfg.egress.mode}: a unix: (local socket) baseUrl cannot be ` +
42
+ `proxied — it is inherently local, so proxying it gives fake ` +
43
+ `anonymity (SearXNG still crawls the web from your real IP). Set ` +
44
+ `egress=direct and proxy the backend itself (SearXNG's ` +
45
+ `outgoing.proxies), or use a remote backend.`,
46
+ );
47
+ }
48
+
21
49
  function socksFromUrl(raw: string): Dispatcher {
22
50
  const url = new URL(raw); // throws on a malformed proxy URL → fail loud
23
51
  const protocol = url.protocol.replace(':', '');
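The fail-loud guard added in the hunk above refuses a `unix:` baseUrl under any non-direct egress. A minimal self-contained sketch of that decision follows. The `GuardConfig` type is a reduced, hypothetical stand-in for the package's `Config`, the mode names `http` and `socks5` are taken from the guard's doc comment, and the URLs are illustrative.

```typescript
type EgressMode = 'direct' | 'http' | 'socks5';

// Hypothetical, reduced stand-in for Config: only the fields the guard reads.
interface GuardConfig {
  baseUrl: string;
  egress: {mode: EgressMode};
}

// Same decision as assertEgressAllowsBaseUrl in the diff: direct egress always
// passes; a local-socket baseUrl behind a proxy egress is refused.
function assertEgressAllowsBaseUrlSketch(cfg: GuardConfig): void {
  if (cfg.egress.mode === 'direct') return;
  if (cfg.baseUrl.startsWith('unix:'))
    throw new Error(
      `egress ${cfg.egress.mode}: a unix: (local socket) baseUrl cannot be proxied`,
    );
}

// Hypothetical helper that turns the throw into a value for demonstration.
function guardOutcome(cfg: GuardConfig): 'ok' | 'refused' {
  try {
    assertEgressAllowsBaseUrlSketch(cfg);
    return 'ok';
  } catch (err) {
    if (err instanceof Error) return 'refused';
    throw err;
  }
}

const directSocket = guardOutcome({
  baseUrl: 'unix:/run/searxng/socket',
  egress: {mode: 'direct'},
});
const proxiedSocket = guardOutcome({
  baseUrl: 'unix:/run/searxng/socket',
  egress: {mode: 'socks5'},
});
const proxiedRemote = guardOutcome({
  baseUrl: 'https://searx.example.org',
  egress: {mode: 'http'},
});
```

As the guard's own comment records, a proxied loopback TCP baseUrl (for example `http://127.0.0.1:8080`) is not caught by this check; that case is left to the sibling task `fail-loud-on-proxied-loopback-backend`.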
@@ -13,8 +13,13 @@
13
13
 
14
14
  import {resolveConfig as defaultResolveConfig} from './config.js';
15
15
  import type {Config, ResolveOptions} from './config.js';
16
- import {buildDispatcher as defaultBuildDispatcher} from './egress.js';
16
+ import {
17
+ buildDispatcher as defaultBuildDispatcher,
18
+ assertEgressAllowsBaseUrl as defaultAssertEgressAllowsBaseUrl,
19
+ } from './egress.js';
17
20
  import type {Dispatcher} from './egress.js';
21
+ import {resolveBackendTransport as defaultResolveBackendTransport} from './baseurl.js';
22
+ import type {BackendTransport} from './baseurl.js';
18
23
  import {createHttp as defaultCreateHttp} from './http.js';
19
24
  import {getBackend as defaultGetBackend} from './backends/registry.js';
20
25
  import type {Http, SearchOptions, SearchResult} from './backends/types.js';
@@ -37,6 +42,8 @@ const DEFAULT_MAX_RESULTS = 10;
37
42
  export interface SearchDeps {
38
43
  resolveConfig?: (options?: ResolveOptions) => Config;
39
44
  buildDispatcher?: (config: Config) => Dispatcher | undefined;
45
+ assertEgressAllowsBaseUrl?: (config: Config) => void;
46
+ resolveBackendTransport?: (baseUrl: string) => BackendTransport;
40
47
  createHttp?: (dispatcher: Dispatcher | undefined) => Http;
41
48
  getBackend?: (
42
49
  name: string,
@@ -80,6 +87,10 @@ export async function search(
80
87
  ): Promise<SearchResult[]> {
81
88
  const resolveConfig = deps.resolveConfig ?? defaultResolveConfig;
82
89
  const buildDispatcher = deps.buildDispatcher ?? defaultBuildDispatcher;
90
+ const assertEgressAllowsBaseUrl =
91
+ deps.assertEgressAllowsBaseUrl ?? defaultAssertEgressAllowsBaseUrl;
92
+ const resolveBackendTransport =
93
+ deps.resolveBackendTransport ?? defaultResolveBackendTransport;
83
94
  const createHttp = deps.createHttp ?? defaultCreateHttp;
84
95
  const getBackend = deps.getBackend ?? defaultGetBackend;
85
96
 
@@ -89,15 +100,40 @@ export async function search(
89
100
  globalPath: options.globalPath,
90
101
  });
91
102
 
92
- // Build the dispatcher FIRST: a configured-but-unbuildable proxy throws here,
93
- // before any network access (never an un-proxied request).
94
- const dispatcher = buildDispatcher(config);
103
+ // Fail loud on the false-confidence combo (a local `unix:` socket baseUrl
104
+ // behind a proxy egress) BEFORE any transport is built.
105
+ assertEgressAllowsBaseUrl(config);
106
+
107
+ // Resolve the BACKEND-hop transport. For a normal TCP baseUrl this is a no-op
108
+ // (no per-hop dispatcher); for a `unix:` baseUrl it yields a socket-bound
109
+ // `Agent` and a synthetic `http://localhost…` base the backend builds on. The
110
+ // socket transport is scoped to THIS hop only and is NEVER bound into the
111
+ // shared config-wide egress dispatcher, so `web_fetch` egress is unaffected.
112
+ const transport = resolveBackendTransport(config.baseUrl);
113
+
114
+ // Build the egress dispatcher before any network access: an unbuildable proxy
114
+ // throws here (never an un-proxied request). For a socket baseUrl the per-hop
115
+ // socket dispatcher is used instead; the guard above ensures egress is direct.
117
+ const dispatcher = transport.dispatcher ?? buildDispatcher(config);
95
118
  const http = createHttp(dispatcher);
96
119
 
97
- const backend = getBackend(config.backend, config);
120
+ // The backend stays transport-unaware: it receives a config whose baseUrl is
121
+ // always a real `http(s):` base (the `unix:` form is rewritten away here).
122
+ const backendConfig: Config =
123
+ transport.baseUrl === config.baseUrl
124
+ ? config
125
+ : {...config, baseUrl: transport.baseUrl};
126
+ const backend = getBackend(backendConfig.backend, backendConfig);
98
127
  // Hand the backend ONLY the proxied helper (no maxResults: dedup happens
99
128
  // here, over the full set, so the clamp below is over UNIQUE results).
100
- const raw = await backend.search(query, http, {signal: options.signal});
129
+ let raw: SearchResult[];
130
+ try {
131
+ raw = await backend.search(query, http, {signal: options.signal});
132
+ } finally {
133
+ // Best-effort close of the per-hop socket Agent (the shared egress
134
+ // dispatcher, owned by config, is NOT touched here).
135
+ if (transport.dispatcher) void transport.dispatcher.close();
136
+ }
101
137
 
102
138
  const maxResults = options.maxResults ?? DEFAULT_MAX_RESULTS;
103
139
  return dedup(raw).slice(0, maxResults);