searchsocket 0.5.0 → 0.6.1
- package/README.md +731 -514
- package/dist/cli.js +3335 -492
- package/dist/client.d.cts +1 -1
- package/dist/client.d.ts +1 -1
- package/dist/index.cjs +2378 -475
- package/dist/index.d.cts +113 -40
- package/dist/index.d.ts +113 -40
- package/dist/index.js +2378 -475
- package/dist/{plugin-B_npJSux.d.cts → plugin-C61L-ykY.d.ts} +2 -1
- package/dist/{plugin-M-aW0ev6.d.ts → plugin-DoBW1gkK.d.cts} +2 -1
- package/dist/sveltekit.cjs +2430 -494
- package/dist/sveltekit.d.cts +2 -2
- package/dist/sveltekit.d.ts +2 -2
- package/dist/sveltekit.js +2416 -480
- package/dist/templates/search-dialog/SearchDialog.svelte +175 -0
- package/dist/templates/search-input/SearchInput.svelte +151 -0
- package/dist/templates/search-results/SearchResults.svelte +75 -0
- package/dist/{types-Dk43uz25.d.cts → types-029hl6P2.d.cts} +180 -9
- package/dist/{types-Dk43uz25.d.ts → types-029hl6P2.d.ts} +180 -9
- package/package.json +28 -11
- package/src/svelte/SearchSocket.svelte +35 -0
- package/src/svelte/index.svelte.ts +181 -0
package/README.md
CHANGED
|
@@ -1,34 +1,44 @@
|
|
|
1
1
|
# SearchSocket
|
|
2
2
|
|
|
3
|
-
Semantic site search and MCP retrieval for SvelteKit content projects.
|
|
3
|
+
Semantic site search and MCP retrieval for SvelteKit content projects. Index your site, search it from the browser or AI tools, and scroll users to the exact content they're looking for.
|
|
4
4
|
|
|
5
|
-
**Requirements**: Node.js >= 20
|
|
5
|
+
**Requirements**: Node.js >= 20 | **Backend**: [Upstash Vector](https://upstash.com/docs/vector/overall/getstarted) | **License**: MIT
|
|
6
|
+
|
|
7
|
+
## How it works
|
|
8
|
+
|
|
9
|
+
```
|
|
10
|
+
SvelteKit Pages → Extractor (Cheerio + Turndown) → Chunker → Upstash Vector
|
|
11
|
+
↓
|
|
12
|
+
Search UI ← SvelteKit API Hook ← Search Engine + Ranking
|
|
13
|
+
↓
|
|
14
|
+
MCP Endpoint → Claude Code / Claude Desktop
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
SearchSocket extracts content from your SvelteKit site, converts it to markdown, splits it into chunks, and stores them in Upstash Vector. At runtime, the SvelteKit hook serves both a search API for your frontend and an MCP endpoint for AI tools.
|
|
6
18
|
|
|
7
19
|
## Features
|
|
8
20
|
|
|
9
|
-
- **
|
|
10
|
-
- **
|
|
11
|
-
- **
|
|
12
|
-
- **
|
|
13
|
-
- **
|
|
14
|
-
- **
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
- **
|
|
18
|
-
- **Scroll-to-Text**: `searchsocketScrollToText()` auto-scrolls to matching sections on navigation
|
|
19
|
-
- **MCP Server**: Model Context Protocol tools for search and page retrieval
|
|
21
|
+
- **Semantic + keyword search** — Upstash Vector handles hybrid search with built-in reranking and input enrichment
|
|
22
|
+
- **Dual search** — parallel page-level and chunk-level queries with configurable score blending
|
|
23
|
+
- **Scroll-to-text** — auto-scroll to the matching section when a user clicks a search result, with CSS Highlight API and Text Fragment support
|
|
24
|
+
- **SvelteKit integration** — server hook for the search API, Vite plugin for build-triggered indexing
|
|
25
|
+
- **Svelte 5 components** — reactive `createSearch` store and `<SearchSocket>` metadata component
|
|
26
|
+
- **MCP server** — six tools for Claude Code, Claude Desktop, and other MCP clients (stdio + HTTP)
|
|
27
|
+
- **llms.txt generation** — auto-generate LLM-friendly site indexes during indexing
|
|
28
|
+
- **Four source modes** — index from static output, build manifest, a running server, or raw markdown files
|
|
29
|
+
- **CLI** — init, index, search, dev, status, doctor, clean, prune, test, mcp, add
|
|
20
30
|
|
|
21
31
|
## Install
|
|
22
32
|
|
|
23
33
|
```bash
|
|
24
|
-
# pnpm
|
|
25
34
|
pnpm add -D searchsocket
|
|
26
|
-
|
|
27
|
-
# npm
|
|
28
|
-
npm install -D searchsocket
|
|
29
35
|
```
|
|
30
36
|
|
|
31
|
-
SearchSocket is typically a dev dependency
|
|
37
|
+
SearchSocket is typically a dev dependency since indexing runs at build time. If you use `searchsocketHandle()` at runtime (e.g., in a Node server adapter or serving the MCP endpoint from a production deployment), add it as a regular dependency:
|
|
38
|
+
|
|
39
|
+
```bash
|
|
40
|
+
pnpm add searchsocket
|
|
41
|
+
```
|
|
32
42
|
|
|
33
43
|
## Quickstart
|
|
34
44
|
|
|
@@ -38,100 +48,134 @@ SearchSocket is typically a dev dependency for CLI indexing. If you use `searchs
|
|
|
38
48
|
pnpm searchsocket init
|
|
39
49
|
```
|
|
40
50
|
|
|
41
|
-
|
|
42
|
-
- `searchsocket.config.ts` — minimal config file
|
|
43
|
-
- `.searchsocket/` — state directory (added to `.gitignore`)
|
|
51
|
+
Creates `searchsocket.config.ts`, the `.searchsocket/` state directory, wires up your SvelteKit hooks and Vite config, and generates `.mcp.json` for Claude Code.
|
|
44
52
|
|
|
45
53
|
### 2. Configure
|
|
46
54
|
|
|
47
55
|
Minimal config (`searchsocket.config.ts`):
|
|
48
56
|
|
|
49
57
|
```ts
|
|
50
|
-
export default {
|
|
51
|
-
embeddings: { apiKeyEnv: "JINA_API_KEY" }
|
|
52
|
-
};
|
|
58
|
+
export default {};
|
|
53
59
|
```
|
|
54
60
|
|
|
55
|
-
|
|
56
|
-
- **Development**: Uses local file DB at `.searchsocket/vectors.db`
|
|
57
|
-
- **Production**: Set `TURSO_DATABASE_URL` and `TURSO_AUTH_TOKEN` to use remote Turso
|
|
61
|
+
That's it — defaults handle the rest. SearchSocket reads `UPSTASH_VECTOR_REST_URL` and `UPSTASH_VECTOR_REST_TOKEN` from your environment automatically.
|
|
58
62
|
|
|
59
|
-
### 3.
|
|
63
|
+
### 3. Set environment variables
|
|
60
64
|
|
|
61
|
-
|
|
65
|
+
```bash
|
|
66
|
+
# .env
|
|
67
|
+
UPSTASH_VECTOR_REST_URL=https://...
|
|
68
|
+
UPSTASH_VECTOR_REST_TOKEN=...
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
Create an [Upstash Vector index](https://console.upstash.com/vector) with the `bge-large-en-v1.5` embedding model (1024 dimensions). Copy the REST URL and token.
|
|
72
|
+
|
|
73
|
+
### 4. Add the SvelteKit hook
|
|
74
|
+
|
|
75
|
+
The `init` command does this for you, but if you need to do it manually:
|
|
62
76
|
|
|
63
77
|
```ts
|
|
78
|
+
// src/hooks.server.ts
|
|
64
79
|
import { searchsocketHandle } from "searchsocket/sveltekit";
|
|
65
80
|
|
|
66
81
|
export const handle = searchsocketHandle();
|
|
67
82
|
```
|
|
68
83
|
|
|
69
|
-
This exposes `POST /api/search`
|
|
84
|
+
This exposes `POST /api/search`, `GET /api/search/health`, the MCP endpoint at `/api/mcp`, and page retrieval routes.
|
|
70
85
|
|
|
71
|
-
|
|
86
|
+
If you run into SSR bundling issues, mark SearchSocket as external in your Vite config:
|
|
72
87
|
|
|
73
|
-
|
|
88
|
+
```ts
|
|
89
|
+
// vite.config.ts
|
|
90
|
+
export default defineConfig({
|
|
91
|
+
plugins: [sveltekit()],
|
|
92
|
+
ssr: {
|
|
93
|
+
external: ["searchsocket", "searchsocket/sveltekit", "searchsocket/client"]
|
|
94
|
+
}
|
|
95
|
+
});
|
|
96
|
+
```
|
|
97
|
+
|
|
98
|
+
### 5. Add search to your frontend
|
|
99
|
+
|
|
100
|
+
Copy the search dialog template into your project:
|
|
74
101
|
|
|
75
|
-
Development (`.env`):
|
|
76
102
|
```bash
|
|
77
|
-
|
|
103
|
+
pnpm searchsocket add search-dialog
|
|
78
104
|
```
|
|
79
105
|
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
106
|
+
This copies a Svelte 5 component to `src/lib/components/search/SearchDialog.svelte` with Cmd+K built in. Import it in your layout and add the scroll-to-text handler:
|
|
107
|
+
|
|
108
|
+
```svelte
|
|
109
|
+
<!-- src/routes/+layout.svelte -->
|
|
110
|
+
<script>
|
|
111
|
+
import { afterNavigate } from "$app/navigation";
|
|
112
|
+
import { searchsocketScrollToText } from "searchsocket/sveltekit";
|
|
113
|
+
import SearchDialog from "$lib/components/search/SearchDialog.svelte";
|
|
114
|
+
|
|
115
|
+
afterNavigate(searchsocketScrollToText);
|
|
116
|
+
</script>
|
|
117
|
+
|
|
118
|
+
<SearchDialog />
|
|
119
|
+
|
|
120
|
+
<slot />
|
|
85
121
|
```
|
|
86
122
|
|
|
87
|
-
|
|
123
|
+
Users can now press Cmd+K to search. See [Building a Search UI](docs/search-ui.md) for scoped search, custom styling, and more patterns.
|
|
124
|
+
|
|
125
|
+
### 6. Deploy
|
|
126
|
+
|
|
127
|
+
SearchSocket is designed to index automatically on deploy. The `init` command already added the Vite plugin to your config. Set these environment variables on your hosting platform (Vercel, Cloudflare, etc.):
|
|
128
|
+
|
|
129
|
+
| Variable | Value |
|
|
130
|
+
|----------|-------|
|
|
131
|
+
| `UPSTASH_VECTOR_REST_URL` | Your Upstash Vector REST URL |
|
|
132
|
+
| `UPSTASH_VECTOR_REST_TOKEN` | Your Upstash Vector REST token |
|
|
133
|
+
| `SEARCHSOCKET_AUTO_INDEX` | `1` |
|
|
134
|
+
|
|
135
|
+
Every deploy will build your site, index the content, and serve the search API — fully automated.
|
|
136
|
+
|
|
137
|
+
For local testing, you can also build and index manually:
|
|
88
138
|
|
|
89
139
|
```bash
|
|
90
|
-
pnpm
|
|
140
|
+
pnpm build
|
|
141
|
+
pnpm searchsocket index
|
|
91
142
|
```
|
|
92
143
|
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
- **`crawl`**: Fetches pages from a running HTTP server
|
|
97
|
-
- **`content-files`**: Reads markdown/svelte source files directly
|
|
144
|
+
### 7. Connect Claude Code (optional)
|
|
145
|
+
|
|
146
|
+
Point Claude Code at your deployed site's MCP endpoint:
|
|
98
147
|
|
|
99
|
-
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
148
|
+
```json
|
|
149
|
+
{
|
|
150
|
+
"mcpServers": {
|
|
151
|
+
"searchsocket": {
|
|
152
|
+
"type": "http",
|
|
153
|
+
"url": "https://your-site.com/api/mcp"
|
|
154
|
+
}
|
|
155
|
+
}
|
|
156
|
+
}
|
|
157
|
+
```
|
|
106
158
|
|
|
107
|
-
|
|
159
|
+
See [MCP Server](#mcp-server) for authentication and other options.
|
|
160
|
+
|
|
161
|
+
### Querying the API directly
|
|
162
|
+
|
|
163
|
+
The search API is also available via HTTP and CLI:
|
|
108
164
|
|
|
109
|
-
**Via API:**
|
|
110
165
|
```bash
|
|
166
|
+
# cURL
|
|
111
167
|
curl -X POST http://localhost:5173/api/search \
|
|
112
168
|
-H "content-type: application/json" \
|
|
113
169
|
-d '{"q":"getting started","topK":5,"groupBy":"page"}'
|
|
114
|
-
```
|
|
115
|
-
|
|
116
|
-
**Via client library:**
|
|
117
|
-
```ts
|
|
118
|
-
import { createSearchClient } from "searchsocket/client";
|
|
119
170
|
|
|
120
|
-
|
|
121
|
-
|
|
122
|
-
q: "getting started",
|
|
123
|
-
topK: 5,
|
|
124
|
-
groupBy: "page",
|
|
125
|
-
pathPrefix: "/docs"
|
|
126
|
-
});
|
|
171
|
+
# CLI
|
|
172
|
+
pnpm searchsocket search --q "getting started" --top-k 5
|
|
127
173
|
```
|
|
128
174
|
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
```
|
|
175
|
+
### Response format
|
|
176
|
+
|
|
177
|
+
With `groupBy: "page"` (the default):
|
|
133
178
|
|
|
134
|
-
**Response** (with `groupBy: "page"`, the default):
|
|
135
179
|
```json
|
|
136
180
|
{
|
|
137
181
|
"q": "getting started",
|
|
@@ -161,18 +205,16 @@ pnpm searchsocket search --q "getting started" --top-k 5 --path-prefix /docs
|
|
|
161
205
|
}
|
|
162
206
|
],
|
|
163
207
|
"meta": {
|
|
164
|
-
"timingsMs": { "
|
|
165
|
-
"usedRerank": false,
|
|
166
|
-
"modelId": "jina-embeddings-v5-text-small"
|
|
208
|
+
"timingsMs": { "total": 135 }
|
|
167
209
|
}
|
|
168
210
|
}
|
|
169
211
|
```
|
|
170
212
|
|
|
171
|
-
The `chunks` array
|
|
213
|
+
The `chunks` array contains matching sections within each page. Use `groupBy: "chunk"` for flat per-chunk results without page aggregation.
|
|
172
214
|
|
|
173
215
|
## Source Modes
|
|
174
216
|
|
|
175
|
-
SearchSocket supports four
|
|
217
|
+
SearchSocket supports four ways to load your site content for indexing.
|
|
176
218
|
|
|
177
219
|
### `static-output` (default)
|
|
178
220
|
|
|
@@ -182,50 +224,37 @@ Reads prerendered HTML files from SvelteKit's build output directory.
|
|
|
182
224
|
export default {
|
|
183
225
|
source: {
|
|
184
226
|
mode: "static-output",
|
|
185
|
-
staticOutputDir: "build"
|
|
227
|
+
staticOutputDir: "build" // default
|
|
186
228
|
}
|
|
187
229
|
};
|
|
188
230
|
```
|
|
189
231
|
|
|
190
|
-
Best for
|
|
232
|
+
Best for fully prerendered sites. Run `vite build` first, then `searchsocket index`.
|
|
191
233
|
|
|
192
234
|
### `build`
|
|
193
235
|
|
|
194
|
-
Discovers routes
|
|
236
|
+
Discovers routes from SvelteKit's build manifest and renders via an ephemeral `vite preview` server. No manual route lists needed.
|
|
195
237
|
|
|
196
238
|
```ts
|
|
197
239
|
export default {
|
|
198
240
|
source: {
|
|
241
|
+
mode: "build",
|
|
199
242
|
build: {
|
|
200
|
-
|
|
201
|
-
|
|
202
|
-
exclude: ["/api/*", "/admin/*"], // glob patterns to skip
|
|
203
|
-
paramValues: { // values for dynamic routes
|
|
243
|
+
exclude: ["/api/*", "/admin/*"],
|
|
244
|
+
paramValues: {
|
|
204
245
|
"/blog/[slug]": ["hello-world", "getting-started"],
|
|
205
246
|
"/docs/[category]/[page]": ["guides/quickstart", "api/search"]
|
|
206
247
|
},
|
|
207
|
-
discover: true,
|
|
208
|
-
seedUrls: ["/"],
|
|
209
|
-
maxPages: 200,
|
|
210
|
-
maxDepth: 5
|
|
248
|
+
discover: true, // crawl internal links to find more pages
|
|
249
|
+
seedUrls: ["/"],
|
|
250
|
+
maxPages: 200,
|
|
251
|
+
maxDepth: 5
|
|
211
252
|
}
|
|
212
253
|
}
|
|
213
254
|
};
|
|
214
255
|
```
|
|
215
256
|
|
|
216
|
-
Best for
|
|
217
|
-
|
|
218
|
-
**How it works**:
|
|
219
|
-
1. Parses `.svelte-kit/output/server/manifest-full.js` to discover all page routes
|
|
220
|
-
2. Expands dynamic routes using `paramValues` (skips dynamic routes without values)
|
|
221
|
-
3. Starts an ephemeral `vite preview` server on a random port
|
|
222
|
-
4. Fetches all routes concurrently for SSR-rendered HTML
|
|
223
|
-
5. Provides exact route-to-file mapping (no heuristic matching needed)
|
|
224
|
-
6. Shuts down the preview server
|
|
225
|
-
|
|
226
|
-
**Dynamic routes**: Each key in `paramValues` maps to a route ID (e.g., `/blog/[slug]`) or its URL equivalent. Each value in the array replaces all `[param]` segments in the URL. Routes with layout groups like `/(app)/blog/[slug]` also match the URL key `/blog/[slug]`.
|
|
227
|
-
|
|
228
|
-
**Link discovery**: Enable `discover: true` to automatically find pages by crawling internal links from `seedUrls`. This is useful when dynamic routes have many parameter values that are impractical to enumerate. The crawler respects `maxPages` and `maxDepth` limits and only follows links within the same origin.
|
|
257
|
+
Best for CI/CD pipelines: `vite build && searchsocket index` with zero route configuration.
|
|
229
258
|
|
|
230
259
|
### `crawl`
|
|
231
260
|
|
|
@@ -234,24 +263,24 @@ Fetches pages from a running HTTP server.
|
|
|
234
263
|
```ts
|
|
235
264
|
export default {
|
|
236
265
|
source: {
|
|
266
|
+
mode: "crawl",
|
|
237
267
|
crawl: {
|
|
238
268
|
baseUrl: "http://localhost:4173",
|
|
239
|
-
routes: ["/", "/docs", "/blog"],
|
|
240
|
-
sitemapUrl: "https://example.com/sitemap.xml"
|
|
269
|
+
routes: ["/", "/docs", "/blog"],
|
|
270
|
+
sitemapUrl: "https://example.com/sitemap.xml"
|
|
241
271
|
}
|
|
242
272
|
}
|
|
243
273
|
};
|
|
244
274
|
```
|
|
245
275
|
|
|
246
|
-
If `routes` is omitted and no `sitemapUrl` is set, defaults to crawling `["/"]` only.
|
|
247
|
-
|
|
248
276
|
### `content-files`
|
|
249
277
|
|
|
250
|
-
Reads markdown and
|
|
278
|
+
Reads markdown and Svelte source files directly, without building or serving.
|
|
251
279
|
|
|
252
280
|
```ts
|
|
253
281
|
export default {
|
|
254
282
|
source: {
|
|
283
|
+
mode: "content-files",
|
|
255
284
|
contentFiles: {
|
|
256
285
|
globs: ["src/routes/**/*.md", "content/**/*.md"],
|
|
257
286
|
baseDir: "."
|
|
@@ -262,65 +291,132 @@ export default {
|
|
|
262
291
|
|
|
263
292
|
## Client Library
|
|
264
293
|
|
|
265
|
-
|
|
294
|
+
### `createSearchClient(options?)`
|
|
295
|
+
|
|
296
|
+
Lightweight browser-side search client.
|
|
266
297
|
|
|
267
298
|
```ts
|
|
268
299
|
import { createSearchClient } from "searchsocket/client";
|
|
269
300
|
|
|
270
301
|
const client = createSearchClient({
|
|
271
|
-
endpoint: "/api/search",
|
|
272
|
-
fetchImpl: fetch
|
|
302
|
+
endpoint: "/api/search", // default
|
|
303
|
+
fetchImpl: fetch // override for SSR or testing
|
|
273
304
|
});
|
|
274
305
|
|
|
275
|
-
const
|
|
306
|
+
const { results } = await client.search({
|
|
276
307
|
q: "deployment guide",
|
|
277
308
|
topK: 8,
|
|
278
309
|
groupBy: "page",
|
|
279
310
|
pathPrefix: "/docs",
|
|
280
311
|
tags: ["guide"],
|
|
281
|
-
|
|
312
|
+
filters: { version: 2 },
|
|
313
|
+
maxSubResults: 3
|
|
282
314
|
});
|
|
315
|
+
```
|
|
283
316
|
|
|
284
|
-
|
|
285
|
-
|
|
286
|
-
|
|
287
|
-
|
|
288
|
-
|
|
289
|
-
|
|
290
|
-
|
|
291
|
-
|
|
317
|
+
### `buildResultUrl(result)`
|
|
318
|
+
|
|
319
|
+
Builds a URL from a search result that includes scroll-to-text metadata:
|
|
320
|
+
|
|
321
|
+
- `_ssk` query parameter — section title for SvelteKit client-side navigation
|
|
322
|
+
- `_sskt` query parameter — text target snippet for precise scroll
|
|
323
|
+
- `#:~:text=` — [Text Fragment](https://developer.mozilla.org/en-US/docs/Web/URI/Fragment/Text_fragments) for native browser scroll on full page loads
|
|
324
|
+
|
|
325
|
+
```ts
|
|
326
|
+
import { buildResultUrl } from "searchsocket/client";
|
|
327
|
+
|
|
328
|
+
const href = buildResultUrl(result);
|
|
329
|
+
// "/docs/getting-started?_ssk=Installation&_sskt=Install+with+pnpm#:~:text=Install%20with%20pnpm"
|
|
292
330
|
```
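As a rough illustration of the three pieces listed above, the sketch below assembles a URL of the same shape as the example output. It is not the library's implementation: `sketchResultUrl` and its parameters are hypothetical names, and the choice of `URLSearchParams` for the query string and `encodeURIComponent` for the fragment is an assumption inferred from the `+` and `%20` encodings visible in the example.

```ts
// Hypothetical helper, not part of searchsocket: shows how the `_ssk`, `_sskt`,
// and `#:~:text=` parts of a result URL could be put together.
function sketchResultUrl(
  path: string,
  sectionTitle?: string,
  textTarget?: string
): string {
  const params = new URLSearchParams();
  if (sectionTitle) params.set("_ssk", sectionTitle); // section title for client-side navigation
  if (textTarget) params.set("_sskt", textTarget); // text snippet for precise scroll

  const query = params.toString(); // spaces become "+"
  // Text Fragment for full page loads; spaces become "%20"
  const fragment = textTarget ? `#:~:text=${encodeURIComponent(textTarget)}` : "";

  return `${path}${query ? `?${query}` : ""}${fragment}`;
}

const exampleHref = sketchResultUrl(
  "/docs/getting-started",
  "Installation",
  "Install with pnpm"
);
```

In real code, prefer `buildResultUrl(result)` itself; the sketch only makes the URL anatomy easier to read.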
|
|
293
331
|
|
|
294
|
-
##
|
|
332
|
+
## Svelte 5 Integration
|
|
295
333
|
|
|
296
|
-
|
|
334
|
+
### `createSearch(options?)`
|
|
297
335
|
|
|
298
|
-
|
|
336
|
+
A reactive search store built on Svelte 5 runes with debouncing and LRU caching.
|
|
299
337
|
|
|
300
|
-
|
|
301
|
-
|
|
302
|
-
|
|
338
|
+
```svelte
|
|
339
|
+
<script>
|
|
340
|
+
import { createSearch } from "searchsocket/svelte";
|
|
341
|
+
import { buildResultUrl } from "searchsocket/client";
|
|
342
|
+
|
|
343
|
+
const search = createSearch({
|
|
344
|
+
endpoint: "/api/search",
|
|
345
|
+
debounce: 250, // ms (default)
|
|
346
|
+
cache: true, // LRU result caching (default)
|
|
347
|
+
cacheSize: 50, // max cached queries (default)
|
|
348
|
+
topK: 10,
|
|
349
|
+
groupBy: "page",
|
|
350
|
+
pathPrefix: "/docs" // scope search to a section
|
|
351
|
+
});
|
|
352
|
+
</script>
|
|
303
353
|
|
|
304
|
-
|
|
354
|
+
<input bind:value={search.query} placeholder="Search docs..." />
|
|
355
|
+
|
|
356
|
+
{#if search.loading}
|
|
357
|
+
<p>Searching...</p>
|
|
358
|
+
{/if}
|
|
359
|
+
|
|
360
|
+
{#if search.error}
|
|
361
|
+
<p class="error">{search.error.message}</p>
|
|
362
|
+
{/if}
|
|
363
|
+
|
|
364
|
+
{#each search.results as result}
|
|
365
|
+
<a href={buildResultUrl(result)}>
|
|
366
|
+
<strong>{result.title}</strong>
|
|
367
|
+
{#if result.sectionTitle}
|
|
368
|
+
<span>— {result.sectionTitle}</span>
|
|
369
|
+
{/if}
|
|
370
|
+
</a>
|
|
371
|
+
<p>{result.snippet}</p>
|
|
372
|
+
{/each}
|
|
373
|
+
```
|
|
305
374
|
|
|
306
|
-
|
|
307
|
-
import { createSearchClient, buildResultUrl } from "searchsocket/client";
|
|
375
|
+
Call `search.destroy()` to clean up when no longer needed (automatic in component context).
|
|
308
376
|
|
|
309
|
-
|
|
310
|
-
const { results } = await client.search({ q: "installation" });
|
|
377
|
+
### `<SearchSocket>` component
|
|
311
378
|
|
|
312
|
-
|
|
313
|
-
|
|
314
|
-
|
|
315
|
-
|
|
316
|
-
}
|
|
379
|
+
Declarative meta tag component for controlling per-page search behavior:
|
|
380
|
+
|
|
381
|
+
```svelte
|
|
382
|
+
<script>
|
|
383
|
+
import { SearchSocket } from "searchsocket/svelte";
|
|
384
|
+
</script>
|
|
385
|
+
|
|
386
|
+
<!-- Boost this page's search ranking -->
|
|
387
|
+
<SearchSocket weight={1.2} />
|
|
388
|
+
|
|
389
|
+
<!-- Exclude from search -->
|
|
390
|
+
<SearchSocket noindex />
|
|
391
|
+
|
|
392
|
+
<!-- Add filterable tags -->
|
|
393
|
+
<SearchSocket tags={["guide", "advanced"]} />
|
|
394
|
+
|
|
395
|
+
<!-- Add structured metadata (filterable via search API) -->
|
|
396
|
+
<SearchSocket meta={{ version: 2, category: "api" }} />
|
|
397
|
+
```
|
|
398
|
+
|
|
399
|
+
The component renders `<meta>` tags in `<svelte:head>` that SearchSocket reads during indexing.
|
|
400
|
+
|
|
401
|
+
### Template components
|
|
402
|
+
|
|
403
|
+
Copy ready-made search UI components into your project:
|
|
404
|
+
|
|
405
|
+
```bash
|
|
406
|
+
pnpm searchsocket add search-dialog
|
|
407
|
+
pnpm searchsocket add search-input
|
|
408
|
+
pnpm searchsocket add search-results
|
|
317
409
|
```
|
|
318
410
|
|
|
319
|
-
|
|
411
|
+
These are Svelte 5 components copied to `src/lib/components/search/` (configurable via `--dir`). They're starting points to customize, not dependencies.
|
|
412
|
+
|
|
413
|
+
## Scroll-to-Text Navigation
|
|
414
|
+
|
|
415
|
+
When a user clicks a search result, SearchSocket scrolls them to the matching section on the destination page.
|
|
320
416
|
|
|
321
|
-
###
|
|
417
|
+
### Setup
|
|
322
418
|
|
|
323
|
-
|
|
419
|
+
Add the scroll handler to your root layout:
|
|
324
420
|
|
|
325
421
|
```svelte
|
|
326
422
|
<!-- src/routes/+layout.svelte -->
|
|
@@ -332,489 +428,627 @@ A SvelteKit `afterNavigate` hook that reads the `_ssk` parameter and scrolls the
|
|
|
332
428
|
</script>
|
|
333
429
|
```
|
|
334
430
|
|
|
335
|
-
|
|
336
|
-
- Matches headings (h1–h6) case-insensitively with whitespace normalization
|
|
337
|
-
- Falls back to a broader text node search if no heading matches
|
|
338
|
-
- Scrolls smoothly to the first match
|
|
339
|
-
- Is a silent no-op when `_ssk` is absent or no match is found
|
|
431
|
+
### How it works
|
|
340
432
|
|
|
341
|
-
|
|
433
|
+
1. `buildResultUrl()` encodes the section title and text snippet into the URL
|
|
434
|
+
2. On SvelteKit client-side navigation, the `afterNavigate` hook reads `_ssk`/`_sskt` params
|
|
435
|
+
3. A TreeWalker-based text mapper finds the exact position in the DOM
|
|
436
|
+
4. The page scrolls smoothly to the match
|
|
437
|
+
5. The matching text is highlighted using the [CSS Custom Highlight API](https://developer.mozilla.org/en-US/docs/Web/API/CSS_Custom_Highlight_API) (with a DOM fallback for older browsers)
|
|
438
|
+
6. On full page loads, browsers that support Text Fragments (`#:~:text=`) handle scrolling natively
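Step 2 above, reading the parameters back on navigation, can be sketched with the standard `URL` API. The function names and the normalization rule here are assumptions for illustration only; the actual handler is `searchsocketScrollToText`, and its matching logic may differ.

```ts
// Hypothetical sketch: extract the scroll targets that a result URL carries.
interface ScrollTarget {
  sectionTitle: string | null; // value of `_ssk`
  textTarget: string | null; // value of `_sskt`
}

function readScrollTarget(url: URL): ScrollTarget {
  return {
    sectionTitle: url.searchParams.get("_ssk"),
    textTarget: url.searchParams.get("_sskt")
  };
}

// Assumed normalization before comparing against page text:
// collapse whitespace runs and ignore case.
function normalizeForMatch(text: string): string {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

const target = readScrollTarget(
  new URL("https://example.com/docs/getting-started?_ssk=Installation&_sskt=Install+with+pnpm")
);
```

When both values are `null` there is nothing to scroll to, so a handler built this way can simply return early.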
|
|
342
439
|
|
|
343
|
-
|
|
440
|
+
The highlight fades after 2 seconds. Customize with CSS:
|
|
344
441
|
|
|
345
|
-
|
|
442
|
+
```css
|
|
443
|
+
::highlight(ssk-highlight) {
|
|
444
|
+
background-color: rgba(250, 204, 21, 0.4);
|
|
445
|
+
}
|
|
446
|
+
```
|
|
346
447
|
|
|
347
|
-
|
|
348
|
-
- Path: `.searchsocket/vectors.db` (configurable)
|
|
349
|
-
- No account or API keys needed
|
|
350
|
-
- Full vector search with `libsql_vector_idx` and `vector_top_k`
|
|
351
|
-
- Perfect for local development and CI testing
|
|
448
|
+
## Search & Ranking
|
|
352
449
|
|
|
353
|
-
###
|
|
450
|
+
### Dual search
|
|
354
451
|
|
|
355
|
-
|
|
452
|
+
By default, SearchSocket runs two parallel queries — one against page-level summaries and one against individual chunks — then blends the scores:
|
|
356
453
|
|
|
357
|
-
|
|
358
|
-
|
|
359
|
-
|
|
360
|
-
|
|
454
|
+
```ts
|
|
455
|
+
export default {
|
|
456
|
+
search: {
|
|
457
|
+
dualSearch: true, // default
|
|
458
|
+
pageSearchWeight: 0.3 // weight of page results vs chunks (0-1)
|
|
459
|
+
}
|
|
460
|
+
};
|
|
461
|
+
```
|
|
361
462
|
|
|
362
|
-
|
|
363
|
-
turso auth signup
|
|
463
|
+
### Page aggregation
|
|
364
464
|
|
|
365
|
-
|
|
366
|
-
turso db create searchsocket-prod
|
|
465
|
+
With `groupBy: "page"` (default), chunk results are grouped by page URL:
|
|
367
466
|
|
|
368
|
-
|
|
369
|
-
|
|
370
|
-
|
|
371
|
-
```
|
|
467
|
+
1. The top chunk score becomes the base page score
|
|
468
|
+
2. Additional matching chunks add a decaying bonus: `chunk_score * decay^i`
|
|
469
|
+
3. Per-URL page weights are applied multiplicatively
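The three steps above can be sketched as a small function. This is an interpretation rather than SearchSocket's actual code: the type names are hypothetical, the exponent is assumed to start at 1 for the first additional chunk, and scaling the bonus by a `bonusWeight` (mirroring `ranking.weights.aggregation`) is an assumption.

```ts
// Hypothetical shapes for illustration; not SearchSocket's internal types.
interface ChunkHit {
  url: string;
  score: number;
}

interface AggregationOptions {
  cap: number; // mirrors `ranking.aggregationCap`
  decay: number; // mirrors `ranking.aggregationDecay`
  bonusWeight: number; // assumed scaling, mirrors `ranking.weights.aggregation`
}

function aggregateByPage(
  hits: ChunkHit[],
  opts: AggregationOptions,
  pageWeight: (url: string) => number = () => 1
): Map<string, number> {
  // Group chunk scores by page URL.
  const byUrl = new Map<string, number[]>();
  for (const hit of hits) {
    const scores = byUrl.get(hit.url) ?? [];
    scores.push(hit.score);
    byUrl.set(hit.url, scores);
  }

  const pageScores = new Map<string, number>();
  for (const [url, scores] of byUrl) {
    // Keep at most `cap` chunks (at least one), best first.
    const sorted = [...scores].sort((a, b) => b - a).slice(0, Math.max(1, opts.cap));
    const [top, ...rest] = sorted;
    // Each additional chunk adds score * decay^i, with i = 1 for the second-best chunk.
    const bonus = rest.reduce((sum, s, i) => sum + s * Math.pow(opts.decay, i + 1), 0);
    // Page weights are applied multiplicatively at the end.
    pageScores.set(url, (top + opts.bonusWeight * bonus) * pageWeight(url));
  }
  return pageScores;
}

const pageScores = aggregateByPage(
  [
    { url: "/a", score: 0.8 },
    { url: "/a", score: 0.4 },
    { url: "/b", score: 0.5 }
  ],
  { cap: 5, decay: 0.5, bonusWeight: 0.1 }
);
```

With these inputs, `/a` ends slightly above its top chunk score because the second chunk contributes a small decayed bonus, while `/b` keeps its single chunk score.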
|
|
372
470
|
|
|
373
|
-
|
|
374
|
-
```bash
|
|
375
|
-
TURSO_DATABASE_URL=libsql://searchsocket-prod-xxx.turso.io
|
|
376
|
-
TURSO_AUTH_TOKEN=eyJhbGc...
|
|
377
|
-
```
|
|
471
|
+
### Ranking configuration
|
|
378
472
|
|
|
379
|
-
|
|
473
|
+
```ts
|
|
474
|
+
export default {
|
|
475
|
+
ranking: {
|
|
476
|
+
enableIncomingLinkBoost: true, // boost pages with more internal links pointing to them
|
|
477
|
+
enableDepthBoost: true, // boost shallower pages (/ > /docs > /docs/api)
|
|
478
|
+
enableFreshnessBoost: false, // boost recently published content
|
|
479
|
+
enableAnchorTextBoost: false, // boost pages whose link text matches the query
|
|
380
480
|
|
|
381
|
-
|
|
481
|
+
pageWeights: { // per-URL score multipliers (prefix matching)
|
|
482
|
+
"/": 0.95,
|
|
483
|
+
"/docs": 1.15,
|
|
484
|
+
"/download": 1.05
|
|
485
|
+
},
|
|
382
486
|
|
|
383
|
-
|
|
487
|
+
aggregationCap: 5, // max chunks contributing to page score
|
|
488
|
+
aggregationDecay: 0.5, // decay for additional chunks
|
|
489
|
+
minScoreRatio: 0.70, // drop results below 70% of best score
|
|
490
|
+
scoreGapThreshold: 0.4, // trim results >40% below best
|
|
491
|
+
minChunkScoreRatio: 0.5, // threshold for sub-chunks
|
|
384
492
|
|
|
385
|
-
|
|
386
|
-
|
|
387
|
-
|
|
388
|
-
|
|
389
|
-
|
|
390
|
-
|
|
391
|
-
|
|
392
|
-
url: "libsql://my-db.turso.io", // direct URL
|
|
393
|
-
authToken: "eyJhbGc..." // direct auth token
|
|
493
|
+
weights: {
|
|
494
|
+
incomingLinks: 0.05,
|
|
495
|
+
depth: 0.03,
|
|
496
|
+
aggregation: 0.1,
|
|
497
|
+
titleMatch: 0.15,
|
|
498
|
+
freshness: 0.1,
|
|
499
|
+
anchorText: 0.10
|
|
394
500
|
}
|
|
395
501
|
}
|
|
396
502
|
};
|
|
397
503
|
```
|
|
398
504
|
|
|
399
|
-
|
|
505
|
+
Use gentle `pageWeights` values (0.9–1.2) since they compound with other boosts.
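To make two of these settings concrete, here is a minimal sketch of prefix-matched page weights and the `minScoreRatio` cutoff. Both functions are hypothetical helpers, not exports of the package. Picking the longest matching prefix is an assumption, since the config comment only says "prefix matching".

```ts
// Hypothetical: resolve a multiplier for a URL from a `pageWeights` map.
// Assumes the longest matching prefix wins; unmatched URLs get 1.
// Note this is a plain string prefix test, so "/docs" would also match "/docsify".
function pageWeightFor(url: string, pageWeights: Record<string, number>): number {
  let bestPrefix = "";
  let weight = 1;
  for (const [prefix, w] of Object.entries(pageWeights)) {
    if (url.startsWith(prefix) && prefix.length > bestPrefix.length) {
      bestPrefix = prefix;
      weight = w;
    }
  }
  return weight;
}

// Hypothetical: drop results scoring below `minScoreRatio` times the best score.
function dropLowScores<T extends { score: number }>(results: T[], minScoreRatio: number): T[] {
  if (results.length === 0) return [];
  const best = Math.max(...results.map((r) => r.score));
  return results.filter((r) => r.score >= best * minScoreRatio);
}

const weights = { "/": 0.95, "/docs": 1.15, "/download": 1.05 };
const docsWeight = pageWeightFor("/docs/api", weights);
const kept = dropLowScores([{ score: 1 }, { score: 0.8 }, { score: 0.6 }], 0.7);
```

Because `"/"` is a prefix of every path, it acts as a site-wide baseline under this reading, and more specific keys override it.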
|
|
400
506
|
|
|
401
|
-
|
|
402
|
-
|
|
403
|
-
When switching embedding models (e.g., from a 1536-dim model to Jina's 1024-dim), the vector dimension changes. SearchSocket automatically detects this and recreates the chunks table with the new dimension — no manual intervention needed. A full re-index (`--force`) is still required after switching models.
|
|
507
|
+
## Build-Triggered Indexing
|
|
404
508
|
|
|
405
|
-
|
|
509
|
+
The recommended workflow is to index automatically on every deploy. Add the Vite plugin to your config:
|
|
406
510
|
|
|
407
|
-
|
|
408
|
-
|
|
409
|
-
|
|
410
|
-
|
|
411
|
-
- **Vector search native** — `F32_BLOB` vectors, cosine similarity index, `vector_top_k` ANN queries
|
|
511
|
+
```ts
|
|
512
|
+
// vite.config.ts
|
|
513
|
+
import { sveltekit } from "@sveltejs/kit/vite";
|
|
514
|
+
import { searchsocketVitePlugin } from "searchsocket/sveltekit";
|
|
412
515
|
|
|
413
|
-
|
|
516
|
+
export default {
|
|
517
|
+
plugins: [
|
|
518
|
+
sveltekit(),
|
|
519
|
+
searchsocketVitePlugin({
|
|
520
|
+
changedOnly: true, // incremental indexing (default)
|
|
521
|
+
verbose: true
|
|
522
|
+
})
|
|
523
|
+
]
|
|
524
|
+
};
|
|
525
|
+
```
|
|
414
526
|
|
|
415
|
-
|
|
527
|
+
### Vercel / Cloudflare / Netlify
|
|
416
528
|
|
|
417
|
-
|
|
529
|
+
Set these environment variables in your hosting platform:
|
|
418
530
|
|
|
419
|
-
|
|
531
|
+
| Variable | Value |
|
|
532
|
+
|----------|-------|
|
|
533
|
+
| `UPSTASH_VECTOR_REST_URL` | Your Upstash Vector REST URL |
|
|
534
|
+
| `UPSTASH_VECTOR_REST_TOKEN` | Your Upstash Vector REST token |
|
|
535
|
+
| `SEARCHSOCKET_AUTO_INDEX` | `1` |
|
|
420
536
|
|
|
421
|
-
|
|
537
|
+
Every deploy will build your site, index the content into Upstash, and serve the search API and MCP endpoint — fully automated.
|
|
422
538
|
|
|
423
|
-
|
|
424
|
-
// hooks.server.ts (Vercel / Netlify)
|
|
425
|
-
import { searchsocketHandle } from "searchsocket/sveltekit";
|
|
539
|
+
### Environment variable control
|
|
426
540
|
|
|
427
|
-
|
|
428
|
-
|
|
429
|
-
|
|
430
|
-
source: { mode: "static-output" },
|
|
431
|
-
embeddings: { apiKeyEnv: "JINA_API_KEY" },
|
|
432
|
-
}
|
|
433
|
-
});
|
|
434
|
-
```
|
|
541
|
+
```bash
|
|
542
|
+
# Enable indexing on build
|
|
543
|
+
SEARCHSOCKET_AUTO_INDEX=1 pnpm build
|
|
435
544
|
|
|
436
|
-
|
|
437
|
-
|
|
438
|
-
- `TURSO_DATABASE_URL`
|
|
439
|
-
- `TURSO_AUTH_TOKEN`
|
|
545
|
+
# Disable temporarily
|
|
546
|
+
SEARCHSOCKET_DISABLE_AUTO_INDEX=1 pnpm build
|
|
440
547
|
|
|
441
|
-
|
|
548
|
+
# Force full rebuild (ignore incremental cache)
|
|
549
|
+
SEARCHSOCKET_FORCE_REINDEX=1 pnpm build
|
|
550
|
+
```
|
|
442
551
|
|
|
443
|
-
|
|
552
|
+
## Making Images Searchable
|
|
444
553
|
|
|
445
|
-
|
|
554
|
+
SearchSocket converts images to text during extraction using this priority chain:
|
|
446
555
|
|
|
447
|
-
|
|
448
|
-
|
|
449
|
-
|
|
556
|
+
1. `data-search-description` on the `<img>` — your explicit description
|
|
557
|
+
2. `data-search-description` on the parent `<figure>`
|
|
558
|
+
3. `alt` text + `<figcaption>` combined
|
|
559
|
+
4. `alt` text alone (filters generic words like "image", "icon")
|
|
560
|
+
5. `<figcaption>` alone
|
|
561
|
+
6. Removed — images with no useful text are dropped
|
|
450
562
|
|
|
451
|
-
|
|
563
|
+
```html
|
|
564
|
+
<img
|
|
565
|
+
src="/screenshots/settings.png"
|
|
566
|
+
alt="Settings page"
|
|
567
|
+
data-search-description="The settings page showing API key configuration, theme selection, and notification preferences"
|
|
568
|
+
/>
|
|
569
|
+
```
|
|
452
570
|
|
|
453
|
-
|
|
454
|
-
|----------|---------|-------|
|
|
455
|
-
| Vercel | `adapter-auto` (default) | Serverless — use `rawConfig` + remote Turso |
|
|
456
|
-
| Netlify | `adapter-netlify` | Serverless — same as Vercel |
|
|
457
|
-
| VPS / Docker | `adapter-node` | Long-lived process — no limitations, local SQLite works |
|
|
571
|
+
Works with SvelteKit's `enhanced:img`:
|
|
458
572
|
|
|
459
|
-
|
|
573
|
+
```svelte
|
|
574
|
+
<enhanced:img
|
|
575
|
+
src="./screenshots/dashboard.png"
|
|
576
|
+
alt="Dashboard"
|
|
577
|
+
data-search-description="Main dashboard showing active projects and indexing status"
|
|
578
|
+
/>
|
|
579
|
+
```
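The priority chain can be read as a simple fallthrough. The sketch below is illustrative only: `ImageText` and `describeImage` are hypothetical names, the generic-word list contains just the two words the README mentions, and the `": "` separator used to combine alt text with a caption is an assumption.

```ts
// Hypothetical input: text already pulled from the <img> and its parent <figure>.
interface ImageText {
  imgDescription?: string; // data-search-description on the <img>
  figureDescription?: string; // data-search-description on the parent <figure>
  alt?: string;
  figcaption?: string;
}

// Assumed list; the README names "image" and "icon" only as examples.
const GENERIC_ALT = new Set(["image", "icon"]);

function describeImage(img: ImageText): string | null {
  const clean = (s?: string): string => (s ?? "").trim();

  const imgDesc = clean(img.imgDescription);
  if (imgDesc) return imgDesc; // 1. explicit description on the image

  const figDesc = clean(img.figureDescription);
  if (figDesc) return figDesc; // 2. explicit description on the figure

  const rawAlt = clean(img.alt);
  const alt = GENERIC_ALT.has(rawAlt.toLowerCase()) ? "" : rawAlt;
  const caption = clean(img.figcaption);

  if (alt && caption) return `${alt}: ${caption}`; // 3. alt + figcaption combined
  if (alt) return alt; // 4. alt alone
  if (caption) return caption; // 5. figcaption alone
  return null; // 6. nothing useful, image is dropped
}

const described = describeImage({
  alt: "Settings page",
  imgDescription: "The settings page showing API key configuration"
});
```

Returning `null` for the last case lets a caller remove the image from the extracted text, which matches step 6.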
|
|
460
580
|
|
|
461
|
-
|
|
581
|
+
## MCP Server
|
|
462
582
|
|
|
463
|
-
|
|
583
|
+
SearchSocket includes an MCP server that gives Claude Code, Claude Desktop, and other MCP clients direct access to your site's search index. The MCP endpoint is built into `searchsocketHandle()` — once your site is deployed, any MCP client can connect to it over HTTP.
|
|
464
584
|
|
|
465
|
-
|
|
466
|
-
- **Dimensions**: 1024 (default)
|
|
467
|
-
- **Cost**: ~$0.00005 per 1K tokens
|
|
468
|
-
- **Task adapters**: Uses `retrieval.passage` for indexing, `retrieval.query` for search queries (LoRA task-specific adapters for better retrieval quality)
|
|
585
|
+
### Available tools
|
|
469
586
|
|
|
470
|
-
|
|
587
|
+
| Tool | Description |
|
|
588
|
+
|------|-------------|
|
|
589
|
+
| `search` | Semantic search with filtering, grouping, and reranking |
|
|
590
|
+
| `get_page` | Retrieve full page markdown with frontmatter |
|
|
591
|
+
| `list_pages` | Cursor-paginated page listing |
|
|
592
|
+
| `get_site_structure` | Hierarchical page tree |
|
|
593
|
+
| `find_source_file` | Locate the SvelteKit source file for content |
|
|
594
|
+
| `get_related_pages` | Find related pages by links, semantics, and structure |
|
|
471
595
|
|
|
472
|
-
|
|
473
|
-
2. **Title Prepend**: Page title is prepended to each chunk for better context (`chunking.prependTitle`, default: true)
|
|
474
|
-
3. **Summary Chunk**: A synthetic identity chunk is generated per page with title, URL, and first paragraph (`chunking.pageSummaryChunk`, default: true)
|
|
475
|
-
4. **Embedding**: Each chunk is sent to Jina's embedding API with the `retrieval.passage` task adapter
|
|
476
|
-
5. **Batching**: Requests batched (64 texts per request) for efficiency
|
|
477
|
-
6. **Storage**: Vectors stored in Turso with metadata (URL, title, tags, depth, etc.)
|
|
596
|
+
### Connecting to your deployed site
|
|
478
597
|
|
|
479
|
-
|
|
598
|
+
The recommended setup is to connect Claude Code to your deployed site's MCP endpoint. This way the index stays up to date automatically as you deploy, and there's no local process to manage.
|
|
480
599
|
|
|
481
|
-
|
|
482
|
-
```bash
|
|
483
|
-
pnpm searchsocket index --dry-run
|
|
484
|
-
```
|
|
600
|
+
Add `.mcp.json` to your project root:
|
|
485
601
|
|
|
486
|
-
|
|
487
|
-
|
|
488
|
-
|
|
489
|
-
|
|
490
|
-
|
|
491
|
-
|
|
492
|
-
|
|
493
|
-
|
|
602
|
+
```json
|
|
603
|
+
{
|
|
604
|
+
"mcpServers": {
|
|
605
|
+
"searchsocket": {
|
|
606
|
+
"type": "http",
|
|
607
|
+
"url": "https://your-site.com/api/mcp"
|
|
608
|
+
}
|
|
609
|
+
}
|
|
610
|
+
}
|
|
494
611
|
```
|
|
495
612
|
|
|
496
|
-
|
|
613
|
+
That's it. Restart Claude Code and the six search tools are available. You can search your docs, retrieve page content, and find source files directly from the AI assistant.
|
|
497
614
|
|
|
498
|
-
|
|
615
|
+
To protect the endpoint, add API key authentication:
|
|
499
616
|
|
|
500
617
|
```ts
|
|
501
|
-
|
|
502
|
-
|
|
503
|
-
|
|
504
|
-
|
|
618
|
+
// src/hooks.server.ts
import { searchsocketHandle } from "searchsocket/sveltekit";
|
|
619
|
+
export const handle = searchsocketHandle({
|
|
620
|
+
rawConfig: {
|
|
621
|
+
mcp: {
|
|
622
|
+
handle: {
|
|
623
|
+
apiKey: process.env.SEARCHSOCKET_MCP_API_KEY
|
|
624
|
+
}
|
|
625
|
+
}
|
|
626
|
+
}
|
|
627
|
+
});
|
|
505
628
|
```
|
|
506
629
|
|
|
507
|
-
|
|
630
|
+
Then pass the key in `.mcp.json`:
|
|
508
631
|
|
|
509
|
-
|
|
632
|
+
```json
|
|
633
|
+
{
|
|
634
|
+
"mcpServers": {
|
|
635
|
+
"searchsocket": {
|
|
636
|
+
"type": "http",
|
|
637
|
+
"url": "https://your-site.com/api/mcp",
|
|
638
|
+
"headers": {
|
|
639
|
+
"Authorization": "Bearer ${SEARCHSOCKET_MCP_API_KEY}"
|
|
640
|
+
}
|
|
641
|
+
}
|
|
642
|
+
}
|
|
643
|
+
}
|
|
644
|
+
```
|
|
510
645
|
|
|
511
|
-
|
|
646
|
+
The `${SEARCHSOCKET_MCP_API_KEY}` syntax references an environment variable so you don't hardcode secrets in `.mcp.json`.
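
On the server side, this setup implies a check that the incoming `Authorization: Bearer ...` header matches the configured key. The sketch below illustrates that idea only; it is not SearchSocket's actual implementation, and `isAuthorized` and `constantTimeEqual` are hypothetical names.

```ts
// Hypothetical helpers illustrating a Bearer-token check; not SearchSocket's real code.

// Compares two strings without exiting early on the first mismatch.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Returns true only when the header is "Bearer <token>" and the token equals apiKey.
function isAuthorized(authorizationHeader: string | null, apiKey: string): boolean {
  if (!authorizationHeader) return false;
  const match = /^Bearer (.+)$/.exec(authorizationHeader);
  if (!match) return false;
  const token = match[1] ?? "";
  return constantTimeEqual(token, apiKey);
}
```

A request with no header, a different scheme (such as `Basic`), or a wrong token is rejected.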
|
|
512
647
|
|
|
513
|
-
|
|
648
|
+
### Auto-approving in Claude Code
|
|
514
649
|
|
|
515
|
-
|
|
516
|
-
2. Additional matching chunks contribute a decaying bonus: `chunk_score * decay^i`
|
|
517
|
-
3. Optional per-URL page weights are applied multiplicatively
|
|
650
|
+
Skip the approval prompt each time a tool is called:
|
|
518
651
|
|
|
519
|
-
|
|
652
|
+
```json
|
|
653
|
+
{
|
|
654
|
+
"allowedMcpServers": [
|
|
655
|
+
{ "serverName": "searchsocket" }
|
|
656
|
+
]
|
|
657
|
+
}
|
|
658
|
+
```
|
|
520
659
|
|
|
521
|
-
|
|
522
|
-
|
|
523
|
-
|
|
524
|
-
|
|
525
|
-
|
|
526
|
-
|
|
527
|
-
|
|
528
|
-
|
|
529
|
-
|
|
530
|
-
|
|
531
|
-
"
|
|
532
|
-
|
|
533
|
-
weights: {
|
|
534
|
-
aggregation: 0.1, // weight of aggregation bonus (default: 0.1)
|
|
535
|
-
incomingLinks: 0.05, // incoming link boost weight (default: 0.05)
|
|
536
|
-
depth: 0.03, // URL depth boost weight (default: 0.03)
|
|
537
|
-
rerank: 1.0 // reranker score weight (default: 1.0)
|
|
660
|
+
Add this to `.claude/settings.json` in your project.
|
|
661
|
+
|
|
662
|
+
### Local development
|
|
663
|
+
|
|
664
|
+
During local development, you can point to your dev server instead:
|
|
665
|
+
|
|
666
|
+
```json
|
|
667
|
+
{
|
|
668
|
+
"mcpServers": {
|
|
669
|
+
"searchsocket": {
|
|
670
|
+
"type": "http",
|
|
671
|
+
"url": "http://localhost:5173/api/mcp"
|
|
538
672
|
}
|
|
539
673
|
}
|
|
540
|
-
}
|
|
674
|
+
}
|
|
541
675
|
```
|
|
542
676
|
|
|
543
|
-
|
|
677
|
+
### Claude Desktop
|
|
678
|
+
|
|
679
|
+
On macOS, add the following to `~/Library/Application Support/Claude/claude_desktop_config.json`:
|
|
544
680
|
|
|
545
|
-
|
|
681
|
+
```json
|
|
682
|
+
{
|
|
683
|
+
"mcpServers": {
|
|
684
|
+
"searchsocket": {
|
|
685
|
+
"command": "npx",
|
|
686
|
+
"args": ["searchsocket", "mcp"],
|
|
687
|
+
"cwd": "/path/to/your/project"
|
|
688
|
+
}
|
|
689
|
+
}
|
|
690
|
+
}
|
|
691
|
+
```
|
|
546
692
|
|
|
547
|
-
###
|
|
693
|
+
### Standalone HTTP server
|
|
548
694
|
|
|
549
|
-
|
|
695
|
+
Run the MCP server as a standalone process (outside SvelteKit):
|
|
550
696
|
|
|
551
697
|
```bash
|
|
552
|
-
|
|
553
|
-
-H "content-type: application/json" \
|
|
554
|
-
-d '{"q":"vector search","topK":10,"groupBy":"chunk"}'
|
|
698
|
+
pnpm searchsocket mcp --transport http --port 3338
|
|
555
699
|
```
|
|
556
700
|
|
|
557
|
-
##
|
|
701
|
+
## llms.txt Generation
|
|
558
702
|
|
|
559
|
-
|
|
703
|
+
Generate [llms.txt](https://llmstxt.org/) files during indexing — a standardized way to make your site content available to LLMs.
|
|
560
704
|
|
|
561
|
-
**`vite.config.ts` or `svelte.config.js`:**
|
|
562
705
|
```ts
|
|
563
|
-
import { searchsocketVitePlugin } from "searchsocket/sveltekit";
|
|
564
|
-
|
|
565
706
|
export default {
|
|
566
|
-
|
|
567
|
-
|
|
568
|
-
|
|
569
|
-
|
|
570
|
-
|
|
571
|
-
|
|
572
|
-
|
|
573
|
-
|
|
707
|
+
project: {
|
|
708
|
+
baseUrl: "https://example.com"
|
|
709
|
+
},
|
|
710
|
+
llmsTxt: {
|
|
711
|
+
enable: true,
|
|
712
|
+
title: "My Project",
|
|
713
|
+
description: "Documentation for My Project",
|
|
714
|
+
outputPath: "static/llms.txt", // default
|
|
715
|
+
generateFull: true, // also generate llms-full.txt
|
|
716
|
+
serveMarkdownVariants: false // serve /page.md variants via the hook
|
|
717
|
+
}
|
|
574
718
|
};
|
|
575
719
|
```
|
|
576
720
|
|
|
577
|
-
|
|
578
|
-
```bash
|
|
579
|
-
# Enable via env var
|
|
580
|
-
SEARCHSOCKET_AUTO_INDEX=1 pnpm build
|
|
721
|
+
After indexing, `llms.txt` (page index with links) and `llms-full.txt` (full content) are written to your static directory and served by `searchsocketHandle()`.
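
To give a feel for the page index, here is a minimal sketch that renders an `llms.txt` body in the general shape described at llmstxt.org: an H1 title, a blockquote summary, then a list of links. The `PageEntry` type, the `renderLlmsTxt` name, and the `## Pages` section heading are assumptions made for illustration; the file SearchSocket actually writes may be laid out differently.

```ts
// Hypothetical types and function names; a sketch of the llms.txt layout only.
type PageEntry = { title: string; url: string; description?: string };

function renderLlmsTxt(title: string, description: string, pages: PageEntry[]): string {
  // H1 title, blockquote summary, then one section of markdown links.
  const lines: string[] = [`# ${title}`, "", `> ${description}`, "", "## Pages", ""];
  for (const page of pages) {
    // The note after the link is optional.
    const note = page.description ? `: ${page.description}` : "";
    lines.push(`- [${page.title}](${page.url})${note}`);
  }
  return lines.join("\n") + "\n";
}
```

Calling `renderLlmsTxt("My Project", "Documentation for My Project", pages)` with the `title` and `description` from the config above yields a file that starts with `# My Project`.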
|
|
581
722
|
|
|
582
|
-
|
|
583
|
-
SEARCHSOCKET_DISABLE_AUTO_INDEX=1 pnpm build
|
|
584
|
-
```
|
|
585
|
-
|
|
586
|
-
## Commands
|
|
723
|
+
## CLI Commands
|
|
587
724
|
|
|
588
725
|
### `searchsocket init`
|
|
589
726
|
|
|
590
|
-
Initialize config and state directory.
|
|
727
|
+
Initialize config and state directory. Creates `searchsocket.config.ts`, `.searchsocket/`, `.mcp.json`, and wires up your hooks and Vite config.
|
|
591
728
|
|
|
592
729
|
```bash
|
|
593
730
|
pnpm searchsocket init
|
|
731
|
+
pnpm searchsocket init --non-interactive
|
|
594
732
|
```
|
|
595
733
|
|
|
596
734
|
### `searchsocket index`
|
|
597
735
|
|
|
598
|
-
Index content into
|
|
736
|
+
Index content into Upstash Vector.
|
|
599
737
|
|
|
600
738
|
```bash
|
|
601
|
-
#
|
|
602
|
-
pnpm searchsocket index --
|
|
739
|
+
pnpm searchsocket index # incremental (default: --changed-only)
|
|
740
|
+
pnpm searchsocket index --force # full re-index
|
|
741
|
+
pnpm searchsocket index --source build # override source mode
|
|
742
|
+
pnpm searchsocket index --scope staging # override scope
|
|
743
|
+
pnpm searchsocket index --dry-run # preview without writing
|
|
744
|
+
pnpm searchsocket index --max-pages 10 # limit for testing
|
|
745
|
+
pnpm searchsocket index --verbose # detailed output
|
|
746
|
+
pnpm searchsocket index --json # machine-readable output
|
|
747
|
+
```
|
|
603
748
|
|
|
604
|
-
|
|
605
|
-
pnpm searchsocket index --force
|
|
749
|
+
### `searchsocket search`
|
|
606
750
|
|
|
607
|
-
|
|
608
|
-
pnpm searchsocket index --dry-run
|
|
751
|
+
Run a search from the command line to test the index.
|
|
609
752
|
|
|
610
|
-
|
|
611
|
-
pnpm searchsocket
|
|
753
|
+
```bash
|
|
754
|
+
pnpm searchsocket search --q "getting started" --top-k 5
|
|
755
|
+
pnpm searchsocket search --q "api" --path-prefix /docs
|
|
756
|
+
```
|
|
612
757
|
|
|
613
|
-
|
|
614
|
-
pnpm searchsocket index --max-pages 10 --max-chunks 50
|
|
758
|
+
### `searchsocket dev`
|
|
615
759
|
|
|
616
|
-
|
|
617
|
-
pnpm searchsocket index --scope staging
|
|
760
|
+
Watch for file changes and auto-reindex, with optional playground UI.
|
|
618
761
|
|
|
619
|
-
|
|
620
|
-
pnpm searchsocket
|
|
762
|
+
```bash
|
|
763
|
+
pnpm searchsocket dev # watch + playground at :3337
|
|
764
|
+
pnpm searchsocket dev --mcp --mcp-port 3338 # also start MCP HTTP server
|
|
765
|
+
pnpm searchsocket dev --no-playground # watch only
|
|
621
766
|
```
|
|
622
767
|
|
|
623
768
|
### `searchsocket status`
|
|
624
769
|
|
|
625
|
-
Show indexing status
|
|
770
|
+
Show indexing status and backend health.
|
|
626
771
|
|
|
627
772
|
```bash
|
|
628
773
|
pnpm searchsocket status
|
|
774
|
+
```
|
|
775
|
+
|
|
776
|
+
### `searchsocket doctor`
|
|
629
777
|
|
|
630
|
-
|
|
631
|
-
|
|
632
|
-
|
|
633
|
-
|
|
634
|
-
# vector backend: turso/libsql (local (.searchsocket/vectors.db))
|
|
635
|
-
# vector health: ok
|
|
636
|
-
# last indexed (main): 2025-02-23T10:30:00Z
|
|
637
|
-
# tracked chunks: 156
|
|
638
|
-
# last estimated tokens: 32,400
|
|
639
|
-
# last estimated cost: $0.000648
|
|
778
|
+
Validate config, env vars, provider connectivity, and write access.
|
|
779
|
+
|
|
780
|
+
```bash
|
|
781
|
+
pnpm searchsocket doctor
|
|
640
782
|
```
|
|
641
783
|
|
|
642
|
-
### `searchsocket
|
|
784
|
+
### `searchsocket test`
|
|
643
785
|
|
|
644
|
-
|
|
786
|
+
Run search quality assertions against the live index.
|
|
645
787
|
|
|
646
788
|
```bash
|
|
647
|
-
pnpm searchsocket
|
|
789
|
+
pnpm searchsocket test # uses searchsocket.test.json
|
|
790
|
+
pnpm searchsocket test --file custom-tests.json # custom test file
|
|
791
|
+
```
|
|
792
|
+
|
|
793
|
+
Test file format:
|
|
648
794
|
|
|
649
|
-
|
|
650
|
-
|
|
795
|
+
```json
|
|
796
|
+
[
|
|
797
|
+
{
|
|
798
|
+
"query": "installation guide",
|
|
799
|
+
"expect": {
|
|
800
|
+
"topResult": "/docs/getting-started",
|
|
801
|
+
"inTop5": ["/docs/getting-started", "/docs/quickstart"]
|
|
802
|
+
}
|
|
803
|
+
}
|
|
804
|
+
]
|
|
651
805
|
```
|
|
652
806
|
|
|
653
|
-
|
|
654
|
-
- `src/routes/**` (route files)
|
|
655
|
-
- `build/` (if static-output mode)
|
|
656
|
-
- Build output dir (if build mode)
|
|
657
|
-
- Content files (if content-files mode)
|
|
658
|
-
- `searchsocket.config.ts` (if crawl or build mode)
|
|
807
|
+
Reports pass/fail per assertion and Mean Reciprocal Rank (MRR) across all queries.
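
For reference, Mean Reciprocal Rank averages `1 / rank` of the first expected URL for each query, counting a query as 0 when no expected URL appears. The sketch below follows that standard definition. The `RankedQuery` shape and the function names are hypothetical, and this is not necessarily how `searchsocket test` computes its figure.

```ts
// Hypothetical names; a sketch of the standard MRR definition.
type RankedQuery = {
  results: string[];  // result URLs in ranked order, best first
  expected: string[]; // URLs that count as a correct answer
};

// 1 / (1-based position of the first expected URL), or 0 if none match.
function reciprocalRank({ results, expected }: RankedQuery): number {
  const index = results.findIndex((url) => expected.includes(url));
  return index === -1 ? 0 : 1 / (index + 1);
}

// Mean of the reciprocal ranks across all queries (0 for an empty list).
function meanReciprocalRank(queries: RankedQuery[]): number {
  if (queries.length === 0) return 0;
  const total = queries.reduce((sum, query) => sum + reciprocalRank(query), 0);
  return total / queries.length;
}
```

An MRR of 1 means every query returned an expected page in first position; lower values mean expected pages tend to rank further down.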
|
|
659
808
|
|
|
660
809
|
### `searchsocket clean`
|
|
661
810
|
|
|
662
|
-
Delete local state and optionally remote
|
|
811
|
+
Delete local state and optionally remote indexes.
|
|
663
812
|
|
|
664
813
|
```bash
|
|
665
|
-
#
|
|
666
|
-
pnpm searchsocket clean
|
|
667
|
-
|
|
668
|
-
# Local + remote vectors
|
|
669
|
-
pnpm searchsocket clean --remote --scope staging
|
|
814
|
+
pnpm searchsocket clean # local state only
|
|
815
|
+
pnpm searchsocket clean --remote # also delete remote scope
|
|
816
|
+
pnpm searchsocket clean --scope staging # specific scope
|
|
670
817
|
```
|
|
671
818
|
|
|
672
819
|
### `searchsocket prune`
|
|
673
820
|
|
|
674
|
-
|
|
821
|
+
List and delete stale scopes. Compares against git branches to find orphaned scopes.
|
|
675
822
|
|
|
676
823
|
```bash
|
|
677
|
-
#
|
|
678
|
-
pnpm searchsocket prune --
|
|
824
|
+
pnpm searchsocket prune # dry-run (default)
|
|
825
|
+
pnpm searchsocket prune --apply # actually delete
|
|
826
|
+
pnpm searchsocket prune --older-than 30d # only scopes older than 30 days
|
|
827
|
+
```
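
The `--older-than 30d` filter can be pictured as a cutoff on each scope's last indexing time. This is a sketch under assumptions: the `ScopeInfo` shape, `parseOlderThan`, and `staleScopes` are hypothetical, only the `<n>d` form shown above is parsed, and the real command also compares scopes against git branches.

```ts
// Hypothetical shape; SearchSocket's real scope metadata may differ.
type ScopeInfo = { name: string; lastIndexedAt: Date };

const DAY_MS = 24 * 60 * 60 * 1000;

// Parses "<n>d" (days) only, the single form shown in the commands above.
function parseOlderThan(value: string): number {
  const match = /^(\d+)d$/.exec(value.trim());
  if (!match) throw new Error(`Unsupported duration: ${value}`);
  return Number(match[1]) * DAY_MS;
}

// Returns the scopes last indexed before (now - olderThan).
function staleScopes(scopes: ScopeInfo[], olderThan: string, now: Date = new Date()): ScopeInfo[] {
  const cutoff = now.getTime() - parseOlderThan(olderThan);
  return scopes.filter((scope) => scope.lastIndexedAt.getTime() < cutoff);
}
```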
|
|
679
828
|
|
|
680
|
-
|
|
681
|
-
pnpm searchsocket prune --older-than 30d --apply
|
|
829
|
+
### `searchsocket mcp`
|
|
682
830
|
|
|
683
|
-
|
|
684
|
-
|
|
831
|
+
Run the MCP server standalone.
|
|
832
|
+
|
|
833
|
+
```bash
|
|
834
|
+
pnpm searchsocket mcp # stdio (default)
|
|
835
|
+
pnpm searchsocket mcp --transport http --port 3338 # HTTP
|
|
836
|
+
pnpm searchsocket mcp --access public --api-key SECRET # public with auth
|
|
685
837
|
```
|
|
686
838
|
|
|
687
|
-
### `searchsocket
|
|
839
|
+
### `searchsocket add`
|
|
688
840
|
|
|
689
|
-
|
|
841
|
+
Copy Svelte 5 search UI template components into your project.
|
|
690
842
|
|
|
691
843
|
```bash
|
|
692
|
-
pnpm searchsocket
|
|
693
|
-
|
|
694
|
-
|
|
695
|
-
#
|
|
696
|
-
# PASS env JINA_API_KEY
|
|
697
|
-
# PASS turso/libsql (local file: .searchsocket/vectors.db)
|
|
698
|
-
# PASS source: build manifest
|
|
699
|
-
# PASS source: vite binary
|
|
700
|
-
# PASS embedding provider connectivity
|
|
701
|
-
# PASS vector backend connectivity
|
|
702
|
-
# PASS vector backend write permission
|
|
703
|
-
# PASS state directory writable
|
|
844
|
+
pnpm searchsocket add search-dialog
|
|
845
|
+
pnpm searchsocket add search-input
|
|
846
|
+
pnpm searchsocket add search-results
|
|
847
|
+
pnpm searchsocket add search-dialog --dir src/lib/components/ui # custom dir
|
|
704
848
|
```
|
|
705
849
|
|
|
706
|
-
|
|
850
|
+
## Real-World Example
|
|
707
851
|
|
|
708
|
-
|
|
852
|
+
Here's how [Canopy](https://canopy.dev) integrates SearchSocket into a production SvelteKit site.
|
|
709
853
|
|
|
710
|
-
|
|
711
|
-
# stdio transport (default)
|
|
712
|
-
pnpm searchsocket mcp
|
|
854
|
+
### Configuration
|
|
713
855
|
|
|
714
|
-
|
|
715
|
-
|
|
856
|
+
```ts
|
|
857
|
+
// searchsocket.config.ts
|
|
858
|
+
export default {
|
|
859
|
+
project: {
|
|
860
|
+
id: "canopy-website",
|
|
861
|
+
baseUrl: "https://canopy.dev"
|
|
862
|
+
},
|
|
863
|
+
source: {
|
|
864
|
+
mode: "build"
|
|
865
|
+
},
|
|
866
|
+
extract: {
|
|
867
|
+
dropSelectors: [".nav-blur", ".mobile-overlay", ".docs-sidebar"]
|
|
868
|
+
},
|
|
869
|
+
ranking: {
|
|
870
|
+
minScoreRatio: 0.70,
|
|
871
|
+
pageWeights: {
|
|
872
|
+
"/": 0.95,
|
|
873
|
+
"/download": 1.05,
|
|
874
|
+
"/docs/**": 1.05
|
|
875
|
+
},
|
|
876
|
+
aggregationCap: 3,
|
|
877
|
+
aggregationDecay: 0.3
|
|
878
|
+
}
|
|
879
|
+
};
|
|
716
880
|
```
|
|
717
881
|
|
|
718
|
-
###
|
|
882
|
+
### Server hook
|
|
719
883
|
|
|
720
|
-
|
|
884
|
+
```ts
|
|
885
|
+
// src/hooks.server.ts
|
|
886
|
+
import { searchsocketHandle } from "searchsocket/sveltekit";
|
|
887
|
+
import { env } from "$env/dynamic/private";
|
|
721
888
|
|
|
722
|
-
|
|
723
|
-
|
|
889
|
+
export const handle = searchsocketHandle({
|
|
890
|
+
rawConfig: {
|
|
891
|
+
project: { id: "canopy-website", baseUrl: "https://canopy.dev" },
|
|
892
|
+
source: { mode: "build" },
|
|
893
|
+
upstash: {
|
|
894
|
+
url: env.UPSTASH_VECTOR_REST_URL,
|
|
895
|
+
token: env.UPSTASH_VECTOR_REST_TOKEN
|
|
896
|
+
},
|
|
897
|
+
extract: {
|
|
898
|
+
dropSelectors: [".nav-blur", ".mobile-overlay", ".docs-sidebar"]
|
|
899
|
+
},
|
|
900
|
+
ranking: {
|
|
901
|
+
minScoreRatio: 0.70,
|
|
902
|
+
pageWeights: { "/": 0.95, "/download": 1.05, "/docs/**": 1.05 },
|
|
903
|
+
aggregationCap: 3,
|
|
904
|
+
aggregationDecay: 0.3
|
|
905
|
+
}
|
|
906
|
+
}
|
|
907
|
+
});
|
|
724
908
|
```
|
|
725
909
|
|
|
726
|
-
|
|
910
|
+
### Search modal with scoped search
|
|
727
911
|
|
|
728
|
-
|
|
912
|
+
```svelte
|
|
913
|
+
<!-- SearchModal.svelte -->
|
|
914
|
+
<script>
|
|
915
|
+
import { createSearchClient, buildResultUrl } from "searchsocket/client";
|
|
916
|
+
|
|
917
|
+
let { open = $bindable(false), pathPrefix = "", placeholder = "Search..." } = $props();
|
|
918
|
+
|
|
919
|
+
const client = createSearchClient();
|
|
920
|
+
let query = $state("");
|
|
921
|
+
let results = $state([]);
|
|
922
|
+
|
|
923
|
+
async function doSearch() {
|
|
924
|
+
if (!query.trim()) { results = []; return; }
|
|
925
|
+
const res = await client.search({
|
|
926
|
+
q: query,
|
|
927
|
+
topK: 8,
|
|
928
|
+
groupBy: "page",
|
|
929
|
+
pathPrefix: pathPrefix || undefined
|
|
930
|
+
});
|
|
931
|
+
results = res.results;
|
|
932
|
+
}
|
|
933
|
+
</script>
|
|
934
|
+
|
|
935
|
+
{#if open}
|
|
936
|
+
<dialog open>
|
|
937
|
+
<input bind:value={query} oninput={doSearch} {placeholder} />
|
|
938
|
+
{#each results as result}
|
|
939
|
+
<a href={buildResultUrl(result)} onclick={() => open = false}>
|
|
940
|
+
<strong>{result.title}</strong>
|
|
941
|
+
{#if result.sectionTitle}<span>— {result.sectionTitle}</span>{/if}
|
|
942
|
+
<p>{result.snippet}</p>
|
|
943
|
+
</a>
|
|
944
|
+
{/each}
|
|
945
|
+
</dialog>
|
|
946
|
+
{/if}
|
|
947
|
+
```
|
|
729
948
|
|
|
730
|
-
###
|
|
949
|
+
### Scroll-to-text in layout
|
|
950
|
+
|
|
951
|
+
```svelte
|
|
952
|
+
<!-- src/routes/+layout.svelte -->
|
|
953
|
+
<script>
|
|
954
|
+
import { afterNavigate } from "$app/navigation";
|
|
955
|
+
import { searchsocketScrollToText } from "searchsocket/sveltekit";
|
|
956
|
+
|
|
957
|
+
afterNavigate(searchsocketScrollToText);
|
|
958
|
+
</script>
|
|
959
|
+
```
|
|
731
960
|
|
|
732
|
-
|
|
733
|
-
- Semantic search across indexed content
|
|
734
|
-
- Returns ranked results with URL, title, snippet, score, and routeFile
|
|
735
|
-
- Options: `scope`, `topK` (1-100), `pathPrefix`, `tags`, `groupBy` (`"page"` | `"chunk"`)
|
|
961
|
+
### Deploy and index
|
|
736
962
|
|
|
737
|
-
|
|
738
|
-
- Retrieve full indexed page content as markdown with frontmatter
|
|
739
|
-
- Options: `scope`
|
|
963
|
+
Indexing runs automatically on every Vercel deploy. Set these env vars in the Vercel dashboard:
|
|
740
964
|
|
|
741
|
-
|
|
965
|
+
- `UPSTASH_VECTOR_REST_URL`
|
|
966
|
+
- `UPSTASH_VECTOR_REST_TOKEN`
|
|
967
|
+
- `SEARCHSOCKET_AUTO_INDEX=1`
|
|
742
968
|
|
|
743
|
-
|
|
969
|
+
The Vite plugin handles the rest. Alternatively, use a postbuild script:
|
|
744
970
|
|
|
745
971
|
```json
|
|
746
972
|
{
|
|
747
|
-
"
|
|
748
|
-
"
|
|
749
|
-
|
|
750
|
-
"command": "npx",
|
|
751
|
-
"args": ["searchsocket", "mcp"],
|
|
752
|
-
"env": {}
|
|
753
|
-
}
|
|
973
|
+
"scripts": {
|
|
974
|
+
"build": "vite build",
|
|
975
|
+
"postbuild": "searchsocket index"
|
|
754
976
|
}
|
|
755
977
|
}
|
|
756
978
|
```
|
|
757
979
|
|
|
758
|
-
|
|
759
|
-
|
|
760
|
-
```bash
|
|
761
|
-
claude mcp list
|
|
762
|
-
```
|
|
763
|
-
|
|
764
|
-
### Setup (Claude Desktop)
|
|
765
|
-
|
|
766
|
-
Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
|
|
980
|
+
### Connect Claude Code to the deployed site
|
|
767
981
|
|
|
768
982
|
```json
|
|
769
983
|
{
|
|
770
984
|
"mcpServers": {
|
|
771
985
|
"searchsocket": {
|
|
772
|
-
"
|
|
773
|
-
"
|
|
774
|
-
"cwd": "/path/to/your/project"
|
|
986
|
+
"type": "http",
|
|
987
|
+
"url": "https://canopy.dev/api/mcp"
|
|
775
988
|
}
|
|
776
989
|
}
|
|
777
990
|
}
|
|
778
991
|
```
|
|
779
992
|
|
|
780
|
-
|
|
993
|
+
Now Claude Code can search the live docs, retrieve page content, and find source files — all backed by the production index that stays current with every deploy.
|
|
781
994
|
|
|
782
|
-
###
|
|
995
|
+
### Excluding pages from search
|
|
783
996
|
|
|
784
|
-
|
|
785
|
-
|
|
786
|
-
|
|
787
|
-
|
|
997
|
+
```svelte
|
|
998
|
+
<!-- src/routes/blog/+page.svelte (archive page) -->
|
|
999
|
+
<svelte:head>
|
|
1000
|
+
<meta name="searchsocket-weight" content="0" />
|
|
1001
|
+
</svelte:head>
|
|
788
1002
|
```
|
|
789
1003
|
|
|
790
|
-
|
|
1004
|
+
Or with the component:
|
|
791
1005
|
|
|
792
|
-
|
|
1006
|
+
```svelte
|
|
1007
|
+
<script>
|
|
1008
|
+
import { SearchSocket } from "searchsocket/svelte";
|
|
1009
|
+
</script>
|
|
1010
|
+
|
|
1011
|
+
<SearchSocket weight={0} />
|
|
1012
|
+
```
|
|
793
1013
|
|
|
794
|
-
|
|
1014
|
+
### Vite SSR config
|
|
795
1015
|
|
|
796
|
-
|
|
1016
|
+
```ts
|
|
1017
|
+
// vite.config.ts
|
|
1018
|
+
import { sveltekit } from "@sveltejs/kit/vite";
|
|
1019
|
+
import { defineConfig } from "vite";
|
|
1020
|
+
|
|
1021
|
+
export default defineConfig({
|
|
1022
|
+
plugins: [sveltekit()],
|
|
1023
|
+
ssr: {
|
|
1024
|
+
external: ["searchsocket", "searchsocket/sveltekit", "searchsocket/client"]
|
|
1025
|
+
}
|
|
1026
|
+
});
|
|
1027
|
+
```
|
|
797
1028
|
|
|
798
|
-
|
|
799
|
-
- `JINA_API_KEY` — Jina AI API key for embeddings and reranking
|
|
1029
|
+
## Environment Variables
|
|
800
1030
|
|
|
801
|
-
###
|
|
1031
|
+
### Required
|
|
802
1032
|
|
|
803
|
-
|
|
804
|
-
|
|
805
|
-
|
|
1033
|
+
| Variable | Description |
|
|
1034
|
+
|----------|-------------|
|
|
1035
|
+
| `UPSTASH_VECTOR_REST_URL` | Upstash Vector REST API endpoint |
|
|
1036
|
+
| `UPSTASH_VECTOR_REST_TOKEN` | Upstash Vector REST API token |
|
|
806
1037
|
|
|
807
|
-
|
|
1038
|
+
### Optional
|
|
808
1039
|
|
|
809
|
-
|
|
1040
|
+
| Variable | Description |
|
|
1041
|
+
|----------|-------------|
|
|
1042
|
+
| `SEARCHSOCKET_SCOPE` | Override scope (when `scope.mode: "env"`) |
|
|
1043
|
+
| `SEARCHSOCKET_AUTO_INDEX` | Enable build-triggered indexing (`1`, `true`, or `yes`) |
|
|
1044
|
+
| `SEARCHSOCKET_DISABLE_AUTO_INDEX` | Disable build-triggered indexing |
|
|
1045
|
+
| `SEARCHSOCKET_FORCE_REINDEX` | Force full re-index in CI/CD |
|
|
810
1046
|
|
|
811
|
-
|
|
812
|
-
- `SEARCHSOCKET_AUTO_INDEX` — Enable build-triggered indexing
|
|
813
|
-
- `SEARCHSOCKET_DISABLE_AUTO_INDEX` — Disable build-triggered indexing
|
|
1047
|
+
The CLI automatically loads `.env` from the working directory on startup.
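
As an illustration of how the two auto-index flags in the table might interact, the sketch below treats `1`, `true`, and `yes` as truthy. Letting the disable flag win when both are set, and ignoring case and surrounding whitespace, are assumptions rather than documented behavior. `isTruthyFlag` and `shouldAutoIndex` are hypothetical names.

```ts
// Hypothetical helpers; precedence and case handling are assumptions.
type Env = Record<string, string | undefined>;

// Accepts the values listed in the table: 1, true, or yes.
function isTruthyFlag(value: string | undefined): boolean {
  if (value === undefined) return false;
  return ["1", "true", "yes"].includes(value.trim().toLowerCase());
}

function shouldAutoIndex(env: Env): boolean {
  // Assumption: the disable flag takes precedence when both are set.
  if (isTruthyFlag(env["SEARCHSOCKET_DISABLE_AUTO_INDEX"])) return false;
  return isTruthyFlag(env["SEARCHSOCKET_AUTO_INDEX"]);
}
```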
|
|
814
1048
|
|
|
815
|
-
## Configuration
|
|
1049
|
+
## Configuration Reference
|
|
816
1050
|
|
|
817
|
-
|
|
1051
|
+
See [docs/config.md](docs/config.md) for the full configuration reference. Here's a complete example:
|
|
818
1052
|
|
|
819
1053
|
```ts
|
|
820
1054
|
export default {
|
|
@@ -824,41 +1058,24 @@ export default {
|
|
|
824
1058
|
},
|
|
825
1059
|
|
|
826
1060
|
scope: {
|
|
827
|
-
mode: "git",
|
|
1061
|
+
mode: "git", // "fixed" | "git" | "env"
|
|
828
1062
|
fixed: "main",
|
|
829
1063
|
sanitize: true
|
|
830
1064
|
},
|
|
831
1065
|
|
|
1066
|
+
exclude: ["/admin/*", "/api/*"],
|
|
1067
|
+
respectRobotsTxt: true,
|
|
1068
|
+
|
|
832
1069
|
source: {
|
|
833
|
-
mode: "build",
|
|
1070
|
+
mode: "build",
|
|
834
1071
|
staticOutputDir: "build",
|
|
835
|
-
strictRouteMapping: false,
|
|
836
|
-
|
|
837
|
-
// Build mode (recommended for CI/CD)
|
|
838
1072
|
build: {
|
|
839
|
-
outputDir: ".svelte-kit/output",
|
|
840
|
-
previewTimeout: 30000,
|
|
841
1073
|
exclude: ["/api/*"],
|
|
842
1074
|
paramValues: {
|
|
843
1075
|
"/blog/[slug]": ["hello-world", "getting-started"]
|
|
844
1076
|
},
|
|
845
|
-
discover:
|
|
846
|
-
|
|
847
|
-
maxPages: 200,
|
|
848
|
-
maxDepth: 5
|
|
849
|
-
},
|
|
850
|
-
|
|
851
|
-
// Crawl mode (alternative)
|
|
852
|
-
crawl: {
|
|
853
|
-
baseUrl: "http://localhost:4173",
|
|
854
|
-
routes: ["/", "/docs", "/blog"],
|
|
855
|
-
sitemapUrl: "https://example.com/sitemap.xml"
|
|
856
|
-
},
|
|
857
|
-
|
|
858
|
-
// Content files mode (alternative)
|
|
859
|
-
contentFiles: {
|
|
860
|
-
globs: ["src/routes/**/*.md"],
|
|
861
|
-
baseDir: "."
|
|
1077
|
+
discover: true,
|
|
1078
|
+
maxPages: 200
|
|
862
1079
|
}
|
|
863
1080
|
},
|
|
864
1081
|
|
|
@@ -868,77 +1085,77 @@ export default {
|
|
|
868
1085
|
dropSelectors: [".sidebar", ".toc"],
|
|
869
1086
|
ignoreAttr: "data-search-ignore",
|
|
870
1087
|
noindexAttr: "data-search-noindex",
|
|
871
|
-
|
|
1088
|
+
imageDescAttr: "data-search-description"
|
|
872
1089
|
},
|
|
873
1090
|
|
|
874
1091
|
chunking: {
|
|
875
|
-
maxChars:
|
|
1092
|
+
maxChars: 1500,
|
|
876
1093
|
overlapChars: 200,
|
|
877
1094
|
minChars: 250,
|
|
878
|
-
|
|
879
|
-
|
|
880
|
-
prependTitle: true, // prepend page title to chunk text before embedding
|
|
881
|
-
pageSummaryChunk: true // generate synthetic identity chunk per page
|
|
1095
|
+
prependTitle: true,
|
|
1096
|
+
pageSummaryChunk: true
|
|
882
1097
|
},
|
|
883
1098
|
|
|
884
|
-
|
|
885
|
-
|
|
886
|
-
|
|
887
|
-
apiKey: "jina_...", // direct API key (or use apiKeyEnv)
|
|
888
|
-
apiKeyEnv: "JINA_API_KEY",
|
|
889
|
-
batchSize: 64,
|
|
890
|
-
concurrency: 4
|
|
1099
|
+
upstash: {
|
|
1100
|
+
urlEnv: "UPSTASH_VECTOR_REST_URL",
|
|
1101
|
+
tokenEnv: "UPSTASH_VECTOR_REST_TOKEN"
|
|
891
1102
|
},
|
|
892
1103
|
|
|
893
|
-
|
|
894
|
-
|
|
895
|
-
|
|
896
|
-
url: "libsql://my-db.turso.io", // direct URL (or use urlEnv)
|
|
897
|
-
authToken: "eyJhbGc...", // direct token (or use authTokenEnv)
|
|
898
|
-
urlEnv: "TURSO_DATABASE_URL",
|
|
899
|
-
authTokenEnv: "TURSO_AUTH_TOKEN",
|
|
900
|
-
localPath: ".searchsocket/vectors.db"
|
|
901
|
-
}
|
|
902
|
-
},
|
|
903
|
-
|
|
904
|
-
rerank: {
|
|
905
|
-
enabled: true,
|
|
906
|
-
topN: 20,
|
|
907
|
-
model: "jina-reranker-v3"
|
|
1104
|
+
search: {
|
|
1105
|
+
dualSearch: true,
|
|
1106
|
+
pageSearchWeight: 0.3
|
|
908
1107
|
},
|
|
909
1108
|
|
|
910
1109
|
ranking: {
|
|
911
1110
|
enableIncomingLinkBoost: true,
|
|
912
1111
|
enableDepthBoost: true,
|
|
913
|
-
pageWeights: {
|
|
914
|
-
|
|
915
|
-
"/docs": 1.15
|
|
916
|
-
},
|
|
917
|
-
minScore: 0,
|
|
1112
|
+
pageWeights: { "/docs": 1.15 },
|
|
1113
|
+
minScoreRatio: 0.70,
|
|
918
1114
|
aggregationCap: 5,
|
|
919
|
-
aggregationDecay: 0.5
|
|
920
|
-
minChunkScoreRatio: 0.5,
|
|
921
|
-
weights: {
|
|
922
|
-
incomingLinks: 0.05,
|
|
923
|
-
depth: 0.03,
|
|
924
|
-
rerank: 1.0,
|
|
925
|
-
aggregation: 0.1
|
|
926
|
-
}
|
|
1115
|
+
aggregationDecay: 0.5
|
|
927
1116
|
},
|
|
928
1117
|
|
|
929
1118
|
api: {
|
|
930
1119
|
path: "/api/search",
|
|
931
|
-
cors: {
|
|
932
|
-
|
|
933
|
-
|
|
934
|
-
|
|
935
|
-
|
|
936
|
-
|
|
937
|
-
|
|
1120
|
+
cors: { allowOrigins: ["https://example.com"] }
|
|
1121
|
+
},
|
|
1122
|
+
|
|
1123
|
+
mcp: {
|
|
1124
|
+
enable: true,
|
|
1125
|
+
handle: { path: "/api/mcp" }
|
|
1126
|
+
},
|
|
1127
|
+
|
|
1128
|
+
llmsTxt: {
|
|
1129
|
+
enable: true,
|
|
1130
|
+
title: "My Project",
|
|
1131
|
+
description: "Documentation for My Project"
|
|
1132
|
+
},
|
|
1133
|
+
|
|
1134
|
+
state: {
|
|
1135
|
+
dir: ".searchsocket"
|
|
938
1136
|
}
|
|
939
1137
|
};
|
|
940
1138
|
```
|
|
941
1139
|
|
|
1140
|
+
## CI/CD
|
|
1141
|
+
|
|
1142
|
+
See [docs/ci.md](docs/ci.md) for ready-to-use GitHub Actions workflows covering:
|
|
1143
|
+
|
|
1144
|
+
- Main branch indexing on push
|
|
1145
|
+
- PR dry-run validation
|
|
1146
|
+
- Preview branch scope isolation
|
|
1147
|
+
- Scheduled scope pruning
|
|
1148
|
+
- Vercel build-triggered indexing
|
|
1149
|
+
|
|
1150
|
+
## Further Reading
|
|
1151
|
+
|
|
1152
|
+
- [Building a Search UI](docs/search-ui.md) — Cmd+K modals, scoped search, styling, and API reference
|
|
1153
|
+
- [Tuning Search Relevance](docs/tuning.md) — visual playground, ranking parameters, and search quality testing
|
|
1154
|
+
- [Configuration Reference](docs/config.md) — all config options, indexing hooks, and custom records
|
|
1155
|
+
- [CI/CD Workflows](docs/ci.md) — GitHub Actions and Vercel integration
|
|
1156
|
+
- [MCP over HTTP Guide](docs/mcp-claude-code.md) — detailed HTTP MCP setup for Claude Code
|
|
1157
|
+
- [Troubleshooting](docs/troubleshooting.md) — common issues, diagnostics, and FAQ
|
|
1158
|
+
|
|
942
1159
|
## License
|
|
943
1160
|
|
|
944
1161
|
MIT
|