searchsocket 0.3.3 → 0.5.0
- package/README.md +57 -39
- package/dist/cli.js +947 -1378
- package/dist/client.cjs +45 -0
- package/dist/client.d.cts +3 -2
- package/dist/client.d.ts +3 -2
- package/dist/client.js +45 -1
- package/dist/index.cjs +909 -1286
- package/dist/index.d.cts +73 -33
- package/dist/index.d.ts +73 -33
- package/dist/index.js +906 -1281
- package/dist/plugin-B_npJSux.d.cts +36 -0
- package/dist/plugin-M-aW0ev6.d.ts +36 -0
- package/dist/scroll.cjs +185 -0
- package/dist/scroll.d.cts +42 -0
- package/dist/scroll.d.ts +42 -0
- package/dist/scroll.js +183 -0
- package/dist/sveltekit.cjs +997 -1204
- package/dist/sveltekit.d.cts +3 -43
- package/dist/sveltekit.d.ts +3 -43
- package/dist/sveltekit.js +995 -1202
- package/dist/{types-BrG6XTUU.d.cts → types-Dk43uz25.d.cts} +50 -109
- package/dist/{types-BrG6XTUU.d.ts → types-Dk43uz25.d.ts} +50 -109
- package/package.json +10 -3
package/README.md
CHANGED
@@ -6,17 +6,17 @@ Semantic site search and MCP retrieval for SvelteKit content projects.

 ## Features

-- **Embeddings**: Jina AI `jina-embeddings-
+- **Embeddings**: Jina AI `jina-embeddings-v5-text-small` with task-specific LoRA adapters (configurable)
 - **Vector Backend**: Turso/libSQL with vector search (local file DB for development, remote for production)
-- **Rerank**:
+- **Rerank**: Jina `jina-reranker-v3` enabled by default — same API key
 - **Page Aggregation**: Group results by page with score-weighted chunk decay
 - **Meta Extraction**: Automatically extracts `<meta name="description">` and `<meta name="keywords">` for improved relevance
 - **SvelteKit Integrations**:
   - `searchsocketHandle()` for `POST /api/search` endpoint
   - `searchsocketVitePlugin()` for build-triggered indexing
-- **Client Library**: `createSearchClient()` for browser-side search
+- **Client Library**: `createSearchClient()` for browser-side search, `buildResultUrl()` for scroll-to-section links
+- **Scroll-to-Text**: `searchsocketScrollToText()` auto-scrolls to matching sections on navigation
 - **MCP Server**: Model Context Protocol tools for search and page retrieval
-- **Git-Tracked Markdown Mirror**: Commit-safe deterministic markdown outputs

 ## Install

@@ -163,7 +163,7 @@ pnpm searchsocket search --q "getting started" --top-k 5 --path-prefix /docs
   "meta": {
     "timingsMs": { "embed": 120, "vector": 15, "rerank": 0, "total": 135 },
     "usedRerank": false,
-    "modelId": "jina-embeddings-
+    "modelId": "jina-embeddings-v5-text-small"
   }
 }
 ```
@@ -291,6 +291,53 @@ for (const result of response.results) {
 }
 ```

+## Scroll-to-Text Navigation
+
+When a visitor clicks a search result, SearchSocket can automatically scroll them to the relevant section on the destination page. This uses two utilities:
+
+### `buildResultUrl(result)`
+
+Builds a URL from a search result that includes:
+- A `_ssk` query parameter for SvelteKit client-side navigation (read by `searchsocketScrollToText`)
+- A [Text Fragment](https://developer.mozilla.org/en-US/docs/Web/URI/Fragment/Text_fragments) (`#:~:text=`) for native browser scroll-to-text on full page loads (Chrome 80+, Safari 16.1+, Firefox 131+)
+
+Import from `searchsocket/client`:
+
+```ts
+import { createSearchClient, buildResultUrl } from "searchsocket/client";
+
+const client = createSearchClient();
+const { results } = await client.search({ q: "installation" });
+
+// Use in your search UI
+for (const result of results) {
+  const href = buildResultUrl(result);
+  // "/docs/getting-started?_ssk=Installation#:~:text=Installation"
+}
+```
+
+If the result has no `sectionTitle`, the original URL is returned unchanged.
+
+### `searchsocketScrollToText`
+
+A SvelteKit `afterNavigate` hook that reads the `_ssk` parameter and scrolls the matching heading into view. Add it to your root layout:
+
+```svelte
+<!-- src/routes/+layout.svelte -->
+<script>
+  import { afterNavigate } from '$app/navigation';
+  import { searchsocketScrollToText } from 'searchsocket/sveltekit';
+
+  afterNavigate(searchsocketScrollToText);
+</script>
+```
+
+The hook:
+- Matches headings (h1–h6) case-insensitively with whitespace normalization
+- Falls back to a broader text node search if no heading matches
+- Scrolls smoothly to the first match
+- Is a silent no-op when `_ssk` is absent or no match is found
+
 ## Vector Backend: Turso/libSQL

 SearchSocket uses **Turso** (libSQL) as its single vector backend, providing a unified experience across development and production.
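The README text added in this hunk describes what `buildResultUrl()` returns and how the scroll hook matches headings, but the portion of the diff shown here does not include the implementation. The sketch below illustrates that described behaviour under stated assumptions; it is not the package's code. `ResultLike`, `buildResultUrlSketch`, `normalizeText`, and `findHeadingIndex` are hypothetical names, and the `url` field name is assumed. Only `sectionTitle`, the `_ssk` parameter, and the `#:~:text=` fragment come from the README.

```ts
// Hypothetical result shape; the real SearchSocket result type may differ.
interface ResultLike {
  url: string;
  sectionTitle?: string;
}

// Sketch: append `_ssk` plus a `#:~:text=` fragment when a section title exists.
function buildResultUrlSketch(result: ResultLike): string {
  const title = result.sectionTitle?.trim();
  if (!title) return result.url; // no section title: URL returned unchanged

  // Drop any existing fragment so the text fragment is the only one.
  const hashIndex = result.url.indexOf("#");
  const base = hashIndex === -1 ? result.url : result.url.slice(0, hashIndex);
  const separator = base.includes("?") ? "&" : "?";

  const queryValue = encodeURIComponent(title);
  // Text directives also reserve "-", which encodeURIComponent leaves alone.
  const fragmentValue = queryValue.replace(/-/g, "%2D");
  return `${base}${separator}_ssk=${queryValue}#:~:text=${fragmentValue}`;
}

// Case-insensitive comparison key with runs of whitespace collapsed.
function normalizeText(text: string): string {
  return text.replace(/\s+/g, " ").trim().toLowerCase();
}

// Returns the index of the first heading matching the `_ssk` value, or -1.
function findHeadingIndex(headings: string[], ssk: string): number {
  const target = normalizeText(ssk);
  if (!target) return -1;
  return headings.findIndex((heading) => normalizeText(heading) === target);
}
```

In a browser, the heading strings would presumably come from the `textContent` of `h1` to `h6` elements, with the matched element passed to `scrollIntoView({ behavior: "smooth" })`; that DOM wiring is left out here so the sketch stays runnable outside a page.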
@@ -399,7 +446,6 @@ The built-in `InMemoryRateLimiter` auto-disables on serverless platforms (it res

 The following features are only used during `searchsocket index` (CLI), not the search handler:
 - `ensureStateDirs` — creates `.searchsocket/` state directories
-- Markdown mirror — writes `.searchsocket/mirror/` files
 - Local SQLite fallback — only needed when `TURSO_DATABASE_URL` is not set

 ### Adapter Guidance
@@ -416,9 +462,9 @@ SearchSocket uses **Jina AI's embedding models** to convert text into semantic v

 ### Default Model

-- **Model**: `jina-embeddings-
+- **Model**: `jina-embeddings-v5-text-small`
 - **Dimensions**: 1024 (default)
-- **Cost**: ~$0.
+- **Cost**: ~$0.00005 per 1K tokens
 - **Task adapters**: Uses `retrieval.passage` for indexing, `retrieval.query` for search queries (LoRA task-specific adapters for better retrieval quality)

 ### How It Works
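To make the updated "Default Model" bullets concrete, here is a small worked sketch of the quoted cost rate and the task-adapter split. The rate and the two task names are taken from the README lines in this hunk; `estimateEmbeddingCostUsd`, `taskFor`, and the corpus size are hypothetical and used only for illustration, and the README marks the rate itself as approximate.

```ts
// Approximate rate quoted in the README: ~$0.00005 per 1K tokens.
const USD_PER_1K_TOKENS = 0.00005;

// Hypothetical helper: estimated embedding cost for a given token count.
function estimateEmbeddingCostUsd(totalTokens: number): number {
  if (!Number.isFinite(totalTokens) || totalTokens < 0) {
    throw new RangeError("totalTokens must be a non-negative finite number");
  }
  return (totalTokens / 1000) * USD_PER_1K_TOKENS;
}

// Task adapters named in the README: passages when indexing, queries when searching.
type EmbeddingTask = "retrieval.passage" | "retrieval.query";

// Hypothetical helper: pick the adapter for the current operation.
function taskFor(mode: "index" | "search"): EmbeddingTask {
  return mode === "index" ? "retrieval.passage" : "retrieval.query";
}

// Assumed corpus: 500 pages at roughly 2,000 tokens each = 1,000,000 tokens.
const corpusTokens = 500 * 2000;
console.log(estimateEmbeddingCostUsd(corpusTokens).toFixed(2)); // prints "0.05"
console.log(taskFor("index")); // prints "retrieval.passage"
```

At that rate a full re-index of the assumed corpus costs about five cents, so per-query embedding cost is likely negligible next to indexing.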
@@ -537,34 +583,6 @@ SEARCHSOCKET_AUTO_INDEX=1 pnpm build
 SEARCHSOCKET_DISABLE_AUTO_INDEX=1 pnpm build
 ```

-## Git-Tracked Markdown Mirror
-
-Indexing writes a **deterministic markdown mirror**:
-
-```
-.searchsocket/pages/<scope>/<path>.md
-```
-
-Example:
-```
-.searchsocket/pages/main/docs/intro.md
-```
-
-Each file contains:
-- Frontmatter: URL, title, scope, route file, metadata
-- Markdown: Extracted content
-
-**Why commit it?**
-- Content workflows (edit markdown, regenerate embeddings)
-- Version control for indexed content
-- Debugging (see exactly what was indexed)
-- Offline search (grep the mirror)
-
-Add to `.gitignore` if you don't need it:
-```
-.searchsocket/pages/
-```
-
 ## Commands

 ### `searchsocket init`
@@ -612,7 +630,7 @@ pnpm searchsocket status
 # Output:
 # project: my-site
 # resolved scope: main
-# embedding model: jina-embeddings-
+# embedding model: jina-embeddings-v5-text-small
 # vector backend: turso/libsql (local (.searchsocket/vectors.db))
 # vector health: ok
 # last indexed (main): 2025-02-23T10:30:00Z
@@ -865,7 +883,7 @@ export default {

   embeddings: {
     provider: "jina",
-    model: "jina-embeddings-
+    model: "jina-embeddings-v5-text-small",
     apiKey: "jina_...", // direct API key (or use apiKeyEnv)
     apiKeyEnv: "JINA_API_KEY",
     batchSize: 64,
@@ -886,7 +904,7 @@ export default {
   rerank: {
     enabled: true,
     topN: 20,
-    model: "jina-reranker-
+    model: "jina-reranker-v3"
   },

   ranking: {