sweet-search 2.3.0 → 2.4.2

@@ -30,8 +30,18 @@
  * before addon loads so
  * the Rust dtype policy
  * picks BF16/F16/F32 by
- * compute capability.
+ * compute capability and
+ * model family.
  * See mod.rs::optimal_dtype
+ * SWEET_SEARCH_NATIVE_DTYPE=f32|bf16|f16 Global dtype preference.
+ * On CUDA, BF16 is used for
+ * embeddings on Ampere+ but
+ * LI remains F32 for quality.
+ * SWEET_SEARCH_NATIVE_EMBED_DTYPE=f32|bf16|f16 Per-model diagnostic
+ * override for embeddings.
+ * SWEET_SEARCH_NATIVE_LI_DTYPE=f32|bf16|f16 Per-model diagnostic
+ * override; BF16/F16 LI is
+ * known to drift on CUDA.
  * CANDLE_METAL_COMPUTE_PER_BUFFER=<N> — candle default 50 (tuned)
  * CANDLE_METAL_COMMAND_POOL_SIZE=<N> — candle default 5 (tuned)
  */
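Reviewer note: the three dtype variables documented in this hunk all accept the same value set (`f32|bf16|f16`). The sketch below shows one way a JavaScript caller could validate them before the addon loads. It is illustrative only: `readDtypeEnv` and `readDtypeSettings` are hypothetical helpers that are not part of sweet-search, and the trimming and case-folding are assumptions. The Rust side in `mod.rs::optimal_dtype` may parse these values differently.

```javascript
// Allowed values, taken from the doc comment in the hunk above.
const ALLOWED_DTYPES = new Set(['f32', 'bf16', 'f16']);

// Hypothetical helper (not a sweet-search API): read one dtype variable from
// an env-like object. Returns the normalized value, or undefined when the
// variable is unset or holds a value outside the documented set.
function readDtypeEnv(env, name) {
  const raw = env[name];
  if (typeof raw !== 'string') return undefined;
  const value = raw.trim().toLowerCase(); // assumption: lenient parsing
  return ALLOWED_DTYPES.has(value) ? value : undefined;
}

// Collect the three variables named in the comment into one object.
function readDtypeSettings(env = process.env) {
  return {
    global: readDtypeEnv(env, 'SWEET_SEARCH_NATIVE_DTYPE'),
    embed: readDtypeEnv(env, 'SWEET_SEARCH_NATIVE_EMBED_DTYPE'),
    li: readDtypeEnv(env, 'SWEET_SEARCH_NATIVE_LI_DTYPE'),
  };
}
```

Passing a plain object instead of `process.env` keeps the helper easy to exercise in isolation. How the global preference and the per-model overrides are combined is decided by the Rust policy and is deliberately not modeled here.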
@@ -87,7 +97,8 @@ export function pickCascadeDirForDevice(deviceKind, cascadeDirOverride, resolveC
  /**
  * Ensure `SWEET_SEARCH_CUDA_COMPUTE_CAP` is set for the current process
  * before the addon loads a CUDA model. The Rust `optimal_dtype` reads
- * this env var to pick BF16 on Ampere+ and F16/F32 on older GPUs.
+ * this env var to pick BF16 for the embedding model on Ampere+ while
+ * keeping ModernBERT LI on F32 unless explicitly overridden.
  *
  * Idempotent: honors an already-set value (useful for forcing a dtype
  * tier in benchmarks) and silently no-ops when there is no NVIDIA GPU.
@@ -0,0 +1,92 @@
+ ---
+ name: sweet-index
+ description: "Use when (re)indexing a Sweet Search project. Runs the full-profile indexer with GPU model prewarming (CoreML cascade on M3+, candle Metal on M1/M2, ORT CPU elsewhere), kills ORT CPU models during indexing to avoid memory contention, and rewarms them for query readiness on completion. Incremental runs under 20 files stay on ORT CPU."
+ category: developer-tooling
+ priority: high
+ tokenEstimate: 600
+ agents: []
+ implementation_status: active
+ optimization_version: 1.0
+ last_optimized: 2026-04-17
+ dependencies: [sweet-search]
+ quick_reference_card: true
+ tags: [indexing, embeddings, late-interaction, gpu, coreml, metal, ort, prewarming, sweet-search]
+ trust_tier: 1
+ ---
+
+ # /sweet-index — Index the Codebase
+
+ <default_to_action>
+ When the user invokes `/sweet-index`, run the full-profile indexing command
+ immediately. Do not ask clarifying questions — the indexer is idempotent, safe
+ to re-run, and handles incremental vs full reindex automatically.
+ </default_to_action>
+
+ ## What this does
+
+ Runs `core/indexing/index-codebase-v21.js` with the `--full` flag so every
+ artifact is rebuilt from scratch. The indexer itself manages the model
+ lifecycle end-to-end:
+
+ 1. **Kill resident ORT CPU models** — prevents memory contention and mutex
+ fighting with the GPU models about to be loaded.
+ 2. **Detect best backend** via `hardware-capability.js` —
+ `coreml-cascade` on M3+ Apple Silicon, `candle-metal` on M1/M2,
+ `candle-cpu` elsewhere.
+ 3. **Load GPU models + warmup forward pass** — compiles Metal pipelines,
+ CoreML variant bundles, and BLAS thread pools so the first indexing
+ batch pays no cold-start cost.
+ 4. **Index the codebase** — code graph, vector embeddings, HNSW,
+ late-interaction index, quantized artifacts, sparse-gram index.
+ 5. **Kill GPU models** — releases Metal queues and Neural Engine.
+ 6. **Load + warmup ORT CPU models** — both embedding and LI get one dummy
+ forward pass so the first query after indexing is warm.
+
+ On small-changeset incremental runs (under 20 files), the indexer skips the
+ GPU swap entirely — the load/warmup overhead would dwarf the actual work.
+
+ ## Usage
+
+ ```bash
+ node core/indexing/index-codebase-v21.js --full
+ ```
+
+ Or via npm script:
+
+ ```bash
+ npm run index:full
+ ```
+
+ ## What to report
+
+ After the command completes, pick out these lines from stderr:
+
+ - `GPU index pool armed (<backend>)` → confirms which backend was used
+ - `embed=<load>+<warm>ms, li=<load>+<warm>ms` → prewarm timings
+ - `CPU models warmed for queries: load=…ms, warm=…ms (embed=ok, li=ok)` →
+ confirms ORT CPU is armed for subsequent searches
+ - `INDEXING COMPLETE (FULL)` with `Duration`, `Files indexed`, `Entities`,
+ `Relationships` → headline stats
+
+ ## Flags (full list)
+
+ | Flag | Purpose |
+ |------|---------|
+ | `--full` | Full reindex — rebuild everything. Always armed via this skill. |
+ | `--no-late-interaction` | Skip LI index (faster, lower quality). Rarely wanted. |
+ | `--late-interaction-pool=N` | Token pooling factor (2 halves tokens). |
+ | `--vectors-only` | Skip code graph — breaks GraphRAG. Avoid. |
+ | `--graph-only` | Only build code graph, skip vectors. |
+ | `--verbose` / `-v` | Force per-phase progress output. |
+
+ Do **not** pass `--sqlite-fast` — it disables fsync between phases and is
+ only safe for benchmarking on throwaway state.
+
+ ## When not to use
+
+ - **Single-file edits**: the indexer auto-detects small changesets and stays
+ on CPU, so `/sweet-index` is still safe, but a watcher-triggered
+ incremental run is cheaper.
+ - **Queries feel slow**: usually means the ORT CPU models are not loaded.
+ Run `/sweet-index` once to rewarm them, or restart the sweet-search
+ server.
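Reviewer note: the "What to report" section of the added SKILL.md lists the stderr lines worth surfacing. A minimal sketch of picking them out of captured stderr could look like the following. `summarizeIndexerStderr` is a hypothetical helper that is not shipped in the package, and the regular expressions assume the log lines appear exactly as quoted in the skill text.

```javascript
// Hypothetical helper: scan captured indexer stderr for the marker lines
// quoted in SKILL.md and return a small summary object.
function summarizeIndexerStderr(stderrText) {
  // "GPU index pool armed (<backend>)" names the backend that was used.
  const backendMatch = /GPU index pool armed \(([^)]+)\)/.exec(stderrText);
  return {
    backend: backendMatch ? backendMatch[1] : null,
    // Present when the ORT CPU models were rewarmed for queries.
    cpuModelsWarmed: /CPU models warmed for queries:/.test(stderrText),
    // Present when a full reindex ran to completion.
    fullIndexComplete: /INDEXING COMPLETE \(FULL\)/.test(stderrText),
  };
}
```

A `null` backend would be consistent with the small-changeset path described in the skill, where the GPU swap is skipped, though it could equally mean the line format differs from what is assumed here.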
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "sweet-search",
- "version": "2.3.0",
+ "version": "2.4.2",
  "description": "Sweet Search - SOTA Hybrid Code Search Engine with WASM CatBoost Query Router, Semantic/Lexical/Structural Search, and Multilingual Support",
  "type": "module",
  "main": "core/search/sweet-search.js",
@@ -13,7 +13,7 @@
  "author": "Marko Sladojevic <marko@panonit.com> (https://panonit.com)",
  "repository": {
  "type": "git",
- "url": "https://github.com/panonitorg/sweet-search"
+ "url": "git+https://github.com/panonitorg/sweet-search.git"
  },
  "bugs": {
  "url": "https://github.com/panonitorg/sweet-search/issues"
@@ -34,8 +34,8 @@
  "panonit"
  ],
  "bin": {
- "sweet-search": "./core/cli.js",
- "sweet-search-mcp": "./mcp/server.js"
+ "sweet-search": "core/cli.js",
+ "sweet-search-mcp": "mcp/server.js"
  },
  "files": [
  "core/*.js",
@@ -48,6 +48,7 @@
  "core/vocabulary/",
  "core/vector-store/",
  "core/query/",
+ "core/skills/",
  "mcp/",
  "scripts/benchmark-harness.js",
  "scripts/init.js",
@@ -139,12 +140,12 @@
  "vitest": "^4.0.16"
  },
  "optionalDependencies": {
- "@sweet-search/native-darwin-arm64": "2.3.0",
- "@sweet-search/native-darwin-x64": "2.3.0",
- "@sweet-search/native-linux-arm64-gnu": "2.3.0",
- "@sweet-search/native-linux-arm64-gnu-cuda": "2.3.0",
- "@sweet-search/native-linux-x64-gnu": "2.3.0",
- "@sweet-search/native-linux-x64-gnu-cuda": "2.3.0"
+ "@sweet-search/native-darwin-arm64": "2.4.2",
+ "@sweet-search/native-darwin-x64": "2.4.2",
+ "@sweet-search/native-linux-arm64-gnu": "2.4.2",
+ "@sweet-search/native-linux-arm64-gnu-cuda": "2.4.2",
+ "@sweet-search/native-linux-x64-gnu": "2.4.2",
+ "@sweet-search/native-linux-x64-gnu-cuda": "2.4.2"
  },
  "engines": {
  "node": ">=18.0.0"
package/scripts/init.js CHANGED
@@ -277,7 +277,7 @@ export async function downloadModelsForProfile(profile, options = {}) {
  function printReport(report) {
  const {
  profile, maxsimTier, routerType, models, verification, runtimeDownloads,
- capability, cascadeReport, dedupReport, prewarmHookReport,
+ capability, cascadeReport, dedupReport, prewarmHookReport, skillReport,
  } = report;
 
  console.log('');
@@ -359,6 +359,16 @@ function printReport(report) {
  // 'skipped' is silent — explicit user opt-out.
  }
 
+ if (skillReport) {
+ if (skillReport.status === 'installed') {
+ console.log(` /sweet-index skill: installed → ${skillReport.skillPath}`);
+ } else if (skillReport.status === 'already-installed') {
+ console.log(` /sweet-index skill: already installed`);
+ } else if (skillReport.status === 'error') {
+ console.log(` /sweet-index skill: ERROR — ${skillReport.detail}`);
+ }
+ }
+
  console.log(` Runtime downloads: ${runtimeDownloads}`);
 
  const passedCount = verification.checks.filter(c => c.status === 'pass').length;
@@ -535,6 +545,54 @@ export function registerPrewarmSessionStartHook({
  };
  }
 
+ // ---------------------------------------------------------------------------
+ // /sweet-index skill installation
+ // ---------------------------------------------------------------------------
+
+ // The skill is shipped inside the npm tarball at core/skills/sweet-index/SKILL.md
+ // (see package.json::files). Init copies it into the project's
+ // .claude/skills/sweet-index/ — creating the .claude tree if absent — so users
+ // who haven't yet adopted Claude Code still get the skill available the moment
+ // they do. Per-project install (not global ~/.claude) so different projects
+ // can pin different sweet-search versions without skill drift.
+ //
+ // Returns `{ status, detail, skillPath? }` for the init report:
+ // installed — copied SKILL.md (new install)
+ // already-installed — destination existed, left untouched (idempotent)
+ // error — copy failed; init continues (never blocks)
+ export function installSweetIndexSkill({ projectRoot, packageRoot } = {}) {
+ const skillDir = join(projectRoot, '.claude', 'skills', 'sweet-index');
+ const skillDest = join(skillDir, 'SKILL.md');
+ const skillSrc = join(packageRoot, 'core', 'skills', 'sweet-index', 'SKILL.md');
+
+ if (!existsSync(skillSrc)) {
+ return {
+ status: 'error',
+ detail: `skill source missing in package: ${skillSrc} (re-install sweet-search)`,
+ };
+ }
+
+ if (existsSync(skillDest)) {
+ return {
+ status: 'already-installed',
+ detail: skillDir,
+ skillPath: skillDest,
+ };
+ }
+
+ try {
+ mkdirSync(skillDir, { recursive: true });
+ copyFileSync(skillSrc, skillDest);
+ return {
+ status: 'installed',
+ detail: skillDir,
+ skillPath: skillDest,
+ };
+ } catch (err) {
+ return { status: 'error', detail: err.message };
+ }
+ }
+
  // ---------------------------------------------------------------------------
  // Help text
  // ---------------------------------------------------------------------------
@@ -876,20 +934,20 @@ export async function runInit(args) {
  process.stderr.write(`[init] Warning: Could not install index-maintainer: ${e.message}\n`);
  }
 
- // 11. Install /sweet-index skill
- try {
- const skillDir = join(projectRoot, '.claude', 'skills', 'sweet-index');
- const skillDest = join(skillDir, 'SKILL.md');
- const skillSrc = join(PACKAGE_ROOT, 'core', 'skills', 'sweet-index', 'SKILL.md');
- if (!existsSync(skillDest)) {
- mkdirSync(skillDir, { recursive: true });
- copyFileSync(skillSrc, skillDest);
- process.stderr.write(`[init] Installed /sweet-index skill to ${skillDir}\n`);
- } else {
- process.stderr.write(`[init] /sweet-index skill already installed\n`);
- }
- } catch (e) {
- process.stderr.write(`[init] Warning: Could not install /sweet-index skill: ${e.message}\n`);
+ // 11. Install /sweet-index skill — always, even if .claude/ doesn't exist.
+ // Users who haven't adopted Claude Code yet still get the skill in place
+ // the moment they do; we treat the skill as part of the product, not a
+ // Claude-Code-conditional add-on.
+ const skillReport = installSweetIndexSkill({
+ projectRoot,
+ packageRoot: PACKAGE_ROOT,
+ });
+ if (skillReport.status === 'installed') {
+ process.stderr.write(`[init] Installed /sweet-index skill to ${skillReport.detail}\n`);
+ } else if (skillReport.status === 'already-installed') {
+ process.stderr.write(`[init] /sweet-index skill already installed\n`);
+ } else if (skillReport.status === 'error') {
+ process.stderr.write(`[init] Warning: Could not install /sweet-index skill: ${skillReport.detail}\n`);
  }
 
  // 11.5. Register Claude Code SessionStart daemon-prewarm hook.
@@ -920,6 +978,7 @@ export async function runInit(args) {
  cascadeReport,
  dedupReport,
  prewarmHookReport,
+ skillReport,
  });
  }
 
@@ -11,7 +11,7 @@
  * sweet-search uninstall [--dry-run] [--keep-models] [--purge] [--force]
  */
 
- import { existsSync, readdirSync, readFileSync, renameSync, rmSync, statSync, unlinkSync, writeFileSync } from 'node:fs';
+ import { existsSync, readdirSync, readFileSync, renameSync, rmdirSync, rmSync, statSync, unlinkSync, writeFileSync } from 'node:fs';
  import { dirname, join } from 'node:path';
  import { execSync } from 'node:child_process';
  import { fileURLToPath } from 'node:url';
@@ -213,6 +213,65 @@ export function stopRunningDaemon({
  return result;
  }
 
+ /**
+ * Remove the sweet-search /sweet-index skill from `.claude/skills/sweet-index/`.
+ * Only removes the directory we created — leaves `.claude/skills/` and `.claude/`
+ * untouched even if they're empty afterwards, because the user may add other
+ * skills/hooks/settings to `.claude/` over time and we don't own that root.
+ *
+ * Returns `{ status, detail, skillPath? }`:
+ * not-found — directory absent (nothing to do)
+ * removed — rm -rf on the sweet-index/ subtree succeeded
+ * dry-run — found the directory but skipped the delete
+ * error — rm failed (permissions, etc.); uninstall continues
+ */
+ export function removeSweetIndexSkill(projectRoot, { dryRun = false } = {}) {
+ const skillDir = join(projectRoot, '.claude', 'skills', 'sweet-index');
+ if (!existsSync(skillDir)) {
+ return { status: 'not-found', detail: 'no .claude/skills/sweet-index/' };
+ }
+ if (dryRun) {
+ return { status: 'dry-run', detail: skillDir, skillPath: skillDir };
+ }
+ try {
+ rmSync(skillDir, { recursive: true, force: true });
+ return { status: 'removed', detail: skillDir, skillPath: skillDir };
+ } catch (err) {
+ return { status: 'error', detail: err.message };
+ }
+ }
+
+ /**
+ * Best-effort cleanup of empty parent directories left behind after rm -rf'ing
+ * the per-model cache dirs and the CoreML cascade root.
+ *
+ * Walks up from `start` toward `stopAt` (exclusive) and removes each directory
+ * iff it's empty. rmdirSync naturally fails on non-empty dirs, so this is
+ * inherently safe — we never delete a directory that has files we didn't put
+ * there. Stops at the first non-empty dir or when `stopAt` is reached.
+ *
+ * Used to clean ~/.cache/sweet-search/{models,coreml-cascade}/ → ~/.cache/sweet-search/
+ * after their contents are removed. Without this, uninstall leaves an empty
+ * sweet-search directory dangling under the user's cache root.
+ */
+ function pruneEmptyAncestors(start, stopAt) {
+ let dir = start;
+ while (dir && dir !== stopAt && dir !== dirname(dir)) {
+ if (!existsSync(dir)) {
+ dir = dirname(dir);
+ continue;
+ }
+ try {
+ const entries = readdirSync(dir);
+ if (entries.length > 0) return; // non-empty — stop walking
+ rmdirSync(dir);
+ } catch {
+ return; // permission / race / non-empty — stop walking
+ }
+ dir = dirname(dir);
+ }
+ }
+
  /**
  * Remove the sweet-search-owned SessionStart entry from `.claude/settings.json`,
  * preserving every other hook, permission, and top-level key. Detection is
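Reviewer note: the walk in `pruneEmptyAncestors` is easier to reason about against an in-memory model than against a real cache directory. The sketch below is a simulation only, with a `Map` of entry counts standing in for the filesystem. `simulatePrune` and `parentOf` are hypothetical names, `parentOf` is a simplified POSIX-only stand-in for `dirname` from `node:path`, and the example paths are made up.

```javascript
// Simplified POSIX-only parent lookup; the real code uses node:path dirname.
const parentOf = (p) => p.slice(0, p.lastIndexOf('/')) || '/';

// tree: Map<directory path, number of entries in it>. A hypothetical stand-in
// for existsSync/readdirSync/rmdirSync so the walk can be traced without I/O.
function simulatePrune(tree, start, stopAt) {
  const removed = [];
  let dir = start;
  while (dir && dir !== stopAt && dir !== parentOf(dir)) {
    if (!tree.has(dir)) {
      dir = parentOf(dir); // already gone: keep walking upward
      continue;
    }
    if (tree.get(dir) > 0) break; // non-empty: stop walking
    tree.delete(dir); // models rmdirSync on an empty directory
    removed.push(dir);
    const parent = parentOf(dir);
    if (tree.has(parent)) tree.set(parent, tree.get(parent) - 1);
    dir = parent;
  }
  return removed;
}
```

With an empty `models` directory as the only child of `sweet-search`, and `sweet-search` one of two children of the cache root passed as `stopAt`, the simulation removes `models` and then `sweet-search` and leaves the cache root alone, which matches the behavior the doc comment describes.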
@@ -309,9 +368,12 @@ What gets removed:
  - CoreML variant cascade (if built) — includes ~1.8 GB of .mlpackage
  artifacts AND the sibling .mlmodelc compiled cache files next to
  each variant. Skipped by --keep-models.
+ - .claude/skills/sweet-index/ (the per-project /sweet-index skill copy)
+ - daemon-prewarm SessionStart entry inside .claude/settings.json
 
  What is NOT removed:
  - User source code, indexes, or database files outside .sweet-search/
+ - .claude/ itself or any other hooks/skills/settings the user owns
  - The npm package itself (unless --purge)
  `);
  }
@@ -369,8 +431,13 @@ export async function runUninstall(args) {
  const hookPreview = removePrewarmSessionStartHook(projectRoot, { dryRun: true });
  const hasHookEntry = hookPreview.status === 'dry-run';
 
+ // Check for the /sweet-index skill so we can report it even when
+ // .sweet-search/ was already deleted by hand.
+ const skillPreview = removeSweetIndexSkill(projectRoot, { dryRun: true });
+ const hasSkillEntry = skillPreview.status === 'dry-run';
+
  // Nothing to remove?
- if (removals.length === 0 && !hasHookEntry) {
+ if (removals.length === 0 && !hasHookEntry && !hasSkillEntry) {
  console.log('Nothing to remove — Sweet Search is not initialized in this project.');
  return;
  }
@@ -387,6 +454,9 @@ export async function runUninstall(args) {
  if (hasHookEntry) {
  console.log(` daemon-prewarm SessionStart hook in .claude/settings.json`);
  }
+ if (hasSkillEntry) {
+ console.log(` /sweet-index skill (.claude/skills/sweet-index/)`);
+ }
  console.log(` Total: ${formatBytes(totalBytes)}`);
  if (parsed.keepModels) {
  console.log(' Model cache: kept (--keep-models)');
@@ -398,6 +468,10 @@ export async function runUninstall(args) {
  if (dryHook.status === 'dry-run') {
  console.log(` Would also remove: prewarm SessionStart hook (.claude/settings.json — ${dryHook.detail})`);
  }
+ const drySkill = removeSweetIndexSkill(projectRoot, { dryRun: true });
+ if (drySkill.status === 'dry-run') {
+ console.log(` Would also remove: /sweet-index skill (${drySkill.detail})`);
+ }
  console.log('Dry run — nothing was removed.');
  return;
  }
@@ -430,6 +504,29 @@ export async function runUninstall(args) {
  }
  }
 
+ // Prune empty parent directories left behind under the model cache root
+ // (~/.cache/sweet-search/{models,coreml-cascade}/ → ~/.cache/sweet-search/).
+ // rmdirSync naturally fails on non-empty dirs, so this only deletes
+ // directories we've effectively emptied. Stops before $HOME/.cache.
+ if (!parsed.keepModels) {
+ const cacheRoot = resolveModelCacheRoot(); // .../sweet-search/models
+ const sweetSearchCacheRoot = dirname(cacheRoot); // .../sweet-search
+ const userCacheRoot = dirname(sweetSearchCacheRoot); // .../.cache (do not touch)
+ pruneEmptyAncestors(cacheRoot, userCacheRoot);
+ }
+
+ // Remove the per-project /sweet-index skill init copied into .claude/.
+ // Non-fatal — a failure here just leaves the SKILL.md stub behind.
+ const skillResult = removeSweetIndexSkill(projectRoot, { dryRun: parsed.dryRun });
+ if (skillResult.status === 'removed') {
+ console.log(` Removed: /sweet-index skill (${skillResult.detail})`);
+ removed++;
+ } else if (skillResult.status === 'error') {
+ console.log(` Failed to remove /sweet-index skill: ${skillResult.detail}`);
+ kept++;
+ }
+ // 'not-found' and 'dry-run' are silent in the main output.
+
  // Reverse the Claude Code daemon-prewarm SessionStart entry init added to
  // .claude/settings.json. Non-fatal — a failure here doesn't leave the
  // user in a worse state than before uninstall ran.