npm - sweet-search - Versions diffs - 2.5.6 → 2.5.8 - Mend

sweet-search 2.5.6 → 2.5.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/README.md +48 -11
package/core/embedding/embedding-local-model.js +1 -0
package/core/graph/relationship-resolver.js +5 -1
package/core/indexing/index-codebase-v21.js +2 -6
package/core/indexing/indexer-ann.js +3 -3
package/core/indexing/indexer-build.js +1 -1
package/core/indexing/indexer-utils.js +47 -17
package/core/infrastructure/onnx-session-utils.js +1 -0
package/core/ranking/late-interaction-model.js +1 -0
package/package.json +7 -7
package/scripts/init.js +21 -5
package/scripts/postinstall-banner.js +39 -33

package/README.md CHANGED

@@ -144,27 +144,29 @@ We measure sweet-search four ways — from how much it helps a real agent down t
 <table>
 <tr>
-<td width="25%" valign="top">
+<td width="50%" valign="top">
-**① Code-retrieval** *(agent-in-the-loop)*<br>
+🤖 **[① Code-retrieval](#bench-code-retrieval)** *(agent-in-the-loop)*<br>
 <sub>Does it make a real coding agent **cheaper and more useful** when it searches your repo? Paired against each model's own grep-and-read loop.</sub>
 </td>
-<td width="25%" valign="top">
+<td width="50%" valign="top">
-**② Task-completion** *(coming soon)*<br>
+🚧 **[② Task-completion](#bench-task-completion)** *(coming soon)*<br>
 <sub>Does cheaper, denser context **compound** into a higher resolve-rate on multi-step engineering tasks? Harness in progress.</sub>
 </td>
-<td width="25%" valign="top">
+</tr>
+<tr>
+<td width="50%" valign="top">
-**③ Paper-type IR** *(academic)*<br>
+📄 **[③ Paper-type IR](#bench-paper-type)** *(academic)*<br>
 <sub>The standard NL→code retrieval suites (GCSN, M2CRB, CoSQA…), full-corpus MRR@10.</sub>
 </td>
-<td width="25%" valign="top">
+<td width="50%" valign="top">
-**④ Engine speed**<br>
+⚡ **[④ Engine speed](#bench-engine-speed)**<br>
 <sub>Raw systems numbers — grep throughput, query latency, rerank kernels, HNSW.</sub>
 </td>
@@ -173,6 +175,7 @@ We measure sweet-search four ways — from how much it helps a real agent down t
 ---
+<a id="bench-code-retrieval"></a>
 ### 🤖 1. Code-retrieval benchmarks — *the agent-in-the-loop test*
 We install the evolved agent prompt (the [GEPA-evolved search discipline](#-an-agent-prompt-that-was-evolved-not-written)), point a coding agent at a real repo, and pair it **probe-for-probe against the same model running its own native grep-and-read loop**. Same model, same tasks, same judge — the only difference is whether sweet-search is wired in.
@@ -220,12 +223,14 @@ The win is **harness-adaptive**: where the native loop is disciplined (Claude Co
 ---
+<a id="bench-task-completion"></a>
 ### 🚧 2. Task-completion benchmarks — *coming soon*
 > Retrieval quality is necessary but not sufficient. Cheaper, denser context only matters if it **compounds across a real, multi-step engineering task** — finding the code, understanding it, changing it, and not breaking anything. The next suite measures exactly that: **resolve-rate on SWE-bench-style multi-file tasks**, sweet-search-wired vs. native, on the same paired, multiplicity-controlled bar as above. Harness and pilot are in progress — numbers land here when they clear that bar, and not before.
 ---
+<a id="bench-paper-type"></a>
 ### 📄 3. Paper-type retrieval benchmarks — *academic NL→code IR*
 > [!WARNING]
@@ -271,6 +276,7 @@ and French queries.
 ---
+<a id="bench-engine-speed"></a>
 ### ⚡ 4. Engine speed — *systems benchmarks, measured in-repo*
 <div align="center">
@@ -566,9 +572,36 @@ What it teaches:
 > **Chunk → enrich → embed → quantize** — every step on-device and in Rust. Batches are sized to *your CPU's actual cache*, two open code-models do the encoding, and two separate quantizations make the index both **faster to build** and **small enough to live in RAM**. Zero API keys; nothing ever leaves the machine.
-| ① Structure-aware chunk | ② Enrich from structure | ③ Embed — two models | ④ Quantize + persist |
-|:--|:--|:--|:--|
-| cAST over tree-sitter ASTs — whole functions, never sliced mid-body | deterministic preamble from the code graph — **no LLM call** | dense **CodeRankEmbed** + per-token **LateOn-Code** | INT8 weights → **2× faster build** · INT4 vectors → **fits in RAM** |
+<table>
+<tr>
+<td width="50%" valign="top">
+① 🧩 **[Structure-aware chunk](#idx-chunk)**<br>
+<sub>cAST over tree-sitter ASTs — whole functions, never sliced mid-body</sub>
+</td>
+<td width="50%" valign="top">
+② 🏷️ **[Enrich from structure](#idx-enrich)**<br>
+<sub>deterministic preamble from the code graph — **no LLM call**</sub>
+</td>
+</tr>
+<tr>
+<td width="50%" valign="top">
+③ 🤖 **[Embed — two models](#idx-embed)**<br>
+<sub>dense **CodeRankEmbed** + per-token **LateOn-Code**</sub>
+</td>
+<td width="50%" valign="top">
+④ 🗜️ **[Quantize + persist](#idx-quantize)**<br>
+<sub>INT8 weights → **2× faster build** · INT4 vectors → **fits in RAM**</sub>
+</td>
+</tr>
+</table>
 **The inference engine, picked for your silicon:**
@@ -579,10 +612,12 @@ What it teaches:
 | 🟩 NVIDIA GPU (SM 7.0+) | candle **CUDA**; **flash-attention** on Ampere+ |
 | 💻 No accelerator | **ONNX Runtime INT8** — tuned CPU path, 132 MB model, **zero GPU weights downloaded** |
+<a id="idx-chunk"></a>
 ### 🧩 Chunking — every chunk is whole code, never a fixed window
 - **[cAST](https://arxiv.org/abs/2506.15655)** structure-aware chunking over real **tree-sitter** ASTs: a recursive *split-then-merge* greedily packs sibling AST nodes up to the size cap and recurses *into* nodes too big to fit. So a chunk is always a **function, a class, or a contiguous run of declarations** — never a body cut in half, never a string split mid-literal.
 - **14 languages** get true AST grammars — `JS · TS · TSX · Python · Go · Rust · Java · C · C++ · Ruby · PHP · Kotlin · Swift · C#` — and a **39-config regex registry** carries structure-aware chunking to **70+ more extensions**.
+<a id="idx-enrich"></a>
 ### 🏷️ Metadata — context the encoder can actually see
 - Every chunk ships its **symbol name · entity type · signature · line span** — the metadata that powers the code graph, `ss-read` annotations, and the self-contained answers everywhere else.
 - **Contextual enrichment:** before embedding, each chunk is prefixed with a structured preamble assembled from the AST + code graph — *file path · enclosing-scope breadcrumb · name & type · merged siblings · the imports it actually uses*. **Both** encoders see it, so a bare `getId()` still retrieves on the class and module around it.
@@ -593,6 +628,7 @@ What it teaches:
 - **Uses every core the hardware really has** — full count on ARM/Apple Silicon; x86 SMT siblings discounted because they don't scale inference linearly.
 - **ORT drives the CPU path** (ONNX Runtime); GPU hosts swap in fused kernels (below). Either way inference runs off the event loop as a napi `AsyncTask`, so tokenization and SQLite writes overlap compute instead of stalling behind it.
+<a id="idx-quantize"></a>
 ### 🗜️ Two quantizations — one buys speed, one buys size
 | | **Model weights** · INT8 ORT | **Index vectors** · INT4 binary |
 |:--|:--|:--|
@@ -600,6 +636,7 @@ What it teaches:
 | **Win** | **~2× faster** indexing · 4× smaller model (**132 MB**) | LI index **1.34 GiB → ~396 MiB** · INT4 nibble-packing halves it again |
 | **Fidelity** | **≥ 0.96 cosine** vs FP32 | **no measurable retrieval loss** (A/B-tested vs INT8) |
+<a id="idx-embed"></a>
 ### 🤖 Two models — both open, both local, both code-specialized
 - **[CodeRankEmbed](https://huggingface.co/nomic-ai/CodeRankEmbed)** — 768-d dense bi-encoder (137M, Apache-2.0) for first-stage recall.
 - **[LateOn-Code](https://huggingface.co/lightonai/LateOn-Code)** — ModernBERT per-token **late interaction** (149M) for the rerank.

package/core/embedding/embedding-local-model.js CHANGED

@@ -172,6 +172,7 @@ export function buildLocalSessionOptions(quantLabel = 'q8', coremlAvailable = fa
   const sessionOptions = {
     graphOptimizationLevel: 'all',
+    logSeverityLevel: 3, // ERROR — silence ORT's expected "optimized model is machine-specific" warning
     intraOpNumThreads: intraOpThreads,
     interOpNumThreads: interOpThreads,
     executionMode,

package/core/graph/relationship-resolver.js CHANGED

@@ -160,7 +160,11 @@ export function resolveRelationshipTargets(db) {
   resolveAll();
-  console.log(`  ✓ Resolved ${resolved}/${unresolved.length} relationships`);
+  if (resolved > 0) {
+    console.log(`  ✓ Linked ${resolved}/${unresolved.length} references to local definitions`);
+  } else {
+    console.log(`  ${unresolved.length} references resolve to external/library symbols (no local definition to link)`);
+  }
   if (ambiguous > 0) {
     console.log(`  ⚠ ${ambiguous} ambiguous targets (multiple matches)`);
   }

package/core/indexing/index-codebase-v21.js CHANGED

@@ -140,10 +140,6 @@ async function main() {
     applyPersistedLiModel(process.env.SWEET_SEARCH_PROJECT_ROOT || process.cwd());
   }
-  log(`${colors.bright}╔═══════════════════════════════════════════════════╗${colors.reset}`, 'bright');
-  log(`${colors.bright}║   Sweet Search Codebase Indexer v2.3 (SOTA Dec'25) ║${colors.reset}`, 'bright');
-  log(`${colors.bright}╚═══════════════════════════════════════════════════╝${colors.reset}`, 'bright');
   if (vectorsOnly) {
     log('⚠ WARNING: --vectors-only skips code graph rebuild', 'yellow');
     log('  GraphRAG structural queries will use stale data', 'yellow');
@@ -338,13 +334,13 @@ Output:
     }
     // =========================================================================
-    // PHASE 3: Code Graph + HCGS Preparation (if not --vectors-only)
+    // PHASE 1: Code Graph (if not --vectors-only)
     // =========================================================================
     let graphStats = { entities: 0, relationships: 0 };
     let hcgsPromise = null;
     if (!vectorsOnly) {
-      const graphResult = await runPhase('Code Graph + HCGS Prep', buildCodeGraphWithHCGSPhase, {
+      const graphResult = await runPhase('Code Graph', buildCodeGraphWithHCGSPhase, {
         allFiles,
         filesToIndex,
         dryRun,

package/core/indexing/indexer-ann.js CHANGED

@@ -396,7 +396,7 @@ function diversityFirstPermutationRowids(filePaths) {
 // =============================================================================
 export async function incrementalUpdateHNSW(dbPath, changedFiles, dryRun = false) {
-  log('\n━━━ Phase 3: HNSW Index (Incremental) ━━━', 'bright');
+  log('\n━━━ Phase 4: HNSW Index (Incremental) ━━━', 'bright');
   if (dryRun) {
     log('DRY RUN: Skipping HNSW incremental update', 'magenta');
@@ -510,7 +510,7 @@ export async function incrementalUpdateHNSW(dbPath, changedFiles, dryRun = false
 // =============================================================================
 export async function buildHNSWIndex(dbPath, dryRun = false) {
-  log('\n━━━ Phase 3: HNSW Index ━━━', 'bright');
+  log('\n━━━ Phase 4: HNSW Index ━━━', 'bright');
   if (dryRun) {
     log('DRY RUN: Skipping HNSW index', 'magenta');
@@ -705,7 +705,7 @@ export async function buildLateInteractionIndex(chunks, dryRun = false, filesToR
     segmentSize = null, // override SSLX-v3 segment threshold (default 10k)
     projectRoot,        // honored by LI skip policy for .sweet-search.config.json excludes
   } = options;
-  log('\n━━━ Phase 4: Late Interaction Index (LateOn-Code) ━━━', 'bright');
+  log('\n━━━ Phase 3: Late Interaction Index (LateOn-Code) ━━━', 'bright');
   if (dryRun) {
     log('DRY RUN: Skipping late interaction index', 'magenta');

package/core/indexing/indexer-build.js CHANGED

@@ -643,7 +643,7 @@ export async function chunkFiles(files) {
     try {
       const enriched = await enrichChunksFromGraph(allChunks, ASTChunker);
       if (enriched > 0) {
-        log(`✓ Enriched ${enriched}/${allChunks.length} chunks with scope/import context`, 'green');
+        log(`✓ Added scope/import context to ${enriched} code chunks`, 'green');
       }
     } catch (err) {
       log(`⚠ Chunk enrichment skipped: ${err.message}`, 'yellow');

package/core/indexing/indexer-utils.js CHANGED

@@ -110,20 +110,24 @@ export function isVerboseMode() {
 }
 // ---------------------------------------------------------------------------
-// Progress rendering — an in-place "sticky" bar that animates as a phase runs.
+// Progress rendering — a live region of animated, in-place bars.
 //
-// On a TTY (verbose or not) the bar redraws on a single line via carriage return
-// + erase-to-EOL, with smooth 1/8-block fill. While a bar is active, log() pins it:
-// it clears the bar, prints the log line above, then redraws the bar below — so
-// interleaved diagnostics (e.g. the HNSW "checkpoint:" line) never split the bar.
-// Non-TTY (pipes / CI) falls back to throttled newlines so nothing is swallowed.
+// On a TTY (verbose included), each phase's bar animates in place via cursor
+// moves + erase-to-EOL, with smooth 1/8-block fill. Multiple bars can run at
+// once (e.g. Embedding + Late Interaction in parallel) — they share one pinned
+// region at the bottom and update independently. While bars are live, log()
+// prints its line above the region and redraws the bars below, so diagnostics
+// never split a bar. The region "commits" (stays on screen) once every bar in
+// it has reached 100%. Non-TTY (pipes / CI) falls back to throttled newlines.
 // ---------------------------------------------------------------------------
 const BAR_WIDTH = 30;
 const LABEL_COL = 17;           // pad "Label:" to this width so every bar's [ ] aligns
 const SUB_BLOCKS = ['', '▏', '▎', '▍', '▌', '▋', '▊', '▉']; // eighth-block partial fills
 const CLEAR_EOL = '\x1b[K';
-let activeBar = null;           // last-rendered bar string while a phase is in progress (TTY only)
+const liveBars = new Map();     // label -> { current, total }; insertion order = display order
+let regionLines = 0;            // bar lines currently pinned at the bottom (TTY)
 let lastLoggedPercent = {};
+let deferredLogs = [];          // lines held back while parallel bars run (flushed on commit)
 function renderBar(current, total, label) {
   const ratio = total > 0 ? Math.max(0, Math.min(1, current / total)) : 1;
@@ -137,12 +141,29 @@ function renderBar(current, total, label) {
   return `${colors.cyan}${head}[${bar}${empty}] ${pct}% (${current}/${total})${colors.reset}`;
 }
+function drawRegion() {
+  let out = regionLines > 0 ? `\x1b[${regionLines}A\r` : '\r';
+  for (const [label, b] of liveBars) out += renderBar(b.current, b.total, label) + CLEAR_EOL + '\n';
+  process.stdout.write(out);
+  regionLines = liveBars.size;
+}
 export function log(message, color = 'reset') {
   if (quietMode) return;
   const line = `${colors[color]}${message}${colors.reset}`;
-  if (activeBar && process.stdout.isTTY) {
-    // Pin the bar: clear it, print the log line above, redraw the bar below.
-    process.stdout.write(`\r${CLEAR_EOL}${line}\n${activeBar}${CLEAR_EOL}`);
+  if (regionLines > 0 && process.stdout.isTTY) {
+    if (liveBars.size > 1) {
+      // Parallel bars are live: defer the line. Printing it now would scroll the
+      // region and freeze a duplicate bar-pair into scrollback (e.g. the "✓ Late
+      // interaction index built" line when LI finishes before Embedding). Flushed
+      // once every bar in the region completes.
+      deferredLogs.push(line);
+      return;
+    }
+    // Single bar: print the line above it, then redraw the bar below.
+    let out = `\x1b[${regionLines}A\r${line}${CLEAR_EOL}\n`;
+    for (const [label, b] of liveBars) out += renderBar(b.current, b.total, label) + CLEAR_EOL + '\n';
+    process.stdout.write(out);
   } else {
     console.log(line);
   }
@@ -160,13 +181,22 @@ export function logProgress(current, total, label) {
     }
     return;
   }
-  // Interactive TTY: animate the bar in place.
-  activeBar = renderBar(current, total, label);
-  process.stdout.write(`\r${activeBar}${CLEAR_EOL}`);
-  if (current >= total) {
-    process.stdout.write('\n');
-    activeBar = null;
-    lastLoggedPercent[label] = 0;
+  // Interactive TTY: update this bar in the live region and redraw.
+  liveBars.set(label, { current, total });
+  drawRegion();
+  // Once every live bar is complete, commit the region (leave it on screen).
+  let allDone = true;
+  for (const b of liveBars.values()) if (b.current < b.total) { allDone = false; break; }
+  if (allDone) {
+    for (const k of liveBars.keys()) lastLoggedPercent[k] = 0;
+    liveBars.clear();
+    regionLines = 0;
+    // Flush any lines deferred while the parallel bars were running — now below
+    // the finished bars, in arrival order.
+    if (deferredLogs.length) {
+      for (const l of deferredLogs) console.log(l);
+      deferredLogs = [];
+    }
   }
 }
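
Reviewer note: the live-region redraw introduced in this file can be read as "move the cursor up over the lines drawn last time, then rewrite every bar". The sketch below isolates that step as a pure function so it can be checked without a terminal. `renderPlainBar` and `buildRegionFrame` are hypothetical names, and the plain-text bar is a simplified stand-in for the package's coloured `renderBar`; this is an illustration under those assumptions, not the package's code.

```javascript
const CLEAR_EOL = '\x1b[K';

// Simplified stand-in for renderBar: no colours, no 1/8-block partial fill.
function renderPlainBar(current, total, label, width = 10) {
  const ratio = total > 0 ? Math.max(0, Math.min(1, current / total)) : 1;
  const filled = Math.round(ratio * width);
  const pct = Math.round(ratio * 100);
  return `${label}: [${'#'.repeat(filled)}${'.'.repeat(width - filled)}] ${pct}%`;
}

// Build the string for one redraw of the region.
// previousLines: how many bar lines the last redraw left on screen.
// bars: Map of label -> { current, total }, in display order.
function buildRegionFrame(previousLines, bars) {
  // Cursor-up over the old region (CSI n A), or just return to column 0.
  let out = previousLines > 0 ? `\x1b[${previousLines}A\r` : '\r';
  for (const [label, b] of bars) {
    out += renderPlainBar(b.current, b.total, label) + CLEAR_EOL + '\n';
  }
  return { out, lines: bars.size };
}

const bars = new Map([
  ['Embedding', { current: 5, total: 10 }],
  ['Late Interaction', { current: 10, total: 10 }],
]);
const firstFrame = buildRegionFrame(0, bars);
const secondFrame = buildRegionFrame(firstFrame.lines, bars);
```

Writing `firstFrame.out` and then `secondFrame.out` to a TTY should overwrite the two bar lines in place rather than appending two more; on a non-TTY the escape sequences would show up as literal bytes, which is presumably why the diff keeps a separate throttled-newline path.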

package/core/infrastructure/onnx-session-utils.js CHANGED

@@ -192,6 +192,7 @@ export function buildSessionOptions(modelId, suffix, coremlAvailable = false, ru
     ?? parseInt(process.env.SWEET_SEARCH_ORT_INTER_OP_THREADS || '1', 10);
   const opts = {
     graphOptimizationLevel: 'all',
+    logSeverityLevel: 3, // ERROR — silence ORT's expected "optimized model is machine-specific" warning
     intraOpNumThreads: runtimeOptions.intraOpThreads ?? bestIntraOpThreads(runtimeOptions),
     interOpNumThreads: interOpThreads,
     executionMode,

package/core/ranking/late-interaction-model.js CHANGED

@@ -193,6 +193,7 @@ async function loadModel() {
   const { getOptimizedGraphPath } = await import('../infrastructure/onnx-session-utils.js');
   const session = await ort.InferenceSession.create(onnxPath, {
     executionProviders: ['cpu'],
+    logSeverityLevel: 3, // ERROR — silence ORT's expected "optimized model is machine-specific" warning
     intraOpNumThreads: lateInteractionRuntimeConfig.intraOpThreads ?? bestIntraOpThreads(),
     interOpNumThreads: 1,
     optimizedModelFilePath: getOptimizedGraphPath(modelConfig.hfId, 'lateon'),
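
Reviewer note: the same `logSeverityLevel: 3` literal is added in three files in this release. ONNX Runtime's session options take a numeric severity from 0 (verbose) to 4 (fatal), so 3 lets only errors and fatals through. A named mapping might make the intent easier to read; `ORT_SEVERITY` and `withLogSeverity` below are hypothetical helpers, not part of sweet-search or onnxruntime.

```javascript
// Numeric levels as defined by ONNX Runtime's logging severity enum.
const ORT_SEVERITY = Object.freeze({ verbose: 0, info: 1, warning: 2, error: 3, fatal: 4 });

// Return a copy of the session options with a named severity applied.
function withLogSeverity(sessionOptions, level = 'error') {
  if (!Object.prototype.hasOwnProperty.call(ORT_SEVERITY, level)) {
    throw new RangeError(`unknown ORT log severity: ${level}`);
  }
  return { ...sessionOptions, logSeverityLevel: ORT_SEVERITY[level] };
}

const quietOptions = withLogSeverity({ graphOptimizationLevel: 'all' });
```

The input object is left unmodified, so callers that share a base options object across sessions would not see the severity leak between them.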

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "sweet-search",
-  "version": "2.5.6",
+  "version": "2.5.8",
   "description": "Sweet Search - SOTA Hybrid Code Search Engine with WASM CatBoost Query Router, Semantic/Lexical/Structural Search, and Multilingual Support",
   "type": "module",
   "main": "core/search/sweet-search.js",
@@ -163,12 +163,12 @@
   },
   "optionalDependencies": {
     "usearch": "^2.21.4",
-    "@sweet-search/native-darwin-arm64": "2.5.6",
-    "@sweet-search/native-darwin-x64": "2.5.6",
-    "@sweet-search/native-linux-arm64-gnu": "2.5.6",
-    "@sweet-search/native-linux-arm64-gnu-cuda": "2.5.6",
-    "@sweet-search/native-linux-x64-gnu": "2.5.6",
-    "@sweet-search/native-linux-x64-gnu-cuda": "2.5.6"
+    "@sweet-search/native-darwin-arm64": "2.5.8",
+    "@sweet-search/native-darwin-x64": "2.5.8",
+    "@sweet-search/native-linux-arm64-gnu": "2.5.8",
+    "@sweet-search/native-linux-arm64-gnu-cuda": "2.5.8",
+    "@sweet-search/native-linux-x64-gnu": "2.5.8",
+    "@sweet-search/native-linux-x64-gnu-cuda": "2.5.8"
   },
   "engines": {
     "node": ">=18.0.0"

package/scripts/init.js CHANGED

@@ -252,9 +252,26 @@ export function detectProjectRoot(cwd = process.cwd()) {
 export function ensureDataDir(projectRoot) {
   const dataDir = join(projectRoot, DATA_DIR_NAME);
   mkdirSync(dataDir, { recursive: true });
+  maybeIgnoreDataDir(projectRoot);
   return dataDir;
 }
+// Add `.sweet-search/` to the project's .gitignore so the local index isn't
+// committed — but ONLY if a .gitignore already exists. We never create one for
+// a project that doesn't already use it.
+function maybeIgnoreDataDir(projectRoot) {
+  try {
+    const gitignorePath = join(projectRoot, '.gitignore');
+    if (!existsSync(gitignorePath)) return;
+    const content = readFileSync(gitignorePath, 'utf8');
+    const already = content.split(/\r?\n/).map((l) => l.trim().replace(/^\//, '').replace(/\/$/, ''))
+      .some((l) => l === DATA_DIR_NAME);
+    if (already) return;
+    const sep = content.length === 0 || content.endsWith('\n') ? '' : '\n';
+    writeFileSync(gitignorePath, `${content}${sep}\n# Sweet Search local index\n${DATA_DIR_NAME}/\n`);
+  } catch { /* best-effort — never block init on .gitignore */ }
+}
 // ---------------------------------------------------------------------------
 // Init config read/write
 // ---------------------------------------------------------------------------
@@ -1576,11 +1593,10 @@ export async function runInit(args) {
   const skippedOptIns = getSkippedOptInModels(profile);
   let modelResults = new Map();
-  // Tell the user once which optional models are being skipped. This is
-  // NOT an error — these are opt-in features (e.g. cross-encoder
-  // rerankers disabled by default since commit 43a61eb). Without this
-  // line, init silently omitting them looks like a missing-model bug.
-  if (skippedOptIns.length > 0) {
+  // Opt-in models (e.g. cross-encoder rerankers, disabled by default since
+  // commit 43a61eb) are skipped silently — they're optional features, not
+  // missing models. Set DEBUG=1 to see which were skipped and how to enable.
+  if (process.env.DEBUG && skippedOptIns.length > 0) {
     for (const skipped of skippedOptIns) {
       process.stderr.write(
         `[init] Skipping opt-in model "${skipped.key}" — ` +
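
Reviewer note: the `.gitignore` handling added in `maybeIgnoreDataDir` above separates cleanly into a pure "is it already listed?" check and a pure "what should the new file body be?" step. The sketch below mirrors that logic without touching the filesystem. The function names are hypothetical, and the directory name `.sweet-search` is assumed from the comment in the diff rather than read from `DATA_DIR_NAME`.

```javascript
// True when the .gitignore body already lists dirName, written as
// `dir`, `dir/`, `/dir`, or `/dir/` on a line of its own.
function alreadyIgnored(content, dirName) {
  return content
    .split(/\r?\n/)
    .map((l) => l.trim().replace(/^\//, '').replace(/\/$/, ''))
    .some((l) => l === dirName);
}

// Return the new file body, or null when nothing needs to change.
function withIgnoreEntry(content, dirName) {
  if (alreadyIgnored(content, dirName)) return null;
  // Add a newline first only if the file does not already end with one.
  const sep = content.length === 0 || content.endsWith('\n') ? '' : '\n';
  return `${content}${sep}\n# Sweet Search local index\n${dirName}/\n`;
}

const updated = withIgnoreEntry('node_modules', '.sweet-search');
const unchanged = withIgnoreEntry('dist\n/.sweet-search/\n', '.sweet-search');
```

One limitation of an exact-line match like this: glob forms such as `.sweet-search/*` or `**/.sweet-search` are not recognised as already ignoring the directory, so a second, redundant entry would be appended in those cases.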

package/scripts/postinstall-banner.js CHANGED

@@ -1,46 +1,52 @@
 #!/usr/bin/env node
 /**
- * postinstall — play the animated banner once after install.
+ * postinstall — print a short "what next" message after install.
  *
- * npm pipes lifecycle-script stdout (it's not a TTY), so we render to the
- * controlling terminal directly via /dev/tty when possible. This is Unix-only;
- * on Windows (no /dev/tty) or when there is no controlling terminal (CI, detached,
- * sandboxed installs) we simply skip.
- *
- * Defensive by design: renders only to a real terminal, honours CI / NO_BANNER /
- * SWEET_SEARCH_NO_BANNER, swallows every error, and always exits 0 so it can never
- * fail `npm install`.
+ * npm pipes postinstall stdout (and swallows it for `-g`), so we write to the
+ * controlling terminal (/dev/tty) directly — same reason the message vanished
+ * when we used process.stdout. It is deliberately PLAIN TEXT (no graphics /
+ * animation): during `npm install` npm writes its own spinner to the same
+ * terminal concurrently, which would corrupt a large chunked escape sequence
+ * (the base64 garbage we saw) — a short text line is atomic and safe. The rich
+ * animated banner is reserved for `sweet-search init` / `index`, where we own
+ * the TTY. Best-effort; never throws.
  */
 import process from 'node:process';
-import tty from 'node:tty';
-import { openSync, closeSync } from 'node:fs';
-import { dirname, join } from 'node:path';
-import { fileURLToPath } from 'node:url';
+import { openSync, writeSync, closeSync } from 'node:fs';
-async function run() {
+function run() {
   const env = process.env;
-  if (env.CI || env.NO_BANNER || env.SWEET_SEARCH_NO_BANNER) return;
+  if (env.NO_BANNER || env.SWEET_SEARCH_NO_BANNER) return;
-  // Pick an output stream that is a real terminal.
-  let stream = process.stdout.isTTY ? process.stdout : null;
-  let ownedFd = -1;
-  if (!stream && process.platform !== 'win32') {
-    try {
-      ownedFd = openSync('/dev/tty', 'r+');     // throws if no controlling terminal
-      const s = new tty.WriteStream(ownedFd);
-      if (s.isTTY) stream = s;
-    } catch { /* no controlling terminal — skip */ }
+  // Choose a real-terminal sink: stdout if it's already a TTY (foreground
+  // scripts), otherwise the controlling terminal. Windows has no /dev/tty →
+  // the banner shows on `init`/`index` instead.
+  let fd = -1;
+  const useStdout = !!process.stdout.isTTY;
+  if (!useStdout) {
+    if (process.platform === 'win32') return;
+    try { fd = openSync('/dev/tty', 'w'); } catch { return; } // no controlling terminal → skip
   }
-  if (!stream) return;
+  const c = (n, s) => `\x1b[${n}m${s}\x1b[0m`;
+  const msg = [
+    '',
+    `  ${c('1;38;5;213', 'sweet-search')} installed ${c('2', '— SOTA hybrid code search')}`,
+    '',
+    `  ${c('1', 'Get started:')}`,
+    `    ${c('36', 'sweet-search init')}        set up the current project`,
+    `    ${c('36', 'sweet-search index')}       build the search index`,
+    `    ${c('36', 'sweet-search "query"')}     search your code`,
+    `  ${c('2', '(installed locally? prefix with')} ${c('2;36', 'npx')}${c('2', ')')}`,
+    '',
+    '',
+  ].join('\n');
   try {
-    const here = dirname(fileURLToPath(import.meta.url));
-    const { showBanner } = await import(join(here, '..', 'core', 'banner', 'render-banner.js'));
-    // query:false — we have no matching stdin for this tty stream; rely on env-based detection.
-    const shown = await showBanner({ stream, env, query: false, maxMs: 2200 });
-    if (shown) stream.write('  sweet-search installed — run `sweet-search init` to get started.\n');
-  } catch { /* never break an install */ }
-  finally { if (ownedFd >= 0) { try { closeSync(ownedFd); } catch { /* noop */ } } }
+    if (useStdout) process.stdout.write(msg);
+    else { writeSync(fd, msg); }
+  } catch { /* best-effort */ }
+  finally { if (fd >= 0) { try { closeSync(fd); } catch { /* noop */ } } }
 }
-run().finally(() => process.exit(0));
+run();
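
Reviewer note: the rewritten `run()` decides where to print in four steps: opt-out variables, stdout if it is a TTY, skip on Windows, otherwise try `/dev/tty`. Note that the `CI` check from 2.5.6 is gone, so CI environments are now skipped only when no controlling terminal can be opened. The decision order can be sketched as a pure function; `chooseSink` and its return labels are hypothetical names for illustration.

```javascript
// Decide where the post-install message should go.
// Returns 'stdout', 'devtty' (caller still has to open /dev/tty, which may fail), or 'skip'.
function chooseSink({ stdoutIsTTY, platform, env }) {
  if (env.NO_BANNER || env.SWEET_SEARCH_NO_BANNER) return 'skip';
  if (stdoutIsTTY) return 'stdout';
  if (platform === 'win32') return 'skip'; // no /dev/tty on Windows
  return 'devtty';
}

const piped = chooseSink({ stdoutIsTTY: false, platform: 'linux', env: {} });
const optedOut = chooseSink({ stdoutIsTTY: true, platform: 'linux', env: { NO_BANNER: '1' } });
```

A `'devtty'` result is only a request: in the diff, a failed `openSync('/dev/tty', 'w')` still results in a silent skip.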